refactor: extract codex_responses logic into dedicated adapter

Move 10 Responses API format-conversion and normalization functions from run_agent.py into agent/codex_responses_adapter.py. All functions are now stateless module-level functions with zero self references. The AIAgent methods remain as thin one-line wrappers that delegate to the adapter, so all callers (tests, gateway, CLI) are unchanged. Functions extracted: - _deterministic_call_id: deterministic tool call ID generation - _split_responses_tool_id: composite ID splitting - _derive_responses_function_call_id: call_ to fc_ prefix conversion - _responses_tools: chat completions tool schema → Responses format - _chat_messages_to_responses_input: message format conversion - _preflight_codex_input_items: input item normalization - _preflight_codex_api_kwargs: API kwargs validation/cleaning - _extract_responses_message_text: text extraction from response items - _extract_responses_reasoning_text: reasoning extraction - _normalize_codex_response: full response normalization This brings codex_responses in line with anthropic_adapter.py and bedrock_adapter.py which already have their own adapter files. run_agent.py: 12410 → 11845 lines (-565 net)
feat(plugins): add transform_tool_result hook for generic tool-result rewriting (#12972 )
2026-04-20 16:32:43 +05:30 · 2026-04-20 03:48:08 -07:00 · 2026-04-20 03:10:19 -07:00 · 2026-04-20 03:10:19 -07:00 · 2026-04-20 03:08:04 -07:00 · 2026-04-20 03:07:51 -07:00
416 changed files with 37965 additions and 7710 deletions
@@ -3,8 +3,13 @@ name: Docker Build and Publish
 on:
  push:
    branches: [main]
-  pull_request:
-    branches: [main]
+    paths:
+      - '**/*.py'
+      - 'pyproject.toml'
+      - 'uv.lock'
+      - 'Dockerfile'
+      - 'docker/**'
+      - '.github/workflows/docker-publish.yml'
  release:
    types: [published]

@@ -49,6 +54,14 @@ jobs:

      - name: Test image starts
        run: |
+          # The image runs as the hermes user (UID 10000).  GitHub Actions
+          # creates /tmp/hermes-test root-owned by default, which hermes
+          # can't write to — chown it to match the in-container UID before
+          # bind-mounting.  Real users doing `docker run -v ~/.hermes:...`
+          # with their own UID hit the same issue and have their own
+          # remediations (HERMES_UID env var, or chown locally).
+          mkdir -p /tmp/hermes-test
+          sudo chown -R 10000:10000 /tmp/hermes-test
          docker run --rm \
            -v /tmp/hermes-test:/opt/data \
            --entrypoint /opt/hermes/docker/entrypoint.sh \
@@ -3,14 +3,31 @@ name: Supply Chain Audit
 on:
  pull_request:
    types: [opened, synchronize, reopened]
+    paths:
+      - '**/*.py'
+      - '**/*.pth'
+      - '**/setup.py'
+      - '**/setup.cfg'
+      - '**/sitecustomize.py'
+      - '**/usercustomize.py'
+      - '**/__init__.pth'

 permissions:
  pull-requests: write
  contents: read

+# Narrow, high-signal scanner. Only fires on critical indicators of supply
+# chain attacks (e.g. the litellm-style payloads). Low-signal heuristics
+# (plain base64, plain exec/eval, dependency/Dockerfile/workflow edits,
+# Actions version unpinning, outbound POST/PUT) were intentionally
+# removed — they fired on nearly every PR and trained reviewers to ignore
+# the scanner. Keep this file's checks ruthlessly narrow: if you find
+# yourself adding WARNING-tier patterns here again, make a separate
+# advisory-only workflow instead.
+
 jobs:
  scan:
-    name: Scan PR for supply chain risks
+    name: Scan PR for critical supply chain risks
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
@@ -18,7 +35,7 @@ jobs:
        with:
          fetch-depth: 0

-      - name: Scan diff for suspicious patterns
+      - name: Scan diff for critical patterns
        id: scan
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
@@ -28,19 +45,19 @@ jobs:
          BASE="${{ github.event.pull_request.base.sha }}"
          HEAD="${{ github.event.pull_request.head.sha }}"

-          # Get the full diff (added lines only)
+          # Added lines only, excluding lockfiles.
          DIFF=$(git diff "$BASE".."$HEAD" -- . ':!uv.lock' ':!*.lock' ':!package-lock.json' ':!yarn.lock' || true)

          FINDINGS=""
-          CRITICAL=false

          # --- .pth files (auto-execute on Python startup) ---
+          # The exact mechanism used in the litellm supply chain attack:
+          # https://github.com/BerriAI/litellm/issues/24512
          PTH_FILES=$(git diff --name-only "$BASE".."$HEAD" | grep '\.pth$' || true)
          if [ -n "$PTH_FILES" ]; then
-            CRITICAL=true
            FINDINGS="${FINDINGS}
          ### 🚨 CRITICAL: .pth file added or modified
-          Python \`.pth\` files in \`site-packages/\` execute automatically when the interpreter starts — no import required. This is the exact mechanism used in the [litellm supply chain attack](https://github.com/BerriAI/litellm/issues/24512).
+          Python \`.pth\` files in \`site-packages/\` execute automatically when the interpreter starts — no import required.

          **Files:**
          \`\`\`
@@ -49,13 +66,12 @@ jobs:
          "
          fi

-          # --- base64 + exec/eval combo (the litellm attack pattern) ---
+          # --- base64 decode + exec/eval on the same line (the litellm attack pattern) ---
          B64_EXEC_HITS=$(echo "$DIFF" | grep -n '^\+' | grep -iE 'base64\.(b64decode|decodebytes|urlsafe_b64decode)' | grep -iE 'exec\(|eval\(' | head -10 || true)
          if [ -n "$B64_EXEC_HITS" ]; then
-            CRITICAL=true
            FINDINGS="${FINDINGS}
          ### 🚨 CRITICAL: base64 decode + exec/eval combo
-          This is the exact pattern used in the [litellm supply chain attack](https://github.com/BerriAI/litellm/issues/24512) — base64-decoded strings passed to exec/eval to hide credential-stealing payloads.
+          Base64-decoded strings passed directly to exec/eval — the signature of hidden credential-stealing payloads.

          **Matches:**
          \`\`\`
@@ -64,41 +80,12 @@ jobs:
          "
          fi

-          # --- base64 decode/encode (alone — legitimate uses exist) ---
-          B64_HITS=$(echo "$DIFF" | grep -n '^\+' | grep -iE 'base64\.(b64decode|b64encode|decodebytes|encodebytes|urlsafe_b64decode)|atob\(|btoa\(|Buffer\.from\(.*base64' | head -20 || true)
-          if [ -n "$B64_HITS" ]; then
-            FINDINGS="${FINDINGS}
-          ### ⚠️ WARNING: base64 encoding/decoding detected
-          Base64 has legitimate uses (images, JWT, etc.) but is also commonly used to obfuscate malicious payloads. Verify the usage is appropriate.
-
-          **Matches (first 20):**
-          \`\`\`
-          ${B64_HITS}
-          \`\`\`
-          "
-          fi
-
-          # --- exec/eval with string arguments ---
-          EXEC_HITS=$(echo "$DIFF" | grep -n '^\+' | grep -E '(exec|eval)\s*\(' | grep -v '^\+\s*#' | grep -v 'test_\|mock\|assert\|# ' | head -20 || true)
-          if [ -n "$EXEC_HITS" ]; then
-            FINDINGS="${FINDINGS}
-          ### ⚠️ WARNING: exec() or eval() usage
-          Dynamic code execution can hide malicious behavior, especially when combined with base64 or network fetches.
-
-          **Matches (first 20):**
-          \`\`\`
-          ${EXEC_HITS}
-          \`\`\`
-          "
-          fi
-
-          # --- subprocess with encoded/obfuscated commands ---
-          PROC_HITS=$(echo "$DIFF" | grep -n '^\+' | grep -E 'subprocess\.(Popen|call|run)\s*\(' | grep -iE 'base64|decode|encode|\\x|chr\(' | head -10 || true)
+          # --- subprocess with encoded/obfuscated command argument ---
+          PROC_HITS=$(echo "$DIFF" | grep -n '^\+' | grep -E 'subprocess\.(Popen|call|run)\s*\(' | grep -iE 'base64|\\x[0-9a-f]{2}|chr\(' | head -10 || true)
          if [ -n "$PROC_HITS" ]; then
-            CRITICAL=true
            FINDINGS="${FINDINGS}
          ### 🚨 CRITICAL: subprocess with encoded/obfuscated command
-          Subprocess calls with encoded arguments are a strong indicator of payload execution.
+          Subprocess calls whose command strings are base64- or hex-encoded are a strong indicator of payload execution.

          **Matches:**
          \`\`\`
@@ -107,25 +94,12 @@ jobs:
          "
          fi

-          # --- Network calls to non-standard domains ---
-          EXFIL_HITS=$(echo "$DIFF" | grep -n '^\+' | grep -iE 'requests\.(post|put)\(|httpx\.(post|put)\(|urllib\.request\.urlopen' | grep -v '^\+\s*#' | grep -v 'test_\|mock\|assert' | head -10 || true)
-          if [ -n "$EXFIL_HITS" ]; then
-            FINDINGS="${FINDINGS}
-          ### ⚠️ WARNING: Outbound network calls (POST/PUT)
-          Outbound POST/PUT requests in new code could be data exfiltration. Verify the destination URLs are legitimate.
-
-          **Matches (first 10):**
-          \`\`\`
-          ${EXFIL_HITS}
-          \`\`\`
-          "
-          fi
-
-          # --- setup.py / setup.cfg install hooks ---
-          SETUP_HITS=$(git diff --name-only "$BASE".."$HEAD" | grep -E '(setup\.py|setup\.cfg|__init__\.pth|sitecustomize\.py|usercustomize\.py)$' || true)
+          # --- Install-hook files (setup.py/sitecustomize/usercustomize/__init__.pth) ---
+          # These execute during pip install or interpreter startup.
+          SETUP_HITS=$(git diff --name-only "$BASE".."$HEAD" | grep -E '(^|/)(setup\.py|setup\.cfg|sitecustomize\.py|usercustomize\.py|__init__\.pth)$' || true)
          if [ -n "$SETUP_HITS" ]; then
            FINDINGS="${FINDINGS}
-          ### ⚠️ WARNING: Install hook files modified
+          ### 🚨 CRITICAL: Install-hook file added or modified
          These files can execute code during package installation or interpreter startup.

          **Files:**
@@ -135,114 +109,31 @@ jobs:
          "
          fi

-          # --- Compile/marshal/pickle (code object injection) ---
-          MARSHAL_HITS=$(echo "$DIFF" | grep -n '^\+' | grep -iE 'marshal\.loads|pickle\.loads|compile\(' | grep -v '^\+\s*#' | grep -v 'test_\|re\.compile\|ast\.compile' | head -10 || true)
-          if [ -n "$MARSHAL_HITS" ]; then
-            FINDINGS="${FINDINGS}
-          ### ⚠️ WARNING: marshal/pickle/compile usage
-          These can deserialize or construct executable code objects.
-
-          **Matches:**
-          \`\`\`
-          ${MARSHAL_HITS}
-          \`\`\`
-          "
-          fi
-
-          # --- CI/CD workflow files modified ---
-          WORKFLOW_HITS=$(git diff --name-only "$BASE".."$HEAD" | grep -E '\.github/workflows/.*\.ya?ml$' || true)
-          if [ -n "$WORKFLOW_HITS" ]; then
-            FINDINGS="${FINDINGS}
-          ### ⚠️ WARNING: CI/CD workflow files modified
-          Changes to workflow files can alter build pipelines, inject steps, or modify permissions. Verify no unauthorized actions or secrets access were added.
-
-          **Files:**
-          \`\`\`
-          ${WORKFLOW_HITS}
-          \`\`\`
-          "
-          fi
-
-          # --- Dockerfile / container build files modified ---
-          DOCKER_HITS=$(git diff --name-only "$BASE".."$HEAD" | grep -iE '(Dockerfile|\.dockerignore|docker-compose)' || true)
-          if [ -n "$DOCKER_HITS" ]; then
-            FINDINGS="${FINDINGS}
-          ### ⚠️ WARNING: Container build files modified
-          Changes to Dockerfiles or compose files can alter base images, add build steps, or expose ports. Verify base image pins and build commands.
-
-          **Files:**
-          \`\`\`
-          ${DOCKER_HITS}
-          \`\`\`
-          "
-          fi
-
-          # --- Dependency manifest files modified ---
-          DEP_HITS=$(git diff --name-only "$BASE".."$HEAD" | grep -E '(pyproject\.toml|requirements.*\.txt|package\.json|Gemfile|go\.mod|Cargo\.toml)$' || true)
-          if [ -n "$DEP_HITS" ]; then
-            FINDINGS="${FINDINGS}
-          ### ⚠️ WARNING: Dependency manifest files modified
-          Changes to dependency files can introduce new packages or change version pins. Verify all dependency changes are intentional and from trusted sources.
-
-          **Files:**
-          \`\`\`
-          ${DEP_HITS}
-          \`\`\`
-          "
-          fi
-
-          # --- GitHub Actions version unpinning (mutable tags instead of SHAs) ---
-          ACTIONS_UNPIN=$(echo "$DIFF" | grep -n '^\+' | grep 'uses:' | grep -v '#' | grep -E '@v[0-9]' | head -10 || true)
-          if [ -n "$ACTIONS_UNPIN" ]; then
-            FINDINGS="${FINDINGS}
-          ### ⚠️ WARNING: GitHub Actions with mutable version tags
-          Actions should be pinned to full commit SHAs (not \`@v4\`, \`@v5\`). Mutable tags can be retargeted silently if a maintainer account is compromised.
-
-          **Matches:**
-          \`\`\`
-          ${ACTIONS_UNPIN}
-          \`\`\`
-          "
-          fi
-
-          # --- Output results ---
          if [ -n "$FINDINGS" ]; then
            echo "found=true" >> "$GITHUB_OUTPUT"
-            if [ "$CRITICAL" = true ]; then
-              echo "critical=true" >> "$GITHUB_OUTPUT"
-            else
-              echo "critical=false" >> "$GITHUB_OUTPUT"
-            fi
-            # Write findings to a file (multiline env vars are fragile)
            echo "$FINDINGS" > /tmp/findings.md
          else
            echo "found=false" >> "$GITHUB_OUTPUT"
-            echo "critical=false" >> "$GITHUB_OUTPUT"
          fi

-      - name: Post warning comment
+      - name: Post critical finding comment
        if: steps.scan.outputs.found == 'true'
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
-          SEVERITY="⚠️ Supply Chain Risk Detected"
-          if [ "${{ steps.scan.outputs.critical }}" = "true" ]; then
-            SEVERITY="🚨 CRITICAL Supply Chain Risk Detected"
-          fi
+          BODY="## 🚨 CRITICAL Supply Chain Risk Detected

-          BODY="## ${SEVERITY}
-
-          This PR contains patterns commonly associated with supply chain attacks. This does **not** mean the PR is malicious — but these patterns require careful human review before merging.
+          This PR contains a pattern that has been used in real supply chain attacks. A maintainer must review the flagged code carefully before merging.

          $(cat /tmp/findings.md)

          ---
-          *Automated scan triggered by [supply-chain-audit](/.github/workflows/supply-chain-audit.yml). If this is a false positive, a maintainer can approve after manual review.*"
+          *Scanner only fires on high-signal indicators: .pth files, base64+exec/eval combos, subprocess with encoded commands, or install-hook files. Low-signal warnings were removed intentionally — if you're seeing this comment, the finding is worth inspecting.*"

          gh pr comment "${{ github.event.pull_request.number }}" --body "$BODY" || echo "::warning::Could not post PR comment (expected for fork PRs — GITHUB_TOKEN is read-only)"

      - name: Fail on critical findings
-        if: steps.scan.outputs.critical == 'true'
+        if: steps.scan.outputs.found == 'true'
        run: |
          echo "::error::CRITICAL supply chain risk patterns detected in this PR. See the PR comment for details."
          exit 1
@@ -3,8 +3,14 @@ name: Tests
 on:
  push:
    branches: [main]
+    paths-ignore:
+      - '**/*.md'
+      - 'docs/**'
  pull_request:
    branches: [main]
+    paths-ignore:
+      - '**/*.md'
+      - 'docs/**'

 permissions:
  contents: read
@@ -17,7 +23,7 @@ concurrency:
 jobs:
  test:
    runs-on: ubuntu-latest
-    timeout-minutes: 10
+    timeout-minutes: 20
    steps:
      - name: Checkout code
        uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5  # v4
@@ -54,6 +54,11 @@ environments/benchmarks/evals/
 # Web UI build output
 hermes_cli/web_dist/

+# Web UI assets — synced from @nous-research/ui at build time via
+# `npm run sync-assets` (see web/package.json).
+web/public/fonts/
+web/public/ds-assets/
+
 # Release script temp files
 .release_notes.md
 mini-swe-agent/
@@ -21,26 +21,36 @@ RUN useradd -u 10000 -m -d /opt/data hermes
 COPY --chmod=0755 --from=gosu_source /gosu /usr/local/bin/
 COPY --chmod=0755 --from=uv_source /usr/local/bin/uv /usr/local/bin/uvx /usr/local/bin/

-COPY . /opt/hermes
 WORKDIR /opt/hermes

-# Install Node dependencies and Playwright as root (--with-deps needs apt)
+# ---------- Layer-cached dependency install ----------
+# Copy only package manifests first so npm install + Playwright are cached
+# unless the lockfiles themselves change.
+COPY package.json package-lock.json ./
+COPY scripts/whatsapp-bridge/package.json scripts/whatsapp-bridge/package-lock.json scripts/whatsapp-bridge/
+COPY web/package.json web/package-lock.json web/
+
 RUN npm install --prefer-offline --no-audit && \
    npx playwright install --with-deps chromium --only-shell && \
-    cd /opt/hermes/scripts/whatsapp-bridge && \
-    npm install --prefer-offline --no-audit && \
+    (cd scripts/whatsapp-bridge && npm install --prefer-offline --no-audit) && \
+    (cd web && npm install --prefer-offline --no-audit) && \
    npm cache clean --force

-# Hand ownership to hermes user, then install Python deps in a virtualenv
-RUN chown -R hermes:hermes /opt/hermes
-USER hermes
+# ---------- Source code ----------
+# .dockerignore excludes node_modules, so the installs above survive.
+COPY --chown=hermes:hermes . .

+# Build web dashboard (Vite outputs to hermes_cli/web_dist/)
+RUN cd web && npm run build
+
+# ---------- Python virtualenv ----------
+RUN chown hermes:hermes /opt/hermes
+USER hermes
 RUN uv venv && \
    uv pip install --no-cache-dir -e ".[all]"

-USER root
-RUN chmod +x /opt/hermes/docker/entrypoint.sh
-
+# ---------- Runtime ----------
+ENV HERMES_WEB_DIST=/opt/hermes/hermes_cli/web_dist
 ENV HERMES_HOME=/opt/data
 VOLUME [ "/opt/data" ]
 ENTRYPOINT [ "/opt/hermes/docker/entrypoint.sh" ]
@@ -20,6 +20,46 @@ from pathlib import Path
 from hermes_constants import get_hermes_home


+# Methods clients send as periodic liveness probes. They are not part of the
+# ACP schema, so the acp router correctly returns JSON-RPC -32601 to the
+# caller — but the supervisor task that dispatches the request then surfaces
+# the raised RequestError via ``logging.exception("Background task failed")``,
+# which dumps a traceback to stderr every probe interval. Clients like
+# acp-bridge already treat the -32601 response as "agent alive", so the
+# traceback is pure noise. We keep the protocol response intact and only
+# silence the stderr noise for this specific benign case.
+_BENIGN_PROBE_METHODS = frozenset({"ping", "health", "healthcheck"})
+
+
+class _BenignProbeMethodFilter(logging.Filter):
+    """Suppress acp 'Background task failed' tracebacks caused by unknown
+    liveness-probe methods (e.g. ``ping``) while leaving every other
+    background-task error — including method_not_found for any non-probe
+    method — visible in stderr.
+    """
+
+    def filter(self, record: logging.LogRecord) -> bool:
+        if record.getMessage() != "Background task failed":
+            return True
+        exc_info = record.exc_info
+        if not exc_info:
+            return True
+        exc = exc_info[1]
+        # Imported lazily so this module stays importable when the optional
+        # ``agent-client-protocol`` dependency is not installed.
+        try:
+            from acp.exceptions import RequestError
+        except ImportError:
+            return True
+        if not isinstance(exc, RequestError):
+            return True
+        if getattr(exc, "code", None) != -32601:
+            return True
+        data = getattr(exc, "data", None)
+        method = data.get("method") if isinstance(data, dict) else None
+        return method not in _BENIGN_PROBE_METHODS
+
+
 def _setup_logging() -> None:
    """Route all logging to stderr so stdout stays clean for ACP stdio."""
    handler = logging.StreamHandler(sys.stderr)
@@ -29,6 +69,7 @@ def _setup_logging() -> None:
            datefmt="%Y-%m-%d %H:%M:%S",
        )
    )
+    handler.addFilter(_BenignProbeMethodFilter())
    root = logging.getLogger()
    root.handlers.clear()
    root.addHandler(handler)
@@ -292,9 +292,15 @@ def _common_betas_for_base_url(base_url: str | None) -> list[str]:
    return _COMMON_BETAS


-def build_anthropic_client(api_key: str, base_url: str = None):
+def build_anthropic_client(api_key: str, base_url: str = None, timeout: float = None):
    """Create an Anthropic client, auto-detecting setup-tokens vs API keys.

+    If *timeout* is provided it overrides the default 900s read timeout.  The
+    connect timeout stays at 10s.  Callers pass this from the per-provider /
+    per-model ``request_timeout_seconds`` config so Anthropic-native and
+    Anthropic-compatible providers respect the same knob as OpenAI-wire
+    providers.
+
    Returns an anthropic.Anthropic instance.
    """
    if _anthropic_sdk is None:
@@ -305,8 +311,9 @@ def build_anthropic_client(api_key: str, base_url: str = None):
    from httpx import Timeout

    normalized_base_url = _normalize_base_url_text(base_url)
+    _read_timeout = timeout if (isinstance(timeout, (int, float)) and timeout > 0) else 900.0
    kwargs = {
-        "timeout": Timeout(timeout=900.0, connect=10.0),
+        "timeout": Timeout(timeout=float(_read_timeout), connect=10.0),
    }
    if normalized_base_url:
        kwargs["base_url"] = normalized_base_url
@@ -99,11 +99,81 @@ _FIXED_TEMPERATURE_MODELS: Dict[str, float] = {
    "kimi-for-coding": 0.6,
 }

+# Moonshot's kimi-for-coding endpoint (api.kimi.com/coding) documents:
+# "k2.5 model will use a fixed value 1.0, non-thinking mode will use a fixed
+# value 0.6.  Any other value will result in an error."  The same lock applies
+# to the other k2.* models served on that endpoint.  Enumerated explicitly so
+# non-coding siblings like `kimi-k2-instruct` (variable temperature, served on
+# the standard chat API and third parties) are NOT clamped.
+# Source: https://platform.kimi.ai/docs/guide/kimi-k2-5-quickstart
+_KIMI_INSTANT_MODELS: frozenset = frozenset({
+    "kimi-k2.5",
+    "kimi-k2-turbo-preview",
+    "kimi-k2-0905-preview",
+})
+_KIMI_THINKING_MODELS: frozenset = frozenset({
+    "kimi-k2-thinking",
+    "kimi-k2-thinking-turbo",
+})

-def _fixed_temperature_for_model(model: Optional[str]) -> Optional[float]:
-    """Return a required temperature override for models with strict contracts."""
+# Moonshot's public chat endpoint (api.moonshot.ai/v1) enforces a different
+# temperature contract than the Coding Plan endpoint above.  Empirically,
+# `kimi-k2.5` on the public API rejects 0.6 with HTTP 400
+# "invalid temperature: only 1 is allowed for this model" — the Coding Plan
+# lock (0.6 for non-thinking) does not apply.  `kimi-k2-turbo-preview` and the
+# thinking variants already match the Coding Plan contract on the public
+# endpoint, so we only override the models that diverge.
+# Users hit this endpoint when `KIMI_API_KEY` is a legacy `sk-*` key (the
+# `sk-kimi-*` prefix routes to api.kimi.com/coding/v1 instead — see
+# hermes_cli/auth.py:_kimi_base_url_for_key).
+_KIMI_PUBLIC_API_OVERRIDES: Dict[str, float] = {
+    "kimi-k2.5": 1.0,
+}
+
+
+def _fixed_temperature_for_model(
+    model: Optional[str],
+    base_url: Optional[str] = None,
+) -> Optional[float]:
+    """Return a required temperature override for models with strict contracts.
+
+    Moonshot's kimi-for-coding endpoint rejects any non-approved temperature on
+    the k2.5 family.  Non-thinking variants require exactly 0.6; thinking
+    variants require 1.0.  An optional ``vendor/`` prefix (e.g.
+    ``moonshotai/kimi-k2.5``) is tolerated for aggregator routings.
+
+    When ``base_url`` points to Moonshot's public chat endpoint
+    (``api.moonshot.ai``), the contract changes for ``kimi-k2.5``: the public
+    API only accepts ``temperature=1``, not 0.6.  That override takes precedence
+    over the Coding Plan defaults above.
+
+    Returns ``None`` for every other model, including ``kimi-k2-instruct*``
+    which is the separate non-coding K2 family with variable temperature.
+    """
    normalized = (model or "").strip().lower()
-    return _FIXED_TEMPERATURE_MODELS.get(normalized)
+    bare = normalized.rsplit("/", 1)[-1]
+
+    # Public Moonshot API has a stricter contract for some models than the
+    # Coding Plan endpoint — check it first so it wins on conflict.
+    if base_url and ("api.moonshot.ai" in base_url.lower() or "api.moonshot.cn" in base_url.lower()):
+        public = _KIMI_PUBLIC_API_OVERRIDES.get(bare)
+        if public is not None:
+            logger.debug(
+                "Forcing temperature=%s for %r on public Moonshot API", public, model
+            )
+            return public
+
+    fixed = _FIXED_TEMPERATURE_MODELS.get(normalized)
+    if fixed is not None:
+        logger.debug("Forcing temperature=%s for model %r (fixed map)", fixed, model)
+        return fixed
+    if bare in _KIMI_THINKING_MODELS:
+        logger.debug("Forcing temperature=1.0 for kimi thinking model %r", model)
+        return 1.0
+    if bare in _KIMI_INSTANT_MODELS:
+        logger.debug("Forcing temperature=0.6 for kimi instant model %r", model)
+        return 0.6
+    return None

 # Default auxiliary models for direct API-key providers (cheap/fast for side tasks)
 _API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = {
@@ -163,6 +233,45 @@ _CODEX_AUX_MODEL = "gpt-5.2-codex"
 _CODEX_AUX_BASE_URL = "https://chatgpt.com/backend-api/codex"


+def _codex_cloudflare_headers(access_token: str) -> Dict[str, str]:
+    """Headers required to avoid Cloudflare 403s on chatgpt.com/backend-api/codex.
+
+    The Cloudflare layer in front of the Codex endpoint whitelists a small set of
+    first-party originators (``codex_cli_rs``, ``codex_vscode``, ``codex_sdk_ts``,
+    anything starting with ``Codex``). Requests from non-residential IPs (VPS,
+    server-hosted agents) that don't advertise an allowed originator are served
+    a 403 with ``cf-mitigated: challenge`` regardless of auth correctness.
+
+    We pin ``originator: codex_cli_rs`` to match the upstream codex-rs CLI, set
+    ``User-Agent`` to a codex_cli_rs-shaped string (beats SDK fingerprinting),
+    and extract ``ChatGPT-Account-ID`` (canonical casing, from codex-rs
+    ``auth.rs``) out of the OAuth JWT's ``chatgpt_account_id`` claim.
+
+    Malformed tokens are tolerated — we drop the account-ID header rather than
+    raise, so a bad token still surfaces as an auth error (401) instead of a
+    crash at client construction.
+    """
+    headers = {
+        "User-Agent": "codex_cli_rs/0.0.0 (Hermes Agent)",
+        "originator": "codex_cli_rs",
+    }
+    if not isinstance(access_token, str) or not access_token.strip():
+        return headers
+    try:
+        import base64
+        parts = access_token.split(".")
+        if len(parts) < 2:
+            return headers
+        payload_b64 = parts[1] + "=" * (-len(parts[1]) % 4)
+        claims = json.loads(base64.urlsafe_b64decode(payload_b64))
+        acct_id = claims.get("https://api.openai.com/auth", {}).get("chatgpt_account_id")
+        if isinstance(acct_id, str) and acct_id:
+            headers["ChatGPT-Account-ID"] = acct_id
+    except Exception:
+        pass
+    return headers
+
+
 def _to_openai_base_url(base_url: str) -> str:
    """Normalize an Anthropic-style base URL to OpenAI-compatible format.

@@ -738,6 +847,11 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
            if model is None:
                continue  # skip provider if we don't know a valid aux model
            logger.debug("Auxiliary text client: %s (%s) via pool", pconfig.name, model)
+            if provider_id == "gemini":
+                from agent.gemini_native_adapter import GeminiNativeClient, is_native_gemini_base_url
+
+                if is_native_gemini_base_url(base_url):
+                    return GeminiNativeClient(api_key=api_key, base_url=base_url), model
            extra = {}
            if "api.kimi.com" in base_url.lower():
                extra["default_headers"] = {"User-Agent": "KimiCLI/1.30.0"}
@@ -745,15 +859,6 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
                from hermes_cli.models import copilot_default_headers

                extra["default_headers"] = copilot_default_headers()
-            elif "generativelanguage.googleapis.com" in base_url.lower():
-                # Google's OpenAI-compatible endpoint only accepts x-goog-api-key.
-                # Passing api_key= causes the SDK to inject Authorization: Bearer,
-                # which Google rejects with HTTP 400 "Multiple authentication
-                # credentials received". Use a placeholder for api_key and pass
-                # the real key via x-goog-api-key header instead.
-                # Fixes: https://github.com/NousResearch/hermes-agent/issues/7893
-                extra["default_headers"] = {"x-goog-api-key": api_key}
-                api_key = "not-used"
            return OpenAI(api_key=api_key, base_url=base_url, **extra), model

        creds = resolve_api_key_provider_credentials(provider_id)
@@ -768,6 +873,11 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
        if model is None:
            continue  # skip provider if we don't know a valid aux model
        logger.debug("Auxiliary text client: %s (%s)", pconfig.name, model)
+        if provider_id == "gemini":
+            from agent.gemini_native_adapter import GeminiNativeClient, is_native_gemini_base_url
+
+            if is_native_gemini_base_url(base_url):
+                return GeminiNativeClient(api_key=api_key, base_url=base_url), model
        extra = {}
        if "api.kimi.com" in base_url.lower():
            extra["default_headers"] = {"User-Agent": "KimiCLI/1.30.0"}
@@ -775,15 +885,6 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
            from hermes_cli.models import copilot_default_headers

            extra["default_headers"] = copilot_default_headers()
-        elif "generativelanguage.googleapis.com" in base_url.lower():
-            # Google's OpenAI-compatible endpoint only accepts x-goog-api-key.
-            # Passing api_key= causes the SDK to inject Authorization: Bearer,
-            # which Google rejects with HTTP 400 "Multiple authentication
-            # credentials received". Use a placeholder for api_key and pass
-            # the real key via x-goog-api-key header instead.
-            # Fixes: https://github.com/NousResearch/hermes-agent/issues/7893
-            extra["default_headers"] = {"x-goog-api-key": api_key}
-            api_key = "not-used"
        return OpenAI(api_key=api_key, base_url=base_url, **extra), model

    return None, None
@@ -997,7 +1098,7 @@ def _validate_base_url(base_url: str) -> None:
        ) from exc


-def _try_custom_endpoint() -> Tuple[Optional[OpenAI], Optional[str]]:
+def _try_custom_endpoint() -> Tuple[Optional[Any], Optional[str]]:
    runtime = _resolve_custom_runtime()
    if len(runtime) == 2:
        custom_base, custom_key = runtime
@@ -1013,6 +1114,23 @@ def _try_custom_endpoint() -> Tuple[Optional[OpenAI], Optional[str]]:
    if custom_mode == "codex_responses":
        real_client = OpenAI(api_key=custom_key, base_url=custom_base)
        return CodexAuxiliaryClient(real_client, model), model
+    if custom_mode == "anthropic_messages":
+        # Third-party Anthropic-compatible gateway (MiniMax, Zhipu GLM,
+        # LiteLLM proxies, etc.).  Must NEVER be treated as OAuth —
+        # Anthropic OAuth claims only apply to api.anthropic.com.
+        try:
+            from agent.anthropic_adapter import build_anthropic_client
+            real_client = build_anthropic_client(custom_key, custom_base)
+        except ImportError:
+            logger.warning(
+                "Custom endpoint declares api_mode=anthropic_messages but the "
+                "anthropic SDK is not installed — falling back to OpenAI-wire."
+            )
+            return OpenAI(api_key=custom_key, base_url=custom_base), model
+        return (
+            AnthropicAuxiliaryClient(real_client, model, custom_key, custom_base, is_oauth=False),
+            model,
+        )
    return OpenAI(api_key=custom_key, base_url=custom_base), model


@@ -1033,7 +1151,11 @@ def _try_codex() -> Tuple[Optional[Any], Optional[str]]:
            return None, None
        base_url = _CODEX_AUX_BASE_URL
    logger.debug("Auxiliary client: Codex OAuth (%s via Responses API)", _CODEX_AUX_MODEL)
-    real_client = OpenAI(api_key=codex_token, base_url=base_url)
+    real_client = OpenAI(
+        api_key=codex_token,
+        base_url=base_url,
+        default_headers=_codex_cloudflare_headers(codex_token),
+    )
    return CodexAuxiliaryClient(real_client, _CODEX_AUX_MODEL), _CODEX_AUX_MODEL


@@ -1329,6 +1451,13 @@ def _to_async_client(sync_client, model: str):
        return AsyncCodexAuxiliaryClient(sync_client), model
    if isinstance(sync_client, AnthropicAuxiliaryClient):
        return AsyncAnthropicAuxiliaryClient(sync_client), model
+    try:
+        from agent.gemini_native_adapter import GeminiNativeClient, AsyncGeminiNativeClient
+
+        if isinstance(sync_client, GeminiNativeClient):
+            return AsyncGeminiNativeClient(sync_client), model
+    except ImportError:
+        pass
    try:
        from agent.copilot_acp_client import CopilotACPClient
        if isinstance(sync_client, CopilotACPClient):
@@ -1493,7 +1622,11 @@ def resolve_provider_client(
                               "but no Codex OAuth token found (run: hermes model)")
                return None, None
            final_model = _normalize_resolved_model(model or _CODEX_AUX_MODEL, provider)
-            raw_client = OpenAI(api_key=codex_token, base_url=_CODEX_AUX_BASE_URL)
+            raw_client = OpenAI(
+                api_key=codex_token,
+                base_url=_CODEX_AUX_BASE_URL,
+                default_headers=_codex_cloudflare_headers(codex_token),
+            )
            return (raw_client, final_model)
        # Standard path: wrap in CodexAuxiliaryClient adapter
        client, default = _try_codex()
@@ -1621,6 +1754,15 @@ def resolve_provider_client(
        default_model = _API_KEY_PROVIDER_AUX_MODELS.get(provider, "")
        final_model = _normalize_resolved_model(model or default_model, provider)

+        if provider == "gemini":
+            from agent.gemini_native_adapter import GeminiNativeClient, is_native_gemini_base_url
+
+            if is_native_gemini_base_url(base_url):
+                client = GeminiNativeClient(api_key=api_key, base_url=base_url)
+                logger.debug("resolve_provider_client: %s (%s)", provider, final_model)
+                return (_to_async_client(client, final_model) if async_mode
+                        else (client, final_model))
+
        # Provider-specific headers
        headers = {}
        if "api.kimi.com" in base_url.lower():
@@ -1629,16 +1771,6 @@ def resolve_provider_client(
            from hermes_cli.models import copilot_default_headers

            headers.update(copilot_default_headers())
-        elif "generativelanguage.googleapis.com" in base_url.lower():
-            # Google's OpenAI-compatible endpoint only accepts x-goog-api-key.
-            # Passing api_key= causes the OpenAI SDK to inject Authorization: Bearer,
-            # which Google rejects with HTTP 400 "Multiple authentication credentials
-            # received". Use a placeholder for api_key and pass the real key via
-            # x-goog-api-key header instead.
-            # Fixes: https://github.com/NousResearch/hermes-agent/issues/7893
-            headers["x-goog-api-key"] = api_key
-            api_key = "not-used"
-
        client = OpenAI(api_key=api_key, base_url=base_url,
                        **({"default_headers": headers} if headers else {}))

@@ -2181,7 +2313,6 @@ def _resolve_task_provider_model(
    to "custom" and the task uses that direct endpoint. api_mode is one of
    "chat_completions", "codex_responses", or None (auto-detect).
    """
-    config = {}
    cfg_provider = None
    cfg_model = None
    cfg_base_url = None
@@ -2189,16 +2320,7 @@ def _resolve_task_provider_model(
    cfg_api_mode = None

    if task:
-        try:
-            from hermes_cli.config import load_config
-            config = load_config()
-        except ImportError:
-            config = {}
-
-        aux = config.get("auxiliary", {}) if isinstance(config, dict) else {}
-        task_config = aux.get(task, {}) if isinstance(aux, dict) else {}
-        if not isinstance(task_config, dict):
-            task_config = {}
+        task_config = _get_auxiliary_task_config(task)
        cfg_provider = str(task_config.get("provider", "")).strip() or None
        cfg_model = str(task_config.get("model", "")).strip() or None
        cfg_base_url = str(task_config.get("base_url", "")).strip() or None
@@ -2228,17 +2350,25 @@ def _resolve_task_provider_model(
 _DEFAULT_AUX_TIMEOUT = 30.0


-def _get_task_timeout(task: str, default: float = _DEFAULT_AUX_TIMEOUT) -> float:
-    """Read timeout from auxiliary.{task}.timeout in config, falling back to *default*."""
+def _get_auxiliary_task_config(task: str) -> Dict[str, Any]:
+    """Return the config dict for auxiliary.<task>, or {} when unavailable."""
    if not task:
-        return default
+        return {}
    try:
        from hermes_cli.config import load_config
        config = load_config()
    except ImportError:
-        return default
+        return {}
    aux = config.get("auxiliary", {}) if isinstance(config, dict) else {}
    task_config = aux.get(task, {}) if isinstance(aux, dict) else {}
+    return task_config if isinstance(task_config, dict) else {}
+
+
+def _get_task_timeout(task: str, default: float = _DEFAULT_AUX_TIMEOUT) -> float:
+    """Read timeout from auxiliary.{task}.timeout in config, falling back to *default*."""
+    if not task:
+        return default
+    task_config = _get_auxiliary_task_config(task)
    raw = task_config.get("timeout")
    if raw is not None:
        try:
@@ -2248,6 +2378,15 @@ def _get_task_timeout(task: str, default: float = _DEFAULT_AUX_TIMEOUT) -> float
    return default


+def _get_task_extra_body(task: str) -> Dict[str, Any]:
+    """Read auxiliary.<task>.extra_body and return a shallow copy when valid."""
+    task_config = _get_auxiliary_task_config(task)
+    raw = task_config.get("extra_body")
+    if isinstance(raw, dict):
+        return dict(raw)
+    return {}
+
+
 # ---------------------------------------------------------------------------
 # Anthropic-compatible endpoint detection + image block conversion
 # ---------------------------------------------------------------------------
@@ -2335,7 +2474,7 @@ def _build_call_kwargs(
        "timeout": timeout,
    }

-    fixed_temperature = _fixed_temperature_for_model(model)
+    fixed_temperature = _fixed_temperature_for_model(model, base_url)
    if fixed_temperature is not None:
        temperature = fixed_temperature

@@ -2448,6 +2587,8 @@ def call_llm(
    """
    resolved_provider, resolved_model, resolved_base_url, resolved_api_key, resolved_api_mode = _resolve_task_provider_model(
        task, provider, model, base_url, api_key)
+    effective_extra_body = _get_task_extra_body(task)
+    effective_extra_body.update(extra_body or {})

    if task == "vision":
        effective_provider, client, final_model = resolve_vision_provider_client(
@@ -2516,11 +2657,14 @@ def call_llm(
                     task, resolved_provider or "auto", final_model or "default",
                     f" at {_base_info}" if _base_info and "openrouter" not in _base_info else "")

+    # Pass the client's actual base_url (not just resolved_base_url) so
+    # endpoint-specific temperature overrides can distinguish
+    # api.moonshot.ai vs api.kimi.com/coding even on auto-detected routes.
    kwargs = _build_call_kwargs(
        resolved_provider, final_model, messages,
        temperature=temperature, max_tokens=max_tokens,
-        tools=tools, timeout=effective_timeout, extra_body=extra_body,
-        base_url=resolved_base_url)
+        tools=tools, timeout=effective_timeout, extra_body=effective_extra_body,
+        base_url=_base_info or resolved_base_url)

    # Convert image blocks for Anthropic-compatible endpoints (e.g. MiniMax)
    _client_base = str(getattr(client, "base_url", "") or "")
@@ -2574,7 +2718,8 @@ def call_llm(
                    fb_label, fb_model, messages,
                    temperature=temperature, max_tokens=max_tokens,
                    tools=tools, timeout=effective_timeout,
-                    extra_body=extra_body)
+                    extra_body=effective_extra_body,
+                    base_url=str(getattr(fb_client, "base_url", "") or ""))
                return _validate_llm_response(
                    fb_client.chat.completions.create(**fb_kwargs), task)
        raise
@@ -2656,6 +2801,8 @@ async def async_call_llm(
    """
    resolved_provider, resolved_model, resolved_base_url, resolved_api_key, resolved_api_mode = _resolve_task_provider_model(
        task, provider, model, base_url, api_key)
+    effective_extra_body = _get_task_extra_body(task)
+    effective_extra_body.update(extra_body or {})

    if task == "vision":
        effective_provider, client, final_model = resolve_vision_provider_client(
@@ -2709,14 +2856,17 @@ async def async_call_llm(

    effective_timeout = timeout if timeout is not None else _get_task_timeout(task)

+    # Pass the client's actual base_url (not just resolved_base_url) so
+    # endpoint-specific temperature overrides can distinguish
+    # api.moonshot.ai vs api.kimi.com/coding even on auto-detected routes.
+    _client_base = str(getattr(client, "base_url", "") or "")
    kwargs = _build_call_kwargs(
        resolved_provider, final_model, messages,
        temperature=temperature, max_tokens=max_tokens,
-        tools=tools, timeout=effective_timeout, extra_body=extra_body,
-        base_url=resolved_base_url)
+        tools=tools, timeout=effective_timeout, extra_body=effective_extra_body,
+        base_url=_client_base or resolved_base_url)

    # Convert image blocks for Anthropic-compatible endpoints (e.g. MiniMax)
-    _client_base = str(getattr(client, "base_url", "") or "")
    if _is_anthropic_compat_endpoint(resolved_provider, _client_base):
        kwargs["messages"] = _convert_openai_images_to_anthropic(kwargs["messages"])

@@ -2752,7 +2902,8 @@ async def async_call_llm(
                    fb_label, fb_model, messages,
                    temperature=temperature, max_tokens=max_tokens,
                    tools=tools, timeout=effective_timeout,
-                    extra_body=extra_body)
+                    extra_body=effective_extra_body,
+                    base_url=str(getattr(fb_client, "base_url", "") or ""))
                # Convert sync fallback client to async
                async_fb, async_fb_model = _to_async_client(fb_client, fb_model or "")
                if async_fb_model and async_fb_model != fb_kwargs.get("model"):
@@ -0,0 +1,650 @@
+"""Codex Responses API adapter.
+
+Pure format-conversion and normalization logic for the OpenAI Responses API
+(used by OpenAI Codex, xAI, GitHub Models, and other Responses-compatible endpoints).
+
+Extracted from run_agent.py to isolate Responses API-specific logic from the
+core agent loop. All functions are stateless — they operate on the data passed
+in and return transformed results.
+"""
+
+from __future__ import annotations
+
+import hashlib
+import json
+import logging
+import re
+import uuid
+from types import SimpleNamespace
+from typing import Any, Dict, List, Optional
+
+from agent.prompt_builder import DEFAULT_AGENT_IDENTITY
+
+logger = logging.getLogger(__name__)
+
+def _deterministic_call_id(fn_name: str, arguments: str, index: int = 0) -> str:
+    """Generate a deterministic call_id from tool call content.
+
+    Used as a fallback when the API doesn't provide a call_id.
+    Deterministic IDs prevent cache invalidation — random UUIDs would
+    make every API call's prefix unique, breaking OpenAI's prompt cache.
+    """
+    seed = f"{fn_name}:{arguments}:{index}"
+    digest = hashlib.sha256(seed.encode("utf-8", errors="replace")).hexdigest()[:12]
+    return f"call_{digest}"
+
+
+def _split_responses_tool_id(raw_id: Any) -> tuple[Optional[str], Optional[str]]:
+    """Split a stored tool id into (call_id, response_item_id)."""
+    if not isinstance(raw_id, str):
+        return None, None
+    value = raw_id.strip()
+    if not value:
+        return None, None
+    if "|" in value:
+        call_id, response_item_id = value.split("|", 1)
+        call_id = call_id.strip() or None
+        response_item_id = response_item_id.strip() or None
+        return call_id, response_item_id
+    if value.startswith("fc_"):
+        return None, value
+    return value, None
+
+
+def _derive_responses_function_call_id(
+    call_id: str,
+    response_item_id: Optional[str] = None,
+) -> str:
+    """Build a valid Responses `function_call.id` (must start with `fc_`)."""
+    if isinstance(response_item_id, str):
+        candidate = response_item_id.strip()
+        if candidate.startswith("fc_"):
+            return candidate
+
+    source = (call_id or "").strip()
+    if source.startswith("fc_"):
+        return source
+    if source.startswith("call_") and len(source) > len("call_"):
+        return f"fc_{source[len('call_'):]}"
+
+    sanitized = re.sub(r"[^A-Za-z0-9_-]", "", source)
+    if sanitized.startswith("fc_"):
+        return sanitized
+    if sanitized.startswith("call_") and len(sanitized) > len("call_"):
+        return f"fc_{sanitized[len('call_'):]}"
+    if sanitized:
+        return f"fc_{sanitized[:48]}"
+
+    seed = source or str(response_item_id or "") or uuid.uuid4().hex
+    digest = hashlib.sha1(seed.encode("utf-8")).hexdigest()[:24]
+    return f"fc_{digest}"
+
+
+def _responses_tools(tools: Optional[List[Dict[str, Any]]] = None) -> Optional[List[Dict[str, Any]]]:
+    """Convert chat-completions tool schemas to Responses function-tool schemas."""
+    source_tools = tools
+    if not source_tools:
+        return None
+
+    converted: List[Dict[str, Any]] = []
+    for item in source_tools:
+        fn = item.get("function", {}) if isinstance(item, dict) else {}
+        name = fn.get("name")
+        if not isinstance(name, str) or not name.strip():
+            continue
+        converted.append({
+            "type": "function",
+            "name": name,
+            "description": fn.get("description", ""),
+            "strict": False,
+            "parameters": fn.get("parameters", {"type": "object", "properties": {}}),
+        })
+    return converted or None
+
+
+def _chat_messages_to_responses_input(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
+    """Convert internal chat-style messages to Responses input items."""
+    items: List[Dict[str, Any]] = []
+    seen_item_ids: set = set()
+
+    for msg in messages:
+        if not isinstance(msg, dict):
+            continue
+        role = msg.get("role")
+        if role == "system":
+            continue
+
+        if role in {"user", "assistant"}:
+            content = msg.get("content", "")
+            content_text = str(content) if content is not None else ""
+
+            if role == "assistant":
+                # Replay encrypted reasoning items from previous turns
+                # so the API can maintain coherent reasoning chains.
+                codex_reasoning = msg.get("codex_reasoning_items")
+                has_codex_reasoning = False
+                if isinstance(codex_reasoning, list):
+                    for ri in codex_reasoning:
+                        if isinstance(ri, dict) and ri.get("encrypted_content"):
+                            item_id = ri.get("id")
+                            if item_id and item_id in seen_item_ids:
+                                continue
+                            # Strip the "id" field — with store=False the
+                            # Responses API cannot look up items by ID and
+                            # returns 404.  The encrypted_content blob is
+                            # self-contained for reasoning chain continuity.
+                            replay_item = {k: v for k, v in ri.items() if k != "id"}
+                            items.append(replay_item)
+                            if item_id:
+                                seen_item_ids.add(item_id)
+                            has_codex_reasoning = True
+
+                if content_text.strip():
+                    items.append({"role": "assistant", "content": content_text})
+                elif has_codex_reasoning:
+                    # The Responses API requires a following item after each
+                    # reasoning item (otherwise: missing_following_item error).
+                    # When the assistant produced only reasoning with no visible
+                    # content, emit an empty assistant message as the required
+                    # following item.
+                    items.append({"role": "assistant", "content": ""})
+
+                tool_calls = msg.get("tool_calls")
+                if isinstance(tool_calls, list):
+                    for tc in tool_calls:
+                        if not isinstance(tc, dict):
+                            continue
+                        fn = tc.get("function", {})
+                        fn_name = fn.get("name")
+                        if not isinstance(fn_name, str) or not fn_name.strip():
+                            continue
+
+                        embedded_call_id, embedded_response_item_id = _split_responses_tool_id(
+                            tc.get("id")
+                        )
+                        call_id = tc.get("call_id")
+                        if not isinstance(call_id, str) or not call_id.strip():
+                            call_id = embedded_call_id
+                        if not isinstance(call_id, str) or not call_id.strip():
+                            if (
+                                isinstance(embedded_response_item_id, str)
+                                and embedded_response_item_id.startswith("fc_")
+                                and len(embedded_response_item_id) > len("fc_")
+                            ):
+                                call_id = f"call_{embedded_response_item_id[len('fc_'):]}"
+                            else:
+                                _raw_args = str(fn.get("arguments", "{}"))
+                                call_id = _deterministic_call_id(fn_name, _raw_args, len(items))
+                        call_id = call_id.strip()
+
+                        arguments = fn.get("arguments", "{}")
+                        if isinstance(arguments, dict):
+                            arguments = json.dumps(arguments, ensure_ascii=False)
+                        elif not isinstance(arguments, str):
+                            arguments = str(arguments)
+                        arguments = arguments.strip() or "{}"
+
+                        items.append({
+                            "type": "function_call",
+                            "call_id": call_id,
+                            "name": fn_name,
+                            "arguments": arguments,
+                        })
+                continue
+
+            items.append({"role": role, "content": content_text})
+            continue
+
+        if role == "tool":
+            raw_tool_call_id = msg.get("tool_call_id")
+            call_id, _ = _split_responses_tool_id(raw_tool_call_id)
+            if not isinstance(call_id, str) or not call_id.strip():
+                if isinstance(raw_tool_call_id, str) and raw_tool_call_id.strip():
+                    call_id = raw_tool_call_id.strip()
+            if not isinstance(call_id, str) or not call_id.strip():
+                continue
+            items.append({
+                "type": "function_call_output",
+                "call_id": call_id,
+                "output": str(msg.get("content", "") or ""),
+            })
+
+    return items
+
+
+def _preflight_codex_input_items(raw_items: Any) -> List[Dict[str, Any]]:
+    if not isinstance(raw_items, list):
+        raise ValueError("Codex Responses input must be a list of input items.")
+
+    normalized: List[Dict[str, Any]] = []
+    seen_ids: set = set()
+    for idx, item in enumerate(raw_items):
+        if not isinstance(item, dict):
+            raise ValueError(f"Codex Responses input[{idx}] must be an object.")
+
+        item_type = item.get("type")
+        if item_type == "function_call":
+            call_id = item.get("call_id")
+            name = item.get("name")
+            if not isinstance(call_id, str) or not call_id.strip():
+                raise ValueError(f"Codex Responses input[{idx}] function_call is missing call_id.")
+            if not isinstance(name, str) or not name.strip():
+                raise ValueError(f"Codex Responses input[{idx}] function_call is missing name.")
+
+            arguments = item.get("arguments", "{}")
+            if isinstance(arguments, dict):
+                arguments = json.dumps(arguments, ensure_ascii=False)
+            elif not isinstance(arguments, str):
+                arguments = str(arguments)
+            arguments = arguments.strip() or "{}"
+
+            normalized.append(
+                {
+                    "type": "function_call",
+                    "call_id": call_id.strip(),
+                    "name": name.strip(),
+                    "arguments": arguments,
+                }
+            )
+            continue
+
+        if item_type == "function_call_output":
+            call_id = item.get("call_id")
+            if not isinstance(call_id, str) or not call_id.strip():
+                raise ValueError(f"Codex Responses input[{idx}] function_call_output is missing call_id.")
+            output = item.get("output", "")
+            if output is None:
+                output = ""
+            if not isinstance(output, str):
+                output = str(output)
+
+            normalized.append(
+                {
+                    "type": "function_call_output",
+                    "call_id": call_id.strip(),
+                    "output": output,
+                }
+            )
+            continue
+
+        if item_type == "reasoning":
+            encrypted = item.get("encrypted_content")
+            if isinstance(encrypted, str) and encrypted:
+                item_id = item.get("id")
+                if isinstance(item_id, str) and item_id:
+                    if item_id in seen_ids:
+                        continue
+                    seen_ids.add(item_id)
+                reasoning_item = {"type": "reasoning", "encrypted_content": encrypted}
+                # Do NOT include the "id" in the outgoing item — with
+                # store=False (our default) the API tries to resolve the
+                # id server-side and returns 404.  The id is still used
+                # above for local deduplication via seen_ids.
+                summary = item.get("summary")
+                if isinstance(summary, list):
+                    reasoning_item["summary"] = summary
+                else:
+                    reasoning_item["summary"] = []
+                normalized.append(reasoning_item)
+            continue
+
+        role = item.get("role")
+        if role in {"user", "assistant"}:
+            content = item.get("content", "")
+            if content is None:
+                content = ""
+            if not isinstance(content, str):
+                content = str(content)
+
+            normalized.append({"role": role, "content": content})
+            continue
+
+        raise ValueError(
+            f"Codex Responses input[{idx}] has unsupported item shape (type={item_type!r}, role={role!r})."
+        )
+
+    return normalized
+
+
+def _preflight_codex_api_kwargs(
+    api_kwargs: Any,
+    *,
+    allow_stream: bool = False,
+) -> Dict[str, Any]:
+    if not isinstance(api_kwargs, dict):
+        raise ValueError("Codex Responses request must be a dict.")
+
+    required = {"model", "instructions", "input"}
+    missing = [key for key in required if key not in api_kwargs]
+    if missing:
+        raise ValueError(f"Codex Responses request missing required field(s): {', '.join(sorted(missing))}.")
+
+    model = api_kwargs.get("model")
+    if not isinstance(model, str) or not model.strip():
+        raise ValueError("Codex Responses request 'model' must be a non-empty string.")
+    model = model.strip()
+
+    instructions = api_kwargs.get("instructions")
+    if instructions is None:
+        instructions = ""
+    if not isinstance(instructions, str):
+        instructions = str(instructions)
+    instructions = instructions.strip() or DEFAULT_AGENT_IDENTITY
+
+    normalized_input = _preflight_codex_input_items(api_kwargs.get("input"))
+
+    tools = api_kwargs.get("tools")
+    normalized_tools = None
+    if tools is not None:
+        if not isinstance(tools, list):
+            raise ValueError("Codex Responses request 'tools' must be a list when provided.")
+        normalized_tools = []
+        for idx, tool in enumerate(tools):
+            if not isinstance(tool, dict):
+                raise ValueError(f"Codex Responses tools[{idx}] must be an object.")
+            if tool.get("type") != "function":
+                raise ValueError(f"Codex Responses tools[{idx}] has unsupported type {tool.get('type')!r}.")
+
+            name = tool.get("name")
+            parameters = tool.get("parameters")
+            if not isinstance(name, str) or not name.strip():
+                raise ValueError(f"Codex Responses tools[{idx}] is missing a valid name.")
+            if not isinstance(parameters, dict):
+                raise ValueError(f"Codex Responses tools[{idx}] is missing valid parameters.")
+
+            description = tool.get("description", "")
+            if description is None:
+                description = ""
+            if not isinstance(description, str):
+                description = str(description)
+
+            strict = tool.get("strict", False)
+            if not isinstance(strict, bool):
+                strict = bool(strict)
+
+            normalized_tools.append(
+                {
+                    "type": "function",
+                    "name": name.strip(),
+                    "description": description,
+                    "strict": strict,
+                    "parameters": parameters,
+                }
+            )
+
+    store = api_kwargs.get("store", False)
+    if store is not False:
+        raise ValueError("Codex Responses contract requires 'store' to be false.")
+
+    allowed_keys = {
+        "model", "instructions", "input", "tools", "store",
+        "reasoning", "include", "max_output_tokens", "temperature",
+        "tool_choice", "parallel_tool_calls", "prompt_cache_key", "service_tier",
+        "extra_headers",
+    }
+    normalized: Dict[str, Any] = {
+        "model": model,
+        "instructions": instructions,
+        "input": normalized_input,
+        "store": False,
+    }
+    if normalized_tools is not None:
+        normalized["tools"] = normalized_tools
+
+    # Pass through reasoning config
+    reasoning = api_kwargs.get("reasoning")
+    if isinstance(reasoning, dict):
+        normalized["reasoning"] = reasoning
+    include = api_kwargs.get("include")
+    if isinstance(include, list):
+        normalized["include"] = include
+    service_tier = api_kwargs.get("service_tier")
+    if isinstance(service_tier, str) and service_tier.strip():
+        normalized["service_tier"] = service_tier.strip()
+
+    # Pass through max_output_tokens and temperature
+    max_output_tokens = api_kwargs.get("max_output_tokens")
+    if isinstance(max_output_tokens, (int, float)) and max_output_tokens > 0:
+        normalized["max_output_tokens"] = int(max_output_tokens)
+    temperature = api_kwargs.get("temperature")
+    if isinstance(temperature, (int, float)):
+        normalized["temperature"] = float(temperature)
+
+    # Pass through tool_choice, parallel_tool_calls, prompt_cache_key
+    for passthrough_key in ("tool_choice", "parallel_tool_calls", "prompt_cache_key"):
+        val = api_kwargs.get(passthrough_key)
+        if val is not None:
+            normalized[passthrough_key] = val
+
+    extra_headers = api_kwargs.get("extra_headers")
+    if extra_headers is not None:
+        if not isinstance(extra_headers, dict):
+            raise ValueError("Codex Responses request 'extra_headers' must be an object.")
+        normalized_headers: Dict[str, str] = {}
+        for key, value in extra_headers.items():
+            if not isinstance(key, str) or not key.strip():
+                raise ValueError("Codex Responses request 'extra_headers' keys must be non-empty strings.")
+            if value is None:
+                continue
+            normalized_headers[key.strip()] = str(value)
+        if normalized_headers:
+            normalized["extra_headers"] = normalized_headers
+
+    if allow_stream:
+        stream = api_kwargs.get("stream")
+        if stream is not None and stream is not True:
+            raise ValueError("Codex Responses 'stream' must be true when set.")
+        if stream is True:
+            normalized["stream"] = True
+        allowed_keys.add("stream")
+    elif "stream" in api_kwargs:
+        raise ValueError("Codex Responses stream flag is only allowed in fallback streaming requests.")
+
+    unexpected = sorted(key for key in api_kwargs if key not in allowed_keys)
+    if unexpected:
+        raise ValueError(
+            f"Codex Responses request has unsupported field(s): {', '.join(unexpected)}."
+        )
+
+    return normalized
+
+
+def _extract_responses_message_text(item: Any) -> str:
+    """Extract assistant text from a Responses message output item."""
+    content = getattr(item, "content", None)
+    if not isinstance(content, list):
+        return ""
+
+    chunks: List[str] = []
+    for part in content:
+        ptype = getattr(part, "type", None)
+        if ptype not in {"output_text", "text"}:
+            continue
+        text = getattr(part, "text", None)
+        if isinstance(text, str) and text:
+            chunks.append(text)
+    return "".join(chunks).strip()
+
+
+def _extract_responses_reasoning_text(item: Any) -> str:
+    """Extract a compact reasoning text from a Responses reasoning item."""
+    summary = getattr(item, "summary", None)
+    if isinstance(summary, list):
+        chunks: List[str] = []
+        for part in summary:
+            text = getattr(part, "text", None)
+            if isinstance(text, str) and text:
+                chunks.append(text)
+        if chunks:
+            return "\n".join(chunks).strip()
+    text = getattr(item, "text", None)
+    if isinstance(text, str) and text:
+        return text.strip()
+    return ""
+
+
+def _normalize_codex_response(response: Any) -> tuple[Any, str]:
+    """Normalize a Responses API object to an assistant_message-like object."""
+    output = getattr(response, "output", None)
+    if not isinstance(output, list) or not output:
+        # The Codex backend can return empty output when the answer was
+        # delivered entirely via stream events. Check output_text as a
+        # last-resort fallback before raising.
+        out_text = getattr(response, "output_text", None)
+        if isinstance(out_text, str) and out_text.strip():
+            logger.debug(
+                "Codex response has empty output but output_text is present (%d chars); "
+                "synthesizing output item.", len(out_text.strip()),
+            )
+            output = [SimpleNamespace(
+                type="message", role="assistant", status="completed",
+                content=[SimpleNamespace(type="output_text", text=out_text.strip())],
+            )]
+            response.output = output
+        else:
+            raise RuntimeError("Responses API returned no output items")
+
+    response_status = getattr(response, "status", None)
+    if isinstance(response_status, str):
+        response_status = response_status.strip().lower()
+    else:
+        response_status = None
+
+    if response_status in {"failed", "cancelled"}:
+        error_obj = getattr(response, "error", None)
+        if isinstance(error_obj, dict):
+            error_msg = error_obj.get("message") or str(error_obj)
+        else:
+            error_msg = str(error_obj) if error_obj else f"Responses API returned status '{response_status}'"
+        raise RuntimeError(error_msg)
+
+    content_parts: List[str] = []
+    reasoning_parts: List[str] = []
+    reasoning_items_raw: List[Dict[str, Any]] = []
+    tool_calls: List[Any] = []
+    has_incomplete_items = response_status in {"queued", "in_progress", "incomplete"}
+    saw_commentary_phase = False
+    saw_final_answer_phase = False
+
+    for item in output:
+        item_type = getattr(item, "type", None)
+        item_status = getattr(item, "status", None)
+        if isinstance(item_status, str):
+            item_status = item_status.strip().lower()
+        else:
+            item_status = None
+
+        if item_status in {"queued", "in_progress", "incomplete"}:
+            has_incomplete_items = True
+
+        if item_type == "message":
+            item_phase = getattr(item, "phase", None)
+            if isinstance(item_phase, str):
+                normalized_phase = item_phase.strip().lower()
+                if normalized_phase in {"commentary", "analysis"}:
+                    saw_commentary_phase = True
+                elif normalized_phase in {"final_answer", "final"}:
+                    saw_final_answer_phase = True
+            message_text = _extract_responses_message_text(item)
+            if message_text:
+                content_parts.append(message_text)
+        elif item_type == "reasoning":
+            reasoning_text = _extract_responses_reasoning_text(item)
+            if reasoning_text:
+                reasoning_parts.append(reasoning_text)
+            # Capture the full reasoning item for multi-turn continuity.
+            # encrypted_content is an opaque blob the API needs back on
+            # subsequent turns to maintain coherent reasoning chains.
+            encrypted = getattr(item, "encrypted_content", None)
+            if isinstance(encrypted, str) and encrypted:
+                raw_item = {"type": "reasoning", "encrypted_content": encrypted}
+                item_id = getattr(item, "id", None)
+                if isinstance(item_id, str) and item_id:
+                    raw_item["id"] = item_id
+                # Capture summary — required by the API when replaying reasoning items
+                summary = getattr(item, "summary", None)
+                if isinstance(summary, list):
+                    raw_summary = []
+                    for part in summary:
+                        text = getattr(part, "text", None)
+                        if isinstance(text, str):
+                            raw_summary.append({"type": "summary_text", "text": text})
+                    raw_item["summary"] = raw_summary
+                reasoning_items_raw.append(raw_item)
+        elif item_type == "function_call":
+            if item_status in {"queued", "in_progress", "incomplete"}:
+                continue
+            fn_name = getattr(item, "name", "") or ""
+            arguments = getattr(item, "arguments", "{}")
+            if not isinstance(arguments, str):
+                arguments = json.dumps(arguments, ensure_ascii=False)
+            raw_call_id = getattr(item, "call_id", None)
+            raw_item_id = getattr(item, "id", None)
+            embedded_call_id, _ = _split_responses_tool_id(raw_item_id)
+            call_id = raw_call_id if isinstance(raw_call_id, str) and raw_call_id.strip() else embedded_call_id
+            if not isinstance(call_id, str) or not call_id.strip():
+                call_id = _deterministic_call_id(fn_name, arguments, len(tool_calls))
+            call_id = call_id.strip()
+            response_item_id = raw_item_id if isinstance(raw_item_id, str) else None
+            response_item_id = _derive_responses_function_call_id(call_id, response_item_id)
+            tool_calls.append(SimpleNamespace(
+                id=call_id,
+                call_id=call_id,
+                response_item_id=response_item_id,
+                type="function",
+                function=SimpleNamespace(name=fn_name, arguments=arguments),
+            ))
+        elif item_type == "custom_tool_call":
+            fn_name = getattr(item, "name", "") or ""
+            arguments = getattr(item, "input", "{}")
+            if not isinstance(arguments, str):
+                arguments = json.dumps(arguments, ensure_ascii=False)
+            raw_call_id = getattr(item, "call_id", None)
+            raw_item_id = getattr(item, "id", None)
+            embedded_call_id, _ = _split_responses_tool_id(raw_item_id)
+            call_id = raw_call_id if isinstance(raw_call_id, str) and raw_call_id.strip() else embedded_call_id
+            if not isinstance(call_id, str) or not call_id.strip():
+                call_id = _deterministic_call_id(fn_name, arguments, len(tool_calls))
+            call_id = call_id.strip()
+            response_item_id = raw_item_id if isinstance(raw_item_id, str) else None
+            response_item_id = _derive_responses_function_call_id(call_id, response_item_id)
+            tool_calls.append(SimpleNamespace(
+                id=call_id,
+                call_id=call_id,
+                response_item_id=response_item_id,
+                type="function",
+                function=SimpleNamespace(name=fn_name, arguments=arguments),
+            ))
+
+    final_text = "\n".join([p for p in content_parts if p]).strip()
+    if not final_text and hasattr(response, "output_text"):
+        out_text = getattr(response, "output_text", "")
+        if isinstance(out_text, str):
+            final_text = out_text.strip()
+
+    assistant_message = SimpleNamespace(
+        content=final_text,
+        tool_calls=tool_calls,
+        reasoning="\n\n".join(reasoning_parts).strip() if reasoning_parts else None,
+        reasoning_content=None,
+        reasoning_details=None,
+        codex_reasoning_items=reasoning_items_raw or None,
+    )
+
+    if tool_calls:
+        finish_reason = "tool_calls"
+    elif has_incomplete_items or (saw_commentary_phase and not saw_final_answer_phase):
+        finish_reason = "incomplete"
+    elif reasoning_items_raw and not final_text:
+        # Response contains only reasoning (encrypted thinking state) with
+        # no visible content or tool calls.  The model is still thinking and
+        # needs another turn to produce the actual answer.  Marking this as
+        # "stop" would send it into the empty-content retry loop which burns
+        # 3 retries then fails — treat it as incomplete instead so the Codex
+        # continuation path handles it correctly.
+        finish_reason = "incomplete"
+    else:
+        finish_reason = "stop"
+    return assistant_message, finish_reason
+
+
@@ -63,6 +63,52 @@ _CHARS_PER_TOKEN = 4
 _SUMMARY_FAILURE_COOLDOWN_SECONDS = 600


+def _truncate_tool_call_args_json(args: str, head_chars: int = 200) -> str:
+    """Shrink long string values inside a tool-call arguments JSON blob while
+    preserving JSON validity.
+
+    The ``function.arguments`` field on a tool call is a JSON-encoded string
+    passed through to the LLM provider; downstream providers strictly
+    validate it and return a non-retryable 400 when it is not well-formed.
+    An earlier implementation sliced the raw JSON at a fixed byte offset and
+    appended ``...[truncated]`` — which routinely produced strings like::
+
+        {"path": "/foo/bar", "content": "# long markdown
+        ...[truncated]
+
+    i.e. an unterminated string and a missing closing brace. MiniMax, for
+    example, rejects this with ``invalid function arguments json string``
+    and the session gets stuck re-sending the same broken history on every
+    turn. See issue #11762 for the observed loop.
+
+    This helper parses the arguments, shrinks long string leaves inside the
+    parsed structure, and re-serialises. Non-string values (paths, ints,
+    booleans) are preserved intact. If the arguments are not valid JSON
+    to begin with — some model backends use non-JSON tool arguments — the
+    original string is returned unchanged rather than replaced with
+    something neither we nor the backend can parse.
+    """
+    try:
+        parsed = json.loads(args)
+    except (ValueError, TypeError):
+        return args
+
+    def _shrink(obj: Any) -> Any:
+        if isinstance(obj, str):
+            if len(obj) > head_chars:
+                return obj[:head_chars] + "...[truncated]"
+            return obj
+        if isinstance(obj, dict):
+            return {k: _shrink(v) for k, v in obj.items()}
+        if isinstance(obj, list):
+            return [_shrink(v) for v in obj]
+        return obj
+
+    shrunken = _shrink(parsed)
+    # ensure_ascii=False preserves CJK/emoji instead of bloating with \uXXXX
+    return json.dumps(shrunken, ensure_ascii=False)
+
+
 def _summarize_tool_result(tool_name: str, tool_args: str, tool_content: str) -> str:
    """Create an informative 1-line summary of a tool call + result.

@@ -449,6 +495,11 @@ class ContextCompressor(ContextEngine):
        # Pass 3: Truncate large tool_call arguments in assistant messages
        # outside the protected tail. write_file with 50KB content, for
        # example, survives pruning entirely without this.
+        #
+        # The shrinking is done inside the parsed JSON structure so the
+        # result remains valid JSON — otherwise downstream providers 400
+        # on every subsequent turn until the broken call falls out of
+        # the window. See ``_truncate_tool_call_args_json`` docstring.
        for i in range(prune_boundary):
            msg = result[i]
            if msg.get("role") != "assistant" or not msg.get("tool_calls"):
@@ -459,8 +510,10 @@ class ContextCompressor(ContextEngine):
                if isinstance(tc, dict):
                    args = tc.get("function", {}).get("arguments", "")
                    if len(args) > 500:
-                        tc = {**tc, "function": {**tc["function"], "arguments": args[:200] + "...[truncated]"}}
-                        modified = True
+                        new_args = _truncate_tool_call_args_json(args)
+                        if new_args != args:
+                            tc = {**tc, "function": {**tc["function"], "arguments": new_args}}
+                            modified = True
                new_tcs.append(tc)
            if modified:
                result[i] = {**msg, "tool_calls": new_tcs}
@@ -580,7 +633,9 @@ class ContextCompressor(ContextEngine):
            "assistant that continues the conversation. "
            "Do NOT respond to any questions or requests in the conversation — "
            "only output the structured summary. "
-            "Do NOT include any preamble, greeting, or prefix."
+            "Do NOT include any preamble, greeting, or prefix. "
+            "Write the summary in the same language the user was using in the "
+            "conversation — do not translate or switch to English."
        )

        # Shared structured template (used by both paths).
@@ -483,9 +483,7 @@ def _rg_files(path: Path, cwd: Path, limit: int) -> list[Path] | None:
            text=True,
            timeout=10,
        )
-    except FileNotFoundError:
-        return None
-    except subprocess.TimeoutExpired:
+    except (FileNotFoundError, OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
@@ -22,8 +22,6 @@ from hermes_cli.auth import (
    _auth_store_lock,
    _codex_access_token_is_expiring,
    _decode_jwt_claims,
-    _import_codex_cli_tokens,
-    _write_codex_cli_tokens,
    _load_auth_store,
    _load_provider_state,
    _resolve_kimi_base_url,
@@ -457,39 +455,6 @@ class CredentialPool:
            logger.debug("Failed to sync from credentials file: %s", exc)
        return entry

-    def _sync_codex_entry_from_cli(self, entry: PooledCredential) -> PooledCredential:
-        """Sync an openai-codex pool entry from ~/.codex/auth.json if tokens differ.
-
-        OpenAI OAuth refresh tokens are single-use and rotate on every refresh.
-        When the Codex CLI (or another Hermes profile) refreshes its token,
-        the pool entry's refresh_token becomes stale.  This method detects that
-        by comparing against ~/.codex/auth.json and syncing the fresh pair.
-        """
-        if self.provider != "openai-codex":
-            return entry
-        try:
-            cli_tokens = _import_codex_cli_tokens()
-            if not cli_tokens:
-                return entry
-            cli_refresh = cli_tokens.get("refresh_token", "")
-            cli_access = cli_tokens.get("access_token", "")
-            if cli_refresh and cli_refresh != entry.refresh_token:
-                logger.debug("Pool entry %s: syncing tokens from ~/.codex/auth.json (refresh token changed)", entry.id)
-                updated = replace(
-                    entry,
-                    access_token=cli_access,
-                    refresh_token=cli_refresh,
-                    last_status=None,
-                    last_status_at=None,
-                    last_error_code=None,
-                )
-                self._replace_entry(entry, updated)
-                self._persist()
-                return updated
-        except Exception as exc:
-            logger.debug("Failed to sync from ~/.codex/auth.json: %s", exc)
-        return entry
-
    def _sync_device_code_entry_to_auth_store(self, entry: PooledCredential) -> None:
        """Write refreshed pool entry tokens back to auth.json providers.

@@ -585,13 +550,6 @@ class CredentialPool:
                    except Exception as wexc:
                        logger.debug("Failed to write refreshed token to credentials file: %s", wexc)
            elif self.provider == "openai-codex":
-                # Proactively sync from ~/.codex/auth.json before refresh.
-                # The Codex CLI (or another Hermes profile) may have already
-                # consumed our refresh_token.  Syncing first avoids a
-                # "refresh_token_reused" error when the CLI has a newer pair.
-                synced = self._sync_codex_entry_from_cli(entry)
-                if synced is not entry:
-                    entry = synced
                refreshed = auth_mod.refresh_codex_oauth_pure(
                    entry.access_token,
                    entry.refresh_token,
@@ -677,45 +635,6 @@ class CredentialPool:
                    # Credentials file had a valid (non-expired) token — use it directly
                    logger.debug("Credentials file has valid token, using without refresh")
                    return synced
-            # For openai-codex: the refresh_token may have been consumed by
-            # the Codex CLI between our proactive sync and the refresh call.
-            # Re-sync and retry once.
-            if self.provider == "openai-codex":
-                synced = self._sync_codex_entry_from_cli(entry)
-                if synced.refresh_token != entry.refresh_token:
-                    logger.debug("Retrying Codex refresh with synced token from ~/.codex/auth.json")
-                    try:
-                        refreshed = auth_mod.refresh_codex_oauth_pure(
-                            synced.access_token,
-                            synced.refresh_token,
-                        )
-                        updated = replace(
-                            synced,
-                            access_token=refreshed["access_token"],
-                            refresh_token=refreshed["refresh_token"],
-                            last_refresh=refreshed.get("last_refresh"),
-                            last_status=STATUS_OK,
-                            last_status_at=None,
-                            last_error_code=None,
-                        )
-                        self._replace_entry(synced, updated)
-                        self._persist()
-                        self._sync_device_code_entry_to_auth_store(updated)
-                        try:
-                            _write_codex_cli_tokens(
-                                updated.access_token,
-                                updated.refresh_token,
-                                last_refresh=updated.last_refresh,
-                            )
-                        except Exception as wexc:
-                            logger.debug("Failed to write refreshed Codex tokens to CLI file (retry): %s", wexc)
-                        return updated
-                    except Exception as retry_exc:
-                        logger.debug("Codex retry refresh also failed: %s", retry_exc)
-                elif not self._entry_needs_refresh(synced):
-                    logger.debug("Codex CLI has valid token, using without refresh")
-                    self._sync_device_code_entry_to_auth_store(synced)
-                    return synced
            self._mark_exhausted(entry, None)
            return None

@@ -734,17 +653,6 @@ class CredentialPool:
        # _seed_from_singletons() on the next load_pool() sees fresh state
        # instead of re-seeding stale/consumed tokens.
        self._sync_device_code_entry_to_auth_store(updated)
-        # Write refreshed tokens back to ~/.codex/auth.json so Codex CLI
-        # and VS Code don't hit "refresh_token_reused" on their next refresh.
-        if self.provider == "openai-codex":
-            try:
-                _write_codex_cli_tokens(
-                    updated.access_token,
-                    updated.refresh_token,
-                    last_refresh=updated.last_refresh,
-                )
-            except Exception as wexc:
-                logger.debug("Failed to write refreshed Codex tokens to CLI file: %s", wexc)
        return updated

    def _entry_needs_refresh(self, entry: PooledCredential) -> bool:
@@ -790,16 +698,6 @@ class CredentialPool:
                if synced is not entry:
                    entry = synced
                    cleared_any = True
-            # For openai-codex entries, sync from ~/.codex/auth.json before
-            # any status/refresh checks.  This picks up tokens refreshed by
-            # the Codex CLI or another Hermes profile.
-            if (self.provider == "openai-codex"
-                    and entry.last_status == STATUS_EXHAUSTED
-                    and entry.refresh_token):
-                synced = self._sync_codex_entry_from_cli(entry)
-                if synced is not entry:
-                    entry = synced
-                    cleared_any = True
            if entry.last_status == STATUS_EXHAUSTED:
                exhausted_until = _exhausted_until(entry)
                if exhausted_until is not None and now < exhausted_until:
@@ -1218,8 +1116,8 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
    elif provider == "openai-codex":
        # Respect user suppression — `hermes auth remove openai-codex` marks
        # the device_code source as suppressed so it won't be re-seeded from
-        # either the Hermes auth store or ~/.codex/auth.json.  Without this
-        # gate the removal is instantly undone on the next load_pool() call.
+        # the Hermes auth store.  Without this gate the removal is instantly
+        # undone on the next load_pool() call.
        codex_suppressed = False
        try:
            from hermes_cli.auth import is_source_suppressed
@@ -1231,23 +1129,12 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup

        state = _load_provider_state(auth_store, "openai-codex")
        tokens = state.get("tokens") if isinstance(state, dict) else None
-        # Fallback: import from Codex CLI (~/.codex/auth.json) if Hermes auth
-        # store has no tokens.  This mirrors resolve_codex_runtime_credentials()
-        # so that load_pool() and list_authenticated_providers() detect tokens
-        # that only exist in the Codex CLI shared file.
-        if not (isinstance(tokens, dict) and tokens.get("access_token")):
-            try:
-                from hermes_cli.auth import _import_codex_cli_tokens, _save_codex_tokens
-                cli_tokens = _import_codex_cli_tokens()
-                if cli_tokens:
-                    logger.info("Importing Codex CLI tokens into Hermes auth store.")
-                    _save_codex_tokens(cli_tokens)
-                    # Re-read state after import
-                    auth_store = _load_auth_store()
-                    state = _load_provider_state(auth_store, "openai-codex")
-                    tokens = state.get("tokens") if isinstance(state, dict) else None
-            except Exception as exc:
-                logger.debug("Codex CLI token import failed: %s", exc)
+        # Hermes owns its own Codex auth state — we do NOT auto-import from
+        # ~/.codex/auth.json at pool-load time.  OAuth refresh tokens are
+        # single-use, so sharing them with Codex CLI / VS Code causes
+        # refresh_token_reused race failures.  Users who want to adopt
+        # existing Codex CLI credentials get a one-time, explicit prompt
+        # via `hermes auth openai-codex`.
        if isinstance(tokens, dict) and tokens.get("access_token"):
            active_sources.add("device_code")
            changed |= _upsert_entry(
@@ -225,9 +225,11 @@ def build_tool_preview(tool_name: str, args: dict, max_len: int | None = None) -
            content = _oneline(args.get("content", ""))
            return f"+{target}: \"{content[:25]}{'...' if len(content) > 25 else ''}\""
        elif action == "replace":
-            return f"~{target}: \"{_oneline(args.get('old_text', '')[:20])}\""
+            old = _oneline(args.get("old_text") or "") or "<missing old_text>"
+            return f"~{target}: \"{old[:20]}\""
        elif action == "remove":
-            return f"-{target}: \"{_oneline(args.get('old_text', '')[:20])}\""
+            old = _oneline(args.get("old_text") or "") or "<missing old_text>"
+            return f"-{target}: \"{old[:20]}\""
        return action

    if tool_name == "send_message":
@@ -939,9 +941,13 @@ def get_cute_tool_message(
        if action == "add":
            return _wrap(f"┊ 🧠 memory    +{target}: \"{_trunc(args.get('content', ''), 30)}\"  {dur}")
        elif action == "replace":
-            return _wrap(f"┊ 🧠 memory    ~{target}: \"{_trunc(args.get('old_text', ''), 20)}\"  {dur}")
+            old = args.get("old_text") or ""
+            old = old if old else "<missing old_text>"
+            return _wrap(f"┊ 🧠 memory    ~{target}: \"{_trunc(old, 20)}\"  {dur}")
        elif action == "remove":
-            return _wrap(f"┊ 🧠 memory    -{target}: \"{_trunc(args.get('old_text', ''), 20)}\"  {dur}")
+            old = args.get("old_text") or ""
+            old = old if old else "<missing old_text>"
+            return _wrap(f"┊ 🧠 memory    -{target}: \"{_trunc(old, 20)}\"  {dur}")
        return _wrap(f"┊ 🧠 memory    {action}  {dur}")
    if tool_name == "skills_list":
        return _wrap(f"┊ 📚 skills    list {args.get('category', 'all')}  {dur}")
@@ -290,7 +290,7 @@ def classify_api_error(
    if isinstance(body, dict):
        _err_obj = body.get("error", {})
        if isinstance(_err_obj, dict):
-            _body_msg = (_err_obj.get("message") or "").lower()
+            _body_msg = str(_err_obj.get("message") or "").lower()
            # Parse metadata.raw for wrapped provider errors
            _metadata = _err_obj.get("metadata", {})
            if isinstance(_metadata, dict):
@@ -302,11 +302,11 @@ def classify_api_error(
                        if isinstance(_inner, dict):
                            _inner_err = _inner.get("error", {})
                            if isinstance(_inner_err, dict):
-                                _metadata_msg = (_inner_err.get("message") or "").lower()
+                                _metadata_msg = str(_inner_err.get("message") or "").lower()
                    except (json.JSONDecodeError, TypeError):
                        pass
        if not _body_msg:
-            _body_msg = (body.get("message") or "").lower()
+            _body_msg = str(body.get("message") or "").lower()
    # Combine all message sources for pattern matching
    parts = [_raw_msg]
    if _body_msg and _body_msg not in _raw_msg:
@@ -606,10 +606,10 @@ def _classify_400(
    if isinstance(body, dict):
        err_obj = body.get("error", {})
        if isinstance(err_obj, dict):
-            err_body_msg = (err_obj.get("message") or "").strip().lower()
+            err_body_msg = str(err_obj.get("message") or "").strip().lower()
        # Responses API (and some providers) use flat body: {"message": "..."}
        if not err_body_msg:
-            err_body_msg = (body.get("message") or "").strip().lower()
+            err_body_msg = str(body.get("message") or "").strip().lower()
    is_generic = len(err_body_msg) < 30 or err_body_msg in ("error", "")
    is_large = approx_tokens > context_length * 0.4 or approx_tokens > 80000 or num_messages > 80

@@ -39,6 +39,7 @@ from typing import Any, Dict, Iterator, List, Optional
 import httpx

 from agent import google_oauth
+from agent.gemini_schema import sanitize_gemini_tool_parameters
 from agent.google_code_assist import (
    CODE_ASSIST_ENDPOINT,
    FREE_TIER_ID,
@@ -205,7 +206,7 @@ def _translate_tools_to_gemini(tools: Any) -> List[Dict[str, Any]]:
            decl["description"] = str(fn["description"])
        params = fn.get("parameters")
        if isinstance(params, dict):
-            decl["parameters"] = params
+            decl["parameters"] = sanitize_gemini_tool_parameters(params)
        declarations.append(decl)
    if not declarations:
        return []
@@ -504,9 +505,16 @@ def _iter_sse_events(response: httpx.Response) -> Iterator[Dict[str, Any]]:
 def _translate_stream_event(
    event: Dict[str, Any],
    model: str,
-    tool_call_indices: Dict[str, int],
+    tool_call_counter: List[int],
 ) -> List[_GeminiStreamChunk]:
-    """Unwrap Code Assist envelope and emit OpenAI-shaped chunk(s)."""
+    """Unwrap Code Assist envelope and emit OpenAI-shaped chunk(s).
+
+    ``tool_call_counter`` is a single-element list used as a mutable counter
+    across events in the same stream. Each ``functionCall`` part gets a
+    fresh, unique OpenAI ``index`` — keying by function name would collide
+    whenever the model issues parallel calls to the same tool (e.g. reading
+    three files in one turn).
+    """
    inner = event.get("response") if isinstance(event.get("response"), dict) else event
    candidates = inner.get("candidates") or []
    if not candidates:
@@ -532,7 +540,8 @@ def _translate_stream_event(
        fc = part.get("functionCall")
        if isinstance(fc, dict) and fc.get("name"):
            name = str(fc["name"])
-            idx = tool_call_indices.setdefault(name, len(tool_call_indices))
+            idx = tool_call_counter[0]
+            tool_call_counter[0] += 1
            try:
                args_str = json.dumps(fc.get("args") or {}, ensure_ascii=False)
            except (TypeError, ValueError):
@@ -549,7 +558,7 @@ def _translate_stream_event(
    finish_reason_raw = str(cand.get("finishReason") or "")
    if finish_reason_raw:
        mapped = _map_gemini_finish_reason(finish_reason_raw)
-        if tool_call_indices:
+        if tool_call_counter[0] > 0:
            mapped = "tool_calls"
        chunks.append(_make_stream_chunk(model=model, finish_reason=mapped))
    return chunks
@@ -733,9 +742,9 @@ class GeminiCloudCodeClient:
                        # Materialize error body for better diagnostics
                        response.read()
                        raise _gemini_http_error(response)
-                    tool_call_indices: Dict[str, int] = {}
+                    tool_call_counter: List[int] = [0]
                    for event in _iter_sse_events(response):
-                        for chunk in _translate_stream_event(event, model, tool_call_indices):
+                        for chunk in _translate_stream_event(event, model, tool_call_counter):
                            yield chunk
            except httpx.HTTPError as exc:
                raise CodeAssistError(
@@ -0,0 +1,846 @@
+"""OpenAI-compatible facade over Google AI Studio's native Gemini API.
+
+Hermes keeps ``api_mode='chat_completions'`` for the ``gemini`` provider so the
+main agent loop can keep using its existing OpenAI-shaped message flow.
+This adapter is the transport shim that converts those OpenAI-style
+``messages[]`` / ``tools[]`` requests into Gemini's native
+``models/{model}:generateContent`` schema and converts the responses back.
+
+Why this exists
+---------------
+Google's OpenAI-compatible endpoint has been brittle for Hermes's multi-turn
+agent/tool loop (auth churn, tool-call replay quirks, thought-signature
+requirements).  The native Gemini API is the canonical path and avoids the
+OpenAI-compat layer entirely.
+"""
+
+from __future__ import annotations
+
+import asyncio
+import base64
+import json
+import logging
+import time
+import uuid
+from types import SimpleNamespace
+from typing import Any, Dict, Iterator, List, Optional
+
+import httpx
+
+from agent.gemini_schema import sanitize_gemini_tool_parameters
+
+logger = logging.getLogger(__name__)
+
+DEFAULT_GEMINI_BASE_URL = "https://generativelanguage.googleapis.com/v1beta"
+
+
+def is_native_gemini_base_url(base_url: str) -> bool:
+    """Return True when the endpoint speaks Gemini's native REST API."""
+    normalized = str(base_url or "").strip().rstrip("/").lower()
+    if not normalized:
+        return False
+    if "generativelanguage.googleapis.com" not in normalized:
+        return False
+    return not normalized.endswith("/openai")
+
+
+class GeminiAPIError(Exception):
+    """Error shape compatible with Hermes retry/error classification."""
+
+    def __init__(
+        self,
+        message: str,
+        *,
+        code: str = "gemini_api_error",
+        status_code: Optional[int] = None,
+        response: Optional[httpx.Response] = None,
+        retry_after: Optional[float] = None,
+        details: Optional[Dict[str, Any]] = None,
+    ) -> None:
+        super().__init__(message)
+        self.code = code
+        self.status_code = status_code
+        self.response = response
+        self.retry_after = retry_after
+        self.details = details or {}
+
+
+def _coerce_content_to_text(content: Any) -> str:
+    if content is None:
+        return ""
+    if isinstance(content, str):
+        return content
+    if isinstance(content, list):
+        pieces: List[str] = []
+        for part in content:
+            if isinstance(part, str):
+                pieces.append(part)
+            elif isinstance(part, dict) and part.get("type") == "text":
+                text = part.get("text")
+                if isinstance(text, str):
+                    pieces.append(text)
+        return "\n".join(pieces)
+    return str(content)
+
+
+def _extract_multimodal_parts(content: Any) -> List[Dict[str, Any]]:
+    if not isinstance(content, list):
+        text = _coerce_content_to_text(content)
+        return [{"text": text}] if text else []
+
+    parts: List[Dict[str, Any]] = []
+    for item in content:
+        if isinstance(item, str):
+            parts.append({"text": item})
+            continue
+        if not isinstance(item, dict):
+            continue
+        ptype = item.get("type")
+        if ptype == "text":
+            text = item.get("text")
+            if isinstance(text, str) and text:
+                parts.append({"text": text})
+        elif ptype == "image_url":
+            url = ((item.get("image_url") or {}).get("url") or "")
+            if not isinstance(url, str) or not url.startswith("data:"):
+                continue
+            try:
+                header, encoded = url.split(",", 1)
+                mime = header.split(":", 1)[1].split(";", 1)[0]
+                raw = base64.b64decode(encoded)
+            except Exception:
+                continue
+            parts.append(
+                {
+                    "inlineData": {
+                        "mimeType": mime,
+                        "data": base64.b64encode(raw).decode("ascii"),
+                    }
+                }
+            )
+    return parts
+
+
+def _tool_call_extra_signature(tool_call: Dict[str, Any]) -> Optional[str]:
+    extra = tool_call.get("extra_content") or {}
+    if not isinstance(extra, dict):
+        return None
+    google = extra.get("google") or extra.get("thought_signature")
+    if isinstance(google, dict):
+        sig = google.get("thought_signature") or google.get("thoughtSignature")
+        return str(sig) if isinstance(sig, str) and sig else None
+    if isinstance(google, str) and google:
+        return google
+    return None
+
+
+def _translate_tool_call_to_gemini(tool_call: Dict[str, Any]) -> Dict[str, Any]:
+    fn = tool_call.get("function") or {}
+    args_raw = fn.get("arguments", "")
+    try:
+        args = json.loads(args_raw) if isinstance(args_raw, str) and args_raw else {}
+    except json.JSONDecodeError:
+        args = {"_raw": args_raw}
+    if not isinstance(args, dict):
+        args = {"_value": args}
+
+    part: Dict[str, Any] = {
+        "functionCall": {
+            "name": str(fn.get("name") or ""),
+            "args": args,
+        }
+    }
+    thought_signature = _tool_call_extra_signature(tool_call)
+    if thought_signature:
+        part["thoughtSignature"] = thought_signature
+    return part
+
+
+def _translate_tool_result_to_gemini(
+    message: Dict[str, Any],
+    tool_name_by_call_id: Optional[Dict[str, str]] = None,
+) -> Dict[str, Any]:
+    tool_name_by_call_id = tool_name_by_call_id or {}
+    tool_call_id = str(message.get("tool_call_id") or "")
+    name = str(
+        message.get("name")
+        or tool_name_by_call_id.get(tool_call_id)
+        or tool_call_id
+        or "tool"
+    )
+    content = _coerce_content_to_text(message.get("content"))
+    try:
+        parsed = json.loads(content) if content.strip().startswith(("{", "[")) else None
+    except json.JSONDecodeError:
+        parsed = None
+    response = parsed if isinstance(parsed, dict) else {"output": content}
+    return {
+        "functionResponse": {
+            "name": name,
+            "response": response,
+        }
+    }
+
+
+def _build_gemini_contents(messages: List[Dict[str, Any]]) -> tuple[List[Dict[str, Any]], Optional[Dict[str, Any]]]:
+    system_text_parts: List[str] = []
+    contents: List[Dict[str, Any]] = []
+    tool_name_by_call_id: Dict[str, str] = {}
+
+    for msg in messages:
+        if not isinstance(msg, dict):
+            continue
+        role = str(msg.get("role") or "user")
+
+        if role == "system":
+            system_text_parts.append(_coerce_content_to_text(msg.get("content")))
+            continue
+
+        if role in {"tool", "function"}:
+            contents.append(
+                {
+                    "role": "user",
+                    "parts": [
+                        _translate_tool_result_to_gemini(
+                            msg,
+                            tool_name_by_call_id=tool_name_by_call_id,
+                        )
+                    ],
+                }
+            )
+            continue
+
+        gemini_role = "model" if role == "assistant" else "user"
+        parts: List[Dict[str, Any]] = []
+
+        content_parts = _extract_multimodal_parts(msg.get("content"))
+        parts.extend(content_parts)
+
+        tool_calls = msg.get("tool_calls") or []
+        if isinstance(tool_calls, list):
+            for tool_call in tool_calls:
+                if isinstance(tool_call, dict):
+                    tool_call_id = str(tool_call.get("id") or tool_call.get("call_id") or "")
+                    tool_name = str(((tool_call.get("function") or {}).get("name") or ""))
+                    if tool_call_id and tool_name:
+                        tool_name_by_call_id[tool_call_id] = tool_name
+                    parts.append(_translate_tool_call_to_gemini(tool_call))
+
+        if parts:
+            contents.append({"role": gemini_role, "parts": parts})
+
+    system_instruction = None
+    joined_system = "\n".join(part for part in system_text_parts if part).strip()
+    if joined_system:
+        system_instruction = {"parts": [{"text": joined_system}]}
+    return contents, system_instruction
+
+
+def _translate_tools_to_gemini(tools: Any) -> List[Dict[str, Any]]:
+    if not isinstance(tools, list):
+        return []
+    declarations: List[Dict[str, Any]] = []
+    for tool in tools:
+        if not isinstance(tool, dict):
+            continue
+        fn = tool.get("function") or {}
+        if not isinstance(fn, dict):
+            continue
+        name = fn.get("name")
+        if not isinstance(name, str) or not name:
+            continue
+        decl: Dict[str, Any] = {"name": name}
+        description = fn.get("description")
+        if isinstance(description, str) and description:
+            decl["description"] = description
+        parameters = fn.get("parameters")
+        if isinstance(parameters, dict):
+            decl["parameters"] = sanitize_gemini_tool_parameters(parameters)
+        declarations.append(decl)
+    return [{"functionDeclarations": declarations}] if declarations else []
+
+
+def _translate_tool_choice_to_gemini(tool_choice: Any) -> Optional[Dict[str, Any]]:
+    if tool_choice is None:
+        return None
+    if isinstance(tool_choice, str):
+        if tool_choice == "auto":
+            return {"functionCallingConfig": {"mode": "AUTO"}}
+        if tool_choice == "required":
+            return {"functionCallingConfig": {"mode": "ANY"}}
+        if tool_choice == "none":
+            return {"functionCallingConfig": {"mode": "NONE"}}
+    if isinstance(tool_choice, dict):
+        fn = tool_choice.get("function") or {}
+        name = fn.get("name")
+        if isinstance(name, str) and name:
+            return {"functionCallingConfig": {"mode": "ANY", "allowedFunctionNames": [name]}}
+    return None
+
+
+def _normalize_thinking_config(config: Any) -> Optional[Dict[str, Any]]:
+    if not isinstance(config, dict) or not config:
+        return None
+    budget = config.get("thinkingBudget", config.get("thinking_budget"))
+    include = config.get("includeThoughts", config.get("include_thoughts"))
+    level = config.get("thinkingLevel", config.get("thinking_level"))
+    normalized: Dict[str, Any] = {}
+    if isinstance(budget, (int, float)):
+        normalized["thinkingBudget"] = int(budget)
+    if isinstance(include, bool):
+        normalized["includeThoughts"] = include
+    if isinstance(level, str) and level.strip():
+        normalized["thinkingLevel"] = level.strip().lower()
+    return normalized or None
+
+
+def build_gemini_request(
+    *,
+    messages: List[Dict[str, Any]],
+    tools: Any = None,
+    tool_choice: Any = None,
+    temperature: Optional[float] = None,
+    max_tokens: Optional[int] = None,
+    top_p: Optional[float] = None,
+    stop: Any = None,
+    thinking_config: Any = None,
+) -> Dict[str, Any]:
+    contents, system_instruction = _build_gemini_contents(messages)
+    request: Dict[str, Any] = {"contents": contents}
+    if system_instruction:
+        request["systemInstruction"] = system_instruction
+
+    gemini_tools = _translate_tools_to_gemini(tools)
+    if gemini_tools:
+        request["tools"] = gemini_tools
+
+    tool_config = _translate_tool_choice_to_gemini(tool_choice)
+    if tool_config:
+        request["toolConfig"] = tool_config
+
+    generation_config: Dict[str, Any] = {}
+    if temperature is not None:
+        generation_config["temperature"] = temperature
+    if max_tokens is not None:
+        generation_config["maxOutputTokens"] = max_tokens
+    if top_p is not None:
+        generation_config["topP"] = top_p
+    if stop:
+        generation_config["stopSequences"] = stop if isinstance(stop, list) else [str(stop)]
+    normalized_thinking = _normalize_thinking_config(thinking_config)
+    if normalized_thinking:
+        generation_config["thinkingConfig"] = normalized_thinking
+    if generation_config:
+        request["generationConfig"] = generation_config
+
+    return request
+
+
+def _map_gemini_finish_reason(reason: str) -> str:
+    mapping = {
+        "STOP": "stop",
+        "MAX_TOKENS": "length",
+        "SAFETY": "content_filter",
+        "RECITATION": "content_filter",
+        "OTHER": "stop",
+    }
+    return mapping.get(str(reason or "").upper(), "stop")
+
+
+def _tool_call_extra_from_part(part: Dict[str, Any]) -> Optional[Dict[str, Any]]:
+    sig = part.get("thoughtSignature")
+    if isinstance(sig, str) and sig:
+        return {"google": {"thought_signature": sig}}
+    return None
+
+
+def _empty_response(model: str) -> SimpleNamespace:
+    message = SimpleNamespace(
+        role="assistant",
+        content="",
+        tool_calls=None,
+        reasoning=None,
+        reasoning_content=None,
+        reasoning_details=None,
+    )
+    choice = SimpleNamespace(index=0, message=message, finish_reason="stop")
+    usage = SimpleNamespace(
+        prompt_tokens=0,
+        completion_tokens=0,
+        total_tokens=0,
+        prompt_tokens_details=SimpleNamespace(cached_tokens=0),
+    )
+    return SimpleNamespace(
+        id=f"chatcmpl-{uuid.uuid4().hex[:12]}",
+        object="chat.completion",
+        created=int(time.time()),
+        model=model,
+        choices=[choice],
+        usage=usage,
+    )
+
+
+def translate_gemini_response(resp: Dict[str, Any], model: str) -> SimpleNamespace:
+    candidates = resp.get("candidates") or []
+    if not isinstance(candidates, list) or not candidates:
+        return _empty_response(model)
+
+    cand = candidates[0] if isinstance(candidates[0], dict) else {}
+    content_obj = cand.get("content") if isinstance(cand, dict) else {}
+    parts = content_obj.get("parts") if isinstance(content_obj, dict) else []
+
+    text_pieces: List[str] = []
+    reasoning_pieces: List[str] = []
+    tool_calls: List[SimpleNamespace] = []
+
+    for index, part in enumerate(parts or []):
+        if not isinstance(part, dict):
+            continue
+        if part.get("thought") is True and isinstance(part.get("text"), str):
+            reasoning_pieces.append(part["text"])
+            continue
+        if isinstance(part.get("text"), str):
+            text_pieces.append(part["text"])
+            continue
+        fc = part.get("functionCall")
+        if isinstance(fc, dict) and fc.get("name"):
+            try:
+                args_str = json.dumps(fc.get("args") or {}, ensure_ascii=False)
+            except (TypeError, ValueError):
+                args_str = "{}"
+            tool_call = SimpleNamespace(
+                id=f"call_{uuid.uuid4().hex[:12]}",
+                type="function",
+                index=index,
+                function=SimpleNamespace(name=str(fc["name"]), arguments=args_str),
+            )
+            extra_content = _tool_call_extra_from_part(part)
+            if extra_content:
+                tool_call.extra_content = extra_content
+            tool_calls.append(tool_call)
+
+    finish_reason = "tool_calls" if tool_calls else _map_gemini_finish_reason(str(cand.get("finishReason") or ""))
+    usage_meta = resp.get("usageMetadata") or {}
+    usage = SimpleNamespace(
+        prompt_tokens=int(usage_meta.get("promptTokenCount") or 0),
+        completion_tokens=int(usage_meta.get("candidatesTokenCount") or 0),
+        total_tokens=int(usage_meta.get("totalTokenCount") or 0),
+        prompt_tokens_details=SimpleNamespace(
+            cached_tokens=int(usage_meta.get("cachedContentTokenCount") or 0),
+        ),
+    )
+    reasoning = "".join(reasoning_pieces) or None
+    message = SimpleNamespace(
+        role="assistant",
+        content="".join(text_pieces) if text_pieces else None,
+        tool_calls=tool_calls or None,
+        reasoning=reasoning,
+        reasoning_content=reasoning,
+        reasoning_details=None,
+    )
+    choice = SimpleNamespace(index=0, message=message, finish_reason=finish_reason)
+    return SimpleNamespace(
+        id=f"chatcmpl-{uuid.uuid4().hex[:12]}",
+        object="chat.completion",
+        created=int(time.time()),
+        model=model,
+        choices=[choice],
+        usage=usage,
+    )
+
+
+class _GeminiStreamChunk(SimpleNamespace):
+    pass
+
+
+def _make_stream_chunk(
+    *,
+    model: str,
+    content: str = "",
+    tool_call_delta: Optional[Dict[str, Any]] = None,
+    finish_reason: Optional[str] = None,
+    reasoning: str = "",
+) -> _GeminiStreamChunk:
+    delta_kwargs: Dict[str, Any] = {
+        "role": "assistant",
+        "content": None,
+        "tool_calls": None,
+        "reasoning": None,
+        "reasoning_content": None,
+    }
+    if content:
+        delta_kwargs["content"] = content
+    if tool_call_delta is not None:
+        tool_delta = SimpleNamespace(
+            index=tool_call_delta.get("index", 0),
+            id=tool_call_delta.get("id") or f"call_{uuid.uuid4().hex[:12]}",
+            type="function",
+            function=SimpleNamespace(
+                name=tool_call_delta.get("name") or "",
+                arguments=tool_call_delta.get("arguments") or "",
+            ),
+        )
+        extra_content = tool_call_delta.get("extra_content")
+        if isinstance(extra_content, dict):
+            tool_delta.extra_content = extra_content
+        delta_kwargs["tool_calls"] = [tool_delta]
+    if reasoning:
+        delta_kwargs["reasoning"] = reasoning
+        delta_kwargs["reasoning_content"] = reasoning
+    delta = SimpleNamespace(**delta_kwargs)
+    choice = SimpleNamespace(index=0, delta=delta, finish_reason=finish_reason)
+    return _GeminiStreamChunk(
+        id=f"chatcmpl-{uuid.uuid4().hex[:12]}",
+        object="chat.completion.chunk",
+        created=int(time.time()),
+        model=model,
+        choices=[choice],
+        usage=None,
+    )
+
+
+def _iter_sse_events(response: httpx.Response) -> Iterator[Dict[str, Any]]:
+    buffer = ""
+    for chunk in response.iter_text():
+        if not chunk:
+            continue
+        buffer += chunk
+        while "\n" in buffer:
+            line, buffer = buffer.split("\n", 1)
+            line = line.rstrip("\r")
+            if not line:
+                continue
+            if not line.startswith("data: "):
+                continue
+            data = line[6:]
+            if data == "[DONE]":
+                return
+            try:
+                payload = json.loads(data)
+            except json.JSONDecodeError:
+                logger.debug("Non-JSON Gemini SSE line: %s", data[:200])
+                continue
+            if isinstance(payload, dict):
+                yield payload
+
+
+def translate_stream_event(event: Dict[str, Any], model: str, tool_call_indices: Dict[str, Dict[str, Any]]) -> List[_GeminiStreamChunk]:
+    candidates = event.get("candidates") or []
+    if not candidates:
+        return []
+    cand = candidates[0] if isinstance(candidates[0], dict) else {}
+    parts = ((cand.get("content") or {}).get("parts") or []) if isinstance(cand, dict) else []
+    chunks: List[_GeminiStreamChunk] = []
+
+    for part_index, part in enumerate(parts):
+        if not isinstance(part, dict):
+            continue
+        if part.get("thought") is True and isinstance(part.get("text"), str):
+            chunks.append(_make_stream_chunk(model=model, reasoning=part["text"]))
+            continue
+        if isinstance(part.get("text"), str) and part["text"]:
+            chunks.append(_make_stream_chunk(model=model, content=part["text"]))
+        fc = part.get("functionCall")
+        if isinstance(fc, dict) and fc.get("name"):
+            name = str(fc["name"])
+            try:
+                args_str = json.dumps(fc.get("args") or {}, ensure_ascii=False, sort_keys=True)
+            except (TypeError, ValueError):
+                args_str = "{}"
+            thought_signature = part.get("thoughtSignature") if isinstance(part.get("thoughtSignature"), str) else ""
+            call_key = json.dumps(
+                {
+                    "part_index": part_index,
+                    "name": name,
+                    "thought_signature": thought_signature,
+                },
+                sort_keys=True,
+            )
+            slot = tool_call_indices.get(call_key)
+            if slot is None:
+                slot = {
+                    "index": len(tool_call_indices),
+                    "id": f"call_{uuid.uuid4().hex[:12]}",
+                    "last_arguments": "",
+                }
+                tool_call_indices[call_key] = slot
+            emitted_arguments = args_str
+            last_arguments = str(slot.get("last_arguments") or "")
+            if last_arguments:
+                if args_str == last_arguments:
+                    emitted_arguments = ""
+                elif args_str.startswith(last_arguments):
+                    emitted_arguments = args_str[len(last_arguments):]
+            slot["last_arguments"] = args_str
+            chunks.append(
+                _make_stream_chunk(
+                    model=model,
+                    tool_call_delta={
+                        "index": slot["index"],
+                        "id": slot["id"],
+                        "name": name,
+                        "arguments": emitted_arguments,
+                        "extra_content": _tool_call_extra_from_part(part),
+                    },
+                )
+            )
+
+    finish_reason_raw = str(cand.get("finishReason") or "")
+    if finish_reason_raw:
+        mapped = "tool_calls" if tool_call_indices else _map_gemini_finish_reason(finish_reason_raw)
+        chunks.append(_make_stream_chunk(model=model, finish_reason=mapped))
+    return chunks
+
+
+def gemini_http_error(response: httpx.Response) -> GeminiAPIError:
+    status = response.status_code
+    body_text = ""
+    body_json: Dict[str, Any] = {}
+    try:
+        body_text = response.text
+    except Exception:
+        body_text = ""
+    if body_text:
+        try:
+            parsed = json.loads(body_text)
+            if isinstance(parsed, dict):
+                body_json = parsed
+        except (ValueError, TypeError):
+            body_json = {}
+
+    err_obj = body_json.get("error") if isinstance(body_json, dict) else None
+    if not isinstance(err_obj, dict):
+        err_obj = {}
+    err_status = str(err_obj.get("status") or "").strip()
+    err_message = str(err_obj.get("message") or "").strip()
+    details_list = err_obj.get("details") if isinstance(err_obj.get("details"), list) else []
+
+    reason = ""
+    retry_after: Optional[float] = None
+    metadata: Dict[str, Any] = {}
+    for detail in details_list:
+        if not isinstance(detail, dict):
+            continue
+        type_url = str(detail.get("@type") or "")
+        if not reason and type_url.endswith("/google.rpc.ErrorInfo"):
+            reason_value = detail.get("reason")
+            if isinstance(reason_value, str):
+                reason = reason_value
+            md = detail.get("metadata")
+            if isinstance(md, dict):
+                metadata = md
+    header_retry = response.headers.get("Retry-After") or response.headers.get("retry-after")
+    if header_retry:
+        try:
+            retry_after = float(header_retry)
+        except (TypeError, ValueError):
+            retry_after = None
+
+    code = f"gemini_http_{status}"
+    if status == 401:
+        code = "gemini_unauthorized"
+    elif status == 429:
+        code = "gemini_rate_limited"
+    elif status == 404:
+        code = "gemini_model_not_found"
+
+    if err_message:
+        message = f"Gemini HTTP {status} ({err_status or 'error'}): {err_message}"
+    else:
+        message = f"Gemini returned HTTP {status}: {body_text[:500]}"
+
+    return GeminiAPIError(
+        message,
+        code=code,
+        status_code=status,
+        response=response,
+        retry_after=retry_after,
+        details={
+            "status": err_status,
+            "reason": reason,
+            "metadata": metadata,
+            "message": err_message,
+        },
+    )
+
+
+class _GeminiChatCompletions:
+    def __init__(self, client: "GeminiNativeClient"):
+        self._client = client
+
+    def create(self, **kwargs: Any) -> Any:
+        return self._client._create_chat_completion(**kwargs)
+
+
+class _AsyncGeminiChatCompletions:
+    def __init__(self, client: "AsyncGeminiNativeClient"):
+        self._client = client
+
+    async def create(self, **kwargs: Any) -> Any:
+        return await self._client._create_chat_completion(**kwargs)
+
+
+class _GeminiChatNamespace:
+    def __init__(self, client: "GeminiNativeClient"):
+        self.completions = _GeminiChatCompletions(client)
+
+
+class _AsyncGeminiChatNamespace:
+    def __init__(self, client: "AsyncGeminiNativeClient"):
+        self.completions = _AsyncGeminiChatCompletions(client)
+
+
+class GeminiNativeClient:
+    """Minimal OpenAI-SDK-compatible facade over Gemini's native REST API."""
+
+    def __init__(
+        self,
+        *,
+        api_key: str,
+        base_url: Optional[str] = None,
+        default_headers: Optional[Dict[str, str]] = None,
+        timeout: Any = None,
+        http_client: Optional[httpx.Client] = None,
+        **_: Any,
+    ) -> None:
+        self.api_key = api_key
+        normalized_base = (base_url or DEFAULT_GEMINI_BASE_URL).rstrip("/")
+        if normalized_base.endswith("/openai"):
+            normalized_base = normalized_base[: -len("/openai")]
+        self.base_url = normalized_base
+        self._default_headers = dict(default_headers or {})
+        self.chat = _GeminiChatNamespace(self)
+        self.is_closed = False
+        self._http = http_client or httpx.Client(
+            timeout=timeout or httpx.Timeout(connect=15.0, read=600.0, write=30.0, pool=30.0)
+        )
+
+    def close(self) -> None:
+        self.is_closed = True
+        try:
+            self._http.close()
+        except Exception:
+            pass
+
+    def __enter__(self):
+        return self
+
+    def __exit__(self, exc_type, exc_val, exc_tb):
+        self.close()
+
+    def _headers(self) -> Dict[str, str]:
+        headers = {
+            "Content-Type": "application/json",
+            "Accept": "application/json",
+            "x-goog-api-key": self.api_key,
+            "User-Agent": "hermes-agent (gemini-native)",
+        }
+        headers.update(self._default_headers)
+        return headers
+
+    @staticmethod
+    def _advance_stream_iterator(iterator: Iterator[_GeminiStreamChunk]) -> tuple[bool, Optional[_GeminiStreamChunk]]:
+        try:
+            return False, next(iterator)
+        except StopIteration:
+            return True, None
+
+    def _create_chat_completion(
+        self,
+        *,
+        model: str = "gemini-2.5-flash",
+        messages: Optional[List[Dict[str, Any]]] = None,
+        stream: bool = False,
+        tools: Any = None,
+        tool_choice: Any = None,
+        temperature: Optional[float] = None,
+        max_tokens: Optional[int] = None,
+        top_p: Optional[float] = None,
+        stop: Any = None,
+        extra_body: Optional[Dict[str, Any]] = None,
+        timeout: Any = None,
+        **_: Any,
+    ) -> Any:
+        thinking_config = None
+        if isinstance(extra_body, dict):
+            thinking_config = extra_body.get("thinking_config") or extra_body.get("thinkingConfig")
+
+        request = build_gemini_request(
+            messages=messages or [],
+            tools=tools,
+            tool_choice=tool_choice,
+            temperature=temperature,
+            max_tokens=max_tokens,
+            top_p=top_p,
+            stop=stop,
+            thinking_config=thinking_config,
+        )
+
+        if stream:
+            return self._stream_completion(model=model, request=request, timeout=timeout)
+
+        url = f"{self.base_url}/models/{model}:generateContent"
+        response = self._http.post(url, json=request, headers=self._headers(), timeout=timeout)
+        if response.status_code != 200:
+            raise gemini_http_error(response)
+        try:
+            payload = response.json()
+        except ValueError as exc:
+            raise GeminiAPIError(
+                f"Invalid JSON from Gemini native API: {exc}",
+                code="gemini_invalid_json",
+                status_code=response.status_code,
+                response=response,
+            ) from exc
+        return translate_gemini_response(payload, model=model)
+
+    def _stream_completion(self, *, model: str, request: Dict[str, Any], timeout: Any = None) -> Iterator[_GeminiStreamChunk]:
+        url = f"{self.base_url}/models/{model}:streamGenerateContent?alt=sse"
+        stream_headers = dict(self._headers())
+        stream_headers["Accept"] = "text/event-stream"
+
+        def _generator() -> Iterator[_GeminiStreamChunk]:
+            try:
+                with self._http.stream("POST", url, json=request, headers=stream_headers, timeout=timeout) as response:
+                    if response.status_code != 200:
+                        response.read()
+                        raise gemini_http_error(response)
+                    tool_call_indices: Dict[str, Dict[str, Any]] = {}
+                    for event in _iter_sse_events(response):
+                        for chunk in translate_stream_event(event, model, tool_call_indices):
+                            yield chunk
+            except httpx.HTTPError as exc:
+                raise GeminiAPIError(
+                    f"Gemini streaming request failed: {exc}",
+                    code="gemini_stream_error",
+                ) from exc
+
+        return _generator()
+
+
+class AsyncGeminiNativeClient:
+    """Async wrapper used by auxiliary_client for native Gemini calls."""
+
+    def __init__(self, sync_client: GeminiNativeClient):
+        self._sync = sync_client
+        self.api_key = sync_client.api_key
+        self.base_url = sync_client.base_url
+        self.chat = _AsyncGeminiChatNamespace(self)
+
+    async def _create_chat_completion(self, **kwargs: Any) -> Any:
+        stream = bool(kwargs.get("stream"))
+        result = await asyncio.to_thread(self._sync.chat.completions.create, **kwargs)
+        if not stream:
+            return result
+
+        async def _async_stream() -> Any:
+            while True:
+                done, chunk = await asyncio.to_thread(self._sync._advance_stream_iterator, result)
+                if done:
+                    break
+                yield chunk
+
+        return _async_stream()
+
+    async def close(self) -> None:
+        await asyncio.to_thread(self._sync.close)
@@ -0,0 +1,85 @@
+"""Helpers for translating OpenAI-style tool schemas to Gemini's schema subset."""
+
+from __future__ import annotations
+
+from typing import Any, Dict, List
+
+# Gemini's ``FunctionDeclaration.parameters`` field accepts the ``Schema``
+# object, which is only a subset of OpenAPI 3.0 / JSON Schema.  Strip fields
+# outside that subset before sending Hermes tool schemas to Google.
+_GEMINI_SCHEMA_ALLOWED_KEYS = {
+    "type",
+    "format",
+    "title",
+    "description",
+    "nullable",
+    "enum",
+    "maxItems",
+    "minItems",
+    "properties",
+    "required",
+    "minProperties",
+    "maxProperties",
+    "minLength",
+    "maxLength",
+    "pattern",
+    "example",
+    "anyOf",
+    "propertyOrdering",
+    "default",
+    "items",
+    "minimum",
+    "maximum",
+}
+
+
+def sanitize_gemini_schema(schema: Any) -> Dict[str, Any]:
+    """Return a Gemini-compatible copy of a tool parameter schema.
+
+    Hermes tool schemas are OpenAI-flavored JSON Schema and may contain keys
+    such as ``$schema`` or ``additionalProperties`` that Google's Gemini
+    ``Schema`` object rejects.  This helper preserves the documented Gemini
+    subset and recursively sanitizes nested ``properties`` / ``items`` /
+    ``anyOf`` definitions.
+    """
+
+    if not isinstance(schema, dict):
+        return {}
+
+    cleaned: Dict[str, Any] = {}
+    for key, value in schema.items():
+        if key not in _GEMINI_SCHEMA_ALLOWED_KEYS:
+            continue
+        if key == "properties":
+            if not isinstance(value, dict):
+                continue
+            props: Dict[str, Any] = {}
+            for prop_name, prop_schema in value.items():
+                if not isinstance(prop_name, str):
+                    continue
+                props[prop_name] = sanitize_gemini_schema(prop_schema)
+            cleaned[key] = props
+            continue
+        if key == "items":
+            cleaned[key] = sanitize_gemini_schema(value)
+            continue
+        if key == "anyOf":
+            if not isinstance(value, list):
+                continue
+            cleaned[key] = [
+                sanitize_gemini_schema(item)
+                for item in value
+                if isinstance(item, dict)
+            ]
+            continue
+        cleaned[key] = value
+    return cleaned
+
+
+def sanitize_gemini_tool_parameters(parameters: Any) -> Dict[str, Any]:
+    """Normalize tool parameters to a valid Gemini object schema."""
+
+    cleaned = sanitize_gemini_schema(parameters)
+    if not cleaned:
+        return {"type": "object", "properties": {}}
+    return cleaned
@@ -420,7 +420,10 @@ def list_provider_models(provider: str) -> List[str]:
    models = _get_provider_models(provider)
    if models is None:
        return []
-    return list(models.keys())
+    return [
+        mid for mid in models.keys()
+        if not _should_hide_from_provider_catalog(provider, mid)
+    ]


 # Patterns that indicate non-agentic or noise models (TTS, embedding,
@@ -432,6 +435,43 @@ _NOISE_PATTERNS: re.Pattern = re.compile(
    re.IGNORECASE,
 )

+# Google's live Gemini catalogs currently include a mix of stale slugs and
+# Gemma models whose TPM quotas are too small for normal Hermes agent traffic.
+# Keep capability metadata available for direct/manual use, but hide these from
+# the Gemini model catalogs we surface in setup and model selection.
+_GOOGLE_HIDDEN_MODELS = frozenset({
+    # Low-TPM Gemma models that trip Google input-token quota walls under
+    # agent-style traffic despite advertising large context windows.
+    "gemma-4-31b-it",
+    "gemma-4-26b-it",
+    "gemma-4-26b-a4b-it",
+    "gemma-3-1b",
+    "gemma-3-1b-it",
+    "gemma-3-2b",
+    "gemma-3-2b-it",
+    "gemma-3-4b",
+    "gemma-3-4b-it",
+    "gemma-3-12b",
+    "gemma-3-12b-it",
+    "gemma-3-27b",
+    "gemma-3-27b-it",
+    # Stale/retired Google slugs that still surface through models.dev-backed
+    # Gemini selection but 404 on the current Google endpoints.
+    "gemini-1.5-flash",
+    "gemini-1.5-pro",
+    "gemini-1.5-flash-8b",
+    "gemini-2.0-flash",
+    "gemini-2.0-flash-lite",
+})
+
+
+def _should_hide_from_provider_catalog(provider: str, model_id: str) -> bool:
+    provider_lower = (provider or "").strip().lower()
+    model_lower = (model_id or "").strip().lower()
+    if provider_lower in {"gemini", "google"} and model_lower in _GOOGLE_HIDDEN_MODELS:
+        return True
+    return False
+

 def list_agentic_models(provider: str) -> List[str]:
    """Return model IDs suitable for agentic use from models.dev.
@@ -448,6 +488,8 @@ def list_agentic_models(provider: str) -> List[str]:
    for mid, entry in models.items():
        if not isinstance(entry, dict):
            continue
+        if _should_hide_from_provider_catalog(provider, mid):
+            continue
        if not entry.get("tool_call", False):
            continue
        if _NOISE_PATTERNS.search(mid):
@@ -582,5 +624,3 @@ def get_model_info(
            return _parse_model_info(mid, mdata, mdev_id)

    return None
-
-
@@ -152,7 +152,13 @@ MEMORY_GUIDANCE = (
    "Do NOT save task progress, session outcomes, completed-work logs, or temporary TODO "
    "state to memory; use session_search to recall those from past transcripts. "
    "If you've discovered a new way to do something, solved a problem that could be "
-    "necessary later, save it as a skill with the skill tool."
+    "necessary later, save it as a skill with the skill tool.\n"
+    "Write memories as declarative facts, not instructions to yourself. "
+    "'User prefers concise responses' ✓ — 'Always respond concisely' ✗. "
+    "'Project uses pytest with xdist' ✓ — 'Run tests with pytest -n 4' ✗. "
+    "Imperative phrasing gets re-read as a directive in later sessions and can "
+    "cause repeated work or override the user's current request. Procedures and "
+    "workflows belong in skills, not memory."
 )

 SESSION_SEARCH_GUIDANCE = (
@@ -613,12 +619,14 @@ def build_skills_system_prompt(
        or get_session_env("HERMES_SESSION_PLATFORM")
        or ""
    )
+    disabled = get_disabled_skill_names()
    cache_key = (
        str(skills_dir.resolve()),
        tuple(str(d) for d in external_dirs),
        tuple(sorted(str(t) for t in (available_tools or set()))),
        tuple(sorted(str(ts) for ts in (available_toolsets or set()))),
        _platform_hint,
+        tuple(sorted(disabled)),
    )
    with _SKILLS_PROMPT_CACHE_LOCK:
        cached = _SKILLS_PROMPT_CACHE.get(cache_key)
@@ -626,8 +634,6 @@ def build_skills_system_prompt(
            _SKILLS_PROMPT_CACHE.move_to_end(cache_key)
            return cached

-    disabled = get_disabled_skill_names()
-
    # ── Layer 2: disk snapshot ────────────────────────────────────────
    snapshot = _load_skills_snapshot(skills_dir)

@@ -1,195 +0,0 @@
-"""Helpers for optional cheap-vs-strong model routing."""
-
-from __future__ import annotations
-
-import os
-import re
-from typing import Any, Dict, Optional
-
-from utils import is_truthy_value
-
-_COMPLEX_KEYWORDS = {
-    "debug",
-    "debugging",
-    "implement",
-    "implementation",
-    "refactor",
-    "patch",
-    "traceback",
-    "stacktrace",
-    "exception",
-    "error",
-    "analyze",
-    "analysis",
-    "investigate",
-    "architecture",
-    "design",
-    "compare",
-    "benchmark",
-    "optimize",
-    "optimise",
-    "review",
-    "terminal",
-    "shell",
-    "tool",
-    "tools",
-    "pytest",
-    "test",
-    "tests",
-    "plan",
-    "planning",
-    "delegate",
-    "subagent",
-    "cron",
-    "docker",
-    "kubernetes",
-}
-
-_URL_RE = re.compile(r"https?://|www\.", re.IGNORECASE)
-
-
-def _coerce_bool(value: Any, default: bool = False) -> bool:
-    return is_truthy_value(value, default=default)
-
-
-def _coerce_int(value: Any, default: int) -> int:
-    try:
-        return int(value)
-    except (TypeError, ValueError):
-        return default
-
-
-def choose_cheap_model_route(user_message: str, routing_config: Optional[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
-    """Return the configured cheap-model route when a message looks simple.
-
-    Conservative by design: if the message has signs of code/tool/debugging/
-    long-form work, keep the primary model.
-    """
-    cfg = routing_config or {}
-    if not _coerce_bool(cfg.get("enabled"), False):
-        return None
-
-    cheap_model = cfg.get("cheap_model") or {}
-    if not isinstance(cheap_model, dict):
-        return None
-    provider = str(cheap_model.get("provider") or "").strip().lower()
-    model = str(cheap_model.get("model") or "").strip()
-    if not provider or not model:
-        return None
-
-    text = (user_message or "").strip()
-    if not text:
-        return None
-
-    max_chars = _coerce_int(cfg.get("max_simple_chars"), 160)
-    max_words = _coerce_int(cfg.get("max_simple_words"), 28)
-
-    if len(text) > max_chars:
-        return None
-    if len(text.split()) > max_words:
-        return None
-    if text.count("\n") > 1:
-        return None
-    if "```" in text or "`" in text:
-        return None
-    if _URL_RE.search(text):
-        return None
-
-    lowered = text.lower()
-    words = {token.strip(".,:;!?()[]{}\"'`") for token in lowered.split()}
-    if words & _COMPLEX_KEYWORDS:
-        return None
-
-    route = dict(cheap_model)
-    route["provider"] = provider
-    route["model"] = model
-    route["routing_reason"] = "simple_turn"
-    return route
-
-
-def resolve_turn_route(user_message: str, routing_config: Optional[Dict[str, Any]], primary: Dict[str, Any]) -> Dict[str, Any]:
-    """Resolve the effective model/runtime for one turn.
-
-    Returns a dict with model/runtime/signature/label fields.
-    """
-    route = choose_cheap_model_route(user_message, routing_config)
-    if not route:
-        return {
-            "model": primary.get("model"),
-            "runtime": {
-                "api_key": primary.get("api_key"),
-                "base_url": primary.get("base_url"),
-                "provider": primary.get("provider"),
-                "api_mode": primary.get("api_mode"),
-                "command": primary.get("command"),
-                "args": list(primary.get("args") or []),
-                "credential_pool": primary.get("credential_pool"),
-            },
-            "label": None,
-            "signature": (
-                primary.get("model"),
-                primary.get("provider"),
-                primary.get("base_url"),
-                primary.get("api_mode"),
-                primary.get("command"),
-                tuple(primary.get("args") or ()),
-            ),
-        }
-
-    from hermes_cli.runtime_provider import resolve_runtime_provider
-
-    explicit_api_key = None
-    api_key_env = str(route.get("api_key_env") or "").strip()
-    if api_key_env:
-        explicit_api_key = os.getenv(api_key_env) or None
-
-    try:
-        runtime = resolve_runtime_provider(
-            requested=route.get("provider"),
-            explicit_api_key=explicit_api_key,
-            explicit_base_url=route.get("base_url"),
-        )
-    except Exception:
-        return {
-            "model": primary.get("model"),
-            "runtime": {
-                "api_key": primary.get("api_key"),
-                "base_url": primary.get("base_url"),
-                "provider": primary.get("provider"),
-                "api_mode": primary.get("api_mode"),
-                "command": primary.get("command"),
-                "args": list(primary.get("args") or []),
-                "credential_pool": primary.get("credential_pool"),
-            },
-            "label": None,
-            "signature": (
-                primary.get("model"),
-                primary.get("provider"),
-                primary.get("base_url"),
-                primary.get("api_mode"),
-                primary.get("command"),
-                tuple(primary.get("args") or ()),
-            ),
-        }
-
-    return {
-        "model": route.get("model"),
-        "runtime": {
-            "api_key": runtime.get("api_key"),
-            "base_url": runtime.get("base_url"),
-            "provider": runtime.get("provider"),
-            "api_mode": runtime.get("api_mode"),
-            "command": runtime.get("command"),
-            "args": list(runtime.get("args") or []),
-            "credential_pool": runtime.get("credential_pool"),
-        },
-        "label": f"smart route → {route.get('model')} ({runtime.get('provider')})",
-        "signature": (
-            route.get("model"),
-            runtime.get("provider"),
-            runtime.get("base_url"),
-            runtime.get("api_mode"),
-            runtime.get("command"),
-            tuple(runtime.get("args") or ()),
-        ),
-    }
@@ -63,7 +63,38 @@ model:
  #   Leave unset to use the model's native output ceiling (recommended).
  #   Set only if you want to deliberately limit individual response length.
  #
-  # max_tokens: 8192
+# max_tokens: 8192
+
+# Named provider overrides (optional)
+# Use this for per-provider request timeouts, non-stream stale timeouts,
+# and per-model exceptions.
+# Applies to the primary turn client on every api_mode (OpenAI-wire, native
+# Anthropic, and Anthropic-compatible providers), the fallback chain, and
+# client rebuilds during credential rotation.  For OpenAI-wire chat
+# completions (streaming and non-streaming) the configured value is also
+# used as the per-request ``timeout=`` kwarg so it wins over the legacy
+# HERMES_API_TIMEOUT env var (which still applies when no config is set).
+# ``stale_timeout_seconds`` controls the non-streaming stale-call detector and
+# wins over the legacy HERMES_API_CALL_STALE_TIMEOUT env var. Leaving these
+# unset keeps the legacy defaults (HERMES_API_TIMEOUT=1800s,
+# HERMES_API_CALL_STALE_TIMEOUT=300s, native Anthropic 900s).
+#
+# Not currently wired for AWS Bedrock (bedrock_converse + AnthropicBedrock
+# SDK paths) — those use boto3 with its own timeout configuration.
+#
+# providers:
+#   ollama-local:
+#     request_timeout_seconds: 300   # Longer timeout for local cold-starts
+#     stale_timeout_seconds: 900     # Explicitly re-enable stale detection on local endpoints
+#   anthropic:
+#     request_timeout_seconds: 30    # Fast-fail cloud requests
+#     models:
+#       claude-opus-4.6:
+#         timeout_seconds: 600       # Longer timeout for extended-thinking Opus calls
+#   openai-codex:
+#     models:
+#       gpt-5.4:
+#         stale_timeout_seconds: 1800  # Longer non-stream stale timeout for slow large-context turns

 # =============================================================================
 # OpenRouter Provider Routing (only applies when using OpenRouter)
@@ -91,20 +122,6 @@ model:
 #   # Data policy: "allow" (default) or "deny" to exclude providers that may store data
 #   # data_collection: "deny"

-# =============================================================================
-# Smart Model Routing (optional)
-# =============================================================================
-# Use a cheaper model for short/simple turns while keeping your main model for
-# more complex requests. Disabled by default.
-#
-# smart_model_routing:
-#   enabled: true
-#   max_simple_chars: 160
-#   max_simple_words: 28
-#   cheap_model:
-#     provider: openrouter
-#     model: google/gemini-2.5-flash
-
 # =============================================================================
 # Git Worktree Isolation
 # =============================================================================
@@ -357,6 +374,18 @@ compression:
 #   web_extract:
 #     provider: "auto"
 #     model: ""
+#
+#   # Session search — summarizes matching past sessions
+#   session_search:
+#     provider: "auto"
+#     model: ""
+#     timeout: 30
+#     max_concurrency: 3    # Limit parallel summaries to reduce request-burst 429s
+#     extra_body: {}        # Provider-specific OpenAI-compatible request fields
+#                           # Example for providers that support request-body
+#                           # reasoning controls:
+#                           # extra_body:
+#                           #   enable_thinking: false

 # =============================================================================
 # Persistent Memory
@@ -564,15 +564,53 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
        return False, f"Script execution failed: {exc}"


-def _build_job_prompt(job: dict) -> str:
-    """Build the effective prompt for a cron job, optionally loading one or more skills first."""
+def _parse_wake_gate(script_output: str) -> bool:
+    """Parse the last non-empty stdout line of a cron job's pre-check script
+    as a wake gate.
+
+    The convention (ported from nanoclaw #1232): if the last stdout line is
+    JSON like ``{"wakeAgent": false}``, the agent is skipped entirely — no
+    LLM run, no delivery. Any other output (non-JSON, missing flag, gate
+    absent, or ``wakeAgent: true``) means wake the agent normally.
+
+    Returns True if the agent should wake, False to skip.
+    """
+    if not script_output:
+        return True
+    stripped_lines = [line for line in script_output.splitlines() if line.strip()]
+    if not stripped_lines:
+        return True
+    last_line = stripped_lines[-1].strip()
+    try:
+        gate = json.loads(last_line)
+    except (json.JSONDecodeError, ValueError):
+        return True
+    if not isinstance(gate, dict):
+        return True
+    return gate.get("wakeAgent", True) is not False
+
+
+def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
+    """Build the effective prompt for a cron job, optionally loading one or more skills first.
+
+    Args:
+        job: The cron job dict.
+        prerun_script: Optional ``(success, stdout)`` from a script that has
+            already been executed by the caller (e.g. for a wake-gate check).
+            When provided, the script is not re-executed and the cached
+            result is used for prompt injection. When omitted, the script
+            (if any) runs inline as before.
+    """
    prompt = job.get("prompt", "")
    skills = job.get("skills")

    # Run data-collection script if configured, inject output as context.
    script_path = job.get("script")
    if script_path:
-        success, script_output = _run_job_script(script_path)
+        if prerun_script is not None:
+            success, script_output = prerun_script
+        else:
+            success, script_output = _run_job_script(script_path)
        if success:
            if script_output:
                prompt = (
@@ -674,13 +712,41 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
    
    job_id = job["id"]
    job_name = job["name"]
-    prompt = _build_job_prompt(job)
+
+    # Wake-gate: if this job has a pre-check script, run it BEFORE building
+    # the prompt so a ``{"wakeAgent": false}`` response can short-circuit
+    # the whole agent run. We pass the result into _build_job_prompt so
+    # the script is only executed once.
+    prerun_script = None
+    script_path = job.get("script")
+    if script_path:
+        prerun_script = _run_job_script(script_path)
+        _ran_ok, _script_output = prerun_script
+        if _ran_ok and not _parse_wake_gate(_script_output):
+            logger.info(
+                "Job '%s' (ID: %s): wakeAgent=false, skipping agent run",
+                job_name, job_id,
+            )
+            silent_doc = (
+                f"# Cron Job: {job_name}\n\n"
+                f"**Job ID:** {job_id}\n"
+                f"**Run Time:** {_hermes_now().strftime('%Y-%m-%d %H:%M:%S')}\n\n"
+                "Script gate returned `wakeAgent=false` — agent skipped.\n"
+            )
+            return True, silent_doc, SILENT_MARKER, None
+
+    prompt = _build_job_prompt(job, prerun_script=prerun_script)
    origin = _resolve_origin(job)
    _cron_session_id = f"cron_{job_id}_{_hermes_now().strftime('%Y%m%d_%H%M%S')}"

    logger.info("Running job '%s' (ID: %s)", job_name, job_id)
    logger.info("Prompt: %s", prompt[:100])

+    # Mark this as a cron session so the approval system can apply cron_mode.
+    # This env var is process-wide and persists for the lifetime of the
+    # scheduler process — every job this process runs is a cron job.
+    os.environ["HERMES_CRON_SESSION"] = "1"
+
    try:
        # Inject origin context so the agent's send_message tool knows the chat.
        # Must be INSIDE the try block so the finally cleanup always runs.
@@ -760,7 +826,6 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:

        # Provider routing
        pr = _cfg.get("provider_routing", {})
-        smart_routing = _cfg.get("smart_model_routing", {}) or {}

        from hermes_cli.runtime_provider import (
            resolve_runtime_provider,
@@ -777,24 +842,9 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
            message = format_runtime_provider_error(exc)
            raise RuntimeError(message) from exc

-        from agent.smart_model_routing import resolve_turn_route
-        turn_route = resolve_turn_route(
-            prompt,
-            smart_routing,
-            {
-                "model": model,
-                "api_key": runtime.get("api_key"),
-                "base_url": runtime.get("base_url"),
-                "provider": runtime.get("provider"),
-                "api_mode": runtime.get("api_mode"),
-                "command": runtime.get("command"),
-                "args": list(runtime.get("args") or []),
-            },
-        )
-
        fallback_model = _cfg.get("fallback_providers") or _cfg.get("fallback_model") or None
        credential_pool = None
-        runtime_provider = str(turn_route["runtime"].get("provider") or "").strip().lower()
+        runtime_provider = str(runtime.get("provider") or "").strip().lower()
        if runtime_provider:
            try:
                from agent.credential_pool import load_pool
@@ -811,13 +861,13 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
                logger.debug("Job '%s': failed to load credential pool for %s: %s", job_id, runtime_provider, e)

        agent = AIAgent(
-            model=turn_route["model"],
-            api_key=turn_route["runtime"].get("api_key"),
-            base_url=turn_route["runtime"].get("base_url"),
-            provider=turn_route["runtime"].get("provider"),
-            api_mode=turn_route["runtime"].get("api_mode"),
-            acp_command=turn_route["runtime"].get("command"),
-            acp_args=turn_route["runtime"].get("args"),
+            model=model,
+            api_key=runtime.get("api_key"),
+            base_url=runtime.get("base_url"),
+            provider=runtime.get("provider"),
+            api_mode=runtime.get("api_mode"),
+            acp_command=runtime.get("command"),
+            acp_args=runtime.get("args"),
            max_iterations=max_iterations,
            reasoning_config=reasoning_config,
            prefill_messages=prefill_messages,
@@ -1,228 +0,0 @@
-# Hermes Agent — ACP (Agent Client Protocol) Setup Guide
-
-Hermes Agent supports the **Agent Client Protocol (ACP)**, allowing it to run as
-a coding agent inside your editor. ACP lets your IDE send tasks to Hermes, and
-Hermes responds with file edits, terminal commands, and explanations — all shown
-natively in the editor UI.
-
---
-
-## Prerequisites
-
- Hermes Agent installed and configured (`hermes setup` completed)
- An API key / provider set up in `~/.hermes/.env` or via `hermes login`
- Python 3.11+
-
-Install the ACP extra:
-
-```bash
-pip install -e ".[acp]"
-```
-
---
-
-## VS Code Setup
-
-### 1. Install the ACP Client extension
-
-Open VS Code and install **ACP Client** from the marketplace:
-
- Press `Ctrl+Shift+X` (or `Cmd+Shift+X` on macOS)
- Search for **"ACP Client"**
- Click **Install**
-
-Or install from the command line:
-
-```bash
-code --install-extension anysphere.acp-client
-```
-
-### 2. Configure settings.json
-
-Open your VS Code settings (`Ctrl+,` → click the `{}` icon for JSON) and add:
-
-```json
-{
-  "acpClient.agents": [
-    {
-      "name": "hermes-agent",
-      "registryDir": "/path/to/hermes-agent/acp_registry"
-    }
-  ]
-}
-```
-
-Replace `/path/to/hermes-agent` with the actual path to your Hermes Agent
-installation (e.g. `~/.hermes/hermes-agent`).
-
-Alternatively, if `hermes` is on your PATH, the ACP Client can discover it
-automatically via the registry directory.
-
-### 3. Restart VS Code
-
-After configuring, restart VS Code. You should see **Hermes Agent** appear in
-the ACP agent picker in the chat/agent panel.
-
---
-
-## Zed Setup
-
-Zed has built-in ACP support.
-
-### 1. Configure Zed settings
-
-Open Zed settings (`Cmd+,` on macOS or `Ctrl+,` on Linux) and add to your
-`settings.json`:
-
-```json
-{
-  "agent_servers": {
-    "hermes-agent": {
-      "type": "custom",
-      "command": "hermes",
-      "args": ["acp"],
-    },
-  },
-}
-```
-
-### 2. Restart Zed
-
-Hermes Agent will appear in the agent panel. Select it and start a conversation.
-
---
-
-## JetBrains Setup (IntelliJ, PyCharm, WebStorm, etc.)
-
-### 1. Install the ACP plugin
-
- Open **Settings** → **Plugins** → **Marketplace**
- Search for **"ACP"** or **"Agent Client Protocol"**
- Install and restart the IDE
-
-### 2. Configure the agent
-
- Open **Settings** → **Tools** → **ACP Agents**
- Click **+** to add a new agent
- Set the registry directory to your `acp_registry/` folder:
-  `/path/to/hermes-agent/acp_registry`
- Click **OK**
-
-### 3. Use the agent
-
-Open the ACP panel (usually in the right sidebar) and select **Hermes Agent**.
-
---
-
-## What You Will See
-
-Once connected, your editor provides a native interface to Hermes Agent:
-
-### Chat Panel
-A conversational interface where you can describe tasks, ask questions, and
-give instructions. Hermes responds with explanations and actions.
-
-### File Diffs
-When Hermes edits files, you see standard diffs in the editor. You can:
- **Accept** individual changes
- **Reject** changes you don't want
- **Review** the full diff before applying
-
-### Terminal Commands
-When Hermes needs to run shell commands (builds, tests, installs), the editor
-shows them in an integrated terminal. Depending on your settings:
- Commands may run automatically
- Or you may be prompted to **approve** each command
-
-### Approval Flow
-For potentially destructive operations, the editor will prompt you for
-approval before Hermes proceeds. This includes:
- File deletions
- Shell commands
- Git operations
-
---
-
-## Configuration
-
-Hermes Agent under ACP uses the **same configuration** as the CLI:
-
- **API keys / providers**: `~/.hermes/.env`
- **Agent config**: `~/.hermes/config.yaml`
- **Skills**: `~/.hermes/skills/`
- **Sessions**: `~/.hermes/state.db`
-
-You can run `hermes setup` to configure providers, or edit `~/.hermes/.env`
-directly.
-
-### Changing the model
-
-Edit `~/.hermes/config.yaml`:
-
-```yaml
-model: openrouter/nous/hermes-3-llama-3.1-70b
-```
-
-Or set the `HERMES_MODEL` environment variable.
-
-### Toolsets
-
-ACP sessions use the curated `hermes-acp` toolset by default. It is designed for editor workflows and intentionally excludes things like messaging delivery, cronjob management, and audio-first UX features.
-
---
-
-## Troubleshooting
-
-### Agent doesn't appear in the editor
-
-1. **Check the registry path** — make sure the `acp_registry/` directory path
-   in your editor settings is correct and contains `agent.json`.
-2. **Check `hermes` is on PATH** — run `which hermes` in a terminal. If not
-   found, you may need to activate your virtualenv or add it to PATH.
-3. **Restart the editor** after changing settings.
-
-### Agent starts but errors immediately
-
-1. Run `hermes doctor` to check your configuration.
-2. Check that you have a valid API key: `hermes status`
-3. Try running `hermes acp` directly in a terminal to see error output.
-
-### "Module not found" errors
-
-Make sure you installed the ACP extra:
-
-```bash
-pip install -e ".[acp]"
-```
-
-### Slow responses
-
- ACP streams responses, so you should see incremental output. If the agent
-  appears stuck, check your network connection and API provider status.
- Some providers have rate limits. Try switching to a different model/provider.
-
-### Permission denied for terminal commands
-
-If the editor blocks terminal commands, check your ACP Client extension
-settings for auto-approval or manual-approval preferences.
-
-### Logs
-
-Hermes logs are written to stderr when running in ACP mode. Check:
- VS Code: **Output** panel → select **ACP Client** or **Hermes Agent**
- Zed: **View** → **Toggle Terminal** and check the process output
- JetBrains: **Event Log** or the ACP tool window
-
-You can also enable verbose logging:
-
-```bash
-HERMES_LOG_LEVEL=DEBUG hermes acp
-```
-
---
-
-## Further Reading
-
- [ACP Specification](https://github.com/anysphere/acp)
- [Hermes Agent Documentation](https://github.com/NousResearch/hermes-agent)
- Run `hermes --help` for all CLI options
@@ -1,698 +0,0 @@
-<!DOCTYPE html>
-<html lang="en">
-<head>
-<meta charset="UTF-8">
-<meta name="viewport" content="width=device-width, initial-scale=1.0">
-<title>honcho-integration-spec</title>
-<style>
-  :root {
-    --bg:             #0b0e14;
-    --bg-surface:     #11151c;
-    --bg-elevated:    #181d27;
-    --bg-code:        #0d1018;
-    --fg:             #c9d1d9;
-    --fg-bright:      #e6edf3;
-    --fg-muted:       #6e7681;
-    --fg-subtle:      #484f58;
-    --accent:         #7eb8f6;
-    --accent-dim:     #3d6ea5;
-    --accent-glow:    rgba(126, 184, 246, 0.08);
-    --green:          #7ee6a8;
-    --green-dim:      #2ea04f;
-    --orange:         #e6a855;
-    --red:            #f47067;
-    --purple:         #bc8cff;
-    --cyan:           #56d4dd;
-    --border:         #21262d;
-    --border-subtle:  #161b22;
-    --radius:         6px;
-    --font-sans:      'New York', ui-serif, 'Iowan Old Style', 'Apple Garamond', Baskerville, 'Times New Roman', 'Noto Emoji', serif;
-    --font-mono:      'Departure Mono', 'Noto Emoji', monospace;
-  }
-
-  *, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }
-  html { scroll-behavior: smooth; scroll-padding-top: 2rem; }
-  body {
-    font-family: var(--font-sans);
-    background: var(--bg);
-    color: var(--fg);
-    line-height: 1.7;
-    font-size: 15px;
-    -webkit-font-smoothing: antialiased;
-  }
-
-  .container { max-width: 860px; margin: 0 auto; padding: 3rem 2rem 6rem; }
-
-  .hero {
-    text-align: center;
-    padding: 4rem 0 3rem;
-    border-bottom: 1px solid var(--border);
-    margin-bottom: 3rem;
-  }
-  .hero h1 { font-family: var(--font-mono); font-size: 2.2rem; font-weight: 700; color: var(--fg-bright); letter-spacing: -0.03em; margin-bottom: 0.5rem; }
-  .hero h1 span { color: var(--accent); }
-  .hero .subtitle { font-family: var(--font-sans); color: var(--fg-muted); font-size: 0.92rem; max-width: 560px; margin: 0 auto; line-height: 1.6; }
-  .hero .meta { margin-top: 1.5rem; display: flex; justify-content: center; gap: 1.5rem; flex-wrap: wrap; }
-  .hero .meta span { font-size: 0.8rem; color: var(--fg-subtle); font-family: var(--font-mono); }
-
-  .toc { background: var(--bg-surface); border: 1px solid var(--border); border-radius: var(--radius); padding: 1.5rem 2rem; margin-bottom: 3rem; }
-  .toc h2 { font-size: 0.75rem; text-transform: uppercase; letter-spacing: 0.1em; color: var(--fg-muted); margin-bottom: 1rem; }
-  .toc ol { list-style: none; counter-reset: toc; columns: 2; column-gap: 2rem; }
-  .toc li { counter-increment: toc; break-inside: avoid; margin-bottom: 0.35rem; }
-  .toc li::before { content: counter(toc, decimal-leading-zero) " "; color: var(--fg-subtle); font-family: var(--font-mono); font-size: 0.75rem; margin-right: 0.25rem; }
-  .toc a { font-family: var(--font-mono); color: var(--fg); text-decoration: none; font-size: 0.82rem; transition: color 0.15s; }
-  .toc a:hover { color: var(--accent); }
-
-  section { margin-bottom: 4rem; }
-  section + section { padding-top: 1rem; }
-
-  h2 { font-family: var(--font-mono); font-size: 1.3rem; font-weight: 700; color: var(--fg-bright); letter-spacing: -0.01em; margin-bottom: 1.25rem; padding-bottom: 0.5rem; border-bottom: 1px solid var(--border); }
-  h3 { font-family: var(--font-mono); font-size: 1rem; font-weight: 600; color: var(--fg-bright); margin-top: 2rem; margin-bottom: 0.75rem; }
-  h4 { font-family: var(--font-mono); font-size: 0.9rem; font-weight: 600; color: var(--accent); margin-top: 1.5rem; margin-bottom: 0.5rem; }
-
-  p { margin-bottom: 1rem; font-size: 0.95rem; line-height: 1.75; }
-  strong { color: var(--fg-bright); font-weight: 600; }
-  a { color: var(--accent); text-decoration: none; }
-  a:hover { text-decoration: underline; }
-
-  ul, ol { margin-bottom: 1rem; padding-left: 1.5rem; font-size: 0.93rem; line-height: 1.7; }
-  li { margin-bottom: 0.35rem; }
-  li::marker { color: var(--fg-subtle); }
-
-  .table-wrap { overflow-x: auto; margin-bottom: 1.5rem; }
-  table { width: 100%; border-collapse: collapse; font-size: 0.88rem; }
-  th, td { text-align: left; padding: 0.6rem 1rem; border-bottom: 1px solid var(--border-subtle); }
-  th { font-family: var(--font-mono); font-size: 0.72rem; text-transform: uppercase; letter-spacing: 0.06em; color: var(--fg-muted); background: var(--bg-surface); border-bottom-color: var(--border); white-space: nowrap; }
-  td { font-family: var(--font-sans); font-size: 0.88rem; color: var(--fg); }
-  tr:hover td { background: var(--accent-glow); }
-  td code { background: var(--bg-elevated); padding: 0.15em 0.4em; border-radius: 3px; font-family: var(--font-mono); font-size: 0.82em; color: var(--cyan); }
-
-  pre { background: var(--bg-code); border: 1px solid var(--border); border-radius: var(--radius); padding: 1.25rem 1.5rem; overflow-x: auto; margin-bottom: 1.5rem; font-family: var(--font-mono); font-size: 0.82rem; line-height: 1.65; color: var(--fg); }
-  pre code { background: none; padding: 0; color: inherit; font-size: inherit; }
-  code { font-family: var(--font-mono); font-size: 0.85em; }
-  p code, li code { background: var(--bg-elevated); padding: 0.15em 0.4em; border-radius: 3px; color: var(--cyan); font-size: 0.85em; }
-
-  .kw { color: var(--purple); }
-  .str { color: var(--green); }
-  .cm { color: var(--fg-subtle); font-style: italic; }
-  .num { color: var(--orange); }
-  .key { color: var(--accent); }
-
-  .mermaid { margin: 1.5rem 0 2rem; text-align: center; }
-  .mermaid svg { max-width: 100%; height: auto; }
-
-  .callout { font-family: var(--font-sans); background: var(--bg-surface); border-left: 3px solid var(--accent-dim); border-radius: 0 var(--radius) var(--radius) 0; padding: 1rem 1.25rem; margin-bottom: 1.5rem; font-size: 0.88rem; color: var(--fg-muted); line-height: 1.6; }
-  .callout strong { font-family: var(--font-mono); color: var(--fg-bright); }
-  .callout.success { border-left-color: var(--green-dim); }
-  .callout.warn { border-left-color: var(--orange); }
-
-  .badge { display: inline-block; font-family: var(--font-mono); font-size: 0.65rem; font-weight: 600; text-transform: uppercase; letter-spacing: 0.05em; padding: 0.2em 0.6em; border-radius: 3px; vertical-align: middle; margin-left: 0.4rem; }
-  .badge-done { background: var(--green-dim); color: #fff; }
-  .badge-wip { background: var(--orange); color: #0b0e14; }
-  .badge-todo { background: var(--fg-subtle); color: var(--fg); }
-
-  .checklist { list-style: none; padding-left: 0; }
-  .checklist li { padding-left: 1.5rem; position: relative; margin-bottom: 0.5rem; }
-  .checklist li::before { position: absolute; left: 0; font-family: var(--font-mono); font-size: 0.85rem; }
-  .checklist li.done { color: var(--fg-muted); }
-  .checklist li.done::before { content: "\2713"; color: var(--green); }
-  .checklist li.todo::before { content: "\25CB"; color: var(--fg-subtle); }
-  .checklist li.wip::before { content: "\25D4"; color: var(--orange); }
-
-  .compare { display: grid; grid-template-columns: 1fr 1fr; gap: 1rem; margin-bottom: 2rem; }
-  .compare-card { background: var(--bg-surface); border: 1px solid var(--border); border-radius: var(--radius); padding: 1.25rem; }
-  .compare-card h4 { margin-top: 0; font-size: 0.82rem; }
-  .compare-card.after { border-color: var(--accent-dim); }
-  .compare-card ul { font-family: var(--font-mono); padding-left: 1.25rem; font-size: 0.8rem; }
-
-  hr { border: none; border-top: 1px solid var(--border); margin: 3rem 0; }
-
-  .progress-bar { position: fixed; top: 0; left: 0; height: 2px; background: var(--accent); z-index: 999; transition: width 0.1s linear; }
-
-  @media (max-width: 640px) {
-    .container { padding: 2rem 1rem 4rem; }
-    .hero h1 { font-size: 1.6rem; }
-    .toc ol { columns: 1; }
-    .compare { grid-template-columns: 1fr; }
-    table { font-size: 0.8rem; }
-    th, td { padding: 0.4rem 0.6rem; }
-  }
-</style>
-<link rel="preconnect" href="https://fonts.googleapis.com">
-<link href="https://fonts.googleapis.com/css2?family=Noto+Emoji&display=swap" rel="stylesheet">
-<style>
-  @font-face {
-    font-family: 'Departure Mono';
-    src: url('https://cdn.jsdelivr.net/gh/rektdeckard/departure-mono@latest/fonts/DepartureMono-Regular.woff2') format('woff2');
-    font-weight: normal;
-    font-style: normal;
-    font-display: swap;
-  }
-</style>
-</head>
-<body>
-
-<div class="progress-bar" id="progress"></div>
-
-<div class="container">
-
-<header class="hero">
-  <h1>honcho<span>-integration-spec</span></h1>
-  <p class="subtitle">Comparison of Hermes Agent vs. openclaw-honcho — and a porting spec for bringing Hermes patterns into other Honcho integrations.</p>
-  <div class="meta">
-    <span>hermes-agent / openclaw-honcho</span>
-    <span>Python + TypeScript</span>
-    <span>2026-03-09</span>
-  </div>
-</header>
-
-<nav class="toc">
-  <h2>Contents</h2>
-  <ol>
-    <li><a href="#overview">Overview</a></li>
-    <li><a href="#architecture">Architecture comparison</a></li>
-    <li><a href="#diff-table">Diff table</a></li>
-    <li><a href="#patterns">Hermes patterns to port</a></li>
-    <li><a href="#spec-async">Spec: async prefetch</a></li>
-    <li><a href="#spec-reasoning">Spec: dynamic reasoning level</a></li>
-    <li><a href="#spec-modes">Spec: per-peer memory modes</a></li>
-    <li><a href="#spec-identity">Spec: AI peer identity formation</a></li>
-    <li><a href="#spec-sessions">Spec: session naming strategies</a></li>
-    <li><a href="#spec-cli">Spec: CLI surface injection</a></li>
-    <li><a href="#openclaw-checklist">openclaw-honcho checklist</a></li>
-    <li><a href="#nanobot-checklist">nanobot-honcho checklist</a></li>
-  </ol>
-</nav>
-
-<!-- OVERVIEW -->
-<section id="overview">
-  <h2>Overview</h2>
-
-  <p>Two independent Honcho integrations have been built for two different agent runtimes: <strong>Hermes Agent</strong> (Python, baked into the runner) and <strong>openclaw-honcho</strong> (TypeScript plugin via hook/tool API). Both use the same Honcho peer paradigm — dual peer model, <code>session.context()</code>, <code>peer.chat()</code> — but they made different tradeoffs at every layer.</p>
-
-  <p>This document maps those tradeoffs and defines a porting spec: a set of Hermes-originated patterns, each stated as an integration-agnostic interface, that any Honcho integration can adopt regardless of runtime or language.</p>
-
-  <div class="callout">
-    <strong>Scope</strong> Both integrations work correctly today. This spec is about the delta — patterns in Hermes that are worth propagating and patterns in openclaw-honcho that Hermes should eventually adopt. The spec is additive, not prescriptive.
-  </div>
-</section>
-
-<!-- ARCHITECTURE -->
-<section id="architecture">
-  <h2>Architecture comparison</h2>
-
-  <h3>Hermes: baked-in runner</h3>
-  <p>Honcho is initialised directly inside <code>AIAgent.__init__</code>. There is no plugin boundary. Session management, context injection, async prefetch, and CLI surface are all first-class concerns of the runner. Context is injected once per session (baked into <code>_cached_system_prompt</code>) and never re-fetched mid-session — this maximises prefix cache hits at the LLM provider.</p>
-
-  <div class="mermaid">
-%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1f3150', 'primaryTextColor': '#c9d1d9', 'primaryBorderColor': '#3d6ea5', 'lineColor': '#3d6ea5', 'secondaryColor': '#162030', 'tertiaryColor': '#11151c' }}}%%
-flowchart TD
-    U["user message"] --> P["_honcho_prefetch()<br/>(reads cache — no HTTP)"]
-    P --> SP["_build_system_prompt()<br/>(first turn only, cached)"]
-    SP --> LLM["LLM call"]
-    LLM --> R["response"]
-    R --> FP["_honcho_fire_prefetch()<br/>(daemon threads, turn end)"]
-    FP --> C1["prefetch_context() thread"]
-    FP --> C2["prefetch_dialectic() thread"]
-    C1 --> CACHE["_context_cache / _dialectic_cache"]
-    C2 --> CACHE
-
-    style U fill:#162030,stroke:#3d6ea5,color:#c9d1d9
-    style P fill:#1f3150,stroke:#3d6ea5,color:#c9d1d9
-    style SP fill:#1f3150,stroke:#3d6ea5,color:#c9d1d9
-    style LLM fill:#162030,stroke:#3d6ea5,color:#c9d1d9
-    style R fill:#162030,stroke:#3d6ea5,color:#c9d1d9
-    style FP fill:#2a1a40,stroke:#bc8cff,color:#c9d1d9
-    style C1 fill:#2a1a40,stroke:#bc8cff,color:#c9d1d9
-    style C2 fill:#2a1a40,stroke:#bc8cff,color:#c9d1d9
-    style CACHE fill:#11151c,stroke:#484f58,color:#6e7681
-  </div>
-
-  <h3>openclaw-honcho: hook-based plugin</h3>
-  <p>The plugin registers hooks against OpenClaw's event bus. Context is fetched synchronously inside <code>before_prompt_build</code> on every turn. Message capture happens in <code>agent_end</code>. The multi-agent hierarchy is tracked via <code>subagent_spawned</code>. This model is correct but every turn pays a blocking Honcho round-trip before the LLM call can begin.</p>
-
-  <div class="mermaid">
-%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1f3150', 'primaryTextColor': '#c9d1d9', 'primaryBorderColor': '#3d6ea5', 'lineColor': '#3d6ea5', 'secondaryColor': '#162030', 'tertiaryColor': '#11151c' }}}%%
-flowchart TD
-    U2["user message"] --> BPB["before_prompt_build<br/>(BLOCKING HTTP — every turn)"]
-    BPB --> CTX["session.context()"]
-    CTX --> SP2["system prompt assembled"]
-    SP2 --> LLM2["LLM call"]
-    LLM2 --> R2["response"]
-    R2 --> AE["agent_end hook"]
-    AE --> SAVE["session.addMessages()<br/>session.setMetadata()"]
-
-    style U2 fill:#162030,stroke:#3d6ea5,color:#c9d1d9
-    style BPB fill:#3a1515,stroke:#f47067,color:#c9d1d9
-    style CTX fill:#3a1515,stroke:#f47067,color:#c9d1d9
-    style SP2 fill:#1f3150,stroke:#3d6ea5,color:#c9d1d9
-    style LLM2 fill:#162030,stroke:#3d6ea5,color:#c9d1d9
-    style R2 fill:#162030,stroke:#3d6ea5,color:#c9d1d9
-    style AE fill:#162030,stroke:#3d6ea5,color:#c9d1d9
-    style SAVE fill:#11151c,stroke:#484f58,color:#6e7681
-  </div>
-</section>
-
-<!-- DIFF TABLE -->
-<section id="diff-table">
-  <h2>Diff table</h2>
-
-  <div class="table-wrap">
-    <table>
-      <thead>
-        <tr>
-          <th>Dimension</th>
-          <th>Hermes Agent</th>
-          <th>openclaw-honcho</th>
-        </tr>
-      </thead>
-      <tbody>
-        <tr>
-          <td><strong>Context injection timing</strong></td>
-          <td>Once per session (cached). Zero HTTP on response path after turn 1.</td>
-          <td>Every turn, blocking. Fresh context per turn but adds latency.</td>
-        </tr>
-        <tr>
-          <td><strong>Prefetch strategy</strong></td>
-          <td>Daemon threads fire at turn end; consumed next turn from cache.</td>
-          <td>None. Blocking call at prompt-build time.</td>
-        </tr>
-        <tr>
-          <td><strong>Dialectic (peer.chat)</strong></td>
-          <td>Prefetched async; result injected into system prompt next turn.</td>
-          <td>On-demand via <code>honcho_recall</code> / <code>honcho_analyze</code> tools.</td>
-        </tr>
-        <tr>
-          <td><strong>Reasoning level</strong></td>
-          <td>Dynamic: scales with message length. Floor = config default. Cap = "high".</td>
-          <td>Fixed per tool: recall=minimal, analyze=medium.</td>
-        </tr>
-        <tr>
-          <td><strong>Memory modes</strong></td>
-          <td><code>user_memory_mode</code> / <code>agent_memory_mode</code>: hybrid / honcho / local.</td>
-          <td>None. Always writes to Honcho.</td>
-        </tr>
-        <tr>
-          <td><strong>Write frequency</strong></td>
-          <td>async (background queue), turn, session, N turns.</td>
-          <td>After every agent_end (no control).</td>
-        </tr>
-        <tr>
-          <td><strong>AI peer identity</strong></td>
-          <td><code>observe_me=True</code>, <code>seed_ai_identity()</code>, <code>get_ai_representation()</code>, SOUL.md → AI peer.</td>
-          <td>Agent files uploaded to agent peer at setup. No ongoing self-observation seeding.</td>
-        </tr>
-        <tr>
-          <td><strong>Context scope</strong></td>
-          <td>User peer + AI peer representation, both injected.</td>
-          <td>User peer (owner) representation + conversation summary. <code>peerPerspective</code> on context call.</td>
-        </tr>
-        <tr>
-          <td><strong>Session naming</strong></td>
-          <td>per-directory / global / manual map / title-based.</td>
-          <td>Derived from platform session key.</td>
-        </tr>
-        <tr>
-          <td><strong>Multi-agent</strong></td>
-          <td>Single-agent only.</td>
-          <td>Parent observer hierarchy via <code>subagent_spawned</code>.</td>
-        </tr>
-        <tr>
-          <td><strong>Tool surface</strong></td>
-          <td>Single <code>query_user_context</code> tool (on-demand dialectic).</td>
-          <td>6 tools: session, profile, search, context (fast) + recall, analyze (LLM).</td>
-        </tr>
-        <tr>
-          <td><strong>Platform metadata</strong></td>
-          <td>Not stripped.</td>
-          <td>Explicitly stripped before Honcho storage.</td>
-        </tr>
-        <tr>
-          <td><strong>Message dedup</strong></td>
-          <td>None (sends on every save cycle).</td>
-          <td><code>lastSavedIndex</code> in session metadata prevents re-sending.</td>
-        </tr>
-        <tr>
-          <td><strong>CLI surface in prompt</strong></td>
-          <td>Management commands injected into system prompt. Agent knows its own CLI.</td>
-          <td>Not injected.</td>
-        </tr>
-        <tr>
-          <td><strong>AI peer name in identity</strong></td>
-          <td>Replaces "Hermes Agent" in DEFAULT_AGENT_IDENTITY when configured.</td>
-          <td>Not implemented.</td>
-        </tr>
-        <tr>
-          <td><strong>QMD / local file search</strong></td>
-          <td>Not implemented.</td>
-          <td>Passthrough tools when QMD backend configured.</td>
-        </tr>
-        <tr>
-          <td><strong>Workspace metadata</strong></td>
-          <td>Not implemented.</td>
-          <td><code>agentPeerMap</code> in workspace metadata tracks agent&#8594;peer ID.</td>
-        </tr>
-      </tbody>
-    </table>
-  </div>
-</section>
-
-<!-- PATTERNS -->
-<section id="patterns">
-  <h2>Hermes patterns to port</h2>
-
-  <p>Six patterns from Hermes are worth adopting in any Honcho integration. They are described below as integration-agnostic interfaces — the implementation will differ per runtime, but the contract is the same.</p>
-
-  <div class="compare">
-    <div class="compare-card">
-      <h4>Patterns Hermes contributes</h4>
-      <ul>
-        <li>Async prefetch (zero-latency)</li>
-        <li>Dynamic reasoning level</li>
-        <li>Per-peer memory modes</li>
-        <li>AI peer identity formation</li>
-        <li>Session naming strategies</li>
-        <li>CLI surface injection</li>
-      </ul>
-    </div>
-    <div class="compare-card after">
-      <h4>Patterns openclaw contributes back</h4>
-      <ul>
-        <li>lastSavedIndex dedup</li>
-        <li>Platform metadata stripping</li>
-        <li>Multi-agent observer hierarchy</li>
-        <li>peerPerspective on context()</li>
-        <li>Tiered tool surface (fast/LLM)</li>
-        <li>Workspace agentPeerMap</li>
-      </ul>
-    </div>
-  </div>
-</section>
-
-<!-- SPEC: ASYNC PREFETCH -->
-<section id="spec-async">
-  <h2>Spec: async prefetch</h2>
-
-  <h3>Problem</h3>
-  <p>Calling <code>session.context()</code> and <code>peer.chat()</code> synchronously before each LLM call adds 200–800ms of Honcho round-trip latency to every turn. Users experience this as the agent "thinking slowly."</p>
-
-  <h3>Pattern</h3>
-  <p>Fire both calls as non-blocking background work at the <strong>end</strong> of each turn. Store results in a per-session cache keyed by session ID. At the <strong>start</strong> of the next turn, pop from cache — the HTTP is already done. First turn is cold (empty cache); all subsequent turns are zero-latency on the response path.</p>
-
-  <h3>Interface contract</h3>
-  <pre><code><span class="cm">// TypeScript (openclaw / nanobot plugin shape)</span>
-
-<span class="kw">interface</span> <span class="key">AsyncPrefetch</span> {
-  <span class="cm">// Fire context + dialectic fetches at turn end. Non-blocking.</span>
-  firePrefetch(sessionId: <span class="str">string</span>, userMessage: <span class="str">string</span>): <span class="kw">void</span>;
-
-  <span class="cm">// Pop cached results at turn start. Returns empty if cache is cold.</span>
-  popContextResult(sessionId: <span class="str">string</span>): ContextResult | <span class="kw">null</span>;
-  popDialecticResult(sessionId: <span class="str">string</span>): <span class="str">string</span> | <span class="kw">null</span>;
-}
-
-<span class="kw">type</span> <span class="key">ContextResult</span> = {
-  representation: <span class="str">string</span>;
-  card: <span class="str">string</span>[];
-  aiRepresentation?: <span class="str">string</span>;  <span class="cm">// AI peer context if enabled</span>
-  summary?: <span class="str">string</span>;            <span class="cm">// conversation summary if fetched</span>
-};</code></pre>
-
-  <h3>Implementation notes</h3>
-  <ul>
-    <li>Python: <code>threading.Thread(daemon=True)</code>. Write to <code>dict[session_id, result]</code> — GIL makes this safe for simple writes.</li>
-    <li>TypeScript: <code>Promise</code> stored in <code>Map&lt;string, Promise&lt;ContextResult&gt;&gt;</code>. Await at pop time. If not resolved yet, skip (return null) — do not block.</li>
-    <li>The pop is destructive: clears the cache entry after reading so stale data never accumulates.</li>
-    <li>Prefetch should also fire on first turn (even though it won't be consumed until turn 2) — this ensures turn 2 is never cold.</li>
-  </ul>
-
-  <h3>openclaw-honcho adoption</h3>
-  <p>Move <code>session.context()</code> from <code>before_prompt_build</code> to a post-<code>agent_end</code> background task. Store result in <code>state.contextCache</code>. In <code>before_prompt_build</code>, read from cache instead of calling Honcho. If cache is empty (turn 1), inject nothing — the prompt is still valid without Honcho context on the first turn.</p>
-</section>
-
-<!-- SPEC: DYNAMIC REASONING LEVEL -->
-<section id="spec-reasoning">
-  <h2>Spec: dynamic reasoning level</h2>
-
-  <h3>Problem</h3>
-  <p>Honcho's dialectic endpoint supports reasoning levels from <code>minimal</code> to <code>max</code>. A fixed level per tool wastes budget on simple queries and under-serves complex ones.</p>
-
-  <h3>Pattern</h3>
-  <p>Select the reasoning level dynamically based on the user's message. Use the configured default as a floor. Bump by message length. Cap auto-selection at <code>high</code> — never select <code>max</code> automatically.</p>
-
-  <h3>Interface contract</h3>
-  <pre><code><span class="cm">// Shared helper — identical logic in any language</span>
-
-<span class="kw">const</span> LEVELS = [<span class="str">"minimal"</span>, <span class="str">"low"</span>, <span class="str">"medium"</span>, <span class="str">"high"</span>, <span class="str">"max"</span>];
-
-<span class="kw">function</span> <span class="key">dynamicReasoningLevel</span>(
-  query: <span class="str">string</span>,
-  configDefault: <span class="str">string</span> = <span class="str">"low"</span>
-): <span class="str">string</span> {
-  <span class="kw">const</span> baseIdx = Math.max(<span class="num">0</span>, LEVELS.indexOf(configDefault));
-  <span class="kw">const</span> n = query.length;
-  <span class="kw">const</span> bump = n &lt; <span class="num">120</span> ? <span class="num">0</span> : n &lt; <span class="num">400</span> ? <span class="num">1</span> : <span class="num">2</span>;
-  <span class="kw">return</span> LEVELS[Math.min(baseIdx + bump, <span class="num">3</span>)]; <span class="cm">// cap at "high" (idx 3)</span>
-}</code></pre>
-
-  <h3>Config key</h3>
-  <p>Add a <code>dialecticReasoningLevel</code> config field (string, default <code>"low"</code>). This sets the floor. Users can raise or lower it. The dynamic bump always applies on top.</p>
-
-  <h3>openclaw-honcho adoption</h3>
-  <p>Apply in <code>honcho_recall</code> and <code>honcho_analyze</code>: replace the fixed <code>reasoningLevel</code> with the dynamic selector. <code>honcho_recall</code> should use floor <code>"minimal"</code> and <code>honcho_analyze</code> floor <code>"medium"</code> — both still bump with message length.</p>
-</section>
-
-<!-- SPEC: PER-PEER MEMORY MODES -->
-<section id="spec-modes">
-  <h2>Spec: per-peer memory modes</h2>
-
-  <h3>Problem</h3>
-  <p>Users want independent control over whether user context and agent context are written locally, to Honcho, or both. A single <code>memoryMode</code> shorthand is not granular enough.</p>
-
-  <h3>Pattern</h3>
-  <p>Three modes per peer: <code>hybrid</code> (write both local + Honcho), <code>honcho</code> (Honcho only, disable local files), <code>local</code> (local files only, skip Honcho sync for this peer). Two orthogonal axes: user peer and agent peer.</p>
-
-  <h3>Config schema</h3>
-  <pre><code><span class="cm">// ~/.openclaw/openclaw.json  (or ~/.nanobot/config.json)</span>
-{
-  <span class="str">"plugins"</span>: {
-    <span class="str">"openclaw-honcho"</span>: {
-      <span class="str">"config"</span>: {
-        <span class="str">"apiKey"</span>: <span class="str">"..."</span>,
-        <span class="str">"memoryMode"</span>: <span class="str">"hybrid"</span>,          <span class="cm">// shorthand: both peers</span>
-        <span class="str">"userMemoryMode"</span>: <span class="str">"honcho"</span>,       <span class="cm">// override for user peer</span>
-        <span class="str">"agentMemoryMode"</span>: <span class="str">"hybrid"</span>       <span class="cm">// override for agent peer</span>
-      }
-    }
-  }
-}</code></pre>
-
-  <h3>Resolution order</h3>
-  <ol>
-    <li>Per-peer field (<code>userMemoryMode</code> / <code>agentMemoryMode</code>) — wins if present.</li>
-    <li>Shorthand <code>memoryMode</code> — applies to both peers as default.</li>
-    <li>Hardcoded default: <code>"hybrid"</code>.</li>
-  </ol>
-
-  <h3>Effect on Honcho sync</h3>
-  <ul>
-    <li><code>userMemoryMode=local</code>: skip adding user peer messages to Honcho.</li>
-    <li><code>agentMemoryMode=local</code>: skip adding assistant peer messages to Honcho.</li>
-    <li>Both local: skip <code>session.addMessages()</code> entirely.</li>
-    <li><code>userMemoryMode=honcho</code>: disable local USER.md writes.</li>
-    <li><code>agentMemoryMode=honcho</code>: disable local MEMORY.md / SOUL.md writes.</li>
-  </ul>
-</section>
-
-<!-- SPEC: AI PEER IDENTITY -->
-<section id="spec-identity">
-  <h2>Spec: AI peer identity formation</h2>
-
-  <h3>Problem</h3>
-  <p>Honcho builds the user's representation organically by observing what the user says. The same mechanism exists for the AI peer — but only if <code>observe_me=True</code> is set for the agent peer. Without it, the agent peer accumulates nothing and Honcho's AI-side model never forms.</p>
-
-  <p>Additionally, existing persona files (SOUL.md, IDENTITY.md) should seed the AI peer's Honcho representation at first activation, rather than waiting for it to emerge from scratch.</p>
-
-  <h3>Part A: observe_me=True for agent peer</h3>
-  <pre><code><span class="cm">// TypeScript — in session.addPeers() call</span>
-<span class="kw">await</span> session.addPeers([
-  [ownerPeer.id, { observeMe: <span class="kw">true</span>,  observeOthers: <span class="kw">false</span> }],
-  [agentPeer.id, { observeMe: <span class="kw">true</span>,  observeOthers: <span class="kw">true</span>  }], <span class="cm">// was false</span>
-]);</code></pre>
-
-  <p>This is a one-line change but foundational. Without it, Honcho's AI peer representation stays empty regardless of what the agent says.</p>
-
-  <h3>Part B: seedAiIdentity()</h3>
-  <pre><code><span class="kw">async function</span> <span class="key">seedAiIdentity</span>(
-  session: HonchoSession,
-  agentPeer: Peer,
-  content: <span class="str">string</span>,
-  source: <span class="str">string</span>
-): Promise&lt;<span class="kw">boolean</span>&gt; {
-  <span class="kw">const</span> wrapped = [
-    <span class="str">`&lt;ai_identity_seed&gt;`</span>,
-    <span class="str">`&lt;source&gt;${source}&lt;/source&gt;`</span>,
-    <span class="str">``</span>,
-    content.trim(),
-    <span class="str">`&lt;/ai_identity_seed&gt;`</span>,
-  ].join(<span class="str">"\n"</span>);
-
-  <span class="kw">await</span> agentPeer.addMessage(<span class="str">"assistant"</span>, wrapped);
-  <span class="kw">return true</span>;
-}</code></pre>
-
-  <h3>Part C: migrate agent files at setup</h3>
-  <p>During <code>openclaw honcho setup</code>, upload agent-self files (SOUL.md, IDENTITY.md, AGENTS.md, BOOTSTRAP.md) to the agent peer using <code>seedAiIdentity()</code> instead of <code>session.uploadFile()</code>. This routes the content through Honcho's observation pipeline rather than the file store.</p>
-
-  <h3>Part D: AI peer name in identity</h3>
-  <p>When the agent has a configured name (non-default), inject it into the agent's self-identity prefix. In OpenClaw this means adding to the injected system prompt section:</p>
-  <pre><code><span class="cm">// In context hook return value</span>
-<span class="kw">return</span> {
-  systemPrompt: [
-    agentName ? <span class="str">`You are ${agentName}.`</span> : <span class="str">""</span>,
-    <span class="str">"## User Memory Context"</span>,
-    ...sections,
-  ].filter(Boolean).join(<span class="str">"\n\n"</span>)
-};</code></pre>
-
-  <h3>CLI surface: honcho identity subcommand</h3>
-  <pre><code>openclaw honcho identity &lt;file&gt;    <span class="cm"># seed from file</span>
-openclaw honcho identity --show    <span class="cm"># show current AI peer representation</span></code></pre>
-</section>
-
-<!-- SPEC: SESSION NAMING -->
-<section id="spec-sessions">
-  <h2>Spec: session naming strategies</h2>
-
-  <h3>Problem</h3>
-  <p>When Honcho is used across multiple projects or directories, a single global session means every project shares the same context. Per-directory sessions provide isolation without requiring users to name sessions manually.</p>
-
-  <h3>Strategies</h3>
-  <div class="table-wrap">
-    <table>
-      <thead><tr><th>Strategy</th><th>Session key</th><th>When to use</th></tr></thead>
-      <tbody>
-        <tr><td><code>per-directory</code></td><td>basename of CWD</td><td>Default. Each project gets its own session.</td></tr>
-        <tr><td><code>global</code></td><td>fixed string <code>"global"</code></td><td>Single cross-project session.</td></tr>
-        <tr><td>manual map</td><td>user-configured per path</td><td><code>sessions</code> config map overrides directory basename.</td></tr>
-        <tr><td>title-based</td><td>sanitized session title</td><td>When agent supports named sessions; title set mid-conversation.</td></tr>
-      </tbody>
-    </table>
-  </div>
-
-  <h3>Config schema</h3>
-  <pre><code>{
-  <span class="str">"sessionStrategy"</span>: <span class="str">"per-directory"</span>,   <span class="cm">// "per-directory" | "global"</span>
-  <span class="str">"sessionPeerPrefix"</span>: <span class="kw">false</span>,            <span class="cm">// prepend peer name to session key</span>
-  <span class="str">"sessions"</span>: {                            <span class="cm">// manual overrides</span>
-    <span class="str">"/home/user/projects/foo"</span>: <span class="str">"foo-project"</span>
-  }
-}</code></pre>
-
-  <h3>CLI surface</h3>
-  <pre><code>openclaw honcho sessions              <span class="cm"># list all mappings</span>
-openclaw honcho map &lt;name&gt;           <span class="cm"># map cwd to session name</span>
-openclaw honcho map                   <span class="cm"># no-arg = list mappings</span></code></pre>
-
-  <p>Resolution order: manual map wins &rarr; session title &rarr; directory basename &rarr; platform key.</p>
-</section>
-
-<!-- SPEC: CLI SURFACE INJECTION -->
-<section id="spec-cli">
-  <h2>Spec: CLI surface injection</h2>
-
-  <h3>Problem</h3>
-  <p>When a user asks "how do I change my memory settings?" or "what Honcho commands are available?" the agent either hallucinates or says it doesn't know. The agent should know its own management interface.</p>
-
-  <h3>Pattern</h3>
-  <p>When Honcho is active, append a compact command reference to the system prompt. The agent can cite these commands directly instead of guessing.</p>
-
-  <pre><code><span class="cm">// In context hook, append to systemPrompt</span>
-<span class="kw">const</span> honchoSection = [
-  <span class="str">"# Honcho memory integration"</span>,
-  <span class="str">`Active. Session: ${sessionKey}. Mode: ${mode}.`</span>,
-  <span class="str">"Management commands:"</span>,
-  <span class="str">"  openclaw honcho status                    — show config + connection"</span>,
-  <span class="str">"  openclaw honcho mode [hybrid|honcho|local] — show or set memory mode"</span>,
-  <span class="str">"  openclaw honcho sessions                  — list session mappings"</span>,
-  <span class="str">"  openclaw honcho map &lt;name&gt;                — map directory to session"</span>,
-  <span class="str">"  openclaw honcho identity [file] [--show]  — seed or show AI identity"</span>,
-  <span class="str">"  openclaw honcho setup                     — full interactive wizard"</span>,
-].join(<span class="str">"\n"</span>);</code></pre>
-
-  <div class="callout warn">
-    <strong>Keep it compact.</strong> This section is injected every turn. Keep it under 300 chars of context. List commands, not explanations — the agent can explain them on request.
-  </div>
-</section>
-
-<!-- OPENCLAW CHECKLIST -->
-<section id="openclaw-checklist">
-  <h2>openclaw-honcho checklist</h2>
-
-  <p>Ordered by impact. Each item maps to a spec section above.</p>
-
-  <ul class="checklist">
-    <li class="todo"><strong>Async prefetch</strong> — move <code>session.context()</code> out of <code>before_prompt_build</code> into post-<code>agent_end</code> background Promise. Pop from cache at prompt build. (<a href="#spec-async">spec</a>)</li>
-    <li class="todo"><strong>observe_me=True for agent peer</strong> — one-line change in <code>session.addPeers()</code> config for agent peer. (<a href="#spec-identity">spec</a>)</li>
-    <li class="todo"><strong>Dynamic reasoning level</strong> — add <code>dynamicReasoningLevel()</code> helper; apply in <code>honcho_recall</code> and <code>honcho_analyze</code>. Add <code>dialecticReasoningLevel</code> to config schema. (<a href="#spec-reasoning">spec</a>)</li>
-    <li class="todo"><strong>Per-peer memory modes</strong> — add <code>userMemoryMode</code> / <code>agentMemoryMode</code> to config; gate Honcho sync and local writes accordingly. (<a href="#spec-modes">spec</a>)</li>
-    <li class="todo"><strong>seedAiIdentity()</strong> — add helper; apply during setup migration for SOUL.md / IDENTITY.md instead of <code>session.uploadFile()</code>. (<a href="#spec-identity">spec</a>)</li>
-    <li class="todo"><strong>Session naming strategies</strong> — add <code>sessionStrategy</code>, <code>sessions</code> map, <code>sessionPeerPrefix</code> to config; implement resolution function. (<a href="#spec-sessions">spec</a>)</li>
-    <li class="todo"><strong>CLI surface injection</strong> — append command reference to <code>before_prompt_build</code> return value when Honcho is active. (<a href="#spec-cli">spec</a>)</li>
-    <li class="todo"><strong>honcho identity subcommand</strong> — add <code>openclaw honcho identity</code> CLI command. (<a href="#spec-identity">spec</a>)</li>
-    <li class="todo"><strong>AI peer name injection</strong> — if <code>aiPeer</code> name configured, prepend to injected system prompt. (<a href="#spec-identity">spec</a>)</li>
-    <li class="todo"><strong>honcho mode / honcho sessions / honcho map</strong> — CLI parity with Hermes. (<a href="#spec-sessions">spec</a>)</li>
-  </ul>
-
-  <div class="callout success">
-    <strong>Already done in openclaw-honcho (do not re-implement):</strong> lastSavedIndex dedup, platform metadata stripping, multi-agent parent observer hierarchy, peerPerspective on context(), tiered tool surface (fast/LLM), workspace agentPeerMap, QMD passthrough, self-hosted Honcho support.
-  </div>
-</section>
-
-<!-- NANOBOT CHECKLIST -->
-<section id="nanobot-checklist">
-  <h2>nanobot-honcho checklist</h2>
-
-  <p>nanobot-honcho is a greenfield integration. Start from openclaw-honcho's architecture (hook-based, dual peer) and apply all Hermes patterns from day one rather than retrofitting. Priority order:</p>
-
-  <h3>Phase 1 — core correctness</h3>
-  <ul class="checklist">
-    <li class="todo">Dual peer model (owner + agent peer), both with <code>observe_me=True</code></li>
-    <li class="todo">Message capture at turn end with <code>lastSavedIndex</code> dedup</li>
-    <li class="todo">Platform metadata stripping before Honcho storage</li>
-    <li class="todo">Async prefetch from day one — do not implement blocking context injection</li>
-    <li class="todo">Legacy file migration at first activation (USER.md → owner peer, SOUL.md → <code>seedAiIdentity()</code>)</li>
-  </ul>
-
-  <h3>Phase 2 — configuration</h3>
-  <ul class="checklist">
-    <li class="todo">Config schema: <code>apiKey</code>, <code>workspaceId</code>, <code>baseUrl</code>, <code>memoryMode</code>, <code>userMemoryMode</code>, <code>agentMemoryMode</code>, <code>dialecticReasoningLevel</code>, <code>sessionStrategy</code>, <code>sessions</code></li>
-    <li class="todo">Per-peer memory mode gating</li>
-    <li class="todo">Dynamic reasoning level</li>
-    <li class="todo">Session naming strategies</li>
-  </ul>
-
-  <h3>Phase 3 — tools and CLI</h3>
-  <ul class="checklist">
-    <li class="todo">Tool surface: <code>honcho_profile</code>, <code>honcho_recall</code>, <code>honcho_analyze</code>, <code>honcho_search</code>, <code>honcho_context</code></li>
-    <li class="todo">CLI: <code>setup</code>, <code>status</code>, <code>sessions</code>, <code>map</code>, <code>mode</code>, <code>identity</code></li>
-    <li class="todo">CLI surface injection into system prompt</li>
-    <li class="todo">AI peer name wired into agent identity</li>
-  </ul>
-</section>
-
-</div>
-
-<script type="module">
-  import mermaid from 'https://cdn.jsdelivr.net/npm/mermaid@11/dist/mermaid.esm.min.mjs';
-  mermaid.initialize({ startOnLoad: true, securityLevel: 'loose', fontFamily: 'Departure Mono, Noto Emoji, monospace' });
-</script>
-<script>
-  window.addEventListener('scroll', () => {
-    const bar = document.getElementById('progress');
-    const max = document.documentElement.scrollHeight - window.innerHeight;
-    bar.style.width = (max > 0 ? (window.scrollY / max) * 100 : 0) + '%';
-  });
-</script>
-</body>
-</html>
@@ -1,377 +0,0 @@
-# honcho-integration-spec
-
-Comparison of Hermes Agent vs. openclaw-honcho — and a porting spec for bringing Hermes patterns into other Honcho integrations.
-
---
-
-## Overview
-
-Two independent Honcho integrations have been built for two different agent runtimes: **Hermes Agent** (Python, baked into the runner) and **openclaw-honcho** (TypeScript plugin via hook/tool API). Both use the same Honcho peer paradigm — dual peer model, `session.context()`, `peer.chat()` — but they made different tradeoffs at every layer.
-
-This document maps those tradeoffs and defines a porting spec: a set of Hermes-originated patterns, each stated as an integration-agnostic interface, that any Honcho integration can adopt regardless of runtime or language.
-
-> **Scope** Both integrations work correctly today. This spec is about the delta — patterns in Hermes that are worth propagating and patterns in openclaw-honcho that Hermes should eventually adopt. The spec is additive, not prescriptive.
-
---
-
-## Architecture comparison
-
-### Hermes: baked-in runner
-
-Honcho is initialised directly inside `AIAgent.__init__`. There is no plugin boundary. Session management, context injection, async prefetch, and CLI surface are all first-class concerns of the runner. Context is injected once per session (baked into `_cached_system_prompt`) and never re-fetched mid-session — this maximises prefix cache hits at the LLM provider.
-
-Turn flow:
-
-```
-user message
-  → _honcho_prefetch()       (reads cache — no HTTP)
-  → _build_system_prompt()   (first turn only, cached)
-  → LLM call
-  → response
-  → _honcho_fire_prefetch()  (daemon threads, turn end)
-       → prefetch_context() thread  ──┐
-       → prefetch_dialectic() thread ─┴→ _context_cache / _dialectic_cache
-```
-
-### openclaw-honcho: hook-based plugin
-
-The plugin registers hooks against OpenClaw's event bus. Context is fetched synchronously inside `before_prompt_build` on every turn. Message capture happens in `agent_end`. The multi-agent hierarchy is tracked via `subagent_spawned`. This model is correct but every turn pays a blocking Honcho round-trip before the LLM call can begin.
-
-Turn flow:
-
-```
-user message
-  → before_prompt_build (BLOCKING HTTP — every turn)
-       → session.context()
-  → system prompt assembled
-  → LLM call
-  → response
-  → agent_end hook
-       → session.addMessages()
-       → session.setMetadata()
-```
-
---
-
-## Diff table
-
-| Dimension | Hermes Agent | openclaw-honcho |
-|---|---|---|
-| **Context injection timing** | Once per session (cached). Zero HTTP on response path after turn 1. | Every turn, blocking. Fresh context per turn but adds latency. |
-| **Prefetch strategy** | Daemon threads fire at turn end; consumed next turn from cache. | None. Blocking call at prompt-build time. |
-| **Dialectic (peer.chat)** | Prefetched async; result injected into system prompt next turn. | On-demand via `honcho_recall` / `honcho_analyze` tools. |
-| **Reasoning level** | Dynamic: scales with message length. Floor = config default. Cap = "high". | Fixed per tool: recall=minimal, analyze=medium. |
-| **Memory modes** | `user_memory_mode` / `agent_memory_mode`: hybrid / honcho / local. | None. Always writes to Honcho. |
-| **Write frequency** | async (background queue), turn, session, N turns. | After every agent_end (no control). |
-| **AI peer identity** | `observe_me=True`, `seed_ai_identity()`, `get_ai_representation()`, SOUL.md → AI peer. | Agent files uploaded to agent peer at setup. No ongoing self-observation. |
-| **Context scope** | User peer + AI peer representation, both injected. | User peer (owner) representation + conversation summary. `peerPerspective` on context call. |
-| **Session naming** | per-directory / global / manual map / title-based. | Derived from platform session key. |
-| **Multi-agent** | Single-agent only. | Parent observer hierarchy via `subagent_spawned`. |
-| **Tool surface** | Single `query_user_context` tool (on-demand dialectic). | 6 tools: session, profile, search, context (fast) + recall, analyze (LLM). |
-| **Platform metadata** | Not stripped. | Explicitly stripped before Honcho storage. |
-| **Message dedup** | None. | `lastSavedIndex` in session metadata prevents re-sending. |
-| **CLI surface in prompt** | Management commands injected into system prompt. Agent knows its own CLI. | Not injected. |
-| **AI peer name in identity** | Replaces "Hermes Agent" in DEFAULT_AGENT_IDENTITY when configured. | Not implemented. |
-| **QMD / local file search** | Not implemented. | Passthrough tools when QMD backend configured. |
-| **Workspace metadata** | Not implemented. | `agentPeerMap` in workspace metadata tracks agent→peer ID. |
-
---
-
-## Patterns
-
-Six patterns from Hermes are worth adopting in any Honcho integration. Each is described as an integration-agnostic interface.
-
-**Hermes contributes:**
- Async prefetch (zero-latency)
- Dynamic reasoning level
- Per-peer memory modes
- AI peer identity formation
- Session naming strategies
- CLI surface injection
-
-**openclaw-honcho contributes back (Hermes should adopt):**
- `lastSavedIndex` dedup
- Platform metadata stripping
- Multi-agent observer hierarchy
- `peerPerspective` on `context()`
- Tiered tool surface (fast/LLM)
- Workspace `agentPeerMap`
-
---
-
-## Spec: async prefetch
-
-### Problem
-
-Calling `session.context()` and `peer.chat()` synchronously before each LLM call adds 200–800ms of Honcho round-trip latency to every turn.
-
-### Pattern
-
-Fire both calls as non-blocking background work at the **end** of each turn. Store results in a per-session cache keyed by session ID. At the **start** of the next turn, pop from cache — the HTTP is already done. First turn is cold (empty cache); all subsequent turns are zero-latency on the response path.
-
-### Interface contract
-
-```typescript
-interface AsyncPrefetch {
-  // Fire context + dialectic fetches at turn end. Non-blocking.
-  firePrefetch(sessionId: string, userMessage: string): void;
-
-  // Pop cached results at turn start. Returns empty if cache is cold.
-  popContextResult(sessionId: string): ContextResult | null;
-  popDialecticResult(sessionId: string): string | null;
-}
-
-type ContextResult = {
-  representation: string;
-  card: string[];
-  aiRepresentation?: string;  // AI peer context if enabled
-  summary?: string;           // conversation summary if fetched
-};
-```
-
-### Implementation notes
-
- **Python:** `threading.Thread(daemon=True)`. Write to `dict[session_id, result]` — GIL makes this safe for simple writes.
- **TypeScript:** `Promise` stored in `Map<string, Promise<ContextResult>>`. Await at pop time. If not resolved yet, return null — do not block.
- The pop is destructive: clears the cache entry after reading so stale data never accumulates.
- Prefetch should also fire on first turn (even though it won't be consumed until turn 2).
-
-### openclaw-honcho adoption
-
-Move `session.context()` from `before_prompt_build` to a post-`agent_end` background task. Store result in `state.contextCache`. In `before_prompt_build`, read from cache instead of calling Honcho. If cache is empty (turn 1), inject nothing — the prompt is still valid without Honcho context on the first turn.
-
---
-
-## Spec: dynamic reasoning level
-
-### Problem
-
-Honcho's dialectic endpoint supports reasoning levels from `minimal` to `max`. A fixed level per tool wastes budget on simple queries and under-serves complex ones.
-
-### Pattern
-
-Select the reasoning level dynamically based on the user's message. Use the configured default as a floor. Bump by message length. Cap auto-selection at `high` — never select `max` automatically.
-
-### Logic
-
-```
-< 120 chars  → default (typically "low")
-120–400 chars → one level above default (cap at "high")
-> 400 chars  → two levels above default (cap at "high")
-```
-
-### Config key
-
-Add `dialecticReasoningLevel` (string, default `"low"`). This sets the floor. The dynamic bump always applies on top.
-
-### openclaw-honcho adoption
-
-Apply in `honcho_recall` and `honcho_analyze`: replace fixed `reasoningLevel` with the dynamic selector. `honcho_recall` uses floor `"minimal"`, `honcho_analyze` uses floor `"medium"` — both still bump with message length.
-
---
-
-## Spec: per-peer memory modes
-
-### Problem
-
-Users want independent control over whether user context and agent context are written locally, to Honcho, or both.
-
-### Modes
-
-| Mode | Effect |
-|---|---|
-| `hybrid` | Write to both local files and Honcho (default) |
-| `honcho` | Honcho only — disable corresponding local file writes |
-| `local` | Local files only — skip Honcho sync for this peer |
-
-### Config schema
-
-```json
-{
-  "memoryMode": "hybrid",
-  "userMemoryMode": "honcho",
-  "agentMemoryMode": "hybrid"
-}
-```
-
-Resolution order: per-peer field wins → shorthand `memoryMode` → default `"hybrid"`.
-
-### Effect on Honcho sync
-
- `userMemoryMode=local`: skip adding user peer messages to Honcho
- `agentMemoryMode=local`: skip adding assistant peer messages to Honcho
- Both local: skip `session.addMessages()` entirely
- `userMemoryMode=honcho`: disable local USER.md writes
- `agentMemoryMode=honcho`: disable local MEMORY.md / SOUL.md writes
-
---
-
-## Spec: AI peer identity formation
-
-### Problem
-
-Honcho builds the user's representation organically by observing what the user says. The same mechanism exists for the AI peer — but only if `observe_me=True` is set for the agent peer. Without it, the agent peer accumulates nothing.
-
-Additionally, existing persona files (SOUL.md, IDENTITY.md) should seed the AI peer's Honcho representation at first activation.
-
-### Part A: observe_me=True for agent peer
-
-```typescript
-await session.addPeers([
-  [ownerPeer.id, { observeMe: true,  observeOthers: false }],
-  [agentPeer.id, { observeMe: true,  observeOthers: true  }], // was false
-]);
-```
-
-One-line change. Foundational. Without it, the AI peer representation stays empty regardless of what the agent says.
-
-### Part B: seedAiIdentity()
-
-```typescript
-async function seedAiIdentity(
-  agentPeer: Peer,
-  content: string,
-  source: string
-): Promise<boolean> {
-  const wrapped = [
-    `<ai_identity_seed>`,
-    `<source>${source}</source>`,
-    ``,
-    content.trim(),
-    `</ai_identity_seed>`,
-  ].join("\n");
-
-  await agentPeer.addMessage("assistant", wrapped);
-  return true;
-}
-```
-
-### Part C: migrate agent files at setup
-
-During `honcho setup`, upload agent-self files (SOUL.md, IDENTITY.md, AGENTS.md) to the agent peer via `seedAiIdentity()` instead of `session.uploadFile()`. This routes content through Honcho's observation pipeline.
-
-### Part D: AI peer name in identity
-
-When the agent has a configured name, prepend it to the injected system prompt:
-
-```typescript
-const namePrefix = agentName ? `You are ${agentName}.\n\n` : "";
-return { systemPrompt: namePrefix + "## User Memory Context\n\n" + sections };
-```
-
-### CLI surface
-
-```
-honcho identity <file>    # seed from file
-honcho identity --show    # show current AI peer representation
-```
-
---
-
-## Spec: session naming strategies
-
-### Problem
-
-A single global session means every project shares the same Honcho context. Per-directory sessions provide isolation without requiring users to name sessions manually.
-
-### Strategies
-
-| Strategy | Session key | When to use |
-|---|---|---|
-| `per-directory` | basename of CWD | Default. Each project gets its own session. |
-| `global` | fixed string `"global"` | Single cross-project session. |
-| manual map | user-configured per path | `sessions` config map overrides directory basename. |
-| title-based | sanitized session title | When agent supports named sessions set mid-conversation. |
-
-### Config schema
-
-```json
-{
-  "sessionStrategy": "per-directory",
-  "sessionPeerPrefix": false,
-  "sessions": {
-    "/home/user/projects/foo": "foo-project"
-  }
-}
-```
-
-### CLI surface
-
-```
-honcho sessions              # list all mappings
-honcho map <name>            # map cwd to session name
-honcho map                   # no-arg = list mappings
-```
-
-Resolution order: manual map → session title → directory basename → platform key.
-
---
-
-## Spec: CLI surface injection
-
-### Problem
-
-When a user asks "how do I change my memory settings?" the agent either hallucinates or says it doesn't know. The agent should know its own management interface.
-
-### Pattern
-
-When Honcho is active, append a compact command reference to the system prompt. Keep it under 300 chars.
-
-```
-# Honcho memory integration
-Active. Session: {sessionKey}. Mode: {mode}.
-Management commands:
-  honcho status                    — show config + connection
-  honcho mode [hybrid|honcho|local] — show or set memory mode
-  honcho sessions                  — list session mappings
-  honcho map <name>                — map directory to session
-  honcho identity [file] [--show]  — seed or show AI identity
-  honcho setup                     — full interactive wizard
-```
-
---
-
-## openclaw-honcho checklist
-
-Ordered by impact:
-
- [ ] **Async prefetch** — move `session.context()` out of `before_prompt_build` into post-`agent_end` background Promise
- [ ] **observe_me=True for agent peer** — one-line change in `session.addPeers()`
- [ ] **Dynamic reasoning level** — add helper; apply in `honcho_recall` and `honcho_analyze`; add `dialecticReasoningLevel` to config
- [ ] **Per-peer memory modes** — add `userMemoryMode` / `agentMemoryMode` to config; gate Honcho sync and local writes
- [ ] **seedAiIdentity()** — add helper; use during setup migration for SOUL.md / IDENTITY.md
- [ ] **Session naming strategies** — add `sessionStrategy`, `sessions` map, `sessionPeerPrefix`
- [ ] **CLI surface injection** — append command reference to `before_prompt_build` return value
- [ ] **honcho identity subcommand** — seed from file or `--show` current representation
- [ ] **AI peer name injection** — if `aiPeer` name configured, prepend to injected system prompt
- [ ] **honcho mode / sessions / map** — CLI parity with Hermes
-
-Already done in openclaw-honcho (do not re-implement): `lastSavedIndex` dedup, platform metadata stripping, multi-agent parent observer, `peerPerspective` on `context()`, tiered tool surface, workspace `agentPeerMap`, QMD passthrough, self-hosted Honcho.
-
---
-
-## nanobot-honcho checklist
-
-Greenfield integration. Start from openclaw-honcho's architecture and apply all Hermes patterns from day one.
-
-### Phase 1 — core correctness
-
- [ ] Dual peer model (owner + agent peer), both with `observe_me=True`
- [ ] Message capture at turn end with `lastSavedIndex` dedup
- [ ] Platform metadata stripping before Honcho storage
- [ ] Async prefetch from day one — do not implement blocking context injection
- [ ] Legacy file migration at first activation (USER.md → owner peer, SOUL.md → `seedAiIdentity()`)
-
-### Phase 2 — configuration
-
- [ ] Config schema: `apiKey`, `workspaceId`, `baseUrl`, `memoryMode`, `userMemoryMode`, `agentMemoryMode`, `dialecticReasoningLevel`, `sessionStrategy`, `sessions`
- [ ] Per-peer memory mode gating
- [ ] Dynamic reasoning level
- [ ] Session naming strategies
-
-### Phase 3 — tools and CLI
-
- [ ] Tool surface: `honcho_profile`, `honcho_recall`, `honcho_analyze`, `honcho_search`, `honcho_context`
- [ ] CLI: `setup`, `status`, `sessions`, `map`, `mode`, `identity`
- [ ] CLI surface injection into system prompt
- [ ] AI peer name wired into agent identity
@@ -1,142 +0,0 @@
-# Migrating from OpenClaw to Hermes Agent
-
-This guide covers how to import your OpenClaw settings, memories, skills, and API keys into Hermes Agent.
-
-## Three Ways to Migrate
-
-### 1. Automatic (during first-time setup)
-
-When you run `hermes setup` for the first time and Hermes detects `~/.openclaw`, it automatically offers to import your OpenClaw data before configuration begins. Just accept the prompt and everything is handled for you.
-
-### 2. CLI Command (quick, scriptable)
-
-```bash
-hermes claw migrate                      # Preview then migrate (always shows preview first)
-hermes claw migrate --dry-run            # Preview only, no changes
-hermes claw migrate --preset user-data   # Migrate without API keys/secrets
-hermes claw migrate --yes                # Skip confirmation prompt
-```
-
-The migration always shows a full preview of what will be imported before making any changes. You review the preview and confirm before anything is written.
-
-**All options:**
-
-| Flag | Description |
-|------|-------------|
-| `--source PATH` | Path to OpenClaw directory (default: `~/.openclaw`) |
-| `--dry-run` | Preview only — no files are modified |
-| `--preset {user-data,full}` | Migration preset (default: `full`). `user-data` excludes secrets |
-| `--overwrite` | Overwrite existing files (default: skip conflicts) |
-| `--migrate-secrets` | Include allowlisted secrets (auto-enabled with `full` preset) |
-| `--workspace-target PATH` | Copy workspace instructions (AGENTS.md) to this absolute path |
-| `--skill-conflict {skip,overwrite,rename}` | How to handle skill name conflicts (default: `skip`) |
-| `--yes`, `-y` | Skip confirmation prompts |
-
-### 3. Agent-Guided (interactive, with previews)
-
-Ask the agent to run the migration for you:
-
-```
-> Migrate my OpenClaw setup to Hermes
-```
-
-The agent will use the `openclaw-migration` skill to:
-1. Run a preview first to show what would change
-2. Ask about conflict resolution (SOUL.md, skills, etc.)
-3. Let you choose between `user-data` and `full` presets
-4. Execute the migration with your choices
-5. Print a detailed summary of what was migrated
-
-## What Gets Migrated
-
-### `user-data` preset
-| Item | Source | Destination |
-|------|--------|-------------|
-| SOUL.md | `~/.openclaw/workspace/SOUL.md` | `~/.hermes/SOUL.md` |
-| Memory entries | `~/.openclaw/workspace/MEMORY.md` | `~/.hermes/memories/MEMORY.md` |
-| User profile | `~/.openclaw/workspace/USER.md` | `~/.hermes/memories/USER.md` |
-| Skills | `~/.openclaw/workspace/skills/` | `~/.hermes/skills/openclaw-imports/` |
-| Command allowlist | `~/.openclaw/workspace/exec_approval_patterns.yaml` | Merged into `~/.hermes/config.yaml` |
-| Messaging settings | `~/.openclaw/config.yaml` (TELEGRAM_ALLOWED_USERS, MESSAGING_CWD) | `~/.hermes/.env` |
-| TTS assets | `~/.openclaw/workspace/tts/` | `~/.hermes/tts/` |
-
-Workspace files are also checked at `workspace.default/` and `workspace-main/` as fallback paths (OpenClaw renamed `workspace/` to `workspace-main/` in recent versions).
-
-### `full` preset (adds to `user-data`)
-| Item | Source | Destination |
-|------|--------|-------------|
-| Telegram bot token | `openclaw.json` channels config | `~/.hermes/.env` |
-| OpenRouter API key | `.env`, `openclaw.json`, or `openclaw.json["env"]` | `~/.hermes/.env` |
-| OpenAI API key | `.env`, `openclaw.json`, or `openclaw.json["env"]` | `~/.hermes/.env` |
-| Anthropic API key | `.env`, `openclaw.json`, or `openclaw.json["env"]` | `~/.hermes/.env` |
-| ElevenLabs API key | `.env`, `openclaw.json`, or `openclaw.json["env"]` | `~/.hermes/.env` |
-
-API keys are searched across four sources: inline config values, `~/.openclaw/.env`, the `openclaw.json` `"env"` sub-object, and per-agent auth profiles.
-
-Only allowlisted secrets are ever imported. Other credentials are skipped and reported.
-
-## OpenClaw Schema Compatibility
-
-The migration handles both old and current OpenClaw config layouts:
-
- **Channel tokens**: Reads from flat paths (`channels.telegram.botToken`) and the newer `accounts.default` layout (`channels.telegram.accounts.default.botToken`)
- **TTS provider**: OpenClaw renamed "edge" to "microsoft" — both are recognized and mapped to Hermes' "edge"
- **Provider API types**: Both short (`openai`, `anthropic`) and hyphenated (`openai-completions`, `anthropic-messages`, `google-generative-ai`) values are mapped correctly
- **thinkingDefault**: All enum values are handled including newer ones (`minimal`, `xhigh`, `adaptive`)
- **Matrix**: Uses `accessToken` field (not `botToken`)
- **SecretRef formats**: Plain strings, env templates (`${VAR}`), and `source: "env"` SecretRefs are resolved. `source: "file"` and `source: "exec"` SecretRefs produce a warning — add those keys manually after migration.
-
-## Conflict Handling
-
-By default, the migration **will not overwrite** existing Hermes data:
-
- **SOUL.md** — skipped if one already exists in `~/.hermes/`
- **Memory entries** — skipped if memories already exist (to avoid duplicates)
- **Skills** — skipped if a skill with the same name already exists
- **API keys** — skipped if the key is already set in `~/.hermes/.env`
-
-To overwrite conflicts, use `--overwrite`. The migration creates backups before overwriting.
-
-For skills, you can also use `--skill-conflict rename` to import conflicting skills under a new name (e.g., `skill-name-imported`).
-
-## Migration Report
-
-Every migration produces a report showing:
- **Migrated items** — what was successfully imported
- **Conflicts** — items skipped because they already exist
- **Skipped items** — items not found in the source
- **Errors** — items that failed to import
-
-For executed migrations, the full report is saved to `~/.hermes/migration/openclaw/<timestamp>/`.
-
-## Post-Migration Notes
-
- **Skills require a new session** — imported skills take effect after restarting your agent or starting a new chat.
- **WhatsApp requires re-pairing** — WhatsApp uses QR-code pairing, not token-based auth. Run `hermes whatsapp` to pair.
- **Archive cleanup** — after migration, you'll be offered to rename `~/.openclaw/` to `.openclaw.pre-migration/` to prevent state confusion. You can also run `hermes claw cleanup` later.
-
-## Troubleshooting
-
-### "OpenClaw directory not found"
-The migration looks for `~/.openclaw` by default, then tries `~/.clawdbot` and `~/.moltbot`. If your OpenClaw is installed elsewhere, use `--source`:
-```bash
-hermes claw migrate --source /path/to/.openclaw
-```
-
-### "Migration script not found"
-The migration script ships with Hermes Agent. If you installed via pip (not git clone), the `optional-skills/` directory may not be present. Install the skill from the Skills Hub:
-```bash
-hermes skills install openclaw-migration
-```
-
-### Memory overflow
-If your OpenClaw MEMORY.md or USER.md exceeds Hermes' character limits, excess entries are exported to an overflow file in the migration report directory. You can manually review and add the most important ones.
-
-### API keys not found
-Keys might be stored in different places depending on your OpenClaw setup:
- `~/.openclaw/.env` file
- Inline in `openclaw.json` under `models.providers.*.apiKey`
- In `openclaw.json` under the `"env"` or `"env.vars"` sub-objects
- In `~/.openclaw/agents/main/agent/auth-profiles.json`
-
-The migration checks all four. If keys use `source: "file"` or `source: "exec"` SecretRefs, they can't be resolved automatically — add them via `hermes config set`.
@@ -1,608 +0,0 @@
-# Pricing Accuracy Architecture
-
-Date: 2026-03-16
-
-## Goal
-
-Hermes should only show dollar costs when they are backed by an official source for the user's actual billing path.
-
-This design replaces the current static, heuristic pricing flow in:
-
- `run_agent.py`
- `agent/usage_pricing.py`
- `agent/insights.py`
- `cli.py`
-
-with a provider-aware pricing system that:
-
- handles cache billing correctly
- distinguishes `actual` vs `estimated` vs `included` vs `unknown`
- reconciles post-hoc costs when providers expose authoritative billing data
- supports direct providers, OpenRouter, subscriptions, enterprise pricing, and custom endpoints
-
-## Problems In The Current Design
-
-Current Hermes behavior has four structural issues:
-
-1. It stores only `prompt_tokens` and `completion_tokens`, which is insufficient for providers that bill cache reads and cache writes separately.
-2. It uses a static model price table and fuzzy heuristics, which can drift from current official pricing.
-3. It assumes public API list pricing matches the user's real billing path.
-4. It has no distinction between live estimates and reconciled billed cost.
-
-## Design Principles
-
-1. Normalize usage before pricing.
-2. Never fold cached tokens into plain input cost.
-3. Track certainty explicitly.
-4. Treat the billing path as part of the model identity.
-5. Prefer official machine-readable sources over scraped docs.
-6. Use post-hoc provider cost APIs when available.
-7. Show `n/a` rather than inventing precision.
-
-## High-Level Architecture
-
-The new system has four layers:
-
-1. `usage_normalization`
-   Converts raw provider usage into a canonical usage record.
-2. `pricing_source_resolution`
-   Determines the billing path, source of truth, and applicable pricing source.
-3. `cost_estimation_and_reconciliation`
-   Produces an immediate estimate when possible, then replaces or annotates it with actual billed cost later.
-4. `presentation`
-   `/usage`, `/insights`, and the status bar display cost with certainty metadata.
-
-## Canonical Usage Record
-
-Add a canonical usage model that every provider path maps into before any pricing math happens.
-
-Suggested structure:
-
-```python
-@dataclass
-class CanonicalUsage:
-    provider: str
-    billing_provider: str
-    model: str
-    billing_route: str
-
-    input_tokens: int = 0
-    output_tokens: int = 0
-    cache_read_tokens: int = 0
-    cache_write_tokens: int = 0
-    reasoning_tokens: int = 0
-    request_count: int = 1
-
-    raw_usage: dict[str, Any] | None = None
-    raw_usage_fields: dict[str, str] | None = None
-    computed_fields: set[str] | None = None
-
-    provider_request_id: str | None = None
-    provider_generation_id: str | None = None
-    provider_response_id: str | None = None
-```
-
-Rules:
-
- `input_tokens` means non-cached input only.
- `cache_read_tokens` and `cache_write_tokens` are never merged into `input_tokens`.
- `output_tokens` excludes cache metrics.
- `reasoning_tokens` is telemetry unless a provider officially bills it separately.
-
-This is the same normalization pattern used by `opencode`, extended with provenance and reconciliation ids.
-
-## Provider Normalization Rules
-
-### OpenAI Direct
-
-Source usage fields:
-
- `prompt_tokens`
- `completion_tokens`
- `prompt_tokens_details.cached_tokens`
-
-Normalization:
-
- `cache_read_tokens = cached_tokens`
- `input_tokens = prompt_tokens - cached_tokens`
- `cache_write_tokens = 0` unless OpenAI exposes it in the relevant route
- `output_tokens = completion_tokens`
-
-### Anthropic Direct
-
-Source usage fields:
-
- `input_tokens`
- `output_tokens`
- `cache_read_input_tokens`
- `cache_creation_input_tokens`
-
-Normalization:
-
- `input_tokens = input_tokens`
- `output_tokens = output_tokens`
- `cache_read_tokens = cache_read_input_tokens`
- `cache_write_tokens = cache_creation_input_tokens`
-
-### OpenRouter
-
-Estimate-time usage normalization should use the response usage payload with the same rules as the underlying provider when possible.
-
-Reconciliation-time records should also store:
-
- OpenRouter generation id
- native token fields when available
- `total_cost`
- `cache_discount`
- `upstream_inference_cost`
- `is_byok`
-
-### Gemini / Vertex
-
-Use official Gemini or Vertex usage fields where available.
-
-If cached content tokens are exposed:
-
- map them to `cache_read_tokens`
-
-If a route exposes no cache creation metric:
-
- store `cache_write_tokens = 0`
- preserve the raw usage payload for later extension
-
-### DeepSeek And Other Direct Providers
-
-Normalize only the fields that are officially exposed.
-
-If a provider does not expose cache buckets:
-
- do not infer them unless the provider explicitly documents how to derive them
-
-### Subscription / Included-Cost Routes
-
-These still use the canonical usage model.
-
-Tokens are tracked normally. Cost depends on billing mode, not on whether usage exists.
-
-## Billing Route Model
-
-Hermes must stop keying pricing solely by `model`.
-
-Introduce a billing route descriptor:
-
-```python
-@dataclass
-class BillingRoute:
-    provider: str
-    base_url: str | None
-    model: str
-    billing_mode: str
-    organization_hint: str | None = None
-```
-
-`billing_mode` values:
-
- `official_cost_api`
- `official_generation_api`
- `official_models_api`
- `official_docs_snapshot`
- `subscription_included`
- `user_override`
- `custom_contract`
- `unknown`
-
-Examples:
-
- OpenAI direct API with Costs API access: `official_cost_api`
- Anthropic direct API with Usage & Cost API access: `official_cost_api`
- OpenRouter request before reconciliation: `official_models_api`
- OpenRouter request after generation lookup: `official_generation_api`
- GitHub Copilot style subscription route: `subscription_included`
- local OpenAI-compatible server: `unknown`
- enterprise contract with configured rates: `custom_contract`
-
-## Cost Status Model
-
-Every displayed cost should have:
-
-```python
-@dataclass
-class CostResult:
-    amount_usd: Decimal | None
-    status: Literal["actual", "estimated", "included", "unknown"]
-    source: Literal[
-        "provider_cost_api",
-        "provider_generation_api",
-        "provider_models_api",
-        "official_docs_snapshot",
-        "user_override",
-        "custom_contract",
-        "none",
-    ]
-    label: str
-    fetched_at: datetime | None
-    pricing_version: str | None
-    notes: list[str]
-```
-
-Presentation rules:
-
- `actual`: show dollar amount as final
- `estimated`: show dollar amount with estimate labeling
- `included`: show `included` or `$0.00 (included)` depending on UX choice
- `unknown`: show `n/a`
-
-## Official Source Hierarchy
-
-Resolve cost using this order:
-
-1. Request-level or account-level official billed cost
-2. Official machine-readable model pricing
-3. Official docs snapshot
-4. User override or custom contract
-5. Unknown
-
-The system must never skip to a lower level if a higher-confidence source exists for the current billing route.
-
-## Provider-Specific Truth Rules
-
-### OpenAI Direct
-
-Preferred truth:
-
-1. Costs API for reconciled spend
-2. Official pricing page for live estimate
-
-### Anthropic Direct
-
-Preferred truth:
-
-1. Usage & Cost API for reconciled spend
-2. Official pricing docs for live estimate
-
-### OpenRouter
-
-Preferred truth:
-
-1. `GET /api/v1/generation` for reconciled `total_cost`
-2. `GET /api/v1/models` pricing for live estimate
-
-Do not use underlying provider public pricing as the source of truth for OpenRouter billing.
-
-### Gemini / Vertex
-
-Preferred truth:
-
-1. official billing export or billing API for reconciled spend when available for the route
-2. official pricing docs for estimate
-
-### DeepSeek
-
-Preferred truth:
-
-1. official machine-readable cost source if available in the future
-2. official pricing docs snapshot today
-
-### Subscription-Included Routes
-
-Preferred truth:
-
-1. explicit route config marking the model as included in subscription
-
-These should display `included`, not an API list-price estimate.
-
-### Custom Endpoint / Local Model
-
-Preferred truth:
-
-1. user override
-2. custom contract config
-3. unknown
-
-These should default to `unknown`.
-
-## Pricing Catalog
-
-Replace the current `MODEL_PRICING` dict with a richer pricing catalog.
-
-Suggested record:
-
-```python
-@dataclass
-class PricingEntry:
-    provider: str
-    route_pattern: str
-    model_pattern: str
-
-    input_cost_per_million: Decimal | None = None
-    output_cost_per_million: Decimal | None = None
-    cache_read_cost_per_million: Decimal | None = None
-    cache_write_cost_per_million: Decimal | None = None
-    request_cost: Decimal | None = None
-    image_cost: Decimal | None = None
-
-    source: str = "official_docs_snapshot"
-    source_url: str | None = None
-    fetched_at: datetime | None = None
-    pricing_version: str | None = None
-```
-
-The catalog should be route-aware:
-
- `openai:gpt-5`
- `anthropic:claude-opus-4-6`
- `openrouter:anthropic/claude-opus-4.6`
- `copilot:gpt-4o`
-
-This avoids conflating direct-provider billing with aggregator billing.
-
-## Pricing Sync Architecture
-
-Introduce a pricing sync subsystem instead of manually maintaining a single hardcoded table.
-
-Suggested modules:
-
- `agent/pricing/catalog.py`
- `agent/pricing/sources.py`
- `agent/pricing/sync.py`
- `agent/pricing/reconcile.py`
- `agent/pricing/types.py`
-
-### Sync Sources
-
- OpenRouter models API
- official provider docs snapshots where no API exists
- user overrides from config
-
-### Sync Output
-
-Cache pricing entries locally with:
-
- source URL
- fetch timestamp
- version/hash
- confidence/source type
-
-### Sync Frequency
-
- startup warm cache
- background refresh every 6 to 24 hours depending on source
- manual `hermes pricing sync`
-
-## Reconciliation Architecture
-
-Live requests may produce only an estimate initially. Hermes should reconcile them later when a provider exposes actual billed cost.
-
-Suggested flow:
-
-1. Agent call completes.
-2. Hermes stores canonical usage plus reconciliation ids.
-3. Hermes computes an immediate estimate if a pricing source exists.
-4. A reconciliation worker fetches actual cost when supported.
-5. Session and message records are updated with `actual` cost.
-
-This can run:
-
- inline for cheap lookups
- asynchronously for delayed provider accounting
-
-## Persistence Changes
-
-Session storage should stop storing only aggregate prompt/completion totals.
-
-Add fields for both usage and cost certainty:
-
- `input_tokens`
- `output_tokens`
- `cache_read_tokens`
- `cache_write_tokens`
- `reasoning_tokens`
- `estimated_cost_usd`
- `actual_cost_usd`
- `cost_status`
- `cost_source`
- `pricing_version`
- `billing_provider`
- `billing_mode`
-
-If schema expansion is too large for one PR, add a new pricing events table:
-
-```text
-session_cost_events
-  id
-  session_id
-  request_id
-  provider
-  model
-  billing_mode
-  input_tokens
-  output_tokens
-  cache_read_tokens
-  cache_write_tokens
-  estimated_cost_usd
-  actual_cost_usd
-  cost_status
-  cost_source
-  pricing_version
-  created_at
-  updated_at
-```
-
-## Hermes Touchpoints
-
-### `run_agent.py`
-
-Current responsibility:
-
- parse raw provider usage
- update session token counters
-
-New responsibility:
-
- build `CanonicalUsage`
- update canonical counters
- store reconciliation ids
- emit usage event to pricing subsystem
-
-### `agent/usage_pricing.py`
-
-Current responsibility:
-
- static lookup table
- direct cost arithmetic
-
-New responsibility:
-
- move or replace with pricing catalog facade
- no fuzzy model-family heuristics
- no direct pricing without billing-route context
-
-### `cli.py`
-
-Current responsibility:
-
- compute session cost directly from prompt/completion totals
-
-New responsibility:
-
- display `CostResult`
- show status badges:
-  - `actual`
-  - `estimated`
-  - `included`
-  - `n/a`
-
-### `agent/insights.py`
-
-Current responsibility:
-
- recompute historical estimates from static pricing
-
-New responsibility:
-
- aggregate stored pricing events
- prefer actual cost over estimate
- surface estimates only when reconciliation is unavailable
-
-## UX Rules
-
-### Status Bar
-
-Show one of:
-
- `$1.42`
- `~$1.42`
- `included`
- `cost n/a`
-
-Where:
-
- `$1.42` means `actual`
- `~$1.42` means `estimated`
- `included` means subscription-backed or explicitly zero-cost route
- `cost n/a` means unknown
-
-### `/usage`
-
-Show:
-
- token buckets
- estimated cost
- actual cost if available
- cost status
- pricing source
-
-### `/insights`
-
-Aggregate:
-
- actual cost totals
- estimated-only totals
- unknown-cost sessions count
- included-cost sessions count
-
-## Config And Overrides
-
-Add user-configurable pricing overrides in config:
-
-```yaml
-pricing:
-  mode: hybrid
-  sync_on_startup: true
-  sync_interval_hours: 12
-  overrides:
-    - provider: openrouter
-      model: anthropic/claude-opus-4.6
-      billing_mode: custom_contract
-      input_cost_per_million: 4.25
-      output_cost_per_million: 22.0
-      cache_read_cost_per_million: 0.5
-      cache_write_cost_per_million: 6.0
-  included_routes:
-    - provider: copilot
-      model: "*"
-    - provider: codex-subscription
-      model: "*"
-```
-
-Overrides must win over catalog defaults for the matching billing route.
-
-## Rollout Plan
-
-### Phase 1
-
- add canonical usage model
- split cache token buckets in `run_agent.py`
- stop pricing cache-inflated prompt totals
- preserve current UI with improved backend math
-
-### Phase 2
-
- add route-aware pricing catalog
- integrate OpenRouter models API sync
- add `estimated` vs `included` vs `unknown`
-
-### Phase 3
-
- add reconciliation for OpenRouter generation cost
- add actual cost persistence
- update `/insights` to prefer actual cost
-
-### Phase 4
-
- add direct OpenAI and Anthropic reconciliation paths
- add user overrides and contract pricing
- add pricing sync CLI command
-
-## Testing Strategy
-
-Add tests for:
-
- OpenAI cached token subtraction
- Anthropic cache read/write separation
- OpenRouter estimated vs actual reconciliation
- subscription-backed models showing `included`
- custom endpoints showing `n/a`
- override precedence
- stale catalog fallback behavior
-
-Current tests that assume heuristic pricing should be replaced with route-aware expectations.
-
-## Non-Goals
-
- exact enterprise billing reconstruction without an official source or user override
- backfilling perfect historical cost for old sessions that lack cache bucket data
- scraping arbitrary provider web pages at request time
-
-## Recommendation
-
-Do not expand the existing `MODEL_PRICING` dict.
-
-That path cannot satisfy the product requirement. Hermes should instead migrate to:
-
- canonical usage normalization
- route-aware pricing sources
- estimate-then-reconcile cost lifecycle
- explicit certainty states in the UI
-
-This is the minimum architecture that makes the statement "Hermes pricing is backed by official sources where possible, and otherwise clearly labeled" defensible.
@@ -1,108 +0,0 @@
-# Ink Gateway TUI Migration — Post-mortem
-
-Planned: 2026-04-01 · Delivered: 2026-04 · Status: shipped, classic (prompt_toolkit) CLI still present
-
-## What Shipped
-
-Three layers, same repo, Python runtime unchanged.
-
-```
-ui-tui (Node/TS)  ──stdio JSON-RPC──▶  tui_gateway (Py)  ──▶  AIAgent (run_agent.py)
-```
-
-### Backend — `tui_gateway/`
-
-```
-tui_gateway/
-├── entry.py          # subprocess entrypoint, stdio read/write loop
-├── server.py         # everything: sessions dict, @method handlers, _emit
-├── render.py         # stream renderer, diff rendering, message rendering
-├── slash_worker.py   # subprocess that runs hermes_cli slash commands
-└── __init__.py
-```
-
-`server.py` owns the full runtime-control surface: session store (`_sessions: dict[str, dict]`), method registry (`@method("…")` decorator), event emitter (`_emit`), agent lifecycle (`_make_agent`, `_init_session`, `_wire_callbacks`), approval/sudo/clarify round-trips, and JSON-RPC dispatch.
-
-Protocol methods (`@method(...)` in `server.py`):
-
- session: `session.{create, resume, list, close, interrupt, usage, history, compress, branch, title, save, undo}`
- prompt: `prompt.{submit, background, btw}`
- tools: `tools.{list, show, configure}`
- slash: `slash.exec`, `command.{dispatch, resolve}`, `commands.catalog`, `complete.{path, slash}`
- approvals: `approval.respond`, `sudo.respond`, `clarify.respond`, `secret.respond`
- config/state: `config.{get, set, show}`, `model.options`, `reload.mcp`
- ops: `shell.exec`, `cli.exec`, `terminal.resize`, `input.detect_drop`, `clipboard.paste`, `paste.collapse`, `image.attach`, `process.stop`
- misc: `agents.list`, `skills.manage`, `plugins.list`, `cron.manage`, `insights.get`, `rollback.{list, diff, restore}`, `browser.manage`
-
-Protocol events (`_emit(…)` → handled in `ui-tui/src/app/createGatewayEventHandler.ts`):
-
- lifecycle: `gateway.{ready, stderr}`, `session.info`, `skin.changed`
- stream: `message.{start, delta, complete}`, `thinking.delta`, `reasoning.{delta, available}`, `status.update`
- tools: `tool.{start, progress, complete, generating}`, `subagent.{start, thinking, tool, progress, complete}`
- interactive: `approval.request`, `sudo.request`, `clarify.request`, `secret.request`
- async: `background.complete`, `btw.complete`, `error`
-
-### Frontend — `ui-tui/src/`
-
-```
-src/
-├── entry.tsx            # node bootstrap: bootBanner → spawn python → dynamic-import Ink → render(<App/>)
-├── app.tsx              # <GatewayProvider> wraps <AppLayout>
-├── bootBanner.ts        # raw-ANSI banner to stdout in ~2ms, pre-React
-├── gatewayClient.ts     # JSON-RPC client over child_process stdio
-├── gatewayTypes.ts      # typed RPC responses + GatewayEvent union
-├── theme.ts             # DEFAULT_THEME + fromSkin
-│
-├── app/                 # hooks + stores — the orchestration layer
-│   ├── uiStore.ts             # nanostore: sid, info, busy, usage, theme, status…
-│   ├── turnStore.ts           # nanostore: per-turn activity / reasoning / tools
-│   ├── turnController.ts      # imperative singleton for stream-time operations
-│   ├── overlayStore.ts        # nanostore: modal/overlay state
-│   ├── useMainApp.ts          # top-level composition hook
-│   ├── useSessionLifecycle.ts # session.create/resume/close/reset
-│   ├── useSubmission.ts       # shell/slash/prompt dispatch + interpolation
-│   ├── useConfigSync.ts       # config.get + mtime poll
-│   ├── useComposerState.ts    # input buffer, paste snippets, editor mode
-│   ├── useInputHandlers.ts    # key bindings
-│   ├── createGatewayEventHandler.ts  # event-stream dispatcher
-│   ├── createSlashHandler.ts         # slash command router (registry + python fallback)
-│   └── slash/commands/        # core.ts, ops.ts, session.ts — TS-owned slash commands
-│
-├── components/          # AppLayout, AppChrome, AppOverlays, MessageLine, Thinking, Markdown, pickers, prompts, Banner, SessionPanel
-├── config/              # env, limits, timing constants
-├── content/             # charms, faces, fortunes, hotkeys, placeholders, verbs
-├── domain/              # details, messages, paths, roles, slash, usage, viewport
-├── protocol/            # interpolation, paste regex
-├── hooks/               # useCompletion, useInputHistory, useQueue, useVirtualHistory
-└── lib/                 # history, messages, osc52, rpc, text
-```
-
-### CLI entry points — `hermes_cli/main.py`
-
- `hermes --tui`      → `node dist/entry.js` (auto-builds when `.ts`/`.tsx` newer than `dist/entry.js`)
- `hermes --tui --dev` → `tsx src/entry.tsx` (skip build)
- `HERMES_TUI_DIR=…`  → external prebuilt dist (nix, distro packaging)
-
-## Diverged From Original Plan
-
-| Plan | Reality | Why |
-|---|---|---|
-| `tui_gateway/{controller,session_state,events,protocol}.py` | all collapsed into `server.py` | no second consumer ever emerged, keeping one file cheaper than four |
-| `ui-tui/src/main.tsx` | split into `entry.tsx` (bootstrap) + `app.tsx` (shell) | boot banner + early python spawn wanted a pre-React moment |
-| `ui-tui/src/state/store.ts` | three nanostores (`uiStore`, `turnStore`, `overlayStore`) | separate lifetimes: ui persists, turn resets per reply, overlay is modal |
-| `approval.requested` / `sudo.requested` / `clarify.requested` | `*.request` (no `-ed`) | cosmetic |
-| `session.cancel` | dropped | `session.interrupt` covers it |
-| `HERMES_EXPERIMENTAL_TUI=1`, `display.experimental_tui: true`, `/tui on/off/status` | none shipped | `--tui` went from opt-in to first-class without an experimental phase |
-
-## Post-migration Additions (not in original plan)
-
- **Async `session.create`** — returns sid in ~1ms, agent builds on a background thread, `session.info` broadcasts when ready; `_wait_agent()` gates every agent-touching handler via `_sess`
- **`bootBanner`** — raw-ANSI logo painted to stdout at T≈2ms, before Ink loads; `<AlternateScreen>` wipes it seamlessly when React mounts
- **Selection uniform bg** — `theme.color.selectionBg` wired via `useSelection().setSelectionBgColor`; replaces SGR-inverse per-cell swap that fragmented over amber/gold fg
- **Slash command registry** — TS-owned commands in `app/slash/commands/{core,ops,session}.ts`, everything else falls through to `slash.exec` (python worker)
- **Turn store + controller split** — imperative singleton (`turnController`) holds refs/timers, nanostore (`turnStore`) holds render-visible state
-
-## What's Still Open
-
- **Classic CLI not deleted.** `cli.py` still has ~80 `prompt_toolkit` references; classic REPL is still the default when `--tui` is absent. The original plan's "Cut 4 · prompt_toolkit removal later" hasn't happened.
- **No config-file opt-in.** `HERMES_EXPERIMENTAL_TUI` and `display.experimental_tui` were never built; only the CLI flag exists. Fine for now — if we want "default to TUI", a single line in `main.py` flips it.
@@ -1,106 +0,0 @@
-# ============================================================================
-# Hermes Agent — Example Skin Template
-# ============================================================================
-#
-# Copy this file to ~/.hermes/skins/<name>.yaml to create a custom skin.
-# All fields are optional — missing values inherit from the default skin.
-# Activate with: /skin <name>  or  display.skin: <name> in config.yaml
-#
-# Keys are marked:
-#   (both)    — applies to both the classic CLI and the TUI
-#   (classic) — classic CLI only (see hermes --tui in user-guide/tui.md)
-#   (tui)     — TUI only
-#
-# See hermes_cli/skin_engine.py for the full schema reference.
-# ============================================================================
-
-# Required: unique skin name (used in /skin command and config)
-name: example
-description: An example custom skin — copy and modify this template
-
-# ── Colors ──────────────────────────────────────────────────────────────────
-# Hex color values. These control the visual palette.
-colors:
-  # Banner panel (the startup welcome box) — (both)
-  banner_border: "#CD7F32"        # Panel border
-  banner_title: "#FFD700"         # Panel title text
-  banner_accent: "#FFBF00"        # Section headers (Available Tools, Skills, etc.)
-  banner_dim: "#B8860B"           # Dim/muted text (separators, model info)
-  banner_text: "#FFF8DC"          # Body text (tool names, skill names)
-
-  # UI elements — (both)
-  ui_accent: "#FFBF00"            # General accent (falls back to banner_accent)
-  ui_label: "#4dd0e1"             # Labels
-  ui_ok: "#4caf50"                # Success indicators
-  ui_error: "#ef5350"             # Error indicators
-  ui_warn: "#ffa726"              # Warning indicators
-
-  # Input area
-  prompt: "#FFF8DC"               # Prompt text / `❯` glyph color (both)
-  input_rule: "#CD7F32"           # Horizontal rule above input (classic)
-
-  # Response box — (classic)
-  response_border: "#FFD700"      # Response box border
-
-  # Session display — (both)
-  session_label: "#DAA520"        # "Session: " label
-  session_border: "#8B8682"       # Session ID text
-
-  # TUI / CLI surfaces — (classic: status bar, voice badge, completion meta)
-  status_bar_bg: "#1a1a2e"              # Status / usage bar background (classic)
-  voice_status_bg: "#1a1a2e"            # Voice-mode badge background (classic)
-  completion_menu_bg: "#1a1a2e"         # Completion list background (both)
-  completion_menu_current_bg: "#333355" # Active completion row background (both)
-  completion_menu_meta_bg: "#1a1a2e"    # Completion meta column bg (classic)
-  completion_menu_meta_current_bg: "#333355"  # Active meta bg (classic)
-
-  # Drag-to-select background — (tui)
-  selection_bg: "#3a3a55"               # Uniform selection highlight in the TUI
-
-# ── Spinner ─────────────────────────────────────────────────────────────────
-# (classic) — the TUI uses its own animated indicators; spinner config here
-# is only read by the classic prompt_toolkit CLI.
-spinner:
-  # Faces shown while waiting for the API response
-  waiting_faces:
-    - "(｡◕‿◕｡)"
-    - "(◕‿◕✿)"
-    - "٩(◕‿◕｡)۶"
-
-  # Faces shown during extended thinking/reasoning
-  thinking_faces:
-    - "(｡•́︿•̀｡)"
-    - "(◔_◔)"
-    - "(¬‿¬)"
-
-  # Verbs used in spinner messages (e.g., "pondering your request...")
-  thinking_verbs:
-    - "pondering"
-    - "contemplating"
-    - "musing"
-    - "ruminating"
-
-  # Optional: left/right decorations around the spinner
-  # Each entry is a [left, right] pair. Omit entirely for no wings.
-  # wings:
-  #   - ["⟪⚔", "⚔⟫"]
-  #   - ["⟪▲", "▲⟫"]
-
-# ── Branding ────────────────────────────────────────────────────────────────
-# Text strings used throughout the interface.
-branding:
-  agent_name: "Hermes Agent"                  # (both) Banner title, about display
-  welcome: "Welcome! Type your message or /help for commands."  # (both)
-  goodbye: "Goodbye! ⚕"                       # (both) Exit message
-  response_label: " ⚕ Hermes "                # (classic) Response box header label
-  prompt_symbol: "❯ "                          # (both) Input prompt glyph
-  help_header: "(^_^)? Available Commands"     # (both) /help overlay title
-
-# ── Tool Output ─────────────────────────────────────────────────────────────
-# Character used as the prefix for tool output lines. (both)
-# Default is "┊" (thin dotted vertical line). Some alternatives:
-#   "╎" (light triple dash vertical)
-#   "▏" (left one-eighth block)
-#   "│" (box drawing light vertical)
-#   "┃" (box drawing heavy vertical)
-tool_prefix: "┊"
@@ -1,329 +0,0 @@
-# Container-Aware CLI Review Fixes Spec
-
-**PR:** NousResearch/hermes-agent#7543
-**Review:** cursor[bot] bugbot review (4094049442) + two prior rounds
-**Date:** 2026-04-12
-**Branch:** `feat/container-aware-cli-clean`
-
-## Review Issues Summary
-
-Six issues were raised across three bugbot review rounds. Three were fixed in intermediate commits (38277a6a, 726cf90f). This spec addresses remaining design concerns surfaced by those reviews and simplifies the implementation based on interview decisions.
-
-| # | Issue | Severity | Status |
-|---|-------|----------|--------|
-| 1 | `os.execvp` retry loop unreachable | Medium | Fixed in 79e8cd12 (switched to subprocess.run) |
-| 2 | Redundant `shutil.which("sudo")` | Medium | Fixed in 38277a6a (reuses `sudo` var) |
-| 3 | Missing `chown -h` on symlink update | Low | Fixed in 38277a6a |
-| 4 | Container routing after `parse_args()` | High | Fixed in 726cf90f |
-| 5 | Hardcoded `/home/${user}` | Medium | Fixed in 726cf90f |
-| 6 | Group membership not gated on `container.enable` | Low | Fixed in 726cf90f |
-
-The mechanical fixes are in place but the overall design needs revision. The retry loop, error swallowing, and process model have deeper issues than what the bugbot flagged.
-
---
-
-## Spec: Revised `_exec_in_container`
-
-### Design Principles
-
-1. **Let it crash.** No silent fallbacks. If `.container-mode` exists but something goes wrong, the error propagates naturally (Python traceback). The only case where container routing is skipped is when `.container-mode` doesn't exist or `HERMES_DEV=1`.
-2. **No retries.** Probe once for sudo, exec once. If it fails, docker/podman's stderr reaches the user verbatim.
-3. **Completely transparent.** No error wrapping, no prefixes, no spinners. Docker's output goes straight through.
-4. **`os.execvp` on the happy path.** Replace the Python process entirely so there's no idle parent during interactive sessions. Note: `execvp` never returns on success (process is replaced) and raises `OSError` on failure (it does not return a value). The container process's exit code becomes the process exit code by definition — no explicit propagation needed.
-5. **One human-readable exception to "let it crash".** `subprocess.TimeoutExpired` from the sudo probe gets a specific catch with a readable message, since a raw traceback for "your Docker daemon is slow" is confusing. All other exceptions propagate naturally.
-
-### Execution Flow
-
-```
-1. get_container_exec_info()
-   - HERMES_DEV=1 → return None (skip routing)
-   - Inside container → return None (skip routing)
-   - .container-mode doesn't exist → return None (skip routing)
-   - .container-mode exists → parse and return dict
-   - .container-mode exists but malformed/unreadable → LET IT CRASH (no try/except)
-
-2. _exec_in_container(container_info, sys.argv[1:])
-   a. shutil.which(backend) → if None, print "{backend} not found on PATH" and sys.exit(1)
-   b. Sudo probe: subprocess.run([runtime, "inspect", "--format", "ok", container_name], timeout=15)
-      - If succeeds → needs_sudo = False
-      - If fails → try subprocess.run([sudo, "-n", runtime, "inspect", ...], timeout=15)
-        - If succeeds → needs_sudo = True
-        - If fails → print error with sudoers hint (including why -n is required) and sys.exit(1)
-      - If TimeoutExpired → catch specifically, print human-readable message about slow daemon
-   c. Build exec_cmd: [sudo? + runtime, "exec", tty_flags, "-u", exec_user, env_flags, container, hermes_bin, *cli_args]
-   d. os.execvp(exec_cmd[0], exec_cmd)
-      - On success: process is replaced — Python is gone, container exit code IS the process exit code
-      - On OSError: let it crash (natural traceback)
-```
-
-### Changes to `hermes_cli/main.py`
-
-#### `_exec_in_container` — rewrite
-
-Remove:
- The entire retry loop (`max_retries`, `for attempt in range(...)`)
- Spinner logic (`"Waiting for container..."`, dots)
- Exit code classification (125/126/127 handling)
- `subprocess.run` for the exec call (keep it only for the sudo probe)
- Special TTY vs non-TTY retry counts
- The `time` import (no longer needed)
-
-Change:
- Use `os.execvp(exec_cmd[0], exec_cmd)` as the final call
- Keep the `subprocess` import only for the sudo probe
- Keep TTY detection for the `-it` vs `-i` flag
- Keep env var forwarding (TERM, COLORTERM, LANG, LC_ALL)
- Keep the sudo probe as-is (it's the one "smart" part)
- Bump probe `timeout` from 5s to 15s — cold podman on a loaded machine needs headroom
- Catch `subprocess.TimeoutExpired` specifically on both probe calls — print a readable message about the daemon being unresponsive instead of a raw traceback
- Expand the sudoers hint error message to explain *why* `-n` (non-interactive) is required: a password prompt would hang the CLI or break piped commands
-
-The function becomes roughly:
-
-```python
-def _exec_in_container(container_info: dict, cli_args: list):
-    """Replace the current process with a command inside the managed container.
-
-    Probes whether sudo is needed (rootful containers), then os.execvp
-    into the container. If exec fails, the OS error propagates naturally.
-    """
-    import shutil
-    import subprocess
-
-    backend = container_info["backend"]
-    container_name = container_info["container_name"]
-    exec_user = container_info["exec_user"]
-    hermes_bin = container_info["hermes_bin"]
-
-    runtime = shutil.which(backend)
-    if not runtime:
-        print(f"Error: {backend} not found on PATH. Cannot route to container.",
-              file=sys.stderr)
-        sys.exit(1)
-
-    # Probe whether we need sudo to see the rootful container.
-    # Timeout is 15s — cold podman on a loaded machine can take a while.
-    # TimeoutExpired is caught specifically for a human-readable message;
-    # all other exceptions propagate naturally.
-    needs_sudo = False
-    sudo = None
-    try:
-        probe = subprocess.run(
-            [runtime, "inspect", "--format", "ok", container_name],
-            capture_output=True, text=True, timeout=15,
-        )
-    except subprocess.TimeoutExpired:
-        print(
-            f"Error: timed out waiting for {backend} to respond.\n"
-            f"The {backend} daemon may be unresponsive or starting up.",
-            file=sys.stderr,
-        )
-        sys.exit(1)
-
-    if probe.returncode != 0:
-        sudo = shutil.which("sudo")
-        if sudo:
-            try:
-                probe2 = subprocess.run(
-                    [sudo, "-n", runtime, "inspect", "--format", "ok", container_name],
-                    capture_output=True, text=True, timeout=15,
-                )
-            except subprocess.TimeoutExpired:
-                print(
-                    f"Error: timed out waiting for sudo {backend} to respond.",
-                    file=sys.stderr,
-                )
-                sys.exit(1)
-
-            if probe2.returncode == 0:
-                needs_sudo = True
-            else:
-                print(
-                    f"Error: container '{container_name}' not found via {backend}.\n"
-                    f"\n"
-                    f"The NixOS service runs the container as root. Your user cannot\n"
-                    f"see it because {backend} uses per-user namespaces.\n"
-                    f"\n"
-                    f"Fix: grant passwordless sudo for {backend}. The -n (non-interactive)\n"
-                    f"flag is required because the CLI calls sudo non-interactively —\n"
-                    f"a password prompt would hang or break piped commands:\n"
-                    f"\n"
-                    f'  security.sudo.extraRules = [{{\n'
-                    f'    users = [ "{os.getenv("USER", "your-user")}" ];\n'
-                    f'    commands = [{{ command = "{runtime}"; options = [ "NOPASSWD" ]; }}];\n'
-                    f'  }}];\n'
-                    f"\n"
-                    f"Or run: sudo hermes {' '.join(cli_args)}",
-                    file=sys.stderr,
-                )
-                sys.exit(1)
-        else:
-            print(
-                f"Error: container '{container_name}' not found via {backend}.\n"
-                f"The container may be running under root. Try: sudo hermes {' '.join(cli_args)}",
-                file=sys.stderr,
-            )
-            sys.exit(1)
-
-    is_tty = sys.stdin.isatty()
-    tty_flags = ["-it"] if is_tty else ["-i"]
-
-    env_flags = []
-    for var in ("TERM", "COLORTERM", "LANG", "LC_ALL"):
-        val = os.environ.get(var)
-        if val:
-            env_flags.extend(["-e", f"{var}={val}"])
-
-    cmd_prefix = [sudo, "-n", runtime] if needs_sudo else [runtime]
-    exec_cmd = (
-        cmd_prefix + ["exec"]
-        + tty_flags
-        + ["-u", exec_user]
-        + env_flags
-        + [container_name, hermes_bin]
-        + cli_args
-    )
-
-    # execvp replaces this process entirely — it never returns on success.
-    # On failure it raises OSError, which propagates naturally.
-    os.execvp(exec_cmd[0], exec_cmd)
-```
-
-#### Container routing call site in `main()` — remove try/except
-
-Current:
-```python
-try:
-    from hermes_cli.config import get_container_exec_info
-    container_info = get_container_exec_info()
-    if container_info:
-        _exec_in_container(container_info, sys.argv[1:])
-        sys.exit(1)  # exec failed if we reach here
-except SystemExit:
-    raise
-except Exception:
-    pass  # Container routing unavailable, proceed locally
-```
-
-Revised:
-```python
-from hermes_cli.config import get_container_exec_info
-container_info = get_container_exec_info()
-if container_info:
-    _exec_in_container(container_info, sys.argv[1:])
-    # Unreachable: os.execvp never returns on success (process is replaced)
-    # and raises OSError on failure (which propagates as a traceback).
-    # This line exists only as a defensive assertion.
-    sys.exit(1)
-```
-
-No try/except. If `.container-mode` doesn't exist, `get_container_exec_info()` returns `None` and we skip routing. If it exists but is broken, the exception propagates with a natural traceback.
-
-Note: `sys.exit(1)` after `_exec_in_container` is dead code in all paths — `os.execvp` either replaces the process or raises. It's kept as a belt-and-suspenders assertion with a comment marking it unreachable, not as actual error handling.
-
-### Changes to `hermes_cli/config.py`
-
-#### `get_container_exec_info` — remove inner try/except
-
-Current code catches `(OSError, IOError)` and returns `None`. This silently hides permission errors, corrupt files, etc.
-
-Change: Remove the try/except around file reading. Keep the early returns for `HERMES_DEV=1` and `_is_inside_container()`. The `FileNotFoundError` from `open()` when `.container-mode` doesn't exist should still return `None` (this is the "container mode not enabled" case). All other exceptions propagate.
-
-```python
-def get_container_exec_info() -> Optional[dict]:
-    if os.environ.get("HERMES_DEV") == "1":
-        return None
-    if _is_inside_container():
-        return None
-
-    container_mode_file = get_hermes_home() / ".container-mode"
-
-    try:
-        with open(container_mode_file, "r") as f:
-            # ... parse key=value lines ...
-    except FileNotFoundError:
-        return None
-    # All other exceptions (PermissionError, malformed data, etc.) propagate
-
-    return { ... }
-```
-
---
-
-## Spec: NixOS Module Changes
-
-### Symlink creation — simplify to two branches
-
-Current: 4 branches (symlink exists, directory exists, other file, doesn't exist).
-
-Revised: 2 branches.
-
-```bash
-if [ -d "${symlinkPath}" ] && [ ! -L "${symlinkPath}" ]; then
-  # Real directory — back it up, then create symlink
-  _backup="${symlinkPath}.bak.$(date +%s)"
-  echo "hermes-agent: backing up existing ${symlinkPath} to $_backup"
-  mv "${symlinkPath}" "$_backup"
-fi
-# For everything else (symlink, doesn't exist, etc.) — just force-create
-ln -sfn "${target}" "${symlinkPath}"
-chown -h ${user}:${cfg.group} "${symlinkPath}"
-```
-
-`ln -sfn` handles: existing symlink (replaces), doesn't exist (creates), and after the `mv` above (creates). The only case that needs special handling is a real directory, because `ln -sfn` cannot atomically replace a directory.
-
-Note: there is a theoretical race between the `[ -d ... ]` check and the `mv` (something could create/remove the directory in between). In practice this is a NixOS activation script running as root during `nixos-rebuild switch` — no other process should be touching `~/.hermes` at that moment. Not worth adding locking for.
-
-### Sudoers — document, don't auto-configure
-
-Do NOT add `security.sudo.extraRules` to the module. Document the sudoers requirement in the module's description/comments and in the error message the CLI prints when sudo probe fails.
-
-### Group membership gating — keep as-is
-
-The fix in 726cf90f (`cfg.container.enable && cfg.container.hostUsers != []`) is correct. Leftover group membership when container mode is disabled is harmless. No cleanup needed.
-
---
-
-## Spec: Test Rewrite
-
-The existing test file (`tests/hermes_cli/test_container_aware_cli.py`) has 16 tests. With the simplified exec model, several are obsolete.
-
-### Tests to keep (update as needed)
-
- `test_is_inside_container_dockerenv` — unchanged
- `test_is_inside_container_containerenv` — unchanged
- `test_is_inside_container_cgroup_docker` — unchanged
- `test_is_inside_container_false_on_host` — unchanged
- `test_get_container_exec_info_returns_metadata` — unchanged
- `test_get_container_exec_info_none_inside_container` — unchanged
- `test_get_container_exec_info_none_without_file` — unchanged
- `test_get_container_exec_info_skipped_when_hermes_dev` — unchanged
- `test_get_container_exec_info_not_skipped_when_hermes_dev_zero` — unchanged
- `test_get_container_exec_info_defaults` — unchanged
- `test_get_container_exec_info_docker_backend` — unchanged
-
-### Tests to add
-
- `test_get_container_exec_info_crashes_on_permission_error` — verify that `PermissionError` propagates (no silent `None` return)
- `test_exec_in_container_calls_execvp` — verify `os.execvp` is called with correct args (runtime, tty flags, user, env, container, binary, cli args)
- `test_exec_in_container_sudo_probe_sets_prefix` — verify that when first probe fails and sudo probe succeeds, `os.execvp` is called with `sudo -n` prefix
- `test_exec_in_container_no_runtime_hard_fails` — keep existing, verify `sys.exit(1)` when `shutil.which` returns None
- `test_exec_in_container_non_tty_uses_i_only` — update to check `os.execvp` args instead of `subprocess.run` args
- `test_exec_in_container_probe_timeout_prints_message` — verify that `subprocess.TimeoutExpired` from the probe produces a human-readable error and `sys.exit(1)`, not a raw traceback
- `test_exec_in_container_container_not_running_no_sudo` — verify the path where runtime exists (`shutil.which` returns a path) but probe returns non-zero and no sudo is available. Should print the "container may be running under root" error. This is distinct from `no_runtime_hard_fails` which covers `shutil.which` returning None.
-
-### Tests to delete
-
- `test_exec_in_container_tty_retries_on_container_failure` — retry loop removed
- `test_exec_in_container_non_tty_retries_silently_exits_126` — retry loop removed
- `test_exec_in_container_propagates_hermes_exit_code` — no subprocess.run to check exit codes; execvp replaces the process. Note: exit code propagation still works correctly — when `os.execvp` succeeds, the container's process *becomes* this process, so its exit code is the process exit code by OS semantics. No application code needed, no test needed. A comment in the function docstring documents this intent for future readers.
-
---
-
-## Out of Scope
-
- Auto-configuring sudoers rules in the NixOS module
- Any changes to `get_container_exec_info` parsing logic beyond the try/except narrowing
- Changes to `.container-mode` file format
- Changes to the `HERMES_DEV=1` bypass
- Changes to container detection logic (`_is_inside_container`)
@@ -6,6 +6,7 @@ and implement the required methods.
 """

 import asyncio
+import inspect
 import ipaddress
 import logging
 import os
@@ -880,10 +881,11 @@ class BasePlatformAdapter(ABC):
        # working on a task after --replace or manual restarts.
        self._background_tasks: set[asyncio.Task] = set()
        # One-shot callbacks to fire after the main response is delivered.
-        # Keyed by session_key.  GatewayRunner uses this to defer
-        # background-review notifications ("💾 Skill created") until the
-        # primary reply has been sent.
-        self._post_delivery_callbacks: Dict[str, Callable] = {}
+        # Keyed by session_key. Values are either a bare callback (legacy) or
+        # a ``(generation, callback)`` tuple so GatewayRunner can make deferred
+        # deliveries generation-aware and avoid stale runs clearing callbacks
+        # registered by a fresher run for the same session.
+        self._post_delivery_callbacks: Dict[str, Any] = {}
        self._expected_cancelled_tasks: set[asyncio.Task] = set()
        self._busy_session_handler: Optional[Callable[[MessageEvent, str], Awaitable[bool]]] = None
        # Chats where auto-TTS on voice input is disabled (set by /voice off)
@@ -1401,7 +1403,13 @@ class BasePlatformAdapter(ABC):

        return paths, cleaned

-    async def _keep_typing(self, chat_id: str, interval: float = 2.0, metadata=None) -> None:
+    async def _keep_typing(
+        self,
+        chat_id: str,
+        interval: float = 2.0,
+        metadata=None,
+        stop_event: asyncio.Event | None = None,
+    ) -> None:
        """
        Continuously send typing indicator until cancelled.
        
@@ -1415,9 +1423,18 @@ class BasePlatformAdapter(ABC):
        """
        try:
            while True:
+                if stop_event is not None and stop_event.is_set():
+                    return
                if chat_id not in self._typing_paused:
                    await self.send_typing(chat_id, metadata=metadata)
-                await asyncio.sleep(interval)
+                if stop_event is None:
+                    await asyncio.sleep(interval)
+                    continue
+                try:
+                    await asyncio.wait_for(stop_event.wait(), timeout=interval)
+                except asyncio.TimeoutError:
+                    continue
+                return
        except asyncio.CancelledError:
            pass  # Normal cancellation when handler completes
        finally:
@@ -1444,6 +1461,59 @@ class BasePlatformAdapter(ABC):
        """Resume typing indicator for a chat after approval resolves."""
        self._typing_paused.discard(chat_id)

+    async def interrupt_session_activity(self, session_key: str, chat_id: str) -> None:
+        """Signal the active session loop to stop and clear typing immediately."""
+        if session_key:
+            interrupt_event = self._active_sessions.get(session_key)
+            if interrupt_event is not None:
+                interrupt_event.set()
+        try:
+            await self.stop_typing(chat_id)
+        except Exception:
+            pass
+
+    def register_post_delivery_callback(
+        self,
+        session_key: str,
+        callback: Callable,
+        *,
+        generation: int | None = None,
+    ) -> None:
+        """Register a deferred callback to fire after the main response.
+
+        ``generation`` lets callers tie the callback to a specific gateway run
+        generation so stale runs cannot clear callbacks owned by a fresher run.
+        """
+        if not session_key or not callable(callback):
+            return
+        if generation is None:
+            self._post_delivery_callbacks[session_key] = callback
+        else:
+            self._post_delivery_callbacks[session_key] = (int(generation), callback)
+
+    def pop_post_delivery_callback(
+        self,
+        session_key: str,
+        *,
+        generation: int | None = None,
+    ) -> Callable | None:
+        """Pop a deferred callback, optionally requiring generation ownership."""
+        if not session_key:
+            return None
+        entry = self._post_delivery_callbacks.get(session_key)
+        if entry is None:
+            return None
+        if isinstance(entry, tuple) and len(entry) == 2:
+            entry_generation, callback = entry
+            if generation is not None and int(entry_generation) != int(generation):
+                return None
+            self._post_delivery_callbacks.pop(session_key, None)
+            return callback if callable(callback) else None
+        if generation is not None:
+            return None
+        self._post_delivery_callbacks.pop(session_key, None)
+        return entry if callable(entry) else None
+
    # ── Processing lifecycle hooks ──────────────────────────────────────────
    # Subclasses override these to react to message processing events
    # (e.g. Discord adds 👀/✅/❌ reactions).
@@ -1714,10 +1784,23 @@ class BasePlatformAdapter(ABC):
        # Fall back to a new Event only if the entry was removed externally.
        interrupt_event = self._active_sessions.get(session_key) or asyncio.Event()
        self._active_sessions[session_key] = interrupt_event
+        callback_generation = getattr(interrupt_event, "_hermes_run_generation", None)
        
        # Start continuous typing indicator (refreshes every 2 seconds)
        _thread_metadata = {"thread_id": event.source.thread_id} if event.source.thread_id else None
-        typing_task = asyncio.create_task(self._keep_typing(event.source.chat_id, metadata=_thread_metadata))
+        _keep_typing_kwargs = {"metadata": _thread_metadata}
+        try:
+            _keep_typing_sig = inspect.signature(self._keep_typing)
+        except (TypeError, ValueError):
+            _keep_typing_sig = None
+        if _keep_typing_sig is None or "stop_event" in _keep_typing_sig.parameters:
+            _keep_typing_kwargs["stop_event"] = interrupt_event
+        typing_task = asyncio.create_task(
+            self._keep_typing(
+                event.source.chat_id,
+                **_keep_typing_kwargs,
+            )
+        )
        
        try:
            await self._run_processing_hook("on_processing_start", event)
@@ -1926,9 +2009,18 @@ class BasePlatformAdapter(ABC):
            if session_key in self._pending_messages:
                pending_event = self._pending_messages.pop(session_key)
                logger.debug("[%s] Processing queued message from interrupt", self.name)
-                # Clean up current session before processing pending
-                if session_key in self._active_sessions:
-                    del self._active_sessions[session_key]
+                # Keep the _active_sessions entry live across the turn chain
+                # and only CLEAR the interrupt Event — do NOT delete the entry.
+                # If we deleted here, a concurrent inbound message arriving
+                # during the awaits below would pass the Level-1 guard, spawn
+                # its own _process_message_background, and run simultaneously
+                # with the recursive drain below.  Two agents on one
+                # session_key = duplicate responses, duplicate tool calls.
+                # Clearing the Event keeps the guard live so follow-ups take
+                # the busy-handler path (queue + interrupt) as intended.
+                _active = self._active_sessions.get(session_key)
+                if _active is not None:
+                    _active.clear()
                typing_task.cancel()
                try:
                    await typing_task
@@ -1967,7 +2059,14 @@ class BasePlatformAdapter(ABC):
        finally:
            # Fire any one-shot post-delivery callback registered for this
            # session (e.g. deferred background-review notifications).
-            _post_cb = getattr(self, "_post_delivery_callbacks", {}).pop(session_key, None)
+            _callback_generation = callback_generation
+            if hasattr(self, "pop_post_delivery_callback"):
+                _post_cb = self.pop_post_delivery_callback(
+                    session_key,
+                    generation=_callback_generation,
+                )
+            else:
+                _post_cb = getattr(self, "_post_delivery_callbacks", {}).pop(session_key, None)
            if callable(_post_cb):
                try:
                    _post_cb()
@@ -1986,9 +2085,37 @@ class BasePlatformAdapter(ABC):
                    await self.stop_typing(event.source.chat_id)
            except Exception:
                pass
-            # Clean up session tracking
-            if session_key in self._active_sessions:
-                del self._active_sessions[session_key]
+            # Late-arrival drain: a message may have arrived during the
+            # cleanup awaits above (typing_task cancel, stop_typing).  Such
+            # messages passed the Level-1 guard (entry still live, Event
+            # possibly set) and landed in _pending_messages via the
+            # busy-handler path.  Without this block, we would delete the
+            # active-session entry and the queued message would be silently
+            # dropped (user never gets a reply).
+            late_pending = self._pending_messages.pop(session_key, None)
+            if late_pending is not None:
+                logger.debug(
+                    "[%s] Late-arrival pending message during cleanup — spawning drain task",
+                    self.name,
+                )
+                _active = self._active_sessions.get(session_key)
+                if _active is not None:
+                    _active.clear()
+                drain_task = asyncio.create_task(
+                    self._process_message_background(late_pending, session_key)
+                )
+                try:
+                    self._background_tasks.add(drain_task)
+                    drain_task.add_done_callback(self._background_tasks.discard)
+                except TypeError:
+                    # Tests stub create_task() with non-hashable sentinels; tolerate.
+                    pass
+                # Leave _active_sessions[session_key] populated — the drain
+                # task's own lifecycle will clean it up.
+            else:
+                # Clean up session tracking
+                if session_key in self._active_sessions:
+                    del self._active_sessions[session_key]
    
    async def cancel_background_tasks(self) -> None:
        """Cancel any in-flight background message-processing tasks.
@@ -1996,12 +2123,26 @@ class BasePlatformAdapter(ABC):
        Used during gateway shutdown/replacement so active sessions from the old
        process do not keep running after adapters are being torn down.
        """
-        tasks = [task for task in self._background_tasks if not task.done()]
-        for task in tasks:
-            self._expected_cancelled_tasks.add(task)
-            task.cancel()
-        if tasks:
+        # Loop until no new tasks appear.  Without this, a message
+        # arriving during the `await asyncio.gather` below would spawn
+        # a fresh _process_message_background task (added to
+        # self._background_tasks at line ~1668 via handle_message),
+        # and the _background_tasks.clear() at the end of this method
+        # would drop the reference — the task runs untracked against a
+        # disconnecting adapter, logs send-failures, and may linger
+        # until it completes on its own.  Retrying the drain until the
+        # task set stabilizes closes the window.
+        MAX_DRAIN_ROUNDS = 5
+        for _ in range(MAX_DRAIN_ROUNDS):
+            tasks = [task for task in self._background_tasks if not task.done()]
+            if not tasks:
+                break
+            for task in tasks:
+                self._expected_cancelled_tasks.add(task)
+                task.cancel()
            await asyncio.gather(*tasks, return_exceptions=True)
+            # Loop: late-arrival tasks spawned during the gather above
+            # will be in self._background_tasks now.  Re-check.
        self._background_tasks.clear()
        self._expected_cancelled_tasks.clear()
        self._pending_messages.clear()
@@ -498,6 +498,7 @@ class DiscordAdapter(BasePlatformAdapter):
        self._allowed_role_ids: set = set()  # For DISCORD_ALLOWED_ROLES filtering
        # Voice channel state (per-guild)
        self._voice_clients: Dict[int, Any] = {}  # guild_id -> VoiceClient
+        self._voice_locks: Dict[int, asyncio.Lock] = {}  # guild_id -> serialize join/leave
        # Text batching: merge rapid successive messages (Telegram-style)
        self._text_batch_delay_seconds = float(os.getenv("HERMES_DISCORD_TEXT_BATCH_DELAY_SECONDS", "0.6"))
        self._text_batch_split_delay_seconds = float(os.getenv("HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS", "2.0"))
@@ -636,6 +637,15 @@ class DiscordAdapter(BasePlatformAdapter):

            @self._client.event
            async def on_message(message: DiscordMessage):
+                # Block until _resolve_allowed_usernames has swapped
+                # any raw usernames in DISCORD_ALLOWED_USERS for numeric
+                # IDs (otherwise on_message's author.id lookup can miss).
+                if not adapter_self._ready_event.is_set():
+                    try:
+                        await asyncio.wait_for(adapter_self._ready_event.wait(), timeout=30.0)
+                    except asyncio.TimeoutError:
+                        pass
+
                # Dedup: Discord RESUME replays events after reconnects (#4777)
                if adapter_self._dedup.is_duplicate(str(message.id)):
                    return
@@ -1071,6 +1081,8 @@ class DiscordAdapter(BasePlatformAdapter):
        chat_id: str,
        message_id: str,
        content: str,
+        *,
+        finalize: bool = False,
    ) -> SendResult:
        """Edit a previously sent Discord message."""
        if not self._client:
@@ -1237,51 +1249,53 @@ class DiscordAdapter(BasePlatformAdapter):
            return False
        guild_id = channel.guild.id

-        # Already connected in this guild?
-        existing = self._voice_clients.get(guild_id)
-        if existing and existing.is_connected():
-            if existing.channel.id == channel.id:
+        async with self._voice_locks.setdefault(guild_id, asyncio.Lock()):
+            # Already connected in this guild?
+            existing = self._voice_clients.get(guild_id)
+            if existing and existing.is_connected():
+                if existing.channel.id == channel.id:
+                    self._reset_voice_timeout(guild_id)
+                    return True
+                await existing.move_to(channel)
                self._reset_voice_timeout(guild_id)
                return True
-            await existing.move_to(channel)
+
+            vc = await channel.connect()
+            self._voice_clients[guild_id] = vc
            self._reset_voice_timeout(guild_id)
+
+            # Start voice receiver (Phase 2: listen to users)
+            try:
+                receiver = VoiceReceiver(vc, allowed_user_ids=self._allowed_user_ids)
+                receiver.start()
+                self._voice_receivers[guild_id] = receiver
+                self._voice_listen_tasks[guild_id] = asyncio.ensure_future(
+                    self._voice_listen_loop(guild_id)
+                )
+            except Exception as e:
+                logger.warning("Voice receiver failed to start: %s", e)
+
            return True

-        vc = await channel.connect()
-        self._voice_clients[guild_id] = vc
-        self._reset_voice_timeout(guild_id)
-
-        # Start voice receiver (Phase 2: listen to users)
-        try:
-            receiver = VoiceReceiver(vc, allowed_user_ids=self._allowed_user_ids)
-            receiver.start()
-            self._voice_receivers[guild_id] = receiver
-            self._voice_listen_tasks[guild_id] = asyncio.ensure_future(
-                self._voice_listen_loop(guild_id)
-            )
-        except Exception as e:
-            logger.warning("Voice receiver failed to start: %s", e)
-
-        return True
-
    async def leave_voice_channel(self, guild_id: int) -> None:
        """Disconnect from the voice channel in a guild."""
-        # Stop voice receiver first
-        receiver = self._voice_receivers.pop(guild_id, None)
-        if receiver:
-            receiver.stop()
-        listen_task = self._voice_listen_tasks.pop(guild_id, None)
-        if listen_task:
-            listen_task.cancel()
+        async with self._voice_locks.setdefault(guild_id, asyncio.Lock()):
+            # Stop voice receiver first
+            receiver = self._voice_receivers.pop(guild_id, None)
+            if receiver:
+                receiver.stop()
+            listen_task = self._voice_listen_tasks.pop(guild_id, None)
+            if listen_task:
+                listen_task.cancel()

-        vc = self._voice_clients.pop(guild_id, None)
-        if vc and vc.is_connected():
-            await vc.disconnect()
-        task = self._voice_timeout_tasks.pop(guild_id, None)
-        if task:
-            task.cancel()
-        self._voice_text_channels.pop(guild_id, None)
-        self._voice_sources.pop(guild_id, None)
+            vc = self._voice_clients.pop(guild_id, None)
+            if vc and vc.is_connected():
+                await vc.disconnect()
+            task = self._voice_timeout_tasks.pop(guild_id, None)
+            if task:
+                task.cancel()
+            self._voice_text_channels.pop(guild_id, None)
+            self._voice_sources.pop(guild_id, None)

    # Maximum seconds to wait for voice playback before giving up
    PLAYBACK_TIMEOUT = 120
@@ -1933,6 +1947,24 @@ class DiscordAdapter(BasePlatformAdapter):
        the "thinking..." indicator is replaced with that text; otherwise it
        is deleted so the channel isn't cluttered.
        """
+        # Log the invoker so ghost-command reports can be triaged.  Discord
+        # native slash invocations are always user-initiated (no bot can fire
+        # them), but mobile autocomplete / keyboard shortcuts / other users
+        # in the same channel are easy to miss in post-mortems.
+        try:
+            _user = interaction.user
+            _chan_id = getattr(interaction.channel, "id", None) or getattr(interaction, "channel_id", None)
+            logger.info(
+                "[Discord] slash '%s' invoked by user=%s id=%s channel=%s guild=%s",
+                command_text,
+                getattr(_user, "name", "?"),
+                getattr(_user, "id", "?"),
+                _chan_id,
+                getattr(interaction, "guild_id", None),
+            )
+        except Exception:
+            pass  # logging must never block command dispatch
+
        await interaction.response.defer(ephemeral=True)
        event = self._build_slash_event(interaction, command_text)
        await self.handle_message(event)
@@ -3247,7 +3279,20 @@ class DiscordAdapter(BasePlatformAdapter):
                "[Discord] Flushing text batch %s (%d chars)",
                key, len(event.text or ""),
            )
-            await self.handle_message(event)
+            # Shield the downstream dispatch so that a subsequent chunk
+            # arriving while handle_message is mid-flight cannot cancel
+            # the running agent turn.  _enqueue_text_event always cancels
+            # the prior flush task when a new chunk lands; without this
+            # shield, CancelledError would propagate from our task down
+            # into handle_message → the agent's streaming request,
+            # aborting the response the user was waiting on.  The new
+            # chunk is handled by the fresh flush task regardless.
+            await asyncio.shield(self.handle_message(event))
+        except asyncio.CancelledError:
+            # Only reached if cancel landed before the pop — the shielded
+            # handle_message is unaffected either way.  Let the task exit
+            # cleanly so the finally block cleans up.
+            pass
        finally:
            if self._pending_text_batch_tasks.get(key) is current_task:
                self._pending_text_batch_tasks.pop(key, None)
@@ -8,7 +8,8 @@ Supports:
 - Gateway allowlist integration via FEISHU_ALLOWED_USERS
 - Persistent dedup state across restarts
 - Per-chat serial message processing (matches openclaw createChatQueue)
- Persistent ACK emoji reaction on inbound messages
+- Processing status reactions: Typing while working, removed on success,
+  swapped for CrossMark on failure
 - Reaction events routed as synthetic text events (matches openclaw)
 - Interactive card button-click events routed as synthetic COMMAND events
 - Webhook anomaly tracking (matches openclaw createWebhookAnomalyTracker)
@@ -29,6 +30,7 @@ import re
 import threading
 import time
 import uuid
+from collections import OrderedDict
 from dataclasses import dataclass, field
 from datetime import datetime
 from pathlib import Path
@@ -98,6 +100,7 @@ from gateway.platforms.base import (
    BasePlatformAdapter,
    MessageEvent,
    MessageType,
+    ProcessingOutcome,
    SendResult,
    SUPPORTED_DOCUMENT_TYPES,
    cache_document_from_bytes,
@@ -119,6 +122,8 @@ _MARKDOWN_HINT_RE = re.compile(
    re.MULTILINE,
 )
 _MARKDOWN_LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")
+_MARKDOWN_FENCE_OPEN_RE = re.compile(r"^```([^\n`]*)\s*$")
+_MARKDOWN_FENCE_CLOSE_RE = re.compile(r"^```\s*$")
 _MENTION_RE = re.compile(r"@_user_\d+")
 _MULTISPACE_RE = re.compile(r"[ \t]{2,}")
 _POST_CONTENT_INVALID_RE = re.compile(r"content format of the post type is incorrect", re.IGNORECASE)
@@ -188,7 +193,17 @@ _APPROVAL_LABEL_MAP: Dict[str, str] = {
 }
 _FEISHU_BOT_MSG_TRACK_SIZE = 512                   # LRU size for tracking sent message IDs
 _FEISHU_REPLY_FALLBACK_CODES = frozenset({230011, 231003})  # reply target withdrawn/missing → create fallback
-_FEISHU_ACK_EMOJI = "OK"
+
+# Feishu reactions render as prominent badges, unlike Discord/Telegram's
+# small footer emoji — a success badge on every message would add noise, so
+# we only mark start (Typing) and failure (CrossMark); the reply itself is
+# the success signal.
+_FEISHU_REACTION_IN_PROGRESS = "Typing"
+_FEISHU_REACTION_FAILURE = "CrossMark"
+# Bound on the (message_id → reaction_id) handle cache. Happy-path entries
+# drain on completion; the cap is a safeguard against unbounded growth from
+# delete-failures, not a capacity plan.
+_FEISHU_PROCESSING_REACTION_CACHE_SIZE = 1024

 # QR onboarding constants
 _ONBOARD_ACCOUNTS_URLS = {
@@ -430,23 +445,66 @@ def _coerce_required_int(value: Any, default: int, min_value: int = 0) -> int:


 def _build_markdown_post_payload(content: str) -> str:
+    rows = _build_markdown_post_rows(content)
    return json.dumps(
        {
            "zh_cn": {
-                "content": [
-                    [
-                        {
-                            "tag": "md",
-                            "text": content,
-                        }
-                    ]
-                ],
+                "content": rows,
            }
        },
        ensure_ascii=False,
    )


+def _build_markdown_post_rows(content: str) -> List[List[Dict[str, str]]]:
+    """Build Feishu post rows while isolating fenced code blocks.
+
+    Feishu's `md` renderer can swallow trailing content when a fenced code block
+    appears inside one large markdown element. Split the reply at real fence
+    lines so prose before/after the code block remains visible while code stays
+    in a dedicated row.
+    """
+    if not content:
+        return [[{"tag": "md", "text": ""}]]
+    if "```" not in content:
+        return [[{"tag": "md", "text": content}]]
+
+    rows: List[List[Dict[str, str]]] = []
+    current: List[str] = []
+    in_code_block = False
+
+    def _flush_current() -> None:
+        nonlocal current
+        if not current:
+            return
+        segment = "\n".join(current)
+        if segment.strip():
+            rows.append([{"tag": "md", "text": segment}])
+        current = []
+
+    for raw_line in content.splitlines():
+        stripped_line = raw_line.strip()
+        is_fence = bool(
+            _MARKDOWN_FENCE_CLOSE_RE.match(stripped_line)
+            if in_code_block
+            else _MARKDOWN_FENCE_OPEN_RE.match(stripped_line)
+        )
+
+        if is_fence:
+            if not in_code_block:
+                _flush_current()
+            current.append(raw_line)
+            in_code_block = not in_code_block
+            if not in_code_block:
+                _flush_current()
+            continue
+
+        current.append(raw_line)
+
+    _flush_current()
+    return rows or [[{"tag": "md", "text": content}]]
+
+
 def parse_feishu_post_payload(payload: Any) -> FeishuPostParseResult:
    resolved = _resolve_post_payload(payload)
    if not resolved:
@@ -1096,6 +1154,9 @@ class FeishuAdapter(BasePlatformAdapter):
        # Exec approval button state (approval_id → {session_key, message_id, chat_id})
        self._approval_state: Dict[int, Dict[str, str]] = {}
        self._approval_counter = itertools.count(1)
+        # Feishu reaction deletion requires the opaque reaction_id returned
+        # by create, so we cache it per message_id.
+        self._pending_processing_reactions: "OrderedDict[str, str]" = OrderedDict()
        self._load_seen_message_ids()

    @staticmethod
@@ -1423,6 +1484,8 @@ class FeishuAdapter(BasePlatformAdapter):
        chat_id: str,
        message_id: str,
        content: str,
+        *,
+        finalize: bool = False,
    ) -> SendResult:
        """Edit a previously sent Feishu text/post message."""
        if not self._client:
@@ -1925,8 +1988,8 @@ class FeishuAdapter(BasePlatformAdapter):
        if not message_id or self._is_duplicate(message_id):
            logger.debug("[Feishu] Dropping duplicate/missing message_id: %s", message_id)
            return
-        if getattr(sender, "sender_type", "") == "bot":
-            logger.debug("[Feishu] Dropping bot-originated event: %s", message_id)
+        if self._is_self_sent_bot_message(event):
+            logger.debug("[Feishu] Dropping self-sent bot event: %s", message_id)
            return

        chat_type = getattr(message, "chat_type", "p2p")
@@ -2003,12 +2066,12 @@ class FeishuAdapter(BasePlatformAdapter):
            operator_type,
            emoji_type,
        )
-        # Only process reactions from real users. Ignore app/bot-generated reactions
-        # and Hermes' own ACK emoji to avoid feedback loops.
+        # Drop bot/app-origin reactions to break the feedback loop from our
+        # own lifecycle reactions. A human reacting with the same emoji (e.g.
+        # clicking Typing on a bot message) is still routed through.
        loop = self._loop
        if (
            operator_type in {"bot", "app"}
-            or emoji_type == _FEISHU_ACK_EMOJI
            or not message_id
            or loop is None
            or bool(getattr(loop, "is_closed", lambda: False)())
@@ -2232,33 +2295,35 @@ class FeishuAdapter(BasePlatformAdapter):

    async def _handle_message_with_guards(self, event: MessageEvent) -> None:
        """Dispatch a single event through the agent pipeline with per-chat serialization
-        and a persistent ACK emoji reaction before processing starts.
+        before handing the event off to the agent.

-        - Per-chat lock: ensures messages in the same chat are processed one at a time
-          (matches openclaw's createChatQueue serial queue behaviour).
-        - ACK indicator: adds a CHECK reaction to the triggering message before handing
-          off to the agent and leaves it in place as a receipt marker.
+        Per-chat lock ensures messages in the same chat are processed one at a
+        time (matches openclaw's createChatQueue serial queue behaviour).
        """
        chat_id = getattr(event.source, "chat_id", "") or "" if event.source else ""
        chat_lock = self._get_chat_lock(chat_id)
        async with chat_lock:
-            message_id = event.message_id
-            if message_id:
-                await self._add_ack_reaction(message_id)
            await self.handle_message(event)

-    async def _add_ack_reaction(self, message_id: str) -> Optional[str]:
-        """Add a persistent ACK emoji reaction to signal the message was received."""
-        if not self._client or not message_id:
+    # =========================================================================
+    # Processing status reactions
+    # =========================================================================
+
+    def _reactions_enabled(self) -> bool:
+        return os.getenv("FEISHU_REACTIONS", "true").strip().lower() not in ("false", "0", "no")
+
+    async def _add_reaction(self, message_id: str, emoji_type: str) -> Optional[str]:
+        """Return the reaction_id on success, else None. The id is needed later for deletion."""
+        if not self._client or not message_id or not emoji_type:
            return None
        try:
-            from lark_oapi.api.im.v1 import (  # lazy import — keeps optional dep optional
+            from lark_oapi.api.im.v1 import (
                CreateMessageReactionRequest,
                CreateMessageReactionRequestBody,
            )
            body = (
                CreateMessageReactionRequestBody.builder()
-                .reaction_type({"emoji_type": _FEISHU_ACK_EMOJI})
+                .reaction_type({"emoji_type": emoji_type})
                .build()
            )
            request = (
@@ -2271,16 +2336,93 @@ class FeishuAdapter(BasePlatformAdapter):
            if response and getattr(response, "success", lambda: False)():
                data = getattr(response, "data", None)
                return getattr(data, "reaction_id", None)
-            logger.warning(
-                "[Feishu] Failed to add ack reaction to %s: code=%s msg=%s",
+            logger.debug(
+                "[Feishu] Add reaction %s on %s rejected: code=%s msg=%s",
+                emoji_type,
                message_id,
                getattr(response, "code", None),
                getattr(response, "msg", None),
            )
        except Exception:
-            logger.warning("[Feishu] Failed to add ack reaction to %s", message_id, exc_info=True)
+            logger.warning(
+                "[Feishu] Add reaction %s on %s raised",
+                emoji_type,
+                message_id,
+                exc_info=True,
+            )
        return None

+    async def _remove_reaction(self, message_id: str, reaction_id: str) -> bool:
+        if not self._client or not message_id or not reaction_id:
+            return False
+        try:
+            from lark_oapi.api.im.v1 import DeleteMessageReactionRequest
+            request = (
+                DeleteMessageReactionRequest.builder()
+                .message_id(message_id)
+                .reaction_id(reaction_id)
+                .build()
+            )
+            response = await asyncio.to_thread(self._client.im.v1.message_reaction.delete, request)
+            if response and getattr(response, "success", lambda: False)():
+                return True
+            logger.debug(
+                "[Feishu] Remove reaction %s on %s rejected: code=%s msg=%s",
+                reaction_id,
+                message_id,
+                getattr(response, "code", None),
+                getattr(response, "msg", None),
+            )
+        except Exception:
+            logger.warning(
+                "[Feishu] Remove reaction %s on %s raised",
+                reaction_id,
+                message_id,
+                exc_info=True,
+            )
+        return False
+
+    def _remember_processing_reaction(self, message_id: str, reaction_id: str) -> None:
+        cache = self._pending_processing_reactions
+        cache[message_id] = reaction_id
+        cache.move_to_end(message_id)
+        while len(cache) > _FEISHU_PROCESSING_REACTION_CACHE_SIZE:
+            cache.popitem(last=False)
+
+    def _pop_processing_reaction(self, message_id: str) -> Optional[str]:
+        return self._pending_processing_reactions.pop(message_id, None)
+
+    async def on_processing_start(self, event: MessageEvent) -> None:
+        if not self._reactions_enabled():
+            return
+        message_id = event.message_id
+        if not message_id or message_id in self._pending_processing_reactions:
+            return
+        reaction_id = await self._add_reaction(message_id, _FEISHU_REACTION_IN_PROGRESS)
+        if reaction_id:
+            self._remember_processing_reaction(message_id, reaction_id)
+
+    async def on_processing_complete(
+        self, event: MessageEvent, outcome: ProcessingOutcome
+    ) -> None:
+        if not self._reactions_enabled():
+            return
+        message_id = event.message_id
+        if not message_id:
+            return
+
+        start_reaction_id = self._pending_processing_reactions.get(message_id)
+        if start_reaction_id:
+            if not await self._remove_reaction(message_id, start_reaction_id):
+                # Don't stack a second badge on top of a Typing we couldn't
+                # remove — UI would read as both "working" and "done/failed"
+                # simultaneously. Keep the handle so LRU eventually evicts it.
+                return
+            self._pop_processing_reaction(message_id)
+
+        if outcome is ProcessingOutcome.FAILURE:
+            await self._add_reaction(message_id, _FEISHU_REACTION_FAILURE)
+
    # =========================================================================
    # Webhook server and security
    # =========================================================================
@@ -3249,6 +3391,23 @@ class FeishuAdapter(BasePlatformAdapter):
            return self._post_mentions_bot(normalized.mentioned_ids)
        return False

+    def _is_self_sent_bot_message(self, event: Any) -> bool:
+        """Return True only for Feishu events emitted by this Hermes bot."""
+        sender = getattr(event, "sender", None)
+        sender_type = str(getattr(sender, "sender_type", "") or "").strip().lower()
+        if sender_type not in {"bot", "app"}:
+            return False
+
+        sender_id = getattr(sender, "sender_id", None)
+        sender_open_id = str(getattr(sender_id, "open_id", "") or "").strip()
+        sender_user_id = str(getattr(sender_id, "user_id", "") or "").strip()
+
+        if self._bot_open_id and sender_open_id == self._bot_open_id:
+            return True
+        if self._bot_user_id and sender_user_id == self._bot_user_id:
+            return True
+        return False
+
    def _message_mentions_bot(self, mentions: List[Any]) -> bool:
        """Check whether any mention targets the configured or inferred bot identity."""
        for mention in mentions:
@@ -3276,10 +3435,55 @@ class FeishuAdapter(BasePlatformAdapter):
        return False

    async def _hydrate_bot_identity(self) -> None:
-        """Best-effort discovery of bot identity for precise group mention gating."""
+        """Best-effort discovery of bot identity for precise group mention gating
+        and self-sent bot event filtering.
+
+        Populates ``_bot_open_id`` and ``_bot_name`` from /open-apis/bot/v3/info
+        (no extra scopes required beyond the tenant access token). Falls back to
+        the application info endpoint for ``_bot_name`` only when the first probe
+        doesn't return it. Each field is hydrated independently — a value already
+        supplied via env vars (FEISHU_BOT_OPEN_ID / FEISHU_BOT_USER_ID /
+        FEISHU_BOT_NAME) is preserved and skips its probe.
+        """
        if not self._client:
            return
-        if any((self._bot_open_id, self._bot_user_id, self._bot_name)):
+        if self._bot_open_id and self._bot_name:
+            # Everything the self-send filter and precise mention gate need is
+            # already in place; nothing to probe.
+            return
+
+        # Primary probe: /open-apis/bot/v3/info — returns bot_name + open_id, no
+        # extra scopes required. This is the same endpoint the onboarding wizard
+        # uses via probe_bot().
+        if not self._bot_open_id or not self._bot_name:
+            try:
+                resp = await asyncio.to_thread(
+                    self._client.request,
+                    method="GET",
+                    url="/open-apis/bot/v3/info",
+                    body=None,
+                    raw_response=True,
+                )
+                content = getattr(resp, "content", None)
+                if content:
+                    payload = json.loads(content)
+                    parsed = _parse_bot_response(payload) or {}
+                    open_id = (parsed.get("bot_open_id") or "").strip()
+                    bot_name = (parsed.get("bot_name") or "").strip()
+                    if open_id and not self._bot_open_id:
+                        self._bot_open_id = open_id
+                    if bot_name and not self._bot_name:
+                        self._bot_name = bot_name
+            except Exception:
+                logger.debug(
+                    "[Feishu] /bot/v3/info probe failed during hydration",
+                    exc_info=True,
+                )
+
+        # Fallback probe for _bot_name only: application info endpoint. Needs
+        # admin:app.info:readonly or application:application:self_manage scope,
+        # so it's best-effort.
+        if self._bot_name:
            return
        try:
            request = self._build_get_application_request(app_id=self._app_id, lang="en_us")
@@ -3288,17 +3492,17 @@ class FeishuAdapter(BasePlatformAdapter):
                code = getattr(response, "code", None)
                if code == 99991672:
                    logger.warning(
-                        "[Feishu] Unable to hydrate bot identity from application info. "
+                        "[Feishu] Unable to hydrate bot name from application info. "
                        "Grant admin:app.info:readonly or application:application:self_manage "
                        "so group @mention gating can resolve the bot name precisely."
                    )
                return
            app = getattr(getattr(response, "data", None), "app", None)
            app_name = (getattr(app, "app_name", None) or "").strip()
-            if app_name:
+            if app_name and not self._bot_name:
                self._bot_name = app_name
        except Exception:
-            logger.debug("[Feishu] Failed to hydrate bot identity", exc_info=True)
+            logger.debug("[Feishu] Failed to hydrate bot name from application info", exc_info=True)

    # =========================================================================
    # Deduplication — seen message ID cache (persistent)
@@ -825,7 +825,7 @@ class MatrixAdapter(BasePlatformAdapter):


    async def edit_message(
-        self, chat_id: str, message_id: str, content: str
+        self, chat_id: str, message_id: str, content: str, *, finalize: bool = False
    ) -> SendResult:
        """Edit an existing message (via m.replace)."""

@@ -304,7 +304,7 @@ class MattermostAdapter(BasePlatformAdapter):
        )

    async def edit_message(
-        self, chat_id: str, message_id: str, content: str
+        self, chat_id: str, message_id: str, content: str, *, finalize: bool = False
    ) -> SendResult:
        """Edit an existing post."""
        formatted = self.format_message(content)
@@ -18,6 +18,7 @@ import logging
 import os
 import random
 import time
+import uuid
 from datetime import datetime, timezone
 from pathlib import Path
 from typing import Dict, List, Optional, Any
@@ -127,6 +128,27 @@ def _render_mentions(text: str, mentions: list) -> str:
    return text


+def _is_signal_service_id(value: str) -> bool:
+    """Return True if *value* already looks like a Signal service identifier."""
+    if not value:
+        return False
+    if value.startswith("PNI:") or value.startswith("u:"):
+        return True
+    try:
+        uuid.UUID(value)
+        return True
+    except (ValueError, AttributeError, TypeError):
+        return False
+
+
+def _looks_like_e164_number(value: str) -> bool:
+    """Return True for a plausible E.164 phone number."""
+    if not value or not value.startswith("+"):
+        return False
+    digits = value[1:]
+    return digits.isdigit() and 7 <= len(digits) <= 15
+
+
 def check_signal_requirements() -> bool:
    """Check if Signal is configured (has URL and account)."""
    return bool(os.getenv("SIGNAL_HTTP_URL") and os.getenv("SIGNAL_ACCOUNT"))
@@ -179,6 +201,12 @@ class SignalAdapter(BasePlatformAdapter):
        # in Note to Self / self-chat mode (mirrors WhatsApp recentlySentIds)
        self._recent_sent_timestamps: set = set()
        self._max_recent_timestamps = 50
+        # Signal increasingly exposes ACI/PNI UUIDs as stable recipient IDs.
+        # Keep a best-effort mapping so outbound sends can upgrade from a
+        # phone number to the corresponding UUID when signal-cli prefers it.
+        self._recipient_uuid_by_number: Dict[str, str] = {}
+        self._recipient_number_by_uuid: Dict[str, str] = {}
+        self._recipient_cache_lock = asyncio.Lock()

        logger.info("Signal adapter initialized: url=%s account=%s groups=%s",
                     self.http_url, redact_phone(self.account),
@@ -195,31 +223,40 @@ class SignalAdapter(BasePlatformAdapter):
            return False

        # Acquire scoped lock to prevent duplicate Signal listeners for the same phone
+        lock_acquired = False
        try:
            if not self._acquire_platform_lock('signal-phone', self.account, 'Signal account'):
                return False
+            lock_acquired = True
        except Exception as e:
            logger.warning("Signal: Could not acquire phone lock (non-fatal): %s", e)

        self.client = httpx.AsyncClient(timeout=30.0)
-
-        # Health check — verify signal-cli daemon is reachable
        try:
-            resp = await self.client.get(f"{self.http_url}/api/v1/check", timeout=10.0)
-            if resp.status_code != 200:
-                logger.error("Signal: health check failed (status %d)", resp.status_code)
+            # Health check — verify signal-cli daemon is reachable
+            try:
+                resp = await self.client.get(f"{self.http_url}/api/v1/check", timeout=10.0)
+                if resp.status_code != 200:
+                    logger.error("Signal: health check failed (status %d)", resp.status_code)
+                    return False
+            except Exception as e:
+                logger.error("Signal: cannot reach signal-cli at %s: %s", self.http_url, e)
                return False
-        except Exception as e:
-            logger.error("Signal: cannot reach signal-cli at %s: %s", self.http_url, e)
-            return False

-        self._running = True
-        self._last_sse_activity = time.time()
-        self._sse_task = asyncio.create_task(self._sse_listener())
-        self._health_monitor_task = asyncio.create_task(self._health_monitor())
+            self._running = True
+            self._last_sse_activity = time.time()
+            self._sse_task = asyncio.create_task(self._sse_listener())
+            self._health_monitor_task = asyncio.create_task(self._health_monitor())

-        logger.info("Signal: connected to %s", self.http_url)
-        return True
+            logger.info("Signal: connected to %s", self.http_url)
+            return True
+        finally:
+            if not self._running:
+                if self.client:
+                    await self.client.aclose()
+                    self.client = None
+                if lock_acquired:
+                    self._release_platform_lock()

    async def disconnect(self) -> None:
        """Stop SSE listener and clean up."""
@@ -400,6 +437,7 @@ class SignalAdapter(BasePlatformAdapter):
        )
        sender_name = envelope_data.get("sourceName", "")
        sender_uuid = envelope_data.get("sourceUuid", "")
+        self._remember_recipient_identifiers(sender, sender_uuid)

        if not sender:
            logger.debug("Signal: ignoring envelope with no sender")
@@ -518,6 +556,64 @@ class SignalAdapter(BasePlatformAdapter):

        await self.handle_message(event)

+    def _remember_recipient_identifiers(self, number: Optional[str], service_id: Optional[str]) -> None:
+        """Cache any number↔UUID mapping observed from Signal envelopes."""
+        if not number or not service_id or not _is_signal_service_id(service_id):
+            return
+        self._recipient_uuid_by_number[number] = service_id
+        self._recipient_number_by_uuid[service_id] = number
+
+    def _extract_contact_uuid(self, contact: Any, phone_number: str) -> Optional[str]:
+        """Best-effort extraction of a Signal service ID from listContacts output."""
+        if not isinstance(contact, dict):
+            return None
+
+        number = contact.get("number")
+        recipient = contact.get("recipient")
+        service_id = contact.get("uuid") or contact.get("serviceId")
+        if not service_id:
+            profile = contact.get("profile")
+            if isinstance(profile, dict):
+                service_id = profile.get("serviceId") or profile.get("uuid")
+
+        if service_id and _is_signal_service_id(service_id):
+            matches_number = number == phone_number or recipient == phone_number
+            if matches_number:
+                return service_id
+        return None
+
+    async def _resolve_recipient(self, chat_id: str) -> str:
+        """Return the preferred Signal recipient identifier for a direct chat."""
+        if (
+            not chat_id
+            or chat_id.startswith("group:")
+            or _is_signal_service_id(chat_id)
+            or not _looks_like_e164_number(chat_id)
+        ):
+            return chat_id
+
+        cached = self._recipient_uuid_by_number.get(chat_id)
+        if cached:
+            return cached
+
+        async with self._recipient_cache_lock:
+            cached = self._recipient_uuid_by_number.get(chat_id)
+            if cached:
+                return cached
+
+            contacts = await self._rpc("listContacts", {
+                "account": self.account,
+                "allRecipients": True,
+            })
+            if isinstance(contacts, list):
+                for contact in contacts:
+                    number = contact.get("number") if isinstance(contact, dict) else None
+                    service_id = self._extract_contact_uuid(contact, chat_id)
+                    if number and service_id:
+                        self._remember_recipient_identifiers(number, service_id)
+
+            return self._recipient_uuid_by_number.get(chat_id, chat_id)
+
    # ------------------------------------------------------------------
    # Attachment Handling
    # ------------------------------------------------------------------
@@ -633,7 +729,7 @@ class SignalAdapter(BasePlatformAdapter):
        if chat_id.startswith("group:"):
            params["groupId"] = chat_id[6:]
        else:
-            params["recipient"] = [chat_id]
+            params["recipient"] = [await self._resolve_recipient(chat_id)]

        result = await self._rpc("send", params)

@@ -684,7 +780,7 @@ class SignalAdapter(BasePlatformAdapter):
        if chat_id.startswith("group:"):
            params["groupId"] = chat_id[6:]
        else:
-            params["recipient"] = [chat_id]
+            params["recipient"] = [await self._resolve_recipient(chat_id)]

        fails = self._typing_failures.get(chat_id, 0)
        result = await self._rpc(
@@ -745,7 +841,7 @@ class SignalAdapter(BasePlatformAdapter):
        if chat_id.startswith("group:"):
            params["groupId"] = chat_id[6:]
        else:
-            params["recipient"] = [chat_id]
+            params["recipient"] = [await self._resolve_recipient(chat_id)]

        result = await self._rpc("send", params)
        if result is not None:
@@ -784,7 +880,7 @@ class SignalAdapter(BasePlatformAdapter):
        if chat_id.startswith("group:"):
            params["groupId"] = chat_id[6:]
        else:
-            params["recipient"] = [chat_id]
+            params["recipient"] = [await self._resolve_recipient(chat_id)]

        result = await self._rpc("send", params)
        if result is not None:
@@ -150,9 +150,11 @@ class SlackAdapter(BasePlatformAdapter):
            except Exception as e:
                logger.warning("[Slack] Failed to read %s: %s", tokens_file, e)

+        lock_acquired = False
        try:
            if not self._acquire_platform_lock('slack-app-token', app_token, 'Slack app token'):
                return False
+            lock_acquired = True

            # First token is the primary — used for AsyncApp / Socket Mode
            primary_token = bot_tokens[0]
@@ -228,6 +230,9 @@ class SlackAdapter(BasePlatformAdapter):
        except Exception as e:  # pragma: no cover - defensive logging
            logger.error("[Slack] Connection failed: %s", e, exc_info=True)
            return False
+        finally:
+            if lock_acquired and not self._running:
+                self._release_platform_lock()

    async def disconnect(self) -> None:
        """Disconnect from Slack."""
@@ -316,6 +321,8 @@ class SlackAdapter(BasePlatformAdapter):
        chat_id: str,
        message_id: str,
        content: str,
+        *,
+        finalize: bool = False,
    ) -> SendResult:
        """Edit a previously sent Slack message."""
        if not self._app:
@@ -11,6 +11,7 @@ import asyncio
 import json
 import logging
 import os
+import tempfile
 import html as _html
 import re
 from typing import Dict, List, Optional, Any
@@ -534,8 +535,23 @@ class TelegramAdapter(BasePlatformAdapter):
                        break

            if changed:
-                with open(config_path, "w") as f:
-                    _yaml.dump(config, f, default_flow_style=False, sort_keys=False)
+                fd, tmp_path = tempfile.mkstemp(
+                    dir=str(config_path.parent),
+                    suffix=".tmp",
+                    prefix=".config_",
+                )
+                try:
+                    with os.fdopen(fd, "w", encoding="utf-8") as f:
+                        _yaml.dump(config, f, default_flow_style=False, sort_keys=False)
+                        f.flush()
+                        os.fsync(f.fileno())
+                    os.replace(tmp_path, config_path)
+                except BaseException:
+                    try:
+                        os.unlink(tmp_path)
+                    except OSError:
+                        pass
+                    raise
                logger.info(
                    "[%s] Persisted thread_id=%s for topic '%s' in config.yaml",
                    self.name, thread_id, topic_name,
@@ -1081,6 +1097,8 @@ class TelegramAdapter(BasePlatformAdapter):
        chat_id: str,
        message_id: str,
        content: str,
+        *,
+        finalize: bool = False,
    ) -> SendResult:
        """Edit a previously sent Telegram message."""
        if not self._bot:
@@ -1657,6 +1675,21 @@ class TelegramAdapter(BasePlatformAdapter):
        except Exception as exc:
            logger.error("Failed to write update response from callback: %s", exc)

+    def _missing_media_path_error(self, label: str, path: str) -> str:
+        """Build an actionable file-not-found error for gateway MEDIA delivery.
+
+        Paths like /workspace/... or /output/... often only exist inside the
+        Docker sandbox, while the gateway process runs on the host.
+        """
+        error = f"{label} file not found: {path}"
+        if path.startswith(("/workspace/", "/output/", "/outputs/")):
+            error += (
+                " (path may only exist inside the Docker sandbox. "
+                "Bind-mount a host directory and emit the host-visible "
+                "path in MEDIA: for gateway file delivery.)"
+            )
+        return error
+
    async def send_voice(
        self,
        chat_id: str,
@@ -1673,7 +1706,7 @@ class TelegramAdapter(BasePlatformAdapter):
        try:
            import os
            if not os.path.exists(audio_path):
-                return SendResult(success=False, error=f"Audio file not found: {audio_path}")
+                return SendResult(success=False, error=self._missing_media_path_error("Audio", audio_path))
            
            with open(audio_path, "rb") as audio_file:
                # .ogg files -> send as voice (round playable bubble)
@@ -1722,7 +1755,7 @@ class TelegramAdapter(BasePlatformAdapter):
        try:
            import os
            if not os.path.exists(image_path):
-                return SendResult(success=False, error=f"Image file not found: {image_path}")
+                return SendResult(success=False, error=self._missing_media_path_error("Image", image_path))

            _thread = self._metadata_thread_id(metadata)
            with open(image_path, "rb") as image_file:
@@ -1759,7 +1792,7 @@ class TelegramAdapter(BasePlatformAdapter):

        try:
            if not os.path.exists(file_path):
-                return SendResult(success=False, error=f"File not found: {file_path}")
+                return SendResult(success=False, error=self._missing_media_path_error("File", file_path))

            display_name = file_name or os.path.basename(file_path)
            _thread = self._metadata_thread_id(metadata)
@@ -1793,7 +1826,7 @@ class TelegramAdapter(BasePlatformAdapter):

        try:
            if not os.path.exists(video_path):
-                return SendResult(success=False, error=f"Video file not found: {video_path}")
+                return SendResult(success=False, error=self._missing_media_path_error("Video", video_path))

            _thread = self._metadata_thread_id(metadata)
            with open(video_path, "rb") as f:
@@ -2241,22 +2274,27 @@ class TelegramAdapter(BasePlatformAdapter):

        bot_username = (getattr(self._bot, "username", None) or "").lstrip("@").lower()
        bot_id = getattr(self._bot, "id", None)
+        expected = f"@{bot_username}" if bot_username else None

        def _iter_sources():
            yield getattr(message, "text", None) or "", getattr(message, "entities", None) or []
            yield getattr(message, "caption", None) or "", getattr(message, "caption_entities", None) or []

+        # Telegram parses mentions server-side and emits MessageEntity objects
+        # (type=mention for @username, type=text_mention for @FirstName targeting
+        # a user without a public username). Only those entities are authoritative —
+        # raw substring matches like "foo@hermes_bot.example" are not mentions
+        # (bug #12545). Entities also correctly handle @handles inside URLs, code
+        # blocks, and quoted text, where a regex scan would over-match.
        for source_text, entities in _iter_sources():
-            if bot_username and f"@{bot_username}" in source_text.lower():
-                return True
            for entity in entities:
                entity_type = str(getattr(entity, "type", "")).split(".")[-1].lower()
-                if entity_type == "mention" and bot_username:
+                if entity_type == "mention" and expected:
                    offset = int(getattr(entity, "offset", -1))
                    length = int(getattr(entity, "length", 0))
                    if offset < 0 or length <= 0:
                        continue
-                    if source_text[offset:offset + length].strip().lower() == f"@{bot_username}":
+                    if source_text[offset:offset + length].strip().lower() == expected:
                        return True
                elif entity_type == "text_mention":
                    user = getattr(entity, "user", None)
@@ -2926,8 +2964,8 @@ class TelegramAdapter(BasePlatformAdapter):
            chat_id=str(chat.id),
            chat_name=chat.title or (chat.full_name if hasattr(chat, "full_name") else None),
            chat_type=chat_type,
-            user_id=str(user.id) if user else None,
-            user_name=user.full_name if user else None,
+            user_id=str(user.id) if user else (str(chat.id) if chat_type == "dm" else None),
+            user_name=user.full_name if user else (chat.full_name if hasattr(chat, "full_name") and chat_type == "dm" else None),
            thread_id=thread_id_str,
            chat_topic=chat_topic,
        )
@@ -13,6 +13,10 @@ Each route defines:
  - skills: optional list of skills to load for the agent
  - deliver: where to send the response (github_comment, telegram, etc.)
  - deliver_extra: additional delivery config (repo, pr_number, chat_id)
+  - deliver_only: if true, skip the agent — the rendered prompt IS the
+    message that gets delivered.  Use for external push notifications
+    (Supabase, monitoring alerts, inter-agent pings) where zero LLM cost
+    and sub-second delivery matter more than agent reasoning.

 Security:
  - HMAC secret is required per route (validated at startup)
@@ -122,6 +126,19 @@ class WebhookAdapter(BasePlatformAdapter):
                    f"For testing without auth, set secret to '{_INSECURE_NO_AUTH}'."
                )

+            # deliver_only routes bypass the agent — the POST body becomes a
+            # direct push notification via the configured delivery target.
+            # Validate up-front so misconfiguration surfaces at startup rather
+            # than on the first webhook POST.
+            if route.get("deliver_only"):
+                deliver = route.get("deliver", "log")
+                if not deliver or deliver == "log":
+                    raise ValueError(
+                        f"[webhook] Route '{name}' has deliver_only=true but "
+                        f"deliver is '{deliver}'. Direct delivery requires a "
+                        f"real target (telegram, discord, slack, github_comment, etc.)."
+                    )
+
        app = web.Application()
        app.router.add_get("/health", self._handle_health)
        app.router.add_post("/webhooks/{route_name}", self._handle_webhook)
@@ -296,24 +313,14 @@ class WebhookAdapter(BasePlatformAdapter):
                {"error": "Payload too large"}, status=413
            )

-        # ── Rate limiting ────────────────────────────────────────
-        now = time.time()
-        window = self._rate_counts.setdefault(route_name, [])
-        window[:] = [t for t in window if now - t < 60]
-        if len(window) >= self._rate_limit:
-            return web.json_response(
-                {"error": "Rate limit exceeded"}, status=429
-            )
-        window.append(now)
-
-        # Read body
+        # Read body (must be done before any validation)
        try:
            raw_body = await request.read()
        except Exception as e:
            logger.error("[webhook] Failed to read body: %s", e)
            return web.json_response({"error": "Bad request"}, status=400)

-        # Validate HMAC signature (skip for INSECURE_NO_AUTH testing mode)
+        # Validate HMAC signature FIRST (skip for INSECURE_NO_AUTH testing mode)
        secret = route_config.get("secret", self._global_secret)
        if secret and secret != _INSECURE_NO_AUTH:
            if not self._validate_signature(request, raw_body, secret):
@@ -324,6 +331,16 @@ class WebhookAdapter(BasePlatformAdapter):
                    {"error": "Invalid signature"}, status=401
                )

+        # ── Rate limiting (after auth) ───────────────────────────
+        now = time.time()
+        window = self._rate_counts.setdefault(route_name, [])
+        window[:] = [t for t in window if now - t < 60]
+        if len(window) >= self._rate_limit:
+            return web.json_response(
+                {"error": "Rate limit exceeded"}, status=429
+            )
+        window.append(now)
+
        # Parse payload
        try:
            payload = json.loads(raw_body)
@@ -419,6 +436,64 @@ class WebhookAdapter(BasePlatformAdapter):
            )
        self._seen_deliveries[delivery_id] = now

+        # ── Direct delivery mode (deliver_only) ─────────────────
+        # Skip the agent entirely — the rendered prompt IS the message we
+        # deliver.  Use case: external services (Supabase, monitoring,
+        # cron jobs, other agents) that need to push a plain notification
+        # to a user's chat with zero LLM cost.  Reuses the same HMAC auth,
+        # rate limiting, idempotency, and template rendering as agent mode.
+        if route_config.get("deliver_only"):
+            delivery = {
+                "deliver": route_config.get("deliver", "log"),
+                "deliver_extra": self._render_delivery_extra(
+                    route_config.get("deliver_extra", {}), payload
+                ),
+                "payload": payload,
+            }
+            logger.info(
+                "[webhook] direct-deliver event=%s route=%s target=%s msg_len=%d delivery=%s",
+                event_type,
+                route_name,
+                delivery["deliver"],
+                len(prompt),
+                delivery_id,
+            )
+            try:
+                result = await self._direct_deliver(prompt, delivery)
+            except Exception:
+                logger.exception(
+                    "[webhook] direct-deliver failed route=%s delivery=%s",
+                    route_name,
+                    delivery_id,
+                )
+                return web.json_response(
+                    {"status": "error", "error": "Delivery failed", "delivery_id": delivery_id},
+                    status=502,
+                )
+
+            if result.success:
+                return web.json_response(
+                    {
+                        "status": "delivered",
+                        "route": route_name,
+                        "target": delivery["deliver"],
+                        "delivery_id": delivery_id,
+                    },
+                    status=200,
+                )
+            # Delivery attempted but target rejected it — surface as 502
+            # with a generic error (don't leak adapter-level detail).
+            logger.warning(
+                "[webhook] direct-deliver target rejected route=%s target=%s error=%s",
+                route_name,
+                delivery["deliver"],
+                result.error,
+            )
+            return web.json_response(
+                {"status": "error", "error": "Delivery failed", "delivery_id": delivery_id},
+                status=502,
+            )
+
        # Use delivery_id in session key so concurrent webhooks on the
        # same route get independent agent runs (not queued/interrupted).
        session_chat_id = f"webhook:{route_name}:{delivery_id}"
@@ -572,6 +647,34 @@ class WebhookAdapter(BasePlatformAdapter):
    # Response delivery
    # ------------------------------------------------------------------

+    async def _direct_deliver(
+        self, content: str, delivery: dict
+    ) -> SendResult:
+        """Deliver *content* directly without invoking the agent.
+
+        Used by ``deliver_only`` routes: the rendered template becomes the
+        literal message body, and we dispatch to the same delivery helpers
+        that the agent-mode ``send()`` flow uses.  All target types that
+        work in agent mode work here — Telegram, Discord, Slack, GitHub
+        PR comments, etc.
+        """
+        deliver_type = delivery.get("deliver", "log")
+
+        if deliver_type == "log":
+            # Shouldn't reach here — startup validation rejects deliver_only
+            # with deliver=log — but guard defensively.
+            logger.info("[webhook] direct-deliver log-only: %s", content[:200])
+            return SendResult(success=True)
+
+        if deliver_type == "github_comment":
+            return await self._deliver_github_comment(content, delivery)
+
+        # Fall through to the cross-platform dispatcher, which validates the
+        # target name and routes via the gateway runner.
+        return await self._deliver_cross_platform(
+            deliver_type, content, delivery
+        )
+
    async def _deliver_github_comment(
        self, content: str, delivery: dict
    ) -> SendResult:
@@ -289,33 +289,35 @@ class WhatsAppAdapter(BasePlatformAdapter):
        logger.info("[%s] Bridge found at %s", self.name, bridge_path)
        
        # Acquire scoped lock to prevent duplicate sessions
+        lock_acquired = False
        try:
            if not self._acquire_platform_lock('whatsapp-session', str(self._session_path), 'WhatsApp session'):
                return False
+            lock_acquired = True
        except Exception as e:
            logger.warning("[%s] Could not acquire session lock (non-fatal): %s", self.name, e)

-        # Auto-install npm dependencies if node_modules doesn't exist
-        bridge_dir = bridge_path.parent
-        if not (bridge_dir / "node_modules").exists():
-            print(f"[{self.name}] Installing WhatsApp bridge dependencies...")
-            try:
-                install_result = subprocess.run(
-                    ["npm", "install", "--silent"],
-                    cwd=str(bridge_dir),
-                    capture_output=True,
-                    text=True,
-                    timeout=60,
-                )
-                if install_result.returncode != 0:
-                    print(f"[{self.name}] npm install failed: {install_result.stderr}")
-                    return False
-                print(f"[{self.name}] Dependencies installed")
-            except Exception as e:
-                print(f"[{self.name}] Failed to install dependencies: {e}")
-                return False
-        
        try:
+            # Auto-install npm dependencies if node_modules doesn't exist
+            bridge_dir = bridge_path.parent
+            if not (bridge_dir / "node_modules").exists():
+                print(f"[{self.name}] Installing WhatsApp bridge dependencies...")
+                try:
+                    install_result = subprocess.run(
+                        ["npm", "install", "--silent"],
+                        cwd=str(bridge_dir),
+                        capture_output=True,
+                        text=True,
+                        timeout=60,
+                    )
+                    if install_result.returncode != 0:
+                        print(f"[{self.name}] npm install failed: {install_result.stderr}")
+                        return False
+                    print(f"[{self.name}] Dependencies installed")
+                except Exception as e:
+                    print(f"[{self.name}] Failed to install dependencies: {e}")
+                    return False
+
            # Ensure session directory exists
            self._session_path.mkdir(parents=True, exist_ok=True)
            
@@ -452,10 +454,13 @@ class WhatsAppAdapter(BasePlatformAdapter):
            return True
            
        except Exception as e:
-            self._release_platform_lock()
            logger.error("[%s] Failed to start bridge: %s", self.name, e, exc_info=True)
-            self._close_bridge_log()
            return False
+        finally:
+            if not self._running:
+                if lock_acquired:
+                    self._release_platform_lock()
+                self._close_bridge_log()
    
    def _close_bridge_log(self) -> None:
        """Close the bridge log file handle if open."""
@@ -655,6 +660,8 @@ class WhatsAppAdapter(BasePlatformAdapter):
        chat_id: str,
        message_id: str,
        content: str,
+        *,
+        finalize: bool = False,
    ) -> SendResult:
        """Edit a previously sent message via the WhatsApp bridge."""
        if not self._running or not self._http_session:
@@ -377,7 +377,19 @@ class SessionEntry:
    # this session (create a new session_id) so the user starts fresh.
    # Set by /stop to break stuck-resume loops (#7536).
    suspended: bool = False
-    
+
+    # When True the session was interrupted by a gateway restart/shutdown
+    # drain timeout, but recovery is still expected.  Unlike ``suspended``,
+    # ``resume_pending`` preserves the existing session_id on next access —
+    # the user stays on the same transcript and the agent auto-continues
+    # from where it left off.  Cleared after the next successful turn.
+    # Escalation to ``suspended`` is handled by the existing
+    # ``.restart_failure_counts`` stuck-loop counter (#7536), not by a
+    # parallel counter on this entry.
+    resume_pending: bool = False
+    resume_reason: Optional[str] = None  # e.g. "restart_timeout"
+    last_resume_marked_at: Optional[datetime] = None
+
    def to_dict(self) -> Dict[str, Any]:
        result = {
            "session_key": self.session_key,
@@ -397,6 +409,13 @@ class SessionEntry:
            "cost_status": self.cost_status,
            "memory_flushed": self.memory_flushed,
            "suspended": self.suspended,
+            "resume_pending": self.resume_pending,
+            "resume_reason": self.resume_reason,
+            "last_resume_marked_at": (
+                self.last_resume_marked_at.isoformat()
+                if self.last_resume_marked_at
+                else None
+            ),
        }
        if self.origin:
            result["origin"] = self.origin.to_dict()
@@ -414,7 +433,15 @@ class SessionEntry:
                platform = Platform(data["platform"])
            except ValueError as e:
                logger.debug("Unknown platform value %r: %s", data["platform"], e)
-        
+
+        last_resume_marked_at = None
+        _lrma = data.get("last_resume_marked_at")
+        if _lrma:
+            try:
+                last_resume_marked_at = datetime.fromisoformat(_lrma)
+            except (TypeError, ValueError):
+                last_resume_marked_at = None
+
        return cls(
            session_key=data["session_key"],
            session_id=data["session_id"],
@@ -434,6 +461,9 @@ class SessionEntry:
            cost_status=data.get("cost_status", "unknown"),
            memory_flushed=data.get("memory_flushed", False),
            suspended=data.get("suspended", False),
+            resume_pending=data.get("resume_pending", False),
+            resume_reason=data.get("resume_reason"),
+            last_resume_marked_at=last_resume_marked_at,
        )


@@ -710,9 +740,23 @@ class SessionStore:
                entry = self._entries[session_key]

                # Auto-reset sessions marked as suspended (e.g. after /stop
-                # broke a stuck loop — #7536).
+                # broke a stuck loop — #7536).  ``suspended`` is the hard
+                # forced-wipe signal and always wins over ``resume_pending``,
+                # so repeated interrupted restarts that escalate via the
+                # existing ``.restart_failure_counts`` stuck-loop counter
+                # still converge to a clean slate.
                if entry.suspended:
                    reset_reason = "suspended"
+                elif entry.resume_pending:
+                    # Restart-interrupted session: preserve the session_id
+                    # and return the existing entry so the transcript
+                    # reloads intact.  ``resume_pending`` is cleared after
+                    # the NEXT successful turn completes (not here), which
+                    # means a re-interrupted retry keeps trying — the
+                    # stuck-loop counter handles terminal escalation.
+                    entry.updated_at = now
+                    self._save()
+                    return entry
                else:
                    reset_reason = self._should_reset(entry, source)
                if not reset_reason:
@@ -802,6 +846,55 @@ class SessionStore:
                return True
        return False

+    def mark_resume_pending(
+        self,
+        session_key: str,
+        reason: str = "restart_timeout",
+    ) -> bool:
+        """Mark a session as resumable after a restart interruption.
+
+        Unlike ``suspend_session()``, this preserves the existing
+        ``session_id`` and the transcript.  The next call to
+        ``get_or_create_session()`` for this key returns the same entry
+        so the user auto-resumes on the same conversation lane.
+
+        Returns True if the session existed and was marked.
+        """
+        with self._lock:
+            self._ensure_loaded_locked()
+            if session_key in self._entries:
+                entry = self._entries[session_key]
+                # Never override an explicit ``suspended`` — that is a hard
+                # forced-wipe signal (from /stop or stuck-loop escalation).
+                if entry.suspended:
+                    return False
+                entry.resume_pending = True
+                entry.resume_reason = reason
+                entry.last_resume_marked_at = _now()
+                self._save()
+                return True
+        return False
+
+    def clear_resume_pending(self, session_key: str) -> bool:
+        """Clear the resume-pending flag after a successful resumed turn.
+
+        Called from the gateway after ``run_conversation()`` returns a
+        final response for a session that had ``resume_pending=True``,
+        signalling that recovery succeeded.
+
+        Returns True if a flag was cleared.
+        """
+        with self._lock:
+            self._ensure_loaded_locked()
+            entry = self._entries.get(session_key)
+            if entry is None or not entry.resume_pending:
+                return False
+            entry.resume_pending = False
+            entry.resume_reason = None
+            entry.last_resume_marked_at = None
+            self._save()
+            return True
+
    def prune_old_entries(self, max_age_days: int) -> int:
        """Drop SessionEntry records older than max_age_days.

@@ -833,12 +926,18 @@ class SessionStore:
                    continue
                # Never prune sessions with an active background process
                # attached — the user may still be waiting on output.
+                # The callback is keyed by session_key (see process_registry.
+                # has_active_for_session); passing session_id here used to
+                # never match, so active sessions got pruned anyway.
                if self._has_active_processes_fn is not None:
                    try:
-                        if self._has_active_processes_fn(entry.session_id):
+                        if self._has_active_processes_fn(entry.session_key):
                            continue
-                    except Exception:
-                        pass
+                    except Exception as exc:
+                        logger.debug(
+                            "has_active_processes_fn raised during prune for %s: %s",
+                            entry.session_key, exc,
+                        )
                if entry.updated_at < cutoff:
                    removed_keys.append(key)
            for key in removed_keys:
@@ -861,6 +960,12 @@ class SessionStore:
        (#7536).  Only suspends sessions updated within *max_age_seconds*
        to avoid resetting long-idle sessions that are harmless to resume.
        Returns the number of sessions that were suspended.
+
+        Entries flagged ``resume_pending=True`` are skipped — those were
+        marked intentionally by the drain-timeout path as recoverable.
+        Terminal escalation for genuinely stuck ``resume_pending`` sessions
+        is handled by the existing ``.restart_failure_counts`` stuck-loop
+        counter, which runs after this method on startup.
        """
        from datetime import timedelta

@@ -869,6 +974,8 @@ class SessionStore:
        with self._lock:
            self._ensure_loaded_locked()
            for entry in self._entries.values():
+                if entry.resume_pending:
+                    continue
                if not entry.suspended and entry.updated_at >= cutoff:
                    entry.suspended = True
                    count += 1
@@ -430,6 +430,21 @@ class GatewayStreamConsumer:
                # a real string like "msg_1", not "__no_edit__", so that case
                # still resets and creates a fresh segment as intended.)
                if got_segment_break:
+                    # If the segment-break edit failed to deliver the
+                    # accumulated content (flood control that has not yet
+                    # promoted to fallback mode, or fallback mode itself),
+                    # _accumulated still holds pre-boundary text the user
+                    # never saw. Flush that tail as a continuation message
+                    # before the reset below wipes _accumulated — otherwise
+                    # text generated before the tool boundary is silently
+                    # dropped (issue #8124).
+                    if (
+                        self._accumulated
+                        and not current_update_visible
+                        and self._message_id
+                        and self._message_id != "__no_edit__"
+                    ):
+                        await self._flush_segment_tail_on_edit_failure()
                    self._reset_segment_state(preserve_no_edit=True)

                await asyncio.sleep(0.05)  # Small yield to not busy-loop
@@ -556,6 +571,30 @@ class GatewayStreamConsumer:
            if final_text.strip() and final_text != self._visible_prefix():
                continuation = final_text
            else:
+                # Defence-in-depth for #7183: the last edit may still show the
+                # cursor character because fallback mode was entered after an
+                # edit failure left it stuck.  Try one final edit to strip it
+                # so the message doesn't freeze with a visible ▉.  Best-effort
+                # — if this edit also fails (flood control still active),
+                # _try_strip_cursor has already been called on fallback entry
+                # and the adaptive-backoff retries will have had their shot.
+                if (
+                    self._message_id
+                    and self._last_sent_text
+                    and self.cfg.cursor
+                    and self._last_sent_text.endswith(self.cfg.cursor)
+                ):
+                    clean_text = self._last_sent_text[:-len(self.cfg.cursor)]
+                    try:
+                        result = await self.adapter.edit_message(
+                            chat_id=self.chat_id,
+                            message_id=self._message_id,
+                            content=clean_text,
+                        )
+                        if result.success:
+                            self._last_sent_text = clean_text
+                    except Exception:
+                        pass
                self._already_sent = True
                self._final_response_sent = True
                return
@@ -620,6 +659,39 @@ class GatewayStreamConsumer:
        err_lower = err.lower()
        return "flood" in err_lower or "retry after" in err_lower or "rate" in err_lower

+    async def _flush_segment_tail_on_edit_failure(self) -> None:
+        """Deliver un-sent tail content before a segment-break reset.
+
+        When an edit fails (flood control, transport error) and a tool
+        boundary arrives before the next retry, ``_accumulated`` holds text
+        that was generated but never shown to the user. Without this flush,
+        the segment reset would discard that tail and leave a frozen cursor
+        in the partial message.
+
+        Sends the tail that sits after the last successfully-delivered
+        prefix as a new message, and best-effort strips the stuck cursor
+        from the previous partial message.
+        """
+        if not self._fallback_final_send:
+            await self._try_strip_cursor()
+        visible = self._fallback_prefix or self._visible_prefix()
+        tail = self._accumulated
+        if visible and tail.startswith(visible):
+            tail = tail[len(visible):].lstrip()
+        tail = self._clean_for_display(tail)
+        if not tail.strip():
+            return
+        try:
+            result = await self.adapter.send(
+                chat_id=self.chat_id,
+                content=tail,
+                metadata=self.metadata,
+            )
+            if result.success:
+                self._already_sent = True
+        except Exception as e:
+            logger.error("Segment-break tail flush error: %s", e)
+
    async def _try_strip_cursor(self) -> None:
        """Best-effort edit to remove the cursor from the last visible message.

@@ -20,6 +20,7 @@ import logging
 import os
 import shutil
 import shlex
+import ssl
 import stat
 import base64
 import hashlib
@@ -151,7 +152,7 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
        id="gemini",
        name="Google AI Studio",
        auth_type="api_key",
-        inference_base_url="https://generativelanguage.googleapis.com/v1beta/openai",
+        inference_base_url="https://generativelanguage.googleapis.com/v1beta",
        api_key_env_vars=("GOOGLE_API_KEY", "GEMINI_API_KEY"),
        base_url_env_var="GEMINI_BASE_URL",
    ),
@@ -353,6 +354,9 @@ def _resolve_kimi_base_url(api_key: str, default_url: str, env_override: str) ->
    """
    if env_override:
        return env_override
+    # No key → nothing to infer from.  Return default without inspecting.
+    if not api_key:
+        return default_url
    if api_key.startswith("sk-kimi-"):
        return KIMI_CODE_BASE_URL
    return default_url
@@ -480,6 +484,14 @@ def _resolve_zai_base_url(api_key: str, default_url: str, env_override: str) ->
    if env_override:
        return env_override

+    # No API key set → don't probe (would fire N×M HTTPS requests with an
+    # empty Bearer token, all returning 401).  This path is hit during
+    # auxiliary-client auto-detection when the user has no Z.AI credentials
+    # at all — the caller discards the result immediately, so the probe is
+    # pure latency for every AIAgent construction.
+    if not api_key:
+        return default_url
+
    # Check provider-state cache for a previously-detected endpoint.
    auth_store = _load_auth_store()
    state = _load_provider_state(auth_store, "zai") or {}
@@ -1434,49 +1446,6 @@ def _read_codex_tokens(*, _lock: bool = True) -> Dict[str, Any]:
    }


-def _write_codex_cli_tokens(
-    access_token: str,
-    refresh_token: str,
-    *,
-    last_refresh: Optional[str] = None,
-) -> None:
-    """Write refreshed tokens back to ~/.codex/auth.json.
-
-    OpenAI OAuth refresh tokens are single-use and rotate on every refresh.
-    When Hermes refreshes a token it consumes the old refresh_token; if we
-    don't write the new pair back, the Codex CLI (or VS Code extension) will
-    fail with ``refresh_token_reused`` on its next refresh attempt.
-
-    This mirrors the Anthropic write-back to ~/.claude/.credentials.json
-    via ``_write_claude_code_credentials()``.
-    """
-    codex_home = os.getenv("CODEX_HOME", "").strip()
-    if not codex_home:
-        codex_home = str(Path.home() / ".codex")
-    auth_path = Path(codex_home).expanduser() / "auth.json"
-    try:
-        existing: Dict[str, Any] = {}
-        if auth_path.is_file():
-            existing = json.loads(auth_path.read_text(encoding="utf-8"))
-        if not isinstance(existing, dict):
-            existing = {}
-
-        tokens_dict = existing.get("tokens")
-        if not isinstance(tokens_dict, dict):
-            tokens_dict = {}
-        tokens_dict["access_token"] = access_token
-        tokens_dict["refresh_token"] = refresh_token
-        existing["tokens"] = tokens_dict
-        if last_refresh is not None:
-            existing["last_refresh"] = last_refresh
-
-        auth_path.parent.mkdir(parents=True, exist_ok=True)
-        auth_path.write_text(json.dumps(existing, indent=2), encoding="utf-8")
-        auth_path.chmod(0o600)
-    except (OSError, IOError) as exc:
-        logger.debug("Failed to write refreshed tokens to %s: %s", auth_path, exc)
-
-
 def _save_codex_tokens(tokens: Dict[str, str], last_refresh: str = None) -> None:
    """Save Codex OAuth tokens to Hermes auth store (~/.hermes/auth.json)."""
    if last_refresh is None:
@@ -1544,6 +1513,11 @@ def refresh_codex_oauth_pure(
                "then run `hermes auth` to re-authenticate."
            )
            relogin_required = True
+        # A 401/403 from the token endpoint always means the refresh token
+        # is invalid/expired — force relogin even if the body error code
+        # wasn't one of the known strings above.
+        if response.status_code in (401, 403) and not relogin_required:
+            relogin_required = True
        raise AuthError(
            message,
            provider="openai-codex",
@@ -1599,12 +1573,6 @@ def _refresh_codex_auth_tokens(
    updated_tokens["refresh_token"] = refreshed["refresh_token"]

    _save_codex_tokens(updated_tokens)
-    # Write back to ~/.codex/auth.json so Codex CLI / VS Code stay in sync.
-    _write_codex_cli_tokens(
-        refreshed["access_token"],
-        refreshed["refresh_token"],
-        last_refresh=refreshed.get("last_refresh"),
-    )
    return updated_tokens


@@ -1649,25 +1617,7 @@ def resolve_codex_runtime_credentials(
    refresh_skew_seconds: int = CODEX_ACCESS_TOKEN_REFRESH_SKEW_SECONDS,
 ) -> Dict[str, Any]:
    """Resolve runtime credentials from Hermes's own Codex token store."""
-    try:
-        data = _read_codex_tokens()
-    except AuthError as orig_err:
-        # Only attempt migration when there are NO tokens stored at all
-        # (code == "codex_auth_missing"), not when tokens exist but are invalid.
-        if orig_err.code != "codex_auth_missing":
-            raise
-
-        # Migration: user had Codex as active provider with old storage (~/.codex/).
-        cli_tokens = _import_codex_cli_tokens()
-        if cli_tokens:
-            logger.info("Migrating Codex credentials from ~/.codex/ to Hermes auth store")
-            print("⚠️  Migrating Codex credentials to Hermes's own auth store.")
-            print("   This avoids conflicts with Codex CLI and VS Code.")
-            print("   Run `hermes auth` to create a fully independent session.\n")
-            _save_codex_tokens(cli_tokens)
-            data = _read_codex_tokens()
-        else:
-            raise
+    data = _read_codex_tokens()
    tokens = dict(data["tokens"])
    access_token = str(tokens.get("access_token", "") or "").strip()
    refresh_timeout_seconds = float(os.getenv("HERMES_CODEX_REFRESH_TIMEOUT_SECONDS", "20"))
@@ -1714,7 +1664,7 @@ def _resolve_verify(
    insecure: Optional[bool] = None,
    ca_bundle: Optional[str] = None,
    auth_state: Optional[Dict[str, Any]] = None,
-) -> bool | str:
+) -> bool | ssl.SSLContext:
    tls_state = auth_state.get("tls") if isinstance(auth_state, dict) else {}
    tls_state = tls_state if isinstance(tls_state, dict) else {}

@@ -1734,13 +1684,12 @@ def _resolve_verify(
    if effective_ca:
        ca_path = str(effective_ca)
        if not os.path.isfile(ca_path):
-            import logging
-            logging.getLogger("hermes.auth").warning(
+            logger.warning(
                "CA bundle path does not exist: %s — falling back to default certificates",
                ca_path,
            )
            return True
-        return ca_path
+        return ssl.create_default_context(cafile=ca_path)
    return True


@@ -2783,6 +2732,17 @@ def _update_config_for_provider(
        # Clear stale base_url to prevent contamination when switching providers
        model_cfg.pop("base_url", None)

+    # Clear stale api_key/api_mode left over from a previous custom provider.
+    # When the user switches from e.g. a MiniMax custom endpoint
+    # (api_mode=anthropic_messages, api_key=mxp-...) to a built-in provider
+    # (e.g. OpenRouter), the stale api_key/api_mode would override the new
+    # provider's credentials and transport choice.  Built-in providers that
+    # need a specific api_mode (copilot, xai) set it at request-resolution
+    # time via `_copilot_runtime_api_mode` / `_detect_api_mode_for_url`, so
+    # removing the persisted value here is safe.
+    model_cfg.pop("api_key", None)
+    model_cfg.pop("api_mode", None)
+
    # When switching to a non-OpenRouter provider, ensure model.default is
    # valid for the new provider.  An OpenRouter-formatted name like
    # "anthropic/claude-opus-4.6" will fail on direct-API providers.
@@ -201,7 +201,7 @@ def run_backup(args) -> None:
                else:
                    zf.write(abs_path, arcname=str(rel_path))
                    total_bytes += abs_path.stat().st_size
-            except (PermissionError, OSError) as exc:
+            except (PermissionError, OSError, ValueError) as exc:
                errors.append(f"  {rel_path}: {exc}")
                continue

@@ -260,10 +260,10 @@ GATEWAY_KNOWN_COMMANDS: frozenset[str] = frozenset(
 )


-# Commands that must never be queued behind an active gateway session.
-# These are explicit control/info commands handled by the gateway itself;
-# if they get queued as pending text, the safety net in gateway.run will
-# discard them before they ever reach the user.
+# Commands with explicit Level-2 running-agent handlers in gateway/run.py.
+# Listed here for introspection / tests; semantically a subset of
+# "all resolvable commands" — which is the real bypass set (see
+# should_bypass_active_session below).
 ACTIVE_SESSION_BYPASS_COMMANDS: frozenset[str] = frozenset(
    {
        "agents",
@@ -285,9 +285,26 @@ ACTIVE_SESSION_BYPASS_COMMANDS: frozenset[str] = frozenset(


 def should_bypass_active_session(command_name: str | None) -> bool:
-    """Return True when a slash command must bypass active-session queuing."""
-    cmd = resolve_command(command_name) if command_name else None
-    return bool(cmd and cmd.name in ACTIVE_SESSION_BYPASS_COMMANDS)
+    """Return True for any resolvable slash command.
+
+    Rationale: every gateway-registered slash command either has a
+    specific Level-2 handler in gateway/run.py (/stop, /new, /model,
+    /approve, etc.) or reaches the running-agent catch-all that returns
+    a "busy — wait or /stop first" response. In both paths the command
+    is dispatched, not queued.
+
+    Queueing is always wrong for a recognized slash command because the
+    safety net in gateway.run discards any command text that reaches
+    the pending queue — which meant a mid-run /model (or /reasoning,
+    /voice, /insights, /title, /resume, /retry, /undo, /compress,
+    /usage, /provider, /reload-mcp, /sethome, /reset) would silently
+    interrupt the agent AND get discarded, producing a zero-char
+    response. See issue #5057 / PRs #6252, #10370, #4665.
+
+    ACTIVE_SESSION_BYPASS_COMMANDS remains the subset of commands with
+    explicit Level-2 handlers; the rest fall through to the catch-all.
+    """
+    return resolve_command(command_name) is not None if command_name else False


 def _resolve_config_gates() -> set[str]:
@@ -403,7 +403,11 @@ DEFAULT_CONFIG = {
        "container_persistent": True,   # Persist filesystem across sessions
        # Docker volume mounts — share host directories with the container.
        # Each entry is "host_path:container_path" (standard Docker -v syntax).
-        # Example: ["/home/user/projects:/workspace/projects", "/data:/data"]
+        # Example:
+        # ["/home/user/projects:/workspace/projects",
+        #  "/home/user/.hermes/cache/documents:/output"]
+        # For gateway MEDIA delivery, write inside Docker to /output/... and emit
+        # the host-visible path in MEDIA:, not the container path.
        "docker_volumes": [],
        # Explicit opt-in: mount the host cwd into /workspace for Docker sessions.
        # Default off because passing host directories into a sandbox weakens isolation.
@@ -470,13 +474,6 @@ DEFAULT_CONFIG = {
        },
    },

-    "smart_model_routing": {
-        "enabled": False,
-        "max_simple_chars": 160,
-        "max_simple_words": 28,
-        "cheap_model": {},
-    },
-    
    # Auxiliary model config — provider:model for each side task.
    # Format: provider is the provider name, model is the model slug.
    # "auto" for provider = auto-detect best available provider.
@@ -490,6 +487,7 @@ DEFAULT_CONFIG = {
            "base_url": "",        # direct OpenAI-compatible endpoint (takes precedence over provider)
            "api_key": "",         # API key for base_url (falls back to OPENAI_API_KEY)
            "timeout": 120,        # seconds — LLM API call timeout; vision payloads need generous timeout
+            "extra_body": {},      # OpenAI-compatible provider-specific request fields
            "download_timeout": 30,  # seconds — image HTTP download timeout; increase for slow connections
        },
        "web_extract": {
@@ -498,6 +496,7 @@ DEFAULT_CONFIG = {
            "base_url": "",
            "api_key": "",
            "timeout": 360,        # seconds (6min) — per-attempt LLM summarization timeout; increase for slow local models
+            "extra_body": {},
        },
        "compression": {
            "provider": "auto",
@@ -505,6 +504,7 @@ DEFAULT_CONFIG = {
            "base_url": "",
            "api_key": "",
            "timeout": 120,        # seconds — compression summarises large contexts; increase for local models
+            "extra_body": {},
        },
        "session_search": {
            "provider": "auto",
@@ -512,6 +512,8 @@ DEFAULT_CONFIG = {
            "base_url": "",
            "api_key": "",
            "timeout": 30,
+            "extra_body": {},
+            "max_concurrency": 3,  # Clamp parallel summaries to avoid request-burst 429s on small providers
        },
        "skills_hub": {
            "provider": "auto",
@@ -519,6 +521,7 @@ DEFAULT_CONFIG = {
            "base_url": "",
            "api_key": "",
            "timeout": 30,
+            "extra_body": {},
        },
        "approval": {
            "provider": "auto",
@@ -526,6 +529,7 @@ DEFAULT_CONFIG = {
            "base_url": "",
            "api_key": "",
            "timeout": 30,
+            "extra_body": {},
        },
        "mcp": {
            "provider": "auto",
@@ -533,6 +537,7 @@ DEFAULT_CONFIG = {
            "base_url": "",
            "api_key": "",
            "timeout": 30,
+            "extra_body": {},
        },
        "flush_memories": {
            "provider": "auto",
@@ -540,6 +545,7 @@ DEFAULT_CONFIG = {
            "base_url": "",
            "api_key": "",
            "timeout": 30,
+            "extra_body": {},
        },
        "title_generation": {
            "provider": "auto",
@@ -547,6 +553,7 @@ DEFAULT_CONFIG = {
            "base_url": "",
            "api_key": "",
            "timeout": 30,
+            "extra_body": {},
        },
    },
    
@@ -558,9 +565,14 @@ DEFAULT_CONFIG = {
        "bell_on_complete": False,
        "show_reasoning": False,
        "streaming": False,
+        "final_response_markdown": "strip",  # render | strip | raw
        "inline_diffs": True,     # Show inline diff previews for write actions (write_file, patch, skill_manage)
        "show_cost": False,       # Show $ cost in the status bar (off by default)
        "skin": "default",
+        "user_message_preview": {  # CLI: how many submitted user-message lines to echo back in scrollback
+            "first_lines": 2,
+            "last_lines": 2,
+        },
        "interim_assistant_messages": True,  # Gateway: show natural mid-turn assistant status messages
        "tool_progress_command": False,  # Enable /verbose command in messaging gateway
        "tool_progress_overrides": {},  # DEPRECATED — use display.platforms instead
@@ -708,6 +720,14 @@ DEFAULT_CONFIG = {
        "auto_thread": True,           # Auto-create threads on @mention in channels (like Slack)
        "reactions": True,             # Add 👀/✅/❌ reactions to messages during processing
        "channel_prompts": {},         # Per-channel ephemeral system prompts (forum parents apply to child threads)
+        # discord_server tool: restrict which actions the agent may call.
+        # Default (empty) = all actions allowed (subject to bot privileged intents).
+        # Accepts comma-separated string ("list_guilds,list_channels,fetch_messages")
+        # or YAML list. Unknown names are dropped with a warning at load time.
+        # Actions: list_guilds, server_info, list_channels, channel_info,
+        # list_roles, member_info, search_members, fetch_messages, list_pins,
+        # pin_message, unpin_message, create_thread, add_role, remove_role.
+        "server_actions": "",
    },

    # WhatsApp platform settings (gateway mode)
@@ -737,9 +757,14 @@ DEFAULT_CONFIG = {
    #   manual — always prompt the user (default)
    #   smart  — use auxiliary LLM to auto-approve low-risk commands, prompt for high-risk
    #   off    — skip all approval prompts (equivalent to --yolo)
+    #
+    # cron_mode — what to do when a cron job hits a dangerous command:
+    #   deny    — block the command and let the agent find another way (default, safe)
+    #   approve — auto-approve all dangerous commands in cron jobs
    "approvals": {
        "mode": "manual",
        "timeout": 60,
+        "cron_mode": "deny",
    },

    # Permanently allowed dangerous command patterns (added via "always" approval)
@@ -802,7 +827,7 @@ DEFAULT_CONFIG = {
    },

    # Config schema version - bump this when adding new required fields
-    "_config_version": 19,
+    "_config_version": 20,
 }

 # =============================================================================
@@ -2856,24 +2881,11 @@ _FALLBACK_COMMENT = """
 #   minimax      (MINIMAX_API_KEY)     — MiniMax
 #   minimax-cn   (MINIMAX_CN_API_KEY)  — MiniMax (China)
 #
-# For custom OpenAI-compatible endpoints, add base_url and api_key_env.
+# For custom OpenAI-compatible endpoints, add base_url and key_env.
 #
 # fallback_model:
 #   provider: openrouter
 #   model: anthropic/claude-sonnet-4
-#
-# ── Smart Model Routing ────────────────────────────────────────────────
-# Optional cheap-vs-strong routing for simple turns.
-# Keeps the primary model for complex work, but can route short/simple
-# messages to a cheaper model across providers.
-#
-# smart_model_routing:
-#   enabled: true
-#   max_simple_chars: 160
-#   max_simple_words: 28
-#   cheap_model:
-#     provider: openrouter
-#     model: google/gemini-2.5-flash
 """


@@ -2900,24 +2912,11 @@ _COMMENTED_SECTIONS = """
 #   minimax      (MINIMAX_API_KEY)     — MiniMax
 #   minimax-cn   (MINIMAX_CN_API_KEY)  — MiniMax (China)
 #
-# For custom OpenAI-compatible endpoints, add base_url and api_key_env.
+# For custom OpenAI-compatible endpoints, add base_url and key_env.
 #
 # fallback_model:
 #   provider: openrouter
 #   model: anthropic/claude-sonnet-4
-#
-# ── Smart Model Routing ────────────────────────────────────────────────
-# Optional cheap-vs-strong routing for simple turns.
-# Keeps the primary model for complex work, but can route short/simple
-# messages to a cheaper model across providers.
-#
-# smart_model_routing:
-#   enabled: true
-#   max_simple_chars: 160
-#   max_simple_words: 28
-#   cheap_model:
-#     provider: openrouter
-#     model: google/gemini-2.5-flash
 """


@@ -3380,6 +3379,10 @@ def show_config():
    print(f"  Personality:  {display.get('personality', 'kawaii')}")
    print(f"  Reasoning:    {'on' if display.get('show_reasoning', False) else 'off'}")
    print(f"  Bell:         {'on' if display.get('bell_on_complete', False) else 'off'}")
+    ump = display.get('user_message_preview', {}) if isinstance(display.get('user_message_preview', {}), dict) else {}
+    ump_first = ump.get('first_lines', 2)
+    ump_last = ump.get('last_lines', 2)
+    print(f"  User preview: first {ump_first} line(s), last {ump_last} line(s)")

    # Terminal
    print()
@@ -277,6 +277,86 @@ def run_doctor(args):
    config_path = HERMES_HOME / 'config.yaml'
    if config_path.exists():
        check_ok(f"{_DHH}/config.yaml exists")
+
+        # Validate model.provider and model.default values
+        try:
+            import yaml as _yaml
+            cfg = _yaml.safe_load(config_path.read_text(encoding="utf-8")) or {}
+            model_section = cfg.get("model") or {}
+            provider_raw = (model_section.get("provider") or "").strip()
+            provider = provider_raw.lower()
+            default_model = (model_section.get("default") or model_section.get("model") or "").strip()
+
+            known_providers: set = set()
+            try:
+                from hermes_cli.auth import PROVIDER_REGISTRY
+                known_providers = set(PROVIDER_REGISTRY.keys()) | {"openrouter", "custom", "auto"}
+            except Exception:
+                pass
+            try:
+                from hermes_cli.auth import resolve_provider as _resolve_provider
+            except Exception:
+                _resolve_provider = None
+
+            canonical_provider = provider
+            if provider and _resolve_provider is not None and provider != "auto":
+                try:
+                    canonical_provider = _resolve_provider(provider)
+                except Exception:
+                    canonical_provider = None
+
+            if provider and provider != "auto":
+                if canonical_provider is None or (known_providers and canonical_provider not in known_providers):
+                    known_list = ", ".join(sorted(known_providers)) if known_providers else "(unavailable)"
+                    check_fail(
+                        f"model.provider '{provider_raw}' is not a recognised provider",
+                        f"(known: {known_list})",
+                    )
+                    issues.append(
+                        f"model.provider '{provider_raw}' is unknown. "
+                        f"Valid providers: {known_list}. "
+                        f"Fix: run 'hermes config set model.provider <valid_provider>'"
+                    )
+
+            # Warn if model is set to a provider-prefixed name on a provider that doesn't use them
+            if default_model and "/" in default_model and canonical_provider and canonical_provider not in ("openrouter", "custom", "auto", "ai-gateway", "kilocode", "opencode-zen", "huggingface", "nous"):
+                check_warn(
+                    f"model.default '{default_model}' uses a vendor/model slug but provider is '{provider_raw}'",
+                    "(vendor-prefixed slugs belong to aggregators like openrouter)",
+                )
+                issues.append(
+                    f"model.default '{default_model}' is vendor-prefixed but model.provider is '{provider_raw}'. "
+                    "Either set model.provider to 'openrouter', or drop the vendor prefix."
+                )
+
+            # Check credentials for the configured provider.
+            # Limit to API-key providers in PROVIDER_REGISTRY — other provider
+            # types (OAuth, SDK, openrouter/anthropic/custom/auto) have their
+            # own env-var checks elsewhere in doctor, and get_auth_status()
+            # returns a bare {logged_in: False} for anything it doesn't
+            # explicitly dispatch, which would produce false positives.
+            if canonical_provider and canonical_provider not in ("auto", "custom", "openrouter"):
+                try:
+                    from hermes_cli.auth import PROVIDER_REGISTRY, get_auth_status
+                    pconfig = PROVIDER_REGISTRY.get(canonical_provider)
+                    if pconfig and getattr(pconfig, "auth_type", "") == "api_key":
+                        status = get_auth_status(canonical_provider) or {}
+                        configured = bool(status.get("configured") or status.get("logged_in") or status.get("api_key"))
+                        if not configured:
+                            check_fail(
+                                f"model.provider '{canonical_provider}' is set but no API key is configured",
+                                "(check ~/.hermes/.env or run 'hermes setup')",
+                            )
+                            issues.append(
+                                f"No credentials found for provider '{canonical_provider}'. "
+                                f"Run 'hermes setup' or set the provider's API key in {_DHH}/.env, "
+                                f"or switch providers with 'hermes config set model.provider <name>'"
+                            )
+                except Exception:
+                    pass
+
+        except Exception as e:
+            check_warn("Could not validate model/provider config", f"({e})")
    else:
        fallback_config = PROJECT_ROOT / 'cli-config.yaml'
        if fallback_config.exists():
@@ -778,6 +858,16 @@ def run_doctor(args):
            elif response.status_code == 401:
                print(f"\r  {color('✗', Colors.RED)} OpenRouter API {color('(invalid API key)', Colors.DIM)}                ")
                issues.append("Check OPENROUTER_API_KEY in .env")
+            elif response.status_code == 402:
+                print(f"\r  {color('✗', Colors.RED)} OpenRouter API {color('(out of credits — payment required)', Colors.DIM)}")
+                issues.append(
+                    "OpenRouter account has insufficient credits. "
+                    "Fix: run 'hermes config set model.provider <provider>' to switch providers, "
+                    "or fund your OpenRouter account at https://openrouter.ai/settings/credits"
+                )
+            elif response.status_code == 429:
+                print(f"\r  {color('✗', Colors.RED)} OpenRouter API {color('(rate limited)', Colors.DIM)}                ")
+                issues.append("OpenRouter rate limit hit — consider switching to a different provider or waiting")
            else:
                print(f"\r  {color('✗', Colors.RED)} OpenRouter API {color(f'(HTTP {response.status_code})', Colors.DIM)}                ")
        except Exception as e:
@@ -160,7 +160,6 @@ def _config_overrides(config: dict) -> dict[str, str]:
        ("display", "streaming"),
        ("display", "skin"),
        ("display", "show_reasoning"),
-        ("smart_model_routing", "enabled"),
        ("privacy", "redact_pii"),
        ("tts", "provider"),
    ]
@@ -693,6 +693,10 @@ def _resolve_session_by_name_or_id(name_or_id: str) -> Optional[str]:
    - If it looks like a session ID (contains underscore + hex), try direct lookup first.
    - Otherwise, treat it as a title and use resolve_session_by_title (auto-latest).
    - Falls back to the other method if the first doesn't match.
+    - If the resolved session is a compression root, follow the chain forward
+      to the latest continuation. Users who remember the old root ID (e.g.
+      from an exit summary printed before the bug fix, or from notes) get
+      resumed at the live tip instead of a stale parent with no messages.
    """
    try:
        from hermes_state import SessionDB
@@ -701,14 +705,23 @@ def _resolve_session_by_name_or_id(name_or_id: str) -> Optional[str]:

        # Try as exact session ID first
        session = db.get_session(name_or_id)
+        resolved_id: Optional[str] = None
        if session:
-            db.close()
-            return session["id"]
+            resolved_id = session["id"]
+        else:
+            # Try as title (with auto-latest for lineage)
+            resolved_id = db.resolve_session_by_title(name_or_id)
+
+        if resolved_id:
+            # Project forward through compression chain so resumes land on
+            # the live tip instead of a dead compressed parent.
+            try:
+                resolved_id = db.get_compression_tip(resolved_id) or resolved_id
+            except Exception:
+                pass

-        # Try as title (with auto-latest for lineage)
-        session_id = db.resolve_session_by_title(name_or_id)
        db.close()
-        return session_id
+        return resolved_id
    except Exception:
        pass
    return None
@@ -897,6 +910,10 @@ def _make_tui_argv(tui_dir: Path, tui_dev: bool) -> tuple[list[str], Path]:
    _ensure_tui_node()

    def _node_bin(bin: str) -> str:
+        if bin == "node":
+            env_node = os.environ.get("HERMES_NODE")
+            if env_node and os.path.isfile(env_node) and os.access(env_node, os.X_OK):
+                return env_node
        path = shutil.which(bin)
        if not path:
            print(f"{bin} not found — install Node.js to use the TUI.")
@@ -2347,7 +2364,7 @@ def _model_flow_google_gemini_cli(_config, current_model=""):
        return

    models = list(_PROVIDER_MODELS.get("google-gemini-cli") or [])
-    default = current_model or (models[0] if models else "gemini-2.5-flash")
+    default = current_model or (models[0] if models else "gemini-3-flash-preview")
    selected = _prompt_model_selection(models, current_model=default)
    if selected:
        _save_model_choice(selected)
@@ -3969,7 +3986,7 @@ def _model_flow_anthropic(config, current_model=""):

        elif choice == "2":
            print()
-            print("  Get an API key at: https://console.anthropic.com/settings/keys")
+            print("  Get an API key at: https://platform.claude.com/settings/keys")
            print()
            try:
                import getpass
@@ -6225,8 +6242,9 @@ def cmd_dashboard(args):
        print(f"Install them with:  {sys.executable} -m pip install 'fastapi' 'uvicorn[standard]'")
        sys.exit(1)

-    if not _build_web_ui(PROJECT_ROOT / "web", fatal=True):
-        sys.exit(1)
+    if "HERMES_WEB_DIST" not in os.environ:
+        if not _build_web_ui(PROJECT_ROOT / "web", fatal=True):
+            sys.exit(1)

    from hermes_cli.web_server import start_server

@@ -6997,6 +7015,13 @@ For more help on a command:
    wh_sub.add_argument(
        "--secret", default="", help="HMAC secret (auto-generated if omitted)"
    )
+    wh_sub.add_argument(
+        "--deliver-only",
+        action="store_true",
+        help="Skip the agent — deliver the rendered prompt directly as the "
+        "message. Zero LLM cost. Requires --deliver to be a real target "
+        "(not 'log').",
+    )

    webhook_subparsers.add_parser(
        "list", aliases=["ls"], help="List all dynamic subscriptions"
@@ -1035,21 +1035,49 @@ def list_authenticated_providers(
        seen_slugs.add(_cp.slug.lower())

    # --- 3. User-defined endpoints from config ---
+    # Track (name, base_url) of what section 3 emits so section 4 can skip
+    # any overlapping ``custom_providers:`` entries.  Callers typically pass
+    # both (gateway/CLI invoke ``get_compatible_custom_providers()`` which
+    # merges ``providers:`` into the list) — without this, the same endpoint
+    # produces two picker rows: one bare-slug ("openrouter") from section 3
+    # and one "custom:openrouter" from section 4, both labelled identically.
+    _section3_emitted_pairs: set = set()
    if user_providers and isinstance(user_providers, dict):
        for ep_name, ep_cfg in user_providers.items():
            if not isinstance(ep_cfg, dict):
                continue
+            # Skip if this slug was already emitted (e.g. canonical provider
+            # with the same name) or will be picked up by section 4.
+            if ep_name.lower() in seen_slugs:
+                continue
            display_name = ep_cfg.get("name", "") or ep_name
-            api_url = ep_cfg.get("api", "") or ep_cfg.get("url", "") or ""
-            default_model = ep_cfg.get("default_model", "")
+            # ``base_url`` is Hermes's canonical write key (matches
+            # custom_providers and _save_custom_provider); ``api`` / ``url``
+            # remain as fallbacks for hand-edited / legacy configs.
+            api_url = (
+                ep_cfg.get("base_url", "")
+                or ep_cfg.get("api", "")
+                or ep_cfg.get("url", "")
+                or ""
+            )
+            # ``default_model`` is the legacy key; ``model`` matches what
+            # custom_providers entries use, so accept either.
+            default_model = ep_cfg.get("default_model", "") or ep_cfg.get("model", "")

            # Build models list from both default_model and full models array
            models_list = []
            if default_model:
                models_list.append(default_model)
-            # Also include the full models list from config
+            # Also include the full models list from config.
+            # Hermes writes ``models:`` as a dict keyed by model id
+            # (see hermes_cli/main.py::_save_custom_provider); older
+            # configs or hand-edited files may still use a list.
            cfg_models = ep_cfg.get("models", [])
-            if isinstance(cfg_models, list):
+            if isinstance(cfg_models, dict):
+                for m in cfg_models:
+                    if m and m not in models_list:
+                        models_list.append(m)
+            elif isinstance(cfg_models, list):
                for m in cfg_models:
                    if m and m not in models_list:
                        models_list.append(m)
@@ -1066,6 +1094,13 @@ def list_authenticated_providers(
                "source": "user-config",
                "api_url": api_url,
            })
+            seen_slugs.add(ep_name.lower())
+            _pair = (
+                str(display_name).strip().lower(),
+                str(api_url).strip().rstrip("/").lower(),
+            )
+            if _pair[0] and _pair[1]:
+                _section3_emitted_pairs.add(_pair)

    # --- 4. Saved custom providers from config ---
    # Each ``custom_providers`` entry represents one model under a named
@@ -1100,13 +1135,41 @@ def list_authenticated_providers(
                    "api_url": api_url,
                    "models": [],
                }
+            # The singular ``model:`` field only holds the currently
+            # active model. Hermes's own writer (main.py::_save_custom_provider)
+            # stores every configured model as a dict under ``models:``;
+            # downstream readers (agent/models_dev.py, gateway/run.py,
+            # run_agent.py, hermes_cli/config.py) already consume that dict.
+            # The /model picker previously ignored it, so multi-model
+            # custom providers appeared to have only the active model.
            default_model = (entry.get("model") or "").strip()
            if default_model and default_model not in groups[slug]["models"]:
                groups[slug]["models"].append(default_model)

+            cfg_models = entry.get("models", {})
+            if isinstance(cfg_models, dict):
+                for m in cfg_models:
+                    if m and m not in groups[slug]["models"]:
+                        groups[slug]["models"].append(m)
+            elif isinstance(cfg_models, list):
+                for m in cfg_models:
+                    if m and m not in groups[slug]["models"]:
+                        groups[slug]["models"].append(m)
+
        for slug, grp in groups.items():
            if slug.lower() in seen_slugs:
                continue
+            # Skip if section 3 already emitted this endpoint under its
+            # ``providers:`` dict key — matches on (display_name, base_url),
+            # the tuple section 4 groups by.  Prevents two picker rows
+            # labelled identically when callers pass both ``user_providers``
+            # and a compatibility-merged ``custom_providers`` list.
+            _pair_key = (
+                str(grp["name"]).strip().lower(),
+                str(grp["api_url"]).strip().rstrip("/").lower(),
+            )
+            if _pair_key[0] and _pair_key[1] and _pair_key in _section3_emitted_pairs:
+                continue
            results.append({
                "slug": slug,
                "name": grp["name"],
@@ -128,18 +128,14 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
    ],
    "gemini": [
        "gemini-3.1-pro-preview",
+        "gemini-3-pro-preview",
        "gemini-3-flash-preview",
        "gemini-3.1-flash-lite-preview",
-        "gemini-2.5-pro",
-        "gemini-2.5-flash",
-        "gemini-2.5-flash-lite",
-        # Gemma open models (also served via AI Studio)
-        "gemma-4-31b-it",
    ],
    "google-gemini-cli": [
-        "gemini-2.5-pro",
-        "gemini-2.5-flash",
-        "gemini-2.5-flash-lite",
+        "gemini-3.1-pro-preview",
+        "gemini-3-pro-preview",
+        "gemini-3-flash-preview",
    ],
    "zai": [
        "glm-5.1",
@@ -554,7 +550,7 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
    ProviderEntry("copilot",        "GitHub Copilot",           "GitHub Copilot (uses GITHUB_TOKEN or gh auth token)"),
    ProviderEntry("copilot-acp",    "GitHub Copilot ACP",       "GitHub Copilot ACP (spawns `copilot --acp --stdio`)"),
    ProviderEntry("huggingface",    "Hugging Face",             "Hugging Face Inference Providers (20+ open models)"),
-    ProviderEntry("gemini",         "Google AI Studio",         "Google AI Studio (Gemini models — OpenAI-compatible endpoint)"),
+    ProviderEntry("gemini",         "Google AI Studio",         "Google AI Studio (Gemini models — native Gemini API)"),
    ProviderEntry("google-gemini-cli", "Google Gemini (OAuth)",   "Google Gemini via OAuth + Code Assist (free tier supported; no API key needed)"),
    ProviderEntry("deepseek",       "DeepSeek",                 "DeepSeek (DeepSeek-V3, R1, coder — direct API)"),
    ProviderEntry("xai",            "xAI",                      "xAI (Grok models — direct API)"),
@@ -2108,6 +2104,51 @@ def validate_requested_model(
                ),
            }

+    # MiniMax providers don't expose a /models endpoint — validate against
+    # the static catalog instead, similar to openai-codex.
+    if normalized in ("minimax", "minimax-cn"):
+        try:
+            catalog_models = provider_model_ids(normalized)
+        except Exception:
+            catalog_models = []
+        if catalog_models:
+            # Case-insensitive lookup (catalog uses mixed case like MiniMax-M2.7)
+            catalog_lower = {m.lower(): m for m in catalog_models}
+            if requested_for_lookup.lower() in catalog_lower:
+                return {
+                    "accepted": True,
+                    "persist": True,
+                    "recognized": True,
+                    "message": None,
+                }
+            # Auto-correct close matches (case-insensitive)
+            catalog_lower_list = list(catalog_lower.keys())
+            auto = get_close_matches(requested_for_lookup.lower(), catalog_lower_list, n=1, cutoff=0.9)
+            if auto:
+                corrected = catalog_lower[auto[0]]
+                return {
+                    "accepted": True,
+                    "persist": True,
+                    "recognized": True,
+                    "corrected_model": corrected,
+                    "message": f"Auto-corrected `{requested}` → `{corrected}`",
+                }
+            suggestions = get_close_matches(requested_for_lookup.lower(), catalog_lower_list, n=3, cutoff=0.5)
+            suggestion_text = ""
+            if suggestions:
+                suggestion_text = "\n  Similar models: " + ", ".join(f"`{catalog_lower[s]}`" for s in suggestions)
+            return {
+                "accepted": True,
+                "persist": True,
+                "recognized": False,
+                "message": (
+                    f"Note: `{requested}` was not found in the MiniMax catalog."
+                    f"{suggestion_text}"
+                    "\n  MiniMax does not expose a /models endpoint, so Hermes cannot verify the model name."
+                    "\n  The model may still work if it exists on the server."
+                ),
+            }
+
    # Probe the live API to check if the model actually exists
    api_models = fetch_api_models(api_key, base_url)

@@ -54,6 +54,8 @@ logger = logging.getLogger(__name__)
 VALID_HOOKS: Set[str] = {
    "pre_tool_call",
    "post_tool_call",
+    "transform_terminal_output",
+    "transform_tool_result",
    "pre_llm_call",
    "post_llm_call",
    "pre_api_request",
@@ -322,12 +322,16 @@ def normalize_provider(name: str) -> str:


 def get_provider(name: str) -> Optional[ProviderDef]:
-    """Look up a provider by id or alias, merging all data sources.
+    """Look up a built-in provider by id or alias.

    Resolution order:
      1. Hermes overlays (for providers not in models.dev: nous, openai-codex, etc.)
      2. models.dev catalog + Hermes overlay
-      3. User-defined providers from config (TODO: Phase 4)
+
+    User-defined providers from config.yaml (``providers:`` / ``custom_providers:``)
+    are resolved by :func:`resolve_provider_full`, which layers ``resolve_user_provider``
+    and ``resolve_custom_provider`` on top of this function. Callers that need
+    user-config support should use ``resolve_provider_full`` instead.

    Returns a fully-resolved ProviderDef or None.
    """
@@ -38,14 +38,21 @@ def _normalize_custom_provider_name(value: str) -> str:
 def _detect_api_mode_for_url(base_url: str) -> Optional[str]:
    """Auto-detect api_mode from the resolved base URL.

-    Direct api.openai.com endpoints need the Responses API for GPT-5.x
-    tool calls with reasoning (chat/completions returns 400).
+    - Direct api.openai.com endpoints need the Responses API for GPT-5.x
+      tool calls with reasoning (chat/completions returns 400).
+    - Third-party Anthropic-compatible gateways (MiniMax, Zhipu GLM,
+      LiteLLM proxies, etc.) conventionally expose the native Anthropic
+      protocol under a ``/anthropic`` suffix — treat those as
+      ``anthropic_messages`` transport instead of the default
+      ``chat_completions``.
    """
    normalized = (base_url or "").strip().lower().rstrip("/")
    if "api.x.ai" in normalized:
        return "codex_responses"
    if "api.openai.com" in normalized and "openrouter" not in normalized:
        return "codex_responses"
+    if normalized.endswith("/anthropic"):
+        return "anthropic_messages"
    return None


@@ -194,8 +201,12 @@ def _resolve_runtime_from_pool_entry(
        elif provider in ("opencode-zen", "opencode-go"):
            from hermes_cli.models import opencode_model_api_mode
            api_mode = opencode_model_api_mode(provider, model_cfg.get("default", ""))
-        elif base_url.rstrip("/").endswith("/anthropic"):
-            api_mode = "anthropic_messages"
+        else:
+            # Auto-detect Anthropic-compatible endpoints (/anthropic suffix,
+            # api.openai.com → codex_responses, api.x.ai → codex_responses).
+            detected = _detect_api_mode_for_url(base_url)
+            if detected:
+                api_mode = detected

    # OpenCode base URLs end with /v1 for OpenAI-compatible models, but the
    # Anthropic SDK prepends its own /v1/messages to the base_url.  Strip the
@@ -642,8 +653,11 @@ def _resolve_explicit_runtime(
            configured_mode = _parse_api_mode(model_cfg.get("api_mode"))
            if configured_mode:
                api_mode = configured_mode
-            elif base_url.rstrip("/").endswith("/anthropic"):
-                api_mode = "anthropic_messages"
+            else:
+                # Auto-detect Anthropic-compatible endpoints (/anthropic suffix).
+                detected = _detect_api_mode_for_url(base_url)
+                if detected:
+                    api_mode = detected

        return {
            "provider": provider,
@@ -965,10 +979,13 @@ def resolve_runtime_provider(
            elif provider in ("opencode-zen", "opencode-go"):
                from hermes_cli.models import opencode_model_api_mode
                api_mode = opencode_model_api_mode(provider, model_cfg.get("default", ""))
-            # Auto-detect Anthropic-compatible endpoints by URL convention
-            # (e.g. https://api.minimax.io/anthropic, https://dashscope.../anthropic)
-            elif base_url.rstrip("/").endswith("/anthropic"):
-                api_mode = "anthropic_messages"
+            else:
+                # Auto-detect Anthropic-compatible endpoints by URL convention
+                # (e.g. https://api.minimax.io/anthropic, https://dashscope.../anthropic)
+                # plus api.openai.com → codex_responses and api.x.ai → codex_responses.
+                detected = _detect_api_mode_for_url(base_url)
+                if detected:
+                    api_mode = detected
        # Strip trailing /v1 for OpenCode Anthropic models (see comment above).
        if api_mode == "anthropic_messages" and provider in ("opencode-zen", "opencode-go"):
            base_url = re.sub(r"/v1/?$", "", base_url)
@@ -89,9 +89,8 @@ _DEFAULT_PROVIDER_MODELS = {
        "grok-code-fast-1",
    ],
    "gemini": [
-        "gemini-3.1-pro-preview", "gemini-3-flash-preview", "gemini-3.1-flash-lite-preview",
-        "gemini-2.5-pro", "gemini-2.5-flash", "gemini-2.5-flash-lite",
-        "gemma-4-31b-it",
+        "gemini-3.1-pro-preview", "gemini-3-pro-preview",
+        "gemini-3-flash-preview", "gemini-3.1-flash-lite-preview",
    ],
    "zai": ["glm-5.1", "glm-5", "glm-4.7", "glm-4.5", "glm-4.5-flash"],
    "kimi-coding": ["kimi-k2.5", "kimi-k2-thinking", "kimi-k2-turbo-preview"],
@@ -1461,7 +1460,9 @@ def setup_agent_settings(config: dict):
    )
    print_info("Maximum tool-calling iterations per conversation.")
    print_info("Higher = more complex tasks, but costs more tokens.")
-    print_info("Default is 90, which works for most tasks. Use 150+ for open exploration.")
+    print_info(
+        f"Press Enter to keep {current_max}. Use 90 for most tasks or 150+ for open exploration."
+    )

    max_iter_str = prompt("Max iterations", current_max)
    try:
@@ -0,0 +1,82 @@
+from __future__ import annotations
+
+
+def _coerce_timeout(raw: object) -> float | None:
+    try:
+        timeout = float(raw)
+    except (TypeError, ValueError):
+        return None
+    if timeout <= 0:
+        return None
+    return timeout
+
+
+def get_provider_request_timeout(
+    provider_id: str, model: str | None = None
+) -> float | None:
+    """Return a configured provider request timeout in seconds, if any."""
+    if not provider_id:
+        return None
+
+    try:
+        from hermes_cli.config import load_config
+    except ImportError:
+        return None
+
+    config = load_config()
+    providers = config.get("providers", {}) if isinstance(config, dict) else {}
+    provider_config = (
+        providers.get(provider_id, {}) if isinstance(providers, dict) else {}
+    )
+    if not isinstance(provider_config, dict):
+        return None
+
+    model_config = _get_model_config(provider_config, model)
+    if model_config is not None:
+        timeout = _coerce_timeout(model_config.get("timeout_seconds"))
+        if timeout is not None:
+            return timeout
+
+    return _coerce_timeout(provider_config.get("request_timeout_seconds"))
+
+
+def get_provider_stale_timeout(
+    provider_id: str, model: str | None = None
+) -> float | None:
+    """Return a configured non-stream stale timeout in seconds, if any."""
+    if not provider_id:
+        return None
+
+    try:
+        from hermes_cli.config import load_config
+    except ImportError:
+        return None
+
+    config = load_config()
+    providers = config.get("providers", {}) if isinstance(config, dict) else {}
+    provider_config = (
+        providers.get(provider_id, {}) if isinstance(providers, dict) else {}
+    )
+    if not isinstance(provider_config, dict):
+        return None
+
+    model_config = _get_model_config(provider_config, model)
+    if model_config is not None:
+        timeout = _coerce_timeout(model_config.get("stale_timeout_seconds"))
+        if timeout is not None:
+            return timeout
+
+    return _coerce_timeout(provider_config.get("stale_timeout_seconds"))
+
+
+def _get_model_config(
+    provider_config: dict[str, object], model: str | None
+) -> dict[str, object] | None:
+    if not model:
+        return None
+
+    models = provider_config.get("models", {})
+    model_config = models.get(model, {}) if isinstance(models, dict) else {}
+    if isinstance(model_config, dict):
+        return model_config
+    return None
@@ -245,7 +245,7 @@ TIPS = [
    "Three plugin types: general (tools/hooks), memory providers, and context engines.",
    "hermes plugins install owner/repo installs plugins directly from GitHub.",
    "8 external memory providers available: Honcho, OpenViking, Mem0, Hindsight, and more.",
-    "Plugin hooks include pre_tool_call, post_tool_call, pre_llm_call, and post_llm_call.",
+    "Plugin hooks include pre/post_tool_call, pre/post_llm_call, and transform_terminal_output for output canonicalization.",

    # --- Miscellaneous ---
    "Prompt caching (Anthropic) reduces costs by reusing cached system prompt prefixes.",
@@ -323,7 +323,6 @@ TIPS = [
    "GPT-5 and Codex use 'developer' role instead of 'system' in the message format.",
    "Per-task auxiliary overrides: auxiliary.vision.provider, auxiliary.compression.model, etc. in config.yaml.",
    "The auxiliary client treats 'main' as a provider alias — resolves to your actual primary provider + model.",
-    "Smart routing can auto-route simple queries to a cheaper model — set smart_model_routing.enabled: true.",
    "hermes claw migrate --dry-run previews OpenClaw migration without writing anything.",
    "File paths pasted with quotes or escaped spaces are handled automatically — no manual cleanup needed.",
    "Slash commands never trigger the large-paste collapse — /command with big arguments works correctly.",
@@ -346,4 +345,3 @@ def get_random_tip(exclude_recent: int = 0) -> str:
    return random.choice(TIPS)


-
@@ -118,59 +118,166 @@ def remove_wrapper_script():


 def uninstall_gateway_service():
-    """Stop and uninstall the gateway service if running."""
+    """Stop and uninstall the gateway service (systemd, launchd) and kill any
+    standalone gateway processes.
+
+    Delegates to the gateway module which handles:
+    - Linux: user + system systemd services (with proper DBUS env setup)
+    - macOS: launchd plists
+    - All platforms: standalone ``hermes gateway run`` processes
+    - Termux/Android: skips systemd (no systemd on Android), still kills standalone processes
+    """
    import platform
-    
-    if platform.system() != "Linux":
-        return False
+    stopped_something = False

-    prefix = os.getenv("PREFIX", "")
-    if os.getenv("TERMUX_VERSION") or "com.termux/files/usr" in prefix:
-        return False
-    
+    # 1. Kill any standalone gateway processes (all platforms, including Termux)
    try:
-        from hermes_cli.gateway import get_service_name
-        svc_name = get_service_name()
-    except Exception:
-        svc_name = "hermes-gateway"
-
-    service_file = Path.home() / ".config" / "systemd" / "user" / f"{svc_name}.service"
-    
-    if not service_file.exists():
-        return False
-    
-    try:
-        # Stop the service
-        subprocess.run(
-            ["systemctl", "--user", "stop", svc_name],
-            capture_output=True,
-            check=False
-        )
-        
-        # Disable the service
-        subprocess.run(
-            ["systemctl", "--user", "disable", svc_name],
-            capture_output=True,
-            check=False
-        )
-        
-        # Remove service file
-        service_file.unlink()
-        
-        # Reload systemd
-        subprocess.run(
-            ["systemctl", "--user", "daemon-reload"],
-            capture_output=True,
-            check=False
-        )
-        
-        return True
-        
+        from hermes_cli.gateway import kill_gateway_processes, find_gateway_pids
+        pids = find_gateway_pids()
+        if pids:
+            killed = kill_gateway_processes()
+            if killed:
+                log_success(f"Killed {killed} running gateway process(es)")
+                stopped_something = True
    except Exception as e:
-        log_warn(f"Could not fully remove gateway service: {e}")
+        log_warn(f"Could not check for gateway processes: {e}")
+
+    system = platform.system()
+
+    # Termux/Android has no systemd and no launchd — nothing left to do.
+    prefix = os.getenv("PREFIX", "")
+    is_termux = bool(os.getenv("TERMUX_VERSION") or "com.termux/files/usr" in prefix)
+    if is_termux:
+        return stopped_something
+
+    # 2. Linux: uninstall systemd services (both user and system scopes)
+    if system == "Linux":
+        try:
+            from hermes_cli.gateway import (
+                get_systemd_unit_path,
+                get_service_name,
+                _systemctl_cmd,
+            )
+            svc_name = get_service_name()
+
+            for is_system in (False, True):
+                unit_path = get_systemd_unit_path(system=is_system)
+                if not unit_path.exists():
+                    continue
+
+                scope = "system" if is_system else "user"
+                try:
+                    if is_system and os.geteuid() != 0:
+                        log_warn(f"System gateway service exists at {unit_path} "
+                                 f"but needs sudo to remove")
+                        continue
+
+                    cmd = _systemctl_cmd(is_system)
+                    subprocess.run(cmd + ["stop", svc_name],
+                                   capture_output=True, check=False)
+                    subprocess.run(cmd + ["disable", svc_name],
+                                   capture_output=True, check=False)
+                    unit_path.unlink()
+                    subprocess.run(cmd + ["daemon-reload"],
+                                   capture_output=True, check=False)
+                    log_success(f"Removed {scope} gateway service ({unit_path})")
+                    stopped_something = True
+                except Exception as e:
+                    log_warn(f"Could not remove {scope} gateway service: {e}")
+        except Exception as e:
+            log_warn(f"Could not check systemd gateway services: {e}")
+
+    # 3. macOS: uninstall launchd plist
+    elif system == "Darwin":
+        try:
+            from hermes_cli.gateway import get_launchd_plist_path
+            plist_path = get_launchd_plist_path()
+            if plist_path.exists():
+                subprocess.run(["launchctl", "unload", str(plist_path)],
+                               capture_output=True, check=False)
+                plist_path.unlink()
+                log_success(f"Removed macOS gateway service ({plist_path})")
+                stopped_something = True
+        except Exception as e:
+            log_warn(f"Could not remove launchd gateway service: {e}")
+
+    return stopped_something
+
+
+def _is_default_hermes_home(hermes_home: Path) -> bool:
+    """Return True when ``hermes_home`` points at the default (non-profile) root."""
+    try:
+        from hermes_constants import get_default_hermes_root
+        return hermes_home.resolve() == get_default_hermes_root().resolve()
+    except Exception:
        return False


+def _discover_named_profiles():
+    """Return a list of ``ProfileInfo`` for every non-default profile, or ``[]``
+    if profile support is unavailable or nothing is installed beyond the
+    default root."""
+    try:
+        from hermes_cli.profiles import list_profiles
+    except Exception:
+        return []
+    try:
+        return [p for p in list_profiles() if not getattr(p, "is_default", False)]
+    except Exception as e:
+        log_warn(f"Could not enumerate profiles: {e}")
+        return []
+
+
+def _uninstall_profile(profile) -> None:
+    """Fully uninstall a single named profile: stop its gateway service,
+    remove its alias wrapper, and wipe its HERMES_HOME directory.
+
+    We shell out to ``hermes -p <name> gateway stop|uninstall`` because
+    service names, unit paths, and plist paths are all derived from the
+    current HERMES_HOME and can't be easily switched in-process.
+    """
+    import sys as _sys
+    name = profile.name
+    profile_home = profile.path
+
+    log_info(f"Uninstalling profile '{name}'...")
+
+    # 1. Stop and remove this profile's gateway service.
+    #    Use `python -m hermes_cli.main` so we don't depend on a `hermes`
+    #    wrapper that may be half-removed mid-uninstall.
+    hermes_invocation = [_sys.executable, "-m", "hermes_cli.main", "--profile", name]
+    for subcmd in ("stop", "uninstall"):
+        try:
+            subprocess.run(
+                hermes_invocation + ["gateway", subcmd],
+                capture_output=True,
+                text=True,
+                timeout=60,
+                check=False,
+            )
+        except subprocess.TimeoutExpired:
+            log_warn(f"  Gateway {subcmd} timed out for '{name}'")
+        except Exception as e:
+            log_warn(f"  Could not run gateway {subcmd} for '{name}': {e}")
+
+    # 2. Remove the wrapper alias script at ~/.local/bin/<name> (if any).
+    alias_path = getattr(profile, "alias_path", None)
+    if alias_path and alias_path.exists():
+        try:
+            alias_path.unlink()
+            log_success(f"  Removed alias {alias_path}")
+        except Exception as e:
+            log_warn(f"  Could not remove alias {alias_path}: {e}")
+
+    # 3. Wipe the profile's HERMES_HOME directory.
+    try:
+        if profile_home.exists():
+            shutil.rmtree(profile_home)
+            log_success(f"  Removed {profile_home}")
+    except Exception as e:
+        log_warn(f"  Could not remove {profile_home}: {e}")
+
+
 def run_uninstall(args):
    """
    Run the uninstall process.
@@ -181,7 +288,13 @@ def run_uninstall(args):
    """
    project_root = get_project_root()
    hermes_home = get_hermes_home()
-    
+
+    # Detect named profiles when uninstalling from the default root —
+    # offer to clean them up too instead of leaving zombie HERMES_HOMEs
+    # and systemd units behind.
+    is_default_profile = _is_default_hermes_home(hermes_home)
+    named_profiles = _discover_named_profiles() if is_default_profile else []
+
    print()
    print(color("┌─────────────────────────────────────────────────────────┐", Colors.MAGENTA, Colors.BOLD))
    print(color("│            ⚕ Hermes Agent Uninstaller                  │", Colors.MAGENTA, Colors.BOLD))
@@ -195,6 +308,13 @@ def run_uninstall(args):
    print(f"  Secrets: {hermes_home / '.env'}")
    print(f"  Data:    {hermes_home / 'cron/'}, {hermes_home / 'sessions/'}, {hermes_home / 'logs/'}")
    print()
+
+    if named_profiles:
+        print(color("Other profiles detected:", Colors.CYAN, Colors.BOLD))
+        for p in named_profiles:
+            running = " (gateway running)" if getattr(p, "gateway_running", False) else ""
+            print(f"  • {p.name}{running}: {p.path}")
+        print()
    
    # Ask for confirmation
    print(color("Uninstall Options:", Colors.YELLOW, Colors.BOLD))
@@ -221,12 +341,40 @@ def run_uninstall(args):
        return
    
    full_uninstall = (choice == "2")
-    
+
+    # When doing a full uninstall from the default profile, also offer to
+    # remove any named profiles — stopping their gateway services, unlinking
+    # their alias wrappers, and wiping their HERMES_HOME dirs. Otherwise
+    # those leave zombie services and data behind.
+    remove_profiles = False
+    if full_uninstall and named_profiles:
+        print()
+        print(color("Other profiles will NOT be removed by default.", Colors.YELLOW))
+        print(f"Found {len(named_profiles)} named profile(s): " +
+              ", ".join(p.name for p in named_profiles))
+        print()
+        try:
+            resp = input(color(
+                f"Also stop and remove these {len(named_profiles)} profile(s)? [y/N]: ",
+                Colors.BOLD
+            )).strip().lower()
+        except (KeyboardInterrupt, EOFError):
+            print()
+            print("Cancelled.")
+            return
+        remove_profiles = resp in ("y", "yes")
+
    # Final confirmation
    print()
    if full_uninstall:
        print(color("⚠️  WARNING: This will permanently delete ALL Hermes data!", Colors.RED, Colors.BOLD))
        print(color("   Including: configs, API keys, sessions, scheduled jobs, logs", Colors.RED))
+        if remove_profiles:
+            print(color(
+                f"   Plus {len(named_profiles)} profile(s): " +
+                ", ".join(p.name for p in named_profiles),
+                Colors.RED
+            ))
    else:
        print("This will remove the Hermes code but keep your configuration and data.")
    
@@ -247,12 +395,10 @@ def run_uninstall(args):
    print(color("Uninstalling...", Colors.CYAN, Colors.BOLD))
    print()
    
-    # 1. Stop and uninstall gateway service
-    log_info("Checking for gateway service...")
-    if uninstall_gateway_service():
-        log_success("Gateway service stopped and removed")
-    else:
-        log_info("No gateway service found")
+    # 1. Stop and uninstall gateway service + kill standalone processes
+    log_info("Checking for running gateway...")
+    if not uninstall_gateway_service():
+        log_info("No gateway service or processes found")
    
    # 2. Remove PATH entries from shell configs
    log_info("Removing PATH entries from shell configs...")
@@ -291,8 +437,17 @@ def run_uninstall(args):
        log_warn(f"Could not fully remove {project_root}: {e}")
        log_info("You may need to manually remove it")
    
-    # 5. Optionally remove ~/.hermes/ data directory
+    # 5. Optionally remove ~/.hermes/ data directory (and named profiles)
    if full_uninstall:
+        # 5a. Stop and remove each named profile's gateway service and
+        #     alias wrapper. The profile HERMES_HOME dirs live under
+        #     ``<default>/profiles/<name>/`` and will be swept away by the
+        #     rmtree below, but services + alias scripts live OUTSIDE the
+        #     default root and have to be cleaned up explicitly.
+        if remove_profiles and named_profiles:
+            for prof in named_profiles:
+                _uninstall_profile(prof)
+
        log_info("Removing configuration and data...")
        try:
            if hermes_home.exists():
@@ -59,7 +59,7 @@ except ImportError:
        f"Install with: {sys.executable} -m pip install 'fastapi' 'uvicorn[standard]'"
    )

-WEB_DIST = Path(__file__).parent / "web_dist"
+WEB_DIST = Path(os.environ["HERMES_WEB_DIST"]) if "HERMES_WEB_DIST" in os.environ else Path(__file__).parent / "web_dist"
 _log = logging.getLogger(__name__)

 app = FastAPI(title="Hermes Agent", version=__version__)
@@ -232,8 +232,8 @@ _CATEGORY_MERGE: Dict[str, str] = {
    "checkpoints": "agent",
    "approvals": "security",
    "human_delay": "display",
-    "smart_model_routing": "agent",
    "dashboard": "display",
+    "code_execution": "agent",
 }

 # Display order for tabs — unlisted categories sort alphabetically after these.
@@ -155,6 +155,15 @@ def _cmd_subscribe(args):
        "created_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    }

+    if getattr(args, "deliver_only", False):
+        if route["deliver"] == "log":
+            print(
+                "Error: --deliver-only requires --deliver to be a real target "
+                "(telegram, discord, slack, github_comment, etc.) — not 'log'."
+            )
+            return
+        route["deliver_only"] = True
+
    if args.deliver_chat_id:
        route["deliver_extra"] = {"chat_id": args.deliver_chat_id}

@@ -172,9 +181,12 @@ def _cmd_subscribe(args):
    else:
        print("  Events: (all)")
    print(f"  Deliver: {route['deliver']}")
+    if route.get("deliver_only"):
+        print("  Mode: direct delivery (no agent, zero LLM cost)")
    if route.get("prompt"):
        prompt_preview = route["prompt"][:80] + ("..." if len(route["prompt"]) > 80 else "")
-        print(f"  Prompt: {prompt_preview}")
+        label = "Message" if route.get("deliver_only") else "Prompt"
+        print(f"  {label}: {prompt_preview}")
    print(f"\n  Configure your service to POST to the URL above.")
    print(f"  Use the secret for HMAC-SHA256 signature validation.")
    print(f"  The gateway must be running to receive events (hermes gateway run).\n")
@@ -192,6 +204,8 @@ def _cmd_list(args):
    for name, route in subs.items():
        events = ", ".join(route.get("events", [])) or "(all)"
        deliver = route.get("deliver", "log")
+        if route.get("deliver_only"):
+            deliver = f"{deliver} (direct — no agent)"
        desc = route.get("description", "")
        print(f"  ◆ {name}")
        if desc:
@@ -383,10 +383,19 @@ class SessionDB:
        return session_id

    def end_session(self, session_id: str, end_reason: str) -> None:
-        """Mark a session as ended."""
+        """Mark a session as ended.
+
+        No-ops when the session is already ended. The first end_reason wins:
+        compression-split sessions must keep their ``end_reason = 'compression'``
+        record even if a later stale ``end_session()`` call (e.g. from a
+        desynced CLI session_id after ``/resume`` or ``/branch``) targets them
+        with a different reason. Use ``reopen_session()`` first if you
+        intentionally need to re-end a closed session with a new reason.
+        """
        def _do(conn):
            conn.execute(
-                "UPDATE sessions SET ended_at = ?, end_reason = ? WHERE id = ?",
+                "UPDATE sessions SET ended_at = ?, end_reason = ? "
+                "WHERE id = ? AND ended_at IS NULL",
                (time.time(), end_reason, session_id),
            )
        self._execute_write(_do)
@@ -714,6 +723,42 @@ class SessionDB:

        return f"{base} #{max_num + 1}"

+    def get_compression_tip(self, session_id: str) -> Optional[str]:
+        """Walk the compression-continuation chain forward and return the tip.
+
+        A compression continuation is a child session where:
+        1. The parent's ``end_reason = 'compression'``
+        2. The child was created AFTER the parent was ended (started_at >= ended_at)
+
+        The second condition distinguishes compression continuations from
+        delegate subagents or branch children, which can also have a
+        ``parent_session_id`` but were created while the parent was still live.
+
+        Returns the session_id of the latest continuation in the chain, or the
+        input ``session_id`` if it isn't part of a compression chain (or if the
+        input itself doesn't exist).
+        """
+        current = session_id
+        # Bound the walk defensively — compression chains this deep are
+        # pathological and shouldn't happen in practice. 100 = plenty.
+        for _ in range(100):
+            with self._lock:
+                cursor = self._conn.execute(
+                    "SELECT id FROM sessions "
+                    "WHERE parent_session_id = ? "
+                    "  AND started_at >= ("
+                    "      SELECT ended_at FROM sessions "
+                    "      WHERE id = ? AND end_reason = 'compression'"
+                    "  ) "
+                    "ORDER BY started_at DESC LIMIT 1",
+                    (current, current),
+                )
+                row = cursor.fetchone()
+            if row is None:
+                return current
+            current = row["id"]
+        return current
+
    def list_sessions_rich(
        self,
        source: str = None,
@@ -721,6 +766,7 @@ class SessionDB:
        limit: int = 20,
        offset: int = 0,
        include_children: bool = False,
+        project_compression_tips: bool = True,
    ) -> List[Dict[str, Any]]:
        """List sessions with preview (first user message) and last active timestamp.

@@ -732,6 +778,14 @@ class SessionDB:

        By default, child sessions (subagent runs, compression continuations)
        are excluded.  Pass ``include_children=True`` to include them.
+
+        With ``project_compression_tips=True`` (default), sessions that are
+        roots of compression chains are projected forward to their latest
+        continuation — one logical conversation = one list entry, showing the
+        live continuation's id/message_count/title/last_active. This prevents
+        compressed continuations from being invisible to users while keeping
+        delegate subagents and branches hidden. Pass ``False`` to return the
+        raw root rows (useful for admin/debug UIs).
        """
        where_clauses = []
        params = []
@@ -782,8 +836,77 @@ class SessionDB:
                s["preview"] = ""
            sessions.append(s)

+        # Project compression roots forward to their tips. Each row whose
+        # end_reason is 'compression' has a continuation child; replace the
+        # surfaced fields (id, message_count, title, last_active, ended_at,
+        # end_reason, preview) with the tip's values so the list entry acts
+        # as the live conversation. Keep the root's started_at to preserve
+        # chronological ordering by original conversation start.
+        if project_compression_tips and not include_children:
+            projected = []
+            for s in sessions:
+                if s.get("end_reason") != "compression":
+                    projected.append(s)
+                    continue
+                tip_id = self.get_compression_tip(s["id"])
+                if tip_id == s["id"]:
+                    projected.append(s)
+                    continue
+                tip_row = self._get_session_rich_row(tip_id)
+                if not tip_row:
+                    projected.append(s)
+                    continue
+                # Preserve the root's started_at for stable sort order, but
+                # surface the tip's identity and activity data.
+                merged = dict(s)
+                for key in (
+                    "id", "ended_at", "end_reason", "message_count",
+                    "tool_call_count", "title", "last_active", "preview",
+                    "model", "system_prompt",
+                ):
+                    if key in tip_row:
+                        merged[key] = tip_row[key]
+                merged["_lineage_root_id"] = s["id"]
+                projected.append(merged)
+            sessions = projected
+
        return sessions

+    def _get_session_rich_row(self, session_id: str) -> Optional[Dict[str, Any]]:
+        """Fetch a single session with the same enriched columns as
+        ``list_sessions_rich`` (preview + last_active). Returns None if the
+        session doesn't exist.
+        """
+        query = """
+            SELECT s.*,
+                COALESCE(
+                    (SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
+                     FROM messages m
+                     WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
+                     ORDER BY m.timestamp, m.id LIMIT 1),
+                    ''
+                ) AS _preview_raw,
+                COALESCE(
+                    (SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
+                    s.started_at
+                ) AS last_active
+            FROM sessions s
+            WHERE s.id = ?
+        """
+        with self._lock:
+            cursor = self._conn.execute(query, (session_id,))
+            row = cursor.fetchone()
+        if not row:
+            return None
+        s = dict(row)
+        raw = s.pop("_preview_raw", "").strip()
+        if raw:
+            text = raw[:60]
+            s["preview"] = text + ("..." if len(raw) > 60 else "")
+        else:
+            s["preview"] = ""
+        return s
+
    # =========================================================================
    # Message storage
    # =========================================================================
@@ -43,13 +43,16 @@ from dotenv import load_dotenv
 load_dotenv()


-def _effective_temperature_for_model(model: str) -> Optional[float]:
+def _effective_temperature_for_model(
+    model: str,
+    base_url: Optional[str] = None,
+) -> Optional[float]:
    """Return a fixed temperature for models with strict sampling contracts."""
    try:
        from agent.auxiliary_client import _fixed_temperature_for_model
    except Exception:
        return None
-    return _fixed_temperature_for_model(model)
+    return _fixed_temperature_for_model(model, base_url)



@@ -457,7 +460,10 @@ Complete the user's task step by step."""
                        "tools": self.tools,
                        "timeout": 300.0,
                    }
-                    fixed_temperature = _effective_temperature_for_model(self.model)
+                    fixed_temperature = _effective_temperature_for_model(
+                        self.model,
+                        str(getattr(self.client, "base_url", "") or ""),
+                    )
                    if fixed_temperature is not None:
                        api_kwargs["temperature"] = fixed_temperature

@@ -282,6 +282,31 @@ def get_tool_definitions(
                filtered_tools[i] = {"type": "function", "function": dynamic_schema}
                break

+    # Rebuild discord_server schema based on the bot's privileged intents
+    # (detected from GET /applications/@me) and the user's action allowlist
+    # in config.  Hides actions the bot's intents don't support so the
+    # model never attempts them, and annotates fetch_messages when the
+    # MESSAGE_CONTENT intent is missing.
+    if "discord_server" in available_tool_names:
+        try:
+            from tools.discord_tool import get_dynamic_schema
+            dynamic = get_dynamic_schema()
+        except Exception:  # pragma: no cover — defensive, fall back to static
+            dynamic = None
+        if dynamic is None:
+            # Tool filtered out entirely (empty allowlist or detection disabled
+            # the only remaining actions).  Drop it from the schema list.
+            filtered_tools = [
+                t for t in filtered_tools
+                if t.get("function", {}).get("name") != "discord_server"
+            ]
+            available_tool_names.discard("discord_server")
+        else:
+            for i, td in enumerate(filtered_tools):
+                if td.get("function", {}).get("name") == "discord_server":
+                    filtered_tools[i] = {"type": "function", "function": dynamic}
+                    break
+
    # Strip web tool cross-references from browser_navigate description when
    # web_search / web_extract are not available.  The static schema says
    # "prefer web_search or web_extract" which causes the model to hallucinate
@@ -525,6 +550,30 @@ def handle_function_call(
        except Exception:
            pass

+        # Generic tool-result canonicalization seam: plugins receive the
+        # final result string (JSON, usually) and may replace it by
+        # returning a string from transform_tool_result. Runs after
+        # post_tool_call (which stays observational) and before the result
+        # is appended back into conversation context. Fail-open; the first
+        # valid string return wins; non-string returns are ignored.
+        try:
+            from hermes_cli.plugins import invoke_hook
+            hook_results = invoke_hook(
+                "transform_tool_result",
+                tool_name=function_name,
+                args=function_args,
+                result=result,
+                task_id=task_id or "",
+                session_id=session_id or "",
+                tool_call_id=tool_call_id or "",
+            )
+            for hook_result in hook_results:
+                if isinstance(hook_result, str):
+                    result = hook_result
+                    break
+        except Exception:
+            pass
+
        return result

    except Exception as e:
@@ -37,7 +37,30 @@ json.dump(sorted(leaf_paths(DEFAULT_CONFIG)), sys.stdout, indent=2)
    in {
      packages.configKeys = configKeys;

-      checks = lib.optionalAttrs pkgs.stdenv.hostPlatform.isLinux {
+      checks = {
+        # Cross-platform evaluation — catches "not supported for interpreter"
+        # errors (e.g. sphinx dropping python311) without needing a darwin builder.
+        # Evaluation is pure and instant; it doesn't build anything.
+        cross-eval = let
+          targetSystems = builtins.filter
+            (s: inputs.self.packages ? ${s})
+            [ "x86_64-linux" "aarch64-linux" "aarch64-darwin" "x86_64-darwin" ];
+          tryEvalPkg = sys:
+            let pkg = inputs.self.packages.${sys}.default;
+            in builtins.tryEval (builtins.seq pkg.drvPath true);
+          results = map (sys: { inherit sys; result = tryEvalPkg sys; }) targetSystems;
+          failures = builtins.filter (r: !r.result.success) results;
+          failMsg = lib.concatMapStringsSep "\n" (r: "  - ${r.sys}") failures;
+        in pkgs.runCommand "hermes-cross-eval" { } (
+          if failures != [] then
+            builtins.throw "Package fails to evaluate on:\n${failMsg}"
+          else ''
+            echo "PASS: package evaluates on all ${toString (builtins.length targetSystems)} platforms"
+            mkdir -p $out
+            echo "ok" > $out/result
+          ''
+        );
+      } // lib.optionalAttrs pkgs.stdenv.hostPlatform.isLinux {
        # Verify binaries exist and are executable
        package-contents = pkgs.runCommand "hermes-package-contents" { } ''
          set -e
@@ -125,6 +148,29 @@ json.dump(sorted(leaf_paths(DEFAULT_CONFIG)), sys.stdout, indent=2)
          echo "ok" > $out/result
        '';

+        # Verify HERMES_NODE is set in wrapper and points to Node 20+
+        # (string-width uses the /v regex flag which requires Node 20+)
+        hermes-node = pkgs.runCommand "hermes-node-version" { } ''
+          set -e
+          echo "=== Checking HERMES_NODE in wrapper ==="
+          grep -q "HERMES_NODE" ${hermes-agent}/bin/hermes || \
+            (echo "FAIL: HERMES_NODE not set in wrapper"; exit 1)
+          echo "PASS: HERMES_NODE present in wrapper"
+
+          HERMES_NODE=$(sed -n "s/^export HERMES_NODE='\(.*\)'/\1/p" ${hermes-agent}/bin/hermes)
+          test -x "$HERMES_NODE" || (echo "FAIL: HERMES_NODE=$HERMES_NODE not executable"; exit 1)
+          echo "PASS: HERMES_NODE executable at $HERMES_NODE"
+
+          NODE_MAJOR=$("$HERMES_NODE" --version | sed 's/^v//' | cut -d. -f1)
+          test "$NODE_MAJOR" -ge 20 || \
+            (echo "FAIL: Node v$NODE_MAJOR < 20, TUI needs /v regex flag support"; exit 1)
+          echo "PASS: Node v$NODE_MAJOR >= 20"
+
+          echo "=== All HERMES_NODE checks passed ==="
+          mkdir -p $out
+          echo "ok" > $out/result
+        '';
+
        # Verify HERMES_MANAGED guard works on all mutation commands
        managed-guard = pkgs.runCommand "hermes-managed-guard" { } ''
          set -e
@@ -12,7 +12,7 @@
      devShells.default = pkgs.mkShell {
        inputsFrom = packages;
        packages = with pkgs; [
-          python311 uv nodejs_22 ripgrep git openssh ffmpeg
+          python312 uv nodejs_22 ripgrep git openssh ffmpeg
        ];

        shellHook = let
@@ -121,11 +121,19 @@
      # ── Provision apt packages (first boot only, cached in writable layer) ──
      # sudo: agent self-modification
      # nodejs/npm: writable node so npm i -g works (nix store copies are read-only)
-      # curl: needed for uv installer
+      #   Node 22 via NodeSource — Ubuntu 24.04 ships Node 18 which is EOL.
+      # curl: needed for uv installer + NodeSource setup
      if [ ! -f /var/lib/hermes-tools-provisioned ] && command -v apt-get >/dev/null 2>&1; then
        echo "First boot: provisioning agent tools..."
        apt-get update -qq
-        apt-get install -y -qq sudo nodejs npm curl
+        apt-get install -y -qq sudo curl ca-certificates gnupg
+        mkdir -p /etc/apt/keyrings
+        curl -fsSL https://deb.nodesource.com/gpgkey/nodesource-repo.gpg.key \
+          | gpg --dearmor -o /etc/apt/keyrings/nodesource.gpg
+        echo "deb [signed-by=/etc/apt/keyrings/nodesource.gpg] https://deb.nodesource.com/node_22.x nodistro main" \
+          > /etc/apt/sources.list.d/nodesource.list
+        apt-get update -qq
+        apt-get install -y -qq nodejs
        touch /var/lib/hermes-tools-provisioned
      fi

@@ -140,15 +148,14 @@
        su -s /bin/sh "$TARGET_USER" -c 'curl -LsSf https://astral.sh/uv/install.sh | sh' || true
      fi

-      # Python 3.11 venv — gives the agent a writable Python with pip.
-      # Uses uv to install Python 3.11 (Ubuntu 24.04 ships 3.12).
+      # Python 3.12 venv — gives the agent a writable Python with pip.
      # --seed includes pip/setuptools so bare `pip install` works.
      _UV_BIN="$TARGET_HOME/.local/bin/uv"
      if [ ! -d "$TARGET_HOME/.venv" ] && [ -x "$_UV_BIN" ]; then
        su -s /bin/sh "$TARGET_USER" -c "
          export PATH=\"\$HOME/.local/bin:\$PATH\"
-          uv python install 3.11
-          uv venv --python 3.11 --seed \"\$HOME/.venv\"
+          uv python install 3.12
+          uv venv --python 3.12 --seed \"\$HOME/.venv\"
        " || true
      fi

@@ -171,7 +178,7 @@
    # Package and entrypoint use stable symlinks (current-package, current-entrypoint)
    # so they can update without recreation. Env vars go through $HERMES_HOME/.env.
    containerIdentity = builtins.hashString "sha256" (builtins.toJSON {
-      schema = 3; # bump when identity inputs change
+      schema = 4; # bump when identity inputs change (4: Node 18→22 via NodeSource)
      image = cfg.container.image;
      extraVolumes = cfg.container.extraVolumes;
      extraOptions = cfg.container.extraOptions;
@@ -18,6 +18,10 @@
        filter = path: _type: !(pkgs.lib.hasInfix "/index-cache/" path);
      };

+      hermesWeb = pkgs.callPackage ./web.nix {
+        npm-lockfile-fix = inputs'.npm-lockfile-fix.packages.default;
+      };
+
      runtimeDeps = with pkgs; [
        nodejs_22
        ripgrep
@@ -52,6 +56,7 @@

            mkdir -p $out/share/hermes-agent $out/bin
            cp -r ${bundledSkills} $out/share/hermes-agent/skills
+            cp -r ${hermesWeb} $out/share/hermes-agent/web_dist

            # copy pre-built TUI (same layout as dev: ui-tui/dist/ + node_modules/)
            mkdir -p $out/ui-tui
@@ -62,8 +67,10 @@
                makeWrapper ${hermesVenv}/bin/${name} $out/bin/${name} \
                  --suffix PATH : "${runtimePath}" \
                  --set HERMES_BUNDLED_SKILLS $out/share/hermes-agent/skills \
+                  --set HERMES_WEB_DIST $out/share/hermes-agent/web_dist \
                  --set HERMES_TUI_DIR $out/ui-tui \
-                  --set HERMES_PYTHON ${hermesVenv}/bin/python3
+                  --set HERMES_PYTHON ${hermesVenv}/bin/python3 \
+                  --set HERMES_NODE ${pkgs.nodejs_22}/bin/node
              '')
              [
                "hermes"
@@ -80,7 +87,7 @@
            STAMP_VALUE="${pyprojectHash}:${uvLockHash}"
            if [ ! -f "$STAMP" ] || [ "$(cat "$STAMP")" != "$STAMP_VALUE" ]; then
              echo "hermes-agent: installing Python dependencies..."
-              uv venv .venv --python ${pkgs.python311}/bin/python3 2>/dev/null || true
+              uv venv .venv --python ${pkgs.python312}/bin/python3 2>/dev/null || true
              source .venv/bin/activate
              uv pip install -e ".[all]"
              [ -d mini-swe-agent ] && uv pip install -e ./mini-swe-agent 2>/dev/null || true
@@ -103,6 +110,7 @@
        };

        tui = hermesTui;
+        web = hermesWeb;
      };
    };
 }
@@ -1,6 +1,6 @@
 # nix/python.nix — uv2nix virtual environment builder
 {
-  python311,
+  python312,
  lib,
  callPackage,
  uv2nix,
@@ -35,30 +35,46 @@ let
      };
    };

+  # Legacy alibabacloud packages ship only sdists with setup.py/setup.cfg
+  # and no pyproject.toml, so setuptools isn't declared as a build dep.
+  buildSystemOverrides = final: prev: builtins.mapAttrs
+    (name: _: prev.${name}.overrideAttrs (old: {
+      nativeBuildInputs = (old.nativeBuildInputs or [ ]) ++ [ final.setuptools ];
+    }))
+    (lib.genAttrs [
+      "alibabacloud-credentials-api"
+      "alibabacloud-endpoint-util"
+      "alibabacloud-gateway-dingtalk"
+      "alibabacloud-gateway-spi"
+      "alibabacloud-tea"
+    ] (_: null));
+
  pythonPackageOverrides = final: _prev:
    if isAarch64Darwin then {
-      numpy = mkPrebuiltOverride final python311.pkgs.numpy { };
+      numpy = mkPrebuiltOverride final python312.pkgs.numpy { };

-      av = mkPrebuiltOverride final python311.pkgs.av { };
+      pyarrow = mkPrebuiltOverride final python312.pkgs.pyarrow { };

-      humanfriendly = mkPrebuiltOverride final python311.pkgs.humanfriendly { };
+      av = mkPrebuiltOverride final python312.pkgs.av { };

-      coloredlogs = mkPrebuiltOverride final python311.pkgs.coloredlogs {
+      humanfriendly = mkPrebuiltOverride final python312.pkgs.humanfriendly { };
+
+      coloredlogs = mkPrebuiltOverride final python312.pkgs.coloredlogs {
        humanfriendly = [ ];
      };

-      onnxruntime = mkPrebuiltOverride final python311.pkgs.onnxruntime {
+      onnxruntime = mkPrebuiltOverride final python312.pkgs.onnxruntime {
        coloredlogs = [ ];
        numpy = [ ];
        packaging = [ ];
      };

-      ctranslate2 = mkPrebuiltOverride final python311.pkgs.ctranslate2 {
+      ctranslate2 = mkPrebuiltOverride final python312.pkgs.ctranslate2 {
        numpy = [ ];
        pyyaml = [ ];
      };

-      faster-whisper = mkPrebuiltOverride final python311.pkgs.faster-whisper {
+      faster-whisper = mkPrebuiltOverride final python312.pkgs.faster-whisper {
        av = [ ];
        ctranslate2 = [ ];
        huggingface-hub = [ ];
@@ -70,11 +86,12 @@ let

  pythonSet =
    (callPackage pyproject-nix.build.packages {
-      python = python311;
+      python = python312;
    }).overrideScope
      (lib.composeManyExtensions [
        pyproject-build-systems.overlays.default
        overlay
+        buildSystemOverrides
        pythonPackageOverrides
      ]);
 in
@@ -4,7 +4,7 @@ let
  src = ../ui-tui;
  npmDeps = pkgs.fetchNpmDeps {
    inherit src;
-    hash = "sha256-zsUPmbC6oMUO10EhS3ptvDjwlfpCSEmrkjyeORw7fac=";
+    hash = "sha256-mG3vpgGi4ljt4X3XIf3I/5mIcm+rVTUAmx2DQ6YVA90=";
  };

  packageJson = builtins.fromJSON (builtins.readFile (src + "/package.json"));
@@ -18,11 +18,6 @@ pkgs.buildNpmPackage {

  doCheck = false;

-  postPatch = ''
-    # fetchNpmDeps strips the trailing newline; match it so the diff passes
-    sed -i -z 's/\n$//' package-lock.json
-  '';
-
  installPhase = ''
    runHook preInstall

@@ -0,0 +1,63 @@
+# nix/web.nix — Hermes Web Dashboard (Vite/React) frontend build
+{ pkgs, npm-lockfile-fix, ... }:
+let
+  src = ../web;
+  npmDeps = pkgs.fetchNpmDeps {
+    inherit src;
+    hash = "sha256-Y0pOzdFG8BLjfvCLmsvqYpjxFjAQabXp1i7X9W/cCU4=";
+  };
+
+  npmLockHash = builtins.hashString "sha256" (builtins.readFile ../web/package-lock.json);
+in
+pkgs.buildNpmPackage {
+  pname = "hermes-web";
+  version = "0.0.0";
+  inherit src npmDeps;
+
+  doCheck = false;
+
+  buildPhase = ''
+    npx tsc -b
+    npx vite build --outDir dist
+  '';
+
+  installPhase = ''
+    runHook preInstall
+    cp -r dist $out
+    runHook postInstall
+  '';
+
+  nativeBuildInputs = [
+    (pkgs.writeShellScriptBin "update_web_lockfile" ''
+      set -euox pipefail
+
+      REPO_ROOT=$(git rev-parse --show-toplevel)
+
+      cd "$REPO_ROOT/web"
+      rm -rf node_modules/
+      npm cache clean --force
+      CI=true npm install
+      ${pkgs.lib.getExe npm-lockfile-fix} ./package-lock.json
+
+      NIX_FILE="$REPO_ROOT/nix/web.nix"
+      sed -i "s/hash = \"[^\"]*\";/hash = \"\";/" $NIX_FILE
+      NIX_OUTPUT=$(nix build .#web 2>&1 || true)
+      NEW_HASH=$(echo "$NIX_OUTPUT" | grep 'got:' | awk '{print $2}')
+      echo got new hash $NEW_HASH
+      sed -i "s|hash = \"[^\"]*\";|hash = \"$NEW_HASH\";|" $NIX_FILE
+      nix build .#web
+      echo "Updated npm hash in $NIX_FILE to $NEW_HASH"
+    '')
+  ];
+
+  passthru.devShellHook = ''
+    STAMP=".nix-stamps/hermes-web"
+    STAMP_VALUE="${npmLockHash}"
+    if [ ! -f "$STAMP" ] || [ "$(cat "$STAMP")" != "$STAMP_VALUE" ]; then
+      echo "hermes-web: installing npm dependencies..."
+      cd web && CI=true npm install --silent --no-fund --no-audit 2>/dev/null && cd ..
+      mkdir -p .nix-stamps
+      echo "$STAMP_VALUE" > "$STAMP"
+    fi
+  '';
+}
@@ -145,10 +145,10 @@ Controls **how often** dialectic and context calls happen.
 | Key | Default | Description |
 |-----|---------|-------------|
 | `contextCadence` | `1` | Min turns between context API calls |
-| `dialecticCadence` | `3` | Min turns between dialectic API calls |
+| `dialecticCadence` | `2` | Min turns between dialectic API calls. Recommended 1–5 |
 | `injectionFrequency` | `every-turn` | `every-turn` or `first-turn` for base context injection |

-Higher cadence values reduce API calls and cost. `dialecticCadence: 3` (default) means the dialectic engine fires at most every 3rd turn.
+Higher cadence values fire the dialectic LLM less often. `dialecticCadence: 2` means the engine fires every other turn. Setting it to `1` fires every turn.

 ### Depth (how many)

@@ -180,6 +180,8 @@ If `dialecticDepthLevels` is omitted, rounds use **proportional levels** derived

 This keeps earlier passes cheap while using full depth on the final synthesis.

+**Depth at session start.** The session-start prewarm runs the full configured `dialecticDepth` in the background before turn 1. A single-pass prewarm on a cold peer often returns thin output — multi-pass depth runs the audit/reconcile cycle before the user ever speaks. Turn 1 consumes the prewarm result directly; if prewarm hasn't landed in time, turn 1 falls back to a synchronous call with a bounded timeout.
+
 ### Level (how hard)

 Controls the **intensity** of each dialectic reasoning round.
@@ -368,7 +370,7 @@ Config file: `$HERMES_HOME/honcho.json` (profile-local) or `~/.honcho/config.jso
 | `contextTokens` | uncapped | Max tokens for the combined base context injection (summary + representation + card). Opt-in cap — omit to leave uncapped, set to an integer to bound injection size. |
 | `injectionFrequency` | `every-turn` | `every-turn` or `first-turn` |
 | `contextCadence` | `1` | Min turns between context API calls |
-| `dialecticCadence` | `3` | Min turns between dialectic LLM calls |
+| `dialecticCadence` | `2` | Min turns between dialectic LLM calls (recommended 1–5) |

 The `contextTokens` budget is enforced at injection time. If the session summary + representation + card exceed the budget, Honcho trims the summary first, then the representation, preserving the card. This prevents context blowup in long sessions.

@@ -0,0 +1,339 @@
+---
+name: touchdesigner-mcp
+description: "Control a running TouchDesigner instance via twozero MCP — create operators, set parameters, wire connections, execute Python, build real-time visuals. 36 native tools."
+version: 1.0.0
+author: kshitijk4poor
+license: MIT
+metadata:
+  hermes:
+    tags: [TouchDesigner, MCP, twozero, creative-coding, real-time-visuals, generative-art, audio-reactive, VJ, installation, GLSL]
+    related_skills: [native-mcp, ascii-video, manim-video, hermes-video]
+
+---
+
+# TouchDesigner Integration (twozero MCP)
+
+## CRITICAL RULES
+
+1. **NEVER guess parameter names.** Call `td_get_par_info` for the op type FIRST. Your training data is wrong for TD 2025.32.
+2. **If `tdAttributeError` fires, STOP.** Call `td_get_operator_info` on the failing node before continuing.
+3. **NEVER hardcode absolute paths** in script callbacks. Use `me.parent()` / `scriptOp.parent()`.
+4. **Prefer native MCP tools over td_execute_python.** Use `td_create_operator`, `td_set_operator_pars`, `td_get_errors` etc. Only fall back to `td_execute_python` for complex multi-step logic.
+5. **Call `td_get_hints` before building.** It returns patterns specific to the op type you're working with.
+
+## Architecture
+
+```
+Hermes Agent -> MCP (Streamable HTTP) -> twozero.tox (port 40404) -> TD Python
+```
+
+36 native tools. Free plugin (no payment/license — confirmed April 2026).
+Context-aware (knows selected OP, current network).
+Hub health check: `GET http://localhost:40404/mcp` returns JSON with instance PID, project name, TD version.
+
+## Setup (Automated)
+
+Run the setup script to handle everything:
+
+```bash
+bash "${HERMES_HOME:-$HOME/.hermes}/skills/creative/touchdesigner-mcp/scripts/setup.sh"
+```
+
+The script will:
+1. Check if TD is running
+2. Download twozero.tox if not already cached
+3. Add `twozero_td` MCP server to Hermes config (if missing)
+4. Test the MCP connection on port 40404
+5. Report what manual steps remain (drag .tox into TD, enable MCP toggle)
+
+### Manual steps (one-time, cannot be automated)
+
+1. **Drag `~/Downloads/twozero.tox` into the TD network editor** → click Install
+2. **Enable MCP:** click twozero icon → Settings → mcp → "auto start MCP" → Yes
+3. **Restart Hermes session** to pick up the new MCP server
+
+After setup, verify:
+```bash
+nc -z 127.0.0.1 40404 && echo "twozero MCP: READY"
+```
+
+## Environment Notes
+
+- **Non-Commercial TD** caps resolution at 1280×1280. Use `outputresolution = 'custom'` and set width/height explicitly.
+- **Codecs:** `prores` (preferred on macOS) or `mjpa` as fallback. H.264/H.265/AV1 require a Commercial license.
+- Always call `td_get_par_info` before setting params — names vary by TD version (see CRITICAL RULES #1).
+
+## Workflow
+
+### Step 0: Discover (before building anything)
+
+```
+Call td_get_par_info with op_type for each type you plan to use.
+Call td_get_hints with the topic you're building (e.g. "glsl", "audio reactive", "feedback").
+Call td_get_focus to see where the user is and what's selected.
+Call td_get_network to see what already exists.
+```
+
+No temp nodes, no cleanup. This replaces the old discovery dance entirely.
+
+### Step 1: Clean + Build
+
+**IMPORTANT: Split cleanup and creation into SEPARATE MCP calls.** Destroying and recreating same-named nodes in one `td_execute_python` script causes "Invalid OP object" errors. See pitfalls #11b.
+
+Use `td_create_operator` for each node (handles viewport positioning automatically):
+
+```
+td_create_operator(type="noiseTOP", parent="/project1", name="bg", parameters={"resolutionw": 1280, "resolutionh": 720})
+td_create_operator(type="levelTOP", parent="/project1", name="brightness")
+td_create_operator(type="nullTOP", parent="/project1", name="out")
+```
+
+For bulk creation or wiring, use `td_execute_python`:
+
+```python
+# td_execute_python script:
+root = op('/project1')
+nodes = []
+for name, optype in [('bg', noiseTOP), ('fx', levelTOP), ('out', nullTOP)]:
+    n = root.create(optype, name)
+    nodes.append(n.path)
+# Wire chain
+for i in range(len(nodes)-1):
+    op(nodes[i]).outputConnectors[0].connect(op(nodes[i+1]).inputConnectors[0])
+result = {'created': nodes}
+```
+
+### Step 2: Set Parameters
+
+Prefer the native tool (validates params, won't crash):
+
+```
+td_set_operator_pars(path="/project1/bg", parameters={"roughness": 0.6, "monochrome": true})
+```
+
+For expressions or modes, use `td_execute_python`:
+
+```python
+op('/project1/time_driver').par.colorr.expr = "absTime.seconds % 1000.0"
+```
+
+### Step 3: Wire
+
+Use `td_execute_python` — no native wire tool exists:
+
+```python
+op('/project1/bg').outputConnectors[0].connect(op('/project1/fx').inputConnectors[0])
+```
+
+### Step 4: Verify
+
+```
+td_get_errors(path="/project1", recursive=true)
+td_get_perf()
+td_get_operator_info(path="/project1/out", detail="full")
+```
+
+### Step 5: Display / Capture
+
+```
+td_get_screenshot(path="/project1/out")
+```
+
+Or open a window via script:
+
+```python
+win = op('/project1').create(windowCOMP, 'display')
+win.par.winop = op('/project1/out').path
+win.par.winw = 1280; win.par.winh = 720
+win.par.winopen.pulse()
+```
+
+## MCP Tool Quick Reference
+
+**Core (use these most):**
+| Tool | What |
+|------|------|
+| `td_execute_python` | Run arbitrary Python in TD. Full API access. |
+| `td_create_operator` | Create node with params + auto-positioning |
+| `td_set_operator_pars` | Set params safely (validates, won't crash) |
+| `td_get_operator_info` | Inspect one node: connections, params, errors |
+| `td_get_operators_info` | Inspect multiple nodes in one call |
+| `td_get_network` | See network structure at a path |
+| `td_get_errors` | Find errors/warnings recursively |
+| `td_get_par_info` | Get param names for an OP type (replaces discovery) |
+| `td_get_hints` | Get patterns/tips before building |
+| `td_get_focus` | What network is open, what's selected |
+
+**Read/Write:**
+| Tool | What |
+|------|------|
+| `td_read_dat` | Read DAT text content |
+| `td_write_dat` | Write/patch DAT content |
+| `td_read_chop` | Read CHOP channel values |
+| `td_read_textport` | Read TD console output |
+
+**Visual:**
+| Tool | What |
+|------|------|
+| `td_get_screenshot` | Capture one OP viewer to file |
+| `td_get_screenshots` | Capture multiple OPs at once |
+| `td_get_screen_screenshot` | Capture actual screen via TD |
+| `td_navigate_to` | Jump network editor to an OP |
+
+**Search:**
+| Tool | What |
+|------|------|
+| `td_find_op` | Find ops by name/type across project |
+| `td_search` | Search code, expressions, string params |
+
+**System:**
+| Tool | What |
+|------|------|
+| `td_get_perf` | Performance profiling (FPS, slow ops) |
+| `td_list_instances` | List all running TD instances |
+| `td_get_docs` | In-depth docs on a TD topic |
+| `td_agents_md` | Read/write per-COMP markdown docs |
+| `td_reinit_extension` | Reload extension after code edit |
+| `td_clear_textport` | Clear console before debug session |
+
+**Input Automation:**
+| Tool | What |
+|------|------|
+| `td_input_execute` | Send mouse/keyboard to TD |
+| `td_input_status` | Poll input queue status |
+| `td_input_clear` | Stop input automation |
+| `td_op_screen_rect` | Get screen coords of a node |
+| `td_click_screen_point` | Click a point in a screenshot |
+
+See `references/mcp-tools.md` for full parameter schemas.
+
+## Key Implementation Rules
+
+**GLSL time:** No `uTDCurrentTime` in GLSL TOP. Use the Values page:
+```python
+# Call td_get_par_info(op_type="glslTOP") first to confirm param names
+td_set_operator_pars(path="/project1/shader", parameters={"value0name": "uTime"})
+# Then set expression via script:
+# op('/project1/shader').par.value0.expr = "absTime.seconds"
+# In GLSL: uniform float uTime;
+```
+
+Fallback: Constant TOP in `rgba32float` format (8-bit clamps to 0-1, freezing the shader).
+
+**Feedback TOP:** Use `top` parameter reference, not direct input wire. "Not enough sources" resolves after first cook. "Cook dependency loop" warning is expected.
+
+**Resolution:** Non-Commercial caps at 1280×1280. Use `outputresolution = 'custom'`.
+
+**Large shaders:** Write GLSL to `/tmp/file.glsl`, then use `td_write_dat` or `td_execute_python` to load.
+
+**Vertex/Point access (TD 2025.32):** `point.P[0]`, `point.P[1]`, `point.P[2]` — NOT `.x`, `.y`, `.z`.
+
+**Extensions:** `ext0object` format is `"op('./datName').module.ClassName(me)"` in CONSTANT mode. After editing extension code with `td_write_dat`, call `td_reinit_extension`.
+
+**Script callbacks:** ALWAYS use relative paths via `me.parent()` / `scriptOp.parent()`.
+
+**Cleaning nodes:** Always `list(root.children)` before iterating + `child.valid` check.
+
+## Recording / Exporting Video
+
+```python
+# via td_execute_python:
+root = op('/project1')
+rec = root.create(moviefileoutTOP, 'recorder')
+op('/project1/out').outputConnectors[0].connect(rec.inputConnectors[0])
+rec.par.type = 'movie'
+rec.par.file = '/tmp/output.mov'
+rec.par.videocodec = 'prores'  # Apple ProRes — NOT license-restricted on macOS
+rec.par.record = True   # start
+# rec.par.record = False  # stop (call separately later)
+```
+
+H.264/H.265/AV1 need Commercial license. Use `prores` on macOS or `mjpa` as fallback.
+Extract frames: `ffmpeg -i /tmp/output.mov -vframes 120 /tmp/frames/frame_%06d.png`
+
+**TOP.save() is useless for animation** — captures same GPU texture every time. Always use MovieFileOut.
+
+### Before Recording: Checklist
+
+1. **Verify FPS > 0** via `td_get_perf`. If FPS=0 the recording will be empty. See pitfalls #38-39.
+2. **Verify shader output is not black** via `td_get_screenshot`. Black output = shader error or missing input. See pitfalls #8, #40.
+3. **If recording with audio:** cue audio to start first, then delay recording by 3 frames. See pitfalls #19.
+4. **Set output path before starting record** — setting both in the same script can race.
+
+## Audio-Reactive GLSL (Proven Recipe)
+
+### Correct signal chain (tested April 2026)
+
+```
+AudioFileIn CHOP (playmode=sequential)
+  → AudioSpectrum CHOP (FFT=512, outputmenu=setmanually, outlength=256, timeslice=ON)
+  → Math CHOP (gain=10)
+  → CHOP to TOP (dataformat=r, layout=rowscropped)
+  → GLSL TOP input 1 (spectrum texture, 256x2)
+
+Constant TOP (rgba32float, time) → GLSL TOP input 0
+GLSL TOP → Null TOP → MovieFileOut
+```
+
+### Critical audio-reactive rules (empirically verified)
+
+1. **TimeSlice must stay ON** for AudioSpectrum. OFF = processes entire audio file → 24000+ samples → CHOP to TOP overflow.
+2. **Set Output Length manually** to 256 via `outputmenu='setmanually'` and `outlength=256`. Default outputs 22050 samples.
+3. **DO NOT use Lag CHOP for spectrum smoothing.** Lag CHOP operates in timeslice mode and expands 256 samples to 2400+, averaging all values to near-zero (~1e-06). The shader receives no usable data. This was the #1 audio sync failure in testing.
+4. **DO NOT use Filter CHOP either** — same timeslice expansion problem with spectrum data.
+5. **Smoothing belongs in the GLSL shader** if needed, via temporal lerp with a feedback texture: `mix(prevValue, newValue, 0.3)`. This gives frame-perfect sync with zero pipeline latency.
+6. **CHOP to TOP dataformat = 'r'**, layout = 'rowscropped'. Spectrum output is 256x2 (stereo). Sample at y=0.25 for first channel.
+7. **Math gain = 10** (not 5). Raw spectrum values are ~0.19 in bass range. Gain of 10 gives usable ~5.0 for the shader.
+8. **No Resample CHOP needed.** Control output size via AudioSpectrum's `outlength` param directly.
+
+### GLSL spectrum sampling
+
+```glsl
+// Input 0 = time (1x1 rgba32float), Input 1 = spectrum (256x2)
+float iTime = texture(sTD2DInputs[0], vec2(0.5)).r;
+
+// Sample multiple points per band and average for stability:
+// NOTE: y=0.25 for first channel (stereo texture is 256x2, first row center is 0.25)
+float bass = (texture(sTD2DInputs[1], vec2(0.02, 0.25)).r +
+              texture(sTD2DInputs[1], vec2(0.05, 0.25)).r) / 2.0;
+float mid  = (texture(sTD2DInputs[1], vec2(0.2, 0.25)).r +
+              texture(sTD2DInputs[1], vec2(0.35, 0.25)).r) / 2.0;
+float hi   = (texture(sTD2DInputs[1], vec2(0.6, 0.25)).r +
+              texture(sTD2DInputs[1], vec2(0.8, 0.25)).r) / 2.0;
+```
+
+See `references/network-patterns.md` for complete build scripts + shader code.
+
+## Operator Quick Reference
+
+| Family | Color | Python class / MCP type | Suffix |
+|--------|-------|-------------|--------|
+| TOP | Purple | noiseTOP, glslTOP, compositeTOP, levelTop, blurTOP, textTOP, nullTOP | TOP |
+| CHOP | Green | audiofileinCHOP, audiospectrumCHOP, mathCHOP, lfoCHOP, constantCHOP | CHOP |
+| SOP | Blue | gridSOP, sphereSOP, transformSOP, noiseSOP | SOP |
+| DAT | White | textDAT, tableDAT, scriptDAT, webserverDAT | DAT |
+| MAT | Yellow | phongMAT, pbrMAT, glslMAT, constMAT | MAT |
+| COMP | Gray | geometryCOMP, containerCOMP, cameraCOMP, lightCOMP, windowCOMP | COMP |
+
+## Security Notes
+
+- MCP runs on localhost only (port 40404). No authentication — any local process can send commands.
+- `td_execute_python` has unrestricted access to the TD Python environment and filesystem as the TD process user.
+- `setup.sh` downloads twozero.tox from the official 404zero.com URL. Verify the download if concerned.
+- The skill never sends data outside localhost. All MCP communication is local.
+
+## References
+
+| File | What |
+|------|------|
+| `references/pitfalls.md` | Hard-won lessons from real sessions |
+| `references/operators.md` | All operator families with params and use cases |
+| `references/network-patterns.md` | Recipes: audio-reactive, generative, GLSL, instancing |
+| `references/mcp-tools.md` | Full twozero MCP tool parameter schemas |
+| `references/python-api.md` | TD Python: op(), scripting, extensions |
+| `references/troubleshooting.md` | Connection diagnostics, debugging |
+| `scripts/setup.sh` | Automated setup script |
+
+---
+
+> You're not writing code. You're conducting light.
@@ -0,0 +1,382 @@
+# twozero MCP Tools Reference
+
+36 tools from twozero MCP v2.774+ (April 2026).
+All tools accept an optional `target_instance` param for multi-TD-instance scenarios.
+
+## Execution & Scripting
+
+### td_execute_python
+
+Execute Python code inside TouchDesigner and return the result. Has full access to TD Python API (op, project, app, etc). Print statements and the last expression value are captured. Best for: wiring connections (inputConnectors), setting expressions (par.X.expr/mode), querying parameter names, and batch creation scripts (5+ operators). For creating 1-4 operators, prefer td_create_operator instead.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `code` | string | yes | Python code to execute in TouchDesigner |
+
+## Network & Structure
+
+### td_get_network
+
+Get the operator network structure in TouchDesigner (TD) at a given path. Returns compact list: name OPType flags. First line is full path of queried op. Flags: ch:N=children count, !cook=allowCooking off, bypass, private=isPrivate, blocked:reason, "comment text". depth=0 (default) = current level only. depth=1 = one level of children (indented). To explore deeper, call again on a specific COMP path. System operators (/ui, /sys) are hidden by default.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `path` | string | no | Network path to inspect, e.g. '/' or '/project1' |
+| `depth` | integer | no | How many levels deep to recurse. 0=current level only (recommended), 1=include direct children of COMPs |
+| `includeSystem` | boolean | no | Include system operators (/ui, /sys). Default false. |
+| `nodeXY` | boolean | no | Include nodeX,nodeY coordinates. Default false. |
+
+### td_create_operator
+
+Create a new operator (node) in TouchDesigner (TD). Preferred way to create operators — handles viewport positioning, viewer flag, and docked ops automatically. For batch creation (5+ ops), you may use td_execute_python with a script instead, but then call td_get_hints('construction') first for correct parameter names and layout rules. Supports all TD operator types: TOP, CHOP, SOP, DAT, COMP, MAT. If parent is omitted, creates in the currently open network at the user's viewport position. When building a container: first create baseCOMP (no parent), then create children with parent=compPath.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `type` | string | yes | Operator type, e.g. 'textDAT', 'constantCHOP', 'noiseTOP', 'transformTOP', 'baseCOMP' |
+| `parent` | string | no | Path to the parent operator. If omitted, uses the currently open network in TD. |
+| `name` | string | no | Name for the new operator (optional, TD auto-names if omitted) |
+| `parameters` | object | no | Key-value pairs of parameters to set on the created operator |
+
+### td_find_op
+
+Find operators by name and/or type across the project. Returns TSV: path, OPType, flags. Flags: bypass, !cook, private, blocked:reason. Use td_search to search inside code/expressions; use td_find_op to find operators themselves.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `name` | string | no | Substring to match in operator name (case-insensitive). E.g. 'noise' finds noise1, noise2, myNoise. |
+| `type` | string | no | Substring to match in OPType (case-insensitive). E.g. 'noiseTOP', 'baseCOMP', 'CHOP'. Use exact type for precision or partial for broader matches. |
+| `root` | string | no | Root operator path to search from. Default '/project1'. |
+| `max_results` | number | no | Maximum results to return. Default 50. |
+| `max_depth` | number | no | Max recursion depth from root. Default unlimited. |
+| `detail` | `basic` / `summary` | no | Result detail level. 'basic' = name/path/type (fast). 'summary' = + connections, non-default pars, expressions. Default 'basic'. |
+
+### td_search
+
+Search for text across all code (DAT scripts), parameter expressions, and string parameter values in the TD project. Returns TSV: path, kind (code/expression/parameter/ref), line, text. JSON when context>0. Words are OR-matched. Use quotes for exact phrases: 'GetLogin "op('login')"'. Use count_only=true to quickly check if something is referenced without fetching full results.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `query` | string | yes | Search query. Multiple words = OR (any match). Wrap in quotes for exact phrase. Example: 'GetLogin getLogin' finds either. |
+| `root` | string | no | Root operator path to search from. Default '/project1'. |
+| `scope` | `all` / `code` / `editable` / `expressions` / `parameters` | no | What to search. 'code' = DAT scripts only (fast, ~0.05s). 'editable' = only editable code (skips inherited/ref DATs). 'expressions' = parameter expressions only. 'parameters' = string parameter values only. 'all' = everything (slow, ~1.5s due to parameter scan). Default 'all'. |
+| `case_sensitive` | boolean | no | Case-sensitive matching. Default false. |
+| `max_results` | number | no | Maximum results to return. Default 50. |
+| `context` | number | no | Lines to show before/after each code match. Saves td_read_dat calls. Default 0. |
+| `count_only` | boolean | no | Return only match count, not results. Fast existence check. |
+| `max_depth` | number | no | Max recursion depth from root. Default unlimited. |
+
+### td_navigate_to
+
+Navigate the TouchDesigner Network Editor viewport to show a specific operator. Opens the operator's parent network and centers the view on it. Use this to show the user where a problem is, or to navigate to an operator before modifying it.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `path` | string | yes | Path to the operator to navigate to, e.g. '/project1/noise1' |
+
+## Operator Inspection
+
+### td_get_operator_info
+
+Get information about a specific operator (node) in TouchDesigner (TD). detail='summary': connections, non-default pars, expressions, CHOP channels (compact). detail='full': all of the above PLUS every parameter with value/default/label.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `path` | string | yes | Full path to the operator, e.g. '/project1/noise1' |
+| `detail` | `summary` / `full` | no | Level of detail. 'summary' = connections, expressions, non-default pars, custom pars (pulse marked), CHOP channels. 'full' = summary + all parameters. Default 'full'. |
+
+### td_get_operators_info
+
+Get information about multiple operators in one call. Returns an array of operator info objects. Use instead of calling td_get_operator_info multiple times.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `paths` | array | yes | Array of full operator paths, e.g. ['/project1/null1', '/project1/null2'] |
+| `detail` | `summary` / `full` | no | Level of detail. Default 'summary'. |
+
+### td_get_par_info
+
+Get parameter names and details for a TouchDesigner operator type. Without specific pars: returns compact list of all parameters with their names, types, and menu options. With pars: returns full details (help text, menu values, style) for specific parameters. Use this when you need to know exact parameter names before setting them.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `op_type` | string | yes | TD operator type name, e.g. 'noiseTOP', 'blurTOP', 'lfoCHOP', 'compositeTOP' |
+| `pars` | array | no | Optional list of specific parameter names to get full details for |
+
+## Parameter Setting
+
+### td_set_operator_pars
+
+Set parameters and flags on an operator in TouchDesigner (TD). Safer than td_execute_python for simple parameter changes. Can set values, toggle bypass/viewer, without writing Python code.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `path` | string | yes | Path to the operator |
+| `parameters` | object | no | Key-value pairs of parameters to set |
+| `bypass` | boolean | no | Set bypass state of the operator (not available on COMPs) |
+| `viewer` | boolean | no | Set viewer state of the operator |
+| `allowCooking` | boolean | no | Set cooking flag on a COMP. When False, internal network stops cooking (0 CPU). COMP-only. |
+
+## Data Read/Write
+
+### td_read_dat
+
+Read the text content of a DAT operator in TouchDesigner (TD). Returns content with line numbers. Use to read scripts, extensions, GLSL shaders, table data.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `path` | string | yes | Path to the DAT operator |
+| `start_line` | integer | no | Start line (1-based). Omit to read from beginning. |
+| `end_line` | integer | no | End line (inclusive). Omit to read to end. |
+
+### td_write_dat
+
+Write or patch text content of a DAT operator in TouchDesigner (TD). Can do full replacement or StrReplace-style patching (old_text -> new_text). Use for editing scripts, extensions, shaders. Does NOT reinit extensions automatically.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `path` | string | yes | Path to the DAT operator |
+| `text` | string | no | Full replacement text. Use this OR old_text+new_text, not both. |
+| `old_text` | string | no | Text to find and replace (must be unique in the DAT) |
+| `new_text` | string | no | Replacement text |
+| `replace_all` | boolean | no | If true, replaces ALL occurrences of old_text (default: false, requires unique match) |
+
+### td_read_chop
+
+Read CHOP channel sample data. Returns channel values as arrays. Use when you need the actual sample values (animation curves, lookup tables, waveforms), not just the summary from td_get_operator_info.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `path` | string | yes | Path to the CHOP operator |
+| `channels` | array | no | Channel names to read. Omit to read all channels. |
+| `start` | integer | no | Start sample index (0-based). Omit to read from beginning. |
+| `end` | integer | no | End sample index (inclusive). Omit to read to end. |
+
+### td_read_textport
+
+Read the last N lines from the TouchDesigner (TD) log/textport (console output). Use this to see errors, warnings and print output from TD.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `lines` | integer | no | Number of recent lines to return |
+
+### td_clear_textport
+
+Clear the MCP textport log buffer. Use this before starting a debug session or an edit-run-check loop to keep td_read_textport output focused and minimal.
+
+No parameters (other than optional `target_instance`).
+
+## Visual Capture
+
+### td_get_screenshot
+
+Get a screenshot of an operator's viewer in TouchDesigner (TD). Saves the image to a file and returns the file path. Use your file-reading tool to view the image. Shows what the operator looks like in its viewer (TOP output, CHOP waveform graph, SOP geometry, DAT table, parameter UI, etc). Use this to visually inspect any operator, or to generate images via TD for use in your project. TWO-STEP ASYNC USAGE: Step 1 — call with 'path' to start: returns {'status': 'pending', 'requestId': '...'}. Step 2 — call with 'request_id' to retrieve: returns {'file': '/tmp/.../opname_id.jpg'}. Then read the file to see the image. If step 2 still returns pending, make one other tool call then retry.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `path` | string | no | Full operator path to screenshot, e.g. '/project1/noise1'. Required for step 1. |
+| `request_id` | string | no | Request ID from step 1 to retrieve the completed screenshot. |
+| `max_size` | integer | no | Max pixel size for the longer side (default 512). Use 0 for original operator resolution (useful for pixel-accurate UI work). Higher values (e.g. 1024) for more detail. |
+| `output_path` | string | no | Optional absolute path where the image should be saved (e.g. '/Users/me/project/render.png'). If omitted, saved to /tmp/pisang_mcp/screenshots/. Use absolute paths — TD's working directory may differ from the agent's. |
+| `as_top` | boolean | no | If true, captures the operator directly as a TOP (bypasses the viewer renderer), preserving alpha/transparency. Only works for TOP operators — if the target is not a TOP, falls back to the viewer automatically. Use this when you need a clean PNG with alpha, e.g. to save a generated image for use in another project. |
+| `format` | `auto` / `jpg` / `png` | no | Image format. 'auto' (default): JPEG for viewer mode, PNG for as_top=true. 'jpg': always JPEG (smaller). 'png': always PNG (lossless). |
+
+### td_get_screenshots
+
+Get screenshots of multiple operators in one batch. Saves images to files and returns file paths. Use your file-reading tool to view images. TWO-STEP ASYNC USAGE: Step 1 — call with 'paths' array to start: returns {'status': 'pending', 'batchId': '...', 'total': N}. Step 2 — call with 'batch_id' to retrieve: returns {'files': [{op, file}, ...]}. Then read the files to see the images. If still processing returns {'status': 'pending', 'ready': K, 'total': N}.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `paths` | array | no | List of full operator paths to screenshot. Required for step 1. |
+| `batch_id` | string | no | Batch ID from step 1 to retrieve completed screenshots. |
+| `max_size` | integer | no | Max pixel size for longer side (default 512). Use 0 for original resolution. |
+| `as_top` | boolean | no | If true, captures TOP operators directly (preserves alpha). Non-TOP operators fall back to viewer. |
+| `output_dir` | string | no | Optional absolute path to a directory. Each screenshot saved as <opname>.jpg or .png inside it and kept on disk. |
+| `format` | `auto` / `jpg` / `png` | no | Image format. 'auto' (default): JPEG for viewer mode, PNG for as_top=true. 'jpg': always JPEG (smaller). 'png': always PNG (lossless). |
+
+### td_get_screen_screenshot
+
+Capture a screenshot of the actual screen via TD's screenGrabTOP. Saves the image to a file and returns the file path. Use your file-reading tool to view the image. Unlike td_get_screenshot (operator viewer), this shows what the user literally sees on their monitor — TD windows, UI panels, everything. Use when simulating mouse/keyboard input to verify what happened on screen. Workflow: td_get_screen_screenshot → read file → td_input_execute → wait idle → td_get_screen_screenshot again. TWO-STEP ASYNC: Step 1 — call without request_id: returns {'status':'pending','requestId':'...'}. Step 2 — call with request_id: returns {'file': '/tmp/.../screen_id.jpg', 'info': '...metadata...'}. Then read the file to see the image. The requestId also stays usable with td_screen_point_to_global for later coordinate lookup. crop_x/y/w/h are in ACTUAL SCREEN PIXELS (not image pixels). Crops exceeding screen bounds are auto-clamped. SMART DEFAULTS: max_size is auto when omitted — 1920 for full screen (good overview), max(crop_w,crop_h) for cropped (guarantees 1:1 scale). At 1:1 scale: screen_coord = crop_origin + image_pixel. Otherwise use the formula from metadata.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `request_id` | string | no | Request ID from step 1 to retrieve the completed screenshot. |
+| `max_size` | integer | no | Max pixel size for the longer side. Auto when omitted: 1920 for full screen, max(crop_w,crop_h) for cropped (1:1). Set explicitly to override. |
+| `crop_x` | integer | no | Left edge in screen pixels. |
+| `crop_y` | integer | no | Top edge in screen pixels (y=0 at top of screen). |
+| `crop_w` | integer | no | Width in pixels. |
+| `crop_h` | integer | no | Height in pixels. |
+| `display` | integer | no | Screen index (default 0 = primary display). |
+
+## Context & Focus
+
+### td_get_focus
+
+Get the current user focus in TouchDesigner (TD): which network is open, selected operators, current operator, and rollover (what is under the mouse cursor). IMPORTANT: when the user says 'this operator' or 'вот этот', they mean the SELECTED/CURRENT operator, NOT the rollover. Rollover is just incidental mouse position and should be ignored for intent. Pass screenshots=true to immediately start a screenshot batch for all selected operators — response includes a 'screenshots' field with batchId; retrieve with td_get_screenshots(batch_id=...).
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `screenshots` | boolean | no | If true, start a screenshot batch for all selected operators. Retrieve with td_get_screenshots(batch_id=...). |
+| `max_size` | integer | no | Max screenshot size when screenshots=true (default 512). |
+| `as_top` | boolean | no | Passed to the screenshot batch when screenshots=true. |
+
+### td_get_errors
+
+Find errors and warnings in TouchDesigner (TD) operators. Checks operator errors, warnings, AND broken parameter expressions (missing channels, bad references, etc). Also includes recent script errors from the log (tracebacks), grouped and deduplicated — e.g. 1000 identical mouse-move errors shown as ×1000 with one entry. If path is given, checks that operator and its children. If no path, checks the currently open network. Use '/' for entire project. Use when user says something is broken, has errors, red nodes, горит ошибка, etc. TIP: call td_clear_textport before reproducing an error to keep log focused. TIP: combine with td_get_perf when user says 'тупит/лагает' to check both errors and performance.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `path` | string | no | Path to check. If omitted, checks the current network. Use '/' to scan entire project. |
+| `recursive` | boolean | no | Check children recursively (default true) |
+| `include_log` | boolean | no | Include recent script errors from log, grouped by unique signature (default true). Use td_clear_textport before reproducing an error to keep results focused. |
+
+### td_get_perf
+
+Get performance data from TouchDesigner (TD). Returns TSV: header with fps/budget/memory summary, then slowest operators sorted by cook time. Columns: path, OPType, cpu/cook(ms), gpu/cook(ms), cpu/s, gpu/s, rate, flags. Use when user reports lag, low FPS, slow performance, тупит, тормозит.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `path` | string | no | Path to profile. If omitted, profiles the current network. Use '/' for entire project. |
+| `top` | integer | no | Number of slowest operators to return |
+
+## Documentation
+
+### td_get_docs
+
+Get comprehensive documentation on a TouchDesigner topic. Unlike td_get_hints (compact tips), this returns in-depth reference material. Call without arguments to see available topics with descriptions. Call with a topic name to get the full documentation.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `topic` | string | no | Topic to get docs for. Omit to list available topics. |
+
+### td_get_hints
+
+Get TouchDesigner tips and common patterns for a topic. Call this BEFORE creating operators or writing TD Python code to learn correct parameter names, expressions, and idiomatic approaches. Available topics: animation, noise, connections, parameters, scripting, construction, ui_analysis, panel_layout, screenshots, input_simulation, undo. IMPORTANT: always call with topic='construction' before building multi-operator setups to get correct TOP/CHOP parameter names, compositeTOP input ordering, and layout guidelines. IMPORTANT: always call with topic='input_simulation' before using td_input_execute to learn focus recovery, coordinate systems, and testing workflow.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `topic` | string | yes | Topic to get hints for. Available: 'animation', 'noise', 'connections', 'parameters', 'scripting', 'construction', 'ui_analysis', 'panel_layout', 'screenshots', 'input_simulation', 'undo', 'networking', 'all' |
+
+### td_agents_md
+
+Read, write, or update the agents_md documentation inside a COMP container. agents_md is a Markdown textDAT describing the container's purpose, structure, and conventions. action='read': returns content + staleness check (compares documented children vs live state). action='update': refreshes auto-generated sections (children list, connections) from live state, preserves human-written sections. action='write': sets full content, creates the DAT if missing.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `path` | string | yes | Path to the COMP container |
+| `action` | `read` / `update` / `write` | yes | read=get content+staleness, update=refresh auto sections, write=set content |
+| `content` | string | no | Markdown content (only for action='write') |
+
+## Input Automation
+
+### td_input_execute
+
+Send a sequence of mouse/keyboard commands to TouchDesigner. Commands execute sequentially with smooth bezier movement. Returns immediately — poll td_input_status() until status='idle' before proceeding. Command types: 'focus' — bring TD to foreground. 'move' — smooth mouse move: {type,x,y,duration,easing}. 'click' — click: {type,x,y,button,hold,duration,easing}. hold=seconds to hold down. duration=smooth move before click. 'dblclick' — double click: {type,x,y,duration}. 'mousedown'/'mouseup' — {type,x,y,button}. 'key' — keystroke: {type,keys} e.g. 'ctrl+z','tab','escape','shift+f5'. Requires Accessibility permission on Mac. 'type' — human-like typing: {type,text,wpm,variance} — layout-independent Unicode, variable timing. 'wait' — pause: {type,duration}. 'scroll' — {type,x,y,dx,dy,steps} — human-like scroll: moves mouse to (x,y) first, then sends dy (vertical, +up) and dx (horizontal, +right) as multiple ticks with natural timing. steps=4 by default. Mouse commands may include coord_space='logical' (default) or coord_space='physical'. On macOS, 'physical' means actual screen pixels from td_get_screen_screenshot and is converted to CGEvent logical coords automatically. Top-level coord_space applies to commands that do not override it. on_error: 'stop' (default) clears queue on error; 'continue' skips failed command. IMPORTANT: call td_get_hints('input_simulation') before first use to learn focus recovery, coordinate systems, and testing workflow.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `commands` | array | yes | List of command dicts to execute in sequence. |
+| `coord_space` | `logical` / `physical` | no | Default coordinate space for mouse commands that do not specify their own coord_space. 'logical' uses CGEvent coords directly. 'physical' uses actual screen pixels from td_get_screen_screenshot and is auto-converted on macOS. |
+| `on_error` | `stop` / `continue` | no | What to do on error. Default 'stop'. |
+
+### td_input_status
+
+Get current status of the td_input command queue. Poll this after td_input_execute until status='idle'. Returns: status ('idle'/'running'), current command, queue_remaining, last error.
+
+No parameters (other than optional `target_instance`).
+
+### td_input_clear
+
+Clear the td_input command queue and stop current execution immediately.
+
+No parameters (other than optional `target_instance`).
+
+### td_op_screen_rect
+
+Get the screen coordinates of an operator node in the network editor. Returns {x,y,w,h,cx,cy} where cx,cy is the center for clicking. Use this to find where to click on a specific operator. Only works if the operator's parent network is currently open in a network editor pane.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `path` | string | yes | Full path to the operator, e.g. '/project1/myComp/noise1' |
+
+### td_click_screen_point
+
+Resolve a point inside a previous td_get_screen_screenshot result and click it. Pass the screenshot request_id plus either normalized u/v or image_x/image_y. Queues a td_input click using physical screen coordinates, so it works directly with screenshot-derived points. Use duration/easing to control the cursor travel before the click.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `request_id` | string | yes | Request ID originally returned by td_get_screen_screenshot. |
+| `u` | number | no | Normalized horizontal position inside the screenshot region (0=left, 1=right). Use with v. |
+| `v` | number | no | Normalized vertical position inside the screenshot region (0=top, 1=bottom). Use with u. |
+| `image_x` | number | no | Horizontal pixel coordinate inside the returned screenshot image. Use with image_y. |
+| `image_y` | number | no | Vertical pixel coordinate inside the returned screenshot image. Use with image_x. |
+| `button` | `left` / `right` / `middle` | no | Mouse button to click. Default left. |
+| `hold` | number | no | Seconds to hold the mouse button down before releasing. |
+| `duration` | number | no | Seconds for the cursor to travel to the target before clicking. |
+| `easing` | `linear` / `ease-in` / `ease-out` / `ease-in-out` | no | Cursor movement easing for the pre-click travel. |
+| `focus` | boolean | no | If true, bring TD to the front before clicking and wait briefly for focus to settle. |
+
+### td_screen_point_to_global
+
+Convert a point inside a previous td_get_screen_screenshot result into absolute screen coordinates. Pass the screenshot request_id plus either normalized u/v (0..1 inside that screenshot region) or image_x/image_y in returned image pixels. Returns absolute physical screen coordinates, logical coordinates, and a ready-to-use td_input_execute payload. Metadata is kept for the most recent screen screenshots so multiple agents can resolve points later by request_id.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `request_id` | string | yes | Request ID originally returned by td_get_screen_screenshot. |
+| `u` | number | no | Normalized horizontal position inside the screenshot region (0=left, 1=right). Use with v. |
+| `v` | number | no | Normalized vertical position inside the screenshot region (0=top, 1=bottom). Use with u. |
+| `image_x` | number | no | Horizontal pixel coordinate inside the returned screenshot image. Use with image_y. |
+| `image_y` | number | no | Vertical pixel coordinate inside the returned screenshot image. Use with image_x. |
+
+## System
+
+### td_list_instances
+
+List all running TouchDesigner (TD) instances with active MCP servers. Returns port, project name, PID, and instanceId for each instance. Call this at the start of every conversation to discover available instances and choose which one to work with. instanceId is stable for the lifetime of a TD process and is used as target_instance in all other tool calls.
+
+No parameters (other than optional `target_instance`).
+
+### td_project_quit
+
+Save and/or close the current TouchDesigner (TD) project. Can save before closing. Reports if project has unsaved changes. To close a different instance, pass target_instance=instanceId. WARNING: this will shut down the MCP server on that instance.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `save` | boolean | no | Save the project before closing. Default true. |
+| `force` | boolean | no | Force close without save dialog. Default false. |
+
+### td_reinit_extension
+
+Reinitialize an extension on a COMP in TouchDesigner (TD). Call this AFTER finishing all code edits via td_write_dat to apply changes. Do NOT call after every small edit - batch your changes first.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `path` | string | yes | Path to the COMP with the extension |
+
+### td_dev_log
+
+Read the last N entries from the MCP dev log. Only available when Devmode is enabled. Shows request/response history.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `count` | integer | no | Number of recent log entries to return |
+
+### td_clear_dev_log
+
+Clear the current MCP dev log by closing the old file and starting a fresh one. Only available when Devmode is enabled.
+
+No parameters (other than optional `target_instance`).
+
+### td_test_session
+
+Manage test sessions, bug reports, and conversation export. IMPORTANT: Do NOT proactively suggest exporting chat or submitting reports. These are tools for specific situations: - export_chat / submit_report: ONLY when the user encounters a BUG with the plugin or TouchDesigner and wants to report it, or when the user explicitly asks to export the conversation. Never suggest this at session end or as routine action. USER PHRASES → ACTIONS: 'разбор тестовых сессий' / 'analyze test sessions' → list, then pull, read meta.json → index.jsonl → calls/. 'разбор репортов' / 'analyze user reports' → list with session='user', then pull by name. 'экспортируй чат' / 'export chat' → (1) export_chat_id → marker, (2) export_chat with session=marker. 'сообщи о проблеме' / 'report bug' → export chat, review for privacy, then submit_report with summary + tags + result_op=file_path. ACTIONS: export_chat_id | export_chat | submit_report | start | note | import_chat | end | list | pull. list: default=auto-detect repo. session='user' for user_reports (dev only). pull: auto-searches both repos. Auto-detects dev vs user Hub access.
+
+| Param | Type | Required | Description |
+|-------|------|----------|-------------|
+| `action` | `export_chat_id` / `export_chat` / `submit_report` / `start` / `note` / `import_chat` / `end` / `list` / `pull` | yes | Action: export_chat_id / export_chat / submit_report / start / note / import_chat / end / list / pull |
+| `prompt` | string | no | (start) The test prompt/task description |
+| `tags` | array | no | (start) Tags for categorization, e.g. ['ui', 'layout'] |
+| `text` | string | no | (note) Observation text. (import_chat) Full conversation text. |
+| `outcome` | `success` / `partial` / `failure` | no | (end) Result: success / partial / failure |
+| `summary` | string | no | (end) Brief summary of what happened |
+| `result_op` | string | no | (end) Path to operator to save as result.tox |
+| `session` | string | no | (pull) Session name or substring to download |
@@ -0,0 +1,966 @@
+# TouchDesigner Network Patterns
+
+Complete network recipes for common creative coding tasks. Each pattern shows the operator chain, MCP tool calls to build it, and key parameter settings.
+
+## Audio-Reactive Visuals
+
+### Pattern 1: Audio Spectrum -> Noise Displacement
+
+Audio drives noise parameters for organic, music-responsive textures.
+
+```
+Audio File In CHOP -> Audio Spectrum CHOP -> Math CHOP (scale)
+                                                |
+                                                v (export to noise params)
+                          Noise TOP -> Level TOP -> Feedback TOP -> Composite TOP -> Null TOP (out)
+                                                        ^                |
+                                                        |________________|
+```
+
+**MCP Build Sequence:**
+
+```
+1. td_create_operator(parent="/project1", type="audiofileinChop", name="audio_in")
+2. td_create_operator(parent="/project1", type="audiospectrumChop", name="spectrum")
+3. td_create_operator(parent="/project1", type="mathChop", name="spectrum_scale")
+4. td_create_operator(parent="/project1", type="noiseTop", name="noise1")
+5. td_create_operator(parent="/project1", type="levelTop", name="level1")
+6. td_create_operator(parent="/project1", type="feedbackTop", name="feedback1")
+7. td_create_operator(parent="/project1", type="compositeTop", name="comp1")
+8. td_create_operator(parent="/project1", type="nullTop", name="out")
+
+9. td_set_operator_pars(path="/project1/audio_in",
+     properties={"file": "/path/to/music.wav", "play": true})
+10. td_set_operator_pars(path="/project1/spectrum",
+     properties={"size": 512})
+11. td_set_operator_pars(path="/project1/spectrum_scale",
+     properties={"gain": 2.0, "postoff": 0.0})
+12. td_set_operator_pars(path="/project1/noise1",
+     properties={"type": 1, "monochrome": false, "resolutionw": 1280, "resolutionh": 720,
+                  "period": 4.0, "harmonics": 3, "amp": 1.0})
+13. td_set_operator_pars(path="/project1/level1",
+     properties={"opacity": 0.95, "gamma1": 0.75})
+14. td_set_operator_pars(path="/project1/feedback1",
+     properties={"top": "/project1/comp1"})
+15. td_set_operator_pars(path="/project1/comp1",
+     properties={"operand": 0})
+
+16. td_execute_python: """
+op('/project1/audio_in').outputConnectors[0].connect(op('/project1/spectrum'))
+op('/project1/spectrum').outputConnectors[0].connect(op('/project1/spectrum_scale'))
+op('/project1/noise1').outputConnectors[0].connect(op('/project1/level1'))
+op('/project1/level1').outputConnectors[0].connect(op('/project1/comp1').inputConnectors[0])
+op('/project1/feedback1').outputConnectors[0].connect(op('/project1/comp1').inputConnectors[1])
+op('/project1/comp1').outputConnectors[0].connect(op('/project1/out'))
+"""
+
+17. td_execute_python: """
+# Export spectrum values to drive noise parameters
+# This makes the noise react to audio frequencies
+op('/project1/noise1').par.seed.expr = "op('/project1/spectrum_scale')['chan1']"
+op('/project1/noise1').par.period.expr = "tdu.remap(op('/project1/spectrum_scale')['chan1'].eval(), 0, 1, 1, 8)"
+"""
+```
+
+### Pattern 2: Beat Detection -> Visual Pulses
+
+Detect beats from audio and trigger visual events.
+
+```
+Audio Device In CHOP -> Audio Spectrum CHOP -> Math CHOP (isolate bass)
+                                                    |
+                                              Trigger CHOP (envelope)
+                                                    |
+                                              [export to visual params]
+```
+
+**Key parameter settings:**
+
+```
+# Isolate bass frequencies (20-200 Hz)
+Math CHOP: chanop=1 (Add channels), range1low=0, range1high=10
+           (first 10 FFT bins = bass frequencies with 512 FFT at 44100Hz)
+
+# ADSR envelope on each beat
+Trigger CHOP: attack=0.02, peak=1.0, decay=0.3, sustain=0.0, release=0.1
+
+# Export to visual: Scale, brightness, or color intensity
+td_execute_python: "op('/project1/level1').par.brightness1.expr = \"1.0 + op('/project1/trigger1')['chan1'] * 0.5\""
+```
+
+### Pattern 3: Multi-Band Audio -> Multi-Layer Visuals
+
+Split audio into frequency bands, drive different visual layers per band.
+
+```
+Audio In -> Spectrum -> Audio Band EQ (3 bands: bass, mid, treble)
+                              |
+                    +---------+---------+
+                    |         |         |
+                 Bass      Mids     Treble
+                  |          |         |
+           Noise TOP   Circle TOP  Text TOP
+           (slow,dark) (mid,warm)  (fast,bright)
+                  |          |         |
+                  +-----+----+----+----+
+                        |         |
+                   Composite  Composite
+                        |
+                       Out
+```
+
+### Pattern 3b: Audio-Reactive GLSL Fractal (Proven Recipe)
+
+Complete working recipe. Plays an MP3, runs FFT, feeds spectrum as a texture into a GLSL shader where inner fractal reacts to bass, outer to treble.
+
+**Network:**
+```
+AudioFileIn CHOP → AudioSpectrum CHOP (FFT=512, outlength=256)
+    → Math CHOP (gain=10) → CHOP To TOP (256x2 spectrum texture, dataformat=r)
+                                                                   ↓
+Constant TOP (time, rgba32float) → GLSL TOP (input 0=time, input 1=spectrum) → Null → MovieFileOut
+                                                                                        ↓
+AudioFileIn CHOP → Audio Device Out CHOP                                          Record to .mov
+```
+
+**Build via td_execute_python (one call per step for reliability):**
+
+```python
+# Step 1: Audio chain
+# td_execute_python script:
+td_execute_python(code="""
+root = op('/project1')
+audio = root.create(audiofileinCHOP, 'audio_in')
+audio.par.file = '/path/to/music.mp3'
+audio.par.playmode = 0  # Locked to timeline
+audio.par.volume = 0.5
+
+spec = root.create(audiospectrumCHOP, 'spectrum')
+audio.outputConnectors[0].connect(spec.inputConnectors[0])
+
+math_n = root.create(mathCHOP, 'math_norm')
+spec.outputConnectors[0].connect(math_n.inputConnectors[0])
+math_n.par.gain = 5  # boost signal
+
+resamp = root.create(resampleCHOP, 'resample_spec')
+math_n.outputConnectors[0].connect(resamp.inputConnectors[0])
+resamp.par.timeslice = True
+resamp.par.rate = 256
+
+chop2top = root.create(choptoTOP, 'spectrum_tex')
+chop2top.par.chop = resamp  # CHOP To TOP has NO input connectors — use par.chop reference
+
+# Audio output (hear the music)
+aout = root.create(audiodeviceoutCHOP, 'audio_out')
+audio.outputConnectors[0].connect(aout.inputConnectors[0])
+result = 'audio chain ok'
+""")
+
+# Step 2: Time driver (MUST be rgba32float — see pitfalls #6)
+# td_execute_python script:
+td_execute_python(code="""
+root = op('/project1')
+td = root.create(constantTOP, 'time_driver')
+td.par.format = 'rgba32float'
+td.par.outputresolution = 'custom'
+td.par.resolutionw = 1
+td.par.resolutionh = 1
+td.par.colorr.expr = "absTime.seconds % 1000.0"
+td.par.colorg.expr = "int(absTime.seconds / 1000.0)"
+result = 'time ok'
+""")
+
+# Step 3: GLSL shader (write to /tmp, load from file)
+# td_execute_python script:
+td_execute_python(code="""
+root = op('/project1')
+glsl = root.create(glslTOP, 'audio_shader')
+glsl.par.outputresolution = 'custom'
+glsl.par.resolutionw = 1280
+glsl.par.resolutionh = 720
+
+sd = root.create(textDAT, 'shader_code')
+sd.text = open('/tmp/my_shader.glsl').read()
+glsl.par.pixeldat = sd
+
+# Wire: input 0 = time, input 1 = spectrum texture
+op('/project1/time_driver').outputConnectors[0].connect(glsl.inputConnectors[0])
+op('/project1/spectrum_tex').outputConnectors[0].connect(glsl.inputConnectors[1])
+result = 'glsl ok'
+""")
+
+# Step 4: Output + recorder
+# td_execute_python script:
+td_execute_python(code="""
+root = op('/project1')
+out = root.create(nullTOP, 'output')
+op('/project1/audio_shader').outputConnectors[0].connect(out.inputConnectors[0])
+
+rec = root.create(moviefileoutTOP, 'recorder')
+out.outputConnectors[0].connect(rec.inputConnectors[0])
+rec.par.type = 'movie'
+rec.par.file = '/tmp/output.mov'
+rec.par.videocodec = 'mjpa'
+result = 'output ok'
+""")
+```
+
+**GLSL shader pattern (audio-reactive fractal):**
+```glsl
+out vec4 fragColor;
+
+vec3 palette(float t) {
+    vec3 a = vec3(0.5); vec3 b = vec3(0.5);
+    vec3 c = vec3(1.0); vec3 d = vec3(0.263, 0.416, 0.557);
+    return a + b * cos(6.28318 * (c * t + d));
+}
+
+void main() {
+    // Input 0 = time (1x1 rgba32float constant)
+    // Input 1 = audio spectrum (256x2 CHOP To TOP, stereo — sample at y=0.25 for first channel)
+    vec4 td = texture(sTD2DInputs[0], vec2(0.5));
+    float t = td.r + td.g * 1000.0;
+
+    vec2 res = uTDOutputInfo.res.zw;
+    vec2 uv = (gl_FragCoord.xy * 2.0 - res) / min(res.x, res.y);
+    vec2 uv0 = uv;
+    vec3 finalColor = vec3(0.0);
+
+    float bass = texture(sTD2DInputs[1], vec2(0.05, 0.25)).r;
+    float mids = texture(sTD2DInputs[1], vec2(0.25, 0.25)).r;
+
+    for (float i = 0.0; i < 4.0; i++) {
+        uv = fract(uv * (1.4 + bass * 0.3)) - 0.5;
+        float d = length(uv) * exp(-length(uv0));
+
+        // Sample spectrum at distance: inner=bass, outer=treble
+        float freq = texture(sTD2DInputs[1], vec2(clamp(d * 0.5, 0.0, 1.0), 0.25)).r;
+
+        vec3 col = palette(length(uv0) + i * 0.4 + t * 0.35);
+        d = sin(d * (7.0 + bass * 4.0) + t * 1.5) / 8.0;
+        d = abs(d);
+        d = pow(0.012 / d, 1.2 + freq * 0.8 + bass * 0.5);
+        finalColor += col * d;
+    }
+
+    // Tone mapping
+    finalColor = finalColor / (finalColor + vec3(1.0));
+    fragColor = TDOutputSwizzle(vec4(finalColor, 1.0));
+}
+```
+
+**Key insights from testing:**
+- `spectrum_tex` (CHOP To TOP) produces a 256x2 texture — x position = frequency, y=0.25 for first channel
+- Sampling at `vec2(0.05, 0.0)` gets bass, `vec2(0.65, 0.0)` gets treble
+- Sampling based on pixel distance (`d * 0.5`) makes inner fractal react to bass, outer to treble
+- `bass * 0.3` in the `fract()` zoom makes the fractal breathe with kicks
+- Math CHOP gain of 5 is needed because raw spectrum values are very small
+
+## Generative Art
+
+### Pattern 4: Feedback Loop with Transform
+
+Classic generative technique — texture evolves through recursive transformation.
+
+```
+Noise TOP -> Composite TOP -> Level TOP -> Null TOP (out)
+                  ^      |
+                  |      v
+            Transform TOP <- Feedback TOP
+```
+
+**MCP Build Sequence:**
+
+```
+1. td_create_operator(parent="/project1", type="noiseTop", name="seed_noise")
+2. td_create_operator(parent="/project1", type="compositeTop", name="mix")
+3. td_create_operator(parent="/project1", type="transformTop", name="evolve")
+4. td_create_operator(parent="/project1", type="feedbackTop", name="fb")
+5. td_create_operator(parent="/project1", type="levelTop", name="color_correct")
+6. td_create_operator(parent="/project1", type="nullTop", name="out")
+
+7. td_set_operator_pars(path="/project1/seed_noise",
+     properties={"type": 1, "monochrome": false, "period": 2.0, "amp": 0.3,
+                  "resolutionw": 1280, "resolutionh": 720})
+8. td_set_operator_pars(path="/project1/mix",
+     properties={"operand": 27})  # 27 = Screen blend
+9. td_set_operator_pars(path="/project1/evolve",
+     properties={"sx": 1.003, "sy": 1.003, "rz": 0.5, "extend": 2})  # slight zoom + rotate, repeat edges
+10. td_set_operator_pars(path="/project1/fb",
+     properties={"top": "/project1/mix"})
+11. td_set_operator_pars(path="/project1/color_correct",
+     properties={"opacity": 0.98, "gamma1": 0.85})
+
+12. td_execute_python: """
+op('/project1/seed_noise').outputConnectors[0].connect(op('/project1/mix').inputConnectors[0])
+op('/project1/fb').outputConnectors[0].connect(op('/project1/evolve'))
+op('/project1/evolve').outputConnectors[0].connect(op('/project1/mix').inputConnectors[1])
+op('/project1/mix').outputConnectors[0].connect(op('/project1/color_correct'))
+op('/project1/color_correct').outputConnectors[0].connect(op('/project1/out'))
+"""
+```
+
+**Variations:**
+- Change Transform: `rz` (rotation), `sx/sy` (zoom), `tx/ty` (drift)
+- Change Composite operand: Screen (glow), Add (bright), Multiply (dark)
+- Add HSV Adjust in the feedback loop for color evolution
+- Add Blur for dreamlike softness
+- Replace Noise with a GLSL TOP for custom seed patterns
+
+### Pattern 5: Instancing (Particle-Like Systems)
+
+Render thousands of copies of geometry, each with unique position/rotation/scale driven by CHOP data or DATs.
+
+```
+Table DAT (instance data) -> DAT to CHOP -> Geometry COMP (instancing on) -> Render TOP
+                                              + Sphere SOP (template geometry)
+                                              + Constant MAT (material)
+                                              + Camera COMP
+                                              + Light COMP
+```
+
+**MCP Build Sequence:**
+
+```
+1. td_create_operator(parent="/project1", type="tableDat", name="instance_data")
+2. td_create_operator(parent="/project1", type="geometryComp", name="geo1")
+3. td_create_operator(parent="/project1/geo1", type="sphereSop", name="sphere")
+4. td_create_operator(parent="/project1", type="constMat", name="mat1")
+5. td_create_operator(parent="/project1", type="cameraComp", name="cam1")
+6. td_create_operator(parent="/project1", type="lightComp", name="light1")
+7. td_create_operator(parent="/project1", type="renderTop", name="render1")
+
+8. td_execute_python: """
+import random, math
+dat = op('/project1/instance_data')
+dat.clear()
+dat.appendRow(['tx', 'ty', 'tz', 'sx', 'sy', 'sz', 'cr', 'cg', 'cb'])
+for i in range(500):
+    angle = i * 0.1
+    r = 2 + i * 0.01
+    dat.appendRow([
+        str(math.cos(angle) * r),
+        str(math.sin(angle) * r),
+        str((i - 250) * 0.02),
+        '0.05', '0.05', '0.05',
+        str(random.random()),
+        str(random.random()),
+        str(random.random())
+    ])
+"""
+
+9. td_set_operator_pars(path="/project1/geo1",
+     properties={"instancing": true, "instancechop": "",
+                  "instancedat": "/project1/instance_data",
+                  "material": "/project1/mat1"})
+10. td_set_operator_pars(path="/project1/render1",
+     properties={"camera": "/project1/cam1", "geometry": "/project1/geo1",
+                  "light": "/project1/light1",
+                  "resolutionw": 1280, "resolutionh": 720})
+11. td_set_operator_pars(path="/project1/cam1",
+     properties={"tz": 10})
+```
+
+### Pattern 6: Reaction-Diffusion (GLSL)
+
+Classic Gray-Scott reaction-diffusion system running on the GPU.
+
+```
+Text DAT (GLSL code) -> GLSL TOP (resolution, dat reference) -> Feedback TOP
+                              ^                                       |
+                              |_______________________________________|
+                         Level TOP (out)
+```
+
+**Key GLSL code (write to Text DAT via td_execute_python):**
+
+```glsl
+// Gray-Scott reaction-diffusion
+uniform float feed;    // 0.037
+uniform float kill;    // 0.06
+uniform float dA;      // 1.0
+uniform float dB;      // 0.5
+
+layout(location = 0) out vec4 fragColor;
+
+void main() {
+    vec2 uv = vUV.st;
+    vec2 texel = 1.0 / uTDOutputInfo.res.zw;
+
+    vec4 c = texture(sTD2DInputs[0], uv);
+    float a = c.r;
+    float b = c.g;
+
+    // Laplacian (9-point stencil)
+    float lA = 0.0, lB = 0.0;
+    for(int dx = -1; dx <= 1; dx++) {
+        for(int dy = -1; dy <= 1; dy++) {
+            float w = (dx == 0 && dy == 0) ? -1.0 : (abs(dx) + abs(dy) == 1 ? 0.2 : 0.05);
+            vec4 s = texture(sTD2DInputs[0], uv + vec2(dx, dy) * texel);
+            lA += s.r * w;
+            lB += s.g * w;
+        }
+    }
+
+    float reaction = a * b * b;
+    float newA = a + (dA * lA - reaction + feed * (1.0 - a));
+    float newB = b + (dB * lB + reaction - (kill + feed) * b);
+
+    fragColor = vec4(clamp(newA, 0.0, 1.0), clamp(newB, 0.0, 1.0), 0.0, 1.0);
+}
+```
+
+## Video Processing
+
+### Pattern 7: Video Effects Chain
+
+Apply a chain of effects to a video file.
+
+```
+Movie File In TOP -> HSV Adjust TOP -> Level TOP -> Blur TOP -> Composite TOP -> Null TOP (out)
+                                                                      ^
+                                                          Text TOP ---+
+```
+
+**MCP Build Sequence:**
+
+```
+1. td_create_operator(parent="/project1", type="moviefileinTop", name="video_in")
+2. td_create_operator(parent="/project1", type="hsvadjustTop", name="color")
+3. td_create_operator(parent="/project1", type="levelTop", name="levels")
+4. td_create_operator(parent="/project1", type="blurTop", name="blur")
+5. td_create_operator(parent="/project1", type="compositeTop", name="overlay")
+6. td_create_operator(parent="/project1", type="textTop", name="title")
+7. td_create_operator(parent="/project1", type="nullTop", name="out")
+
+8. td_set_operator_pars(path="/project1/video_in",
+     properties={"file": "/path/to/video.mp4", "play": true})
+9. td_set_operator_pars(path="/project1/color",
+     properties={"hueoffset": 0.1, "saturationmult": 1.3})
+10. td_set_operator_pars(path="/project1/levels",
+     properties={"brightness1": 1.1, "contrast": 1.2, "gamma1": 0.9})
+11. td_set_operator_pars(path="/project1/blur",
+     properties={"sizex": 2, "sizey": 2})
+12. td_set_operator_pars(path="/project1/title",
+     properties={"text": "My Video", "fontsizex": 48, "alignx": 1, "aligny": 1})
+
+13. td_execute_python: """
+chain = ['video_in', 'color', 'levels', 'blur']
+for i in range(len(chain) - 1):
+    op(f'/project1/{chain[i]}').outputConnectors[0].connect(op(f'/project1/{chain[i+1]}'))
+op('/project1/blur').outputConnectors[0].connect(op('/project1/overlay').inputConnectors[0])
+op('/project1/title').outputConnectors[0].connect(op('/project1/overlay').inputConnectors[1])
+op('/project1/overlay').outputConnectors[0].connect(op('/project1/out'))
+"""
+```
+
+### Pattern 8: Video Recording
+
+Record the output to a file. **H.264/H.265 require a Commercial license** — use Motion JPEG (`mjpa`) on Non-Commercial.
+
+```
+[any TOP chain] -> Null TOP -> Movie File Out TOP
+```
+
+```python
+# Build via td_execute_python:
+root = op('/project1')
+
+# Always put a Null TOP before the recorder
+null_out = root.op('out')  # or create one
+rec = root.create(moviefileoutTOP, 'recorder')
+null_out.outputConnectors[0].connect(rec.inputConnectors[0])
+
+rec.par.type = 'movie'
+rec.par.file = '/tmp/output.mov'
+rec.par.videocodec = 'mjpa'  # Motion JPEG — works on Non-Commercial
+
+# Start recording (par.record is a toggle — .record() method may not exist)
+rec.par.record = True
+# ... let TD run for desired duration ...
+rec.par.record = False
+
+# For image sequences:
+# rec.par.type = 'imagesequence'
+# rec.par.imagefiletype = 'png'
+# rec.par.file.expr = "'/tmp/frames/out' + me.fileSuffix"  # fileSuffix REQUIRED
+```
+
+**Pitfalls:**
+- Setting `par.file` + `par.record = True` in the same script may race — use `run("...", delayFrames=2)`
+- `TOP.save()` called rapidly always captures the same frame — use MovieFileOut for animation
+- See `pitfalls.md` #25-27 for full details
+
+### Pattern 8b: TD → External Pipeline (FFmpeg / Python / Post-Processing)
+
+Export TD visuals for use in another tool (ffmpeg, Python, ASCII art, etc.). This is the standard workflow when you need to composite TD output with external processing (ASCII conversion, Python shader chains, ML inference, etc.).
+
+**Step 1: Record to video in TD**
+
+```python
+# Preferred: ProRes on macOS (lossless, Non-Commercial OK, ~55MB/s at 1280x720)
+rec.par.videocodec = 'prores'
+# Fallback for non-macOS: mjpa (Motion JPEG)
+# rec.par.videocodec = 'mjpa'
+rec.par.record = True
+# ... wait N seconds ...
+rec.par.record = False
+```
+
+**Step 2: Extract frames with ffmpeg**
+
+```bash
+# Extract all frames at 30fps
+ffmpeg -y -i /tmp/output.mov -vf 'fps=30' /tmp/frames/frame_%06d.png
+
+# Or extract a specific duration
+ffmpeg -y -i /tmp/output.mov -t 25 -vf 'fps=30' /tmp/frames/frame_%06d.png
+
+# Or extract specific frame range
+ffmpeg -y -i /tmp/output.mov -vf 'select=between(n\,0\,749)' -vsync vfr /tmp/frames/frame_%06d.png
+```
+
+**Step 3: Process frames in Python**
+
+```python
+from PIL import Image
+import os
+
+frames_dir = '/tmp/frames'
+output_dir = '/tmp/processed'
+os.makedirs(output_dir, exist_ok=True)
+
+for fname in sorted(os.listdir(frames_dir)):
+    if not fname.endswith('.png'):
+        continue
+    img = Image.open(os.path.join(frames_dir, fname))
+    # ... apply your processing ...
+    img.save(os.path.join(output_dir, fname))
+```
+
+**Step 4: Mux processed frames back with audio**
+
+```bash
+# Create video from processed frames + audio with fade-out
+ffmpeg -y \
+  -framerate 30 -i /tmp/processed/frame_%06d.png \
+  -i /tmp/audio.mp3 \
+  -c:v libx264 -pix_fmt yuv420p -crf 18 \
+  -c:a aac -b:a 192k \
+  -shortest \
+  -af 'afade=t=out:st=23:d=2' \
+  /tmp/final_output.mp4
+```
+
+**Key considerations:**
+- Use ProRes for the TD recording step to avoid generation loss during compositing
+- Extract at the target output framerate (not TD's render framerate)
+- For audio-synced content, analyze the audio file separately in Python (scipy FFT) to get per-frame features (rms, spectral bands, beats) and drive compositing parameters
+- Always verify TD FPS > 0 before recording (see pitfalls #37, #38)
+
+## Data Visualization
+
+### Pattern 9: Table Data -> Bar Chart via Instancing
+
+Visualize tabular data as a 3D bar chart.
+
+```
+Table DAT (data) -> Script DAT (transform to instance format) -> DAT to CHOP
+                                                                      |
+Box SOP -> Geometry COMP (instancing from CHOP) -> Render TOP -> Null TOP (out)
+           + PBR MAT
+           + Camera COMP
+           + Light COMP
+```
+
+```python
+# Script DAT code to transform data to instance positions
+td_execute_python: """
+source = op('/project1/data_table')
+instance = op('/project1/instance_transform')
+instance.clear()
+instance.appendRow(['tx', 'ty', 'tz', 'sx', 'sy', 'sz', 'cr', 'cg', 'cb'])
+
+for i in range(1, source.numRows):
+    value = float(source[i, 'value'])
+    name = source[i, 'name']
+    instance.appendRow([
+        str(i * 1.5),          # x position (spread bars)
+        str(value / 2),        # y position (center bar vertically)
+        '0',                   # z position
+        '1', str(value), '1',  # scale (height = data value)
+        '0.2', '0.6', '1.0'   # color (blue)
+    ])
+"""
+```
+
+### Pattern 9b: Audio-Reactive GLSL Fractal (Proven Recipe)
+
+Audio spectrum drives a GLSL fractal shader directly via a spectrum texture input. Bass thickens inner fractal lines, mids twist rotation, highs light outer edges. **Always run discovery (SKILL.md Step 0) before using any param names from these recipes — they may differ in your TD version.**
+
+```
+Audio File In CHOP → Audio Spectrum CHOP (FFT=512, outlength=256)
+    → Math CHOP (gain=10)
+    → CHOP To TOP (spectrum texture, 256x2, dataformat=r)
+                                          ↓ (input 1)
+Constant TOP (rgba32float, time) → GLSL TOP (audio-reactive shader) → Null TOP
+        (input 0)                    ↑
+                              Text DAT (shader code)
+```
+
+**Build via td_execute_python (complete working script):**
+
+```python
+# td_execute_python script:
+td_execute_python(code="""
+import os
+root = op('/project1')
+
+# Audio input
+audio = root.create(audiofileinCHOP, 'audio_in')
+audio.par.file = '/path/to/music.mp3'
+audio.par.playmode = 0  # Locked to timeline
+
+# FFT analysis (output length manually set to 256 bins)
+spectrum = root.create(audiospectrumCHOP, 'spectrum')
+audio.outputConnectors[0].connect(spectrum.inputConnectors[0])
+spectrum.par.fftsize = '512'
+spectrum.par.outputmenu = 'setmanually'
+spectrum.par.outlength = 256
+
+# THEN boost gain on the raw spectrum (NO Lag CHOP — see pitfall #34)
+math = root.create(mathCHOP, 'math_norm')
+spectrum.outputConnectors[0].connect(math.inputConnectors[0])
+math.par.gain = 10
+
+# Spectrum → texture (256x2 image — stereo, sample at y=0.25 for first channel)
+# NOTE: choptoTOP has NO input connectors — use par.chop reference!
+spec_tex = root.create(choptoTOP, 'spectrum_tex')
+spec_tex.par.chop = math
+spec_tex.par.dataformat = 'r'
+spec_tex.par.layout = 'rowscropped'
+
+# Time driver (rgba32float to avoid 0-1 clamping!)
+time_drv = root.create(constantTOP, 'time_driver')
+time_drv.par.format = 'rgba32float'
+time_drv.par.outputresolution = 'custom'
+time_drv.par.resolutionw = 1
+time_drv.par.resolutionh = 1
+time_drv.par.colorr.expr = "absTime.seconds % 1000.0"
+time_drv.par.colorg.expr = "int(absTime.seconds / 1000.0)"
+
+# GLSL shader
+glsl = root.create(glslTOP, 'audio_shader')
+glsl.par.outputresolution = 'custom'
+glsl.par.resolutionw = 1280; glsl.par.resolutionh = 720
+
+shader_dat = root.create(textDAT, 'shader_code')
+shader_dat.text = open('/tmp/shader.glsl').read()
+glsl.par.pixeldat = shader_dat
+
+# Wire: input 0=time, input 1=spectrum
+time_drv.outputConnectors[0].connect(glsl.inputConnectors[0])
+spec_tex.outputConnectors[0].connect(glsl.inputConnectors[1])
+
+# Output + audio playback
+out = root.create(nullTOP, 'output')
+glsl.outputConnectors[0].connect(out.inputConnectors[0])
+audio_out = root.create(audiodeviceoutCHOP, 'audio_out')
+audio.outputConnectors[0].connect(audio_out.inputConnectors[0])
+
+result = 'network built'
+""")
+```
+
+**GLSL shader (reads spectrum from input 1 texture):**
+
+```glsl
+out vec4 fragColor;
+
+vec3 palette(float t) {
+    vec3 a = vec3(0.5); vec3 b = vec3(0.5);
+    vec3 c = vec3(1.0); vec3 d = vec3(0.263, 0.416, 0.557);
+    return a + b * cos(6.28318 * (c * t + d));
+}
+
+void main() {
+    vec4 td = texture(sTD2DInputs[0], vec2(0.5));
+    float t = td.r + td.g * 1000.0;
+
+    vec2 res = uTDOutputInfo.res.zw;
+    vec2 uv = (gl_FragCoord.xy * 2.0 - res) / min(res.x, res.y);
+    vec2 uv0 = uv;
+    vec3 finalColor = vec3(0.0);
+
+    float bass = texture(sTD2DInputs[1], vec2(0.05, 0.25)).r;
+    float mids = texture(sTD2DInputs[1], vec2(0.25, 0.25)).r;
+    float highs = texture(sTD2DInputs[1], vec2(0.65, 0.25)).r;
+
+    float ca = cos(t * (0.15 + mids * 0.3));
+    float sa = sin(t * (0.15 + mids * 0.3));
+    uv = mat2(ca, -sa, sa, ca) * uv;
+
+    for (float i = 0.0; i < 4.0; i++) {
+        uv = fract(uv * (1.4 + bass * 0.3)) - 0.5;
+        float d = length(uv) * exp(-length(uv0));
+        float freq = texture(sTD2DInputs[1], vec2(clamp(d*0.5, 0.0, 1.0), 0.25)).r;
+        vec3 col = palette(length(uv0) + i * 0.4 + t * 0.35);
+        d = sin(d * (7.0 + bass * 4.0) + t * 1.5) / 8.0;
+        d = abs(d);
+        d = pow(0.012 / d, 1.2 + freq * 0.8 + bass * 0.5);
+        finalColor += col * d;
+    }
+
+    float glow = (0.03 + bass * 0.05) / (length(uv0) + 0.03);
+    finalColor += vec3(0.4, 0.1, 0.7) * glow * (0.6 + 0.4 * sin(t * 2.5));
+
+    float ring = abs(length(uv0) - 0.4 - mids * 0.3);
+    finalColor += vec3(0.1, 0.6, 0.8) * (0.005 / ring) * (0.2 + highs * 0.5);
+
+    finalColor *= smoothstep(0.0, 1.0, 1.0 - dot(uv0*0.55, uv0*0.55));
+    finalColor = finalColor / (finalColor + vec3(1.0));
+
+    fragColor = TDOutputSwizzle(vec4(finalColor, 1.0));
+}
+```
+
+**How spectrum sampling drives the visual:**
+- `texture(sTD2DInputs[1], vec2(x, 0.0)).r` — x position = frequency (0=bass, 1=treble)
+- Inner fractal iterations sample lower x → react to bass
+- Outer iterations sample higher x → react to treble
+- `bass * 0.3` on `fract()` scale → fractal zoom pulses with bass
+- `bass * 4.0` on sin frequency → line density pulses with bass
+- `mids * 0.3` on rotation speed → spiral twists faster during vocal/mid sections
+- `highs * 0.5` on ring opacity → high-frequency sparkle on outer ring
+
+**Recording the output:** Use MovieFileOut TOP with `mjpa` codec (H.264 requires Commercial license). See pitfalls #25-27.
+
+## GLSL Shaders
+
+### Pattern 10: Custom Fragment Shader
+
+Write a custom visual effect as a GLSL fragment shader.
+
+```
+Text DAT (shader code) -> GLSL TOP -> Level TOP -> Null TOP (out)
+                           + optional input TOPs for texture sampling
+```
+
+**Common GLSL uniforms available in TouchDesigner:**
+
+```glsl
+// Automatically provided by TD
+uniform vec4 uTDOutputInfo;  // .res.zw = resolution
+
+// NOTE: uTDCurrentTime does NOT exist in TD 099!
+// Feed time via a 1x1 Constant TOP (format=rgba32float):
+//   t.par.colorr.expr = "absTime.seconds % 1000.0"
+//   t.par.colorg.expr = "int(absTime.seconds / 1000.0)"
+// Then read in GLSL:
+//   vec4 td = texture(sTD2DInputs[0], vec2(0.5));
+//   float t = td.r + td.g * 1000.0;
+
+// Input textures (from connected TOP inputs)
+uniform sampler2D sTD2DInputs[1];  // array of input samplers
+
+// From vertex shader
+in vec3 vUV;  // UV coordinates (0-1 range)
+```
+
+**Example: Plasma shader (using time from input texture)**
+
+```glsl
+layout(location = 0) out vec4 fragColor;
+
+void main() {
+    vec2 uv = vUV.st;
+    // Read time from Constant TOP input 0 (rgba32float format)
+    vec4 td = texture(sTD2DInputs[0], vec2(0.5));
+    float t = td.r + td.g * 1000.0;
+
+    float v1 = sin(uv.x * 10.0 + t);
+    float v2 = sin(uv.y * 10.0 + t * 0.7);
+    float v3 = sin((uv.x + uv.y) * 10.0 + t * 1.3);
+    float v4 = sin(length(uv - 0.5) * 20.0 - t * 2.0);
+
+    float v = (v1 + v2 + v3 + v4) * 0.25;
+
+    vec3 color = vec3(
+        sin(v * 3.14159 + 0.0) * 0.5 + 0.5,
+        sin(v * 3.14159 + 2.094) * 0.5 + 0.5,
+        sin(v * 3.14159 + 4.189) * 0.5 + 0.5
+    );
+
+    fragColor = vec4(color, 1.0);
+}
+```
+
+### Pattern 11: Multi-Pass GLSL (Ping-Pong)
+
+For effects needing state across frames (particles, fluid, cellular automata), use GLSL Multi TOP with multiple passes or a Feedback TOP loop.
+
+```
+GLSL Multi TOP (pass 0: simulation, pass 1: rendering)
+   + Text DAT (simulation shader)
+   + Text DAT (render shader)
+   -> Level TOP -> Null TOP (out)
+      ^
+      |__ Feedback TOP (feeds simulation state back)
+```
+
+## Interactive Installations
+
+### Pattern 12: Mouse/Touch -> Visual Response
+
+```
+Mouse In CHOP -> Math CHOP (normalize to 0-1) -> [export to visual params]
+
+# Or for touch/multi-touch:
+Multi Touch In DAT -> Script CHOP (parse touches) -> [export to visual params]
+```
+
+```python
+# Normalize mouse position to 0-1 range
+td_execute_python: """
+op('/project1/noise1').par.offsetx.expr = "op('/project1/mouse_norm')['tx']"
+op('/project1/noise1').par.offsety.expr = "op('/project1/mouse_norm')['ty']"
+"""
+```
+
+### Pattern 13: OSC Control (from external software)
+
+```
+OSC In CHOP (port 7000) -> Select CHOP (pick channels) -> [export to visual params]
+```
+
+```
+1. td_create_operator(parent="/project1", type="oscinChop", name="osc_in")
+2. td_set_operator_pars(path="/project1/osc_in", properties={"port": 7000})
+
+# OSC messages like /frequency 440 will appear as channel "frequency" with value 440
+# Export to any parameter:
+3. td_execute_python: "op('/project1/noise1').par.period.expr = \"op('/project1/osc_in')['frequency']\""
+```
+
+### Pattern 14: MIDI Control (DJ/VJ)
+
+```
+MIDI In CHOP (device) -> Select CHOP -> [export channels to visual params]
+```
+
+Common MIDI mappings:
+- CC channels (knobs/faders): continuous 0-127, map to float params
+- Note On/Off: binary triggers, map to Trigger CHOP for envelopes
+- Velocity: intensity/brightness
+
+## Live Performance
+
+### Pattern 15: Multi-Source VJ Setup
+
+```
+Source A (generative) ----+
+Source B (video) ---------+-- Switch/Cross TOP -- Level TOP -- Window COMP (output)
+Source C (camera) --------+
+                           ^
+                    MIDI/OSC control selects active source and crossfade
+```
+
+```python
+# MIDI CC1 controls which source is active (0-127 -> 0-2)
+td_execute_python: """
+op('/project1/switch1').par.index.expr = "int(op('/project1/midi_in')['cc1'] / 42)"
+"""
+
+# MIDI CC2 controls crossfade between current and next
+td_execute_python: """
+op('/project1/cross1').par.cross.expr = "op('/project1/midi_in')['cc2'] / 127.0"
+"""
+```
+
+### Pattern 16: Projection Mapping
+
+```
+Content TOPs ----+
+                 |
+Stoner TOP (UV mapping) -> Composite TOP -> Window COMP (projector output)
+   or
+Kantan Mapper COMP (external .tox)
+```
+
+For projection mapping, the key is:
+1. Create your visual content as standard TOPs
+2. Use Stoner TOP or a third-party mapping tool to UV-map content to physical surfaces
+3. Output via Window COMP to the projector
+
+### Pattern 17: Cue System
+
+```
+Table DAT (cue list: cue_number, scene_name, duration, transition_type)
+    |
+Script CHOP (cue state: current_cue, progress, next_cue_trigger)
+    |
+[export to Switch/Cross TOPs to transition between scenes]
+```
+
+```python
+td_execute_python: """
+# Simple cue system
+cue_table = op('/project1/cue_list')
+cue_state = op('/project1/cue_state')
+
+def advance_cue():
+    current = int(cue_state.par.value0.val)
+    next_cue = min(current + 1, cue_table.numRows - 1)
+    cue_state.par.value0.val = next_cue
+    
+    scene = cue_table[next_cue, 'scene']
+    duration = float(cue_table[next_cue, 'duration'])
+    
+    # Set crossfade target and duration
+    op('/project1/cross1').par.cross.val = 0
+    # Animate cross to 1.0 over duration seconds
+    # (use a Timer CHOP or LFO CHOP for smooth animation)
+"""
+```
+
+## Networking
+
+### Pattern 18: OSC Server/Client
+
+```
+# Sending OSC
+OSC Out CHOP -> (network) -> external application
+
+# Receiving OSC  
+(network) -> OSC In CHOP -> Select CHOP -> [use values]
+```
+
+### Pattern 19: NDI Video Streaming
+
+```
+# Send video over network
+[any TOP chain] -> NDI Out TOP (source name)
+
+# Receive video from network
+NDI In TOP (select source) -> [process as normal TOP]
+```
+
+### Pattern 20: WebSocket Communication
+
+```
+WebSocket DAT -> Script DAT (parse JSON messages) -> [update visuals]
+```
+
+```python
+td_execute_python: """
+ws = op('/project1/websocket1')
+ws.par.address = 'ws://localhost:8080'
+ws.par.active = True
+
+# In a DAT Execute callback (Script DAT watching WebSocket DAT):
+# def onTableChange(dat):
+#     import json
+#     msg = json.loads(dat.text)
+#     op('/project1/noise1').par.seed.val = msg.get('seed', 0)
+"""
+```
@@ -0,0 +1,239 @@
+# TouchDesigner Operator Reference
+
+## Operator Families Overview
+
+TouchDesigner has 6 operator families. Each family processes a specific data type and is color-coded in the UI. Operators can only connect to others of the SAME family (with cross-family converters as the bridge).
+
+## TOPs — Texture Operators (Purple)
+
+2D image/texture processing on the GPU. The workhorse of visual output.
+
+### Generators (create images from nothing)
+
+| Operator | Type Name | Key Parameters | Use |
+|----------|-----------|---------------|-----|
+| Noise TOP | `noiseTop` | `type` (0-6), `monochrome`, `seed`, `period`, `harmonics`, `exponent`, `amp`, `offset`, `resolutionw/h` | Procedural noise textures — Perlin, Simplex, Sparse, etc. Foundation of generative art. |
+| Constant TOP | `constantTop` | `colorr/g/b/a`, `resolutionw/h` | Solid color. Use as background or blend input. |
+| Text TOP | `textTop` | `text`, `fontsizex`, `fontfile`, `alignx/y`, `colorr/g/b` | Render text to texture. Supports multi-line, word wrap. |
+| Ramp TOP | `rampTop` | `type` (0=horizontal, 1=vertical, 2=radial, 3=circular), `phase`, `period` | Gradient textures for masking, color mapping. |
+| Circle TOP | `circleTop` | `radiusx/y`, `centerx/y`, `width` | Circles, rings, ellipses. |
+| Rectangle TOP | `rectangleTop` | `sizex/y`, `centerx/y`, `softness` | Rectangles with optional softness. |
+| GLSL TOP | `glslTop` | `dat` (points to shader DAT), `resolutionw/h`, `outputformat`, custom uniforms | Custom fragment shaders. Most powerful TOP for custom visuals. |
+| GLSL Multi TOP | `glslmultiTop` | `dat`, `numinputs`, `numoutputs`, `numcomputepasses` | Multi-pass GLSL with compute shaders. Advanced. |
+| Render TOP | `renderTop` | `camera`, `geometry`, `lights`, `resolutionw/h` | Renders 3D scenes (SOPs + MATs + Camera/Light COMPs). |
+
+### Filters (modify a single input)
+
+| Operator | Type Name | Key Parameters | Use |
+|----------|-----------|---------------|-----|
+| Level TOP | `levelTop` | `opacity`, `brightness1/2`, `gamma1/2`, `contrast`, `invert`, `blacklevel/whitelevel` | Brightness, contrast, gamma, levels. Essential color correction. |
+| Blur TOP | `blurTop` | `sizex/y`, `type` (0=Gaussian, 1=Box, 2=Bartlett) | Gaussian/box blur. |
+| Transform TOP | `transformTop` | `tx/ty`, `sx/sy`, `rz`, `pivotx/y`, `extend` (0=Hold, 1=Zero, 2=Repeat, 3=Mirror) | Translate, scale, rotate textures. |
+| HSV Adjust TOP | `hsvadjustTop` | `hueoffset`, `saturationmult`, `valuemult` | HSV color adjustments. |
+| Lookup TOP | `lookupTop` | (input: texture + lookup table) | Color remapping via lookup table texture. |
+| Edge TOP | `edgeTop` | `type` (0=Sobel, 1=Frei-Chen) | Edge detection. |
+| Displace TOP | `displaceTop` | `scalex/y` | Pixel displacement using a second input as displacement map. |
+| Flip TOP | `flipTop` | `flipx`, `flipy`, `flop` (diagonal) | Mirror/flip textures. |
+| Crop TOP | `cropTop` | `cropleft/right/top/bottom` | Crop region of texture. |
+| Resolution TOP | `resolutionTop` | `resolutionw/h`, `outputresolution` | Resize textures. |
+| Null TOP | `nullTop` | (none significant) | Pass-through. Use for organization, referencing, feedback delay. |
+| Cache TOP | `cacheTop` | `length`, `step` | Store N frames of history. Useful for trails, time effects. |
+
+### Compositors (combine multiple inputs)
+
+| Operator | Type Name | Key Parameters | Use |
+|----------|-----------|---------------|-----|
+| Composite TOP | `compositeTop` | `operand` (0-31: Over, Add, Multiply, Screen, etc.) | Blend two textures with standard compositing modes. |
+| Over TOP | `overTop` | (simple alpha compositing) | Layer with alpha. Simpler than Composite. |
+| Add TOP | `addTop` | (additive blend) | Additive blending. Great for glow, light effects. |
+| Multiply TOP | `multiplyTop` | (multiplicative blend) | Multiply blend. Good for masking, darkening. |
+| Switch TOP | `switchTop` | `index` (0-based) | Switch between multiple inputs by index. |
+| Cross TOP | `crossTop` | `cross` (0.0-1.0) | Crossfade between two inputs. |
+
+### I/O (input/output)
+
+| Operator | Type Name | Key Parameters | Use |
+|----------|-----------|---------------|-----|
+| Movie File In TOP | `moviefileinTop` | `file`, `speed`, `trim`, `index` | Load video files, image sequences. |
+| Movie File Out TOP | `moviefileoutTop` | `file`, `type` (codec), `record` (toggle) | Record/export video files. |
+| NDI In TOP | `ndiinTop` | `sourcename` | Receive NDI video streams. |
+| NDI Out TOP | `ndioutTop` | `sourcename` | Send NDI video streams. |
+| Syphon Spout In/Out TOP | `syphonspoutinTop` / `syphonspoutoutTop` | `servername` | Inter-app texture sharing. |
+| Video Device In TOP | `videodeviceinTop` | `device` | Webcam/capture card input. |
+| Feedback TOP | `feedbackTop` | `top` (path to the TOP to feed back) | One-frame delay feedback. Essential for recursive effects. |
+
+### Converters
+
+| Operator | Type Name | Direction | Use |
+|----------|-----------|-----------|-----|
+| CHOP to TOP | `choptopTop` | CHOP -> TOP | Visualize channel data as texture (waveform, spectrum display). |
+| TOP to CHOP | `topchopChop` | TOP -> CHOP | Sample texture pixels as channel data. |
+
+## CHOPs — Channel Operators (Green)
+
+Time-varying numeric data: audio, animation curves, sensor data, control signals.
+
+### Generators
+
+| Operator | Type Name | Key Parameters | Use |
+|----------|-----------|---------------|-----|
+| Constant CHOP | `constantChop` | `name0/value0`, `name1/value1`... | Static named channels. Control panel for parameters. |
+| LFO CHOP | `lfoChop` | `frequency`, `type` (0=Sin, 1=Tri, 2=Square, 3=Ramp, 4=Pulse), `amp`, `offset`, `phase` | Low frequency oscillator. Animation driver. |
+| Noise CHOP | `noiseChop` | `type`, `roughness`, `period`, `amp`, `seed`, `channels` | Smooth random motion. Organic animation. |
+| Pattern CHOP | `patternChop` | `type` (0=Sine, 1=Triangle, ...), `length`, `cycles` | Generate waveform patterns. |
+| Timer CHOP | `timerChop` | `length`, `play`, `cue`, `cycles` | Countdown/count-up timer with cue points. |
+| Count CHOP | `countChop` | `threshold`, `limittype`, `limitmin/max` | Event counter with wrapping/clamping. |
+
+### Audio
+
+| Operator | Type Name | Key Parameters | Use |
+|----------|-----------|---------------|-----|
+| Audio File In CHOP | `audiofileinChop` | `file`, `volume`, `play`, `speed`, `trim` | Play audio files. |
+| Audio Device In CHOP | `audiodeviceinChop` | `device`, `channels` | Live microphone/line input. |
+| Audio Spectrum CHOP | `audiospectrumChop` | `size` (FFT size), `outputformat` (0=Power, 1=Magnitude) | FFT frequency analysis. |
+| Audio Band EQ CHOP | `audiobandeqChop` | `bands`, `gaindb` per band | Frequency band isolation. |
+| Audio Device Out CHOP | `audiodeviceoutChop` | `device` | Audio playback output. |
+
+### Math/Logic
+
+| Operator | Type Name | Key Parameters | Use |
+|----------|-----------|---------------|-----|
+| Math CHOP | `mathChop` | `preoff`, `gain`, `postoff`, `chanop` (0=Off, 1=Add, 2=Subtract, 3=Multiply...) | Math operations on channels. The Swiss army knife. |
+| Logic CHOP | `logicChop` | `preop` (0=Off, 1=AND, 2=OR, 3=XOR, 4=NAND), `convert` | Boolean logic on channels. |
+| Filter CHOP | `filterChop` | `type` (0=Low Pass, 1=Band Pass, 2=High Pass, 3=Notch), `cutofffreq`, `filterwidth` | Smooth, dampen, filter signals. |
+| Lag CHOP | `lagChop` | `lag1/2`, `overshoot1/2` | Smooth transitions with overshoot. |
+| Limit CHOP | `limitChop` | `type` (0=Clamp, 1=Loop, 2=ZigZag), `min/max` | Clamp or wrap channel values. |
+| Speed CHOP | `speedChop` | (none significant) | Integrate values (velocity to position, acceleration to velocity). |
+| Trigger CHOP | `triggerChop` | `attack`, `peak`, `decay`, `sustain`, `release` | ADSR envelope from trigger events. |
+| Select CHOP | `selectChop` | `chop` (path), `channames` | Reference channels from another CHOP. |
+| Merge CHOP | `mergeChop` | `align` (0=Extend, 1=Trim to First, 2=Trim to Shortest) | Combine channels from multiple CHOPs. |
+| Null CHOP | `nullChop` | (none significant) | Pass-through for organization and referencing. |
+
+### Input Devices
+
+| Operator | Type Name | Use |
+|----------|-----------|-----|
+| Mouse In CHOP | `mouseinChop` | Mouse position, buttons, wheel. |
+| Keyboard In CHOP | `keyboardinChop` | Keyboard key states. |
+| MIDI In CHOP | `midiinChop` | MIDI note/CC input. |
+| OSC In CHOP | `oscinChop` | OSC message input (network). |
+
+## SOPs — Surface Operators (Blue)
+
+3D geometry: points, polygons, NURBS, meshes.
+
+### Generators
+
+| Operator | Type Name | Key Parameters | Use |
+|----------|-----------|---------------|-----|
+| Grid SOP | `gridSop` | `rows`, `cols`, `sizex/y`, `type` (0=Polygon, 1=Mesh, 2=NURBS) | Flat grid mesh. Foundation for displacement, instancing. |
+| Sphere SOP | `sphereSop` | `type`, `rows`, `cols`, `radius` | Sphere geometry. |
+| Box SOP | `boxSop` | `sizex/y/z` | Box geometry. |
+| Torus SOP | `torusSop` | `radiusx/y`, `rows`, `cols` | Donut shape. |
+| Circle SOP | `circleSop` | `type`, `radius`, `divs` | Circle/ring geometry. |
+| Line SOP | `lineSop` | `dist`, `points` | Line segments. |
+| Text SOP | `textSop` | `text`, `fontsizex`, `fontfile`, `extrude` | 3D text geometry. |
+
+### Modifiers
+
+| Operator | Type Name | Key Parameters | Use |
+|----------|-----------|---------------|-----|
+| Transform SOP | `transformSop` | `tx/ty/tz`, `rx/ry/rz`, `sx/sy/sz` | Transform geometry (translate, rotate, scale). |
+| Noise SOP | `noiseSop` | `type`, `amp`, `period`, `roughness` | Deform geometry with noise. |
+| Sort SOP | `sortSop` | `ptsort`, `primsort` | Reorder points/primitives. |
+| Facet SOP | `facetSop` | `unique`, `consolidate`, `computenormals` | Normals, consolidation, unique points. |
+| Merge SOP | `mergeSop` | (none significant) | Combine multiple geometry inputs. |
+| Null SOP | `nullSop` | (none significant) | Pass-through. |
+
+## DATs — Data Operators (White)
+
+Text, tables, scripts, network data.
+
+### Core
+
+| Operator | Type Name | Key Parameters | Use |
+|----------|-----------|---------------|-----|
+| Table DAT | `tableDat` | (edit content directly) | Spreadsheet-like data tables. |
+| Text DAT | `textDat` | (edit content directly) | Arbitrary text content. Shader code, configs, scripts. |
+| Script DAT | `scriptDat` | `language` (0=Python, 1=C++) | Custom callbacks and DAT processing. |
+| CHOP Execute DAT | `chopexecDat` | `chop` (path to watch), callbacks | Trigger Python on CHOP value changes. |
+| DAT Execute DAT | `datexecDat` | `dat` (path to watch) | Trigger Python on DAT content changes. |
+| Panel Execute DAT | `panelexecDat` | `panel` | Trigger Python on UI panel events. |
+
+### I/O
+
+| Operator | Type Name | Key Parameters | Use |
+|----------|-----------|---------------|-----|
+| Web DAT | `webDat` | `url`, `fetchmethod` (0=GET, 1=POST) | HTTP requests. API integration. |
+| TCP/IP DAT | `tcpipDat` | `address`, `port`, `mode` | TCP networking. |
+| OSC In DAT | `oscinDat` | `port` | Receive OSC as text messages. |
+| Serial DAT | `serialDat` | `port`, `baudrate` | Serial port communication (Arduino, etc.). |
+| File In DAT | `fileinDat` | `file` | Read text files. |
+| File Out DAT | `fileoutDat` | `file`, `write` | Write text files. |
+
+### Conversions
+
+| Operator | Type Name | Direction | Use |
+|----------|-----------|-----------|-----|
+| DAT to CHOP | `dattochopChop` | DAT -> CHOP | Convert table data to channels. |
+| CHOP to DAT | `choptodatDat` | CHOP -> DAT | Convert channel data to table rows. |
+| SOP to DAT | `soptodatDat` | SOP -> DAT | Extract geometry data as table. |
+
+## MATs — Material Operators (Yellow)
+
+Materials for 3D rendering in Render TOP / Geometry COMP.
+
+| Operator | Type Name | Key Parameters | Use |
+|----------|-----------|---------------|-----|
+| Phong MAT | `phongMat` | `diff_colorr/g/b`, `spec_colorr/g/b`, `shininess`, `colormap`, `normalmap` | Classic Phong shading. Simple, fast. |
+| PBR MAT | `pbrMat` | `basecolorr/g/b`, `metallic`, `roughness`, `normalmap`, `emitcolorr/g/b` | Physically-based rendering. Realistic materials. |
+| GLSL MAT | `glslMat` | `dat` (shader DAT), custom uniforms | Custom vertex + fragment shaders for 3D. |
+| Constant MAT | `constMat` | `colorr/g/b`, `colormap` | Flat unlit color/texture. No shading. |
+| Point Sprite MAT | `pointspriteMat` | `colormap`, `scale` | Render points as camera-facing sprites. Great for particles. |
+| Wireframe MAT | `wireframeMat` | `colorr/g/b`, `width` | Wireframe rendering. |
+| Depth MAT | `depthMat` | `near`, `far` | Render depth buffer as grayscale. |
+
+## COMPs — Component Operators (Gray)
+
+Containers, 3D scene elements, UI components.
+
+### 3D Scene
+
+| Operator | Type Name | Key Parameters | Use |
+|----------|-----------|---------------|-----|
+| Geometry COMP | `geometryComp` | `material` (path), `instancechop` (path), `instancing` (toggle) | Renders geometry with material. Instancing host. |
+| Camera COMP | `cameraComp` | `tx/ty/tz`, `rx/ry/rz`, `fov`, `near/far` | Camera for Render TOP. |
+| Light COMP | `lightComp` | `lighttype` (0=Point, 1=Directional, 2=Spot, 3=Cone), `dimmer`, `colorr/g/b` | Lighting for 3D scenes. |
+| Ambient Light COMP | `ambientlightComp` | `dimmer`, `colorr/g/b` | Ambient lighting. |
+| Environment Light COMP | `envlightComp` | `envmap` | Image-based lighting (IBL). |
+
+### Containers
+
+| Operator | Type Name | Key Parameters | Use |
+|----------|-----------|---------------|-----|
+| Container COMP | `containerComp` | `w`, `h`, `bgcolor1/2/3` | UI container. Holds other COMPs for panel layouts. |
+| Base COMP | `baseComp` | (none significant) | Generic container. Networks-inside-networks. |
+| Replicator COMP | `replicatorComp` | `template`, `operatorsdat` | Clone a template operator N times from a table. |
+
+### Utilities
+
+| Operator | Type Name | Key Parameters | Use |
+|----------|-----------|---------------|-----|
+| Window COMP | `windowComp` | `winw/h`, `winoffsetx/y`, `monitor`, `borders` | Output window for display/projection. |
+| Select COMP | `selectComp` | `rowcol`, `panel` | Select and display content from elsewhere. |
+| Engine COMP | `engineComp` | `tox`, `externaltox` | Load external .tox components. Sub-process isolation. |
+
+## Cross-Family Converter Summary
+
+| From | To | Operator | Type Name |
+|------|-----|----------|-----------|
+| CHOP | TOP | CHOP to TOP | `choptopTop` |
+| TOP | CHOP | TOP to CHOP | `topchopChop` |
+| DAT | CHOP | DAT to CHOP | `dattochopChop` |
+| CHOP | DAT | CHOP to DAT | `choptodatDat` |
+| SOP | CHOP | SOP to CHOP | `soptochopChop` |
+| CHOP | SOP | CHOP to SOP | `choptosopSop` |
+| SOP | DAT | SOP to DAT | `soptodatDat` |
+| DAT | SOP | DAT to SOP | `dattosopSop` |
+| SOP | TOP | (use Render TOP + Geometry COMP) | — |
+| TOP | SOP | TOP to SOP | `toptosopSop` |
@@ -0,0 +1,508 @@
+# TouchDesigner MCP — Pitfalls & Lessons Learned
+
+Hard-won knowledge from real TD sessions. Read this before building anything.
+
+## Parameter Names
+
+### 1. NEVER hardcode parameter names — always discover
+
+Parameter names change between TD versions. What works in one build may not work in another. ALWAYS use td_get_par_info to discover actual names from TD.
+
+The agent's LLM training data contains WRONG parameter names. Do not trust them.
+
+Known historical differences (may vary further — always verify):
+| What docs/training say | Actual in some versions | Notes |
+|---------------|---------------|-------|
+| `dat` | `pixeldat` | GLSL TOP pixel shader DAT |
+| `colora` | `alpha` | Constant TOP alpha |
+| `sizex` / `sizey` | `size` | Blur TOP (single value) |
+| `fontr/g/b/a` | `fontcolorr/g/b/a` | Text TOP font color (r/g/b) |
+| `fontcolora` | `fontalpha` | Text TOP font alpha (NOT `fontcolora`) |
+| `bgcolora` | `bgalpha` | Text TOP bg alpha |
+| `value1name` | `vec0name` | GLSL TOP uniform name |
+
+### 2. twozero td_execute_python response format
+
+When calling `td_execute_python` via twozero MCP, successful responses return `(ok)` followed by FPS/error summary (e.g. `[fps 60.0/60] [0 err/0 warn]`), NOT the raw Python `result` dict. If you're parsing responses programmatically, check for the `(ok)` prefix — don't pattern-match on Python variable names from the script. Use `td_get_operator_info` or separate inspection calls to read back values.
+
+### 3. When using td_set_operator_pars, param names must match exactly
+
+Use td_get_par_info to discover them. The MCP tool validates parameter names and returns clear errors explaining what went wrong, unlike raw Python which crashes the whole script with tdAttributeError and stops execution. Always discover before setting.
+
+### 4. Use `safe_par()` pattern for cross-version compatibility
+
+```python
+def safe_par(node, name, value):
+    p = getattr(node.par, name, None)
+    if p is not None:
+        p.val = value
+        return True
+    return False
+```
+
+### 5. `td.tdAttributeError` crashes the whole script — use defensive access
+
+If you do `node.par.nonexistent = value`, TD raises `tdAttributeError` and stops the entire script. Prevention is better than catching:
+- Use `op()` instead of `opex()` — `op()` returns None on failure, `opex()` raises
+- Use `hasattr(node.par, 'name')` before accessing any parameter
+- Use `getattr(node.par, 'name', None)` with a default
+- Use the `safe_par()` pattern from pitfall #3
+
+```python
+# WRONG — crashes if param doesn't exist:
+node.par.nonexistent = value
+
+# CORRECT — defensive access:
+if hasattr(node.par, 'nonexistent'):
+    node.par.nonexistent = value
+```
+
+### 6. `outputresolution` is a string menu, not an integer
+
+```
+menuNames: ['useinput','eighth','quarter','half','2x','4x','8x','fit','limit','custom','parpanel']
+```
+Always use the string form. Setting `outputresolution = 9` may silently fail.
+```python
+node.par.outputresolution = 'custom'  # correct
+node.par.resolutionw = 1280; node.par.resolutionh = 720
+```
+Discover valid values: `list(node.par.outputresolution.menuNames)`
+
+## GLSL Shaders
+
+### 7. `uTDCurrentTime` does NOT exist in GLSL TOP
+
+There is NO built-in time uniform for GLSL TOPs. GLSL MAT has `uTDGeneral.seconds` but that's NOT available in GLSL TOP context.
+
+**PRIMARY — GLSL TOP Vectors/Values page:**
+```python
+gl.par.value0name = 'uTime'
+gl.par.value0.expr = "absTime.seconds"
+# In GLSL: uniform float uTime;
+```
+
+**FALLBACK — Constant TOP texture (for complex time data):**
+
+CRITICAL: set format to `rgba32float` — default 8-bit clamps to 0-1:
+```python
+t = root.create(constantTOP, 'time_driver')
+t.par.format = 'rgba32float'
+t.par.outputresolution = 'custom'
+t.par.resolutionw = 1; t.par.resolutionh = 1
+t.par.colorr.expr = "absTime.seconds % 1000.0"
+t.outputConnectors[0].connect(glsl.inputConnectors[0])
+```
+
+### 8. GLSL compile errors are silent in the API
+
+The GLSL TOP shows a yellow warning triangle in the UI but `node.errors()` may return empty string. Check `node.warnings()` too, and create an Info DAT pointed at the GLSL TOP to read the actual compiler output.
+
+### 9. TD GLSL uses `vUV.st` not `gl_FragCoord` — and REQUIRES `TDOutputSwizzle()` on macOS
+
+Standard GLSL patterns don't work. TD provides:
+- `vUV.st` — UV coordinates (0-1)
+- `uTDOutputInfo.res.zw` — resolution
+- `sTD2DInputs[0]` — input textures
+- `layout(location = 0) out vec4 fragColor` — output
+
+CRITICAL on macOS: Always wrap output with `TDOutputSwizzle()`:
+```glsl
+fragColor = TDOutputSwizzle(color);
+```
+TD uses GLSL 4.60 (Vulkan backend). GLSL 3.30 and earlier removed.
+
+### 10. Large GLSL shaders — write to temp file
+
+GLSL code with special characters can corrupt JSON payloads. Write the shader to a temp file and load it in TD:
+```python
+# Agent side: write shader to /tmp/shader.glsl via write_file
+# TD side:
+sd = root.create(textDAT, 'shader_code')
+with open('/tmp/shader.glsl', 'r') as f:
+    sd.text = f.read()
+```
+
+## Node Management
+
+### 11. Destroying nodes while iterating `root.children` causes `tdError`
+
+The iterator is invalidated when a child is destroyed. Always snapshot first:
+```python
+kids = list(root.children)  # snapshot
+for child in kids:
+    if child.valid:  # check — earlier destroys may cascade
+        child.destroy()
+```
+
+### 11b. Split cleanup and creation into SEPARATE td_execute_python calls
+
+Creating nodes with the same names you just destroyed in the SAME script causes "Invalid OP object" errors — even with `list()` snapshot. TD's internal references can go stale within one execution context.
+
+**WRONG (single call):**
+```python
+# td_execute_python:
+for c in list(root.children):
+    if c.valid and c.name.startswith('promo_'):
+        c.destroy()
+# ... then create promo_audio, promo_shader etc. in same script → CRASHES
+```
+
+**CORRECT (two separate calls):**
+```python
+# Call 1: td_execute_python — clean only
+for c in list(root.children):
+    if c.valid and c.name.startswith('promo_'):
+        c.destroy()
+
+# Call 2: td_execute_python — build (separate MCP call)
+audio = root.create(audiofileinCHOP, 'promo_audio')
+# ... rest of build
+```
+
+### 12. Feedback TOP: use `top` parameter, NOT direct input wire
+
+The feedbackTOP's `top` parameter references which TOP to delay. Do NOT also wire that TOP directly into the feedback's input — this creates a real cook dependency loop.
+
+Correct setup:
+```python
+fb = root.create(feedbackTOP, 'fb_delay')
+fb.par.top = comp.path          # reference only — no wire to fb input
+fb.outputConnectors[0].connect(xf)  # fb output -> transform -> fade -> comp
+```
+
+The "Cook dependency loop detected" warning on the transform/fade chain is expected.
+
+### 13. GLSL TOP auto-creates companion nodes
+
+Creating a `glslTOP` also creates `name_pixel` (Text DAT), `name_info` (Info DAT), and `name_compute` (Text DAT). These are visible in the network. Don't be alarmed by "extra" nodes.
+
+### 14. The default project root is `/project1`
+
+New TD files start with `/project1` as the main container. System nodes live at `/`, `/ui`, `/sys`, `/local`, `/perform`. Don't create user nodes outside `/project1`.
+
+### 15. Non-Commercial license caps resolution at 1280x1280
+
+Setting `resolutionw=1920` silently clamps to 1280. Always check effective resolution after creation:
+```python
+n.cook(force=True)
+actual = str(n.width) + 'x' + str(n.height)
+```
+
+## Recording & Codecs
+
+### 16. MovieFileOut TOP: H.264/H.265/AV1 requires Commercial license
+
+In Non-Commercial TD, these codecs produce an error. Recommended alternatives:
+- `prores` — Apple ProRes, **best on macOS**, HW accelerated, NOT license-restricted. ~55MB/s at 1280x720 but lossless quality. **Use this as default on macOS.**
+- `cineform` — GoPro Cineform, supports alpha
+- `hap` — GPU-accelerated playback, large files
+- `notchlc` — GPU-accelerated, good quality
+- `mjpa` — Motion JPEG, legacy fallback (lossy, use only if ProRes unavailable)
+
+For image sequences: `rec.par.type = 'imagesequence'`, `rec.par.imagefiletype = 'png'`
+
+### 17. MovieFileOut `.record()` method may not exist
+
+Use the toggle parameter instead:
+```python
+rec.par.record = True   # start recording
+rec.par.record = False  # stop recording
+```
+
+When setting file path and starting recording in the same script, use delayFrames:
+```python
+rec.par.file = '/tmp/new_output.mov'
+run("op('/project1/recorder').par.record = True", delayFrames=2)
+```
+
+### 18. TOP.save() captures same frame when called rapidly
+
+Use MovieFileOut for real-time recording. Set `project.realTime = False` for frame-accurate output.
+
+### 19. AudioFileIn CHOP: cue and recording sequence matters
+
+The recording sequence must be done in exact order, or the recording will be empty, audio will start mid-file, or the file won't be written.
+
+**Proven recording sequence:**
+
+```python
+# Step 1: Stop any existing recording
+rec.par.record = False
+
+# Step 2: Reset audio to beginning
+audio.par.play = False
+audio.par.cue = True
+audio.par.cuepoint = 0      # may need cuepointunit=0 too
+# Verify: audio.par.cue.eval() should be True
+
+# Step 3: Set output file path
+rec.par.file = '/tmp/output.mov'
+
+# Step 4: Release cue + start playing + start recording (with frame delay)
+audio.par.cue = False
+audio.par.play = True
+audio.par.playmode = 2      # Sequential — plays once through
+run("op('/project1/recorder').par.record = True", delayFrames=3)
+```
+
+**Why each step matters:**
+- `rec.par.record = False` first — if a previous recording is active, setting `par.file` may fail silently
+- `audio.par.cue = True` + `cuepoint = 0` — guarantees audio starts from the beginning, otherwise the spectrum may be silent for the first few seconds
+- `delayFrames=3` on the record start — setting `par.file` and `par.record = True` in the same script can race; the file path needs a frame to register before recording starts
+- `playmode = 2` (Sequential) — plays the file once. Use `playmode = 0` (Locked to Timeline) if you want TD's timeline to control position
+
+## TD Python API Patterns
+
+### 20. COMP extension setup: ext0object format is CRITICAL
+
+`ext0object` expects a CONSTANT string (NOT expression mode):
+```python
+comp.par.ext0object = "op('./myExtensionDat').module.MyClassName(me)"
+```
+NEVER set as just the DAT name. NEVER use ParMode.EXPRESSION. ALWAYS ensure the DAT has `par.language='python'`.
+
+### 21. td.Panel is NOT subscriptable — use attribute access
+
+```python
+comp.panel.select      # correct (attribute access, returns float)
+comp.panel['select']   # WRONG — 'td.Panel' object is not subscriptable
+```
+
+### 22. ALWAYS use relative paths in script callbacks
+
+In scriptTOP/CHOP/SOP/DAT callbacks, use paths relative to `scriptOp` or `me`:
+```python
+root = scriptOp.parent().parent()
+dat = root.op('pixel_data')
+```
+NEVER hardcode absolute paths like `op('/project1/myComp/child')` — they break when containers are renamed or copied.
+
+### 23. keyboardinCHOP channel names have 'k' prefix
+
+Channel names are `kup`, `kdown`, `kleft`, `kright`, `ka`, `kb`, etc. — NOT `up`, `down`, `a`, `b`. Always verify with:
+```python
+channels = [c.name for c in op('/project1/keyboard1').chans()]
+```
+
+### 24. expressCHOP cook-only properties — false positive errors
+
+`me.inputVal`, `me.chanIndex`, `me.sampleIndex` work ONLY in cook-context. Calling `par.expr0expr.eval()` from outside always raises an error — this is NOT a real operator error. Ignore these in error scans.
+
+### 25. td.Vertex attributes — use index access not named attributes
+
+In TD 2025.32, `td.Vertex` objects do NOT have `.x`, `.y`, `.z` attributes:
+```python
+# WRONG — crashes:
+vertex.x, vertex.y, vertex.z
+
+# CORRECT — index-based:
+vertex.point.P[0], vertex.point.P[1], vertex.point.P[2]
+# Or for SOP point positions:
+pt = sop.points()[i]
+pos = pt.P    # use P[0], P[1], P[2]
+```
+
+## Audio
+
+### 26. Audio Spectrum CHOP output is weak — boost it
+
+Raw output is very small (0.001-0.05). Use built-in boost: `spectrum.par.highfrequencyboost = 3.0`
+
+If still weak, add Math CHOP in Range mode: `fromrangehi=0.05, torangehi=1.0`
+
+### 27. AudioSpectrum CHOP: timeslice and sample count are the #1 gotcha
+
+AudioSpectrum at 44100Hz with `timeslice=False` outputs the ENTIRE audio file as samples (~24000+). CHOP-to-TOP then exceeds texture resolution max and warns/fails.
+
+**Fix:** Keep `timeslice = True` (default) for real-time per-frame FFT. Set `fftsize` to control bin count (it's a STRING enum: `'256'` not `256`).
+
+If the CHOP-to-TOP still gets too many samples, set `layout = 'rowscropped'` on the choptoTOP.
+
+```python
+spectrum.par.fftsize = '256'      # STRING, not int — enum values
+spectrum.par.timeslice = True     # MUST be True for real-time audio reactivity
+spectex.par.layout = 'rowscropped'  # handles oversized CHOP inputs
+```
+
+**resampleCHOP has NO `numsamples` param.** It uses `rate`, `start`, `end`, `method`. Don't guess — always `td_get_par_info('resampleCHOP')` first.
+
+### 28. CHOP To TOP has NO input connectors — use par.chop reference
+
+```python
+spec_tex = root.create(choptoTOP, 'spectrum_tex')
+spec_tex.par.chop = resample  # correct: parameter reference
+# NOT: resample.outputConnectors[0].connect(spec_tex.inputConnectors[0])  # WRONG
+```
+
+## Workflow
+
+### 29. Always verify after building — errors are silent
+
+Node errors and broken connections produce no output. Always check:
+```python
+for c in list(root.children):
+    e = c.errors()
+    w = c.warnings()
+    if e: print(c.name, 'ERR:', e)
+    if w: print(c.name, 'WARN:', w)
+```
+
+### 30. Window COMP param for display target is `winop`
+
+```python
+win = root.create(windowCOMP, 'display')
+win.par.winop = '/project1/logo_out'
+win.par.winw = 1280; win.par.winh = 720
+win.par.winopen.pulse()
+```
+
+### 31. `sample()` returns frozen pixels in rapid calls
+
+`out.sample(x, y)` returns pixels from a single cook snapshot. Compare samples with 2+ second delays, or use screencapture on the display window.
+
+### 32. Audio-reactive GLSL: dual-layer sync pipeline
+
+For audio-synced visuals, use BOTH layers for maximum effect:
+
+**Layer 1 (TD-side, real-time):** AudioFileIn → AudioSpectrum(timeslice=True, fftsize='256') → Math(gain=5) → choptoTOP(par.chop=math, layout='rowscropped') → GLSL input. The shader samples `sTD2DInputs[1]` at different x positions for bass/mid/hi. Record the TD output with MovieFileOut.
+
+**Layer 2 (Python-side, post-hoc):** scipy FFT on the SAME audio file → per-frame features (rms, bass, mid, hi, beat detection) → drive ASCII brightness, chromatic aberration, beat flashes during the render pass.
+
+Both layers locked to the same audio file = visuals genuinely sync to the beat at two independent stages.
+
+**Key gotcha:** AudioFileIn must be cued (`par.cue=True` → `par.cuepulse.pulse()`) then uncued (`par.cue=False`, `par.play=True`) before recording starts. Otherwise the spectrum is silent for the first few seconds.
+
+### 33. twozero MCP: benchmark and prefer native tools
+
+Benchmarked April 2026: twozero MCP with 36 native tools. The old curl/REST method (port 9981) had zero native tools.
+
+**Always prefer native MCP tools over td_execute_python:**
+- `td_create_operator` over `root.create()` scripts (handles viewport positioning)
+- `td_set_operator_pars` over `node.par.X = Y` scripts (validates param names)
+- `td_get_par_info` over temp-node discovery dance (instant, no cleanup)
+- `td_get_errors` over manual `c.errors()` loops
+- `td_get_focus` for context awareness (no equivalent in old method)
+
+Only fall back to `td_execute_python` for multi-step logic (wiring chains, conditional builds, loops).
+
+### 34. twozero td_execute_python response wrapping
+
+twozero wraps `td_execute_python` responses with status info: `(ok)\n\n[fps 60.0/60] [0 err/0 warn]`. Your Python `result` variable value may not appear verbatim in the response text. If you need to check results programmatically, use `print()` statements in the script — they appear in the response. Don't rely on string-matching the `result` dict.
+
+### 35. Audio-reactive chain: DO NOT use Lag CHOP or Filter CHOP for spectrum smoothing
+
+The Derivative docs and tutorials suggest using Lag CHOP (lag1=0.2, lag2=0.5) to smooth raw FFT output before passing to a shader. **This does NOT work with AudioSpectrum → CHOP to TOP → GLSL.**
+
+What happens: Lag CHOP operates in timeslice mode. A 256-sample spectrum input gets expanded to 1600-2400 samples. The Lag averaging drives all values to near-zero (~1e-06). The CHOP to TOP produces a 2400x2 texture instead of 256x2. The shader receives effectively zero audio data.
+
+**The correct chain is: Spectrum(outlength=256) → Math(gain=10) → CHOPtoTOP → GLSL.** No CHOP smoothing at all. If you need smoothing, do it in the GLSL shader via temporal lerp with a feedback texture.
+
+Verified values with audio playing:
+- Without Lag CHOP: bass bins = 5.0-5.4, mid bins = 1.0-1.7 (strong, usable)
+- With Lag CHOP: ALL bins = 0.000001-0.00004 (dead, zero audio reactivity)
+
+### 36. AudioSpectrum Output Length: set manually to avoid CHOP to TOP overflow
+
+AudioSpectrum in Visualization mode with FFT 8192 outputs 22,050 samples by default (1 per Hz, 0–22050). CHOP to TOP cannot handle this — you get "Number of samples exceeded texture resolution max".
+
+Fix: `spectrum.par.outputmenu = 'setmanually'` and `spectrum.par.outlength = 256`. This gives 256 frequency bins — plenty for visual FFT.
+
+DO NOT set `timeslice = False` as a workaround — that processes the entire audio file at once and produces even more samples.
+
+### 37. GLSL spectrum texture from CHOP to TOP is 256x2 not 256x1
+
+AudioSpectrum outputs 2 channels (stereo: chan1, chan2). CHOP to TOP with `dataformat='r'` creates a 256x2 texture — one row per channel. Sample the first channel at `y=0.25` (center of first row), NOT `y=0.5` (boundary between rows):
+
+```glsl
+float bass = texture(sTD2DInputs[1], vec2(0.05, 0.25)).r;  // correct
+float bass = texture(sTD2DInputs[1], vec2(0.05, 0.5)).r;   // WRONG — samples between rows
+```
+
+### 38. FPS=0 doesn't mean ops aren't cooking — check play state
+
+TD can show `fps:0` in `td_get_perf` while ops still cook and `TOP.save()` still produces valid screenshots. The two most common causes:
+
+**a) Project is paused (playbar stopped).** TD's playbar can be toggled with spacebar. The `root` at `/` has no `.playbar` attribute (it's on the perform COMP). The easiest fix is sending a spacebar keypress via `td_input_execute`, though this tool can sometimes error. As a workaround, `TOP.save()` always works regardless of play state — use it to verify rendering is actually happening before spending time debugging FPS.
+
+**b) Audio device CHOP blocking the main thread.** An `audiooutCHOP` with an active audio device can consume 300-400ms/s (2000%+ of frame budget), stalling the cook loop at FPS=0. Fix: keep the CHOP active but set `volume=0` to prevent the audio driver from blocking. Disabling it entirely (`active=False`) may also work but can prevent downstream audio processing CHOPs from cooking.
+
+Diagnostic sequence when FPS=0:
+1. `td_get_perf` — check if any op has extreme CPU/s
+2. `TOP.save()` on the output — if it produces a valid image, the pipeline works, just not at real-time rate
+3. Check for blocking CHOPs (audioout, audiodevin, etc.)
+4. Toggle play state (spacebar, or check if absTime.seconds is advancing)
+
+### 39. Recording while FPS=0 produces empty or near-empty files
+
+This is the #1 cause of "I recorded for 30 seconds but got a 2-frame video." If TD's cook loop is stalled (FPS=0 or very low), MovieFileOut has nothing to record. Unlike `TOP.save()` which captures the last cooked frame regardless, MovieFileOut only writes frames that actually cook.
+
+**Always verify FPS before starting a recording:**
+```python
+# Check via td_get_perf first
+# If FPS < 30, do NOT start recording — fix the performance issue first
+# If FPS=0, the playbar is likely paused — see pitfall #37
+```
+
+Common causes of recording empty video:
+- Playbar paused (FPS=0) — see pitfall #37
+- Audio device CHOP blocking the main thread — see pitfall #37b
+- Recording started before audio was cued — audio is silent, GLSL outputs black, MovieFileOut records black frames that look empty
+- `par.file` set in the same script as `par.record = True` — see pitfall #18
+
+### 40. GLSL shader produces black output — test before committing to a long render
+
+New GLSL shaders can fail silently (see pitfall #7). Before recording a long take, always:
+
+1. **Write a minimal test shader first** that just outputs a solid color or pass-through:
+```glsl
+void main() {
+    vec2 uv = vUV.st;
+    fragColor = TDOutputSwizzle(vec4(uv, 0.0, 1.0));
+}
+```
+
+2. **Verify the test renders correctly** via `td_get_screenshot` on the GLSL TOP's output.
+
+3. **Swap in the real shader** and screenshot again immediately. If black, the shader has a compile error or logic issue.
+
+4. **Only then start recording.** A 90-second ProRes recording is ~5GB. Recording black frames wastes disk and time.
+
+Common causes of black GLSL output:
+- Missing `TDOutputSwizzle()` on macOS (pitfall #8)
+- Time uniform not connected — shader uses default 0.0, fractal stays at origin
+- Spectrum texture not connected — audio values all 0.0, driving everything to black
+- Integer division where float division was expected (`1/2 = 0` not `0.5`)
+- `absTime.seconds % 1000.0` rolled over past 1000 and the modulo produces unexpected values
+
+### 41. td_write_dat uses `text` parameter, NOT `content`
+
+The MCP tool `td_write_dat` expects a `text` parameter for full replacement. Passing `content` returns an error: `"Provide either 'text' for full replace, or 'old_text'+'new_text' for patching"`.
+
+If `td_write_dat` fails, fall back to `td_execute_python`:
+```python
+op("/project1/shader_code").text = shader_string
+```
+
+### 42. td_execute_python does NOT return stdout or print() output
+
+Despite what earlier versions of pitfall #33 stated, `print()` and `debug()` output from `td_execute_python` scripts does NOT appear in the MCP response. The response is always just `(ok)` + FPS/error summary. To read values back, use dedicated inspection tools (`td_get_operator_info`, `td_read_dat`, `td_read_chop`) instead of trying to print from within a script.
+
+### 43. td_get_operator_info JSON is appended with `[fps X.X/X]` — breaks json.loads()
+
+The response text from `td_get_operator_info` has `[fps 60.0/60]` appended after the JSON object. This causes `json.loads()` to fail with "Extra data" errors. Strip it before parsing:
+```python
+clean = response_text.rsplit('[fps', 1)[0]
+data = json.loads(clean)
+```
+
+### 44. td_get_screenshot is asynchronous — returns `{"status": "pending"}`
+
+Screenshots don't complete instantly. The tool returns `{"status": "pending", "requestId": "..."}` and the actual file appears later. Wait a few seconds before checking for the file. There is no callback or completion notification — poll the filesystem.
+
+### 45. Recording duration is manual — no auto-stop at audio end
+
+MovieFileOut records until `par.record = False` is set. If audio ends before you stop recording, the file keeps growing with repeated frames. Always stop recording promptly after the audio duration. For precision: set a timer on the agent side matching the audio length, then send `par.record = False`. Trim excess with ffmpeg as a safety net:
+```bash
+ffmpeg -i raw.mov -t 25 -c copy trimmed.mov
+```
@@ -0,0 +1,463 @@
+# TouchDesigner Python API Reference
+
+## The td Module
+
+TouchDesigner's Python environment auto-imports the `td` module. All TD-specific classes, functions, and constants live here. Scripts inside TD (Script DATs, CHOP/DAT Execute callbacks, Extensions) have full access.
+
+When using the MCP `execute_python_script` tool, these globals are pre-loaded:
+- `op` — shortcut for `td.op()`, finds operators by path
+- `ops` — shortcut for `td.ops()`, finds multiple operators by pattern
+- `me` — the operator running the script (via MCP this is the twozero internal executor)
+- `parent` — shortcut for `me.parent()`
+- `project` — the root project component
+- `td` — the full td module
+
+## Finding Operators: op() and ops()
+
+### op(path) — Find a single operator
+
+```python
+# Absolute path (always works from MCP)
+node = op('/project1/noise1')
+
+# Relative path (relative to current operator — only in Script DATs)
+node = op('noise1')      # sibling
+node = op('../noise1')   # parent's sibling
+
+# Returns None if not found (does NOT raise)
+node = op('/project1/nonexistent')  # None
+```
+
+### ops(pattern) — Find multiple operators
+
+```python
+# Glob patterns
+nodes = ops('/project1/noise*')       # all nodes starting with "noise"
+nodes = ops('/project1/*')            # all direct children
+nodes = ops('/project1/container1/*') # all children of container1
+
+# Returns a tuple of operators (may be empty)
+for n in ops('/project1/*'):
+    print(n.name, n.OPType)
+```
+
+### Navigation from a node
+
+```python
+node = op('/project1/noise1')
+
+node.name        # 'noise1'
+node.path        # '/project1/noise1'
+node.OPType      # 'noiseTop'
+node.type         # <class 'noiseTop'>
+node.family       # 'TOP'
+
+# Parent / children
+node.parent()              # the parent COMP
+node.parent().children     # all siblings + self
+node.parent().findChildren(name='noise*')  # filtered
+
+# Type checking
+node.isTOP   # True
+node.isCHOP  # False
+node.isSOP   # False
+node.isDAT   # False
+node.isMAT   # False
+node.isCOMP  # False
+```
+
+## Parameters
+
+Every operator has parameters accessed via the `.par` attribute.
+
+### Reading parameters
+
+```python
+node = op('/project1/noise1')
+
+# Direct access
+node.par.seed.val        # current evaluated value (may be an expression result)
+node.par.seed.eval()     # same as .val
+node.par.seed.default    # default value
+node.par.monochrome.val  # boolean parameters: True/False
+
+# List all parameters
+for p in node.pars():
+    print(f"{p.name}: {p.val} (default: {p.default})")
+
+# Filter by page (parameter group)
+for p in node.pars('Noise'):  # page name
+    print(f"{p.name}: {p.val}")
+```
+
+### Setting parameters
+
+```python
+# Direct value setting
+node.par.seed.val = 42
+node.par.monochrome.val = True
+node.par.resolutionw.val = 1920
+node.par.resolutionh.val = 1080
+
+# String parameters
+op('/project1/text1').par.text.val = 'Hello World'
+
+# File paths
+op('/project1/moviefilein1').par.file.val = '/path/to/video.mp4'
+
+# Reference another operator (for "dat", "chop", "top" type parameters)
+op('/project1/glsl1').par.dat.val = '/project1/shader_code'
+```
+
+### Parameter expressions
+
+```python
+# Python expressions that evaluate dynamically
+node.par.seed.expr = "me.time.frame"
+node.par.tx.expr = "math.sin(me.time.seconds * 2)"
+
+# Reference another parameter
+node.par.brightness1.expr = "op('/project1/constant1').par.value0.val"
+
+# Export (one-way binding from CHOP to parameter)
+# This makes the parameter follow a CHOP channel value
+op('/project1/noise1').par.seed.val  # can also be driven by exports
+```
+
+### Parameter types
+
+| Type | Python Type | Example |
+|------|------------|---------|
+| Float | `float` | `node.par.brightness1.val = 0.5` |
+| Int | `int` | `node.par.seed.val = 42` |
+| Toggle | `bool` | `node.par.monochrome.val = True` |
+| String | `str` | `node.par.text.val = 'hello'` |
+| Menu | `int` (index) or `str` (label) | `node.par.type.val = 'sine'` |
+| File | `str` (path) | `node.par.file.val = '/path/to/file'` |
+| OP reference | `str` (path) | `node.par.dat.val = '/project1/text1'` |
+| Color | separate r/g/b/a floats | `node.par.colorr.val = 1.0` |
+| XY/XYZ | separate x/y/z floats | `node.par.tx.val = 0.5` |
+
+## Creating and Deleting Operators
+
+```python
+# Create via parent component
+parent = op('/project1')
+new_node = parent.create(noiseTop)         # using class reference
+new_node = parent.create(noiseTop, 'my_noise')  # with custom name
+
+# The MCP create_td_node tool handles this automatically:
+# create_td_node(parentPath="/project1", nodeType="noiseTop", nodeName="my_noise")
+
+# Delete
+node = op('/project1/my_noise')
+node.destroy()
+
+# Copy
+original = op('/project1/noise1')
+copy = parent.copy(original, name='noise1_copy')
+```
+
+## Connections (Wiring Operators)
+
+### Output to Input connections
+
+```python
+# Connect noise1's output to level1's input
+op('/project1/noise1').outputConnectors[0].connect(op('/project1/level1'))
+
+# Connect to specific input index (for multi-input operators like Composite)
+op('/project1/noise1').outputConnectors[0].connect(op('/project1/composite1').inputConnectors[0])
+op('/project1/text1').outputConnectors[0].connect(op('/project1/composite1').inputConnectors[1])
+
+# Disconnect all outputs
+op('/project1/noise1').outputConnectors[0].disconnect()
+
+# Query connections
+node = op('/project1/level1')
+inputs = node.inputs          # list of connected input operators
+outputs = node.outputs        # list of connected output operators
+```
+
+### Connection patterns for common setups
+
+```python
+# Linear chain: A -> B -> C -> D
+ops_list = [op(f'/project1/{name}') for name in ['noise1', 'level1', 'blur1', 'null1']]
+for i in range(len(ops_list) - 1):
+    ops_list[i].outputConnectors[0].connect(ops_list[i+1])
+
+# Fan-out: A -> B, A -> C, A -> D
+source = op('/project1/noise1')
+for target_name in ['level1', 'composite1', 'transform1']:
+    source.outputConnectors[0].connect(op(f'/project1/{target_name}'))
+
+# Merge: A + B + C -> Composite
+comp = op('/project1/composite1')
+for i, source_name in enumerate(['noise1', 'text1', 'ramp1']):
+    op(f'/project1/{source_name}').outputConnectors[0].connect(comp.inputConnectors[i])
+```
+
+## DAT Content Manipulation
+
+### Text DATs
+
+```python
+dat = op('/project1/text1')
+
+# Read
+content = dat.text          # full text as string
+
+# Write
+dat.text = "new content"
+dat.text = '''multi
+line
+content'''
+
+# Append
+dat.text += "\nnew line"
+```
+
+### Table DATs
+
+```python
+dat = op('/project1/table1')
+
+# Read cell
+val = dat[0, 0]         # row 0, col 0
+val = dat[0, 'name']    # row 0, column named 'name'
+val = dat['key', 1]     # row named 'key', col 1
+
+# Write cell
+dat[0, 0] = 'value'
+
+# Read row/col
+row = dat.row(0)         # list of Cell objects
+col = dat.col('name')    # list of Cell objects
+
+# Dimensions
+rows = dat.numRows
+cols = dat.numCols
+
+# Append row
+dat.appendRow(['col1_val', 'col2_val', 'col3_val'])
+
+# Clear
+dat.clear()
+
+# Set entire table
+dat.clear()
+dat.appendRow(['name', 'value', 'type'])
+dat.appendRow(['frequency', '440', 'float'])
+dat.appendRow(['amplitude', '0.8', 'float'])
+```
+
+## Time and Animation
+
+```python
+# Global time
+td.absTime.frame       # absolute frame number (never resets)
+td.absTime.seconds     # absolute seconds
+
+# Timeline time (affected by play/pause/loop)
+me.time.frame          # current frame on timeline
+me.time.seconds        # current seconds on timeline
+me.time.rate           # FPS setting
+
+# Timeline control (via execute_python_script)
+project.play = True
+project.play = False
+project.frameRange = (1, 300)   # set timeline range
+
+# Cook frame (when operator was last computed)
+node.cookFrame
+node.cookTime
+```
+
+## Extensions (Custom Python Classes on Components)
+
+Extensions add custom Python methods and attributes to COMPs.
+
+```python
+# Create extension on a Base COMP
+base = op('/project1/myBase')
+
+# The extension class is defined in a Text DAT inside the COMP
+# Typically named 'ExtClass' with the extension code:
+
+extension_code = '''
+class MyExtension:
+    def __init__(self, ownerComp):
+        self.ownerComp = ownerComp
+        self.counter = 0
+
+    def Reset(self):
+        self.counter = 0
+
+    def Increment(self):
+        self.counter += 1
+        return self.counter
+
+    @property
+    def Count(self):
+        return self.counter
+'''
+
+# Write extension code to DAT inside the COMP
+op('/project1/myBase/extClass').text = extension_code
+
+# Configure the extension on the COMP
+base.par.extension1 = 'extClass'  # name of the DAT
+base.par.promoteextension1 = True  # promote methods to parent
+
+# Call extension methods
+base.Increment()       # calls MyExtension.Increment()
+count = base.Count     # accesses MyExtension.Count property
+base.Reset()
+```
+
+## Useful Built-in Modules
+
+### tdu — TouchDesigner Utilities
+
+```python
+import tdu
+
+# Dependency tracking (reactive values)
+dep = tdu.Dependency(initial_value)
+dep.val = new_value   # triggers dependents to recook
+
+# File path utilities
+tdu.expandPath('$HOME/Desktop/output.mov')
+
+# Math
+tdu.clamp(value, min, max)
+tdu.remap(value, from_min, from_max, to_min, to_max)
+```
+
+### TDFunctions
+
+```python
+from TDFunctions import *
+
+# Commonly used utilities
+clamp(value, low, high)
+remap(value, inLow, inHigh, outLow, outHigh)
+interp(value1, value2, t)  # linear interpolation
+```
+
+### TDStoreTools — Persistent Storage
+
+```python
+from TDStoreTools import StorageManager
+
+# Store data that survives project reload
+me.store('myKey', 'myValue')
+val = me.fetch('myKey', default='fallback')
+
+# Storage dict
+me.storage['key'] = value
+```
+
+## Common Patterns via execute_python_script
+
+### Build a complete chain
+
+```python
+# Create a complete audio-reactive noise chain
+parent = op('/project1')
+
+# Create operators
+audio_in = parent.create(audiofileinChop, 'audio_in')
+spectrum = parent.create(audiospectrumChop, 'spectrum')
+chop_to_top = parent.create(choptopTop, 'chop_to_top')
+noise = parent.create(noiseTop, 'noise1')
+level = parent.create(levelTop, 'level1')
+null_out = parent.create(nullTop, 'out')
+
+# Wire the chain
+audio_in.outputConnectors[0].connect(spectrum)
+spectrum.outputConnectors[0].connect(chop_to_top)
+noise.outputConnectors[0].connect(level)
+level.outputConnectors[0].connect(null_out)
+
+# Set parameters
+audio_in.par.file = '/path/to/music.wav'
+audio_in.par.play = True
+spectrum.par.size = 512
+noise.par.type = 1  # Sparse
+noise.par.monochrome = False
+noise.par.resolutionw = 1920
+noise.par.resolutionh = 1080
+level.par.opacity = 0.8
+level.par.gamma1 = 0.7
+```
+
+### Query network state
+
+```python
+# Get all TOPs in the project
+tops = [c for c in op('/project1').findChildren(type=TOP)]
+for t in tops:
+    print(f"{t.path}: {t.OPType} {'ERROR' if t.errors() else 'OK'}")
+
+# Find all operators with errors
+def find_errors(parent_path='/project1'):
+    parent = op(parent_path)
+    errors = []
+    for child in parent.findChildren(depth=-1):
+        if child.errors():
+            errors.append((child.path, child.errors()))
+    return errors
+
+result = find_errors()
+```
+
+### Batch parameter changes
+
+```python
+# Set parameters on multiple nodes at once
+settings = {
+    '/project1/noise1': {'seed': 42, 'monochrome': False, 'resolutionw': 1920},
+    '/project1/level1': {'brightness1': 1.2, 'gamma1': 0.8},
+    '/project1/blur1': {'sizex': 5, 'sizey': 5},
+}
+
+for path, params in settings.items():
+    node = op(path)
+    if node:
+        for key, val in params.items():
+            setattr(node.par, key, val)
+```
+
+## Python Version and Packages
+
+TouchDesigner bundles Python 3.11+ with these pre-installed:
+- **numpy** — array operations, fast math
+- **scipy** — signal processing, FFT
+- **OpenCV** (cv2) — computer vision
+- **PIL/Pillow** — image processing
+- **requests** — HTTP client
+- **json**, **re**, **os**, **sys** — standard library
+
+**IMPORTANT:** Parameter names in examples below are illustrative. Always run discovery (SKILL.md Step 0) to get actual names for your TD version. Do NOT copy param names from these examples verbatim.
+
+Custom packages can be installed to TD's Python site-packages directory. See TD documentation for the exact path per platform.
+
+## SOP Vertex/Point Access (TD 2025.32)
+
+In TD 2025.32, `td.Vertex` does NOT have `.x`, `.y`, `.z` attributes. Use index access:
+
+```python
+# WRONG — crashes in TD 2025.32:
+vertex.x, vertex.y, vertex.z
+
+# CORRECT — index/attribute access:
+pt = sop.points()[i]
+pos = pt.P          # Position object
+x, y, z = pos[0], pos[1], pos[2]
+
+# Always introspect first:
+dir(sop.points()[0])   # see what attributes actually exist
+dir(sop.points()[0].P) # see Position object interface
+```
@@ -0,0 +1,244 @@
+# TouchDesigner Troubleshooting (twozero MCP)
+
+> See `references/pitfalls.md` for the comprehensive lessons-learned list.
+
+## 1. Connection Issues
+
+### Port 40404 not responding
+
+Check these in order:
+
+1. Is TouchDesigner running?
+   ```bash
+   pgrep TouchDesigner
+   ```
+
+1b. Quick hub health check (no JSON-RPC needed):
+   A plain GET to the MCP URL returns instance info:
+   ```
+   curl -s http://localhost:40404/mcp
+   ```
+   Returns: `{"hub": true, "pid": ..., "instances": {"127.0.0.1_PID": {"project": "...", "tdVersion": "...", ...}}}`
+   If this returns JSON but `instances` is empty, TD is running but twozero hasn't registered yet.
+
+2. Is twozero installed in TD?
+   Open TD Palette Browser > twozero should be listed. If not, install it.
+
+3. Is MCP enabled in twozero settings?
+   In TD, open twozero preferences and confirm MCP server is toggled ON.
+
+4. Test the port directly:
+   ```bash
+   nc -z 127.0.0.1 40404
+   ```
+
+5. Test the MCP endpoint:
+   ```bash
+   curl -s http://localhost:40404/mcp
+   ```
+   Should return JSON with hub info. If it does, the server is running.
+
+### Hub responds but no TD instances
+
+The twozero MCP hub is running but TD hasn't registered. Causes:
+- TD project not loaded yet (still on splash screen)
+- twozero COMP not initialized in the current project
+- twozero version mismatch
+
+Fix: Open/reload a TD project that contains the twozero COMP. Use td_list_instances
+to check which TD instances are registered.
+
+### Multi-instance setup
+
+twozero auto-assigns ports for multiple TD instances:
+- First instance: 40404
+- Second instance: 40405
+- Third instance: 40406
+- etc.
+
+Use `td_list_instances` to discover all running instances and their ports.
+
+## 2. MCP Tool Errors
+
+### td_execute_python returns error
+
+The error message from td_execute_python often contains the Python traceback.
+If it's unclear, use `td_read_textport` to see the full TD console output —
+Python exceptions are always printed there.
+
+Common causes:
+- Syntax error in the script
+- Referencing a node that doesn't exist (op() returns None, then you call .par on None)
+- Using wrong parameter names (see pitfalls.md)
+
+### td_set_operator_pars fails
+
+Parameter name mismatch is the #1 cause. The tool validates param names and
+returns clear errors, but you must use exact names.
+
+Fix: ALWAYS call `td_get_par_info` first to discover the real parameter names:
+```
+td_get_par_info(op_type='glslTOP')
+td_get_par_info(op_type='noiseTOP')
+```
+
+### td_create_operator type name errors
+
+Operator type names use camelCase with family suffix:
+- CORRECT: noiseTOP, glslTOP, levelTOP, compositeTOP, audiospectrumCHOP
+- WRONG:   NoiseTOP, noise_top, NOISE TOP, Noise
+
+### td_get_operator_info for deep inspection
+
+If unsure about any aspect of an operator (params, inputs, outputs, state):
+```
+td_get_operator_info(path='/project1/noise1', detail='full')
+```
+
+## 3. Parameter Discovery
+
+CRITICAL: ALWAYS use td_get_par_info to discover parameter names.
+
+The agent's LLM training data contains WRONG parameter names for TouchDesigner.
+Do not trust them. Known wrong names include dat vs pixeldat, colora vs alpha,
+sizex vs size, and many more. See pitfalls.md for the full list.
+
+Workflow:
+1. td_get_par_info(op_type='glslTOP') — get all params for a type
+2. td_get_operator_info(path='/project1/mynode', detail='full') — get params for a specific instance
+3. Use ONLY the names returned by these tools
+
+## 4. Performance
+
+### Diagnosing slow performance
+
+Use `td_get_perf` to see which operators are slow. Look at cook times —
+anything over 1ms per frame is worth investigating.
+
+Common causes:
+- Resolution too high (especially on Non-Commercial)
+- Complex GLSL shaders
+- Too many TOP-to-CHOP or CHOP-to-TOP transfers (GPU-CPU memory copies)
+- Feedback loops without decay (values accumulate, memory grows)
+
+### Non-Commercial license restrictions
+
+- Resolution cap: 1280x1280. Setting resolutionw=1920 silently clamps to 1280.
+- H.264/H.265/AV1 encoding requires Commercial license. Use ProRes or Hap instead.
+- No commercial use of output.
+
+Always check effective resolution after creation:
+```python
+n.cook(force=True)
+actual = str(n.width) + 'x' + str(n.height)
+```
+
+## 5. Hermes Configuration
+
+### Config location
+
+`$HERMES_HOME/config.yaml` (defaults to `~/.hermes/config.yaml` when `HERMES_HOME` is unset)
+
+### MCP entry format
+
+The twozero TD entry should look like:
+```yaml
+mcpServers:
+  twozero_td:
+    url: http://localhost:40404/mcp
+```
+
+### After config changes
+
+Restart the Hermes session for changes to take effect. The MCP connection is
+established at session startup.
+
+### Verifying MCP tools are available
+
+After restarting, the session log should show twozero MCP tools registered.
+If tools show as registered but aren't callable, check:
+- The twozero MCP hub is still running (curl test above)
+- TD is still running with a project loaded
+- No firewall blocking localhost:40404
+
+## 6. Node Creation Issues
+
+### "Node type not found" error
+
+Wrong type string. Use camelCase with family suffix:
+- Wrong: NoiseTop, noise_top, NOISE TOP
+- Right: noiseTOP
+
+### Node created but not visible
+
+Check parentPath — use absolute paths like /project1. The default project
+root is /project1. System nodes live at /, /ui, /sys, /local, /perform.
+Don't create user nodes outside /project1.
+
+### Cannot create node inside a non-COMP
+
+Only COMP operators (Container, Base, Geometry, etc.) can contain children.
+You cannot create nodes inside a TOP, CHOP, SOP, DAT, or MAT.
+
+## 7. Wiring Issues
+
+### Cross-family wiring
+
+TOPs connect to TOPs, CHOPs to CHOPs, SOPs to SOPs, DATs to DATs.
+Use converter operators to bridge: choptoTOP, topToCHOP, soptoDAT, etc.
+
+Note: choptoTOP has NO input connectors. Use par.chop reference instead:
+```python
+spec_tex.par.chop = resample_node  # correct
+# NOT: resample.outputConnectors[0].connect(spec_tex.inputConnectors[0])
+```
+
+### Feedback loops
+
+Never create A -> B -> A directly. Use a Feedback TOP:
+```python
+fb = root.create(feedbackTOP, 'fb')
+fb.par.top = comp.path          # reference only, no wire to fb input
+fb.outputConnectors[0].connect(next_node)
+```
+"Cook dependency loop detected" warning on the chain is expected and correct.
+
+## 8. GLSL Issues
+
+### Shader compilation errors are silent
+
+GLSL TOP shows a yellow warning in the UI but node.errors() may return empty.
+Check node.warnings() too. Create an Info DAT pointed at the GLSL TOP for
+full compiler output.
+
+### TD GLSL specifics
+
+- Uses GLSL 4.60 (Vulkan backend). GLSL 3.30 and earlier removed.
+- UV coordinates: vUV.st (not gl_FragCoord)
+- Input textures: sTD2DInputs[0]
+- Output: layout(location = 0) out vec4 fragColor
+- macOS CRITICAL: Always wrap output with TDOutputSwizzle(color)
+- No built-in time uniform. Pass time via GLSL TOP Values page or Constant TOP.
+
+## 9. Recording Issues
+
+### H.264/H.265/AV1 requires Commercial license
+
+Use Apple ProRes on macOS (hardware accelerated, not license-restricted):
+```python
+rec.par.videocodec = 'prores'  # Preferred on macOS — lossless, Non-Commercial OK
+# rec.par.videocodec = 'mjpa'  # Fallback — lossy, works everywhere
+```
+
+### MovieFileOut has no .record() method
+
+Use the toggle parameter:
+```python
+rec.par.record = True   # start
+rec.par.record = False  # stop
+```
+
+### All exported frames identical
+
+TOP.save() captures same frame when called rapidly. Use MovieFileOut for
+real-time recording. Set project.realTime = False for frame-accurate output.
@@ -0,0 +1,115 @@
+#!/usr/bin/env bash
+# setup.sh — Automated setup for twozero MCP plugin for TouchDesigner
+# Idempotent: safe to run multiple times.
+set -euo pipefail
+
+GREEN='\033[0;32m'; RED='\033[0;31m'; YELLOW='\033[1;33m'; CYAN='\033[0;36m'; NC='\033[0m'
+OK="${GREEN}✔${NC}"; FAIL="${RED}✘${NC}"; WARN="${YELLOW}⚠${NC}"
+
+TWOZERO_URL="https://www.404zero.com/pisang/twozero.tox"
+TOX_PATH="$HOME/Downloads/twozero.tox"
+HERMES_HOME_DIR="${HERMES_HOME:-$HOME/.hermes}"
+HERMES_CFG="${HERMES_HOME_DIR}/config.yaml"
+MCP_PORT=40404
+MCP_ENDPOINT="http://localhost:${MCP_PORT}/mcp"
+
+manual_steps=()
+
+echo -e "\n${CYAN}═══ twozero MCP for TouchDesigner — Setup ═══${NC}\n"
+
+# ── 1. Check if TouchDesigner is running ──
+# Match on process *name* (not full cmdline) to avoid self-matching shells
+# that happen to have "TouchDesigner" in their args. macOS and Linux pgrep
+# both support -x for exact name match.
+if pgrep -x TouchDesigner >/dev/null 2>&1 || pgrep -x TouchDesignerFTE >/dev/null 2>&1; then
+    echo -e " ${OK} TouchDesigner is running"
+    td_running=true
+else
+    echo -e " ${WARN} TouchDesigner is not running"
+    td_running=false
+fi
+
+# ── 2. Ensure twozero.tox exists ──
+if [[ -f "$TOX_PATH" ]]; then
+    echo -e " ${OK} twozero.tox already exists at ${TOX_PATH}"
+else
+    echo -e " ${WARN} twozero.tox not found — downloading..."
+    if curl -fSL -o "$TOX_PATH" "$TWOZERO_URL" 2>/dev/null; then
+        echo -e " ${OK} Downloaded twozero.tox to ${TOX_PATH}"
+    else
+        echo -e " ${FAIL} Failed to download twozero.tox from ${TWOZERO_URL}"
+        echo "       Please download manually and place at ${TOX_PATH}"
+        manual_steps+=("Download twozero.tox from ${TWOZERO_URL} to ${TOX_PATH}")
+    fi
+fi
+
+# ── 3. Ensure Hermes config has twozero_td MCP entry ──
+if [[ ! -f "$HERMES_CFG" ]]; then
+    echo -e " ${FAIL} Hermes config not found at ${HERMES_CFG}"
+    manual_steps+=("Create ${HERMES_CFG} with twozero_td MCP server entry")
+elif grep -q 'twozero_td' "$HERMES_CFG" 2>/dev/null; then
+    echo -e " ${OK} twozero_td MCP entry exists in Hermes config"
+else
+    echo -e " ${WARN} Adding twozero_td MCP entry to Hermes config..."
+    python3 -c "
+import yaml, sys, copy
+
+cfg_path = '$HERMES_CFG'
+with open(cfg_path, 'r') as f:
+    cfg = yaml.safe_load(f) or {}
+
+if 'mcp_servers' not in cfg:
+    cfg['mcp_servers'] = {}
+
+if 'twozero_td' not in cfg['mcp_servers']:
+    cfg['mcp_servers']['twozero_td'] = {
+        'url': '${MCP_ENDPOINT}',
+        'timeout': 120,
+        'connect_timeout': 60
+    }
+    with open(cfg_path, 'w') as f:
+        yaml.dump(cfg, f, default_flow_style=False, sort_keys=False)
+" 2>/dev/null && echo -e " ${OK} twozero_td MCP entry added to config" \
+              || { echo -e " ${FAIL} Could not update config (is PyYAML installed?)"; \
+                   manual_steps+=("Add twozero_td MCP entry to ${HERMES_CFG} manually"); }
+    manual_steps+=("Restart Hermes session to pick up config change")
+fi
+
+# ── 4. Test if MCP port is responding ──
+if nc -z 127.0.0.1 "$MCP_PORT" 2>/dev/null; then
+    echo -e " ${OK} Port ${MCP_PORT} is open"
+
+    # ── 5. Verify MCP endpoint responds ──
+    resp=$(curl -s --max-time 3 "$MCP_ENDPOINT" 2>/dev/null || true)
+    if [[ -n "$resp" ]]; then
+        echo -e " ${OK} MCP endpoint responded at ${MCP_ENDPOINT}"
+    else
+        echo -e " ${WARN} Port open but MCP endpoint returned empty response"
+        manual_steps+=("Verify MCP is enabled in twozero settings")
+    fi
+else
+    echo -e " ${WARN} Port ${MCP_PORT} is not open"
+    if [[ "$td_running" == true ]]; then
+        manual_steps+=("In TD: drag twozero.tox into network editor → click Install")
+        manual_steps+=("Enable MCP: twozero icon → Settings → mcp → 'auto start MCP' → Yes")
+    else
+        manual_steps+=("Launch TouchDesigner")
+        manual_steps+=("Drag twozero.tox into the TD network editor and click Install")
+        manual_steps+=("Enable MCP: twozero icon → Settings → mcp → 'auto start MCP' → Yes")
+    fi
+fi
+
+# ── Status Report ──
+echo -e "\n${CYAN}═══ Status Report ═══${NC}\n"
+
+if [[ ${#manual_steps[@]} -eq 0 ]]; then
+    echo -e " ${OK} ${GREEN}Fully configured! twozero MCP is ready to use.${NC}\n"
+    exit 0
+else
+    echo -e " ${WARN} ${YELLOW}Manual steps remaining:${NC}\n"
+    for i in "${!manual_steps[@]}"; do
+        echo -e "   $((i+1)). ${manual_steps[$i]}"
+    done
+    echo ""
+    exit 1
+fi
@@ -7,7 +7,7 @@ license: MIT
 metadata:
  hermes:
    tags: [telephony, phone, sms, mms, voice, twilio, bland.ai, vapi, calling, texting]
-    related_skills: [find-nearby, google-workspace, agentmail]
+    related_skills: [maps, google-workspace, agentmail]
    category: productivity
 ---

@@ -4,7 +4,7 @@
 Add a first-class `gemini` provider that authenticates via Google OAuth, using the standard Gemini API (not Cloud Code Assist). Users who have a Google AI subscription or Gemini API access can authenticate through the browser without needing to manually copy API keys.

 ## Architecture Decision
- **Path A (chosen):** Standard Gemini API at `generativelanguage.googleapis.com/v1beta/openai/`
+- **Path A (chosen):** Standard Gemini API at `generativelanguage.googleapis.com/v1beta`
 - **NOT Path B:** Cloud Code Assist (`cloudcode-pa.googleapis.com`) — rate-limited free tier, internal API, account ban risk
 - Standard `chat_completions` api_mode via OpenAI SDK — no new api_mode needed
 - Our own OAuth credentials — NOT sharing tokens with Gemini CLI
@@ -32,9 +32,9 @@ Add a first-class `gemini` provider that authenticates via Google OAuth, using t
 - File locking for concurrent access (multiple agent sessions)

 ## API Integration
- Base URL: `https://generativelanguage.googleapis.com/v1beta/openai/`
- Auth: `Authorization: Bearer <access_token>` (passed as `api_key` to OpenAI SDK)
- api_mode: `chat_completions` (standard)
+- Base URL: `https://generativelanguage.googleapis.com/v1beta`
+- Auth: native Gemini API authentication handled by the provider adapter
+- api_mode: `chat_completions` (standard facade over native transport)
 - Models: gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash, etc.

 ## Files to Create/Modify
--- a/Show More
+++ b/Show More