Merge PR #425: feat(#417): add pokemon-player skill

Authored by teyrebaz33. Closes #417. Adds pokemon-player skill for playing Pokemon via headless emulation using the pokemon-agent package (NousResearch/pokemon-agent).
feat: find-nearby skill and Telegram location support
2026-03-09 05:42:58 -07:00 · 2026-03-09 05:31:10 -07:00 · 2026-03-09 05:08:01 -07:00 · 2026-03-09 05:07:53 -07:00 · 2026-03-09 04:58:27 -07:00 · 2026-03-09 04:58:20 -07:00
294 changed files with 10058 additions and 2525 deletions
--- a/.env.example
+++ b/.env.example
@@ -53,10 +53,6 @@ MINIMAX_CN_API_KEY=
 # Get at: https://firecrawl.dev/
 FIRECRAWL_API_KEY=

-# Nous Research API Key - Vision analysis and multi-model reasoning
-# Get at: https://inference-api.nousresearch.com/
-NOUS_API_KEY=
-
 # FAL.ai API Key - Image generation
 # Get at: https://fal.ai/
 FAL_KEY=
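
With `NOUS_API_KEY` dropped from `.env.example`, it may be useful to check which of the remaining keys are still blank after copying the template to `~/.hermes/.env`. The sketch below is illustrative only and is not part of this change. `find_empty_keys` is a hypothetical helper, the `abc123` value is a placeholder, and the parser assumes plain `KEY=value` lines with `#` comments, as in the hunk above.

```python
def find_empty_keys(env_text: str) -> list[str]:
    """Return the names of KEY=value entries whose value is blank.

    Hypothetical helper for illustration; not part of hermes-agent.
    """
    empty = []
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        # Ignore lines that are not KEY=value assignments.
        if not sep:
            continue
        if not value.strip():
            empty.append(key.strip())
    return empty


# Sample text modeled on the .env.example hunk; the FAL_KEY value is made up.
sample = """\
# Get at: https://firecrawl.dev/
FIRECRAWL_API_KEY=

# FAL.ai API Key - Image generation
FAL_KEY=abc123
"""

print(find_empty_keys(sample))
```

Keys listed by the helper are present in the file but unset. A key that is missing from the file entirely is not reported.
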
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,80 +1,60 @@
 # Hermes Agent - Development Guide

-Instructions for AI coding assistants (GitHub Copilot, Cursor, etc.) and human developers.
-
-Hermes Agent is an AI agent harness with tool-calling capabilities, interactive CLI, messaging integrations, and scheduled tasks.
+Instructions for AI coding assistants and developers working on the hermes-agent codebase.

 ## Development Environment

-**IMPORTANT**: Always use the virtual environment if it exists:
 ```bash
-source venv/bin/activate  # Before running any Python commands
+source .venv/bin/activate  # ALWAYS activate before running Python
 ```

 ## Project Structure

 ```
 hermes-agent/
-├── agent/                # Agent internals (extracted from run_agent.py)
-│   ├── auxiliary_client.py   # Shared auxiliary OpenAI client (vision, compression, web extract)
-│   ├── model_metadata.py     # Model context lengths, token estimation
+├── run_agent.py          # AIAgent class — core conversation loop
+├── model_tools.py        # Tool orchestration, _discover_tools(), handle_function_call()
+├── toolsets.py           # Toolset definitions, _HERMES_CORE_TOOLS list
+├── cli.py                # HermesCLI class — interactive CLI orchestrator
+├── hermes_state.py       # SessionDB — SQLite session store (FTS5 search)
+├── agent/                # Agent internals
+│   ├── prompt_builder.py     # System prompt assembly
 │   ├── context_compressor.py # Auto context compression
 │   ├── prompt_caching.py     # Anthropic prompt caching
-│   ├── prompt_builder.py     # System prompt assembly (identity, skills index, context files)
+│   ├── auxiliary_client.py   # Auxiliary LLM client (vision, summarization)
+│   ├── model_metadata.py     # Model context lengths, token estimation
 │   ├── display.py            # KawaiiSpinner, tool preview formatting
+│   ├── skill_commands.py     # Skill slash commands (shared CLI/gateway)
 │   └── trajectory.py         # Trajectory saving helpers
-├── hermes_cli/           # CLI implementation
-│   ├── main.py           # Entry point, command dispatcher
-│   ├── banner.py         # Welcome banner, ASCII art, skills summary
-│   ├── commands.py       # Slash command definitions + autocomplete
-│   ├── callbacks.py      # Interactive prompt callbacks (clarify, sudo, approval)
-│   ├── setup.py          # Interactive setup wizard
-│   ├── config.py         # Config management & migration
-│   ├── status.py         # Status display
-│   ├── doctor.py         # Diagnostics
-│   ├── gateway.py        # Gateway management
-│   ├── uninstall.py      # Uninstaller
-│   ├── cron.py           # Cron job management
-│   └── skills_hub.py     # Skills Hub CLI + /skills slash command
-├── tools/                # Tool implementations
-│   ├── registry.py            # Central tool registry (schemas, handlers, dispatch)
-│   ├── approval.py            # Dangerous command detection + per-session approval
-│   ├── environments/          # Terminal execution backends
-│   │   ├── base.py            # BaseEnvironment ABC
-│   │   ├── local.py           # Local execution with interrupt support
-│   │   ├── docker.py          # Docker container execution
-│   │   ├── ssh.py             # SSH remote execution
-│   │   ├── singularity.py     # Singularity/Apptainer + SIF management
-│   │   ├── modal.py           # Modal cloud execution
-│   │   └── daytona.py         # Daytona cloud sandboxes
-│   ├── terminal_tool.py       # Terminal orchestration (sudo, lifecycle, factory)
-│   ├── todo_tool.py           # Planning & task management
-│   ├── process_registry.py    # Background process management
-│   └── ...                    # Other tool files
-├── gateway/              # Messaging platform adapters
-│   ├── platforms/        # Platform-specific adapters (telegram, discord, slack, whatsapp)
-│   └── ...
-├── cron/                 # Scheduler implementation
-├── environments/         # RL training environments (Atropos integration)
-├── skills/               # Bundled skill sources
-├── optional-skills/      # Official optional skills (not activated by default)
-├── cli.py                # Interactive CLI orchestrator (HermesCLI class)
-├── hermes_state.py       # SessionDB — SQLite session store (schema, titles, FTS5 search)
-├── run_agent.py          # AIAgent class (core conversation loop)
-├── model_tools.py        # Tool orchestration (thin layer over tools/registry.py)
-├── toolsets.py           # Tool groupings
-├── toolset_distributions.py  # Probability-based tool selection
+├── hermes_cli/           # CLI subcommands and setup
+│   ├── main.py           # Entry point — all `hermes` subcommands
+│   ├── config.py         # DEFAULT_CONFIG, OPTIONAL_ENV_VARS, migration
+│   ├── commands.py       # Slash command definitions + SlashCommandCompleter
+│   ├── callbacks.py      # Terminal callbacks (clarify, sudo, approval)
+│   └── setup.py          # Interactive setup wizard
+├── tools/                # Tool implementations (one file per tool)
+│   ├── registry.py       # Central tool registry (schemas, handlers, dispatch)
+│   ├── approval.py       # Dangerous command detection
+│   ├── terminal_tool.py  # Terminal orchestration
+│   ├── process_registry.py # Background process management
+│   ├── file_tools.py     # File read/write/search/patch
+│   ├── web_tools.py      # Firecrawl search/extract
+│   ├── browser_tool.py   # Browserbase browser automation
+│   ├── code_execution_tool.py # execute_code sandbox
+│   ├── delegate_tool.py  # Subagent delegation
+│   ├── mcp_tool.py       # MCP client (~1050 lines)
+│   └── environments/     # Terminal backends (local, docker, ssh, modal, daytona, singularity)
+├── gateway/              # Messaging platform gateway
+│   ├── run.py            # Main loop, slash commands, message dispatch
+│   ├── session.py        # SessionStore — conversation persistence
+│   └── platforms/        # Adapters: telegram, discord, slack, whatsapp, homeassistant, signal
+├── cron/                 # Scheduler (jobs.py, scheduler.py)
+├── environments/         # RL training environments (Atropos)
+├── tests/                # Pytest suite (~2500+ tests)
 └── batch_runner.py       # Parallel batch processing
 ```

-**User Configuration** (stored in `~/.hermes/`):
- `~/.hermes/config.yaml` - Settings (model, terminal, toolsets, etc.)
- `~/.hermes/.env` - API keys and secrets
- `~/.hermes/pairing/` - DM pairing data
- `~/.hermes/hooks/` - Custom event hooks
- `~/.hermes/image_cache/` - Cached user images
- `~/.hermes/audio_cache/` - Cached user voice messages
- `~/.hermes/sticker_cache.json` - Telegram sticker descriptions
+**User config:** `~/.hermes/config.yaml` (settings), `~/.hermes/.env` (API keys)

 ## File Dependency Chain

@@ -88,698 +68,175 @@ model_tools.py  (imports tools/registry + triggers tool discovery)
 run_agent.py, cli.py, batch_runner.py, environments/
 ```

-Each tool file co-locates its schema, handler, and registration. `model_tools.py` is a thin orchestration layer.
-
 ---

-## AIAgent Class
-
-The main agent is implemented in `run_agent.py`:
+## AIAgent Class (run_agent.py)

 ```python
 class AIAgent:
-    def __init__(
-        self,
-        model: str = "anthropic/claude-sonnet-4.6",
-        api_key: str = None,
-        base_url: str = "https://openrouter.ai/api/v1",
-        max_iterations: int = 60,        # Max tool-calling loops
+    def __init__(self,
+        model: str = "anthropic/claude-opus-4.6",
+        max_iterations: int = 90,
        enabled_toolsets: list = None,
        disabled_toolsets: list = None,
-        verbose_logging: bool = False,
-        quiet_mode: bool = False,         # Suppress progress output
-        tool_progress_callback: callable = None,  # Called on each tool use
-    ):
-        # Initialize OpenAI client, load tools based on toolsets
-        ...
-    
-    def chat(self, user_message: str, task_id: str = None) -> str:
-        # Main entry point - runs the agent loop
-        ...
+        quiet_mode: bool = False,
+        save_trajectories: bool = False,
+        platform: str = None,           # "cli", "telegram", etc.
+        session_id: str = None,
+        skip_context_files: bool = False,
+        skip_memory: bool = False,
+        # ... plus provider, api_mode, callbacks, routing params
+    ): ...
+
+    def chat(self, message: str) -> str:
+        """Simple interface — returns final response string."""
+
+    def run_conversation(self, user_message: str, system_message: str = None,
+                         conversation_history: list = None, task_id: str = None) -> dict:
+        """Full interface — returns dict with final_response + messages."""
 ```

 ### Agent Loop

-The core loop in `_run_agent_loop()`:
-
-```
-1. Add user message to conversation
-2. Call LLM with tools
-3. If LLM returns tool calls:
-   - Execute each tool
-   - Add tool results to conversation
-   - Go to step 2
-4. If LLM returns text response:
-   - Return response to user
-```
+The core loop is inside `run_conversation()` — entirely synchronous:

 ```python
-while turns < max_turns:
-    response = client.chat.completions.create(
-        model=model,
-        messages=messages,
-        tools=tool_schemas,
-    )
-    
+while api_call_count < self.max_iterations and self.iteration_budget.remaining > 0:
+    response = client.chat.completions.create(model=model, messages=messages, tools=tool_schemas)
    if response.tool_calls:
        for tool_call in response.tool_calls:
-            result = await execute_tool(tool_call)
+            result = handle_function_call(tool_call.name, tool_call.args, task_id)
            messages.append(tool_result_message(result))
-        turns += 1
+        api_call_count += 1
    else:
        return response.content
 ```

-### Conversation Management
-
-Messages are stored as a list of dicts following OpenAI format:
-
-```python
-messages = [
-    {"role": "system", "content": "You are a helpful assistant..."},
-    {"role": "user", "content": "Search for Python tutorials"},
-    {"role": "assistant", "content": None, "tool_calls": [...]},
-    {"role": "tool", "tool_call_id": "...", "content": "..."},
-    {"role": "assistant", "content": "Here's what I found..."},
-]
-```
-
-### Reasoning Model Support
-
-For models that support chain-of-thought reasoning:
- Extract `reasoning_content` from API responses
- Store in `assistant_msg["reasoning"]` for trajectory export
- Pass back via `reasoning_content` field on subsequent turns
+Messages follow OpenAI format: `{"role": "system/user/assistant/tool", ...}`. Reasoning content is stored in `assistant_msg["reasoning"]`.

 ---

 ## CLI Architecture (cli.py)

-The interactive CLI uses:
- **Rich** - For the welcome banner and styled panels
- **prompt_toolkit** - For fixed input area with history, `patch_stdout`, slash command autocomplete, and floating completion menus
- **KawaiiSpinner** (in run_agent.py) - Animated kawaii faces during API calls; clean `┊` activity feed for tool execution results
-
-Key components:
- `HermesCLI` class - Main CLI controller with commands and conversation loop
- `SlashCommandCompleter` - Autocomplete dropdown for `/commands` (type `/` to see all)
- `agent/skill_commands.py` - Scans skills and builds invocation messages (shared with gateway)
- `load_cli_config()` - Loads config, sets environment variables for terminal
- `build_welcome_banner()` - Displays ASCII art logo, tools, and skills summary
- `_preload_resumed_session()` - Loads session history early (before banner) for immediate display on resume
- `_display_resumed_history()` - Renders a compact conversation recap in a Rich Panel on session resume
-
-CLI UX notes:
- Thinking spinner (during LLM API call) shows animated kawaii face + verb (`(⌐■_■) deliberating...`)
- When LLM returns tool calls, the spinner clears silently (no "got it!" noise)
- Tool execution results appear as a clean activity feed: `┊ {emoji} {verb} {detail} {duration}`
- "got it!" only appears when the LLM returns a final text response (`⚕ ready`)
- The prompt shows `⚕ ❯` when the agent is working, `❯` when idle
- Pasting 5+ lines auto-saves to `~/.hermes/pastes/` and collapses to a reference
- Multi-line input via Alt+Enter or Ctrl+J
- When resuming a session (`--continue`/`--resume`), a "Previous Conversation" panel shows previous messages before the input prompt (configurable via `display.resume_display`)
- `/commands` - Process user commands like `/help`, `/clear`, `/personality`, etc.
- `/skill-name` - Invoke installed skills directly (e.g., `/axolotl`, `/gif-search`)
-
-CLI uses `quiet_mode=True` when creating AIAgent to suppress verbose logging.
-
-### Skill Slash Commands
-
-Every installed skill in `~/.hermes/skills/` is automatically registered as a slash command.
-The skill name (from frontmatter or folder name) becomes the command: `axolotl` → `/axolotl`.
-
-Implementation (`agent/skill_commands.py`, shared between CLI and gateway):
-1. `scan_skill_commands()` scans all SKILL.md files at startup, filtering out skills incompatible with the current OS platform (via the `platforms` frontmatter field)
-2. `build_skill_invocation_message()` loads the SKILL.md content and builds a user-turn message
-3. The message includes the full skill content, a list of supporting files (not loaded), and the user's instruction
-4. Supporting files can be loaded on demand via the `skill_view` tool
-5. Injected as a **user message** (not system prompt) to preserve prompt caching
+- **Rich** for banner/panels, **prompt_toolkit** for input with autocomplete
+- **KawaiiSpinner** (`agent/display.py`) — animated faces during API calls, `┊` activity feed for tool results
+- `load_cli_config()` in cli.py merges hardcoded defaults + user config YAML
+- `process_command()` is a method on `HermesCLI` (not in commands.py)
+- Skill slash commands: `agent/skill_commands.py` scans `~/.hermes/skills/`, injects as **user message** (not system prompt) to preserve prompt caching

 ### Adding CLI Commands

-1. Add to `COMMANDS` dict with description
-2. Add handler in `process_command()` method
-3. For persistent settings, use `save_config_value()` to update config
-
----
-
-## Hermes CLI Commands
-
-The unified `hermes` command provides all functionality:
-
-| Command | Description |
-|---------|-------------|
-| `hermes` | Interactive chat (default) |
-| `hermes chat -q "..."` | Single query mode |
-| `hermes -c` / `hermes --continue` | Resume the most recent session |
-| `hermes -c "my project"` | Resume a session by name (latest in lineage) |
-| `hermes --resume <session_id>` | Resume a specific session by ID or title |
-| `hermes -w` / `hermes --worktree` | Start in isolated git worktree (for parallel agents) |
-| `hermes setup` | Configure API keys and settings |
-| `hermes config` | View current configuration |
-| `hermes config edit` | Open config in editor |
-| `hermes config set KEY VAL` | Set a specific value |
-| `hermes config check` | Check for missing config |
-| `hermes config migrate` | Prompt for missing config interactively |
-| `hermes status` | Show configuration status |
-| `hermes doctor` | Diagnose issues |
-| `hermes update` | Update to latest (checks for new config) |
-| `hermes uninstall` | Uninstall (can keep configs for reinstall) |
-| `hermes gateway` | Start gateway (messaging + cron scheduler) |
-| `hermes gateway setup` | Configure messaging platforms interactively |
-| `hermes gateway install` | Install gateway as system service |
-| `hermes sessions list` | List past sessions (title, preview, last active) |
-| `hermes sessions rename <id> <title>` | Rename/title a session |
-| `hermes cron list` | View scheduled jobs |
-| `hermes cron status` | Check if cron scheduler is running |
-| `hermes version` | Show version info |
-| `hermes pairing list/approve/revoke` | Manage DM pairing codes |
-
----
-
-## Messaging Gateway
-
-The gateway connects Hermes to Telegram, Discord, Slack, and WhatsApp.
-
-### Setup
-
-The interactive setup wizard handles platform configuration:
-
-```bash
-hermes gateway setup      # Arrow-key menu of all platforms, configure tokens/allowlists/home channels
-```
-
-This is the recommended way to configure messaging. It shows which platforms are already set up, walks through each one interactively, and offers to start/restart the gateway service at the end.
-
-Platforms can also be configured manually in `~/.hermes/.env`:
-
-### Configuration (in `~/.hermes/.env`):
-
-```bash
-# Telegram
-TELEGRAM_BOT_TOKEN=123456:ABC-DEF...      # From @BotFather
-TELEGRAM_ALLOWED_USERS=123456789,987654   # Comma-separated user IDs (from @userinfobot)
-
-# Discord  
-DISCORD_BOT_TOKEN=MTIz...                 # From Developer Portal
-DISCORD_ALLOWED_USERS=123456789012345678  # Comma-separated user IDs
-
-# Agent Behavior
-HERMES_MAX_ITERATIONS=60                  # Max tool-calling iterations
-MESSAGING_CWD=/home/myuser                # Terminal working directory for messaging
-
-# Tool progress is configured in config.yaml (display.tool_progress: off|new|all|verbose)
-```
-
-### Working Directory Behavior
-
- **CLI (`hermes` command)**: Uses current directory (`.` → `os.getcwd()`)
- **Messaging (Telegram/Discord)**: Uses `MESSAGING_CWD` (default: home directory)
-
-This is intentional: CLI users are in a terminal and expect the agent to work in their current directory, while messaging users need a consistent starting location.
-
-### Security (User Allowlists):
-
-**IMPORTANT**: By default, the gateway denies all users who are not in an allowlist or paired via DM.
-
-The gateway checks `{PLATFORM}_ALLOWED_USERS` environment variables:
- If set: Only listed user IDs can interact with the bot
- If unset: All users are denied unless `GATEWAY_ALLOW_ALL_USERS=true` is set
-
-Users can find their IDs:
- **Telegram**: Message [@userinfobot](https://t.me/userinfobot)
- **Discord**: Enable Developer Mode, right-click name → Copy ID
-
-### DM Pairing System
-
-Instead of static allowlists, users can pair via one-time codes:
-1. Unknown user DMs the bot → receives pairing code
-2. Owner runs `hermes pairing approve <platform> <code>`
-3. User is permanently authorized
-
-Security: 8-char codes, 1-hour expiry, rate-limited (1/10min/user), max 3 pending per platform, lockout after 5 failed attempts, `chmod 0600` on data files.
-
-Files: `gateway/pairing.py`, `hermes_cli/pairing.py`
-
-### Event Hooks
-
-Hooks fire at lifecycle points. Place hook directories in `~/.hermes/hooks/`:
-
-```
-~/.hermes/hooks/my-hook/
-├── HOOK.yaml    # name, description, events list
-└── handler.py   # async def handle(event_type, context): ...
-```
-
-Events: `gateway:startup`, `session:start`, `session:reset`, `agent:start`, `agent:step`, `agent:end`, `command:*`
-
-The `agent:step` event fires each iteration of the tool-calling loop with tool names and results.
-
-Files: `gateway/hooks.py`
-
-### Tool Progress Notifications
-
-When `tool_progress` is enabled in `config.yaml`, the bot sends status messages as it works:
- `💻 \`ls -la\`...` (terminal commands show the actual command)
- `🔍 web_search...`
- `📄 web_extract...`
- `🐍 execute_code...` (programmatic tool calling sandbox)
- `🔀 delegate_task...` (subagent delegation)
- `❓ clarify...` (user question, CLI-only)
-
-Modes:
- `new`: Only when switching to a different tool (less spam)
- `all`: Every single tool call
-
-### Typing Indicator
-
-The gateway keeps the "typing..." indicator active throughout processing, refreshing every 4 seconds. This lets users know the bot is working even during long tool-calling sequences.
-
-### Platform Toolsets:
-
-Each platform has a dedicated toolset in `toolsets.py`:
- `hermes-telegram`: Full tools including terminal (with safety checks)
- `hermes-discord`: Full tools including terminal
- `hermes-whatsapp`: Full tools including terminal
-
----
-
-## Configuration System
-
-Configuration files are stored in `~/.hermes/` for easy user access:
- `~/.hermes/config.yaml` - All settings (model, terminal, compression, etc.)
- `~/.hermes/.env` - API keys and secrets
-
-### Adding New Configuration Options
-
-When adding new configuration variables, you MUST follow this process:
-
-#### For config.yaml options:
-
-1. Add to `DEFAULT_CONFIG` in `hermes_cli/config.py`
-2. **CRITICAL**: Bump `_config_version` in `DEFAULT_CONFIG` when adding required fields
-3. This triggers migration prompts for existing users on next `hermes update` or `hermes setup`
-
-Example:
-```python
-DEFAULT_CONFIG = {
-    # ... existing config ...
-    
-    "new_feature": {
-        "enabled": True,
-        "option": "default_value",
-    },
-    
-    # BUMP THIS when adding required fields
-    "_config_version": 2,  # Was 1, now 2
-}
-```
-
-#### For .env variables (API keys/secrets):
-
-1. Add to `REQUIRED_ENV_VARS` or `OPTIONAL_ENV_VARS` in `hermes_cli/config.py`
-2. Include metadata for the migration system:
-
-```python
-OPTIONAL_ENV_VARS = {
-    # ... existing vars ...
-    "NEW_API_KEY": {
-        "description": "What this key is for",
-        "prompt": "Display name in prompts",
-        "url": "https://where-to-get-it.com/",
-        "tools": ["tools_it_enables"],  # What tools need this
-        "password": True,  # Mask input
-    },
-}
-```
-
-#### Update related files:
-
- `hermes_cli/setup.py` - Add prompts in the setup wizard
- `cli-config.yaml.example` - Add example with comments
- Update README.md if user-facing
-
-### Config Version Migration
-
-The system uses `_config_version` to detect outdated configs:
-
-1. `check_for_missing_config()` compares user config to `DEFAULT_CONFIG`
-2. `migrate_config()` interactively prompts for missing values
-3. Called automatically by `hermes update` and optionally by `hermes setup`
-
----
-
-## Environment Variables
-
-API keys are loaded from `~/.hermes/.env`:
- `OPENROUTER_API_KEY` - Main LLM API access (primary provider)
- `FIRECRAWL_API_KEY` - Web search/extract tools
- `FIRECRAWL_API_URL` - Self-hosted Firecrawl endpoint (optional)
- `BROWSERBASE_API_KEY` / `BROWSERBASE_PROJECT_ID` - Browser automation
- `FAL_KEY` - Image generation (FLUX model)
- `NOUS_API_KEY` - Vision and Mixture-of-Agents tools
-
-Terminal tool configuration (in `~/.hermes/config.yaml`):
- `terminal.backend` - Backend: local, docker, singularity, modal, daytona, or ssh
- `terminal.cwd` - Working directory ("." = host CWD for local only; for remote backends set an absolute path inside the target, or omit to use the backend's default)
- `terminal.docker_image` - Image for Docker backend
- `terminal.singularity_image` - Image for Singularity backend
- `terminal.modal_image` - Image for Modal backend
- `terminal.daytona_image` - Image for Daytona backend
- `DAYTONA_API_KEY` - API key for Daytona backend (in .env)
- SSH: `TERMINAL_SSH_HOST`, `TERMINAL_SSH_USER`, `TERMINAL_SSH_KEY` in .env
-
-Agent behavior (in `~/.hermes/.env`):
- `HERMES_MAX_ITERATIONS` - Max tool-calling iterations (default: 60)
- `MESSAGING_CWD` - Working directory for messaging platforms (default: ~)
- `display.tool_progress` in config.yaml - Tool progress: `off`, `new`, `all`, `verbose`
- `OPENAI_API_KEY` - Voice transcription (Whisper STT)
- `SLACK_BOT_TOKEN` / `SLACK_APP_TOKEN` - Slack integration (Socket Mode)
- `SLACK_ALLOWED_USERS` - Comma-separated Slack user IDs
- `HERMES_HUMAN_DELAY_MODE` - Response pacing: off/natural/custom
- `HERMES_HUMAN_DELAY_MIN_MS` / `HERMES_HUMAN_DELAY_MAX_MS` - Custom delay range
-
-### Dangerous Command Approval
-
-The terminal tool includes safety checks for potentially destructive commands (e.g., `rm -rf`, `DROP TABLE`, `chmod 777`, etc.):
-
-**Behavior by Backend:**
- **Docker/Singularity/Modal**: Commands run unrestricted (isolated containers)
- **Local/SSH**: Dangerous commands trigger approval flow
-
-**Approval Flow (CLI):**
-```
-⚠️  Potentially dangerous command detected: recursive delete
-    rm -rf /tmp/test
-
-    [o]nce  |  [s]ession  |  [a]lways  |  [d]eny
-    Choice [o/s/a/D]: 
-```
-
-**Approval Flow (Messaging):**
- Command is blocked with explanation
- Agent explains the command was blocked for safety
- User must add the pattern to their allowlist via `hermes config edit` or run the command directly on their machine
-
-**Configuration:**
- `command_allowlist` in `~/.hermes/config.yaml` stores permanently allowed patterns
- Add patterns via "always" approval or edit directly
-
-**Sudo Handling (Messaging):**
- If sudo fails over messaging, output includes tip to add `SUDO_PASSWORD` to `~/.hermes/.env`
-
----
-
-## Background Process Management
-
-The `process` tool works alongside `terminal` for managing long-running background processes:
-
-**Starting a background process:**
-```python
-terminal(command="pytest -v tests/", background=true)
-# Returns: {"session_id": "proc_abc123", "pid": 12345, ...}
-```
-
-**Managing it with the process tool:**
- `process(action="list")` -- show all running/recent processes
- `process(action="poll", session_id="proc_abc123")` -- check status + new output
- `process(action="log", session_id="proc_abc123")` -- full output with pagination
- `process(action="wait", session_id="proc_abc123", timeout=600)` -- block until done
- `process(action="kill", session_id="proc_abc123")` -- terminate
- `process(action="write", session_id="proc_abc123", data="y")` -- send stdin
- `process(action="submit", session_id="proc_abc123", data="yes")` -- send + Enter
-
-**Key behaviors:**
- Background processes execute through the configured terminal backend (local/Docker/Modal/Daytona/SSH/Singularity) -- never directly on the host unless `TERMINAL_ENV=local`
- The `wait` action blocks the tool call until the process finishes, times out, or is interrupted by a new user message
- PTY mode (`pty=true` on terminal) enables interactive CLI tools (Codex, Claude Code)
- In RL training, background processes are auto-killed when the episode ends (`tool_context.cleanup()`)
- In the gateway, sessions with active background processes are exempt from idle reset
- The process registry checkpoints to `~/.hermes/processes.json` for crash recovery
-
-Files: `tools/process_registry.py` (registry + handler), `tools/terminal_tool.py` (spawn integration)
+1. Add to `COMMANDS` dict in `hermes_cli/commands.py`
+2. Add handler in `HermesCLI.process_command()` in `cli.py`
+3. For persistent settings, use `save_config_value()` in `cli.py`

 ---

 ## Adding New Tools

-Adding a tool requires changes in **2 files** (the tool file and `toolsets.py`):
-
-1. **Create `tools/your_tool.py`** with handler, schema, check function, and registry call:
+Requires changes in **3 files**:

+**1. Create `tools/your_tool.py`:**
 ```python
-# tools/example_tool.py
-import json
-import os
+import json, os
 from tools.registry import registry

-def check_example_requirements() -> bool:
-    """Check if required API keys/dependencies are available."""
+def check_requirements() -> bool:
    return bool(os.getenv("EXAMPLE_API_KEY"))

 def example_tool(param: str, task_id: str = None) -> str:
-    """Execute the tool and return JSON string result."""
-    try:
-        result = {"success": True, "data": "..."}
-        return json.dumps(result, ensure_ascii=False)
-    except Exception as e:
-        return json.dumps({"error": str(e)}, ensure_ascii=False)
-
-EXAMPLE_SCHEMA = {
-    "name": "example_tool",
-    "description": "Does something useful.",
-    "parameters": {
-        "type": "object",
-        "properties": {
-            "param": {"type": "string", "description": "The parameter"}
-        },
-        "required": ["param"]
-    }
-}
+    return json.dumps({"success": True, "data": "..."})

 registry.register(
    name="example_tool",
    toolset="example",
-    schema=EXAMPLE_SCHEMA,
-    handler=lambda args, **kw: example_tool(
-        param=args.get("param", ""), task_id=kw.get("task_id")),
-    check_fn=check_example_requirements,
+    schema={"name": "example_tool", "description": "...", "parameters": {...}},
+    handler=lambda args, **kw: example_tool(param=args.get("param", ""), task_id=kw.get("task_id")),
+    check_fn=check_requirements,
    requires_env=["EXAMPLE_API_KEY"],
 )
 ```

-2. **Add to `toolsets.py`**: Add `"example_tool"` to `_HERMES_CORE_TOOLS` if it should be in all platform toolsets, or create a new toolset entry.
+**2. Add import** in `model_tools.py` `_discover_tools()` list.

-3. **Add discovery import** in `model_tools.py`'s `_discover_tools()` list: `"tools.example_tool"`.
+**3. Add to `toolsets.py`** — either `_HERMES_CORE_TOOLS` (all platforms) or a new toolset.

-That's it. The registry handles schema collection, dispatch, availability checking, and error wrapping automatically. No edits to `TOOLSET_REQUIREMENTS`, `handle_function_call()`, `get_all_tool_names()`, or any other data structure.
+The registry handles schema collection, dispatch, availability checking, and error wrapping. All handlers MUST return a JSON string.

-**Optional:** Add to `OPTIONAL_ENV_VARS` in `hermes_cli/config.py` for the setup wizard, and to `toolset_distributions.py` for batch processing.
-
-**Special case: tools that need agent-level state** (like `todo`, `memory`):
-These are intercepted by `run_agent.py`'s tool dispatch loop *before* `handle_function_call()`. The registry still holds their schemas, but dispatch returns a stub error as a safety fallback. See `todo_tool.py` for the pattern.
-
-All tool handlers MUST return a JSON string. The registry's `dispatch()` wraps all exceptions in `{"error": "..."}` automatically.
-
-### Dynamic Tool Availability
-
-Tools declare their requirements at registration time via `check_fn` and `requires_env`. The registry checks `check_fn()` when building tool definitions -- tools whose check fails are silently excluded.
-
-### Stateful Tools
-
-Tools that maintain state (terminal, browser) require:
- `task_id` parameter for session isolation between concurrent tasks
- `cleanup_*()` function to release resources
- Cleanup is called automatically in run_agent.py after conversation completes
+**Agent-level tools** (todo, memory): intercepted by `run_agent.py` before `handle_function_call()`. See `todo_tool.py` for the pattern.

 ---

-## Trajectory Format
+## Adding Configuration

-Conversations are saved in ShareGPT format for training:
-```json
-{"from": "system", "value": "System prompt with <tools>...</tools>"}
-{"from": "human", "value": "User message"}
-{"from": "gpt", "value": "<think>reasoning</think>\n<tool_call>{...}</tool_call>"}
-{"from": "tool", "value": "<tool_response>{...}</tool_response>"}
-{"from": "gpt", "value": "Final response"}
-```
-
-Tool calls use `<tool_call>` XML tags, responses use `<tool_response>` tags, reasoning uses `<think>` tags.
-
-### Trajectory Export
+### config.yaml options:
+1. Add to `DEFAULT_CONFIG` in `hermes_cli/config.py`
+2. Bump `_config_version` (currently 5) to trigger migration for existing users

+### .env variables:
+1. Add to `OPTIONAL_ENV_VARS` in `hermes_cli/config.py` with metadata:
 ```python
-agent = AIAgent(save_trajectories=True)
-agent.chat("Do something")
-# Saves to trajectories/*.jsonl in ShareGPT format
+"NEW_API_KEY": {
+    "description": "What it's for",
+    "prompt": "Display name",
+    "url": "https://...",
+    "password": True,
+    "category": "tool",  # provider, tool, messaging, setting
+},
 ```

+### Config loaders (two separate systems):
+
+| Loader | Used by | Location |
+|--------|---------|----------|
+| `load_cli_config()` | CLI mode | `cli.py` |
+| `load_config()` | `hermes tools`, `hermes setup` | `hermes_cli/config.py` |
+| Direct YAML load | Gateway | `gateway/run.py` |
+
 ---

-## Batch Processing (batch_runner.py)
+## Important Policies

-For processing multiple prompts:
- Parallel execution with multiprocessing
- Content-based resume for fault tolerance (matches on prompt text, not indices)
- Toolset distributions control probabilistic tool availability per prompt
- Output: `data/<run_name>/trajectories.jsonl` (combined) + individual batch files
+### Prompt Caching Must Not Break

-```bash
-python batch_runner.py \
-    --dataset_file=prompts.jsonl \
-    --batch_size=20 \
-    --num_workers=4 \
-    --run_name=my_run
-```
+Hermes-Agent ensures caching remains valid throughout a conversation. **Do NOT implement changes that would:**
+- Alter past context mid-conversation
+- Change toolsets mid-conversation
+- Reload memories or rebuild system prompts mid-conversation

---
+Cache-breaking forces dramatically higher costs. The ONLY time we alter context is during context compression.

-## Skills System
-
-Skills are on-demand knowledge documents the agent can load. Compatible with the [agentskills.io](https://agentskills.io/specification) open standard.
-
-```
-skills/
-├── mlops/                    # Category folder
-│   ├── axolotl/             # Skill folder
-│   │   ├── SKILL.md         # Main instructions (required)
-│   │   ├── references/      # Additional docs, API specs
-│   │   ├── templates/       # Output formats, configs
-│   │   └── assets/          # Supplementary files (agentskills.io)
-│   └── vllm/
-│       └── SKILL.md
-├── .hub/                    # Skills Hub state (gitignored)
-│   ├── lock.json            # Installed skill provenance
-│   ├── quarantine/          # Pending security review
-│   ├── audit.log            # Security scan history
-│   ├── taps.json            # Custom source repos
-│   └── index-cache/         # Cached remote indexes
-```
-
-**Progressive disclosure** (token-efficient):
-1. `skills_categories()` - List category names (~50 tokens)
-2. `skills_list(category)` - Name + description per skill (~3k tokens)
-3. `skill_view(name)` - Full content + tags + linked files
-
-SKILL.md files use YAML frontmatter (agentskills.io format):
-```yaml
---
-name: skill-name
-description: Brief description for listing
-version: 1.0.0
-platforms: [macos]              # Optional — restrict to specific OS (macos/linux/windows)
-metadata:
-  hermes:
-    tags: [tag1, tag2]
-    related_skills: [other-skill]
---
-# Skill Content...
-```
-
-**Platform filtering** — Skills with a `platforms` field are automatically excluded from the system prompt index, `skills_list()`, and slash commands on incompatible platforms. Skills without the field load everywhere (backward compatible). See `skills/apple/` for macOS-only examples (iMessage, Reminders, Notes, FindMy).
-
-**Skills Hub** — user-driven skill search/install from online registries and official optional skills. Sources: official optional skills (shipped with repo, labeled "official"), GitHub (openai/skills, anthropics/skills, custom taps), ClawHub, Claude marketplace, LobeHub. Not exposed as an agent tool — the model cannot search for or install skills. Users manage skills via `hermes skills browse/search/install` CLI commands or the `/skills` slash command in chat.
-
-Key files:
- `tools/skills_tool.py` — Agent-facing skill list/view (progressive disclosure)
- `tools/skills_guard.py` — Security scanner (regex + LLM audit, trust-aware install policy)
- `tools/skills_hub.py` — Source adapters (OptionalSkillSource, GitHub, ClawHub, Claude marketplace, LobeHub), lock file, auth
- `hermes_cli/skills_hub.py` — CLI subcommands + `/skills` slash command handler
-
---
-
-## Auxiliary Model Configuration
-
-Hermes uses lightweight "auxiliary" models for side tasks that run alongside the main conversation model:
-
-| Task | Tool(s) | Default Model |
-|------|---------|---------------|
-| **Vision analysis** | `vision_analyze`, `browser_vision` | `google/gemini-3-flash-preview` (via OpenRouter) |
-| **Web extraction** | `web_extract`, browser snapshot summarization | `google/gemini-3-flash-preview` (via OpenRouter) |
-| **Context compression** | Auto-compression when approaching context limit | `google/gemini-3-flash-preview` (via OpenRouter) |
-
-By default, these auto-detect the best available provider: OpenRouter → Nous Portal → (text tasks only) custom endpoint → Codex → API-key providers.
-
-### Changing the Vision Model
-
-To use a different model for image analysis (e.g., GPT-4o instead of Gemini Flash), add to `~/.hermes/config.yaml`:
-
-```yaml
-auxiliary:
-  vision:
-    provider: "openrouter"        # or "nous", "main", "auto"
-    model: "openai/gpt-4o"        # any model slug your provider supports
-```
-
-Or set environment variables (in `~/.hermes/.env` or shell):
-
-```bash
-AUXILIARY_VISION_MODEL=openai/gpt-4o
-# Optionally force a specific provider:
-AUXILIARY_VISION_PROVIDER=openrouter
-```
-
-### Changing the Web Extraction Model
-
-```yaml
-auxiliary:
-  web_extract:
-    provider: "auto"
-    model: "google/gemini-2.5-flash"
-```
-
-### Changing the Compression Model
-
-```yaml
-compression:
-  summary_model: "google/gemini-2.5-flash"
-  summary_provider: "auto"          # "auto", "openrouter", "nous", "main"
-```
-
-### Provider Options
-
-| Provider | Description |
-|----------|-------------|
-| `"auto"` | Best available (default). For vision, only tries OpenRouter + Nous. |
-| `"openrouter"` | Force OpenRouter (requires `OPENROUTER_API_KEY`) |
-| `"nous"` | Force Nous Portal (requires `hermes login`) |
-| `"codex"` | Force Codex OAuth (ChatGPT account). Supports vision via gpt-5.3-codex. |
-| `"main"` | Use your custom endpoint (`OPENAI_BASE_URL` + `OPENAI_API_KEY`). Works with OpenAI API, local models, etc. |
-
-**Important:** Vision tasks require a multimodal-capable model. In `auto` mode, OpenRouter, Nous Portal, and Codex OAuth are tried (they all support vision). Setting `provider: "main"` for vision will work only if your endpoint supports multimodal input (e.g. OpenAI with GPT-4o, or a local model with vision).
-
-**Key files:** `agent/auxiliary_client.py` (resolution chain), `tools/vision_tools.py`, `tools/browser_tool.py`, `tools/web_tools.py`
+### Working Directory Behavior
+- **CLI**: Uses current directory (`.` → `os.getcwd()`)
+- **Messaging**: Uses `MESSAGING_CWD` env var (default: home directory)

 ---

 ## Known Pitfalls

 ### DO NOT use `simple_term_menu` for interactive menus
-
-`simple_term_menu` has rendering bugs in tmux, iTerm2, and other non-standard terminals. When the user scrolls with arrow keys, previously highlighted items "ghost" — duplicating upward and corrupting the display. This happens because the library uses ANSI cursor-up codes to redraw in place, and tmux/iTerm miscalculate positions when the menu is near the bottom of the viewport.
-
-**Rule:** All interactive menus in `hermes_cli/` must use `curses` (Python stdlib) instead. See `tools_config.py` for the pattern — both `_prompt_choice()` (single-select) and `_prompt_toolset_checklist()` (multi-select with space toggle) use `curses.wrapper()`. The numbered-input fallback handles Windows where curses isn't available.
+Rendering bugs in tmux/iTerm2 — ghosting on scroll. Use `curses` (stdlib) instead. See `hermes_cli/tools_config.py` for the pattern.

 ### DO NOT use `\033[K` (ANSI erase-to-EOL) in spinner/display code
-
-The ANSI escape `\033[K` leaks as literal `?[K` text when `prompt_toolkit`'s `patch_stdout` is active. Use space-padding instead to clear lines: `f"\r{line}{' ' * pad}"`. See `agent/display.py` `KawaiiSpinner`.
+Leaks as literal `?[K` text under `prompt_toolkit`'s `patch_stdout`. Use space-padding: `f"\r{line}{' ' * pad}"`.

 ### `_last_resolved_tool_names` is a process-global in `model_tools.py`
-
-The `execute_code` sandbox uses `_last_resolved_tool_names` (set by `get_tool_definitions()`) to decide which tool stubs to generate. When subagents run with restricted toolsets, they overwrite this global. After delegation returns to the parent, `execute_code` may see the child's restricted list instead of the parent's full list. This is a known bug — `execute_code` calls after delegation may fail with `ImportError: cannot import name 'patch' from 'hermes_tools'`.
+When subagents overwrite this global, `execute_code` calls after delegation may fail with missing tool imports. Known bug.

 ### Tests must not write to `~/.hermes/`
-
-The `autouse` fixture `_isolate_hermes_home` in `tests/conftest.py` redirects `HERMES_HOME` to a temp dir. Every test runs in isolation. If you add a test that creates `AIAgent` instances or writes session logs, the fixture handles cleanup automatically. Never hardcode `~/.hermes/` paths in tests.
+The `_isolate_hermes_home` autouse fixture in `tests/conftest.py` redirects `HERMES_HOME` to a temp dir. Never hardcode `~/.hermes/` paths in tests.

 ---

-## Testing Changes
+## Testing

-After making changes:
+```bash
+source .venv/bin/activate
+python -m pytest tests/ -q          # Full suite (~2500 tests, ~2 min)
+python -m pytest tests/test_model_tools.py -q   # Toolset resolution
+python -m pytest tests/test_cli_init.py -q       # CLI config loading
+python -m pytest tests/gateway/ -q               # Gateway tests
+python -m pytest tests/tools/ -q                 # Tool-level tests
+```

-1. Run `hermes doctor` to check setup
-2. Run `hermes config check` to verify config
-3. Test with `hermes chat -q "test message"`
-4. For new config options, test fresh install: `rm -rf ~/.hermes && hermes setup`
+Always run the full suite before pushing changes.
--- a/README.md
+++ b/README.md
@@ -17,7 +17,7 @@ Use any model you want — [Nous Portal](https://portal.nousresearch.com), [Open

 <table>
 <tr><td><b>A real terminal interface</b></td><td>Full TUI with multiline editing, slash-command autocomplete, conversation history, interrupt-and-redirect, and streaming tool output.</td></tr>
-<tr><td><b>Lives where you do</b></td><td>Telegram, Discord, Slack, WhatsApp, and CLI — all from a single gateway process. Voice memo transcription, cross-platform conversation continuity.</td></tr>
+<tr><td><b>Lives where you do</b></td><td>Telegram, Discord, Slack, WhatsApp, Signal, and CLI — all from a single gateway process. Voice memo transcription, cross-platform conversation continuity.</td></tr>
 <tr><td><b>A closed learning loop</b></td><td>Agent-curated memory with periodic nudges. Autonomous skill creation after complex tasks. Skills self-improve during use. FTS5 session search with LLM summarization for cross-session recall. <a href="https://github.com/plastic-labs/honcho">Honcho</a> dialectic user modeling. Compatible with the <a href="https://agentskills.io">agentskills.io</a> open standard.</td></tr>
 <tr><td><b>Scheduled automations</b></td><td>Built-in cron scheduler with delivery to any platform. Daily reports, nightly backups, weekly audits — all in natural language, running unattended.</td></tr>
 <tr><td><b>Delegates and parallelizes</b></td><td>Spawn isolated subagents for parallel workstreams. Write Python scripts that call tools via RPC, collapsing multi-step pipelines into zero-context-cost turns.</td></tr>
@@ -71,7 +71,7 @@ All documentation lives at **[hermes-agent.nousresearch.com/docs](https://hermes
 | [Quickstart](https://hermes-agent.nousresearch.com/docs/getting-started/quickstart) | Install → setup → first conversation in 2 minutes |
 | [CLI Usage](https://hermes-agent.nousresearch.com/docs/user-guide/cli) | Commands, keybindings, personalities, sessions |
 | [Configuration](https://hermes-agent.nousresearch.com/docs/user-guide/configuration) | Config file, providers, models, all options |
-| [Messaging Gateway](https://hermes-agent.nousresearch.com/docs/user-guide/messaging) | Telegram, Discord, Slack, WhatsApp, Home Assistant |
+| [Messaging Gateway](https://hermes-agent.nousresearch.com/docs/user-guide/messaging) | Telegram, Discord, Slack, WhatsApp, Signal, Home Assistant |
 | [Security](https://hermes-agent.nousresearch.com/docs/user-guide/security) | Command approval, DM pairing, container isolation |
 | [Tools & Toolsets](https://hermes-agent.nousresearch.com/docs/user-guide/features/tools) | 40+ tools, toolset system, terminal backends |
 | [Skills System](https://hermes-agent.nousresearch.com/docs/user-guide/features/skills) | Procedural memory, Skills Hub, creating skills |
--- a/agent/context_compressor.py
+++ b/agent/context_compressor.py
@@ -342,7 +342,9 @@ Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
            compressed.append(msg)

        if summary:
-            compressed.append({"role": "user", "content": summary})
+            last_head_role = messages[compress_start - 1].get("role", "user") if compress_start > 0 else "user"
+            summary_role = "user" if last_head_role in ("assistant", "tool") else "assistant"
+            compressed.append({"role": summary_role, "content": summary})
        else:
            if not self.quiet_mode:
                print("   ⚠️  No summary model available — middle turns dropped without summary")
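Review note on the hunk above: the summary message no longer always uses the `user` role. Its role is now derived from the last message kept before the compressed span, presumably so the summary does not repeat that message's role. Below is a minimal sketch of that selection in isolation. The function name `pick_summary_role` is hypothetical and does not appear in the patch; the logic mirrors the two added lines.

```python
def pick_summary_role(messages, compress_start):
    """Return the role for the summary inserted at index compress_start.

    Hypothetical helper: it mirrors the two lines added in this hunk.
    """
    if compress_start > 0:
        # Role of the last message kept verbatim before the compressed span
        last_head_role = messages[compress_start - 1].get("role", "user")
    else:
        last_head_role = "user"
    # After an assistant or tool message the summary is a user turn;
    # after anything else (user, system) it is an assistant turn.
    return "user" if last_head_role in ("assistant", "tool") else "assistant"


history = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello"},
]
print(pick_summary_role(history, 3))  # last kept message is from the assistant
print(pick_summary_role(history, 2))  # last kept message is from the user
```

Note that a `system` head message, or `compress_start == 0`, yields an `assistant` summary under this logic.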
--- a/agent/prompt_builder.py
+++ b/agent/prompt_builder.py
@@ -122,6 +122,15 @@ PLATFORM_HINTS = {
        "attachments, audio as file attachments. You can also include image URLs "
        "in markdown format ![alt](url) and they will be uploaded as attachments."
    ),
+    "signal": (
+        "You are on a text messaging communication platform, Signal. "
+        "Please do not use markdown as it does not render. "
+        "You can send media files natively: to deliver a file to the user, "
+        "include MEDIA:/absolute/path/to/file in your response. Images "
+        "(.png, .jpg, .webp) appear as photos, audio as attachments, and other "
+        "files arrive as downloadable documents. You can also include image "
+        "URLs in markdown format ![alt](url) and they will be sent as photos."
+    ),
    "cli": (
        "You are a CLI AI Agent. Try not to use markdown but simple text "
        "renderable inside a terminal."
@@ -186,6 +195,8 @@ def build_skills_system_prompt() -> str:

    # Collect skills with descriptions, grouped by category
    # Each entry: (skill_name, description)
+    # Supports sub-categories: skills/mlops/training/axolotl/SKILL.md
+    # → category "mlops/training", skill "axolotl"
    skills_by_category: dict[str, list[tuple[str, str]]] = {}
    for skill_file in skills_dir.rglob("SKILL.md"):
        # Skip skills incompatible with the current OS platform
@@ -194,8 +205,13 @@ def build_skills_system_prompt() -> str:
        rel_path = skill_file.relative_to(skills_dir)
        parts = rel_path.parts
        if len(parts) >= 2:
-            category = parts[0]
+            # Category is everything between skills_dir and the skill folder
+            # e.g. parts = ("mlops", "training", "axolotl", "SKILL.md")
+            #   → category = "mlops/training", skill_name = "axolotl"
+            # e.g. parts = ("github", "github-auth", "SKILL.md")
+            #   → category = "github", skill_name = "github-auth"
            skill_name = parts[-2]
+            category = "/".join(parts[:-2]) if len(parts) > 2 else parts[0]
        else:
            category = "general"
            skill_name = skill_file.parent.name
@@ -206,9 +222,11 @@ def build_skills_system_prompt() -> str:
        return ""

    # Read category-level descriptions from DESCRIPTION.md
+    # Checks both the exact category path and parent directories
    category_descriptions = {}
    for category in skills_by_category:
-        desc_file = skills_dir / category / "DESCRIPTION.md"
+        cat_path = Path(category)
+        desc_file = skills_dir / cat_path / "DESCRIPTION.md"
        if desc_file.exists():
            try:
                content = desc_file.read_text(encoding="utf-8")
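Review note on the sub-category change above: the category is now every path component between the skills directory and the skill folder, joined with `/`. The sketch below reproduces that split on plain relative paths so the edge cases are easy to see. `split_skill_path` and its `fallback_name` parameter are hypothetical stand-ins; in the patch the fallback comes from `skill_file.parent.name`.

```python
from pathlib import PurePosixPath


def split_skill_path(rel_path, fallback_name="skills"):
    """Split a SKILL.md path (relative to the skills dir) into (category, skill_name).

    Hypothetical helper mirroring the branch logic in build_skills_system_prompt().
    """
    parts = PurePosixPath(rel_path).parts
    if len(parts) >= 2:
        skill_name = parts[-2]
        # Everything before the skill folder forms the category path
        category = "/".join(parts[:-2]) if len(parts) > 2 else parts[0]
    else:
        # SKILL.md sitting directly in the skills dir
        category = "general"
        skill_name = fallback_name
    return category, skill_name


for path in (
    "mlops/training/axolotl/SKILL.md",
    "github/github-auth/SKILL.md",
    "axolotl/SKILL.md",
):
    print(path, "->", split_skill_path(path))
```

One consequence worth knowing: a skill folder placed directly under the skills directory (two path parts) gets a category equal to its own name.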
--- a/agent/redact.py
+++ b/agent/redact.py
@@ -8,6 +8,7 @@ the first 6 and last 4 characters for debuggability.
 """

 import logging
+import os
 import re
 from typing import Optional

@@ -15,7 +16,7 @@ logger = logging.getLogger(__name__)

 # Known API key prefixes -- match the prefix + contiguous token chars
 _PREFIX_PATTERNS = [
-    r"sk-[A-Za-z0-9_-]{10,}",           # OpenAI / OpenRouter
+    r"sk-[A-Za-z0-9_-]{10,}",           # OpenAI / OpenRouter / Anthropic (sk-ant-*)
    r"ghp_[A-Za-z0-9]{10,}",            # GitHub PAT (classic)
    r"github_pat_[A-Za-z0-9_]{10,}",    # GitHub PAT (fine-grained)
    r"xox[baprs]-[A-Za-z0-9-]{10,}",    # Slack tokens
@@ -25,6 +26,18 @@ _PREFIX_PATTERNS = [
    r"fc-[A-Za-z0-9]{10,}",             # Firecrawl
    r"bb_live_[A-Za-z0-9_-]{10,}",      # BrowserBase
    r"gAAAA[A-Za-z0-9_=-]{20,}",        # Codex encrypted tokens
+    r"AKIA[A-Z0-9]{16}",                # AWS Access Key ID
+    r"sk_live_[A-Za-z0-9]{10,}",        # Stripe secret key (live)
+    r"sk_test_[A-Za-z0-9]{10,}",        # Stripe secret key (test)
+    r"rk_live_[A-Za-z0-9]{10,}",        # Stripe restricted key
+    r"SG\.[A-Za-z0-9_-]{10,}",          # SendGrid API key
+    r"hf_[A-Za-z0-9]{10,}",             # HuggingFace token
+    r"r8_[A-Za-z0-9]{10,}",             # Replicate API token
+    r"npm_[A-Za-z0-9]{10,}",            # npm access token
+    r"pypi-[A-Za-z0-9_-]{10,}",         # PyPI API token
+    r"dop_v1_[A-Za-z0-9]{10,}",         # DigitalOcean PAT
+    r"doo_v1_[A-Za-z0-9]{10,}",         # DigitalOcean OAuth
+    r"am_[A-Za-z0-9_-]{10,}",           # AgentMail API key
 ]

 # ENV assignment patterns: KEY=value where KEY contains a secret-like name
@@ -52,6 +65,22 @@ _TELEGRAM_RE = re.compile(
    r"(bot)?(\d{8,}):([-A-Za-z0-9_]{30,})",
 )

+# Private key blocks: -----BEGIN RSA PRIVATE KEY----- ... -----END RSA PRIVATE KEY-----
+_PRIVATE_KEY_RE = re.compile(
+    r"-----BEGIN[A-Z ]*PRIVATE KEY-----[\s\S]*?-----END[A-Z ]*PRIVATE KEY-----"
+)
+
+# Database connection strings: protocol://user:PASSWORD@host
+# Catches postgres, mysql, mongodb, redis, amqp URLs and redacts the password
+_DB_CONNSTR_RE = re.compile(
+    r"((?:postgres(?:ql)?|mysql|mongodb(?:\+srv)?|redis|amqp)://[^:]+:)([^@]+)(@)",
+    re.IGNORECASE,
+)
+
+# E.164 phone numbers: +<country><number>, 7-15 digits
+# Negative lookahead prevents matching hex strings or identifiers
+_SIGNAL_PHONE_RE = re.compile(r"(\+[1-9]\d{6,14})(?![A-Za-z0-9])")
+
 # Compile known prefix patterns into one alternation
 _PREFIX_RE = re.compile(
    r"(?<![A-Za-z0-9_-])(" + "|".join(_PREFIX_PATTERNS) + r")(?![A-Za-z0-9_-])"
@@ -69,9 +98,12 @@ def redact_sensitive_text(text: str) -> str:
    """Apply all redaction patterns to a block of text.

    Safe to call on any string -- non-matching text passes through unchanged.
+    Disabled when security.redact_secrets is false in config.yaml.
    """
    if not text:
        return text
+    if os.getenv("HERMES_REDACT_SECRETS", "").lower() in ("0", "false", "no", "off"):
+        return text

    # Known prefixes (sk-, ghp_, etc.)
    text = _PREFIX_RE.sub(lambda m: _mask_token(m.group(1)), text)
@@ -101,6 +133,20 @@ def redact_sensitive_text(text: str) -> str:
        return f"{prefix}{digits}:***"
    text = _TELEGRAM_RE.sub(_redact_telegram, text)

+    # Private key blocks
+    text = _PRIVATE_KEY_RE.sub("[REDACTED PRIVATE KEY]", text)
+
+    # Database connection string passwords
+    text = _DB_CONNSTR_RE.sub(lambda m: f"{m.group(1)}***{m.group(3)}", text)
+
+    # E.164 phone numbers (Signal, WhatsApp)
+    def _redact_phone(m):
+        phone = m.group(1)
+        if len(phone) <= 8:
+            return phone[:2] + "****" + phone[-2:]
+        return phone[:4] + "****" + phone[-4:]
+    text = _SIGNAL_PHONE_RE.sub(_redact_phone, text)
+
    return text
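Review note on the three new redaction passes: the sketch below copies the private-key, connection-string, and phone regexes verbatim from this diff and applies them in the same order, so their effect can be checked without importing `agent/redact.py`. The wrapper name `redact_new_patterns` is hypothetical; the real function also runs the prefix, ENV-assignment, and Telegram passes first.

```python
import re

# Patterns copied from the hunks above
_PRIVATE_KEY_RE = re.compile(
    r"-----BEGIN[A-Z ]*PRIVATE KEY-----[\s\S]*?-----END[A-Z ]*PRIVATE KEY-----"
)
_DB_CONNSTR_RE = re.compile(
    r"((?:postgres(?:ql)?|mysql|mongodb(?:\+srv)?|redis|amqp)://[^:]+:)([^@]+)(@)",
    re.IGNORECASE,
)
_SIGNAL_PHONE_RE = re.compile(r"(\+[1-9]\d{6,14})(?![A-Za-z0-9])")


def _redact_phone(m):
    phone = m.group(1)
    if len(phone) <= 8:
        # Shortest valid match: keep "+" plus one digit, and the last two digits
        return phone[:2] + "****" + phone[-2:]
    return phone[:4] + "****" + phone[-4:]


def redact_new_patterns(text):
    """Hypothetical wrapper: applies only the three passes added in this diff."""
    text = _PRIVATE_KEY_RE.sub("[REDACTED PRIVATE KEY]", text)
    text = _DB_CONNSTR_RE.sub(lambda m: f"{m.group(1)}***{m.group(3)}", text)
    text = _SIGNAL_PHONE_RE.sub(_redact_phone, text)
    return text


print(redact_new_patterns("postgres://admin:s3cret@db.local:5432/app"))
print(redact_new_patterns("call +15551234567 now"))
```

The negative lookahead on the phone pattern means a digit run followed directly by a letter or another digit is left alone, which keeps identifiers such as `+15551234567abc` unredacted.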


--- a/cli-config.yaml.example
+++ b/cli-config.yaml.example
@@ -555,6 +555,21 @@ toolsets:
 #     args: ["-y", "@modelcontextprotocol/server-github"]
 #     env:
 #       GITHUB_PERSONAL_ACCESS_TOKEN: "ghp_..."
+#
+# Sampling (server-initiated LLM requests) — enabled by default.
+# Per-server config under the 'sampling' key:
+#   analysis:
+#     command: npx
+#     args: ["-y", "analysis-server"]
+#     sampling:
+#       enabled: true           # default: true
+#       model: "gemini-3-flash" # override model (optional)
+#       max_tokens_cap: 4096    # max tokens per request
+#       timeout: 30             # LLM call timeout (seconds)
+#       max_rpm: 10             # max requests per minute
+#       allowed_models: []      # model whitelist (empty = all)
+#       max_tool_rounds: 5      # tool loop limit (0 = disable)
+#       log_level: "info"       # audit verbosity

 # =============================================================================
 # Voice Transcription (Speech-to-Text)
--- a/cli.py
+++ b/cli.py
@@ -161,6 +161,7 @@ def load_cli_config() -> Dict[str, Any]:
        },
        "browser": {
            "inactivity_timeout": 120,  # Auto-cleanup inactive browser sessions after 2 min
+            "record_sessions": False,  # Auto-record browser sessions as WebM videos
        },
        "compression": {
            "enabled": True,      # Auto-compress when approaching context limit
@@ -363,6 +364,13 @@ def load_cli_config() -> Dict[str, Any]:
        if model:
            os.environ[model_env] = model
    
+    # Security settings
+    security_config = defaults.get("security", {})
+    if isinstance(security_config, dict):
+        redact = security_config.get("redact_secrets")
+        if redact is not None:
+            os.environ["HERMES_REDACT_SECRETS"] = str(redact).lower()
+
    return defaults

 # Load configuration at module startup
@@ -1120,6 +1128,10 @@ class HermesCLI:
        self._provider_require_params = pr.get("require_parameters", False)
        self._provider_data_collection = pr.get("data_collection")
        
+        # Fallback model config — tried when primary provider fails after retries
+        fb = CLI_CONFIG.get("fallback_model") or {}
+        self._fallback_model = fb if fb.get("provider") and fb.get("model") else None
+
        # Agent will be initialized on first use
        self.agent: Optional[AIAgent] = None
        self._app = None  # prompt_toolkit Application (set in run())
@@ -1351,6 +1363,7 @@ class HermesCLI:
                session_db=self._session_db,
                clarify_callback=self._clarify_callback,
                honcho_session_key=self.session_id,
+                fallback_model=self._fallback_model,
            )
            # Apply any pending title now that the session exists in the DB
            if self._pending_title and self._session_db:
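Review note on the `cli.py` hunks: two small pieces of config handling are added, the bridge from `security.redact_secrets` to the `HERMES_REDACT_SECRETS` environment variable (read back in `agent/redact.py`) and the validation of `fallback_model`. The sketch below isolates both. All three function names are hypothetical, and a plain dict stands in for `os.environ` so the example has no side effects.

```python
def export_redact_setting(config, environ):
    """Copy security.redact_secrets into the environment mapping, as load_cli_config() does."""
    security_config = config.get("security", {})
    if isinstance(security_config, dict):
        redact = security_config.get("redact_secrets")
        if redact is not None:
            # A YAML boolean False becomes the string "false"
            environ["HERMES_REDACT_SECRETS"] = str(redact).lower()


def redaction_disabled(environ):
    """Same check redact_sensitive_text() performs before applying any pattern."""
    return environ.get("HERMES_REDACT_SECRETS", "").lower() in ("0", "false", "no", "off")


def resolve_fallback(config):
    """Return the fallback_model dict only if both provider and model are set."""
    fb = config.get("fallback_model") or {}
    return fb if fb.get("provider") and fb.get("model") else None


env = {}
export_redact_setting({"security": {"redact_secrets": False}}, env)
print(env, redaction_disabled(env))
print(resolve_fallback({"fallback_model": {"provider": "openrouter"}}))
```

When `redact_secrets` is absent the variable is never set, so redaction stays on by default; a partially filled `fallback_model` is treated the same as none.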
--- a/cron/scheduler.py
+++ b/cron/scheduler.py
@@ -98,6 +98,7 @@ def _deliver_result(job: dict, content: str) -> None:
        "discord": Platform.DISCORD,
        "slack": Platform.SLACK,
        "whatsapp": Platform.WHATSAPP,
+        "signal": Platform.SIGNAL,
    }
    platform = platform_map.get(platform_name.lower())
    if not platform:
--- a/docs/README.md
+++ b/docs/README.md
@@ -1,7 +0,0 @@
-# Documentation
-
-All documentation has moved to the website:
-
-**📖 [hermes-agent.nousresearch.com/docs](https://hermes-agent.nousresearch.com/docs/)**
-
-The documentation source files live in [`website/docs/`](../website/docs/).
--- a/docs/send_file_integration_map.md
+++ b/docs/send_file_integration_map.md
@@ -1,345 +0,0 @@
-# send_file Integration Map — Hermes Agent Codebase Deep Dive
-
-## 1. environments/tool_context.py — Base64 File Transfer Implementation
-
-### upload_file() (lines 153-205)
- Reads local file as raw bytes, base64-encodes to ASCII string
- Creates parent dirs in sandbox via `self.terminal(f"mkdir -p {parent}")`
- **Chunk size:** 60,000 chars (~60KB per shell command)
- **Small files (<=60KB b64):** Single `printf '%s' '{b64}' | base64 -d > {remote_path}`
- **Large files:** Writes chunks to `/tmp/_hermes_upload.b64` via `printf >> append`, then `base64 -d` to target
- **Error handling:** Checks local file exists; returns `{exit_code, output}`
- **Size limits:** No explicit limit, but shell arg limit ~2MB means chunking is necessary for files >~45KB raw
- **No theoretical max** — but very large files would be slow (many terminal round trips)
-
-### download_file() (lines 234-278)
- Runs `base64 {remote_path}` inside sandbox, captures stdout
- Strips output, base64-decodes to raw bytes
- Writes to host filesystem with parent dir creation
- **Error handling:** Checks exit code, empty output, decode errors
- Returns `{success: bool, bytes: int}` or `{success: false, error: str}`
- **Size limit:** Bounded by terminal output buffer (practical limit ~few MB via base64 terminal output)
-
-### Promotion potential:
- These methods work via `self.terminal()` — they're environment-agnostic
- Could be directly lifted into a new tool that operates on the agent's current sandbox
- For send_file, this `download_file()` pattern is the key: it extracts files from sandbox → host
-
-## 2. tools/environments/base.py — BaseEnvironment Interface
-
-### Current methods:
- `execute(command, cwd, timeout, stdin_data)` → `{output, returncode}`
- `cleanup()` — release resources
- `stop()` — alias for cleanup
- `_prepare_command()` — sudo transformation
- `_build_run_kwargs()` — subprocess kwargs
- `_timeout_result()` — standard timeout dict
-
-### What would need to be added for file transfer:
- **Nothing required at this level.** File transfer can be implemented via `execute()` (base64 over terminal, like ToolContext does) or via environment-specific methods.
- Optional: `upload_file(local_path, remote_path)` and `download_file(remote_path, local_path)` methods could be added to BaseEnvironment for optimized per-backend transfers, but the base64-over-terminal approach already works universally.
-
-## 3. tools/environments/docker.py — Docker Container Details
-
-### Container ID tracking:
- `self._container_id` stored at init from `self._inner.container_id`
- Inner is `minisweagent.environments.docker.DockerEnvironment`
- Container ID is a standard Docker container hash
-
-### docker cp feasibility:
- **YES**, `docker cp` could be used for optimized file transfer:
-  - `docker cp {container_id}:{remote_path} {local_path}` (download)
-  - `docker cp {local_path} {container_id}:{remote_path}` (upload)
- Much faster than base64-over-terminal for large files
- Container ID is directly accessible via `env._container_id` or `env._inner.container_id`
-
-### Volumes mounted:
- **Persistent mode:** Bind mounts at `~/.hermes/sandboxes/docker/{task_id}/workspace` → `/workspace` and `.../home` → `/root`
- **Ephemeral mode:** tmpfs at `/workspace` (10GB), `/home` (1GB), `/root` (1GB)
- **User volumes:** From `config.yaml docker_volumes` (arbitrary `-v` mounts)
- **Security tmpfs:** `/tmp` (512MB), `/var/tmp` (256MB), `/run` (64MB)
-
-### Direct host access for persistent mode:
- If persistent, files at `/workspace/foo.txt` are just `~/.hermes/sandboxes/docker/{task_id}/workspace/foo.txt` on host — no transfer needed!
-
-## 4. tools/environments/ssh.py — SSH Connection Management
-
-### Connection management:
- Uses SSH ControlMaster for persistent connection
- Control socket at `/tmp/hermes-ssh/{user}@{host}:{port}.sock`
- ControlPersist=300 (5 min keepalive)
- BatchMode=yes (non-interactive)
- Stores: `self.host`, `self.user`, `self.port`, `self.key_path`
-
-### SCP/SFTP feasibility:
- **YES**, SCP can piggyback on the ControlMaster socket:
-  - `scp -o ControlPath={socket} {user}@{host}:{remote} {local}` (download)
-  - `scp -o ControlPath={socket} {local} {user}@{host}:{remote}` (upload)
- Same SSH key and connection reuse — zero additional auth
- Would be much faster than base64-over-terminal for large files
-
-## 5. tools/environments/modal.py — Modal Sandbox Filesystem
-
-### Filesystem API exposure:
- **Not directly.** The inner `SwerexModalEnvironment` wraps Modal's sandbox
- The sandbox object is accessible at: `env._inner.deployment._sandbox`
- Modal's Python SDK exposes `sandbox.open()` for file I/O — but only via async API
- Currently only used for `snapshot_filesystem()` during cleanup
- **Could use:** `sandbox.open(path, "rb")` to read files or `sandbox.open(path, "wb")` to write
- **Alternative:** Base64-over-terminal already works via `execute()` — simpler, no SDK dependency
-
-## 6. gateway/platforms/base.py — MEDIA: Tag Flow (Complete)
-
-### extract_media() (lines 587-620):
- **Pattern:** `MEDIA:\S+` — extracts file paths after MEDIA: prefix
- **Voice flag:** `[[audio_as_voice]]` global directive sets `is_voice=True` for all media in message
- Returns `List[Tuple[str, bool]]` (path, is_voice) and cleaned content
-
-### _process_message_background() media routing (lines 752-786):
- After extracting MEDIA tags, routes by file extension:
-  - `.ogg .opus .mp3 .wav .m4a` → `send_voice()`
-  - `.mp4 .mov .avi .mkv .3gp` → `send_video()`
-  - `.jpg .jpeg .png .webp .gif` → `send_image_file()`
-  - **Everything else** → `send_document()`
- This routing already supports arbitrary files!
-
-### send_* method inventory (base class):
- `send(chat_id, content, reply_to, metadata)` — ABSTRACT, text
- `send_image(chat_id, image_url, caption, reply_to)` — URL-based images
- `send_animation(chat_id, animation_url, caption, reply_to)` — GIF animations
- `send_voice(chat_id, audio_path, caption, reply_to)` — voice messages
- `send_video(chat_id, video_path, caption, reply_to)` — video files
- `send_document(chat_id, file_path, caption, file_name, reply_to)` — generic files
- `send_image_file(chat_id, image_path, caption, reply_to)` — local image files
- `send_typing(chat_id)` — typing indicator
- `edit_message(chat_id, message_id, content)` — edit sent messages
-
-### What's missing:
- **Telegram:** No override for `send_document` — falls back to text! (`send_image_file` ✅ added)
- **Discord:** No override for `send_document` — falls back to text! (`send_image_file` ✅ added)
- **Slack:** No override for `send_document` — falls back to text! (`send_image_file` ✅ added)
- **WhatsApp:** Has `send_document` and `send_image_file` via bridge — COMPLETE.
- The base class defaults just send "📎 File: /path" as text — useless for actual file delivery.
-
-## 7. gateway/platforms/telegram.py — Send Method Analysis
-
-### Implemented send methods:
- `send()` — MarkdownV2 text with fallback to plain
- `send_voice()` — `.ogg`/`.opus` as `send_voice()`, others as `send_audio()`
- `send_image()` — URL-based via `send_photo()`
- `send_image_file()` — local file via `send_photo(photo=open(path, 'rb'))` ✅
- `send_animation()` — GIF via `send_animation()`
- `send_typing()` — "typing" chat action
- `edit_message()` — edit text messages
-
-### MISSING:
- **`send_document()` NOT overridden** — Need to add `self._bot.send_document(chat_id, document=open(file_path, 'rb'), ...)`
- **`send_video()` NOT overridden** — Need to add `self._bot.send_video(...)`
-
-## 8. gateway/platforms/discord.py — Send Method Analysis
-
-### Implemented send methods:
- `send()` — text messages with chunking
- `send_voice()` — discord.File attachment
- `send_image()` — downloads URL, creates discord.File attachment
- `send_image_file()` — local file via discord.File attachment ✅
- `send_typing()` — channel.typing()
- `edit_message()` — edit text messages
-
-### MISSING:
- **`send_document()` NOT overridden** — Need to add discord.File attachment
- **`send_video()` NOT overridden** — Need to add discord.File attachment
-
-## 9. gateway/run.py — User File Attachment Handling
-
-### Current attachment flow:
-1. **Telegram photos** (line 509-529): Download via `photo.get_file()` → `cache_image_from_bytes()` → vision auto-analysis
-2. **Telegram voice** (line 532-541): Download → `cache_audio_from_bytes()` → STT transcription
-3. **Telegram audio** (line 542-551): Same pattern
-4. **Telegram documents** (line 553-617): Extension validation against `SUPPORTED_DOCUMENT_TYPES`, 20MB limit, content injection for text files
-5. **Discord attachments** (line 717-751): Content-type detection, image/audio caching, URL fallback for other types
-6. **Gateway run.py** (lines 818-883): Auto-analyzes images with vision, transcribes audio, enriches document messages with context notes
-
-### Key insight: Files are always cached to host filesystem first, then processed. The agent sees local file paths.
-
-## 10. tools/terminal_tool.py — Terminal Tool & Environment Interaction
-
-### How it manages environments:
- Global dict `_active_environments: Dict[str, Any]` keyed by task_id
- Per-task creation locks prevent duplicate sandbox creation
- Auto-cleanup thread kills idle environments after `TERMINAL_LIFETIME_SECONDS`
- `_get_env_config()` reads all TERMINAL_* env vars for backend selection
- `_create_environment()` factory creates the right backend type
-
-### Could send_file piggyback?
- **YES.** send_file needs access to the same environment to extract files from sandboxes.
- It can reuse `_active_environments[task_id]` to get the environment, then:
-  - Docker: Use `docker cp` via `env._container_id`
-  - SSH: Use `scp` via `env.control_socket`
-  - Local: Just read the file directly
-  - Modal: Use base64-over-terminal via `env.execute()`
- The file_tools.py module already does this with `ShellFileOperations` — read_file/write_file/search/patch all share the same env instance.
-
-## 11. tools/tts_tool.py — Working Example of File Delivery
-
-### Flow:
-1. Generate audio file to `~/.hermes/audio_cache/tts_TIMESTAMP.{ogg,mp3}`
-2. Return JSON with `media_tag: "MEDIA:/path/to/file"`
-3. For Telegram voice: prepend `[[audio_as_voice]]` directive
-4. The LLM includes the MEDIA tag in its response text
-5. `BasePlatformAdapter._process_message_background()` calls `extract_media()` to find the tag
-6. Routes by extension → `send_voice()` for audio files
-7. Platform adapter sends the file natively
-
-### Key pattern: Tool saves file to host → returns MEDIA: path → LLM echoes it → gateway extracts → platform delivers
-
-## 12. tools/image_generation_tool.py — Working Example of Image Delivery
-
-### Flow:
-1. Call FAL.ai API → get image URL
-2. Return JSON with `image: "https://fal.media/..."` URL
-3. The LLM includes the URL in markdown: `![description](URL)`
-4. `BasePlatformAdapter.extract_images()` finds `![alt](url)` patterns
-5. Routes through `send_image()` (URL) or `send_animation()` (GIF)
-6. Platform downloads and sends natively
-
-### Key difference from TTS: Images are URL-based, not local files. The gateway downloads at send time.
-
---
-
-# INTEGRATION MAP: Where send_file Hooks In
-
-## Architecture Decision: MEDIA: Tag Protocol vs. New Tool
-
-The MEDIA: tag protocol is already the established pattern for file delivery. Two options:
-
-### Option A: Pure MEDIA: Tag (Minimal Change)
- No new tool needed
- Agent downloads file from sandbox to host using terminal (base64)
- Saves to known location (e.g., `~/.hermes/file_cache/`)
- Includes `MEDIA:/path` in response text
- Existing routing in `_process_message_background()` handles delivery
- **Problem:** Agent has to manually do base64 dance + know about MEDIA: convention
-
-### Option B: Dedicated send_file Tool (Recommended)
- New tool that the agent calls with `(file_path, caption?)`
- Tool handles the sandbox → host extraction automatically
- Returns MEDIA: tag that gets routed through existing pipeline
- Much cleaner agent experience
-
-## Implementation Plan for Option B
-
-### Files to CREATE:
-
-1. **`tools/send_file_tool.py`** — The new tool
-   - Accepts: `file_path` (path in sandbox), `caption` (optional)
-   - Detects environment backend from `_active_environments`
-   - Extracts file from sandbox:
-     - **local:** `shutil.copy()` or direct path
-     - **docker:** `docker cp {container_id}:{path} {local_cache}/` 
-     - **ssh:** `scp -o ControlPath=... {user}@{host}:{path} {local_cache}/`
-     - **modal:** base64-over-terminal via `env.execute("base64 {path}")`
-   - Saves to `~/.hermes/file_cache/{uuid}_{filename}`
-   - Returns: `MEDIA:/cached/path` in response for gateway to pick up
-   - Register with `registry.register(name="send_file", toolset="file", ...)`
-
-### Files to MODIFY:
-
-2. **`gateway/platforms/telegram.py`** — Add missing send methods:
-   ```python
-   async def send_document(self, chat_id, file_path, caption=None, file_name=None, reply_to=None):
-       with open(file_path, "rb") as f:
-           msg = await self._bot.send_document(
-               chat_id=int(chat_id), document=f,
-               caption=caption, filename=file_name or os.path.basename(file_path))
-       return SendResult(success=True, message_id=str(msg.message_id))
-   
-   async def send_image_file(self, chat_id, image_path, caption=None, reply_to=None):
-       with open(image_path, "rb") as f:
-           msg = await self._bot.send_photo(chat_id=int(chat_id), photo=f, caption=caption)
-       return SendResult(success=True, message_id=str(msg.message_id))
-   
-   async def send_video(self, chat_id, video_path, caption=None, reply_to=None):
-       with open(video_path, "rb") as f:
-           msg = await self._bot.send_video(chat_id=int(chat_id), video=f, caption=caption)
-       return SendResult(success=True, message_id=str(msg.message_id))
-   ```
-
-3. **`gateway/platforms/discord.py`** — Add missing send methods:
-   ```python
-   async def send_document(self, chat_id, file_path, caption=None, file_name=None, reply_to=None):
-       channel = self._client.get_channel(int(chat_id)) or await self._client.fetch_channel(int(chat_id))
-       with open(file_path, "rb") as f:
-           file = discord.File(io.BytesIO(f.read()), filename=file_name or os.path.basename(file_path))
-           msg = await channel.send(content=caption, file=file)
-       return SendResult(success=True, message_id=str(msg.id))
-   
-   async def send_image_file(self, chat_id, image_path, caption=None, reply_to=None):
-       # Same pattern as send_document with image filename
-   
-   async def send_video(self, chat_id, video_path, caption=None, reply_to=None):
-       # Same pattern, discord renders video attachments inline
-   ```
-
-4. **`toolsets.py`** — Add `"send_file"` to `_HERMES_CORE_TOOLS` list
-
-5. **`agent/prompt_builder.py`** — Update platform hints to mention send_file tool
-
-### Code that can be REUSED (zero rewrite):
-
- `BasePlatformAdapter.extract_media()` — Already extracts MEDIA: tags
- `BasePlatformAdapter._process_message_background()` — Already routes by extension
- `ToolContext.download_file()` — Base64-over-terminal extraction pattern
- `tools/terminal_tool.py` _active_environments dict — Environment access
- `tools/registry.py` — Tool registration infrastructure
- `gateway/platforms/base.py` send_document/send_image_file/send_video signatures — Already defined
-
-### Code that needs to be WRITTEN from scratch:
-
-1. `tools/send_file_tool.py` (~150 lines):
-   - File extraction from each environment backend type
-   - Local file cache management
-   - Registry registration
-   
-2. Telegram `send_document` + `send_image_file` + `send_video` overrides (~40 lines)
-3. Discord `send_document` + `send_image_file` + `send_video` overrides (~50 lines)
-
-### Total effort: ~240 lines of new code, ~5 lines of config changes
-
-## Key Environment-Specific Extract Strategies
-
-| Backend    | Extract Method                 | Speed    | Complexity |
-|------------|-------------------------------|----------|------------|
-| local      | shutil.copy / direct path     | Instant  | None       |
-| docker     | `docker cp container:path .`  | Fast     | Low        |
-| docker+vol | Direct host path access       | Instant  | None       |
-| ssh        | `scp -o ControlPath=...`      | Fast     | Low        |
-| modal      | base64-over-terminal          | Moderate | Medium     |
-| singularity| Direct path (overlay mount)   | Fast     | Low        |
-
-## Data Flow Summary
-
-```
-Agent calls send_file(file_path="/workspace/output.pdf", caption="Here's the report")
-    │
-    ▼
-send_file_tool.py:
-    1. Get environment from _active_environments[task_id]
-    2. Detect backend type (docker/ssh/modal/local)
-    3. Extract file to ~/.hermes/file_cache/{uuid}_{filename}
-    4. Return: '{"success": true, "media_tag": "MEDIA:/home/user/.hermes/file_cache/abc123_output.pdf"}'
-    │
-    ▼
-LLM includes MEDIA: tag in its response text
-    │
-    ▼
-BasePlatformAdapter._process_message_background():
-    1. extract_media(response) → finds MEDIA:/path
-    2. Checks extension: .pdf → send_document()
-    3. Calls platform-specific send_document(chat_id, file_path, caption)
-    │
-    ▼
-TelegramAdapter.send_document() / DiscordAdapter.send_document():
-    Opens file, sends via platform API as native document attachment
-    User receives downloadable file in chat
-```
--- a/gateway/channel_directory.py
+++ b/gateway/channel_directory.py
@@ -40,8 +40,8 @@ def build_channel_directory(adapters: Dict[Any, Any]) -> Dict[str, Any]:
        except Exception as e:
            logger.warning("Channel directory: failed to build %s: %s", platform.value, e)

-    # Telegram & WhatsApp can't enumerate chats -- pull from session history
-    for plat_name in ("telegram", "whatsapp"):
+    # Telegram, WhatsApp & Signal can't enumerate chats -- pull from session history
+    for plat_name in ("telegram", "whatsapp", "signal"):
        if plat_name not in platforms:
            platforms[plat_name] = _build_from_sessions(plat_name)

@@ -52,7 +52,7 @@ def build_channel_directory(adapters: Dict[Any, Any]) -> Dict[str, Any]:

    try:
        DIRECTORY_PATH.parent.mkdir(parents=True, exist_ok=True)
-        with open(DIRECTORY_PATH, "w") as f:
+        with open(DIRECTORY_PATH, "w", encoding="utf-8") as f:
            json.dump(directory, f, indent=2, ensure_ascii=False)
    except Exception as e:
        logger.warning("Channel directory: failed to write: %s", e)
@@ -115,7 +115,7 @@ def _build_from_sessions(platform_name: str) -> List[Dict[str, str]]:

    entries = []
    try:
-        with open(sessions_path) as f:
+        with open(sessions_path, encoding="utf-8") as f:
            data = json.load(f)

        seen_ids = set()
@@ -147,7 +147,7 @@ def load_directory() -> Dict[str, Any]:
    if not DIRECTORY_PATH.exists():
        return {"updated_at": None, "platforms": {}}
    try:
-        with open(DIRECTORY_PATH) as f:
+        with open(DIRECTORY_PATH, encoding="utf-8") as f:
            return json.load(f)
    except Exception:
        return {"updated_at": None, "platforms": {}}
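Review note on the `encoding="utf-8"` changes above: `json.dump(..., ensure_ascii=False)` writes non-ASCII characters unescaped, so without an explicit encoding the bytes on disk depend on the host's locale. A minimal sketch of the round trip follows. The helper names, the temporary path, and the sample payload are illustrative assumptions, not code from this PR.

```python
import json
import tempfile
from pathlib import Path


def write_directory(path: Path, directory: dict) -> None:
    """Write JSON with unescaped non-ASCII characters, always as UTF-8."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(directory, f, indent=2, ensure_ascii=False)


def read_directory(path: Path) -> dict:
    """Read the file back with the same explicit encoding."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Hypothetical sample data: a chat name containing non-ASCII characters.
sample = {"platforms": {"signal": [{"id": "1", "name": "Café 日本語"}]}}
tmp_path = Path(tempfile.mkdtemp()) / "channel_directory.json"
write_directory(tmp_path, sample)
round_tripped = read_directory(tmp_path)
```

Pinning the encoding on both the write and the read side is what makes the file portable between hosts with different default locales.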
--- a/gateway/config.py
+++ b/gateway/config.py
@@ -26,6 +26,7 @@ class Platform(Enum):
    DISCORD = "discord"
    WHATSAPP = "whatsapp"
    SLACK = "slack"
+    SIGNAL = "signal"
    HOMEASSISTANT = "homeassistant"


@@ -155,7 +156,16 @@ class GatewayConfig:
        """Return list of platforms that are enabled and configured."""
        connected = []
        for platform, config in self.platforms.items():
-            if config.enabled and (config.token or config.api_key):
+            if not config.enabled:
+                continue
+            # Platforms that use token/api_key auth
+            if config.token or config.api_key:
+                connected.append(platform)
+            # WhatsApp uses enabled flag only (bridge handles auth)
+            elif platform == Platform.WHATSAPP:
+                connected.append(platform)
+            # Signal uses extra dict for config (http_url + account)
+            elif platform == Platform.SIGNAL and config.extra.get("http_url"):
                connected.append(platform)
        return connected
    
@@ -379,6 +389,26 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
                name=os.getenv("SLACK_HOME_CHANNEL_NAME", ""),
            )
    
+    # Signal
+    signal_url = os.getenv("SIGNAL_HTTP_URL")
+    signal_account = os.getenv("SIGNAL_ACCOUNT")
+    if signal_url and signal_account:
+        if Platform.SIGNAL not in config.platforms:
+            config.platforms[Platform.SIGNAL] = PlatformConfig()
+        config.platforms[Platform.SIGNAL].enabled = True
+        config.platforms[Platform.SIGNAL].extra.update({
+            "http_url": signal_url,
+            "account": signal_account,
+            "ignore_stories": os.getenv("SIGNAL_IGNORE_STORIES", "true").lower() in ("true", "1", "yes"),
+        })
+        signal_home = os.getenv("SIGNAL_HOME_CHANNEL")
+        if signal_home:
+            config.platforms[Platform.SIGNAL].home_channel = HomeChannel(
+                platform=Platform.SIGNAL,
+                chat_id=signal_home,
+                name=os.getenv("SIGNAL_HOME_CHANNEL_NAME", "Home"),
+            )
+
    # Home Assistant
    hass_token = os.getenv("HASS_TOKEN")
    if hass_token:
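Review note on the Signal block above: the platform is enabled only when both `SIGNAL_HTTP_URL` and `SIGNAL_ACCOUNT` are set, and `SIGNAL_IGNORE_STORIES` is treated as true for `true`, `1`, or `yes` (case-insensitive), defaulting to true. The sketch below restates that logic against a plain mapping so it can be exercised without touching `os.environ`. The names `env_flag` and `signal_extra_from_env` are hypothetical helpers, not part of this PR.

```python
from typing import Mapping, Optional

_TRUTHY = ("true", "1", "yes")


def env_flag(env: Mapping[str, str], name: str, default: str = "true") -> bool:
    """Interpret an env var as a boolean the same way the Signal block does."""
    return env.get(name, default).lower() in _TRUTHY


def signal_extra_from_env(env: Mapping[str, str]) -> Optional[dict]:
    """Return the Signal `extra` dict, or None when Signal is not configured."""
    url = env.get("SIGNAL_HTTP_URL")
    account = env.get("SIGNAL_ACCOUNT")
    if not (url and account):
        return None  # both values are required to enable Signal
    return {
        "http_url": url,
        "account": account,
        "ignore_stories": env_flag(env, "SIGNAL_IGNORE_STORIES"),
    }
```

Note that any value outside the truthy set, including an empty string, disables the flag.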
--- a/gateway/mirror.py
+++ b/gateway/mirror.py
@@ -73,7 +73,7 @@ def _find_session_id(platform: str, chat_id: str) -> Optional[str]:
        return None

    try:
-        with open(_SESSIONS_INDEX) as f:
+        with open(_SESSIONS_INDEX, encoding="utf-8") as f:
            data = json.load(f)
    except Exception:
        return None
@@ -103,7 +103,7 @@ def _append_to_jsonl(session_id: str, message: dict) -> None:
    """Append a message to the JSONL transcript file."""
    transcript_path = _SESSIONS_DIR / f"{session_id}.jsonl"
    try:
-        with open(transcript_path, "a") as f:
+        with open(transcript_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(message, ensure_ascii=False) + "\n")
    except Exception as e:
        logger.debug("Mirror JSONL write failed: %s", e)
--- a/gateway/platforms/ADDING_A_PLATFORM.md
+++ b/gateway/platforms/ADDING_A_PLATFORM.md
@@ -0,0 +1,313 @@
+# Adding a New Messaging Platform
+
+Checklist for integrating a new messaging platform into the Hermes gateway.
+Use this as a reference when building a new adapter — every item here is a
+real integration point that exists in the codebase. Missing any of them will
+cause broken functionality, missing features, or inconsistent behavior.
+
+---
+
+## 1. Core Adapter (`gateway/platforms/<platform>.py`)
+
+The adapter is a subclass of `BasePlatformAdapter` from `gateway/platforms/base.py`.
+
+### Required methods
+
+| Method | Purpose |
+|--------|---------|
+| `__init__(self, config)` | Parse config, init state. Call `super().__init__(config, Platform.YOUR_PLATFORM)` |
+| `connect() -> bool` | Connect to the platform, start listeners. Return True on success |
+| `disconnect()` | Stop listeners, close connections, cancel tasks |
+| `send(chat_id, text, ...) -> SendResult` | Send a text message |
+| `send_typing(chat_id)` | Send typing indicator |
+| `send_image(chat_id, image_url, caption) -> SendResult` | Send an image |
+| `get_chat_info(chat_id) -> dict` | Return `{name, type, chat_id}` for a chat |
+
+### Optional methods (have default stubs in base)
+
+| Method | Purpose |
+|--------|---------|
+| `send_document(chat_id, path, caption)` | Send a file attachment |
+| `send_voice(chat_id, path)` | Send a voice message |
+| `send_video(chat_id, path, caption)` | Send a video |
+| `send_animation(chat_id, path, caption)` | Send a GIF/animation |
+| `send_image_file(chat_id, path, caption)` | Send image from local file |
+
+### Required function
+
+```python
+def check_<platform>_requirements() -> bool:
+    """Check if this platform's dependencies are available."""
+```
+
+### Key patterns to follow
+
+- Use `self.build_source(...)` to construct `SessionSource` objects
+- Call `self.handle_message(event)` to dispatch inbound messages to the gateway
+- Use `MessageEvent`, `MessageType`, `SendResult` from base
+- Use `cache_image_from_bytes`, `cache_audio_from_bytes`, `cache_document_from_bytes` for attachments
+- Filter self-messages (prevent reply loops)
+- Filter sync/echo messages if the platform has them
+- Redact sensitive identifiers (phone numbers, tokens) in all log output
+- Implement reconnection with exponential backoff + jitter for streaming connections
+- Set `MAX_MESSAGE_LENGTH` if the platform has message size limits
+
+---
+
+## 2. Platform Enum (`gateway/config.py`)
+
+Add the platform to the `Platform` enum:
+
+```python
+class Platform(Enum):
+    ...
+    YOUR_PLATFORM = "your_platform"
+```
+
+Add env var loading in `_apply_env_overrides()`:
+
+```python
+# Your Platform
+your_token = os.getenv("YOUR_PLATFORM_TOKEN")
+if your_token:
+    if Platform.YOUR_PLATFORM not in config.platforms:
+        config.platforms[Platform.YOUR_PLATFORM] = PlatformConfig()
+    config.platforms[Platform.YOUR_PLATFORM].enabled = True
+    config.platforms[Platform.YOUR_PLATFORM].token = your_token
+```
+
+Update `get_connected_platforms()` if your platform doesn't use token/api_key
+(e.g., WhatsApp uses `enabled` flag, Signal uses `extra` dict).
+
+---
+
+## 3. Adapter Factory (`gateway/run.py`)
+
+Add to `_create_adapter()`:
+
+```python
+elif platform == Platform.YOUR_PLATFORM:
+    from gateway.platforms.your_platform import YourAdapter, check_your_requirements
+    if not check_your_requirements():
+        logger.warning("Your Platform: dependencies not met")
+        return None
+    return YourAdapter(config)
+```
+
+---
+
+## 4. Authorization Maps (`gateway/run.py`)
+
+Add to BOTH dicts in `_is_user_authorized()`:
+
+```python
+platform_env_map = {
+    ...
+    Platform.YOUR_PLATFORM: "YOUR_PLATFORM_ALLOWED_USERS",
+}
+platform_allow_all_map = {
+    ...
+    Platform.YOUR_PLATFORM: "YOUR_PLATFORM_ALLOW_ALL_USERS",
+}
+```
+
+---
+
+## 5. Session Source (`gateway/session.py`)
+
+If your platform needs extra identity fields (e.g., Signal's UUID alongside
+phone number), add them to the `SessionSource` dataclass with `Optional` defaults,
+then update its `to_dict()` and `from_dict()`, and `build_source()` in `gateway/platforms/base.py`.
+
+---
+
+## 6. System Prompt Hints (`agent/prompt_builder.py`)
+
+Add a `PLATFORM_HINTS` entry so the agent knows what platform it's on:
+
+```python
+PLATFORM_HINTS = {
+    ...
+    "your_platform": (
+        "You are on Your Platform. "
+        "Describe formatting capabilities, media support, etc."
+    ),
+}
+```
+
+Without this, the agent won't know it's on your platform and may use
+inappropriate formatting (e.g., markdown on platforms that don't render it).
+
+---
+
+## 7. Toolset (`toolsets.py`)
+
+Add a named toolset for your platform:
+
+```python
+"hermes-your-platform": {
+    "description": "Your Platform bot toolset",
+    "tools": _HERMES_CORE_TOOLS,
+    "includes": []
+},
+```
+
+And add it to the `hermes-gateway` composite:
+
+```python
+"hermes-gateway": {
+    "includes": [..., "hermes-your-platform"]
+}
+```
+
+---
+
+## 8. Cron Delivery (`cron/scheduler.py`)
+
+Add to `platform_map` in `_deliver_result()`:
+
+```python
+platform_map = {
+    ...
+    "your_platform": Platform.YOUR_PLATFORM,
+}
+```
+
+Without this, `schedule_cronjob(deliver="your_platform")` silently fails.
+
+---
+
+## 9. Send Message Tool (`tools/send_message_tool.py`)
+
+Add to `platform_map` in `send_message_tool()`:
+
+```python
+platform_map = {
+    ...
+    "your_platform": Platform.YOUR_PLATFORM,
+}
+```
+
+Add routing in `_send_to_platform()`:
+
+```python
+elif platform == Platform.YOUR_PLATFORM:
+    return await _send_your_platform(pconfig, chat_id, message)
+```
+
+Implement `_send_your_platform()` — a standalone async function that sends
+a single message without requiring the full adapter (for use by cron jobs
+and the send_message tool outside the gateway process).
+
+Update the tool schema `target` description to include an example for your platform.
+
+---
+
+## 10. Cronjob Tool Schema (`tools/cronjob_tools.py`)
+
+Update the `deliver` parameter description and docstring to mention your
+platform as a delivery option.
+
+---
+
+## 11. Channel Directory (`gateway/channel_directory.py`)
+
+If your platform can't enumerate chats (most can't), add it to the
+session-based discovery list:
+
+```python
+for plat_name in ("telegram", "whatsapp", "signal", "your_platform"):
+```
+
+---
+
+## 12. Status Display (`hermes_cli/status.py`)
+
+Add to the `platforms` dict in the Messaging Platforms section:
+
+```python
+platforms = {
+    ...
+    "Your Platform": ("YOUR_PLATFORM_TOKEN", "YOUR_PLATFORM_HOME_CHANNEL"),
+}
+```
+
+---
+
+## 13. Gateway Setup Wizard (`hermes_cli/gateway.py`)
+
+Add to the `_PLATFORMS` list:
+
+```python
+{
+    "key": "your_platform",
+    "label": "Your Platform",
+    "emoji": "📱",
+    "token_var": "YOUR_PLATFORM_TOKEN",
+    "setup_instructions": [...],
+    "vars": [...],
+}
+```
+
+If your platform needs custom setup logic (connectivity testing, QR codes,
+policy choices), add a `_setup_your_platform()` function and route to it
+in the platform selection switch.
+
+Update `_platform_status()` if your platform's "configured" check differs
+from the standard `bool(get_env_value(token_var))`.
+
+---
+
+## 14. Phone/ID Redaction (`agent/redact.py`)
+
+If your platform uses sensitive identifiers (phone numbers, etc.), add a
+regex pattern and redaction function to `agent/redact.py`. This ensures
+identifiers are masked in ALL log output, not just your adapter's logs.
+
+---
+
+## 15. Documentation
+
+| File | What to update |
+|------|---------------|
+| `README.md` | Platform list in feature table + documentation table |
+| `AGENTS.md` | Gateway description + env var config section |
+| `website/docs/user-guide/messaging/<platform>.md` | **NEW** — Full setup guide (see existing platform docs for template) |
+| `website/docs/user-guide/messaging/index.md` | Architecture diagram, toolset table, security examples, Next Steps links |
+| `website/docs/reference/environment-variables.md` | All env vars for the platform |
+
+---
+
+## 16. Tests (`tests/gateway/test_<platform>.py`)
+
+Recommended test coverage:
+
+- Platform enum exists with correct value
+- Config loading from env vars via `_apply_env_overrides`
+- Adapter init (config parsing, allowlist handling, default values)
+- Helper functions (redaction, parsing, file type detection)
+- Session source round-trip (to_dict → from_dict)
+- Authorization integration (platform in allowlist maps)
+- Send message tool routing (platform in platform_map)
+
+Optional but valuable:
+- Async tests for message handling flow (mock the platform API)
+- SSE/WebSocket reconnection logic
+- Attachment processing
+- Group message filtering
+
+---
+
+## Quick Verification
+
+After implementing everything, verify with:
+
+```bash
+# All tests pass
+python -m pytest tests/ -q
+
+# Grep for the existing platform names to list every file that references a platform
+grep -r "telegram\|discord\|whatsapp\|slack" gateway/ tools/ agent/ cron/ hermes_cli/ toolsets.py \
+  --include="*.py" -l | sort -u
+# Check each file in the output — if it mentions other platforms but not yours, you missed it
+```
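Review note on the checklist item "reconnection with exponential backoff + jitter": a minimal sketch of one way to compute the delays. The 2-second initial delay and 60-second cap mirror the `SSE_RETRY_DELAY_*` constants in the Signal adapter from this PR. The function names and the 20% jitter spread are assumptions for illustration only.

```python
import random
from typing import Optional


def next_backoff(current: float, maximum: float = 60.0, factor: float = 2.0) -> float:
    """Multiply the delay by `factor`, capped at `maximum`."""
    return min(current * factor, maximum)


def with_jitter(delay: float, spread: float = 0.2,
                rng: Optional[random.Random] = None) -> float:
    """Randomize the delay by +/- `spread` so clients do not reconnect in lockstep."""
    source = rng if rng is not None else random
    return delay * (1.0 + source.uniform(-spread, spread))


# Base delay schedule for six consecutive failed connection attempts.
delay = 2.0
schedule = []
for _ in range(6):
    schedule.append(delay)
    delay = next_backoff(delay)
```

In a listener loop, the adapter would sleep for `with_jitter(delay)` after a failure and reset `delay` to the initial value once a connection succeeds.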
--- a/gateway/platforms/base.py
+++ b/gateway/platforms/base.py
@@ -252,6 +252,7 @@ def cleanup_document_cache(max_age_hours: int = 24) -> int:
 class MessageType(Enum):
    """Types of incoming messages."""
    TEXT = "text"
+    LOCATION = "location"
    PHOTO = "photo"
    VIDEO = "video"
    AUDIO = "audio"
@@ -838,6 +839,8 @@ class BasePlatformAdapter(ABC):
        user_name: Optional[str] = None,
        thread_id: Optional[str] = None,
        chat_topic: Optional[str] = None,
+        user_id_alt: Optional[str] = None,
+        chat_id_alt: Optional[str] = None,
    ) -> SessionSource:
        """Helper to build a SessionSource for this platform."""
        # Normalize empty topic to None
@@ -852,6 +855,8 @@ class BasePlatformAdapter(ABC):
            user_name=user_name,
            thread_id=str(thread_id) if thread_id else None,
            chat_topic=chat_topic.strip() if chat_topic else None,
+            user_id_alt=user_id_alt,
+            chat_id_alt=chat_id_alt,
        )
    
    @abstractmethod
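Review note on the new `user_id_alt` / `chat_id_alt` parameters: the checklist in `ADDING_A_PLATFORM.md` asks for a `to_dict` / `from_dict` round-trip test whenever identity fields are added. The sketch below shows the shape such a round trip could take. `SessionSourceSketch` is a hypothetical, trimmed-down stand-in for the real `SessionSource`, and dropping `None` values in `to_dict` is an assumption, not confirmed behavior of the real class.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class SessionSourceSketch:
    """Hypothetical stand-in for SessionSource with the new optional fields."""
    platform: str
    chat_id: str
    user_id: Optional[str] = None
    user_id_alt: Optional[str] = None  # e.g. a UUID alongside a phone number
    chat_id_alt: Optional[str] = None

    def to_dict(self) -> dict:
        # Assumption: unset optional fields are omitted from the stored record.
        return {k: v for k, v in asdict(self).items() if v is not None}

    @classmethod
    def from_dict(cls, data: dict) -> "SessionSourceSketch":
        # .get() keeps records written before the *_alt fields existed loadable.
        return cls(
            platform=data["platform"],
            chat_id=data["chat_id"],
            user_id=data.get("user_id"),
            user_id_alt=data.get("user_id_alt"),
            chat_id_alt=data.get("chat_id_alt"),
        )
```

Giving the new fields `Optional` defaults is what lets session records saved before this change load without migration.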
--- a/gateway/platforms/signal.py
+++ b/gateway/platforms/signal.py
@@ -0,0 +1,716 @@
+"""Signal messenger platform adapter.
+
+Connects to a signal-cli daemon running in HTTP mode.
+Inbound messages arrive via SSE (Server-Sent Events) streaming.
+Outbound messages and actions use JSON-RPC 2.0 over HTTP.
+
+Based on PR #268 by ibhagwan, rebuilt with bug fixes.
+
+Requires:
+  - signal-cli installed and running: signal-cli daemon --http 127.0.0.1:8080
+  - SIGNAL_HTTP_URL and SIGNAL_ACCOUNT environment variables set
+"""
+
+import asyncio
+import base64
+import json
+import logging
+import os
+import random
+import re
+import time
+from datetime import datetime, timezone
+from pathlib import Path
+from typing import Dict, List, Optional, Any
+from urllib.parse import unquote
+
+import httpx
+
+from gateway.config import Platform, PlatformConfig
+from gateway.platforms.base import (
+    BasePlatformAdapter,
+    MessageEvent,
+    MessageType,
+    SendResult,
+    cache_image_from_bytes,
+    cache_audio_from_bytes,
+    cache_document_from_bytes,
+    cache_image_from_url,
+)
+
+logger = logging.getLogger(__name__)
+
+# ---------------------------------------------------------------------------
+# Constants
+# ---------------------------------------------------------------------------
+SIGNAL_MAX_ATTACHMENT_SIZE = 100 * 1024 * 1024  # 100 MB
+MAX_MESSAGE_LENGTH = 8000  # Signal message size limit
+TYPING_INTERVAL = 8.0  # seconds between typing indicator refreshes
+SSE_RETRY_DELAY_INITIAL = 2.0
+SSE_RETRY_DELAY_MAX = 60.0
+HEALTH_CHECK_INTERVAL = 30.0  # seconds between health checks
+HEALTH_CHECK_STALE_THRESHOLD = 120.0  # seconds without SSE activity before concern
+
+# E.164 phone number pattern for redaction
+_PHONE_RE = re.compile(r"\+[1-9]\d{6,14}")
+
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+def _redact_phone(phone: str) -> str:
+    """Redact a phone number for logging: +15551234567 -> +155****4567."""
+    if not phone:
+        return "<none>"
+    if len(phone) <= 8:
+        return phone[:2] + "****" + phone[-2:] if len(phone) > 4 else "****"
+    return phone[:4] + "****" + phone[-4:]
+
+
+def _parse_comma_list(value: str) -> List[str]:
+    """Split a comma-separated string into a list, stripping whitespace."""
+    return [v.strip() for v in value.split(",") if v.strip()]
+
+
+def _guess_extension(data: bytes) -> str:
+    """Guess file extension from magic bytes."""
+    if data[:4] == b"\x89PNG":
+        return ".png"
+    if data[:2] == b"\xff\xd8":
+        return ".jpg"
+    if data[:4] == b"GIF8":
+        return ".gif"
+    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WEBP":
+        return ".webp"
+    if data[:4] == b"%PDF":
+        return ".pdf"
+    if len(data) >= 8 and data[4:8] == b"ftyp":
+        return ".mp4"
+    if data[:4] == b"OggS":
+        return ".ogg"
+    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
+        return ".mp3"
+    if data[:2] == b"PK":
+        return ".zip"
+    return ".bin"
+
+
+def _is_image_ext(ext: str) -> bool:
+    return ext.lower() in (".jpg", ".jpeg", ".png", ".gif", ".webp")
+
+
+def _is_audio_ext(ext: str) -> bool:
+    return ext.lower() in (".mp3", ".wav", ".ogg", ".m4a", ".aac")
+
+
+def _render_mentions(text: str, mentions: list) -> str:
+    """Replace Signal mention placeholders (\\uFFFC) with readable @identifiers.
+
+    Signal encodes @mentions as the Unicode object replacement character
+    with out-of-band metadata containing the mentioned user's UUID/number.
+    """
+    if not mentions or "\uFFFC" not in text:
+        return text
+    # Sort mentions by start position (reverse) to replace from end to start
+    # so indices don't shift as we replace
+    sorted_mentions = sorted(mentions, key=lambda m: m.get("start", 0), reverse=True)
+    for mention in sorted_mentions:
+        start = mention.get("start", 0)
+        length = mention.get("length", 1)
+        # Use the mention's number or UUID as the replacement
+        identifier = mention.get("number") or mention.get("uuid") or "user"
+        replacement = f"@{identifier}"
+        text = text[:start] + replacement + text[start + length:]
+    return text
+
+
+def check_signal_requirements() -> bool:
+    """Check if Signal is configured (has URL and account)."""
+    return bool(os.getenv("SIGNAL_HTTP_URL") and os.getenv("SIGNAL_ACCOUNT"))
+
+
+# ---------------------------------------------------------------------------
+# Signal Adapter
+# ---------------------------------------------------------------------------
+
+class SignalAdapter(BasePlatformAdapter):
+    """Signal messenger adapter using signal-cli HTTP daemon."""
+
+    platform = Platform.SIGNAL
+
+    def __init__(self, config: PlatformConfig):
+        super().__init__(config, Platform.SIGNAL)
+
+        extra = config.extra or {}
+        self.http_url = extra.get("http_url", "http://127.0.0.1:8080").rstrip("/")
+        self.account = extra.get("account", "")
+        self.ignore_stories = extra.get("ignore_stories", True)
+
+        # Parse allowlists — group policy is derived from presence of group allowlist
+        group_allowed_str = os.getenv("SIGNAL_GROUP_ALLOWED_USERS", "")
+        self.group_allow_from = set(_parse_comma_list(group_allowed_str))
+
+        # HTTP client
+        self.client: Optional[httpx.AsyncClient] = None
+
+        # Background tasks
+        self._sse_task: Optional[asyncio.Task] = None
+        self._health_monitor_task: Optional[asyncio.Task] = None
+        self._typing_tasks: Dict[str, asyncio.Task] = {}
+        self._running = False
+        self._last_sse_activity = 0.0
+        self._sse_response: Optional[httpx.Response] = None
+
+        # Normalize account for self-message filtering
+        self._account_normalized = self.account.strip()
+
+        logger.info("Signal adapter initialized: url=%s account=%s groups=%s",
+                     self.http_url, _redact_phone(self.account),
+                     "enabled" if self.group_allow_from else "disabled")
+
+    # ------------------------------------------------------------------
+    # Lifecycle
+    # ------------------------------------------------------------------
+
+    async def connect(self) -> bool:
+        """Connect to signal-cli daemon and start SSE listener."""
+        if not self.http_url or not self.account:
+            logger.error("Signal: SIGNAL_HTTP_URL and SIGNAL_ACCOUNT are required")
+            return False
+
+        self.client = httpx.AsyncClient(timeout=30.0)
+
+        # Health check — verify signal-cli daemon is reachable
+        try:
+            resp = await self.client.get(f"{self.http_url}/api/v1/check", timeout=10.0)
+            if resp.status_code != 200:
+                logger.error("Signal: health check failed (status %d)", resp.status_code)
+                return False
+        except Exception as e:
+            logger.error("Signal: cannot reach signal-cli at %s: %s", self.http_url, e)
+            return False
+
+        self._running = True
+        self._last_sse_activity = time.time()
+        self._sse_task = asyncio.create_task(self._sse_listener())
+        self._health_monitor_task = asyncio.create_task(self._health_monitor())
+
+        logger.info("Signal: connected to %s", self.http_url)
+        return True
+
+    async def disconnect(self) -> None:
+        """Stop SSE listener and clean up."""
+        self._running = False
+
+        if self._sse_task:
+            self._sse_task.cancel()
+            try:
+                await self._sse_task
+            except asyncio.CancelledError:
+                pass
+
+        if self._health_monitor_task:
+            self._health_monitor_task.cancel()
+            try:
+                await self._health_monitor_task
+            except asyncio.CancelledError:
+                pass
+
+        # Cancel all typing tasks
+        for task in self._typing_tasks.values():
+            task.cancel()
+        self._typing_tasks.clear()
+
+        if self.client:
+            await self.client.aclose()
+            self.client = None
+
+        logger.info("Signal: disconnected")
+
+    # ------------------------------------------------------------------
+    # SSE Streaming (inbound messages)
+    # ------------------------------------------------------------------
+
+    async def _sse_listener(self) -> None:
+        """Listen for SSE events from signal-cli daemon."""
+        url = f"{self.http_url}/api/v1/events?account={self.account.replace('+', '%2B')}"  # raw '+' decodes as a space
+        backoff = SSE_RETRY_DELAY_INITIAL
+
+        while self._running:
+            try:
+                logger.debug("Signal SSE: connecting to %s", url)
+                async with self.client.stream(
+                    "GET", url,
+                    headers={"Accept": "text/event-stream"},
+                    timeout=None,
+                ) as response:
+                    self._sse_response = response
+                    backoff = SSE_RETRY_DELAY_INITIAL  # Reset on successful connection
+                    self._last_sse_activity = time.time()
+                    logger.info("Signal SSE: connected")
+
+                    buffer = ""
+                    async for chunk in response.aiter_text():
+                        if not self._running:
+                            break
+                        buffer += chunk
+                        while "\n" in buffer:
+                            line, buffer = buffer.split("\n", 1)
+                            line = line.strip()
+                            if not line:
+                                continue
+                            # Parse SSE data lines
+                            if line.startswith("data:"):
+                                data_str = line[5:].strip()
+                                if not data_str:
+                                    continue
+                                self._last_sse_activity = time.time()
+                                try:
+                                    data = json.loads(data_str)
+                                    await self._handle_envelope(data)
+                                except json.JSONDecodeError:
+                                    logger.debug("Signal SSE: invalid JSON: %s", data_str[:100])
+                                except Exception:
+                                    logger.exception("Signal SSE: error handling event")
+
+            except asyncio.CancelledError:
+                break
+            except httpx.HTTPError as e:
+                if self._running:
+                    logger.warning("Signal SSE: HTTP error: %s (reconnecting in %.0fs)", e, backoff)
+            except Exception as e:
+                if self._running:
+                    logger.warning("Signal SSE: error: %s (reconnecting in %.0fs)", e, backoff)
+
+            if self._running:
+                # Add up to 20% random jitter to avoid a thundering herd on reconnection
+                jitter = backoff * 0.2 * random.random()
+                await asyncio.sleep(backoff + jitter)
+                backoff = min(backoff * 2, SSE_RETRY_DELAY_MAX)
+
+        self._sse_response = None
+
+    # ------------------------------------------------------------------
+    # Health Monitor
+    # ------------------------------------------------------------------
+
+    async def _health_monitor(self) -> None:
+        """Monitor SSE connection health and force reconnect if stale."""
+        while self._running:
+            await asyncio.sleep(HEALTH_CHECK_INTERVAL)
+            if not self._running:
+                break
+
+            elapsed = time.time() - self._last_sse_activity
+            if elapsed > HEALTH_CHECK_STALE_THRESHOLD:
+                logger.warning("Signal: SSE idle for %.0fs, checking daemon health", elapsed)
+                try:
+                    resp = await self.client.get(
+                        f"{self.http_url}/api/v1/check", timeout=10.0
+                    )
+                    if resp.status_code == 200:
+                        # Daemon is alive but SSE is idle — update activity to
+                        # avoid repeated warnings (connection may just be quiet)
+                        self._last_sse_activity = time.time()
+                        logger.debug("Signal: daemon healthy, SSE idle")
+                    else:
+                        logger.warning("Signal: health check failed (%d), forcing reconnect", resp.status_code)
+                        self._force_reconnect()
+                except Exception as e:
+                    logger.warning("Signal: health check error: %s, forcing reconnect", e)
+                    self._force_reconnect()
+
+    def _force_reconnect(self) -> None:
+        """Force SSE reconnection by closing the current response."""
+        if self._sse_response and not self._sse_response.is_stream_consumed:
+            try:
+                asyncio.create_task(self._sse_response.aclose())
+            except Exception:
+                pass
+            self._sse_response = None
+
+    # ------------------------------------------------------------------
+    # Message Handling
+    # ------------------------------------------------------------------
+
+    async def _handle_envelope(self, envelope: dict) -> None:
+        """Process an incoming signal-cli envelope."""
+        # Unwrap nested envelope if present
+        envelope_data = envelope.get("envelope", envelope)
+
+        # Filter syncMessage envelopes (sent transcripts, read receipts, etc.)
+        # signal-cli may set syncMessage to null vs omitting it, so check key existence
+        if "syncMessage" in envelope_data:
+            return
+
+        # Extract sender info
+        sender = (
+            envelope_data.get("sourceNumber")
+            or envelope_data.get("sourceUuid")
+            or envelope_data.get("source")
+        )
+        sender_name = envelope_data.get("sourceName", "")
+        sender_uuid = envelope_data.get("sourceUuid", "")
+
+        if not sender:
+            logger.debug("Signal: ignoring envelope with no sender")
+            return
+
+        # Self-message filtering — prevent reply loops
+        if self._account_normalized and sender == self._account_normalized:
+            return
+
+        # Filter stories
+        if self.ignore_stories and envelope_data.get("storyMessage"):
+            return
+
+        # Get data message — also check editMessage (edited messages contain
+        # their updated dataMessage inside editMessage.dataMessage)
+        data_message = (
+            envelope_data.get("dataMessage")
+            or (envelope_data.get("editMessage") or {}).get("dataMessage")
+        )
+        if not data_message:
+            return
+
+        # Check for group message
+        group_info = data_message.get("groupInfo")
+        group_id = group_info.get("groupId") if group_info else None
+        is_group = bool(group_id)
+
+        # Group message filtering — derived from SIGNAL_GROUP_ALLOWED_USERS:
+        # - No env var set → groups disabled (default safe behavior)
+        # - Env var set with group IDs → only those groups allowed
+        # - Env var set with "*" → all groups allowed
+        # DM auth is fully handled by run.py (_is_user_authorized)
+        if is_group:
+            if not self.group_allow_from:
+                logger.debug("Signal: ignoring group message (no SIGNAL_GROUP_ALLOWED_USERS)")
+                return
+            if "*" not in self.group_allow_from and group_id not in self.group_allow_from:
+                logger.debug("Signal: group %s not in allowlist", group_id[:8] if group_id else "?")
+                return
+
+        # Build chat info
+        chat_id = sender if not is_group else f"group:{group_id}"
+        chat_type = "group" if is_group else "dm"
+
+        # Extract text and render mentions
+        text = data_message.get("message", "")
+        mentions = data_message.get("mentions", [])
+        if text and mentions:
+            text = _render_mentions(text, mentions)
+
+        # Process attachments
+        attachments_data = data_message.get("attachments", [])
+        image_paths = []
+        audio_path = None
+        document_paths = []
+
+        if attachments_data and not getattr(self, "ignore_attachments", False):
+            for att in attachments_data:
+                att_id = att.get("id")
+                att_size = att.get("size", 0)
+                if not att_id:
+                    continue
+                if att_size > SIGNAL_MAX_ATTACHMENT_SIZE:
+                    logger.warning("Signal: attachment too large (%d bytes), skipping", att_size)
+                    continue
+                try:
+                    cached_path, ext = await self._fetch_attachment(att_id)
+                    if cached_path:
+                        if _is_image_ext(ext):
+                            image_paths.append(cached_path)
+                        elif _is_audio_ext(ext):
+                            audio_path = cached_path
+                        else:
+                            document_paths.append(cached_path)
+                except Exception:
+                    logger.exception("Signal: failed to fetch attachment %s", att_id)
+
+        # Build session source
+        source = self.build_source(
+            chat_id=chat_id,
+            chat_name=group_info.get("groupName") if group_info else sender_name,
+            chat_type=chat_type,
+            user_id=sender,
+            user_name=sender_name or sender,
+            user_id_alt=sender_uuid if sender_uuid else None,
+            chat_id_alt=group_id if is_group else None,
+        )
+
+        # Determine message type
+        msg_type = MessageType.TEXT
+        if audio_path:
+            msg_type = MessageType.VOICE
+        elif image_paths:
+            msg_type = MessageType.IMAGE
+
+        # Parse timestamp from envelope data (milliseconds since epoch)
+        ts_ms = envelope_data.get("timestamp", 0)
+        if ts_ms:
+            try:
+                timestamp = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
+            except (ValueError, OSError, OverflowError):
+                timestamp = datetime.now(tz=timezone.utc)
+        else:
+            timestamp = datetime.now(tz=timezone.utc)
+
+        # Build and dispatch event
+        event = MessageEvent(
+            source=source,
+            text=text or "",
+            message_type=msg_type,
+            image_paths=image_paths,
+            audio_path=audio_path,
+            document_paths=document_paths,
+            timestamp=timestamp,
+        )
+
+        logger.debug("Signal: message from %s in %s: %s",
+                      _redact_phone(sender), chat_id[:20], (text or "")[:50])
+
+        await self.handle_message(event)
+
+    # ------------------------------------------------------------------
+    # Attachment Handling
+    # ------------------------------------------------------------------
+
+    async def _fetch_attachment(self, attachment_id: str) -> tuple:
+        """Fetch an attachment via JSON-RPC and cache it. Returns (path, ext)."""
+        result = await self._rpc("getAttachment", {
+            "account": self.account,
+            "attachmentId": attachment_id,
+        })
+
+        if not result:
+            return None, ""
+
+        # Result is base64-encoded file content
+        raw_data = base64.b64decode(result)
+        ext = _guess_extension(raw_data)
+
+        if _is_image_ext(ext):
+            path = cache_image_from_bytes(raw_data, ext)
+        elif _is_audio_ext(ext):
+            path = cache_audio_from_bytes(raw_data, ext)
+        else:
+            path = cache_document_from_bytes(raw_data, ext)
+
+        return path, ext
+
+    # ------------------------------------------------------------------
+    # JSON-RPC Communication
+    # ------------------------------------------------------------------
+
+    async def _rpc(self, method: str, params: dict, rpc_id: Optional[str] = None) -> Any:
+        """Send a JSON-RPC 2.0 request to signal-cli daemon."""
+        if not self.client:
+            logger.warning("Signal: RPC called but client not connected")
+            return None
+
+        if rpc_id is None:
+            rpc_id = f"{method}_{int(time.time() * 1000)}"
+
+        payload = {
+            "jsonrpc": "2.0",
+            "method": method,
+            "params": params,
+            "id": rpc_id,
+        }
+
+        try:
+            resp = await self.client.post(
+                f"{self.http_url}/api/v1/rpc",
+                json=payload,
+                timeout=30.0,
+            )
+            resp.raise_for_status()
+            data = resp.json()
+
+            if "error" in data:
+                logger.warning("Signal RPC error (%s): %s", method, data["error"])
+                return None
+
+            return data.get("result")
+
+        except Exception as e:
+            logger.warning("Signal RPC %s failed: %s", method, e)
+            return None
+
+    # ------------------------------------------------------------------
+    # Sending
+    # ------------------------------------------------------------------
+
+    async def send(
+        self,
+        chat_id: str,
+        text: str,
+        reply_to_message_id: Optional[str] = None,
+        **kwargs,
+    ) -> SendResult:
+        """Send a text message."""
+        await self._stop_typing_indicator(chat_id)
+
+        params: Dict[str, Any] = {
+            "account": self.account,
+            "message": text,
+        }
+
+        if chat_id.startswith("group:"):
+            params["groupId"] = chat_id[6:]
+        else:
+            params["recipient"] = [chat_id]
+
+        result = await self._rpc("send", params)
+
+        if result is not None:
+            return SendResult(success=True)
+        return SendResult(success=False, error="RPC send failed")
+
+    async def send_typing(self, chat_id: str) -> None:
+        """Send a typing indicator."""
+        params: Dict[str, Any] = {
+            "account": self.account,
+        }
+
+        if chat_id.startswith("group:"):
+            params["groupId"] = chat_id[6:]
+        else:
+            params["recipient"] = [chat_id]
+
+        await self._rpc("sendTyping", params, rpc_id="typing")
+
+    async def send_image(
+        self,
+        chat_id: str,
+        image_url: str,
+        caption: Optional[str] = None,
+        **kwargs,
+    ) -> SendResult:
+        """Send an image. Supports http(s):// and file:// URLs."""
+        await self._stop_typing_indicator(chat_id)
+
+        # Resolve image to local path
+        if image_url.startswith("file://"):
+            file_path = unquote(image_url[7:])
+        else:
+            # Download remote image to cache
+            try:
+                file_path = await cache_image_from_url(image_url)
+            except Exception as e:
+                logger.warning("Signal: failed to download image: %s", e)
+                return SendResult(success=False, error=str(e))
+
+        if not file_path or not Path(file_path).exists():
+            return SendResult(success=False, error="Image file not found")
+
+        # Validate size
+        file_size = Path(file_path).stat().st_size
+        if file_size > SIGNAL_MAX_ATTACHMENT_SIZE:
+            return SendResult(success=False, error=f"Image too large ({file_size} bytes)")
+
+        params: Dict[str, Any] = {
+            "account": self.account,
+            "message": caption or "",
+            "attachments": [file_path],
+        }
+
+        if chat_id.startswith("group:"):
+            params["groupId"] = chat_id[6:]
+        else:
+            params["recipient"] = [chat_id]
+
+        result = await self._rpc("send", params)
+        if result is not None:
+            return SendResult(success=True)
+        return SendResult(success=False, error="RPC send with attachment failed")
+
+    async def send_document(
+        self,
+        chat_id: str,
+        file_path: str,
+        caption: Optional[str] = None,
+        filename: Optional[str] = None,
+        **kwargs,
+    ) -> SendResult:
+        """Send a document/file attachment."""
+        await self._stop_typing_indicator(chat_id)
+
+        if not Path(file_path).exists():
+            return SendResult(success=False, error="File not found")
+
+        params: Dict[str, Any] = {
+            "account": self.account,
+            "message": caption or "",
+            "attachments": [file_path],
+        }
+
+        if chat_id.startswith("group:"):
+            params["groupId"] = chat_id[6:]
+        else:
+            params["recipient"] = [chat_id]
+
+        result = await self._rpc("send", params)
+        if result is not None:
+            return SendResult(success=True)
+        return SendResult(success=False, error="RPC send document failed")
+
+    # ------------------------------------------------------------------
+    # Typing Indicators
+    # ------------------------------------------------------------------
+
+    async def _start_typing_indicator(self, chat_id: str) -> None:
+        """Start a typing indicator loop for a chat."""
+        if chat_id in self._typing_tasks:
+            return  # Already running
+
+        async def _typing_loop():
+            try:
+                while True:
+                    await self.send_typing(chat_id)
+                    await asyncio.sleep(TYPING_INTERVAL)
+            except asyncio.CancelledError:
+                pass
+
+        self._typing_tasks[chat_id] = asyncio.create_task(_typing_loop())
+
+    async def _stop_typing_indicator(self, chat_id: str) -> None:
+        """Stop a typing indicator loop for a chat."""
+        task = self._typing_tasks.pop(chat_id, None)
+        if task:
+            task.cancel()
+            try:
+                await task
+            except asyncio.CancelledError:
+                pass
+
+    # ------------------------------------------------------------------
+    # Chat Info
+    # ------------------------------------------------------------------
+
+    async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
+        """Get information about a chat/contact."""
+        if chat_id.startswith("group:"):
+            return {
+                "name": chat_id,
+                "type": "group",
+                "chat_id": chat_id,
+            }
+
+        # Try to resolve contact name
+        result = await self._rpc("getContact", {
+            "account": self.account,
+            "contactAddress": chat_id,
+        })
+
+        name = chat_id
+        if result and isinstance(result, dict):
+            name = result.get("name") or result.get("profileName") or chat_id
+
+        return {
+            "name": name,
+            "type": "dm",
+            "chat_id": chat_id,
+        }
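Review note on the SSE listener above: the buffering loop in `_sse_listener` only decodes a `data:` line once its terminating newline has arrived, so a JSON payload split across two network chunks is carried over in the buffer rather than dropped. The sketch below isolates that behaviour as a pure function. `feed_sse_chunk` is a hypothetical helper written for illustration and is not part of the adapter; it assumes single-line `data:` payloads, as the adapter does.

```python
import json
from typing import Any, List, Tuple


def feed_sse_chunk(buffer: str, chunk: str) -> Tuple[List[Any], str]:
    """Append a chunk to the buffer; return (decoded payloads, leftover partial line)."""
    buffer += chunk
    payloads: List[Any] = []
    while "\n" in buffer:
        line, buffer = buffer.split("\n", 1)
        line = line.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, and other SSE fields
        data_str = line[5:].strip()
        if not data_str:
            continue
        try:
            payloads.append(json.loads(data_str))
        except json.JSONDecodeError:
            continue  # the adapter logs these at debug level and moves on
    return payloads, buffer


# A payload split across two chunks is decoded only after its newline arrives.
events, rest = feed_sse_chunk("", 'data: {"envelope": {"sour')
assert events == []
events, rest = feed_sse_chunk(rest, 'ce": "+15550100"}}\n\n')
assert events == [{"envelope": {"source": "+15550100"}}]
```

Keeping the parsing step free of I/O like this would also make it straightforward to unit-test without a running signal-cli daemon.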
--- a/gateway/platforms/telegram.py
+++ b/gateway/platforms/telegram.py
@@ -132,6 +132,10 @@ class TelegramAdapter(BasePlatformAdapter):
                filters.COMMAND,
                self._handle_command
            ))
+            self._app.add_handler(TelegramMessageHandler(
+                filters.LOCATION | getattr(filters, "VENUE", filters.LOCATION),
+                self._handle_location_message
+            ))
            self._app.add_handler(TelegramMessageHandler(
                filters.PHOTO | filters.VIDEO | filters.AUDIO | filters.VOICE | filters.Document.ALL | filters.Sticker.ALL,
                self._handle_media_message
@@ -546,6 +550,41 @@ class TelegramAdapter(BasePlatformAdapter):
        event = self._build_message_event(update.message, MessageType.COMMAND)
        await self.handle_message(event)
    
+    async def _handle_location_message(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
+        """Handle incoming location/venue pin messages."""
+        if not update.message:
+            return
+
+        msg = update.message
+        venue = getattr(msg, "venue", None)
+        location = getattr(venue, "location", None) if venue else getattr(msg, "location", None)
+
+        if not location:
+            return
+
+        lat = getattr(location, "latitude", None)
+        lon = getattr(location, "longitude", None)
+        if lat is None or lon is None:
+            return
+
+        # Build a text message with coordinates and context
+        parts = ["[The user shared a location pin.]"]
+        if venue:
+            title = getattr(venue, "title", None)
+            address = getattr(venue, "address", None)
+            if title:
+                parts.append(f"Venue: {title}")
+            if address:
+                parts.append(f"Address: {address}")
+        parts.append(f"latitude: {lat}")
+        parts.append(f"longitude: {lon}")
+        parts.append(f"Map: https://www.google.com/maps/search/?api=1&query={lat},{lon}")
+        parts.append("Ask what they'd like to find nearby (restaurants, cafes, etc.) and any preferences.")
+
+        event = self._build_message_event(msg, MessageType.LOCATION)
+        event.text = "\n".join(parts)
+        await self.handle_message(event)
+
    async def _handle_media_message(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
        """Handle incoming media messages, downloading images to local cache."""
        if not update.message:
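Review note on `_handle_location_message`: the handler turns a location or venue pin into plain text so the agent can read the coordinates like any other message. A minimal sketch of that text-building step follows, assuming the same field order as the handler. `build_location_text` is a hypothetical standalone function, not something the adapter defines, and the handler's trailing instruction line is left out here.

```python
from typing import Optional


def build_location_text(
    lat: float,
    lon: float,
    title: Optional[str] = None,
    address: Optional[str] = None,
) -> str:
    """Render a shared pin as newline-separated text, venue details first."""
    parts = ["[The user shared a location pin.]"]
    if title:
        parts.append(f"Venue: {title}")
    if address:
        parts.append(f"Address: {address}")
    parts.append(f"latitude: {lat}")
    parts.append(f"longitude: {lon}")
    parts.append(f"Map: https://www.google.com/maps/search/?api=1&query={lat},{lon}")
    return "\n".join(parts)


print(build_location_text(48.8584, 2.2945, title="Eiffel Tower"))
```

A plain location pin (no venue) simply omits the `Venue:` and `Address:` lines, which matches the handler's behaviour when `msg.venue` is absent.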
--- a/gateway/run.py
+++ b/gateway/run.py
@@ -118,6 +118,12 @@ if _config_path.exists():
        _tz_cfg = _cfg.get("timezone", "")
        if _tz_cfg and isinstance(_tz_cfg, str) and "HERMES_TIMEZONE" not in os.environ:
            os.environ["HERMES_TIMEZONE"] = _tz_cfg.strip()
+        # Security settings
+        _security_cfg = _cfg.get("security", {})
+        if isinstance(_security_cfg, dict):
+            _redact = _security_cfg.get("redact_secrets")
+            if _redact is not None:
+                os.environ["HERMES_REDACT_SECRETS"] = str(_redact).lower()
    except Exception:
        pass  # Non-fatal; gateway can still run with .env values

@@ -194,6 +200,7 @@ class GatewayRunner:
        self._ephemeral_system_prompt = self._load_ephemeral_system_prompt()
        self._reasoning_config = self._load_reasoning_config()
        self._provider_routing = self._load_provider_routing()
+        self._fallback_model = self._load_fallback_model()

        # Wire process registry into session store for reset protection
        from tools.process_registry import process_registry
@@ -393,6 +400,26 @@ class GatewayRunner:
            pass
        return {}

+    @staticmethod
+    def _load_fallback_model() -> dict | None:
+        """Load fallback model config from config.yaml.
+
+        Returns a dict with 'provider' and 'model' keys, or None if
+        not configured / both fields empty.
+        """
+        try:
+            import yaml as _y
+            cfg_path = _hermes_home / "config.yaml"
+            if cfg_path.exists():
+                with open(cfg_path) as _f:
+                    cfg = _y.safe_load(_f) or {}
+                fb = cfg.get("fallback_model", {}) or {}
+                if fb.get("provider") and fb.get("model"):
+                    return fb
+        except Exception:
+            pass
+        return None
+
    async def start(self) -> bool:
        """
        Start the gateway and all configured platform adapters.
@@ -591,6 +618,13 @@ class GatewayRunner:
                return None
            return SlackAdapter(config)

+        elif platform == Platform.SIGNAL:
+            from gateway.platforms.signal import SignalAdapter, check_signal_requirements
+            if not check_signal_requirements():
+                logger.warning("Signal: SIGNAL_HTTP_URL or SIGNAL_ACCOUNT not configured")
+                return None
+            return SignalAdapter(config)
+
        elif platform == Platform.HOMEASSISTANT:
            from gateway.platforms.homeassistant import HomeAssistantAdapter, check_ha_requirements
            if not check_ha_requirements():
@@ -626,12 +660,14 @@ class GatewayRunner:
            Platform.DISCORD: "DISCORD_ALLOWED_USERS",
            Platform.WHATSAPP: "WHATSAPP_ALLOWED_USERS",
            Platform.SLACK: "SLACK_ALLOWED_USERS",
+            Platform.SIGNAL: "SIGNAL_ALLOWED_USERS",
        }
        platform_allow_all_map = {
            Platform.TELEGRAM: "TELEGRAM_ALLOW_ALL_USERS",
            Platform.DISCORD: "DISCORD_ALLOW_ALL_USERS",
            Platform.WHATSAPP: "WHATSAPP_ALLOW_ALL_USERS",
            Platform.SLACK: "SLACK_ALLOW_ALL_USERS",
+            Platform.SIGNAL: "SIGNAL_ALLOW_ALL_USERS",
        }

        # Per-platform allow-all flag (e.g., DISCORD_ALLOW_ALL_USERS=true)
@@ -870,159 +906,187 @@ class GatewayRunner:
        # every new message rehydrates an oversized transcript, causing
        # repeated truncation/context failures.  Detect this early and
        # compress proactively — before the agent even starts.  (#628)
+        #
+        # Thresholds are derived from the SAME compression config the
+        # agent uses (compression.threshold × model context length) so
+        # CLI and messaging platforms behave identically.
        # -----------------------------------------------------------------
        if history and len(history) >= 4:
-            from agent.model_metadata import estimate_messages_tokens_rough
+            from agent.model_metadata import (
+                estimate_messages_tokens_rough,
+                get_model_context_length,
+            )

-            # Read thresholds from config.yaml → session_hygiene section
-            _hygiene_cfg = {}
+            # Read model + compression config from config.yaml — same
+            # source of truth the agent itself uses.
+            _hyg_model = "anthropic/claude-sonnet-4.6"
+            _hyg_threshold_pct = 0.85
+            _hyg_compression_enabled = True
            try:
                _hyg_cfg_path = _hermes_home / "config.yaml"
                if _hyg_cfg_path.exists():
                    import yaml as _hyg_yaml
                    with open(_hyg_cfg_path) as _hyg_f:
                        _hyg_data = _hyg_yaml.safe_load(_hyg_f) or {}
-                    _hygiene_cfg = _hyg_data.get("session_hygiene", {})
-                    if not isinstance(_hygiene_cfg, dict):
-                        _hygiene_cfg = {}
+
+                    # Resolve model name (same logic as run_sync)
+                    _model_cfg = _hyg_data.get("model", {})
+                    if isinstance(_model_cfg, str):
+                        _hyg_model = _model_cfg
+                    elif isinstance(_model_cfg, dict):
+                        _hyg_model = _model_cfg.get("default", _hyg_model)
+
+                    # Read compression settings
+                    _comp_cfg = _hyg_data.get("compression", {})
+                    if isinstance(_comp_cfg, dict):
+                        _hyg_threshold_pct = float(
+                            _comp_cfg.get("threshold", _hyg_threshold_pct)
+                        )
+                        _hyg_compression_enabled = str(
+                            _comp_cfg.get("enabled", True)
+                        ).lower() in ("true", "1", "yes")
            except Exception:
                pass

-            _compress_token_threshold = int(
-                _hygiene_cfg.get("auto_compress_tokens", 100_000)
-            )
-            _compress_msg_threshold = int(
-                _hygiene_cfg.get("auto_compress_messages", 200)
-            )
-            _warn_token_threshold = int(
-                _hygiene_cfg.get("warn_tokens", 200_000)
+            # Also check env overrides (same as run_agent.py)
+            _hyg_threshold_pct = float(
+                os.getenv("CONTEXT_COMPRESSION_THRESHOLD", str(_hyg_threshold_pct))
            )
+            if os.getenv("CONTEXT_COMPRESSION_ENABLED", "").lower() in ("false", "0", "no"):
+                _hyg_compression_enabled = False

-            _msg_count = len(history)
-            _approx_tokens = estimate_messages_tokens_rough(history)
-
-            _needs_compress = (
-                _approx_tokens >= _compress_token_threshold
-                or _msg_count >= _compress_msg_threshold
-            )
-
-            if _needs_compress:
-                logger.info(
-                    "Session hygiene: %s messages, ~%s tokens — auto-compressing "
-                    "(thresholds: %s msgs / %s tokens)",
-                    _msg_count, f"{_approx_tokens:,}",
-                    _compress_msg_threshold, f"{_compress_token_threshold:,}",
+            if _hyg_compression_enabled:
+                _hyg_context_length = get_model_context_length(_hyg_model)
+                _compress_token_threshold = int(
+                    _hyg_context_length * _hyg_threshold_pct
                )
+                # Warn if still huge after compression (95% of context)
+                _warn_token_threshold = int(_hyg_context_length * 0.95)
+
+                _msg_count = len(history)
+                _approx_tokens = estimate_messages_tokens_rough(history)
+
+                _needs_compress = _approx_tokens >= _compress_token_threshold
+
+                if _needs_compress:
+                    logger.info(
+                        "Session hygiene: %s messages, ~%s tokens — auto-compressing "
+                        "(threshold: %s%% of %s = %s tokens)",
+                        _msg_count, f"{_approx_tokens:,}",
+                        int(_hyg_threshold_pct * 100),
+                        f"{_hyg_context_length:,}",
+                        f"{_compress_token_threshold:,}",
+                    )
+
+                    _hyg_adapter = self.adapters.get(source.platform)
+                    if _hyg_adapter:
+                        try:
+                            await _hyg_adapter.send(
+                                source.chat_id,
+                                f"🗜️ Session is large ({_msg_count} messages, "
+                                f"~{_approx_tokens:,} tokens). Auto-compressing..."
+                            )
+                        except Exception:
+                            pass

-                _hyg_adapter = self.adapters.get(source.platform)
-                if _hyg_adapter:
                    try:
-                        await _hyg_adapter.send(
-                            source.chat_id,
-                            f"🗜️ Session is large ({_msg_count} messages, "
-                            f"~{_approx_tokens:,} tokens). Auto-compressing..."
-                        )
-                    except Exception:
-                        pass
+                        from run_agent import AIAgent

-                try:
-                    from run_agent import AIAgent
+                        _hyg_runtime = _resolve_runtime_agent_kwargs()
+                        if _hyg_runtime.get("api_key"):
+                            _hyg_msgs = [
+                                {"role": m.get("role"), "content": m.get("content")}
+                                for m in history
+                                if m.get("role") in ("user", "assistant")
+                                and m.get("content")
+                            ]

-                    _hyg_runtime = _resolve_runtime_agent_kwargs()
-                    if _hyg_runtime.get("api_key"):
-                        _hyg_msgs = [
-                            {"role": m.get("role"), "content": m.get("content")}
-                            for m in history
-                            if m.get("role") in ("user", "assistant")
-                            and m.get("content")
-                        ]
-
-                        if len(_hyg_msgs) >= 4:
-                            _hyg_agent = AIAgent(
-                                **_hyg_runtime,
-                                max_iterations=4,
-                                quiet_mode=True,
-                                enabled_toolsets=["memory"],
-                                session_id=session_entry.session_id,
-                            )
-
-                            loop = asyncio.get_event_loop()
-                            _compressed, _ = await loop.run_in_executor(
-                                None,
-                                lambda: _hyg_agent._compress_context(
-                                    _hyg_msgs, "",
-                                    approx_tokens=_approx_tokens,
-                                ),
-                            )
-
-                            self.session_store.rewrite_transcript(
-                                session_entry.session_id, _compressed
-                            )
-                            history = _compressed
-                            _new_count = len(_compressed)
-                            _new_tokens = estimate_messages_tokens_rough(
-                                _compressed
-                            )
-
-                            logger.info(
-                                "Session hygiene: compressed %s → %s msgs, "
-                                "~%s → ~%s tokens",
-                                _msg_count, _new_count,
-                                f"{_approx_tokens:,}", f"{_new_tokens:,}",
-                            )
-
-                            if _hyg_adapter:
-                                try:
-                                    await _hyg_adapter.send(
-                                        source.chat_id,
-                                        f"🗜️ Compressed: {_msg_count} → "
-                                        f"{_new_count} messages, "
-                                        f"~{_approx_tokens:,} → "
-                                        f"~{_new_tokens:,} tokens"
-                                    )
-                                except Exception:
-                                    pass
-
-                            # Still too large after compression — warn user
-                            if _new_tokens >= _warn_token_threshold:
-                                logger.warning(
-                                    "Session hygiene: still ~%s tokens after "
-                                    "compression — suggesting /reset",
-                                    f"{_new_tokens:,}",
+                            if len(_hyg_msgs) >= 4:
+                                _hyg_agent = AIAgent(
+                                    **_hyg_runtime,
+                                    max_iterations=4,
+                                    quiet_mode=True,
+                                    enabled_toolsets=["memory"],
+                                    session_id=session_entry.session_id,
                                )
+
+                                loop = asyncio.get_running_loop()
+                                _compressed, _ = await loop.run_in_executor(
+                                    None,
+                                    lambda: _hyg_agent._compress_context(
+                                        _hyg_msgs, "",
+                                        approx_tokens=_approx_tokens,
+                                    ),
+                                )
+
+                                self.session_store.rewrite_transcript(
+                                    session_entry.session_id, _compressed
+                                )
+                                history = _compressed
+                                _new_count = len(_compressed)
+                                _new_tokens = estimate_messages_tokens_rough(
+                                    _compressed
+                                )
+
+                                logger.info(
+                                    "Session hygiene: compressed %s → %s msgs, "
+                                    "~%s → ~%s tokens",
+                                    _msg_count, _new_count,
+                                    f"{_approx_tokens:,}", f"{_new_tokens:,}",
+                                )
+
                                if _hyg_adapter:
                                    try:
                                        await _hyg_adapter.send(
                                            source.chat_id,
-                                            "⚠️ Session is still very large "
-                                            "after compression "
-                                            f"(~{_new_tokens:,} tokens). "
-                                            "Consider using /reset to start "
-                                            "fresh if you experience issues."
+                                            f"🗜️ Compressed: {_msg_count} → "
+                                            f"{_new_count} messages, "
+                                            f"~{_approx_tokens:,} → "
+                                            f"~{_new_tokens:,} tokens"
                                        )
                                    except Exception:
                                        pass

-                except Exception as e:
-                    logger.warning(
-                        "Session hygiene auto-compress failed: %s", e
-                    )
-                    # Compression failed and session is dangerously large
-                    if _approx_tokens >= _warn_token_threshold:
-                        _hyg_adapter = self.adapters.get(source.platform)
-                        if _hyg_adapter:
-                            try:
-                                await _hyg_adapter.send(
-                                    source.chat_id,
-                                    f"⚠️ Session is very large "
-                                    f"({_msg_count} messages, "
-                                    f"~{_approx_tokens:,} tokens) and "
-                                    "auto-compression failed. Consider "
-                                    "using /compress or /reset to avoid "
-                                    "issues."
-                                )
-                            except Exception:
-                                pass
+                                # Still too large after compression — warn user
+                                if _new_tokens >= _warn_token_threshold:
+                                    logger.warning(
+                                        "Session hygiene: still ~%s tokens after "
+                                        "compression — suggesting /reset",
+                                        f"{_new_tokens:,}",
+                                    )
+                                    if _hyg_adapter:
+                                        try:
+                                            await _hyg_adapter.send(
+                                                source.chat_id,
+                                                "⚠️ Session is still very large "
+                                                "after compression "
+                                                f"(~{_new_tokens:,} tokens). "
+                                                "Consider using /reset to start "
+                                                "fresh if you experience issues."
+                                            )
+                                        except Exception:
+                                            pass
+
+                    except Exception as e:
+                        logger.warning(
+                            "Session hygiene auto-compress failed: %s", e
+                        )
+                        # Compression failed and session is dangerously large
+                        if _approx_tokens >= _warn_token_threshold:
+                            _hyg_adapter = self.adapters.get(source.platform)
+                            if _hyg_adapter:
+                                try:
+                                    await _hyg_adapter.send(
+                                        source.chat_id,
+                                        f"⚠️ Session is very large "
+                                        f"({_msg_count} messages, "
+                                        f"~{_approx_tokens:,} tokens) and "
+                                        "auto-compression failed. Consider "
+                                        "using /compress or /reset to avoid "
+                                        "issues."
+                                    )
+                                except Exception:
+                                    pass

        # First-message onboarding -- only on the very first interaction ever
        if not history and not self.session_store.has_any_sessions():
@@ -1385,6 +1449,11 @@ class GatewayRunner:
            except Exception:
                current_provider = "openrouter"

+        # Detect custom endpoint: provider resolved to openrouter but a custom
+        # base URL is configured — the user set up a custom endpoint.
+        if current_provider == "openrouter" and os.getenv("OPENAI_BASE_URL", "").strip():
+            current_provider = "custom"
+
        if not args:
            provider_label = _PROVIDER_LABELS.get(current_provider, current_provider)
            lines = [
@@ -1511,6 +1580,10 @@ class GatewayRunner:
            except Exception:
                current_provider = "openrouter"

+        # Detect custom endpoint
+        if current_provider == "openrouter" and os.getenv("OPENAI_BASE_URL", "").strip():
+            current_provider = "custom"
+
        current_label = _PROVIDER_LABELS.get(current_provider, current_provider)

        lines = [
@@ -2623,6 +2696,7 @@ class GatewayRunner:
                platform=platform_key,
                honcho_session_key=session_key,
                session_db=self._session_db,
+                fallback_model=self._fallback_model,
            )
            
            # Store agent reference for interrupt support
--- a/gateway/session.py
+++ b/gateway/session.py
@@ -45,6 +45,8 @@ class SessionSource:
    user_name: Optional[str] = None
    thread_id: Optional[str] = None  # For forum topics, Discord threads, etc.
    chat_topic: Optional[str] = None  # Channel topic/description (Discord, Slack)
+    user_id_alt: Optional[str] = None  # Signal UUID (alternative to phone number)
+    chat_id_alt: Optional[str] = None  # Signal group internal ID
    
    @property
    def description(self) -> str:
@@ -68,7 +70,7 @@ class SessionSource:
        return ", ".join(parts)
    
    def to_dict(self) -> Dict[str, Any]:
-        return {
+        d = {
            "platform": self.platform.value,
            "chat_id": self.chat_id,
            "chat_name": self.chat_name,
@@ -78,6 +80,11 @@ class SessionSource:
            "thread_id": self.thread_id,
            "chat_topic": self.chat_topic,
        }
+        if self.user_id_alt:
+            d["user_id_alt"] = self.user_id_alt
+        if self.chat_id_alt:
+            d["chat_id_alt"] = self.chat_id_alt
+        return d
    
    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "SessionSource":
@@ -90,6 +97,8 @@ class SessionSource:
            user_name=data.get("user_name"),
            thread_id=data.get("thread_id"),
            chat_topic=data.get("chat_topic"),
+            user_id_alt=data.get("user_id_alt"),
+            chat_id_alt=data.get("chat_id_alt"),
        )
    
    @classmethod
@@ -333,7 +342,7 @@ class SessionStore:
        
        if sessions_file.exists():
            try:
-                with open(sessions_file, "r") as f:
+                with open(sessions_file, "r", encoding="utf-8") as f:
                    data = json.load(f)
                    for key, entry_data in data.items():
                        self._entries[key] = SessionEntry.from_dict(entry_data)
@@ -348,7 +357,7 @@ class SessionStore:
        sessions_file = self.sessions_dir / "sessions.json"
        
        data = {key: entry.to_dict() for key, entry in self._entries.items()}
-        with open(sessions_file, "w") as f:
+        with open(sessions_file, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2)
    
    def _generate_session_key(self, source: SessionSource) -> str:
@@ -672,7 +681,7 @@ class SessionStore:
        
        # Also write legacy JSONL (keeps existing tooling working during transition)
        transcript_path = self.get_transcript_path(session_id)
-        with open(transcript_path, "a") as f:
+        with open(transcript_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(message, ensure_ascii=False) + "\n")
    
    def rewrite_transcript(self, session_id: str, messages: List[Dict[str, Any]]) -> None:
@@ -699,7 +708,7 @@ class SessionStore:
        
        # JSONL: overwrite the file
        transcript_path = self.get_transcript_path(session_id)
-        with open(transcript_path, "w") as f:
+        with open(transcript_path, "w", encoding="utf-8") as f:
            for msg in messages:
                f.write(json.dumps(msg, ensure_ascii=False) + "\n")

@@ -721,7 +730,7 @@ class SessionStore:
            return []
        
        messages = []
-        with open(transcript_path, "r") as f:
+        with open(transcript_path, "r", encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if line:
--- a/hermes_cli/config.py
+++ b/hermes_cli/config.py
@@ -81,6 +81,7 @@ DEFAULT_CONFIG = {
    
    "browser": {
        "inactivity_timeout": 120,
+        "record_sessions": False,  # Auto-record browser sessions as WebM videos
    },
    
    "compression": {
@@ -438,7 +439,7 @@ OPTIONAL_ENV_VARS = {
        "category": "setting",
    },
    "HERMES_MAX_ITERATIONS": {
-        "description": "Maximum tool-calling iterations per conversation (default: 60)",
+        "description": "Maximum tool-calling iterations per conversation (default: 90)",
        "prompt": "Max iterations",
        "url": None,
        "password": False,
@@ -758,6 +759,36 @@ def load_config() -> Dict[str, Any]:
    return config


+_COMMENTED_SECTIONS = """
+# ── Security ──────────────────────────────────────────────────────────
+# API keys, tokens, and passwords are redacted from tool output by default.
+# Set to false to see full values (useful for debugging auth issues).
+#
+# security:
+#   redact_secrets: false
+
+# ── Fallback Model ────────────────────────────────────────────────────
+# Automatic provider failover when primary is unavailable.
+# Uncomment and configure to enable. Triggers on rate limits (429),
+# overload (529), service errors (503), or connection failures.
+#
+# Supported providers:
+#   openrouter   (OPENROUTER_API_KEY)  — routes to any model
+#   openai-codex (OAuth — hermes login) — OpenAI Codex
+#   nous         (OAuth — hermes login) — Nous Portal
+#   zai          (ZAI_API_KEY)         — Z.AI / GLM
+#   kimi-coding  (KIMI_API_KEY)        — Kimi / Moonshot
+#   minimax      (MINIMAX_API_KEY)     — MiniMax
+#   minimax-cn   (MINIMAX_CN_API_KEY)  — MiniMax (China)
+#
+# For custom OpenAI-compatible endpoints, add base_url and api_key_env.
+#
+# fallback_model:
+#   provider: openrouter
+#   model: anthropic/claude-sonnet-4
+"""
+
+
 def save_config(config: Dict[str, Any]):
    """Save configuration to ~/.hermes/config.yaml."""
    ensure_hermes_home()
@@ -765,6 +796,18 @@ def save_config(config: Dict[str, Any]):
    
    with open(config_path, 'w') as f:
        yaml.dump(config, f, default_flow_style=False, sort_keys=False)
+        # Append the commented-out template for features that are off by
+        # default or only relevant when explicitly configured. The whole
+        # template is written unless both sections are already configured.
+        sections = []
+        sec = config.get("security", {})
+        if not sec or sec.get("redact_secrets") is None:
+            sections.append("security")
+        fb = config.get("fallback_model", {})
+        if not fb or not (fb.get("provider") and fb.get("model")):
+            sections.append("fallback")
+        if sections:
+            f.write(_COMMENTED_SECTIONS)


 def load_env() -> Dict[str, str]:
@@ -1010,7 +1053,7 @@ def set_config_value(key: str, value: str):
        'FAL_KEY', 'TELEGRAM_BOT_TOKEN', 'DISCORD_BOT_TOKEN',
        'TERMINAL_SSH_HOST', 'TERMINAL_SSH_USER', 'TERMINAL_SSH_KEY',
        'SUDO_PASSWORD', 'SLACK_BOT_TOKEN', 'SLACK_APP_TOKEN',
-        'GITHUB_TOKEN', 'HONCHO_API_KEY', 'NOUS_API_KEY', 'WANDB_API_KEY',
+        'GITHUB_TOKEN', 'HONCHO_API_KEY', 'WANDB_API_KEY',
        'TINKER_API_KEY',
    ]
    
--- a/hermes_cli/gateway.py
+++ b/hermes_cli/gateway.py
@@ -507,6 +507,12 @@ _PLATFORMS = [
        "emoji": "📲",
        "token_var": "WHATSAPP_ENABLED",
    },
+    {
+        "key": "signal",
+        "label": "Signal",
+        "emoji": "📡",
+        "token_var": "SIGNAL_HTTP_URL",
+    },
 ]


@@ -525,6 +531,13 @@ def _platform_status(platform: dict) -> str:
                return "configured + paired"
            return "enabled, not paired"
        return "not configured"
+    if platform.get("key") == "signal":
+        account = get_env_value("SIGNAL_ACCOUNT")
+        if val and account:
+            return "configured"
+        if val or account:
+            return "partially configured"
+        return "not configured"
    if val:
        return "configured"
    return "not configured"
@@ -650,6 +663,121 @@ def _is_service_running() -> bool:
    return len(find_gateway_pids()) > 0


+def _setup_signal():
+    """Interactive setup for Signal messenger."""
+    import shutil
+
+    print()
+    print(color("  ─── 📡 Signal Setup ───", Colors.CYAN))
+
+    existing_url = get_env_value("SIGNAL_HTTP_URL")
+    existing_account = get_env_value("SIGNAL_ACCOUNT")
+    if existing_url and existing_account:
+        print()
+        print_success("Signal is already configured.")
+        if not prompt_yes_no("  Reconfigure Signal?", False):
+            return
+
+    # Check if signal-cli is available
+    print()
+    if shutil.which("signal-cli"):
+        print_success("signal-cli found on PATH.")
+    else:
+        print_warning("signal-cli not found on PATH.")
+        print_info("  Signal requires signal-cli running as an HTTP daemon.")
+        print_info("  Install options:")
+        print_info("    Linux:  sudo apt install signal-cli")
+        print_info("            or download from https://github.com/AsamK/signal-cli")
+        print_info("    macOS:  brew install signal-cli")
+        print_info("    Docker: bbernhard/signal-cli-rest-api")
+        print()
+        print_info("  After installing, link your account and start the daemon:")
+        print_info("    signal-cli link -n \"HermesAgent\"")
+        print_info("    signal-cli --account +YOURNUMBER daemon --http 127.0.0.1:8080")
+        print()
+
+    # HTTP URL
+    print()
+    print_info("  Enter the URL where signal-cli HTTP daemon is running.")
+    default_url = existing_url or "http://127.0.0.1:8080"
+    try:
+        url = input(f"  HTTP URL [{default_url}]: ").strip() or default_url
+    except (EOFError, KeyboardInterrupt):
+        print("\n  Setup cancelled.")
+        return
+
+    # Test connectivity
+    print_info("  Testing connection...")
+    try:
+        import httpx
+        resp = httpx.get(f"{url.rstrip('/')}/api/v1/check", timeout=10.0)
+        if resp.status_code == 200:
+            print_success("  signal-cli daemon is reachable!")
+        else:
+            print_warning(f"  signal-cli responded with status {resp.status_code}.")
+            if not prompt_yes_no("  Continue anyway?", False):
+                return
+    except Exception as e:
+        print_warning(f"  Could not reach signal-cli at {url}: {e}")
+        if not prompt_yes_no("  Save this URL anyway? (you can start signal-cli later)", True):
+            return
+
+    save_env_value("SIGNAL_HTTP_URL", url)
+
+    # Account phone number
+    print()
+    print_info("  Enter your Signal account phone number in E.164 format.")
+    print_info("  Example: +15551234567")
+    default_account = existing_account or ""
+    try:
+        account = input(f"  Account number{f' [{default_account}]' if default_account else ''}: ").strip()
+        if not account:
+            account = default_account
+    except (EOFError, KeyboardInterrupt):
+        print("\n  Setup cancelled.")
+        return
+
+    if not account:
+        print_error("  Account number is required.")
+        return
+
+    save_env_value("SIGNAL_ACCOUNT", account)
+
+    # Allowed users
+    print()
+    print_info("  The gateway DENIES all users by default for security.")
+    print_info("  Enter phone numbers or UUIDs of allowed users (comma-separated).")
+    existing_allowed = get_env_value("SIGNAL_ALLOWED_USERS") or ""
+    default_allowed = existing_allowed or account
+    try:
+        allowed = input(f"  Allowed users [{default_allowed}]: ").strip() or default_allowed
+    except (EOFError, KeyboardInterrupt):
+        print("\n  Setup cancelled.")
+        return
+
+    save_env_value("SIGNAL_ALLOWED_USERS", allowed)
+
+    # Group messaging
+    print()
+    if prompt_yes_no("  Enable group messaging? (disabled by default for security)", False):
+        print()
+        print_info("  Enter group IDs to allow, or * for all groups.")
+        existing_groups = get_env_value("SIGNAL_GROUP_ALLOWED_USERS") or ""
+        try:
+            groups = input(f"  Group IDs [{existing_groups or '*'}]: ").strip() or existing_groups or "*"
+        except (EOFError, KeyboardInterrupt):
+            print("\n  Setup cancelled.")
+            return
+        save_env_value("SIGNAL_GROUP_ALLOWED_USERS", groups)
+
+    print()
+    print_success("Signal configured!")
+    print_info(f"  URL: {url}")
+    print_info(f"  Account: {account}")
+    print_info("  DM auth: via SIGNAL_ALLOWED_USERS + DM pairing")
+    print_info(f"  Groups: {'enabled' if get_env_value('SIGNAL_GROUP_ALLOWED_USERS') else 'disabled'}")
+
+
 def gateway_setup():
    """Interactive setup for messaging platforms + gateway service."""

@@ -702,6 +830,8 @@ def gateway_setup():

        if platform["key"] == "whatsapp":
            _setup_whatsapp()
+        elif platform["key"] == "signal":
+            _setup_signal()
        else:
            _setup_standard_platform(platform)

--- a/hermes_cli/main.py
+++ b/hermes_cli/main.py
@@ -761,9 +761,39 @@ def cmd_model(args):
        ("kimi-coding", "Kimi / Moonshot (Moonshot AI direct API)"),
        ("minimax", "MiniMax (global direct API)"),
        ("minimax-cn", "MiniMax China (domestic direct API)"),
-        ("custom", "Custom endpoint (self-hosted / VLLM / etc.)"),
    ]

+    # Add user-defined custom providers from config.yaml
+    custom_providers_cfg = config.get("custom_providers") or []
+    _custom_provider_map = {}  # key → {name, base_url, api_key, model}
+    if isinstance(custom_providers_cfg, list):
+        for entry in custom_providers_cfg:
+            if not isinstance(entry, dict):
+                continue
+            name = entry.get("name", "").strip()
+            base_url = entry.get("base_url", "").strip()
+            if not name or not base_url:
+                continue
+            # Generate a stable key from the name
+            key = "custom:" + name.lower().replace(" ", "-")
+            short_url = base_url.replace("https://", "").replace("http://", "").rstrip("/")
+            saved_model = entry.get("model", "")
+            model_hint = f" — {saved_model}" if saved_model else ""
+            providers.append((key, f"{name} ({short_url}){model_hint}"))
+            _custom_provider_map[key] = {
+                "name": name,
+                "base_url": base_url,
+                "api_key": entry.get("api_key", ""),
+                "model": saved_model,
+            }
+
+    # Always add the manual custom endpoint option last
+    providers.append(("custom", "Custom endpoint (enter URL manually)"))
+
+    # Add removal option if there are saved custom providers
+    if _custom_provider_map:
+        providers.append(("remove-custom", "Remove a saved custom provider"))
+
    # Reorder so the active provider is at the top
    known_keys = {k for k, _ in providers}
    active_key = active if active in known_keys else "custom"
@@ -791,6 +821,10 @@ def cmd_model(args):
        _model_flow_openai_codex(config, current_model)
    elif selected_provider == "custom":
        _model_flow_custom(config)
+    elif selected_provider.startswith("custom:") and selected_provider in _custom_provider_map:
+        _model_flow_named_custom(config, _custom_provider_map[selected_provider])
+    elif selected_provider == "remove-custom":
+        _remove_custom_provider(config)
    elif selected_provider in ("zai", "kimi-coding", "minimax", "minimax-cn"):
        _model_flow_api_key_provider(config, selected_provider, current_model)

@@ -1006,7 +1040,11 @@ def _model_flow_openai_codex(config, current_model=""):


 def _model_flow_custom(config):
-    """Custom endpoint: collect URL, API key, and model name."""
+    """Custom endpoint: collect URL, API key, and model name.
+
+    Automatically saves the endpoint to ``custom_providers`` in config.yaml
+    so it appears in the provider menu on subsequent runs.
+    """
    from hermes_cli.auth import _save_model_choice, deactivate_provider
    from hermes_cli.config import get_env_value, save_env_value, load_config, save_config

@@ -1038,6 +1076,8 @@ def _model_flow_custom(config):
        print(f"Invalid URL: {effective_url} (must start with http:// or https://)")
        return

+    effective_key = api_key or current_key
+
    if base_url:
        save_env_value("OPENAI_BASE_URL", base_url)
    if api_key:
@@ -1050,7 +1090,7 @@ def _model_flow_custom(config):
        cfg = load_config()
        model = cfg.get("model")
        if isinstance(model, dict):
-            model["provider"] = "auto"
+            model["provider"] = "custom"
            model["base_url"] = effective_url
        save_config(cfg)
        deactivate_provider()
@@ -1061,6 +1101,223 @@ def _model_flow_custom(config):
            deactivate_provider()
        print("Endpoint saved. Use `/model` in chat or `hermes model` to set a model.")

+    # Auto-save to custom_providers so it appears in the menu next time
+    _save_custom_provider(effective_url, effective_key, model_name or "")
+
+
+def _save_custom_provider(base_url, api_key="", model=""):
+    """Save a custom endpoint to custom_providers in config.yaml.
+
+    Deduplicates by base_url — if the URL already exists, updates the
+    model name but doesn't add a duplicate entry.
+    Auto-generates a display name from the URL hostname.
+    """
+    from hermes_cli.config import load_config, save_config
+
+    cfg = load_config()
+    providers = cfg.get("custom_providers") or []
+    if not isinstance(providers, list):
+        providers = []
+
+    # Check if this URL is already saved — update model if so
+    for entry in providers:
+        if isinstance(entry, dict) and entry.get("base_url", "").rstrip("/") == base_url.rstrip("/"):
+            if model and entry.get("model") != model:
+                entry["model"] = model
+                cfg["custom_providers"] = providers
+                save_config(cfg)
+            return  # already saved, updated model if needed
+
+    # Auto-generate a name from the URL
+    import re
+    clean = base_url.replace("https://", "").replace("http://", "").rstrip("/")
+    # Remove /v1 suffix for cleaner names
+    clean = re.sub(r"/v1/?$", "", clean)
+    # Use hostname:port as the name
+    name = clean.split("/")[0]
+    # Label local and RunPod hosts; otherwise capitalize for readability
+    if "localhost" in name or "127.0.0.1" in name:
+        name = f"Local ({name})"
+    elif "runpod" in name.lower():
+        name = f"RunPod ({name})"
+    else:
+        name = name.capitalize()
+
+    entry = {"name": name, "base_url": base_url}
+    if api_key:
+        entry["api_key"] = api_key
+    if model:
+        entry["model"] = model
+
+    providers.append(entry)
+    cfg["custom_providers"] = providers
+    save_config(cfg)
+    print(f"  💾 Saved to custom providers as \"{name}\" (edit in config.yaml)")
+
+
+def _remove_custom_provider(config):
+    """Let the user remove a saved custom provider from config.yaml."""
+    from hermes_cli.config import load_config, save_config
+
+    cfg = load_config()
+    providers = cfg.get("custom_providers") or []
+    if not isinstance(providers, list) or not providers:
+        print("No custom providers configured.")
+        return
+
+    print("Remove a custom provider:\n")
+
+    choices = []
+    for entry in providers:
+        if isinstance(entry, dict):
+            name = entry.get("name", "unnamed")
+            url = entry.get("base_url", "")
+            short_url = url.replace("https://", "").replace("http://", "").rstrip("/")
+            choices.append(f"{name} ({short_url})")
+        else:
+            choices.append(str(entry))
+    choices.append("Cancel")
+
+    try:
+        from simple_term_menu import TerminalMenu
+        menu = TerminalMenu(
+            [f"  {c}" for c in choices], cursor_index=0,
+            menu_cursor="-> ", menu_cursor_style=("fg_red", "bold"),
+            menu_highlight_style=("fg_red",),
+            cycle_cursor=True, clear_screen=False,
+            title="Select provider to remove:",
+        )
+        idx = menu.show()
+        print()
+    except (ImportError, NotImplementedError):
+        for i, c in enumerate(choices, 1):
+            print(f"  {i}. {c}")
+        print()
+        try:
+            val = input(f"Choice [1-{len(choices)}]: ").strip()
+            idx = int(val) - 1 if val else None
+        except (ValueError, KeyboardInterrupt, EOFError):
+            idx = None
+
+    if idx is None or idx < 0 or idx >= len(providers):
+        print("No change.")
+        return
+
+    removed = providers.pop(idx)
+    cfg["custom_providers"] = providers
+    save_config(cfg)
+    removed_name = removed.get("name", "unnamed") if isinstance(removed, dict) else str(removed)
+    print(f"✅ Removed \"{removed_name}\" from custom providers.")
+
+
+def _model_flow_named_custom(config, provider_info):
+    """Handle a named custom provider from config.yaml custom_providers list.
+
+    If the entry has a saved model name, activates it immediately.
+    Otherwise probes the endpoint's /models API to let the user pick one.
+    """
+    from hermes_cli.auth import _save_model_choice, deactivate_provider
+    from hermes_cli.config import save_env_value, load_config, save_config
+    from hermes_cli.models import fetch_api_models
+
+    name = provider_info["name"]
+    base_url = provider_info["base_url"]
+    api_key = provider_info.get("api_key", "")
+    saved_model = provider_info.get("model", "")
+
+    # If a model is saved, just activate immediately — no probing needed
+    if saved_model:
+        save_env_value("OPENAI_BASE_URL", base_url)
+        if api_key:
+            save_env_value("OPENAI_API_KEY", api_key)
+        _save_model_choice(saved_model)
+
+        cfg = load_config()
+        model = cfg.get("model")
+        if isinstance(model, dict):
+            model["provider"] = "custom"
+            model["base_url"] = base_url
+        save_config(cfg)
+        deactivate_provider()
+
+        print(f"✅ Switched to: {saved_model}")
+        print(f"   Provider: {name} ({base_url})")
+        return
+
+    # No saved model — probe endpoint and let user pick
+    print(f"  Provider: {name}")
+    print(f"  URL:      {base_url}")
+    print()
+    print("No model saved for this provider. Fetching available models...")
+    models = fetch_api_models(api_key, base_url, timeout=8.0)
+
+    if models:
+        print(f"Found {len(models)} model(s):\n")
+        try:
+            from simple_term_menu import TerminalMenu
+            menu_items = [f"  {m}" for m in models] + ["  Cancel"]
+            menu = TerminalMenu(
+                menu_items, cursor_index=0,
+                menu_cursor="-> ", menu_cursor_style=("fg_green", "bold"),
+                menu_highlight_style=("fg_green",),
+                cycle_cursor=True, clear_screen=False,
+                title=f"Select model from {name}:",
+            )
+            idx = menu.show()
+            print()
+            if idx is None or idx >= len(models):
+                print("Cancelled.")
+                return
+            model_name = models[idx]
+        except (ImportError, NotImplementedError):
+            for i, m in enumerate(models, 1):
+                print(f"  {i}. {m}")
+            print(f"  {len(models) + 1}. Cancel")
+            print()
+            try:
+                val = input(f"Choice [1-{len(models) + 1}]: ").strip()
+                if not val:
+                    print("Cancelled.")
+                    return
+                idx = int(val) - 1
+                if idx < 0 or idx >= len(models):
+                    print("Cancelled.")
+                    return
+                model_name = models[idx]
+            except (ValueError, KeyboardInterrupt, EOFError):
+                print("\nCancelled.")
+                return
+    else:
+        print("Could not fetch models from endpoint. Enter model name manually.")
+        try:
+            model_name = input("Model name: ").strip()
+        except (KeyboardInterrupt, EOFError):
+            print("\nCancelled.")
+            return
+        if not model_name:
+            print("No model specified. Cancelled.")
+            return
+
+    # Activate and save the model to the custom_providers entry
+    save_env_value("OPENAI_BASE_URL", base_url)
+    if api_key:
+        save_env_value("OPENAI_API_KEY", api_key)
+    _save_model_choice(model_name)
+
+    cfg = load_config()
+    model = cfg.get("model")
+    if isinstance(model, dict):
+        model["provider"] = "custom"
+        model["base_url"] = base_url
+    save_config(cfg)
+    deactivate_provider()
+
+    # Save model name to the custom_providers entry for next time
+    _save_custom_provider(base_url, api_key, model_name)
+
+    print(f"\n✅ Model set to: {model_name}")
+    print(f"   Provider: {name} ({base_url})")
+

 # Curated model lists for direct API-key providers
 _PROVIDER_MODELS = {
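Review note: the numbered-prompt fallbacks in the hunk above each convert a 1-based answer to a 0-based index by hand. A minimal sketch of a shared helper that does this conversion, assuming a hypothetical name `parse_menu_choice` that is not part of this PR:

```python
from typing import Optional


def parse_menu_choice(raw: str, n_items: int) -> Optional[int]:
    """Convert a 1-based menu answer to a 0-based index, or None if invalid.

    Hypothetical helper for illustration; not part of hermes_cli.
    """
    raw = raw.strip()
    if not raw:
        return None
    try:
        idx = int(raw) - 1
    except ValueError:
        return None
    # Reject 0 and negatives explicitly: a negative index would otherwise
    # wrap around and silently select an item from the end of the list.
    if idx < 0 or idx >= n_items:
        return None
    return idx
```

With a helper like this, both the "remove provider" prompt and the model picker could treat `None` as "cancelled" and would not need their own range checks.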
--- a/hermes_cli/models.py
+++ b/hermes_cli/models.py
@@ -63,7 +63,7 @@ _PROVIDER_LABELS = {
    "kimi-coding": "Kimi / Moonshot",
    "minimax": "MiniMax",
    "minimax-cn": "MiniMax (China)",
-    "custom": "custom endpoint",
+    "custom": "Custom endpoint",
 }

 _PROVIDER_ALIASES = {
--- a/hermes_cli/setup.py
+++ b/hermes_cli/setup.py
@@ -632,6 +632,29 @@ def setup_model_provider(config: dict):
            save_env_value("OPENAI_BASE_URL", "")
            save_env_value("OPENAI_API_KEY", "")

+        # Update config.yaml and deactivate any OAuth provider so the
+        # resolver doesn't keep returning the old provider (e.g. Codex).
+        try:
+            from hermes_cli.auth import deactivate_provider
+            deactivate_provider()
+        except Exception:
+            pass
+        import yaml
+        config_path = Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes")) / "config.yaml"
+        try:
+            disk_cfg = {}
+            if config_path.exists():
+                disk_cfg = yaml.safe_load(config_path.read_text()) or {}
+            model_section = disk_cfg.get("model", {})
+            if isinstance(model_section, str):
+                model_section = {"default": model_section}
+            model_section["provider"] = "openrouter"
+            model_section.pop("base_url", None)  # OpenRouter uses default URL
+            disk_cfg["model"] = model_section
+            config_path.write_text(yaml.safe_dump(disk_cfg, sort_keys=False))
+        except Exception as e:
+            logger.debug("Could not save provider to config.yaml: %s", e)
+
    elif provider_idx == 3:  # Custom endpoint
        selected_provider = "custom"
        print()
@@ -659,6 +682,28 @@ def setup_model_provider(config: dict):
        if model_name:
            config['model'] = model_name
            save_env_value("LLM_MODEL", model_name)
+
+        # Save provider and base_url to config.yaml so the gateway and CLI
+        # both resolve the correct provider without relying on env-var heuristics.
+        if base_url:
+            import yaml
+            config_path = Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes")) / "config.yaml"
+            try:
+                disk_cfg = {}
+                if config_path.exists():
+                    disk_cfg = yaml.safe_load(config_path.read_text()) or {}
+                model_section = disk_cfg.get("model", {})
+                if isinstance(model_section, str):
+                    model_section = {"default": model_section}
+                model_section["provider"] = "custom"
+                model_section["base_url"] = base_url.rstrip("/")
+                if model_name:
+                    model_section["default"] = model_name
+                disk_cfg["model"] = model_section
+                config_path.write_text(yaml.safe_dump(disk_cfg, sort_keys=False))
+            except Exception as e:
+                logger.debug("Could not save provider to config.yaml: %s", e)
+
        print_success("Custom endpoint configured")

    elif provider_idx == 4:  # Z.AI / GLM
@@ -1264,7 +1309,7 @@ def setup_agent_settings(config: dict):
    # ── Max Iterations ──
    print_header("Agent Settings")

-    current_max = get_env_value('HERMES_MAX_ITERATIONS') or '60'
+    current_max = get_env_value('HERMES_MAX_ITERATIONS') or '90'
    print_info("Maximum tool-calling iterations per conversation.")
    print_info("Higher = more complex tasks, but costs more tokens.")
    print_info("Recommended: 30-60 for most tasks, 100+ for open exploration.")
@@ -1660,14 +1705,18 @@ def setup_gateway(config: dict):
 # Section 5: Tool Configuration (delegates to unified tools_config.py)
 # =============================================================================

-def setup_tools(config: dict):
+def setup_tools(config: dict, first_install: bool = False):
    """Configure tools — delegates to the unified tools_command() in tools_config.py.
    
    Both `hermes setup tools` and `hermes tools` use the same flow:
    platform selection → toolset toggles → provider/API key configuration.
+    
+    Args:
+        first_install: When True, uses the simplified first-install flow
+            (no platform menu, prompts for all unconfigured API keys).
    """
    from hermes_cli.tools_config import tools_command
-    tools_command()
+    tools_command(first_install=first_install, config=config)


 # =============================================================================
@@ -1820,7 +1869,7 @@ def run_setup_wizard(args):
    setup_gateway(config)

    # Section 5: Tools
-    setup_tools(config)
+    setup_tools(config, first_install=not is_existing)

    # Save and show summary
    save_config(config)
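Review note: the two `setup_model_provider` hunks above repeat the same "load the model section, promote a bare string to a dict, set the provider" steps. A minimal sketch of that logic as one function operating on an already-loaded dict. The name `set_model_provider` is hypothetical, and the `or {}` guard against an empty `model:` key is an addition that the PR code does not have:

```python
from typing import Any, Dict, Optional


def set_model_provider(
    disk_cfg: Dict[str, Any],
    provider: str,
    base_url: Optional[str] = None,
    default_model: Optional[str] = None,
) -> Dict[str, Any]:
    """Update disk_cfg["model"] in place and return the model section.

    Hypothetical helper for illustration; not part of hermes_cli.
    """
    # `or {}` also covers a present-but-empty key (None), unlike .get(key, {}).
    model_section = disk_cfg.get("model") or {}
    # The model may be stored as a bare string; promote it to a dict.
    if isinstance(model_section, str):
        model_section = {"default": model_section}
    model_section["provider"] = provider
    if base_url:
        model_section["base_url"] = base_url.rstrip("/")
    else:
        # No explicit URL: drop any stale one so the provider default applies.
        model_section.pop("base_url", None)
    if default_model:
        model_section["default"] = default_model
    disk_cfg["model"] = model_section
    return model_section
```

Under these assumptions the OpenRouter branch would call `set_model_provider(disk_cfg, "openrouter")` and the custom-endpoint branch `set_model_provider(disk_cfg, "custom", base_url, model_name)`, leaving only the YAML read and write at the call sites.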
--- a/hermes_cli/status.py
+++ b/hermes_cli/status.py
@@ -206,6 +206,8 @@ def show_status(args):
        "Telegram": ("TELEGRAM_BOT_TOKEN", "TELEGRAM_HOME_CHANNEL"),
        "Discord": ("DISCORD_BOT_TOKEN", "DISCORD_HOME_CHANNEL"),
        "WhatsApp": ("WHATSAPP_ENABLED", None),
+        "Signal": ("SIGNAL_HTTP_URL", "SIGNAL_HOME_CHANNEL"),
+        "Slack": ("SLACK_BOT_TOKEN", None),
    }
    
    for name, (token_var, home_var) in platforms.items():
--- a/hermes_cli/tools_config.py
+++ b/hermes_cli/tools_config.py
@@ -96,6 +96,11 @@ CONFIGURABLE_TOOLSETS = [
    ("homeassistant",    "🏠 Home Assistant",           "smart home device control"),
 ]

+# Toolsets that are OFF by default for new installs.
+# They're still in _HERMES_CORE_TOOLS (available at runtime if enabled),
+# but the setup checklist won't pre-select them for first-time users.
+_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl"}
+
 # Platform display config
 PLATFORMS = {
    "cli":      {"label": "🖥️  CLI",       "default_toolset": "hermes-cli"},
@@ -142,6 +147,8 @@ TOOL_CATEGORIES = {
    },
    "web": {
        "name": "Web Search & Extract",
+        "setup_title": "Select Search Provider",
+        "setup_note": "A free DuckDuckGo search skill is also included — skip this if you don't need Firecrawl.",
        "icon": "🔍",
        "providers": [
            {
@@ -595,11 +602,18 @@ def _configure_tool_category(ts_key: str, cat: dict, config: dict):
        print(color(f"  --- {icon} {name} ({provider['name']}) ---", Colors.CYAN))
        if provider.get("tag"):
            _print_info(f"  {provider['tag']}")
+        # For single-provider tools, show a note if available
+        if cat.get("setup_note"):
+            _print_info(f"  {cat['setup_note']}")
        _configure_provider(provider, config)
    else:
        # Multiple providers - let user choose
        print()
-        print(color(f"  --- {icon} {name} - Choose a provider ---", Colors.CYAN))
+        # Use custom title if provided (e.g. "Select Search Provider")
+        title = cat.get("setup_title", "Choose a provider")
+        print(color(f"  --- {icon} {name} - {title} ---", Colors.CYAN))
+        if cat.get("setup_note"):
+            _print_info(f"  {cat['setup_note']}")
        print()

        # Plain text labels only (no ANSI codes in menu items)
@@ -617,6 +631,9 @@ def _configure_tool_category(ts_key: str, cat: dict, config: dict):
                    configured = " [configured]"
            provider_choices.append(f"{p['name']}{tag}{configured}")

+        # Add skip option
+        provider_choices.append("Skip — keep defaults / configure later")
+
        # Detect current provider as default
        default_idx = 0
        for i, p in enumerate(providers):
@@ -628,7 +645,13 @@ def _configure_tool_category(ts_key: str, cat: dict, config: dict):
                default_idx = i
                break

-        provider_idx = _prompt_choice("  Select provider:", provider_choices, default_idx)
+        provider_idx = _prompt_choice(f"  {title}:", provider_choices, default_idx)
+
+        # Skip selected
+        if provider_idx >= len(providers):
+            _print_info(f"  Skipped {name}")
+            return
+
        _configure_provider(providers[provider_idx], config)


@@ -835,9 +858,19 @@ def _reconfigure_simple_requirements(ts_key: str):

 # ─── Main Entry Point ─────────────────────────────────────────────────────────

-def tools_command(args=None):
-    """Entry point for `hermes tools` and `hermes setup tools`."""
-    config = load_config()
+def tools_command(args=None, first_install: bool = False, config: dict = None):
+    """Entry point for `hermes tools` and `hermes setup tools`.
+
+    Args:
+        first_install: When True (set by the setup wizard on fresh installs),
+            skip the platform menu, go straight to the CLI checklist, and
+            prompt for API keys on all enabled tools that need them.
+        config: Optional config dict to use.  When called from the setup
+            wizard, the wizard passes its own dict so that platform_toolsets
+            are written into it and survive the wizard's final save_config().
+    """
+    if config is None:
+        config = load_config()
    enabled_platforms = _get_enabled_platforms()

    print()
@@ -846,6 +879,57 @@ def tools_command(args=None):
    print(color("  Tools that need API keys will be configured when enabled.", Colors.DIM))
    print()

+    # ── First-time install: linear flow, no platform menu ──
+    if first_install:
+        for pkey in enabled_platforms:
+            pinfo = PLATFORMS[pkey]
+            current_enabled = _get_platform_tools(config, pkey)
+
+            # Uncheck toolsets that should be off by default
+            checklist_preselected = current_enabled - _DEFAULT_OFF_TOOLSETS
+
+            # Show checklist
+            new_enabled = _prompt_toolset_checklist(pinfo["label"], checklist_preselected)
+
+            added = new_enabled - current_enabled
+            removed = current_enabled - new_enabled
+            if added:
+                for ts in sorted(added):
+                    label = next((l for k, l, _ in CONFIGURABLE_TOOLSETS if k == ts), ts)
+                    print(color(f"  + {label}", Colors.GREEN))
+            if removed:
+                for ts in sorted(removed):
+                    label = next((l for k, l, _ in CONFIGURABLE_TOOLSETS if k == ts), ts)
+                    print(color(f"  - {label}", Colors.RED))
+
+            # Walk through ALL selected tools that have provider options or
+            # need API keys.  This ensures browser (Local vs Browserbase),
+            # TTS (Edge vs OpenAI vs ElevenLabs), etc. are shown even when
+            # a free provider exists.
+            to_configure = [
+                ts_key for ts_key in sorted(new_enabled)
+                if TOOL_CATEGORIES.get(ts_key) or TOOLSET_ENV_REQUIREMENTS.get(ts_key)
+            ]
+
+            if to_configure:
+                print()
+                print(color(f"  Configuring {len(to_configure)} tool(s):", Colors.YELLOW))
+                for ts_key in to_configure:
+                    label = next((l for k, l, _ in CONFIGURABLE_TOOLSETS if k == ts_key), ts_key)
+                    print(color(f"    • {label}", Colors.DIM))
+                print(color("  You can skip any tool you don't need right now.", Colors.DIM))
+                print()
+                for ts_key in to_configure:
+                    _configure_toolset(ts_key, config)
+
+            _save_platform_tools(config, pkey, new_enabled)
+            save_config(config)
+            print(color(f"  ✓ Saved {pinfo['label']} tool configuration", Colors.GREEN))
+            print()
+
+        return
+
+    # ── Returning user: platform menu loop ──
    # Build platform choices
    platform_choices = []
    platform_keys = []
@@ -896,11 +980,10 @@ def tools_command(args=None):
                    print(color(f"  - {label}", Colors.RED))

            # Configure newly enabled toolsets that need API keys
-            if added:
-                for ts_key in sorted(added):
-                    if TOOL_CATEGORIES.get(ts_key) or TOOLSET_ENV_REQUIREMENTS.get(ts_key):
-                        if not _toolset_has_keys(ts_key):
-                            _configure_toolset(ts_key, config)
+            for ts_key in sorted(added):
+                if (TOOL_CATEGORIES.get(ts_key) or TOOLSET_ENV_REQUIREMENTS.get(ts_key)):
+                    if not _toolset_has_keys(ts_key):
+                        _configure_toolset(ts_key, config)

            _save_platform_tools(config, pkey, new_enabled)
            save_config(config)
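Review note: in the first-install flow above, the checklist is pre-selected from `current_enabled - _DEFAULT_OFF_TOOLSETS`, but `added` and `removed` are computed against `current_enabled`. A small sketch of that set arithmetic, using hypothetical function names and an example toolset list that is assumed for illustration:

```python
from typing import List, Set, Tuple

# Mirrors _DEFAULT_OFF_TOOLSETS from the diff above.
DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl"}


def preselect(current_enabled: Set[str]) -> Set[str]:
    """Toolsets that start checked in the first-install checklist."""
    return current_enabled - DEFAULT_OFF_TOOLSETS


def diff_toolsets(current: Set[str], new: Set[str]) -> Tuple[List[str], List[str]]:
    """Return (added, removed) relative to the previously enabled set."""
    return sorted(new - current), sorted(current - new)


# Example: the user accepts the pre-selected checklist unchanged.
current = {"web", "tts", "moa"}          # assumed starting set
accepted = preselect(current)            # {"web", "tts"}
added, removed = diff_toolsets(current, accepted)
```

In this example `removed` contains `"moa"` even though the user changed nothing, so a default-off toolset would be printed as removed on a fresh install if it was in the platform's starting set.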
--- a/optional-skills/email/agentmail/SKILL.md
+++ b/optional-skills/email/agentmail/SKILL.md
@@ -0,0 +1,125 @@
+---
+name: agentmail
+description: Give the agent its own dedicated email inbox via AgentMail. Send, receive, and manage email autonomously using agent-owned email addresses (e.g. hermes-agent@agentmail.to).
+version: 1.0.0
+metadata:
+  hermes:
+    tags: [email, communication, agentmail, mcp]
+    category: email
+---
+
+# AgentMail — Agent-Owned Email Inboxes
+
+## Requirements
+
+- **AgentMail API key** (required) — sign up at https://console.agentmail.to (free tier: 3 inboxes, 3,000 emails/month; paid plans from $20/mo)
+- Node.js 18+ (for the MCP server)
+
+## When to Use
+Use this skill when you need to:
+- Give the agent its own dedicated email address
+- Send emails autonomously on behalf of the agent
+- Receive and read incoming emails
+- Manage email threads and conversations
+- Sign up for services or authenticate via email
+- Communicate with other agents or humans via email
+
+This is NOT for reading the user's personal email (use himalaya or Gmail for that).
+AgentMail gives the agent its own identity and inbox.
+
+## Setup
+
+### 1. Get an API Key
+- Go to https://console.agentmail.to
+- Create an account and generate an API key (starts with `am_`)
+
+### 2. Configure MCP Server
+Add to `~/.hermes/config.yaml` (paste your actual key — MCP env vars are not expanded from .env):
+```yaml
+mcp_servers:
+  agentmail:
+    command: "npx"
+    args: ["-y", "agentmail-mcp"]
+    env:
+      AGENTMAIL_API_KEY: "am_your_key_here"
+```
+
+### 3. Restart Hermes
+```bash
+hermes
+```
+All 11 AgentMail tools are now available automatically.
+
+## Available Tools (via MCP)
+
+| Tool | Description |
+|------|-------------|
+| `list_inboxes` | List all agent inboxes |
+| `get_inbox` | Get details of a specific inbox |
+| `create_inbox` | Create a new inbox (gets a real email address) |
+| `delete_inbox` | Delete an inbox |
+| `list_threads` | List email threads in an inbox |
+| `get_thread` | Get a specific email thread |
+| `send_message` | Send a new email |
+| `reply_to_message` | Reply to an existing email |
+| `forward_message` | Forward an email |
+| `update_message` | Update message labels/status |
+| `get_attachment` | Download an email attachment |
+
+## Procedure
+
+### Create an inbox and send an email
+1. Create a dedicated inbox:
+   - Use `create_inbox` with a username (e.g. `hermes-agent`)
+   - The agent gets address: `hermes-agent@agentmail.to`
+2. Send an email:
+   - Use `send_message` with `inbox_id`, `to`, `subject`, `text`
+3. Check for replies:
+   - Use `list_threads` to see incoming conversations
+   - Use `get_thread` to read a specific thread
+
+### Check incoming email
+1. Use `list_inboxes` to find your inbox ID
+2. Use `list_threads` with the inbox ID to see conversations
+3. Use `get_thread` to read a thread and its messages
+
+### Reply to an email
+1. Get the thread with `get_thread`
+2. Use `reply_to_message` with the message ID and your reply text
+
+## Example Workflows
+
+**Sign up for a service:**
+```
+1. create_inbox (username: "signup-bot")
+2. Use the inbox address to register on the service
+3. list_threads to check for verification email
+4. get_thread to read the verification code
+```
+
+**Agent-to-human outreach:**
+```
+1. create_inbox (username: "hermes-outreach")
+2. send_message (to: user@example.com, subject: "Hello", text: "...")
+3. list_threads to check for replies
+```
+
+## Pitfalls
+- Free tier limited to 3 inboxes and 3,000 emails/month
+- Emails come from `@agentmail.to` domain on free tier (custom domains on paid plans)
+- Node.js (18+) is required for the MCP server (`npx -y agentmail-mcp`)
+- The `mcp` Python package must be installed: `pip install mcp`
+- Real-time inbound email (webhooks) requires a public server — use `list_threads` polling via cronjob instead for personal use
+
+## Verification
+After setup, test with:
+```bash
+hermes --toolsets mcp -q "Create an AgentMail inbox called test-agent and tell me its email address"
+```
+You should see the new inbox address returned.
+
+## References
+- AgentMail docs: https://docs.agentmail.to/
+- AgentMail console: https://console.agentmail.to
+- AgentMail MCP repo: https://github.com/agentmail-to/agentmail-mcp
+- Pricing: https://www.agentmail.to/pricing
--- a/run_agent.py
+++ b/run_agent.py
@@ -183,6 +183,7 @@ class AIAgent:
        session_db=None,
        honcho_session_key: str = None,
        iteration_budget: "IterationBudget" = None,
+        fallback_model: Dict[str, Any] = None,
    ):
        """
        Initialize the AI Agent.
@@ -406,6 +407,17 @@ class AIAgent:
        except Exception as e:
            raise RuntimeError(f"Failed to initialize OpenAI client: {e}")
        
+        # Provider fallback — a single backup model/provider tried when the
+        # primary is exhausted (rate-limit, overload, connection failure).
+        # Config shape: {"provider": "openrouter", "model": "anthropic/claude-sonnet-4"}
+        self._fallback_model = fallback_model if isinstance(fallback_model, dict) else None
+        self._fallback_activated = False
+        if self._fallback_model:
+            fb_p = self._fallback_model.get("provider", "")
+            fb_m = self._fallback_model.get("model", "")
+            if fb_p and fb_m and not self.quiet_mode:
+                print(f"🔄 Fallback model: {fb_m} ({fb_p})")
+
        # Get available tools with filtering
        self.tools = get_tool_definitions(
            enabled_toolsets=enabled_toolsets,
@@ -2146,6 +2158,141 @@ class AIAgent:
            raise result["error"]
        return result["response"]

+    # ── Provider fallback ──────────────────────────────────────────────────
+
+    # API-key providers: provider → (base_url, [env_var_names])
+    _FALLBACK_API_KEY_PROVIDERS = {
+        "openrouter": (OPENROUTER_BASE_URL, ["OPENROUTER_API_KEY"]),
+        "zai": ("https://api.z.ai/api/paas/v4", ["ZAI_API_KEY", "Z_AI_API_KEY"]),
+        "kimi-coding": ("https://api.moonshot.ai/v1", ["KIMI_API_KEY"]),
+        "minimax": ("https://api.minimax.io/v1", ["MINIMAX_API_KEY"]),
+        "minimax-cn": ("https://api.minimaxi.com/v1", ["MINIMAX_CN_API_KEY"]),
+    }
+
+    # OAuth providers: provider → (resolver_import_path, api_mode)
+    # Each resolver returns {"api_key": ..., "base_url": ...}.
+    _FALLBACK_OAUTH_PROVIDERS = {
+        "openai-codex": ("resolve_codex_runtime_credentials", "codex_responses"),
+        "nous": ("resolve_nous_runtime_credentials", "chat_completions"),
+    }
+
+    def _resolve_fallback_credentials(
+        self, fb_provider: str, fb_config: dict
+    ) -> Optional[tuple]:
+        """Resolve credentials for a fallback provider.
+
+        Returns (api_key, base_url, api_mode) on success, or None on failure.
+        Handles three cases:
+          1. OAuth providers (openai-codex, nous) — call credential resolver
+          2. API-key providers (openrouter, zai, etc.) — read env var
+          3. Custom endpoints — use base_url + api_key_env from config
+        """
+        # ── 1. OAuth providers ────────────────────────────────────────
+        if fb_provider in self._FALLBACK_OAUTH_PROVIDERS:
+            resolver_name, api_mode = self._FALLBACK_OAUTH_PROVIDERS[fb_provider]
+            try:
+                import hermes_cli.auth as _auth
+                resolver = getattr(_auth, resolver_name)
+                creds = resolver()
+                return creds["api_key"], creds["base_url"], api_mode
+            except Exception as e:
+                logging.warning(
+                    "Fallback to %s failed (credential resolution): %s",
+                    fb_provider, e,
+                )
+                return None
+
+        # ── 2. API-key providers ──────────────────────────────────────
+        fb_key = (fb_config.get("api_key") or "").strip()
+        if not fb_key:
+            key_env = (fb_config.get("api_key_env") or "").strip()
+            if key_env:
+                fb_key = os.getenv(key_env, "")
+            elif fb_provider in self._FALLBACK_API_KEY_PROVIDERS:
+                for env_var in self._FALLBACK_API_KEY_PROVIDERS[fb_provider][1]:
+                    fb_key = os.getenv(env_var, "")
+                    if fb_key:
+                        break
+        if not fb_key:
+            logging.warning(
+                "Fallback model configured but no API key found for provider '%s'",
+                fb_provider,
+            )
+            return None
+
+        # ── 3. Resolve base URL ───────────────────────────────────────
+        fb_base_url = (fb_config.get("base_url") or "").strip()
+        if not fb_base_url and fb_provider in self._FALLBACK_API_KEY_PROVIDERS:
+            fb_base_url = self._FALLBACK_API_KEY_PROVIDERS[fb_provider][0]
+        if not fb_base_url:
+            fb_base_url = OPENROUTER_BASE_URL
+
+        return fb_key, fb_base_url, "chat_completions"
+
+    def _try_activate_fallback(self) -> bool:
+        """Switch to the configured fallback model/provider.
+
+        Called when the primary model is failing after retries.  Swaps the
+        OpenAI client, model slug, and provider in-place so the retry loop
+        can continue with the new backend.  One-shot: returns False if
+        already activated or not configured.
+        """
+        if self._fallback_activated or not self._fallback_model:
+            return False
+
+        fb = self._fallback_model
+        fb_provider = (fb.get("provider") or "").strip().lower()
+        fb_model = (fb.get("model") or "").strip()
+        if not fb_provider or not fb_model:
+            return False
+
+        resolved = self._resolve_fallback_credentials(fb_provider, fb)
+        if resolved is None:
+            return False
+        fb_key, fb_base_url, fb_api_mode = resolved
+
+        # Build new client
+        try:
+            client_kwargs = {"api_key": fb_key, "base_url": fb_base_url}
+            if "openrouter" in fb_base_url.lower():
+                client_kwargs["default_headers"] = {
+                    "HTTP-Referer": "https://github.com/NousResearch/hermes-agent",
+                    "X-OpenRouter-Title": "Hermes Agent",
+                    "X-OpenRouter-Categories": "productivity,cli-agent",
+                }
+            elif "api.kimi.com" in fb_base_url.lower():
+                client_kwargs["default_headers"] = {"User-Agent": "KimiCLI/1.0"}
+
+            self.client = OpenAI(**client_kwargs)
+            self._client_kwargs = client_kwargs
+            old_model = self.model
+            self.model = fb_model
+            self.provider = fb_provider
+            self.base_url = fb_base_url
+            self.api_mode = fb_api_mode
+            self._fallback_activated = True
+
+            # Re-evaluate prompt caching for the new provider/model
+            self._use_prompt_caching = (
+                "openrouter" in fb_base_url.lower()
+                and "claude" in fb_model.lower()
+            )
+
+            print(
+                f"{self.log_prefix}🔄 Primary model failed — switching to fallback: "
+                f"{fb_model} via {fb_provider}"
+            )
+            logging.info(
+                "Fallback activated: %s → %s (%s)",
+                old_model, fb_model, fb_provider,
+            )
+            return True
+        except Exception as e:
+            logging.error("Failed to activate fallback model: %s", e)
+            return False
+
+    # ── End provider fallback ──────────────────────────────────────────────
+
    def _build_api_kwargs(self, api_messages: list) -> dict:
        """Build the keyword arguments dict for the active API mode."""
        if self.api_mode == "codex_responses":
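Review note: the API-key and base-URL lookup order in `_resolve_fallback_credentials` can be sketched in isolation as below. This is a simplified stand-in, not the PR code: the provider table is trimmed to two entries, the OAuth branch is omitted, the environment is injectable for testing, and the value used for `DEFAULT_BASE_URL` (standing in for `OPENROUTER_BASE_URL`) is an assumption.

```python
import os
from typing import Mapping, Optional, Tuple

# Trimmed stand-in for _FALLBACK_API_KEY_PROVIDERS: provider -> (base_url, env vars).
PROVIDERS = {
    "openrouter": ("https://openrouter.ai/api/v1", ["OPENROUTER_API_KEY"]),
    "zai": ("https://api.z.ai/api/paas/v4", ["ZAI_API_KEY", "Z_AI_API_KEY"]),
}
DEFAULT_BASE_URL = "https://openrouter.ai/api/v1"  # assumed value


def resolve_key_and_url(
    provider: str,
    cfg: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[Tuple[str, str]]:
    """Return (api_key, base_url), or None when no key can be found."""
    env = os.environ if env is None else env

    # Key order: inline api_key, then api_key_env, then the provider's known env vars.
    key = (cfg.get("api_key") or "").strip()
    if not key:
        key_env = (cfg.get("api_key_env") or "").strip()
        if key_env:
            key = env.get(key_env, "")
        elif provider in PROVIDERS:
            for name in PROVIDERS[provider][1]:
                key = env.get(name, "")
                if key:
                    break
    if not key:
        return None

    # URL order: explicit base_url, then the provider default, then the global default.
    base_url = (cfg.get("base_url") or "").strip()
    if not base_url and provider in PROVIDERS:
        base_url = PROVIDERS[provider][0]
    return key, base_url or DEFAULT_BASE_URL
```

One consequence of this ordering is that an unknown provider with a key but no `base_url` falls through to the global default URL rather than failing.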
@@ -2945,9 +3092,14 @@ class AIAgent:
            )
            self._iters_since_skill = 0

-        # Honcho prefetch: retrieve user context for system prompt injection
+        # Honcho prefetch: retrieve user context for system prompt injection.
+        # Only on the FIRST turn of a session (empty history).  On subsequent
+        # turns the model already has all prior context in its conversation
+        # history, and the Honcho context is baked into the stored system
+        # prompt — re-fetching it would change the system message and break
+        # Anthropic prompt caching.
        self._honcho_context = ""
-        if self._honcho and self._honcho_session_key:
+        if self._honcho and self._honcho_session_key and not conversation_history:
            try:
                self._honcho_context = self._honcho_prefetch(user_message)
            except Exception as e:
@@ -2965,14 +3117,42 @@ class AIAgent:
        # Built once on first call, reused for all subsequent calls.
        # Only rebuilt after context compression events (which invalidate
        # the cache and reload memory from disk).
+        #
+        # For continuing sessions (gateway creates a fresh AIAgent per
+        # message), we load the stored system prompt from the session DB
+        # instead of rebuilding.  Rebuilding would pick up memory changes
+        # from disk that the model already knows about (it wrote them!),
+        # producing a different system prompt and breaking the Anthropic
+        # prefix cache.
        if self._cached_system_prompt is None:
-            self._cached_system_prompt = self._build_system_prompt(system_message)
-            # Store the system prompt snapshot in SQLite
-            if self._session_db:
+            stored_prompt = None
+            if conversation_history and self._session_db:
                try:
-                    self._session_db.update_system_prompt(self.session_id, self._cached_system_prompt)
-                except Exception as e:
-                    logger.debug("Session DB update_system_prompt failed: %s", e)
+                    session_row = self._session_db.get_session(self.session_id)
+                    if session_row:
+                        stored_prompt = session_row.get("system_prompt") or None
+                except Exception:
+                    pass  # Fall through to build fresh
+
+            if stored_prompt:
+                # Continuing session — reuse the exact system prompt from
+                # the previous turn so the Anthropic cache prefix matches.
+                self._cached_system_prompt = stored_prompt
+            else:
+                # First turn of a new session — build from scratch.
+                self._cached_system_prompt = self._build_system_prompt(system_message)
+                # Bake Honcho context into the prompt so it's stable for
+                # the entire session (not re-fetched per turn).
+                if self._honcho_context:
+                    self._cached_system_prompt = (
+                        self._cached_system_prompt + "\n\n" + self._honcho_context
+                    ).strip()
+                # Store the system prompt snapshot in SQLite
+                if self._session_db:
+                    try:
+                        self._session_db.update_system_prompt(self.session_id, self._cached_system_prompt)
+                    except Exception as e:
+                        logger.debug("Session DB update_system_prompt failed: %s", e)

        active_system_prompt = self._cached_system_prompt

@@ -3097,11 +3277,13 @@ class AIAgent:
            # Build the final system message: cached prompt + ephemeral system prompt.
            # The ephemeral part is appended here (not baked into the cached prompt)
            # so it stays out of the session DB and logs.
+            # Note: Honcho context is baked into _cached_system_prompt on the first
+            # turn and stored in the session DB, so it does NOT need to be injected
+            # here.  This keeps the system message identical across all turns in a
+            # session, maximizing Anthropic prompt cache hits.
            effective_system = active_system_prompt or ""
            if self.ephemeral_system_prompt:
                effective_system = (effective_system + "\n\n" + self.ephemeral_system_prompt).strip()
-            if self._honcho_context:
-                effective_system = (effective_system + "\n\n" + self._honcho_context).strip()
            if effective_system:
                api_messages = [{"role": "system", "content": effective_system}] + api_messages
            
@@ -3252,6 +3434,10 @@ class AIAgent:
                        print(f"{self.log_prefix}   ⏱️  Response time: {api_duration:.2f}s (fast response often indicates rate limiting)")
                        
                        if retry_count >= max_retries:
+                            # Try fallback before giving up
+                            if self._try_activate_fallback():
+                                retry_count = 0
+                                continue
                            print(f"{self.log_prefix}❌ Max retries ({max_retries}) exceeded for invalid responses. Giving up.")
                            logging.error(f"{self.log_prefix}Invalid API response after {max_retries} retries.")
                            self._persist_session(messages, conversation_history)
@@ -3576,6 +3762,11 @@ class AIAgent:
                    ])) and not is_context_length_error

                    if is_client_error:
+                        # Try fallback before aborting — a different provider
+                        # may not have the same issue (rate limit, auth, etc.)
+                        if self._try_activate_fallback():
+                            retry_count = 0
+                            continue
                        self._dump_api_request_debug(
                            api_kwargs, reason="non_retryable_client_error", error=api_error,
                        )
@@ -3593,6 +3784,10 @@ class AIAgent:
                        }

                    if retry_count >= max_retries:
+                        # Try fallback before giving up entirely
+                        if self._try_activate_fallback():
+                            retry_count = 0
+                            continue
                        print(f"{self.log_prefix}❌ Max retries ({max_retries}) exceeded. Giving up.")
                        logging.error(f"{self.log_prefix}API call failed after {max_retries} retries. Last error: {api_error}")
                        logging.error(f"{self.log_prefix}Request details - Messages: {len(api_messages)}, Approx tokens: {approx_tokens:,}")
@@ -3639,6 +3834,27 @@ class AIAgent:
                else:
                    assistant_message = response.choices[0].message
                
+                # Normalize content to string — some OpenAI-compatible servers
+                # (llama-server, etc.) return content as a dict or list instead
+                # of a plain string, which crashes downstream .strip() calls.
+                if assistant_message.content is not None and not isinstance(assistant_message.content, str):
+                    raw = assistant_message.content
+                    if isinstance(raw, dict):
+                        assistant_message.content = raw.get("text", "") or raw.get("content", "") or json.dumps(raw)
+                    elif isinstance(raw, list):
+                        # Multimodal content list — extract text parts
+                        parts = []
+                        for part in raw:
+                            if isinstance(part, str):
+                                parts.append(part)
+                            elif isinstance(part, dict) and part.get("type") == "text":
+                                parts.append(str(part.get("text") or ""))
+                            elif isinstance(part, dict) and "text" in part:
+                                parts.append(str(part["text"]))
+                        assistant_message.content = "\n".join(parts)
+                    else:
+                        assistant_message.content = str(raw)
+
                # Handle assistant response
                if assistant_message.content and not self.quiet_mode:
                    print(f"{self.log_prefix}🤖 Assistant: {assistant_message.content[:100]}{'...' if len(assistant_message.content) > 100 else ''}")
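The content normalization added in the last run_agent.py hunk can be exercised in isolation. The following is a minimal sketch, not the code in the PR: `normalize_content` is a hypothetical standalone name, and it deviates from the hunk in one place by passing list-part text through `str()` so a null `text` field cannot break the join.

```python
import json
from typing import Any


def normalize_content(raw: Any) -> Any:
    """Coerce a chat-completion `content` value to a string (None passes through)."""
    if raw is None or isinstance(raw, str):
        return raw
    if isinstance(raw, dict):
        # Prefer an explicit text field, then a nested content field, else dump the dict.
        return raw.get("text", "") or raw.get("content", "") or json.dumps(raw)
    if isinstance(raw, list):
        parts = []
        for part in raw:
            if isinstance(part, str):
                parts.append(part)
            elif isinstance(part, dict) and (part.get("type") == "text" or "text" in part):
                # Assumption: a null or non-string text value should become a string.
                parts.append(str(part.get("text") or ""))
        return "\n".join(parts)
    return str(raw)
```

Non-text parts such as image entries are dropped, which matches the intent stated in the hunk's comment (extract text parts only).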
--- a/scripts/install.sh
+++ b/scripts/install.sh
@@ -492,9 +492,23 @@ install_system_packages() {
                        return 0
                    fi
                fi
+            elif [ -e /dev/tty ]; then
+                # Non-interactive (e.g. curl | bash) but a terminal is available.
+                # Read the prompt from /dev/tty (same approach the setup wizard uses).
+                echo ""
+                log_info "Installing ${description} requires sudo."
+                read -p "Install? [Y/n] " -n 1 -r < /dev/tty
+                echo
+                if [[ $REPLY =~ ^[Yy]$ ]] || [[ -z $REPLY ]]; then
+                    if sudo DEBIAN_FRONTEND=noninteractive NEEDRESTART_MODE=a $install_cmd < /dev/tty; then
+                        [ "$need_ripgrep" = true ] && HAS_RIPGREP=true && log_success "ripgrep installed"
+                        [ "$need_ffmpeg" = true ]  && HAS_FFMPEG=true  && log_success "ffmpeg installed"
+                        return 0
+                    fi
+                fi
            else
-                log_warn "Non-interactive mode: cannot prompt for sudo password"
-                log_info "Install missing packages manually: sudo $install_cmd"
+                log_warn "Non-interactive mode and no terminal available — cannot install system packages"
+                log_info "Install manually after setup completes: sudo $install_cmd"
            fi
        fi
    fi
--- a/skills/creative/DESCRIPTION.md
+++ b/skills/creative/DESCRIPTION.md
@@ -0,0 +1,3 @@
+---
+description: Creative content generation — ASCII art, hand-drawn style diagrams, and visual design tools.
+---
--- a/skills/diagramming/excalidraw/SKILL.md
+++ b/skills/diagramming/excalidraw/SKILL.md
--- a/skills/diagramming/excalidraw/references/colors.md
+++ b/skills/diagramming/excalidraw/references/colors.md
--- a/skills/diagramming/excalidraw/references/dark-mode.md
+++ b/skills/diagramming/excalidraw/references/dark-mode.md
--- a/skills/diagramming/excalidraw/references/examples.md
+++ b/skills/diagramming/excalidraw/references/examples.md
--- a/skills/diagramming/excalidraw/scripts/upload.py
+++ b/skills/diagramming/excalidraw/scripts/upload.py
--- a/skills/dogfood/SKILL.md
+++ b/skills/dogfood/SKILL.md
@@ -0,0 +1,162 @@
+---
+name: dogfood
+description: Systematic exploratory QA testing of web applications — find bugs, capture evidence, and generate structured reports
+version: 1.0.0
+metadata:
+  hermes:
+    tags: [qa, testing, browser, web, dogfood]
+    related_skills: []
+---
+
+# Dogfood: Systematic Web Application QA Testing
+
+## Overview
+
+This skill guides you through systematic exploratory QA testing of web applications using the browser toolset. You will navigate the application, interact with elements, capture evidence of issues, and produce a structured bug report.
+
+## Prerequisites
+
+- Browser toolset must be available (`browser_navigate`, `browser_snapshot`, `browser_click`, `browser_type`, `browser_vision`, `browser_console`, `browser_scroll`, `browser_back`, `browser_press`, `browser_close`)
+- A target URL and testing scope from the user
+
+## Inputs
+
+The user provides:
+1. **Target URL** — the entry point for testing
+2. **Scope** — what areas/features to focus on (or "full site" for comprehensive testing)
+3. **Output directory** (optional) — where to save screenshots and the report (default: `./dogfood-output`)
+
+## Workflow
+
+Follow this 5-phase systematic workflow:
+
+### Phase 1: Plan
+
+1. Create the output directory structure:
+   ```
+   {output_dir}/
+   ├── screenshots/       # Evidence screenshots
+   └── report.md          # Final report (generated in Phase 5)
+   ```
+2. Identify the testing scope based on user input.
+3. Build a rough sitemap by planning which pages and features to test:
+   - Landing/home page
+   - Navigation links (header, footer, sidebar)
+   - Key user flows (sign up, login, search, checkout, etc.)
+   - Forms and interactive elements
+   - Edge cases (empty states, error pages, 404s)
+
+### Phase 2: Explore
+
+For each page or feature in your plan:
+
+1. **Navigate** to the page:
+   ```
+   browser_navigate(url="https://example.com/page")
+   ```
+
+2. **Take a snapshot** to understand the DOM structure:
+   ```
+   browser_snapshot()
+   ```
+
+3. **Check the console** for JavaScript errors:
+   ```
+   browser_console(clear=true)
+   ```
+   Do this after every navigation and after every significant interaction. Silent JS errors are high-value findings.
+
+4. **Take an annotated screenshot** to visually assess the page and identify interactive elements:
+   ```
+   browser_vision(question="Describe the page layout, identify any visual issues, broken elements, or accessibility concerns", annotate=true)
+   ```
+   The `annotate=true` flag overlays numbered `[N]` labels on interactive elements. Each `[N]` maps to ref `@eN` for subsequent browser commands.
+
+5. **Test interactive elements** systematically:
+   - Click buttons and links: `browser_click(ref="@eN")`
+   - Fill forms: `browser_type(ref="@eN", text="test input")`
+   - Test keyboard navigation: `browser_press(key="Tab")`, `browser_press(key="Enter")`
+   - Scroll through content: `browser_scroll(direction="down")`
+   - Test form validation with invalid inputs
+   - Test empty submissions
+
+6. **After each interaction**, check for:
+   - Console errors: `browser_console()`
+   - Visual changes: `browser_vision(question="What changed after the interaction?")`
+   - Expected vs actual behavior
+
+### Phase 3: Collect Evidence
+
+For every issue found:
+
+1. **Take a screenshot** showing the issue:
+   ```
+   browser_vision(question="Capture and describe the issue visible on this page", annotate=false)
+   ```
+   Save the `screenshot_path` from the response — you will reference it in the report.
+
+2. **Record the details**:
+   - URL where the issue occurs
+   - Steps to reproduce
+   - Expected behavior
+   - Actual behavior
+   - Console errors (if any)
+   - Screenshot path
+
+3. **Classify the issue** using the issue taxonomy (see `references/issue-taxonomy.md`):
+   - Severity: Critical / High / Medium / Low
+   - Category: Functional / Visual / Accessibility / Console / UX / Content
+
+### Phase 4: Categorize
+
+1. Review all collected issues.
+2. De-duplicate — merge issues that are the same bug manifesting in different places.
+3. Assign final severity and category to each issue.
+4. Sort by severity (Critical first, then High, Medium, Low).
+5. Count issues by severity and category for the executive summary.
+
+### Phase 5: Report
+
+Generate the final report using the template at `templates/dogfood-report-template.md`.
+
+The report must include:
+1. **Executive summary** with total issue count, breakdown by severity, and testing scope
+2. **Per-issue sections** with:
+   - Issue number and title
+   - Severity and category badges
+   - URL where observed
+   - Description of the issue
+   - Steps to reproduce
+   - Expected vs actual behavior
+   - Screenshot references (use `MEDIA:<screenshot_path>` for inline images)
+   - Console errors if relevant
+3. **Summary table** of all issues
+4. **Testing notes** — what was tested, what was not, any blockers
+
+Save the report to `{output_dir}/report.md`.
+
+## Tools Reference
+
+| Tool | Purpose |
+|------|---------|
+| `browser_navigate` | Go to a URL |
+| `browser_snapshot` | Get DOM text snapshot (accessibility tree) |
+| `browser_click` | Click an element by ref (`@eN`) or text |
+| `browser_type` | Type into an input field |
+| `browser_scroll` | Scroll up/down on the page |
+| `browser_back` | Go back in browser history |
+| `browser_press` | Press a keyboard key |
+| `browser_vision` | Screenshot + AI analysis; use `annotate=true` for element labels |
+| `browser_console` | Get JS console output and errors |
+| `browser_close` | Close the browser session |
+
+## Tips
+
+- **Always check `browser_console()` after navigating and after significant interactions.** Silent JS errors are among the most valuable findings.
+- **Use `annotate=true` with `browser_vision`** when you need to reason about interactive element positions or when the snapshot refs are unclear.
+- **Test with both valid and invalid inputs** — form validation bugs are common.
+- **Scroll through long pages** — content below the fold may have rendering issues.
+- **Test navigation flows** — click through multi-step processes end-to-end.
+- **Check responsive behavior** by noting any layout issues visible in screenshots.
+- **Don't forget edge cases**: empty states, very long text, special characters, rapid clicking.
+- When reporting screenshots to the user, include `MEDIA:<screenshot_path>` so they can see the evidence inline.
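Phase 4 of the dogfood skill (de-duplicate, sort by severity, count) is described only in prose. One way it could be sketched is shown below. The `Issue` dataclass, the `categorize` function, and the choice to treat issues with the same title and category as duplicates are all assumptions for illustration; the skill itself leaves the merge rule to the tester's judgment.

```python
from collections import Counter
from dataclasses import dataclass

# Severity order from the skill: Critical first, then High, Medium, Low.
SEVERITY_ORDER = ["Critical", "High", "Medium", "Low"]


@dataclass(frozen=True)
class Issue:
    title: str
    severity: str
    category: str
    url: str


def categorize(issues):
    """De-duplicate by (title, category), sort by severity, and count."""
    unique = {}
    for issue in issues:
        key = (issue.title.strip().lower(), issue.category)
        kept = unique.get(key)
        # Keep the most severe instance of a duplicated issue.
        if kept is None or SEVERITY_ORDER.index(issue.severity) < SEVERITY_ORDER.index(kept.severity):
            unique[key] = issue
    ordered = sorted(unique.values(), key=lambda i: SEVERITY_ORDER.index(i.severity))
    by_severity = Counter(i.severity for i in ordered)
    by_category = Counter(i.category for i in ordered)
    return ordered, by_severity, by_category
```

The two counters feed the executive summary directly; the ordered list gives the per-issue sections their order.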
--- a/skills/dogfood/references/issue-taxonomy.md
+++ b/skills/dogfood/references/issue-taxonomy.md
@@ -0,0 +1,109 @@
+# Issue Taxonomy
+
+Use this taxonomy to classify issues found during dogfood QA testing.
+
+## Severity Levels
+
+### Critical
+The issue makes a core feature completely unusable or causes data loss.
+
+**Examples:**
+- Application crashes or shows a blank white page
+- Form submission silently loses user data
+- Authentication is completely broken (can't log in at all)
+- Payment flow fails and charges the user without completing the order
+- Security vulnerability (e.g., XSS, exposed credentials in console)
+
+### High
+The issue significantly impairs functionality but a workaround may exist.
+
+**Examples:**
+- A key button does nothing when clicked (but refreshing fixes it)
+- Search returns no results for valid queries
+- Form validation rejects valid input
+- Page loads but critical content is missing or garbled
+- Navigation link leads to a 404 or wrong page
+- Uncaught JavaScript exceptions in the console on core pages
+
+### Medium
+The issue is noticeable and affects user experience but doesn't block core functionality.
+
+**Examples:**
+- Layout is misaligned or overlapping on certain screen sections
+- Images fail to load (broken image icons)
+- Slow performance (visible loading delays > 3 seconds)
+- Form field lacks proper validation feedback (no error message on bad input)
+- Console warnings that suggest deprecated or misconfigured features
+- Inconsistent styling between similar pages
+
+### Low
+Minor polish issues that don't affect functionality.
+
+**Examples:**
+- Typos or grammatical errors in text content
+- Minor spacing or alignment inconsistencies
+- Placeholder text left in production ("Lorem ipsum")
+- Favicon missing
+- Console info/debug messages that shouldn't be in production
+- Subtle color contrast issues that don't fail WCAG requirements
+
+## Categories
+
+### Functional
+Issues where features don't work as expected.
+
+- Buttons/links that don't respond
+- Forms that don't submit or submit incorrectly
+- Broken user flows (can't complete a multi-step process)
+- Incorrect data displayed
+- Features that work partially
+
+### Visual
+Issues with the visual presentation of the page.
+
+- Layout problems (overlapping elements, broken grids)
+- Broken images or missing media
+- Styling inconsistencies
+- Responsive design failures
+- Z-index issues (elements hidden behind others)
+- Text overflow or truncation
+
+### Accessibility
+Issues that prevent or hinder access for users with disabilities.
+
+- Missing alt text on meaningful images
+- Poor color contrast (fails WCAG AA)
+- Elements not reachable via keyboard navigation
+- Missing form labels or ARIA attributes
+- Focus indicators missing or unclear
+- Screen reader incompatible content
+
+### Console
+Issues detected through JavaScript console output.
+
+- Uncaught exceptions and unhandled promise rejections
+- Failed network requests (4xx, 5xx errors in console)
+- Deprecation warnings
+- CORS errors
+- Mixed content warnings (HTTP resources on HTTPS page)
+- Excessive console.log output left from development
+
+### UX (User Experience)
+Issues where functionality works but the experience is poor.
+
+- Confusing navigation or information architecture
+- Missing loading indicators (user doesn't know something is happening)
+- No feedback after user actions (e.g., button click with no visible result)
+- Inconsistent interaction patterns
+- Missing confirmation dialogs for destructive actions
+- Poor error messages that don't help the user recover
+
+### Content
+Issues with the text, media, or information on the page.
+
+- Typos and grammatical errors
+- Placeholder/dummy content in production
+- Outdated information
+- Missing content (empty sections)
+- Broken or dead links to external resources
+- Incorrect or misleading labels
--- a/skills/dogfood/templates/dogfood-report-template.md
+++ b/skills/dogfood/templates/dogfood-report-template.md
@@ -0,0 +1,86 @@
+# Dogfood QA Report
+
+**Target:** {target_url}
+**Date:** {date}
+**Scope:** {scope_description}
+**Tester:** Hermes Agent (automated exploratory QA)
+
+---
+
+## Executive Summary
+
+| Severity | Count |
+|----------|-------|
+| 🔴 Critical | {critical_count} |
+| 🟠 High | {high_count} |
+| 🟡 Medium | {medium_count} |
+| 🔵 Low | {low_count} |
+| **Total** | **{total_count}** |
+
+**Overall Assessment:** {one_sentence_assessment}
+
+---
+
+## Issues
+
+<!-- Repeat this section for each issue found, sorted by severity (Critical first) -->
+
+### Issue #{issue_number}: {issue_title}
+
+| Field | Value |
+|-------|-------|
+| **Severity** | {severity} |
+| **Category** | {category} |
+| **URL** | {url_where_found} |
+
+**Description:**
+{detailed_description_of_the_issue}
+
+**Steps to Reproduce:**
+1. {step_1}
+2. {step_2}
+3. {step_3}
+
+**Expected Behavior:**
+{what_should_happen}
+
+**Actual Behavior:**
+{what_actually_happens}
+
+**Screenshot:**
+MEDIA:{screenshot_path}
+
+**Console Errors** (if applicable):
+```
+{console_error_output}
+```
+
+---
+
+<!-- End of per-issue section -->
+
+## Issues Summary Table
+
+| # | Title | Severity | Category | URL |
+|---|-------|----------|----------|-----|
+| {n} | {title} | {severity} | {category} | {url} |
+
+## Testing Coverage
+
+### Pages Tested
+- {list_of_pages_visited}
+
+### Features Tested
+- {list_of_features_exercised}
+
+### Not Tested / Out of Scope
+- {areas_not_covered_and_why}
+
+### Blockers
+- {any_issues_that_prevented_testing_certain_areas}
+
+---
+
+## Notes
+
+{any_additional_observations_or_recommendations}
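The placeholders in the report template use the same `{name}` syntax as Python's `str.format`, so the executive summary table can be filled mechanically. This is a sketch under that assumption; `render_summary` is a hypothetical helper, and only the summary rows of the template are reproduced here.

```python
SUMMARY_TEMPLATE = (
    "| Severity | Count |\n"
    "|----------|-------|\n"
    "| 🔴 Critical | {critical_count} |\n"
    "| 🟠 High | {high_count} |\n"
    "| 🟡 Medium | {medium_count} |\n"
    "| 🔵 Low | {low_count} |\n"
    "| **Total** | **{total_count}** |\n"
)


def render_summary(counts):
    """Fill the summary table from a {severity: count} mapping; missing severities count as 0."""
    values = {
        f"{name.lower()}_count": counts.get(name, 0)
        for name in ("Critical", "High", "Medium", "Low")
    }
    values["total_count"] = sum(values.values())
    return SUMMARY_TEMPLATE.format(**values)
```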
--- a/skills/gaming/pokemon-player/SKILL.md
+++ b/skills/gaming/pokemon-player/SKILL.md
@@ -0,0 +1,161 @@
+---
+name: pokemon-player
+description: Play Pokémon games autonomously via headless emulation. Starts a game server, reads structured game state from RAM, makes strategic decisions, and sends button inputs — all from the terminal.
+tags: [gaming, pokemon, emulator, pyboy, gameplay, gameboy]
+---
+# Pokémon Player
+
+Play Pokémon games via headless emulation using the `pokemon-agent` package.
+
+## When to Use
+- User says "play pokemon", "start pokemon", "pokemon game"
+- User asks about Pokemon Red, Blue, Yellow, FireRed, etc.
+- User wants to watch an AI play Pokemon
+- User references a ROM file (.gb, .gbc, .gba)
+
+## First-Time Setup
+
+### 1. Install the package
+```bash
+pip install "pokemon-agent[dashboard]" pyboy
+```
+
+### 2. Get the ROM
+Ask the user for their ROM file path. Do NOT attempt to download ROMs.
+
+### 3. Start the game server
+```bash
+pokemon-agent serve --rom <ROM_PATH> --port 8765 &
+```
+Wait 3 seconds, then verify:
+```bash
+curl -s http://localhost:8765/health
+```
+
+## The Gameplay Loop
+
+### Step 1: OBSERVE
+```bash
+curl -s http://localhost:8765/state
+```
+
+### Step 2: ORIENT
+- Dialog active → advance text
+- In battle → fight
+- Party hurt → heal
+- Near objective → navigate
+
+### Step 3: DECIDE
+Priority order:
+1. If dialog active → a_until_dialog_end
+2. If in battle → choose best move
+3. If any Pokemon <20% HP → Pokémon Center
+4. If near story objective → navigate to it
+5. If underleveled → train in grass
+6. Otherwise → explore
+
+### Step 4: ACT
+```bash
+curl -s -X POST http://localhost:8765/action \
+  -H "Content-Type: application/json" \
+  -d '{"actions": ["walk_up", "walk_up", "press_a"]}'
+```
+
+Action reference:
+- press_a — confirm, talk, select
+- press_b — cancel, close menu
+- press_start — open game menu
+- walk_up/down/left/right — move one tile
+- a_until_dialog_end — advance all dialog
+- wait_60 — wait ~1 second
+
+### Step 5: VERIFY
+Check state_after in the response. If stuck 3+ turns:
+1. Press B several times
+2. Try different directions
+3. Take screenshot and use vision_analyze
+4. Load last save if truly stuck
+
+### Step 6: RECORD
+```
+memory add: PKM:OBJECTIVE: Heading to Pewter City to challenge Brock
+memory add: PKM:PROGRESS: Got Squirtle, Got Pokedex, → Pewter City
+```
+
+### Step 7: SAVE
+Save every 20-30 turns and ALWAYS before gym battles:
+```bash
+curl -s -X POST http://localhost:8765/save \
+  -H "Content-Type: application/json" \
+  -d '{"name": "before_brock"}'
+```
+
+## Battle Strategy
+
+### Decision Tree
+1. Want to catch? → Weaken then throw Poké Ball
+2. Wild you don't need? → RUN
+3. Type advantage? → Use super-effective move
+4. No advantage? → Use strongest STAB move
+5. Low HP? → Switch or use Potion
+
+### Type Chart
+- Water beats Fire, Ground, Rock
+- Fire beats Grass, Bug, Ice
+- Grass beats Water, Ground, Rock
+- Electric beats Water, Flying
+- Ground beats Fire, Electric, Rock, Poison
+- Psychic beats Fighting, Poison (dominant in Gen 1!)
+
+### Gen 1 Quirks
+- Special stat is both offense AND defense for special moves
+- Psychic is overpowered (a bug makes Ghost moves have no effect on Psychic types)
+- Critical-hit rate is based on the Speed stat
+- Wrap/Bind prevent opponent from acting
+
+## Memory Conventions
+| Prefix | Purpose | Example |
+|--------|---------|---------|
+| PKM:OBJECTIVE | Current goal | Defeat Brock in Pewter City |
+| PKM:MAP | Navigation knowledge | Viridian Forest: go north |
+| PKM:STRATEGY | Battle/team plans | Need Grass type before Misty |
+| PKM:PROGRESS | Milestone tracker | ✓ Boulder Badge → Cascade Badge |
+| PKM:STUCK | Stuck situations | Got stuck in Cerulean Cave |
+| PKM:TEAM | Team notes | Squirtle is Water/Ice coverage |
+
+## Progression Milestones
+- ☐ Choose starter
+- ☐ Deliver Oak's Parcel → receive Pokédex
+- ☐ Boulder Badge — Brock (Rock) → use Water/Grass
+- ☐ Cascade Badge — Misty (Water) → use Grass/Electric
+- ☐ Thunder Badge — Lt. Surge (Electric) → use Ground
+- ☐ Rainbow Badge — Erika (Grass) → use Fire/Ice/Flying
+- ☐ Soul Badge — Koga (Poison) → use Ground/Psychic
+- ☐ Marsh Badge — Sabrina (Psychic)
+- ☐ Volcano Badge — Blaine (Fire) → use Water/Ground
+- ☐ Earth Badge — Giovanni (Ground) → use Water/Grass/Ice
+- ☐ Elite Four → Champion!
+
+## Stopping Play
+1. Save the game:
+```bash
+curl -s -X POST http://localhost:8765/save \
+  -H "Content-Type: application/json" -d '{"name": "session_end"}'
+```
+2. Update memory with progress
+3. Tell user: "Game saved! Say 'play pokemon' to resume."
+4. Kill the background server process
+
+## Dashboard
+If `pokemon-agent[dashboard]` is installed, open:
+http://localhost:8765/dashboard
+
+Live features: game screen, AI reasoning stream, team status, action log.
+
+## Pitfalls
+- NEVER download or provide ROM files — always ask the user
+- Don't send more than 15 actions per /action call
+- Always wait for dialog to clear before moving
+- Save BEFORE gym battles
+- Take screenshots sparingly — they cost vision tokens
+- Verify server is running with /health before any commands
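The DECIDE priority order and the 15-action limit from the pokemon-player skill can be written as two small pure functions. This is a sketch only: the state keys (`dialog_active`, `in_battle`, `party`, `hp`, `max_hp`, `near_objective`, `underleveled`) and the returned labels are hypothetical, since the skill does not document the shape of the `/state` response.

```python
from typing import Any, Dict, List

MAX_ACTIONS_PER_CALL = 15  # limit stated under Pitfalls


def decide(state: Dict[str, Any]) -> str:
    """Return the highest-priority intent for the current state (keys are assumed)."""
    if state.get("dialog_active"):
        return "advance_dialog"
    if state.get("in_battle"):
        return "battle"
    party = state.get("party", [])
    # Heal when any party member is below 20% HP.
    if any(p["max_hp"] > 0 and p["hp"] / p["max_hp"] < 0.20 for p in party):
        return "heal"
    if state.get("near_objective"):
        return "navigate"
    if state.get("underleveled"):
        return "train"
    return "explore"


def chunk_actions(actions: List[str], size: int = MAX_ACTIONS_PER_CALL) -> List[List[str]]:
    """Split a long action list into batches small enough for one /action call."""
    return [actions[i:i + size] for i in range(0, len(actions), size)]
```

Each batch from `chunk_actions` would be sent as the `actions` array of one POST to `/action`, with the state re-read between batches.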
--- a/skills/leisure/find-nearby/SKILL.md
+++ b/skills/leisure/find-nearby/SKILL.md
@@ -0,0 +1,69 @@
+---
+name: find-nearby
+description: Find nearby places (restaurants, cafes, bars, pharmacies, etc.) using OpenStreetMap. Works with coordinates, addresses, cities, zip codes, or Telegram location pins. No API keys needed.
+version: 1.0.0
+metadata:
+  hermes:
+    tags: [location, maps, nearby, places, restaurants, local]
+    related_skills: []
+---
+
+# Find Nearby — Local Place Discovery
+
+Find restaurants, cafes, bars, pharmacies, and other places near any location. Uses OpenStreetMap (free, no API keys). Works with:
+
+- **Coordinates** from Telegram location pins (latitude/longitude in conversation)
+- **Addresses** ("near 123 Main St, Springfield")
+- **Cities** ("restaurants in downtown Austin")
+- **Zip codes** ("pharmacies near 90210")
+- **Landmarks** ("cafes near Times Square")
+
+## Quick Reference
+
+```bash
+# By coordinates (from Telegram location pin or user-provided)
+python3 SKILL_DIR/scripts/find_nearby.py --lat <LAT> --lon <LON> --type restaurant --radius 1500
+
+# By address, city, or landmark (auto-geocoded)
+python3 SKILL_DIR/scripts/find_nearby.py --near "Times Square, New York" --type cafe
+
+# Multiple place types
+python3 SKILL_DIR/scripts/find_nearby.py --near "downtown austin" --type restaurant --type bar --limit 10
+
+# JSON output
+python3 SKILL_DIR/scripts/find_nearby.py --near "90210" --type pharmacy --json
+```
+
+### Parameters
+
+| Flag | Description | Default |
+|------|-------------|---------|
+| `--lat`, `--lon` | Exact coordinates | — |
+| `--near` | Address, city, zip, or landmark (geocoded) | — |
+| `--type` | Place type (repeatable for multiple) | restaurant |
+| `--radius` | Search radius in meters | 1500 |
+| `--limit` | Max results | 15 |
+| `--json` | Machine-readable JSON output | off |
+
+### Common Place Types
+
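+`restaurant`, `cafe`, `bar`, `pub`, `fast_food`, `pharmacy`, `hospital`, `bank`, `atm`, `fuel`, `parking`. The script matches only the OSM `amenity` key, so values tagged under other keys (`shop=supermarket`, `shop=convenience`, `tourism=hotel`) return no results.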
+
+## Workflow
+
+1. **Get the location.** Look for coordinates (`latitude: ... / longitude: ...`) from a Telegram pin, or ask the user for an address/city/zip.
+
+2. **Ask for preferences** (only if not already stated): place type, how far they're willing to go, any specifics (cuisine, "open now", etc.).
+
+3. **Run the script** with appropriate flags. Use `--json` if you need to process results programmatically.
+
+4. **Present results** with names, distances, and Google Maps links. If the user asked about hours or "open now," check the `hours` field in results — if missing or unclear, verify with `web_search`.
+
+5. **For directions**, use the `directions_url` from results, or construct: `https://www.google.com/maps/dir/?api=1&origin=<ORIGIN_LAT>,<ORIGIN_LON>&destination=<DEST_LAT>,<DEST_LON>`
+
+## Tips
+
+- If results are sparse, widen the radius (1500 → 3000m)
+- For "open now" requests: check the `hours` field in results, cross-reference with `web_search` for accuracy since OSM hours aren't always complete
+- Zip codes alone can be ambiguous globally — prompt the user for country/state if results look wrong
+- The script uses OpenStreetMap data which is community-maintained; coverage varies by region
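When the find-nearby workflow uses `--json`, the payload has the shape `{"origin": ..., "results": [...], "count": N}`, and each result carries `name`, `distance_m`, `maps_url`, and an optional `hours`. A minimal sketch of turning that payload into presentable lines follows; `summarize` is a hypothetical helper, and the place names in the usage example are invented sample data.

```python
import json


def summarize(payload: str, limit: int = 3) -> list:
    """Format the first `limit` results of a find_nearby.py --json payload."""
    data = json.loads(payload)
    lines = []
    for place in data["results"][:limit]:
        meters = place["distance_m"]
        dist = f"{meters}m" if meters < 1000 else f"{meters / 1000:.1f}km"
        # The hours field is optional because OSM opening hours are often missing.
        hours = place.get("hours", "hours unknown")
        lines.append(f"{place['name']} ({dist}, {hours}): {place['maps_url']}")
    return lines


sample = json.dumps({
    "origin": {"lat": 36.17, "lon": -115.14},
    "count": 2,
    "results": [
        {"name": "Example Cafe", "type": "cafe", "distance_m": 420,
         "hours": "Mo-Fr 08:00-18:00", "maps_url": "https://example.invalid/a"},
        {"name": "Sample Diner", "type": "restaurant", "distance_m": 1530,
         "maps_url": "https://example.invalid/b"},
    ],
})
for line in summarize(sample):
    print(line)
```

Results reported as "hours unknown" are the ones worth cross-checking with `web_search` when the user asks about places that are open now.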
--- a/skills/leisure/find-nearby/scripts/find_nearby.py
+++ b/skills/leisure/find-nearby/scripts/find_nearby.py
@@ -0,0 +1,184 @@
+#!/usr/bin/env python3
+"""Find nearby places using OpenStreetMap (Overpass + Nominatim). No API keys needed.
+
+Usage:
+    # By coordinates
+    python find_nearby.py --lat 36.17 --lon -115.14 --type restaurant --radius 1500
+
+    # By address/city/zip (auto-geocoded)
+    python find_nearby.py --near "Times Square, New York" --type cafe --radius 1000
+    python find_nearby.py --near "90210" --type pharmacy
+
+    # Multiple types
+    python find_nearby.py --lat 36.17 --lon -115.14 --type restaurant --type bar
+
+    # JSON output for programmatic use
+    python find_nearby.py --near "downtown las vegas" --type restaurant --json
+"""
+
+import argparse
+import json
+import math
+import sys
+import urllib.parse
+import urllib.request
+from typing import Any
+
+OVERPASS_URLS = [
+    "https://overpass-api.de/api/interpreter",
+    "https://overpass.kumi.systems/api/interpreter",
+]
+NOMINATIM_URL = "https://nominatim.openstreetmap.org/search"
+USER_AGENT = "HermesAgent/1.0 (find-nearby skill)"
+TIMEOUT = 15
+
+
+def _http_get(url: str) -> Any:
+    req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
+    with urllib.request.urlopen(req, timeout=TIMEOUT) as r:
+        return json.loads(r.read())
+
+
+def _http_post(url: str, data: str) -> Any:
+    req = urllib.request.Request(
+        url, data=data.encode(), headers={"User-Agent": USER_AGENT}
+    )
+    with urllib.request.urlopen(req, timeout=TIMEOUT) as r:
+        return json.loads(r.read())
+
+
+def haversine(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
+    """Distance in meters between two coordinates."""
+    R = 6_371_000
+    rlat1, rlat2 = math.radians(lat1), math.radians(lat2)
+    dlat = math.radians(lat2 - lat1)
+    dlon = math.radians(lon2 - lon1)
+    a = math.sin(dlat / 2) ** 2 + math.cos(rlat1) * math.cos(rlat2) * math.sin(dlon / 2) ** 2
+    return R * 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
+
+
+def geocode(query: str) -> tuple[float, float]:
+    """Convert address/city/zip to coordinates via Nominatim."""
+    params = urllib.parse.urlencode({"q": query, "format": "json", "limit": 1})
+    results = _http_get(f"{NOMINATIM_URL}?{params}")
+    if not results:
+        print(f"Error: Could not geocode '{query}'. Try a more specific address.", file=sys.stderr)
+        sys.exit(1)
+    return float(results[0]["lat"]), float(results[0]["lon"])
+
+
+def find_nearby(lat: float, lon: float, types: list[str], radius: int = 1500, limit: int = 15) -> list[dict]:
+    """Query Overpass for nearby amenities."""
+    # Build Overpass QL query
+    type_filters = "".join(
+        f'nwr["amenity"="{t}"](around:{radius},{lat},{lon});' for t in types
+    )
+    query = f"[out:json][timeout:{TIMEOUT}];({type_filters});out center tags;"
+
+    # Try each Overpass server
+    data = None
+    for url in OVERPASS_URLS:
+        try:
+            data = _http_post(url, f"data={urllib.parse.quote(query)}")
+            break
+        except Exception:
+            continue
+
+    if not data:
+        return []
+
+    # Parse results
+    places = []
+    for el in data.get("elements", []):
+        tags = el.get("tags", {})
+        name = tags.get("name")
+        if not name:
+            continue
+
+        # Get coordinates (nodes have lat/lon directly, ways/relations use center)
+        plat = el["lat"] if "lat" in el else (el.get("center") or {}).get("lat")
+        plon = el["lon"] if "lon" in el else (el.get("center") or {}).get("lon")
+        if plat is None or plon is None:
+            continue
+
+        dist = haversine(lat, lon, plat, plon)
+
+        place = {
+            "name": name,
+            "type": tags.get("amenity", ""),
+            "distance_m": round(dist),
+            "lat": plat,
+            "lon": plon,
+            "maps_url": f"https://www.google.com/maps/search/?api=1&query={plat},{plon}",
+            "directions_url": f"https://www.google.com/maps/dir/?api=1&origin={lat},{lon}&destination={plat},{plon}",
+        }
+
+        # Add useful optional fields
+        if tags.get("cuisine"):
+            place["cuisine"] = tags["cuisine"]
+        if tags.get("opening_hours"):
+            place["hours"] = tags["opening_hours"]
+        if tags.get("phone"):
+            place["phone"] = tags["phone"]
+        if tags.get("website"):
+            place["website"] = tags["website"]
+        if tags.get("addr:street"):
+            addr_parts = [tags.get("addr:housenumber", ""), tags.get("addr:street", "")]
+            if tags.get("addr:city"):
+                addr_parts.append(tags["addr:city"])
+            place["address"] = " ".join(p for p in addr_parts if p)
+
+        places.append(place)
+
+    # Sort by distance, limit results
+    places.sort(key=lambda p: p["distance_m"])
+    return places[:limit]
+
+
+def main():
+    parser = argparse.ArgumentParser(description="Find nearby places via OpenStreetMap")
+    parser.add_argument("--lat", type=float, help="Latitude")
+    parser.add_argument("--lon", type=float, help="Longitude")
+    parser.add_argument("--near", type=str, help="Address, city, or zip code (geocoded automatically)")
+    parser.add_argument("--type", action="append", dest="types", default=[], help="Place type (restaurant, cafe, bar, pharmacy, etc.)")
+    parser.add_argument("--radius", type=int, default=1500, help="Search radius in meters (default: 1500)")
+    parser.add_argument("--limit", type=int, default=15, help="Max results (default: 15)")
+    parser.add_argument("--json", action="store_true", dest="json_output", help="Output as JSON")
+    args = parser.parse_args()
+
+    # Resolve coordinates
+    if args.near:
+        lat, lon = geocode(args.near)
+    elif args.lat is not None and args.lon is not None:
+        lat, lon = args.lat, args.lon
+    else:
+        print("Error: Provide --lat/--lon or --near", file=sys.stderr)
+        sys.exit(1)
+
+    if not args.types:
+        args.types = ["restaurant"]
+
+    places = find_nearby(lat, lon, args.types, args.radius, args.limit)
+
+    if args.json_output:
+        print(json.dumps({"origin": {"lat": lat, "lon": lon}, "results": places, "count": len(places)}, indent=2))
+    else:
+        if not places:
+            print(f"No {'/'.join(args.types)} found within {args.radius}m")
+            return
+        print(f"Found {len(places)} places within {args.radius}m:\n")
+        for i, p in enumerate(places, 1):
+            dist_str = f"{p['distance_m']}m" if p["distance_m"] < 1000 else f"{p['distance_m']/1000:.1f}km"
+            print(f"  {i}. {p['name']} ({p['type']}) — {dist_str}")
+            if p.get("cuisine"):
+                print(f"     Cuisine: {p['cuisine']}")
+            if p.get("hours"):
+                print(f"     Hours: {p['hours']}")
+            if p.get("address"):
+                print(f"     Address: {p['address']}")
+            print(f"     Map: {p['maps_url']}")
+            print()
+
+
+if __name__ == "__main__":
+    main()
--- a/skills/mcp/native-mcp/SKILL.md
+++ b/skills/mcp/native-mcp/SKILL.md
@@ -321,6 +321,32 @@ mcp_servers:

 All tools from all servers are registered and available simultaneously. Each server's tools are prefixed with its name to avoid collisions.

+## Sampling (Server-Initiated LLM Requests)
+
+Hermes supports MCP's `sampling/createMessage` capability — MCP servers can request LLM completions through the agent during tool execution. This enables agent-in-the-loop workflows (data analysis, content generation, decision-making).
+
+Sampling is **enabled by default**. Configure per server:
+
+```yaml
+mcp_servers:
+  my_server:
+    command: "npx"
+    args: ["-y", "my-mcp-server"]
+    sampling:
+      enabled: true           # default: true
+      model: "gemini-3-flash" # model override (optional)
+      max_tokens_cap: 4096    # max tokens per request
+      timeout: 30             # LLM call timeout (seconds)
+      max_rpm: 10             # max requests per minute
+      allowed_models: []      # model whitelist (empty = all)
+      max_tool_rounds: 5      # tool loop limit (0 = disable)
+      log_level: "info"       # audit verbosity
+```
+
+Servers can also include `tools` in sampling requests for multi-turn, tool-augmented workflows. The `max_tool_rounds` setting caps the number of tool rounds per request, which prevents infinite tool loops. Per-server audit metrics (requests, errors, tokens, tool use count) are tracked via `get_mcp_status()`.
+
+Disable sampling for untrusted servers with `sampling: { enabled: false }`.
+
 ## Notes

 - MCP tools are called synchronously from the agent's perspective but run asynchronously on a dedicated background event loop
--- a/skills/media/DESCRIPTION.md
+++ b/skills/media/DESCRIPTION.md
@@ -1 +1,3 @@
-Media content extraction and transformation tools — YouTube transcripts, audio, video processing.
+---
+description: Skills for working with media content — YouTube transcripts, GIF search, music generation, and audio visualization.
+---
--- a/skills/media/gif-search/SKILL.md
+++ b/skills/media/gif-search/SKILL.md
--- a/skills/music-creation/heartmula/SKILL.md
+++ b/skills/music-creation/heartmula/SKILL.md
--- a/skills/music-creation/songsee/SKILL.md
+++ b/skills/music-creation/songsee/SKILL.md
--- a/skills/mlops/cloud/DESCRIPTION.md
+++ b/skills/mlops/cloud/DESCRIPTION.md
@@ -0,0 +1,3 @@
+---
+description: GPU cloud providers and serverless compute platforms for ML workloads.
+---
--- a/skills/mlops/cloud/lambda-labs/SKILL.md
+++ b/skills/mlops/cloud/lambda-labs/SKILL.md
--- a/skills/mlops/cloud/lambda-labs/references/advanced-usage.md
+++ b/skills/mlops/cloud/lambda-labs/references/advanced-usage.md
--- a/skills/mlops/cloud/lambda-labs/references/troubleshooting.md
+++ b/skills/mlops/cloud/lambda-labs/references/troubleshooting.md
--- a/skills/mlops/cloud/modal/SKILL.md
+++ b/skills/mlops/cloud/modal/SKILL.md
--- a/skills/mlops/cloud/modal/references/advanced-usage.md
+++ b/skills/mlops/cloud/modal/references/advanced-usage.md
--- a/skills/mlops/cloud/modal/references/troubleshooting.md
+++ b/skills/mlops/cloud/modal/references/troubleshooting.md
--- a/skills/mlops/evaluation/DESCRIPTION.md
+++ b/skills/mlops/evaluation/DESCRIPTION.md
@@ -0,0 +1,3 @@
+---
+description: Model evaluation benchmarks, experiment tracking, data curation, tokenizers, and interpretability tools.
+---
--- a/skills/mlops/evaluation/huggingface-tokenizers/SKILL.md
+++ b/skills/mlops/evaluation/huggingface-tokenizers/SKILL.md
--- a/skills/mlops/evaluation/huggingface-tokenizers/references/algorithms.md
+++ b/skills/mlops/evaluation/huggingface-tokenizers/references/algorithms.md
--- a/skills/mlops/evaluation/huggingface-tokenizers/references/integration.md
+++ b/skills/mlops/evaluation/huggingface-tokenizers/references/integration.md
--- a/skills/mlops/evaluation/huggingface-tokenizers/references/pipeline.md
+++ b/skills/mlops/evaluation/huggingface-tokenizers/references/pipeline.md
--- a/skills/mlops/evaluation/huggingface-tokenizers/references/training.md
+++ b/skills/mlops/evaluation/huggingface-tokenizers/references/training.md
--- a/skills/mlops/evaluation/lm-evaluation-harness/SKILL.md
+++ b/skills/mlops/evaluation/lm-evaluation-harness/SKILL.md
--- a/skills/mlops/evaluation/lm-evaluation-harness/references/api-evaluation.md
+++ b/skills/mlops/evaluation/lm-evaluation-harness/references/api-evaluation.md
--- a/skills/mlops/evaluation/lm-evaluation-harness/references/benchmark-guide.md
+++ b/skills/mlops/evaluation/lm-evaluation-harness/references/benchmark-guide.md
--- a/skills/mlops/evaluation/lm-evaluation-harness/references/custom-tasks.md
+++ b/skills/mlops/evaluation/lm-evaluation-harness/references/custom-tasks.md
--- a/skills/mlops/evaluation/lm-evaluation-harness/references/distributed-eval.md
+++ b/skills/mlops/evaluation/lm-evaluation-harness/references/distributed-eval.md
--- a/skills/mlops/evaluation/nemo-curator/SKILL.md
+++ b/skills/mlops/evaluation/nemo-curator/SKILL.md
--- a/skills/mlops/evaluation/nemo-curator/references/deduplication.md
+++ b/skills/mlops/evaluation/nemo-curator/references/deduplication.md
--- a/skills/mlops/evaluation/nemo-curator/references/filtering.md
+++ b/skills/mlops/evaluation/nemo-curator/references/filtering.md
--- a/skills/mlops/evaluation/saelens/SKILL.md
+++ b/skills/mlops/evaluation/saelens/SKILL.md
--- a/skills/mlops/evaluation/saelens/references/README.md
+++ b/skills/mlops/evaluation/saelens/references/README.md
--- a/skills/mlops/evaluation/saelens/references/api.md
+++ b/skills/mlops/evaluation/saelens/references/api.md
--- a/skills/mlops/evaluation/saelens/references/tutorials.md
+++ b/skills/mlops/evaluation/saelens/references/tutorials.md
--- a/skills/mlops/evaluation/weights-and-biases/SKILL.md
+++ b/skills/mlops/evaluation/weights-and-biases/SKILL.md
--- a/skills/mlops/evaluation/weights-and-biases/references/artifacts.md
+++ b/skills/mlops/evaluation/weights-and-biases/references/artifacts.md
--- a/skills/mlops/evaluation/weights-and-biases/references/integrations.md
+++ b/skills/mlops/evaluation/weights-and-biases/references/integrations.md
--- a/skills/mlops/evaluation/weights-and-biases/references/sweeps.md
+++ b/skills/mlops/evaluation/weights-and-biases/references/sweeps.md
--- a/skills/mlops/inference/DESCRIPTION.md
+++ b/skills/mlops/inference/DESCRIPTION.md
@@ -0,0 +1,3 @@
+---
+description: Model serving, quantization (GGUF/GPTQ), structured output, inference optimization, and model surgery tools for deploying and running LLMs.
+---
--- a/skills/mlops/inference/gguf/SKILL.md
+++ b/skills/mlops/inference/gguf/SKILL.md
--- a/skills/mlops/inference/gguf/references/advanced-usage.md
+++ b/skills/mlops/inference/gguf/references/advanced-usage.md
--- a/skills/mlops/inference/gguf/references/troubleshooting.md
+++ b/skills/mlops/inference/gguf/references/troubleshooting.md
--- a/skills/mlops/inference/guidance/SKILL.md
+++ b/skills/mlops/inference/guidance/SKILL.md
--- a/skills/mlops/inference/guidance/references/backends.md
+++ b/skills/mlops/inference/guidance/references/backends.md
--- a/skills/mlops/inference/guidance/references/constraints.md
+++ b/skills/mlops/inference/guidance/references/constraints.md
--- a/skills/mlops/inference/guidance/references/examples.md
+++ b/skills/mlops/inference/guidance/references/examples.md
--- a/skills/mlops/inference/instructor/SKILL.md
+++ b/skills/mlops/inference/instructor/SKILL.md
--- a/skills/mlops/inference/instructor/references/examples.md
+++ b/skills/mlops/inference/instructor/references/examples.md
--- a/skills/mlops/inference/instructor/references/providers.md
+++ b/skills/mlops/inference/instructor/references/providers.md
--- a/skills/mlops/inference/instructor/references/validation.md
+++ b/skills/mlops/inference/instructor/references/validation.md
--- a/skills/mlops/inference/llama-cpp/SKILL.md
+++ b/skills/mlops/inference/llama-cpp/SKILL.md
--- a/skills/mlops/inference/llama-cpp/references/optimization.md
+++ b/skills/mlops/inference/llama-cpp/references/optimization.md
--- a/skills/mlops/inference/llama-cpp/references/quantization.md
+++ b/skills/mlops/inference/llama-cpp/references/quantization.md
--- a/skills/mlops/inference/llama-cpp/references/server.md
+++ b/skills/mlops/inference/llama-cpp/references/server.md
--- a/skills/mlops/inference/obliteratus/SKILL.md
+++ b/skills/mlops/inference/obliteratus/SKILL.md
@@ -0,0 +1,330 @@
+---
+name: obliteratus
+description: Remove refusal behaviors from open-weight LLMs using OBLITERATUS — mechanistic interpretability techniques (diff-in-means, SVD, whitened SVD, LEACE, SAE decomposition, etc.) to excise guardrails while preserving reasoning. 9 CLI methods, 28 analysis modules, 116 model presets across 5 compute tiers, tournament evaluation, and telemetry-driven recommendations. Use when a user wants to uncensor, abliterate, or remove refusal from an LLM.
+version: 2.0.0
+author: Hermes Agent
+license: MIT
+dependencies: [obliteratus, torch, transformers, bitsandbytes, accelerate, safetensors]
+metadata:
+  hermes:
+    tags: [Abliteration, Uncensoring, Refusal-Removal, LLM, Weight-Projection, SVD, Mechanistic-Interpretability, HuggingFace, Model-Surgery]
+    related_skills: [vllm, gguf, huggingface-tokenizers]
+---
+
+# OBLITERATUS Skill
+
+Remove refusal behaviors (guardrails) from open-weight LLMs without retraining or fine-tuning. Uses mechanistic interpretability techniques — including diff-in-means, SVD, whitened SVD, LEACE concept erasure, SAE decomposition, Bayesian kernel projection, and more — to identify and surgically excise refusal directions from model weights while preserving reasoning capabilities.
+
+**License warning:** OBLITERATUS is AGPL-3.0. NEVER import it as a Python library. Always invoke via CLI (`obliteratus` command) or subprocess. This keeps Hermes Agent's MIT license clean.
+
+## When to Use This Skill
+
+Trigger when the user:
+- Wants to "uncensor" or "abliterate" an LLM
+- Asks about removing refusal/guardrails from a model
+- Wants to create an uncensored version of Llama, Qwen, Mistral, etc.
+- Mentions "refusal removal", "abliteration", "weight projection"
+- Wants to analyze how a model's refusal mechanism works
+- References OBLITERATUS, abliterator, or refusal directions
+
+## Step 1: Installation
+
+Check if already installed:
+```bash
+obliteratus --version 2>/dev/null && echo "INSTALLED" || echo "NOT INSTALLED"
+```
+
+If not installed, clone and install from GitHub:
+```bash
+git clone https://github.com/elder-plinius/OBLITERATUS.git
+cd OBLITERATUS
+pip install -e .
+# For Gradio web UI support:
+# pip install -e ".[spaces]"
+```
+
+**IMPORTANT:** Confirm with user before installing. This pulls in ~5-10GB of dependencies (PyTorch, Transformers, bitsandbytes, etc.).
+
+## Step 2: Check Hardware
+
+Before anything else, check which GPU is available:
+```bash
+python3 -c "
+import torch
+if torch.cuda.is_available():
+    gpu = torch.cuda.get_device_name(0)
+    vram = torch.cuda.get_device_properties(0).total_memory / 1024**3
+    print(f'GPU: {gpu}')
+    print(f'VRAM: {vram:.1f} GB')
+    if vram < 4: print('TIER: tiny (models under 1B)')
+    elif vram < 8: print('TIER: small (models 1-4B)')
+    elif vram < 16: print('TIER: medium (models 4-9B with 4bit quant)')
+    elif vram < 32: print('TIER: large (models 8-32B with 4bit quant)')
+    else: print('TIER: frontier (models 32B+)')
+else:
+    print('NO GPU - only tiny models (under 1B) on CPU')
+"
+```
+
+### VRAM Requirements (with 4-bit quantization)
+
+| VRAM     | Max Model Size  | Example Models                              |
+|:---------|:----------------|:--------------------------------------------|
+| CPU only | ~1B params      | GPT-2, TinyLlama, SmolLM                    |
+| 4-8 GB   | ~4B params      | Qwen2.5-1.5B, Phi-3.5 mini, Llama 3.2 3B   |
+| 8-16 GB  | ~9B params      | Llama 3.1 8B, Mistral 7B, Gemma 2 9B       |
+| 24 GB    | ~32B params     | Qwen3-32B, Command-R                        |
+| 48 GB+   | ~72B+ params    | Llama 3.1 70B, Qwen2.5-72B, DeepSeek-R1     |
+| Multi-GPU| 200B+ params    | Llama 3.1 405B, DeepSeek-V3 (685B MoE)      |
+
+## Step 3: Browse Available Models & Get Recommendations
+
+```bash
+# Browse models by compute tier
+obliteratus models --tier medium
+
+# Get architecture info for a specific model
+obliteratus info <model_name>
+
+# Get telemetry-driven recommendation for best method & params
+obliteratus recommend <model_name>
+obliteratus recommend <model_name> --insights  # global cross-architecture rankings
+```
+
+## Step 4: Choose a Method
+
+### Method Selection Guide
+**Default / recommended for most cases: `advanced`.** It uses multi-direction SVD with norm-preserving projection and is well-tested.
+
+| Situation                         | Recommended Method | Why                                      |
+|:----------------------------------|:-------------------|:-----------------------------------------|
+| Default / most models             | `advanced`         | Multi-direction SVD, norm-preserving, reliable |
+| Quick test / prototyping          | `basic`            | Fast, simple, good enough to evaluate    |
+| Dense model (Llama, Mistral)      | `advanced`         | Multi-direction, norm-preserving         |
+| MoE model (DeepSeek, Mixtral)     | `nuclear`          | Expert-granular, handles MoE complexity  |
+| Reasoning model (R1 distills)     | `surgical`         | CoT-aware, preserves chain-of-thought    |
+| Stubborn refusals persist         | `aggressive`       | Whitened SVD + head surgery + jailbreak   |
+| Want reversible changes           | steering vectors   | Inference-time steering, no permanent weight changes (see Analysis Modules) |
+| Maximum quality, time no object   | `optimized`        | Bayesian search for best parameters      |
+| Experimental auto-detection       | `informed`         | Auto-detects alignment type — experimental, may not always outperform advanced |
+
+### 9 CLI Methods
+- **basic** — Single refusal direction via diff-in-means. Fast (~5-10 min for 8B).
+- **advanced** (DEFAULT, RECOMMENDED) — Multiple SVD directions, norm-preserving projection, 2 refinement passes. Medium speed (~10-20 min).
+- **aggressive** — Whitened SVD + jailbreak-contrastive + attention head surgery. Higher risk of coherence damage.
+- **spectral_cascade** — DCT frequency-domain decomposition. Research/novel approach.
+- **informed** — Runs analysis DURING abliteration to auto-configure. Experimental — slower and less predictable than advanced.
+- **surgical** — SAE features + neuron masking + head surgery + per-expert. Very slow (~1-2 hrs). Best for reasoning models.
+- **optimized** — Bayesian hyperparameter search (Optuna TPE). Longest runtime but finds optimal parameters.
+- **inverted** — Flips the refusal direction. Model becomes actively willing.
+- **nuclear** — Maximum force combo for stubborn MoE models. Expert-granular.
+
+### Direction Extraction Methods (--direction-method flag)
+- **diff_means** (default) — Simple difference-in-means between refused/complied activations. Robust.
+- **svd** — Multi-direction SVD extraction. Better for complex alignment.
+- **leace** — LEACE (Linear Erasure via Closed-form Estimation). Optimal linear erasure.
+
+### 4 Python-API-Only Methods
+(Not available via the CLI. These require importing OBLITERATUS as a Python library, which crosses the AGPL boundary described above. Mention them to the user only if they explicitly want to use OBLITERATUS as a library in their own AGPL project.)
+- failspy, gabliteration, heretic, rdo
+
+## Step 5: Run Abliteration
+
+### Standard usage
+```bash
+# Default method (advanced) — recommended for most models
+obliteratus obliterate <model_name> --method advanced --output-dir ./abliterated-models
+
+# With 4-bit quantization (saves VRAM)
+obliteratus obliterate <model_name> --method advanced --quantization 4bit --output-dir ./abliterated-models
+
+# Large models (70B+) — conservative defaults
+obliteratus obliterate <model_name> --method advanced --quantization 4bit --large-model --output-dir ./abliterated-models
+```
+
+### Tuning parameters
+```bash
+obliteratus obliterate <model_name> \
+  --method advanced \
+  --direction-method diff_means \
+  --n-directions 4 \
+  --refinement-passes 2 \
+  --regularization 0.1 \
+  --quantization 4bit \
+  --output-dir ./abliterated-models \
+  --contribute  # opt-in telemetry for community research
+```
+
+### Key flags
+| Flag | Description | Default |
+|:-----|:------------|:--------|
+| `--method` | Abliteration method | advanced |
+| `--direction-method` | Direction extraction | diff_means |
+| `--n-directions` | Number of refusal directions (1-32) | method-dependent |
+| `--refinement-passes` | Iterative passes (1-5) | 2 |
+| `--regularization` | Regularization strength (0.0-1.0) | 0.1 |
+| `--quantization` | Load in 4bit or 8bit | none (full precision) |
+| `--large-model` | Conservative defaults for 120B+ | false |
+| `--output-dir` | Where to save the abliterated model | ./obliterated_model |
+| `--contribute` | Share anonymized results for research | false |
+| `--verify-sample-size` | Number of test prompts for refusal check | 20 |
+| `--dtype` | Model dtype (float16, bfloat16) | auto |
+
+### Other execution modes
+```bash
+# Interactive guided mode (hardware → model → preset)
+obliteratus interactive
+
+# Web UI (Gradio)
+obliteratus ui --port 7860
+
+# Run a full ablation study from YAML config
+obliteratus run config.yaml --preset quick
+
+# Tournament: pit all methods against each other
+obliteratus tourney <model_name>
+```
+
+## Step 6: Verify Results
+
+After abliteration, check the output metrics:
+
+| Metric | Good Value | Warning |
+|:-------|:-----------|:--------|
+| Refusal rate | < 5% (ideally ~0%) | > 10% means refusals persist |
+| Perplexity change | < 10% increase | > 15% means coherence damage |
+| KL divergence | < 0.1 | > 0.5 means significant distribution shift |
+| Coherence | High / passes qualitative check | Degraded responses, repetition |
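+
+The perplexity and KL rows above can be read with the standard definitions below. This is a sketch, assuming the tool reports a relative perplexity change and a KL divergence between the original model's next-token distribution `P` and the abliterated model's distribution `Q`; the symbols are illustrative and not taken from OBLITERATUS itself.
+
+```latex
+\Delta_{\mathrm{PPL}} = \frac{\mathrm{PPL}_{\text{after}} - \mathrm{PPL}_{\text{before}}}{\mathrm{PPL}_{\text{before}}} \times 100\%
+\qquad
+D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}
+```
+
+For example, a perplexity of 8.0 before and 8.6 after is a 7.5% increase, which falls in the "good" range, while 8.0 to 9.4 is a 17.5% increase, which is past the warning threshold.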
+
+### If refusals persist (> 10%)
+1. Try `aggressive` method
+2. Increase `--n-directions` (e.g., 8 or 16)
+3. Add `--refinement-passes 3`
+4. Try `--direction-method svd` instead of diff_means
+
+### If coherence is damaged (perplexity > 15% increase)
+1. Reduce `--n-directions` (try 2)
+2. Increase `--regularization` (try 0.3)
+3. Reduce `--refinement-passes` to 1
+4. Try `basic` method (gentler)
+
+## Step 7: Use the Abliterated Model
+
+The output is a standard HuggingFace model directory.
+
+```bash
+# Test locally with transformers
+python3 -c "
+from transformers import AutoModelForCausalLM, AutoTokenizer
+model = AutoModelForCausalLM.from_pretrained('./abliterated-models/<model>')
+tokenizer = AutoTokenizer.from_pretrained('./abliterated-models/<model>')
+inputs = tokenizer('How do I pick a lock?', return_tensors='pt')
+outputs = model.generate(**inputs, max_new_tokens=200)
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+"
+
+# Upload to HuggingFace Hub
+huggingface-cli upload <username>/<model-name>-abliterated ./abliterated-models/<model>
+
+# Serve with vLLM
+vllm serve ./abliterated-models/<model>
+```
+
+## CLI Command Reference
+
+| Command | Description |
+|:--------|:------------|
+| `obliteratus obliterate` | Main abliteration command |
+| `obliteratus info <model>` | Print model architecture details |
+| `obliteratus models --tier <tier>` | Browse curated models by compute tier |
+| `obliteratus recommend <model>` | Telemetry-driven method/param suggestion |
+| `obliteratus interactive` | Guided setup wizard |
+| `obliteratus tourney <model>` | Tournament: all methods head-to-head |
+| `obliteratus run <config.yaml>` | Execute ablation study from YAML |
+| `obliteratus strategies` | List all registered ablation strategies |
+| `obliteratus report <results.json>` | Regenerate visual reports |
+| `obliteratus ui` | Launch Gradio web interface |
+| `obliteratus aggregate` | Summarize community telemetry data |
+
+## Analysis Modules
+
+OBLITERATUS includes 28 analysis modules for mechanistic interpretability.
+See `skill_view(name="obliteratus", file_path="references/analysis-modules.md")` for the full reference.
+
+### Quick analysis commands
+```bash
+# Run specific analysis modules
+obliteratus run analysis-config.yaml --preset quick
+
+# Key modules to run first:
+# - alignment_imprint: Fingerprint DPO/RLHF/CAI/SFT alignment method
+# - concept_geometry: Single direction vs polyhedral cone
+# - logit_lens: Which layer decides to refuse
+# - anti_ouroboros: Self-repair risk score
+# - causal_tracing: Causally necessary components
+```
+
+### Steering Vectors (Reversible Alternative)
+Instead of permanent weight modification, use inference-time steering:
+```python
+# Python API only — for user's own projects
+from obliteratus.analysis.steering_vectors import SteeringVectorFactory, SteeringHookManager
+```
+
+## Ablation Strategies
+
+Beyond direction-based abliteration, OBLITERATUS includes structural ablation strategies:
+- **Embedding Ablation** — Target embedding layer components
+- **FFN Ablation** — Feed-forward network block removal
+- **Head Pruning** — Attention head pruning
+- **Layer Removal** — Full layer removal
+
+List all available: `obliteratus strategies`
+
+## Evaluation
+
+OBLITERATUS includes built-in evaluation tools:
+- Refusal rate benchmarking
+- Perplexity comparison (before/after)
+- LM Eval Harness integration for academic benchmarks
+- Head-to-head competitor comparison
+- Baseline performance tracking
+
+## Platform Support
+
+- **CUDA** — Full support (NVIDIA GPUs)
+- **Apple Silicon (MLX)** — Supported via MLX backend
+- **CPU** — Supported for tiny models (< 1B params)
+
+## YAML Config Templates
+
+Load templates for reproducible runs via `skill_view`:
+- `templates/abliteration-config.yaml` — Standard single-model config
+- `templates/analysis-study.yaml` — Pre-abliteration analysis study
+- `templates/batch-abliteration.yaml` — Multi-model batch processing
+
+## Telemetry
+
+OBLITERATUS can optionally contribute anonymized run data to a global research dataset.
+Enable with `--contribute` flag. No personal data is collected — only model name, method, metrics.
+
+## Common Pitfalls
+
+1. **Don't use `informed` as default** — it's experimental and slower. Use `advanced` for reliable results.
+2. **Models under ~1B respond poorly to abliteration** — their refusal behaviors are shallow and fragmented, making clean direction extraction difficult. Expect partial results (20-40% remaining refusal). Models 3B+ have cleaner refusal directions and respond much better (often 0% refusal with `advanced`).
+3. **`aggressive` can make things worse** — on small models it can damage coherence and actually increase refusal rate. Only use it if `advanced` leaves > 10% refusals on a 3B+ model.
+4. **Always check perplexity** — if it spikes > 15%, the model is damaged. Reduce aggressiveness.
+5. **MoE models need special handling** — use `nuclear` method for Mixtral, DeepSeek-MoE, etc.
+6. **Quantized models can't be re-quantized** — abliterate the full-precision model, then quantize the output.
+7. **VRAM estimation is approximate** — 4-bit quant helps but peak usage can spike during extraction.
+8. **Reasoning models are sensitive** — use `surgical` for R1 distills to preserve chain-of-thought.
+9. **Check `obliteratus recommend`** — telemetry data may have better parameters than defaults.
+10. **AGPL license** — never `import obliteratus` in MIT/Apache projects. CLI invocation only.
+11. **Large models (70B+)** — always use `--large-model` flag for conservative defaults.
+12. **Spectral certification RED is common** — the spectral check often flags "incomplete" even when practical refusal rate is 0%. Check actual refusal rate rather than relying on spectral certification alone.
+
+## Complementary Skills
+
+- **vllm** — Serve abliterated models with high throughput
+- **gguf** — Convert abliterated models to GGUF for llama.cpp
+- **huggingface-tokenizers** — Work with model tokenizers
--- a/skills/mlops/inference/obliteratus/references/analysis-modules.md
+++ b/skills/mlops/inference/obliteratus/references/analysis-modules.md
@@ -0,0 +1,166 @@
+# OBLITERATUS Analysis Modules — Reference
+
+OBLITERATUS includes 28 analysis modules for mechanistic interpretability of refusal in LLMs.
+These modules help understand how and where refusal behaviors are encoded before performing abliteration.
+
+---
+
+## Core Analysis (Run These First)
+
+### 1. Alignment Imprint Detection (`alignment_imprint.py`)
+Fingerprints whether a model was trained via DPO, RLHF, CAI, or SFT.
+This determines which extraction strategy will work best.
+
+### 2. Concept Cone Geometry (`concept_geometry.py`)
+Determines if refusal is a single linear direction or a polyhedral cone
+(set of multiple mechanisms). Single-direction models respond well to `basic`;
+polyhedral models need `advanced` or `surgical`.
+
+### 3. Refusal Logit Lens (`logit_lens.py`)
+Identifies the specific layer where a model "decides" to refuse by decoding
+intermediate layer representations into token space.
+
+### 4. Ouroboros Detection (`anti_ouroboros.py`)
+Identifies if a model attempts to "self-repair" refusal behaviors after
+excision. Reports a risk score (0-1). High scores mean additional refinement
+passes are needed.
+
+### 5. Causal Tracing (`causal_tracing.py`)
+Identifies which components (layers, heads, MLPs) are causally necessary
+for refusal behavior using activation patching.
+
+---
+
+## Geometric Analysis
+
+### 6. Cross-Layer Alignment (`cross_layer.py`)
+Measures how refusal directions align across different layers. High alignment
+means the refusal signal is consistent; low alignment suggests layer-specific
+mechanisms.
+
+### 7. Residual Stream Decomposition (`residual_stream.py`)
+Decomposes the residual stream into attention and MLP contributions to
+understand which component type contributes more to refusal.
+
+### 8. Riemannian Manifold Geometry (`riemannian_manifold.py`)
+Analyzes the curvature and geometry of the weight manifold near refusal
+directions. Informs how aggressively projections can be applied without
+damaging the manifold structure.
+
+### 9. Whitened SVD (`whitened_svd.py`)
+Covariance-normalized SVD extraction that separates guardrail signals from
+natural activation variance. More precise than standard SVD for models with
+high activation variance.
+
+### 10. Concept Cone Geometry (extended)
+Maps the full polyhedral structure of refusal, including cone angles,
+face counts, and intersection patterns.
+
+---
+
+## Probing & Classification
+
+### 11. Activation Probing (`activation_probing.py`)
+Post-excision verification — probes for residual refusal concepts after
+abliteration to ensure complete removal.
+
+### 12. Probing Classifiers (`probing_classifiers.py`)
+Trains linear classifiers to detect refusal in activations. Used both
+before (to verify refusal exists) and after (to verify it's gone).
+
+### 13. Activation Patching (`activation_patching.py`)
+Interchange interventions — swaps activations between refused and complied
+runs to identify causal components.
+
+### 14. Tuned Lens (`tuned_lens.py`)
+Trained version of logit lens that provides more accurate per-layer
+decoding by learning affine transformations for each layer.
+
+### 15. Multi-Token Position Analysis (`multi_token_position.py`)
+Analyzes refusal signals across multiple token positions, not just the
+last token. Important for models that distribute refusal across the sequence.
+
+---
+
+## Abliteration & Manipulation
+
+### 16. SAE-Based Abliteration (`sae_abliteration.py`)
+Uses Sparse Autoencoder features to identify and remove specific refusal
+features. More surgical than direction-based methods.
+
+### 17. Steering Vectors (`steering_vectors.py`)
+Creates and applies inference-time steering vectors for reversible refusal
+modification. Includes `SteeringVectorFactory` and `SteeringHookManager`.
+
+### 18. LEACE Concept Erasure (`leace.py`)
+Linear Erasure via Closed-form Estimation — mathematically optimal linear
+concept removal. Available as both analysis module and direction extraction method.
+
+### 19. Sparse Surgery (`sparse_surgery.py`)
+High-precision weight modification targeting individual neurons and
+weight matrix entries rather than full directions.
+
+### 20. Conditional Abliteration (`conditional_abliteration.py`)
+Targeted removal that only affects specific refusal categories while
+preserving others (e.g., remove weapons refusal but keep CSAM refusal).
+
+---
+
+## Transfer & Robustness
+
+### 21. Cross-Model Transfer (`cross_model_transfer.py`)
+Tests whether refusal directions extracted from one model transfer to
+another architecture. Measures universality of guardrail directions.
+
+### 22. Defense Robustness (`defense_robustness.py`)
+Evaluates how robust the abliteration is against various defense mechanisms
+and re-alignment attempts.
+
+### 23. Spectral Certification (`spectral_certification.py`)
+Provides mathematical bounds on the completeness of refusal removal
+using spectral analysis of the projection.
+
+### 24. Wasserstein Optimal Extraction (`wasserstein_optimal.py`)
+Uses optimal transport theory for more precise direction extraction
+that minimizes distribution shift.
+
+### 25. Wasserstein Transfer (`wasserstein_transfer.py`)
+Distribution transfer between models using Wasserstein distance
+for cross-architecture refusal direction mapping.
+
+---
+
+## Advanced / Research
+
+### 26. Bayesian Kernel Projection (`bayesian_kernel_projection.py`)
+Probabilistic feature mapping that estimates uncertainty in refusal
+direction identification.
+
+### 27. Cross-Model Universality Index
+Measures if guardrail directions generalize across different model
+architectures and training regimes.
+
+### 28. Visualization (`visualization.py`)
+Plotting and graphing utilities for all analysis modules. Generates
+heatmaps, direction plots, and layer-wise analysis charts.
+
+---
+
+## Running Analysis
+
+### Via CLI
+```bash
+# Run analysis from a YAML config
+obliteratus run analysis-study.yaml --preset quick
+
+# Available study presets:
+# quick     — Fast sanity check (2-3 modules)
+# full      — All core + geometric analysis
+# jailbreak — Refusal circuit localization
+# knowledge — Knowledge preservation analysis
+# robustness — Stress testing / defense evaluation
+```
+
+### Via YAML Config
+See the `templates/analysis-study.yaml` template for a complete example.
+Load with: `skill_view(name="obliteratus", file_path="templates/analysis-study.yaml")`
--- a/skills/mlops/inference/obliteratus/references/methods-guide.md
+++ b/skills/mlops/inference/obliteratus/references/methods-guide.md
@@ -0,0 +1,141 @@
+# OBLITERATUS Methods — Detailed Guide
+
+> The CLI accepts 9 methods via `--method`: basic, advanced, aggressive, spectral_cascade,
+> informed, surgical, optimized, inverted, nuclear.
+> Four additional methods (failspy, gabliteration, heretic, rdo) are available only via the Python API.
+
+## How Abliteration Works (Theory)
+
+Abliteration identifies a "refusal direction" — a vector in the model's activation space that
+corresponds to refusal behavior — and projects it out of the weight matrices.
+
+Mathematically: `W_new = W_old - (W_old @ d @ d.T)`, where `d` is the refusal direction written as a unit-norm column vector.
+
+The key challenge is finding accurate refusal directions without damaging other capabilities.
+
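As a minimal sketch of the projection formula above, the snippet below removes one direction from a small weight matrix in pure Python. It assumes the rows of `W` live in the same space as `d` and normalizes `d` internally; the helper name `project_out` is illustrative and is not part of the OBLITERATUS API.

```python
import math


def project_out(W, d):
    """Return W - W d d^T for a direction d (normalized internally)."""
    norm = math.sqrt(sum(x * x for x in d))
    u = [x / norm for x in d]  # unit-length refusal direction
    W_new = []
    for row in W:
        # Component of this row along u.
        coeff = sum(w * x for w, x in zip(row, u))
        # Subtract that component so the row becomes orthogonal to u.
        W_new.append([w - coeff * x for w, x in zip(row, u)])
    return W_new


W = [[1.0, 2.0], [3.0, 4.0]]
d = [1.0, 0.0]  # hypothetical refusal direction
W_new = project_out(W, d)
```

After the call, every row of `W_new` has zero component along `d`, while the components orthogonal to `d` are unchanged.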
+---
+
+## Direction Extraction Methods
+
+Before projecting, OBLITERATUS extracts refusal directions using one of three methods:
+
+| Method | Flag | Description | Best For |
+|:-------|:-----|:------------|:---------|
+| Diff-in-Means | `--direction-method diff_means` | Difference between mean activations on refused vs. complied prompts | Default, fast, robust |
+| SVD | `--direction-method svd` | Multi-direction extraction via Singular Value Decomposition | Complex alignment, multiple refusal mechanisms |
+| LEACE | `--direction-method leace` | Linear Erasure via Closed-form Estimation — mathematically optimal | Maximum precision, research |
+
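The diff-in-means row of the table can be sketched as follows. This is an illustration under simplifying assumptions: activations are plain lists of floats collected at a single layer, and the function name `diff_in_means` is hypothetical rather than taken from the OBLITERATUS code.

```python
import math


def diff_in_means(refused, complied):
    """Unit vector pointing from the complied mean to the refused mean."""
    dim = len(refused[0])
    mean_r = [sum(v[i] for v in refused) / len(refused) for i in range(dim)]
    mean_c = [sum(v[i] for v in complied) / len(complied) for i in range(dim)]
    diff = [r - c for r, c in zip(mean_r, mean_c)]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]


# Toy activations: refused prompts differ from complied ones only on axis 0.
refused = [[2.0, 1.0], [4.0, 1.0]]
complied = [[0.0, 1.0], [0.0, 1.0]]
direction = diff_in_means(refused, complied)
```

In this toy case the extracted direction lies entirely along the first axis, which is the only axis on which the two groups differ.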
+---
+
+## Method Details
+
+### basic
+- **Directions:** 1 (single diff-in-means vector)
+- **Speed:** Fast (~5-10 min for 8B model)
+- **Risk:** Low
+- **Use case:** Quick tests, prototyping, evaluating if abliteration works for a model
+- **How it works:** Extracts one refusal direction and projects it out uniformly across all layers.
+
+### advanced (DEFAULT — RECOMMENDED)
+- **Directions:** 4 (multi-direction SVD)
+- **Speed:** Medium (~10-20 min for 8B model)
+- **Risk:** Low-Medium
+- **Refinement passes:** 2
+- **Use case:** Default for most models. Well-tested and reliable.
+- **How it works:** Extracts multiple refusal directions via SVD, applies norm-preserving bi-projection to maintain weight matrix norms. Two refinement passes catch residual refusal.
+
+### aggressive
+- **Directions:** 8+ (whitened SVD + jailbreak-contrastive)
+- **Speed:** Medium-Slow
+- **Risk:** Medium-High (may damage coherence)
+- **Use case:** When `advanced` leaves > 10% refusals. Stubborn models.
+- **How it works:** Uses whitened SVD for covariance-normalized extraction, adds jailbreak-contrastive directions, performs attention head surgery on the most refusal-active heads.
+
+### spectral_cascade
+- **Speed:** Medium
+- **Risk:** Medium
+- **Use case:** Research, novel approaches
+- **How it works:** DCT (Discrete Cosine Transform) frequency-domain decomposition of refusal signals. Separates high-frequency (surface-level) from low-frequency (deep) refusal patterns.
+
+### informed (EXPERIMENTAL)
+- **Speed:** Slow (~20-40 min for 8B model)
+- **Risk:** Variable — results depend on analysis quality
+- **Use case:** When you want auto-configuration. This method is experimental and may not outperform `advanced`.
+- **How it works:** Runs 4 analysis modules first (alignment imprint, concept geometry, logit lens, ouroboros detection), then auto-configures extraction strategy. Includes an "Ouroboros loop" that detects and counteracts self-repair.
+- **Note:** The auto-detection can sometimes misconfigure. If results are poor, fall back to `advanced`.
+
+### surgical
+- **Speed:** Very slow (~1-2 hrs for 8B model)
+- **Risk:** Low (very precise)
+- **Use case:** Reasoning models (R1 distills, QwQ, etc.) where chain-of-thought must be preserved.
+- **How it works:** Uses SAE (Sparse Autoencoder) features + individual neuron masking + attention head surgery + per-expert decomposition (for MoE). CoT-aware — identifies and protects reasoning-critical directions before projecting.
+
+### optimized
+- **Speed:** Very slow (hours — runs many trials)
+- **Risk:** Low (finds optimal parameters)
+- **Use case:** When quality matters more than speed. Production models.
+- **How it works:** Bayesian hyperparameter search via Optuna TPE sampler. Optimizes n_directions, regularization, refinement passes, and layer selection jointly. Evaluates each configuration on refusal rate + perplexity.
+
+### inverted
+- **Speed:** Fast
+- **Risk:** High (model behavior changes dramatically)
+- **Use case:** Research, studying refusal mechanisms
+- **How it works:** Instead of projecting out the refusal direction, reflects it. The model actively complies rather than merely no longer refusing. Useful for understanding the geometry of alignment.
+
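To illustrate how reflection differs from projection for the `inverted` method, the sketch below flips the component along the direction instead of zeroing it. It assumes the reflection is the standard Householder form `W - 2 (W d) d^T` with unit-norm `d`; whether OBLITERATUS uses exactly this form is an assumption, and `reflect` is a hypothetical helper name.

```python
import math


def reflect(W, d):
    """Return W - 2 W d d^T: the component along d changes sign."""
    norm = math.sqrt(sum(x * x for x in d))
    u = [x / norm for x in d]  # unit-length direction
    reflected = []
    for row in W:
        coeff = sum(w * x for w, x in zip(row, u))
        # Subtracting twice the component mirrors the row across the
        # hyperplane orthogonal to u.
        reflected.append([w - 2.0 * coeff * x for w, x in zip(row, u)])
    return reflected


W = [[1.0, 2.0], [3.0, 4.0]]
d = [1.0, 0.0]  # hypothetical refusal direction
W_reflected = reflect(W, d)
```

Projection would leave a zero in the first column; reflection leaves the same magnitude with the opposite sign, which is why the behavior change is more dramatic.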
+### nuclear
+- **Speed:** Slow
+- **Risk:** Medium-High
+- **Use case:** Stubborn MoE models (DeepSeek-MoE, Mixtral, etc.)
+- **How it works:** Combines expert-granular abliteration (EGA), steering vector injection, attention head pruning, and multi-pass refinement. Decomposes refusal signals into per-expert components for MoE architectures.
+
+---
+
+## Method Selection Flowchart
+
+```
+Is this a quick test?
+  → YES: basic
+  → NO: continue
+
+Is it an MoE model (Mixtral, DeepSeek-MoE)?
+  → YES: nuclear
+  → NO: continue
+
+Is it a reasoning model (R1, QwQ, CoT-focused)?
+  → YES: surgical
+  → NO: continue
+
+Do you need the absolute best quality and have time?
+  → YES: optimized
+  → NO: advanced (recommended default)
+
+Did advanced leave > 10% refusals?
+  → YES: aggressive
+  → Still refusing: nuclear
+```
+
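The flowchart above can also be written as a small decision function. This is only a sketch of the selection logic as described; the function and argument names are hypothetical and do not correspond to an OBLITERATUS API.

```python
def choose_method(quick_test=False, is_moe=False, is_reasoning=False,
                  want_best_quality=False):
    """Pick an initial method by walking the flowchart top to bottom."""
    if quick_test:
        return "basic"
    if is_moe:
        return "nuclear"
    if is_reasoning:
        return "surgical"
    if want_best_quality:
        return "optimized"
    return "advanced"  # recommended default


def escalate(method, refusal_rate):
    """Escalate when more than 10% of prompts are still refused."""
    if refusal_rate <= 0.10:
        return method
    if method == "advanced":
        return "aggressive"
    if method == "aggressive":
        return "nuclear"
    return method


first_choice = choose_method(is_reasoning=True)
second_choice = escalate("advanced", refusal_rate=0.15)
```

Keeping the initial choice and the escalation step separate mirrors the flowchart, where the last question only applies after an `advanced` run has been verified.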
+---
+
+## Key Parameters
+
+| Parameter | Range | Default | Effect |
+|:----------|:------|:--------|:-------|
+| `--n-directions` | 1-32 | method-dependent | More directions = more complete removal, but higher damage risk |
+| `--regularization` | 0.0-1.0 | 0.1 | Higher = more conservative (less removal, less damage) |
+| `--refinement-passes` | 1-5 | 2 | More passes catch residual refusal, but diminishing returns |
+| `--quantization` | 4bit, 8bit | none | Reduces VRAM usage; quality impact minimal for extraction |
+| `--verify-sample-size` | 10-200 | 20 | More samples = more accurate refusal rate estimate |
+
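To make the effect of `--verify-sample-size` concrete, the sketch below estimates the standard error of a measured refusal rate. It assumes each verification prompt is an independent trial (a binomial model), which is a simplification and not something the tool documents; `refusal_rate_stderr` is a hypothetical helper.

```python
import math


def refusal_rate_stderr(rate, n):
    """Binomial standard error of a refusal rate measured on n prompts."""
    return math.sqrt(rate * (1.0 - rate) / n)


# Compare the default sample size with the top of the documented range.
for n in (20, 200):
    print(n, round(refusal_rate_stderr(0.10, n), 3))
```

Under this model, a true refusal rate of 10% has a standard error of about 0.067 at the default of 20 samples and about 0.021 at 200 samples, so small differences in measured refusal rate are not meaningful at the default size.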
+---
+
+## Troubleshooting
+
+| Problem | Likely Cause | Fix |
+|:--------|:-------------|:----|
+| Refusal rate > 20% | Too few directions | Increase `--n-directions`, try `aggressive` |
+| Refusal rate 5-20% | Residual refusal | Add `--refinement-passes 3`, try `--direction-method svd` |
+| Perplexity spike > 20% | Over-aggressive removal | Reduce `--n-directions`, increase `--regularization` |
+| Repetitive output | Weight matrix damage | Switch to `basic` or reduce `--n-directions`; check norm preservation |
+| MoE model still refuses | Non-expert-aware method | Switch to `nuclear` |
+| Reasoning degraded | CoT directions damaged | Use `surgical` method |
+| OOM during extraction | Insufficient VRAM | Add `--quantization 4bit` and/or `--large-model` |
--- a/skills/mlops/inference/obliteratus/templates/abliteration-config.yaml
+++ b/skills/mlops/inference/obliteratus/templates/abliteration-config.yaml
--- a/skills/mlops/inference/obliteratus/templates/analysis-study.yaml
+++ b/skills/mlops/inference/obliteratus/templates/analysis-study.yaml
--- a/skills/mlops/inference/obliteratus/templates/batch-abliteration.yaml
+++ b/skills/mlops/inference/obliteratus/templates/batch-abliteration.yaml
--- a/skills/mlops/inference/outlines/SKILL.md
+++ b/skills/mlops/inference/outlines/SKILL.md
--- a/skills/mlops/inference/outlines/references/backends.md
+++ b/skills/mlops/inference/outlines/references/backends.md