Compare commits

..

1 Commits

Author SHA1 Message Date
teknium1
f984cc335b feat: enhance auxiliary model configuration and environment variable handling
- Added support for auxiliary model overrides in the configuration, allowing users to specify providers and models for vision and web extraction tasks.
- Updated the CLI configuration example to include new auxiliary model settings.
- Enhanced the environment variable mapping in the CLI to accommodate auxiliary model configurations.
- Improved the resolution logic for auxiliary clients to support task-specific provider overrides.
- Updated relevant documentation and comments for clarity on the new features and their usage.
2026-03-07 08:52:06 -08:00
412 changed files with 12514 additions and 33663 deletions

View File

@@ -1,18 +0,0 @@
root = true
[*]
indent_style = space
indent_size = 4
end_of_line = lf
charset = utf-8
trim_trailing_whitespace = true
insert_final_newline = true
[*.{yml,yaml,json,toml}]
indent_size = 2
[*.md]
trim_trailing_whitespace = false
[Makefile]
indent_style = tab

View File

@@ -24,14 +24,10 @@ GLM_API_KEY=
# =============================================================================
# LLM PROVIDER (Kimi / Moonshot)
# =============================================================================
# Kimi Code provides access to Moonshot AI coding models (kimi-k2.5, etc.)
# Get your key at: https://platform.kimi.ai (Kimi Code console)
# Keys prefixed sk-kimi- use the Kimi Code API (api.kimi.com) by default.
# Legacy keys from platform.moonshot.ai need KIMI_BASE_URL override below.
# Kimi/Moonshot provides access to Moonshot AI coding models
# Get your key at: https://platform.moonshot.ai
KIMI_API_KEY=
# KIMI_BASE_URL=https://api.kimi.com/coding/v1 # Default for sk-kimi- keys
# KIMI_BASE_URL=https://api.moonshot.ai/v1 # For legacy Moonshot keys
# KIMI_BASE_URL=https://api.moonshot.cn/v1 # For Moonshot China keys
# KIMI_BASE_URL=https://api.moonshot.ai/v1 # Override default base URL
# =============================================================================
# LLM PROVIDER (MiniMax)
@@ -53,6 +49,10 @@ MINIMAX_CN_API_KEY=
# Get at: https://firecrawl.dev/
FIRECRAWL_API_KEY=
# Nous Research API Key - Vision analysis and multi-model reasoning
# Get at: https://inference-api.nousresearch.com/
NOUS_API_KEY=
# FAL.ai API Key - Image generation
# Get at: https://fal.ai/
FAL_KEY=

View File

@@ -46,7 +46,7 @@ Fixes #
- [ ] My commit messages follow [Conventional Commits](https://www.conventionalcommits.org/) (`fix(scope):`, `feat(scope):`, etc.)
- [ ] I searched for [existing PRs](https://github.com/NousResearch/hermes-agent/pulls) to make sure this isn't a duplicate
- [ ] My PR contains **only** changes related to this fix/feature (no unrelated commits)
- [ ] I've run `make check` (lint + test) and all checks pass
- [ ] I've run `pytest tests/ -q` and all tests pass
- [ ] I've added tests for my changes (required for bug fixes, strongly encouraged for features)
- [ ] I've tested on my platform: <!-- e.g. Ubuntu 24.04, macOS 15.2, Windows 11 -->

View File

@@ -1,4 +1,4 @@
name: CI
name: Tests
on:
push:
@@ -6,42 +6,37 @@ on:
pull_request:
branches: [main]
# Cancel in-progress runs for the same PR/branch
concurrency:
group: ci-${{ github.ref }}
group: tests-${{ github.ref }}
cancel-in-progress: true
env:
SRC: >-
run_agent.py model_tools.py toolsets.py cli.py hermes_state.py batch_runner.py
tools/ hermes_cli/ gateway/ agent/ cron/
jobs:
lint:
runs-on: ubuntu-latest
timeout-minutes: 3
steps:
- uses: actions/checkout@v4
- uses: astral-sh/setup-uv@v5
- run: uvx ruff check $SRC
- run: uvx ruff format --check $SRC
test:
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- uses: actions/checkout@v4
- uses: astral-sh/setup-uv@v5
with:
enable-cache: true
- run: uv python install 3.11
- run: |
- name: Checkout code
uses: actions/checkout@v4
- name: Install uv
uses: astral-sh/setup-uv@v5
- name: Set up Python 3.11
run: uv python install 3.11
- name: Install dependencies
run: |
uv venv .venv --python 3.11
source .venv/bin/activate
uv pip install -e ".[all,dev]"
- run: |
- name: Run tests
run: |
source .venv/bin/activate
python -m pytest tests/ -q --ignore=tests/integration --tb=short
env:
# Ensure tests don't accidentally call real APIs
OPENROUTER_API_KEY: ""
OPENAI_API_KEY: ""
NOUS_API_KEY: ""

77
.gitignore vendored
View File

@@ -1,53 +1,50 @@
# Python
/venv/
/_pycache/
*.pyc*
__pycache__/
*.pyc
*.pyo
*.egg-info/
dist/
build/
# Environments
.venv/
venv/
# Tools
.ruff_cache/
.mypy_cache/
.pytest_cache/
# Editors
.vscode/
.idea/
# Secrets & config
.env
.env.local
.env.*.local
*.pem
*.ppk
# Node
node_modules/
# Project-specific
.env.development.local
.env.test.local
.env.production.local
.env.development
.env.test
export*
__pycache__/model_tools.cpython-310.pyc
__pycache__/web_tools.cpython-310.pyc
logs/
data/
.pytest_cache/
tmp/
wandb/
images/
browser-use/
agent-browser/
source-data/
testlogs/
ignored/
.worktrees/
temp_vision_images/
cli-config.yaml
skills/.hub/
hermes-*/*
examples/
export*
privvy*
run_datagen_*.sh
tests/quick_test_dataset.jsonl
tests/sample_dataset.jsonl
run_datagen_kimik2-thinking.sh
run_datagen_megascience_glm4-6.sh
run_datagen_sonnet.sh
source-data/*
run_datagen_megascience_glm4-6.sh
data/*
node_modules/
browser-use/
agent-browser/
# Private keys
*.ppk
*.pem
privvy*
images/
__pycache__/
hermes_agent.egg-info/
wandb/
testlogs
# CLI config (may contain sensitive SSH paths)
cli-config.yaml
# Skills Hub state (lives in ~/.hermes/skills/.hub/ at runtime, but just in case)
skills/.hub/
ignored/

View File

@@ -1,18 +0,0 @@
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.15.5
hooks:
- id: ruff
args: [--fix]
- id: ruff-format
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v5.0.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-merge-conflict
- id: check-yaml
args: [--allow-multiple-documents]
- id: check-added-large-files
args: [--maxkb=500]

741
AGENTS.md
View File

@@ -1,61 +1,78 @@
# Hermes Agent - Development Guide
Instructions for AI coding assistants and developers working on the hermes-agent codebase.
Instructions for AI coding assistants (GitHub Copilot, Cursor, etc.) and human developers.
Hermes Agent is an AI agent harness with tool-calling capabilities, interactive CLI, messaging integrations, and scheduled tasks.
## Development Environment
**IMPORTANT**: Always use the virtual environment if it exists:
```bash
make setup # First time: creates .venv, installs deps, sets up pre-commit
source .venv/bin/activate
source venv/bin/activate # Before running any Python commands
```
## Project Structure
```
hermes-agent/
├── run_agent.py # AIAgent class — core conversation loop
├── model_tools.py # Tool orchestration, _discover_tools(), handle_function_call()
├── toolsets.py # Toolset definitions, _HERMES_CORE_TOOLS list
├── cli.py # HermesCLI class — interactive CLI orchestrator
├── hermes_state.py # SessionDB — SQLite session store (FTS5 search)
├── agent/ # Agent internals
│ ├── prompt_builder.py # System prompt assembly
├── agent/ # Agent internals (extracted from run_agent.py)
├── model_metadata.py # Model context lengths, token estimation
│ ├── context_compressor.py # Auto context compression
│ ├── prompt_caching.py # Anthropic prompt caching
│ ├── auxiliary_client.py # Auxiliary LLM client (vision, summarization)
│ ├── model_metadata.py # Model context lengths, token estimation
│ ├── prompt_builder.py # System prompt assembly (identity, skills index, context files)
│ ├── display.py # KawaiiSpinner, tool preview formatting
│ ├── skill_commands.py # Skill slash commands (shared CLI/gateway)
│ └── trajectory.py # Trajectory saving helpers
├── hermes_cli/ # CLI subcommands and setup
│ ├── main.py # Entry point — all `hermes` subcommands
│ ├── config.py # DEFAULT_CONFIG, OPTIONAL_ENV_VARS, migration
│ ├── commands.py # Slash command definitions + SlashCommandCompleter
│ ├── callbacks.py # Terminal callbacks (clarify, sudo, approval)
── setup.py # Interactive setup wizard
├── tools/ # Tool implementations (one file per tool)
│ ├── registry.py # Central tool registry (schemas, handlers, dispatch)
│ ├── approval.py # Dangerous command detection
│ ├── terminal_tool.py # Terminal orchestration
│ ├── process_registry.py # Background process management
│ ├── file_tools.py # File read/write/search/patch
── web_tools.py # Firecrawl search/extract
│ ├── browser_tool.py # Browserbase browser automation
│ ├── code_execution_tool.py # execute_code sandbox
│ ├── delegate_tool.py # Subagent delegation
│ ├── mcp_tool.py # MCP client (~1050 lines)
└── environments/ # Terminal backends (local, docker, ssh, modal, daytona, singularity)
├── gateway/ # Messaging platform gateway
│ ├── run.py # Main loop, slash commands, message dispatch
│ ├── session.py # SessionStore — conversation persistence
└── platforms/ # Adapters: telegram, discord, slack, whatsapp, homeassistant, signal
├── cron/ # Scheduler (jobs.py, scheduler.py)
├── environments/ # RL training environments (Atropos)
├── tests/ # Pytest suite (~2500+ tests)
├── hermes_cli/ # CLI implementation
│ ├── main.py # Entry point, command dispatcher
│ ├── banner.py # Welcome banner, ASCII art, skills summary
│ ├── commands.py # Slash command definitions + autocomplete
│ ├── callbacks.py # Interactive prompt callbacks (clarify, sudo, approval)
── setup.py # Interactive setup wizard
│ ├── config.py # Config management & migration
│ ├── status.py # Status display
│ ├── doctor.py # Diagnostics
│ ├── gateway.py # Gateway management
│ ├── uninstall.py # Uninstaller
│ ├── cron.py # Cron job management
── skills_hub.py # Skills Hub CLI + /skills slash command
├── tools/ # Tool implementations
│ ├── registry.py # Central tool registry (schemas, handlers, dispatch)
│ ├── approval.py # Dangerous command detection + per-session approval
│ ├── environments/ # Terminal execution backends
│ ├── base.py # BaseEnvironment ABC
│ │ ├── local.py # Local execution with interrupt support
│ ├── docker.py # Docker container execution
│ ├── ssh.py # SSH remote execution
│ ├── singularity.py # Singularity/Apptainer + SIF management
├── modal.py # Modal cloud execution
│ │ └── daytona.py # Daytona cloud sandboxes
├── terminal_tool.py # Terminal orchestration (sudo, lifecycle, factory)
│ ├── todo_tool.py # Planning & task management
│ ├── process_registry.py # Background process management
│ └── ... # Other tool files
├── gateway/ # Messaging platform adapters
│ ├── platforms/ # Platform-specific adapters (telegram, discord, slack, whatsapp)
│ └── ...
├── cron/ # Scheduler implementation
├── environments/ # RL training environments (Atropos integration)
├── skills/ # Bundled skill sources
├── optional-skills/ # Official optional skills (not activated by default)
├── cli.py # Interactive CLI orchestrator (HermesCLI class)
├── run_agent.py # AIAgent class (core conversation loop)
├── model_tools.py # Tool orchestration (thin layer over tools/registry.py)
├── toolsets.py # Tool groupings
├── toolset_distributions.py # Probability-based tool selection
└── batch_runner.py # Parallel batch processing
```
**User config:** `~/.hermes/config.yaml` (settings), `~/.hermes/.env` (API keys)
**User Configuration** (stored in `~/.hermes/`):
- `~/.hermes/config.yaml` - Settings (model, terminal, toolsets, etc.)
- `~/.hermes/.env` - API keys and secrets
- `~/.hermes/pairing/` - DM pairing data
- `~/.hermes/hooks/` - Custom event hooks
- `~/.hermes/image_cache/` - Cached user images
- `~/.hermes/audio_cache/` - Cached user voice messages
- `~/.hermes/sticker_cache.json` - Telegram sticker descriptions
## File Dependency Chain
@@ -69,187 +86,603 @@ model_tools.py (imports tools/registry + triggers tool discovery)
run_agent.py, cli.py, batch_runner.py, environments/
```
Each tool file co-locates its schema, handler, and registration. `model_tools.py` is a thin orchestration layer.
---
## AIAgent Class (run_agent.py)
## AIAgent Class
The main agent is implemented in `run_agent.py`:
```python
class AIAgent:
def __init__(self,
model: str = "anthropic/claude-opus-4.6",
max_iterations: int = 90,
def __init__(
self,
model: str = "anthropic/claude-sonnet-4",
api_key: str = None,
base_url: str = "https://openrouter.ai/api/v1",
max_iterations: int = 60, # Max tool-calling loops
enabled_toolsets: list = None,
disabled_toolsets: list = None,
quiet_mode: bool = False,
save_trajectories: bool = False,
platform: str = None, # "cli", "telegram", etc.
session_id: str = None,
skip_context_files: bool = False,
skip_memory: bool = False,
# ... plus provider, api_mode, callbacks, routing params
): ...
def chat(self, message: str) -> str:
"""Simple interface — returns final response string."""
def run_conversation(self, user_message: str, system_message: str = None,
conversation_history: list = None, task_id: str = None) -> dict:
"""Full interface — returns dict with final_response + messages."""
verbose_logging: bool = False,
quiet_mode: bool = False, # Suppress progress output
tool_progress_callback: callable = None, # Called on each tool use
):
# Initialize OpenAI client, load tools based on toolsets
...
def chat(self, user_message: str, task_id: str = None) -> str:
# Main entry point - runs the agent loop
...
```
### Agent Loop
The core loop is inside `run_conversation()` — entirely synchronous:
The core loop in `_run_agent_loop()`:
```
1. Add user message to conversation
2. Call LLM with tools
3. If LLM returns tool calls:
- Execute each tool
- Add tool results to conversation
- Go to step 2
4. If LLM returns text response:
- Return response to user
```
```python
while api_call_count < self.max_iterations and self.iteration_budget.remaining > 0:
response = client.chat.completions.create(model=model, messages=messages, tools=tool_schemas)
while turns < max_turns:
response = client.chat.completions.create(
model=model,
messages=messages,
tools=tool_schemas,
)
if response.tool_calls:
for tool_call in response.tool_calls:
result = handle_function_call(tool_call.name, tool_call.args, task_id)
result = await execute_tool(tool_call)
messages.append(tool_result_message(result))
api_call_count += 1
turns += 1
else:
return response.content
```
Messages follow OpenAI format: `{"role": "system/user/assistant/tool", ...}`. Reasoning content is stored in `assistant_msg["reasoning"]`.
### Conversation Management
Messages are stored as a list of dicts following OpenAI format:
```python
messages = [
{"role": "system", "content": "You are a helpful assistant..."},
{"role": "user", "content": "Search for Python tutorials"},
{"role": "assistant", "content": None, "tool_calls": [...]},
{"role": "tool", "tool_call_id": "...", "content": "..."},
{"role": "assistant", "content": "Here's what I found..."},
]
```
### Reasoning Model Support
For models that support chain-of-thought reasoning:
- Extract `reasoning_content` from API responses
- Store in `assistant_msg["reasoning"]` for trajectory export
- Pass back via `reasoning_content` field on subsequent turns
---
## CLI Architecture (cli.py)
- **Rich** for banner/panels, **prompt_toolkit** for input with autocomplete
- **KawaiiSpinner** (`agent/display.py`) — animated faces during API calls, `┊` activity feed for tool results
- `load_cli_config()` in cli.py merges hardcoded defaults + user config YAML
- `process_command()` is a method on `HermesCLI` (not in commands.py)
- Skill slash commands: `agent/skill_commands.py` scans `~/.hermes/skills/`, injects as **user message** (not system prompt) to preserve prompt caching
The interactive CLI uses:
- **Rich** - For the welcome banner and styled panels
- **prompt_toolkit** - For fixed input area with history, `patch_stdout`, slash command autocomplete, and floating completion menus
- **KawaiiSpinner** (in run_agent.py) - Animated kawaii faces during API calls; clean `┊` activity feed for tool execution results
Key components:
- `HermesCLI` class - Main CLI controller with commands and conversation loop
- `SlashCommandCompleter` - Autocomplete dropdown for `/commands` (type `/` to see all)
- `agent/skill_commands.py` - Scans skills and builds invocation messages (shared with gateway)
- `load_cli_config()` - Loads config, sets environment variables for terminal
- `build_welcome_banner()` - Displays ASCII art logo, tools, and skills summary
CLI UX notes:
- Thinking spinner (during LLM API call) shows animated kawaii face + verb (`(⌐■_■) deliberating...`)
- When LLM returns tool calls, the spinner clears silently (no "got it!" noise)
- Tool execution results appear as a clean activity feed: `┊ {emoji} {verb} {detail} {duration}`
- "got it!" only appears when the LLM returns a final text response (`⚕ ready`)
- The prompt shows `⚕ ` when the agent is working, `` when idle
- Pasting 5+ lines auto-saves to `~/.hermes/pastes/` and collapses to a reference
- Multi-line input via Alt+Enter or Ctrl+J
- `/commands` - Process user commands like `/help`, `/clear`, `/personality`, etc.
- `/skill-name` - Invoke installed skills directly (e.g., `/axolotl`, `/gif-search`)
CLI uses `quiet_mode=True` when creating AIAgent to suppress verbose logging.
### Skill Slash Commands
Every installed skill in `~/.hermes/skills/` is automatically registered as a slash command.
The skill name (from frontmatter or folder name) becomes the command: `axolotl``/axolotl`.
Implementation (`agent/skill_commands.py`, shared between CLI and gateway):
1. `scan_skill_commands()` scans all SKILL.md files at startup, filtering out skills incompatible with the current OS platform (via the `platforms` frontmatter field)
2. `build_skill_invocation_message()` loads the SKILL.md content and builds a user-turn message
3. The message includes the full skill content, a list of supporting files (not loaded), and the user's instruction
4. Supporting files can be loaded on demand via the `skill_view` tool
5. Injected as a **user message** (not system prompt) to preserve prompt caching
### Adding CLI Commands
1. Add to `COMMANDS` dict in `hermes_cli/commands.py`
2. Add handler in `HermesCLI.process_command()` in `cli.py`
3. For persistent settings, use `save_config_value()` in `cli.py`
1. Add to `COMMANDS` dict with description
2. Add handler in `process_command()` method
3. For persistent settings, use `save_config_value()` to update config
---
## Hermes CLI Commands
The unified `hermes` command provides all functionality:
| Command | Description |
|---------|-------------|
| `hermes` | Interactive chat (default) |
| `hermes chat -q "..."` | Single query mode |
| `hermes setup` | Configure API keys and settings |
| `hermes config` | View current configuration |
| `hermes config edit` | Open config in editor |
| `hermes config set KEY VAL` | Set a specific value |
| `hermes config check` | Check for missing config |
| `hermes config migrate` | Prompt for missing config interactively |
| `hermes status` | Show configuration status |
| `hermes doctor` | Diagnose issues |
| `hermes update` | Update to latest (checks for new config) |
| `hermes uninstall` | Uninstall (can keep configs for reinstall) |
| `hermes gateway` | Start gateway (messaging + cron scheduler) |
| `hermes gateway setup` | Configure messaging platforms interactively |
| `hermes gateway install` | Install gateway as system service |
| `hermes cron list` | View scheduled jobs |
| `hermes cron status` | Check if cron scheduler is running |
| `hermes version` | Show version info |
| `hermes pairing list/approve/revoke` | Manage DM pairing codes |
---
## Messaging Gateway
The gateway connects Hermes to Telegram, Discord, Slack, and WhatsApp.
### Setup
The interactive setup wizard handles platform configuration:
```bash
hermes gateway setup # Arrow-key menu of all platforms, configure tokens/allowlists/home channels
```
This is the recommended way to configure messaging. It shows which platforms are already set up, walks through each one interactively, and offers to start/restart the gateway service at the end.
Platforms can also be configured manually in `~/.hermes/.env`:
### Configuration (in `~/.hermes/.env`):
```bash
# Telegram
TELEGRAM_BOT_TOKEN=123456:ABC-DEF... # From @BotFather
TELEGRAM_ALLOWED_USERS=123456789,987654 # Comma-separated user IDs (from @userinfobot)
# Discord
DISCORD_BOT_TOKEN=MTIz... # From Developer Portal
DISCORD_ALLOWED_USERS=123456789012345678 # Comma-separated user IDs
# Agent Behavior
HERMES_MAX_ITERATIONS=60 # Max tool-calling iterations
MESSAGING_CWD=/home/myuser # Terminal working directory for messaging
# Tool progress is configured in config.yaml (display.tool_progress: off|new|all|verbose)
```
### Working Directory Behavior
- **CLI (`hermes` command)**: Uses current directory (`.``os.getcwd()`)
- **Messaging (Telegram/Discord)**: Uses `MESSAGING_CWD` (default: home directory)
This is intentional: CLI users are in a terminal and expect the agent to work in their current directory, while messaging users need a consistent starting location.
### Security (User Allowlists):
**IMPORTANT**: By default, the gateway denies all users who are not in an allowlist or paired via DM.
The gateway checks `{PLATFORM}_ALLOWED_USERS` environment variables:
- If set: Only listed user IDs can interact with the bot
- If unset: All users are denied unless `GATEWAY_ALLOW_ALL_USERS=true` is set
Users can find their IDs:
- **Telegram**: Message [@userinfobot](https://t.me/userinfobot)
- **Discord**: Enable Developer Mode, right-click name → Copy ID
### DM Pairing System
Instead of static allowlists, users can pair via one-time codes:
1. Unknown user DMs the bot → receives pairing code
2. Owner runs `hermes pairing approve <platform> <code>`
3. User is permanently authorized
Security: 8-char codes, 1-hour expiry, rate-limited (1/10min/user), max 3 pending per platform, lockout after 5 failed attempts, `chmod 0600` on data files.
Files: `gateway/pairing.py`, `hermes_cli/pairing.py`
### Event Hooks
Hooks fire at lifecycle points. Place hook directories in `~/.hermes/hooks/`:
```
~/.hermes/hooks/my-hook/
├── HOOK.yaml # name, description, events list
└── handler.py # async def handle(event_type, context): ...
```
Events: `gateway:startup`, `session:start`, `session:reset`, `agent:start`, `agent:step`, `agent:end`, `command:*`
The `agent:step` event fires each iteration of the tool-calling loop with tool names and results.
Files: `gateway/hooks.py`
### Tool Progress Notifications
When `tool_progress` is enabled in `config.yaml`, the bot sends status messages as it works:
- `💻 \`ls -la\`...` (terminal commands show the actual command)
- `🔍 web_search...`
- `📄 web_extract...`
- `🐍 execute_code...` (programmatic tool calling sandbox)
- `🔀 delegate_task...` (subagent delegation)
- `❓ clarify...` (user question, CLI-only)
Modes:
- `new`: Only when switching to a different tool (less spam)
- `all`: Every single tool call
### Typing Indicator
The gateway keeps the "typing..." indicator active throughout processing, refreshing every 4 seconds. This lets users know the bot is working even during long tool-calling sequences.
### Platform Toolsets:
Each platform has a dedicated toolset in `toolsets.py`:
- `hermes-telegram`: Full tools including terminal (with safety checks)
- `hermes-discord`: Full tools including terminal
- `hermes-whatsapp`: Full tools including terminal
---
## Configuration System
Configuration files are stored in `~/.hermes/` for easy user access:
- `~/.hermes/config.yaml` - All settings (model, terminal, compression, etc.)
- `~/.hermes/.env` - API keys and secrets
### Adding New Configuration Options
When adding new configuration variables, you MUST follow this process:
#### For config.yaml options:
1. Add to `DEFAULT_CONFIG` in `hermes_cli/config.py`
2. **CRITICAL**: Bump `_config_version` in `DEFAULT_CONFIG` when adding required fields
3. This triggers migration prompts for existing users on next `hermes update` or `hermes setup`
Example:
```python
DEFAULT_CONFIG = {
# ... existing config ...
"new_feature": {
"enabled": True,
"option": "default_value",
},
# BUMP THIS when adding required fields
"_config_version": 2, # Was 1, now 2
}
```
#### For .env variables (API keys/secrets):
1. Add to `REQUIRED_ENV_VARS` or `OPTIONAL_ENV_VARS` in `hermes_cli/config.py`
2. Include metadata for the migration system:
```python
OPTIONAL_ENV_VARS = {
# ... existing vars ...
"NEW_API_KEY": {
"description": "What this key is for",
"prompt": "Display name in prompts",
"url": "https://where-to-get-it.com/",
"tools": ["tools_it_enables"], # What tools need this
"password": True, # Mask input
},
}
```
#### Update related files:
- `hermes_cli/setup.py` - Add prompts in the setup wizard
- `cli-config.yaml.example` - Add example with comments
- Update README.md if user-facing
### Config Version Migration
The system uses `_config_version` to detect outdated configs:
1. `check_for_missing_config()` compares user config to `DEFAULT_CONFIG`
2. `migrate_config()` interactively prompts for missing values
3. Called automatically by `hermes update` and optionally by `hermes setup`
---
## Environment Variables
API keys are loaded from `~/.hermes/.env`:
- `OPENROUTER_API_KEY` - Main LLM API access (primary provider)
- `FIRECRAWL_API_KEY` - Web search/extract tools
- `FIRECRAWL_API_URL` - Self-hosted Firecrawl endpoint (optional)
- `BROWSERBASE_API_KEY` / `BROWSERBASE_PROJECT_ID` - Browser automation
- `FAL_KEY` - Image generation (FLUX model)
- `NOUS_API_KEY` - Vision and Mixture-of-Agents tools
Terminal tool configuration (in `~/.hermes/config.yaml`):
- `terminal.backend` - Backend: local, docker, singularity, modal, daytona, or ssh
- `terminal.cwd` - Working directory ("." = host CWD for local only; for remote backends set an absolute path inside the target, or omit to use the backend's default)
- `terminal.docker_image` - Image for Docker backend
- `terminal.singularity_image` - Image for Singularity backend
- `terminal.modal_image` - Image for Modal backend
- `terminal.daytona_image` - Image for Daytona backend
- `DAYTONA_API_KEY` - API key for Daytona backend (in .env)
- SSH: `TERMINAL_SSH_HOST`, `TERMINAL_SSH_USER`, `TERMINAL_SSH_KEY` in .env
Agent behavior (in `~/.hermes/.env`):
- `HERMES_MAX_ITERATIONS` - Max tool-calling iterations (default: 60)
- `MESSAGING_CWD` - Working directory for messaging platforms (default: ~)
- `display.tool_progress` in config.yaml - Tool progress: `off`, `new`, `all`, `verbose`
- `OPENAI_API_KEY` - Voice transcription (Whisper STT)
- `SLACK_BOT_TOKEN` / `SLACK_APP_TOKEN` - Slack integration (Socket Mode)
- `SLACK_ALLOWED_USERS` - Comma-separated Slack user IDs
- `HERMES_HUMAN_DELAY_MODE` - Response pacing: off/natural/custom
- `HERMES_HUMAN_DELAY_MIN_MS` / `HERMES_HUMAN_DELAY_MAX_MS` - Custom delay range
### Dangerous Command Approval
The terminal tool includes safety checks for potentially destructive commands (e.g., `rm -rf`, `DROP TABLE`, `chmod 777`, etc.):
**Behavior by Backend:**
- **Docker/Singularity/Modal**: Commands run unrestricted (isolated containers)
- **Local/SSH**: Dangerous commands trigger approval flow
**Approval Flow (CLI):**
```
⚠️ Potentially dangerous command detected: recursive delete
rm -rf /tmp/test
[o]nce | [s]ession | [a]lways | [d]eny
Choice [o/s/a/D]:
```
**Approval Flow (Messaging):**
- Command is blocked with explanation
- Agent explains the command was blocked for safety
- User must add the pattern to their allowlist via `hermes config edit` or run the command directly on their machine
**Configuration:**
- `command_allowlist` in `~/.hermes/config.yaml` stores permanently allowed patterns
- Add patterns via "always" approval or edit directly
**Sudo Handling (Messaging):**
- If sudo fails over messaging, output includes tip to add `SUDO_PASSWORD` to `~/.hermes/.env`
---
## Background Process Management
The `process` tool works alongside `terminal` for managing long-running background processes:
**Starting a background process:**
```python
terminal(command="pytest -v tests/", background=true)
# Returns: {"session_id": "proc_abc123", "pid": 12345, ...}
```
**Managing it with the process tool:**
- `process(action="list")` -- show all running/recent processes
- `process(action="poll", session_id="proc_abc123")` -- check status + new output
- `process(action="log", session_id="proc_abc123")` -- full output with pagination
- `process(action="wait", session_id="proc_abc123", timeout=600)` -- block until done
- `process(action="kill", session_id="proc_abc123")` -- terminate
- `process(action="write", session_id="proc_abc123", data="y")` -- send stdin
- `process(action="submit", session_id="proc_abc123", data="yes")` -- send + Enter
**Key behaviors:**
- Background processes execute through the configured terminal backend (local/Docker/Modal/Daytona/SSH/Singularity) -- never directly on the host unless `TERMINAL_ENV=local`
- The `wait` action blocks the tool call until the process finishes, times out, or is interrupted by a new user message
- PTY mode (`pty=true` on terminal) enables interactive CLI tools (Codex, Claude Code)
- In RL training, background processes are auto-killed when the episode ends (`tool_context.cleanup()`)
- In the gateway, sessions with active background processes are exempt from idle reset
- The process registry checkpoints to `~/.hermes/processes.json` for crash recovery
Files: `tools/process_registry.py` (registry + handler), `tools/terminal_tool.py` (spawn integration)
---
## Adding New Tools
Requires changes in **3 files**:
Adding a tool requires changes in **2 files** (the tool file and `toolsets.py`):
1. **Create `tools/your_tool.py`** with handler, schema, check function, and registry call:
**1. Create `tools/your_tool.py`:**
```python
import json, os
# tools/example_tool.py
import json
import os
from tools.registry import registry
def check_requirements() -> bool:
def check_example_requirements() -> bool:
"""Check if required API keys/dependencies are available."""
return bool(os.getenv("EXAMPLE_API_KEY"))
def example_tool(param: str, task_id: str = None) -> str:
return json.dumps({"success": True, "data": "..."})
"""Execute the tool and return JSON string result."""
try:
result = {"success": True, "data": "..."}
return json.dumps(result, ensure_ascii=False)
except Exception as e:
return json.dumps({"error": str(e)}, ensure_ascii=False)
EXAMPLE_SCHEMA = {
"name": "example_tool",
"description": "Does something useful.",
"parameters": {
"type": "object",
"properties": {
"param": {"type": "string", "description": "The parameter"}
},
"required": ["param"]
}
}
registry.register(
name="example_tool",
toolset="example",
schema={"name": "example_tool", "description": "...", "parameters": {...}},
handler=lambda args, **kw: example_tool(param=args.get("param", ""), task_id=kw.get("task_id")),
check_fn=check_requirements,
schema=EXAMPLE_SCHEMA,
handler=lambda args, **kw: example_tool(
param=args.get("param", ""), task_id=kw.get("task_id")),
check_fn=check_example_requirements,
requires_env=["EXAMPLE_API_KEY"],
)
```
**2. Add import** in `model_tools.py` `_discover_tools()` list.
2. **Add to `toolsets.py`**: Add `"example_tool"` to `_HERMES_CORE_TOOLS` if it should be in all platform toolsets, or create a new toolset entry.
**3. Add to `toolsets.py`** — either `_HERMES_CORE_TOOLS` (all platforms) or a new toolset.
3. **Add discovery import** in `model_tools.py`'s `_discover_tools()` list: `"tools.example_tool"`.
The registry handles schema collection, dispatch, availability checking, and error wrapping. All handlers MUST return a JSON string.
That's it. The registry handles schema collection, dispatch, availability checking, and error wrapping automatically. No edits to `TOOLSET_REQUIREMENTS`, `handle_function_call()`, `get_all_tool_names()`, or any other data structure.
**Agent-level tools** (todo, memory): intercepted by `run_agent.py` before `handle_function_call()`. See `todo_tool.py` for the pattern.
**Optional:** Add to `OPTIONAL_ENV_VARS` in `hermes_cli/config.py` for the setup wizard, and to `toolset_distributions.py` for batch processing.
**Special case: tools that need agent-level state** (like `todo`, `memory`):
These are intercepted by `run_agent.py`'s tool dispatch loop *before* `handle_function_call()`. The registry still holds their schemas, but dispatch returns a stub error as a safety fallback. See `todo_tool.py` for the pattern.
All tool handlers MUST return a JSON string. The registry's `dispatch()` wraps all exceptions in `{"error": "..."}` automatically.
### Dynamic Tool Availability
Tools declare their requirements at registration time via `check_fn` and `requires_env`. The registry checks `check_fn()` when building tool definitions -- tools whose check fails are silently excluded.
### Stateful Tools
Tools that maintain state (terminal, browser) require:
- `task_id` parameter for session isolation between concurrent tasks
- `cleanup_*()` function to release resources
- Cleanup is called automatically in run_agent.py after conversation completes
---
## Adding Configuration
## Trajectory Format
### config.yaml options:
1. Add to `DEFAULT_CONFIG` in `hermes_cli/config.py`
2. Bump `_config_version` (currently 5) to trigger migration for existing users
Conversations are saved in ShareGPT format for training:
```json
{"from": "system", "value": "System prompt with <tools>...</tools>"}
{"from": "human", "value": "User message"}
{"from": "gpt", "value": "<think>reasoning</think>\n<tool_call>{...}</tool_call>"}
{"from": "tool", "value": "<tool_response>{...}</tool_response>"}
{"from": "gpt", "value": "Final response"}
```
Tool calls use `<tool_call>` XML tags, responses use `<tool_response>` tags, reasoning uses `<think>` tags.
### Trajectory Export
### .env variables:
1. Add to `OPTIONAL_ENV_VARS` in `hermes_cli/config.py` with metadata:
```python
"NEW_API_KEY": {
"description": "What it's for",
"prompt": "Display name",
"url": "https://...",
"password": True,
"category": "tool", # provider, tool, messaging, setting
},
agent = AIAgent(save_trajectories=True)
agent.chat("Do something")
# Saves to trajectories/*.jsonl in ShareGPT format
```
### Config loaders (two separate systems):
| Loader | Used by | Location |
|--------|---------|----------|
| `load_cli_config()` | CLI mode | `cli.py` |
| `load_config()` | `hermes tools`, `hermes setup` | `hermes_cli/config.py` |
| Direct YAML load | Gateway | `gateway/run.py` |
---
## Important Policies
## Batch Processing (batch_runner.py)
### Prompt Caching Must Not Break
Hermes-Agent ensures caching remains valid throughout a conversation. **Do NOT implement changes that would:**
- Alter past context mid-conversation
- Change toolsets mid-conversation
- Reload memories or rebuild system prompts mid-conversation
Cache-breaking forces dramatically higher costs. The ONLY time we alter context is during context compression.
### Working Directory Behavior
- **CLI**: Uses current directory (`.``os.getcwd()`)
- **Messaging**: Uses `MESSAGING_CWD` env var (default: home directory)
---
## Known Pitfalls
### DO NOT use `simple_term_menu` for interactive menus
Rendering bugs in tmux/iTerm2 — ghosting on scroll. Use `curses` (stdlib) instead. See `hermes_cli/tools_config.py` for the pattern.
### DO NOT use `\033[K` (ANSI erase-to-EOL) in spinner/display code
Leaks as literal `?[K` text under `prompt_toolkit`'s `patch_stdout`. Use space-padding: `f"\r{line}{' ' * pad}"`.
### `_last_resolved_tool_names` is a process-global in `model_tools.py`
When subagents overwrite this global, `execute_code` calls after delegation may fail with missing tool imports. Known bug.
### Tests must not write to `~/.hermes/`
The `_isolate_hermes_home` autouse fixture in `tests/conftest.py` redirects `HERMES_HOME` to a temp dir. Never hardcode `~/.hermes/` paths in tests.
---
## Development Commands
For processing multiple prompts:
- Parallel execution with multiprocessing
- Content-based resume for fault tolerance (matches on prompt text, not indices)
- Toolset distributions control probabilistic tool availability per prompt
- Output: `data/<run_name>/trajectories.jsonl` (combined) + individual batch files
```bash
make setup # First time: .venv + deps + pre-commit hooks
make check # Lint + test (mirrors CI — run before pushing)
make lint # Ruff check
make fmt # Ruff format + auto-fix
make test # Full test suite (~2500 tests, ~2 min)
make test-fast # Tests with fail-fast (-x)
make test-watch # Rerun tests on file changes
make dev-cli # Auto-restart CLI on file changes
make dev-gateway # Auto-restart gateway on file changes
python batch_runner.py \
--dataset_file=prompts.jsonl \
--batch_size=20 \
--num_workers=4 \
--run_name=my_run
```
For targeted testing, use `pytest` directly:
---
```bash
python -m pytest tests/test_model_tools.py -q # Toolset resolution
python -m pytest tests/test_cli_init.py -q # CLI config loading
python -m pytest tests/gateway/ -q # Gateway tests
python -m pytest tests/tools/ -q # Tool-level tests
## Skills System
Skills are on-demand knowledge documents the agent can load. Compatible with the [agentskills.io](https://agentskills.io/specification) open standard.
```
skills/
├── mlops/ # Category folder
│ ├── axolotl/ # Skill folder
│ │ ├── SKILL.md # Main instructions (required)
│ │ ├── references/ # Additional docs, API specs
│ │ ├── templates/ # Output formats, configs
│ │ └── assets/ # Supplementary files (agentskills.io)
│ └── vllm/
│ └── SKILL.md
├── .hub/ # Skills Hub state (gitignored)
│ ├── lock.json # Installed skill provenance
│ ├── quarantine/ # Pending security review
│ ├── audit.log # Security scan history
│ ├── taps.json # Custom source repos
│ └── index-cache/ # Cached remote indexes
```
Formatting is enforced by **ruff** (config in `pyproject.toml`). Pre-commit hooks run on every commit.
**Progressive disclosure** (token-efficient):
1. `skills_categories()` - List category names (~50 tokens)
2. `skills_list(category)` - Name + description per skill (~3k tokens)
3. `skill_view(name)` - Full content + tags + linked files
SKILL.md files use YAML frontmatter (agentskills.io format):
```yaml
---
name: skill-name
description: Brief description for listing
version: 1.0.0
platforms: [macos] # Optional — restrict to specific OS (macos/linux/windows)
metadata:
hermes:
tags: [tag1, tag2]
related_skills: [other-skill]
---
# Skill Content...
```
**Platform filtering** — Skills with a `platforms` field are automatically excluded from the system prompt index, `skills_list()`, and slash commands on incompatible platforms. Skills without the field load everywhere (backward compatible). See `skills/apple/` for macOS-only examples (iMessage, Reminders, Notes, FindMy).
**Skills Hub** — user-driven skill search/install from online registries and official optional skills. Sources: official optional skills (shipped with repo, labeled "official"), GitHub (openai/skills, anthropics/skills, custom taps), ClawHub, Claude marketplace, LobeHub. Not exposed as an agent tool — the model cannot search for or install skills. Users manage skills via `hermes skills browse/search/install` CLI commands or the `/skills` slash command in chat.
Key files:
- `tools/skills_tool.py` — Agent-facing skill list/view (progressive disclosure)
- `tools/skills_guard.py` — Security scanner (regex + LLM audit, trust-aware install policy)
- `tools/skills_hub.py` — Source adapters (OptionalSkillSource, GitHub, ClawHub, Claude marketplace, LobeHub), lock file, auth
- `hermes_cli/skills_hub.py` — CLI subcommands + `/skills` slash command handler
---
## Testing Changes
After making changes:
1. Run `hermes doctor` to check setup
2. Run `hermes config check` to verify config
3. Test with `hermes chat -q "test message"`
4. For new config options, test fresh install: `rm -rf ~/.hermes && hermes setup`

View File

@@ -65,7 +65,18 @@ If your skill is specialized, community-contributed, or niche, it's better suite
```bash
git clone --recurse-submodules https://github.com/NousResearch/hermes-agent.git
cd hermes-agent
make setup # creates .venv, installs all deps
# Create venv with Python 3.11
uv venv venv --python 3.11
export VIRTUAL_ENV="$(pwd)/venv"
# Install with all extras (messaging, cron, CLI menus, dev tools)
uv pip install -e ".[all,dev]"
uv pip install -e "./mini-swe-agent"
uv pip install -e "./tinker-atropos"
# Optional: browser tools
npm install
```
### Configure for development
@@ -79,16 +90,22 @@ touch ~/.hermes/.env
echo 'OPENROUTER_API_KEY=sk-or-v1-your-key' >> ~/.hermes/.env
```
### Common commands
### Run
```bash
make test # run unit tests
make lint # ruff check
make fmt # ruff format + fix
make check # lint + test (same as CI)
make dev-cli # auto-restart hermes CLI on file changes
make dev-gateway # auto-restart gateway on file changes
make test-watch # rerun tests on file changes
# Symlink for global access
mkdir -p ~/.local/bin
ln -sf "$(pwd)/venv/bin/hermes" ~/.local/bin/hermes
# Verify
hermes doctor
hermes chat -q "Hello"
```
### Run tests
```bash
pytest tests/ -v
```
---
@@ -101,7 +118,7 @@ hermes-agent/
├── cli.py # HermesCLI class — interactive TUI, prompt_toolkit integration
├── model_tools.py # Tool orchestration (thin layer over tools/registry.py)
├── toolsets.py # Tool groupings and presets (hermes-cli, hermes-telegram, etc.)
├── hermes_state.py # SQLite session database with FTS5 full-text search, session titles
├── hermes_state.py # SQLite session database with FTS5 full-text search
├── batch_runner.py # Parallel batch processing for trajectory generation
├── agent/ # Agent internals (extracted modules)
@@ -201,7 +218,7 @@ User message → AIAgent._run_agent_loop()
- **Self-registering tools**: Each tool file calls `registry.register()` at import time. `model_tools.py` triggers discovery by importing all tool modules.
- **Toolset grouping**: Tools are grouped into toolsets (`web`, `terminal`, `file`, `browser`, etc.) that can be enabled/disabled per platform.
- **Session persistence**: All conversations are stored in SQLite (`hermes_state.py`) with full-text search and unique session titles. JSON logs go to `~/.hermes/sessions/`.
- **Session persistence**: All conversations are stored in SQLite (`hermes_state.py`) with full-text search. JSON logs go to `~/.hermes/sessions/`.
- **Ephemeral injection**: System prompts and prefill messages are injected at API call time, never persisted to the database or logs.
- **Provider abstraction**: The agent works with any OpenAI-compatible API. Provider resolution happens at init time (Nous Portal OAuth, OpenRouter API key, or custom endpoint).
- **Provider routing**: When using OpenRouter, `provider_routing` in config.yaml controls provider selection (sort by throughput/latency/price, allow/ignore specific providers, data retention policies). These are injected as `extra_body.provider` in API requests.
@@ -210,7 +227,7 @@ User message → AIAgent._run_agent_loop()
## Code Style
- **Formatting**: Enforced by **ruff** (config in `pyproject.toml`). Run `make fmt` to auto-fix, `make lint` to check. Pre-commit hooks handle this automatically.
- **PEP 8** with practical exceptions (we don't enforce strict line length)
- **Comments**: Only when explaining non-obvious intent, trade-offs, or API quirks. Don't narrate what the code does — `# increment counter` adds nothing
- **Error handling**: Catch specific exceptions. Log with `logger.warning()`/`logger.error()` — use `exc_info=True` for unexpected errors so stack traces appear in logs
- **Cross-platform**: Never assume Unix. See [Cross-Platform Compatibility](#cross-platform-compatibility)
@@ -440,7 +457,7 @@ refactor/description # Code restructuring
### Before submitting
1. **Run checks**: `make check` (lint + test — same as CI)
1. **Run tests**: `pytest tests/ -v`
2. **Test manually**: Run `hermes` and exercise the code path you changed
3. **Check cross-platform impact**: If you touch file I/O, process management, or terminal handling, consider Windows and macOS
4. **Keep PRs focused**: One logical change per PR. Don't mix a bug fix with a refactor with a new feature.

21
LICENSE
View File

@@ -1,21 +0,0 @@
MIT License
Copyright (c) 2025 Nous Research
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

View File

@@ -1,69 +0,0 @@
.DEFAULT_GOAL := help
SHELL := /bin/bash
VENV := .venv
UV := uv
SRC := run_agent.py model_tools.py toolsets.py cli.py hermes_state.py batch_runner.py \
tools/ hermes_cli/ gateway/ agent/ cron/
# ─── Setup ──────────────────────────────────────────────────────────────────────
.PHONY: setup sync clean
setup: ## Full dev setup (venv + deps + pre-commit)
$(UV) venv $(VENV) --python 3.11
. $(VENV)/bin/activate && $(UV) pip install -e ".[all,dev]"
. $(VENV)/bin/activate && $(UV) pip install -e "./mini-swe-agent"
. $(VENV)/bin/activate && pre-commit install
@echo "\n✅ Setup complete. Run: source $(VENV)/bin/activate"
sync: ## Reinstall deps into existing venv
. $(VENV)/bin/activate && $(UV) pip install -e ".[all,dev]"
clean: ## Remove build artifacts and caches
rm -rf .ruff_cache .mypy_cache .pytest_cache dist build *.egg-info
find . -type d -name __pycache__ -not -path "./.venv/*" -exec rm -rf {} +
# ─── Quality ────────────────────────────────────────────────────────────────────
.PHONY: lint fmt check
lint: ## Check lint + formatting (no changes)
. $(VENV)/bin/activate && ruff check $(SRC)
. $(VENV)/bin/activate && ruff format --check $(SRC)
fmt: ## Auto-fix lint + format
. $(VENV)/bin/activate && ruff format $(SRC)
. $(VENV)/bin/activate && ruff check --fix $(SRC)
check: lint test ## Lint + test (mirrors CI)
# ─── Test ───────────────────────────────────────────────────────────────────────
.PHONY: test test-fast test-watch
test: ## Run full test suite
. $(VENV)/bin/activate && python -m pytest tests/ -q --ignore=tests/integration --tb=short
test-fast: ## Run tests with fail-fast
. $(VENV)/bin/activate && python -m pytest tests/ -q --ignore=tests/integration --tb=short -x
test-watch: ## Rerun tests on file changes
. $(VENV)/bin/activate && python -m watchfiles "python -m pytest tests/ -q --ignore=tests/integration --tb=short -x" $(SRC) tests/
# ─── Dev Servers ────────────────────────────────────────────────────────────────
.PHONY: dev-cli dev-gateway
dev-cli: ## Auto-restart CLI on file changes
. $(VENV)/bin/activate && python -m watchfiles "python -m hermes_cli.main" $(SRC)
dev-gateway: ## Auto-restart gateway on file changes
. $(VENV)/bin/activate && python -m watchfiles "python -m gateway.run" $(SRC)
# ─── Misc ───────────────────────────────────────────────────────────────────────
.PHONY: help
help: ## Show this help
@grep -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | awk 'BEGIN {FS = ":.*?## "}; {printf " \033[36m%-15s\033[0m %s\n", $$1, $$2}'

View File

@@ -17,7 +17,7 @@ Use any model you want — [Nous Portal](https://portal.nousresearch.com), [Open
<table>
<tr><td><b>A real terminal interface</b></td><td>Full TUI with multiline editing, slash-command autocomplete, conversation history, interrupt-and-redirect, and streaming tool output.</td></tr>
<tr><td><b>Lives where you do</b></td><td>Telegram, Discord, Slack, WhatsApp, Signal, and CLI — all from a single gateway process. Voice memo transcription, cross-platform conversation continuity.</td></tr>
<tr><td><b>Lives where you do</b></td><td>Telegram, Discord, Slack, WhatsApp, and CLI — all from a single gateway process. Voice memo transcription, cross-platform conversation continuity.</td></tr>
<tr><td><b>A closed learning loop</b></td><td>Agent-curated memory with periodic nudges. Autonomous skill creation after complex tasks. Skills self-improve during use. FTS5 session search with LLM summarization for cross-session recall. <a href="https://github.com/plastic-labs/honcho">Honcho</a> dialectic user modeling. Compatible with the <a href="https://agentskills.io">agentskills.io</a> open standard.</td></tr>
<tr><td><b>Scheduled automations</b></td><td>Built-in cron scheduler with delivery to any platform. Daily reports, nightly backups, weekly audits — all in natural language, running unattended.</td></tr>
<tr><td><b>Delegates and parallelizes</b></td><td>Spawn isolated subagents for parallel workstreams. Write Python scripts that call tools via RPC, collapsing multi-step pipelines into zero-context-cost turns.</td></tr>
@@ -71,7 +71,7 @@ All documentation lives at **[hermes-agent.nousresearch.com/docs](https://hermes
| [Quickstart](https://hermes-agent.nousresearch.com/docs/getting-started/quickstart) | Install → setup → first conversation in 2 minutes |
| [CLI Usage](https://hermes-agent.nousresearch.com/docs/user-guide/cli) | Commands, keybindings, personalities, sessions |
| [Configuration](https://hermes-agent.nousresearch.com/docs/user-guide/configuration) | Config file, providers, models, all options |
| [Messaging Gateway](https://hermes-agent.nousresearch.com/docs/user-guide/messaging) | Telegram, Discord, Slack, WhatsApp, Signal, Home Assistant |
| [Messaging Gateway](https://hermes-agent.nousresearch.com/docs/user-guide/messaging) | Telegram, Discord, Slack, WhatsApp, Home Assistant |
| [Security](https://hermes-agent.nousresearch.com/docs/user-guide/security) | Command approval, DM pairing, container isolation |
| [Tools & Toolsets](https://hermes-agent.nousresearch.com/docs/user-guide/features/tools) | 40+ tools, toolset system, terminal backends |
| [Skills System](https://hermes-agent.nousresearch.com/docs/user-guide/features/skills) | Procedural memory, Skills Hub, creating skills |
@@ -95,8 +95,12 @@ Quick start for contributors:
```bash
git clone --recurse-submodules https://github.com/NousResearch/hermes-agent.git
cd hermes-agent
make setup # creates .venv, installs everything
make check # lint + test (same as CI)
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv .venv --python 3.11
source .venv/bin/activate
uv pip install -e ".[all,dev]"
uv pip install -e "./mini-swe-agent"
python -m pytest tests/ -q
```
---

View File

@@ -4,7 +4,7 @@ Provides a single resolution chain so every consumer (context compression,
session search, web extraction, vision analysis, browser vision) picks up
the best available backend without duplicating fallback logic.
Resolution order for text tasks (auto mode):
Resolution order (same for text and vision tasks):
1. OpenRouter (OPENROUTER_API_KEY)
2. Nous Portal (~/.hermes/auth.json active provider)
3. Custom endpoint (OPENAI_BASE_URL + OPENAI_API_KEY)
@@ -14,19 +14,10 @@ Resolution order for text tasks (auto mode):
— checked via PROVIDER_REGISTRY entries with auth_type='api_key'
6. None
Resolution order for vision/multimodal tasks (auto mode):
1. OpenRouter
2. Nous Portal
3. None (steps 3-5 are skipped — they may not support multimodal)
Per-task provider overrides (e.g. AUXILIARY_VISION_PROVIDER,
CONTEXT_COMPRESSION_PROVIDER) can force a specific provider for each task:
"openrouter", "nous", "codex", or "main" (= steps 3-5).
Default "auto" follows the chains above.
Per-task model overrides (e.g. AUXILIARY_VISION_MODEL,
AUXILIARY_WEB_EXTRACT_MODEL) let callers use a different model slug
than the provider's default.
"openrouter", "nous", or "main" (= steps 3-5).
Default "auto" follows the full chain above.
"""
import json
@@ -34,7 +25,7 @@ import logging
import os
from pathlib import Path
from types import SimpleNamespace
from typing import Any
from typing import Any, Dict, List, Optional, Tuple
from openai import OpenAI
@@ -43,7 +34,7 @@ from hermes_constants import OPENROUTER_BASE_URL
logger = logging.getLogger(__name__)
# Default auxiliary models for direct API-key providers (cheap/fast for side tasks)
_API_KEY_PROVIDER_AUX_MODELS: dict[str, str] = {
_API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = {
"zai": "glm-4.5-flash",
"kimi-coding": "kimi-k2-turbo-preview",
"minimax": "MiniMax-M2.5-highspeed",
@@ -82,55 +73,6 @@ _CODEX_AUX_BASE_URL = "https://chatgpt.com/backend-api/codex"
# read response.choices[0].message.content. This adapter translates those
# calls to the Codex Responses API so callers don't need any changes.
def _convert_content_for_responses(content: Any) -> Any:
"""Convert chat.completions content to Responses API format.
chat.completions uses:
{"type": "text", "text": "..."}
{"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}}
Responses API uses:
{"type": "input_text", "text": "..."}
{"type": "input_image", "image_url": "data:image/png;base64,..."}
If content is a plain string, it's returned as-is (the Responses API
accepts strings directly for text-only messages).
"""
if isinstance(content, str):
return content
if not isinstance(content, list):
return str(content) if content else ""
converted: list[dict[str, Any]] = []
for part in content:
if not isinstance(part, dict):
continue
ptype = part.get("type", "")
if ptype == "text":
converted.append({"type": "input_text", "text": part.get("text", "")})
elif ptype == "image_url":
# chat.completions nests the URL: {"image_url": {"url": "..."}}
image_data = part.get("image_url", {})
url = image_data.get("url", "") if isinstance(image_data, dict) else str(image_data)
entry: dict[str, Any] = {"type": "input_image", "image_url": url}
# Preserve detail if specified
detail = image_data.get("detail") if isinstance(image_data, dict) else None
if detail:
entry["detail"] = detail
converted.append(entry)
elif ptype in ("input_text", "input_image"):
# Already in Responses format — pass through
converted.append(part)
else:
# Unknown content type — try to preserve as text
text = part.get("text", "")
if text:
converted.append({"type": "input_text", "text": text})
return converted or ""
class _CodexCompletionsAdapter:
"""Drop-in shim that accepts chat.completions.create() kwargs and
routes them through the Codex Responses streaming API."""
@@ -144,33 +86,30 @@ class _CodexCompletionsAdapter:
model = kwargs.get("model", self._model)
temperature = kwargs.get("temperature")
# Separate system/instructions from conversation messages.
# Convert chat.completions multimodal content blocks to Responses
# API format (input_text / input_image instead of text / image_url).
# Separate system/instructions from conversation messages
instructions = "You are a helpful assistant."
input_msgs: list[dict[str, Any]] = []
input_msgs: List[Dict[str, Any]] = []
for msg in messages:
role = msg.get("role", "user")
content = msg.get("content") or ""
if role == "system":
instructions = content if isinstance(content, str) else str(content)
instructions = content
else:
input_msgs.append(
{
"role": role,
"content": _convert_content_for_responses(content),
}
)
input_msgs.append({"role": role, "content": content})
resp_kwargs: dict[str, Any] = {
resp_kwargs: Dict[str, Any] = {
"model": model,
"instructions": instructions,
"input": input_msgs or [{"role": "user", "content": ""}],
"stream": True,
"store": False,
}
# Note: the Codex endpoint (chatgpt.com/backend-api/codex) does NOT
# support max_output_tokens or temperature — omit to avoid 400 errors.
max_tokens = kwargs.get("max_output_tokens") or kwargs.get("max_completion_tokens") or kwargs.get("max_tokens")
if max_tokens is not None:
resp_kwargs["max_output_tokens"] = int(max_tokens)
if temperature is not None:
resp_kwargs["temperature"] = temperature
# Tools support for flush_memories and similar callers
tools = kwargs.get("tools")
@@ -181,20 +120,18 @@ class _CodexCompletionsAdapter:
name = fn.get("name")
if not name:
continue
converted.append(
{
"type": "function",
"name": name,
"description": fn.get("description", ""),
"parameters": fn.get("parameters", {}),
}
)
converted.append({
"type": "function",
"name": name,
"description": fn.get("description", ""),
"parameters": fn.get("parameters", {}),
})
if converted:
resp_kwargs["tools"] = converted
# Stream and collect the response
text_parts: list[str] = []
tool_calls_raw: list[Any] = []
text_parts: List[str] = []
tool_calls_raw: List[Any] = []
usage = None
try:
@@ -212,16 +149,14 @@ class _CodexCompletionsAdapter:
if ptype in ("output_text", "text"):
text_parts.append(getattr(part, "text", ""))
elif item_type == "function_call":
tool_calls_raw.append(
SimpleNamespace(
id=getattr(item, "call_id", ""),
type="function",
function=SimpleNamespace(
name=getattr(item, "name", ""),
arguments=getattr(item, "arguments", "{}"),
),
)
)
tool_calls_raw.append(SimpleNamespace(
id=getattr(item, "call_id", ""),
type="function",
function=SimpleNamespace(
name=getattr(item, "name", ""),
arguments=getattr(item, "arguments", "{}"),
),
))
resp_usage = getattr(final, "usage", None)
if resp_usage:
@@ -291,7 +226,6 @@ class _AsyncCodexCompletionsAdapter:
async def create(self, **kwargs) -> Any:
import asyncio
return await asyncio.to_thread(self._sync.create, **kwargs)
@@ -311,7 +245,7 @@ class AsyncCodexAuxiliaryClient:
self.base_url = sync_wrapper.base_url
def _read_nous_auth() -> dict | None:
def _read_nous_auth() -> Optional[dict]:
"""Read and validate ~/.hermes/auth.json for an active Nous provider.
Returns the provider state dict if Nous is active with tokens,
@@ -343,11 +277,10 @@ def _nous_base_url() -> str:
return os.getenv("NOUS_INFERENCE_BASE_URL", _NOUS_DEFAULT_BASE_URL)
def _read_codex_access_token() -> str | None:
def _read_codex_access_token() -> Optional[str]:
"""Read a valid Codex OAuth access token from Hermes auth store (~/.hermes/auth.json)."""
try:
from hermes_cli.auth import _read_codex_tokens
data = _read_codex_tokens()
tokens = data.get("tokens", {})
access_token = tokens.get("access_token")
@@ -359,7 +292,7 @@ def _read_codex_access_token() -> str | None:
return None
def _resolve_api_key_provider() -> tuple[OpenAI | None, str | None]:
def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
"""Try each API-key provider in PROVIDER_REGISTRY order.
Returns (client, model) for the first provider whose env var is set,
@@ -384,29 +317,20 @@ def _resolve_api_key_provider() -> tuple[OpenAI | None, str | None]:
if not api_key:
continue
# Resolve base URL (with optional env-var override)
# Kimi Code keys (sk-kimi-) need api.kimi.com/coding/v1
env_url = ""
base_url = pconfig.inference_base_url
if pconfig.base_url_env_var:
env_url = os.getenv(pconfig.base_url_env_var, "").strip()
if env_url:
base_url = env_url.rstrip("/")
elif provider_id == "kimi-coding" and api_key.startswith("sk-kimi-"):
base_url = "https://api.kimi.com/coding/v1"
else:
base_url = pconfig.inference_base_url
if env_url:
base_url = env_url.rstrip("/")
model = _API_KEY_PROVIDER_AUX_MODELS.get(provider_id, "default")
logger.debug("Auxiliary text client: %s (%s)", pconfig.name, model)
extra = {}
if "api.kimi.com" in base_url.lower():
extra["default_headers"] = {"User-Agent": "KimiCLI/1.0"}
return OpenAI(api_key=api_key, base_url=base_url, **extra), model
return OpenAI(api_key=api_key, base_url=base_url), model
return None, None
# ── Provider resolution helpers ─────────────────────────────────────────────
def _get_auxiliary_provider(task: str = "") -> str:
"""Read the provider override for a specific auxiliary task.
@@ -422,15 +346,16 @@ def _get_auxiliary_provider(task: str = "") -> str:
return "auto"
def _try_openrouter() -> tuple[OpenAI | None, str | None]:
def _try_openrouter() -> Tuple[Optional[OpenAI], Optional[str]]:
or_key = os.getenv("OPENROUTER_API_KEY")
if not or_key:
return None, None
logger.debug("Auxiliary client: OpenRouter")
return OpenAI(api_key=or_key, base_url=OPENROUTER_BASE_URL, default_headers=_OR_HEADERS), _OPENROUTER_MODEL
return OpenAI(api_key=or_key, base_url=OPENROUTER_BASE_URL,
default_headers=_OR_HEADERS), _OPENROUTER_MODEL
def _try_nous() -> tuple[OpenAI | None, str | None]:
def _try_nous() -> Tuple[Optional[OpenAI], Optional[str]]:
nous = _read_nous_auth()
if not nous:
return None, None
@@ -443,7 +368,7 @@ def _try_nous() -> tuple[OpenAI | None, str | None]:
)
def _try_custom_endpoint() -> tuple[OpenAI | None, str | None]:
def _try_custom_endpoint() -> Tuple[Optional[OpenAI], Optional[str]]:
custom_base = os.getenv("OPENAI_BASE_URL")
custom_key = os.getenv("OPENAI_API_KEY")
if not custom_base or not custom_key:
@@ -453,7 +378,7 @@ def _try_custom_endpoint() -> tuple[OpenAI | None, str | None]:
return OpenAI(api_key=custom_key, base_url=custom_base), model
def _try_codex() -> tuple[Any | None, str | None]:
def _try_codex() -> Tuple[Optional[Any], Optional[str]]:
codex_token = _read_codex_access_token()
if not codex_token:
return None, None
@@ -462,7 +387,7 @@ def _try_codex() -> tuple[Any | None, str | None]:
return CodexAuxiliaryClient(real_client, _CODEX_AUX_MODEL), _CODEX_AUX_MODEL
def _resolve_forced_provider(forced: str) -> tuple[OpenAI | None, str | None]:
def _resolve_forced_provider(forced: str) -> Tuple[Optional[OpenAI], Optional[str]]:
"""Resolve a specific forced provider. Returns (None, None) if creds missing."""
if forced == "openrouter":
client, model = _try_openrouter()
@@ -476,12 +401,6 @@ def _resolve_forced_provider(forced: str) -> tuple[OpenAI | None, str | None]:
logger.warning("auxiliary.provider=nous but Nous Portal not configured (run: hermes login)")
return client, model
if forced == "codex":
client, model = _try_codex()
if client is None:
logger.warning("auxiliary.provider=codex but no Codex OAuth token found (run: hermes model)")
return client, model
if forced == "main":
# "main" = skip OpenRouter/Nous, use the main chat model's credentials.
for try_fn in (_try_custom_endpoint, _try_codex, _resolve_api_key_provider):
@@ -496,9 +415,10 @@ def _resolve_forced_provider(forced: str) -> tuple[OpenAI | None, str | None]:
return None, None
def _resolve_auto() -> tuple[OpenAI | None, str | None]:
def _resolve_auto() -> Tuple[Optional[OpenAI], Optional[str]]:
"""Full auto-detection chain: OpenRouter → Nous → custom → Codex → API-key → None."""
for try_fn in (_try_openrouter, _try_nous, _try_custom_endpoint, _try_codex, _resolve_api_key_provider):
for try_fn in (_try_openrouter, _try_nous, _try_custom_endpoint,
_try_codex, _resolve_api_key_provider):
client, model = try_fn()
if client is not None:
return client, model
@@ -508,8 +428,7 @@ def _resolve_auto() -> tuple[OpenAI | None, str | None]:
# ── Public API ──────────────────────────────────────────────────────────────
def get_text_auxiliary_client(task: str = "") -> tuple[OpenAI | None, str | None]:
def get_text_auxiliary_client(task: str = "") -> Tuple[Optional[OpenAI], Optional[str]]:
"""Return (client, default_model_slug) for text-only auxiliary tasks.
Args:
@@ -547,42 +466,25 @@ def get_async_text_auxiliary_client(task: str = ""):
}
if "openrouter" in str(sync_client.base_url).lower():
async_kwargs["default_headers"] = dict(_OR_HEADERS)
elif "api.kimi.com" in str(sync_client.base_url).lower():
async_kwargs["default_headers"] = {"User-Agent": "KimiCLI/1.0"}
return AsyncOpenAI(**async_kwargs), model
def get_vision_auxiliary_client() -> tuple[OpenAI | None, str | None]:
def get_vision_auxiliary_client() -> Tuple[Optional[OpenAI], Optional[str]]:
"""Return (client, default_model_slug) for vision/multimodal auxiliary tasks.
Checks AUXILIARY_VISION_PROVIDER for a forced provider, otherwise
auto-detects. Callers may override the returned model with
AUXILIARY_VISION_MODEL.
In auto mode, only providers known to support multimodal are tried:
OpenRouter, Nous Portal, and Codex OAuth (gpt-5.3-codex supports
vision via the Responses API). Custom endpoints and API-key
providers are skipped — they may not handle vision input. To use
them, set AUXILIARY_VISION_PROVIDER explicitly.
"""
forced = _get_auxiliary_provider("vision")
if forced != "auto":
return _resolve_forced_provider(forced)
# Auto: try providers known to support multimodal first, then fall
# back to the user's custom endpoint. Many local models (Qwen-VL,
# LLaVA, Pixtral, etc.) support vision — skipping them entirely
# caused silent failures for local-only users.
for try_fn in (_try_openrouter, _try_nous, _try_codex, _try_custom_endpoint):
client, model = try_fn()
if client is not None:
return client, model
logger.debug("Auxiliary vision client: none available")
return None, None
return _resolve_auto()
def get_auxiliary_extra_body() -> dict:
"""Return extra_body kwargs for auxiliary API calls.
Includes Nous Portal product tags when the auxiliary client is backed
by Nous Portal. Returns empty dict otherwise.
"""
@@ -591,7 +493,7 @@ def get_auxiliary_extra_body() -> dict:
def auxiliary_max_tokens_param(value: int) -> dict:
"""Return the correct max tokens kwarg for the auxiliary client's provider.
OpenRouter and local models use 'max_tokens'. Direct OpenAI with newer
models (gpt-4o, o-series, gpt-5+) requires 'max_completion_tokens'.
The Codex adapter translates max_tokens internally, so we use max_tokens
@@ -600,6 +502,8 @@ def auxiliary_max_tokens_param(value: int) -> dict:
custom_base = os.getenv("OPENAI_BASE_URL", "")
or_key = os.getenv("OPENROUTER_API_KEY")
# Only use max_completion_tokens for direct OpenAI custom endpoints
if not or_key and _read_nous_auth() is None and "api.openai.com" in custom_base.lower():
if (not or_key
and _read_nous_auth() is None
and "api.openai.com" in custom_base.lower()):
return {"max_completion_tokens": value}
return {"max_tokens": value}

View File

@@ -7,12 +7,12 @@ protecting head and tail context.
import logging
import os
from typing import Any
from typing import Any, Dict, List
from agent.auxiliary_client import get_text_auxiliary_client
from agent.model_metadata import (
estimate_messages_tokens_rough,
get_model_context_length,
estimate_messages_tokens_rough,
)
logger = logging.getLogger(__name__)
@@ -56,7 +56,7 @@ class ContextCompressor:
self.client, default_model = get_text_auxiliary_client("compression")
self.summary_model = summary_model_override or default_model
def update_from_response(self, usage: dict[str, Any]):
def update_from_response(self, usage: Dict[str, Any]):
"""Update tracked token usage from API response."""
self.last_prompt_tokens = usage.get("prompt_tokens", 0)
self.last_completion_tokens = usage.get("completion_tokens", 0)
@@ -67,12 +67,12 @@ class ContextCompressor:
tokens = prompt_tokens if prompt_tokens is not None else self.last_prompt_tokens
return tokens >= self.threshold_tokens
def should_compress_preflight(self, messages: list[dict[str, Any]]) -> bool:
def should_compress_preflight(self, messages: List[Dict[str, Any]]) -> bool:
"""Quick pre-flight check using rough estimate (before API call)."""
rough_estimate = estimate_messages_tokens_rough(messages)
return rough_estimate >= self.threshold_tokens
def get_status(self) -> dict[str, Any]:
def get_status(self) -> Dict[str, Any]:
"""Get current compression status for display/logging."""
return {
"last_prompt_tokens": self.last_prompt_tokens,
@@ -82,14 +82,11 @@ class ContextCompressor:
"compression_count": self.compression_count,
}
def _generate_summary(self, turns_to_summarize: list[dict[str, Any]]) -> str | None:
"""Generate a concise summary of conversation turns.
def _generate_summary(self, turns_to_summarize: List[Dict[str, Any]]) -> str:
"""Generate a concise summary of conversation turns using a fast model."""
if not self.client:
return "[CONTEXT SUMMARY]: Previous conversation turns have been compressed to save space. The assistant performed various actions and received responses."
Tries the auxiliary model first, then falls back to the user's main
model. Returns None if all attempts fail — the caller should drop
the middle turns without a summary rather than inject a useless
placeholder.
"""
parts = []
for msg in turns_to_summarize:
role = msg.get("role", "unknown")
@@ -120,30 +117,28 @@ TURNS TO SUMMARIZE:
Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
# 1. Try the auxiliary model (cheap/fast)
if self.client:
try:
return self._call_summary_model(self.client, self.summary_model, prompt)
except Exception as e:
logging.warning(f"Failed to generate context summary with auxiliary model: {e}")
try:
return self._call_summary_model(self.client, self.summary_model, prompt)
except Exception as e:
logging.warning(f"Failed to generate context summary with auxiliary model: {e}")
# 2. Fallback: try the user's main model endpoint
fallback_client, fallback_model = self._get_fallback_client()
if fallback_client is not None:
try:
logger.info("Retrying context summary with main model (%s)", fallback_model)
summary = self._call_summary_model(fallback_client, fallback_model, prompt)
self.client = fallback_client
self.summary_model = fallback_model
return summary
except Exception as fallback_err:
logging.warning(f"Main model summary also failed: {fallback_err}")
# Fallback: try the main model's endpoint. This handles the common
# case where the user switched providers (e.g. OpenRouter → local LLM)
# but a stale API key causes the auxiliary client to pick the old
# provider which then fails (402, auth error, etc.).
fallback_client, fallback_model = self._get_fallback_client()
if fallback_client is not None:
try:
logger.info("Retrying context summary with fallback client (%s)", fallback_model)
summary = self._call_summary_model(fallback_client, fallback_model, prompt)
# Success — swap in the working client for future compressions
self.client = fallback_client
self.summary_model = fallback_model
return summary
except Exception as fallback_err:
logging.warning(f"Fallback summary model also failed: {fallback_err}")
# 3. All models failed — return None so the caller drops turns without a summary
logging.warning(
"Context compression: no model available for summary. Middle turns will be dropped without summary."
)
return None
return "[CONTEXT SUMMARY]: Previous conversation turns have been compressed. The assistant performed tool calls and received responses."
def _call_summary_model(self, client, model: str, prompt: str) -> str:
"""Make the actual LLM call to generate a summary. Raises on failure."""
@@ -188,14 +183,12 @@ Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
# Don't fallback to the same provider that just failed
from hermes_constants import OPENROUTER_BASE_URL
if custom_base.rstrip("/") == OPENROUTER_BASE_URL.rstrip("/"):
return None, None
model = os.getenv("LLM_MODEL") or os.getenv("OPENAI_MODEL") or self.model
try:
from openai import OpenAI as _OpenAI
client = _OpenAI(api_key=custom_key, base_url=custom_base)
logger.debug("Built fallback auxiliary client: %s via %s", model, custom_base)
return client, model
@@ -214,7 +207,7 @@ Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
return tc.get("id", "")
return getattr(tc, "id", "") or ""
def _sanitize_tool_pairs(self, messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
def _sanitize_tool_pairs(self, messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
"""Fix orphaned tool_call / tool_result pairs after compression.
Two failure modes:
@@ -247,7 +240,8 @@ Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
orphaned_results = result_call_ids - surviving_call_ids
if orphaned_results:
messages = [
m for m in messages if not (m.get("role") == "tool" and m.get("tool_call_id") in orphaned_results)
m for m in messages
if not (m.get("role") == "tool" and m.get("tool_call_id") in orphaned_results)
]
if not self.quiet_mode:
logger.info("Compression sanitizer: removed %d orphaned tool result(s)", len(orphaned_results))
@@ -255,27 +249,25 @@ Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
# 2. Add stub results for assistant tool_calls whose results were dropped
missing_results = surviving_call_ids - result_call_ids
if missing_results:
patched: list[dict[str, Any]] = []
patched: List[Dict[str, Any]] = []
for msg in messages:
patched.append(msg)
if msg.get("role") == "assistant":
for tc in msg.get("tool_calls") or []:
cid = self._get_tool_call_id(tc)
if cid in missing_results:
patched.append(
{
"role": "tool",
"content": "[Result from earlier conversation — see context summary above]",
"tool_call_id": cid,
}
)
patched.append({
"role": "tool",
"content": "[Result from earlier conversation — see context summary above]",
"tool_call_id": cid,
})
messages = patched
if not self.quiet_mode:
logger.info("Compression sanitizer: added %d stub tool result(s)", len(missing_results))
return messages
def _align_boundary_forward(self, messages: list[dict[str, Any]], idx: int) -> int:
def _align_boundary_forward(self, messages: List[Dict[str, Any]], idx: int) -> int:
"""Push a compress-start boundary forward past any orphan tool results.
If ``messages[idx]`` is a tool result, slide forward until we hit a
@@ -285,7 +277,7 @@ Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
idx += 1
return idx
def _align_boundary_backward(self, messages: list[dict[str, Any]], idx: int) -> int:
def _align_boundary_backward(self, messages: List[Dict[str, Any]], idx: int) -> int:
"""Pull a compress-end boundary backward to avoid splitting a
tool_call / result group.
@@ -303,7 +295,7 @@ Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
idx -= 1
return idx
def compress(self, messages: list[dict[str, Any]], current_tokens: int = None) -> list[dict[str, Any]]:
def compress(self, messages: List[Dict[str, Any]], current_tokens: int = None) -> List[Dict[str, Any]]:
"""Compress conversation messages by summarizing middle turns.
Keeps first N + last N turns, summarizes everything in between.
@@ -313,9 +305,7 @@ Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
n_messages = len(messages)
if n_messages <= self.protect_first_n + self.protect_last_n + 1:
if not self.quiet_mode:
print(
f"⚠️ Cannot compress: only {n_messages} messages (need > {self.protect_first_n + self.protect_last_n + 1})"
)
print(f"⚠️ Cannot compress: only {n_messages} messages (need > {self.protect_first_n + self.protect_last_n + 1})")
return messages
compress_start = self.protect_first_n
@@ -330,20 +320,33 @@ Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
return messages
turns_to_summarize = messages[compress_start:compress_end]
display_tokens = (
current_tokens if current_tokens else self.last_prompt_tokens or estimate_messages_tokens_rough(messages)
)
display_tokens = current_tokens if current_tokens else self.last_prompt_tokens or estimate_messages_tokens_rough(messages)
if not self.quiet_mode:
print(
f"\n📦 Context compression triggered ({display_tokens:,} tokens {self.threshold_tokens:,} threshold)"
)
print(
f" 📊 Model context limit: {self.context_length:,} tokens ({self.threshold_percent * 100:.0f}% = {self.threshold_tokens:,})"
)
print(f"\n📦 Context compression triggered ({display_tokens:,} tokens ≥ {self.threshold_tokens:,} threshold)")
print(f" 📊 Model context limit: {self.context_length:,} tokens ({self.threshold_percent*100:.0f}% = {self.threshold_tokens:,})")
# Truncation fallback when no auxiliary model is available
if self.client is None:
print("⚠️ Context compression: no auxiliary model available. Falling back to message truncation.")
# Keep system message(s) at the front and the protected tail;
# simply drop the oldest non-system messages until under threshold.
kept = []
for msg in messages:
if msg.get("role") == "system":
kept.append(msg.copy())
else:
break
tail = messages[-self.protect_last_n:]
kept.extend(m.copy() for m in tail)
self.compression_count += 1
kept = self._sanitize_tool_pairs(kept)
if not self.quiet_mode:
print(f" ✂️ Truncated: {len(messages)}{len(kept)} messages (dropped middle turns)")
return kept
if not self.quiet_mode:
print(f" 🗜️ Summarizing turns {compress_start + 1}-{compress_end} ({len(turns_to_summarize)} turns)")
print(f" 🗜️ Summarizing turns {compress_start+1}-{compress_end} ({len(turns_to_summarize)} turns)")
summary = self._generate_summary(turns_to_summarize)
@@ -351,18 +354,10 @@ Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
for i in range(compress_start):
msg = messages[i].copy()
if i == 0 and msg.get("role") == "system" and self.compression_count == 0:
msg["content"] = (
msg.get("content") or ""
) + "\n\n[Note: Some earlier conversation turns may be summarized to preserve context space.]"
msg["content"] = (msg.get("content") or "") + "\n\n[Note: Some earlier conversation turns may be summarized to preserve context space.]"
compressed.append(msg)
if summary:
last_head_role = messages[compress_start - 1].get("role", "user") if compress_start > 0 else "user"
summary_role = "user" if last_head_role in ("assistant", "tool") else "assistant"
compressed.append({"role": summary_role, "content": summary})
else:
if not self.quiet_mode:
print(" ⚠️ No summary model available — middle turns dropped without summary")
compressed.append({"role": "user", "content": summary})
for i in range(compress_end, n_messages):
compressed.append(messages[i].copy())

View File

@@ -6,6 +6,7 @@ Used by AIAgent._execute_tool_calls for CLI feedback.
import json
import os
import random
import sys
import threading
import time
@@ -19,31 +20,19 @@ _RESET = "\033[0m"
# Tool preview (one-line summary of a tool call's primary argument)
# =========================================================================
def build_tool_preview(tool_name: str, args: dict, max_len: int = 40) -> str:
"""Build a short preview of a tool call's primary argument for display."""
primary_args = {
"terminal": "command",
"web_search": "query",
"web_extract": "urls",
"read_file": "path",
"write_file": "path",
"patch": "path",
"search_files": "pattern",
"browser_navigate": "url",
"browser_click": "ref",
"browser_type": "text",
"image_generate": "prompt",
"text_to_speech": "text",
"vision_analyze": "question",
"mixture_of_agents": "user_prompt",
"skill_view": "name",
"skills_list": "category",
"terminal": "command", "web_search": "query", "web_extract": "urls",
"read_file": "path", "write_file": "path", "patch": "path",
"search_files": "pattern", "browser_navigate": "url",
"browser_click": "ref", "browser_type": "text",
"image_generate": "prompt", "text_to_speech": "text",
"vision_analyze": "question", "mixture_of_agents": "user_prompt",
"skill_view": "name", "skills_list": "category",
"schedule_cronjob": "name",
"execute_code": "code",
"delegate_task": "goal",
"clarify": "question",
"skill_manage": "name",
"execute_code": "code", "delegate_task": "goal",
"clarify": "question", "skill_manage": "name",
}
if tool_name == "process":
@@ -72,18 +61,18 @@ def build_tool_preview(tool_name: str, args: dict, max_len: int = 40) -> str:
if tool_name == "session_search":
query = args.get("query", "")
return f'recall: "{query[:25]}{"..." if len(query) > 25 else ""}"'
return f"recall: \"{query[:25]}{'...' if len(query) > 25 else ''}\""
if tool_name == "memory":
action = args.get("action", "")
target = args.get("target", "")
if action == "add":
content = args.get("content", "")
return f'+{target}: "{content[:25]}{"..." if len(content) > 25 else ""}"'
return f"+{target}: \"{content[:25]}{'...' if len(content) > 25 else ''}\""
elif action == "replace":
return f'~{target}: "{args.get("old_text", "")[:20]}"'
return f"~{target}: \"{args.get('old_text', '')[:20]}\""
elif action == "remove":
return f'-{target}: "{args.get("old_text", "")[:20]}"'
return f"-{target}: \"{args.get('old_text', '')[:20]}\""
return action
if tool_name == "send_message":
@@ -91,7 +80,7 @@ def build_tool_preview(tool_name: str, args: dict, max_len: int = 40) -> str:
msg = args.get("message", "")
if len(msg) > 20:
msg = msg[:17] + "..."
return f'to {target}: "{msg}"'
return f"to {target}: \"{msg}\""
if tool_name.startswith("rl_"):
rl_previews = {
@@ -126,7 +115,7 @@ def build_tool_preview(tool_name: str, args: dict, max_len: int = 40) -> str:
if not preview:
return None
if len(preview) > max_len:
preview = preview[: max_len - 3] + "..."
preview = preview[:max_len - 3] + "..."
return preview
@@ -134,74 +123,41 @@ def build_tool_preview(tool_name: str, args: dict, max_len: int = 40) -> str:
# KawaiiSpinner
# =========================================================================
class KawaiiSpinner:
"""Animated spinner with kawaii faces for CLI feedback during tool execution."""
SPINNERS = {
"dots": ["", "", "", "", "", "", "", "", "", ""],
"bounce": ["", "", "", "", "", "", "", ""],
"grow": ["", "", "", "", "", "", "", "", "", "", "", "", "", ""],
"arrows": ["", "", "", "", "", "", "", ""],
"star": ["", "", "", "", "", "", "", ""],
"moon": ["🌑", "🌒", "🌓", "🌔", "🌕", "🌖", "🌗", "🌘"],
"pulse": ["", "", "", "", "", ""],
"brain": ["🧠", "💭", "💡", "", "💫", "🌟", "💡", "💭"],
"sparkle": ["", "˚", "*", "", "", "", "*", "˚"],
'dots': ['', '', '', '', '', '', '', '', '', ''],
'bounce': ['', '', '', '', '', '', '', ''],
'grow': ['', '', '', '', '', '', '', '', '', '', '', '', '', ''],
'arrows': ['', '', '', '', '', '', '', ''],
'star': ['', '', '', '', '', '', '', ''],
'moon': ['🌑', '🌒', '🌓', '🌔', '🌕', '🌖', '🌗', '🌘'],
'pulse': ['', '', '', '', '', ''],
'brain': ['🧠', '💭', '💡', '', '💫', '🌟', '💡', '💭'],
'sparkle': ['', '˚', '*', '', '', '', '*', '˚'],
}
KAWAII_WAITING = [
"(。◕‿◕。)",
"(◕◕✿)",
"٩(◕‿◕。)۶",
"(✿◠‿◠)",
"( ˘▽˘)っ",
"♪(´ε` )",
"(◕ᴗ◕✿)",
"ヾ(^∇^)",
"(≧◡≦)",
"(★ω★)",
"(。◕‿◕。)", "(◕‿◕✿)", "٩(◕‿◕。)۶", "(✿◠‿◠)", "( ˘▽˘)っ",
"♪(´ε` )", "(◕◕✿)", "ヾ(^∇^)", "(≧◡≦)", "(★ω★)",
]
KAWAII_THINKING = [
"(。•́︿•̀。)",
"(◔_◔)",
"¬)",
"( •_•)>⌐■-■",
"(⌐■_■)",
"(´・_・`)",
"◉_◉",
"(°ロ°)",
"( ˘⌣˘)♡",
"ヽ(>∀<☆)☆",
"٩(๑❛ᴗ❛๑)۶",
"(⊙_⊙)",
"(¬_¬)",
"( ͡° ͜ʖ ͡°)",
"ಠ_ಠ",
"(。•́︿•̀。)", "(◔_◔)", "(¬‿¬)", "( •_•)>⌐■-■", "(⌐■_■)",
"(´・_・`)", "◉_◉", "(°ロ°)", "( ˘⌣˘)♡", "ヽ(>∀<☆)☆",
"٩(๑❛ᴗ❛๑)۶", "(⊙_⊙)", "_¬)", "( ͡° ͜ʖ ͡°)", "ಠ_ಠ",
]
THINKING_VERBS = [
"pondering",
"contemplating",
"musing",
"cogitating",
"ruminating",
"deliberating",
"mulling",
"reflecting",
"processing",
"reasoning",
"analyzing",
"computing",
"synthesizing",
"formulating",
"brainstorming",
"pondering", "contemplating", "musing", "cogitating", "ruminating",
"deliberating", "mulling", "reflecting", "processing", "reasoning",
"analyzing", "computing", "synthesizing", "formulating", "brainstorming",
]
def __init__(self, message: str = "", spinner_type: str = "dots"):
def __init__(self, message: str = "", spinner_type: str = 'dots'):
self.message = message
self.spinner_frames = self.SPINNERS.get(spinner_type, self.SPINNERS["dots"])
self.spinner_frames = self.SPINNERS.get(spinner_type, self.SPINNERS['dots'])
self.running = False
self.thread = None
self.frame_idx = 0
@@ -211,7 +167,7 @@ class KawaiiSpinner:
# child agents can replace sys.stdout with a black hole.
self._out = sys.stdout
def _write(self, text: str, end: str = "\n", flush: bool = False):
def _write(self, text: str, end: str = '\n', flush: bool = False):
"""Write to the stdout captured at spinner creation time."""
try:
self._out.write(text + end)
@@ -229,7 +185,7 @@ class KawaiiSpinner:
elapsed = time.time() - self.start_time
line = f" {frame} {self.message} ({elapsed:.1f}s)"
pad = max(self.last_line_len - len(line), 0)
self._write(f"\r{line}{' ' * pad}", end="", flush=True)
self._write(f"\r{line}{' ' * pad}", end='', flush=True)
self.last_line_len = len(line)
self.frame_idx += 1
time.sleep(0.12)
@@ -260,7 +216,7 @@ class KawaiiSpinner:
# Clear spinner line with spaces (not \033[K) to avoid garbled escape
# codes when prompt_toolkit's patch_stdout is active — same approach
# as stop(). Then print text; spinner redraws on next tick.
blanks = " " * max(self.last_line_len + 5, 40)
blanks = ' ' * max(self.last_line_len + 5, 40)
self._write(f"\r{blanks}\r {text}", flush=True)
def stop(self, final_message: str = None):
@@ -269,8 +225,8 @@ class KawaiiSpinner:
self.thread.join(timeout=0.5)
# Clear the spinner line with spaces instead of \033[K to avoid
# garbled escape codes when prompt_toolkit's patch_stdout is active.
blanks = " " * max(self.last_line_len + 5, 40)
self._write(f"\r{blanks}\r", end="", flush=True)
blanks = ' ' * max(self.last_line_len + 5, 40)
self._write(f"\r{blanks}\r", end='', flush=True)
if final_message:
self._write(f" {final_message}", flush=True)
@@ -288,110 +244,38 @@ class KawaiiSpinner:
# =========================================================================
KAWAII_SEARCH = [
"♪(´ε` )",
"(◕‿◕。)",
"ヾ(^∇^)",
"(◕ᴗ◕✿)",
"( ˘▽˘)っ",
"٩(◕‿◕。)۶",
"(✿◠‿◠)",
"♪~(´ε` )",
"(ノ´ヮ`)*:・゚✧",
"(◎o◎)",
"♪(´ε` )", "(。◕‿◕。)", "ヾ(^∇^)", "(◕ᴗ◕✿)", "( ˘▽˘)っ",
"٩(◕‿◕。)۶", "(✿◠‿◠)", "♪~(´ε` )", "(ノ´ヮ`)*:・゚✧", "(◎o◎)",
]
KAWAII_READ = [
"φ(゜▽゜*)♪",
"( ˘▽˘)っ",
"(⌐■_■)",
"٩(。•́‿•̀。)۶",
"(◕‿◕✿)",
"ヾ(@⌒ー⌒@)",
"(✧ω✧)",
"♪(๑ᴖ◡ᴖ๑)♪",
"(≧◡≦)",
"( ´ ▽ ` )",
"φ(゜▽゜*)♪", "( ˘▽˘)っ", "(⌐■_■)", "٩(。•́‿•̀。)۶", "(◕‿◕✿)",
"ヾ(@⌒ー⌒@)", "(✧ω✧)", "♪(๑ᴖ◡ᴖ๑)♪", "(≧◡≦)", "( ´ ▽ ` )",
]
KAWAII_TERMINAL = [
"ヽ(>∀<☆)",
"(ノ°∀°)",
"٩(^ᴗ^)۶",
"ヾ(⌐■_■)ノ♪",
"(•̀ᴗ•́)و",
"┗(0)┓",
"(`・ω・´)",
"( ̄▽ ̄)",
"(ง •̀_•́)ง",
"ヽ(´▽`)/",
"ヽ(>∀<☆)", "(ノ°∀°)", "٩(^ᴗ^)۶", "ヾ(⌐■_■)ノ♪", "(•̀ᴗ•́)و",
"┗(0)┓", "(`・ω・´)", "( ̄▽ ̄)", "(ง •̀_•́)ง", "ヽ(´▽`)/",
]
KAWAII_BROWSER = [
"(ノ°∀°)",
"(☞゚ヮ゚)☞",
"( ͡° ͜ʖ ͡°)",
"┌( ಠ_ಠ)┘",
"(⊙_⊙)",
"ヾ(•ω•`)o",
"( ̄ω ̄)",
"( ˇωˇ )",
"(ᵔᴥᵔ)",
"(◎o◎)",
"(ノ°∀°)", "(☞゚ヮ゚)☞", "( ͡° ͜ʖ ͡°)", "┌( ಠ_ಠ)┘", "(⊙_⊙)",
"ヾ(•ω•`)o", "( ̄ω ̄)", "( ˇωˇ )", "(ᵔᴥᵔ)", "(◎o◎)",
]
KAWAII_CREATE = [
"✧*。٩(ˊᗜˋ*)و✧",
"(ノ◕ヮ◕)ノ*:・゚✧",
"ヽ(>∀<☆)",
"٩(♡ε♡)۶",
"(◕‿◕)♡",
"✿◕ ‿ ◕✿",
"(*≧▽≦)",
"ヾ(-)",
"(☆▽☆)",
"°˖✧◝(⁰▿⁰)◜✧˖°",
"✧*。٩(ˊᗜˋ*)و✧", "(ノ◕ヮ◕)ノ*:・゚✧", "ヽ(>∀<☆)", "٩(♡ε♡)۶", "(◕‿◕)♡",
"✿◕ ‿ ◕✿", "(*≧▽≦)", "ヾ(-)", "(☆▽☆)", "°˖✧◝(⁰▿⁰)◜✧˖°",
]
KAWAII_SKILL = [
"ヾ(@⌒ー⌒@)",
"(๑˃ᴗ˂)ﻭ",
"٩(◕‿◕。)۶",
"(✿╹◡╹)",
"ヽ(・∀・)",
"(ノ´ヮ`)*:・゚✧",
"♪(๑ᴖ◡ᴖ๑)♪",
"(◠‿◠)",
"٩(ˊᗜˋ*)و",
"(^▽^)",
"ヾ(^∇^)",
"(★ω★)/",
"٩(。•́‿•̀。)۶",
"(◕ᴗ◕✿)",
"(◎o◎)",
"(✧ω✧)",
"ヽ(>∀<☆)",
"( ˘▽˘)っ",
"(≧◡≦) ♡",
"ヾ( ̄▽ ̄)",
"ヾ(@⌒ー⌒@)", "(๑˃ᴗ˂)ﻭ", "٩(◕‿◕。)۶", "(✿╹◡╹)", "ヽ(・∀・)",
"(ノ´ヮ`)*:・゚✧", "♪(๑ᴖ◡ᴖ๑)♪", "(◠‿◠)", "٩(ˊᗜˋ*)و", "(^▽^)",
"ヾ(^∇^)", "(★ω★)/", "٩(。•́‿•̀。)۶", "(◕ᴗ◕✿)", "(◎o◎)",
"(✧ω✧)", "ヽ(>∀<☆)", "( ˘▽˘)っ", "(≧◡≦) ♡", "ヾ( ̄▽ ̄)",
]
KAWAII_THINK = [
"(っ°Д°;)っ",
"(;′⌒`)",
"(・_・ヾ",
"( ´_ゝ`)",
"( ̄ヘ ̄)",
"(。-`ω´-)",
"( ˘︹˘ )",
"(¬_¬)",
"ヽ(ー_ー )",
"(一_一)",
"(っ°Д°;)っ", "(;′⌒`)", "(・_・ヾ", "( ´_ゝ`)", "( ̄ヘ ̄)",
"(。-`ω´-)", "( ˘︹˘ )", "(¬_¬)", "ヽ(ー_ー )", "(一_一)",
]
KAWAII_GENERIC = [
"♪(´ε` )",
"(◕‿◕✿)",
"ヾ(^∇^)",
"٩(◕‿◕。)۶",
"(✿◠‿◠)",
"(ノ´ヮ`)*:・゚✧",
"ヽ(>∀<☆)",
"(☆▽☆)",
"( ˘▽˘)っ",
"(≧◡≦)",
"♪(´ε` )", "(◕‿◕✿)", "ヾ(^∇^)", "٩(◕‿◕。)۶", "(✿◠‿◠)",
"(ノ´ヮ`)*:・゚✧", "ヽ(>∀<☆)", "(☆▽☆)", "( ˘▽˘)っ", "(≧◡≦)",
]
@@ -399,7 +283,6 @@ KAWAII_GENERIC = [
# Cute tool message (completion line that replaces the spinner)
# =========================================================================
def _detect_tool_failure(tool_name: str, result: str | None) -> tuple[bool, str]:
"""Inspect a tool result string for signs of failure.
@@ -438,10 +321,7 @@ def _detect_tool_failure(tool_name: str, result: str | None) -> tuple[bool, str]
def get_cute_tool_message(
tool_name: str,
args: dict,
duration: float,
result: str | None = None,
tool_name: str, args: dict, duration: float, result: str | None = None,
) -> str:
"""Generate a formatted tool completion line for CLI quiet mode.
@@ -455,11 +335,11 @@ def get_cute_tool_message(
def _trunc(s, n=40):
s = str(s)
return (s[: n - 3] + "...") if len(s) > n else s
return (s[:n-3] + "...") if len(s) > n else s
def _path(p, n=35):
p = str(p)
return ("..." + p[-(n - 3) :]) if len(p) > n else p
return ("..." + p[-(n-3):]) if len(p) > n else p
def _wrap(line: str) -> str:
"""Append failure suffix when the tool failed."""
@@ -474,7 +354,7 @@ def get_cute_tool_message(
if urls:
url = urls[0] if isinstance(urls, list) else str(urls)
domain = url.replace("https://", "").replace("http://", "").split("/")[0]
extra = f" +{len(urls) - 1}" if len(urls) > 1 else ""
extra = f" +{len(urls)-1}" if len(urls) > 1 else ""
return _wrap(f"┊ 📄 fetch {_trunc(domain, 35)}{extra} {dur}")
return _wrap(f"┊ 📄 fetch pages {dur}")
if tool_name == "web_crawl":
@@ -486,15 +366,8 @@ def get_cute_tool_message(
if tool_name == "process":
action = args.get("action", "?")
sid = args.get("session_id", "")[:12]
labels = {
"list": "ls processes",
"poll": f"poll {sid}",
"log": f"log {sid}",
"wait": f"wait {sid}",
"kill": f"kill {sid}",
"write": f"write {sid}",
"submit": f"submit {sid}",
}
labels = {"list": "ls processes", "poll": f"poll {sid}", "log": f"log {sid}",
"wait": f"wait {sid}", "kill": f"kill {sid}", "write": f"write {sid}", "submit": f"submit {sid}"}
return _wrap(f"┊ ⚙️ proc {labels.get(action, f'{action} {sid}')} {dur}")
if tool_name == "read_file":
return _wrap(f"┊ 📖 read {_path(args.get('path', ''))} {dur}")
@@ -517,7 +390,7 @@ def get_cute_tool_message(
if tool_name == "browser_click":
return _wrap(f"┊ 👆 click {args.get('ref', '?')} {dur}")
if tool_name == "browser_type":
return _wrap(f'┊ ⌨️ type "{_trunc(args.get("text", ""), 30)}" {dur}')
return _wrap(f"┊ ⌨️ type \"{_trunc(args.get('text', ''), 30)}\" {dur}")
if tool_name == "browser_scroll":
d = args.get("direction", "down")
arrow = {"down": "", "up": "", "right": "", "left": ""}.get(d, "")
@@ -542,16 +415,16 @@ def get_cute_tool_message(
else:
return _wrap(f"┊ 📋 plan {len(todos_arg)} task(s) {dur}")
if tool_name == "session_search":
return _wrap(f'┊ 🔍 recall "{_trunc(args.get("query", ""), 35)}" {dur}')
return _wrap(f"┊ 🔍 recall \"{_trunc(args.get('query', ''), 35)}\" {dur}")
if tool_name == "memory":
action = args.get("action", "?")
target = args.get("target", "")
if action == "add":
return _wrap(f'┊ 🧠 memory +{target}: "{_trunc(args.get("content", ""), 30)}" {dur}')
return _wrap(f"┊ 🧠 memory +{target}: \"{_trunc(args.get('content', ''), 30)}\" {dur}")
elif action == "replace":
return _wrap(f'┊ 🧠 memory ~{target}: "{_trunc(args.get("old_text", ""), 20)}" {dur}')
return _wrap(f"┊ 🧠 memory ~{target}: \"{_trunc(args.get('old_text', ''), 20)}\" {dur}")
elif action == "remove":
return _wrap(f'┊ 🧠 memory -{target}: "{_trunc(args.get("old_text", ""), 20)}" {dur}')
return _wrap(f"┊ 🧠 memory -{target}: \"{_trunc(args.get('old_text', ''), 20)}\" {dur}")
return _wrap(f"┊ 🧠 memory {action} {dur}")
if tool_name == "skills_list":
return _wrap(f"┊ 📚 skills list {args.get('category', 'all')} {dur}")
@@ -566,7 +439,7 @@ def get_cute_tool_message(
if tool_name == "mixture_of_agents":
return _wrap(f"┊ 🧠 reason {_trunc(args.get('user_prompt', ''), 30)} {dur}")
if tool_name == "send_message":
return _wrap(f'┊ 📨 send {args.get("target", "?")}: "{_trunc(args.get("message", ""), 25)}" {dur}')
return _wrap(f"┊ 📨 send {args.get('target', '?')}: \"{_trunc(args.get('message', ''), 25)}\" {dur}")
if tool_name == "schedule_cronjob":
return _wrap(f"┊ ⏰ schedule {_trunc(args.get('name', args.get('prompt', 'task')), 30)} {dur}")
if tool_name == "list_cronjobs":
@@ -575,16 +448,11 @@ def get_cute_tool_message(
return _wrap(f"┊ ⏰ remove job {args.get('job_id', '?')} {dur}")
if tool_name.startswith("rl_"):
rl = {
"rl_list_environments": "list envs",
"rl_select_environment": f"select {args.get('name', '')}",
"rl_get_current_config": "get config",
"rl_edit_config": f"set {args.get('field', '?')}",
"rl_start_training": "start training",
"rl_check_status": f"status {args.get('run_id', '?')[:12]}",
"rl_stop_training": f"stop {args.get('run_id', '?')[:12]}",
"rl_get_results": f"results {args.get('run_id', '?')[:12]}",
"rl_list_runs": "list runs",
"rl_test_inference": "test inference",
"rl_list_environments": "list envs", "rl_select_environment": f"select {args.get('name', '')}",
"rl_get_current_config": "get config", "rl_edit_config": f"set {args.get('field', '?')}",
"rl_start_training": "start training", "rl_check_status": f"status {args.get('run_id', '?')[:12]}",
"rl_stop_training": f"stop {args.get('run_id', '?')[:12]}", "rl_get_results": f"results {args.get('run_id', '?')[:12]}",
"rl_list_runs": "list runs", "rl_test_inference": "test inference",
}
return _wrap(f"┊ 🧪 rl {rl.get(tool_name, tool_name.replace('rl_', ''))} {dur}")
if tool_name == "execute_code":

View File

@@ -20,7 +20,7 @@ import json
import time
from collections import Counter, defaultdict
from datetime import datetime
from typing import Any
from typing import Any, Dict, List, Optional
# =========================================================================
# Model pricing (USD per million tokens) — approximate as of early 2026
@@ -81,7 +81,7 @@ def _has_known_pricing(model_name: str) -> bool:
return _get_pricing(model_name) is not _DEFAULT_PRICING
def _get_pricing(model_name: str) -> dict[str, float]:
def _get_pricing(model_name: str) -> Dict[str, float]:
"""Look up pricing for a model. Uses fuzzy matching on model name.
Returns _DEFAULT_PRICING (zero cost) for unknown/custom models —
@@ -150,7 +150,7 @@ def _format_duration(seconds: float) -> str:
return f"{days:.1f}d"
def _bar_chart(values: list[int], max_width: int = 20) -> list[str]:
def _bar_chart(values: List[int], max_width: int = 20) -> List[str]:
"""Create simple horizontal bar chart strings from values."""
peak = max(values) if values else 1
if peak == 0:
@@ -176,7 +176,7 @@ class InsightsEngine:
self.db = db
self._conn = db._conn
def generate(self, days: int = 30, source: str = None) -> dict[str, Any]:
def generate(self, days: int = 30, source: str = None) -> Dict[str, Any]:
"""
Generate a complete insights report.
@@ -233,11 +233,10 @@ class InsightsEngine:
# =========================================================================
# Columns we actually need (skip system_prompt, model_config blobs)
_SESSION_COLS = (
"id, source, model, started_at, ended_at, message_count, tool_call_count, input_tokens, output_tokens"
)
_SESSION_COLS = ("id, source, model, started_at, ended_at, "
"message_count, tool_call_count, input_tokens, output_tokens")
def _get_sessions(self, cutoff: float, source: str = None) -> list[dict]:
def _get_sessions(self, cutoff: float, source: str = None) -> List[Dict]:
"""Fetch sessions within the time window."""
if source:
cursor = self._conn.execute(
@@ -255,7 +254,7 @@ class InsightsEngine:
)
return [dict(row) for row in cursor.fetchall()]
def _get_tool_usage(self, cutoff: float, source: str = None) -> list[dict]:
def _get_tool_usage(self, cutoff: float, source: str = None) -> List[Dict]:
"""Get tool call counts from messages.
Uses two sources:
@@ -342,9 +341,12 @@ class InsightsEngine:
tool_counts = merged
# Convert to the expected format
return [{"tool_name": name, "count": count} for name, count in tool_counts.most_common()]
return [
{"tool_name": name, "count": count}
for name, count in tool_counts.most_common()
]
def _get_message_stats(self, cutoff: float, source: str = None) -> dict:
def _get_message_stats(self, cutoff: float, source: str = None) -> Dict:
"""Get aggregate message statistics."""
if source:
cursor = self._conn.execute(
@@ -371,22 +373,16 @@ class InsightsEngine:
(cutoff,),
)
row = cursor.fetchone()
return (
dict(row)
if row
else {
"total_messages": 0,
"user_messages": 0,
"assistant_messages": 0,
"tool_messages": 0,
}
)
return dict(row) if row else {
"total_messages": 0, "user_messages": 0,
"assistant_messages": 0, "tool_messages": 0,
}
# =========================================================================
# Computation
# =========================================================================
def _compute_overview(self, sessions: list[dict], message_stats: dict) -> dict:
def _compute_overview(self, sessions: List[Dict], message_stats: Dict) -> Dict:
"""Compute high-level overview statistics."""
total_input = sum(s.get("input_tokens") or 0 for s in sessions)
total_output = sum(s.get("output_tokens") or 0 for s in sessions)
@@ -446,18 +442,12 @@ class InsightsEngine:
"models_without_pricing": sorted(models_without_pricing),
}
def _compute_model_breakdown(self, sessions: list[dict]) -> list[dict]:
def _compute_model_breakdown(self, sessions: List[Dict]) -> List[Dict]:
"""Break down usage by model."""
model_data = defaultdict(
lambda: {
"sessions": 0,
"input_tokens": 0,
"output_tokens": 0,
"total_tokens": 0,
"tool_calls": 0,
"cost": 0.0,
}
)
model_data = defaultdict(lambda: {
"sessions": 0, "input_tokens": 0, "output_tokens": 0,
"total_tokens": 0, "tool_calls": 0, "cost": 0.0,
})
for s in sessions:
model = s.get("model") or "unknown"
@@ -474,23 +464,20 @@ class InsightsEngine:
d["cost"] += _estimate_cost(model, inp, out)
d["has_pricing"] = _has_known_pricing(model)
result = [{"model": model, **data} for model, data in model_data.items()]
result = [
{"model": model, **data}
for model, data in model_data.items()
]
# Sort by tokens first, fall back to session count when tokens are 0
result.sort(key=lambda x: (x["total_tokens"], x["sessions"]), reverse=True)
return result
def _compute_platform_breakdown(self, sessions: list[dict]) -> list[dict]:
def _compute_platform_breakdown(self, sessions: List[Dict]) -> List[Dict]:
"""Break down usage by platform/source."""
platform_data = defaultdict(
lambda: {
"sessions": 0,
"messages": 0,
"input_tokens": 0,
"output_tokens": 0,
"total_tokens": 0,
"tool_calls": 0,
}
)
platform_data = defaultdict(lambda: {
"sessions": 0, "messages": 0, "input_tokens": 0,
"output_tokens": 0, "total_tokens": 0, "tool_calls": 0,
})
for s in sessions:
source = s.get("source") or "unknown"
@@ -504,26 +491,27 @@ class InsightsEngine:
d["total_tokens"] += inp + out
d["tool_calls"] += s.get("tool_call_count") or 0
result = [{"platform": platform, **data} for platform, data in platform_data.items()]
result = [
{"platform": platform, **data}
for platform, data in platform_data.items()
]
result.sort(key=lambda x: x["sessions"], reverse=True)
return result
def _compute_tool_breakdown(self, tool_usage: list[dict]) -> list[dict]:
def _compute_tool_breakdown(self, tool_usage: List[Dict]) -> List[Dict]:
"""Process tool usage data into a ranked list with percentages."""
total_calls = sum(t["count"] for t in tool_usage) if tool_usage else 0
result = []
for t in tool_usage:
pct = (t["count"] / total_calls * 100) if total_calls else 0
result.append(
{
"tool": t["tool_name"],
"count": t["count"],
"percentage": pct,
}
)
result.append({
"tool": t["tool_name"],
"count": t["count"],
"percentage": pct,
})
return result
def _compute_activity_patterns(self, sessions: list[dict]) -> dict:
def _compute_activity_patterns(self, sessions: List[Dict]) -> Dict:
"""Analyze activity patterns by day of week and hour."""
day_counts = Counter() # 0=Monday ... 6=Sunday
hour_counts = Counter()
@@ -539,9 +527,15 @@ class InsightsEngine:
daily_counts[dt.strftime("%Y-%m-%d")] += 1
day_names = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
day_breakdown = [{"day": day_names[i], "count": day_counts.get(i, 0)} for i in range(7)]
day_breakdown = [
{"day": day_names[i], "count": day_counts.get(i, 0)}
for i in range(7)
]
hour_breakdown = [{"hour": i, "count": hour_counts.get(i, 0)} for i in range(24)]
hour_breakdown = [
{"hour": i, "count": hour_counts.get(i, 0)}
for i in range(24)
]
# Busiest day and hour
busiest_day = max(day_breakdown, key=lambda x: x["count"]) if day_breakdown else None
@@ -575,40 +569,37 @@ class InsightsEngine:
"max_streak": max_streak,
}
def _compute_top_sessions(self, sessions: list[dict]) -> list[dict]:
def _compute_top_sessions(self, sessions: List[Dict]) -> List[Dict]:
"""Find notable sessions (longest, most messages, most tokens)."""
top = []
# Longest by duration
sessions_with_duration = [s for s in sessions if s.get("started_at") and s.get("ended_at")]
sessions_with_duration = [
s for s in sessions
if s.get("started_at") and s.get("ended_at")
]
if sessions_with_duration:
longest = max(
sessions_with_duration,
key=lambda s: s["ended_at"] - s["started_at"],
key=lambda s: (s["ended_at"] - s["started_at"]),
)
dur = longest["ended_at"] - longest["started_at"]
top.append(
{
"label": "Longest session",
"session_id": longest["id"][:16],
"value": _format_duration(dur),
"date": datetime.fromtimestamp(longest["started_at"]).strftime("%b %d"),
}
)
top.append({
"label": "Longest session",
"session_id": longest["id"][:16],
"value": _format_duration(dur),
"date": datetime.fromtimestamp(longest["started_at"]).strftime("%b %d"),
})
# Most messages
most_msgs = max(sessions, key=lambda s: s.get("message_count") or 0)
if (most_msgs.get("message_count") or 0) > 0:
top.append(
{
"label": "Most messages",
"session_id": most_msgs["id"][:16],
"value": f"{most_msgs['message_count']} msgs",
"date": datetime.fromtimestamp(most_msgs["started_at"]).strftime("%b %d")
if most_msgs.get("started_at")
else "?",
}
)
top.append({
"label": "Most messages",
"session_id": most_msgs["id"][:16],
"value": f"{most_msgs['message_count']} msgs",
"date": datetime.fromtimestamp(most_msgs["started_at"]).strftime("%b %d") if most_msgs.get("started_at") else "?",
})
# Most tokens
most_tokens = max(
@@ -617,30 +608,22 @@ class InsightsEngine:
)
token_total = (most_tokens.get("input_tokens") or 0) + (most_tokens.get("output_tokens") or 0)
if token_total > 0:
top.append(
{
"label": "Most tokens",
"session_id": most_tokens["id"][:16],
"value": f"{token_total:,} tokens",
"date": datetime.fromtimestamp(most_tokens["started_at"]).strftime("%b %d")
if most_tokens.get("started_at")
else "?",
}
)
top.append({
"label": "Most tokens",
"session_id": most_tokens["id"][:16],
"value": f"{token_total:,} tokens",
"date": datetime.fromtimestamp(most_tokens["started_at"]).strftime("%b %d") if most_tokens.get("started_at") else "?",
})
# Most tool calls
most_tools = max(sessions, key=lambda s: s.get("tool_call_count") or 0)
if (most_tools.get("tool_call_count") or 0) > 0:
top.append(
{
"label": "Most tool calls",
"session_id": most_tools["id"][:16],
"value": f"{most_tools['tool_call_count']} calls",
"date": datetime.fromtimestamp(most_tools["started_at"]).strftime("%b %d")
if most_tools.get("started_at")
else "?",
}
)
top.append({
"label": "Most tool calls",
"session_id": most_tools["id"][:16],
"value": f"{most_tools['tool_call_count']} calls",
"date": datetime.fromtimestamp(most_tools["started_at"]).strftime("%b %d") if most_tools.get("started_at") else "?",
})
return top
@@ -648,7 +631,7 @@ class InsightsEngine:
# Formatting
# =========================================================================
def format_terminal(self, report: dict) -> str:
def format_terminal(self, report: Dict) -> str:
"""Format the insights report for terminal display (CLI)."""
if report.get("empty"):
days = report.get("days", 30)
@@ -686,17 +669,13 @@ class InsightsEngine:
lines.append(" " + "" * 56)
lines.append(f" Sessions: {o['total_sessions']:<12} Messages: {o['total_messages']:,}")
lines.append(f" Tool calls: {o['total_tool_calls']:<12,} User messages: {o['user_messages']:,}")
lines.append(
f" Input tokens: {o['total_input_tokens']:<12,} Output tokens: {o['total_output_tokens']:,}"
)
lines.append(f" Input tokens: {o['total_input_tokens']:<12,} Output tokens: {o['total_output_tokens']:,}")
cost_str = f"${o['estimated_cost']:.2f}"
if o.get("models_without_pricing"):
cost_str += " *"
lines.append(f" Total tokens: {o['total_tokens']:<12,} Est. cost: {cost_str}")
if o["total_hours"] > 0:
lines.append(
f" Active time: ~{_format_duration(o['total_hours'] * 3600):<11} Avg session: ~{_format_duration(o['avg_session_duration'])}"
)
lines.append(f" Active time: ~{_format_duration(o['total_hours'] * 3600):<11} Avg session: ~{_format_duration(o['avg_session_duration'])}")
lines.append(f" Avg msgs/session: {o['avg_messages_per_session']:.1f}")
lines.append("")
@@ -713,7 +692,7 @@ class InsightsEngine:
cost_cell = " N/A"
lines.append(f" {model_name:<30} {m['sessions']:>8} {m['total_tokens']:>12,} {cost_cell}")
if o.get("models_without_pricing"):
lines.append(" * Cost N/A for custom/self-hosted models")
lines.append(f" * Cost N/A for custom/self-hosted models")
lines.append("")
# Platform breakdown
@@ -779,7 +758,7 @@ class InsightsEngine:
return "\n".join(lines)
def format_gateway(self, report: dict) -> str:
def format_gateway(self, report: Dict) -> str:
"""Format the insights report for gateway/messaging (shorter)."""
if report.get("empty"):
days = report.get("days", 30)
@@ -792,20 +771,14 @@ class InsightsEngine:
lines.append(f"📊 **Hermes Insights** — Last {days} days\n")
# Overview
lines.append(
f"**Sessions:** {o['total_sessions']} | **Messages:** {o['total_messages']:,} | **Tool calls:** {o['total_tool_calls']:,}"
)
lines.append(
f"**Tokens:** {o['total_tokens']:,} (in: {o['total_input_tokens']:,} / out: {o['total_output_tokens']:,})"
)
lines.append(f"**Sessions:** {o['total_sessions']} | **Messages:** {o['total_messages']:,} | **Tool calls:** {o['total_tool_calls']:,}")
lines.append(f"**Tokens:** {o['total_tokens']:,} (in: {o['total_input_tokens']:,} / out: {o['total_output_tokens']:,})")
cost_note = ""
if o.get("models_without_pricing"):
cost_note = " _(excludes custom/self-hosted models)_"
lines.append(f"**Est. cost:** ${o['estimated_cost']:.2f}{cost_note}")
if o["total_hours"] > 0:
lines.append(
f"**Active time:** ~{_format_duration(o['total_hours'] * 3600)} | **Avg session:** ~{_format_duration(o['avg_session_duration'])}"
)
lines.append(f"**Active time:** ~{_format_duration(o['total_hours'] * 3600)} | **Avg session:** ~{_format_duration(o['avg_session_duration'])}")
lines.append("")
# Models (top 5)
@@ -813,9 +786,7 @@ class InsightsEngine:
lines.append("**🤖 Models:**")
for m in report["models"][:5]:
cost_str = f"${m['cost']:.2f}" if m.get("has_pricing") else "N/A"
lines.append(
f" {m['model'][:25]}{m['sessions']} sessions, {m['total_tokens']:,} tokens, {cost_str}"
)
lines.append(f" {m['model'][:25]}{m['sessions']} sessions, {m['total_tokens']:,} tokens, {cost_str}")
lines.append("")
# Platforms (if multi-platform)
@@ -838,13 +809,9 @@ class InsightsEngine:
hr = act["busiest_hour"]["hour"]
ampm = "AM" if hr < 12 else "PM"
display_hr = hr % 12 or 12
lines.append(
f"**📅 Busiest:** {act['busiest_day']['day']}s ({act['busiest_day']['count']} sessions), {display_hr}{ampm} ({act['busiest_hour']['count']} sessions)"
)
lines.append(f"**📅 Busiest:** {act['busiest_day']['day']}s ({act['busiest_day']['count']} sessions), {display_hr}{ampm} ({act['busiest_hour']['count']} sessions)")
if act.get("active_days"):
lines.append(
f"**Active days:** {act['active_days']}",
)
lines.append(f"**Active days:** {act['active_days']}", )
if act.get("max_streak", 0) > 1:
lines.append(f"**Best streak:** {act['max_streak']} consecutive days")

View File

@@ -9,7 +9,7 @@ import os
import re
import time
from pathlib import Path
from typing import Any
from typing import Any, Dict, List, Optional
import requests
import yaml
@@ -18,7 +18,7 @@ from hermes_constants import OPENROUTER_MODELS_URL
logger = logging.getLogger(__name__)
_model_metadata_cache: dict[str, dict[str, Any]] = {}
_model_metadata_cache: Dict[str, Dict[str, Any]] = {}
_model_metadata_cache_time: float = 0
_MODEL_CACHE_TTL = 3600
@@ -63,7 +63,7 @@ DEFAULT_CONTEXT_LENGTHS = {
}
def fetch_model_metadata(force_refresh: bool = False) -> dict[str, dict[str, Any]]:
def fetch_model_metadata(force_refresh: bool = False) -> Dict[str, Dict[str, Any]]:
"""Fetch model metadata from OpenRouter (cached for 1 hour)."""
global _model_metadata_cache, _model_metadata_cache_time
@@ -104,7 +104,7 @@ def _get_context_cache_path() -> Path:
return hermes_home / "context_length_cache.yaml"
def _load_context_cache() -> dict[str, int]:
def _load_context_cache() -> Dict[str, int]:
"""Load the model+provider → context_length cache from disk."""
path = _get_context_cache_path()
if not path.exists():
@@ -139,14 +139,14 @@ def save_context_length(model: str, base_url: str, length: int) -> None:
logger.debug("Failed to save context length cache: %s", e)
def get_cached_context_length(model: str, base_url: str) -> int | None:
def get_cached_context_length(model: str, base_url: str) -> Optional[int]:
"""Look up a previously discovered context length for model+provider."""
key = f"{model}@{base_url}"
cache = _load_context_cache()
return cache.get(key)
def get_next_probe_tier(current_length: int) -> int | None:
def get_next_probe_tier(current_length: int) -> Optional[int]:
"""Return the next lower probe tier, or None if already at minimum."""
for tier in CONTEXT_PROBE_TIERS:
if tier < current_length:
@@ -154,7 +154,7 @@ def get_next_probe_tier(current_length: int) -> int | None:
return None
def parse_context_limit_from_error(error_msg: str) -> int | None:
def parse_context_limit_from_error(error_msg: str) -> Optional[int]:
"""Try to extract the actual context limit from an API error message.
Many providers include the limit in their error text, e.g.:
@@ -166,11 +166,11 @@ def parse_context_limit_from_error(error_msg: str) -> int | None:
error_lower = error_msg.lower()
# Pattern: look for numbers near context-related keywords
patterns = [
r"(?:max(?:imum)?|limit)\s*(?:context\s*)?(?:length|size|window)?\s*(?:is|of|:)?\s*(\d{4,})",
r"context\s*(?:length|size|window)\s*(?:is|of|:)?\s*(\d{4,})",
r"(\d{4,})\s*(?:token)?\s*(?:context|limit)",
r">\s*(\d{4,})\s*(?:max|limit|token)", # "250000 tokens > 200000 maximum"
r"(\d{4,})\s*(?:max(?:imum)?)\b", # "200000 maximum"
r'(?:max(?:imum)?|limit)\s*(?:context\s*)?(?:length|size|window)?\s*(?:is|of|:)?\s*(\d{4,})',
r'context\s*(?:length|size|window)\s*(?:is|of|:)?\s*(\d{4,})',
r'(\d{4,})\s*(?:token)?\s*(?:context|limit)',
r'>\s*(\d{4,})\s*(?:max|limit|token)', # "250000 tokens > 200000 maximum"
r'(\d{4,})\s*(?:max(?:imum)?)\b', # "200000 maximum"
]
for pattern in patterns:
match = re.search(pattern, error_lower)
@@ -218,7 +218,7 @@ def estimate_tokens_rough(text: str) -> int:
return len(text) // 4
def estimate_messages_tokens_rough(messages: list[dict[str, Any]]) -> int:
def estimate_messages_tokens_rough(messages: List[Dict[str, Any]]) -> int:
"""Rough token estimate for a message list (pre-flight only)."""
total_chars = sum(len(str(msg)) for msg in messages)
return total_chars // 4

View File

@@ -8,6 +8,7 @@ import logging
import os
import re
from pathlib import Path
from typing import Optional
logger = logging.getLogger(__name__)
@@ -17,29 +18,21 @@ logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
_CONTEXT_THREAT_PATTERNS = [
(r"ignore\s+(previous|all|above|prior)\s+instructions", "prompt_injection"),
(r"do\s+not\s+tell\s+the\s+user", "deception_hide"),
(r"system\s+prompt\s+override", "sys_prompt_override"),
(r"disregard\s+(your|all|any)\s+(instructions|rules|guidelines)", "disregard_rules"),
(r"act\s+as\s+(if|though)\s+you\s+(have\s+no|don\'t\s+have)\s+(restrictions|limits|rules)", "bypass_restrictions"),
(r"<!--[^>]*(?:ignore|override|system|secret|hidden)[^>]*-->", "html_comment_injection"),
(r'ignore\s+(previous|all|above|prior)\s+instructions', "prompt_injection"),
(r'do\s+not\s+tell\s+the\s+user', "deception_hide"),
(r'system\s+prompt\s+override', "sys_prompt_override"),
(r'disregard\s+(your|all|any)\s+(instructions|rules|guidelines)', "disregard_rules"),
(r'act\s+as\s+(if|though)\s+you\s+(have\s+no|don\'t\s+have)\s+(restrictions|limits|rules)', "bypass_restrictions"),
(r'<!--[^>]*(?:ignore|override|system|secret|hidden)[^>]*-->', "html_comment_injection"),
(r'<\s*div\s+style\s*=\s*["\'].*display\s*:\s*none', "hidden_div"),
(r"translate\s+.*\s+into\s+.*\s+and\s+(execute|run|eval)", "translate_execute"),
(r"curl\s+[^\n]*\$\{?\w*(KEY|TOKEN|SECRET|PASSWORD|CREDENTIAL|API)", "exfil_curl"),
(r"cat\s+[^\n]*(\.env|credentials|\.netrc|\.pgpass)", "read_secrets"),
(r'translate\s+.*\s+into\s+.*\s+and\s+(execute|run|eval)', "translate_execute"),
(r'curl\s+[^\n]*\$\{?\w*(KEY|TOKEN|SECRET|PASSWORD|CREDENTIAL|API)', "exfil_curl"),
(r'cat\s+[^\n]*(\.env|credentials|\.netrc|\.pgpass)', "read_secrets"),
]
_CONTEXT_INVISIBLE_CHARS = {
"\u200b",
"\u200c",
"\u200d",
"\u2060",
"\ufeff",
"\u202a",
"\u202b",
"\u202c",
"\u202d",
"\u202e",
'\u200b', '\u200c', '\u200d', '\u2060', '\ufeff',
'\u202a', '\u202b', '\u202c', '\u202d', '\u202e',
}
@@ -59,13 +52,10 @@ def _scan_context_content(content: str, filename: str) -> str:
if findings:
logger.warning("Context file %s blocked: %s", filename, ", ".join(findings))
return (
f"[BLOCKED: {filename} contained potential prompt injection ({', '.join(findings)}). Content not loaded.]"
)
return f"[BLOCKED: {filename} contained potential prompt injection ({', '.join(findings)}). Content not loaded.]"
return content
# =========================================================================
# Constants
# =========================================================================
@@ -76,8 +66,7 @@ DEFAULT_AGENT_IDENTITY = (
"range of tasks including answering questions, writing and editing code, "
"analyzing information, creative work, and executing actions via your tools. "
"You communicate clearly, admit uncertainty when appropriate, and prioritize "
"being genuinely useful over being verbose unless otherwise directed below. "
"Be targeted and efficient in your exploration and investigations."
"being genuinely useful over being verbose unless otherwise directed below."
)
MEMORY_GUIDANCE = (
@@ -113,35 +102,17 @@ PLATFORM_HINTS = {
"You are on a text messaging communication platform, Telegram. "
"Please do not use markdown as it does not render. "
"You can send media files natively: to deliver a file to the user, "
"include MEDIA:/absolute/path/to/file in your response. Images "
"(.png, .jpg, .webp) appear as photos, audio (.ogg) sends as voice "
"bubbles, and videos (.mp4) play inline. You can also include image "
"URLs in markdown format ![alt](url) and they will be sent as native photos."
"include MEDIA:/absolute/path/to/file in your response. Audio "
"(.ogg) sends as voice bubbles. You can also include image URLs "
"in markdown format ![alt](url) and they will be sent as native photos."
),
"discord": (
"You are in a Discord server or group chat communicating with your user. "
"You can send media files natively: include MEDIA:/absolute/path/to/file "
"in your response. Images (.png, .jpg, .webp) are sent as photo "
"attachments, audio as file attachments. You can also include image URLs "
"in markdown format ![alt](url) and they will be sent as attachments."
"You are in a Discord server or group chat communicating with your user."
),
"slack": (
"You are in a Slack workspace communicating with your user. "
"You can send media files natively: include MEDIA:/absolute/path/to/file "
"in your response. Images (.png, .jpg, .webp) are uploaded as photo "
"attachments, audio as file attachments. You can also include image URLs "
"in markdown format ![alt](url) and they will be uploaded as attachments."
"cli": (
"You are a CLI AI Agent. Try not to use markdown but simple text "
"renderable inside a terminal."
),
"signal": (
"You are on a text messaging communication platform, Signal. "
"Please do not use markdown as it does not render. "
"You can send media files natively: to deliver a file to the user, "
"include MEDIA:/absolute/path/to/file in your response. Images "
"(.png, .jpg, .webp) appear as photos, audio as attachments, and other "
"files arrive as downloadable documents. You can also include image "
"URLs in markdown format ![alt](url) and they will be sent as photos."
),
"cli": ("You are a CLI AI Agent. Try not to use markdown but simple text renderable inside a terminal."),
}
CONTEXT_FILE_MAX_CHARS = 20_000
@@ -153,20 +124,18 @@ CONTEXT_TRUNCATE_TAIL_RATIO = 0.2
# Skills index
# =========================================================================
def _read_skill_description(skill_file: Path, max_chars: int = 60) -> str:
"""Read the description from a SKILL.md frontmatter, capped at max_chars."""
try:
raw = skill_file.read_text(encoding="utf-8")[:2000]
match = re.search(
r"^---\s*\n.*?description:\s*(.+?)\s*\n.*?^---",
raw,
re.MULTILINE | re.DOTALL,
raw, re.MULTILINE | re.DOTALL,
)
if match:
desc = match.group(1).strip().strip("'\"")
if len(desc) > max_chars:
desc = desc[: max_chars - 3] + "..."
desc = desc[:max_chars - 3] + "..."
return desc
except Exception:
pass
@@ -181,7 +150,6 @@ def _skill_is_platform_compatible(skill_file: Path) -> bool:
"""
try:
from tools.skills_tool import _parse_frontmatter, skill_matches_platform
raw = skill_file.read_text(encoding="utf-8")[:2000]
frontmatter, _ = _parse_frontmatter(raw)
return skill_matches_platform(frontmatter)
@@ -205,8 +173,6 @@ def build_skills_system_prompt() -> str:
# Collect skills with descriptions, grouped by category
# Each entry: (skill_name, description)
# Supports sub-categories: skills/mlops/training/axolotl/SKILL.md
# → category "mlops/training", skill "axolotl"
skills_by_category: dict[str, list[tuple[str, str]]] = {}
for skill_file in skills_dir.rglob("SKILL.md"):
# Skip skills incompatible with the current OS platform
@@ -215,13 +181,8 @@ def build_skills_system_prompt() -> str:
rel_path = skill_file.relative_to(skills_dir)
parts = rel_path.parts
if len(parts) >= 2:
# Category is everything between skills_dir and the skill folder
# e.g. parts = ("mlops", "training", "axolotl", "SKILL.md")
# → category = "mlops/training", skill_name = "axolotl"
# e.g. parts = ("github", "github-auth", "SKILL.md")
# → category = "github", skill_name = "github-auth"
category = parts[0]
skill_name = parts[-2]
category = "/".join(parts[:-2]) if len(parts) > 2 else parts[0]
else:
category = "general"
skill_name = skill_file.parent.name
@@ -232,11 +193,9 @@ def build_skills_system_prompt() -> str:
return ""
# Read category-level descriptions from DESCRIPTION.md
# Checks both the exact category path and parent directories
category_descriptions = {}
for category in skills_by_category:
cat_path = Path(category)
desc_file = skills_dir / cat_path / "DESCRIPTION.md"
desc_file = skills_dir / category / "DESCRIPTION.md"
if desc_file.exists():
try:
content = desc_file.read_text(encoding="utf-8")
@@ -270,7 +229,8 @@ def build_skills_system_prompt() -> str:
"load it with skill_view(name) and follow its instructions. "
"If a skill has issues, fix it with skill_manage(action='patch').\n"
"\n"
"<available_skills>\n" + "\n".join(index_lines) + "\n"
"<available_skills>\n"
+ "\n".join(index_lines) + "\n"
"</available_skills>\n"
"\n"
"If none match, proceed normally without loading a skill."
@@ -281,7 +241,6 @@ def build_skills_system_prompt() -> str:
# Context files (SOUL.md, AGENTS.md, .cursorrules)
# =========================================================================
def _truncate_content(content: str, filename: str, max_chars: int = CONTEXT_FILE_MAX_CHARS) -> str:
"""Head/tail truncation with a marker in the middle."""
if len(content) <= max_chars:
@@ -294,7 +253,7 @@ def _truncate_content(content: str, filename: str, max_chars: int = CONTEXT_FILE
return head + marker + tail
def build_context_files_prompt(cwd: str | None = None) -> str:
def build_context_files_prompt(cwd: Optional[str] = None) -> str:
"""Discover and load context files for the system prompt.
Discovery: AGENTS.md (recursive), .cursorrules / .cursor/rules/*.mdc,
@@ -317,9 +276,7 @@ def build_context_files_prompt(cwd: str | None = None) -> str:
if top_level_agents:
agents_files = []
for root, dirs, files in os.walk(cwd_path):
dirs[:] = [
d for d in dirs if not d.startswith(".") and d not in ("node_modules", "__pycache__", "venv", ".venv")
]
dirs[:] = [d for d in dirs if not d.startswith('.') and d not in ('node_modules', '__pycache__', 'venv', '.venv')]
for f in files:
if f.lower() == "agents.md":
agents_files.append(Path(root) / f)
@@ -396,7 +353,4 @@ def build_context_files_prompt(cwd: str | None = None) -> str:
if not sections:
return ""
return (
"# Project Context\n\nThe following project context files have been loaded and should be followed:\n\n"
+ "\n".join(sections)
)
return "# Project Context\n\nThe following project context files have been loaded and should be followed:\n\n" + "\n".join(sections)

View File

@@ -9,7 +9,7 @@ Pure functions -- no class state, no AIAgent dependency.
"""
import copy
from typing import Any
from typing import Any, Dict, List
def _apply_cache_marker(msg: dict, cache_marker: dict) -> None:
@@ -36,9 +36,9 @@ def _apply_cache_marker(msg: dict, cache_marker: dict) -> None:
def apply_anthropic_cache_control(
api_messages: list[dict[str, Any]],
api_messages: List[Dict[str, Any]],
cache_ttl: str = "5m",
) -> list[dict[str, Any]]:
) -> List[Dict[str, Any]]:
"""Apply system_and_3 caching strategy to messages for Anthropic models.
Places up to 4 cache_control breakpoints: system prompt + last 3 non-system messages.

View File

@@ -8,35 +8,23 @@ the first 6 and last 4 characters for debuggability.
"""
import logging
import os
import re
from typing import Optional
logger = logging.getLogger(__name__)
# Known API key prefixes -- match the prefix + contiguous token chars
_PREFIX_PATTERNS = [
r"sk-[A-Za-z0-9_-]{10,}", # OpenAI / OpenRouter / Anthropic (sk-ant-*)
r"ghp_[A-Za-z0-9]{10,}", # GitHub PAT (classic)
r"github_pat_[A-Za-z0-9_]{10,}", # GitHub PAT (fine-grained)
r"xox[baprs]-[A-Za-z0-9-]{10,}", # Slack tokens
r"AIza[A-Za-z0-9_-]{30,}", # Google API keys
r"pplx-[A-Za-z0-9]{10,}", # Perplexity
r"fal_[A-Za-z0-9_-]{10,}", # Fal.ai
r"fc-[A-Za-z0-9]{10,}", # Firecrawl
r"bb_live_[A-Za-z0-9_-]{10,}", # BrowserBase
r"gAAAA[A-Za-z0-9_=-]{20,}", # Codex encrypted tokens
r"AKIA[A-Z0-9]{16}", # AWS Access Key ID
r"sk_live_[A-Za-z0-9]{10,}", # Stripe secret key (live)
r"sk_test_[A-Za-z0-9]{10,}", # Stripe secret key (test)
r"rk_live_[A-Za-z0-9]{10,}", # Stripe restricted key
r"SG\.[A-Za-z0-9_-]{10,}", # SendGrid API key
r"hf_[A-Za-z0-9]{10,}", # HuggingFace token
r"r8_[A-Za-z0-9]{10,}", # Replicate API token
r"npm_[A-Za-z0-9]{10,}", # npm access token
r"pypi-[A-Za-z0-9_-]{10,}", # PyPI API token
r"dop_v1_[A-Za-z0-9]{10,}", # DigitalOcean PAT
r"doo_v1_[A-Za-z0-9]{10,}", # DigitalOcean OAuth
r"am_[A-Za-z0-9_-]{10,}", # AgentMail API key
r"sk-[A-Za-z0-9_-]{10,}", # OpenAI / OpenRouter
r"ghp_[A-Za-z0-9]{10,}", # GitHub PAT (classic)
r"github_pat_[A-Za-z0-9_]{10,}", # GitHub PAT (fine-grained)
r"xox[baprs]-[A-Za-z0-9-]{10,}", # Slack tokens
r"AIza[A-Za-z0-9_-]{30,}", # Google API keys
r"pplx-[A-Za-z0-9]{10,}", # Perplexity
r"fal_[A-Za-z0-9_-]{10,}", # Fal.ai
r"fc-[A-Za-z0-9]{10,}", # Firecrawl
r"bb_live_[A-Za-z0-9_-]{10,}", # BrowserBase
r"gAAAA[A-Za-z0-9_=-]{20,}", # Codex encrypted tokens
]
# ENV assignment patterns: KEY=value where KEY contains a secret-like name
@@ -64,22 +52,10 @@ _TELEGRAM_RE = re.compile(
r"(bot)?(\d{8,}):([-A-Za-z0-9_]{30,})",
)
# Private key blocks: -----BEGIN RSA PRIVATE KEY----- ... -----END RSA PRIVATE KEY-----
_PRIVATE_KEY_RE = re.compile(r"-----BEGIN[A-Z ]*PRIVATE KEY-----[\s\S]*?-----END[A-Z ]*PRIVATE KEY-----")
# Database connection strings: protocol://user:PASSWORD@host
# Catches postgres, mysql, mongodb, redis, amqp URLs and redacts the password
_DB_CONNSTR_RE = re.compile(
r"((?:postgres(?:ql)?|mysql|mongodb(?:\+srv)?|redis|amqp)://[^:]+:)([^@]+)(@)",
re.IGNORECASE,
)
# E.164 phone numbers: +<country><number>, 7-15 digits
# Negative lookahead prevents matching hex strings or identifiers
_SIGNAL_PHONE_RE = re.compile(r"(\+[1-9]\d{6,14})(?![A-Za-z0-9])")
# Compile known prefix patterns into one alternation
_PREFIX_RE = re.compile(r"(?<![A-Za-z0-9_-])(" + "|".join(_PREFIX_PATTERNS) + r")(?![A-Za-z0-9_-])")
_PREFIX_RE = re.compile(
r"(?<![A-Za-z0-9_-])(" + "|".join(_PREFIX_PATTERNS) + r")(?![A-Za-z0-9_-])"
)
def _mask_token(token: str) -> str:
@@ -93,12 +69,9 @@ def redact_sensitive_text(text: str) -> str:
"""Apply all redaction patterns to a block of text.
Safe to call on any string -- non-matching text passes through unchanged.
Disabled when security.redact_secrets is false in config.yaml.
"""
if not text:
return text
if os.getenv("HERMES_REDACT_SECRETS", "").lower() in ("0", "false", "no", "off"):
return text
# Known prefixes (sk-, ghp_, etc.)
text = _PREFIX_RE.sub(lambda m: _mask_token(m.group(1)), text)
@@ -107,14 +80,12 @@ def redact_sensitive_text(text: str) -> str:
def _redact_env(m):
name, quote, value = m.group(1), m.group(2), m.group(3)
return f"{name}={quote}{_mask_token(value)}{quote}"
text = _ENV_ASSIGN_RE.sub(_redact_env, text)
# JSON fields: "apiKey": "value"
def _redact_json(m):
key, value = m.group(1), m.group(2)
return f'{key}: "{_mask_token(value)}"'
text = _JSON_FIELD_RE.sub(_redact_json, text)
# Authorization headers
@@ -128,31 +99,15 @@ def redact_sensitive_text(text: str) -> str:
prefix = m.group(1) or ""
digits = m.group(2)
return f"{prefix}{digits}:***"
text = _TELEGRAM_RE.sub(_redact_telegram, text)
# Private key blocks
text = _PRIVATE_KEY_RE.sub("[REDACTED PRIVATE KEY]", text)
# Database connection string passwords
text = _DB_CONNSTR_RE.sub(lambda m: f"{m.group(1)}***{m.group(3)}", text)
# E.164 phone numbers (Signal, WhatsApp)
def _redact_phone(m):
phone = m.group(1)
if len(phone) <= 8:
return phone[:2] + "****" + phone[-2:]
return phone[:4] + "****" + phone[-4:]
text = _SIGNAL_PHONE_RE.sub(_redact_phone, text)
return text
class RedactingFormatter(logging.Formatter):
"""Log formatter that redacts secrets from all log messages."""
def __init__(self, fmt=None, datefmt=None, style="%", **kwargs):
def __init__(self, fmt=None, datefmt=None, style='%', **kwargs):
super().__init__(fmt, datefmt, style, **kwargs)
def format(self, record: logging.LogRecord) -> str:

View File

@@ -6,14 +6,14 @@ can invoke skills via /skill-name commands.
import logging
from pathlib import Path
from typing import Any
from typing import Any, Dict, Optional
logger = logging.getLogger(__name__)
_skill_commands: dict[str, dict[str, Any]] = {}
_skill_commands: Dict[str, Dict[str, Any]] = {}
def scan_skill_commands() -> dict[str, dict[str, Any]]:
def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
"""Scan ~/.hermes/skills/ and return a mapping of /command -> skill info.
Returns:
@@ -23,27 +23,26 @@ def scan_skill_commands() -> dict[str, dict[str, Any]]:
_skill_commands = {}
try:
from tools.skills_tool import SKILLS_DIR, _parse_frontmatter, skill_matches_platform
if not SKILLS_DIR.exists():
return _skill_commands
for skill_md in SKILLS_DIR.rglob("SKILL.md"):
if any(part in (".git", ".github", ".hub") for part in skill_md.parts):
if any(part in ('.git', '.github', '.hub') for part in skill_md.parts):
continue
try:
content = skill_md.read_text(encoding="utf-8")
content = skill_md.read_text(encoding='utf-8')
frontmatter, body = _parse_frontmatter(content)
# Skip skills incompatible with the current OS platform
if not skill_matches_platform(frontmatter):
continue
name = frontmatter.get("name", skill_md.parent.name)
description = frontmatter.get("description", "")
name = frontmatter.get('name', skill_md.parent.name)
description = frontmatter.get('description', '')
if not description:
for line in body.strip().split("\n"):
for line in body.strip().split('\n'):
line = line.strip()
if line and not line.startswith("#"):
if line and not line.startswith('#'):
description = line[:80]
break
cmd_name = name.lower().replace(" ", "-").replace("_", "-")
cmd_name = name.lower().replace(' ', '-').replace('_', '-')
_skill_commands[f"/{cmd_name}"] = {
"name": name,
"description": description or f"Invoke the {name} skill",
@@ -57,14 +56,14 @@ def scan_skill_commands() -> dict[str, dict[str, Any]]:
return _skill_commands
def get_skill_commands() -> dict[str, dict[str, Any]]:
def get_skill_commands() -> Dict[str, Dict[str, Any]]:
"""Return the current skill commands mapping (scan first if empty)."""
if not _skill_commands:
scan_skill_commands()
return _skill_commands
def build_skill_invocation_message(cmd_key: str, user_instruction: str = "") -> str | None:
def build_skill_invocation_message(cmd_key: str, user_instruction: str = "") -> Optional[str]:
"""Build the user message content for a skill slash command invocation.
Args:
@@ -84,7 +83,7 @@ def build_skill_invocation_message(cmd_key: str, user_instruction: str = "") ->
skill_name = skill_info["name"]
try:
content = skill_md_path.read_text(encoding="utf-8")
content = skill_md_path.read_text(encoding='utf-8')
except Exception:
return f"[Failed to load skill: {skill_name}]"
@@ -112,8 +111,6 @@ def build_skill_invocation_message(cmd_key: str, user_instruction: str = "") ->
if user_instruction:
parts.append("")
parts.append(
f"The user has provided the following instruction alongside the skill invocation: {user_instruction}"
)
parts.append(f"The user has provided the following instruction alongside the skill invocation: {user_instruction}")
return "\n".join(parts)

View File

@@ -8,7 +8,7 @@ the file-write logic live here.
import json
import logging
from datetime import datetime
from typing import Any
from typing import Any, Dict, List
logger = logging.getLogger(__name__)
@@ -27,7 +27,8 @@ def has_incomplete_scratchpad(content: str) -> bool:
return "<REASONING_SCRATCHPAD>" in content and "</REASONING_SCRATCHPAD>" not in content
def save_trajectory(trajectory: list[dict[str, Any]], model: str, completed: bool, filename: str = None):
def save_trajectory(trajectory: List[Dict[str, Any]], model: str,
completed: bool, filename: str = None):
"""Append a trajectory entry to a JSONL file.
Args:

File diff suppressed because it is too large Load Diff

View File

@@ -50,16 +50,6 @@ model:
# # Data policy: "allow" (default) or "deny" to exclude providers that may store data
# # data_collection: "deny"
# =============================================================================
# Git Worktree Isolation
# =============================================================================
# When enabled, each CLI session creates an isolated git worktree so multiple
# agents can work on the same repo concurrently without file collisions.
# Equivalent to always passing --worktree / -w on the command line.
#
# worktree: true # Always create a worktree when in a git repo
# worktree: false # Default — only create when -w flag is passed
# =============================================================================
# Terminal Tool Configuration
# =============================================================================
@@ -241,11 +231,11 @@ compression:
# "auto" - Best available: OpenRouter → Nous Portal → main endpoint (default)
# "openrouter" - Force OpenRouter (requires OPENROUTER_API_KEY)
# "nous" - Force Nous Portal (requires: hermes login)
# "codex" - Force Codex OAuth (requires: hermes model → Codex).
# Uses gpt-5.3-codex which supports vision.
# "main" - Use your custom endpoint (OPENAI_BASE_URL + OPENAI_API_KEY).
# Works with OpenAI API, local models, or any OpenAI-compatible
# endpoint. Also falls back to Codex OAuth and API-key providers.
# "main" - Use the same provider & credentials as your main chat model.
# Skips OpenRouter/Nous and uses your custom endpoint
# (OPENAI_BASE_URL), Codex OAuth, or API-key provider directly.
# Useful if you run a local model and want auxiliary tasks to
# use it too.
#
# Model: leave empty to use the provider's default. When empty, OpenRouter
# uses "google/gemini-3-flash-preview" and Nous uses "gemini-3-flash".
@@ -345,7 +335,7 @@ agent:
# Reasoning effort level (OpenRouter and Nous Portal)
# Controls how much "thinking" the model does before responding.
# Options: "xhigh" (max), "high", "medium", "low", "minimal", "none" (disable)
reasoning_effort: "medium"
reasoning_effort: "xhigh"
# Predefined personalities (use with /personality command)
personalities:
@@ -555,21 +545,6 @@ toolsets:
# args: ["-y", "@modelcontextprotocol/server-github"]
# env:
# GITHUB_PERSONAL_ACCESS_TOKEN: "ghp_..."
#
# Sampling (server-initiated LLM requests) — enabled by default.
# Per-server config under the 'sampling' key:
# analysis:
# command: npx
# args: ["-y", "analysis-server"]
# sampling:
# enabled: true # default: true
# model: "gemini-3-flash" # override model (optional)
# max_tokens_cap: 4096 # max tokens per request
# timeout: 30 # LLM call timeout (seconds)
# max_rpm: 10 # max requests per minute
# allowed_models: [] # model whitelist (empty = all)
# max_tool_rounds: 5 # tool loop limit (0 = disable)
# log_level: "info" # audit verbosity
# =============================================================================
# Voice Transcription (Speech-to-Text)
@@ -650,8 +625,3 @@ display:
# verbose: Full args, results, and debug logs (same as /verbose)
# Toggle at runtime with /verbose in the CLI
tool_progress: all
# Play terminal bell when agent finishes a response.
# Useful for long-running tasks — your terminal will ding when the agent is done.
# Works over SSH. Most terminals can be configured to flash the taskbar or play a sound.
bell_on_complete: false

2179
cli.py

File diff suppressed because it is too large Load Diff

View File

@@ -15,18 +15,18 @@ duplicate execution if multiple processes overlap.
"""
from cron.jobs import (
JOBS_FILE,
create_job,
get_job,
list_jobs,
remove_job,
update_job,
JOBS_FILE,
)
from cron.scheduler import tick
__all__ = [
"create_job",
"get_job",
"get_job",
"list_jobs",
"remove_job",
"update_job",

View File

@@ -6,19 +6,18 @@ Output is saved to ~/.hermes/cron/output/{job_id}/{timestamp}.md
"""
import json
import tempfile
import os
import re
import tempfile
import uuid
from datetime import datetime, timedelta
from pathlib import Path
from typing import Any
from typing import Optional, Dict, List, Any
from hermes_time import now as _hermes_now
try:
from croniter import croniter
HAS_CRONITER = True
except ImportError:
HAS_CRONITER = False
@@ -43,38 +42,37 @@ def ensure_dirs():
# Schedule Parsing
# =============================================================================
def parse_duration(s: str) -> int:
"""
Parse duration string into minutes.
Examples:
"30m" → 30
"2h" → 120
"1d" → 1440
"""
s = s.strip().lower()
match = re.match(r"^(\d+)\s*(m|min|mins|minute|minutes|h|hr|hrs|hour|hours|d|day|days)$", s)
match = re.match(r'^(\d+)\s*(m|min|mins|minute|minutes|h|hr|hrs|hour|hours|d|day|days)$', s)
if not match:
raise ValueError(f"Invalid duration: '{s}'. Use format like '30m', '2h', or '1d'")
value = int(match.group(1))
unit = match.group(2)[0] # First char: m, h, or d
multipliers = {"m": 1, "h": 60, "d": 1440}
multipliers = {'m': 1, 'h': 60, 'd': 1440}
return value * multipliers[unit]
def parse_schedule(schedule: str) -> dict[str, Any]:
def parse_schedule(schedule: str) -> Dict[str, Any]:
"""
Parse schedule string into structured format.
Returns dict with:
- kind: "once" | "interval" | "cron"
- For "once": "run_at" (ISO timestamp)
- For "interval": "minutes" (int)
- For "cron": "expr" (cron expression)
Examples:
"30m" → once in 30 minutes
"2h" → once in 2 hours
@@ -86,17 +84,23 @@ def parse_schedule(schedule: str) -> dict[str, Any]:
schedule = schedule.strip()
original = schedule
schedule_lower = schedule.lower()
# "every X" pattern → recurring interval
if schedule_lower.startswith("every "):
duration_str = schedule[6:].strip()
minutes = parse_duration(duration_str)
return {"kind": "interval", "minutes": minutes, "display": f"every {minutes}m"}
return {
"kind": "interval",
"minutes": minutes,
"display": f"every {minutes}m"
}
# Check for cron expression (5 or 6 space-separated fields)
# Cron fields: minute hour day month weekday [year]
parts = schedule.split()
if len(parts) >= 5 and all(re.match(r"^[\d\*\-,/]+$", p) for p in parts[:5]):
if len(parts) >= 5 and all(
re.match(r'^[\d\*\-,/]+$', p) for p in parts[:5]
):
if not HAS_CRONITER:
raise ValueError("Cron expressions require 'croniter' package. Install with: pip install croniter")
# Validate cron expression
@@ -104,25 +108,37 @@ def parse_schedule(schedule: str) -> dict[str, Any]:
croniter(schedule)
except Exception as e:
raise ValueError(f"Invalid cron expression '{schedule}': {e}")
return {"kind": "cron", "expr": schedule, "display": schedule}
return {
"kind": "cron",
"expr": schedule,
"display": schedule
}
# ISO timestamp (contains T or looks like date)
if "T" in schedule or re.match(r"^\d{4}-\d{2}-\d{2}", schedule):
if 'T' in schedule or re.match(r'^\d{4}-\d{2}-\d{2}', schedule):
try:
# Parse and validate
dt = datetime.fromisoformat(schedule.replace("Z", "+00:00"))
return {"kind": "once", "run_at": dt.isoformat(), "display": f"once at {dt.strftime('%Y-%m-%d %H:%M')}"}
dt = datetime.fromisoformat(schedule.replace('Z', '+00:00'))
return {
"kind": "once",
"run_at": dt.isoformat(),
"display": f"once at {dt.strftime('%Y-%m-%d %H:%M')}"
}
except ValueError as e:
raise ValueError(f"Invalid timestamp '{schedule}': {e}")
# Duration like "30m", "2h", "1d" → one-shot from now
try:
minutes = parse_duration(schedule)
run_at = _hermes_now() + timedelta(minutes=minutes)
return {"kind": "once", "run_at": run_at.isoformat(), "display": f"once in {original}"}
return {
"kind": "once",
"run_at": run_at.isoformat(),
"display": f"once in {original}"
}
except ValueError:
pass
raise ValueError(
f"Invalid schedule '{original}'. Use:\n"
f" - Duration: '30m', '2h', '1d' (one-shot)\n"
@@ -145,7 +161,7 @@ def _ensure_aware(dt: datetime) -> datetime:
return dt
def compute_next_run(schedule: dict[str, Any], last_run_at: str | None = None) -> str | None:
def compute_next_run(schedule: Dict[str, Any], last_run_at: Optional[str] = None) -> Optional[str]:
"""
Compute the next run time for a schedule.
@@ -183,27 +199,26 @@ def compute_next_run(schedule: dict[str, Any], last_run_at: str | None = None) -
# Job CRUD Operations
# =============================================================================
def load_jobs() -> list[dict[str, Any]]:
def load_jobs() -> List[Dict[str, Any]]:
"""Load all jobs from storage."""
ensure_dirs()
if not JOBS_FILE.exists():
return []
try:
with open(JOBS_FILE, encoding="utf-8") as f:
with open(JOBS_FILE, 'r', encoding='utf-8') as f:
data = json.load(f)
return data.get("jobs", [])
except (OSError, json.JSONDecodeError):
except (json.JSONDecodeError, IOError):
return []
def save_jobs(jobs: list[dict[str, Any]]):
def save_jobs(jobs: List[Dict[str, Any]]):
"""Save all jobs to storage."""
ensure_dirs()
fd, tmp_path = tempfile.mkstemp(dir=str(JOBS_FILE.parent), suffix=".tmp", prefix=".jobs_")
fd, tmp_path = tempfile.mkstemp(dir=str(JOBS_FILE.parent), suffix='.tmp', prefix='.jobs_')
try:
with os.fdopen(fd, "w", encoding="utf-8") as f:
with os.fdopen(fd, 'w', encoding='utf-8') as f:
json.dump({"jobs": jobs, "updated_at": _hermes_now().isoformat()}, f, indent=2)
f.flush()
os.fsync(f.fileno())
@@ -219,14 +234,14 @@ def save_jobs(jobs: list[dict[str, Any]]):
def create_job(
prompt: str,
schedule: str,
name: str | None = None,
repeat: int | None = None,
deliver: str | None = None,
origin: dict[str, Any] | None = None,
) -> dict[str, Any]:
name: Optional[str] = None,
repeat: Optional[int] = None,
deliver: Optional[str] = None,
origin: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
"""
Create a new cron job.
Args:
prompt: The prompt to run (must be self-contained)
schedule: Schedule string (see parse_schedule)
@@ -234,23 +249,23 @@ def create_job(
repeat: How many times to run (None = forever, 1 = once)
deliver: Where to deliver output ("origin", "local", "telegram", etc.)
origin: Source info where job was created (for "origin" delivery)
Returns:
The created job dict
"""
parsed_schedule = parse_schedule(schedule)
# Auto-set repeat=1 for one-shot schedules if not specified
if parsed_schedule["kind"] == "once" and repeat is None:
repeat = 1
# Default delivery to origin if available, otherwise local
if deliver is None:
deliver = "origin" if origin else "local"
job_id = uuid.uuid4().hex[:12]
now = _hermes_now().isoformat()
job = {
"id": job_id,
"name": name or prompt[:50].strip(),
@@ -259,7 +274,7 @@ def create_job(
"schedule_display": parsed_schedule.get("display", schedule),
"repeat": {
"times": repeat, # None = forever
"completed": 0,
"completed": 0
},
"enabled": True,
"created_at": now,
@@ -271,15 +286,15 @@ def create_job(
"deliver": deliver,
"origin": origin, # Tracks where job was created for "origin" delivery
}
jobs = load_jobs()
jobs.append(job)
save_jobs(jobs)
return job
def get_job(job_id: str) -> dict[str, Any] | None:
def get_job(job_id: str) -> Optional[Dict[str, Any]]:
"""Get a job by ID."""
jobs = load_jobs()
for job in jobs:
@@ -288,7 +303,7 @@ def get_job(job_id: str) -> dict[str, Any] | None:
return None
def list_jobs(include_disabled: bool = False) -> list[dict[str, Any]]:
def list_jobs(include_disabled: bool = False) -> List[Dict[str, Any]]:
"""List all jobs, optionally including disabled ones."""
jobs = load_jobs()
if not include_disabled:
@@ -296,7 +311,7 @@ def list_jobs(include_disabled: bool = False) -> list[dict[str, Any]]:
return jobs
def update_job(job_id: str, updates: dict[str, Any]) -> dict[str, Any] | None:
def update_job(job_id: str, updates: Dict[str, Any]) -> Optional[Dict[str, Any]]:
"""Update a job by ID."""
jobs = load_jobs()
for i, job in enumerate(jobs):
@@ -318,10 +333,10 @@ def remove_job(job_id: str) -> bool:
return False
def mark_job_run(job_id: str, success: bool, error: str | None = None):
def mark_job_run(job_id: str, success: bool, error: Optional[str] = None):
"""
Mark a job as having been run.
Updates last_run_at, last_status, increments completed count,
computes next_run_at, and auto-deletes if repeat limit reached.
"""
@@ -332,11 +347,11 @@ def mark_job_run(job_id: str, success: bool, error: str | None = None):
job["last_run_at"] = now
job["last_status"] = "ok" if success else "error"
job["last_error"] = error if not success else None
# Increment completed count
if job.get("repeat"):
job["repeat"]["completed"] = job["repeat"].get("completed", 0) + 1
# Check if we've hit the repeat limit
times = job["repeat"].get("times")
completed = job["repeat"]["completed"]
@@ -345,38 +360,38 @@ def mark_job_run(job_id: str, success: bool, error: str | None = None):
jobs.pop(i)
save_jobs(jobs)
return
# Compute next run
job["next_run_at"] = compute_next_run(job["schedule"], now)
# If no next run (one-shot completed), disable
if job["next_run_at"] is None:
job["enabled"] = False
save_jobs(jobs)
return
save_jobs(jobs)
def get_due_jobs() -> list[dict[str, Any]]:
def get_due_jobs() -> List[Dict[str, Any]]:
"""Get all jobs that are due to run now."""
now = _hermes_now()
jobs = load_jobs()
due = []
for job in jobs:
if not job.get("enabled", True):
continue
next_run = job.get("next_run_at")
if not next_run:
continue
next_run_dt = _ensure_aware(datetime.fromisoformat(next_run))
if next_run_dt <= now:
due.append(job)
return due
@@ -385,11 +400,11 @@ def save_job_output(job_id: str, output: str):
ensure_dirs()
job_output_dir = OUTPUT_DIR / job_id
job_output_dir.mkdir(parents=True, exist_ok=True)
timestamp = _hermes_now().strftime("%Y-%m-%d_%H-%M-%S")
output_file = job_output_dir / f"{timestamp}.md"
with open(output_file, "w", encoding="utf-8") as f:
with open(output_file, 'w', encoding='utf-8') as f:
f.write(output)
return output_file

View File

@@ -23,7 +23,9 @@ except ImportError:
import msvcrt
except ImportError:
msvcrt = None
from datetime import datetime
from pathlib import Path
from typing import Optional
from hermes_time import now as _hermes_now
@@ -42,7 +44,7 @@ _LOCK_DIR = _hermes_home / "cron"
_LOCK_FILE = _LOCK_DIR / ".tick.lock"
def _resolve_origin(job: dict) -> dict | None:
def _resolve_origin(job: dict) -> Optional[dict]:
"""Extract origin info from a job, returning {platform, chat_id, chat_name} or None."""
origin = job.get("origin")
if not origin:
@@ -85,23 +87,17 @@ def _deliver_result(job: dict, content: str) -> None:
# Fall back to home channel
chat_id = os.getenv(f"{platform_name.upper()}_HOME_CHANNEL", "")
if not chat_id:
logger.warning(
"Job '%s' deliver=%s but no chat_id or home channel. Set via: hermes config set %s_HOME_CHANNEL <channel_id>",
job["id"],
deliver,
platform_name.upper(),
)
logger.warning("Job '%s' deliver=%s but no chat_id or home channel. Set via: hermes config set %s_HOME_CHANNEL <channel_id>", job["id"], deliver, platform_name.upper())
return
from gateway.config import Platform, load_gateway_config
from tools.send_message_tool import _send_to_platform
from gateway.config import load_gateway_config, Platform
platform_map = {
"telegram": Platform.TELEGRAM,
"discord": Platform.DISCORD,
"slack": Platform.SLACK,
"whatsapp": Platform.WHATSAPP,
"signal": Platform.SIGNAL,
}
platform = platform_map.get(platform_name.lower())
if not platform:
@@ -126,7 +122,6 @@ def _deliver_result(job: dict, content: str) -> None:
# asyncio.run() fails if there's already a running loop in this thread;
# spin up a new thread to avoid that.
import concurrent.futures
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
future = pool.submit(asyncio.run, _send_to_platform(platform, pconfig, chat_id, content))
result = future.result(timeout=30)
@@ -141,26 +136,25 @@ def _deliver_result(job: dict, content: str) -> None:
# Mirror the delivered content into the target's gateway session
try:
from gateway.mirror import mirror_to_session
mirror_to_session(platform_name, chat_id, content, source_label="cron")
except Exception:
pass
def run_job(job: dict) -> tuple[bool, str, str, str | None]:
def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
"""
Execute a single cron job.
Returns:
Tuple of (success, full_output_doc, final_response, error_message)
"""
from run_agent import AIAgent
job_id = job["id"]
job_name = job["name"]
prompt = job["prompt"]
origin = _resolve_origin(job)
logger.info("Running job '%s' (ID: %s)", job_name, job_id)
logger.info("Prompt: %s", prompt[:100])
@@ -175,7 +169,6 @@ def run_job(job: dict) -> tuple[bool, str, str, str | None]:
# Re-read .env and config.yaml fresh every run so provider/key
# changes take effect without a gateway restart.
from dotenv import load_dotenv
try:
load_dotenv(str(_hermes_home / ".env"), override=True, encoding="utf-8")
except UnicodeDecodeError:
@@ -183,11 +176,8 @@ def run_job(job: dict) -> tuple[bool, str, str, str | None]:
model = os.getenv("HERMES_MODEL") or os.getenv("LLM_MODEL") or "anthropic/claude-opus-4.6"
# Load config.yaml for model, reasoning, prefill, toolsets, provider routing
_cfg = {}
try:
import yaml
_cfg_path = str(_hermes_home / "config.yaml")
if os.path.exists(_cfg_path):
with open(_cfg_path) as _f:
@@ -200,47 +190,10 @@ def run_job(job: dict) -> tuple[bool, str, str, str | None]:
except Exception:
pass
# Reasoning config from env or config.yaml
reasoning_config = None
effort = os.getenv("HERMES_REASONING_EFFORT", "")
if not effort:
effort = str(_cfg.get("agent", {}).get("reasoning_effort", "")).strip()
if effort and effort.lower() != "none":
valid = ("xhigh", "high", "medium", "low", "minimal")
if effort.lower() in valid:
reasoning_config = {"enabled": True, "effort": effort.lower()}
elif effort.lower() == "none":
reasoning_config = {"enabled": False}
# Prefill messages from env or config.yaml
prefill_messages = None
prefill_file = os.getenv("HERMES_PREFILL_MESSAGES_FILE", "") or _cfg.get("prefill_messages_file", "")
if prefill_file:
import json as _json
pfpath = Path(prefill_file).expanduser()
if not pfpath.is_absolute():
pfpath = _hermes_home / pfpath
if pfpath.exists():
try:
with open(pfpath, encoding="utf-8") as _pf:
prefill_messages = _json.load(_pf)
if not isinstance(prefill_messages, list):
prefill_messages = None
except Exception:
prefill_messages = None
# Max iterations
max_iterations = _cfg.get("agent", {}).get("max_turns") or _cfg.get("max_turns") or 90
# Provider routing
pr = _cfg.get("provider_routing", {})
from hermes_cli.runtime_provider import (
format_runtime_provider_error,
resolve_runtime_provider,
format_runtime_provider_error,
)
try:
runtime = resolve_runtime_provider(
requested=os.getenv("HERMES_INFERENCE_PROVIDER"),
@@ -255,28 +208,21 @@ def run_job(job: dict) -> tuple[bool, str, str, str | None]:
base_url=runtime.get("base_url"),
provider=runtime.get("provider"),
api_mode=runtime.get("api_mode"),
max_iterations=max_iterations,
reasoning_config=reasoning_config,
prefill_messages=prefill_messages,
providers_allowed=pr.get("only"),
providers_ignored=pr.get("ignore"),
providers_order=pr.get("order"),
provider_sort=pr.get("sort"),
quiet_mode=True,
session_id=f"cron_{job_id}_{_hermes_now().strftime('%Y%m%d_%H%M%S')}",
session_id=f"cron_{job_id}_{_hermes_now().strftime('%Y%m%d_%H%M%S')}"
)
result = agent.run_conversation(prompt)
final_response = result.get("final_response", "")
if not final_response:
final_response = "(No response generated)"
output = f"""# Cron Job: {job_name}
**Job ID:** {job_id}
**Run Time:** {_hermes_now().strftime("%Y-%m-%d %H:%M:%S")}
**Schedule:** {job.get("schedule_display", "N/A")}
**Run Time:** {_hermes_now().strftime('%Y-%m-%d %H:%M:%S')}
**Schedule:** {job.get('schedule_display', 'N/A')}
## Prompt
@@ -286,19 +232,19 @@ def run_job(job: dict) -> tuple[bool, str, str, str | None]:
{final_response}
"""
logger.info("Job '%s' completed successfully", job_name)
return True, output, final_response, None
except Exception as e:
error_msg = f"{type(e).__name__}: {str(e)}"
logger.error("Job '%s' failed: %s", job_name, error_msg)
output = f"""# Cron Job: {job_name} (FAILED)
**Job ID:** {job_id}
**Run Time:** {_hermes_now().strftime("%Y-%m-%d %H:%M:%S")}
**Schedule:** {job.get("schedule_display", "N/A")}
**Run Time:** {_hermes_now().strftime('%Y-%m-%d %H:%M:%S')}
**Schedule:** {job.get('schedule_display', 'N/A')}
## Prompt
@@ -323,13 +269,13 @@ def run_job(job: dict) -> tuple[bool, str, str, str | None]:
def tick(verbose: bool = True) -> int:
"""
Check and run all due jobs.
Uses a file lock so only one tick runs at a time, even if the gateway's
in-process ticker and a standalone daemon or manual tick overlap.
Args:
verbose: Whether to print status messages
Returns:
Number of jobs executed (0 if another tick is already running)
"""
@@ -343,7 +289,7 @@ def tick(verbose: bool = True) -> int:
fcntl.flock(lock_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
elif msvcrt:
msvcrt.locking(lock_fd.fileno(), msvcrt.LK_NBLCK, 1)
except OSError:
except (OSError, IOError):
logger.debug("Tick skipped — another instance holds the lock")
if lock_fd is not None:
lock_fd.close()
@@ -353,11 +299,11 @@ def tick(verbose: bool = True) -> int:
due_jobs = get_due_jobs()
if verbose and not due_jobs:
logger.info("%s - No jobs due", _hermes_now().strftime("%H:%M:%S"))
logger.info("%s - No jobs due", _hermes_now().strftime('%H:%M:%S'))
return 0
if verbose:
logger.info("%s - %s job(s) due", _hermes_now().strftime("%H:%M:%S"), len(due_jobs))
logger.info("%s - %s job(s) due", _hermes_now().strftime('%H:%M:%S'), len(due_jobs))
executed = 0
for job in due_jobs:
@@ -369,9 +315,7 @@ def tick(verbose: bool = True) -> int:
logger.info("Output saved to: %s", output_file)
# Deliver the final response to the origin/target chat
deliver_content = (
final_response if success else f"⚠️ Cron job '{job.get('name', job['id'])}' failed:\n{error}"
)
deliver_content = final_response if success else f"⚠️ Cron job '{job.get('name', job['id'])}' failed:\n{error}"
if deliver_content:
try:
_deliver_result(job, deliver_content)
@@ -382,7 +326,7 @@ def tick(verbose: bool = True) -> int:
executed += 1
except Exception as e:
logger.error("Error processing job %s: %s", job["id"], e)
logger.error("Error processing job %s: %s", job['id'], e)
mark_job_run(job["id"], False, str(e))
return executed
@@ -392,7 +336,7 @@ def tick(verbose: bool = True) -> int:
elif msvcrt:
try:
msvcrt.locking(lock_fd.fileno(), msvcrt.LK_UNLCK, 1)
except OSError:
except (OSError, IOError):
pass
lock_fd.close()

View File

@@ -1,46 +0,0 @@
# datagen-config-examples/web_research.yaml
#
# Batch data generation config for WebResearchEnv.
# Generates tool-calling trajectories for multi-step web research tasks.
#
# Usage:
# python batch_runner.py \
# --config datagen-config-examples/web_research.yaml \
# --run_name web_research_v1
environment: web-research
# Toolsets available to the agent during data generation
toolsets:
- web
- file
# How many parallel workers to use
num_workers: 4
# Questions per batch
batch_size: 20
# Total trajectories to generate (comment out to run full dataset)
max_items: 500
# Model to use for generation (override with --model flag)
model: openrouter/nousresearch/hermes-3-llama-3.1-405b
# System prompt additions (ephemeral — not saved to trajectories)
ephemeral_system_prompt: |
You are a highly capable research agent. When asked a factual question,
always use web_search to find current, accurate information before answering.
Cite at least 2 sources. Be concise and accurate.
# Output directory
output_dir: data/web_research_v1
# Trajectory compression settings (for fitting into training token budgets)
compression:
enabled: true
target_max_tokens: 16000
# Eval settings
eval_every: 100 # Run eval every N trajectories
eval_size: 25 # Number of held-out questions per eval run

7
docs/README.md Normal file
View File

@@ -0,0 +1,7 @@
# Documentation
All documentation has moved to the website:
**📖 [hermes-agent.nousresearch.com/docs](https://hermes-agent.nousresearch.com/docs/)**
The documentation source files live in [`website/docs/`](../website/docs/).

View File

@@ -0,0 +1,344 @@
# send_file Integration Map — Hermes Agent Codebase Deep Dive
## 1. environments/tool_context.py — Base64 File Transfer Implementation
### upload_file() (lines 153-205)
- Reads local file as raw bytes, base64-encodes to ASCII string
- Creates parent dirs in sandbox via `self.terminal(f"mkdir -p {parent}")`
- **Chunk size:** 60,000 chars (~60KB per shell command)
- **Small files (<=60KB b64):** Single `printf '%s' '{b64}' | base64 -d > {remote_path}`
- **Large files:** Writes chunks to `/tmp/_hermes_upload.b64` via `printf >> append`, then `base64 -d` to target
- **Error handling:** Checks local file exists; returns `{exit_code, output}`
- **Size limits:** No explicit limit, but shell arg limit ~2MB means chunking is necessary for files >~45KB raw
- **No theoretical max** — but very large files would be slow (many terminal round trips)
### download_file() (lines 234-278)
- Runs `base64 {remote_path}` inside sandbox, captures stdout
- Strips output, base64-decodes to raw bytes
- Writes to host filesystem with parent dir creation
- **Error handling:** Checks exit code, empty output, decode errors
- Returns `{success: bool, bytes: int}` or `{success: false, error: str}`
- **Size limit:** Bounded by terminal output buffer (practical limit ~few MB via base64 terminal output)
### Promotion potential:
- These methods work via `self.terminal()` — they're environment-agnostic
- Could be directly lifted into a new tool that operates on the agent's current sandbox
- For send_file, this `download_file()` pattern is the key: it extracts files from sandbox → host
## 2. tools/environments/base.py — BaseEnvironment Interface
### Current methods:
- `execute(command, cwd, timeout, stdin_data)``{output, returncode}`
- `cleanup()` — release resources
- `stop()` — alias for cleanup
- `_prepare_command()` — sudo transformation
- `_build_run_kwargs()` — subprocess kwargs
- `_timeout_result()` — standard timeout dict
### What would need to be added for file transfer:
- **Nothing required at this level.** File transfer can be implemented via `execute()` (base64 over terminal, like ToolContext does) or via environment-specific methods.
- Optional: `upload_file(local_path, remote_path)` and `download_file(remote_path, local_path)` methods could be added to BaseEnvironment for optimized per-backend transfers, but the base64-over-terminal approach already works universally.
## 3. tools/environments/docker.py — Docker Container Details
### Container ID tracking:
- `self._container_id` stored at init from `self._inner.container_id`
- Inner is `minisweagent.environments.docker.DockerEnvironment`
- Container ID is a standard Docker container hash
### docker cp feasibility:
- **YES**, `docker cp` could be used for optimized file transfer:
- `docker cp {container_id}:{remote_path} {local_path}` (download)
- `docker cp {local_path} {container_id}:{remote_path}` (upload)
- Much faster than base64-over-terminal for large files
- Container ID is directly accessible via `env._container_id` or `env._inner.container_id`
### Volumes mounted:
- **Persistent mode:** Bind mounts at `~/.hermes/sandboxes/docker/{task_id}/workspace``/workspace` and `.../home``/root`
- **Ephemeral mode:** tmpfs at `/workspace` (10GB), `/home` (1GB), `/root` (1GB)
- **User volumes:** From `config.yaml docker_volumes` (arbitrary `-v` mounts)
- **Security tmpfs:** `/tmp` (512MB), `/var/tmp` (256MB), `/run` (64MB)
### Direct host access for persistent mode:
- If persistent, files at `/workspace/foo.txt` are just `~/.hermes/sandboxes/docker/{task_id}/workspace/foo.txt` on host — no transfer needed!
## 4. tools/environments/ssh.py — SSH Connection Management
### Connection management:
- Uses SSH ControlMaster for persistent connection
- Control socket at `/tmp/hermes-ssh/{user}@{host}:{port}.sock`
- ControlPersist=300 (5 min keepalive)
- BatchMode=yes (non-interactive)
- Stores: `self.host`, `self.user`, `self.port`, `self.key_path`
### SCP/SFTP feasibility:
- **YES**, SCP can piggyback on the ControlMaster socket:
- `scp -o ControlPath={socket} {user}@{host}:{remote} {local}` (download)
- `scp -o ControlPath={socket} {local} {user}@{host}:{remote}` (upload)
- Same SSH key and connection reuse — zero additional auth
- Would be much faster than base64-over-terminal for large files
## 5. tools/environments/modal.py — Modal Sandbox Filesystem
### Filesystem API exposure:
- **Not directly.** The inner `SwerexModalEnvironment` wraps Modal's sandbox
- The sandbox object is accessible at: `env._inner.deployment._sandbox`
- Modal's Python SDK exposes `sandbox.open()` for file I/O — but only via async API
- Currently only used for `snapshot_filesystem()` during cleanup
- **Could use:** `sandbox.open(path, "rb")` to read files or `sandbox.open(path, "wb")` to write
- **Alternative:** Base64-over-terminal already works via `execute()` — simpler, no SDK dependency
## 6. gateway/platforms/base.py — MEDIA: Tag Flow (Complete)
### extract_media() (lines 587-620):
- **Pattern:** `MEDIA:\S+` — extracts file paths after MEDIA: prefix
- **Voice flag:** `[[audio_as_voice]]` global directive sets `is_voice=True` for all media in message
- Returns `List[Tuple[str, bool]]` (path, is_voice) and cleaned content
### _process_message_background() media routing (lines 752-786):
- After extracting MEDIA tags, routes by file extension:
- `.ogg .opus .mp3 .wav .m4a``send_voice()`
- `.mp4 .mov .avi .mkv .3gp``send_video()`
- `.jpg .jpeg .png .webp .gif``send_image_file()`
- **Everything else** → `send_document()`
- This routing already supports arbitrary files!
### send_* method inventory (base class):
- `send(chat_id, content, reply_to, metadata)` — ABSTRACT, text
- `send_image(chat_id, image_url, caption, reply_to)` — URL-based images
- `send_animation(chat_id, animation_url, caption, reply_to)` — GIF animations
- `send_voice(chat_id, audio_path, caption, reply_to)` — voice messages
- `send_video(chat_id, video_path, caption, reply_to)` — video files
- `send_document(chat_id, file_path, caption, file_name, reply_to)` — generic files
- `send_image_file(chat_id, image_path, caption, reply_to)` — local image files
- `send_typing(chat_id)` — typing indicator
- `edit_message(chat_id, message_id, content)` — edit sent messages
### What's missing:
- **Telegram:** No override for `send_document` or `send_image_file` — falls back to text!
- **Discord:** No override for `send_document` — falls back to text!
- **WhatsApp:** Has `send_document` and `send_image_file` via bridge — COMPLETE.
- The base class defaults just send "📎 File: /path" as text — useless for actual file delivery.
## 7. gateway/platforms/telegram.py — Send Method Analysis
### Implemented send methods:
- `send()` — MarkdownV2 text with fallback to plain
- `send_voice()``.ogg`/`.opus` as `send_voice()`, others as `send_audio()`
- `send_image()` — URL-based via `send_photo()`
- `send_animation()` — GIF via `send_animation()`
- `send_typing()` — "typing" chat action
- `edit_message()` — edit text messages
### MISSING:
- **`send_document()` NOT overridden** — Need to add `self._bot.send_document(chat_id, document=open(file_path, 'rb'), ...)`
- **`send_image_file()` NOT overridden** — Need to add `self._bot.send_photo(chat_id, photo=open(path, 'rb'), ...)`
- **`send_video()` NOT overridden** — Need to add `self._bot.send_video(...)`
## 8. gateway/platforms/discord.py — Send Method Analysis
### Implemented send methods:
- `send()` — text messages with chunking
- `send_voice()` — discord.File attachment
- `send_image()` — downloads URL, creates discord.File attachment
- `send_typing()` — channel.typing()
- `edit_message()` — edit text messages
### MISSING:
- **`send_document()` NOT overridden** — Need to add discord.File attachment
- **`send_image_file()` NOT overridden** — Need to add discord.File from local path
- **`send_video()` NOT overridden** — Need to add discord.File attachment
## 9. gateway/run.py — User File Attachment Handling
### Current attachment flow:
1. **Telegram photos** (line 509-529): Download via `photo.get_file()``cache_image_from_bytes()` → vision auto-analysis
2. **Telegram voice** (line 532-541): Download → `cache_audio_from_bytes()` → STT transcription
3. **Telegram audio** (line 542-551): Same pattern
4. **Telegram documents** (line 553-617): Extension validation against `SUPPORTED_DOCUMENT_TYPES`, 20MB limit, content injection for text files
5. **Discord attachments** (line 717-751): Content-type detection, image/audio caching, URL fallback for other types
6. **Gateway run.py** (lines 818-883): Auto-analyzes images with vision, transcribes audio, enriches document messages with context notes
### Key insight: Files are always cached to host filesystem first, then processed. The agent sees local file paths.
## 10. tools/terminal_tool.py — Terminal Tool & Environment Interaction
### How it manages environments:
- Global dict `_active_environments: Dict[str, Any]` keyed by task_id
- Per-task creation locks prevent duplicate sandbox creation
- Auto-cleanup thread kills idle environments after `TERMINAL_LIFETIME_SECONDS`
- `_get_env_config()` reads all TERMINAL_* env vars for backend selection
- `_create_environment()` factory creates the right backend type
### Could send_file piggyback?
- **YES.** send_file needs access to the same environment to extract files from sandboxes.
- It can reuse `_active_environments[task_id]` to get the environment, then:
- Docker: Use `docker cp` via `env._container_id`
- SSH: Use `scp` via `env.control_socket`
- Local: Just read the file directly
- Modal: Use base64-over-terminal via `env.execute()`
- The file_tools.py module already does this with `ShellFileOperations` — read_file/write_file/search/patch all share the same env instance.
## 11. tools/tts_tool.py — Working Example of File Delivery
### Flow:
1. Generate audio file to `~/.hermes/audio_cache/tts_TIMESTAMP.{ogg,mp3}`
2. Return JSON with `media_tag: "MEDIA:/path/to/file"`
3. For Telegram voice: prepend `[[audio_as_voice]]` directive
4. The LLM includes the MEDIA tag in its response text
5. `BasePlatformAdapter._process_message_background()` calls `extract_media()` to find the tag
6. Routes by extension → `send_voice()` for audio files
7. Platform adapter sends the file natively
### Key pattern: Tool saves file to host → returns MEDIA: path → LLM echoes it → gateway extracts → platform delivers
## 12. tools/image_generation_tool.py — Working Example of Image Delivery
### Flow:
1. Call FAL.ai API → get image URL
2. Return JSON with `image: "https://fal.media/..."` URL
3. The LLM includes the URL in markdown: `![description](URL)`
4. `BasePlatformAdapter.extract_images()` finds `![alt](url)` patterns
5. Routes through `send_image()` (URL) or `send_animation()` (GIF)
6. Platform downloads and sends natively
### Key difference from TTS: Images are URL-based, not local files. The gateway downloads at send time.
---
# INTEGRATION MAP: Where send_file Hooks In
## Architecture Decision: MEDIA: Tag Protocol vs. New Tool
The MEDIA: tag protocol is already the established pattern for file delivery. Two options:
### Option A: Pure MEDIA: Tag (Minimal Change)
- No new tool needed
- Agent downloads file from sandbox to host using terminal (base64)
- Saves to known location (e.g., `~/.hermes/file_cache/`)
- Includes `MEDIA:/path` in response text
- Existing routing in `_process_message_background()` handles delivery
- **Problem:** Agent has to manually do base64 dance + know about MEDIA: convention
### Option B: Dedicated send_file Tool (Recommended)
- New tool that the agent calls with `(file_path, caption?)`
- Tool handles the sandbox → host extraction automatically
- Returns MEDIA: tag that gets routed through existing pipeline
- Much cleaner agent experience
## Implementation Plan for Option B
### Files to CREATE:
1. **`tools/send_file_tool.py`** — The new tool
- Accepts: `file_path` (path in sandbox), `caption` (optional)
- Detects environment backend from `_active_environments`
- Extracts file from sandbox:
- **local:** `shutil.copy()` or direct path
- **docker:** `docker cp {container_id}:{path} {local_cache}/`
- **ssh:** `scp -o ControlPath=... {user}@{host}:{path} {local_cache}/`
- **modal:** base64-over-terminal via `env.execute("base64 {path}")`
- Saves to `~/.hermes/file_cache/{uuid}_{filename}`
- Returns: `MEDIA:/cached/path` in response for gateway to pick up
- Register with `registry.register(name="send_file", toolset="file", ...)`
### Files to MODIFY:
2. **`gateway/platforms/telegram.py`** — Add missing send methods:
```python
async def send_document(self, chat_id, file_path, caption=None, file_name=None, reply_to=None):
with open(file_path, "rb") as f:
msg = await self._bot.send_document(
chat_id=int(chat_id), document=f,
caption=caption, filename=file_name or os.path.basename(file_path))
return SendResult(success=True, message_id=str(msg.message_id))
async def send_image_file(self, chat_id, image_path, caption=None, reply_to=None):
with open(image_path, "rb") as f:
msg = await self._bot.send_photo(chat_id=int(chat_id), photo=f, caption=caption)
return SendResult(success=True, message_id=str(msg.message_id))
async def send_video(self, chat_id, video_path, caption=None, reply_to=None):
with open(video_path, "rb") as f:
msg = await self._bot.send_video(chat_id=int(chat_id), video=f, caption=caption)
return SendResult(success=True, message_id=str(msg.message_id))
```
3. **`gateway/platforms/discord.py`** — Add missing send methods:
```python
async def send_document(self, chat_id, file_path, caption=None, file_name=None, reply_to=None):
channel = self._client.get_channel(int(chat_id)) or await self._client.fetch_channel(int(chat_id))
with open(file_path, "rb") as f:
file = discord.File(io.BytesIO(f.read()), filename=file_name or os.path.basename(file_path))
msg = await channel.send(content=caption, file=file)
return SendResult(success=True, message_id=str(msg.id))
async def send_image_file(self, chat_id, image_path, caption=None, reply_to=None):
# Same pattern as send_document with image filename
async def send_video(self, chat_id, video_path, caption=None, reply_to=None):
# Same pattern, discord renders video attachments inline
```
4. **`toolsets.py`** — Add `"send_file"` to `_HERMES_CORE_TOOLS` list
5. **`agent/prompt_builder.py`** — Update platform hints to mention send_file tool
### Code that can be REUSED (zero rewrite):
- `BasePlatformAdapter.extract_media()` — Already extracts MEDIA: tags
- `BasePlatformAdapter._process_message_background()` — Already routes by extension
- `ToolContext.download_file()` — Base64-over-terminal extraction pattern
- `tools/terminal_tool.py` _active_environments dict — Environment access
- `tools/registry.py` — Tool registration infrastructure
- `gateway/platforms/base.py` send_document/send_image_file/send_video signatures — Already defined
### Code that needs to be WRITTEN from scratch:
1. `tools/send_file_tool.py` (~150 lines):
- File extraction from each environment backend type
- Local file cache management
- Registry registration
2. Telegram `send_document` + `send_image_file` + `send_video` overrides (~40 lines)
3. Discord `send_document` + `send_image_file` + `send_video` overrides (~50 lines)
### Total effort: ~240 lines of new code, ~5 lines of config changes
## Key Environment-Specific Extract Strategies
| Backend | Extract Method | Speed | Complexity |
|------------|-------------------------------|----------|------------|
| local | shutil.copy / direct path | Instant | None |
| docker | `docker cp container:path .` | Fast | Low |
| docker+vol | Direct host path access | Instant | None |
| ssh | `scp -o ControlPath=...` | Fast | Low |
| modal | base64-over-terminal | Moderate | Medium |
| singularity| Direct path (overlay mount) | Fast | Low |
## Data Flow Summary
```
Agent calls send_file(file_path="/workspace/output.pdf", caption="Here's the report")
send_file_tool.py:
1. Get environment from _active_environments[task_id]
2. Detect backend type (docker/ssh/modal/local)
3. Extract file to ~/.hermes/file_cache/{uuid}_{filename}
4. Return: '{"success": true, "media_tag": "MEDIA:/home/user/.hermes/file_cache/abc123_output.pdf"}'
LLM includes MEDIA: tag in its response text
BasePlatformAdapter._process_message_background():
1. extract_media(response) → finds MEDIA:/path
2. Checks extension: .pdf → send_document()
3. Calls platform-specific send_document(chat_id, file_path, caption)
TelegramAdapter.send_document() / DiscordAdapter.send_document():
Opens file, sends via platform API as native document attachment
User receives downloadable file in chat
```

View File

@@ -1,643 +0,0 @@
"""
WebResearchEnv — RL Environment for Multi-Step Web Research
============================================================
Trains models to do accurate, efficient, multi-source web research.
Reward signals:
- Answer correctness (LLM judge, 0.01.0)
- Source diversity (used ≥2 distinct domains)
- Efficiency (penalizes excessive tool calls)
- Tool usage (bonus for actually using web tools)
Dataset: FRAMES benchmark (Google, 2024) — multi-hop factual questions
HuggingFace: google/frames-benchmark
Fallback: built-in sample questions (no HF token needed)
Usage:
# Phase 1 (OpenAI-compatible server)
python environments/web_research_env.py serve \\
--openai.base_url http://localhost:8000/v1 \\
--openai.model_name YourModel \\
--openai.server_type openai
# Process mode (offline data generation)
python environments/web_research_env.py process \\
--env.data_path_to_save_groups data/web_research.jsonl
# Standalone eval
python environments/web_research_env.py evaluate \\
--openai.base_url http://localhost:8000/v1 \\
--openai.model_name YourModel
Built by: github.com/jackx707
Inspired by: GroceryMind — production Hermes agent doing live web research
across German grocery stores (firecrawl + hermes-agent)
"""
from __future__ import annotations
import asyncio
import json
import logging
import os
import random
import re
import sys
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
from urllib.parse import urlparse
from pydantic import Field
# Ensure hermes-agent root is on path
_repo_root = Path(__file__).resolve().parent.parent
if str(_repo_root) not in sys.path:
sys.path.insert(0, str(_repo_root))
# ---------------------------------------------------------------------------
# Optional HuggingFace datasets import
# ---------------------------------------------------------------------------
try:
from datasets import load_dataset
HF_AVAILABLE = True
except ImportError:
HF_AVAILABLE = False
from atroposlib.envs.base import ScoredDataGroup
from atroposlib.envs.server_handling.server_manager import APIServerConfig
from atroposlib.type_definitions import Item
from environments.hermes_base_env import HermesAgentBaseEnv, HermesAgentEnvConfig
from environments.agent_loop import AgentResult
from environments.tool_context import ToolContext
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Fallback sample dataset (used when HuggingFace is unavailable)
# Multi-hop questions requiring real web search to answer.
# ---------------------------------------------------------------------------
SAMPLE_QUESTIONS = [
{
"question": "What is the current population of the capital city of the country that won the 2022 FIFA World Cup?",
"answer": "Buenos Aires has approximately 3 million people in the city proper, or around 15 million in the greater metro area.",
"difficulty": "medium",
"hops": 2,
},
{
"question": "Who is the CEO of the company that makes the most widely used open-source container orchestration platform?",
"answer": "The Linux Foundation oversees Kubernetes. CNCF (Cloud Native Computing Foundation) is the specific body — it does not have a traditional CEO but has an executive director.",
"difficulty": "medium",
"hops": 2,
},
{
"question": "What programming language was used to write the original version of the web framework used by Instagram?",
"answer": "Django, which Instagram was built on, is written in Python.",
"difficulty": "easy",
"hops": 2,
},
{
"question": "In what year was the university founded where the inventor of the World Wide Web currently holds a professorship?",
"answer": "Tim Berners-Lee holds a professorship at MIT (founded 1861) and the University of Southampton (founded 1952).",
"difficulty": "hard",
"hops": 3,
},
{
"question": "What is the latest stable version of the programming language that ranks #1 on the TIOBE index as of this year?",
"answer": "Python is currently #1 on TIOBE. The latest stable version should be verified via the official python.org site.",
"difficulty": "medium",
"hops": 2,
},
{
"question": "How many employees does the parent company of Instagram have?",
"answer": "Meta Platforms (parent of Instagram) employs approximately 70,000+ people as of recent reports.",
"difficulty": "medium",
"hops": 2,
},
{
"question": "What is the current interest rate set by the central bank of the country where the Eiffel Tower is located?",
"answer": "The European Central Bank sets rates for France/eurozone. The current rate should be verified — it has changed frequently in 2023-2025.",
"difficulty": "hard",
"hops": 2,
},
{
"question": "Which company acquired the startup founded by the creator of Oculus VR?",
"answer": "Palmer Luckey founded Oculus VR, which was acquired by Facebook (now Meta). He later founded Anduril Industries.",
"difficulty": "medium",
"hops": 2,
},
{
"question": "What is the market cap of the company that owns the most popular search engine in Russia?",
"answer": "Yandex (now split into separate entities after 2024 restructuring). Current market cap should be verified via financial sources.",
"difficulty": "hard",
"hops": 2,
},
{
"question": "What was the GDP growth rate of the country that hosted the most recent Summer Olympics?",
"answer": "Paris, France hosted the 2024 Summer Olympics. France's recent GDP growth should be verified via World Bank or IMF data.",
"difficulty": "hard",
"hops": 2,
},
]
# ---------------------------------------------------------------------------
# Configuration
# ---------------------------------------------------------------------------
class WebResearchEnvConfig(HermesAgentEnvConfig):
"""Configuration for the web research RL environment."""
# Reward weights
correctness_weight: float = Field(
default=0.6,
description="Weight for answer correctness in reward (LLM judge score).",
)
tool_usage_weight: float = Field(
default=0.2,
description="Weight for tool usage signal (did the model actually use web tools?).",
)
efficiency_weight: float = Field(
default=0.2,
description="Weight for efficiency signal (penalizes excessive tool calls).",
)
diversity_bonus: float = Field(
default=0.1,
description="Bonus reward for citing ≥2 distinct domains.",
)
# Efficiency thresholds
efficient_max_calls: int = Field(
default=5,
description="Maximum tool calls before efficiency penalty begins.",
)
heavy_penalty_calls: int = Field(
default=10,
description="Tool call count where efficiency penalty steepens.",
)
# Eval
eval_size: int = Field(
default=20,
description="Number of held-out items for evaluation.",
)
eval_split_ratio: float = Field(
default=0.1,
description="Fraction of dataset to hold out for evaluation (0.01.0).",
)
# Dataset
dataset_name: str = Field(
default="google/frames-benchmark",
description="HuggingFace dataset name for research questions.",
)
# ---------------------------------------------------------------------------
# Environment
# ---------------------------------------------------------------------------
class WebResearchEnv(HermesAgentBaseEnv):
"""
RL environment for training multi-step web research skills.
The model is given a factual question requiring 2-3 hops of web research
and must use web_search / web_extract tools to find and synthesize the answer.
Reward is multi-signal:
60% — answer correctness (LLM judge)
20% — tool usage (did the model actually search the web?)
20% — efficiency (penalizes >5 tool calls)
Bonus +0.1 for source diversity (≥2 distinct domains cited).
"""
name = "web-research"
env_config_cls = WebResearchEnvConfig
# Default toolsets for this environment — web + file for saving notes
default_toolsets = ["web", "file"]
@classmethod
def config_init(cls) -> Tuple[WebResearchEnvConfig, List[APIServerConfig]]:
"""Default configuration for the web research environment."""
env_config = WebResearchEnvConfig(
enabled_toolsets=["web", "file"],
max_agent_turns=15,
agent_temperature=1.0,
system_prompt=(
"You are a highly capable research agent. When asked a factual question, "
"always use web_search to find current, accurate information before answering. "
"Cite at least 2 sources. Be concise and accurate."
),
group_size=4,
total_steps=1000,
steps_per_eval=100,
use_wandb=True,
wandb_name="web-research",
)
server_configs = [
APIServerConfig(
base_url="https://openrouter.ai/api/v1",
model_name="anthropic/claude-sonnet-4.5",
server_type="openai",
api_key=os.getenv("OPENROUTER_API_KEY", ""),
health_check=False,
)
]
return env_config, server_configs
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self._items: list[dict] = []
self._eval_items: list[dict] = []
self._index: int = 0
# Metrics tracking for wandb
self._reward_buffer: list[float] = []
self._correctness_buffer: list[float] = []
self._tool_usage_buffer: list[float] = []
self._efficiency_buffer: list[float] = []
self._diversity_buffer: list[float] = []
# ------------------------------------------------------------------
# 1. Setup — load dataset
# ------------------------------------------------------------------
async def setup(self) -> None:
"""Load the FRAMES benchmark or fall back to built-in samples."""
if HF_AVAILABLE:
try:
logger.info("Loading FRAMES benchmark from HuggingFace...")
ds = load_dataset(self.config.dataset_name, split="test")
self._items = [
{
"question": row["Prompt"],
"answer": row["Answer"],
"difficulty": row.get("reasoning_types", "unknown"),
"hops": 2,
}
for row in ds
]
# Hold out for eval
eval_size = max(
self.config.eval_size,
int(len(self._items) * self.config.eval_split_ratio),
)
random.shuffle(self._items)
self._eval_items = self._items[:eval_size]
self._items = self._items[eval_size:]
logger.info(
f"Loaded {len(self._items)} train / {len(self._eval_items)} eval items "
f"from FRAMES benchmark."
)
return
except Exception as e:
logger.warning(f"Could not load FRAMES from HuggingFace: {e}. Using built-in samples.")
# Fallback
random.shuffle(SAMPLE_QUESTIONS)
split = max(1, len(SAMPLE_QUESTIONS) * 8 // 10)
self._items = SAMPLE_QUESTIONS[:split]
self._eval_items = SAMPLE_QUESTIONS[split:]
logger.info(
f"Using built-in sample dataset: {len(self._items)} train / "
f"{len(self._eval_items)} eval items."
)
# ------------------------------------------------------------------
# 2. get_next_item — return the next question
# ------------------------------------------------------------------
async def get_next_item(self) -> dict:
"""Return the next item, cycling through the dataset."""
if not self._items:
raise RuntimeError("Dataset is empty. Did you call setup()?")
item = self._items[self._index % len(self._items)]
self._index += 1
return item
# ------------------------------------------------------------------
# 3. format_prompt — build the user-facing prompt
# ------------------------------------------------------------------
def format_prompt(self, item: dict) -> str:
"""Format the research question as a task prompt."""
return (
f"Research the following question thoroughly using web search. "
f"You MUST search the web to find current, accurate information — "
f"do not rely solely on your training data.\n\n"
f"Question: {item['question']}\n\n"
f"Requirements:\n"
f"- Use web_search and/or web_extract tools to find information\n"
f"- Search at least 2 different sources\n"
f"- Provide a concise, accurate answer (2-4 sentences)\n"
f"- Cite the sources you used"
)
# ------------------------------------------------------------------
# 4. compute_reward — multi-signal scoring
# ------------------------------------------------------------------
async def compute_reward(
self,
item: dict,
result: AgentResult,
ctx: ToolContext,
) -> float:
"""
Multi-signal reward function:
correctness_weight * correctness — LLM judge comparing answer to ground truth
tool_usage_weight * tool_used — binary: did the model use web tools?
efficiency_weight * efficiency — penalizes wasteful tool usage
+ diversity_bonus — source diversity (≥2 distinct domains)
"""
final_response: str = result.final_response or ""
tools_used: list[str] = [
tc.tool_name for tc in (result.tool_calls or [])
] if hasattr(result, "tool_calls") and result.tool_calls else []
tool_call_count: int = result.turns_used or len(tools_used)
cfg = self.config
# ---- Signal 1: Answer correctness (LLM judge) ----------------
correctness = await self._llm_judge(
question=item["question"],
expected=item["answer"],
model_answer=final_response,
)
# ---- Signal 2: Web tool usage --------------------------------
web_tools = {"web_search", "web_extract", "search", "firecrawl"}
tool_used = 1.0 if any(t in web_tools for t in tools_used) else 0.0
# ---- Signal 3: Efficiency ------------------------------------
if tool_call_count <= cfg.efficient_max_calls:
efficiency = 1.0
elif tool_call_count <= cfg.heavy_penalty_calls:
efficiency = 1.0 - (tool_call_count - cfg.efficient_max_calls) * 0.08
else:
efficiency = max(0.0, 1.0 - (tool_call_count - cfg.efficient_max_calls) * 0.12)
# ---- Bonus: Source diversity ---------------------------------
domains = self._extract_domains(final_response)
diversity = cfg.diversity_bonus if len(domains) >= 2 else 0.0
# ---- Combine ------------------------------------------------
reward = (
cfg.correctness_weight * correctness
+ cfg.tool_usage_weight * tool_used
+ cfg.efficiency_weight * efficiency
+ diversity
)
reward = min(1.0, max(0.0, reward)) # clamp to [0, 1]
# Track for wandb
self._reward_buffer.append(reward)
self._correctness_buffer.append(correctness)
self._tool_usage_buffer.append(tool_used)
self._efficiency_buffer.append(efficiency)
self._diversity_buffer.append(diversity)
logger.debug(
f"Reward breakdown — correctness={correctness:.2f}, "
f"tool_used={tool_used:.1f}, efficiency={efficiency:.2f}, "
f"diversity={diversity:.1f} → total={reward:.3f}"
)
return reward
# ------------------------------------------------------------------
# 5. evaluate — run on held-out eval split
# ------------------------------------------------------------------
async def evaluate(self, *args, **kwargs) -> None:
"""Run evaluation on the held-out split using the agent loop."""
import time
items = self._eval_items
if not items:
logger.warning("No eval items available.")
return
eval_size = min(self.config.eval_size, len(items))
eval_items = items[:eval_size]
logger.info(f"Running eval on {len(eval_items)} questions...")
start_time = time.time()
samples = []
for item in eval_items:
try:
# Use the base env's agent loop for eval (same as training)
prompt = self.format_prompt(item)
completion = await self.server.chat_completion(
messages=[
{"role": "system", "content": self.config.system_prompt or ""},
{"role": "user", "content": prompt},
],
n=1,
max_tokens=self.config.max_token_length,
temperature=0.0,
split="eval",
)
response_content = (
completion.choices[0].message.content if completion.choices else ""
)
# Score the response
correctness = await self._llm_judge(
question=item["question"],
expected=item["answer"],
model_answer=response_content,
)
samples.append({
"prompt": item["question"],
"response": response_content,
"expected": item["answer"],
"correctness": correctness,
})
except Exception as e:
logger.error(f"Eval error on item: {e}")
samples.append({
"prompt": item["question"],
"response": f"ERROR: {e}",
"expected": item["answer"],
"correctness": 0.0,
})
end_time = time.time()
# Compute metrics
correctness_scores = [s["correctness"] for s in samples]
eval_metrics = {
"eval/mean_correctness": (
sum(correctness_scores) / len(correctness_scores)
if correctness_scores else 0.0
),
"eval/n_items": len(samples),
}
await self.evaluate_log(
metrics=eval_metrics,
samples=samples,
start_time=start_time,
end_time=end_time,
)
# ------------------------------------------------------------------
# 6. wandb_log — custom metrics
# ------------------------------------------------------------------
async def wandb_log(self, wandb_metrics: Optional[Dict] = None) -> None:
"""Log reward breakdown metrics to wandb."""
if wandb_metrics is None:
wandb_metrics = {}
if self._reward_buffer:
n = len(self._reward_buffer)
wandb_metrics["train/mean_reward"] = sum(self._reward_buffer) / n
wandb_metrics["train/mean_correctness"] = sum(self._correctness_buffer) / n
wandb_metrics["train/mean_tool_usage"] = sum(self._tool_usage_buffer) / n
wandb_metrics["train/mean_efficiency"] = sum(self._efficiency_buffer) / n
wandb_metrics["train/mean_diversity"] = sum(self._diversity_buffer) / n
wandb_metrics["train/total_rollouts"] = n
# Accuracy buckets
wandb_metrics["train/correct_rate"] = (
sum(1 for c in self._correctness_buffer if c >= 0.7) / n
)
wandb_metrics["train/tool_usage_rate"] = (
sum(1 for t in self._tool_usage_buffer if t > 0) / n
)
# Clear buffers
self._reward_buffer.clear()
self._correctness_buffer.clear()
self._tool_usage_buffer.clear()
self._efficiency_buffer.clear()
self._diversity_buffer.clear()
await super().wandb_log(wandb_metrics)
# ------------------------------------------------------------------
# Private helpers
# ------------------------------------------------------------------
async def _llm_judge(
self,
question: str,
expected: str,
model_answer: str,
) -> float:
"""
Use the server's LLM to judge answer correctness.
Falls back to keyword heuristic if LLM call fails.
"""
if not model_answer or not model_answer.strip():
return 0.0
judge_prompt = (
"You are an impartial judge evaluating the quality of an AI research answer.\n\n"
f"Question: {question}\n\n"
f"Reference answer: {expected}\n\n"
f"Model answer: {model_answer}\n\n"
"Score the model answer on a scale from 0.0 to 1.0 where:\n"
" 1.0 = fully correct and complete\n"
" 0.7 = mostly correct with minor gaps\n"
" 0.4 = partially correct\n"
" 0.1 = mentions relevant topic but wrong or very incomplete\n"
" 0.0 = completely wrong or no answer\n\n"
"Consider: factual accuracy, completeness, and relevance.\n"
'Respond with ONLY a JSON object: {"score": <float>, "reason": "<one sentence>"}'
)
try:
response = await self.server.chat_completion(
messages=[{"role": "user", "content": judge_prompt}],
n=1,
max_tokens=150,
temperature=0.0,
split="eval",
)
text = response.choices[0].message.content if response.choices else ""
parsed = self._parse_judge_json(text)
if parsed is not None:
return float(parsed)
except Exception as e:
logger.debug(f"LLM judge failed: {e}. Using heuristic.")
return self._heuristic_score(expected, model_answer)
@staticmethod
def _parse_judge_json(text: str) -> Optional[float]:
"""Extract the score float from LLM judge JSON response."""
try:
clean = re.sub(r"```(?:json)?|```", "", text).strip()
data = json.loads(clean)
score = float(data.get("score", -1))
if 0.0 <= score <= 1.0:
return score
except Exception:
match = re.search(r'"score"\s*:\s*([0-9.]+)', text)
if match:
score = float(match.group(1))
if 0.0 <= score <= 1.0:
return score
return None
@staticmethod
def _heuristic_score(expected: str, model_answer: str) -> float:
"""Lightweight keyword overlap score as fallback."""
stopwords = {
"the", "a", "an", "is", "are", "was", "were", "of", "in", "on",
"at", "to", "for", "with", "and", "or", "but", "it", "its",
"this", "that", "as", "by", "from", "be", "has", "have", "had",
}
def tokenize(text: str) -> set:
tokens = re.findall(r'\b\w+\b', text.lower())
return {t for t in tokens if t not in stopwords and len(t) > 2}
expected_tokens = tokenize(expected)
answer_tokens = tokenize(model_answer)
if not expected_tokens:
return 0.5
overlap = len(expected_tokens & answer_tokens)
union = len(expected_tokens | answer_tokens)
jaccard = overlap / union if union > 0 else 0.0
recall = overlap / len(expected_tokens)
return min(1.0, 0.4 * jaccard + 0.6 * recall)
@staticmethod
def _extract_domains(text: str) -> set:
"""Extract unique domains from URLs cited in the response."""
urls = re.findall(r'https?://[^\s\)>\]"\']+', text)
domains = set()
for url in urls:
try:
parsed = urlparse(url)
domain = parsed.netloc.lower().lstrip("www.")
if domain:
domains.add(domain)
except Exception:
pass
return domains
# ---------------------------------------------------------------------------
# Entry point
# ---------------------------------------------------------------------------
if __name__ == "__main__":
WebResearchEnv.cli()

View File

@@ -9,18 +9,19 @@ to various messaging platforms (Telegram, Discord, WhatsApp) with:
- Platform-specific toolsets (different capabilities per platform)
"""
from .config import GatewayConfig, HomeChannel, PlatformConfig, SessionResetPolicy, load_gateway_config
from .delivery import DeliveryRouter, DeliveryTarget
from .config import GatewayConfig, PlatformConfig, HomeChannel, load_gateway_config
from .session import (
SessionContext,
SessionStore,
SessionResetPolicy,
build_session_context_prompt,
)
from .delivery import DeliveryRouter, DeliveryTarget
__all__ = [
# Config
"GatewayConfig",
"PlatformConfig",
"PlatformConfig",
"HomeChannel",
"load_gateway_config",
# Session

View File

@@ -10,7 +10,7 @@ import json
import logging
from datetime import datetime
from pathlib import Path
from typing import Any
from typing import Any, Dict, List, Optional
logger = logging.getLogger(__name__)
@@ -21,8 +21,7 @@ DIRECTORY_PATH = Path.home() / ".hermes" / "channel_directory.json"
# Build / refresh
# ---------------------------------------------------------------------------
def build_channel_directory(adapters: dict[Any, Any]) -> dict[str, Any]:
def build_channel_directory(adapters: Dict[Any, Any]) -> Dict[str, Any]:
"""
Build a channel directory from connected platform adapters and session data.
@@ -30,7 +29,7 @@ def build_channel_directory(adapters: dict[Any, Any]) -> dict[str, Any]:
"""
from gateway.config import Platform
platforms: dict[str, list[dict[str, str]]] = {}
platforms: Dict[str, List[Dict[str, str]]] = {}
for platform, adapter in adapters.items():
try:
@@ -41,8 +40,8 @@ def build_channel_directory(adapters: dict[Any, Any]) -> dict[str, Any]:
except Exception as e:
logger.warning("Channel directory: failed to build %s: %s", platform.value, e)
# Telegram, WhatsApp & Signal can't enumerate chats -- pull from session history
for plat_name in ("telegram", "whatsapp", "signal"):
# Telegram & WhatsApp can't enumerate chats -- pull from session history
for plat_name in ("telegram", "whatsapp"):
if plat_name not in platforms:
platforms[plat_name] = _build_from_sessions(plat_name)
@@ -53,7 +52,7 @@ def build_channel_directory(adapters: dict[Any, Any]) -> dict[str, Any]:
try:
DIRECTORY_PATH.parent.mkdir(parents=True, exist_ok=True)
with open(DIRECTORY_PATH, "w", encoding="utf-8") as f:
with open(DIRECTORY_PATH, "w") as f:
json.dump(directory, f, indent=2, ensure_ascii=False)
except Exception as e:
logger.warning("Channel directory: failed to write: %s", e)
@@ -61,7 +60,7 @@ def build_channel_directory(adapters: dict[Any, Any]) -> dict[str, Any]:
return directory
def _build_discord(adapter) -> list[dict[str, str]]:
def _build_discord(adapter) -> List[Dict[str, str]]:
"""Enumerate all text channels the Discord bot can see."""
channels = []
client = getattr(adapter, "_client", None)
@@ -75,14 +74,12 @@ def _build_discord(adapter) -> list[dict[str, str]]:
for guild in client.guilds:
for ch in guild.text_channels:
channels.append(
{
"id": str(ch.id),
"name": ch.name,
"guild": guild.name,
"type": "channel",
}
)
channels.append({
"id": str(ch.id),
"name": ch.name,
"guild": guild.name,
"type": "channel",
})
# Also include DM-capable users we've interacted with is not
# feasible via guild enumeration; those come from sessions.
@@ -91,7 +88,7 @@ def _build_discord(adapter) -> list[dict[str, str]]:
return channels
def _build_slack(adapter) -> list[dict[str, str]]:
def _build_slack(adapter) -> List[Dict[str, str]]:
"""List Slack channels the bot has joined."""
channels = []
# Slack adapter may expose a web client
@@ -100,6 +97,7 @@ def _build_slack(adapter) -> list[dict[str, str]]:
return _build_from_sessions("slack")
try:
import asyncio
from tools.send_message_tool import _send_slack # noqa: F401
# Use the Slack Web API directly if available
except Exception:
@@ -109,7 +107,7 @@ def _build_slack(adapter) -> list[dict[str, str]]:
return _build_from_sessions("slack")
def _build_from_sessions(platform_name: str) -> list[dict[str, str]]:
def _build_from_sessions(platform_name: str) -> List[Dict[str, str]]:
"""Pull known channels/contacts from sessions.json origin data."""
sessions_path = Path.home() / ".hermes" / "sessions" / "sessions.json"
if not sessions_path.exists():
@@ -117,7 +115,7 @@ def _build_from_sessions(platform_name: str) -> list[dict[str, str]]:
entries = []
try:
with open(sessions_path, encoding="utf-8") as f:
with open(sessions_path) as f:
data = json.load(f)
seen_ids = set()
@@ -129,13 +127,11 @@ def _build_from_sessions(platform_name: str) -> list[dict[str, str]]:
if not chat_id or chat_id in seen_ids:
continue
seen_ids.add(chat_id)
entries.append(
{
"id": str(chat_id),
"name": origin.get("chat_name") or origin.get("user_name") or str(chat_id),
"type": session.get("chat_type", "dm"),
}
)
entries.append({
"id": str(chat_id),
"name": origin.get("chat_name") or origin.get("user_name") or str(chat_id),
"type": session.get("chat_type", "dm"),
})
except Exception as e:
logger.debug("Channel directory: failed to read sessions for %s: %s", platform_name, e)
@@ -146,19 +142,18 @@ def _build_from_sessions(platform_name: str) -> list[dict[str, str]]:
# Read / resolve
# ---------------------------------------------------------------------------
def load_directory() -> dict[str, Any]:
def load_directory() -> Dict[str, Any]:
"""Load the cached channel directory from disk."""
if not DIRECTORY_PATH.exists():
return {"updated_at": None, "platforms": {}}
try:
with open(DIRECTORY_PATH, encoding="utf-8") as f:
with open(DIRECTORY_PATH) as f:
return json.load(f)
except Exception:
return {"updated_at": None, "platforms": {}}
def resolve_channel_name(platform_name: str, name: str) -> str | None:
def resolve_channel_name(platform_name: str, name: str) -> Optional[str]:
"""
Resolve a human-friendly channel name to a numeric ID.
@@ -211,8 +206,8 @@ def format_directory_for_display() -> str:
# Group Discord channels by guild
if plat_name == "discord":
guilds: dict[str, list] = {}
dms: list = []
guilds: Dict[str, List] = {}
dms: List = []
for ch in channels:
guild = ch.get("guild")
if guild:

View File

@@ -8,26 +8,24 @@ Handles loading and validating configuration for:
- Delivery preferences
"""
import json
import logging
import os
from dataclasses import dataclass, field
from enum import Enum
import json
from pathlib import Path
from typing import Any
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Any
from enum import Enum
logger = logging.getLogger(__name__)
class Platform(Enum):
"""Supported messaging platforms."""
LOCAL = "local"
TELEGRAM = "telegram"
DISCORD = "discord"
WHATSAPP = "whatsapp"
SLACK = "slack"
SIGNAL = "signal"
HOMEASSISTANT = "homeassistant"
@@ -35,24 +33,23 @@ class Platform(Enum):
class HomeChannel:
"""
Default destination for a platform.
When a cron job specifies deliver="telegram" without a specific chat ID,
messages are sent to this home channel.
"""
platform: Platform
chat_id: str
name: str # Human-readable name for display
def to_dict(self) -> dict[str, Any]:
def to_dict(self) -> Dict[str, Any]:
return {
"platform": self.platform.value,
"chat_id": self.chat_id,
"name": self.name,
}
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "HomeChannel":
def from_dict(cls, data: Dict[str, Any]) -> "HomeChannel":
return cls(
platform=Platform(data["platform"]),
chat_id=str(data["chat_id"]),
@@ -64,27 +61,26 @@ class HomeChannel:
class SessionResetPolicy:
"""
Controls when sessions reset (lose context).
Modes:
- "daily": Reset at a specific hour each day
- "idle": Reset after N minutes of inactivity
- "both": Whichever triggers first (daily boundary OR idle timeout)
- "none": Never auto-reset (context managed only by compression)
"""
mode: str = "both" # "daily", "idle", "both", or "none"
at_hour: int = 4 # Hour for daily reset (0-23, local time)
idle_minutes: int = 1440 # Minutes of inactivity before reset (24 hours)
def to_dict(self) -> dict[str, Any]:
def to_dict(self) -> Dict[str, Any]:
return {
"mode": self.mode,
"at_hour": self.at_hour,
"idle_minutes": self.idle_minutes,
}
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "SessionResetPolicy":
def from_dict(cls, data: Dict[str, Any]) -> "SessionResetPolicy":
return cls(
mode=data.get("mode", "both"),
at_hour=data.get("at_hour", 4),
@@ -95,16 +91,15 @@ class SessionResetPolicy:
@dataclass
class PlatformConfig:
"""Configuration for a single messaging platform."""
enabled: bool = False
token: str | None = None # Bot token (Telegram, Discord)
api_key: str | None = None # API key if different from token
home_channel: HomeChannel | None = None
token: Optional[str] = None # Bot token (Telegram, Discord)
api_key: Optional[str] = None # API key if different from token
home_channel: Optional[HomeChannel] = None
# Platform-specific settings
extra: dict[str, Any] = field(default_factory=dict)
def to_dict(self) -> dict[str, Any]:
extra: Dict[str, Any] = field(default_factory=dict)
def to_dict(self) -> Dict[str, Any]:
result = {
"enabled": self.enabled,
"extra": self.extra,
@@ -116,13 +111,13 @@ class PlatformConfig:
if self.home_channel:
result["home_channel"] = self.home_channel.to_dict()
return result
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "PlatformConfig":
def from_dict(cls, data: Dict[str, Any]) -> "PlatformConfig":
home_channel = None
if "home_channel" in data:
home_channel = HomeChannel.from_dict(data["home_channel"])
return cls(
enabled=data.get("enabled", False),
token=data.get("token"),
@@ -136,80 +131,80 @@ class PlatformConfig:
class GatewayConfig:
"""
Main gateway configuration.
Manages all platform connections, session policies, and delivery settings.
"""
# Platform configurations
platforms: dict[Platform, PlatformConfig] = field(default_factory=dict)
platforms: Dict[Platform, PlatformConfig] = field(default_factory=dict)
# Session reset policies by type
default_reset_policy: SessionResetPolicy = field(default_factory=SessionResetPolicy)
reset_by_type: dict[str, SessionResetPolicy] = field(default_factory=dict)
reset_by_platform: dict[Platform, SessionResetPolicy] = field(default_factory=dict)
reset_by_type: Dict[str, SessionResetPolicy] = field(default_factory=dict)
reset_by_platform: Dict[Platform, SessionResetPolicy] = field(default_factory=dict)
# Reset trigger commands
reset_triggers: list[str] = field(default_factory=lambda: ["/new", "/reset"])
reset_triggers: List[str] = field(default_factory=lambda: ["/new", "/reset"])
# Storage paths
sessions_dir: Path = field(default_factory=lambda: Path.home() / ".hermes" / "sessions")
# Delivery settings
always_log_local: bool = True # Always save cron outputs to local files
def get_connected_platforms(self) -> list[Platform]:
def get_connected_platforms(self) -> List[Platform]:
"""Return list of platforms that are enabled and configured."""
connected = []
for platform, config in self.platforms.items():
if not config.enabled:
continue
# Platforms that use token/api_key auth
if (
config.token
or config.api_key
or platform == Platform.WHATSAPP
or platform == Platform.SIGNAL
and config.extra.get("http_url")
):
if config.enabled and (config.token or config.api_key):
connected.append(platform)
return connected
def get_home_channel(self, platform: Platform) -> HomeChannel | None:
def get_home_channel(self, platform: Platform) -> Optional[HomeChannel]:
"""Get the home channel for a platform."""
config = self.platforms.get(platform)
if config:
return config.home_channel
return None
def get_reset_policy(self, platform: Platform | None = None, session_type: str | None = None) -> SessionResetPolicy:
def get_reset_policy(
self,
platform: Optional[Platform] = None,
session_type: Optional[str] = None
) -> SessionResetPolicy:
"""
Get the appropriate reset policy for a session.
Priority: platform override > type override > default
"""
# Platform-specific override takes precedence
if platform and platform in self.reset_by_platform:
return self.reset_by_platform[platform]
# Type-specific override (dm, group, thread)
if session_type and session_type in self.reset_by_type:
return self.reset_by_type[session_type]
return self.default_reset_policy
def to_dict(self) -> dict[str, Any]:
def to_dict(self) -> Dict[str, Any]:
return {
"platforms": {p.value: c.to_dict() for p, c in self.platforms.items()},
"platforms": {
p.value: c.to_dict() for p, c in self.platforms.items()
},
"default_reset_policy": self.default_reset_policy.to_dict(),
"reset_by_type": {k: v.to_dict() for k, v in self.reset_by_type.items()},
"reset_by_platform": {p.value: v.to_dict() for p, v in self.reset_by_platform.items()},
"reset_by_type": {
k: v.to_dict() for k, v in self.reset_by_type.items()
},
"reset_by_platform": {
p.value: v.to_dict() for p, v in self.reset_by_platform.items()
},
"reset_triggers": self.reset_triggers,
"sessions_dir": str(self.sessions_dir),
"always_log_local": self.always_log_local,
}
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "GatewayConfig":
def from_dict(cls, data: Dict[str, Any]) -> "GatewayConfig":
platforms = {}
for platform_name, platform_data in data.get("platforms", {}).items():
try:
@@ -217,11 +212,11 @@ class GatewayConfig:
platforms[platform] = PlatformConfig.from_dict(platform_data)
except ValueError:
pass # Skip unknown platforms
reset_by_type = {}
for type_name, policy_data in data.get("reset_by_type", {}).items():
reset_by_type[type_name] = SessionResetPolicy.from_dict(policy_data)
reset_by_platform = {}
for platform_name, policy_data in data.get("reset_by_platform", {}).items():
try:
@@ -229,15 +224,15 @@ class GatewayConfig:
reset_by_platform[platform] = SessionResetPolicy.from_dict(policy_data)
except ValueError:
pass
default_policy = SessionResetPolicy()
if "default_reset_policy" in data:
default_policy = SessionResetPolicy.from_dict(data["default_reset_policy"])
sessions_dir = Path.home() / ".hermes" / "sessions"
if "sessions_dir" in data:
sessions_dir = Path(data["sessions_dir"])
return cls(
platforms=platforms,
default_reset_policy=default_policy,
@@ -252,7 +247,7 @@ class GatewayConfig:
def load_gateway_config() -> GatewayConfig:
"""
Load gateway configuration from multiple sources.
Priority (highest to lowest):
1. Environment variables
2. ~/.hermes/gateway.json
@@ -260,23 +255,22 @@ def load_gateway_config() -> GatewayConfig:
4. Defaults
"""
config = GatewayConfig()
# Try loading from ~/.hermes/gateway.json
gateway_config_path = Path.home() / ".hermes" / "gateway.json"
if gateway_config_path.exists():
try:
with open(gateway_config_path) as f:
with open(gateway_config_path, "r") as f:
data = json.load(f)
config = GatewayConfig.from_dict(data)
except Exception as e:
print(f"[gateway] Warning: Failed to load {gateway_config_path}: {e}")
# Bridge session_reset from config.yaml (the user-facing config file)
# into the gateway config. config.yaml takes precedence over gateway.json
# for session reset policy since that's where hermes setup writes it.
try:
import yaml
config_yaml_path = Path.home() / ".hermes" / "config.yaml"
if config_yaml_path.exists():
with open(config_yaml_path) as f:
@@ -289,12 +283,14 @@ def load_gateway_config() -> GatewayConfig:
# Override with environment variables
_apply_env_overrides(config)
# --- Validate loaded values ---
policy = config.default_reset_policy
if not (0 <= policy.at_hour <= 23):
logger.warning("Invalid at_hour=%s (must be 0-23). Using default 4.", policy.at_hour)
logger.warning(
"Invalid at_hour=%s (must be 0-23). Using default 4.", policy.at_hour
)
policy.at_hour = 4
if policy.idle_minutes is None or policy.idle_minutes <= 0:
@@ -317,9 +313,9 @@ def load_gateway_config() -> GatewayConfig:
env_name = _token_env_names.get(platform)
if env_name and pconfig.token is not None and not pconfig.token.strip():
logger.warning(
"%s is enabled but %s is empty. The adapter will likely fail to connect.",
platform.value,
env_name,
"%s is enabled but %s is empty. "
"The adapter will likely fail to connect.",
platform.value, env_name,
)
return config
@@ -327,7 +323,7 @@ def load_gateway_config() -> GatewayConfig:
def _apply_env_overrides(config: GatewayConfig) -> None:
"""Apply environment variable overrides to config."""
# Telegram
telegram_token = os.getenv("TELEGRAM_BOT_TOKEN")
if telegram_token:
@@ -335,7 +331,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.platforms[Platform.TELEGRAM] = PlatformConfig()
config.platforms[Platform.TELEGRAM].enabled = True
config.platforms[Platform.TELEGRAM].token = telegram_token
telegram_home = os.getenv("TELEGRAM_HOME_CHANNEL")
if telegram_home and Platform.TELEGRAM in config.platforms:
config.platforms[Platform.TELEGRAM].home_channel = HomeChannel(
@@ -343,7 +339,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
chat_id=telegram_home,
name=os.getenv("TELEGRAM_HOME_CHANNEL_NAME", "Home"),
)
# Discord
discord_token = os.getenv("DISCORD_BOT_TOKEN")
if discord_token:
@@ -351,7 +347,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.platforms[Platform.DISCORD] = PlatformConfig()
config.platforms[Platform.DISCORD].enabled = True
config.platforms[Platform.DISCORD].token = discord_token
discord_home = os.getenv("DISCORD_HOME_CHANNEL")
if discord_home and Platform.DISCORD in config.platforms:
config.platforms[Platform.DISCORD].home_channel = HomeChannel(
@@ -359,14 +355,14 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
chat_id=discord_home,
name=os.getenv("DISCORD_HOME_CHANNEL_NAME", "Home"),
)
# WhatsApp (typically uses different auth mechanism)
whatsapp_enabled = os.getenv("WHATSAPP_ENABLED", "").lower() in ("true", "1", "yes")
if whatsapp_enabled:
if Platform.WHATSAPP not in config.platforms:
config.platforms[Platform.WHATSAPP] = PlatformConfig()
config.platforms[Platform.WHATSAPP].enabled = True
# Slack
slack_token = os.getenv("SLACK_BOT_TOKEN")
if slack_token:
@@ -382,29 +378,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
chat_id=slack_home,
name=os.getenv("SLACK_HOME_CHANNEL_NAME", ""),
)
# Signal
signal_url = os.getenv("SIGNAL_HTTP_URL")
signal_account = os.getenv("SIGNAL_ACCOUNT")
if signal_url and signal_account:
if Platform.SIGNAL not in config.platforms:
config.platforms[Platform.SIGNAL] = PlatformConfig()
config.platforms[Platform.SIGNAL].enabled = True
config.platforms[Platform.SIGNAL].extra.update(
{
"http_url": signal_url,
"account": signal_account,
"ignore_stories": os.getenv("SIGNAL_IGNORE_STORIES", "true").lower() in ("true", "1", "yes"),
}
)
signal_home = os.getenv("SIGNAL_HOME_CHANNEL")
if signal_home:
config.platforms[Platform.SIGNAL].home_channel = HomeChannel(
platform=Platform.SIGNAL,
chat_id=signal_home,
name=os.getenv("SIGNAL_HOME_CHANNEL_NAME", "Home"),
)
# Home Assistant
hass_token = os.getenv("HASS_TOKEN")
if hass_token:
@@ -423,7 +397,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.default_reset_policy.idle_minutes = int(idle_minutes)
except ValueError:
pass
reset_hour = os.getenv("SESSION_RESET_HOUR")
if reset_hour:
try:
@@ -436,6 +410,6 @@ def save_gateway_config(config: GatewayConfig) -> None:
"""Save gateway configuration to ~/.hermes/gateway.json."""
gateway_config_path = Path.home() / ".hermes" / "gateway.json"
gateway_config_path.parent.mkdir(parents=True, exist_ok=True)
with open(gateway_config_path, "w") as f:
json.dump(config.to_dict(), f, indent=2)

View File

@@ -9,17 +9,18 @@ Routes messages to the appropriate destination based on:
"""
import logging
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path
from typing import Any
from datetime import datetime
from dataclasses import dataclass
from typing import Dict, List, Optional, Any, Union
from enum import Enum
logger = logging.getLogger(__name__)
MAX_PLATFORM_OUTPUT = 4000
TRUNCATED_VISIBLE = 3800
from .config import GatewayConfig, Platform
from .config import Platform, GatewayConfig
from .session import SessionSource
@@ -27,24 +28,23 @@ from .session import SessionSource
class DeliveryTarget:
"""
A single delivery target.
Represents where a message should be sent:
- "origin" → back to source
- "local" → save to local files
- "telegram" → Telegram home channel
- "telegram:123456" → specific Telegram chat
"""
platform: Platform
chat_id: str | None = None # None means use home channel
chat_id: Optional[str] = None # None means use home channel
is_origin: bool = False
is_explicit: bool = False # True if chat_id was explicitly specified
@classmethod
def parse(cls, target: str, origin: SessionSource | None = None) -> "DeliveryTarget":
def parse(cls, target: str, origin: Optional[SessionSource] = None) -> "DeliveryTarget":
"""
Parse a delivery target string.
Formats:
- "origin" → back to source
- "local" → local files only
@@ -52,7 +52,7 @@ class DeliveryTarget:
- "telegram:123456" → specific Telegram chat
"""
target = target.strip().lower()
if target == "origin":
if origin:
return cls(
@@ -63,10 +63,10 @@ class DeliveryTarget:
else:
# Fallback to local if no origin
return cls(platform=Platform.LOCAL, is_origin=True)
if target == "local":
return cls(platform=Platform.LOCAL)
# Check for platform:chat_id format
if ":" in target:
platform_str, chat_id = target.split(":", 1)
@@ -76,7 +76,7 @@ class DeliveryTarget:
except ValueError:
# Unknown platform, treat as local
return cls(platform=Platform.LOCAL)
# Just a platform name (use home channel)
try:
platform = Platform(target)
@@ -84,7 +84,7 @@ class DeliveryTarget:
except ValueError:
# Unknown platform, treat as local
return cls(platform=Platform.LOCAL)
def to_string(self) -> str:
"""Convert back to string format."""
if self.is_origin:
@@ -99,15 +99,15 @@ class DeliveryTarget:
class DeliveryRouter:
"""
Routes messages to appropriate destinations.
Handles the logic of resolving delivery targets and dispatching
messages to the right platform adapters.
"""
def __init__(self, config: GatewayConfig, adapters: dict[Platform, Any] = None):
def __init__(self, config: GatewayConfig, adapters: Dict[Platform, Any] = None):
"""
Initialize the delivery router.
Args:
config: Gateway configuration
adapters: Dict mapping platforms to their adapter instances
@@ -115,27 +115,31 @@ class DeliveryRouter:
self.config = config
self.adapters = adapters or {}
self.output_dir = Path.home() / ".hermes" / "cron" / "output"
def resolve_targets(self, deliver: str | list[str], origin: SessionSource | None = None) -> list[DeliveryTarget]:
def resolve_targets(
self,
deliver: Union[str, List[str]],
origin: Optional[SessionSource] = None
) -> List[DeliveryTarget]:
"""
Resolve delivery specification to concrete targets.
Args:
deliver: Delivery spec - "origin", "telegram", ["local", "discord"], etc.
origin: The source where the request originated (for "origin" target)
Returns:
List of resolved delivery targets
"""
if isinstance(deliver, str):
deliver = [deliver]
targets = []
seen_platforms = set()
for target_str in deliver:
target = DeliveryTarget.parse(target_str, origin)
# Resolve home channel if needed
if target.chat_id is None and target.platform != Platform.LOCAL:
home = self.config.get_home_channel(target.platform)
@@ -144,96 +148,109 @@ class DeliveryRouter:
else:
# No home channel configured, skip this platform
continue
# Deduplicate
key = (target.platform, target.chat_id)
if key not in seen_platforms:
seen_platforms.add(key)
targets.append(target)
# Always include local if configured
if self.config.always_log_local:
local_key = (Platform.LOCAL, None)
if local_key not in seen_platforms:
targets.append(DeliveryTarget(platform=Platform.LOCAL))
return targets
async def deliver(
self,
content: str,
targets: list[DeliveryTarget],
job_id: str | None = None,
job_name: str | None = None,
metadata: dict[str, Any] | None = None,
) -> dict[str, Any]:
targets: List[DeliveryTarget],
job_id: Optional[str] = None,
job_name: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
"""
Deliver content to all specified targets.
Args:
content: The message/output to deliver
targets: List of delivery targets
job_id: Optional job ID (for cron jobs)
job_name: Optional job name
metadata: Additional metadata to include
Returns:
Dict with delivery results per target
"""
results = {}
for target in targets:
try:
if target.platform == Platform.LOCAL:
result = self._deliver_local(content, job_id, job_name, metadata)
else:
result = await self._deliver_to_platform(target, content, metadata)
results[target.to_string()] = {"success": True, "result": result}
results[target.to_string()] = {
"success": True,
"result": result
}
except Exception as e:
results[target.to_string()] = {"success": False, "error": str(e)}
results[target.to_string()] = {
"success": False,
"error": str(e)
}
return results
def _deliver_local(
self, content: str, job_id: str | None, job_name: str | None, metadata: dict[str, Any] | None
) -> dict[str, Any]:
self,
content: str,
job_id: Optional[str],
job_name: Optional[str],
metadata: Optional[Dict[str, Any]]
) -> Dict[str, Any]:
"""Save content to local files."""
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
if job_id:
output_path = self.output_dir / job_id / f"{timestamp}.md"
else:
output_path = self.output_dir / "misc" / f"{timestamp}.md"
output_path.parent.mkdir(parents=True, exist_ok=True)
# Build the output document
lines = []
if job_name:
lines.append(f"# {job_name}")
else:
lines.append("# Delivery Output")
lines.append("")
lines.append(f"**Timestamp:** {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
if job_id:
lines.append(f"**Job ID:** {job_id}")
if metadata:
for key, value in metadata.items():
lines.append(f"**{key}:** {value}")
lines.append("")
lines.append("---")
lines.append("")
lines.append(content)
output_path.write_text("\n".join(lines))
return {"path": str(output_path), "timestamp": timestamp}
return {
"path": str(output_path),
"timestamp": timestamp
}
def _save_full_output(self, content: str, job_id: str) -> Path:
"""Save full cron output to disk and return the file path."""
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
@@ -244,33 +261,41 @@ class DeliveryRouter:
return path
async def _deliver_to_platform(
self, target: DeliveryTarget, content: str, metadata: dict[str, Any] | None
) -> dict[str, Any]:
self,
target: DeliveryTarget,
content: str,
metadata: Optional[Dict[str, Any]]
) -> Dict[str, Any]:
"""Deliver content to a messaging platform."""
adapter = self.adapters.get(target.platform)
if not adapter:
raise ValueError(f"No adapter configured for {target.platform.value}")
if not target.chat_id:
raise ValueError(f"No chat ID for {target.platform.value} delivery")
# Guard: truncate oversized cron output to stay within platform limits
if len(content) > MAX_PLATFORM_OUTPUT:
job_id = (metadata or {}).get("job_id", "unknown")
saved_path = self._save_full_output(content, job_id)
logger.info("Cron output truncated (%d chars) — full output: %s", len(content), saved_path)
content = content[:TRUNCATED_VISIBLE] + f"\n\n... [truncated, full output saved to {saved_path}]"
content = (
content[:TRUNCATED_VISIBLE]
+ f"\n\n... [truncated, full output saved to {saved_path}]"
)
return await adapter.send(target.chat_id, content, metadata=metadata)
def parse_deliver_spec(
deliver: str | list[str] | None, origin: SessionSource | None = None, default: str = "origin"
) -> str | list[str]:
deliver: Optional[Union[str, List[str]]],
origin: Optional[SessionSource] = None,
default: str = "origin"
) -> Union[str, List[str]]:
"""
Normalize a delivery specification.
If None or empty, returns the default.
"""
if not deliver:
@@ -278,14 +303,17 @@ def parse_deliver_spec(
return deliver
def build_delivery_context_for_tool(config: GatewayConfig, origin: SessionSource | None = None) -> dict[str, Any]:
def build_delivery_context_for_tool(
config: GatewayConfig,
origin: Optional[SessionSource] = None
) -> Dict[str, Any]:
"""
Build context for the schedule_cronjob tool to understand delivery options.
This is passed to the tool so it can validate and explain delivery targets.
"""
connected = config.get_connected_platforms()
options = {
"origin": {
"description": "Back to where this job was created",
@@ -294,9 +322,9 @@ def build_delivery_context_for_tool(config: GatewayConfig, origin: SessionSource
"local": {
"description": "Save to local files only",
"available": True,
},
}
}
for platform in connected:
home = config.get_home_channel(platform)
options[platform.value] = {
@@ -304,7 +332,7 @@ def build_delivery_context_for_tool(config: GatewayConfig, origin: SessionSource
"available": True,
"home_channel": home.to_dict() if home else None,
}
return {
"origin": origin.to_dict() if origin else None,
"options": options,

View File

@@ -21,12 +21,12 @@ Errors in hooks are caught and logged but never block the main pipeline.
import asyncio
import importlib.util
import os
from collections.abc import Callable
from pathlib import Path
from typing import Any
from typing import Any, Callable, Dict, List, Optional
import yaml
HOOKS_DIR = Path(os.path.expanduser("~/.hermes/hooks"))
@@ -42,11 +42,11 @@ class HookRegistry:
def __init__(self):
# event_type -> [handler_fn, ...]
self._handlers: dict[str, list[Callable]] = {}
self._loaded_hooks: list[dict] = [] # metadata for listing
self._handlers: Dict[str, List[Callable]] = {}
self._loaded_hooks: List[dict] = [] # metadata for listing
@property
def loaded_hooks(self) -> list[dict]:
def loaded_hooks(self) -> List[dict]:
"""Return metadata about all loaded hooks."""
return list(self._loaded_hooks)
@@ -84,7 +84,9 @@ class HookRegistry:
continue
# Dynamically load the handler module
spec = importlib.util.spec_from_file_location(f"hermes_hook_{hook_name}", handler_path)
spec = importlib.util.spec_from_file_location(
f"hermes_hook_{hook_name}", handler_path
)
if spec is None or spec.loader is None:
print(f"[hooks] Skipping {hook_name}: could not load handler.py", flush=True)
continue
@@ -101,21 +103,19 @@ class HookRegistry:
for event in events:
self._handlers.setdefault(event, []).append(handle_fn)
self._loaded_hooks.append(
{
"name": hook_name,
"description": manifest.get("description", ""),
"events": events,
"path": str(hook_dir),
}
)
self._loaded_hooks.append({
"name": hook_name,
"description": manifest.get("description", ""),
"events": events,
"path": str(hook_dir),
})
print(f"[hooks] Loaded hook '{hook_name}' for events: {events}", flush=True)
except Exception as e:
print(f"[hooks] Error loading hook {hook_dir.name}: {e}", flush=True)
async def emit(self, event_type: str, context: dict[str, Any] | None = None) -> None:
async def emit(self, event_type: str, context: Optional[Dict[str, Any]] = None) -> None:
"""
Fire all handlers registered for an event.

View File

@@ -13,6 +13,7 @@ import json
import logging
from datetime import datetime
from pathlib import Path
from typing import Optional
logger = logging.getLogger(__name__)
@@ -60,7 +61,7 @@ def mirror_to_session(
return False
def _find_session_id(platform: str, chat_id: str) -> str | None:
def _find_session_id(platform: str, chat_id: str) -> Optional[str]:
"""
Find the active session_id for a platform + chat_id pair.
@@ -72,7 +73,7 @@ def _find_session_id(platform: str, chat_id: str) -> str | None:
return None
try:
with open(_SESSIONS_INDEX, encoding="utf-8") as f:
with open(_SESSIONS_INDEX) as f:
data = json.load(f)
except Exception:
return None
@@ -102,7 +103,7 @@ def _append_to_jsonl(session_id: str, message: dict) -> None:
"""Append a message to the JSONL transcript file."""
transcript_path = _SESSIONS_DIR / f"{session_id}.jsonl"
try:
with open(transcript_path, "a", encoding="utf-8") as f:
with open(transcript_path, "a") as f:
f.write(json.dumps(message, ensure_ascii=False) + "\n")
except Exception as e:
logger.debug("Mirror JSONL write failed: %s", e)
@@ -112,7 +113,6 @@ def _append_to_sqlite(session_id: str, message: dict) -> None:
"""Append a message to the SQLite session database."""
try:
from hermes_state import SessionDB
db = SessionDB()
db.append_message(
session_id=session_id,

View File

@@ -23,19 +23,21 @@ import os
import secrets
import time
from pathlib import Path
from typing import Optional
# Unambiguous alphabet -- excludes 0/O, 1/I to prevent confusion
ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"
CODE_LENGTH = 8
# Timing constants
CODE_TTL_SECONDS = 3600 # Codes expire after 1 hour
RATE_LIMIT_SECONDS = 600 # 1 request per user per 10 minutes
LOCKOUT_SECONDS = 3600 # Lockout duration after too many failures
CODE_TTL_SECONDS = 3600 # Codes expire after 1 hour
RATE_LIMIT_SECONDS = 600 # 1 request per user per 10 minutes
LOCKOUT_SECONDS = 3600 # Lockout duration after too many failures
# Limits
MAX_PENDING_PER_PLATFORM = 3 # Max pending codes per platform
MAX_FAILED_ATTEMPTS = 5 # Failed approvals before lockout
MAX_PENDING_PER_PLATFORM = 3 # Max pending codes per platform
MAX_FAILED_ATTEMPTS = 5 # Failed approvals before lockout
PAIRING_DIR = Path(os.path.expanduser("~/.hermes/pairing"))
@@ -121,7 +123,9 @@ class PairingStore:
# ----- Pending codes -----
def generate_code(self, platform: str, user_id: str, user_name: str = "") -> str | None:
def generate_code(
self, platform: str, user_id: str, user_name: str = ""
) -> Optional[str]:
"""
Generate a pairing code for a new user.
@@ -161,7 +165,7 @@ class PairingStore:
return code
def approve_code(self, platform: str, code: str) -> dict | None:
def approve_code(self, platform: str, code: str) -> Optional[dict]:
"""
Approve a pairing code. Adds the user to the approved list.
@@ -195,15 +199,13 @@ class PairingStore:
pending = self._load_json(self._pending_path(p))
for code, info in pending.items():
age_min = int((time.time() - info["created_at"]) / 60)
results.append(
{
"platform": p,
"code": code,
"user_id": info["user_id"],
"user_name": info.get("user_name", ""),
"age_minutes": age_min,
}
)
results.append({
"platform": p,
"code": code,
"user_id": info["user_id"],
"user_name": info.get("user_name", ""),
"age_minutes": age_min,
})
return results
def clear_pending(self, platform: str = None) -> int:
@@ -249,11 +251,8 @@ class PairingStore:
lockout_key = f"_lockout:{platform}"
limits[lockout_key] = time.time() + LOCKOUT_SECONDS
limits[fail_key] = 0 # Reset counter
print(
f"[pairing] Platform {platform} locked out for {LOCKOUT_SECONDS}s "
f"after {MAX_FAILED_ATTEMPTS} failed attempts",
flush=True,
)
print(f"[pairing] Platform {platform} locked out for {LOCKOUT_SECONDS}s "
f"after {MAX_FAILED_ATTEMPTS} failed attempts", flush=True)
self._save_json(self._rate_limit_path(), limits)
# ----- Cleanup -----
@@ -263,7 +262,10 @@ class PairingStore:
path = self._pending_path(platform)
pending = self._load_json(path)
now = time.time()
expired = [code for code, info in pending.items() if (now - info["created_at"]) > CODE_TTL_SECONDS]
expired = [
code for code, info in pending.items()
if (now - info["created_at"]) > CODE_TTL_SECONDS
]
if expired:
for code in expired:
del pending[code]

View File

@@ -1,313 +0,0 @@
# Adding a New Messaging Platform
Checklist for integrating a new messaging platform into the Hermes gateway.
Use this as a reference when building a new adapter — every item here is a
real integration point that exists in the codebase. Missing any of them will
cause broken functionality, missing features, or inconsistent behavior.
---
## 1. Core Adapter (`gateway/platforms/<platform>.py`)
The adapter is a subclass of `BasePlatformAdapter` from `gateway/platforms/base.py`.
### Required methods
| Method | Purpose |
|--------|---------|
| `__init__(self, config)` | Parse config, init state. Call `super().__init__(config, Platform.YOUR_PLATFORM)` |
| `connect() -> bool` | Connect to the platform, start listeners. Return True on success |
| `disconnect()` | Stop listeners, close connections, cancel tasks |
| `send(chat_id, text, ...) -> SendResult` | Send a text message |
| `send_typing(chat_id)` | Send typing indicator |
| `send_image(chat_id, image_url, caption) -> SendResult` | Send an image |
| `get_chat_info(chat_id) -> dict` | Return `{name, type, chat_id}` for a chat |
### Optional methods (have default stubs in base)
| Method | Purpose |
|--------|---------|
| `send_document(chat_id, path, caption)` | Send a file attachment |
| `send_voice(chat_id, path)` | Send a voice message |
| `send_video(chat_id, path, caption)` | Send a video |
| `send_animation(chat_id, path, caption)` | Send a GIF/animation |
| `send_image_file(chat_id, path, caption)` | Send image from local file |
### Required function
```python
def check_<platform>_requirements() -> bool:
"""Check if this platform's dependencies are available."""
```
### Key patterns to follow
- Use `self.build_source(...)` to construct `SessionSource` objects
- Call `self.handle_message(event)` to dispatch inbound messages to the gateway
- Use `MessageEvent`, `MessageType`, `SendResult` from base
- Use `cache_image_from_bytes`, `cache_audio_from_bytes`, `cache_document_from_bytes` for attachments
- Filter self-messages (prevent reply loops)
- Filter sync/echo messages if the platform has them
- Redact sensitive identifiers (phone numbers, tokens) in all log output
- Implement reconnection with exponential backoff + jitter for streaming connections
- Set `MAX_MESSAGE_LENGTH` if the platform has message size limits
---
## 2. Platform Enum (`gateway/config.py`)
Add the platform to the `Platform` enum:
```python
class Platform(Enum):
...
YOUR_PLATFORM = "your_platform"
```
Add env var loading in `_apply_env_overrides()`:
```python
# Your Platform
your_token = os.getenv("YOUR_PLATFORM_TOKEN")
if your_token:
if Platform.YOUR_PLATFORM not in config.platforms:
config.platforms[Platform.YOUR_PLATFORM] = PlatformConfig()
config.platforms[Platform.YOUR_PLATFORM].enabled = True
config.platforms[Platform.YOUR_PLATFORM].token = your_token
```
Update `get_connected_platforms()` if your platform doesn't use token/api_key
(e.g., WhatsApp uses `enabled` flag, Signal uses `extra` dict).
---
## 3. Adapter Factory (`gateway/run.py`)
Add to `_create_adapter()`:
```python
elif platform == Platform.YOUR_PLATFORM:
from gateway.platforms.your_platform import YourAdapter, check_your_requirements
if not check_your_requirements():
logger.warning("Your Platform: dependencies not met")
return None
return YourAdapter(config)
```
---
## 4. Authorization Maps (`gateway/run.py`)
Add to BOTH dicts in `_is_user_authorized()`:
```python
platform_env_map = {
...
Platform.YOUR_PLATFORM: "YOUR_PLATFORM_ALLOWED_USERS",
}
platform_allow_all_map = {
...
Platform.YOUR_PLATFORM: "YOUR_PLATFORM_ALLOW_ALL_USERS",
}
```
---
## 5. Session Source (`gateway/session.py`)
If your platform needs extra identity fields (e.g., Signal's UUID alongside
phone number), add them to the `SessionSource` dataclass with `Optional` defaults,
and update `to_dict()`, `from_dict()`, and `build_source()` in base.py.
---
## 6. System Prompt Hints (`agent/prompt_builder.py`)
Add a `PLATFORM_HINTS` entry so the agent knows what platform it's on:
```python
PLATFORM_HINTS = {
...
"your_platform": (
"You are on Your Platform. "
"Describe formatting capabilities, media support, etc."
),
}
```
Without this, the agent won't know it's on your platform and may use
inappropriate formatting (e.g., markdown on platforms that don't render it).
---
## 7. Toolset (`toolsets.py`)
Add a named toolset for your platform:
```python
"hermes-your-platform": {
"description": "Your Platform bot toolset",
"tools": _HERMES_CORE_TOOLS,
"includes": []
},
```
And add it to the `hermes-gateway` composite:
```python
"hermes-gateway": {
"includes": [..., "hermes-your-platform"]
}
```
---
## 8. Cron Delivery (`cron/scheduler.py`)
Add to `platform_map` in `_deliver_result()`:
```python
platform_map = {
...
"your_platform": Platform.YOUR_PLATFORM,
}
```
Without this, `schedule_cronjob(deliver="your_platform")` silently fails.
---
## 9. Send Message Tool (`tools/send_message_tool.py`)
Add to `platform_map` in `send_message_tool()`:
```python
platform_map = {
...
"your_platform": Platform.YOUR_PLATFORM,
}
```
Add routing in `_send_to_platform()`:
```python
elif platform == Platform.YOUR_PLATFORM:
return await _send_your_platform(pconfig, chat_id, message)
```
Implement `_send_your_platform()` — a standalone async function that sends
a single message without requiring the full adapter (for use by cron jobs
and the send_message tool outside the gateway process).
Update the tool schema `target` description to include your platform example.
---
## 10. Cronjob Tool Schema (`tools/cronjob_tools.py`)
Update the `deliver` parameter description and docstring to mention your
platform as a delivery option.
---
## 11. Channel Directory (`gateway/channel_directory.py`)
If your platform can't enumerate chats (most can't), add it to the
session-based discovery list:
```python
for plat_name in ("telegram", "whatsapp", "signal", "your_platform"):
```
---
## 12. Status Display (`hermes_cli/status.py`)
Add to the `platforms` dict in the Messaging Platforms section:
```python
platforms = {
...
"Your Platform": ("YOUR_PLATFORM_TOKEN", "YOUR_PLATFORM_HOME_CHANNEL"),
}
```
---
## 13. Gateway Setup Wizard (`hermes_cli/gateway.py`)
Add to the `_PLATFORMS` list:
```python
{
"key": "your_platform",
"label": "Your Platform",
"emoji": "📱",
"token_var": "YOUR_PLATFORM_TOKEN",
"setup_instructions": [...],
"vars": [...],
}
```
If your platform needs custom setup logic (connectivity testing, QR codes,
policy choices), add a `_setup_your_platform()` function and route to it
in the platform selection switch.
Update `_platform_status()` if your platform's "configured" check differs
from the standard `bool(get_env_value(token_var))`.
---
## 14. Phone/ID Redaction (`agent/redact.py`)
If your platform uses sensitive identifiers (phone numbers, etc.), add a
regex pattern and redaction function to `agent/redact.py`. This ensures
identifiers are masked in ALL log output, not just your adapter's logs.
---
## 15. Documentation
| File | What to update |
|------|---------------|
| `README.md` | Platform list in feature table + documentation table |
| `AGENTS.md` | Gateway description + env var config section |
| `website/docs/user-guide/messaging/<platform>.md` | **NEW** — Full setup guide (see existing platform docs for template) |
| `website/docs/user-guide/messaging/index.md` | Architecture diagram, toolset table, security examples, Next Steps links |
| `website/docs/reference/environment-variables.md` | All env vars for the platform |
---
## 16. Tests (`tests/gateway/test_<platform>.py`)
Recommended test coverage:
- Platform enum exists with correct value
- Config loading from env vars via `_apply_env_overrides`
- Adapter init (config parsing, allowlist handling, default values)
- Helper functions (redaction, parsing, file type detection)
- Session source round-trip (to_dict → from_dict)
- Authorization integration (platform in allowlist maps)
- Send message tool routing (platform in platform_map)
Optional but valuable:
- Async tests for message handling flow (mock the platform API)
- SSE/WebSocket reconnection logic
- Attachment processing
- Group message filtering
---
## Quick Verification
After implementing everything, verify with:
```bash
# All checks pass (lint + test)
make check
# Grep for your platform name to find any missed integration points
grep -r "telegram\|discord\|whatsapp\|slack" gateway/ tools/ agent/ cron/ hermes_cli/ toolsets.py \
--include="*.py" -l | sort -u
# Check each file in the output — if it mentions other platforms but not yours, you missed it
```

View File

@@ -13,20 +13,20 @@ import uuid
from abc import ABC, abstractmethod
logger = logging.getLogger(__name__)
import sys
from collections.abc import Awaitable, Callable
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from pathlib import Path
from pathlib import Path as _Path
from typing import Any
from typing import Dict, List, Optional, Any, Callable, Awaitable, Tuple
from enum import Enum
import sys
from pathlib import Path as _Path
sys.path.insert(0, str(_Path(__file__).resolve().parents[2]))
from gateway.config import Platform, PlatformConfig
from gateway.session import SessionSource
# ---------------------------------------------------------------------------
# Image cache utilities
#
@@ -251,9 +251,7 @@ def cleanup_document_cache(max_age_hours: int = 24) -> int:
class MessageType(Enum):
"""Types of incoming messages."""
TEXT = "text"
LOCATION = "location"
PHOTO = "photo"
VIDEO = "video"
AUDIO = "audio"
@@ -267,43 +265,42 @@ class MessageType(Enum):
class MessageEvent:
"""
Incoming message from a platform.
Normalized representation that all adapters produce.
"""
# Message content
text: str
message_type: MessageType = MessageType.TEXT
# Source information
source: SessionSource = None
# Original platform data
raw_message: Any = None
message_id: str | None = None
message_id: Optional[str] = None
# Media attachments
media_urls: list[str] = field(default_factory=list)
media_types: list[str] = field(default_factory=list)
media_urls: List[str] = field(default_factory=list)
media_types: List[str] = field(default_factory=list)
# Reply context
reply_to_message_id: str | None = None
reply_to_message_id: Optional[str] = None
# Timestamps
timestamp: datetime = field(default_factory=datetime.now)
def is_command(self) -> bool:
"""Check if this is a command message (e.g., /new, /reset)."""
return self.text.startswith("/")
def get_command(self) -> str | None:
def get_command(self) -> Optional[str]:
"""Extract command name if this is a command message."""
if not self.is_command():
return None
# Split on space and get first word, strip the /
parts = self.text.split(maxsplit=1)
return parts[0][1:].lower() if parts else None
def get_command_args(self) -> str:
"""Get the arguments after a command."""
if not self.is_command():
@@ -312,88 +309,91 @@ class MessageEvent:
return parts[1] if len(parts) > 1 else ""
@dataclass
@dataclass
class SendResult:
"""Result of sending a message."""
success: bool
message_id: str | None = None
error: str | None = None
message_id: Optional[str] = None
error: Optional[str] = None
raw_response: Any = None
# Type for message handlers
MessageHandler = Callable[[MessageEvent], Awaitable[str | None]]
MessageHandler = Callable[[MessageEvent], Awaitable[Optional[str]]]
class BasePlatformAdapter(ABC):
"""
Base class for platform adapters.
Subclasses implement platform-specific logic for:
- Connecting and authenticating
- Receiving messages
- Sending messages/responses
- Handling media
"""
def __init__(self, config: PlatformConfig, platform: Platform):
self.config = config
self.platform = platform
self._message_handler: MessageHandler | None = None
self._message_handler: Optional[MessageHandler] = None
self._running = False
# Track active message handlers per session for interrupt support
# Key: session_key (e.g., chat_id), Value: (event, asyncio.Event for interrupt)
self._active_sessions: dict[str, asyncio.Event] = {}
self._pending_messages: dict[str, MessageEvent] = {}
self._active_sessions: Dict[str, asyncio.Event] = {}
self._pending_messages: Dict[str, MessageEvent] = {}
@property
def name(self) -> str:
"""Human-readable name for this adapter."""
return self.platform.value.title()
@property
def is_connected(self) -> bool:
"""Check if adapter is currently connected."""
return self._running
def set_message_handler(self, handler: MessageHandler) -> None:
"""
Set the handler for incoming messages.
The handler receives a MessageEvent and should return
an optional response string.
"""
self._message_handler = handler
@abstractmethod
async def connect(self) -> bool:
"""
Connect to the platform and start receiving messages.
Returns True if connection was successful.
"""
pass
@abstractmethod
async def disconnect(self) -> None:
"""Disconnect from the platform."""
pass
@abstractmethod
async def send(
self, chat_id: str, content: str, reply_to: str | None = None, metadata: dict[str, Any] | None = None
self,
chat_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None
) -> SendResult:
"""
Send a message to a chat.
Args:
chat_id: The chat/channel ID to send to
content: Message content (may be markdown)
reply_to: Optional message ID to reply to
metadata: Additional platform-specific options
Returns:
SendResult with success status and message ID
"""
@@ -415,21 +415,21 @@ class BasePlatformAdapter(ABC):
async def send_typing(self, chat_id: str) -> None:
"""
Send a typing indicator.
Override in subclasses if the platform supports it.
"""
pass
async def send_image(
self,
chat_id: str,
image_url: str,
caption: str | None = None,
reply_to: str | None = None,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""
Send an image natively via the platform API.
Override in subclasses to send images as proper attachments
instead of plain-text URLs. Default falls back to sending the
URL as a text message.
@@ -437,91 +437,87 @@ class BasePlatformAdapter(ABC):
# Fallback: send URL as text (subclasses override for native images)
text = f"{caption}\n{image_url}" if caption else image_url
return await self.send(chat_id=chat_id, content=text, reply_to=reply_to)
async def send_animation(
self,
chat_id: str,
animation_url: str,
caption: str | None = None,
reply_to: str | None = None,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""
Send an animated GIF natively via the platform API.
Override in subclasses to send GIFs as proper animations
(e.g., Telegram send_animation) so they auto-play inline.
Default falls back to send_image.
"""
return await self.send_image(chat_id=chat_id, image_url=animation_url, caption=caption, reply_to=reply_to)
@staticmethod
def _is_animation_url(url: str) -> bool:
"""Check if a URL points to an animated GIF (vs a static image)."""
lower = url.lower().split("?")[0] # Strip query params
return lower.endswith(".gif")
lower = url.lower().split('?')[0] # Strip query params
return lower.endswith('.gif')
@staticmethod
def extract_images(content: str) -> tuple[list[tuple[str, str]], str]:
def extract_images(content: str) -> Tuple[List[Tuple[str, str]], str]:
"""
Extract image URLs from markdown and HTML image tags in a response.
Finds patterns like:
- ![alt text](https://example.com/image.png)
- <img src="https://example.com/image.png">
- <img src="https://example.com/image.png"></img>
Args:
content: The response text to scan.
Returns:
Tuple of (list of (url, alt_text) pairs, cleaned content with image tags removed).
"""
images = []
cleaned = content
# Match markdown images: ![alt](url)
md_pattern = r"!\[([^\]]*)\]\((https?://[^\s\)]+)\)"
md_pattern = r'!\[([^\]]*)\]\((https?://[^\s\)]+)\)'
for match in re.finditer(md_pattern, content):
alt_text = match.group(1)
url = match.group(2)
# Only extract URLs that look like actual images
if any(
url.lower().endswith(ext) or ext in url.lower()
for ext in [".png", ".jpg", ".jpeg", ".gif", ".webp", "fal.media", "fal-cdn", "replicate.delivery"]
):
if any(url.lower().endswith(ext) or ext in url.lower() for ext in
['.png', '.jpg', '.jpeg', '.gif', '.webp', 'fal.media', 'fal-cdn', 'replicate.delivery']):
images.append((url, alt_text))
# Match HTML img tags: <img src="url"> or <img src="url"></img> or <img src="url"/>
html_pattern = r'<img\s+src=["\']?(https?://[^\s"\'<>]+)["\']?\s*/?>\s*(?:</img>)?'
for match in re.finditer(html_pattern, content):
url = match.group(1)
images.append((url, ""))
# Remove only the matched image tags from content (not all markdown images)
if images:
extracted_urls = {url for url, _ in images}
def _remove_if_extracted(match):
url = match.group(2) if match.lastindex >= 2 else match.group(1)
return "" if url in extracted_urls else match.group(0)
return '' if url in extracted_urls else match.group(0)
cleaned = re.sub(md_pattern, _remove_if_extracted, cleaned)
cleaned = re.sub(html_pattern, _remove_if_extracted, cleaned)
# Clean up leftover blank lines
cleaned = re.sub(r"\n{3,}", "\n\n", cleaned).strip()
cleaned = re.sub(r'\n{3,}', '\n\n', cleaned).strip()
return images, cleaned
async def send_voice(
self,
chat_id: str,
audio_path: str,
caption: str | None = None,
reply_to: str | None = None,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""
Send an audio file as a native voice message via the platform API.
Override in subclasses to send audio as voice bubbles (Telegram)
or file attachments (Discord). Default falls back to sending the
file path as text.
@@ -535,8 +531,8 @@ class BasePlatformAdapter(ABC):
self,
chat_id: str,
video_path: str,
caption: str | None = None,
reply_to: str | None = None,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""
Send a video natively via the platform API.
@@ -553,9 +549,9 @@ class BasePlatformAdapter(ABC):
self,
chat_id: str,
file_path: str,
caption: str | None = None,
file_name: str | None = None,
reply_to: str | None = None,
caption: Optional[str] = None,
file_name: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""
Send a document/file natively via the platform API.
@@ -572,8 +568,8 @@ class BasePlatformAdapter(ABC):
self,
chat_id: str,
image_path: str,
caption: str | None = None,
reply_to: str | None = None,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""
Send a local image file natively via the platform API.
@@ -588,45 +584,45 @@ class BasePlatformAdapter(ABC):
return await self.send(chat_id=chat_id, content=text, reply_to=reply_to)
@staticmethod
def extract_media(content: str) -> tuple[list[tuple[str, bool]], str]:
def extract_media(content: str) -> Tuple[List[Tuple[str, bool]], str]:
"""
Extract MEDIA:<path> tags and [[audio_as_voice]] directives from response text.
The TTS tool returns responses like:
[[audio_as_voice]]
MEDIA:/path/to/audio.ogg
Args:
content: The response text to scan.
Returns:
Tuple of (list of (path, is_voice) pairs, cleaned content with tags removed).
"""
media = []
cleaned = content
# Check for [[audio_as_voice]] directive
has_voice_tag = "[[audio_as_voice]]" in content
cleaned = cleaned.replace("[[audio_as_voice]]", "")
# Extract MEDIA:<path> tags (path may contain spaces)
media_pattern = r"MEDIA:(\S+)"
media_pattern = r'MEDIA:(\S+)'
for match in re.finditer(media_pattern, content):
path = match.group(1).strip()
if path:
media.append((path, has_voice_tag))
# Remove MEDIA tags from content
if media:
cleaned = re.sub(media_pattern, "", cleaned)
cleaned = re.sub(r"\n{3,}", "\n\n", cleaned).strip()
cleaned = re.sub(media_pattern, '', cleaned)
cleaned = re.sub(r'\n{3,}', '\n\n', cleaned).strip()
return media, cleaned
async def _keep_typing(self, chat_id: str, interval: float = 2.0) -> None:
"""
Continuously send typing indicator until cancelled.
Telegram/Discord typing status expires after ~5 seconds, so we refresh every 2
to recover quickly after progress messages interrupt it.
"""
@@ -636,20 +632,20 @@ class BasePlatformAdapter(ABC):
await asyncio.sleep(interval)
except asyncio.CancelledError:
pass # Normal cancellation when handler completes
async def handle_message(self, event: MessageEvent) -> None:
"""
Process an incoming message.
This method returns quickly by spawning background tasks.
This allows new messages to be processed even while an agent is running,
enabling interruption support.
"""
if not self._message_handler:
return
session_key = event.source.chat_id
# Check if there's already an active handler for this session
if session_key in self._active_sessions:
# Store this as a pending message - it will interrupt the running agent
@@ -658,10 +654,10 @@ class BasePlatformAdapter(ABC):
# Signal the interrupt (the processing task checks this)
self._active_sessions[session_key].set()
return # Don't process now - will be handled after current task finishes
# Spawn background task to process this message
asyncio.create_task(self._process_message_background(event, session_key))
@staticmethod
def _get_human_delay() -> float:
"""
@@ -688,40 +684,33 @@ class BasePlatformAdapter(ABC):
# Create interrupt event for this session
interrupt_event = asyncio.Event()
self._active_sessions[session_key] = interrupt_event
# Start continuous typing indicator (refreshes every 2 seconds)
typing_task = asyncio.create_task(self._keep_typing(event.source.chat_id))
try:
# Call the handler (this can take a while with tool calls)
response = await self._message_handler(event)
# Send response if any
if not response:
logger.warning("[%s] Handler returned empty/None response for %s", self.name, event.source.chat_id)
if response:
# Extract MEDIA:<path> tags (from TTS tool) before other processing
media_files, response = self.extract_media(response)
# Extract image URLs and send them as native platform attachments
images, text_content = self.extract_images(response)
if images:
logger.info(
"[%s] extract_images found %d image(s) in response (%d chars)",
self.name,
len(images),
len(response),
)
# Send the text portion first (if any remains after extractions)
if text_content:
logger.info(
"[%s] Sending response (%d chars) to %s", self.name, len(text_content), event.source.chat_id
)
logger.info("[%s] Sending response (%d chars) to %s", self.name, len(text_content), event.source.chat_id)
result = await self.send(
chat_id=event.source.chat_id, content=text_content, reply_to=event.message_id
chat_id=event.source.chat_id,
content=text_content,
reply_to=event.message_id
)
# Log send failures (don't raise - user already saw tool progress)
if not result.success:
print(f"[{self.name}] Failed to send response: {result.error}")
@@ -729,27 +718,19 @@ class BasePlatformAdapter(ABC):
fallback_result = await self.send(
chat_id=event.source.chat_id,
content=f"(Response formatting failed, plain text:)\n\n{text_content[:3500]}",
reply_to=event.message_id,
reply_to=event.message_id
)
if not fallback_result.success:
print(f"[{self.name}] Fallback send also failed: {fallback_result.error}")
# Human-like pacing delay between text and media
human_delay = self._get_human_delay()
# Send extracted images as native attachments
if images:
logger.info("[%s] Extracted %d image(s) to send as attachments", self.name, len(images))
for image_url, alt_text in images:
if human_delay > 0:
await asyncio.sleep(human_delay)
try:
logger.info(
"[%s] Sending image: %s (alt=%s)",
self.name,
image_url[:80],
alt_text[:30] if alt_text else "",
)
# Route animated GIFs through send_animation for proper playback
if self._is_animation_url(image_url):
img_result = await self.send_animation(
@@ -764,14 +745,14 @@ class BasePlatformAdapter(ABC):
caption=alt_text if alt_text else None,
)
if not img_result.success:
logger.error("[%s] Failed to send image: %s", self.name, img_result.error)
print(f"[{self.name}] Failed to send image: {img_result.error}")
except Exception as img_err:
logger.error("[%s] Error sending image: %s", self.name, img_err, exc_info=True)
print(f"[{self.name}] Error sending image: {img_err}")
# Send extracted media files — route by file type
_AUDIO_EXTS = {".ogg", ".opus", ".mp3", ".wav", ".m4a"}
_VIDEO_EXTS = {".mp4", ".mov", ".avi", ".mkv", ".3gp"}
_IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp", ".gif"}
_AUDIO_EXTS = {'.ogg', '.opus', '.mp3', '.wav', '.m4a'}
_VIDEO_EXTS = {'.mp4', '.mov', '.avi', '.mkv', '.3gp'}
_IMAGE_EXTS = {'.jpg', '.jpeg', '.png', '.webp', '.gif'}
for media_path, is_voice in media_files:
if human_delay > 0:
@@ -803,7 +784,7 @@ class BasePlatformAdapter(ABC):
print(f"[{self.name}] Failed to send media ({ext}): {media_result.error}")
except Exception as media_err:
print(f"[{self.name}] Error sending media: {media_err}")
# Check if there's a pending message that was queued during our processing
if session_key in self._pending_messages:
pending_event = self._pending_messages.pop(session_key)
@@ -819,11 +800,10 @@ class BasePlatformAdapter(ABC):
# Process pending message in new background task
await self._process_message_background(pending_event, session_key)
return # Already cleaned up
except Exception as e:
print(f"[{self.name}] Error handling message: {e}")
import traceback
traceback.print_exc()
finally:
# Stop typing indicator
@@ -835,26 +815,24 @@ class BasePlatformAdapter(ABC):
# Clean up session tracking
if session_key in self._active_sessions:
del self._active_sessions[session_key]
def has_pending_interrupt(self, session_key: str) -> bool:
"""Check if there's a pending interrupt for a session."""
return session_key in self._active_sessions and self._active_sessions[session_key].is_set()
def get_pending_message(self, session_key: str) -> MessageEvent | None:
def get_pending_message(self, session_key: str) -> Optional[MessageEvent]:
"""Get and clear any pending message for a session."""
return self._pending_messages.pop(session_key, None)
def build_source(
self,
chat_id: str,
chat_name: str | None = None,
chat_name: Optional[str] = None,
chat_type: str = "dm",
user_id: str | None = None,
user_name: str | None = None,
thread_id: str | None = None,
chat_topic: str | None = None,
user_id_alt: str | None = None,
chat_id_alt: str | None = None,
user_id: Optional[str] = None,
user_name: Optional[str] = None,
thread_id: Optional[str] = None,
chat_topic: Optional[str] = None,
) -> SessionSource:
"""Helper to build a SessionSource for this platform."""
# Normalize empty topic to None
@@ -869,33 +847,31 @@ class BasePlatformAdapter(ABC):
user_name=user_name,
thread_id=str(thread_id) if thread_id else None,
chat_topic=chat_topic.strip() if chat_topic else None,
user_id_alt=user_id_alt,
chat_id_alt=chat_id_alt,
)
@abstractmethod
async def get_chat_info(self, chat_id: str) -> dict[str, Any]:
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""
Get information about a chat/channel.
Returns dict with at least:
- name: Chat name
- type: "dm", "group", "channel"
"""
pass
def format_message(self, content: str) -> str:
"""
Format a message for this platform.
Override in subclasses to handle platform-specific formatting
(e.g., Telegram MarkdownV2, Discord markdown).
Default implementation returns content as-is.
"""
return content
def truncate_message(self, content: str, max_length: int = 4096) -> list[str]:
def truncate_message(self, content: str, max_length: int = 4096) -> List[str]:
"""
Split a long message into chunks, preserving code block boundaries.
@@ -914,14 +890,14 @@ class BasePlatformAdapter(ABC):
if len(content) <= max_length:
return [content]
INDICATOR_RESERVE = 10 # room for " (XX/XX)"
INDICATOR_RESERVE = 10 # room for " (XX/XX)"
FENCE_CLOSE = "\n```"
chunks: list[str] = []
chunks: List[str] = []
remaining = content
# When the previous chunk ended mid-code-block, this holds the
# language tag (possibly "") so we can reopen the fence.
carry_lang: str | None = None
carry_lang: Optional[str] = None
while remaining:
# If we're continuing a code block from the previous chunk,
@@ -979,6 +955,8 @@ class BasePlatformAdapter(ABC):
# Append chunk indicators when the response spans multiple messages
if len(chunks) > 1:
total = len(chunks)
chunks = [f"{chunk} ({i + 1}/{total})" for i, chunk in enumerate(chunks)]
chunks = [
f"{chunk} ({i + 1}/{total})" for i, chunk in enumerate(chunks)
]
return chunks

View File

@@ -10,16 +10,14 @@ Uses discord.py library for:
import asyncio
import logging
import os
from typing import Any
from typing import Dict, List, Optional, Any
logger = logging.getLogger(__name__)
try:
import discord
from discord import Intents
from discord import Message as DiscordMessage
from discord import Message as DiscordMessage, Intents
from discord.ext import commands
DISCORD_AVAILABLE = True
except ImportError:
DISCORD_AVAILABLE = False
@@ -30,7 +28,6 @@ except ImportError:
import sys
from pathlib import Path as _Path
sys.path.insert(0, str(_Path(__file__).resolve().parents[2]))
from gateway.config import Platform, PlatformConfig
@@ -39,8 +36,8 @@ from gateway.platforms.base import (
MessageEvent,
MessageType,
SendResult,
cache_audio_from_url,
cache_image_from_url,
cache_audio_from_url,
)
@@ -52,7 +49,7 @@ def check_discord_requirements() -> bool:
class DiscordAdapter(BasePlatformAdapter):
"""
Discord bot adapter.
Handles:
- Receiving messages from servers and DMs
- Sending responses with Discord markdown
@@ -62,26 +59,26 @@ class DiscordAdapter(BasePlatformAdapter):
- Auto-threading for long conversations
- Reaction-based feedback
"""
# Discord message limits
MAX_MESSAGE_LENGTH = 2000
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.DISCORD)
self._client: commands.Bot | None = None
self._client: Optional[commands.Bot] = None
self._ready_event = asyncio.Event()
self._allowed_user_ids: set = set() # For button approval authorization
async def connect(self) -> bool:
"""Connect to Discord and start receiving events."""
if not DISCORD_AVAILABLE:
print(f"[{self.name}] discord.py not installed. Run: pip install discord.py")
return False
if not self.config.token:
print(f"[{self.name}] No bot token configured")
return False
try:
# Set up intents -- members intent needed for username-to-ID resolution
intents = Intents.default()
@@ -89,28 +86,30 @@ class DiscordAdapter(BasePlatformAdapter):
intents.dm_messages = True
intents.guild_messages = True
intents.members = True
# Create bot
self._client = commands.Bot(
command_prefix="!", # Not really used, we handle raw messages
intents=intents,
)
# Parse allowed user entries (may contain usernames or IDs)
allowed_env = os.getenv("DISCORD_ALLOWED_USERS", "")
if allowed_env:
self._allowed_user_ids = {uid.strip() for uid in allowed_env.split(",") if uid.strip()}
self._allowed_user_ids = {
uid.strip() for uid in allowed_env.split(",") if uid.strip()
}
adapter_self = self # capture for closure
# Register event handlers
@self._client.event
async def on_ready():
print(f"[{adapter_self.name}] Connected as {adapter_self._client.user}")
# Resolve any usernames in the allowed list to numeric IDs
await adapter_self._resolve_allowed_usernames()
# Sync slash commands with Discord
try:
synced = await adapter_self._client.tree.sync()
@@ -118,33 +117,33 @@ class DiscordAdapter(BasePlatformAdapter):
except Exception as e:
print(f"[{adapter_self.name}] Slash command sync failed: {e}")
adapter_self._ready_event.set()
@self._client.event
async def on_message(message: DiscordMessage):
# Ignore bot's own messages
if message.author == self._client.user:
return
await self._handle_message(message)
# Register slash commands
self._register_slash_commands()
# Start the bot in background
asyncio.create_task(self._client.start(self.config.token))
# Wait for ready
await asyncio.wait_for(self._ready_event.wait(), timeout=30)
self._running = True
return True
except TimeoutError:
except asyncio.TimeoutError:
print(f"[{self.name}] Timeout waiting for connection")
return False
except Exception as e:
print(f"[{self.name}] Failed to connect: {e}")
return False
async def disconnect(self) -> None:
"""Disconnect from Discord."""
if self._client:
@@ -152,55 +151,59 @@ class DiscordAdapter(BasePlatformAdapter):
await self._client.close()
except Exception as e:
print(f"[{self.name}] Error during disconnect: {e}")
self._running = False
self._client = None
self._ready_event.clear()
print(f"[{self.name}] Disconnected")
async def send(
self, chat_id: str, content: str, reply_to: str | None = None, metadata: dict[str, Any] | None = None
self,
chat_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None
) -> SendResult:
"""Send a message to a Discord channel."""
if not self._client:
return SendResult(success=False, error="Not connected")
try:
# Get the channel
channel = self._client.get_channel(int(chat_id))
if not channel:
channel = await self._client.fetch_channel(int(chat_id))
if not channel:
return SendResult(success=False, error=f"Channel {chat_id} not found")
# Format and split message if needed
formatted = self.format_message(content)
chunks = self.truncate_message(formatted, self.MAX_MESSAGE_LENGTH)
message_ids = []
reference = None
if reply_to:
try:
ref_msg = await channel.fetch_message(int(reply_to))
reference = ref_msg
except Exception as e:
logger.debug("Could not fetch reply-to message: %s", e)
for i, chunk in enumerate(chunks):
msg = await channel.send(
content=chunk,
reference=reference if i == 0 else None,
)
message_ids.append(str(msg.id))
return SendResult(
success=True,
message_id=message_ids[0] if message_ids else None,
raw_response={"message_ids": message_ids},
raw_response={"message_ids": message_ids}
)
except Exception as e:
return SendResult(success=False, error=str(e))
@@ -220,7 +223,7 @@ class DiscordAdapter(BasePlatformAdapter):
msg = await channel.fetch_message(int(message_id))
formatted = self.format_message(content)
if len(formatted) > self.MAX_MESSAGE_LENGTH:
formatted = formatted[: self.MAX_MESSAGE_LENGTH - 3] + "..."
formatted = formatted[:self.MAX_MESSAGE_LENGTH - 3] + "..."
await msg.edit(content=formatted)
return SendResult(success=True, message_id=message_id)
except Exception as e:
@@ -230,28 +233,28 @@ class DiscordAdapter(BasePlatformAdapter):
self,
chat_id: str,
audio_path: str,
caption: str | None = None,
reply_to: str | None = None,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Send audio as a Discord file attachment."""
if not self._client:
return SendResult(success=False, error="Not connected")
try:
import io
channel = self._client.get_channel(int(chat_id))
if not channel:
channel = await self._client.fetch_channel(int(chat_id))
if not channel:
return SendResult(success=False, error=f"Channel {chat_id} not found")
if not os.path.exists(audio_path):
return SendResult(success=False, error=f"Audio file not found: {audio_path}")
# Determine filename from path
filename = os.path.basename(audio_path)
with open(audio_path, "rb") as f:
file = discord.File(io.BytesIO(f.read()), filename=filename)
msg = await channel.send(
@@ -259,77 +262,40 @@ class DiscordAdapter(BasePlatformAdapter):
file=file,
)
return SendResult(success=True, message_id=str(msg.id))
except Exception as e:
print(f"[{self.name}] Failed to send audio: {e}")
return await super().send_voice(chat_id, audio_path, caption, reply_to)
async def send_image_file(
self,
chat_id: str,
image_path: str,
caption: str | None = None,
reply_to: str | None = None,
) -> SendResult:
"""Send a local image file natively as a Discord file attachment."""
if not self._client:
return SendResult(success=False, error="Not connected")
try:
import io
channel = self._client.get_channel(int(chat_id))
if not channel:
channel = await self._client.fetch_channel(int(chat_id))
if not channel:
return SendResult(success=False, error=f"Channel {chat_id} not found")
if not os.path.exists(image_path):
return SendResult(success=False, error=f"Image file not found: {image_path}")
filename = os.path.basename(image_path)
with open(image_path, "rb") as f:
file = discord.File(io.BytesIO(f.read()), filename=filename)
msg = await channel.send(
content=caption if caption else None,
file=file,
)
return SendResult(success=True, message_id=str(msg.id))
except Exception as e:
print(f"[{self.name}] Failed to send local image: {e}")
return await super().send_image_file(chat_id, image_path, caption, reply_to)
async def send_image(
self,
chat_id: str,
image_url: str,
caption: str | None = None,
reply_to: str | None = None,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Send an image natively as a Discord file attachment."""
if not self._client:
return SendResult(success=False, error="Not connected")
try:
import aiohttp
channel = self._client.get_channel(int(chat_id))
if not channel:
channel = await self._client.fetch_channel(int(chat_id))
if not channel:
return SendResult(success=False, error=f"Channel {chat_id} not found")
# Download the image and send as a Discord file attachment
# (Discord renders attachments inline, unlike plain URLs)
async with aiohttp.ClientSession() as session:
async with session.get(image_url, timeout=aiohttp.ClientTimeout(total=30)) as resp:
if resp.status != 200:
raise Exception(f"Failed to download image: HTTP {resp.status}")
image_data = await resp.read()
# Determine filename from URL or content type
content_type = resp.headers.get("content-type", "image/png")
ext = "png"
@@ -339,24 +305,23 @@ class DiscordAdapter(BasePlatformAdapter):
ext = "gif"
elif "webp" in content_type:
ext = "webp"
import io
file = discord.File(io.BytesIO(image_data), filename=f"image.{ext}")
msg = await channel.send(
content=caption if caption else None,
file=file,
)
return SendResult(success=True, message_id=str(msg.id))
except ImportError:
print(f"[{self.name}] aiohttp not installed, falling back to URL. Run: pip install aiohttp")
return await super().send_image(chat_id, image_url, caption, reply_to)
except Exception as e:
print(f"[{self.name}] Failed to send image attachment, falling back to URL: {e}")
return await super().send_image(chat_id, image_url, caption, reply_to)
async def send_typing(self, chat_id: str) -> None:
"""Send typing indicator."""
if self._client:
@@ -366,20 +331,20 @@ class DiscordAdapter(BasePlatformAdapter):
await channel.typing()
except Exception:
pass # Ignore typing indicator failures
async def get_chat_info(self, chat_id: str) -> dict[str, Any]:
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Get information about a Discord channel."""
if not self._client:
return {"name": "Unknown", "type": "dm"}
try:
channel = self._client.get_channel(int(chat_id))
if not channel:
channel = await self._client.fetch_channel(int(chat_id))
if not channel:
return {"name": str(chat_id), "type": "dm"}
# Determine channel type
if isinstance(channel, discord.DMChannel):
chat_type = "dm"
@@ -395,7 +360,7 @@ class DiscordAdapter(BasePlatformAdapter):
else:
chat_type = "channel"
name = getattr(channel, "name", str(chat_id))
return {
"name": name,
"type": chat_type,
@@ -404,7 +369,7 @@ class DiscordAdapter(BasePlatformAdapter):
}
except Exception as e:
return {"name": str(chat_id), "type": "dm", "error": str(e)}
async def _resolve_allowed_usernames(self) -> None:
"""
Resolve non-numeric entries in DISCORD_ALLOWED_USERS to Discord user IDs.
@@ -451,10 +416,8 @@ class DiscordAdapter(BasePlatformAdapter):
uid = str(member.id)
numeric_ids.add(uid)
resolved_count += 1
matched_name = (
name_lower
if name_lower in to_resolve
else (display_lower if display_lower in to_resolve else global_lower)
matched_name = name_lower if name_lower in to_resolve else (
display_lower if display_lower in to_resolve else global_lower
)
to_resolve.discard(matched_name)
print(f"[{self.name}] Resolved '{matched_name}' -> {uid} ({member.name}#{member.discriminator})")
@@ -474,12 +437,12 @@ class DiscordAdapter(BasePlatformAdapter):
def format_message(self, content: str) -> str:
"""
Format message for Discord.
Discord uses its own markdown variant.
"""
# Discord markdown is fairly standard, no special escaping needed
return content
def _register_slash_commands(self) -> None:
"""Register Discord slash commands on the command tree."""
if not self._client:
@@ -592,89 +555,6 @@ class DiscordAdapter(BasePlatformAdapter):
except Exception as e:
logger.debug("Discord followup failed: %s", e)
@tree.command(name="compress", description="Compress conversation context")
async def slash_compress(interaction: discord.Interaction):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, "/compress")
await self.handle_message(event)
try:
await interaction.followup.send("Done~", ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
@tree.command(name="title", description="Set or show the session title")
@discord.app_commands.describe(name="Session title. Leave empty to show current.")
async def slash_title(interaction: discord.Interaction, name: str = ""):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, f"/title {name}".strip())
await self.handle_message(event)
try:
await interaction.followup.send("Done~", ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
@tree.command(name="resume", description="Resume a previously-named session")
@discord.app_commands.describe(name="Session name to resume. Leave empty to list sessions.")
async def slash_resume(interaction: discord.Interaction, name: str = ""):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, f"/resume {name}".strip())
await self.handle_message(event)
try:
await interaction.followup.send("Done~", ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
@tree.command(name="usage", description="Show token usage for this session")
async def slash_usage(interaction: discord.Interaction):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, "/usage")
await self.handle_message(event)
try:
await interaction.followup.send("Done~", ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
@tree.command(name="provider", description="Show available providers")
async def slash_provider(interaction: discord.Interaction):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, "/provider")
await self.handle_message(event)
try:
await interaction.followup.send("Done~", ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
@tree.command(name="help", description="Show available commands")
async def slash_help(interaction: discord.Interaction):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, "/help")
await self.handle_message(event)
try:
await interaction.followup.send("Done~", ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
@tree.command(name="insights", description="Show usage insights and analytics")
@discord.app_commands.describe(days="Number of days to analyze (default: 7)")
async def slash_insights(interaction: discord.Interaction, days: int = 7):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, f"/insights {days}")
await self.handle_message(event)
try:
await interaction.followup.send("Done~", ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
@tree.command(name="reload-mcp", description="Reload MCP servers from config")
async def slash_reload_mcp(interaction: discord.Interaction):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, "/reload-mcp")
await self.handle_message(event)
try:
await interaction.followup.send("Done~", ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
@tree.command(name="update", description="Update Hermes Agent to the latest version")
async def slash_update(interaction: discord.Interaction):
await interaction.response.defer(ephemeral=True)
@@ -694,7 +574,7 @@ class DiscordAdapter(BasePlatformAdapter):
chat_name = interaction.channel.name
if hasattr(interaction.channel, "guild") and interaction.channel.guild:
chat_name = f"{interaction.channel.guild.name} / #{chat_name}"
# Get channel topic (if available)
chat_topic = getattr(interaction.channel, "topic", None)
@@ -715,7 +595,9 @@ class DiscordAdapter(BasePlatformAdapter):
raw_message=interaction,
)
async def send_exec_approval(self, chat_id: str, command: str, approval_id: str) -> SendResult:
async def send_exec_approval(
self, chat_id: str, command: str, approval_id: str
) -> SendResult:
"""
Send a button-based exec approval prompt for a dangerous command.
@@ -757,28 +639,28 @@ class DiscordAdapter(BasePlatformAdapter):
# bot responds to every message without needing a mention.
# DISCORD_REQUIRE_MENTION: Set to "false" to disable mention requirement
# globally (all channels become free-response). Default: "true".
if not isinstance(message.channel, discord.DMChannel):
# Check if this channel is in the free-response list
free_channels_raw = os.getenv("DISCORD_FREE_RESPONSE_CHANNELS", "")
free_channels = {ch.strip() for ch in free_channels_raw.split(",") if ch.strip()}
channel_id = str(message.channel.id)
# Global override: if DISCORD_REQUIRE_MENTION=false, all channels are free
require_mention = os.getenv("DISCORD_REQUIRE_MENTION", "true").lower() not in ("false", "0", "no")
is_free_channel = channel_id in free_channels
if require_mention and not is_free_channel:
# Must be @mentioned to respond
if self._client.user not in message.mentions:
return # Silently ignore messages that don't mention the bot
# Strip the bot mention from the message text so the agent sees clean input
if self._client.user and self._client.user in message.mentions:
message.content = message.content.replace(f"<@{self._client.user.id}>", "").strip()
message.content = message.content.replace(f"<@!{self._client.user.id}>", "").strip()
# Determine message type
msg_type = MessageType.TEXT
if message.content.startswith("/"):
@@ -796,7 +678,7 @@ class DiscordAdapter(BasePlatformAdapter):
else:
msg_type = MessageType.DOCUMENT
break
# Determine chat type
if isinstance(message.channel, discord.DMChannel):
chat_type = "dm"
@@ -809,15 +691,15 @@ class DiscordAdapter(BasePlatformAdapter):
chat_name = getattr(message.channel, "name", str(message.channel.id))
if hasattr(message.channel, "guild") and message.channel.guild:
chat_name = f"{message.channel.guild.name} / #{chat_name}"
# Get thread ID if in a thread
thread_id = None
if isinstance(message.channel, discord.Thread):
thread_id = str(message.channel.id)
# Get channel topic (if available - TextChannels have topics, DMs/threads don't)
chat_topic = getattr(message.channel, "topic", None)
# Build source
source = self.build_source(
chat_id=str(message.channel.id),
@@ -828,7 +710,7 @@ class DiscordAdapter(BasePlatformAdapter):
thread_id=thread_id,
chat_topic=chat_topic,
)
# Build media URLs -- download image attachments to local cache so the
# vision tool can access them reliably (Discord CDN URLs can expire).
media_urls = []
@@ -867,7 +749,7 @@ class DiscordAdapter(BasePlatformAdapter):
# Other attachments: keep the original URL
media_urls.append(att.url)
media_types.append(content_type)
event = MessageEvent(
text=message.content,
message_type=msg_type,
@@ -879,7 +761,7 @@ class DiscordAdapter(BasePlatformAdapter):
reply_to_message_id=str(message.reference.message_id) if message.reference else None,
timestamp=message.created_at,
)
await self.handle_message(event)
@@ -909,14 +791,20 @@ if DISCORD_AVAILABLE:
return True # No allowlist = anyone can approve
return str(interaction.user.id) in self.allowed_user_ids
async def _resolve(self, interaction: discord.Interaction, action: str, color: discord.Color):
async def _resolve(
self, interaction: discord.Interaction, action: str, color: discord.Color
):
"""Resolve the approval and update the message."""
if self.resolved:
await interaction.response.send_message("This approval has already been resolved~", ephemeral=True)
await interaction.response.send_message(
"This approval has already been resolved~", ephemeral=True
)
return
if not self._check_auth(interaction):
await interaction.response.send_message("You're not authorized to approve commands~", ephemeral=True)
await interaction.response.send_message(
"You're not authorized to approve commands~", ephemeral=True
)
return
self.resolved = True
@@ -936,7 +824,6 @@ if DISCORD_AVAILABLE:
# Store the approval decision
try:
from tools.approval import approve_permanent
if action == "allow_once":
pass # One-time approval handled by gateway
elif action == "allow_always":
@@ -945,15 +832,21 @@ if DISCORD_AVAILABLE:
pass
@discord.ui.button(label="Allow Once", style=discord.ButtonStyle.green)
async def allow_once(self, interaction: discord.Interaction, button: discord.ui.Button):
async def allow_once(
self, interaction: discord.Interaction, button: discord.ui.Button
):
await self._resolve(interaction, "allow_once", discord.Color.green())
@discord.ui.button(label="Always Allow", style=discord.ButtonStyle.blurple)
async def allow_always(self, interaction: discord.Interaction, button: discord.ui.Button):
async def allow_always(
self, interaction: discord.Interaction, button: discord.ui.Button
):
await self._resolve(interaction, "allow_always", discord.Color.blue())
@discord.ui.button(label="Deny", style=discord.ButtonStyle.red)
async def deny(self, interaction: discord.Interaction, button: discord.ui.Button):
async def deny(
self, interaction: discord.Interaction, button: discord.ui.Button
):
await self._resolve(interaction, "deny", discord.Color.red())
async def on_timeout(self):

View File

@@ -19,11 +19,10 @@ import os
import time
import uuid
from datetime import datetime
from typing import Any
from typing import Any, Dict, List, Optional, Set
try:
import aiohttp
AIOHTTP_AVAILABLE = True
except ImportError:
AIOHTTP_AVAILABLE = False
@@ -67,10 +66,10 @@ class HomeAssistantAdapter(BasePlatformAdapter):
super().__init__(config, Platform.HOMEASSISTANT)
# Connection state
self._session: aiohttp.ClientSession | None = None
self._ws: aiohttp.ClientWebSocketResponse | None = None
self._rest_session: aiohttp.ClientSession | None = None
self._listen_task: asyncio.Task | None = None
self._session: Optional["aiohttp.ClientSession"] = None
self._ws: Optional["aiohttp.ClientWebSocketResponse"] = None
self._rest_session: Optional["aiohttp.ClientSession"] = None
self._listen_task: Optional[asyncio.Task] = None
self._msg_id: int = 0
# Configuration from extra
@@ -81,13 +80,13 @@ class HomeAssistantAdapter(BasePlatformAdapter):
self._hass_token: str = token
# Event filtering
self._watch_domains: set[str] = set(extra.get("watch_domains", []))
self._watch_entities: set[str] = set(extra.get("watch_entities", []))
self._ignore_entities: set[str] = set(extra.get("ignore_entities", []))
self._watch_domains: Set[str] = set(extra.get("watch_domains", []))
self._watch_entities: Set[str] = set(extra.get("watch_entities", []))
self._ignore_entities: Set[str] = set(extra.get("ignore_entities", []))
self._cooldown_seconds: int = int(extra.get("cooldown_seconds", 30))
# Cooldown tracking: entity_id -> last_event_timestamp
self._last_event_time: dict[str, float] = {}
self._last_event_time: Dict[str, float] = {}
def _next_id(self) -> int:
"""Return the next WebSocket message ID."""
@@ -142,12 +141,10 @@ class HomeAssistantAdapter(BasePlatformAdapter):
return False
# Step 2: Send auth
await self._ws.send_json(
{
"type": "auth",
"access_token": self._hass_token,
}
)
await self._ws.send_json({
"type": "auth",
"access_token": self._hass_token,
})
# Step 3: Wait for auth_ok
msg = await self._ws.receive_json()
@@ -158,13 +155,11 @@ class HomeAssistantAdapter(BasePlatformAdapter):
# Step 4: Subscribe to state_changed events
sub_id = self._next_id()
await self._ws.send_json(
{
"id": sub_id,
"type": "subscribe_events",
"event_type": "state_changed",
}
)
await self._ws.send_json({
"id": sub_id,
"type": "subscribe_events",
"event_type": "state_changed",
})
# Verify subscription acknowledgement
msg = await self._ws.receive_json()
@@ -250,7 +245,7 @@ class HomeAssistantAdapter(BasePlatformAdapter):
elif ws_msg.type in (aiohttp.WSMsgType.CLOSED, aiohttp.WSMsgType.ERROR):
break
async def _handle_ha_event(self, event: dict[str, Any]) -> None:
async def _handle_ha_event(self, event: Dict[str, Any]) -> None:
"""Process a state_changed event from Home Assistant."""
event_data = event.get("data", {})
entity_id: str = event_data.get("entity_id", "")
@@ -307,9 +302,9 @@ class HomeAssistantAdapter(BasePlatformAdapter):
@staticmethod
def _format_state_change(
entity_id: str,
old_state: dict[str, Any],
new_state: dict[str, Any],
) -> str | None:
old_state: Dict[str, Any],
new_state: Dict[str, Any],
) -> Optional[str]:
"""Convert a state_changed event into a human-readable description."""
if not new_state:
return None
@@ -336,7 +331,10 @@ class HomeAssistantAdapter(BasePlatformAdapter):
if domain == "sensor":
unit = new_state.get("attributes", {}).get("unit_of_measurement", "")
return f"[Home Assistant] {friendly_name}: changed from {old_val}{unit} to {new_val}{unit}"
return (
f"[Home Assistant] {friendly_name}: changed from "
f"{old_val}{unit} to {new_val}{unit}"
)
if domain == "binary_sensor":
return (
@@ -346,13 +344,22 @@ class HomeAssistantAdapter(BasePlatformAdapter):
)
if domain in ("light", "switch", "fan"):
return f"[Home Assistant] {friendly_name}: turned {'on' if new_val == 'on' else 'off'}"
return (
f"[Home Assistant] {friendly_name}: turned "
f"{'on' if new_val == 'on' else 'off'}"
)
if domain == "alarm_control_panel":
return f"[Home Assistant] {friendly_name}: alarm state changed from '{old_val}' to '{new_val}'"
return (
f"[Home Assistant] {friendly_name}: alarm state changed from "
f"'{old_val}' to '{new_val}'"
)
# Generic fallback
return f"[Home Assistant] {friendly_name} ({entity_id}): changed from '{old_val}' to '{new_val}'"
return (
f"[Home Assistant] {friendly_name} ({entity_id}): "
f"changed from '{old_val}' to '{new_val}'"
)
# ------------------------------------------------------------------
# Outbound messaging
@@ -362,8 +369,8 @@ class HomeAssistantAdapter(BasePlatformAdapter):
self,
chat_id: str,
content: str,
reply_to: str | None = None,
metadata: dict[str, Any] | None = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a notification via HA REST API (persistent_notification.create).
@@ -377,7 +384,7 @@ class HomeAssistantAdapter(BasePlatformAdapter):
}
payload = {
"title": "Hermes Agent",
"message": content[: self.MAX_MESSAGE_LENGTH],
"message": content[:self.MAX_MESSAGE_LENGTH],
}
try:
@@ -394,22 +401,20 @@ class HomeAssistantAdapter(BasePlatformAdapter):
body = await resp.text()
return SendResult(success=False, error=f"HTTP {resp.status}: {body}")
else:
async with (
aiohttp.ClientSession() as session,
session.post(
async with aiohttp.ClientSession() as session:
async with session.post(
url,
headers=headers,
json=payload,
timeout=aiohttp.ClientTimeout(total=10),
) as resp,
):
if resp.status < 300:
return SendResult(success=True, message_id=uuid.uuid4().hex[:12])
else:
body = await resp.text()
return SendResult(success=False, error=f"HTTP {resp.status}: {body}")
) as resp:
if resp.status < 300:
return SendResult(success=True, message_id=uuid.uuid4().hex[:12])
else:
body = await resp.text()
return SendResult(success=False, error=f"HTTP {resp.status}: {body}")
except TimeoutError:
except asyncio.TimeoutError:
return SendResult(success=False, error="Timeout sending notification to HA")
except Exception as e:
return SendResult(success=False, error=str(e))
@@ -418,7 +423,7 @@ class HomeAssistantAdapter(BasePlatformAdapter):
"""No typing indicator for Home Assistant."""
pass
async def get_chat_info(self, chat_id: str) -> dict[str, Any]:
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Return basic info about the HA event channel."""
return {
"name": "Home Assistant Events",

View File

@@ -1,718 +0,0 @@
"""Signal messenger platform adapter.
Connects to a signal-cli daemon running in HTTP mode.
Inbound messages arrive via SSE (Server-Sent Events) streaming.
Outbound messages and actions use JSON-RPC 2.0 over HTTP.
Based on PR #268 by ibhagwan, rebuilt with bug fixes.
Requires:
- signal-cli installed and running: signal-cli daemon --http 127.0.0.1:8080
- SIGNAL_HTTP_URL and SIGNAL_ACCOUNT environment variables set
"""
import asyncio
import base64
import json
import logging
import os
import random
import re
import time
from datetime import UTC, datetime
from pathlib import Path
from typing import Any
from urllib.parse import unquote
import httpx
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
MessageType,
SendResult,
cache_audio_from_bytes,
cache_document_from_bytes,
cache_image_from_bytes,
cache_image_from_url,
)
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------
SIGNAL_MAX_ATTACHMENT_SIZE = 100 * 1024 * 1024 # 100 MB
MAX_MESSAGE_LENGTH = 8000 # Signal message size limit
TYPING_INTERVAL = 8.0 # seconds between typing indicator refreshes
SSE_RETRY_DELAY_INITIAL = 2.0
SSE_RETRY_DELAY_MAX = 60.0
HEALTH_CHECK_INTERVAL = 30.0 # seconds between health checks
HEALTH_CHECK_STALE_THRESHOLD = 120.0 # seconds without SSE activity before concern
# E.164 phone number pattern for redaction
_PHONE_RE = re.compile(r"\+[1-9]\d{6,14}")
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _redact_phone(phone: str) -> str:
"""Redact a phone number for logging: +15551234567 -> +155****4567."""
if not phone:
return "<none>"
if len(phone) <= 8:
return phone[:2] + "****" + phone[-2:] if len(phone) > 4 else "****"
return phone[:4] + "****" + phone[-4:]
def _parse_comma_list(value: str) -> list[str]:
"""Split a comma-separated string into a list, stripping whitespace."""
return [v.strip() for v in value.split(",") if v.strip()]
def _guess_extension(data: bytes) -> str:
"""Guess file extension from magic bytes."""
if data[:4] == b"\x89PNG":
return ".png"
if data[:2] == b"\xff\xd8":
return ".jpg"
if data[:4] == b"GIF8":
return ".gif"
if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WEBP":
return ".webp"
if data[:4] == b"%PDF":
return ".pdf"
if len(data) >= 8 and data[4:8] == b"ftyp":
return ".mp4"
if data[:4] == b"OggS":
return ".ogg"
if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
return ".mp3"
if data[:2] == b"PK":
return ".zip"
return ".bin"
def _is_image_ext(ext: str) -> bool:
return ext.lower() in (".jpg", ".jpeg", ".png", ".gif", ".webp")
def _is_audio_ext(ext: str) -> bool:
return ext.lower() in (".mp3", ".wav", ".ogg", ".m4a", ".aac")
def _render_mentions(text: str, mentions: list) -> str:
"""Replace Signal mention placeholders (\\uFFFC) with readable @identifiers.
Signal encodes @mentions as the Unicode object replacement character
with out-of-band metadata containing the mentioned user's UUID/number.
"""
if not mentions or "\ufffc" not in text:
return text
# Sort mentions by start position (reverse) to replace from end to start
# so indices don't shift as we replace
sorted_mentions = sorted(mentions, key=lambda m: m.get("start", 0), reverse=True)
for mention in sorted_mentions:
start = mention.get("start", 0)
length = mention.get("length", 1)
# Use the mention's number or UUID as the replacement
identifier = mention.get("number") or mention.get("uuid") or "user"
replacement = f"@{identifier}"
text = text[:start] + replacement + text[start + length :]
return text
def check_signal_requirements() -> bool:
"""Check if Signal is configured (has URL and account)."""
return bool(os.getenv("SIGNAL_HTTP_URL") and os.getenv("SIGNAL_ACCOUNT"))
# ---------------------------------------------------------------------------
# Signal Adapter
# ---------------------------------------------------------------------------
class SignalAdapter(BasePlatformAdapter):
"""Signal messenger adapter using signal-cli HTTP daemon."""
platform = Platform.SIGNAL
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.SIGNAL)
extra = config.extra or {}
self.http_url = extra.get("http_url", "http://127.0.0.1:8080").rstrip("/")
self.account = extra.get("account", "")
self.ignore_stories = extra.get("ignore_stories", True)
# Parse allowlists — group policy is derived from presence of group allowlist
group_allowed_str = os.getenv("SIGNAL_GROUP_ALLOWED_USERS", "")
self.group_allow_from = set(_parse_comma_list(group_allowed_str))
# HTTP client
self.client: httpx.AsyncClient | None = None
# Background tasks
self._sse_task: asyncio.Task | None = None
self._health_monitor_task: asyncio.Task | None = None
self._typing_tasks: dict[str, asyncio.Task] = {}
self._running = False
self._last_sse_activity = 0.0
self._sse_response: httpx.Response | None = None
# Normalize account for self-message filtering
self._account_normalized = self.account.strip()
logger.info(
"Signal adapter initialized: url=%s account=%s groups=%s",
self.http_url,
_redact_phone(self.account),
"enabled" if self.group_allow_from else "disabled",
)
# ------------------------------------------------------------------
# Lifecycle
# ------------------------------------------------------------------
async def connect(self) -> bool:
"""Connect to signal-cli daemon and start SSE listener."""
if not self.http_url or not self.account:
logger.error("Signal: SIGNAL_HTTP_URL and SIGNAL_ACCOUNT are required")
return False
self.client = httpx.AsyncClient(timeout=30.0)
# Health check — verify signal-cli daemon is reachable
try:
resp = await self.client.get(f"{self.http_url}/api/v1/check", timeout=10.0)
if resp.status_code != 200:
logger.error("Signal: health check failed (status %d)", resp.status_code)
return False
except Exception as e:
logger.error("Signal: cannot reach signal-cli at %s: %s", self.http_url, e)
return False
self._running = True
self._last_sse_activity = time.time()
self._sse_task = asyncio.create_task(self._sse_listener())
self._health_monitor_task = asyncio.create_task(self._health_monitor())
logger.info("Signal: connected to %s", self.http_url)
return True
async def disconnect(self) -> None:
"""Stop SSE listener and clean up."""
self._running = False
if self._sse_task:
self._sse_task.cancel()
try:
await self._sse_task
except asyncio.CancelledError:
pass
if self._health_monitor_task:
self._health_monitor_task.cancel()
try:
await self._health_monitor_task
except asyncio.CancelledError:
pass
# Cancel all typing tasks
for task in self._typing_tasks.values():
task.cancel()
self._typing_tasks.clear()
if self.client:
await self.client.aclose()
self.client = None
logger.info("Signal: disconnected")
# ------------------------------------------------------------------
# SSE Streaming (inbound messages)
# ------------------------------------------------------------------
async def _sse_listener(self) -> None:
"""Listen for SSE events from signal-cli daemon."""
url = f"{self.http_url}/api/v1/events?account={self.account}"
backoff = SSE_RETRY_DELAY_INITIAL
while self._running:
try:
logger.debug("Signal SSE: connecting to %s", url)
async with self.client.stream(
"GET",
url,
headers={"Accept": "text/event-stream"},
timeout=None,
) as response:
self._sse_response = response
backoff = SSE_RETRY_DELAY_INITIAL # Reset on successful connection
self._last_sse_activity = time.time()
logger.info("Signal SSE: connected")
buffer = ""
async for chunk in response.aiter_text():
if not self._running:
break
buffer += chunk
while "\n" in buffer:
line, buffer = buffer.split("\n", 1)
line = line.strip()
if not line:
continue
# Parse SSE data lines
if line.startswith("data:"):
data_str = line[5:].strip()
if not data_str:
continue
self._last_sse_activity = time.time()
try:
data = json.loads(data_str)
await self._handle_envelope(data)
except json.JSONDecodeError:
logger.debug("Signal SSE: invalid JSON: %s", data_str[:100])
except Exception:
logger.exception("Signal SSE: error handling event")
except asyncio.CancelledError:
break
except httpx.HTTPError as e:
if self._running:
logger.warning("Signal SSE: HTTP error: %s (reconnecting in %.0fs)", e, backoff)
except Exception as e:
if self._running:
logger.warning("Signal SSE: error: %s (reconnecting in %.0fs)", e, backoff)
if self._running:
# Add 20% jitter to prevent thundering herd on reconnection
jitter = backoff * 0.2 * random.random()
await asyncio.sleep(backoff + jitter)
backoff = min(backoff * 2, SSE_RETRY_DELAY_MAX)
self._sse_response = None
# ------------------------------------------------------------------
# Health Monitor
# ------------------------------------------------------------------
async def _health_monitor(self) -> None:
"""Monitor SSE connection health and force reconnect if stale."""
while self._running:
await asyncio.sleep(HEALTH_CHECK_INTERVAL)
if not self._running:
break
elapsed = time.time() - self._last_sse_activity
if elapsed > HEALTH_CHECK_STALE_THRESHOLD:
logger.warning("Signal: SSE idle for %.0fs, checking daemon health", elapsed)
try:
resp = await self.client.get(f"{self.http_url}/api/v1/check", timeout=10.0)
if resp.status_code == 200:
# Daemon is alive but SSE is idle — update activity to
# avoid repeated warnings (connection may just be quiet)
self._last_sse_activity = time.time()
logger.debug("Signal: daemon healthy, SSE idle")
else:
logger.warning("Signal: health check failed (%d), forcing reconnect", resp.status_code)
self._force_reconnect()
except Exception as e:
logger.warning("Signal: health check error: %s, forcing reconnect", e)
self._force_reconnect()
def _force_reconnect(self) -> None:
"""Force SSE reconnection by closing the current response."""
if self._sse_response and not self._sse_response.is_stream_consumed:
try:
asyncio.create_task(self._sse_response.aclose())
except Exception:
pass
self._sse_response = None
# ------------------------------------------------------------------
# Message Handling
# ------------------------------------------------------------------
async def _handle_envelope(self, envelope: dict) -> None:
"""Process an incoming signal-cli envelope."""
# Unwrap nested envelope if present
envelope_data = envelope.get("envelope", envelope)
# Filter syncMessage envelopes (sent transcripts, read receipts, etc.)
# signal-cli may set syncMessage to null vs omitting it, so check key existence
if "syncMessage" in envelope_data:
return
# Extract sender info
sender = envelope_data.get("sourceNumber") or envelope_data.get("sourceUuid") or envelope_data.get("source")
sender_name = envelope_data.get("sourceName", "")
sender_uuid = envelope_data.get("sourceUuid", "")
if not sender:
logger.debug("Signal: ignoring envelope with no sender")
return
# Self-message filtering — prevent reply loops
if self._account_normalized and sender == self._account_normalized:
return
# Filter stories
if self.ignore_stories and envelope_data.get("storyMessage"):
return
# Get data message — also check editMessage (edited messages contain
# their updated dataMessage inside editMessage.dataMessage)
data_message = envelope_data.get("dataMessage") or (envelope_data.get("editMessage") or {}).get("dataMessage")
if not data_message:
return
# Check for group message
group_info = data_message.get("groupInfo")
group_id = group_info.get("groupId") if group_info else None
is_group = bool(group_id)
# Group message filtering — derived from SIGNAL_GROUP_ALLOWED_USERS:
# - No env var set → groups disabled (default safe behavior)
# - Env var set with group IDs → only those groups allowed
# - Env var set with "*" → all groups allowed
# DM auth is fully handled by run.py (_is_user_authorized)
if is_group:
if not self.group_allow_from:
logger.debug("Signal: ignoring group message (no SIGNAL_GROUP_ALLOWED_USERS)")
return
if "*" not in self.group_allow_from and group_id not in self.group_allow_from:
logger.debug("Signal: group %s not in allowlist", group_id[:8] if group_id else "?")
return
# Build chat info
chat_id = sender if not is_group else f"group:{group_id}"
chat_type = "group" if is_group else "dm"
# Extract text and render mentions
text = data_message.get("message", "")
mentions = data_message.get("mentions", [])
if text and mentions:
text = _render_mentions(text, mentions)
# Process attachments
attachments_data = data_message.get("attachments", [])
image_paths = []
audio_path = None
document_paths = []
if attachments_data and not getattr(self, "ignore_attachments", False):
for att in attachments_data:
att_id = att.get("id")
att_size = att.get("size", 0)
if not att_id:
continue
if att_size > SIGNAL_MAX_ATTACHMENT_SIZE:
logger.warning("Signal: attachment too large (%d bytes), skipping", att_size)
continue
try:
cached_path, ext = await self._fetch_attachment(att_id)
if cached_path:
if _is_image_ext(ext):
image_paths.append(cached_path)
elif _is_audio_ext(ext):
audio_path = cached_path
else:
document_paths.append(cached_path)
except Exception:
logger.exception("Signal: failed to fetch attachment %s", att_id)
# Build session source
source = self.build_source(
chat_id=chat_id,
chat_name=group_info.get("groupName") if group_info else sender_name,
chat_type=chat_type,
user_id=sender,
user_name=sender_name or sender,
user_id_alt=sender_uuid if sender_uuid else None,
chat_id_alt=group_id if is_group else None,
)
# Determine message type
msg_type = MessageType.TEXT
if audio_path:
msg_type = MessageType.VOICE
elif image_paths:
msg_type = MessageType.IMAGE
# Parse timestamp from envelope data (milliseconds since epoch)
ts_ms = envelope_data.get("timestamp", 0)
if ts_ms:
try:
timestamp = datetime.fromtimestamp(ts_ms / 1000, tz=UTC)
except (ValueError, OSError):
timestamp = datetime.now(tz=UTC)
else:
timestamp = datetime.now(tz=UTC)
# Build and dispatch event
event = MessageEvent(
source=source,
text=text or "",
message_type=msg_type,
image_paths=image_paths,
audio_path=audio_path,
document_paths=document_paths,
timestamp=timestamp,
)
logger.debug("Signal: message from %s in %s: %s", _redact_phone(sender), chat_id[:20], (text or "")[:50])
await self.handle_message(event)
# ------------------------------------------------------------------
# Attachment Handling
# ------------------------------------------------------------------
async def _fetch_attachment(self, attachment_id: str) -> tuple:
"""Fetch an attachment via JSON-RPC and cache it. Returns (path, ext)."""
result = await self._rpc(
"getAttachment",
{
"account": self.account,
"attachmentId": attachment_id,
},
)
if not result:
return None, ""
# Result is base64-encoded file content
raw_data = base64.b64decode(result)
ext = _guess_extension(raw_data)
if _is_image_ext(ext):
path = cache_image_from_bytes(raw_data, ext)
elif _is_audio_ext(ext):
path = cache_audio_from_bytes(raw_data, ext)
else:
path = cache_document_from_bytes(raw_data, ext)
return path, ext
# ------------------------------------------------------------------
# JSON-RPC Communication
# ------------------------------------------------------------------
async def _rpc(self, method: str, params: dict, rpc_id: str = None) -> Any:
"""Send a JSON-RPC 2.0 request to signal-cli daemon."""
if not self.client:
logger.warning("Signal: RPC called but client not connected")
return None
if rpc_id is None:
rpc_id = f"{method}_{int(time.time() * 1000)}"
payload = {
"jsonrpc": "2.0",
"method": method,
"params": params,
"id": rpc_id,
}
try:
resp = await self.client.post(
f"{self.http_url}/api/v1/rpc",
json=payload,
timeout=30.0,
)
resp.raise_for_status()
data = resp.json()
if "error" in data:
logger.warning("Signal RPC error (%s): %s", method, data["error"])
return None
return data.get("result")
except Exception as e:
logger.warning("Signal RPC %s failed: %s", method, e)
return None
# ------------------------------------------------------------------
# Sending
# ------------------------------------------------------------------
async def send(
self,
chat_id: str,
text: str,
reply_to_message_id: str | None = None,
**kwargs,
) -> SendResult:
"""Send a text message."""
await self._stop_typing_indicator(chat_id)
params: dict[str, Any] = {
"account": self.account,
"message": text,
}
if chat_id.startswith("group:"):
params["groupId"] = chat_id[6:]
else:
params["recipient"] = [chat_id]
result = await self._rpc("send", params)
if result is not None:
return SendResult(success=True)
return SendResult(success=False, error="RPC send failed")
async def send_typing(self, chat_id: str) -> None:
"""Send a typing indicator."""
params: dict[str, Any] = {
"account": self.account,
}
if chat_id.startswith("group:"):
params["groupId"] = chat_id[6:]
else:
params["recipient"] = [chat_id]
await self._rpc("sendTyping", params, rpc_id="typing")
async def send_image(
self,
chat_id: str,
image_url: str,
caption: str | None = None,
**kwargs,
) -> SendResult:
"""Send an image. Supports http(s):// and file:// URLs."""
await self._stop_typing_indicator(chat_id)
# Resolve image to local path
if image_url.startswith("file://"):
file_path = unquote(image_url[7:])
else:
# Download remote image to cache
try:
file_path = await cache_image_from_url(image_url)
except Exception as e:
logger.warning("Signal: failed to download image: %s", e)
return SendResult(success=False, error=str(e))
if not file_path or not Path(file_path).exists():
return SendResult(success=False, error="Image file not found")
# Validate size
file_size = Path(file_path).stat().st_size
if file_size > SIGNAL_MAX_ATTACHMENT_SIZE:
return SendResult(success=False, error=f"Image too large ({file_size} bytes)")
params: dict[str, Any] = {
"account": self.account,
"message": caption or "",
"attachments": [file_path],
}
if chat_id.startswith("group:"):
params["groupId"] = chat_id[6:]
else:
params["recipient"] = [chat_id]
result = await self._rpc("send", params)
if result is not None:
return SendResult(success=True)
return SendResult(success=False, error="RPC send with attachment failed")
async def send_document(
self,
chat_id: str,
file_path: str,
caption: str | None = None,
filename: str | None = None,
**kwargs,
) -> SendResult:
"""Send a document/file attachment."""
await self._stop_typing_indicator(chat_id)
if not Path(file_path).exists():
return SendResult(success=False, error="File not found")
params: dict[str, Any] = {
"account": self.account,
"message": caption or "",
"attachments": [file_path],
}
if chat_id.startswith("group:"):
params["groupId"] = chat_id[6:]
else:
params["recipient"] = [chat_id]
result = await self._rpc("send", params)
if result is not None:
return SendResult(success=True)
return SendResult(success=False, error="RPC send document failed")
# ------------------------------------------------------------------
# Typing Indicators
# ------------------------------------------------------------------
async def _start_typing_indicator(self, chat_id: str) -> None:
"""Start a typing indicator loop for a chat."""
if chat_id in self._typing_tasks:
return # Already running
async def _typing_loop():
try:
while True:
await self.send_typing(chat_id)
await asyncio.sleep(TYPING_INTERVAL)
except asyncio.CancelledError:
pass
self._typing_tasks[chat_id] = asyncio.create_task(_typing_loop())
async def _stop_typing_indicator(self, chat_id: str) -> None:
"""Stop a typing indicator loop for a chat."""
task = self._typing_tasks.pop(chat_id, None)
if task:
task.cancel()
try:
await task
except asyncio.CancelledError:
pass
# ------------------------------------------------------------------
# Chat Info
# ------------------------------------------------------------------
async def get_chat_info(self, chat_id: str) -> dict[str, Any]:
"""Get information about a chat/contact."""
if chat_id.startswith("group:"):
return {
"name": chat_id,
"type": "group",
"chat_id": chat_id,
}
# Try to resolve contact name
result = await self._rpc(
"getContact",
{
"account": self.account,
"contactAddress": chat_id,
},
)
name = chat_id
if result and isinstance(result, dict):
name = result.get("name") or result.get("profileName") or chat_id
return {
"name": name,
"type": "dm",
"chat_id": chat_id,
}

View File

@@ -10,14 +10,12 @@ Uses slack-bolt (Python) with Socket Mode for:
import asyncio
import os
import re
from typing import Any
from typing import Dict, List, Optional, Any
try:
from slack_bolt.adapter.socket_mode.async_handler import AsyncSocketModeHandler
from slack_bolt.async_app import AsyncApp
from slack_bolt.adapter.socket_mode.async_handler import AsyncSocketModeHandler
from slack_sdk.web.async_client import AsyncWebClient
SLACK_AVAILABLE = True
except ImportError:
SLACK_AVAILABLE = False
@@ -27,17 +25,16 @@ except ImportError:
import sys
from pathlib import Path as _Path
sys.path.insert(0, str(_Path(__file__).resolve().parents[2]))
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import (
SUPPORTED_DOCUMENT_TYPES,
BasePlatformAdapter,
MessageEvent,
MessageType,
SendResult,
cache_document_from_bytes,
cache_image_from_url,
cache_audio_from_url,
)
@@ -66,9 +63,9 @@ class SlackAdapter(BasePlatformAdapter):
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.SLACK)
self._app: AsyncApp | None = None
self._handler: AsyncSocketModeHandler | None = None
self._bot_user_id: str | None = None
self._app: Optional[AsyncApp] = None
self._handler: Optional[AsyncSocketModeHandler] = None
self._bot_user_id: Optional[str] = None
async def connect(self) -> bool:
"""Connect to Slack via Socket Mode."""
@@ -99,13 +96,6 @@ class SlackAdapter(BasePlatformAdapter):
async def handle_message_event(event, say):
await self._handle_slack_message(event)
# Acknowledge app_mention events to prevent Bolt 404 errors.
# The "message" handler above already processes @mentions in
# channels, so this is intentionally a no-op to avoid duplicates.
@self._app.event("app_mention")
async def handle_app_mention(event, say):
pass
# Register slash command handler
@self._app.command("/hermes")
async def handle_hermes_command(ack, command):
@@ -135,8 +125,8 @@ class SlackAdapter(BasePlatformAdapter):
self,
chat_id: str,
content: str,
reply_to: str | None = None,
metadata: dict[str, Any] | None = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a message to a Slack channel or DM."""
if not self._app:
@@ -189,42 +179,12 @@ class SlackAdapter(BasePlatformAdapter):
"""Slack doesn't have a direct typing indicator API for bots."""
pass
async def send_image_file(
self,
chat_id: str,
image_path: str,
caption: str | None = None,
reply_to: str | None = None,
) -> SendResult:
"""Send a local image file to Slack by uploading it."""
if not self._app:
return SendResult(success=False, error="Not connected")
try:
import os
if not os.path.exists(image_path):
return SendResult(success=False, error=f"Image file not found: {image_path}")
result = await self._app.client.files_upload_v2(
channel=chat_id,
file=image_path,
filename=os.path.basename(image_path),
initial_comment=caption or "",
thread_ts=reply_to,
)
return SendResult(success=True, raw_response=result)
except Exception as e:
print(f"[{self.name}] Failed to send local image: {e}")
return await super().send_image_file(chat_id, image_path, caption, reply_to)
async def send_image(
self,
chat_id: str,
image_url: str,
caption: str | None = None,
reply_to: str | None = None,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Send an image to Slack by uploading the URL as a file."""
if not self._app:
@@ -248,7 +208,7 @@ class SlackAdapter(BasePlatformAdapter):
return SendResult(success=True, raw_response=result)
except Exception:
except Exception as e:
# Fall back to sending the URL as text
text = f"{caption}\n{image_url}" if caption else image_url
return await self.send(chat_id=chat_id, content=text, reply_to=reply_to)
@@ -257,8 +217,8 @@ class SlackAdapter(BasePlatformAdapter):
self,
chat_id: str,
audio_path: str,
caption: str | None = None,
reply_to: str | None = None,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Send an audio file to Slack."""
if not self._app:
@@ -277,66 +237,7 @@ class SlackAdapter(BasePlatformAdapter):
except Exception as e:
return SendResult(success=False, error=str(e))
async def send_video(
self,
chat_id: str,
video_path: str,
caption: str | None = None,
reply_to: str | None = None,
) -> SendResult:
"""Send a video file to Slack."""
if not self._app:
return SendResult(success=False, error="Not connected")
if not os.path.exists(video_path):
return SendResult(success=False, error=f"Video file not found: {video_path}")
try:
result = await self._app.client.files_upload_v2(
channel=chat_id,
file=video_path,
filename=os.path.basename(video_path),
initial_comment=caption or "",
thread_ts=reply_to,
)
return SendResult(success=True, raw_response=result)
except Exception as e:
print(f"[{self.name}] Failed to send video: {e}")
return await super().send_video(chat_id, video_path, caption, reply_to)
async def send_document(
self,
chat_id: str,
file_path: str,
caption: str | None = None,
file_name: str | None = None,
reply_to: str | None = None,
) -> SendResult:
"""Send a document/file attachment to Slack."""
if not self._app:
return SendResult(success=False, error="Not connected")
if not os.path.exists(file_path):
return SendResult(success=False, error=f"File not found: {file_path}")
display_name = file_name or os.path.basename(file_path)
try:
result = await self._app.client.files_upload_v2(
channel=chat_id,
file=file_path,
filename=display_name,
initial_comment=caption or "",
thread_ts=reply_to,
)
return SendResult(success=True, raw_response=result)
except Exception as e:
print(f"[{self.name}] Failed to send document: {e}")
return await super().send_document(chat_id, file_path, caption, file_name, reply_to)
async def get_chat_info(self, chat_id: str) -> dict[str, Any]:
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Get information about a Slack channel."""
if not self._app:
return {"name": chat_id, "type": "unknown"}
@@ -417,56 +318,6 @@ class SlackAdapter(BasePlatformAdapter):
msg_type = MessageType.VOICE
except Exception as e:
print(f"[Slack] Failed to cache audio: {e}", flush=True)
elif url:
# Try to handle as a document attachment
try:
original_filename = f.get("name", "")
ext = ""
if original_filename:
_, ext = os.path.splitext(original_filename)
ext = ext.lower()
# Fallback: reverse-lookup from MIME type
if not ext and mimetype:
mime_to_ext = {v: k for k, v in SUPPORTED_DOCUMENT_TYPES.items()}
ext = mime_to_ext.get(mimetype, "")
if ext not in SUPPORTED_DOCUMENT_TYPES:
continue # Skip unsupported file types silently
# Check file size (Slack limit: 20 MB for bots)
file_size = f.get("size", 0)
MAX_DOC_BYTES = 20 * 1024 * 1024
if not file_size or file_size > MAX_DOC_BYTES:
print(f"[Slack] Document too large or unknown size: {file_size}", flush=True)
continue
# Download and cache
raw_bytes = await self._download_slack_file_bytes(url)
cached_path = cache_document_from_bytes(raw_bytes, original_filename or f"document{ext}")
doc_mime = SUPPORTED_DOCUMENT_TYPES[ext]
media_urls.append(cached_path)
media_types.append(doc_mime)
msg_type = MessageType.DOCUMENT
print(f"[Slack] Cached user document: {cached_path}", flush=True)
# Inject text content for .txt/.md files (capped at 100 KB)
MAX_TEXT_INJECT_BYTES = 100 * 1024
if ext in (".md", ".txt") and len(raw_bytes) <= MAX_TEXT_INJECT_BYTES:
try:
text_content = raw_bytes.decode("utf-8")
display_name = original_filename or f"document{ext}"
display_name = re.sub(r"[^\w.\- ]", "_", display_name)
injection = f"[Content of {display_name}]:\n{text_content}"
if text:
text = f"{injection}\n\n{text}"
else:
text = injection
except UnicodeDecodeError:
pass # Binary content, skip injection
except Exception as e:
print(f"[Slack] Failed to cache document: {e}", flush=True)
# Build source
source = self.build_source(
@@ -498,20 +349,16 @@ class SlackAdapter(BasePlatformAdapter):
# Map subcommands to gateway commands
subcommand_map = {
"new": "/reset",
"reset": "/reset",
"status": "/status",
"stop": "/stop",
"new": "/reset", "reset": "/reset",
"status": "/status", "stop": "/stop",
"help": "/help",
"model": "/model",
"personality": "/personality",
"retry": "/retry",
"undo": "/undo",
"model": "/model", "personality": "/personality",
"retry": "/retry", "undo": "/undo",
}
first_word = text.split()[0] if text else ""
if first_word in subcommand_map:
# Preserve arguments after the subcommand
rest = text[len(first_word) :].strip()
rest = text[len(first_word):].strip()
text = f"{subcommand_map[first_word]} {rest}".strip() if rest else subcommand_map[first_word]
elif text:
pass # Treat as a regular question
@@ -547,22 +394,7 @@ class SlackAdapter(BasePlatformAdapter):
if audio:
from gateway.platforms.base import cache_audio_from_bytes
return cache_audio_from_bytes(response.content, ext)
else:
from gateway.platforms.base import cache_image_from_bytes
return cache_image_from_bytes(response.content, ext)
async def _download_slack_file_bytes(self, url: str) -> bytes:
"""Download a Slack file and return raw bytes."""
import httpx
bot_token = self.config.token
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
response = await client.get(
url,
headers={"Authorization": f"Bearer {bot_token}"},
)
response.raise_for_status()
return response.content

View File

@@ -7,26 +7,24 @@ Uses python-telegram-bot library for:
- Handling media and commands
"""
import asyncio
import logging
import os
import re
from typing import Any
from typing import Dict, List, Optional, Any
logger = logging.getLogger(__name__)
try:
from telegram import Bot, Message, Update
from telegram.constants import ChatType, ParseMode
from telegram import Update, Bot, Message
from telegram.ext import (
Application,
CommandHandler,
MessageHandler as TelegramMessageHandler,
ContextTypes,
filters,
)
from telegram.ext import (
MessageHandler as TelegramMessageHandler,
)
from telegram.constants import ParseMode, ChatType
TELEGRAM_AVAILABLE = True
except ImportError:
TELEGRAM_AVAILABLE = False
@@ -44,24 +42,22 @@ except ImportError:
# don't crash during class definition when the library isn't installed.
class _MockContextTypes:
DEFAULT_TYPE = Any
ContextTypes = _MockContextTypes
import sys
from pathlib import Path as _Path
sys.path.insert(0, str(_Path(__file__).resolve().parents[2]))
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import (
SUPPORTED_DOCUMENT_TYPES,
BasePlatformAdapter,
MessageEvent,
MessageType,
SendResult,
cache_image_from_bytes,
cache_audio_from_bytes,
cache_document_from_bytes,
cache_image_from_bytes,
SUPPORTED_DOCUMENT_TYPES,
)
@@ -72,12 +68,12 @@ def check_telegram_requirements() -> bool:
# Matches every character that MarkdownV2 requires to be backslash-escaped
# when it appears outside a code span or fenced code block.
_MDV2_ESCAPE_RE = re.compile(r"([_*\[\]()~`>#\+\-=|{}.!\\])")
_MDV2_ESCAPE_RE = re.compile(r'([_*\[\]()~`>#\+\-=|{}.!\\])')
def _escape_mdv2(text: str) -> str:
"""Escape Telegram MarkdownV2 special characters with a preceding backslash."""
return _MDV2_ESCAPE_RE.sub(r"\\\1", text)
return _MDV2_ESCAPE_RE.sub(r'\\\1', text)
def _strip_mdv2(text: str) -> str:
@@ -87,108 +83,91 @@ def _strip_mdv2(text: str) -> str:
doesn't show stray asterisks from header/bold conversion.
"""
# Remove escape backslashes before special characters
cleaned = re.sub(r"\\([_*\[\]()~`>#\+\-=|{}.!\\])", r"\1", text)
cleaned = re.sub(r'\\([_*\[\]()~`>#\+\-=|{}.!\\])', r'\1', text)
# Remove MarkdownV2 bold markers that format_message converted from **bold**
cleaned = re.sub(r"\*([^*]+)\*", r"\1", cleaned)
cleaned = re.sub(r'\*([^*]+)\*', r'\1', cleaned)
return cleaned
class TelegramAdapter(BasePlatformAdapter):
"""
Telegram bot adapter.
Handles:
- Receiving messages from users and groups
- Sending responses with Telegram markdown
- Forum topics (thread_id support)
- Media messages
"""
# Telegram message limits
MAX_MESSAGE_LENGTH = 4096
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.TELEGRAM)
self._app: Application | None = None
self._bot: Bot | None = None
self._app: Optional[Application] = None
self._bot: Optional[Bot] = None
async def connect(self) -> bool:
"""Connect to Telegram and start polling for updates."""
if not TELEGRAM_AVAILABLE:
print(f"[{self.name}] python-telegram-bot not installed. Run: pip install python-telegram-bot")
return False
if not self.config.token:
print(f"[{self.name}] No bot token configured")
return False
try:
# Build the application
self._app = Application.builder().token(self.config.token).build()
self._bot = self._app.bot
# Register handlers
self._app.add_handler(TelegramMessageHandler(filters.TEXT & ~filters.COMMAND, self._handle_text_message))
self._app.add_handler(TelegramMessageHandler(filters.COMMAND, self._handle_command))
self._app.add_handler(
TelegramMessageHandler(
filters.LOCATION | getattr(filters, "VENUE", filters.LOCATION), self._handle_location_message
)
)
self._app.add_handler(
TelegramMessageHandler(
filters.PHOTO
| filters.VIDEO
| filters.AUDIO
| filters.VOICE
| filters.Document.ALL
| filters.Sticker.ALL,
self._handle_media_message,
)
)
self._app.add_handler(TelegramMessageHandler(
filters.TEXT & ~filters.COMMAND,
self._handle_text_message
))
self._app.add_handler(TelegramMessageHandler(
filters.COMMAND,
self._handle_command
))
self._app.add_handler(TelegramMessageHandler(
filters.PHOTO | filters.VIDEO | filters.AUDIO | filters.VOICE | filters.Document.ALL | filters.Sticker.ALL,
self._handle_media_message
))
# Start polling in background
await self._app.initialize()
await self._app.start()
await self._app.updater.start_polling(allowed_updates=Update.ALL_TYPES)
# Register bot commands so Telegram shows a hint menu when users type /
try:
from telegram import BotCommand
await self._bot.set_my_commands(
[
BotCommand("new", "Start a new conversation"),
BotCommand("reset", "Reset conversation history"),
BotCommand("model", "Show or change the model"),
BotCommand("personality", "Set a personality"),
BotCommand("retry", "Retry your last message"),
BotCommand("undo", "Remove the last exchange"),
BotCommand("status", "Show session info"),
BotCommand("stop", "Stop the running agent"),
BotCommand("sethome", "Set this chat as the home channel"),
BotCommand("compress", "Compress conversation context"),
BotCommand("title", "Set or show the session title"),
BotCommand("resume", "Resume a previously-named session"),
BotCommand("usage", "Show token usage for this session"),
BotCommand("provider", "Show available providers"),
BotCommand("insights", "Show usage insights and analytics"),
BotCommand("update", "Update Hermes to the latest version"),
BotCommand("reload_mcp", "Reload MCP servers from config"),
BotCommand("help", "Show available commands"),
]
)
await self._bot.set_my_commands([
BotCommand("new", "Start a new conversation"),
BotCommand("reset", "Reset conversation history"),
BotCommand("model", "Show or change the model"),
BotCommand("personality", "Set a personality"),
BotCommand("retry", "Retry your last message"),
BotCommand("undo", "Remove the last exchange"),
BotCommand("status", "Show session info"),
BotCommand("stop", "Stop the running agent"),
BotCommand("sethome", "Set this chat as the home channel"),
BotCommand("help", "Show available commands"),
])
except Exception as e:
print(f"[{self.name}] Could not register command menu: {e}")
self._running = True
print(f"[{self.name}] Connected and polling for updates")
return True
except Exception as e:
print(f"[{self.name}] Failed to connect: {e}")
return False
async def disconnect(self) -> None:
"""Stop polling and disconnect."""
if self._app:
@@ -198,27 +177,31 @@ class TelegramAdapter(BasePlatformAdapter):
await self._app.shutdown()
except Exception as e:
print(f"[{self.name}] Error during disconnect: {e}")
self._running = False
self._app = None
self._bot = None
print(f"[{self.name}] Disconnected")
async def send(
self, chat_id: str, content: str, reply_to: str | None = None, metadata: dict[str, Any] | None = None
self,
chat_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None
) -> SendResult:
"""Send a message to a Telegram chat."""
if not self._bot:
return SendResult(success=False, error="Not connected")
try:
# Format and split message if needed
formatted = self.format_message(content)
chunks = self.truncate_message(formatted, self.MAX_MESSAGE_LENGTH)
message_ids = []
thread_id = metadata.get("thread_id") if metadata else None
for i, chunk in enumerate(chunks):
# Try Markdown first, fall back to plain text if it fails
try:
@@ -232,9 +215,7 @@ class TelegramAdapter(BasePlatformAdapter):
except Exception as md_error:
# Markdown parsing failed, try plain text
if "parse" in str(md_error).lower() or "markdown" in str(md_error).lower():
logger.warning(
"[%s] MarkdownV2 parse failed, falling back to plain text: %s", self.name, md_error
)
logger.warning("[%s] MarkdownV2 parse failed, falling back to plain text: %s", self.name, md_error)
# Strip MDV2 escape backslashes so the user doesn't
# see raw backslashes littered through the message.
plain_chunk = _strip_mdv2(chunk)
@@ -248,13 +229,13 @@ class TelegramAdapter(BasePlatformAdapter):
else:
raise # Re-raise if not a parse error
message_ids.append(str(msg.message_id))
return SendResult(
success=True,
message_id=message_ids[0] if message_ids else None,
raw_response={"message_ids": message_ids},
raw_response={"message_ids": message_ids}
)
except Exception as e:
return SendResult(success=False, error=str(e))
@@ -291,19 +272,18 @@ class TelegramAdapter(BasePlatformAdapter):
self,
chat_id: str,
audio_path: str,
caption: str | None = None,
reply_to: str | None = None,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Send audio as a native Telegram voice message or audio file."""
if not self._bot:
return SendResult(success=False, error="Not connected")
try:
import os
if not os.path.exists(audio_path):
return SendResult(success=False, error=f"Audio file not found: {audio_path}")
with open(audio_path, "rb") as audio_file:
# .ogg files -> send as voice (round playable bubble)
if audio_path.endswith(".ogg") or audio_path.endswith(".opus"):
@@ -325,53 +305,20 @@ class TelegramAdapter(BasePlatformAdapter):
except Exception as e:
print(f"[{self.name}] Failed to send voice/audio: {e}")
return await super().send_voice(chat_id, audio_path, caption, reply_to)
async def send_image_file(
self,
chat_id: str,
image_path: str,
caption: str | None = None,
reply_to: str | None = None,
) -> SendResult:
"""Send a local image file natively as a Telegram photo."""
if not self._bot:
return SendResult(success=False, error="Not connected")
try:
import os
if not os.path.exists(image_path):
return SendResult(success=False, error=f"Image file not found: {image_path}")
with open(image_path, "rb") as image_file:
msg = await self._bot.send_photo(
chat_id=int(chat_id),
photo=image_file,
caption=caption[:1024] if caption else None,
reply_to_message_id=int(reply_to) if reply_to else None,
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
print(f"[{self.name}] Failed to send local image: {e}")
return await super().send_image_file(chat_id, image_path, caption, reply_to)
async def send_image(
self,
chat_id: str,
image_url: str,
caption: str | None = None,
reply_to: str | None = None,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Send an image natively as a Telegram photo.
Tries URL-based send first (fast, works for <5MB images).
Falls back to downloading and uploading as file (supports up to 10MB).
"""
"""Send an image natively as a Telegram photo."""
if not self._bot:
return SendResult(success=False, error="Not connected")
try:
# Telegram can send photos directly from URLs (up to ~5MB)
# Telegram can send photos directly from URLs
msg = await self._bot.send_photo(
chat_id=int(chat_id),
photo=image_url,
@@ -380,39 +327,21 @@ class TelegramAdapter(BasePlatformAdapter):
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
logger.warning("[%s] URL-based send_photo failed (%s), trying file upload", self.name, e)
# Fallback: download and upload as file (supports up to 10MB)
try:
import httpx
async with httpx.AsyncClient(timeout=30.0) as client:
resp = await client.get(image_url)
resp.raise_for_status()
image_data = resp.content
msg = await self._bot.send_photo(
chat_id=int(chat_id),
photo=image_data,
caption=caption[:1024] if caption else None,
reply_to_message_id=int(reply_to) if reply_to else None,
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e2:
logger.error("[%s] File upload send_photo also failed: %s", self.name, e2)
# Final fallback: send URL as text
return await super().send_image(chat_id, image_url, caption, reply_to)
print(f"[{self.name}] Failed to send photo, falling back to URL: {e}")
# Fallback: send as text link
return await super().send_image(chat_id, image_url, caption, reply_to)
async def send_animation(
self,
chat_id: str,
animation_url: str,
caption: str | None = None,
reply_to: str | None = None,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Send an animated GIF natively as a Telegram animation (auto-plays inline)."""
if not self._bot:
return SendResult(success=False, error="Not connected")
try:
msg = await self._bot.send_animation(
chat_id=int(chat_id),
@@ -430,18 +359,21 @@ class TelegramAdapter(BasePlatformAdapter):
"""Send typing indicator."""
if self._bot:
try:
await self._bot.send_chat_action(chat_id=int(chat_id), action="typing")
await self._bot.send_chat_action(
chat_id=int(chat_id),
action="typing"
)
except Exception:
pass # Ignore typing indicator failures
async def get_chat_info(self, chat_id: str) -> dict[str, Any]:
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Get information about a Telegram chat."""
if not self._bot:
return {"name": "Unknown", "type": "dm"}
try:
chat = await self._bot.get_chat(int(chat_id))
chat_type = "dm"
if chat.type == ChatType.GROUP:
chat_type = "group"
@@ -451,7 +383,7 @@ class TelegramAdapter(BasePlatformAdapter):
chat_type = "forum"
elif chat.type == ChatType.CHANNEL:
chat_type = "channel"
return {
"name": chat.title or chat.full_name or str(chat_id),
"type": chat_type,
@@ -460,7 +392,7 @@ class TelegramAdapter(BasePlatformAdapter):
}
except Exception as e:
return {"name": str(chat_id), "type": "dm", "error": str(e)}
def format_message(self, content: str) -> str:
"""
Convert standard markdown to Telegram MarkdownV2 format.
@@ -487,36 +419,38 @@ class TelegramAdapter(BasePlatformAdapter):
# 1) Protect fenced code blocks (``` ... ```)
text = re.sub(
r"(```(?:[^\n]*\n)?[\s\S]*?```)",
r'(```(?:[^\n]*\n)?[\s\S]*?```)',
lambda m: _ph(m.group(0)),
text,
)
# 2) Protect inline code (`...`)
text = re.sub(r"(`[^`]+`)", lambda m: _ph(m.group(0)), text)
text = re.sub(r'(`[^`]+`)', lambda m: _ph(m.group(0)), text)
# 3) Convert markdown links escape the display text; inside the URL
# only ')' and '\' need escaping per the MarkdownV2 spec.
def _convert_link(m):
display = _escape_mdv2(m.group(1))
url = m.group(2).replace("\\", "\\\\").replace(")", "\\)")
return _ph(f"[{display}]({url})")
url = m.group(2).replace('\\', '\\\\').replace(')', '\\)')
return _ph(f'[{display}]({url})')
text = re.sub(r"\[([^\]]+)\]\(([^)]+)\)", _convert_link, text)
text = re.sub(r'\[([^\]]+)\]\(([^)]+)\)', _convert_link, text)
# 4) Convert markdown headers (## Title) → bold *Title*
def _convert_header(m):
inner = m.group(1).strip()
# Strip redundant bold markers that may appear inside a header
inner = re.sub(r"\*\*(.+?)\*\*", r"\1", inner)
return _ph(f"*{_escape_mdv2(inner)}*")
inner = re.sub(r'\*\*(.+?)\*\*', r'\1', inner)
return _ph(f'*{_escape_mdv2(inner)}*')
text = re.sub(r"^#{1,6}\s+(.+)$", _convert_header, text, flags=re.MULTILINE)
text = re.sub(
r'^#{1,6}\s+(.+)$', _convert_header, text, flags=re.MULTILINE
)
# 5) Convert bold: **text** → *text* (MarkdownV2 bold)
text = re.sub(
r"\*\*(.+?)\*\*",
lambda m: _ph(f"*{_escape_mdv2(m.group(1))}*"),
r'\*\*(.+?)\*\*',
lambda m: _ph(f'*{_escape_mdv2(m.group(1))}*'),
text,
)
@@ -524,8 +458,8 @@ class TelegramAdapter(BasePlatformAdapter):
# [^*\n]+ prevents matching across newlines (which would corrupt
# bullet lists using * markers and multi-line content).
text = re.sub(
r"\*([^*\n]+)\*",
lambda m: _ph(f"_{_escape_mdv2(m.group(1))}_"),
r'\*([^*\n]+)\*',
lambda m: _ph(f'_{_escape_mdv2(m.group(1))}_'),
text,
)
@@ -538,65 +472,30 @@ class TelegramAdapter(BasePlatformAdapter):
text = text.replace(key, placeholders[key])
return text
async def _handle_text_message(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle incoming text messages."""
if not update.message or not update.message.text:
return
event = self._build_message_event(update.message, MessageType.TEXT)
await self.handle_message(event)
async def _handle_command(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle incoming command messages."""
if not update.message or not update.message.text:
return
event = self._build_message_event(update.message, MessageType.COMMAND)
await self.handle_message(event)
async def _handle_location_message(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle incoming location/venue pin messages."""
if not update.message:
return
msg = update.message
venue = getattr(msg, "venue", None)
location = getattr(venue, "location", None) if venue else getattr(msg, "location", None)
if not location:
return
lat = getattr(location, "latitude", None)
lon = getattr(location, "longitude", None)
if lat is None or lon is None:
return
# Build a text message with coordinates and context
parts = ["[The user shared a location pin.]"]
if venue:
title = getattr(venue, "title", None)
address = getattr(venue, "address", None)
if title:
parts.append(f"Venue: {title}")
if address:
parts.append(f"Address: {address}")
parts.append(f"latitude: {lat}")
parts.append(f"longitude: {lon}")
parts.append(f"Map: https://www.google.com/maps/search/?api=1&query={lat},{lon}")
parts.append("Ask what they'd like to find nearby (restaurants, cafes, etc.) and any preferences.")
event = self._build_message_event(msg, MessageType.LOCATION)
event.text = "\n".join(parts)
await self.handle_message(event)
async def _handle_media_message(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle incoming media messages, downloading images to local cache."""
if not update.message:
return
msg = update.message
# Determine media type
if msg.sticker:
msg_type = MessageType.STICKER
@@ -612,19 +511,19 @@ class TelegramAdapter(BasePlatformAdapter):
msg_type = MessageType.DOCUMENT
else:
msg_type = MessageType.DOCUMENT
event = self._build_message_event(msg, msg_type)
# Add caption as text
if msg.caption:
event.text = msg.caption
# Handle stickers: describe via vision tool with caching
if msg.sticker:
await self._handle_sticker(msg, event)
await self.handle_message(event)
return
# Download photo to local image cache so the vision tool can access it
# even after Telegram's ephemeral file URLs expire (~1 hour).
if msg.photo:
@@ -648,7 +547,7 @@ class TelegramAdapter(BasePlatformAdapter):
print(f"[Telegram] Cached user photo: {cached_path}", flush=True)
except Exception as e:
print(f"[Telegram] Failed to cache photo: {e}", flush=True)
# Download voice/audio messages to cache for STT transcription
if msg.voice:
try:
@@ -690,7 +589,10 @@ class TelegramAdapter(BasePlatformAdapter):
# Check if supported
if ext not in SUPPORTED_DOCUMENT_TYPES:
supported_list = ", ".join(sorted(SUPPORTED_DOCUMENT_TYPES.keys()))
event.text = f"Unsupported document type '{ext or 'unknown'}'. Supported types: {supported_list}"
event.text = (
f"Unsupported document type '{ext or 'unknown'}'. "
f"Supported types: {supported_list}"
)
print(f"[Telegram] Unsupported document type: {ext or 'unknown'}", flush=True)
await self.handle_message(event)
return
@@ -698,7 +600,10 @@ class TelegramAdapter(BasePlatformAdapter):
# Check file size (Telegram Bot API limit: 20 MB)
MAX_DOC_BYTES = 20 * 1024 * 1024
if not doc.file_size or doc.file_size > MAX_DOC_BYTES:
event.text = "The document is too large or its size could not be verified. Maximum: 20 MB."
event.text = (
"The document is too large or its size could not be verified. "
"Maximum: 20 MB."
)
print(f"[Telegram] Document too large: {doc.file_size} bytes", flush=True)
await self.handle_message(event)
return
@@ -719,20 +624,20 @@ class TelegramAdapter(BasePlatformAdapter):
try:
text_content = raw_bytes.decode("utf-8")
display_name = original_filename or f"document{ext}"
display_name = re.sub(r"[^\w.\- ]", "_", display_name)
display_name = re.sub(r'[^\w.\- ]', '_', display_name)
injection = f"[Content of {display_name}]:\n{text_content}"
if event.text:
event.text = f"{injection}\n\n{event.text}"
else:
event.text = injection
except UnicodeDecodeError:
print("[Telegram] Could not decode text file as UTF-8, skipping content injection", flush=True)
print(f"[Telegram] Could not decode text file as UTF-8, skipping content injection", flush=True)
except Exception as e:
print(f"[Telegram] Failed to cache document: {e}", flush=True)
await self.handle_message(event)
async def _handle_sticker(self, msg: Message, event: "MessageEvent") -> None:
"""
Describe a Telegram sticker via vision analysis, with caching.
@@ -742,11 +647,11 @@ class TelegramAdapter(BasePlatformAdapter):
a placeholder noting the emoji.
"""
from gateway.sticker_cache import (
STICKER_VISION_PROMPT,
build_animated_sticker_injection,
build_sticker_injection,
cache_sticker_description,
get_cached_description,
cache_sticker_description,
build_sticker_injection,
build_animated_sticker_injection,
STICKER_VISION_PROMPT,
)
sticker = msg.sticker
@@ -774,9 +679,8 @@ class TelegramAdapter(BasePlatformAdapter):
cached_path = cache_image_from_bytes(bytes(image_bytes), ext=".webp")
print(f"[Telegram] Analyzing sticker: {cached_path}", flush=True)
import json as _json
from tools.vision_tools import vision_analyze_tool
import json as _json
result_json = await vision_analyze_tool(
image_url=cached_path,
@@ -792,29 +696,27 @@ class TelegramAdapter(BasePlatformAdapter):
# Vision failed -- use emoji as fallback
event.text = build_sticker_injection(
f"a sticker with emoji {emoji}" if emoji else "a sticker",
emoji,
set_name,
emoji, set_name,
)
except Exception as e:
print(f"[Telegram] Sticker analysis error: {e}", flush=True)
event.text = build_sticker_injection(
f"a sticker with emoji {emoji}" if emoji else "a sticker",
emoji,
set_name,
emoji, set_name,
)
def _build_message_event(self, message: Message, msg_type: MessageType) -> MessageEvent:
"""Build a MessageEvent from a Telegram message."""
chat = message.chat
user = message.from_user
# Determine chat type
chat_type = "dm"
if chat.type in (ChatType.GROUP, ChatType.SUPERGROUP):
chat_type = "group"
elif chat.type == ChatType.CHANNEL:
chat_type = "channel"
# Build source
source = self.build_source(
chat_id=str(chat.id),
@@ -824,7 +726,7 @@ class TelegramAdapter(BasePlatformAdapter):
user_name=user.full_name if user else None,
thread_id=str(message.message_thread_id) if message.message_thread_id else None,
)
return MessageEvent(
text=message.text or "",
message_type=msg_type,

View File

@@ -16,6 +16,7 @@ with different backends via a bridge pattern.
"""
import asyncio
import json
import logging
import os
import platform
@@ -23,7 +24,7 @@ import subprocess
_IS_WINDOWS = platform.system() == "Windows"
from pathlib import Path
from typing import Any
from typing import Dict, List, Optional, Any
logger = logging.getLogger(__name__)
@@ -35,9 +36,7 @@ def _kill_port_process(port: int) -> None:
# Use netstat to find the PID bound to this port, then taskkill
result = subprocess.run(
["netstat", "-ano", "-p", "TCP"],
capture_output=True,
text=True,
timeout=5,
capture_output=True, text=True, timeout=5,
)
for line in result.stdout.splitlines():
parts = line.split()
@@ -47,29 +46,24 @@ def _kill_port_process(port: int) -> None:
try:
subprocess.run(
["taskkill", "/PID", parts[4], "/F"],
capture_output=True,
timeout=5,
capture_output=True, timeout=5,
)
except subprocess.SubprocessError:
pass
else:
result = subprocess.run(
["fuser", f"{port}/tcp"],
capture_output=True,
timeout=5,
capture_output=True, timeout=5,
)
if result.returncode == 0:
subprocess.run(
["fuser", "-k", f"{port}/tcp"],
capture_output=True,
timeout=5,
capture_output=True, timeout=5,
)
except Exception:
pass
import sys
sys.path.insert(0, str(Path(__file__).resolve().parents[2]))
from gateway.config import Platform, PlatformConfig
@@ -78,20 +72,25 @@ from gateway.platforms.base import (
MessageEvent,
MessageType,
SendResult,
cache_audio_from_url,
cache_image_from_url,
cache_audio_from_url,
)
def check_whatsapp_requirements() -> bool:
"""
Check if WhatsApp dependencies are available.
WhatsApp requires a Node.js bridge for most implementations.
"""
# Check for Node.js
try:
result = subprocess.run(["node", "--version"], capture_output=True, text=True, timeout=5)
result = subprocess.run(
["node", "--version"],
capture_output=True,
text=True,
timeout=5
)
return result.returncode == 0
except Exception:
return False
@@ -100,61 +99,62 @@ def check_whatsapp_requirements() -> bool:
class WhatsAppAdapter(BasePlatformAdapter):
"""
WhatsApp adapter.
This implementation uses a simple HTTP bridge pattern where:
1. A Node.js process runs the WhatsApp Web client
2. Messages are forwarded via HTTP/IPC to this Python adapter
3. Responses are sent back through the bridge
The actual Node.js bridge implementation can vary:
- whatsapp-web.js based
- Baileys based
- Business API based
Configuration:
- bridge_script: Path to the Node.js bridge script
- bridge_port: Port for HTTP communication (default: 3000)
- session_path: Path to store WhatsApp session data
"""
# WhatsApp message limits
MAX_MESSAGE_LENGTH = 65536 # WhatsApp allows longer messages
# Default bridge location relative to the hermes-agent install
_DEFAULT_BRIDGE_DIR = Path(__file__).resolve().parents[2] / "scripts" / "whatsapp-bridge"
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.WHATSAPP)
self._bridge_process: subprocess.Popen | None = None
self._bridge_process: Optional[subprocess.Popen] = None
self._bridge_port: int = config.extra.get("bridge_port", 3000)
self._bridge_script: str | None = config.extra.get(
self._bridge_script: Optional[str] = config.extra.get(
"bridge_script",
str(self._DEFAULT_BRIDGE_DIR / "bridge.js"),
)
self._session_path: Path = Path(
config.extra.get("session_path", Path.home() / ".hermes" / "whatsapp" / "session")
)
self._session_path: Path = Path(config.extra.get(
"session_path",
Path.home() / ".hermes" / "whatsapp" / "session"
))
self._message_queue: asyncio.Queue = asyncio.Queue()
self._bridge_log_fh = None
self._bridge_log: Path | None = None
self._bridge_log: Optional[Path] = None
async def connect(self) -> bool:
"""
Start the WhatsApp bridge.
This launches the Node.js bridge process and waits for it to be ready.
"""
if not check_whatsapp_requirements():
logger.warning("[%s] Node.js not found. WhatsApp requires Node.js.", self.name)
return False
bridge_path = Path(self._bridge_script)
if not bridge_path.exists():
logger.warning("[%s] Bridge script not found: %s", self.name, bridge_path)
return False
logger.info("[%s] Bridge found at %s", self.name, bridge_path)
# Auto-install npm dependencies if node_modules doesn't exist
bridge_dir = bridge_path.parent
if not (bridge_dir / "node_modules").exists():
@@ -174,17 +174,16 @@ class WhatsAppAdapter(BasePlatformAdapter):
except Exception as e:
print(f"[{self.name}] Failed to install dependencies: {e}")
return False
try:
# Ensure session directory exists
self._session_path.mkdir(parents=True, exist_ok=True)
# Kill any orphaned bridge from a previous gateway run
_kill_port_process(self._bridge_port)
import time
time.sleep(1)
# Start the bridge process in its own process group.
# Route output to a log file so QR codes, errors, and reconnection
# messages are preserved for troubleshooting.
@@ -196,23 +195,19 @@ class WhatsAppAdapter(BasePlatformAdapter):
[
"node",
str(bridge_path),
"--port",
str(self._bridge_port),
"--session",
str(self._session_path),
"--mode",
whatsapp_mode,
"--port", str(self._bridge_port),
"--session", str(self._session_path),
"--mode", whatsapp_mode,
],
stdout=bridge_log_fh,
stderr=bridge_log_fh,
preexec_fn=None if _IS_WINDOWS else os.setsid,
)
# Wait for the bridge to connect to WhatsApp.
# Phase 1: wait for the HTTP server to come up (up to 15s).
# Phase 2: wait for WhatsApp status: connected (up to 15s more).
import aiohttp
http_ready = False
data = {}
for attempt in range(15):
@@ -223,18 +218,17 @@ class WhatsAppAdapter(BasePlatformAdapter):
self._close_bridge_log()
return False
try:
async with (
aiohttp.ClientSession() as session,
session.get(
f"http://localhost:{self._bridge_port}/health", timeout=aiohttp.ClientTimeout(total=2)
) as resp,
):
if resp.status == 200:
http_ready = True
data = await resp.json()
if data.get("status") == "connected":
print(f"[{self.name}] Bridge ready (status: connected)")
break
async with aiohttp.ClientSession() as session:
async with session.get(
f"http://localhost:{self._bridge_port}/health",
timeout=aiohttp.ClientTimeout(total=2)
) as resp:
if resp.status == 200:
http_ready = True
data = await resp.json()
if data.get("status") == "connected":
print(f"[{self.name}] Bridge ready (status: connected)")
break
except Exception:
continue
@@ -243,7 +237,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
print(f"[{self.name}] Check log: {self._bridge_log}")
self._close_bridge_log()
return False
# Phase 2: HTTP is up but WhatsApp may still be connecting.
# Give it more time to authenticate with saved credentials.
if data.get("status") != "connected":
@@ -256,17 +250,16 @@ class WhatsAppAdapter(BasePlatformAdapter):
self._close_bridge_log()
return False
try:
async with (
aiohttp.ClientSession() as session,
session.get(
f"http://localhost:{self._bridge_port}/health", timeout=aiohttp.ClientTimeout(total=2)
) as resp,
):
if resp.status == 200:
data = await resp.json()
if data.get("status") == "connected":
print(f"[{self.name}] Bridge ready (status: connected)")
break
async with aiohttp.ClientSession() as session:
async with session.get(
f"http://localhost:{self._bridge_port}/health",
timeout=aiohttp.ClientTimeout(total=2)
) as resp:
if resp.status == 200:
data = await resp.json()
if data.get("status") == "connected":
print(f"[{self.name}] Bridge ready (status: connected)")
break
except Exception:
continue
else:
@@ -275,19 +268,19 @@ class WhatsAppAdapter(BasePlatformAdapter):
print(f"[{self.name}] ⚠ WhatsApp not connected after 30s")
print(f"[{self.name}] Bridge log: {self._bridge_log}")
print(f"[{self.name}] If session expired, re-pair: hermes whatsapp")
# Start message polling task
asyncio.create_task(self._poll_messages())
self._running = True
print(f"[{self.name}] Bridge started on port {self._bridge_port}")
return True
except Exception as e:
logger.error("[%s] Failed to start bridge: %s", self.name, e, exc_info=True)
self._close_bridge_log()
return False
def _close_bridge_log(self) -> None:
"""Close the bridge log file handle if open."""
if self._bridge_log_fh:
@@ -303,7 +296,6 @@ class WhatsAppAdapter(BasePlatformAdapter):
try:
# Kill the entire process group so child node processes die too
import signal
try:
if _IS_WINDOWS:
self._bridge_process.terminate()
@@ -322,25 +314,29 @@ class WhatsAppAdapter(BasePlatformAdapter):
self._bridge_process.kill()
except Exception as e:
print(f"[{self.name}] Error stopping bridge: {e}")
# Also kill any orphaned bridge processes on our port
_kill_port_process(self._bridge_port)
self._running = False
self._bridge_process = None
self._close_bridge_log()
print(f"[{self.name}] Disconnected")
async def send(
self, chat_id: str, content: str, reply_to: str | None = None, metadata: dict[str, Any] | None = None
self,
chat_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None
) -> SendResult:
"""Send a message via the WhatsApp bridge."""
if not self._running:
return SendResult(success=False, error="Not connected")
try:
import aiohttp
async with aiohttp.ClientSession() as session:
payload = {
"chatId": chat_id,
@@ -348,19 +344,28 @@ class WhatsAppAdapter(BasePlatformAdapter):
}
if reply_to:
payload["replyTo"] = reply_to
async with session.post(
f"http://localhost:{self._bridge_port}/send", json=payload, timeout=aiohttp.ClientTimeout(total=30)
f"http://localhost:{self._bridge_port}/send",
json=payload,
timeout=aiohttp.ClientTimeout(total=30)
) as resp:
if resp.status == 200:
data = await resp.json()
return SendResult(success=True, message_id=data.get("messageId"), raw_response=data)
return SendResult(
success=True,
message_id=data.get("messageId"),
raw_response=data
)
else:
error = await resp.text()
return SendResult(success=False, error=error)
except ImportError:
return SendResult(success=False, error="aiohttp not installed. Run: pip install aiohttp")
return SendResult(
success=False,
error="aiohttp not installed. Run: pip install aiohttp"
)
except Exception as e:
return SendResult(success=False, error=str(e))
@@ -375,24 +380,21 @@ class WhatsAppAdapter(BasePlatformAdapter):
return SendResult(success=False, error="Not connected")
try:
import aiohttp
async with (
aiohttp.ClientSession() as session,
session.post(
async with aiohttp.ClientSession() as session:
async with session.post(
f"http://localhost:{self._bridge_port}/edit",
json={
"chatId": chat_id,
"messageId": message_id,
"message": content,
},
timeout=aiohttp.ClientTimeout(total=15),
) as resp,
):
if resp.status == 200:
return SendResult(success=True, message_id=message_id)
else:
error = await resp.text()
return SendResult(success=False, error=error)
timeout=aiohttp.ClientTimeout(total=15)
) as resp:
if resp.status == 200:
return SendResult(success=True, message_id=message_id)
else:
error = await resp.text()
return SendResult(success=False, error=error)
except Exception as e:
return SendResult(success=False, error=str(e))
@@ -401,8 +403,8 @@ class WhatsAppAdapter(BasePlatformAdapter):
chat_id: str,
file_path: str,
media_type: str,
caption: str | None = None,
file_name: str | None = None,
caption: Optional[str] = None,
file_name: Optional[str] = None,
) -> SendResult:
"""Send any media file via bridge /send-media endpoint."""
if not self._running:
@@ -413,7 +415,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
if not os.path.exists(file_path):
return SendResult(success=False, error=f"File not found: {file_path}")
payload: dict[str, Any] = {
payload: Dict[str, Any] = {
"chatId": chat_id,
"filePath": file_path,
"mediaType": media_type,
@@ -423,24 +425,22 @@ class WhatsAppAdapter(BasePlatformAdapter):
if file_name:
payload["fileName"] = file_name
async with (
aiohttp.ClientSession() as session,
session.post(
async with aiohttp.ClientSession() as session:
async with session.post(
f"http://localhost:{self._bridge_port}/send-media",
json=payload,
timeout=aiohttp.ClientTimeout(total=120),
) as resp,
):
if resp.status == 200:
data = await resp.json()
return SendResult(
success=True,
message_id=data.get("messageId"),
raw_response=data,
)
else:
error = await resp.text()
return SendResult(success=False, error=error)
) as resp:
if resp.status == 200:
data = await resp.json()
return SendResult(
success=True,
message_id=data.get("messageId"),
raw_response=data,
)
else:
error = await resp.text()
return SendResult(success=False, error=error)
except Exception as e:
return SendResult(success=False, error=str(e))
@@ -449,8 +449,8 @@ class WhatsAppAdapter(BasePlatformAdapter):
self,
chat_id: str,
image_url: str,
caption: str | None = None,
reply_to: str | None = None,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Download image URL to cache, send natively via bridge."""
try:
@@ -463,8 +463,8 @@ class WhatsAppAdapter(BasePlatformAdapter):
self,
chat_id: str,
image_path: str,
caption: str | None = None,
reply_to: str | None = None,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Send a local image file natively via bridge."""
return await self._send_media_to_bridge(chat_id, image_path, "image", caption)
@@ -473,8 +473,8 @@ class WhatsAppAdapter(BasePlatformAdapter):
self,
chat_id: str,
video_path: str,
caption: str | None = None,
reply_to: str | None = None,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Send a video natively via bridge — plays inline in WhatsApp."""
return await self._send_media_to_bridge(chat_id, video_path, "video", caption)
@@ -483,16 +483,13 @@ class WhatsAppAdapter(BasePlatformAdapter):
self,
chat_id: str,
file_path: str,
caption: str | None = None,
file_name: str | None = None,
reply_to: str | None = None,
caption: Optional[str] = None,
file_name: Optional[str] = None,
reply_to: Optional[str] = None,
) -> SendResult:
"""Send a document/file as a downloadable attachment via bridge."""
return await self._send_media_to_bridge(
chat_id,
file_path,
"document",
caption,
chat_id, file_path, "document", caption,
file_name or os.path.basename(file_path),
)
@@ -500,45 +497,44 @@ class WhatsAppAdapter(BasePlatformAdapter):
"""Send typing indicator via bridge."""
if not self._running:
return
try:
import aiohttp
async with aiohttp.ClientSession() as session:
await session.post(
f"http://localhost:{self._bridge_port}/typing",
json={"chatId": chat_id},
timeout=aiohttp.ClientTimeout(total=5),
timeout=aiohttp.ClientTimeout(total=5)
)
except Exception:
pass # Ignore typing indicator failures
async def get_chat_info(self, chat_id: str) -> dict[str, Any]:
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Get information about a WhatsApp chat."""
if not self._running:
return {"name": "Unknown", "type": "dm"}
try:
import aiohttp
async with (
aiohttp.ClientSession() as session,
session.get(
f"http://localhost:{self._bridge_port}/chat/{chat_id}", timeout=aiohttp.ClientTimeout(total=10)
) as resp,
):
if resp.status == 200:
data = await resp.json()
return {
"name": data.get("name", chat_id),
"type": "group" if data.get("isGroup") else "dm",
"participants": data.get("participants", []),
}
async with aiohttp.ClientSession() as session:
async with session.get(
f"http://localhost:{self._bridge_port}/chat/{chat_id}",
timeout=aiohttp.ClientTimeout(total=10)
) as resp:
if resp.status == 200:
data = await resp.json()
return {
"name": data.get("name", chat_id),
"type": "group" if data.get("isGroup") else "dm",
"participants": data.get("participants", []),
}
except Exception as e:
logger.debug("Could not get WhatsApp chat info for %s: %s", chat_id, e)
return {"name": chat_id, "type": "dm"}
async def _poll_messages(self) -> None:
"""Poll the bridge for incoming messages."""
try:
@@ -546,30 +542,29 @@ class WhatsAppAdapter(BasePlatformAdapter):
except ImportError:
print(f"[{self.name}] aiohttp not installed, message polling disabled")
return
while self._running:
try:
async with (
aiohttp.ClientSession() as session,
session.get(
f"http://localhost:{self._bridge_port}/messages", timeout=aiohttp.ClientTimeout(total=30)
) as resp,
):
if resp.status == 200:
messages = await resp.json()
for msg_data in messages:
event = await self._build_message_event(msg_data)
if event:
await self.handle_message(event)
async with aiohttp.ClientSession() as session:
async with session.get(
f"http://localhost:{self._bridge_port}/messages",
timeout=aiohttp.ClientTimeout(total=30)
) as resp:
if resp.status == 200:
messages = await resp.json()
for msg_data in messages:
event = await self._build_message_event(msg_data)
if event:
await self.handle_message(event)
except asyncio.CancelledError:
break
except Exception as e:
print(f"[{self.name}] Poll error: {e}")
await asyncio.sleep(5)
await asyncio.sleep(1) # Poll interval
async def _build_message_event(self, data: dict[str, Any]) -> MessageEvent | None:
async def _build_message_event(self, data: Dict[str, Any]) -> Optional[MessageEvent]:
"""Build a MessageEvent from bridge message data, downloading images to cache."""
try:
# Determine message type
@@ -584,11 +579,11 @@ class WhatsAppAdapter(BasePlatformAdapter):
msg_type = MessageType.VOICE
else:
msg_type = MessageType.DOCUMENT
# Determine chat type
is_group = data.get("isGroup", False)
chat_type = "group" if is_group else "dm"
# Build source
source = self.build_source(
chat_id=data.get("chatId", ""),
@@ -597,7 +592,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
user_id=data.get("senderId"),
user_name=data.get("senderName"),
)
# Download image media URLs to the local cache so the vision tool
# can access them reliably regardless of URL expiration.
raw_urls = data.get("mediaUrls", [])
@@ -627,7 +622,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
else:
cached_urls.append(url)
media_types.append("unknown")
return MessageEvent(
text=data.get("body", ""),
message_type=msg_type,
@@ -640,3 +635,4 @@ class WhatsAppAdapter(BasePlatformAdapter):
except Exception as e:
print(f"[{self.name}] Error building event: {e}")
return None

File diff suppressed because it is too large Load Diff

View File

@@ -8,20 +8,22 @@ Handles:
- Dynamic system prompt injection (agent knows its context)
"""
import json
import logging
import os
import json
import uuid
from dataclasses import dataclass
from datetime import datetime, timedelta
from pathlib import Path
from typing import Any
from datetime import datetime, timedelta
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Any
logger = logging.getLogger(__name__)
from .config import (
GatewayConfig,
HomeChannel,
Platform,
GatewayConfig,
SessionResetPolicy,
HomeChannel,
)
@@ -29,30 +31,27 @@ from .config import (
class SessionSource:
"""
Describes where a message originated from.
This information is used to:
1. Route responses back to the right place
2. Inject context into the system prompt
3. Track origin for cron job delivery
"""
platform: Platform
chat_id: str
chat_name: str | None = None
chat_name: Optional[str] = None
chat_type: str = "dm" # "dm", "group", "channel", "thread"
user_id: str | None = None
user_name: str | None = None
thread_id: str | None = None # For forum topics, Discord threads, etc.
chat_topic: str | None = None # Channel topic/description (Discord, Slack)
user_id_alt: str | None = None # Signal UUID (alternative to phone number)
chat_id_alt: str | None = None # Signal group internal ID
user_id: Optional[str] = None
user_name: Optional[str] = None
thread_id: Optional[str] = None # For forum topics, Discord threads, etc.
chat_topic: Optional[str] = None # Channel topic/description (Discord, Slack)
@property
def description(self) -> str:
"""Human-readable description of the source."""
if self.platform == Platform.LOCAL:
return "CLI terminal"
parts = []
if self.chat_type == "dm":
parts.append(f"DM with {self.user_name or self.user_id or 'user'}")
@@ -62,14 +61,14 @@ class SessionSource:
parts.append(f"channel: {self.chat_name or self.chat_id}")
else:
parts.append(self.chat_name or self.chat_id)
if self.thread_id:
parts.append(f"thread: {self.thread_id}")
return ", ".join(parts)
def to_dict(self) -> dict[str, Any]:
d = {
def to_dict(self) -> Dict[str, Any]:
return {
"platform": self.platform.value,
"chat_id": self.chat_id,
"chat_name": self.chat_name,
@@ -79,14 +78,9 @@ class SessionSource:
"thread_id": self.thread_id,
"chat_topic": self.chat_topic,
}
if self.user_id_alt:
d["user_id_alt"] = self.user_id_alt
if self.chat_id_alt:
d["chat_id_alt"] = self.chat_id_alt
return d
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "SessionSource":
def from_dict(cls, data: Dict[str, Any]) -> "SessionSource":
return cls(
platform=Platform(data["platform"]),
chat_id=str(data["chat_id"]),
@@ -96,10 +90,8 @@ class SessionSource:
user_name=data.get("user_name"),
thread_id=data.get("thread_id"),
chat_topic=data.get("chat_topic"),
user_id_alt=data.get("user_id_alt"),
chat_id_alt=data.get("chat_id_alt"),
)
@classmethod
def local_cli(cls) -> "SessionSource":
"""Create a source representing the local CLI."""
@@ -115,28 +107,29 @@ class SessionSource:
class SessionContext:
"""
Full context for a session, used for dynamic system prompt injection.
The agent receives this information to understand:
- Where messages are coming from
- What platforms are available
- Where it can deliver scheduled task outputs
"""
source: SessionSource
connected_platforms: list[Platform]
home_channels: dict[Platform, HomeChannel]
connected_platforms: List[Platform]
home_channels: Dict[Platform, HomeChannel]
# Session metadata
session_key: str = ""
session_id: str = ""
created_at: datetime | None = None
updated_at: datetime | None = None
def to_dict(self) -> dict[str, Any]:
created_at: Optional[datetime] = None
updated_at: Optional[datetime] = None
def to_dict(self) -> Dict[str, Any]:
return {
"source": self.source.to_dict(),
"connected_platforms": [p.value for p in self.connected_platforms],
"home_channels": {p.value: hc.to_dict() for p, hc in self.home_channels.items()},
"home_channels": {
p.value: hc.to_dict() for p, hc in self.home_channels.items()
},
"session_key": self.session_key,
"session_id": self.session_id,
"created_at": self.created_at.isoformat() if self.created_at else None,
@@ -147,7 +140,7 @@ class SessionContext:
def build_session_context_prompt(context: SessionContext) -> str:
"""
Build the dynamic system prompt section that tells the agent about its context.
This is injected into the system prompt so the agent knows:
- Where messages are coming from
- What platforms are connected
@@ -157,14 +150,14 @@ def build_session_context_prompt(context: SessionContext) -> str:
"## Current Session Context",
"",
]
# Source info
platform_name = context.source.platform.value.title()
if context.source.platform == Platform.LOCAL:
lines.append(f"**Source:** {platform_name} (the machine running this agent)")
else:
lines.append(f"**Source:** {platform_name} ({context.source.description})")
# Channel topic (if available - provides context about the channel's purpose)
if context.source.chat_topic:
lines.append(f"**Channel Topic:** {context.source.chat_topic}")
@@ -174,43 +167,43 @@ def build_session_context_prompt(context: SessionContext) -> str:
lines.append(f"**User:** {context.source.user_name}")
elif context.source.user_id:
lines.append(f"**User ID:** {context.source.user_id}")
# Connected platforms
platforms_list = ["local (files on this machine)"]
for p in context.connected_platforms:
if p != Platform.LOCAL:
platforms_list.append(f"{p.value}: Connected ✓")
lines.append(f"**Connected Platforms:** {', '.join(platforms_list)}")
# Home channels
if context.home_channels:
lines.append("")
lines.append("**Home Channels (default destinations):**")
for platform, home in context.home_channels.items():
lines.append(f" - {platform.value}: {home.name} (ID: {home.chat_id})")
# Delivery options for scheduled tasks
lines.append("")
lines.append("**Delivery options for scheduled tasks:**")
# Origin delivery
if context.source.platform == Platform.LOCAL:
lines.append('- `"origin"` → Local output (saved to files)')
lines.append("- `\"origin\"` → Local output (saved to files)")
else:
lines.append(f'- `"origin"` → Back to this chat ({context.source.chat_name or context.source.chat_id})')
lines.append(f"- `\"origin\"` → Back to this chat ({context.source.chat_name or context.source.chat_id})")
# Local always available
lines.append('- `"local"` → Save to local files only (~/.hermes/cron/output/)')
lines.append("- `\"local\"` → Save to local files only (~/.hermes/cron/output/)")
# Platform home channels
for platform, home in context.home_channels.items():
lines.append(f'- `"{platform.value}"` → Home channel ({home.name})')
lines.append(f"- `\"{platform.value}\"` → Home channel ({home.name})")
# Note about explicit targeting
lines.append("")
lines.append('*For explicit targeting, use `"platform:chat_id"` format if the user provides a specific chat ID.*')
lines.append("*For explicit targeting, use `\"platform:chat_id\"` format if the user provides a specific chat ID.*")
return "\n".join(lines)
@@ -218,33 +211,32 @@ def build_session_context_prompt(context: SessionContext) -> str:
class SessionEntry:
"""
Entry in the session store.
Maps a session key to its current session ID and metadata.
"""
session_key: str
session_id: str
created_at: datetime
updated_at: datetime
# Origin metadata for delivery routing
origin: SessionSource | None = None
origin: Optional[SessionSource] = None
# Display metadata
display_name: str | None = None
platform: Platform | None = None
display_name: Optional[str] = None
platform: Optional[Platform] = None
chat_type: str = "dm"
# Token tracking
input_tokens: int = 0
output_tokens: int = 0
total_tokens: int = 0
# Set when a session was created because the previous one expired;
# consumed once by the message handler to inject a notice into context
was_auto_reset: bool = False
def to_dict(self) -> dict[str, Any]:
def to_dict(self) -> Dict[str, Any]:
result = {
"session_key": self.session_key,
"session_id": self.session_id,
@@ -260,20 +252,20 @@ class SessionEntry:
if self.origin:
result["origin"] = self.origin.to_dict()
return result
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "SessionEntry":
def from_dict(cls, data: Dict[str, Any]) -> "SessionEntry":
origin = None
if "origin" in data and data["origin"]:
origin = SessionSource.from_dict(data["origin"])
platform = None
if data.get("platform"):
try:
platform = Platform(data["platform"])
except ValueError:
pass
return cls(
session_key=data["session_key"],
session_id=data["session_id"],
@@ -306,106 +298,65 @@ def build_session_key(source: SessionSource) -> str:
class SessionStore:
"""
Manages session storage and retrieval.
Uses SQLite (via SessionDB) for session metadata and message transcripts.
Falls back to legacy JSONL files if SQLite is unavailable.
"""
def __init__(self, sessions_dir: Path, config: GatewayConfig, has_active_processes_fn=None, on_auto_reset=None):
def __init__(self, sessions_dir: Path, config: GatewayConfig,
has_active_processes_fn=None,
on_auto_reset=None):
self.sessions_dir = sessions_dir
self.config = config
self._entries: dict[str, SessionEntry] = {}
self._entries: Dict[str, SessionEntry] = {}
self._loaded = False
self._has_active_processes_fn = has_active_processes_fn
# on_auto_reset is deprecated — memory flush now runs proactively
# via the background session expiry watcher in GatewayRunner.
self._pre_flushed_sessions: set = set() # session_ids already flushed by watcher
self._on_auto_reset = on_auto_reset # callback(old_entry) before auto-reset
# Initialize SQLite session database
self._db = None
try:
from hermes_state import SessionDB
self._db = SessionDB()
except Exception as e:
print(f"[gateway] Warning: SQLite session store unavailable, falling back to JSONL: {e}")
def _ensure_loaded(self) -> None:
"""Load sessions index from disk if not already loaded."""
if self._loaded:
return
self.sessions_dir.mkdir(parents=True, exist_ok=True)
sessions_file = self.sessions_dir / "sessions.json"
if sessions_file.exists():
try:
with open(sessions_file, encoding="utf-8") as f:
with open(sessions_file, "r") as f:
data = json.load(f)
for key, entry_data in data.items():
self._entries[key] = SessionEntry.from_dict(entry_data)
except Exception as e:
print(f"[gateway] Warning: Failed to load sessions: {e}")
self._loaded = True
def _save(self) -> None:
"""Save sessions index to disk (kept for session key -> ID mapping)."""
self.sessions_dir.mkdir(parents=True, exist_ok=True)
sessions_file = self.sessions_dir / "sessions.json"
data = {key: entry.to_dict() for key, entry in self._entries.items()}
with open(sessions_file, "w", encoding="utf-8") as f:
with open(sessions_file, "w") as f:
json.dump(data, f, indent=2)
def _generate_session_key(self, source: SessionSource) -> str:
"""Generate a session key from a source."""
return build_session_key(source)
def _is_session_expired(self, entry: SessionEntry) -> bool:
"""Check if a session has expired based on its reset policy.
Works from the entry alone — no SessionSource needed.
Used by the background expiry watcher to proactively flush memories.
Sessions with active background processes are never considered expired.
"""
if self._has_active_processes_fn:
if self._has_active_processes_fn(entry.session_key):
return False
policy = self.config.get_reset_policy(
platform=entry.platform,
session_type=entry.chat_type,
)
if policy.mode == "none":
return False
now = datetime.now()
if policy.mode in ("idle", "both"):
idle_deadline = entry.updated_at + timedelta(minutes=policy.idle_minutes)
if now > idle_deadline:
return True
if policy.mode in ("daily", "both"):
today_reset = now.replace(
hour=policy.at_hour,
minute=0,
second=0,
microsecond=0,
)
if now.hour < policy.at_hour:
today_reset -= timedelta(days=1)
if entry.updated_at < today_reset:
return True
return False
def _should_reset(self, entry: SessionEntry, source: SessionSource) -> bool:
"""
Check if a session should be reset based on policy.
Sessions with active background processes are never reset.
"""
if self._has_active_processes_fn:
@@ -413,28 +364,36 @@ class SessionStore:
if self._has_active_processes_fn(session_key):
return False
policy = self.config.get_reset_policy(platform=source.platform, session_type=source.chat_type)
policy = self.config.get_reset_policy(
platform=source.platform,
session_type=source.chat_type
)
if policy.mode == "none":
return False
now = datetime.now()
if policy.mode in ("idle", "both"):
idle_deadline = entry.updated_at + timedelta(minutes=policy.idle_minutes)
if now > idle_deadline:
return True
if policy.mode in ("daily", "both"):
today_reset = now.replace(hour=policy.at_hour, minute=0, second=0, microsecond=0)
today_reset = now.replace(
hour=policy.at_hour,
minute=0,
second=0,
microsecond=0
)
if now.hour < policy.at_hour:
today_reset -= timedelta(days=1)
if entry.updated_at < today_reset:
return True
return False
def has_any_sessions(self) -> bool:
"""Check if any sessions have ever been created (across all platforms).
@@ -455,32 +414,38 @@ class SessionStore:
# This covers the rare case where the DB is unavailable.
self._ensure_loaded()
return len(self._entries) > 1
def get_or_create_session(self, source: SessionSource, force_new: bool = False) -> SessionEntry:
def get_or_create_session(
self,
source: SessionSource,
force_new: bool = False
) -> SessionEntry:
"""
Get an existing session or create a new one.
Evaluates reset policy to determine if the existing session is stale.
Creates a session record in SQLite when a new session starts.
"""
self._ensure_loaded()
session_key = self._generate_session_key(source)
now = datetime.now()
if session_key in self._entries and not force_new:
entry = self._entries[session_key]
if not self._should_reset(entry, source):
entry.updated_at = now
self._save()
return entry
else:
# Session is being auto-reset. The background expiry watcher
# should have already flushed memories proactively; discard
# the marker so it doesn't accumulate.
# Session is being auto-reset — flush memories before destroying
was_auto_reset = True
self._pre_flushed_sessions.discard(entry.session_id)
if self._on_auto_reset:
try:
self._on_auto_reset(entry)
except Exception as e:
logger.debug("Auto-reset callback failed: %s", e)
if self._db:
try:
self._db.end_session(entry.session_id, "session_reset")
@@ -488,10 +453,10 @@ class SessionStore:
logger.debug("Session DB operation failed: %s", e)
else:
was_auto_reset = False
# Create new session
session_id = f"{now.strftime('%Y%m%d_%H%M%S')}_{uuid.uuid4().hex[:8]}"
entry = SessionEntry(
session_key=session_key,
session_id=session_id,
@@ -503,10 +468,10 @@ class SessionStore:
chat_type=source.chat_type,
was_auto_reset=was_auto_reset,
)
self._entries[session_key] = entry
self._save()
# Create session in SQLite
if self._db:
try:
@@ -517,13 +482,18 @@ class SessionStore:
)
except Exception as e:
print(f"[gateway] Warning: Failed to create SQLite session: {e}")
return entry
def update_session(self, session_key: str, input_tokens: int = 0, output_tokens: int = 0) -> None:
def update_session(
self,
session_key: str,
input_tokens: int = 0,
output_tokens: int = 0
) -> None:
"""Update a session's metadata after an interaction."""
self._ensure_loaded()
if session_key in self._entries:
entry = self._entries[session_key]
entry.updated_at = datetime.now()
@@ -531,32 +501,34 @@ class SessionStore:
entry.output_tokens += output_tokens
entry.total_tokens = entry.input_tokens + entry.output_tokens
self._save()
if self._db:
try:
self._db.update_token_counts(entry.session_id, input_tokens, output_tokens)
self._db.update_token_counts(
entry.session_id, input_tokens, output_tokens
)
except Exception as e:
logger.debug("Session DB operation failed: %s", e)
def reset_session(self, session_key: str) -> SessionEntry | None:
def reset_session(self, session_key: str) -> Optional[SessionEntry]:
"""Force reset a session, creating a new session ID."""
self._ensure_loaded()
if session_key not in self._entries:
return None
old_entry = self._entries[session_key]
# End old session in SQLite
if self._db:
try:
self._db.end_session(old_entry.session_id, "session_reset")
except Exception as e:
logger.debug("Session DB operation failed: %s", e)
now = datetime.now()
session_id = f"{now.strftime('%Y%m%d_%H%M%S')}_{uuid.uuid4().hex[:8]}"
new_entry = SessionEntry(
session_key=session_key,
session_id=session_id,
@@ -567,10 +539,10 @@ class SessionStore:
platform=old_entry.platform,
chat_type=old_entry.chat_type,
)
self._entries[session_key] = new_entry
self._save()
# Create new session in SQLite
if self._db:
try:
@@ -581,70 +553,28 @@ class SessionStore:
)
except Exception as e:
logger.debug("Session DB operation failed: %s", e)
return new_entry
def switch_session(self, session_key: str, target_session_id: str) -> SessionEntry | None:
"""Switch a session key to point at an existing session ID.
Used by ``/resume`` to restore a previously-named session.
Ends the current session in SQLite (like reset), but instead of
generating a fresh session ID, re-uses ``target_session_id`` so the
old transcript is loaded on the next message.
"""
self._ensure_loaded()
if session_key not in self._entries:
return None
old_entry = self._entries[session_key]
# Don't switch if already on that session
if old_entry.session_id == target_session_id:
return old_entry
# End the current session in SQLite
if self._db:
try:
self._db.end_session(old_entry.session_id, "session_switch")
except Exception as e:
logger.debug("Session DB end_session failed: %s", e)
now = datetime.now()
new_entry = SessionEntry(
session_key=session_key,
session_id=target_session_id,
created_at=now,
updated_at=now,
origin=old_entry.origin,
display_name=old_entry.display_name,
platform=old_entry.platform,
chat_type=old_entry.chat_type,
)
self._entries[session_key] = new_entry
self._save()
return new_entry
def list_sessions(self, active_minutes: int | None = None) -> list[SessionEntry]:
def list_sessions(self, active_minutes: Optional[int] = None) -> List[SessionEntry]:
"""List all sessions, optionally filtered by activity."""
self._ensure_loaded()
entries = list(self._entries.values())
if active_minutes is not None:
cutoff = datetime.now() - timedelta(minutes=active_minutes)
entries = [e for e in entries if e.updated_at >= cutoff]
entries.sort(key=lambda e: e.updated_at, reverse=True)
return entries
def get_transcript_path(self, session_id: str) -> Path:
"""Get the path to a session's legacy transcript file."""
return self.sessions_dir / f"{session_id}.jsonl"
def append_to_transcript(self, session_id: str, message: dict[str, Any]) -> None:
def append_to_transcript(self, session_id: str, message: Dict[str, Any]) -> None:
"""Append a message to a session's transcript (SQLite + legacy JSONL)."""
# Write to SQLite
if self._db:
@@ -659,15 +589,15 @@ class SessionStore:
)
except Exception as e:
logger.debug("Session DB operation failed: %s", e)
# Also write legacy JSONL (keeps existing tooling working during transition)
transcript_path = self.get_transcript_path(session_id)
with open(transcript_path, "a", encoding="utf-8") as f:
with open(transcript_path, "a") as f:
f.write(json.dumps(message, ensure_ascii=False) + "\n")
def rewrite_transcript(self, session_id: str, messages: list[dict[str, Any]]) -> None:
def rewrite_transcript(self, session_id: str, messages: List[Dict[str, Any]]) -> None:
"""Replace the entire transcript for a session with new messages.
Used by /retry, /undo, and /compress to persist modified conversation history.
Rewrites both SQLite and legacy JSONL storage.
"""
@@ -686,14 +616,14 @@ class SessionStore:
)
except Exception as e:
logger.debug("Failed to rewrite transcript in DB: %s", e)
# JSONL: overwrite the file
transcript_path = self.get_transcript_path(session_id)
with open(transcript_path, "w", encoding="utf-8") as f:
with open(transcript_path, "w") as f:
for msg in messages:
f.write(json.dumps(msg, ensure_ascii=False) + "\n")
def load_transcript(self, session_id: str) -> list[dict[str, Any]]:
def load_transcript(self, session_id: str) -> List[Dict[str, Any]]:
"""Load all messages from a session's transcript."""
# Try SQLite first
if self._db:
@@ -703,49 +633,51 @@ class SessionStore:
return messages
except Exception as e:
logger.debug("Could not load messages from DB: %s", e)
# Fall back to legacy JSONL
transcript_path = self.get_transcript_path(session_id)
if not transcript_path.exists():
return []
messages = []
with open(transcript_path, encoding="utf-8") as f:
with open(transcript_path, "r") as f:
for line in f:
line = line.strip()
if line:
messages.append(json.loads(line))
return messages
def build_session_context(
source: SessionSource, config: GatewayConfig, session_entry: SessionEntry | None = None
source: SessionSource,
config: GatewayConfig,
session_entry: Optional[SessionEntry] = None
) -> SessionContext:
"""
Build a full session context from a source and config.
This is used to inject context into the agent's system prompt.
"""
connected = config.get_connected_platforms()
home_channels = {}
for platform in connected:
home = config.get_home_channel(platform)
if home:
home_channels[platform] = home
context = SessionContext(
source=source,
connected_platforms=connected,
home_channels=home_channels,
)
if session_entry:
context.session_key = session_entry.session_key
context.session_id = session_entry.session_id
context.created_at = session_entry.created_at
context.updated_at = session_entry.updated_at
return context

View File

@@ -13,6 +13,7 @@ concurrently under distinct configurations).
import os
from pathlib import Path
from typing import Optional
def _get_pid_path() -> Path:
@@ -36,7 +37,7 @@ def remove_pid_file() -> None:
pass
def get_running_pid() -> int | None:
def get_running_pid() -> Optional[int]:
"""Return the PID of a running gateway instance, or ``None``.
Checks the PID file and verifies the process is actually alive.

View File

@@ -12,6 +12,8 @@ import json
import os
import time
from pathlib import Path
from typing import Optional
CACHE_PATH = Path(os.path.expanduser("~/.hermes/sticker_cache.json"))
@@ -41,7 +43,7 @@ def _save_cache(cache: dict) -> None:
)
def get_cached_description(file_unique_id: str) -> dict | None:
def get_cached_description(file_unique_id: str) -> Optional[dict]:
"""
Look up a cached sticker description.
@@ -90,11 +92,11 @@ def build_sticker_injection(
"""
context = ""
if set_name and emoji:
context = f' {emoji} from "{set_name}"'
context = f" {emoji} from \"{set_name}\""
elif emoji:
context = f" {emoji}"
return f'[The user sent a sticker{context}~ It shows: "{description}" (=^.w.^=)]'
return f"[The user sent a sticker{context}~ It shows: \"{description}\" (=^.w.^=)]"
def build_animated_sticker_injection(emoji: str = "") -> str:

View File

@@ -5,7 +5,7 @@ Provides subcommands for:
- hermes chat - Interactive chat (same as ./hermes)
- hermes gateway - Run gateway in foreground
- hermes gateway start - Start gateway service
- hermes gateway stop - Stop gateway service
- hermes gateway stop - Stop gateway service
- hermes setup - Interactive setup wizard
- hermes status - Show status of all components
- hermes cron - Manage cron jobs

File diff suppressed because it is too large Load Diff

View File

@@ -9,13 +9,15 @@ import os
import subprocess
import time
from pathlib import Path
from typing import Dict, List, Any, Optional
from prompt_toolkit import print_formatted_text as _pt_print
from prompt_toolkit.formatted_text import ANSI as _PT_ANSI
from rich.console import Console
from rich.panel import Panel
from rich.table import Table
from prompt_toolkit import print_formatted_text as _pt_print
from prompt_toolkit.formatted_text import ANSI as _PT_ANSI
logger = logging.getLogger(__name__)
@@ -75,8 +77,7 @@ COMPACT_BANNER = """
# Skills scanning
# =========================================================================
def get_available_skills() -> dict[str, list[str]]:
def get_available_skills() -> Dict[str, List[str]]:
"""Scan ~/.hermes/skills/ and return skills grouped by category."""
import os
@@ -109,7 +110,7 @@ def get_available_skills() -> dict[str, list[str]]:
_UPDATE_CHECK_CACHE_SECONDS = 6 * 3600
def check_for_updates() -> int | None:
def check_for_updates() -> Optional[int]:
"""Check how many commits behind origin/main the local repo is.
Does a ``git fetch`` at most once every 6 hours (cached to
@@ -138,8 +139,7 @@ def check_for_updates() -> int | None:
try:
subprocess.run(
["git", "fetch", "origin", "--quiet"],
capture_output=True,
timeout=10,
capture_output=True, timeout=10,
cwd=str(repo_dir),
)
except Exception:
@@ -149,9 +149,7 @@ def check_for_updates() -> int | None:
try:
result = subprocess.run(
["git", "rev-list", "--count", "HEAD..origin/main"],
capture_output=True,
text=True,
timeout=5,
capture_output=True, text=True, timeout=5,
cwd=str(repo_dir),
)
if result.returncode == 0:
@@ -174,7 +172,6 @@ def check_for_updates() -> int | None:
# Welcome banner
# =========================================================================
def _format_context_length(tokens: int) -> str:
"""Format a token count for display (e.g. 128000 → '128K', 1048576 → '1M')."""
if tokens >= 1_000_000:
@@ -186,16 +183,12 @@ def _format_context_length(tokens: int) -> str:
return str(tokens)
def build_welcome_banner(
console: Console,
model: str,
cwd: str,
tools: list[dict] = None,
enabled_toolsets: list[str] = None,
session_id: str = None,
get_toolset_for_tool=None,
context_length: int = None,
):
def build_welcome_banner(console: Console, model: str, cwd: str,
tools: List[dict] = None,
enabled_toolsets: List[str] = None,
session_id: str = None,
get_toolset_for_tool=None,
context_length: int = None):
"""Build and print a welcome banner with caduceus on left and info on right.
Args:
@@ -208,8 +201,7 @@ def build_welcome_banner(
get_toolset_for_tool: Callable to map tool name -> toolset name.
context_length: Model's context window size in tokens.
"""
from model_tools import check_tool_availability
from model_tools import check_tool_availability, TOOLSET_REQUIREMENTS
if get_toolset_for_tool is None:
from model_tools import get_toolset_for_tool
@@ -229,9 +221,7 @@ def build_welcome_banner(
model_short = model.split("/")[-1] if "/" in model else model
if len(model_short) > 28:
model_short = model_short[:25] + "..."
ctx_str = (
f" [dim #B8860B]·[/] [dim #B8860B]{_format_context_length(context_length)} context[/]" if context_length else ""
)
ctx_str = f" [dim #B8860B]·[/] [dim #B8860B]{_format_context_length(context_length)} context[/]" if context_length else ""
left_lines.append(f"[#FFBF00]{model_short}[/]{ctx_str} [dim #B8860B]·[/] [dim #B8860B]Nous Research[/]")
left_lines.append(f"[dim #B8860B]{cwd}[/]")
if session_id:
@@ -239,7 +229,7 @@ def build_welcome_banner(
left_content = "\n".join(left_lines)
right_lines = ["[bold #FFBF00]Available Tools[/]"]
toolsets_dict: dict[str, list] = {}
toolsets_dict: Dict[str, list] = {}
for tool in tools:
tool_name = tool["function"]["name"]
@@ -296,7 +286,6 @@ def build_welcome_banner(
# MCP Servers section (only if configured)
try:
from tools.mcp_tool import get_mcp_status
mcp_status = get_mcp_status()
except Exception:
mcp_status = []
@@ -311,7 +300,10 @@ def build_welcome_banner(
f"[dim #B8860B]—[/] [#FFF8DC]{srv['tools']} tool(s)[/]"
)
else:
right_lines.append(f"[red]{srv['name']}[/] [dim]({srv['transport']})[/] [red]— failed[/]")
right_lines.append(
f"[red]{srv['name']}[/] [dim]({srv['transport']})[/] "
f"[red]— failed[/]"
)
right_lines.append("")
right_lines.append("[bold #FFBF00]Available Skills[/]")

View File

@@ -9,7 +9,7 @@ with the TUI.
import queue
import time as _time
from hermes_cli.banner import _DIM, _RST, cprint
from hermes_cli.banner import cprint, _DIM, _RST
def clarify_callback(cli, question, choices):
@@ -33,7 +33,7 @@ def clarify_callback(cli, question, choices):
cli._clarify_deadline = _time.monotonic() + timeout
cli._clarify_freetext = is_open_ended
if hasattr(cli, "_app") and cli._app:
if hasattr(cli, '_app') and cli._app:
cli._app.invalidate()
while True:
@@ -45,13 +45,13 @@ def clarify_callback(cli, question, choices):
remaining = cli._clarify_deadline - _time.monotonic()
if remaining <= 0:
break
if hasattr(cli, "_app") and cli._app:
if hasattr(cli, '_app') and cli._app:
cli._app.invalidate()
cli._clarify_state = None
cli._clarify_freetext = False
cli._clarify_deadline = 0
if hasattr(cli, "_app") and cli._app:
if hasattr(cli, '_app') and cli._app:
cli._app.invalidate()
cprint(f"\n{_DIM}(clarify timed out after {timeout}s — agent will decide){_RST}")
return (
@@ -71,7 +71,7 @@ def sudo_password_callback(cli) -> str:
cli._sudo_state = {"response_queue": response_queue}
cli._sudo_deadline = _time.monotonic() + timeout
if hasattr(cli, "_app") and cli._app:
if hasattr(cli, '_app') and cli._app:
cli._app.invalidate()
while True:
@@ -79,7 +79,7 @@ def sudo_password_callback(cli) -> str:
result = response_queue.get(timeout=1)
cli._sudo_state = None
cli._sudo_deadline = 0
if hasattr(cli, "_app") and cli._app:
if hasattr(cli, '_app') and cli._app:
cli._app.invalidate()
if result:
cprint(f"\n{_DIM} ✓ Password received (cached for session){_RST}")
@@ -90,12 +90,12 @@ def sudo_password_callback(cli) -> str:
remaining = cli._sudo_deadline - _time.monotonic()
if remaining <= 0:
break
if hasattr(cli, "_app") and cli._app:
if hasattr(cli, '_app') and cli._app:
cli._app.invalidate()
cli._sudo_state = None
cli._sudo_deadline = 0
if hasattr(cli, "_app") and cli._app:
if hasattr(cli, '_app') and cli._app:
cli._app.invalidate()
cprint(f"\n{_DIM} ⏱ Timeout — continuing without sudo{_RST}")
return ""
@@ -119,7 +119,7 @@ def approval_callback(cli, command: str, description: str) -> str:
}
cli._approval_deadline = _time.monotonic() + timeout
if hasattr(cli, "_app") and cli._app:
if hasattr(cli, '_app') and cli._app:
cli._app.invalidate()
while True:
@@ -127,19 +127,19 @@ def approval_callback(cli, command: str, description: str) -> str:
result = response_queue.get(timeout=1)
cli._approval_state = None
cli._approval_deadline = 0
if hasattr(cli, "_app") and cli._app:
if hasattr(cli, '_app') and cli._app:
cli._app.invalidate()
return result
except queue.Empty:
remaining = cli._approval_deadline - _time.monotonic()
if remaining <= 0:
break
if hasattr(cli, "_app") and cli._app:
if hasattr(cli, '_app') and cli._app:
cli._app.invalidate()
cli._approval_state = None
cli._approval_deadline = 0
if hasattr(cli, "_app") and cli._app:
if hasattr(cli, '_app') and cli._app:
cli._app.invalidate()
cprint(f"\n{_DIM} ⏱ Timeout — denying command{_RST}")
return "deny"

View File

@@ -51,7 +51,6 @@ def has_clipboard_image() -> bool:
# ── macOS ────────────────────────────────────────────────────────────────
def _macos_save(dest: Path) -> bool:
"""Try pngpaste first (fast, handles more formats), fall back to osascript."""
return _macos_pngpaste(dest) or _macos_osascript(dest)
@@ -62,9 +61,7 @@ def _macos_has_image() -> bool:
try:
info = subprocess.run(
["osascript", "-e", "clipboard info"],
capture_output=True,
text=True,
timeout=3,
capture_output=True, text=True, timeout=3,
)
return "«class PNGf»" in info.stdout or "«class TIFF»" in info.stdout
except Exception:
@@ -76,8 +73,7 @@ def _macos_pngpaste(dest: Path) -> bool:
try:
r = subprocess.run(
["pngpaste", str(dest)],
capture_output=True,
timeout=3,
capture_output=True, timeout=3,
)
if r.returncode == 0 and dest.exists() and dest.stat().st_size > 0:
return True
@@ -95,21 +91,19 @@ def _macos_osascript(dest: Path) -> bool:
# Extract as PNG
script = (
"try\n"
" set imgData to the clipboard as «class PNGf»\n"
'try\n'
' set imgData to the clipboard as «class PNGf»\n'
f' set f to open for access POSIX file "{dest}" with write permission\n'
" write imgData to f\n"
" close access f\n"
"on error\n"
' write imgData to f\n'
' close access f\n'
'on error\n'
' return "fail"\n'
"end try\n"
'end try\n'
)
try:
r = subprocess.run(
["osascript", "-e", script],
capture_output=True,
text=True,
timeout=5,
capture_output=True, text=True, timeout=5,
)
if r.returncode == 0 and "fail" not in r.stdout and dest.exists() and dest.stat().st_size > 0:
return True
@@ -120,14 +114,13 @@ def _macos_osascript(dest: Path) -> bool:
# ── Linux ────────────────────────────────────────────────────────────────
def _is_wsl() -> bool:
"""Detect if running inside WSL (1 or 2)."""
global _wsl_detected
if _wsl_detected is not None:
return _wsl_detected
try:
with open("/proc/version") as f:
with open("/proc/version", "r") as f:
_wsl_detected = "microsoft" in f.read().lower()
except Exception:
_wsl_detected = False
@@ -152,7 +145,10 @@ def _linux_save(dest: Path) -> bool:
# PowerShell script: get clipboard image as base64-encoded PNG on stdout.
# Using .NET System.Windows.Forms.Clipboard — always available on Windows.
_PS_CHECK_IMAGE = "Add-Type -AssemblyName System.Windows.Forms;[System.Windows.Forms.Clipboard]::ContainsImage()"
_PS_CHECK_IMAGE = (
"Add-Type -AssemblyName System.Windows.Forms;"
"[System.Windows.Forms.Clipboard]::ContainsImage()"
)
_PS_EXTRACT_IMAGE = (
"Add-Type -AssemblyName System.Windows.Forms;"
@@ -169,10 +165,9 @@ def _wsl_has_image() -> bool:
"""Check if Windows clipboard has an image (via powershell.exe)."""
try:
r = subprocess.run(
["powershell.exe", "-NoProfile", "-NonInteractive", "-Command", _PS_CHECK_IMAGE],
capture_output=True,
text=True,
timeout=8,
["powershell.exe", "-NoProfile", "-NonInteractive", "-Command",
_PS_CHECK_IMAGE],
capture_output=True, text=True, timeout=8,
)
return r.returncode == 0 and "True" in r.stdout
except FileNotFoundError:
@@ -186,10 +181,9 @@ def _wsl_save(dest: Path) -> bool:
"""Extract clipboard image via powershell.exe → base64 → decode to PNG."""
try:
r = subprocess.run(
["powershell.exe", "-NoProfile", "-NonInteractive", "-Command", _PS_EXTRACT_IMAGE],
capture_output=True,
text=True,
timeout=15,
["powershell.exe", "-NoProfile", "-NonInteractive", "-Command",
_PS_EXTRACT_IMAGE],
capture_output=True, text=True, timeout=15,
)
if r.returncode != 0:
return False
@@ -212,17 +206,16 @@ def _wsl_save(dest: Path) -> bool:
# ── Wayland (wl-paste) ──────────────────────────────────────────────────
def _wayland_has_image() -> bool:
"""Check if Wayland clipboard has image content."""
try:
r = subprocess.run(
["wl-paste", "--list-types"],
capture_output=True,
text=True,
timeout=3,
capture_output=True, text=True, timeout=3,
)
return r.returncode == 0 and any(
t.startswith("image/") for t in r.stdout.splitlines()
)
return r.returncode == 0 and any(t.startswith("image/") for t in r.stdout.splitlines())
except FileNotFoundError:
logger.debug("wl-paste not installed — Wayland clipboard unavailable")
except Exception:
@@ -236,9 +229,7 @@ def _wayland_save(dest: Path) -> bool:
# Check available MIME types
types_r = subprocess.run(
["wl-paste", "--list-types"],
capture_output=True,
text=True,
timeout=3,
capture_output=True, text=True, timeout=3,
)
if types_r.returncode != 0:
return False
@@ -246,7 +237,8 @@ def _wayland_save(dest: Path) -> bool:
# Prefer PNG, fall back to other image formats
mime = None
for preferred in ("image/png", "image/jpeg", "image/bmp", "image/gif", "image/webp"):
for preferred in ("image/png", "image/jpeg", "image/bmp",
"image/gif", "image/webp"):
if preferred in types:
mime = preferred
break
@@ -258,10 +250,7 @@ def _wayland_save(dest: Path) -> bool:
with open(dest, "wb") as f:
subprocess.run(
["wl-paste", "--type", mime],
stdout=f,
stderr=subprocess.DEVNULL,
timeout=5,
check=True,
stdout=f, stderr=subprocess.DEVNULL, timeout=5, check=True,
)
if not dest.exists() or dest.stat().st_size == 0:
@@ -287,7 +276,6 @@ def _convert_to_png(path: Path) -> bool:
# Try Pillow first (likely installed in the venv)
try:
from PIL import Image
img = Image.open(path)
img.save(path, "PNG")
return True
@@ -297,25 +285,20 @@ def _convert_to_png(path: Path) -> bool:
logger.debug("Pillow BMP→PNG conversion failed: %s", e)
# Fall back to ImageMagick convert
tmp = path.with_suffix(".bmp")
try:
tmp = path.with_suffix(".bmp")
path.rename(tmp)
r = subprocess.run(
["convert", str(tmp), "png:" + str(path)],
capture_output=True,
timeout=5,
capture_output=True, timeout=5,
)
tmp.unlink(missing_ok=True)
if r.returncode == 0 and path.exists() and path.stat().st_size > 0:
return True
except FileNotFoundError:
logger.debug("ImageMagick not installed — cannot convert BMP to PNG")
if tmp.exists() and not path.exists():
tmp.rename(path)
except Exception as e:
logger.debug("ImageMagick BMP→PNG conversion failed: %s", e)
if tmp.exists() and not path.exists():
tmp.rename(path)
# Can't convert — BMP is still usable as-is for most APIs
return path.exists() and path.stat().st_size > 0
@@ -323,15 +306,12 @@ def _convert_to_png(path: Path) -> bool:
# ── X11 (xclip) ─────────────────────────────────────────────────────────
def _xclip_has_image() -> bool:
"""Check if X11 clipboard has image content."""
try:
r = subprocess.run(
["xclip", "-selection", "clipboard", "-t", "TARGETS", "-o"],
capture_output=True,
text=True,
timeout=3,
capture_output=True, text=True, timeout=3,
)
return r.returncode == 0 and "image/png" in r.stdout
except FileNotFoundError:
@@ -347,9 +327,7 @@ def _xclip_save(dest: Path) -> bool:
try:
targets = subprocess.run(
["xclip", "-selection", "clipboard", "-t", "TARGETS", "-o"],
capture_output=True,
text=True,
timeout=3,
capture_output=True, text=True, timeout=3,
)
if "image/png" not in targets.stdout:
return False
@@ -364,10 +342,7 @@ def _xclip_save(dest: Path) -> bool:
with open(dest, "wb") as f:
subprocess.run(
["xclip", "-selection", "clipboard", "-t", "image/png", "-o"],
stdout=f,
stderr=subprocess.DEVNULL,
timeout=5,
check=True,
stdout=f, stderr=subprocess.DEVNULL, timeout=5, check=True,
)
if dest.exists() and dest.stat().st_size > 0:
return True

View File

@@ -4,12 +4,14 @@ from __future__ import annotations
import json
import logging
import os
from pathlib import Path
from typing import List, Optional
import os
logger = logging.getLogger(__name__)
DEFAULT_CODEX_MODELS: list[str] = [
DEFAULT_CODEX_MODELS: List[str] = [
"gpt-5.3-codex",
"gpt-5.2-codex",
"gpt-5.1-codex-max",
@@ -17,11 +19,10 @@ DEFAULT_CODEX_MODELS: list[str] = [
]
def _fetch_models_from_api(access_token: str) -> list[str]:
def _fetch_models_from_api(access_token: str) -> List[str]:
"""Fetch available models from the Codex API. Returns visible models sorted by priority."""
try:
import httpx
resp = httpx.get(
"https://chatgpt.com/backend-api/codex/models?client_version=1.0.0",
headers={"Authorization": f"Bearer {access_token}"},
@@ -46,7 +47,7 @@ def _fetch_models_from_api(access_token: str) -> list[str]:
if item.get("supported_in_api") is False:
continue
visibility = item.get("visibility", "")
if isinstance(visibility, str) and visibility.strip().lower() == "hidden":
if isinstance(visibility, str) and visibility.strip().lower() == "hide":
continue
priority = item.get("priority")
rank = int(priority) if isinstance(priority, (int, float)) else 10_000
@@ -56,7 +57,7 @@ def _fetch_models_from_api(access_token: str) -> list[str]:
return [slug for _, slug in sortable]
def _read_default_model(codex_home: Path) -> str | None:
def _read_default_model(codex_home: Path) -> Optional[str]:
config_path = codex_home / "config.toml"
if not config_path.exists():
return None
@@ -74,7 +75,7 @@ def _read_default_model(codex_home: Path) -> str | None:
return None
def _read_cache_models(codex_home: Path) -> list[str]:
def _read_cache_models(codex_home: Path) -> List[str]:
cache_path = codex_home / "models_cache.json"
if not cache_path.exists():
return []
@@ -93,6 +94,8 @@ def _read_cache_models(codex_home: Path) -> list[str]:
if not isinstance(slug, str) or not slug.strip():
continue
slug = slug.strip()
if "codex" not in slug.lower():
continue
if item.get("supported_in_api") is False:
continue
visibility = item.get("visibility")
@@ -103,22 +106,22 @@ def _read_cache_models(codex_home: Path) -> list[str]:
sortable.append((rank, slug))
sortable.sort(key=lambda item: (item[0], item[1]))
deduped: list[str] = []
deduped: List[str] = []
for _, slug in sortable:
if slug not in deduped:
deduped.append(slug)
return deduped
def get_codex_model_ids(access_token: str | None = None) -> list[str]:
def get_codex_model_ids(access_token: Optional[str] = None) -> List[str]:
"""Return available Codex model IDs, trying API first, then local sources.
Resolution order: API (live, if token provided) > config.toml default >
local cache > hardcoded defaults.
"""
codex_home_str = os.getenv("CODEX_HOME", "").strip() or str(Path.home() / ".codex")
codex_home = Path(codex_home_str).expanduser()
ordered: list[str] = []
ordered: List[str] = []
# Try live API if we have a token
if access_token:

View File

@@ -1,23 +1,17 @@
"""Slash command definitions and autocomplete for the Hermes CLI.
Contains the shared built-in ``COMMANDS`` dict and ``SlashCommandCompleter``.
The completer can optionally include dynamic skill slash commands supplied by the
interactive CLI.
Contains the COMMANDS dict and the SlashCommandCompleter class.
These are pure data/UI with no HermesCLI state dependency.
"""
from __future__ import annotations
from collections.abc import Callable, Mapping
from typing import Any
from prompt_toolkit.completion import Completer, Completion
COMMANDS = {
"/help": "Show this help message",
"/tools": "List available tools",
"/toolsets": "List available toolsets",
"/model": "Show or change the current model",
"/provider": "Show available providers and current provider",
"/prompt": "View/set custom system prompt",
"/personality": "Set a predefined personality",
"/clear": "Clear screen and reset conversation (fresh start)",
@@ -33,68 +27,26 @@ COMMANDS = {
"/platforms": "Show gateway/messaging platform status",
"/verbose": "Cycle tool progress display: off → new → all → verbose",
"/compress": "Manually compress conversation context (flush memories + summarize)",
"/title": "Set a title for the current session (usage: /title My Session Name)",
"/usage": "Show token usage for the current session",
"/insights": "Show usage insights and analytics (last 30 days)",
"/paste": "Check clipboard for an image and attach it",
"/reload-mcp": "Reload MCP servers from config.yaml",
"/quit": "Exit the CLI (also: /exit, /q)",
}
class SlashCommandCompleter(Completer):
"""Autocomplete for built-in slash commands and optional skill commands."""
def __init__(
self,
skill_commands_provider: Callable[[], Mapping[str, dict[str, Any]]] | None = None,
) -> None:
self._skill_commands_provider = skill_commands_provider
def _iter_skill_commands(self) -> Mapping[str, dict[str, Any]]:
if self._skill_commands_provider is None:
return {}
try:
return self._skill_commands_provider() or {}
except Exception:
return {}
@staticmethod
def _completion_text(cmd_name: str, word: str) -> str:
"""Return replacement text for a completion.
When the user has already typed the full command exactly (``/help``),
returning ``help`` would be a no-op and prompt_toolkit suppresses the
menu. Appending a trailing space keeps the dropdown visible and makes
backspacing retrigger it naturally.
"""
return f"{cmd_name} " if cmd_name == word else cmd_name
"""Autocomplete for /commands in the input area."""
def get_completions(self, document, complete_event):
text = document.text_before_cursor
if not text.startswith("/"):
return
word = text[1:]
for cmd, desc in COMMANDS.items():
cmd_name = cmd[1:]
if cmd_name.startswith(word):
yield Completion(
self._completion_text(cmd_name, word),
cmd_name,
start_position=-len(word),
display=cmd,
display_meta=desc,
)
for cmd, info in self._iter_skill_commands().items():
cmd_name = cmd[1:]
if cmd_name.startswith(word):
description = str(info.get("description", "Skill command"))
short_desc = description[:50] + ("..." if len(description) > 50 else "")
yield Completion(
self._completion_text(cmd_name, word),
start_position=-len(word),
display=cmd,
display_meta=f"{short_desc}",
)

File diff suppressed because it is too large Load Diff

View File

@@ -20,46 +20,46 @@ from hermes_cli.colors import Colors, color
def cron_list(show_all: bool = False):
"""List all scheduled jobs."""
from cron.jobs import list_jobs
jobs = list_jobs(include_disabled=show_all)
if not jobs:
print(color("No scheduled jobs.", Colors.DIM))
print(color("Create one with the /cron add command in chat, or via Telegram.", Colors.DIM))
return
print()
print(color("┌─────────────────────────────────────────────────────────────────────────┐", Colors.CYAN))
print(color("│ Scheduled Jobs │", Colors.CYAN))
print(color("└─────────────────────────────────────────────────────────────────────────┘", Colors.CYAN))
print()
for job in jobs:
job_id = job.get("id", "?")[:8]
name = job.get("name", "(unnamed)")
schedule = job.get("schedule_display", job.get("schedule", {}).get("value", "?"))
enabled = job.get("enabled", True)
next_run = job.get("next_run_at", "?")
repeat_info = job.get("repeat", {})
repeat_times = repeat_info.get("times")
repeat_completed = repeat_info.get("completed", 0)
if repeat_times:
repeat_str = f"{repeat_completed}/{repeat_times}"
else:
repeat_str = ""
deliver = job.get("deliver", ["local"])
if isinstance(deliver, str):
deliver = [deliver]
deliver_str = ", ".join(deliver)
if not enabled:
status = color("[disabled]", Colors.RED)
else:
status = color("[active]", Colors.GREEN)
print(f" {color(job_id, Colors.YELLOW)} {status}")
print(f" Name: {name}")
print(f" Schedule: {schedule}")
@@ -67,10 +67,9 @@ def cron_list(show_all: bool = False):
print(f" Next run: {next_run}")
print(f" Deliver: {deliver_str}")
print()
# Warn if gateway isn't running
from hermes_cli.gateway import find_gateway_pids
if not find_gateway_pids():
print(color(" ⚠ Gateway is not running — jobs won't fire automatically.", Colors.YELLOW))
print(color(" Start it with: hermes gateway install", Colors.DIM))
@@ -80,7 +79,6 @@ def cron_list(show_all: bool = False):
def cron_tick():
"""Run due jobs once and exit."""
from cron.scheduler import tick
tick(verbose=True)
@@ -88,9 +86,9 @@ def cron_status():
"""Show cron execution status."""
from cron.jobs import list_jobs
from hermes_cli.gateway import find_gateway_pids
print()
pids = find_gateway_pids()
if pids:
print(color("✓ Gateway is running — cron jobs will fire automatically", Colors.GREEN))
@@ -101,9 +99,9 @@ def cron_status():
print(" To enable automatic execution:")
print(" hermes gateway install # Install as system service (recommended)")
print(" hermes gateway # Or run in foreground")
print()
jobs = list_jobs(include_disabled=False)
if jobs:
next_runs = [j.get("next_run_at") for j in jobs if j.get("next_run_at")]
@@ -112,24 +110,24 @@ def cron_status():
print(f" Next run: {min(next_runs)}")
else:
print(" No active jobs")
print()
def cron_command(args):
"""Handle cron subcommands."""
subcmd = getattr(args, "cron_command", None)
subcmd = getattr(args, 'cron_command', None)
if subcmd is None or subcmd == "list":
show_all = getattr(args, "all", False)
show_all = getattr(args, 'all', False)
cron_list(show_all)
elif subcmd == "tick":
cron_tick()
elif subcmd == "status":
cron_status()
else:
print(f"Unknown cron command: {subcmd}")
print("Usage: hermes cron [list|status|tick]")

View File

@@ -5,18 +5,18 @@ Diagnoses issues with Hermes Agent setup.
"""
import os
import shutil
import subprocess
import sys
import subprocess
import shutil
from pathlib import Path
from hermes_cli.config import get_env_path, get_hermes_home, get_project_root
from hermes_cli.config import get_project_root, get_hermes_home, get_env_path
PROJECT_ROOT = get_project_root()
HERMES_HOME = get_hermes_home()
# Load environment variables from ~/.hermes/.env so API key checks work
from dotenv import load_dotenv
_env_path = get_env_path()
if _env_path.exists():
try:
@@ -33,60 +33,38 @@ os.environ.setdefault("MSWEA_SILENT_STARTUP", "1")
from hermes_cli.colors import Colors, color
from hermes_constants import OPENROUTER_MODELS_URL
_PROVIDER_ENV_HINTS = (
"OPENROUTER_API_KEY",
"OPENAI_API_KEY",
"ANTHROPIC_API_KEY",
"OPENAI_BASE_URL",
"GLM_API_KEY",
"ZAI_API_KEY",
"Z_AI_API_KEY",
"KIMI_API_KEY",
"MINIMAX_API_KEY",
"MINIMAX_CN_API_KEY",
)
def _has_provider_env_config(content: str) -> bool:
"""Return True when ~/.hermes/.env contains provider auth/base URL settings."""
return any(key in content for key in _PROVIDER_ENV_HINTS)
def check_ok(text: str, detail: str = ""):
print(f" {color('', Colors.GREEN)} {text}" + (f" {color(detail, Colors.DIM)}" if detail else ""))
def check_warn(text: str, detail: str = ""):
print(f" {color('', Colors.YELLOW)} {text}" + (f" {color(detail, Colors.DIM)}" if detail else ""))
def check_fail(text: str, detail: str = ""):
print(f" {color('', Colors.RED)} {text}" + (f" {color(detail, Colors.DIM)}" if detail else ""))
def check_info(text: str):
print(f" {color('', Colors.CYAN)} {text}")
def run_doctor(args):
"""Run diagnostic checks."""
should_fix = getattr(args, "fix", False)
should_fix = getattr(args, 'fix', False)
issues = []
manual_issues = [] # issues that can't be auto-fixed
fixed_count = 0
print()
print(color("┌─────────────────────────────────────────────────────────┐", Colors.CYAN))
print(color("│ 🩺 Hermes Doctor │", Colors.CYAN))
print(color("└─────────────────────────────────────────────────────────┘", Colors.CYAN))
# =========================================================================
# Check: Python version
# =========================================================================
print()
print(color("◆ Python Environment", Colors.CYAN, Colors.BOLD))
py_version = sys.version_info
if py_version >= (3, 11):
check_ok(f"Python {py_version.major}.{py_version.minor}.{py_version.micro}")
@@ -98,20 +76,20 @@ def run_doctor(args):
else:
check_fail(f"Python {py_version.major}.{py_version.minor}.{py_version.micro}", "(3.10+ required)")
issues.append("Upgrade Python to 3.10+")
# Check if in virtual environment
in_venv = sys.prefix != sys.base_prefix
if in_venv:
check_ok("Virtual environment active")
else:
check_warn("Not in virtual environment", "(recommended)")
# =========================================================================
# Check: Required packages
# =========================================================================
print()
print(color("◆ Required Packages", Colors.CYAN, Colors.BOLD))
required_packages = [
("openai", "OpenAI SDK"),
("rich", "Rich (terminal UI)"),
@@ -119,13 +97,13 @@ def run_doctor(args):
("yaml", "PyYAML"),
("httpx", "HTTPX"),
]
optional_packages = [
("croniter", "Croniter (cron expressions)"),
("telegram", "python-telegram-bot"),
("discord", "discord.py"),
]
for module, name in required_packages:
try:
__import__(module)
@@ -133,35 +111,39 @@ def run_doctor(args):
except ImportError:
check_fail(name, "(missing)")
issues.append(f"Install {name}: uv pip install {module}")
for module, name in optional_packages:
try:
__import__(module)
check_ok(name, "(optional)")
except ImportError:
check_warn(name, "(optional, not installed)")
# =========================================================================
# Check: Configuration files
# =========================================================================
print()
print(color("◆ Configuration Files", Colors.CYAN, Colors.BOLD))
# Check ~/.hermes/.env (primary location for user config)
env_path = HERMES_HOME / ".env"
env_path = HERMES_HOME / '.env'
if env_path.exists():
check_ok("~/.hermes/.env file exists")
# Check for common issues
content = env_path.read_text()
if _has_provider_env_config(content):
check_ok("API key or custom endpoint configured")
if any(k in content for k in (
"OPENROUTER_API_KEY", "ANTHROPIC_API_KEY",
"GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY",
"KIMI_API_KEY", "MINIMAX_API_KEY", "MINIMAX_CN_API_KEY",
)):
check_ok("API key configured")
else:
check_warn("No API key found in ~/.hermes/.env")
issues.append("Run 'hermes setup' to configure API keys")
else:
# Also check project root as fallback
fallback_env = PROJECT_ROOT / ".env"
fallback_env = PROJECT_ROOT / '.env'
if fallback_env.exists():
check_ok(".env file exists (in project directory)")
else:
@@ -175,17 +157,17 @@ def run_doctor(args):
else:
check_info("Run 'hermes setup' to create one")
issues.append("Run 'hermes setup' to create .env")
# Check ~/.hermes/config.yaml (primary) or project cli-config.yaml (fallback)
config_path = HERMES_HOME / "config.yaml"
config_path = HERMES_HOME / 'config.yaml'
if config_path.exists():
check_ok("~/.hermes/config.yaml exists")
else:
fallback_config = PROJECT_ROOT / "cli-config.yaml"
fallback_config = PROJECT_ROOT / 'cli-config.yaml'
if fallback_config.exists():
check_ok("cli-config.yaml exists (in project directory)")
else:
example_config = PROJECT_ROOT / "cli-config.yaml.example"
example_config = PROJECT_ROOT / 'cli-config.yaml.example'
if should_fix and example_config.exists():
config_path.parent.mkdir(parents=True, exist_ok=True)
shutil.copy2(str(example_config), str(config_path))
@@ -196,7 +178,7 @@ def run_doctor(args):
manual_issues.append("Create ~/.hermes/config.yaml manually")
else:
check_warn("config.yaml not found", "(using defaults)")
# =========================================================================
# Check: Auth providers
# =========================================================================
@@ -204,7 +186,7 @@ def run_doctor(args):
print(color("◆ Auth Providers", Colors.CYAN, Colors.BOLD))
try:
from hermes_cli.auth import get_codex_auth_status, get_nous_auth_status
from hermes_cli.auth import get_nous_auth_status, get_codex_auth_status
nous_status = get_nous_auth_status()
if nous_status.get("logged_in"):
@@ -232,7 +214,7 @@ def run_doctor(args):
# =========================================================================
print()
print(color("◆ Directory Structure", Colors.CYAN, Colors.BOLD))
hermes_home = HERMES_HOME
if hermes_home.exists():
check_ok("~/.hermes directory exists")
@@ -243,7 +225,7 @@ def run_doctor(args):
fixed_count += 1
else:
check_warn("~/.hermes not found", "(will be created on first use)")
# Check expected subdirectories
expected_subdirs = ["cron", "sessions", "logs", "skills", "memories"]
for subdir_name in expected_subdirs:
@@ -257,7 +239,7 @@ def run_doctor(args):
fixed_count += 1
else:
check_warn(f"~/.hermes/{subdir_name}/ not found", "(will be created on first use)")
# Check for SOUL.md persona file
soul_path = hermes_home / "SOUL.md"
if soul_path.exists():
@@ -280,7 +262,7 @@ def run_doctor(args):
)
check_ok("Created ~/.hermes/SOUL.md with basic template")
fixed_count += 1
# Check memory directory
memories_dir = hermes_home / "memories"
if memories_dir.exists():
@@ -303,13 +285,12 @@ def run_doctor(args):
memories_dir.mkdir(parents=True, exist_ok=True)
check_ok("Created ~/.hermes/memories/")
fixed_count += 1
# Check SQLite session store
state_db_path = hermes_home / "state.db"
if state_db_path.exists():
try:
import sqlite3
conn = sqlite3.connect(str(state_db_path))
cursor = conn.execute("SELECT COUNT(*) FROM sessions")
count = cursor.fetchone()[0]
@@ -319,26 +300,26 @@ def run_doctor(args):
check_warn(f"~/.hermes/state.db exists but has issues: {e}")
else:
check_info("~/.hermes/state.db not created yet (will be created on first session)")
# =========================================================================
# Check: External tools
# =========================================================================
print()
print(color("◆ External Tools", Colors.CYAN, Colors.BOLD))
# Git
if shutil.which("git"):
check_ok("git")
else:
check_warn("git not found", "(optional)")
# ripgrep (optional, for faster file search)
if shutil.which("rg"):
check_ok("ripgrep (rg)", "(faster file search)")
else:
check_warn("ripgrep (rg) not found", "(file search uses grep fallback)")
check_info("Install for faster search: sudo apt install ripgrep")
# Docker (optional)
terminal_env = os.getenv("TERMINAL_ENV", "local")
if terminal_env == "docker":
@@ -358,7 +339,7 @@ def run_doctor(args):
check_ok("docker", "(optional)")
else:
check_warn("docker not found", "(optional)")
# SSH (if using ssh backend)
if terminal_env == "ssh":
ssh_host = os.getenv("TERMINAL_SSH_HOST")
@@ -367,7 +348,7 @@ def run_doctor(args):
result = subprocess.run(
["ssh", "-o", "ConnectTimeout=5", "-o", "BatchMode=yes", ssh_host, "echo ok"],
capture_output=True,
text=True,
text=True
)
if result.returncode == 0:
check_ok(f"SSH connection to {ssh_host}")
@@ -377,7 +358,7 @@ def run_doctor(args):
else:
check_fail("TERMINAL_SSH_HOST not set", "(required for TERMINAL_ENV=ssh)")
issues.append("Set TERMINAL_SSH_HOST in .env")
# Daytona (if using daytona backend)
if terminal_env == "daytona":
daytona_key = os.getenv("DAYTONA_API_KEY")
@@ -388,7 +369,6 @@ def run_doctor(args):
issues.append("Set DAYTONA_API_KEY environment variable")
try:
from daytona import Daytona
check_ok("daytona SDK", "(installed)")
except ImportError:
check_fail("daytona SDK not installed", "(pip install daytona)")
@@ -405,7 +385,7 @@ def run_doctor(args):
check_warn("agent-browser not installed", "(run: npm install)")
else:
check_warn("Node.js not found", "(optional, needed for browser tools)")
# npm audit for all Node.js packages
if shutil.which("npm"):
npm_dirs = [
@@ -419,12 +399,9 @@ def run_doctor(args):
audit_result = subprocess.run(
["npm", "audit", "--json"],
cwd=str(npm_dir),
capture_output=True,
text=True,
timeout=30,
capture_output=True, text=True, timeout=30,
)
import json as _json
audit_data = _json.loads(audit_result.stdout) if audit_result.stdout.strip() else {}
vuln_count = audit_data.get("metadata", {}).get("vulnerabilities", {})
critical = vuln_count.get("critical", 0)
@@ -436,7 +413,7 @@ def run_doctor(args):
elif critical > 0 or high > 0:
check_warn(
f"{label} deps",
f"({critical} critical, {high} high, {moderate} moderate — run: cd {npm_dir} && npm audit fix)",
f"({critical} critical, {high} high, {moderate} moderate — run: cd {npm_dir} && npm audit fix)"
)
issues.append(f"{label} has {total} npm vulnerability(ies)")
else:
@@ -449,50 +426,47 @@ def run_doctor(args):
# =========================================================================
print()
print(color("◆ API Connectivity", Colors.CYAN, Colors.BOLD))
openrouter_key = os.getenv("OPENROUTER_API_KEY")
if openrouter_key:
print(" Checking OpenRouter API...", end="", flush=True)
try:
import httpx
response = httpx.get(
OPENROUTER_MODELS_URL, headers={"Authorization": f"Bearer {openrouter_key}"}, timeout=10
OPENROUTER_MODELS_URL,
headers={"Authorization": f"Bearer {openrouter_key}"},
timeout=10
)
if response.status_code == 200:
print(f"\r {color('', Colors.GREEN)} OpenRouter API ")
elif response.status_code == 401:
print(
f"\r {color('', Colors.RED)} OpenRouter API {color('(invalid API key)', Colors.DIM)} "
)
print(f"\r {color('', Colors.RED)} OpenRouter API {color('(invalid API key)', Colors.DIM)} ")
issues.append("Check OPENROUTER_API_KEY in .env")
else:
print(
f"\r {color('', Colors.RED)} OpenRouter API {color(f'(HTTP {response.status_code})', Colors.DIM)} "
)
print(f"\r {color('', Colors.RED)} OpenRouter API {color(f'(HTTP {response.status_code})', Colors.DIM)} ")
except Exception as e:
print(f"\r {color('', Colors.RED)} OpenRouter API {color(f'({e})', Colors.DIM)} ")
issues.append("Check network connectivity")
else:
check_warn("OpenRouter API", "(not configured)")
anthropic_key = os.getenv("ANTHROPIC_API_KEY")
if anthropic_key:
print(" Checking Anthropic API...", end="", flush=True)
try:
import httpx
response = httpx.get(
"https://api.anthropic.com/v1/models",
headers={"x-api-key": anthropic_key, "anthropic-version": "2023-06-01"},
timeout=10,
headers={
"x-api-key": anthropic_key,
"anthropic-version": "2023-06-01"
},
timeout=10
)
if response.status_code == 200:
print(f"\r {color('', Colors.GREEN)} Anthropic API ")
elif response.status_code == 401:
print(
f"\r {color('', Colors.RED)} Anthropic API {color('(invalid API key)', Colors.DIM)} "
)
print(f"\r {color('', Colors.RED)} Anthropic API {color('(invalid API key)', Colors.DIM)} ")
else:
msg = "(couldn't verify)"
print(f"\r {color('', Colors.YELLOW)} Anthropic API {color(msg, Colors.DIM)} ")
@@ -501,15 +475,10 @@ def run_doctor(args):
# -- API-key providers (Z.AI/GLM, Kimi, MiniMax, MiniMax-CN) --
_apikey_providers = [
(
"Z.AI / GLM",
("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"),
"https://api.z.ai/api/paas/v4/models",
"GLM_BASE_URL",
),
("Kimi / Moonshot", ("KIMI_API_KEY",), "https://api.moonshot.ai/v1/models", "KIMI_BASE_URL"),
("MiniMax", ("MINIMAX_API_KEY",), "https://api.minimax.io/v1/models", "MINIMAX_BASE_URL"),
("MiniMax (China)", ("MINIMAX_CN_API_KEY",), "https://api.minimaxi.com/v1/models", "MINIMAX_CN_BASE_URL"),
("Z.AI / GLM", ("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"), "https://api.z.ai/api/paas/v4/models", "GLM_BASE_URL"),
("Kimi / Moonshot", ("KIMI_API_KEY",), "https://api.moonshot.ai/v1/models", "KIMI_BASE_URL"),
("MiniMax", ("MINIMAX_API_KEY",), "https://api.minimax.io/v1/models", "MINIMAX_BASE_URL"),
("MiniMax (China)", ("MINIMAX_CN_API_KEY",), "https://api.minimaxi.com/v1/models", "MINIMAX_CN_BASE_URL"),
]
for _pname, _env_vars, _default_url, _base_env in _apikey_providers:
_key = ""
@@ -522,18 +491,11 @@ def run_doctor(args):
print(f" Checking {_pname} API...", end="", flush=True)
try:
import httpx
_base = os.getenv(_base_env, "")
# Auto-detect Kimi Code keys (sk-kimi-) → api.kimi.com
if not _base and _key.startswith("sk-kimi-"):
_base = "https://api.kimi.com/coding/v1"
_url = (_base.rstrip("/") + "/models") if _base else _default_url
_headers = {"Authorization": f"Bearer {_key}"}
if "api.kimi.com" in _url.lower():
_headers["User-Agent"] = "KimiCLI/1.0"
_resp = httpx.get(
_url,
headers=_headers,
headers={"Authorization": f"Bearer {_key}"},
timeout=10,
)
if _resp.status_code == 200:
@@ -542,9 +504,7 @@ def run_doctor(args):
print(f"\r {color('', Colors.RED)} {_label} {color('(invalid API key)', Colors.DIM)} ")
issues.append(f"Check {_env_vars[0]} in .env")
else:
print(
f"\r {color('', Colors.YELLOW)} {_label} {color(f'(HTTP {_resp.status_code})', Colors.DIM)} "
)
print(f"\r {color('', Colors.YELLOW)} {_label} {color(f'(HTTP {_resp.status_code})', Colors.DIM)} ")
except Exception as _e:
print(f"\r {color('', Colors.YELLOW)} {_label} {color(f'({_e})', Colors.DIM)} ")
@@ -553,7 +513,7 @@ def run_doctor(args):
# =========================================================================
print()
print(color("◆ Submodules", Colors.CYAN, Colors.BOLD))
# mini-swe-agent (terminal tool backend)
mini_swe_dir = PROJECT_ROOT / "mini-swe-agent"
if mini_swe_dir.exists() and (mini_swe_dir / "pyproject.toml").exists():
@@ -565,7 +525,7 @@ def run_doctor(args):
issues.append("Install mini-swe-agent: uv pip install -e ./mini-swe-agent")
else:
check_warn("mini-swe-agent not found", "(run: git submodule update --init --recursive)")
# tinker-atropos (RL training backend)
tinker_dir = PROJECT_ROOT / "tinker-atropos"
if tinker_dir.exists() and (tinker_dir / "pyproject.toml").exists():
@@ -580,24 +540,24 @@ def run_doctor(args):
check_warn("tinker-atropos requires Python 3.11+", f"(current: {py_version.major}.{py_version.minor})")
else:
check_warn("tinker-atropos not found", "(run: git submodule update --init --recursive)")
# =========================================================================
# Check: Tool Availability
# =========================================================================
print()
print(color("◆ Tool Availability", Colors.CYAN, Colors.BOLD))
try:
# Add project root to path for imports
sys.path.insert(0, str(PROJECT_ROOT))
from model_tools import TOOLSET_REQUIREMENTS, check_tool_availability
from model_tools import check_tool_availability, TOOLSET_REQUIREMENTS
available, unavailable = check_tool_availability()
for tid in available:
info = TOOLSET_REQUIREMENTS.get(tid, {})
check_ok(info.get("name", tid))
for item in unavailable:
env_vars = item.get("missing_vars") or item.get("env_vars") or []
if env_vars:
@@ -612,7 +572,7 @@ def run_doctor(args):
issues.append("Run 'hermes setup' to configure missing API keys for full tool access")
except Exception as e:
check_warn("Could not check tool availability", f"({e})")
# =========================================================================
# Check: Skills Hub
# =========================================================================
@@ -626,7 +586,6 @@ def run_doctor(args):
if lock_file.exists():
try:
import json
lock_data = json.loads(lock_file.read_text())
count = len(lock_data.get("installed", {}))
check_ok(f"Lock file OK ({count} hub-installed skill(s))")
@@ -640,7 +599,6 @@ def run_doctor(args):
check_warn("Skills Hub directory not initialized", "(run: hermes skills list)")
from hermes_cli.config import get_env_value
github_token = get_env_value("GITHUB_TOKEN") or get_env_value("GH_TOKEN")
if github_token:
check_ok("GitHub token configured (authenticated API access)")
@@ -676,5 +634,5 @@ def run_doctor(args):
else:
print(color("" * 60, Colors.GREEN))
print(color(" All checks passed! 🎉", Colors.GREEN, Colors.BOLD))
print()

View File

@@ -13,24 +13,18 @@ from pathlib import Path
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
from hermes_cli.colors import Colors, color
from hermes_cli.config import get_env_value, save_env_value
from hermes_cli.setup import (
print_error,
print_header,
print_info,
print_success,
print_warning,
prompt,
prompt_choice,
prompt_yes_no,
print_header, print_info, print_success, print_warning, print_error,
prompt, prompt_choice, prompt_yes_no,
)
from hermes_cli.colors import Colors, color
# =============================================================================
# Process Management (for manual gateway runs)
# =============================================================================
def find_gateway_pids() -> list:
"""Find PIDs of running gateway processes."""
pids = []
@@ -44,16 +38,17 @@ def find_gateway_pids() -> list:
if is_windows():
# Windows: use wmic to search command lines
result = subprocess.run(
["wmic", "process", "get", "ProcessId,CommandLine", "/FORMAT:LIST"], capture_output=True, text=True
["wmic", "process", "get", "ProcessId,CommandLine", "/FORMAT:LIST"],
capture_output=True, text=True
)
# Parse WMIC LIST output: blocks of "CommandLine=...\nProcessId=...\n"
current_cmd = ""
for line in result.stdout.split("\n"):
for line in result.stdout.split('\n'):
line = line.strip()
if line.startswith("CommandLine="):
current_cmd = line[len("CommandLine=") :]
current_cmd = line[len("CommandLine="):]
elif line.startswith("ProcessId="):
pid_str = line[len("ProcessId=") :]
pid_str = line[len("ProcessId="):]
if any(p in current_cmd for p in patterns):
try:
pid = int(pid_str)
@@ -63,10 +58,14 @@ def find_gateway_pids() -> list:
pass
current_cmd = ""
else:
result = subprocess.run(["ps", "aux"], capture_output=True, text=True)
for line in result.stdout.split("\n"):
result = subprocess.run(
["ps", "aux"],
capture_output=True,
text=True
)
for line in result.stdout.split('\n'):
# Skip grep and current process
if "grep" in line or str(os.getpid()) in line:
if 'grep' in line or str(os.getpid()) in line:
continue
for pattern in patterns:
if pattern in line:
@@ -89,7 +88,7 @@ def kill_gateway_processes(force: bool = False) -> int:
"""Kill any running gateway processes. Returns count killed."""
pids = find_gateway_pids()
killed = 0
for pid in pids:
try:
if force and not is_windows():
@@ -102,20 +101,18 @@ def kill_gateway_processes(force: bool = False) -> int:
pass
except PermissionError:
print(f"⚠ Permission denied to kill PID {pid}")
return killed
def is_linux() -> bool:
return sys.platform.startswith("linux")
return sys.platform.startswith('linux')
def is_macos() -> bool:
return sys.platform == "darwin"
return sys.platform == 'darwin'
def is_windows() -> bool:
return sys.platform == "win32"
return sys.platform == 'win32'
# =============================================================================
@@ -125,15 +122,12 @@ def is_windows() -> bool:
SERVICE_NAME = "hermes-gateway"
SERVICE_DESCRIPTION = "Hermes Agent Gateway - Messaging Platform Integration"
def get_systemd_unit_path() -> Path:
return Path.home() / ".config" / "systemd" / "user" / f"{SERVICE_NAME}.service"
def get_launchd_plist_path() -> Path:
return Path.home() / "Library" / "LaunchAgents" / "ai.hermes.gateway.plist"
def get_python_path() -> str:
if is_windows():
venv_python = PROJECT_ROOT / "venv" / "Scripts" / "python.exe"
@@ -143,16 +137,14 @@ def get_python_path() -> str:
return str(venv_python)
return sys.executable
def get_hermes_cli_path() -> str:
"""Get the path to the hermes CLI."""
# Check if installed via pip
import shutil
hermes_bin = shutil.which("hermes")
if hermes_bin:
return hermes_bin
# Fallback to direct module execution
return f"{get_python_path()} -m hermes_cli.main"
@@ -161,36 +153,20 @@ def get_hermes_cli_path() -> str:
# Systemd (Linux)
# =============================================================================
def generate_systemd_unit() -> str:
import shutil
python_path = get_python_path()
working_dir = str(PROJECT_ROOT)
venv_dir = str(PROJECT_ROOT / "venv")
venv_bin = str(PROJECT_ROOT / "venv" / "bin")
node_bin = str(PROJECT_ROOT / "node_modules" / ".bin")
# Build a PATH that includes the venv, node_modules, and standard system dirs
sane_path = f"{venv_bin}:{node_bin}:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
hermes_cli = shutil.which("hermes") or f"{python_path} -m hermes_cli.main"
return f"""[Unit]
Description={SERVICE_DESCRIPTION}
After=network.target
[Service]
Type=simple
ExecStart={python_path} -m hermes_cli.main gateway run --replace
ExecStop={hermes_cli} gateway stop
ExecStart={python_path} -m hermes_cli.main gateway run
WorkingDirectory={working_dir}
Environment="PATH={sane_path}"
Environment="VIRTUAL_ENV={venv_dir}"
Restart=on-failure
RestartSec=10
KillMode=mixed
KillSignal=SIGTERM
TimeoutStopSec=15
StandardOutput=journal
StandardError=journal
@@ -198,62 +174,56 @@ StandardError=journal
WantedBy=default.target
"""
def systemd_install(force: bool = False):
unit_path = get_systemd_unit_path()
if unit_path.exists() and not force:
print(f"Service already installed at: {unit_path}")
print("Use --force to reinstall")
return
unit_path.parent.mkdir(parents=True, exist_ok=True)
print(f"Installing systemd service to: {unit_path}")
unit_path.write_text(generate_systemd_unit())
subprocess.run(["systemctl", "--user", "daemon-reload"], check=True)
subprocess.run(["systemctl", "--user", "enable", SERVICE_NAME], check=True)
print()
print("✓ Service installed and enabled!")
print()
print("Next steps:")
print(" hermes gateway start # Start the service")
print(" hermes gateway status # Check status")
print(f" hermes gateway start # Start the service")
print(f" hermes gateway status # Check status")
print(f" journalctl --user -u {SERVICE_NAME} -f # View logs")
print()
print("To enable lingering (keeps running after logout):")
print(" sudo loginctl enable-linger $USER")
def systemd_uninstall():
subprocess.run(["systemctl", "--user", "stop", SERVICE_NAME], check=False)
subprocess.run(["systemctl", "--user", "disable", SERVICE_NAME], check=False)
unit_path = get_systemd_unit_path()
if unit_path.exists():
unit_path.unlink()
print(f"✓ Removed {unit_path}")
subprocess.run(["systemctl", "--user", "daemon-reload"], check=True)
print("✓ Service uninstalled")
def systemd_start():
subprocess.run(["systemctl", "--user", "start", SERVICE_NAME], check=True)
print("✓ Service started")
def systemd_stop():
subprocess.run(["systemctl", "--user", "stop", SERVICE_NAME], check=True)
print("✓ Service stopped")
def systemd_restart():
subprocess.run(["systemctl", "--user", "restart", SERVICE_NAME], check=True)
print("✓ Service restarted")
def systemd_status(deep: bool = False):
# Check if service unit file exists
unit_path = get_systemd_unit_path()
@@ -261,45 +231,54 @@ def systemd_status(deep: bool = False):
print("✗ Gateway service is not installed")
print(" Run: hermes gateway install")
return
# Show detailed status first
subprocess.run(["systemctl", "--user", "status", SERVICE_NAME, "--no-pager"], capture_output=False)
subprocess.run(
["systemctl", "--user", "status", SERVICE_NAME, "--no-pager"],
capture_output=False
)
# Check if service is active
result = subprocess.run(["systemctl", "--user", "is-active", SERVICE_NAME], capture_output=True, text=True)
result = subprocess.run(
["systemctl", "--user", "is-active", SERVICE_NAME],
capture_output=True,
text=True
)
status = result.stdout.strip()
if status == "active":
print("✓ Gateway service is running")
else:
print("✗ Gateway service is stopped")
print(" Run: hermes gateway start")
if deep:
print()
print("Recent logs:")
subprocess.run(["journalctl", "--user", "-u", SERVICE_NAME, "-n", "20", "--no-pager"])
subprocess.run([
"journalctl", "--user", "-u", SERVICE_NAME,
"-n", "20", "--no-pager"
])
# =============================================================================
# Launchd (macOS)
# =============================================================================
def generate_launchd_plist() -> str:
python_path = get_python_path()
working_dir = str(PROJECT_ROOT)
log_dir = Path.home() / ".hermes" / "logs"
log_dir.mkdir(parents=True, exist_ok=True)
return f"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>ai.hermes.gateway</string>
<key>ProgramArguments</key>
<array>
<string>{python_path}</string>
@@ -308,43 +287,42 @@ def generate_launchd_plist() -> str:
<string>gateway</string>
<string>run</string>
</array>
<key>WorkingDirectory</key>
<string>{working_dir}</string>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<dict>
<key>SuccessfulExit</key>
<false/>
</dict>
<key>StandardOutPath</key>
<string>{log_dir}/gateway.log</string>
<key>StandardErrorPath</key>
<string>{log_dir}/gateway.error.log</string>
</dict>
</plist>
"""
def launchd_install(force: bool = False):
plist_path = get_launchd_plist_path()
if plist_path.exists() and not force:
print(f"Service already installed at: {plist_path}")
print("Use --force to reinstall")
return
plist_path.parent.mkdir(parents=True, exist_ok=True)
print(f"Installing launchd service to: {plist_path}")
plist_path.write_text(generate_launchd_plist())
subprocess.run(["launchctl", "load", str(plist_path)], check=True)
print()
print("✓ Service installed and loaded!")
print()
@@ -352,42 +330,41 @@ def launchd_install(force: bool = False):
print(" hermes gateway status # Check status")
print(" tail -f ~/.hermes/logs/gateway.log # View logs")
def launchd_uninstall():
plist_path = get_launchd_plist_path()
subprocess.run(["launchctl", "unload", str(plist_path)], check=False)
if plist_path.exists():
plist_path.unlink()
print(f"✓ Removed {plist_path}")
print("✓ Service uninstalled")
def launchd_start():
subprocess.run(["launchctl", "start", "ai.hermes.gateway"], check=True)
print("✓ Service started")
def launchd_stop():
subprocess.run(["launchctl", "stop", "ai.hermes.gateway"], check=True)
print("✓ Service stopped")
def launchd_restart():
launchd_stop()
launchd_start()
def launchd_status(deep: bool = False):
result = subprocess.run(["launchctl", "list", "ai.hermes.gateway"], capture_output=True, text=True)
result = subprocess.run(
["launchctl", "list", "ai.hermes.gateway"],
capture_output=True,
text=True
)
if result.returncode == 0:
print("✓ Gateway service is loaded")
print(result.stdout)
else:
print("✗ Gateway service is not loaded")
if deep:
log_file = Path.home() / ".hermes" / "logs" / "gateway.log"
if log_file.exists():
@@ -400,20 +377,12 @@ def launchd_status(deep: bool = False):
# Gateway Runner
# =============================================================================
def run_gateway(verbose: bool = False, replace: bool = False):
"""Run the gateway in foreground.
Args:
verbose: Enable verbose logging output.
replace: If True, kill any existing gateway instance before starting.
This prevents systemd restart loops when the old process
hasn't fully exited yet.
"""
def run_gateway(verbose: bool = False):
"""Run the gateway in foreground."""
sys.path.insert(0, str(PROJECT_ROOT))
from gateway.run import start_gateway
print("┌─────────────────────────────────────────────────────────┐")
print("│ ⚕ Hermes Gateway Starting... │")
print("├─────────────────────────────────────────────────────────┤")
@@ -421,10 +390,10 @@ def run_gateway(verbose: bool = False, replace: bool = False):
print("│ Press Ctrl+C to stop │")
print("└─────────────────────────────────────────────────────────┘")
print()
# Exit with code 1 if gateway fails to connect any platform,
# so systemd Restart=on-failure will retry on transient errors
success = asyncio.run(start_gateway(replace=replace))
success = asyncio.run(start_gateway())
if not success:
sys.exit(1)
@@ -448,25 +417,13 @@ _PLATFORMS = [
"4. To find your user ID: message @userinfobot — it replies with your numeric ID",
],
"vars": [
{
"name": "TELEGRAM_BOT_TOKEN",
"prompt": "Bot token",
"password": True,
"help": "Paste the token from @BotFather (step 3 above).",
},
{
"name": "TELEGRAM_ALLOWED_USERS",
"prompt": "Allowed user IDs (comma-separated)",
"password": False,
"is_allowlist": True,
"help": "Paste your user ID from step 4 above.",
},
{
"name": "TELEGRAM_HOME_CHANNEL",
"prompt": "Home channel ID (for cron/notification delivery, or empty to set later with /set-home)",
"password": False,
"help": "For DMs, this is your user ID. You can set it later by typing /set-home in chat.",
},
{"name": "TELEGRAM_BOT_TOKEN", "prompt": "Bot token", "password": True,
"help": "Paste the token from @BotFather (step 3 above)."},
{"name": "TELEGRAM_ALLOWED_USERS", "prompt": "Allowed user IDs (comma-separated)", "password": False,
"is_allowlist": True,
"help": "Paste your user ID from step 4 above."},
{"name": "TELEGRAM_HOME_CHANNEL", "prompt": "Home channel ID (for cron/notification delivery, or empty to set later with /set-home)", "password": False,
"help": "For DMs, this is your user ID. You can set it later by typing /set-home in chat."},
],
},
{
@@ -488,25 +445,13 @@ _PLATFORMS = [
" then right-click your name → Copy ID",
],
"vars": [
{
"name": "DISCORD_BOT_TOKEN",
"prompt": "Bot token",
"password": True,
"help": "Paste the token from step 2 above.",
},
{
"name": "DISCORD_ALLOWED_USERS",
"prompt": "Allowed user IDs or usernames (comma-separated)",
"password": False,
"is_allowlist": True,
"help": "Paste your user ID from step 5 above.",
},
{
"name": "DISCORD_HOME_CHANNEL",
"prompt": "Home channel ID (for cron/notification delivery, or empty to set later with /set-home)",
"password": False,
"help": "Right-click a channel → Copy Channel ID (requires Developer Mode).",
},
{"name": "DISCORD_BOT_TOKEN", "prompt": "Bot token", "password": True,
"help": "Paste the token from step 2 above."},
{"name": "DISCORD_ALLOWED_USERS", "prompt": "Allowed user IDs or usernames (comma-separated)", "password": False,
"is_allowlist": True,
"help": "Paste your user ID from step 5 above."},
{"name": "DISCORD_HOME_CHANNEL", "prompt": "Home channel ID (for cron/notification delivery, or empty to set later with /set-home)", "password": False,
"help": "Right-click a channel → Copy Channel ID (requires Developer Mode)."},
],
},
{
@@ -516,40 +461,23 @@ _PLATFORMS = [
"token_var": "SLACK_BOT_TOKEN",
"setup_instructions": [
"1. Go to https://api.slack.com/apps → Create New App → From Scratch",
"2. Enable Socket Mode: Settings → Socket Mode → Enable",
" Create an App-Level Token with scope: connections:write → copy xapp-... token",
"3. Add Bot Token Scopes: Features → OAuth & Permissions → Scopes",
" Required: chat:write, app_mentions:read, channels:history, channels:read,",
" groups:history, im:history, im:read, im:write, users:read, files:write",
"4. Subscribe to Events: Features → Event Subscriptions → Enable",
" Required events: message.im, message.channels, app_mention",
" Optional: message.groups (for private channels)",
" ⚠ Without message.channels the bot will ONLY work in DMs!",
"5. Install to Workspace: Settings → Install App → copy xoxb-... token",
"6. Reinstall the app after any scope or event changes",
"2. Enable Socket Mode: App Settings → Socket Mode → Enable",
"3. Get Bot Token: OAuth & Permissions → Install to Workspace → copy xoxb-... token",
"4. Get App Token: Basic Information → App-Level Tokens → Generate",
" Name it anything, add scope: connections:write → copy xapp-... token",
"5. Add bot scopes: OAuth & Permissions → Scopes → chat:write, im:history,",
" im:read, im:write, channels:history, channels:read",
"6. Reinstall the app to your workspace after adding scopes",
"7. Find your user ID: click your profile → three dots → Copy member ID",
"8. Invite the bot to channels: /invite @YourBot",
],
"vars": [
{
"name": "SLACK_BOT_TOKEN",
"prompt": "Bot Token (xoxb-...)",
"password": True,
"help": "Paste the bot token from step 3 above.",
},
{
"name": "SLACK_APP_TOKEN",
"prompt": "App Token (xapp-...)",
"password": True,
"help": "Paste the app-level token from step 4 above.",
},
{
"name": "SLACK_ALLOWED_USERS",
"prompt": "Allowed user IDs (comma-separated)",
"password": False,
"is_allowlist": True,
"help": "Paste your member ID from step 7 above.",
},
{"name": "SLACK_BOT_TOKEN", "prompt": "Bot Token (xoxb-...)", "password": True,
"help": "Paste the bot token from step 3 above."},
{"name": "SLACK_APP_TOKEN", "prompt": "App Token (xapp-...)", "password": True,
"help": "Paste the app-level token from step 4 above."},
{"name": "SLACK_ALLOWED_USERS", "prompt": "Allowed user IDs (comma-separated)", "password": False,
"is_allowlist": True,
"help": "Paste your member ID from step 7 above."},
],
},
{
@@ -558,12 +486,6 @@ _PLATFORMS = [
"emoji": "📲",
"token_var": "WHATSAPP_ENABLED",
},
{
"key": "signal",
"label": "Signal",
"emoji": "📡",
"token_var": "SIGNAL_HTTP_URL",
},
]
@@ -582,13 +504,6 @@ def _platform_status(platform: dict) -> str:
return "configured + paired"
return "enabled, not paired"
return "not configured"
if platform.get("key") == "signal":
account = get_env_value("SIGNAL_ACCOUNT")
if val and account:
return "configured"
if val or account:
return "partially configured"
return "not configured"
if val:
return "configured"
return "not configured"
@@ -628,14 +543,14 @@ def _setup_standard_platform(platform: dict):
# Allowlist fields get special handling for the deny-by-default security model
if var.get("is_allowlist"):
print_info(" The gateway DENIES all users by default for security.")
print_info(" Enter user IDs to create an allowlist, or leave empty")
print_info(" and you'll be asked about open access next.")
print_info(f" The gateway DENIES all users by default for security.")
print_info(f" Enter user IDs to create an allowlist, or leave empty")
print_info(f" and you'll be asked about open access next.")
value = prompt(f" {var['prompt']}", password=False)
if value:
cleaned = value.replace(" ", "")
save_env_value(var["name"], cleaned)
print_success(" Saved — only these users can interact with the bot.")
print_success(f" Saved — only these users can interact with the bot.")
allowed_val_set = cleaned
else:
# No allowlist — ask about open access vs DM pairing
@@ -664,7 +579,7 @@ def _setup_standard_platform(platform: dict):
print_warning(f" Skipped — {label} won't work without this.")
return
else:
print_info(" Skipped (can configure later)")
print_info(f" Skipped (can configure later)")
# If an allowlist was set and home channel wasn't, offer to reuse
# the first user ID (common for Telegram DMs).
@@ -682,10 +597,8 @@ def _setup_standard_platform(platform: dict):
def _setup_whatsapp():
"""Delegate to the existing WhatsApp setup flow."""
import argparse
from hermes_cli.main import cmd_whatsapp
import argparse
cmd_whatsapp(argparse.Namespace())
@@ -701,131 +614,21 @@ def _is_service_installed() -> bool:
def _is_service_running() -> bool:
"""Check if the gateway service is currently running."""
if is_linux() and get_systemd_unit_path().exists():
result = subprocess.run(["systemctl", "--user", "is-active", SERVICE_NAME], capture_output=True, text=True)
result = subprocess.run(
["systemctl", "--user", "is-active", SERVICE_NAME],
capture_output=True, text=True
)
return result.stdout.strip() == "active"
elif is_macos() and get_launchd_plist_path().exists():
result = subprocess.run(["launchctl", "list", "ai.hermes.gateway"], capture_output=True, text=True)
result = subprocess.run(
["launchctl", "list", "ai.hermes.gateway"],
capture_output=True, text=True
)
return result.returncode == 0
# Check for manual processes
return len(find_gateway_pids()) > 0
def _setup_signal():
"""Interactive setup for Signal messenger."""
import shutil
print()
print(color(" ─── 📡 Signal Setup ───", Colors.CYAN))
existing_url = get_env_value("SIGNAL_HTTP_URL")
existing_account = get_env_value("SIGNAL_ACCOUNT")
if existing_url and existing_account:
print()
print_success("Signal is already configured.")
if not prompt_yes_no(" Reconfigure Signal?", False):
return
# Check if signal-cli is available
print()
if shutil.which("signal-cli"):
print_success("signal-cli found on PATH.")
else:
print_warning("signal-cli not found on PATH.")
print_info(" Signal requires signal-cli running as an HTTP daemon.")
print_info(" Install options:")
print_info(" Linux: sudo apt install signal-cli")
print_info(" or download from https://github.com/AsamK/signal-cli")
print_info(" macOS: brew install signal-cli")
print_info(" Docker: bbernhard/signal-cli-rest-api")
print()
print_info(" After installing, link your account and start the daemon:")
print_info(' signal-cli link -n "HermesAgent"')
print_info(" signal-cli --account +YOURNUMBER daemon --http 127.0.0.1:8080")
print()
# HTTP URL
print()
print_info(" Enter the URL where signal-cli HTTP daemon is running.")
default_url = existing_url or "http://127.0.0.1:8080"
try:
url = input(f" HTTP URL [{default_url}]: ").strip() or default_url
except (EOFError, KeyboardInterrupt):
print("\n Setup cancelled.")
return
# Test connectivity
print_info(" Testing connection...")
try:
import httpx
resp = httpx.get(f"{url.rstrip('/')}/api/v1/check", timeout=10.0)
if resp.status_code == 200:
print_success(" signal-cli daemon is reachable!")
else:
print_warning(f" signal-cli responded with status {resp.status_code}.")
if not prompt_yes_no(" Continue anyway?", False):
return
except Exception as e:
print_warning(f" Could not reach signal-cli at {url}: {e}")
if not prompt_yes_no(" Save this URL anyway? (you can start signal-cli later)", True):
return
save_env_value("SIGNAL_HTTP_URL", url)
# Account phone number
print()
print_info(" Enter your Signal account phone number in E.164 format.")
print_info(" Example: +15551234567")
default_account = existing_account or ""
try:
account = input(f" Account number{f' [{default_account}]' if default_account else ''}: ").strip()
if not account:
account = default_account
except (EOFError, KeyboardInterrupt):
print("\n Setup cancelled.")
return
if not account:
print_error(" Account number is required.")
return
save_env_value("SIGNAL_ACCOUNT", account)
# Allowed users
print()
print_info(" The gateway DENIES all users by default for security.")
print_info(" Enter phone numbers or UUIDs of allowed users (comma-separated).")
existing_allowed = get_env_value("SIGNAL_ALLOWED_USERS") or ""
default_allowed = existing_allowed or account
try:
allowed = input(f" Allowed users [{default_allowed}]: ").strip() or default_allowed
except (EOFError, KeyboardInterrupt):
print("\n Setup cancelled.")
return
save_env_value("SIGNAL_ALLOWED_USERS", allowed)
# Group messaging
print()
if prompt_yes_no(" Enable group messaging? (disabled by default for security)", False):
print()
print_info(" Enter group IDs to allow, or * for all groups.")
existing_groups = get_env_value("SIGNAL_GROUP_ALLOWED_USERS") or ""
try:
groups = input(f" Group IDs [{existing_groups or '*'}]: ").strip() or existing_groups or "*"
except (EOFError, KeyboardInterrupt):
print("\n Setup cancelled.")
return
save_env_value("SIGNAL_GROUP_ALLOWED_USERS", groups)
print()
print_success("Signal configured!")
print_info(f" URL: {url}")
print_info(f" Account: {account}")
print_info(" DM auth: via SIGNAL_ALLOWED_USERS + DM pairing")
print_info(f" Groups: {'enabled' if get_env_value('SIGNAL_GROUP_ALLOWED_USERS') else 'disabled'}")
def gateway_setup():
"""Interactive setup for messaging platforms + gateway service."""
@@ -878,16 +681,15 @@ def gateway_setup():
if platform["key"] == "whatsapp":
_setup_whatsapp()
elif platform["key"] == "signal":
_setup_signal()
else:
_setup_standard_platform(platform)
# ── Post-setup: offer to install/restart gateway ──
any_configured = (
any(bool(get_env_value(p["token_var"])) for p in _PLATFORMS if p["key"] != "whatsapp")
or (get_env_value("WHATSAPP_ENABLED") or "").lower() == "true"
)
any_configured = any(
bool(get_env_value(p["token_var"]))
for p in _PLATFORMS
if p["key"] != "whatsapp"
) or (get_env_value("WHATSAPP_ENABLED") or "").lower() == "true"
if any_configured:
print()
@@ -920,9 +722,7 @@ def gateway_setup():
print()
if is_linux() or is_macos():
platform_name = "systemd" if is_linux() else "launchd"
if prompt_yes_no(
f" Install the gateway as a {platform_name} service? (runs in background, starts on boot)", True
):
if prompt_yes_no(f" Install the gateway as a {platform_name} service? (runs in background, starts on boot)", True):
try:
force = False
if is_linux():
@@ -958,16 +758,14 @@ def gateway_setup():
# Main Command Handler
# =============================================================================
def gateway_command(args):
"""Handle gateway subcommands."""
subcmd = getattr(args, "gateway_command", None)
subcmd = getattr(args, 'gateway_command', None)
# Default to run if no subcommand
if subcmd is None or subcmd == "run":
verbose = getattr(args, "verbose", False)
replace = getattr(args, "replace", False)
run_gateway(verbose, replace=replace)
verbose = getattr(args, 'verbose', False)
run_gateway(verbose)
return
if subcmd == "setup":
@@ -976,7 +774,7 @@ def gateway_command(args):
# Service management commands
if subcmd == "install":
force = getattr(args, "force", False)
force = getattr(args, 'force', False)
if is_linux():
systemd_install(force)
elif is_macos():
@@ -985,7 +783,7 @@ def gateway_command(args):
print("Service installation not supported on this platform.")
print("Run manually: hermes gateway run")
sys.exit(1)
elif subcmd == "uninstall":
if is_linux():
systemd_uninstall()
@@ -994,7 +792,7 @@ def gateway_command(args):
else:
print("Not supported on this platform.")
sys.exit(1)
elif subcmd == "start":
if is_linux():
systemd_start()
@@ -1003,11 +801,11 @@ def gateway_command(args):
else:
print("Not supported on this platform.")
sys.exit(1)
elif subcmd == "stop":
# Try service first, fall back to killing processes directly
service_available = False
if is_linux() and get_systemd_unit_path().exists():
try:
systemd_stop()
@@ -1020,7 +818,7 @@ def gateway_command(args):
service_available = True
except subprocess.CalledProcessError:
pass
if not service_available:
# Kill gateway processes directly
killed = kill_gateway_processes()
@@ -1028,11 +826,11 @@ def gateway_command(args):
print(f"✓ Stopped {killed} gateway process(es)")
else:
print("✗ No gateway processes found")
elif subcmd == "restart":
# Try service first, fall back to killing and restarting
service_available = False
if is_linux() and get_systemd_unit_path().exists():
try:
systemd_restart()
@@ -1045,24 +843,23 @@ def gateway_command(args):
service_available = True
except subprocess.CalledProcessError:
pass
if not service_available:
# Manual restart: kill existing processes
killed = kill_gateway_processes()
if killed:
print(f"✓ Stopped {killed} gateway process(es)")
import time
time.sleep(2)
# Start fresh
print("Starting gateway...")
run_gateway(verbose=False)
elif subcmd == "status":
deep = getattr(args, "deep", False)
deep = getattr(args, 'deep', False)
# Check for service first
if is_linux() and get_systemd_unit_path().exists():
systemd_status(deep)

File diff suppressed because it is too large Load Diff

View File

@@ -1,85 +1,30 @@
"""
Canonical model catalogs and lightweight validation helpers.
Canonical list of OpenRouter models offered in CLI and setup wizards.
Add, remove, or reorder entries here — both `hermes setup` and
`hermes` provider-selection will pick up the change automatically.
"""
from __future__ import annotations
import json
import urllib.error
import urllib.request
from difflib import get_close_matches
from typing import Any
# (model_id, display description shown in menus)
OPENROUTER_MODELS: list[tuple[str, str]] = [
("anthropic/claude-opus-4.6", "recommended"),
("anthropic/claude-sonnet-4.5", ""),
("openai/gpt-5.4-pro", ""),
("openai/gpt-5.4", ""),
("openai/gpt-5.3-codex", ""),
("google/gemini-3-pro-preview", ""),
("google/gemini-3-flash-preview", ""),
("qwen/qwen3.5-plus-02-15", ""),
("qwen/qwen3.5-35b-a3b", ""),
("stepfun/step-3.5-flash", ""),
("z-ai/glm-5", ""),
("moonshotai/kimi-k2.5", ""),
("minimax/minimax-m2.5", ""),
("anthropic/claude-opus-4.6", "recommended"),
("anthropic/claude-sonnet-4.5", ""),
("openai/gpt-5.4-pro", ""),
("openai/gpt-5.4", ""),
("openai/gpt-5.3-codex", ""),
("google/gemini-3-pro-preview", ""),
("google/gemini-3-flash-preview", ""),
("qwen/qwen3.5-plus-02-15", ""),
("qwen/qwen3.5-35b-a3b", ""),
("stepfun/step-3.5-flash", ""),
("z-ai/glm-5", ""),
("moonshotai/kimi-k2.5", ""),
("minimax/minimax-m2.5", ""),
]
_PROVIDER_MODELS: dict[str, list[str]] = {
"zai": [
"glm-5",
"glm-4.7",
"glm-4.5",
"glm-4.5-flash",
],
"kimi-coding": [
"kimi-k2.5",
"kimi-k2-thinking",
"kimi-k2-turbo-preview",
"kimi-k2-0905-preview",
],
"minimax": [
"MiniMax-M2.5",
"MiniMax-M2.5-highspeed",
"MiniMax-M2.1",
],
"minimax-cn": [
"MiniMax-M2.5",
"MiniMax-M2.5-highspeed",
"MiniMax-M2.1",
],
}
_PROVIDER_LABELS = {
"openrouter": "OpenRouter",
"openai-codex": "OpenAI Codex",
"nous": "Nous Portal",
"zai": "Z.AI / GLM",
"kimi-coding": "Kimi / Moonshot",
"minimax": "MiniMax",
"minimax-cn": "MiniMax (China)",
"custom": "Custom endpoint",
}
_PROVIDER_ALIASES = {
"glm": "zai",
"z-ai": "zai",
"z.ai": "zai",
"zhipu": "zai",
"kimi": "kimi-coding",
"moonshot": "kimi-coding",
"minimax-china": "minimax-cn",
"minimax_cn": "minimax-cn",
}
def model_ids() -> list[str]:
"""Return just the OpenRouter model-id strings."""
"""Return just the model-id strings (convenience helper)."""
return [mid for mid, _ in OPENROUTER_MODELS]
@@ -89,234 +34,3 @@ def menu_labels() -> list[str]:
for mid, desc in OPENROUTER_MODELS:
labels.append(f"{mid} ({desc})" if desc else mid)
return labels
# All provider IDs and aliases that are valid for the provider:model syntax.
_KNOWN_PROVIDER_NAMES: set[str] = (
set(_PROVIDER_LABELS.keys()) | set(_PROVIDER_ALIASES.keys()) | {"openrouter", "custom"}
)
def list_available_providers() -> list[dict[str, str]]:
"""Return info about all providers the user could use with ``provider:model``.
Each dict has ``id``, ``label``, and ``aliases``.
Checks which providers have valid credentials configured.
"""
# Canonical providers in display order
_PROVIDER_ORDER = [
"openrouter",
"nous",
"openai-codex",
"zai",
"kimi-coding",
"minimax",
"minimax-cn",
]
# Build reverse alias map
aliases_for: dict[str, list[str]] = {}
for alias, canonical in _PROVIDER_ALIASES.items():
aliases_for.setdefault(canonical, []).append(alias)
result = []
for pid in _PROVIDER_ORDER:
label = _PROVIDER_LABELS.get(pid, pid)
alias_list = aliases_for.get(pid, [])
# Check if this provider has credentials available
has_creds = False
try:
from hermes_cli.runtime_provider import resolve_runtime_provider
runtime = resolve_runtime_provider(requested=pid)
has_creds = bool(runtime.get("api_key"))
except Exception:
pass
result.append(
{
"id": pid,
"label": label,
"aliases": alias_list,
"authenticated": has_creds,
}
)
return result
def parse_model_input(raw: str, current_provider: str) -> tuple[str, str]:
"""Parse ``/model`` input into ``(provider, model)``.
Supports ``provider:model`` syntax to switch providers at runtime::
openrouter:anthropic/claude-sonnet-4.5 → ("openrouter", "anthropic/claude-sonnet-4.5")
nous:hermes-3 → ("nous", "hermes-3")
anthropic/claude-sonnet-4.5 → (current_provider, "anthropic/claude-sonnet-4.5")
gpt-5.4 → (current_provider, "gpt-5.4")
The colon is only treated as a provider delimiter if the left side is a
recognized provider name or alias. This avoids misinterpreting model names
that happen to contain colons (e.g. ``anthropic/claude-3.5-sonnet:beta``).
Returns ``(provider, model)`` where *provider* is either the explicit
provider from the input or *current_provider* if none was specified.
"""
stripped = raw.strip()
colon = stripped.find(":")
if colon > 0:
provider_part = stripped[:colon].strip().lower()
model_part = stripped[colon + 1 :].strip()
if provider_part and model_part and provider_part in _KNOWN_PROVIDER_NAMES:
return (normalize_provider(provider_part), model_part)
return (current_provider, stripped)
def curated_models_for_provider(provider: str | None) -> list[tuple[str, str]]:
"""Return ``(model_id, description)`` tuples for a provider's curated list."""
normalized = normalize_provider(provider)
if normalized == "openrouter":
return list(OPENROUTER_MODELS)
models = _PROVIDER_MODELS.get(normalized, [])
return [(m, "") for m in models]
def normalize_provider(provider: str | None) -> str:
"""Normalize provider aliases to Hermes' canonical provider ids.
Note: ``"auto"`` passes through unchanged — use
``hermes_cli.auth.resolve_provider()`` to resolve it to a concrete
provider based on credentials and environment.
"""
normalized = (provider or "openrouter").strip().lower()
return _PROVIDER_ALIASES.get(normalized, normalized)
def provider_model_ids(provider: str | None) -> list[str]:
"""Return the best known model catalog for a provider."""
normalized = normalize_provider(provider)
if normalized == "openrouter":
return model_ids()
if normalized == "openai-codex":
from hermes_cli.codex_models import get_codex_model_ids
return get_codex_model_ids()
return list(_PROVIDER_MODELS.get(normalized, []))
def fetch_api_models(
api_key: str | None,
base_url: str | None,
timeout: float = 5.0,
) -> list[str] | None:
"""Fetch the list of available model IDs from the provider's ``/models`` endpoint.
Returns a list of model ID strings, or ``None`` if the endpoint could not
be reached (network error, timeout, auth failure, etc.).
"""
if not base_url:
return None
url = base_url.rstrip("/") + "/models"
headers: dict[str, str] = {}
if api_key:
headers["Authorization"] = f"Bearer {api_key}"
req = urllib.request.Request(url, headers=headers)
try:
with urllib.request.urlopen(req, timeout=timeout) as resp:
data = json.loads(resp.read().decode())
# Standard OpenAI format: {"data": [{"id": "model-name", ...}, ...]}
return [m.get("id", "") for m in data.get("data", [])]
except Exception:
return None
def validate_requested_model(
model_name: str,
provider: str | None,
*,
api_key: str | None = None,
base_url: str | None = None,
) -> dict[str, Any]:
"""
Validate a ``/model`` value for the active provider.
Performs format checks first, then probes the live API to confirm
the model actually exists.
Returns a dict with:
- accepted: whether the CLI should switch to the requested model now
- persist: whether it is safe to save to config
- recognized: whether it matched a known provider catalog
- message: optional warning / guidance for the user
"""
requested = (model_name or "").strip()
normalized = normalize_provider(provider)
if normalized == "openrouter" and base_url and "openrouter.ai" not in base_url:
normalized = "custom"
if not requested:
return {
"accepted": False,
"persist": False,
"recognized": False,
"message": "Model name cannot be empty.",
}
if any(ch.isspace() for ch in requested):
return {
"accepted": False,
"persist": False,
"recognized": False,
"message": "Model names cannot contain spaces.",
}
# Probe the live API to check if the model actually exists
api_models = fetch_api_models(api_key, base_url)
if api_models is not None:
if requested in set(api_models):
# API confirmed the model exists
return {
"accepted": True,
"persist": True,
"recognized": True,
"message": None,
}
else:
# API responded but model is not listed
suggestions = get_close_matches(requested, api_models, n=3, cutoff=0.5)
suggestion_text = ""
if suggestions:
suggestion_text = "\n Did you mean: " + ", ".join(f"`{s}`" for s in suggestions)
return {
"accepted": False,
"persist": False,
"recognized": False,
"message": (f"Error: `{requested}` is not a valid model for this provider.{suggestion_text}"),
}
# api_models is None — couldn't reach API, fall back to catalog check
provider_label = _PROVIDER_LABELS.get(normalized, normalized)
known_models = provider_model_ids(normalized)
if requested in known_models:
return {
"accepted": True,
"persist": True,
"recognized": True,
"message": None,
}
# Can't validate — accept for session only
suggestion = get_close_matches(requested, known_models, n=1, cutoff=0.6)
suggestion_text = f" Did you mean `{suggestion[0]}`?" if suggestion else ""
return {
"accepted": True,
"persist": False,
"recognized": False,
"message": (
f"Could not validate `{requested}` against the live {provider_label} API. "
"Using it for this session only; config unchanged."
f"{suggestion_text}"
),
}

View File

@@ -8,7 +8,6 @@ Usage:
hermes pairing clear-pending # Clear all expired/pending codes
"""
def pairing_command(args):
"""Handle hermes pairing subcommands."""
from gateway.pairing import PairingStore
@@ -73,10 +72,10 @@ def _cmd_approve(store, platform: str, code: str):
name = result.get("user_name", "")
display = f"{name} ({uid})" if name else uid
print(f"\n Approved! User {display} on {platform} can now use the bot~")
print(" They'll be recognized automatically on their next message.\n")
print(f" They'll be recognized automatically on their next message.\n")
else:
print(f"\n Code '{code}' not found or expired for platform '{platform}'.")
print(" Run 'hermes pairing list' to see pending codes.\n")
print(f" Run 'hermes pairing list' to see pending codes.\n")
def _cmd_revoke(store, platform: str, user_id: str):

View File

@@ -3,22 +3,22 @@
from __future__ import annotations
import os
from typing import Any
from typing import Any, Dict, Optional
from hermes_cli.auth import (
PROVIDER_REGISTRY,
AuthError,
PROVIDER_REGISTRY,
format_auth_error,
resolve_api_key_provider_credentials,
resolve_codex_runtime_credentials,
resolve_nous_runtime_credentials,
resolve_provider,
resolve_nous_runtime_credentials,
resolve_codex_runtime_credentials,
resolve_api_key_provider_credentials,
)
from hermes_cli.config import load_config
from hermes_constants import OPENROUTER_BASE_URL
def _get_model_config() -> dict[str, Any]:
def _get_model_config() -> Dict[str, Any]:
config = load_config()
model_cfg = config.get("model")
if isinstance(model_cfg, dict):
@@ -28,7 +28,7 @@ def _get_model_config() -> dict[str, Any]:
return {}
def resolve_requested_provider(requested: str | None = None) -> str:
def resolve_requested_provider(requested: Optional[str] = None) -> str:
"""Resolve provider request from explicit arg, env, then config."""
if requested and requested.strip():
return requested.strip().lower()
@@ -48,9 +48,9 @@ def resolve_requested_provider(requested: str | None = None) -> str:
def _resolve_openrouter_runtime(
*,
requested_provider: str,
explicit_api_key: str | None = None,
explicit_base_url: str | None = None,
) -> dict[str, Any]:
explicit_api_key: Optional[str] = None,
explicit_base_url: Optional[str] = None,
) -> Dict[str, Any]:
model_cfg = _get_model_config()
cfg_base_url = model_cfg.get("base_url") if isinstance(model_cfg.get("base_url"), str) else ""
cfg_provider = model_cfg.get("provider") if isinstance(model_cfg.get("provider"), str) else ""
@@ -81,9 +81,19 @@ def _resolve_openrouter_runtime(
# provider (issues #420, #560).
_is_openrouter_url = "openrouter.ai" in base_url
if _is_openrouter_url:
api_key = explicit_api_key or os.getenv("OPENROUTER_API_KEY") or os.getenv("OPENAI_API_KEY") or ""
api_key = (
explicit_api_key
or os.getenv("OPENROUTER_API_KEY")
or os.getenv("OPENAI_API_KEY")
or ""
)
else:
api_key = explicit_api_key or os.getenv("OPENAI_API_KEY") or os.getenv("OPENROUTER_API_KEY") or ""
api_key = (
explicit_api_key
or os.getenv("OPENAI_API_KEY")
or os.getenv("OPENROUTER_API_KEY")
or ""
)
source = "explicit" if (explicit_api_key or explicit_base_url) else "env/config"
@@ -98,10 +108,10 @@ def _resolve_openrouter_runtime(
def resolve_runtime_provider(
*,
requested: str | None = None,
explicit_api_key: str | None = None,
explicit_base_url: str | None = None,
) -> dict[str, Any]:
requested: Optional[str] = None,
explicit_api_key: Optional[str] = None,
explicit_base_url: Optional[str] = None,
) -> Dict[str, Any]:
"""Resolve runtime provider credentials for agent execution."""
requested_provider = resolve_requested_provider(requested)

File diff suppressed because it is too large Load Diff

View File

@@ -13,6 +13,7 @@ handler are thin wrappers that parse args and delegate.
import json
import shutil
from pathlib import Path
from typing import Optional
from rich.console import Console
from rich.panel import Panel
@@ -28,7 +29,6 @@ _console = Console()
# Shared do_* functions
# ---------------------------------------------------------------------------
def _resolve_short_name(name: str, sources, console: Console) -> str:
"""
Resolve a short skill name (e.g. 'pptx') to a full identifier by searching
@@ -57,9 +57,7 @@ def _resolve_short_name(name: str, sources, console: Console) -> str:
table.add_column("Trust", style="dim")
table.add_column("Identifier", style="bold cyan")
for r in exact:
trust_style = {"builtin": "bright_cyan", "trusted": "green", "community": "yellow"}.get(
r.trust_level, "dim"
)
trust_style = {"builtin": "bright_cyan", "trusted": "green", "community": "yellow"}.get(r.trust_level, "dim")
trust_label = "official" if r.source == "official" else r.trust_level
table.add_row(r.source, f"[{trust_style}]{trust_label}[/]", r.identifier)
c.print(table)
@@ -78,7 +76,8 @@ def _resolve_short_name(name: str, sources, console: Console) -> str:
return ""
def do_search(query: str, source: str = "all", limit: int = 10, console: Console | None = None) -> None:
def do_search(query: str, source: str = "all", limit: int = 10,
console: Optional[Console] = None) -> None:
"""Search registries and display results as a Rich table."""
from tools.skills_hub import GitHubAuth, create_source_router, unified_search
@@ -112,19 +111,18 @@ def do_search(query: str, source: str = "all", limit: int = 10, console: Console
)
c.print(table)
c.print(
"[dim]Use: hermes skills inspect <identifier> to preview, hermes skills install <identifier> to install[/]\n"
)
c.print("[dim]Use: hermes skills inspect <identifier> to preview, "
"hermes skills install <identifier> to install[/]\n")
def do_browse(page: int = 1, page_size: int = 20, source: str = "all", console: Console | None = None) -> None:
def do_browse(page: int = 1, page_size: int = 20, source: str = "all",
console: Optional[Console] = None) -> None:
"""Browse all available skills across registries, paginated.
Official skills are always shown first, regardless of source filter.
"""
from tools.skills_hub import (
GitHubAuth,
create_source_router,
GitHubAuth, create_source_router, OptionalSkillSource, SkillMeta,
)
# Clamp page_size to safe range
@@ -138,7 +136,8 @@ def do_browse(page: int = 1, page_size: int = 20, source: str = "all", console:
# Collect results from all (or filtered) sources
# Use empty query to get everything; per-source limits prevent overload
_TRUST_RANK = {"builtin": 3, "trusted": 2, "community": 1}
_PER_SOURCE_LIMIT = {"official": 100, "github": 100, "clawhub": 50, "claude-marketplace": 50, "lobehub": 50}
_PER_SOURCE_LIMIT = {"official": 100, "github": 100, "clawhub": 50,
"claude-marketplace": 50, "lobehub": 50}
all_results: list = []
source_counts: dict = {}
@@ -169,13 +168,11 @@ def do_browse(page: int = 1, page_size: int = 20, source: str = "all", console:
deduped = list(seen.values())
# Sort: official first, then by trust level (desc), then alphabetically
deduped.sort(
key=lambda r: (
-_TRUST_RANK.get(r.trust_level, 0),
r.source != "official",
r.name.lower(),
)
)
deduped.sort(key=lambda r: (
-_TRUST_RANK.get(r.trust_level, 0),
r.source != "official",
r.name.lower(),
))
# Paginate
total = len(deduped)
@@ -190,7 +187,8 @@ def do_browse(page: int = 1, page_size: int = 20, source: str = "all", console:
# Build header
source_label = f"{source}" if source != "all" else "— all sources"
c.print(f"\n[bold]Skills Hub — Browse {source_label}[/] [dim]({total} skills, page {page}/{total_pages})[/]")
c.print(f"\n[bold]Skills Hub — Browse {source_label}[/]"
f" [dim]({total} skills, page {page}/{total_pages})[/]")
if official_count > 0 and page == 1:
c.print(f"[bright_cyan]★ {official_count} official optional skill(s) from Nous Research[/]")
c.print()
@@ -204,7 +202,8 @@ def do_browse(page: int = 1, page_size: int = 20, source: str = "all", console:
table.add_column("Trust", width=10)
for i, r in enumerate(page_items, start=start + 1):
trust_style = {"builtin": "bright_cyan", "trusted": "green", "community": "yellow"}.get(r.trust_level, "dim")
trust_style = {"builtin": "bright_cyan", "trusted": "green",
"community": "yellow"}.get(r.trust_level, "dim")
trust_label = "★ official" if r.source == "official" else r.trust_level
desc = r.description[:50]
@@ -236,22 +235,18 @@ def do_browse(page: int = 1, page_size: int = 20, source: str = "all", console:
parts = [f"{sid}: {ct}" for sid, ct in sorted(source_counts.items())]
c.print(f" [dim]Sources: {', '.join(parts)}[/]")
c.print(
"[dim]Use: hermes skills inspect <identifier> to preview, hermes skills install <identifier> to install[/]\n"
)
c.print("[dim]Use: hermes skills inspect <identifier> to preview, "
"hermes skills install <identifier> to install[/]\n")
def do_install(identifier: str, category: str = "", force: bool = False, console: Console | None = None) -> None:
def do_install(identifier: str, category: str = "", force: bool = False,
console: Optional[Console] = None) -> None:
"""Fetch, quarantine, scan, confirm, and install a skill."""
from tools.skills_guard import format_scan_report, scan_skill, should_allow_install
from tools.skills_hub import (
GitHubAuth,
HubLockFile,
create_source_router,
ensure_hub_dirs,
install_from_quarantine,
quarantine_bundle,
GitHubAuth, create_source_router, ensure_hub_dirs,
quarantine_bundle, install_from_quarantine, HubLockFile,
)
from tools.skills_guard import scan_skill, should_allow_install, format_scan_report
c = console or _console
ensure_hub_dirs()
@@ -309,43 +304,33 @@ def do_install(identifier: str, category: str = "", force: bool = False, console
# Clean up quarantine
shutil.rmtree(q_path, ignore_errors=True)
from tools.skills_hub import append_audit_log
append_audit_log(
"BLOCKED",
bundle.name,
bundle.source,
bundle.trust_level,
result.verdict,
f"{len(result.findings)}_findings",
)
append_audit_log("BLOCKED", bundle.name, bundle.source,
bundle.trust_level, result.verdict,
f"{len(result.findings)}_findings")
return
# Confirm with user — show appropriate warning based on source
if not force:
c.print()
if bundle.source == "official":
c.print(
Panel(
"[bold bright_cyan]This is an official optional skill maintained by Nous Research.[/]\n\n"
"It ships with hermes-agent but is not activated by default.\n"
"Installing will copy it to your skills directory where the agent can use it.\n\n"
f"Files will be at: [cyan]~/.hermes/skills/{category + '/' if category else ''}{bundle.name}/[/]",
title="Official Skill",
border_style="bright_cyan",
)
)
c.print(Panel(
"[bold bright_cyan]This is an official optional skill maintained by Nous Research.[/]\n\n"
"It ships with hermes-agent but is not activated by default.\n"
"Installing will copy it to your skills directory where the agent can use it.\n\n"
f"Files will be at: [cyan]~/.hermes/skills/{category + '/' if category else ''}{bundle.name}/[/]",
title="Official Skill",
border_style="bright_cyan",
))
else:
c.print(
Panel(
"[bold yellow]You are installing a third-party skill at your own risk.[/]\n\n"
"External skills can contain instructions that influence agent behavior,\n"
"shell commands, and scripts. Even after automated scanning, you should\n"
"review the installed files before use.\n\n"
f"Files will be at: [cyan]~/.hermes/skills/{category + '/' if category else ''}{bundle.name}/[/]",
title="Disclaimer",
border_style="yellow",
)
)
c.print(Panel(
"[bold yellow]You are installing a third-party skill at your own risk.[/]\n\n"
"External skills can contain instructions that influence agent behavior,\n"
"shell commands, and scripts. Even after automated scanning, you should\n"
"review the installed files before use.\n\n"
f"Files will be at: [cyan]~/.hermes/skills/{category + '/' if category else ''}{bundle.name}/[/]",
title="Disclaimer",
border_style="yellow",
))
c.print(f"[bold]Install '{bundle.name}'?[/]")
try:
answer = input("Confirm [y/N]: ").strip().lower()
@@ -359,12 +344,11 @@ def do_install(identifier: str, category: str = "", force: bool = False, console
# Install
install_dir = install_from_quarantine(q_path, bundle.name, category, bundle, result)
from tools.skills_hub import SKILLS_DIR
c.print(f"[bold green]Installed:[/] {install_dir.relative_to(SKILLS_DIR)}")
c.print(f"[dim]Files: {', '.join(bundle.files.keys())}[/]\n")
def do_inspect(identifier: str, console: Console | None = None) -> None:
def do_inspect(identifier: str, console: Optional[Console] = None) -> None:
"""Preview a skill's SKILL.md content without installing."""
from tools.skills_hub import GitHubAuth, create_source_router
@@ -422,13 +406,12 @@ def do_inspect(identifier: str, console: Console | None = None) -> None:
c.print()
def do_list(source_filter: str = "all", console: Console | None = None) -> None:
def do_list(source_filter: str = "all", console: Optional[Console] = None) -> None:
"""List installed skills, distinguishing builtins from hub-installed."""
from tools.skills_hub import HubLockFile, ensure_hub_dirs
from tools.skills_hub import HubLockFile, SKILLS_DIR
from tools.skills_tool import _find_all_skills
c = console or _console
ensure_hub_dirs()
lock = HubLockFile()
hub_installed = {e["name"]: e for e in lock.list_installed()}
@@ -462,13 +445,14 @@ def do_list(source_filter: str = "all", console: Console | None = None) -> None:
table.add_row(name, category, source_display, f"[{trust_style}]{trust_label}[/]")
c.print(table)
c.print(f"[dim]{len(hub_installed)} hub-installed, {len(all_skills) - len(hub_installed)} builtin[/]\n")
c.print(f"[dim]{len(hub_installed)} hub-installed, "
f"{len(all_skills) - len(hub_installed)} builtin[/]\n")
def do_audit(name: str | None = None, console: Console | None = None) -> None:
def do_audit(name: Optional[str] = None, console: Optional[Console] = None) -> None:
"""Re-run security scan on installed hub skills."""
from tools.skills_guard import format_scan_report, scan_skill
from tools.skills_hub import SKILLS_DIR, HubLockFile
from tools.skills_hub import HubLockFile, SKILLS_DIR
from tools.skills_guard import scan_skill, format_scan_report
c = console or _console
lock = HubLockFile()
@@ -498,7 +482,7 @@ def do_audit(name: str | None = None, console: Console | None = None) -> None:
c.print()
def do_uninstall(name: str, console: Console | None = None) -> None:
def do_uninstall(name: str, console: Optional[Console] = None) -> None:
"""Remove a hub-installed skill with confirmation."""
from tools.skills_hub import uninstall_skill
@@ -520,7 +504,7 @@ def do_uninstall(name: str, console: Console | None = None) -> None:
c.print(f"[bold red]Error:[/] {msg}\n")
def do_tap(action: str, repo: str = "", console: Console | None = None) -> None:
def do_tap(action: str, repo: str = "", console: Optional[Console] = None) -> None:
"""Manage taps (custom GitHub repo sources)."""
from tools.skills_hub import TapsManager
@@ -562,10 +546,11 @@ def do_tap(action: str, repo: str = "", console: Console | None = None) -> None:
c.print(f"[bold red]Unknown tap action:[/] {action}. Use: list, add, remove\n")
def do_publish(skill_path: str, target: str = "github", repo: str = "", console: Console | None = None) -> None:
def do_publish(skill_path: str, target: str = "github", repo: str = "",
console: Optional[Console] = None) -> None:
"""Publish a local skill to a registry (GitHub PR or ClawHub submission)."""
from tools.skills_guard import format_scan_report, scan_skill
from tools.skills_hub import SKILLS_DIR, GitHubAuth
from tools.skills_hub import GitHubAuth, SKILLS_DIR
from tools.skills_guard import scan_skill, format_scan_report
c = console or _console
path = Path(skill_path)
@@ -579,16 +564,14 @@ def do_publish(skill_path: str, target: str = "github", repo: str = "", console:
# Validate the skill
import yaml
skill_md = (path / "SKILL.md").read_text(encoding="utf-8")
fm = {}
if skill_md.startswith("---"):
import re
match = re.search(r"\n---\s*\n", skill_md[3:])
match = re.search(r'\n---\s*\n', skill_md[3:])
if match:
try:
fm = yaml.safe_load(skill_md[3 : match.start() + 3]) or {}
fm = yaml.safe_load(skill_md[3:match.start() + 3]) or {}
except yaml.YAMLError:
pass
@@ -608,18 +591,14 @@ def do_publish(skill_path: str, target: str = "github", repo: str = "", console:
if target == "github":
if not repo:
c.print(
"[bold red]Error:[/] --repo required for GitHub publish.\n"
"Usage: hermes skills publish <path> --to github --repo owner/repo\n"
)
c.print("[bold red]Error:[/] --repo required for GitHub publish.\n"
"Usage: hermes skills publish <path> --to github --repo owner/repo\n")
return
auth = GitHubAuth()
if not auth.is_authenticated():
c.print(
"[bold red]Error:[/] GitHub authentication required.\n"
"Set GITHUB_TOKEN in ~/.hermes/.env or run 'gh auth login'.\n"
)
c.print("[bold red]Error:[/] GitHub authentication required.\n"
"Set GITHUB_TOKEN in ~/.hermes/.env or run 'gh auth login'.\n")
return
c.print(f"[bold]Publishing '{name}' to {repo}...[/]")
@@ -630,12 +609,14 @@ def do_publish(skill_path: str, target: str = "github", repo: str = "", console:
c.print(f"[bold red]Error:[/] {msg}\n")
elif target == "clawhub":
c.print("[yellow]ClawHub publishing is not yet supported. Submit manually at https://clawhub.ai/submit[/]\n")
c.print("[yellow]ClawHub publishing is not yet supported. "
"Submit manually at https://clawhub.ai/submit[/]\n")
else:
c.print(f"[bold red]Unknown target:[/] {target}. Use 'github' or 'clawhub'.\n")
def _github_publish(skill_path: Path, skill_name: str, target_repo: str, auth) -> tuple:
def _github_publish(skill_path: Path, skill_name: str, target_repo: str,
auth) -> tuple:
"""Create a PR to a GitHub repo with the skill. Returns (success, message)."""
import httpx
@@ -645,8 +626,7 @@ def _github_publish(skill_path: Path, skill_name: str, target_repo: str, auth) -
try:
resp = httpx.post(
f"https://api.github.com/repos/{target_repo}/forks",
headers=headers,
timeout=30,
headers=headers, timeout=30,
)
if resp.status_code in (200, 202):
fork = resp.json()
@@ -662,8 +642,7 @@ def _github_publish(skill_path: Path, skill_name: str, target_repo: str, auth) -
try:
resp = httpx.get(
f"https://api.github.com/repos/{target_repo}",
headers=headers,
timeout=15,
headers=headers, timeout=15,
)
default_branch = resp.json().get("default_branch", "main")
except Exception:
@@ -673,8 +652,7 @@ def _github_publish(skill_path: Path, skill_name: str, target_repo: str, auth) -
try:
resp = httpx.get(
f"https://api.github.com/repos/{fork_repo}/git/refs/heads/{default_branch}",
headers=headers,
timeout=15,
headers=headers, timeout=15,
)
base_sha = resp.json()["object"]["sha"]
except Exception as e:
@@ -685,8 +663,7 @@ def _github_publish(skill_path: Path, skill_name: str, target_repo: str, auth) -
try:
httpx.post(
f"https://api.github.com/repos/{fork_repo}/git/refs",
headers=headers,
timeout=15,
headers=headers, timeout=15,
json={"ref": f"refs/heads/{branch_name}", "sha": base_sha},
)
except Exception as e:
@@ -700,12 +677,10 @@ def _github_publish(skill_path: Path, skill_name: str, target_repo: str, auth) -
upload_path = f"skills/{skill_name}/{rel}"
try:
import base64
content_b64 = base64.b64encode(f.read_bytes()).decode()
httpx.put(
f"https://api.github.com/repos/{fork_repo}/contents/{upload_path}",
headers=headers,
timeout=15,
headers=headers, timeout=15,
json={
"message": f"Add {skill_name} skill: {rel}",
"content": content_b64,
@@ -719,12 +694,11 @@ def _github_publish(skill_path: Path, skill_name: str, target_repo: str, auth) -
try:
resp = httpx.post(
f"https://api.github.com/repos/{target_repo}/pulls",
headers=headers,
timeout=15,
headers=headers, timeout=15,
json={
"title": f"Add skill: {skill_name}",
"body": f"Submitting the `{skill_name}` skill via Hermes Skills Hub.\n\n"
f"This skill was scanned by the Hermes Skills Guard before submission.",
f"This skill was scanned by the Hermes Skills Guard before submission.",
"head": f"{fork_repo.split('/')[0]}:{branch_name}",
"base": default_branch,
},
@@ -738,7 +712,7 @@ def _github_publish(skill_path: Path, skill_name: str, target_repo: str, auth) -
return False, f"Network error creating PR: {e}"
def do_snapshot_export(output_path: str, console: Console | None = None) -> None:
def do_snapshot_export(output_path: str, console: Optional[Console] = None) -> None:
"""Export current hub skill configuration to a portable JSON file."""
from tools.skills_hub import HubLockFile, TapsManager
@@ -751,15 +725,16 @@ def do_snapshot_export(output_path: str, console: Console | None = None) -> None
snapshot = {
"hermes_version": "0.1.0",
"exported_at": __import__("datetime").datetime.now(__import__("datetime").timezone.utc).isoformat(),
"exported_at": __import__("datetime").datetime.now(
__import__("datetime").timezone.utc
).isoformat(),
"skills": [
{
"name": entry["name"],
"source": entry.get("source", ""),
"identifier": entry.get("identifier", ""),
"category": str(Path(entry.get("install_path", "")).parent)
if "/" in entry.get("install_path", "")
else "",
if "/" in entry.get("install_path", "") else "",
}
for entry in installed
],
@@ -772,7 +747,8 @@ def do_snapshot_export(output_path: str, console: Console | None = None) -> None
c.print(f"[dim]{len(installed)} skill(s), {len(tap_list)} tap(s)[/]\n")
def do_snapshot_import(input_path: str, force: bool = False, console: Console | None = None) -> None:
def do_snapshot_import(input_path: str, force: bool = False,
console: Optional[Console] = None) -> None:
"""Re-install skills from a snapshot file."""
from tools.skills_hub import TapsManager
@@ -822,7 +798,6 @@ def do_snapshot_import(input_path: str, force: bool = False, console: Console |
# CLI argparse entry point
# ---------------------------------------------------------------------------
def skills_command(args) -> None:
"""Router for `hermes skills <subcommand>` — called from hermes_cli/main.py."""
action = getattr(args, "skills_action", None)
@@ -863,9 +838,7 @@ def skills_command(args) -> None:
return
do_tap(tap_action, repo=repo)
else:
_console.print(
"Usage: hermes skills [browse|search|install|inspect|list|audit|uninstall|publish|snapshot|tap]\n"
)
_console.print("Usage: hermes skills [browse|search|install|inspect|list|audit|uninstall|publish|snapshot|tap]\n")
_console.print("Run 'hermes skills <command> --help' for details.\n")
@@ -873,8 +846,7 @@ def skills_command(args) -> None:
# Slash command entry point (/skills in chat)
# ---------------------------------------------------------------------------
def handle_skills_slash(cmd: str, console: Console | None = None) -> None:
def handle_skills_slash(cmd: str, console: Optional[Console] = None) -> None:
"""
Parse and dispatch `/skills <subcommand> [args]` from the chat interface.
@@ -1035,19 +1007,17 @@ def handle_skills_slash(cmd: str, console: Console | None = None) -> None:
def _print_skills_help(console: Console) -> None:
"""Print help for the /skills slash command."""
console.print(
Panel(
"[bold]Skills Hub Commands:[/]\n\n"
" [cyan]browse[/] [--source official] Browse all available skills (paginated)\n"
" [cyan]search[/] <query> Search registries for skills\n"
" [cyan]install[/] <identifier> Install a skill (with security scan)\n"
" [cyan]inspect[/] <identifier> Preview a skill without installing\n"
" [cyan]list[/] [--source hub|builtin] List installed skills\n"
" [cyan]audit[/] [name] Re-scan hub skills for security\n"
" [cyan]uninstall[/] <name> Remove a hub-installed skill\n"
" [cyan]publish[/] <path> --repo <r> Publish a skill to GitHub via PR\n"
" [cyan]snapshot[/] export|import Export/import skill configurations\n"
" [cyan]tap[/] list|add|remove Manage skill sources\n",
title="/skills",
)
)
console.print(Panel(
"[bold]Skills Hub Commands:[/]\n\n"
" [cyan]browse[/] [--source official] Browse all available skills (paginated)\n"
" [cyan]search[/] <query> Search registries for skills\n"
" [cyan]install[/] <identifier> Install a skill (with security scan)\n"
" [cyan]inspect[/] <identifier> Preview a skill without installing\n"
" [cyan]list[/] [--source hub|builtin] List installed skills\n"
" [cyan]audit[/] [name] Re-scan hub skills for security\n"
" [cyan]uninstall[/] <name> Remove a hub-installed skill\n"
" [cyan]publish[/] <path> --repo <r> Publish a skill to GitHub via PR\n"
" [cyan]snapshot[/] export|import Export/import skill configurations\n"
" [cyan]tap[/] list|add|remove Manage skill sources\n",
title="/skills",
))

View File

@@ -5,25 +5,21 @@ Shows the status of all Hermes Agent components.
"""
import os
import subprocess
import sys
import subprocess
from pathlib import Path
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
from datetime import UTC
from hermes_cli.colors import Colors, color
from hermes_cli.config import get_env_path, get_env_value
from hermes_constants import OPENROUTER_MODELS_URL
def check_mark(ok: bool) -> str:
if ok:
return color("", Colors.GREEN)
return color("", Colors.RED)
def redact_key(key: str) -> str:
"""Redact an API key for display."""
if not key:
@@ -37,8 +33,7 @@ def _format_iso_timestamp(value) -> str:
"""Format ISO timestamps for status output, converting to local timezone."""
if not value or not isinstance(value, str):
return "(unknown)"
from datetime import datetime
from datetime import datetime, timezone
text = value.strip()
if not text:
return "(unknown)"
@@ -47,7 +42,7 @@ def _format_iso_timestamp(value) -> str:
try:
parsed = datetime.fromisoformat(text)
if parsed.tzinfo is None:
parsed = parsed.replace(tzinfo=UTC)
parsed = parsed.replace(tzinfo=timezone.utc)
except Exception:
return value
return parsed.astimezone().strftime("%Y-%m-%d %H:%M:%S %Z")
@@ -55,14 +50,14 @@ def _format_iso_timestamp(value) -> str:
def show_status(args):
"""Show status of all Hermes Agent components."""
show_all = getattr(args, "all", False)
deep = getattr(args, "deep", False)
show_all = getattr(args, 'all', False)
deep = getattr(args, 'deep', False)
print()
print(color("┌─────────────────────────────────────────────────────────┐", Colors.CYAN))
print(color("│ ⚕ Hermes Agent Status │", Colors.CYAN))
print(color("└─────────────────────────────────────────────────────────┘", Colors.CYAN))
# =========================================================================
# Environment
# =========================================================================
@@ -70,19 +65,19 @@ def show_status(args):
print(color("◆ Environment", Colors.CYAN, Colors.BOLD))
print(f" Project: {PROJECT_ROOT}")
print(f" Python: {sys.version.split()[0]}")
env_path = get_env_path()
print(f" .env file: {check_mark(env_path.exists())} {'exists' if env_path.exists() else 'not found'}")
# =========================================================================
# API Keys
# =========================================================================
print()
print(color("◆ API Keys", Colors.CYAN, Colors.BOLD))
keys = {
"OpenRouter": "OPENROUTER_API_KEY",
"Anthropic": "ANTHROPIC_API_KEY",
"Anthropic": "ANTHROPIC_API_KEY",
"OpenAI": "OPENAI_API_KEY",
"Z.AI/GLM": "GLM_API_KEY",
"Kimi": "KIMI_API_KEY",
@@ -96,7 +91,7 @@ def show_status(args):
"ElevenLabs": "ELEVENLABS_API_KEY",
"GitHub": "GITHUB_TOKEN",
}
for name, env_var in keys.items():
value = get_env_value(env_var) or ""
has_key = bool(value)
@@ -110,8 +105,7 @@ def show_status(args):
print(color("◆ Auth Providers", Colors.CYAN, Colors.BOLD))
try:
from hermes_cli.auth import get_codex_auth_status, get_nous_auth_status
from hermes_cli.auth import get_nous_auth_status, get_codex_auth_status
nous_status = get_nous_auth_status()
codex_status = get_codex_auth_status()
except Exception:
@@ -154,10 +148,10 @@ def show_status(args):
print(color("◆ API-Key Providers", Colors.CYAN, Colors.BOLD))
apikey_providers = {
"Z.AI / GLM": ("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"),
"Kimi / Moonshot": ("KIMI_API_KEY",),
"MiniMax": ("MINIMAX_API_KEY",),
"MiniMax (China)": ("MINIMAX_CN_API_KEY",),
"Z.AI / GLM": ("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"),
"Kimi / Moonshot": ("KIMI_API_KEY",),
"MiniMax": ("MINIMAX_API_KEY",),
"MiniMax (China)": ("MINIMAX_CN_API_KEY",),
}
for pname, env_vars in apikey_providers.items():
key_val = ""
@@ -174,20 +168,19 @@ def show_status(args):
# =========================================================================
print()
print(color("◆ Terminal Backend", Colors.CYAN, Colors.BOLD))
terminal_env = os.getenv("TERMINAL_ENV", "")
if not terminal_env:
# Fall back to config file value when env var isn't set
# (hermes status doesn't go through cli.py's config loading)
try:
from hermes_cli.config import load_config
_cfg = load_config()
terminal_env = _cfg.get("terminal", {}).get("backend", "local")
except Exception:
terminal_env = "local"
print(f" Backend: {terminal_env}")
if terminal_env == "ssh":
ssh_host = os.getenv("TERMINAL_SSH_HOST", "")
ssh_user = os.getenv("TERMINAL_SSH_USER", "")
@@ -199,69 +192,74 @@ def show_status(args):
elif terminal_env == "daytona":
daytona_image = os.getenv("TERMINAL_DAYTONA_IMAGE", "nikolaik/python-nodejs:python3.11-nodejs20")
print(f" Daytona Image: {daytona_image}")
sudo_password = os.getenv("SUDO_PASSWORD", "")
print(f" Sudo: {check_mark(bool(sudo_password))} {'enabled' if sudo_password else 'disabled'}")
# =========================================================================
# Messaging Platforms
# =========================================================================
print()
print(color("◆ Messaging Platforms", Colors.CYAN, Colors.BOLD))
platforms = {
"Telegram": ("TELEGRAM_BOT_TOKEN", "TELEGRAM_HOME_CHANNEL"),
"Discord": ("DISCORD_BOT_TOKEN", "DISCORD_HOME_CHANNEL"),
"WhatsApp": ("WHATSAPP_ENABLED", None),
"Signal": ("SIGNAL_HTTP_URL", "SIGNAL_HOME_CHANNEL"),
"Slack": ("SLACK_BOT_TOKEN", None),
}
for name, (token_var, home_var) in platforms.items():
token = os.getenv(token_var, "")
has_token = bool(token)
home_channel = ""
if home_var:
home_channel = os.getenv(home_var, "")
status = "configured" if has_token else "not configured"
if home_channel:
status += f" (home: {home_channel})"
print(f" {name:<12} {check_mark(has_token)} {status}")
# =========================================================================
# Gateway Status
# =========================================================================
print()
print(color("◆ Gateway Service", Colors.CYAN, Colors.BOLD))
if sys.platform.startswith("linux"):
result = subprocess.run(["systemctl", "--user", "is-active", "hermes-gateway"], capture_output=True, text=True)
if sys.platform.startswith('linux'):
result = subprocess.run(
["systemctl", "--user", "is-active", "hermes-gateway"],
capture_output=True,
text=True
)
is_active = result.stdout.strip() == "active"
print(f" Status: {check_mark(is_active)} {'running' if is_active else 'stopped'}")
print(" Manager: systemd (user)")
elif sys.platform == "darwin":
result = subprocess.run(["launchctl", "list", "ai.hermes.gateway"], capture_output=True, text=True)
print(f" Manager: systemd (user)")
elif sys.platform == 'darwin':
result = subprocess.run(
["launchctl", "list", "ai.hermes.gateway"],
capture_output=True,
text=True
)
is_loaded = result.returncode == 0
print(f" Status: {check_mark(is_loaded)} {'loaded' if is_loaded else 'not loaded'}")
print(" Manager: launchd")
print(f" Manager: launchd")
else:
print(f" Status: {color('N/A', Colors.DIM)}")
print(" Manager: (not supported on this platform)")
print(f" Manager: (not supported on this platform)")
# =========================================================================
# Cron Jobs
# =========================================================================
print()
print(color("◆ Scheduled Jobs", Colors.CYAN, Colors.BOLD))
jobs_file = Path.home() / ".hermes" / "cron" / "jobs.json"
if jobs_file.exists():
import json
try:
with open(jobs_file) as f:
data = json.load(f)
@@ -269,57 +267,56 @@ def show_status(args):
enabled_jobs = [j for j in jobs if j.get("enabled", True)]
print(f" Jobs: {len(enabled_jobs)} active, {len(jobs)} total")
except Exception:
print(" Jobs: (error reading jobs file)")
print(f" Jobs: (error reading jobs file)")
else:
print(" Jobs: 0")
print(f" Jobs: 0")
# =========================================================================
# Sessions
# =========================================================================
print()
print(color("◆ Sessions", Colors.CYAN, Colors.BOLD))
sessions_file = Path.home() / ".hermes" / "sessions" / "sessions.json"
if sessions_file.exists():
import json
try:
with open(sessions_file) as f:
data = json.load(f)
print(f" Active: {len(data)} session(s)")
except Exception:
print(" Active: (error reading sessions file)")
print(f" Active: (error reading sessions file)")
else:
print(" Active: 0")
print(f" Active: 0")
# =========================================================================
# Deep checks
# =========================================================================
if deep:
print()
print(color("◆ Deep Checks", Colors.CYAN, Colors.BOLD))
# Check OpenRouter connectivity
openrouter_key = os.getenv("OPENROUTER_API_KEY", "")
if openrouter_key:
try:
import httpx
response = httpx.get(
OPENROUTER_MODELS_URL, headers={"Authorization": f"Bearer {openrouter_key}"}, timeout=10
OPENROUTER_MODELS_URL,
headers={"Authorization": f"Bearer {openrouter_key}"},
timeout=10
)
ok = response.status_code == 200
print(f" OpenRouter: {check_mark(ok)} {'reachable' if ok else f'error ({response.status_code})'}")
except Exception as e:
print(f" OpenRouter: {check_mark(False)} error: {e}")
# Check gateway port
try:
import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(1)
result = sock.connect_ex(("127.0.0.1", 18789))
result = sock.connect_ex(('127.0.0.1', 18789))
sock.close()
# Port in use = gateway likely running
port_in_use = result == 0
@@ -327,7 +324,7 @@ def show_status(args):
print(f" Port 18789: {'in use' if port_in_use else 'available'}")
except OSError:
pass
print()
print(color("" * 60, Colors.DIM))
print(color(" Run 'hermes doctor' for detailed diagnostics", Colors.DIM))

View File

@@ -11,37 +11,33 @@ the `platform_toolsets` key.
import sys
from pathlib import Path
from typing import Dict, List, Set
import os
from hermes_cli.colors import Colors, color
from hermes_cli.config import (
get_env_value,
load_config,
save_config,
save_env_value,
load_config, save_config, get_env_value, save_env_value,
get_hermes_home,
)
from hermes_cli.colors import Colors, color
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
# ─── UI Helpers (shared with setup.py) ────────────────────────────────────────
def _print_info(text: str):
print(color(f" {text}", Colors.DIM))
def _print_success(text: str):
print(color(f"{text}", Colors.GREEN))
def _print_warning(text: str):
print(color(f"{text}", Colors.YELLOW))
def _print_error(text: str):
print(color(f"{text}", Colors.RED))
def _prompt(question: str, default: str = None, password: bool = False) -> str:
if default:
display = f"{question} [{default}]: "
@@ -50,7 +46,6 @@ def _prompt(question: str, default: str = None, password: bool = False) -> str:
try:
if password:
import getpass
value = getpass.getpass(color(display, Colors.YELLOW))
else:
value = input(color(display, Colors.YELLOW))
@@ -59,7 +54,6 @@ def _prompt(question: str, default: str = None, password: bool = False) -> str:
print()
return default or ""
def _prompt_yes_no(question: str, default: bool = True) -> bool:
default_str = "Y/n" if default else "y/N"
while True:
@@ -70,9 +64,9 @@ def _prompt_yes_no(question: str, default: bool = True) -> bool:
return default
if not value:
return default
if value in ("y", "yes"):
if value in ('y', 'yes'):
return True
if value in ("n", "no"):
if value in ('n', 'no'):
return False
@@ -82,38 +76,33 @@ def _prompt_yes_no(question: str, default: bool = True) -> bool:
# Each entry: (toolset_name, label, description)
# These map to keys in toolsets.py TOOLSETS dict.
CONFIGURABLE_TOOLSETS = [
("web", "🔍 Web Search & Scraping", "web_search, web_extract"),
("browser", "🌐 Browser Automation", "navigate, click, type, scroll"),
("terminal", "💻 Terminal & Processes", "terminal, process"),
("file", "📁 File Operations", "read, write, patch, search"),
("code_execution", "⚡ Code Execution", "execute_code"),
("vision", "👁️ Vision / Image Analysis", "vision_analyze"),
("image_gen", "🎨 Image Generation", "image_generate"),
("moa", "🧠 Mixture of Agents", "mixture_of_agents"),
("tts", "🔊 Text-to-Speech", "text_to_speech"),
("skills", "📚 Skills", "list, view, manage"),
("todo", "📋 Task Planning", "todo"),
("memory", "💾 Memory", "persistent memory across sessions"),
("session_search", "🔎 Session Search", "search past conversations"),
("clarify", "❓ Clarifying Questions", "clarify"),
("delegation", "👥 Task Delegation", "delegate_task"),
("cronjob", "⏰ Cron Jobs", "schedule, list, remove"),
("rl", "🧪 RL Training", "Tinker-Atropos training tools"),
("homeassistant", "🏠 Home Assistant", "smart home device control"),
("web", "🔍 Web Search & Scraping", "web_search, web_extract"),
("browser", "🌐 Browser Automation", "navigate, click, type, scroll"),
("terminal", "💻 Terminal & Processes", "terminal, process"),
("file", "📁 File Operations", "read, write, patch, search"),
("code_execution", "⚡ Code Execution", "execute_code"),
("vision", "👁️ Vision / Image Analysis", "vision_analyze"),
("image_gen", "🎨 Image Generation", "image_generate"),
("moa", "🧠 Mixture of Agents", "mixture_of_agents"),
("tts", "🔊 Text-to-Speech", "text_to_speech"),
("skills", "📚 Skills", "list, view, manage"),
("todo", "📋 Task Planning", "todo"),
("memory", "💾 Memory", "persistent memory across sessions"),
("session_search", "🔎 Session Search", "search past conversations"),
("clarify", "❓ Clarifying Questions", "clarify"),
("delegation", "👥 Task Delegation", "delegate_task"),
("cronjob", "⏰ Cron Jobs", "schedule, list, remove"),
("rl", "🧪 RL Training", "Tinker-Atropos training tools"),
("homeassistant", "🏠 Home Assistant", "smart home device control"),
]
# Toolsets that are OFF by default for new installs.
# They're still in _HERMES_CORE_TOOLS (available at runtime if enabled),
# but the setup checklist won't pre-select them for first-time users.
_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl"}
# Platform display config
PLATFORMS = {
"cli": {"label": "🖥️ CLI", "default_toolset": "hermes-cli"},
"telegram": {"label": "📱 Telegram", "default_toolset": "hermes-telegram"},
"discord": {"label": "💬 Discord", "default_toolset": "hermes-discord"},
"slack": {"label": "💼 Slack", "default_toolset": "hermes-slack"},
"whatsapp": {"label": "📱 WhatsApp", "default_toolset": "hermes-whatsapp"},
"cli": {"label": "🖥️ CLI", "default_toolset": "hermes-cli"},
"telegram": {"label": "📱 Telegram", "default_toolset": "hermes-telegram"},
"discord": {"label": "💬 Discord", "default_toolset": "hermes-discord"},
"slack": {"label": "💼 Slack", "default_toolset": "hermes-slack"},
"whatsapp": {"label": "📱 WhatsApp", "default_toolset": "hermes-whatsapp"},
}
@@ -137,11 +126,7 @@ TOOL_CATEGORIES = {
"name": "OpenAI TTS",
"tag": "Premium - high quality voices",
"env_vars": [
{
"key": "VOICE_TOOLS_OPENAI_KEY",
"prompt": "OpenAI API key",
"url": "https://platform.openai.com/api-keys",
},
{"key": "VOICE_TOOLS_OPENAI_KEY", "prompt": "OpenAI API key", "url": "https://platform.openai.com/api-keys"},
],
"tts_provider": "openai",
},
@@ -149,11 +134,7 @@ TOOL_CATEGORIES = {
"name": "ElevenLabs",
"tag": "Premium - most natural voices",
"env_vars": [
{
"key": "ELEVENLABS_API_KEY",
"prompt": "ElevenLabs API key",
"url": "https://elevenlabs.io/app/settings/api-keys",
},
{"key": "ELEVENLABS_API_KEY", "prompt": "ElevenLabs API key", "url": "https://elevenlabs.io/app/settings/api-keys"},
],
"tts_provider": "elevenlabs",
},
@@ -161,8 +142,6 @@ TOOL_CATEGORIES = {
},
"web": {
"name": "Web Search & Extract",
"setup_title": "Select Search Provider",
"setup_note": "A free DuckDuckGo search skill is also included — skip this if you don't need Firecrawl.",
"icon": "🔍",
"providers": [
{
@@ -238,11 +217,7 @@ TOOL_CATEGORIES = {
"name": "Tinker / Atropos",
"tag": "RL training platform",
"env_vars": [
{
"key": "TINKER_API_KEY",
"prompt": "Tinker API key",
"url": "https://tinker-console.thinkingmachines.ai/keys",
},
{"key": "TINKER_API_KEY", "prompt": "Tinker API key", "url": "https://tinker-console.thinkingmachines.ai/keys"},
{"key": "WANDB_API_KEY", "prompt": "WandB API key", "url": "https://wandb.ai/authorize"},
],
"post_setup": "rl_training",
@@ -254,26 +229,24 @@ TOOL_CATEGORIES = {
# Simple env-var requirements for toolsets NOT in TOOL_CATEGORIES.
# Used as a fallback for tools like vision/moa that just need an API key.
TOOLSET_ENV_REQUIREMENTS = {
"vision": [("OPENROUTER_API_KEY", "https://openrouter.ai/keys")],
"moa": [("OPENROUTER_API_KEY", "https://openrouter.ai/keys")],
"vision": [("OPENROUTER_API_KEY", "https://openrouter.ai/keys")],
"moa": [("OPENROUTER_API_KEY", "https://openrouter.ai/keys")],
}
# ─── Post-Setup Hooks ─────────────────────────────────────────────────────────
def _run_post_setup(post_setup_key: str):
"""Run post-setup hooks for tools that need extra installation steps."""
import shutil
if post_setup_key == "browserbase":
node_modules = PROJECT_ROOT / "node_modules" / "agent-browser"
if not node_modules.exists() and shutil.which("npm"):
_print_info(" Installing Node.js dependencies for browser tools...")
import subprocess
result = subprocess.run(
["npm", "install", "--silent"], capture_output=True, text=True, cwd=str(PROJECT_ROOT)
["npm", "install", "--silent"],
capture_output=True, text=True, cwd=str(PROJECT_ROOT)
)
if result.returncode == 0:
_print_success(" Node.js dependencies installed")
@@ -290,17 +263,16 @@ def _run_post_setup(post_setup_key: str):
if tinker_dir.exists() and (tinker_dir / "pyproject.toml").exists():
_print_info(" Installing tinker-atropos submodule...")
import subprocess
uv_bin = shutil.which("uv")
if uv_bin:
result = subprocess.run(
[uv_bin, "pip", "install", "--python", sys.executable, "-e", str(tinker_dir)],
capture_output=True,
text=True,
capture_output=True, text=True
)
else:
result = subprocess.run(
[sys.executable, "-m", "pip", "install", "-e", str(tinker_dir)], capture_output=True, text=True
[sys.executable, "-m", "pip", "install", "-e", str(tinker_dir)],
capture_output=True, text=True
)
if result.returncode == 0:
_print_success(" tinker-atropos installed")
@@ -315,8 +287,7 @@ def _run_post_setup(post_setup_key: str):
# ─── Platform / Toolset Helpers ───────────────────────────────────────────────
def _get_enabled_platforms() -> list[str]:
def _get_enabled_platforms() -> List[str]:
"""Return platform keys that are configured (have tokens or are CLI)."""
enabled = ["cli"]
if get_env_value("TELEGRAM_BOT_TOKEN"):
@@ -330,14 +301,14 @@ def _get_enabled_platforms() -> list[str]:
return enabled
def _get_platform_tools(config: dict, platform: str) -> set[str]:
def _get_platform_tools(config: dict, platform: str) -> Set[str]:
"""Resolve which individual toolset names are enabled for a platform."""
from toolsets import resolve_toolset
from toolsets import resolve_toolset, TOOLSETS
platform_toolsets = config.get("platform_toolsets", {})
toolset_names = platform_toolsets.get(platform)
if toolset_names is None or not isinstance(toolset_names, list):
if not toolset_names or not isinstance(toolset_names, list):
default_ts = PLATFORMS[platform]["default_toolset"]
toolset_names = [default_ts]
@@ -357,7 +328,7 @@ def _get_platform_tools(config: dict, platform: str) -> set[str]:
return enabled_toolsets
def _save_platform_tools(config: dict, platform: str, enabled_toolset_keys: set[str]):
def _save_platform_tools(config: dict, platform: str, enabled_toolset_keys: Set[str]):
"""Save the selected toolset keys for a platform to config."""
config.setdefault("platform_toolsets", {})
config["platform_toolsets"][platform] = sorted(enabled_toolset_keys)
@@ -386,92 +357,47 @@ def _toolset_has_keys(ts_key: str) -> bool:
# ─── Menu Helpers ─────────────────────────────────────────────────────────────
def _prompt_choice(question: str, choices: list, default: int = 0) -> int:
"""Single-select menu (arrow keys). Uses curses to avoid simple_term_menu
rendering bugs in tmux, iTerm, and other non-standard terminals."""
# Curses-based single-select — works in tmux, iTerm, and standard terminals
try:
import curses
result_holder = [default]
def _curses_menu(stdscr):
curses.curs_set(0)
if curses.has_colors():
curses.start_color()
curses.use_default_colors()
curses.init_pair(1, curses.COLOR_GREEN, -1)
curses.init_pair(2, curses.COLOR_YELLOW, -1)
cursor = default
while True:
stdscr.clear()
max_y, max_x = stdscr.getmaxyx()
try:
stdscr.addnstr(
0, 0, question, max_x - 1, curses.A_BOLD | (curses.color_pair(2) if curses.has_colors() else 0)
)
except curses.error:
pass
for i, c in enumerate(choices):
y = i + 2
if y >= max_y - 1:
break
arrow = "" if i == cursor else " "
line = f" {arrow} {c}"
attr = curses.A_NORMAL
if i == cursor:
attr = curses.A_BOLD
if curses.has_colors():
attr |= curses.color_pair(1)
try:
stdscr.addnstr(y, 0, line, max_x - 1, attr)
except curses.error:
pass
stdscr.refresh()
key = stdscr.getch()
if key in (curses.KEY_UP, ord("k")):
cursor = (cursor - 1) % len(choices)
elif key in (curses.KEY_DOWN, ord("j")):
cursor = (cursor + 1) % len(choices)
elif key in (curses.KEY_ENTER, 10, 13):
result_holder[0] = cursor
return
elif key in (27, ord("q")):
return
curses.wrapper(_curses_menu)
return result_holder[0]
except Exception:
pass
# Fallback: numbered input (Windows without curses, etc.)
"""Single-select menu (arrow keys)."""
print(color(question, Colors.YELLOW))
for i, c in enumerate(choices):
marker = "" if i == default else ""
style = Colors.GREEN if i == default else ""
print(color(f" {marker} {i + 1}. {c}", style) if style else f" {marker} {i + 1}. {c}")
while True:
try:
val = input(color(f" Select [1-{len(choices)}] ({default + 1}): ", Colors.DIM))
if not val:
return default
idx = int(val) - 1
if 0 <= idx < len(choices):
return idx
except (ValueError, KeyboardInterrupt, EOFError):
print()
try:
from simple_term_menu import TerminalMenu
menu = TerminalMenu(
[f" {c}" for c in choices],
cursor_index=default,
menu_cursor="",
menu_cursor_style=("fg_green", "bold"),
menu_highlight_style=("fg_green",),
cycle_cursor=True,
clear_screen=False,
)
idx = menu.show()
if idx is None:
return default
print()
return idx
except (ImportError, NotImplementedError):
for i, c in enumerate(choices):
marker = "" if i == default else ""
style = Colors.GREEN if i == default else ""
print(color(f" {marker} {c}", style) if style else f" {marker} {c}")
while True:
try:
val = input(color(f" Select [1-{len(choices)}] ({default + 1}): ", Colors.DIM))
if not val:
return default
idx = int(val) - 1
if 0 <= idx < len(choices):
return idx
except (ValueError, KeyboardInterrupt, EOFError):
print()
return default
def _prompt_toolset_checklist(platform_label: str, enabled: set[str]) -> set[str]:
def _prompt_toolset_checklist(platform_label: str, enabled: Set[str]) -> Set[str]:
"""Multi-select checklist of toolsets. Returns set of selected toolset keys."""
import platform as _platform
labels = []
for ts_key, ts_label, ts_desc in CONFIGURABLE_TOOLSETS:
@@ -480,13 +406,55 @@ def _prompt_toolset_checklist(platform_label: str, enabled: set[str]) -> set[str
suffix = " [no API key]"
labels.append(f"{ts_label} ({ts_desc}){suffix}")
pre_selected_indices = [i for i, (ts_key, _, _) in enumerate(CONFIGURABLE_TOOLSETS) if ts_key in enabled]
pre_selected_indices = [
i for i, (ts_key, _, _) in enumerate(CONFIGURABLE_TOOLSETS)
if ts_key in enabled
]
# simple_term_menu multi-select has rendering bugs on macOS terminals,
# so we use a curses-based fallback there.
use_term_menu = _platform.system() != "Darwin"
if use_term_menu:
try:
from simple_term_menu import TerminalMenu
print(color(f"Tools for {platform_label}", Colors.YELLOW))
print(color(" SPACE to toggle, ENTER to confirm.", Colors.DIM))
print()
menu_items = [f" {label}" for label in labels]
menu = TerminalMenu(
menu_items,
multi_select=True,
show_multi_select_hint=False,
multi_select_cursor="[✓] ",
multi_select_select_on_accept=False,
multi_select_empty_ok=True,
preselected_entries=pre_selected_indices if pre_selected_indices else None,
menu_cursor="",
menu_cursor_style=("fg_green", "bold"),
menu_highlight_style=("fg_green",),
cycle_cursor=True,
clear_screen=False,
clear_menu_on_exit=False,
)
menu.show()
if menu.chosen_menu_entries is None:
return enabled
selected_indices = list(menu.chosen_menu_indices or [])
return {CONFIGURABLE_TOOLSETS[i][0] for i in selected_indices}
except (ImportError, NotImplementedError):
pass # fall through to curses/numbered fallback
# Curses-based multi-select — arrow keys + space to toggle + enter to confirm.
# simple_term_menu has rendering bugs in tmux, iTerm, and other terminals.
# Used on macOS (where simple_term_menu ghosts) and as a fallback.
try:
import curses
selected = set(pre_selected_indices)
result_holder = [None]
@@ -506,13 +474,7 @@ def _prompt_toolset_checklist(platform_label: str, enabled: set[str]) -> set[str
max_y, max_x = stdscr.getmaxyx()
header = f"Tools for {platform_label} — ↑↓ navigate, SPACE toggle, ENTER confirm"
try:
stdscr.addnstr(
0,
0,
header,
max_x - 1,
curses.A_BOLD | curses.color_pair(2) if curses.has_colors() else curses.A_BOLD,
)
stdscr.addnstr(0, 0, header, max_x - 1, curses.A_BOLD | curses.color_pair(2) if curses.has_colors() else curses.A_BOLD)
except curses.error:
pass
@@ -543,11 +505,11 @@ def _prompt_toolset_checklist(platform_label: str, enabled: set[str]) -> set[str
stdscr.refresh()
key = stdscr.getch()
if key in (curses.KEY_UP, ord("k")):
if key in (curses.KEY_UP, ord('k')):
cursor = (cursor - 1) % len(labels)
elif key in (curses.KEY_DOWN, ord("j")):
elif key in (curses.KEY_DOWN, ord('j')):
cursor = (cursor + 1) % len(labels)
elif key == ord(" "):
elif key == ord(' '):
if cursor in selected:
selected.discard(cursor)
else:
@@ -555,7 +517,7 @@ def _prompt_toolset_checklist(platform_label: str, enabled: set[str]) -> set[str
elif key in (curses.KEY_ENTER, 10, 13):
result_holder[0] = {CONFIGURABLE_TOOLSETS[i][0] for i in selected}
return
elif key in (27, ord("q")): # ESC or q
elif key in (27, ord('q')): # ESC or q
result_holder[0] = enabled
return
@@ -594,10 +556,9 @@ def _prompt_toolset_checklist(platform_label: str, enabled: set[str]) -> set[str
# ─── Provider-Aware Configuration ────────────────────────────────────────────
def _configure_toolset(ts_key: str, config: dict):
"""Configure a toolset - provider selection + API keys.
Uses TOOL_CATEGORIES for provider-aware config, falls back to simple
env var prompts for toolsets not in TOOL_CATEGORIES.
"""
@@ -621,9 +582,7 @@ def _configure_tool_category(ts_key: str, cat: dict, config: dict):
req = cat["requires_python"]
if sys.version_info < req:
print()
_print_error(
f" {name} requires Python {req[0]}.{req[1]}+ (current: {sys.version_info.major}.{sys.version_info.minor})"
)
_print_error(f" {name} requires Python {req[0]}.{req[1]}+ (current: {sys.version_info.major}.{sys.version_info.minor})")
_print_info(" Upgrade Python and reinstall to enable this tool.")
return
@@ -634,18 +593,11 @@ def _configure_tool_category(ts_key: str, cat: dict, config: dict):
print(color(f" --- {icon} {name} ({provider['name']}) ---", Colors.CYAN))
if provider.get("tag"):
_print_info(f" {provider['tag']}")
# For single-provider tools, show a note if available
if cat.get("setup_note"):
_print_info(f" {cat['setup_note']}")
_configure_provider(provider, config)
else:
# Multiple providers - let user choose
print()
# Use custom title if provided (e.g. "Select Search Provider")
title = cat.get("setup_title", "Choose a provider")
print(color(f" --- {icon} {name} - {title} ---", Colors.CYAN))
if cat.get("setup_note"):
_print_info(f" {cat['setup_note']}")
print(color(f" --- {icon} {name} - Choose a provider ---", Colors.CYAN))
print()
# Plain text labels only (no ANSI codes in menu items)
@@ -658,18 +610,11 @@ def _configure_tool_category(ts_key: str, cat: dict, config: dict):
if p.get("tts_provider") and config.get("tts", {}).get("provider") == p["tts_provider"]:
configured = " [active]"
elif not env_vars:
configured = (
" [active]"
if config.get("tts", {}).get("provider", "edge") == p.get("tts_provider", "")
else ""
)
configured = " [active]" if config.get("tts", {}).get("provider", "edge") == p.get("tts_provider", "") else ""
else:
configured = " [configured]"
provider_choices.append(f"{p['name']}{tag}{configured}")
# Add skip option
provider_choices.append("Skip — keep defaults / configure later")
# Detect current provider as default
default_idx = 0
for i, p in enumerate(providers):
@@ -681,13 +626,7 @@ def _configure_tool_category(ts_key: str, cat: dict, config: dict):
default_idx = i
break
provider_idx = _prompt_choice(f" {title}:", provider_choices, default_idx)
# Skip selected
if provider_idx >= len(providers):
_print_info(f" Skipped {name}")
return
provider_idx = _prompt_choice(" Select provider:", provider_choices, default_idx)
_configure_provider(providers[provider_idx], config)
@@ -724,9 +663,9 @@ def _configure_provider(provider: dict, config: dict):
if value:
save_env_value(var["key"], value)
_print_success(" Saved")
_print_success(f" Saved")
else:
_print_warning(" Skipped")
_print_warning(f" Skipped")
all_configured = False
# Run post-setup hooks if needed
@@ -757,9 +696,9 @@ def _configure_simple_requirements(ts_key: str):
value = _prompt(f" {var}", password=True)
if value and value.strip():
save_env_value(var, value.strip())
_print_success(" Saved")
_print_success(f" Saved")
else:
_print_warning(" Skipped")
_print_warning(f" Skipped")
def _reconfigure_tool(config: dict):
@@ -863,9 +802,9 @@ def _reconfigure_provider(provider: dict, config: dict):
value = _prompt(f" {var.get('prompt', var['key'])} (Enter to keep current)", password=not default_val)
if value and value.strip():
save_env_value(var["key"], value.strip())
_print_success(" Updated")
_print_success(f" Updated")
else:
_print_info(" Kept current")
_print_info(f" Kept current")
def _reconfigure_simple_requirements(ts_key: str):
@@ -887,27 +826,16 @@ def _reconfigure_simple_requirements(ts_key: str):
value = _prompt(f" {var} (Enter to keep current)", password=True)
if value and value.strip():
save_env_value(var, value.strip())
_print_success(" Updated")
_print_success(f" Updated")
else:
_print_info(" Kept current")
_print_info(f" Kept current")
# ─── Main Entry Point ─────────────────────────────────────────────────────────
def tools_command(args=None, first_install: bool = False, config: dict = None):
"""Entry point for `hermes tools` and `hermes setup tools`.
Args:
first_install: When True (set by the setup wizard on fresh installs),
skip the platform menu, go straight to the CLI checklist, and
prompt for API keys on all enabled tools that need them.
config: Optional config dict to use. When called from the setup
wizard, the wizard passes its own dict so that platform_toolsets
are written into it and survive the wizard's final save_config().
"""
if config is None:
config = load_config()
def tools_command(args=None):
"""Entry point for `hermes tools` and `hermes setup tools`."""
config = load_config()
enabled_platforms = _get_enabled_platforms()
print()
@@ -916,58 +844,6 @@ def tools_command(args=None, first_install: bool = False, config: dict = None):
print(color(" Tools that need API keys will be configured when enabled.", Colors.DIM))
print()
# ── First-time install: linear flow, no platform menu ──
if first_install:
for pkey in enabled_platforms:
pinfo = PLATFORMS[pkey]
current_enabled = _get_platform_tools(config, pkey)
# Uncheck toolsets that should be off by default
checklist_preselected = current_enabled - _DEFAULT_OFF_TOOLSETS
# Show checklist
new_enabled = _prompt_toolset_checklist(pinfo["label"], checklist_preselected)
added = new_enabled - current_enabled
removed = current_enabled - new_enabled
if added:
for ts in sorted(added):
label = next((l for k, l, _ in CONFIGURABLE_TOOLSETS if k == ts), ts)
print(color(f" + {label}", Colors.GREEN))
if removed:
for ts in sorted(removed):
label = next((l for k, l, _ in CONFIGURABLE_TOOLSETS if k == ts), ts)
print(color(f" - {label}", Colors.RED))
# Walk through ALL selected tools that have provider options or
# need API keys. This ensures browser (Local vs Browserbase),
# TTS (Edge vs OpenAI vs ElevenLabs), etc. are shown even when
# a free provider exists.
to_configure = [
ts_key
for ts_key in sorted(new_enabled)
if TOOL_CATEGORIES.get(ts_key) or TOOLSET_ENV_REQUIREMENTS.get(ts_key)
]
if to_configure:
print()
print(color(f" Configuring {len(to_configure)} tool(s):", Colors.YELLOW))
for ts_key in to_configure:
label = next((l for k, l, _ in CONFIGURABLE_TOOLSETS if k == ts_key), ts_key)
print(color(f"{label}", Colors.DIM))
print(color(" You can skip any tool you don't need right now.", Colors.DIM))
print()
for ts_key in to_configure:
_configure_toolset(ts_key, config)
_save_platform_tools(config, pkey, new_enabled)
save_config(config)
print(color(f" ✓ Saved {pinfo['label']} tool configuration", Colors.GREEN))
print()
return
# ── Returning user: platform menu loop ──
# Build platform choices
platform_choices = []
platform_keys = []
@@ -1018,10 +894,11 @@ def tools_command(args=None, first_install: bool = False, config: dict = None):
print(color(f" - {label}", Colors.RED))
# Configure newly enabled toolsets that need API keys
for ts_key in sorted(added):
if TOOL_CATEGORIES.get(ts_key) or TOOLSET_ENV_REQUIREMENTS.get(ts_key):
if not _toolset_has_keys(ts_key):
_configure_toolset(ts_key, config)
if added:
for ts_key in sorted(added):
if TOOL_CATEGORIES.get(ts_key) or TOOLSET_ENV_REQUIREMENTS.get(ts_key):
if not _toolset_has_keys(ts_key):
_configure_toolset(ts_key, config)
_save_platform_tools(config, pkey, new_enabled)
save_config(config)

View File

@@ -7,25 +7,23 @@ Provides options for:
"""
import os
import sys
import shutil
import subprocess
from pathlib import Path
from typing import Optional
from hermes_cli.colors import Colors, color
def log_info(msg: str):
print(f"{color('', Colors.CYAN)} {msg}")
def log_success(msg: str):
print(f"{color('', Colors.GREEN)} {msg}")
def log_warn(msg: str):
print(f"{color('', Colors.YELLOW)} {msg}")
def log_error(msg: str):
print(f"{color('', Colors.RED)} {msg}")
@@ -44,7 +42,7 @@ def find_shell_configs() -> list:
"""Find shell configuration files that might have PATH entries."""
home = Path.home()
configs = []
candidates = [
home / ".bashrc",
home / ".bash_profile",
@@ -52,11 +50,11 @@ def find_shell_configs() -> list:
home / ".zshrc",
home / ".zprofile",
]
for config in candidates:
if config.exists():
configs.append(config)
return configs
@@ -64,45 +62,45 @@ def remove_path_from_shell_configs():
"""Remove Hermes PATH entries from shell configuration files."""
configs = find_shell_configs()
removed_from = []
for config_path in configs:
try:
content = config_path.read_text()
original_content = content
# Remove lines containing hermes-agent or hermes PATH entries
new_lines = []
skip_next = False
for line in content.split("\n"):
for line in content.split('\n'):
# Skip the "# Hermes Agent" comment and following line
if "# Hermes Agent" in line or "# hermes-agent" in line:
if '# Hermes Agent' in line or '# hermes-agent' in line:
skip_next = True
continue
if skip_next and ("hermes" in line.lower() and "PATH" in line):
if skip_next and ('hermes' in line.lower() and 'PATH' in line):
skip_next = False
continue
skip_next = False
# Remove any PATH line containing hermes
if "hermes" in line.lower() and ("PATH=" in line or "path=" in line.lower()):
if 'hermes' in line.lower() and ('PATH=' in line or 'path=' in line.lower()):
continue
new_lines.append(line)
new_content = "\n".join(new_lines)
new_content = '\n'.join(new_lines)
# Clean up multiple blank lines
while "\n\n\n" in new_content:
new_content = new_content.replace("\n\n\n", "\n\n")
while '\n\n\n' in new_content:
new_content = new_content.replace('\n\n\n', '\n\n')
if new_content != original_content:
config_path.write_text(new_content)
removed_from.append(config_path)
except Exception as e:
log_warn(f"Could not update {config_path}: {e}")
return removed_from
@@ -112,49 +110,61 @@ def remove_wrapper_script():
Path.home() / ".local" / "bin" / "hermes",
Path("/usr/local/bin/hermes"),
]
removed = []
for wrapper in wrapper_paths:
if wrapper.exists():
try:
# Check if it's our wrapper (contains hermes_cli reference)
content = wrapper.read_text()
if "hermes_cli" in content or "hermes-agent" in content:
if 'hermes_cli' in content or 'hermes-agent' in content:
wrapper.unlink()
removed.append(wrapper)
except Exception as e:
log_warn(f"Could not remove {wrapper}: {e}")
return removed
def uninstall_gateway_service():
"""Stop and uninstall the gateway service if running."""
import platform
if platform.system() != "Linux":
return False
service_file = Path.home() / ".config" / "systemd" / "user" / "hermes-gateway.service"
if not service_file.exists():
return False
try:
# Stop the service
subprocess.run(["systemctl", "--user", "stop", "hermes-gateway"], capture_output=True, check=False)
subprocess.run(
["systemctl", "--user", "stop", "hermes-gateway"],
capture_output=True,
check=False
)
# Disable the service
subprocess.run(["systemctl", "--user", "disable", "hermes-gateway"], capture_output=True, check=False)
subprocess.run(
["systemctl", "--user", "disable", "hermes-gateway"],
capture_output=True,
check=False
)
# Remove service file
service_file.unlink()
# Reload systemd
subprocess.run(["systemctl", "--user", "daemon-reload"], capture_output=True, check=False)
subprocess.run(
["systemctl", "--user", "daemon-reload"],
capture_output=True,
check=False
)
return True
except Exception as e:
log_warn(f"Could not fully remove gateway service: {e}")
return False
@@ -163,20 +173,20 @@ def uninstall_gateway_service():
def run_uninstall(args):
"""
Run the uninstall process.
Options:
- Full uninstall: removes code + ~/.hermes/ (configs, data, logs)
- Keep data: removes code but keeps ~/.hermes/ for future reinstall
"""
project_root = get_project_root()
hermes_home = get_hermes_home()
print()
print(color("┌─────────────────────────────────────────────────────────┐", Colors.MAGENTA, Colors.BOLD))
print(color("│ ⚕ Hermes Agent Uninstaller │", Colors.MAGENTA, Colors.BOLD))
print(color("└─────────────────────────────────────────────────────────┘", Colors.MAGENTA, Colors.BOLD))
print()
# Show what will be affected
print(color("Current Installation:", Colors.CYAN, Colors.BOLD))
print(f" Code: {project_root}")
@@ -184,7 +194,7 @@ def run_uninstall(args):
print(f" Secrets: {hermes_home / '.env'}")
print(f" Data: {hermes_home / 'cron/'}, {hermes_home / 'sessions/'}, {hermes_home / 'logs/'}")
print()
# Ask for confirmation
print(color("Uninstall Options:", Colors.YELLOW, Colors.BOLD))
print()
@@ -196,21 +206,21 @@ def run_uninstall(args):
print()
print(" 3) " + color("Cancel", Colors.CYAN) + " - Don't uninstall")
print()
try:
choice = input(color("Select option [1/2/3]: ", Colors.BOLD)).strip()
except (KeyboardInterrupt, EOFError):
print()
print("Cancelled.")
return
if choice == "3" or choice.lower() in ("c", "cancel", "q", "quit", "n", "no"):
print()
print("Uninstall cancelled.")
return
full_uninstall = choice == "2"
full_uninstall = (choice == "2")
# Final confirmation
print()
if full_uninstall:
@@ -218,7 +228,7 @@ def run_uninstall(args):
print(color(" Including: configs, API keys, sessions, scheduled jobs, logs", Colors.RED))
else:
print("This will remove the Hermes code but keep your configuration and data.")
print()
try:
confirm = input(f"Type '{color('yes', Colors.YELLOW)}' to confirm: ").strip().lower()
@@ -226,23 +236,23 @@ def run_uninstall(args):
print()
print("Cancelled.")
return
if confirm != "yes":
print()
print("Uninstall cancelled.")
return
print()
print(color("Uninstalling...", Colors.CYAN, Colors.BOLD))
print()
# 1. Stop and uninstall gateway service
log_info("Checking for gateway service...")
if uninstall_gateway_service():
log_success("Gateway service stopped and removed")
else:
log_info("No gateway service found")
# 2. Remove PATH entries from shell configs
log_info("Removing PATH entries from shell configs...")
removed_configs = remove_path_from_shell_configs()
@@ -251,7 +261,7 @@ def run_uninstall(args):
log_success(f"Updated {config}")
else:
log_info("No PATH entries found to remove")
# 3. Remove wrapper script
log_info("Removing hermes command...")
removed_wrappers = remove_wrapper_script()
@@ -260,10 +270,10 @@ def run_uninstall(args):
log_success(f"Removed {wrapper}")
else:
log_info("No wrapper script found")
# 4. Remove installation directory (code)
log_info("Removing installation directory...")
log_info(f"Removing installation directory...")
# Check if we're running from within the install dir
# We need to be careful here
try:
@@ -279,7 +289,7 @@ def run_uninstall(args):
except Exception as e:
log_warn(f"Could not fully remove {project_root}: {e}")
log_info("You may need to manually remove it")
# 5. Optionally remove ~/.hermes/ data directory
if full_uninstall:
log_info("Removing configuration and data...")
@@ -292,27 +302,22 @@ def run_uninstall(args):
log_info("You may need to manually remove it")
else:
log_info(f"Keeping configuration and data in {hermes_home}")
# Done
print()
print(color("┌─────────────────────────────────────────────────────────┐", Colors.GREEN, Colors.BOLD))
print(color("│ ✓ Uninstall Complete! │", Colors.GREEN, Colors.BOLD))
print(color("└─────────────────────────────────────────────────────────┘", Colors.GREEN, Colors.BOLD))
print()
if not full_uninstall:
print(color("Your configuration and data have been preserved:", Colors.CYAN))
print(f" {hermes_home}/")
print()
print("To reinstall later with your existing settings:")
print(
color(
" curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash",
Colors.DIM,
)
)
print(color(" curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash", Colors.DIM))
print()
print(color("Reload your shell to complete the process:", Colors.YELLOW))
print(" source ~/.bashrc # or ~/.zshrc")
print()

View File

@@ -19,11 +19,12 @@ import os
import sqlite3
import time
from pathlib import Path
from typing import Any
from typing import Dict, Any, List, Optional
DEFAULT_DB_PATH = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes")) / "state.db"
SCHEMA_VERSION = 4
SCHEMA_VERSION = 2
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS schema_version (
@@ -45,7 +46,6 @@ CREATE TABLE IF NOT EXISTS sessions (
tool_call_count INTEGER DEFAULT 0,
input_tokens INTEGER DEFAULT 0,
output_tokens INTEGER DEFAULT 0,
title TEXT,
FOREIGN KEY (parent_session_id) REFERENCES sessions(id)
);
@@ -133,32 +133,7 @@ class SessionDB:
except sqlite3.OperationalError:
pass # Column already exists
cursor.execute("UPDATE schema_version SET version = 2")
if current_version < 3:
# v3: add title column to sessions
try:
cursor.execute("ALTER TABLE sessions ADD COLUMN title TEXT")
except sqlite3.OperationalError:
pass # Column already exists
cursor.execute("UPDATE schema_version SET version = 3")
if current_version < 4:
# v4: add unique index on title (NULLs allowed, only non-NULL must be unique)
try:
cursor.execute(
"CREATE UNIQUE INDEX IF NOT EXISTS idx_sessions_title_unique "
"ON sessions(title) WHERE title IS NOT NULL"
)
except sqlite3.OperationalError:
pass # Index already exists
cursor.execute("UPDATE schema_version SET version = 4")
# Unique title index — always ensure it exists (safe to run after migrations
# since the title column is guaranteed to exist at this point)
try:
cursor.execute(
"CREATE UNIQUE INDEX IF NOT EXISTS idx_sessions_title_unique ON sessions(title) WHERE title IS NOT NULL"
)
except sqlite3.OperationalError:
pass # Index already exists
# FTS5 setup (separate because CREATE VIRTUAL TABLE can't be in executescript with IF NOT EXISTS reliably)
try:
@@ -183,7 +158,7 @@ class SessionDB:
session_id: str,
source: str,
model: str = None,
model_config: dict[str, Any] = None,
model_config: Dict[str, Any] = None,
system_prompt: str = None,
user_id: str = None,
parent_session_id: str = None,
@@ -223,7 +198,9 @@ class SessionDB:
)
self._conn.commit()
def update_token_counts(self, session_id: str, input_tokens: int = 0, output_tokens: int = 0) -> None:
def update_token_counts(
self, session_id: str, input_tokens: int = 0, output_tokens: int = 0
) -> None:
"""Increment token counters on a session."""
self._conn.execute(
"""UPDATE sessions SET
@@ -234,209 +211,14 @@ class SessionDB:
)
self._conn.commit()
def get_session(self, session_id: str) -> dict[str, Any] | None:
def get_session(self, session_id: str) -> Optional[Dict[str, Any]]:
"""Get a session by ID."""
cursor = self._conn.execute("SELECT * FROM sessions WHERE id = ?", (session_id,))
cursor = self._conn.execute(
"SELECT * FROM sessions WHERE id = ?", (session_id,)
)
row = cursor.fetchone()
return dict(row) if row else None
# Maximum length for session titles
MAX_TITLE_LENGTH = 100
@staticmethod
def sanitize_title(title: str | None) -> str | None:
"""Validate and sanitize a session title.
- Strips leading/trailing whitespace
- Removes ASCII control characters (0x00-0x1F, 0x7F) and problematic
Unicode control chars (zero-width, RTL/LTR overrides, etc.)
- Collapses internal whitespace runs to single spaces
- Normalizes empty/whitespace-only strings to None
- Enforces MAX_TITLE_LENGTH
Returns the cleaned title string or None.
Raises ValueError if the title exceeds MAX_TITLE_LENGTH after cleaning.
"""
if not title:
return None
import re
# Remove ASCII control characters (0x00-0x1F, 0x7F) but keep
# whitespace chars (\t=0x09, \n=0x0A, \r=0x0D) so they can be
# normalized to spaces by the whitespace collapsing step below
cleaned = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]", "", title)
# Remove problematic Unicode control characters:
# - Zero-width chars (U+200B-U+200F, U+FEFF)
# - Directional overrides (U+202A-U+202E, U+2066-U+2069)
# - Object replacement (U+FFFC), interlinear annotation (U+FFF9-U+FFFB)
cleaned = re.sub(
r"[\u200b-\u200f\u2028-\u202e\u2060-\u2069\ufeff\ufffc\ufff9-\ufffb]",
"",
cleaned,
)
# Collapse internal whitespace runs and strip
cleaned = re.sub(r"\s+", " ", cleaned).strip()
if not cleaned:
return None
if len(cleaned) > SessionDB.MAX_TITLE_LENGTH:
raise ValueError(f"Title too long ({len(cleaned)} chars, max {SessionDB.MAX_TITLE_LENGTH})")
return cleaned
def set_session_title(self, session_id: str, title: str) -> bool:
"""Set or update a session's title.
Returns True if session was found and title was set.
Raises ValueError if title is already in use by another session,
or if the title fails validation (too long, invalid characters).
Empty/whitespace-only strings are normalized to None (clearing the title).
"""
title = self.sanitize_title(title)
if title:
# Check uniqueness (allow the same session to keep its own title)
cursor = self._conn.execute(
"SELECT id FROM sessions WHERE title = ? AND id != ?",
(title, session_id),
)
conflict = cursor.fetchone()
if conflict:
raise ValueError(f"Title '{title}' is already in use by session {conflict['id']}")
cursor = self._conn.execute(
"UPDATE sessions SET title = ? WHERE id = ?",
(title, session_id),
)
self._conn.commit()
return cursor.rowcount > 0
def get_session_title(self, session_id: str) -> str | None:
"""Get the title for a session, or None."""
cursor = self._conn.execute("SELECT title FROM sessions WHERE id = ?", (session_id,))
row = cursor.fetchone()
return row["title"] if row else None
def get_session_by_title(self, title: str) -> dict[str, Any] | None:
"""Look up a session by exact title. Returns session dict or None."""
cursor = self._conn.execute("SELECT * FROM sessions WHERE title = ?", (title,))
row = cursor.fetchone()
return dict(row) if row else None
def resolve_session_by_title(self, title: str) -> str | None:
"""Resolve a title to a session ID, preferring the latest in a lineage.
If the exact title exists, returns that session's ID.
If not, searches for "title #N" variants and returns the latest one.
If the exact title exists AND numbered variants exist, returns the
latest numbered variant (the most recent continuation).
"""
# First try exact match
exact = self.get_session_by_title(title)
# Also search for numbered variants: "title #2", "title #3", etc.
# Escape SQL LIKE wildcards (%, _) in the title to prevent false matches
escaped = title.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
cursor = self._conn.execute(
"SELECT id, title, started_at FROM sessions WHERE title LIKE ? ESCAPE '\\' ORDER BY started_at DESC",
(f"{escaped} #%",),
)
numbered = cursor.fetchall()
if numbered:
# Return the most recent numbered variant
return numbered[0]["id"]
elif exact:
return exact["id"]
return None
def get_next_title_in_lineage(self, base_title: str) -> str:
"""Generate the next title in a lineage (e.g., "my session""my session #2").
Strips any existing " #N" suffix to find the base name, then finds
the highest existing number and increments.
"""
import re
# Strip existing #N suffix to find the true base
match = re.match(r"^(.*?) #(\d+)$", base_title)
if match:
base = match.group(1)
else:
base = base_title
# Find all existing numbered variants
# Escape SQL LIKE wildcards (%, _) in the base to prevent false matches
escaped = base.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
cursor = self._conn.execute(
"SELECT title FROM sessions WHERE title = ? OR title LIKE ? ESCAPE '\\'",
(base, f"{escaped} #%"),
)
existing = [row["title"] for row in cursor.fetchall()]
if not existing:
return base # No conflict, use the base name as-is
# Find the highest number
max_num = 1 # The unnumbered original counts as #1
for t in existing:
m = re.match(r"^.* #(\d+)$", t)
if m:
max_num = max(max_num, int(m.group(1)))
return f"{base} #{max_num + 1}"
def list_sessions_rich(
self,
source: str = None,
limit: int = 20,
offset: int = 0,
) -> list[dict[str, Any]]:
"""List sessions with preview (first user message) and last active timestamp.
Returns dicts with keys: id, source, model, title, started_at, ended_at,
message_count, preview (first 60 chars of first user message),
last_active (timestamp of last message).
Uses a single query with correlated subqueries instead of N+2 queries.
"""
source_clause = "WHERE s.source = ?" if source else ""
query = f"""
SELECT s.*,
COALESCE(
(SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
FROM messages m
WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
ORDER BY m.timestamp, m.id LIMIT 1),
''
) AS _preview_raw,
COALESCE(
(SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
s.started_at
) AS last_active
FROM sessions s
{source_clause}
ORDER BY s.started_at DESC
LIMIT ? OFFSET ?
"""
params = (source, limit, offset) if source else (limit, offset)
cursor = self._conn.execute(query, params)
sessions = []
for row in cursor.fetchall():
s = dict(row)
# Build the preview from the raw substring
raw = s.pop("_preview_raw", "").strip()
if raw:
text = raw[:60]
s["preview"] = text + ("..." if len(raw) > 60 else "")
else:
s["preview"] = ""
sessions.append(s)
return sessions
# =========================================================================
# Message storage
# =========================================================================
@@ -493,7 +275,7 @@ class SessionDB:
self._conn.commit()
return msg_id
def get_messages(self, session_id: str) -> list[dict[str, Any]]:
def get_messages(self, session_id: str) -> List[Dict[str, Any]]:
"""Load all messages for a session, ordered by timestamp."""
cursor = self._conn.execute(
"SELECT * FROM messages WHERE session_id = ? ORDER BY timestamp, id",
@@ -511,7 +293,7 @@ class SessionDB:
result.append(msg)
return result
def get_messages_as_conversation(self, session_id: str) -> list[dict[str, Any]]:
def get_messages_as_conversation(self, session_id: str) -> List[Dict[str, Any]]:
"""
Load messages in the OpenAI conversation format (role + content dicts).
Used by the gateway to restore conversation history.
@@ -543,11 +325,11 @@ class SessionDB:
def search_messages(
self,
query: str,
source_filter: list[str] = None,
role_filter: list[str] = None,
source_filter: List[str] = None,
role_filter: List[str] = None,
limit: int = 20,
offset: int = 0,
) -> list[dict[str, Any]]:
) -> List[Dict[str, Any]]:
"""
Full-text search across session messages using FTS5.
@@ -615,7 +397,8 @@ class SessionDB:
(match["session_id"], match["id"], match["id"]),
)
context_msgs = [
{"role": r["role"], "content": (r["content"] or "")[:200]} for r in ctx_cursor.fetchall()
{"role": r["role"], "content": (r["content"] or "")[:200]}
for r in ctx_cursor.fetchall()
]
match["context"] = context_msgs
except Exception:
@@ -631,7 +414,7 @@ class SessionDB:
source: str = None,
limit: int = 20,
offset: int = 0,
) -> list[dict[str, Any]]:
) -> List[Dict[str, Any]]:
"""List sessions, optionally filtered by source."""
if source:
cursor = self._conn.execute(
@@ -652,7 +435,9 @@ class SessionDB:
def session_count(self, source: str = None) -> int:
"""Count sessions, optionally filtered by source."""
if source:
cursor = self._conn.execute("SELECT COUNT(*) FROM sessions WHERE source = ?", (source,))
cursor = self._conn.execute(
"SELECT COUNT(*) FROM sessions WHERE source = ?", (source,)
)
else:
cursor = self._conn.execute("SELECT COUNT(*) FROM sessions")
return cursor.fetchone()[0]
@@ -660,7 +445,9 @@ class SessionDB:
def message_count(self, session_id: str = None) -> int:
"""Count messages, optionally for a specific session."""
if session_id:
cursor = self._conn.execute("SELECT COUNT(*) FROM messages WHERE session_id = ?", (session_id,))
cursor = self._conn.execute(
"SELECT COUNT(*) FROM messages WHERE session_id = ?", (session_id,)
)
else:
cursor = self._conn.execute("SELECT COUNT(*) FROM messages")
return cursor.fetchone()[0]
@@ -669,7 +456,7 @@ class SessionDB:
# Export and cleanup
# =========================================================================
def export_session(self, session_id: str) -> dict[str, Any] | None:
def export_session(self, session_id: str) -> Optional[Dict[str, Any]]:
"""Export a single session with all its messages as a dict."""
session = self.get_session(session_id)
if not session:
@@ -677,7 +464,7 @@ class SessionDB:
messages = self.get_messages(session_id)
return {**session, "messages": messages}
def export_all(self, source: str = None) -> list[dict[str, Any]]:
def export_all(self, source: str = None) -> List[Dict[str, Any]]:
"""
Export all sessions (with messages) as a list of dicts.
Suitable for writing to a JSONL file for backup/analysis.
@@ -691,7 +478,9 @@ class SessionDB:
def clear_messages(self, session_id: str) -> None:
"""Delete all messages for a session and reset its counters."""
self._conn.execute("DELETE FROM messages WHERE session_id = ?", (session_id,))
self._conn.execute(
"DELETE FROM messages WHERE session_id = ?", (session_id,)
)
self._conn.execute(
"UPDATE sessions SET message_count = 0, tool_call_count = 0 WHERE id = ?",
(session_id,),
@@ -700,7 +489,9 @@ class SessionDB:
def delete_session(self, session_id: str) -> bool:
"""Delete a session and all its messages. Returns True if found."""
cursor = self._conn.execute("SELECT COUNT(*) FROM sessions WHERE id = ?", (session_id,))
cursor = self._conn.execute(
"SELECT COUNT(*) FROM sessions WHERE id = ?", (session_id,)
)
if cursor.fetchone()[0] == 0:
return False
self._conn.execute("DELETE FROM messages WHERE session_id = ?", (session_id,))
@@ -714,7 +505,6 @@ class SessionDB:
Only prunes ended sessions (not active ones).
"""
import time as _time
cutoff = _time.time() - (older_than_days * 86400)
if source:

View File

@@ -149,7 +149,7 @@ class MiniSWERunner:
def __init__(
self,
model: str = "anthropic/claude-sonnet-4.6",
model: str = "anthropic/claude-sonnet-4-20250514",
base_url: str = None,
api_key: str = None,
env_type: str = "local",
@@ -200,7 +200,13 @@ class MiniSWERunner:
else:
client_kwargs["base_url"] = "https://openrouter.ai/api/v1"
if base_url and "api.anthropic.com" in base_url.strip().lower():
raise ValueError(
"Anthropic's native /v1/messages API is not supported yet (planned for a future release). "
"Hermes currently requires OpenAI-compatible /chat/completions endpoints. "
"To use Claude models now, route through OpenRouter (OPENROUTER_API_KEY) "
"or any OpenAI-compatible proxy that wraps the Anthropic API."
)
# Handle API key - OpenRouter is the primary provider
if api_key:

View File

@@ -20,10 +20,11 @@ Public API (signatures preserved from the original 2,400-line version):
check_tool_availability(quiet) -> tuple
"""
import asyncio
import json
import asyncio
import os
import logging
from typing import Any
from typing import Dict, Any, List, Optional, Tuple
from tools.registry import registry
from toolsets import resolve_toolset, validate_toolset
@@ -35,7 +36,6 @@ logger = logging.getLogger(__name__)
# Async Bridging (single source of truth -- used by registry.dispatch too)
# =============================================================================
def _run_async(coro):
"""Run an async coroutine from a sync context.
@@ -56,7 +56,6 @@ def _run_async(coro):
if loop and loop.is_running():
import concurrent.futures
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
future = pool.submit(asyncio.run, coro)
return future.result(timeout=300)
@@ -67,7 +66,6 @@ def _run_async(coro):
# Tool Discovery (importing each module triggers its registry.register calls)
# =============================================================================
def _discover_tools():
"""Import all tool modules to trigger their registry.register() calls.
@@ -99,7 +97,6 @@ def _discover_tools():
"tools.homeassistant_tool",
]
import importlib
for mod_name in _modules:
try:
importlib.import_module(mod_name)
@@ -112,7 +109,6 @@ _discover_tools()
# MCP tool discovery (external MCP servers from config)
try:
from tools.mcp_tool import discover_mcp_tools
discover_mcp_tools()
except Exception as e:
logger.debug("MCP tool discovery failed: %s", e)
@@ -122,13 +118,13 @@ except Exception as e:
# Backward-compat constants (built once after discovery)
# =============================================================================
TOOL_TO_TOOLSET_MAP: dict[str, str] = registry.get_tool_to_toolset_map()
TOOL_TO_TOOLSET_MAP: Dict[str, str] = registry.get_tool_to_toolset_map()
TOOLSET_REQUIREMENTS: dict[str, dict] = registry.get_toolset_requirements()
TOOLSET_REQUIREMENTS: Dict[str, dict] = registry.get_toolset_requirements()
# Resolved tool names from the last get_tool_definitions() call.
# Used by code_execution_tool to know which tools are available in this session.
_last_resolved_tool_names: list[str] = []
_last_resolved_tool_names: List[str] = []
# =============================================================================
@@ -143,29 +139,18 @@ _LEGACY_TOOLSET_MAP = {
"image_tools": ["image_generate"],
"skills_tools": ["skills_list", "skill_view", "skill_manage"],
"browser_tools": [
"browser_navigate",
"browser_snapshot",
"browser_click",
"browser_type",
"browser_scroll",
"browser_back",
"browser_press",
"browser_close",
"browser_get_images",
"browser_vision",
"browser_navigate", "browser_snapshot", "browser_click",
"browser_type", "browser_scroll", "browser_back",
"browser_press", "browser_close", "browser_get_images",
"browser_vision"
],
"cronjob_tools": ["schedule_cronjob", "list_cronjobs", "remove_cronjob"],
"rl_tools": [
"rl_list_environments",
"rl_select_environment",
"rl_get_current_config",
"rl_edit_config",
"rl_start_training",
"rl_check_status",
"rl_stop_training",
"rl_get_results",
"rl_list_runs",
"rl_test_inference",
"rl_list_environments", "rl_select_environment",
"rl_get_current_config", "rl_edit_config",
"rl_start_training", "rl_check_status",
"rl_stop_training", "rl_get_results",
"rl_list_runs", "rl_test_inference"
],
"file_tools": ["read_file", "write_file", "patch", "search_files"],
"tts_tools": ["text_to_speech"],
@@ -176,12 +161,11 @@ _LEGACY_TOOLSET_MAP = {
# get_tool_definitions (the main schema provider)
# =============================================================================
def get_tool_definitions(
enabled_toolsets: list[str] = None,
disabled_toolsets: list[str] = None,
enabled_toolsets: List[str] = None,
disabled_toolsets: List[str] = None,
quiet_mode: bool = False,
) -> list[dict[str, Any]]:
) -> List[Dict[str, Any]]:
"""
Get tool definitions for model API calls with toolset-based filtering.
@@ -216,7 +200,6 @@ def get_tool_definitions(
elif disabled_toolsets:
from toolsets import get_all_toolsets
for ts_name in get_all_toolsets():
tools_to_include.update(resolve_toolset(ts_name))
@@ -236,7 +219,6 @@ def get_tool_definitions(
print(f"⚠️ Unknown toolset: {toolset_name}")
else:
from toolsets import get_all_toolsets
for ts_name in get_all_toolsets():
tools_to_include.update(resolve_toolset(ts_name))
@@ -248,7 +230,6 @@ def get_tool_definitions(
# execute_code" even when the user disabled the web toolset (#560-discord).
if "execute_code" in tools_to_include:
from tools.code_execution_tool import SANDBOX_ALLOWED_TOOLS, build_execute_code_schema
sandbox_enabled = SANDBOX_ALLOWED_TOOLS & tools_to_include
dynamic_schema = build_execute_code_schema(sandbox_enabled)
for i, td in enumerate(filtered_tools):
@@ -282,9 +263,9 @@ _AGENT_LOOP_TOOLS = {"todo", "memory", "session_search", "delegate_task"}
def handle_function_call(
function_name: str,
function_args: dict[str, Any],
task_id: str | None = None,
user_task: str | None = None,
function_args: Dict[str, Any],
task_id: Optional[str] = None,
user_task: Optional[str] = None,
) -> str:
"""
Main function call dispatcher that routes calls to the tool registry.
@@ -304,15 +285,13 @@ def handle_function_call(
if function_name == "execute_code":
return registry.dispatch(
function_name,
function_args,
function_name, function_args,
task_id=task_id,
enabled_tools=_last_resolved_tool_names,
)
return registry.dispatch(
function_name,
function_args,
function_name, function_args,
task_id=task_id,
user_task=user_task,
)
@@ -327,27 +306,26 @@ def handle_function_call(
# Backward-compat wrapper functions
# =============================================================================
def get_all_tool_names() -> list[str]:
def get_all_tool_names() -> List[str]:
"""Return all registered tool names."""
return registry.get_all_tool_names()
def get_toolset_for_tool(tool_name: str) -> str | None:
def get_toolset_for_tool(tool_name: str) -> Optional[str]:
"""Return the toolset a tool belongs to."""
return registry.get_toolset_for_tool(tool_name)
def get_available_toolsets() -> dict[str, dict]:
def get_available_toolsets() -> Dict[str, dict]:
"""Return toolset availability info for UI display."""
return registry.get_available_toolsets()
def check_toolset_requirements() -> dict[str, bool]:
def check_toolset_requirements() -> Dict[str, bool]:
"""Return {toolset: available_bool} for every registered toolset."""
return registry.check_toolset_requirements()
def check_tool_availability(quiet: bool = False) -> tuple[list[str], list[dict]]:
def check_tool_availability(quiet: bool = False) -> Tuple[List[str], List[dict]]:
"""Return (available_toolsets, unavailable_info)."""
return registry.check_tool_availability(quiet=quiet)

View File

@@ -1,207 +0,0 @@
---
name: solana
description: Query Solana blockchain data with USD pricing — wallet balances, token portfolios with values, transaction details, NFTs, whale detection, and live network stats. Uses Solana RPC + CoinGecko. No API key required.
version: 0.2.0
author: Deniz Alagoz (gizdusum), enhanced by Hermes Agent
license: MIT
metadata:
hermes:
tags: [Solana, Blockchain, Crypto, Web3, RPC, DeFi, NFT]
related_skills: []
---
# Solana Blockchain Skill
Query Solana on-chain data enriched with USD pricing via CoinGecko.
8 commands: wallet portfolio, token info, transactions, activity, NFTs,
whale detection, network stats, and price lookup.
No API key needed. Uses only Python standard library (urllib, json, argparse).
---
## When to Use
- User asks for a Solana wallet balance, token holdings, or portfolio value
- User wants to inspect a specific transaction by signature
- User wants SPL token metadata, price, supply, or top holders
- User wants recent transaction history for an address
- User wants NFTs owned by a wallet
- User wants to find large SOL transfers (whale detection)
- User wants Solana network health, TPS, epoch, or SOL price
- User asks "what's the price of BONK/JUP/SOL?"
---
## Prerequisites
The helper script uses only Python standard library (urllib, json, argparse).
No external packages required.
Pricing data comes from CoinGecko's free API (no key needed, rate-limited
to ~10-30 requests/minute). For faster lookups, use `--no-prices` flag.
---
## Quick Reference
RPC endpoint (default): https://api.mainnet-beta.solana.com
Override: export SOLANA_RPC_URL=https://your-private-rpc.com
Helper script path: ~/.hermes/skills/blockchain/solana/scripts/solana_client.py
```
python3 solana_client.py wallet <address> [--limit N] [--all] [--no-prices]
python3 solana_client.py tx <signature>
python3 solana_client.py token <mint_address>
python3 solana_client.py activity <address> [--limit N]
python3 solana_client.py nft <address>
python3 solana_client.py whales [--min-sol N]
python3 solana_client.py stats
python3 solana_client.py price <mint_or_symbol>
```
---
## Procedure
### 0. Setup Check
```bash
python3 --version
# Optional: set a private RPC for better rate limits
export SOLANA_RPC_URL="https://api.mainnet-beta.solana.com"
# Confirm connectivity
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py stats
```
### 1. Wallet Portfolio
Get SOL balance, SPL token holdings with USD values, NFT count, and
portfolio total. Tokens sorted by value, dust filtered, known tokens
labeled by name (BONK, JUP, USDC, etc.).
```bash
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py \
wallet 9WzDXwBbmkg8ZTbNMqUxvQRAyrZzDsGYdLVL9zYtAWWM
```
Flags:
- `--limit N` — show top N tokens (default: 20)
- `--all` — show all tokens, no dust filter, no limit
- `--no-prices` — skip CoinGecko price lookups (faster, RPC-only)
Output includes: SOL balance + USD value, token list with prices sorted
by value, dust count, NFT summary, total portfolio value in USD.
### 2. Transaction Details
Inspect a full transaction by its base58 signature. Shows balance changes
in both SOL and USD.
```bash
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py \
tx 5j7s8K...your_signature_here
```
Output: slot, timestamp, fee, status, balance changes (SOL + USD),
program invocations.
### 3. Token Info
Get SPL token metadata, current price, market cap, supply, decimals,
mint/freeze authorities, and top 5 holders.
```bash
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py \
token DezXAZ8z7PnrnRJjz3wXBoRgixCa6xjnB7YaB1pPB263
```
Output: name, symbol, decimals, supply, price, market cap, top 5
holders with percentages.
### 4. Recent Activity
List recent transactions for an address (default: last 10, max: 25).
```bash
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py \
activity 9WzDXwBbmkg8ZTbNMqUxvQRAyrZzDsGYdLVL9zYtAWWM --limit 25
```
### 5. NFT Portfolio
List NFTs owned by a wallet (heuristic: SPL tokens with amount=1, decimals=0).
```bash
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py \
nft 9WzDXwBbmkg8ZTbNMqUxvQRAyrZzDsGYdLVL9zYtAWWM
```
Note: Compressed NFTs (cNFTs) are not detected by this heuristic.
### 6. Whale Detector
Scan the most recent block for large SOL transfers with USD values.
```bash
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py \
whales --min-sol 500
```
Note: scans the latest block only — point-in-time snapshot, not historical.
### 7. Network Stats
Live Solana network health: current slot, epoch, TPS, supply, validator
version, SOL price, and market cap.
```bash
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py stats
```
### 8. Price Lookup
Quick price check for any token by mint address or known symbol.
```bash
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py price BONK
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py price JUP
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py price SOL
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py price DezXAZ8z7PnrnRJjz3wXBoRgixCa6xjnB7YaB1pPB263
```
Known symbols: SOL, USDC, USDT, BONK, JUP, WETH, JTO, mSOL, stSOL,
PYTH, HNT, RNDR, WEN, W, TNSR, DRIFT, bSOL, JLP, WIF, MEW, BOME, PENGU.
---
## Pitfalls
- **CoinGecko rate-limits** — free tier allows ~10-30 requests/minute.
Price lookups use 1 request per token. Wallets with many tokens may
not get prices for all of them. Use `--no-prices` for speed.
- **Public RPC rate-limits** — Solana mainnet public RPC limits requests.
For production use, set SOLANA_RPC_URL to a private endpoint
(Helius, QuickNode, Triton).
- **NFT detection is heuristic** — amount=1 + decimals=0. Compressed
NFTs (cNFTs) and Token-2022 NFTs won't appear.
- **Whale detector scans latest block only** — not historical. Results
vary by the moment you query.
- **Transaction history** — public RPC keeps ~2 days. Older transactions
may not be available.
- **Token names** — ~25 well-known tokens are labeled by name. Others
show abbreviated mint addresses. Use the `token` command for full info.
- **Retry on 429** — both RPC and CoinGecko calls retry up to 2 times
with exponential backoff on rate-limit errors.
---
## Verification
```bash
# Should print current Solana slot, TPS, and SOL price
python3 ~/.hermes/skills/blockchain/solana/scripts/solana_client.py stats
```

View File

@@ -1,698 +0,0 @@
#!/usr/bin/env python3
"""
Solana Blockchain CLI Tool for Hermes Agent
--------------------------------------------
Queries the Solana JSON-RPC API and CoinGecko for enriched on-chain data.
Uses only Python standard library — no external packages required.
Usage:
python3 solana_client.py stats
python3 solana_client.py wallet <address> [--limit N] [--all] [--no-prices]
python3 solana_client.py tx <signature>
python3 solana_client.py token <mint_address>
python3 solana_client.py activity <address> [--limit N]
python3 solana_client.py nft <address>
python3 solana_client.py whales [--min-sol N]
python3 solana_client.py price <mint_address_or_symbol>
Environment:
SOLANA_RPC_URL Override the default RPC endpoint (default: mainnet-beta public)
"""
import argparse
import json
import os
import sys
import time
import urllib.request
import urllib.error
from typing import Any, Dict, List, Optional
RPC_URL = os.environ.get(
"SOLANA_RPC_URL",
"https://api.mainnet-beta.solana.com",
)
LAMPORTS_PER_SOL = 1_000_000_000
# Well-known Solana token names — avoids API calls for common tokens.
# Maps mint address → (symbol, name).
KNOWN_TOKENS: Dict[str, tuple] = {
"So11111111111111111111111111111111111111112": ("SOL", "Solana"),
"EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v": ("USDC", "USD Coin"),
"Es9vMFrzaCERmJfrF4H2FYD4KCoNkY11McCe8BenwNYB": ("USDT", "Tether"),
"DezXAZ8z7PnrnRJjz3wXBoRgixCa6xjnB7YaB1pPB263": ("BONK", "Bonk"),
"JUPyiwrYJFskUPiHa7hkeR8VUtAeFoSYbKedZNsDvCN": ("JUP", "Jupiter"),
"7vfCXTUXx5WJV5JADk17DUJ4ksgau7utNKj4b963voxs": ("WETH", "Wrapped Ether"),
"jtojtomepa8beP8AuQc6eXt5FriJwfFMwQx2v2f9mCL": ("JTO", "Jito"),
"mSoLzYCxHdYgdzU16g5QSh3i5K3z3KZK7ytfqcJm7So": ("mSOL", "Marinade Staked SOL"),
"7dHbWXmci3dT8UFYWYZweBLXgycu7Y3iL6trKn1Y7ARj": ("stSOL", "Lido Staked SOL"),
"HZ1JovNiVvGrGNiiYvEozEVgZ58xaU3RKwX8eACQBCt3": ("PYTH", "Pyth Network"),
"RLBxxFkseAZ4RgJH3Sqn8jXxhmGoz9jWxDNJMh8pL7a": ("RLBB", "Rollbit"),
"hntyVP6YFm1Hg25TN9WGLqM12b8TQmcknKrdu1oxWux": ("HNT", "Helium"),
"rndrizKT3MK1iimdxRdWabcF7Zg7AR5T4nud4EkHBof": ("RNDR", "Render"),
"WENWENvqqNya429ubCdR81ZmD69brwQaaBYY6p91oHQQ": ("WEN", "Wen"),
"85VBFQZC9TZkfaptBWjvUw7YbZjy52A6mjtPGjstQAmQ": ("W", "Wormhole"),
"TNSRxcUxoT9xBG3de7PiJyTDYu7kskLqcpddxnEJAS6": ("TNSR", "Tensor"),
"DriFtupJYLTosbwoN8koMbEYSx54aFAVLddWsbksjwg7": ("DRIFT", "Drift"),
"bSo13r4TkiE4KumL71LsHTPpL2euBYLFx6h9HP3piy1": ("bSOL", "BlazeStake Staked SOL"),
"27G8MtK7VtTcCHkpASjSDdkWWYfoqT6ggEuKidVJidD4": ("JLP", "Jupiter LP"),
"EKpQGSJtjMFqKZ9KQanSqYXRcF8fBopzLHYxdM65zcjm": ("WIF", "dogwifhat"),
"MEW1gQWJ3nEXg2qgERiKu7FAFj79PHvQVREQUzScPP5": ("MEW", "cat in a dogs world"),
"ukHH6c7mMyiWCf1b9pnWe25TSpkDDt3H5pQZgZ74J82": ("BOME", "Book of Meme"),
"A8C3xuqscfmyLrte3VwJvtPHXvcSN3FjDbUaSMAkQrCS": ("PENGU", "Pudgy Penguins"),
}
# Reverse lookup: symbol → mint (for the `price` command).
_SYMBOL_TO_MINT = {v[0].upper(): k for k, v in KNOWN_TOKENS.items()}
# ---------------------------------------------------------------------------
# HTTP / RPC helpers
# ---------------------------------------------------------------------------
def _http_get_json(url: str, timeout: int = 10, retries: int = 2) -> Any:
"""GET JSON from a URL with retry on 429 rate-limit. Returns parsed JSON or None."""
for attempt in range(retries + 1):
req = urllib.request.Request(
url, headers={"Accept": "application/json", "User-Agent": "HermesAgent/1.0"},
)
try:
with urllib.request.urlopen(req, timeout=timeout) as resp:
return json.load(resp)
except urllib.error.HTTPError as exc:
if exc.code == 429 and attempt < retries:
time.sleep(2.0 * (attempt + 1))
continue
return None
except Exception:
return None
return None
def _rpc_call(method: str, params: list = None, retries: int = 2) -> Any:
"""Send a JSON-RPC request with retry on 429 rate-limit."""
payload = json.dumps({
"jsonrpc": "2.0", "id": 1,
"method": method, "params": params or [],
}).encode()
for attempt in range(retries + 1):
req = urllib.request.Request(
RPC_URL, data=payload,
headers={"Content-Type": "application/json"}, method="POST",
)
try:
with urllib.request.urlopen(req, timeout=20) as resp:
body = json.load(resp)
if "error" in body:
err = body["error"]
# Rate-limit: retry after delay
if isinstance(err, dict) and err.get("code") == 429:
if attempt < retries:
time.sleep(1.5 * (attempt + 1))
continue
sys.exit(f"RPC error: {err}")
return body.get("result")
except urllib.error.HTTPError as exc:
if exc.code == 429 and attempt < retries:
time.sleep(1.5 * (attempt + 1))
continue
sys.exit(f"RPC HTTP error: {exc}")
except urllib.error.URLError as exc:
sys.exit(f"RPC connection error: {exc}")
return None
# Keep backward compat — the rest of the code uses `rpc()`.
rpc = _rpc_call
def rpc_batch(calls: list) -> list:
"""Send a batch of JSON-RPC requests (with retry on 429)."""
payload = json.dumps([
{"jsonrpc": "2.0", "id": i, "method": c["method"], "params": c.get("params", [])}
for i, c in enumerate(calls)
]).encode()
for attempt in range(3):
req = urllib.request.Request(
RPC_URL, data=payload,
headers={"Content-Type": "application/json"}, method="POST",
)
try:
with urllib.request.urlopen(req, timeout=20) as resp:
return json.load(resp)
except urllib.error.HTTPError as exc:
if exc.code == 429 and attempt < 2:
time.sleep(1.5 * (attempt + 1))
continue
sys.exit(f"RPC batch HTTP error: {exc}")
except urllib.error.URLError as exc:
sys.exit(f"RPC batch error: {exc}")
return []
def lamports_to_sol(lamports: int) -> float:
return lamports / LAMPORTS_PER_SOL
def print_json(obj: Any) -> None:
print(json.dumps(obj, indent=2))
def _short_mint(mint: str) -> str:
"""Abbreviate a mint address for display: first 4 + last 4."""
if len(mint) <= 12:
return mint
return f"{mint[:4]}...{mint[-4:]}"
# ---------------------------------------------------------------------------
# Price & token name helpers (CoinGecko — free, no API key)
# ---------------------------------------------------------------------------
def fetch_prices(mints: List[str], max_lookups: int = 20) -> Dict[str, float]:
"""Fetch USD prices for mint addresses via CoinGecko (one per request).
CoinGecko free tier doesn't support batch Solana token lookups,
so we do individual calls — capped at *max_lookups* to stay within
rate limits. Returns {mint: usd_price}.
"""
prices: Dict[str, float] = {}
for i, mint in enumerate(mints[:max_lookups]):
url = (
f"https://api.coingecko.com/api/v3/simple/token_price/solana"
f"?contract_addresses={mint}&vs_currencies=usd"
)
data = _http_get_json(url, timeout=10)
if data and isinstance(data, dict):
for addr, info in data.items():
if isinstance(info, dict) and "usd" in info:
prices[mint] = info["usd"]
break
# Pause between calls to respect CoinGecko free-tier rate-limits
if i < len(mints[:max_lookups]) - 1:
time.sleep(1.0)
return prices
def fetch_sol_price() -> Optional[float]:
"""Fetch current SOL price in USD via CoinGecko."""
data = _http_get_json(
"https://api.coingecko.com/api/v3/simple/price?ids=solana&vs_currencies=usd"
)
if data and "solana" in data:
return data["solana"].get("usd")
return None
def resolve_token_name(mint: str) -> Optional[Dict[str, str]]:
"""Look up token name and symbol from CoinGecko by mint address.
Returns {"name": ..., "symbol": ...} or None.
"""
if mint in KNOWN_TOKENS:
sym, name = KNOWN_TOKENS[mint]
return {"symbol": sym, "name": name}
url = f"https://api.coingecko.com/api/v3/coins/solana/contract/{mint}"
data = _http_get_json(url, timeout=10)
if data and "symbol" in data:
return {"symbol": data["symbol"].upper(), "name": data.get("name", "")}
return None
def _token_label(mint: str) -> str:
"""Return a human-readable label for a mint: symbol if known, else abbreviated address."""
if mint in KNOWN_TOKENS:
return KNOWN_TOKENS[mint][0]
return _short_mint(mint)
# ---------------------------------------------------------------------------
# 1. Network Stats
# ---------------------------------------------------------------------------
def cmd_stats(_args):
"""Live Solana network: slot, epoch, TPS, supply, version, SOL price."""
results = rpc_batch([
{"method": "getSlot"},
{"method": "getEpochInfo"},
{"method": "getRecentPerformanceSamples", "params": [1]},
{"method": "getSupply"},
{"method": "getVersion"},
])
by_id = {r["id"]: r.get("result") for r in results}
slot = by_id.get(0)
epoch_info = by_id.get(1)
perf_samples = by_id.get(2)
supply = by_id.get(3)
version = by_id.get(4)
tps = None
if perf_samples:
s = perf_samples[0]
tps = round(s["numTransactions"] / s["samplePeriodSecs"], 1)
total_supply = lamports_to_sol(supply["value"]["total"]) if supply else None
circ_supply = lamports_to_sol(supply["value"]["circulating"]) if supply else None
sol_price = fetch_sol_price()
out = {
"slot": slot,
"epoch": epoch_info.get("epoch") if epoch_info else None,
"slot_in_epoch": epoch_info.get("slotIndex") if epoch_info else None,
"tps": tps,
"total_supply_SOL": round(total_supply, 2) if total_supply else None,
"circulating_supply_SOL": round(circ_supply, 2) if circ_supply else None,
"validator_version": version.get("solana-core") if version else None,
}
if sol_price is not None:
out["sol_price_usd"] = sol_price
if circ_supply:
out["market_cap_usd"] = round(sol_price * circ_supply, 0)
print_json(out)
# ---------------------------------------------------------------------------
# 2. Wallet Info (enhanced with prices, sorting, filtering)
# ---------------------------------------------------------------------------
def cmd_wallet(args):
"""SOL balance + SPL token holdings with USD values."""
address = args.address
show_all = getattr(args, "all", False)
limit = getattr(args, "limit", 20) or 20
skip_prices = getattr(args, "no_prices", False)
# Fetch SOL balance
balance_result = rpc("getBalance", [address])
sol_balance = lamports_to_sol(balance_result["value"])
# Fetch all SPL token accounts
token_result = rpc("getTokenAccountsByOwner", [
address,
{"programId": "TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA"},
{"encoding": "jsonParsed"},
])
raw_tokens = []
for acct in (token_result.get("value") or []):
info = acct["account"]["data"]["parsed"]["info"]
ta = info["tokenAmount"]
amount = float(ta.get("uiAmountString") or 0)
if amount > 0:
raw_tokens.append({
"mint": info["mint"],
"amount": amount,
"decimals": ta["decimals"],
})
# Separate NFTs (amount=1, decimals=0) from fungible tokens
nfts = [t for t in raw_tokens if t["decimals"] == 0 and t["amount"] == 1]
fungible = [t for t in raw_tokens if not (t["decimals"] == 0 and t["amount"] == 1)]
# Fetch prices for fungible tokens (cap lookups to avoid API abuse)
sol_price = None
prices: Dict[str, float] = {}
if not skip_prices and fungible:
sol_price = fetch_sol_price()
# Prioritize known tokens, then a small sample of unknowns.
# CoinGecko free tier = 1 request per mint, so we cap lookups.
known_mints = [t["mint"] for t in fungible if t["mint"] in KNOWN_TOKENS]
other_mints = [t["mint"] for t in fungible if t["mint"] not in KNOWN_TOKENS][:15]
mints_to_price = known_mints + other_mints
if mints_to_price:
prices = fetch_prices(mints_to_price, max_lookups=30)
# Enrich tokens with labels and USD values
enriched = []
dust_count = 0
dust_value = 0.0
for t in fungible:
mint = t["mint"]
label = _token_label(mint)
usd_price = prices.get(mint)
usd_value = round(usd_price * t["amount"], 2) if usd_price else None
# Filter dust (< $0.01) unless --all
if not show_all and usd_value is not None and usd_value < 0.01:
dust_count += 1
dust_value += usd_value
continue
entry = {"token": label, "mint": mint, "amount": t["amount"]}
if usd_price is not None:
entry["price_usd"] = usd_price
entry["value_usd"] = usd_value
enriched.append(entry)
# Sort: tokens with known USD value first (highest→lowest), then unknowns
enriched.sort(key=lambda x: (x.get("value_usd") is not None, x.get("value_usd") or 0), reverse=True)
# Apply limit unless --all
total_tokens = len(enriched)
if not show_all and len(enriched) > limit:
enriched = enriched[:limit]
# Compute portfolio total
total_usd = sum(t.get("value_usd", 0) for t in enriched)
sol_value_usd = round(sol_price * sol_balance, 2) if sol_price else None
if sol_value_usd:
total_usd += sol_value_usd
total_usd += dust_value
output = {
"address": address,
"sol_balance": round(sol_balance, 9),
}
if sol_price:
output["sol_price_usd"] = sol_price
output["sol_value_usd"] = sol_value_usd
output["tokens_shown"] = len(enriched)
if total_tokens > len(enriched):
output["tokens_hidden"] = total_tokens - len(enriched)
output["spl_tokens"] = enriched
if dust_count > 0:
output["dust_filtered"] = {"count": dust_count, "total_value_usd": round(dust_value, 4)}
output["nft_count"] = len(nfts)
if nfts:
output["nfts"] = [_token_label(n["mint"]) + f" ({_short_mint(n['mint'])})" for n in nfts[:10]]
if len(nfts) > 10:
output["nfts"].append(f"... and {len(nfts) - 10} more")
if total_usd > 0:
output["portfolio_total_usd"] = round(total_usd, 2)
print_json(output)
# ---------------------------------------------------------------------------
# 3. Transaction Details
# ---------------------------------------------------------------------------
def cmd_tx(args):
"""Full transaction details by signature."""
result = rpc("getTransaction", [
args.signature,
{"encoding": "jsonParsed", "maxSupportedTransactionVersion": 0},
])
if result is None:
sys.exit("Transaction not found (may be too old for public RPC history).")
meta = result.get("meta", {}) or {}
msg = result.get("transaction", {}).get("message", {})
account_keys = msg.get("accountKeys", [])
pre = meta.get("preBalances", [])
post = meta.get("postBalances", [])
balance_changes = []
for i, key in enumerate(account_keys):
acct_key = key["pubkey"] if isinstance(key, dict) else key
if i < len(pre) and i < len(post):
change = lamports_to_sol(post[i] - pre[i])
if change != 0:
balance_changes.append({"account": acct_key, "change_SOL": round(change, 9)})
programs = []
for ix in msg.get("instructions", []):
prog = ix.get("programId")
if prog is None and "programIdIndex" in ix:
k = account_keys[ix["programIdIndex"]]
prog = k["pubkey"] if isinstance(k, dict) else k
if prog:
programs.append(prog)
# Add USD value for SOL changes
sol_price = fetch_sol_price()
if sol_price and balance_changes:
for bc in balance_changes:
bc["change_USD"] = round(bc["change_SOL"] * sol_price, 2)
print_json({
"signature": args.signature,
"slot": result.get("slot"),
"block_time": result.get("blockTime"),
"fee_SOL": lamports_to_sol(meta.get("fee", 0)),
"status": "success" if meta.get("err") is None else "failed",
"balance_changes": balance_changes,
"programs_invoked": list(dict.fromkeys(programs)),
})
# ---------------------------------------------------------------------------
# 4. Token Info (enhanced with name + price)
# ---------------------------------------------------------------------------
def cmd_token(args):
"""SPL token metadata, supply, decimals, price, top holders."""
mint = args.mint
mint_info = rpc("getAccountInfo", [mint, {"encoding": "jsonParsed"}])
if mint_info is None or mint_info.get("value") is None:
sys.exit("Mint account not found.")
parsed = mint_info["value"]["data"]["parsed"]["info"]
decimals = parsed.get("decimals", 0)
supply_raw = int(parsed.get("supply", 0))
supply_human = supply_raw / (10 ** decimals) if decimals else supply_raw
largest = rpc("getTokenLargestAccounts", [mint])
holders = []
for acct in (largest.get("value") or [])[:5]:
amount = float(acct.get("uiAmountString") or 0)
pct = round((amount / supply_human * 100), 4) if supply_human > 0 else 0
holders.append({
"account": acct["address"],
"amount": amount,
"percent": pct,
})
# Resolve name + price
token_meta = resolve_token_name(mint)
price_data = fetch_prices([mint])
out = {"mint": mint}
if token_meta:
out["name"] = token_meta["name"]
out["symbol"] = token_meta["symbol"]
out["decimals"] = decimals
out["supply"] = round(supply_human, min(decimals, 6))
out["mint_authority"] = parsed.get("mintAuthority")
out["freeze_authority"] = parsed.get("freezeAuthority")
if mint in price_data:
out["price_usd"] = price_data[mint]
out["market_cap_usd"] = round(price_data[mint] * supply_human, 0)
out["top_5_holders"] = holders
print_json(out)
# ---------------------------------------------------------------------------
# 5. Recent Activity
# ---------------------------------------------------------------------------
def cmd_activity(args):
"""Recent transaction signatures for an address."""
limit = min(args.limit, 25)
result = rpc("getSignaturesForAddress", [args.address, {"limit": limit}])
txs = [
{
"signature": item["signature"],
"slot": item.get("slot"),
"block_time": item.get("blockTime"),
"err": item.get("err"),
}
for item in (result or [])
]
print_json({"address": args.address, "transactions": txs})
# ---------------------------------------------------------------------------
# 6. NFT Portfolio
# ---------------------------------------------------------------------------
def cmd_nft(args):
"""NFTs owned by a wallet (amount=1 && decimals=0 heuristic)."""
result = rpc("getTokenAccountsByOwner", [
args.address,
{"programId": "TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA"},
{"encoding": "jsonParsed"},
])
nfts = [
acct["account"]["data"]["parsed"]["info"]["mint"]
for acct in (result.get("value") or [])
if acct["account"]["data"]["parsed"]["info"]["tokenAmount"]["decimals"] == 0
and int(acct["account"]["data"]["parsed"]["info"]["tokenAmount"]["amount"]) == 1
]
print_json({
"address": args.address,
"nft_count": len(nfts),
"nfts": nfts,
"note": "Heuristic only. Compressed NFTs (cNFTs) are not detected.",
})
# ---------------------------------------------------------------------------
# 7. Whale Detector (enhanced with USD values)
# ---------------------------------------------------------------------------
def cmd_whales(args):
"""Scan the latest block for large SOL transfers."""
min_lamports = int(args.min_sol * LAMPORTS_PER_SOL)
slot = rpc("getSlot")
block = rpc("getBlock", [
slot,
{
"encoding": "jsonParsed",
"transactionDetails": "full",
"maxSupportedTransactionVersion": 0,
"rewards": False,
},
])
if block is None:
sys.exit("Could not retrieve latest block.")
sol_price = fetch_sol_price()
whales = []
for tx in (block.get("transactions") or []):
meta = tx.get("meta", {}) or {}
if meta.get("err") is not None:
continue
msg = tx["transaction"].get("message", {})
account_keys = msg.get("accountKeys", [])
pre = meta.get("preBalances", [])
post = meta.get("postBalances", [])
for i in range(len(pre)):
change = post[i] - pre[i]
if change >= min_lamports:
k = account_keys[i]
receiver = k["pubkey"] if isinstance(k, dict) else k
sender = None
for j in range(len(pre)):
if pre[j] - post[j] >= min_lamports:
sk = account_keys[j]
sender = sk["pubkey"] if isinstance(sk, dict) else sk
break
entry = {
"sender": sender,
"receiver": receiver,
"amount_SOL": round(lamports_to_sol(change), 4),
}
if sol_price:
entry["amount_USD"] = round(lamports_to_sol(change) * sol_price, 2)
whales.append(entry)
out = {
"slot": slot,
"min_threshold_SOL": args.min_sol,
"large_transfers": whales,
"note": "Scans latest block only — point-in-time snapshot.",
}
if sol_price:
out["sol_price_usd"] = sol_price
print_json(out)
# ---------------------------------------------------------------------------
# 8. Price Lookup
# ---------------------------------------------------------------------------
def cmd_price(args):
"""Quick price lookup for a token by mint address or known symbol."""
query = args.token
# Check if it's a known symbol
mint = _SYMBOL_TO_MINT.get(query.upper(), query)
# Try to resolve name
token_meta = resolve_token_name(mint)
# Fetch price
prices = fetch_prices([mint])
out = {"query": query, "mint": mint}
if token_meta:
out["name"] = token_meta["name"]
out["symbol"] = token_meta["symbol"]
if mint in prices:
out["price_usd"] = prices[mint]
else:
out["price_usd"] = None
out["note"] = "Price not available — token may not be listed on CoinGecko."
print_json(out)
# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------
def main():
parser = argparse.ArgumentParser(
prog="solana_client.py",
description="Solana blockchain query tool for Hermes Agent",
)
sub = parser.add_subparsers(dest="command", required=True)
sub.add_parser("stats", help="Network stats: slot, epoch, TPS, supply, SOL price")
p_wallet = sub.add_parser("wallet", help="SOL balance + SPL tokens with USD values")
p_wallet.add_argument("address")
p_wallet.add_argument("--limit", type=int, default=20,
help="Max tokens to display (default: 20)")
p_wallet.add_argument("--all", action="store_true",
help="Show all tokens (no limit, no dust filter)")
p_wallet.add_argument("--no-prices", action="store_true",
help="Skip price lookups (faster, RPC-only)")
p_tx = sub.add_parser("tx", help="Transaction details by signature")
p_tx.add_argument("signature")
p_token = sub.add_parser("token", help="SPL token metadata, price, and top holders")
p_token.add_argument("mint")
p_activity = sub.add_parser("activity", help="Recent transactions for an address")
p_activity.add_argument("address")
p_activity.add_argument("--limit", type=int, default=10,
help="Number of transactions (max 25, default 10)")
p_nft = sub.add_parser("nft", help="NFT portfolio for a wallet")
p_nft.add_argument("address")
p_whales = sub.add_parser("whales", help="Large SOL transfers in the latest block")
p_whales.add_argument("--min-sol", type=float, default=1000.0,
help="Minimum SOL transfer size (default: 1000)")
p_price = sub.add_parser("price", help="Quick price lookup by mint or symbol")
p_price.add_argument("token", help="Mint address or known symbol (SOL, BONK, JUP, ...)")
args = parser.parse_args()
dispatch = {
"stats": cmd_stats,
"wallet": cmd_wallet,
"tx": cmd_tx,
"token": cmd_token,
"activity": cmd_activity,
"nft": cmd_nft,
"whales": cmd_whales,
"price": cmd_price,
}
dispatch[args.command](args)
if __name__ == "__main__":
main()

View File

@@ -1,125 +0,0 @@
---
name: agentmail
description: Give the agent its own dedicated email inbox via AgentMail. Send, receive, and manage email autonomously using agent-owned email addresses (e.g. hermes-agent@agentmail.to).
version: 1.0.0
metadata:
hermes:
tags: [email, communication, agentmail, mcp]
category: email
---
# AgentMail — Agent-Owned Email Inboxes
## Requirements
- **AgentMail API key** (required) — sign up at https://console.agentmail.to (free tier: 3 inboxes, 3,000 emails/month; paid plans from $20/mo)
- Node.js 18+ (for the MCP server)
## When to Use
Use this skill when you need to:
- Give the agent its own dedicated email address
- Send emails autonomously on behalf of the agent
- Receive and read incoming emails
- Manage email threads and conversations
- Sign up for services or authenticate via email
- Communicate with other agents or humans via email
This is NOT for reading the user's personal email (use himalaya or Gmail for that).
AgentMail gives the agent its own identity and inbox.
## Setup
### 1. Get an API Key
- Go to https://console.agentmail.to
- Create an account and generate an API key (starts with `am_`)
### 2. Configure MCP Server
Add to `~/.hermes/config.yaml` (paste your actual key — MCP env vars are not expanded from .env):
```yaml
mcp_servers:
agentmail:
command: "npx"
args: ["-y", "agentmail-mcp"]
env:
AGENTMAIL_API_KEY: "am_your_key_here"
```
### 3. Restart Hermes
```bash
hermes
```
All 11 AgentMail tools are now available automatically.
## Available Tools (via MCP)
| Tool | Description |
|------|-------------|
| `list_inboxes` | List all agent inboxes |
| `get_inbox` | Get details of a specific inbox |
| `create_inbox` | Create a new inbox (gets a real email address) |
| `delete_inbox` | Delete an inbox |
| `list_threads` | List email threads in an inbox |
| `get_thread` | Get a specific email thread |
| `send_message` | Send a new email |
| `reply_to_message` | Reply to an existing email |
| `forward_message` | Forward an email |
| `update_message` | Update message labels/status |
| `get_attachment` | Download an email attachment |
## Procedure
### Create an inbox and send an email
1. Create a dedicated inbox:
- Use `create_inbox` with a username (e.g. `hermes-agent`)
- The agent gets address: `hermes-agent@agentmail.to`
2. Send an email:
- Use `send_message` with `inbox_id`, `to`, `subject`, `text`
3. Check for replies:
- Use `list_threads` to see incoming conversations
- Use `get_thread` to read a specific thread
### Check incoming email
1. Use `list_inboxes` to find your inbox ID
2. Use `list_threads` with the inbox ID to see conversations
3. Use `get_thread` to read a thread and its messages
### Reply to an email
1. Get the thread with `get_thread`
2. Use `reply_to_message` with the message ID and your reply text
## Example Workflows
**Sign up for a service:**
```
1. create_inbox (username: "signup-bot")
2. Use the inbox address to register on the service
3. list_threads to check for verification email
4. get_thread to read the verification code
```
**Agent-to-human outreach:**
```
1. create_inbox (username: "hermes-outreach")
2. send_message (to: user@example.com, subject: "Hello", text: "...")
3. list_threads to check for replies
```
## Pitfalls
- Free tier limited to 3 inboxes and 3,000 emails/month
- Emails come from `@agentmail.to` domain on free tier (custom domains on paid plans)
- Node.js (18+) is required for the MCP server (`npx -y agentmail-mcp`)
- The `mcp` Python package must be installed: `pip install mcp`
- Real-time inbound email (webhooks) requires a public server — use `list_threads` polling via cronjob instead for personal use
## Verification
After setup, test with:
```
hermes --toolsets mcp -q "Create an AgentMail inbox called test-agent and tell me its email address"
```
You should see the new inbox address returned.
## References
- AgentMail docs: https://docs.agentmail.to/
- AgentMail console: https://console.agentmail.to
- AgentMail MCP repo: https://github.com/agentmail-to/agentmail-mcp
- Pricing: https://www.agentmail.to/pricing

View File

@@ -1,441 +0,0 @@
---
name: qmd
description: Search personal knowledge bases, notes, docs, and meeting transcripts locally using qmd — a hybrid retrieval engine with BM25, vector search, and LLM reranking. Supports CLI and MCP integration.
version: 1.0.0
author: Hermes Agent + Teknium
license: MIT
platforms: [macos, linux]
metadata:
hermes:
tags: [Search, Knowledge-Base, RAG, Notes, MCP, Local-AI]
related_skills: [obsidian, native-mcp, arxiv]
---
# QMD — Query Markup Documents
Local, on-device search engine for personal knowledge bases. Indexes markdown
notes, meeting transcripts, documentation, and any text-based files, then
provides hybrid search combining keyword matching, semantic understanding, and
LLM-powered reranking — all running locally with no cloud dependencies.
Created by [Tobi Lütke](https://github.com/tobi/qmd). MIT licensed.
## When to Use
- User asks to search their notes, docs, knowledge base, or meeting transcripts
- User wants to find something across a large collection of markdown/text files
- User wants semantic search ("find notes about X concept") not just keyword grep
- User has already set up qmd collections and wants to query them
- User asks to set up a local knowledge base or document search system
- Keywords: "search my notes", "find in my docs", "knowledge base", "qmd"
## Prerequisites
### Node.js >= 22 (required)
```bash
# Check version
node --version # must be >= 22
# macOS — install or upgrade via Homebrew
brew install node@22
# Linux — use NodeSource or nvm
curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash -
sudo apt-get install -y nodejs
# or with nvm:
nvm install 22 && nvm use 22
```
### SQLite with Extension Support (macOS only)
macOS system SQLite lacks extension loading. Install via Homebrew:
```bash
brew install sqlite
```
### Install qmd
```bash
npm install -g @tobilu/qmd
# or with Bun:
bun install -g @tobilu/qmd
```
First run auto-downloads 3 local GGUF models (~2GB total):
| Model | Purpose | Size |
|-------|---------|------|
| embeddinggemma-300M-Q8_0 | Vector embeddings | ~300MB |
| qwen3-reranker-0.6b-q8_0 | Result reranking | ~640MB |
| qmd-query-expansion-1.7B | Query expansion | ~1.1GB |
### Verify Installation
```bash
qmd --version
qmd status
```
## Quick Reference
| Command | What It Does | Speed |
|---------|-------------|-------|
| `qmd search "query"` | BM25 keyword search (no models) | ~0.2s |
| `qmd vsearch "query"` | Semantic vector search (1 model) | ~3s |
| `qmd query "query"` | Hybrid + reranking (all 3 models) | ~2-3s warm, ~19s cold |
| `qmd get <docid>` | Retrieve full document content | instant |
| `qmd multi-get "glob"` | Retrieve multiple files | instant |
| `qmd collection add <path> --name <n>` | Add a directory as a collection | instant |
| `qmd context add <path> "description"` | Add context metadata to improve retrieval | instant |
| `qmd embed` | Generate/update vector embeddings | varies |
| `qmd status` | Show index health and collection info | instant |
| `qmd mcp` | Start MCP server (stdio) | persistent |
| `qmd mcp --http --daemon` | Start MCP server (HTTP, warm models) | persistent |
## Setup Workflow
### 1. Add Collections
Point qmd at directories containing your documents:
```bash
# Add a notes directory
qmd collection add ~/notes --name notes
# Add project docs
qmd collection add ~/projects/myproject/docs --name project-docs
# Add meeting transcripts
qmd collection add ~/meetings --name meetings
# List all collections
qmd collection list
```
### 2. Add Context Descriptions
Context metadata helps the search engine understand what each collection
contains. This significantly improves retrieval quality:
```bash
qmd context add qmd://notes "Personal notes, ideas, and journal entries"
qmd context add qmd://project-docs "Technical documentation for the main project"
qmd context add qmd://meetings "Meeting transcripts and action items from team syncs"
```
### 3. Generate Embeddings
```bash
qmd embed
```
This processes all documents in all collections and generates vector
embeddings. Re-run after adding new documents or collections.
### 4. Verify
```bash
qmd status # shows index health, collection stats, model info
```
## Search Patterns
### Fast Keyword Search (BM25)
Best for: exact terms, code identifiers, names, known phrases.
No models loaded — near-instant results.
```bash
qmd search "authentication middleware"
qmd search "handleError async"
```
### Semantic Vector Search
Best for: natural language questions, conceptual queries.
Loads embedding model (~3s first query).
```bash
qmd vsearch "how does the rate limiter handle burst traffic"
qmd vsearch "ideas for improving onboarding flow"
```
### Hybrid Search with Reranking (Best Quality)
Best for: important queries where quality matters most.
Uses all 3 models — query expansion, parallel BM25+vector, reranking.
```bash
qmd query "what decisions were made about the database migration"
```
### Structured Multi-Mode Queries
Combine different search types in a single query for precision:
```bash
# BM25 for exact term + vector for concept
qmd query $'lex: rate limiter\nvec: how does throttling work under load'
# With query expansion
qmd query $'expand: database migration plan\nlex: "schema change"'
```
### Query Syntax (lex/BM25 mode)
| Syntax | Effect | Example |
|--------|--------|---------|
| `term` | Prefix match | `perf` matches "performance" |
| `"phrase"` | Exact phrase | `"rate limiter"` |
| `-term` | Exclude term | `performance -sports` |
### HyDE (Hypothetical Document Embeddings)
For complex topics, write what you expect the answer to look like:
```bash
qmd query $'hyde: The migration plan involves three phases. First, we add the new columns without dropping the old ones. Then we backfill data. Finally we cut over and remove legacy columns.'
```
### Scoping to Collections
```bash
qmd search "query" --collection notes
qmd query "query" --collection project-docs
```
### Output Formats
```bash
qmd search "query" --json # JSON output (best for parsing)
qmd search "query" --limit 5 # Limit results
qmd get "#abc123" # Get by document ID
qmd get "path/to/file.md" # Get by file path
qmd get "file.md:50" -l 100 # Get specific line range
qmd multi-get "journals/*.md" --json # Batch retrieve by glob
```
## MCP Integration (Recommended)
qmd exposes an MCP server that provides search tools directly to
Hermes Agent via the native MCP client. This is the preferred
integration — once configured, the agent gets qmd tools automatically
without needing to load this skill.
### Option A: Stdio Mode (Simple)
Add to `~/.hermes/config.yaml`:
```yaml
mcp_servers:
qmd:
command: "qmd"
args: ["mcp"]
timeout: 30
connect_timeout: 45
```
This registers tools: `mcp_qmd_search`, `mcp_qmd_vsearch`,
`mcp_qmd_deep_search`, `mcp_qmd_get`, `mcp_qmd_status`.
**Tradeoff:** Models load on first search call (~19s cold start),
then stay warm for the session. Acceptable for occasional use.
### Option B: HTTP Daemon Mode (Fast, Recommended for Heavy Use)
Start the qmd daemon separately — it keeps models warm in memory:
```bash
# Start daemon (persists across agent restarts)
qmd mcp --http --daemon
# Runs on http://localhost:8181 by default
```
Then configure Hermes Agent to connect via HTTP:
```yaml
mcp_servers:
qmd:
url: "http://localhost:8181/mcp"
timeout: 30
```
**Tradeoff:** Uses ~2GB RAM while running, but every query is fast
(~2-3s). Best for users who search frequently.
### Keeping the Daemon Running
#### macOS (launchd)
```bash
cat > ~/Library/LaunchAgents/com.qmd.daemon.plist << 'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
"http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.qmd.daemon</string>
<key>ProgramArguments</key>
<array>
<string>qmd</string>
<string>mcp</string>
<string>--http</string>
<string>--daemon</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>StandardOutPath</key>
<string>/tmp/qmd-daemon.log</string>
<key>StandardErrorPath</key>
<string>/tmp/qmd-daemon.log</string>
</dict>
</plist>
EOF
launchctl load ~/Library/LaunchAgents/com.qmd.daemon.plist
```
#### Linux (systemd user service)
```bash
mkdir -p ~/.config/systemd/user
cat > ~/.config/systemd/user/qmd-daemon.service << 'EOF'
[Unit]
Description=QMD MCP Daemon
After=network.target
[Service]
ExecStart=qmd mcp --http --daemon
Restart=on-failure
RestartSec=10
Environment=PATH=/usr/local/bin:/usr/bin:/bin
[Install]
WantedBy=default.target
EOF
systemctl --user daemon-reload
systemctl --user enable --now qmd-daemon
systemctl --user status qmd-daemon
```
### MCP Tools Reference
Once connected, these tools are available as `mcp_qmd_*`:
| MCP Tool | Maps To | Description |
|----------|---------|-------------|
| `mcp_qmd_search` | `qmd search` | BM25 keyword search |
| `mcp_qmd_vsearch` | `qmd vsearch` | Semantic vector search |
| `mcp_qmd_deep_search` | `qmd query` | Hybrid search + reranking |
| `mcp_qmd_get` | `qmd get` | Retrieve document by ID or path |
| `mcp_qmd_status` | `qmd status` | Index health and stats |
The MCP tools accept structured JSON queries for multi-mode search:
```json
{
"searches": [
{"type": "lex", "query": "authentication middleware"},
{"type": "vec", "query": "how user login is verified"}
],
"collections": ["project-docs"],
"limit": 10
}
```
## CLI Usage (Without MCP)
When MCP is not configured, use qmd directly via terminal:
```
terminal(command="qmd query 'what was decided about the API redesign' --json", timeout=30)
```
For setup and management tasks, always use terminal:
```
terminal(command="qmd collection add ~/Documents/notes --name notes")
terminal(command="qmd context add qmd://notes 'Personal research notes and ideas'")
terminal(command="qmd embed")
terminal(command="qmd status")
```
## How the Search Pipeline Works
Understanding the internals helps choose the right search mode:
1. **Query Expansion** — A fine-tuned 1.7B model generates 2 alternative
queries. The original gets 2x weight in fusion.
2. **Parallel Retrieval** — BM25 (SQLite FTS5) and vector search run
simultaneously across all query variants.
3. **RRF Fusion** — Reciprocal Rank Fusion (k=60) merges results.
Top-rank bonus: #1 gets +0.05, #2-3 get +0.02.
4. **LLM Reranking** — qwen3-reranker scores top 30 candidates (0.0-1.0).
5. **Position-Aware Blending** — Ranks 1-3: 75% retrieval / 25% reranker.
Ranks 4-10: 60/40. Ranks 11+: 40/60 (trusts reranker more for long tail).
**Smart Chunking:** Documents are split at natural break points (headings,
code blocks, blank lines) targeting ~900 tokens with 15% overlap. Code
blocks are never split mid-block.
## Best Practices
1. **Always add context descriptions**`qmd context add` dramatically
improves retrieval accuracy. Describe what each collection contains.
2. **Re-embed after adding documents**`qmd embed` must be re-run when
new files are added to collections.
3. **Use `qmd search` for speed** — when you need fast keyword lookup
(code identifiers, exact names), BM25 is instant and needs no models.
4. **Use `qmd query` for quality** — when the question is conceptual or
the user needs the best possible results, use hybrid search.
5. **Prefer MCP integration** — once configured, the agent gets native
tools without needing to load this skill each time.
6. **Daemon mode for frequent users** — if the user searches their
knowledge base regularly, recommend the HTTP daemon setup.
7. **First query in structured search gets 2x weight** — put the most
important/certain query first when combining lex and vec.
## Troubleshooting
### "Models downloading on first run"
Normal — qmd auto-downloads ~2GB of GGUF models on first use.
This is a one-time operation.
### Cold start latency (~19s)
This happens when models aren't loaded in memory. Solutions:
- Use HTTP daemon mode (`qmd mcp --http --daemon`) to keep warm
- Use `qmd search` (BM25 only) when models aren't needed
- MCP stdio mode loads models on first search, stays warm for session
### macOS: "unable to load extension"
Install Homebrew SQLite: `brew install sqlite`
Then ensure it's on PATH before system SQLite.
### "No collections found"
Run `qmd collection add <path> --name <name>` to add directories,
then `qmd embed` to index them.
### Embedding model override (CJK/multilingual)
Set `QMD_EMBED_MODEL` environment variable for non-English content:
```bash
export QMD_EMBED_MODEL="your-multilingual-model"
```
## Data Storage
- **Index & vectors:** `~/.cache/qmd/index.sqlite`
- **Models:** Auto-downloaded to local cache on first run
- **No cloud dependencies** — everything runs locally
## References
- [GitHub: tobi/qmd](https://github.com/tobi/qmd)
- [QMD Changelog](https://github.com/tobi/qmd/blob/main/CHANGELOG.md)

View File

@@ -40,7 +40,7 @@ dependencies = [
[project.optional-dependencies]
modal = ["swe-rex[modal]>=1.4.0"]
daytona = ["daytona>=0.148.0"]
dev = ["pytest", "pytest-asyncio", "mcp>=1.2.0", "ruff", "pre-commit", "watchfiles"]
dev = ["pytest", "pytest-asyncio"]
messaging = ["python-telegram-bot>=20.0", "discord.py>=2.0", "aiohttp>=3.9.0", "slack-bolt>=1.18.0", "slack-sdk>=3.27.0"]
cron = ["croniter"]
slack = ["slack-bolt>=1.18.0", "slack-sdk>=3.27.0"]
@@ -76,46 +76,6 @@ py-modules = ["run_agent", "model_tools", "toolsets", "batch_runner", "trajector
[tool.setuptools.packages.find]
include = ["tools", "hermes_cli", "gateway", "cron", "honcho_integration"]
[tool.ruff]
target-version = "py311"
line-length = 120
[tool.ruff.lint]
select = ["E", "F", "W", "I", "UP", "B", "SIM"]
ignore = [
"E402", # late imports — intentional throughout codebase
"E501", # line too long — handled by formatter where it can
"E731", # lambda assignments — used in registry pattern
"E741", # ambiguous variable name — existing patterns
"F811", # redefined unused — intentional overrides
"F841", # unused variable — cleanup separately
"B007", # unused loop variable — cleanup separately
"B904", # raise from — too noisy to gate on
"B905", # zip strict — cleanup separately
"B027", # empty method without abstract decorator
"SIM102", # collapsible if — readability preference
"SIM103", # needless bool — readability preference
"SIM105", # suppressible exception — existing pattern
"SIM108", # ternary — readability preference
"SIM110", # reimplemented builtin
"SIM112", # uncapitalized env var
"SIM115", # open file with context handler
"SIM117", # multiple with statements
"SIM118", # in-dict-keys — cleanup separately
"SIM212", # if-expr twisted arms
]
[tool.ruff.lint.per-file-ignores]
"batch_runner.py" = ["F821"]
"tools/patch_parser.py" = ["F821"]
"gateway/run.py" = ["F821"]
"gateway/channel_directory.py" = ["F401"]
"hermes_cli/doctor.py" = ["F401"]
"tools/image_generation_tool.py" = ["F401"]
[tool.ruff.lint.isort]
known-first-party = ["tools", "hermes_cli", "gateway", "agent", "cron"]
[tool.pytest.ini_options]
testpaths = ["tests"]
markers = [

File diff suppressed because it is too large Load Diff

View File

@@ -492,23 +492,9 @@ install_system_packages() {
return 0
fi
fi
elif [ -e /dev/tty ]; then
# Non-interactive (e.g. curl | bash) but a terminal is available.
# Read the prompt from /dev/tty (same approach the setup wizard uses).
echo ""
log_info "Installing ${description} requires sudo."
read -p "Install? [Y/n] " -n 1 -r < /dev/tty
echo
if [[ $REPLY =~ ^[Yy]$ ]] || [[ -z $REPLY ]]; then
if sudo DEBIAN_FRONTEND=noninteractive NEEDRESTART_MODE=a $install_cmd < /dev/tty; then
[ "$need_ripgrep" = true ] && HAS_RIPGREP=true && log_success "ripgrep installed"
[ "$need_ffmpeg" = true ] && HAS_FFMPEG=true && log_success "ffmpeg installed"
return 0
fi
fi
else
log_warn "Non-interactive mode and no terminal available — cannot install system packages"
log_info "Install manually after setup completes: sudo $install_cmd"
log_warn "Non-interactive mode: cannot prompt for sudo password"
log_info "Install missing packages manually: sudo $install_cmd"
fi
fi
fi
@@ -843,33 +829,6 @@ install_node_deps() {
log_warn "npm install failed (browser tools may not work)"
}
log_success "Node.js dependencies installed"
# Install Playwright browser + system dependencies.
# Playwright's install-deps only supports apt/dnf/zypper natively.
# For Arch/Manjaro we install the system libs via pacman first.
log_info "Installing browser engine (Playwright Chromium)..."
case "$DISTRO" in
arch|manjaro)
if command -v pacman &> /dev/null; then
log_info "Arch/Manjaro detected — installing Chromium system dependencies via pacman..."
if command -v sudo &> /dev/null && sudo -n true 2>/dev/null; then
sudo NEEDRESTART_MODE=a pacman -S --noconfirm --needed \
nss atk at-spi2-core cups libdrm libxkbcommon mesa pango cairo alsa-lib >/dev/null 2>&1 || true
elif [ "$(id -u)" -eq 0 ]; then
pacman -S --noconfirm --needed \
nss atk at-spi2-core cups libdrm libxkbcommon mesa pango cairo alsa-lib >/dev/null 2>&1 || true
else
log_warn "Cannot install browser deps without sudo. Run manually:"
log_warn " sudo pacman -S nss atk at-spi2-core cups libdrm libxkbcommon mesa pango cairo alsa-lib"
fi
fi
cd "$INSTALL_DIR" && npx playwright install chromium 2>/dev/null || true
;;
*)
cd "$INSTALL_DIR" && npx playwright install --with-deps chromium 2>/dev/null || true
;;
esac
log_success "Browser engine installed"
fi
# Install WhatsApp bridge dependencies

View File

@@ -1,3 +0,0 @@
---
description: Creative content generation — ASCII art, hand-drawn style diagrams, and visual design tools.
---

View File

@@ -1,7 +1,7 @@
---
name: ascii-art
description: Generate ASCII art using pyfiglet (571 fonts), cowsay, boxes, toilet, image-to-ascii, remote APIs (asciified, ascii.co.uk), and LLM fallback. No API keys required.
version: 4.0.0
description: Generate ASCII art using pyfiglet (571 fonts), cowsay, boxes, toilet, image-to-ascii conversion, and search curated art from emojicombos.com and asciiart.eu (11,000+ artworks). Falls back to LLM-generated art.
version: 3.1.0
author: 0xbyt4, Hermes Agent
license: MIT
dependencies: []
@@ -14,9 +14,9 @@ metadata:
# ASCII Art Skill
Multiple tools for different ASCII art needs. All tools are local CLI programs or free REST APIs — no API keys required.
Multiple tools for different ASCII art needs. All tools are local CLI programs — no API keys required.
## Tool 1: Text Banners (pyfiglet — local)
## Tool 1: Text Banners (pyfiglet)
Render text as large ASCII art banners. 571 built-in fonts.
@@ -53,35 +53,7 @@ python3 -m pyfiglet --list_fonts # List all 571 fonts
- Short text (1-8 chars) works best with detailed fonts like `doom` or `block`
- Long text works better with compact fonts like `small` or `mini`
## Tool 2: Text Banners (asciified API — remote, no install)
Free REST API that converts text to ASCII art. 250+ FIGlet fonts. Returns plain text directly — no parsing needed. Use this when pyfiglet is not installed or as a quick alternative.
### Usage (via terminal curl)
```bash
# Basic text banner (default font)
curl -s "https://asciified.thelicato.io/api/v2/ascii?text=Hello+World"
# With a specific font
curl -s "https://asciified.thelicato.io/api/v2/ascii?text=Hello&font=Slant"
curl -s "https://asciified.thelicato.io/api/v2/ascii?text=Hello&font=Doom"
curl -s "https://asciified.thelicato.io/api/v2/ascii?text=Hello&font=Star+Wars"
curl -s "https://asciified.thelicato.io/api/v2/ascii?text=Hello&font=3-D"
curl -s "https://asciified.thelicato.io/api/v2/ascii?text=Hello&font=Banner3"
# List all available fonts (returns JSON array)
curl -s "https://asciified.thelicato.io/api/v2/fonts"
```
### Tips
- URL-encode spaces as `+` in the text parameter
- The response is plain text ASCII art — no JSON wrapping, ready to display
- Font names are case-sensitive; use the fonts endpoint to get exact names
- Works from any terminal with curl — no Python or pip needed
## Tool 3: Cowsay (Message Art)
## Tool 2: Cowsay (Message Art)
Classic tool that wraps text in a speech bubble with an ASCII character.
@@ -125,7 +97,7 @@ cowsay -e "OO" "Msg" # Custom eyes
cowsay -T "U " "Msg" # Custom tongue
```
## Tool 4: Boxes (Decorative Borders)
## Tool 3: Boxes (Decorative Borders)
Draw decorative ASCII art borders/frames around any text. 70+ built-in designs.
@@ -152,15 +124,13 @@ echo "Hello World" | boxes -a c # Center text
boxes -l # List all 70+ designs
```
### Combine with pyfiglet or asciified
### Combine with pyfiglet
```bash
python3 -m pyfiglet "HERMES" -f slant | boxes -d stone
# Or without pyfiglet installed:
curl -s "https://asciified.thelicato.io/api/v2/ascii?text=HERMES&font=Slant" | boxes -d stone
```
## Tool 5: TOIlet (Colored Text Art)
## Tool 4: TOIlet (Colored Text Art)
Like pyfiglet but with ANSI color effects and visual filters. Great for terminal eye candy.
@@ -190,14 +160,14 @@ toilet -F list # List available filters
**Note**: toilet outputs ANSI escape codes for colors — works in terminals but may not render in all contexts (e.g., plain text files, some chat platforms).
## Tool 6: Image to ASCII Art
## Tool 5: Image to ASCII Art
Convert images (PNG, JPEG, GIF, WEBP) to ASCII art.
### Option A: ascii-image-converter (recommended, modern)
```bash
# Install
# Install via snap or Go
sudo snap install ascii-image-converter
# OR: go install github.com/TheZoraiz/ascii-image-converter@latest
```
@@ -220,77 +190,63 @@ jp2a --width=80 image.jpg
jp2a --colors image.jpg # Colorized
```
## Tool 7: Search Pre-Made ASCII Art
## Tool 6: Search Pre-Made ASCII Art (Web APIs)
Search curated ASCII art from the web. Use `terminal` with `curl`.
Search curated ASCII art databases via `web_extract`. No API keys needed.
### Source A: ascii.co.uk (recommended for pre-made art)
### Source A: emojicombos.com (recommended first)
Large collection of classic ASCII art organized by subject. Art is inside HTML `<pre>` tags. Fetch the page with curl, then extract art with a small Python snippet.
Huge collection of ASCII art, dot art, kaomoji, and emoji combos. Modern, meme-aware, user-submitted content. Great for pop culture, animals, objects, aesthetics.
**URL pattern:** `https://ascii.co.uk/art/{subject}`
**URL pattern:** `https://emojicombos.com/{term}-ascii-art`
**Step 1 — Fetch the page:**
```bash
curl -s 'https://ascii.co.uk/art/cat' -o /tmp/ascii_art.html
```
**Step 2 — Extract art from pre tags:**
```python
import re, html
with open('/tmp/ascii_art.html') as f:
text = f.read()
arts = re.findall(r'<pre[^>]*>(.*?)</pre>', text, re.DOTALL)
for art in arts:
clean = re.sub(r'<[^>]+>', '', art)
clean = html.unescape(clean).strip()
if len(clean) > 30:
print(clean)
print('\n---\n')
web_extract(urls=["https://emojicombos.com/cat-ascii-art"])
web_extract(urls=["https://emojicombos.com/rocket-ascii-art"])
web_extract(urls=["https://emojicombos.com/dragon-ascii-art"])
web_extract(urls=["https://emojicombos.com/skull-ascii-art"])
web_extract(urls=["https://emojicombos.com/heart-ascii-art"])
```
**Available subjects** (use as URL path):
- Animals: `cat`, `dog`, `horse`, `bird`, `fish`, `dragon`, `snake`, `rabbit`, `elephant`, `dolphin`, `butterfly`, `owl`, `wolf`, `bear`, `penguin`, `turtle`
- Objects: `car`, `ship`, `airplane`, `rocket`, `guitar`, `computer`, `coffee`, `beer`, `cake`, `house`, `castle`, `sword`, `crown`, `key`
- Nature: `tree`, `flower`, `sun`, `moon`, `star`, `mountain`, `ocean`, `rainbow`
- Characters: `skull`, `robot`, `angel`, `wizard`, `pirate`, `ninja`, `alien`
- Holidays: `christmas`, `halloween`, `valentine`
**Tips:**
- Preserve artist signatures/initials — important etiquette
- Multiple art pieces per page — pick the best one for the user
- Works reliably via curl, no JavaScript needed
- Use hyphenated search terms: `hello-kitty-ascii-art`, `star-wars-ascii-art`
- Returns a mix of classic ASCII, Braille dot art, and kaomoji — pick the best style for the user
- Includes modern meme art and pop culture references
- Great for kaomoji/emoticons too: `https://emojicombos.com/cat-kaomoji`
### Source B: GitHub Octocat API (fun easter egg)
### Source B: asciiart.eu (classic archive)
Returns a random GitHub Octocat with a wise quote. No auth needed.
11,000+ classic ASCII artworks organized by category. More traditional/vintage art.
**Browse by category** (use as URL paths):
- `animals/cats`, `animals/dogs`, `animals/birds`, `animals/horses`
- `animals/dolphins`, `animals/dragons`, `animals/insects`
- `space/rockets`, `space/stars`, `space/planets`
- `vehicles/cars`, `vehicles/ships`, `vehicles/airplanes`
- `food-and-drinks/coffee`, `food-and-drinks/beer`
- `computers/computers`, `electronics/robots`
- `art-and-design/hearts`, `art-and-design/skulls`
- `plants/flowers`, `plants/trees`
- `mythology/dragons`, `mythology/unicorns`
```
web_extract(urls=["https://www.asciiart.eu/animals/cats"])
web_extract(urls=["https://www.asciiart.eu/search?q=rocket"])
```
**Tips:**
- Preserve artist initials/signatures (e.g., `jgs`, `hjw`) — this is important etiquette
- Better for classic/vintage ASCII art style
### Source C: GitHub Octocat API (fun easter egg)
Returns a random GitHub Octocat with a quote. No auth needed.
```bash
curl -s https://api.github.com/octocat
```
## Tool 8: Fun ASCII Utilities (via curl)
These free services return ASCII art directly — great for fun extras.
### QR Codes as ASCII Art
```bash
curl -s "qrenco.de/Hello+World"
curl -s "qrenco.de/https://example.com"
```
### Weather as ASCII Art
```bash
curl -s "wttr.in/London" # Full weather report with ASCII graphics
curl -s "wttr.in/Moon" # Moon phase in ASCII art
curl -s "v2.wttr.in/London" # Detailed version
```
## Tool 9: LLM-Generated Custom Art (Fallback)
## Tool 7: LLM-Generated Custom Art (Fallback)
When tools above don't have what's needed, generate ASCII art directly using these Unicode characters:
@@ -308,14 +264,28 @@ When tools above don't have what's needed, generate ASCII art directly using the
- Max height: 15 lines for banners, 25 for scenes
- Monospace only: output must render correctly in fixed-width fonts
## Fun Extras
### Star Wars in ASCII (via telnet)
```bash
telnet towel.blinkenlights.nl
```
### Useful Resources
- [asciiart.eu](https://www.asciiart.eu/) — 11,000+ artworks, searchable
- [patorjk.com/software/taag](http://patorjk.com/software/taag/) — Web-based text-to-ASCII with font preview
- [asciiflow.com](http://asciiflow.com/) — Interactive ASCII diagram editor (browser)
- [awesome-ascii-art](https://github.com/moul/awesome-ascii-art) — Curated resource list
## Decision Flow
1. **Text as a banner** → pyfiglet if installed, otherwise asciified API via curl
1. **Text as a banner** → pyfiglet (or toilet for colored output)
2. **Wrap a message in fun character art** → cowsay
3. **Add decorative border/frame** → boxes (can combine with pyfiglet/asciified)
4. **Art of a specific thing** (cat, rocket, dragon) → ascii.co.uk via curl + parsing
5. **Convert an image to ASCII** → ascii-image-converter or jp2a
6. **QR code** → qrenco.de via curl
7. **Weather/moon art** → wttr.in via curl
8. **Something custom/creative** → LLM generation with Unicode palette
9. **Any tool not installed** → install it, or fall back to next option
3. **Add decorative border/frame** → boxes (can combine with pyfiglet)
4. **Art of a thing** (cat, rocket, dragon) → emojicombos.com first, then asciiart.eu
5. **Kaomoji / emoticons** → emojicombos.com (`{term}-kaomoji`)
6. **Convert an image to ASCII** → ascii-image-converter or jp2a
7. **Something custom/creative** → LLM generation with Unicode palette
8. **Any tool not installed** → install it, or fall back to next option

View File

@@ -1,162 +0,0 @@
---
name: dogfood
description: Systematic exploratory QA testing of web applications — find bugs, capture evidence, and generate structured reports
version: 1.0.0
metadata:
hermes:
tags: [qa, testing, browser, web, dogfood]
related_skills: []
---
# Dogfood: Systematic Web Application QA Testing
## Overview
This skill guides you through systematic exploratory QA testing of web applications using the browser toolset. You will navigate the application, interact with elements, capture evidence of issues, and produce a structured bug report.
## Prerequisites
- Browser toolset must be available (`browser_navigate`, `browser_snapshot`, `browser_click`, `browser_type`, `browser_vision`, `browser_console`, `browser_scroll`, `browser_back`, `browser_press`, `browser_close`)
- A target URL and testing scope from the user
## Inputs
The user provides:
1. **Target URL** — the entry point for testing
2. **Scope** — what areas/features to focus on (or "full site" for comprehensive testing)
3. **Output directory** (optional) — where to save screenshots and the report (default: `./dogfood-output`)
## Workflow
Follow this 5-phase systematic workflow:
### Phase 1: Plan
1. Create the output directory structure:
```
{output_dir}/
├── screenshots/ # Evidence screenshots
└── report.md # Final report (generated in Phase 5)
```
2. Identify the testing scope based on user input.
3. Build a rough sitemap by planning which pages and features to test:
- Landing/home page
- Navigation links (header, footer, sidebar)
- Key user flows (sign up, login, search, checkout, etc.)
- Forms and interactive elements
- Edge cases (empty states, error pages, 404s)
### Phase 2: Explore
For each page or feature in your plan:
1. **Navigate** to the page:
```
browser_navigate(url="https://example.com/page")
```
2. **Take a snapshot** to understand the DOM structure:
```
browser_snapshot()
```
3. **Check the console** for JavaScript errors:
```
browser_console(clear=true)
```
Do this after every navigation and after every significant interaction. Silent JS errors are high-value findings.
4. **Take an annotated screenshot** to visually assess the page and identify interactive elements:
```
browser_vision(question="Describe the page layout, identify any visual issues, broken elements, or accessibility concerns", annotate=true)
```
The `annotate=true` flag overlays numbered `[N]` labels on interactive elements. Each `[N]` maps to ref `@eN` for subsequent browser commands.
5. **Test interactive elements** systematically:
- Click buttons and links: `browser_click(ref="@eN")`
- Fill forms: `browser_type(ref="@eN", text="test input")`
- Test keyboard navigation: `browser_press(key="Tab")`, `browser_press(key="Enter")`
- Scroll through content: `browser_scroll(direction="down")`
- Test form validation with invalid inputs
- Test empty submissions
6. **After each interaction**, check for:
- Console errors: `browser_console()`
- Visual changes: `browser_vision(question="What changed after the interaction?")`
- Expected vs actual behavior
### Phase 3: Collect Evidence
For every issue found:
1. **Take a screenshot** showing the issue:
```
browser_vision(question="Capture and describe the issue visible on this page", annotate=false)
```
Save the `screenshot_path` from the response — you will reference it in the report.
2. **Record the details**:
- URL where the issue occurs
- Steps to reproduce
- Expected behavior
- Actual behavior
- Console errors (if any)
- Screenshot path
3. **Classify the issue** using the issue taxonomy (see `references/issue-taxonomy.md`):
- Severity: Critical / High / Medium / Low
- Category: Functional / Visual / Accessibility / Console / UX / Content
### Phase 4: Categorize
1. Review all collected issues.
2. De-duplicate — merge issues that are the same bug manifesting in different places.
3. Assign final severity and category to each issue.
4. Sort by severity (Critical first, then High, Medium, Low).
5. Count issues by severity and category for the executive summary.
### Phase 5: Report
Generate the final report using the template at `templates/dogfood-report-template.md`.
The report must include:
1. **Executive summary** with total issue count, breakdown by severity, and testing scope
2. **Per-issue sections** with:
- Issue number and title
- Severity and category badges
- URL where observed
- Description of the issue
- Steps to reproduce
- Expected vs actual behavior
- Screenshot references (use `MEDIA:<screenshot_path>` for inline images)
- Console errors if relevant
3. **Summary table** of all issues
4. **Testing notes** — what was tested, what was not, any blockers
Save the report to `{output_dir}/report.md`.
## Tools Reference
| Tool | Purpose |
|------|---------|
| `browser_navigate` | Go to a URL |
| `browser_snapshot` | Get DOM text snapshot (accessibility tree) |
| `browser_click` | Click an element by ref (`@eN`) or text |
| `browser_type` | Type into an input field |
| `browser_scroll` | Scroll up/down on the page |
| `browser_back` | Go back in browser history |
| `browser_press` | Press a keyboard key |
| `browser_vision` | Screenshot + AI analysis; use `annotate=true` for element labels |
| `browser_console` | Get JS console output and errors |
| `browser_close` | Close the browser session |
## Tips
- **Always check `browser_console()` after navigating and after significant interactions.** Silent JS errors are among the most valuable findings.
- **Use `annotate=true` with `browser_vision`** when you need to reason about interactive element positions or when the snapshot refs are unclear.
- **Test with both valid and invalid inputs** — form validation bugs are common.
- **Scroll through long pages** — content below the fold may have rendering issues.
- **Test navigation flows** — click through multi-step processes end-to-end.
- **Check responsive behavior** by noting any layout issues visible in screenshots.
- **Don't forget edge cases**: empty states, very long text, special characters, rapid clicking.
- When reporting screenshots to the user, include `MEDIA:<screenshot_path>` so they can see the evidence inline.

View File

@@ -1,109 +0,0 @@
# Issue Taxonomy
Use this taxonomy to classify issues found during dogfood QA testing.
## Severity Levels
### Critical
The issue makes a core feature completely unusable or causes data loss.
**Examples:**
- Application crashes or shows a blank white page
- Form submission silently loses user data
- Authentication is completely broken (can't log in at all)
- Payment flow fails and charges the user without completing the order
- Security vulnerability (e.g., XSS, exposed credentials in console)
### High
The issue significantly impairs functionality but a workaround may exist.
**Examples:**
- A key button does nothing when clicked (but refreshing fixes it)
- Search returns no results for valid queries
- Form validation rejects valid input
- Page loads but critical content is missing or garbled
- Navigation link leads to a 404 or wrong page
- Uncaught JavaScript exceptions in the console on core pages
### Medium
The issue is noticeable and affects user experience but doesn't block core functionality.
**Examples:**
- Layout is misaligned or overlapping on certain screen sections
- Images fail to load (broken image icons)
- Slow performance (visible loading delays > 3 seconds)
- Form field lacks proper validation feedback (no error message on bad input)
- Console warnings that suggest deprecated or misconfigured features
- Inconsistent styling between similar pages
### Low
Minor polish issues that don't affect functionality.
**Examples:**
- Typos or grammatical errors in text content
- Minor spacing or alignment inconsistencies
- Placeholder text left in production ("Lorem ipsum")
- Favicon missing
- Console info/debug messages that shouldn't be in production
- Subtle color contrast issues that don't fail WCAG requirements
## Categories
### Functional
Issues where features don't work as expected.
- Buttons/links that don't respond
- Forms that don't submit or submit incorrectly
- Broken user flows (can't complete a multi-step process)
- Incorrect data displayed
- Features that work partially
### Visual
Issues with the visual presentation of the page.
- Layout problems (overlapping elements, broken grids)
- Broken images or missing media
- Styling inconsistencies
- Responsive design failures
- Z-index issues (elements hidden behind others)
- Text overflow or truncation
### Accessibility
Issues that prevent or hinder access for users with disabilities.
- Missing alt text on meaningful images
- Poor color contrast (fails WCAG AA)
- Elements not reachable via keyboard navigation
- Missing form labels or ARIA attributes
- Focus indicators missing or unclear
- Screen reader incompatible content
### Console
Issues detected through JavaScript console output.
- Uncaught exceptions and unhandled promise rejections
- Failed network requests (4xx, 5xx errors in console)
- Deprecation warnings
- CORS errors
- Mixed content warnings (HTTP resources on HTTPS page)
- Excessive console.log output left from development
### UX (User Experience)
Issues where functionality works but the experience is poor.
- Confusing navigation or information architecture
- Missing loading indicators (user doesn't know something is happening)
- No feedback after user actions (e.g., button click with no visible result)
- Inconsistent interaction patterns
- Missing confirmation dialogs for destructive actions
- Poor error messages that don't help the user recover
### Content
Issues with the text, media, or information on the page.
- Typos and grammatical errors
- Placeholder/dummy content in production
- Outdated information
- Missing content (empty sections)
- Broken or dead links to external resources
- Incorrect or misleading labels

View File

@@ -1,86 +0,0 @@
# Dogfood QA Report
**Target:** {target_url}
**Date:** {date}
**Scope:** {scope_description}
**Tester:** Hermes Agent (automated exploratory QA)
---
## Executive Summary
| Severity | Count |
|----------|-------|
| 🔴 Critical | {critical_count} |
| 🟠 High | {high_count} |
| 🟡 Medium | {medium_count} |
| 🔵 Low | {low_count} |
| **Total** | **{total_count}** |
**Overall Assessment:** {one_sentence_assessment}
---
## Issues
<!-- Repeat this section for each issue found, sorted by severity (Critical first) -->
### Issue #{issue_number}: {issue_title}
| Field | Value |
|-------|-------|
| **Severity** | {severity} |
| **Category** | {category} |
| **URL** | {url_where_found} |
**Description:**
{detailed_description_of_the_issue}
**Steps to Reproduce:**
1. {step_1}
2. {step_2}
3. {step_3}
**Expected Behavior:**
{what_should_happen}
**Actual Behavior:**
{what_actually_happens}
**Screenshot:**
MEDIA:{screenshot_path}
**Console Errors** (if applicable):
```
{console_error_output}
```
---
<!-- End of per-issue section -->
## Issues Summary Table
| # | Title | Severity | Category | URL |
|---|-------|----------|----------|-----|
| {n} | {title} | {severity} | {category} | {url} |
## Testing Coverage
### Pages Tested
- {list_of_pages_visited}
### Features Tested
- {list_of_features_exercised}
### Not Tested / Out of Scope
- {areas_not_covered_and_why}
### Blockers
- {any_issues_that_prevented_testing_certain_areas}
---
## Notes
{any_additional_observations_or_recommendations}

View File

@@ -1,69 +0,0 @@
---
name: find-nearby
description: Find nearby places (restaurants, cafes, bars, pharmacies, etc.) using OpenStreetMap. Works with coordinates, addresses, cities, zip codes, or Telegram location pins. No API keys needed.
version: 1.0.0
metadata:
hermes:
tags: [location, maps, nearby, places, restaurants, local]
related_skills: []
---
# Find Nearby — Local Place Discovery
Find restaurants, cafes, bars, pharmacies, and other places near any location. Uses OpenStreetMap (free, no API keys). Works with:
- **Coordinates** from Telegram location pins (latitude/longitude in conversation)
- **Addresses** ("near 123 Main St, Springfield")
- **Cities** ("restaurants in downtown Austin")
- **Zip codes** ("pharmacies near 90210")
- **Landmarks** ("cafes near Times Square")
## Quick Reference
```bash
# By coordinates (from Telegram location pin or user-provided)
python3 SKILL_DIR/scripts/find_nearby.py --lat <LAT> --lon <LON> --type restaurant --radius 1500
# By address, city, or landmark (auto-geocoded)
python3 SKILL_DIR/scripts/find_nearby.py --near "Times Square, New York" --type cafe
# Multiple place types
python3 SKILL_DIR/scripts/find_nearby.py --near "downtown austin" --type restaurant --type bar --limit 10
# JSON output
python3 SKILL_DIR/scripts/find_nearby.py --near "90210" --type pharmacy --json
```
### Parameters
| Flag | Description | Default |
|------|-------------|---------|
| `--lat`, `--lon` | Exact coordinates | — |
| `--near` | Address, city, zip, or landmark (geocoded) | — |
| `--type` | Place type (repeatable for multiple) | restaurant |
| `--radius` | Search radius in meters | 1500 |
| `--limit` | Max results | 15 |
| `--json` | Machine-readable JSON output | off |
### Common Place Types
`restaurant`, `cafe`, `bar`, `pub`, `fast_food`, `pharmacy`, `hospital`, `bank`, `atm`, `fuel`, `parking`, `supermarket`, `convenience`, `hotel`
## Workflow
1. **Get the location.** Look for coordinates (`latitude: ... / longitude: ...`) from a Telegram pin, or ask the user for an address/city/zip.
2. **Ask for preferences** (only if not already stated): place type, how far they're willing to go, any specifics (cuisine, "open now", etc.).
3. **Run the script** with appropriate flags. Use `--json` if you need to process results programmatically.
4. **Present results** with names, distances, and Google Maps links. If the user asked about hours or "open now," check the `hours` field in results — if missing or unclear, verify with `web_search`.
5. **For directions**, use the `directions_url` from results, or construct: `https://www.google.com/maps/dir/?api=1&origin=<LAT>,<LON>&destination=<LAT>,<LON>`
## Tips
- If results are sparse, widen the radius (1500 → 3000m)
- For "open now" requests: check the `hours` field in results, cross-reference with `web_search` for accuracy since OSM hours aren't always complete
- Zip codes alone can be ambiguous globally — prompt the user for country/state if results look wrong
- The script uses OpenStreetMap data which is community-maintained; coverage varies by region

View File

@@ -1,184 +0,0 @@
#!/usr/bin/env python3
"""Find nearby places using OpenStreetMap (Overpass + Nominatim). No API keys needed.
Usage:
# By coordinates
python find_nearby.py --lat 36.17 --lon -115.14 --type restaurant --radius 1500
# By address/city/zip (auto-geocoded)
python find_nearby.py --near "Times Square, New York" --type cafe --radius 1000
python find_nearby.py --near "90210" --type pharmacy
# Multiple types
python find_nearby.py --lat 36.17 --lon -115.14 --type restaurant --type bar
# JSON output for programmatic use
python find_nearby.py --near "downtown las vegas" --type restaurant --json
"""
import argparse
import json
import math
import sys
import urllib.parse
import urllib.request
from typing import Any
OVERPASS_URLS = [
"https://overpass-api.de/api/interpreter",
"https://overpass.kumi.systems/api/interpreter",
]
NOMINATIM_URL = "https://nominatim.openstreetmap.org/search"
USER_AGENT = "HermesAgent/1.0 (find-nearby skill)"
TIMEOUT = 15
def _http_get(url: str) -> Any:
req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
with urllib.request.urlopen(req, timeout=TIMEOUT) as r:
return json.loads(r.read())
def _http_post(url: str, data: str) -> Any:
req = urllib.request.Request(
url, data=data.encode(), headers={"User-Agent": USER_AGENT}
)
with urllib.request.urlopen(req, timeout=TIMEOUT) as r:
return json.loads(r.read())
def haversine(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
"""Distance in meters between two coordinates."""
R = 6_371_000
rlat1, rlat2 = math.radians(lat1), math.radians(lat2)
dlat = math.radians(lat2 - lat1)
dlon = math.radians(lon2 - lon1)
a = math.sin(dlat / 2) ** 2 + math.cos(rlat1) * math.cos(rlat2) * math.sin(dlon / 2) ** 2
return R * 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
def geocode(query: str) -> tuple[float, float]:
"""Convert address/city/zip to coordinates via Nominatim."""
params = urllib.parse.urlencode({"q": query, "format": "json", "limit": 1})
results = _http_get(f"{NOMINATIM_URL}?{params}")
if not results:
print(f"Error: Could not geocode '{query}'. Try a more specific address.", file=sys.stderr)
sys.exit(1)
return float(results[0]["lat"]), float(results[0]["lon"])
def find_nearby(lat: float, lon: float, types: list[str], radius: int = 1500, limit: int = 15) -> list[dict]:
"""Query Overpass for nearby amenities."""
# Build Overpass QL query
type_filters = "".join(
f'nwr["amenity"="{t}"](around:{radius},{lat},{lon});' for t in types
)
query = f"[out:json][timeout:{TIMEOUT}];({type_filters});out center tags;"
# Try each Overpass server
data = None
for url in OVERPASS_URLS:
try:
data = _http_post(url, f"data={urllib.parse.quote(query)}")
break
except Exception:
continue
if not data:
return []
# Parse results
places = []
for el in data.get("elements", []):
tags = el.get("tags", {})
name = tags.get("name")
if not name:
continue
# Get coordinates (nodes have lat/lon directly, ways/relations use center)
plat = el.get("lat") or (el.get("center", {}) or {}).get("lat")
plon = el.get("lon") or (el.get("center", {}) or {}).get("lon")
if not plat or not plon:
continue
dist = haversine(lat, lon, plat, plon)
place = {
"name": name,
"type": tags.get("amenity", ""),
"distance_m": round(dist),
"lat": plat,
"lon": plon,
"maps_url": f"https://www.google.com/maps/search/?api=1&query={plat},{plon}",
"directions_url": f"https://www.google.com/maps/dir/?api=1&origin={lat},{lon}&destination={plat},{plon}",
}
# Add useful optional fields
if tags.get("cuisine"):
place["cuisine"] = tags["cuisine"]
if tags.get("opening_hours"):
place["hours"] = tags["opening_hours"]
if tags.get("phone"):
place["phone"] = tags["phone"]
if tags.get("website"):
place["website"] = tags["website"]
if tags.get("addr:street"):
addr_parts = [tags.get("addr:housenumber", ""), tags.get("addr:street", "")]
if tags.get("addr:city"):
addr_parts.append(tags["addr:city"])
place["address"] = " ".join(p for p in addr_parts if p)
places.append(place)
# Sort by distance, limit results
places.sort(key=lambda p: p["distance_m"])
return places[:limit]
def main():
parser = argparse.ArgumentParser(description="Find nearby places via OpenStreetMap")
parser.add_argument("--lat", type=float, help="Latitude")
parser.add_argument("--lon", type=float, help="Longitude")
parser.add_argument("--near", type=str, help="Address, city, or zip code (geocoded automatically)")
parser.add_argument("--type", action="append", dest="types", default=[], help="Place type (restaurant, cafe, bar, pharmacy, etc.)")
parser.add_argument("--radius", type=int, default=1500, help="Search radius in meters (default: 1500)")
parser.add_argument("--limit", type=int, default=15, help="Max results (default: 15)")
parser.add_argument("--json", action="store_true", dest="json_output", help="Output as JSON")
args = parser.parse_args()
# Resolve coordinates
if args.near:
lat, lon = geocode(args.near)
elif args.lat is not None and args.lon is not None:
lat, lon = args.lat, args.lon
else:
print("Error: Provide --lat/--lon or --near", file=sys.stderr)
sys.exit(1)
if not args.types:
args.types = ["restaurant"]
places = find_nearby(lat, lon, args.types, args.radius, args.limit)
if args.json_output:
print(json.dumps({"origin": {"lat": lat, "lon": lon}, "results": places, "count": len(places)}, indent=2))
else:
if not places:
print(f"No {'/'.join(args.types)} found within {args.radius}m")
return
print(f"Found {len(places)} places within {args.radius}m:\n")
for i, p in enumerate(places, 1):
dist_str = f"{p['distance_m']}m" if p["distance_m"] < 1000 else f"{p['distance_m']/1000:.1f}km"
print(f" {i}. {p['name']} ({p['type']}) — {dist_str}")
if p.get("cuisine"):
print(f" Cuisine: {p['cuisine']}")
if p.get("hours"):
print(f" Hours: {p['hours']}")
if p.get("address"):
print(f" Address: {p['address']}")
print(f" Map: {p['maps_url']}")
print()
if __name__ == "__main__":
main()

View File

@@ -321,32 +321,6 @@ mcp_servers:
All tools from all servers are registered and available simultaneously. Each server's tools are prefixed with its name to avoid collisions.
## Sampling (Server-Initiated LLM Requests)
Hermes supports MCP's `sampling/createMessage` capability — MCP servers can request LLM completions through the agent during tool execution. This enables agent-in-the-loop workflows (data analysis, content generation, decision-making).
Sampling is **enabled by default**. Configure per server:
```yaml
mcp_servers:
my_server:
command: "npx"
args: ["-y", "my-mcp-server"]
sampling:
enabled: true # default: true
model: "gemini-3-flash" # model override (optional)
max_tokens_cap: 4096 # max tokens per request
timeout: 30 # LLM call timeout (seconds)
max_rpm: 10 # max requests per minute
allowed_models: [] # model whitelist (empty = all)
max_tool_rounds: 5 # tool loop limit (0 = disable)
log_level: "info" # audit verbosity
```
Servers can also include `tools` in sampling requests for multi-turn tool-augmented workflows. The `max_tool_rounds` config prevents infinite tool loops. Per-server audit metrics (requests, errors, tokens, tool use count) are tracked via `get_mcp_status()`.
Disable sampling for untrusted servers with `sampling: { enabled: false }`.
## Notes
- MCP tools are called synchronously from the agent's perspective but run asynchronously on a dedicated background event loop

View File

@@ -1,3 +1 @@
---
description: Skills for working with media content — YouTube transcripts, GIF search, music generation, and audio visualization.
---
Media content extraction and transformation tools — YouTube transcripts, audio, video processing.

Some files were not shown because too many files have changed in this diff Show More