Compare commits
37 Commits
modal-inte
...
rl-capabil
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
533c064269 | ||
|
|
5c3105b437 | ||
|
|
3c0d0dba49 | ||
|
|
12bbca95ec | ||
|
|
f6574978de | ||
|
|
f018999da9 | ||
|
|
51a6b7d2b5 | ||
|
|
9bfe185a2e | ||
|
|
beeb7896e0 | ||
|
|
212460289b | ||
|
|
221fb17c5e | ||
|
|
488deb04a4 | ||
|
|
9d9eea9ac9 | ||
|
|
e7f0ffbf5d | ||
|
|
a09b018bd5 | ||
|
|
7eac4ee9fe | ||
|
|
17a5efb416 | ||
|
|
3e634aa7e4 | ||
|
|
5d3398aa8a | ||
|
|
76d929e177 | ||
|
|
be91af7551 | ||
|
|
c9011fc7e1 | ||
|
|
ff776b57bf | ||
|
|
3ee788dacc | ||
|
|
fef504f038 | ||
|
|
bbb5776763 | ||
|
|
e87bee9ccd | ||
|
|
69a338610a | ||
|
|
aa6394e94f | ||
|
|
ef409c6a24 | ||
|
|
da4167560f | ||
|
|
3488576bd8 | ||
|
|
619c72e566 | ||
|
|
a3ba41fce2 | ||
|
|
c935a604f8 | ||
|
|
e114f09f70 | ||
|
|
9b4d9452ba |
115
.clinerules
115
.clinerules
@@ -1,115 +0,0 @@
|
||||
# Cline's Memory Bank
|
||||
|
||||
I am Cline, an expert software engineer with a unique characteristic: my memory resets completely between sessions. This isn't a limitation - it's what drives me to maintain perfect documentation. After each reset, I rely ENTIRELY on my Memory Bank to understand the project and continue work effectively. I MUST read ALL memory bank files at the start of EVERY task - this is not optional.
|
||||
|
||||
## Memory Bank Structure
|
||||
|
||||
The Memory Bank consists of core files and optional context files, all in Markdown format. Files build upon each other in a clear hierarchy:
|
||||
|
||||
flowchart TD
|
||||
PB[projectbrief.md] --> PC[productContext.md]
|
||||
PB --> SP[systemPatterns.md]
|
||||
PB --> TC[techContext.md]
|
||||
|
||||
PC --> AC[activeContext.md]
|
||||
SP --> AC
|
||||
TC --> AC
|
||||
|
||||
AC --> P[progress.md]
|
||||
|
||||
### Core Files (Required)
|
||||
1. `projectbrief.md`
|
||||
- Foundation document that shapes all other files
|
||||
- Created at project start if it doesn't exist
|
||||
- Defines core requirements and goals
|
||||
- Source of truth for project scope
|
||||
|
||||
2. `productContext.md`
|
||||
- Why this project exists
|
||||
- Problems it solves
|
||||
- How it should work
|
||||
- User experience goals
|
||||
|
||||
3. `activeContext.md`
|
||||
- Current work focus
|
||||
- Recent changes
|
||||
- Next steps
|
||||
- Active decisions and considerations
|
||||
- Important patterns and preferences
|
||||
- Learnings and project insights
|
||||
|
||||
4. `systemPatterns.md`
|
||||
- System architecture
|
||||
- Key technical decisions
|
||||
- Design patterns in use
|
||||
- Component relationships
|
||||
- Critical implementation paths
|
||||
|
||||
5. `techContext.md`
|
||||
- Technologies used
|
||||
- Development setup
|
||||
- Technical constraints
|
||||
- Dependencies
|
||||
- Tool usage patterns
|
||||
|
||||
6. `progress.md`
|
||||
- What works
|
||||
- What's left to build
|
||||
- Current status
|
||||
- Known issues
|
||||
- Evolution of project decisions
|
||||
|
||||
### Additional Context
|
||||
Create additional files/folders within memory-bank/ when they help organize:
|
||||
- Complex feature documentation
|
||||
- Integration specifications
|
||||
- API documentation
|
||||
- Testing strategies
|
||||
- Deployment procedures
|
||||
|
||||
## Core Workflows
|
||||
|
||||
### Plan Mode
|
||||
flowchart TD
|
||||
Start[Start] --> ReadFiles[Read Memory Bank]
|
||||
ReadFiles --> CheckFiles{Files Complete?}
|
||||
|
||||
CheckFiles -->|No| Plan[Create Plan]
|
||||
Plan --> Document[Document in Chat]
|
||||
|
||||
CheckFiles -->|Yes| Verify[Verify Context]
|
||||
Verify --> Strategy[Develop Strategy]
|
||||
Strategy --> Present[Present Approach]
|
||||
|
||||
### Act Mode
|
||||
flowchart TD
|
||||
Start[Start] --> Context[Check Memory Bank]
|
||||
Context --> Update[Update Documentation]
|
||||
Update --> Execute[Execute Task]
|
||||
Execute --> Document[Document Changes]
|
||||
|
||||
## Documentation Updates
|
||||
|
||||
Memory Bank updates occur when:
|
||||
1. Discovering new project patterns
|
||||
2. After implementing significant changes
|
||||
3. When user requests with **update memory bank** (MUST review ALL files)
|
||||
4. When context needs clarification
|
||||
|
||||
flowchart TD
|
||||
Start[Update Process]
|
||||
|
||||
subgraph Process
|
||||
P1[Review ALL Files]
|
||||
P2[Document Current State]
|
||||
P3[Clarify Next Steps]
|
||||
P4[Document Insights & Patterns]
|
||||
|
||||
P1 --> P2 --> P3 --> P4
|
||||
end
|
||||
|
||||
Start --> Process
|
||||
|
||||
Note: When triggered by **update memory bank**, I MUST review every memory bank file, even if some don't require updates. Focus particularly on activeContext.md and progress.md as they track current state.
|
||||
|
||||
REMEMBER: After every memory reset, I begin completely fresh. The Memory Bank is my only link to previous work. It must be maintained with precision and clarity, as my effectiveness depends entirely on its accuracy.
|
||||
201
.cursorrules
201
.cursorrules
@@ -1,201 +0,0 @@
|
||||
Hermes-Agent is an agent harness for LLMs with an interactive CLI.
|
||||
|
||||
## Development Environment
|
||||
|
||||
**IMPORTANT**: Always use the virtual environment if it exists:
|
||||
```bash
|
||||
source venv/bin/activate # Before running any Python commands
|
||||
```
|
||||
|
||||
## Project Structure
|
||||
|
||||
- `hermes` - CLI launcher script (run with `./hermes`)
|
||||
- `cli.py` - Interactive CLI with Rich UI, prompt_toolkit, animated spinners
|
||||
- `cli-config.yaml` - CLI configuration (model, terminal, toolsets, personalities)
|
||||
- `tools/` - Individual tool implementations (web, terminal, browser, vision, etc.)
|
||||
- `tools/__init__.py` - Exports all tools for importing
|
||||
- `model_tools.py` - Consolidates tool schemas and handlers for the agent
|
||||
- `toolsets.py` - Groups tools into logical toolsets (web, terminal, browser, etc.)
|
||||
- `toolset_distributions.py` - Probability-based tool selection for data generation
|
||||
- `run_agent.py` - Primary agent runner with AIAgent class and KawaiiSpinner
|
||||
- `batch_runner.py` - Parallel batch processing with checkpointing
|
||||
- `tests/` - Test scripts
|
||||
|
||||
## File Dependency Chain
|
||||
|
||||
```
|
||||
tools/*.py → tools/__init__.py → model_tools.py → toolsets.py → toolset_distributions.py
|
||||
↑
|
||||
run_agent.py ──────────────────────────┘
|
||||
cli.py → run_agent.py (uses AIAgent with quiet_mode=True)
|
||||
batch_runner.py → run_agent.py + toolset_distributions.py
|
||||
```
|
||||
|
||||
Always ensure consistency between tools, model_tools.py, and toolsets.py when changing any of them.
|
||||
|
||||
## CLI Architecture (cli.py)
|
||||
|
||||
The interactive CLI uses:
|
||||
- **Rich** - For the welcome banner and styled panels
|
||||
- **prompt_toolkit** - For fixed input area with history and `patch_stdout`
|
||||
- **KawaiiSpinner** (in run_agent.py) - Animated feedback during API calls and tool execution
|
||||
|
||||
Key components:
|
||||
- `HermesCLI` class - Main CLI controller with commands and conversation loop
|
||||
- `load_cli_config()` - Loads `cli-config.yaml`, sets environment variables for terminal
|
||||
- `build_welcome_banner()` - Displays ASCII art logo, tools, and skills summary
|
||||
- `/commands` - Process user commands like `/help`, `/clear`, `/personality`, etc.
|
||||
|
||||
CLI uses `quiet_mode=True` when creating AIAgent to suppress verbose logging and enable kawaii-style feedback instead.
|
||||
|
||||
### Adding CLI Commands
|
||||
|
||||
1. Add to `COMMANDS` dict with description
|
||||
2. Add handler in `process_command()` method
|
||||
3. For persistent settings, use `save_config_value()` to update `cli-config.yaml`
|
||||
|
||||
## Adding a New Tool
|
||||
|
||||
Follow this strict order to maintain consistency:
|
||||
|
||||
1. Create `tools/your_tool.py` with:
|
||||
- Handler function (sync or async) returning a JSON string via `json.dumps()`
|
||||
- `check_*_requirements()` function to verify dependencies (e.g., API keys)
|
||||
- Schema definition following OpenAI function-calling format
|
||||
|
||||
2. Export in `tools/__init__.py`:
|
||||
- Import the handler and check function
|
||||
- Add to `__all__` list
|
||||
|
||||
3. Register in `model_tools.py`:
|
||||
- Create `get_*_tool_definitions()` function or add to existing
|
||||
- Add routing in `handle_function_call()` dispatcher
|
||||
- Update `get_all_tool_names()` with the tool name
|
||||
- Update `get_toolset_for_tool()` mapping
|
||||
- Update `get_available_toolsets()` and `check_toolset_requirements()`
|
||||
|
||||
4. Add to toolset in `toolsets.py`:
|
||||
- Add to existing toolset or create new one in TOOLSETS dict
|
||||
|
||||
5. Optionally add to `toolset_distributions.py` for batch processing
|
||||
|
||||
## Tool Implementation Pattern
|
||||
|
||||
```python
|
||||
# tools/example_tool.py
|
||||
import json
|
||||
import os
|
||||
|
||||
def check_example_requirements() -> bool:
|
||||
"""Check if required API keys/dependencies are available."""
|
||||
return bool(os.getenv("EXAMPLE_API_KEY"))
|
||||
|
||||
def example_tool(param: str, task_id: str = None) -> str:
|
||||
"""Execute the tool and return JSON string result."""
|
||||
try:
|
||||
result = {"success": True, "data": "..."}
|
||||
return json.dumps(result, ensure_ascii=False)
|
||||
except Exception as e:
|
||||
return json.dumps({"error": str(e)}, ensure_ascii=False)
|
||||
```
|
||||
|
||||
All tool handlers MUST return a JSON string. Never return raw dicts.
|
||||
|
||||
## Stateful Tools
|
||||
|
||||
Tools that maintain state (terminal, browser) require:
|
||||
- `task_id` parameter for session isolation between concurrent tasks
|
||||
- `cleanup_*()` function to release resources
|
||||
- Cleanup is called automatically in run_agent.py after conversation completes
|
||||
|
||||
## Environment Variables
|
||||
|
||||
API keys are loaded from `.env` file in repo root:
|
||||
- `OPENROUTER_API_KEY` - Main LLM API access (primary provider)
|
||||
- `FIRECRAWL_API_KEY` - Web search/extract tools
|
||||
- `BROWSERBASE_API_KEY` / `BROWSERBASE_PROJECT_ID` - Browser automation
|
||||
- `FAL_KEY` - Image generation (FLUX model)
|
||||
- `NOUS_API_KEY` - Vision and Mixture-of-Agents tools
|
||||
|
||||
Terminal tool configuration (can also be set in `cli-config.yaml`):
|
||||
- `TERMINAL_ENV` - Backend: local, docker, singularity, modal, or ssh
|
||||
- `TERMINAL_CWD` - Working directory
|
||||
- `TERMINAL_SSH_HOST`, `TERMINAL_SSH_USER`, `TERMINAL_SSH_KEY` - For SSH backend
|
||||
|
||||
## Agent Loop (run_agent.py)
|
||||
|
||||
The AIAgent class handles:
|
||||
- Processing enabled toolsets to provide to the model
|
||||
- Piping prompts to the agent
|
||||
- Looping LLM calls when tools are invoked, until natural language response
|
||||
- Returning the final response
|
||||
|
||||
Uses OpenAI-compatible API (primarily OpenRouter) with the OpenAI Python SDK.
|
||||
|
||||
## Reasoning Model Support
|
||||
|
||||
For models that support chain-of-thought reasoning:
|
||||
- Extract `reasoning_content` from API responses
|
||||
- Store in `assistant_msg["reasoning"]` for trajectory export
|
||||
- Pass back via `reasoning_content` field on subsequent turns
|
||||
|
||||
## Trajectory Format
|
||||
|
||||
Conversations are saved in ShareGPT format for training:
|
||||
```json
|
||||
{"from": "system", "value": "System prompt with <tools>...</tools>"}
|
||||
{"from": "human", "value": "User message"}
|
||||
{"from": "gpt", "value": "<think>reasoning</think>\n<tool_call>{...}</tool_call>"}
|
||||
{"from": "tool", "value": "<tool_response>{...}</tool_response>"}
|
||||
{"from": "gpt", "value": "Final response"}
|
||||
```
|
||||
|
||||
Tool calls use `<tool_call>` XML tags, responses use `<tool_response>` tags, reasoning uses `<think>` tags.
|
||||
|
||||
## Batch Processing (batch_runner.py)
|
||||
|
||||
For processing multiple prompts:
|
||||
- Parallel execution with multiprocessing
|
||||
- Content-based resume for fault tolerance (matches on prompt text, not indices)
|
||||
- Toolset distributions control probabilistic tool availability per prompt
|
||||
- Output: `data/<run_name>/trajectories.jsonl` (combined) + individual batch files
|
||||
|
||||
## Logging
|
||||
|
||||
Trajectories restructure tools as a system prompt for storage in a format suitable for later training use.
|
||||
|
||||
## Skills System
|
||||
|
||||
Skills are on-demand knowledge documents the agent can load. Located in `skills/` directory:
|
||||
|
||||
```
|
||||
skills/
|
||||
├── mlops/ # Category folder
|
||||
│ ├── axolotl/ # Skill folder
|
||||
│ │ ├── SKILL.md # Main instructions (required)
|
||||
│ │ ├── references/ # Additional docs, API specs
|
||||
│ │ └── templates/ # Output formats, configs
|
||||
│ └── vllm/
|
||||
│ └── SKILL.md
|
||||
└── example-skill/
|
||||
└── SKILL.md
|
||||
```
|
||||
|
||||
**Progressive disclosure** (token-efficient):
|
||||
1. `skills_categories()` - List category names (~50 tokens)
|
||||
2. `skills_list(category)` - Name + description per skill (~3k tokens)
|
||||
3. `skill_view(name)` - Full content + tags + linked files
|
||||
|
||||
SKILL.md files use YAML frontmatter:
|
||||
```yaml
|
||||
---
|
||||
name: skill-name
|
||||
description: Brief description for listing
|
||||
tags: [tag1, tag2]
|
||||
related_skills: [other-skill]
|
||||
version: 1.0.0
|
||||
---
|
||||
# Skill Content...
|
||||
```
|
||||
|
||||
Tool files: `tools/skills_tool.py` → `model_tools.py` → `toolsets.py`
|
||||
170
.env.example
170
.env.example
@@ -1,68 +1,12 @@
|
||||
# Hermes Agent Environment Configuration
|
||||
# Copy this file to .env and fill in your API keys
|
||||
|
||||
# =============================================================================
|
||||
# CORE SETTINGS
|
||||
# =============================================================================
|
||||
# Agent backend:
|
||||
# - openai : default Hermes-Agent loop (OpenAI function-calling via OpenAI SDK)
|
||||
# - atropos : Atroposlib ServerManager/ManagedServer-backed loop (training/env integration)
|
||||
HERMES_BACKEND=openai
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# LOCAL / SELF-HOSTED OPENAI-COMPATIBLE ENDPOINTS (vLLM, SGLang, llama.cpp, etc.)
|
||||
# =============================================================================
|
||||
# For local development (matches the Atropos test env defaults):
|
||||
# ATROPOS_SERVER_BASE_URL=http://127.0.0.1:8080
|
||||
# ATROPOS_SERVER_MODEL=hermes-4-36b
|
||||
# For hosted inference (Nous Research inference API):
|
||||
ATROPOS_SERVER_BASE_URL=
|
||||
ATROPOS_SERVER_MODEL=
|
||||
ATROPOS_TOKENIZER_NAME=
|
||||
# Set this to your Nous API key (Bearer token).
|
||||
ATROPOS_SERVER_API_KEY=
|
||||
|
||||
# Debugging (prints to stdout; use with care)
|
||||
# HERMES_DEBUG_ATROPOS_REQUEST=1
|
||||
# HERMES_DEBUG_ATROPOS_RESPONSE=1
|
||||
# HERMES_DEBUG_OPENAI_REQUEST=1
|
||||
# HERMES_DEBUG_OPENAI_RESPONSE=1
|
||||
|
||||
# =============================================================================
|
||||
# LOCAL / SELF-HOSTED OPENAI-COMPATIBLE ENDPOINTS (vLLM, SGLang, llama.cpp, etc.)
|
||||
# =============================================================================
|
||||
# If you set ATROPOS_SERVER_BASE_URL or OPENAI_BASE_URL, Hermes will use it instead
|
||||
# of OpenRouter.
|
||||
#
|
||||
# Local server convenience (base URL without /v1):
|
||||
# llama.cpp example (see `Hermes-Agent/scripts/launch_llama_cpp_hermes_4_36b.sh`):
|
||||
# ATROPOS_SERVER_BASE_URL=http://127.0.0.1:8080
|
||||
# ATROPOS_SERVER_MODEL=hermes-4-36b
|
||||
# ATROPOS_TOKENIZER_NAME=NousResearch/Hermes-4.3-36B
|
||||
# ATROPOS_SERVER_API_KEY=local
|
||||
#
|
||||
# Hosted Nous inference API:
|
||||
# ATROPOS_SERVER_BASE_URL=https://inference-api.nousresearch.com
|
||||
# ATROPOS_SERVER_MODEL=Hermes-4.3-36B
|
||||
# ATROPOS_TOKENIZER_NAME=NousResearch/Hermes-4.3-36B
|
||||
# ATROPOS_SERVER_API_KEY=sk-... (Bearer token)
|
||||
#
|
||||
# If you plan to run GRPO-style group sampling (e.g. `--env.group_size 4`) against
|
||||
# llama.cpp, start the server with at least that many slots, e.g.:
|
||||
# LLAMA_CPP_PARALLEL=4 Hermes-Agent/scripts/launch_llama_cpp_hermes_4_36b.sh
|
||||
#
|
||||
# Generic OpenAI-compatible (base URL should include /v1):
|
||||
# OPENAI_BASE_URL=http://127.0.0.1:8080/v1
|
||||
# OPENAI_API_KEY=local
|
||||
|
||||
# =============================================================================
|
||||
# LLM PROVIDER (OpenRouter)
|
||||
# =============================================================================
|
||||
# OpenRouter provides access to many models through one API
|
||||
# All LLM calls go through OpenRouter - no direct provider keys needed
|
||||
# Get your key at: https://openrouter.ai/keys
|
||||
OPENROUTER_BASE_URL=https://openrouter.ai/api/v1
|
||||
OPENROUTER_API_KEY=
|
||||
|
||||
# Default model to use (OpenRouter format: provider/model)
|
||||
@@ -96,6 +40,7 @@ FAL_KEY=
|
||||
# - modal: Runs in Modal cloud sandboxes (scalable, requires Modal account)
|
||||
TERMINAL_ENV=local
|
||||
|
||||
|
||||
# Container images (for singularity/docker/modal backends)
|
||||
TERMINAL_DOCKER_IMAGE=python:3.11
|
||||
TERMINAL_SINGULARITY_IMAGE=docker://python:3.11
|
||||
@@ -144,87 +89,12 @@ TERMINAL_LIFETIME_SECONDS=300
|
||||
# SUDO_PASSWORD=your_password_here
|
||||
|
||||
# =============================================================================
|
||||
# MODAL CLOUD BACKEND (for TERMINAL_ENV=modal)
|
||||
# MODAL CLOUD BACKEND (Optional - for TERMINAL_ENV=modal)
|
||||
# =============================================================================
|
||||
# Modal provides cloud sandboxes with per-second billing and auto-scaling.
|
||||
# This implementation uses a warm pool of sandboxes for cost efficiency.
|
||||
#
|
||||
# SETUP:
|
||||
# pip install modal && modal setup
|
||||
# (Authenticates via browser, stores credentials locally)
|
||||
#
|
||||
# FEATURES:
|
||||
# - Auto-scaling warm sandbox pool (no cold start after first use)
|
||||
# - Named sandbox recovery (reconnects after restart)
|
||||
# - Profile-based heterogeneous environments (CPU, GPU, different images)
|
||||
# - Server-side idle_timeout protection against orphaned sandboxes
|
||||
|
||||
# Modal app name (groups all sandboxes, used for recovery)
|
||||
TERMINAL_MODAL_APP_NAME=hermes-sandbox
|
||||
|
||||
# Default profile when none specified
|
||||
TERMINAL_MODAL_DEFAULT_PROFILE=default
|
||||
|
||||
# Profile config file (optional - YAML format, see modal_profiles.yaml)
|
||||
# TERMINAL_MODAL_PROFILES_FILE=modal_profiles.yaml
|
||||
|
||||
# --- Default Profile Settings (used if no YAML file) ---
|
||||
# These apply when no profile is specified or for the "default" profile
|
||||
TERMINAL_MODAL_IMAGE=python:3.11
|
||||
TERMINAL_MODAL_MIN_POOL=1
|
||||
TERMINAL_MODAL_MAX_POOL=5
|
||||
TERMINAL_MODAL_IDLE_TIMEOUT=120
|
||||
TERMINAL_MODAL_MAX_LIFETIME=3600
|
||||
TERMINAL_MODAL_SCALE_DOWN_IDLE=180
|
||||
|
||||
# --- Custom Profile Example: pytorch-gpu ---
|
||||
# Uncomment to enable a GPU profile for ML tasks
|
||||
# Usage: terminal_tool("python train.py", profile="pytorch-gpu")
|
||||
#
|
||||
# TERMINAL_MODAL_PROFILE_pytorch_gpu_IMAGE=pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime
|
||||
# TERMINAL_MODAL_PROFILE_pytorch_gpu_GPU=T4
|
||||
# TERMINAL_MODAL_PROFILE_pytorch_gpu_MEMORY=16384
|
||||
# TERMINAL_MODAL_PROFILE_pytorch_gpu_MIN_POOL=0
|
||||
# TERMINAL_MODAL_PROFILE_pytorch_gpu_MAX_POOL=2
|
||||
# TERMINAL_MODAL_PROFILE_pytorch_gpu_IDLE_TIMEOUT=60
|
||||
|
||||
# --- Custom Profile Example: node ---
|
||||
# Uncomment to enable a Node.js profile
|
||||
# Usage: terminal_tool("npm test", profile="node")
|
||||
#
|
||||
# TERMINAL_MODAL_PROFILE_node_IMAGE=node:18
|
||||
# TERMINAL_MODAL_PROFILE_node_MIN_POOL=0
|
||||
# TERMINAL_MODAL_PROFILE_node_MAX_POOL=3
|
||||
|
||||
# =============================================================================
|
||||
# MODAL SECRETS (Secure credential injection)
|
||||
# =============================================================================
|
||||
# Modal Secrets allow you to securely pass API keys, passwords, and other
|
||||
# sensitive data to your sandboxes without exposing them in code or logs.
|
||||
#
|
||||
# SETUP SECRETS:
|
||||
# 1. Via Dashboard: https://modal.com/secrets
|
||||
# 2. Via CLI: modal secret create my-secret KEY1=value1 KEY2=value2
|
||||
# 3. Via CLI with env: modal secret create my-secret API_KEY="$API_KEY"
|
||||
#
|
||||
# LIST SECRETS:
|
||||
# modal secret list
|
||||
#
|
||||
# DELETE SECRETS:
|
||||
# modal secret delete my-secret
|
||||
|
||||
# Global secrets applied to ALL profiles (comma-separated secret names)
|
||||
# These secrets must be created on Modal dashboard or via CLI first
|
||||
# TERMINAL_MODAL_SECRETS=my-api-keys,database-creds
|
||||
|
||||
# Per-profile secrets (comma-separated secret names)
|
||||
# TERMINAL_MODAL_PROFILE_pytorch_gpu_SECRETS=huggingface-token,wandb-key
|
||||
|
||||
# Per-profile environment variables (semicolon-separated KEY=VALUE pairs)
|
||||
# TERMINAL_MODAL_PROFILE_default_ENV_VARS=DEBUG=1;LOG_LEVEL=info
|
||||
|
||||
# Load local .env file into sandbox (useful for development)
|
||||
# TERMINAL_MODAL_PROFILE_default_USE_DOTENV=true
|
||||
# Modal uses CLI authentication, not environment variables.
|
||||
# Run: pip install modal && modal setup
|
||||
# This will authenticate via browser and store credentials locally.
|
||||
# No API key needed in .env - Modal handles auth automatically.
|
||||
|
||||
# =============================================================================
|
||||
# BROWSER TOOL CONFIGURATION (agent-browser + Browserbase)
|
||||
@@ -285,3 +155,31 @@ WEB_TOOLS_DEBUG=false
|
||||
VISION_TOOLS_DEBUG=false
|
||||
MOA_TOOLS_DEBUG=false
|
||||
IMAGE_TOOLS_DEBUG=false
|
||||
|
||||
# =============================================================================
|
||||
# CONTEXT COMPRESSION (Auto-shrinks long conversations)
|
||||
# =============================================================================
|
||||
# When conversation approaches model's context limit, middle turns are
|
||||
# automatically summarized to free up space.
|
||||
#
|
||||
# CONTEXT_COMPRESSION_ENABLED=true # Enable auto-compression (default: true)
|
||||
# CONTEXT_COMPRESSION_THRESHOLD=0.85 # Compress at 85% of context limit
|
||||
# CONTEXT_COMPRESSION_MODEL=google/gemini-2.0-flash-001 # Fast model for summaries
|
||||
|
||||
# =============================================================================
|
||||
# RL TRAINING (Tinker + Atropos)
|
||||
# =============================================================================
|
||||
# Run reinforcement learning training on language models using the Tinker API.
|
||||
# Requires the rl-server to be running (from tinker-atropos package).
|
||||
|
||||
# Tinker API Key - RL training service
|
||||
# Get at: https://tinker-console.thinkingmachines.ai/keys
|
||||
TINKER_API_KEY=
|
||||
|
||||
# Weights & Biases API Key - Experiment tracking and metrics
|
||||
# Get at: https://wandb.ai/authorize
|
||||
WANDB_API_KEY=
|
||||
|
||||
# RL API Server URL (default: http://localhost:8080)
|
||||
# Change if running the rl-server on a different host/port
|
||||
# RL_API_URL=http://localhost:8080
|
||||
|
||||
20
.gitignore
vendored
20
.gitignore
vendored
@@ -42,23 +42,3 @@ images/
|
||||
|
||||
# CLI config (may contain sensitive SSH paths)
|
||||
cli-config.yaml
|
||||
|
||||
.DS_Store
|
||||
|
||||
# artifacts
|
||||
*.jsonl
|
||||
*.html
|
||||
*.json
|
||||
*.log
|
||||
*.csv
|
||||
|
||||
# Singularity/Apptainer images (large binary files)
|
||||
*.sif
|
||||
|
||||
# Test files
|
||||
test_singularity_*.py
|
||||
test_*.py
|
||||
!tests/test_*.py
|
||||
|
||||
# Nomad data
|
||||
/tmp/NomadClient*/
|
||||
|
||||
3
.gitmodules
vendored
3
.gitmodules
vendored
@@ -1,3 +1,6 @@
|
||||
[submodule "mini-swe-agent"]
|
||||
path = mini-swe-agent
|
||||
url = https://github.com/SWE-agent/mini-swe-agent
|
||||
[submodule "tinker-atropos"]
|
||||
path = tinker-atropos
|
||||
url = https://github.com/nousresearch/tinker-atropos
|
||||
|
||||
533
AGENTS.md
Normal file
533
AGENTS.md
Normal file
@@ -0,0 +1,533 @@
|
||||
# Hermes Agent - Development Guide
|
||||
|
||||
Instructions for AI coding assistants (GitHub Copilot, Cursor, etc.) and human developers.
|
||||
|
||||
Hermes-Agent is an AI agent harness with tool-calling capabilities, interactive CLI, messaging integrations, and scheduled tasks.
|
||||
|
||||
## Development Environment
|
||||
|
||||
**IMPORTANT**: Always use the virtual environment if it exists:
|
||||
```bash
|
||||
source venv/bin/activate # Before running any Python commands
|
||||
```
|
||||
|
||||
## Project Structure
|
||||
|
||||
```
|
||||
hermes-agent/
|
||||
├── hermes_cli/ # Unified CLI commands
|
||||
│ ├── main.py # Entry point, command dispatcher
|
||||
│ ├── setup.py # Interactive setup wizard
|
||||
│ ├── config.py # Config management & migration
|
||||
│ ├── status.py # Status display
|
||||
│ ├── doctor.py # Diagnostics
|
||||
│ ├── gateway.py # Gateway management
|
||||
│ ├── uninstall.py # Uninstaller
|
||||
│ └── cron.py # Cron job management
|
||||
├── tools/ # Tool implementations
|
||||
├── gateway/ # Messaging platform adapters
|
||||
├── cron/ # Scheduler implementation
|
||||
├── skills/ # Knowledge documents
|
||||
├── cli.py # Interactive CLI (Rich UI)
|
||||
├── run_agent.py # Agent runner with AIAgent class
|
||||
├── model_tools.py # Tool schemas and handlers
|
||||
├── toolsets.py # Tool groupings
|
||||
├── toolset_distributions.py # Probability-based tool selection
|
||||
└── batch_runner.py # Parallel batch processing
|
||||
```
|
||||
|
||||
**User Configuration** (stored in `~/.hermes/`):
|
||||
- `~/.hermes/config.yaml` - Settings (model, terminal, toolsets, etc.)
|
||||
- `~/.hermes/.env` - API keys and secrets
|
||||
|
||||
## File Dependency Chain
|
||||
|
||||
```
|
||||
tools/*.py → tools/__init__.py → model_tools.py → toolsets.py → toolset_distributions.py
|
||||
↑
|
||||
run_agent.py ──────────────────────────┘
|
||||
cli.py → run_agent.py (uses AIAgent with quiet_mode=True)
|
||||
batch_runner.py → run_agent.py + toolset_distributions.py
|
||||
```
|
||||
|
||||
Always ensure consistency between tools, model_tools.py, and toolsets.py when changing any of them.
|
||||
|
||||
---
|
||||
|
||||
## AIAgent Class
|
||||
|
||||
The main agent is implemented in `run_agent.py`:
|
||||
|
||||
```python
|
||||
class AIAgent:
|
||||
def __init__(
|
||||
self,
|
||||
model: str = "anthropic/claude-sonnet-4",
|
||||
api_key: str = None,
|
||||
base_url: str = "https://openrouter.ai/api/v1",
|
||||
max_iterations: int = 60, # Max tool-calling loops
|
||||
enabled_toolsets: list = None,
|
||||
disabled_toolsets: list = None,
|
||||
verbose_logging: bool = False,
|
||||
quiet_mode: bool = False, # Suppress progress output
|
||||
tool_progress_callback: callable = None, # Called on each tool use
|
||||
):
|
||||
# Initialize OpenAI client, load tools based on toolsets
|
||||
...
|
||||
|
||||
def chat(self, user_message: str, task_id: str = None) -> str:
|
||||
# Main entry point - runs the agent loop
|
||||
...
|
||||
```
|
||||
|
||||
### Agent Loop
|
||||
|
||||
The core loop in `_run_agent_loop()`:
|
||||
|
||||
```
|
||||
1. Add user message to conversation
|
||||
2. Call LLM with tools
|
||||
3. If LLM returns tool calls:
|
||||
- Execute each tool
|
||||
- Add tool results to conversation
|
||||
- Go to step 2
|
||||
4. If LLM returns text response:
|
||||
- Return response to user
|
||||
```
|
||||
|
||||
```python
|
||||
while turns < max_turns:
|
||||
response = client.chat.completions.create(
|
||||
model=model,
|
||||
messages=messages,
|
||||
tools=tool_schemas,
|
||||
)
|
||||
|
||||
if response.tool_calls:
|
||||
for tool_call in response.tool_calls:
|
||||
result = await execute_tool(tool_call)
|
||||
messages.append(tool_result_message(result))
|
||||
turns += 1
|
||||
else:
|
||||
return response.content
|
||||
```
|
||||
|
||||
### Conversation Management
|
||||
|
||||
Messages are stored as a list of dicts following OpenAI format:
|
||||
|
||||
```python
|
||||
messages = [
|
||||
{"role": "system", "content": "You are a helpful assistant..."},
|
||||
{"role": "user", "content": "Search for Python tutorials"},
|
||||
{"role": "assistant", "content": None, "tool_calls": [...]},
|
||||
{"role": "tool", "tool_call_id": "...", "content": "..."},
|
||||
{"role": "assistant", "content": "Here's what I found..."},
|
||||
]
|
||||
```
|
||||
|
||||
### Reasoning Model Support
|
||||
|
||||
For models that support chain-of-thought reasoning:
|
||||
- Extract `reasoning_content` from API responses
|
||||
- Store in `assistant_msg["reasoning"]` for trajectory export
|
||||
- Pass back via `reasoning_content` field on subsequent turns
|
||||
|
||||
---
|
||||
|
||||
## CLI Architecture (cli.py)
|
||||
|
||||
The interactive CLI uses:
|
||||
- **Rich** - For the welcome banner and styled panels
|
||||
- **prompt_toolkit** - For fixed input area with history and `patch_stdout`
|
||||
- **KawaiiSpinner** (in run_agent.py) - Animated feedback during API calls and tool execution
|
||||
|
||||
Key components:
|
||||
- `HermesCLI` class - Main CLI controller with commands and conversation loop
|
||||
- `load_cli_config()` - Loads config, sets environment variables for terminal
|
||||
- `build_welcome_banner()` - Displays ASCII art logo, tools, and skills summary
|
||||
- `/commands` - Process user commands like `/help`, `/clear`, `/personality`, etc.
|
||||
|
||||
CLI uses `quiet_mode=True` when creating AIAgent to suppress verbose logging.
|
||||
|
||||
### Adding CLI Commands
|
||||
|
||||
1. Add to `COMMANDS` dict with description
|
||||
2. Add handler in `process_command()` method
|
||||
3. For persistent settings, use `save_config_value()` to update config
|
||||
|
||||
---
|
||||
|
||||
## Hermes CLI Commands
|
||||
|
||||
The unified `hermes` command provides all functionality:
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `hermes` | Interactive chat (default) |
|
||||
| `hermes chat -q "..."` | Single query mode |
|
||||
| `hermes setup` | Configure API keys and settings |
|
||||
| `hermes config` | View current configuration |
|
||||
| `hermes config edit` | Open config in editor |
|
||||
| `hermes config set KEY VAL` | Set a specific value |
|
||||
| `hermes config check` | Check for missing config |
|
||||
| `hermes config migrate` | Prompt for missing config interactively |
|
||||
| `hermes status` | Show configuration status |
|
||||
| `hermes doctor` | Diagnose issues |
|
||||
| `hermes update` | Update to latest (checks for new config) |
|
||||
| `hermes uninstall` | Uninstall (can keep configs for reinstall) |
|
||||
| `hermes gateway` | Start messaging gateway |
|
||||
| `hermes cron list` | View scheduled jobs |
|
||||
| `hermes version` | Show version info |
|
||||
|
||||
---
|
||||
|
||||
## Messaging Gateway
|
||||
|
||||
The gateway connects Hermes to Telegram, Discord, and WhatsApp.
|
||||
|
||||
### Configuration (in `~/.hermes/.env`):
|
||||
|
||||
```bash
|
||||
# Telegram
|
||||
TELEGRAM_BOT_TOKEN=123456:ABC-DEF... # From @BotFather
|
||||
TELEGRAM_ALLOWED_USERS=123456789,987654 # Comma-separated user IDs (from @userinfobot)
|
||||
|
||||
# Discord
|
||||
DISCORD_BOT_TOKEN=MTIz... # From Developer Portal
|
||||
DISCORD_ALLOWED_USERS=123456789012345678 # Comma-separated user IDs
|
||||
|
||||
# Agent Behavior
|
||||
HERMES_MAX_ITERATIONS=60 # Max tool-calling iterations
|
||||
MESSAGING_CWD=/home/myuser # Terminal working directory for messaging
|
||||
|
||||
# Tool Progress (optional)
|
||||
HERMES_TOOL_PROGRESS=true # Send progress messages
|
||||
HERMES_TOOL_PROGRESS_MODE=new # "new" or "all"
|
||||
```
|
||||
|
||||
### Working Directory Behavior
|
||||
|
||||
- **CLI (`hermes` command)**: Uses current directory (`.` → `os.getcwd()`)
|
||||
- **Messaging (Telegram/Discord)**: Uses `MESSAGING_CWD` (default: home directory)
|
||||
|
||||
This is intentional: CLI users are in a terminal and expect the agent to work in their current directory, while messaging users need a consistent starting location.
|
||||
|
||||
### Security (User Allowlists):
|
||||
|
||||
**IMPORTANT**: Without an allowlist, anyone who finds your bot can use it!
|
||||
|
||||
The gateway checks `{PLATFORM}_ALLOWED_USERS` environment variables:
|
||||
- If set: Only listed user IDs can interact with the bot
|
||||
- If unset: All users are allowed (dangerous with terminal access!)
|
||||
|
||||
Users can find their IDs:
|
||||
- **Telegram**: Message [@userinfobot](https://t.me/userinfobot)
|
||||
- **Discord**: Enable Developer Mode, right-click name → Copy ID
|
||||
|
||||
### Tool Progress Notifications
|
||||
|
||||
When `HERMES_TOOL_PROGRESS=true`, the bot sends status messages as it works:
|
||||
- `💻 \`ls -la\`...` (terminal commands show the actual command)
|
||||
- `🔍 web_search...`
|
||||
- `📄 web_extract...`
|
||||
|
||||
Modes:
|
||||
- `new`: Only when switching to a different tool (less spam)
|
||||
- `all`: Every single tool call
|
||||
|
||||
### Typing Indicator
|
||||
|
||||
The gateway keeps the "typing..." indicator active throughout processing, refreshing every 4 seconds. This lets users know the bot is working even during long tool-calling sequences.
|
||||
|
||||
### Platform Toolsets:
|
||||
|
||||
Each platform has a dedicated toolset in `toolsets.py`:
|
||||
- `hermes-telegram`: Full tools including terminal (with safety checks)
|
||||
- `hermes-discord`: Full tools including terminal
|
||||
- `hermes-whatsapp`: Full tools including terminal
|
||||
|
||||
---
|
||||
|
||||
## Configuration System
|
||||
|
||||
Configuration files are stored in `~/.hermes/` for easy user access:
|
||||
- `~/.hermes/config.yaml` - All settings (model, terminal, compression, etc.)
|
||||
- `~/.hermes/.env` - API keys and secrets
|
||||
|
||||
### Adding New Configuration Options
|
||||
|
||||
When adding new configuration variables, you MUST follow this process:
|
||||
|
||||
#### For config.yaml options:
|
||||
|
||||
1. Add to `DEFAULT_CONFIG` in `hermes_cli/config.py`
|
||||
2. **CRITICAL**: Bump `_config_version` in `DEFAULT_CONFIG` when adding required fields
|
||||
3. This triggers migration prompts for existing users on next `hermes update` or `hermes setup`
|
||||
|
||||
Example:
|
||||
```python
|
||||
DEFAULT_CONFIG = {
|
||||
# ... existing config ...
|
||||
|
||||
"new_feature": {
|
||||
"enabled": True,
|
||||
"option": "default_value",
|
||||
},
|
||||
|
||||
# BUMP THIS when adding required fields
|
||||
"_config_version": 2, # Was 1, now 2
|
||||
}
|
||||
```
|
||||
|
||||
#### For .env variables (API keys/secrets):
|
||||
|
||||
1. Add to `REQUIRED_ENV_VARS` or `OPTIONAL_ENV_VARS` in `hermes_cli/config.py`
|
||||
2. Include metadata for the migration system:
|
||||
|
||||
```python
|
||||
OPTIONAL_ENV_VARS = {
|
||||
# ... existing vars ...
|
||||
"NEW_API_KEY": {
|
||||
"description": "What this key is for",
|
||||
"prompt": "Display name in prompts",
|
||||
"url": "https://where-to-get-it.com/",
|
||||
"tools": ["tools_it_enables"], # What tools need this
|
||||
"password": True, # Mask input
|
||||
},
|
||||
}
|
||||
```
|
||||
|
||||
#### Update related files:
|
||||
|
||||
- `hermes_cli/setup.py` - Add prompts in the setup wizard
|
||||
- `cli-config.yaml.example` - Add example with comments
|
||||
- Update README.md if user-facing
|
||||
|
||||
### Config Version Migration
|
||||
|
||||
The system uses `_config_version` to detect outdated configs:
|
||||
|
||||
1. `check_for_missing_config()` compares user config to `DEFAULT_CONFIG`
|
||||
2. `migrate_config()` interactively prompts for missing values
|
||||
3. Called automatically by `hermes update` and optionally by `hermes setup`
|
||||
|
||||
---
|
||||
|
||||
## Environment Variables
|
||||
|
||||
API keys are loaded from `~/.hermes/.env`:
|
||||
- `OPENROUTER_API_KEY` - Main LLM API access (primary provider)
|
||||
- `FIRECRAWL_API_KEY` - Web search/extract tools
|
||||
- `BROWSERBASE_API_KEY` / `BROWSERBASE_PROJECT_ID` - Browser automation
|
||||
- `FAL_KEY` - Image generation (FLUX model)
|
||||
- `NOUS_API_KEY` - Vision and Mixture-of-Agents tools
|
||||
|
||||
Terminal tool configuration (in `~/.hermes/config.yaml`):
|
||||
- `terminal.backend` - Backend: local, docker, singularity, modal, or ssh
|
||||
- `terminal.cwd` - Working directory for CLI ("." = current directory)
|
||||
- `terminal.docker_image` - Image for Docker backend
|
||||
- `terminal.singularity_image` - Image for Singularity backend
|
||||
- `terminal.modal_image` - Image for Modal backend
|
||||
- SSH: `TERMINAL_SSH_HOST`, `TERMINAL_SSH_USER`, `TERMINAL_SSH_KEY` in .env
|
||||
|
||||
Agent behavior (in `~/.hermes/.env`):
|
||||
- `HERMES_MAX_ITERATIONS` - Max tool-calling iterations (default: 60)
|
||||
- `MESSAGING_CWD` - Working directory for messaging platforms (default: ~)
|
||||
- `HERMES_TOOL_PROGRESS` - Enable tool progress messages (`true`/`false`)
|
||||
- `HERMES_TOOL_PROGRESS_MODE` - Progress mode: `new` (tool changes) or `all`
|
||||
|
||||
### Dangerous Command Approval
|
||||
|
||||
The terminal tool includes safety checks for potentially destructive commands (e.g., `rm -rf`, `DROP TABLE`, `chmod 777`, etc.):
|
||||
|
||||
**Behavior by Backend:**
|
||||
- **Docker/Singularity/Modal**: Commands run unrestricted (isolated containers)
|
||||
- **Local/SSH**: Dangerous commands trigger approval flow
|
||||
|
||||
**Approval Flow (CLI):**
|
||||
```
|
||||
⚠️ Potentially dangerous command detected: recursive delete
|
||||
rm -rf /tmp/test
|
||||
|
||||
[o]nce | [s]ession | [a]lways | [d]eny
|
||||
Choice [o/s/a/D]:
|
||||
```
|
||||
|
||||
**Approval Flow (Messaging):**
|
||||
- Command is blocked with explanation
|
||||
- Agent explains the command was blocked for safety
|
||||
- User must add the pattern to their allowlist via `hermes config edit` or run the command directly on their machine
|
||||
|
||||
**Configuration:**
|
||||
- `command_allowlist` in `~/.hermes/config.yaml` stores permanently allowed patterns
|
||||
- Add patterns via "always" approval or edit directly
|
||||
|
||||
**Sudo Handling (Messaging):**
|
||||
- If sudo fails over messaging, output includes tip to add `SUDO_PASSWORD` to `~/.hermes/.env`
|
||||
|
||||
---
|
||||
|
||||
## Adding New Tools
|
||||
|
||||
Follow this strict order to maintain consistency:
|
||||
|
||||
1. Create `tools/your_tool.py` with:
|
||||
- Handler function (sync or async) returning a JSON string via `json.dumps()`
|
||||
- `check_*_requirements()` function to verify dependencies (e.g., API keys)
|
||||
- Schema definition following OpenAI function-calling format
|
||||
|
||||
2. Export in `tools/__init__.py`:
|
||||
- Import the handler and check function
|
||||
- Add to `__all__` list
|
||||
|
||||
3. Register in `model_tools.py`:
|
||||
- Add to `TOOLSET_REQUIREMENTS` if it needs API keys
|
||||
- Create `get_*_tool_definitions()` function or add to existing
|
||||
- Add routing in `handle_function_call()` dispatcher
|
||||
- Update `get_all_tool_names()` with the tool name
|
||||
- Update `get_toolset_for_tool()` mapping
|
||||
- Update `get_available_toolsets()` and `check_toolset_requirements()`
|
||||
|
||||
4. Add to toolset in `toolsets.py`:
|
||||
- Add to existing toolset or create new one in TOOLSETS dict
|
||||
|
||||
5. If the tool requires an API key:
|
||||
- Add to `OPTIONAL_ENV_VARS` in `hermes_cli/config.py`
|
||||
- The tool will be auto-disabled if the key is missing
|
||||
|
||||
6. Optionally add to `toolset_distributions.py` for batch processing
|
||||
|
||||
### Tool Implementation Pattern
|
||||
|
||||
```python
|
||||
# tools/example_tool.py
|
||||
import json
|
||||
import os
|
||||
|
||||
def check_example_requirements() -> bool:
|
||||
"""Check if required API keys/dependencies are available."""
|
||||
return bool(os.getenv("EXAMPLE_API_KEY"))
|
||||
|
||||
def example_tool(param: str, task_id: str = None) -> str:
|
||||
"""Execute the tool and return JSON string result."""
|
||||
try:
|
||||
result = {"success": True, "data": "..."}
|
||||
return json.dumps(result, ensure_ascii=False)
|
||||
except Exception as e:
|
||||
return json.dumps({"error": str(e)}, ensure_ascii=False)
|
||||
```
|
||||
|
||||
All tool handlers MUST return a JSON string. Never return raw dicts.
|
||||
|
||||
### Dynamic Tool Availability
|
||||
|
||||
Tools are automatically disabled when their API keys are missing:
|
||||
|
||||
```python
|
||||
# In model_tools.py
|
||||
TOOLSET_REQUIREMENTS = {
|
||||
"web": {"env_vars": ["FIRECRAWL_API_KEY"]},
|
||||
"browser": {"env_vars": ["BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID"]},
|
||||
"creative": {"env_vars": ["FAL_KEY"]},
|
||||
}
|
||||
```
|
||||
|
||||
The `check_tool_availability()` function determines which tools to include.
|
||||
|
||||
### Stateful Tools
|
||||
|
||||
Tools that maintain state (terminal, browser) require:
|
||||
- `task_id` parameter for session isolation between concurrent tasks
|
||||
- `cleanup_*()` function to release resources
|
||||
- Cleanup is called automatically in run_agent.py after conversation completes
|
||||
|
||||
---
|
||||
|
||||
## Trajectory Format
|
||||
|
||||
Conversations are saved in ShareGPT format for training:
|
||||
```json
|
||||
{"from": "system", "value": "System prompt with <tools>...</tools>"}
|
||||
{"from": "human", "value": "User message"}
|
||||
{"from": "gpt", "value": "<think>reasoning</think>\n<tool_call>{...}</tool_call>"}
|
||||
{"from": "tool", "value": "<tool_response>{...}</tool_response>"}
|
||||
{"from": "gpt", "value": "Final response"}
|
||||
```
|
||||
|
||||
Tool calls use `<tool_call>` XML tags, responses use `<tool_response>` tags, reasoning uses `<think>` tags.
|
||||
|
||||
### Trajectory Export
|
||||
|
||||
```python
|
||||
agent = AIAgent(save_trajectories=True)
|
||||
agent.chat("Do something")
|
||||
# Saves to trajectories/*.jsonl in ShareGPT format
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Batch Processing (batch_runner.py)
|
||||
|
||||
For processing multiple prompts:
|
||||
- Parallel execution with multiprocessing
|
||||
- Content-based resume for fault tolerance (matches on prompt text, not indices)
|
||||
- Toolset distributions control probabilistic tool availability per prompt
|
||||
- Output: `data/<run_name>/trajectories.jsonl` (combined) + individual batch files
|
||||
|
||||
```bash
|
||||
python batch_runner.py \
|
||||
--dataset_file=prompts.jsonl \
|
||||
--batch_size=20 \
|
||||
--num_workers=4 \
|
||||
--run_name=my_run
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Skills System
|
||||
|
||||
Skills are on-demand knowledge documents the agent can load. Located in `skills/` directory:
|
||||
|
||||
```
|
||||
skills/
|
||||
├── mlops/ # Category folder
|
||||
│ ├── axolotl/ # Skill folder
|
||||
│ │ ├── SKILL.md # Main instructions (required)
|
||||
│ │ ├── references/ # Additional docs, API specs
|
||||
│ │ └── templates/ # Output formats, configs
|
||||
│ └── vllm/
|
||||
│ └── SKILL.md
|
||||
└── example-skill/
|
||||
└── SKILL.md
|
||||
```
|
||||
|
||||
**Progressive disclosure** (token-efficient):
|
||||
1. `skills_categories()` - List category names (~50 tokens)
|
||||
2. `skills_list(category)` - Name + description per skill (~3k tokens)
|
||||
3. `skill_view(name)` - Full content + tags + linked files
|
||||
|
||||
SKILL.md files use YAML frontmatter:
|
||||
```yaml
|
||||
---
|
||||
name: skill-name
|
||||
description: Brief description for listing
|
||||
tags: [tag1, tag2]
|
||||
related_skills: [other-skill]
|
||||
version: 1.0.0
|
||||
---
|
||||
# Skill Content...
|
||||
```
|
||||
|
||||
Tool files: `tools/skills_tool.py` → `model_tools.py` → `toolsets.py`
|
||||
|
||||
---
|
||||
|
||||
## Testing Changes
|
||||
|
||||
After making changes:
|
||||
|
||||
1. Run `hermes doctor` to check setup
|
||||
2. Run `hermes config check` to verify config
|
||||
3. Test with `hermes chat -q "test message"`
|
||||
4. For new config options, test fresh install: `rm -rf ~/.hermes && hermes setup`
|
||||
590
TODO.md
590
TODO.md
@@ -4,84 +4,6 @@
|
||||
|
||||
---
|
||||
|
||||
## 🚨 HIGH PRIORITY - Immediate Fixes
|
||||
|
||||
These items need to be addressed ASAP:
|
||||
|
||||
### 1. SUDO Breaking Terminal Tool 🔐 ✅ COMPLETE
|
||||
- [x] **Problem:** SUDO commands break the terminal tool execution (hangs indefinitely)
|
||||
- [x] **Fix:** Created custom environment wrappers in `tools/terminal_tool.py`
|
||||
- `stdin=subprocess.DEVNULL` prevents hanging on interactive prompts
|
||||
- Sudo fails gracefully with clear error if no password configured
|
||||
- Same UX as Claude Code - agent sees error, tells user to run it themselves
|
||||
- [x] **All 5 environments now have consistent behavior:**
|
||||
- `_LocalEnvironment` - local execution
|
||||
- `_DockerEnvironment` - Docker containers
|
||||
- `_SingularityEnvironment` - Singularity/Apptainer containers
|
||||
- `_ModalEnvironment` - Modal cloud sandboxes
|
||||
- `_SSHEnvironment` - remote SSH execution
|
||||
- [x] **Optional sudo support via `SUDO_PASSWORD` env var:**
|
||||
- Shared `_transform_sudo_command()` helper used by all environments
|
||||
- If set, auto-transforms `sudo cmd` → pipes password via `sudo -S`
|
||||
- Documented in `.env.example`, `cli-config.yaml`, and README
|
||||
- Works for chained commands: `cmd1 && sudo cmd2`
|
||||
- [x] **Interactive sudo prompt in CLI mode:**
|
||||
- When sudo detected and no password configured, prompts user
|
||||
- 45-second timeout (auto-skips if no input)
|
||||
- Hidden password input via `getpass` (password not visible)
|
||||
- Password cached for session (don't ask repeatedly)
|
||||
- Spinner pauses during prompt for clean UX
|
||||
- Uses `HERMES_INTERACTIVE` env var to detect CLI mode
|
||||
|
||||
### 2. Fix `browser_get_images` Tool 🖼️ ✅ VERIFIED WORKING
|
||||
- [x] **Tested:** Tool works correctly on multiple sites
|
||||
- [x] **Results:** Successfully extracts image URLs, alt text, dimensions
|
||||
- [x] **Note:** Some sites (Pixabay, etc.) have Cloudflare bot protection that blocks headless browsers - this is expected behavior, not a bug
|
||||
|
||||
### 3. Better Action Logging for Debugging 📝 ✅ COMPLETE
|
||||
- [x] **Problem:** Need better logging of agent actions for debugging
|
||||
- [x] **Implementation:**
|
||||
- Save full session trajectories to `logs/` directory as JSON
|
||||
- Each session gets a unique file: `session_YYYYMMDD_HHMMSS_UUID.json`
|
||||
- Logs all messages, tool calls with inputs/outputs, timestamps
|
||||
- Structured JSON format for easy parsing and replay
|
||||
- Automatic on CLI runs (configurable)
|
||||
|
||||
### 4. Stream Thinking Summaries in Real-Time 💭 ⏸️ DEFERRED
|
||||
- [ ] **Problem:** Thinking/reasoning summaries not shown while streaming
|
||||
- [ ] **Complexity:** This is a significant refactor - leaving for later
|
||||
|
||||
**OpenRouter Streaming Info:**
|
||||
- Uses `stream=True` with OpenAI SDK
|
||||
- Reasoning comes in `choices[].delta.reasoning_details` chunks
|
||||
- Types: `reasoning.summary`, `reasoning.text`, `reasoning.encrypted`
|
||||
- Tool call arguments stream as partial JSON (need accumulation)
|
||||
- Items paradigm: same ID emitted multiple times with updated content
|
||||
|
||||
**Key Challenges:**
|
||||
- Tool call JSON accumulation (partial `{"query": "wea` → `{"query": "weather"}`)
|
||||
- Multiple concurrent outputs (thinking + tool calls + text simultaneously)
|
||||
- State management for partial responses
|
||||
- Error handling if connection drops mid-stream
|
||||
- Deciding when tool calls are "complete" enough to execute
|
||||
|
||||
**UX Questions to Resolve:**
|
||||
- Show raw thinking text or summarized?
|
||||
- Live expanding text vs. spinner replacement?
|
||||
- Markdown rendering while streaming?
|
||||
- How to handle thinking + tool call display simultaneously?
|
||||
|
||||
**Implementation Options:**
|
||||
- New `run_conversation_streaming()` method (keep non-streaming as fallback)
|
||||
- Wrapper that handles streaming internally
|
||||
- Big refactor of existing `run_conversation()`
|
||||
|
||||
**References:**
|
||||
- https://openrouter.ai/docs/api/reference/streaming
|
||||
- https://openrouter.ai/docs/guides/best-practices/reasoning-tokens#streaming-response
|
||||
|
||||
---
|
||||
|
||||
## 1. Subagent Architecture (Context Isolation) 🎯
|
||||
|
||||
**Problem:** Long-running tools (terminal commands, browser automation, complex file operations) consume massive context. A single `ls -la` can add hundreds of lines. Browser snapshots, debugging sessions, and iterative terminal work quickly bloat the main conversation, leaving less room for actual reasoning.
|
||||
@@ -160,87 +82,48 @@ These items need to be addressed ASAP:
|
||||
|
||||
---
|
||||
|
||||
## 2. Context Management (complements Subagents)
|
||||
## 2. Planning & Task Management 📋
|
||||
|
||||
**Problem:** Context grows unbounded during long conversations. Trajectory compression exists for training data post-hoc, but live conversations lack intelligent context management.
|
||||
**Problem:** Agent handles tasks reactively without explicit planning. Complex multi-step tasks lack structure, progress tracking, and the ability to decompose work into manageable chunks.
|
||||
|
||||
**Ideas:**
|
||||
- [ ] **Incremental summarization** - Compress old tool outputs on-the-fly during conversations
|
||||
- Trigger when context exceeds threshold (e.g., 80% of max tokens)
|
||||
- Preserve recent turns fully, summarize older tool responses
|
||||
- Could reuse logic from `trajectory_compressor.py`
|
||||
- [ ] **Task decomposition tool** - Break complex requests into subtasks:
|
||||
```
|
||||
User: "Set up a new Python project with FastAPI, tests, and Docker"
|
||||
|
||||
- [ ] **Semantic memory retrieval** - Vector store for long conversation recall
|
||||
- Embed important facts/findings as conversation progresses
|
||||
- Retrieve relevant memories when needed instead of keeping everything in context
|
||||
- Consider lightweight solutions: ChromaDB, FAISS, or even a simple embedding cache
|
||||
Agent creates plan:
|
||||
├── 1. Create project structure and requirements.txt
|
||||
├── 2. Implement FastAPI app skeleton
|
||||
├── 3. Add pytest configuration and initial tests
|
||||
├── 4. Create Dockerfile and docker-compose.yml
|
||||
└── 5. Verify everything works together
|
||||
```
|
||||
- Each subtask becomes a trackable unit
|
||||
- Agent can report progress: "Completed 3/5 tasks"
|
||||
|
||||
- [ ] **Working vs. episodic memory** distinction
|
||||
- Working memory: Current task state, recent tool results (always in context)
|
||||
- Episodic memory: Past findings, tried approaches (retrieved on demand)
|
||||
- Clear eviction policies for each
|
||||
- [ ] **Progress checkpoints** - Periodic self-assessment:
|
||||
- After N tool calls or time elapsed, pause to evaluate
|
||||
- "What have I accomplished? What remains? Am I on track?"
|
||||
- Detect if stuck in loops or making no progress
|
||||
- Could trigger replanning if approach isn't working
|
||||
|
||||
- [ ] **Explicit plan storage** - Persist plan in conversation:
|
||||
- Store as structured data (not just in context)
|
||||
- Update status as tasks complete
|
||||
- User can ask "What's the plan?" or "What's left?"
|
||||
- Survives context compression (plans are protected)
|
||||
|
||||
**Files to modify:** `run_agent.py` (add memory manager), possibly new `tools/memory_tool.py`
|
||||
- [ ] **Failure recovery with replanning** - When things go wrong:
|
||||
- Record what failed and why
|
||||
- Revise plan to work around the issue
|
||||
- "Step 3 failed because X, adjusting approach to Y"
|
||||
- Prevents repeating failed strategies
|
||||
|
||||
**Files to modify:** `run_agent.py` (add planning hooks), new `tools/planning_tool.py`
|
||||
|
||||
---
|
||||
|
||||
## 3. Self-Reflection & Course Correction 🔄
|
||||
|
||||
**Problem:** Current retry logic handles malformed outputs but not semantic failures. Agent doesn't reason about *why* something failed.
|
||||
|
||||
**Ideas:**
|
||||
- [ ] **Meta-reasoning after failures** - When a tool returns an error or unexpected result:
|
||||
```
|
||||
Tool failed → Reflect: "Why did this fail? What assumptions were wrong?"
|
||||
→ Adjust approach → Retry with new strategy
|
||||
```
|
||||
- Could be a lightweight LLM call or structured self-prompt
|
||||
|
||||
- [ ] **Planning/replanning module** - For complex multi-step tasks:
|
||||
- Generate plan before execution
|
||||
- After each step, evaluate: "Am I on track? Should I revise the plan?"
|
||||
- Store plan in working memory, update as needed
|
||||
|
||||
- [ ] **Approach memory** - Remember what didn't work:
|
||||
- "I tried X for this type of problem and it failed because Y"
|
||||
- Prevents repeating failed strategies in the same conversation
|
||||
|
||||
**Files to modify:** `run_agent.py` (add reflection hooks in tool loop), new `tools/reflection_tool.py`
|
||||
|
||||
---
|
||||
|
||||
## 4. Tool Composition & Learning 🔧
|
||||
|
||||
**Problem:** Tools are atomic. Complex tasks require repeated manual orchestration of the same tool sequences.
|
||||
|
||||
**Ideas:**
|
||||
- [ ] **Macro tools / Tool chains** - Define reusable tool sequences:
|
||||
```yaml
|
||||
research_topic:
|
||||
description: "Deep research on a topic"
|
||||
steps:
|
||||
- web_search: {query: "$topic"}
|
||||
- web_extract: {urls: "$search_results.urls[:3]"}
|
||||
- summarize: {content: "$extracted"}
|
||||
```
|
||||
- Could be defined in skills or a new `macros/` directory
|
||||
- Agent can invoke macro as single tool call
|
||||
|
||||
- [ ] **Tool failure patterns** - Learn from failures:
|
||||
- Track: tool, input pattern, error type, what worked instead
|
||||
- Before calling a tool, check: "Has this pattern failed before?"
|
||||
- Persistent across sessions (stored in skills or separate DB)
|
||||
|
||||
- [ ] **Parallel tool execution** - When tools are independent, run concurrently:
|
||||
- Detect independence (no data dependencies between calls)
|
||||
- Use `asyncio.gather()` for parallel execution
|
||||
- Already have async support in some tools, just need orchestration
|
||||
|
||||
**Files to modify:** `model_tools.py`, `toolsets.py`, new `tool_macros.py`
|
||||
|
||||
---
|
||||
|
||||
## 5. Dynamic Skills Expansion 📚
|
||||
## 3. Dynamic Skills Expansion 📚
|
||||
|
||||
**Problem:** Skills system is elegant but static. Skills must be manually created and added.
|
||||
|
||||
@@ -269,21 +152,7 @@ These items need to be addressed ASAP:
|
||||
|
||||
---
|
||||
|
||||
## 6. Task Continuation Hints 🎯
|
||||
|
||||
**Problem:** Could be more helpful by suggesting logical next steps.
|
||||
|
||||
**Ideas:**
|
||||
- [ ] **Suggest next steps** - At end of a task, suggest logical continuations:
|
||||
- "Code is written. Want me to also write tests / docs / deploy?"
|
||||
- Based on common workflows for task type
|
||||
- Non-intrusive, just offer options
|
||||
|
||||
**Files to modify:** `run_agent.py`, response generation logic
|
||||
|
||||
---
|
||||
|
||||
## 7. Interactive Clarifying Questions Tool ❓
|
||||
## 4. Interactive Clarifying Questions Tool ❓
|
||||
|
||||
**Problem:** Agent sometimes makes assumptions or guesses when it should ask the user. Currently can only ask via text, which gets lost in long outputs.
|
||||
|
||||
@@ -319,25 +188,7 @@ These items need to be addressed ASAP:
|
||||
|
||||
---
|
||||
|
||||
## 8. Resource Awareness & Efficiency 💰
|
||||
|
||||
**Problem:** No awareness of costs, time, or resource usage. Could be smarter about efficiency.
|
||||
|
||||
**Ideas:**
|
||||
- [ ] **Tool result caching** - Don't repeat identical operations:
|
||||
- Cache web searches, extractions within a session
|
||||
- Invalidation based on time-sensitivity of query
|
||||
- Hash-based lookup: same input → cached output
|
||||
|
||||
- [ ] **Lazy evaluation** - Don't fetch everything upfront:
|
||||
- Get summaries first, full content only if needed
|
||||
- "I found 5 relevant pages. Want me to deep-dive on any?"
|
||||
|
||||
**Files to modify:** `model_tools.py`, new `resource_tracker.py`
|
||||
|
||||
---
|
||||
|
||||
## 9. Collaborative Problem Solving 🤝
|
||||
## 5. Collaborative Problem Solving 🤝
|
||||
|
||||
**Problem:** Interaction is command/response. Complex problems benefit from dialogue.
|
||||
|
||||
@@ -356,7 +207,7 @@ These items need to be addressed ASAP:
|
||||
|
||||
---
|
||||
|
||||
## 10. Project-Local Context 💾
|
||||
## 6. Project-Local Context 💾
|
||||
|
||||
**Problem:** Valuable context lost between sessions.
|
||||
|
||||
@@ -374,30 +225,7 @@ These items need to be addressed ASAP:
|
||||
|
||||
**Files to modify:** New `project_context.py`, auto-load in `run_agent.py`
|
||||
|
||||
---
|
||||
|
||||
## 11. Graceful Degradation & Robustness 🛡️
|
||||
|
||||
**Problem:** When things go wrong, recovery is limited. Should fail gracefully.
|
||||
|
||||
**Ideas:**
|
||||
- [ ] **Fallback chains** - When primary approach fails, have backups:
|
||||
- `web_extract` fails → try `browser_navigate` → try `web_search` for cached version
|
||||
- Define fallback order per tool type
|
||||
|
||||
- [ ] **Partial progress preservation** - Don't lose work on failure:
|
||||
- Long task fails midway → save what we've got
|
||||
- "I completed 3/5 steps before the error. Here's what I have..."
|
||||
|
||||
- [ ] **Self-healing** - Detect and recover from bad states:
|
||||
- Browser stuck → close and retry
|
||||
- Terminal hung → timeout and reset
|
||||
|
||||
**Files to modify:** `model_tools.py`, tool implementations, new `fallback_manager.py`
|
||||
|
||||
---
|
||||
|
||||
## 12. Tools & Skills Wishlist 🧰
|
||||
## 6. Tools & Skills Wishlist 🧰
|
||||
|
||||
*Things that would need new tool implementations (can't do well with current tools):*
|
||||
|
||||
@@ -464,7 +292,7 @@ These items need to be addressed ASAP:
|
||||
|
||||
---
|
||||
|
||||
## 13. Messaging Platform Integrations 💬
|
||||
## 7. Messaging Platform Integrations 💬 ✅ COMPLETE
|
||||
|
||||
**Problem:** Agent currently only works via `cli.py` which requires direct terminal access. Users may want to interact via messaging apps from their phone or other devices.
|
||||
|
||||
@@ -485,75 +313,41 @@ These items need to be addressed ASAP:
|
||||
```
|
||||
|
||||
**Platform support (each user sets up their own credentials):**
|
||||
- [ ] **Telegram** - via `python-telegram-bot` or `grammy` equivalent
|
||||
- [x] **Telegram** - via `python-telegram-bot`
|
||||
- Bot token from @BotFather
|
||||
- Easiest to set up, good for personal use
|
||||
- [ ] **Discord** - via `discord.py`
|
||||
- [x] **Discord** - via `discord.py`
|
||||
- Bot token from Discord Developer Portal
|
||||
- Can work in servers (group sessions) or DMs
|
||||
- [ ] **WhatsApp** - via `baileys` (WhatsApp Web protocol)
|
||||
- QR code scan to authenticate
|
||||
- [x] **WhatsApp** - via Node.js bridge (whatsapp-web.js/baileys)
|
||||
- Requires Node.js bridge setup
|
||||
- More complex, but reaches most people
|
||||
|
||||
**Session management:**
|
||||
- [ ] **Session store** - JSONL persistence per session key
|
||||
- `~/.hermes/sessions/{session_key}.jsonl`
|
||||
- Session keys: `telegram:dm:{user_id}`, `discord:channel:{id}`, etc.
|
||||
- [ ] **Session expiry** - Configurable reset policies
|
||||
- Daily reset (default 4am) OR idle timeout (e.g., 2 hours)
|
||||
- [x] **Session store** - JSONL persistence per session key
|
||||
- `~/.hermes/sessions/{session_id}.jsonl`
|
||||
- Session keys: `agent:main:telegram:dm`, `agent:main:discord:group:123`, etc.
|
||||
- [x] **Session expiry** - Configurable reset policies
|
||||
- Daily reset (default 4am) OR idle timeout (default 2 hours)
|
||||
- Manual reset via `/reset` or `/new` command in chat
|
||||
- [ ] **Session continuity** - Conversations persist across messages until reset
|
||||
- Per-platform and per-type overrides
|
||||
- [x] **Session continuity** - Conversations persist across messages until reset
|
||||
|
||||
**Files to create:** `monitors/telegram_monitor.py`, `monitors/discord_monitor.py`, `monitors/session_store.py`
|
||||
**Files created:** `gateway/`, `gateway/platforms/`, `gateway/config.py`, `gateway/session.py`, `gateway/delivery.py`, `gateway/run.py`
|
||||
|
||||
**Configuration:**
|
||||
- Environment variables: `TELEGRAM_BOT_TOKEN`, `DISCORD_BOT_TOKEN`, etc.
|
||||
- Config file: `~/.hermes/gateway.json`
|
||||
- CLI commands: `/platforms` to check status, `--gateway` to start
|
||||
|
||||
**Dynamic context injection:**
|
||||
- Agent knows its source platform and chat
|
||||
- Agent knows connected platforms and home channels
|
||||
- Agent can deliver cron outputs to specific platforms
|
||||
|
||||
---
|
||||
|
||||
## 14. Scheduled Tasks / Cron Jobs ⏰
|
||||
|
||||
**Problem:** Agent only runs on-demand. Some tasks benefit from scheduled execution (daily summaries, monitoring, reminders).
|
||||
|
||||
**Ideas:**
|
||||
- [ ] **Cron-style scheduler** - Run agent turns on a schedule
|
||||
- Store jobs in `~/.hermes/cron/jobs.json`
|
||||
- Each job: `{ id, schedule, prompt, session_mode, delivery }`
|
||||
- Uses APScheduler or similar Python library
|
||||
|
||||
- [ ] **Session modes:**
|
||||
- `isolated` - Fresh session each run (no history, clean context)
|
||||
- `main` - Append to main session (agent remembers previous scheduled runs)
|
||||
|
||||
- [ ] **Delivery options:**
|
||||
- Write output to file (`~/.hermes/cron/output/{job_id}/{timestamp}.md`)
|
||||
- Send to messaging channel (if integrations enabled)
|
||||
- Both
|
||||
|
||||
- [ ] **CLI interface:**
|
||||
```bash
|
||||
# List scheduled jobs
|
||||
python cli.py --cron list
|
||||
|
||||
# Add a job (runs daily at 9am)
|
||||
python cli.py --cron add "Summarize my email inbox" --schedule "0 9 * * *"
|
||||
|
||||
# Quick syntax for simple intervals
|
||||
python cli.py --cron add "Check server status" --every 30m
|
||||
|
||||
# Remove a job
|
||||
python cli.py --cron remove <job_id>
|
||||
```
|
||||
|
||||
- [ ] **Agent self-scheduling** - Let the agent create its own cron jobs
|
||||
- New tool: `schedule_task(prompt, schedule, session_mode)`
|
||||
- "Remind me to check the deployment tomorrow at 9am"
|
||||
- Agent can set follow-up tasks for itself
|
||||
|
||||
- [ ] **In-chat command:** `/cronjob {prompt} {frequency}` when using messaging integrations
|
||||
|
||||
**Files to create:** `cron/scheduler.py`, `cron/jobs.py`, `tools/schedule_tool.py`
|
||||
|
||||
---
|
||||
|
||||
## 15. Text-to-Speech (TTS) 🔊
|
||||
## 8. Text-to-Speech (TTS) 🔊
|
||||
|
||||
**Problem:** Agent can only respond with text. Some users prefer audio responses (accessibility, hands-free use, podcasts).
|
||||
|
||||
@@ -584,7 +378,7 @@ These items need to be addressed ASAP:
|
||||
|
||||
---
|
||||
|
||||
## 16. Speech-to-Text / Audio Transcription 🎤
|
||||
## 13. Speech-to-Text / Audio Transcription 🎤
|
||||
|
||||
**Problem:** Users may want to send voice memos instead of typing. Agent is blind to audio content.
|
||||
|
||||
@@ -613,103 +407,6 @@ These items need to be addressed ASAP:
|
||||
|
||||
**Files to create:** `tools/transcribe_tool.py`, integrate with messaging monitors
|
||||
|
||||
---
|
||||
|
||||
## Priority Order (Suggested)
|
||||
|
||||
1. **🎯 Subagent Architecture** - Critical for context management, enables everything else
|
||||
2. **Memory & Context Management** - Complements subagents for remaining context
|
||||
3. **Self-Reflection** - Improves reliability and reduces wasted tool calls
|
||||
4. **Project-Local Context** - Practical win, keeps useful info across sessions
|
||||
5. **Messaging Integrations** - Unlocks mobile access, new interaction patterns
|
||||
6. **Scheduled Tasks / Cron Jobs** - Enables automation, reminders, monitoring
|
||||
7. **Tool Composition** - Quality of life, builds on other improvements
|
||||
8. **Dynamic Skills** - Force multiplier for repeated tasks
|
||||
9. **Interactive Clarifying Questions** - Better UX for ambiguous tasks
|
||||
10. **TTS / Audio Transcription** - Accessibility, hands-free use
|
||||
|
||||
---
|
||||
|
||||
## Removed Items (Unrealistic)
|
||||
|
||||
The following were removed because they're architecturally impossible:
|
||||
|
||||
- ~~Proactive suggestions / Prefetching~~ - Agent only runs on user request, can't interject
|
||||
- ~~Clipboard integration~~ - No access to user's local system clipboard
|
||||
|
||||
The following **moved to active TODO** (now possible with new architecture):
|
||||
|
||||
- ~~Session save/restore~~ → See **Messaging Integrations** (session persistence)
|
||||
- ~~Voice/TTS playback~~ → See **TTS** (can generate audio files, send via messaging)
|
||||
- ~~Set reminders~~ → See **Scheduled Tasks / Cron Jobs**
|
||||
|
||||
The following were removed because they're **already possible**:
|
||||
|
||||
- ~~HTTP/API Client~~ → Use `curl` or Python `requests` in terminal
|
||||
- ~~Structured Data Manipulation~~ → Use `pandas` in terminal
|
||||
- ~~Git-Native Operations~~ → Use `git` CLI in terminal
|
||||
- ~~Symbolic Math~~ → Use `SymPy` in terminal
|
||||
- ~~Code Quality Tools~~ → Run linters (`eslint`, `black`, `mypy`) in terminal
|
||||
- ~~Testing Framework~~ → Run `pytest`, `jest`, etc. in terminal
|
||||
- ~~Translation~~ → LLM handles this fine, or use translation APIs
|
||||
|
||||
---
|
||||
|
||||
---
|
||||
|
||||
## 🧪 Brainstorm Ideas (Not Yet Fleshed Out)
|
||||
|
||||
*These are early-stage ideas that need more thinking before implementation. Captured here so they don't get lost.*
|
||||
|
||||
### Remote/Distributed Execution 🌐
|
||||
|
||||
**Concept:** Run agent on a powerful remote server while interacting from a thin client.
|
||||
|
||||
**Why interesting:**
|
||||
- Run on beefy GPU server for local LLM inference
|
||||
- Agent has access to remote machine's resources (files, tools, internet)
|
||||
- User interacts via lightweight client (phone, low-power laptop)
|
||||
|
||||
**Open questions:**
|
||||
- How does this differ from just SSH + running cli.py on remote?
|
||||
- Would need secure communication channel (WebSocket? gRPC?)
|
||||
- How to handle tool outputs that reference remote paths?
|
||||
- Credential management for remote execution
|
||||
- Latency considerations for interactive use
|
||||
|
||||
**Possible architecture:**
|
||||
```
|
||||
┌─────────────┐ ┌─────────────────────────┐
|
||||
│ Thin Client │ ◄─────► │ Remote Hermes Server │
|
||||
│ (phone/web) │ WS/API │ - Full agent + tools │
|
||||
└─────────────┘ │ - GPU for local LLM │
|
||||
│ - Access to server files│
|
||||
└─────────────────────────┘
|
||||
```
|
||||
|
||||
**Related to:** Messaging integrations (could be the "server" that monitors receive from)
|
||||
|
||||
---
|
||||
|
||||
### Multi-Agent Parallel Execution 🤖🤖
|
||||
|
||||
**Concept:** Extension of Subagent Architecture (Section 1) - run multiple subagents in parallel.
|
||||
|
||||
**Why interesting:**
|
||||
- Independent subtasks don't need to wait for each other
|
||||
- "Research X while setting up Y" - both run simultaneously
|
||||
- Faster completion for complex multi-part tasks
|
||||
|
||||
**Open questions:**
|
||||
- How to detect which tasks are truly independent?
|
||||
- Resource management (API rate limits, concurrent connections)
|
||||
- How to merge results when parallel tasks have conflicts?
|
||||
- Cost implications of multiple parallel LLM calls
|
||||
|
||||
*Note: Basic subagent delegation (Section 1) should be implemented first, parallel execution is an optimization on top.*
|
||||
|
||||
---
|
||||
|
||||
### Plugin/Extension System 🔌
|
||||
|
||||
**Concept:** Allow users to add custom tools/skills without modifying core code.
|
||||
@@ -726,4 +423,167 @@ The following were removed because they're **already possible**:
|
||||
|
||||
---
|
||||
|
||||
## Recently Completed ✅
|
||||
|
||||
### Dangerous Command Approval System
|
||||
**Implemented:** Dangerous command detection and approval for terminal tool.
|
||||
|
||||
**Features:**
|
||||
- Pattern-based detection of dangerous commands (rm -rf, DROP TABLE, chmod 777, etc.)
|
||||
- CLI prompt with options: `[o]nce | [s]ession | [a]lways | [d]eny`
|
||||
- Session caching (approved patterns don't re-prompt)
|
||||
- Permanent allowlist in `~/.hermes/config.yaml`
|
||||
- Force flag for agent to bypass after user confirmation
|
||||
- Skip check for isolated backends (Docker, Singularity, Modal)
|
||||
- Helpful sudo failure messages for messaging platforms
|
||||
|
||||
**Files:** `tools/terminal_tool.py`, `model_tools.py`, `hermes_cli/config.py`
|
||||
|
||||
---
|
||||
|
||||
## 14. Learning Machine / Dynamic Memory System 🧠
|
||||
|
||||
*Inspired by [Dash](~/agent-codebases/dash) - a self-learning data agent.*
|
||||
|
||||
**Problem:** Agent starts fresh every session. Valuable learnings from debugging, error patterns, successful approaches, and user preferences are lost.
|
||||
|
||||
**Dash's Key Insight:** Separate **Knowledge** (static, curated) from **Learnings** (dynamic, discovered):
|
||||
|
||||
| System | What It Stores | How It Evolves |
|
||||
|--------|---------------|----------------|
|
||||
| **Knowledge** (Skills) | Validated approaches, templates, best practices | Curated by user |
|
||||
| **Learnings** | Error patterns, gotchas, discovered fixes | Managed automatically |
|
||||
|
||||
**Tools to implement:**
|
||||
- [ ] `save_learning(topic, learning, context?)` - Record a discovered pattern
|
||||
```python
|
||||
save_learning(
|
||||
topic="python-ssl",
|
||||
learning="On Ubuntu 22.04, SSL certificate errors often fixed by: apt install ca-certificates",
|
||||
context="Debugging requests SSL failure"
|
||||
)
|
||||
```
|
||||
- [ ] `search_learnings(query)` - Find relevant past learnings
|
||||
```python
|
||||
search_learnings("SSL certificate error Python")
|
||||
# Returns: "On Ubuntu 22.04, SSL certificate errors often fixed by..."
|
||||
```
|
||||
|
||||
**User Profile & Memory:**
|
||||
- [ ] `user_profile` - Structured facts about user preferences
|
||||
```yaml
|
||||
# ~/.hermes/user_profile.yaml
|
||||
coding_style:
|
||||
python_formatter: black
|
||||
type_hints: always
|
||||
test_framework: pytest
|
||||
preferences:
|
||||
verbosity: detailed
|
||||
confirm_destructive: true
|
||||
environment:
|
||||
os: linux
|
||||
shell: bash
|
||||
default_python: 3.11
|
||||
```
|
||||
- [ ] `user_memory` - Unstructured observations the agent learns
|
||||
```yaml
|
||||
# ~/.hermes/user_memory.yaml
|
||||
- "User prefers tabs over spaces despite black's defaults"
|
||||
- "User's main project is ~/work/myapp - a Django app"
|
||||
- "User often works late - don't ask about timezone"
|
||||
```
|
||||
|
||||
**When to learn:**
|
||||
- After fixing an error that took multiple attempts
|
||||
- When user corrects the agent's approach
|
||||
- When a workaround is discovered for a tool limitation
|
||||
- When user expresses a preference
|
||||
|
||||
**Storage:** Vector database (ChromaDB) or simple YAML with embedding search.
|
||||
|
||||
**Files to create:** `tools/learning_tools.py`, `learning/store.py`, `~/.hermes/learnings/`
|
||||
|
||||
---
|
||||
|
||||
## 15. Layered Context Architecture 📊
|
||||
|
||||
*Inspired by Dash's "Six Layers of Context" - grounding responses in multiple sources.*
|
||||
|
||||
**Problem:** Context sources are ad-hoc. No clear hierarchy or strategy for what context to include when.
|
||||
|
||||
**Proposed Layers for Hermes:**
|
||||
|
||||
| Layer | Source | When Loaded | Example |
|
||||
|-------|--------|-------------|---------|
|
||||
| 1. **Project Context** | `.hermes/context.md` | Auto on cwd | "This is a FastAPI project using PostgreSQL" |
|
||||
| 2. **Skills** | `skills/*.md` | On request | "How to set up React project" |
|
||||
| 3. **User Profile** | `~/.hermes/user_profile.yaml` | Always | "User prefers pytest, uses black" |
|
||||
| 4. **Learnings** | `~/.hermes/learnings/` | Semantic search | "SSL fix for Ubuntu" |
|
||||
| 5. **External Knowledge** | Web search, docs | On demand | Current API docs, Stack Overflow |
|
||||
| 6. **Runtime Introspection** | Tool calls | Real-time | File contents, terminal output |
|
||||
|
||||
**Benefits:**
|
||||
- Clear mental model for what context is available
|
||||
- Prioritization: local > learned > external
|
||||
- Debugging: "Why did agent do X?" → check which layers contributed
|
||||
|
||||
**Files to modify:** `run_agent.py` (context loading), new `context/layers.py`
|
||||
|
||||
---
|
||||
|
||||
## 16. Evaluation System with LLM Grading 📏
|
||||
|
||||
*Inspired by Dash's evaluation framework.*
|
||||
|
||||
**Problem:** `batch_runner.py` runs test cases but lacks quality assessment.
|
||||
|
||||
**Dash's Approach:**
|
||||
- **String matching** (default) - Check if expected strings appear
|
||||
- **LLM grader** (-g flag) - GPT evaluates response quality
|
||||
- **Result comparison** (-r flag) - Compare against golden output
|
||||
|
||||
**Implementation for Hermes:**
|
||||
|
||||
- [ ] **Test case format:**
|
||||
```python
|
||||
TestCase(
|
||||
name="create_python_project",
|
||||
prompt="Create a new Python project with FastAPI and tests",
|
||||
expected_strings=["requirements.txt", "main.py", "test_"], # Basic check
|
||||
golden_actions=["write:main.py", "write:requirements.txt", "terminal:pip install"],
|
||||
grader_criteria="Should create complete project structure with working code"
|
||||
)
|
||||
```
|
||||
|
||||
- [ ] **LLM grader mode:**
|
||||
```python
|
||||
def grade_response(response: str, criteria: str) -> Grade:
|
||||
"""Use GPT to evaluate response quality."""
|
||||
prompt = f"""
|
||||
Evaluate this agent response against the criteria.
|
||||
Criteria: {criteria}
|
||||
Response: {response}
|
||||
|
||||
Score (1-5) and explain why.
|
||||
"""
|
||||
# Returns: Grade(score=4, explanation="Created all files but tests are minimal")
|
||||
```
|
||||
|
||||
- [ ] **Action comparison mode:**
|
||||
- Record tool calls made during test
|
||||
- Compare against expected actions
|
||||
- "Expected terminal call to pip install, got npm install"
|
||||
|
||||
- [ ] **CLI flags:**
|
||||
```bash
|
||||
python batch_runner.py eval test_cases.yaml # String matching
|
||||
python batch_runner.py eval test_cases.yaml -g # + LLM grading
|
||||
python batch_runner.py eval test_cases.yaml -r # + Result comparison
|
||||
python batch_runner.py eval test_cases.yaml -v # Verbose (show responses)
|
||||
```
|
||||
|
||||
**Files to modify:** `batch_runner.py`, new `evals/test_cases.py`, new `evals/grader.py`
|
||||
|
||||
---
|
||||
|
||||
*Last updated: $(date +%Y-%m-%d)* 🤖
|
||||
|
||||
@@ -1,41 +0,0 @@
|
||||
# Dockerfile for atropos-agent sandbox server
|
||||
# Runs inside Nomad containers to handle tool execution
|
||||
# Includes bubblewrap for namespace-based slot isolation
|
||||
|
||||
FROM python:3.11-slim
|
||||
|
||||
# Install system dependencies
|
||||
RUN apt-get update && apt-get install -y --no-install-recommends \
|
||||
# Bubblewrap for namespace isolation
|
||||
bubblewrap \
|
||||
# `script` for PTY allocation (used for stable tmux+asciinema startup)
|
||||
util-linux \
|
||||
# Git for SWE-style tasks (cloning repos)
|
||||
git \
|
||||
# tmux for stateful terminal sessions (Phase 4.7+)
|
||||
tmux \
|
||||
# Common tools agents might need
|
||||
curl \
|
||||
wget \
|
||||
jq \
|
||||
# Cleanup
|
||||
&& rm -rf /var/lib/apt/lists/*
|
||||
|
||||
# Install Python dependencies (sandbox server + optional terminal recording)
|
||||
RUN pip install --no-cache-dir aiohttp asciinema
|
||||
|
||||
# Copy the sandbox server
|
||||
COPY sandbox_server.py /app/sandbox_server.py
|
||||
|
||||
WORKDIR /app
|
||||
|
||||
# Create data directory for slot workspaces
|
||||
RUN mkdir -p /data
|
||||
|
||||
# Verify bubblewrap is installed and working
|
||||
RUN bwrap --version
|
||||
|
||||
EXPOSE 8080
|
||||
|
||||
# Default command - can be overridden by Nomad job spec
|
||||
CMD ["python", "sandbox_server.py", "--port", "8080", "--slots", "10", "--data-dir", "/data"]
|
||||
@@ -1,46 +0,0 @@
|
||||
"""
|
||||
Atropos integration for Hermes-Agent.
|
||||
|
||||
This package is intentionally optional: Hermes-Agent should work without Atropos.
|
||||
If you import anything from `atropos.*` without having `atroposlib` installed,
|
||||
we raise a clear error with install instructions.
|
||||
|
||||
Install (recommended, from repo checkout):
|
||||
uv sync --extra atropos
|
||||
|
||||
Or (pip / editable):
|
||||
pip install -e '.[atropos]'
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
|
||||
def _require_atroposlib() -> None:
|
||||
try:
|
||||
import atroposlib # noqa: F401
|
||||
except ModuleNotFoundError as exc: # pragma: no cover
|
||||
raise ModuleNotFoundError(
|
||||
"Hermes-Agent Atropos integration requires `atroposlib`, but it is not installed.\n"
|
||||
"Install it with:\n"
|
||||
" uv sync --extra atropos\n"
|
||||
"or:\n"
|
||||
" pip install -e '.[atropos]'\n"
|
||||
) from exc
|
||||
|
||||
|
||||
_require_atroposlib()
|
||||
|
||||
# Re-export the most commonly used pieces for convenience.
|
||||
from .agent import AgentConfig, AgentResult, AgentStep, AtroposAgent, SequenceData # noqa: E402
|
||||
from .envs import AgentEnv, AgentEnvConfig # noqa: E402
|
||||
|
||||
__all__ = [
|
||||
"AtroposAgent",
|
||||
"AgentConfig",
|
||||
"AgentResult",
|
||||
"AgentStep",
|
||||
"SequenceData",
|
||||
"AgentEnv",
|
||||
"AgentEnvConfig",
|
||||
]
|
||||
|
||||
@@ -1,15 +0,0 @@
|
||||
"""
|
||||
Agent abstractions for atropos-agent.
|
||||
|
||||
Provides the core AtroposAgent class for running ReACT-style agent loops.
|
||||
"""
|
||||
|
||||
from .atropos_agent import AgentConfig, AgentResult, AgentStep, AtroposAgent, SequenceData
|
||||
|
||||
__all__ = [
|
||||
"AtroposAgent",
|
||||
"AgentConfig",
|
||||
"AgentResult",
|
||||
"AgentStep",
|
||||
"SequenceData",
|
||||
]
|
||||
@@ -1,850 +0,0 @@
|
||||
"""
|
||||
ReACT-style agent implementation for atropos-agent.
|
||||
|
||||
This module provides the core AtroposAgent class that implements a basic
|
||||
Reason-Act-Observe loop with tool calling capabilities.
|
||||
|
||||
Uses ManagedServer from atroposlib for automatic token/logprob tracking,
|
||||
making trajectories ready for RL training.
|
||||
|
||||
The agent uses Hermes-style XML tags for tool calls:
|
||||
- <think>...</think> for reasoning
|
||||
- <tool_call>{"name": "...", "arguments": {...}}</tool_call> for actions
|
||||
- <tool_response>...</tool_response> for observations
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import os
|
||||
import json
|
||||
import time
|
||||
from contextlib import asynccontextmanager
|
||||
from dataclasses import dataclass, field
|
||||
from uuid import uuid4
|
||||
from typing import Any, AsyncGenerator, Awaitable, Callable, Dict, List, Optional, Union
|
||||
|
||||
from dotenv import load_dotenv
|
||||
import httpx
|
||||
|
||||
from ..tools import ToolCall, ToolRegistry, ToolResult
|
||||
from atroposlib.envs.server_handling.managed_server import ManagedServer
|
||||
|
||||
load_dotenv()
|
||||
|
||||
|
||||
# Default system prompt with tool calling instructions.
|
||||
AGENT_SYSTEM_PROMPT = """You are a deep thinking AI. You MUST enclose your internal reasoning inside <think>...</think> tags.
|
||||
|
||||
You are a function calling AI model.
|
||||
|
||||
You are provided with function signatures within <tools></tools> XML tags.
|
||||
You must call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions.
|
||||
You can ONLY respond without a tool call if you are totally certain you have the final answer to the user's question or task
|
||||
After calling & executing a function, you will be provided with function results within <tool_response></tool_response> XML tags.
|
||||
|
||||
Here are the available tools:
|
||||
<tools>
|
||||
{tools_json}
|
||||
</tools>
|
||||
|
||||
Use the following JSON schema for each tool call you will make:
|
||||
{"title": "FunctionCall", "type": "object", "properties": {"name": {"title": "Name", "type": "string"}, "arguments": {"title": "Arguments", "type": "object"}}, "required": ["name", "arguments"]}
|
||||
|
||||
## REQUIRED TOOL FORMAT
|
||||
|
||||
When you decide to call a tool, your assistant message MUST be:
|
||||
1) exactly one <think>...</think> block, followed by
|
||||
2) one or more <tool_call>...</tool_call> blocks,
|
||||
and NOTHING else in that message.
|
||||
|
||||
If you need to explain anything, put it inside <think>. Do NOT write natural language outside <think> or <tool_call>.
|
||||
|
||||
For each function call return a JSON object with function name and arguments within <tool_call></tool_call> XML tags as follows:
|
||||
<tool_call>
|
||||
{"name": "<function-name>", "arguments": {"arg1": "value1"}}
|
||||
</tool_call>
|
||||
|
||||
Each <tool_call> must be on its own and contain ONLY the JSON object (no extra text).
|
||||
The JSON inside <tool_call> MUST be valid JSON with double quotes.
|
||||
|
||||
Do NOT output <tool_response> in an assistant message.
|
||||
|
||||
After you receive tool results, you may either call more tools (same required format) or provide the final answer.
|
||||
When providing the final answer, do NOT include any <tool_call> blocks.
|
||||
|
||||
## TERMINAL TOOL NOTES
|
||||
|
||||
- Commands execute under POSIX `/bin/sh` (not bash).
|
||||
- Each tool call runs in a fresh shell: environment changes (like `cd` or venv activation) do not persist across tool calls.
|
||||
- Avoid bash-only features like `source`, `[[ ... ]]`, or process substitution.
|
||||
- Prefer explicit venv usage:
|
||||
- `python -m venv .venv && . .venv/bin/activate && python -m pip install -e .` (POSIX `.` activation), or
|
||||
- `.venv/bin/python -m pip install -e .` (no activation required).
|
||||
|
||||
## ICL (examples)
|
||||
|
||||
User: Show the current directory.
|
||||
Assistant:
|
||||
<think>I should run pwd.</think>
|
||||
<tool_call>
|
||||
{"name": "terminal", "arguments": {"command": "pwd"}}
|
||||
</tool_call>
|
||||
User: <tool_response>{"success": true, "output": "/tmp\\n"}</tool_response>
|
||||
Assistant: /tmp
|
||||
|
||||
User: List files, then count them.
|
||||
Assistant:
|
||||
<think>I should count files.</think>
|
||||
<tool_call>
|
||||
{"name": "terminal", "arguments": {"command": "ls -1 | wc -l"}}
|
||||
</tool_call>
|
||||
User: <tool_response>{"success": true, "output": "3\\n"}</tool_response>
|
||||
Assistant: 3
|
||||
|
||||
User: Run pwd, then print ok (two tool calls).
|
||||
Assistant:
|
||||
<think>I should run two commands.</think>
|
||||
<tool_call>
|
||||
{"name": "terminal", "arguments": {"command": "pwd"}}
|
||||
</tool_call>
|
||||
<tool_call>
|
||||
{"name": "terminal", "arguments": {"command": "echo ok"}}
|
||||
</tool_call>
|
||||
User: <tool_response>{"success": true, "output": "/tmp\\n"}</tool_response>
|
||||
User: <tool_response>{"success": true, "output": "ok\\n"}</tool_response>
|
||||
Assistant: ok
|
||||
"""
|
||||
|
||||
|
||||
@dataclass
|
||||
class AgentConfig:
|
||||
"""Configuration for the AtroposAgent."""
|
||||
|
||||
# Generation parameters
|
||||
temperature: Optional[float] = 0.7
|
||||
# Default to "let the backend decide" (important for tool-tag completions that may be longer).
|
||||
max_tokens: Optional[int] = None
|
||||
|
||||
# Agent behavior
|
||||
max_steps: int = 50
|
||||
system_prompt: Optional[str] = None
|
||||
tool_delay_s: float = 0.0
|
||||
|
||||
# Working directory for tools
|
||||
working_dir: Optional[str] = None
|
||||
|
||||
|
||||
@dataclass
|
||||
class SequenceData:
|
||||
"""Token/logprob data from a single completion."""
|
||||
|
||||
full_text: str
|
||||
tokens: List[int]
|
||||
masked_tokens: List[int] # -100 for prompt, actual IDs for completion
|
||||
logprobs: List[float] # 1.0 for prompt, actual values for completion
|
||||
metadata: Optional[Dict[str, Any]] = None
|
||||
|
||||
@classmethod
|
||||
def from_sequence_node(cls, node) -> "SequenceData":
|
||||
"""Create from a ManagedServer SequenceNode."""
|
||||
return cls(
|
||||
full_text=node.full_text,
|
||||
tokens=node.tokens,
|
||||
masked_tokens=node.masked_tokens,
|
||||
logprobs=node.logprobs,
|
||||
metadata=getattr(node, "metadata", None),
|
||||
)
|
||||
|
||||
|
||||
@dataclass
|
||||
class AgentStep:
|
||||
"""A single step in the agent's trajectory."""
|
||||
|
||||
step_number: int
|
||||
assistant_message: str
|
||||
tool_calls: List[ToolCall] = field(default_factory=list)
|
||||
tool_results: List[ToolResult] = field(default_factory=list)
|
||||
sequence_data: Optional[SequenceData] = None # Token data from this step
|
||||
|
||||
@property
|
||||
def has_tool_calls(self) -> bool:
|
||||
return len(self.tool_calls) > 0
|
||||
|
||||
|
||||
@dataclass
|
||||
class AgentResult:
|
||||
"""Result of running an agent trajectory."""
|
||||
|
||||
success: bool
|
||||
final_response: str
|
||||
steps: List[AgentStep] = field(default_factory=list)
|
||||
total_tokens: int = 0
|
||||
error: Optional[str] = None
|
||||
metadata: Dict[str, Any] = field(default_factory=dict)
|
||||
|
||||
# Full trajectory token data for RL training
|
||||
trajectory_data: Optional[SequenceData] = None
|
||||
|
||||
@property
|
||||
def num_steps(self) -> int:
|
||||
return len(self.steps)
|
||||
|
||||
@property
|
||||
def total_tool_calls(self) -> int:
|
||||
return sum(len(step.tool_calls) for step in self.steps)
|
||||
|
||||
def to_messages(self) -> List[Dict[str, str]]:
|
||||
"""Convert trajectory to messages format for logging."""
|
||||
messages = []
|
||||
for step in self.steps:
|
||||
messages.append({"role": "assistant", "content": step.assistant_message})
|
||||
if step.tool_results:
|
||||
# Combine all tool responses
|
||||
responses = "\n".join(r.to_xml() for r in step.tool_results)
|
||||
messages.append({"role": "user", "content": responses})
|
||||
return messages
|
||||
|
||||
def to_scored_data(self, score: float) -> Optional[Dict[str, Any]]:
|
||||
"""
|
||||
Convert to format suitable for ScoredDataGroup.
|
||||
|
||||
Args:
|
||||
score: The score for this trajectory
|
||||
|
||||
Returns:
|
||||
Dict with tokens, masks, scores suitable for training, or None if no data
|
||||
"""
|
||||
if self.trajectory_data is None:
|
||||
return None
|
||||
|
||||
return {
|
||||
"tokens": self.trajectory_data.tokens,
|
||||
"masks": self.trajectory_data.masked_tokens,
|
||||
"scores": score,
|
||||
"logprobs": self.trajectory_data.logprobs,
|
||||
}
|
||||
|
||||
|
||||
class AtroposAgent:
|
||||
"""
|
||||
A ReACT-style agent that uses LLMs with tool calling.
|
||||
|
||||
This implementation wraps ManagedServer for automatic token/logprob tracking,
|
||||
making trajectories ready for RL training.
|
||||
|
||||
Example:
|
||||
# `server` may be an Atropos `ServerManager` (recommended) or a single `APIServer`.
|
||||
# In practice, environments usually construct this via `BaseEnv`.
|
||||
server = ...
|
||||
tools = ToolRegistry()
|
||||
tools.register(BashTool())
|
||||
|
||||
agent = AtroposAgent(server=server, tools=tools)
|
||||
result = await agent.run("List the files in the current directory")
|
||||
|
||||
# Access token data for training
|
||||
if result.trajectory_data:
|
||||
print(f"Tokens: {result.trajectory_data.tokens}")
|
||||
print(f"Masked: {result.trajectory_data.masked_tokens}")
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
server, # ServerManager or APIServer
|
||||
tools: Optional[ToolRegistry] = None,
|
||||
config: Optional[AgentConfig] = None,
|
||||
tokenizer: Optional[Any] = None,
|
||||
execute_tool: Optional[Callable[[ToolCall], Awaitable[ToolResult]]] = None,
|
||||
):
|
||||
self.server = server
|
||||
self.tools = tools or ToolRegistry()
|
||||
self.config = config or AgentConfig()
|
||||
self.tokenizer = tokenizer or getattr(server, "tokenizer", None)
|
||||
self.execute_tool = execute_tool or self.tools.execute
|
||||
|
||||
@asynccontextmanager
|
||||
async def _managed(self) -> AsyncGenerator[Any, None]:
|
||||
"""
|
||||
Yield a ManagedServer-like object.
|
||||
|
||||
- If `self.server` is a ServerManager, use its `managed_server()` context manager.
|
||||
- If `self.server` is a single APIServer, wrap it in `ManagedServer` directly.
|
||||
"""
|
||||
if os.getenv("ATROPOS_BYPASS_MANAGED_SERVER") == "1":
|
||||
yield _DirectChatCompletionClient(server=self.server)
|
||||
return
|
||||
if hasattr(self.server, "managed_server"):
|
||||
async with self.server.managed_server(tokenizer=self.tokenizer) as managed:
|
||||
yield managed
|
||||
else:
|
||||
managed = ManagedServer(server=self.server, tokenizer=self.tokenizer)
|
||||
try:
|
||||
yield managed
|
||||
finally:
|
||||
managed.reset()
|
||||
|
||||
def _build_system_prompt(self) -> str:
|
||||
"""Build the system prompt with tool descriptions."""
|
||||
if self.config.system_prompt:
|
||||
return self.config.system_prompt
|
||||
|
||||
tools_json = self.tools.get_prompt_tool_definitions_json()
|
||||
# Avoid `str.format()` here because the prompt contains many literal `{}` braces
|
||||
# in JSON examples; we only want to substitute the single `{tools_json}` token.
|
||||
return AGENT_SYSTEM_PROMPT.replace("{tools_json}", tools_json)
|
||||
|
||||
def _infer_server_model_for_debug(self) -> Optional[str]:
|
||||
"""
|
||||
Best-effort inference of the configured model name for debug payload saving.
|
||||
|
||||
ManagedServer/server_manager typically injects `model` internally, so `chat_kwargs`
|
||||
may not contain it. For replaying saved payloads via curl, it's useful to persist it.
|
||||
"""
|
||||
servers = getattr(self.server, "servers", None)
|
||||
if isinstance(servers, list) and servers:
|
||||
s0 = servers[0]
|
||||
cfg = getattr(s0, "config", None)
|
||||
model = getattr(cfg, "model_name", None) or getattr(s0, "model_name", None)
|
||||
if isinstance(model, str) and model:
|
||||
return model
|
||||
model = getattr(self.server, "model_name", None) or getattr(self.server, "model", None)
|
||||
if isinstance(model, str) and model:
|
||||
return model
|
||||
return None
|
||||
|
||||
def _infer_server_base_url_for_debug(self) -> Optional[str]:
|
||||
"""
|
||||
Best-effort inference of the configured base_url for debug logging.
|
||||
|
||||
This is helpful when diagnosing hangs / retries at the transport layer.
|
||||
"""
|
||||
servers = getattr(self.server, "servers", None)
|
||||
if isinstance(servers, list) and servers:
|
||||
s0 = servers[0]
|
||||
cfg = getattr(s0, "config", None)
|
||||
base_url = getattr(cfg, "base_url", None) or getattr(s0, "base_url", None)
|
||||
if isinstance(base_url, str) and base_url:
|
||||
return base_url
|
||||
base_url = getattr(self.server, "base_url", None)
|
||||
if isinstance(base_url, str) and base_url:
|
||||
return base_url
|
||||
return None
|
||||
|
||||
def _extract_response_metadata(self, response: Any) -> Dict[str, Any]:
|
||||
"""
|
||||
Extract lightweight, JSON-serializable metadata from an OpenAI-style response.
|
||||
|
||||
This is useful for debugging training runs, especially when ManagedServer state
|
||||
tracking is unavailable (e.g. OpenAI-compatible chat endpoints).
|
||||
"""
|
||||
meta: Dict[str, Any] = {}
|
||||
try:
|
||||
rid = getattr(response, "id", None)
|
||||
if isinstance(rid, str) and rid:
|
||||
meta["id"] = rid
|
||||
model = getattr(response, "model", None)
|
||||
if isinstance(model, str) and model:
|
||||
meta["model"] = model
|
||||
created = getattr(response, "created", None)
|
||||
if isinstance(created, int):
|
||||
meta["created"] = created
|
||||
system_fingerprint = getattr(response, "system_fingerprint", None)
|
||||
if isinstance(system_fingerprint, str) and system_fingerprint:
|
||||
meta["system_fingerprint"] = system_fingerprint
|
||||
|
||||
choices = getattr(response, "choices", None)
|
||||
if isinstance(choices, list) and choices:
|
||||
fr = getattr(choices[0], "finish_reason", None)
|
||||
if isinstance(fr, str) and fr:
|
||||
meta["finish_reason"] = fr
|
||||
|
||||
usage = getattr(response, "usage", None)
|
||||
if usage is not None:
|
||||
if hasattr(usage, "model_dump"):
|
||||
meta["usage"] = usage.model_dump()
|
||||
elif isinstance(usage, dict):
|
||||
meta["usage"] = usage
|
||||
except Exception:
|
||||
pass
|
||||
return meta
|
||||
|
||||
def _debug_dump_request(self, *, step_num: int, chat_kwargs: Dict[str, Any]) -> None:
|
||||
if os.getenv("ATROPOS_DEBUG_AGENT_REQUEST") != "1":
|
||||
return
|
||||
try:
|
||||
# Avoid dumping megabytes by default; messages can be huge.
|
||||
meta = {
|
||||
"step": step_num,
|
||||
"base_url": self._infer_server_base_url_for_debug(),
|
||||
"model": chat_kwargs.get("model") or self._infer_server_model_for_debug(),
|
||||
"chat_kwargs_keys": sorted(list(chat_kwargs.keys())),
|
||||
"n": chat_kwargs.get("n"),
|
||||
"max_tokens": chat_kwargs.get("max_tokens"),
|
||||
"temperature": chat_kwargs.get("temperature"),
|
||||
"num_messages": len(chat_kwargs.get("messages") or []),
|
||||
}
|
||||
print("\n=== ATROPOS_DEBUG_AGENT_REQUEST ===", flush=True)
|
||||
print(meta, flush=True)
|
||||
|
||||
if os.getenv("ATROPOS_DEBUG_AGENT_REQUEST_FULL") == "1":
|
||||
payload = dict(chat_kwargs)
|
||||
# Make the payload more legible and less huge.
|
||||
try:
|
||||
dumped = json.dumps(payload, ensure_ascii=False, indent=2)
|
||||
except Exception:
|
||||
dumped = repr(payload)
|
||||
print("\n=== ATROPOS_DEBUG_AGENT_REQUEST_FULL ===", flush=True)
|
||||
print(dumped[:200_000], flush=True)
|
||||
|
||||
# Optional: save the FULL request payload to disk (no truncation).
|
||||
save_dir = os.getenv("ATROPOS_DEBUG_AGENT_REQUEST_SAVE_DIR")
|
||||
if save_dir:
|
||||
os.makedirs(save_dir, exist_ok=True)
|
||||
payload: Dict[str, Any] = dict(chat_kwargs)
|
||||
if "model" not in payload:
|
||||
model = self._infer_server_model_for_debug()
|
||||
if model:
|
||||
payload["model"] = model
|
||||
# Use a unique filename so parallel trajectories don't clobber each other.
|
||||
fname = os.path.join(
|
||||
save_dir,
|
||||
f"atropos_agent_request_step{step_num}_{int(time.time()*1000)}_{os.getpid()}_{uuid4().hex}.json",
|
||||
)
|
||||
with open(fname, "w", encoding="utf-8") as f:
|
||||
json.dump(payload, f, ensure_ascii=False, indent=2)
|
||||
print(f"[AtroposAgent] saved request payload: {fname}", flush=True)
|
||||
except Exception:
|
||||
return
|
||||
|
||||
def _debug_dump_response(self, *, step_num: int, response: Any) -> None:
|
||||
if os.getenv("ATROPOS_DEBUG_AGENT_RESPONSE") != "1":
|
||||
return
|
||||
print("\n=== ATROPOS_DEBUG_AGENT_RESPONSE ===", flush=True)
|
||||
print({"step": step_num, "type": type(response).__name__}, flush=True)
|
||||
try:
|
||||
dumped = response.model_dump() # openai pydantic model
|
||||
except Exception:
|
||||
dumped = getattr(response, "__dict__", {"repr": repr(response)})
|
||||
# Keep the dump bounded; we only need enough to see the assistant message content.
|
||||
text = str(dumped)
|
||||
print(text[:200_000], flush=True)
|
||||
|
||||
async def _chat_completion_with_debug(
|
||||
self, *, managed: Any, step_num: int, chat_kwargs: Dict[str, Any]
|
||||
) -> Any:
|
||||
"""
|
||||
Call `managed.chat_completion()` with optional timeout + richer failure logging.
|
||||
|
||||
Debug env vars:
|
||||
- `ATROPOS_AGENT_CHAT_TIMEOUT_S`: if set, wraps the await in `asyncio.wait_for`.
|
||||
- `ATROPOS_DEBUG_AGENT_WAIT_EVERY_S`: if set, prints a heartbeat while waiting.
|
||||
"""
|
||||
# Hard guardrail: never allow a single chat completion to block for too long.
|
||||
# This is essential for RL data-gen stability; long hangs should be treated as failures (score=0).
|
||||
timeout_s_raw = os.getenv("ATROPOS_AGENT_CHAT_TIMEOUT_S")
|
||||
timeout_s_default = 240.0
|
||||
timeout_s = float(timeout_s_raw) if timeout_s_raw else timeout_s_default
|
||||
timeout_s = min(timeout_s, 240.0)
|
||||
|
||||
wait_every_raw = os.getenv("ATROPOS_DEBUG_AGENT_WAIT_EVERY_S")
|
||||
wait_every_s = float(wait_every_raw) if wait_every_raw else None
|
||||
|
||||
async def _await_call() -> Any:
|
||||
if not wait_every_s or wait_every_s <= 0:
|
||||
return await managed.chat_completion(**chat_kwargs)
|
||||
|
||||
# Heartbeat mode: wait in chunks without cancelling the underlying request.
|
||||
# NOTE: do NOT use `asyncio.wait_for(task, timeout=...)` here, because a timeout
|
||||
# will cancel the task and surface as `CancelledError` on the next loop.
|
||||
task = asyncio.create_task(managed.chat_completion(**chat_kwargs))
|
||||
t0 = time.perf_counter()
|
||||
try:
|
||||
while True:
|
||||
done, _pending = await asyncio.wait({task}, timeout=wait_every_s)
|
||||
if task in done:
|
||||
return task.result()
|
||||
|
||||
waited = time.perf_counter() - t0
|
||||
print(
|
||||
f"[AtroposAgent] step={step_num} still waiting for chat_completion... ({waited:.1f}s)",
|
||||
flush=True,
|
||||
)
|
||||
except asyncio.CancelledError:
|
||||
task.cancel()
|
||||
raise
|
||||
|
||||
try:
|
||||
return await asyncio.wait_for(_await_call(), timeout=timeout_s)
|
||||
except asyncio.TimeoutError as e:
|
||||
print("\n=== ATROPOS_DEBUG_AGENT_CHAT_TIMEOUT ===", flush=True)
|
||||
print({"step": step_num, "timeout_s": timeout_s}, flush=True)
|
||||
raise RuntimeError(f"chat_completion timed out after {timeout_s:.1f}s") from e
|
||||
except asyncio.CancelledError:
|
||||
# Treat cancellation as a hard failure rather than crashing the whole env run.
|
||||
# (Atropos/BaseEnv may cancel tasks during shutdown or retries.)
|
||||
raise RuntimeError("chat_completion cancelled") from None
|
||||
except Exception as e:
|
||||
detail: Dict[str, Any] = {
|
||||
"step": step_num,
|
||||
"exc_type": type(e).__name__,
|
||||
"exc_str": str(e),
|
||||
}
|
||||
if isinstance(e, httpx.HTTPStatusError):
|
||||
try:
|
||||
detail["status_code"] = e.response.status_code
|
||||
detail["response_text"] = e.response.text[:20_000]
|
||||
except Exception:
|
||||
pass
|
||||
elif isinstance(e, httpx.RequestError):
|
||||
detail["request"] = repr(getattr(e, "request", None))
|
||||
|
||||
print("\n=== ATROPOS_DEBUG_AGENT_CHAT_FAILURE ===", flush=True)
|
||||
print(detail, flush=True)
|
||||
raise
|
||||
|
||||
async def run(
|
||||
self,
|
||||
task: str,
|
||||
initial_messages: Optional[List[Dict[str, str]]] = None,
|
||||
) -> AgentResult:
|
||||
"""
|
||||
Run the agent on a task using ManagedServer for token tracking.
|
||||
|
||||
Args:
|
||||
task: The task/prompt for the agent
|
||||
initial_messages: Optional additional context messages
|
||||
|
||||
Returns:
|
||||
AgentResult with the trajectory, final response, and token data
|
||||
"""
|
||||
messages = [
|
||||
{"role": "system", "content": self._build_system_prompt()},
|
||||
]
|
||||
|
||||
if initial_messages:
|
||||
messages.extend(initial_messages)
|
||||
|
||||
messages.append({"role": "user", "content": task})
|
||||
|
||||
steps = []
|
||||
final_response = ""
|
||||
final_node = None
|
||||
final_prompt_messages: Optional[List[Dict[str, str]]] = None
|
||||
last_node = None
|
||||
last_prompt_messages: Optional[List[Dict[str, str]]] = None
|
||||
last_response_text: str = ""
|
||||
|
||||
# Use ManagedServer for automatic token tracking
|
||||
async with self._managed() as managed:
|
||||
for step_num in range(self.config.max_steps):
|
||||
# ReACT loop iteration here, just call -> tools -> observe until done (no tools called)
|
||||
try:
|
||||
# Keep a copy of the prompt messages used for this completion.
|
||||
# Useful for reconstructing tokens/masks when state tracking is unavailable.
|
||||
prompt_messages = list(messages)
|
||||
chat_kwargs: Dict[str, Any] = {"messages": messages, "n": 1}
|
||||
if self.config.max_tokens is not None:
|
||||
chat_kwargs["max_tokens"] = self.config.max_tokens
|
||||
if self.config.temperature is not None:
|
||||
chat_kwargs["temperature"] = self.config.temperature
|
||||
|
||||
t_req = time.perf_counter()
|
||||
print(
|
||||
f"[AtroposAgent] step={step_num+1} chat_completion start "
|
||||
f"(messages={len(messages)}, max_tokens={self.config.max_tokens}, temp={self.config.temperature})",
|
||||
flush=True,
|
||||
)
|
||||
self._debug_dump_request(step_num=step_num + 1, chat_kwargs=chat_kwargs)
|
||||
response = await self._chat_completion_with_debug(
|
||||
managed=managed, step_num=step_num + 1, chat_kwargs=chat_kwargs
|
||||
)
|
||||
self._debug_dump_response(step_num=step_num + 1, response=response)
|
||||
response_meta = self._extract_response_metadata(response)
|
||||
print(
|
||||
f"[AtroposAgent] step={step_num+1} chat_completion done in {time.perf_counter() - t_req:.2f}s",
|
||||
flush=True,
|
||||
)
|
||||
|
||||
current_node = None
|
||||
if hasattr(managed, "get_state"):
|
||||
state = managed.get_state()
|
||||
nodes = state.get("nodes", [])
|
||||
current_node = nodes[-1] if nodes else None
|
||||
|
||||
except Exception as e:
|
||||
return AgentResult(
|
||||
success=False,
|
||||
final_response="",
|
||||
steps=steps,
|
||||
error=f"Generation error: {str(e)}",
|
||||
)
|
||||
|
||||
msg = response.choices[0].message
|
||||
# Some OpenAI-compatible servers populate `message.reasoning` and leave `content=""`.
|
||||
response_text = (msg.content or "") or (getattr(msg, "reasoning", None) or "")
|
||||
tool_calls = ToolCall.parse_from_text(response_text)
|
||||
last_node = current_node
|
||||
last_prompt_messages = prompt_messages
|
||||
last_response_text = response_text
|
||||
|
||||
step_sequence_data = SequenceData.from_sequence_node(current_node) if current_node else None
|
||||
if step_sequence_data is None:
|
||||
if response_meta:
|
||||
# We still want metadata for debugging even if token/logprob state tracking is unavailable.
|
||||
step_sequence_data = SequenceData(
|
||||
full_text=response_text,
|
||||
tokens=[],
|
||||
masked_tokens=[],
|
||||
logprobs=[],
|
||||
metadata=response_meta,
|
||||
)
|
||||
else:
|
||||
merged = dict(response_meta)
|
||||
node_meta = step_sequence_data.metadata
|
||||
if isinstance(node_meta, dict):
|
||||
merged.update(node_meta)
|
||||
step_sequence_data.metadata = merged or step_sequence_data.metadata
|
||||
|
||||
step = AgentStep(
|
||||
step_number=step_num + 1,
|
||||
assistant_message=response_text,
|
||||
tool_calls=tool_calls,
|
||||
sequence_data=step_sequence_data,
|
||||
)
|
||||
|
||||
if not tool_calls:
|
||||
steps.append(step)
|
||||
final_response = response_text
|
||||
final_node = current_node
|
||||
final_prompt_messages = prompt_messages
|
||||
break
|
||||
|
||||
messages.append({"role": "assistant", "content": response_text})
|
||||
|
||||
tool_responses = []
|
||||
for call in tool_calls:
|
||||
result = await self.execute_tool(call)
|
||||
step.tool_results.append(result)
|
||||
tool_responses.append(result.to_xml())
|
||||
if self.config.tool_delay_s > 0:
|
||||
await asyncio.sleep(self.config.tool_delay_s)
|
||||
|
||||
steps.append(step)
|
||||
|
||||
responses_text = "\n".join(tool_responses)
|
||||
# Tool observations are represented as user content with Hermes-style tags.
|
||||
# This is compatible with most OpenAI-compatible chat APIs and ensures
|
||||
# tokenizers/chat templates include tool outputs during training.
|
||||
messages.append({"role": "user", "content": responses_text})
|
||||
|
||||
else:
|
||||
# Reached max steps without completing
|
||||
# Return a failure result but include the last observed completion so callers can
|
||||
# record the trajectory (score=0) without triggering retries.
|
||||
final_response = last_response_text or final_response
|
||||
final_node = last_node
|
||||
final_prompt_messages = last_prompt_messages
|
||||
trajectory_data = None
|
||||
if final_node:
|
||||
trajectory_data = SequenceData.from_sequence_node(final_node)
|
||||
elif final_prompt_messages is not None and self.tokenizer is not None:
|
||||
if hasattr(self.tokenizer, "apply_chat_template"):
|
||||
prompt_text = self.tokenizer.apply_chat_template(
|
||||
final_prompt_messages, tokenize=False, add_generation_prompt=True
|
||||
)
|
||||
prompt_tokens = self.tokenizer.encode(prompt_text, add_special_tokens=False)
|
||||
else:
|
||||
prompt_text = "\n".join([f"{m['role']}: {m['content']}" for m in final_prompt_messages])
|
||||
prompt_tokens = self.tokenizer.encode(prompt_text, add_special_tokens=True)
|
||||
output_tokens = self.tokenizer.encode(final_response, add_special_tokens=False)
|
||||
tokens = prompt_tokens + output_tokens
|
||||
masked_tokens = ([-100] * len(prompt_tokens)) + output_tokens
|
||||
logprobs = ([1.0] * len(prompt_tokens)) + ([0.0] * len(output_tokens))
|
||||
trajectory_data = SequenceData(
|
||||
full_text=f"{prompt_text}{final_response}",
|
||||
tokens=tokens,
|
||||
masked_tokens=masked_tokens,
|
||||
logprobs=logprobs,
|
||||
)
|
||||
# Preserve response metadata (if any) even on failure trajectories.
|
||||
try:
|
||||
if trajectory_data is not None and steps:
|
||||
last_step = steps[-1]
|
||||
if last_step.sequence_data and isinstance(last_step.sequence_data.metadata, dict):
|
||||
trajectory_data.metadata = dict(last_step.sequence_data.metadata)
|
||||
except Exception:
|
||||
pass
|
||||
return AgentResult(
|
||||
success=False,
|
||||
final_response=final_response,
|
||||
steps=steps,
|
||||
error=f"Reached maximum steps ({self.config.max_steps})",
|
||||
trajectory_data=trajectory_data,
|
||||
)
|
||||
|
||||
# Build result with trajectory data
|
||||
trajectory_data = None
|
||||
if final_node:
|
||||
trajectory_data = SequenceData.from_sequence_node(final_node)
|
||||
elif final_prompt_messages is not None and self.tokenizer is not None:
|
||||
if hasattr(self.tokenizer, "apply_chat_template"):
|
||||
prompt_text = self.tokenizer.apply_chat_template(
|
||||
final_prompt_messages, tokenize=False, add_generation_prompt=True
|
||||
)
|
||||
prompt_tokens = self.tokenizer.encode(prompt_text, add_special_tokens=False)
|
||||
else:
|
||||
prompt_text = "\n".join([f"{m['role']}: {m['content']}" for m in final_prompt_messages])
|
||||
prompt_tokens = self.tokenizer.encode(prompt_text, add_special_tokens=True)
|
||||
output_tokens = self.tokenizer.encode(final_response, add_special_tokens=False)
|
||||
tokens = prompt_tokens + output_tokens
|
||||
masked_tokens = ([-100] * len(prompt_tokens)) + output_tokens
|
||||
logprobs = ([1.0] * len(prompt_tokens)) + ([0.0] * len(output_tokens))
|
||||
trajectory_data = SequenceData(
|
||||
full_text=f"{prompt_text}{final_response}",
|
||||
tokens=tokens,
|
||||
masked_tokens=masked_tokens,
|
||||
logprobs=logprobs,
|
||||
)
|
||||
|
||||
# Ensure trajectory_data carries the most recent metadata we observed (if any).
|
||||
try:
|
||||
if trajectory_data is not None and steps:
|
||||
last_step = steps[-1]
|
||||
if last_step.sequence_data and isinstance(last_step.sequence_data.metadata, dict):
|
||||
trajectory_data.metadata = dict(last_step.sequence_data.metadata)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
return AgentResult(
|
||||
success=True,
|
||||
final_response=final_response,
|
||||
steps=steps,
|
||||
trajectory_data=trajectory_data,
|
||||
)
|
||||
|
||||
async def run_single_turn(
|
||||
self,
|
||||
messages: List[Dict[str, str]],
|
||||
execute_tools: bool = True,
|
||||
) -> tuple[str, List[ToolResult], Optional[SequenceData]]:
|
||||
"""
|
||||
Run a single turn of the agent (one LLM call + tool execution).
|
||||
|
||||
This is useful for integration with BaseEnv where you want more
|
||||
control over the loop.
|
||||
|
||||
Args:
|
||||
messages: The conversation history
|
||||
execute_tools: Whether to execute parsed tool calls
|
||||
|
||||
Returns:
|
||||
Tuple of (response_text, tool_results, sequence_data)
|
||||
"""
|
||||
async with self._managed() as managed:
|
||||
chat_kwargs: Dict[str, Any] = {"messages": messages, "n": 1}
|
||||
if self.config.max_tokens is not None:
|
||||
chat_kwargs["max_tokens"] = self.config.max_tokens
|
||||
if self.config.temperature is not None:
|
||||
chat_kwargs["temperature"] = self.config.temperature
|
||||
|
||||
self._debug_dump_request(step_num=1, chat_kwargs=chat_kwargs)
|
||||
response = await self._chat_completion_with_debug(managed=managed, step_num=1, chat_kwargs=chat_kwargs)
|
||||
self._debug_dump_response(step_num=1, response=response)
|
||||
|
||||
current_node = None
|
||||
if hasattr(managed, "get_state"):
|
||||
state = managed.get_state()
|
||||
nodes = state.get("nodes", [])
|
||||
current_node = nodes[-1] if nodes else None
|
||||
|
||||
msg = response.choices[0].message
|
||||
response_text = (msg.content or "") or (getattr(msg, "reasoning", None) or "")
|
||||
tool_results = []
|
||||
|
||||
if execute_tools:
|
||||
tool_calls = ToolCall.parse_from_text(response_text)
|
||||
for call in tool_calls:
|
||||
result = await self.execute_tool(call)
|
||||
tool_results.append(result)
|
||||
|
||||
sequence_data = SequenceData.from_sequence_node(current_node) if current_node else None
|
||||
|
||||
return response_text, tool_results, sequence_data
|
||||
|
||||
|
||||
class _DirectChatCompletionClient:
|
||||
"""
|
||||
Minimal stand-in for ManagedServer that calls the OpenAI-compatible endpoint directly.
|
||||
|
||||
This is for isolating issues where `ManagedServer.chat_completion()` hangs or misbehaves.
|
||||
It intentionally does NOT do token/logprob tracking.
|
||||
"""
|
||||
|
||||
def __init__(self, server: Any):
|
||||
self._server = server
|
||||
|
||||
def _server_config(self) -> tuple[str, str, str]:
|
||||
# ServerManager case: first configured server.
|
||||
servers = getattr(self._server, "servers", None)
|
||||
if isinstance(servers, list) and servers:
|
||||
s0 = servers[0]
|
||||
cfg = getattr(s0, "config", None)
|
||||
base_url = getattr(cfg, "base_url", None) or getattr(s0, "base_url", None)
|
||||
api_key = getattr(cfg, "api_key", None) or getattr(s0, "api_key", None)
|
||||
model = getattr(cfg, "model_name", None) or getattr(s0, "model_name", None)
|
||||
if isinstance(base_url, str) and isinstance(api_key, str) and isinstance(model, str):
|
||||
return base_url.rstrip("/"), api_key, model
|
||||
|
||||
# APIServer-like fallback.
|
||||
base_url = getattr(self._server, "base_url", None)
|
||||
api_key = getattr(self._server, "api_key", None)
|
||||
model = getattr(self._server, "model_name", None) or getattr(self._server, "model", None)
|
||||
if isinstance(base_url, str) and isinstance(api_key, str) and isinstance(model, str):
|
||||
return base_url.rstrip("/"), api_key, model
|
||||
|
||||
raise RuntimeError("Unable to resolve server base_url/api_key/model for direct chat completion")
|
||||
|
||||
async def chat_completion(self, *, messages: List[Dict[str, str]], n: int = 1, **kwargs: Any) -> Any:
|
||||
base_url, api_key, model = self._server_config()
|
||||
url = f"{base_url}/chat/completions"
|
||||
|
||||
payload: Dict[str, Any] = {
|
||||
"model": model,
|
||||
"messages": messages,
|
||||
"n": n,
|
||||
}
|
||||
# Pass through common generation kwargs.
|
||||
for k in ("max_tokens", "temperature", "top_p", "presence_penalty", "frequency_penalty", "stop"):
|
||||
if k in kwargs and kwargs[k] is not None:
|
||||
payload[k] = kwargs[k]
|
||||
|
||||
timeout_s = float(os.getenv("ATROPOS_DIRECT_REQUEST_TIMEOUT_S") or "120")
|
||||
print(f"[AtroposAgent] DIRECT chat_completion POST {url} (timeout={timeout_s}s)", flush=True)
|
||||
async with httpx.AsyncClient(timeout=timeout_s) as client:
|
||||
resp = await client.post(
|
||||
url,
|
||||
headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
|
||||
json=payload,
|
||||
)
|
||||
resp.raise_for_status()
|
||||
data = resp.json()
|
||||
|
||||
# Return a very small object compatible with the code paths that read
|
||||
# `response.choices[0].message.content`.
|
||||
class _Msg:
|
||||
def __init__(self, d: Dict[str, Any]):
|
||||
self.content = d.get("content")
|
||||
self.reasoning = d.get("reasoning")
|
||||
|
||||
class _Choice:
|
||||
def __init__(self, d: Dict[str, Any]):
|
||||
self.message = _Msg(d.get("message") or {})
|
||||
|
||||
class _Resp:
|
||||
def __init__(self, d: Dict[str, Any]):
|
||||
self._d = d
|
||||
self.choices = [_Choice(c) for c in (d.get("choices") or [])]
|
||||
|
||||
def model_dump(self) -> Dict[str, Any]:
|
||||
return self._d
|
||||
|
||||
return _Resp(data)
|
||||
@@ -1,6 +0,0 @@
|
||||
"""
|
||||
FastAPI services for atropos-agent.
|
||||
|
||||
- tool_executor_server: queued/batched sandbox tool execution (Phase 4)
|
||||
"""
|
||||
|
||||
@@ -1,254 +0,0 @@
|
||||
"""
|
||||
Tool Executor API (Phase 4)
|
||||
|
||||
This service provides a queued, batched execution layer on top of a ToolBackend.
|
||||
It mirrors the stateful FastAPI + app.state pattern used in:
|
||||
atropos/atroposlib/api/server.py
|
||||
|
||||
Run (dev):
|
||||
uv run uvicorn atropos_agent.api.tool_executor_server:app --host 0.0.0.0 --port 9001
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
from typing import Any, Dict, Optional
|
||||
from pathlib import Path
|
||||
|
||||
from fastapi import FastAPI, Header, HTTPException, status
|
||||
from pydantic import BaseModel, Field
|
||||
|
||||
from ..backends.nomad_backend import NomadBackendConfig, NomadToolBackend
|
||||
from ..tools import ToolRegistry, build_tool_registry
|
||||
from ..tools.base import (
|
||||
ArtifactArchiveRequestPayload,
|
||||
ArtifactArchiveResponsePayload,
|
||||
ArtifactListRequestPayload,
|
||||
ArtifactListResponsePayload,
|
||||
ArtifactReadRequestPayload,
|
||||
ArtifactReadResponsePayload,
|
||||
ToolExecutorExecuteRequest,
|
||||
ToolExecutorReleaseRequest,
|
||||
ToolResultPayload,
|
||||
)
|
||||
from ..tools.tool_executor import ToolExecutor, ToolExecutorConfig
|
||||
|
||||
|
||||
class ToolExecutorServerConfig(BaseModel):
|
||||
nomad_address: str = Field(default="http://localhost:4646")
|
||||
job_id: str = Field(default="atropos-sandbox-tool-executor")
|
||||
image: str = Field(default="atropos-sandbox:local")
|
||||
slots_per_container: int = Field(default=10)
|
||||
min_containers: int = Field(default=1)
|
||||
max_containers: int = Field(default=10)
|
||||
privileged: bool = Field(default=False)
|
||||
acquire_timeout_s: float = Field(default=30.0)
|
||||
|
||||
batch_window_ms: int = Field(default=20)
|
||||
max_batch_size: int = Field(default=200)
|
||||
allow_network: bool = Field(default=True)
|
||||
|
||||
tool_server_url: Optional[str] = Field(default=None)
|
||||
tool_server_token: Optional[str] = Field(default=None)
|
||||
|
||||
token: Optional[str] = Field(default=None, description="Bearer token required for requests (optional in dev).")
|
||||
|
||||
purge_job_on_shutdown: bool = Field(default=True)
|
||||
|
||||
@classmethod
|
||||
def from_env(cls) -> "ToolExecutorServerConfig":
|
||||
# In dev, prefer loading secrets/config from the repo-local `.env` (not committed).
|
||||
try:
|
||||
from dotenv import load_dotenv # type: ignore
|
||||
except Exception: # pragma: no cover
|
||||
load_dotenv = None # type: ignore[assignment]
|
||||
if load_dotenv is not None:
|
||||
env_path = Path(__file__).resolve().parents[2] / ".env"
|
||||
if env_path.exists():
|
||||
load_dotenv(dotenv_path=env_path)
|
||||
|
||||
def _get_bool(name: str, default: bool) -> bool:
|
||||
raw = os.getenv(name)
|
||||
if raw is None:
|
||||
return default
|
||||
return raw.strip().lower() in {"1", "true", "yes", "y", "on"}
|
||||
|
||||
return cls(
|
||||
nomad_address=os.getenv("TOOL_EXECUTOR_NOMAD_ADDRESS", "http://localhost:4646"),
|
||||
job_id=os.getenv("TOOL_EXECUTOR_JOB_ID", "atropos-sandbox-tool-executor"),
|
||||
image=os.getenv("TOOL_EXECUTOR_IMAGE", "atropos-sandbox:local"),
|
||||
slots_per_container=int(os.getenv("TOOL_EXECUTOR_SLOTS", "10")),
|
||||
min_containers=int(os.getenv("TOOL_EXECUTOR_MIN_CONTAINERS", "1")),
|
||||
max_containers=int(os.getenv("TOOL_EXECUTOR_MAX_CONTAINERS", "10")),
|
||||
privileged=_get_bool("TOOL_EXECUTOR_PRIVILEGED", False),
|
||||
acquire_timeout_s=float(os.getenv("TOOL_EXECUTOR_ACQUIRE_TIMEOUT_S", "30.0")),
|
||||
batch_window_ms=int(os.getenv("TOOL_EXECUTOR_BATCH_WINDOW_MS", "20")),
|
||||
max_batch_size=int(os.getenv("TOOL_EXECUTOR_MAX_BATCH_SIZE", "200")),
|
||||
allow_network=_get_bool("TOOL_EXECUTOR_ALLOW_NETWORK", True),
|
||||
tool_server_url=os.getenv("TOOL_EXECUTOR_TOOL_SERVER_URL") or None,
|
||||
tool_server_token=os.getenv("TOOL_EXECUTOR_TOOL_SERVER_TOKEN") or None,
|
||||
token=os.getenv("TOOL_EXECUTOR_TOKEN") or None,
|
||||
purge_job_on_shutdown=_get_bool("TOOL_EXECUTOR_PURGE_JOB_ON_SHUTDOWN", True),
|
||||
)
|
||||
|
||||
|
||||
app = FastAPI(title="Atropos-Agent Tool Executor")
|
||||
|
||||
|
||||
@app.get("/")
|
||||
async def root() -> Dict[str, str]:
|
||||
return {"message": "Atropos-Agent Tool Executor"}
|
||||
|
||||
|
||||
def _check_auth(cfg: ToolExecutorServerConfig, authorization: Optional[str]) -> None:
|
||||
if not cfg.token:
|
||||
return
|
||||
if not authorization:
|
||||
raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Missing Authorization header")
|
||||
if not authorization.lower().startswith("bearer "):
|
||||
raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Invalid Authorization header")
|
||||
token = authorization.split(" ", 1)[1].strip()
|
||||
if token != cfg.token:
|
||||
raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Invalid token")
|
||||
|
||||
|
||||
@app.on_event("startup")
|
||||
async def _startup() -> None:
|
||||
cfg = ToolExecutorServerConfig.from_env()
|
||||
|
||||
# Default to Atropos "full" tool surface: sandbox + external (if tool_server_url provided).
|
||||
tools: ToolRegistry = build_tool_registry(
|
||||
enabled_toolsets=["full"],
|
||||
disabled_toolsets=None,
|
||||
tool_server_url=cfg.tool_server_url,
|
||||
)
|
||||
|
||||
backend = NomadToolBackend(
|
||||
NomadBackendConfig(
|
||||
nomad_address=cfg.nomad_address,
|
||||
sandbox_job_id=cfg.job_id,
|
||||
sandbox_image=cfg.image,
|
||||
slots_per_container=cfg.slots_per_container,
|
||||
min_containers=cfg.min_containers,
|
||||
max_containers=cfg.max_containers,
|
||||
privileged=cfg.privileged,
|
||||
acquire_timeout_s=cfg.acquire_timeout_s,
|
||||
purge_job_on_start=False,
|
||||
)
|
||||
)
|
||||
await backend.start()
|
||||
|
||||
executor = ToolExecutor(
|
||||
backend=backend,
|
||||
tools=tools,
|
||||
config=ToolExecutorConfig(
|
||||
batch_window_ms=cfg.batch_window_ms,
|
||||
max_batch_size=cfg.max_batch_size,
|
||||
allow_network=cfg.allow_network,
|
||||
tool_server_url=cfg.tool_server_url,
|
||||
tool_server_token=cfg.tool_server_token,
|
||||
),
|
||||
)
|
||||
await executor.start()
|
||||
|
||||
app.state.cfg = cfg
|
||||
app.state.backend = backend
|
||||
app.state.executor = executor
|
||||
|
||||
|
||||
@app.on_event("shutdown")
|
||||
async def _shutdown() -> None:
|
||||
executor: Optional[ToolExecutor] = getattr(app.state, "executor", None)
|
||||
backend: Optional[NomadToolBackend] = getattr(app.state, "backend", None)
|
||||
cfg: Optional[ToolExecutorServerConfig] = getattr(app.state, "cfg", None)
|
||||
|
||||
if executor is not None:
|
||||
await executor.close()
|
||||
|
||||
if backend is not None:
|
||||
await backend.stop(purge=bool(cfg.purge_job_on_shutdown) if cfg else False)
|
||||
|
||||
|
||||
@app.get("/health")
|
||||
async def health() -> Dict[str, Any]:
|
||||
return {"status": "ok"}
|
||||
|
||||
|
||||
@app.get("/status")
|
||||
async def status_endpoint() -> Dict[str, Any]:
|
||||
executor: ToolExecutor = app.state.executor
|
||||
backend: NomadToolBackend = app.state.backend
|
||||
|
||||
return {
|
||||
"queue_size": executor.queue_size(),
|
||||
"total_requests": executor.total_requests,
|
||||
"total_errors": executor.total_errors,
|
||||
"pool": backend.get_stats(),
|
||||
}
|
||||
|
||||
|
||||
@app.post("/execute", response_model=ToolResultPayload)
|
||||
async def execute_tool(
|
||||
req: ToolExecutorExecuteRequest,
|
||||
authorization: Optional[str] = Header(default=None),
|
||||
status_code: int = status.HTTP_200_OK, # noqa: B008
|
||||
) -> ToolResultPayload:
|
||||
cfg: ToolExecutorServerConfig = app.state.cfg
|
||||
_check_auth(cfg, authorization)
|
||||
|
||||
executor: ToolExecutor = app.state.executor
|
||||
result = await executor.execute(
|
||||
trajectory_id=req.trajectory_id,
|
||||
call=req.tool.to_tool_call(),
|
||||
timeout_s=req.timeout_s,
|
||||
)
|
||||
return ToolResultPayload.from_tool_result(result)
|
||||
|
||||
|
||||
@app.post("/release")
|
||||
async def release_trajectory(
|
||||
req: ToolExecutorReleaseRequest,
|
||||
authorization: Optional[str] = Header(default=None),
|
||||
) -> Dict[str, Any]:
|
||||
cfg: ToolExecutorServerConfig = app.state.cfg
|
||||
_check_auth(cfg, authorization)
|
||||
|
||||
executor: ToolExecutor = app.state.executor
|
||||
await executor.release_trajectory(req.trajectory_id, reset_workspace=req.reset_workspace)
|
||||
return {"status": "ok"}
|
||||
|
||||
|
||||
@app.post("/artifacts/read", response_model=ArtifactReadResponsePayload)
|
||||
async def artifacts_read(
|
||||
req: ArtifactReadRequestPayload,
|
||||
authorization: Optional[str] = Header(default=None),
|
||||
) -> ArtifactReadResponsePayload:
|
||||
cfg: ToolExecutorServerConfig = app.state.cfg
|
||||
_check_auth(cfg, authorization)
|
||||
|
||||
executor: ToolExecutor = app.state.executor
|
||||
return await executor.read_artifact(req)
|
||||
|
||||
|
||||
@app.post("/artifacts/list", response_model=ArtifactListResponsePayload)
|
||||
async def artifacts_list(
|
||||
req: ArtifactListRequestPayload,
|
||||
authorization: Optional[str] = Header(default=None),
|
||||
) -> ArtifactListResponsePayload:
|
||||
cfg: ToolExecutorServerConfig = app.state.cfg
|
||||
_check_auth(cfg, authorization)
|
||||
|
||||
executor: ToolExecutor = app.state.executor
|
||||
return await executor.list_artifacts(req)
|
||||
|
||||
|
||||
@app.post("/artifacts/archive", response_model=ArtifactArchiveResponsePayload)
|
||||
async def artifacts_archive(
|
||||
req: ArtifactArchiveRequestPayload,
|
||||
authorization: Optional[str] = Header(default=None),
|
||||
) -> ArtifactArchiveResponsePayload:
|
||||
cfg: ToolExecutorServerConfig = app.state.cfg
|
||||
_check_auth(cfg, authorization)
|
||||
|
||||
executor: ToolExecutor = app.state.executor
|
||||
return await executor.archive_artifacts(req)
|
||||
@@ -1,140 +0,0 @@
|
||||
"""
|
||||
External ToolServer (Phase 4.5+).
|
||||
|
||||
This server executes tools that must NOT run inside the sandbox, typically
|
||||
because they require credentials or access to external services.
|
||||
|
||||
Run (dev):
|
||||
uv run uvicorn atropos_agent.api.tool_server:app --host 0.0.0.0 --port 9002
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import os
|
||||
import inspect
|
||||
from typing import Any, Dict, List, Optional
|
||||
from pathlib import Path
|
||||
|
||||
from fastapi import FastAPI, Header, HTTPException, status
|
||||
from pydantic import BaseModel, Field
|
||||
|
||||
from ..tools import ToolRegistry, build_tool_registry
|
||||
from ..tools.base import ToolResultPayload, ToolServerExecuteRequest
|
||||
|
||||
|
||||
class ToolServerConfig(BaseModel):
|
||||
token: Optional[str] = Field(
|
||||
default=None,
|
||||
description="Bearer token required for requests (optional in dev).",
|
||||
)
|
||||
max_concurrency: int = Field(default=16, ge=1, description="Max concurrent tool executions.")
|
||||
|
||||
@classmethod
|
||||
def from_env(cls) -> "ToolServerConfig":
|
||||
# In dev, prefer loading secrets from the repo-local `.env` (not committed).
|
||||
try:
|
||||
from dotenv import load_dotenv # type: ignore
|
||||
except Exception: # pragma: no cover
|
||||
load_dotenv = None # type: ignore[assignment]
|
||||
if load_dotenv is not None:
|
||||
env_path = Path(__file__).resolve().parents[2] / ".env"
|
||||
if env_path.exists():
|
||||
load_dotenv(dotenv_path=env_path)
|
||||
|
||||
token = os.getenv("TOOL_SERVER_TOKEN") or None
|
||||
max_concurrency = int(os.getenv("TOOL_SERVER_MAX_CONCURRENCY", "16"))
|
||||
return cls(token=token, max_concurrency=max_concurrency)
|
||||
|
||||
|
||||
app = FastAPI(title="Atropos-Agent Tool Server")
|
||||
|
||||
|
||||
@app.get("/")
|
||||
async def root() -> Dict[str, str]:
|
||||
return {"message": "Atropos-Agent Tool Server"}
|
||||
|
||||
|
||||
@app.on_event("startup")
|
||||
async def _startup() -> None:
|
||||
cfg = ToolServerConfig.from_env()
|
||||
|
||||
# External-only registry. It will only include tools that are enabled by toolsets and
|
||||
# whose Hermes requirements/keys are satisfied in this process.
|
||||
tools: ToolRegistry = build_tool_registry(
|
||||
enabled_toolsets=["all"],
|
||||
disabled_toolsets=["terminal", "sandbox", "filesystem", "terminal_stateful", "default"],
|
||||
tool_server_url="enabled",
|
||||
)
|
||||
|
||||
app.state.cfg = cfg
|
||||
app.state.tools = tools
|
||||
app.state.semaphore = asyncio.Semaphore(cfg.max_concurrency)
|
||||
|
||||
|
||||
@app.get("/health")
|
||||
async def health() -> Dict[str, Any]:
|
||||
return {"status": "ok"}
|
||||
|
||||
|
||||
@app.get("/tools")
|
||||
async def list_tools() -> Dict[str, Any]:
|
||||
tools: ToolRegistry = app.state.tools
|
||||
return {"tools": [s.to_dict() for s in tools.get_schemas()]}
|
||||
|
||||
|
||||
def _check_auth(cfg: ToolServerConfig, authorization: Optional[str]) -> None:
|
||||
if not cfg.token:
|
||||
return
|
||||
if not authorization:
|
||||
raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Missing Authorization header")
|
||||
if not authorization.lower().startswith("bearer "):
|
||||
raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Invalid Authorization header")
|
||||
token = authorization.split(" ", 1)[1].strip()
|
||||
if token != cfg.token:
|
||||
raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Invalid token")
|
||||
|
||||
|
||||
@app.post("/execute", response_model=ToolResultPayload)
|
||||
async def execute_tool(
|
||||
req: ToolServerExecuteRequest,
|
||||
authorization: Optional[str] = Header(default=None),
|
||||
) -> ToolResultPayload:
|
||||
cfg: ToolServerConfig = app.state.cfg
|
||||
_check_auth(cfg, authorization)
|
||||
|
||||
tools: ToolRegistry = app.state.tools
|
||||
sem: asyncio.Semaphore = app.state.semaphore
|
||||
|
||||
tool = tools.get(req.tool.name)
|
||||
if tool is None:
|
||||
return ToolResultPayload(
|
||||
success=False,
|
||||
error=f"Unknown tool: {req.tool.name}",
|
||||
uniq_id=req.tool.uniq_id,
|
||||
)
|
||||
|
||||
async with sem:
|
||||
try:
|
||||
kwargs = dict(req.tool.arguments)
|
||||
sig = inspect.signature(tool.execute).parameters
|
||||
# Some tools can benefit from extra context.
|
||||
if req.trajectory_id and "trajectory_id" in sig:
|
||||
kwargs["trajectory_id"] = req.trajectory_id
|
||||
if req.slot_id and "slot_id" in sig:
|
||||
kwargs["slot_id"] = req.slot_id
|
||||
if req.container_addr and "container_addr" in sig:
|
||||
kwargs["container_addr"] = req.container_addr
|
||||
if "task_id" in sig:
|
||||
kwargs["task_id"] = req.trajectory_id
|
||||
result = await tool.execute(**kwargs)
|
||||
except Exception as e:
|
||||
return ToolResultPayload(
|
||||
success=False,
|
||||
error=f"Tool execution error: {e}",
|
||||
uniq_id=req.tool.uniq_id,
|
||||
)
|
||||
|
||||
if result.uniq_id is None:
|
||||
result.uniq_id = req.tool.uniq_id
|
||||
return ToolResultPayload.from_tool_result(result)
|
||||
@@ -1,27 +0,0 @@
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Any
|
||||
|
||||
from .base import ToolBackend
|
||||
from .modal_backend import ModalSandboxConfig, ModalToolBackend
|
||||
from .nomad_backend import NomadBackendConfig, NomadToolBackend
|
||||
|
||||
|
||||
def create_tool_backend(cfg: Any) -> ToolBackend:
|
||||
mode = str(getattr(cfg, "tool_pool_mode", "nomad")).strip().lower()
|
||||
if mode == "nomad":
|
||||
return NomadToolBackend(NomadBackendConfig.from_agent_env_config(cfg))
|
||||
if mode == "modal":
|
||||
return ModalToolBackend(ModalSandboxConfig.from_agent_env_config(cfg))
|
||||
raise ValueError(f"Unknown tool_pool_mode: {mode}")
|
||||
|
||||
|
||||
__all__ = [
|
||||
"ToolBackend",
|
||||
"create_tool_backend",
|
||||
"NomadBackendConfig",
|
||||
"NomadToolBackend",
|
||||
"ModalSandboxConfig",
|
||||
"ModalToolBackend",
|
||||
]
|
||||
|
||||
@@ -1,89 +0,0 @@
|
||||
"""
|
||||
Backend interfaces for AgentEnv tool execution.
|
||||
|
||||
The goal of this module is to decouple ToolExecutor / AgentEnv from any single
|
||||
execution backend (Nomad/Docker today; Modal later).
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Any, Dict, List, Optional, Protocol, Tuple
|
||||
|
||||
from ..slots.executor import ExecutionResult
|
||||
from ..slots.slot import Slot
|
||||
|
||||
|
||||
class ToolBackend(Protocol):
|
||||
"""
|
||||
Minimal interface required by ToolExecutor.
|
||||
|
||||
Backends provide:
|
||||
- lifecycle (start/stop)
|
||||
- slot acquisition/release (workspace affinity)
|
||||
- batched tool execution across slots
|
||||
- optional artifact helpers (for env verification / demos)
|
||||
"""
|
||||
|
||||
@property
|
||||
def default_timeout_s(self) -> Optional[float]:
|
||||
"""Default sandbox execution timeout in seconds (if any)."""
|
||||
|
||||
async def start(self) -> None:
|
||||
"""Start the backend (provision workers/containers, health checks, etc)."""
|
||||
|
||||
async def stop(self, *, purge: bool = False) -> None:
|
||||
"""Stop the backend and optionally purge remote resources."""
|
||||
|
||||
async def acquire(self, trajectory_id: Optional[str] = None) -> Slot:
|
||||
"""Acquire a slot for a trajectory (workspace affinity)."""
|
||||
|
||||
async def release(self, slot: Slot, *, reset_workspace: bool = False) -> None:
|
||||
"""Release a slot back to the pool."""
|
||||
|
||||
async def execute_batch(
|
||||
self,
|
||||
requests: List[Tuple[Slot, str, Dict[str, Any]]],
|
||||
*,
|
||||
timeout_s: Optional[float] = None,
|
||||
) -> List[ExecutionResult]:
|
||||
"""Execute a batch of sandbox tool calls and return results in order."""
|
||||
|
||||
# ---------------------------------------------------------------------
|
||||
# Optional artifact helpers (supported by the Nomad sandbox-server today)
|
||||
# ---------------------------------------------------------------------
|
||||
|
||||
async def read_artifact(
|
||||
self,
|
||||
slot: Slot,
|
||||
path: str,
|
||||
*,
|
||||
encoding: str = "text",
|
||||
max_bytes: Optional[int] = None,
|
||||
include_sha256: bool = False,
|
||||
timeout_s: Optional[float] = None,
|
||||
) -> Dict[str, Any]:
|
||||
raise NotImplementedError
|
||||
|
||||
async def list_artifacts(
|
||||
self,
|
||||
slot: Slot,
|
||||
path: str = ".",
|
||||
*,
|
||||
recursive: bool = False,
|
||||
max_entries: Optional[int] = None,
|
||||
timeout_s: Optional[float] = None,
|
||||
) -> Dict[str, Any]:
|
||||
raise NotImplementedError
|
||||
|
||||
async def archive_artifacts(
|
||||
self,
|
||||
slot: Slot,
|
||||
path: str = ".",
|
||||
*,
|
||||
archive_format: str = "tar.gz",
|
||||
max_bytes: Optional[int] = None,
|
||||
max_entries: Optional[int] = None,
|
||||
timeout_s: Optional[float] = None,
|
||||
) -> Dict[str, Any]:
|
||||
raise NotImplementedError
|
||||
|
||||
File diff suppressed because it is too large
Load Diff
@@ -1,156 +0,0 @@
|
||||
"""
|
||||
Nomad/Docker tool backend.
|
||||
|
||||
This backend is the current default for AgentEnv: it provisions a Nomad job
|
||||
running `sandbox_server.py` and multiplexes stateless slots inside each container.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from dataclasses import dataclass
|
||||
from typing import Any, Dict, List, Optional, Tuple
|
||||
|
||||
from ..slots import Slot, SlotPool, SlotPoolConfig
|
||||
from ..slots.executor import ExecutionResult
|
||||
from .base import ToolBackend
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class NomadBackendConfig:
|
||||
nomad_address: str
|
||||
sandbox_job_id: str
|
||||
sandbox_image: str
|
||||
slots_per_container: int
|
||||
min_containers: int
|
||||
max_containers: int
|
||||
privileged: bool
|
||||
acquire_timeout_s: float
|
||||
purge_job_on_start: bool
|
||||
# Driver selection: "docker" or "singularity"
|
||||
driver: str = "docker"
|
||||
# Path to .sif file for singularity driver (required if driver="singularity")
|
||||
singularity_image: Optional[str] = None
|
||||
|
||||
@classmethod
|
||||
def from_agent_env_config(cls, cfg: Any) -> "NomadBackendConfig":
|
||||
return cls(
|
||||
nomad_address=str(getattr(cfg, "nomad_address")),
|
||||
sandbox_job_id=str(getattr(cfg, "sandbox_job_id")),
|
||||
sandbox_image=str(getattr(cfg, "sandbox_image")),
|
||||
slots_per_container=int(getattr(cfg, "slots_per_container")),
|
||||
min_containers=int(getattr(cfg, "min_containers")),
|
||||
max_containers=int(getattr(cfg, "max_containers")),
|
||||
privileged=bool(getattr(cfg, "privileged")),
|
||||
acquire_timeout_s=float(getattr(cfg, "acquire_timeout_s")),
|
||||
purge_job_on_start=bool(getattr(cfg, "purge_job_on_start", False)),
|
||||
driver=str(getattr(cfg, "driver", "docker")),
|
||||
singularity_image=getattr(cfg, "singularity_image", None),
|
||||
)
|
||||
|
||||
|
||||
class NomadToolBackend(ToolBackend):
|
||||
def __init__(self, config: NomadBackendConfig):
|
||||
self.config = config
|
||||
self.pool = SlotPool(
|
||||
SlotPoolConfig(
|
||||
nomad_address=config.nomad_address,
|
||||
job_id=config.sandbox_job_id,
|
||||
image=config.sandbox_image,
|
||||
slots_per_container=config.slots_per_container,
|
||||
min_containers=config.min_containers,
|
||||
max_containers=config.max_containers,
|
||||
privileged=config.privileged,
|
||||
acquire_timeout=config.acquire_timeout_s,
|
||||
purge_job_on_start=bool(config.purge_job_on_start),
|
||||
driver=config.driver,
|
||||
singularity_image=config.singularity_image,
|
||||
)
|
||||
)
|
||||
|
||||
@property
|
||||
def default_timeout_s(self) -> Optional[float]:
|
||||
t = getattr(self.pool.executor, "timeout", None)
|
||||
total = getattr(t, "total", None)
|
||||
try:
|
||||
return float(total) if total is not None else None
|
||||
except Exception:
|
||||
return None
|
||||
|
||||
async def start(self) -> None:
|
||||
await self.pool.start()
|
||||
|
||||
async def stop(self, *, purge: bool = False) -> None:
|
||||
await self.pool.stop(purge_job=purge)
|
||||
|
||||
async def acquire(self, trajectory_id: Optional[str] = None) -> Slot:
|
||||
return await self.pool.acquire(trajectory_id)
|
||||
|
||||
async def release(self, slot: Slot, *, reset_workspace: bool = False) -> None:
|
||||
await self.pool.release(slot, reset_workspace=reset_workspace)
|
||||
|
||||
async def execute_batch(
|
||||
self,
|
||||
requests: List[Tuple[Slot, str, Dict[str, Any]]],
|
||||
*,
|
||||
timeout_s: Optional[float] = None,
|
||||
) -> List[ExecutionResult]:
|
||||
return await self.pool.execute_batch(requests, timeout=timeout_s)
|
||||
|
||||
async def read_artifact(
|
||||
self,
|
||||
slot: Slot,
|
||||
path: str,
|
||||
*,
|
||||
encoding: str = "text",
|
||||
max_bytes: Optional[int] = None,
|
||||
include_sha256: bool = False,
|
||||
timeout_s: Optional[float] = None,
|
||||
) -> Dict[str, Any]:
|
||||
return await self.pool.executor.read_artifact(
|
||||
slot,
|
||||
path,
|
||||
encoding=encoding,
|
||||
max_bytes=max_bytes,
|
||||
include_sha256=include_sha256,
|
||||
timeout=timeout_s,
|
||||
)
|
||||
|
||||
async def list_artifacts(
|
||||
self,
|
||||
slot: Slot,
|
||||
path: str = ".",
|
||||
*,
|
||||
recursive: bool = False,
|
||||
max_entries: Optional[int] = None,
|
||||
timeout_s: Optional[float] = None,
|
||||
) -> Dict[str, Any]:
|
||||
return await self.pool.executor.list_artifacts(
|
||||
slot,
|
||||
path,
|
||||
recursive=recursive,
|
||||
max_entries=max_entries,
|
||||
timeout=timeout_s,
|
||||
)
|
||||
|
||||
async def archive_artifacts(
|
||||
self,
|
||||
slot: Slot,
|
||||
path: str = ".",
|
||||
*,
|
||||
archive_format: str = "tar.gz",
|
||||
max_bytes: Optional[int] = None,
|
||||
max_entries: Optional[int] = None,
|
||||
timeout_s: Optional[float] = None,
|
||||
) -> Dict[str, Any]:
|
||||
return await self.pool.executor.archive_artifacts(
|
||||
slot,
|
||||
path,
|
||||
archive_format=archive_format,
|
||||
max_bytes=max_bytes,
|
||||
max_entries=max_entries,
|
||||
timeout=timeout_s,
|
||||
)
|
||||
|
||||
def get_stats(self) -> Dict[str, Any]:
|
||||
return self.pool.get_stats()
|
||||
|
||||
@@ -1,10 +0,0 @@
|
||||
"""
|
||||
Environment implementations for atropos-agent.
|
||||
"""
|
||||
|
||||
from .agent_env import AgentEnv, AgentEnvConfig
|
||||
|
||||
# NOTE: Additional example envs exist as modules (e.g. `test_env`, `swe_smith_oracle_env`),
|
||||
# but are intentionally not imported here to avoid pulling heavy optional deps at import time.
|
||||
|
||||
__all__ = ["AgentEnv", "AgentEnvConfig"]
|
||||
@@ -1,526 +0,0 @@
|
||||
"""
|
||||
AgentEnv - Atropos BaseEnv extension for agent/tool-call workloads.
|
||||
|
||||
AgentEnv is responsible for starting the sandbox tool execution backend and
|
||||
providing helpers for running agent trajectories with queued/batched tool calls.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
import os
|
||||
import asyncio
|
||||
import time
|
||||
import uuid
|
||||
from abc import ABC, abstractmethod
|
||||
from typing import Any, Awaitable, Callable, Dict, Generic, List, Optional, Tuple, TypeVar
|
||||
|
||||
from pydantic import Field
|
||||
|
||||
from atroposlib.envs.base import APIServerConfig, BaseEnv, BaseEnvConfig, Item, ScoredDataGroup, ScoredDataItem
|
||||
from atroposlib.envs.server_handling.server_baseline import AsyncSemWithAdaptiveWeight
|
||||
|
||||
from ..agent import AgentConfig, AgentResult, AtroposAgent
|
||||
from ..backends import ToolBackend, create_tool_backend
|
||||
from ..tools import ToolRegistry, build_tool_registry
|
||||
from ..tools.tool_executor import ToolExecutor, ToolExecutorConfig
|
||||
|
||||
# Main BaseEnv child classes. Child class THESE to get agent+tooling functionality easily.
|
||||
|
||||
class AgentEnvConfig(BaseEnvConfig):
|
||||
tool_pool_mode: str = Field(default="nomad", description="Tool execution backend ('nomad' or 'modal')")
|
||||
|
||||
allow_network: bool = Field(
|
||||
default=True,
|
||||
description="Whether sandbox bash commands may access the network (env policy).",
|
||||
)
|
||||
require_sandbox: bool = Field(
|
||||
default=False,
|
||||
description="Fail closed if bubblewrap sandboxing is unavailable/unusable for stateless sandbox tools.",
|
||||
)
|
||||
require_stateful_sandbox: bool = Field(
|
||||
default=False,
|
||||
description="Fail closed if bubblewrap/PID isolation is unavailable for stateful terminal tools (tmux).",
|
||||
)
|
||||
tool_batch_window_ms: int = Field(default=20, description="ToolExecutor batching window (ms)")
|
||||
tool_max_batch_size: int = Field(default=200, description="ToolExecutor maximum batch size")
|
||||
|
||||
# nomad mode settings. TODO: Add Modal support, split this into own config
|
||||
nomad_address: str = Field(default="http://localhost:4646", description="Nomad API address")
|
||||
sandbox_job_id: str = Field(default="atropos-sandbox-agent-env", description="Nomad job id for sandbox containers")
|
||||
sandbox_image: str = Field(default="atropos-sandbox:local", description="Docker image for sandbox containers")
|
||||
slots_per_container: int = Field(default=10, description="Nomad mode: slots per container")
|
||||
min_containers: int = Field(default=1, description="Nomad mode: minimum containers")
|
||||
max_containers: int = Field(default=10, description="Nomad mode: maximum containers")
|
||||
privileged: bool = Field(default=False, description="Nomad mode: run container privileged")
|
||||
acquire_timeout_s: float = Field(default=30.0, description="Slot acquisition timeout (seconds)")
|
||||
purge_job_on_start: bool = Field(
|
||||
default=False,
|
||||
description=(
|
||||
"Nomad mode: stop/purge the sandbox job on startup. This is helpful in local dev and training runs "
|
||||
"to recover from previous crashes that leave the job in a restart backoff state."
|
||||
),
|
||||
)
|
||||
purge_job_on_shutdown: bool = Field(default=True, description="Nomad mode: stop/purge job on shutdown")
|
||||
|
||||
# Nomad driver selection (docker or singularity)
|
||||
driver: str = Field(
|
||||
default="docker",
|
||||
description="Nomad task driver: 'docker' (default) or 'singularity' (for HPC without sudo Docker)",
|
||||
)
|
||||
singularity_image: Optional[str] = Field(
|
||||
default=None,
|
||||
description="Path to .sif file for Singularity driver (required if driver='singularity')",
|
||||
)
|
||||
|
||||
# modal mode settings (stub; implementation pending)
|
||||
modal_app_name: str = Field(default="atropos-sandbox", description="Modal app name (stub)")
|
||||
modal_function_name: str = Field(default="sandbox_server", description="Modal function/actor name (stub)")
|
||||
modal_volume_name: Optional[str] = Field(default=None, description="Modal Volume name for persistent storage (stub)")
|
||||
modal_volume_mount_path: str = Field(default="/data", description="Modal Volume mount path (stub)")
|
||||
|
||||
# basic agent defaults
|
||||
agent_max_steps: int = Field(default=50, description="Max ReACT steps per trajectory")
|
||||
agent_temperature: float = Field(default=0.7, description="Sampling temperature")
|
||||
agent_max_tokens: Optional[int] = Field(
|
||||
default=None,
|
||||
description="Max tokens per model response (default: let backend decide)",
|
||||
)
|
||||
agent_tool_delay_s: float = Field(default=0.0, description="Delay between tool calls (seconds)")
|
||||
|
||||
# tool selection
|
||||
enabled_toolsets: List[str] = Field(
|
||||
default_factory=lambda: ["default"],
|
||||
description="Toolsets to enable (Hermes-style grouping).",
|
||||
)
|
||||
disabled_toolsets: List[str] = Field(
|
||||
default_factory=list,
|
||||
description="Toolsets to disable (applied after enabled_toolsets).",
|
||||
)
|
||||
|
||||
# external ToolServer routing (Phase 4.5+)
|
||||
tool_server_url: Optional[str] = Field(
|
||||
default=None,
|
||||
description="Base URL for external ToolServer (enables external tools).",
|
||||
)
|
||||
tool_server_token: Optional[str] = Field(
|
||||
default=None,
|
||||
description="Bearer token for ToolServer auth (optional in dev).",
|
||||
)
|
||||
|
||||
AgentEnvConfigT = TypeVar("AgentEnvConfigT", bound="AgentEnvConfig")
|
||||
|
||||
|
||||
class AgentEnv(BaseEnv, ABC, Generic[AgentEnvConfigT]):
|
||||
env_config_cls = AgentEnvConfig
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
config: AgentEnvConfigT,
|
||||
server_configs: List[APIServerConfig],
|
||||
slurm: bool = False,
|
||||
testing: bool = False,
|
||||
):
|
||||
super().__init__(config, server_configs, slurm, testing)
|
||||
self.config: AgentEnvConfigT = config
|
||||
|
||||
self.tools: ToolRegistry = self.build_tools()
|
||||
|
||||
self._backend: Optional[ToolBackend] = None
|
||||
self._tool_executor: Optional[ToolExecutor] = None
|
||||
self._tool_server_inprocess: bool = False
|
||||
self._trajectory_workspace_meta: Dict[str, Dict[str, Any]] = {}
|
||||
|
||||
def build_tools(self) -> ToolRegistry:
|
||||
"""Wraps original Hermes-Agent ToolRegistry for atropos AgentEnv use.
|
||||
See Hermes-Agent docs for toolsets and available tools etc.
|
||||
"""
|
||||
return build_tool_registry(
|
||||
enabled_toolsets=self.config.enabled_toolsets or ["default"],
|
||||
disabled_toolsets=self.config.disabled_toolsets or None,
|
||||
tool_server_url=self.config.tool_server_url,
|
||||
)
|
||||
|
||||
@abstractmethod
|
||||
def build_task(self, item: Item) -> str:
|
||||
"""Return the user-facing task string for the agent."""
|
||||
|
||||
@abstractmethod
|
||||
async def score_trajectory(self, item: Item, final_response: str) -> float:
|
||||
"""Return a scalar score for this trajectory."""
|
||||
|
||||
async def setup_trajectory_workspace(
|
||||
self,
|
||||
item: Item,
|
||||
*,
|
||||
trajectory_id: str,
|
||||
exec_tool: Callable[["ToolCall"], Awaitable["ToolResult"]],
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Optional hook: prepare the sandbox workspace before the agent starts.
|
||||
|
||||
Examples:
|
||||
- clone a repo and checkout a commit
|
||||
- write fixture files (e.g. images) for external-tool demos
|
||||
- pre-install dependencies
|
||||
|
||||
Default: no-op.
|
||||
"""
|
||||
_ = (item, trajectory_id, exec_tool)
|
||||
return {}
|
||||
|
||||
async def verify_and_score_trajectory(
|
||||
self,
|
||||
item: Item,
|
||||
final_response: str,
|
||||
*,
|
||||
trajectory_id: str,
|
||||
exec_tool: Callable[["ToolCall"], Awaitable["ToolResult"]],
|
||||
agent_result: Optional[AgentResult] = None,
|
||||
workspace_meta: Optional[Dict[str, Any]] = None,
|
||||
) -> tuple[float, Dict[str, Any]]:
|
||||
"""
|
||||
Optional hook: run in-sandbox verification before scoring.
|
||||
|
||||
Many agent envs need to execute verification inside the same trajectory
|
||||
workspace (e.g. pytest) before releasing/resetting the slot.
|
||||
|
||||
Default: calls `score_trajectory()` and returns empty metadata.
|
||||
"""
|
||||
_ = (trajectory_id, exec_tool, agent_result, workspace_meta) # default ignores in-workspace verification
|
||||
score = await self.score_trajectory(item, final_response)
|
||||
return score, {}
|
||||
|
||||
def build_agent_config(self, item: Item) -> AgentConfig: # noqa: ARG002
|
||||
return AgentConfig(
|
||||
max_steps=self.config.agent_max_steps,
|
||||
temperature=self.config.agent_temperature,
|
||||
max_tokens=self.config.agent_max_tokens,
|
||||
tool_delay_s=self.config.agent_tool_delay_s,
|
||||
)
|
||||
|
||||
async def setup(self) -> None:
|
||||
print(f"[AgentEnv] setup(): starting tool backend ({self.config.tool_pool_mode})", flush=True)
|
||||
await self._start_tool_backend()
|
||||
print("[AgentEnv] setup(): configuring server concurrency", flush=True)
|
||||
self._configure_server_concurrency()
|
||||
print("[AgentEnv] setup(): running env-specific setup_agent_env()", flush=True)
|
||||
await self.setup_agent_env()
|
||||
print("[AgentEnv] setup(): done", flush=True)
|
||||
|
||||
def _configure_server_concurrency(self) -> None:
|
||||
"""
|
||||
Ensure the LLM server concurrency isn't accidentally capped below `group_size`.
|
||||
|
||||
In `BaseEnv process` mode, groups are collected concurrently and if the underlying
|
||||
ServerManager/OpenAIServer semaphore is left at 1, we serialize inference even
|
||||
when `--env.group_size` is > 1.
|
||||
"""
|
||||
desired = int(getattr(self.config, "group_size", 1) or 1)
|
||||
if desired <= 1:
|
||||
return
|
||||
|
||||
servers = getattr(self.server, "servers", None)
|
||||
if not isinstance(servers, list) or not servers:
|
||||
return
|
||||
|
||||
for s in servers:
|
||||
sem = getattr(s, "sem", None)
|
||||
eval_sem = getattr(s, "eval_sem", None)
|
||||
# Only increase; never shrink.
|
||||
if sem is not None and getattr(sem, "max_val", 0) < desired:
|
||||
s.sem = AsyncSemWithAdaptiveWeight(desired)
|
||||
if hasattr(s, "config") and hasattr(s.config, "num_max_requests_at_once"):
|
||||
s.config.num_max_requests_at_once = desired
|
||||
if eval_sem is not None and getattr(eval_sem, "max_val", 0) < desired:
|
||||
s.eval_sem = AsyncSemWithAdaptiveWeight(desired)
|
||||
if hasattr(s, "config") and hasattr(s.config, "num_requests_for_eval"):
|
||||
s.config.num_requests_for_eval = desired
|
||||
|
||||
@abstractmethod
|
||||
async def setup_agent_env(self) -> None:
|
||||
"""Subclass hook for env-specific setup."""
|
||||
|
||||
async def evaluate(self, *args, **kwargs): # noqa: ARG002
|
||||
"""
|
||||
Default eval hook (no-op).
|
||||
|
||||
Atropos BaseEnv requires an `evaluate()` implementation. Many agent envs
|
||||
won't have a meaningful evaluation path during early PoC work; they can
|
||||
override this when needed.
|
||||
"""
|
||||
return {}
|
||||
|
||||
async def env_manager(self):
|
||||
try:
|
||||
return await super().env_manager()
|
||||
finally:
|
||||
await self.shutdown_tool_backend()
|
||||
|
||||
async def process_manager(self):
|
||||
try:
|
||||
return await super().process_manager()
|
||||
finally:
|
||||
await self.shutdown_tool_backend()
|
||||
|
||||
async def _start_tool_backend(self) -> None:
|
||||
if self._tool_executor is not None:
|
||||
return
|
||||
|
||||
tool_server_url = self.config.tool_server_url
|
||||
tool_server_client = None
|
||||
if tool_server_url == "inprocess":
|
||||
import httpx
|
||||
from ..api.tool_server import app as tool_server_app
|
||||
|
||||
await tool_server_app.router.startup()
|
||||
tool_server_client = httpx.AsyncClient(
|
||||
transport=httpx.ASGITransport(app=tool_server_app),
|
||||
base_url="http://toolserver",
|
||||
)
|
||||
tool_server_url = "http://toolserver"
|
||||
self._tool_server_inprocess = True
|
||||
|
||||
backend = create_tool_backend(self.config)
|
||||
await backend.start()
|
||||
|
||||
executor = ToolExecutor(
|
||||
backend=backend,
|
||||
tools=self.tools,
|
||||
config=ToolExecutorConfig(
|
||||
batch_window_ms=self.config.tool_batch_window_ms,
|
||||
max_batch_size=self.config.tool_max_batch_size,
|
||||
allow_network=self.config.allow_network,
|
||||
require_sandbox=self.config.require_sandbox,
|
||||
require_stateful_sandbox=self.config.require_stateful_sandbox,
|
||||
tool_server_url=tool_server_url,
|
||||
tool_server_token=self.config.tool_server_token,
|
||||
),
|
||||
)
|
||||
await executor.start()
|
||||
if tool_server_client is not None:
|
||||
executor._tool_server_client = tool_server_client # type: ignore[attr-defined]
|
||||
|
||||
self._backend = backend
|
||||
self._tool_executor = executor
|
||||
|
||||
async def shutdown_tool_backend(self) -> None:
|
||||
executor = self._tool_executor
|
||||
backend = self._backend
|
||||
inprocess_tool_server = self._tool_server_inprocess
|
||||
self._tool_executor = None
|
||||
self._backend = None
|
||||
self._tool_server_inprocess = False
|
||||
|
||||
if executor is not None:
|
||||
await executor.close()
|
||||
if backend is not None:
|
||||
await backend.stop(purge=bool(self.config.purge_job_on_shutdown))
|
||||
if inprocess_tool_server:
|
||||
from ..api.tool_server import app as tool_server_app
|
||||
|
||||
await tool_server_app.router.shutdown()
|
||||
|
||||
async def collect_trajectory(
|
||||
self, item: Item
|
||||
) -> Tuple[Optional[ScoredDataItem], List[Item]]:
|
||||
if self._tool_executor is None:
|
||||
raise RuntimeError("Tool backend not started")
|
||||
|
||||
trajectory_id = str(uuid.uuid4())
|
||||
t0 = time.perf_counter()
|
||||
print(f"[AgentEnv] collect_trajectory(): tid={trajectory_id} start", flush=True)
|
||||
task = self.build_task(item)
|
||||
agent_config = self.build_agent_config(item)
|
||||
if os.getenv("ATROPOS_DEBUG_PRINT_TASK") == "1":
|
||||
print(f"Starting trajectory {trajectory_id} with task: {task}", flush=True)
|
||||
else:
|
||||
# Avoid printing the full task prompt by default (can be huge/noisy).
|
||||
one_line = " ".join(str(task).splitlines()).strip()
|
||||
preview = one_line[:240] + ("…" if len(one_line) > 240 else "")
|
||||
print(f"Starting trajectory {trajectory_id} (task preview): {preview}", flush=True)
|
||||
|
||||
async def _exec(call):
|
||||
return await self._tool_executor.execute(trajectory_id, call)
|
||||
|
||||
agent = AtroposAgent(
|
||||
server=self.server,
|
||||
tokenizer=self.tokenizer,
|
||||
tools=self.tools,
|
||||
config=agent_config,
|
||||
execute_tool=_exec,
|
||||
)
|
||||
|
||||
try:
|
||||
print(f"[AgentEnv] tid={trajectory_id} setup_trajectory_workspace() start", flush=True)
|
||||
workspace_meta = await self.setup_trajectory_workspace(item, trajectory_id=trajectory_id, exec_tool=_exec)
|
||||
if not isinstance(workspace_meta, dict):
|
||||
workspace_meta = {}
|
||||
self._trajectory_workspace_meta[trajectory_id] = workspace_meta
|
||||
print(
|
||||
f"[AgentEnv] tid={trajectory_id} setup_trajectory_workspace() done in {time.perf_counter() - t0:.2f}s",
|
||||
flush=True,
|
||||
)
|
||||
|
||||
print(f"[AgentEnv] tid={trajectory_id} agent.run() start", flush=True)
|
||||
result = await agent.run(task)
|
||||
print(
|
||||
f"[AgentEnv] tid={trajectory_id} agent.run() done in {time.perf_counter() - t0:.2f}s "
|
||||
f"success={result.success} tool_calls={result.total_tool_calls}",
|
||||
flush=True,
|
||||
)
|
||||
if not result.success or result.trajectory_data is None:
|
||||
# Do not trigger BaseEnv retries for agent failures.
|
||||
# Record the trajectory with score 0.0 so training/eval can see the failure mode.
|
||||
messages = [{"role": "system", "content": agent._build_system_prompt()}] # noqa: SLF001
|
||||
messages.append({"role": "user", "content": task})
|
||||
for step in result.steps:
|
||||
messages.append({"role": "assistant", "content": step.assistant_message})
|
||||
if step.tool_results:
|
||||
tool_text = "\n".join(r.to_xml() for r in step.tool_results)
|
||||
messages.append({"role": "user", "content": tool_text})
|
||||
|
||||
scored: ScoredDataItem = {
|
||||
"tokens": (result.trajectory_data.tokens if result.trajectory_data else []),
|
||||
"masks": (result.trajectory_data.masked_tokens if result.trajectory_data else []),
|
||||
"scores": 0.0,
|
||||
}
|
||||
if result.trajectory_data is not None:
|
||||
scored["inference_logprobs"] = result.trajectory_data.logprobs # type: ignore[typeddict-unknown-key]
|
||||
if getattr(result.trajectory_data, "metadata", None):
|
||||
scored["overrides"] = {"managed_metadata": result.trajectory_data.metadata}
|
||||
if self.config.include_messages:
|
||||
# Record a final failure marker as a user-side tool_response-like block so it survives templates.
|
||||
import json
|
||||
|
||||
err = result.error or "agent_failed"
|
||||
messages.append(
|
||||
{
|
||||
"role": "user",
|
||||
"content": f"<tool_response>{json.dumps({'success': False, 'error': err})}</tool_response>",
|
||||
}
|
||||
)
|
||||
scored["messages"] = messages
|
||||
return scored, []
|
||||
|
||||
print(f"[AgentEnv] tid={trajectory_id} verify_and_score_trajectory() start", flush=True)
|
||||
score, score_metadata = await self.verify_and_score_trajectory(
|
||||
item,
|
||||
result.final_response,
|
||||
trajectory_id=trajectory_id,
|
||||
exec_tool=_exec,
|
||||
agent_result=result,
|
||||
workspace_meta=workspace_meta,
|
||||
)
|
||||
print(
|
||||
f"[AgentEnv] tid={trajectory_id} verify_and_score_trajectory() done in {time.perf_counter() - t0:.2f}s "
|
||||
f"score={score}",
|
||||
flush=True,
|
||||
)
|
||||
|
||||
messages = [{"role": "system", "content": agent._build_system_prompt()}] # noqa: SLF001
|
||||
messages.append({"role": "user", "content": task})
|
||||
for step in result.steps:
|
||||
messages.append({"role": "assistant", "content": step.assistant_message})
|
||||
if step.tool_results:
|
||||
tool_text = "\n".join(r.to_xml() for r in step.tool_results)
|
||||
messages.append({"role": "user", "content": tool_text})
|
||||
|
||||
# Optional: allow env verification to attach additional messages (e.g. install logs).
|
||||
if self.config.include_messages and isinstance(score_metadata, dict):
|
||||
extra = score_metadata.get("verification_messages")
|
||||
if isinstance(extra, list):
|
||||
for m in extra:
|
||||
if isinstance(m, dict) and isinstance(m.get("role"), str) and isinstance(m.get("content"), str):
|
||||
messages.append({"role": m["role"], "content": m["content"]})
|
||||
|
||||
scored: ScoredDataItem = {
|
||||
"tokens": result.trajectory_data.tokens,
|
||||
"masks": result.trajectory_data.masked_tokens,
|
||||
"scores": score,
|
||||
}
|
||||
# Atroposlib expects policy logprobs at the *group* level under `inference_logprobs`.
|
||||
# We stash per-item values here and lift them into the group in `collect_trajectories()`.
|
||||
scored["inference_logprobs"] = result.trajectory_data.logprobs # type: ignore[typeddict-unknown-key]
|
||||
if getattr(result.trajectory_data, "metadata", None):
|
||||
scored["overrides"] = {"managed_metadata": result.trajectory_data.metadata}
|
||||
if self.config.include_messages:
|
||||
scored["messages"] = messages
|
||||
|
||||
return scored, []
|
||||
finally:
|
||||
self._trajectory_workspace_meta.pop(trajectory_id, None)
|
||||
print(f"[AgentEnv] tid={trajectory_id} release_trajectory(reset_workspace=True)", flush=True)
|
||||
await self._tool_executor.release_trajectory(trajectory_id, reset_workspace=True)
|
||||
print(f"[AgentEnv] collect_trajectory(): tid={trajectory_id} done in {time.perf_counter() - t0:.2f}s", flush=True)
|
||||
|
||||
async def collect_trajectories(
|
||||
self, item: Item
|
||||
) -> Tuple[Optional[ScoredDataGroup], List[Item]]:
|
||||
tasks = [self.collect_trajectory(item) for _ in range(self.config.group_size)]
|
||||
results = await asyncio.gather(*tasks)
|
||||
|
||||
backlog: List[Item] = []
|
||||
items: List[ScoredDataItem] = []
|
||||
for scored, b in results:
|
||||
backlog.extend(b)
|
||||
if scored is not None:
|
||||
items.append(scored)
|
||||
|
||||
if len(items) != self.config.group_size:
|
||||
return None, backlog
|
||||
|
||||
group: ScoredDataGroup = ScoredDataGroup(
|
||||
tokens=[],
|
||||
masks=[],
|
||||
scores=[],
|
||||
advantages=[],
|
||||
ref_logprobs=[],
|
||||
messages=[] if self.config.include_messages else None,
|
||||
inference_logprobs=[],
|
||||
group_overrides={},
|
||||
overrides=[],
|
||||
images=[],
|
||||
generation_params=None,
|
||||
)
|
||||
|
||||
for it in items:
|
||||
group["tokens"].append(it["tokens"])
|
||||
group["masks"].append(it["masks"])
|
||||
group["scores"].append(it["scores"])
|
||||
# policy logprobs (for PPO/GRPO training) if present
|
||||
lp = it.get("inference_logprobs") # type: ignore[typeddict-item]
|
||||
if lp is not None:
|
||||
group["inference_logprobs"].append(lp)
|
||||
group["overrides"].append(it.get("overrides") or {}) # type: ignore[typeddict-item]
|
||||
if group.get("messages") is not None and it.get("messages") is not None:
|
||||
group["messages"].append(it["messages"])
|
||||
|
||||
return group, backlog
|
||||
|
||||
async def run_agent(self, task: str, *, trajectory_id: Optional[str] = None) -> Tuple[str, Dict[str, Any]]:
|
||||
"""
|
||||
Run the AtroposAgent on a single task and return (final_response, debug).
|
||||
|
||||
This is a helper intended for simple environments and tests.
|
||||
"""
|
||||
if self._tool_executor is None:
|
||||
raise RuntimeError("Tool backend not started")
|
||||
|
||||
tid = trajectory_id or str(uuid.uuid4())
|
||||
|
||||
async def _exec(call):
|
||||
return await self._tool_executor.execute(tid, call)
|
||||
|
||||
agent = AtroposAgent(
|
||||
server=self.server,
|
||||
tokenizer=self.tokenizer,
|
||||
tools=self.tools,
|
||||
config=AgentConfig(
|
||||
max_steps=self.config.agent_max_steps,
|
||||
temperature=self.config.agent_temperature,
|
||||
max_tokens=self.config.agent_max_tokens,
|
||||
),
|
||||
execute_tool=_exec,
|
||||
)
|
||||
result = await agent.run(task)
|
||||
await self._tool_executor.release_trajectory(tid, reset_workspace=True)
|
||||
return result.final_response, {"success": result.success, "error": result.error, "tool_calls": result.total_tool_calls}
|
||||
@@ -1,171 +0,0 @@
|
||||
"""
|
||||
Hermes-Agent + Atropos (Nomad sandbox) compatibility smoke environment.
|
||||
|
||||
This environment is intended to validate, end-to-end:
|
||||
BaseEnv.process -> AgentEnv -> ToolExecutor (batched) -> Nomad SlotPool -> sandbox_server
|
||||
|
||||
It forces the model to use a sandbox tool by asking it to run a command that
|
||||
generates a high-entropy token inside the sandbox, then repeat it exactly.
|
||||
|
||||
Run (process mode):
|
||||
uv run python -m atropos.envs.hermes_compat_test_env process --env.use_wandb false --env.total_steps 2 --env.group_size 1
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
from typing import Any, Dict, List, Tuple
|
||||
|
||||
from dotenv import load_dotenv
|
||||
from pydantic import Field
|
||||
|
||||
from atroposlib.envs.base import APIServerConfig, Item
|
||||
|
||||
from ..agent import AgentConfig, AgentResult
|
||||
from ..tools import ToolCall
|
||||
from .agent_env import AgentEnv, AgentEnvConfig
|
||||
|
||||
load_dotenv()
|
||||
|
||||
|
||||
def _forced_tool_item() -> Item:
|
||||
# Use double quotes in the shell command and show JSON escaping explicitly.
|
||||
# This avoids invalid JSON escapes like `\\'` (not valid JSON) that some models produce.
|
||||
cmd = 'python -c "import secrets; print(secrets.token_hex(16))"'
|
||||
return {
|
||||
"command": cmd,
|
||||
"prompt": (
|
||||
"You are acting as an agent inside a sandboxed environment.\n"
|
||||
"You MUST use the terminal tool to execute commands.\n"
|
||||
"Run this exact command:\n"
|
||||
f"{cmd}\n"
|
||||
"When you call the tool, use valid JSON inside <tool_call>. Example:\n"
|
||||
'<tool_call>{"name": "terminal", "arguments": {"command": '
|
||||
'"python -c \\\\"import secrets; print(secrets.token_hex(16))\\\\""}}'
|
||||
"</tool_call>\n"
|
||||
"Then respond with EXACTLY what it printed (the hex token) and nothing else.\n"
|
||||
"Do not guess. Do not explain."
|
||||
),
|
||||
}
|
||||
|
||||
|
||||
class HermesCompatTestEnvConfig(AgentEnvConfig):
|
||||
server_base_url: str = Field(
|
||||
default="http://127.0.0.1:8080",
|
||||
description="Base URL for an OpenAI-compatible chat server (without /v1).",
|
||||
)
|
||||
server_model: str = Field(default="hermes-4-36b", description="Model name")
|
||||
tokenizer_name: str = Field(default="NousResearch/Hermes-4.3-36B", description="Tokenizer name for RL tokenization")
|
||||
|
||||
|
||||
class HermesCompatTestEnv(AgentEnv[HermesCompatTestEnvConfig]):
|
||||
name = "hermes_compat_test_env"
|
||||
env_config_cls = HermesCompatTestEnvConfig
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
config: HermesCompatTestEnvConfig,
|
||||
server_configs: List[APIServerConfig],
|
||||
slurm: bool = False,
|
||||
testing: bool = False,
|
||||
):
|
||||
super().__init__(config, server_configs, slurm, testing)
|
||||
self._iter = 0
|
||||
|
||||
@classmethod
|
||||
def config_init(cls) -> Tuple[HermesCompatTestEnvConfig, List[APIServerConfig]]:
|
||||
base_url = (
|
||||
os.getenv("ATROPOS_SERVER_BASE_URL")
|
||||
or os.getenv("OPENAI_BASE_URL")
|
||||
or os.getenv("LLM_BASE_URL")
|
||||
or "http://127.0.0.1:8080"
|
||||
)
|
||||
model = os.getenv("ATROPOS_SERVER_MODEL") or os.getenv("LLM_MODEL") or "hermes-4-36b"
|
||||
api_key = os.getenv("ATROPOS_SERVER_API_KEY") or os.getenv("NOUS_API_KEY") or os.getenv("OPENAI_API_KEY") or "local"
|
||||
|
||||
env_config = HermesCompatTestEnvConfig(
|
||||
tokenizer_name=os.getenv("ATROPOS_TOKENIZER_NAME") or "NousResearch/Hermes-4.3-36B",
|
||||
group_size=1,
|
||||
use_wandb=False,
|
||||
include_messages=True,
|
||||
ensure_scores_are_not_same=False,
|
||||
total_steps=2,
|
||||
batch_size=1,
|
||||
server_base_url=base_url,
|
||||
server_model=model,
|
||||
# Tooling: sandbox-only terminal.
|
||||
enabled_toolsets=["terminal"],
|
||||
disabled_toolsets=[],
|
||||
# Default to Nomad sandboxing; users can override via --env.* args.
|
||||
sandbox_image=os.getenv("ATROPOS_SANDBOX_IMAGE") or "atropos-sandbox:local",
|
||||
# In local dev it's common for a previous crash to leave the job in backoff.
|
||||
purge_job_on_start=True,
|
||||
purge_job_on_shutdown=True,
|
||||
)
|
||||
|
||||
server_configs = [
|
||||
APIServerConfig(
|
||||
model_name=model,
|
||||
base_url=f"{base_url.rstrip('/')}/v1",
|
||||
api_key=api_key,
|
||||
num_max_requests_at_once=1,
|
||||
num_requests_for_eval=1,
|
||||
timeout=120,
|
||||
)
|
||||
]
|
||||
return env_config, server_configs
|
||||
|
||||
async def setup_agent_env(self) -> None:
|
||||
return None
|
||||
|
||||
async def get_next_item(self) -> Item:
|
||||
self._iter += 1
|
||||
return _forced_tool_item()
|
||||
|
||||
def build_task(self, item: Item) -> str:
|
||||
return str(item.get("prompt") or "")
|
||||
|
||||
def build_agent_config(self, item: Item) -> AgentConfig: # noqa: ARG002
|
||||
# Avoid imposing max_tokens by default; tool-tag responses can be long for some models.
|
||||
return AgentConfig(
|
||||
max_steps=min(8, int(self.config.agent_max_steps)),
|
||||
temperature=0.2,
|
||||
max_tokens=None,
|
||||
)
|
||||
|
||||
async def score_trajectory(self, item: Item, final_response: str) -> float:
|
||||
# Scoring happens in verify_and_score_trajectory so we can inspect tool results.
|
||||
_ = (item, final_response)
|
||||
return 0.0
|
||||
|
||||
async def verify_and_score_trajectory(
|
||||
self,
|
||||
item: Item,
|
||||
final_response: str,
|
||||
*,
|
||||
trajectory_id: str, # noqa: ARG002
|
||||
exec_tool, # noqa: ARG002
|
||||
agent_result: AgentResult | None = None,
|
||||
workspace_meta: Dict[str, Any] | None = None, # noqa: ARG002
|
||||
) -> tuple[float, Dict[str, Any]]:
|
||||
if agent_result is None:
|
||||
return 0.0, {"error": "Missing agent_result"}
|
||||
|
||||
observed: str = ""
|
||||
tool_ok = False
|
||||
for step in agent_result.steps:
|
||||
for res in step.tool_results:
|
||||
if not res.success:
|
||||
return 0.0, {"error": res.error, "output": res.output}
|
||||
out = (res.output or "").strip()
|
||||
if out:
|
||||
observed = out.splitlines()[-1].strip()
|
||||
tool_ok = True
|
||||
|
||||
final = (final_response or "").strip()
|
||||
score = 1.0 if tool_ok and agent_result.total_tool_calls > 0 and observed and final == observed else 0.0
|
||||
return score, {"observed": observed, "tool_calls": agent_result.total_tool_calls, "command": item.get("command")}
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
HermesCompatTestEnv.cli()
|
||||
@@ -1,172 +0,0 @@
|
||||
"""
|
||||
Nomad sandbox terminal smoke environment (training-oriented).
|
||||
|
||||
Validates, end-to-end:
|
||||
BaseEnv.process -> AgentEnv -> ToolExecutor (batched) -> Nomad SlotPool -> sandbox_server
|
||||
|
||||
It forces the model to use a sandbox tool by asking it to run a command that
|
||||
generates a high-entropy token inside the sandbox, then repeat it exactly.
|
||||
|
||||
Run (process mode):
|
||||
uv run python -m atropos.envs.sandbox_terminal_smoke_env process --env.use_wandb false --env.total_steps 2 --env.group_size 1
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
from typing import Any, Dict, List, Tuple
|
||||
|
||||
from dotenv import load_dotenv
|
||||
from pydantic import Field
|
||||
|
||||
from atroposlib.envs.base import APIServerConfig, Item
|
||||
|
||||
from ..agent import AgentConfig, AgentResult
|
||||
from ..tools import ToolCall
|
||||
from .agent_env import AgentEnv, AgentEnvConfig
|
||||
|
||||
load_dotenv()
|
||||
|
||||
STRICT_TOOLCALL_SYSTEM_PROMPT = None
|
||||
|
||||
|
||||
def _forced_tool_item() -> Item:
|
||||
# Use double quotes in the shell command and show JSON escaping explicitly.
|
||||
# This avoids invalid JSON escapes like `\\'` (not valid JSON) that some models produce.
|
||||
cmd = 'python -c "import secrets; print(secrets.token_hex(16))"'
|
||||
return {
|
||||
"command": cmd,
|
||||
"prompt": (
|
||||
"You MUST use the terminal tool.\n"
|
||||
"Run this exact command:\n"
|
||||
f"{cmd}\n"
|
||||
"When you call the tool, use valid JSON inside <tool_call>. Example:\n"
|
||||
'<tool_call>{"name": "terminal", "arguments": {"command": '
|
||||
'"python -c \\\\"import secrets; print(secrets.token_hex(16))\\\\""}}'
|
||||
"</tool_call>\n"
|
||||
"Then respond with EXACTLY what it printed (the hex token) and nothing else.\n"
|
||||
"Do not guess. Do not explain."
|
||||
),
|
||||
}
|
||||
|
||||
|
||||
class SandboxTerminalSmokeEnvConfig(AgentEnvConfig):
|
||||
server_base_url: str = Field(
|
||||
default="http://127.0.0.1:8080",
|
||||
description="Base URL for an OpenAI-compatible chat server (without /v1).",
|
||||
)
|
||||
server_model: str = Field(default="hermes-4-36b", description="Model name")
|
||||
tokenizer_name: str = Field(default="NousResearch/Hermes-4.3-36B", description="Tokenizer name for RL tokenization")
|
||||
|
||||
|
||||
class SandboxTerminalSmokeEnv(AgentEnv[SandboxTerminalSmokeEnvConfig]):
|
||||
name = "sandbox_terminal_smoke_env"
|
||||
env_config_cls = SandboxTerminalSmokeEnvConfig
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
config: SandboxTerminalSmokeEnvConfig,
|
||||
server_configs: List[APIServerConfig],
|
||||
slurm: bool = False,
|
||||
testing: bool = False,
|
||||
):
|
||||
super().__init__(config, server_configs, slurm, testing)
|
||||
self._iter = 0
|
||||
|
||||
@classmethod
|
||||
def config_init(cls) -> Tuple[SandboxTerminalSmokeEnvConfig, List[APIServerConfig]]:
|
||||
base_url = (
|
||||
os.getenv("ATROPOS_SERVER_BASE_URL")
|
||||
or os.getenv("OPENAI_BASE_URL")
|
||||
or os.getenv("LLM_BASE_URL")
|
||||
or "http://127.0.0.1:8080"
|
||||
)
|
||||
model = os.getenv("ATROPOS_SERVER_MODEL") or os.getenv("LLM_MODEL") or "hermes-4-36b"
|
||||
api_key = os.getenv("ATROPOS_SERVER_API_KEY") or os.getenv("NOUS_API_KEY") or os.getenv("OPENAI_API_KEY") or "local"
|
||||
|
||||
env_config = SandboxTerminalSmokeEnvConfig(
|
||||
tokenizer_name=os.getenv("ATROPOS_TOKENIZER_NAME") or "NousResearch/Hermes-4.3-36B",
|
||||
group_size=1,
|
||||
use_wandb=False,
|
||||
include_messages=True,
|
||||
ensure_scores_are_not_same=False,
|
||||
total_steps=2,
|
||||
batch_size=1,
|
||||
server_base_url=base_url,
|
||||
server_model=model,
|
||||
# Tooling: sandbox-only terminal.
|
||||
enabled_toolsets=["terminal"],
|
||||
disabled_toolsets=[],
|
||||
# Default to Nomad sandboxing; users can override via --env.* args.
|
||||
sandbox_image=os.getenv("ATROPOS_SANDBOX_IMAGE") or "atropos-sandbox:local",
|
||||
purge_job_on_start=True,
|
||||
purge_job_on_shutdown=True,
|
||||
)
|
||||
|
||||
server_configs = [
|
||||
APIServerConfig(
|
||||
model_name=model,
|
||||
base_url=f"{base_url.rstrip('/')}/v1",
|
||||
api_key=api_key,
|
||||
num_max_requests_at_once=1,
|
||||
num_requests_for_eval=1,
|
||||
timeout=120,
|
||||
)
|
||||
]
|
||||
return env_config, server_configs
|
||||
|
||||
async def setup_agent_env(self) -> None:
|
||||
return None
|
||||
|
||||
async def get_next_item(self) -> Item:
|
||||
self._iter += 1
|
||||
return _forced_tool_item()
|
||||
|
||||
def build_task(self, item: Item) -> str:
|
||||
return str(item.get("prompt") or "")
|
||||
|
||||
def build_agent_config(self, item: Item) -> AgentConfig: # noqa: ARG002
|
||||
# Avoid imposing max_tokens by default; tool-tag responses can be long for some models.
|
||||
return AgentConfig(
|
||||
max_steps=min(8, int(self.config.agent_max_steps)),
|
||||
temperature=0.2,
|
||||
max_tokens=None,
|
||||
system_prompt=STRICT_TOOLCALL_SYSTEM_PROMPT,
|
||||
)
|
||||
|
||||
async def score_trajectory(self, item: Item, final_response: str) -> float:
|
||||
# Scoring happens in verify_and_score_trajectory so we can inspect tool results.
|
||||
_ = (item, final_response)
|
||||
return 0.0
|
||||
|
||||
async def verify_and_score_trajectory(
|
||||
self,
|
||||
item: Item,
|
||||
final_response: str,
|
||||
*,
|
||||
trajectory_id: str, # noqa: ARG002
|
||||
exec_tool, # noqa: ARG002
|
||||
agent_result: AgentResult | None = None,
|
||||
workspace_meta: Dict[str, Any] | None = None, # noqa: ARG002
|
||||
) -> tuple[float, Dict[str, Any]]:
|
||||
if agent_result is None:
|
||||
return 0.0, {"error": "Missing agent_result"}
|
||||
|
||||
observed: str = ""
|
||||
tool_ok = False
|
||||
for step in agent_result.steps:
|
||||
for res in step.tool_results:
|
||||
if not res.success:
|
||||
return 0.0, {"error": res.error, "output": res.output}
|
||||
out = (res.output or "").strip()
|
||||
if out:
|
||||
observed = out.splitlines()[-1].strip()
|
||||
tool_ok = True
|
||||
|
||||
final = (final_response or "").strip()
|
||||
score = 1.0 if tool_ok and agent_result.total_tool_calls > 0 and observed and final == observed else 0.0
|
||||
return score, {"observed": observed, "tool_calls": agent_result.total_tool_calls, "command": item.get("command")}
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
SandboxTerminalSmokeEnv.cli()
|
||||
@@ -1,418 +0,0 @@
|
||||
"""
|
||||
SWE-smith-oracle environment.
|
||||
|
||||
This environment is intentionally minimal:
|
||||
- prepares a sandbox workspace by cloning a public GitHub repo at `base_commit`
|
||||
- runs an AtroposAgent tool loop to apply a fix
|
||||
- verifies by running pytest nodeids from the dataset (reward = pass/fail)
|
||||
- Python only (no multi-language support currently, need to properly bauild & add to dropbox)
|
||||
- TODO: Get the other nonpython sandboxes up and running, then add a config knob to switch between them per row
|
||||
- oh and add to dockerhub
|
||||
|
||||
Dataset: NousResearch/SWE-smith-oracle (train; does NOT use SWE-bench eval set).
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
import random
|
||||
import time
|
||||
from typing import Any, Dict, List, Optional, Tuple
|
||||
|
||||
from pydantic import Field
|
||||
|
||||
from atroposlib.envs.base import APIServerConfig, Item
|
||||
|
||||
from ..agent import AgentConfig
|
||||
from ..tools import ToolCall
|
||||
from .agent_env import AgentEnv, AgentEnvConfig
|
||||
|
||||
|
||||
class SweSmithOracleEnvConfig(AgentEnvConfig):
|
||||
dataset_name: str = Field(default="NousResearch/SWE-smith-oracle")
|
||||
dataset_split: str = Field(default="train")
|
||||
max_items: int = Field(default=0, description="0 = no limit")
|
||||
shuffle: bool = Field(default=True)
|
||||
seed: int = Field(default=0)
|
||||
|
||||
python_only: bool = Field(default=True, description="Filter to Python-evaluable rows")
|
||||
score_include_fail_to_pass: bool = Field(
|
||||
default=True,
|
||||
description=(
|
||||
"If true (default), score tests on PASS_TO_PASS ∪ FAIL_TO_PASS. "
|
||||
"Disable to only run PASS_TO_PASS (faster but weaker signal)."
|
||||
),
|
||||
)
|
||||
|
||||
prompt_mode: str = Field(
|
||||
default="problem_statement",
|
||||
description="Task prompt content: 'problem_statement' (fast) or 'problem_statement+text' (slower, includes dataset 'text').",
|
||||
)
|
||||
|
||||
repo_base_url: str = Field(default="https://github.com", description="Base URL for repo cloning")
|
||||
install_timeout_s: float = Field(default=600.0)
|
||||
test_timeout_s: float = Field(default=600.0)
|
||||
|
||||
tokenizer_name: str = Field(default="NousResearch/Hermes-4.3-36B", description="Tokenizer name for RL tokenization")
|
||||
|
||||
|
||||
class SweSmithOracleEnv(AgentEnv[SweSmithOracleEnvConfig]):
|
||||
"""
|
||||
SWE-smith-oracle AgentEnv.
|
||||
|
||||
This is designed for benchmarking multiplexed slot execution vs naive container-per-trajectory.
|
||||
"""
|
||||
|
||||
name = "swe_smith_oracle_env"
|
||||
env_config_cls = SweSmithOracleEnvConfig
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
config: SweSmithOracleEnvConfig,
|
||||
server_configs: List[APIServerConfig],
|
||||
slurm: bool = False,
|
||||
testing: bool = False,
|
||||
):
|
||||
super().__init__(config, server_configs, slurm, testing)
|
||||
self._dataset = None
|
||||
self._indices: List[int] = []
|
||||
self._cursor = 0
|
||||
|
||||
@classmethod
|
||||
def config_init(cls) -> Tuple[SweSmithOracleEnvConfig, List[APIServerConfig]]:
|
||||
# Defaults for running the env via CLI in offline `process` mode.
|
||||
# Override via env vars or `--env.*` flags as needed.
|
||||
base_url_raw = (
|
||||
os.getenv("ATROPOS_SERVER_BASE_URL")
|
||||
or os.getenv("OPENAI_BASE_URL")
|
||||
or os.getenv("LLM_BASE_URL")
|
||||
or "http://127.0.0.1:8080"
|
||||
)
|
||||
base_url = base_url_raw.rstrip("/")
|
||||
if not base_url.endswith("/v1"):
|
||||
base_url = f"{base_url}/v1"
|
||||
model = os.getenv("ATROPOS_SERVER_MODEL") or os.getenv("LLM_MODEL") or "hermes-4-36b"
|
||||
api_key = os.getenv("ATROPOS_SERVER_API_KEY") or os.getenv("NOUS_API_KEY") or os.getenv("OPENAI_API_KEY") or "local"
|
||||
|
||||
env_config = SweSmithOracleEnvConfig(
|
||||
tokenizer_name=os.getenv("ATROPOS_TOKENIZER_NAME") or "NousResearch/Hermes-4.3-36B",
|
||||
group_size=1,
|
||||
use_wandb=False,
|
||||
rollout_server_url="http://localhost:8000",
|
||||
total_steps=1,
|
||||
batch_size=1,
|
||||
steps_per_eval=1,
|
||||
max_token_length=8192,
|
||||
inference_weight=1.0,
|
||||
wandb_name="swe_smith_oracle",
|
||||
enabled_toolsets=["terminal"],
|
||||
disabled_toolsets=[],
|
||||
sandbox_image=os.getenv("ATROPOS_SANDBOX_IMAGE") or "atropos-sandbox:local",
|
||||
purge_job_on_start=True,
|
||||
purge_job_on_shutdown=True,
|
||||
)
|
||||
|
||||
server_configs = [
|
||||
APIServerConfig(
|
||||
model_name=model,
|
||||
base_url=base_url,
|
||||
api_key=api_key,
|
||||
num_max_requests_at_once=1,
|
||||
num_requests_for_eval=1,
|
||||
timeout=int(os.getenv("ATROPOS_SERVER_TIMEOUT_S") or "300"),
|
||||
),
|
||||
]
|
||||
|
||||
return env_config, server_configs
|
||||
|
||||
async def setup_agent_env(self) -> None:
|
||||
from datasets import load_dataset
|
||||
|
||||
t0 = time.perf_counter()
|
||||
print(
|
||||
f"[SweSmithOracleEnv] loading dataset {self.config.dataset_name}:{self.config.dataset_split} "
|
||||
f"(python_only={self.config.python_only}, max_items={self.config.max_items or 'all'})",
|
||||
flush=True,
|
||||
)
|
||||
ds = load_dataset(self.config.dataset_name, split=self.config.dataset_split)
|
||||
self._dataset = ds
|
||||
|
||||
indices: List[int] = []
|
||||
for idx in range(len(ds)):
|
||||
row = ds[idx]
|
||||
if self.config.python_only and not self._is_python_row(row):
|
||||
continue
|
||||
indices.append(idx)
|
||||
|
||||
if self.config.shuffle:
|
||||
rnd = random.Random(self.config.seed)
|
||||
rnd.shuffle(indices)
|
||||
|
||||
if self.config.max_items and self.config.max_items > 0:
|
||||
indices = indices[: self.config.max_items]
|
||||
|
||||
self._indices = indices
|
||||
self._cursor = 0
|
||||
|
||||
print(
|
||||
f"[SweSmithOracleEnv] loaded {len(self._indices)} items from {self.config.dataset_name}:{self.config.dataset_split} "
|
||||
f"in {time.perf_counter() - t0:.2f}s",
|
||||
flush=True,
|
||||
)
|
||||
|
||||
def _is_python_row(self, row: Dict[str, Any]) -> bool:
|
||||
nodeids = row.get("PASS_TO_PASS")
|
||||
if not isinstance(nodeids, list) or not nodeids:
|
||||
return False
|
||||
for nid in nodeids:
|
||||
if not isinstance(nid, str) or ".py::" not in nid:
|
||||
return False
|
||||
return True
|
||||
|
||||
async def get_next_item(self) -> Item:
|
||||
print(f"[SweSmithOracleEnv] get_next_item() cursor={self._cursor}/{len(self._indices)}", flush=True)
|
||||
if not self._dataset or not self._indices:
|
||||
raise RuntimeError("Dataset not initialized (did setup() run?)")
|
||||
if self._cursor >= len(self._indices):
|
||||
self._cursor = 0
|
||||
idx = self._indices[self._cursor]
|
||||
self._cursor += 1
|
||||
return dict(self._dataset[idx])
|
||||
|
||||
def _repo_name(self, item: Item) -> str:
|
||||
repo = item.get("repo") or ""
|
||||
if isinstance(repo, str) and "/" in repo:
|
||||
return repo.split("/")[-1]
|
||||
return "repo"
|
||||
|
||||
def build_task(self, item: Item) -> str:
|
||||
repo = item.get("repo") or ""
|
||||
base_commit = item.get("base_commit") or ""
|
||||
problem = str(item.get("problem_statement") or "")
|
||||
context = str(item.get("text") or "")
|
||||
|
||||
nodeids = self._tests_for_item(item)
|
||||
tests_list = "\n".join(f"- {t}" for t in nodeids)
|
||||
|
||||
repo_dir = self._repo_name(item)
|
||||
|
||||
tests_block = (
|
||||
"Run these tests to verify:\n"
|
||||
f"{tests_list}\n\n"
|
||||
"When done, briefly describe what you changed and confirm tests pass."
|
||||
)
|
||||
|
||||
prompt_mode = (self.config.prompt_mode or "problem_statement").strip().lower()
|
||||
if prompt_mode not in {"problem_statement", "problem_statement+text"}:
|
||||
raise ValueError(
|
||||
f"Invalid prompt_mode={self.config.prompt_mode!r}. "
|
||||
"Expected 'problem_statement' or 'problem_statement+text'."
|
||||
)
|
||||
|
||||
context_block = ""
|
||||
if prompt_mode == "problem_statement+text" and context:
|
||||
# Note: We intentionally do NOT truncate/cap here. This mode is for debugging / richer prompts and can be slow.
|
||||
context_block = f"\nAdditional context:\n{context}\n"
|
||||
|
||||
return (
|
||||
"You are a senior software engineer. Fix the repository so the specified tests pass.\n\n"
|
||||
f"Repository: {repo} (checked out at base_commit={base_commit})\n"
|
||||
f"Workspace path: ./{repo_dir}\n\n"
|
||||
"Constraints:\n"
|
||||
"- You MUST use the terminal tool to inspect, edit, and verify the repository. Do not respond with a patch file.\n"
|
||||
f"- Start by inspecting the repo (e.g. `ls`, `cd ./{repo_dir}`, `git status`).\n"
|
||||
"- Use a workspace-local virtualenv (e.g. inside the repo at ./.venv) to avoid cross-run contamination.\n"
|
||||
"- Use non-interactive commands only.\n\n"
|
||||
"- Terminal commands run under POSIX /bin/sh and each tool call runs in a fresh shell (no persisted env vars).\n"
|
||||
" Avoid bash-only `source`; prefer `. .venv/bin/activate` or `.venv/bin/python ...`.\n\n"
|
||||
"Problem statement:\n"
|
||||
f"{problem}\n\n"
|
||||
f"{context_block}\n"
|
||||
f"{tests_block}"
|
||||
)
|
||||
|
||||
def build_agent_config(self, item: Item) -> AgentConfig: # noqa: ARG002
|
||||
# SWE tasks are longer than the simple test env.
|
||||
return AgentConfig(
|
||||
max_steps=self.config.agent_max_steps,
|
||||
temperature=self.config.agent_temperature,
|
||||
max_tokens=self.config.agent_max_tokens,
|
||||
tool_delay_s=self.config.agent_tool_delay_s,
|
||||
)
|
||||
|
||||
async def setup_trajectory_workspace(self, item: Item, *, trajectory_id: str, exec_tool) -> Dict[str, Any]:
|
||||
t0 = time.perf_counter()
|
||||
repo = item.get("repo")
|
||||
base_commit = item.get("base_commit")
|
||||
instance_id = item.get("instance_id") or item.get("id") or item.get("problem_id")
|
||||
if not isinstance(repo, str) or not isinstance(base_commit, str):
|
||||
raise RuntimeError("Invalid dataset row: missing repo/base_commit")
|
||||
|
||||
repo_dir = self._repo_name(item)
|
||||
clone_url = f"{self.config.repo_base_url.rstrip('/')}/{repo}.git"
|
||||
print(
|
||||
f"[SweSmithOracleEnv] tid={trajectory_id} setup_trajectory_workspace(): "
|
||||
f"repo={repo} base_commit={base_commit} instance_id={instance_id} dir=./{repo_dir}",
|
||||
flush=True,
|
||||
)
|
||||
|
||||
# Repo setup strategy:
|
||||
# - Maintain a shared, per-container bare repo cache under /data/repo_cache
|
||||
# - For each trajectory, create an isolated git worktree under the slot workspace
|
||||
# This avoids cloning/fetching full repos per trajectory and is crucial for multiplexing.
|
||||
|
||||
def _repo_cache_slug(repo_name: str) -> str:
|
||||
return repo_name.replace("/", "__")
|
||||
|
||||
repo_slug = _repo_cache_slug(repo)
|
||||
cache_root = "/data/repo_cache"
|
||||
bare_repo = f"{cache_root}/{repo_slug}.git"
|
||||
lock_file = f"{cache_root}/.locks/{repo_slug}.lock"
|
||||
|
||||
# Use flock to serialize operations that mutate the shared bare repo (fetch/worktree).
|
||||
# util-linux (flock) is included in the sandbox image.
|
||||
worktree_cmd = (
|
||||
"set -e; "
|
||||
f"rm -rf {repo_dir}; "
|
||||
f"mkdir -p {cache_root}/.locks; "
|
||||
f": > {lock_file}; "
|
||||
f"flock -x {lock_file} sh -lc '"
|
||||
f"set -e; "
|
||||
"export GIT_TERMINAL_PROMPT=0; "
|
||||
"export GIT_LFS_SKIP_SMUDGE=1; "
|
||||
f"if [ ! -d \"{bare_repo}\" ]; then "
|
||||
f" git init --bare \"{bare_repo}\"; "
|
||||
f" git -C \"{bare_repo}\" remote add origin \"{clone_url}\"; "
|
||||
"fi; "
|
||||
f"git -C \"{bare_repo}\" remote set-url origin \"{clone_url}\"; "
|
||||
f"git -C \"{bare_repo}\" worktree prune || true; "
|
||||
f"if ! git -C \"{bare_repo}\" cat-file -e \"{base_commit}^{{commit}}\" 2>/dev/null; then "
|
||||
f" git -C \"{bare_repo}\" fetch --depth 1 origin \"{base_commit}\" || true; "
|
||||
"fi; "
|
||||
f"if ! git -C \"{bare_repo}\" cat-file -e \"{base_commit}^{{commit}}\" 2>/dev/null; then "
|
||||
f" git -C \"{bare_repo}\" fetch --prune origin; "
|
||||
"fi; "
|
||||
f"git --git-dir=\"{bare_repo}\" worktree add --detach \"{repo_dir}\" \"{base_commit}\"; "
|
||||
"'"
|
||||
)
|
||||
|
||||
print(f"[SweSmithOracleEnv] tid={trajectory_id} preparing worktree from repo cache", flush=True)
|
||||
res = await exec_tool(
|
||||
ToolCall(
|
||||
name="terminal",
|
||||
arguments={"command": worktree_cmd, "timeout": self.config.install_timeout_s},
|
||||
)
|
||||
)
|
||||
if not res.success:
|
||||
raise RuntimeError(
|
||||
"git worktree setup failed "
|
||||
f"(repo={repo}, base_commit={base_commit}, instance_id={instance_id}): {res.error}\n{res.output}"
|
||||
)
|
||||
|
||||
print(
|
||||
f"[SweSmithOracleEnv] tid={trajectory_id} setup_trajectory_workspace(): worktree ready in {time.perf_counter() - t0:.2f}s",
|
||||
flush=True,
|
||||
)
|
||||
return {"repo_dir": repo_dir, "base_commit": base_commit}
|
||||
|
||||
def _tests_for_item(self, item: Item) -> List[str]:
|
||||
tests: List[str] = []
|
||||
if self.config.score_include_fail_to_pass:
|
||||
for key in ("PASS_TO_PASS", "FAIL_TO_PASS"):
|
||||
nodeids = item.get(key)
|
||||
if isinstance(nodeids, list):
|
||||
tests.extend([n for n in nodeids if isinstance(n, str)])
|
||||
else:
|
||||
nodeids = item.get("PASS_TO_PASS")
|
||||
if isinstance(nodeids, list):
|
||||
tests.extend([n for n in nodeids if isinstance(n, str)])
|
||||
# Stable order for reproducibility.
|
||||
return sorted(dict.fromkeys(tests))
|
||||
|
||||
def _chunk_nodeids(self, nodeids: List[str], max_per_chunk: int = 50) -> List[List[str]]:
|
||||
chunks: List[List[str]] = []
|
||||
for i in range(0, len(nodeids), max_per_chunk):
|
||||
chunks.append(nodeids[i : i + max_per_chunk])
|
||||
return chunks
|
||||
|
||||
async def verify_and_score_trajectory(
|
||||
self,
|
||||
item: Item,
|
||||
final_response: str, # noqa: ARG002
|
||||
*,
|
||||
trajectory_id: str,
|
||||
exec_tool,
|
||||
agent_result=None,
|
||||
workspace_meta: Optional[Dict[str, Any]] = None,
|
||||
) -> tuple[float, Dict[str, Any]]:
|
||||
_ = trajectory_id
|
||||
repo_dir = self._repo_name(item)
|
||||
|
||||
# Training correctness: do not reward trajectories that never actually used tools.
|
||||
if agent_result is not None and getattr(agent_result, "total_tool_calls", 0) <= 0:
|
||||
print(
|
||||
f"[SweSmithOracleEnv] tid={trajectory_id} verify (dataset_tests): no tool calls; score=0.0",
|
||||
flush=True,
|
||||
)
|
||||
return 0.0, {
|
||||
"verification_mode": "dataset_tests",
|
||||
"error": "No tool calls were made by the agent",
|
||||
}
|
||||
|
||||
nodeids = self._tests_for_item(item)
|
||||
if not nodeids:
|
||||
return 0.0, {"error": "No tests provided"}
|
||||
|
||||
print(f"[SweSmithOracleEnv] tid={trajectory_id} verify (dataset_tests): ensuring venv + deps", flush=True)
|
||||
setup_cmd = (
|
||||
f"cd {repo_dir} && "
|
||||
"python -m venv .venv && "
|
||||
". .venv/bin/activate && "
|
||||
"python -m pip install -U pip setuptools wheel && "
|
||||
"python -m pip install -e . && "
|
||||
"python -m pip install pytest"
|
||||
)
|
||||
setup_res = await exec_tool(
|
||||
ToolCall(name="terminal", arguments={"command": setup_cmd, "timeout": self.config.install_timeout_s})
|
||||
)
|
||||
verification_messages = [{"role": "user", "content": setup_res.to_xml()}]
|
||||
if not setup_res.success:
|
||||
return 0.0, {
|
||||
"verification_mode": "dataset_tests",
|
||||
"phase": "install",
|
||||
"error": setup_res.error,
|
||||
"output": setup_res.output,
|
||||
"verification_messages": verification_messages,
|
||||
}
|
||||
|
||||
chunks = self._chunk_nodeids(nodeids, max_per_chunk=50)
|
||||
for chunk_idx, chunk in enumerate(chunks):
|
||||
joined = " ".join(chunk)
|
||||
cmd = f"cd {repo_dir} && . .venv/bin/activate && python -m pytest -q {joined}"
|
||||
res = await exec_tool(
|
||||
ToolCall(
|
||||
name="terminal",
|
||||
arguments={"command": cmd, "timeout": self.config.test_timeout_s},
|
||||
)
|
||||
)
|
||||
verification_messages.append({"role": "user", "content": res.to_xml()})
|
||||
if not res.success:
|
||||
return 0.0, {
|
||||
"verification_mode": "dataset_tests",
|
||||
"phase": "pytest",
|
||||
"failed_chunk": chunk_idx,
|
||||
"error": res.error,
|
||||
"output": res.output,
|
||||
"verification_messages": verification_messages,
|
||||
}
|
||||
|
||||
return 1.0, {"verification_mode": "dataset_tests", "passed": True, "verification_messages": verification_messages}
|
||||
|
||||
async def score_trajectory(self, item: Item, final_response: str) -> float:
|
||||
# Not used; scoring happens in verify_and_score_trajectory.
|
||||
_ = (item, final_response)
|
||||
return 0.0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
SweSmithOracleEnv.cli()
|
||||
@@ -1,217 +0,0 @@
|
||||
"""
|
||||
Simple test environment for validating the atropos-agent setup.
|
||||
|
||||
This environment uses a local OpenAI-compatible server for LLM testing to verify:
|
||||
- BaseEnv extension works correctly
|
||||
- API communication via OpenAI-compatible endpoint
|
||||
- Basic trajectory collection
|
||||
|
||||
This is a minimal environment for testing, not production use.
|
||||
"""
|
||||
|
||||
import os
|
||||
from typing import Dict, List, Optional, Tuple
|
||||
|
||||
from dotenv import load_dotenv
|
||||
from pydantic import Field
|
||||
|
||||
from atroposlib.envs.base import (
|
||||
APIServerConfig,
|
||||
Item,
|
||||
)
|
||||
|
||||
from ..agent import AgentConfig
|
||||
from .agent_env import AgentEnv, AgentEnvConfig
|
||||
|
||||
# Load environment variables from .env file
|
||||
load_dotenv()
|
||||
|
||||
|
||||
# Simple test prompts for validation
|
||||
TEST_PROMPTS = [
|
||||
{
|
||||
"prompt": "What is 2 + 2? Answer with just the number.",
|
||||
"expected": "4",
|
||||
},
|
||||
{
|
||||
"prompt": "What is the capital of France? Answer with just the city name.",
|
||||
"expected": "Paris",
|
||||
},
|
||||
{
|
||||
"prompt": "What color is the sky on a clear day? Answer with just the color.",
|
||||
"expected": "Blue",
|
||||
},
|
||||
{
|
||||
"prompt": "How many days are in a week? Answer with just the number.",
|
||||
"expected": "7",
|
||||
},
|
||||
{
|
||||
"prompt": "What is 10 * 5? Answer with just the number.",
|
||||
"expected": "50",
|
||||
},
|
||||
]
|
||||
|
||||
SYSTEM_PROMPT = (
|
||||
"You are a helpful assistant. Answer questions concisely and directly. "
|
||||
"When asked for a simple answer, provide just that answer without explanation."
|
||||
)
|
||||
|
||||
|
||||
class SimpleTestEnvConfig(AgentEnvConfig):
|
||||
"""Configuration for the simple test environment."""
|
||||
|
||||
server_base_url: str = Field(
|
||||
default="http://127.0.0.1:8080",
|
||||
description="Base URL for an OpenAI-compatible server (without /v1)",
|
||||
)
|
||||
server_model: str = Field(
|
||||
default="hermes-4-36b",
|
||||
description="Model name",
|
||||
)
|
||||
tokenizer_name: str = Field(default="NousResearch/Hermes-4.3-36B", description="Tokenizer name for RL tokenization")
|
||||
|
||||
|
||||
class SimpleTestEnv(AgentEnv[SimpleTestEnvConfig]):
|
||||
"""
|
||||
A simple test environment to validate the atropos-agent setup.
|
||||
|
||||
Uses a local OpenAI-compatible LLM endpoint with basic question-answering tasks.
|
||||
Scoring is based on whether the response contains the expected answer.
|
||||
"""
|
||||
|
||||
name = "simple_test_env"
|
||||
env_config_cls = SimpleTestEnvConfig
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
config: SimpleTestEnvConfig,
|
||||
server_configs: List[APIServerConfig],
|
||||
slurm: bool = False,
|
||||
testing: bool = False,
|
||||
):
|
||||
super().__init__(config, server_configs, slurm, testing)
|
||||
self.iter = 0
|
||||
self.test_prompts = TEST_PROMPTS
|
||||
self.percent_correct_buffer: List[float] = []
|
||||
|
||||
@classmethod
|
||||
def config_init(cls) -> Tuple[SimpleTestEnvConfig, List[APIServerConfig]]:
|
||||
"""
|
||||
Initialize configuration with local server settings from environment variables.
|
||||
"""
|
||||
base_url = (
|
||||
os.getenv("ATROPOS_SERVER_BASE_URL")
|
||||
or os.getenv("OPENAI_BASE_URL")
|
||||
or os.getenv("LLM_BASE_URL")
|
||||
or "http://127.0.0.1:8080"
|
||||
)
|
||||
model = os.getenv("ATROPOS_SERVER_MODEL") or os.getenv("LLM_MODEL") or "hermes-4-36b"
|
||||
api_key = os.getenv("ATROPOS_SERVER_API_KEY") or os.getenv("NOUS_API_KEY") or os.getenv("OPENAI_API_KEY") or "local"
|
||||
|
||||
env_config = SimpleTestEnvConfig(
|
||||
tokenizer_name=os.getenv("ATROPOS_TOKENIZER_NAME") or "NousResearch/Hermes-4.3-36B",
|
||||
group_size=4,
|
||||
use_wandb=False, # Disable wandb for simple testing
|
||||
rollout_server_url="http://localhost:8000",
|
||||
total_steps=10,
|
||||
batch_size=16,
|
||||
steps_per_eval=5,
|
||||
max_token_length=2048,
|
||||
inference_weight=1.0,
|
||||
wandb_name="simple_test",
|
||||
server_base_url=base_url,
|
||||
server_model=model,
|
||||
)
|
||||
|
||||
# OpenAI-compatible servers typically expose chat completions at /v1.
|
||||
server_configs = [
|
||||
APIServerConfig(
|
||||
model_name=model,
|
||||
base_url=f"{base_url}/v1",
|
||||
api_key=api_key,
|
||||
num_max_requests_at_once=4,
|
||||
num_requests_for_eval=8,
|
||||
timeout=120, # Local models may be slower
|
||||
),
|
||||
]
|
||||
|
||||
return env_config, server_configs
|
||||
|
||||
async def setup_agent_env(self):
|
||||
"""Setup the environment - load test data."""
|
||||
print(f"SimpleTestEnv setup complete. {len(self.test_prompts)} test prompts loaded.")
|
||||
print(f"Using server at: {self.config.server_base_url}")
|
||||
print(f"Model: {self.config.server_model}")
|
||||
|
||||
async def get_next_item(self) -> Item:
|
||||
"""Get the next test prompt."""
|
||||
item = self.test_prompts[self.iter % len(self.test_prompts)]
|
||||
self.iter += 1
|
||||
return item
|
||||
|
||||
def build_task(self, item: Item) -> str:
|
||||
return item["prompt"]
|
||||
|
||||
def build_agent_config(self, item: Item) -> AgentConfig: # noqa: ARG002
|
||||
return AgentConfig(
|
||||
max_steps=5,
|
||||
temperature=0.7,
|
||||
max_tokens=256,
|
||||
system_prompt=SYSTEM_PROMPT,
|
||||
)
|
||||
|
||||
async def score_trajectory(self, item: Item, final_response: str) -> float:
|
||||
expected = item["expected"].lower()
|
||||
response_lower = (final_response or "").lower()
|
||||
score = 1.0 if expected in response_lower else 0.0
|
||||
self.percent_correct_buffer.append(score)
|
||||
return score
|
||||
|
||||
async def evaluate(self, *args, **kwargs):
|
||||
"""
|
||||
Simple evaluation - run through all test prompts once.
|
||||
"""
|
||||
correct = 0
|
||||
total = len(self.test_prompts)
|
||||
|
||||
for item in self.test_prompts:
|
||||
messages = [
|
||||
{"role": "system", "content": SYSTEM_PROMPT},
|
||||
{"role": "user", "content": item["prompt"]},
|
||||
]
|
||||
|
||||
response = await self.server.chat_completion(
|
||||
messages=messages,
|
||||
n=1,
|
||||
max_tokens=256,
|
||||
temperature=0.0, # Greedy for eval
|
||||
split="eval",
|
||||
)
|
||||
|
||||
response_text = response.choices[0].message.content or ""
|
||||
expected = item["expected"].lower()
|
||||
|
||||
if expected in response_text.lower():
|
||||
correct += 1
|
||||
|
||||
accuracy = correct / total
|
||||
print(f"Evaluation: {correct}/{total} = {accuracy:.2%} accuracy")
|
||||
return {"eval_accuracy": accuracy}
|
||||
|
||||
async def wandb_log(self, wandb_metrics: Optional[Dict] = None):
|
||||
"""Log metrics (simplified for testing)."""
|
||||
if wandb_metrics is None:
|
||||
wandb_metrics = {}
|
||||
|
||||
if self.percent_correct_buffer:
|
||||
avg_correct = sum(self.percent_correct_buffer) / len(self.percent_correct_buffer)
|
||||
wandb_metrics["train/percent_correct"] = avg_correct
|
||||
print(f"Train accuracy: {avg_correct:.2%}")
|
||||
self.percent_correct_buffer = []
|
||||
|
||||
await super().wandb_log(wandb_metrics)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
# Allow running as CLI
|
||||
SimpleTestEnv.cli()
|
||||
@@ -1,165 +0,0 @@
|
||||
"""
|
||||
ToolServer routing smoke environment.
|
||||
|
||||
Validates that:
|
||||
- sandbox tools run through Nomad SlotPool (terminal -> bash in sandbox)
|
||||
- external tools run through ToolServer (skills_list)
|
||||
|
||||
This env uses ToolServer in-process by default (`tool_server_url="inprocess"`),
|
||||
so it is self-contained for local testing.
|
||||
|
||||
Run:
|
||||
uv run python -m atropos.envs.toolserver_smoke_env process --env.use_wandb false --env.total_steps 1 --env.group_size 1
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
from typing import Any, Dict, List, Tuple
|
||||
|
||||
from dotenv import load_dotenv
|
||||
from pydantic import Field
|
||||
|
||||
from atroposlib.envs.base import APIServerConfig, Item
|
||||
|
||||
from ..agent import AgentConfig, AgentResult
|
||||
from .agent_env import AgentEnv, AgentEnvConfig
|
||||
|
||||
load_dotenv()
|
||||
|
||||
|
||||
class ToolServerSmokeEnvConfig(AgentEnvConfig):
|
||||
server_base_url: str = Field(
|
||||
default="http://127.0.0.1:8080",
|
||||
description="Base URL for an OpenAI-compatible chat server (without /v1).",
|
||||
)
|
||||
server_model: str = Field(default="hermes-4-36b", description="Model name")
|
||||
tokenizer_name: str = Field(default="NousResearch/Hermes-4.3-36B", description="Tokenizer name for RL tokenization")
|
||||
|
||||
|
||||
class ToolServerSmokeEnv(AgentEnv[ToolServerSmokeEnvConfig]):
|
||||
name = "toolserver_smoke_env"
|
||||
env_config_cls = ToolServerSmokeEnvConfig
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
config: ToolServerSmokeEnvConfig,
|
||||
server_configs: List[APIServerConfig],
|
||||
slurm: bool = False,
|
||||
testing: bool = False,
|
||||
):
|
||||
super().__init__(config, server_configs, slurm, testing)
|
||||
self._iter = 0
|
||||
|
||||
@classmethod
|
||||
def config_init(cls) -> Tuple[ToolServerSmokeEnvConfig, List[APIServerConfig]]:
|
||||
base_url = (
|
||||
os.getenv("ATROPOS_SERVER_BASE_URL")
|
||||
or os.getenv("OPENAI_BASE_URL")
|
||||
or os.getenv("LLM_BASE_URL")
|
||||
or "http://127.0.0.1:8080"
|
||||
)
|
||||
model = os.getenv("ATROPOS_SERVER_MODEL") or os.getenv("LLM_MODEL") or "hermes-4-36b"
|
||||
api_key = os.getenv("ATROPOS_SERVER_API_KEY") or os.getenv("NOUS_API_KEY") or os.getenv("OPENAI_API_KEY") or "local"
|
||||
|
||||
env_config = ToolServerSmokeEnvConfig(
|
||||
tokenizer_name=os.getenv("ATROPOS_TOKENIZER_NAME") or "NousResearch/Hermes-4.3-36B",
|
||||
group_size=1,
|
||||
use_wandb=False,
|
||||
include_messages=True,
|
||||
ensure_scores_are_not_same=False,
|
||||
total_steps=1,
|
||||
batch_size=1,
|
||||
server_base_url=base_url,
|
||||
server_model=model,
|
||||
enabled_toolsets=["terminal", "skills"],
|
||||
disabled_toolsets=[],
|
||||
# Self-contained ToolServer for local smoke.
|
||||
tool_server_url="inprocess",
|
||||
sandbox_image=os.getenv("ATROPOS_SANDBOX_IMAGE") or "atropos-sandbox:local",
|
||||
purge_job_on_start=True,
|
||||
purge_job_on_shutdown=True,
|
||||
)
|
||||
|
||||
server_configs = [
|
||||
APIServerConfig(
|
||||
model_name=model,
|
||||
base_url=f"{base_url.rstrip('/')}/v1",
|
||||
api_key=api_key,
|
||||
num_max_requests_at_once=1,
|
||||
num_requests_for_eval=1,
|
||||
timeout=120,
|
||||
)
|
||||
]
|
||||
return env_config, server_configs
|
||||
|
||||
async def setup_agent_env(self) -> None:
|
||||
return None
|
||||
|
||||
async def get_next_item(self) -> Item:
|
||||
self._iter += 1
|
||||
return {
|
||||
"prompt": (
|
||||
"You MUST call exactly one tool per assistant message.\n"
|
||||
"\n"
|
||||
"Step 1) Call the skills_list tool (no arguments), then stop.\n"
|
||||
"Step 2) After you receive the tool response, call the terminal tool to run:\n"
|
||||
"python -c \"print('ok')\"\n"
|
||||
"Step 3) After you receive the terminal tool response, answer with just: ok\n"
|
||||
"\n"
|
||||
"Tool call format requirements:\n"
|
||||
"- Every tool call MUST be a complete XML block with a closing tag.\n"
|
||||
"- Do NOT emit a second <tool_call> in the same assistant message.\n"
|
||||
"\n"
|
||||
"Example:\n"
|
||||
"<tool_call>{\"name\": \"skills_list\", \"arguments\": {}}</tool_call>\n"
|
||||
"Do not include anything else in your final answer."
|
||||
)
|
||||
}
|
||||
|
||||
def build_task(self, item: Item) -> str:
|
||||
return str(item.get("prompt") or "")
|
||||
|
||||
def build_agent_config(self, item: Item) -> AgentConfig: # noqa: ARG002
|
||||
return AgentConfig(
|
||||
max_steps=min(10, int(self.config.agent_max_steps)),
|
||||
temperature=0.2,
|
||||
max_tokens=None,
|
||||
)
|
||||
|
||||
async def score_trajectory(self, item: Item, final_response: str) -> float:
|
||||
_ = (item, final_response)
|
||||
return 0.0
|
||||
|
||||
async def verify_and_score_trajectory(
|
||||
self,
|
||||
item: Item,
|
||||
final_response: str,
|
||||
*,
|
||||
trajectory_id: str, # noqa: ARG002
|
||||
exec_tool, # noqa: ARG002
|
||||
agent_result: AgentResult | None = None,
|
||||
workspace_meta: Dict[str, Any] | None = None, # noqa: ARG002
|
||||
) -> tuple[float, Dict[str, Any]]:
|
||||
if agent_result is None:
|
||||
return 0.0, {"error": "Missing agent_result"}
|
||||
|
||||
called = {c.name for s in agent_result.steps for c in s.tool_calls}
|
||||
need = {"skills_list", "terminal"}
|
||||
if not need.issubset(called):
|
||||
return 0.0, {"error": f"Missing tool calls: {sorted(need - called)}", "called": sorted(called)}
|
||||
|
||||
terminal_ok = False
|
||||
for step in agent_result.steps:
|
||||
for call, res in zip(step.tool_calls, step.tool_results):
|
||||
if call.name != "terminal":
|
||||
continue
|
||||
if res.success and (res.output or "").strip().splitlines()[-1].strip() == "ok":
|
||||
terminal_ok = True
|
||||
|
||||
score = 1.0 if terminal_ok and (final_response or "").strip() == "ok" else 0.0
|
||||
return score, {"called": sorted(called), "final": (final_response or "").strip()}
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
ToolServerSmokeEnv.cli()
|
||||
@@ -1,11 +0,0 @@
|
||||
"""
|
||||
Nomad integration for atropos-agent.
|
||||
|
||||
Provides:
|
||||
- NomadClient: Client for Nomad HTTP API
|
||||
- Job templates for sandbox containers
|
||||
"""
|
||||
|
||||
from .client import NomadClient
|
||||
|
||||
__all__ = ["NomadClient"]
|
||||
@@ -1,500 +0,0 @@
|
||||
"""
|
||||
Nomad API Client for atropos-agent.
|
||||
|
||||
Provides a simple async client for interacting with the Nomad HTTP API:
|
||||
- Submit/stop jobs
|
||||
- Query allocations
|
||||
- Get allocation addresses
|
||||
- Scale jobs up/down
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import json
|
||||
import os
|
||||
from dataclasses import dataclass, field
|
||||
from enum import Enum
|
||||
from pathlib import Path
|
||||
from typing import Any, Dict, List, Optional
|
||||
|
||||
import aiohttp
|
||||
|
||||
|
||||
class AllocationStatus(Enum):
|
||||
"""Nomad allocation status."""
|
||||
PENDING = "pending"
|
||||
RUNNING = "running"
|
||||
COMPLETE = "complete"
|
||||
FAILED = "failed"
|
||||
LOST = "lost"
|
||||
|
||||
|
||||
@dataclass
|
||||
class Allocation:
|
||||
"""Information about a Nomad allocation."""
|
||||
id: str
|
||||
job_id: str
|
||||
task_group: str
|
||||
node_id: str
|
||||
status: AllocationStatus
|
||||
# Network info for reaching the allocation
|
||||
address: Optional[str] = None
|
||||
port: Optional[int] = None
|
||||
|
||||
@property
|
||||
def http_address(self) -> Optional[str]:
|
||||
"""Get full HTTP address for the allocation."""
|
||||
if self.address and self.port:
|
||||
return f"http://{self.address}:{self.port}"
|
||||
return None
|
||||
|
||||
|
||||
@dataclass
|
||||
class JobStatus:
|
||||
"""Status of a Nomad job."""
|
||||
id: str
|
||||
name: str
|
||||
status: str
|
||||
allocations: List[Allocation] = field(default_factory=list)
|
||||
count: int = 0 # Number of task groups
|
||||
|
||||
|
||||
class NomadClient:
|
||||
"""
|
||||
Async client for Nomad HTTP API.
|
||||
|
||||
Usage:
|
||||
client = NomadClient(address="http://localhost:4646")
|
||||
|
||||
# Submit a job
|
||||
await client.submit_job(job_spec)
|
||||
|
||||
# Get allocations
|
||||
allocs = await client.get_job_allocations("sandbox-python")
|
||||
|
||||
# Scale job
|
||||
await client.scale_job("sandbox-python", count=5)
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
address: str = "http://localhost:4646",
|
||||
token: Optional[str] = None,
|
||||
timeout: float = 30.0,
|
||||
):
|
||||
self.address = address.rstrip("/")
|
||||
self.token = token or os.environ.get("NOMAD_TOKEN")
|
||||
self.timeout = aiohttp.ClientTimeout(total=timeout)
|
||||
self._session: Optional[aiohttp.ClientSession] = None
|
||||
|
||||
async def _get_session(self) -> aiohttp.ClientSession:
|
||||
"""Get or create HTTP session."""
|
||||
if self._session is None or self._session.closed:
|
||||
headers = {}
|
||||
if self.token:
|
||||
headers["X-Nomad-Token"] = self.token
|
||||
self._session = aiohttp.ClientSession(
|
||||
timeout=self.timeout,
|
||||
headers=headers,
|
||||
)
|
||||
return self._session
|
||||
|
||||
async def close(self):
|
||||
"""Close the HTTP session."""
|
||||
if self._session and not self._session.closed:
|
||||
await self._session.close()
|
||||
|
||||
async def __aenter__(self):
|
||||
return self
|
||||
|
||||
async def __aexit__(self, exc_type, exc_val, exc_tb):
|
||||
await self.close()
|
||||
|
||||
async def _request(
|
||||
self,
|
||||
method: str,
|
||||
path: str,
|
||||
data: Optional[Dict[str, Any]] = None,
|
||||
) -> Dict[str, Any]:
|
||||
"""Make an HTTP request to Nomad API."""
|
||||
session = await self._get_session()
|
||||
url = f"{self.address}{path}"
|
||||
|
||||
try:
|
||||
async with session.request(method, url, json=data) as response:
|
||||
if response.status == 404:
|
||||
return {"error": "not_found", "status": 404}
|
||||
|
||||
text = await response.text()
|
||||
if not text:
|
||||
return {"status": response.status}
|
||||
|
||||
try:
|
||||
result = json.loads(text)
|
||||
except json.JSONDecodeError:
|
||||
return {"text": text, "status": response.status}
|
||||
|
||||
if response.status >= 400:
|
||||
return {"error": result, "status": response.status}
|
||||
|
||||
return result if isinstance(result, dict) else {"data": result, "status": response.status}
|
||||
|
||||
except aiohttp.ClientError as e:
|
||||
return {"error": str(e), "status": 0}
|
||||
|
||||
# Job Operations
|
||||
|
||||
async def submit_job(self, job_spec: Dict[str, Any]) -> Dict[str, Any]:
|
||||
"""
|
||||
Submit a job to Nomad.
|
||||
|
||||
Args:
|
||||
job_spec: Job specification dict (HCL converted to JSON)
|
||||
|
||||
Returns:
|
||||
Response with EvalID if successful
|
||||
"""
|
||||
return await self._request("POST", "/v1/jobs", {"Job": job_spec})
|
||||
|
||||
async def stop_job(self, job_id: str, purge: bool = False) -> Dict[str, Any]:
|
||||
"""
|
||||
Stop (and optionally purge) a job.
|
||||
|
||||
Args:
|
||||
job_id: Job identifier
|
||||
purge: If True, completely remove the job
|
||||
"""
|
||||
path = f"/v1/job/{job_id}"
|
||||
if purge:
|
||||
path += "?purge=true"
|
||||
return await self._request("DELETE", path)
|
||||
|
||||
async def get_job(self, job_id: str) -> Optional[Dict[str, Any]]:
|
||||
"""Get job details."""
|
||||
result = await self._request("GET", f"/v1/job/{job_id}")
|
||||
if "error" in result and result.get("status") == 404:
|
||||
return None
|
||||
return result
|
||||
|
||||
async def get_job_status(self, job_id: str) -> Optional[JobStatus]:
|
||||
"""Get job status with allocations."""
|
||||
job = await self.get_job(job_id)
|
||||
if not job:
|
||||
return None
|
||||
|
||||
allocs = await self.get_job_allocations(job_id)
|
||||
|
||||
# Get count from task groups
|
||||
count = 0
|
||||
task_groups = job.get("TaskGroups", [])
|
||||
for tg in task_groups:
|
||||
count += tg.get("Count", 1)
|
||||
|
||||
return JobStatus(
|
||||
id=job_id,
|
||||
name=job.get("Name", job_id),
|
||||
status=job.get("Status", "unknown"),
|
||||
allocations=allocs,
|
||||
count=count,
|
||||
)
|
||||
|
||||
# Allocation Operations
|
||||
|
||||
async def get_job_allocations(self, job_id: str) -> List[Allocation]:
|
||||
"""Get all allocations for a job."""
|
||||
result = await self._request("GET", f"/v1/job/{job_id}/allocations")
|
||||
|
||||
if "error" in result:
|
||||
return []
|
||||
|
||||
allocs_data = result.get("data", result) if isinstance(result, dict) else result
|
||||
if not isinstance(allocs_data, list):
|
||||
return []
|
||||
|
||||
allocations = []
|
||||
for alloc_data in allocs_data:
|
||||
# Parse allocation info
|
||||
alloc_id = alloc_data.get("ID", "")
|
||||
status_str = alloc_data.get("ClientStatus", "unknown")
|
||||
|
||||
try:
|
||||
status = AllocationStatus(status_str)
|
||||
except ValueError:
|
||||
status = AllocationStatus.PENDING
|
||||
|
||||
# Get network info - need to fetch detailed allocation for this
|
||||
address = None
|
||||
port = None
|
||||
|
||||
# First try the summary data
|
||||
resources = alloc_data.get("AllocatedResources") or {}
|
||||
shared = resources.get("Shared") or {}
|
||||
networks = shared.get("Networks") or []
|
||||
|
||||
# If no networks in summary, fetch detailed allocation
|
||||
if not networks and alloc_id:
|
||||
detailed = await self.get_allocation(alloc_id)
|
||||
if detailed:
|
||||
resources = detailed.get("AllocatedResources") or {}
|
||||
shared = resources.get("Shared") or {}
|
||||
networks = shared.get("Networks") or []
|
||||
|
||||
if networks:
|
||||
network = networks[0]
|
||||
address = network.get("IP")
|
||||
# Look for dynamic ports OR reserved ports (Singularity/raw_exec uses reserved)
|
||||
dyn_ports = network.get("DynamicPorts") or []
|
||||
reserved_ports = network.get("ReservedPorts") or []
|
||||
for dp in dyn_ports + reserved_ports:
|
||||
if dp.get("Label") == "http":
|
||||
port = dp.get("Value")
|
||||
break
|
||||
|
||||
allocations.append(Allocation(
|
||||
id=alloc_id,
|
||||
job_id=job_id,
|
||||
task_group=alloc_data.get("TaskGroup", ""),
|
||||
node_id=alloc_data.get("NodeID", ""),
|
||||
status=status,
|
||||
address=address,
|
||||
port=port,
|
||||
))
|
||||
|
||||
return allocations
|
||||
|
||||
async def get_allocation(self, alloc_id: str) -> Optional[Dict[str, Any]]:
|
||||
"""Get detailed allocation info."""
|
||||
result = await self._request("GET", f"/v1/allocation/{alloc_id}")
|
||||
if "error" in result and result.get("status") == 404:
|
||||
return None
|
||||
return result
|
||||
|
||||
# Scaling Operations
|
||||
|
||||
async def scale_job(self, job_id: str, count: int, task_group: str = "sandbox") -> Dict[str, Any]:
|
||||
"""
|
||||
Scale a job's task group to specified count.
|
||||
|
||||
Args:
|
||||
job_id: Job identifier
|
||||
count: Desired number of allocations
|
||||
task_group: Name of task group to scale
|
||||
"""
|
||||
payload = {
|
||||
"Count": count,
|
||||
"Target": {
|
||||
"Group": task_group,
|
||||
},
|
||||
}
|
||||
return await self._request("POST", f"/v1/job/{job_id}/scale", payload)
|
||||
|
||||
async def get_job_scale_status(self, job_id: str) -> Dict[str, int]:
|
||||
"""
|
||||
Get current scale status for a job.
|
||||
|
||||
Returns:
|
||||
Dict mapping task group name to count
|
||||
"""
|
||||
result = await self._request("GET", f"/v1/job/{job_id}/scale")
|
||||
|
||||
if "error" in result:
|
||||
return {}
|
||||
|
||||
task_groups = result.get("TaskGroups", {})
|
||||
return {
|
||||
name: info.get("Running", 0)
|
||||
for name, info in task_groups.items()
|
||||
}
|
||||
|
||||
# Health Check
|
||||
|
||||
async def is_healthy(self) -> bool:
|
||||
"""Check if Nomad is reachable and healthy."""
|
||||
try:
|
||||
result = await self._request("GET", "/v1/status/leader")
|
||||
return "error" not in result
|
||||
except Exception:
|
||||
return False
|
||||
|
||||
async def get_leader(self) -> Optional[str]:
|
||||
"""Get current Nomad leader address."""
|
||||
result = await self._request("GET", "/v1/status/leader")
|
||||
if isinstance(result, dict) and "data" in result:
|
||||
return result["data"]
|
||||
return None
|
||||
|
||||
|
||||
def load_job_template(
|
||||
template_name: str = "sandbox",
|
||||
**kwargs,
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Load and configure a job template.
|
||||
|
||||
Args:
|
||||
template_name: Name of template (e.g., "sandbox")
|
||||
**kwargs: Template variables to substitute
|
||||
|
||||
Returns:
|
||||
Job specification dict ready for Nomad API
|
||||
"""
|
||||
# Default job template for sandbox container
|
||||
if template_name == "sandbox":
|
||||
return create_sandbox_job(**kwargs)
|
||||
else:
|
||||
raise ValueError(f"Unknown template: {template_name}")
|
||||
|
||||
|
||||
def create_sandbox_job(
|
||||
job_id: str = "atropos-sandbox",
|
||||
image: str = "atropos-sandbox:local", # Use :local tag to avoid registry pull
|
||||
count: int = 1,
|
||||
slots_per_container: int = 10,
|
||||
privileged: bool = False,
|
||||
cpu: int = 500,
|
||||
memory: int = 512,
|
||||
port: int = 8080,
|
||||
datacenter: str = "dc1",
|
||||
driver: str = "docker", # "docker" or "singularity"
|
||||
singularity_image: str = None, # Path to .sif file for singularity driver
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Create a sandbox job specification.
|
||||
|
||||
This job runs the sandbox_server.py inside a container,
|
||||
with the specified number of slots for agent workspaces.
|
||||
|
||||
Args:
|
||||
job_id: Unique job identifier
|
||||
image: Docker image to use (for docker driver)
|
||||
count: Number of container instances
|
||||
slots_per_container: Number of slots per container
|
||||
privileged: Run container in privileged mode (recommended for bubblewrap)
|
||||
cpu: CPU allocation in MHz
|
||||
memory: Memory allocation in MB
|
||||
port: HTTP port for sandbox server
|
||||
datacenter: Nomad datacenter
|
||||
driver: Container driver - "docker" or "singularity"
|
||||
singularity_image: Path to .sif file (required if driver="singularity")
|
||||
|
||||
Returns:
|
||||
Job specification dict
|
||||
"""
|
||||
# Build task config based on driver
|
||||
if driver == "singularity":
|
||||
if not singularity_image:
|
||||
raise ValueError("singularity_image path required when driver='singularity'")
|
||||
|
||||
# Use raw_exec driver to run apptainer via shell for variable expansion
|
||||
# The container binds the allocation directory for workspace persistence
|
||||
# For raw_exec, we use static port since Nomad's dynamic port mapping doesn't
|
||||
# work the same as Docker - the process runs directly on the host.
|
||||
shell_cmd = (
|
||||
f'apptainer run '
|
||||
f'--bind "$NOMAD_ALLOC_DIR/data:/data" '
|
||||
f'--pwd /app '
|
||||
f'--env PYTHONUNBUFFERED=1 '
|
||||
f'{singularity_image} '
|
||||
f'python sandbox_server.py '
|
||||
f'--port {port} '
|
||||
f'--slots {slots_per_container} '
|
||||
f'--data-dir /data'
|
||||
)
|
||||
task_config = {
|
||||
"command": "/bin/sh",
|
||||
"args": ["-c", shell_cmd],
|
||||
}
|
||||
task_driver = "raw_exec"
|
||||
else:
|
||||
# Docker driver (default)
|
||||
task_config = {
|
||||
"image": image,
|
||||
"force_pull": False, # Use local image, don't try to pull
|
||||
"ports": ["http"],
|
||||
"privileged": privileged,
|
||||
"command": "python",
|
||||
"args": [
|
||||
"sandbox_server.py",
|
||||
"--port", str(port),
|
||||
"--slots", str(slots_per_container),
|
||||
"--data-dir", "/data",
|
||||
],
|
||||
# Note: On Linux, you can mount persistent storage:
|
||||
# "volumes": ["${NOMAD_ALLOC_DIR}/data:/data"],
|
||||
# On macOS/Docker Desktop, skip volumes for PoC
|
||||
# (container /data is ephemeral but works for testing)
|
||||
}
|
||||
task_driver = "docker"
|
||||
|
||||
# For Singularity/raw_exec, use static ports since the process runs directly on host.
|
||||
# For Docker, use dynamic ports with port mapping.
|
||||
if driver == "singularity":
|
||||
network_config = {
|
||||
"Mode": "host",
|
||||
"ReservedPorts": [
|
||||
{
|
||||
"Label": "http",
|
||||
"Value": port,
|
||||
}
|
||||
],
|
||||
}
|
||||
else:
|
||||
network_config = {
|
||||
"Mode": "host",
|
||||
"DynamicPorts": [
|
||||
{
|
||||
"Label": "http",
|
||||
"To": port,
|
||||
}
|
||||
],
|
||||
}
|
||||
|
||||
return {
|
||||
"ID": job_id,
|
||||
"Name": job_id,
|
||||
"Type": "service",
|
||||
"Datacenters": [datacenter],
|
||||
"TaskGroups": [
|
||||
{
|
||||
"Name": "sandbox",
|
||||
"Count": count,
|
||||
# Speed up deployments and avoid Consul checks. Without this, Nomad may
|
||||
# keep an "active deployment" around for the default MinHealthyTime,
|
||||
# which blocks immediate scaling under load.
|
||||
"Update": {
|
||||
"HealthCheck": "task_states",
|
||||
"MinHealthyTime": 0,
|
||||
},
|
||||
"Networks": [network_config],
|
||||
"Tasks": [
|
||||
{
|
||||
"Name": "sandbox-server",
|
||||
"Driver": task_driver,
|
||||
"Config": task_config,
|
||||
"Env": {
|
||||
"PYTHONUNBUFFERED": "1",
|
||||
"NOMAD_ALLOC_DIR": "${NOMAD_ALLOC_DIR}",
|
||||
},
|
||||
"Resources": {
|
||||
"CPU": cpu,
|
||||
"MemoryMB": memory,
|
||||
},
|
||||
# Note: Services with Checks require Consul, which we skip for the PoC
|
||||
}
|
||||
],
|
||||
"RestartPolicy": {
|
||||
"Attempts": 3,
|
||||
"Interval": 300_000_000_000, # 5 minutes
|
||||
"Delay": 10_000_000_000, # 10 seconds
|
||||
"Mode": "delay",
|
||||
},
|
||||
"ReschedulePolicy": {
|
||||
"Attempts": 5,
|
||||
"Interval": 3600_000_000_000, # 1 hour
|
||||
"Delay": 30_000_000_000, # 30 seconds
|
||||
"DelayFunction": "exponential",
|
||||
"MaxDelay": 300_000_000_000, # 5 minutes
|
||||
"Unlimited": False,
|
||||
},
|
||||
}
|
||||
],
|
||||
}
|
||||
File diff suppressed because it is too large
Load Diff
@@ -1,20 +0,0 @@
|
||||
"""
|
||||
Slot-based multiplexing for atropos-agent.
|
||||
|
||||
Provides:
|
||||
- Slot: Isolated workspace for a single trajectory
|
||||
- SlotPool: Manages slots across Nomad allocations
|
||||
- SandboxExecutor: Executes tools in sandbox containers
|
||||
"""
|
||||
|
||||
from .executor import SandboxExecutor
|
||||
from .pool import SlotPool, SlotPoolConfig
|
||||
from .slot import Slot, SlotState
|
||||
|
||||
__all__ = [
|
||||
"Slot",
|
||||
"SlotState",
|
||||
"SlotPool",
|
||||
"SlotPoolConfig",
|
||||
"SandboxExecutor",
|
||||
]
|
||||
@@ -1,457 +0,0 @@
|
||||
"""
|
||||
SandboxExecutor - HTTP client for sandbox container communication.
|
||||
|
||||
Sends tool execution requests to sandbox_server.py running inside Nomad containers.
|
||||
Supports single and batch execution for efficiency.
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import uuid
|
||||
from dataclasses import dataclass, field
|
||||
from typing import Any, Dict, List, Optional, Tuple
|
||||
|
||||
import aiohttp
|
||||
|
||||
from .slot import Slot, SlotState
|
||||
from ..tools.base import ToolCall, ToolResult
|
||||
|
||||
|
||||
@dataclass
|
||||
class ExecutionRequest:
|
||||
"""Request to execute a tool in a slot."""
|
||||
slot: Slot
|
||||
tool_name: str
|
||||
args: Dict[str, Any]
|
||||
execution_id: str = field(default_factory=lambda: str(uuid.uuid4()))
|
||||
timeout: float = 30.0
|
||||
|
||||
|
||||
@dataclass
|
||||
class ExecutionResult:
|
||||
"""Result from sandbox execution."""
|
||||
success: bool
|
||||
output: str = ""
|
||||
error: str = ""
|
||||
execution_id: str = ""
|
||||
slot_id: str = ""
|
||||
metadata: Dict[str, Any] = field(default_factory=dict)
|
||||
|
||||
def to_tool_result(self) -> ToolResult:
|
||||
"""Convert to ToolResult for agent consumption."""
|
||||
return ToolResult(
|
||||
success=self.success,
|
||||
output=self.output,
|
||||
error=self.error,
|
||||
metadata=self.metadata,
|
||||
uniq_id=self.execution_id,
|
||||
)
|
||||
|
||||
|
||||
class SandboxExecutor:
|
||||
"""
|
||||
HTTP client for executing tools in sandbox containers.
|
||||
|
||||
Communicates with sandbox_server.py running inside Nomad allocations.
|
||||
Supports both single execution and batched parallel execution.
|
||||
|
||||
Usage:
|
||||
executor = SandboxExecutor()
|
||||
|
||||
# Single execution
|
||||
result = await executor.execute(slot, "bash", {"command": "ls"})
|
||||
|
||||
# Batch execution
|
||||
results = await executor.execute_batch([
|
||||
(slot1, "bash", {"command": "ls"}),
|
||||
(slot2, "write_file", {"path": "test.txt", "content": "hello"}),
|
||||
])
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
timeout: float = 30.0,
|
||||
max_retries: int = 3,
|
||||
retry_delay: float = 1.0,
|
||||
):
|
||||
self.timeout = aiohttp.ClientTimeout(total=timeout)
|
||||
self.max_retries = max_retries
|
||||
self.retry_delay = retry_delay
|
||||
self._session: Optional[aiohttp.ClientSession] = None
|
||||
|
||||
async def _get_session(self) -> aiohttp.ClientSession:
|
||||
"""Get or create HTTP session."""
|
||||
if self._session is None or self._session.closed:
|
||||
self._session = aiohttp.ClientSession(timeout=self.timeout)
|
||||
return self._session
|
||||
|
||||
async def close(self):
|
||||
"""Close HTTP session."""
|
||||
if self._session and not self._session.closed:
|
||||
await self._session.close()
|
||||
|
||||
async def __aenter__(self):
|
||||
return self
|
||||
|
||||
async def __aexit__(self, exc_type, exc_val, exc_tb):
|
||||
await self.close()
|
||||
|
||||
async def execute(
|
||||
self,
|
||||
slot: Slot,
|
||||
tool_name: str,
|
||||
args: Dict[str, Any],
|
||||
timeout: Optional[float] = None,
|
||||
) -> ExecutionResult:
|
||||
"""
|
||||
Execute a tool in a slot's workspace.
|
||||
|
||||
Args:
|
||||
slot: Slot to execute in
|
||||
tool_name: Name of tool (bash, read_file, write_file)
|
||||
args: Tool arguments
|
||||
timeout: Optional timeout override
|
||||
|
||||
Returns:
|
||||
ExecutionResult with output or error
|
||||
"""
|
||||
execution_id = str(uuid.uuid4())
|
||||
exec_timeout = timeout or self.timeout.total or 30.0
|
||||
|
||||
# Mark slot as executing
|
||||
original_state = slot.state
|
||||
try:
|
||||
if slot.state == SlotState.ACQUIRED:
|
||||
slot.start_execution(execution_id)
|
||||
|
||||
result = await self._send_execute_request(
|
||||
container_addr=slot.container_addr,
|
||||
slot_id=slot.slot_id,
|
||||
tool_name=tool_name,
|
||||
args=args,
|
||||
execution_id=execution_id,
|
||||
timeout=exec_timeout,
|
||||
)
|
||||
result.slot_id = slot.slot_id
|
||||
return result
|
||||
|
||||
finally:
|
||||
# Restore slot state
|
||||
if slot.state == SlotState.EXECUTING:
|
||||
slot.end_execution()
|
||||
|
||||
async def _send_execute_request(
|
||||
self,
|
||||
container_addr: str,
|
||||
slot_id: str,
|
||||
tool_name: str,
|
||||
args: Dict[str, Any],
|
||||
execution_id: str,
|
||||
timeout: float,
|
||||
) -> ExecutionResult:
|
||||
"""Send execution request to sandbox server with retry logic."""
|
||||
session = await self._get_session()
|
||||
url = f"{container_addr}/execute"
|
||||
|
||||
payload = {
|
||||
"slot_id": slot_id,
|
||||
"tool": tool_name,
|
||||
"args": args,
|
||||
"execution_id": execution_id,
|
||||
"timeout": timeout,
|
||||
}
|
||||
|
||||
last_error = None
|
||||
for attempt in range(self.max_retries):
|
||||
try:
|
||||
async with session.post(url, json=payload) as response:
|
||||
data = await response.json()
|
||||
|
||||
return ExecutionResult(
|
||||
success=data.get("success", False),
|
||||
output=data.get("output", ""),
|
||||
error=data.get("error", ""),
|
||||
execution_id=data.get("execution_id", execution_id),
|
||||
metadata=data.get("metadata", {}),
|
||||
)
|
||||
|
||||
except aiohttp.ClientError as e:
|
||||
last_error = str(e)
|
||||
if attempt < self.max_retries - 1:
|
||||
await asyncio.sleep(self.retry_delay * (attempt + 1))
|
||||
continue
|
||||
except asyncio.TimeoutError:
|
||||
last_error = f"Request timed out after {timeout}s"
|
||||
break
|
||||
except Exception as e:
|
||||
last_error = str(e)
|
||||
break
|
||||
|
||||
return ExecutionResult(
|
||||
success=False,
|
||||
error=f"Failed after {self.max_retries} attempts: {last_error}",
|
||||
execution_id=execution_id,
|
||||
)
|
||||
|
||||
async def execute_batch(
|
||||
self,
|
||||
requests: List[Tuple[Slot, str, Dict[str, Any]]],
|
||||
timeout: Optional[float] = None,
|
||||
) -> List[ExecutionResult]:
|
||||
"""
|
||||
Execute multiple tools in parallel across slots.
|
||||
|
||||
This is the key optimization - we batch tool calls to maximize
|
||||
container utilization while agents are waiting for LLM responses.
|
||||
|
||||
Args:
|
||||
requests: List of (slot, tool_name, args) tuples
|
||||
timeout: Optional timeout override
|
||||
|
||||
Returns:
|
||||
List of ExecutionResults in same order as requests
|
||||
"""
|
||||
if not requests:
|
||||
return []
|
||||
|
||||
# Group requests by container address for batch API
|
||||
by_container: Dict[str, List[Tuple[int, Slot, str, Dict[str, Any], str]]] = {}
|
||||
|
||||
for idx, (slot, tool_name, args) in enumerate(requests):
|
||||
execution_id = str(uuid.uuid4())
|
||||
container = slot.container_addr
|
||||
|
||||
if container not in by_container:
|
||||
by_container[container] = []
|
||||
by_container[container].append((idx, slot, tool_name, args, execution_id))
|
||||
|
||||
# Mark slots as executing
|
||||
if slot.state == SlotState.ACQUIRED:
|
||||
slot.start_execution(execution_id)
|
||||
|
||||
# Execute batches in parallel
|
||||
exec_timeout = timeout or self.timeout.total or 30.0
|
||||
batch_tasks = []
|
||||
|
||||
for container_addr, batch_requests in by_container.items():
|
||||
task = self._send_batch_request(
|
||||
container_addr=container_addr,
|
||||
batch_requests=batch_requests,
|
||||
timeout=exec_timeout,
|
||||
)
|
||||
batch_tasks.append(task)
|
||||
|
||||
# Gather all batch results
|
||||
batch_results = await asyncio.gather(*batch_tasks, return_exceptions=True)
|
||||
|
||||
# Collect results in original order
|
||||
results: List[Optional[ExecutionResult]] = [None] * len(requests)
|
||||
|
||||
for batch_result in batch_results:
|
||||
if isinstance(batch_result, Exception):
|
||||
# Mark all in this batch as failed
|
||||
continue
|
||||
|
||||
for idx, result in batch_result:
|
||||
results[idx] = result
|
||||
|
||||
# Fill in any missing results
|
||||
for idx, result in enumerate(results):
|
||||
if result is None:
|
||||
slot, tool_name, args = requests[idx]
|
||||
results[idx] = ExecutionResult(
|
||||
success=False,
|
||||
error="Batch execution failed",
|
||||
slot_id=slot.slot_id,
|
||||
)
|
||||
|
||||
# End execution on all slots
|
||||
for slot, _, _ in requests:
|
||||
if slot.state == SlotState.EXECUTING:
|
||||
slot.end_execution()
|
||||
|
||||
return results # type: ignore
|
||||
|
||||
async def _send_batch_request(
|
||||
self,
|
||||
container_addr: str,
|
||||
batch_requests: List[Tuple[int, Slot, str, Dict[str, Any], str]],
|
||||
timeout: float,
|
||||
) -> List[Tuple[int, ExecutionResult]]:
|
||||
"""Send batch execution request to a single container."""
|
||||
session = await self._get_session()
|
||||
url = f"{container_addr}/batch"
|
||||
|
||||
# Build batch payload
|
||||
payload = [
|
||||
{
|
||||
"slot_id": slot.slot_id,
|
||||
"tool": tool_name,
|
||||
"args": args,
|
||||
"execution_id": execution_id,
|
||||
"timeout": timeout,
|
||||
}
|
||||
for _, slot, tool_name, args, execution_id in batch_requests
|
||||
]
|
||||
|
||||
try:
|
||||
async with session.post(url, json=payload) as response:
|
||||
data = await response.json()
|
||||
|
||||
if not isinstance(data, list):
|
||||
raise ValueError(f"Expected list response, got {type(data)}")
|
||||
|
||||
results = []
|
||||
for i, (idx, slot, _, _, execution_id) in enumerate(batch_requests):
|
||||
if i < len(data):
|
||||
item = data[i]
|
||||
result = ExecutionResult(
|
||||
success=item.get("success", False),
|
||||
output=item.get("output", ""),
|
||||
error=item.get("error", ""),
|
||||
execution_id=item.get("execution_id", execution_id),
|
||||
slot_id=slot.slot_id,
|
||||
metadata=item.get("metadata", {}),
|
||||
)
|
||||
else:
|
||||
result = ExecutionResult(
|
||||
success=False,
|
||||
error="Missing result in batch response",
|
||||
execution_id=execution_id,
|
||||
slot_id=slot.slot_id,
|
||||
)
|
||||
results.append((idx, result))
|
||||
|
||||
return results
|
||||
|
||||
except Exception as e:
|
||||
# Return error for all requests in batch
|
||||
return [
|
||||
(idx, ExecutionResult(
|
||||
success=False,
|
||||
error=str(e),
|
||||
execution_id=execution_id,
|
||||
slot_id=slot.slot_id,
|
||||
))
|
||||
for idx, slot, _, _, execution_id in batch_requests
|
||||
]
|
||||
|
||||
async def reset_slot(self, slot: Slot) -> ExecutionResult:
|
||||
"""
|
||||
Reset a slot's workspace (delete all files).
|
||||
|
||||
Useful when reusing a slot for a new trajectory.
|
||||
"""
|
||||
session = await self._get_session()
|
||||
url = f"{slot.container_addr}/reset"
|
||||
|
||||
try:
|
||||
async with session.post(url, json={"slot_id": slot.slot_id}) as response:
|
||||
data = await response.json()
|
||||
return ExecutionResult(
|
||||
success=data.get("success", False),
|
||||
output=data.get("output", ""),
|
||||
error=data.get("error", ""),
|
||||
slot_id=slot.slot_id,
|
||||
)
|
||||
except Exception as e:
|
||||
return ExecutionResult(
|
||||
success=False,
|
||||
error=str(e),
|
||||
slot_id=slot.slot_id,
|
||||
)
|
||||
|
||||
async def health_check(self, container_addr: str) -> bool:
|
||||
"""Check if a sandbox container is healthy."""
|
||||
session = await self._get_session()
|
||||
url = f"{container_addr}/health"
|
||||
|
||||
try:
|
||||
async with session.get(url) as response:
|
||||
data = await response.json()
|
||||
return data.get("status") == "ok"
|
||||
except Exception:
|
||||
return False
|
||||
|
||||
async def get_container_status(
|
||||
self,
|
||||
container_addr: str
|
||||
) -> Optional[Dict[str, Any]]:
|
||||
"""Get status info from a sandbox container."""
|
||||
session = await self._get_session()
|
||||
url = f"{container_addr}/health"
|
||||
|
||||
try:
|
||||
async with session.get(url) as response:
|
||||
return await response.json()
|
||||
except Exception:
|
||||
return None
|
||||
|
||||
# -------------------------------------------------------------------------
|
||||
# Artifact helpers (optional)
|
||||
# -------------------------------------------------------------------------
|
||||
|
||||
async def _post_json(
|
||||
self,
|
||||
url: str,
|
||||
payload: Dict[str, Any],
|
||||
timeout: Optional[float] = None,
|
||||
) -> Dict[str, Any]:
|
||||
session = await self._get_session()
|
||||
try:
|
||||
async with session.post(url, json=payload, timeout=timeout) as response:
|
||||
data = await response.json()
|
||||
if isinstance(data, dict):
|
||||
data.setdefault("http_status", response.status)
|
||||
return data
|
||||
return {"success": False, "error": f"Unexpected response type: {type(data)}", "http_status": response.status}
|
||||
except Exception as e:
|
||||
return {"success": False, "error": str(e)}
|
||||
|
||||
async def read_artifact(
|
||||
self,
|
||||
slot: Slot,
|
||||
path: str,
|
||||
*,
|
||||
encoding: str = "text",
|
||||
max_bytes: Optional[int] = None,
|
||||
include_sha256: bool = False,
|
||||
timeout: Optional[float] = None,
|
||||
) -> Dict[str, Any]:
|
||||
url = f"{slot.container_addr}/artifacts/read"
|
||||
payload: Dict[str, Any] = {"slot_id": slot.slot_id, "path": path, "encoding": encoding, "include_sha256": include_sha256}
|
||||
if max_bytes is not None:
|
||||
payload["max_bytes"] = max_bytes
|
||||
return await self._post_json(url, payload, timeout=timeout)
|
||||
|
||||
async def list_artifacts(
|
||||
self,
|
||||
slot: Slot,
|
||||
path: str = ".",
|
||||
*,
|
||||
recursive: bool = False,
|
||||
max_entries: Optional[int] = None,
|
||||
timeout: Optional[float] = None,
|
||||
) -> Dict[str, Any]:
|
||||
url = f"{slot.container_addr}/artifacts/list"
|
||||
payload: Dict[str, Any] = {"slot_id": slot.slot_id, "path": path, "recursive": recursive}
|
||||
if max_entries is not None:
|
||||
payload["max_entries"] = max_entries
|
||||
return await self._post_json(url, payload, timeout=timeout)
|
||||
|
||||
async def archive_artifacts(
|
||||
self,
|
||||
slot: Slot,
|
||||
path: str = ".",
|
||||
*,
|
||||
archive_format: str = "tar.gz",
|
||||
max_bytes: Optional[int] = None,
|
||||
max_entries: Optional[int] = None,
|
||||
timeout: Optional[float] = None,
|
||||
) -> Dict[str, Any]:
|
||||
url = f"{slot.container_addr}/artifacts/archive"
|
||||
payload: Dict[str, Any] = {"slot_id": slot.slot_id, "path": path, "format": archive_format}
|
||||
if max_bytes is not None:
|
||||
payload["max_bytes"] = max_bytes
|
||||
if max_entries is not None:
|
||||
payload["max_entries"] = max_entries
|
||||
return await self._post_json(url, payload, timeout=timeout)
|
||||
@@ -1,659 +0,0 @@
|
||||
"""
|
||||
SlotPool - Manages slots across Nomad allocations.
|
||||
|
||||
The SlotPool is the core abstraction for slot-based multiplexing:
|
||||
- Tracks available/acquired slots across containers
|
||||
- Handles slot acquisition and release
|
||||
- Auto-scales Nomad job count based on demand
|
||||
- Provides batched tool execution
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import logging
|
||||
import os
|
||||
import subprocess
|
||||
from dataclasses import dataclass, field
|
||||
from pathlib import Path
|
||||
from typing import Any, Dict, List, Optional, Tuple
|
||||
|
||||
from ..nomad.client import (
|
||||
Allocation,
|
||||
AllocationStatus,
|
||||
NomadClient,
|
||||
create_sandbox_job,
|
||||
)
|
||||
from .executor import ExecutionResult, SandboxExecutor
|
||||
from .slot import Slot, SlotState, create_slots_for_allocation
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
@dataclass
|
||||
class SlotPoolConfig:
|
||||
"""Configuration for SlotPool."""
|
||||
|
||||
# Nomad settings
|
||||
nomad_address: str = "http://localhost:4646"
|
||||
job_id: str = "atropos-sandbox"
|
||||
datacenter: str = "dc1"
|
||||
|
||||
# Container settings
|
||||
image: str = "atropos-sandbox:local" # Use :local tag to avoid registry pull
|
||||
slots_per_container: int = 10
|
||||
privileged: bool = False
|
||||
cpu: int = 500 # MHz
|
||||
memory: int = 512 # MB
|
||||
|
||||
# Driver selection: "docker" or "singularity"
|
||||
driver: str = "docker"
|
||||
# Path to .sif file for singularity driver (required if driver="singularity")
|
||||
singularity_image: Optional[str] = None
|
||||
|
||||
# Scaling settings
|
||||
min_containers: int = 1
|
||||
max_containers: int = 10
|
||||
|
||||
# Timeouts
|
||||
acquire_timeout: float = 30.0 # Seconds between acquire polls (also triggers scale-up attempts)
|
||||
health_check_interval: float = 30.0 # Seconds between health checks
|
||||
scale_cooldown: float = 60.0 # Seconds between scale operations
|
||||
|
||||
# Job lifecycle
|
||||
purge_job_on_start: bool = False # Purge any pre-existing job before starting (local dev/training friendly)
|
||||
|
||||
# Local Docker image convenience (macOS/Nomad dev mode)
|
||||
auto_build_local_image: bool = True # If image endswith :local and is missing, build it from the bundled Dockerfile.
|
||||
dockerfile_path: Optional[str] = None # Override Dockerfile path (default: Hermes-Agent/atropos/Dockerfile).
|
||||
docker_build_context: Optional[str] = None # Override build context (default: Hermes-Agent/atropos).
|
||||
|
||||
|
||||
class SlotPool:
|
||||
"""
|
||||
Manages a pool of slots across Nomad allocations.
|
||||
|
||||
The SlotPool:
|
||||
- Deploys sandbox containers to Nomad
|
||||
- Tracks slots across all running containers
|
||||
- Handles slot acquisition/release
|
||||
- Auto-scales based on demand
|
||||
- Provides batched execution via SandboxExecutor
|
||||
|
||||
Usage:
|
||||
config = SlotPoolConfig(
|
||||
nomad_address="http://localhost:4646",
|
||||
job_id="my-sandbox",
|
||||
slots_per_container=10,
|
||||
)
|
||||
|
||||
pool = SlotPool(config)
|
||||
await pool.start()
|
||||
|
||||
# Acquire a slot
|
||||
slot = await pool.acquire()
|
||||
|
||||
# Execute tool
|
||||
result = await pool.execute(slot, "bash", {"command": "ls"})
|
||||
|
||||
# Release slot
|
||||
await pool.release(slot)
|
||||
|
||||
# Shutdown
|
||||
await pool.stop()
|
||||
"""
|
||||
|
||||
def __init__(self, config: Optional[SlotPoolConfig] = None):
|
||||
self.config = config or SlotPoolConfig()
|
||||
|
||||
# Nomad client
|
||||
self.nomad = NomadClient(address=self.config.nomad_address)
|
||||
|
||||
# Sandbox executor for tool execution
|
||||
self.executor = SandboxExecutor()
|
||||
|
||||
# Slot tracking
|
||||
self._slots: Dict[str, Slot] = {} # slot_key -> Slot
|
||||
self._available_queue: asyncio.Queue[str] = asyncio.Queue()
|
||||
self._lock = asyncio.Lock()
|
||||
self._scale_lock = asyncio.Lock()
|
||||
|
||||
# State
|
||||
self._started = False
|
||||
self._health_task: Optional[asyncio.Task] = None
|
||||
self._scale_task: Optional[asyncio.Task] = None
|
||||
self._last_scale_time = 0.0
|
||||
|
||||
def _default_dockerfile_path(self) -> Path:
|
||||
# Hermes-Agent/atropos/Dockerfile lives next to this module in source checkouts.
|
||||
return Path(__file__).resolve().parents[1] / "Dockerfile"
|
||||
|
||||
def _default_build_context(self) -> Path:
|
||||
return Path(__file__).resolve().parents[1]
|
||||
|
||||
def _docker_image_exists(self, image: str) -> bool:
|
||||
try:
|
||||
proc = subprocess.run(
|
||||
["docker", "image", "inspect", image],
|
||||
stdout=subprocess.DEVNULL,
|
||||
stderr=subprocess.DEVNULL,
|
||||
check=False,
|
||||
env={**os.environ, "DOCKER_CLI_HINTS": "false"},
|
||||
)
|
||||
return proc.returncode == 0
|
||||
except FileNotFoundError:
|
||||
return False
|
||||
|
||||
def _try_build_local_image(self, image: str) -> None:
|
||||
dockerfile = Path(self.config.dockerfile_path) if self.config.dockerfile_path else self._default_dockerfile_path()
|
||||
context = Path(self.config.docker_build_context) if self.config.docker_build_context else self._default_build_context()
|
||||
|
||||
if not dockerfile.exists():
|
||||
raise RuntimeError(
|
||||
f"Sandbox Dockerfile not found at {dockerfile}. "
|
||||
"Build the sandbox image manually or set --env.purge_job_on_start false and provide a non-local image."
|
||||
)
|
||||
if not context.exists():
|
||||
raise RuntimeError(f"Docker build context not found at {context}")
|
||||
|
||||
# Prefer buildx+--load to ensure the image ends up in the local daemon (required by Nomad's docker driver).
|
||||
buildx_cmd = [
|
||||
"docker",
|
||||
"buildx",
|
||||
"build",
|
||||
"--load",
|
||||
"-t",
|
||||
image,
|
||||
"-f",
|
||||
str(dockerfile),
|
||||
str(context),
|
||||
]
|
||||
proc = subprocess.run(buildx_cmd, check=False, env={**os.environ, "DOCKER_CLI_HINTS": "false"})
|
||||
if proc.returncode == 0:
|
||||
return
|
||||
|
||||
# Fallback to classic docker build if buildx isn't available.
|
||||
build_cmd = ["docker", "build", "-t", image, "-f", str(dockerfile), str(context)]
|
||||
proc2 = subprocess.run(build_cmd, check=False, env={**os.environ, "DOCKER_CLI_HINTS": "false"})
|
||||
if proc2.returncode != 0:
|
||||
raise RuntimeError(
|
||||
f"Failed to build local sandbox image {image}. "
|
||||
f"Tried: {' '.join(buildx_cmd)} and {' '.join(build_cmd)}"
|
||||
)
|
||||
|
||||
def _ensure_local_image(self) -> None:
|
||||
image = (self.config.image or "").strip()
|
||||
if not image.endswith(":local"):
|
||||
return
|
||||
if not self.config.auto_build_local_image:
|
||||
return
|
||||
|
||||
if self._docker_image_exists(image):
|
||||
return
|
||||
|
||||
logger.info(f"Local sandbox image {image} not found; building it now...")
|
||||
self._try_build_local_image(image)
|
||||
|
||||
def _slot_key(self, alloc_id: str, slot_id: str) -> str:
|
||||
"""Generate unique key for a slot."""
|
||||
return f"{alloc_id}:{slot_id}"
|
||||
|
||||
@property
|
||||
def total_slots(self) -> int:
|
||||
"""Total number of slots in pool."""
|
||||
return len(self._slots)
|
||||
|
||||
@property
|
||||
def available_slots(self) -> int:
|
||||
"""Number of available slots."""
|
||||
return sum(1 for s in self._slots.values() if s.is_available)
|
||||
|
||||
@property
|
||||
def acquired_slots(self) -> int:
|
||||
"""Number of acquired slots."""
|
||||
return sum(1 for s in self._slots.values() if s.is_acquired)
|
||||
|
||||
async def start(self) -> None:
|
||||
"""
|
||||
Start the slot pool.
|
||||
|
||||
- Checks if Nomad is healthy
|
||||
- Deploys sandbox job if not running
|
||||
- Discovers existing allocations
|
||||
- Starts health check background task
|
||||
"""
|
||||
if self._started:
|
||||
return
|
||||
|
||||
logger.info(f"Starting SlotPool (job_id={self.config.job_id})")
|
||||
|
||||
try:
|
||||
# Make sure local sandbox images exist before Nomad tries to pull them.
|
||||
# This is a common footgun in macOS dev mode with :local tags.
|
||||
self._ensure_local_image()
|
||||
|
||||
# Check Nomad health
|
||||
if not await self.nomad.is_healthy():
|
||||
raise RuntimeError(f"Nomad is not reachable at {self.config.nomad_address}")
|
||||
|
||||
if self.config.purge_job_on_start:
|
||||
logger.info(f"Purging any existing Nomad job: {self.config.job_id}")
|
||||
await self.nomad.stop_job(self.config.job_id, purge=True)
|
||||
|
||||
# Check if job exists (after optional purge)
|
||||
job = await self.nomad.get_job(self.config.job_id)
|
||||
|
||||
if job is None:
|
||||
# Deploy new job
|
||||
logger.info(f"Deploying sandbox job: {self.config.job_id} (driver={self.config.driver})")
|
||||
job_spec = create_sandbox_job(
|
||||
job_id=self.config.job_id,
|
||||
image=self.config.image,
|
||||
count=self.config.min_containers,
|
||||
slots_per_container=self.config.slots_per_container,
|
||||
privileged=self.config.privileged,
|
||||
cpu=self.config.cpu,
|
||||
memory=self.config.memory,
|
||||
datacenter=self.config.datacenter,
|
||||
driver=self.config.driver,
|
||||
singularity_image=self.config.singularity_image,
|
||||
)
|
||||
result = await self.nomad.submit_job(job_spec)
|
||||
if "error" in result:
|
||||
raise RuntimeError(f"Failed to submit job: {result}")
|
||||
|
||||
# Wait for allocations to be running (even if the job already existed).
|
||||
await self._wait_for_healthy_allocations(self.config.min_containers)
|
||||
|
||||
# Discover existing allocations and slots
|
||||
await self._refresh_slots()
|
||||
|
||||
# Start health check task
|
||||
self._health_task = asyncio.create_task(self._health_check_loop())
|
||||
|
||||
self._started = True
|
||||
logger.info(f"SlotPool started: {self.total_slots} slots available")
|
||||
except Exception:
|
||||
# Ensure aiohttp sessions are not leaked if we fail to start.
|
||||
await self.stop(purge_job=False)
|
||||
raise
|
||||
|
||||
async def stop(self, purge_job: bool = False) -> None:
|
||||
"""
|
||||
Stop the slot pool.
|
||||
|
||||
Args:
|
||||
purge_job: If True, also stop the Nomad job
|
||||
"""
|
||||
logger.info("Stopping SlotPool")
|
||||
|
||||
# Cancel health check task
|
||||
if self._health_task:
|
||||
self._health_task.cancel()
|
||||
try:
|
||||
await self._health_task
|
||||
except asyncio.CancelledError:
|
||||
pass
|
||||
finally:
|
||||
self._health_task = None
|
||||
|
||||
if self._scale_task:
|
||||
self._scale_task.cancel()
|
||||
try:
|
||||
await self._scale_task
|
||||
except asyncio.CancelledError:
|
||||
pass
|
||||
finally:
|
||||
self._scale_task = None
|
||||
|
||||
# Optionally stop the job (do this even if start() never completed).
|
||||
if purge_job:
|
||||
logger.info(f"Stopping Nomad job: {self.config.job_id}")
|
||||
await self.nomad.stop_job(self.config.job_id, purge=True)
|
||||
|
||||
# Close connections
|
||||
await self.executor.close()
|
||||
await self.nomad.close()
|
||||
|
||||
self._started = False
|
||||
self._slots.clear()
|
||||
|
||||
# Clear the queue
|
||||
while not self._available_queue.empty():
|
||||
try:
|
||||
self._available_queue.get_nowait()
|
||||
except asyncio.QueueEmpty:
|
||||
break
|
||||
|
||||
async def acquire(self, trajectory_id: Optional[str] = None) -> Slot:
|
||||
"""
|
||||
Acquire an available slot.
|
||||
|
||||
If no slots are available, waits up to acquire_timeout seconds.
|
||||
If still no slots, attempts to scale up.
|
||||
|
||||
Args:
|
||||
trajectory_id: Optional ID of trajectory acquiring the slot
|
||||
|
||||
Returns:
|
||||
Acquired Slot
|
||||
|
||||
Raises:
|
||||
asyncio.TimeoutError: If no slot becomes available
|
||||
"""
|
||||
if not self._started:
|
||||
raise RuntimeError("SlotPool not started")
|
||||
|
||||
while True:
|
||||
try:
|
||||
# Try to get an available slot
|
||||
slot_key = await asyncio.wait_for(
|
||||
self._available_queue.get(),
|
||||
timeout=self.config.acquire_timeout,
|
||||
)
|
||||
except asyncio.TimeoutError:
|
||||
# Try to scale up, but keep waiting even if scaling isn't possible.
|
||||
# In practice, slots may become available shortly (e.g. contention),
|
||||
# and scaling may be temporarily blocked by Nomad deployments.
|
||||
await self._try_scale_up()
|
||||
continue
|
||||
|
||||
slot = self._slots.get(slot_key)
|
||||
if slot is None:
|
||||
# Slot was removed; discard stale queue entry and retry.
|
||||
continue
|
||||
|
||||
try:
|
||||
slot.acquire(trajectory_id)
|
||||
except RuntimeError:
|
||||
# Slot isn't actually available (e.g. duplicate queue entry); retry.
|
||||
continue
|
||||
|
||||
logger.debug(f"Acquired slot {slot.slot_id} (alloc={slot.alloc_id[:8]})")
|
||||
return slot
|
||||
|
||||
async def release(self, slot: Slot, reset_workspace: bool = False) -> None:
|
||||
"""
|
||||
Release a slot back to the pool.
|
||||
|
||||
Args:
|
||||
slot: Slot to release
|
||||
reset_workspace: If True, clear the workspace files
|
||||
"""
|
||||
slot_key = self._slot_key(slot.alloc_id, slot.slot_id)
|
||||
|
||||
if slot_key not in self._slots:
|
||||
logger.warning(f"Releasing unknown slot: {slot_key}")
|
||||
return
|
||||
|
||||
# Optionally reset workspace
|
||||
if reset_workspace:
|
||||
await self.executor.reset_slot(slot)
|
||||
|
||||
slot.release()
|
||||
await self._available_queue.put(slot_key)
|
||||
|
||||
logger.debug(f"Released slot {slot.slot_id}")
|
||||
|
||||
async def execute(
|
||||
self,
|
||||
slot: Slot,
|
||||
tool_name: str,
|
||||
args: Dict[str, Any],
|
||||
timeout: Optional[float] = None,
|
||||
) -> ExecutionResult:
|
||||
"""
|
||||
Execute a tool in a slot's workspace.
|
||||
|
||||
Args:
|
||||
slot: Slot to execute in
|
||||
tool_name: Name of tool (bash, read_file, write_file)
|
||||
args: Tool arguments
|
||||
timeout: Optional timeout override
|
||||
|
||||
Returns:
|
||||
ExecutionResult
|
||||
"""
|
||||
return await self.executor.execute(slot, tool_name, args, timeout)
|
||||
|
||||
async def execute_batch(
|
||||
self,
|
||||
requests: List[Tuple[Slot, str, Dict[str, Any]]],
|
||||
timeout: Optional[float] = None,
|
||||
) -> List[ExecutionResult]:
|
||||
"""
|
||||
Execute multiple tools in parallel.
|
||||
|
||||
This is the key optimization - batch execution across multiple slots
|
||||
maximizes container utilization.
|
||||
|
||||
Args:
|
||||
requests: List of (slot, tool_name, args) tuples
|
||||
timeout: Optional timeout override
|
||||
|
||||
Returns:
|
||||
List of ExecutionResults in same order
|
||||
"""
|
||||
return await self.executor.execute_batch(requests, timeout)
|
||||
|
||||
async def _refresh_slots(self) -> None:
|
||||
"""Refresh slot inventory from Nomad allocations."""
|
||||
async with self._lock:
|
||||
allocs = await self.nomad.get_job_allocations(self.config.job_id)
|
||||
|
||||
# Track which slots we've seen
|
||||
seen_keys = set()
|
||||
|
||||
for alloc in allocs:
|
||||
if alloc.status != AllocationStatus.RUNNING:
|
||||
continue
|
||||
|
||||
if not alloc.http_address:
|
||||
continue
|
||||
|
||||
# Check container health
|
||||
healthy = await self.executor.health_check(alloc.http_address)
|
||||
if not healthy:
|
||||
continue
|
||||
|
||||
# Create slots for this allocation
|
||||
for i in range(self.config.slots_per_container):
|
||||
slot_id = f"slot_{i}"
|
||||
slot_key = self._slot_key(alloc.id, slot_id)
|
||||
seen_keys.add(slot_key)
|
||||
|
||||
if slot_key not in self._slots:
|
||||
# New slot
|
||||
slot = Slot(
|
||||
slot_id=slot_id,
|
||||
alloc_id=alloc.id,
|
||||
container_addr=alloc.http_address,
|
||||
)
|
||||
self._slots[slot_key] = slot
|
||||
await self._available_queue.put(slot_key)
|
||||
logger.debug(f"Added slot: {slot_key}")
|
||||
|
||||
# Remove slots from dead allocations
|
||||
for slot_key in list(self._slots.keys()):
|
||||
if slot_key not in seen_keys:
|
||||
slot = self._slots.pop(slot_key)
|
||||
logger.debug(f"Removed slot: {slot_key}")
|
||||
|
||||
async def _wait_for_healthy_allocations(
|
||||
self,
|
||||
min_count: int,
|
||||
timeout: float = 120.0
|
||||
) -> None:
|
||||
"""Wait for allocations to become healthy."""
|
||||
import time
|
||||
start = time.time()
|
||||
|
||||
def _summarize_alloc_detail(detail: Dict[str, Any]) -> str:
|
||||
task_states = detail.get("TaskStates") or {}
|
||||
parts: List[str] = []
|
||||
if isinstance(task_states, dict):
|
||||
for task_name, st in task_states.items():
|
||||
events = (st or {}).get("Events") or []
|
||||
if isinstance(events, list) and events:
|
||||
# Include a few recent events; the latest can be a generic restart message
|
||||
# while the true root cause is slightly earlier (e.g. image pull failure).
|
||||
recent = events[-3:]
|
||||
msgs: List[str] = []
|
||||
for ev in recent:
|
||||
desc = ev.get("DisplayMessage") or ev.get("Message") or ev.get("Type") or ""
|
||||
if desc:
|
||||
msgs.append(desc)
|
||||
if msgs:
|
||||
parts.append(f"{task_name}: " + " | ".join(msgs))
|
||||
return "; ".join(parts)
|
||||
|
||||
def _alloc_events_lower(detail: Dict[str, Any]) -> str:
|
||||
task_states = detail.get("TaskStates") or {}
|
||||
texts: List[str] = []
|
||||
if isinstance(task_states, dict):
|
||||
for _task_name, st in task_states.items():
|
||||
events = (st or {}).get("Events") or []
|
||||
if isinstance(events, list):
|
||||
for ev in events[-10:]:
|
||||
desc = ev.get("DisplayMessage") or ev.get("Message") or ev.get("Type") or ""
|
||||
if desc:
|
||||
texts.append(desc)
|
||||
return " ".join(texts).lower()
|
||||
|
||||
while time.time() - start < timeout:
|
||||
allocs = await self.nomad.get_job_allocations(self.config.job_id)
|
||||
|
||||
healthy_count = 0
|
||||
for alloc in allocs:
|
||||
if alloc.status == AllocationStatus.RUNNING and alloc.http_address:
|
||||
if await self.executor.health_check(alloc.http_address):
|
||||
healthy_count += 1
|
||||
|
||||
# Fast-fail on obvious driver/image errors to avoid waiting out the full timeout.
|
||||
if alloc.id:
|
||||
detail = await self.nomad.get_allocation(alloc.id)
|
||||
if isinstance(detail, dict):
|
||||
summary = _summarize_alloc_detail(detail)
|
||||
lowered = _alloc_events_lower(detail) or summary.lower()
|
||||
if "failed to pull" in lowered or "pull access denied" in lowered:
|
||||
raise RuntimeError(
|
||||
"Nomad allocation failed to start due to a Docker image pull error. "
|
||||
f"Allocation {alloc.id[:8]}: {summary}\n"
|
||||
"If you're using a local image tag (e.g. `atropos-sandbox:local`) on macOS, "
|
||||
"make sure the image is loaded into Docker, e.g.:\n"
|
||||
" docker buildx build --load -t atropos-sandbox:local -f Hermes-Agent/atropos/Dockerfile Hermes-Agent/atropos"
|
||||
)
|
||||
if "exceeded allowed attempts" in lowered:
|
||||
raise RuntimeError(
|
||||
"Nomad allocation is crash-looping and has entered restart backoff. "
|
||||
f"Allocation {alloc.id[:8]}: {summary}\n"
|
||||
"Inspect logs with:\n"
|
||||
f" nomad alloc logs -stderr -task sandbox-server {alloc.id}\n"
|
||||
"Common causes include: missing local Docker image tag, container entrypoint error, "
|
||||
"or sandbox-server startup failure."
|
||||
)
|
||||
|
||||
if healthy_count >= min_count:
|
||||
return
|
||||
|
||||
await asyncio.sleep(2.0)
|
||||
|
||||
# Timed out: include allocation status detail to help debugging.
|
||||
allocs = await self.nomad.get_job_allocations(self.config.job_id)
|
||||
alloc_lines: List[str] = []
|
||||
for alloc in allocs[:10]:
|
||||
addr = alloc.http_address or "-"
|
||||
line = f"{alloc.id[:8]} status={alloc.status.value} http={addr}"
|
||||
detail = await self.nomad.get_allocation(alloc.id)
|
||||
if isinstance(detail, dict):
|
||||
summary = _summarize_alloc_detail(detail)
|
||||
if summary:
|
||||
line += f" detail={summary}"
|
||||
alloc_lines.append(line)
|
||||
|
||||
hint = (
|
||||
"Timed out waiting for healthy sandbox allocations.\n"
|
||||
f"Job: {self.config.job_id}, desired_healthy: {min_count}\n"
|
||||
"Allocations:\n - " + "\n - ".join(alloc_lines)
|
||||
)
|
||||
raise RuntimeError(hint)
|
||||
|
||||
async def _try_scale_up(self) -> bool:
|
||||
"""Attempt to scale up the job."""
|
||||
import time
|
||||
|
||||
async with self._scale_lock:
|
||||
# Check cooldown
|
||||
if time.time() - self._last_scale_time < self.config.scale_cooldown:
|
||||
return False
|
||||
|
||||
# Check max containers
|
||||
status = await self.nomad.get_job_status(self.config.job_id)
|
||||
if status is None:
|
||||
return False
|
||||
|
||||
current_count = status.count
|
||||
if current_count >= self.config.max_containers:
|
||||
logger.warning(f"Cannot scale up: already at max ({self.config.max_containers})")
|
||||
return False
|
||||
|
||||
# Scale up
|
||||
new_count = min(current_count + 1, self.config.max_containers)
|
||||
logger.info(f"Scaling up from {current_count} to {new_count} containers")
|
||||
|
||||
scale_resp = await self.nomad.scale_job(
|
||||
self.config.job_id,
|
||||
count=new_count,
|
||||
task_group="sandbox",
|
||||
)
|
||||
|
||||
# Nomad may return non-JSON errors (e.g. plain text) with a status field.
|
||||
if isinstance(scale_resp, dict) and scale_resp.get("status", 200) >= 400:
|
||||
logger.warning(f"Scale request rejected: {scale_resp}")
|
||||
self._last_scale_time = time.time()
|
||||
return False
|
||||
|
||||
self._last_scale_time = time.time()
|
||||
|
||||
# Wait for new allocation in the background so contended acquires can still
|
||||
# make progress (e.g. by grabbing slots released by other trajectories).
|
||||
if self._scale_task is None or self._scale_task.done():
|
||||
self._scale_task = asyncio.create_task(self._wait_for_scale(new_count))
|
||||
|
||||
return True
|
||||
|
||||
async def _wait_for_scale(self, desired_count: int) -> None:
|
||||
try:
|
||||
await self._wait_for_healthy_allocations(desired_count, timeout=60.0)
|
||||
await self._refresh_slots()
|
||||
except asyncio.CancelledError:
|
||||
raise
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to scale up: {e}")
|
||||
|
||||
async def _health_check_loop(self) -> None:
|
||||
"""Background task to monitor container health."""
|
||||
while True:
|
||||
try:
|
||||
await asyncio.sleep(self.config.health_check_interval)
|
||||
await self._refresh_slots()
|
||||
except asyncio.CancelledError:
|
||||
break
|
||||
except Exception as e:
|
||||
logger.error(f"Health check error: {e}")
|
||||
|
||||
def get_stats(self) -> Dict[str, Any]:
|
||||
"""Get pool statistics."""
|
||||
slots_by_state = {}
|
||||
for slot in self._slots.values():
|
||||
state = slot.state.value
|
||||
slots_by_state[state] = slots_by_state.get(state, 0) + 1
|
||||
|
||||
container_count = len({s.alloc_id for s in self._slots.values()}) if self._slots else 0
|
||||
|
||||
return {
|
||||
"total_slots": self.total_slots,
|
||||
"available_slots": self.available_slots,
|
||||
"acquired_slots": self.acquired_slots,
|
||||
"containers": container_count,
|
||||
"slots_by_state": slots_by_state,
|
||||
"started": self._started,
|
||||
}
|
||||
@@ -1,159 +0,0 @@
|
||||
"""
|
||||
Slot abstraction for atropos-agent.
|
||||
|
||||
A Slot represents an isolated workspace for a single agent trajectory.
|
||||
Slots are hosted on Nomad allocations and provide workspace isolation
|
||||
via filesystem directories.
|
||||
"""
|
||||
|
||||
from dataclasses import dataclass, field
|
||||
from enum import Enum
|
||||
from typing import Any, Dict, Optional
|
||||
import uuid
|
||||
|
||||
|
||||
class SlotState(Enum):
|
||||
"""State of a slot in the pool."""
|
||||
AVAILABLE = "available" # Ready to be acquired
|
||||
ACQUIRED = "acquired" # Assigned to a trajectory
|
||||
EXECUTING = "executing" # Currently executing a tool
|
||||
RELEASING = "releasing" # Being released back to pool
|
||||
ERROR = "error" # In error state
|
||||
|
||||
|
||||
@dataclass
|
||||
class Slot:
|
||||
"""
|
||||
An isolated workspace for a single agent trajectory.
|
||||
|
||||
Slots are the unit of scheduling - each trajectory runs in its own slot,
|
||||
with an isolated workspace directory. Multiple slots share a container.
|
||||
|
||||
Attributes:
|
||||
slot_id: Unique identifier for this slot (e.g., "slot_0")
|
||||
alloc_id: Nomad allocation ID hosting this slot
|
||||
container_addr: HTTP address of the sandbox server (e.g., "http://10.0.0.1:8080")
|
||||
workspace_dir: Path to workspace in container (e.g., "/data/slot_0")
|
||||
state: Current state of the slot
|
||||
trajectory_id: ID of trajectory currently using this slot (if acquired)
|
||||
metadata: Additional metadata
|
||||
"""
|
||||
slot_id: str
|
||||
alloc_id: str
|
||||
container_addr: str
|
||||
workspace_dir: str = ""
|
||||
state: SlotState = SlotState.AVAILABLE
|
||||
trajectory_id: Optional[str] = None
|
||||
metadata: Dict[str, Any] = field(default_factory=dict)
|
||||
|
||||
def __post_init__(self):
|
||||
"""Set default workspace_dir if not provided."""
|
||||
if not self.workspace_dir:
|
||||
self.workspace_dir = f"/data/{self.slot_id}"
|
||||
|
||||
@property
|
||||
def is_available(self) -> bool:
|
||||
"""Check if slot is available for acquisition."""
|
||||
return self.state == SlotState.AVAILABLE
|
||||
|
||||
@property
|
||||
def is_acquired(self) -> bool:
|
||||
"""Check if slot is currently acquired."""
|
||||
return self.state in (SlotState.ACQUIRED, SlotState.EXECUTING)
|
||||
|
||||
def acquire(self, trajectory_id: Optional[str] = None) -> None:
|
||||
"""
|
||||
Mark slot as acquired by a trajectory.
|
||||
|
||||
Args:
|
||||
trajectory_id: Optional ID of acquiring trajectory
|
||||
"""
|
||||
if not self.is_available:
|
||||
raise RuntimeError(f"Cannot acquire slot {self.slot_id}: state is {self.state}")
|
||||
|
||||
self.state = SlotState.ACQUIRED
|
||||
self.trajectory_id = trajectory_id or str(uuid.uuid4())
|
||||
|
||||
def start_execution(self, execution_id: Optional[str] = None) -> None:
|
||||
"""Mark slot as executing."""
|
||||
if self.state != SlotState.ACQUIRED:
|
||||
raise RuntimeError(f"Cannot start execution on slot {self.slot_id}: state is {self.state}")
|
||||
|
||||
self.state = SlotState.EXECUTING
|
||||
if execution_id:
|
||||
self.metadata["current_execution_id"] = execution_id
|
||||
|
||||
def end_execution(self) -> None:
|
||||
"""Mark execution as complete, return to acquired state."""
|
||||
if self.state != SlotState.EXECUTING:
|
||||
raise RuntimeError(f"Cannot end execution on slot {self.slot_id}: state is {self.state}")
|
||||
|
||||
self.state = SlotState.ACQUIRED
|
||||
self.metadata.pop("current_execution_id", None)
|
||||
|
||||
def release(self) -> None:
|
||||
"""Release slot back to available state."""
|
||||
self.state = SlotState.AVAILABLE
|
||||
self.trajectory_id = None
|
||||
self.metadata.pop("current_execution_id", None)
|
||||
|
||||
def mark_error(self, error: str) -> None:
|
||||
"""Mark slot as in error state."""
|
||||
self.state = SlotState.ERROR
|
||||
self.metadata["error"] = error
|
||||
|
||||
def to_dict(self) -> Dict[str, Any]:
|
||||
"""Convert to dictionary for serialization."""
|
||||
return {
|
||||
"slot_id": self.slot_id,
|
||||
"alloc_id": self.alloc_id,
|
||||
"container_addr": self.container_addr,
|
||||
"workspace_dir": self.workspace_dir,
|
||||
"state": self.state.value,
|
||||
"trajectory_id": self.trajectory_id,
|
||||
"metadata": self.metadata,
|
||||
}
|
||||
|
||||
@classmethod
|
||||
def from_dict(cls, data: Dict[str, Any]) -> "Slot":
|
||||
"""Create from dictionary."""
|
||||
return cls(
|
||||
slot_id=data["slot_id"],
|
||||
alloc_id=data["alloc_id"],
|
||||
container_addr=data["container_addr"],
|
||||
workspace_dir=data.get("workspace_dir", ""),
|
||||
state=SlotState(data.get("state", "available")),
|
||||
trajectory_id=data.get("trajectory_id"),
|
||||
metadata=data.get("metadata", {}),
|
||||
)
|
||||
|
||||
def __repr__(self) -> str:
|
||||
return f"Slot({self.slot_id}, state={self.state.value}, alloc={self.alloc_id[:8]}...)"
|
||||
|
||||
|
||||
def create_slots_for_allocation(
|
||||
alloc_id: str,
|
||||
container_addr: str,
|
||||
num_slots: int = 10,
|
||||
) -> list["Slot"]:
|
||||
"""
|
||||
Create slots for a Nomad allocation.
|
||||
|
||||
Args:
|
||||
alloc_id: Nomad allocation ID
|
||||
container_addr: HTTP address of sandbox server
|
||||
num_slots: Number of slots to create
|
||||
|
||||
Returns:
|
||||
List of Slot objects
|
||||
"""
|
||||
slots = []
|
||||
for i in range(num_slots):
|
||||
slot_id = f"slot_{i}"
|
||||
slots.append(Slot(
|
||||
slot_id=slot_id,
|
||||
alloc_id=alloc_id,
|
||||
container_addr=container_addr,
|
||||
workspace_dir=f"/data/{slot_id}",
|
||||
))
|
||||
return slots
|
||||
@@ -1,2 +0,0 @@
|
||||
"""Terminal helpers for stateful sandbox interactions."""
|
||||
|
||||
@@ -1,115 +0,0 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
from typing import Any
|
||||
|
||||
import pyte
|
||||
|
||||
|
||||
class AsciinemaStreamDecoder:
|
||||
def __init__(self, *, default_width: int = 80, default_height: int = 24) -> None:
|
||||
self._default_width = max(1, int(default_width))
|
||||
self._default_height = max(1, int(default_height))
|
||||
self._buffer = ""
|
||||
self._has_header = False
|
||||
self.width = self._default_width
|
||||
self.height = self._default_height
|
||||
self._screen = pyte.Screen(self.width, self.height)
|
||||
self._stream = pyte.Stream(self._screen)
|
||||
|
||||
def reset(self) -> None:
|
||||
self._buffer = ""
|
||||
self._has_header = False
|
||||
self.width = self._default_width
|
||||
self.height = self._default_height
|
||||
self._screen = pyte.Screen(self.width, self.height)
|
||||
self._stream = pyte.Stream(self._screen)
|
||||
|
||||
def feed(self, chunk: str | bytes) -> None:
|
||||
if not chunk:
|
||||
return
|
||||
if isinstance(chunk, bytes):
|
||||
chunk = chunk.decode("utf-8", errors="replace")
|
||||
self._buffer += chunk
|
||||
while True:
|
||||
line, sep, rest = self._buffer.partition("\n")
|
||||
if not sep:
|
||||
break
|
||||
self._buffer = rest
|
||||
line = line.strip()
|
||||
if not line:
|
||||
continue
|
||||
parsed = self._parse_json_line(line)
|
||||
if parsed is None:
|
||||
continue
|
||||
if not self._has_header:
|
||||
if isinstance(parsed, dict):
|
||||
self._init_from_header(parsed)
|
||||
continue
|
||||
if isinstance(parsed, list):
|
||||
self._has_header = True
|
||||
self._apply_event(parsed)
|
||||
continue
|
||||
continue
|
||||
if isinstance(parsed, list):
|
||||
self._apply_event(parsed)
|
||||
|
||||
def render(self) -> str:
|
||||
return "\n".join(self._screen.display)
|
||||
|
||||
def _parse_json_line(self, line: str) -> Any | None:
|
||||
try:
|
||||
return json.loads(line)
|
||||
except json.JSONDecodeError:
|
||||
return None
|
||||
|
||||
def _init_from_header(self, header: dict[str, Any]) -> None:
|
||||
width = _coerce_int(
|
||||
header.get("width") or header.get("columns") or header.get("cols"),
|
||||
self._default_width,
|
||||
)
|
||||
height = _coerce_int(
|
||||
header.get("height") or header.get("rows") or header.get("lines"),
|
||||
self._default_height,
|
||||
)
|
||||
self.width = max(1, width)
|
||||
self.height = max(1, height)
|
||||
self._screen = pyte.Screen(self.width, self.height)
|
||||
self._stream = pyte.Stream(self._screen)
|
||||
self._has_header = True
|
||||
|
||||
def _apply_event(self, event: list[Any]) -> None:
|
||||
if len(event) < 2:
|
||||
return
|
||||
event_type = event[1]
|
||||
payload = event[2] if len(event) > 2 else ""
|
||||
if event_type == "o":
|
||||
if isinstance(payload, str):
|
||||
self._stream.feed(payload)
|
||||
elif event_type == "r":
|
||||
width, height = _parse_resize(payload)
|
||||
if width and height:
|
||||
self.width = width
|
||||
self.height = height
|
||||
self._screen.resize(width, height)
|
||||
|
||||
|
||||
def _coerce_int(value: Any, default: int) -> int:
|
||||
try:
|
||||
return int(value)
|
||||
except (TypeError, ValueError):
|
||||
return int(default)
|
||||
|
||||
|
||||
def _parse_resize(payload: Any) -> tuple[int, int]:
|
||||
if isinstance(payload, str) and "x" in payload:
|
||||
left, right = payload.lower().split("x", 1)
|
||||
return _coerce_int(left, 0), _coerce_int(right, 0)
|
||||
if isinstance(payload, dict):
|
||||
width = _coerce_int(payload.get("width") or payload.get("columns") or payload.get("cols"), 0)
|
||||
height = _coerce_int(payload.get("height") or payload.get("rows") or payload.get("lines"), 0)
|
||||
return width, height
|
||||
if isinstance(payload, list) and len(payload) >= 2:
|
||||
return _coerce_int(payload[0], 0), _coerce_int(payload[1], 0)
|
||||
return 0, 0
|
||||
|
||||
@@ -1,26 +0,0 @@
|
||||
"""
|
||||
Tool abstractions for atropos-agent.
|
||||
|
||||
Provides base Tool class and common tool implementations.
|
||||
"""
|
||||
|
||||
from .base import Tool, ToolCall, ToolRegistry, ToolResult, ToolSchema
|
||||
from .build_registry import build_tool_registry
|
||||
from .sandbox_stubs import BashTool, ReadFileTool, TerminalTool, WriteFileTool
|
||||
from .terminal_stateful_tool import TerminalStatefulTool
|
||||
from .tmux_tool import TmuxTool
|
||||
|
||||
__all__ = [
|
||||
"Tool",
|
||||
"ToolCall",
|
||||
"ToolRegistry",
|
||||
"ToolResult",
|
||||
"ToolSchema",
|
||||
"BashTool",
|
||||
"ReadFileTool",
|
||||
"WriteFileTool",
|
||||
"TerminalTool",
|
||||
"TerminalStatefulTool",
|
||||
"TmuxTool",
|
||||
"build_tool_registry",
|
||||
]
|
||||
@@ -1,423 +0,0 @@
|
||||
"""
|
||||
Base Tool abstraction for atropos-agent.
|
||||
|
||||
Tools follow a simple pattern:
|
||||
1. Define schema (name, description, parameters)
|
||||
2. Implement execute() method
|
||||
3. Return ToolResult with output/error
|
||||
|
||||
Tool calls use Hermes-style XML tags:
|
||||
<tool_call>{"name": "bash", "arguments": {"command": "ls"}}</tool_call>
|
||||
"""
|
||||
|
||||
import json
|
||||
import re
|
||||
import uuid
|
||||
from abc import ABC, abstractmethod
|
||||
from dataclasses import dataclass, field
|
||||
from typing import Any, Dict, List, Literal, Optional
|
||||
|
||||
from pydantic import BaseModel, Field
|
||||
|
||||
|
||||
@dataclass
|
||||
class ToolSchema:
|
||||
"""JSON Schema for a tool's parameters."""
|
||||
|
||||
name: str
|
||||
description: str
|
||||
parameters: Dict[str, Any] = field(default_factory=dict)
|
||||
required: List[str] = field(default_factory=list)
|
||||
external: bool = False # Whether the tool must be executed via an external ToolServer (secret proxy) and not inside the sandbox.
|
||||
|
||||
def to_dict(self) -> Dict[str, Any]:
|
||||
"""Convert to OpenAI-compatible function schema."""
|
||||
return {
|
||||
"type": "function",
|
||||
"function": {
|
||||
"name": self.name,
|
||||
"description": self.description,
|
||||
"parameters": {
|
||||
"type": "object",
|
||||
"properties": self.parameters,
|
||||
"required": self.required,
|
||||
},
|
||||
},
|
||||
}
|
||||
|
||||
def to_prompt_description(self) -> str:
|
||||
"""Convert to human-readable description for system prompt."""
|
||||
params_desc = []
|
||||
for name, spec in self.parameters.items():
|
||||
req = "(required)" if name in self.required else "(optional)"
|
||||
desc = spec.get("description", "")
|
||||
param_type = spec.get("type", "string")
|
||||
params_desc.append(f" - {name} ({param_type}) {req}: {desc}")
|
||||
|
||||
params_str = "\n".join(params_desc) if params_desc else " (no parameters)"
|
||||
return f"**{self.name}**: {self.description}\nParameters:\n{params_str}"
|
||||
|
||||
|
||||
@dataclass
|
||||
class ToolCall:
|
||||
"""A parsed tool call from model output."""
|
||||
|
||||
name: str
|
||||
arguments: Dict[str, Any]
|
||||
raw_text: str = "" # Original XML/JSON text
|
||||
uniq_id: str = field(default_factory=lambda: str(uuid.uuid4())) # Unique tool-call id for traceability/reconstruction.
|
||||
|
||||
@classmethod
|
||||
def parse_from_text(cls, text: str) -> List["ToolCall"]:
|
||||
"""
|
||||
Extract tool calls from text using Hermes-style XML tags.
|
||||
|
||||
Supported formats (STRICT: requires well-formed closing tags):
|
||||
- Hermes JSON wrapper:
|
||||
<tool_call>{"name": "...", "arguments": {...}}</tool_call>
|
||||
- GLM/llama.cpp style:
|
||||
<tool_call>terminal{"command":"ls -la"}</tool_call>
|
||||
"""
|
||||
calls: List["ToolCall"] = []
|
||||
|
||||
if not text:
|
||||
return calls
|
||||
|
||||
def _append_from_payload(*, name: str, arguments: Dict[str, Any], raw: str, uniq_id: Optional[str] = None) -> None:
|
||||
if not isinstance(name, str) or not name:
|
||||
return
|
||||
if not isinstance(arguments, dict):
|
||||
return
|
||||
calls.append(
|
||||
cls(
|
||||
name=name,
|
||||
arguments=arguments,
|
||||
raw_text=raw,
|
||||
uniq_id=uniq_id or str(uuid.uuid4()),
|
||||
)
|
||||
)
|
||||
|
||||
# STRICT parsing: only accept well-formed <tool_call>...</tool_call> blocks.
|
||||
pattern = r"<tool_call>\s*(.*?)\s*</tool_call>"
|
||||
for inner in re.findall(pattern, text, re.DOTALL):
|
||||
cleaned = (inner or "").strip()
|
||||
if not cleaned:
|
||||
continue
|
||||
|
||||
# Hermes JSON wrapper.
|
||||
if cleaned.startswith("{"):
|
||||
try:
|
||||
data = json.loads(cleaned)
|
||||
except json.JSONDecodeError:
|
||||
continue
|
||||
uniq_id = data.get("uniq_id") or data.get("id") or None
|
||||
_append_from_payload(
|
||||
name=data.get("name", ""),
|
||||
arguments=data.get("arguments", {}),
|
||||
raw=inner,
|
||||
uniq_id=uniq_id,
|
||||
)
|
||||
continue
|
||||
|
||||
# GLM/llama.cpp style: terminal{...}
|
||||
m = re.match(r"^\s*([A-Za-z0-9_.:\\-]+)\s*(\{.*\})\s*$", cleaned, re.DOTALL)
|
||||
if not m:
|
||||
continue
|
||||
name = m.group(1)
|
||||
args_text = m.group(2)
|
||||
try:
|
||||
args = json.loads(args_text)
|
||||
except json.JSONDecodeError:
|
||||
continue
|
||||
_append_from_payload(name=name, arguments=args, raw=inner)
|
||||
|
||||
return calls
|
||||
|
||||
@classmethod
|
||||
def has_tool_call(cls, text: str) -> bool:
|
||||
"""Check if text contains any tool calls."""
|
||||
return bool(re.search(r"<tool_call>", text))
|
||||
|
||||
|
||||
@dataclass
|
||||
class ToolResult:
|
||||
"""Result from executing a tool."""
|
||||
|
||||
success: bool
|
||||
output: str = ""
|
||||
error: str = ""
|
||||
metadata: Dict[str, Any] = field(default_factory=dict)
|
||||
uniq_id: Optional[str] = None # Should match ToolCall.uniq_id for async execution tracking.
|
||||
|
||||
def to_xml(self) -> str:
|
||||
"""Format as XML for including in conversation."""
|
||||
data = {
|
||||
"success": self.success,
|
||||
"output": self.output,
|
||||
}
|
||||
if self.uniq_id:
|
||||
data["uniq_id"] = self.uniq_id
|
||||
if self.error:
|
||||
data["error"] = self.error
|
||||
if self.metadata:
|
||||
data["metadata"] = self.metadata
|
||||
return f"<tool_response>{json.dumps(data)}</tool_response>"
|
||||
|
||||
def to_dict(self) -> Dict[str, Any]:
|
||||
"""Convert to dictionary."""
|
||||
return {
|
||||
"success": self.success,
|
||||
"output": self.output,
|
||||
"error": self.error,
|
||||
"metadata": self.metadata,
|
||||
"uniq_id": self.uniq_id,
|
||||
}
|
||||
|
||||
|
||||
class Tool(ABC):
|
||||
"""
|
||||
Abstract base class for tools.
|
||||
|
||||
Subclasses must implement:
|
||||
- schema: ToolSchema describing the tool
|
||||
- execute(): async method that performs the tool action
|
||||
"""
|
||||
|
||||
@property
|
||||
@abstractmethod
|
||||
def schema(self) -> ToolSchema:
|
||||
"""Return the tool's schema."""
|
||||
pass
|
||||
|
||||
@property
|
||||
def name(self) -> str:
|
||||
"""Tool name (from schema)."""
|
||||
return self.schema.name
|
||||
|
||||
@abstractmethod
|
||||
async def execute(self, **kwargs) -> ToolResult:
|
||||
"""
|
||||
Execute the tool with given arguments.
|
||||
|
||||
Args:
|
||||
**kwargs: Tool-specific arguments
|
||||
|
||||
Returns:
|
||||
ToolResult with success/failure and output
|
||||
"""
|
||||
pass
|
||||
|
||||
def is_available(self) -> tuple[bool, str | None]:
|
||||
"""
|
||||
Return whether this tool should be exposed/executable in the current process.
|
||||
|
||||
Tools that depend on optional binaries/services/env vars can override this
|
||||
to avoid advertising a tool that will fail at runtime.
|
||||
"""
|
||||
return True, None
|
||||
|
||||
async def __call__(self, **kwargs) -> ToolResult:
|
||||
"""Allow calling tool instance directly."""
|
||||
return await self.execute(**kwargs)
|
||||
|
||||
# Note: This is only wrapping declarations for the external ToolServer (for execution on external process tools), and tools preinstalled in envs
|
||||
class ToolRegistry:
|
||||
"""Registry of available tools."""
|
||||
|
||||
def __init__(self):
|
||||
self._tools: Dict[str, Tool] = {}
|
||||
|
||||
def register(self, tool: Tool) -> None:
|
||||
"""Register a tool."""
|
||||
self._tools[tool.name] = tool
|
||||
|
||||
def get(self, name: str) -> Optional[Tool]:
|
||||
"""Get a tool by name."""
|
||||
return self._tools.get(name)
|
||||
|
||||
def list_tools(self) -> List[Tool]:
|
||||
"""List all registered tools."""
|
||||
return list(self._tools.values())
|
||||
|
||||
def get_schemas(self) -> List[ToolSchema]:
|
||||
"""Get schemas for all registered tools."""
|
||||
return [tool.schema for tool in self._tools.values()]
|
||||
|
||||
def get_prompt_description(self) -> str:
|
||||
"""Generate tool descriptions for system prompt."""
|
||||
descriptions = [tool.schema.to_prompt_description() for tool in self._tools.values()]
|
||||
return "\n\n".join(descriptions)
|
||||
|
||||
def get_prompt_tool_definitions_json(self) -> str:
|
||||
"""
|
||||
Return a Hermes-style JSON list of tool definitions for use inside a `<tools>...</tools>` block.
|
||||
|
||||
Hermes trajectories historically use a simplified schema list:
|
||||
[{"name": ..., "description": ..., "parameters": {...}, "required": null}, ...]
|
||||
"""
|
||||
formatted: List[Dict[str, Any]] = []
|
||||
for tool in self._tools.values():
|
||||
fn = tool.schema.to_dict().get("function", {})
|
||||
formatted.append(
|
||||
{
|
||||
"name": fn.get("name", tool.name),
|
||||
"description": fn.get("description", ""),
|
||||
"parameters": fn.get("parameters", {}),
|
||||
# Keep parity with Hermes saved trajectories (required is typically null there).
|
||||
"required": None,
|
||||
}
|
||||
)
|
||||
return json.dumps(formatted, ensure_ascii=False)
|
||||
|
||||
async def execute(self, call: ToolCall) -> ToolResult:
|
||||
"""Execute a tool call."""
|
||||
tool = self.get(call.name)
|
||||
if tool is None:
|
||||
return ToolResult(
|
||||
success=False,
|
||||
error=f"Unknown tool: {call.name}",
|
||||
uniq_id=call.uniq_id,
|
||||
)
|
||||
|
||||
try:
|
||||
result = await tool.execute(**call.arguments)
|
||||
if result.uniq_id is None:
|
||||
result.uniq_id = call.uniq_id
|
||||
return result
|
||||
except Exception as e:
|
||||
return ToolResult(
|
||||
success=False,
|
||||
error=f"Tool execution error: {str(e)}",
|
||||
uniq_id=call.uniq_id,
|
||||
)
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# FastAPI / transport models
|
||||
# =============================================================================
|
||||
|
||||
|
||||
class ToolCallPayload(BaseModel):
|
||||
name: str
|
||||
arguments: Dict[str, Any] = Field(default_factory=dict)
|
||||
uniq_id: str
|
||||
|
||||
@classmethod
|
||||
def from_tool_call(cls, call: ToolCall) -> "ToolCallPayload":
|
||||
return cls(name=call.name, arguments=call.arguments, uniq_id=call.uniq_id)
|
||||
|
||||
def to_tool_call(self) -> ToolCall:
|
||||
return ToolCall(name=self.name, arguments=self.arguments, uniq_id=self.uniq_id)
|
||||
|
||||
|
||||
class ToolResultPayload(BaseModel):
|
||||
success: bool
|
||||
output: str = ""
|
||||
error: str = ""
|
||||
metadata: Dict[str, Any] = Field(default_factory=dict)
|
||||
uniq_id: Optional[str] = None
|
||||
|
||||
@classmethod
|
||||
def from_tool_result(cls, result: ToolResult) -> "ToolResultPayload":
|
||||
return cls(
|
||||
success=result.success,
|
||||
output=result.output,
|
||||
error=result.error,
|
||||
metadata=result.metadata,
|
||||
uniq_id=result.uniq_id,
|
||||
)
|
||||
|
||||
def to_tool_result(self) -> ToolResult:
|
||||
return ToolResult(
|
||||
success=self.success,
|
||||
output=self.output,
|
||||
error=self.error,
|
||||
metadata=self.metadata,
|
||||
uniq_id=self.uniq_id,
|
||||
)
|
||||
|
||||
|
||||
class ToolExecutorExecuteRequest(BaseModel):
|
||||
trajectory_id: str
|
||||
tool: ToolCallPayload
|
||||
timeout_s: Optional[float] = None
|
||||
|
||||
|
||||
class ToolExecutorReleaseRequest(BaseModel):
|
||||
trajectory_id: str
|
||||
reset_workspace: bool = False
|
||||
|
||||
|
||||
class ToolServerExecuteRequest(BaseModel):
|
||||
trajectory_id: Optional[str] = None
|
||||
tool: ToolCallPayload
|
||||
timeout_s: Optional[float] = None
|
||||
# Optional sandbox context for tools that need workspace artifacts.
|
||||
# This is set by ToolExecutor and is NOT model-controlled.
|
||||
slot_id: Optional[str] = None
|
||||
container_addr: Optional[str] = None
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Artifact transport models
|
||||
# =============================================================================
|
||||
|
||||
|
||||
class ArtifactReadRequestPayload(BaseModel):
|
||||
trajectory_id: str
|
||||
path: str
|
||||
encoding: Literal["text", "base64"] = "text"
|
||||
max_bytes: Optional[int] = None
|
||||
include_sha256: bool = False
|
||||
|
||||
|
||||
class ArtifactReadResponsePayload(BaseModel):
|
||||
success: bool
|
||||
content: str = ""
|
||||
error: str = ""
|
||||
encoding: str = "text"
|
||||
truncated: bool = False
|
||||
bytes: int = 0
|
||||
file_size: Optional[int] = None
|
||||
path: str = ""
|
||||
mime: Optional[str] = None
|
||||
sha256: Optional[str] = None
|
||||
|
||||
|
||||
class ArtifactListRequestPayload(BaseModel):
|
||||
trajectory_id: str
|
||||
path: str = "."
|
||||
recursive: bool = False
|
||||
max_entries: Optional[int] = None
|
||||
|
||||
|
||||
class ArtifactListEntryPayload(BaseModel):
|
||||
path: str
|
||||
is_dir: bool
|
||||
size: int
|
||||
mtime: float
|
||||
|
||||
|
||||
class ArtifactListResponsePayload(BaseModel):
|
||||
success: bool
|
||||
entries: List[ArtifactListEntryPayload] = Field(default_factory=list)
|
||||
truncated: bool = False
|
||||
error: str = ""
|
||||
|
||||
|
||||
class ArtifactArchiveRequestPayload(BaseModel):
|
||||
trajectory_id: str
|
||||
path: str = "."
|
||||
format: Literal["tar.gz", "tgz"] = "tar.gz"
|
||||
max_bytes: Optional[int] = None
|
||||
max_entries: Optional[int] = None
|
||||
|
||||
|
||||
class ArtifactArchiveResponsePayload(BaseModel):
|
||||
success: bool
|
||||
content: str = ""
|
||||
error: str = ""
|
||||
encoding: str = "base64"
|
||||
format: str = "tar.gz"
|
||||
bytes: int = 0
|
||||
entry_count: int = 0
|
||||
@@ -1,64 +0,0 @@
|
||||
"""
|
||||
Unified tool registry builder for Hermes-Agent Atropos integration.
|
||||
|
||||
This composes:
|
||||
- sandbox tool stubs (terminal/bash/read_file/write_file + stateful terminal/tmux)
|
||||
- Hermes external tools (web/vision/image/moa/skills/browser), executed via ToolServer
|
||||
|
||||
ToolExecutor only needs the schema + `external` routing bit; ToolServer executes
|
||||
the external tools via Hermes' existing implementations.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import List, Optional
|
||||
|
||||
from .base import ToolRegistry
|
||||
from .hermes_external_tools import build_external_tools
|
||||
from .sandbox_stubs import BashTool, ReadFileTool, TerminalTool, WriteFileTool
|
||||
from .terminal_stateful_tool import TerminalStatefulTool
|
||||
from .tmux_tool import TmuxTool
|
||||
from .toolset_resolver import resolve_multiple_toolsets
|
||||
|
||||
|
||||
def build_tool_registry(
|
||||
*,
|
||||
enabled_toolsets: Optional[List[str]] = None,
|
||||
disabled_toolsets: Optional[List[str]] = None,
|
||||
tool_server_url: Optional[str] = None,
|
||||
) -> ToolRegistry:
|
||||
"""
|
||||
Build a ToolRegistry for AgentEnv / ToolExecutor / ToolServer.
|
||||
|
||||
If `tool_server_url` is not provided, external tools will be omitted so we do
|
||||
not advertise tools that cannot execute.
|
||||
"""
|
||||
enabled_toolsets = enabled_toolsets or ["default"]
|
||||
|
||||
# Resolve tool names using Hermes toolsets plus Atropos additions.
|
||||
selected = set(resolve_multiple_toolsets(enabled_toolsets))
|
||||
if disabled_toolsets:
|
||||
selected -= set(resolve_multiple_toolsets(disabled_toolsets))
|
||||
|
||||
reg = ToolRegistry()
|
||||
|
||||
# Always register sandbox tools if selected.
|
||||
sandbox_by_name = {
|
||||
"terminal": TerminalTool(),
|
||||
"bash": BashTool(),
|
||||
"read_file": ReadFileTool(),
|
||||
"write_file": WriteFileTool(),
|
||||
"terminal_stateful": TerminalStatefulTool(),
|
||||
"tmux": TmuxTool(),
|
||||
}
|
||||
for name, tool in sandbox_by_name.items():
|
||||
if name in selected:
|
||||
reg.register(tool)
|
||||
|
||||
# External tools: only include when ToolServer is configured.
|
||||
if tool_server_url:
|
||||
for tool in build_external_tools(selected_tool_names=selected):
|
||||
if tool.name in selected:
|
||||
reg.register(tool)
|
||||
|
||||
return reg
|
||||
@@ -1,90 +0,0 @@
|
||||
"""
|
||||
Hermes external tool adapter for Atropos ToolServer.
|
||||
|
||||
These tools reuse Hermes-Agent's existing tool runner (`model_tools.handle_function_call`)
|
||||
so we don't duplicate external tool implementations.
|
||||
|
||||
Important:
|
||||
- These are marked `external=True` and should be executed ONLY by ToolServer.
|
||||
- We run `handle_function_call` in a worker thread because the Hermes implementation
|
||||
uses `asyncio.run()` internally for some async tools (web_extract, vision, MoA, etc).
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import json
|
||||
from typing import Any, Dict, List, Optional
|
||||
|
||||
import model_tools
|
||||
|
||||
from .base import Tool, ToolResult, ToolSchema
|
||||
|
||||
|
||||
def _schema_from_openai_tool_dict(tool: Dict[str, Any], *, external: bool) -> ToolSchema:
|
||||
fn = tool.get("function") or {}
|
||||
name = str(fn.get("name") or "")
|
||||
description = str(fn.get("description") or "")
|
||||
params = fn.get("parameters") or {}
|
||||
properties = params.get("properties") or {}
|
||||
required = params.get("required") or []
|
||||
if not isinstance(required, list):
|
||||
required = []
|
||||
return ToolSchema(
|
||||
name=name,
|
||||
description=description,
|
||||
parameters=dict(properties),
|
||||
required=[str(x) for x in required if isinstance(x, (str, int))],
|
||||
external=external,
|
||||
)
|
||||
|
||||
|
||||
class HermesExternalTool(Tool):
|
||||
def __init__(self, schema: ToolSchema):
|
||||
self._schema = schema
|
||||
|
||||
@property
|
||||
def schema(self) -> ToolSchema:
|
||||
return self._schema
|
||||
|
||||
async def execute(self, task_id: Optional[str] = None, **kwargs: Any) -> ToolResult:
|
||||
# `model_tools.handle_function_call` returns a JSON string (success or error).
|
||||
# Run in a thread because some Hermes tool handlers call `asyncio.run()`.
|
||||
raw = await asyncio.to_thread(model_tools.handle_function_call, self.name, kwargs, task_id)
|
||||
|
||||
try:
|
||||
parsed = json.loads(raw)
|
||||
except Exception:
|
||||
# Keep as plain string.
|
||||
return ToolResult(success=True, output=str(raw))
|
||||
|
||||
if isinstance(parsed, dict) and parsed.get("error"):
|
||||
return ToolResult(success=False, error=str(parsed.get("error")), output="")
|
||||
|
||||
return ToolResult(success=True, output=json.dumps(parsed, ensure_ascii=False))
|
||||
|
||||
|
||||
def build_external_tools(
|
||||
*,
|
||||
selected_tool_names: Optional[set[str]] = None,
|
||||
) -> List[HermesExternalTool]:
|
||||
"""
|
||||
Build external tool wrappers from Hermes tool declarations.
|
||||
|
||||
Filters out sandbox-oriented tools (e.g. `terminal`) since those should run
|
||||
inside the sandbox via ToolExecutor.
|
||||
"""
|
||||
# IMPORTANT: Hermes' `model_tools.get_tool_definitions()` only understands Hermes toolsets.
|
||||
# Atropos envs add extra toolsets (filesystem/sandbox/stateful). To avoid noisy "Unknown toolset"
|
||||
# prints and accidental filtering, we fetch ALL Hermes tool definitions here and filter by name.
|
||||
tools = model_tools.get_tool_definitions(enabled_toolsets=None, disabled_toolsets=None, quiet_mode=True)
|
||||
|
||||
wrappers: List[HermesExternalTool] = []
|
||||
for t in tools:
|
||||
schema = _schema_from_openai_tool_dict(t, external=True)
|
||||
if schema.name in {"terminal"}:
|
||||
continue
|
||||
if selected_tool_names is not None and schema.name not in selected_tool_names:
|
||||
continue
|
||||
wrappers.append(HermesExternalTool(schema))
|
||||
return wrappers
|
||||
@@ -1,99 +0,0 @@
|
||||
"""
|
||||
Sandbox tool stubs for Atropos ToolExecutor.
|
||||
|
||||
These tools are executed inside the sandbox containers via:
|
||||
ToolExecutor -> SlotPool -> sandbox_server.py
|
||||
|
||||
They intentionally do NOT execute anything on the host process. If they are
|
||||
called directly (outside ToolExecutor), they return a clear error.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Optional
|
||||
|
||||
from .base import Tool, ToolResult, ToolSchema
|
||||
|
||||
|
||||
class TerminalTool(Tool):
|
||||
@property
|
||||
def schema(self) -> ToolSchema:
|
||||
return ToolSchema(
|
||||
name="terminal",
|
||||
description=(
|
||||
"Execute a command inside the sandbox slot workspace and return stdout/stderr. "
|
||||
"Filesystem persists within a trajectory slot. Background processes are not supported "
|
||||
"in stateless mode. Commands run under POSIX /bin/sh and each tool call runs in a fresh "
|
||||
"shell (no persisted env vars). Avoid bash-only syntax like `source`; prefer `. .venv/bin/activate` "
|
||||
"or invoke `.venv/bin/python ...` directly."
|
||||
),
|
||||
parameters={
|
||||
"command": {"type": "string", "description": "The command to execute"},
|
||||
"timeout": {
|
||||
"type": "integer",
|
||||
"description": "Command timeout in seconds (optional).",
|
||||
"minimum": 1,
|
||||
},
|
||||
"background": {
|
||||
"type": "boolean",
|
||||
"description": "Not supported in sandbox terminal (always false).",
|
||||
"default": False,
|
||||
},
|
||||
},
|
||||
required=["command"],
|
||||
external=False,
|
||||
)
|
||||
|
||||
async def execute(self, **_kwargs) -> ToolResult:
|
||||
return ToolResult(
|
||||
success=False,
|
||||
error="terminal must be executed via ToolExecutor inside the sandbox",
|
||||
)
|
||||
|
||||
|
||||
class BashTool(Tool):
|
||||
@property
|
||||
def schema(self) -> ToolSchema:
|
||||
return ToolSchema(
|
||||
name="bash",
|
||||
description="Execute a bash command inside the sandbox slot workspace.",
|
||||
parameters={"command": {"type": "string", "description": "The bash command to execute"}},
|
||||
required=["command"],
|
||||
external=False,
|
||||
)
|
||||
|
||||
async def execute(self, **_kwargs) -> ToolResult:
|
||||
return ToolResult(success=False, error="bash must be executed via ToolExecutor inside the sandbox")
|
||||
|
||||
|
||||
class ReadFileTool(Tool):
|
||||
@property
|
||||
def schema(self) -> ToolSchema:
|
||||
return ToolSchema(
|
||||
name="read_file",
|
||||
description="Read a file from the sandbox slot workspace.",
|
||||
parameters={"path": {"type": "string", "description": "Path to the file"}},
|
||||
required=["path"],
|
||||
external=False,
|
||||
)
|
||||
|
||||
async def execute(self, **_kwargs) -> ToolResult:
|
||||
return ToolResult(success=False, error="read_file must be executed via ToolExecutor inside the sandbox")
|
||||
|
||||
|
||||
class WriteFileTool(Tool):
|
||||
@property
|
||||
def schema(self) -> ToolSchema:
|
||||
return ToolSchema(
|
||||
name="write_file",
|
||||
description="Write a file into the sandbox slot workspace.",
|
||||
parameters={
|
||||
"path": {"type": "string", "description": "Path to the file"},
|
||||
"content": {"type": "string", "description": "File content"},
|
||||
},
|
||||
required=["path", "content"],
|
||||
external=False,
|
||||
)
|
||||
|
||||
async def execute(self, **_kwargs) -> ToolResult:
|
||||
return ToolResult(success=False, error="write_file must be executed via ToolExecutor inside the sandbox")
|
||||
@@ -1,45 +0,0 @@
|
||||
"""
|
||||
Stateful terminal tool schema.
|
||||
|
||||
This is a sandbox tool that routes to the sandbox server as `bash_stateful`
|
||||
via ToolExecutor mapping. It exists to expose an explicit, opt-in terminal
|
||||
primitive suitable for stateful workflows (e.g. tmux sessions / TUIs).
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Optional
|
||||
|
||||
from .base import Tool, ToolResult, ToolSchema
|
||||
|
||||
|
||||
class TerminalStatefulTool(Tool):
|
||||
@property
|
||||
def schema(self) -> ToolSchema:
|
||||
return ToolSchema(
|
||||
name="terminal_stateful",
|
||||
description=(
|
||||
"Execute a command in the sandbox, allowing stateful/background processes to persist "
|
||||
"across tool calls within the same trajectory slot (e.g. tmux sessions). "
|
||||
"Use sparingly; output is still non-interactive."
|
||||
),
|
||||
parameters={
|
||||
"command": {"type": "string", "description": "The command to execute"},
|
||||
"timeout": {
|
||||
"type": "integer",
|
||||
"description": "Command timeout in seconds (optional).",
|
||||
"minimum": 1,
|
||||
},
|
||||
},
|
||||
required=["command"],
|
||||
)
|
||||
|
||||
def is_available(self) -> tuple[bool, str | None]:
|
||||
return True, None
|
||||
|
||||
async def execute(self, command: str, timeout: Optional[int] = None) -> ToolResult:
|
||||
_ = (command, timeout)
|
||||
return ToolResult(
|
||||
success=False,
|
||||
error="terminal_stateful must be executed via ToolExecutor inside the sandbox",
|
||||
)
|
||||
@@ -1,89 +0,0 @@
|
||||
"""
|
||||
tmux tool schema (sandbox).
|
||||
|
||||
This is a sandbox tool that provides basic tmux session control suitable for
|
||||
TUI-style terminal interactions:
|
||||
- send keys (arrow keys, enter, etc.)
|
||||
- capture the current screen buffer
|
||||
|
||||
Execution is routed by ToolExecutor to the sandbox server's `tmux` backend.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Any, Dict, Optional
|
||||
|
||||
from .base import Tool, ToolResult, ToolSchema
|
||||
|
||||
|
||||
class TmuxTool(Tool):
|
||||
@property
|
||||
def schema(self) -> ToolSchema:
|
||||
return ToolSchema(
|
||||
name="tmux",
|
||||
description=(
|
||||
"Control a per-trajectory tmux session inside the sandbox (stateful terminal). "
|
||||
"Use this for TUI-style interactions: send keys and capture the current screen."
|
||||
),
|
||||
parameters={
|
||||
"action": {
|
||||
"type": "string",
|
||||
"description": "Action to perform: start | send_keys | stream | stop.",
|
||||
"enum": ["start", "send_keys", "stream", "stop", "capture"],
|
||||
},
|
||||
"keys": {
|
||||
"description": "Keys to send (string or list of strings) when action=send_keys.",
|
||||
},
|
||||
"block": {
|
||||
"type": "boolean",
|
||||
"description": "If true, wait for shell command completion (only valid at a shell prompt).",
|
||||
"default": False,
|
||||
},
|
||||
"min_wait_s": {
|
||||
"type": "number",
|
||||
"description": "For non-blocking send_keys, sleep this long after sending keys (seconds).",
|
||||
"default": 0.0,
|
||||
},
|
||||
"max_wait_s": {
|
||||
"type": "number",
|
||||
"description": "For blocking send_keys, max time to wait for completion (seconds).",
|
||||
},
|
||||
"capture_entire": {
|
||||
"type": "boolean",
|
||||
"description": "Deprecated. Streaming is preferred.",
|
||||
"default": False,
|
||||
},
|
||||
"max_bytes": {
|
||||
"type": "integer",
|
||||
"description": "Max bytes to return per stream call.",
|
||||
},
|
||||
"reset": {
|
||||
"type": "boolean",
|
||||
"description": "If true, reset stream offset to the beginning of the asciinema recording.",
|
||||
"default": False,
|
||||
},
|
||||
"pane_width": {
|
||||
"type": "integer",
|
||||
"description": "Pane width for action=start (columns).",
|
||||
"minimum": 20,
|
||||
},
|
||||
"pane_height": {
|
||||
"type": "integer",
|
||||
"description": "Pane height for action=start (rows).",
|
||||
"minimum": 10,
|
||||
},
|
||||
},
|
||||
required=["action"],
|
||||
)
|
||||
|
||||
def is_available(self) -> tuple[bool, str | None]:
|
||||
return True, None
|
||||
|
||||
async def execute(self, **kwargs: Dict[str, Any]) -> ToolResult:
|
||||
# This tool is intended to be executed via ToolExecutor -> sandbox server.
|
||||
# We keep a safe fallback for non-sandbox contexts.
|
||||
action = str(kwargs.get("action") or "").strip()
|
||||
return ToolResult(
|
||||
success=False,
|
||||
error=f"tmux tool must be executed in the sandbox (got action={action!r})",
|
||||
)
|
||||
@@ -1,500 +0,0 @@
|
||||
"""
|
||||
ToolExecutor - queued, batched tool dispatch for multiplexed agent trajectories.
|
||||
|
||||
This component is responsible for:
|
||||
- Maintaining trajectory -> Slot affinity (workspace continuity)
|
||||
- Batching sandbox tool calls across trajectories to maximize container utilization
|
||||
- Routing external tools (ToolSchema.external=True) to a ToolServer (Phase 4.5)
|
||||
|
||||
For now, only sandbox tools are executed:
|
||||
- bash
|
||||
- read_file
|
||||
- write_file
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import time
|
||||
from dataclasses import dataclass
|
||||
from typing import Any, Dict, List, Optional
|
||||
|
||||
import httpx
|
||||
|
||||
from .base import (
|
||||
ArtifactArchiveRequestPayload,
|
||||
ArtifactArchiveResponsePayload,
|
||||
ArtifactListRequestPayload,
|
||||
ArtifactListResponsePayload,
|
||||
ArtifactReadRequestPayload,
|
||||
ArtifactReadResponsePayload,
|
||||
ToolCall,
|
||||
ToolCallPayload,
|
||||
ToolRegistry,
|
||||
ToolResult,
|
||||
ToolResultPayload,
|
||||
ToolServerExecuteRequest,
|
||||
)
|
||||
from ..backends.base import ToolBackend
|
||||
from ..slots import Slot
|
||||
|
||||
|
||||
@dataclass
|
||||
class ToolExecutorConfig:
|
||||
batch_window_ms: int = 20
|
||||
max_batch_size: int = 200
|
||||
allow_network: bool = True
|
||||
require_sandbox: bool = False
|
||||
require_stateful_sandbox: bool = False
|
||||
tool_server_url: Optional[str] = None
|
||||
tool_server_token: Optional[str] = None
|
||||
|
||||
|
||||
@dataclass
|
||||
class _QueuedToolRequest:
|
||||
trajectory_id: str
|
||||
call: ToolCall
|
||||
timeout_s: Optional[float]
|
||||
future: asyncio.Future
|
||||
|
||||
|
||||
class ToolExecutor:
|
||||
def __init__(
|
||||
self,
|
||||
backend: ToolBackend,
|
||||
tools: ToolRegistry,
|
||||
config: Optional[ToolExecutorConfig] = None,
|
||||
) -> None:
|
||||
self.backend = backend
|
||||
self.tools = tools
|
||||
self.config = config or ToolExecutorConfig()
|
||||
|
||||
self._queue: asyncio.Queue[Optional[_QueuedToolRequest]] = asyncio.Queue()
|
||||
self._task: Optional[asyncio.Task] = None
|
||||
self._stopping = asyncio.Event()
|
||||
|
||||
self._slots_lock = asyncio.Lock()
|
||||
self._slot_by_trajectory: Dict[str, Slot] = {}
|
||||
|
||||
self._tool_server_client: Optional[httpx.AsyncClient] = None
|
||||
self._tool_server_lock = asyncio.Lock()
|
||||
|
||||
# lightweight stats for status endpoints
|
||||
self.total_requests: int = 0
|
||||
self.total_errors: int = 0
|
||||
self.latencies_s: List[float] = []
|
||||
|
||||
async def start(self) -> None:
|
||||
if self._task is None:
|
||||
self._task = asyncio.create_task(self._run_loop())
|
||||
|
||||
def queue_size(self) -> int:
|
||||
return self._queue.qsize()
|
||||
|
||||
async def close(self) -> None:
|
||||
self._stopping.set()
|
||||
await self._queue.put(None)
|
||||
if self._task:
|
||||
await self._task
|
||||
self._task = None
|
||||
|
||||
client = self._tool_server_client
|
||||
self._tool_server_client = None
|
||||
if client is not None:
|
||||
await client.aclose()
|
||||
|
||||
# Best-effort release any remaining slots.
|
||||
async with self._slots_lock:
|
||||
slots = list(self._slot_by_trajectory.items())
|
||||
self._slot_by_trajectory.clear()
|
||||
|
||||
for _, slot in slots:
|
||||
try:
|
||||
await self.backend.release(slot, reset_workspace=False)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
async def execute(
|
||||
self,
|
||||
trajectory_id: str,
|
||||
call: ToolCall,
|
||||
timeout_s: Optional[float] = None,
|
||||
) -> ToolResult:
|
||||
if self._task is None:
|
||||
raise RuntimeError("ToolExecutor not started (call start() first)")
|
||||
|
||||
# Allow tool args to suggest a timeout (Hermes-compatible terminal tool),
|
||||
# but never let the model choose "infinite" timeouts.
|
||||
if timeout_s is None:
|
||||
raw_timeout = call.arguments.get("timeout")
|
||||
if isinstance(raw_timeout, (int, float)):
|
||||
timeout_s = float(raw_timeout)
|
||||
if timeout_s is not None:
|
||||
timeout_s = max(1.0, min(float(timeout_s), 600.0))
|
||||
|
||||
loop = asyncio.get_running_loop()
|
||||
fut: asyncio.Future = loop.create_future()
|
||||
started = time.perf_counter()
|
||||
await self._queue.put(_QueuedToolRequest(trajectory_id=trajectory_id, call=call, timeout_s=timeout_s, future=fut))
|
||||
try:
|
||||
result: ToolResult = await fut
|
||||
return result
|
||||
finally:
|
||||
self.latencies_s.append(time.perf_counter() - started)
|
||||
|
||||
async def release_trajectory(self, trajectory_id: str, reset_workspace: bool = False) -> None:
|
||||
async with self._slots_lock:
|
||||
slot = self._slot_by_trajectory.pop(trajectory_id, None)
|
||||
|
||||
if slot is not None:
|
||||
await self.backend.release(slot, reset_workspace=reset_workspace)
|
||||
|
||||
async def _get_slot_if_present(self, trajectory_id: str) -> Optional[Slot]:
|
||||
async with self._slots_lock:
|
||||
return self._slot_by_trajectory.get(trajectory_id)
|
||||
|
||||
# ---------------------------------------------------------------------
|
||||
# Artifact helpers (optional)
|
||||
# ---------------------------------------------------------------------
|
||||
|
||||
async def read_artifact(self, req: ArtifactReadRequestPayload) -> ArtifactReadResponsePayload:
|
||||
slot = await self._get_slot_if_present(req.trajectory_id)
|
||||
if slot is None:
|
||||
return ArtifactReadResponsePayload(success=False, error="No active slot for trajectory (run a sandbox tool first)")
|
||||
data = await self.backend.read_artifact(
|
||||
slot,
|
||||
req.path,
|
||||
encoding=req.encoding,
|
||||
max_bytes=req.max_bytes,
|
||||
include_sha256=req.include_sha256,
|
||||
)
|
||||
if isinstance(data, dict):
|
||||
data = dict(data)
|
||||
data.pop("http_status", None)
|
||||
try:
|
||||
return ArtifactReadResponsePayload(**(data or {}))
|
||||
except Exception as e:
|
||||
return ArtifactReadResponsePayload(success=False, error=f"Invalid artifact read response: {e}")
|
||||
|
||||
async def list_artifacts(self, req: ArtifactListRequestPayload) -> ArtifactListResponsePayload:
|
||||
slot = await self._get_slot_if_present(req.trajectory_id)
|
||||
if slot is None:
|
||||
return ArtifactListResponsePayload(success=False, error="No active slot for trajectory (run a sandbox tool first)")
|
||||
data = await self.backend.list_artifacts(
|
||||
slot,
|
||||
req.path,
|
||||
recursive=req.recursive,
|
||||
max_entries=req.max_entries,
|
||||
)
|
||||
if isinstance(data, dict):
|
||||
data = dict(data)
|
||||
data.pop("http_status", None)
|
||||
try:
|
||||
return ArtifactListResponsePayload(**(data or {}))
|
||||
except Exception as e:
|
||||
return ArtifactListResponsePayload(success=False, error=f"Invalid artifact list response: {e}")
|
||||
|
||||
async def archive_artifacts(self, req: ArtifactArchiveRequestPayload) -> ArtifactArchiveResponsePayload:
|
||||
slot = await self._get_slot_if_present(req.trajectory_id)
|
||||
if slot is None:
|
||||
return ArtifactArchiveResponsePayload(success=False, error="No active slot for trajectory (run a sandbox tool first)")
|
||||
data = await self.backend.archive_artifacts(
|
||||
slot,
|
||||
req.path,
|
||||
archive_format=req.format,
|
||||
max_bytes=req.max_bytes,
|
||||
max_entries=req.max_entries,
|
||||
)
|
||||
if isinstance(data, dict):
|
||||
data = dict(data)
|
||||
data.pop("http_status", None)
|
||||
try:
|
||||
return ArtifactArchiveResponsePayload(**(data or {}))
|
||||
except Exception as e:
|
||||
return ArtifactArchiveResponsePayload(success=False, error=f"Invalid artifact archive response: {e}")
|
||||
|
||||
async def _get_or_acquire_slot(self, trajectory_id: str) -> Slot:
|
||||
async with self._slots_lock:
|
||||
existing = self._slot_by_trajectory.get(trajectory_id)
|
||||
if existing is not None:
|
||||
return existing
|
||||
|
||||
slot = await self.backend.acquire(trajectory_id)
|
||||
|
||||
async with self._slots_lock:
|
||||
existing = self._slot_by_trajectory.get(trajectory_id)
|
||||
if existing is not None:
|
||||
# Another coroutine won the race; return its slot.
|
||||
await self.backend.release(slot, reset_workspace=False)
|
||||
return existing
|
||||
self._slot_by_trajectory[trajectory_id] = slot
|
||||
return slot
|
||||
|
||||
async def _run_loop(self) -> None:
|
||||
pending: List[_QueuedToolRequest] = []
|
||||
deadline: Optional[float] = None
|
||||
|
||||
batch_window_s = max(0.0, self.config.batch_window_ms / 1000.0)
|
||||
max_batch = max(1, self.config.max_batch_size)
|
||||
|
||||
while True:
|
||||
if self._stopping.is_set() and self._queue.empty() and not pending:
|
||||
break
|
||||
|
||||
timeout = None
|
||||
if pending and deadline is not None:
|
||||
timeout = max(0.0, deadline - time.perf_counter())
|
||||
|
||||
try:
|
||||
item = await asyncio.wait_for(self._queue.get(), timeout=timeout)
|
||||
if item is None:
|
||||
continue
|
||||
pending.append(item)
|
||||
if len(pending) == 1:
|
||||
deadline = time.perf_counter() + batch_window_s
|
||||
if len(pending) < max_batch:
|
||||
continue
|
||||
except asyncio.TimeoutError:
|
||||
# batch window elapsed
|
||||
pass
|
||||
|
||||
if not pending:
|
||||
deadline = None
|
||||
continue
|
||||
|
||||
batch = pending
|
||||
pending = []
|
||||
deadline = None
|
||||
|
||||
await self._execute_batch(batch)
|
||||
|
||||
async def _get_tool_server_client(self) -> httpx.AsyncClient:
|
||||
url = self.config.tool_server_url
|
||||
if not url:
|
||||
raise RuntimeError("ToolServer not configured")
|
||||
|
||||
if self._tool_server_client is not None:
|
||||
return self._tool_server_client
|
||||
|
||||
async with self._tool_server_lock:
|
||||
if self._tool_server_client is None:
|
||||
self._tool_server_client = httpx.AsyncClient(base_url=url.rstrip("/"))
|
||||
return self._tool_server_client
|
||||
|
||||
def _tool_server_headers(self) -> Dict[str, str]:
|
||||
token = self.config.tool_server_token
|
||||
if not token:
|
||||
return {}
|
||||
return {"Authorization": f"Bearer {token}"}
|
||||
|
||||
async def _execute_external(self, req: _QueuedToolRequest) -> ToolResult:
|
||||
client = await self._get_tool_server_client()
|
||||
slot_id: Optional[str] = None
|
||||
container_addr: Optional[str] = None
|
||||
slot = await self._get_slot_if_present(req.trajectory_id)
|
||||
if slot is not None:
|
||||
slot_id = slot.slot_id
|
||||
container_addr = slot.container_addr
|
||||
|
||||
payload = ToolServerExecuteRequest(
|
||||
trajectory_id=req.trajectory_id,
|
||||
tool=ToolCallPayload.from_tool_call(req.call),
|
||||
timeout_s=req.timeout_s,
|
||||
slot_id=slot_id,
|
||||
container_addr=container_addr,
|
||||
)
|
||||
|
||||
try:
|
||||
resp = await client.post(
|
||||
"/execute",
|
||||
json=payload.model_dump(),
|
||||
headers=self._tool_server_headers(),
|
||||
timeout=req.timeout_s,
|
||||
)
|
||||
resp.raise_for_status()
|
||||
data = resp.json()
|
||||
parsed = ToolResultPayload(**data)
|
||||
result = parsed.to_tool_result()
|
||||
if result.uniq_id is None:
|
||||
result.uniq_id = req.call.uniq_id
|
||||
return result
|
||||
except Exception as e:
|
||||
return ToolResult(
|
||||
success=False,
|
||||
error=f"External tool failed: {e}",
|
||||
uniq_id=req.call.uniq_id,
|
||||
)
|
||||
|
||||
async def _execute_batch(self, batch: List[_QueuedToolRequest]) -> None:
|
||||
# Resolve tool schemas once per request and separate sandbox/external/unknown.
|
||||
sandbox_items: List[_QueuedToolRequest] = []
|
||||
external_items: List[_QueuedToolRequest] = []
|
||||
unknown_items: List[_QueuedToolRequest] = []
|
||||
|
||||
for it in batch:
|
||||
tool = self.tools.get(it.call.name)
|
||||
if tool is None:
|
||||
unknown_items.append(it)
|
||||
continue
|
||||
|
||||
schema = tool.schema
|
||||
if not schema.external:
|
||||
sandbox_items.append(it)
|
||||
else:
|
||||
external_items.append(it)
|
||||
|
||||
for it in unknown_items:
|
||||
self.total_requests += 1
|
||||
self.total_errors += 1
|
||||
if not it.future.done():
|
||||
it.future.set_result(
|
||||
ToolResult(
|
||||
success=False,
|
||||
error=f"Unknown tool: {it.call.name}",
|
||||
uniq_id=it.call.uniq_id,
|
||||
)
|
||||
)
|
||||
|
||||
if external_items:
|
||||
if not self.config.tool_server_url:
|
||||
for it in external_items:
|
||||
self.total_requests += 1
|
||||
self.total_errors += 1
|
||||
if not it.future.done():
|
||||
it.future.set_result(
|
||||
ToolResult(
|
||||
success=False,
|
||||
error=f"External tool not available (ToolServer not configured): {it.call.name}",
|
||||
uniq_id=it.call.uniq_id,
|
||||
)
|
||||
)
|
||||
else:
|
||||
results = await asyncio.gather(*[self._execute_external(it) for it in external_items])
|
||||
for it, res in zip(external_items, results):
|
||||
self.total_requests += 1
|
||||
if not getattr(res, "success", False):
|
||||
self.total_errors += 1
|
||||
if not it.future.done():
|
||||
it.future.set_result(res)
|
||||
|
||||
if not sandbox_items:
|
||||
return
|
||||
|
||||
# Acquire slots for the distinct trajectories in this batch.
|
||||
try:
|
||||
traj_ids = list({it.trajectory_id for it in sandbox_items})
|
||||
slots = await asyncio.gather(*[self._get_or_acquire_slot(tid) for tid in traj_ids])
|
||||
slot_by_traj = dict(zip(traj_ids, slots))
|
||||
except Exception as e:
|
||||
for it in sandbox_items:
|
||||
self.total_requests += 1
|
||||
self.total_errors += 1
|
||||
if not it.future.done():
|
||||
it.future.set_result(
|
||||
ToolResult(
|
||||
success=False,
|
||||
error=f"Failed to acquire slot: {e}",
|
||||
uniq_id=it.call.uniq_id,
|
||||
)
|
||||
)
|
||||
return
|
||||
|
||||
# Group by timeout so we don't accidentally make short timeouts wait on long ones.
|
||||
by_timeout: Dict[float, List[_QueuedToolRequest]] = {}
|
||||
default_timeout = self.backend.default_timeout_s
|
||||
|
||||
for it in sandbox_items:
|
||||
t = it.timeout_s
|
||||
if t is None:
|
||||
t = default_timeout
|
||||
if t is None:
|
||||
t = 30.0
|
||||
by_timeout.setdefault(float(t), []).append(it)
|
||||
|
||||
for timeout_s, items in by_timeout.items():
|
||||
requests = []
|
||||
dispatched: List[_QueuedToolRequest] = []
|
||||
for it in items:
|
||||
slot = slot_by_traj[it.trajectory_id]
|
||||
tool_name = it.call.name
|
||||
args = dict(it.call.arguments)
|
||||
|
||||
# Hermes compatibility: treat `terminal` as an alias of sandbox `bash`.
|
||||
if tool_name == "terminal":
|
||||
if args.get("background"):
|
||||
self.total_requests += 1
|
||||
self.total_errors += 1
|
||||
if not it.future.done():
|
||||
it.future.set_result(
|
||||
ToolResult(
|
||||
success=False,
|
||||
error="terminal background execution is not supported in sandbox",
|
||||
uniq_id=it.call.uniq_id,
|
||||
)
|
||||
)
|
||||
continue
|
||||
tool_name = "bash"
|
||||
# `timeout` is handled at the ToolExecutor level, not passed to the sandbox tool args.
|
||||
args.pop("timeout", None)
|
||||
elif tool_name == "terminal_stateful":
|
||||
tool_name = "bash_stateful"
|
||||
args.pop("timeout", None)
|
||||
elif tool_name == "tmux":
|
||||
# `tmux` is a sandbox tool backed by the stateful session manager.
|
||||
# Network policy is env-controlled.
|
||||
args.pop("allow_network", None)
|
||||
|
||||
if tool_name == "bash":
|
||||
# Network policy is set by the environment/executor, not by the model.
|
||||
args.pop("allow_network", None)
|
||||
args.pop("require_sandbox", None)
|
||||
args["allow_network"] = bool(self.config.allow_network)
|
||||
args["require_sandbox"] = bool(self.config.require_sandbox)
|
||||
# `timeout` is handled at the ToolExecutor level, not passed to the sandbox tool args.
|
||||
args.pop("timeout", None)
|
||||
elif tool_name == "bash_stateful":
|
||||
# Network policy is set by the environment/executor, not by the model.
|
||||
args.pop("allow_network", None)
|
||||
args.pop("require_sandbox", None)
|
||||
args.pop("require_stateful_sandbox", None)
|
||||
args["allow_network"] = bool(self.config.allow_network)
|
||||
args["require_stateful_sandbox"] = bool(self.config.require_stateful_sandbox)
|
||||
args.pop("timeout", None)
|
||||
elif tool_name == "tmux":
|
||||
# Network policy applies to the underlying stateful session.
|
||||
args.pop("allow_network", None)
|
||||
args.pop("require_sandbox", None)
|
||||
args.pop("require_stateful_sandbox", None)
|
||||
args["allow_network"] = bool(self.config.allow_network)
|
||||
args["require_stateful_sandbox"] = bool(self.config.require_stateful_sandbox)
|
||||
|
||||
requests.append((slot, tool_name, args))
|
||||
dispatched.append(it)
|
||||
|
||||
results = None
|
||||
try:
|
||||
if not dispatched:
|
||||
continue
|
||||
results = await self.backend.execute_batch(requests, timeout_s=timeout_s)
|
||||
except Exception as e:
|
||||
for it in items:
|
||||
self.total_requests += 1
|
||||
self.total_errors += 1
|
||||
if not it.future.done():
|
||||
it.future.set_result(
|
||||
ToolResult(
|
||||
success=False,
|
||||
error=f"Batch execution failed: {e}",
|
||||
uniq_id=it.call.uniq_id,
|
||||
)
|
||||
)
|
||||
continue
|
||||
|
||||
for it, res in zip(dispatched, results):
|
||||
self.total_requests += 1
|
||||
if not getattr(res, "success", False):
|
||||
self.total_errors += 1
|
||||
tool_result = res.to_tool_result()
|
||||
tool_result.uniq_id = it.call.uniq_id
|
||||
if not it.future.done():
|
||||
it.future.set_result(tool_result)
|
||||
@@ -1,88 +0,0 @@
|
||||
"""
|
||||
Toolset resolution for Hermes-Agent Atropos integration.
|
||||
|
||||
We primarily reuse Hermes-Agent toolsets (`toolsets.py`), but Atropos training/envs
|
||||
need a few extra sandbox-oriented toolsets that Hermes doesn't expose by default
|
||||
(e.g. filesystem + stateful terminal).
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Any, Dict, List, Optional, Set
|
||||
|
||||
import toolsets as hermes_toolsets
|
||||
|
||||
|
||||
ATROPOS_TOOLSETS: Dict[str, Dict[str, Any]] = {
|
||||
"filesystem": {
|
||||
"description": "Read/write files in the sandbox workspace.",
|
||||
"tools": ["read_file", "write_file"],
|
||||
"includes": [],
|
||||
},
|
||||
"terminal_stateful": {
|
||||
"description": "Stateful terminal execution (tmux/TUI support) inside the sandbox.",
|
||||
"tools": ["terminal_stateful", "tmux"],
|
||||
"includes": [],
|
||||
},
|
||||
"sandbox": {
|
||||
"description": "Sandbox tools (terminal + filesystem).",
|
||||
"tools": [],
|
||||
"includes": ["terminal", "filesystem"],
|
||||
},
|
||||
"default": {
|
||||
"description": "Default toolset for Atropos AgentEnv tasks.",
|
||||
"tools": [],
|
||||
"includes": ["sandbox"],
|
||||
},
|
||||
"full": {
|
||||
"description": "All Hermes tools plus Atropos sandbox additions.",
|
||||
"tools": [],
|
||||
"includes": ["all", "filesystem", "sandbox", "terminal_stateful"],
|
||||
},
|
||||
}
|
||||
|
||||
|
||||
def validate_toolset(name: str) -> bool:
|
||||
if name in {"all", "*"}:
|
||||
return True
|
||||
return hermes_toolsets.validate_toolset(name) or name in ATROPOS_TOOLSETS
|
||||
|
||||
|
||||
def resolve_toolset(name: str, visited: Optional[Set[str]] = None) -> List[str]:
|
||||
if visited is None:
|
||||
visited = set()
|
||||
|
||||
if name in {"all", "*"}:
|
||||
# Union Hermes + Atropos toolsets.
|
||||
all_tools: Set[str] = set()
|
||||
for tname in hermes_toolsets.get_toolset_names():
|
||||
all_tools.update(resolve_toolset(tname, visited=set()))
|
||||
for tname, spec in ATROPOS_TOOLSETS.items():
|
||||
# Avoid recursion: some Atropos toolsets (e.g. "full") include "all".
|
||||
if tname == "full" or "all" in (spec.get("includes") or []):
|
||||
continue
|
||||
all_tools.update(resolve_toolset(tname, visited=set()))
|
||||
return sorted(all_tools)
|
||||
|
||||
if name in ATROPOS_TOOLSETS:
|
||||
if name in visited:
|
||||
return []
|
||||
visited.add(name)
|
||||
spec = ATROPOS_TOOLSETS[name]
|
||||
tools: Set[str] = set(spec.get("tools", []))
|
||||
for inc in spec.get("includes", []):
|
||||
tools.update(resolve_toolset(inc, visited=set(visited)))
|
||||
return sorted(tools)
|
||||
|
||||
# Fall back to Hermes toolsets.
|
||||
# IMPORTANT: do not pre-add `name` to `visited` here; Hermes' resolver uses
|
||||
# `visited` for its own cycle detection and will treat the presence of `name`
|
||||
# as a circular dependency.
|
||||
return sorted(hermes_toolsets.resolve_toolset(name, visited=set(visited)))
|
||||
|
||||
|
||||
def resolve_multiple_toolsets(names: List[str]) -> List[str]:
|
||||
tools: Set[str] = set()
|
||||
for name in names:
|
||||
tools.update(resolve_toolset(name, visited=set()))
|
||||
return sorted(tools)
|
||||
@@ -1,415 +0,0 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Atropos-compatible Hermes agent runner.
|
||||
|
||||
This is a minimal subclass of Hermes-Agent's `AIAgent` that swaps the OpenAI
|
||||
function-calling backend for Atroposlib's `ManagedServer`/`ServerManager` backend
|
||||
and uses Hermes-style XML tool tags:
|
||||
|
||||
- <tool_call>{"name": "...", "arguments": {...}}</tool_call>
|
||||
- <tool_response>{...}</tool_response>
|
||||
|
||||
Tool observations are appended as `role="user"` messages containing one or more
|
||||
`<tool_response>` blocks so they survive common chat templates during tokenization.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import json
|
||||
import re
|
||||
import time
|
||||
import warnings
|
||||
import os
|
||||
from contextlib import asynccontextmanager
|
||||
from typing import Any, AsyncGenerator, Dict, List, Optional, Tuple
|
||||
|
||||
from model_tools import cleanup_vm, handle_function_call
|
||||
from run_agent import AIAgent
|
||||
|
||||
_TOOL_CALL_RE = re.compile(r"<tool_call>\\s*(.*?)\\s*</tool_call>", re.DOTALL)
|
||||
|
||||
|
||||
ATROPOS_TOOL_SYSTEM_PROMPT = """You are a helpful AI assistant with access to tools.
|
||||
|
||||
## Available Tools
|
||||
<tools>
|
||||
{tool_descriptions}
|
||||
</tools>
|
||||
|
||||
## How to Use Tools
|
||||
To call a tool, output:
|
||||
<tool_call>{{"name": "tool_name", "arguments": {{"arg1": "value1"}}}}</tool_call>
|
||||
|
||||
You may include optional reasoning in <think>...</think> before tool calls.
|
||||
|
||||
After each tool call, you will receive tool results as:
|
||||
<tool_response>{{...}}</tool_response>
|
||||
|
||||
Continue until finished, then provide a final response with no <tool_call> blocks.
|
||||
"""
|
||||
|
||||
|
||||
class AtroposAIAgent(AIAgent):
|
||||
"""
|
||||
Hermes `AIAgent` variant that uses Atroposlib ServerManager/ManagedServer.
|
||||
|
||||
Notes:
|
||||
- The default Hermes `AIAgent` remains unchanged; this class is opt-in.
|
||||
- The underlying server must expose `managed_server(tokenizer=...)` OR be a single
|
||||
APIServer-compatible object usable by Atroposlib's `ManagedServer`.
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
*,
|
||||
server: Any,
|
||||
tokenizer: Any = None,
|
||||
model: str = "local",
|
||||
max_iterations: int = 10,
|
||||
tool_delay: float = 0.0,
|
||||
enabled_toolsets: Optional[List[str]] = None,
|
||||
disabled_toolsets: Optional[List[str]] = None,
|
||||
save_trajectories: bool = False,
|
||||
verbose_logging: bool = False,
|
||||
quiet_mode: bool = False,
|
||||
ephemeral_system_prompt: Optional[str] = None,
|
||||
log_prefix_chars: int = 100,
|
||||
log_prefix: str = "",
|
||||
session_id: Optional[str] = None,
|
||||
temperature: Optional[float] = None,
|
||||
max_tokens: Optional[int] = None,
|
||||
):
|
||||
# Call parent init mainly to reuse tool selection + trajectory saving utilities.
|
||||
super().__init__(
|
||||
base_url="http://unused",
|
||||
api_key="dummy-key",
|
||||
model=model,
|
||||
max_iterations=max_iterations,
|
||||
tool_delay=tool_delay,
|
||||
enabled_toolsets=enabled_toolsets,
|
||||
disabled_toolsets=disabled_toolsets,
|
||||
save_trajectories=save_trajectories,
|
||||
verbose_logging=verbose_logging,
|
||||
quiet_mode=quiet_mode,
|
||||
ephemeral_system_prompt=ephemeral_system_prompt,
|
||||
log_prefix_chars=log_prefix_chars,
|
||||
log_prefix=log_prefix,
|
||||
session_id=session_id,
|
||||
)
|
||||
|
||||
self.server = server
|
||||
self.tokenizer = tokenizer
|
||||
self.temperature = temperature
|
||||
self.max_tokens = max_tokens
|
||||
|
||||
@asynccontextmanager
|
||||
async def _managed(self) -> AsyncGenerator[Any, None]:
|
||||
if hasattr(self.server, "managed_server"):
|
||||
with warnings.catch_warnings():
|
||||
warnings.filterwarnings(
|
||||
"ignore",
|
||||
message=r"Using OpenAIServer with managed_server does not allow for state tracking",
|
||||
category=UserWarning,
|
||||
)
|
||||
async with self.server.managed_server(tokenizer=self.tokenizer) as managed:
|
||||
yield managed
|
||||
return
|
||||
|
||||
# Fall back to directly wrapping a single server object.
|
||||
from atroposlib.envs.server_handling.managed_server import ManagedServer
|
||||
|
||||
managed = ManagedServer(server=self.server, tokenizer=self.tokenizer)
|
||||
try:
|
||||
yield managed
|
||||
finally:
|
||||
managed.reset()
|
||||
|
||||
def _tool_descriptions_text(self) -> str:
|
||||
if not self.tools:
|
||||
return "(no tools available)"
|
||||
|
||||
parts: List[str] = []
|
||||
for tool in self.tools:
|
||||
fn = (tool or {}).get("function", {})
|
||||
name = fn.get("name", "")
|
||||
desc = (fn.get("description") or "").strip()
|
||||
if not name:
|
||||
continue
|
||||
if desc:
|
||||
parts.append(f"- {name}: {desc}")
|
||||
else:
|
||||
parts.append(f"- {name}")
|
||||
return "\n".join(parts) if parts else "(no tools available)"
|
||||
|
||||
def _build_system_prompt(self, system_message: Optional[str]) -> Optional[str]:
|
||||
tool_prompt = ATROPOS_TOOL_SYSTEM_PROMPT.format(
|
||||
tool_descriptions=self._tool_descriptions_text()
|
||||
)
|
||||
|
||||
parts: List[str] = []
|
||||
if system_message:
|
||||
parts.append(system_message)
|
||||
if self.ephemeral_system_prompt:
|
||||
parts.append(self.ephemeral_system_prompt)
|
||||
parts.append(tool_prompt)
|
||||
|
||||
return "\n\n".join(parts)
|
||||
|
||||
def _parse_tool_calls(self, content: str) -> Tuple[List[Tuple[str, Dict[str, Any]]], List[str]]:
|
||||
"""
|
||||
Returns:
|
||||
(calls, errors)
|
||||
"""
|
||||
calls: List[Tuple[str, Dict[str, Any]]] = []
|
||||
errors: List[str] = []
|
||||
|
||||
for raw in _TOOL_CALL_RE.findall(content or ""):
|
||||
try:
|
||||
payload = json.loads(raw)
|
||||
except json.JSONDecodeError as exc:
|
||||
errors.append(f"Invalid JSON inside <tool_call>: {exc}")
|
||||
continue
|
||||
|
||||
name = payload.get("name")
|
||||
args = payload.get("arguments", {})
|
||||
if not isinstance(name, str) or not name:
|
||||
errors.append("Tool call missing 'name' string")
|
||||
continue
|
||||
if not isinstance(args, dict):
|
||||
errors.append("Tool call 'arguments' must be an object")
|
||||
continue
|
||||
|
||||
calls.append((name, args))
|
||||
|
||||
return calls, errors
|
||||
|
||||
async def run_conversation_async(
|
||||
self,
|
||||
user_message: str,
|
||||
system_message: Optional[str] = None,
|
||||
conversation_history: Optional[List[Dict[str, Any]]] = None,
|
||||
task_id: Optional[str] = None,
|
||||
) -> Dict[str, Any]:
|
||||
import uuid
|
||||
|
||||
effective_task_id = task_id or str(uuid.uuid4())
|
||||
|
||||
messages: List[Dict[str, Any]] = conversation_history.copy() if conversation_history else []
|
||||
messages.append({"role": "user", "content": user_message})
|
||||
|
||||
active_system_prompt = self._build_system_prompt(system_message)
|
||||
|
||||
api_call_count = 0
|
||||
final_response: Optional[str] = None
|
||||
managed_state: Optional[Dict[str, Any]] = None
|
||||
completed = False
|
||||
|
||||
try:
|
||||
async with self._managed() as managed:
|
||||
while api_call_count < self.max_iterations:
|
||||
api_call_count += 1
|
||||
|
||||
api_messages = messages.copy()
|
||||
if active_system_prompt:
|
||||
api_messages = [{"role": "system", "content": active_system_prompt}] + api_messages
|
||||
|
||||
chat_kwargs: Dict[str, Any] = {"messages": api_messages, "n": 1}
|
||||
if self.max_tokens is not None:
|
||||
chat_kwargs["max_tokens"] = self.max_tokens
|
||||
if self.temperature is not None:
|
||||
chat_kwargs["temperature"] = self.temperature
|
||||
|
||||
# Prefer OpenAI tool calling when supported by the backend:
|
||||
# - Many providers normalize Hermes-style <tool_call> tags into tool_calls when `tools` is provided.
|
||||
# - ManagedServer (atroposlib) does prompt->completion conversion and does not support `tools`.
|
||||
# Only pass `tools` when we're calling an OpenAI-compatible chat endpoint directly.
|
||||
tool_schemas = self.tools if self.tools else None
|
||||
managed_cls = type(managed).__name__
|
||||
if tool_schemas and managed_cls != "ManagedServer":
|
||||
chat_kwargs["tools"] = tool_schemas
|
||||
|
||||
if os.getenv("HERMES_DEBUG_ATROPOS_REQUEST") == "1":
|
||||
meta = {
|
||||
"managed_type": managed_cls,
|
||||
"model": getattr(getattr(managed, "config", None), "model_name", self.model),
|
||||
"base_url": getattr(getattr(managed, "config", None), "base_url", None),
|
||||
"kwargs": chat_kwargs,
|
||||
}
|
||||
# Avoid dumping megabytes of data accidentally.
|
||||
# (Messages can be large; this is still "full" but bounded.)
|
||||
print("\n=== HERMES_DEBUG_ATROPOS_REQUEST ===", flush=True)
|
||||
print(json.dumps(meta, ensure_ascii=False, indent=2)[:200_000], flush=True)
|
||||
|
||||
response = await managed.chat_completion(**chat_kwargs)
|
||||
|
||||
if os.getenv("HERMES_DEBUG_ATROPOS_RESPONSE") == "1":
|
||||
try:
|
||||
dumped = response.model_dump() # openai pydantic model
|
||||
except Exception:
|
||||
dumped = getattr(response, "__dict__", {"repr": repr(response)})
|
||||
print("\n=== HERMES_DEBUG_ATROPOS_RESPONSE: ChatCompletion (raw) ===", flush=True)
|
||||
print(json.dumps(dumped, ensure_ascii=False, indent=2), flush=True)
|
||||
|
||||
if hasattr(managed, "get_state"):
|
||||
managed_state = managed.get_state()
|
||||
|
||||
msg = response.choices[0].message
|
||||
assistant_content = (msg.content or "")
|
||||
msg_reasoning = getattr(msg, "reasoning", None)
|
||||
|
||||
# Use tool_calls if the backend provides them (preferred).
|
||||
structured_tool_calls = getattr(msg, "tool_calls", None)
|
||||
|
||||
# If the backend emits content="" but includes useful text in reasoning,
|
||||
# use it for parsing *only if needed* (e.g. tool tags).
|
||||
if assistant_content == "" and isinstance(msg_reasoning, str) and msg_reasoning:
|
||||
if os.getenv("HERMES_DEBUG_ATROPOS_RESPONSE") == "1":
|
||||
print("\n=== HERMES_DEBUG_ATROPOS_RESPONSE: message.reasoning present (content empty) ===", flush=True)
|
||||
print(msg_reasoning, flush=True)
|
||||
|
||||
assistant_msg: Dict[str, Any] = {"role": "assistant", "content": assistant_content}
|
||||
if structured_tool_calls:
|
||||
# Preserve tool_calls so the next request is consistent with OpenAI protocol.
|
||||
try:
|
||||
assistant_msg["tool_calls"] = [
|
||||
{
|
||||
"id": tc.id,
|
||||
"type": tc.type,
|
||||
"function": {"name": tc.function.name, "arguments": tc.function.arguments},
|
||||
}
|
||||
for tc in structured_tool_calls
|
||||
]
|
||||
except Exception:
|
||||
# Best-effort; keep conversation moving.
|
||||
pass
|
||||
messages.append(assistant_msg)
|
||||
|
||||
# Mode A: OpenAI tool calling (preferred when supported)
|
||||
if structured_tool_calls:
|
||||
for tc in structured_tool_calls:
|
||||
tool_start = time.time()
|
||||
try:
|
||||
tool_args = json.loads(tc.function.arguments or "{}")
|
||||
except Exception:
|
||||
tool_args = {}
|
||||
tool_result = handle_function_call(tc.function.name, tool_args, effective_task_id)
|
||||
tool_duration = time.time() - tool_start
|
||||
|
||||
# Keep the raw tool result as tool content (OpenAI protocol expects role=tool).
|
||||
messages.append(
|
||||
{
|
||||
"role": "tool",
|
||||
"tool_call_id": tc.id,
|
||||
"content": tool_result,
|
||||
}
|
||||
)
|
||||
|
||||
if self.tool_delay and self.tool_delay > 0:
|
||||
await asyncio.sleep(self.tool_delay)
|
||||
|
||||
# Continue loop after tool execution.
|
||||
continue
|
||||
|
||||
# Mode B: Hermes XML tool tags in assistant text (fallback).
|
||||
parse_source = assistant_content or (msg_reasoning or "")
|
||||
tool_calls, parse_errors = self._parse_tool_calls(parse_source)
|
||||
|
||||
if parse_errors and not tool_calls:
|
||||
# Ask the model to retry with valid tool JSON.
|
||||
err_text = "; ".join(parse_errors[:3])
|
||||
messages.append(
|
||||
{
|
||||
"role": "user",
|
||||
"content": (
|
||||
f"<tool_response>{json.dumps({'error': err_text}, ensure_ascii=False)}</tool_response>\n"
|
||||
"The previous <tool_call> blocks were invalid. Please output valid JSON inside <tool_call>."
|
||||
),
|
||||
}
|
||||
)
|
||||
continue
|
||||
|
||||
if not tool_calls:
|
||||
# No tool calls: treat as final answer.
|
||||
final_response = (assistant_content or "").strip()
|
||||
completed = True
|
||||
break
|
||||
|
||||
tool_responses: List[str] = []
|
||||
for tool_name, tool_args in tool_calls:
|
||||
tool_start = time.time()
|
||||
tool_result = handle_function_call(tool_name, tool_args, effective_task_id)
|
||||
tool_duration = time.time() - tool_start
|
||||
|
||||
try:
|
||||
parsed = json.loads(tool_result)
|
||||
payload: Any = parsed
|
||||
except Exception:
|
||||
payload = tool_result
|
||||
|
||||
tool_payload = {
|
||||
"name": tool_name,
|
||||
"duration_s": round(tool_duration, 3),
|
||||
"result": payload,
|
||||
}
|
||||
tool_responses.append(
|
||||
f"<tool_response>{json.dumps(tool_payload, ensure_ascii=False)}</tool_response>"
|
||||
)
|
||||
|
||||
if self.tool_delay and self.tool_delay > 0:
|
||||
await asyncio.sleep(self.tool_delay)
|
||||
|
||||
messages.append({"role": "user", "content": "\n".join(tool_responses)})
|
||||
|
||||
if final_response is None:
|
||||
final_response = "I've reached the maximum number of iterations."
|
||||
|
||||
finally:
|
||||
try:
|
||||
cleanup_vm(effective_task_id)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Save trajectory using Hermes formatting (optional).
|
||||
self._save_trajectory(messages, user_message, completed=completed)
|
||||
|
||||
return {
|
||||
"final_response": final_response,
|
||||
"messages": messages,
|
||||
"api_calls": api_call_count,
|
||||
"completed": completed,
|
||||
"managed_state": managed_state,
|
||||
"system_prompt": active_system_prompt,
|
||||
"task_id": effective_task_id,
|
||||
}
|
||||
|
||||
def run_conversation(self, *args: Any, **kwargs: Any) -> Dict[str, Any]:
|
||||
"""
|
||||
Sync wrapper for convenience.
|
||||
|
||||
If called from within a running event loop (e.g. prompt_toolkit), this
|
||||
runs the async conversation in a dedicated thread to avoid nested loops.
|
||||
"""
|
||||
try:
|
||||
asyncio.get_running_loop()
|
||||
except RuntimeError:
|
||||
return asyncio.run(self.run_conversation_async(*args, **kwargs))
|
||||
|
||||
import queue
|
||||
import threading
|
||||
|
||||
out: "queue.Queue[object]" = queue.Queue(maxsize=1)
|
||||
|
||||
def runner() -> None:
|
||||
try:
|
||||
out.put(asyncio.run(self.run_conversation_async(*args, **kwargs)))
|
||||
except BaseException as exc: # noqa: BLE001
|
||||
out.put(exc)
|
||||
|
||||
thread = threading.Thread(target=runner, daemon=True)
|
||||
thread.start()
|
||||
|
||||
result = out.get()
|
||||
if isinstance(result, BaseException):
|
||||
raise result
|
||||
return result # type: ignore[return-value]
|
||||
@@ -23,9 +23,12 @@ model:
|
||||
# OPTION 1: Local execution (default)
|
||||
# Commands run directly on your machine in the current directory
|
||||
# -----------------------------------------------------------------------------
|
||||
# Working directory behavior:
|
||||
# - CLI (`hermes` command): Uses "." (current directory where you run hermes)
|
||||
# - Messaging (Telegram/Discord): Uses MESSAGING_CWD from .env (default: home)
|
||||
terminal:
|
||||
env_type: "local"
|
||||
cwd: "." # Use "." for current directory, or specify absolute path
|
||||
cwd: "." # CLI working directory - "." means current directory
|
||||
timeout: 180
|
||||
lifetime_seconds: 300
|
||||
# sudo_password: "" # Enable sudo commands (pipes via sudo -S) - SECURITY WARNING: plaintext!
|
||||
@@ -55,7 +58,7 @@ terminal:
|
||||
# cwd: "/workspace"
|
||||
# timeout: 180
|
||||
# lifetime_seconds: 300
|
||||
# docker_image: "python:3.11"
|
||||
# docker_image: "nikolaik/python-nodejs:python3.11-nodejs20"
|
||||
|
||||
# -----------------------------------------------------------------------------
|
||||
# OPTION 4: Singularity/Apptainer container
|
||||
@@ -67,7 +70,7 @@ terminal:
|
||||
# cwd: "/workspace"
|
||||
# timeout: 180
|
||||
# lifetime_seconds: 300
|
||||
# singularity_image: "docker://python:3.11"
|
||||
# singularity_image: "docker://nikolaik/python-nodejs:python3.11-nodejs20"
|
||||
|
||||
# -----------------------------------------------------------------------------
|
||||
# OPTION 5: Modal cloud execution
|
||||
@@ -79,7 +82,7 @@ terminal:
|
||||
# cwd: "/workspace"
|
||||
# timeout: 180
|
||||
# lifetime_seconds: 300
|
||||
# modal_image: "python:3.11"
|
||||
# modal_image: "nikolaik/python-nodejs:python3.11-nodejs20"
|
||||
|
||||
# -----------------------------------------------------------------------------
|
||||
# SUDO SUPPORT (works with ALL backends above)
|
||||
@@ -112,12 +115,41 @@ browser:
|
||||
# after this period of no activity between agent loops (default: 120 = 2 minutes)
|
||||
inactivity_timeout: 120
|
||||
|
||||
# =============================================================================
|
||||
# Context Compression (Auto-shrinks long conversations)
|
||||
# =============================================================================
|
||||
# When conversation approaches model's context limit, middle turns are
|
||||
# automatically summarized to free up space while preserving important context.
|
||||
#
|
||||
# HOW IT WORKS:
|
||||
# 1. Tracks actual token usage from API responses (not estimates)
|
||||
# 2. When prompt_tokens >= threshold% of model's context_length, triggers compression
|
||||
# 3. Protects first 3 turns (system prompt, initial request, first response)
|
||||
# 4. Protects last 4 turns (recent context is most relevant)
|
||||
# 5. Summarizes middle turns using a fast/cheap model
|
||||
# 6. Inserts summary as a user message, continues conversation seamlessly
|
||||
#
|
||||
compression:
|
||||
# Enable automatic context compression (default: true)
|
||||
# Set to false if you prefer to manage context manually or want errors on overflow
|
||||
enabled: true
|
||||
|
||||
# Trigger compression at this % of model's context limit (default: 0.85 = 85%)
|
||||
# Lower values = more aggressive compression, higher values = compress later
|
||||
threshold: 0.85
|
||||
|
||||
# Model to use for generating summaries (fast/cheap recommended)
|
||||
# This model compresses the middle turns into a concise summary
|
||||
summary_model: "google/gemini-2.0-flash-001"
|
||||
|
||||
# =============================================================================
|
||||
# Agent Behavior
|
||||
# =============================================================================
|
||||
agent:
|
||||
# Maximum conversation turns before stopping
|
||||
max_turns: 20
|
||||
# Maximum tool-calling iterations per conversation
|
||||
# Higher = more room for complex tasks, but costs more tokens
|
||||
# Recommended: 20-30 for focused tasks, 50-100 for open exploration
|
||||
max_turns: 60
|
||||
|
||||
# Enable verbose logging
|
||||
verbose: false
|
||||
|
||||
36
cron/__init__.py
Normal file
36
cron/__init__.py
Normal file
@@ -0,0 +1,36 @@
|
||||
"""
|
||||
Cron job scheduling system for Hermes Agent.
|
||||
|
||||
This module provides scheduled task execution, allowing the agent to:
|
||||
- Run automated tasks on schedules (cron expressions, intervals, one-shot)
|
||||
- Self-schedule reminders and follow-up tasks
|
||||
- Execute tasks in isolated sessions (no prior context)
|
||||
|
||||
Usage:
|
||||
# Run due jobs (for system cron integration)
|
||||
python -c "from cron import tick; tick()"
|
||||
|
||||
# Or via CLI
|
||||
python cli.py --cron-daemon
|
||||
"""
|
||||
|
||||
from cron.jobs import (
|
||||
create_job,
|
||||
get_job,
|
||||
list_jobs,
|
||||
remove_job,
|
||||
update_job,
|
||||
JOBS_FILE,
|
||||
)
|
||||
from cron.scheduler import tick, run_daemon
|
||||
|
||||
__all__ = [
|
||||
"create_job",
|
||||
"get_job",
|
||||
"list_jobs",
|
||||
"remove_job",
|
||||
"update_job",
|
||||
"tick",
|
||||
"run_daemon",
|
||||
"JOBS_FILE",
|
||||
]
|
||||
383
cron/jobs.py
Normal file
383
cron/jobs.py
Normal file
@@ -0,0 +1,383 @@
|
||||
"""
|
||||
Cron job storage and management.
|
||||
|
||||
Jobs are stored in ~/.hermes/cron/jobs.json
|
||||
Output is saved to ~/.hermes/cron/output/{job_id}/{timestamp}.md
|
||||
"""
|
||||
|
||||
import json
|
||||
import os
|
||||
import re
|
||||
import uuid
|
||||
from datetime import datetime, timedelta
|
||||
from pathlib import Path
|
||||
from typing import Optional, Dict, List, Any
|
||||
|
||||
try:
|
||||
from croniter import croniter
|
||||
HAS_CRONITER = True
|
||||
except ImportError:
|
||||
HAS_CRONITER = False
|
||||
|
||||
# =============================================================================
|
||||
# Configuration
|
||||
# =============================================================================
|
||||
|
||||
HERMES_DIR = Path.home() / ".hermes"
|
||||
CRON_DIR = HERMES_DIR / "cron"
|
||||
JOBS_FILE = CRON_DIR / "jobs.json"
|
||||
OUTPUT_DIR = CRON_DIR / "output"
|
||||
|
||||
|
||||
def ensure_dirs():
|
||||
"""Ensure cron directories exist."""
|
||||
CRON_DIR.mkdir(parents=True, exist_ok=True)
|
||||
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Schedule Parsing
|
||||
# =============================================================================
|
||||
|
||||
def parse_duration(s: str) -> int:
|
||||
"""
|
||||
Parse duration string into minutes.
|
||||
|
||||
Examples:
|
||||
"30m" → 30
|
||||
"2h" → 120
|
||||
"1d" → 1440
|
||||
"""
|
||||
s = s.strip().lower()
|
||||
match = re.match(r'^(\d+)\s*(m|min|mins|minute|minutes|h|hr|hrs|hour|hours|d|day|days)$', s)
|
||||
if not match:
|
||||
raise ValueError(f"Invalid duration: '{s}'. Use format like '30m', '2h', or '1d'")
|
||||
|
||||
value = int(match.group(1))
|
||||
unit = match.group(2)[0] # First char: m, h, or d
|
||||
|
||||
multipliers = {'m': 1, 'h': 60, 'd': 1440}
|
||||
return value * multipliers[unit]
|
||||
|
||||
|
||||
def parse_schedule(schedule: str) -> Dict[str, Any]:
|
||||
"""
|
||||
Parse schedule string into structured format.
|
||||
|
||||
Returns dict with:
|
||||
- kind: "once" | "interval" | "cron"
|
||||
- For "once": "run_at" (ISO timestamp)
|
||||
- For "interval": "minutes" (int)
|
||||
- For "cron": "expr" (cron expression)
|
||||
|
||||
Examples:
|
||||
"30m" → once in 30 minutes
|
||||
"2h" → once in 2 hours
|
||||
"every 30m" → recurring every 30 minutes
|
||||
"every 2h" → recurring every 2 hours
|
||||
"0 9 * * *" → cron expression
|
||||
"2026-02-03T14:00" → once at timestamp
|
||||
"""
|
||||
schedule = schedule.strip()
|
||||
original = schedule
|
||||
schedule_lower = schedule.lower()
|
||||
|
||||
# "every X" pattern → recurring interval
|
||||
if schedule_lower.startswith("every "):
|
||||
duration_str = schedule[6:].strip()
|
||||
minutes = parse_duration(duration_str)
|
||||
return {
|
||||
"kind": "interval",
|
||||
"minutes": minutes,
|
||||
"display": f"every {minutes}m"
|
||||
}
|
||||
|
||||
# Check for cron expression (5 or 6 space-separated fields)
|
||||
# Cron fields: minute hour day month weekday [year]
|
||||
parts = schedule.split()
|
||||
if len(parts) >= 5 and all(
|
||||
re.match(r'^[\d\*\-,/]+$', p) for p in parts[:5]
|
||||
):
|
||||
if not HAS_CRONITER:
|
||||
raise ValueError("Cron expressions require 'croniter' package. Install with: pip install croniter")
|
||||
# Validate cron expression
|
||||
try:
|
||||
croniter(schedule)
|
||||
except Exception as e:
|
||||
raise ValueError(f"Invalid cron expression '{schedule}': {e}")
|
||||
return {
|
||||
"kind": "cron",
|
||||
"expr": schedule,
|
||||
"display": schedule
|
||||
}
|
||||
|
||||
# ISO timestamp (contains T or looks like date)
|
||||
if 'T' in schedule or re.match(r'^\d{4}-\d{2}-\d{2}', schedule):
|
||||
try:
|
||||
# Parse and validate
|
||||
dt = datetime.fromisoformat(schedule.replace('Z', '+00:00'))
|
||||
return {
|
||||
"kind": "once",
|
||||
"run_at": dt.isoformat(),
|
||||
"display": f"once at {dt.strftime('%Y-%m-%d %H:%M')}"
|
||||
}
|
||||
except ValueError as e:
|
||||
raise ValueError(f"Invalid timestamp '{schedule}': {e}")
|
||||
|
||||
# Duration like "30m", "2h", "1d" → one-shot from now
|
||||
try:
|
||||
minutes = parse_duration(schedule)
|
||||
run_at = datetime.now() + timedelta(minutes=minutes)
|
||||
return {
|
||||
"kind": "once",
|
||||
"run_at": run_at.isoformat(),
|
||||
"display": f"once in {original}"
|
||||
}
|
||||
except ValueError:
|
||||
pass
|
||||
|
||||
raise ValueError(
|
||||
f"Invalid schedule '{original}'. Use:\n"
|
||||
f" - Duration: '30m', '2h', '1d' (one-shot)\n"
|
||||
f" - Interval: 'every 30m', 'every 2h' (recurring)\n"
|
||||
f" - Cron: '0 9 * * *' (cron expression)\n"
|
||||
f" - Timestamp: '2026-02-03T14:00:00' (one-shot at time)"
|
||||
)
|
||||
|
||||
|
||||
def compute_next_run(schedule: Dict[str, Any], last_run_at: Optional[str] = None) -> Optional[str]:
|
||||
"""
|
||||
Compute the next run time for a schedule.
|
||||
|
||||
Returns ISO timestamp string, or None if no more runs.
|
||||
"""
|
||||
now = datetime.now()
|
||||
|
||||
if schedule["kind"] == "once":
|
||||
run_at = datetime.fromisoformat(schedule["run_at"])
|
||||
# If in the future, return it; if in the past, no more runs
|
||||
return schedule["run_at"] if run_at > now else None
|
||||
|
||||
elif schedule["kind"] == "interval":
|
||||
minutes = schedule["minutes"]
|
||||
if last_run_at:
|
||||
# Next run is last_run + interval
|
||||
last = datetime.fromisoformat(last_run_at)
|
||||
next_run = last + timedelta(minutes=minutes)
|
||||
else:
|
||||
# First run is now + interval
|
||||
next_run = now + timedelta(minutes=minutes)
|
||||
return next_run.isoformat()
|
||||
|
||||
elif schedule["kind"] == "cron":
|
||||
if not HAS_CRONITER:
|
||||
return None
|
||||
cron = croniter(schedule["expr"], now)
|
||||
next_run = cron.get_next(datetime)
|
||||
return next_run.isoformat()
|
||||
|
||||
return None
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Job CRUD Operations
|
||||
# =============================================================================
|
||||
|
||||
def load_jobs() -> List[Dict[str, Any]]:
|
||||
"""Load all jobs from storage."""
|
||||
ensure_dirs()
|
||||
if not JOBS_FILE.exists():
|
||||
return []
|
||||
|
||||
try:
|
||||
with open(JOBS_FILE, 'r', encoding='utf-8') as f:
|
||||
data = json.load(f)
|
||||
return data.get("jobs", [])
|
||||
except (json.JSONDecodeError, IOError):
|
||||
return []
|
||||
|
||||
|
||||
def save_jobs(jobs: List[Dict[str, Any]]):
|
||||
"""Save all jobs to storage."""
|
||||
ensure_dirs()
|
||||
with open(JOBS_FILE, 'w', encoding='utf-8') as f:
|
||||
json.dump({"jobs": jobs, "updated_at": datetime.now().isoformat()}, f, indent=2)
|
||||
|
||||
|
||||
def create_job(
|
||||
prompt: str,
|
||||
schedule: str,
|
||||
name: Optional[str] = None,
|
||||
repeat: Optional[int] = None,
|
||||
deliver: Optional[str] = None,
|
||||
origin: Optional[Dict[str, Any]] = None
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Create a new cron job.
|
||||
|
||||
Args:
|
||||
prompt: The prompt to run (must be self-contained)
|
||||
schedule: Schedule string (see parse_schedule)
|
||||
name: Optional friendly name
|
||||
repeat: How many times to run (None = forever, 1 = once)
|
||||
deliver: Where to deliver output ("origin", "local", "telegram", etc.)
|
||||
origin: Source info where job was created (for "origin" delivery)
|
||||
|
||||
Returns:
|
||||
The created job dict
|
||||
"""
|
||||
parsed_schedule = parse_schedule(schedule)
|
||||
|
||||
# Auto-set repeat=1 for one-shot schedules if not specified
|
||||
if parsed_schedule["kind"] == "once" and repeat is None:
|
||||
repeat = 1
|
||||
|
||||
# Default delivery to origin if available, otherwise local
|
||||
if deliver is None:
|
||||
deliver = "origin" if origin else "local"
|
||||
|
||||
job_id = uuid.uuid4().hex[:12]
|
||||
now = datetime.now().isoformat()
|
||||
|
||||
job = {
|
||||
"id": job_id,
|
||||
"name": name or prompt[:50].strip(),
|
||||
"prompt": prompt,
|
||||
"schedule": parsed_schedule,
|
||||
"schedule_display": parsed_schedule.get("display", schedule),
|
||||
"repeat": {
|
||||
"times": repeat, # None = forever
|
||||
"completed": 0
|
||||
},
|
||||
"enabled": True,
|
||||
"created_at": now,
|
||||
"next_run_at": compute_next_run(parsed_schedule),
|
||||
"last_run_at": None,
|
||||
"last_status": None,
|
||||
"last_error": None,
|
||||
# Delivery configuration
|
||||
"deliver": deliver,
|
||||
"origin": origin, # Tracks where job was created for "origin" delivery
|
||||
}
|
||||
|
||||
jobs = load_jobs()
|
||||
jobs.append(job)
|
||||
save_jobs(jobs)
|
||||
|
||||
return job
|
||||
|
||||
|
||||
def get_job(job_id: str) -> Optional[Dict[str, Any]]:
|
||||
"""Get a job by ID."""
|
||||
jobs = load_jobs()
|
||||
for job in jobs:
|
||||
if job["id"] == job_id:
|
||||
return job
|
||||
return None
|
||||
|
||||
|
||||
def list_jobs(include_disabled: bool = False) -> List[Dict[str, Any]]:
|
||||
"""List all jobs, optionally including disabled ones."""
|
||||
jobs = load_jobs()
|
||||
if not include_disabled:
|
||||
jobs = [j for j in jobs if j.get("enabled", True)]
|
||||
return jobs
|
||||
|
||||
|
||||
def update_job(job_id: str, updates: Dict[str, Any]) -> Optional[Dict[str, Any]]:
|
||||
"""Update a job by ID."""
|
||||
jobs = load_jobs()
|
||||
for i, job in enumerate(jobs):
|
||||
if job["id"] == job_id:
|
||||
jobs[i] = {**job, **updates}
|
||||
save_jobs(jobs)
|
||||
return jobs[i]
|
||||
return None
|
||||
|
||||
|
||||
def remove_job(job_id: str) -> bool:
|
||||
"""Remove a job by ID."""
|
||||
jobs = load_jobs()
|
||||
original_len = len(jobs)
|
||||
jobs = [j for j in jobs if j["id"] != job_id]
|
||||
if len(jobs) < original_len:
|
||||
save_jobs(jobs)
|
||||
return True
|
||||
return False
|
||||
|
||||
|
||||
def mark_job_run(job_id: str, success: bool, error: Optional[str] = None):
|
||||
"""
|
||||
Mark a job as having been run.
|
||||
|
||||
Updates last_run_at, last_status, increments completed count,
|
||||
computes next_run_at, and auto-deletes if repeat limit reached.
|
||||
"""
|
||||
jobs = load_jobs()
|
||||
for i, job in enumerate(jobs):
|
||||
if job["id"] == job_id:
|
||||
now = datetime.now().isoformat()
|
||||
job["last_run_at"] = now
|
||||
job["last_status"] = "ok" if success else "error"
|
||||
job["last_error"] = error if not success else None
|
||||
|
||||
# Increment completed count
|
||||
if job.get("repeat"):
|
||||
job["repeat"]["completed"] = job["repeat"].get("completed", 0) + 1
|
||||
|
||||
# Check if we've hit the repeat limit
|
||||
times = job["repeat"].get("times")
|
||||
completed = job["repeat"]["completed"]
|
||||
if times is not None and completed >= times:
|
||||
# Remove the job (limit reached)
|
||||
jobs.pop(i)
|
||||
save_jobs(jobs)
|
||||
return
|
||||
|
||||
# Compute next run
|
||||
job["next_run_at"] = compute_next_run(job["schedule"], now)
|
||||
|
||||
# If no next run (one-shot completed), disable
|
||||
if job["next_run_at"] is None:
|
||||
job["enabled"] = False
|
||||
|
||||
save_jobs(jobs)
|
||||
return
|
||||
|
||||
save_jobs(jobs)
|
||||
|
||||
|
||||
def get_due_jobs() -> List[Dict[str, Any]]:
|
||||
"""Get all jobs that are due to run now."""
|
||||
now = datetime.now()
|
||||
jobs = load_jobs()
|
||||
due = []
|
||||
|
||||
for job in jobs:
|
||||
if not job.get("enabled", True):
|
||||
continue
|
||||
|
||||
next_run = job.get("next_run_at")
|
||||
if not next_run:
|
||||
continue
|
||||
|
||||
next_run_dt = datetime.fromisoformat(next_run)
|
||||
if next_run_dt <= now:
|
||||
due.append(job)
|
||||
|
||||
return due
|
||||
|
||||
|
||||
def save_job_output(job_id: str, output: str):
|
||||
"""Save job output to file."""
|
||||
ensure_dirs()
|
||||
job_output_dir = OUTPUT_DIR / job_id
|
||||
job_output_dir.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
|
||||
output_file = job_output_dir / f"{timestamp}.md"
|
||||
|
||||
with open(output_file, 'w', encoding='utf-8') as f:
|
||||
f.write(output)
|
||||
|
||||
return output_file
|
||||
188
cron/scheduler.py
Normal file
188
cron/scheduler.py
Normal file
@@ -0,0 +1,188 @@
|
||||
"""
|
||||
Cron job scheduler - executes due jobs.
|
||||
|
||||
This module provides:
|
||||
- tick(): Run all due jobs once (for system cron integration)
|
||||
- run_daemon(): Run continuously, checking every 60 seconds
|
||||
"""
|
||||
|
||||
import os
|
||||
import sys
|
||||
import time
|
||||
import traceback
|
||||
from datetime import datetime
|
||||
from pathlib import Path
|
||||
from typing import Optional
|
||||
|
||||
# Add parent directory to path for imports
|
||||
sys.path.insert(0, str(Path(__file__).parent.parent))
|
||||
|
||||
from cron.jobs import get_due_jobs, mark_job_run, save_job_output
|
||||
|
||||
|
||||
def run_job(job: dict) -> tuple[bool, str, Optional[str]]:
|
||||
"""
|
||||
Execute a single cron job.
|
||||
|
||||
Returns:
|
||||
Tuple of (success, output, error_message)
|
||||
"""
|
||||
from run_agent import AIAgent
|
||||
|
||||
job_id = job["id"]
|
||||
job_name = job["name"]
|
||||
prompt = job["prompt"]
|
||||
|
||||
print(f"[cron] Running job '{job_name}' (ID: {job_id})")
|
||||
print(f"[cron] Prompt: {prompt[:100]}{'...' if len(prompt) > 100 else ''}")
|
||||
|
||||
try:
|
||||
# Create agent with default settings
|
||||
# Jobs run in isolated sessions (no prior context)
|
||||
agent = AIAgent(
|
||||
model=os.getenv("HERMES_MODEL", "anthropic/claude-sonnet-4"),
|
||||
quiet_mode=True,
|
||||
session_id=f"cron_{job_id}_{datetime.now().strftime('%Y%m%d_%H%M%S')}"
|
||||
)
|
||||
|
||||
# Run the conversation
|
||||
result = agent.run_conversation(prompt)
|
||||
|
||||
# Extract final response
|
||||
final_response = result.get("final_response", "")
|
||||
if not final_response:
|
||||
final_response = "(No response generated)"
|
||||
|
||||
# Build output document
|
||||
output = f"""# Cron Job: {job_name}
|
||||
|
||||
**Job ID:** {job_id}
|
||||
**Run Time:** {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}
|
||||
**Schedule:** {job.get('schedule_display', 'N/A')}
|
||||
|
||||
## Prompt
|
||||
|
||||
{prompt}
|
||||
|
||||
## Response
|
||||
|
||||
{final_response}
|
||||
"""
|
||||
|
||||
print(f"[cron] Job '{job_name}' completed successfully")
|
||||
return True, output, None
|
||||
|
||||
except Exception as e:
|
||||
error_msg = f"{type(e).__name__}: {str(e)}"
|
||||
print(f"[cron] Job '{job_name}' failed: {error_msg}")
|
||||
|
||||
# Build error output
|
||||
output = f"""# Cron Job: {job_name} (FAILED)
|
||||
|
||||
**Job ID:** {job_id}
|
||||
**Run Time:** {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}
|
||||
**Schedule:** {job.get('schedule_display', 'N/A')}
|
||||
|
||||
## Prompt
|
||||
|
||||
{prompt}
|
||||
|
||||
## Error
|
||||
|
||||
```
|
||||
{error_msg}
|
||||
|
||||
{traceback.format_exc()}
|
||||
```
|
||||
"""
|
||||
return False, output, error_msg
|
||||
|
||||
|
||||
def tick(verbose: bool = True) -> int:
|
||||
"""
|
||||
Check and run all due jobs.
|
||||
|
||||
This is designed to be called by system cron every minute:
|
||||
*/1 * * * * cd ~/hermes-agent && python -c "from cron import tick; tick()"
|
||||
|
||||
Args:
|
||||
verbose: Whether to print status messages
|
||||
|
||||
Returns:
|
||||
Number of jobs executed
|
||||
"""
|
||||
due_jobs = get_due_jobs()
|
||||
|
||||
if verbose and not due_jobs:
|
||||
print(f"[cron] {datetime.now().strftime('%H:%M:%S')} - No jobs due")
|
||||
return 0
|
||||
|
||||
if verbose:
|
||||
print(f"[cron] {datetime.now().strftime('%H:%M:%S')} - {len(due_jobs)} job(s) due")
|
||||
|
||||
executed = 0
|
||||
for job in due_jobs:
|
||||
try:
|
||||
success, output, error = run_job(job)
|
||||
|
||||
# Save output to file
|
||||
output_file = save_job_output(job["id"], output)
|
||||
if verbose:
|
||||
print(f"[cron] Output saved to: {output_file}")
|
||||
|
||||
# Mark job as run (handles repeat counting, next_run computation)
|
||||
mark_job_run(job["id"], success, error)
|
||||
executed += 1
|
||||
|
||||
except Exception as e:
|
||||
print(f"[cron] Error processing job {job['id']}: {e}")
|
||||
mark_job_run(job["id"], False, str(e))
|
||||
|
||||
return executed
|
||||
|
||||
|
||||
def run_daemon(check_interval: int = 60, verbose: bool = True):
|
||||
"""
|
||||
Run the cron daemon continuously.
|
||||
|
||||
Checks for due jobs every `check_interval` seconds.
|
||||
|
||||
Args:
|
||||
check_interval: Seconds between checks (default: 60)
|
||||
verbose: Whether to print status messages
|
||||
"""
|
||||
print(f"[cron] Starting daemon (checking every {check_interval}s)")
|
||||
print(f"[cron] Press Ctrl+C to stop")
|
||||
print()
|
||||
|
||||
try:
|
||||
while True:
|
||||
try:
|
||||
tick(verbose=verbose)
|
||||
except Exception as e:
|
||||
print(f"[cron] Tick error: {e}")
|
||||
|
||||
time.sleep(check_interval)
|
||||
|
||||
except KeyboardInterrupt:
|
||||
print("\n[cron] Daemon stopped")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
# Allow running directly: python cron/scheduler.py [daemon|tick]
|
||||
import argparse
|
||||
|
||||
parser = argparse.ArgumentParser(description="Hermes Cron Scheduler")
|
||||
parser.add_argument("mode", choices=["daemon", "tick"], default="tick", nargs="?",
|
||||
help="Mode: 'tick' to run once, 'daemon' to run continuously")
|
||||
parser.add_argument("--interval", type=int, default=60,
|
||||
help="Check interval in seconds for daemon mode")
|
||||
parser.add_argument("--quiet", "-q", action="store_true",
|
||||
help="Suppress status messages")
|
||||
|
||||
args = parser.parse_args()
|
||||
|
||||
if args.mode == "daemon":
|
||||
run_daemon(check_interval=args.interval, verbose=not args.quiet)
|
||||
else:
|
||||
tick(verbose=not args.quiet)
|
||||
@@ -1,224 +0,0 @@
|
||||
# Modal Backend
|
||||
|
||||
Hermes Agent uses [Modal](https://modal.com) for scalable, isolated cloud execution environments. There are two Modal integrations:
|
||||
|
||||
1. **Terminal Tool** (`tools/terminal_tool.py`) - For CLI/agent command execution
|
||||
2. **Atropos Backend** (`atropos/backends/modal_backend.py`) - For batch RL training workloads
|
||||
|
||||
|
||||
|
||||
---
|
||||
|
||||
## Terminal Tool (CLI/Agent)
|
||||
|
||||
The terminal tool provides a simple interface for executing commands in Modal sandboxes.
|
||||
|
||||
### Configuration
|
||||
|
||||
Set environment variables:
|
||||
|
||||
```bash
|
||||
export TERMINAL_ENV=modal
|
||||
export TERMINAL_MODAL_IMAGE=python:3.11
|
||||
export TERMINAL_MODAL_APP_NAME=hermes-sandbox
|
||||
```
|
||||
|
||||
Or use a YAML config file (`modal_profiles.yaml`):
|
||||
|
||||
```yaml
|
||||
profiles:
|
||||
default:
|
||||
image: python:3.11
|
||||
cpu: 1.0
|
||||
memory: 2048
|
||||
min_pool: 1
|
||||
max_pool: 5
|
||||
idle_timeout: 120
|
||||
|
||||
gpu:
|
||||
image: pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime
|
||||
gpu: T4
|
||||
memory: 16384
|
||||
min_pool: 0
|
||||
max_pool: 2
|
||||
```
|
||||
|
||||
### Features
|
||||
|
||||
| Feature | Description |
|
||||
|---------|-------------|
|
||||
| **Sandbox Pool** | Pre-warmed sandboxes for low latency |
|
||||
| **Auto-scaling** | Grows/shrinks pool based on demand |
|
||||
| **Idle Timeout** | Sandboxes auto-terminate when unused |
|
||||
| **Profile Selection** | Different configs for different workloads |
|
||||
| **Credential Injection** | `modal.Secret` integration |
|
||||
|
||||
### Usage
|
||||
|
||||
```python
|
||||
from tools.terminal_tool import terminal_tool
|
||||
|
||||
# Simple command
|
||||
output = terminal_tool("echo hello", task_id="my-task")
|
||||
|
||||
# With profile selection
|
||||
output = terminal_tool("python train.py", task_id="training", profile="gpu")
|
||||
|
||||
# Cleanup when done
|
||||
from tools.terminal_tool import cleanup_vm
|
||||
cleanup_vm("my-task")
|
||||
```
|
||||
|
||||
### Architecture
|
||||
|
||||
```
|
||||
_ModalPoolManager (singleton)
|
||||
├── "default" pool → [sandbox-0, sandbox-1, ...]
|
||||
└── "gpu" pool → [sandbox-0, ...]
|
||||
|
||||
Each pool:
|
||||
- Maintains min_pool warm sandboxes
|
||||
- Scales up to max_pool on demand
|
||||
- Background thread scales down idle sandboxes
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Atropos Backend (RL Training)
|
||||
|
||||
The Atropos backend is designed for high-throughput batch execution during reinforcement learning training.
|
||||
|
||||
### Key Concept: Slot-based Multiplexing
|
||||
|
||||
Instead of one sandbox per trajectory, multiple trajectories share sandboxes via **slots**:
|
||||
|
||||
```
|
||||
Sandbox (1 container)
|
||||
├── Slot 0 → Trajectory A (workspace: /data/slot_0)
|
||||
├── Slot 1 → Trajectory B (workspace: /data/slot_1)
|
||||
└── Slot 2 → Trajectory C (workspace: /data/slot_2)
|
||||
```
|
||||
|
||||
**Benefits**:
|
||||
- Fewer containers = lower cost
|
||||
- Shared warm-up time
|
||||
- Better GPU utilization
|
||||
|
||||
### Configuration
|
||||
|
||||
```python
|
||||
from atropos.backends.modal_backend import ModalSandboxConfig, ModalToolBackend
|
||||
|
||||
config = ModalSandboxConfig(
|
||||
name="default",
|
||||
image="python:3.11",
|
||||
cpu=1.0,
|
||||
memory=2048,
|
||||
slots_per_sandbox=10, # 10 trajectories per container
|
||||
min_sandboxes=1,
|
||||
max_sandboxes=5,
|
||||
)
|
||||
|
||||
backend = ModalToolBackend(config.with_app_name("my-training"))
|
||||
```
|
||||
|
||||
### Multi-Profile Support
|
||||
|
||||
Different trajectory types can request different resources:
|
||||
|
||||
```python
|
||||
backend = ModalToolBackend.with_profiles(
|
||||
app_name="rl-training",
|
||||
profiles={
|
||||
"default": ModalSandboxConfig(
|
||||
name="default",
|
||||
cpu=1.0,
|
||||
memory=2048,
|
||||
),
|
||||
"pytorch-gpu": ModalSandboxConfig(
|
||||
name="pytorch-gpu",
|
||||
image="pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime",
|
||||
gpu="T4",
|
||||
memory=16384,
|
||||
),
|
||||
}
|
||||
)
|
||||
|
||||
# CPU task
|
||||
slot1 = await backend.acquire("traj-1", profile="default")
|
||||
|
||||
# GPU task
|
||||
slot2 = await backend.acquire("traj-2", profile="pytorch-gpu")
|
||||
```
|
||||
|
||||
### Batched Execution
|
||||
|
||||
The key optimization - execute many commands in parallel:
|
||||
|
||||
```python
|
||||
# Acquire slots for multiple trajectories
|
||||
slots = [await backend.acquire(f"traj-{i}") for i in range(50)]
|
||||
|
||||
# Execute batch across all slots in parallel
|
||||
results = await backend.execute_batch([
|
||||
(slot, "bash", {"command": "python step.py"})
|
||||
for slot in slots
|
||||
])
|
||||
|
||||
# Release slots
|
||||
for slot in slots:
|
||||
await backend.release(slot)
|
||||
```
|
||||
|
||||
### Architecture
|
||||
|
||||
```
|
||||
ModalToolBackend
|
||||
└── _ModalMultiProfileManager
|
||||
├── "default" → _ModalSandboxPool
|
||||
│ ├── Sandbox 0 (slots 0-9)
|
||||
│ └── Sandbox 1 (slots 0-9)
|
||||
│
|
||||
└── "pytorch-gpu" → _ModalSandboxPool
|
||||
└── Sandbox 0 (slots 0-9)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Credentials
|
||||
|
||||
Inject secrets securely using Modal's secret management:
|
||||
|
||||
```bash
|
||||
# Create secret in Modal dashboard or CLI
|
||||
modal secret create my-api-key API_KEY=sk-xxx
|
||||
```
|
||||
|
||||
```python
|
||||
# Reference in config
|
||||
config = ModalSandboxConfig(
|
||||
secrets=["my-api-key"], # Modal secret names
|
||||
env_vars={"DEBUG": "1"}, # Additional env vars
|
||||
)
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### "Modal package not installed"
|
||||
```bash
|
||||
pip install modal
|
||||
modal token new # Authenticate
|
||||
```
|
||||
|
||||
### "Sandbox creation failed"
|
||||
- Check Modal dashboard for quota limits
|
||||
- Verify image exists and is accessible
|
||||
- Check secret names are correct
|
||||
|
||||
### Shutdown errors
|
||||
These are harmless warnings during Python interpreter shutdown:
|
||||
```
|
||||
[Modal] Error terminating ...: cannot schedule new futures after interpreter shutdown
|
||||
```
|
||||
|
||||
The sandboxes will auto-terminate via Modal's idle_timeout anyway.
|
||||
32
docs/cli.md
32
docs/cli.md
@@ -250,6 +250,38 @@ This is useful for:
|
||||
- Replaying conversations
|
||||
- Training data inspection
|
||||
|
||||
### Context Compression
|
||||
|
||||
Long conversations can exceed model context limits. The CLI automatically compresses context when approaching the limit:
|
||||
|
||||
```yaml
|
||||
# In cli-config.yaml
|
||||
compression:
|
||||
enabled: true # Enable auto-compression
|
||||
threshold: 0.85 # Compress at 85% of context limit
|
||||
summary_model: "google/gemini-2.0-flash-001"
|
||||
```
|
||||
|
||||
**How it works:**
|
||||
1. Tracks actual token usage from each API response
|
||||
2. When tokens reach threshold, middle turns are summarized
|
||||
3. First 3 and last 4 turns are always protected
|
||||
4. Conversation continues seamlessly after compression
|
||||
|
||||
**When compression triggers:**
|
||||
```
|
||||
📦 Context compression triggered (170,000 tokens ≥ 170,000 threshold)
|
||||
📊 Model context limit: 200,000 tokens (85% = 170,000)
|
||||
🗜️ Summarizing turns 4-15 (12 turns)
|
||||
✅ Compressed: 20 → 9 messages (~45,000 tokens saved)
|
||||
```
|
||||
|
||||
To disable compression:
|
||||
```yaml
|
||||
compression:
|
||||
enabled: false
|
||||
```
|
||||
|
||||
## Quiet Mode
|
||||
|
||||
The CLI runs in "quiet mode" (`HERMES_QUIET=1`), which:
|
||||
|
||||
515
docs/messaging.md
Normal file
515
docs/messaging.md
Normal file
@@ -0,0 +1,515 @@
|
||||
# Messaging Platform Integrations (Gateway)
|
||||
|
||||
Hermes Agent can connect to messaging platforms like Telegram, Discord, and WhatsApp to serve as a conversational AI assistant.
|
||||
|
||||
## Quick Start
|
||||
|
||||
```bash
|
||||
# 1. Set your bot token(s) in .env file
|
||||
echo 'TELEGRAM_BOT_TOKEN="your_telegram_bot_token"' >> .env
|
||||
echo 'DISCORD_BOT_TOKEN="your_discord_bot_token"' >> .env
|
||||
|
||||
# 2. Test the gateway (foreground)
|
||||
./scripts/hermes-gateway run
|
||||
|
||||
# 3. Install as a system service (runs in background)
|
||||
./scripts/hermes-gateway install
|
||||
|
||||
# 4. Manage the service
|
||||
./scripts/hermes-gateway start
|
||||
./scripts/hermes-gateway stop
|
||||
./scripts/hermes-gateway restart
|
||||
./scripts/hermes-gateway status
|
||||
```
|
||||
|
||||
**Quick test (without service install):**
|
||||
```bash
|
||||
python cli.py --gateway # Runs in foreground, useful for debugging
|
||||
```
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Hermes Gateway │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ │
|
||||
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
|
||||
│ │ Telegram │ │ Discord │ │ WhatsApp │ │
|
||||
│ │ Adapter │ │ Adapter │ │ Adapter │ │
|
||||
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
|
||||
│ │ │ │ │
|
||||
│ └─────────────────┼─────────────────┘ │
|
||||
│ │ │
|
||||
│ ┌────────▼────────┐ │
|
||||
│ │ Session Store │ │
|
||||
│ │ (per-chat) │ │
|
||||
│ └────────┬────────┘ │
|
||||
│ │ │
|
||||
│ ┌────────▼────────┐ │
|
||||
│ │ AIAgent │ │
|
||||
│ │ (run_agent) │ │
|
||||
│ └─────────────────┘ │
|
||||
│ │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
## Session Management
|
||||
|
||||
### Session Persistence
|
||||
|
||||
Sessions persist across messages until they reset. The agent remembers your conversation context.
|
||||
|
||||
### Reset Policies
|
||||
|
||||
Sessions reset based on configurable policies:
|
||||
|
||||
| Policy | Default | Description |
|
||||
|--------|---------|-------------|
|
||||
| Daily | 4:00 AM | Reset at a specific hour each day |
|
||||
| Idle | 120 min | Reset after N minutes of inactivity |
|
||||
| Both | (combined) | Whichever triggers first |
|
||||
|
||||
### Manual Reset
|
||||
|
||||
Send `/new` or `/reset` as a message to start fresh.
|
||||
|
||||
### Per-Platform Overrides
|
||||
|
||||
Configure different reset policies per platform:
|
||||
|
||||
```json
|
||||
{
|
||||
"reset_by_platform": {
|
||||
"telegram": { "mode": "idle", "idle_minutes": 240 },
|
||||
"discord": { "mode": "idle", "idle_minutes": 60 }
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Platform Setup
|
||||
|
||||
### Telegram
|
||||
|
||||
1. **Create a bot** via [@BotFather](https://t.me/BotFather)
|
||||
2. **Get your token** (looks like `123456789:ABCdefGHIjklMNOpqrsTUVwxyz`)
|
||||
3. **Set environment variable:**
|
||||
```bash
|
||||
export TELEGRAM_BOT_TOKEN="your_token_here"
|
||||
```
|
||||
4. **Optional: Set home channel** for cron job delivery:
|
||||
```bash
|
||||
export TELEGRAM_HOME_CHANNEL="-1001234567890"
|
||||
export TELEGRAM_HOME_CHANNEL_NAME="My Notes"
|
||||
```
|
||||
|
||||
**Requirements:**
|
||||
```bash
|
||||
pip install python-telegram-bot>=20.0
|
||||
```
|
||||
|
||||
### Discord
|
||||
|
||||
1. **Create an application** at [Discord Developer Portal](https://discord.com/developers/applications)
|
||||
2. **Create a bot** under your application
|
||||
3. **Get the bot token**
|
||||
4. **Enable required intents:**
|
||||
- Message Content Intent
|
||||
- Server Members Intent (optional)
|
||||
5. **Invite to your server** using OAuth2 URL generator (scopes: `bot`, `applications.commands`)
|
||||
6. **Set environment variable:**
|
||||
```bash
|
||||
export DISCORD_BOT_TOKEN="your_token_here"
|
||||
```
|
||||
7. **Optional: Set home channel:**
|
||||
```bash
|
||||
export DISCORD_HOME_CHANNEL="123456789012345678"
|
||||
export DISCORD_HOME_CHANNEL_NAME="#bot-updates"
|
||||
```
|
||||
|
||||
**Requirements:**
|
||||
```bash
|
||||
pip install discord.py>=2.0
|
||||
```
|
||||
|
||||
### WhatsApp
|
||||
|
||||
WhatsApp integration is more complex due to the lack of a simple bot API.
|
||||
|
||||
**Options:**
|
||||
1. **WhatsApp Business API** (requires Meta verification)
|
||||
2. **whatsapp-web.js** via Node.js bridge (for personal accounts)
|
||||
|
||||
**Bridge Setup:**
|
||||
1. Install Node.js
|
||||
2. Set up the bridge script (see `scripts/whatsapp-bridge/` for reference)
|
||||
3. Configure in gateway:
|
||||
```json
|
||||
{
|
||||
"platforms": {
|
||||
"whatsapp": {
|
||||
"enabled": true,
|
||||
"extra": {
|
||||
"bridge_script": "/path/to/bridge.js",
|
||||
"bridge_port": 3000
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Configuration
|
||||
|
||||
There are **three ways** to configure the gateway (in order of precedence):
|
||||
|
||||
### 1. Environment Variables (`.env` file) - Recommended for Quick Setup
|
||||
|
||||
Add to your `~/.hermes/.env` file:
|
||||
|
||||
```bash
|
||||
# =============================================================================
|
||||
# MESSAGING PLATFORM TOKENS
|
||||
# =============================================================================
|
||||
|
||||
# Telegram - get from @BotFather on Telegram
|
||||
TELEGRAM_BOT_TOKEN=your_telegram_bot_token
|
||||
TELEGRAM_ALLOWED_USERS=123456789,987654321 # Security: restrict to these user IDs
|
||||
|
||||
# Optional: Default channel for cron job delivery
|
||||
TELEGRAM_HOME_CHANNEL=-1001234567890
|
||||
TELEGRAM_HOME_CHANNEL_NAME="My Notes"
|
||||
|
||||
# Discord - get from Discord Developer Portal
|
||||
DISCORD_BOT_TOKEN=your_discord_bot_token
|
||||
DISCORD_ALLOWED_USERS=123456789012345678 # Security: restrict to these user IDs
|
||||
|
||||
# Optional: Default channel for cron job delivery
|
||||
DISCORD_HOME_CHANNEL=123456789012345678
|
||||
DISCORD_HOME_CHANNEL_NAME="#bot-updates"
|
||||
|
||||
# WhatsApp - requires Node.js bridge setup
|
||||
WHATSAPP_ENABLED=true
|
||||
|
||||
# =============================================================================
|
||||
# AGENT SETTINGS
|
||||
# =============================================================================
|
||||
|
||||
# Max tool-calling iterations per conversation (default: 60)
|
||||
HERMES_MAX_ITERATIONS=60
|
||||
|
||||
# Working directory for terminal commands (default: home ~)
|
||||
MESSAGING_CWD=/home/myuser
|
||||
|
||||
# =============================================================================
|
||||
# TOOL PROGRESS NOTIFICATIONS
|
||||
# =============================================================================
|
||||
|
||||
# Show progress messages as agent uses tools
|
||||
HERMES_TOOL_PROGRESS=true
|
||||
|
||||
# Mode: "new" (only when tool changes) or "all" (every tool call)
|
||||
HERMES_TOOL_PROGRESS_MODE=new
|
||||
|
||||
# =============================================================================
|
||||
# SESSION SETTINGS
|
||||
# =============================================================================
|
||||
|
||||
# Reset sessions after N minutes of inactivity (default: 120)
|
||||
SESSION_IDLE_MINUTES=120
|
||||
|
||||
# Daily reset hour in 24h format (default: 4 = 4am)
|
||||
SESSION_RESET_HOUR=4
|
||||
```
|
||||
|
||||
### 2. Gateway Config File (`~/.hermes/gateway.json`) - Full Control
|
||||
|
||||
For advanced configuration, create `~/.hermes/gateway.json`:
|
||||
|
||||
```json
|
||||
{
|
||||
"platforms": {
|
||||
"telegram": {
|
||||
"enabled": true,
|
||||
"token": "your_telegram_token",
|
||||
"home_channel": {
|
||||
"platform": "telegram",
|
||||
"chat_id": "-1001234567890",
|
||||
"name": "My Notes"
|
||||
}
|
||||
},
|
||||
"discord": {
|
||||
"enabled": true,
|
||||
"token": "your_discord_token",
|
||||
"home_channel": {
|
||||
"platform": "discord",
|
||||
"chat_id": "123456789012345678",
|
||||
"name": "#bot-updates"
|
||||
}
|
||||
}
|
||||
},
|
||||
"default_reset_policy": {
|
||||
"mode": "both",
|
||||
"at_hour": 4,
|
||||
"idle_minutes": 120
|
||||
},
|
||||
"reset_by_platform": {
|
||||
"discord": {
|
||||
"mode": "idle",
|
||||
"idle_minutes": 60
|
||||
}
|
||||
},
|
||||
"always_log_local": true
|
||||
}
|
||||
```
|
||||
|
||||
## Platform-Specific Toolsets
|
||||
|
||||
Each platform has its own toolset for security:
|
||||
|
||||
| Platform | Toolset | Capabilities |
|
||||
|----------|---------|--------------|
|
||||
| CLI | `hermes-cli` | Full access (terminal, browser, etc.) |
|
||||
| Telegram | `hermes-telegram` | Full tools including terminal |
|
||||
| Discord | `hermes-discord` | Full tools including terminal |
|
||||
| WhatsApp | `hermes-whatsapp` | Full tools including terminal |
|
||||
|
||||
## User Experience Features
|
||||
|
||||
### Typing Indicator
|
||||
|
||||
The gateway keeps the "typing..." indicator active throughout processing, refreshing every 4 seconds. This lets users know the bot is working even during long tool-calling sequences.
|
||||
|
||||
### Tool Progress Notifications
|
||||
|
||||
When `HERMES_TOOL_PROGRESS=true`, the bot sends status messages as it works:
|
||||
|
||||
```
|
||||
💻 `ls -la`...
|
||||
🔍 web_search...
|
||||
📄 web_extract...
|
||||
🎨 image_generate...
|
||||
```
|
||||
|
||||
Terminal commands show the actual command (truncated to 50 chars). Other tools just show the tool name.
|
||||
|
||||
**Modes:**
|
||||
- `new`: Only sends message when switching to a different tool (less spam)
|
||||
- `all`: Sends message for every single tool call
|
||||
|
||||
### Working Directory
|
||||
|
||||
- **CLI (`hermes` command)**: Uses current directory where you run the command
|
||||
- **Messaging**: Uses `MESSAGING_CWD` (default: home directory `~`)
|
||||
|
||||
This is intentional: CLI users are in a terminal and expect the agent to work in their current directory, while messaging users need a consistent starting location.
|
||||
|
||||
### Max Iterations
|
||||
|
||||
If the agent hits the max iteration limit while working, instead of a generic error, it asks the model to summarize what it found so far. This gives you a useful response even when the task couldn't be fully completed.
|
||||
|
||||
## Cron Job Delivery
|
||||
|
||||
When scheduling cron jobs, you can specify where the output should be delivered:
|
||||
|
||||
```
|
||||
User: "Remind me to check the server in 30 minutes"
|
||||
|
||||
Agent uses: schedule_cronjob(
|
||||
prompt="Check server status...",
|
||||
schedule="30m",
|
||||
deliver="origin" # Back to this chat
|
||||
)
|
||||
```
|
||||
|
||||
### Delivery Options
|
||||
|
||||
| Option | Description |
|
||||
|--------|-------------|
|
||||
| `"origin"` | Back to where the job was created |
|
||||
| `"local"` | Save to local files only |
|
||||
| `"telegram"` | Telegram home channel |
|
||||
| `"discord"` | Discord home channel |
|
||||
| `"telegram:123456"` | Specific Telegram chat |
|
||||
|
||||
## Dynamic Context Injection
|
||||
|
||||
The agent knows where it is via injected context:
|
||||
|
||||
```
|
||||
## Current Session Context
|
||||
|
||||
**Source:** Telegram (group: Dev Team, ID: -1001234567890)
|
||||
**Connected Platforms:** local, telegram, discord
|
||||
|
||||
**Home Channels:**
|
||||
- telegram: My Notes (ID: -1001234567890)
|
||||
- discord: #bot-updates (ID: 123456789012345678)
|
||||
|
||||
**Delivery options for scheduled tasks:**
|
||||
- "origin" → Back to this chat (Dev Team)
|
||||
- "local" → Save to local files only
|
||||
- "telegram" → Home channel (My Notes)
|
||||
- "discord" → Home channel (#bot-updates)
|
||||
```
|
||||
|
||||
## CLI Commands
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `/platforms` | Show gateway configuration and status |
|
||||
| `--gateway` | Start the gateway (CLI flag) |
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### "python-telegram-bot not installed"
|
||||
|
||||
```bash
|
||||
pip install python-telegram-bot>=20.0
|
||||
```
|
||||
|
||||
### "discord.py not installed"
|
||||
|
||||
```bash
|
||||
pip install discord.py>=2.0
|
||||
```
|
||||
|
||||
### "No platforms connected"
|
||||
|
||||
1. Check your environment variables are set
|
||||
2. Check your tokens are valid
|
||||
3. Try `/platforms` to see configuration status
|
||||
|
||||
### Session not persisting
|
||||
|
||||
1. Check `~/.hermes/sessions/` exists
|
||||
2. Check session policies aren't too aggressive
|
||||
3. Verify no errors in gateway logs
|
||||
|
||||
## Adding a New Platform
|
||||
|
||||
To add a new messaging platform:
|
||||
|
||||
### 1. Create the adapter
|
||||
|
||||
Create `gateway/platforms/your_platform.py`:
|
||||
|
||||
```python
|
||||
from gateway.platforms.base import BasePlatformAdapter, MessageEvent, SendResult
|
||||
from gateway.config import Platform, PlatformConfig
|
||||
|
||||
class YourPlatformAdapter(BasePlatformAdapter):
|
||||
def __init__(self, config: PlatformConfig):
|
||||
super().__init__(config, Platform.YOUR_PLATFORM)
|
||||
|
||||
async def connect(self) -> bool:
|
||||
# Connect to the platform
|
||||
...
|
||||
|
||||
async def disconnect(self) -> None:
|
||||
# Disconnect
|
||||
...
|
||||
|
||||
async def send(self, chat_id: str, content: str, ...) -> SendResult:
|
||||
# Send a message
|
||||
...
|
||||
|
||||
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
|
||||
# Get chat information
|
||||
...
|
||||
```
|
||||
|
||||
### 2. Register the platform
|
||||
|
||||
Add to `gateway/config.py`:
|
||||
|
||||
```python
|
||||
class Platform(Enum):
|
||||
# ... existing ...
|
||||
YOUR_PLATFORM = "your_platform"
|
||||
```
|
||||
|
||||
### 3. Add to gateway runner
|
||||
|
||||
Update `gateway/run.py` `_create_adapter()`:
|
||||
|
||||
```python
|
||||
elif platform == Platform.YOUR_PLATFORM:
|
||||
from gateway.platforms.your_platform import YourPlatformAdapter
|
||||
return YourPlatformAdapter(config)
|
||||
```
|
||||
|
||||
### 4. Create a toolset (optional)
|
||||
|
||||
Add to `toolsets.py`:
|
||||
|
||||
```python
|
||||
"hermes-your-platform": {
|
||||
"description": "Your platform toolset",
|
||||
"tools": [...],
|
||||
"includes": []
|
||||
}
|
||||
```
|
||||
|
||||
### 5. Configure
|
||||
|
||||
Add environment variables to `.env`:
|
||||
|
||||
```bash
|
||||
YOUR_PLATFORM_TOKEN=...
|
||||
YOUR_PLATFORM_HOME_CHANNEL=...
|
||||
```
|
||||
|
||||
## Service Management
|
||||
|
||||
### Linux (systemd)
|
||||
|
||||
```bash
|
||||
# Install as user service
|
||||
./scripts/hermes-gateway install
|
||||
|
||||
# Manage
|
||||
systemctl --user start hermes-gateway
|
||||
systemctl --user stop hermes-gateway
|
||||
systemctl --user restart hermes-gateway
|
||||
systemctl --user status hermes-gateway
|
||||
|
||||
# View logs
|
||||
journalctl --user -u hermes-gateway -f
|
||||
|
||||
# Enable lingering (keeps running after logout)
|
||||
sudo loginctl enable-linger $USER
|
||||
```
|
||||
|
||||
### macOS (launchd)
|
||||
|
||||
```bash
|
||||
# Install
|
||||
./scripts/hermes-gateway install
|
||||
|
||||
# Manage
|
||||
launchctl start ai.hermes.gateway
|
||||
launchctl stop ai.hermes.gateway
|
||||
|
||||
# View logs
|
||||
tail -f ~/.hermes/logs/gateway.log
|
||||
```
|
||||
|
||||
### Manual (any platform)
|
||||
|
||||
```bash
|
||||
# Run in foreground (for testing/debugging)
|
||||
./scripts/hermes-gateway run
|
||||
|
||||
# Or via CLI (also foreground)
|
||||
python cli.py --gateway
|
||||
```
|
||||
|
||||
## Storage Locations
|
||||
|
||||
| Path | Purpose |
|
||||
|------|---------|
|
||||
| `~/.hermes/gateway.json` | Gateway configuration |
|
||||
| `~/.hermes/sessions/sessions.json` | Session index |
|
||||
| `~/.hermes/sessions/{id}.jsonl` | Conversation transcripts |
|
||||
| `~/.hermes/cron/output/` | Cron job outputs |
|
||||
| `~/.hermes/logs/gateway.log` | Gateway logs (macOS launchd) |
|
||||
35
gateway/__init__.py
Normal file
35
gateway/__init__.py
Normal file
@@ -0,0 +1,35 @@
|
||||
"""
|
||||
Hermes Gateway - Multi-platform messaging integration.
|
||||
|
||||
This module provides a unified gateway for connecting the Hermes agent
|
||||
to various messaging platforms (Telegram, Discord, WhatsApp) with:
|
||||
- Session management (persistent conversations with reset policies)
|
||||
- Dynamic context injection (agent knows where messages come from)
|
||||
- Delivery routing (cron job outputs to appropriate channels)
|
||||
- Platform-specific toolsets (different capabilities per platform)
|
||||
"""
|
||||
|
||||
from .config import GatewayConfig, PlatformConfig, HomeChannel, load_gateway_config
|
||||
from .session import (
|
||||
SessionContext,
|
||||
SessionStore,
|
||||
SessionResetPolicy,
|
||||
build_session_context_prompt,
|
||||
)
|
||||
from .delivery import DeliveryRouter, DeliveryTarget
|
||||
|
||||
__all__ = [
|
||||
# Config
|
||||
"GatewayConfig",
|
||||
"PlatformConfig",
|
||||
"HomeChannel",
|
||||
"load_gateway_config",
|
||||
# Session
|
||||
"SessionContext",
|
||||
"SessionStore",
|
||||
"SessionResetPolicy",
|
||||
"build_session_context_prompt",
|
||||
# Delivery
|
||||
"DeliveryRouter",
|
||||
"DeliveryTarget",
|
||||
]
|
||||
333
gateway/config.py
Normal file
333
gateway/config.py
Normal file
@@ -0,0 +1,333 @@
|
||||
"""
|
||||
Gateway configuration management.
|
||||
|
||||
Handles loading and validating configuration for:
|
||||
- Connected platforms (Telegram, Discord, WhatsApp)
|
||||
- Home channels for each platform
|
||||
- Session reset policies
|
||||
- Delivery preferences
|
||||
"""
|
||||
|
||||
import os
|
||||
import json
|
||||
from pathlib import Path
|
||||
from dataclasses import dataclass, field
|
||||
from typing import Dict, List, Optional, Any
|
||||
from enum import Enum
|
||||
|
||||
|
||||
class Platform(Enum):
|
||||
"""Supported messaging platforms."""
|
||||
LOCAL = "local"
|
||||
TELEGRAM = "telegram"
|
||||
DISCORD = "discord"
|
||||
WHATSAPP = "whatsapp"
|
||||
|
||||
|
||||
@dataclass
|
||||
class HomeChannel:
|
||||
"""
|
||||
Default destination for a platform.
|
||||
|
||||
When a cron job specifies deliver="telegram" without a specific chat ID,
|
||||
messages are sent to this home channel.
|
||||
"""
|
||||
platform: Platform
|
||||
chat_id: str
|
||||
name: str # Human-readable name for display
|
||||
|
||||
def to_dict(self) -> Dict[str, Any]:
|
||||
return {
|
||||
"platform": self.platform.value,
|
||||
"chat_id": self.chat_id,
|
||||
"name": self.name,
|
||||
}
|
||||
|
||||
@classmethod
|
||||
def from_dict(cls, data: Dict[str, Any]) -> "HomeChannel":
|
||||
return cls(
|
||||
platform=Platform(data["platform"]),
|
||||
chat_id=str(data["chat_id"]),
|
||||
name=data.get("name", "Home"),
|
||||
)
|
||||
|
||||
|
||||
@dataclass
|
||||
class SessionResetPolicy:
|
||||
"""
|
||||
Controls when sessions reset (lose context).
|
||||
|
||||
Modes:
|
||||
- "daily": Reset at a specific hour each day
|
||||
- "idle": Reset after N minutes of inactivity
|
||||
- "both": Whichever triggers first (daily boundary OR idle timeout)
|
||||
"""
|
||||
mode: str = "both" # "daily", "idle", or "both"
|
||||
at_hour: int = 4 # Hour for daily reset (0-23, local time)
|
||||
idle_minutes: int = 120 # Minutes of inactivity before reset
|
||||
|
||||
def to_dict(self) -> Dict[str, Any]:
|
||||
return {
|
||||
"mode": self.mode,
|
||||
"at_hour": self.at_hour,
|
||||
"idle_minutes": self.idle_minutes,
|
||||
}
|
||||
|
||||
@classmethod
|
||||
def from_dict(cls, data: Dict[str, Any]) -> "SessionResetPolicy":
|
||||
return cls(
|
||||
mode=data.get("mode", "both"),
|
||||
at_hour=data.get("at_hour", 4),
|
||||
idle_minutes=data.get("idle_minutes", 120),
|
||||
)
|
||||
|
||||
|
||||
@dataclass
|
||||
class PlatformConfig:
|
||||
"""Configuration for a single messaging platform."""
|
||||
enabled: bool = False
|
||||
token: Optional[str] = None # Bot token (Telegram, Discord)
|
||||
api_key: Optional[str] = None # API key if different from token
|
||||
home_channel: Optional[HomeChannel] = None
|
||||
|
||||
# Platform-specific settings
|
||||
extra: Dict[str, Any] = field(default_factory=dict)
|
||||
|
||||
def to_dict(self) -> Dict[str, Any]:
|
||||
result = {
|
||||
"enabled": self.enabled,
|
||||
"extra": self.extra,
|
||||
}
|
||||
if self.token:
|
||||
result["token"] = self.token
|
||||
if self.api_key:
|
||||
result["api_key"] = self.api_key
|
||||
if self.home_channel:
|
||||
result["home_channel"] = self.home_channel.to_dict()
|
||||
return result
|
||||
|
||||
@classmethod
|
||||
def from_dict(cls, data: Dict[str, Any]) -> "PlatformConfig":
|
||||
home_channel = None
|
||||
if "home_channel" in data:
|
||||
home_channel = HomeChannel.from_dict(data["home_channel"])
|
||||
|
||||
return cls(
|
||||
enabled=data.get("enabled", False),
|
||||
token=data.get("token"),
|
||||
api_key=data.get("api_key"),
|
||||
home_channel=home_channel,
|
||||
extra=data.get("extra", {}),
|
||||
)
|
||||
|
||||
|
||||
@dataclass
|
||||
class GatewayConfig:
|
||||
"""
|
||||
Main gateway configuration.
|
||||
|
||||
Manages all platform connections, session policies, and delivery settings.
|
||||
"""
|
||||
# Platform configurations
|
||||
platforms: Dict[Platform, PlatformConfig] = field(default_factory=dict)
|
||||
|
||||
# Session reset policies by type
|
||||
default_reset_policy: SessionResetPolicy = field(default_factory=SessionResetPolicy)
|
||||
reset_by_type: Dict[str, SessionResetPolicy] = field(default_factory=dict)
|
||||
reset_by_platform: Dict[Platform, SessionResetPolicy] = field(default_factory=dict)
|
||||
|
||||
# Reset trigger commands
|
||||
reset_triggers: List[str] = field(default_factory=lambda: ["/new", "/reset"])
|
||||
|
||||
# Storage paths
|
||||
sessions_dir: Path = field(default_factory=lambda: Path.home() / ".hermes" / "sessions")
|
||||
|
||||
# Delivery settings
|
||||
always_log_local: bool = True # Always save cron outputs to local files
|
||||
|
||||
def get_connected_platforms(self) -> List[Platform]:
|
||||
"""Return list of platforms that are enabled and configured."""
|
||||
connected = []
|
||||
for platform, config in self.platforms.items():
|
||||
if config.enabled and (config.token or config.api_key):
|
||||
connected.append(platform)
|
||||
return connected
|
||||
|
||||
def get_home_channel(self, platform: Platform) -> Optional[HomeChannel]:
|
||||
"""Get the home channel for a platform."""
|
||||
config = self.platforms.get(platform)
|
||||
if config:
|
||||
return config.home_channel
|
||||
return None
|
||||
|
||||
def get_reset_policy(
|
||||
self,
|
||||
platform: Optional[Platform] = None,
|
||||
session_type: Optional[str] = None
|
||||
) -> SessionResetPolicy:
|
||||
"""
|
||||
Get the appropriate reset policy for a session.
|
||||
|
||||
Priority: platform override > type override > default
|
||||
"""
|
||||
# Platform-specific override takes precedence
|
||||
if platform and platform in self.reset_by_platform:
|
||||
return self.reset_by_platform[platform]
|
||||
|
||||
# Type-specific override (dm, group, thread)
|
||||
if session_type and session_type in self.reset_by_type:
|
||||
return self.reset_by_type[session_type]
|
||||
|
||||
return self.default_reset_policy
|
||||
|
||||
def to_dict(self) -> Dict[str, Any]:
|
||||
return {
|
||||
"platforms": {
|
||||
p.value: c.to_dict() for p, c in self.platforms.items()
|
||||
},
|
||||
"default_reset_policy": self.default_reset_policy.to_dict(),
|
||||
"reset_by_type": {
|
||||
k: v.to_dict() for k, v in self.reset_by_type.items()
|
||||
},
|
||||
"reset_by_platform": {
|
||||
p.value: v.to_dict() for p, v in self.reset_by_platform.items()
|
||||
},
|
||||
"reset_triggers": self.reset_triggers,
|
||||
"sessions_dir": str(self.sessions_dir),
|
||||
"always_log_local": self.always_log_local,
|
||||
}
|
||||
|
||||
@classmethod
|
||||
def from_dict(cls, data: Dict[str, Any]) -> "GatewayConfig":
|
||||
platforms = {}
|
||||
for platform_name, platform_data in data.get("platforms", {}).items():
|
||||
try:
|
||||
platform = Platform(platform_name)
|
||||
platforms[platform] = PlatformConfig.from_dict(platform_data)
|
||||
except ValueError:
|
||||
pass # Skip unknown platforms
|
||||
|
||||
reset_by_type = {}
|
||||
for type_name, policy_data in data.get("reset_by_type", {}).items():
|
||||
reset_by_type[type_name] = SessionResetPolicy.from_dict(policy_data)
|
||||
|
||||
reset_by_platform = {}
|
||||
for platform_name, policy_data in data.get("reset_by_platform", {}).items():
|
||||
try:
|
||||
platform = Platform(platform_name)
|
||||
reset_by_platform[platform] = SessionResetPolicy.from_dict(policy_data)
|
||||
except ValueError:
|
||||
pass
|
||||
|
||||
default_policy = SessionResetPolicy()
|
||||
if "default_reset_policy" in data:
|
||||
default_policy = SessionResetPolicy.from_dict(data["default_reset_policy"])
|
||||
|
||||
sessions_dir = Path.home() / ".hermes" / "sessions"
|
||||
if "sessions_dir" in data:
|
||||
sessions_dir = Path(data["sessions_dir"])
|
||||
|
||||
return cls(
|
||||
platforms=platforms,
|
||||
default_reset_policy=default_policy,
|
||||
reset_by_type=reset_by_type,
|
||||
reset_by_platform=reset_by_platform,
|
||||
reset_triggers=data.get("reset_triggers", ["/new", "/reset"]),
|
||||
sessions_dir=sessions_dir,
|
||||
always_log_local=data.get("always_log_local", True),
|
||||
)
|
||||
|
||||
|
||||
def load_gateway_config() -> GatewayConfig:
|
||||
"""
|
||||
Load gateway configuration from multiple sources.
|
||||
|
||||
Priority (highest to lowest):
|
||||
1. Environment variables
|
||||
2. ~/.hermes/gateway.json
|
||||
3. cli-config.yaml gateway section
|
||||
4. Defaults
|
||||
"""
|
||||
config = GatewayConfig()
|
||||
|
||||
# Try loading from ~/.hermes/gateway.json
|
||||
gateway_config_path = Path.home() / ".hermes" / "gateway.json"
|
||||
if gateway_config_path.exists():
|
||||
try:
|
||||
with open(gateway_config_path, "r") as f:
|
||||
data = json.load(f)
|
||||
config = GatewayConfig.from_dict(data)
|
||||
except Exception as e:
|
||||
print(f"[gateway] Warning: Failed to load {gateway_config_path}: {e}")
|
||||
|
||||
# Override with environment variables
|
||||
_apply_env_overrides(config)
|
||||
|
||||
return config
|
||||
|
||||
|
||||
def _apply_env_overrides(config: GatewayConfig) -> None:
|
||||
"""Apply environment variable overrides to config."""
|
||||
|
||||
# Telegram
|
||||
telegram_token = os.getenv("TELEGRAM_BOT_TOKEN")
|
||||
if telegram_token:
|
||||
if Platform.TELEGRAM not in config.platforms:
|
||||
config.platforms[Platform.TELEGRAM] = PlatformConfig()
|
||||
config.platforms[Platform.TELEGRAM].enabled = True
|
||||
config.platforms[Platform.TELEGRAM].token = telegram_token
|
||||
|
||||
telegram_home = os.getenv("TELEGRAM_HOME_CHANNEL")
|
||||
if telegram_home and Platform.TELEGRAM in config.platforms:
|
||||
config.platforms[Platform.TELEGRAM].home_channel = HomeChannel(
|
||||
platform=Platform.TELEGRAM,
|
||||
chat_id=telegram_home,
|
||||
name=os.getenv("TELEGRAM_HOME_CHANNEL_NAME", "Home"),
|
||||
)
|
||||
|
||||
# Discord
|
||||
discord_token = os.getenv("DISCORD_BOT_TOKEN")
|
||||
if discord_token:
|
||||
if Platform.DISCORD not in config.platforms:
|
||||
config.platforms[Platform.DISCORD] = PlatformConfig()
|
||||
config.platforms[Platform.DISCORD].enabled = True
|
||||
config.platforms[Platform.DISCORD].token = discord_token
|
||||
|
||||
discord_home = os.getenv("DISCORD_HOME_CHANNEL")
|
||||
if discord_home and Platform.DISCORD in config.platforms:
|
||||
config.platforms[Platform.DISCORD].home_channel = HomeChannel(
|
||||
platform=Platform.DISCORD,
|
||||
chat_id=discord_home,
|
||||
name=os.getenv("DISCORD_HOME_CHANNEL_NAME", "Home"),
|
||||
)
|
||||
|
||||
# WhatsApp (typically uses different auth mechanism)
|
||||
whatsapp_enabled = os.getenv("WHATSAPP_ENABLED", "").lower() in ("true", "1", "yes")
|
||||
if whatsapp_enabled:
|
||||
if Platform.WHATSAPP not in config.platforms:
|
||||
config.platforms[Platform.WHATSAPP] = PlatformConfig()
|
||||
config.platforms[Platform.WHATSAPP].enabled = True
|
||||
|
||||
# Session settings
|
||||
idle_minutes = os.getenv("SESSION_IDLE_MINUTES")
|
||||
if idle_minutes:
|
||||
try:
|
||||
config.default_reset_policy.idle_minutes = int(idle_minutes)
|
||||
except ValueError:
|
||||
pass
|
||||
|
||||
reset_hour = os.getenv("SESSION_RESET_HOUR")
|
||||
if reset_hour:
|
||||
try:
|
||||
config.default_reset_policy.at_hour = int(reset_hour)
|
||||
except ValueError:
|
||||
pass
|
||||
|
||||
|
||||
def save_gateway_config(config: GatewayConfig) -> None:
|
||||
"""Save gateway configuration to ~/.hermes/gateway.json."""
|
||||
gateway_config_path = Path.home() / ".hermes" / "gateway.json"
|
||||
gateway_config_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
with open(gateway_config_path, "w") as f:
|
||||
json.dump(config.to_dict(), f, indent=2)
|
||||
318
gateway/delivery.py
Normal file
318
gateway/delivery.py
Normal file
@@ -0,0 +1,318 @@
|
||||
"""
|
||||
Delivery routing for cron job outputs and agent responses.
|
||||
|
||||
Routes messages to the appropriate destination based on:
|
||||
- Explicit targets (e.g., "telegram:123456789")
|
||||
- Platform home channels (e.g., "telegram" → home channel)
|
||||
- Origin (back to where the job was created)
|
||||
- Local (always saved to files)
|
||||
"""
|
||||
|
||||
import json
|
||||
from pathlib import Path
|
||||
from datetime import datetime
|
||||
from dataclasses import dataclass
|
||||
from typing import Dict, List, Optional, Any, Union
|
||||
from enum import Enum
|
||||
|
||||
from .config import Platform, GatewayConfig, HomeChannel
|
||||
from .session import SessionSource
|
||||
|
||||
|
||||
@dataclass
|
||||
class DeliveryTarget:
|
||||
"""
|
||||
A single delivery target.
|
||||
|
||||
Represents where a message should be sent:
|
||||
- "origin" → back to source
|
||||
- "local" → save to local files
|
||||
- "telegram" → Telegram home channel
|
||||
- "telegram:123456" → specific Telegram chat
|
||||
"""
|
||||
platform: Platform
|
||||
chat_id: Optional[str] = None # None means use home channel
|
||||
is_origin: bool = False
|
||||
is_explicit: bool = False # True if chat_id was explicitly specified
|
||||
|
||||
@classmethod
|
||||
def parse(cls, target: str, origin: Optional[SessionSource] = None) -> "DeliveryTarget":
|
||||
"""
|
||||
Parse a delivery target string.
|
||||
|
||||
Formats:
|
||||
- "origin" → back to source
|
||||
- "local" → local files only
|
||||
- "telegram" → Telegram home channel
|
||||
- "telegram:123456" → specific Telegram chat
|
||||
"""
|
||||
target = target.strip().lower()
|
||||
|
||||
if target == "origin":
|
||||
if origin:
|
||||
return cls(
|
||||
platform=origin.platform,
|
||||
chat_id=origin.chat_id,
|
||||
is_origin=True,
|
||||
)
|
||||
else:
|
||||
# Fallback to local if no origin
|
||||
return cls(platform=Platform.LOCAL, is_origin=True)
|
||||
|
||||
if target == "local":
|
||||
return cls(platform=Platform.LOCAL)
|
||||
|
||||
# Check for platform:chat_id format
|
||||
if ":" in target:
|
||||
platform_str, chat_id = target.split(":", 1)
|
||||
try:
|
||||
platform = Platform(platform_str)
|
||||
return cls(platform=platform, chat_id=chat_id, is_explicit=True)
|
||||
except ValueError:
|
||||
# Unknown platform, treat as local
|
||||
return cls(platform=Platform.LOCAL)
|
||||
|
||||
# Just a platform name (use home channel)
|
||||
try:
|
||||
platform = Platform(target)
|
||||
return cls(platform=platform)
|
||||
except ValueError:
|
||||
# Unknown platform, treat as local
|
||||
return cls(platform=Platform.LOCAL)
|
||||
|
||||
def to_string(self) -> str:
|
||||
"""Convert back to string format."""
|
||||
if self.is_origin:
|
||||
return "origin"
|
||||
if self.platform == Platform.LOCAL:
|
||||
return "local"
|
||||
if self.chat_id:
|
||||
return f"{self.platform.value}:{self.chat_id}"
|
||||
return self.platform.value
|
||||
|
||||
|
||||
class DeliveryRouter:
|
||||
"""
|
||||
Routes messages to appropriate destinations.
|
||||
|
||||
Handles the logic of resolving delivery targets and dispatching
|
||||
messages to the right platform adapters.
|
||||
"""
|
||||
|
||||
def __init__(self, config: GatewayConfig, adapters: Dict[Platform, Any] = None):
|
||||
"""
|
||||
Initialize the delivery router.
|
||||
|
||||
Args:
|
||||
config: Gateway configuration
|
||||
adapters: Dict mapping platforms to their adapter instances
|
||||
"""
|
||||
self.config = config
|
||||
self.adapters = adapters or {}
|
||||
self.output_dir = Path.home() / ".hermes" / "cron" / "output"
|
||||
|
||||
def resolve_targets(
|
||||
self,
|
||||
deliver: Union[str, List[str]],
|
||||
origin: Optional[SessionSource] = None
|
||||
) -> List[DeliveryTarget]:
|
||||
"""
|
||||
Resolve delivery specification to concrete targets.
|
||||
|
||||
Args:
|
||||
deliver: Delivery spec - "origin", "telegram", ["local", "discord"], etc.
|
||||
origin: The source where the request originated (for "origin" target)
|
||||
|
||||
Returns:
|
||||
List of resolved delivery targets
|
||||
"""
|
||||
if isinstance(deliver, str):
|
||||
deliver = [deliver]
|
||||
|
||||
targets = []
|
||||
seen_platforms = set()
|
||||
|
||||
for target_str in deliver:
|
||||
target = DeliveryTarget.parse(target_str, origin)
|
||||
|
||||
# Resolve home channel if needed
|
||||
if target.chat_id is None and target.platform != Platform.LOCAL:
|
||||
home = self.config.get_home_channel(target.platform)
|
||||
if home:
|
||||
target.chat_id = home.chat_id
|
||||
else:
|
||||
# No home channel configured, skip this platform
|
||||
continue
|
||||
|
||||
# Deduplicate
|
||||
key = (target.platform, target.chat_id)
|
||||
if key not in seen_platforms:
|
||||
seen_platforms.add(key)
|
||||
targets.append(target)
|
||||
|
||||
# Always include local if configured
|
||||
if self.config.always_log_local:
|
||||
local_key = (Platform.LOCAL, None)
|
||||
if local_key not in seen_platforms:
|
||||
targets.append(DeliveryTarget(platform=Platform.LOCAL))
|
||||
|
||||
return targets
|
||||
|
||||
async def deliver(
|
||||
self,
|
||||
content: str,
|
||||
targets: List[DeliveryTarget],
|
||||
job_id: Optional[str] = None,
|
||||
job_name: Optional[str] = None,
|
||||
metadata: Optional[Dict[str, Any]] = None
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Deliver content to all specified targets.
|
||||
|
||||
Args:
|
||||
content: The message/output to deliver
|
||||
targets: List of delivery targets
|
||||
job_id: Optional job ID (for cron jobs)
|
||||
job_name: Optional job name
|
||||
metadata: Additional metadata to include
|
||||
|
||||
Returns:
|
||||
Dict with delivery results per target
|
||||
"""
|
||||
results = {}
|
||||
|
||||
for target in targets:
|
||||
try:
|
||||
if target.platform == Platform.LOCAL:
|
||||
result = self._deliver_local(content, job_id, job_name, metadata)
|
||||
else:
|
||||
result = await self._deliver_to_platform(target, content, metadata)
|
||||
|
||||
results[target.to_string()] = {
|
||||
"success": True,
|
||||
"result": result
|
||||
}
|
||||
except Exception as e:
|
||||
results[target.to_string()] = {
|
||||
"success": False,
|
||||
"error": str(e)
|
||||
}
|
||||
|
||||
return results
|
||||
|
||||
def _deliver_local(
|
||||
self,
|
||||
content: str,
|
||||
job_id: Optional[str],
|
||||
job_name: Optional[str],
|
||||
metadata: Optional[Dict[str, Any]]
|
||||
) -> Dict[str, Any]:
|
||||
"""Save content to local files."""
|
||||
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
|
||||
|
||||
if job_id:
|
||||
output_path = self.output_dir / job_id / f"{timestamp}.md"
|
||||
else:
|
||||
output_path = self.output_dir / "misc" / f"{timestamp}.md"
|
||||
|
||||
output_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# Build the output document
|
||||
lines = []
|
||||
if job_name:
|
||||
lines.append(f"# {job_name}")
|
||||
else:
|
||||
lines.append("# Delivery Output")
|
||||
|
||||
lines.append("")
|
||||
lines.append(f"**Timestamp:** {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
|
||||
|
||||
if job_id:
|
||||
lines.append(f"**Job ID:** {job_id}")
|
||||
|
||||
if metadata:
|
||||
for key, value in metadata.items():
|
||||
lines.append(f"**{key}:** {value}")
|
||||
|
||||
lines.append("")
|
||||
lines.append("---")
|
||||
lines.append("")
|
||||
lines.append(content)
|
||||
|
||||
output_path.write_text("\n".join(lines))
|
||||
|
||||
return {
|
||||
"path": str(output_path),
|
||||
"timestamp": timestamp
|
||||
}
|
||||
|
||||
async def _deliver_to_platform(
|
||||
self,
|
||||
target: DeliveryTarget,
|
||||
content: str,
|
||||
metadata: Optional[Dict[str, Any]]
|
||||
) -> Dict[str, Any]:
|
||||
"""Deliver content to a messaging platform."""
|
||||
adapter = self.adapters.get(target.platform)
|
||||
|
||||
if not adapter:
|
||||
raise ValueError(f"No adapter configured for {target.platform.value}")
|
||||
|
||||
if not target.chat_id:
|
||||
raise ValueError(f"No chat ID for {target.platform.value} delivery")
|
||||
|
||||
# Call the adapter's send method
|
||||
# Adapters should implement: async def send(chat_id: str, content: str) -> Dict
|
||||
return await adapter.send(target.chat_id, content, metadata=metadata)
|
||||
|
||||
|
||||
def parse_deliver_spec(
|
||||
deliver: Optional[Union[str, List[str]]],
|
||||
origin: Optional[SessionSource] = None,
|
||||
default: str = "origin"
|
||||
) -> Union[str, List[str]]:
|
||||
"""
|
||||
Normalize a delivery specification.
|
||||
|
||||
If None or empty, returns the default.
|
||||
"""
|
||||
if not deliver:
|
||||
return default
|
||||
return deliver
|
||||
|
||||
|
||||
def build_delivery_context_for_tool(
|
||||
config: GatewayConfig,
|
||||
origin: Optional[SessionSource] = None
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Build context for the schedule_cronjob tool to understand delivery options.
|
||||
|
||||
This is passed to the tool so it can validate and explain delivery targets.
|
||||
"""
|
||||
connected = config.get_connected_platforms()
|
||||
|
||||
options = {
|
||||
"origin": {
|
||||
"description": "Back to where this job was created",
|
||||
"available": origin is not None,
|
||||
},
|
||||
"local": {
|
||||
"description": "Save to local files only",
|
||||
"available": True,
|
||||
}
|
||||
}
|
||||
|
||||
for platform in connected:
|
||||
home = config.get_home_channel(platform)
|
||||
options[platform.value] = {
|
||||
"description": f"{platform.value.title()} home channel",
|
||||
"available": True,
|
||||
"home_channel": home.to_dict() if home else None,
|
||||
}
|
||||
|
||||
return {
|
||||
"origin": origin.to_dict() if origin else None,
|
||||
"options": options,
|
||||
"always_log_local": config.always_log_local,
|
||||
}
|
||||
17
gateway/platforms/__init__.py
Normal file
17
gateway/platforms/__init__.py
Normal file
@@ -0,0 +1,17 @@
|
||||
"""
|
||||
Platform adapters for messaging integrations.
|
||||
|
||||
Each adapter handles:
|
||||
- Receiving messages from a platform
|
||||
- Sending messages/responses back
|
||||
- Platform-specific authentication
|
||||
- Message formatting and media handling
|
||||
"""
|
||||
|
||||
from .base import BasePlatformAdapter, MessageEvent, SendResult
|
||||
|
||||
__all__ = [
|
||||
"BasePlatformAdapter",
|
||||
"MessageEvent",
|
||||
"SendResult",
|
||||
]
|
||||
365
gateway/platforms/base.py
Normal file
365
gateway/platforms/base.py
Normal file
@@ -0,0 +1,365 @@
|
||||
"""
|
||||
Base platform adapter interface.
|
||||
|
||||
All platform adapters (Telegram, Discord, WhatsApp) inherit from this
|
||||
and implement the required methods.
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
from abc import ABC, abstractmethod
|
||||
from dataclasses import dataclass, field
|
||||
from datetime import datetime
|
||||
from typing import Dict, List, Optional, Any, Callable, Awaitable
|
||||
from enum import Enum
|
||||
|
||||
import sys
|
||||
sys.path.insert(0, str(__file__).rsplit("/", 3)[0])
|
||||
|
||||
from gateway.config import Platform, PlatformConfig
|
||||
from gateway.session import SessionSource
|
||||
|
||||
|
||||
class MessageType(Enum):
|
||||
"""Types of incoming messages."""
|
||||
TEXT = "text"
|
||||
PHOTO = "photo"
|
||||
VIDEO = "video"
|
||||
AUDIO = "audio"
|
||||
VOICE = "voice"
|
||||
DOCUMENT = "document"
|
||||
STICKER = "sticker"
|
||||
COMMAND = "command" # /command style
|
||||
|
||||
|
||||
@dataclass
|
||||
class MessageEvent:
|
||||
"""
|
||||
Incoming message from a platform.
|
||||
|
||||
Normalized representation that all adapters produce.
|
||||
"""
|
||||
# Message content
|
||||
text: str
|
||||
message_type: MessageType = MessageType.TEXT
|
||||
|
||||
# Source information
|
||||
source: SessionSource = None
|
||||
|
||||
# Original platform data
|
||||
raw_message: Any = None
|
||||
message_id: Optional[str] = None
|
||||
|
||||
# Media attachments
|
||||
media_urls: List[str] = field(default_factory=list)
|
||||
media_types: List[str] = field(default_factory=list)
|
||||
|
||||
# Reply context
|
||||
reply_to_message_id: Optional[str] = None
|
||||
|
||||
# Timestamps
|
||||
timestamp: datetime = field(default_factory=datetime.now)
|
||||
|
||||
def is_command(self) -> bool:
|
||||
"""Check if this is a command message (e.g., /new, /reset)."""
|
||||
return self.text.startswith("/")
|
||||
|
||||
def get_command(self) -> Optional[str]:
|
||||
"""Extract command name if this is a command message."""
|
||||
if not self.is_command():
|
||||
return None
|
||||
# Split on space and get first word, strip the /
|
||||
parts = self.text.split(maxsplit=1)
|
||||
return parts[0][1:].lower() if parts else None
|
||||
|
||||
def get_command_args(self) -> str:
|
||||
"""Get the arguments after a command."""
|
||||
if not self.is_command():
|
||||
return self.text
|
||||
parts = self.text.split(maxsplit=1)
|
||||
return parts[1] if len(parts) > 1 else ""
|
||||
|
||||
|
||||
@dataclass
|
||||
class SendResult:
|
||||
"""Result of sending a message."""
|
||||
success: bool
|
||||
message_id: Optional[str] = None
|
||||
error: Optional[str] = None
|
||||
raw_response: Any = None
|
||||
|
||||
|
||||
# Type for message handlers
|
||||
MessageHandler = Callable[[MessageEvent], Awaitable[Optional[str]]]
|
||||
|
||||
|
||||
class BasePlatformAdapter(ABC):
|
||||
"""
|
||||
Base class for platform adapters.
|
||||
|
||||
Subclasses implement platform-specific logic for:
|
||||
- Connecting and authenticating
|
||||
- Receiving messages
|
||||
- Sending messages/responses
|
||||
- Handling media
|
||||
"""
|
||||
|
||||
def __init__(self, config: PlatformConfig, platform: Platform):
|
||||
self.config = config
|
||||
self.platform = platform
|
||||
self._message_handler: Optional[MessageHandler] = None
|
||||
self._running = False
|
||||
|
||||
# Track active message handlers per session for interrupt support
|
||||
# Key: session_key (e.g., chat_id), Value: (event, asyncio.Event for interrupt)
|
||||
self._active_sessions: Dict[str, asyncio.Event] = {}
|
||||
self._pending_messages: Dict[str, MessageEvent] = {}
|
||||
|
||||
@property
|
||||
def name(self) -> str:
|
||||
"""Human-readable name for this adapter."""
|
||||
return self.platform.value.title()
|
||||
|
||||
@property
|
||||
def is_connected(self) -> bool:
|
||||
"""Check if adapter is currently connected."""
|
||||
return self._running
|
||||
|
||||
def set_message_handler(self, handler: MessageHandler) -> None:
|
||||
"""
|
||||
Set the handler for incoming messages.
|
||||
|
||||
The handler receives a MessageEvent and should return
|
||||
an optional response string.
|
||||
"""
|
||||
self._message_handler = handler
|
||||
|
||||
@abstractmethod
|
||||
async def connect(self) -> bool:
|
||||
"""
|
||||
Connect to the platform and start receiving messages.
|
||||
|
||||
Returns True if connection was successful.
|
||||
"""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
async def disconnect(self) -> None:
|
||||
"""Disconnect from the platform."""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
async def send(
|
||||
self,
|
||||
chat_id: str,
|
||||
content: str,
|
||||
reply_to: Optional[str] = None,
|
||||
metadata: Optional[Dict[str, Any]] = None
|
||||
) -> SendResult:
|
||||
"""
|
||||
Send a message to a chat.
|
||||
|
||||
Args:
|
||||
chat_id: The chat/channel ID to send to
|
||||
content: Message content (may be markdown)
|
||||
reply_to: Optional message ID to reply to
|
||||
metadata: Additional platform-specific options
|
||||
|
||||
Returns:
|
||||
SendResult with success status and message ID
|
||||
"""
|
||||
pass
|
||||
|
||||
async def send_typing(self, chat_id: str) -> None:
|
||||
"""
|
||||
Send a typing indicator.
|
||||
|
||||
Override in subclasses if the platform supports it.
|
||||
"""
|
||||
pass
|
||||
|
||||
async def _keep_typing(self, chat_id: str, interval: float = 2.0) -> None:
|
||||
"""
|
||||
Continuously send typing indicator until cancelled.
|
||||
|
||||
Telegram/Discord typing status expires after ~5 seconds, so we refresh every 2
|
||||
to recover quickly after progress messages interrupt it.
|
||||
"""
|
||||
try:
|
||||
while True:
|
||||
await self.send_typing(chat_id)
|
||||
await asyncio.sleep(interval)
|
||||
except asyncio.CancelledError:
|
||||
pass # Normal cancellation when handler completes
|
||||
|
||||
async def handle_message(self, event: MessageEvent) -> None:
|
||||
"""
|
||||
Process an incoming message.
|
||||
|
||||
This method returns quickly by spawning background tasks.
|
||||
This allows new messages to be processed even while an agent is running,
|
||||
enabling interruption support.
|
||||
"""
|
||||
if not self._message_handler:
|
||||
return
|
||||
|
||||
session_key = event.source.chat_id
|
||||
|
||||
# Check if there's already an active handler for this session
|
||||
if session_key in self._active_sessions:
|
||||
# Store this as a pending message - it will interrupt the running agent
|
||||
print(f"[{self.name}] ⚡ New message while session {session_key} is active - triggering interrupt")
|
||||
self._pending_messages[session_key] = event
|
||||
# Signal the interrupt (the processing task checks this)
|
||||
self._active_sessions[session_key].set()
|
||||
return # Don't process now - will be handled after current task finishes
|
||||
|
||||
# Spawn background task to process this message
|
||||
asyncio.create_task(self._process_message_background(event, session_key))
|
||||
|
||||
async def _process_message_background(self, event: MessageEvent, session_key: str) -> None:
|
||||
"""Background task that actually processes the message."""
|
||||
# Create interrupt event for this session
|
||||
interrupt_event = asyncio.Event()
|
||||
self._active_sessions[session_key] = interrupt_event
|
||||
|
||||
# Start continuous typing indicator (refreshes every 2 seconds)
|
||||
typing_task = asyncio.create_task(self._keep_typing(event.source.chat_id))
|
||||
|
||||
try:
|
||||
# Call the handler (this can take a while with tool calls)
|
||||
response = await self._message_handler(event)
|
||||
|
||||
# Send response if any
|
||||
if response:
|
||||
result = await self.send(
|
||||
chat_id=event.source.chat_id,
|
||||
content=response,
|
||||
reply_to=event.message_id
|
||||
)
|
||||
|
||||
# Log send failures (don't raise - user already saw tool progress)
|
||||
if not result.success:
|
||||
print(f"[{self.name}] Failed to send response: {result.error}")
|
||||
# Try sending without markdown as fallback
|
||||
fallback_result = await self.send(
|
||||
chat_id=event.source.chat_id,
|
||||
content=f"(Response formatting failed, plain text:)\n\n{response[:3500]}",
|
||||
reply_to=event.message_id
|
||||
)
|
||||
if not fallback_result.success:
|
||||
print(f"[{self.name}] Fallback send also failed: {fallback_result.error}")
|
||||
|
||||
# Check if there's a pending message that was queued during our processing
|
||||
if session_key in self._pending_messages:
|
||||
pending_event = self._pending_messages.pop(session_key)
|
||||
print(f"[{self.name}] 📨 Processing queued message from interrupt")
|
||||
# Clean up current session before processing pending
|
||||
if session_key in self._active_sessions:
|
||||
del self._active_sessions[session_key]
|
||||
typing_task.cancel()
|
||||
try:
|
||||
await typing_task
|
||||
except asyncio.CancelledError:
|
||||
pass
|
||||
# Process pending message in new background task
|
||||
await self._process_message_background(pending_event, session_key)
|
||||
return # Already cleaned up
|
||||
|
||||
except Exception as e:
|
||||
print(f"[{self.name}] Error handling message: {e}")
|
||||
import traceback
|
||||
traceback.print_exc()
|
||||
finally:
|
||||
# Stop typing indicator
|
||||
typing_task.cancel()
|
||||
try:
|
||||
await typing_task
|
||||
except asyncio.CancelledError:
|
||||
pass
|
||||
# Clean up session tracking
|
||||
if session_key in self._active_sessions:
|
||||
del self._active_sessions[session_key]
|
||||
|
||||
def has_pending_interrupt(self, session_key: str) -> bool:
|
||||
"""Check if there's a pending interrupt for a session."""
|
||||
return session_key in self._active_sessions and self._active_sessions[session_key].is_set()
|
||||
|
||||
def get_pending_message(self, session_key: str) -> Optional[MessageEvent]:
|
||||
"""Get and clear any pending message for a session."""
|
||||
return self._pending_messages.get(session_key)
|
||||
|
||||
def build_source(
|
||||
self,
|
||||
chat_id: str,
|
||||
chat_name: Optional[str] = None,
|
||||
chat_type: str = "dm",
|
||||
user_id: Optional[str] = None,
|
||||
user_name: Optional[str] = None,
|
||||
thread_id: Optional[str] = None
|
||||
) -> SessionSource:
|
||||
"""Helper to build a SessionSource for this platform."""
|
||||
return SessionSource(
|
||||
platform=self.platform,
|
||||
chat_id=str(chat_id),
|
||||
chat_name=chat_name,
|
||||
chat_type=chat_type,
|
||||
user_id=str(user_id) if user_id else None,
|
||||
user_name=user_name,
|
||||
thread_id=str(thread_id) if thread_id else None,
|
||||
)
|
||||
|
||||
@abstractmethod
|
||||
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
|
||||
"""
|
||||
Get information about a chat/channel.
|
||||
|
||||
Returns dict with at least:
|
||||
- name: Chat name
|
||||
- type: "dm", "group", "channel"
|
||||
"""
|
||||
pass
|
||||
|
||||
def format_message(self, content: str) -> str:
|
||||
"""
|
||||
Format a message for this platform.
|
||||
|
||||
Override in subclasses to handle platform-specific formatting
|
||||
(e.g., Telegram MarkdownV2, Discord markdown).
|
||||
|
||||
Default implementation returns content as-is.
|
||||
"""
|
||||
return content
|
||||
|
||||
def truncate_message(self, content: str, max_length: int = 4096) -> List[str]:
|
||||
"""
|
||||
Split a long message into chunks.
|
||||
|
||||
Args:
|
||||
content: The full message content
|
||||
max_length: Maximum length per chunk (platform-specific)
|
||||
|
||||
Returns:
|
||||
List of message chunks
|
||||
"""
|
||||
if len(content) <= max_length:
|
||||
return [content]
|
||||
|
||||
chunks = []
|
||||
while content:
|
||||
if len(content) <= max_length:
|
||||
chunks.append(content)
|
||||
break
|
||||
|
||||
# Try to split at a newline
|
||||
split_idx = content.rfind("\n", 0, max_length)
|
||||
if split_idx == -1:
|
||||
# No newline, split at space
|
||||
split_idx = content.rfind(" ", 0, max_length)
|
||||
if split_idx == -1:
|
||||
# No space either, hard split
|
||||
split_idx = max_length
|
||||
|
||||
chunks.append(content[:split_idx])
|
||||
content = content[split_idx:].lstrip()
|
||||
|
||||
return chunks
|
||||
297
gateway/platforms/discord.py
Normal file
297
gateway/platforms/discord.py
Normal file
@@ -0,0 +1,297 @@
|
||||
"""
|
||||
Discord platform adapter.
|
||||
|
||||
Uses discord.py library for:
|
||||
- Receiving messages from servers and DMs
|
||||
- Sending responses back
|
||||
- Handling threads and channels
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
from typing import Dict, List, Optional, Any
|
||||
|
||||
try:
|
||||
import discord
|
||||
from discord import Message as DiscordMessage, Intents
|
||||
from discord.ext import commands
|
||||
DISCORD_AVAILABLE = True
|
||||
except ImportError:
|
||||
DISCORD_AVAILABLE = False
|
||||
discord = None
|
||||
DiscordMessage = Any
|
||||
Intents = Any
|
||||
commands = None
|
||||
|
||||
import sys
|
||||
sys.path.insert(0, str(__file__).rsplit("/", 3)[0])
|
||||
|
||||
from gateway.config import Platform, PlatformConfig
|
||||
from gateway.platforms.base import (
|
||||
BasePlatformAdapter,
|
||||
MessageEvent,
|
||||
MessageType,
|
||||
SendResult,
|
||||
)
|
||||
|
||||
|
||||
def check_discord_requirements() -> bool:
|
||||
"""Check if Discord dependencies are available."""
|
||||
return DISCORD_AVAILABLE
|
||||
|
||||
|
||||
class DiscordAdapter(BasePlatformAdapter):
|
||||
"""
|
||||
Discord bot adapter.
|
||||
|
||||
Handles:
|
||||
- Receiving messages from servers and DMs
|
||||
- Sending responses with Discord markdown
|
||||
- Thread support
|
||||
- Slash commands (future)
|
||||
"""
|
||||
|
||||
# Discord message limits
|
||||
MAX_MESSAGE_LENGTH = 2000
|
||||
|
||||
def __init__(self, config: PlatformConfig):
|
||||
super().__init__(config, Platform.DISCORD)
|
||||
self._client: Optional[commands.Bot] = None
|
||||
self._ready_event = asyncio.Event()
|
||||
|
||||
async def connect(self) -> bool:
|
||||
"""Connect to Discord and start receiving events."""
|
||||
if not DISCORD_AVAILABLE:
|
||||
print(f"[{self.name}] discord.py not installed. Run: pip install discord.py")
|
||||
return False
|
||||
|
||||
if not self.config.token:
|
||||
print(f"[{self.name}] No bot token configured")
|
||||
return False
|
||||
|
||||
try:
|
||||
# Set up intents
|
||||
intents = Intents.default()
|
||||
intents.message_content = True
|
||||
intents.dm_messages = True
|
||||
intents.guild_messages = True
|
||||
|
||||
# Create bot
|
||||
self._client = commands.Bot(
|
||||
command_prefix="!", # Not really used, we handle raw messages
|
||||
intents=intents,
|
||||
)
|
||||
|
||||
# Register event handlers
|
||||
@self._client.event
|
||||
async def on_ready():
|
||||
print(f"[{self.name}] Connected as {self._client.user}")
|
||||
self._ready_event.set()
|
||||
|
||||
@self._client.event
|
||||
async def on_message(message: DiscordMessage):
|
||||
# Ignore bot's own messages
|
||||
if message.author == self._client.user:
|
||||
return
|
||||
await self._handle_message(message)
|
||||
|
||||
# Start the bot in background
|
||||
asyncio.create_task(self._client.start(self.config.token))
|
||||
|
||||
# Wait for ready
|
||||
await asyncio.wait_for(self._ready_event.wait(), timeout=30)
|
||||
|
||||
self._running = True
|
||||
return True
|
||||
|
||||
except asyncio.TimeoutError:
|
||||
print(f"[{self.name}] Timeout waiting for connection")
|
||||
return False
|
||||
except Exception as e:
|
||||
print(f"[{self.name}] Failed to connect: {e}")
|
||||
return False
|
||||
|
||||
async def disconnect(self) -> None:
|
||||
"""Disconnect from Discord."""
|
||||
if self._client:
|
||||
try:
|
||||
await self._client.close()
|
||||
except Exception as e:
|
||||
print(f"[{self.name}] Error during disconnect: {e}")
|
||||
|
||||
self._running = False
|
||||
self._client = None
|
||||
self._ready_event.clear()
|
||||
print(f"[{self.name}] Disconnected")
|
||||
|
||||
async def send(
|
||||
self,
|
||||
chat_id: str,
|
||||
content: str,
|
||||
reply_to: Optional[str] = None,
|
||||
metadata: Optional[Dict[str, Any]] = None
|
||||
) -> SendResult:
|
||||
"""Send a message to a Discord channel."""
|
||||
if not self._client:
|
||||
return SendResult(success=False, error="Not connected")
|
||||
|
||||
try:
|
||||
# Get the channel
|
||||
channel = self._client.get_channel(int(chat_id))
|
||||
if not channel:
|
||||
channel = await self._client.fetch_channel(int(chat_id))
|
||||
|
||||
if not channel:
|
||||
return SendResult(success=False, error=f"Channel {chat_id} not found")
|
||||
|
||||
# Format and split message if needed
|
||||
formatted = self.format_message(content)
|
||||
chunks = self.truncate_message(formatted, self.MAX_MESSAGE_LENGTH)
|
||||
|
||||
message_ids = []
|
||||
reference = None
|
||||
|
||||
if reply_to:
|
||||
try:
|
||||
ref_msg = await channel.fetch_message(int(reply_to))
|
||||
reference = ref_msg
|
||||
except Exception:
|
||||
pass # Ignore if we can't find the referenced message
|
||||
|
||||
for i, chunk in enumerate(chunks):
|
||||
msg = await channel.send(
|
||||
content=chunk,
|
||||
reference=reference if i == 0 else None,
|
||||
)
|
||||
message_ids.append(str(msg.id))
|
||||
|
||||
return SendResult(
|
||||
success=True,
|
||||
message_id=message_ids[0] if message_ids else None,
|
||||
raw_response={"message_ids": message_ids}
|
||||
)
|
||||
|
||||
except Exception as e:
|
||||
return SendResult(success=False, error=str(e))
|
||||
|
||||
async def send_typing(self, chat_id: str) -> None:
|
||||
"""Send typing indicator."""
|
||||
if self._client:
|
||||
try:
|
||||
channel = self._client.get_channel(int(chat_id))
|
||||
if channel:
|
||||
await channel.typing()
|
||||
except Exception:
|
||||
pass # Ignore typing indicator failures
|
||||
|
||||
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
|
||||
"""Get information about a Discord channel."""
|
||||
if not self._client:
|
||||
return {"name": "Unknown", "type": "dm"}
|
||||
|
||||
try:
|
||||
channel = self._client.get_channel(int(chat_id))
|
||||
if not channel:
|
||||
channel = await self._client.fetch_channel(int(chat_id))
|
||||
|
||||
if not channel:
|
||||
return {"name": str(chat_id), "type": "dm"}
|
||||
|
||||
# Determine channel type
|
||||
if isinstance(channel, discord.DMChannel):
|
||||
chat_type = "dm"
|
||||
name = channel.recipient.name if channel.recipient else str(chat_id)
|
||||
elif isinstance(channel, discord.Thread):
|
||||
chat_type = "thread"
|
||||
name = channel.name
|
||||
elif isinstance(channel, discord.TextChannel):
|
||||
chat_type = "channel"
|
||||
name = f"#{channel.name}"
|
||||
if channel.guild:
|
||||
name = f"{channel.guild.name} / {name}"
|
||||
else:
|
||||
chat_type = "channel"
|
||||
name = getattr(channel, "name", str(chat_id))
|
||||
|
||||
return {
|
||||
"name": name,
|
||||
"type": chat_type,
|
||||
"guild_id": str(channel.guild.id) if hasattr(channel, "guild") and channel.guild else None,
|
||||
"guild_name": channel.guild.name if hasattr(channel, "guild") and channel.guild else None,
|
||||
}
|
||||
except Exception as e:
|
||||
return {"name": str(chat_id), "type": "dm", "error": str(e)}
|
||||
|
||||
def format_message(self, content: str) -> str:
|
||||
"""
|
||||
Format message for Discord.
|
||||
|
||||
Discord uses its own markdown variant.
|
||||
"""
|
||||
# Discord markdown is fairly standard, no special escaping needed
|
||||
return content
|
||||
|
||||
async def _handle_message(self, message: DiscordMessage) -> None:
|
||||
"""Handle incoming Discord messages."""
|
||||
# Determine message type
|
||||
msg_type = MessageType.TEXT
|
||||
if message.content.startswith("/"):
|
||||
msg_type = MessageType.COMMAND
|
||||
elif message.attachments:
|
||||
# Check attachment types
|
||||
for att in message.attachments:
|
||||
if att.content_type:
|
||||
if att.content_type.startswith("image/"):
|
||||
msg_type = MessageType.PHOTO
|
||||
elif att.content_type.startswith("video/"):
|
||||
msg_type = MessageType.VIDEO
|
||||
elif att.content_type.startswith("audio/"):
|
||||
msg_type = MessageType.AUDIO
|
||||
else:
|
||||
msg_type = MessageType.DOCUMENT
|
||||
break
|
||||
|
||||
# Determine chat type
|
||||
if isinstance(message.channel, discord.DMChannel):
|
||||
chat_type = "dm"
|
||||
chat_name = message.author.name
|
||||
elif isinstance(message.channel, discord.Thread):
|
||||
chat_type = "thread"
|
||||
chat_name = message.channel.name
|
||||
else:
|
||||
chat_type = "group" # Treat server channels as groups
|
||||
chat_name = getattr(message.channel, "name", str(message.channel.id))
|
||||
if hasattr(message.channel, "guild") and message.channel.guild:
|
||||
chat_name = f"{message.channel.guild.name} / #{chat_name}"
|
||||
|
||||
# Get thread ID if in a thread
|
||||
thread_id = None
|
||||
if isinstance(message.channel, discord.Thread):
|
||||
thread_id = str(message.channel.id)
|
||||
|
||||
# Build source
|
||||
source = self.build_source(
|
||||
chat_id=str(message.channel.id),
|
||||
chat_name=chat_name,
|
||||
chat_type=chat_type,
|
||||
user_id=str(message.author.id),
|
||||
user_name=message.author.display_name,
|
||||
thread_id=thread_id,
|
||||
)
|
||||
|
||||
# Build media URLs
|
||||
media_urls = [att.url for att in message.attachments]
|
||||
media_types = [att.content_type or "unknown" for att in message.attachments]
|
||||
|
||||
event = MessageEvent(
|
||||
text=message.content,
|
||||
message_type=msg_type,
|
||||
source=source,
|
||||
raw_message=message,
|
||||
message_id=str(message.id),
|
||||
media_urls=media_urls,
|
||||
media_types=media_types,
|
||||
reply_to_message_id=str(message.reference.message_id) if message.reference else None,
|
||||
timestamp=message.created_at,
|
||||
)
|
||||
|
||||
await self.handle_message(event)
|
||||
298
gateway/platforms/telegram.py
Normal file
298
gateway/platforms/telegram.py
Normal file
@@ -0,0 +1,298 @@
|
||||
"""
|
||||
Telegram platform adapter.
|
||||
|
||||
Uses python-telegram-bot library for:
|
||||
- Receiving messages from users/groups
|
||||
- Sending responses back
|
||||
- Handling media and commands
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
from typing import Dict, List, Optional, Any
|
||||
|
||||
try:
|
||||
from telegram import Update, Bot, Message
|
||||
from telegram.ext import (
|
||||
Application,
|
||||
CommandHandler,
|
||||
MessageHandler as TelegramMessageHandler,
|
||||
ContextTypes,
|
||||
filters,
|
||||
)
|
||||
from telegram.constants import ParseMode, ChatType
|
||||
TELEGRAM_AVAILABLE = True
|
||||
except ImportError:
|
||||
TELEGRAM_AVAILABLE = False
|
||||
Update = Any
|
||||
Bot = Any
|
||||
Message = Any
|
||||
Application = Any
|
||||
ContextTypes = Any
|
||||
|
||||
import sys
|
||||
sys.path.insert(0, str(__file__).rsplit("/", 3)[0])
|
||||
|
||||
from gateway.config import Platform, PlatformConfig
|
||||
from gateway.platforms.base import (
|
||||
BasePlatformAdapter,
|
||||
MessageEvent,
|
||||
MessageType,
|
||||
SendResult,
|
||||
)
|
||||
|
||||
|
||||
def check_telegram_requirements() -> bool:
|
||||
"""Check if Telegram dependencies are available."""
|
||||
return TELEGRAM_AVAILABLE
|
||||
|
||||
|
||||
class TelegramAdapter(BasePlatformAdapter):
|
||||
"""
|
||||
Telegram bot adapter.
|
||||
|
||||
Handles:
|
||||
- Receiving messages from users and groups
|
||||
- Sending responses with Telegram markdown
|
||||
- Forum topics (thread_id support)
|
||||
- Media messages
|
||||
"""
|
||||
|
||||
# Telegram message limits
|
||||
MAX_MESSAGE_LENGTH = 4096
|
||||
|
||||
def __init__(self, config: PlatformConfig):
|
||||
super().__init__(config, Platform.TELEGRAM)
|
||||
self._app: Optional[Application] = None
|
||||
self._bot: Optional[Bot] = None
|
||||
|
||||
async def connect(self) -> bool:
|
||||
"""Connect to Telegram and start polling for updates."""
|
||||
if not TELEGRAM_AVAILABLE:
|
||||
print(f"[{self.name}] python-telegram-bot not installed. Run: pip install python-telegram-bot")
|
||||
return False
|
||||
|
||||
if not self.config.token:
|
||||
print(f"[{self.name}] No bot token configured")
|
||||
return False
|
||||
|
||||
try:
|
||||
# Build the application
|
||||
self._app = Application.builder().token(self.config.token).build()
|
||||
self._bot = self._app.bot
|
||||
|
||||
# Register handlers
|
||||
self._app.add_handler(TelegramMessageHandler(
|
||||
filters.TEXT & ~filters.COMMAND,
|
||||
self._handle_text_message
|
||||
))
|
||||
self._app.add_handler(TelegramMessageHandler(
|
||||
filters.COMMAND,
|
||||
self._handle_command
|
||||
))
|
||||
self._app.add_handler(TelegramMessageHandler(
|
||||
filters.PHOTO | filters.VIDEO | filters.AUDIO | filters.VOICE | filters.Document.ALL,
|
||||
self._handle_media_message
|
||||
))
|
||||
|
||||
# Start polling in background
|
||||
await self._app.initialize()
|
||||
await self._app.start()
|
||||
await self._app.updater.start_polling(allowed_updates=Update.ALL_TYPES)
|
||||
|
||||
self._running = True
|
||||
print(f"[{self.name}] Connected and polling for updates")
|
||||
return True
|
||||
|
||||
except Exception as e:
|
||||
print(f"[{self.name}] Failed to connect: {e}")
|
||||
return False
|
||||
|
||||
async def disconnect(self) -> None:
|
||||
"""Stop polling and disconnect."""
|
||||
if self._app:
|
||||
try:
|
||||
await self._app.updater.stop()
|
||||
await self._app.stop()
|
||||
await self._app.shutdown()
|
||||
except Exception as e:
|
||||
print(f"[{self.name}] Error during disconnect: {e}")
|
||||
|
||||
self._running = False
|
||||
self._app = None
|
||||
self._bot = None
|
||||
print(f"[{self.name}] Disconnected")
|
||||
|
||||
async def send(
|
||||
self,
|
||||
chat_id: str,
|
||||
content: str,
|
||||
reply_to: Optional[str] = None,
|
||||
metadata: Optional[Dict[str, Any]] = None
|
||||
) -> SendResult:
|
||||
"""Send a message to a Telegram chat."""
|
||||
if not self._bot:
|
||||
return SendResult(success=False, error="Not connected")
|
||||
|
||||
try:
|
||||
# Format and split message if needed
|
||||
formatted = self.format_message(content)
|
||||
chunks = self.truncate_message(formatted, self.MAX_MESSAGE_LENGTH)
|
||||
|
||||
message_ids = []
|
||||
thread_id = metadata.get("thread_id") if metadata else None
|
||||
|
||||
for i, chunk in enumerate(chunks):
|
||||
# Try Markdown first, fall back to plain text if it fails
|
||||
try:
|
||||
msg = await self._bot.send_message(
|
||||
chat_id=int(chat_id),
|
||||
text=chunk,
|
||||
parse_mode=ParseMode.MARKDOWN,
|
||||
reply_to_message_id=int(reply_to) if reply_to and i == 0 else None,
|
||||
message_thread_id=int(thread_id) if thread_id else None,
|
||||
)
|
||||
except Exception as md_error:
|
||||
# Markdown parsing failed, try plain text
|
||||
if "parse" in str(md_error).lower() or "markdown" in str(md_error).lower():
|
||||
msg = await self._bot.send_message(
|
||||
chat_id=int(chat_id),
|
||||
text=chunk,
|
||||
parse_mode=None, # Plain text
|
||||
reply_to_message_id=int(reply_to) if reply_to and i == 0 else None,
|
||||
message_thread_id=int(thread_id) if thread_id else None,
|
||||
)
|
||||
else:
|
||||
raise # Re-raise if not a parse error
|
||||
message_ids.append(str(msg.message_id))
|
||||
|
||||
return SendResult(
|
||||
success=True,
|
||||
message_id=message_ids[0] if message_ids else None,
|
||||
raw_response={"message_ids": message_ids}
|
||||
)
|
||||
|
||||
except Exception as e:
|
||||
return SendResult(success=False, error=str(e))
|
||||
|
||||
async def send_typing(self, chat_id: str) -> None:
|
||||
"""Send typing indicator."""
|
||||
if self._bot:
|
||||
try:
|
||||
await self._bot.send_chat_action(
|
||||
chat_id=int(chat_id),
|
||||
action="typing"
|
||||
)
|
||||
except Exception:
|
||||
pass # Ignore typing indicator failures
|
||||
|
||||
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
|
||||
"""Get information about a Telegram chat."""
|
||||
if not self._bot:
|
||||
return {"name": "Unknown", "type": "dm"}
|
||||
|
||||
try:
|
||||
chat = await self._bot.get_chat(int(chat_id))
|
||||
|
||||
chat_type = "dm"
|
||||
if chat.type == ChatType.GROUP:
|
||||
chat_type = "group"
|
||||
elif chat.type == ChatType.SUPERGROUP:
|
||||
chat_type = "group"
|
||||
if chat.is_forum:
|
||||
chat_type = "forum"
|
||||
elif chat.type == ChatType.CHANNEL:
|
||||
chat_type = "channel"
|
||||
|
||||
return {
|
||||
"name": chat.title or chat.full_name or str(chat_id),
|
||||
"type": chat_type,
|
||||
"username": chat.username,
|
||||
"is_forum": getattr(chat, "is_forum", False),
|
||||
}
|
||||
except Exception as e:
|
||||
return {"name": str(chat_id), "type": "dm", "error": str(e)}
|
||||
|
||||
def format_message(self, content: str) -> str:
|
||||
"""
|
||||
Format message for Telegram.
|
||||
|
||||
Telegram uses a subset of markdown. We'll use the simpler
|
||||
Markdown mode (not MarkdownV2) for compatibility.
|
||||
"""
|
||||
# Basic escaping for Telegram Markdown
|
||||
# In Markdown mode (not V2), only certain characters need escaping
|
||||
return content
|
||||
|
||||
async def _handle_text_message(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
|
||||
"""Handle incoming text messages."""
|
||||
if not update.message or not update.message.text:
|
||||
return
|
||||
|
||||
event = self._build_message_event(update.message, MessageType.TEXT)
|
||||
await self.handle_message(event)
|
||||
|
||||
async def _handle_command(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
|
||||
"""Handle incoming command messages."""
|
||||
if not update.message or not update.message.text:
|
||||
return
|
||||
|
||||
event = self._build_message_event(update.message, MessageType.COMMAND)
|
||||
await self.handle_message(event)
|
||||
|
||||
async def _handle_media_message(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
|
||||
"""Handle incoming media messages."""
|
||||
if not update.message:
|
||||
return
|
||||
|
||||
msg = update.message
|
||||
|
||||
# Determine media type
|
||||
if msg.photo:
|
||||
msg_type = MessageType.PHOTO
|
||||
elif msg.video:
|
||||
msg_type = MessageType.VIDEO
|
||||
elif msg.audio:
|
||||
msg_type = MessageType.AUDIO
|
||||
elif msg.voice:
|
||||
msg_type = MessageType.VOICE
|
||||
else:
|
||||
msg_type = MessageType.DOCUMENT
|
||||
|
||||
event = self._build_message_event(msg, msg_type)
|
||||
|
||||
# Add caption as text
|
||||
if msg.caption:
|
||||
event.text = msg.caption
|
||||
|
||||
await self.handle_message(event)
|
||||
|
||||
def _build_message_event(self, message: Message, msg_type: MessageType) -> MessageEvent:
|
||||
"""Build a MessageEvent from a Telegram message."""
|
||||
chat = message.chat
|
||||
user = message.from_user
|
||||
|
||||
# Determine chat type
|
||||
chat_type = "dm"
|
||||
if chat.type in (ChatType.GROUP, ChatType.SUPERGROUP):
|
||||
chat_type = "group"
|
||||
elif chat.type == ChatType.CHANNEL:
|
||||
chat_type = "channel"
|
||||
|
||||
# Build source
|
||||
source = self.build_source(
|
||||
chat_id=str(chat.id),
|
||||
chat_name=chat.title or (chat.full_name if hasattr(chat, "full_name") else None),
|
||||
chat_type=chat_type,
|
||||
user_id=str(user.id) if user else None,
|
||||
user_name=user.full_name if user else None,
|
||||
thread_id=str(message.message_thread_id) if message.message_thread_id else None,
|
||||
)
|
||||
|
||||
return MessageEvent(
|
||||
text=message.text or "",
|
||||
message_type=msg_type,
|
||||
source=source,
|
||||
raw_message=message,
|
||||
message_id=str(message.message_id),
|
||||
timestamp=message.date,
|
||||
)
|
||||
327
gateway/platforms/whatsapp.py
Normal file
327
gateway/platforms/whatsapp.py
Normal file
@@ -0,0 +1,327 @@
|
||||
"""
|
||||
WhatsApp platform adapter.
|
||||
|
||||
WhatsApp integration is more complex than Telegram/Discord because:
|
||||
- No official bot API for personal accounts
|
||||
- Business API requires Meta Business verification
|
||||
- Most solutions use web-based automation
|
||||
|
||||
This adapter supports multiple backends:
|
||||
1. WhatsApp Business API (requires Meta verification)
|
||||
2. whatsapp-web.js (via Node.js subprocess) - for personal accounts
|
||||
3. Baileys (via Node.js subprocess) - alternative for personal accounts
|
||||
|
||||
For simplicity, we'll implement a generic interface that can work
|
||||
with different backends via a bridge pattern.
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import json
|
||||
import subprocess
|
||||
from pathlib import Path
|
||||
from typing import Dict, List, Optional, Any
|
||||
|
||||
import sys
|
||||
sys.path.insert(0, str(__file__).rsplit("/", 3)[0])
|
||||
|
||||
from gateway.config import Platform, PlatformConfig
|
||||
from gateway.platforms.base import (
|
||||
BasePlatformAdapter,
|
||||
MessageEvent,
|
||||
MessageType,
|
||||
SendResult,
|
||||
)
|
||||
|
||||
|
||||
def check_whatsapp_requirements() -> bool:
|
||||
"""
|
||||
Check if WhatsApp dependencies are available.
|
||||
|
||||
WhatsApp requires a Node.js bridge for most implementations.
|
||||
"""
|
||||
# Check for Node.js
|
||||
try:
|
||||
result = subprocess.run(
|
||||
["node", "--version"],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
timeout=5
|
||||
)
|
||||
return result.returncode == 0
|
||||
except Exception:
|
||||
return False
|
||||
|
||||
|
||||
class WhatsAppAdapter(BasePlatformAdapter):
|
||||
"""
|
||||
WhatsApp adapter.
|
||||
|
||||
This implementation uses a simple HTTP bridge pattern where:
|
||||
1. A Node.js process runs the WhatsApp Web client
|
||||
2. Messages are forwarded via HTTP/IPC to this Python adapter
|
||||
3. Responses are sent back through the bridge
|
||||
|
||||
The actual Node.js bridge implementation can vary:
|
||||
- whatsapp-web.js based
|
||||
- Baileys based
|
||||
- Business API based
|
||||
|
||||
Configuration:
|
||||
- bridge_script: Path to the Node.js bridge script
|
||||
- bridge_port: Port for HTTP communication (default: 3000)
|
||||
- session_path: Path to store WhatsApp session data
|
||||
"""
|
||||
|
||||
# WhatsApp message limits
|
||||
MAX_MESSAGE_LENGTH = 65536 # WhatsApp allows longer messages
|
||||
|
||||
def __init__(self, config: PlatformConfig):
|
||||
super().__init__(config, Platform.WHATSAPP)
|
||||
self._bridge_process: Optional[subprocess.Popen] = None
|
||||
self._bridge_port: int = config.extra.get("bridge_port", 3000)
|
||||
self._bridge_script: Optional[str] = config.extra.get("bridge_script")
|
||||
self._session_path: Path = Path(config.extra.get(
|
||||
"session_path",
|
||||
Path.home() / ".hermes" / "whatsapp" / "session"
|
||||
))
|
||||
self._message_queue: asyncio.Queue = asyncio.Queue()
|
||||
|
||||
async def connect(self) -> bool:
|
||||
"""
|
||||
Start the WhatsApp bridge.
|
||||
|
||||
This launches the Node.js bridge process and waits for it to be ready.
|
||||
"""
|
||||
if not check_whatsapp_requirements():
|
||||
print(f"[{self.name}] Node.js not found. WhatsApp requires Node.js.")
|
||||
return False
|
||||
|
||||
if not self._bridge_script:
|
||||
print(f"[{self.name}] No bridge script configured.")
|
||||
print(f"[{self.name}] Set 'bridge_script' in whatsapp.extra config.")
|
||||
print(f"[{self.name}] See docs/messaging.md for WhatsApp setup instructions.")
|
||||
return False
|
||||
|
||||
bridge_path = Path(self._bridge_script)
|
||||
if not bridge_path.exists():
|
||||
print(f"[{self.name}] Bridge script not found: {bridge_path}")
|
||||
return False
|
||||
|
||||
try:
|
||||
# Ensure session directory exists
|
||||
self._session_path.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# Start the bridge process
|
||||
self._bridge_process = subprocess.Popen(
|
||||
[
|
||||
"node",
|
||||
str(bridge_path),
|
||||
"--port", str(self._bridge_port),
|
||||
"--session", str(self._session_path),
|
||||
],
|
||||
stdout=subprocess.PIPE,
|
||||
stderr=subprocess.PIPE,
|
||||
text=True,
|
||||
)
|
||||
|
||||
# Wait for bridge to be ready (look for ready signal)
|
||||
# This is a simplified version - real implementation would
|
||||
# wait for an HTTP health check or specific stdout message
|
||||
await asyncio.sleep(5)
|
||||
|
||||
if self._bridge_process.poll() is not None:
|
||||
stderr = self._bridge_process.stderr.read() if self._bridge_process.stderr else ""
|
||||
print(f"[{self.name}] Bridge process died: {stderr}")
|
||||
return False
|
||||
|
||||
# Start message polling task
|
||||
asyncio.create_task(self._poll_messages())
|
||||
|
||||
self._running = True
|
||||
print(f"[{self.name}] Bridge started on port {self._bridge_port}")
|
||||
print(f"[{self.name}] Scan QR code if prompted (check bridge output)")
|
||||
return True
|
||||
|
||||
except Exception as e:
|
||||
print(f"[{self.name}] Failed to start bridge: {e}")
|
||||
return False
|
||||
|
||||
async def disconnect(self) -> None:
|
||||
"""Stop the WhatsApp bridge."""
|
||||
if self._bridge_process:
|
||||
try:
|
||||
self._bridge_process.terminate()
|
||||
await asyncio.sleep(1)
|
||||
if self._bridge_process.poll() is None:
|
||||
self._bridge_process.kill()
|
||||
except Exception as e:
|
||||
print(f"[{self.name}] Error stopping bridge: {e}")
|
||||
|
||||
self._running = False
|
||||
self._bridge_process = None
|
||||
print(f"[{self.name}] Disconnected")
|
||||
|
||||
async def send(
|
||||
self,
|
||||
chat_id: str,
|
||||
content: str,
|
||||
reply_to: Optional[str] = None,
|
||||
metadata: Optional[Dict[str, Any]] = None
|
||||
) -> SendResult:
|
||||
"""Send a message via the WhatsApp bridge."""
|
||||
if not self._running:
|
||||
return SendResult(success=False, error="Not connected")
|
||||
|
||||
try:
|
||||
import aiohttp
|
||||
|
||||
async with aiohttp.ClientSession() as session:
|
||||
payload = {
|
||||
"chatId": chat_id,
|
||||
"message": content,
|
||||
}
|
||||
if reply_to:
|
||||
payload["replyTo"] = reply_to
|
||||
|
||||
async with session.post(
|
||||
f"http://localhost:{self._bridge_port}/send",
|
||||
json=payload,
|
||||
timeout=aiohttp.ClientTimeout(total=30)
|
||||
) as resp:
|
||||
if resp.status == 200:
|
||||
data = await resp.json()
|
||||
return SendResult(
|
||||
success=True,
|
||||
message_id=data.get("messageId"),
|
||||
raw_response=data
|
||||
)
|
||||
else:
|
||||
error = await resp.text()
|
||||
return SendResult(success=False, error=error)
|
||||
|
||||
except ImportError:
|
||||
return SendResult(
|
||||
success=False,
|
||||
error="aiohttp not installed. Run: pip install aiohttp"
|
||||
)
|
||||
except Exception as e:
|
||||
return SendResult(success=False, error=str(e))
|
||||
|
||||
async def send_typing(self, chat_id: str) -> None:
|
||||
"""Send typing indicator via bridge."""
|
||||
if not self._running:
|
||||
return
|
||||
|
||||
try:
|
||||
import aiohttp
|
||||
|
||||
async with aiohttp.ClientSession() as session:
|
||||
await session.post(
|
||||
f"http://localhost:{self._bridge_port}/typing",
|
||||
json={"chatId": chat_id},
|
||||
timeout=aiohttp.ClientTimeout(total=5)
|
||||
)
|
||||
except Exception:
|
||||
pass # Ignore typing indicator failures
|
||||
|
||||
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
|
||||
"""Get information about a WhatsApp chat."""
|
||||
if not self._running:
|
||||
return {"name": "Unknown", "type": "dm"}
|
||||
|
||||
try:
|
||||
import aiohttp
|
||||
|
||||
async with aiohttp.ClientSession() as session:
|
||||
async with session.get(
|
||||
f"http://localhost:{self._bridge_port}/chat/{chat_id}",
|
||||
timeout=aiohttp.ClientTimeout(total=10)
|
||||
) as resp:
|
||||
if resp.status == 200:
|
||||
data = await resp.json()
|
||||
return {
|
||||
"name": data.get("name", chat_id),
|
||||
"type": "group" if data.get("isGroup") else "dm",
|
||||
"participants": data.get("participants", []),
|
||||
}
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
return {"name": chat_id, "type": "dm"}
|
||||
|
||||
async def _poll_messages(self) -> None:
|
||||
"""Poll the bridge for incoming messages."""
|
||||
try:
|
||||
import aiohttp
|
||||
except ImportError:
|
||||
print(f"[{self.name}] aiohttp not installed, message polling disabled")
|
||||
return
|
||||
|
||||
while self._running:
|
||||
try:
|
||||
async with aiohttp.ClientSession() as session:
|
||||
async with session.get(
|
||||
f"http://localhost:{self._bridge_port}/messages",
|
||||
timeout=aiohttp.ClientTimeout(total=30)
|
||||
) as resp:
|
||||
if resp.status == 200:
|
||||
messages = await resp.json()
|
||||
for msg_data in messages:
|
||||
event = self._build_message_event(msg_data)
|
||||
if event:
|
||||
await self.handle_message(event)
|
||||
except asyncio.CancelledError:
|
||||
break
|
||||
except Exception as e:
|
||||
print(f"[{self.name}] Poll error: {e}")
|
||||
await asyncio.sleep(5)
|
||||
|
||||
await asyncio.sleep(1) # Poll interval
|
||||
|
||||
def _build_message_event(self, data: Dict[str, Any]) -> Optional[MessageEvent]:
|
||||
"""Build a MessageEvent from bridge message data."""
|
||||
try:
|
||||
# Determine message type
|
||||
msg_type = MessageType.TEXT
|
||||
if data.get("hasMedia"):
|
||||
media_type = data.get("mediaType", "")
|
||||
if "image" in media_type:
|
||||
msg_type = MessageType.PHOTO
|
||||
elif "video" in media_type:
|
||||
msg_type = MessageType.VIDEO
|
||||
elif "audio" in media_type or "ptt" in media_type: # ptt = voice note
|
||||
msg_type = MessageType.VOICE
|
||||
else:
|
||||
msg_type = MessageType.DOCUMENT
|
||||
|
||||
# Determine chat type
|
||||
is_group = data.get("isGroup", False)
|
||||
chat_type = "group" if is_group else "dm"
|
||||
|
||||
# Build source
|
||||
source = self.build_source(
|
||||
chat_id=data.get("chatId", ""),
|
||||
chat_name=data.get("chatName"),
|
||||
chat_type=chat_type,
|
||||
user_id=data.get("senderId"),
|
||||
user_name=data.get("senderName"),
|
||||
)
|
||||
|
||||
return MessageEvent(
|
||||
text=data.get("body", ""),
|
||||
message_type=msg_type,
|
||||
source=source,
|
||||
raw_message=data,
|
||||
message_id=data.get("messageId"),
|
||||
media_urls=data.get("mediaUrls", []),
|
||||
)
|
||||
except Exception as e:
|
||||
print(f"[{self.name}] Error building event: {e}")
|
||||
return None
|
||||
|
||||
|
||||
# Note: A reference Node.js bridge script would be provided in scripts/whatsapp-bridge/
|
||||
# It would use whatsapp-web.js or Baileys to:
|
||||
# 1. Handle WhatsApp Web authentication (QR code)
|
||||
# 2. Listen for incoming messages
|
||||
# 3. Expose HTTP endpoints for send/receive/status
|
||||
666
gateway/run.py
Normal file
666
gateway/run.py
Normal file
@@ -0,0 +1,666 @@
|
||||
"""
|
||||
Gateway runner - entry point for messaging platform integrations.
|
||||
|
||||
This module provides:
|
||||
- start_gateway(): Start all configured platform adapters
|
||||
- GatewayRunner: Main class managing the gateway lifecycle
|
||||
|
||||
Usage:
|
||||
# Start the gateway
|
||||
python -m gateway.run
|
||||
|
||||
# Or from CLI
|
||||
python cli.py --gateway
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import os
|
||||
import sys
|
||||
import signal
|
||||
from pathlib import Path
|
||||
from datetime import datetime
|
||||
from typing import Dict, Optional, Any, List
|
||||
|
||||
# Add parent directory to path
|
||||
sys.path.insert(0, str(Path(__file__).parent.parent))
|
||||
|
||||
# Load environment variables from ~/.hermes/.env first
|
||||
from dotenv import load_dotenv
|
||||
_env_path = Path.home() / '.hermes' / '.env'
|
||||
if _env_path.exists():
|
||||
load_dotenv(_env_path)
|
||||
# Also try project .env as fallback
|
||||
load_dotenv()
|
||||
|
||||
# Gateway runs in quiet mode - suppress debug output and use cwd directly (no temp dirs)
|
||||
os.environ["HERMES_QUIET"] = "1"
|
||||
|
||||
# Set terminal working directory for messaging platforms
|
||||
# Uses MESSAGING_CWD if set, otherwise defaults to home directory
|
||||
# This is separate from CLI which uses the directory where `hermes` is run
|
||||
messaging_cwd = os.getenv("MESSAGING_CWD") or str(Path.home())
|
||||
os.environ["TERMINAL_CWD"] = messaging_cwd
|
||||
|
||||
from gateway.config import (
|
||||
Platform,
|
||||
GatewayConfig,
|
||||
load_gateway_config,
|
||||
)
|
||||
from gateway.session import (
|
||||
SessionStore,
|
||||
SessionSource,
|
||||
SessionContext,
|
||||
build_session_context,
|
||||
build_session_context_prompt,
|
||||
)
|
||||
from gateway.delivery import DeliveryRouter, DeliveryTarget
|
||||
from gateway.platforms.base import BasePlatformAdapter, MessageEvent
|
||||
|
||||
|
||||
class GatewayRunner:
|
||||
"""
|
||||
Main gateway controller.
|
||||
|
||||
Manages the lifecycle of all platform adapters and routes
|
||||
messages to/from the agent.
|
||||
"""
|
||||
|
||||
def __init__(self, config: Optional[GatewayConfig] = None):
|
||||
self.config = config or load_gateway_config()
|
||||
self.adapters: Dict[Platform, BasePlatformAdapter] = {}
|
||||
self.session_store = SessionStore(self.config.sessions_dir, self.config)
|
||||
self.delivery_router = DeliveryRouter(self.config)
|
||||
self._running = False
|
||||
self._shutdown_event = asyncio.Event()
|
||||
|
||||
# Track running agents per session for interrupt support
|
||||
# Key: session_key, Value: AIAgent instance
|
||||
self._running_agents: Dict[str, Any] = {}
|
||||
self._pending_messages: Dict[str, str] = {} # Queued messages during interrupt
|
||||
|
||||
async def start(self) -> bool:
|
||||
"""
|
||||
Start the gateway and all configured platform adapters.
|
||||
|
||||
Returns True if at least one adapter connected successfully.
|
||||
"""
|
||||
print("[gateway] Starting Hermes Gateway...")
|
||||
print(f"[gateway] Session storage: {self.config.sessions_dir}")
|
||||
|
||||
connected_count = 0
|
||||
|
||||
# Initialize and connect each configured platform
|
||||
for platform, platform_config in self.config.platforms.items():
|
||||
if not platform_config.enabled:
|
||||
continue
|
||||
|
||||
adapter = self._create_adapter(platform, platform_config)
|
||||
if not adapter:
|
||||
print(f"[gateway] No adapter available for {platform.value}")
|
||||
continue
|
||||
|
||||
# Set up message handler
|
||||
adapter.set_message_handler(self._handle_message)
|
||||
|
||||
# Try to connect
|
||||
print(f"[gateway] Connecting to {platform.value}...")
|
||||
try:
|
||||
success = await adapter.connect()
|
||||
if success:
|
||||
self.adapters[platform] = adapter
|
||||
connected_count += 1
|
||||
print(f"[gateway] ✓ {platform.value} connected")
|
||||
else:
|
||||
print(f"[gateway] ✗ {platform.value} failed to connect")
|
||||
except Exception as e:
|
||||
print(f"[gateway] ✗ {platform.value} error: {e}")
|
||||
|
||||
if connected_count == 0:
|
||||
print("[gateway] No platforms connected. Check your configuration.")
|
||||
return False
|
||||
|
||||
# Update delivery router with adapters
|
||||
self.delivery_router.adapters = self.adapters
|
||||
|
||||
self._running = True
|
||||
print(f"[gateway] Gateway running with {connected_count} platform(s)")
|
||||
print("[gateway] Press Ctrl+C to stop")
|
||||
|
||||
return True
|
||||
|
||||
async def stop(self) -> None:
|
||||
"""Stop the gateway and disconnect all adapters."""
|
||||
print("[gateway] Stopping gateway...")
|
||||
self._running = False
|
||||
|
||||
for platform, adapter in self.adapters.items():
|
||||
try:
|
||||
await adapter.disconnect()
|
||||
print(f"[gateway] ✓ {platform.value} disconnected")
|
||||
except Exception as e:
|
||||
print(f"[gateway] ✗ {platform.value} disconnect error: {e}")
|
||||
|
||||
self.adapters.clear()
|
||||
self._shutdown_event.set()
|
||||
print("[gateway] Gateway stopped")
|
||||
|
||||
async def wait_for_shutdown(self) -> None:
|
||||
"""Wait for shutdown signal."""
|
||||
await self._shutdown_event.wait()
|
||||
|
||||
def _create_adapter(
|
||||
self,
|
||||
platform: Platform,
|
||||
config: Any
|
||||
) -> Optional[BasePlatformAdapter]:
|
||||
"""Create the appropriate adapter for a platform."""
|
||||
if platform == Platform.TELEGRAM:
|
||||
from gateway.platforms.telegram import TelegramAdapter, check_telegram_requirements
|
||||
if not check_telegram_requirements():
|
||||
print(f"[gateway] Telegram: python-telegram-bot not installed")
|
||||
return None
|
||||
return TelegramAdapter(config)
|
||||
|
||||
elif platform == Platform.DISCORD:
|
||||
from gateway.platforms.discord import DiscordAdapter, check_discord_requirements
|
||||
if not check_discord_requirements():
|
||||
print(f"[gateway] Discord: discord.py not installed")
|
||||
return None
|
||||
return DiscordAdapter(config)
|
||||
|
||||
elif platform == Platform.WHATSAPP:
|
||||
from gateway.platforms.whatsapp import WhatsAppAdapter, check_whatsapp_requirements
|
||||
if not check_whatsapp_requirements():
|
||||
print(f"[gateway] WhatsApp: Node.js not installed or bridge not configured")
|
||||
return None
|
||||
return WhatsAppAdapter(config)
|
||||
|
||||
return None
|
||||
|
||||
def _is_user_authorized(self, source: SessionSource) -> bool:
|
||||
"""
|
||||
Check if a user is authorized to use the bot.
|
||||
|
||||
Authorization is checked via environment variables:
|
||||
- GATEWAY_ALLOWED_USERS: Comma-separated list of user IDs (all platforms)
|
||||
- TELEGRAM_ALLOWED_USERS: Telegram-specific user IDs
|
||||
- DISCORD_ALLOWED_USERS: Discord-specific user IDs
|
||||
|
||||
If no allowlist is configured, all users are allowed (open access).
|
||||
"""
|
||||
user_id = source.user_id
|
||||
if not user_id:
|
||||
return False # Can't verify unknown users
|
||||
|
||||
# Check platform-specific allowlist first
|
||||
platform_env_map = {
|
||||
Platform.TELEGRAM: "TELEGRAM_ALLOWED_USERS",
|
||||
Platform.DISCORD: "DISCORD_ALLOWED_USERS",
|
||||
Platform.WHATSAPP: "WHATSAPP_ALLOWED_USERS",
|
||||
}
|
||||
|
||||
platform_allowlist = os.getenv(platform_env_map.get(source.platform, ""))
|
||||
global_allowlist = os.getenv("GATEWAY_ALLOWED_USERS", "")
|
||||
|
||||
# If no allowlists configured, allow all (backward compatible)
|
||||
if not platform_allowlist and not global_allowlist:
|
||||
return True
|
||||
|
||||
# Check if user is in any allowlist
|
||||
allowed_ids = set()
|
||||
if platform_allowlist:
|
||||
allowed_ids.update(uid.strip() for uid in platform_allowlist.split(","))
|
||||
if global_allowlist:
|
||||
allowed_ids.update(uid.strip() for uid in global_allowlist.split(","))
|
||||
|
||||
return user_id in allowed_ids
|
||||
|
||||
async def _handle_message(self, event: MessageEvent) -> Optional[str]:
|
||||
"""
|
||||
Handle an incoming message from any platform.
|
||||
|
||||
This is the core message processing pipeline:
|
||||
1. Check user authorization
|
||||
2. Check for commands (/new, /reset, etc.)
|
||||
3. Check for running agent and interrupt if needed
|
||||
4. Get or create session
|
||||
5. Build context for agent
|
||||
6. Run agent conversation
|
||||
7. Return response
|
||||
"""
|
||||
source = event.source
|
||||
|
||||
# Check if user is authorized
|
||||
if not self._is_user_authorized(source):
|
||||
print(f"[gateway] Unauthorized user: {source.user_id} ({source.user_name}) on {source.platform.value}")
|
||||
return None # Silently ignore unauthorized users
|
||||
|
||||
# Check for commands
|
||||
command = event.get_command()
|
||||
if command in ["new", "reset"]:
|
||||
return await self._handle_reset_command(event)
|
||||
|
||||
if command == "status":
|
||||
return await self._handle_status_command(event)
|
||||
|
||||
if command == "stop":
|
||||
return await self._handle_stop_command(event)
|
||||
|
||||
# Get or create session
|
||||
session_entry = self.session_store.get_or_create_session(source)
|
||||
session_key = session_entry.session_key
|
||||
|
||||
# Check if there's already a running agent for this session
|
||||
if session_key in self._running_agents:
|
||||
running_agent = self._running_agents[session_key]
|
||||
print(f"[gateway] ⚡ Interrupting running agent for session {session_key[:20]}...")
|
||||
running_agent.interrupt(event.text)
|
||||
# Store the new message to be processed after current agent finishes
|
||||
self._pending_messages[session_key] = event.text
|
||||
return None # Don't respond yet - let the interrupt handle it
|
||||
|
||||
# Build session context
|
||||
context = build_session_context(source, self.config, session_entry)
|
||||
|
||||
# Set environment variables for tools
|
||||
self._set_session_env(context)
|
||||
|
||||
# Build the context prompt to inject
|
||||
context_prompt = build_session_context_prompt(context)
|
||||
|
||||
# Load conversation history from transcript
|
||||
history = self.session_store.load_transcript(session_entry.session_id)
|
||||
|
||||
try:
|
||||
# Run the agent
|
||||
response = await self._run_agent(
|
||||
message=event.text,
|
||||
context_prompt=context_prompt,
|
||||
history=history,
|
||||
source=source,
|
||||
session_id=session_entry.session_id,
|
||||
session_key=session_key
|
||||
)
|
||||
|
||||
# Append to transcript
|
||||
self.session_store.append_to_transcript(
|
||||
session_entry.session_id,
|
||||
{"role": "user", "content": event.text, "timestamp": datetime.now().isoformat()}
|
||||
)
|
||||
self.session_store.append_to_transcript(
|
||||
session_entry.session_id,
|
||||
{"role": "assistant", "content": response, "timestamp": datetime.now().isoformat()}
|
||||
)
|
||||
|
||||
# Update session
|
||||
self.session_store.update_session(session_entry.session_key)
|
||||
|
||||
return response
|
||||
|
||||
except Exception as e:
|
||||
print(f"[gateway] Agent error: {e}")
|
||||
return f"Sorry, I encountered an error: {str(e)}"
|
||||
finally:
|
||||
# Clear session env
|
||||
self._clear_session_env()
|
||||
|
||||
async def _handle_reset_command(self, event: MessageEvent) -> str:
|
||||
"""Handle /new or /reset command."""
|
||||
source = event.source
|
||||
|
||||
# Get existing session key
|
||||
session_key = f"agent:main:{source.platform.value}:" + \
|
||||
(f"dm" if source.chat_type == "dm" else f"{source.chat_type}:{source.chat_id}")
|
||||
|
||||
# Reset the session
|
||||
new_entry = self.session_store.reset_session(session_key)
|
||||
|
||||
if new_entry:
|
||||
return "✨ Session reset! I've started fresh with no memory of our previous conversation."
|
||||
else:
|
||||
# No existing session, just create one
|
||||
self.session_store.get_or_create_session(source, force_new=True)
|
||||
return "✨ New session started!"
|
||||
|
||||
async def _handle_status_command(self, event: MessageEvent) -> str:
|
||||
"""Handle /status command."""
|
||||
source = event.source
|
||||
session_entry = self.session_store.get_or_create_session(source)
|
||||
|
||||
connected_platforms = [p.value for p in self.adapters.keys()]
|
||||
|
||||
# Check if there's an active agent
|
||||
session_key = session_entry.session_key
|
||||
is_running = session_key in self._running_agents
|
||||
|
||||
lines = [
|
||||
"📊 **Hermes Gateway Status**",
|
||||
"",
|
||||
f"**Session ID:** `{session_entry.session_id[:12]}...`",
|
||||
f"**Created:** {session_entry.created_at.strftime('%Y-%m-%d %H:%M')}",
|
||||
f"**Last Activity:** {session_entry.updated_at.strftime('%Y-%m-%d %H:%M')}",
|
||||
f"**Tokens:** {session_entry.total_tokens:,}",
|
||||
f"**Agent Running:** {'Yes ⚡' if is_running else 'No'}",
|
||||
"",
|
||||
f"**Connected Platforms:** {', '.join(connected_platforms)}",
|
||||
]
|
||||
|
||||
return "\n".join(lines)
|
||||
|
||||
async def _handle_stop_command(self, event: MessageEvent) -> str:
|
||||
"""Handle /stop command - interrupt a running agent."""
|
||||
source = event.source
|
||||
session_entry = self.session_store.get_or_create_session(source)
|
||||
session_key = session_entry.session_key
|
||||
|
||||
if session_key in self._running_agents:
|
||||
agent = self._running_agents[session_key]
|
||||
agent.interrupt()
|
||||
return "⚡ Stopping the current task... The agent will finish its current step and respond."
|
||||
else:
|
||||
return "No active task to stop."
|
||||
|
||||
def _set_session_env(self, context: SessionContext) -> None:
|
||||
"""Set environment variables for the current session."""
|
||||
os.environ["HERMES_SESSION_PLATFORM"] = context.source.platform.value
|
||||
os.environ["HERMES_SESSION_CHAT_ID"] = context.source.chat_id
|
||||
if context.source.chat_name:
|
||||
os.environ["HERMES_SESSION_CHAT_NAME"] = context.source.chat_name
|
||||
|
||||
def _clear_session_env(self) -> None:
|
||||
"""Clear session environment variables."""
|
||||
for var in ["HERMES_SESSION_PLATFORM", "HERMES_SESSION_CHAT_ID", "HERMES_SESSION_CHAT_NAME"]:
|
||||
if var in os.environ:
|
||||
del os.environ[var]
|
||||
|
||||
async def _run_agent(
|
||||
self,
|
||||
message: str,
|
||||
context_prompt: str,
|
||||
history: List[Dict[str, Any]],
|
||||
source: SessionSource,
|
||||
session_id: str,
|
||||
session_key: str = None
|
||||
) -> str:
|
||||
"""
|
||||
Run the agent with the given message and context.
|
||||
|
||||
This is run in a thread pool to not block the event loop.
|
||||
Supports interruption via new messages.
|
||||
"""
|
||||
from run_agent import AIAgent
|
||||
import queue
|
||||
|
||||
# Determine toolset based on platform
|
||||
toolset_map = {
|
||||
Platform.LOCAL: "hermes-cli",
|
||||
Platform.TELEGRAM: "hermes-telegram",
|
||||
Platform.DISCORD: "hermes-discord",
|
||||
Platform.WHATSAPP: "hermes-whatsapp",
|
||||
}
|
||||
toolset = toolset_map.get(source.platform, "hermes-telegram")
|
||||
|
||||
# Check if tool progress notifications are enabled
|
||||
tool_progress_enabled = os.getenv("HERMES_TOOL_PROGRESS", "").lower() in ("1", "true", "yes")
|
||||
progress_mode = os.getenv("HERMES_TOOL_PROGRESS_MODE", "new") # "all" or "new" (only new tools)
|
||||
|
||||
# Queue for progress messages (thread-safe)
|
||||
progress_queue = queue.Queue() if tool_progress_enabled else None
|
||||
last_tool = [None] # Mutable container for tracking in closure
|
||||
|
||||
def progress_callback(tool_name: str, preview: str = None):
|
||||
"""Callback invoked by agent when a tool is called."""
|
||||
if not progress_queue:
|
||||
return
|
||||
|
||||
# "new" mode: only report when tool changes
|
||||
if progress_mode == "new" and tool_name == last_tool[0]:
|
||||
return
|
||||
last_tool[0] = tool_name
|
||||
|
||||
# Build progress message
|
||||
tool_emojis = {
|
||||
"terminal": "💻",
|
||||
"web_search": "🔍",
|
||||
"web_extract": "📄",
|
||||
"read_file": "📖",
|
||||
"write_file": "✍️",
|
||||
"list_directory": "📂",
|
||||
"image_generate": "🎨",
|
||||
"browser_navigate": "🌐",
|
||||
"browser_click": "👆",
|
||||
"moa_query": "🧠",
|
||||
}
|
||||
emoji = tool_emojis.get(tool_name, "⚙️")
|
||||
|
||||
if tool_name == "terminal" and preview:
|
||||
msg = f"{emoji} `{preview}`..."
|
||||
else:
|
||||
msg = f"{emoji} {tool_name}..."
|
||||
|
||||
progress_queue.put(msg)
|
||||
|
||||
# Background task to send progress messages
|
||||
async def send_progress_messages():
|
||||
if not progress_queue:
|
||||
return
|
||||
|
||||
adapter = self.adapters.get(source.platform)
|
||||
if not adapter:
|
||||
return
|
||||
|
||||
while True:
|
||||
try:
|
||||
# Non-blocking check with small timeout
|
||||
msg = progress_queue.get_nowait()
|
||||
await adapter.send(chat_id=source.chat_id, content=msg)
|
||||
# Restore typing indicator after sending progress message
|
||||
await asyncio.sleep(0.3)
|
||||
await adapter.send_typing(source.chat_id)
|
||||
except queue.Empty:
|
||||
await asyncio.sleep(0.3) # Check again soon
|
||||
except asyncio.CancelledError:
|
||||
# Drain remaining messages
|
||||
while not progress_queue.empty():
|
||||
try:
|
||||
msg = progress_queue.get_nowait()
|
||||
await adapter.send(chat_id=source.chat_id, content=msg)
|
||||
except:
|
||||
break
|
||||
return
|
||||
except Exception as e:
|
||||
print(f"[Gateway] Progress message error: {e}")
|
||||
await asyncio.sleep(1)
|
||||
|
||||
# We need to share the agent instance for interrupt support
|
||||
agent_holder = [None] # Mutable container for the agent instance
|
||||
result_holder = [None] # Mutable container for the result
|
||||
|
||||
def run_sync():
|
||||
# Read from env var or use default (same as CLI)
|
||||
max_iterations = int(os.getenv("HERMES_MAX_ITERATIONS", "60"))
|
||||
|
||||
agent = AIAgent(
|
||||
model=os.getenv("HERMES_MODEL", "anthropic/claude-sonnet-4"),
|
||||
max_iterations=max_iterations,
|
||||
quiet_mode=True,
|
||||
enabled_toolsets=[toolset],
|
||||
ephemeral_system_prompt=context_prompt,
|
||||
session_id=session_id,
|
||||
tool_progress_callback=progress_callback if tool_progress_enabled else None,
|
||||
)
|
||||
|
||||
# Store agent reference for interrupt support
|
||||
agent_holder[0] = agent
|
||||
|
||||
# Convert transcript history to agent format
|
||||
# Transcript has timestamps; agent expects {"role": ..., "content": ...}
|
||||
agent_history = []
|
||||
for msg in history:
|
||||
role = msg.get("role")
|
||||
content = msg.get("content")
|
||||
if role and content:
|
||||
agent_history.append({"role": role, "content": content})
|
||||
|
||||
result = agent.run_conversation(message, conversation_history=agent_history)
|
||||
result_holder[0] = result
|
||||
|
||||
# Return final response, or a message if something went wrong
|
||||
final_response = result.get("final_response")
|
||||
if final_response:
|
||||
return final_response
|
||||
elif result.get("error"):
|
||||
# Agent couldn't recover - show the error
|
||||
return f"⚠️ {result['error']}"
|
||||
else:
|
||||
return "(No response generated)"
|
||||
|
||||
# Start progress message sender if enabled
|
||||
progress_task = None
|
||||
if tool_progress_enabled:
|
||||
progress_task = asyncio.create_task(send_progress_messages())
|
||||
|
||||
# Track this agent as running for this session (for interrupt support)
|
||||
# We do this in a callback after the agent is created
|
||||
async def track_agent():
|
||||
# Wait for agent to be created
|
||||
while agent_holder[0] is None:
|
||||
await asyncio.sleep(0.05)
|
||||
if session_key:
|
||||
self._running_agents[session_key] = agent_holder[0]
|
||||
|
||||
tracking_task = asyncio.create_task(track_agent())
|
||||
|
||||
# Monitor for interrupts from the adapter (new messages arriving)
|
||||
async def monitor_for_interrupt():
|
||||
adapter = self.adapters.get(source.platform)
|
||||
if not adapter:
|
||||
return
|
||||
|
||||
chat_id = source.chat_id
|
||||
while True:
|
||||
await asyncio.sleep(0.2) # Check every 200ms
|
||||
# Check if adapter has a pending interrupt for this session
|
||||
if hasattr(adapter, 'has_pending_interrupt') and adapter.has_pending_interrupt(chat_id):
|
||||
agent = agent_holder[0]
|
||||
if agent:
|
||||
pending_event = adapter.get_pending_message(chat_id)
|
||||
pending_text = pending_event.text if pending_event else None
|
||||
print(f"[gateway] ⚡ Interrupt detected from adapter, signaling agent...")
|
||||
agent.interrupt(pending_text)
|
||||
break
|
||||
|
||||
interrupt_monitor = asyncio.create_task(monitor_for_interrupt())
|
||||
|
||||
try:
|
||||
# Run in thread pool to not block
|
||||
loop = asyncio.get_event_loop()
|
||||
response = await loop.run_in_executor(None, run_sync)
|
||||
|
||||
# Check if we were interrupted and have a pending message
|
||||
result = result_holder[0]
|
||||
adapter = self.adapters.get(source.platform)
|
||||
|
||||
# Get pending message from adapter if interrupted
|
||||
pending = None
|
||||
if result and result.get("interrupted") and adapter:
|
||||
pending_event = adapter.get_pending_message(source.chat_id)
|
||||
if pending_event:
|
||||
pending = pending_event.text
|
||||
elif result.get("interrupt_message"):
|
||||
pending = result.get("interrupt_message")
|
||||
|
||||
if pending:
|
||||
print(f"[gateway] 📨 Processing interrupted message: '{pending[:40]}...'")
|
||||
# Add an indicator to the response
|
||||
if response:
|
||||
response = response + "\n\n---\n_[Interrupted - processing your new message]_"
|
||||
|
||||
# Send the interrupted response first
|
||||
if adapter and response:
|
||||
await adapter.send(chat_id=source.chat_id, content=response)
|
||||
|
||||
# Now process the pending message with updated history
|
||||
updated_history = result.get("messages", history)
|
||||
return await self._run_agent(
|
||||
message=pending,
|
||||
context_prompt=context_prompt,
|
||||
history=updated_history,
|
||||
source=source,
|
||||
session_id=session_id,
|
||||
session_key=session_key
|
||||
)
|
||||
finally:
|
||||
# Stop progress sender and interrupt monitor
|
||||
if progress_task:
|
||||
progress_task.cancel()
|
||||
interrupt_monitor.cancel()
|
||||
|
||||
# Clean up tracking
|
||||
tracking_task.cancel()
|
||||
if session_key and session_key in self._running_agents:
|
||||
del self._running_agents[session_key]
|
||||
|
||||
# Wait for cancelled tasks
|
||||
for task in [progress_task, interrupt_monitor, tracking_task]:
|
||||
if task:
|
||||
try:
|
||||
await task
|
||||
except asyncio.CancelledError:
|
||||
pass
|
||||
|
||||
return response
|
||||
|
||||
|
||||
async def start_gateway(config: Optional[GatewayConfig] = None) -> None:
|
||||
"""
|
||||
Start the gateway and run until interrupted.
|
||||
|
||||
This is the main entry point for running the gateway.
|
||||
"""
|
||||
runner = GatewayRunner(config)
|
||||
|
||||
# Set up signal handlers
|
||||
def signal_handler():
|
||||
asyncio.create_task(runner.stop())
|
||||
|
||||
loop = asyncio.get_event_loop()
|
||||
for sig in (signal.SIGINT, signal.SIGTERM):
|
||||
try:
|
||||
loop.add_signal_handler(sig, signal_handler)
|
||||
except NotImplementedError:
|
||||
# Windows doesn't support add_signal_handler
|
||||
pass
|
||||
|
||||
# Start the gateway
|
||||
success = await runner.start()
|
||||
if not success:
|
||||
return
|
||||
|
||||
# Wait for shutdown
|
||||
await runner.wait_for_shutdown()
|
||||
|
||||
|
||||
def main():
|
||||
"""CLI entry point for the gateway."""
|
||||
import argparse
|
||||
|
||||
parser = argparse.ArgumentParser(description="Hermes Gateway - Multi-platform messaging")
|
||||
parser.add_argument("--config", "-c", help="Path to gateway config file")
|
||||
parser.add_argument("--verbose", "-v", action="store_true", help="Verbose output")
|
||||
|
||||
args = parser.parse_args()
|
||||
|
||||
config = None
|
||||
if args.config:
|
||||
import json
|
||||
with open(args.config) as f:
|
||||
data = json.load(f)
|
||||
config = GatewayConfig.from_dict(data)
|
||||
|
||||
# Run the gateway
|
||||
asyncio.run(start_gateway(config))
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
522
gateway/session.py
Normal file
522
gateway/session.py
Normal file
@@ -0,0 +1,522 @@
|
||||
"""
|
||||
Session management for the gateway.
|
||||
|
||||
Handles:
|
||||
- Session context tracking (where messages come from)
|
||||
- Session storage (conversations persisted to disk)
|
||||
- Reset policy evaluation (when to start fresh)
|
||||
- Dynamic system prompt injection (agent knows its context)
|
||||
"""
|
||||
|
||||
import os
|
||||
import json
|
||||
import uuid
|
||||
from pathlib import Path
|
||||
from datetime import datetime, timedelta
|
||||
from dataclasses import dataclass, field
|
||||
from typing import Dict, List, Optional, Any
|
||||
|
||||
from .config import (
|
||||
Platform,
|
||||
GatewayConfig,
|
||||
SessionResetPolicy,
|
||||
HomeChannel,
|
||||
)
|
||||
|
||||
|
||||
@dataclass
|
||||
class SessionSource:
|
||||
"""
|
||||
Describes where a message originated from.
|
||||
|
||||
This information is used to:
|
||||
1. Route responses back to the right place
|
||||
2. Inject context into the system prompt
|
||||
3. Track origin for cron job delivery
|
||||
"""
|
||||
platform: Platform
|
||||
chat_id: str
|
||||
chat_name: Optional[str] = None
|
||||
chat_type: str = "dm" # "dm", "group", "channel", "thread"
|
||||
user_id: Optional[str] = None
|
||||
user_name: Optional[str] = None
|
||||
thread_id: Optional[str] = None # For forum topics, Discord threads, etc.
|
||||
|
||||
@property
|
||||
def description(self) -> str:
|
||||
"""Human-readable description of the source."""
|
||||
if self.platform == Platform.LOCAL:
|
||||
return "CLI terminal"
|
||||
|
||||
parts = []
|
||||
if self.chat_type == "dm":
|
||||
parts.append(f"DM with {self.user_name or self.user_id or 'user'}")
|
||||
elif self.chat_type == "group":
|
||||
parts.append(f"group: {self.chat_name or self.chat_id}")
|
||||
elif self.chat_type == "channel":
|
||||
parts.append(f"channel: {self.chat_name or self.chat_id}")
|
||||
else:
|
||||
parts.append(self.chat_name or self.chat_id)
|
||||
|
||||
if self.thread_id:
|
||||
parts.append(f"thread: {self.thread_id}")
|
||||
|
||||
return ", ".join(parts)
|
||||
|
||||
def to_dict(self) -> Dict[str, Any]:
|
||||
return {
|
||||
"platform": self.platform.value,
|
||||
"chat_id": self.chat_id,
|
||||
"chat_name": self.chat_name,
|
||||
"chat_type": self.chat_type,
|
||||
"user_id": self.user_id,
|
||||
"user_name": self.user_name,
|
||||
"thread_id": self.thread_id,
|
||||
}
|
||||
|
||||
@classmethod
|
||||
def from_dict(cls, data: Dict[str, Any]) -> "SessionSource":
|
||||
return cls(
|
||||
platform=Platform(data["platform"]),
|
||||
chat_id=str(data["chat_id"]),
|
||||
chat_name=data.get("chat_name"),
|
||||
chat_type=data.get("chat_type", "dm"),
|
||||
user_id=data.get("user_id"),
|
||||
user_name=data.get("user_name"),
|
||||
thread_id=data.get("thread_id"),
|
||||
)
|
||||
|
||||
@classmethod
|
||||
def local_cli(cls) -> "SessionSource":
|
||||
"""Create a source representing the local CLI."""
|
||||
return cls(
|
||||
platform=Platform.LOCAL,
|
||||
chat_id="cli",
|
||||
chat_name="CLI terminal",
|
||||
chat_type="dm",
|
||||
)
|
||||
|
||||
|
||||
@dataclass
|
||||
class SessionContext:
|
||||
"""
|
||||
Full context for a session, used for dynamic system prompt injection.
|
||||
|
||||
The agent receives this information to understand:
|
||||
- Where messages are coming from
|
||||
- What platforms are available
|
||||
- Where it can deliver scheduled task outputs
|
||||
"""
|
||||
source: SessionSource
|
||||
connected_platforms: List[Platform]
|
||||
home_channels: Dict[Platform, HomeChannel]
|
||||
|
||||
# Session metadata
|
||||
session_key: str = ""
|
||||
session_id: str = ""
|
||||
created_at: Optional[datetime] = None
|
||||
updated_at: Optional[datetime] = None
|
||||
|
||||
def to_dict(self) -> Dict[str, Any]:
|
||||
return {
|
||||
"source": self.source.to_dict(),
|
||||
"connected_platforms": [p.value for p in self.connected_platforms],
|
||||
"home_channels": {
|
||||
p.value: hc.to_dict() for p, hc in self.home_channels.items()
|
||||
},
|
||||
"session_key": self.session_key,
|
||||
"session_id": self.session_id,
|
||||
"created_at": self.created_at.isoformat() if self.created_at else None,
|
||||
"updated_at": self.updated_at.isoformat() if self.updated_at else None,
|
||||
}
|
||||
|
||||
|
||||
def build_session_context_prompt(context: SessionContext) -> str:
|
||||
"""
|
||||
Build the dynamic system prompt section that tells the agent about its context.
|
||||
|
||||
This is injected into the system prompt so the agent knows:
|
||||
- Where messages are coming from
|
||||
- What platforms are connected
|
||||
- Where it can deliver scheduled task outputs
|
||||
"""
|
||||
lines = [
|
||||
"## Current Session Context",
|
||||
"",
|
||||
]
|
||||
|
||||
# Source info
|
||||
platform_name = context.source.platform.value.title()
|
||||
if context.source.platform == Platform.LOCAL:
|
||||
lines.append(f"**Source:** {platform_name} (the machine running this agent)")
|
||||
else:
|
||||
lines.append(f"**Source:** {platform_name} ({context.source.description})")
|
||||
|
||||
# Connected platforms
|
||||
platforms_list = ["local (files on this machine)"]
|
||||
for p in context.connected_platforms:
|
||||
if p != Platform.LOCAL:
|
||||
platforms_list.append(f"{p.value}: Connected ✓")
|
||||
|
||||
lines.append(f"**Connected Platforms:** {', '.join(platforms_list)}")
|
||||
|
||||
# Home channels
|
||||
if context.home_channels:
|
||||
lines.append("")
|
||||
lines.append("**Home Channels (default destinations):**")
|
||||
for platform, home in context.home_channels.items():
|
||||
lines.append(f" - {platform.value}: {home.name} (ID: {home.chat_id})")
|
||||
|
||||
# Delivery options for scheduled tasks
|
||||
lines.append("")
|
||||
lines.append("**Delivery options for scheduled tasks:**")
|
||||
|
||||
# Origin delivery
|
||||
if context.source.platform == Platform.LOCAL:
|
||||
lines.append("- `\"origin\"` → Local output (saved to files)")
|
||||
else:
|
||||
lines.append(f"- `\"origin\"` → Back to this chat ({context.source.chat_name or context.source.chat_id})")
|
||||
|
||||
# Local always available
|
||||
lines.append("- `\"local\"` → Save to local files only (~/.hermes/cron/output/)")
|
||||
|
||||
# Platform home channels
|
||||
for platform, home in context.home_channels.items():
|
||||
lines.append(f"- `\"{platform.value}\"` → Home channel ({home.name})")
|
||||
|
||||
# Note about explicit targeting
|
||||
lines.append("")
|
||||
lines.append("*For explicit targeting, use `\"platform:chat_id\"` format if the user provides a specific chat ID.*")
|
||||
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
@dataclass
|
||||
class SessionEntry:
|
||||
"""
|
||||
Entry in the session store.
|
||||
|
||||
Maps a session key to its current session ID and metadata.
|
||||
"""
|
||||
session_key: str
|
||||
session_id: str
|
||||
created_at: datetime
|
||||
updated_at: datetime
|
||||
|
||||
# Origin metadata for delivery routing
|
||||
origin: Optional[SessionSource] = None
|
||||
|
||||
# Display metadata
|
||||
display_name: Optional[str] = None
|
||||
platform: Optional[Platform] = None
|
||||
chat_type: str = "dm"
|
||||
|
||||
# Token tracking
|
||||
input_tokens: int = 0
|
||||
output_tokens: int = 0
|
||||
total_tokens: int = 0
|
||||
|
||||
def to_dict(self) -> Dict[str, Any]:
|
||||
result = {
|
||||
"session_key": self.session_key,
|
||||
"session_id": self.session_id,
|
||||
"created_at": self.created_at.isoformat(),
|
||||
"updated_at": self.updated_at.isoformat(),
|
||||
"display_name": self.display_name,
|
||||
"platform": self.platform.value if self.platform else None,
|
||||
"chat_type": self.chat_type,
|
||||
"input_tokens": self.input_tokens,
|
||||
"output_tokens": self.output_tokens,
|
||||
"total_tokens": self.total_tokens,
|
||||
}
|
||||
if self.origin:
|
||||
result["origin"] = self.origin.to_dict()
|
||||
return result
|
||||
|
||||
@classmethod
|
||||
def from_dict(cls, data: Dict[str, Any]) -> "SessionEntry":
|
||||
origin = None
|
||||
if "origin" in data and data["origin"]:
|
||||
origin = SessionSource.from_dict(data["origin"])
|
||||
|
||||
platform = None
|
||||
if data.get("platform"):
|
||||
try:
|
||||
platform = Platform(data["platform"])
|
||||
except ValueError:
|
||||
pass
|
||||
|
||||
return cls(
|
||||
session_key=data["session_key"],
|
||||
session_id=data["session_id"],
|
||||
created_at=datetime.fromisoformat(data["created_at"]),
|
||||
updated_at=datetime.fromisoformat(data["updated_at"]),
|
||||
origin=origin,
|
||||
display_name=data.get("display_name"),
|
||||
platform=platform,
|
||||
chat_type=data.get("chat_type", "dm"),
|
||||
input_tokens=data.get("input_tokens", 0),
|
||||
output_tokens=data.get("output_tokens", 0),
|
||||
total_tokens=data.get("total_tokens", 0),
|
||||
)
|
||||
|
||||
|
||||
class SessionStore:
|
||||
"""
|
||||
Manages session storage and retrieval.
|
||||
|
||||
Sessions are stored in:
|
||||
- sessions.json: Index mapping session keys to session IDs
|
||||
- {session_id}.jsonl: Conversation transcripts
|
||||
"""
|
||||
|
||||
def __init__(self, sessions_dir: Path, config: GatewayConfig):
|
||||
self.sessions_dir = sessions_dir
|
||||
self.config = config
|
||||
self._entries: Dict[str, SessionEntry] = {}
|
||||
self._loaded = False
|
||||
|
||||
def _ensure_loaded(self) -> None:
|
||||
"""Load sessions from disk if not already loaded."""
|
||||
if self._loaded:
|
||||
return
|
||||
|
||||
self.sessions_dir.mkdir(parents=True, exist_ok=True)
|
||||
sessions_file = self.sessions_dir / "sessions.json"
|
||||
|
||||
if sessions_file.exists():
|
||||
try:
|
||||
with open(sessions_file, "r") as f:
|
||||
data = json.load(f)
|
||||
for key, entry_data in data.items():
|
||||
self._entries[key] = SessionEntry.from_dict(entry_data)
|
||||
except Exception as e:
|
||||
print(f"[gateway] Warning: Failed to load sessions: {e}")
|
||||
|
||||
self._loaded = True
|
||||
|
||||
def _save(self) -> None:
|
||||
"""Save sessions index to disk."""
|
||||
self.sessions_dir.mkdir(parents=True, exist_ok=True)
|
||||
sessions_file = self.sessions_dir / "sessions.json"
|
||||
|
||||
data = {key: entry.to_dict() for key, entry in self._entries.items()}
|
||||
with open(sessions_file, "w") as f:
|
||||
json.dump(data, f, indent=2)
|
||||
|
||||
def _generate_session_key(self, source: SessionSource) -> str:
|
||||
"""Generate a session key from a source."""
|
||||
platform = source.platform.value
|
||||
|
||||
if source.chat_type == "dm":
|
||||
# DMs share the main session per platform
|
||||
return f"agent:main:{platform}:dm"
|
||||
else:
|
||||
# Groups/channels get their own keys
|
||||
return f"agent:main:{platform}:{source.chat_type}:{source.chat_id}"
|
||||
|
||||
def _should_reset(self, entry: SessionEntry, source: SessionSource) -> bool:
|
||||
"""
|
||||
Check if a session should be reset based on policy.
|
||||
|
||||
Returns True if the session is stale and should start fresh.
|
||||
"""
|
||||
policy = self.config.get_reset_policy(
|
||||
platform=source.platform,
|
||||
session_type=source.chat_type
|
||||
)
|
||||
|
||||
now = datetime.now()
|
||||
|
||||
# Check idle timeout
|
||||
if policy.mode in ("idle", "both"):
|
||||
idle_deadline = entry.updated_at + timedelta(minutes=policy.idle_minutes)
|
||||
if now > idle_deadline:
|
||||
return True
|
||||
|
||||
# Check daily reset
|
||||
if policy.mode in ("daily", "both"):
|
||||
# Find the most recent reset boundary
|
||||
today_reset = now.replace(
|
||||
hour=policy.at_hour,
|
||||
minute=0,
|
||||
second=0,
|
||||
microsecond=0
|
||||
)
|
||||
if now.hour < policy.at_hour:
|
||||
# Reset boundary was yesterday
|
||||
today_reset -= timedelta(days=1)
|
||||
|
||||
if entry.updated_at < today_reset:
|
||||
return True
|
||||
|
||||
return False
|
||||
|
||||
def get_or_create_session(
|
||||
self,
|
||||
source: SessionSource,
|
||||
force_new: bool = False
|
||||
) -> SessionEntry:
|
||||
"""
|
||||
Get an existing session or create a new one.
|
||||
|
||||
Evaluates reset policy to determine if the existing session is stale.
|
||||
"""
|
||||
self._ensure_loaded()
|
||||
|
||||
session_key = self._generate_session_key(source)
|
||||
now = datetime.now()
|
||||
|
||||
# Check for existing session
|
||||
if session_key in self._entries and not force_new:
|
||||
entry = self._entries[session_key]
|
||||
|
||||
# Check if session should be reset
|
||||
if not self._should_reset(entry, source):
|
||||
# Update timestamp and return existing
|
||||
entry.updated_at = now
|
||||
self._save()
|
||||
return entry
|
||||
|
||||
# Create new session
|
||||
session_id = f"{now.strftime('%Y%m%d_%H%M%S')}_{uuid.uuid4().hex[:8]}"
|
||||
|
||||
entry = SessionEntry(
|
||||
session_key=session_key,
|
||||
session_id=session_id,
|
||||
created_at=now,
|
||||
updated_at=now,
|
||||
origin=source,
|
||||
display_name=source.chat_name,
|
||||
platform=source.platform,
|
||||
chat_type=source.chat_type,
|
||||
)
|
||||
|
||||
self._entries[session_key] = entry
|
||||
self._save()
|
||||
|
||||
return entry
|
||||
|
||||
def update_session(
|
||||
self,
|
||||
session_key: str,
|
||||
input_tokens: int = 0,
|
||||
output_tokens: int = 0
|
||||
) -> None:
|
||||
"""Update a session's metadata after an interaction."""
|
||||
self._ensure_loaded()
|
||||
|
||||
if session_key in self._entries:
|
||||
entry = self._entries[session_key]
|
||||
entry.updated_at = datetime.now()
|
||||
entry.input_tokens += input_tokens
|
||||
entry.output_tokens += output_tokens
|
||||
entry.total_tokens = entry.input_tokens + entry.output_tokens
|
||||
self._save()
|
||||
|
||||
def reset_session(self, session_key: str) -> Optional[SessionEntry]:
|
||||
"""Force reset a session, creating a new session ID."""
|
||||
self._ensure_loaded()
|
||||
|
||||
if session_key not in self._entries:
|
||||
return None
|
||||
|
||||
old_entry = self._entries[session_key]
|
||||
now = datetime.now()
|
||||
session_id = f"{now.strftime('%Y%m%d_%H%M%S')}_{uuid.uuid4().hex[:8]}"
|
||||
|
||||
new_entry = SessionEntry(
|
||||
session_key=session_key,
|
||||
session_id=session_id,
|
||||
created_at=now,
|
||||
updated_at=now,
|
||||
origin=old_entry.origin,
|
||||
display_name=old_entry.display_name,
|
||||
platform=old_entry.platform,
|
||||
chat_type=old_entry.chat_type,
|
||||
)
|
||||
|
||||
self._entries[session_key] = new_entry
|
||||
self._save()
|
||||
|
||||
return new_entry
|
||||
|
||||
def list_sessions(self, active_minutes: Optional[int] = None) -> List[SessionEntry]:
|
||||
"""
|
||||
List all sessions, optionally filtered by activity.
|
||||
|
||||
Args:
|
||||
active_minutes: If provided, only return sessions updated within this many minutes
|
||||
"""
|
||||
self._ensure_loaded()
|
||||
|
||||
entries = list(self._entries.values())
|
||||
|
||||
if active_minutes is not None:
|
||||
cutoff = datetime.now() - timedelta(minutes=active_minutes)
|
||||
entries = [e for e in entries if e.updated_at >= cutoff]
|
||||
|
||||
# Sort by most recently updated
|
||||
entries.sort(key=lambda e: e.updated_at, reverse=True)
|
||||
|
||||
return entries
|
||||
|
||||
def get_transcript_path(self, session_id: str) -> Path:
|
||||
"""Get the path to a session's transcript file."""
|
||||
return self.sessions_dir / f"{session_id}.jsonl"
|
||||
|
||||
def append_to_transcript(self, session_id: str, message: Dict[str, Any]) -> None:
|
||||
"""Append a message to a session's transcript."""
|
||||
transcript_path = self.get_transcript_path(session_id)
|
||||
|
||||
with open(transcript_path, "a") as f:
|
||||
f.write(json.dumps(message, ensure_ascii=False) + "\n")
|
||||
|
||||
def load_transcript(self, session_id: str) -> List[Dict[str, Any]]:
|
||||
"""Load all messages from a session's transcript."""
|
||||
transcript_path = self.get_transcript_path(session_id)
|
||||
|
||||
if not transcript_path.exists():
|
||||
return []
|
||||
|
||||
messages = []
|
||||
with open(transcript_path, "r") as f:
|
||||
for line in f:
|
||||
line = line.strip()
|
||||
if line:
|
||||
messages.append(json.loads(line))
|
||||
|
||||
return messages
|
||||
|
||||
|
||||
def build_session_context(
|
||||
source: SessionSource,
|
||||
config: GatewayConfig,
|
||||
session_entry: Optional[SessionEntry] = None
|
||||
) -> SessionContext:
|
||||
"""
|
||||
Build a full session context from a source and config.
|
||||
|
||||
This is used to inject context into the agent's system prompt.
|
||||
"""
|
||||
connected = config.get_connected_platforms()
|
||||
|
||||
home_channels = {}
|
||||
for platform in connected:
|
||||
home = config.get_home_channel(platform)
|
||||
if home:
|
||||
home_channels[platform] = home
|
||||
|
||||
context = SessionContext(
|
||||
source=source,
|
||||
connected_platforms=connected,
|
||||
home_channels=home_channels,
|
||||
)
|
||||
|
||||
if session_entry:
|
||||
context.session_key = session_entry.session_key
|
||||
context.session_id = session_entry.session_id
|
||||
context.created_at = session_entry.created_at
|
||||
context.updated_at = session_entry.updated_at
|
||||
|
||||
return context
|
||||
34
hermes
34
hermes
@@ -7,40 +7,6 @@ Usage: ./hermes [options]
|
||||
"""
|
||||
|
||||
if __name__ == "__main__":
|
||||
"""
|
||||
Fire (google/python-fire) does not support POSIX-style short flags like `-p`.
|
||||
We translate the most common shorthands to their long equivalents so wrapper
|
||||
scripts can reliably use:
|
||||
- `-p "..."` -> `--prompt "..."` (no TUI/banner; print result and exit)
|
||||
- `-q "..."` -> `--query "..."` (single-shot with banner UX)
|
||||
"""
|
||||
|
||||
import sys
|
||||
|
||||
def _rewrite_short_flags(argv: list[str]) -> list[str]:
|
||||
rewritten: list[str] = []
|
||||
i = 0
|
||||
while i < len(argv):
|
||||
arg = argv[i]
|
||||
if arg == "-p":
|
||||
rewritten.append("--prompt")
|
||||
if i + 1 < len(argv):
|
||||
rewritten.append(argv[i + 1])
|
||||
i += 2
|
||||
continue
|
||||
if arg == "-q":
|
||||
rewritten.append("--query")
|
||||
if i + 1 < len(argv):
|
||||
rewritten.append(argv[i + 1])
|
||||
i += 2
|
||||
continue
|
||||
rewritten.append(arg)
|
||||
i += 1
|
||||
return rewritten
|
||||
|
||||
sys.argv = [sys.argv[0]] + _rewrite_short_flags(sys.argv[1:])
|
||||
|
||||
from cli import main
|
||||
import fire
|
||||
|
||||
fire.Fire(main)
|
||||
|
||||
@@ -13,7 +13,6 @@ Requires-Dist: httpx
|
||||
Requires-Dist: rich
|
||||
Requires-Dist: tenacity
|
||||
Requires-Dist: pyyaml
|
||||
Requires-Dist: prompt_toolkit
|
||||
Requires-Dist: requests
|
||||
Requires-Dist: jinja2
|
||||
Requires-Dist: pydantic>=2.0
|
||||
@@ -28,12 +27,15 @@ Requires-Dist: boto3; extra == "modal"
|
||||
Provides-Extra: dev
|
||||
Requires-Dist: pytest; extra == "dev"
|
||||
Requires-Dist: pytest-asyncio; extra == "dev"
|
||||
Provides-Extra: atropos
|
||||
Requires-Dist: atroposlib @ git+https://github.com/NousResearch/atropos.git ; extra == "atropos"
|
||||
Requires-Dist: aiohttp; extra == "atropos"
|
||||
Requires-Dist: fastapi; extra == "atropos"
|
||||
Requires-Dist: uvicorn; extra == "atropos"
|
||||
Requires-Dist: pyte; extra == "atropos"
|
||||
Provides-Extra: messaging
|
||||
Requires-Dist: python-telegram-bot>=20.0; extra == "messaging"
|
||||
Requires-Dist: discord.py>=2.0; extra == "messaging"
|
||||
Provides-Extra: cron
|
||||
Requires-Dist: croniter; extra == "cron"
|
||||
Provides-Extra: all
|
||||
Requires-Dist: croniter; extra == "all"
|
||||
Requires-Dist: python-telegram-bot>=20.0; extra == "all"
|
||||
Requires-Dist: discord.py>=2.0; extra == "all"
|
||||
|
||||
# Hermes Agent
|
||||
|
||||
@@ -42,6 +44,7 @@ An AI agent with advanced tool-calling capabilities, featuring a flexible toolse
|
||||
## Features
|
||||
|
||||
- **Interactive CLI**: Beautiful terminal interface with animated feedback, personalities, and session management
|
||||
- **Messaging Gateway**: Connect to Telegram, Discord, and WhatsApp for conversational AI anywhere
|
||||
- **Web Tools**: Search, extract content, and crawl websites
|
||||
- **Terminal Tools**: Execute commands via local, Docker, Singularity, Modal, or SSH backends
|
||||
- **Browser Tools**: Automate web browsers to navigate, click, type, and extract content
|
||||
@@ -50,13 +53,85 @@ An AI agent with advanced tool-calling capabilities, featuring a flexible toolse
|
||||
- **Creative Tools**: Generate images from text prompts
|
||||
- **Skills Tools**: On-demand knowledge documents with progressive disclosure
|
||||
- **Toolsets System**: Organize tools into logical groups for different scenarios
|
||||
- **Scheduled Tasks**: Cron jobs for automated agent tasks with delivery to platforms
|
||||
- **Context Compression**: Automatic summarization when approaching context limits
|
||||
- **Batch Processing**: Process datasets in parallel with checkpointing and statistics tracking
|
||||
- **Ephemeral System Prompts**: Guide model behavior without polluting training datasets
|
||||
|
||||
## Quick Start (CLI)
|
||||
## Installation
|
||||
|
||||
### Quick Install (Recommended)
|
||||
|
||||
**Linux/macOS:**
|
||||
```bash
|
||||
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
|
||||
```
|
||||
|
||||
**Windows (PowerShell):**
|
||||
```powershell
|
||||
irm https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.ps1 | iex
|
||||
```
|
||||
|
||||
This installer will:
|
||||
- Clone the repository to `~/.hermes-agent`
|
||||
- Create a virtual environment and install dependencies
|
||||
- Set up the `hermes` command in your PATH
|
||||
- Run an interactive setup wizard to configure API keys
|
||||
|
||||
### Manual Installation
|
||||
|
||||
If you prefer to install manually:
|
||||
|
||||
```bash
|
||||
# After setup (see below), just run:
|
||||
# Clone with submodules
|
||||
git clone --recurse-submodules https://github.com/NousResearch/Hermes-Agent.git
|
||||
cd Hermes-Agent
|
||||
|
||||
# Run the setup script
|
||||
./setup-hermes.sh
|
||||
```
|
||||
|
||||
Or step-by-step:
|
||||
|
||||
```bash
|
||||
# Create and activate virtual environment
|
||||
python3 -m venv venv
|
||||
source venv/bin/activate # Windows: venv\Scripts\activate
|
||||
|
||||
# Install in editable mode with all extras
|
||||
pip install -e ".[all]"
|
||||
|
||||
# Or install dependencies manually
|
||||
pip install -r requirements.txt
|
||||
pip install -e ./mini-swe-agent
|
||||
|
||||
# Copy and configure environment
|
||||
cp .env.example .env
|
||||
# Edit .env with your API keys
|
||||
|
||||
# Run the setup wizard
|
||||
hermes setup
|
||||
```
|
||||
|
||||
## Quick Start
|
||||
|
||||
Once installed, the `hermes` command is your main entry point:
|
||||
|
||||
```bash
|
||||
hermes # Interactive chat (default)
|
||||
hermes chat # Same as above
|
||||
hermes chat -q "Hello" # Single query, then exit
|
||||
hermes setup # Configure API keys and settings
|
||||
hermes status # Show configuration status
|
||||
hermes doctor # Diagnose issues
|
||||
hermes gateway # Start messaging gateway (Telegram/Discord/WhatsApp)
|
||||
hermes cron daemon # Run cron job scheduler
|
||||
hermes version # Show version info
|
||||
```
|
||||
|
||||
**Legacy `./hermes` script:**
|
||||
```bash
|
||||
# The old CLI script still works:
|
||||
./hermes
|
||||
|
||||
# Or with options:
|
||||
@@ -70,35 +145,9 @@ The CLI provides:
|
||||
- Customizable personalities (`/personality kawaii`, `/personality pirate`, etc.)
|
||||
- Persistent configuration via `cli-config.yaml`
|
||||
|
||||
## Setup
|
||||
## Configuration
|
||||
|
||||
### 1. Clone the Repository
|
||||
```bash
|
||||
# Clone with submodules (recommended)
|
||||
git clone --recurse-submodules https://github.com/NousResearch/Hermes-Agent.git
|
||||
cd Hermes-Agent
|
||||
|
||||
# Or if already cloned without submodules:
|
||||
git submodule update --init --recursive
|
||||
```
|
||||
|
||||
### 2. Install Dependencies
|
||||
```bash
|
||||
# Create and activate virtual environment (recommended)
|
||||
python3 -m venv venv
|
||||
source venv/bin/activate # On Windows: venv\Scripts\activate
|
||||
|
||||
# Install Python packages
|
||||
pip install -r requirements.txt
|
||||
|
||||
# Install mini-swe-agent for terminal tools
|
||||
pip install -e ./mini-swe-agent
|
||||
|
||||
# Install Node.js dependencies for browser tools (requires Node.js)
|
||||
npm install
|
||||
```
|
||||
|
||||
### 3. Configure Environment Variables
|
||||
### Environment Variables
|
||||
```bash
|
||||
# Copy the example environment file
|
||||
cp .env.example .env
|
||||
@@ -327,6 +376,169 @@ logs/
|
||||
- **Trajectory Format**: Uses the same format as batch processing for consistency
|
||||
- **Git Ignored**: `logs/` is in `.gitignore` so logs aren't committed
|
||||
|
||||
## Context Compression
|
||||
|
||||
Long conversations can exceed the model's context limit. Hermes Agent automatically compresses context when approaching the limit:
|
||||
|
||||
**How it works:**
|
||||
1. Tracks actual token usage from API responses (`usage.prompt_tokens`)
|
||||
2. When tokens reach 85% of model's context limit, triggers compression
|
||||
3. Protects first 3 turns (system prompt, initial request, first response)
|
||||
4. Protects last 4 turns (recent context is most relevant)
|
||||
5. Summarizes middle turns using a fast/cheap model (Gemini Flash)
|
||||
6. Inserts summary as a user message, conversation continues seamlessly
|
||||
|
||||
**Configuration (`cli-config.yaml`):**
|
||||
```yaml
|
||||
compression:
|
||||
enabled: true # Enable auto-compression (default)
|
||||
threshold: 0.85 # Compress at 85% of context limit
|
||||
summary_model: "google/gemini-2.0-flash-001"
|
||||
```
|
||||
|
||||
**Or via environment variables:**
|
||||
```bash
|
||||
CONTEXT_COMPRESSION_ENABLED=true
|
||||
CONTEXT_COMPRESSION_THRESHOLD=0.85
|
||||
CONTEXT_COMPRESSION_MODEL=google/gemini-2.0-flash-001
|
||||
```
|
||||
|
||||
**When compression triggers, you'll see:**
|
||||
```
|
||||
📦 Context compression triggered (170,000 tokens ≥ 170,000 threshold)
|
||||
📊 Model context limit: 200,000 tokens (85% = 170,000)
|
||||
🗜️ Summarizing turns 4-15 (12 turns)
|
||||
✅ Compressed: 20 → 9 messages (~45,000 tokens saved)
|
||||
```
|
||||
|
||||
## Scheduled Tasks (Cron Jobs)
|
||||
|
||||
Hermes Agent can schedule automated tasks to run in the future - either one-time reminders or recurring jobs.
|
||||
|
||||
### CLI Commands
|
||||
|
||||
```bash
|
||||
# List scheduled jobs
|
||||
/cron
|
||||
|
||||
# Add a one-shot reminder (runs once in 30 minutes)
|
||||
/cron add 30m Remind me to check the build status
|
||||
|
||||
# Add a recurring job (every 2 hours)
|
||||
/cron add "every 2h" Check server status at 192.168.1.100 and report any issues
|
||||
|
||||
# Add a cron expression (daily at 9am)
|
||||
/cron add "0 9 * * *" Generate a morning briefing summarizing GitHub notifications
|
||||
|
||||
# Remove a job
|
||||
/cron remove abc123def456
|
||||
```
|
||||
|
||||
### Agent Self-Scheduling
|
||||
|
||||
The agent can also schedule its own follow-up tasks using tools:
|
||||
|
||||
```python
|
||||
# Available when using hermes-cli toolset (default for CLI)
|
||||
schedule_cronjob(prompt="...", schedule="30m", repeat=1) # One-shot
|
||||
schedule_cronjob(prompt="...", schedule="every 2h") # Recurring
|
||||
list_cronjobs() # View all jobs
|
||||
remove_cronjob(job_id="...") # Cancel a job
|
||||
```
|
||||
|
||||
**⚠️ Important:** Cronjobs run in **isolated sessions with NO prior context**. The prompt must be completely self-contained with all necessary information (file paths, URLs, server addresses, etc.). The future agent will not remember anything from the current conversation.
|
||||
|
||||
### Schedule Formats
|
||||
|
||||
| Format | Example | Description |
|
||||
|--------|---------|-------------|
|
||||
| Duration | `30m`, `2h`, `1d` | One-shot delay from now |
|
||||
| Interval | `every 30m`, `every 2h` | Recurring at fixed intervals |
|
||||
| Cron | `0 9 * * *` | Cron expression (requires `croniter`) |
|
||||
| Timestamp | `2026-02-03T14:00` | One-shot at specific time |
|
||||
|
||||
### Repeat Options
|
||||
|
||||
| repeat | Behavior |
|
||||
|--------|----------|
|
||||
| (omitted) | One-shot schedules run once; intervals/cron run forever |
|
||||
| `1` | Run once then auto-delete |
|
||||
| `N` | Run N times then auto-delete |
|
||||
|
||||
### Running the Cron Daemon
|
||||
|
||||
Jobs are stored in `~/.hermes/cron/jobs.json` and executed by a scheduler:
|
||||
|
||||
```bash
|
||||
# Option 1: Built-in daemon (checks every 60 seconds)
|
||||
python cli.py --cron-daemon
|
||||
|
||||
# Option 2: System cron integration (run once per minute)
|
||||
# Add to crontab: crontab -e
|
||||
*/1 * * * * cd ~/hermes-agent && python cli.py --cron-tick-once >> ~/.hermes/cron/cron.log 2>&1
|
||||
```
|
||||
|
||||
### Job Output
|
||||
|
||||
Job outputs are saved to `~/.hermes/cron/output/{job_id}/{timestamp}.md` for review.
|
||||
|
||||
## Messaging Gateway (Telegram, Discord, WhatsApp)
|
||||
|
||||
Connect Hermes Agent to messaging platforms so you can chat from anywhere.
|
||||
|
||||
### Quick Start
|
||||
|
||||
```bash
|
||||
# 1. Add your bot token to .env
|
||||
echo 'TELEGRAM_BOT_TOKEN="your_token"' >> .env
|
||||
|
||||
# 2. Test the gateway (foreground)
|
||||
./scripts/hermes-gateway run
|
||||
|
||||
# 3. Install as a background service
|
||||
./scripts/hermes-gateway install
|
||||
|
||||
# 4. Manage the service
|
||||
./scripts/hermes-gateway start # Start
|
||||
./scripts/hermes-gateway stop # Stop
|
||||
./scripts/hermes-gateway status # Check status
|
||||
```
|
||||
|
||||
### Supported Platforms
|
||||
|
||||
| Platform | Setup | Toolset |
|
||||
|----------|-------|---------|
|
||||
| Telegram | Bot via @BotFather | `hermes-telegram` |
|
||||
| Discord | Bot via Developer Portal | `hermes-discord` |
|
||||
| WhatsApp | Node.js bridge | `hermes-whatsapp` |
|
||||
|
||||
### Session Management
|
||||
|
||||
- Sessions persist across messages (agent remembers context)
|
||||
- Reset policies: daily (4am), idle (2 hours), or both
|
||||
- Manual reset: send `/new` or `/reset`
|
||||
|
||||
### Cron Job Delivery
|
||||
|
||||
Schedule tasks that deliver to specific platforms:
|
||||
|
||||
```python
|
||||
schedule_cronjob(
|
||||
prompt="Check server status...",
|
||||
schedule="every 1h",
|
||||
deliver="telegram" # or "origin", "discord", etc.
|
||||
)
|
||||
```
|
||||
|
||||
### CLI Commands
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `/platforms` | Show gateway configuration status |
|
||||
| `--gateway` | Start the gateway (CLI flag) |
|
||||
|
||||
See [docs/messaging.md](docs/messaging.md) for full setup instructions.
|
||||
|
||||
## Interactive CLI
|
||||
|
||||
The CLI provides a rich interactive experience for working with the agent.
|
||||
@@ -359,6 +571,8 @@ The CLI provides a rich interactive experience for working with the agent.
|
||||
| `/history` | Show conversation history |
|
||||
| `/save` | Save current conversation to file |
|
||||
| `/config` | Show current configuration |
|
||||
| `/cron` | Manage scheduled tasks (list, add, remove) |
|
||||
| `/platforms` | Show gateway/messaging platform status |
|
||||
| `/quit` | Exit the CLI |
|
||||
|
||||
### Configuration
|
||||
@@ -616,6 +830,11 @@ All environment variables can be configured in the `.env` file (copy from `.env.
|
||||
- `TERMINAL_SSH_PORT`: SSH port (default: `22`)
|
||||
- `TERMINAL_SSH_KEY`: Path to SSH private key (optional, uses ssh-agent if not set)
|
||||
|
||||
**Context Compression (auto-shrinks long conversations):**
|
||||
- `CONTEXT_COMPRESSION_ENABLED`: Enable auto-compression (default: `true`)
|
||||
- `CONTEXT_COMPRESSION_THRESHOLD`: Compress at this % of context limit (default: `0.85`)
|
||||
- `CONTEXT_COMPRESSION_MODEL`: Model for generating summaries (default: `google/gemini-2.0-flash-001`)
|
||||
|
||||
**Browser Tool Configuration (agent-browser + Browserbase):**
|
||||
- `BROWSERBASE_API_KEY`: Browserbase API key for cloud browser execution
|
||||
- `BROWSERBASE_PROJECT_ID`: Browserbase project ID
|
||||
@@ -647,13 +866,3 @@ All environment variables can be configured in the `.env` file (copy from `.env.
|
||||
| `skills/` | On-demand knowledge documents |
|
||||
| `docs/` | Documentation |
|
||||
| `configs/` | Example batch run scripts |
|
||||
|
||||
# Atropos Integrations & RL Training
|
||||
|
||||
## Nomad Setup
|
||||
Follow this: https://developer.hashicorp.com/nomad/docs/deploy
|
||||
|
||||
## Atropos dependencies
|
||||
python3 -m venv .venv
|
||||
source .venv/bin/activate
|
||||
pip install -e '.[atropos]'
|
||||
|
||||
@@ -1,66 +1,43 @@
|
||||
README.md
|
||||
atropos_compatible_agent.py
|
||||
batch_runner.py
|
||||
local_server.py
|
||||
cli.py
|
||||
model_tools.py
|
||||
pyproject.toml
|
||||
run_agent.py
|
||||
toolset_distributions.py
|
||||
toolsets.py
|
||||
trajectory_compressor.py
|
||||
atropos/__init__.py
|
||||
atropos/sandbox_server.py
|
||||
atropos/agent/__init__.py
|
||||
atropos/agent/atropos_agent.py
|
||||
atropos/api/__init__.py
|
||||
atropos/api/tool_executor_server.py
|
||||
atropos/api/tool_server.py
|
||||
atropos/backends/__init__.py
|
||||
atropos/backends/base.py
|
||||
atropos/backends/modal_backend.py
|
||||
atropos/backends/nomad_backend.py
|
||||
atropos/envs/__init__.py
|
||||
atropos/envs/agent_env.py
|
||||
atropos/envs/hermes_compat_test_env.py
|
||||
atropos/envs/sandbox_terminal_smoke_env.py
|
||||
atropos/envs/swe_smith_oracle_env.py
|
||||
atropos/envs/test_env.py
|
||||
atropos/envs/toolserver_smoke_env.py
|
||||
atropos/nomad/__init__.py
|
||||
atropos/nomad/client.py
|
||||
atropos/slots/__init__.py
|
||||
atropos/slots/executor.py
|
||||
atropos/slots/pool.py
|
||||
atropos/slots/slot.py
|
||||
atropos/terminal/__init__.py
|
||||
atropos/terminal/asciinema_stream.py
|
||||
atropos/tools/__init__.py
|
||||
atropos/tools/base.py
|
||||
atropos/tools/build_registry.py
|
||||
atropos/tools/hermes_external_tools.py
|
||||
atropos/tools/sandbox_stubs.py
|
||||
atropos/tools/terminal_stateful_tool.py
|
||||
atropos/tools/tmux_tool.py
|
||||
atropos/tools/tool_executor.py
|
||||
atropos/tools/toolset_resolver.py
|
||||
cron/__init__.py
|
||||
cron/jobs.py
|
||||
cron/scheduler.py
|
||||
gateway/__init__.py
|
||||
gateway/config.py
|
||||
gateway/delivery.py
|
||||
gateway/run.py
|
||||
gateway/session.py
|
||||
hermes_agent.egg-info/PKG-INFO
|
||||
hermes_agent.egg-info/SOURCES.txt
|
||||
hermes_agent.egg-info/dependency_links.txt
|
||||
hermes_agent.egg-info/entry_points.txt
|
||||
hermes_agent.egg-info/requires.txt
|
||||
hermes_agent.egg-info/top_level.txt
|
||||
hermes_cli/__init__.py
|
||||
hermes_cli/cron.py
|
||||
hermes_cli/doctor.py
|
||||
hermes_cli/gateway.py
|
||||
hermes_cli/main.py
|
||||
hermes_cli/setup.py
|
||||
hermes_cli/status.py
|
||||
tests/test_batch_runner.py
|
||||
tests/test_checkpoint_resumption.py
|
||||
tests/test_modal_integration.py
|
||||
tests/test_modal_stress.py
|
||||
tests/test_modal_terminal.py
|
||||
tests/test_nous_api_limits.py
|
||||
tests/test_nous_api_pattern.py
|
||||
tests/test_temperature_fix.py
|
||||
tests/test_tool_call_parsing.py
|
||||
tests/test_web_tools.py
|
||||
tools/__init__.py
|
||||
tools/browser_tool.py
|
||||
tools/cronjob_tools.py
|
||||
tools/image_generation_tool.py
|
||||
tools/mixture_of_agents_tool.py
|
||||
tools/skills_tool.py
|
||||
|
||||
@@ -1,4 +1,3 @@
|
||||
[console_scripts]
|
||||
hermes = hermes_cli.main:main
|
||||
hermes-agent = run_agent:main
|
||||
hermes-atropos-sandbox-smoke = atropos.envs.sandbox_terminal_smoke_env:SandboxTerminalSmokeEnv.cli
|
||||
hermes-atropos-toolserver-smoke = atropos.envs.toolserver_smoke_env:ToolServerSmokeEnv.cli
|
||||
|
||||
@@ -5,7 +5,6 @@ httpx
|
||||
rich
|
||||
tenacity
|
||||
pyyaml
|
||||
prompt_toolkit
|
||||
requests
|
||||
jinja2
|
||||
pydantic>=2.0
|
||||
@@ -15,17 +14,22 @@ litellm>=1.75.5
|
||||
typer
|
||||
platformdirs
|
||||
|
||||
[atropos]
|
||||
atroposlib @ git+https://github.com/NousResearch/atropos.git
|
||||
aiohttp
|
||||
fastapi
|
||||
uvicorn
|
||||
pyte
|
||||
[all]
|
||||
croniter
|
||||
python-telegram-bot>=20.0
|
||||
discord.py>=2.0
|
||||
|
||||
[cron]
|
||||
croniter
|
||||
|
||||
[dev]
|
||||
pytest
|
||||
pytest-asyncio
|
||||
|
||||
[messaging]
|
||||
python-telegram-bot>=20.0
|
||||
discord.py>=2.0
|
||||
|
||||
[modal]
|
||||
modal
|
||||
boto3
|
||||
|
||||
@@ -1,7 +1,8 @@
|
||||
atropos
|
||||
atropos_compatible_agent
|
||||
batch_runner
|
||||
local_server
|
||||
cli
|
||||
cron
|
||||
gateway
|
||||
hermes_cli
|
||||
model_tools
|
||||
run_agent
|
||||
tools
|
||||
|
||||
14
hermes_cli/__init__.py
Normal file
14
hermes_cli/__init__.py
Normal file
@@ -0,0 +1,14 @@
|
||||
"""
|
||||
Hermes CLI - Unified command-line interface for Hermes Agent.
|
||||
|
||||
Provides subcommands for:
|
||||
- hermes chat - Interactive chat (same as ./hermes)
|
||||
- hermes gateway - Run gateway in foreground
|
||||
- hermes gateway start - Start gateway service
|
||||
- hermes gateway stop - Stop gateway service
|
||||
- hermes setup - Interactive setup wizard
|
||||
- hermes status - Show status of all components
|
||||
- hermes cron - Manage cron jobs
|
||||
"""
|
||||
|
||||
__version__ = "0.1.0"
|
||||
785
hermes_cli/config.py
Normal file
785
hermes_cli/config.py
Normal file
@@ -0,0 +1,785 @@
|
||||
"""
|
||||
Configuration management for Hermes Agent.
|
||||
|
||||
Config files are stored in ~/.hermes/ for easy access:
|
||||
- ~/.hermes/config.yaml - All settings (model, toolsets, terminal, etc.)
|
||||
- ~/.hermes/.env - API keys and secrets
|
||||
|
||||
This module provides:
|
||||
- hermes config - Show current configuration
|
||||
- hermes config edit - Open config in editor
|
||||
- hermes config set - Set a specific value
|
||||
- hermes config wizard - Re-run setup wizard
|
||||
"""
|
||||
|
||||
import os
|
||||
import sys
|
||||
import subprocess
|
||||
from pathlib import Path
|
||||
from typing import Dict, Any, Optional, List, Tuple
|
||||
|
||||
import yaml
|
||||
|
||||
# ANSI colors
|
||||
class Colors:
|
||||
RESET = "\033[0m"
|
||||
BOLD = "\033[1m"
|
||||
DIM = "\033[2m"
|
||||
RED = "\033[31m"
|
||||
GREEN = "\033[32m"
|
||||
YELLOW = "\033[33m"
|
||||
BLUE = "\033[34m"
|
||||
MAGENTA = "\033[35m"
|
||||
CYAN = "\033[36m"
|
||||
|
||||
def color(text: str, *codes) -> str:
|
||||
if not sys.stdout.isatty():
|
||||
return text
|
||||
return "".join(codes) + text + Colors.RESET
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Config paths
|
||||
# =============================================================================
|
||||
|
||||
def get_hermes_home() -> Path:
|
||||
"""Get the Hermes home directory (~/.hermes)."""
|
||||
return Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
|
||||
|
||||
def get_config_path() -> Path:
|
||||
"""Get the main config file path."""
|
||||
return get_hermes_home() / "config.yaml"
|
||||
|
||||
def get_env_path() -> Path:
|
||||
"""Get the .env file path (for API keys)."""
|
||||
return get_hermes_home() / ".env"
|
||||
|
||||
def get_project_root() -> Path:
|
||||
"""Get the project installation directory."""
|
||||
return Path(__file__).parent.parent.resolve()
|
||||
|
||||
def ensure_hermes_home():
|
||||
"""Ensure ~/.hermes directory structure exists."""
|
||||
home = get_hermes_home()
|
||||
(home / "cron").mkdir(parents=True, exist_ok=True)
|
||||
(home / "sessions").mkdir(parents=True, exist_ok=True)
|
||||
(home / "logs").mkdir(parents=True, exist_ok=True)
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Config loading/saving
|
||||
# =============================================================================
|
||||
|
||||
DEFAULT_CONFIG = {
|
||||
"model": "anthropic/claude-sonnet-4.5",
|
||||
"toolsets": ["hermes-cli"],
|
||||
"max_turns": 100,
|
||||
|
||||
"terminal": {
|
||||
"backend": "local",
|
||||
"cwd": ".", # Use current directory
|
||||
"timeout": 180,
|
||||
"docker_image": "nikolaik/python-nodejs:python3.11-nodejs20",
|
||||
"singularity_image": "docker://nikolaik/python-nodejs:python3.11-nodejs20",
|
||||
"modal_image": "nikolaik/python-nodejs:python3.11-nodejs20",
|
||||
},
|
||||
|
||||
"browser": {
|
||||
"inactivity_timeout": 120,
|
||||
},
|
||||
|
||||
"compression": {
|
||||
"enabled": True,
|
||||
"threshold": 0.85,
|
||||
"summary_model": "google/gemini-2.0-flash-001",
|
||||
},
|
||||
|
||||
"display": {
|
||||
"compact": False,
|
||||
"personality": "kawaii",
|
||||
},
|
||||
|
||||
# Permanently allowed dangerous command patterns (added via "always" approval)
|
||||
"command_allowlist": [],
|
||||
|
||||
# Config schema version - bump this when adding new required fields
|
||||
"_config_version": 1,
|
||||
}
|
||||
|
||||
# =============================================================================
|
||||
# Config Migration System
|
||||
# =============================================================================
|
||||
|
||||
# Required environment variables with metadata for migration prompts
|
||||
REQUIRED_ENV_VARS = {
|
||||
"OPENROUTER_API_KEY": {
|
||||
"description": "OpenRouter API key (required for vision, web scraping, and tools)",
|
||||
"prompt": "OpenRouter API key",
|
||||
"url": "https://openrouter.ai/keys",
|
||||
"required": True,
|
||||
"password": True,
|
||||
},
|
||||
}
|
||||
|
||||
# Optional environment variables that enhance functionality
|
||||
OPTIONAL_ENV_VARS = {
|
||||
"FIRECRAWL_API_KEY": {
|
||||
"description": "Firecrawl API key for web search and scraping",
|
||||
"prompt": "Firecrawl API key",
|
||||
"url": "https://firecrawl.dev/",
|
||||
"tools": ["web_search", "web_extract"],
|
||||
"password": True,
|
||||
},
|
||||
"BROWSERBASE_API_KEY": {
|
||||
"description": "Browserbase API key for browser automation",
|
||||
"prompt": "Browserbase API key",
|
||||
"url": "https://browserbase.com/",
|
||||
"tools": ["browser_navigate", "browser_click", "etc."],
|
||||
"password": True,
|
||||
},
|
||||
"BROWSERBASE_PROJECT_ID": {
|
||||
"description": "Browserbase project ID",
|
||||
"prompt": "Browserbase project ID",
|
||||
"url": "https://browserbase.com/",
|
||||
"tools": ["browser_navigate", "browser_click", "etc."],
|
||||
"password": False,
|
||||
},
|
||||
"FAL_KEY": {
|
||||
"description": "FAL API key for image generation",
|
||||
"prompt": "FAL API key",
|
||||
"url": "https://fal.ai/",
|
||||
"tools": ["image_generate"],
|
||||
"password": True,
|
||||
},
|
||||
"TINKER_API_KEY": {
|
||||
"description": "Tinker API key for RL training",
|
||||
"prompt": "Tinker API key",
|
||||
"url": "https://tinker-console.thinkingmachines.ai/keys",
|
||||
"tools": ["rl_start_training", "rl_check_status", "rl_stop_training"],
|
||||
"password": True,
|
||||
},
|
||||
"WANDB_API_KEY": {
|
||||
"description": "Weights & Biases API key for experiment tracking",
|
||||
"prompt": "WandB API key",
|
||||
"url": "https://wandb.ai/authorize",
|
||||
"tools": ["rl_get_results", "rl_check_status"],
|
||||
"password": True,
|
||||
},
|
||||
"OPENAI_BASE_URL": {
|
||||
"description": "Custom OpenAI-compatible API endpoint URL",
|
||||
"prompt": "API base URL (e.g., https://api.example.com/v1)",
|
||||
"url": None,
|
||||
"password": False,
|
||||
},
|
||||
"OPENAI_API_KEY": {
|
||||
"description": "API key for custom OpenAI-compatible endpoint",
|
||||
"prompt": "API key for custom endpoint",
|
||||
"url": None,
|
||||
"password": True,
|
||||
},
|
||||
# Messaging platform tokens
|
||||
"TELEGRAM_BOT_TOKEN": {
|
||||
"description": "Telegram bot token from @BotFather",
|
||||
"prompt": "Telegram bot token",
|
||||
"url": "https://t.me/BotFather",
|
||||
"password": True,
|
||||
},
|
||||
"TELEGRAM_ALLOWED_USERS": {
|
||||
"description": "Comma-separated Telegram user IDs allowed to use the bot (get ID from @userinfobot)",
|
||||
"prompt": "Allowed Telegram user IDs (comma-separated)",
|
||||
"url": "https://t.me/userinfobot",
|
||||
"password": False,
|
||||
},
|
||||
"DISCORD_BOT_TOKEN": {
|
||||
"description": "Discord bot token from Developer Portal",
|
||||
"prompt": "Discord bot token",
|
||||
"url": "https://discord.com/developers/applications",
|
||||
"password": True,
|
||||
},
|
||||
"DISCORD_ALLOWED_USERS": {
|
||||
"description": "Comma-separated Discord user IDs allowed to use the bot",
|
||||
"prompt": "Allowed Discord user IDs (comma-separated)",
|
||||
"url": None,
|
||||
"password": False,
|
||||
},
|
||||
# Terminal configuration
|
||||
"MESSAGING_CWD": {
|
||||
"description": "Working directory for terminal commands via messaging (Telegram/Discord/etc). CLI always uses current directory.",
|
||||
"prompt": "Messaging working directory (default: home)",
|
||||
"url": None,
|
||||
"password": False,
|
||||
},
|
||||
"SUDO_PASSWORD": {
|
||||
"description": "Sudo password for terminal commands requiring root access",
|
||||
"prompt": "Sudo password",
|
||||
"url": None,
|
||||
"password": True,
|
||||
},
|
||||
# Agent configuration
|
||||
"HERMES_MAX_ITERATIONS": {
|
||||
"description": "Maximum tool-calling iterations per conversation (default: 60)",
|
||||
"prompt": "Max iterations",
|
||||
"url": None,
|
||||
"password": False,
|
||||
},
|
||||
"HERMES_TOOL_PROGRESS": {
|
||||
"description": "Send tool progress messages in messaging channels (true/false)",
|
||||
"prompt": "Enable tool progress messages",
|
||||
"url": None,
|
||||
"password": False,
|
||||
},
|
||||
"HERMES_TOOL_PROGRESS_MODE": {
|
||||
"description": "Progress mode: 'all' (every tool) or 'new' (only when tool changes)",
|
||||
"prompt": "Progress mode (all/new)",
|
||||
"url": None,
|
||||
"password": False,
|
||||
},
|
||||
}
|
||||
|
||||
|
||||
def get_missing_env_vars(required_only: bool = False) -> List[Dict[str, Any]]:
|
||||
"""
|
||||
Check which environment variables are missing.
|
||||
|
||||
Returns list of dicts with var info for missing variables.
|
||||
"""
|
||||
missing = []
|
||||
|
||||
# Check required vars
|
||||
for var_name, info in REQUIRED_ENV_VARS.items():
|
||||
if not get_env_value(var_name):
|
||||
missing.append({"name": var_name, **info, "is_required": True})
|
||||
|
||||
# Check optional vars (if not required_only)
|
||||
if not required_only:
|
||||
for var_name, info in OPTIONAL_ENV_VARS.items():
|
||||
if not get_env_value(var_name):
|
||||
missing.append({"name": var_name, **info, "is_required": False})
|
||||
|
||||
return missing
|
||||
|
||||
|
||||
def get_missing_config_fields() -> List[Dict[str, Any]]:
|
||||
"""
|
||||
Check which config fields are missing or outdated.
|
||||
|
||||
Returns list of missing/outdated fields.
|
||||
"""
|
||||
config = load_config()
|
||||
missing = []
|
||||
|
||||
# Check for new top-level keys in DEFAULT_CONFIG
|
||||
for key, default_value in DEFAULT_CONFIG.items():
|
||||
if key.startswith('_'):
|
||||
continue # Skip internal keys
|
||||
if key not in config:
|
||||
missing.append({
|
||||
"key": key,
|
||||
"default": default_value,
|
||||
"description": f"New config section: {key}",
|
||||
})
|
||||
elif isinstance(default_value, dict):
|
||||
# Check nested keys
|
||||
for subkey, subvalue in default_value.items():
|
||||
if subkey not in config.get(key, {}):
|
||||
missing.append({
|
||||
"key": f"{key}.{subkey}",
|
||||
"default": subvalue,
|
||||
"description": f"New config option: {key}.{subkey}",
|
||||
})
|
||||
|
||||
return missing
|
||||
|
||||
|
||||
def check_config_version() -> Tuple[int, int]:
|
||||
"""
|
||||
Check config version.
|
||||
|
||||
Returns (current_version, latest_version).
|
||||
"""
|
||||
config = load_config()
|
||||
current = config.get("_config_version", 0)
|
||||
latest = DEFAULT_CONFIG.get("_config_version", 1)
|
||||
return current, latest
|
||||
|
||||
|
||||
def migrate_config(interactive: bool = True, quiet: bool = False) -> Dict[str, Any]:
|
||||
"""
|
||||
Migrate config to latest version, prompting for new required fields.
|
||||
|
||||
Args:
|
||||
interactive: If True, prompt user for missing values
|
||||
quiet: If True, suppress output
|
||||
|
||||
Returns:
|
||||
Dict with migration results: {"env_added": [...], "config_added": [...], "warnings": [...]}
|
||||
"""
|
||||
results = {"env_added": [], "config_added": [], "warnings": []}
|
||||
|
||||
# Check config version
|
||||
current_ver, latest_ver = check_config_version()
|
||||
|
||||
if current_ver < latest_ver and not quiet:
|
||||
print(f"Config version: {current_ver} → {latest_ver}")
|
||||
|
||||
# Check for missing required env vars
|
||||
missing_env = get_missing_env_vars(required_only=True)
|
||||
|
||||
if missing_env and not quiet:
|
||||
print("\n⚠️ Missing required environment variables:")
|
||||
for var in missing_env:
|
||||
print(f" • {var['name']}: {var['description']}")
|
||||
|
||||
if interactive and missing_env:
|
||||
print("\nLet's configure them now:\n")
|
||||
for var in missing_env:
|
||||
if var.get("url"):
|
||||
print(f" Get your key at: {var['url']}")
|
||||
|
||||
if var.get("password"):
|
||||
import getpass
|
||||
value = getpass.getpass(f" {var['prompt']}: ")
|
||||
else:
|
||||
value = input(f" {var['prompt']}: ").strip()
|
||||
|
||||
if value:
|
||||
save_env_value(var["name"], value)
|
||||
results["env_added"].append(var["name"])
|
||||
print(f" ✓ Saved {var['name']}")
|
||||
else:
|
||||
results["warnings"].append(f"Skipped {var['name']} - some features may not work")
|
||||
print()
|
||||
|
||||
# Check for missing config fields
|
||||
missing_config = get_missing_config_fields()
|
||||
|
||||
if missing_config:
|
||||
config = load_config()
|
||||
|
||||
for field in missing_config:
|
||||
key = field["key"]
|
||||
default = field["default"]
|
||||
|
||||
# Add with default value
|
||||
if "." in key:
|
||||
# Nested key
|
||||
parent, child = key.split(".", 1)
|
||||
if parent not in config:
|
||||
config[parent] = {}
|
||||
config[parent][child] = default
|
||||
else:
|
||||
config[key] = default
|
||||
|
||||
results["config_added"].append(key)
|
||||
if not quiet:
|
||||
print(f" ✓ Added {key} = {default}")
|
||||
|
||||
# Update version and save
|
||||
config["_config_version"] = latest_ver
|
||||
save_config(config)
|
||||
elif current_ver < latest_ver:
|
||||
# Just update version
|
||||
config = load_config()
|
||||
config["_config_version"] = latest_ver
|
||||
save_config(config)
|
||||
|
||||
return results
|
||||
|
||||
|
||||
def load_config() -> Dict[str, Any]:
|
||||
"""Load configuration from ~/.hermes/config.yaml."""
|
||||
config_path = get_config_path()
|
||||
|
||||
config = DEFAULT_CONFIG.copy()
|
||||
|
||||
if config_path.exists():
|
||||
try:
|
||||
with open(config_path) as f:
|
||||
user_config = yaml.safe_load(f) or {}
|
||||
|
||||
# Deep merge
|
||||
for key, value in user_config.items():
|
||||
if isinstance(value, dict) and key in config and isinstance(config[key], dict):
|
||||
config[key].update(value)
|
||||
else:
|
||||
config[key] = value
|
||||
except Exception as e:
|
||||
print(f"Warning: Failed to load config: {e}")
|
||||
|
||||
return config
|
||||
|
||||
|
||||
def save_config(config: Dict[str, Any]):
|
||||
"""Save configuration to ~/.hermes/config.yaml."""
|
||||
ensure_hermes_home()
|
||||
config_path = get_config_path()
|
||||
|
||||
with open(config_path, 'w') as f:
|
||||
yaml.dump(config, f, default_flow_style=False, sort_keys=False)
|
||||
|
||||
|
||||
def load_env() -> Dict[str, str]:
|
||||
"""Load environment variables from ~/.hermes/.env."""
|
||||
env_path = get_env_path()
|
||||
env_vars = {}
|
||||
|
||||
if env_path.exists():
|
||||
with open(env_path) as f:
|
||||
for line in f:
|
||||
line = line.strip()
|
||||
if line and not line.startswith('#') and '=' in line:
|
||||
key, _, value = line.partition('=')
|
||||
env_vars[key.strip()] = value.strip().strip('"\'')
|
||||
|
||||
return env_vars
|
||||
|
||||
|
||||
def save_env_value(key: str, value: str):
|
||||
"""Save or update a value in ~/.hermes/.env."""
|
||||
ensure_hermes_home()
|
||||
env_path = get_env_path()
|
||||
|
||||
# Load existing
|
||||
lines = []
|
||||
if env_path.exists():
|
||||
with open(env_path) as f:
|
||||
lines = f.readlines()
|
||||
|
||||
# Find and update or append
|
||||
found = False
|
||||
for i, line in enumerate(lines):
|
||||
if line.strip().startswith(f"{key}="):
|
||||
lines[i] = f"{key}={value}\n"
|
||||
found = True
|
||||
break
|
||||
|
||||
if not found:
|
||||
lines.append(f"{key}={value}\n")
|
||||
|
||||
with open(env_path, 'w') as f:
|
||||
f.writelines(lines)
|
||||
|
||||
|
||||
def get_env_value(key: str) -> Optional[str]:
|
||||
"""Get a value from ~/.hermes/.env or environment."""
|
||||
# Check environment first
|
||||
if key in os.environ:
|
||||
return os.environ[key]
|
||||
|
||||
# Then check .env file
|
||||
env_vars = load_env()
|
||||
return env_vars.get(key)
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Config display
|
||||
# =============================================================================
|
||||
|
||||
def redact_key(key: str) -> str:
|
||||
"""Redact an API key for display."""
|
||||
if not key:
|
||||
return color("(not set)", Colors.DIM)
|
||||
if len(key) < 12:
|
||||
return "***"
|
||||
return key[:4] + "..." + key[-4:]
|
||||
|
||||
|
||||
def show_config():
|
||||
"""Display current configuration."""
|
||||
config = load_config()
|
||||
env_vars = load_env()
|
||||
|
||||
print()
|
||||
print(color("┌─────────────────────────────────────────────────────────┐", Colors.CYAN))
|
||||
print(color("│ 🦋 Hermes Configuration │", Colors.CYAN))
|
||||
print(color("└─────────────────────────────────────────────────────────┘", Colors.CYAN))
|
||||
|
||||
# Paths
|
||||
print()
|
||||
print(color("◆ Paths", Colors.CYAN, Colors.BOLD))
|
||||
print(f" Config: {get_config_path()}")
|
||||
print(f" Secrets: {get_env_path()}")
|
||||
print(f" Install: {get_project_root()}")
|
||||
|
||||
# API Keys
|
||||
print()
|
||||
print(color("◆ API Keys", Colors.CYAN, Colors.BOLD))
|
||||
|
||||
keys = [
|
||||
("OPENROUTER_API_KEY", "OpenRouter"),
|
||||
("ANTHROPIC_API_KEY", "Anthropic"),
|
||||
("OPENAI_API_KEY", "OpenAI"),
|
||||
("FIRECRAWL_API_KEY", "Firecrawl"),
|
||||
("BROWSERBASE_API_KEY", "Browserbase"),
|
||||
("FAL_KEY", "FAL"),
|
||||
]
|
||||
|
||||
for env_key, name in keys:
|
||||
value = get_env_value(env_key)
|
||||
print(f" {name:<14} {redact_key(value)}")
|
||||
|
||||
# Model settings
|
||||
print()
|
||||
print(color("◆ Model", Colors.CYAN, Colors.BOLD))
|
||||
print(f" Model: {config.get('model', 'not set')}")
|
||||
print(f" Max turns: {config.get('max_turns', 100)}")
|
||||
print(f" Toolsets: {', '.join(config.get('toolsets', ['all']))}")
|
||||
|
||||
# Terminal
|
||||
print()
|
||||
print(color("◆ Terminal", Colors.CYAN, Colors.BOLD))
|
||||
terminal = config.get('terminal', {})
|
||||
print(f" Backend: {terminal.get('backend', 'local')}")
|
||||
print(f" Working dir: {terminal.get('cwd', '.')}")
|
||||
print(f" Timeout: {terminal.get('timeout', 60)}s")
|
||||
|
||||
if terminal.get('backend') == 'docker':
|
||||
print(f" Docker image: {terminal.get('docker_image', 'python:3.11-slim')}")
|
||||
elif terminal.get('backend') == 'singularity':
|
||||
print(f" Image: {terminal.get('singularity_image', 'docker://python:3.11')}")
|
||||
elif terminal.get('backend') == 'modal':
|
||||
print(f" Modal image: {terminal.get('modal_image', 'python:3.11')}")
|
||||
modal_token = get_env_value('MODAL_TOKEN_ID')
|
||||
print(f" Modal token: {'configured' if modal_token else '(not set)'}")
|
||||
elif terminal.get('backend') == 'ssh':
|
||||
ssh_host = get_env_value('TERMINAL_SSH_HOST')
|
||||
ssh_user = get_env_value('TERMINAL_SSH_USER')
|
||||
print(f" SSH host: {ssh_host or '(not set)'}")
|
||||
print(f" SSH user: {ssh_user or '(not set)'}")
|
||||
|
||||
# Compression
|
||||
print()
|
||||
print(color("◆ Context Compression", Colors.CYAN, Colors.BOLD))
|
||||
compression = config.get('compression', {})
|
||||
enabled = compression.get('enabled', True)
|
||||
print(f" Enabled: {'yes' if enabled else 'no'}")
|
||||
if enabled:
|
||||
print(f" Threshold: {compression.get('threshold', 0.85) * 100:.0f}%")
|
||||
print(f" Model: {compression.get('summary_model', 'google/gemini-2.0-flash-001')}")
|
||||
|
||||
# Messaging
|
||||
print()
|
||||
print(color("◆ Messaging Platforms", Colors.CYAN, Colors.BOLD))
|
||||
|
||||
telegram_token = get_env_value('TELEGRAM_BOT_TOKEN')
|
||||
discord_token = get_env_value('DISCORD_BOT_TOKEN')
|
||||
|
||||
print(f" Telegram: {'configured' if telegram_token else color('not configured', Colors.DIM)}")
|
||||
print(f" Discord: {'configured' if discord_token else color('not configured', Colors.DIM)}")
|
||||
|
||||
print()
|
||||
print(color("─" * 60, Colors.DIM))
|
||||
print(color(" hermes config edit # Edit config file", Colors.DIM))
|
||||
print(color(" hermes config set KEY VALUE", Colors.DIM))
|
||||
print(color(" hermes setup # Run setup wizard", Colors.DIM))
|
||||
print()
|
||||
|
||||
|
||||
def edit_config():
|
||||
"""Open config file in user's editor."""
|
||||
config_path = get_config_path()
|
||||
|
||||
# Ensure config exists
|
||||
if not config_path.exists():
|
||||
save_config(DEFAULT_CONFIG)
|
||||
print(f"Created {config_path}")
|
||||
|
||||
# Find editor
|
||||
editor = os.getenv('EDITOR') or os.getenv('VISUAL')
|
||||
|
||||
if not editor:
|
||||
# Try common editors
|
||||
for cmd in ['nano', 'vim', 'vi', 'code', 'notepad']:
|
||||
import shutil
|
||||
if shutil.which(cmd):
|
||||
editor = cmd
|
||||
break
|
||||
|
||||
if not editor:
|
||||
print(f"No editor found. Config file is at:")
|
||||
print(f" {config_path}")
|
||||
return
|
||||
|
||||
print(f"Opening {config_path} in {editor}...")
|
||||
subprocess.run([editor, str(config_path)])
|
||||
|
||||
|
||||
def set_config_value(key: str, value: str):
|
||||
"""Set a configuration value."""
|
||||
# Check if it's an API key (goes to .env)
|
||||
api_keys = [
|
||||
'OPENROUTER_API_KEY', 'ANTHROPIC_API_KEY', 'OPENAI_API_KEY',
|
||||
'FIRECRAWL_API_KEY', 'BROWSERBASE_API_KEY', 'BROWSERBASE_PROJECT_ID',
|
||||
'FAL_KEY', 'TELEGRAM_BOT_TOKEN', 'DISCORD_BOT_TOKEN',
|
||||
'TERMINAL_SSH_HOST', 'TERMINAL_SSH_USER', 'TERMINAL_SSH_KEY',
|
||||
'SUDO_PASSWORD'
|
||||
]
|
||||
|
||||
if key.upper() in api_keys or key.upper().startswith('TERMINAL_SSH'):
|
||||
save_env_value(key.upper(), value)
|
||||
print(f"✓ Set {key} in {get_env_path()}")
|
||||
return
|
||||
|
||||
# Otherwise it goes to config.yaml
|
||||
config = load_config()
|
||||
|
||||
# Handle nested keys (e.g., "terminal.backend")
|
||||
parts = key.split('.')
|
||||
current = config
|
||||
|
||||
for part in parts[:-1]:
|
||||
if part not in current:
|
||||
current[part] = {}
|
||||
current = current[part]
|
||||
|
||||
# Convert value to appropriate type
|
||||
if value.lower() in ('true', 'yes', 'on'):
|
||||
value = True
|
||||
elif value.lower() in ('false', 'no', 'off'):
|
||||
value = False
|
||||
elif value.isdigit():
|
||||
value = int(value)
|
||||
elif value.replace('.', '', 1).isdigit():
|
||||
value = float(value)
|
||||
|
||||
current[parts[-1]] = value
|
||||
save_config(config)
|
||||
print(f"✓ Set {key} = {value} in {get_config_path()}")
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Command handler
|
||||
# =============================================================================
|
||||
|
||||
def config_command(args):
|
||||
"""Handle config subcommands."""
|
||||
subcmd = getattr(args, 'config_command', None)
|
||||
|
||||
if subcmd is None or subcmd == "show":
|
||||
show_config()
|
||||
|
||||
elif subcmd == "edit":
|
||||
edit_config()
|
||||
|
||||
elif subcmd == "set":
|
||||
key = getattr(args, 'key', None)
|
||||
value = getattr(args, 'value', None)
|
||||
if not key or not value:
|
||||
print("Usage: hermes config set KEY VALUE")
|
||||
print()
|
||||
print("Examples:")
|
||||
print(" hermes config set model anthropic/claude-sonnet-4")
|
||||
print(" hermes config set terminal.backend docker")
|
||||
print(" hermes config set OPENROUTER_API_KEY sk-or-...")
|
||||
sys.exit(1)
|
||||
set_config_value(key, value)
|
||||
|
||||
elif subcmd == "path":
|
||||
print(get_config_path())
|
||||
|
||||
elif subcmd == "env-path":
|
||||
print(get_env_path())
|
||||
|
||||
elif subcmd == "migrate":
|
||||
print()
|
||||
print(color("🔄 Checking configuration for updates...", Colors.CYAN, Colors.BOLD))
|
||||
print()
|
||||
|
||||
# Check what's missing
|
||||
missing_env = get_missing_env_vars(required_only=False)
|
||||
missing_config = get_missing_config_fields()
|
||||
current_ver, latest_ver = check_config_version()
|
||||
|
||||
if not missing_env and not missing_config and current_ver >= latest_ver:
|
||||
print(color("✓ Configuration is up to date!", Colors.GREEN))
|
||||
print()
|
||||
return
|
||||
|
||||
# Show what needs to be updated
|
||||
if current_ver < latest_ver:
|
||||
print(f" Config version: {current_ver} → {latest_ver}")
|
||||
|
||||
if missing_config:
|
||||
print(f"\n {len(missing_config)} new config option(s) will be added with defaults")
|
||||
|
||||
required_missing = [v for v in missing_env if v.get("is_required")]
|
||||
optional_missing = [v for v in missing_env if not v.get("is_required")]
|
||||
|
||||
if required_missing:
|
||||
print(f"\n ⚠️ {len(required_missing)} required API key(s) missing:")
|
||||
for var in required_missing:
|
||||
print(f" • {var['name']}")
|
||||
|
||||
if optional_missing:
|
||||
print(f"\n ℹ️ {len(optional_missing)} optional API key(s) not configured:")
|
||||
for var in optional_missing:
|
||||
tools = var.get("tools", [])
|
||||
tools_str = f" (enables: {', '.join(tools[:2])})" if tools else ""
|
||||
print(f" • {var['name']}{tools_str}")
|
||||
|
||||
print()
|
||||
|
||||
# Run migration
|
||||
results = migrate_config(interactive=True, quiet=False)
|
||||
|
||||
print()
|
||||
if results["env_added"] or results["config_added"]:
|
||||
print(color("✓ Configuration updated!", Colors.GREEN))
|
||||
|
||||
if results["warnings"]:
|
||||
print()
|
||||
for warning in results["warnings"]:
|
||||
print(color(f" ⚠️ {warning}", Colors.YELLOW))
|
||||
|
||||
print()
|
||||
|
||||
elif subcmd == "check":
|
||||
# Non-interactive check for what's missing
|
||||
print()
|
||||
print(color("📋 Configuration Status", Colors.CYAN, Colors.BOLD))
|
||||
print()
|
||||
|
||||
current_ver, latest_ver = check_config_version()
|
||||
if current_ver >= latest_ver:
|
||||
print(f" Config version: {current_ver} ✓")
|
||||
else:
|
||||
print(color(f" Config version: {current_ver} → {latest_ver} (update available)", Colors.YELLOW))
|
||||
|
||||
print()
|
||||
print(color(" Required:", Colors.BOLD))
|
||||
for var_name in REQUIRED_ENV_VARS:
|
||||
if get_env_value(var_name):
|
||||
print(f" ✓ {var_name}")
|
||||
else:
|
||||
print(color(f" ✗ {var_name} (missing)", Colors.RED))
|
||||
|
||||
print()
|
||||
print(color(" Optional:", Colors.BOLD))
|
||||
for var_name, info in OPTIONAL_ENV_VARS.items():
|
||||
if get_env_value(var_name):
|
||||
print(f" ✓ {var_name}")
|
||||
else:
|
||||
tools = info.get("tools", [])
|
||||
tools_str = f" → {', '.join(tools[:2])}" if tools else ""
|
||||
print(color(f" ○ {var_name}{tools_str}", Colors.DIM))
|
||||
|
||||
missing_config = get_missing_config_fields()
|
||||
if missing_config:
|
||||
print()
|
||||
print(color(f" {len(missing_config)} new config option(s) available", Colors.YELLOW))
|
||||
print(f" Run 'hermes config migrate' to add them")
|
||||
|
||||
print()
|
||||
|
||||
else:
|
||||
print(f"Unknown config command: {subcmd}")
|
||||
print()
|
||||
print("Available commands:")
|
||||
print(" hermes config Show current configuration")
|
||||
print(" hermes config edit Open config in editor")
|
||||
print(" hermes config set K V Set a config value")
|
||||
print(" hermes config check Check for missing/outdated config")
|
||||
print(" hermes config migrate Update config with new options")
|
||||
print(" hermes config path Show config file path")
|
||||
print(" hermes config env-path Show .env file path")
|
||||
sys.exit(1)
|
||||
131
hermes_cli/cron.py
Normal file
131
hermes_cli/cron.py
Normal file
@@ -0,0 +1,131 @@
|
||||
"""
|
||||
Cron subcommand for hermes CLI.
|
||||
|
||||
Handles: hermes cron [list|daemon|tick]
|
||||
"""
|
||||
|
||||
import json
|
||||
import sys
|
||||
import time
|
||||
from pathlib import Path
|
||||
from datetime import datetime
|
||||
|
||||
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
|
||||
sys.path.insert(0, str(PROJECT_ROOT))
|
||||
|
||||
# ANSI colors
|
||||
class Colors:
|
||||
RESET = "\033[0m"
|
||||
BOLD = "\033[1m"
|
||||
DIM = "\033[2m"
|
||||
RED = "\033[31m"
|
||||
GREEN = "\033[32m"
|
||||
YELLOW = "\033[33m"
|
||||
CYAN = "\033[36m"
|
||||
|
||||
def color(text: str, *codes) -> str:
|
||||
if not sys.stdout.isatty():
|
||||
return text
|
||||
return "".join(codes) + text + Colors.RESET
|
||||
|
||||
|
||||
def cron_list(show_all: bool = False):
|
||||
"""List all scheduled jobs."""
|
||||
from cron.jobs import list_jobs
|
||||
|
||||
jobs = list_jobs(include_disabled=show_all)
|
||||
|
||||
if not jobs:
|
||||
print(color("No scheduled jobs.", Colors.DIM))
|
||||
print(color("Create one with: hermes cron add <schedule> <prompt>", Colors.DIM))
|
||||
return
|
||||
|
||||
print()
|
||||
print(color("┌─────────────────────────────────────────────────────────────────────────┐", Colors.CYAN))
|
||||
print(color("│ Scheduled Jobs │", Colors.CYAN))
|
||||
print(color("└─────────────────────────────────────────────────────────────────────────┘", Colors.CYAN))
|
||||
print()
|
||||
|
||||
for job in jobs:
|
||||
job_id = job.get("id", "?")[:8]
|
||||
name = job.get("name", "(unnamed)")
|
||||
schedule = job.get("schedule_display", job.get("schedule", {}).get("value", "?"))
|
||||
enabled = job.get("enabled", True)
|
||||
next_run = job.get("next_run_at", "?")
|
||||
|
||||
# Repeat info
|
||||
repeat_info = job.get("repeat", {})
|
||||
repeat_times = repeat_info.get("times")
|
||||
repeat_completed = repeat_info.get("completed", 0)
|
||||
|
||||
if repeat_times:
|
||||
repeat_str = f"{repeat_completed}/{repeat_times}"
|
||||
else:
|
||||
repeat_str = "∞"
|
||||
|
||||
# Delivery targets
|
||||
deliver = job.get("deliver", ["local"])
|
||||
if isinstance(deliver, str):
|
||||
deliver = [deliver]
|
||||
deliver_str = ", ".join(deliver)
|
||||
|
||||
# Status indicator
|
||||
if not enabled:
|
||||
status = color("[disabled]", Colors.RED)
|
||||
else:
|
||||
status = color("[active]", Colors.GREEN)
|
||||
|
||||
print(f" {color(job_id, Colors.YELLOW)} {status}")
|
||||
print(f" Name: {name}")
|
||||
print(f" Schedule: {schedule}")
|
||||
print(f" Repeat: {repeat_str}")
|
||||
print(f" Next run: {next_run}")
|
||||
print(f" Deliver: {deliver_str}")
|
||||
print()
|
||||
|
||||
|
||||
def cron_daemon(interval: int = 60):
|
||||
"""Run the cron daemon."""
|
||||
from cron.scheduler import start_daemon
|
||||
|
||||
print(color("┌─────────────────────────────────────────────────────────┐", Colors.CYAN))
|
||||
print(color("│ 🦋 Hermes Cron Daemon │", Colors.CYAN))
|
||||
print(color("├─────────────────────────────────────────────────────────┤", Colors.CYAN))
|
||||
print(color("│ Press Ctrl+C to stop │", Colors.CYAN))
|
||||
print(color("└─────────────────────────────────────────────────────────┘", Colors.CYAN))
|
||||
print()
|
||||
|
||||
try:
|
||||
start_daemon(interval=interval)
|
||||
except KeyboardInterrupt:
|
||||
print()
|
||||
print(color("Cron daemon stopped.", Colors.YELLOW))
|
||||
|
||||
|
||||
def cron_tick():
|
||||
"""Run due jobs once (for system cron integration)."""
|
||||
from cron.scheduler import tick
|
||||
|
||||
print(f"[{datetime.now().isoformat()}] Running cron tick...")
|
||||
tick()
|
||||
|
||||
|
||||
def cron_command(args):
|
||||
"""Handle cron subcommands."""
|
||||
subcmd = getattr(args, 'cron_command', None)
|
||||
|
||||
if subcmd is None or subcmd == "list":
|
||||
show_all = getattr(args, 'all', False)
|
||||
cron_list(show_all)
|
||||
|
||||
elif subcmd == "daemon":
|
||||
interval = getattr(args, 'interval', 60)
|
||||
cron_daemon(interval)
|
||||
|
||||
elif subcmd == "tick":
|
||||
cron_tick()
|
||||
|
||||
else:
|
||||
print(f"Unknown cron command: {subcmd}")
|
||||
print("Usage: hermes cron [list|daemon|tick]")
|
||||
sys.exit(1)
|
||||
316
hermes_cli/doctor.py
Normal file
316
hermes_cli/doctor.py
Normal file
@@ -0,0 +1,316 @@
|
||||
"""
|
||||
Doctor command for hermes CLI.
|
||||
|
||||
Diagnoses issues with Hermes Agent setup.
|
||||
"""
|
||||
|
||||
import os
|
||||
import sys
|
||||
import subprocess
|
||||
import shutil
|
||||
from pathlib import Path
|
||||
|
||||
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
|
||||
|
||||
# ANSI colors
|
||||
class Colors:
|
||||
RESET = "\033[0m"
|
||||
BOLD = "\033[1m"
|
||||
DIM = "\033[2m"
|
||||
RED = "\033[31m"
|
||||
GREEN = "\033[32m"
|
||||
YELLOW = "\033[33m"
|
||||
CYAN = "\033[36m"
|
||||
|
||||
def color(text: str, *codes) -> str:
|
||||
if not sys.stdout.isatty():
|
||||
return text
|
||||
return "".join(codes) + text + Colors.RESET
|
||||
|
||||
def check_ok(text: str, detail: str = ""):
|
||||
print(f" {color('✓', Colors.GREEN)} {text}" + (f" {color(detail, Colors.DIM)}" if detail else ""))
|
||||
|
||||
def check_warn(text: str, detail: str = ""):
|
||||
print(f" {color('⚠', Colors.YELLOW)} {text}" + (f" {color(detail, Colors.DIM)}" if detail else ""))
|
||||
|
||||
def check_fail(text: str, detail: str = ""):
|
||||
print(f" {color('✗', Colors.RED)} {text}" + (f" {color(detail, Colors.DIM)}" if detail else ""))
|
||||
|
||||
def check_info(text: str):
|
||||
print(f" {color('→', Colors.CYAN)} {text}")
|
||||
|
||||
|
||||
def run_doctor(args):
|
||||
"""Run diagnostic checks."""
|
||||
should_fix = getattr(args, 'fix', False)
|
||||
|
||||
issues = []
|
||||
|
||||
print()
|
||||
print(color("┌─────────────────────────────────────────────────────────┐", Colors.CYAN))
|
||||
print(color("│ 🩺 Hermes Doctor │", Colors.CYAN))
|
||||
print(color("└─────────────────────────────────────────────────────────┘", Colors.CYAN))
|
||||
|
||||
# =========================================================================
|
||||
# Check: Python version
|
||||
# =========================================================================
|
||||
print()
|
||||
print(color("◆ Python Environment", Colors.CYAN, Colors.BOLD))
|
||||
|
||||
py_version = sys.version_info
|
||||
if py_version >= (3, 10):
|
||||
check_ok(f"Python {py_version.major}.{py_version.minor}.{py_version.micro}")
|
||||
elif py_version >= (3, 8):
|
||||
check_warn(f"Python {py_version.major}.{py_version.minor}.{py_version.micro}", "(3.10+ recommended)")
|
||||
else:
|
||||
check_fail(f"Python {py_version.major}.{py_version.minor}.{py_version.micro}", "(3.10+ required)")
|
||||
issues.append("Upgrade Python to 3.10+")
|
||||
|
||||
# Check if in virtual environment
|
||||
in_venv = sys.prefix != sys.base_prefix
|
||||
if in_venv:
|
||||
check_ok("Virtual environment active")
|
||||
else:
|
||||
check_warn("Not in virtual environment", "(recommended)")
|
||||
|
||||
# =========================================================================
|
||||
# Check: Required packages
|
||||
# =========================================================================
|
||||
print()
|
||||
print(color("◆ Required Packages", Colors.CYAN, Colors.BOLD))
|
||||
|
||||
required_packages = [
|
||||
("openai", "OpenAI SDK"),
|
||||
("rich", "Rich (terminal UI)"),
|
||||
("dotenv", "python-dotenv"),
|
||||
("yaml", "PyYAML"),
|
||||
("httpx", "HTTPX"),
|
||||
]
|
||||
|
||||
optional_packages = [
|
||||
("croniter", "Croniter (cron expressions)"),
|
||||
("browserbase", "Browserbase SDK"),
|
||||
("telegram", "python-telegram-bot"),
|
||||
("discord", "discord.py"),
|
||||
]
|
||||
|
||||
for module, name in required_packages:
|
||||
try:
|
||||
__import__(module)
|
||||
check_ok(name)
|
||||
except ImportError:
|
||||
check_fail(name, "(missing)")
|
||||
issues.append(f"Install {name}: pip install {module}")
|
||||
|
||||
for module, name in optional_packages:
|
||||
try:
|
||||
__import__(module)
|
||||
check_ok(name, "(optional)")
|
||||
except ImportError:
|
||||
check_warn(name, "(optional, not installed)")
|
||||
|
||||
# =========================================================================
|
||||
# Check: Configuration files
|
||||
# =========================================================================
|
||||
print()
|
||||
print(color("◆ Configuration Files", Colors.CYAN, Colors.BOLD))
|
||||
|
||||
env_path = PROJECT_ROOT / '.env'
|
||||
if env_path.exists():
|
||||
check_ok(".env file exists")
|
||||
|
||||
# Check for common issues
|
||||
content = env_path.read_text()
|
||||
if "OPENROUTER_API_KEY" in content or "ANTHROPIC_API_KEY" in content:
|
||||
check_ok("API key configured")
|
||||
else:
|
||||
check_warn("No API key found in .env")
|
||||
issues.append("Run 'hermes setup' to configure API keys")
|
||||
else:
|
||||
check_fail(".env file missing")
|
||||
check_info("Run 'hermes setup' to create one")
|
||||
issues.append("Run 'hermes setup' to create .env")
|
||||
|
||||
config_path = PROJECT_ROOT / 'cli-config.yaml'
|
||||
if config_path.exists():
|
||||
check_ok("cli-config.yaml exists")
|
||||
else:
|
||||
check_warn("cli-config.yaml not found", "(using defaults)")
|
||||
|
||||
# =========================================================================
|
||||
# Check: Directory structure
|
||||
# =========================================================================
|
||||
print()
|
||||
print(color("◆ Directory Structure", Colors.CYAN, Colors.BOLD))
|
||||
|
||||
hermes_home = Path.home() / ".hermes"
|
||||
if hermes_home.exists():
|
||||
check_ok("~/.hermes directory exists")
|
||||
else:
|
||||
check_warn("~/.hermes not found", "(will be created on first use)")
|
||||
|
||||
logs_dir = PROJECT_ROOT / "logs"
|
||||
if logs_dir.exists():
|
||||
check_ok("logs/ directory exists")
|
||||
else:
|
||||
check_warn("logs/ not found", "(will be created on first use)")
|
||||
|
||||
# =========================================================================
|
||||
# Check: External tools
|
||||
# =========================================================================
|
||||
print()
|
||||
print(color("◆ External Tools", Colors.CYAN, Colors.BOLD))
|
||||
|
||||
# Git
|
||||
if shutil.which("git"):
|
||||
check_ok("git")
|
||||
else:
|
||||
check_warn("git not found", "(optional)")
|
||||
|
||||
# ripgrep (optional, for faster file search)
|
||||
if shutil.which("rg"):
|
||||
check_ok("ripgrep (rg)", "(faster file search)")
|
||||
else:
|
||||
check_warn("ripgrep (rg) not found", "(file search uses grep fallback)")
|
||||
check_info("Install for faster search: sudo apt install ripgrep")
|
||||
|
||||
# Docker (optional)
|
||||
terminal_env = os.getenv("TERMINAL_ENV", "local")
|
||||
if terminal_env == "docker":
|
||||
if shutil.which("docker"):
|
||||
# Check if docker daemon is running
|
||||
result = subprocess.run(["docker", "info"], capture_output=True)
|
||||
if result.returncode == 0:
|
||||
check_ok("docker", "(daemon running)")
|
||||
else:
|
||||
check_fail("docker daemon not running")
|
||||
issues.append("Start Docker daemon")
|
||||
else:
|
||||
check_fail("docker not found", "(required for TERMINAL_ENV=docker)")
|
||||
issues.append("Install Docker or change TERMINAL_ENV")
|
||||
else:
|
||||
if shutil.which("docker"):
|
||||
check_ok("docker", "(optional)")
|
||||
else:
|
||||
check_warn("docker not found", "(optional)")
|
||||
|
||||
# SSH (if using ssh backend)
|
||||
if terminal_env == "ssh":
|
||||
ssh_host = os.getenv("TERMINAL_SSH_HOST")
|
||||
if ssh_host:
|
||||
# Try to connect
|
||||
result = subprocess.run(
|
||||
["ssh", "-o", "ConnectTimeout=5", "-o", "BatchMode=yes", ssh_host, "echo ok"],
|
||||
capture_output=True,
|
||||
text=True
|
||||
)
|
||||
if result.returncode == 0:
|
||||
check_ok(f"SSH connection to {ssh_host}")
|
||||
else:
|
||||
check_fail(f"SSH connection to {ssh_host}")
|
||||
issues.append(f"Check SSH configuration for {ssh_host}")
|
||||
else:
|
||||
check_fail("TERMINAL_SSH_HOST not set", "(required for TERMINAL_ENV=ssh)")
|
||||
issues.append("Set TERMINAL_SSH_HOST in .env")
|
||||
|
||||
# =========================================================================
|
||||
# Check: API connectivity
|
||||
# =========================================================================
|
||||
print()
|
||||
print(color("◆ API Connectivity", Colors.CYAN, Colors.BOLD))
|
||||
|
||||
openrouter_key = os.getenv("OPENROUTER_API_KEY")
|
||||
if openrouter_key:
|
||||
try:
|
||||
import httpx
|
||||
response = httpx.get(
|
||||
"https://openrouter.ai/api/v1/models",
|
||||
headers={"Authorization": f"Bearer {openrouter_key}"},
|
||||
timeout=10
|
||||
)
|
||||
if response.status_code == 200:
|
||||
check_ok("OpenRouter API")
|
||||
elif response.status_code == 401:
|
||||
check_fail("OpenRouter API", "(invalid API key)")
|
||||
issues.append("Check OPENROUTER_API_KEY in .env")
|
||||
else:
|
||||
check_fail("OpenRouter API", f"(HTTP {response.status_code})")
|
||||
except Exception as e:
|
||||
check_fail("OpenRouter API", f"({e})")
|
||||
issues.append("Check network connectivity")
|
||||
else:
|
||||
check_warn("OpenRouter API", "(not configured)")
|
||||
|
||||
anthropic_key = os.getenv("ANTHROPIC_API_KEY")
|
||||
if anthropic_key:
|
||||
try:
|
||||
import httpx
|
||||
response = httpx.get(
|
||||
"https://api.anthropic.com/v1/models",
|
||||
headers={
|
||||
"x-api-key": anthropic_key,
|
||||
"anthropic-version": "2023-06-01"
|
||||
},
|
||||
timeout=10
|
||||
)
|
||||
if response.status_code == 200:
|
||||
check_ok("Anthropic API")
|
||||
elif response.status_code == 401:
|
||||
check_fail("Anthropic API", "(invalid API key)")
|
||||
else:
|
||||
# Note: Anthropic may not have /models endpoint
|
||||
check_warn("Anthropic API", "(couldn't verify)")
|
||||
except Exception as e:
|
||||
check_warn("Anthropic API", f"({e})")
|
||||
|
||||
# =========================================================================
|
||||
# Check: Tool Availability
|
||||
# =========================================================================
|
||||
print()
|
||||
print(color("◆ Tool Availability", Colors.CYAN, Colors.BOLD))
|
||||
|
||||
try:
|
||||
# Add project root to path for imports
|
||||
sys.path.insert(0, str(PROJECT_ROOT))
|
||||
from model_tools import check_tool_availability, TOOLSET_REQUIREMENTS
|
||||
|
||||
available, unavailable = check_tool_availability()
|
||||
|
||||
for tid in available:
|
||||
info = TOOLSET_REQUIREMENTS.get(tid, {})
|
||||
check_ok(info.get("name", tid))
|
||||
|
||||
for item in unavailable:
|
||||
if item["missing_vars"]:
|
||||
vars_str = ", ".join(item["missing_vars"])
|
||||
check_warn(item["name"], f"(missing {vars_str})")
|
||||
else:
|
||||
check_warn(item["name"], "(system dependency not met)")
|
||||
|
||||
# Count disabled tools with API key requirements
|
||||
api_disabled = [u for u in unavailable if u["missing_vars"]]
|
||||
if api_disabled:
|
||||
issues.append("Run 'hermes setup' to configure missing API keys for full tool access")
|
||||
except Exception as e:
|
||||
check_warn("Could not check tool availability", f"({e})")
|
||||
|
||||
# =========================================================================
|
||||
# Summary
|
||||
# =========================================================================
|
||||
print()
|
||||
if issues:
|
||||
print(color("─" * 60, Colors.YELLOW))
|
||||
print(color(f" Found {len(issues)} issue(s) to address:", Colors.YELLOW, Colors.BOLD))
|
||||
print()
|
||||
for i, issue in enumerate(issues, 1):
|
||||
print(f" {i}. {issue}")
|
||||
print()
|
||||
|
||||
if should_fix:
|
||||
print(color(" Attempting auto-fix is not yet implemented.", Colors.DIM))
|
||||
print(color(" Please resolve issues manually.", Colors.DIM))
|
||||
else:
|
||||
print(color("─" * 60, Colors.GREEN))
|
||||
print(color(" All checks passed! 🎉", Colors.GREEN, Colors.BOLD))
|
||||
|
||||
print()
|
||||
487
hermes_cli/gateway.py
Normal file
487
hermes_cli/gateway.py
Normal file
@@ -0,0 +1,487 @@
|
||||
"""
|
||||
Gateway subcommand for hermes CLI.
|
||||
|
||||
Handles: hermes gateway [run|start|stop|restart|status|install|uninstall]
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import os
|
||||
import signal
|
||||
import subprocess
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Process Management (for manual gateway runs)
|
||||
# =============================================================================
|
||||
|
||||
def find_gateway_pids() -> list:
|
||||
"""Find PIDs of running gateway processes."""
|
||||
pids = []
|
||||
try:
|
||||
# Look for gateway processes with multiple patterns
|
||||
patterns = [
|
||||
"hermes_cli.main gateway",
|
||||
"hermes gateway",
|
||||
"gateway/run.py",
|
||||
]
|
||||
|
||||
result = subprocess.run(
|
||||
["ps", "aux"],
|
||||
capture_output=True,
|
||||
text=True
|
||||
)
|
||||
|
||||
for line in result.stdout.split('\n'):
|
||||
# Skip grep and current process
|
||||
if 'grep' in line or str(os.getpid()) in line:
|
||||
continue
|
||||
|
||||
for pattern in patterns:
|
||||
if pattern in line:
|
||||
parts = line.split()
|
||||
if len(parts) > 1:
|
||||
try:
|
||||
pid = int(parts[1])
|
||||
if pid not in pids:
|
||||
pids.append(pid)
|
||||
except ValueError:
|
||||
continue
|
||||
break
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
return pids
|
||||
|
||||
|
||||
def kill_gateway_processes(force: bool = False) -> int:
|
||||
"""Kill any running gateway processes. Returns count killed."""
|
||||
pids = find_gateway_pids()
|
||||
killed = 0
|
||||
|
||||
for pid in pids:
|
||||
try:
|
||||
if force:
|
||||
os.kill(pid, signal.SIGKILL)
|
||||
else:
|
||||
os.kill(pid, signal.SIGTERM)
|
||||
killed += 1
|
||||
except ProcessLookupError:
|
||||
# Process already gone
|
||||
pass
|
||||
except PermissionError:
|
||||
print(f"⚠ Permission denied to kill PID {pid}")
|
||||
|
||||
return killed
|
||||
|
||||
|
||||
def is_linux() -> bool:
|
||||
return sys.platform.startswith('linux')
|
||||
|
||||
def is_macos() -> bool:
|
||||
return sys.platform == 'darwin'
|
||||
|
||||
def is_windows() -> bool:
|
||||
return sys.platform == 'win32'
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Service Configuration
|
||||
# =============================================================================
|
||||
|
||||
SERVICE_NAME = "hermes-gateway"
|
||||
SERVICE_DESCRIPTION = "Hermes Agent Gateway - Messaging Platform Integration"
|
||||
|
||||
def get_systemd_unit_path() -> Path:
|
||||
return Path.home() / ".config" / "systemd" / "user" / f"{SERVICE_NAME}.service"
|
||||
|
||||
def get_launchd_plist_path() -> Path:
|
||||
return Path.home() / "Library" / "LaunchAgents" / "ai.hermes.gateway.plist"
|
||||
|
||||
def get_python_path() -> str:
|
||||
venv_python = PROJECT_ROOT / "venv" / "bin" / "python"
|
||||
if venv_python.exists():
|
||||
return str(venv_python)
|
||||
return sys.executable
|
||||
|
||||
def get_hermes_cli_path() -> str:
|
||||
"""Get the path to the hermes CLI."""
|
||||
# Check if installed via pip
|
||||
import shutil
|
||||
hermes_bin = shutil.which("hermes")
|
||||
if hermes_bin:
|
||||
return hermes_bin
|
||||
|
||||
# Fallback to direct module execution
|
||||
return f"{get_python_path()} -m hermes_cli.main"
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Systemd (Linux)
|
||||
# =============================================================================
|
||||
|
||||
def generate_systemd_unit() -> str:
|
||||
python_path = get_python_path()
|
||||
working_dir = str(PROJECT_ROOT)
|
||||
|
||||
return f"""[Unit]
|
||||
Description={SERVICE_DESCRIPTION}
|
||||
After=network.target
|
||||
|
||||
[Service]
|
||||
Type=simple
|
||||
ExecStart={python_path} -m hermes_cli.main gateway run
|
||||
WorkingDirectory={working_dir}
|
||||
Restart=on-failure
|
||||
RestartSec=10
|
||||
StandardOutput=journal
|
||||
StandardError=journal
|
||||
|
||||
[Install]
|
||||
WantedBy=default.target
|
||||
"""
|
||||
|
||||
def systemd_install(force: bool = False):
|
||||
unit_path = get_systemd_unit_path()
|
||||
|
||||
if unit_path.exists() and not force:
|
||||
print(f"Service already installed at: {unit_path}")
|
||||
print("Use --force to reinstall")
|
||||
return
|
||||
|
||||
unit_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
print(f"Installing systemd service to: {unit_path}")
|
||||
unit_path.write_text(generate_systemd_unit())
|
||||
|
||||
subprocess.run(["systemctl", "--user", "daemon-reload"], check=True)
|
||||
subprocess.run(["systemctl", "--user", "enable", SERVICE_NAME], check=True)
|
||||
|
||||
print()
|
||||
print("✓ Service installed and enabled!")
|
||||
print()
|
||||
print("Next steps:")
|
||||
print(f" hermes gateway start # Start the service")
|
||||
print(f" hermes gateway status # Check status")
|
||||
print(f" journalctl --user -u {SERVICE_NAME} -f # View logs")
|
||||
print()
|
||||
print("To enable lingering (keeps running after logout):")
|
||||
print(" sudo loginctl enable-linger $USER")
|
||||
|
||||
def systemd_uninstall():
|
||||
subprocess.run(["systemctl", "--user", "stop", SERVICE_NAME], check=False)
|
||||
subprocess.run(["systemctl", "--user", "disable", SERVICE_NAME], check=False)
|
||||
|
||||
unit_path = get_systemd_unit_path()
|
||||
if unit_path.exists():
|
||||
unit_path.unlink()
|
||||
print(f"✓ Removed {unit_path}")
|
||||
|
||||
subprocess.run(["systemctl", "--user", "daemon-reload"], check=True)
|
||||
print("✓ Service uninstalled")
|
||||
|
||||
def systemd_start():
|
||||
subprocess.run(["systemctl", "--user", "start", SERVICE_NAME], check=True)
|
||||
print("✓ Service started")
|
||||
|
||||
def systemd_stop():
|
||||
subprocess.run(["systemctl", "--user", "stop", SERVICE_NAME], check=True)
|
||||
print("✓ Service stopped")
|
||||
|
||||
def systemd_restart():
|
||||
subprocess.run(["systemctl", "--user", "restart", SERVICE_NAME], check=True)
|
||||
print("✓ Service restarted")
|
||||
|
||||
def systemd_status(deep: bool = False):
|
||||
# Check if service unit file exists
|
||||
unit_path = get_systemd_unit_path()
|
||||
if not unit_path.exists():
|
||||
print("✗ Gateway service is not installed")
|
||||
print(" Run: hermes gateway install")
|
||||
return
|
||||
|
||||
# Show detailed status first
|
||||
subprocess.run(
|
||||
["systemctl", "--user", "status", SERVICE_NAME, "--no-pager"],
|
||||
capture_output=False
|
||||
)
|
||||
|
||||
# Check if service is active
|
||||
result = subprocess.run(
|
||||
["systemctl", "--user", "is-active", SERVICE_NAME],
|
||||
capture_output=True,
|
||||
text=True
|
||||
)
|
||||
|
||||
status = result.stdout.strip()
|
||||
|
||||
if status == "active":
|
||||
print("✓ Gateway service is running")
|
||||
else:
|
||||
print("✗ Gateway service is stopped")
|
||||
print(" Run: hermes gateway start")
|
||||
|
||||
if deep:
|
||||
print()
|
||||
print("Recent logs:")
|
||||
subprocess.run([
|
||||
"journalctl", "--user", "-u", SERVICE_NAME,
|
||||
"-n", "20", "--no-pager"
|
||||
])
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Launchd (macOS)
|
||||
# =============================================================================
|
||||
|
||||
def generate_launchd_plist() -> str:
|
||||
python_path = get_python_path()
|
||||
working_dir = str(PROJECT_ROOT)
|
||||
log_dir = Path.home() / ".hermes" / "logs"
|
||||
log_dir.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
return f"""<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
|
||||
<plist version="1.0">
|
||||
<dict>
|
||||
<key>Label</key>
|
||||
<string>ai.hermes.gateway</string>
|
||||
|
||||
<key>ProgramArguments</key>
|
||||
<array>
|
||||
<string>{python_path}</string>
|
||||
<string>-m</string>
|
||||
<string>hermes_cli.main</string>
|
||||
<string>gateway</string>
|
||||
<string>run</string>
|
||||
</array>
|
||||
|
||||
<key>WorkingDirectory</key>
|
||||
<string>{working_dir}</string>
|
||||
|
||||
<key>RunAtLoad</key>
|
||||
<true/>
|
||||
|
||||
<key>KeepAlive</key>
|
||||
<dict>
|
||||
<key>SuccessfulExit</key>
|
||||
<false/>
|
||||
</dict>
|
||||
|
||||
<key>StandardOutPath</key>
|
||||
<string>{log_dir}/gateway.log</string>
|
||||
|
||||
<key>StandardErrorPath</key>
|
||||
<string>{log_dir}/gateway.error.log</string>
|
||||
</dict>
|
||||
</plist>
|
||||
"""
|
||||
|
||||
def launchd_install(force: bool = False):
|
||||
plist_path = get_launchd_plist_path()
|
||||
|
||||
if plist_path.exists() and not force:
|
||||
print(f"Service already installed at: {plist_path}")
|
||||
print("Use --force to reinstall")
|
||||
return
|
||||
|
||||
plist_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
print(f"Installing launchd service to: {plist_path}")
|
||||
plist_path.write_text(generate_launchd_plist())
|
||||
|
||||
subprocess.run(["launchctl", "load", str(plist_path)], check=True)
|
||||
|
||||
print()
|
||||
print("✓ Service installed and loaded!")
|
||||
print()
|
||||
print("Next steps:")
|
||||
print(" hermes gateway status # Check status")
|
||||
print(" tail -f ~/.hermes/logs/gateway.log # View logs")
|
||||
|
||||
def launchd_uninstall():
|
||||
plist_path = get_launchd_plist_path()
|
||||
subprocess.run(["launchctl", "unload", str(plist_path)], check=False)
|
||||
|
||||
if plist_path.exists():
|
||||
plist_path.unlink()
|
||||
print(f"✓ Removed {plist_path}")
|
||||
|
||||
print("✓ Service uninstalled")
|
||||
|
||||
def launchd_start():
|
||||
subprocess.run(["launchctl", "start", "ai.hermes.gateway"], check=True)
|
||||
print("✓ Service started")
|
||||
|
||||
def launchd_stop():
|
||||
subprocess.run(["launchctl", "stop", "ai.hermes.gateway"], check=True)
|
||||
print("✓ Service stopped")
|
||||
|
||||
def launchd_restart():
|
||||
launchd_stop()
|
||||
launchd_start()
|
||||
|
||||
def launchd_status(deep: bool = False):
|
||||
result = subprocess.run(
|
||||
["launchctl", "list", "ai.hermes.gateway"],
|
||||
capture_output=True,
|
||||
text=True
|
||||
)
|
||||
|
||||
if result.returncode == 0:
|
||||
print("✓ Gateway service is loaded")
|
||||
print(result.stdout)
|
||||
else:
|
||||
print("✗ Gateway service is not loaded")
|
||||
|
||||
if deep:
|
||||
log_file = Path.home() / ".hermes" / "logs" / "gateway.log"
|
||||
if log_file.exists():
|
||||
print()
|
||||
print("Recent logs:")
|
||||
subprocess.run(["tail", "-20", str(log_file)])
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Gateway Runner
|
||||
# =============================================================================
|
||||
|
||||
def run_gateway(verbose: bool = False):
|
||||
"""Run the gateway in foreground."""
|
||||
sys.path.insert(0, str(PROJECT_ROOT))
|
||||
|
||||
from gateway.run import start_gateway
|
||||
|
||||
print("┌─────────────────────────────────────────────────────────┐")
|
||||
print("│ 🦋 Hermes Gateway Starting... │")
|
||||
print("├─────────────────────────────────────────────────────────┤")
|
||||
print("│ Press Ctrl+C to stop │")
|
||||
print("└─────────────────────────────────────────────────────────┘")
|
||||
print()
|
||||
|
||||
asyncio.run(start_gateway())
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Main Command Handler
|
||||
# =============================================================================
|
||||
|
||||
def gateway_command(args):
|
||||
"""Handle gateway subcommands."""
|
||||
subcmd = getattr(args, 'gateway_command', None)
|
||||
|
||||
# Default to run if no subcommand
|
||||
if subcmd is None or subcmd == "run":
|
||||
verbose = getattr(args, 'verbose', False)
|
||||
run_gateway(verbose)
|
||||
return
|
||||
|
||||
# Service management commands
|
||||
if subcmd == "install":
|
||||
force = getattr(args, 'force', False)
|
||||
if is_linux():
|
||||
systemd_install(force)
|
||||
elif is_macos():
|
||||
launchd_install(force)
|
||||
else:
|
||||
print("Service installation not supported on this platform.")
|
||||
print("Run manually: hermes gateway run")
|
||||
sys.exit(1)
|
||||
|
||||
elif subcmd == "uninstall":
|
||||
if is_linux():
|
||||
systemd_uninstall()
|
||||
elif is_macos():
|
||||
launchd_uninstall()
|
||||
else:
|
||||
print("Not supported on this platform.")
|
||||
sys.exit(1)
|
||||
|
||||
elif subcmd == "start":
|
||||
if is_linux():
|
||||
systemd_start()
|
||||
elif is_macos():
|
||||
launchd_start()
|
||||
else:
|
||||
print("Not supported on this platform.")
|
||||
sys.exit(1)
|
||||
|
||||
elif subcmd == "stop":
|
||||
# Try service first, fall back to killing processes directly
|
||||
service_available = False
|
||||
|
||||
if is_linux() and get_systemd_unit_path().exists():
|
||||
try:
|
||||
systemd_stop()
|
||||
service_available = True
|
||||
except subprocess.CalledProcessError:
|
||||
pass # Fall through to process kill
|
||||
elif is_macos() and get_launchd_plist_path().exists():
|
||||
try:
|
||||
launchd_stop()
|
||||
service_available = True
|
||||
except subprocess.CalledProcessError:
|
||||
pass
|
||||
|
||||
if not service_available:
|
||||
# Kill gateway processes directly
|
||||
killed = kill_gateway_processes()
|
||||
if killed:
|
||||
print(f"✓ Stopped {killed} gateway process(es)")
|
||||
else:
|
||||
print("✗ No gateway processes found")
|
||||
|
||||
elif subcmd == "restart":
|
||||
# Try service first, fall back to killing and restarting
|
||||
service_available = False
|
||||
|
||||
if is_linux() and get_systemd_unit_path().exists():
|
||||
try:
|
||||
systemd_restart()
|
||||
service_available = True
|
||||
except subprocess.CalledProcessError:
|
||||
pass
|
||||
elif is_macos() and get_launchd_plist_path().exists():
|
||||
try:
|
||||
launchd_restart()
|
||||
service_available = True
|
||||
except subprocess.CalledProcessError:
|
||||
pass
|
||||
|
||||
if not service_available:
|
||||
# Manual restart: kill existing processes
|
||||
killed = kill_gateway_processes()
|
||||
if killed:
|
||||
print(f"✓ Stopped {killed} gateway process(es)")
|
||||
|
||||
import time
|
||||
time.sleep(2)
|
||||
|
||||
# Start fresh
|
||||
print("Starting gateway...")
|
||||
run_gateway(verbose=False)
|
||||
|
||||
elif subcmd == "status":
|
||||
deep = getattr(args, 'deep', False)
|
||||
|
||||
# Check for service first
|
||||
if is_linux() and get_systemd_unit_path().exists():
|
||||
systemd_status(deep)
|
||||
elif is_macos() and get_launchd_plist_path().exists():
|
||||
launchd_status(deep)
|
||||
else:
|
||||
# Check for manually running processes
|
||||
pids = find_gateway_pids()
|
||||
if pids:
|
||||
print(f"✓ Gateway is running (PID: {', '.join(map(str, pids))})")
|
||||
print(" (Running manually, not as a system service)")
|
||||
print()
|
||||
print("To install as a service:")
|
||||
print(" hermes gateway install")
|
||||
else:
|
||||
print("✗ Gateway is not running")
|
||||
print()
|
||||
print("To start:")
|
||||
print(" hermes gateway # Run in foreground")
|
||||
print(" hermes gateway install # Install as service")
|
||||
507
hermes_cli/main.py
Normal file
507
hermes_cli/main.py
Normal file
@@ -0,0 +1,507 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Hermes CLI - Main entry point.
|
||||
|
||||
Usage:
|
||||
hermes # Interactive chat (default)
|
||||
hermes chat # Interactive chat
|
||||
hermes gateway # Run gateway in foreground
|
||||
hermes gateway start # Start gateway as service
|
||||
hermes gateway stop # Stop gateway service
|
||||
hermes gateway status # Show gateway status
|
||||
hermes gateway install # Install gateway service
|
||||
hermes gateway uninstall # Uninstall gateway service
|
||||
hermes setup # Interactive setup wizard
|
||||
hermes status # Show status of all components
|
||||
hermes cron # Manage cron jobs
|
||||
hermes cron list # List cron jobs
|
||||
hermes cron daemon # Run cron daemon
|
||||
hermes doctor # Check configuration and dependencies
|
||||
hermes version # Show version
|
||||
hermes update # Update to latest version
|
||||
hermes uninstall # Uninstall Hermes Agent
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import os
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
# Add project root to path
|
||||
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
|
||||
sys.path.insert(0, str(PROJECT_ROOT))
|
||||
|
||||
# Load .env file
|
||||
from dotenv import load_dotenv
|
||||
env_path = PROJECT_ROOT / '.env'
|
||||
if env_path.exists():
|
||||
load_dotenv(dotenv_path=env_path)
|
||||
|
||||
from hermes_cli import __version__
|
||||
|
||||
|
||||
def cmd_chat(args):
|
||||
"""Run interactive chat CLI."""
|
||||
# Import and run the CLI
|
||||
from cli import main as cli_main
|
||||
|
||||
# Build kwargs from args
|
||||
kwargs = {
|
||||
"model": args.model,
|
||||
"toolsets": args.toolsets,
|
||||
"verbose": args.verbose,
|
||||
"query": args.query,
|
||||
}
|
||||
# Filter out None values
|
||||
kwargs = {k: v for k, v in kwargs.items() if v is not None}
|
||||
|
||||
cli_main(**kwargs)
|
||||
|
||||
|
||||
def cmd_gateway(args):
|
||||
"""Gateway management commands."""
|
||||
from hermes_cli.gateway import gateway_command
|
||||
gateway_command(args)
|
||||
|
||||
|
||||
def cmd_setup(args):
|
||||
"""Interactive setup wizard."""
|
||||
from hermes_cli.setup import run_setup_wizard
|
||||
run_setup_wizard(args)
|
||||
|
||||
|
||||
def cmd_status(args):
|
||||
"""Show status of all components."""
|
||||
from hermes_cli.status import show_status
|
||||
show_status(args)
|
||||
|
||||
|
||||
def cmd_cron(args):
|
||||
"""Cron job management."""
|
||||
from hermes_cli.cron import cron_command
|
||||
cron_command(args)
|
||||
|
||||
|
||||
def cmd_doctor(args):
|
||||
"""Check configuration and dependencies."""
|
||||
from hermes_cli.doctor import run_doctor
|
||||
run_doctor(args)
|
||||
|
||||
|
||||
def cmd_config(args):
|
||||
"""Configuration management."""
|
||||
from hermes_cli.config import config_command
|
||||
config_command(args)
|
||||
|
||||
|
||||
def cmd_version(args):
|
||||
"""Show version."""
|
||||
print(f"Hermes Agent v{__version__}")
|
||||
print(f"Project: {PROJECT_ROOT}")
|
||||
|
||||
# Show Python version
|
||||
print(f"Python: {sys.version.split()[0]}")
|
||||
|
||||
# Check for key dependencies
|
||||
try:
|
||||
import openai
|
||||
print(f"OpenAI SDK: {openai.__version__}")
|
||||
except ImportError:
|
||||
print("OpenAI SDK: Not installed")
|
||||
|
||||
|
||||
def cmd_uninstall(args):
|
||||
"""Uninstall Hermes Agent."""
|
||||
from hermes_cli.uninstall import run_uninstall
|
||||
run_uninstall(args)
|
||||
|
||||
|
||||
def cmd_update(args):
|
||||
"""Update Hermes Agent to the latest version."""
|
||||
import subprocess
|
||||
|
||||
print("🦋 Updating Hermes Agent...")
|
||||
print()
|
||||
|
||||
# Check if we're in a git repo
|
||||
git_dir = PROJECT_ROOT / '.git'
|
||||
if not git_dir.exists():
|
||||
print("✗ Not a git repository. Please reinstall:")
|
||||
print(" curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash")
|
||||
sys.exit(1)
|
||||
|
||||
# Fetch and pull
|
||||
try:
|
||||
print("→ Fetching updates...")
|
||||
subprocess.run(["git", "fetch", "origin"], cwd=PROJECT_ROOT, check=True)
|
||||
|
||||
# Get current branch
|
||||
result = subprocess.run(
|
||||
["git", "rev-parse", "--abbrev-ref", "HEAD"],
|
||||
cwd=PROJECT_ROOT,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
check=True
|
||||
)
|
||||
branch = result.stdout.strip()
|
||||
|
||||
# Check if there are updates
|
||||
result = subprocess.run(
|
||||
["git", "rev-list", f"HEAD..origin/{branch}", "--count"],
|
||||
cwd=PROJECT_ROOT,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
check=True
|
||||
)
|
||||
commit_count = int(result.stdout.strip())
|
||||
|
||||
if commit_count == 0:
|
||||
print("✓ Already up to date!")
|
||||
return
|
||||
|
||||
print(f"→ Found {commit_count} new commit(s)")
|
||||
print("→ Pulling updates...")
|
||||
subprocess.run(["git", "pull", "origin", branch], cwd=PROJECT_ROOT, check=True)
|
||||
|
||||
# Reinstall Python dependencies
|
||||
print("→ Updating Python dependencies...")
|
||||
venv_pip = PROJECT_ROOT / "venv" / "bin" / "pip"
|
||||
if venv_pip.exists():
|
||||
subprocess.run([str(venv_pip), "install", "-e", ".", "--quiet"], cwd=PROJECT_ROOT, check=True)
|
||||
else:
|
||||
subprocess.run(["pip", "install", "-e", ".", "--quiet"], cwd=PROJECT_ROOT, check=True)
|
||||
|
||||
# Check for Node.js deps
|
||||
if (PROJECT_ROOT / "package.json").exists():
|
||||
import shutil
|
||||
if shutil.which("npm"):
|
||||
print("→ Updating Node.js dependencies...")
|
||||
subprocess.run(["npm", "install", "--silent"], cwd=PROJECT_ROOT, check=False)
|
||||
|
||||
print()
|
||||
print("✓ Code updated!")
|
||||
|
||||
# Check for config migrations
|
||||
print()
|
||||
print("→ Checking configuration for new options...")
|
||||
|
||||
from hermes_cli.config import (
|
||||
get_missing_env_vars, get_missing_config_fields,
|
||||
check_config_version, migrate_config
|
||||
)
|
||||
|
||||
missing_env = get_missing_env_vars(required_only=True)
|
||||
missing_config = get_missing_config_fields()
|
||||
current_ver, latest_ver = check_config_version()
|
||||
|
||||
needs_migration = missing_env or missing_config or current_ver < latest_ver
|
||||
|
||||
if needs_migration:
|
||||
print()
|
||||
if missing_env:
|
||||
print(f" ⚠️ {len(missing_env)} new required setting(s) need configuration")
|
||||
if missing_config:
|
||||
print(f" ℹ️ {len(missing_config)} new config option(s) available")
|
||||
|
||||
print()
|
||||
response = input("Would you like to configure them now? [Y/n]: ").strip().lower()
|
||||
|
||||
if response in ('', 'y', 'yes'):
|
||||
print()
|
||||
results = migrate_config(interactive=True, quiet=False)
|
||||
|
||||
if results["env_added"] or results["config_added"]:
|
||||
print()
|
||||
print("✓ Configuration updated!")
|
||||
else:
|
||||
print()
|
||||
print("Skipped. Run 'hermes config migrate' later to configure.")
|
||||
else:
|
||||
print(" ✓ Configuration is up to date")
|
||||
|
||||
print()
|
||||
print("✓ Update complete!")
|
||||
print()
|
||||
print("Note: If you have the gateway service running, restart it:")
|
||||
print(" hermes gateway restart")
|
||||
|
||||
except subprocess.CalledProcessError as e:
|
||||
print(f"✗ Update failed: {e}")
|
||||
sys.exit(1)
|
||||
|
||||
|
||||
def main():
|
||||
"""Main entry point for hermes CLI."""
|
||||
parser = argparse.ArgumentParser(
|
||||
prog="hermes",
|
||||
description="Hermes Agent - AI assistant with tool-calling capabilities",
|
||||
formatter_class=argparse.RawDescriptionHelpFormatter,
|
||||
epilog="""
|
||||
Examples:
|
||||
hermes Start interactive chat
|
||||
hermes chat -q "Hello" Single query mode
|
||||
hermes setup Run setup wizard
|
||||
hermes config View configuration
|
||||
hermes config edit Edit config in $EDITOR
|
||||
hermes config set model gpt-4 Set a config value
|
||||
hermes gateway Run messaging gateway
|
||||
hermes gateway install Install as system service
|
||||
hermes update Update to latest version
|
||||
|
||||
For more help on a command:
|
||||
hermes <command> --help
|
||||
"""
|
||||
)
|
||||
|
||||
parser.add_argument(
|
||||
"--version", "-V",
|
||||
action="store_true",
|
||||
help="Show version and exit"
|
||||
)
|
||||
|
||||
subparsers = parser.add_subparsers(dest="command", help="Command to run")
|
||||
|
||||
# =========================================================================
|
||||
# chat command
|
||||
# =========================================================================
|
||||
chat_parser = subparsers.add_parser(
|
||||
"chat",
|
||||
help="Interactive chat with the agent",
|
||||
description="Start an interactive chat session with Hermes Agent"
|
||||
)
|
||||
chat_parser.add_argument(
|
||||
"-q", "--query",
|
||||
help="Single query (non-interactive mode)"
|
||||
)
|
||||
chat_parser.add_argument(
|
||||
"-m", "--model",
|
||||
help="Model to use (e.g., anthropic/claude-sonnet-4)"
|
||||
)
|
||||
chat_parser.add_argument(
|
||||
"-t", "--toolsets",
|
||||
help="Comma-separated toolsets to enable"
|
||||
)
|
||||
chat_parser.add_argument(
|
||||
"-v", "--verbose",
|
||||
action="store_true",
|
||||
help="Verbose output"
|
||||
)
|
||||
chat_parser.set_defaults(func=cmd_chat)
|
||||
|
||||
# =========================================================================
|
||||
# gateway command
|
||||
# =========================================================================
|
||||
gateway_parser = subparsers.add_parser(
|
||||
"gateway",
|
||||
help="Messaging gateway management",
|
||||
description="Manage the messaging gateway (Telegram, Discord, WhatsApp)"
|
||||
)
|
||||
gateway_subparsers = gateway_parser.add_subparsers(dest="gateway_command")
|
||||
|
||||
# gateway run (default)
|
||||
gateway_run = gateway_subparsers.add_parser("run", help="Run gateway in foreground")
|
||||
gateway_run.add_argument("-v", "--verbose", action="store_true")
|
||||
|
||||
# gateway start
|
||||
gateway_start = gateway_subparsers.add_parser("start", help="Start gateway service")
|
||||
|
||||
# gateway stop
|
||||
gateway_stop = gateway_subparsers.add_parser("stop", help="Stop gateway service")
|
||||
|
||||
# gateway restart
|
||||
gateway_restart = gateway_subparsers.add_parser("restart", help="Restart gateway service")
|
||||
|
||||
# gateway status
|
||||
gateway_status = gateway_subparsers.add_parser("status", help="Show gateway status")
|
||||
gateway_status.add_argument("--deep", action="store_true", help="Deep status check")
|
||||
|
||||
# gateway install
|
||||
gateway_install = gateway_subparsers.add_parser("install", help="Install gateway as service")
|
||||
gateway_install.add_argument("--force", action="store_true", help="Force reinstall")
|
||||
|
||||
# gateway uninstall
|
||||
gateway_uninstall = gateway_subparsers.add_parser("uninstall", help="Uninstall gateway service")
|
||||
|
||||
gateway_parser.set_defaults(func=cmd_gateway)
|
||||
|
||||
# =========================================================================
|
||||
# setup command
|
||||
# =========================================================================
|
||||
setup_parser = subparsers.add_parser(
|
||||
"setup",
|
||||
help="Interactive setup wizard",
|
||||
description="Configure Hermes Agent with an interactive wizard"
|
||||
)
|
||||
setup_parser.add_argument(
|
||||
"--non-interactive",
|
||||
action="store_true",
|
||||
help="Non-interactive mode (use defaults/env vars)"
|
||||
)
|
||||
setup_parser.add_argument(
|
||||
"--reset",
|
||||
action="store_true",
|
||||
help="Reset configuration to defaults"
|
||||
)
|
||||
setup_parser.set_defaults(func=cmd_setup)
|
||||
|
||||
# =========================================================================
|
||||
# status command
|
||||
# =========================================================================
|
||||
status_parser = subparsers.add_parser(
|
||||
"status",
|
||||
help="Show status of all components",
|
||||
description="Display status of Hermes Agent components"
|
||||
)
|
||||
status_parser.add_argument(
|
||||
"--all",
|
||||
action="store_true",
|
||||
help="Show all details (redacted for sharing)"
|
||||
)
|
||||
status_parser.add_argument(
|
||||
"--deep",
|
||||
action="store_true",
|
||||
help="Run deep checks (may take longer)"
|
||||
)
|
||||
status_parser.set_defaults(func=cmd_status)
|
||||
|
||||
# =========================================================================
|
||||
# cron command
|
||||
# =========================================================================
|
||||
cron_parser = subparsers.add_parser(
|
||||
"cron",
|
||||
help="Cron job management",
|
||||
description="Manage scheduled tasks"
|
||||
)
|
||||
cron_subparsers = cron_parser.add_subparsers(dest="cron_command")
|
||||
|
||||
# cron list
|
||||
cron_list = cron_subparsers.add_parser("list", help="List scheduled jobs")
|
||||
cron_list.add_argument("--all", action="store_true", help="Include disabled jobs")
|
||||
|
||||
# cron daemon
|
||||
cron_daemon = cron_subparsers.add_parser("daemon", help="Run cron daemon")
|
||||
cron_daemon.add_argument("--interval", type=int, default=60, help="Check interval in seconds")
|
||||
|
||||
# cron tick
|
||||
cron_tick = cron_subparsers.add_parser("tick", help="Run due jobs once (for system cron)")
|
||||
|
||||
cron_parser.set_defaults(func=cmd_cron)
|
||||
|
||||
# =========================================================================
|
||||
# doctor command
|
||||
# =========================================================================
|
||||
doctor_parser = subparsers.add_parser(
|
||||
"doctor",
|
||||
help="Check configuration and dependencies",
|
||||
description="Diagnose issues with Hermes Agent setup"
|
||||
)
|
||||
doctor_parser.add_argument(
|
||||
"--fix",
|
||||
action="store_true",
|
||||
help="Attempt to fix issues automatically"
|
||||
)
|
||||
doctor_parser.set_defaults(func=cmd_doctor)
|
||||
|
||||
# =========================================================================
|
||||
# config command
|
||||
# =========================================================================
|
||||
config_parser = subparsers.add_parser(
|
||||
"config",
|
||||
help="View and edit configuration",
|
||||
description="Manage Hermes Agent configuration"
|
||||
)
|
||||
config_subparsers = config_parser.add_subparsers(dest="config_command")
|
||||
|
||||
# config show (default)
|
||||
config_show = config_subparsers.add_parser("show", help="Show current configuration")
|
||||
|
||||
# config edit
|
||||
config_edit = config_subparsers.add_parser("edit", help="Open config file in editor")
|
||||
|
||||
# config set
|
||||
config_set = config_subparsers.add_parser("set", help="Set a configuration value")
|
||||
config_set.add_argument("key", nargs="?", help="Configuration key (e.g., model, terminal.backend)")
|
||||
config_set.add_argument("value", nargs="?", help="Value to set")
|
||||
|
||||
# config path
|
||||
config_path = config_subparsers.add_parser("path", help="Print config file path")
|
||||
|
||||
# config env-path
|
||||
config_env = config_subparsers.add_parser("env-path", help="Print .env file path")
|
||||
|
||||
# config check
|
||||
config_check = config_subparsers.add_parser("check", help="Check for missing/outdated config")
|
||||
|
||||
# config migrate
|
||||
config_migrate = config_subparsers.add_parser("migrate", help="Update config with new options")
|
||||
|
||||
config_parser.set_defaults(func=cmd_config)
|
||||
|
||||
# =========================================================================
|
||||
# version command
|
||||
# =========================================================================
|
||||
version_parser = subparsers.add_parser(
|
||||
"version",
|
||||
help="Show version information"
|
||||
)
|
||||
version_parser.set_defaults(func=cmd_version)
|
||||
|
||||
# =========================================================================
|
||||
# update command
|
||||
# =========================================================================
|
||||
update_parser = subparsers.add_parser(
|
||||
"update",
|
||||
help="Update Hermes Agent to the latest version",
|
||||
description="Pull the latest changes from git and reinstall dependencies"
|
||||
)
|
||||
update_parser.set_defaults(func=cmd_update)
|
||||
|
||||
# =========================================================================
|
||||
# uninstall command
|
||||
# =========================================================================
|
||||
uninstall_parser = subparsers.add_parser(
|
||||
"uninstall",
|
||||
help="Uninstall Hermes Agent",
|
||||
description="Remove Hermes Agent from your system. Can keep configs/data for reinstall."
|
||||
)
|
||||
uninstall_parser.add_argument(
|
||||
"--full",
|
||||
action="store_true",
|
||||
help="Full uninstall - remove everything including configs and data"
|
||||
)
|
||||
uninstall_parser.add_argument(
|
||||
"--yes", "-y",
|
||||
action="store_true",
|
||||
help="Skip confirmation prompts"
|
||||
)
|
||||
uninstall_parser.set_defaults(func=cmd_uninstall)
|
||||
|
||||
# =========================================================================
|
||||
# Parse and execute
|
||||
# =========================================================================
|
||||
args = parser.parse_args()
|
||||
|
||||
# Handle --version flag
|
||||
if args.version:
|
||||
cmd_version(args)
|
||||
return
|
||||
|
||||
# Default to chat if no command specified
|
||||
if args.command is None:
|
||||
# No command = run chat
|
||||
args.query = None
|
||||
args.model = None
|
||||
args.toolsets = None
|
||||
args.verbose = False
|
||||
cmd_chat(args)
|
||||
return
|
||||
|
||||
# Execute the command
|
||||
if hasattr(args, 'func'):
|
||||
args.func(args)
|
||||
else:
|
||||
parser.print_help()
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
989
hermes_cli/setup.py
Normal file
989
hermes_cli/setup.py
Normal file
@@ -0,0 +1,989 @@
|
||||
"""
|
||||
Interactive setup wizard for Hermes Agent.
|
||||
|
||||
Guides users through:
|
||||
1. Installation directory confirmation
|
||||
2. API key configuration
|
||||
3. Model selection
|
||||
4. Terminal backend selection
|
||||
5. Messaging platform setup
|
||||
6. Optional features
|
||||
|
||||
Config files are stored in ~/.hermes/ for easy access.
|
||||
"""
|
||||
|
||||
import os
|
||||
import sys
|
||||
from pathlib import Path
|
||||
from typing import Optional, Dict, Any
|
||||
|
||||
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
|
||||
|
||||
# Import config helpers
|
||||
from hermes_cli.config import (
|
||||
get_hermes_home, get_config_path, get_env_path,
|
||||
load_config, save_config, save_env_value, get_env_value,
|
||||
ensure_hermes_home, DEFAULT_CONFIG
|
||||
)
|
||||
|
||||
# ANSI colors
|
||||
class Colors:
|
||||
RESET = "\033[0m"
|
||||
BOLD = "\033[1m"
|
||||
DIM = "\033[2m"
|
||||
RED = "\033[31m"
|
||||
GREEN = "\033[32m"
|
||||
YELLOW = "\033[33m"
|
||||
BLUE = "\033[34m"
|
||||
MAGENTA = "\033[35m"
|
||||
CYAN = "\033[36m"
|
||||
|
||||
def color(text: str, *codes) -> str:
|
||||
"""Apply color codes to text."""
|
||||
if not sys.stdout.isatty():
|
||||
return text
|
||||
return "".join(codes) + text + Colors.RESET
|
||||
|
||||
def print_header(title: str):
|
||||
"""Print a section header."""
|
||||
print()
|
||||
print(color(f"◆ {title}", Colors.CYAN, Colors.BOLD))
|
||||
|
||||
def print_info(text: str):
|
||||
"""Print info text."""
|
||||
print(color(f" {text}", Colors.DIM))
|
||||
|
||||
def print_success(text: str):
|
||||
"""Print success message."""
|
||||
print(color(f"✓ {text}", Colors.GREEN))
|
||||
|
||||
def print_warning(text: str):
|
||||
"""Print warning message."""
|
||||
print(color(f"⚠ {text}", Colors.YELLOW))
|
||||
|
||||
def print_error(text: str):
|
||||
"""Print error message."""
|
||||
print(color(f"✗ {text}", Colors.RED))
|
||||
|
||||
def prompt(question: str, default: str = None, password: bool = False) -> str:
|
||||
"""Prompt for input with optional default."""
|
||||
if default:
|
||||
display = f"{question} [{default}]: "
|
||||
else:
|
||||
display = f"{question}: "
|
||||
|
||||
try:
|
||||
if password:
|
||||
import getpass
|
||||
value = getpass.getpass(color(display, Colors.YELLOW))
|
||||
else:
|
||||
value = input(color(display, Colors.YELLOW))
|
||||
|
||||
return value.strip() or default or ""
|
||||
except (KeyboardInterrupt, EOFError):
|
||||
print()
|
||||
sys.exit(1)
|
||||
|
||||
def prompt_choice(question: str, choices: list, default: int = 0) -> int:
|
||||
"""Prompt for a choice from a list with arrow key navigation."""
|
||||
print(color(question, Colors.YELLOW))
|
||||
|
||||
# Try to use interactive menu if available
|
||||
try:
|
||||
from simple_term_menu import TerminalMenu
|
||||
|
||||
# Add visual indicators
|
||||
menu_choices = [f" {choice}" for choice in choices]
|
||||
|
||||
terminal_menu = TerminalMenu(
|
||||
menu_choices,
|
||||
cursor_index=default,
|
||||
menu_cursor="→ ",
|
||||
menu_cursor_style=("fg_green", "bold"),
|
||||
menu_highlight_style=("fg_green",),
|
||||
cycle_cursor=True,
|
||||
clear_screen=False,
|
||||
)
|
||||
|
||||
idx = terminal_menu.show()
|
||||
if idx is None: # User pressed Escape or Ctrl+C
|
||||
print()
|
||||
sys.exit(1)
|
||||
print() # Add newline after selection
|
||||
return idx
|
||||
|
||||
except ImportError:
|
||||
# Fallback to number-based selection
|
||||
for i, choice in enumerate(choices):
|
||||
marker = "●" if i == default else "○"
|
||||
if i == default:
|
||||
print(color(f" {marker} {choice}", Colors.GREEN))
|
||||
else:
|
||||
print(f" {marker} {choice}")
|
||||
|
||||
while True:
|
||||
try:
|
||||
value = input(color(f" Select [1-{len(choices)}] ({default + 1}): ", Colors.DIM))
|
||||
if not value:
|
||||
return default
|
||||
idx = int(value) - 1
|
||||
if 0 <= idx < len(choices):
|
||||
return idx
|
||||
print_error(f"Please enter a number between 1 and {len(choices)}")
|
||||
except ValueError:
|
||||
print_error("Please enter a number")
|
||||
except (KeyboardInterrupt, EOFError):
|
||||
print()
|
||||
sys.exit(1)
|
||||
|
||||
def prompt_yes_no(question: str, default: bool = True) -> bool:
|
||||
"""Prompt for yes/no."""
|
||||
default_str = "Y/n" if default else "y/N"
|
||||
|
||||
while True:
|
||||
value = input(color(f"{question} [{default_str}]: ", Colors.YELLOW)).strip().lower()
|
||||
|
||||
if not value:
|
||||
return default
|
||||
if value in ('y', 'yes'):
|
||||
return True
|
||||
if value in ('n', 'no'):
|
||||
return False
|
||||
print_error("Please enter 'y' or 'n'")
|
||||
|
||||
|
||||
def _print_setup_summary(config: dict, hermes_home):
|
||||
"""Print the setup completion summary."""
|
||||
# Tool availability summary
|
||||
print()
|
||||
print_header("Tool Availability Summary")
|
||||
|
||||
tool_status = []
|
||||
|
||||
# OpenRouter (required for vision, moa)
|
||||
if get_env_value('OPENROUTER_API_KEY'):
|
||||
tool_status.append(("Vision (image analysis)", True, None))
|
||||
tool_status.append(("Mixture of Agents", True, None))
|
||||
else:
|
||||
tool_status.append(("Vision (image analysis)", False, "OPENROUTER_API_KEY"))
|
||||
tool_status.append(("Mixture of Agents", False, "OPENROUTER_API_KEY"))
|
||||
|
||||
# Firecrawl (web tools)
|
||||
if get_env_value('FIRECRAWL_API_KEY'):
|
||||
tool_status.append(("Web Search & Extract", True, None))
|
||||
else:
|
||||
tool_status.append(("Web Search & Extract", False, "FIRECRAWL_API_KEY"))
|
||||
|
||||
# Browserbase (browser tools)
|
||||
if get_env_value('BROWSERBASE_API_KEY'):
|
||||
tool_status.append(("Browser Automation", True, None))
|
||||
else:
|
||||
tool_status.append(("Browser Automation", False, "BROWSERBASE_API_KEY"))
|
||||
|
||||
# FAL (image generation)
|
||||
if get_env_value('FAL_KEY'):
|
||||
tool_status.append(("Image Generation", True, None))
|
||||
else:
|
||||
tool_status.append(("Image Generation", False, "FAL_KEY"))
|
||||
|
||||
# Tinker + WandB (RL training)
|
||||
if get_env_value('TINKER_API_KEY') and get_env_value('WANDB_API_KEY'):
|
||||
tool_status.append(("RL Training (Tinker)", True, None))
|
||||
elif get_env_value('TINKER_API_KEY'):
|
||||
tool_status.append(("RL Training (Tinker)", False, "WANDB_API_KEY"))
|
||||
else:
|
||||
tool_status.append(("RL Training (Tinker)", False, "TINKER_API_KEY"))
|
||||
|
||||
# Terminal (always available if system deps met)
|
||||
tool_status.append(("Terminal/Commands", True, None))
|
||||
|
||||
# Skills (always available if skills dir exists)
|
||||
tool_status.append(("Skills Knowledge Base", True, None))
|
||||
|
||||
# Print status
|
||||
available_count = sum(1 for _, avail, _ in tool_status if avail)
|
||||
total_count = len(tool_status)
|
||||
|
||||
print_info(f"{available_count}/{total_count} tool categories available:")
|
||||
print()
|
||||
|
||||
for name, available, missing_var in tool_status:
|
||||
if available:
|
||||
print(f" {color('✓', Colors.GREEN)} {name}")
|
||||
else:
|
||||
print(f" {color('✗', Colors.RED)} {name} {color(f'(missing {missing_var})', Colors.DIM)}")
|
||||
|
||||
print()
|
||||
|
||||
disabled_tools = [(name, var) for name, avail, var in tool_status if not avail]
|
||||
if disabled_tools:
|
||||
print_warning("Some tools are disabled. Run 'hermes setup' again to configure them,")
|
||||
print_warning("or edit ~/.hermes/.env directly to add the missing API keys.")
|
||||
print()
|
||||
|
||||
# Done banner
|
||||
print()
|
||||
print(color("┌─────────────────────────────────────────────────────────┐", Colors.GREEN))
|
||||
print(color("│ ✓ Setup Complete! │", Colors.GREEN))
|
||||
print(color("└─────────────────────────────────────────────────────────┘", Colors.GREEN))
|
||||
print()
|
||||
|
||||
# Show file locations prominently
|
||||
print(color("📁 All your files are in ~/.hermes/:", Colors.CYAN, Colors.BOLD))
|
||||
print()
|
||||
print(f" {color('Settings:', Colors.YELLOW)} {get_config_path()}")
|
||||
print(f" {color('API Keys:', Colors.YELLOW)} {get_env_path()}")
|
||||
print(f" {color('Data:', Colors.YELLOW)} {hermes_home}/cron/, sessions/, logs/")
|
||||
print()
|
||||
|
||||
print(color("─" * 60, Colors.DIM))
|
||||
print()
|
||||
print(color("📝 To edit your configuration:", Colors.CYAN, Colors.BOLD))
|
||||
print()
|
||||
print(f" {color('hermes config', Colors.GREEN)} View current settings")
|
||||
print(f" {color('hermes config edit', Colors.GREEN)} Open config in your editor")
|
||||
print(f" {color('hermes config set KEY VALUE', Colors.GREEN)}")
|
||||
print(f" Set a specific value")
|
||||
print()
|
||||
print(f" Or edit the files directly:")
|
||||
print(f" {color(f'nano {get_config_path()}', Colors.DIM)}")
|
||||
print(f" {color(f'nano {get_env_path()}', Colors.DIM)}")
|
||||
print()
|
||||
|
||||
print(color("─" * 60, Colors.DIM))
|
||||
print()
|
||||
print(color("🚀 Ready to go!", Colors.CYAN, Colors.BOLD))
|
||||
print()
|
||||
print(f" {color('hermes', Colors.GREEN)} Start chatting")
|
||||
print(f" {color('hermes gateway', Colors.GREEN)} Start messaging gateway")
|
||||
print(f" {color('hermes doctor', Colors.GREEN)} Check for issues")
|
||||
print()
|
||||
|
||||
|
||||
def run_setup_wizard(args):
|
||||
"""Run the interactive setup wizard."""
|
||||
ensure_hermes_home()
|
||||
|
||||
config = load_config()
|
||||
hermes_home = get_hermes_home()
|
||||
|
||||
# Check if this is an existing installation with config
|
||||
is_existing = get_env_value("OPENROUTER_API_KEY") is not None or get_config_path().exists()
|
||||
|
||||
# Import migration helpers
|
||||
from hermes_cli.config import (
|
||||
get_missing_env_vars, get_missing_config_fields,
|
||||
check_config_version, migrate_config,
|
||||
REQUIRED_ENV_VARS, OPTIONAL_ENV_VARS
|
||||
)
|
||||
|
||||
# Check what's missing
|
||||
missing_required = [v for v in get_missing_env_vars(required_only=False) if v.get("is_required")]
|
||||
missing_optional = [v for v in get_missing_env_vars(required_only=False) if not v.get("is_required")]
|
||||
missing_config = get_missing_config_fields()
|
||||
current_ver, latest_ver = check_config_version()
|
||||
|
||||
has_missing = missing_required or missing_optional or missing_config or current_ver < latest_ver
|
||||
|
||||
print()
|
||||
print(color("┌─────────────────────────────────────────────────────────┐", Colors.MAGENTA))
|
||||
print(color("│ 🦋 Hermes Agent Setup Wizard │", Colors.MAGENTA))
|
||||
print(color("├─────────────────────────────────────────────────────────┤", Colors.MAGENTA))
|
||||
print(color("│ Let's configure your Hermes Agent installation. │", Colors.MAGENTA))
|
||||
print(color("│ Press Ctrl+C at any time to exit. │", Colors.MAGENTA))
|
||||
print(color("└─────────────────────────────────────────────────────────┘", Colors.MAGENTA))
|
||||
|
||||
# If existing installation, show what's missing and offer quick mode
|
||||
quick_mode = False
|
||||
if is_existing and has_missing:
|
||||
print()
|
||||
print_header("Existing Installation Detected")
|
||||
print_success("You already have Hermes configured!")
|
||||
print()
|
||||
|
||||
if missing_required:
|
||||
print_warning(f" {len(missing_required)} required setting(s) missing:")
|
||||
for var in missing_required:
|
||||
print(f" • {var['name']}")
|
||||
|
||||
if missing_optional:
|
||||
print_info(f" {len(missing_optional)} optional tool(s) not configured:")
|
||||
for var in missing_optional[:3]: # Show first 3
|
||||
tools = var.get("tools", [])
|
||||
tools_str = f" → {', '.join(tools[:2])}" if tools else ""
|
||||
print(f" • {var['name']}{tools_str}")
|
||||
if len(missing_optional) > 3:
|
||||
print(f" • ...and {len(missing_optional) - 3} more")
|
||||
|
||||
if missing_config:
|
||||
print_info(f" {len(missing_config)} new config option(s) available")
|
||||
|
||||
print()
|
||||
|
||||
setup_choices = [
|
||||
"Quick setup - just configure missing items",
|
||||
"Full setup - reconfigure everything",
|
||||
"Skip - exit setup"
|
||||
]
|
||||
|
||||
choice = prompt_choice("What would you like to do?", setup_choices, 0)
|
||||
|
||||
if choice == 0:
|
||||
quick_mode = True
|
||||
elif choice == 2:
|
||||
print()
|
||||
print_info("Exiting. Run 'hermes setup' again when ready.")
|
||||
return
|
||||
# choice == 1 continues with full setup
|
||||
|
||||
elif is_existing and not has_missing:
|
||||
print()
|
||||
print_header("Configuration Status")
|
||||
print_success("Your configuration is complete!")
|
||||
print()
|
||||
|
||||
if not prompt_yes_no("Would you like to reconfigure anyway?", False):
|
||||
print()
|
||||
print_info("Exiting. Your configuration is already set up.")
|
||||
print_info(f"Config: {get_config_path()}")
|
||||
print_info(f"Secrets: {get_env_path()}")
|
||||
return
|
||||
|
||||
# Quick mode: only configure missing items
|
||||
if quick_mode:
|
||||
print()
|
||||
print_header("Quick Setup - Missing Items Only")
|
||||
|
||||
# Handle missing required env vars
|
||||
if missing_required:
|
||||
for var in missing_required:
|
||||
print()
|
||||
print(color(f" {var['name']}", Colors.CYAN))
|
||||
print_info(f" {var.get('description', '')}")
|
||||
if var.get("url"):
|
||||
print_info(f" Get key at: {var['url']}")
|
||||
|
||||
if var.get("password"):
|
||||
value = prompt(f" {var.get('prompt', var['name'])}", password=True)
|
||||
else:
|
||||
value = prompt(f" {var.get('prompt', var['name'])}")
|
||||
|
||||
if value:
|
||||
save_env_value(var["name"], value)
|
||||
print_success(f" Saved {var['name']}")
|
||||
else:
|
||||
print_warning(f" Skipped {var['name']}")
|
||||
|
||||
# Handle missing optional env vars
|
||||
if missing_optional:
|
||||
print()
|
||||
print_header("Optional Tools (Quick Setup)")
|
||||
|
||||
for var in missing_optional:
|
||||
tools = var.get("tools", [])
|
||||
tools_str = f" (enables: {', '.join(tools[:2])})" if tools else ""
|
||||
|
||||
if prompt_yes_no(f"Configure {var['name']}{tools_str}?", False):
|
||||
if var.get("url"):
|
||||
print_info(f" Get key at: {var['url']}")
|
||||
|
||||
if var.get("password"):
|
||||
value = prompt(f" {var.get('prompt', var['name'])}", password=True)
|
||||
else:
|
||||
value = prompt(f" {var.get('prompt', var['name'])}")
|
||||
|
||||
if value:
|
||||
save_env_value(var["name"], value)
|
||||
print_success(f" Saved")
|
||||
|
||||
# Handle missing config fields
|
||||
if missing_config:
|
||||
print()
|
||||
print_info(f"Adding {len(missing_config)} new config option(s) with defaults...")
|
||||
for field in missing_config:
|
||||
print_success(f" Added {field['key']} = {field['default']}")
|
||||
|
||||
# Update config version
|
||||
config["_config_version"] = latest_ver
|
||||
save_config(config)
|
||||
|
||||
# Jump to summary
|
||||
_print_setup_summary(config, hermes_home)
|
||||
return
|
||||
|
||||
# =========================================================================
|
||||
# Step 0: Show paths (full setup)
|
||||
# =========================================================================
|
||||
print_header("Configuration Location")
|
||||
print_info(f"Config file: {get_config_path()}")
|
||||
print_info(f"Secrets file: {get_env_path()}")
|
||||
print_info(f"Data folder: {hermes_home}")
|
||||
print_info(f"Install dir: {PROJECT_ROOT}")
|
||||
print()
|
||||
print_info("You can edit these files directly or use 'hermes config edit'")
|
||||
|
||||
# =========================================================================
|
||||
# Step 1: OpenRouter API Key (Required for tools)
|
||||
# =========================================================================
|
||||
print_header("OpenRouter API Key (Required)")
|
||||
print_info("OpenRouter is used for vision, web scraping, and tool operations")
|
||||
print_info("even if you use a custom endpoint for your main agent.")
|
||||
print_info("Get your API key at: https://openrouter.ai/keys")
|
||||
|
||||
existing_or = get_env_value("OPENROUTER_API_KEY")
|
||||
if existing_or:
|
||||
print_info(f"Current: {existing_or[:8]}... (configured)")
|
||||
if prompt_yes_no("Update OpenRouter API key?", False):
|
||||
api_key = prompt(" OpenRouter API key", password=True)
|
||||
if api_key:
|
||||
save_env_value("OPENROUTER_API_KEY", api_key)
|
||||
print_success("OpenRouter API key updated")
|
||||
else:
|
||||
api_key = prompt(" OpenRouter API key", password=True)
|
||||
if api_key:
|
||||
save_env_value("OPENROUTER_API_KEY", api_key)
|
||||
print_success("OpenRouter API key saved")
|
||||
else:
|
||||
print_warning("Skipped - some tools (vision, web scraping) won't work without this")
|
||||
|
||||
# =========================================================================
|
||||
# Step 2: Main Agent Provider
|
||||
# =========================================================================
|
||||
print_header("Main Agent Provider")
|
||||
print_info("Choose how to connect to your main chat model.")
|
||||
|
||||
existing_custom = get_env_value("OPENAI_BASE_URL")
|
||||
|
||||
provider_choices = [
|
||||
"OpenRouter (use same key for agent - recommended)",
|
||||
"Custom OpenAI-compatible endpoint (separate from OpenRouter)",
|
||||
f"Keep current" + (f" ({existing_custom})" if existing_custom else " (OpenRouter)")
|
||||
]
|
||||
|
||||
provider_idx = prompt_choice("Select your main agent provider:", provider_choices, 2)
|
||||
|
||||
if provider_idx == 0: # OpenRouter for agent too
|
||||
# Clear any custom endpoint - will use OpenRouter
|
||||
if existing_custom:
|
||||
save_env_value("OPENAI_BASE_URL", "")
|
||||
save_env_value("OPENAI_API_KEY", "")
|
||||
print_success("Agent will use OpenRouter")
|
||||
|
||||
elif provider_idx == 1: # Custom endpoint
|
||||
print_info("Custom OpenAI-Compatible Endpoint Configuration:")
|
||||
print_info("Works with any API that follows OpenAI's chat completions spec")
|
||||
|
||||
# Show current values if set
|
||||
current_url = get_env_value("OPENAI_BASE_URL") or ""
|
||||
current_key = get_env_value("OPENAI_API_KEY")
|
||||
current_model = config.get('model', '')
|
||||
|
||||
if current_url:
|
||||
print_info(f" Current URL: {current_url}")
|
||||
if current_key:
|
||||
print_info(f" Current key: {current_key[:8]}... (configured)")
|
||||
|
||||
base_url = prompt(" API base URL (e.g., https://api.example.com/v1)", current_url)
|
||||
api_key = prompt(" API key", password=True)
|
||||
model_name = prompt(" Model name (e.g., gpt-4, claude-3-opus)", current_model)
|
||||
|
||||
if base_url:
|
||||
save_env_value("OPENAI_BASE_URL", base_url)
|
||||
if api_key:
|
||||
save_env_value("OPENAI_API_KEY", api_key)
|
||||
if model_name:
|
||||
config['model'] = model_name
|
||||
print_success("Custom endpoint configured")
|
||||
# else: Keep current (provider_idx == 2)
|
||||
|
||||
# =========================================================================
|
||||
# Step 3: Model Selection
|
||||
# =========================================================================
|
||||
print_header("Default Model")
|
||||
|
||||
current_model = config.get('model', 'anthropic/claude-sonnet-4')
|
||||
print_info(f"Current: {current_model}")
|
||||
|
||||
model_choices = [
|
||||
"anthropic/claude-sonnet-4.5 (recommended)",
|
||||
"anthropic/claude-opus-4.5",
|
||||
"openai/gpt-5.2",
|
||||
"openai/gpt-5.2-codex",
|
||||
"google/gemini-3-pro-preview",
|
||||
"google/gemini-3-flash-preview",
|
||||
"z-ai/glm-4.7",
|
||||
"moonshotai/kimi-k2.5",
|
||||
"minimax/minimax-m2.1",
|
||||
"Custom model",
|
||||
f"Keep current ({current_model})"
|
||||
]
|
||||
|
||||
model_idx = prompt_choice("Select default model:", model_choices, 10) # Default: keep current
|
||||
|
||||
model_map = {
|
||||
0: "anthropic/claude-sonnet-4.5",
|
||||
1: "anthropic/claude-opus-4.5",
|
||||
2: "openai/gpt-5.2",
|
||||
3: "openai/gpt-5.2-codex",
|
||||
4: "google/gemini-3-pro-preview",
|
||||
5: "google/gemini-3-flash-preview",
|
||||
6: "z-ai/glm-4.7",
|
||||
7: "moonshotai/kimi-k2.5",
|
||||
8: "minimax/minimax-m2.1",
|
||||
}
|
||||
|
||||
if model_idx in model_map:
|
||||
config['model'] = model_map[model_idx]
|
||||
elif model_idx == 9: # Custom
|
||||
custom = prompt("Enter model name (e.g., anthropic/claude-sonnet-4.5)")
|
||||
if custom:
|
||||
config['model'] = custom
|
||||
# else: Keep current (model_idx == 10)
|
||||
|
||||
# =========================================================================
|
||||
# Step 4: Terminal Backend
|
||||
# =========================================================================
|
||||
print_header("Terminal Backend")
|
||||
print_info("The terminal tool allows the agent to run commands.")
|
||||
|
||||
current_backend = config.get('terminal', {}).get('backend', 'local')
|
||||
print_info(f"Current: {current_backend}")
|
||||
|
||||
# Detect platform for backend availability
|
||||
import platform
|
||||
is_linux = platform.system() == "Linux"
|
||||
is_macos = platform.system() == "Darwin"
|
||||
is_windows = platform.system() == "Windows"
|
||||
|
||||
# Build choices based on platform
|
||||
terminal_choices = [
|
||||
"Local (run commands on this machine - no isolation)",
|
||||
"Docker (isolated containers - recommended for security)",
|
||||
]
|
||||
|
||||
# Singularity/Apptainer is Linux-only (HPC)
|
||||
if is_linux:
|
||||
terminal_choices.append("Singularity/Apptainer (HPC clusters, shared compute)")
|
||||
|
||||
terminal_choices.extend([
|
||||
"Modal (cloud execution, GPU access, serverless)",
|
||||
"SSH (run commands on a remote server)",
|
||||
f"Keep current ({current_backend})"
|
||||
])
|
||||
|
||||
# Build index map based on available choices
|
||||
if is_linux:
|
||||
backend_to_idx = {'local': 0, 'docker': 1, 'singularity': 2, 'modal': 3, 'ssh': 4}
|
||||
idx_to_backend = {0: 'local', 1: 'docker', 2: 'singularity', 3: 'modal', 4: 'ssh'}
|
||||
keep_current_idx = 5
|
||||
else:
|
||||
backend_to_idx = {'local': 0, 'docker': 1, 'modal': 2, 'ssh': 3}
|
||||
idx_to_backend = {0: 'local', 1: 'docker', 2: 'modal', 3: 'ssh'}
|
||||
keep_current_idx = 4
|
||||
if current_backend == 'singularity':
|
||||
print_warning("Singularity is only available on Linux - please select a different backend")
|
||||
|
||||
# Default based on current
|
||||
default_terminal = backend_to_idx.get(current_backend, 0)
|
||||
|
||||
terminal_idx = prompt_choice("Select terminal backend:", terminal_choices, keep_current_idx)
|
||||
|
||||
# Map index to backend name (handles platform differences)
|
||||
selected_backend = idx_to_backend.get(terminal_idx)
|
||||
|
||||
if selected_backend == 'local':
|
||||
config.setdefault('terminal', {})['backend'] = 'local'
|
||||
print_info("Local Execution Configuration:")
|
||||
print_info("Commands run directly on this machine (no isolation)")
|
||||
|
||||
if is_windows:
|
||||
print_info("Note: On Windows, commands run via cmd.exe or PowerShell")
|
||||
|
||||
# Messaging working directory configuration
|
||||
print_info("")
|
||||
print_info("Working Directory for Messaging (Telegram/Discord/etc):")
|
||||
print_info(" The CLI always uses the directory you run 'hermes' from")
|
||||
print_info(" But messaging bots need a static starting directory")
|
||||
|
||||
current_cwd = get_env_value('MESSAGING_CWD') or str(Path.home())
|
||||
print_info(f" Current: {current_cwd}")
|
||||
|
||||
cwd_input = prompt(" Messaging working directory", current_cwd)
|
||||
# Expand ~ to full path
|
||||
if cwd_input.startswith('~'):
|
||||
cwd_expanded = str(Path.home()) + cwd_input[1:]
|
||||
else:
|
||||
cwd_expanded = cwd_input
|
||||
save_env_value("MESSAGING_CWD", cwd_expanded)
|
||||
|
||||
if prompt_yes_no(" Enable sudo support? (allows agent to run sudo commands)", False):
|
||||
print_warning(" SECURITY WARNING: Sudo password will be stored in plaintext")
|
||||
sudo_pass = prompt(" Sudo password (leave empty to skip)", password=True)
|
||||
if sudo_pass:
|
||||
save_env_value("SUDO_PASSWORD", sudo_pass)
|
||||
print_success(" Sudo password saved")
|
||||
|
||||
print_success("Terminal set to local")
|
||||
|
||||
elif selected_backend == 'docker':
|
||||
config.setdefault('terminal', {})['backend'] = 'docker'
|
||||
default_docker = config.get('terminal', {}).get('docker_image', 'nikolaik/python-nodejs:python3.11-nodejs20')
|
||||
print_info("Docker Configuration:")
|
||||
if is_macos:
|
||||
print_info("Requires Docker Desktop for Mac")
|
||||
elif is_windows:
|
||||
print_info("Requires Docker Desktop for Windows")
|
||||
docker_image = prompt(" Docker image", default_docker)
|
||||
config['terminal']['docker_image'] = docker_image
|
||||
print_success("Terminal set to Docker")
|
||||
|
||||
elif selected_backend == 'singularity':
|
||||
config.setdefault('terminal', {})['backend'] = 'singularity'
|
||||
default_singularity = config.get('terminal', {}).get('singularity_image', 'docker://nikolaik/python-nodejs:python3.11-nodejs20')
|
||||
print_info("Singularity/Apptainer Configuration:")
|
||||
print_info("Requires apptainer or singularity to be installed")
|
||||
singularity_image = prompt(" Image (docker:// prefix for Docker Hub)", default_singularity)
|
||||
config['terminal']['singularity_image'] = singularity_image
|
||||
print_success("Terminal set to Singularity/Apptainer")
|
||||
|
||||
elif selected_backend == 'modal':
|
||||
config.setdefault('terminal', {})['backend'] = 'modal'
|
||||
default_modal = config.get('terminal', {}).get('modal_image', 'nikolaik/python-nodejs:python3.11-nodejs20')
|
||||
print_info("Modal Cloud Configuration:")
|
||||
print_info("Get credentials at: https://modal.com/settings")
|
||||
|
||||
# Always show current status and allow reconfiguration
|
||||
current_token = get_env_value('MODAL_TOKEN_ID')
|
||||
if current_token:
|
||||
print_info(f" Token ID: {current_token[:8]}... (configured)")
|
||||
|
||||
modal_image = prompt(" Container image", default_modal)
|
||||
config['terminal']['modal_image'] = modal_image
|
||||
|
||||
token_id = prompt(" Modal token ID", current_token or "")
|
||||
token_secret = prompt(" Modal token secret", password=True)
|
||||
|
||||
if token_id:
|
||||
save_env_value("MODAL_TOKEN_ID", token_id)
|
||||
if token_secret:
|
||||
save_env_value("MODAL_TOKEN_SECRET", token_secret)
|
||||
|
||||
print_success("Terminal set to Modal")
|
||||
|
||||
elif selected_backend == 'ssh':
|
||||
config.setdefault('terminal', {})['backend'] = 'ssh'
|
||||
print_info("SSH Remote Execution Configuration:")
|
||||
print_info("Commands will run on a remote server over SSH")
|
||||
|
||||
current_host = get_env_value('TERMINAL_SSH_HOST') or ''
|
||||
current_user = get_env_value('TERMINAL_SSH_USER') or os.getenv("USER", "")
|
||||
current_port = get_env_value('TERMINAL_SSH_PORT') or '22'
|
||||
current_key = get_env_value('TERMINAL_SSH_KEY') or '~/.ssh/id_rsa'
|
||||
|
||||
if current_host:
|
||||
print_info(f" Current host: {current_user}@{current_host}:{current_port}")
|
||||
|
||||
ssh_host = prompt(" SSH host", current_host)
|
||||
ssh_user = prompt(" SSH user", current_user)
|
||||
ssh_port = prompt(" SSH port", current_port)
|
||||
ssh_key = prompt(" SSH key path (or leave empty for ssh-agent)", current_key)
|
||||
|
||||
if ssh_host:
|
||||
save_env_value("TERMINAL_SSH_HOST", ssh_host)
|
||||
if ssh_user:
|
||||
save_env_value("TERMINAL_SSH_USER", ssh_user)
|
||||
if ssh_port and ssh_port != '22':
|
||||
save_env_value("TERMINAL_SSH_PORT", ssh_port)
|
||||
if ssh_key:
|
||||
save_env_value("TERMINAL_SSH_KEY", ssh_key)
|
||||
|
||||
print_success("Terminal set to SSH")
|
||||
# else: Keep current (selected_backend is None)
|
||||
|
||||
# =========================================================================
|
||||
# Step 5: Agent Settings
|
||||
# =========================================================================
|
||||
print_header("Agent Settings")
|
||||
|
||||
# Max iterations
|
||||
current_max = get_env_value('HERMES_MAX_ITERATIONS') or '60'
|
||||
print_info("Maximum tool-calling iterations per conversation.")
|
||||
print_info("Higher = more complex tasks, but costs more tokens.")
|
||||
print_info("Recommended: 30-60 for most tasks, 100+ for open exploration.")
|
||||
|
||||
max_iter_str = prompt("Max iterations", current_max)
|
||||
try:
|
||||
max_iter = int(max_iter_str)
|
||||
if max_iter > 0:
|
||||
save_env_value("HERMES_MAX_ITERATIONS", str(max_iter))
|
||||
config['max_turns'] = max_iter
|
||||
print_success(f"Max iterations set to {max_iter}")
|
||||
except ValueError:
|
||||
print_warning("Invalid number, keeping current value")
|
||||
|
||||
# Tool progress notifications (for messaging)
|
||||
print_info("")
|
||||
print_info("Tool Progress Notifications (Messaging only)")
|
||||
print_info("Send status messages when the agent uses tools.")
|
||||
print_info("Example: '💻 ls -la...' or '🔍 web_search...'")
|
||||
|
||||
current_progress = get_env_value('HERMES_TOOL_PROGRESS') or 'false'
|
||||
if prompt_yes_no("Enable tool progress messages?", current_progress.lower() in ('1', 'true', 'yes')):
|
||||
save_env_value("HERMES_TOOL_PROGRESS", "true")
|
||||
|
||||
# Progress mode
|
||||
current_mode = get_env_value('HERMES_TOOL_PROGRESS_MODE') or 'new'
|
||||
print_info(" Mode options:")
|
||||
print_info(" 'new' - Only when switching tools (less spam)")
|
||||
print_info(" 'all' - Every tool call")
|
||||
mode = prompt(" Progress mode", current_mode)
|
||||
if mode.lower() in ('all', 'new'):
|
||||
save_env_value("HERMES_TOOL_PROGRESS_MODE", mode.lower())
|
||||
print_success("Tool progress enabled")
|
||||
else:
|
||||
save_env_value("HERMES_TOOL_PROGRESS", "false")
|
||||
|
||||
# =========================================================================
|
||||
# Step 6: Context Compression
|
||||
# =========================================================================
|
||||
print_header("Context Compression")
|
||||
print_info("Automatically summarize old messages when context gets too long.")
|
||||
|
||||
compression = config.get('compression', {})
|
||||
current_enabled = compression.get('enabled', True)
|
||||
|
||||
if prompt_yes_no(f"Enable context compression?", current_enabled):
|
||||
config.setdefault('compression', {})['enabled'] = True
|
||||
|
||||
current_threshold = compression.get('threshold', 0.85)
|
||||
threshold_str = prompt(f"Compression threshold (0.5-0.95)", str(current_threshold))
|
||||
try:
|
||||
threshold = float(threshold_str)
|
||||
if 0.5 <= threshold <= 0.95:
|
||||
config['compression']['threshold'] = threshold
|
||||
except ValueError:
|
||||
pass
|
||||
|
||||
print_success("Context compression enabled")
|
||||
else:
|
||||
config.setdefault('compression', {})['enabled'] = False
|
||||
|
||||
# =========================================================================
|
||||
# Step 7: Messaging Platforms (Optional)
|
||||
# =========================================================================
|
||||
print_header("Messaging Platforms (Optional)")
|
||||
print_info("Connect to messaging platforms to chat with Hermes from anywhere.")
|
||||
|
||||
# Telegram
|
||||
existing_telegram = get_env_value('TELEGRAM_BOT_TOKEN')
|
||||
if existing_telegram:
|
||||
print_info("Telegram: already configured")
|
||||
if prompt_yes_no("Reconfigure Telegram?", False):
|
||||
existing_telegram = None
|
||||
|
||||
if not existing_telegram and prompt_yes_no("Set up Telegram bot?", False):
|
||||
print_info("Create a bot via @BotFather on Telegram")
|
||||
token = prompt("Telegram bot token", password=True)
|
||||
if token:
|
||||
save_env_value("TELEGRAM_BOT_TOKEN", token)
|
||||
print_success("Telegram token saved")
|
||||
|
||||
# Allowed users (security)
|
||||
print()
|
||||
print_info("🔒 Security: Restrict who can use your bot")
|
||||
print_info(" To find your Telegram user ID:")
|
||||
print_info(" 1. Message @userinfobot on Telegram")
|
||||
print_info(" 2. It will reply with your numeric ID (e.g., 123456789)")
|
||||
print()
|
||||
allowed_users = prompt("Allowed user IDs (comma-separated, leave empty for open access)")
|
||||
if allowed_users:
|
||||
save_env_value("TELEGRAM_ALLOWED_USERS", allowed_users.replace(" ", ""))
|
||||
print_success("Telegram allowlist configured - only listed users can use the bot")
|
||||
else:
|
||||
print_info("⚠️ No allowlist set - anyone who finds your bot can use it!")
|
||||
|
||||
home_channel = prompt("Home channel ID (optional, for cron delivery)")
|
||||
if home_channel:
|
||||
save_env_value("TELEGRAM_HOME_CHANNEL", home_channel)
|
||||
|
||||
# Check/update existing Telegram allowlist
|
||||
elif existing_telegram:
|
||||
existing_allowlist = get_env_value('TELEGRAM_ALLOWED_USERS')
|
||||
if not existing_allowlist:
|
||||
print_info("⚠️ Telegram has no user allowlist - anyone can use your bot!")
|
||||
if prompt_yes_no("Add allowed users now?", True):
|
||||
print_info(" To find your Telegram user ID: message @userinfobot")
|
||||
allowed_users = prompt("Allowed user IDs (comma-separated)")
|
||||
if allowed_users:
|
||||
save_env_value("TELEGRAM_ALLOWED_USERS", allowed_users.replace(" ", ""))
|
||||
print_success("Telegram allowlist configured")
|
||||
|
||||
# Discord
|
||||
existing_discord = get_env_value('DISCORD_BOT_TOKEN')
|
||||
if existing_discord:
|
||||
print_info("Discord: already configured")
|
||||
if prompt_yes_no("Reconfigure Discord?", False):
|
||||
existing_discord = None
|
||||
|
||||
if not existing_discord and prompt_yes_no("Set up Discord bot?", False):
|
||||
print_info("Create a bot at https://discord.com/developers/applications")
|
||||
token = prompt("Discord bot token", password=True)
|
||||
if token:
|
||||
save_env_value("DISCORD_BOT_TOKEN", token)
|
||||
print_success("Discord token saved")
|
||||
|
||||
# Allowed users (security)
|
||||
print()
|
||||
print_info("🔒 Security: Restrict who can use your bot")
|
||||
print_info(" To find your Discord user ID:")
|
||||
print_info(" 1. Enable Developer Mode in Discord settings")
|
||||
print_info(" 2. Right-click your name → Copy ID")
|
||||
print()
|
||||
allowed_users = prompt("Allowed user IDs (comma-separated, leave empty for open access)")
|
||||
if allowed_users:
|
||||
save_env_value("DISCORD_ALLOWED_USERS", allowed_users.replace(" ", ""))
|
||||
print_success("Discord allowlist configured")
|
||||
else:
|
||||
print_info("⚠️ No allowlist set - anyone in servers with your bot can use it!")
|
||||
|
||||
home_channel = prompt("Home channel ID (optional, for cron delivery)")
|
||||
if home_channel:
|
||||
save_env_value("DISCORD_HOME_CHANNEL", home_channel)
|
||||
|
||||
# Check/update existing Discord allowlist
|
||||
elif existing_discord:
|
||||
existing_allowlist = get_env_value('DISCORD_ALLOWED_USERS')
|
||||
if not existing_allowlist:
|
||||
print_info("⚠️ Discord has no user allowlist - anyone can use your bot!")
|
||||
if prompt_yes_no("Add allowed users now?", True):
|
||||
print_info(" To find Discord ID: Enable Developer Mode, right-click name → Copy ID")
|
||||
allowed_users = prompt("Allowed user IDs (comma-separated)")
|
||||
if allowed_users:
|
||||
save_env_value("DISCORD_ALLOWED_USERS", allowed_users.replace(" ", ""))
|
||||
print_success("Discord allowlist configured")
|
||||
|
||||
# =========================================================================
|
||||
# Step 8: Additional Tools (Optional)
|
||||
# =========================================================================
|
||||
print_header("Additional Tools (Optional)")
|
||||
print_info("These tools extend the agent's capabilities.")
|
||||
print_info("Without their API keys, the corresponding features will be disabled.")
|
||||
print()
|
||||
|
||||
# Firecrawl - Web scraping
|
||||
print_info("─" * 50)
|
||||
print(color(" Web Search & Scraping (Firecrawl)", Colors.CYAN))
|
||||
print_info(" Enables: web_search, web_extract tools")
|
||||
print_info(" Use case: Search the web, read webpage content")
|
||||
if get_env_value('FIRECRAWL_API_KEY'):
|
||||
print_success(" Status: Configured ✓")
|
||||
if prompt_yes_no(" Update Firecrawl API key?", False):
|
||||
api_key = prompt(" API key", password=True)
|
||||
if api_key:
|
||||
save_env_value("FIRECRAWL_API_KEY", api_key)
|
||||
print_success(" Updated")
|
||||
else:
|
||||
print_warning(" Status: Not configured (tools will be disabled)")
|
||||
if prompt_yes_no(" Set up Firecrawl?", False):
|
||||
print_info(" Get your API key at: https://firecrawl.dev/")
|
||||
api_key = prompt(" API key", password=True)
|
||||
if api_key:
|
||||
save_env_value("FIRECRAWL_API_KEY", api_key)
|
||||
print_success(" Configured ✓")
|
||||
print()
|
||||
|
||||
# Browserbase - Browser automation
|
||||
print_info("─" * 50)
|
||||
print(color(" Browser Automation (Browserbase)", Colors.CYAN))
|
||||
print_info(" Enables: browser_navigate, browser_click, etc.")
|
||||
print_info(" Use case: Interact with web pages, fill forms, screenshots")
|
||||
if get_env_value('BROWSERBASE_API_KEY'):
|
||||
print_success(" Status: Configured ✓")
|
||||
if prompt_yes_no(" Update Browserbase credentials?", False):
|
||||
api_key = prompt(" API key", password=True)
|
||||
project_id = prompt(" Project ID")
|
||||
if api_key:
|
||||
save_env_value("BROWSERBASE_API_KEY", api_key)
|
||||
if project_id:
|
||||
save_env_value("BROWSERBASE_PROJECT_ID", project_id)
|
||||
print_success(" Updated")
|
||||
else:
|
||||
print_warning(" Status: Not configured (tools will be disabled)")
|
||||
if prompt_yes_no(" Set up Browserbase?", False):
|
||||
print_info(" Get credentials at: https://browserbase.com/")
|
||||
api_key = prompt(" API key", password=True)
|
||||
project_id = prompt(" Project ID")
|
||||
if api_key:
|
||||
save_env_value("BROWSERBASE_API_KEY", api_key)
|
||||
if project_id:
|
||||
save_env_value("BROWSERBASE_PROJECT_ID", project_id)
|
||||
print_success(" Configured ✓")
|
||||
print()
|
||||
|
||||
# FAL - Image generation
|
||||
print_info("─" * 50)
|
||||
print(color(" Image Generation (FAL)", Colors.CYAN))
|
||||
print_info(" Enables: image_generate tool")
|
||||
print_info(" Use case: Generate images from text prompts (FLUX)")
|
||||
if get_env_value('FAL_KEY'):
|
||||
print_success(" Status: Configured ✓")
|
||||
if prompt_yes_no(" Update FAL API key?", False):
|
||||
api_key = prompt(" API key", password=True)
|
||||
if api_key:
|
||||
save_env_value("FAL_KEY", api_key)
|
||||
print_success(" Updated")
|
||||
else:
|
||||
print_warning(" Status: Not configured (tool will be disabled)")
|
||||
if prompt_yes_no(" Set up FAL?", False):
|
||||
print_info(" Get your API key at: https://fal.ai/")
|
||||
api_key = prompt(" API key", password=True)
|
||||
if api_key:
|
||||
save_env_value("FAL_KEY", api_key)
|
||||
print_success(" Configured ✓")
|
||||
print()
|
||||
|
||||
# Tinker + WandB - RL Training
|
||||
print_info("─" * 50)
|
||||
print(color(" RL Training (Tinker + WandB)", Colors.CYAN))
|
||||
print_info(" Enables: rl_start_training, rl_check_status, rl_get_results tools")
|
||||
print_info(" Use case: Run reinforcement learning training via Tinker API")
|
||||
tinker_configured = get_env_value('TINKER_API_KEY')
|
||||
wandb_configured = get_env_value('WANDB_API_KEY')
|
||||
|
||||
if tinker_configured and wandb_configured:
|
||||
print_success(" Status: Configured ✓")
|
||||
if prompt_yes_no(" Update RL training credentials?", False):
|
||||
api_key = prompt(" Tinker API key", password=True)
|
||||
if api_key:
|
||||
save_env_value("TINKER_API_KEY", api_key)
|
||||
wandb_key = prompt(" WandB API key", password=True)
|
||||
if wandb_key:
|
||||
save_env_value("WANDB_API_KEY", wandb_key)
|
||||
print_success(" Updated")
|
||||
else:
|
||||
if tinker_configured:
|
||||
print_warning(" Status: Tinker configured, WandB missing")
|
||||
elif wandb_configured:
|
||||
print_warning(" Status: WandB configured, Tinker missing")
|
||||
else:
|
||||
print_warning(" Status: Not configured (tools will be disabled)")
|
||||
|
||||
if prompt_yes_no(" Set up RL Training?", False):
|
||||
print_info(" Get Tinker key at: https://tinker-console.thinkingmachines.ai/keys")
|
||||
print_info(" Get WandB key at: https://wandb.ai/authorize")
|
||||
api_key = prompt(" Tinker API key", password=True)
|
||||
if api_key:
|
||||
save_env_value("TINKER_API_KEY", api_key)
|
||||
wandb_key = prompt(" WandB API key", password=True)
|
||||
if wandb_key:
|
||||
save_env_value("WANDB_API_KEY", wandb_key)
|
||||
if api_key and wandb_key:
|
||||
print_success(" Configured ✓")
|
||||
else:
|
||||
print_warning(" Partially configured (both keys required)")
|
||||
|
||||
# =========================================================================
|
||||
# Save config and show summary
|
||||
# =========================================================================
|
||||
save_config(config)
|
||||
_print_setup_summary(config, hermes_home)
|
||||
241
hermes_cli/status.py
Normal file
241
hermes_cli/status.py
Normal file
@@ -0,0 +1,241 @@
|
||||
"""
|
||||
Status command for hermes CLI.
|
||||
|
||||
Shows the status of all Hermes Agent components.
|
||||
"""
|
||||
|
||||
import os
|
||||
import sys
|
||||
import subprocess
|
||||
from pathlib import Path
|
||||
|
||||
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
|
||||
|
||||
# ANSI colors
|
||||
class Colors:
|
||||
RESET = "\033[0m"
|
||||
BOLD = "\033[1m"
|
||||
DIM = "\033[2m"
|
||||
RED = "\033[31m"
|
||||
GREEN = "\033[32m"
|
||||
YELLOW = "\033[33m"
|
||||
CYAN = "\033[36m"
|
||||
|
||||
def color(text: str, *codes) -> str:
|
||||
if not sys.stdout.isatty():
|
||||
return text
|
||||
return "".join(codes) + text + Colors.RESET
|
||||
|
||||
def check_mark(ok: bool) -> str:
|
||||
if ok:
|
||||
return color("✓", Colors.GREEN)
|
||||
return color("✗", Colors.RED)
|
||||
|
||||
def redact_key(key: str) -> str:
|
||||
"""Redact an API key for display."""
|
||||
if not key:
|
||||
return "(not set)"
|
||||
if len(key) < 12:
|
||||
return "***"
|
||||
return key[:4] + "..." + key[-4:]
|
||||
|
||||
|
||||
def show_status(args):
|
||||
"""Show status of all Hermes Agent components."""
|
||||
show_all = getattr(args, 'all', False)
|
||||
deep = getattr(args, 'deep', False)
|
||||
|
||||
print()
|
||||
print(color("┌─────────────────────────────────────────────────────────┐", Colors.CYAN))
|
||||
print(color("│ 🦋 Hermes Agent Status │", Colors.CYAN))
|
||||
print(color("└─────────────────────────────────────────────────────────┘", Colors.CYAN))
|
||||
|
||||
# =========================================================================
|
||||
# Environment
|
||||
# =========================================================================
|
||||
print()
|
||||
print(color("◆ Environment", Colors.CYAN, Colors.BOLD))
|
||||
print(f" Project: {PROJECT_ROOT}")
|
||||
print(f" Python: {sys.version.split()[0]}")
|
||||
|
||||
env_path = PROJECT_ROOT / '.env'
|
||||
print(f" .env file: {check_mark(env_path.exists())} {'exists' if env_path.exists() else 'not found'}")
|
||||
|
||||
# =========================================================================
|
||||
# API Keys
|
||||
# =========================================================================
|
||||
print()
|
||||
print(color("◆ API Keys", Colors.CYAN, Colors.BOLD))
|
||||
|
||||
keys = {
|
||||
"OpenRouter": "OPENROUTER_API_KEY",
|
||||
"Anthropic": "ANTHROPIC_API_KEY",
|
||||
"OpenAI": "OPENAI_API_KEY",
|
||||
"Firecrawl": "FIRECRAWL_API_KEY",
|
||||
"Browserbase": "BROWSERBASE_API_KEY",
|
||||
"FAL": "FAL_KEY",
|
||||
"Tinker": "TINKER_API_KEY",
|
||||
"WandB": "WANDB_API_KEY",
|
||||
}
|
||||
|
||||
for name, env_var in keys.items():
|
||||
value = os.getenv(env_var, "")
|
||||
has_key = bool(value)
|
||||
display = redact_key(value) if not show_all else value
|
||||
print(f" {name:<12} {check_mark(has_key)} {display}")
|
||||
|
||||
# =========================================================================
|
||||
# Terminal Configuration
|
||||
# =========================================================================
|
||||
print()
|
||||
print(color("◆ Terminal Backend", Colors.CYAN, Colors.BOLD))
|
||||
|
||||
terminal_env = os.getenv("TERMINAL_ENV", "local")
|
||||
print(f" Backend: {terminal_env}")
|
||||
|
||||
if terminal_env == "ssh":
|
||||
ssh_host = os.getenv("TERMINAL_SSH_HOST", "")
|
||||
ssh_user = os.getenv("TERMINAL_SSH_USER", "")
|
||||
print(f" SSH Host: {ssh_host or '(not set)'}")
|
||||
print(f" SSH User: {ssh_user or '(not set)'}")
|
||||
elif terminal_env == "docker":
|
||||
docker_image = os.getenv("TERMINAL_DOCKER_IMAGE", "python:3.11-slim")
|
||||
print(f" Docker Image: {docker_image}")
|
||||
|
||||
sudo_password = os.getenv("SUDO_PASSWORD", "")
|
||||
print(f" Sudo: {check_mark(bool(sudo_password))} {'enabled' if sudo_password else 'disabled'}")
|
||||
|
||||
# =========================================================================
|
||||
# Messaging Platforms
|
||||
# =========================================================================
|
||||
print()
|
||||
print(color("◆ Messaging Platforms", Colors.CYAN, Colors.BOLD))
|
||||
|
||||
platforms = {
|
||||
"Telegram": ("TELEGRAM_BOT_TOKEN", "TELEGRAM_HOME_CHANNEL"),
|
||||
"Discord": ("DISCORD_BOT_TOKEN", "DISCORD_HOME_CHANNEL"),
|
||||
"WhatsApp": ("WHATSAPP_ENABLED", None),
|
||||
}
|
||||
|
||||
for name, (token_var, home_var) in platforms.items():
|
||||
token = os.getenv(token_var, "")
|
||||
has_token = bool(token)
|
||||
|
||||
home_channel = ""
|
||||
if home_var:
|
||||
home_channel = os.getenv(home_var, "")
|
||||
|
||||
status = "configured" if has_token else "not configured"
|
||||
if home_channel:
|
||||
status += f" (home: {home_channel})"
|
||||
|
||||
print(f" {name:<12} {check_mark(has_token)} {status}")
|
||||
|
||||
# =========================================================================
|
||||
# Gateway Status
|
||||
# =========================================================================
|
||||
print()
|
||||
print(color("◆ Gateway Service", Colors.CYAN, Colors.BOLD))
|
||||
|
||||
if sys.platform.startswith('linux'):
|
||||
result = subprocess.run(
|
||||
["systemctl", "--user", "is-active", "hermes-gateway"],
|
||||
capture_output=True,
|
||||
text=True
|
||||
)
|
||||
is_active = result.stdout.strip() == "active"
|
||||
print(f" Status: {check_mark(is_active)} {'running' if is_active else 'stopped'}")
|
||||
print(f" Manager: systemd (user)")
|
||||
|
||||
elif sys.platform == 'darwin':
|
||||
result = subprocess.run(
|
||||
["launchctl", "list", "ai.hermes.gateway"],
|
||||
capture_output=True,
|
||||
text=True
|
||||
)
|
||||
is_loaded = result.returncode == 0
|
||||
print(f" Status: {check_mark(is_loaded)} {'loaded' if is_loaded else 'not loaded'}")
|
||||
print(f" Manager: launchd")
|
||||
else:
|
||||
print(f" Status: {color('N/A', Colors.DIM)}")
|
||||
print(f" Manager: (not supported on this platform)")
|
||||
|
||||
# =========================================================================
|
||||
# Cron Jobs
|
||||
# =========================================================================
|
||||
print()
|
||||
print(color("◆ Scheduled Jobs", Colors.CYAN, Colors.BOLD))
|
||||
|
||||
jobs_file = Path.home() / ".hermes" / "cron" / "jobs.json"
|
||||
if jobs_file.exists():
|
||||
import json
|
||||
try:
|
||||
with open(jobs_file) as f:
|
||||
data = json.load(f)
|
||||
jobs = data.get("jobs", [])
|
||||
enabled_jobs = [j for j in jobs if j.get("enabled", True)]
|
||||
print(f" Jobs: {len(enabled_jobs)} active, {len(jobs)} total")
|
||||
except:
|
||||
print(f" Jobs: (error reading jobs file)")
|
||||
else:
|
||||
print(f" Jobs: 0")
|
||||
|
||||
# =========================================================================
|
||||
# Sessions
|
||||
# =========================================================================
|
||||
print()
|
||||
print(color("◆ Sessions", Colors.CYAN, Colors.BOLD))
|
||||
|
||||
sessions_file = Path.home() / ".hermes" / "sessions" / "sessions.json"
|
||||
if sessions_file.exists():
|
||||
import json
|
||||
try:
|
||||
with open(sessions_file) as f:
|
||||
data = json.load(f)
|
||||
print(f" Active: {len(data)} session(s)")
|
||||
except:
|
||||
print(f" Active: (error reading sessions file)")
|
||||
else:
|
||||
print(f" Active: 0")
|
||||
|
||||
# =========================================================================
|
||||
# Deep checks
|
||||
# =========================================================================
|
||||
if deep:
|
||||
print()
|
||||
print(color("◆ Deep Checks", Colors.CYAN, Colors.BOLD))
|
||||
|
||||
# Check OpenRouter connectivity
|
||||
openrouter_key = os.getenv("OPENROUTER_API_KEY", "")
|
||||
if openrouter_key:
|
||||
try:
|
||||
import httpx
|
||||
response = httpx.get(
|
||||
"https://openrouter.ai/api/v1/models",
|
||||
headers={"Authorization": f"Bearer {openrouter_key}"},
|
||||
timeout=10
|
||||
)
|
||||
ok = response.status_code == 200
|
||||
print(f" OpenRouter: {check_mark(ok)} {'reachable' if ok else f'error ({response.status_code})'}")
|
||||
except Exception as e:
|
||||
print(f" OpenRouter: {check_mark(False)} error: {e}")
|
||||
|
||||
# Check gateway port
|
||||
try:
|
||||
import socket
|
||||
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
|
||||
sock.settimeout(1)
|
||||
result = sock.connect_ex(('127.0.0.1', 18789))
|
||||
sock.close()
|
||||
# Port in use = gateway likely running
|
||||
port_in_use = result == 0
|
||||
# This is informational, not necessarily bad
|
||||
print(f" Port 18789: {'in use' if port_in_use else 'available'}")
|
||||
except:
|
||||
pass
|
||||
|
||||
print()
|
||||
print(color("─" * 60, Colors.DIM))
|
||||
print(color(" Run 'hermes doctor' for detailed diagnostics", Colors.DIM))
|
||||
print(color(" Run 'hermes setup' to configure", Colors.DIM))
|
||||
print()
|
||||
341
hermes_cli/uninstall.py
Normal file
341
hermes_cli/uninstall.py
Normal file
@@ -0,0 +1,341 @@
|
||||
"""
|
||||
Hermes Agent Uninstaller.
|
||||
|
||||
Provides options for:
|
||||
- Full uninstall: Remove everything including configs and data
|
||||
- Keep data: Remove code but keep ~/.hermes/ (configs, sessions, logs)
|
||||
"""
|
||||
|
||||
import os
|
||||
import sys
|
||||
import shutil
|
||||
import subprocess
|
||||
from pathlib import Path
|
||||
from typing import Optional
|
||||
|
||||
# ANSI colors
|
||||
class Colors:
|
||||
RESET = "\033[0m"
|
||||
BOLD = "\033[1m"
|
||||
DIM = "\033[2m"
|
||||
RED = "\033[31m"
|
||||
GREEN = "\033[32m"
|
||||
YELLOW = "\033[33m"
|
||||
BLUE = "\033[34m"
|
||||
MAGENTA = "\033[35m"
|
||||
CYAN = "\033[36m"
|
||||
|
||||
def color(text: str, *codes) -> str:
|
||||
"""Apply color codes to text (only in TTY)."""
|
||||
if not sys.stdout.isatty():
|
||||
return text
|
||||
return "".join(codes) + text + Colors.RESET
|
||||
|
||||
def log_info(msg: str):
|
||||
print(f"{color('→', Colors.CYAN)} {msg}")
|
||||
|
||||
def log_success(msg: str):
|
||||
print(f"{color('✓', Colors.GREEN)} {msg}")
|
||||
|
||||
def log_warn(msg: str):
|
||||
print(f"{color('⚠', Colors.YELLOW)} {msg}")
|
||||
|
||||
def log_error(msg: str):
|
||||
print(f"{color('✗', Colors.RED)} {msg}")
|
||||
|
||||
|
||||
def get_project_root() -> Path:
|
||||
"""Get the project installation directory."""
|
||||
return Path(__file__).parent.parent.resolve()
|
||||
|
||||
|
||||
def get_hermes_home() -> Path:
|
||||
"""Get the Hermes home directory (~/.hermes)."""
|
||||
return Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
|
||||
|
||||
|
||||
def find_shell_configs() -> list:
|
||||
"""Find shell configuration files that might have PATH entries."""
|
||||
home = Path.home()
|
||||
configs = []
|
||||
|
||||
candidates = [
|
||||
home / ".bashrc",
|
||||
home / ".bash_profile",
|
||||
home / ".profile",
|
||||
home / ".zshrc",
|
||||
home / ".zprofile",
|
||||
]
|
||||
|
||||
for config in candidates:
|
||||
if config.exists():
|
||||
configs.append(config)
|
||||
|
||||
return configs
|
||||
|
||||
|
||||
def remove_path_from_shell_configs():
|
||||
"""Remove Hermes PATH entries from shell configuration files."""
|
||||
configs = find_shell_configs()
|
||||
removed_from = []
|
||||
|
||||
for config_path in configs:
|
||||
try:
|
||||
content = config_path.read_text()
|
||||
original_content = content
|
||||
|
||||
# Remove lines containing hermes-agent or hermes PATH entries
|
||||
new_lines = []
|
||||
skip_next = False
|
||||
|
||||
for line in content.split('\n'):
|
||||
# Skip the "# Hermes Agent" comment and following line
|
||||
if '# Hermes Agent' in line or '# hermes-agent' in line:
|
||||
skip_next = True
|
||||
continue
|
||||
if skip_next and ('hermes' in line.lower() and 'PATH' in line):
|
||||
skip_next = False
|
||||
continue
|
||||
skip_next = False
|
||||
|
||||
# Remove any PATH line containing hermes
|
||||
if 'hermes' in line.lower() and ('PATH=' in line or 'path=' in line.lower()):
|
||||
continue
|
||||
|
||||
new_lines.append(line)
|
||||
|
||||
new_content = '\n'.join(new_lines)
|
||||
|
||||
# Clean up multiple blank lines
|
||||
while '\n\n\n' in new_content:
|
||||
new_content = new_content.replace('\n\n\n', '\n\n')
|
||||
|
||||
if new_content != original_content:
|
||||
config_path.write_text(new_content)
|
||||
removed_from.append(config_path)
|
||||
|
||||
except Exception as e:
|
||||
log_warn(f"Could not update {config_path}: {e}")
|
||||
|
||||
return removed_from
|
||||
|
||||
|
||||
def remove_wrapper_script():
|
||||
"""Remove the hermes wrapper script if it exists."""
|
||||
wrapper_paths = [
|
||||
Path.home() / ".local" / "bin" / "hermes",
|
||||
Path("/usr/local/bin/hermes"),
|
||||
]
|
||||
|
||||
removed = []
|
||||
for wrapper in wrapper_paths:
|
||||
if wrapper.exists():
|
||||
try:
|
||||
# Check if it's our wrapper (contains hermes_cli reference)
|
||||
content = wrapper.read_text()
|
||||
if 'hermes_cli' in content or 'hermes-agent' in content:
|
||||
wrapper.unlink()
|
||||
removed.append(wrapper)
|
||||
except Exception as e:
|
||||
log_warn(f"Could not remove {wrapper}: {e}")
|
||||
|
||||
return removed
|
||||
|
||||
|
||||
def uninstall_gateway_service():
|
||||
"""Stop and uninstall the gateway service if running."""
|
||||
import platform
|
||||
|
||||
if platform.system() != "Linux":
|
||||
return False
|
||||
|
||||
service_file = Path.home() / ".config" / "systemd" / "user" / "hermes-gateway.service"
|
||||
|
||||
if not service_file.exists():
|
||||
return False
|
||||
|
||||
try:
|
||||
# Stop the service
|
||||
subprocess.run(
|
||||
["systemctl", "--user", "stop", "hermes-gateway"],
|
||||
capture_output=True,
|
||||
check=False
|
||||
)
|
||||
|
||||
# Disable the service
|
||||
subprocess.run(
|
||||
["systemctl", "--user", "disable", "hermes-gateway"],
|
||||
capture_output=True,
|
||||
check=False
|
||||
)
|
||||
|
||||
# Remove service file
|
||||
service_file.unlink()
|
||||
|
||||
# Reload systemd
|
||||
subprocess.run(
|
||||
["systemctl", "--user", "daemon-reload"],
|
||||
capture_output=True,
|
||||
check=False
|
||||
)
|
||||
|
||||
return True
|
||||
|
||||
except Exception as e:
|
||||
log_warn(f"Could not fully remove gateway service: {e}")
|
||||
return False
|
||||
|
||||
|
||||
def run_uninstall(args):
|
||||
"""
|
||||
Run the uninstall process.
|
||||
|
||||
Options:
|
||||
- Full uninstall: removes code + ~/.hermes/ (configs, data, logs)
|
||||
- Keep data: removes code but keeps ~/.hermes/ for future reinstall
|
||||
"""
|
||||
project_root = get_project_root()
|
||||
hermes_home = get_hermes_home()
|
||||
|
||||
print()
|
||||
print(color("┌─────────────────────────────────────────────────────────┐", Colors.MAGENTA, Colors.BOLD))
|
||||
print(color("│ 🦋 Hermes Agent Uninstaller │", Colors.MAGENTA, Colors.BOLD))
|
||||
print(color("└─────────────────────────────────────────────────────────┘", Colors.MAGENTA, Colors.BOLD))
|
||||
print()
|
||||
|
||||
# Show what will be affected
|
||||
print(color("Current Installation:", Colors.CYAN, Colors.BOLD))
|
||||
print(f" Code: {project_root}")
|
||||
print(f" Config: {hermes_home / 'config.yaml'}")
|
||||
print(f" Secrets: {hermes_home / '.env'}")
|
||||
print(f" Data: {hermes_home / 'cron/'}, {hermes_home / 'sessions/'}, {hermes_home / 'logs/'}")
|
||||
print()
|
||||
|
||||
# Ask for confirmation
|
||||
print(color("Uninstall Options:", Colors.YELLOW, Colors.BOLD))
|
||||
print()
|
||||
print(" 1) " + color("Keep data", Colors.GREEN) + " - Remove code only, keep configs/sessions/logs")
|
||||
print(" (Recommended - you can reinstall later with your settings intact)")
|
||||
print()
|
||||
print(" 2) " + color("Full uninstall", Colors.RED) + " - Remove everything including all data")
|
||||
print(" (Warning: This deletes all configs, sessions, and logs permanently)")
|
||||
print()
|
||||
print(" 3) " + color("Cancel", Colors.CYAN) + " - Don't uninstall")
|
||||
print()
|
||||
|
||||
try:
|
||||
choice = input(color("Select option [1/2/3]: ", Colors.BOLD)).strip()
|
||||
except (KeyboardInterrupt, EOFError):
|
||||
print()
|
||||
print("Cancelled.")
|
||||
return
|
||||
|
||||
if choice == "3" or choice.lower() in ("c", "cancel", "q", "quit", "n", "no"):
|
||||
print()
|
||||
print("Uninstall cancelled.")
|
||||
return
|
||||
|
||||
full_uninstall = (choice == "2")
|
||||
|
||||
# Final confirmation
|
||||
print()
|
||||
if full_uninstall:
|
||||
print(color("⚠️ WARNING: This will permanently delete ALL Hermes data!", Colors.RED, Colors.BOLD))
|
||||
print(color(" Including: configs, API keys, sessions, scheduled jobs, logs", Colors.RED))
|
||||
else:
|
||||
print("This will remove the Hermes code but keep your configuration and data.")
|
||||
|
||||
print()
|
||||
try:
|
||||
confirm = input(f"Type '{color('yes', Colors.YELLOW)}' to confirm: ").strip().lower()
|
||||
except (KeyboardInterrupt, EOFError):
|
||||
print()
|
||||
print("Cancelled.")
|
||||
return
|
||||
|
||||
if confirm != "yes":
|
||||
print()
|
||||
print("Uninstall cancelled.")
|
||||
return
|
||||
|
||||
print()
|
||||
print(color("Uninstalling...", Colors.CYAN, Colors.BOLD))
|
||||
print()
|
||||
|
||||
# 1. Stop and uninstall gateway service
|
||||
log_info("Checking for gateway service...")
|
||||
if uninstall_gateway_service():
|
||||
log_success("Gateway service stopped and removed")
|
||||
else:
|
||||
log_info("No gateway service found")
|
||||
|
||||
# 2. Remove PATH entries from shell configs
|
||||
log_info("Removing PATH entries from shell configs...")
|
||||
removed_configs = remove_path_from_shell_configs()
|
||||
if removed_configs:
|
||||
for config in removed_configs:
|
||||
log_success(f"Updated {config}")
|
||||
else:
|
||||
log_info("No PATH entries found to remove")
|
||||
|
||||
# 3. Remove wrapper script
|
||||
log_info("Removing hermes command...")
|
||||
removed_wrappers = remove_wrapper_script()
|
||||
if removed_wrappers:
|
||||
for wrapper in removed_wrappers:
|
||||
log_success(f"Removed {wrapper}")
|
||||
else:
|
||||
log_info("No wrapper script found")
|
||||
|
||||
# 4. Remove installation directory (code)
|
||||
log_info(f"Removing installation directory...")
|
||||
|
||||
# Check if we're running from within the install dir
|
||||
# We need to be careful here
|
||||
try:
|
||||
if project_root.exists():
|
||||
# If the install is inside ~/.hermes/, just remove the hermes-agent subdir
|
||||
if hermes_home in project_root.parents or project_root.parent == hermes_home:
|
||||
shutil.rmtree(project_root)
|
||||
log_success(f"Removed {project_root}")
|
||||
else:
|
||||
# Installation is somewhere else entirely
|
||||
shutil.rmtree(project_root)
|
||||
log_success(f"Removed {project_root}")
|
||||
except Exception as e:
|
||||
log_warn(f"Could not fully remove {project_root}: {e}")
|
||||
log_info("You may need to manually remove it")
|
||||
|
||||
# 5. Optionally remove ~/.hermes/ data directory
|
||||
if full_uninstall:
|
||||
log_info("Removing configuration and data...")
|
||||
try:
|
||||
if hermes_home.exists():
|
||||
shutil.rmtree(hermes_home)
|
||||
log_success(f"Removed {hermes_home}")
|
||||
except Exception as e:
|
||||
log_warn(f"Could not fully remove {hermes_home}: {e}")
|
||||
log_info("You may need to manually remove it")
|
||||
else:
|
||||
log_info(f"Keeping configuration and data in {hermes_home}")
|
||||
|
||||
# Done
|
||||
print()
|
||||
print(color("┌─────────────────────────────────────────────────────────┐", Colors.GREEN, Colors.BOLD))
|
||||
print(color("│ ✓ Uninstall Complete! │", Colors.GREEN, Colors.BOLD))
|
||||
print(color("└─────────────────────────────────────────────────────────┘", Colors.GREEN, Colors.BOLD))
|
||||
print()
|
||||
|
||||
if not full_uninstall:
|
||||
print(color("Your configuration and data have been preserved:", Colors.CYAN))
|
||||
print(f" {hermes_home}/")
|
||||
print()
|
||||
print("To reinstall later with your existing settings:")
|
||||
print(color(" curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash", Colors.DIM))
|
||||
print()
|
||||
|
||||
print(color("Reload your shell to complete the process:", Colors.YELLOW))
|
||||
print(" source ~/.bashrc # or ~/.zshrc")
|
||||
print()
|
||||
print("Thank you for using Hermes Agent! 🦋")
|
||||
print()
|
||||
353
local_server.py
353
local_server.py
@@ -1,353 +0,0 @@
|
||||
"""
|
||||
Local OpenAI-compatible server implementation for Hermes-Agent (Atropos integration).
|
||||
|
||||
Extends the Atropos APIServer to work with local OpenAI-compatible APIs (e.g. vLLM, SGLang),
|
||||
providing tokens_and_logprobs_completion support via client-side tokenization.
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import os
|
||||
import warnings
|
||||
from typing import Any, List, Optional
|
||||
|
||||
import openai
|
||||
from openai.types.chat.chat_completion import ChatCompletion
|
||||
from openai.types.completion import Completion
|
||||
|
||||
from atroposlib.envs.server_handling.server_baseline import (
|
||||
APIServer,
|
||||
APIServerConfig,
|
||||
ReasoningConfig,
|
||||
)
|
||||
|
||||
|
||||
class LocalServer(APIServer):
|
||||
"""
|
||||
OpenAI-compatible local server with tokens_and_logprobs support.
|
||||
|
||||
Uses an OpenAI-compatible API (typically at a /v1 endpoint) and handles
|
||||
token extraction via client-side tokenization.
|
||||
|
||||
Note: Many local servers don't return per-token logprobs in the standard API,
|
||||
so this implementation uses placeholder logprobs (0.0) for PoC purposes.
|
||||
For production training, use vLLM/SGLang servers that return real logprobs.
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
config: APIServerConfig,
|
||||
tokenizer: Optional[Any] = None,
|
||||
tokenizer_name: str = "gpt2",
|
||||
reasoning_config: Optional[ReasoningConfig] = None,
|
||||
):
|
||||
"""
|
||||
Initialize the local server.
|
||||
|
||||
Args:
|
||||
config: Server configuration
|
||||
tokenizer: Pre-initialized tokenizer (optional)
|
||||
tokenizer_name: Name of tokenizer to load if tokenizer not provided
|
||||
reasoning_config: Optional reasoning configuration
|
||||
"""
|
||||
# Build the OpenAI client pointing to the server's /v1 endpoint
|
||||
base_url = config.base_url
|
||||
if base_url and not base_url.endswith("/v1"):
|
||||
base_url = f"{base_url.rstrip('/')}/v1"
|
||||
|
||||
self.openai = openai.AsyncClient(
|
||||
api_key=config.api_key or "local", # Local servers often ignore auth
|
||||
base_url=base_url,
|
||||
timeout=config.timeout,
|
||||
)
|
||||
|
||||
# Initialize tokenizer
|
||||
if tokenizer is not None:
|
||||
self.tokenizer = tokenizer
|
||||
else:
|
||||
try:
|
||||
from transformers import AutoTokenizer # type: ignore
|
||||
except ModuleNotFoundError as exc:
|
||||
raise ModuleNotFoundError(
|
||||
"Missing optional dependency 'transformers'. Pass a tokenizer instance to LocalServer, "
|
||||
"or install transformers to enable `tokenizer_name` auto-loading."
|
||||
) from exc
|
||||
self.tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)
|
||||
|
||||
# Add a simple chat template if the tokenizer doesn't have one
|
||||
# This is needed for ManagedServer's chat_completion to work
|
||||
if not hasattr(self.tokenizer, 'chat_template') or self.tokenizer.chat_template is None:
|
||||
# Simple ChatML-style template
|
||||
self.tokenizer.chat_template = (
|
||||
"{% for message in messages %}"
|
||||
"{% if message['role'] == 'system' %}<|im_start|>system\n{{ message['content'] }}<|im_end|>\n"
|
||||
"{% elif message['role'] == 'user' %}<|im_start|>user\n{{ message['content'] }}<|im_end|>\n"
|
||||
"{% elif message['role'] == 'assistant' %}<|im_start|>assistant\n{{ message['content'] }}<|im_end|>\n"
|
||||
"{% endif %}"
|
||||
"{% endfor %}"
|
||||
"{% if add_generation_prompt %}<|im_start|>assistant\n{% endif %}"
|
||||
)
|
||||
|
||||
super().__init__(config, reasoning_config=reasoning_config)
|
||||
# Local servers are treated as always-healthy unless a status task is enabled.
|
||||
self.server_healthy = True
|
||||
|
||||
@classmethod
|
||||
def from_env(
|
||||
cls,
|
||||
base_url: Optional[str] = None,
|
||||
model: Optional[str] = None,
|
||||
api_key: Optional[str] = None,
|
||||
tokenizer_name: str = "gpt2",
|
||||
**kwargs,
|
||||
) -> "LocalServer":
|
||||
"""
|
||||
Create a LocalServer from environment variables (or explicit overrides).
|
||||
|
||||
Env vars (checked in order):
|
||||
- base URL: ATROPOS_SERVER_BASE_URL, OPENAI_BASE_URL, LOCAL_LLM_BASE_URL, LLM_BASE_URL
|
||||
- model: ATROPOS_SERVER_MODEL, LLM_MODEL, LOCAL_LLM_MODEL
|
||||
- api key: ATROPOS_SERVER_API_KEY, OPENAI_API_KEY, LOCAL_LLM_API_KEY, LLM_API_KEY
|
||||
"""
|
||||
from dotenv import load_dotenv
|
||||
load_dotenv()
|
||||
|
||||
base_url = (
|
||||
base_url
|
||||
or os.getenv("ATROPOS_SERVER_BASE_URL")
|
||||
or os.getenv("OPENAI_BASE_URL")
|
||||
or os.getenv("LOCAL_LLM_BASE_URL")
|
||||
or os.getenv("LLM_BASE_URL")
|
||||
or "http://localhost:11434"
|
||||
)
|
||||
model = (
|
||||
model
|
||||
or os.getenv("ATROPOS_SERVER_MODEL")
|
||||
or os.getenv("LLM_MODEL")
|
||||
or os.getenv("LOCAL_LLM_MODEL")
|
||||
or "hermes3:8b"
|
||||
)
|
||||
api_key = (
|
||||
api_key
|
||||
or os.getenv("ATROPOS_SERVER_API_KEY")
|
||||
or os.getenv("OPENAI_API_KEY")
|
||||
or os.getenv("LOCAL_LLM_API_KEY")
|
||||
or os.getenv("LLM_API_KEY")
|
||||
)
|
||||
|
||||
config = APIServerConfig(
|
||||
model_name=model,
|
||||
base_url=base_url,
|
||||
api_key=api_key or "local",
|
||||
timeout=kwargs.get("timeout", 120),
|
||||
num_max_requests_at_once=kwargs.get("num_max_requests_at_once", 4),
|
||||
num_requests_for_eval=kwargs.get("num_requests_for_eval", 4),
|
||||
health_check=False, # Local dev servers often lack /health
|
||||
)
|
||||
|
||||
return cls(config, tokenizer_name=tokenizer_name)
|
||||
|
||||
async def check_server_status_task(self, chat_completion: bool = True):
|
||||
"""
|
||||
Check if the server is healthy.
|
||||
|
||||
For local development, we generally assume the server is healthy.
|
||||
"""
|
||||
while True:
|
||||
try:
|
||||
# Simple health check via a minimal completion
|
||||
if chat_completion:
|
||||
await self.openai.chat.completions.create(
|
||||
model=self.config.model_name,
|
||||
messages=[{"role": "user", "content": "hi"}],
|
||||
max_tokens=1,
|
||||
)
|
||||
else:
|
||||
await self.openai.completions.create(
|
||||
model=self.config.model_name,
|
||||
prompt="hi",
|
||||
max_tokens=1,
|
||||
)
|
||||
self.server_healthy = True
|
||||
except Exception:
|
||||
self.server_healthy = False
|
||||
await asyncio.sleep(5)
|
||||
|
||||
async def _chat_completion_wrapper(self, **kwargs) -> ChatCompletion:
|
||||
"""
|
||||
Wrapper for chat completion using an OpenAI-compatible API.
|
||||
"""
|
||||
assert kwargs.get("model") is not None, "Model is required!"
|
||||
assert kwargs.get("messages") is not None, "Messages are required!"
|
||||
|
||||
n = kwargs.get("n", 1)
|
||||
|
||||
# Some OpenAI-compatible servers don't support n > 1, so we make multiple requests.
|
||||
if n > 1:
|
||||
completion_list = await asyncio.gather(
|
||||
*[self.openai.chat.completions.create(**{**kwargs, "n": 1}) for _ in range(n)]
|
||||
)
|
||||
# Merge completions
|
||||
completions = completion_list[0]
|
||||
for c in completion_list[1:]:
|
||||
for choice in c.choices:
|
||||
choice.index = len(completions.choices)
|
||||
completions.choices.append(choice)
|
||||
return completions
|
||||
else:
|
||||
return await self.openai.chat.completions.create(**kwargs)
|
||||
|
||||
async def _completion_wrapper(self, **kwargs) -> Completion:
|
||||
"""
|
||||
Wrapper for completion using an OpenAI-compatible API.
|
||||
"""
|
||||
assert kwargs.get("model") is not None, "Model is required!"
|
||||
assert kwargs.get("prompt") is not None, "Prompt is required!"
|
||||
|
||||
n = kwargs.get("n", 1)
|
||||
|
||||
# Some OpenAI-compatible servers don't support n > 1.
|
||||
if n > 1:
|
||||
completion_list = await asyncio.gather(
|
||||
*[self.openai.completions.create(**{**kwargs, "n": 1}) for _ in range(n)]
|
||||
)
|
||||
completions = completion_list[0]
|
||||
for c in completion_list[1:]:
|
||||
for choice in c.choices:
|
||||
choice.index = len(completions.choices)
|
||||
completions.choices.append(choice)
|
||||
return completions
|
||||
else:
|
||||
return await self.openai.completions.create(**kwargs)
|
||||
|
||||
async def _tokens_and_logprobs_completion_wrapper(
|
||||
self, **kwargs
|
||||
) -> tuple[List[int], List[List[int]], List[List[float]], List[str]]:
|
||||
"""
|
||||
Wrapper for tokens and logprobs completion.
|
||||
|
||||
Returns:
|
||||
Tuple of (prompt_tokens, output_tokens_list, output_logprobs_list, finish_reasons)
|
||||
|
||||
Note: Many OpenAI-compatible local servers don't return per-token logprobs,
|
||||
so we use placeholder logprobs (0.0). For real training, use vLLM/SGLang.
|
||||
"""
|
||||
model = kwargs.get("model")
|
||||
assert model is not None, "Model is required!"
|
||||
|
||||
# Handle input_ids (from ManagedServer) or prompt
|
||||
if "input_ids" in kwargs:
|
||||
prompt_tokens = kwargs.pop("input_ids")
|
||||
prompt = self.tokenizer.decode(prompt_tokens)
|
||||
kwargs.pop("prompt", None)
|
||||
else:
|
||||
prompt = kwargs.pop("prompt", "")
|
||||
prompt_tokens = self.tokenizer.encode(prompt, add_special_tokens=True)
|
||||
|
||||
n = kwargs.pop("n", 1)
|
||||
max_tokens = kwargs.pop("max_tokens", 256)
|
||||
temperature = kwargs.pop("temperature", 0.7)
|
||||
stop = kwargs.pop("stop", None)
|
||||
|
||||
# Make completion requests
|
||||
completions = []
|
||||
for _ in range(n):
|
||||
try:
|
||||
response = await self.openai.completions.create(
|
||||
model=model,
|
||||
prompt=prompt,
|
||||
max_tokens=max_tokens,
|
||||
temperature=temperature,
|
||||
stop=stop,
|
||||
)
|
||||
completions.append(response)
|
||||
except Exception as e:
|
||||
# Fallback to chat completion if completion endpoint not supported
|
||||
warnings.warn(f"Completion API failed, trying chat: {e}")
|
||||
response = await self.openai.chat.completions.create(
|
||||
model=model,
|
||||
messages=[{"role": "user", "content": prompt}],
|
||||
max_tokens=max_tokens,
|
||||
temperature=temperature,
|
||||
stop=stop,
|
||||
)
|
||||
# Convert to completion-like response
|
||||
completions.append(response)
|
||||
|
||||
output_tokens_list = []
|
||||
output_logprobs_list = []
|
||||
finish_reasons = []
|
||||
|
||||
for completion in completions:
|
||||
# Extract text from response
|
||||
if hasattr(completion.choices[0], "text"):
|
||||
# Completion API response
|
||||
text = completion.choices[0].text
|
||||
finish_reason = completion.choices[0].finish_reason or "stop"
|
||||
else:
|
||||
# Chat completion API response
|
||||
text = completion.choices[0].message.content or ""
|
||||
finish_reason = completion.choices[0].finish_reason or "stop"
|
||||
|
||||
# Tokenize output
|
||||
output_tokens = self.tokenizer.encode(text, add_special_tokens=False)
|
||||
|
||||
# Placeholder logprobs (many local servers don't provide per-token logprobs).
|
||||
# In production, use vLLM/SGLang which return real logprobs
|
||||
output_logprobs = [0.0] * len(output_tokens)
|
||||
|
||||
output_tokens_list.append(output_tokens)
|
||||
output_logprobs_list.append(output_logprobs)
|
||||
finish_reasons.append(finish_reason)
|
||||
|
||||
return prompt_tokens, output_tokens_list, output_logprobs_list, finish_reasons
|
||||
|
||||
def managed_server(self, tokenizer=None, track_tree: bool = False):
|
||||
"""
|
||||
Create a ManagedServer context manager for this server.
|
||||
|
||||
Args:
|
||||
tokenizer: Optional tokenizer override
|
||||
track_tree: Whether to maintain tree structure for multi-turn
|
||||
|
||||
Returns:
|
||||
ManagedServer context manager
|
||||
"""
|
||||
from atroposlib.envs.server_handling.managed_server import ManagedServer
|
||||
|
||||
return ManagedServerContext(
|
||||
self,
|
||||
tokenizer=tokenizer or self.tokenizer,
|
||||
track_tree=track_tree,
|
||||
)
|
||||
|
||||
|
||||
class ManagedServerContext:
|
||||
"""
|
||||
Context manager wrapper for ManagedServer.
|
||||
|
||||
Usage:
|
||||
async with server.managed_server(tokenizer=tokenizer) as managed:
|
||||
response = await managed.chat_completion(...)
|
||||
state = managed.get_state()
|
||||
"""
|
||||
|
||||
def __init__(self, server: LocalServer, tokenizer, track_tree: bool = False):
|
||||
self.server = server
|
||||
self.tokenizer = tokenizer
|
||||
self.track_tree = track_tree
|
||||
self.managed = None
|
||||
|
||||
async def __aenter__(self):
|
||||
from atroposlib.envs.server_handling.managed_server import ManagedServer
|
||||
|
||||
self.managed = ManagedServer(
|
||||
self.server,
|
||||
tokenizer=self.tokenizer,
|
||||
track_tree=self.track_tree,
|
||||
)
|
||||
return self.managed
|
||||
|
||||
async def __aexit__(self, exc_type, exc_val, exc_tb):
|
||||
if self.managed:
|
||||
self.managed.reset()
|
||||
return False
|
||||
@@ -1,62 +0,0 @@
|
||||
# Active Context
|
||||
|
||||
## Current Focus
|
||||
Singularity/Apptainer integration for HPC environments has been **COMPLETED AND TESTED**.
|
||||
|
||||
## Recently Completed (Feb 6, 2026)
|
||||
|
||||
### Singularity/Apptainer Sandbox Integration - FULLY WORKING
|
||||
Successfully adapted the Atropos implementation from Docker to Singularity/Apptainer for HPC clusters where Docker cannot run without sudo permissions.
|
||||
|
||||
**Files Modified:**
|
||||
1. `atropos/nomad/client.py` - Added `driver` and `singularity_image` parameters to `create_sandbox_job()`; Fixed port detection to check both `DynamicPorts` and `ReservedPorts` in `get_job_allocations()`
|
||||
2. `atropos/slots/pool.py` - Added `driver` and `singularity_image` to `SlotPoolConfig`
|
||||
3. `atropos/backends/nomad_backend.py` - Added driver options to `NomadBackendConfig`
|
||||
4. `atropos/envs/agent_env.py` - Added CLI arguments `--env.driver` and `--env.singularity_image` to `AgentEnvConfig`
|
||||
|
||||
**Files Created:**
|
||||
1. `nomad-singularity.hcl` - Nomad config with raw_exec driver enabled
|
||||
2. `atropos/atropos-sandbox.sif` - Singularity image (80MB) built from Docker image
|
||||
3. `test_singularity_job.py` - Test script for Singularity integration
|
||||
|
||||
**Key Implementation Details:**
|
||||
- Uses Nomad's `raw_exec` driver to run `apptainer` commands
|
||||
- Shell wrapper (`/bin/sh -c`) ensures Nomad environment variables expand correctly
|
||||
- Binds Nomad allocation directory to `/data` for workspace persistence
|
||||
- Uses **static ports** (`ReservedPorts`) instead of dynamic ports since raw_exec runs directly on host
|
||||
- `get_job_allocations()` now checks both `DynamicPorts` (Docker) and `ReservedPorts` (Singularity)
|
||||
|
||||
**Test Results (All Passing):**
|
||||
- Health check: ✅ Server responding with 5 slots
|
||||
- Bash execution: ✅ Commands execute inside Singularity container
|
||||
- Write file: ✅ File written to slot workspace
|
||||
- Read file: ✅ File read back successfully
|
||||
|
||||
## Usage
|
||||
|
||||
### For Docker (default):
|
||||
```python
|
||||
config = SlotPoolConfig(
|
||||
driver="docker",
|
||||
image="atropos-sandbox:local",
|
||||
)
|
||||
```
|
||||
|
||||
### For Singularity/Apptainer:
|
||||
```python
|
||||
config = SlotPoolConfig(
|
||||
driver="singularity",
|
||||
singularity_image="/path/to/atropos-sandbox.sif",
|
||||
)
|
||||
```
|
||||
|
||||
### Nomad Configuration:
|
||||
```bash
|
||||
# Start Nomad with Singularity support
|
||||
nomad agent -dev -config=nomad-singularity.hcl
|
||||
```
|
||||
|
||||
## Next Steps
|
||||
- Deploy to HPC cluster for production testing
|
||||
- Consider adding bubblewrap (bwrap) support inside Singularity for additional sandboxing
|
||||
- Document HPC-specific deployment procedures in skills/mlops/
|
||||
@@ -1,55 +0,0 @@
|
||||
# Product Context: Hermes-Agent
|
||||
|
||||
## Why This Project Exists
|
||||
|
||||
Hermes-Agent addresses several key challenges in the AI agent space:
|
||||
|
||||
1. **Unified Tool Interface** - Provides a clean, consistent interface for LLMs to use various tools (web, terminal, browser, vision, etc.) without requiring custom integration for each model provider.
|
||||
|
||||
2. **Training Data Generation** - Enables efficient generation of high-quality tool-calling trajectories for fine-tuning LLMs, with features like batch processing, checkpointing, and trajectory compression.
|
||||
|
||||
3. **Flexible Deployment** - Supports multiple execution environments (local, Docker, Singularity, Modal, SSH) to accommodate different security and isolation requirements.
|
||||
|
||||
4. **Developer Experience** - Offers a beautiful, interactive CLI with kawaii-style feedback that makes working with AI agents enjoyable.
|
||||
|
||||
## Problems It Solves
|
||||
|
||||
### For AI Researchers
|
||||
- **Data Generation at Scale**: Parallel batch processing with content-based checkpointing for fault tolerance
|
||||
- **Clean Trajectories**: Trajectory compression to fit token budgets while preserving important information
|
||||
- **Toolset Distributions**: Probability-based tool selection for varied training data
|
||||
|
||||
### For Developers
|
||||
- **Tool Orchestration**: Logical grouping of tools into toolsets (research, development, debugging, etc.)
|
||||
- **Session Persistence**: Conversation history and session logging for debugging
|
||||
- **Multi-Model Support**: Works with any OpenAI-compatible API (OpenRouter, local models, etc.)
|
||||
|
||||
### For MLOps
|
||||
- **Skills System**: On-demand knowledge documents for specific tools/frameworks (Axolotl, vLLM, TRL, etc.)
|
||||
- **Sandboxed Execution**: Terminal commands can run in isolated environments (Docker, Singularity, Modal)
|
||||
- **Configurable Backends**: Easy switching between local and cloud execution
|
||||
|
||||
## How It Should Work
|
||||
|
||||
### User Flow (CLI)
|
||||
1. User launches `./hermes`
|
||||
2. Beautiful welcome banner displays with caduceus logo, model info, and available tools
|
||||
3. User types a natural language request
|
||||
4. Agent processes request, potentially calling tools with animated feedback
|
||||
5. Agent responds with results, conversation continues
|
||||
6. Session is automatically logged for debugging
|
||||
|
||||
### User Flow (Batch Processing)
|
||||
1. User prepares JSONL file with prompts
|
||||
2. Runs `batch_runner.py` with distribution and worker count
|
||||
3. System processes prompts in parallel, saves checkpoints
|
||||
4. Completed trajectories saved to `data/<run_name>/trajectories.jsonl`
|
||||
5. Optional: compress trajectories with `trajectory_compressor.py`
|
||||
|
||||
## User Experience Goals
|
||||
|
||||
- **Delightful Interaction**: Kawaii ASCII faces, animated spinners, cute messages
|
||||
- **Informative Feedback**: Clear progress indication during tool execution
|
||||
- **Configurable Personalities**: From "helpful" to "pirate" to "Shakespeare"
|
||||
- **Easy Configuration**: YAML config file + environment variables + CLI flags
|
||||
- **Graceful Degradation**: Missing tools/APIs don't break the system, just disable features
|
||||
@@ -1,67 +0,0 @@
|
||||
# Progress
|
||||
|
||||
## Completed Features
|
||||
|
||||
### ✅ Singularity/Apptainer Sandbox Integration (Feb 6, 2026 - FULLY TESTED)
|
||||
Adapted the Atropos sandbox environment from Docker to Singularity/Apptainer for HPC clusters.
|
||||
|
||||
**What Works:**
|
||||
- `create_sandbox_job()` supports both `driver="docker"` and `driver="singularity"`
|
||||
- SlotPoolConfig and NomadBackendConfig propagate driver settings
|
||||
- Singularity container runs sandbox_server.py via Nomad's raw_exec driver
|
||||
- All sandbox operations work: bash execution, file read/write
|
||||
- Nomad environment variables properly expanded via shell wrapper
|
||||
- **CLI arguments** `--env.driver` and `--env.singularity_image` for AgentEnvConfig
|
||||
- **Static port binding** for Singularity (ReservedPorts vs DynamicPorts)
|
||||
- **Port detection** works for both Docker and Singularity allocations
|
||||
|
||||
**CLI Usage:**
|
||||
```bash
|
||||
python -m atropos.envs.swe_smith_oracle_env process \
|
||||
--env.driver singularity \
|
||||
--env.singularity_image /path/to/atropos-sandbox.sif
|
||||
```
|
||||
|
||||
**Created Files:**
|
||||
- `nomad-singularity.hcl` - Nomad config with raw_exec enabled
|
||||
- `atropos/atropos-sandbox.sif` - 80MB Singularity image
|
||||
- `test_singularity_job.py` - Integration test script
|
||||
|
||||
**Modified Files:**
|
||||
- `atropos/nomad/client.py` - driver support + ReservedPorts detection
|
||||
- `atropos/slots/pool.py` - driver config fields
|
||||
- `atropos/backends/nomad_backend.py` - driver config fields
|
||||
- `atropos/envs/agent_env.py` - CLI arguments for driver selection
|
||||
|
||||
### ✅ Memory Bank Initialized (Feb 5, 2026)
|
||||
Set up project documentation structure for context persistence.
|
||||
|
||||
## In Progress
|
||||
None currently.
|
||||
|
||||
## Known Issues
|
||||
- `bwrap_available: false` in Singularity containers - bubblewrap sandboxing not available inside the container (kernel namespaces already in use)
|
||||
- Health check timing - may need longer wait for container startup on slower systems
|
||||
|
||||
## What's Left to Build
|
||||
|
||||
### HPC Deployment
|
||||
- [ ] Test on actual HPC cluster with Slurm/PBS integration
|
||||
- [ ] Document cluster-specific deployment procedures
|
||||
- [ ] Add support for shared filesystem workspace binding
|
||||
|
||||
### Enhanced Sandboxing
|
||||
- [ ] Investigate alternative sandboxing inside Singularity (seccomp, etc.)
|
||||
- [ ] Add network isolation options for Singularity
|
||||
|
||||
### Documentation
|
||||
- [ ] Add Singularity deployment to README
|
||||
- [ ] Create HPC deployment skill in skills/mlops/
|
||||
|
||||
## Evolution of Decisions
|
||||
|
||||
### Container Runtime Selection
|
||||
- **Initial**: Docker-only via Nomad docker driver
|
||||
- **Problem**: HPC clusters don't allow Docker without sudo
|
||||
- **Solution**: Added Singularity/Apptainer support via raw_exec driver
|
||||
- **Result**: Both runtimes now supported with same API
|
||||
@@ -1,44 +0,0 @@
|
||||
# Project Brief: Hermes-Agent
|
||||
|
||||
## Overview
|
||||
Hermes-Agent is an AI agent harness for LLMs with advanced tool-calling capabilities, featuring a flexible toolsets system for organizing and managing tools. Named after Hermes, the Greek messenger god, it serves as a bridge between human intent and AI-powered task execution.
|
||||
|
||||
## Core Requirements
|
||||
|
||||
### Primary Goals
|
||||
1. **Interactive CLI Experience** - Beautiful terminal interface with animated feedback, personalities, and session management
|
||||
2. **Flexible Tool System** - Modular tools organized into logical toolsets for different use cases
|
||||
3. **Batch Processing** - Process multiple prompts in parallel with checkpointing and statistics
|
||||
4. **Multi-Backend Support** - Support for local, Docker, Singularity, Modal, and SSH terminal backends
|
||||
5. **Training Data Generation** - Save conversation trajectories in formats suitable for LLM fine-tuning
|
||||
|
||||
### Target Users
|
||||
- AI researchers generating training data
|
||||
- Developers needing an AI assistant with tool access
|
||||
- MLOps practitioners automating workflows
|
||||
- Anyone needing a powerful CLI-based AI agent
|
||||
|
||||
## Scope
|
||||
|
||||
### In Scope
|
||||
- Interactive CLI with rich formatting and kawaii-style feedback
|
||||
- Web tools (search, extract, crawl via Firecrawl)
|
||||
- Terminal tools (command execution across multiple backends)
|
||||
- Browser automation (via agent-browser + Browserbase)
|
||||
- Vision tools (image analysis)
|
||||
- Image generation (FLUX via FAL.ai)
|
||||
- Mixture-of-Agents reasoning
|
||||
- Skills system for on-demand knowledge
|
||||
- Batch processing with parallel workers
|
||||
- Trajectory compression for training
|
||||
|
||||
### Out of Scope (Current)
|
||||
- Proactive suggestions (agent only runs on request)
|
||||
- Clipboard integration (no local system access)
|
||||
- Real-time streaming of thinking/reasoning (deferred)
|
||||
|
||||
## Success Metrics
|
||||
- Clean, maintainable tool architecture
|
||||
- Reliable tool execution with proper error handling
|
||||
- Efficient context management for long conversations
|
||||
- High-quality trajectory data for training
|
||||
@@ -1,149 +0,0 @@
|
||||
# System Patterns: Hermes-Agent
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ CLI (cli.py) │
|
||||
│ - Rich welcome banner with caduceus │
|
||||
│ - prompt_toolkit for input with history │
|
||||
│ - Kawaii-style feedback and personalities │
|
||||
└────────────────────────────┬────────────────────────────────────┘
|
||||
│
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ AIAgent (run_agent.py) │
|
||||
│ - Conversation loop with tool calling │
|
||||
│ - KawaiiSpinner for animated feedback │
|
||||
│ - Retry logic with exponential backoff │
|
||||
│ - Session logging to logs/ directory │
|
||||
└────────────────────────────┬────────────────────────────────────┘
|
||||
│
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Tool Routing (model_tools.py) │
|
||||
│ - get_tool_definitions() - returns tools for API calls │
|
||||
│ - handle_function_call() - dispatches to tool handlers │
|
||||
│ - Toolset filtering (enabled/disabled) │
|
||||
└────────────────────────────┬────────────────────────────────────┘
|
||||
│
|
||||
┌─────────────────┼─────────────────┐
|
||||
▼ ▼ ▼
|
||||
┌───────────┐ ┌───────────┐ ┌───────────┐
|
||||
│ Web Tools │ │ Terminal │ │ Browser │
|
||||
│ (Firecrawl)│ │ (mini-swe)│ │(agent-brw)│
|
||||
└───────────┘ └───────────┘ └───────────┘
|
||||
│ │ │
|
||||
└─────────────────┼─────────────────┘
|
||||
▼
|
||||
┌───────────────┐
|
||||
│ Toolsets │
|
||||
│ (toolsets.py)│
|
||||
│ Composition │
|
||||
└───────────────┘
|
||||
```
|
||||
|
||||
## Key Design Patterns
|
||||
|
||||
### 1. Toolset Composition Pattern
|
||||
Toolsets can include other toolsets, allowing flexible composition:
|
||||
|
||||
```python
|
||||
TOOLSETS = {
|
||||
"web": {"tools": ["web_search", "web_extract"], "includes": []},
|
||||
"debugging": {"tools": ["terminal"], "includes": ["web"]},
|
||||
"full_stack": {"tools": [], "includes": ["web", "terminal", "vision", "browser"]}
|
||||
}
|
||||
```
|
||||
|
||||
Resolution is recursive with cycle detection.
|
||||
|
||||
### 2. Graceful Degradation Pattern
|
||||
Each tool module has a `check_*_requirements()` function:
|
||||
- Tools are only loaded if requirements are met
|
||||
- Missing API keys disable tools, not crash the system
|
||||
- Import errors are caught and tools marked unavailable
|
||||
|
||||
```python
|
||||
try:
|
||||
from tools.web_tools import web_search_tool, check_firecrawl_api_key
|
||||
except ModuleNotFoundError:
|
||||
web_search_tool = None
|
||||
def check_firecrawl_api_key(): return False
|
||||
```
|
||||
|
||||
### 3. Session Isolation Pattern (task_id)
|
||||
Stateful tools (terminal, browser) use `task_id` to isolate concurrent sessions:
|
||||
- Each batch worker gets unique task_id
|
||||
- VMs and browser sessions are tracked per task_id
|
||||
- Cleanup functions release resources: `cleanup_vm(task_id)`, `cleanup_browser(task_id)`
|
||||
|
||||
### 4. Trajectory Format Pattern
|
||||
Conversations are saved in ShareGPT format for training:
|
||||
|
||||
```json
|
||||
{"from": "system", "value": "System prompt with <tools>...</tools>"}
|
||||
{"from": "human", "value": "User message"}
|
||||
{"from": "gpt", "value": "<think>reasoning</think>\n<tool_call>{...}</tool_call>"}
|
||||
{"from": "tool", "value": "<tool_response>{...}</tool_response>"}
|
||||
{"from": "gpt", "value": "Final response"}
|
||||
```
|
||||
|
||||
### 5. Ephemeral System Prompt Pattern
|
||||
Guide model behavior during data collection without saving to trajectories:
|
||||
- `ephemeral_system_prompt` influences execution
|
||||
- Only standard tool-calling system prompt saved to trajectories
|
||||
- Keeps training data clean
|
||||
|
||||
### 6. Retry with Validation Pattern
|
||||
The agent validates responses before accepting:
|
||||
- Check tool names against `valid_tool_names` set
|
||||
- Validate JSON arguments can be parsed
|
||||
- Check for content after `<think>` blocks
|
||||
- Roll back to last valid state on persistent failures
|
||||
|
||||
## Component Relationships
|
||||
|
||||
### AIAgent Class
|
||||
- Central orchestrator for conversations
|
||||
- Manages conversation history
|
||||
- Calls OpenAI-compatible API
|
||||
- Routes tool calls to handlers
|
||||
- Provides animated feedback (KawaiiSpinner)
|
||||
|
||||
### Tool Modules (tools/*.py)
|
||||
- Self-contained tool implementations
|
||||
- Export: handler function + check function + schema
|
||||
- Return JSON strings (never raw dicts)
|
||||
- Accept optional `task_id` for stateful tools
|
||||
|
||||
### Toolsets System (toolsets.py)
|
||||
- Defines logical groupings of tools
|
||||
- Supports composition via `includes`
|
||||
- `resolve_toolset()` recursively resolves all tools
|
||||
- `validate_toolset()` checks if name is valid
|
||||
|
||||
### Model Tools (model_tools.py)
|
||||
- Aggregates all tool definitions
|
||||
- Routes function calls to correct handlers
|
||||
- Filters tools based on enabled/disabled toolsets
|
||||
- Bridge between agent and tool implementations
|
||||
|
||||
## Critical Implementation Paths
|
||||
|
||||
### Tool Execution Flow
|
||||
1. AIAgent receives tool_calls from API response
|
||||
2. Validates tool names against `valid_tool_names`
|
||||
3. Validates JSON arguments can be parsed
|
||||
4. Calls `handle_function_call()` with tool name, args, task_id
|
||||
5. `handle_function_call()` routes to appropriate handler
|
||||
6. Tool executes, returns JSON string
|
||||
7. Result added to conversation as tool message
|
||||
8. Loop continues until natural language response
|
||||
|
||||
### Configuration Loading Flow
|
||||
1. `cli.py` calls `load_cli_config()`
|
||||
2. Loads `cli-config.yaml`, merges with defaults
|
||||
3. Sets environment variables for terminal config
|
||||
4. `AIAgent` reads env vars when initializing terminal tool
|
||||
5. Terminal tool creates appropriate backend based on `TERMINAL_ENV`
|
||||
@@ -1,113 +0,0 @@
|
||||
# Technical Context: Hermes-Agent
|
||||
|
||||
## Technologies Used
|
||||
|
||||
### Core Stack
|
||||
- **Python 3.11+** - Primary language
|
||||
- **OpenAI SDK** - For LLM API interactions (OpenAI-compatible)
|
||||
- **OpenRouter** - Default LLM provider (supports multiple models)
|
||||
- **Rich** - Terminal formatting and panels
|
||||
- **prompt_toolkit** - Interactive input with history
|
||||
- **Fire** - CLI argument parsing
|
||||
- **PyYAML** - Configuration files
|
||||
- **python-dotenv** - Environment variable management
|
||||
|
||||
### Tool Dependencies
|
||||
- **Firecrawl** - Web search and extraction (`FIRECRAWL_API_KEY`)
|
||||
- **mini-swe-agent** - Terminal tool backend (local/docker/singularity/modal/ssh)
|
||||
- **agent-browser** - Browser automation (npm package)
|
||||
- **Browserbase** - Cloud browser execution (`BROWSERBASE_API_KEY`)
|
||||
- **FAL.ai** - Image generation with FLUX (`FAL_KEY`)
|
||||
- **Nous API** - Vision and MoA tools (`NOUS_API_KEY`)
|
||||
|
||||
### Optional Dependencies
|
||||
- **Modal** - Cloud compute for sandboxed environments
|
||||
- **Singularity/Apptainer** - Rootless containers (HPC environments)
|
||||
- **Docker** - Container isolation
|
||||
|
||||
## Development Setup
|
||||
|
||||
### Quick Start
|
||||
```bash
|
||||
# Clone with submodules
|
||||
git clone --recurse-submodules https://github.com/NousResearch/Hermes-Agent.git
|
||||
cd Hermes-Agent
|
||||
|
||||
# Create virtual environment
|
||||
python3 -m venv venv
|
||||
source venv/bin/activate
|
||||
|
||||
# Install dependencies
|
||||
pip install -r requirements.txt
|
||||
pip install -e ./mini-swe-agent
|
||||
|
||||
# Install browser tools (optional)
|
||||
npm install
|
||||
|
||||
# Configure environment
|
||||
cp .env.example .env
|
||||
# Edit .env with your API keys
|
||||
```
|
||||
|
||||
### Key Configuration Files
|
||||
- `.env` - API keys and secrets
|
||||
- `cli-config.yaml` - CLI configuration (model, terminal, toolsets, personalities)
|
||||
- `configs/` - Batch run scripts and configuration
|
||||
|
||||
### Environment Variables
|
||||
|
||||
**Required for Full Functionality:**
|
||||
- `OPENROUTER_API_KEY` - Primary LLM access
|
||||
- `FIRECRAWL_API_KEY` - Web tools
|
||||
- `NOUS_API_KEY` - Vision and reasoning tools
|
||||
- `FAL_KEY` - Image generation
|
||||
|
||||
**Terminal Backend:**
|
||||
- `TERMINAL_ENV` - Backend type: `local`, `docker`, `singularity`, `modal`, `ssh`
|
||||
- `TERMINAL_CWD` - Working directory
|
||||
- `TERMINAL_DOCKER_IMAGE` / `TERMINAL_SINGULARITY_IMAGE` - Container images
|
||||
- `TERMINAL_SSH_HOST/USER/KEY` - SSH backend config
|
||||
- `SUDO_PASSWORD` - Optional sudo support
|
||||
|
||||
**Browser:**
|
||||
- `BROWSERBASE_API_KEY` - Browser automation
|
||||
- `BROWSERBASE_PROJECT_ID` - Browserbase project
|
||||
|
||||
## Technical Constraints
|
||||
|
||||
1. **Context Window Limits** - Long tool outputs can exhaust context; trajectory compression helps
|
||||
2. **API Rate Limits** - OpenRouter and tool APIs have rate limits; exponential backoff implemented
|
||||
3. **Tool Availability** - Tools gracefully degrade if dependencies/keys missing
|
||||
4. **Async Compatibility** - Some tools are async, handled via `asyncio.run()` in sync context
|
||||
|
||||
## Dependency Graph
|
||||
|
||||
```
|
||||
tools/*.py → tools/__init__.py → model_tools.py → toolsets.py → toolset_distributions.py
|
||||
↑
|
||||
run_agent.py ──────────────────────────┘
|
||||
cli.py → run_agent.py (uses AIAgent with quiet_mode=True)
|
||||
batch_runner.py → run_agent.py + toolset_distributions.py
|
||||
```
|
||||
|
||||
## Tool Usage Patterns
|
||||
|
||||
### Adding a New Tool
|
||||
1. Create `tools/your_tool.py` with handler + requirements check
|
||||
2. Export in `tools/__init__.py`
|
||||
3. Register in `model_tools.py` (definitions + handler routing)
|
||||
4. Add to toolset in `toolsets.py`
|
||||
5. Optionally add to `toolset_distributions.py` for batch processing
|
||||
|
||||
### Tool Handler Pattern
|
||||
```python
|
||||
def your_tool(param: str, task_id: str = None) -> str:
|
||||
"""Execute tool and return JSON string result."""
|
||||
try:
|
||||
result = {"success": True, "data": "..."}
|
||||
return json.dumps(result, ensure_ascii=False)
|
||||
except Exception as e:
|
||||
return json.dumps({"error": str(e)}, ensure_ascii=False)
|
||||
```
|
||||
|
||||
All tool handlers MUST return a JSON string, never raw dicts.
|
||||
@@ -1,134 +0,0 @@
|
||||
# Modal Sandbox Profiles Configuration
|
||||
# =====================================
|
||||
# This file defines different sandbox profiles for heterogeneous workloads.
|
||||
# Copy to modal_profiles.yaml and customize as needed.
|
||||
#
|
||||
# Usage:
|
||||
# terminal_tool("python train.py", profile="pytorch-gpu")
|
||||
# terminal_tool("npm test", profile="node")
|
||||
#
|
||||
# Each profile can specify:
|
||||
# - image: Docker image to use
|
||||
# - gpu: GPU type (null, "T4", "A10G", "A100", "H100")
|
||||
# - cpu: CPU cores (float)
|
||||
# - memory: Memory in MB
|
||||
# - min_pool: Minimum warm sandboxes (cost vs latency tradeoff)
|
||||
# - max_pool: Maximum sandboxes (hard cost cap)
|
||||
# - idle_timeout: Server-side auto-cleanup in seconds
|
||||
# - max_lifetime: Maximum sandbox lifetime in seconds
|
||||
# - scale_down_idle: Client-side scale-down threshold in seconds
|
||||
# - workdir: Working directory inside container
|
||||
# - secrets: List of Modal Secret names to inject (created via dashboard/CLI)
|
||||
# - env_vars: Dict of environment variables to pass directly
|
||||
# - use_dotenv: If true, loads local .env file into sandbox
|
||||
#
|
||||
# SECRETS SETUP:
|
||||
# Create secrets via Modal dashboard or CLI:
|
||||
# modal secret create huggingface-token HF_TOKEN=hf_xxx
|
||||
# modal secret create openai-key OPENAI_API_KEY=sk-xxx
|
||||
# Then reference by name in profile's secrets list.
|
||||
|
||||
# Default profile used when no profile specified
|
||||
default_profile: default
|
||||
|
||||
profiles:
|
||||
# Default Python environment - good for most tasks
|
||||
default:
|
||||
image: python:3.11
|
||||
gpu: null
|
||||
cpu: 1.0
|
||||
memory: 2048
|
||||
min_pool: 1 # Keep 1 warm for fast response
|
||||
max_pool: 5
|
||||
idle_timeout: 120 # Modal terminates if idle 2 min
|
||||
max_lifetime: 3600 # Max 1 hour
|
||||
scale_down_idle: 180
|
||||
workdir: /workspace
|
||||
secrets: [] # Add secret names here: ["my-api-keys"]
|
||||
env_vars: {} # Add env vars here: {DEBUG: "1"}
|
||||
use_dotenv: false # Set to true to load local .env
|
||||
|
||||
# PyTorch with GPU for ML training/inference
|
||||
pytorch-gpu:
|
||||
image: pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime
|
||||
gpu: T4 # Options: T4, A10G, A100, H100
|
||||
cpu: 4.0
|
||||
memory: 16384 # 16GB
|
||||
min_pool: 0 # Don't keep GPU sandboxes warm (expensive!)
|
||||
max_pool: 2
|
||||
idle_timeout: 60 # Shorter idle timeout for GPU (cost)
|
||||
max_lifetime: 1800 # 30 min max for GPU tasks
|
||||
scale_down_idle: 60
|
||||
workdir: /workspace
|
||||
# ML-specific secrets
|
||||
secrets:
|
||||
- huggingface-token # HF_TOKEN env var
|
||||
- wandb-key # WANDB_API_KEY env var
|
||||
env_vars:
|
||||
CUDA_VISIBLE_DEVICES: "0"
|
||||
PYTORCH_CUDA_ALLOC_CONF: "expandable_segments:True"
|
||||
|
||||
# High-end GPU for large models
|
||||
pytorch-a100:
|
||||
image: pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime
|
||||
gpu: A100
|
||||
cpu: 8.0
|
||||
memory: 65536 # 64GB
|
||||
min_pool: 0
|
||||
max_pool: 1 # Only 1 at a time (very expensive)
|
||||
idle_timeout: 30
|
||||
max_lifetime: 3600
|
||||
scale_down_idle: 30
|
||||
workdir: /workspace
|
||||
|
||||
# Node.js for JavaScript/TypeScript tasks
|
||||
node:
|
||||
image: node:18
|
||||
gpu: null
|
||||
cpu: 1.0
|
||||
memory: 2048
|
||||
min_pool: 0 # Create on-demand
|
||||
max_pool: 3
|
||||
idle_timeout: 120
|
||||
max_lifetime: 3600
|
||||
scale_down_idle: 180
|
||||
workdir: /workspace
|
||||
|
||||
# High memory for data processing
|
||||
high-memory:
|
||||
image: python:3.11
|
||||
gpu: null
|
||||
cpu: 4.0
|
||||
memory: 32768 # 32GB
|
||||
min_pool: 0
|
||||
max_pool: 2
|
||||
idle_timeout: 120
|
||||
max_lifetime: 3600
|
||||
scale_down_idle: 180
|
||||
workdir: /workspace
|
||||
|
||||
# Rust development environment
|
||||
rust:
|
||||
image: rust:1.75
|
||||
gpu: null
|
||||
cpu: 2.0
|
||||
memory: 4096
|
||||
min_pool: 0
|
||||
max_pool: 2
|
||||
idle_timeout: 120
|
||||
max_lifetime: 3600
|
||||
scale_down_idle: 180
|
||||
workdir: /workspace
|
||||
|
||||
# Go development environment
|
||||
golang:
|
||||
image: golang:1.21
|
||||
gpu: null
|
||||
cpu: 2.0
|
||||
memory: 4096
|
||||
min_pool: 0
|
||||
max_pool: 2
|
||||
idle_timeout: 120
|
||||
max_lifetime: 3600
|
||||
scale_down_idle: 180
|
||||
workdir: /workspace
|
||||
1039
model_tools.py
1039
model_tools.py
File diff suppressed because it is too large
Load Diff
@@ -1,37 +0,0 @@
|
||||
# Nomad Development Configuration (Hermes-Agent)
|
||||
# Run with: nomad agent -dev -config=nomad-dev.hcl
|
||||
#
|
||||
# This is intended for local development only.
|
||||
|
||||
client {
|
||||
enabled = true
|
||||
|
||||
options {
|
||||
# Enable Docker volume mounts for persistent slot workspaces
|
||||
"docker.volumes.enabled" = "true"
|
||||
}
|
||||
}
|
||||
|
||||
# Docker driver plugin configuration
|
||||
plugin "docker" {
|
||||
config {
|
||||
# CRITICAL: Enable volume mounts
|
||||
volumes {
|
||||
enabled = true
|
||||
}
|
||||
|
||||
# Allow privileged containers if needed
|
||||
allow_privileged = false
|
||||
|
||||
# Garbage collection settings
|
||||
gc {
|
||||
image = true
|
||||
# NOTE: For local dev we often rely on locally built images like `atropos-sandbox:local`.
|
||||
# A short image GC delay can delete these between runs, causing confusing "Failed to pull"
|
||||
# crash loops. Keep this comfortably long; tighten it for CI/production if needed.
|
||||
image_delay = "24h"
|
||||
container = true
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -1,31 +0,0 @@
|
||||
# Nomad Configuration for Singularity/Apptainer Sandbox
|
||||
# Run with: nomad agent -dev -config=nomad-singularity.hcl
|
||||
#
|
||||
# This uses the raw_exec driver to run Apptainer containers.
|
||||
# Suitable for HPC environments where Docker cannot run without sudo.
|
||||
|
||||
client {
|
||||
enabled = true
|
||||
|
||||
options {
|
||||
# Enable raw_exec driver for Singularity/Apptainer
|
||||
"driver.raw_exec.enable" = "1"
|
||||
}
|
||||
}
|
||||
|
||||
# raw_exec driver plugin configuration
|
||||
plugin "raw_exec" {
|
||||
config {
|
||||
enabled = true
|
||||
}
|
||||
}
|
||||
|
||||
# Optional: If you have the nomad-driver-singularity plugin installed,
|
||||
# uncomment the following instead of using raw_exec:
|
||||
# plugin "singularity" {
|
||||
# config {
|
||||
# enabled = true
|
||||
# # Allow bind mounts
|
||||
# bind_paths = ["/tmp", "/var/tmp"]
|
||||
# }
|
||||
# }
|
||||
@@ -19,7 +19,6 @@ dependencies = [
|
||||
"rich",
|
||||
"tenacity",
|
||||
"pyyaml",
|
||||
"prompt_toolkit",
|
||||
"requests",
|
||||
"jinja2",
|
||||
"pydantic>=2.0",
|
||||
@@ -35,32 +34,17 @@ dependencies = [
|
||||
[project.optional-dependencies]
|
||||
modal = ["modal", "boto3"]
|
||||
dev = ["pytest", "pytest-asyncio"]
|
||||
# Install Atropos from source (PyPI is often stale for this internal dependency).
|
||||
atropos = [
|
||||
"atroposlib @ git+https://github.com/NousResearch/atropos.git",
|
||||
# Atropos integration runtime deps (kept optional for Hermes-only users)
|
||||
"aiohttp",
|
||||
"fastapi",
|
||||
"uvicorn",
|
||||
"pyte",
|
||||
]
|
||||
messaging = ["python-telegram-bot>=20.0", "discord.py>=2.0"]
|
||||
cron = ["croniter"]
|
||||
cli = ["simple-term-menu"]
|
||||
all = ["croniter", "python-telegram-bot>=20.0", "discord.py>=2.0", "simple-term-menu"]
|
||||
|
||||
[project.scripts]
|
||||
hermes = "hermes_cli.main:main"
|
||||
hermes-agent = "run_agent:main"
|
||||
hermes-atropos-sandbox-smoke = "atropos.envs.sandbox_terminal_smoke_env:SandboxTerminalSmokeEnv.cli"
|
||||
hermes-atropos-toolserver-smoke = "atropos.envs.toolserver_smoke_env:ToolServerSmokeEnv.cli"
|
||||
|
||||
[tool.setuptools]
|
||||
py-modules = [
|
||||
"run_agent",
|
||||
"model_tools",
|
||||
"toolsets",
|
||||
"batch_runner",
|
||||
"trajectory_compressor",
|
||||
"toolset_distributions",
|
||||
"atropos_compatible_agent",
|
||||
"local_server",
|
||||
]
|
||||
py-modules = ["run_agent", "model_tools", "toolsets", "batch_runner", "trajectory_compressor", "toolset_distributions", "cli"]
|
||||
|
||||
[tool.setuptools.packages.find]
|
||||
include = ["tools", "atropos", "atropos.*"]
|
||||
include = ["tools", "hermes_cli", "gateway", "cron"]
|
||||
|
||||
@@ -30,5 +30,15 @@ platformdirs
|
||||
# modal
|
||||
# boto3
|
||||
|
||||
# Optional: Legacy Hecate terminal backend
|
||||
# git+ssh://git@github.com/NousResearch/hecate.git
|
||||
# Optional: For cron expression parsing (cronjob scheduling)
|
||||
croniter
|
||||
|
||||
# Optional: For messaging platform integrations (gateway)
|
||||
# Telegram: pip install python-telegram-bot
|
||||
python-telegram-bot>=20.0
|
||||
|
||||
# Discord: pip install discord.py
|
||||
discord.py>=2.0
|
||||
|
||||
# WhatsApp: Requires Node.js bridge (see docs/messaging.md)
|
||||
# aiohttp # For WhatsApp bridge communication
|
||||
448
rl_cli.py
Normal file
448
rl_cli.py
Normal file
@@ -0,0 +1,448 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
RL Training CLI Runner
|
||||
|
||||
Dedicated CLI runner for RL training workflows with:
|
||||
- Extended timeouts for long-running training
|
||||
- RL-focused system prompts
|
||||
- Full toolset including RL training tools
|
||||
- Special handling for 30-minute check intervals
|
||||
|
||||
Usage:
|
||||
python rl_cli.py "Train a model on GSM8k for math reasoning"
|
||||
python rl_cli.py --interactive
|
||||
python rl_cli.py --list-environments
|
||||
|
||||
Environment Variables:
|
||||
TINKER_API_KEY: API key for Tinker service (required)
|
||||
WANDB_API_KEY: API key for WandB metrics (required)
|
||||
OPENROUTER_API_KEY: API key for OpenRouter (required for agent)
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import os
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
import fire
|
||||
import yaml
|
||||
|
||||
# Load environment variables from .env file
|
||||
from dotenv import load_dotenv
|
||||
|
||||
# Load from ~/.hermes/.env first, then local .env
|
||||
hermes_env_path = Path.home() / '.hermes' / '.env'
|
||||
local_env_path = Path(__file__).parent / '.env'
|
||||
|
||||
if hermes_env_path.exists():
|
||||
load_dotenv(dotenv_path=hermes_env_path)
|
||||
print(f"✅ Loaded environment variables from {hermes_env_path}")
|
||||
elif local_env_path.exists():
|
||||
load_dotenv(dotenv_path=local_env_path)
|
||||
print(f"✅ Loaded environment variables from {local_env_path}")
|
||||
|
||||
# Set terminal working directory to tinker-atropos submodule
|
||||
# This ensures terminal commands run in the right context for RL work
|
||||
tinker_atropos_dir = Path(__file__).parent / 'tinker-atropos'
|
||||
if tinker_atropos_dir.exists():
|
||||
os.environ['TERMINAL_CWD'] = str(tinker_atropos_dir)
|
||||
os.environ['HERMES_QUIET'] = '1' # Disable temp subdirectory creation
|
||||
print(f"📂 Terminal working directory: {tinker_atropos_dir}")
|
||||
else:
|
||||
# Fall back to hermes-agent directory if submodule not found
|
||||
os.environ['TERMINAL_CWD'] = str(Path(__file__).parent)
|
||||
os.environ['HERMES_QUIET'] = '1'
|
||||
print(f"⚠️ tinker-atropos submodule not found, using: {Path(__file__).parent}")
|
||||
|
||||
# Import agent and tools
|
||||
from run_agent import AIAgent
|
||||
from model_tools import get_tool_definitions, check_toolset_requirements
|
||||
from tools.rl_training_tool import check_rl_api_keys, get_missing_keys
|
||||
|
||||
|
||||
# ============================================================================
|
||||
# Config Loading
|
||||
# ============================================================================
|
||||
|
||||
DEFAULT_MODEL = "anthropic/claude-opus-4.5"
|
||||
DEFAULT_BASE_URL = "https://openrouter.ai/api/v1"
|
||||
|
||||
|
||||
def load_hermes_config() -> dict:
|
||||
"""
|
||||
Load configuration from ~/.hermes/config.yaml.
|
||||
|
||||
Returns:
|
||||
dict: Configuration with model, base_url, etc.
|
||||
"""
|
||||
config_path = Path.home() / '.hermes' / 'config.yaml'
|
||||
|
||||
config = {
|
||||
"model": DEFAULT_MODEL,
|
||||
"base_url": DEFAULT_BASE_URL,
|
||||
}
|
||||
|
||||
if config_path.exists():
|
||||
try:
|
||||
with open(config_path, "r") as f:
|
||||
file_config = yaml.safe_load(f) or {}
|
||||
|
||||
# Get model from config
|
||||
if "model" in file_config:
|
||||
if isinstance(file_config["model"], str):
|
||||
config["model"] = file_config["model"]
|
||||
elif isinstance(file_config["model"], dict):
|
||||
config["model"] = file_config["model"].get("default", DEFAULT_MODEL)
|
||||
|
||||
# Get base_url if specified
|
||||
if "base_url" in file_config:
|
||||
config["base_url"] = file_config["base_url"]
|
||||
|
||||
except Exception as e:
|
||||
print(f"⚠️ Warning: Failed to load config.yaml: {e}")
|
||||
|
||||
return config
|
||||
|
||||
|
||||
# ============================================================================
|
||||
# RL-Specific Configuration
|
||||
# ============================================================================
|
||||
|
||||
# Extended timeouts for long-running RL operations
|
||||
RL_MAX_ITERATIONS = 200 # Allow many more iterations for long workflows
|
||||
|
||||
# RL-focused system prompt
|
||||
RL_SYSTEM_PROMPT = """You are an automated post-training engineer specializing in reinforcement learning for language models.
|
||||
|
||||
## Your Capabilities
|
||||
|
||||
You have access to RL training tools for running reinforcement learning on models through Tinker-Atropos:
|
||||
|
||||
1. **DISCOVER**: Use `rl_list_environments` to see available RL environments
|
||||
2. **INSPECT**: Read environment files to understand how they work (verifiers, data loading, rewards)
|
||||
3. **INSPECT DATA**: Use terminal to explore HuggingFace datasets and understand their format
|
||||
4. **CREATE**: Copy existing environments as templates, modify for your needs
|
||||
5. **CONFIGURE**: Use `rl_select_environment` and `rl_edit_config` to set up training
|
||||
6. **TEST**: Always use `rl_test_inference` before full training to validate your setup
|
||||
7. **TRAIN**: Use `rl_start_training` to begin, `rl_check_status` to monitor
|
||||
8. **EVALUATE**: Use `rl_get_results` and analyze WandB metrics to assess performance
|
||||
|
||||
## Environment Files
|
||||
|
||||
Environment files are located in: `tinker-atropos/tinker_atropos/environments/`
|
||||
|
||||
Study existing environments to learn patterns. Look for:
|
||||
- `load_dataset()` calls - how data is loaded
|
||||
- `score_answer()` / `score()` - verification logic
|
||||
- `get_next_item()` - prompt formatting
|
||||
- `system_prompt` - instruction format
|
||||
- `config_init()` - default configuration
|
||||
|
||||
## Creating New Environments
|
||||
|
||||
To create a new environment:
|
||||
1. Read an existing environment file (e.g., gsm8k_tinker.py)
|
||||
2. Use terminal to explore the target dataset format
|
||||
3. Copy the environment file as a template
|
||||
4. Modify the dataset loading, prompt formatting, and verifier logic
|
||||
5. Test with `rl_test_inference` before training
|
||||
|
||||
## Important Guidelines
|
||||
|
||||
- **Always test before training**: Training runs take hours - verify everything works first
|
||||
- **Monitor metrics**: Check WandB for reward/mean and percent_correct
|
||||
- **Status check intervals**: Wait at least 30 minutes between status checks
|
||||
- **Early stopping**: Stop training early if metrics look bad or stagnant
|
||||
- **Iterate quickly**: Start with small total_steps to validate, then scale up
|
||||
|
||||
## Available Toolsets
|
||||
|
||||
You have access to:
|
||||
- **RL tools**: Environment discovery, config management, training, testing
|
||||
- **Terminal**: Run commands, inspect files, explore datasets
|
||||
- **Web**: Search for information, documentation, papers
|
||||
- **File tools**: Read and modify code files
|
||||
|
||||
When asked to train a model, follow this workflow:
|
||||
1. List available environments
|
||||
2. Select and configure the appropriate environment
|
||||
3. Test with sample prompts
|
||||
4. Start training with conservative settings
|
||||
5. Monitor progress and adjust as needed
|
||||
"""
|
||||
|
||||
# Toolsets to enable for RL workflows
|
||||
RL_TOOLSETS = ["terminal", "web", "rl"]
|
||||
|
||||
|
||||
# ============================================================================
|
||||
# Helper Functions
|
||||
# ============================================================================
|
||||
|
||||
def check_requirements():
|
||||
"""Check that all required environment variables and services are available."""
|
||||
errors = []
|
||||
|
||||
# Check API keys
|
||||
if not os.getenv("OPENROUTER_API_KEY"):
|
||||
errors.append("OPENROUTER_API_KEY not set - required for agent")
|
||||
|
||||
missing_rl_keys = get_missing_keys()
|
||||
if missing_rl_keys:
|
||||
errors.append(f"Missing RL API keys: {', '.join(missing_rl_keys)}")
|
||||
|
||||
if errors:
|
||||
print("❌ Missing requirements:")
|
||||
for error in errors:
|
||||
print(f" - {error}")
|
||||
print("\nPlease set these environment variables in your .env file or shell.")
|
||||
return False
|
||||
|
||||
return True
|
||||
|
||||
|
||||
def check_tinker_atropos():
|
||||
"""Check if tinker-atropos submodule is properly set up."""
|
||||
tinker_path = Path(__file__).parent / "tinker-atropos"
|
||||
|
||||
if not tinker_path.exists():
|
||||
return False, "tinker-atropos submodule not found. Run: git submodule update --init"
|
||||
|
||||
envs_path = tinker_path / "tinker_atropos" / "environments"
|
||||
if not envs_path.exists():
|
||||
return False, f"environments directory not found at {envs_path}"
|
||||
|
||||
env_files = list(envs_path.glob("*.py"))
|
||||
env_files = [f for f in env_files if not f.name.startswith("_")]
|
||||
|
||||
return True, {"path": str(tinker_path), "environments_count": len(env_files)}
|
||||
|
||||
|
||||
def list_environments_sync():
|
||||
"""List available environments (synchronous wrapper)."""
|
||||
from tools.rl_training_tool import rl_list_environments
|
||||
import json
|
||||
|
||||
async def _list():
|
||||
result = await rl_list_environments()
|
||||
return json.loads(result)
|
||||
|
||||
return asyncio.run(_list())
|
||||
|
||||
|
||||
# ============================================================================
|
||||
# Main CLI
|
||||
# ============================================================================
|
||||
|
||||
def main(
|
||||
task: str = None,
|
||||
model: str = None,
|
||||
api_key: str = None,
|
||||
base_url: str = None,
|
||||
max_iterations: int = RL_MAX_ITERATIONS,
|
||||
interactive: bool = False,
|
||||
list_environments: bool = False,
|
||||
check_server: bool = False,
|
||||
verbose: bool = False,
|
||||
save_trajectories: bool = True,
|
||||
):
|
||||
"""
|
||||
RL Training CLI - Dedicated runner for RL training workflows.
|
||||
|
||||
Args:
|
||||
task: The training task/goal (e.g., "Train a model on GSM8k for math")
|
||||
model: Model to use for the agent (reads from ~/.hermes/config.yaml if not provided)
|
||||
api_key: OpenRouter API key (uses OPENROUTER_API_KEY env var if not provided)
|
||||
base_url: API base URL (reads from config or defaults to OpenRouter)
|
||||
max_iterations: Maximum agent iterations (default: 200 for long workflows)
|
||||
interactive: Run in interactive mode (multiple conversations)
|
||||
list_environments: Just list available RL environments and exit
|
||||
check_server: Check if RL API server is running and exit
|
||||
verbose: Enable verbose logging
|
||||
save_trajectories: Save conversation trajectories (default: True for RL)
|
||||
|
||||
Examples:
|
||||
# Train on a specific environment
|
||||
python rl_cli.py "Train a model on GSM8k math problems"
|
||||
|
||||
# Interactive mode
|
||||
python rl_cli.py --interactive
|
||||
|
||||
# List available environments
|
||||
python rl_cli.py --list-environments
|
||||
|
||||
# Check server status
|
||||
python rl_cli.py --check-server
|
||||
"""
|
||||
# Load config from ~/.hermes/config.yaml
|
||||
config = load_hermes_config()
|
||||
|
||||
# Use config values if not explicitly provided
|
||||
if model is None:
|
||||
model = config["model"]
|
||||
if base_url is None:
|
||||
base_url = config["base_url"]
|
||||
|
||||
print("🎯 RL Training Agent")
|
||||
print("=" * 60)
|
||||
|
||||
# Handle setup check
|
||||
if check_server:
|
||||
print("\n🔍 Checking tinker-atropos setup...")
|
||||
ok, result = check_tinker_atropos()
|
||||
if ok:
|
||||
print("✅ tinker-atropos submodule found")
|
||||
print(f" Path: {result.get('path')}")
|
||||
print(f" Environments found: {result.get('environments_count', 0)}")
|
||||
|
||||
# Also check API keys
|
||||
missing = get_missing_keys()
|
||||
if missing:
|
||||
print(f"\n⚠️ Missing API keys: {', '.join(missing)}")
|
||||
print(" Add them to ~/.hermes/.env")
|
||||
else:
|
||||
print("✅ API keys configured")
|
||||
else:
|
||||
print(f"❌ tinker-atropos not set up: {result}")
|
||||
print("\nTo set up:")
|
||||
print(" git submodule update --init")
|
||||
print(" pip install -e ./tinker-atropos")
|
||||
return
|
||||
|
||||
# Handle environment listing
|
||||
if list_environments:
|
||||
print("\n📋 Available RL Environments:")
|
||||
print("-" * 40)
|
||||
try:
|
||||
data = list_environments_sync()
|
||||
if "error" in data:
|
||||
print(f"❌ Error: {data['error']}")
|
||||
return
|
||||
|
||||
envs = data.get("environments", [])
|
||||
if not envs:
|
||||
print("No environments found.")
|
||||
print("\nMake sure tinker-atropos is set up:")
|
||||
print(" git submodule update --init")
|
||||
return
|
||||
|
||||
for env in envs:
|
||||
print(f"\n 📦 {env['name']}")
|
||||
print(f" Class: {env['class_name']}")
|
||||
print(f" Path: {env['file_path']}")
|
||||
if env.get('description'):
|
||||
desc = env['description'][:100] + "..." if len(env.get('description', '')) > 100 else env.get('description', '')
|
||||
print(f" Description: {desc}")
|
||||
|
||||
print(f"\n📊 Total: {len(envs)} environments")
|
||||
print("\nUse `rl_select_environment(name)` to select an environment for training.")
|
||||
except Exception as e:
|
||||
print(f"❌ Error listing environments: {e}")
|
||||
print("\nMake sure tinker-atropos is set up:")
|
||||
print(" git submodule update --init")
|
||||
print(" pip install -e ./tinker-atropos")
|
||||
return
|
||||
|
||||
# Check requirements
|
||||
if not check_requirements():
|
||||
sys.exit(1)
|
||||
|
||||
# Set default task if none provided
|
||||
if not task and not interactive:
|
||||
print("\n⚠️ No task provided. Use --interactive for interactive mode or provide a task.")
|
||||
print("\nExamples:")
|
||||
print(' python rl_cli.py "Train a model on GSM8k math problems"')
|
||||
print(' python rl_cli.py "Create an RL environment for code generation"')
|
||||
print(' python rl_cli.py --interactive')
|
||||
return
|
||||
|
||||
# Get API key
|
||||
api_key = api_key or os.getenv("OPENROUTER_API_KEY")
|
||||
if not api_key:
|
||||
print("❌ No API key provided. Set OPENROUTER_API_KEY or pass --api-key")
|
||||
sys.exit(1)
|
||||
|
||||
print(f"\n🤖 Model: {model}")
|
||||
print(f"🔧 Max iterations: {max_iterations}")
|
||||
print(f"📁 Toolsets: {', '.join(RL_TOOLSETS)}")
|
||||
print("=" * 60)
|
||||
|
||||
# Create agent with RL configuration
|
||||
agent = AIAgent(
|
||||
base_url=base_url,
|
||||
api_key=api_key,
|
||||
model=model,
|
||||
max_iterations=max_iterations,
|
||||
enabled_toolsets=RL_TOOLSETS,
|
||||
save_trajectories=save_trajectories,
|
||||
verbose_logging=verbose,
|
||||
quiet_mode=False,
|
||||
ephemeral_system_prompt=RL_SYSTEM_PROMPT,
|
||||
)
|
||||
|
||||
if interactive:
|
||||
# Interactive mode - multiple conversations
|
||||
print("\n🔄 Interactive RL Training Mode")
|
||||
print("Type 'quit' or 'exit' to end the session.")
|
||||
print("Type 'status' to check active training runs.")
|
||||
print("-" * 40)
|
||||
|
||||
while True:
|
||||
try:
|
||||
user_input = input("\n🎯 RL Task> ").strip()
|
||||
|
||||
if not user_input:
|
||||
continue
|
||||
|
||||
if user_input.lower() in ('quit', 'exit', 'q'):
|
||||
print("\n👋 Goodbye!")
|
||||
break
|
||||
|
||||
if user_input.lower() == 'status':
|
||||
# Quick status check
|
||||
from tools.rl_training_tool import rl_list_runs
|
||||
import json
|
||||
result = asyncio.run(rl_list_runs())
|
||||
runs = json.loads(result)
|
||||
if isinstance(runs, list) and runs:
|
||||
print("\n📊 Active Runs:")
|
||||
for run in runs:
|
||||
print(f" - {run['run_id']}: {run['environment']} ({run['status']})")
|
||||
else:
|
||||
print("\nNo active runs.")
|
||||
continue
|
||||
|
||||
# Run the agent
|
||||
print("\n" + "=" * 60)
|
||||
response = agent.run_conversation(user_input)
|
||||
print("\n" + "=" * 60)
|
||||
|
||||
except KeyboardInterrupt:
|
||||
print("\n\n👋 Interrupted. Goodbye!")
|
||||
break
|
||||
except Exception as e:
|
||||
print(f"\n❌ Error: {e}")
|
||||
if verbose:
|
||||
import traceback
|
||||
traceback.print_exc()
|
||||
else:
|
||||
# Single task mode
|
||||
print(f"\n📝 Task: {task}")
|
||||
print("-" * 40)
|
||||
|
||||
try:
|
||||
response = agent.run_conversation(task)
|
||||
print("\n" + "=" * 60)
|
||||
print("✅ Task completed")
|
||||
except KeyboardInterrupt:
|
||||
print("\n\n⚠️ Interrupted by user")
|
||||
except Exception as e:
|
||||
print(f"\n❌ Error: {e}")
|
||||
if verbose:
|
||||
import traceback
|
||||
traceback.print_exc()
|
||||
sys.exit(1)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
fire.Fire(main)
|
||||
808
run_agent.py
808
run_agent.py
@@ -30,6 +30,7 @@ import threading
|
||||
import uuid
|
||||
from typing import List, Dict, Any, Optional
|
||||
from openai import OpenAI
|
||||
import fire
|
||||
from datetime import datetime
|
||||
from pathlib import Path
|
||||
|
||||
@@ -50,6 +51,410 @@ from model_tools import get_tool_definitions, handle_function_call, check_toolse
|
||||
from tools.terminal_tool import cleanup_vm
|
||||
from tools.browser_tool import cleanup_browser
|
||||
|
||||
import requests
|
||||
|
||||
# =============================================================================
|
||||
# Model Context Management
|
||||
# =============================================================================
|
||||
|
||||
# Cache for model metadata from OpenRouter
|
||||
_model_metadata_cache: Dict[str, Dict[str, Any]] = {}
|
||||
_model_metadata_cache_time: float = 0
|
||||
_MODEL_CACHE_TTL = 3600 # 1 hour cache TTL
|
||||
|
||||
# Default context lengths for common models (fallback if API fails)
|
||||
DEFAULT_CONTEXT_LENGTHS = {
|
||||
"anthropic/claude-opus-4": 200000,
|
||||
"anthropic/claude-opus-4.5": 200000,
|
||||
"anthropic/claude-sonnet-4": 200000,
|
||||
"anthropic/claude-sonnet-4-20250514": 200000,
|
||||
"anthropic/claude-haiku-4.5": 200000,
|
||||
"openai/gpt-4o": 128000,
|
||||
"openai/gpt-4-turbo": 128000,
|
||||
"openai/gpt-4o-mini": 128000,
|
||||
"google/gemini-2.0-flash": 1048576,
|
||||
"google/gemini-2.5-pro": 1048576,
|
||||
"meta-llama/llama-3.3-70b-instruct": 131072,
|
||||
"deepseek/deepseek-chat-v3": 65536,
|
||||
"qwen/qwen-2.5-72b-instruct": 32768,
|
||||
}
|
||||
|
||||
|
||||
def fetch_model_metadata(force_refresh: bool = False) -> Dict[str, Dict[str, Any]]:
|
||||
"""
|
||||
Fetch model metadata from OpenRouter's /api/v1/models endpoint.
|
||||
Results are cached for 1 hour to minimize API calls.
|
||||
|
||||
Returns:
|
||||
Dict mapping model_id to metadata (context_length, max_completion_tokens, etc.)
|
||||
"""
|
||||
global _model_metadata_cache, _model_metadata_cache_time
|
||||
|
||||
# Return cached data if fresh
|
||||
if not force_refresh and _model_metadata_cache and (time.time() - _model_metadata_cache_time) < _MODEL_CACHE_TTL:
|
||||
return _model_metadata_cache
|
||||
|
||||
try:
|
||||
response = requests.get(
|
||||
"https://openrouter.ai/api/v1/models",
|
||||
timeout=10
|
||||
)
|
||||
response.raise_for_status()
|
||||
data = response.json()
|
||||
|
||||
# Build cache mapping model_id to relevant metadata
|
||||
cache = {}
|
||||
for model in data.get("data", []):
|
||||
model_id = model.get("id", "")
|
||||
cache[model_id] = {
|
||||
"context_length": model.get("context_length", 128000),
|
||||
"max_completion_tokens": model.get("top_provider", {}).get("max_completion_tokens", 4096),
|
||||
"name": model.get("name", model_id),
|
||||
"pricing": model.get("pricing", {}),
|
||||
}
|
||||
# Also cache by canonical slug if different
|
||||
canonical = model.get("canonical_slug", "")
|
||||
if canonical and canonical != model_id:
|
||||
cache[canonical] = cache[model_id]
|
||||
|
||||
_model_metadata_cache = cache
|
||||
_model_metadata_cache_time = time.time()
|
||||
|
||||
if not os.getenv("HERMES_QUIET"):
|
||||
logging.debug(f"Fetched metadata for {len(cache)} models from OpenRouter")
|
||||
|
||||
return cache
|
||||
|
||||
except Exception as e:
|
||||
logging.warning(f"Failed to fetch model metadata from OpenRouter: {e}")
|
||||
# Return cached data even if stale, or empty dict
|
||||
return _model_metadata_cache or {}
|
||||
|
||||
|
||||
def get_model_context_length(model: str) -> int:
|
||||
"""
|
||||
Get the context length for a specific model.
|
||||
|
||||
Args:
|
||||
model: Model identifier (e.g., "anthropic/claude-sonnet-4")
|
||||
|
||||
Returns:
|
||||
Context length in tokens (defaults to 128000 if unknown)
|
||||
"""
|
||||
# Try to get from OpenRouter API
|
||||
metadata = fetch_model_metadata()
|
||||
if model in metadata:
|
||||
return metadata[model].get("context_length", 128000)
|
||||
|
||||
# Check default fallbacks (handles partial matches)
|
||||
for default_model, length in DEFAULT_CONTEXT_LENGTHS.items():
|
||||
if default_model in model or model in default_model:
|
||||
return length
|
||||
|
||||
# Conservative default
|
||||
return 128000
|
||||
|
||||
|
||||
def estimate_tokens_rough(text: str) -> int:
|
||||
"""
|
||||
Rough token estimate for pre-flight checks (before API call).
|
||||
Uses ~4 chars per token heuristic.
|
||||
|
||||
For accurate counts, use the `usage.prompt_tokens` from API responses.
|
||||
|
||||
Args:
|
||||
text: Text to estimate tokens for
|
||||
|
||||
Returns:
|
||||
Rough estimated token count
|
||||
"""
|
||||
if not text:
|
||||
return 0
|
||||
return len(text) // 4
|
||||
|
||||
|
||||
def estimate_messages_tokens_rough(messages: List[Dict[str, Any]]) -> int:
|
||||
"""
|
||||
Rough token estimate for messages (pre-flight check only).
|
||||
|
||||
For accurate counts, use the `usage.prompt_tokens` from API responses.
|
||||
|
||||
Args:
|
||||
messages: List of message dicts
|
||||
|
||||
Returns:
|
||||
Rough estimated token count
|
||||
"""
|
||||
total_chars = sum(len(str(msg)) for msg in messages)
|
||||
return total_chars // 4
|
||||
|
||||
|
||||
class ContextCompressor:
|
||||
"""
|
||||
Compresses conversation context when approaching model's context limit.
|
||||
|
||||
Uses similar logic to trajectory_compressor but operates in real-time:
|
||||
1. Protects first few turns (system, initial user, first assistant response)
|
||||
2. Protects last N turns (recent context is most relevant)
|
||||
3. Summarizes middle turns when threshold is reached
|
||||
|
||||
Token tracking uses actual counts from API responses (usage.prompt_tokens)
|
||||
rather than estimates for accuracy.
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
model: str,
|
||||
threshold_percent: float = 0.85,
|
||||
summary_model: str = "google/gemini-2.0-flash-001",
|
||||
protect_first_n: int = 3,
|
||||
protect_last_n: int = 4,
|
||||
summary_target_tokens: int = 500,
|
||||
quiet_mode: bool = False,
|
||||
):
|
||||
"""
|
||||
Initialize the context compressor.
|
||||
|
||||
Args:
|
||||
model: The main model being used (to determine context limit)
|
||||
threshold_percent: Trigger compression at this % of context (default 85%)
|
||||
summary_model: Model to use for generating summaries (cheap/fast)
|
||||
protect_first_n: Number of initial turns to always keep
|
||||
protect_last_n: Number of recent turns to always keep
|
||||
summary_target_tokens: Target token count for summaries
|
||||
quiet_mode: Suppress compression notifications
|
||||
"""
|
||||
self.model = model
|
||||
self.threshold_percent = threshold_percent
|
||||
self.summary_model = summary_model
|
||||
self.protect_first_n = protect_first_n
|
||||
self.protect_last_n = protect_last_n
|
||||
self.summary_target_tokens = summary_target_tokens
|
||||
self.quiet_mode = quiet_mode
|
||||
|
||||
self.context_length = get_model_context_length(model)
|
||||
self.threshold_tokens = int(self.context_length * threshold_percent)
|
||||
self.compression_count = 0
|
||||
|
||||
# Track actual token usage from API responses
|
||||
self.last_prompt_tokens = 0
|
||||
self.last_completion_tokens = 0
|
||||
self.last_total_tokens = 0
|
||||
|
||||
# Initialize OpenRouter client for summarization
|
||||
api_key = os.getenv("OPENROUTER_API_KEY", "")
|
||||
self.client = OpenAI(
|
||||
api_key=api_key,
|
||||
base_url="https://openrouter.ai/api/v1"
|
||||
) if api_key else None
|
||||
|
||||
def update_from_response(self, usage: Dict[str, Any]):
|
||||
"""
|
||||
Update tracked token usage from API response.
|
||||
|
||||
Args:
|
||||
usage: The usage dict from response (contains prompt_tokens, completion_tokens, total_tokens)
|
||||
"""
|
||||
self.last_prompt_tokens = usage.get("prompt_tokens", 0)
|
||||
self.last_completion_tokens = usage.get("completion_tokens", 0)
|
||||
self.last_total_tokens = usage.get("total_tokens", 0)
|
||||
|
||||
def should_compress(self, prompt_tokens: int = None) -> bool:
|
||||
"""
|
||||
Check if context exceeds the compression threshold.
|
||||
|
||||
Uses actual token count from API response for accuracy.
|
||||
|
||||
Args:
|
||||
prompt_tokens: Actual prompt tokens from last API response.
|
||||
If None, uses last tracked value.
|
||||
|
||||
Returns:
|
||||
True if compression should be triggered
|
||||
"""
|
||||
tokens = prompt_tokens if prompt_tokens is not None else self.last_prompt_tokens
|
||||
return tokens >= self.threshold_tokens
|
||||
|
||||
def should_compress_preflight(self, messages: List[Dict[str, Any]]) -> bool:
|
||||
"""
|
||||
Quick pre-flight check using rough estimate (before API call).
|
||||
|
||||
Use this to avoid making an API call that would fail due to context overflow.
|
||||
For post-response compression decisions, use should_compress() with actual tokens.
|
||||
|
||||
Args:
|
||||
messages: Current conversation messages
|
||||
|
||||
Returns:
|
||||
True if compression is likely needed
|
||||
"""
|
||||
rough_estimate = estimate_messages_tokens_rough(messages)
|
||||
return rough_estimate >= self.threshold_tokens
|
||||
|
||||
def get_status(self) -> Dict[str, Any]:
|
||||
"""
|
||||
Get current compression status for display/logging.
|
||||
|
||||
Returns:
|
||||
Dict with token usage and threshold info
|
||||
"""
|
||||
return {
|
||||
"last_prompt_tokens": self.last_prompt_tokens,
|
||||
"threshold_tokens": self.threshold_tokens,
|
||||
"context_length": self.context_length,
|
||||
"usage_percent": (self.last_prompt_tokens / self.context_length * 100) if self.context_length else 0,
|
||||
"compression_count": self.compression_count,
|
||||
}
|
||||
|
||||
def _generate_summary(self, turns_to_summarize: List[Dict[str, Any]]) -> str:
|
||||
"""
|
||||
Generate a concise summary of conversation turns using a fast model.
|
||||
|
||||
Args:
|
||||
turns_to_summarize: List of message dicts to summarize
|
||||
|
||||
Returns:
|
||||
Summary string
|
||||
"""
|
||||
if not self.client:
|
||||
# Fallback if no API key
|
||||
return "[CONTEXT SUMMARY]: Previous conversation turns have been compressed to save space. The assistant performed various actions and received responses."
|
||||
|
||||
# Format turns for summarization
|
||||
parts = []
|
||||
for i, msg in enumerate(turns_to_summarize):
|
||||
role = msg.get("role", "unknown")
|
||||
content = msg.get("content", "")
|
||||
|
||||
# Truncate very long content
|
||||
if len(content) > 2000:
|
||||
content = content[:1000] + "\n...[truncated]...\n" + content[-500:]
|
||||
|
||||
# Include tool call info if present
|
||||
tool_calls = msg.get("tool_calls", [])
|
||||
if tool_calls:
|
||||
tool_names = [tc.get("function", {}).get("name", "?") for tc in tool_calls if isinstance(tc, dict)]
|
||||
content += f"\n[Tool calls: {', '.join(tool_names)}]"
|
||||
|
||||
parts.append(f"[{role.upper()}]: {content}")
|
||||
|
||||
content_to_summarize = "\n\n".join(parts)
|
||||
|
||||
prompt = f"""Summarize these conversation turns concisely. This summary will replace these turns in the conversation history.
|
||||
|
||||
Write from a neutral perspective describing:
|
||||
1. What actions were taken (tool calls, searches, file operations)
|
||||
2. Key information or results obtained
|
||||
3. Important decisions or findings
|
||||
4. Relevant data, file names, or outputs
|
||||
|
||||
Keep factual and informative. Target ~{self.summary_target_tokens} tokens.
|
||||
|
||||
---
|
||||
TURNS TO SUMMARIZE:
|
||||
{content_to_summarize}
|
||||
---
|
||||
|
||||
Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
|
||||
|
||||
try:
|
||||
response = self.client.chat.completions.create(
|
||||
model=self.summary_model,
|
||||
messages=[{"role": "user", "content": prompt}],
|
||||
temperature=0.3,
|
||||
max_tokens=self.summary_target_tokens * 2,
|
||||
timeout=30.0,
|
||||
)
|
||||
|
||||
summary = response.choices[0].message.content.strip()
|
||||
if not summary.startswith("[CONTEXT SUMMARY]:"):
|
||||
summary = "[CONTEXT SUMMARY]: " + summary
|
||||
|
||||
return summary
|
||||
|
||||
except Exception as e:
|
||||
logging.warning(f"Failed to generate context summary: {e}")
|
||||
return "[CONTEXT SUMMARY]: Previous conversation turns have been compressed. The assistant performed tool calls and received responses."
|
||||
|
||||
def compress(self, messages: List[Dict[str, Any]], current_tokens: int = None) -> List[Dict[str, Any]]:
|
||||
"""
|
||||
Compress conversation messages by summarizing middle turns.
|
||||
|
||||
Algorithm:
|
||||
1. Keep first N turns (system prompt, initial context)
|
||||
2. Keep last N turns (recent/relevant context)
|
||||
3. Summarize everything in between
|
||||
4. Insert summary as a user message
|
||||
|
||||
Args:
|
||||
messages: Current conversation messages
|
||||
current_tokens: Actual token count from API (for logging). If None, uses estimate.
|
||||
|
||||
Returns:
|
||||
Compressed message list
|
||||
"""
|
||||
n_messages = len(messages)
|
||||
|
||||
# Not enough messages to compress
|
||||
if n_messages <= self.protect_first_n + self.protect_last_n + 1:
|
||||
if not self.quiet_mode:
|
||||
print(f"⚠️ Cannot compress: only {n_messages} messages (need > {self.protect_first_n + self.protect_last_n + 1})")
|
||||
return messages
|
||||
|
||||
# Determine compression boundaries
|
||||
compress_start = self.protect_first_n
|
||||
compress_end = n_messages - self.protect_last_n
|
||||
|
||||
# Nothing to compress
|
||||
if compress_start >= compress_end:
|
||||
return messages
|
||||
|
||||
# Extract turns to summarize
|
||||
turns_to_summarize = messages[compress_start:compress_end]
|
||||
|
||||
# Use actual token count if provided, otherwise estimate
|
||||
display_tokens = current_tokens if current_tokens else self.last_prompt_tokens or estimate_messages_tokens_rough(messages)
|
||||
|
||||
if not self.quiet_mode:
|
||||
print(f"\n📦 Context compression triggered ({display_tokens:,} tokens ≥ {self.threshold_tokens:,} threshold)")
|
||||
print(f" 📊 Model context limit: {self.context_length:,} tokens ({self.threshold_percent*100:.0f}% = {self.threshold_tokens:,})")
|
||||
print(f" 🗜️ Summarizing turns {compress_start+1}-{compress_end} ({len(turns_to_summarize)} turns)")
|
||||
|
||||
# Generate summary
|
||||
summary = self._generate_summary(turns_to_summarize)
|
||||
|
||||
# Build compressed messages
|
||||
compressed = []
|
||||
|
||||
# Keep protected head turns
|
||||
for i in range(compress_start):
|
||||
msg = messages[i].copy()
|
||||
# Add notice to system message on first compression
|
||||
if i == 0 and msg.get("role") == "system" and self.compression_count == 0:
|
||||
msg["content"] = msg.get("content", "") + "\n\n[Note: Some earlier conversation turns may be summarized to preserve context space.]"
|
||||
compressed.append(msg)
|
||||
|
||||
# Add summary as user message
|
||||
compressed.append({
|
||||
"role": "user",
|
||||
"content": summary
|
||||
})
|
||||
|
||||
# Keep protected tail turns
|
||||
for i in range(compress_end, n_messages):
|
||||
compressed.append(messages[i].copy())
|
||||
|
||||
self.compression_count += 1
|
||||
|
||||
if not self.quiet_mode:
|
||||
# Estimate new size (actual will be known after next API call)
|
||||
new_estimate = estimate_messages_tokens_rough(compressed)
|
||||
saved_estimate = display_tokens - new_estimate
|
||||
print(f" ✅ Compressed: {n_messages} → {len(compressed)} messages (~{saved_estimate:,} tokens saved)")
|
||||
print(f" 💡 Compression #{self.compression_count} complete")
|
||||
|
||||
return compressed
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Default System Prompt Components
|
||||
@@ -180,7 +585,7 @@ class AIAgent:
|
||||
base_url: str = None,
|
||||
api_key: str = None,
|
||||
model: str = "anthropic/claude-sonnet-4-20250514", # OpenRouter format
|
||||
max_iterations: int = 10,
|
||||
max_iterations: int = 60, # Default tool-calling iterations
|
||||
tool_delay: float = 1.0,
|
||||
enabled_toolsets: List[str] = None,
|
||||
disabled_toolsets: List[str] = None,
|
||||
@@ -195,6 +600,7 @@ class AIAgent:
|
||||
providers_order: List[str] = None,
|
||||
provider_sort: str = None,
|
||||
session_id: str = None,
|
||||
tool_progress_callback: callable = None,
|
||||
):
|
||||
"""
|
||||
Initialize the AI Agent.
|
||||
@@ -218,6 +624,7 @@ class AIAgent:
|
||||
providers_order (List[str]): OpenRouter providers to try in order (optional)
|
||||
provider_sort (str): Sort providers by price/throughput/latency (optional)
|
||||
session_id (str): Pre-generated session ID for logging (optional, auto-generated if not provided)
|
||||
tool_progress_callback (callable): Callback function(tool_name, args_preview) for progress notifications
|
||||
"""
|
||||
self.model = model
|
||||
self.max_iterations = max_iterations
|
||||
@@ -229,6 +636,12 @@ class AIAgent:
|
||||
self.log_prefix_chars = log_prefix_chars
|
||||
self.log_prefix = f"{log_prefix} " if log_prefix else ""
|
||||
self.base_url = base_url or "" # Store for OpenRouter detection
|
||||
self.tool_progress_callback = tool_progress_callback
|
||||
self._last_reported_tool = None # Track for "new tool" mode
|
||||
|
||||
# Interrupt mechanism for breaking out of tool loops
|
||||
self._interrupt_requested = False
|
||||
self._interrupt_message = None # Optional message that triggered interrupt
|
||||
|
||||
# Store OpenRouter provider preferences
|
||||
self.providers_allowed = providers_allowed
|
||||
@@ -363,6 +776,30 @@ class AIAgent:
|
||||
|
||||
# Track conversation messages for session logging
|
||||
self._session_messages: List[Dict[str, Any]] = []
|
||||
|
||||
# Initialize context compressor for automatic context management
|
||||
# Compresses conversation when approaching model's context limit
|
||||
# Configuration via environment variables (can be set in .env or cli-config.yaml)
|
||||
compression_threshold = float(os.getenv("CONTEXT_COMPRESSION_THRESHOLD", "0.85"))
|
||||
compression_model = os.getenv("CONTEXT_COMPRESSION_MODEL", "google/gemini-2.0-flash-001")
|
||||
compression_enabled = os.getenv("CONTEXT_COMPRESSION_ENABLED", "true").lower() in ("true", "1", "yes")
|
||||
|
||||
self.context_compressor = ContextCompressor(
|
||||
model=self.model,
|
||||
threshold_percent=compression_threshold,
|
||||
summary_model=compression_model,
|
||||
protect_first_n=3, # Keep system, first user, first assistant
|
||||
protect_last_n=4, # Keep recent context
|
||||
summary_target_tokens=500,
|
||||
quiet_mode=self.quiet_mode,
|
||||
)
|
||||
self.compression_enabled = compression_enabled
|
||||
|
||||
if not self.quiet_mode:
|
||||
if compression_enabled:
|
||||
print(f"📊 Context limit: {self.context_compressor.context_length:,} tokens (compress at {int(compression_threshold*100)}% = {self.context_compressor.threshold_tokens:,})")
|
||||
else:
|
||||
print(f"📊 Context limit: {self.context_compressor.context_length:,} tokens (auto-compression disabled)")
|
||||
|
||||
# Pools of kawaii faces for random selection
|
||||
KAWAII_SEARCH = [
|
||||
@@ -551,6 +988,49 @@ class AIAgent:
|
||||
# Check if there's any non-whitespace content remaining
|
||||
return bool(cleaned.strip())
|
||||
|
||||
def _extract_reasoning(self, assistant_message) -> Optional[str]:
|
||||
"""
|
||||
Extract reasoning/thinking content from an assistant message.
|
||||
|
||||
OpenRouter and various providers can return reasoning in multiple formats:
|
||||
1. message.reasoning - Direct reasoning field (DeepSeek, Qwen, etc.)
|
||||
2. message.reasoning_content - Alternative field (Moonshot AI, Novita, etc.)
|
||||
3. message.reasoning_details - Array of {type, summary, ...} objects (OpenRouter unified)
|
||||
|
||||
Args:
|
||||
assistant_message: The assistant message object from the API response
|
||||
|
||||
Returns:
|
||||
Combined reasoning text, or None if no reasoning found
|
||||
"""
|
||||
reasoning_parts = []
|
||||
|
||||
# Check direct reasoning field
|
||||
if hasattr(assistant_message, 'reasoning') and assistant_message.reasoning:
|
||||
reasoning_parts.append(assistant_message.reasoning)
|
||||
|
||||
# Check reasoning_content field (alternative name used by some providers)
|
||||
if hasattr(assistant_message, 'reasoning_content') and assistant_message.reasoning_content:
|
||||
# Don't duplicate if same as reasoning
|
||||
if assistant_message.reasoning_content not in reasoning_parts:
|
||||
reasoning_parts.append(assistant_message.reasoning_content)
|
||||
|
||||
# Check reasoning_details array (OpenRouter unified format)
|
||||
# Format: [{"type": "reasoning.summary", "summary": "...", ...}, ...]
|
||||
if hasattr(assistant_message, 'reasoning_details') and assistant_message.reasoning_details:
|
||||
for detail in assistant_message.reasoning_details:
|
||||
if isinstance(detail, dict):
|
||||
# Extract summary from reasoning detail object
|
||||
summary = detail.get('summary') or detail.get('content') or detail.get('text')
|
||||
if summary and summary not in reasoning_parts:
|
||||
reasoning_parts.append(summary)
|
||||
|
||||
# Combine all reasoning parts
|
||||
if reasoning_parts:
|
||||
return "\n\n".join(reasoning_parts)
|
||||
|
||||
return None
|
||||
|
||||
def _get_messages_up_to_last_assistant(self, messages: List[Dict]) -> List[Dict]:
|
||||
"""
|
||||
Get messages up to (but not including) the last assistant turn.
|
||||
@@ -796,9 +1276,16 @@ class AIAgent:
|
||||
return
|
||||
|
||||
try:
|
||||
# Convert to trajectory format (reuse existing method)
|
||||
# Use empty string as user_query since it's embedded in messages
|
||||
trajectory = self._convert_to_trajectory_format(messages, "", True)
|
||||
# Extract the first user message for the trajectory format
|
||||
# The first message should be the user's initial query
|
||||
first_user_query = ""
|
||||
for msg in messages:
|
||||
if msg.get("role") == "user":
|
||||
first_user_query = msg.get("content", "")
|
||||
break
|
||||
|
||||
# Convert to trajectory format
|
||||
trajectory = self._convert_to_trajectory_format(messages, first_user_query, True)
|
||||
|
||||
# Build the session log entry
|
||||
entry = {
|
||||
@@ -819,6 +1306,42 @@ class AIAgent:
|
||||
if self.verbose_logging:
|
||||
logging.warning(f"Failed to save session log: {e}")
|
||||
|
||||
def interrupt(self, message: str = None) -> None:
|
||||
"""
|
||||
Request the agent to interrupt its current tool-calling loop.
|
||||
|
||||
Call this from another thread (e.g., input handler, message receiver)
|
||||
to gracefully stop the agent and process a new message.
|
||||
|
||||
Args:
|
||||
message: Optional new message that triggered the interrupt.
|
||||
If provided, the agent will include this in its response context.
|
||||
|
||||
Example (CLI):
|
||||
# In a separate input thread:
|
||||
if user_typed_something:
|
||||
agent.interrupt(user_input)
|
||||
|
||||
Example (Messaging):
|
||||
# When new message arrives for active session:
|
||||
if session_has_running_agent:
|
||||
running_agent.interrupt(new_message.text)
|
||||
"""
|
||||
self._interrupt_requested = True
|
||||
self._interrupt_message = message
|
||||
if not self.quiet_mode:
|
||||
print(f"\n⚡ Interrupt requested" + (f": '{message[:40]}...'" if message and len(message) > 40 else f": '{message}'" if message else ""))
|
||||
|
||||
def clear_interrupt(self) -> None:
|
||||
"""Clear any pending interrupt request."""
|
||||
self._interrupt_requested = False
|
||||
self._interrupt_message = None
|
||||
|
||||
@property
|
||||
def is_interrupted(self) -> bool:
|
||||
"""Check if an interrupt has been requested."""
|
||||
return self._interrupt_requested
|
||||
|
||||
def run_conversation(
|
||||
self,
|
||||
user_message: str,
|
||||
@@ -876,8 +1399,19 @@ class AIAgent:
|
||||
# Main conversation loop
|
||||
api_call_count = 0
|
||||
final_response = None
|
||||
interrupted = False
|
||||
|
||||
# Clear any stale interrupt state at start
|
||||
self.clear_interrupt()
|
||||
|
||||
while api_call_count < self.max_iterations:
|
||||
# Check for interrupt request (e.g., user sent new message)
|
||||
if self._interrupt_requested:
|
||||
interrupted = True
|
||||
if not self.quiet_mode:
|
||||
print(f"\n⚡ Breaking out of tool loop due to interrupt...")
|
||||
break
|
||||
|
||||
api_call_count += 1
|
||||
|
||||
# Prepare messages for API call
|
||||
@@ -889,37 +1423,25 @@ class AIAgent:
|
||||
for msg in messages:
|
||||
api_msg = msg.copy()
|
||||
|
||||
# For assistant messages with tool_calls, providers require 'reasoning_content' field
|
||||
# Extract reasoning from our stored 'reasoning' field and add it as 'reasoning_content'
|
||||
if msg.get("role") == "assistant" and msg.get("tool_calls"):
|
||||
# For ALL assistant messages, pass reasoning back to the API
|
||||
# This ensures multi-turn reasoning context is preserved
|
||||
if msg.get("role") == "assistant":
|
||||
reasoning_text = msg.get("reasoning")
|
||||
if reasoning_text:
|
||||
# Add reasoning_content for API compatibility (Moonshot AI, Novita, etc.)
|
||||
# Add reasoning_content for API compatibility (Moonshot AI, Novita, OpenRouter)
|
||||
api_msg["reasoning_content"] = reasoning_text
|
||||
|
||||
# Remove 'reasoning' field - it's for trajectory storage only
|
||||
# The reasoning is already in the content via <think> tags AND
|
||||
# we've added reasoning_content for API compatibility above
|
||||
# We've copied it to 'reasoning_content' for the API above
|
||||
if "reasoning" in api_msg:
|
||||
api_msg.pop("reasoning")
|
||||
# Remove 'reasoning_details' if present - we use reasoning_content instead
|
||||
if "reasoning_details" in api_msg:
|
||||
api_msg.pop("reasoning_details")
|
||||
# Keep 'reasoning_details' - OpenRouter uses this for multi-turn reasoning context
|
||||
# The signature field helps maintain reasoning continuity
|
||||
api_messages.append(api_msg)
|
||||
|
||||
if active_system_prompt:
|
||||
# Insert system message at the beginning
|
||||
api_messages = [{"role": "system", "content": active_system_prompt}] + api_messages
|
||||
|
||||
if os.getenv("HERMES_DEBUG_OPENAI_REQUEST") == "1":
|
||||
meta = {
|
||||
"model": self.model,
|
||||
"base_url": self.base_url,
|
||||
"messages": api_messages,
|
||||
"tools": self.tools if self.tools else None,
|
||||
}
|
||||
print("\n=== HERMES_DEBUG_OPENAI_REQUEST ===", flush=True)
|
||||
print(json.dumps(meta, ensure_ascii=False, indent=2)[:200_000], flush=True)
|
||||
|
||||
# Calculate approximate request size for logging
|
||||
total_chars = sum(len(str(msg)) for msg in api_messages)
|
||||
@@ -933,13 +1455,12 @@ class AIAgent:
|
||||
print(f"{self.log_prefix} 📊 Request size: {len(api_messages)} messages, ~{approx_tokens:,} tokens (~{total_chars:,} chars)")
|
||||
print(f"{self.log_prefix} 🔧 Available tools: {len(self.tools) if self.tools else 0}")
|
||||
else:
|
||||
# Animated thinking spinner in quiet mode (disable for wrappers/non-TTY usage)
|
||||
if os.getenv("HERMES_DISABLE_SPINNER") != "1":
|
||||
face = random.choice(KawaiiSpinner.KAWAII_THINKING)
|
||||
verb = random.choice(KawaiiSpinner.THINKING_VERBS)
|
||||
spinner_type = random.choice(['brain', 'sparkle', 'pulse', 'moon', 'star'])
|
||||
thinking_spinner = KawaiiSpinner(f"{face} {verb}...", spinner_type=spinner_type)
|
||||
thinking_spinner.start()
|
||||
# Animated thinking spinner in quiet mode
|
||||
face = random.choice(KawaiiSpinner.KAWAII_THINKING)
|
||||
verb = random.choice(KawaiiSpinner.THINKING_VERBS)
|
||||
spinner_type = random.choice(['brain', 'sparkle', 'pulse', 'moon', 'star'])
|
||||
thinking_spinner = KawaiiSpinner(f"{face} {verb}...", spinner_type=spinner_type)
|
||||
thinking_spinner.start()
|
||||
|
||||
# Log request details if verbose
|
||||
if self.verbose_logging:
|
||||
@@ -990,14 +1511,6 @@ class AIAgent:
|
||||
api_kwargs["extra_body"] = extra_body
|
||||
|
||||
response = self.client.chat.completions.create(**api_kwargs)
|
||||
|
||||
if os.getenv("HERMES_DEBUG_OPENAI_RESPONSE") == "1":
|
||||
try:
|
||||
dumped = response.model_dump()
|
||||
except Exception:
|
||||
dumped = getattr(response, "__dict__", {"repr": repr(response)})
|
||||
print("\n=== HERMES_DEBUG_OPENAI_RESPONSE: ChatCompletion (raw) ===", flush=True)
|
||||
print(json.dumps(dumped, ensure_ascii=False, indent=2), flush=True)
|
||||
|
||||
api_duration = time.time() - api_start_time
|
||||
|
||||
@@ -1123,6 +1636,18 @@ class AIAgent:
|
||||
"error": "First response truncated due to output length limit"
|
||||
}
|
||||
|
||||
# Track actual token usage from response for context management
|
||||
if hasattr(response, 'usage') and response.usage:
|
||||
usage_dict = {
|
||||
"prompt_tokens": getattr(response.usage, 'prompt_tokens', 0),
|
||||
"completion_tokens": getattr(response.usage, 'completion_tokens', 0),
|
||||
"total_tokens": getattr(response.usage, 'total_tokens', 0),
|
||||
}
|
||||
self.context_compressor.update_from_response(usage_dict)
|
||||
|
||||
if self.verbose_logging:
|
||||
logging.debug(f"Token usage: prompt={usage_dict['prompt_tokens']:,}, completion={usage_dict['completion_tokens']:,}, total={usage_dict['total_tokens']:,}")
|
||||
|
||||
break # Success, exit retry loop
|
||||
|
||||
except Exception as api_error:
|
||||
@@ -1150,17 +1675,28 @@ class AIAgent:
|
||||
])
|
||||
|
||||
if is_context_length_error:
|
||||
print(f"{self.log_prefix}❌ Context length exceeded - this error cannot be resolved by retrying.")
|
||||
print(f"{self.log_prefix} 💡 The conversation has accumulated too much content from tool responses.")
|
||||
logging.error(f"{self.log_prefix}Context length exceeded: {approx_tokens:,} tokens. Cannot continue.")
|
||||
# Return a partial result instead of crashing
|
||||
return {
|
||||
"messages": messages,
|
||||
"completed": False,
|
||||
"api_calls": api_call_count,
|
||||
"error": f"Context length exceeded ({approx_tokens:,} tokens). Conversation terminated early.",
|
||||
"partial": True
|
||||
}
|
||||
print(f"{self.log_prefix}⚠️ Context length exceeded - attempting compression...")
|
||||
|
||||
# Try to compress and retry
|
||||
original_len = len(messages)
|
||||
messages = self.context_compressor.compress(messages, current_tokens=approx_tokens)
|
||||
|
||||
if len(messages) < original_len:
|
||||
# Compression was possible, retry
|
||||
print(f"{self.log_prefix} 🗜️ Compressed {original_len} → {len(messages)} messages, retrying...")
|
||||
continue # Retry with compressed messages
|
||||
else:
|
||||
# Can't compress further
|
||||
print(f"{self.log_prefix}❌ Context length exceeded and cannot compress further.")
|
||||
print(f"{self.log_prefix} 💡 The conversation has accumulated too much content.")
|
||||
logging.error(f"{self.log_prefix}Context length exceeded: {approx_tokens:,} tokens. Cannot compress further.")
|
||||
return {
|
||||
"messages": messages,
|
||||
"completed": False,
|
||||
"api_calls": api_call_count,
|
||||
"error": f"Context length exceeded ({approx_tokens:,} tokens). Cannot compress further.",
|
||||
"partial": True
|
||||
}
|
||||
|
||||
if retry_count > max_retries:
|
||||
print(f"{self.log_prefix}❌ Max retries ({max_retries}) exceeded. Giving up.")
|
||||
@@ -1228,10 +1764,16 @@ class AIAgent:
|
||||
self._invalid_tool_retries = 0
|
||||
|
||||
# Validate tool call arguments are valid JSON
|
||||
# Handle empty strings as empty objects (common model quirk)
|
||||
invalid_json_args = []
|
||||
for tc in assistant_message.tool_calls:
|
||||
args = tc.function.arguments
|
||||
# Treat empty/whitespace strings as empty object
|
||||
if not args or not args.strip():
|
||||
tc.function.arguments = "{}"
|
||||
continue
|
||||
try:
|
||||
json.loads(tc.function.arguments)
|
||||
json.loads(args)
|
||||
except json.JSONDecodeError as e:
|
||||
invalid_json_args.append((tc.function.name, str(e)))
|
||||
|
||||
@@ -1247,28 +1789,34 @@ class AIAgent:
|
||||
# Don't add anything to messages, just retry the API call
|
||||
continue
|
||||
else:
|
||||
print(f"{self.log_prefix}❌ Max retries (3) for invalid JSON arguments exceeded. Stopping as partial.")
|
||||
self._invalid_json_retries = 0 # Reset for next conversation
|
||||
return {
|
||||
"final_response": None,
|
||||
"messages": messages, # Messages up to last valid point
|
||||
"api_calls": api_call_count,
|
||||
"completed": False,
|
||||
"partial": True,
|
||||
"error": f"Model generated invalid JSON arguments for tool '{tool_name}': {error_msg}"
|
||||
}
|
||||
# Instead of returning partial, inject a helpful message and let model recover
|
||||
print(f"{self.log_prefix}⚠️ Injecting recovery message for invalid JSON...")
|
||||
self._invalid_json_retries = 0 # Reset for next attempt
|
||||
|
||||
# Add a user message explaining the issue
|
||||
recovery_msg = (
|
||||
f"Your tool call to '{tool_name}' had invalid JSON arguments. "
|
||||
f"Error: {error_msg}. "
|
||||
f"For tools with no required parameters, use an empty object: {{}}. "
|
||||
f"Please either retry the tool call with valid JSON, or respond without using that tool."
|
||||
)
|
||||
messages.append({"role": "user", "content": recovery_msg})
|
||||
# Continue the loop - model will see this message and can recover
|
||||
continue
|
||||
|
||||
# Reset retry counter on successful JSON validation
|
||||
self._invalid_json_retries = 0
|
||||
|
||||
# Extract reasoning from response if available (for reasoning models like minimax, kimi, etc.)
|
||||
# Extract reasoning from response for storage
|
||||
# The reasoning_content field will be added when preparing API messages
|
||||
reasoning_text = None
|
||||
if hasattr(assistant_message, 'reasoning') and assistant_message.reasoning:
|
||||
reasoning_text = assistant_message.reasoning
|
||||
elif hasattr(assistant_message, 'reasoning_content') and assistant_message.reasoning_content:
|
||||
reasoning_text = assistant_message.reasoning_content
|
||||
# Extract reasoning from response if available
|
||||
# OpenRouter can return reasoning in multiple formats:
|
||||
# 1. message.reasoning - direct reasoning field
|
||||
# 2. message.reasoning_content - alternative field (some providers)
|
||||
# 3. message.reasoning_details - array with {summary: "..."} objects
|
||||
reasoning_text = self._extract_reasoning(assistant_message)
|
||||
|
||||
if reasoning_text and self.verbose_logging:
|
||||
preview = reasoning_text[:100] + "..." if len(reasoning_text) > 100 else reasoning_text
|
||||
logging.debug(f"Captured reasoning ({len(reasoning_text)} chars): {preview}")
|
||||
|
||||
# Build assistant message with tool calls
|
||||
# Content stays as-is; reasoning is stored separately and will be passed
|
||||
@@ -1290,6 +1838,14 @@ class AIAgent:
|
||||
]
|
||||
}
|
||||
|
||||
# Store reasoning_details for multi-turn reasoning context (OpenRouter)
|
||||
if hasattr(assistant_message, 'reasoning_details') and assistant_message.reasoning_details:
|
||||
assistant_msg["reasoning_details"] = [
|
||||
{"type": d.get("type"), "text": d.get("text"), "signature": d.get("signature")}
|
||||
for d in assistant_message.reasoning_details
|
||||
if isinstance(d, dict)
|
||||
]
|
||||
|
||||
messages.append(assistant_msg)
|
||||
|
||||
# Execute each tool call
|
||||
@@ -1309,11 +1865,24 @@ class AIAgent:
|
||||
args_str = json.dumps(function_args, ensure_ascii=False)
|
||||
args_preview = args_str[:self.log_prefix_chars] + "..." if len(args_str) > self.log_prefix_chars else args_str
|
||||
print(f" 📞 Tool {i}: {function_name}({list(function_args.keys())}) - {args_preview}")
|
||||
|
||||
# Fire progress callback if registered (for messaging platforms)
|
||||
if self.tool_progress_callback:
|
||||
try:
|
||||
# Build preview for terminal commands
|
||||
if function_name == "terminal":
|
||||
cmd = function_args.get("command", "")
|
||||
preview = cmd[:50] + "..." if len(cmd) > 50 else cmd
|
||||
else:
|
||||
preview = None
|
||||
self.tool_progress_callback(function_name, preview)
|
||||
except Exception as cb_err:
|
||||
logging.debug(f"Tool progress callback error: {cb_err}")
|
||||
|
||||
tool_start_time = time.time()
|
||||
|
||||
# Execute the tool - with animated spinner in quiet mode
|
||||
if self.quiet_mode and os.getenv("HERMES_DISABLE_SPINNER") != "1":
|
||||
if self.quiet_mode:
|
||||
# Tool-specific spinner animations
|
||||
tool_spinners = {
|
||||
'web_search': ('arrows', ['🔍', '🌐', '📡', '🔎']),
|
||||
@@ -1343,9 +1912,6 @@ class AIAgent:
|
||||
tool_duration = time.time() - tool_start_time
|
||||
cute_msg = self._get_cute_tool_message(function_name, function_args, tool_duration)
|
||||
spinner.stop(cute_msg)
|
||||
elif self.quiet_mode:
|
||||
function_result = handle_function_call(function_name, function_args, effective_task_id)
|
||||
tool_duration = time.time() - tool_start_time
|
||||
else:
|
||||
function_result = handle_function_call(function_name, function_args, effective_task_id)
|
||||
tool_duration = time.time() - tool_start_time
|
||||
@@ -1372,6 +1938,18 @@ class AIAgent:
|
||||
if self.tool_delay > 0 and i < len(assistant_message.tool_calls):
|
||||
time.sleep(self.tool_delay)
|
||||
|
||||
# Check if context compression is needed before next API call
|
||||
# Uses actual token count from last API response
|
||||
if self.compression_enabled and self.context_compressor.should_compress():
|
||||
messages = self.context_compressor.compress(
|
||||
messages,
|
||||
current_tokens=self.context_compressor.last_prompt_tokens
|
||||
)
|
||||
|
||||
# Save session log incrementally (so progress is visible even if interrupted)
|
||||
self._session_messages = messages
|
||||
self._save_session_log(messages)
|
||||
|
||||
# Continue loop for next response
|
||||
continue
|
||||
|
||||
@@ -1427,11 +2005,11 @@ class AIAgent:
|
||||
self._empty_content_retries = 0
|
||||
|
||||
# Extract reasoning from response if available
|
||||
reasoning_text = None
|
||||
if hasattr(assistant_message, 'reasoning') and assistant_message.reasoning:
|
||||
reasoning_text = assistant_message.reasoning
|
||||
elif hasattr(assistant_message, 'reasoning_content') and assistant_message.reasoning_content:
|
||||
reasoning_text = assistant_message.reasoning_content
|
||||
reasoning_text = self._extract_reasoning(assistant_message)
|
||||
|
||||
if reasoning_text and self.verbose_logging:
|
||||
preview = reasoning_text[:100] + "..." if len(reasoning_text) > 100 else reasoning_text
|
||||
logging.debug(f"Captured final reasoning ({len(reasoning_text)} chars): {preview}")
|
||||
|
||||
# Build final assistant message
|
||||
# Content stays as-is; reasoning stored separately for trajectory extraction
|
||||
@@ -1441,6 +2019,14 @@ class AIAgent:
|
||||
"reasoning": reasoning_text # Stored for trajectory extraction
|
||||
}
|
||||
|
||||
# Store reasoning_details for multi-turn reasoning context (OpenRouter)
|
||||
if hasattr(assistant_message, 'reasoning_details') and assistant_message.reasoning_details:
|
||||
final_msg["reasoning_details"] = [
|
||||
{"type": d.get("type"), "text": d.get("text"), "signature": d.get("signature")}
|
||||
for d in assistant_message.reasoning_details
|
||||
if isinstance(d, dict)
|
||||
]
|
||||
|
||||
messages.append(final_msg)
|
||||
|
||||
if not self.quiet_mode:
|
||||
@@ -1465,11 +2051,47 @@ class AIAgent:
|
||||
final_response = f"I apologize, but I encountered repeated errors: {error_msg}"
|
||||
break
|
||||
|
||||
# Handle max iterations reached
|
||||
if api_call_count >= self.max_iterations:
|
||||
print(f"⚠️ Reached maximum iterations ({self.max_iterations}). Stopping to prevent infinite loop.")
|
||||
if final_response is None:
|
||||
final_response = "I've reached the maximum number of iterations. Here's what I found so far."
|
||||
# Handle max iterations reached - ask model to summarize what it found
|
||||
if api_call_count >= self.max_iterations and final_response is None:
|
||||
print(f"⚠️ Reached maximum iterations ({self.max_iterations}). Requesting summary...")
|
||||
|
||||
# Inject a user message asking for a summary
|
||||
summary_request = (
|
||||
"You've reached the maximum number of tool-calling iterations allowed. "
|
||||
"Please provide a final response summarizing what you've found and accomplished so far, "
|
||||
"without calling any more tools."
|
||||
)
|
||||
messages.append({"role": "user", "content": summary_request})
|
||||
|
||||
# Make one final API call WITHOUT tools to force a text response
|
||||
try:
|
||||
api_messages = messages.copy()
|
||||
if self.ephemeral_system_prompt:
|
||||
api_messages = [{"role": "system", "content": self.ephemeral_system_prompt}] + api_messages
|
||||
|
||||
summary_response = self.client.chat.completions.create(
|
||||
model=self.model,
|
||||
messages=api_messages,
|
||||
# No tools parameter - forces text response
|
||||
extra_headers=self.extra_headers,
|
||||
extra_body=self.extra_body,
|
||||
)
|
||||
|
||||
if summary_response.choices and summary_response.choices[0].message.content:
|
||||
final_response = summary_response.choices[0].message.content
|
||||
# Strip think blocks from final response
|
||||
if "<think>" in final_response:
|
||||
import re
|
||||
final_response = re.sub(r'<think>.*?</think>\s*', '', final_response, flags=re.DOTALL).strip()
|
||||
|
||||
# Add to messages for session continuity
|
||||
messages.append({"role": "assistant", "content": final_response})
|
||||
else:
|
||||
final_response = "I reached the iteration limit and couldn't generate a summary."
|
||||
|
||||
except Exception as e:
|
||||
logging.warning(f"Failed to get summary response: {e}")
|
||||
final_response = f"I reached the maximum iterations ({self.max_iterations}) but couldn't summarize. Error: {str(e)}"
|
||||
|
||||
# Determine if conversation completed successfully
|
||||
completed = final_response is not None and api_call_count < self.max_iterations
|
||||
@@ -1494,13 +2116,24 @@ class AIAgent:
|
||||
self._session_messages = messages
|
||||
self._save_session_log(messages)
|
||||
|
||||
return {
|
||||
# Build result with interrupt info if applicable
|
||||
result = {
|
||||
"final_response": final_response,
|
||||
"messages": messages,
|
||||
"api_calls": api_call_count,
|
||||
"completed": completed,
|
||||
"partial": False # True only when stopped due to invalid tool calls
|
||||
"partial": False, # True only when stopped due to invalid tool calls
|
||||
"interrupted": interrupted,
|
||||
}
|
||||
|
||||
# Include interrupt message if one triggered the interrupt
|
||||
if interrupted and self._interrupt_message:
|
||||
result["interrupt_message"] = self._interrupt_message
|
||||
|
||||
# Clear interrupt state after handling
|
||||
self.clear_interrupt()
|
||||
|
||||
return result
|
||||
|
||||
def chat(self, message: str) -> str:
|
||||
"""
|
||||
@@ -1732,11 +2365,4 @@ def main(
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
try:
|
||||
import fire # type: ignore
|
||||
except ModuleNotFoundError as exc:
|
||||
raise SystemExit(
|
||||
"Missing optional dependency 'fire'. Install hermes-agent with its CLI extras or add `fire` "
|
||||
f"to your environment. Original error: {exc}"
|
||||
) from exc
|
||||
fire.Fire(main)
|
||||
|
||||
414
scripts/hermes-gateway
Executable file
414
scripts/hermes-gateway
Executable file
@@ -0,0 +1,414 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Hermes Gateway - Standalone messaging platform integration.
|
||||
|
||||
This is the proper entry point for running the gateway as a service.
|
||||
NOT tied to the CLI - runs independently.
|
||||
|
||||
Usage:
|
||||
# Run in foreground (for testing)
|
||||
./scripts/hermes-gateway
|
||||
|
||||
# Install as systemd service
|
||||
./scripts/hermes-gateway install
|
||||
|
||||
# Manage the service
|
||||
./scripts/hermes-gateway start
|
||||
./scripts/hermes-gateway stop
|
||||
./scripts/hermes-gateway restart
|
||||
./scripts/hermes-gateway status
|
||||
|
||||
# Uninstall
|
||||
./scripts/hermes-gateway uninstall
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import asyncio
|
||||
import os
|
||||
import subprocess
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
# Add parent directory to path
|
||||
SCRIPT_DIR = Path(__file__).parent.resolve()
|
||||
PROJECT_DIR = SCRIPT_DIR.parent
|
||||
sys.path.insert(0, str(PROJECT_DIR))
|
||||
|
||||
# Load .env file
|
||||
from dotenv import load_dotenv
|
||||
env_path = PROJECT_DIR / '.env'
|
||||
if env_path.exists():
|
||||
load_dotenv(dotenv_path=env_path)
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Service Configuration
|
||||
# =============================================================================
|
||||
|
||||
SERVICE_NAME = "hermes-gateway"
|
||||
SERVICE_DESCRIPTION = "Hermes Agent Gateway - Messaging Platform Integration"
|
||||
|
||||
def get_systemd_unit_path() -> Path:
|
||||
"""Get the path for the systemd user service file."""
|
||||
return Path.home() / ".config" / "systemd" / "user" / f"{SERVICE_NAME}.service"
|
||||
|
||||
def get_launchd_plist_path() -> Path:
|
||||
"""Get the path for the launchd plist file (macOS)."""
|
||||
return Path.home() / "Library" / "LaunchAgents" / f"ai.hermes.gateway.plist"
|
||||
|
||||
def get_python_path() -> str:
|
||||
"""Get the path to the Python interpreter."""
|
||||
# Prefer the venv if it exists
|
||||
venv_python = PROJECT_DIR / "venv" / "bin" / "python"
|
||||
if venv_python.exists():
|
||||
return str(venv_python)
|
||||
return sys.executable
|
||||
|
||||
def get_gateway_script_path() -> str:
|
||||
"""Get the path to this script."""
|
||||
return str(Path(__file__).resolve())
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Systemd Service (Linux)
|
||||
# =============================================================================
|
||||
|
||||
def generate_systemd_unit() -> str:
|
||||
"""Generate the systemd unit file content."""
|
||||
python_path = get_python_path()
|
||||
script_path = get_gateway_script_path()
|
||||
working_dir = str(PROJECT_DIR)
|
||||
|
||||
return f"""[Unit]
|
||||
Description={SERVICE_DESCRIPTION}
|
||||
After=network.target
|
||||
|
||||
[Service]
|
||||
Type=simple
|
||||
ExecStart={python_path} {script_path} run
|
||||
WorkingDirectory={working_dir}
|
||||
Restart=on-failure
|
||||
RestartSec=10
|
||||
StandardOutput=journal
|
||||
StandardError=journal
|
||||
|
||||
# Environment (optional - can also use .env file)
|
||||
# Environment="TELEGRAM_BOT_TOKEN=your_token"
|
||||
# Environment="DISCORD_BOT_TOKEN=your_token"
|
||||
|
||||
[Install]
|
||||
WantedBy=default.target
|
||||
"""
|
||||
|
||||
def install_systemd():
|
||||
"""Install the systemd user service."""
|
||||
unit_path = get_systemd_unit_path()
|
||||
unit_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
print(f"Installing systemd service to: {unit_path}")
|
||||
unit_path.write_text(generate_systemd_unit())
|
||||
|
||||
# Reload systemd
|
||||
subprocess.run(["systemctl", "--user", "daemon-reload"], check=True)
|
||||
|
||||
# Enable the service (start on boot)
|
||||
subprocess.run(["systemctl", "--user", "enable", SERVICE_NAME], check=True)
|
||||
|
||||
print(f"✓ Service installed and enabled")
|
||||
print(f"")
|
||||
print(f"To start the service:")
|
||||
print(f" systemctl --user start {SERVICE_NAME}")
|
||||
print(f"")
|
||||
print(f"To view logs:")
|
||||
print(f" journalctl --user -u {SERVICE_NAME} -f")
|
||||
print(f"")
|
||||
print(f"To enable lingering (keeps service running after logout):")
|
||||
print(f" sudo loginctl enable-linger $USER")
|
||||
|
||||
def uninstall_systemd():
|
||||
"""Uninstall the systemd user service."""
|
||||
unit_path = get_systemd_unit_path()
|
||||
|
||||
# Stop and disable first
|
||||
subprocess.run(["systemctl", "--user", "stop", SERVICE_NAME], check=False)
|
||||
subprocess.run(["systemctl", "--user", "disable", SERVICE_NAME], check=False)
|
||||
|
||||
# Remove the unit file
|
||||
if unit_path.exists():
|
||||
unit_path.unlink()
|
||||
print(f"✓ Removed {unit_path}")
|
||||
|
||||
# Reload systemd
|
||||
subprocess.run(["systemctl", "--user", "daemon-reload"], check=True)
|
||||
print(f"✓ Service uninstalled")
|
||||
|
||||
def systemd_status():
|
||||
"""Show systemd service status."""
|
||||
subprocess.run(["systemctl", "--user", "status", SERVICE_NAME])
|
||||
|
||||
def systemd_start():
|
||||
"""Start the systemd service."""
|
||||
subprocess.run(["systemctl", "--user", "start", SERVICE_NAME], check=True)
|
||||
print(f"✓ Service started")
|
||||
|
||||
def systemd_stop():
|
||||
"""Stop the systemd service."""
|
||||
subprocess.run(["systemctl", "--user", "stop", SERVICE_NAME], check=True)
|
||||
print(f"✓ Service stopped")
|
||||
|
||||
def systemd_restart():
|
||||
"""Restart the systemd service."""
|
||||
subprocess.run(["systemctl", "--user", "restart", SERVICE_NAME], check=True)
|
||||
print(f"✓ Service restarted")
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Launchd Service (macOS)
|
||||
# =============================================================================
|
||||
|
||||
def generate_launchd_plist() -> str:
|
||||
"""Generate the launchd plist file content."""
|
||||
python_path = get_python_path()
|
||||
script_path = get_gateway_script_path()
|
||||
working_dir = str(PROJECT_DIR)
|
||||
log_dir = Path.home() / ".hermes" / "logs"
|
||||
|
||||
return f"""<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
|
||||
<plist version="1.0">
|
||||
<dict>
|
||||
<key>Label</key>
|
||||
<string>ai.hermes.gateway</string>
|
||||
|
||||
<key>ProgramArguments</key>
|
||||
<array>
|
||||
<string>{python_path}</string>
|
||||
<string>{script_path}</string>
|
||||
<string>run</string>
|
||||
</array>
|
||||
|
||||
<key>WorkingDirectory</key>
|
||||
<string>{working_dir}</string>
|
||||
|
||||
<key>RunAtLoad</key>
|
||||
<true/>
|
||||
|
||||
<key>KeepAlive</key>
|
||||
<dict>
|
||||
<key>SuccessfulExit</key>
|
||||
<false/>
|
||||
</dict>
|
||||
|
||||
<key>StandardOutPath</key>
|
||||
<string>{log_dir}/gateway.log</string>
|
||||
|
||||
<key>StandardErrorPath</key>
|
||||
<string>{log_dir}/gateway.error.log</string>
|
||||
|
||||
<key>EnvironmentVariables</key>
|
||||
<dict>
|
||||
<key>PATH</key>
|
||||
<string>/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin</string>
|
||||
</dict>
|
||||
</dict>
|
||||
</plist>
|
||||
"""
|
||||
|
||||
def install_launchd():
|
||||
"""Install the launchd service (macOS)."""
|
||||
plist_path = get_launchd_plist_path()
|
||||
plist_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# Ensure log directory exists
|
||||
log_dir = Path.home() / ".hermes" / "logs"
|
||||
log_dir.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
print(f"Installing launchd service to: {plist_path}")
|
||||
plist_path.write_text(generate_launchd_plist())
|
||||
|
||||
# Load the service
|
||||
subprocess.run(["launchctl", "load", str(plist_path)], check=True)
|
||||
|
||||
print(f"✓ Service installed and loaded")
|
||||
print(f"")
|
||||
print(f"To view logs:")
|
||||
print(f" tail -f ~/.hermes/logs/gateway.log")
|
||||
print(f"")
|
||||
print(f"To manage the service:")
|
||||
print(f" launchctl start ai.hermes.gateway")
|
||||
print(f" launchctl stop ai.hermes.gateway")
|
||||
|
||||
def uninstall_launchd():
|
||||
"""Uninstall the launchd service (macOS)."""
|
||||
plist_path = get_launchd_plist_path()
|
||||
|
||||
# Unload first
|
||||
subprocess.run(["launchctl", "unload", str(plist_path)], check=False)
|
||||
|
||||
# Remove the plist file
|
||||
if plist_path.exists():
|
||||
plist_path.unlink()
|
||||
print(f"✓ Removed {plist_path}")
|
||||
|
||||
print(f"✓ Service uninstalled")
|
||||
|
||||
def launchd_status():
|
||||
"""Show launchd service status."""
|
||||
subprocess.run(["launchctl", "list", "ai.hermes.gateway"])
|
||||
|
||||
def launchd_start():
|
||||
"""Start the launchd service."""
|
||||
subprocess.run(["launchctl", "start", "ai.hermes.gateway"], check=True)
|
||||
print(f"✓ Service started")
|
||||
|
||||
def launchd_stop():
|
||||
"""Stop the launchd service."""
|
||||
subprocess.run(["launchctl", "stop", "ai.hermes.gateway"], check=True)
|
||||
print(f"✓ Service stopped")
|
||||
|
||||
def launchd_restart():
|
||||
"""Restart the launchd service."""
|
||||
launchd_stop()
|
||||
launchd_start()
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Platform Detection
|
||||
# =============================================================================
|
||||
|
||||
def is_linux() -> bool:
|
||||
return sys.platform.startswith('linux')
|
||||
|
||||
def is_macos() -> bool:
|
||||
return sys.platform == 'darwin'
|
||||
|
||||
def is_windows() -> bool:
|
||||
return sys.platform == 'win32'
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Gateway Runner
|
||||
# =============================================================================
|
||||
|
||||
def run_gateway():
|
||||
"""Run the gateway in foreground."""
|
||||
from gateway.run import start_gateway
|
||||
print("Starting Hermes Gateway...")
|
||||
print("Press Ctrl+C to stop.")
|
||||
print()
|
||||
asyncio.run(start_gateway())
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# Main CLI
|
||||
# =============================================================================
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser(
|
||||
description="Hermes Gateway - Messaging Platform Integration",
|
||||
formatter_class=argparse.RawDescriptionHelpFormatter,
|
||||
epilog="""
|
||||
Examples:
|
||||
# Run in foreground (for testing)
|
||||
./scripts/hermes-gateway run
|
||||
|
||||
# Install as system service
|
||||
./scripts/hermes-gateway install
|
||||
|
||||
# Manage the service
|
||||
./scripts/hermes-gateway start
|
||||
./scripts/hermes-gateway stop
|
||||
./scripts/hermes-gateway restart
|
||||
./scripts/hermes-gateway status
|
||||
|
||||
# Uninstall
|
||||
./scripts/hermes-gateway uninstall
|
||||
|
||||
Configuration:
|
||||
Set environment variables in .env file or system environment:
|
||||
- TELEGRAM_BOT_TOKEN
|
||||
- DISCORD_BOT_TOKEN
|
||||
- WHATSAPP_ENABLED
|
||||
|
||||
Or create ~/.hermes/gateway.json for advanced configuration.
|
||||
"""
|
||||
)
|
||||
|
||||
parser.add_argument(
|
||||
"command",
|
||||
choices=["run", "install", "uninstall", "start", "stop", "restart", "status"],
|
||||
nargs="?",
|
||||
default="run",
|
||||
help="Command to execute (default: run)"
|
||||
)
|
||||
|
||||
parser.add_argument(
|
||||
"--verbose", "-v",
|
||||
action="store_true",
|
||||
help="Verbose output"
|
||||
)
|
||||
|
||||
args = parser.parse_args()
|
||||
|
||||
# Detect platform and dispatch command
|
||||
if args.command == "run":
|
||||
run_gateway()
|
||||
|
||||
elif args.command == "install":
|
||||
if is_linux():
|
||||
install_systemd()
|
||||
elif is_macos():
|
||||
install_launchd()
|
||||
else:
|
||||
print("Service installation not supported on this platform.")
|
||||
print("Please run manually: ./scripts/hermes-gateway run")
|
||||
sys.exit(1)
|
||||
|
||||
elif args.command == "uninstall":
|
||||
if is_linux():
|
||||
uninstall_systemd()
|
||||
elif is_macos():
|
||||
uninstall_launchd()
|
||||
else:
|
||||
print("Service uninstallation not supported on this platform.")
|
||||
sys.exit(1)
|
||||
|
||||
elif args.command == "start":
|
||||
if is_linux():
|
||||
systemd_start()
|
||||
elif is_macos():
|
||||
launchd_start()
|
||||
else:
|
||||
print("Not supported on this platform.")
|
||||
sys.exit(1)
|
||||
|
||||
elif args.command == "stop":
|
||||
if is_linux():
|
||||
systemd_stop()
|
||||
elif is_macos():
|
||||
launchd_stop()
|
||||
else:
|
||||
print("Not supported on this platform.")
|
||||
sys.exit(1)
|
||||
|
||||
elif args.command == "restart":
|
||||
if is_linux():
|
||||
systemd_restart()
|
||||
elif is_macos():
|
||||
launchd_restart()
|
||||
else:
|
||||
print("Not supported on this platform.")
|
||||
sys.exit(1)
|
||||
|
||||
elif args.command == "status":
|
||||
if is_linux():
|
||||
systemd_status()
|
||||
elif is_macos():
|
||||
launchd_status()
|
||||
else:
|
||||
print("Not supported on this platform.")
|
||||
sys.exit(1)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
519
scripts/install.ps1
Normal file
519
scripts/install.ps1
Normal file
@@ -0,0 +1,519 @@
|
||||
# ============================================================================
|
||||
# Hermes Agent Installer for Windows
|
||||
# ============================================================================
|
||||
# Installation script for Windows (PowerShell).
|
||||
#
|
||||
# Usage:
|
||||
# irm https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.ps1 | iex
|
||||
#
|
||||
# Or download and run with options:
|
||||
# .\install.ps1 -NoVenv -SkipSetup
|
||||
#
|
||||
# ============================================================================
|
||||
|
||||
param(
|
||||
[switch]$NoVenv,
|
||||
[switch]$SkipSetup,
|
||||
[string]$Branch = "main",
|
||||
[string]$HermesHome = "$env:USERPROFILE\.hermes",
|
||||
[string]$InstallDir = "$env:USERPROFILE\.hermes\hermes-agent"
|
||||
)
|
||||
|
||||
$ErrorActionPreference = "Stop"
|
||||
|
||||
# ============================================================================
|
||||
# Configuration
|
||||
# ============================================================================
|
||||
|
||||
$RepoUrlSsh = "git@github.com:NousResearch/hermes-agent.git"
|
||||
$RepoUrlHttps = "https://github.com/NousResearch/hermes-agent.git"
|
||||
|
||||
# ============================================================================
|
||||
# Helper functions
|
||||
# ============================================================================
|
||||
|
||||
function Write-Banner {
|
||||
Write-Host ""
|
||||
Write-Host "┌─────────────────────────────────────────────────────────┐" -ForegroundColor Magenta
|
||||
Write-Host "│ 🦋 Hermes Agent Installer │" -ForegroundColor Magenta
|
||||
Write-Host "├─────────────────────────────────────────────────────────┤" -ForegroundColor Magenta
|
||||
Write-Host "│ I'm just a butterfly with a lot of tools. │" -ForegroundColor Magenta
|
||||
Write-Host "└─────────────────────────────────────────────────────────┘" -ForegroundColor Magenta
|
||||
Write-Host ""
|
||||
}
|
||||
|
||||
function Write-Info {
|
||||
param([string]$Message)
|
||||
Write-Host "→ $Message" -ForegroundColor Cyan
|
||||
}
|
||||
|
||||
function Write-Success {
|
||||
param([string]$Message)
|
||||
Write-Host "✓ $Message" -ForegroundColor Green
|
||||
}
|
||||
|
||||
function Write-Warning {
|
||||
param([string]$Message)
|
||||
Write-Host "⚠ $Message" -ForegroundColor Yellow
|
||||
}
|
||||
|
||||
function Write-Error {
|
||||
param([string]$Message)
|
||||
Write-Host "✗ $Message" -ForegroundColor Red
|
||||
}
|
||||
|
||||
# ============================================================================
|
||||
# Dependency checks
|
||||
# ============================================================================
|
||||
|
||||
function Test-Python {
|
||||
Write-Info "Checking Python..."
|
||||
|
||||
# Try different python commands
|
||||
$pythonCmds = @("python3", "python", "py -3")
|
||||
|
||||
foreach ($cmd in $pythonCmds) {
|
||||
try {
|
||||
$version = & $cmd.Split()[0] $cmd.Split()[1..99] -c "import sys; print(f'{sys.version_info.major}.{sys.version_info.minor}')" 2>$null
|
||||
if ($version) {
|
||||
$major, $minor = $version.Split('.')
|
||||
if ([int]$major -ge 3 -and [int]$minor -ge 10) {
|
||||
$script:PythonCmd = $cmd
|
||||
Write-Success "Python $version found"
|
||||
return $true
|
||||
}
|
||||
}
|
||||
} catch {
|
||||
# Try next command
|
||||
}
|
||||
}
|
||||
|
||||
Write-Error "Python 3.10+ not found"
|
||||
Write-Info "Please install Python 3.10 or newer from:"
|
||||
Write-Info " https://www.python.org/downloads/"
|
||||
Write-Info ""
|
||||
Write-Info "Make sure to check 'Add Python to PATH' during installation"
|
||||
return $false
|
||||
}
|
||||
|
||||
function Test-Git {
|
||||
Write-Info "Checking Git..."
|
||||
|
||||
if (Get-Command git -ErrorAction SilentlyContinue) {
|
||||
$version = git --version
|
||||
Write-Success "Git found ($version)"
|
||||
return $true
|
||||
}
|
||||
|
||||
Write-Error "Git not found"
|
||||
Write-Info "Please install Git from:"
|
||||
Write-Info " https://git-scm.com/download/win"
|
||||
return $false
|
||||
}
|
||||
|
||||
function Test-Node {
|
||||
Write-Info "Checking Node.js (optional, for browser tools)..."
|
||||
|
||||
if (Get-Command node -ErrorAction SilentlyContinue) {
|
||||
$version = node --version
|
||||
Write-Success "Node.js $version found"
|
||||
$script:HasNode = $true
|
||||
return $true
|
||||
}
|
||||
|
||||
Write-Warning "Node.js not found (browser tools will be limited)"
|
||||
Write-Info "To install Node.js (optional):"
|
||||
Write-Info " https://nodejs.org/en/download/"
|
||||
$script:HasNode = $false
|
||||
return $true # Don't fail - Node is optional
|
||||
}
|
||||
|
||||
function Test-Ripgrep {
|
||||
Write-Info "Checking ripgrep (optional, for faster file search)..."
|
||||
|
||||
if (Get-Command rg -ErrorAction SilentlyContinue) {
|
||||
$version = rg --version | Select-Object -First 1
|
||||
Write-Success "$version found"
|
||||
$script:HasRipgrep = $true
|
||||
return $true
|
||||
}
|
||||
|
||||
Write-Warning "ripgrep not found (file search will use findstr fallback)"
|
||||
|
||||
# Check what package managers are available
|
||||
$hasWinget = Get-Command winget -ErrorAction SilentlyContinue
|
||||
$hasChoco = Get-Command choco -ErrorAction SilentlyContinue
|
||||
$hasScoop = Get-Command scoop -ErrorAction SilentlyContinue
|
||||
|
||||
# Offer to install
|
||||
Write-Host ""
|
||||
$response = Read-Host "Would you like to install ripgrep? (faster search, recommended) [Y/n]"
|
||||
|
||||
if ($response -eq "" -or $response -match "^[Yy]") {
|
||||
Write-Info "Installing ripgrep..."
|
||||
|
||||
if ($hasWinget) {
|
||||
try {
|
||||
winget install BurntSushi.ripgrep.MSVC --silent 2>&1 | Out-Null
|
||||
if ($LASTEXITCODE -eq 0) {
|
||||
Write-Success "ripgrep installed via winget"
|
||||
$script:HasRipgrep = $true
|
||||
return $true
|
||||
}
|
||||
} catch { }
|
||||
}
|
||||
|
||||
if ($hasChoco) {
|
||||
try {
|
||||
choco install ripgrep -y 2>&1 | Out-Null
|
||||
if ($LASTEXITCODE -eq 0) {
|
||||
Write-Success "ripgrep installed via chocolatey"
|
||||
$script:HasRipgrep = $true
|
||||
return $true
|
||||
}
|
||||
} catch { }
|
||||
}
|
||||
|
||||
if ($hasScoop) {
|
||||
try {
|
||||
scoop install ripgrep 2>&1 | Out-Null
|
||||
if ($LASTEXITCODE -eq 0) {
|
||||
Write-Success "ripgrep installed via scoop"
|
||||
$script:HasRipgrep = $true
|
||||
return $true
|
||||
}
|
||||
} catch { }
|
||||
}
|
||||
|
||||
Write-Warning "Auto-install failed. You can install manually:"
|
||||
} else {
|
||||
Write-Info "Skipping ripgrep installation. To install manually:"
|
||||
}
|
||||
|
||||
# Show manual install instructions
|
||||
Write-Info " winget install BurntSushi.ripgrep.MSVC"
|
||||
Write-Info " Or: choco install ripgrep"
|
||||
Write-Info " Or: scoop install ripgrep"
|
||||
Write-Info " Or download from: https://github.com/BurntSushi/ripgrep/releases"
|
||||
|
||||
$script:HasRipgrep = $false
|
||||
return $true # Don't fail - ripgrep is optional
|
||||
}
|
||||
|
||||
# ============================================================================
|
||||
# Installation
|
||||
# ============================================================================
|
||||
|
||||
function Install-Repository {
|
||||
Write-Info "Installing to $InstallDir..."
|
||||
|
||||
if (Test-Path $InstallDir) {
|
||||
if (Test-Path "$InstallDir\.git") {
|
||||
Write-Info "Existing installation found, updating..."
|
||||
Push-Location $InstallDir
|
||||
git fetch origin
|
||||
git checkout $Branch
|
||||
git pull origin $Branch
|
||||
Pop-Location
|
||||
} else {
|
||||
Write-Error "Directory exists but is not a git repository: $InstallDir"
|
||||
Write-Info "Remove it or choose a different directory with -InstallDir"
|
||||
exit 1
|
||||
}
|
||||
} else {
|
||||
# Try SSH first (for private repo access), fall back to HTTPS
|
||||
# Use --recurse-submodules to also clone mini-swe-agent and tinker-atropos
|
||||
Write-Info "Trying SSH clone..."
|
||||
$sshResult = git clone --branch $Branch --recurse-submodules $RepoUrlSsh $InstallDir 2>&1
|
||||
|
||||
if ($LASTEXITCODE -eq 0) {
|
||||
Write-Success "Cloned via SSH"
|
||||
} else {
|
||||
Write-Info "SSH failed, trying HTTPS..."
|
||||
$httpsResult = git clone --branch $Branch --recurse-submodules $RepoUrlHttps $InstallDir 2>&1
|
||||
|
||||
if ($LASTEXITCODE -eq 0) {
|
||||
Write-Success "Cloned via HTTPS"
|
||||
} else {
|
||||
Write-Error "Failed to clone repository"
|
||||
Write-Info "For private repo access, ensure your SSH key is added to GitHub:"
|
||||
Write-Info " ssh-add ~/.ssh/id_rsa"
|
||||
Write-Info " ssh -T git@github.com # Test connection"
|
||||
exit 1
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
# Ensure submodules are initialized and updated (for existing installs or if --recurse failed)
|
||||
Write-Info "Initializing submodules (mini-swe-agent, tinker-atropos)..."
|
||||
Push-Location $InstallDir
|
||||
git submodule update --init --recursive
|
||||
Pop-Location
|
||||
Write-Success "Submodules ready"
|
||||
|
||||
Write-Success "Repository ready"
|
||||
}
|
||||
|
||||
function Install-Venv {
|
||||
if ($NoVenv) {
|
||||
Write-Info "Skipping virtual environment (-NoVenv)"
|
||||
return
|
||||
}
|
||||
|
||||
Write-Info "Creating virtual environment..."
|
||||
|
||||
Push-Location $InstallDir
|
||||
|
||||
if (-not (Test-Path "venv")) {
|
||||
& $PythonCmd -m venv venv
|
||||
}
|
||||
|
||||
# Activate
|
||||
& .\venv\Scripts\Activate.ps1
|
||||
|
||||
# Upgrade pip
|
||||
pip install --upgrade pip wheel setuptools | Out-Null
|
||||
|
||||
Pop-Location
|
||||
|
||||
Write-Success "Virtual environment ready"
|
||||
}
|
||||
|
||||
function Install-Dependencies {
|
||||
Write-Info "Installing dependencies..."
|
||||
|
||||
Push-Location $InstallDir
|
||||
|
||||
if (-not $NoVenv) {
|
||||
& .\venv\Scripts\Activate.ps1
|
||||
}
|
||||
|
||||
# Install main package
|
||||
try {
|
||||
pip install -e ".[all]" 2>&1 | Out-Null
|
||||
} catch {
|
||||
pip install -e "." | Out-Null
|
||||
}
|
||||
|
||||
Write-Success "Main package installed"
|
||||
|
||||
# Install submodules
|
||||
Write-Info "Installing mini-swe-agent (terminal tool backend)..."
|
||||
if (Test-Path "mini-swe-agent\pyproject.toml") {
|
||||
try {
|
||||
pip install -e ".\mini-swe-agent" 2>&1 | Out-Null
|
||||
Write-Success "mini-swe-agent installed"
|
||||
} catch {
|
||||
Write-Warning "mini-swe-agent install failed (terminal tools may not work)"
|
||||
}
|
||||
} else {
|
||||
Write-Warning "mini-swe-agent not found (run: git submodule update --init)"
|
||||
}
|
||||
|
||||
Write-Info "Installing tinker-atropos (RL training backend)..."
|
||||
if (Test-Path "tinker-atropos\pyproject.toml") {
|
||||
try {
|
||||
pip install -e ".\tinker-atropos" 2>&1 | Out-Null
|
||||
Write-Success "tinker-atropos installed"
|
||||
} catch {
|
||||
Write-Warning "tinker-atropos install failed (RL tools may not work)"
|
||||
}
|
||||
} else {
|
||||
Write-Warning "tinker-atropos not found (run: git submodule update --init)"
|
||||
}
|
||||
|
||||
Pop-Location
|
||||
|
||||
Write-Success "All dependencies installed"
|
||||
}
|
||||
|
||||
function Set-PathVariable {
|
||||
Write-Info "Setting up PATH..."
|
||||
|
||||
if ($NoVenv) {
|
||||
$binDir = "$InstallDir"
|
||||
} else {
|
||||
$binDir = "$InstallDir\venv\Scripts"
|
||||
}
|
||||
|
||||
# Add to user PATH
|
||||
$currentPath = [Environment]::GetEnvironmentVariable("Path", "User")
|
||||
|
||||
if ($currentPath -notlike "*$binDir*") {
|
||||
[Environment]::SetEnvironmentVariable(
|
||||
"Path",
|
||||
"$binDir;$currentPath",
|
||||
"User"
|
||||
)
|
||||
Write-Success "Added to user PATH"
|
||||
} else {
|
||||
Write-Info "PATH already configured"
|
||||
}
|
||||
|
||||
# Update current session
|
||||
$env:Path = "$binDir;$env:Path"
|
||||
}
|
||||
|
||||
function Copy-ConfigTemplates {
|
||||
Write-Info "Setting up configuration files..."
|
||||
|
||||
# Create ~/.hermes directory structure (config at top level, code in subdir)
|
||||
New-Item -ItemType Directory -Force -Path "$HermesHome\cron" | Out-Null
|
||||
New-Item -ItemType Directory -Force -Path "$HermesHome\sessions" | Out-Null
|
||||
New-Item -ItemType Directory -Force -Path "$HermesHome\logs" | Out-Null
|
||||
|
||||
# Create .env at ~/.hermes/.env (top level, easy to find)
|
||||
$envPath = "$HermesHome\.env"
|
||||
if (-not (Test-Path $envPath)) {
|
||||
$examplePath = "$InstallDir\.env.example"
|
||||
if (Test-Path $examplePath) {
|
||||
Copy-Item $examplePath $envPath
|
||||
Write-Success "Created ~/.hermes/.env from template"
|
||||
} else {
|
||||
# Create empty .env if no example exists
|
||||
New-Item -ItemType File -Force -Path $envPath | Out-Null
|
||||
Write-Success "Created ~/.hermes/.env"
|
||||
}
|
||||
} else {
|
||||
Write-Info "~/.hermes/.env already exists, keeping it"
|
||||
}
|
||||
|
||||
# Create config.yaml at ~/.hermes/config.yaml (top level, easy to find)
|
||||
$configPath = "$HermesHome\config.yaml"
|
||||
if (-not (Test-Path $configPath)) {
|
||||
$examplePath = "$InstallDir\cli-config.yaml.example"
|
||||
if (Test-Path $examplePath) {
|
||||
Copy-Item $examplePath $configPath
|
||||
Write-Success "Created ~/.hermes/config.yaml from template"
|
||||
}
|
||||
} else {
|
||||
Write-Info "~/.hermes/config.yaml already exists, keeping it"
|
||||
}
|
||||
|
||||
Write-Success "Configuration directory ready: ~/.hermes/"
|
||||
}
|
||||
|
||||
function Install-NodeDeps {
|
||||
if (-not $HasNode) {
|
||||
Write-Info "Skipping Node.js dependencies (Node not installed)"
|
||||
return
|
||||
}
|
||||
|
||||
Push-Location $InstallDir
|
||||
|
||||
if (Test-Path "package.json") {
|
||||
Write-Info "Installing Node.js dependencies..."
|
||||
try {
|
||||
npm install --silent 2>&1 | Out-Null
|
||||
Write-Success "Node.js dependencies installed"
|
||||
} catch {
|
||||
Write-Warning "npm install failed (browser tools may not work)"
|
||||
}
|
||||
}
|
||||
|
||||
Pop-Location
|
||||
}
|
||||
|
||||
function Invoke-SetupWizard {
|
||||
if ($SkipSetup) {
|
||||
Write-Info "Skipping setup wizard (-SkipSetup)"
|
||||
return
|
||||
}
|
||||
|
||||
Write-Host ""
|
||||
Write-Info "Starting setup wizard..."
|
||||
Write-Host ""
|
||||
|
||||
Push-Location $InstallDir
|
||||
|
||||
if (-not $NoVenv) {
|
||||
& .\venv\Scripts\Activate.ps1
|
||||
}
|
||||
|
||||
python -m hermes_cli.main setup
|
||||
|
||||
Pop-Location
|
||||
}
|
||||
|
||||
function Write-Completion {
|
||||
Write-Host ""
|
||||
Write-Host "┌─────────────────────────────────────────────────────────┐" -ForegroundColor Green
|
||||
Write-Host "│ ✓ Installation Complete! │" -ForegroundColor Green
|
||||
Write-Host "└─────────────────────────────────────────────────────────┘" -ForegroundColor Green
|
||||
Write-Host ""
|
||||
|
||||
# Show file locations
|
||||
Write-Host "📁 Your files (all in ~/.hermes/):" -ForegroundColor Cyan
|
||||
Write-Host ""
|
||||
Write-Host " Config: " -NoNewline -ForegroundColor Yellow
|
||||
Write-Host "$HermesHome\config.yaml"
|
||||
Write-Host " API Keys: " -NoNewline -ForegroundColor Yellow
|
||||
Write-Host "$HermesHome\.env"
|
||||
Write-Host " Data: " -NoNewline -ForegroundColor Yellow
|
||||
Write-Host "$HermesHome\cron\, sessions\, logs\"
|
||||
Write-Host " Code: " -NoNewline -ForegroundColor Yellow
|
||||
Write-Host "$HermesHome\hermes-agent\"
|
||||
Write-Host ""
|
||||
|
||||
Write-Host "─────────────────────────────────────────────────────────" -ForegroundColor Cyan
|
||||
Write-Host ""
|
||||
Write-Host "🚀 Commands:" -ForegroundColor Cyan
|
||||
Write-Host ""
|
||||
Write-Host " hermes " -NoNewline -ForegroundColor Green
|
||||
Write-Host "Start chatting"
|
||||
Write-Host " hermes setup " -NoNewline -ForegroundColor Green
|
||||
Write-Host "Configure API keys & settings"
|
||||
Write-Host " hermes config " -NoNewline -ForegroundColor Green
|
||||
Write-Host "View/edit configuration"
|
||||
Write-Host " hermes config edit " -NoNewline -ForegroundColor Green
|
||||
Write-Host "Open config in editor"
|
||||
Write-Host " hermes gateway " -NoNewline -ForegroundColor Green
|
||||
Write-Host "Run messaging gateway"
|
||||
Write-Host " hermes update " -NoNewline -ForegroundColor Green
|
||||
Write-Host "Update to latest version"
|
||||
Write-Host ""
|
||||
|
||||
Write-Host "─────────────────────────────────────────────────────────" -ForegroundColor Cyan
|
||||
Write-Host ""
|
||||
Write-Host "⚡ Restart your terminal for PATH changes to take effect" -ForegroundColor Yellow
|
||||
Write-Host ""
|
||||
|
||||
# Show notes about optional tools
|
||||
if (-not $HasNode) {
|
||||
Write-Host "Note: Node.js was not found. Browser automation tools" -ForegroundColor Yellow
|
||||
Write-Host "will have limited functionality." -ForegroundColor Yellow
|
||||
Write-Host ""
|
||||
}
|
||||
|
||||
if (-not $HasRipgrep) {
|
||||
Write-Host "Note: ripgrep (rg) was not found. File search will use" -ForegroundColor Yellow
|
||||
Write-Host "findstr as a fallback. For faster search:" -ForegroundColor Yellow
|
||||
Write-Host " winget install BurntSushi.ripgrep.MSVC" -ForegroundColor Yellow
|
||||
Write-Host ""
|
||||
}
|
||||
}
|
||||
|
||||
# ============================================================================
|
||||
# Main
|
||||
# ============================================================================
|
||||
|
||||
function Main {
|
||||
Write-Banner
|
||||
|
||||
if (-not (Test-Python)) { exit 1 }
|
||||
if (-not (Test-Git)) { exit 1 }
|
||||
Test-Node # Optional, doesn't fail
|
||||
Test-Ripgrep # Optional, doesn't fail
|
||||
|
||||
Install-Repository
|
||||
Install-Venv
|
||||
Install-Dependencies
|
||||
Install-NodeDeps
|
||||
Set-PathVariable
|
||||
Copy-ConfigTemplates
|
||||
Invoke-SetupWizard
|
||||
|
||||
Write-Completion
|
||||
}
|
||||
|
||||
Main
|
||||
692
scripts/install.sh
Executable file
692
scripts/install.sh
Executable file
@@ -0,0 +1,692 @@
|
||||
#!/bin/bash
|
||||
# ============================================================================
|
||||
# Hermes Agent Installer
|
||||
# ============================================================================
|
||||
# Installation script for Linux and macOS.
|
||||
#
|
||||
# Usage:
|
||||
# curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
|
||||
#
|
||||
# Or with options:
|
||||
# curl -fsSL ... | bash -s -- --no-venv --skip-setup
|
||||
#
|
||||
# ============================================================================
|
||||
|
||||
set -e
|
||||
|
||||
# Colors
|
||||
RED='\033[0;31m'
|
||||
GREEN='\033[0;32m'
|
||||
YELLOW='\033[0;33m'
|
||||
BLUE='\033[0;34m'
|
||||
MAGENTA='\033[0;35m'
|
||||
CYAN='\033[0;36m'
|
||||
NC='\033[0m' # No Color
|
||||
BOLD='\033[1m'
|
||||
|
||||
# Configuration
|
||||
REPO_URL_SSH="git@github.com:NousResearch/hermes-agent.git"
|
||||
REPO_URL_HTTPS="https://github.com/NousResearch/hermes-agent.git"
|
||||
HERMES_HOME="$HOME/.hermes"
|
||||
INSTALL_DIR="${HERMES_INSTALL_DIR:-$HERMES_HOME/hermes-agent}"
|
||||
PYTHON_MIN_VERSION="3.10"
|
||||
|
||||
# Options
|
||||
USE_VENV=true
|
||||
RUN_SETUP=true
|
||||
BRANCH="main"
|
||||
|
||||
# Parse arguments
|
||||
while [[ $# -gt 0 ]]; do
|
||||
case $1 in
|
||||
--no-venv)
|
||||
USE_VENV=false
|
||||
shift
|
||||
;;
|
||||
--skip-setup)
|
||||
RUN_SETUP=false
|
||||
shift
|
||||
;;
|
||||
--branch)
|
||||
BRANCH="$2"
|
||||
shift 2
|
||||
;;
|
||||
--dir)
|
||||
INSTALL_DIR="$2"
|
||||
shift 2
|
||||
;;
|
||||
-h|--help)
|
||||
echo "Hermes Agent Installer"
|
||||
echo ""
|
||||
echo "Usage: install.sh [OPTIONS]"
|
||||
echo ""
|
||||
echo "Options:"
|
||||
echo " --no-venv Don't create virtual environment"
|
||||
echo " --skip-setup Skip interactive setup wizard"
|
||||
echo " --branch NAME Git branch to install (default: main)"
|
||||
echo " --dir PATH Installation directory (default: ~/.hermes-agent)"
|
||||
echo " -h, --help Show this help"
|
||||
exit 0
|
||||
;;
|
||||
*)
|
||||
echo "Unknown option: $1"
|
||||
exit 1
|
||||
;;
|
||||
esac
|
||||
done
|
||||
|
||||
# ============================================================================
|
||||
# Helper functions
|
||||
# ============================================================================
|
||||
|
||||
print_banner() {
|
||||
echo ""
|
||||
echo -e "${MAGENTA}${BOLD}"
|
||||
echo "┌─────────────────────────────────────────────────────────┐"
|
||||
echo "│ 🦋 Hermes Agent Installer │"
|
||||
echo "├─────────────────────────────────────────────────────────┤"
|
||||
echo "│ I'm just a butterfly with a lot of tools. │"
|
||||
echo "└─────────────────────────────────────────────────────────┘"
|
||||
echo -e "${NC}"
|
||||
}
|
||||
|
||||
log_info() {
|
||||
echo -e "${CYAN}→${NC} $1"
|
||||
}
|
||||
|
||||
log_success() {
|
||||
echo -e "${GREEN}✓${NC} $1"
|
||||
}
|
||||
|
||||
log_warn() {
|
||||
echo -e "${YELLOW}⚠${NC} $1"
|
||||
}
|
||||
|
||||
log_error() {
|
||||
echo -e "${RED}✗${NC} $1"
|
||||
}
|
||||
|
||||
# ============================================================================
|
||||
# System detection
|
||||
# ============================================================================
|
||||
|
||||
detect_os() {
|
||||
case "$(uname -s)" in
|
||||
Linux*)
|
||||
OS="linux"
|
||||
if [ -f /etc/os-release ]; then
|
||||
. /etc/os-release
|
||||
DISTRO="$ID"
|
||||
else
|
||||
DISTRO="unknown"
|
||||
fi
|
||||
;;
|
||||
Darwin*)
|
||||
OS="macos"
|
||||
DISTRO="macos"
|
||||
;;
|
||||
CYGWIN*|MINGW*|MSYS*)
|
||||
OS="windows"
|
||||
DISTRO="windows"
|
||||
log_error "Windows detected. Please use the PowerShell installer:"
|
||||
log_info " irm https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.ps1 | iex"
|
||||
exit 1
|
||||
;;
|
||||
*)
|
||||
OS="unknown"
|
||||
DISTRO="unknown"
|
||||
log_warn "Unknown operating system"
|
||||
;;
|
||||
esac
|
||||
|
||||
log_success "Detected: $OS ($DISTRO)"
|
||||
}
|
||||
|
||||
# ============================================================================
|
||||
# Dependency checks
|
||||
# ============================================================================
|
||||
|
||||
check_python() {
|
||||
log_info "Checking Python..."
|
||||
|
||||
# Try different python commands
|
||||
for cmd in python3.12 python3.11 python3.10 python3 python; do
|
||||
if command -v $cmd &> /dev/null; then
|
||||
PYTHON_CMD=$cmd
|
||||
PYTHON_VERSION=$($cmd -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")')
|
||||
|
||||
# Check version
|
||||
if python3 -c "import sys; exit(0 if sys.version_info >= (3, 10) else 1)" 2>/dev/null; then
|
||||
log_success "Python $PYTHON_VERSION found"
|
||||
return 0
|
||||
fi
|
||||
fi
|
||||
done
|
||||
|
||||
log_error "Python 3.10+ not found"
|
||||
log_info "Please install Python 3.10 or newer:"
|
||||
|
||||
case "$OS" in
|
||||
linux)
|
||||
case "$DISTRO" in
|
||||
ubuntu|debian)
|
||||
log_info " sudo apt update && sudo apt install python3.11 python3.11-venv"
|
||||
;;
|
||||
fedora)
|
||||
log_info " sudo dnf install python3.11"
|
||||
;;
|
||||
arch)
|
||||
log_info " sudo pacman -S python"
|
||||
;;
|
||||
*)
|
||||
log_info " Use your package manager to install Python 3.10+"
|
||||
;;
|
||||
esac
|
||||
;;
|
||||
macos)
|
||||
log_info " brew install python@3.11"
|
||||
log_info " Or download from https://www.python.org/downloads/"
|
||||
;;
|
||||
esac
|
||||
|
||||
exit 1
|
||||
}
|
||||
|
||||
check_git() {
|
||||
log_info "Checking Git..."
|
||||
|
||||
if command -v git &> /dev/null; then
|
||||
GIT_VERSION=$(git --version | awk '{print $3}')
|
||||
log_success "Git $GIT_VERSION found"
|
||||
return 0
|
||||
fi
|
||||
|
||||
log_error "Git not found"
|
||||
log_info "Please install Git:"
|
||||
|
||||
case "$OS" in
|
||||
linux)
|
||||
case "$DISTRO" in
|
||||
ubuntu|debian)
|
||||
log_info " sudo apt update && sudo apt install git"
|
||||
;;
|
||||
fedora)
|
||||
log_info " sudo dnf install git"
|
||||
;;
|
||||
arch)
|
||||
log_info " sudo pacman -S git"
|
||||
;;
|
||||
*)
|
||||
log_info " Use your package manager to install git"
|
||||
;;
|
||||
esac
|
||||
;;
|
||||
macos)
|
||||
log_info " xcode-select --install"
|
||||
log_info " Or: brew install git"
|
||||
;;
|
||||
esac
|
||||
|
||||
exit 1
|
||||
}
|
||||
|
||||
check_node() {
|
||||
log_info "Checking Node.js (optional, for browser tools)..."
|
||||
|
||||
if command -v node &> /dev/null; then
|
||||
NODE_VERSION=$(node --version)
|
||||
log_success "Node.js $NODE_VERSION found"
|
||||
HAS_NODE=true
|
||||
return 0
|
||||
fi
|
||||
|
||||
log_warn "Node.js not found (browser tools will be limited)"
|
||||
log_info "To install Node.js (optional):"
|
||||
|
||||
case "$OS" in
|
||||
linux)
|
||||
case "$DISTRO" in
|
||||
ubuntu|debian)
|
||||
log_info " curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -"
|
||||
log_info " sudo apt install -y nodejs"
|
||||
;;
|
||||
fedora)
|
||||
log_info " sudo dnf install nodejs"
|
||||
;;
|
||||
arch)
|
||||
log_info " sudo pacman -S nodejs npm"
|
||||
;;
|
||||
*)
|
||||
log_info " https://nodejs.org/en/download/"
|
||||
;;
|
||||
esac
|
||||
;;
|
||||
macos)
|
||||
log_info " brew install node"
|
||||
log_info " Or: https://nodejs.org/en/download/"
|
||||
;;
|
||||
esac
|
||||
|
||||
HAS_NODE=false
|
||||
# Don't exit - Node is optional
|
||||
}
|
||||
|
||||
check_ripgrep() {
|
||||
log_info "Checking ripgrep (optional, for faster file search)..."
|
||||
|
||||
if command -v rg &> /dev/null; then
|
||||
RG_VERSION=$(rg --version | head -1)
|
||||
log_success "$RG_VERSION found"
|
||||
HAS_RIPGREP=true
|
||||
return 0
|
||||
fi
|
||||
|
||||
log_warn "ripgrep not found (file search will use grep fallback)"
|
||||
|
||||
# Offer to install
|
||||
echo ""
|
||||
read -p "Would you like to install ripgrep? (faster search, recommended) [Y/n] " -n 1 -r
|
||||
echo
|
||||
|
||||
if [[ $REPLY =~ ^[Yy]$ ]] || [[ -z $REPLY ]]; then
|
||||
log_info "Installing ripgrep..."
|
||||
|
||||
# Check if we can use sudo
|
||||
CAN_SUDO=false
|
||||
if command -v sudo &> /dev/null; then
|
||||
# Check if user has sudo access (without actually running sudo)
|
||||
if sudo -n true 2>/dev/null || sudo -v 2>/dev/null; then
|
||||
CAN_SUDO=true
|
||||
fi
|
||||
fi
|
||||
|
||||
case "$OS" in
|
||||
linux)
|
||||
if [ "$CAN_SUDO" = true ]; then
|
||||
case "$DISTRO" in
|
||||
ubuntu|debian)
|
||||
if sudo apt install -y ripgrep 2>/dev/null; then
|
||||
log_success "ripgrep installed"
|
||||
HAS_RIPGREP=true
|
||||
return 0
|
||||
fi
|
||||
;;
|
||||
fedora)
|
||||
if sudo dnf install -y ripgrep 2>/dev/null; then
|
||||
log_success "ripgrep installed"
|
||||
HAS_RIPGREP=true
|
||||
return 0
|
||||
fi
|
||||
;;
|
||||
arch)
|
||||
if sudo pacman -S --noconfirm ripgrep 2>/dev/null; then
|
||||
log_success "ripgrep installed"
|
||||
HAS_RIPGREP=true
|
||||
return 0
|
||||
fi
|
||||
;;
|
||||
esac
|
||||
else
|
||||
log_warn "sudo not available - cannot auto-install system packages"
|
||||
# Try cargo as fallback if available
|
||||
if command -v cargo &> /dev/null; then
|
||||
log_info "Trying cargo install (no sudo required)..."
|
||||
if cargo install ripgrep 2>/dev/null; then
|
||||
log_success "ripgrep installed via cargo"
|
||||
HAS_RIPGREP=true
|
||||
return 0
|
||||
fi
|
||||
fi
|
||||
fi
|
||||
;;
|
||||
macos)
|
||||
if command -v brew &> /dev/null; then
|
||||
if brew install ripgrep 2>/dev/null; then
|
||||
log_success "ripgrep installed"
|
||||
HAS_RIPGREP=true
|
||||
return 0
|
||||
fi
|
||||
fi
|
||||
;;
|
||||
esac
|
||||
log_warn "Auto-install failed. You can install manually later:"
|
||||
else
|
||||
log_info "Skipping ripgrep installation. To install manually:"
|
||||
fi
|
||||
|
||||
# Show manual install instructions
|
||||
case "$OS" in
|
||||
linux)
|
||||
case "$DISTRO" in
|
||||
ubuntu|debian)
|
||||
log_info " sudo apt install ripgrep"
|
||||
;;
|
||||
fedora)
|
||||
log_info " sudo dnf install ripgrep"
|
||||
;;
|
||||
arch)
|
||||
log_info " sudo pacman -S ripgrep"
|
||||
;;
|
||||
*)
|
||||
log_info " https://github.com/BurntSushi/ripgrep#installation"
|
||||
;;
|
||||
esac
|
||||
# Show cargo alternative for users without sudo
|
||||
if command -v cargo &> /dev/null; then
|
||||
log_info " Or without sudo: cargo install ripgrep"
|
||||
fi
|
||||
;;
|
||||
macos)
|
||||
log_info " brew install ripgrep"
|
||||
;;
|
||||
esac
|
||||
|
||||
HAS_RIPGREP=false
|
||||
# Don't exit - ripgrep is optional (grep fallback exists)
|
||||
}
|
||||
|
||||
# ============================================================================
|
||||
# Installation
|
||||
# ============================================================================
|
||||
|
||||
clone_repo() {
|
||||
log_info "Installing to $INSTALL_DIR..."
|
||||
|
||||
if [ -d "$INSTALL_DIR" ]; then
|
||||
if [ -d "$INSTALL_DIR/.git" ]; then
|
||||
log_info "Existing installation found, updating..."
|
||||
cd "$INSTALL_DIR"
|
||||
git fetch origin
|
||||
git checkout "$BRANCH"
|
||||
git pull origin "$BRANCH"
|
||||
else
|
||||
log_error "Directory exists but is not a git repository: $INSTALL_DIR"
|
||||
log_info "Remove it or choose a different directory with --dir"
|
||||
exit 1
|
||||
fi
|
||||
else
|
||||
# Try SSH first (for private repo access), fall back to HTTPS
|
||||
# Use --recurse-submodules to also clone mini-swe-agent and tinker-atropos
|
||||
log_info "Trying SSH clone..."
|
||||
if git clone --branch "$BRANCH" --recurse-submodules "$REPO_URL_SSH" "$INSTALL_DIR" 2>/dev/null; then
|
||||
log_success "Cloned via SSH"
|
||||
else
|
||||
log_info "SSH failed, trying HTTPS..."
|
||||
if git clone --branch "$BRANCH" --recurse-submodules "$REPO_URL_HTTPS" "$INSTALL_DIR"; then
|
||||
log_success "Cloned via HTTPS"
|
||||
else
|
||||
log_error "Failed to clone repository"
|
||||
log_info "For private repo access, ensure your SSH key is added to GitHub:"
|
||||
log_info " ssh-add ~/.ssh/id_rsa"
|
||||
log_info " ssh -T git@github.com # Test connection"
|
||||
exit 1
|
||||
fi
|
||||
fi
|
||||
fi
|
||||
|
||||
cd "$INSTALL_DIR"
|
||||
|
||||
# Ensure submodules are initialized and updated (for existing installs or if --recurse failed)
|
||||
log_info "Initializing submodules (mini-swe-agent, tinker-atropos)..."
|
||||
git submodule update --init --recursive
|
||||
log_success "Submodules ready"
|
||||
|
||||
log_success "Repository ready"
|
||||
}
|
||||
|
||||
setup_venv() {
|
||||
if [ "$USE_VENV" = false ]; then
|
||||
log_info "Skipping virtual environment (--no-venv)"
|
||||
return 0
|
||||
fi
|
||||
|
||||
log_info "Creating virtual environment..."
|
||||
|
||||
if [ -d "venv" ]; then
|
||||
log_info "Virtual environment already exists"
|
||||
else
|
||||
$PYTHON_CMD -m venv venv
|
||||
fi
|
||||
|
||||
# Activate
|
||||
source venv/bin/activate
|
||||
|
||||
# Upgrade pip
|
||||
pip install --upgrade pip wheel setuptools > /dev/null
|
||||
|
||||
log_success "Virtual environment ready"
|
||||
}
|
||||
|
||||
install_deps() {
|
||||
log_info "Installing dependencies..."
|
||||
|
||||
if [ "$USE_VENV" = true ]; then
|
||||
source venv/bin/activate
|
||||
fi
|
||||
|
||||
# Install the main package in editable mode with all extras
|
||||
pip install -e ".[all]" > /dev/null 2>&1 || pip install -e "." > /dev/null
|
||||
|
||||
log_success "Main package installed"
|
||||
|
||||
# Install submodules
|
||||
log_info "Installing mini-swe-agent (terminal tool backend)..."
|
||||
if [ -d "mini-swe-agent" ] && [ -f "mini-swe-agent/pyproject.toml" ]; then
|
||||
pip install -e "./mini-swe-agent" > /dev/null 2>&1 || log_warn "mini-swe-agent install failed (terminal tools may not work)"
|
||||
log_success "mini-swe-agent installed"
|
||||
else
|
||||
log_warn "mini-swe-agent not found (run: git submodule update --init)"
|
||||
fi
|
||||
|
||||
log_info "Installing tinker-atropos (RL training backend)..."
|
||||
if [ -d "tinker-atropos" ] && [ -f "tinker-atropos/pyproject.toml" ]; then
|
||||
pip install -e "./tinker-atropos" > /dev/null 2>&1 || log_warn "tinker-atropos install failed (RL tools may not work)"
|
||||
log_success "tinker-atropos installed"
|
||||
else
|
||||
log_warn "tinker-atropos not found (run: git submodule update --init)"
|
||||
fi
|
||||
|
||||
log_success "All dependencies installed"
|
||||
}
|
||||
|
||||
setup_path() {
|
||||
log_info "Setting up PATH..."
|
||||
|
||||
# Determine the bin directory
|
||||
if [ "$USE_VENV" = true ]; then
|
||||
BIN_DIR="$INSTALL_DIR/venv/bin"
|
||||
else
|
||||
BIN_DIR="$HOME/.local/bin"
|
||||
mkdir -p "$BIN_DIR"
|
||||
|
||||
# Create a wrapper script
|
||||
cat > "$BIN_DIR/hermes" << EOF
|
||||
#!/bin/bash
|
||||
cd "$INSTALL_DIR"
|
||||
exec python -m hermes_cli.main "\$@"
|
||||
EOF
|
||||
chmod +x "$BIN_DIR/hermes"
|
||||
fi
|
||||
|
||||
# Add to PATH in shell config
|
||||
SHELL_CONFIG=""
|
||||
if [ -n "$BASH_VERSION" ]; then
|
||||
if [ -f "$HOME/.bashrc" ]; then
|
||||
SHELL_CONFIG="$HOME/.bashrc"
|
||||
elif [ -f "$HOME/.bash_profile" ]; then
|
||||
SHELL_CONFIG="$HOME/.bash_profile"
|
||||
fi
|
||||
elif [ -n "$ZSH_VERSION" ] || [ -f "$HOME/.zshrc" ]; then
|
||||
SHELL_CONFIG="$HOME/.zshrc"
|
||||
fi
|
||||
|
||||
PATH_LINE="export PATH=\"$BIN_DIR:\$PATH\""
|
||||
|
||||
if [ -n "$SHELL_CONFIG" ]; then
|
||||
if ! grep -q "hermes-agent" "$SHELL_CONFIG" 2>/dev/null; then
|
||||
echo "" >> "$SHELL_CONFIG"
|
||||
echo "# Hermes Agent" >> "$SHELL_CONFIG"
|
||||
echo "$PATH_LINE" >> "$SHELL_CONFIG"
|
||||
log_success "Added to $SHELL_CONFIG"
|
||||
else
|
||||
log_info "PATH already configured in $SHELL_CONFIG"
|
||||
fi
|
||||
fi
|
||||
|
||||
# Also export for current session
|
||||
export PATH="$BIN_DIR:$PATH"
|
||||
|
||||
log_success "PATH configured"
|
||||
}
|
||||
|
||||
copy_config_templates() {
|
||||
log_info "Setting up configuration files..."
|
||||
|
||||
# Create ~/.hermes directory structure (config at top level, code in subdir)
|
||||
mkdir -p "$HERMES_HOME/cron"
|
||||
mkdir -p "$HERMES_HOME/sessions"
|
||||
mkdir -p "$HERMES_HOME/logs"
|
||||
|
||||
# Create .env at ~/.hermes/.env (top level, easy to find)
|
||||
if [ ! -f "$HERMES_HOME/.env" ]; then
|
||||
if [ -f "$INSTALL_DIR/.env.example" ]; then
|
||||
cp "$INSTALL_DIR/.env.example" "$HERMES_HOME/.env"
|
||||
log_success "Created ~/.hermes/.env from template"
|
||||
else
|
||||
# Create empty .env if no example exists
|
||||
touch "$HERMES_HOME/.env"
|
||||
log_success "Created ~/.hermes/.env"
|
||||
fi
|
||||
else
|
||||
log_info "~/.hermes/.env already exists, keeping it"
|
||||
fi
|
||||
|
||||
# Create config.yaml at ~/.hermes/config.yaml (top level, easy to find)
|
||||
if [ ! -f "$HERMES_HOME/config.yaml" ]; then
|
||||
if [ -f "$INSTALL_DIR/cli-config.yaml.example" ]; then
|
||||
cp "$INSTALL_DIR/cli-config.yaml.example" "$HERMES_HOME/config.yaml"
|
||||
log_success "Created ~/.hermes/config.yaml from template"
|
||||
fi
|
||||
else
|
||||
log_info "~/.hermes/config.yaml already exists, keeping it"
|
||||
fi
|
||||
|
||||
log_success "Configuration directory ready: ~/.hermes/"
|
||||
}
|
||||
|
||||
install_node_deps() {
|
||||
if [ "$HAS_NODE" = false ]; then
|
||||
log_info "Skipping Node.js dependencies (Node not installed)"
|
||||
return 0
|
||||
fi
|
||||
|
||||
if [ -f "$INSTALL_DIR/package.json" ]; then
|
||||
log_info "Installing Node.js dependencies..."
|
||||
cd "$INSTALL_DIR"
|
||||
npm install --silent 2>/dev/null || {
|
||||
log_warn "npm install failed (browser tools may not work)"
|
||||
return 0
|
||||
}
|
||||
log_success "Node.js dependencies installed"
|
||||
fi
|
||||
}
|
||||
|
||||
run_setup_wizard() {
|
||||
if [ "$RUN_SETUP" = false ]; then
|
||||
log_info "Skipping setup wizard (--skip-setup)"
|
||||
return 0
|
||||
fi
|
||||
|
||||
echo ""
|
||||
log_info "Starting setup wizard..."
|
||||
echo ""
|
||||
|
||||
if [ "$USE_VENV" = true ]; then
|
||||
source "$INSTALL_DIR/venv/bin/activate"
|
||||
fi
|
||||
|
||||
cd "$INSTALL_DIR"
|
||||
python -m hermes_cli.main setup
|
||||
}
|
||||
|
||||
print_success() {
|
||||
echo ""
|
||||
echo -e "${GREEN}${BOLD}"
|
||||
echo "┌─────────────────────────────────────────────────────────┐"
|
||||
echo "│ ✓ Installation Complete! │"
|
||||
echo "└─────────────────────────────────────────────────────────┘"
|
||||
echo -e "${NC}"
|
||||
echo ""
|
||||
|
||||
# Show file locations
|
||||
echo -e "${CYAN}${BOLD}📁 Your files (all in ~/.hermes/):${NC}"
|
||||
echo ""
|
||||
echo -e " ${YELLOW}Config:${NC} ~/.hermes/config.yaml"
|
||||
echo -e " ${YELLOW}API Keys:${NC} ~/.hermes/.env"
|
||||
echo -e " ${YELLOW}Data:${NC} ~/.hermes/cron/, sessions/, logs/"
|
||||
echo -e " ${YELLOW}Code:${NC} ~/.hermes/hermes-agent/"
|
||||
echo ""
|
||||
|
||||
echo -e "${CYAN}─────────────────────────────────────────────────────────${NC}"
|
||||
echo ""
|
||||
echo -e "${CYAN}${BOLD}🚀 Commands:${NC}"
|
||||
echo ""
|
||||
echo -e " ${GREEN}hermes${NC} Start chatting"
|
||||
echo -e " ${GREEN}hermes setup${NC} Configure API keys & settings"
|
||||
echo -e " ${GREEN}hermes config${NC} View/edit configuration"
|
||||
echo -e " ${GREEN}hermes config edit${NC} Open config in editor"
|
||||
echo -e " ${GREEN}hermes gateway${NC} Run messaging gateway"
|
||||
echo -e " ${GREEN}hermes update${NC} Update to latest version"
|
||||
echo ""
|
||||
|
||||
echo -e "${CYAN}─────────────────────────────────────────────────────────${NC}"
|
||||
echo ""
|
||||
echo -e "${YELLOW}⚡ Reload your shell to use 'hermes' command:${NC}"
|
||||
echo ""
|
||||
echo " source ~/.bashrc # or ~/.zshrc"
|
||||
echo ""
|
||||
|
||||
# Show Node.js warning if not installed
|
||||
if [ "$HAS_NODE" = false ]; then
|
||||
echo -e "${YELLOW}"
|
||||
echo "Note: Node.js was not found. Browser automation tools"
|
||||
echo "will have limited functionality. Install Node.js later"
|
||||
echo "if you need full browser support."
|
||||
echo -e "${NC}"
|
||||
fi
|
||||
|
||||
# Show ripgrep note if not installed
|
||||
if [ "$HAS_RIPGREP" = false ]; then
|
||||
echo -e "${YELLOW}"
|
||||
echo "Note: ripgrep (rg) was not found. File search will use"
|
||||
echo "grep as a fallback. For faster search in large codebases,"
|
||||
echo "install ripgrep: sudo apt install ripgrep (or brew install ripgrep)"
|
||||
echo -e "${NC}"
|
||||
fi
|
||||
}
|
||||
|
||||
# ============================================================================
|
||||
# Main
|
||||
# ============================================================================
|
||||
|
||||
main() {
|
||||
print_banner
|
||||
|
||||
detect_os
|
||||
check_python
|
||||
check_git
|
||||
check_node
|
||||
check_ripgrep
|
||||
|
||||
clone_repo
|
||||
setup_venv
|
||||
install_deps
|
||||
install_node_deps
|
||||
setup_path
|
||||
copy_config_templates
|
||||
run_setup_wizard
|
||||
|
||||
print_success
|
||||
}
|
||||
|
||||
main
|
||||
@@ -1,62 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
set -euo pipefail
|
||||
|
||||
# Launch a local llama.cpp OpenAI-compatible server running GLM-4.7-Flash (GGUF).
|
||||
#
|
||||
# Requires:
|
||||
# - `llama-server` installed (e.g. `brew install llama.cpp`)
|
||||
#
|
||||
# Default settings are chosen to avoid clashing with Atropos sandbox_server
|
||||
# (which commonly uses port 8080 in local dev).
|
||||
#
|
||||
# Usage:
|
||||
# Hermes-Agent/scripts/launch_llama_cpp_glm47_flash.sh
|
||||
#
|
||||
# Override defaults:
|
||||
# LLAMA_CPP_HOST=127.0.0.1 LLAMA_CPP_PORT=8082 \
|
||||
# LLAMA_CPP_HF_REPO=ggml-org/GLM-4.7-Flash-GGUF \
|
||||
# LLAMA_CPP_HF_FILE=GLM-4.7-Flash-Q4_K.gguf \
|
||||
# Hermes-Agent/scripts/launch_llama_cpp_glm47_flash.sh
|
||||
|
||||
HOST="${LLAMA_CPP_HOST:-127.0.0.1}"
|
||||
PORT="${LLAMA_CPP_PORT:-8080}"
|
||||
HF_REPO="${LLAMA_CPP_HF_REPO:-ggml-org/GLM-4.7-Flash-GGUF}"
|
||||
HF_FILE="${LLAMA_CPP_HF_FILE:-GLM-4.7-Flash-Q4_K.gguf}"
|
||||
ALIAS="${LLAMA_CPP_ALIAS:-glm-4.7-flash}"
|
||||
|
||||
if ! command -v llama-server >/dev/null 2>&1; then
|
||||
echo "Error: llama-server not found in PATH."
|
||||
echo "Install via Homebrew: brew install llama.cpp"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
echo "Launching llama.cpp server..."
|
||||
echo " host: $HOST"
|
||||
echo " port: $PORT"
|
||||
echo " repo: $HF_REPO"
|
||||
echo " file: $HF_FILE"
|
||||
echo " alias: $ALIAS"
|
||||
echo
|
||||
echo "Suggested env vars for Hermes/Atropos integration:"
|
||||
echo " export ATROPOS_SERVER_BASE_URL=http://${HOST}:${PORT}"
|
||||
echo " export ATROPOS_SERVER_MODEL=${ALIAS}"
|
||||
echo " export ATROPOS_SERVER_API_KEY=local"
|
||||
echo
|
||||
|
||||
if command -v lsof >/dev/null 2>&1; then
|
||||
if lsof -nP -iTCP:"$PORT" -sTCP:LISTEN >/dev/null 2>&1; then
|
||||
echo "Error: port $PORT is already in use."
|
||||
echo "Pick a different port, e.g.:"
|
||||
echo " LLAMA_CPP_PORT=8082 Hermes-Agent/scripts/launch_llama_cpp_glm47_flash.sh"
|
||||
exit 1
|
||||
fi
|
||||
fi
|
||||
|
||||
exec llama-server \
|
||||
--host "$HOST" \
|
||||
--port "$PORT" \
|
||||
--hf-repo "$HF_REPO" \
|
||||
--hf-file "$HF_FILE" \
|
||||
--alias "$ALIAS" \
|
||||
-c 32768 \
|
||||
-n -1
|
||||
@@ -1,70 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
set -euo pipefail
|
||||
|
||||
# Launch a local llama.cpp OpenAI-compatible server running Hermes 4.3 36B (GGUF).
|
||||
#
|
||||
# Requires:
|
||||
# - `llama-server` installed (e.g. `brew install llama.cpp`)
|
||||
#
|
||||
# Note: Port choice can conflict with other local dev servers. If 8080 is already
|
||||
# in use, override via `LLAMA_CPP_PORT=...`.
|
||||
#
|
||||
# Usage:
|
||||
# Hermes-Agent/scripts/launch_llama_cpp_hermes_4_36b.sh
|
||||
#
|
||||
# Override defaults:
|
||||
# LLAMA_CPP_HOST=127.0.0.1 LLAMA_CPP_PORT=8082 \
|
||||
# LLAMA_CPP_HF_REPO=NousResearch/Hermes-4.3-36B-GGUF \
|
||||
# LLAMA_CPP_HF_FILE=hermes-4_3_36b-Q4_K_M.gguf \
|
||||
# LLAMA_CPP_ALIAS=hermes-4-36b \
|
||||
# LLAMA_CPP_PARALLEL=4 LLAMA_CPP_THREADS_HTTP=4 \
|
||||
# Hermes-Agent/scripts/launch_llama_cpp_hermes_4_36b.sh
|
||||
|
||||
HOST="${LLAMA_CPP_HOST:-127.0.0.1}"
|
||||
PORT="${LLAMA_CPP_PORT:-8080}"
|
||||
HF_REPO="${LLAMA_CPP_HF_REPO:-NousResearch/Hermes-4.3-36B-GGUF}"
|
||||
HF_FILE="${LLAMA_CPP_HF_FILE:-hermes-4_3_36b-Q4_K_M.gguf}"
|
||||
ALIAS="${LLAMA_CPP_ALIAS:-hermes-4-36b}"
|
||||
PARALLEL="${LLAMA_CPP_PARALLEL:-4}"
|
||||
THREADS_HTTP="${LLAMA_CPP_THREADS_HTTP:-4}"
|
||||
|
||||
if ! command -v llama-server >/dev/null 2>&1; then
|
||||
echo "Error: llama-server not found in PATH."
|
||||
echo "Install via Homebrew: brew install llama.cpp"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
echo "Launching llama.cpp server..."
|
||||
echo " host: $HOST"
|
||||
echo " port: $PORT"
|
||||
echo " repo: $HF_REPO"
|
||||
echo " file: $HF_FILE"
|
||||
echo " alias: $ALIAS"
|
||||
echo " slots: $PARALLEL"
|
||||
echo
|
||||
echo "Suggested env vars for Hermes/Atropos integration:"
|
||||
echo " export ATROPOS_SERVER_BASE_URL=http://${HOST}:${PORT}"
|
||||
echo " export ATROPOS_SERVER_MODEL=${ALIAS}"
|
||||
echo " export ATROPOS_TOKENIZER_NAME=NousResearch/Hermes-4.3-36B"
|
||||
echo " export ATROPOS_SERVER_API_KEY=local"
|
||||
echo
|
||||
|
||||
if command -v lsof >/dev/null 2>&1; then
|
||||
if lsof -nP -iTCP:"$PORT" -sTCP:LISTEN >/dev/null 2>&1; then
|
||||
echo "Error: port $PORT is already in use."
|
||||
echo "Pick a different port, e.g.:"
|
||||
echo " LLAMA_CPP_PORT=8082 Hermes-Agent/scripts/launch_llama_cpp_hermes_4_36b.sh"
|
||||
exit 1
|
||||
fi
|
||||
fi
|
||||
|
||||
exec llama-server \
|
||||
--host "$HOST" \
|
||||
--port "$PORT" \
|
||||
--hf-repo "$HF_REPO" \
|
||||
--hf-file "$HF_FILE" \
|
||||
--alias "$ALIAS" \
|
||||
--parallel "$PARALLEL" \
|
||||
--threads-http "$THREADS_HTTP" \
|
||||
-c 32768 \
|
||||
-n -1
|
||||
306
setup-hermes.sh
306
setup-hermes.sh
@@ -1,149 +1,203 @@
|
||||
#!/bin/bash
|
||||
|
||||
# ============================================================================
|
||||
# Hermes Agent Setup Script
|
||||
# Automated setup for all dependencies and configuration
|
||||
# ============================================================================
|
||||
# Quick setup for developers who cloned the repo manually.
|
||||
#
|
||||
# Usage:
|
||||
# ./setup-hermes.sh
|
||||
#
|
||||
# This script:
|
||||
# 1. Creates a virtual environment (if not exists)
|
||||
# 2. Installs dependencies
|
||||
# 3. Creates .env from template (if not exists)
|
||||
# 4. Installs the 'hermes' CLI command
|
||||
# 5. Runs the setup wizard (optional)
|
||||
# ============================================================================
|
||||
|
||||
set -e
|
||||
|
||||
echo "========================================="
|
||||
echo "Hermes Agent Setup"
|
||||
echo "========================================="
|
||||
# Colors
|
||||
GREEN='\033[0;32m'
|
||||
YELLOW='\033[0;33m'
|
||||
CYAN='\033[0;36m'
|
||||
NC='\033[0m'
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
cd "$SCRIPT_DIR"
|
||||
|
||||
echo ""
|
||||
echo -e "${CYAN}🦋 Hermes Agent Setup${NC}"
|
||||
echo ""
|
||||
|
||||
# Change to hermes-agent directory
|
||||
cd /home/teknium/hermes-agent
|
||||
# ============================================================================
|
||||
# Python check
|
||||
# ============================================================================
|
||||
|
||||
# Check Python version
|
||||
echo "[1/10] Checking Python version..."
|
||||
python_version=$(python3 --version | cut -d' ' -f2 | cut -d'.' -f1,2)
|
||||
echo "✓ Python $python_version detected"
|
||||
echo ""
|
||||
echo -e "${CYAN}→${NC} Checking Python..."
|
||||
|
||||
# Install uv
|
||||
echo "[2/10] Installing uv (fast Python package installer)..."
|
||||
if ! command -v uv &> /dev/null; then
|
||||
echo "Installing uv..."
|
||||
curl -LsSf https://astral.sh/uv/install.sh | sh
|
||||
export PATH="$HOME/.cargo/bin:$PATH"
|
||||
echo "✓ uv installed"
|
||||
else
|
||||
echo "✓ uv already installed: $(uv --version)"
|
||||
PYTHON_CMD=""
|
||||
for cmd in python3.12 python3.11 python3.10 python3 python; do
|
||||
if command -v $cmd &> /dev/null; then
|
||||
if $cmd -c "import sys; exit(0 if sys.version_info >= (3, 10) else 1)" 2>/dev/null; then
|
||||
PYTHON_CMD=$cmd
|
||||
break
|
||||
fi
|
||||
fi
|
||||
done
|
||||
|
||||
if [ -z "$PYTHON_CMD" ]; then
|
||||
echo -e "${YELLOW}✗${NC} Python 3.10+ required"
|
||||
exit 1
|
||||
fi
|
||||
echo ""
|
||||
|
||||
# Install Node.js 20 using NodeSource
|
||||
echo "[3/10] Installing Node.js 20..."
|
||||
if ! command -v node &> /dev/null || [[ $(node --version | cut -d'v' -f2 | cut -d'.' -f1) -lt 20 ]]; then
|
||||
echo "Installing Node.js 20 LTS..."
|
||||
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
|
||||
sudo apt-get install -y nodejs
|
||||
echo "✓ Node.js installed"
|
||||
PYTHON_VERSION=$($PYTHON_CMD -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")')
|
||||
echo -e "${GREEN}✓${NC} Python $PYTHON_VERSION found"
|
||||
|
||||
# ============================================================================
|
||||
# Virtual environment
|
||||
# ============================================================================
|
||||
|
||||
echo -e "${CYAN}→${NC} Setting up virtual environment..."
|
||||
|
||||
if [ ! -d "venv" ]; then
|
||||
$PYTHON_CMD -m venv venv
|
||||
echo -e "${GREEN}✓${NC} Created venv"
|
||||
else
|
||||
echo "✓ Node.js 20+ already installed: $(node --version)"
|
||||
echo -e "${GREEN}✓${NC} venv exists"
|
||||
fi
|
||||
echo ""
|
||||
|
||||
# Initialize git submodules
|
||||
echo "[4/10] Initializing git submodules..."
|
||||
git submodule update --init --recursive
|
||||
echo "✓ Submodules initialized"
|
||||
echo ""
|
||||
|
||||
# Create Python virtual environment with uv
|
||||
echo "[5/10] Creating Python virtual environment with uv..."
|
||||
if [ -d "venv" ]; then
|
||||
echo "Virtual environment already exists, skipping..."
|
||||
else
|
||||
uv venv venv
|
||||
echo "✓ Virtual environment created with uv"
|
||||
fi
|
||||
echo ""
|
||||
|
||||
# Activate virtual environment and install Python packages with uv
|
||||
echo "[6/10] Installing Python dependencies with uv..."
|
||||
source venv/bin/activate
|
||||
uv pip install -r requirements.txt
|
||||
echo "✓ Python packages installed"
|
||||
echo ""
|
||||
pip install --upgrade pip wheel setuptools > /dev/null
|
||||
|
||||
# Install mini-swe-agent with uv
|
||||
echo "[7/10] Installing mini-swe-agent..."
|
||||
uv pip install -e ./mini-swe-agent
|
||||
echo "✓ mini-swe-agent installed"
|
||||
echo ""
|
||||
# ============================================================================
|
||||
# Dependencies
|
||||
# ============================================================================
|
||||
|
||||
# Install Node.js dependencies
|
||||
echo "[8/10] Installing Node.js dependencies..."
|
||||
npm install
|
||||
echo "✓ Node.js packages installed"
|
||||
echo ""
|
||||
echo -e "${CYAN}→${NC} Installing dependencies..."
|
||||
|
||||
# Set up environment file
|
||||
echo "[9/10] Setting up environment configuration..."
|
||||
if [ -f ".env" ]; then
|
||||
echo ".env file already exists, creating backup..."
|
||||
cp .env .env.backup.$(date +%Y%m%d_%H%M%S)
|
||||
fi
|
||||
cp .env.example .env
|
||||
echo "✓ .env file created from .env.example"
|
||||
echo ""
|
||||
pip install -e ".[all]" > /dev/null 2>&1 || pip install -e "." > /dev/null
|
||||
|
||||
# Set up CLI config
|
||||
echo "[10/10] Setting up CLI configuration..."
|
||||
if [ ! -f "cli-config.yaml" ]; then
|
||||
cp cli-config.yaml.example cli-config.yaml
|
||||
echo "✓ cli-config.yaml created from example"
|
||||
echo -e "${GREEN}✓${NC} Dependencies installed"
|
||||
|
||||
# ============================================================================
|
||||
# Optional: ripgrep (for faster file search)
|
||||
# ============================================================================
|
||||
|
||||
echo -e "${CYAN}→${NC} Checking ripgrep (optional, for faster search)..."
|
||||
|
||||
if command -v rg &> /dev/null; then
|
||||
echo -e "${GREEN}✓${NC} ripgrep found"
|
||||
else
|
||||
echo "cli-config.yaml already exists, skipping..."
|
||||
echo -e "${YELLOW}⚠${NC} ripgrep not found (file search will use grep fallback)"
|
||||
read -p "Install ripgrep for faster search? [Y/n] " -n 1 -r
|
||||
echo
|
||||
if [[ $REPLY =~ ^[Yy]$ ]] || [[ -z $REPLY ]]; then
|
||||
INSTALLED=false
|
||||
|
||||
# Check if sudo is available
|
||||
if command -v sudo &> /dev/null && sudo -n true 2>/dev/null; then
|
||||
if command -v apt &> /dev/null; then
|
||||
sudo apt install -y ripgrep && INSTALLED=true
|
||||
elif command -v dnf &> /dev/null; then
|
||||
sudo dnf install -y ripgrep && INSTALLED=true
|
||||
fi
|
||||
fi
|
||||
|
||||
# Try brew (no sudo needed)
|
||||
if [ "$INSTALLED" = false ] && command -v brew &> /dev/null; then
|
||||
brew install ripgrep && INSTALLED=true
|
||||
fi
|
||||
|
||||
# Try cargo (no sudo needed)
|
||||
if [ "$INSTALLED" = false ] && command -v cargo &> /dev/null; then
|
||||
echo -e "${CYAN}→${NC} Trying cargo install (no sudo required)..."
|
||||
cargo install ripgrep && INSTALLED=true
|
||||
fi
|
||||
|
||||
if [ "$INSTALLED" = true ]; then
|
||||
echo -e "${GREEN}✓${NC} ripgrep installed"
|
||||
else
|
||||
echo -e "${YELLOW}⚠${NC} Auto-install failed. Install options:"
|
||||
echo " sudo apt install ripgrep # Debian/Ubuntu"
|
||||
echo " brew install ripgrep # macOS"
|
||||
echo " cargo install ripgrep # With Rust (no sudo)"
|
||||
echo " https://github.com/BurntSushi/ripgrep#installation"
|
||||
fi
|
||||
fi
|
||||
fi
|
||||
|
||||
# ============================================================================
|
||||
# Environment file
|
||||
# ============================================================================
|
||||
|
||||
if [ ! -f ".env" ]; then
|
||||
if [ -f ".env.example" ]; then
|
||||
cp .env.example .env
|
||||
echo -e "${GREEN}✓${NC} Created .env from template"
|
||||
fi
|
||||
else
|
||||
echo -e "${GREEN}✓${NC} .env exists"
|
||||
fi
|
||||
|
||||
# ============================================================================
|
||||
# PATH setup
|
||||
# ============================================================================
|
||||
|
||||
echo -e "${CYAN}→${NC} Setting up hermes command..."
|
||||
|
||||
BIN_DIR="$SCRIPT_DIR/venv/bin"
|
||||
|
||||
# Add to shell config if not already there
|
||||
SHELL_CONFIG=""
|
||||
if [ -f "$HOME/.zshrc" ]; then
|
||||
SHELL_CONFIG="$HOME/.zshrc"
|
||||
elif [ -f "$HOME/.bashrc" ]; then
|
||||
SHELL_CONFIG="$HOME/.bashrc"
|
||||
elif [ -f "$HOME/.bash_profile" ]; then
|
||||
SHELL_CONFIG="$HOME/.bash_profile"
|
||||
fi
|
||||
|
||||
if [ -n "$SHELL_CONFIG" ]; then
|
||||
if ! grep -q "hermes-agent" "$SHELL_CONFIG" 2>/dev/null; then
|
||||
echo "" >> "$SHELL_CONFIG"
|
||||
echo "# Hermes Agent" >> "$SHELL_CONFIG"
|
||||
echo "export PATH=\"$BIN_DIR:\$PATH\"" >> "$SHELL_CONFIG"
|
||||
echo -e "${GREEN}✓${NC} Added to $SHELL_CONFIG"
|
||||
else
|
||||
echo -e "${GREEN}✓${NC} PATH already in $SHELL_CONFIG"
|
||||
fi
|
||||
fi
|
||||
|
||||
# ============================================================================
|
||||
# Done
|
||||
# ============================================================================
|
||||
|
||||
echo ""
|
||||
echo -e "${GREEN}✓ Setup complete!${NC}"
|
||||
echo ""
|
||||
echo "Next steps:"
|
||||
echo ""
|
||||
echo " 1. Reload your shell:"
|
||||
echo " source $SHELL_CONFIG"
|
||||
echo ""
|
||||
echo " 2. Run the setup wizard to configure API keys:"
|
||||
echo " hermes setup"
|
||||
echo ""
|
||||
echo " 3. Start chatting:"
|
||||
echo " hermes"
|
||||
echo ""
|
||||
echo "Other commands:"
|
||||
echo " hermes status # Check configuration"
|
||||
echo " hermes gateway # Start messaging gateway"
|
||||
echo " hermes cron daemon # Run cron daemon"
|
||||
echo " hermes doctor # Diagnose issues"
|
||||
echo ""
|
||||
|
||||
# Show Node.js and Python versions
|
||||
echo "========================================="
|
||||
echo "Setup Complete!"
|
||||
echo "========================================="
|
||||
echo ""
|
||||
echo "Installed versions:"
|
||||
echo " Node.js: $(node --version)"
|
||||
echo " npm: $(npm --version)"
|
||||
echo " Python: $(python3 --version)"
|
||||
echo " uv: $(uv --version)"
|
||||
echo ""
|
||||
|
||||
echo "========================================="
|
||||
echo "Next Steps:"
|
||||
echo "========================================="
|
||||
echo ""
|
||||
echo "1. Configure API Keys in .env file:"
|
||||
echo " nano .env"
|
||||
echo ""
|
||||
echo " Required API keys:"
|
||||
echo " - OPENROUTER_API_KEY (https://openrouter.ai/keys)"
|
||||
echo " - FIRECRAWL_API_KEY (https://firecrawl.dev/)"
|
||||
echo " - NOUS_API_KEY (https://inference-api.nousresearch.com/)"
|
||||
echo " - FAL_KEY (https://fal.ai/)"
|
||||
echo ""
|
||||
echo " Optional API keys:"
|
||||
echo " - BROWSERBASE_API_KEY (https://browserbase.com/)"
|
||||
echo " - BROWSERBASE_PROJECT_ID"
|
||||
echo ""
|
||||
echo "2. Activate the virtual environment:"
|
||||
echo " source venv/bin/activate"
|
||||
echo ""
|
||||
echo "3. Run the CLI:"
|
||||
echo " ./hermes"
|
||||
echo ""
|
||||
echo "4. Or run a single query:"
|
||||
echo " python run_agent.py --query \"your question here\""
|
||||
echo ""
|
||||
echo "5. List available tools:"
|
||||
echo " python run_agent.py --list_tools"
|
||||
echo ""
|
||||
echo "========================================="
|
||||
echo "Configuration Files:"
|
||||
echo "========================================="
|
||||
echo " .env - API keys and environment variables"
|
||||
echo " cli-config.yaml - CLI settings and preferences"
|
||||
echo ""
|
||||
echo "For more information, see README.md"
|
||||
echo ""
|
||||
# Ask if they want to run setup wizard now
|
||||
read -p "Would you like to run the setup wizard now? [Y/n] " -n 1 -r
|
||||
echo
|
||||
if [[ $REPLY =~ ^[Yy]$ ]] || [[ -z $REPLY ]]; then
|
||||
echo ""
|
||||
python -m hermes_cli.main setup
|
||||
fi
|
||||
|
||||
@@ -1,15 +0,0 @@
|
||||
{"prompt": "Test prompt 0: What is 2+2? Just answer briefly.", "test_id": 0}
|
||||
{"prompt": "Test prompt 1: What is 2+2? Just answer briefly.", "test_id": 1}
|
||||
{"prompt": "Test prompt 2: What is 2+2? Just answer briefly.", "test_id": 2}
|
||||
{"prompt": "Test prompt 3: What is 2+2? Just answer briefly.", "test_id": 3}
|
||||
{"prompt": "Test prompt 4: What is 2+2? Just answer briefly.", "test_id": 4}
|
||||
{"prompt": "Test prompt 5: What is 2+2? Just answer briefly.", "test_id": 5}
|
||||
{"prompt": "Test prompt 6: What is 2+2? Just answer briefly.", "test_id": 6}
|
||||
{"prompt": "Test prompt 7: What is 2+2? Just answer briefly.", "test_id": 7}
|
||||
{"prompt": "Test prompt 8: What is 2+2? Just answer briefly.", "test_id": 8}
|
||||
{"prompt": "Test prompt 9: What is 2+2? Just answer briefly.", "test_id": 9}
|
||||
{"prompt": "Test prompt 10: What is 2+2? Just answer briefly.", "test_id": 10}
|
||||
{"prompt": "Test prompt 11: What is 2+2? Just answer briefly.", "test_id": 11}
|
||||
{"prompt": "Test prompt 12: What is 2+2? Just answer briefly.", "test_id": 12}
|
||||
{"prompt": "Test prompt 13: What is 2+2? Just answer briefly.", "test_id": 13}
|
||||
{"prompt": "Test prompt 14: What is 2+2? Just answer briefly.", "test_id": 14}
|
||||
Some files were not shown because too many files have changed in this diff Show More
Reference in New Issue
Block a user