Compare commits
1 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| 1e373745a9 |
+49
-52
@@ -1,55 +1,52 @@
|
||||
/venv/
|
||||
/_pycache/
|
||||
*.pyc*
|
||||
__pycache__/
|
||||
.venv/
|
||||
.vscode/
|
||||
.env
|
||||
.env.local
|
||||
.env.development.local
|
||||
.env.test.local
|
||||
.env.production.local
|
||||
.env.development
|
||||
.env.test
|
||||
export*
|
||||
__pycache__/model_tools.cpython-310.pyc
|
||||
__pycache__/web_tools.cpython-310.pyc
|
||||
logs/
|
||||
data/
|
||||
.pytest_cache/
|
||||
tmp/
|
||||
temp_vision_images/
|
||||
hermes-*/*
|
||||
examples/
|
||||
tests/quick_test_dataset.jsonl
|
||||
tests/sample_dataset.jsonl
|
||||
run_datagen_kimik2-thinking.sh
|
||||
run_datagen_megascience_glm4-6.sh
|
||||
run_datagen_sonnet.sh
|
||||
source-data/*
|
||||
run_datagen_megascience_glm4-6.sh
|
||||
data/*
|
||||
node_modules/
|
||||
browser-use/
|
||||
agent-browser/
|
||||
# Private keys
|
||||
*.ppk
|
||||
*.pem
|
||||
privvy*
|
||||
images/
|
||||
__pycache__/
|
||||
hermes_agent.egg-info/
|
||||
wandb/
|
||||
testlogs
|
||||
|
||||
# CLI config (may contain sensitive SSH paths)
|
||||
cli-config.yaml
|
||||
|
||||
# Skills Hub state (lives in ~/.hermes/skills/.hub/ at runtime, but just in case)
|
||||
skills/.hub/
|
||||
/venv/
|
||||
/_pycache/
|
||||
*.pyc*
|
||||
__pycache__/
|
||||
.venv/
|
||||
.vscode/
|
||||
.env
|
||||
.env.local
|
||||
.env.development.local
|
||||
.env.test.local
|
||||
.env.production.local
|
||||
.env.development
|
||||
.env.test
|
||||
export*
|
||||
__pycache__/model_tools.cpython-310.pyc
|
||||
__pycache__/web_tools.cpython-310.pyc
|
||||
logs/
|
||||
data/
|
||||
.pytest_cache/
|
||||
tmp/
|
||||
temp_vision_images/
|
||||
hermes-*/*
|
||||
examples/
|
||||
tests/quick_test_dataset.jsonl
|
||||
tests/sample_dataset.jsonl
|
||||
run_datagen_kimik2-thinking.sh
|
||||
run_datagen_megascience_glm4-6.sh
|
||||
run_datagen_sonnet.sh
|
||||
source-data/*
|
||||
run_datagen_megascience_glm4-6.sh
|
||||
data/*
|
||||
node_modules/
|
||||
browser-use/
|
||||
agent-browser/
|
||||
# Private keys
|
||||
*.ppk
|
||||
*.pem
|
||||
privvy*
|
||||
images/
|
||||
__pycache__/
|
||||
hermes_agent.egg-info/
|
||||
wandb/
|
||||
testlogs
|
||||
|
||||
# CLI config (may contain sensitive SSH paths)
|
||||
cli-config.yaml
|
||||
|
||||
# Skills Hub state (lives in ~/.hermes/skills/.hub/ at runtime, but just in case)
|
||||
skills/.hub/
|
||||
ignored/
|
||||
.worktrees/
|
||||
environments/benchmarks/evals/
|
||||
|
||||
# Release script temp files
|
||||
.release_notes.md
|
||||
|
||||
@@ -333,8 +333,6 @@ metadata:
|
||||
hermes:
|
||||
tags: [Category, Subcategory, Keywords]
|
||||
related_skills: [other-skill-name]
|
||||
fallback_for_toolsets: [web] # Optional — show only when toolset is unavailable
|
||||
requires_toolsets: [terminal] # Optional — show only when toolset is available
|
||||
---
|
||||
|
||||
# Skill Title
|
||||
@@ -369,48 +367,6 @@ platforms: [windows] # Windows only
|
||||
|
||||
If the field is omitted or empty, the skill loads on all platforms (backward compatible). See `skills/apple/` for examples of macOS-only skills.
|
||||
|
||||
### Conditional skill activation
|
||||
|
||||
Skills can declare conditions that control when they appear in the system prompt, based on which tools and toolsets are available in the current session. This is primarily used for **fallback skills** — alternatives that should only be shown when a primary tool is unavailable.
|
||||
|
||||
Four fields are supported under `metadata.hermes`:
|
||||
|
||||
```yaml
|
||||
metadata:
|
||||
hermes:
|
||||
fallback_for_toolsets: [web] # Show ONLY when these toolsets are unavailable
|
||||
requires_toolsets: [terminal] # Show ONLY when these toolsets are available
|
||||
fallback_for_tools: [web_search] # Show ONLY when these specific tools are unavailable
|
||||
requires_tools: [terminal] # Show ONLY when these specific tools are available
|
||||
```
|
||||
|
||||
**Semantics:**
|
||||
- `fallback_for_*`: The skill is a backup. It is **hidden** when the listed tools/toolsets are available, and **shown** when they are unavailable. Use this for free alternatives to premium tools.
|
||||
- `requires_*`: The skill needs certain tools to function. It is **hidden** when the listed tools/toolsets are unavailable. Use this for skills that depend on specific capabilities (e.g., a skill that only makes sense with terminal access).
|
||||
- If both are specified, both conditions must be satisfied for the skill to appear.
|
||||
- If neither is specified, the skill is always shown (backward compatible).
|
||||
|
||||
**Examples:**
|
||||
|
||||
```yaml
|
||||
# DuckDuckGo search — shown when Firecrawl (web toolset) is unavailable
|
||||
metadata:
|
||||
hermes:
|
||||
fallback_for_toolsets: [web]
|
||||
|
||||
# Smart home skill — only useful when terminal is available
|
||||
metadata:
|
||||
hermes:
|
||||
requires_toolsets: [terminal]
|
||||
|
||||
# Local browser fallback — shown when Browserbase is unavailable
|
||||
metadata:
|
||||
hermes:
|
||||
fallback_for_toolsets: [browser]
|
||||
```
|
||||
|
||||
The filtering happens at prompt build time in `agent/prompt_builder.py`. The `build_skills_system_prompt()` function receives the set of available tools and toolsets from the agent and uses `_skill_should_show()` to evaluate each skill's conditions.
|
||||
|
||||
### Skill guidelines
|
||||
|
||||
- **No external dependencies unless absolutely necessary.** Prefer stdlib Python, curl, and existing Hermes tools (`web_extract`, `terminal`, `read_file`).
|
||||
|
||||
@@ -1,383 +0,0 @@
|
||||
# Hermes Agent v0.2.0 (v2026.3.12)
|
||||
|
||||
**Release Date:** March 12, 2026
|
||||
|
||||
> First tagged release since v0.1.0 (the initial pre-public foundation). In just over two weeks, Hermes Agent went from a small internal project to a full-featured AI agent platform — thanks to an explosion of community contributions. This release covers **216 merged pull requests** from **63 contributors**, resolving **119 issues**.
|
||||
|
||||
---
|
||||
|
||||
## ✨ Highlights
|
||||
|
||||
- **Multi-Platform Messaging Gateway** — Telegram, Discord, Slack, WhatsApp, Signal, Email (IMAP/SMTP), and Home Assistant platforms with unified session management, media attachments, and per-platform tool configuration.
|
||||
|
||||
- **MCP (Model Context Protocol) Client** — Native MCP support with stdio and HTTP transports, reconnection, resource/prompt discovery, and sampling (server-initiated LLM requests). ([#291](https://github.com/NousResearch/hermes-agent/pull/291) — @0xbyt4, [#301](https://github.com/NousResearch/hermes-agent/pull/301), [#753](https://github.com/NousResearch/hermes-agent/pull/753))
|
||||
|
||||
- **Skills Ecosystem** — 70+ bundled and optional skills across 15+ categories with a Skills Hub for community discovery, per-platform enable/disable, conditional activation based on tool availability, and prerequisite validation. ([#743](https://github.com/NousResearch/hermes-agent/pull/743) — @teyrebaz33, [#785](https://github.com/NousResearch/hermes-agent/pull/785) — @teyrebaz33)
|
||||
|
||||
- **Centralized Provider Router** — Unified `call_llm()`/`async_call_llm()` API replaces scattered provider logic across vision, summarization, compression, and trajectory saving. All auxiliary consumers route through a single code path with automatic credential resolution. ([#1003](https://github.com/NousResearch/hermes-agent/pull/1003))
|
||||
|
||||
- **ACP Server** — VS Code, Zed, and JetBrains editor integration via the Agent Communication Protocol standard. ([#949](https://github.com/NousResearch/hermes-agent/pull/949))
|
||||
|
||||
- **CLI Skin/Theme Engine** — Data-driven visual customization: banners, spinners, colors, branding. 7 built-in skins + custom YAML skins.
|
||||
|
||||
- **Git Worktree Isolation** — `hermes -w` launches isolated agent sessions in git worktrees for safe parallel work on the same repo. ([#654](https://github.com/NousResearch/hermes-agent/pull/654))
|
||||
|
||||
- **Filesystem Checkpoints & Rollback** — Automatic snapshots before destructive operations with `/rollback` to restore. ([#824](https://github.com/NousResearch/hermes-agent/pull/824))
|
||||
|
||||
- **3,289 Tests** — From near-zero test coverage to a comprehensive test suite covering agent, gateway, tools, cron, and CLI.
|
||||
|
||||
---
|
||||
|
||||
## 🏗️ Core Agent & Architecture
|
||||
|
||||
### Provider & Model Support
|
||||
- Centralized provider router with `resolve_provider_client()` + `call_llm()` API ([#1003](https://github.com/NousResearch/hermes-agent/pull/1003))
|
||||
- Nous Portal as first-class provider in setup ([#644](https://github.com/NousResearch/hermes-agent/issues/644))
|
||||
- OpenAI Codex (Responses API) with ChatGPT subscription support ([#43](https://github.com/NousResearch/hermes-agent/pull/43)) — @grp06
|
||||
- Codex OAuth vision support + multimodal content adapter
|
||||
- Validate `/model` against live API instead of hardcoded lists
|
||||
- Self-hosted Firecrawl support ([#460](https://github.com/NousResearch/hermes-agent/pull/460)) — @caentzminger
|
||||
- Kimi Code API support ([#635](https://github.com/NousResearch/hermes-agent/pull/635)) — @christomitov
|
||||
- MiniMax model ID update ([#473](https://github.com/NousResearch/hermes-agent/pull/473)) — @tars90percent
|
||||
- OpenRouter provider routing configuration (provider_preferences)
|
||||
- Nous credential refresh on 401 errors ([#571](https://github.com/NousResearch/hermes-agent/pull/571), [#269](https://github.com/NousResearch/hermes-agent/pull/269)) — @rewbs
|
||||
- z.ai/GLM, Kimi/Moonshot, MiniMax, Azure OpenAI as first-class providers
|
||||
- Unified `/model` and `/provider` into single view
|
||||
|
||||
### Agent Loop & Conversation
|
||||
- Simple fallback model for provider resilience ([#740](https://github.com/NousResearch/hermes-agent/pull/740))
|
||||
- Shared iteration budget across parent + subagent delegation
|
||||
- Iteration budget pressure via tool result injection
|
||||
- Configurable subagent provider/model with full credential resolution
|
||||
- Handle 413 payload-too-large via compression instead of aborting ([#153](https://github.com/NousResearch/hermes-agent/pull/153)) — @tekelala
|
||||
- Retry with rebuilt payload after compression ([#616](https://github.com/NousResearch/hermes-agent/pull/616)) — @tripledoublev
|
||||
- Auto-compress pathologically large gateway sessions ([#628](https://github.com/NousResearch/hermes-agent/issues/628))
|
||||
- Tool call repair middleware — auto-lowercase and invalid tool handler
|
||||
- Reasoning effort configuration and `/reasoning` command ([#921](https://github.com/NousResearch/hermes-agent/pull/921))
|
||||
- Detect and block file re-read/search loops after context compression ([#705](https://github.com/NousResearch/hermes-agent/pull/705)) — @0xbyt4
|
||||
|
||||
### Session & Memory
|
||||
- Session naming with unique titles, auto-lineage, rich listing, and resume by name ([#720](https://github.com/NousResearch/hermes-agent/pull/720))
|
||||
- Interactive session browser with search filtering ([#733](https://github.com/NousResearch/hermes-agent/pull/733))
|
||||
- Display previous messages when resuming a session ([#734](https://github.com/NousResearch/hermes-agent/pull/734))
|
||||
- Honcho AI-native cross-session user modeling ([#38](https://github.com/NousResearch/hermes-agent/pull/38)) — @erosika
|
||||
- Proactive async memory flush on session expiry
|
||||
- Smart context length probing with persistent caching + banner display
|
||||
- `/resume` command for switching to named sessions in gateway
|
||||
- Session reset policy for messaging platforms
|
||||
|
||||
---
|
||||
|
||||
## 📱 Messaging Platforms (Gateway)
|
||||
|
||||
### Telegram
|
||||
- Native file attachments: send_document + send_video
|
||||
- Document file processing for PDF, text, and Office files — @tekelala
|
||||
- Forum topic session isolation ([#766](https://github.com/NousResearch/hermes-agent/pull/766)) — @spanishflu-est1918
|
||||
- Browser screenshot sharing via MEDIA: protocol ([#657](https://github.com/NousResearch/hermes-agent/pull/657))
|
||||
- Location support for find-nearby skill
|
||||
- TTS voice message accumulation fix ([#176](https://github.com/NousResearch/hermes-agent/pull/176)) — @Bartok9
|
||||
- Improved error handling and logging ([#763](https://github.com/NousResearch/hermes-agent/pull/763)) — @aydnOktay
|
||||
- Italic regex newline fix + 43 format tests ([#204](https://github.com/NousResearch/hermes-agent/pull/204)) — @0xbyt4
|
||||
|
||||
### Discord
|
||||
- Channel topic included in session context ([#248](https://github.com/NousResearch/hermes-agent/pull/248)) — @Bartok9
|
||||
- DISCORD_ALLOW_BOTS config for bot message filtering ([#758](https://github.com/NousResearch/hermes-agent/pull/758))
|
||||
- Document and video support ([#784](https://github.com/NousResearch/hermes-agent/pull/784))
|
||||
- Improved error handling and logging ([#761](https://github.com/NousResearch/hermes-agent/pull/761)) — @aydnOktay
|
||||
|
||||
### Slack
|
||||
- App_mention 404 fix + document/video support ([#784](https://github.com/NousResearch/hermes-agent/pull/784))
|
||||
- Structured logging replacing print statements — @aydnOktay
|
||||
|
||||
### WhatsApp
|
||||
- Native media sending — images, videos, documents ([#292](https://github.com/NousResearch/hermes-agent/pull/292)) — @satelerd
|
||||
- Multi-user session isolation ([#75](https://github.com/NousResearch/hermes-agent/pull/75)) — @satelerd
|
||||
- Cross-platform port cleanup replacing Linux-only fuser ([#433](https://github.com/NousResearch/hermes-agent/pull/433)) — @Farukest
|
||||
- DM interrupt key mismatch fix ([#350](https://github.com/NousResearch/hermes-agent/pull/350)) — @Farukest
|
||||
|
||||
### Signal
|
||||
- Full Signal messenger gateway via signal-cli-rest-api ([#405](https://github.com/NousResearch/hermes-agent/issues/405))
|
||||
- Media URL support in message events ([#871](https://github.com/NousResearch/hermes-agent/pull/871))
|
||||
|
||||
### Email (IMAP/SMTP)
|
||||
- New email gateway platform — @0xbyt4
|
||||
|
||||
### Home Assistant
|
||||
- REST tools + WebSocket gateway integration ([#184](https://github.com/NousResearch/hermes-agent/pull/184)) — @0xbyt4
|
||||
- Service discovery and enhanced setup
|
||||
- Toolset mapping fix ([#538](https://github.com/NousResearch/hermes-agent/pull/538)) — @Himess
|
||||
|
||||
### Gateway Core
|
||||
- Expose subagent tool calls and thinking to users ([#186](https://github.com/NousResearch/hermes-agent/pull/186)) — @cutepawss
|
||||
- Configurable background process watcher notifications ([#840](https://github.com/NousResearch/hermes-agent/pull/840))
|
||||
- `edit_message()` for Telegram/Discord/Slack with fallback
|
||||
- `/compress`, `/usage`, `/update` slash commands
|
||||
- Eliminated 3x SQLite message duplication in gateway sessions ([#873](https://github.com/NousResearch/hermes-agent/pull/873))
|
||||
- Stabilize system prompt across gateway turns for cache hits ([#754](https://github.com/NousResearch/hermes-agent/pull/754))
|
||||
- MCP server shutdown on gateway exit ([#796](https://github.com/NousResearch/hermes-agent/pull/796)) — @0xbyt4
|
||||
- Pass session_db to AIAgent, fixing session_search error ([#108](https://github.com/NousResearch/hermes-agent/pull/108)) — @Bartok9
|
||||
- Persist transcript changes in /retry, /undo; fix /reset attribute ([#217](https://github.com/NousResearch/hermes-agent/pull/217)) — @Farukest
|
||||
- UTF-8 encoding fix preventing Windows crashes ([#369](https://github.com/NousResearch/hermes-agent/pull/369)) — @ch3ronsa
|
||||
|
||||
---
|
||||
|
||||
## 🖥️ CLI & User Experience
|
||||
|
||||
### Interactive CLI
|
||||
- Data-driven skin/theme engine — 7 built-in skins (default, ares, mono, slate, poseidon, sisyphus, charizard) + custom YAML skins
|
||||
- `/personality` command with custom personality + disable support ([#773](https://github.com/NousResearch/hermes-agent/pull/773)) — @teyrebaz33
|
||||
- User-defined quick commands that bypass the agent loop ([#746](https://github.com/NousResearch/hermes-agent/pull/746)) — @teyrebaz33
|
||||
- `/reasoning` command for effort level and display toggle ([#921](https://github.com/NousResearch/hermes-agent/pull/921))
|
||||
- `/verbose` slash command to toggle debug at runtime ([#94](https://github.com/NousResearch/hermes-agent/pull/94)) — @cesareth
|
||||
- `/insights` command — usage analytics, cost estimation & activity patterns ([#552](https://github.com/NousResearch/hermes-agent/pull/552))
|
||||
- `/background` command for managing background processes
|
||||
- `/help` formatting with command categories
|
||||
- Bell-on-complete — terminal bell when agent finishes ([#738](https://github.com/NousResearch/hermes-agent/pull/738))
|
||||
- Up/down arrow history navigation
|
||||
- Clipboard image paste (Alt+V / Ctrl+V)
|
||||
- Loading indicators for slow slash commands ([#882](https://github.com/NousResearch/hermes-agent/pull/882))
|
||||
- Spinner flickering fix under patch_stdout ([#91](https://github.com/NousResearch/hermes-agent/pull/91)) — @0xbyt4
|
||||
- `--quiet/-Q` flag for programmatic single-query mode
|
||||
- `--fuck-it-ship-it` flag to bypass all approval prompts ([#724](https://github.com/NousResearch/hermes-agent/pull/724)) — @dmahan93
|
||||
- Tools summary flag ([#767](https://github.com/NousResearch/hermes-agent/pull/767)) — @luisv-1
|
||||
- Terminal blinking fix on SSH ([#284](https://github.com/NousResearch/hermes-agent/pull/284)) — @ygd58
|
||||
- Multi-line paste detection fix ([#84](https://github.com/NousResearch/hermes-agent/pull/84)) — @0xbyt4
|
||||
|
||||
### Setup & Configuration
|
||||
- Modular setup wizard with section subcommands and tool-first UX
|
||||
- Container resource configuration prompts
|
||||
- Backend validation for required binaries
|
||||
- Config migration system (currently v7)
|
||||
- API keys properly routed to .env instead of config.yaml ([#469](https://github.com/NousResearch/hermes-agent/pull/469)) — @ygd58
|
||||
- Atomic write for .env to prevent API key loss on crash ([#954](https://github.com/NousResearch/hermes-agent/pull/954))
|
||||
- `hermes tools` — per-platform tool enable/disable with curses UI
|
||||
- `hermes doctor` for health checks across all configured providers
|
||||
- `hermes update` with auto-restart for gateway service
|
||||
- Show update-available notice in CLI banner
|
||||
- Multiple named custom providers
|
||||
- Shell config detection improvement for PATH setup ([#317](https://github.com/NousResearch/hermes-agent/pull/317)) — @mehmetkr-31
|
||||
- Consistent HERMES_HOME and .env path resolution ([#51](https://github.com/NousResearch/hermes-agent/pull/51), [#48](https://github.com/NousResearch/hermes-agent/pull/48)) — @deankerr
|
||||
- Docker backend fix on macOS + subagent auth for Nous Portal ([#46](https://github.com/NousResearch/hermes-agent/pull/46)) — @rsavitt
|
||||
|
||||
---
|
||||
|
||||
## 🔧 Tool System
|
||||
|
||||
### MCP (Model Context Protocol)
|
||||
- Native MCP client with stdio + HTTP transports ([#291](https://github.com/NousResearch/hermes-agent/pull/291) — @0xbyt4, [#301](https://github.com/NousResearch/hermes-agent/pull/301))
|
||||
- Sampling support — server-initiated LLM requests ([#753](https://github.com/NousResearch/hermes-agent/pull/753))
|
||||
- Resource and prompt discovery
|
||||
- Automatic reconnection and security hardening
|
||||
- Banner integration, `/reload-mcp` command
|
||||
- `hermes tools` UI integration
|
||||
|
||||
### Browser
|
||||
- Local browser backend — zero-cost headless Chromium (no Browserbase needed)
|
||||
- Console/errors tool, annotated screenshots, auto-recording, dogfood QA skill ([#745](https://github.com/NousResearch/hermes-agent/pull/745))
|
||||
- Screenshot sharing via MEDIA: on all messaging platforms ([#657](https://github.com/NousResearch/hermes-agent/pull/657))
|
||||
|
||||
### Terminal & Execution
|
||||
- `execute_code` sandbox with json_parse, shell_quote, retry helpers
|
||||
- Docker: custom volume mounts ([#158](https://github.com/NousResearch/hermes-agent/pull/158)) — @Indelwin
|
||||
- Daytona cloud sandbox backend ([#451](https://github.com/NousResearch/hermes-agent/pull/451)) — @rovle
|
||||
- SSH backend fix ([#59](https://github.com/NousResearch/hermes-agent/pull/59)) — @deankerr
|
||||
- Shell noise filtering and login shell execution for environment consistency
|
||||
- Head+tail truncation for execute_code stdout overflow
|
||||
- Configurable background process notification modes
|
||||
|
||||
### File Operations
|
||||
- Filesystem checkpoints and `/rollback` command ([#824](https://github.com/NousResearch/hermes-agent/pull/824))
|
||||
- Structured tool result hints (next-action guidance) for patch and search_files ([#722](https://github.com/NousResearch/hermes-agent/issues/722))
|
||||
- Docker volumes passed to sandbox container config ([#687](https://github.com/NousResearch/hermes-agent/pull/687)) — @manuelschipper
|
||||
|
||||
---
|
||||
|
||||
## 🧩 Skills Ecosystem
|
||||
|
||||
### Skills System
|
||||
- Per-platform skill enable/disable ([#743](https://github.com/NousResearch/hermes-agent/pull/743)) — @teyrebaz33
|
||||
- Conditional skill activation based on tool availability ([#785](https://github.com/NousResearch/hermes-agent/pull/785)) — @teyrebaz33
|
||||
- Skill prerequisites — hide skills with unmet dependencies ([#659](https://github.com/NousResearch/hermes-agent/pull/659)) — @kshitijk4poor
|
||||
- Optional skills — shipped but not activated by default
|
||||
- `hermes skills browse` — paginated hub browsing
|
||||
- Skills sub-category organization
|
||||
- Platform-conditional skill loading
|
||||
- Atomic skill file writes ([#551](https://github.com/NousResearch/hermes-agent/pull/551)) — @aydnOktay
|
||||
- Skills sync data loss prevention ([#563](https://github.com/NousResearch/hermes-agent/pull/563)) — @0xbyt4
|
||||
- Dynamic skill slash commands for CLI and gateway
|
||||
|
||||
### New Skills (selected)
|
||||
- **ASCII Art** — pyfiglet (571 fonts), cowsay, image-to-ascii ([#209](https://github.com/NousResearch/hermes-agent/pull/209)) — @0xbyt4
|
||||
- **ASCII Video** — Full production pipeline ([#854](https://github.com/NousResearch/hermes-agent/pull/854)) — @SHL0MS
|
||||
- **DuckDuckGo Search** — Firecrawl fallback ([#267](https://github.com/NousResearch/hermes-agent/pull/267)) — @gamedevCloudy; DDGS API expansion ([#598](https://github.com/NousResearch/hermes-agent/pull/598)) — @areu01or00
|
||||
- **Solana Blockchain** — Wallet balances, USD pricing, token names ([#212](https://github.com/NousResearch/hermes-agent/pull/212)) — @gizdusum
|
||||
- **AgentMail** — Agent-owned email inboxes ([#330](https://github.com/NousResearch/hermes-agent/pull/330)) — @teyrebaz33
|
||||
- **Polymarket** — Prediction market data (read-only) ([#629](https://github.com/NousResearch/hermes-agent/pull/629))
|
||||
- **OpenClaw Migration** — Official migration tool ([#570](https://github.com/NousResearch/hermes-agent/pull/570)) — @unmodeled-tyler
|
||||
- **Domain Intelligence** — Passive recon: subdomains, SSL, WHOIS, DNS ([#136](https://github.com/NousResearch/hermes-agent/pull/136)) — @FurkanL0
|
||||
- **Superpowers** — Software development skills ([#137](https://github.com/NousResearch/hermes-agent/pull/137)) — @kaos35
|
||||
- **Hermes-Atropos** — RL environment development skill ([#815](https://github.com/NousResearch/hermes-agent/pull/815))
|
||||
- Plus: arXiv search, OCR/documents, Excalidraw diagrams, YouTube transcripts, GIF search, Pokémon player, Minecraft modpack server, OpenHue (Philips Hue), Google Workspace, Notion, PowerPoint, Obsidian, find-nearby, and 40+ MLOps skills
|
||||
|
||||
---
|
||||
|
||||
## 🔒 Security & Reliability
|
||||
|
||||
### Security Hardening
|
||||
- Path traversal fix in skill_view — prevented reading arbitrary files ([#220](https://github.com/NousResearch/hermes-agent/issues/220)) — @Farukest
|
||||
- Shell injection prevention in sudo password piping ([#65](https://github.com/NousResearch/hermes-agent/pull/65)) — @leonsgithub
|
||||
- Dangerous command detection: multiline bypass fix ([#233](https://github.com/NousResearch/hermes-agent/pull/233)) — @Farukest; tee/process substitution patterns ([#280](https://github.com/NousResearch/hermes-agent/pull/280)) — @dogiladeveloper
|
||||
- Symlink boundary check fix in skills_guard ([#386](https://github.com/NousResearch/hermes-agent/pull/386)) — @Farukest
|
||||
- Symlink bypass fix in write deny list on macOS ([#61](https://github.com/NousResearch/hermes-agent/pull/61)) — @0xbyt4
|
||||
- Multi-word prompt injection bypass prevention ([#192](https://github.com/NousResearch/hermes-agent/pull/192)) — @0xbyt4
|
||||
- Cron prompt injection scanner bypass fix ([#63](https://github.com/NousResearch/hermes-agent/pull/63)) — @0xbyt4
|
||||
- Enforce 0600/0700 file permissions on sensitive files ([#757](https://github.com/NousResearch/hermes-agent/pull/757))
|
||||
- .env file permissions restricted to owner-only ([#529](https://github.com/NousResearch/hermes-agent/pull/529)) — @Himess
|
||||
- `--force` flag properly blocked from overriding dangerous verdicts ([#388](https://github.com/NousResearch/hermes-agent/pull/388)) — @Farukest
|
||||
- FTS5 query sanitization + DB connection leak fix ([#565](https://github.com/NousResearch/hermes-agent/pull/565)) — @0xbyt4
|
||||
- Expand secret redaction patterns + config toggle to disable
|
||||
- In-memory permanent allowlist to prevent data leak ([#600](https://github.com/NousResearch/hermes-agent/pull/600)) — @alireza78a
|
||||
|
||||
### Atomic Writes (data loss prevention)
|
||||
- sessions.json ([#611](https://github.com/NousResearch/hermes-agent/pull/611)) — @alireza78a
|
||||
- Cron jobs ([#146](https://github.com/NousResearch/hermes-agent/pull/146)) — @alireza78a
|
||||
- .env config ([#954](https://github.com/NousResearch/hermes-agent/pull/954))
|
||||
- Process checkpoints ([#298](https://github.com/NousResearch/hermes-agent/pull/298)) — @aydnOktay
|
||||
- Batch runner ([#297](https://github.com/NousResearch/hermes-agent/pull/297)) — @aydnOktay
|
||||
- Skill files ([#551](https://github.com/NousResearch/hermes-agent/pull/551)) — @aydnOktay
|
||||
|
||||
### Reliability
|
||||
- Guard all print() against OSError for systemd/headless environments ([#963](https://github.com/NousResearch/hermes-agent/pull/963))
|
||||
- Reset all retry counters at start of run_conversation ([#607](https://github.com/NousResearch/hermes-agent/pull/607)) — @0xbyt4
|
||||
- Return deny on approval callback timeout instead of None ([#603](https://github.com/NousResearch/hermes-agent/pull/603)) — @0xbyt4
|
||||
- Fix None message content crashes across codebase ([#277](https://github.com/NousResearch/hermes-agent/pull/277))
|
||||
- Fix context overrun crash with local LLM backends ([#403](https://github.com/NousResearch/hermes-agent/pull/403)) — @ch3ronsa
|
||||
- Prevent `_flush_sentinel` from leaking to external APIs ([#227](https://github.com/NousResearch/hermes-agent/pull/227)) — @Farukest
|
||||
- Prevent conversation_history mutation in callers ([#229](https://github.com/NousResearch/hermes-agent/pull/229)) — @Farukest
|
||||
- Fix systemd restart loop ([#614](https://github.com/NousResearch/hermes-agent/pull/614)) — @voidborne-d
|
||||
- Close file handles and sockets to prevent fd leaks ([#568](https://github.com/NousResearch/hermes-agent/pull/568) — @alireza78a, [#296](https://github.com/NousResearch/hermes-agent/pull/296) — @alireza78a, [#709](https://github.com/NousResearch/hermes-agent/pull/709) — @memosr)
|
||||
- Prevent data loss in clipboard PNG conversion ([#602](https://github.com/NousResearch/hermes-agent/pull/602)) — @0xbyt4
|
||||
- Eliminate shell noise from terminal output ([#293](https://github.com/NousResearch/hermes-agent/pull/293)) — @0xbyt4
|
||||
- Timezone-aware now() for prompt, cron, and execute_code ([#309](https://github.com/NousResearch/hermes-agent/pull/309)) — @areu01or00
|
||||
|
||||
### Windows Compatibility
|
||||
- Guard POSIX-only process functions ([#219](https://github.com/NousResearch/hermes-agent/pull/219)) — @Farukest
|
||||
- Windows native support via Git Bash + ZIP-based update fallback
|
||||
- pywinpty for PTY support ([#457](https://github.com/NousResearch/hermes-agent/pull/457)) — @shitcoinsherpa
|
||||
- Explicit UTF-8 encoding on all config/data file I/O ([#458](https://github.com/NousResearch/hermes-agent/pull/458)) — @shitcoinsherpa
|
||||
- Windows-compatible path handling ([#354](https://github.com/NousResearch/hermes-agent/pull/354), [#390](https://github.com/NousResearch/hermes-agent/pull/390)) — @Farukest
|
||||
- Regex-based search output parsing for drive-letter paths ([#533](https://github.com/NousResearch/hermes-agent/pull/533)) — @Himess
|
||||
- Auth store file lock for Windows ([#455](https://github.com/NousResearch/hermes-agent/pull/455)) — @shitcoinsherpa
|
||||
|
||||
---
|
||||
|
||||
## 🐛 Notable Bug Fixes
|
||||
|
||||
- Fix DeepSeek V3 tool call parser silently dropping multi-line JSON arguments ([#444](https://github.com/NousResearch/hermes-agent/pull/444)) — @PercyDikec
|
||||
- Fix gateway transcript losing 1 message per turn due to offset mismatch ([#395](https://github.com/NousResearch/hermes-agent/pull/395)) — @PercyDikec
|
||||
- Fix /retry command silently discarding the agent's final response ([#441](https://github.com/NousResearch/hermes-agent/pull/441)) — @PercyDikec
|
||||
- Fix max-iterations retry returning empty string after think-block stripping ([#438](https://github.com/NousResearch/hermes-agent/pull/438)) — @PercyDikec
|
||||
- Fix max-iterations retry using hardcoded max_tokens ([#436](https://github.com/NousResearch/hermes-agent/pull/436)) — @Farukest
|
||||
- Fix Codex status dict key mismatch ([#448](https://github.com/NousResearch/hermes-agent/pull/448)) and visibility filter ([#446](https://github.com/NousResearch/hermes-agent/pull/446)) — @PercyDikec
|
||||
- Strip \<think\> blocks from final user-facing responses ([#174](https://github.com/NousResearch/hermes-agent/pull/174)) — @Bartok9
|
||||
- Fix \<think\> block regex stripping visible content when model discusses tags literally ([#786](https://github.com/NousResearch/hermes-agent/issues/786))
|
||||
- Fix Mistral 422 errors from leftover finish_reason in assistant messages ([#253](https://github.com/NousResearch/hermes-agent/pull/253)) — @Sertug17
|
||||
- Fix OPENROUTER_API_KEY resolution order across all code paths ([#295](https://github.com/NousResearch/hermes-agent/pull/295)) — @0xbyt4
|
||||
- Fix OPENAI_BASE_URL API key priority ([#420](https://github.com/NousResearch/hermes-agent/pull/420)) — @manuelschipper
|
||||
- Fix Anthropic "prompt is too long" 400 error not detected as context length error ([#813](https://github.com/NousResearch/hermes-agent/issues/813))
|
||||
- Fix SQLite session transcript accumulating duplicate messages — 3-4x token inflation ([#860](https://github.com/NousResearch/hermes-agent/issues/860))
|
||||
- Fix setup wizard skipping API key prompts on first install ([#748](https://github.com/NousResearch/hermes-agent/pull/748))
|
||||
- Fix setup wizard showing OpenRouter model list for Nous Portal ([#575](https://github.com/NousResearch/hermes-agent/pull/575)) — @PercyDikec
|
||||
- Fix provider selection not persisting when switching via hermes model ([#881](https://github.com/NousResearch/hermes-agent/pull/881))
|
||||
- Fix Docker backend failing when docker not in PATH on macOS ([#889](https://github.com/NousResearch/hermes-agent/pull/889))
|
||||
- Fix ClawHub Skills Hub adapter for API endpoint changes ([#286](https://github.com/NousResearch/hermes-agent/pull/286)) — @BP602
|
||||
- Fix Honcho auto-enable when API key is present ([#243](https://github.com/NousResearch/hermes-agent/pull/243)) — @Bartok9
|
||||
- Fix duplicate 'skills' subparser crash on Python 3.11+ ([#898](https://github.com/NousResearch/hermes-agent/issues/898))
|
||||
- Fix memory tool entry parsing when content contains section sign ([#162](https://github.com/NousResearch/hermes-agent/pull/162)) — @aydnOktay
|
||||
- Fix piped install silently aborting when interactive prompts fail ([#72](https://github.com/NousResearch/hermes-agent/pull/72)) — @cutepawss
|
||||
- Fix false positives in recursive delete detection ([#68](https://github.com/NousResearch/hermes-agent/pull/68)) — @cutepawss
|
||||
- Fix Ruff lint warnings across codebase ([#608](https://github.com/NousResearch/hermes-agent/pull/608)) — @JackTheGit
|
||||
- Fix Anthropic native base URL fail-fast ([#173](https://github.com/NousResearch/hermes-agent/pull/173)) — @adavyas
|
||||
- Fix install.sh creating ~/.hermes before moving Node.js directory ([#53](https://github.com/NousResearch/hermes-agent/pull/53)) — @JoshuaMart
|
||||
- Fix SystemExit traceback during atexit cleanup on Ctrl+C ([#55](https://github.com/NousResearch/hermes-agent/pull/55)) — @bierlingm
|
||||
- Restore missing MIT license file ([#620](https://github.com/NousResearch/hermes-agent/pull/620)) — @stablegenius49
|
||||
|
||||
---
|
||||
|
||||
## 🧪 Testing
|
||||
|
||||
- **3,289 tests** across agent, gateway, tools, cron, and CLI
|
||||
- Parallelized test suite with pytest-xdist ([#802](https://github.com/NousResearch/hermes-agent/pull/802)) — @OutThisLife
|
||||
- Unit tests batch 1: 8 core modules ([#60](https://github.com/NousResearch/hermes-agent/pull/60)) — @0xbyt4
|
||||
- Unit tests batch 2: 8 more modules ([#62](https://github.com/NousResearch/hermes-agent/pull/62)) — @0xbyt4
|
||||
- Unit tests batch 3: 8 untested modules ([#191](https://github.com/NousResearch/hermes-agent/pull/191)) — @0xbyt4
|
||||
- Unit tests batch 4: 5 security/logic-critical modules ([#193](https://github.com/NousResearch/hermes-agent/pull/193)) — @0xbyt4
|
||||
- AIAgent (run_agent.py) unit tests ([#67](https://github.com/NousResearch/hermes-agent/pull/67)) — @0xbyt4
|
||||
- Trajectory compressor tests ([#203](https://github.com/NousResearch/hermes-agent/pull/203)) — @0xbyt4
|
||||
- Clarify tool tests ([#121](https://github.com/NousResearch/hermes-agent/pull/121)) — @Bartok9
|
||||
- Telegram format tests — 43 tests for italic/bold/code rendering ([#204](https://github.com/NousResearch/hermes-agent/pull/204)) — @0xbyt4
|
||||
- Vision tools type hints + 42 tests ([#792](https://github.com/NousResearch/hermes-agent/pull/792))
|
||||
- Compressor tool-call boundary regression tests ([#648](https://github.com/NousResearch/hermes-agent/pull/648)) — @intertwine
|
||||
- Test structure reorganization ([#34](https://github.com/NousResearch/hermes-agent/pull/34)) — @0xbyt4
|
||||
- Shell noise elimination + fix 36 test failures ([#293](https://github.com/NousResearch/hermes-agent/pull/293)) — @0xbyt4
|
||||
|
||||
---
|
||||
|
||||
## 🔬 RL & Evaluation Environments
|
||||
|
||||
- WebResearchEnv — Multi-step web research RL environment ([#434](https://github.com/NousResearch/hermes-agent/pull/434)) — @jackx707
|
||||
- Modal sandbox concurrency limits to avoid deadlocks ([#621](https://github.com/NousResearch/hermes-agent/pull/621)) — @voteblake
|
||||
- Hermes-atropos-environments bundled skill ([#815](https://github.com/NousResearch/hermes-agent/pull/815))
|
||||
- Local vLLM instance support for evaluation — @dmahan93
|
||||
- YC-Bench long-horizon agent benchmark environment
|
||||
- OpenThoughts-TBLite evaluation environment and scripts
|
||||
|
||||
---
|
||||
|
||||
## 📚 Documentation
|
||||
|
||||
- Full documentation website (Docusaurus) with 37+ pages
|
||||
- Comprehensive platform setup guides for Telegram, Discord, Slack, WhatsApp, Signal, Email
|
||||
- AGENTS.md — development guide for AI coding assistants
|
||||
- CONTRIBUTING.md ([#117](https://github.com/NousResearch/hermes-agent/pull/117)) — @Bartok9
|
||||
- Slash commands reference ([#142](https://github.com/NousResearch/hermes-agent/pull/142)) — @Bartok9
|
||||
- Comprehensive AGENTS.md accuracy audit ([#732](https://github.com/NousResearch/hermes-agent/pull/732))
|
||||
- Skin/theme system documentation
|
||||
- MCP documentation and examples
|
||||
- Docs accuracy audit — 35+ corrections
|
||||
- Documentation typo fixes ([#825](https://github.com/NousResearch/hermes-agent/pull/825), [#439](https://github.com/NousResearch/hermes-agent/pull/439)) — @JackTheGit
|
||||
- CLI config precedence and terminology standardization ([#166](https://github.com/NousResearch/hermes-agent/pull/166), [#167](https://github.com/NousResearch/hermes-agent/pull/167), [#168](https://github.com/NousResearch/hermes-agent/pull/168)) — @Jr-kenny
|
||||
- Telegram token regex documentation ([#713](https://github.com/NousResearch/hermes-agent/pull/713)) — @VolodymyrBg
|
||||
|
||||
---
|
||||
|
||||
## 👥 Contributors
|
||||
|
||||
Thank you to the 63 contributors who made this release possible! In just over two weeks, the Hermes Agent community came together to ship an extraordinary amount of work.
|
||||
|
||||
### Core
|
||||
- **@teknium1** — 43 PRs: Project lead, core architecture, provider router, sessions, skills, CLI, documentation
|
||||
|
||||
### Top Community Contributors
|
||||
- **@0xbyt4** — 40 PRs: MCP client, Home Assistant, security fixes (symlink, prompt injection, cron), extensive test coverage (6 batches), ascii-art skill, shell noise elimination, skills sync, Telegram formatting, and dozens more
|
||||
- **@Farukest** — 16 PRs: Security hardening (path traversal, dangerous command detection, symlink boundary), Windows compatibility (POSIX guards, path handling), WhatsApp fixes, max-iterations retry, gateway fixes
|
||||
- **@aydnOktay** — 11 PRs: Atomic writes (process checkpoints, batch runner, skill files), error handling improvements across Telegram, Discord, code execution, transcription, TTS, and skills
|
||||
- **@Bartok9** — 9 PRs: CONTRIBUTING.md, slash commands reference, Discord channel topics, think-block stripping, TTS fix, Honcho fix, session count fix, clarify tests
|
||||
- **@PercyDikec** — 7 PRs: DeepSeek V3 parser fix, /retry response discard, gateway transcript offset, Codex status/visibility, max-iterations retry, setup wizard fix
|
||||
- **@teyrebaz33** — 5 PRs: Skills enable/disable system, quick commands, personality customization, conditional skill activation
|
||||
- **@alireza78a** — 5 PRs: Atomic writes (cron, sessions), fd leak prevention, security allowlist, code execution socket cleanup
|
||||
- **@shitcoinsherpa** — 3 PRs: Windows support (pywinpty, UTF-8 encoding, auth store lock)
|
||||
- **@Himess** — 3 PRs: Cron/HomeAssistant/Daytona fix, Windows drive-letter parsing, .env permissions
|
||||
- **@satelerd** — 2 PRs: WhatsApp native media, multi-user session isolation
|
||||
- **@rovle** — 1 PR: Daytona cloud sandbox backend (4 commits)
|
||||
- **@erosika** — 1 PR: Honcho AI-native memory integration
|
||||
- **@dmahan93** — 1 PR: --fuck-it-ship-it flag + RL environment work
|
||||
- **@SHL0MS** — 1 PR: ASCII video skill
|
||||
|
||||
### All Contributors
|
||||
@0xbyt4, @BP602, @Bartok9, @Farukest, @FurkanL0, @Himess, @Indelwin, @JackTheGit, @JoshuaMart, @Jr-kenny, @OutThisLife, @PercyDikec, @SHL0MS, @Sertug17, @VencentSoliman, @VolodymyrBg, @adavyas, @alireza78a, @areu01or00, @aydnOktay, @batuhankocyigit, @bierlingm, @caentzminger, @cesareth, @ch3ronsa, @christomitov, @cutepawss, @deankerr, @dmahan93, @dogiladeveloper, @dragonkhoi, @erosika, @gamedevCloudy, @gizdusum, @grp06, @intertwine, @jackx707, @jdblackstar, @johnh4098, @kaos35, @kshitijk4poor, @leonsgithub, @luisv-1, @manuelschipper, @mehmetkr-31, @memosr, @PeterFile, @rewbs, @rovle, @rsavitt, @satelerd, @spanishflu-est1918, @stablegenius49, @tars90percent, @tekelala, @teknium1, @teyrebaz33, @tripledoublev, @unmodeled-tyler, @voidborne-d, @voteblake, @ygd58
|
||||
|
||||
---
|
||||
|
||||
**Full Changelog**: [v0.1.0...v2026.3.12](https://github.com/NousResearch/hermes-agent/compare/v0.1.0...v2026.3.12)
|
||||
+23
-476
@@ -17,10 +17,7 @@ Resolution order for text tasks (auto mode):
|
||||
Resolution order for vision/multimodal tasks (auto mode):
|
||||
1. OpenRouter
|
||||
2. Nous Portal
|
||||
3. Codex OAuth (gpt-5.3-codex supports vision via Responses API)
|
||||
4. Custom endpoint (for local vision models: Qwen-VL, LLaVA, Pixtral, etc.)
|
||||
5. None (API-key providers like z.ai/Kimi/MiniMax are skipped —
|
||||
they may not support multimodal)
|
||||
3. None (steps 3-5 are skipped — they may not support multimodal)
|
||||
|
||||
Per-task provider overrides (e.g. AUXILIARY_VISION_PROVIDER,
|
||||
CONTEXT_COMPRESSION_PROVIDER) can force a specific provider for each task:
|
||||
@@ -443,7 +440,7 @@ def _try_custom_endpoint() -> Tuple[Optional[OpenAI], Optional[str]]:
|
||||
custom_key = os.getenv("OPENAI_API_KEY")
|
||||
if not custom_base or not custom_key:
|
||||
return None, None
|
||||
model = os.getenv("OPENAI_MODEL") or "gpt-4o-mini"
|
||||
model = os.getenv("OPENAI_MODEL") or os.getenv("LLM_MODEL") or "gpt-4o-mini"
|
||||
logger.debug("Auxiliary client: custom endpoint (%s)", model)
|
||||
return OpenAI(api_key=custom_key, base_url=custom_base), model
|
||||
|
||||
@@ -502,205 +499,6 @@ def _resolve_auto() -> Tuple[Optional[OpenAI], Optional[str]]:
|
||||
return None, None
|
||||
|
||||
|
||||
# ── Centralized Provider Router ─────────────────────────────────────────────
|
||||
#
|
||||
# resolve_provider_client() is the single entry point for creating a properly
|
||||
# configured client given a (provider, model) pair. It handles auth lookup,
|
||||
# base URL resolution, provider-specific headers, and API format differences
|
||||
# (Chat Completions vs Responses API for Codex).
|
||||
#
|
||||
# All auxiliary consumer code should go through this or the public helpers
|
||||
# below — never look up auth env vars ad-hoc.
|
||||
|
||||
|
||||
def _to_async_client(sync_client, model: str):
|
||||
"""Convert a sync client to its async counterpart, preserving Codex routing."""
|
||||
from openai import AsyncOpenAI
|
||||
|
||||
if isinstance(sync_client, CodexAuxiliaryClient):
|
||||
return AsyncCodexAuxiliaryClient(sync_client), model
|
||||
|
||||
async_kwargs = {
|
||||
"api_key": sync_client.api_key,
|
||||
"base_url": str(sync_client.base_url),
|
||||
}
|
||||
base_lower = str(sync_client.base_url).lower()
|
||||
if "openrouter" in base_lower:
|
||||
async_kwargs["default_headers"] = dict(_OR_HEADERS)
|
||||
elif "api.kimi.com" in base_lower:
|
||||
async_kwargs["default_headers"] = {"User-Agent": "KimiCLI/1.0"}
|
||||
return AsyncOpenAI(**async_kwargs), model
|
||||
|
||||
|
||||
def resolve_provider_client(
|
||||
provider: str,
|
||||
model: str = None,
|
||||
async_mode: bool = False,
|
||||
raw_codex: bool = False,
|
||||
) -> Tuple[Optional[Any], Optional[str]]:
|
||||
"""Central router: given a provider name and optional model, return a
|
||||
configured client with the correct auth, base URL, and API format.
|
||||
|
||||
The returned client always exposes ``.chat.completions.create()`` — for
|
||||
Codex/Responses API providers, an adapter handles the translation
|
||||
transparently.
|
||||
|
||||
Args:
|
||||
provider: Provider identifier. One of:
|
||||
"openrouter", "nous", "openai-codex" (or "codex"),
|
||||
"zai", "kimi-coding", "minimax", "minimax-cn",
|
||||
"custom" (OPENAI_BASE_URL + OPENAI_API_KEY),
|
||||
"auto" (full auto-detection chain).
|
||||
model: Model slug override. If None, uses the provider's default
|
||||
auxiliary model.
|
||||
async_mode: If True, return an async-compatible client.
|
||||
raw_codex: If True, return a raw OpenAI client for Codex providers
|
||||
instead of wrapping in CodexAuxiliaryClient. Use this when
|
||||
the caller needs direct access to responses.stream() (e.g.,
|
||||
the main agent loop).
|
||||
|
||||
Returns:
|
||||
(client, resolved_model) or (None, None) if auth is unavailable.
|
||||
"""
|
||||
# Normalise aliases
|
||||
provider = (provider or "auto").strip().lower()
|
||||
if provider == "codex":
|
||||
provider = "openai-codex"
|
||||
if provider == "main":
|
||||
provider = "custom"
|
||||
|
||||
# ── Auto: try all providers in priority order ────────────────────
|
||||
if provider == "auto":
|
||||
client, resolved = _resolve_auto()
|
||||
if client is None:
|
||||
return None, None
|
||||
final_model = model or resolved
|
||||
return (_to_async_client(client, final_model) if async_mode
|
||||
else (client, final_model))
|
||||
|
||||
# ── OpenRouter ───────────────────────────────────────────────────
|
||||
if provider == "openrouter":
|
||||
client, default = _try_openrouter()
|
||||
if client is None:
|
||||
logger.warning("resolve_provider_client: openrouter requested "
|
||||
"but OPENROUTER_API_KEY not set")
|
||||
return None, None
|
||||
final_model = model or default
|
||||
return (_to_async_client(client, final_model) if async_mode
|
||||
else (client, final_model))
|
||||
|
||||
# ── Nous Portal (OAuth) ──────────────────────────────────────────
|
||||
if provider == "nous":
|
||||
client, default = _try_nous()
|
||||
if client is None:
|
||||
logger.warning("resolve_provider_client: nous requested "
|
||||
"but Nous Portal not configured (run: hermes login)")
|
||||
return None, None
|
||||
final_model = model or default
|
||||
return (_to_async_client(client, final_model) if async_mode
|
||||
else (client, final_model))
|
||||
|
||||
# ── OpenAI Codex (OAuth → Responses API) ─────────────────────────
|
||||
if provider == "openai-codex":
|
||||
if raw_codex:
|
||||
# Return the raw OpenAI client for callers that need direct
|
||||
# access to responses.stream() (e.g., the main agent loop).
|
||||
codex_token = _read_codex_access_token()
|
||||
if not codex_token:
|
||||
logger.warning("resolve_provider_client: openai-codex requested "
|
||||
"but no Codex OAuth token found (run: hermes model)")
|
||||
return None, None
|
||||
final_model = model or _CODEX_AUX_MODEL
|
||||
raw_client = OpenAI(api_key=codex_token, base_url=_CODEX_AUX_BASE_URL)
|
||||
return (raw_client, final_model)
|
||||
# Standard path: wrap in CodexAuxiliaryClient adapter
|
||||
client, default = _try_codex()
|
||||
if client is None:
|
||||
logger.warning("resolve_provider_client: openai-codex requested "
|
||||
"but no Codex OAuth token found (run: hermes model)")
|
||||
return None, None
|
||||
final_model = model or default
|
||||
return (_to_async_client(client, final_model) if async_mode
|
||||
else (client, final_model))
|
||||
|
||||
# ── Custom endpoint (OPENAI_BASE_URL + OPENAI_API_KEY) ───────────
|
||||
if provider == "custom":
|
||||
# Try custom first, then codex, then API-key providers
|
||||
for try_fn in (_try_custom_endpoint, _try_codex,
|
||||
_resolve_api_key_provider):
|
||||
client, default = try_fn()
|
||||
if client is not None:
|
||||
final_model = model or default
|
||||
return (_to_async_client(client, final_model) if async_mode
|
||||
else (client, final_model))
|
||||
logger.warning("resolve_provider_client: custom/main requested "
|
||||
"but no endpoint credentials found")
|
||||
return None, None
|
||||
|
||||
# ── API-key providers from PROVIDER_REGISTRY ─────────────────────
|
||||
try:
|
||||
from hermes_cli.auth import PROVIDER_REGISTRY, _resolve_kimi_base_url
|
||||
except ImportError:
|
||||
logger.debug("hermes_cli.auth not available for provider %s", provider)
|
||||
return None, None
|
||||
|
||||
pconfig = PROVIDER_REGISTRY.get(provider)
|
||||
if pconfig is None:
|
||||
logger.warning("resolve_provider_client: unknown provider %r", provider)
|
||||
return None, None
|
||||
|
||||
if pconfig.auth_type == "api_key":
|
||||
# Find the first configured API key
|
||||
api_key = ""
|
||||
for env_var in pconfig.api_key_env_vars:
|
||||
api_key = os.getenv(env_var, "").strip()
|
||||
if api_key:
|
||||
break
|
||||
if not api_key:
|
||||
logger.warning("resolve_provider_client: provider %s has no API "
|
||||
"key configured (tried: %s)",
|
||||
provider, ", ".join(pconfig.api_key_env_vars))
|
||||
return None, None
|
||||
|
||||
# Resolve base URL (env override → provider-specific logic → default)
|
||||
base_url_override = os.getenv(pconfig.base_url_env_var, "").strip() if pconfig.base_url_env_var else ""
|
||||
if provider == "kimi-coding":
|
||||
base_url = _resolve_kimi_base_url(api_key, pconfig.inference_base_url, base_url_override)
|
||||
elif base_url_override:
|
||||
base_url = base_url_override
|
||||
else:
|
||||
base_url = pconfig.inference_base_url
|
||||
|
||||
default_model = _API_KEY_PROVIDER_AUX_MODELS.get(provider, "")
|
||||
final_model = model or default_model
|
||||
|
||||
# Provider-specific headers
|
||||
headers = {}
|
||||
if "api.kimi.com" in base_url.lower():
|
||||
headers["User-Agent"] = "KimiCLI/1.0"
|
||||
|
||||
client = OpenAI(api_key=api_key, base_url=base_url,
|
||||
**({"default_headers": headers} if headers else {}))
|
||||
logger.debug("resolve_provider_client: %s (%s)", provider, final_model)
|
||||
return (_to_async_client(client, final_model) if async_mode
|
||||
else (client, final_model))
|
||||
|
||||
elif pconfig.auth_type in ("oauth_device_code", "oauth_external"):
|
||||
# OAuth providers — route through their specific try functions
|
||||
if provider == "nous":
|
||||
return resolve_provider_client("nous", model, async_mode)
|
||||
if provider == "openai-codex":
|
||||
return resolve_provider_client("openai-codex", model, async_mode)
|
||||
# Other OAuth providers not directly supported
|
||||
logger.warning("resolve_provider_client: OAuth provider %s not "
|
||||
"directly supported, try 'auto'", provider)
|
||||
return None, None
|
||||
|
||||
logger.warning("resolve_provider_client: unhandled auth_type %s for %s",
|
||||
pconfig.auth_type, provider)
|
||||
return None, None
|
||||
|
||||
|
||||
# ── Public API ──────────────────────────────────────────────────────────────
|
||||
|
||||
def get_text_auxiliary_client(task: str = "") -> Tuple[Optional[OpenAI], Optional[str]]:
|
||||
@@ -715,8 +513,8 @@ def get_text_auxiliary_client(task: str = "") -> Tuple[Optional[OpenAI], Optiona
|
||||
"""
|
||||
forced = _get_auxiliary_provider(task)
|
||||
if forced != "auto":
|
||||
return resolve_provider_client(forced)
|
||||
return resolve_provider_client("auto")
|
||||
return _resolve_forced_provider(forced)
|
||||
return _resolve_auto()
|
||||
|
||||
|
||||
def get_async_text_auxiliary_client(task: str = ""):
|
||||
@@ -726,10 +524,24 @@ def get_async_text_auxiliary_client(task: str = ""):
|
||||
(AsyncCodexAuxiliaryClient, model) which wraps the Responses API.
|
||||
Returns (None, None) when no provider is available.
|
||||
"""
|
||||
forced = _get_auxiliary_provider(task)
|
||||
if forced != "auto":
|
||||
return resolve_provider_client(forced, async_mode=True)
|
||||
return resolve_provider_client("auto", async_mode=True)
|
||||
from openai import AsyncOpenAI
|
||||
|
||||
sync_client, model = get_text_auxiliary_client(task)
|
||||
if sync_client is None:
|
||||
return None, None
|
||||
|
||||
if isinstance(sync_client, CodexAuxiliaryClient):
|
||||
return AsyncCodexAuxiliaryClient(sync_client), model
|
||||
|
||||
async_kwargs = {
|
||||
"api_key": sync_client.api_key,
|
||||
"base_url": str(sync_client.base_url),
|
||||
}
|
||||
if "openrouter" in str(sync_client.base_url).lower():
|
||||
async_kwargs["default_headers"] = dict(_OR_HEADERS)
|
||||
elif "api.kimi.com" in str(sync_client.base_url).lower():
|
||||
async_kwargs["default_headers"] = {"User-Agent": "KimiCLI/1.0"}
|
||||
return AsyncOpenAI(**async_kwargs), model
|
||||
|
||||
|
||||
def get_vision_auxiliary_client() -> Tuple[Optional[OpenAI], Optional[str]]:
|
||||
@@ -747,7 +559,7 @@ def get_vision_auxiliary_client() -> Tuple[Optional[OpenAI], Optional[str]]:
|
||||
"""
|
||||
forced = _get_auxiliary_provider("vision")
|
||||
if forced != "auto":
|
||||
return resolve_provider_client(forced)
|
||||
return _resolve_forced_provider(forced)
|
||||
# Auto: try providers known to support multimodal first, then fall
|
||||
# back to the user's custom endpoint. Many local models (Qwen-VL,
|
||||
# LLaVA, Pixtral, etc.) support vision — skipping them entirely
|
||||
@@ -761,21 +573,6 @@ def get_vision_auxiliary_client() -> Tuple[Optional[OpenAI], Optional[str]]:
|
||||
return None, None
|
||||
|
||||
|
||||
def get_async_vision_auxiliary_client():
|
||||
"""Return (async_client, model_slug) for async vision consumers.
|
||||
|
||||
Properly handles Codex routing — unlike manually constructing
|
||||
AsyncOpenAI from a sync client, this preserves the Responses API
|
||||
adapter for Codex providers.
|
||||
|
||||
Returns (None, None) when no provider is available.
|
||||
"""
|
||||
sync_client, model = get_vision_auxiliary_client()
|
||||
if sync_client is None:
|
||||
return None, None
|
||||
return _to_async_client(sync_client, model)
|
||||
|
||||
|
||||
def get_auxiliary_extra_body() -> dict:
|
||||
"""Return extra_body kwargs for auxiliary API calls.
|
||||
|
||||
@@ -801,253 +598,3 @@ def auxiliary_max_tokens_param(value: int) -> dict:
|
||||
and "api.openai.com" in custom_base.lower()):
|
||||
return {"max_completion_tokens": value}
|
||||
return {"max_tokens": value}
|
||||
|
||||
|
||||
# ── Centralized LLM Call API ────────────────────────────────────────────────
|
||||
#
|
||||
# call_llm() and async_call_llm() own the full request lifecycle:
|
||||
# 1. Resolve provider + model from task config (or explicit args)
|
||||
# 2. Get or create a cached client for that provider
|
||||
# 3. Format request args for the provider + model (max_tokens handling, etc.)
|
||||
# 4. Make the API call
|
||||
# 5. Return the response
|
||||
#
|
||||
# Every auxiliary LLM consumer should use these instead of manually
|
||||
# constructing clients and calling .chat.completions.create().
|
||||
|
||||
# Client cache: (provider, async_mode) -> (client, default_model)
|
||||
_client_cache: Dict[tuple, tuple] = {}
|
||||
|
||||
|
||||
def _get_cached_client(
|
||||
provider: str, model: str = None, async_mode: bool = False,
|
||||
) -> Tuple[Optional[Any], Optional[str]]:
|
||||
"""Get or create a cached client for the given provider."""
|
||||
cache_key = (provider, async_mode)
|
||||
if cache_key in _client_cache:
|
||||
cached_client, cached_default = _client_cache[cache_key]
|
||||
return cached_client, model or cached_default
|
||||
client, default_model = resolve_provider_client(provider, model, async_mode)
|
||||
if client is not None:
|
||||
_client_cache[cache_key] = (client, default_model)
|
||||
return client, model or default_model
|
||||
|
||||
|
||||
def _resolve_task_provider_model(
|
||||
task: str = None,
|
||||
provider: str = None,
|
||||
model: str = None,
|
||||
) -> Tuple[str, Optional[str]]:
|
||||
"""Determine provider + model for a call.
|
||||
|
||||
Priority:
|
||||
1. Explicit provider/model args (always win)
|
||||
2. Env var overrides (AUXILIARY_{TASK}_PROVIDER, etc.)
|
||||
3. Config file (auxiliary.{task}.provider/model or compression.*)
|
||||
4. "auto" (full auto-detection chain)
|
||||
|
||||
Returns (provider, model) where model may be None (use provider default).
|
||||
"""
|
||||
if provider:
|
||||
return provider, model
|
||||
|
||||
if task:
|
||||
# Check env var overrides first
|
||||
env_provider = _get_auxiliary_provider(task)
|
||||
if env_provider != "auto":
|
||||
# Check for env var model override too
|
||||
env_model = None
|
||||
for prefix in ("AUXILIARY_", "CONTEXT_"):
|
||||
val = os.getenv(f"{prefix}{task.upper()}_MODEL", "").strip()
|
||||
if val:
|
||||
env_model = val
|
||||
break
|
||||
return env_provider, model or env_model
|
||||
|
||||
# Read from config file
|
||||
try:
|
||||
from hermes_cli.config import load_config
|
||||
config = load_config()
|
||||
except ImportError:
|
||||
return "auto", model
|
||||
|
||||
# Check auxiliary.{task} section
|
||||
aux = config.get("auxiliary", {})
|
||||
task_config = aux.get(task, {})
|
||||
cfg_provider = task_config.get("provider", "").strip() or None
|
||||
cfg_model = task_config.get("model", "").strip() or None
|
||||
|
||||
# Backwards compat: compression section has its own keys
|
||||
if task == "compression" and not cfg_provider:
|
||||
comp = config.get("compression", {})
|
||||
cfg_provider = comp.get("summary_provider", "").strip() or None
|
||||
cfg_model = cfg_model or comp.get("summary_model", "").strip() or None
|
||||
|
||||
if cfg_provider and cfg_provider != "auto":
|
||||
return cfg_provider, model or cfg_model
|
||||
return "auto", model or cfg_model
|
||||
|
||||
return "auto", model
|
||||
|
||||
|
||||
def _build_call_kwargs(
|
||||
provider: str,
|
||||
model: str,
|
||||
messages: list,
|
||||
temperature: Optional[float] = None,
|
||||
max_tokens: Optional[int] = None,
|
||||
tools: Optional[list] = None,
|
||||
timeout: float = 30.0,
|
||||
extra_body: Optional[dict] = None,
|
||||
) -> dict:
|
||||
"""Build kwargs for .chat.completions.create() with model/provider adjustments."""
|
||||
kwargs: Dict[str, Any] = {
|
||||
"model": model,
|
||||
"messages": messages,
|
||||
"timeout": timeout,
|
||||
}
|
||||
|
||||
if temperature is not None:
|
||||
kwargs["temperature"] = temperature
|
||||
|
||||
if max_tokens is not None:
|
||||
# Codex adapter handles max_tokens internally; OpenRouter/Nous use max_tokens.
|
||||
# Direct OpenAI api.openai.com with newer models needs max_completion_tokens.
|
||||
if provider == "custom":
|
||||
custom_base = os.getenv("OPENAI_BASE_URL", "")
|
||||
if "api.openai.com" in custom_base.lower():
|
||||
kwargs["max_completion_tokens"] = max_tokens
|
||||
else:
|
||||
kwargs["max_tokens"] = max_tokens
|
||||
else:
|
||||
kwargs["max_tokens"] = max_tokens
|
||||
|
||||
if tools:
|
||||
kwargs["tools"] = tools
|
||||
|
||||
# Provider-specific extra_body
|
||||
merged_extra = dict(extra_body or {})
|
||||
if provider == "nous" or auxiliary_is_nous:
|
||||
merged_extra.setdefault("tags", []).extend(["product=hermes-agent"])
|
||||
if merged_extra:
|
||||
kwargs["extra_body"] = merged_extra
|
||||
|
||||
return kwargs
|
||||
|
||||
|
||||
def call_llm(
|
||||
task: str = None,
|
||||
*,
|
||||
provider: str = None,
|
||||
model: str = None,
|
||||
messages: list,
|
||||
temperature: float = None,
|
||||
max_tokens: int = None,
|
||||
tools: list = None,
|
||||
timeout: float = 30.0,
|
||||
extra_body: dict = None,
|
||||
) -> Any:
|
||||
"""Centralized synchronous LLM call.
|
||||
|
||||
Resolves provider + model (from task config, explicit args, or auto-detect),
|
||||
handles auth, request formatting, and model-specific arg adjustments.
|
||||
|
||||
Args:
|
||||
task: Auxiliary task name ("compression", "vision", "web_extract",
|
||||
"session_search", "skills_hub", "mcp", "flush_memories").
|
||||
Reads provider:model from config/env. Ignored if provider is set.
|
||||
provider: Explicit provider override.
|
||||
model: Explicit model override.
|
||||
messages: Chat messages list.
|
||||
temperature: Sampling temperature (None = provider default).
|
||||
max_tokens: Max output tokens (handles max_tokens vs max_completion_tokens).
|
||||
tools: Tool definitions (for function calling).
|
||||
timeout: Request timeout in seconds.
|
||||
extra_body: Additional request body fields.
|
||||
|
||||
Returns:
|
||||
Response object with .choices[0].message.content
|
||||
|
||||
Raises:
|
||||
RuntimeError: If no provider is configured.
|
||||
"""
|
||||
resolved_provider, resolved_model = _resolve_task_provider_model(
|
||||
task, provider, model)
|
||||
|
||||
client, final_model = _get_cached_client(resolved_provider, resolved_model)
|
||||
if client is None:
|
||||
# Fallback: try openrouter
|
||||
if resolved_provider != "openrouter":
|
||||
logger.warning("Provider %s unavailable, falling back to openrouter",
|
||||
resolved_provider)
|
||||
client, final_model = _get_cached_client(
|
||||
"openrouter", resolved_model or _OPENROUTER_MODEL)
|
||||
if client is None:
|
||||
raise RuntimeError(
|
||||
f"No LLM provider configured for task={task} provider={resolved_provider}. "
|
||||
f"Run: hermes setup")
|
||||
|
||||
kwargs = _build_call_kwargs(
|
||||
resolved_provider, final_model, messages,
|
||||
temperature=temperature, max_tokens=max_tokens,
|
||||
tools=tools, timeout=timeout, extra_body=extra_body)
|
||||
|
||||
# Handle max_tokens vs max_completion_tokens retry
|
||||
try:
|
||||
return client.chat.completions.create(**kwargs)
|
||||
except Exception as first_err:
|
||||
err_str = str(first_err)
|
||||
if "max_tokens" in err_str or "unsupported_parameter" in err_str:
|
||||
kwargs.pop("max_tokens", None)
|
||||
kwargs["max_completion_tokens"] = max_tokens
|
||||
return client.chat.completions.create(**kwargs)
|
||||
raise
|
||||
|
||||
|
||||
async def async_call_llm(
|
||||
task: str = None,
|
||||
*,
|
||||
provider: str = None,
|
||||
model: str = None,
|
||||
messages: list,
|
||||
temperature: float = None,
|
||||
max_tokens: int = None,
|
||||
tools: list = None,
|
||||
timeout: float = 30.0,
|
||||
extra_body: dict = None,
|
||||
) -> Any:
|
||||
"""Centralized asynchronous LLM call.
|
||||
|
||||
Same as call_llm() but async. See call_llm() for full documentation.
|
||||
"""
|
||||
resolved_provider, resolved_model = _resolve_task_provider_model(
|
||||
task, provider, model)
|
||||
|
||||
client, final_model = _get_cached_client(
|
||||
resolved_provider, resolved_model, async_mode=True)
|
||||
if client is None:
|
||||
if resolved_provider != "openrouter":
|
||||
logger.warning("Provider %s unavailable, falling back to openrouter",
|
||||
resolved_provider)
|
||||
client, final_model = _get_cached_client(
|
||||
"openrouter", resolved_model or _OPENROUTER_MODEL,
|
||||
async_mode=True)
|
||||
if client is None:
|
||||
raise RuntimeError(
|
||||
f"No LLM provider configured for task={task} provider={resolved_provider}. "
|
||||
f"Run: hermes setup")
|
||||
|
||||
kwargs = _build_call_kwargs(
|
||||
resolved_provider, final_model, messages,
|
||||
temperature=temperature, max_tokens=max_tokens,
|
||||
tools=tools, timeout=timeout, extra_body=extra_body)
|
||||
|
||||
try:
|
||||
return await client.chat.completions.create(**kwargs)
|
||||
except Exception as first_err:
|
||||
err_str = str(first_err)
|
||||
if "max_tokens" in err_str or "unsupported_parameter" in err_str:
|
||||
kwargs.pop("max_tokens", None)
|
||||
kwargs["max_completion_tokens"] = max_tokens
|
||||
return await client.chat.completions.create(**kwargs)
|
||||
raise
|
||||
|
||||
+80
-25
@@ -9,7 +9,7 @@ import logging
|
||||
import os
|
||||
from typing import Any, Dict, List, Optional
|
||||
|
||||
from agent.auxiliary_client import call_llm
|
||||
from agent.auxiliary_client import get_text_auxiliary_client
|
||||
from agent.model_metadata import (
|
||||
get_model_context_length,
|
||||
estimate_messages_tokens_rough,
|
||||
@@ -53,7 +53,8 @@ class ContextCompressor:
|
||||
self.last_completion_tokens = 0
|
||||
self.last_total_tokens = 0
|
||||
|
||||
self.summary_model = summary_model_override or ""
|
||||
self.client, default_model = get_text_auxiliary_client("compression")
|
||||
self.summary_model = summary_model_override or default_model
|
||||
|
||||
def update_from_response(self, usage: Dict[str, Any]):
|
||||
"""Update tracked token usage from API response."""
|
||||
@@ -119,30 +120,84 @@ TURNS TO SUMMARIZE:
|
||||
|
||||
Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
|
||||
|
||||
# Use the centralized LLM router — handles provider resolution,
|
||||
# auth, and fallback internally.
|
||||
# 1. Try the auxiliary model (cheap/fast)
|
||||
if self.client:
|
||||
try:
|
||||
return self._call_summary_model(self.client, self.summary_model, prompt)
|
||||
except Exception as e:
|
||||
logging.warning(f"Failed to generate context summary with auxiliary model: {e}")
|
||||
|
||||
# 2. Fallback: try the user's main model endpoint
|
||||
fallback_client, fallback_model = self._get_fallback_client()
|
||||
if fallback_client is not None:
|
||||
try:
|
||||
logger.info("Retrying context summary with main model (%s)", fallback_model)
|
||||
summary = self._call_summary_model(fallback_client, fallback_model, prompt)
|
||||
self.client = fallback_client
|
||||
self.summary_model = fallback_model
|
||||
return summary
|
||||
except Exception as fallback_err:
|
||||
logging.warning(f"Main model summary also failed: {fallback_err}")
|
||||
|
||||
# 3. All models failed — return None so the caller drops turns without a summary
|
||||
logging.warning("Context compression: no model available for summary. Middle turns will be dropped without summary.")
|
||||
return None
|
||||
|
||||
def _call_summary_model(self, client, model: str, prompt: str) -> str:
|
||||
"""Make the actual LLM call to generate a summary. Raises on failure."""
|
||||
kwargs = {
|
||||
"model": model,
|
||||
"messages": [{"role": "user", "content": prompt}],
|
||||
"temperature": 0.3,
|
||||
"timeout": 30.0,
|
||||
}
|
||||
# Most providers (OpenRouter, local models) use max_tokens.
|
||||
# Direct OpenAI with newer models (gpt-4o, o-series, gpt-5+)
|
||||
# requires max_completion_tokens instead.
|
||||
try:
|
||||
call_kwargs = {
|
||||
"task": "compression",
|
||||
"messages": [{"role": "user", "content": prompt}],
|
||||
"temperature": 0.3,
|
||||
"max_tokens": self.summary_target_tokens * 2,
|
||||
"timeout": 30.0,
|
||||
}
|
||||
if self.summary_model:
|
||||
call_kwargs["model"] = self.summary_model
|
||||
response = call_llm(**call_kwargs)
|
||||
summary = response.choices[0].message.content.strip()
|
||||
if not summary.startswith("[CONTEXT SUMMARY]:"):
|
||||
summary = "[CONTEXT SUMMARY]: " + summary
|
||||
return summary
|
||||
except RuntimeError:
|
||||
logging.warning("Context compression: no provider available for "
|
||||
"summary. Middle turns will be dropped without summary.")
|
||||
return None
|
||||
except Exception as e:
|
||||
logging.warning("Failed to generate context summary: %s", e)
|
||||
return None
|
||||
kwargs["max_tokens"] = self.summary_target_tokens * 2
|
||||
response = client.chat.completions.create(**kwargs)
|
||||
except Exception as first_err:
|
||||
if "max_tokens" in str(first_err) or "unsupported_parameter" in str(first_err):
|
||||
kwargs.pop("max_tokens", None)
|
||||
kwargs["max_completion_tokens"] = self.summary_target_tokens * 2
|
||||
response = client.chat.completions.create(**kwargs)
|
||||
else:
|
||||
raise
|
||||
|
||||
summary = response.choices[0].message.content.strip()
|
||||
if not summary.startswith("[CONTEXT SUMMARY]:"):
|
||||
summary = "[CONTEXT SUMMARY]: " + summary
|
||||
return summary
|
||||
|
||||
def _get_fallback_client(self):
|
||||
"""Try to build a fallback client from the main model's endpoint config.
|
||||
|
||||
When the primary auxiliary client fails (e.g. stale OpenRouter key), this
|
||||
creates a client using the user's active custom endpoint (OPENAI_BASE_URL)
|
||||
so compression can still produce a real summary instead of a static string.
|
||||
|
||||
Returns (client, model) or (None, None).
|
||||
"""
|
||||
custom_base = os.getenv("OPENAI_BASE_URL")
|
||||
custom_key = os.getenv("OPENAI_API_KEY")
|
||||
if not custom_base or not custom_key:
|
||||
return None, None
|
||||
|
||||
# Don't fallback to the same provider that just failed
|
||||
from hermes_constants import OPENROUTER_BASE_URL
|
||||
if custom_base.rstrip("/") == OPENROUTER_BASE_URL.rstrip("/"):
|
||||
return None, None
|
||||
|
||||
model = os.getenv("LLM_MODEL") or os.getenv("OPENAI_MODEL") or self.model
|
||||
try:
|
||||
from openai import OpenAI as _OpenAI
|
||||
client = _OpenAI(api_key=custom_key, base_url=custom_base)
|
||||
logger.debug("Built fallback auxiliary client: %s via %s", model, custom_base)
|
||||
return client, model
|
||||
except Exception as exc:
|
||||
logger.debug("Could not build fallback auxiliary client: %s", exc)
|
||||
return None, None
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Tool-call / tool-result pair integrity helpers
|
||||
|
||||
+1
-56
@@ -187,58 +187,7 @@ def _skill_is_platform_compatible(skill_file: Path) -> bool:
|
||||
return True # Err on the side of showing the skill
|
||||
|
||||
|
||||
def _read_skill_conditions(skill_file: Path) -> dict:
|
||||
"""Extract conditional activation fields from SKILL.md frontmatter."""
|
||||
try:
|
||||
from tools.skills_tool import _parse_frontmatter
|
||||
raw = skill_file.read_text(encoding="utf-8")[:2000]
|
||||
frontmatter, _ = _parse_frontmatter(raw)
|
||||
hermes = frontmatter.get("metadata", {}).get("hermes", {})
|
||||
return {
|
||||
"fallback_for_toolsets": hermes.get("fallback_for_toolsets", []),
|
||||
"requires_toolsets": hermes.get("requires_toolsets", []),
|
||||
"fallback_for_tools": hermes.get("fallback_for_tools", []),
|
||||
"requires_tools": hermes.get("requires_tools", []),
|
||||
}
|
||||
except Exception:
|
||||
return {}
|
||||
|
||||
|
||||
def _skill_should_show(
|
||||
conditions: dict,
|
||||
available_tools: "set[str] | None",
|
||||
available_toolsets: "set[str] | None",
|
||||
) -> bool:
|
||||
"""Return False if the skill's conditional activation rules exclude it."""
|
||||
if available_tools is None and available_toolsets is None:
|
||||
return True # No filtering info — show everything (backward compat)
|
||||
|
||||
at = available_tools or set()
|
||||
ats = available_toolsets or set()
|
||||
|
||||
# fallback_for: hide when the primary tool/toolset IS available
|
||||
for ts in conditions.get("fallback_for_toolsets", []):
|
||||
if ts in ats:
|
||||
return False
|
||||
for t in conditions.get("fallback_for_tools", []):
|
||||
if t in at:
|
||||
return False
|
||||
|
||||
# requires: hide when a required tool/toolset is NOT available
|
||||
for ts in conditions.get("requires_toolsets", []):
|
||||
if ts not in ats:
|
||||
return False
|
||||
for t in conditions.get("requires_tools", []):
|
||||
if t not in at:
|
||||
return False
|
||||
|
||||
return True
|
||||
|
||||
|
||||
def build_skills_system_prompt(
|
||||
available_tools: "set[str] | None" = None,
|
||||
available_toolsets: "set[str] | None" = None,
|
||||
) -> str:
|
||||
def build_skills_system_prompt() -> str:
|
||||
"""Build a compact skill index for the system prompt.
|
||||
|
||||
Scans ~/.hermes/skills/ for SKILL.md files grouped by category.
|
||||
@@ -261,10 +210,6 @@ def build_skills_system_prompt(
|
||||
# Skip skills incompatible with the current OS platform
|
||||
if not _skill_is_platform_compatible(skill_file):
|
||||
continue
|
||||
# Skip skills whose conditional activation rules exclude them
|
||||
conditions = _read_skill_conditions(skill_file)
|
||||
if not _skill_should_show(conditions, available_tools, available_toolsets):
|
||||
continue
|
||||
rel_path = skill_file.relative_to(skills_dir)
|
||||
parts = rel_path.parts
|
||||
if len(parts) >= 2:
|
||||
|
||||
@@ -416,7 +416,7 @@ from model_tools import get_tool_definitions, get_toolset_for_tool
|
||||
# Extracted CLI modules (Phase 3)
|
||||
from hermes_cli.banner import (
|
||||
cprint as _cprint, _GOLD, _BOLD, _DIM, _RST,
|
||||
VERSION, RELEASE_DATE, HERMES_AGENT_LOGO, HERMES_CADUCEUS, COMPACT_BANNER,
|
||||
VERSION, HERMES_AGENT_LOGO, HERMES_CADUCEUS, COMPACT_BANNER,
|
||||
get_available_skills as _get_available_skills,
|
||||
build_welcome_banner,
|
||||
)
|
||||
@@ -993,7 +993,7 @@ def build_welcome_banner(console: Console, model: str, cwd: str, tools: List[dic
|
||||
# Wrap in a panel with the title
|
||||
outer_panel = Panel(
|
||||
layout_table,
|
||||
title=f"[bold {_title_c}]{_agent_name} v{VERSION} ({RELEASE_DATE})[/]",
|
||||
title=f"[bold {_title_c}]{_agent_name} {VERSION}[/]",
|
||||
border_style=_border_c,
|
||||
padding=(0, 2),
|
||||
)
|
||||
@@ -1129,17 +1129,12 @@ class HermesCLI:
|
||||
self.verbose = verbose if verbose is not None else (self.tool_progress_mode == "verbose")
|
||||
|
||||
# Configuration - priority: CLI args > env vars > config file
|
||||
# Model comes from: CLI arg or config.yaml (single source of truth).
|
||||
# LLM_MODEL/OPENAI_MODEL env vars are NOT checked — config.yaml is
|
||||
# authoritative. This avoids conflicts in multi-agent setups where
|
||||
# env vars would stomp each other.
|
||||
_model_config = CLI_CONFIG.get("model", {})
|
||||
_config_model = _model_config.get("default", "") if isinstance(_model_config, dict) else (_model_config or "")
|
||||
self.model = model or _config_model or "anthropic/claude-opus-4.6"
|
||||
# Model can come from: CLI arg, LLM_MODEL env, OPENAI_MODEL env (custom endpoint), or config
|
||||
self.model = model or os.getenv("LLM_MODEL") or os.getenv("OPENAI_MODEL") or CLI_CONFIG["model"]["default"]
|
||||
# Track whether model was explicitly chosen by the user or fell back
|
||||
# to the global default. Provider-specific normalisation may override
|
||||
# the default silently but should warn when overriding an explicit choice.
|
||||
self._model_is_default = not model
|
||||
self._model_is_default = not (model or os.getenv("LLM_MODEL") or os.getenv("OPENAI_MODEL"))
|
||||
|
||||
self._explicit_api_key = api_key
|
||||
self._explicit_base_url = base_url
|
||||
@@ -2265,72 +2260,6 @@ class HermesCLI:
|
||||
remaining = len(self.conversation_history)
|
||||
print(f" {remaining} message(s) remaining in history.")
|
||||
|
||||
def _show_model_and_providers(self):
|
||||
"""Unified /model and /provider display.
|
||||
|
||||
Shows current model + provider, then lists all authenticated
|
||||
providers with their available models so users can switch easily.
|
||||
"""
|
||||
from hermes_cli.models import (
|
||||
curated_models_for_provider, list_available_providers,
|
||||
normalize_provider, _PROVIDER_LABELS,
|
||||
)
|
||||
from hermes_cli.auth import resolve_provider as _resolve_provider
|
||||
|
||||
# Resolve current provider
|
||||
raw_provider = normalize_provider(self.provider)
|
||||
if raw_provider == "auto":
|
||||
try:
|
||||
current = _resolve_provider(
|
||||
self.requested_provider,
|
||||
explicit_api_key=self._explicit_api_key,
|
||||
explicit_base_url=self._explicit_base_url,
|
||||
)
|
||||
except Exception:
|
||||
current = "openrouter"
|
||||
else:
|
||||
current = raw_provider
|
||||
current_label = _PROVIDER_LABELS.get(current, current)
|
||||
|
||||
print(f"\n Current: {self.model} via {current_label}")
|
||||
print()
|
||||
|
||||
# Show all authenticated providers with their models
|
||||
providers = list_available_providers()
|
||||
authed = [p for p in providers if p["authenticated"]]
|
||||
unauthed = [p for p in providers if not p["authenticated"]]
|
||||
|
||||
if authed:
|
||||
print(" Authenticated providers & models:")
|
||||
for p in authed:
|
||||
is_active = p["id"] == current
|
||||
marker = " ← active" if is_active else ""
|
||||
print(f" [{p['id']}]{marker}")
|
||||
curated = curated_models_for_provider(p["id"])
|
||||
if curated:
|
||||
for mid, desc in curated:
|
||||
current_marker = " ← current" if (is_active and mid == self.model) else ""
|
||||
print(f" {mid}{current_marker}")
|
||||
else:
|
||||
print(f" (use /model {p['id']}:<model-name>)")
|
||||
print()
|
||||
|
||||
if unauthed:
|
||||
names = ", ".join(p["label"] for p in unauthed)
|
||||
print(f" Not configured: {names}")
|
||||
print(f" Run: hermes setup")
|
||||
print()
|
||||
|
||||
print(" Switch model: /model <model-name>")
|
||||
print(" Switch provider: /model <provider>:<model-name>")
|
||||
if authed and len(authed) > 1:
|
||||
# Show a concrete example with a non-active provider
|
||||
other = next((p for p in authed if p["id"] != current), authed[0])
|
||||
other_models = curated_models_for_provider(other["id"])
|
||||
if other_models:
|
||||
example_model = other_models[0][0]
|
||||
print(f" Example: /model {other['id']}:{example_model}")
|
||||
|
||||
def _handle_prompt_command(self, cmd: str):
|
||||
"""Handle the /prompt command to view or set system prompt."""
|
||||
parts = cmd.split(maxsplit=1)
|
||||
@@ -2795,11 +2724,7 @@ class HermesCLI:
|
||||
base_url_for_probe = runtime.get("base_url", "")
|
||||
except Exception as e:
|
||||
provider_label = _PROVIDER_LABELS.get(target_provider, target_provider)
|
||||
if target_provider == "custom":
|
||||
print(f"(>_<) Custom endpoint not configured. Set OPENAI_BASE_URL and OPENAI_API_KEY,")
|
||||
print(f" or run: hermes setup → Custom OpenAI-compatible endpoint")
|
||||
else:
|
||||
print(f"(>_<) Could not resolve credentials for provider '{provider_label}': {e}")
|
||||
print(f"(>_<) Could not resolve credentials for provider '{provider_label}': {e}")
|
||||
print(f"(^_^) Current model unchanged: {self.model}")
|
||||
return True
|
||||
|
||||
@@ -2846,9 +2771,65 @@ class HermesCLI:
|
||||
print(f" Reason: {message}")
|
||||
print(" Note: Model will revert on restart. Use a verified model to save to config.")
|
||||
else:
|
||||
self._show_model_and_providers()
|
||||
from hermes_cli.models import curated_models_for_provider, normalize_provider, _PROVIDER_LABELS
|
||||
from hermes_cli.auth import resolve_provider as _resolve_provider
|
||||
# Resolve "auto" to the actual provider using credential detection
|
||||
raw_provider = normalize_provider(self.provider)
|
||||
if raw_provider == "auto":
|
||||
try:
|
||||
display_provider = _resolve_provider(
|
||||
self.requested_provider,
|
||||
explicit_api_key=self._explicit_api_key,
|
||||
explicit_base_url=self._explicit_base_url,
|
||||
)
|
||||
except Exception:
|
||||
display_provider = "openrouter"
|
||||
else:
|
||||
display_provider = raw_provider
|
||||
provider_label = _PROVIDER_LABELS.get(display_provider, display_provider)
|
||||
print(f"\n Current model: {self.model}")
|
||||
print(f" Current provider: {provider_label}")
|
||||
print()
|
||||
curated = curated_models_for_provider(display_provider)
|
||||
if curated:
|
||||
print(f" Available models ({provider_label}):")
|
||||
for mid, desc in curated:
|
||||
marker = " ←" if mid == self.model else ""
|
||||
label = f" {desc}" if desc else ""
|
||||
print(f" {mid}{label}{marker}")
|
||||
print()
|
||||
print(" Usage: /model <model-name>")
|
||||
print(" /model provider:model-name (to switch provider)")
|
||||
print(" Example: /model openrouter:anthropic/claude-sonnet-4.5")
|
||||
print(" See /provider for available providers")
|
||||
elif cmd_lower == "/provider":
|
||||
self._show_model_and_providers()
|
||||
from hermes_cli.models import list_available_providers, normalize_provider, _PROVIDER_LABELS
|
||||
from hermes_cli.auth import resolve_provider as _resolve_provider
|
||||
# Resolve current provider
|
||||
raw_provider = normalize_provider(self.provider)
|
||||
if raw_provider == "auto":
|
||||
try:
|
||||
current = _resolve_provider(
|
||||
self.requested_provider,
|
||||
explicit_api_key=self._explicit_api_key,
|
||||
explicit_base_url=self._explicit_base_url,
|
||||
)
|
||||
except Exception:
|
||||
current = "openrouter"
|
||||
else:
|
||||
current = raw_provider
|
||||
current_label = _PROVIDER_LABELS.get(current, current)
|
||||
print(f"\n Current provider: {current_label} ({current})\n")
|
||||
providers = list_available_providers()
|
||||
print(" Available providers:")
|
||||
for p in providers:
|
||||
marker = " ← active" if p["id"] == current else ""
|
||||
auth = "✓" if p["authenticated"] else "✗"
|
||||
aliases = f" (also: {', '.join(p['aliases'])})" if p["aliases"] else ""
|
||||
print(f" [{auth}] {p['id']:<14} {p['label']}{aliases}{marker}")
|
||||
print()
|
||||
print(" Switch: /model provider:model-name")
|
||||
print(" Setup: hermes setup")
|
||||
elif cmd_lower.startswith("/prompt"):
|
||||
# Use original case so prompt text isn't lowercased
|
||||
self._handle_prompt_command(cmd_original)
|
||||
|
||||
+7
-13
@@ -168,22 +168,16 @@ def parse_schedule(schedule: str) -> Dict[str, Any]:
|
||||
|
||||
|
||||
def _ensure_aware(dt: datetime) -> datetime:
|
||||
"""Return a timezone-aware datetime in Hermes configured timezone.
|
||||
"""Make a naive datetime tz-aware using the configured timezone.
|
||||
|
||||
Backward compatibility:
|
||||
- Older stored timestamps may be naive.
|
||||
- Naive values are interpreted as *system-local wall time* (the timezone
|
||||
`datetime.now()` used when they were created), then converted to the
|
||||
configured Hermes timezone.
|
||||
|
||||
This preserves relative ordering for legacy naive timestamps across
|
||||
timezone changes and avoids false not-due results.
|
||||
Handles backward compatibility: timestamps stored before timezone support
|
||||
are naive (server-local). We assume they were in the same timezone as
|
||||
the current configuration so comparisons work without crashing.
|
||||
"""
|
||||
target_tz = _hermes_now().tzinfo
|
||||
if dt.tzinfo is None:
|
||||
local_tz = datetime.now().astimezone().tzinfo
|
||||
return dt.replace(tzinfo=local_tz).astimezone(target_tz)
|
||||
return dt.astimezone(target_tz)
|
||||
tz = _hermes_now().tzinfo
|
||||
return dt.replace(tzinfo=tz)
|
||||
return dt
|
||||
|
||||
|
||||
def compute_next_run(schedule: Dict[str, Any], last_run_at: Optional[str] = None) -> Optional[str]:
|
||||
|
||||
+1
-1
@@ -180,7 +180,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
|
||||
except UnicodeDecodeError:
|
||||
load_dotenv(str(_hermes_home / ".env"), override=True, encoding="latin-1")
|
||||
|
||||
model = os.getenv("HERMES_MODEL") or "anthropic/claude-opus-4.6"
|
||||
model = os.getenv("HERMES_MODEL") or os.getenv("LLM_MODEL") or "anthropic/claude-opus-4.6"
|
||||
|
||||
# Load config.yaml for model, reasoning, prefill, toolsets, provider routing
|
||||
_cfg = {}
|
||||
|
||||
@@ -292,18 +292,6 @@ def load_gateway_config() -> GatewayConfig:
|
||||
sr = yaml_cfg.get("session_reset")
|
||||
if sr and isinstance(sr, dict):
|
||||
config.default_reset_policy = SessionResetPolicy.from_dict(sr)
|
||||
|
||||
# Bridge discord settings from config.yaml to env vars
|
||||
# (env vars take precedence — only set if not already defined)
|
||||
discord_cfg = yaml_cfg.get("discord", {})
|
||||
if isinstance(discord_cfg, dict):
|
||||
if "require_mention" in discord_cfg and not os.getenv("DISCORD_REQUIRE_MENTION"):
|
||||
os.environ["DISCORD_REQUIRE_MENTION"] = str(discord_cfg["require_mention"]).lower()
|
||||
frc = discord_cfg.get("free_response_channels")
|
||||
if frc is not None and not os.getenv("DISCORD_FREE_RESPONSE_CHANNELS"):
|
||||
if isinstance(frc, list):
|
||||
frc = ",".join(str(v) for v in frc)
|
||||
os.environ["DISCORD_FREE_RESPONSE_CHANNELS"] = str(frc)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
|
||||
@@ -775,46 +775,6 @@ class DiscordAdapter(BasePlatformAdapter):
|
||||
except Exception as e:
|
||||
return SendResult(success=False, error=str(e))
|
||||
|
||||
def _get_parent_channel_id(self, channel: Any) -> Optional[str]:
|
||||
"""Return the parent channel ID for a Discord thread-like channel, if present."""
|
||||
parent = getattr(channel, "parent", None)
|
||||
if parent is not None and getattr(parent, "id", None) is not None:
|
||||
return str(parent.id)
|
||||
parent_id = getattr(channel, "parent_id", None)
|
||||
if parent_id is not None:
|
||||
return str(parent_id)
|
||||
return None
|
||||
|
||||
def _is_forum_parent(self, channel: Any) -> bool:
|
||||
"""Best-effort check for whether a Discord channel is a forum channel."""
|
||||
if channel is None:
|
||||
return False
|
||||
forum_cls = getattr(discord, "ForumChannel", None)
|
||||
if forum_cls and isinstance(channel, forum_cls):
|
||||
return True
|
||||
channel_type = getattr(channel, "type", None)
|
||||
if channel_type is not None:
|
||||
type_value = getattr(channel_type, "value", channel_type)
|
||||
if type_value == 15:
|
||||
return True
|
||||
return False
|
||||
|
||||
def _format_thread_chat_name(self, thread: Any) -> str:
|
||||
"""Build a readable chat name for thread-like Discord channels, including forum context when available."""
|
||||
thread_name = getattr(thread, "name", None) or str(getattr(thread, "id", "thread"))
|
||||
parent = getattr(thread, "parent", None)
|
||||
guild = getattr(thread, "guild", None) or getattr(parent, "guild", None)
|
||||
guild_name = getattr(guild, "name", None)
|
||||
parent_name = getattr(parent, "name", None)
|
||||
|
||||
if self._is_forum_parent(parent) and guild_name and parent_name:
|
||||
return f"{guild_name} / {parent_name} / {thread_name}"
|
||||
if parent_name and guild_name:
|
||||
return f"{guild_name} / #{parent_name} / {thread_name}"
|
||||
if parent_name:
|
||||
return f"{parent_name} / {thread_name}"
|
||||
return thread_name
|
||||
|
||||
async def _handle_message(self, message: DiscordMessage) -> None:
|
||||
"""Handle incoming Discord messages."""
|
||||
# In server channels (not DMs), require the bot to be @mentioned
|
||||
@@ -825,33 +785,28 @@ class DiscordAdapter(BasePlatformAdapter):
|
||||
# bot responds to every message without needing a mention.
|
||||
# DISCORD_REQUIRE_MENTION: Set to "false" to disable mention requirement
|
||||
# globally (all channels become free-response). Default: "true".
|
||||
# Can also be set via discord.require_mention in config.yaml.
|
||||
|
||||
thread_id = None
|
||||
parent_channel_id = None
|
||||
is_thread = isinstance(message.channel, discord.Thread)
|
||||
if is_thread:
|
||||
thread_id = str(message.channel.id)
|
||||
parent_channel_id = self._get_parent_channel_id(message.channel)
|
||||
|
||||
|
||||
if not isinstance(message.channel, discord.DMChannel):
|
||||
# Check if this channel is in the free-response list
|
||||
free_channels_raw = os.getenv("DISCORD_FREE_RESPONSE_CHANNELS", "")
|
||||
free_channels = {ch.strip() for ch in free_channels_raw.split(",") if ch.strip()}
|
||||
channel_ids = {str(message.channel.id)}
|
||||
if parent_channel_id:
|
||||
channel_ids.add(parent_channel_id)
|
||||
|
||||
channel_id = str(message.channel.id)
|
||||
|
||||
# Global override: if DISCORD_REQUIRE_MENTION=false, all channels are free
|
||||
require_mention = os.getenv("DISCORD_REQUIRE_MENTION", "true").lower() not in ("false", "0", "no")
|
||||
is_free_channel = bool(channel_ids & free_channels)
|
||||
|
||||
|
||||
is_free_channel = channel_id in free_channels
|
||||
|
||||
if require_mention and not is_free_channel:
|
||||
# Must be @mentioned to respond
|
||||
if self._client.user not in message.mentions:
|
||||
return
|
||||
|
||||
return # Silently ignore messages that don't mention the bot
|
||||
|
||||
# Strip the bot mention from the message text so the agent sees clean input
|
||||
if self._client.user and self._client.user in message.mentions:
|
||||
message.content = message.content.replace(f"<@{self._client.user.id}>", "").strip()
|
||||
message.content = message.content.replace(f"<@!{self._client.user.id}>", "").strip()
|
||||
|
||||
|
||||
# Determine message type
|
||||
msg_type = MessageType.TEXT
|
||||
if message.content.startswith("/"):
|
||||
@@ -874,15 +829,20 @@ class DiscordAdapter(BasePlatformAdapter):
|
||||
if isinstance(message.channel, discord.DMChannel):
|
||||
chat_type = "dm"
|
||||
chat_name = message.author.name
|
||||
elif is_thread:
|
||||
elif isinstance(message.channel, discord.Thread):
|
||||
chat_type = "thread"
|
||||
chat_name = self._format_thread_chat_name(message.channel)
|
||||
chat_name = message.channel.name
|
||||
else:
|
||||
chat_type = "group"
|
||||
chat_type = "group" # Treat server channels as groups
|
||||
chat_name = getattr(message.channel, "name", str(message.channel.id))
|
||||
if hasattr(message.channel, "guild") and message.channel.guild:
|
||||
chat_name = f"{message.channel.guild.name} / #{chat_name}"
|
||||
|
||||
|
||||
# Get thread ID if in a thread
|
||||
thread_id = None
|
||||
if isinstance(message.channel, discord.Thread):
|
||||
thread_id = str(message.channel.id)
|
||||
|
||||
# Get channel topic (if available - TextChannels have topics, DMs/threads don't)
|
||||
chat_topic = getattr(message.channel, "topic", None)
|
||||
|
||||
|
||||
+31
-39
@@ -187,30 +187,6 @@ def _resolve_runtime_agent_kwargs() -> dict:
|
||||
}
|
||||
|
||||
|
||||
def _resolve_gateway_model() -> str:
|
||||
"""Read model from env/config — mirrors the resolution in _run_agent_sync.
|
||||
|
||||
Without this, temporary AIAgent instances (memory flush, /compress) fall
|
||||
back to the hardcoded default ("anthropic/claude-opus-4.6") which fails
|
||||
when the active provider is openai-codex.
|
||||
"""
|
||||
model = os.getenv("HERMES_MODEL") or os.getenv("LLM_MODEL") or "anthropic/claude-opus-4.6"
|
||||
try:
|
||||
import yaml as _y
|
||||
_cfg_path = _hermes_home / "config.yaml"
|
||||
if _cfg_path.exists():
|
||||
with open(_cfg_path, encoding="utf-8") as _f:
|
||||
_cfg = _y.safe_load(_f) or {}
|
||||
_model_cfg = _cfg.get("model", {})
|
||||
if isinstance(_model_cfg, str):
|
||||
model = _model_cfg
|
||||
elif isinstance(_model_cfg, dict):
|
||||
model = _model_cfg.get("default", model)
|
||||
except Exception:
|
||||
pass
|
||||
return model
|
||||
|
||||
|
||||
class GatewayRunner:
|
||||
"""
|
||||
Main gateway controller.
|
||||
@@ -282,14 +258,8 @@ class GatewayRunner:
|
||||
if not runtime_kwargs.get("api_key"):
|
||||
return
|
||||
|
||||
# Resolve model from config — AIAgent's default is OpenRouter-
|
||||
# formatted ("anthropic/claude-opus-4.6") which fails when the
|
||||
# active provider is openai-codex.
|
||||
model = _resolve_gateway_model()
|
||||
|
||||
tmp_agent = AIAgent(
|
||||
**runtime_kwargs,
|
||||
model=model,
|
||||
max_iterations=8,
|
||||
quiet_mode=True,
|
||||
enabled_toolsets=["memory", "skills"],
|
||||
@@ -1136,7 +1106,6 @@ class GatewayRunner:
|
||||
if len(_hyg_msgs) >= 4:
|
||||
_hyg_agent = AIAgent(
|
||||
**_hyg_runtime,
|
||||
model=_hyg_model,
|
||||
max_iterations=4,
|
||||
quiet_mode=True,
|
||||
enabled_toolsets=["memory"],
|
||||
@@ -1575,7 +1544,7 @@ class GatewayRunner:
|
||||
config_path = _hermes_home / 'config.yaml'
|
||||
|
||||
# Resolve current model and provider from config
|
||||
current = os.getenv("HERMES_MODEL") or "anthropic/claude-opus-4.6"
|
||||
current = os.getenv("HERMES_MODEL") or os.getenv("LLM_MODEL") or "anthropic/claude-opus-4.6"
|
||||
current_provider = "openrouter"
|
||||
try:
|
||||
if config_path.exists():
|
||||
@@ -2029,8 +1998,21 @@ class GatewayRunner:
|
||||
)
|
||||
return
|
||||
|
||||
# Read model from config via shared helper
|
||||
model = _resolve_gateway_model()
|
||||
# Read model from config (same as _run_agent)
|
||||
model = os.getenv("HERMES_MODEL") or os.getenv("LLM_MODEL") or "anthropic/claude-opus-4.6"
|
||||
try:
|
||||
import yaml as _y
|
||||
_cfg_path = _hermes_home / "config.yaml"
|
||||
if _cfg_path.exists():
|
||||
with open(_cfg_path, encoding="utf-8") as _f:
|
||||
_cfg = _y.safe_load(_f) or {}
|
||||
_model_cfg = _cfg.get("model", {})
|
||||
if isinstance(_model_cfg, str):
|
||||
model = _model_cfg
|
||||
elif isinstance(_model_cfg, dict):
|
||||
model = _model_cfg.get("default", model)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Determine toolset (same logic as _run_agent)
|
||||
default_toolset_map = {
|
||||
@@ -2187,9 +2169,6 @@ class GatewayRunner:
|
||||
if not runtime_kwargs.get("api_key"):
|
||||
return "No provider configured -- cannot compress."
|
||||
|
||||
# Resolve model from config (same reason as memory flush above).
|
||||
model = _resolve_gateway_model()
|
||||
|
||||
msgs = [
|
||||
{"role": m.get("role"), "content": m.get("content")}
|
||||
for m in history
|
||||
@@ -2200,7 +2179,6 @@ class GatewayRunner:
|
||||
|
||||
tmp_agent = AIAgent(
|
||||
**runtime_kwargs,
|
||||
model=model,
|
||||
max_iterations=4,
|
||||
quiet_mode=True,
|
||||
enabled_toolsets=["memory"],
|
||||
@@ -3115,7 +3093,21 @@ class GatewayRunner:
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
model = _resolve_gateway_model()
|
||||
model = os.getenv("HERMES_MODEL") or os.getenv("LLM_MODEL") or "anthropic/claude-opus-4.6"
|
||||
|
||||
try:
|
||||
import yaml as _y
|
||||
_cfg_path = _hermes_home / "config.yaml"
|
||||
if _cfg_path.exists():
|
||||
with open(_cfg_path, encoding="utf-8") as _f:
|
||||
_cfg = _y.safe_load(_f) or {}
|
||||
_model_cfg = _cfg.get("model", {})
|
||||
if isinstance(_model_cfg, str):
|
||||
model = _model_cfg
|
||||
elif isinstance(_model_cfg, dict):
|
||||
model = _model_cfg.get("default", model)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
try:
|
||||
runtime_kwargs = _resolve_runtime_agent_kwargs()
|
||||
|
||||
@@ -11,5 +11,4 @@ Provides subcommands for:
|
||||
- hermes cron - Manage cron jobs
|
||||
"""
|
||||
|
||||
__version__ = "0.2.0"
|
||||
__release_date__ = "2026.3.12"
|
||||
__version__ = "v1.0.0"
|
||||
|
||||
+12
-6
@@ -108,6 +108,14 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
|
||||
auth_type="oauth_external",
|
||||
inference_base_url=DEFAULT_CODEX_BASE_URL,
|
||||
),
|
||||
"nous-api": ProviderConfig(
|
||||
id="nous-api",
|
||||
name="Nous Portal (API Key)",
|
||||
auth_type="api_key",
|
||||
inference_base_url="https://inference-api.nousresearch.com/v1",
|
||||
api_key_env_vars=("NOUS_API_KEY",),
|
||||
base_url_env_var="NOUS_BASE_URL",
|
||||
),
|
||||
"zai": ProviderConfig(
|
||||
id="zai",
|
||||
name="Z.AI / GLM",
|
||||
@@ -513,6 +521,7 @@ def resolve_provider(
|
||||
|
||||
# Normalize provider aliases
|
||||
_PROVIDER_ALIASES = {
|
||||
"nous_api": "nous-api", "nousapi": "nous-api", "nous-portal-api": "nous-api",
|
||||
"glm": "zai", "z-ai": "zai", "z.ai": "zai", "zhipu": "zai",
|
||||
"kimi": "kimi-coding", "moonshot": "kimi-coding",
|
||||
"minimax-china": "minimax-cn", "minimax_cn": "minimax-cn",
|
||||
@@ -1671,12 +1680,8 @@ def _prompt_model_selection(model_ids: List[str], current_model: str = "") -> Op
|
||||
|
||||
|
||||
def _save_model_choice(model_id: str) -> None:
|
||||
"""Save the selected model to config.yaml (single source of truth).
|
||||
|
||||
The model is stored in config.yaml only — NOT in .env. This avoids
|
||||
conflicts in multi-agent setups where env vars would stomp each other.
|
||||
"""
|
||||
from hermes_cli.config import save_config, load_config
|
||||
"""Save the selected model to config.yaml and .env."""
|
||||
from hermes_cli.config import save_config, load_config, save_env_value
|
||||
|
||||
config = load_config()
|
||||
# Always use dict format so provider/base_url can be stored alongside
|
||||
@@ -1685,6 +1690,7 @@ def _save_model_choice(model_id: str) -> None:
|
||||
else:
|
||||
config["model"] = {"default": model_id}
|
||||
save_config(config)
|
||||
save_env_value("LLM_MODEL", model_id)
|
||||
|
||||
|
||||
def login_command(args) -> None:
|
||||
|
||||
@@ -62,7 +62,7 @@ def _skin_branding(key: str, fallback: str) -> str:
|
||||
# ASCII Art & Branding
|
||||
# =========================================================================
|
||||
|
||||
from hermes_cli import __version__ as VERSION, __release_date__ as RELEASE_DATE
|
||||
from hermes_cli import __version__ as VERSION
|
||||
|
||||
HERMES_AGENT_LOGO = """[bold #FFD700]██╗ ██╗███████╗██████╗ ███╗ ███╗███████╗███████╗ █████╗ ██████╗ ███████╗███╗ ██╗████████╗[/]
|
||||
[bold #FFD700]██║ ██║██╔════╝██╔══██╗████╗ ████║██╔════╝██╔════╝ ██╔══██╗██╔════╝ ██╔════╝████╗ ██║╚══██╔══╝[/]
|
||||
@@ -380,7 +380,7 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
|
||||
border_color = _skin_color("banner_border", "#CD7F32")
|
||||
outer_panel = Panel(
|
||||
layout_table,
|
||||
title=f"[bold {title_color}]{agent_name} v{VERSION} ({RELEASE_DATE})[/]",
|
||||
title=f"[bold {title_color}]{agent_name} {VERSION}[/]",
|
||||
border_style=border_color,
|
||||
padding=(0, 2),
|
||||
)
|
||||
|
||||
@@ -1,135 +0,0 @@
|
||||
"""Shared curses-based multi-select checklist for Hermes CLI.
|
||||
|
||||
Used by both ``hermes tools`` and ``hermes skills`` to present a
|
||||
toggleable list of items. Falls back to a numbered text UI when
|
||||
curses is unavailable (Windows without curses, piped stdin, etc.).
|
||||
"""
|
||||
|
||||
from typing import List, Set
|
||||
|
||||
from hermes_cli.colors import Colors, color
|
||||
|
||||
|
||||
def curses_checklist(
|
||||
title: str,
|
||||
items: List[str],
|
||||
pre_selected: Set[int],
|
||||
) -> Set[int]:
|
||||
"""Multi-select checklist. Returns set of **selected** indices.
|
||||
|
||||
Args:
|
||||
title: Header text shown at the top of the checklist.
|
||||
items: Display labels for each row.
|
||||
pre_selected: Indices that start checked.
|
||||
|
||||
Returns:
|
||||
The indices the user confirmed as checked. On cancel (ESC/q),
|
||||
returns ``pre_selected`` unchanged.
|
||||
"""
|
||||
try:
|
||||
import curses
|
||||
selected = set(pre_selected)
|
||||
result = [None]
|
||||
|
||||
def _ui(stdscr):
|
||||
curses.curs_set(0)
|
||||
if curses.has_colors():
|
||||
curses.start_color()
|
||||
curses.use_default_colors()
|
||||
curses.init_pair(1, curses.COLOR_GREEN, -1)
|
||||
curses.init_pair(2, curses.COLOR_YELLOW, -1)
|
||||
curses.init_pair(3, 8, -1) # dim gray
|
||||
cursor = 0
|
||||
scroll_offset = 0
|
||||
|
||||
while True:
|
||||
stdscr.clear()
|
||||
max_y, max_x = stdscr.getmaxyx()
|
||||
|
||||
# Header
|
||||
try:
|
||||
hattr = curses.A_BOLD | (curses.color_pair(2) if curses.has_colors() else 0)
|
||||
stdscr.addnstr(0, 0, title, max_x - 1, hattr)
|
||||
stdscr.addnstr(
|
||||
1, 0,
|
||||
" ↑↓ navigate SPACE toggle ENTER confirm ESC cancel",
|
||||
max_x - 1, curses.A_DIM,
|
||||
)
|
||||
except curses.error:
|
||||
pass
|
||||
|
||||
# Scrollable item list
|
||||
visible_rows = max_y - 3
|
||||
if cursor < scroll_offset:
|
||||
scroll_offset = cursor
|
||||
elif cursor >= scroll_offset + visible_rows:
|
||||
scroll_offset = cursor - visible_rows + 1
|
||||
|
||||
for draw_i, i in enumerate(
|
||||
range(scroll_offset, min(len(items), scroll_offset + visible_rows))
|
||||
):
|
||||
y = draw_i + 3
|
||||
if y >= max_y - 1:
|
||||
break
|
||||
check = "✓" if i in selected else " "
|
||||
arrow = "→" if i == cursor else " "
|
||||
line = f" {arrow} [{check}] {items[i]}"
|
||||
|
||||
attr = curses.A_NORMAL
|
||||
if i == cursor:
|
||||
attr = curses.A_BOLD
|
||||
if curses.has_colors():
|
||||
attr |= curses.color_pair(1)
|
||||
try:
|
||||
stdscr.addnstr(y, 0, line, max_x - 1, attr)
|
||||
except curses.error:
|
||||
pass
|
||||
|
||||
stdscr.refresh()
|
||||
key = stdscr.getch()
|
||||
|
||||
if key in (curses.KEY_UP, ord("k")):
|
||||
cursor = (cursor - 1) % len(items)
|
||||
elif key in (curses.KEY_DOWN, ord("j")):
|
||||
cursor = (cursor + 1) % len(items)
|
||||
elif key == ord(" "):
|
||||
selected.symmetric_difference_update({cursor})
|
||||
elif key in (curses.KEY_ENTER, 10, 13):
|
||||
result[0] = set(selected)
|
||||
return
|
||||
elif key in (27, ord("q")):
|
||||
result[0] = set(pre_selected)
|
||||
return
|
||||
|
||||
curses.wrapper(_ui)
|
||||
return result[0] if result[0] is not None else set(pre_selected)
|
||||
|
||||
except Exception:
|
||||
pass # fall through to numbered fallback
|
||||
|
||||
# ── Numbered text fallback ────────────────────────────────────────────
|
||||
selected = set(pre_selected)
|
||||
print(color(f"\n {title}", Colors.YELLOW))
|
||||
print(color(" Toggle by number, Enter to confirm.\n", Colors.DIM))
|
||||
|
||||
while True:
|
||||
for i, label in enumerate(items):
|
||||
check = "✓" if i in selected else " "
|
||||
print(f" {i + 1:3}. [{check}] {label}")
|
||||
print()
|
||||
|
||||
try:
|
||||
raw = input(color(" Number to toggle, 's' to save, 'q' to cancel: ", Colors.DIM)).strip()
|
||||
except (KeyboardInterrupt, EOFError):
|
||||
return set(pre_selected)
|
||||
|
||||
if raw.lower() == "s" or raw == "":
|
||||
return selected
|
||||
if raw.lower() == "q":
|
||||
return set(pre_selected)
|
||||
try:
|
||||
idx = int(raw) - 1
|
||||
if 0 <= idx < len(items):
|
||||
selected.symmetric_difference_update({idx})
|
||||
except ValueError:
|
||||
print(color(" Invalid input", Colors.DIM))
|
||||
+14
-48
@@ -17,7 +17,6 @@ import platform
|
||||
import stat
|
||||
import subprocess
|
||||
import sys
|
||||
import tempfile
|
||||
from pathlib import Path
|
||||
from typing import Dict, Any, Optional, List, Tuple
|
||||
|
||||
@@ -126,41 +125,17 @@ DEFAULT_CONFIG = {
|
||||
"summary_provider": "auto",
|
||||
},
|
||||
|
||||
# Auxiliary model config — provider:model for each side task.
|
||||
# Format: provider is the provider name, model is the model slug.
|
||||
# "auto" for provider = auto-detect best available provider.
|
||||
# Empty model = use provider's default auxiliary model.
|
||||
# All tasks fall back to openrouter:google/gemini-3-flash-preview if
|
||||
# the configured provider is unavailable.
|
||||
# Auxiliary model overrides (advanced). By default Hermes auto-selects
|
||||
# the provider and model for each side task. Set these to override.
|
||||
"auxiliary": {
|
||||
"vision": {
|
||||
"provider": "auto", # auto | openrouter | nous | codex | custom
|
||||
"provider": "auto", # auto | openrouter | nous | main
|
||||
"model": "", # e.g. "google/gemini-2.5-flash", "gpt-4o"
|
||||
},
|
||||
"web_extract": {
|
||||
"provider": "auto",
|
||||
"model": "",
|
||||
},
|
||||
"compression": {
|
||||
"provider": "auto",
|
||||
"model": "",
|
||||
},
|
||||
"session_search": {
|
||||
"provider": "auto",
|
||||
"model": "",
|
||||
},
|
||||
"skills_hub": {
|
||||
"provider": "auto",
|
||||
"model": "",
|
||||
},
|
||||
"mcp": {
|
||||
"provider": "auto",
|
||||
"model": "",
|
||||
},
|
||||
"flush_memories": {
|
||||
"provider": "auto",
|
||||
"model": "",
|
||||
},
|
||||
},
|
||||
|
||||
"display": {
|
||||
@@ -232,12 +207,6 @@ DEFAULT_CONFIG = {
|
||||
# Empty string means use server-local time.
|
||||
"timezone": "",
|
||||
|
||||
# Discord platform settings (gateway mode)
|
||||
"discord": {
|
||||
"require_mention": True, # Require @mention to respond in server channels
|
||||
"free_response_channels": "", # Comma-separated channel IDs where bot responds without mention
|
||||
},
|
||||
|
||||
# Permanently allowed dangerous command patterns (added via "always" approval)
|
||||
"command_allowlist": [],
|
||||
# User-defined quick commands that bypass the agent loop (type: exec only)
|
||||
@@ -248,7 +217,7 @@ DEFAULT_CONFIG = {
|
||||
"personalities": {},
|
||||
|
||||
# Config schema version - bump this when adding new required fields
|
||||
"_config_version": 7,
|
||||
"_config_version": 6,
|
||||
}
|
||||
|
||||
# =============================================================================
|
||||
@@ -273,6 +242,14 @@ REQUIRED_ENV_VARS = {}
|
||||
# Optional environment variables that enhance functionality
|
||||
OPTIONAL_ENV_VARS = {
|
||||
# ── Provider (handled in provider selection, not shown in checklists) ──
|
||||
"NOUS_API_KEY": {
|
||||
"description": "Nous Portal API key (direct API key access to Nous inference)",
|
||||
"prompt": "Nous Portal API key",
|
||||
"url": "https://portal.nousresearch.com",
|
||||
"password": True,
|
||||
"category": "provider",
|
||||
"advanced": True,
|
||||
},
|
||||
"NOUS_BASE_URL": {
|
||||
"description": "Nous Portal base URL override",
|
||||
"prompt": "Nous Portal base URL (leave empty for default)",
|
||||
@@ -981,19 +958,8 @@ def save_env_value(key: str, value: str):
|
||||
lines[-1] += "\n"
|
||||
lines.append(f"{key}={value}\n")
|
||||
|
||||
fd, tmp_path = tempfile.mkstemp(dir=str(env_path.parent), suffix='.tmp', prefix='.env_')
|
||||
try:
|
||||
with os.fdopen(fd, 'w', **write_kw) as f:
|
||||
f.writelines(lines)
|
||||
f.flush()
|
||||
os.fsync(f.fileno())
|
||||
os.replace(tmp_path, env_path)
|
||||
except BaseException:
|
||||
try:
|
||||
os.unlink(tmp_path)
|
||||
except OSError:
|
||||
pass
|
||||
raise
|
||||
with open(env_path, 'w', **write_kw) as f:
|
||||
f.writelines(lines)
|
||||
_secure_file(env_path)
|
||||
|
||||
# Restrict .env permissions to owner-only (contains API keys)
|
||||
|
||||
+5
-12
@@ -490,16 +490,13 @@ def run_doctor(args):
|
||||
print(f"\r {color('⚠', Colors.YELLOW)} Anthropic API {color(f'({e})', Colors.DIM)} ")
|
||||
|
||||
# -- API-key providers (Z.AI/GLM, Kimi, MiniMax, MiniMax-CN) --
|
||||
# Tuple: (name, env_vars, default_url, base_env, supports_models_endpoint)
|
||||
# If supports_models_endpoint is False, we skip the health check and just show "configured"
|
||||
_apikey_providers = [
|
||||
("Z.AI / GLM", ("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"), "https://api.z.ai/api/paas/v4/models", "GLM_BASE_URL", True),
|
||||
("Kimi / Moonshot", ("KIMI_API_KEY",), "https://api.moonshot.ai/v1/models", "KIMI_BASE_URL", True),
|
||||
# MiniMax APIs don't support /models endpoint — https://github.com/NousResearch/hermes-agent/issues/811
|
||||
("MiniMax", ("MINIMAX_API_KEY",), None, "MINIMAX_BASE_URL", False),
|
||||
("MiniMax (China)", ("MINIMAX_CN_API_KEY",), None, "MINIMAX_CN_BASE_URL", False),
|
||||
("Z.AI / GLM", ("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"), "https://api.z.ai/api/paas/v4/models", "GLM_BASE_URL"),
|
||||
("Kimi / Moonshot", ("KIMI_API_KEY",), "https://api.moonshot.ai/v1/models", "KIMI_BASE_URL"),
|
||||
("MiniMax", ("MINIMAX_API_KEY",), "https://api.minimax.io/v1/models", "MINIMAX_BASE_URL"),
|
||||
("MiniMax (China)", ("MINIMAX_CN_API_KEY",), "https://api.minimaxi.com/v1/models", "MINIMAX_CN_BASE_URL"),
|
||||
]
|
||||
for _pname, _env_vars, _default_url, _base_env, _supports_health_check in _apikey_providers:
|
||||
for _pname, _env_vars, _default_url, _base_env in _apikey_providers:
|
||||
_key = ""
|
||||
for _ev in _env_vars:
|
||||
_key = os.getenv(_ev, "")
|
||||
@@ -507,10 +504,6 @@ def run_doctor(args):
|
||||
break
|
||||
if _key:
|
||||
_label = _pname.ljust(20)
|
||||
# Some providers (like MiniMax) don't support /models endpoint
|
||||
if not _supports_health_check:
|
||||
print(f" {color('✓', Colors.GREEN)} {_label} {color('(key configured)', Colors.DIM)}")
|
||||
continue
|
||||
print(f" Checking {_pname} API...", end="", flush=True)
|
||||
try:
|
||||
import httpx
|
||||
|
||||
+2
-2
@@ -51,7 +51,7 @@ os.environ.setdefault("MSWEA_SILENT_STARTUP", "1")
|
||||
|
||||
import logging
|
||||
|
||||
from hermes_cli import __version__, __release_date__
|
||||
from hermes_cli import __version__
|
||||
from hermes_constants import OPENROUTER_BASE_URL
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
@@ -1484,7 +1484,7 @@ def cmd_config(args):
|
||||
|
||||
def cmd_version(args):
|
||||
"""Show version."""
|
||||
print(f"Hermes Agent v{__version__} ({__release_date__})")
|
||||
print(f"Hermes Agent v{__version__}")
|
||||
print(f"Project: {PROJECT_ROOT}")
|
||||
|
||||
# Show Python version
|
||||
|
||||
+2
-51
@@ -31,19 +31,6 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
|
||||
]
|
||||
|
||||
_PROVIDER_MODELS: dict[str, list[str]] = {
|
||||
"nous": [
|
||||
"claude-opus-4-6",
|
||||
"claude-sonnet-4-6",
|
||||
"gpt-5.4",
|
||||
"gemini-3-flash",
|
||||
"gemini-3.0-pro-preview",
|
||||
"deepseek-v3.2",
|
||||
],
|
||||
"openai-codex": [
|
||||
"gpt-5.2-codex",
|
||||
"gpt-5.1-codex-mini",
|
||||
"gpt-5.1-codex-max",
|
||||
],
|
||||
"zai": [
|
||||
"glm-5",
|
||||
"glm-4.7",
|
||||
@@ -177,22 +164,10 @@ def parse_model_input(raw: str, current_provider: str) -> tuple[str, str]:
|
||||
|
||||
|
||||
def curated_models_for_provider(provider: Optional[str]) -> list[tuple[str, str]]:
|
||||
"""Return ``(model_id, description)`` tuples for a provider's model list.
|
||||
|
||||
Tries to fetch the live model list from the provider's API first,
|
||||
falling back to the static ``_PROVIDER_MODELS`` catalog if the API
|
||||
is unreachable.
|
||||
"""
|
||||
"""Return ``(model_id, description)`` tuples for a provider's curated list."""
|
||||
normalized = normalize_provider(provider)
|
||||
if normalized == "openrouter":
|
||||
return list(OPENROUTER_MODELS)
|
||||
|
||||
# Try live API first (Codex, Nous, etc. all support /models)
|
||||
live = provider_model_ids(normalized)
|
||||
if live:
|
||||
return [(m, "") for m in live]
|
||||
|
||||
# Fallback to static catalog
|
||||
models = _PROVIDER_MODELS.get(normalized, [])
|
||||
return [(m, "") for m in models]
|
||||
|
||||
@@ -209,11 +184,7 @@ def normalize_provider(provider: Optional[str]) -> str:
|
||||
|
||||
|
||||
def provider_model_ids(provider: Optional[str]) -> list[str]:
|
||||
"""Return the best known model catalog for a provider.
|
||||
|
||||
Tries live API endpoints for providers that support them (Codex, Nous),
|
||||
falling back to static lists.
|
||||
"""
|
||||
"""Return the best known model catalog for a provider."""
|
||||
normalized = normalize_provider(provider)
|
||||
if normalized == "openrouter":
|
||||
return model_ids()
|
||||
@@ -221,17 +192,6 @@ def provider_model_ids(provider: Optional[str]) -> list[str]:
|
||||
from hermes_cli.codex_models import get_codex_model_ids
|
||||
|
||||
return get_codex_model_ids()
|
||||
if normalized == "nous":
|
||||
# Try live Nous Portal /models endpoint
|
||||
try:
|
||||
from hermes_cli.auth import fetch_nous_models, resolve_nous_runtime_credentials
|
||||
creds = resolve_nous_runtime_credentials()
|
||||
if creds:
|
||||
live = fetch_nous_models(creds.get("api_key", ""), creds.get("base_url", ""))
|
||||
if live:
|
||||
return live
|
||||
except Exception:
|
||||
pass
|
||||
return list(_PROVIDER_MODELS.get(normalized, []))
|
||||
|
||||
|
||||
@@ -303,15 +263,6 @@ def validate_requested_model(
|
||||
"message": "Model names cannot contain spaces.",
|
||||
}
|
||||
|
||||
# Custom endpoints can serve any model — skip validation
|
||||
if normalized == "custom":
|
||||
return {
|
||||
"accepted": True,
|
||||
"persist": True,
|
||||
"recognized": False,
|
||||
"message": None,
|
||||
}
|
||||
|
||||
# Probe the live API to check if the model actually exists
|
||||
api_models = fetch_api_models(api_key, base_url)
|
||||
|
||||
|
||||
+469
-665
File diff suppressed because it is too large
Load Diff
+22
-23
@@ -189,30 +189,29 @@ class MiniSWERunner:
|
||||
)
|
||||
self.logger = logging.getLogger(__name__)
|
||||
|
||||
# Initialize LLM client via centralized provider router.
|
||||
# If explicit api_key/base_url are provided (e.g. from CLI args),
|
||||
# construct directly. Otherwise use the router for OpenRouter.
|
||||
if api_key or base_url:
|
||||
from openai import OpenAI
|
||||
client_kwargs = {
|
||||
"base_url": base_url or "https://openrouter.ai/api/v1",
|
||||
"api_key": api_key or os.getenv(
|
||||
"OPENROUTER_API_KEY",
|
||||
os.getenv("ANTHROPIC_API_KEY",
|
||||
os.getenv("OPENAI_API_KEY", ""))),
|
||||
}
|
||||
self.client = OpenAI(**client_kwargs)
|
||||
# Initialize OpenAI client - defaults to OpenRouter
|
||||
from openai import OpenAI
|
||||
|
||||
client_kwargs = {}
|
||||
|
||||
# Default to OpenRouter if no base_url provided
|
||||
if base_url:
|
||||
client_kwargs["base_url"] = base_url
|
||||
else:
|
||||
from agent.auxiliary_client import resolve_provider_client
|
||||
self.client, _ = resolve_provider_client("openrouter", model=model)
|
||||
if self.client is None:
|
||||
# Fallback: try auto-detection
|
||||
self.client, _ = resolve_provider_client("auto", model=model)
|
||||
if self.client is None:
|
||||
from openai import OpenAI
|
||||
self.client = OpenAI(
|
||||
base_url="https://openrouter.ai/api/v1",
|
||||
api_key=os.getenv("OPENROUTER_API_KEY", ""))
|
||||
client_kwargs["base_url"] = "https://openrouter.ai/api/v1"
|
||||
|
||||
|
||||
|
||||
# Handle API key - OpenRouter is the primary provider
|
||||
if api_key:
|
||||
client_kwargs["api_key"] = api_key
|
||||
else:
|
||||
client_kwargs["api_key"] = os.getenv(
|
||||
"OPENROUTER_API_KEY",
|
||||
os.getenv("ANTHROPIC_API_KEY", os.getenv("OPENAI_API_KEY", ""))
|
||||
)
|
||||
|
||||
self.client = OpenAI(**client_kwargs)
|
||||
|
||||
# Environment will be created per-task
|
||||
self.env = None
|
||||
|
||||
+1
-1
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
|
||||
|
||||
[project]
|
||||
name = "hermes-agent"
|
||||
version = "0.2.0"
|
||||
version = "0.1.0"
|
||||
description = "The self-improving AI agent — creates skills from experience, improves them during use, and runs anywhere"
|
||||
readme = "README.md"
|
||||
requires-python = ">=3.11"
|
||||
|
||||
+135
-175
@@ -99,51 +99,6 @@ from agent.trajectory import (
|
||||
)
|
||||
|
||||
|
||||
class _SafeWriter:
|
||||
"""Transparent stdout wrapper that catches OSError from broken pipes.
|
||||
|
||||
When hermes-agent runs as a systemd service, Docker container, or headless
|
||||
daemon, the stdout pipe can become unavailable (idle timeout, buffer
|
||||
exhaustion, socket reset). Any print() call then raises
|
||||
``OSError: [Errno 5] Input/output error``, which can crash
|
||||
run_conversation() — especially via double-fault when the except handler
|
||||
also tries to print.
|
||||
|
||||
This wrapper delegates all writes to the underlying stream and silently
|
||||
catches OSError. It is installed once at the start of run_conversation()
|
||||
and is transparent when stdout is healthy (zero overhead on the happy path).
|
||||
"""
|
||||
|
||||
__slots__ = ("_inner",)
|
||||
|
||||
def __init__(self, inner):
|
||||
object.__setattr__(self, "_inner", inner)
|
||||
|
||||
def write(self, data):
|
||||
try:
|
||||
return self._inner.write(data)
|
||||
except OSError:
|
||||
return len(data) if isinstance(data, str) else 0
|
||||
|
||||
def flush(self):
|
||||
try:
|
||||
self._inner.flush()
|
||||
except OSError:
|
||||
pass
|
||||
|
||||
def fileno(self):
|
||||
return self._inner.fileno()
|
||||
|
||||
def isatty(self):
|
||||
try:
|
||||
return self._inner.isatty()
|
||||
except OSError:
|
||||
return False
|
||||
|
||||
def __getattr__(self, name):
|
||||
return getattr(self._inner, name)
|
||||
|
||||
|
||||
class IterationBudget:
|
||||
"""Thread-safe shared iteration counter for parent and child agents.
|
||||
|
||||
@@ -418,50 +373,36 @@ class AIAgent:
|
||||
]:
|
||||
logging.getLogger(quiet_logger).setLevel(logging.ERROR)
|
||||
|
||||
# Initialize OpenAI client via centralized provider router.
|
||||
# The router handles auth resolution, base URL, headers, and
|
||||
# Codex wrapping for all known providers.
|
||||
# raw_codex=True because the main agent needs direct responses.stream()
|
||||
# access for Codex Responses API streaming.
|
||||
if api_key and base_url:
|
||||
# Explicit credentials from CLI/gateway — construct directly.
|
||||
# The runtime provider resolver already handled auth for us.
|
||||
client_kwargs = {"api_key": api_key, "base_url": base_url}
|
||||
effective_base = base_url
|
||||
if "openrouter" in effective_base.lower():
|
||||
client_kwargs["default_headers"] = {
|
||||
"HTTP-Referer": "https://github.com/NousResearch/hermes-agent",
|
||||
"X-OpenRouter-Title": "Hermes Agent",
|
||||
"X-OpenRouter-Categories": "productivity,cli-agent",
|
||||
}
|
||||
elif "api.kimi.com" in effective_base.lower():
|
||||
client_kwargs["default_headers"] = {
|
||||
"User-Agent": "KimiCLI/1.0",
|
||||
}
|
||||
# Initialize OpenAI client - defaults to OpenRouter
|
||||
client_kwargs = {}
|
||||
|
||||
# Default to OpenRouter if no base_url provided
|
||||
if base_url:
|
||||
client_kwargs["base_url"] = base_url
|
||||
else:
|
||||
# No explicit creds — use the centralized provider router
|
||||
from agent.auxiliary_client import resolve_provider_client
|
||||
_routed_client, _ = resolve_provider_client(
|
||||
self.provider or "auto", model=self.model, raw_codex=True)
|
||||
if _routed_client is not None:
|
||||
client_kwargs = {
|
||||
"api_key": _routed_client.api_key,
|
||||
"base_url": str(_routed_client.base_url),
|
||||
}
|
||||
# Preserve any default_headers the router set
|
||||
if hasattr(_routed_client, '_default_headers') and _routed_client._default_headers:
|
||||
client_kwargs["default_headers"] = dict(_routed_client._default_headers)
|
||||
else:
|
||||
# Final fallback: try raw OpenRouter key
|
||||
client_kwargs = {
|
||||
"api_key": os.getenv("OPENROUTER_API_KEY", ""),
|
||||
"base_url": OPENROUTER_BASE_URL,
|
||||
"default_headers": {
|
||||
"HTTP-Referer": "https://github.com/NousResearch/hermes-agent",
|
||||
"X-OpenRouter-Title": "Hermes Agent",
|
||||
"X-OpenRouter-Categories": "productivity,cli-agent",
|
||||
},
|
||||
}
|
||||
client_kwargs["base_url"] = OPENROUTER_BASE_URL
|
||||
|
||||
# Handle API key - OpenRouter is the primary provider
|
||||
if api_key:
|
||||
client_kwargs["api_key"] = api_key
|
||||
else:
|
||||
# Primary: OPENROUTER_API_KEY, fallback to direct provider keys
|
||||
client_kwargs["api_key"] = os.getenv("OPENROUTER_API_KEY", "")
|
||||
|
||||
# OpenRouter app attribution — shows hermes-agent in rankings/analytics
|
||||
effective_base = client_kwargs.get("base_url", "")
|
||||
if "openrouter" in effective_base.lower():
|
||||
client_kwargs["default_headers"] = {
|
||||
"HTTP-Referer": "https://github.com/NousResearch/hermes-agent",
|
||||
"X-OpenRouter-Title": "Hermes Agent",
|
||||
"X-OpenRouter-Categories": "productivity,cli-agent",
|
||||
}
|
||||
elif "api.kimi.com" in effective_base.lower():
|
||||
# Kimi Code API requires a recognized coding-agent User-Agent
|
||||
# (see https://github.com/MoonshotAI/kimi-cli)
|
||||
client_kwargs["default_headers"] = {
|
||||
"User-Agent": "KimiCLI/1.0",
|
||||
}
|
||||
|
||||
self._client_kwargs = client_kwargs # stored for rebuilding after interrupt
|
||||
try:
|
||||
@@ -1465,14 +1406,7 @@ class AIAgent:
|
||||
prompt_parts.append(user_block)
|
||||
|
||||
has_skills_tools = any(name in self.valid_tool_names for name in ['skills_list', 'skill_view', 'skill_manage'])
|
||||
if has_skills_tools:
|
||||
avail_toolsets = {ts for ts, avail in check_toolset_requirements().items() if avail}
|
||||
skills_prompt = build_skills_system_prompt(
|
||||
available_tools=self.valid_tool_names,
|
||||
available_toolsets=avail_toolsets,
|
||||
)
|
||||
else:
|
||||
skills_prompt = ""
|
||||
skills_prompt = build_skills_system_prompt() if has_skills_tools else ""
|
||||
if skills_prompt:
|
||||
prompt_parts.append(skills_prompt)
|
||||
|
||||
@@ -2257,6 +2191,75 @@ class AIAgent:
|
||||
|
||||
# ── Provider fallback ──────────────────────────────────────────────────
|
||||
|
||||
# API-key providers: provider → (base_url, [env_var_names])
|
||||
_FALLBACK_API_KEY_PROVIDERS = {
|
||||
"openrouter": (OPENROUTER_BASE_URL, ["OPENROUTER_API_KEY"]),
|
||||
"zai": ("https://api.z.ai/api/paas/v4", ["ZAI_API_KEY", "Z_AI_API_KEY"]),
|
||||
"kimi-coding": ("https://api.moonshot.ai/v1", ["KIMI_API_KEY"]),
|
||||
"minimax": ("https://api.minimax.io/v1", ["MINIMAX_API_KEY"]),
|
||||
"minimax-cn": ("https://api.minimaxi.com/v1", ["MINIMAX_CN_API_KEY"]),
|
||||
}
|
||||
|
||||
# OAuth providers: provider → (resolver_import_path, api_mode)
|
||||
# Each resolver returns {"api_key": ..., "base_url": ...}.
|
||||
_FALLBACK_OAUTH_PROVIDERS = {
|
||||
"openai-codex": ("resolve_codex_runtime_credentials", "codex_responses"),
|
||||
"nous": ("resolve_nous_runtime_credentials", "chat_completions"),
|
||||
}
|
||||
|
||||
def _resolve_fallback_credentials(
|
||||
self, fb_provider: str, fb_config: dict
|
||||
) -> Optional[tuple]:
|
||||
"""Resolve credentials for a fallback provider.
|
||||
|
||||
Returns (api_key, base_url, api_mode) on success, or None on failure.
|
||||
Handles three cases:
|
||||
1. OAuth providers (openai-codex, nous) — call credential resolver
|
||||
2. API-key providers (openrouter, zai, etc.) — read env var
|
||||
3. Custom endpoints — use base_url + api_key_env from config
|
||||
"""
|
||||
# ── 1. OAuth providers ────────────────────────────────────────
|
||||
if fb_provider in self._FALLBACK_OAUTH_PROVIDERS:
|
||||
resolver_name, api_mode = self._FALLBACK_OAUTH_PROVIDERS[fb_provider]
|
||||
try:
|
||||
import hermes_cli.auth as _auth
|
||||
resolver = getattr(_auth, resolver_name)
|
||||
creds = resolver()
|
||||
return creds["api_key"], creds["base_url"], api_mode
|
||||
except Exception as e:
|
||||
logging.warning(
|
||||
"Fallback to %s failed (credential resolution): %s",
|
||||
fb_provider, e,
|
||||
)
|
||||
return None
|
||||
|
||||
# ── 2. API-key providers ──────────────────────────────────────
|
||||
fb_key = (fb_config.get("api_key") or "").strip()
|
||||
if not fb_key:
|
||||
key_env = (fb_config.get("api_key_env") or "").strip()
|
||||
if key_env:
|
||||
fb_key = os.getenv(key_env, "")
|
||||
elif fb_provider in self._FALLBACK_API_KEY_PROVIDERS:
|
||||
for env_var in self._FALLBACK_API_KEY_PROVIDERS[fb_provider][1]:
|
||||
fb_key = os.getenv(env_var, "")
|
||||
if fb_key:
|
||||
break
|
||||
if not fb_key:
|
||||
logging.warning(
|
||||
"Fallback model configured but no API key found for provider '%s'",
|
||||
fb_provider,
|
||||
)
|
||||
return None
|
||||
|
||||
# ── 3. Resolve base URL ───────────────────────────────────────
|
||||
fb_base_url = (fb_config.get("base_url") or "").strip()
|
||||
if not fb_base_url and fb_provider in self._FALLBACK_API_KEY_PROVIDERS:
|
||||
fb_base_url = self._FALLBACK_API_KEY_PROVIDERS[fb_provider][0]
|
||||
if not fb_base_url:
|
||||
fb_base_url = OPENROUTER_BASE_URL
|
||||
|
||||
return fb_key, fb_base_url, "chat_completions"
|
||||
|
||||
def _try_activate_fallback(self) -> bool:
|
||||
"""Switch to the configured fallback model/provider.
|
||||
|
||||
@@ -2264,10 +2267,6 @@ class AIAgent:
|
||||
OpenAI client, model slug, and provider in-place so the retry loop
|
||||
can continue with the new backend. One-shot: returns False if
|
||||
already activated or not configured.
|
||||
|
||||
Uses the centralized provider router (resolve_provider_client) for
|
||||
auth resolution and client construction — no duplicated provider→key
|
||||
mappings.
|
||||
"""
|
||||
if self._fallback_activated or not self._fallback_model:
|
||||
return False
|
||||
@@ -2278,31 +2277,25 @@ class AIAgent:
|
||||
if not fb_provider or not fb_model:
|
||||
return False
|
||||
|
||||
# Use centralized router for client construction.
|
||||
# raw_codex=True because the main agent needs direct responses.stream()
|
||||
# access for Codex providers.
|
||||
resolved = self._resolve_fallback_credentials(fb_provider, fb)
|
||||
if resolved is None:
|
||||
return False
|
||||
fb_key, fb_base_url, fb_api_mode = resolved
|
||||
|
||||
# Build new client
|
||||
try:
|
||||
from agent.auxiliary_client import resolve_provider_client
|
||||
fb_client, _ = resolve_provider_client(
|
||||
fb_provider, model=fb_model, raw_codex=True)
|
||||
if fb_client is None:
|
||||
logging.warning(
|
||||
"Fallback to %s failed: provider not configured",
|
||||
fb_provider)
|
||||
return False
|
||||
client_kwargs = {"api_key": fb_key, "base_url": fb_base_url}
|
||||
if "openrouter" in fb_base_url.lower():
|
||||
client_kwargs["default_headers"] = {
|
||||
"HTTP-Referer": "https://github.com/NousResearch/hermes-agent",
|
||||
"X-OpenRouter-Title": "Hermes Agent",
|
||||
"X-OpenRouter-Categories": "productivity,cli-agent",
|
||||
}
|
||||
elif "api.kimi.com" in fb_base_url.lower():
|
||||
client_kwargs["default_headers"] = {"User-Agent": "KimiCLI/1.0"}
|
||||
|
||||
# Determine api_mode from provider
|
||||
fb_api_mode = "chat_completions"
|
||||
if fb_provider == "openai-codex":
|
||||
fb_api_mode = "codex_responses"
|
||||
fb_base_url = str(fb_client.base_url)
|
||||
|
||||
# Swap client and config in-place
|
||||
self.client = fb_client
|
||||
self._client_kwargs = {
|
||||
"api_key": fb_client.api_key,
|
||||
"base_url": fb_base_url,
|
||||
}
|
||||
self.client = OpenAI(**client_kwargs)
|
||||
self._client_kwargs = client_kwargs
|
||||
old_model = self.model
|
||||
self.model = fb_model
|
||||
self.provider = fb_provider
|
||||
@@ -2399,26 +2392,16 @@ class AIAgent:
|
||||
|
||||
extra_body = {}
|
||||
|
||||
_is_openrouter = "openrouter" in self.base_url.lower()
|
||||
|
||||
# Provider preferences (only, ignore, order, sort) are OpenRouter-
|
||||
# specific. Only send to OpenRouter-compatible endpoints.
|
||||
# TODO: Nous Portal will add transparent proxy support — re-enable
|
||||
# for _is_nous when their backend is updated.
|
||||
if provider_preferences and _is_openrouter:
|
||||
if provider_preferences:
|
||||
extra_body["provider"] = provider_preferences
|
||||
|
||||
_is_openrouter = "openrouter" in self.base_url.lower()
|
||||
_is_nous = "nousresearch" in self.base_url.lower()
|
||||
|
||||
_is_mistral = "api.mistral.ai" in self.base_url.lower()
|
||||
if (_is_openrouter or _is_nous) and not _is_mistral:
|
||||
if self.reasoning_config is not None:
|
||||
rc = dict(self.reasoning_config)
|
||||
# Nous Portal requires reasoning enabled — don't send
|
||||
# enabled=false to it (would cause 400).
|
||||
if _is_nous and rc.get("enabled") is False:
|
||||
pass # omit reasoning entirely for Nous when disabled
|
||||
else:
|
||||
extra_body["reasoning"] = rc
|
||||
extra_body["reasoning"] = self.reasoning_config
|
||||
else:
|
||||
extra_body["reasoning"] = {
|
||||
"enabled": True,
|
||||
@@ -2595,22 +2578,19 @@ class AIAgent:
|
||||
|
||||
# Use auxiliary client for the flush call when available --
|
||||
# it's cheaper and avoids Codex Responses API incompatibility.
|
||||
from agent.auxiliary_client import call_llm as _call_llm
|
||||
_aux_available = True
|
||||
try:
|
||||
response = _call_llm(
|
||||
task="flush_memories",
|
||||
messages=api_messages,
|
||||
tools=[memory_tool_def],
|
||||
temperature=0.3,
|
||||
max_tokens=5120,
|
||||
timeout=30.0,
|
||||
)
|
||||
except RuntimeError:
|
||||
_aux_available = False
|
||||
response = None
|
||||
from agent.auxiliary_client import get_text_auxiliary_client
|
||||
aux_client, aux_model = get_text_auxiliary_client()
|
||||
|
||||
if not _aux_available and self.api_mode == "codex_responses":
|
||||
if aux_client:
|
||||
api_kwargs = {
|
||||
"model": aux_model,
|
||||
"messages": api_messages,
|
||||
"tools": [memory_tool_def],
|
||||
"temperature": 0.3,
|
||||
"max_tokens": 5120,
|
||||
}
|
||||
response = aux_client.chat.completions.create(**api_kwargs, timeout=30.0)
|
||||
elif self.api_mode == "codex_responses":
|
||||
# No auxiliary client -- use the Codex Responses path directly
|
||||
codex_kwargs = self._build_api_kwargs(api_messages)
|
||||
codex_kwargs["tools"] = self._responses_tools([memory_tool_def])
|
||||
@@ -2618,7 +2598,7 @@ class AIAgent:
|
||||
if "max_output_tokens" in codex_kwargs:
|
||||
codex_kwargs["max_output_tokens"] = 5120
|
||||
response = self._run_codex_stream(codex_kwargs)
|
||||
elif not _aux_available:
|
||||
else:
|
||||
api_kwargs = {
|
||||
"model": self.model,
|
||||
"messages": api_messages,
|
||||
@@ -2630,7 +2610,7 @@ class AIAgent:
|
||||
|
||||
# Extract tool calls from the response, handling both API formats
|
||||
tool_calls = []
|
||||
if self.api_mode == "codex_responses" and not _aux_available:
|
||||
if self.api_mode == "codex_responses" and not aux_client:
|
||||
assistant_msg, _ = self._normalize_codex_response(response)
|
||||
if assistant_msg and assistant_msg.tool_calls:
|
||||
tool_calls = assistant_msg.tool_calls
|
||||
@@ -3177,11 +3157,6 @@ class AIAgent:
|
||||
Returns:
|
||||
Dict: Complete conversation result with final response and message history
|
||||
"""
|
||||
# Guard stdout against OSError from broken pipes (systemd/headless/daemon).
|
||||
# Installed once, transparent when stdout is healthy, prevents crash on write.
|
||||
if not isinstance(sys.stdout, _SafeWriter):
|
||||
sys.stdout = _SafeWriter(sys.stdout)
|
||||
|
||||
# Generate unique task_id if not provided to isolate VMs between concurrent tasks
|
||||
effective_task_id = task_id or str(uuid.uuid4())
|
||||
|
||||
@@ -3897,7 +3872,6 @@ class AIAgent:
|
||||
'token limit', 'too many tokens', 'reduce the length',
|
||||
'exceeds the limit', 'context window',
|
||||
'request entity too large', # OpenRouter/Nous 413 safety net
|
||||
'prompt is too long', # Anthropic: "prompt is too long: N tokens > M maximum"
|
||||
])
|
||||
|
||||
if is_context_length_error:
|
||||
@@ -4282,7 +4256,6 @@ class AIAgent:
|
||||
|
||||
messages.append(assistant_msg)
|
||||
|
||||
_msg_count_before_tools = len(messages)
|
||||
self._execute_tool_calls(assistant_message, messages, effective_task_id, api_call_count)
|
||||
|
||||
# Refund the iteration if the ONLY tool(s) called were
|
||||
@@ -4292,20 +4265,7 @@ class AIAgent:
|
||||
if _tc_names == {"execute_code"}:
|
||||
self.iteration_budget.refund()
|
||||
|
||||
# Estimate next prompt size using real token counts from the
|
||||
# last API response + rough estimate of newly appended tool
|
||||
# results. This catches cases where tool results push the
|
||||
# context past the limit that last_prompt_tokens alone misses
|
||||
# (e.g. large file reads, web extractions).
|
||||
_compressor = self.context_compressor
|
||||
_new_tool_msgs = messages[_msg_count_before_tools:]
|
||||
_new_chars = sum(len(str(m.get("content", "") or "")) for m in _new_tool_msgs)
|
||||
_estimated_next_prompt = (
|
||||
_compressor.last_prompt_tokens
|
||||
+ _compressor.last_completion_tokens
|
||||
+ _new_chars // 3 # conservative: JSON-heavy tool results ≈ 3 chars/token
|
||||
)
|
||||
if self.compression_enabled and _compressor.should_compress(_estimated_next_prompt):
|
||||
if self.compression_enabled and self.context_compressor.should_compress():
|
||||
messages, active_system_prompt = self._compress_context(
|
||||
messages, system_message,
|
||||
approx_tokens=self.context_compressor.last_prompt_tokens,
|
||||
|
||||
@@ -1,540 +0,0 @@
|
||||
#!/usr/bin/env python3
|
||||
"""Hermes Agent Release Script
|
||||
|
||||
Generates changelogs and creates GitHub releases with CalVer tags.
|
||||
|
||||
Usage:
|
||||
# Preview changelog (dry run)
|
||||
python scripts/release.py
|
||||
|
||||
# Preview with semver bump
|
||||
python scripts/release.py --bump minor
|
||||
|
||||
# Create the release
|
||||
python scripts/release.py --bump minor --publish
|
||||
|
||||
# First release (no previous tag)
|
||||
python scripts/release.py --bump minor --publish --first-release
|
||||
|
||||
# Override CalVer date (e.g. for a belated release)
|
||||
python scripts/release.py --bump minor --publish --date 2026.3.15
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import os
|
||||
import re
|
||||
import subprocess
|
||||
import sys
|
||||
from collections import defaultdict
|
||||
from datetime import datetime
|
||||
from pathlib import Path
|
||||
|
||||
REPO_ROOT = Path(__file__).resolve().parent.parent
|
||||
VERSION_FILE = REPO_ROOT / "hermes_cli" / "__init__.py"
|
||||
PYPROJECT_FILE = REPO_ROOT / "pyproject.toml"
|
||||
|
||||
# ──────────────────────────────────────────────────────────────────────
|
||||
# Git email → GitHub username mapping
|
||||
# ──────────────────────────────────────────────────────────────────────
|
||||
|
||||
# Auto-extracted from noreply emails + manual overrides
|
||||
AUTHOR_MAP = {
|
||||
# teknium (multiple emails)
|
||||
"teknium1@gmail.com": "teknium1",
|
||||
"teknium@nousresearch.com": "teknium1",
|
||||
"127238744+teknium1@users.noreply.github.com": "teknium1",
|
||||
# contributors (from noreply pattern)
|
||||
"35742124+0xbyt4@users.noreply.github.com": "0xbyt4",
|
||||
"82637225+kshitijk4poor@users.noreply.github.com": "kshitijk4poor",
|
||||
"16443023+stablegenius49@users.noreply.github.com": "stablegenius49",
|
||||
"185121704+stablegenius49@users.noreply.github.com": "stablegenius49",
|
||||
"101283333+batuhankocyigit@users.noreply.github.com": "batuhankocyigit",
|
||||
"126368201+vilkasdev@users.noreply.github.com": "vilkasdev",
|
||||
"137614867+cutepawss@users.noreply.github.com": "cutepawss",
|
||||
"96793918+memosr@users.noreply.github.com": "memosr",
|
||||
"131039422+SHL0MS@users.noreply.github.com": "SHL0MS",
|
||||
"77628552+raulvidis@users.noreply.github.com": "raulvidis",
|
||||
"145567217+Aum08Desai@users.noreply.github.com": "Aum08Desai",
|
||||
"256820943+kshitij-eliza@users.noreply.github.com": "kshitij-eliza",
|
||||
"44278268+shitcoinsherpa@users.noreply.github.com": "shitcoinsherpa",
|
||||
"104278804+Sertug17@users.noreply.github.com": "Sertug17",
|
||||
"112503481+caentzminger@users.noreply.github.com": "caentzminger",
|
||||
"258577966+voidborne-d@users.noreply.github.com": "voidborne-d",
|
||||
"70424851+insecurejezza@users.noreply.github.com": "insecurejezza",
|
||||
"259807879+Bartok9@users.noreply.github.com": "Bartok9",
|
||||
# contributors (manual mapping from git names)
|
||||
"dmayhem93@gmail.com": "dmahan93",
|
||||
"samherring99@gmail.com": "samherring99",
|
||||
"desaiaum08@gmail.com": "Aum08Desai",
|
||||
"shannon.sands.1979@gmail.com": "shannonsands",
|
||||
"shannon@nousresearch.com": "shannonsands",
|
||||
"eri@plasticlabs.ai": "Erosika",
|
||||
"hjcpuro@gmail.com": "hjc-puro",
|
||||
"xaydinoktay@gmail.com": "aydnOktay",
|
||||
"abdullahfarukozden@gmail.com": "Farukest",
|
||||
"lovre.pesut@gmail.com": "rovle",
|
||||
"hakanerten02@hotmail.com": "teyrebaz33",
|
||||
"alireza78.crypto@gmail.com": "alireza78a",
|
||||
"brooklyn.bb.nicholson@gmail.com": "brooklynnicholson",
|
||||
"gpickett00@gmail.com": "gpickett00",
|
||||
"mcosma@gmail.com": "wakamex",
|
||||
"clawdia.nash@proton.me": "clawdia-nash",
|
||||
"pickett.austin@gmail.com": "austinpickett",
|
||||
"jaisehgal11299@gmail.com": "jaisup",
|
||||
"percydikec@gmail.com": "PercyDikec",
|
||||
"dean.kerr@gmail.com": "deankerr",
|
||||
"socrates1024@gmail.com": "socrates1024",
|
||||
"satelerd@gmail.com": "satelerd",
|
||||
"numman.ali@gmail.com": "nummanali",
|
||||
"0xNyk@users.noreply.github.com": "0xNyk",
|
||||
"0xnykcd@googlemail.com": "0xNyk",
|
||||
"buraysandro9@gmail.com": "buray",
|
||||
"contact@jomar.fr": "joshmartinelle",
|
||||
"camilo@tekelala.com": "tekelala",
|
||||
"vincentcharlebois@gmail.com": "vincentcharlebois",
|
||||
"aryan@synvoid.com": "aryansingh",
|
||||
"johnsonblake1@gmail.com": "blakejohnson",
|
||||
"bryan@intertwinesys.com": "bryanyoung",
|
||||
"christo.mitov@gmail.com": "christomitov",
|
||||
"hermes@nousresearch.com": "NousResearch",
|
||||
"openclaw@sparklab.ai": "openclaw",
|
||||
"semihcvlk53@gmail.com": "Himess",
|
||||
"erenkar950@gmail.com": "erenkarakus",
|
||||
"adavyasharma@gmail.com": "adavyas",
|
||||
"acaayush1111@gmail.com": "aayushchaudhary",
|
||||
"jason@outland.art": "jasonoutland",
|
||||
"mrflu1918@proton.me": "SPANISHFLU",
|
||||
"morganemoss@gmai.com": "mormio",
|
||||
"kopjop926@gmail.com": "cesareth",
|
||||
"fuleinist@gmail.com": "fuleinist",
|
||||
"jack.47@gmail.com": "JackTheGit",
|
||||
"dalvidjr2022@gmail.com": "Jr-kenny",
|
||||
"m@statecraft.systems": "mbierling",
|
||||
"balyan.sid@gmail.com": "balyansid",
|
||||
}
|
||||
|
||||
|
||||
def git(*args, cwd=None):
|
||||
"""Run a git command and return stdout."""
|
||||
result = subprocess.run(
|
||||
["git"] + list(args),
|
||||
capture_output=True, text=True,
|
||||
cwd=cwd or str(REPO_ROOT),
|
||||
)
|
||||
if result.returncode != 0:
|
||||
print(f"git {' '.join(args)} failed: {result.stderr}", file=sys.stderr)
|
||||
return ""
|
||||
return result.stdout.strip()
|
||||
|
||||
|
||||
def get_last_tag():
|
||||
"""Get the most recent CalVer tag."""
|
||||
tags = git("tag", "--list", "v20*", "--sort=-v:refname")
|
||||
if tags:
|
||||
return tags.split("\n")[0]
|
||||
return None
|
||||
|
||||
|
||||
def get_current_version():
|
||||
"""Read current semver from __init__.py."""
|
||||
content = VERSION_FILE.read_text()
|
||||
match = re.search(r'__version__\s*=\s*"([^"]+)"', content)
|
||||
return match.group(1) if match else "0.0.0"
|
||||
|
||||
|
||||
def bump_version(current: str, part: str) -> str:
|
||||
"""Bump a semver version string."""
|
||||
parts = current.split(".")
|
||||
if len(parts) != 3:
|
||||
parts = ["0", "0", "0"]
|
||||
major, minor, patch = int(parts[0]), int(parts[1]), int(parts[2])
|
||||
|
||||
if part == "major":
|
||||
major += 1
|
||||
minor = 0
|
||||
patch = 0
|
||||
elif part == "minor":
|
||||
minor += 1
|
||||
patch = 0
|
||||
elif part == "patch":
|
||||
patch += 1
|
||||
else:
|
||||
raise ValueError(f"Unknown bump part: {part}")
|
||||
|
||||
return f"{major}.{minor}.{patch}"
|
||||
|
||||
|
||||
def update_version_files(semver: str, calver_date: str):
|
||||
"""Update version strings in source files."""
|
||||
# Update __init__.py
|
||||
content = VERSION_FILE.read_text()
|
||||
content = re.sub(
|
||||
r'__version__\s*=\s*"[^"]+"',
|
||||
f'__version__ = "{semver}"',
|
||||
content,
|
||||
)
|
||||
content = re.sub(
|
||||
r'__release_date__\s*=\s*"[^"]+"',
|
||||
f'__release_date__ = "{calver_date}"',
|
||||
content,
|
||||
)
|
||||
VERSION_FILE.write_text(content)
|
||||
|
||||
# Update pyproject.toml
|
||||
pyproject = PYPROJECT_FILE.read_text()
|
||||
pyproject = re.sub(
|
||||
r'^version\s*=\s*"[^"]+"',
|
||||
f'version = "{semver}"',
|
||||
pyproject,
|
||||
flags=re.MULTILINE,
|
||||
)
|
||||
PYPROJECT_FILE.write_text(pyproject)
|
||||
|
||||
|
||||
def resolve_author(name: str, email: str) -> str:
|
||||
"""Resolve a git author to a GitHub @mention."""
|
||||
# Try email lookup first
|
||||
gh_user = AUTHOR_MAP.get(email)
|
||||
if gh_user:
|
||||
return f"@{gh_user}"
|
||||
|
||||
# Try noreply pattern
|
||||
noreply_match = re.match(r"(\d+)\+(.+)@users\.noreply\.github\.com", email)
|
||||
if noreply_match:
|
||||
return f"@{noreply_match.group(2)}"
|
||||
|
||||
# Try username@users.noreply.github.com
|
||||
noreply_match2 = re.match(r"(.+)@users\.noreply\.github\.com", email)
|
||||
if noreply_match2:
|
||||
return f"@{noreply_match2.group(1)}"
|
||||
|
||||
# Fallback to git name
|
||||
return name
|
||||
|
||||
|
||||
def categorize_commit(subject: str) -> str:
|
||||
"""Categorize a commit by its conventional commit prefix."""
|
||||
subject_lower = subject.lower()
|
||||
|
||||
# Match conventional commit patterns
|
||||
patterns = {
|
||||
"breaking": [r"^breaking[\s:(]", r"^!:", r"BREAKING CHANGE"],
|
||||
"features": [r"^feat[\s:(]", r"^feature[\s:(]", r"^add[\s:(]"],
|
||||
"fixes": [r"^fix[\s:(]", r"^bugfix[\s:(]", r"^bug[\s:(]", r"^hotfix[\s:(]"],
|
||||
"improvements": [r"^improve[\s:(]", r"^perf[\s:(]", r"^enhance[\s:(]",
|
||||
r"^refactor[\s:(]", r"^cleanup[\s:(]", r"^clean[\s:(]",
|
||||
r"^update[\s:(]", r"^optimize[\s:(]"],
|
||||
"docs": [r"^doc[\s:(]", r"^docs[\s:(]"],
|
||||
"tests": [r"^test[\s:(]", r"^tests[\s:(]"],
|
||||
"chore": [r"^chore[\s:(]", r"^ci[\s:(]", r"^build[\s:(]",
|
||||
r"^deps[\s:(]", r"^bump[\s:(]"],
|
||||
}
|
||||
|
||||
for category, regexes in patterns.items():
|
||||
for regex in regexes:
|
||||
if re.match(regex, subject_lower):
|
||||
return category
|
||||
|
||||
# Heuristic fallbacks
|
||||
if any(w in subject_lower for w in ["add ", "new ", "implement", "support "]):
|
||||
return "features"
|
||||
if any(w in subject_lower for w in ["fix ", "fixed ", "resolve", "patch "]):
|
||||
return "fixes"
|
||||
if any(w in subject_lower for w in ["refactor", "cleanup", "improve", "update "]):
|
||||
return "improvements"
|
||||
|
||||
return "other"
|
||||
|
||||
|
||||
def clean_subject(subject: str) -> str:
|
||||
"""Clean up a commit subject for display."""
|
||||
# Remove conventional commit prefix
|
||||
cleaned = re.sub(r"^(feat|fix|docs|chore|refactor|test|perf|ci|build|improve|add|update|cleanup|hotfix|breaking|enhance|optimize|bugfix|bug|feature|tests|deps|bump)[\s:(!]+\s*", "", subject, flags=re.IGNORECASE)
|
||||
# Remove trailing issue refs that are redundant with PR links
|
||||
cleaned = cleaned.strip()
|
||||
# Capitalize first letter
|
||||
if cleaned:
|
||||
cleaned = cleaned[0].upper() + cleaned[1:]
|
||||
return cleaned
|
||||
|
||||
|
||||
def get_commits(since_tag=None):
|
||||
"""Get commits since a tag (or all commits if None)."""
|
||||
if since_tag:
|
||||
range_spec = f"{since_tag}..HEAD"
|
||||
else:
|
||||
range_spec = "HEAD"
|
||||
|
||||
# Format: hash|author_name|author_email|subject
|
||||
log = git(
|
||||
"log", range_spec,
|
||||
"--format=%H|%an|%ae|%s",
|
||||
"--no-merges",
|
||||
)
|
||||
|
||||
if not log:
|
||||
return []
|
||||
|
||||
commits = []
|
||||
for line in log.split("\n"):
|
||||
if not line.strip():
|
||||
continue
|
||||
parts = line.split("|", 3)
|
||||
if len(parts) != 4:
|
||||
continue
|
||||
sha, name, email, subject = parts
|
||||
commits.append({
|
||||
"sha": sha,
|
||||
"short_sha": sha[:8],
|
||||
"author_name": name,
|
||||
"author_email": email,
|
||||
"subject": subject,
|
||||
"category": categorize_commit(subject),
|
||||
"github_author": resolve_author(name, email),
|
||||
})
|
||||
|
||||
return commits
|
||||
|
||||
|
||||
def get_pr_number(subject: str) -> str:
|
||||
"""Extract PR number from commit subject if present."""
|
||||
match = re.search(r"#(\d+)", subject)
|
||||
if match:
|
||||
return match.group(1)
|
||||
return None
|
||||
|
||||
|
||||
def generate_changelog(commits, tag_name, semver, repo_url="https://github.com/NousResearch/hermes-agent",
|
||||
prev_tag=None, first_release=False):
|
||||
"""Generate markdown changelog from categorized commits."""
|
||||
lines = []
|
||||
|
||||
# Header
|
||||
now = datetime.now()
|
||||
date_str = now.strftime("%B %d, %Y")
|
||||
lines.append(f"# Hermes Agent v{semver} ({tag_name})")
|
||||
lines.append("")
|
||||
lines.append(f"**Release Date:** {date_str}")
|
||||
lines.append("")
|
||||
|
||||
if first_release:
|
||||
lines.append("> 🎉 **First official release!** This marks the beginning of regular weekly releases")
|
||||
lines.append("> for Hermes Agent. See below for everything included in this initial release.")
|
||||
lines.append("")
|
||||
|
||||
# Group commits by category
|
||||
categories = defaultdict(list)
|
||||
all_authors = set()
|
||||
teknium_aliases = {"@teknium1"}
|
||||
|
||||
for commit in commits:
|
||||
categories[commit["category"]].append(commit)
|
||||
author = commit["github_author"]
|
||||
if author not in teknium_aliases:
|
||||
all_authors.add(author)
|
||||
|
||||
# Category display order and emoji
|
||||
category_order = [
|
||||
("breaking", "⚠️ Breaking Changes"),
|
||||
("features", "✨ Features"),
|
||||
("improvements", "🔧 Improvements"),
|
||||
("fixes", "🐛 Bug Fixes"),
|
||||
("docs", "📚 Documentation"),
|
||||
("tests", "🧪 Tests"),
|
||||
("chore", "🏗️ Infrastructure"),
|
||||
("other", "📦 Other Changes"),
|
||||
]
|
||||
|
||||
for cat_key, cat_title in category_order:
|
||||
cat_commits = categories.get(cat_key, [])
|
||||
if not cat_commits:
|
||||
continue
|
||||
|
||||
lines.append(f"## {cat_title}")
|
||||
lines.append("")
|
||||
|
||||
for commit in cat_commits:
|
||||
subject = clean_subject(commit["subject"])
|
||||
pr_num = get_pr_number(commit["subject"])
|
||||
author = commit["github_author"]
|
||||
|
||||
# Build the line
|
||||
parts = [f"- {subject}"]
|
||||
if pr_num:
|
||||
parts.append(f"([#{pr_num}]({repo_url}/pull/{pr_num}))")
|
||||
else:
|
||||
parts.append(f"([`{commit['short_sha']}`]({repo_url}/commit/{commit['sha']}))")
|
||||
|
||||
if author not in teknium_aliases:
|
||||
parts.append(f"— {author}")
|
||||
|
||||
lines.append(" ".join(parts))
|
||||
|
||||
lines.append("")
|
||||
|
||||
# Contributors section
|
||||
if all_authors:
|
||||
# Sort contributors by commit count
|
||||
author_counts = defaultdict(int)
|
||||
for commit in commits:
|
||||
author = commit["github_author"]
|
||||
if author not in teknium_aliases:
|
||||
author_counts[author] += 1
|
||||
|
||||
sorted_authors = sorted(author_counts.items(), key=lambda x: -x[1])
|
||||
|
||||
lines.append("## 👥 Contributors")
|
||||
lines.append("")
|
||||
lines.append("Thank you to everyone who contributed to this release!")
|
||||
lines.append("")
|
||||
for author, count in sorted_authors:
|
||||
commit_word = "commit" if count == 1 else "commits"
|
||||
lines.append(f"- {author} ({count} {commit_word})")
|
||||
lines.append("")
|
||||
|
||||
# Full changelog link
|
||||
if prev_tag:
|
||||
lines.append(f"**Full Changelog**: [{prev_tag}...{tag_name}]({repo_url}/compare/{prev_tag}...{tag_name})")
|
||||
else:
|
||||
lines.append(f"**Full Changelog**: [{tag_name}]({repo_url}/commits/{tag_name})")
|
||||
lines.append("")
|
||||
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser(description="Hermes Agent Release Tool")
|
||||
parser.add_argument("--bump", choices=["major", "minor", "patch"],
|
||||
help="Which semver component to bump")
|
||||
parser.add_argument("--publish", action="store_true",
|
||||
help="Actually create the tag and GitHub release (otherwise dry run)")
|
||||
parser.add_argument("--date", type=str,
|
||||
help="Override CalVer date (format: YYYY.M.D)")
|
||||
parser.add_argument("--first-release", action="store_true",
|
||||
help="Mark as first release (no previous tag expected)")
|
||||
parser.add_argument("--output", type=str,
|
||||
help="Write changelog to file instead of stdout")
|
||||
args = parser.parse_args()
|
||||
|
||||
# Determine CalVer date
|
||||
if args.date:
|
||||
calver_date = args.date
|
||||
else:
|
||||
now = datetime.now()
|
||||
calver_date = f"{now.year}.{now.month}.{now.day}"
|
||||
|
||||
tag_name = f"v{calver_date}"
|
||||
|
||||
# Check for existing tag with same date
|
||||
existing = git("tag", "--list", tag_name)
|
||||
if existing and not args.publish:
|
||||
# Append a suffix for same-day releases
|
||||
suffix = 2
|
||||
while git("tag", "--list", f"{tag_name}.{suffix}"):
|
||||
suffix += 1
|
||||
tag_name = f"{tag_name}.{suffix}"
|
||||
calver_date = f"{calver_date}.{suffix}"
|
||||
print(f"Note: Tag {tag_name[:-2]} already exists, using {tag_name}")
|
||||
|
||||
# Determine semver
|
||||
current_version = get_current_version()
|
||||
if args.bump:
|
||||
new_version = bump_version(current_version, args.bump)
|
||||
else:
|
||||
new_version = current_version
|
||||
|
||||
# Get previous tag
|
||||
prev_tag = get_last_tag()
|
||||
if not prev_tag and not args.first_release:
|
||||
print("No previous tags found. Use --first-release for the initial release.")
|
||||
print(f"Would create tag: {tag_name}")
|
||||
print(f"Would set version: {new_version}")
|
||||
|
||||
# Get commits
|
||||
commits = get_commits(since_tag=prev_tag)
|
||||
if not commits:
|
||||
print("No new commits since last tag.")
|
||||
if not args.first_release:
|
||||
return
|
||||
|
||||
print(f"{'='*60}")
|
||||
print(f" Hermes Agent Release Preview")
|
||||
print(f"{'='*60}")
|
||||
print(f" CalVer tag: {tag_name}")
|
||||
print(f" SemVer: v{current_version} → v{new_version}")
|
||||
print(f" Previous tag: {prev_tag or '(none — first release)'}")
|
||||
print(f" Commits: {len(commits)}")
|
||||
print(f" Unique authors: {len(set(c['github_author'] for c in commits))}")
|
||||
print(f" Mode: {'PUBLISH' if args.publish else 'DRY RUN'}")
|
||||
print(f"{'='*60}")
|
||||
print()
|
||||
|
||||
# Generate changelog
|
||||
changelog = generate_changelog(
|
||||
commits, tag_name, new_version,
|
||||
prev_tag=prev_tag,
|
||||
first_release=args.first_release,
|
||||
)
|
||||
|
||||
if args.output:
|
||||
Path(args.output).write_text(changelog)
|
||||
print(f"Changelog written to {args.output}")
|
||||
else:
|
||||
print(changelog)
|
||||
|
||||
if args.publish:
|
||||
print(f"\n{'='*60}")
|
||||
print(" Publishing release...")
|
||||
print(f"{'='*60}")
|
||||
|
||||
# Update version files
|
||||
if args.bump:
|
||||
update_version_files(new_version, calver_date)
|
||||
print(f" ✓ Updated version files to v{new_version} ({calver_date})")
|
||||
|
||||
# Commit version bump
|
||||
git("add", str(VERSION_FILE), str(PYPROJECT_FILE))
|
||||
git("commit", "-m", f"chore: bump version to v{new_version} ({calver_date})")
|
||||
print(f" ✓ Committed version bump")
|
||||
|
||||
# Create annotated tag
|
||||
git("tag", "-a", tag_name, "-m",
|
||||
f"Hermes Agent v{new_version} ({calver_date})\n\nWeekly release")
|
||||
print(f" ✓ Created tag {tag_name}")
|
||||
|
||||
# Push
|
||||
push_result = git("push", "origin", "HEAD", "--tags")
|
||||
print(f" ✓ Pushed to origin")
|
||||
|
||||
# Create GitHub release
|
||||
changelog_file = REPO_ROOT / ".release_notes.md"
|
||||
changelog_file.write_text(changelog)
|
||||
|
||||
result = subprocess.run(
|
||||
["gh", "release", "create", tag_name,
|
||||
"--title", f"Hermes Agent v{new_version} ({calver_date})",
|
||||
"--notes-file", str(changelog_file)],
|
||||
capture_output=True, text=True,
|
||||
cwd=str(REPO_ROOT),
|
||||
)
|
||||
|
||||
changelog_file.unlink(missing_ok=True)
|
||||
|
||||
if result.returncode == 0:
|
||||
print(f" ✓ GitHub release created: {result.stdout.strip()}")
|
||||
else:
|
||||
print(f" ✗ GitHub release failed: {result.stderr}")
|
||||
print(f" Tag was created. Create the release manually:")
|
||||
print(f" gh release create {tag_name} --title 'Hermes Agent v{new_version} ({calver_date})'")
|
||||
|
||||
print(f"\n 🎉 Release v{new_version} ({tag_name}) published!")
|
||||
else:
|
||||
print(f"\n{'='*60}")
|
||||
print(f" Dry run complete. To publish, add --publish")
|
||||
print(f" Example: python scripts/release.py --bump minor --publish")
|
||||
print(f"{'='*60}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -8,7 +8,6 @@ metadata:
|
||||
hermes:
|
||||
tags: [search, duckduckgo, web-search, free, fallback]
|
||||
related_skills: [arxiv]
|
||||
fallback_for_toolsets: [web]
|
||||
---
|
||||
|
||||
# DuckDuckGo Search
|
||||
|
||||
@@ -9,7 +9,8 @@ from agent.context_compressor import ContextCompressor
|
||||
@pytest.fixture()
|
||||
def compressor():
|
||||
"""Create a ContextCompressor with mocked dependencies."""
|
||||
with patch("agent.context_compressor.get_model_context_length", return_value=100000):
|
||||
with patch("agent.context_compressor.get_model_context_length", return_value=100000), \
|
||||
patch("agent.context_compressor.get_text_auxiliary_client", return_value=(None, None)):
|
||||
c = ContextCompressor(
|
||||
model="test/model",
|
||||
threshold_percent=0.85,
|
||||
@@ -118,11 +119,14 @@ class TestGenerateSummaryNoneContent:
|
||||
"""Regression: content=None (from tool-call-only assistant messages) must not crash."""
|
||||
|
||||
def test_none_content_does_not_crash(self):
|
||||
mock_client = MagicMock()
|
||||
mock_response = MagicMock()
|
||||
mock_response.choices = [MagicMock()]
|
||||
mock_response.choices[0].message.content = "[CONTEXT SUMMARY]: tool calls happened"
|
||||
mock_client.chat.completions.create.return_value = mock_response
|
||||
|
||||
with patch("agent.context_compressor.get_model_context_length", return_value=100000):
|
||||
with patch("agent.context_compressor.get_model_context_length", return_value=100000), \
|
||||
patch("agent.context_compressor.get_text_auxiliary_client", return_value=(mock_client, "test-model")):
|
||||
c = ContextCompressor(model="test", quiet_mode=True)
|
||||
|
||||
messages = [
|
||||
@@ -135,14 +139,14 @@ class TestGenerateSummaryNoneContent:
|
||||
{"role": "user", "content": "thanks"},
|
||||
]
|
||||
|
||||
with patch("agent.context_compressor.call_llm", return_value=mock_response):
|
||||
summary = c._generate_summary(messages)
|
||||
summary = c._generate_summary(messages)
|
||||
assert isinstance(summary, str)
|
||||
assert "CONTEXT SUMMARY" in summary
|
||||
|
||||
def test_none_content_in_system_message_compress(self):
|
||||
"""System message with content=None should not crash during compress."""
|
||||
with patch("agent.context_compressor.get_model_context_length", return_value=100000):
|
||||
with patch("agent.context_compressor.get_model_context_length", return_value=100000), \
|
||||
patch("agent.context_compressor.get_text_auxiliary_client", return_value=(None, None)):
|
||||
c = ContextCompressor(model="test", quiet_mode=True, protect_first_n=2, protect_last_n=2)
|
||||
|
||||
msgs = [{"role": "system", "content": None}] + [
|
||||
@@ -161,12 +165,12 @@ class TestCompressWithClient:
|
||||
mock_response.choices[0].message.content = "[CONTEXT SUMMARY]: stuff happened"
|
||||
mock_client.chat.completions.create.return_value = mock_response
|
||||
|
||||
with patch("agent.context_compressor.get_model_context_length", return_value=100000):
|
||||
with patch("agent.context_compressor.get_model_context_length", return_value=100000), \
|
||||
patch("agent.context_compressor.get_text_auxiliary_client", return_value=(mock_client, "test-model")):
|
||||
c = ContextCompressor(model="test", quiet_mode=True)
|
||||
|
||||
msgs = [{"role": "user" if i % 2 == 0 else "assistant", "content": f"msg {i}"} for i in range(10)]
|
||||
with patch("agent.context_compressor.call_llm", return_value=mock_response):
|
||||
result = c.compress(msgs)
|
||||
result = c.compress(msgs)
|
||||
|
||||
# Should have summary message in the middle
|
||||
contents = [m.get("content", "") for m in result]
|
||||
@@ -180,7 +184,8 @@ class TestCompressWithClient:
|
||||
mock_response.choices[0].message.content = "[CONTEXT SUMMARY]: compressed middle"
|
||||
mock_client.chat.completions.create.return_value = mock_response
|
||||
|
||||
with patch("agent.context_compressor.get_model_context_length", return_value=100000):
|
||||
with patch("agent.context_compressor.get_model_context_length", return_value=100000), \
|
||||
patch("agent.context_compressor.get_text_auxiliary_client", return_value=(mock_client, "test-model")):
|
||||
c = ContextCompressor(
|
||||
model="test",
|
||||
quiet_mode=True,
|
||||
@@ -207,8 +212,7 @@ class TestCompressWithClient:
|
||||
{"role": "user", "content": "later 4"},
|
||||
]
|
||||
|
||||
with patch("agent.context_compressor.call_llm", return_value=mock_response):
|
||||
result = c.compress(msgs)
|
||||
result = c.compress(msgs)
|
||||
|
||||
answered_ids = {
|
||||
msg.get("tool_call_id")
|
||||
@@ -228,7 +232,8 @@ class TestCompressWithClient:
|
||||
mock_response.choices[0].message.content = "[CONTEXT SUMMARY]: stuff happened"
|
||||
mock_client.chat.completions.create.return_value = mock_response
|
||||
|
||||
with patch("agent.context_compressor.get_model_context_length", return_value=100000):
|
||||
with patch("agent.context_compressor.get_model_context_length", return_value=100000), \
|
||||
patch("agent.context_compressor.get_text_auxiliary_client", return_value=(mock_client, "test-model")):
|
||||
c = ContextCompressor(model="test", quiet_mode=True, protect_first_n=2, protect_last_n=2)
|
||||
|
||||
# Last head message (index 1) is "assistant" → summary should be "user"
|
||||
@@ -240,8 +245,7 @@ class TestCompressWithClient:
|
||||
{"role": "user", "content": "msg 4"},
|
||||
{"role": "assistant", "content": "msg 5"},
|
||||
]
|
||||
with patch("agent.context_compressor.call_llm", return_value=mock_response):
|
||||
result = c.compress(msgs)
|
||||
result = c.compress(msgs)
|
||||
summary_msg = [m for m in result if "CONTEXT SUMMARY" in (m.get("content") or "")]
|
||||
assert len(summary_msg) == 1
|
||||
assert summary_msg[0]["role"] == "user"
|
||||
@@ -254,7 +258,8 @@ class TestCompressWithClient:
|
||||
mock_response.choices[0].message.content = "[CONTEXT SUMMARY]: stuff happened"
|
||||
mock_client.chat.completions.create.return_value = mock_response
|
||||
|
||||
with patch("agent.context_compressor.get_model_context_length", return_value=100000):
|
||||
with patch("agent.context_compressor.get_model_context_length", return_value=100000), \
|
||||
patch("agent.context_compressor.get_text_auxiliary_client", return_value=(mock_client, "test-model")):
|
||||
c = ContextCompressor(model="test", quiet_mode=True, protect_first_n=3, protect_last_n=2)
|
||||
|
||||
# Last head message (index 2) is "user" → summary should be "assistant"
|
||||
@@ -268,18 +273,20 @@ class TestCompressWithClient:
|
||||
{"role": "user", "content": "msg 6"},
|
||||
{"role": "assistant", "content": "msg 7"},
|
||||
]
|
||||
with patch("agent.context_compressor.call_llm", return_value=mock_response):
|
||||
result = c.compress(msgs)
|
||||
result = c.compress(msgs)
|
||||
summary_msg = [m for m in result if "CONTEXT SUMMARY" in (m.get("content") or "")]
|
||||
assert len(summary_msg) == 1
|
||||
assert summary_msg[0]["role"] == "assistant"
|
||||
|
||||
def test_summarization_does_not_start_tail_with_tool_outputs(self):
|
||||
mock_client = MagicMock()
|
||||
mock_response = MagicMock()
|
||||
mock_response.choices = [MagicMock()]
|
||||
mock_response.choices[0].message.content = "[CONTEXT SUMMARY]: compressed middle"
|
||||
mock_client.chat.completions.create.return_value = mock_response
|
||||
|
||||
with patch("agent.context_compressor.get_model_context_length", return_value=100000):
|
||||
with patch("agent.context_compressor.get_model_context_length", return_value=100000), \
|
||||
patch("agent.context_compressor.get_text_auxiliary_client", return_value=(mock_client, "test-model")):
|
||||
c = ContextCompressor(
|
||||
model="test",
|
||||
quiet_mode=True,
|
||||
@@ -302,8 +309,7 @@ class TestCompressWithClient:
|
||||
{"role": "user", "content": "latest user"},
|
||||
]
|
||||
|
||||
with patch("agent.context_compressor.call_llm", return_value=mock_response):
|
||||
result = c.compress(msgs)
|
||||
result = c.compress(msgs)
|
||||
|
||||
called_ids = {
|
||||
tc["id"]
|
||||
|
||||
@@ -8,8 +8,6 @@ from agent.prompt_builder import (
|
||||
_scan_context_content,
|
||||
_truncate_content,
|
||||
_read_skill_description,
|
||||
_read_skill_conditions,
|
||||
_skill_should_show,
|
||||
build_skills_system_prompt,
|
||||
build_context_files_prompt,
|
||||
CONTEXT_FILE_MAX_CHARS,
|
||||
@@ -279,177 +277,3 @@ class TestPromptBuilderConstants:
|
||||
assert "telegram" in PLATFORM_HINTS
|
||||
assert "discord" in PLATFORM_HINTS
|
||||
assert "cli" in PLATFORM_HINTS
|
||||
|
||||
|
||||
# =========================================================================
|
||||
# Conditional skill activation
|
||||
# =========================================================================
|
||||
|
||||
class TestReadSkillConditions:
|
||||
def test_no_conditions_returns_empty_lists(self, tmp_path):
|
||||
skill_file = tmp_path / "SKILL.md"
|
||||
skill_file.write_text("---\nname: test\ndescription: A skill\n---\n")
|
||||
conditions = _read_skill_conditions(skill_file)
|
||||
assert conditions["fallback_for_toolsets"] == []
|
||||
assert conditions["requires_toolsets"] == []
|
||||
assert conditions["fallback_for_tools"] == []
|
||||
assert conditions["requires_tools"] == []
|
||||
|
||||
def test_reads_fallback_for_toolsets(self, tmp_path):
|
||||
skill_file = tmp_path / "SKILL.md"
|
||||
skill_file.write_text(
|
||||
"---\nname: ddg\ndescription: DuckDuckGo\nmetadata:\n hermes:\n fallback_for_toolsets: [web]\n---\n"
|
||||
)
|
||||
conditions = _read_skill_conditions(skill_file)
|
||||
assert conditions["fallback_for_toolsets"] == ["web"]
|
||||
|
||||
def test_reads_requires_toolsets(self, tmp_path):
|
||||
skill_file = tmp_path / "SKILL.md"
|
||||
skill_file.write_text(
|
||||
"---\nname: openhue\ndescription: Hue lights\nmetadata:\n hermes:\n requires_toolsets: [terminal]\n---\n"
|
||||
)
|
||||
conditions = _read_skill_conditions(skill_file)
|
||||
assert conditions["requires_toolsets"] == ["terminal"]
|
||||
|
||||
def test_reads_multiple_conditions(self, tmp_path):
|
||||
skill_file = tmp_path / "SKILL.md"
|
||||
skill_file.write_text(
|
||||
"---\nname: test\ndescription: Test\nmetadata:\n hermes:\n fallback_for_toolsets: [browser]\n requires_tools: [terminal]\n---\n"
|
||||
)
|
||||
conditions = _read_skill_conditions(skill_file)
|
||||
assert conditions["fallback_for_toolsets"] == ["browser"]
|
||||
assert conditions["requires_tools"] == ["terminal"]
|
||||
|
||||
def test_missing_file_returns_empty(self, tmp_path):
|
||||
conditions = _read_skill_conditions(tmp_path / "missing.md")
|
||||
assert conditions == {}
|
||||
|
||||
|
||||
class TestSkillShouldShow:
|
||||
def test_no_filter_info_always_shows(self):
|
||||
assert _skill_should_show({}, None, None) is True
|
||||
|
||||
def test_empty_conditions_always_shows(self):
|
||||
assert _skill_should_show(
|
||||
{"fallback_for_toolsets": [], "requires_toolsets": [],
|
||||
"fallback_for_tools": [], "requires_tools": []},
|
||||
{"web_search"}, {"web"}
|
||||
) is True
|
||||
|
||||
def test_fallback_hidden_when_toolset_available(self):
|
||||
conditions = {"fallback_for_toolsets": ["web"], "requires_toolsets": [],
|
||||
"fallback_for_tools": [], "requires_tools": []}
|
||||
assert _skill_should_show(conditions, set(), {"web"}) is False
|
||||
|
||||
def test_fallback_shown_when_toolset_unavailable(self):
|
||||
conditions = {"fallback_for_toolsets": ["web"], "requires_toolsets": [],
|
||||
"fallback_for_tools": [], "requires_tools": []}
|
||||
assert _skill_should_show(conditions, set(), set()) is True
|
||||
|
||||
def test_requires_shown_when_toolset_available(self):
|
||||
conditions = {"fallback_for_toolsets": [], "requires_toolsets": ["terminal"],
|
||||
"fallback_for_tools": [], "requires_tools": []}
|
||||
assert _skill_should_show(conditions, set(), {"terminal"}) is True
|
||||
|
||||
def test_requires_hidden_when_toolset_missing(self):
|
||||
conditions = {"fallback_for_toolsets": [], "requires_toolsets": ["terminal"],
|
||||
"fallback_for_tools": [], "requires_tools": []}
|
||||
assert _skill_should_show(conditions, set(), set()) is False
|
||||
|
||||
def test_fallback_for_tools_hidden_when_tool_available(self):
|
||||
conditions = {"fallback_for_toolsets": [], "requires_toolsets": [],
|
||||
"fallback_for_tools": ["web_search"], "requires_tools": []}
|
||||
assert _skill_should_show(conditions, {"web_search"}, set()) is False
|
||||
|
||||
def test_fallback_for_tools_shown_when_tool_missing(self):
|
||||
conditions = {"fallback_for_toolsets": [], "requires_toolsets": [],
|
||||
"fallback_for_tools": ["web_search"], "requires_tools": []}
|
||||
assert _skill_should_show(conditions, set(), set()) is True
|
||||
|
||||
def test_requires_tools_hidden_when_tool_missing(self):
|
||||
conditions = {"fallback_for_toolsets": [], "requires_toolsets": [],
|
||||
"fallback_for_tools": [], "requires_tools": ["terminal"]}
|
||||
assert _skill_should_show(conditions, set(), set()) is False
|
||||
|
||||
def test_requires_tools_shown_when_tool_available(self):
|
||||
conditions = {"fallback_for_toolsets": [], "requires_toolsets": [],
|
||||
"fallback_for_tools": [], "requires_tools": ["terminal"]}
|
||||
assert _skill_should_show(conditions, {"terminal"}, set()) is True
|
||||
|
||||
|
||||
class TestBuildSkillsSystemPromptConditional:
|
||||
def test_fallback_skill_hidden_when_primary_available(self, monkeypatch, tmp_path):
|
||||
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
|
||||
skill_dir = tmp_path / "skills" / "search" / "duckduckgo"
|
||||
skill_dir.mkdir(parents=True)
|
||||
(skill_dir / "SKILL.md").write_text(
|
||||
"---\nname: duckduckgo\ndescription: Free web search\nmetadata:\n hermes:\n fallback_for_toolsets: [web]\n---\n"
|
||||
)
|
||||
result = build_skills_system_prompt(
|
||||
available_tools=set(),
|
||||
available_toolsets={"web"},
|
||||
)
|
||||
assert "duckduckgo" not in result
|
||||
|
||||
def test_fallback_skill_shown_when_primary_unavailable(self, monkeypatch, tmp_path):
|
||||
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
|
||||
skill_dir = tmp_path / "skills" / "search" / "duckduckgo"
|
||||
skill_dir.mkdir(parents=True)
|
||||
(skill_dir / "SKILL.md").write_text(
|
||||
"---\nname: duckduckgo\ndescription: Free web search\nmetadata:\n hermes:\n fallback_for_toolsets: [web]\n---\n"
|
||||
)
|
||||
result = build_skills_system_prompt(
|
||||
available_tools=set(),
|
||||
available_toolsets=set(),
|
||||
)
|
||||
assert "duckduckgo" in result
|
||||
|
||||
def test_requires_skill_hidden_when_toolset_missing(self, monkeypatch, tmp_path):
|
||||
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
|
||||
skill_dir = tmp_path / "skills" / "iot" / "openhue"
|
||||
skill_dir.mkdir(parents=True)
|
||||
(skill_dir / "SKILL.md").write_text(
|
||||
"---\nname: openhue\ndescription: Hue lights\nmetadata:\n hermes:\n requires_toolsets: [terminal]\n---\n"
|
||||
)
|
||||
result = build_skills_system_prompt(
|
||||
available_tools=set(),
|
||||
available_toolsets=set(),
|
||||
)
|
||||
assert "openhue" not in result
|
||||
|
||||
def test_requires_skill_shown_when_toolset_available(self, monkeypatch, tmp_path):
|
||||
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
|
||||
skill_dir = tmp_path / "skills" / "iot" / "openhue"
|
||||
skill_dir.mkdir(parents=True)
|
||||
(skill_dir / "SKILL.md").write_text(
|
||||
"---\nname: openhue\ndescription: Hue lights\nmetadata:\n hermes:\n requires_toolsets: [terminal]\n---\n"
|
||||
)
|
||||
result = build_skills_system_prompt(
|
||||
available_tools=set(),
|
||||
available_toolsets={"terminal"},
|
||||
)
|
||||
assert "openhue" in result
|
||||
|
||||
def test_unconditional_skill_always_shown(self, monkeypatch, tmp_path):
|
||||
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
|
||||
skill_dir = tmp_path / "skills" / "general" / "notes"
|
||||
skill_dir.mkdir(parents=True)
|
||||
(skill_dir / "SKILL.md").write_text(
|
||||
"---\nname: notes\ndescription: Take notes\n---\n"
|
||||
)
|
||||
result = build_skills_system_prompt(
|
||||
available_tools=set(),
|
||||
available_toolsets=set(),
|
||||
)
|
||||
assert "notes" in result
|
||||
|
||||
def test_no_args_shows_all_skills(self, monkeypatch, tmp_path):
|
||||
"""Backward compat: calling with no args shows everything."""
|
||||
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
|
||||
skill_dir = tmp_path / "skills" / "search" / "duckduckgo"
|
||||
skill_dir.mkdir(parents=True)
|
||||
(skill_dir / "SKILL.md").write_text(
|
||||
"---\nname: duckduckgo\ndescription: Free web search\nmetadata:\n hermes:\n fallback_for_toolsets: [web]\n---\n"
|
||||
)
|
||||
result = build_skills_system_prompt()
|
||||
assert "duckduckgo" in result
|
||||
|
||||
@@ -1,7 +1,6 @@
|
||||
"""Shared fixtures for the hermes-agent test suite."""
|
||||
|
||||
import os
|
||||
import signal
|
||||
import sys
|
||||
import tempfile
|
||||
from pathlib import Path
|
||||
@@ -49,21 +48,3 @@ def mock_config():
|
||||
"memory": {"memory_enabled": False, "user_profile_enabled": False},
|
||||
"command_allowlist": [],
|
||||
}
|
||||
|
||||
|
||||
# ── Global test timeout ─────────────────────────────────────────────────────
|
||||
# Kill any individual test that takes longer than 30 seconds.
|
||||
# Prevents hanging tests (subprocess spawns, blocking I/O) from stalling the
|
||||
# entire test suite.
|
||||
|
||||
def _timeout_handler(signum, frame):
|
||||
raise TimeoutError("Test exceeded 30 second timeout")
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def _enforce_test_timeout():
|
||||
"""Kill any individual test that takes longer than 30 seconds."""
|
||||
old = signal.signal(signal.SIGALRM, _timeout_handler)
|
||||
signal.alarm(30)
|
||||
yield
|
||||
signal.alarm(0)
|
||||
signal.signal(signal.SIGALRM, old)
|
||||
|
||||
@@ -1,249 +0,0 @@
|
||||
"""Tests for Discord free-response defaults and mention gating."""
|
||||
|
||||
from datetime import datetime, timezone
|
||||
from types import SimpleNamespace
|
||||
from unittest.mock import AsyncMock, MagicMock
|
||||
import sys
|
||||
|
||||
import pytest
|
||||
|
||||
from gateway.config import PlatformConfig
|
||||
|
||||
|
||||
def _ensure_discord_mock():
|
||||
"""Install a mock discord module when discord.py isn't available."""
|
||||
if "discord" in sys.modules and hasattr(sys.modules["discord"], "__file__"):
|
||||
return
|
||||
|
||||
discord_mod = MagicMock()
|
||||
discord_mod.Intents.default.return_value = MagicMock()
|
||||
discord_mod.Client = MagicMock
|
||||
discord_mod.File = MagicMock
|
||||
discord_mod.DMChannel = type("DMChannel", (), {})
|
||||
discord_mod.Thread = type("Thread", (), {})
|
||||
discord_mod.ForumChannel = type("ForumChannel", (), {})
|
||||
discord_mod.ui = SimpleNamespace(View=object, button=lambda *a, **k: (lambda fn: fn), Button=object)
|
||||
discord_mod.ButtonStyle = SimpleNamespace(success=1, primary=2, danger=3, green=1, blurple=2, red=3)
|
||||
discord_mod.Color = SimpleNamespace(orange=lambda: 1, green=lambda: 2, blue=lambda: 3, red=lambda: 4)
|
||||
discord_mod.Interaction = object
|
||||
discord_mod.Embed = MagicMock
|
||||
|
||||
ext_mod = MagicMock()
|
||||
commands_mod = MagicMock()
|
||||
commands_mod.Bot = MagicMock
|
||||
ext_mod.commands = commands_mod
|
||||
|
||||
sys.modules.setdefault("discord", discord_mod)
|
||||
sys.modules.setdefault("discord.ext", ext_mod)
|
||||
sys.modules.setdefault("discord.ext.commands", commands_mod)
|
||||
|
||||
|
||||
_ensure_discord_mock()
|
||||
|
||||
import gateway.platforms.discord as discord_platform # noqa: E402
|
||||
from gateway.platforms.discord import DiscordAdapter # noqa: E402
|
||||
|
||||
|
||||
class FakeDMChannel:
|
||||
def __init__(self, channel_id: int = 1, name: str = "dm"):
|
||||
self.id = channel_id
|
||||
self.name = name
|
||||
|
||||
|
||||
class FakeTextChannel:
|
||||
def __init__(self, channel_id: int = 1, name: str = "general", guild_name: str = "Hermes Server"):
|
||||
self.id = channel_id
|
||||
self.name = name
|
||||
self.guild = SimpleNamespace(name=guild_name)
|
||||
self.topic = None
|
||||
|
||||
|
||||
class FakeForumChannel:
|
||||
def __init__(self, channel_id: int = 1, name: str = "support-forum", guild_name: str = "Hermes Server"):
|
||||
self.id = channel_id
|
||||
self.name = name
|
||||
self.guild = SimpleNamespace(name=guild_name)
|
||||
self.type = 15
|
||||
self.topic = None
|
||||
|
||||
|
||||
class FakeThread:
|
||||
def __init__(self, channel_id: int = 1, name: str = "thread", parent=None, guild_name: str = "Hermes Server"):
|
||||
self.id = channel_id
|
||||
self.name = name
|
||||
self.parent = parent
|
||||
self.parent_id = getattr(parent, "id", None)
|
||||
self.guild = getattr(parent, "guild", None) or SimpleNamespace(name=guild_name)
|
||||
self.topic = None
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def adapter(monkeypatch):
|
||||
monkeypatch.setattr(discord_platform.discord, "DMChannel", FakeDMChannel, raising=False)
|
||||
monkeypatch.setattr(discord_platform.discord, "Thread", FakeThread, raising=False)
|
||||
monkeypatch.setattr(discord_platform.discord, "ForumChannel", FakeForumChannel, raising=False)
|
||||
|
||||
config = PlatformConfig(enabled=True, token="fake-token")
|
||||
adapter = DiscordAdapter(config)
|
||||
adapter._client = SimpleNamespace(user=SimpleNamespace(id=999))
|
||||
adapter.handle_message = AsyncMock()
|
||||
return adapter
|
||||
|
||||
|
||||
def make_message(*, channel, content: str, mentions=None):
|
||||
author = SimpleNamespace(id=42, display_name="Jezza", name="Jezza")
|
||||
return SimpleNamespace(
|
||||
id=123,
|
||||
content=content,
|
||||
mentions=list(mentions or []),
|
||||
attachments=[],
|
||||
reference=None,
|
||||
created_at=datetime.now(timezone.utc),
|
||||
channel=channel,
|
||||
author=author,
|
||||
)
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_discord_defaults_to_require_mention(adapter, monkeypatch):
|
||||
"""Default behavior: require @mention in server channels."""
|
||||
monkeypatch.delenv("DISCORD_REQUIRE_MENTION", raising=False)
|
||||
monkeypatch.delenv("DISCORD_FREE_RESPONSE_CHANNELS", raising=False)
|
||||
|
||||
message = make_message(channel=FakeTextChannel(channel_id=123), content="hello from channel")
|
||||
|
||||
await adapter._handle_message(message)
|
||||
|
||||
# Should be ignored — no mention, require_mention defaults to true
|
||||
adapter.handle_message.assert_not_awaited()
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_discord_free_response_in_server_channels(adapter, monkeypatch):
|
||||
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "false")
|
||||
monkeypatch.delenv("DISCORD_FREE_RESPONSE_CHANNELS", raising=False)
|
||||
|
||||
message = make_message(channel=FakeTextChannel(channel_id=123), content="hello from channel")
|
||||
|
||||
await adapter._handle_message(message)
|
||||
|
||||
adapter.handle_message.assert_awaited_once()
|
||||
event = adapter.handle_message.await_args.args[0]
|
||||
assert event.text == "hello from channel"
|
||||
assert event.source.chat_id == "123"
|
||||
assert event.source.chat_type == "group"
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_discord_free_response_in_threads(adapter, monkeypatch):
|
||||
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "false")
|
||||
monkeypatch.delenv("DISCORD_FREE_RESPONSE_CHANNELS", raising=False)
|
||||
|
||||
thread = FakeThread(channel_id=456, name="Ghost reader skill")
|
||||
message = make_message(channel=thread, content="hello from thread")
|
||||
|
||||
await adapter._handle_message(message)
|
||||
|
||||
adapter.handle_message.assert_awaited_once()
|
||||
event = adapter.handle_message.await_args.args[0]
|
||||
assert event.text == "hello from thread"
|
||||
assert event.source.chat_id == "456"
|
||||
assert event.source.thread_id == "456"
|
||||
assert event.source.chat_type == "thread"
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_discord_forum_threads_are_handled_as_threads(adapter, monkeypatch):
|
||||
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "false")
|
||||
monkeypatch.delenv("DISCORD_FREE_RESPONSE_CHANNELS", raising=False)
|
||||
|
||||
forum = FakeForumChannel(channel_id=222, name="support-forum")
|
||||
thread = FakeThread(channel_id=456, name="Can Hermes reply here?", parent=forum)
|
||||
message = make_message(channel=thread, content="hello from forum post")
|
||||
|
||||
await adapter._handle_message(message)
|
||||
|
||||
adapter.handle_message.assert_awaited_once()
|
||||
event = adapter.handle_message.await_args.args[0]
|
||||
assert event.text == "hello from forum post"
|
||||
assert event.source.chat_id == "456"
|
||||
assert event.source.thread_id == "456"
|
||||
assert event.source.chat_type == "thread"
|
||||
assert event.source.chat_name == "Hermes Server / support-forum / Can Hermes reply here?"
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_discord_can_still_require_mentions_when_enabled(adapter, monkeypatch):
|
||||
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "true")
|
||||
monkeypatch.delenv("DISCORD_FREE_RESPONSE_CHANNELS", raising=False)
|
||||
|
||||
message = make_message(channel=FakeTextChannel(channel_id=789), content="ignored without mention")
|
||||
|
||||
await adapter._handle_message(message)
|
||||
|
||||
adapter.handle_message.assert_not_awaited()
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_discord_free_response_channel_overrides_mention_requirement(adapter, monkeypatch):
|
||||
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "true")
|
||||
monkeypatch.setenv("DISCORD_FREE_RESPONSE_CHANNELS", "789,999")
|
||||
|
||||
message = make_message(channel=FakeTextChannel(channel_id=789), content="allowed without mention")
|
||||
|
||||
await adapter._handle_message(message)
|
||||
|
||||
adapter.handle_message.assert_awaited_once()
|
||||
event = adapter.handle_message.await_args.args[0]
|
||||
assert event.text == "allowed without mention"
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_discord_forum_parent_in_free_response_list_allows_forum_thread(adapter, monkeypatch):
|
||||
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "true")
|
||||
monkeypatch.setenv("DISCORD_FREE_RESPONSE_CHANNELS", "222")
|
||||
|
||||
forum = FakeForumChannel(channel_id=222, name="support-forum")
|
||||
thread = FakeThread(channel_id=333, name="Forum topic", parent=forum)
|
||||
message = make_message(channel=thread, content="allowed from forum thread")
|
||||
|
||||
await adapter._handle_message(message)
|
||||
|
||||
adapter.handle_message.assert_awaited_once()
|
||||
event = adapter.handle_message.await_args.args[0]
|
||||
assert event.text == "allowed from forum thread"
|
||||
assert event.source.chat_id == "333"
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_discord_accepts_and_strips_bot_mentions_when_required(adapter, monkeypatch):
|
||||
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "true")
|
||||
monkeypatch.delenv("DISCORD_FREE_RESPONSE_CHANNELS", raising=False)
|
||||
|
||||
bot_user = adapter._client.user
|
||||
message = make_message(
|
||||
channel=FakeTextChannel(channel_id=321),
|
||||
content=f"<@{bot_user.id}> hello with mention",
|
||||
mentions=[bot_user],
|
||||
)
|
||||
|
||||
await adapter._handle_message(message)
|
||||
|
||||
adapter.handle_message.assert_awaited_once()
|
||||
event = adapter.handle_message.await_args.args[0]
|
||||
assert event.text == "hello with mention"
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_discord_dms_ignore_mention_requirement(adapter, monkeypatch):
|
||||
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "true")
|
||||
monkeypatch.delenv("DISCORD_FREE_RESPONSE_CHANNELS", raising=False)
|
||||
|
||||
message = make_message(channel=FakeDMChannel(channel_id=654), content="dm without mention")
|
||||
|
||||
await adapter._handle_message(message)
|
||||
|
||||
adapter.handle_message.assert_awaited_once()
|
||||
event = adapter.handle_message.await_args.args[0]
|
||||
assert event.text == "dm without mention"
|
||||
assert event.source.chat_type == "dm"
|
||||
@@ -1,97 +0,0 @@
|
||||
import json
|
||||
|
||||
from hermes_cli.auth import _update_config_for_provider, get_active_provider
|
||||
from hermes_cli.config import load_config, save_config
|
||||
from hermes_cli.setup import setup_model_provider
|
||||
|
||||
|
||||
def _clear_provider_env(monkeypatch):
|
||||
for key in (
|
||||
"NOUS_API_KEY",
|
||||
"OPENROUTER_API_KEY",
|
||||
"OPENAI_BASE_URL",
|
||||
"OPENAI_API_KEY",
|
||||
"LLM_MODEL",
|
||||
):
|
||||
monkeypatch.delenv(key, raising=False)
|
||||
|
||||
|
||||
|
||||
def test_nous_oauth_setup_keeps_current_model_when_syncing_disk_provider(
|
||||
tmp_path, monkeypatch
|
||||
):
|
||||
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
|
||||
_clear_provider_env(monkeypatch)
|
||||
|
||||
config = load_config()
|
||||
|
||||
prompt_choices = iter([0, 2])
|
||||
monkeypatch.setattr(
|
||||
"hermes_cli.setup.prompt_choice",
|
||||
lambda *args, **kwargs: next(prompt_choices),
|
||||
)
|
||||
monkeypatch.setattr("hermes_cli.setup.prompt", lambda *args, **kwargs: "")
|
||||
|
||||
def _fake_login_nous(*args, **kwargs):
|
||||
auth_path = tmp_path / "auth.json"
|
||||
auth_path.write_text(json.dumps({"active_provider": "nous", "providers": {}}))
|
||||
_update_config_for_provider("nous", "https://inference.example.com/v1")
|
||||
|
||||
monkeypatch.setattr("hermes_cli.auth._login_nous", _fake_login_nous)
|
||||
monkeypatch.setattr(
|
||||
"hermes_cli.auth.resolve_nous_runtime_credentials",
|
||||
lambda *args, **kwargs: {
|
||||
"base_url": "https://inference.example.com/v1",
|
||||
"api_key": "nous-key",
|
||||
},
|
||||
)
|
||||
monkeypatch.setattr(
|
||||
"hermes_cli.auth.fetch_nous_models",
|
||||
lambda *args, **kwargs: ["gemini-3-flash"],
|
||||
)
|
||||
|
||||
setup_model_provider(config)
|
||||
save_config(config)
|
||||
|
||||
reloaded = load_config()
|
||||
|
||||
assert isinstance(reloaded["model"], dict)
|
||||
assert reloaded["model"]["provider"] == "nous"
|
||||
assert reloaded["model"]["base_url"] == "https://inference.example.com/v1"
|
||||
assert reloaded["model"]["default"] == "anthropic/claude-opus-4.6"
|
||||
|
||||
|
||||
def test_custom_setup_clears_active_oauth_provider(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
|
||||
_clear_provider_env(monkeypatch)
|
||||
|
||||
auth_path = tmp_path / "auth.json"
|
||||
auth_path.write_text(json.dumps({"active_provider": "nous", "providers": {}}))
|
||||
|
||||
config = load_config()
|
||||
|
||||
monkeypatch.setattr("hermes_cli.setup.prompt_choice", lambda *args, **kwargs: 3)
|
||||
|
||||
prompt_values = iter(
|
||||
[
|
||||
"https://custom.example/v1",
|
||||
"custom-api-key",
|
||||
"custom/model",
|
||||
"",
|
||||
]
|
||||
)
|
||||
monkeypatch.setattr(
|
||||
"hermes_cli.setup.prompt",
|
||||
lambda *args, **kwargs: next(prompt_values),
|
||||
)
|
||||
|
||||
setup_model_provider(config)
|
||||
save_config(config)
|
||||
|
||||
reloaded = load_config()
|
||||
|
||||
assert get_active_provider() is None
|
||||
assert isinstance(reloaded["model"], dict)
|
||||
assert reloaded["model"]["provider"] == "custom"
|
||||
assert reloaded["model"]["base_url"] == "https://custom.example/v1"
|
||||
assert reloaded["model"]["default"] == "custom/model"
|
||||
@@ -579,7 +579,7 @@ class WebToolsTester:
|
||||
"results": self.test_results,
|
||||
"environment": {
|
||||
"firecrawl_api_key": check_firecrawl_api_key(),
|
||||
"auxiliary_model": check_auxiliary_model(),
|
||||
"nous_api_key": check_auxiliary_model(),
|
||||
"debug_mode": get_debug_session_info()["enabled"]
|
||||
}
|
||||
}
|
||||
|
||||
@@ -6,11 +6,6 @@ Verifies that:
|
||||
- Preflight compression proactively compresses oversized sessions before API calls
|
||||
"""
|
||||
|
||||
import pytest
|
||||
pytestmark = pytest.mark.skip(reason="Hangs in non-interactive environments")
|
||||
|
||||
|
||||
|
||||
import uuid
|
||||
from types import SimpleNamespace
|
||||
from unittest.mock import MagicMock, patch
|
||||
@@ -401,73 +396,3 @@ class TestPreflightCompression:
|
||||
result = agent.run_conversation("hello", conversation_history=big_history)
|
||||
|
||||
mock_compress.assert_not_called()
|
||||
|
||||
|
||||
class TestToolResultPreflightCompression:
|
||||
"""Compression should trigger when tool results push context past the threshold."""
|
||||
|
||||
def test_large_tool_results_trigger_compression(self, agent):
|
||||
"""When tool results push estimated tokens past threshold, compress before next call."""
|
||||
agent.compression_enabled = True
|
||||
agent.context_compressor.context_length = 200_000
|
||||
agent.context_compressor.threshold_tokens = 140_000
|
||||
agent.context_compressor.last_prompt_tokens = 130_000
|
||||
agent.context_compressor.last_completion_tokens = 5_000
|
||||
|
||||
tc = SimpleNamespace(
|
||||
id="tc1", type="function",
|
||||
function=SimpleNamespace(name="web_search", arguments='{"query":"test"}'),
|
||||
)
|
||||
tool_resp = _mock_response(
|
||||
content=None, finish_reason="stop", tool_calls=[tc],
|
||||
usage={"prompt_tokens": 130_000, "completion_tokens": 5_000, "total_tokens": 135_000},
|
||||
)
|
||||
ok_resp = _mock_response(
|
||||
content="Done after compression", finish_reason="stop",
|
||||
usage={"prompt_tokens": 50_000, "completion_tokens": 100, "total_tokens": 50_100},
|
||||
)
|
||||
agent.client.chat.completions.create.side_effect = [tool_resp, ok_resp]
|
||||
large_result = "x" * 100_000
|
||||
|
||||
with (
|
||||
patch("run_agent.handle_function_call", return_value=large_result),
|
||||
patch.object(agent, "_compress_context") as mock_compress,
|
||||
patch.object(agent, "_persist_session"),
|
||||
patch.object(agent, "_save_trajectory"),
|
||||
patch.object(agent, "_cleanup_task_resources"),
|
||||
):
|
||||
mock_compress.return_value = (
|
||||
[{"role": "user", "content": "hello"}], "compressed prompt",
|
||||
)
|
||||
result = agent.run_conversation("hello")
|
||||
|
||||
mock_compress.assert_called_once()
|
||||
assert result["completed"] is True
|
||||
|
||||
def test_anthropic_prompt_too_long_safety_net(self, agent):
|
||||
"""Anthropic 'prompt is too long' error triggers compression as safety net."""
|
||||
err_400 = Exception(
|
||||
"Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', "
|
||||
"'message': 'prompt is too long: 233153 tokens > 200000 maximum'}}"
|
||||
)
|
||||
err_400.status_code = 400
|
||||
ok_resp = _mock_response(content="Recovered", finish_reason="stop")
|
||||
agent.client.chat.completions.create.side_effect = [err_400, ok_resp]
|
||||
prefill = [
|
||||
{"role": "user", "content": "previous"},
|
||||
{"role": "assistant", "content": "answer"},
|
||||
]
|
||||
|
||||
with (
|
||||
patch.object(agent, "_compress_context") as mock_compress,
|
||||
patch.object(agent, "_persist_session"),
|
||||
patch.object(agent, "_save_trajectory"),
|
||||
patch.object(agent, "_cleanup_task_resources"),
|
||||
):
|
||||
mock_compress.return_value = (
|
||||
[{"role": "user", "content": "hello"}], "compressed",
|
||||
)
|
||||
result = agent.run_conversation("hello", conversation_history=prefill)
|
||||
|
||||
mock_compress.assert_called_once()
|
||||
assert result["completed"] is True
|
||||
|
||||
@@ -28,8 +28,6 @@ from unittest.mock import patch
|
||||
|
||||
import pytest
|
||||
|
||||
pytestmark = pytest.mark.skip(reason="Live API integration test — hangs in batch runs")
|
||||
|
||||
# Ensure repo root is importable
|
||||
_repo_root = Path(__file__).resolve().parent.parent
|
||||
if str(_repo_root) not in sys.path:
|
||||
|
||||
@@ -229,14 +229,13 @@ class TestVisionModelOverride:
|
||||
|
||||
def test_default_model_when_no_override(self, monkeypatch):
|
||||
monkeypatch.delenv("AUXILIARY_VISION_MODEL", raising=False)
|
||||
from tools.vision_tools import _handle_vision_analyze
|
||||
from tools.vision_tools import _handle_vision_analyze, DEFAULT_VISION_MODEL
|
||||
with patch("tools.vision_tools.vision_analyze_tool", new_callable=MagicMock) as mock_tool:
|
||||
mock_tool.return_value = '{"success": true}'
|
||||
_handle_vision_analyze({"image_url": "http://test.jpg", "question": "test"})
|
||||
call_args = mock_tool.call_args
|
||||
# With no AUXILIARY_VISION_MODEL env var, model should be None
|
||||
# (the centralized call_llm router picks the provider default)
|
||||
assert call_args[0][2] is None
|
||||
expected = DEFAULT_VISION_MODEL or "google/gemini-3-flash-preview"
|
||||
assert call_args[0][2] == expected
|
||||
|
||||
|
||||
# ── DEFAULT_CONFIG shape tests ───────────────────────────────────────────────
|
||||
|
||||
@@ -93,8 +93,8 @@ class TestModelCommand:
|
||||
output = capsys.readouterr().out
|
||||
assert "anthropic/claude-opus-4.6" in output
|
||||
assert "OpenRouter" in output
|
||||
assert "Authenticated providers" in output or "Switch model" in output
|
||||
assert "provider" in output and "model" in output
|
||||
assert "Available models" in output
|
||||
assert "provider:model-name" in output
|
||||
|
||||
# -- provider switching tests -------------------------------------------
|
||||
|
||||
|
||||
@@ -197,28 +197,21 @@ def test_codex_provider_replaces_incompatible_default_model(monkeypatch):
|
||||
assert shell.model == "gpt-5.2-codex"
|
||||
|
||||
|
||||
def test_codex_provider_uses_config_model(monkeypatch):
|
||||
"""Model comes from config.yaml, not LLM_MODEL env var.
|
||||
Config.yaml is the single source of truth to avoid multi-agent conflicts."""
|
||||
def test_codex_provider_trusts_explicit_envvar_model(monkeypatch):
|
||||
"""When the user explicitly sets LLM_MODEL, we trust their choice and
|
||||
let the API be the judge — even if it's a non-OpenAI model. Only
|
||||
provider prefixes are stripped; the bare model passes through."""
|
||||
cli = _import_cli()
|
||||
|
||||
# LLM_MODEL env var should be IGNORED (even if set)
|
||||
monkeypatch.setenv("LLM_MODEL", "should-be-ignored")
|
||||
monkeypatch.setenv("LLM_MODEL", "claude-opus-4-6")
|
||||
monkeypatch.delenv("OPENAI_MODEL", raising=False)
|
||||
|
||||
# Set model via config
|
||||
monkeypatch.setitem(cli.CLI_CONFIG, "model", {
|
||||
"default": "gpt-5.2-codex",
|
||||
"provider": "openai-codex",
|
||||
"base_url": "https://chatgpt.com/backend-api/codex",
|
||||
})
|
||||
|
||||
def _runtime_resolve(**kwargs):
|
||||
return {
|
||||
"provider": "openai-codex",
|
||||
"api_mode": "codex_responses",
|
||||
"base_url": "https://chatgpt.com/backend-api/codex",
|
||||
"api_key": "fake-codex-token",
|
||||
"api_key": "test-key",
|
||||
"source": "env/config",
|
||||
}
|
||||
|
||||
@@ -227,12 +220,11 @@ def test_codex_provider_uses_config_model(monkeypatch):
|
||||
|
||||
shell = cli.HermesCLI(compact=True, max_turns=1)
|
||||
|
||||
assert shell._model_is_default is False
|
||||
assert shell._ensure_runtime_credentials() is True
|
||||
assert shell.provider == "openai-codex"
|
||||
# Model from config (may be normalized by codex provider logic)
|
||||
assert "codex" in shell.model.lower()
|
||||
# LLM_MODEL env var is NOT used
|
||||
assert shell.model != "should-be-ignored"
|
||||
# User explicitly chose this model — it passes through untouched
|
||||
assert shell.model == "claude-opus-4-6"
|
||||
|
||||
|
||||
def test_codex_provider_preserves_explicit_codex_model(monkeypatch):
|
||||
|
||||
+81
-119
@@ -35,7 +35,7 @@ def _make_agent(fallback_model=None):
|
||||
patch("run_agent.OpenAI"),
|
||||
):
|
||||
agent = AIAgent(
|
||||
api_key="test-key",
|
||||
api_key="test-key-primary",
|
||||
quiet_mode=True,
|
||||
skip_context_files=True,
|
||||
skip_memory=True,
|
||||
@@ -45,14 +45,6 @@ def _make_agent(fallback_model=None):
|
||||
return agent
|
||||
|
||||
|
||||
def _mock_resolve(base_url="https://openrouter.ai/api/v1", api_key="test-key"):
|
||||
"""Helper to create a mock client for resolve_provider_client."""
|
||||
mock_client = MagicMock()
|
||||
mock_client.api_key = api_key
|
||||
mock_client.base_url = base_url
|
||||
return mock_client
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# _try_activate_fallback()
|
||||
# =============================================================================
|
||||
@@ -79,13 +71,9 @@ class TestTryActivateFallback:
|
||||
agent = _make_agent(
|
||||
fallback_model={"provider": "openrouter", "model": "anthropic/claude-sonnet-4"},
|
||||
)
|
||||
mock_client = _mock_resolve(
|
||||
api_key="sk-or-fallback-key",
|
||||
base_url="https://openrouter.ai/api/v1",
|
||||
)
|
||||
with patch(
|
||||
"agent.auxiliary_client.resolve_provider_client",
|
||||
return_value=(mock_client, "anthropic/claude-sonnet-4"),
|
||||
with (
|
||||
patch.dict("os.environ", {"OPENROUTER_API_KEY": "sk-or-fallback-key"}),
|
||||
patch("run_agent.OpenAI") as mock_openai,
|
||||
):
|
||||
result = agent._try_activate_fallback()
|
||||
assert result is True
|
||||
@@ -93,37 +81,36 @@ class TestTryActivateFallback:
|
||||
assert agent.model == "anthropic/claude-sonnet-4"
|
||||
assert agent.provider == "openrouter"
|
||||
assert agent.api_mode == "chat_completions"
|
||||
assert agent.client is mock_client
|
||||
mock_openai.assert_called_once()
|
||||
call_kwargs = mock_openai.call_args[1]
|
||||
assert call_kwargs["api_key"] == "sk-or-fallback-key"
|
||||
assert "openrouter" in call_kwargs["base_url"].lower()
|
||||
# OpenRouter should get attribution headers
|
||||
assert "default_headers" in call_kwargs
|
||||
|
||||
def test_activates_zai_fallback(self):
|
||||
agent = _make_agent(
|
||||
fallback_model={"provider": "zai", "model": "glm-5"},
|
||||
)
|
||||
mock_client = _mock_resolve(
|
||||
api_key="sk-zai-key",
|
||||
base_url="https://open.z.ai/api/v1",
|
||||
)
|
||||
with patch(
|
||||
"agent.auxiliary_client.resolve_provider_client",
|
||||
return_value=(mock_client, "glm-5"),
|
||||
with (
|
||||
patch.dict("os.environ", {"ZAI_API_KEY": "sk-zai-key"}),
|
||||
patch("run_agent.OpenAI") as mock_openai,
|
||||
):
|
||||
result = agent._try_activate_fallback()
|
||||
assert result is True
|
||||
assert agent.model == "glm-5"
|
||||
assert agent.provider == "zai"
|
||||
assert agent.client is mock_client
|
||||
call_kwargs = mock_openai.call_args[1]
|
||||
assert call_kwargs["api_key"] == "sk-zai-key"
|
||||
assert "z.ai" in call_kwargs["base_url"].lower()
|
||||
|
||||
def test_activates_kimi_fallback(self):
|
||||
agent = _make_agent(
|
||||
fallback_model={"provider": "kimi-coding", "model": "kimi-k2.5"},
|
||||
)
|
||||
mock_client = _mock_resolve(
|
||||
api_key="sk-kimi-key",
|
||||
base_url="https://api.moonshot.ai/v1",
|
||||
)
|
||||
with patch(
|
||||
"agent.auxiliary_client.resolve_provider_client",
|
||||
return_value=(mock_client, "kimi-k2.5"),
|
||||
with (
|
||||
patch.dict("os.environ", {"KIMI_API_KEY": "sk-kimi-key"}),
|
||||
patch("run_agent.OpenAI"),
|
||||
):
|
||||
assert agent._try_activate_fallback() is True
|
||||
assert agent.model == "kimi-k2.5"
|
||||
@@ -133,30 +120,23 @@ class TestTryActivateFallback:
|
||||
agent = _make_agent(
|
||||
fallback_model={"provider": "minimax", "model": "MiniMax-M2.5"},
|
||||
)
|
||||
mock_client = _mock_resolve(
|
||||
api_key="sk-mm-key",
|
||||
base_url="https://api.minimax.io/v1",
|
||||
)
|
||||
with patch(
|
||||
"agent.auxiliary_client.resolve_provider_client",
|
||||
return_value=(mock_client, "MiniMax-M2.5"),
|
||||
with (
|
||||
patch.dict("os.environ", {"MINIMAX_API_KEY": "sk-mm-key"}),
|
||||
patch("run_agent.OpenAI") as mock_openai,
|
||||
):
|
||||
assert agent._try_activate_fallback() is True
|
||||
assert agent.model == "MiniMax-M2.5"
|
||||
assert agent.provider == "minimax"
|
||||
assert agent.client is mock_client
|
||||
call_kwargs = mock_openai.call_args[1]
|
||||
assert "minimax.io" in call_kwargs["base_url"]
|
||||
|
||||
def test_only_fires_once(self):
|
||||
agent = _make_agent(
|
||||
fallback_model={"provider": "openrouter", "model": "anthropic/claude-sonnet-4"},
|
||||
)
|
||||
mock_client = _mock_resolve(
|
||||
api_key="sk-or-key",
|
||||
base_url="https://openrouter.ai/api/v1",
|
||||
)
|
||||
with patch(
|
||||
"agent.auxiliary_client.resolve_provider_client",
|
||||
return_value=(mock_client, "anthropic/claude-sonnet-4"),
|
||||
with (
|
||||
patch.dict("os.environ", {"OPENROUTER_API_KEY": "sk-or-key"}),
|
||||
patch("run_agent.OpenAI"),
|
||||
):
|
||||
assert agent._try_activate_fallback() is True
|
||||
# Second attempt should return False
|
||||
@@ -167,10 +147,9 @@ class TestTryActivateFallback:
|
||||
agent = _make_agent(
|
||||
fallback_model={"provider": "minimax", "model": "MiniMax-M2.5"},
|
||||
)
|
||||
with patch(
|
||||
"agent.auxiliary_client.resolve_provider_client",
|
||||
return_value=(None, None),
|
||||
):
|
||||
# Ensure MINIMAX_API_KEY is not in the environment
|
||||
env = {k: v for k, v in os.environ.items() if k != "MINIMAX_API_KEY"}
|
||||
with patch.dict("os.environ", env, clear=True):
|
||||
assert agent._try_activate_fallback() is False
|
||||
assert agent._fallback_activated is False
|
||||
|
||||
@@ -184,29 +163,22 @@ class TestTryActivateFallback:
|
||||
"api_key_env": "MY_CUSTOM_KEY",
|
||||
},
|
||||
)
|
||||
mock_client = _mock_resolve(
|
||||
api_key="custom-secret",
|
||||
base_url="http://localhost:8080/v1",
|
||||
)
|
||||
with patch(
|
||||
"agent.auxiliary_client.resolve_provider_client",
|
||||
return_value=(mock_client, "my-model"),
|
||||
with (
|
||||
patch.dict("os.environ", {"MY_CUSTOM_KEY": "custom-secret"}),
|
||||
patch("run_agent.OpenAI") as mock_openai,
|
||||
):
|
||||
assert agent._try_activate_fallback() is True
|
||||
assert agent.client is mock_client
|
||||
assert agent.model == "my-model"
|
||||
call_kwargs = mock_openai.call_args[1]
|
||||
assert call_kwargs["base_url"] == "http://localhost:8080/v1"
|
||||
assert call_kwargs["api_key"] == "custom-secret"
|
||||
|
||||
def test_prompt_caching_enabled_for_claude_on_openrouter(self):
|
||||
agent = _make_agent(
|
||||
fallback_model={"provider": "openrouter", "model": "anthropic/claude-sonnet-4"},
|
||||
)
|
||||
mock_client = _mock_resolve(
|
||||
api_key="sk-or-key",
|
||||
base_url="https://openrouter.ai/api/v1",
|
||||
)
|
||||
with patch(
|
||||
"agent.auxiliary_client.resolve_provider_client",
|
||||
return_value=(mock_client, "anthropic/claude-sonnet-4"),
|
||||
with (
|
||||
patch.dict("os.environ", {"OPENROUTER_API_KEY": "sk-or-key"}),
|
||||
patch("run_agent.OpenAI"),
|
||||
):
|
||||
agent._try_activate_fallback()
|
||||
assert agent._use_prompt_caching is True
|
||||
@@ -215,13 +187,9 @@ class TestTryActivateFallback:
|
||||
agent = _make_agent(
|
||||
fallback_model={"provider": "openrouter", "model": "google/gemini-2.5-flash"},
|
||||
)
|
||||
mock_client = _mock_resolve(
|
||||
api_key="sk-or-key",
|
||||
base_url="https://openrouter.ai/api/v1",
|
||||
)
|
||||
with patch(
|
||||
"agent.auxiliary_client.resolve_provider_client",
|
||||
return_value=(mock_client, "google/gemini-2.5-flash"),
|
||||
with (
|
||||
patch.dict("os.environ", {"OPENROUTER_API_KEY": "sk-or-key"}),
|
||||
patch("run_agent.OpenAI"),
|
||||
):
|
||||
agent._try_activate_fallback()
|
||||
assert agent._use_prompt_caching is False
|
||||
@@ -230,13 +198,9 @@ class TestTryActivateFallback:
|
||||
agent = _make_agent(
|
||||
fallback_model={"provider": "zai", "model": "glm-5"},
|
||||
)
|
||||
mock_client = _mock_resolve(
|
||||
api_key="sk-zai-key",
|
||||
base_url="https://open.z.ai/api/v1",
|
||||
)
|
||||
with patch(
|
||||
"agent.auxiliary_client.resolve_provider_client",
|
||||
return_value=(mock_client, "glm-5"),
|
||||
with (
|
||||
patch.dict("os.environ", {"ZAI_API_KEY": "sk-zai-key"}),
|
||||
patch("run_agent.OpenAI"),
|
||||
):
|
||||
agent._try_activate_fallback()
|
||||
assert agent._use_prompt_caching is False
|
||||
@@ -246,36 +210,35 @@ class TestTryActivateFallback:
|
||||
agent = _make_agent(
|
||||
fallback_model={"provider": "zai", "model": "glm-5"},
|
||||
)
|
||||
mock_client = _mock_resolve(
|
||||
api_key="sk-alt-key",
|
||||
base_url="https://open.z.ai/api/v1",
|
||||
)
|
||||
with patch(
|
||||
"agent.auxiliary_client.resolve_provider_client",
|
||||
return_value=(mock_client, "glm-5"),
|
||||
with (
|
||||
patch.dict("os.environ", {"Z_AI_API_KEY": "sk-alt-key"}),
|
||||
patch("run_agent.OpenAI") as mock_openai,
|
||||
):
|
||||
assert agent._try_activate_fallback() is True
|
||||
assert agent.client is mock_client
|
||||
call_kwargs = mock_openai.call_args[1]
|
||||
assert call_kwargs["api_key"] == "sk-alt-key"
|
||||
|
||||
def test_activates_codex_fallback(self):
|
||||
"""OpenAI Codex fallback should use OAuth credentials and codex_responses mode."""
|
||||
agent = _make_agent(
|
||||
fallback_model={"provider": "openai-codex", "model": "gpt-5.3-codex"},
|
||||
)
|
||||
mock_client = _mock_resolve(
|
||||
api_key="codex-oauth-token",
|
||||
base_url="https://chatgpt.com/backend-api/codex",
|
||||
)
|
||||
with patch(
|
||||
"agent.auxiliary_client.resolve_provider_client",
|
||||
return_value=(mock_client, "gpt-5.3-codex"),
|
||||
mock_creds = {
|
||||
"api_key": "codex-oauth-token",
|
||||
"base_url": "https://chatgpt.com/backend-api/codex",
|
||||
}
|
||||
with (
|
||||
patch("hermes_cli.auth.resolve_codex_runtime_credentials", return_value=mock_creds),
|
||||
patch("run_agent.OpenAI") as mock_openai,
|
||||
):
|
||||
result = agent._try_activate_fallback()
|
||||
assert result is True
|
||||
assert agent.model == "gpt-5.3-codex"
|
||||
assert agent.provider == "openai-codex"
|
||||
assert agent.api_mode == "codex_responses"
|
||||
assert agent.client is mock_client
|
||||
call_kwargs = mock_openai.call_args[1]
|
||||
assert call_kwargs["api_key"] == "codex-oauth-token"
|
||||
assert "chatgpt.com" in call_kwargs["base_url"]
|
||||
|
||||
def test_codex_fallback_fails_gracefully_without_credentials(self):
|
||||
"""Codex fallback should return False if no OAuth credentials available."""
|
||||
@@ -283,8 +246,8 @@ class TestTryActivateFallback:
|
||||
fallback_model={"provider": "openai-codex", "model": "gpt-5.3-codex"},
|
||||
)
|
||||
with patch(
|
||||
"agent.auxiliary_client.resolve_provider_client",
|
||||
return_value=(None, None),
|
||||
"hermes_cli.auth.resolve_codex_runtime_credentials",
|
||||
side_effect=Exception("No Codex credentials"),
|
||||
):
|
||||
assert agent._try_activate_fallback() is False
|
||||
assert agent._fallback_activated is False
|
||||
@@ -294,20 +257,22 @@ class TestTryActivateFallback:
|
||||
agent = _make_agent(
|
||||
fallback_model={"provider": "nous", "model": "nous-hermes-3"},
|
||||
)
|
||||
mock_client = _mock_resolve(
|
||||
api_key="nous-agent-key-abc",
|
||||
base_url="https://inference-api.nousresearch.com/v1",
|
||||
)
|
||||
with patch(
|
||||
"agent.auxiliary_client.resolve_provider_client",
|
||||
return_value=(mock_client, "nous-hermes-3"),
|
||||
mock_creds = {
|
||||
"api_key": "nous-agent-key-abc",
|
||||
"base_url": "https://inference-api.nousresearch.com/v1",
|
||||
}
|
||||
with (
|
||||
patch("hermes_cli.auth.resolve_nous_runtime_credentials", return_value=mock_creds),
|
||||
patch("run_agent.OpenAI") as mock_openai,
|
||||
):
|
||||
result = agent._try_activate_fallback()
|
||||
assert result is True
|
||||
assert agent.model == "nous-hermes-3"
|
||||
assert agent.provider == "nous"
|
||||
assert agent.api_mode == "chat_completions"
|
||||
assert agent.client is mock_client
|
||||
call_kwargs = mock_openai.call_args[1]
|
||||
assert call_kwargs["api_key"] == "nous-agent-key-abc"
|
||||
assert "nousresearch.com" in call_kwargs["base_url"]
|
||||
|
||||
def test_nous_fallback_fails_gracefully_without_login(self):
|
||||
"""Nous fallback should return False if not logged in."""
|
||||
@@ -315,8 +280,8 @@ class TestTryActivateFallback:
|
||||
fallback_model={"provider": "nous", "model": "nous-hermes-3"},
|
||||
)
|
||||
with patch(
|
||||
"agent.auxiliary_client.resolve_provider_client",
|
||||
return_value=(None, None),
|
||||
"hermes_cli.auth.resolve_nous_runtime_credentials",
|
||||
side_effect=Exception("Not logged in to Nous Portal"),
|
||||
):
|
||||
assert agent._try_activate_fallback() is False
|
||||
assert agent._fallback_activated is False
|
||||
@@ -350,7 +315,7 @@ class TestFallbackInit:
|
||||
# =============================================================================
|
||||
|
||||
class TestProviderCredentials:
|
||||
"""Verify that each supported provider resolves via the centralized router."""
|
||||
"""Verify that each supported provider resolves its API key correctly."""
|
||||
|
||||
@pytest.mark.parametrize("provider,env_var,base_url_fragment", [
|
||||
("openrouter", "OPENROUTER_API_KEY", "openrouter"),
|
||||
@@ -363,15 +328,12 @@ class TestProviderCredentials:
|
||||
agent = _make_agent(
|
||||
fallback_model={"provider": provider, "model": "test-model"},
|
||||
)
|
||||
mock_client = MagicMock()
|
||||
mock_client.api_key = "test-api-key"
|
||||
mock_client.base_url = f"https://{base_url_fragment}/v1"
|
||||
with patch(
|
||||
"agent.auxiliary_client.resolve_provider_client",
|
||||
return_value=(mock_client, "test-model"),
|
||||
with (
|
||||
patch.dict("os.environ", {env_var: "test-key-123"}),
|
||||
patch("run_agent.OpenAI") as mock_openai,
|
||||
):
|
||||
result = agent._try_activate_fallback()
|
||||
assert result is True, f"Failed to activate fallback for {provider}"
|
||||
assert agent.client is mock_client
|
||||
assert agent.model == "test-model"
|
||||
assert agent.provider == provider
|
||||
call_kwargs = mock_openai.call_args[1]
|
||||
assert call_kwargs["api_key"] == "test-key-123"
|
||||
assert base_url_fragment in call_kwargs["base_url"].lower()
|
||||
|
||||
@@ -98,9 +98,10 @@ class TestFlushMemoriesUsesAuxiliaryClient:
|
||||
def test_flush_uses_auxiliary_when_available(self, monkeypatch):
|
||||
agent = _make_agent(monkeypatch, api_mode="codex_responses", provider="openai-codex")
|
||||
|
||||
mock_response = _chat_response_with_memory_call()
|
||||
mock_aux_client = MagicMock()
|
||||
mock_aux_client.chat.completions.create.return_value = _chat_response_with_memory_call()
|
||||
|
||||
with patch("agent.auxiliary_client.call_llm", return_value=mock_response) as mock_call:
|
||||
with patch("agent.auxiliary_client.get_text_auxiliary_client", return_value=(mock_aux_client, "gpt-4o-mini")):
|
||||
messages = [
|
||||
{"role": "user", "content": "Hello"},
|
||||
{"role": "assistant", "content": "Hi there"},
|
||||
@@ -109,9 +110,9 @@ class TestFlushMemoriesUsesAuxiliaryClient:
|
||||
with patch("tools.memory_tool.memory_tool", return_value="Saved.") as mock_memory:
|
||||
agent.flush_memories(messages)
|
||||
|
||||
mock_call.assert_called_once()
|
||||
call_kwargs = mock_call.call_args
|
||||
assert call_kwargs.kwargs.get("task") == "flush_memories"
|
||||
mock_aux_client.chat.completions.create.assert_called_once()
|
||||
call_kwargs = mock_aux_client.chat.completions.create.call_args
|
||||
assert call_kwargs.kwargs.get("model") == "gpt-4o-mini" or call_kwargs[1].get("model") == "gpt-4o-mini"
|
||||
|
||||
def test_flush_uses_main_client_when_no_auxiliary(self, monkeypatch):
|
||||
"""Non-Codex mode with no auxiliary falls back to self.client."""
|
||||
@@ -119,7 +120,7 @@ class TestFlushMemoriesUsesAuxiliaryClient:
|
||||
agent.client = MagicMock()
|
||||
agent.client.chat.completions.create.return_value = _chat_response_with_memory_call()
|
||||
|
||||
with patch("agent.auxiliary_client.call_llm", side_effect=RuntimeError("no provider")):
|
||||
with patch("agent.auxiliary_client.get_text_auxiliary_client", return_value=(None, None)):
|
||||
messages = [
|
||||
{"role": "user", "content": "Hello"},
|
||||
{"role": "assistant", "content": "Hi there"},
|
||||
@@ -134,9 +135,10 @@ class TestFlushMemoriesUsesAuxiliaryClient:
|
||||
"""Verify that memory tool calls from the flush response actually get executed."""
|
||||
agent = _make_agent(monkeypatch, api_mode="chat_completions", provider="openrouter")
|
||||
|
||||
mock_response = _chat_response_with_memory_call()
|
||||
mock_aux_client = MagicMock()
|
||||
mock_aux_client.chat.completions.create.return_value = _chat_response_with_memory_call()
|
||||
|
||||
with patch("agent.auxiliary_client.call_llm", return_value=mock_response):
|
||||
with patch("agent.auxiliary_client.get_text_auxiliary_client", return_value=(mock_aux_client, "gpt-4o-mini")):
|
||||
messages = [
|
||||
{"role": "user", "content": "Hello"},
|
||||
{"role": "assistant", "content": "Hi"},
|
||||
@@ -155,9 +157,10 @@ class TestFlushMemoriesUsesAuxiliaryClient:
|
||||
"""After flush, the flush prompt and any response should be removed from messages."""
|
||||
agent = _make_agent(monkeypatch, api_mode="chat_completions", provider="openrouter")
|
||||
|
||||
mock_response = _chat_response_with_memory_call()
|
||||
mock_aux_client = MagicMock()
|
||||
mock_aux_client.chat.completions.create.return_value = _chat_response_with_memory_call()
|
||||
|
||||
with patch("agent.auxiliary_client.call_llm", return_value=mock_response):
|
||||
with patch("agent.auxiliary_client.get_text_auxiliary_client", return_value=(mock_aux_client, "gpt-4o-mini")):
|
||||
messages = [
|
||||
{"role": "user", "content": "Hello"},
|
||||
{"role": "assistant", "content": "Hi"},
|
||||
@@ -199,7 +202,7 @@ class TestFlushMemoriesCodexFallback:
|
||||
model="gpt-5-codex",
|
||||
)
|
||||
|
||||
with patch("agent.auxiliary_client.call_llm", side_effect=RuntimeError("no provider")), \
|
||||
with patch("agent.auxiliary_client.get_text_auxiliary_client", return_value=(None, None)), \
|
||||
patch.object(agent, "_run_codex_stream", return_value=codex_response) as mock_stream, \
|
||||
patch.object(agent, "_build_api_kwargs") as mock_build, \
|
||||
patch("tools.memory_tool.memory_tool", return_value="Saved.") as mock_memory:
|
||||
|
||||
+1
-81
@@ -959,7 +959,7 @@ class TestFlushSentinelNotLeaked:
|
||||
agent.client.chat.completions.create.return_value = mock_response
|
||||
|
||||
# Bypass auxiliary client so flush uses agent.client directly
|
||||
with patch("agent.auxiliary_client.call_llm", side_effect=RuntimeError("no provider")):
|
||||
with patch("agent.auxiliary_client.get_text_auxiliary_client", return_value=(None, None)):
|
||||
agent.flush_memories(messages, min_turns=0)
|
||||
|
||||
# Check what was actually sent to the API
|
||||
@@ -1283,83 +1283,3 @@ class TestBudgetPressure:
|
||||
messages[-1]["content"] = last_content + f"\n\n{warning}"
|
||||
assert "plain text result" in messages[-1]["content"]
|
||||
assert "BUDGET WARNING" in messages[-1]["content"]
|
||||
|
||||
|
||||
class TestSafeWriter:
|
||||
"""Verify _SafeWriter guards stdout against OSError (broken pipes)."""
|
||||
|
||||
def test_write_delegates_normally(self):
|
||||
"""When stdout is healthy, _SafeWriter is transparent."""
|
||||
from run_agent import _SafeWriter
|
||||
from io import StringIO
|
||||
inner = StringIO()
|
||||
writer = _SafeWriter(inner)
|
||||
writer.write("hello")
|
||||
assert inner.getvalue() == "hello"
|
||||
|
||||
def test_write_catches_oserror(self):
|
||||
"""OSError on write is silently caught, returns len(data)."""
|
||||
from run_agent import _SafeWriter
|
||||
from unittest.mock import MagicMock
|
||||
inner = MagicMock()
|
||||
inner.write.side_effect = OSError(5, "Input/output error")
|
||||
writer = _SafeWriter(inner)
|
||||
result = writer.write("hello")
|
||||
assert result == 5 # len("hello")
|
||||
|
||||
def test_flush_catches_oserror(self):
|
||||
"""OSError on flush is silently caught."""
|
||||
from run_agent import _SafeWriter
|
||||
from unittest.mock import MagicMock
|
||||
inner = MagicMock()
|
||||
inner.flush.side_effect = OSError(5, "Input/output error")
|
||||
writer = _SafeWriter(inner)
|
||||
writer.flush() # should not raise
|
||||
|
||||
def test_print_survives_broken_stdout(self, monkeypatch):
|
||||
"""print() through _SafeWriter doesn't crash on broken pipe."""
|
||||
import sys
|
||||
from run_agent import _SafeWriter
|
||||
from unittest.mock import MagicMock
|
||||
broken = MagicMock()
|
||||
broken.write.side_effect = OSError(5, "Input/output error")
|
||||
original = sys.stdout
|
||||
sys.stdout = _SafeWriter(broken)
|
||||
try:
|
||||
print("this should not crash") # would raise without _SafeWriter
|
||||
finally:
|
||||
sys.stdout = original
|
||||
|
||||
def test_installed_in_run_conversation(self, agent):
|
||||
"""run_conversation installs _SafeWriter on sys.stdout."""
|
||||
import sys
|
||||
from run_agent import _SafeWriter
|
||||
resp = _mock_response(content="Done", finish_reason="stop")
|
||||
agent.client.chat.completions.create.return_value = resp
|
||||
original = sys.stdout
|
||||
try:
|
||||
with (
|
||||
patch.object(agent, "_persist_session"),
|
||||
patch.object(agent, "_save_trajectory"),
|
||||
patch.object(agent, "_cleanup_task_resources"),
|
||||
):
|
||||
agent.run_conversation("test")
|
||||
assert isinstance(sys.stdout, _SafeWriter)
|
||||
finally:
|
||||
sys.stdout = original
|
||||
|
||||
def test_double_wrap_prevented(self):
|
||||
"""Wrapping an already-wrapped stream doesn't add layers."""
|
||||
import sys
|
||||
from run_agent import _SafeWriter
|
||||
from io import StringIO
|
||||
inner = StringIO()
|
||||
wrapped = _SafeWriter(inner)
|
||||
# isinstance check should prevent double-wrapping
|
||||
assert isinstance(wrapped, _SafeWriter)
|
||||
# The guard in run_conversation checks isinstance before wrapping
|
||||
if not isinstance(wrapped, _SafeWriter):
|
||||
wrapped = _SafeWriter(wrapped)
|
||||
# Still just one layer
|
||||
wrapped.write("test")
|
||||
assert inner.getvalue() == "test"
|
||||
|
||||
@@ -158,6 +158,29 @@ def test_custom_endpoint_auto_provider_prefers_openai_key(monkeypatch):
|
||||
assert resolved["api_key"] == "sk-vllm-key"
|
||||
|
||||
|
||||
def test_resolve_runtime_provider_nous_api(monkeypatch):
|
||||
"""Nous Portal API key provider resolves via the api_key path."""
|
||||
monkeypatch.setattr(rp, "resolve_provider", lambda *a, **k: "nous-api")
|
||||
monkeypatch.setattr(
|
||||
rp,
|
||||
"resolve_api_key_provider_credentials",
|
||||
lambda pid: {
|
||||
"provider": "nous-api",
|
||||
"api_key": "nous-test-key",
|
||||
"base_url": "https://inference-api.nousresearch.com/v1",
|
||||
"source": "NOUS_API_KEY",
|
||||
},
|
||||
)
|
||||
|
||||
resolved = rp.resolve_runtime_provider(requested="nous-api")
|
||||
|
||||
assert resolved["provider"] == "nous-api"
|
||||
assert resolved["api_mode"] == "chat_completions"
|
||||
assert resolved["base_url"] == "https://inference-api.nousresearch.com/v1"
|
||||
assert resolved["api_key"] == "nous-test-key"
|
||||
assert resolved["requested_provider"] == "nous-api"
|
||||
|
||||
|
||||
def test_explicit_openrouter_skips_openai_base_url(monkeypatch):
|
||||
"""When the user explicitly requests openrouter, OPENAI_BASE_URL
|
||||
(which may point to a custom endpoint) must not override the
|
||||
|
||||
@@ -249,85 +249,6 @@ class TestCronTimezone:
|
||||
due = get_due_jobs()
|
||||
assert len(due) == 1
|
||||
|
||||
def test_ensure_aware_naive_preserves_absolute_time(self):
|
||||
"""_ensure_aware must preserve the absolute instant for naive datetimes.
|
||||
|
||||
Regression: the old code used replace(tzinfo=hermes_tz) which shifted
|
||||
absolute time when system-local tz != Hermes tz. The fix interprets
|
||||
naive values as system-local wall time, then converts.
|
||||
"""
|
||||
from cron.jobs import _ensure_aware
|
||||
|
||||
os.environ["HERMES_TIMEZONE"] = "Asia/Kolkata"
|
||||
hermes_time.reset_cache()
|
||||
|
||||
# Create a naive datetime — will be interpreted as system-local time
|
||||
naive_dt = datetime(2026, 3, 11, 12, 0, 0)
|
||||
|
||||
result = _ensure_aware(naive_dt)
|
||||
|
||||
# The result should be in Kolkata tz
|
||||
assert result.tzinfo is not None
|
||||
|
||||
# The UTC equivalent must match what we'd get by correctly interpreting
|
||||
# the naive dt as system-local time first, then converting
|
||||
system_tz = datetime.now().astimezone().tzinfo
|
||||
expected_utc = naive_dt.replace(tzinfo=system_tz).astimezone(timezone.utc)
|
||||
actual_utc = result.astimezone(timezone.utc)
|
||||
assert actual_utc == expected_utc, (
|
||||
f"Absolute time shifted: expected {expected_utc}, got {actual_utc}"
|
||||
)
|
||||
|
||||
def test_ensure_aware_normalizes_aware_to_hermes_tz(self):
|
||||
"""Already-aware datetimes should be normalized to Hermes tz."""
|
||||
from cron.jobs import _ensure_aware
|
||||
|
||||
os.environ["HERMES_TIMEZONE"] = "Asia/Kolkata"
|
||||
hermes_time.reset_cache()
|
||||
|
||||
# Create an aware datetime in UTC
|
||||
utc_dt = datetime(2026, 3, 11, 15, 0, 0, tzinfo=timezone.utc)
|
||||
result = _ensure_aware(utc_dt)
|
||||
|
||||
# Must be in Hermes tz (Kolkata) but same absolute instant
|
||||
kolkata = ZoneInfo("Asia/Kolkata")
|
||||
assert result.utctimetuple()[:5] == (2026, 3, 11, 15, 0)
|
||||
expected_local = utc_dt.astimezone(kolkata)
|
||||
assert result == expected_local
|
||||
|
||||
def test_ensure_aware_due_job_not_skipped_when_system_ahead(self, tmp_path, monkeypatch):
|
||||
"""Reproduce the actual bug: system tz ahead of Hermes tz caused
|
||||
overdue jobs to appear as not-yet-due.
|
||||
|
||||
Scenario: system is Asia/Kolkata (UTC+5:30), Hermes is UTC.
|
||||
A naive timestamp from 5 minutes ago (local time) should still
|
||||
be recognized as due after conversion.
|
||||
"""
|
||||
import cron.jobs as jobs_module
|
||||
monkeypatch.setattr(jobs_module, "CRON_DIR", tmp_path / "cron")
|
||||
monkeypatch.setattr(jobs_module, "JOBS_FILE", tmp_path / "cron" / "jobs.json")
|
||||
monkeypatch.setattr(jobs_module, "OUTPUT_DIR", tmp_path / "cron" / "output")
|
||||
|
||||
os.environ["HERMES_TIMEZONE"] = "UTC"
|
||||
hermes_time.reset_cache()
|
||||
|
||||
from cron.jobs import create_job, load_jobs, save_jobs, get_due_jobs
|
||||
|
||||
job = create_job(prompt="Bug repro", schedule="every 1h")
|
||||
jobs = load_jobs()
|
||||
|
||||
# Simulate a naive timestamp that was written by datetime.now() on a
|
||||
# system running in UTC+5:30 — 5 minutes in the past (local time)
|
||||
naive_past = (datetime.now() - timedelta(minutes=5)).isoformat()
|
||||
jobs[0]["next_run_at"] = naive_past
|
||||
save_jobs(jobs)
|
||||
|
||||
# Must be recognized as due regardless of tz mismatch
|
||||
due = get_due_jobs()
|
||||
assert len(due) == 1, (
|
||||
"Overdue job was skipped — _ensure_aware likely shifted absolute time"
|
||||
)
|
||||
|
||||
def test_create_job_stores_tz_aware_timestamps(self, tmp_path, monkeypatch):
|
||||
"""New jobs store timezone-aware created_at and next_run_at."""
|
||||
import cron.jobs as jobs_module
|
||||
|
||||
@@ -137,7 +137,8 @@ class TestBrowserVisionAnnotate:
|
||||
|
||||
with (
|
||||
patch("tools.browser_tool._run_browser_command") as mock_cmd,
|
||||
patch("tools.browser_tool.call_llm") as mock_call_llm,
|
||||
patch("tools.browser_tool._aux_vision_client") as mock_client,
|
||||
patch("tools.browser_tool._DEFAULT_VISION_MODEL", "test-model"),
|
||||
patch("tools.browser_tool._get_vision_model", return_value="test-model"),
|
||||
):
|
||||
mock_cmd.return_value = {"success": True, "data": {}}
|
||||
@@ -158,7 +159,8 @@ class TestBrowserVisionAnnotate:
|
||||
|
||||
with (
|
||||
patch("tools.browser_tool._run_browser_command") as mock_cmd,
|
||||
patch("tools.browser_tool.call_llm") as mock_call_llm,
|
||||
patch("tools.browser_tool._aux_vision_client") as mock_client,
|
||||
patch("tools.browser_tool._DEFAULT_VISION_MODEL", "test-model"),
|
||||
patch("tools.browser_tool._get_vision_model", return_value="test-model"),
|
||||
):
|
||||
mock_cmd.return_value = {"success": True, "data": {}}
|
||||
|
||||
@@ -1,6 +1,5 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
|
||||
Tests for the code execution sandbox (programmatic tool calling).
|
||||
|
||||
These tests monkeypatch handle_function_call so they don't require API keys
|
||||
@@ -12,10 +11,6 @@ Run with: python -m pytest tests/test_code_execution.py -v
|
||||
or: python tests/test_code_execution.py
|
||||
"""
|
||||
|
||||
import pytest
|
||||
pytestmark = pytest.mark.skip(reason="Hangs in non-interactive environments")
|
||||
|
||||
|
||||
import json
|
||||
import os
|
||||
import sys
|
||||
|
||||
@@ -8,11 +8,6 @@ Every test with output validates against a known-good value AND
|
||||
asserts zero contamination from shell noise via _assert_clean().
|
||||
"""
|
||||
|
||||
import pytest
|
||||
pytestmark = pytest.mark.skip(reason="Hangs in non-interactive environments")
|
||||
|
||||
|
||||
|
||||
import json
|
||||
import os
|
||||
import sys
|
||||
|
||||
+56
-177
@@ -1828,8 +1828,8 @@ class TestSamplingCallbackText:
|
||||
)
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
return_value=fake_client.chat.completions.create.return_value,
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(fake_client, "default-model"),
|
||||
):
|
||||
params = _make_sampling_params()
|
||||
result = asyncio.run(self.handler(None, params))
|
||||
@@ -1847,13 +1847,13 @@ class TestSamplingCallbackText:
|
||||
fake_client.chat.completions.create.return_value = _make_llm_response()
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
return_value=fake_client.chat.completions.create.return_value,
|
||||
) as mock_call:
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(fake_client, "default-model"),
|
||||
):
|
||||
params = _make_sampling_params(system_prompt="Be helpful")
|
||||
asyncio.run(self.handler(None, params))
|
||||
|
||||
call_args = mock_call.call_args
|
||||
call_args = fake_client.chat.completions.create.call_args
|
||||
messages = call_args.kwargs["messages"]
|
||||
assert messages[0] == {"role": "system", "content": "Be helpful"}
|
||||
|
||||
@@ -1865,8 +1865,8 @@ class TestSamplingCallbackText:
|
||||
)
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
return_value=fake_client.chat.completions.create.return_value,
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(fake_client, "default-model"),
|
||||
):
|
||||
params = _make_sampling_params()
|
||||
result = asyncio.run(self.handler(None, params))
|
||||
@@ -1889,8 +1889,8 @@ class TestSamplingCallbackToolUse:
|
||||
fake_client.chat.completions.create.return_value = _make_llm_tool_response()
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
return_value=fake_client.chat.completions.create.return_value,
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(fake_client, "default-model"),
|
||||
):
|
||||
params = _make_sampling_params()
|
||||
result = asyncio.run(self.handler(None, params))
|
||||
@@ -1916,8 +1916,8 @@ class TestSamplingCallbackToolUse:
|
||||
)
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
return_value=fake_client.chat.completions.create.return_value,
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(fake_client, "default-model"),
|
||||
):
|
||||
result = asyncio.run(self.handler(None, _make_sampling_params()))
|
||||
|
||||
@@ -1939,8 +1939,8 @@ class TestToolLoopGovernance:
|
||||
fake_client.chat.completions.create.return_value = _make_llm_tool_response()
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
return_value=fake_client.chat.completions.create.return_value,
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(fake_client, "default-model"),
|
||||
):
|
||||
params = _make_sampling_params()
|
||||
# Round 1, 2: allowed
|
||||
@@ -1956,26 +1956,24 @@ class TestToolLoopGovernance:
|
||||
def test_text_response_resets_counter(self):
|
||||
"""A text response resets the tool loop counter."""
|
||||
handler = SamplingHandler("tl2", {"max_tool_rounds": 1})
|
||||
|
||||
# Use a list to hold the current response, so the side_effect can
|
||||
# pick up changes between calls.
|
||||
responses = [_make_llm_tool_response()]
|
||||
fake_client = MagicMock()
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
side_effect=lambda **kw: responses[0],
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(fake_client, "default-model"),
|
||||
):
|
||||
# Tool response (round 1 of 1 allowed)
|
||||
fake_client.chat.completions.create.return_value = _make_llm_tool_response()
|
||||
r1 = asyncio.run(handler(None, _make_sampling_params()))
|
||||
assert isinstance(r1, CreateMessageResultWithTools)
|
||||
|
||||
# Text response resets counter
|
||||
responses[0] = _make_llm_response()
|
||||
fake_client.chat.completions.create.return_value = _make_llm_response()
|
||||
r2 = asyncio.run(handler(None, _make_sampling_params()))
|
||||
assert isinstance(r2, CreateMessageResult)
|
||||
|
||||
# Tool response again (should succeed since counter was reset)
|
||||
responses[0] = _make_llm_tool_response()
|
||||
fake_client.chat.completions.create.return_value = _make_llm_tool_response()
|
||||
r3 = asyncio.run(handler(None, _make_sampling_params()))
|
||||
assert isinstance(r3, CreateMessageResultWithTools)
|
||||
|
||||
@@ -1986,8 +1984,8 @@ class TestToolLoopGovernance:
|
||||
fake_client.chat.completions.create.return_value = _make_llm_tool_response()
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
return_value=fake_client.chat.completions.create.return_value,
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(fake_client, "default-model"),
|
||||
):
|
||||
result = asyncio.run(handler(None, _make_sampling_params()))
|
||||
assert isinstance(result, ErrorData)
|
||||
@@ -2005,8 +2003,8 @@ class TestSamplingErrors:
|
||||
fake_client.chat.completions.create.return_value = _make_llm_response()
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
return_value=fake_client.chat.completions.create.return_value,
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(fake_client, "default-model"),
|
||||
):
|
||||
# First call succeeds
|
||||
r1 = asyncio.run(handler(None, _make_sampling_params()))
|
||||
@@ -2019,16 +2017,20 @@ class TestSamplingErrors:
|
||||
|
||||
def test_timeout_error(self):
|
||||
handler = SamplingHandler("to", {"timeout": 0.05})
|
||||
fake_client = MagicMock()
|
||||
|
||||
def slow_call(**kwargs):
|
||||
import threading
|
||||
# Use an event to ensure the thread truly blocks long enough
|
||||
evt = threading.Event()
|
||||
evt.wait(5) # blocks for up to 5 seconds (cancelled by timeout)
|
||||
return _make_llm_response()
|
||||
|
||||
fake_client.chat.completions.create.side_effect = slow_call
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
side_effect=slow_call,
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(fake_client, "default-model"),
|
||||
):
|
||||
result = asyncio.run(handler(None, _make_sampling_params()))
|
||||
assert isinstance(result, ErrorData)
|
||||
@@ -2039,11 +2041,12 @@ class TestSamplingErrors:
|
||||
handler = SamplingHandler("np", {})
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
side_effect=RuntimeError("No LLM provider configured"),
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(None, None),
|
||||
):
|
||||
result = asyncio.run(handler(None, _make_sampling_params()))
|
||||
assert isinstance(result, ErrorData)
|
||||
assert "No LLM provider" in result.message
|
||||
assert handler.metrics["errors"] == 1
|
||||
|
||||
def test_empty_choices_returns_error(self):
|
||||
@@ -2057,8 +2060,8 @@ class TestSamplingErrors:
|
||||
)
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
return_value=fake_client.chat.completions.create.return_value,
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(fake_client, "default-model"),
|
||||
):
|
||||
result = asyncio.run(handler(None, _make_sampling_params()))
|
||||
|
||||
@@ -2077,8 +2080,8 @@ class TestSamplingErrors:
|
||||
)
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
return_value=fake_client.chat.completions.create.return_value,
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(fake_client, "default-model"),
|
||||
):
|
||||
result = asyncio.run(handler(None, _make_sampling_params()))
|
||||
|
||||
@@ -2096,8 +2099,8 @@ class TestSamplingErrors:
|
||||
)
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
return_value=fake_client.chat.completions.create.return_value,
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(fake_client, "default-model"),
|
||||
):
|
||||
result = asyncio.run(handler(None, _make_sampling_params()))
|
||||
|
||||
@@ -2117,19 +2120,19 @@ class TestModelWhitelist:
|
||||
fake_client.chat.completions.create.return_value = _make_llm_response()
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
return_value=fake_client.chat.completions.create.return_value,
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(fake_client, "test-model"),
|
||||
):
|
||||
result = asyncio.run(handler(None, _make_sampling_params()))
|
||||
assert isinstance(result, CreateMessageResult)
|
||||
|
||||
def test_disallowed_model_rejected(self):
|
||||
handler = SamplingHandler("wl2", {"allowed_models": ["gpt-4o"], "model": "test-model"})
|
||||
handler = SamplingHandler("wl2", {"allowed_models": ["gpt-4o"]})
|
||||
fake_client = MagicMock()
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
return_value=fake_client.chat.completions.create.return_value,
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(fake_client, "gpt-3.5-turbo"),
|
||||
):
|
||||
result = asyncio.run(handler(None, _make_sampling_params()))
|
||||
assert isinstance(result, ErrorData)
|
||||
@@ -2142,8 +2145,8 @@ class TestModelWhitelist:
|
||||
fake_client.chat.completions.create.return_value = _make_llm_response()
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
return_value=fake_client.chat.completions.create.return_value,
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(fake_client, "any-model"),
|
||||
):
|
||||
result = asyncio.run(handler(None, _make_sampling_params()))
|
||||
assert isinstance(result, CreateMessageResult)
|
||||
@@ -2163,8 +2166,8 @@ class TestMalformedToolCallArgs:
|
||||
)
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
return_value=fake_client.chat.completions.create.return_value,
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(fake_client, "default-model"),
|
||||
):
|
||||
result = asyncio.run(handler(None, _make_sampling_params()))
|
||||
|
||||
@@ -2191,8 +2194,8 @@ class TestMalformedToolCallArgs:
|
||||
fake_client.chat.completions.create.return_value = response
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
return_value=fake_client.chat.completions.create.return_value,
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(fake_client, "default-model"),
|
||||
):
|
||||
result = asyncio.run(handler(None, _make_sampling_params()))
|
||||
|
||||
@@ -2211,8 +2214,8 @@ class TestMetricsTracking:
|
||||
fake_client.chat.completions.create.return_value = _make_llm_response()
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
return_value=fake_client.chat.completions.create.return_value,
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(fake_client, "default-model"),
|
||||
):
|
||||
asyncio.run(handler(None, _make_sampling_params()))
|
||||
|
||||
@@ -2226,8 +2229,8 @@ class TestMetricsTracking:
|
||||
fake_client.chat.completions.create.return_value = _make_llm_tool_response()
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
return_value=fake_client.chat.completions.create.return_value,
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(fake_client, "default-model"),
|
||||
):
|
||||
asyncio.run(handler(None, _make_sampling_params()))
|
||||
|
||||
@@ -2238,8 +2241,8 @@ class TestMetricsTracking:
|
||||
handler = SamplingHandler("met3", {})
|
||||
|
||||
with patch(
|
||||
"agent.auxiliary_client.call_llm",
|
||||
side_effect=RuntimeError("No LLM provider configured"),
|
||||
"agent.auxiliary_client.get_text_auxiliary_client",
|
||||
return_value=(None, None),
|
||||
):
|
||||
asyncio.run(handler(None, _make_sampling_params()))
|
||||
|
||||
@@ -2323,127 +2326,3 @@ class TestMCPServerTaskSamplingIntegration:
|
||||
kwargs = server._sampling.session_kwargs()
|
||||
assert "sampling_callback" in kwargs
|
||||
assert "sampling_capabilities" in kwargs
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Discovery failed_count tracking
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class TestDiscoveryFailedCount:
|
||||
"""Verify discover_mcp_tools() correctly tracks failed server connections."""
|
||||
|
||||
def test_failed_server_increments_failed_count(self):
|
||||
"""When _discover_and_register_server raises, failed_count increments."""
|
||||
from tools.mcp_tool import discover_mcp_tools, _servers, _ensure_mcp_loop
|
||||
|
||||
fake_config = {
|
||||
"good_server": {"command": "npx", "args": ["good"]},
|
||||
"bad_server": {"command": "npx", "args": ["bad"]},
|
||||
}
|
||||
|
||||
async def fake_register(name, cfg):
|
||||
if name == "bad_server":
|
||||
raise ConnectionError("Connection refused")
|
||||
# Simulate successful registration
|
||||
from tools.mcp_tool import MCPServerTask
|
||||
server = MCPServerTask(name)
|
||||
server.session = MagicMock()
|
||||
server._tools = [_make_mcp_tool("tool_a")]
|
||||
_servers[name] = server
|
||||
return [f"mcp_{name}_tool_a"]
|
||||
|
||||
with patch("tools.mcp_tool._load_mcp_config", return_value=fake_config), \
|
||||
patch("tools.mcp_tool._discover_and_register_server", side_effect=fake_register), \
|
||||
patch("tools.mcp_tool._MCP_AVAILABLE", True), \
|
||||
patch("tools.mcp_tool._existing_tool_names", return_value=["mcp_good_server_tool_a"]):
|
||||
_ensure_mcp_loop()
|
||||
|
||||
# Capture the logger to verify failed_count in summary
|
||||
with patch("tools.mcp_tool.logger") as mock_logger:
|
||||
discover_mcp_tools()
|
||||
|
||||
# Find the summary info call
|
||||
info_calls = [
|
||||
str(call)
|
||||
for call in mock_logger.info.call_args_list
|
||||
if "failed" in str(call).lower() or "MCP:" in str(call)
|
||||
]
|
||||
# The summary should mention the failure
|
||||
assert any("1 failed" in str(c) for c in info_calls), (
|
||||
f"Summary should report 1 failed server, got: {info_calls}"
|
||||
)
|
||||
|
||||
_servers.pop("good_server", None)
|
||||
_servers.pop("bad_server", None)
|
||||
|
||||
def test_all_servers_fail_still_prints_summary(self):
|
||||
"""When all servers fail, a summary with failure count is still printed."""
|
||||
from tools.mcp_tool import discover_mcp_tools, _servers, _ensure_mcp_loop
|
||||
|
||||
fake_config = {
|
||||
"srv1": {"command": "npx", "args": ["a"]},
|
||||
"srv2": {"command": "npx", "args": ["b"]},
|
||||
}
|
||||
|
||||
async def always_fail(name, cfg):
|
||||
raise ConnectionError(f"Server {name} refused")
|
||||
|
||||
with patch("tools.mcp_tool._load_mcp_config", return_value=fake_config), \
|
||||
patch("tools.mcp_tool._discover_and_register_server", side_effect=always_fail), \
|
||||
patch("tools.mcp_tool._MCP_AVAILABLE", True), \
|
||||
patch("tools.mcp_tool._existing_tool_names", return_value=[]):
|
||||
_ensure_mcp_loop()
|
||||
|
||||
with patch("tools.mcp_tool.logger") as mock_logger:
|
||||
discover_mcp_tools()
|
||||
|
||||
# Summary must be printed even when all servers fail
|
||||
info_calls = [str(call) for call in mock_logger.info.call_args_list]
|
||||
assert any("2 failed" in str(c) for c in info_calls), (
|
||||
f"Summary should report 2 failed servers, got: {info_calls}"
|
||||
)
|
||||
|
||||
_servers.pop("srv1", None)
|
||||
_servers.pop("srv2", None)
|
||||
|
||||
def test_ok_servers_excludes_failures(self):
|
||||
"""ok_servers count correctly excludes failed servers."""
|
||||
from tools.mcp_tool import discover_mcp_tools, _servers, _ensure_mcp_loop
|
||||
|
||||
fake_config = {
|
||||
"ok1": {"command": "npx", "args": ["ok1"]},
|
||||
"ok2": {"command": "npx", "args": ["ok2"]},
|
||||
"fail1": {"command": "npx", "args": ["fail"]},
|
||||
}
|
||||
|
||||
async def selective_register(name, cfg):
|
||||
if name == "fail1":
|
||||
raise ConnectionError("Refused")
|
||||
from tools.mcp_tool import MCPServerTask
|
||||
server = MCPServerTask(name)
|
||||
server.session = MagicMock()
|
||||
server._tools = [_make_mcp_tool("t")]
|
||||
_servers[name] = server
|
||||
return [f"mcp_{name}_t"]
|
||||
|
||||
with patch("tools.mcp_tool._load_mcp_config", return_value=fake_config), \
|
||||
patch("tools.mcp_tool._discover_and_register_server", side_effect=selective_register), \
|
||||
patch("tools.mcp_tool._MCP_AVAILABLE", True), \
|
||||
patch("tools.mcp_tool._existing_tool_names", return_value=["mcp_ok1_t", "mcp_ok2_t"]):
|
||||
_ensure_mcp_loop()
|
||||
|
||||
with patch("tools.mcp_tool.logger") as mock_logger:
|
||||
discover_mcp_tools()
|
||||
|
||||
info_calls = [str(call) for call in mock_logger.info.call_args_list]
|
||||
# Should say "2 server(s)" not "3 server(s)"
|
||||
assert any("2 server" in str(c) for c in info_calls), (
|
||||
f"Summary should report 2 ok servers, got: {info_calls}"
|
||||
)
|
||||
assert any("1 failed" in str(c) for c in info_calls), (
|
||||
f"Summary should report 1 failed, got: {info_calls}"
|
||||
)
|
||||
|
||||
_servers.pop("ok1", None)
|
||||
_servers.pop("ok2", None)
|
||||
_servers.pop("fail1", None)
|
||||
|
||||
@@ -189,14 +189,16 @@ class TestSessionSearch:
|
||||
{"role": "assistant", "content": "hi there"},
|
||||
]
|
||||
|
||||
# Mock async_call_llm to raise RuntimeError → summarizer returns None
|
||||
from unittest.mock import AsyncMock, patch as _patch
|
||||
with _patch("tools.session_search_tool.async_call_llm",
|
||||
new_callable=AsyncMock,
|
||||
side_effect=RuntimeError("no provider")):
|
||||
result = json.loads(session_search(
|
||||
query="test", db=mock_db, current_session_id=current_sid,
|
||||
))
|
||||
# Mock the summarizer to return a simple summary
|
||||
import tools.session_search_tool as sst
|
||||
original_client = sst._async_aux_client
|
||||
sst._async_aux_client = None # Disable summarizer → returns None
|
||||
|
||||
result = json.loads(session_search(
|
||||
query="test", db=mock_db, current_session_id=current_sid,
|
||||
))
|
||||
|
||||
sst._async_aux_client = original_client
|
||||
|
||||
assert result["success"] is True
|
||||
# Current session should be skipped, only other_sid should appear
|
||||
|
||||
@@ -202,7 +202,7 @@ class TestHandleVisionAnalyze:
|
||||
assert model == "custom/model-v1"
|
||||
|
||||
def test_falls_back_to_default_model(self):
|
||||
"""Without AUXILIARY_VISION_MODEL, model should be None (let call_llm resolve default)."""
|
||||
"""Without AUXILIARY_VISION_MODEL, should use DEFAULT_VISION_MODEL or fallback."""
|
||||
with (
|
||||
patch(
|
||||
"tools.vision_tools.vision_analyze_tool", new_callable=AsyncMock
|
||||
@@ -218,9 +218,9 @@ class TestHandleVisionAnalyze:
|
||||
coro.close()
|
||||
call_args = mock_tool.call_args
|
||||
model = call_args[0][2]
|
||||
# With no AUXILIARY_VISION_MODEL set, model should be None
|
||||
# (the centralized call_llm router picks the default)
|
||||
assert model is None
|
||||
# Should be DEFAULT_VISION_MODEL or the hardcoded fallback
|
||||
assert model is not None
|
||||
assert len(model) > 0
|
||||
|
||||
def test_empty_args_graceful(self):
|
||||
"""Missing keys should default to empty strings, not raise."""
|
||||
@@ -277,6 +277,8 @@ class TestErrorLoggingExcInfo:
|
||||
new_callable=AsyncMock,
|
||||
side_effect=Exception("download boom"),
|
||||
),
|
||||
patch("tools.vision_tools._aux_async_client", MagicMock()),
|
||||
patch("tools.vision_tools.DEFAULT_VISION_MODEL", "test/model"),
|
||||
caplog.at_level(logging.ERROR, logger="tools.vision_tools"),
|
||||
):
|
||||
result = await vision_analyze_tool(
|
||||
@@ -309,16 +311,25 @@ class TestErrorLoggingExcInfo:
|
||||
"tools.vision_tools._image_to_base64_data_url",
|
||||
return_value="data:image/jpeg;base64,abc",
|
||||
),
|
||||
patch("agent.auxiliary_client.get_auxiliary_extra_body", return_value=None),
|
||||
patch(
|
||||
"agent.auxiliary_client.auxiliary_max_tokens_param",
|
||||
return_value={"max_tokens": 2000},
|
||||
),
|
||||
caplog.at_level(logging.WARNING, logger="tools.vision_tools"),
|
||||
):
|
||||
# Mock the async_call_llm function to return a mock response
|
||||
# Mock the vision client
|
||||
mock_client = AsyncMock()
|
||||
mock_response = MagicMock()
|
||||
mock_choice = MagicMock()
|
||||
mock_choice.message.content = "A test image description"
|
||||
mock_response.choices = [mock_choice]
|
||||
mock_client.chat.completions.create = AsyncMock(return_value=mock_response)
|
||||
|
||||
# Patch module-level _aux_async_client so the tool doesn't bail early
|
||||
with (
|
||||
patch("tools.vision_tools.async_call_llm", new_callable=AsyncMock, return_value=mock_response),
|
||||
patch("tools.vision_tools._aux_async_client", mock_client),
|
||||
patch("tools.vision_tools.DEFAULT_VISION_MODEL", "test/model"),
|
||||
):
|
||||
# Make unlink fail to trigger cleanup warning
|
||||
original_unlink = Path.unlink
|
||||
|
||||
+56
-27
@@ -63,7 +63,7 @@ import time
|
||||
import requests
|
||||
from typing import Dict, Any, Optional, List
|
||||
from pathlib import Path
|
||||
from agent.auxiliary_client import call_llm
|
||||
from agent.auxiliary_client import get_vision_auxiliary_client, get_text_auxiliary_client
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -80,15 +80,38 @@ DEFAULT_SESSION_TIMEOUT = 300
|
||||
# Max tokens for snapshot content before summarization
|
||||
SNAPSHOT_SUMMARIZE_THRESHOLD = 8000
|
||||
|
||||
# Vision client — for browser_vision (screenshot analysis)
|
||||
# Wrapped in try/except so a broken auxiliary config doesn't prevent the entire
|
||||
# browser_tool module from importing (which would disable all 10 browser tools).
|
||||
try:
|
||||
_aux_vision_client, _DEFAULT_VISION_MODEL = get_vision_auxiliary_client()
|
||||
except Exception as _init_err:
|
||||
logger.debug("Could not initialise vision auxiliary client: %s", _init_err)
|
||||
_aux_vision_client, _DEFAULT_VISION_MODEL = None, None
|
||||
|
||||
def _get_vision_model() -> Optional[str]:
|
||||
# Text client — for page snapshot summarization (same config as web_extract)
|
||||
try:
|
||||
_aux_text_client, _DEFAULT_TEXT_MODEL = get_text_auxiliary_client("web_extract")
|
||||
except Exception as _init_err:
|
||||
logger.debug("Could not initialise text auxiliary client: %s", _init_err)
|
||||
_aux_text_client, _DEFAULT_TEXT_MODEL = None, None
|
||||
|
||||
# Module-level alias for availability checks
|
||||
EXTRACTION_MODEL = _DEFAULT_TEXT_MODEL or _DEFAULT_VISION_MODEL
|
||||
|
||||
|
||||
def _get_vision_model() -> str:
|
||||
"""Model for browser_vision (screenshot analysis — multimodal)."""
|
||||
return os.getenv("AUXILIARY_VISION_MODEL", "").strip() or None
|
||||
return (os.getenv("AUXILIARY_VISION_MODEL", "").strip()
|
||||
or _DEFAULT_VISION_MODEL
|
||||
or "google/gemini-3-flash-preview")
|
||||
|
||||
|
||||
def _get_extraction_model() -> Optional[str]:
|
||||
def _get_extraction_model() -> str:
|
||||
"""Model for page snapshot text summarization — same as web_extract."""
|
||||
return os.getenv("AUXILIARY_WEB_EXTRACT_MODEL", "").strip() or None
|
||||
return (os.getenv("AUXILIARY_WEB_EXTRACT_MODEL", "").strip()
|
||||
or _DEFAULT_TEXT_MODEL
|
||||
or "google/gemini-3-flash-preview")
|
||||
|
||||
|
||||
def _is_local_mode() -> bool:
|
||||
@@ -918,6 +941,9 @@ def _extract_relevant_content(
|
||||
|
||||
Falls back to simple truncation when no auxiliary text model is configured.
|
||||
"""
|
||||
if _aux_text_client is None:
|
||||
return _truncate_snapshot(snapshot_text)
|
||||
|
||||
if user_task:
|
||||
extraction_prompt = (
|
||||
f"You are a content extractor for a browser automation agent.\n\n"
|
||||
@@ -942,16 +968,13 @@ def _extract_relevant_content(
|
||||
)
|
||||
|
||||
try:
|
||||
call_kwargs = {
|
||||
"task": "web_extract",
|
||||
"messages": [{"role": "user", "content": extraction_prompt}],
|
||||
"max_tokens": 4000,
|
||||
"temperature": 0.1,
|
||||
}
|
||||
model = _get_extraction_model()
|
||||
if model:
|
||||
call_kwargs["model"] = model
|
||||
response = call_llm(**call_kwargs)
|
||||
from agent.auxiliary_client import auxiliary_max_tokens_param
|
||||
response = _aux_text_client.chat.completions.create(
|
||||
model=_get_extraction_model(),
|
||||
messages=[{"role": "user", "content": extraction_prompt}],
|
||||
**auxiliary_max_tokens_param(4000),
|
||||
temperature=0.1,
|
||||
)
|
||||
return response.choices[0].message.content
|
||||
except Exception:
|
||||
return _truncate_snapshot(snapshot_text)
|
||||
@@ -1474,6 +1497,14 @@ def browser_vision(question: str, annotate: bool = False, task_id: Optional[str]
|
||||
|
||||
effective_task_id = task_id or "default"
|
||||
|
||||
# Check auxiliary vision client
|
||||
if _aux_vision_client is None or _DEFAULT_VISION_MODEL is None:
|
||||
return json.dumps({
|
||||
"success": False,
|
||||
"error": "Browser vision unavailable: no auxiliary vision model configured. "
|
||||
"Set OPENROUTER_API_KEY or configure Nous Portal to enable browser vision."
|
||||
}, ensure_ascii=False)
|
||||
|
||||
# Save screenshot to persistent location so it can be shared with users
|
||||
hermes_home = Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes"))
|
||||
screenshots_dir = hermes_home / "browser_screenshots"
|
||||
@@ -1531,13 +1562,14 @@ def browser_vision(question: str, annotate: bool = False, task_id: Optional[str]
|
||||
f"Focus on answering the user's specific question."
|
||||
)
|
||||
|
||||
# Use the centralized LLM router
|
||||
# Use the sync auxiliary vision client directly
|
||||
from agent.auxiliary_client import auxiliary_max_tokens_param
|
||||
vision_model = _get_vision_model()
|
||||
logger.debug("browser_vision: analysing screenshot (%d bytes)",
|
||||
len(image_data))
|
||||
call_kwargs = {
|
||||
"task": "vision",
|
||||
"messages": [
|
||||
logger.debug("browser_vision: analysing screenshot (%d bytes) with model=%s",
|
||||
len(image_data), vision_model)
|
||||
response = _aux_vision_client.chat.completions.create(
|
||||
model=vision_model,
|
||||
messages=[
|
||||
{
|
||||
"role": "user",
|
||||
"content": [
|
||||
@@ -1546,12 +1578,9 @@ def browser_vision(question: str, annotate: bool = False, task_id: Optional[str]
|
||||
],
|
||||
}
|
||||
],
|
||||
"max_tokens": 2000,
|
||||
"temperature": 0.1,
|
||||
}
|
||||
if vision_model:
|
||||
call_kwargs["model"] = vision_model
|
||||
response = call_llm(**call_kwargs)
|
||||
**auxiliary_max_tokens_param(2000),
|
||||
temperature=0.1,
|
||||
)
|
||||
|
||||
analysis = response.choices[0].message.content
|
||||
response_data = {
|
||||
|
||||
@@ -95,34 +95,21 @@ def _run_git(
|
||||
) -> tuple:
|
||||
"""Run a git command against the shadow repo. Returns (ok, stdout, stderr)."""
|
||||
env = _git_env(shadow_repo, working_dir)
|
||||
cmd = ["git"] + list(args)
|
||||
try:
|
||||
result = subprocess.run(
|
||||
cmd,
|
||||
["git"] + args,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
timeout=timeout,
|
||||
env=env,
|
||||
cwd=str(Path(working_dir).resolve()),
|
||||
)
|
||||
ok = result.returncode == 0
|
||||
stdout = result.stdout.strip()
|
||||
stderr = result.stderr.strip()
|
||||
if not ok:
|
||||
logger.error(
|
||||
"Git command failed: %s (rc=%d) stderr=%s",
|
||||
" ".join(cmd), result.returncode, stderr,
|
||||
)
|
||||
return ok, stdout, stderr
|
||||
return result.returncode == 0, result.stdout.strip(), result.stderr.strip()
|
||||
except subprocess.TimeoutExpired:
|
||||
msg = f"git timed out after {timeout}s: {' '.join(cmd)}"
|
||||
logger.error(msg, exc_info=True)
|
||||
return False, "", msg
|
||||
return False, "", f"git timed out after {timeout}s: git {' '.join(args)}"
|
||||
except FileNotFoundError:
|
||||
logger.error("Git executable not found: %s", " ".join(cmd), exc_info=True)
|
||||
return False, "", "git not found"
|
||||
except Exception as exc:
|
||||
logger.error("Unexpected git error running %s: %s", " ".join(cmd), exc, exc_info=True)
|
||||
return False, "", str(exc)
|
||||
|
||||
|
||||
@@ -300,7 +287,7 @@ class CheckpointManager:
|
||||
["cat-file", "-t", commit_hash], shadow, abs_dir,
|
||||
)
|
||||
if not ok:
|
||||
return {"success": False, "error": f"Checkpoint '{commit_hash}' not found", "debug": err or None}
|
||||
return {"success": False, "error": f"Checkpoint '{commit_hash}' not found"}
|
||||
|
||||
# Take a checkpoint of current state before restoring (so you can undo the undo)
|
||||
self._take(abs_dir, f"pre-rollback snapshot (restoring to {commit_hash[:8]})")
|
||||
@@ -312,7 +299,7 @@ class CheckpointManager:
|
||||
)
|
||||
|
||||
if not ok:
|
||||
return {"success": False, "error": "Restore failed", "debug": err or None}
|
||||
return {"success": False, "error": f"Restore failed: {err}"}
|
||||
|
||||
# Get info about what was restored
|
||||
ok2, reason_out, _ = _run_git(
|
||||
|
||||
@@ -209,7 +209,7 @@ def _upscale_image(image_url: str, original_prompt: str) -> Dict[str, Any]:
|
||||
return None
|
||||
|
||||
except Exception as e:
|
||||
logger.error("Error upscaling image: %s", e, exc_info=True)
|
||||
logger.error("Error upscaling image: %s", e)
|
||||
return None
|
||||
|
||||
|
||||
@@ -377,7 +377,7 @@ def image_generate_tool(
|
||||
except Exception as e:
|
||||
generation_time = (datetime.datetime.now() - start_time).total_seconds()
|
||||
error_msg = f"Error generating image: {str(e)}"
|
||||
logger.error("%s", error_msg, exc_info=True)
|
||||
logger.error("%s", error_msg)
|
||||
|
||||
# Prepare error response - minimal format
|
||||
response_data = {
|
||||
|
||||
+35
-25
@@ -456,13 +456,17 @@ class SamplingHandler:
|
||||
# Resolve model
|
||||
model = self._resolve_model(getattr(params, "modelPreferences", None))
|
||||
|
||||
# Get auxiliary LLM client via centralized router
|
||||
from agent.auxiliary_client import call_llm
|
||||
# Get auxiliary LLM client
|
||||
from agent.auxiliary_client import get_text_auxiliary_client
|
||||
client, default_model = get_text_auxiliary_client()
|
||||
if client is None:
|
||||
self.metrics["errors"] += 1
|
||||
return self._error("No LLM provider available for sampling")
|
||||
|
||||
# Model whitelist check (we need to resolve model before calling)
|
||||
resolved_model = model or self.model_override or ""
|
||||
resolved_model = model or default_model
|
||||
|
||||
if self.allowed_models and resolved_model and resolved_model not in self.allowed_models:
|
||||
# Model whitelist check
|
||||
if self.allowed_models and resolved_model not in self.allowed_models:
|
||||
logger.warning(
|
||||
"MCP server '%s' requested model '%s' not in allowed_models",
|
||||
self.server_name, resolved_model,
|
||||
@@ -480,15 +484,20 @@ class SamplingHandler:
|
||||
|
||||
# Build LLM call kwargs
|
||||
max_tokens = min(params.maxTokens, self.max_tokens_cap)
|
||||
call_temperature = None
|
||||
call_kwargs: dict = {
|
||||
"model": resolved_model,
|
||||
"messages": messages,
|
||||
"max_tokens": max_tokens,
|
||||
}
|
||||
if hasattr(params, "temperature") and params.temperature is not None:
|
||||
call_temperature = params.temperature
|
||||
call_kwargs["temperature"] = params.temperature
|
||||
if stop := getattr(params, "stopSequences", None):
|
||||
call_kwargs["stop"] = stop
|
||||
|
||||
# Forward server-provided tools
|
||||
call_tools = None
|
||||
server_tools = getattr(params, "tools", None)
|
||||
if server_tools:
|
||||
call_tools = [
|
||||
call_kwargs["tools"] = [
|
||||
{
|
||||
"type": "function",
|
||||
"function": {
|
||||
@@ -499,6 +508,9 @@ class SamplingHandler:
|
||||
}
|
||||
for t in server_tools
|
||||
]
|
||||
if tool_choice := getattr(params, "toolChoice", None):
|
||||
mode = getattr(tool_choice, "mode", "auto")
|
||||
call_kwargs["tool_choice"] = {"auto": "auto", "required": "required", "none": "none"}.get(mode, "auto")
|
||||
|
||||
logger.log(
|
||||
self.audit_level,
|
||||
@@ -508,15 +520,7 @@ class SamplingHandler:
|
||||
|
||||
# Offload sync LLM call to thread (non-blocking)
|
||||
def _sync_call():
|
||||
return call_llm(
|
||||
task="mcp",
|
||||
model=resolved_model or None,
|
||||
messages=messages,
|
||||
temperature=call_temperature,
|
||||
max_tokens=max_tokens,
|
||||
tools=call_tools,
|
||||
timeout=self.timeout,
|
||||
)
|
||||
return client.chat.completions.create(**call_kwargs)
|
||||
|
||||
try:
|
||||
response = await asyncio.wait_for(
|
||||
@@ -1327,23 +1331,29 @@ def discover_mcp_tools() -> List[str]:
|
||||
|
||||
async def _discover_one(name: str, cfg: dict) -> List[str]:
|
||||
"""Connect to a single server and return its registered tool names."""
|
||||
return await _discover_and_register_server(name, cfg)
|
||||
transport_desc = cfg.get("url", f'{cfg.get("command", "?")} {" ".join(cfg.get("args", [])[:2])}')
|
||||
try:
|
||||
registered = await _discover_and_register_server(name, cfg)
|
||||
transport_type = "HTTP" if "url" in cfg else "stdio"
|
||||
return registered
|
||||
except Exception as exc:
|
||||
logger.warning(
|
||||
"Failed to connect to MCP server '%s': %s",
|
||||
name, exc,
|
||||
)
|
||||
return []
|
||||
|
||||
async def _discover_all():
|
||||
nonlocal failed_count
|
||||
server_names = list(new_servers.keys())
|
||||
# Connect to all servers in PARALLEL
|
||||
results = await asyncio.gather(
|
||||
*(_discover_one(name, cfg) for name, cfg in new_servers.items()),
|
||||
return_exceptions=True,
|
||||
)
|
||||
for name, result in zip(server_names, results):
|
||||
for result in results:
|
||||
if isinstance(result, Exception):
|
||||
failed_count += 1
|
||||
logger.warning(
|
||||
"Failed to connect to MCP server '%s': %s",
|
||||
name, result,
|
||||
)
|
||||
logger.warning("MCP discovery error: %s", result)
|
||||
elif isinstance(result, list):
|
||||
all_tools.extend(result)
|
||||
else:
|
||||
|
||||
+20
-11
@@ -1,30 +1,39 @@
|
||||
"""Shared OpenRouter API client for Hermes tools.
|
||||
|
||||
Provides a single lazy-initialized AsyncOpenAI client that all tool modules
|
||||
can share. Routes through the centralized provider router in
|
||||
agent/auxiliary_client.py so auth, headers, and API format are handled
|
||||
consistently.
|
||||
can share, eliminating the duplicated _get_openrouter_client() /
|
||||
_get_summarizer_client() pattern previously copy-pasted across web_tools,
|
||||
vision_tools, mixture_of_agents_tool, and session_search_tool.
|
||||
"""
|
||||
|
||||
import os
|
||||
|
||||
_client = None
|
||||
from openai import AsyncOpenAI
|
||||
from hermes_constants import OPENROUTER_BASE_URL
|
||||
|
||||
_client: AsyncOpenAI | None = None
|
||||
|
||||
|
||||
def get_async_client():
|
||||
"""Return a shared async OpenAI-compatible client for OpenRouter.
|
||||
def get_async_client() -> AsyncOpenAI:
|
||||
"""Return a shared AsyncOpenAI client pointed at OpenRouter.
|
||||
|
||||
The client is created lazily on first call and reused thereafter.
|
||||
Uses the centralized provider router for auth and client construction.
|
||||
Raises ValueError if OPENROUTER_API_KEY is not set.
|
||||
"""
|
||||
global _client
|
||||
if _client is None:
|
||||
from agent.auxiliary_client import resolve_provider_client
|
||||
client, _model = resolve_provider_client("openrouter", async_mode=True)
|
||||
if client is None:
|
||||
api_key = os.getenv("OPENROUTER_API_KEY")
|
||||
if not api_key:
|
||||
raise ValueError("OPENROUTER_API_KEY environment variable not set")
|
||||
_client = client
|
||||
_client = AsyncOpenAI(
|
||||
api_key=api_key,
|
||||
base_url=OPENROUTER_BASE_URL,
|
||||
default_headers={
|
||||
"HTTP-Referer": "https://github.com/NousResearch/hermes-agent",
|
||||
"X-OpenRouter-Title": "Hermes Agent",
|
||||
"X-OpenRouter-Categories": "productivity,cli-agent",
|
||||
},
|
||||
)
|
||||
return _client
|
||||
|
||||
|
||||
|
||||
@@ -22,7 +22,13 @@ import os
|
||||
import logging
|
||||
from typing import Dict, Any, List, Optional, Union
|
||||
|
||||
from agent.auxiliary_client import async_call_llm
|
||||
from openai import AsyncOpenAI, OpenAI
|
||||
|
||||
from agent.auxiliary_client import get_async_text_auxiliary_client
|
||||
|
||||
# Resolve the async auxiliary client at import time so we have the model slug.
|
||||
# Handles Codex Responses API adapter transparently.
|
||||
_async_aux_client, _SUMMARIZER_MODEL = get_async_text_auxiliary_client()
|
||||
MAX_SESSION_CHARS = 100_000
|
||||
MAX_SUMMARY_TOKENS = 10000
|
||||
|
||||
@@ -150,22 +156,26 @@ async def _summarize_session(
|
||||
f"Summarize this conversation with focus on: {query}"
|
||||
)
|
||||
|
||||
if _async_aux_client is None or _SUMMARIZER_MODEL is None:
|
||||
logging.warning("No auxiliary model available for session summarization")
|
||||
return None
|
||||
|
||||
max_retries = 3
|
||||
for attempt in range(max_retries):
|
||||
try:
|
||||
response = await async_call_llm(
|
||||
task="session_search",
|
||||
from agent.auxiliary_client import get_auxiliary_extra_body, auxiliary_max_tokens_param
|
||||
_extra = get_auxiliary_extra_body()
|
||||
response = await _async_aux_client.chat.completions.create(
|
||||
model=_SUMMARIZER_MODEL,
|
||||
messages=[
|
||||
{"role": "system", "content": system_prompt},
|
||||
{"role": "user", "content": user_prompt},
|
||||
],
|
||||
**({} if not _extra else {"extra_body": _extra}),
|
||||
temperature=0.1,
|
||||
max_tokens=MAX_SUMMARY_TOKENS,
|
||||
**auxiliary_max_tokens_param(MAX_SUMMARY_TOKENS),
|
||||
)
|
||||
return response.choices[0].message.content.strip()
|
||||
except RuntimeError:
|
||||
logging.warning("No auxiliary model available for session summarization")
|
||||
return None
|
||||
except Exception as e:
|
||||
if attempt < max_retries - 1:
|
||||
await asyncio.sleep(1 * (attempt + 1))
|
||||
@@ -323,6 +333,8 @@ def session_search(
|
||||
|
||||
def check_session_search_requirements() -> bool:
|
||||
"""Requires SQLite state database and an auxiliary text model."""
|
||||
if _async_aux_client is None:
|
||||
return False
|
||||
try:
|
||||
from hermes_state import DEFAULT_DB_PATH
|
||||
return DEFAULT_DB_PATH.parent.exists()
|
||||
|
||||
+18
-5
@@ -29,7 +29,7 @@ from datetime import datetime, timezone
|
||||
from pathlib import Path
|
||||
from typing import List, Tuple
|
||||
|
||||
|
||||
from hermes_constants import OPENROUTER_BASE_URL
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
@@ -934,12 +934,25 @@ def llm_audit_skill(skill_path: Path, static_result: ScanResult,
|
||||
if not model:
|
||||
return static_result
|
||||
|
||||
# Call the LLM via the centralized provider router
|
||||
# Call the LLM via the OpenAI SDK (same pattern as run_agent.py)
|
||||
try:
|
||||
from agent.auxiliary_client import call_llm
|
||||
from openai import OpenAI
|
||||
import os
|
||||
|
||||
response = call_llm(
|
||||
provider="openrouter",
|
||||
api_key = os.getenv("OPENROUTER_API_KEY", "")
|
||||
if not api_key:
|
||||
return static_result
|
||||
|
||||
client = OpenAI(
|
||||
base_url=OPENROUTER_BASE_URL,
|
||||
api_key=api_key,
|
||||
default_headers={
|
||||
"HTTP-Referer": "https://github.com/NousResearch/hermes-agent",
|
||||
"X-OpenRouter-Title": "Hermes Agent",
|
||||
"X-OpenRouter-Categories": "productivity,cli-agent",
|
||||
},
|
||||
)
|
||||
response = client.chat.completions.create(
|
||||
model=model,
|
||||
messages=[{
|
||||
"role": "user",
|
||||
|
||||
+45
-46
@@ -37,11 +37,28 @@ from pathlib import Path
|
||||
from typing import Any, Awaitable, Dict, Optional
|
||||
from urllib.parse import urlparse
|
||||
import httpx
|
||||
from agent.auxiliary_client import async_call_llm
|
||||
from openai import AsyncOpenAI
|
||||
from agent.auxiliary_client import get_vision_auxiliary_client
|
||||
from tools.debug_helpers import DebugSession
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Resolve vision auxiliary client at module level; build an async wrapper.
|
||||
_aux_sync_client, DEFAULT_VISION_MODEL = get_vision_auxiliary_client()
|
||||
_aux_async_client: AsyncOpenAI | None = None
|
||||
if _aux_sync_client is not None:
|
||||
_async_kwargs = {
|
||||
"api_key": _aux_sync_client.api_key,
|
||||
"base_url": str(_aux_sync_client.base_url),
|
||||
}
|
||||
if "openrouter" in str(_aux_sync_client.base_url).lower():
|
||||
_async_kwargs["default_headers"] = {
|
||||
"HTTP-Referer": "https://github.com/NousResearch/hermes-agent",
|
||||
"X-OpenRouter-Title": "Hermes Agent",
|
||||
"X-OpenRouter-Categories": "productivity,cli-agent",
|
||||
}
|
||||
_aux_async_client = AsyncOpenAI(**_async_kwargs)
|
||||
|
||||
_debug = DebugSession("vision_tools", env_var="VISION_TOOLS_DEBUG")
|
||||
|
||||
|
||||
@@ -180,7 +197,7 @@ def _image_to_base64_data_url(image_path: Path, mime_type: Optional[str] = None)
|
||||
async def vision_analyze_tool(
|
||||
image_url: str,
|
||||
user_prompt: str,
|
||||
model: str = None,
|
||||
model: str = DEFAULT_VISION_MODEL,
|
||||
) -> str:
|
||||
"""
|
||||
Analyze an image from a URL or local file path using vision AI.
|
||||
@@ -240,6 +257,14 @@ async def vision_analyze_tool(
|
||||
logger.info("Analyzing image: %s", image_url[:60])
|
||||
logger.info("User prompt: %s", user_prompt[:100])
|
||||
|
||||
# Check auxiliary vision client availability
|
||||
if _aux_async_client is None or DEFAULT_VISION_MODEL is None:
|
||||
return json.dumps({
|
||||
"success": False,
|
||||
"analysis": "Vision analysis unavailable: no auxiliary vision model configured. "
|
||||
"Set OPENROUTER_API_KEY or configure Nous Portal to enable vision tools."
|
||||
}, indent=2, ensure_ascii=False)
|
||||
|
||||
# Determine if this is a local file path or a remote URL
|
||||
local_path = Path(image_url)
|
||||
if local_path.is_file():
|
||||
@@ -295,18 +320,18 @@ async def vision_analyze_tool(
|
||||
}
|
||||
]
|
||||
|
||||
logger.info("Processing image with vision model...")
|
||||
logger.info("Processing image with %s...", model)
|
||||
|
||||
# Call the vision API via centralized router
|
||||
call_kwargs = {
|
||||
"task": "vision",
|
||||
"messages": messages,
|
||||
"temperature": 0.1,
|
||||
"max_tokens": 2000,
|
||||
}
|
||||
if model:
|
||||
call_kwargs["model"] = model
|
||||
response = await async_call_llm(**call_kwargs)
|
||||
# Call the vision API
|
||||
from agent.auxiliary_client import get_auxiliary_extra_body, auxiliary_max_tokens_param
|
||||
_extra = get_auxiliary_extra_body()
|
||||
response = await _aux_async_client.chat.completions.create(
|
||||
model=model,
|
||||
messages=messages,
|
||||
temperature=0.1,
|
||||
**auxiliary_max_tokens_param(2000),
|
||||
**({} if not _extra else {"extra_body": _extra}),
|
||||
)
|
||||
|
||||
# Extract the analysis
|
||||
analysis = response.choices[0].message.content.strip()
|
||||
@@ -333,28 +358,10 @@ async def vision_analyze_tool(
|
||||
error_msg = f"Error analyzing image: {str(e)}"
|
||||
logger.error("%s", error_msg, exc_info=True)
|
||||
|
||||
# Detect vision capability errors — give the model a clear message
|
||||
# so it can inform the user instead of a cryptic API error.
|
||||
err_str = str(e).lower()
|
||||
if any(hint in err_str for hint in (
|
||||
"does not support", "not support image", "invalid_request",
|
||||
"content_policy", "image_url", "multimodal",
|
||||
"unrecognized request argument", "image input",
|
||||
)):
|
||||
analysis = (
|
||||
f"{model} does not support vision or our request was not "
|
||||
f"accepted by the server. Error: {e}"
|
||||
)
|
||||
else:
|
||||
analysis = (
|
||||
"There was a problem with the request and the image could not "
|
||||
f"be analyzed. Error: {e}"
|
||||
)
|
||||
|
||||
# Prepare error response
|
||||
result = {
|
||||
"success": False,
|
||||
"analysis": analysis,
|
||||
"analysis": "There was a problem with the request and the image could not be analyzed."
|
||||
}
|
||||
|
||||
debug_call_data["error"] = error_msg
|
||||
@@ -377,18 +384,7 @@ async def vision_analyze_tool(
|
||||
|
||||
def check_vision_requirements() -> bool:
|
||||
"""Check if an auxiliary vision model is available."""
|
||||
try:
|
||||
from agent.auxiliary_client import resolve_provider_client
|
||||
client, _ = resolve_provider_client("openrouter")
|
||||
if client is not None:
|
||||
return True
|
||||
client, _ = resolve_provider_client("nous")
|
||||
if client is not None:
|
||||
return True
|
||||
client, _ = resolve_provider_client("custom")
|
||||
return client is not None
|
||||
except Exception:
|
||||
return False
|
||||
return _aux_async_client is not None
|
||||
|
||||
|
||||
def get_debug_session_info() -> Dict[str, Any]:
|
||||
@@ -416,9 +412,10 @@ if __name__ == "__main__":
|
||||
print("Set OPENROUTER_API_KEY or configure Nous Portal to enable vision tools.")
|
||||
exit(1)
|
||||
else:
|
||||
print("✅ Vision model available")
|
||||
print(f"✅ Vision model available: {DEFAULT_VISION_MODEL}")
|
||||
|
||||
print("🛠️ Vision tools ready for use!")
|
||||
print(f"🧠 Using model: {DEFAULT_VISION_MODEL}")
|
||||
|
||||
# Show debug mode status
|
||||
if _debug.active:
|
||||
@@ -485,7 +482,9 @@ def _handle_vision_analyze(args: Dict[str, Any], **kw: Any) -> Awaitable[str]:
|
||||
"Fully describe and explain everything about this image, then answer the "
|
||||
f"following question:\n\n{question}"
|
||||
)
|
||||
model = os.getenv("AUXILIARY_VISION_MODEL", "").strip() or None
|
||||
model = (os.getenv("AUXILIARY_VISION_MODEL", "").strip()
|
||||
or DEFAULT_VISION_MODEL
|
||||
or "google/gemini-3-flash-preview")
|
||||
return vision_analyze_tool(image_url, full_prompt, model)
|
||||
|
||||
|
||||
|
||||
+52
-37
@@ -47,7 +47,8 @@ import re
|
||||
import asyncio
|
||||
from typing import List, Dict, Any, Optional
|
||||
from firecrawl import Firecrawl
|
||||
from agent.auxiliary_client import async_call_llm
|
||||
from openai import AsyncOpenAI
|
||||
from agent.auxiliary_client import get_async_text_auxiliary_client
|
||||
from tools.debug_helpers import DebugSession
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
@@ -82,8 +83,15 @@ def _get_firecrawl_client():
|
||||
|
||||
DEFAULT_MIN_LENGTH_FOR_SUMMARIZATION = 5000
|
||||
|
||||
# Allow per-task override via env var
|
||||
DEFAULT_SUMMARIZER_MODEL = os.getenv("AUXILIARY_WEB_EXTRACT_MODEL", "").strip() or None
|
||||
# Resolve async auxiliary client at module level.
|
||||
# Handles Codex Responses API adapter transparently.
|
||||
_aux_async_client, _DEFAULT_SUMMARIZER_MODEL = get_async_text_auxiliary_client("web_extract")
|
||||
|
||||
# Allow per-task override via config.yaml auxiliary.web_extract_model
|
||||
DEFAULT_SUMMARIZER_MODEL = (
|
||||
os.getenv("AUXILIARY_WEB_EXTRACT_MODEL", "").strip()
|
||||
or _DEFAULT_SUMMARIZER_MODEL
|
||||
)
|
||||
|
||||
_debug = DebugSession("web_tools", env_var="WEB_TOOLS_DEBUG")
|
||||
|
||||
@@ -241,22 +249,22 @@ Create a markdown summary that captures all key information in a well-organized,
|
||||
|
||||
for attempt in range(max_retries):
|
||||
try:
|
||||
call_kwargs = {
|
||||
"task": "web_extract",
|
||||
"messages": [
|
||||
if _aux_async_client is None:
|
||||
logger.warning("No auxiliary model available for web content processing")
|
||||
return None
|
||||
from agent.auxiliary_client import get_auxiliary_extra_body, auxiliary_max_tokens_param
|
||||
_extra = get_auxiliary_extra_body()
|
||||
response = await _aux_async_client.chat.completions.create(
|
||||
model=model,
|
||||
messages=[
|
||||
{"role": "system", "content": system_prompt},
|
||||
{"role": "user", "content": user_prompt}
|
||||
],
|
||||
"temperature": 0.1,
|
||||
"max_tokens": max_tokens,
|
||||
}
|
||||
if model:
|
||||
call_kwargs["model"] = model
|
||||
response = await async_call_llm(**call_kwargs)
|
||||
temperature=0.1,
|
||||
**auxiliary_max_tokens_param(max_tokens),
|
||||
**({} if not _extra else {"extra_body": _extra}),
|
||||
)
|
||||
return response.choices[0].message.content.strip()
|
||||
except RuntimeError:
|
||||
logger.warning("No auxiliary model available for web content processing")
|
||||
return None
|
||||
except Exception as api_error:
|
||||
last_error = api_error
|
||||
if attempt < max_retries - 1:
|
||||
@@ -360,18 +368,25 @@ Synthesize these into ONE cohesive, comprehensive summary that:
|
||||
Create a single, unified markdown summary."""
|
||||
|
||||
try:
|
||||
call_kwargs = {
|
||||
"task": "web_extract",
|
||||
"messages": [
|
||||
if _aux_async_client is None:
|
||||
logger.warning("No auxiliary model for synthesis, concatenating summaries")
|
||||
fallback = "\n\n".join(summaries)
|
||||
if len(fallback) > max_output_size:
|
||||
fallback = fallback[:max_output_size] + "\n\n[... truncated ...]"
|
||||
return fallback
|
||||
|
||||
from agent.auxiliary_client import get_auxiliary_extra_body, auxiliary_max_tokens_param
|
||||
_extra = get_auxiliary_extra_body()
|
||||
response = await _aux_async_client.chat.completions.create(
|
||||
model=model,
|
||||
messages=[
|
||||
{"role": "system", "content": "You synthesize multiple summaries into one cohesive, comprehensive summary. Be thorough but concise."},
|
||||
{"role": "user", "content": synthesis_prompt}
|
||||
],
|
||||
"temperature": 0.1,
|
||||
"max_tokens": 20000,
|
||||
}
|
||||
if model:
|
||||
call_kwargs["model"] = model
|
||||
response = await async_call_llm(**call_kwargs)
|
||||
temperature=0.1,
|
||||
**auxiliary_max_tokens_param(20000),
|
||||
**({} if not _extra else {"extra_body": _extra}),
|
||||
)
|
||||
final_summary = response.choices[0].message.content.strip()
|
||||
|
||||
# Enforce hard cap
|
||||
@@ -698,8 +713,8 @@ async def web_extract_tool(
|
||||
debug_call_data["pages_extracted"] = pages_extracted
|
||||
debug_call_data["original_response_size"] = len(json.dumps(response))
|
||||
|
||||
# Process each result with LLM if enabled
|
||||
if use_llm_processing:
|
||||
# Process each result with LLM if enabled and auxiliary client is available
|
||||
if use_llm_processing and _aux_async_client is not None:
|
||||
logger.info("Processing extracted content with LLM (parallel)...")
|
||||
debug_call_data["processing_applied"].append("llm_processing")
|
||||
|
||||
@@ -765,6 +780,10 @@ async def web_extract_tool(
|
||||
else:
|
||||
logger.warning("%s (no content to process)", url)
|
||||
else:
|
||||
if use_llm_processing and _aux_async_client is None:
|
||||
logger.warning("LLM processing requested but no auxiliary model available, returning raw content")
|
||||
debug_call_data["processing_applied"].append("llm_processing_unavailable")
|
||||
|
||||
# Print summary of extracted pages for debugging (original behavior)
|
||||
for result in response.get('results', []):
|
||||
url = result.get('url', 'Unknown URL')
|
||||
@@ -994,8 +1013,8 @@ async def web_crawl_tool(
|
||||
debug_call_data["pages_crawled"] = pages_crawled
|
||||
debug_call_data["original_response_size"] = len(json.dumps(response))
|
||||
|
||||
# Process each result with LLM if enabled
|
||||
if use_llm_processing:
|
||||
# Process each result with LLM if enabled and auxiliary client is available
|
||||
if use_llm_processing and _aux_async_client is not None:
|
||||
logger.info("Processing crawled content with LLM (parallel)...")
|
||||
debug_call_data["processing_applied"].append("llm_processing")
|
||||
|
||||
@@ -1061,6 +1080,10 @@ async def web_crawl_tool(
|
||||
else:
|
||||
logger.warning("%s (no content to process)", page_url)
|
||||
else:
|
||||
if use_llm_processing and _aux_async_client is None:
|
||||
logger.warning("LLM processing requested but no auxiliary model available, returning raw content")
|
||||
debug_call_data["processing_applied"].append("llm_processing_unavailable")
|
||||
|
||||
# Print summary of crawled pages for debugging (original behavior)
|
||||
for result in response.get('results', []):
|
||||
page_url = result.get('url', 'Unknown URL')
|
||||
@@ -1115,15 +1138,7 @@ def check_firecrawl_api_key() -> bool:
|
||||
|
||||
def check_auxiliary_model() -> bool:
|
||||
"""Check if an auxiliary text model is available for LLM content processing."""
|
||||
try:
|
||||
from agent.auxiliary_client import resolve_provider_client
|
||||
for p in ("openrouter", "nous", "custom", "codex"):
|
||||
client, _ = resolve_provider_client(p)
|
||||
if client is not None:
|
||||
return True
|
||||
return False
|
||||
except Exception:
|
||||
return False
|
||||
return _aux_async_client is not None
|
||||
|
||||
|
||||
def get_debug_session_info() -> Dict[str, Any]:
|
||||
|
||||
+43
-90
@@ -344,65 +344,38 @@ class TrajectoryCompressor:
|
||||
raise RuntimeError(f"Failed to load tokenizer '{self.config.tokenizer_name}': {e}")
|
||||
|
||||
def _init_summarizer(self):
|
||||
"""Initialize LLM routing for summarization (sync and async).
|
||||
|
||||
Uses call_llm/async_call_llm from the centralized provider router
|
||||
which handles auth, headers, and provider detection internally.
|
||||
For custom endpoints, falls back to raw client construction.
|
||||
"""
|
||||
from agent.auxiliary_client import call_llm, async_call_llm
|
||||
|
||||
provider = self._detect_provider()
|
||||
if provider:
|
||||
# Store provider for use in _generate_summary calls
|
||||
self._llm_provider = provider
|
||||
self._use_call_llm = True
|
||||
# Verify the provider is available
|
||||
from agent.auxiliary_client import resolve_provider_client
|
||||
client, _ = resolve_provider_client(
|
||||
provider, model=self.config.summarization_model)
|
||||
if client is None:
|
||||
raise RuntimeError(
|
||||
f"Provider '{provider}' is not configured. "
|
||||
f"Check your API key or run: hermes setup")
|
||||
self.client = None # Not used directly
|
||||
self.async_client = None # Not used directly
|
||||
else:
|
||||
# Custom endpoint — use config's raw base_url + api_key_env
|
||||
self._use_call_llm = False
|
||||
api_key = os.getenv(self.config.api_key_env)
|
||||
if not api_key:
|
||||
raise RuntimeError(
|
||||
f"Missing API key. Set {self.config.api_key_env} "
|
||||
f"environment variable.")
|
||||
from openai import OpenAI, AsyncOpenAI
|
||||
self.client = OpenAI(
|
||||
api_key=api_key, base_url=self.config.base_url)
|
||||
self.async_client = AsyncOpenAI(
|
||||
api_key=api_key, base_url=self.config.base_url)
|
||||
|
||||
print(f"✅ Initialized summarizer client: {self.config.summarization_model}")
|
||||
"""Initialize OpenRouter client for summarization (sync and async)."""
|
||||
api_key = os.getenv(self.config.api_key_env)
|
||||
if not api_key:
|
||||
raise RuntimeError(f"Missing API key. Set {self.config.api_key_env} environment variable.")
|
||||
|
||||
from openai import OpenAI, AsyncOpenAI
|
||||
|
||||
# OpenRouter app attribution headers (only for OpenRouter endpoints)
|
||||
extra = {}
|
||||
if "openrouter" in self.config.base_url.lower():
|
||||
extra["default_headers"] = {
|
||||
"HTTP-Referer": "https://github.com/NousResearch/hermes-agent",
|
||||
"X-OpenRouter-Title": "Hermes Agent",
|
||||
"X-OpenRouter-Categories": "productivity,cli-agent",
|
||||
}
|
||||
|
||||
# Sync client (for backwards compatibility)
|
||||
self.client = OpenAI(
|
||||
api_key=api_key,
|
||||
base_url=self.config.base_url,
|
||||
**extra,
|
||||
)
|
||||
|
||||
# Async client for parallel processing
|
||||
self.async_client = AsyncOpenAI(
|
||||
api_key=api_key,
|
||||
base_url=self.config.base_url,
|
||||
**extra,
|
||||
)
|
||||
|
||||
print(f"✅ Initialized OpenRouter client: {self.config.summarization_model}")
|
||||
print(f" Max concurrent requests: {self.config.max_concurrent_requests}")
|
||||
|
||||
def _detect_provider(self) -> str:
|
||||
"""Detect the provider name from the configured base_url."""
|
||||
url = self.config.base_url.lower()
|
||||
if "openrouter" in url:
|
||||
return "openrouter"
|
||||
if "nousresearch.com" in url:
|
||||
return "nous"
|
||||
if "chatgpt.com/backend-api/codex" in url:
|
||||
return "codex"
|
||||
if "api.z.ai" in url:
|
||||
return "zai"
|
||||
if "moonshot.ai" in url or "api.kimi.com" in url:
|
||||
return "kimi-coding"
|
||||
if "minimaxi.com" in url:
|
||||
return "minimax-cn"
|
||||
if "minimax.io" in url:
|
||||
return "minimax"
|
||||
# Unknown base_url — not a known provider
|
||||
return ""
|
||||
|
||||
def count_tokens(self, text: str) -> int:
|
||||
"""Count tokens in text using the configured tokenizer."""
|
||||
@@ -528,22 +501,12 @@ Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
|
||||
try:
|
||||
metrics.summarization_api_calls += 1
|
||||
|
||||
if getattr(self, '_use_call_llm', False):
|
||||
from agent.auxiliary_client import call_llm
|
||||
response = call_llm(
|
||||
provider=self._llm_provider,
|
||||
model=self.config.summarization_model,
|
||||
messages=[{"role": "user", "content": prompt}],
|
||||
temperature=self.config.temperature,
|
||||
max_tokens=self.config.summary_target_tokens * 2,
|
||||
)
|
||||
else:
|
||||
response = self.client.chat.completions.create(
|
||||
model=self.config.summarization_model,
|
||||
messages=[{"role": "user", "content": prompt}],
|
||||
temperature=self.config.temperature,
|
||||
max_tokens=self.config.summary_target_tokens * 2,
|
||||
)
|
||||
response = self.client.chat.completions.create(
|
||||
model=self.config.summarization_model,
|
||||
messages=[{"role": "user", "content": prompt}],
|
||||
temperature=self.config.temperature,
|
||||
max_tokens=self.config.summary_target_tokens * 2,
|
||||
)
|
||||
|
||||
summary = response.choices[0].message.content.strip()
|
||||
|
||||
@@ -595,22 +558,12 @@ Write only the summary, starting with "[CONTEXT SUMMARY]:" prefix."""
|
||||
try:
|
||||
metrics.summarization_api_calls += 1
|
||||
|
||||
if getattr(self, '_use_call_llm', False):
|
||||
from agent.auxiliary_client import async_call_llm
|
||||
response = await async_call_llm(
|
||||
provider=self._llm_provider,
|
||||
model=self.config.summarization_model,
|
||||
messages=[{"role": "user", "content": prompt}],
|
||||
temperature=self.config.temperature,
|
||||
max_tokens=self.config.summary_target_tokens * 2,
|
||||
)
|
||||
else:
|
||||
response = await self.async_client.chat.completions.create(
|
||||
model=self.config.summarization_model,
|
||||
messages=[{"role": "user", "content": prompt}],
|
||||
temperature=self.config.temperature,
|
||||
max_tokens=self.config.summary_target_tokens * 2,
|
||||
)
|
||||
response = await self.async_client.chat.completions.create(
|
||||
model=self.config.summarization_model,
|
||||
messages=[{"role": "user", "content": prompt}],
|
||||
temperature=self.config.temperature,
|
||||
max_tokens=self.config.summary_target_tokens * 2,
|
||||
)
|
||||
|
||||
summary = response.choices[0].message.content.strip()
|
||||
|
||||
|
||||
@@ -55,8 +55,6 @@ metadata:
|
||||
hermes:
|
||||
tags: [python, automation]
|
||||
category: devops
|
||||
fallback_for_toolsets: [web] # Optional — conditional activation (see below)
|
||||
requires_toolsets: [terminal] # Optional — conditional activation (see below)
|
||||
---
|
||||
|
||||
# Skill Title
|
||||
@@ -92,30 +90,6 @@ platforms: [macos, linux] # macOS and Linux
|
||||
|
||||
When set, the skill is automatically hidden from the system prompt, `skills_list()`, and slash commands on incompatible platforms. If omitted, the skill loads on all platforms.
|
||||
|
||||
### Conditional Activation (Fallback Skills)
|
||||
|
||||
Skills can automatically show or hide themselves based on which tools are available in the current session. This is most useful for **fallback skills** — free or local alternatives that should only appear when a premium tool is unavailable.
|
||||
|
||||
```yaml
|
||||
metadata:
|
||||
hermes:
|
||||
fallback_for_toolsets: [web] # Show ONLY when these toolsets are unavailable
|
||||
requires_toolsets: [terminal] # Show ONLY when these toolsets are available
|
||||
fallback_for_tools: [web_search] # Show ONLY when these specific tools are unavailable
|
||||
requires_tools: [terminal] # Show ONLY when these specific tools are available
|
||||
```
|
||||
|
||||
| Field | Behavior |
|
||||
|-------|----------|
|
||||
| `fallback_for_toolsets` | Skill is **hidden** when the listed toolsets are available. Shown when they're missing. |
|
||||
| `fallback_for_tools` | Same, but checks individual tools instead of toolsets. |
|
||||
| `requires_toolsets` | Skill is **hidden** when the listed toolsets are unavailable. Shown when they're present. |
|
||||
| `requires_tools` | Same, but checks individual tools. |
|
||||
|
||||
**Example:** The built-in `duckduckgo-search` skill uses `fallback_for_toolsets: [web]`. When you have `FIRECRAWL_API_KEY` set, the web toolset is available and the agent uses `web_search` — the DuckDuckGo skill stays hidden. If the API key is missing, the web toolset is unavailable and the DuckDuckGo skill automatically appears as a fallback.
|
||||
|
||||
Skills without any conditional fields behave exactly as before — they're always shown.
|
||||
|
||||
## Skill Directory Structure
|
||||
|
||||
```
|
||||
|
||||
Reference in New Issue
Block a user