Compare commits
54 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| 5a8265cb78 | |||
| ff6a86cb52 | |||
| 86960cdbb0 | |||
| 8b0afa0e57 | |||
| ab21fbfd89 | |||
| bdc72ec355 | |||
| c8a5e36be8 | |||
| 1368caf66f | |||
| 30ea423ce8 | |||
| 19b0ddce40 | |||
| 383db35925 | |||
| 55ac056920 | |||
| 085c1c6875 | |||
| a18e5b95ad | |||
| 3696c74bfb | |||
| bbcff8dcd0 | |||
| 77c5bc9da9 | |||
| 65e24c942e | |||
| 22d1bda185 | |||
| ab271ebe10 | |||
| e1befe5077 | |||
| fff237e111 | |||
| 598c25d43e | |||
| 5c03f2e7cc | |||
| 8d7a98d2ff | |||
| 7fe6782a25 | |||
| b9a5e6e247 | |||
| 363c5bc3c3 | |||
| 2ad7694874 | |||
| cbf1f15cfe | |||
| 9692b3c28a | |||
| f3c59321af | |||
| 6e02fa73c2 | |||
| 25080986a0 | |||
| c3158d38b2 | |||
| 50d1518df6 | |||
| 1d5a69a445 | |||
| 786038443e | |||
| 7ec838507a | |||
| efbe8d674a | |||
| a6547f399f | |||
| 52b3a3ca3a | |||
| 74b0072f8f | |||
| f6d4b6a319 | |||
| 37bf19a29d | |||
| 469cd16fe0 | |||
| b1a66d55b4 | |||
| 0d41fb0827 | |||
| 4aef055805 | |||
| f3006ebef9 | |||
| 99ff375f7a | |||
| 125e5ef089 | |||
| 4a630c2071 | |||
| 7b18eeee9b |
@@ -19,6 +19,9 @@ jobs:
|
||||
- name: Checkout code
|
||||
uses: actions/checkout@v4
|
||||
|
||||
- name: Install system dependencies
|
||||
run: sudo apt-get update && sudo apt-get install -y ripgrep
|
||||
|
||||
- name: Install uv
|
||||
uses: astral-sh/setup-uv@v5
|
||||
|
||||
|
||||
@@ -0,0 +1,346 @@
|
||||
# Hermes Agent v0.8.0 (v2026.4.8)
|
||||
|
||||
**Release Date:** April 8, 2026
|
||||
|
||||
> The intelligence release — background task auto-notifications, free MiMo v2 Pro on Nous Portal, live model switching across all platforms, self-optimized GPT/Codex guidance, native Google AI Studio, smart inactivity timeouts, approval buttons, MCP OAuth 2.1, and 209 merged PRs with 82 resolved issues.
|
||||
|
||||
---
|
||||
|
||||
## ✨ Highlights
|
||||
|
||||
- **Background Process Auto-Notifications (`notify_on_complete`)** — Background tasks can now automatically notify the agent when they finish. Start a long-running process (AI model training, test suites, deployments, builds) and the agent gets notified on completion — no polling needed. The agent can keep working on other things and pick up results when they land. ([#5779](https://github.com/NousResearch/hermes-agent/pull/5779))
|
||||
|
||||
- **Free Xiaomi MiMo v2 Pro on Nous Portal** — Nous Portal now supports the free-tier Xiaomi MiMo v2 Pro model for auxiliary tasks (compression, vision, summarization), with free-tier model gating and pricing display in model selection. ([#6018](https://github.com/NousResearch/hermes-agent/pull/6018), [#5880](https://github.com/NousResearch/hermes-agent/pull/5880))
|
||||
|
||||
- **Live Model Switching (`/model` Command)** — Switch models and providers mid-session from CLI, Telegram, Discord, Slack, or any gateway platform. Aggregator-aware resolution keeps you on OpenRouter/Nous when possible, with automatic cross-provider fallback when needed. Interactive model pickers on Telegram and Discord with inline buttons. ([#5181](https://github.com/NousResearch/hermes-agent/pull/5181), [#5742](https://github.com/NousResearch/hermes-agent/pull/5742))
|
||||
|
||||
- **Self-Optimized GPT/Codex Tool-Use Guidance** — The agent diagnosed and patched 5 failure modes in GPT and Codex tool calling through automated behavioral benchmarking, dramatically improving reliability on OpenAI models. Includes execution discipline guidance and thinking-only prefill continuation for structured reasoning. ([#6120](https://github.com/NousResearch/hermes-agent/pull/6120), [#5414](https://github.com/NousResearch/hermes-agent/pull/5414), [#5931](https://github.com/NousResearch/hermes-agent/pull/5931))
|
||||
|
||||
- **Google AI Studio (Gemini) Native Provider** — Direct access to Gemini models through Google's AI Studio API. Includes automatic models.dev registry integration for real-time context length detection across any provider. ([#5577](https://github.com/NousResearch/hermes-agent/pull/5577))
|
||||
|
||||
- **Inactivity-Based Agent Timeouts** — Gateway and cron timeouts now track actual tool activity instead of wall-clock time. Long-running tasks that are actively working will never be killed — only truly idle agents time out. ([#5389](https://github.com/NousResearch/hermes-agent/pull/5389), [#5440](https://github.com/NousResearch/hermes-agent/pull/5440))
|
||||
|
||||
- **Approval Buttons on Slack & Telegram** — Dangerous command approval via native platform buttons instead of typing `/approve`. Slack gets thread context preservation; Telegram gets emoji reactions for approval status. ([#5890](https://github.com/NousResearch/hermes-agent/pull/5890), [#5975](https://github.com/NousResearch/hermes-agent/pull/5975))
|
||||
|
||||
- **MCP OAuth 2.1 PKCE + OSV Malware Scanning** — Full standards-compliant OAuth for MCP server authentication, plus automatic malware scanning of MCP extension packages via the OSV vulnerability database. ([#5420](https://github.com/NousResearch/hermes-agent/pull/5420), [#5305](https://github.com/NousResearch/hermes-agent/pull/5305))
|
||||
|
||||
- **Centralized Logging & Config Validation** — Structured logging to `~/.hermes/logs/` (agent.log + errors.log) with the `hermes logs` command for tailing and filtering. Config structure validation catches malformed YAML at startup before it causes cryptic failures. ([#5430](https://github.com/NousResearch/hermes-agent/pull/5430), [#5426](https://github.com/NousResearch/hermes-agent/pull/5426))
|
||||
|
||||
- **Plugin System Expansion** — Plugins can now register CLI subcommands, receive request-scoped API hooks with correlation IDs, prompt for required env vars during install, and hook into session lifecycle events (finalize/reset). ([#5295](https://github.com/NousResearch/hermes-agent/pull/5295), [#5427](https://github.com/NousResearch/hermes-agent/pull/5427), [#5470](https://github.com/NousResearch/hermes-agent/pull/5470), [#6129](https://github.com/NousResearch/hermes-agent/pull/6129))
|
||||
|
||||
- **Matrix Tier 1 & Platform Hardening** — Matrix gets reactions, read receipts, rich formatting, and room management. Discord adds channel controls and ignored channels. Signal gets full MEDIA: tag delivery. Mattermost gets file attachments. Comprehensive reliability fixes across all platforms. ([#5275](https://github.com/NousResearch/hermes-agent/pull/5275), [#5975](https://github.com/NousResearch/hermes-agent/pull/5975), [#5602](https://github.com/NousResearch/hermes-agent/pull/5602))
|
||||
|
||||
- **Security Hardening Pass** — Consolidated SSRF protections, timing attack mitigations, tar traversal prevention, credential leakage guards, cron path traversal hardening, and cross-session isolation. Terminal workdir sanitization across all backends. ([#5944](https://github.com/NousResearch/hermes-agent/pull/5944), [#5613](https://github.com/NousResearch/hermes-agent/pull/5613), [#5629](https://github.com/NousResearch/hermes-agent/pull/5629))
|
||||
|
||||
---
|
||||
|
||||
## 🏗️ Core Agent & Architecture
|
||||
|
||||
### Provider & Model Support
|
||||
- **Native Google AI Studio (Gemini) provider** with models.dev integration for automatic context length detection ([#5577](https://github.com/NousResearch/hermes-agent/pull/5577))
|
||||
- **`/model` command — full provider+model system overhaul** — live switching across CLI and all gateway platforms with aggregator-aware resolution ([#5181](https://github.com/NousResearch/hermes-agent/pull/5181))
|
||||
- **Interactive model picker for Telegram and Discord** — inline button-based model selection ([#5742](https://github.com/NousResearch/hermes-agent/pull/5742))
|
||||
- **Nous Portal free-tier model gating** with pricing display in model selection ([#5880](https://github.com/NousResearch/hermes-agent/pull/5880))
|
||||
- **Model pricing display** for OpenRouter and Nous Portal providers ([#5416](https://github.com/NousResearch/hermes-agent/pull/5416))
|
||||
- **xAI (Grok) prompt caching** via `x-grok-conv-id` header ([#5604](https://github.com/NousResearch/hermes-agent/pull/5604))
|
||||
- **Grok added to tool-use enforcement models** for direct xAI usage ([#5595](https://github.com/NousResearch/hermes-agent/pull/5595))
|
||||
- **MiniMax TTS provider** (speech-2.8) ([#4963](https://github.com/NousResearch/hermes-agent/pull/4963))
|
||||
- **Non-agentic model warning** — warns users when loading Hermes LLM models not designed for tool use ([#5378](https://github.com/NousResearch/hermes-agent/pull/5378))
|
||||
- **Ollama Cloud auth, /model switch persistence**, and alias tab completion ([#5269](https://github.com/NousResearch/hermes-agent/pull/5269))
|
||||
- **Preserve dots in OpenCode Go model names** (minimax-m2.7, glm-4.5, kimi-k2.5) ([#5597](https://github.com/NousResearch/hermes-agent/pull/5597))
|
||||
- **MiniMax models 404 fix** — strip /v1 from Anthropic base URL for OpenCode Go ([#4918](https://github.com/NousResearch/hermes-agent/pull/4918))
|
||||
- **Provider credential reset windows** honored in pooled failover ([#5188](https://github.com/NousResearch/hermes-agent/pull/5188))
|
||||
- **OAuth token sync** between credential pool and credentials file ([#4981](https://github.com/NousResearch/hermes-agent/pull/4981))
|
||||
- **Stale OAuth credentials** no longer block OpenRouter users on auto-detect ([#5746](https://github.com/NousResearch/hermes-agent/pull/5746))
|
||||
- **Codex OAuth credential pool disconnect** + expired token import fix ([#5681](https://github.com/NousResearch/hermes-agent/pull/5681))
|
||||
- **Codex pool entry sync** from `~/.codex/auth.json` on exhaustion — @GratefulDave ([#5610](https://github.com/NousResearch/hermes-agent/pull/5610))
|
||||
- **Auxiliary client payment fallback** — retry with next provider on 402 ([#5599](https://github.com/NousResearch/hermes-agent/pull/5599))
|
||||
- **Auxiliary client resolves named custom providers** and 'main' alias ([#5978](https://github.com/NousResearch/hermes-agent/pull/5978))
|
||||
- **Use mimo-v2-pro** for non-vision auxiliary tasks on Nous free tier ([#6018](https://github.com/NousResearch/hermes-agent/pull/6018))
|
||||
- **Vision auto-detection** tries main provider first ([#6041](https://github.com/NousResearch/hermes-agent/pull/6041))
|
||||
- **Provider re-ordering and Quick Install** — @austinpickett ([#4664](https://github.com/NousResearch/hermes-agent/pull/4664))
|
||||
- **Nous OAuth access_token** no longer used as inference API key — @SHL0MS ([#5564](https://github.com/NousResearch/hermes-agent/pull/5564))
|
||||
- **HERMES_PORTAL_BASE_URL env var** respected during Nous login — @benbarclay ([#5745](https://github.com/NousResearch/hermes-agent/pull/5745))
|
||||
- **Env var overrides** for Nous portal/inference URLs ([#5419](https://github.com/NousResearch/hermes-agent/pull/5419))
|
||||
- **Z.AI endpoint auto-detect** via probe and cache ([#5763](https://github.com/NousResearch/hermes-agent/pull/5763))
|
||||
- **MiniMax context lengths, model catalog, thinking guard, aux model, and config base_url** corrections ([#6082](https://github.com/NousResearch/hermes-agent/pull/6082))
|
||||
- **Community provider/model resolution fixes** — salvaged 4 community PRs + MiniMax aux URL ([#5983](https://github.com/NousResearch/hermes-agent/pull/5983))
|
||||
|
||||
### Agent Loop & Conversation
|
||||
- **Self-optimized GPT/Codex tool-use guidance** via automated behavioral benchmarking — agent self-diagnosed and patched 5 failure modes ([#6120](https://github.com/NousResearch/hermes-agent/pull/6120))
|
||||
- **GPT/Codex execution discipline guidance** in system prompts ([#5414](https://github.com/NousResearch/hermes-agent/pull/5414))
|
||||
- **Thinking-only prefill continuation** for structured reasoning responses ([#5931](https://github.com/NousResearch/hermes-agent/pull/5931))
|
||||
- **Accept reasoning-only responses** without retries — set content to "(empty)" instead of infinite retry ([#5278](https://github.com/NousResearch/hermes-agent/pull/5278))
|
||||
- **Jittered retry backoff** — exponential backoff with jitter for API retries ([#6048](https://github.com/NousResearch/hermes-agent/pull/6048))
|
||||
- **Smart thinking block signature management** — preserve and manage Anthropic thinking signatures across turns ([#6112](https://github.com/NousResearch/hermes-agent/pull/6112))
|
||||
- **Coerce tool call arguments** to match JSON Schema types — fixes models that send strings instead of numbers/booleans ([#5265](https://github.com/NousResearch/hermes-agent/pull/5265))
|
||||
- **Save oversized tool results to file** instead of destructive truncation ([#5210](https://github.com/NousResearch/hermes-agent/pull/5210))
|
||||
- **Sandbox-aware tool result persistence** ([#6085](https://github.com/NousResearch/hermes-agent/pull/6085))
|
||||
- **Streaming fallback** improved after edit failures ([#6110](https://github.com/NousResearch/hermes-agent/pull/6110))
|
||||
- **Codex empty-output gaps** covered in fallback + normalizer + auxiliary client ([#5724](https://github.com/NousResearch/hermes-agent/pull/5724), [#5730](https://github.com/NousResearch/hermes-agent/pull/5730), [#5734](https://github.com/NousResearch/hermes-agent/pull/5734))
|
||||
- **Codex stream output backfill** from output_item.done events ([#5689](https://github.com/NousResearch/hermes-agent/pull/5689))
|
||||
- **Stream consumer creates new message** after tool boundaries ([#5739](https://github.com/NousResearch/hermes-agent/pull/5739))
|
||||
- **Codex validation aligned** with normalization for empty stream output ([#5940](https://github.com/NousResearch/hermes-agent/pull/5940))
|
||||
- **Bridge tool-calls** in copilot-acp adapter ([#5460](https://github.com/NousResearch/hermes-agent/pull/5460))
|
||||
- **Filter transcript-only roles** from chat-completions payload ([#4880](https://github.com/NousResearch/hermes-agent/pull/4880))
|
||||
- **Context compaction failures fixed** on temperature-restricted models — @MadKangYu ([#5608](https://github.com/NousResearch/hermes-agent/pull/5608))
|
||||
- **Sanitize tool_calls for all strict APIs** (Fireworks, Mistral, etc.) — @lumethegreat ([#5183](https://github.com/NousResearch/hermes-agent/pull/5183))
|
||||
|
||||
### Memory & Sessions
|
||||
- **Supermemory memory provider** — new memory plugin with multi-container, search_mode, identity template, and env var override ([#5737](https://github.com/NousResearch/hermes-agent/pull/5737), [#5933](https://github.com/NousResearch/hermes-agent/pull/5933))
|
||||
- **Shared thread sessions** by default — multi-user thread support across gateway platforms ([#5391](https://github.com/NousResearch/hermes-agent/pull/5391))
|
||||
- **Subagent sessions linked to parent** and hidden from session list ([#5309](https://github.com/NousResearch/hermes-agent/pull/5309))
|
||||
- **Profile-scoped memory isolation** and clone support ([#4845](https://github.com/NousResearch/hermes-agent/pull/4845))
|
||||
- **Thread gateway user_id to memory plugins** for per-user scoping ([#5895](https://github.com/NousResearch/hermes-agent/pull/5895))
|
||||
- **Honcho plugin drift overhaul** + plugin CLI registration system ([#5295](https://github.com/NousResearch/hermes-agent/pull/5295))
|
||||
- **Honcho holographic prompt and trust score** rendering preserved ([#4872](https://github.com/NousResearch/hermes-agent/pull/4872))
|
||||
- **Honcho doctor fix** — use recall_mode instead of memory_mode — @techguysimon ([#5645](https://github.com/NousResearch/hermes-agent/pull/5645))
|
||||
- **RetainDB** — API routes, write queue, dialectic, agent model, file tools fixes ([#5461](https://github.com/NousResearch/hermes-agent/pull/5461))
|
||||
- **Hindsight memory plugin overhaul** + memory setup wizard fixes ([#5094](https://github.com/NousResearch/hermes-agent/pull/5094))
|
||||
- **mem0 API v2 compat**, prefetch context fencing, secret redaction ([#5423](https://github.com/NousResearch/hermes-agent/pull/5423))
|
||||
- **mem0 env vars merged** with mem0.json instead of either/or ([#4939](https://github.com/NousResearch/hermes-agent/pull/4939))
|
||||
- **Clean user message** used for all memory provider operations ([#4940](https://github.com/NousResearch/hermes-agent/pull/4940))
|
||||
- **Silent memory flush failure** on /new and /resume fixed — @ryanautomated ([#5640](https://github.com/NousResearch/hermes-agent/pull/5640))
|
||||
- **OpenViking atexit safety net** for session commit ([#5664](https://github.com/NousResearch/hermes-agent/pull/5664))
|
||||
- **OpenViking tenant-scoping headers** for multi-tenant servers ([#4936](https://github.com/NousResearch/hermes-agent/pull/4936))
|
||||
- **ByteRover brv query** runs synchronously before LLM call ([#4831](https://github.com/NousResearch/hermes-agent/pull/4831))
|
||||
|
||||
---
|
||||
|
||||
## 📱 Messaging Platforms (Gateway)
|
||||
|
||||
### Gateway Core
|
||||
- **Inactivity-based agent timeout** — replaces wall-clock timeout with smart activity tracking; long-running active tasks never killed ([#5389](https://github.com/NousResearch/hermes-agent/pull/5389))
|
||||
- **Approval buttons for Slack & Telegram** + Slack thread context preservation ([#5890](https://github.com/NousResearch/hermes-agent/pull/5890))
|
||||
- **Live-stream /update output** + forward interactive prompts to user ([#5180](https://github.com/NousResearch/hermes-agent/pull/5180))
|
||||
- **Infinite timeout support** + periodic notifications + actionable error messages ([#4959](https://github.com/NousResearch/hermes-agent/pull/4959))
|
||||
- **Duplicate message prevention** — gateway dedup + partial stream guard ([#4878](https://github.com/NousResearch/hermes-agent/pull/4878))
|
||||
- **Webhook delivery_info persistence** + full session id in /status ([#5942](https://github.com/NousResearch/hermes-agent/pull/5942))
|
||||
- **Tool preview truncation** respects tool_preview_length in all/new progress modes ([#5937](https://github.com/NousResearch/hermes-agent/pull/5937))
|
||||
- **Short preview truncation** restored for all/new tool progress modes ([#4935](https://github.com/NousResearch/hermes-agent/pull/4935))
|
||||
- **Update-pending state** written atomically to prevent corruption ([#4923](https://github.com/NousResearch/hermes-agent/pull/4923))
|
||||
- **Approval session key isolated** per turn ([#4884](https://github.com/NousResearch/hermes-agent/pull/4884))
|
||||
- **Active-session guard bypass** for /approve, /deny, /stop, /new ([#4926](https://github.com/NousResearch/hermes-agent/pull/4926), [#5765](https://github.com/NousResearch/hermes-agent/pull/5765))
|
||||
- **Typing indicator paused** during approval waits ([#5893](https://github.com/NousResearch/hermes-agent/pull/5893))
|
||||
- **Caption check** uses exact line-by-line match instead of substring (all platforms) ([#5939](https://github.com/NousResearch/hermes-agent/pull/5939))
|
||||
- **MEDIA: tags stripped** from streamed gateway messages ([#5152](https://github.com/NousResearch/hermes-agent/pull/5152))
|
||||
- **MEDIA: tags extracted** from cron delivery before sending ([#5598](https://github.com/NousResearch/hermes-agent/pull/5598))
|
||||
- **Profile-aware service units** + voice transcription cleanup ([#5972](https://github.com/NousResearch/hermes-agent/pull/5972))
|
||||
- **Thread-safe PairingStore** with atomic writes — @CharlieKerfoot ([#5656](https://github.com/NousResearch/hermes-agent/pull/5656))
|
||||
- **Sanitize media URLs** in base platform logs — @WAXLYY ([#5631](https://github.com/NousResearch/hermes-agent/pull/5631))
|
||||
- **Reduce Telegram fallback IP activation log noise** — @MadKangYu ([#5615](https://github.com/NousResearch/hermes-agent/pull/5615))
|
||||
- **Cron static method wrappers** to prevent self-binding ([#5299](https://github.com/NousResearch/hermes-agent/pull/5299))
|
||||
- **Stale 'hermes login' replaced** with 'hermes auth' + credential removal re-seeding fix ([#5670](https://github.com/NousResearch/hermes-agent/pull/5670))
|
||||
|
||||
### Telegram
|
||||
- **Group topics skill binding** for supergroup forum topics ([#4886](https://github.com/NousResearch/hermes-agent/pull/4886))
|
||||
- **Emoji reactions** for approval status and notifications ([#5975](https://github.com/NousResearch/hermes-agent/pull/5975))
|
||||
- **Duplicate message delivery prevented** on send timeout ([#5153](https://github.com/NousResearch/hermes-agent/pull/5153))
|
||||
- **Command names sanitized** to strip invalid characters ([#5596](https://github.com/NousResearch/hermes-agent/pull/5596))
|
||||
- **Per-platform disabled skills** respected in Telegram menu and gateway dispatch ([#4799](https://github.com/NousResearch/hermes-agent/pull/4799))
|
||||
- **/approve and /deny** routed through running-agent guard ([#4798](https://github.com/NousResearch/hermes-agent/pull/4798))
|
||||
|
||||
### Discord
|
||||
- **Channel controls** — ignored_channels and no_thread_channels config options ([#5975](https://github.com/NousResearch/hermes-agent/pull/5975))
|
||||
- **Skills registered as native slash commands** via shared gateway logic ([#5603](https://github.com/NousResearch/hermes-agent/pull/5603))
|
||||
- **/approve, /deny, /queue, /background, /btw** registered as native slash commands ([#4800](https://github.com/NousResearch/hermes-agent/pull/4800), [#5477](https://github.com/NousResearch/hermes-agent/pull/5477))
|
||||
- **Unnecessary members intent** removed on startup + token lock leak fix ([#5302](https://github.com/NousResearch/hermes-agent/pull/5302))
|
||||
|
||||
### Slack
|
||||
- **Thread engagement** — auto-respond in bot-started and mentioned threads ([#5897](https://github.com/NousResearch/hermes-agent/pull/5897))
|
||||
- **mrkdwn in edit_message** + thread replies without @mentions ([#5733](https://github.com/NousResearch/hermes-agent/pull/5733))
|
||||
|
||||
### Matrix
|
||||
- **Tier 1 feature parity** — reactions, read receipts, rich formatting, room management ([#5275](https://github.com/NousResearch/hermes-agent/pull/5275))
|
||||
- **MATRIX_REQUIRE_MENTION and MATRIX_AUTO_THREAD** support ([#5106](https://github.com/NousResearch/hermes-agent/pull/5106))
|
||||
- **Comprehensive reliability** — encrypted media, auth recovery, cron E2EE, Synapse compat ([#5271](https://github.com/NousResearch/hermes-agent/pull/5271))
|
||||
- **CJK input, E2EE, and reconnect** fixes ([#5665](https://github.com/NousResearch/hermes-agent/pull/5665))
|
||||
|
||||
### Signal
|
||||
- **Full MEDIA: tag delivery** — send_image_file, send_voice, and send_video implemented ([#5602](https://github.com/NousResearch/hermes-agent/pull/5602))
|
||||
|
||||
### Mattermost
|
||||
- **File attachments** — set message type to DOCUMENT when post has file attachments — @nericervin ([#5609](https://github.com/NousResearch/hermes-agent/pull/5609))
|
||||
|
||||
### Feishu
|
||||
- **Interactive card approval buttons** ([#6043](https://github.com/NousResearch/hermes-agent/pull/6043))
|
||||
- **Reconnect and ACL** fixes ([#5665](https://github.com/NousResearch/hermes-agent/pull/5665))
|
||||
|
||||
### Webhooks
|
||||
- **`{__raw__}` template token** and thread_id passthrough for forum topics ([#5662](https://github.com/NousResearch/hermes-agent/pull/5662))
|
||||
|
||||
---
|
||||
|
||||
## 🖥️ CLI & User Experience
|
||||
|
||||
### Interactive CLI
|
||||
- **Defer response content** until reasoning block completes ([#5773](https://github.com/NousResearch/hermes-agent/pull/5773))
|
||||
- **Ghost status-bar lines cleared** on terminal resize ([#4960](https://github.com/NousResearch/hermes-agent/pull/4960))
|
||||
- **Normalise \r\n and \r line endings** in pasted text ([#4849](https://github.com/NousResearch/hermes-agent/pull/4849))
|
||||
- **ChatConsole errors, curses scroll, skin-aware banner, git state** banner fixes ([#5974](https://github.com/NousResearch/hermes-agent/pull/5974))
|
||||
- **Native Windows image paste** support ([#5917](https://github.com/NousResearch/hermes-agent/pull/5917))
|
||||
- **--yolo and other flags** no longer silently dropped when placed before 'chat' subcommand ([#5145](https://github.com/NousResearch/hermes-agent/pull/5145))
|
||||
|
||||
### Setup & Configuration
|
||||
- **Config structure validation** — detect malformed YAML at startup with actionable error messages ([#5426](https://github.com/NousResearch/hermes-agent/pull/5426))
|
||||
- **Centralized logging** to `~/.hermes/logs/` — agent.log (INFO+), errors.log (WARNING+) with `hermes logs` command ([#5430](https://github.com/NousResearch/hermes-agent/pull/5430))
|
||||
- **Docs links added** to setup wizard sections ([#5283](https://github.com/NousResearch/hermes-agent/pull/5283))
|
||||
- **Doctor diagnostics** — sync provider checks, config migration, WAL and mem0 diagnostics ([#5077](https://github.com/NousResearch/hermes-agent/pull/5077))
|
||||
- **Timeout debug logging** and user-facing diagnostics improved ([#5370](https://github.com/NousResearch/hermes-agent/pull/5370))
|
||||
- **Reasoning effort unified** to config.yaml only ([#6118](https://github.com/NousResearch/hermes-agent/pull/6118))
|
||||
- **Permanent command allowlist** loaded on startup ([#5076](https://github.com/NousResearch/hermes-agent/pull/5076))
|
||||
- **`hermes auth remove`** now clears env-seeded credentials permanently ([#5285](https://github.com/NousResearch/hermes-agent/pull/5285))
|
||||
- **Bundled skills synced to all profiles** during update ([#5795](https://github.com/NousResearch/hermes-agent/pull/5795))
|
||||
- **`hermes update` no longer kills** freshly-restarted gateway service ([#5448](https://github.com/NousResearch/hermes-agent/pull/5448))
|
||||
- **Subprocess.run() timeouts** added to all gateway CLI commands ([#5424](https://github.com/NousResearch/hermes-agent/pull/5424))
|
||||
- **Actionable error message** when Codex refresh token is reused — @tymrtn ([#5612](https://github.com/NousResearch/hermes-agent/pull/5612))
|
||||
- **Google-workspace skill scripts** can now run directly — @xinbenlv ([#5624](https://github.com/NousResearch/hermes-agent/pull/5624))
|
||||
|
||||
### Cron System
|
||||
- **Inactivity-based cron timeout** — replaces wall-clock; active tasks run indefinitely ([#5440](https://github.com/NousResearch/hermes-agent/pull/5440))
|
||||
- **Pre-run script injection** for data collection and change detection ([#5082](https://github.com/NousResearch/hermes-agent/pull/5082))
|
||||
- **Delivery failure tracking** in job status ([#6042](https://github.com/NousResearch/hermes-agent/pull/6042))
|
||||
- **Delivery guidance** in cron prompts — stops send_message thrashing ([#5444](https://github.com/NousResearch/hermes-agent/pull/5444))
|
||||
- **MEDIA files delivered** as native platform attachments ([#5921](https://github.com/NousResearch/hermes-agent/pull/5921))
|
||||
- **[SILENT] suppression** works anywhere in response — @auspic7 ([#5654](https://github.com/NousResearch/hermes-agent/pull/5654))
|
||||
- **Cron path traversal** hardening ([#5147](https://github.com/NousResearch/hermes-agent/pull/5147))
|
||||
|
||||
---
|
||||
|
||||
## 🔧 Tool System
|
||||
|
||||
### Terminal & Execution
|
||||
- **Execute_code on remote backends** — code execution now works on Docker, SSH, Modal, and other remote terminal backends ([#5088](https://github.com/NousResearch/hermes-agent/pull/5088))
|
||||
- **Exit code context** for common CLI tools in terminal results — helps agent understand what went wrong ([#5144](https://github.com/NousResearch/hermes-agent/pull/5144))
|
||||
- **Progressive subdirectory hint discovery** — agent learns project structure as it navigates ([#5291](https://github.com/NousResearch/hermes-agent/pull/5291))
|
||||
- **notify_on_complete for background processes** — get notified when long-running tasks finish ([#5779](https://github.com/NousResearch/hermes-agent/pull/5779))
|
||||
- **Docker env config** — explicit container environment variables via docker_env config ([#4738](https://github.com/NousResearch/hermes-agent/pull/4738))
|
||||
- **Approval metadata included** in terminal tool results ([#5141](https://github.com/NousResearch/hermes-agent/pull/5141))
|
||||
- **Workdir parameter sanitized** in terminal tool across all backends ([#5629](https://github.com/NousResearch/hermes-agent/pull/5629))
|
||||
- **Detached process crash recovery** state corrected ([#6101](https://github.com/NousResearch/hermes-agent/pull/6101))
|
||||
- **Agent-browser paths with spaces** preserved — @Vasanthdev2004 ([#6077](https://github.com/NousResearch/hermes-agent/pull/6077))
|
||||
- **Portable base64 encoding** for image reading on macOS — @CharlieKerfoot ([#5657](https://github.com/NousResearch/hermes-agent/pull/5657))
|
||||
|
||||
### Browser
|
||||
- **Switch managed browser provider** from Browserbase to Browser Use — @benbarclay ([#5750](https://github.com/NousResearch/hermes-agent/pull/5750))
|
||||
- **Firecrawl cloud browser** provider — @alt-glitch ([#5628](https://github.com/NousResearch/hermes-agent/pull/5628))
|
||||
- **JS evaluation** via browser_console expression parameter ([#5303](https://github.com/NousResearch/hermes-agent/pull/5303))
|
||||
- **Windows browser** fixes ([#5665](https://github.com/NousResearch/hermes-agent/pull/5665))
|
||||
|
||||
### MCP
|
||||
- **MCP OAuth 2.1 PKCE** — full standards-compliant OAuth client support ([#5420](https://github.com/NousResearch/hermes-agent/pull/5420))
|
||||
- **OSV malware check** for MCP extension packages ([#5305](https://github.com/NousResearch/hermes-agent/pull/5305))
|
||||
- **Prefer structuredContent over text** + no_mcp sentinel ([#5979](https://github.com/NousResearch/hermes-agent/pull/5979))
|
||||
- **Unknown toolsets warning suppressed** for MCP server names ([#5279](https://github.com/NousResearch/hermes-agent/pull/5279))
|
||||
|
||||
### Web & Files
|
||||
- **.zip document support** + auto-mount cache dirs into remote backends ([#4846](https://github.com/NousResearch/hermes-agent/pull/4846))
|
||||
- **Redact query secrets** in send_message errors — @WAXLYY ([#5650](https://github.com/NousResearch/hermes-agent/pull/5650))
|
||||
|
||||
### Delegation
|
||||
- **Credential pool sharing** + workspace path hints for subagents ([#5748](https://github.com/NousResearch/hermes-agent/pull/5748))
|
||||
|
||||
### ACP (VS Code / Zed / JetBrains)
|
||||
- **Aggregate ACP improvements** — auth compat, protocol fixes, command ads, delegation, SSE events ([#5292](https://github.com/NousResearch/hermes-agent/pull/5292))
|
||||
|
||||
---
|
||||
|
||||
## 🧩 Skills Ecosystem
|
||||
|
||||
### Skills System
|
||||
- **Skill config interface** — skills can declare required config.yaml settings, prompted during setup, injected at load time ([#5635](https://github.com/NousResearch/hermes-agent/pull/5635))
|
||||
- **Plugin CLI registration system** — plugins register their own CLI subcommands without touching main.py ([#5295](https://github.com/NousResearch/hermes-agent/pull/5295))
|
||||
- **Request-scoped API hooks** with tool call correlation IDs for plugins ([#5427](https://github.com/NousResearch/hermes-agent/pull/5427))
|
||||
- **Session lifecycle hooks** — on_session_finalize and on_session_reset for CLI + gateway ([#6129](https://github.com/NousResearch/hermes-agent/pull/6129))
|
||||
- **Prompt for required env vars** during plugin install — @kshitijk4poor ([#5470](https://github.com/NousResearch/hermes-agent/pull/5470))
|
||||
- **Plugin name validation** — reject names that resolve to plugins root ([#5368](https://github.com/NousResearch/hermes-agent/pull/5368))
|
||||
- **pre_llm_call plugin context** moved to user message to preserve prompt cache ([#5146](https://github.com/NousResearch/hermes-agent/pull/5146))
|
||||
|
||||
### New & Updated Skills
|
||||
- **popular-web-designs** — 54 production website design systems ([#5194](https://github.com/NousResearch/hermes-agent/pull/5194))
|
||||
- **p5js creative coding** — @SHL0MS ([#5600](https://github.com/NousResearch/hermes-agent/pull/5600))
|
||||
- **manim-video** — mathematical and technical animations — @SHL0MS ([#4930](https://github.com/NousResearch/hermes-agent/pull/4930))
|
||||
- **llm-wiki** — Karpathy's LLM Wiki skill ([#5635](https://github.com/NousResearch/hermes-agent/pull/5635))
|
||||
- **gitnexus-explorer** — codebase indexing and knowledge serving ([#5208](https://github.com/NousResearch/hermes-agent/pull/5208))
|
||||
- **research-paper-writing** — AI-Scientist & GPT-Researcher patterns — @SHL0MS ([#5421](https://github.com/NousResearch/hermes-agent/pull/5421))
|
||||
- **blogwatcher** updated to JulienTant's fork ([#5759](https://github.com/NousResearch/hermes-agent/pull/5759))
|
||||
- **claude-code skill** comprehensive rewrite v2.0 + v2.2 ([#5155](https://github.com/NousResearch/hermes-agent/pull/5155), [#5158](https://github.com/NousResearch/hermes-agent/pull/5158))
|
||||
- **Code verification skills** consolidated into one ([#4854](https://github.com/NousResearch/hermes-agent/pull/4854))
|
||||
- **Manim CE reference docs** expanded — geometry, animations, LaTeX — @leotrs ([#5791](https://github.com/NousResearch/hermes-agent/pull/5791))
|
||||
- **Manim-video references** — design thinking, updaters, paper explainer, decorations, production quality — @SHL0MS ([#5588](https://github.com/NousResearch/hermes-agent/pull/5588), [#5408](https://github.com/NousResearch/hermes-agent/pull/5408))
|
||||
|
||||
---
|
||||
|
||||
## 🔒 Security & Reliability
|
||||
|
||||
### Security Hardening
|
||||
- **Consolidated security** — SSRF protections, timing attack mitigations, tar traversal prevention, credential leakage guards ([#5944](https://github.com/NousResearch/hermes-agent/pull/5944))
|
||||
- **Cross-session isolation** + cron path traversal hardening ([#5613](https://github.com/NousResearch/hermes-agent/pull/5613))
|
||||
- **Workdir parameter sanitized** in terminal tool across all backends ([#5629](https://github.com/NousResearch/hermes-agent/pull/5629))
|
||||
- **Approval 'once' session escalation** prevented + cron delivery platform validation ([#5280](https://github.com/NousResearch/hermes-agent/pull/5280))
|
||||
- **Profile-scoped Google Workspace OAuth tokens** protected ([#4910](https://github.com/NousResearch/hermes-agent/pull/4910))
|
||||
|
||||
### Reliability
|
||||
- **Aggressive worktree and branch cleanup** to prevent accumulation ([#6134](https://github.com/NousResearch/hermes-agent/pull/6134))
|
||||
- **O(n²) catastrophic backtracking** in redact regex fixed — 100x improvement on large outputs ([#4962](https://github.com/NousResearch/hermes-agent/pull/4962))
|
||||
- **Runtime stability fixes** across core, web, delegate, and browser tools ([#4843](https://github.com/NousResearch/hermes-agent/pull/4843))
|
||||
- **API server streaming fix** + conversation history support ([#5977](https://github.com/NousResearch/hermes-agent/pull/5977))
|
||||
- **OpenViking API endpoint paths** and response parsing corrected ([#5078](https://github.com/NousResearch/hermes-agent/pull/5078))
|
||||
|
||||
---
|
||||
|
||||
## 🐛 Notable Bug Fixes
|
||||
|
||||
- **9 community bugfixes salvaged** — gateway, cron, deps, macOS launchd in one batch ([#5288](https://github.com/NousResearch/hermes-agent/pull/5288))
|
||||
- **Batch core bug fixes** — model config, session reset, alias fallback, launchctl, delegation, atomic writes ([#5630](https://github.com/NousResearch/hermes-agent/pull/5630))
|
||||
- **Batch gateway/platform fixes** — matrix E2EE, CJK input, Windows browser, Feishu reconnect + ACL ([#5665](https://github.com/NousResearch/hermes-agent/pull/5665))
|
||||
- **Stale test skips removed**, regex backtracking, file search bug, and test flakiness ([#4969](https://github.com/NousResearch/hermes-agent/pull/4969))
|
||||
- **Nix flake** — read version, regen uv.lock, add hermes_logging — @alt-glitch ([#5651](https://github.com/NousResearch/hermes-agent/pull/5651))
|
||||
- **Lowercase variable redaction** regression tests ([#5185](https://github.com/NousResearch/hermes-agent/pull/5185))
|
||||
|
||||
---
|
||||
|
||||
## 🧪 Testing
|
||||
|
||||
- **57 failing CI tests repaired** across 14 files ([#5823](https://github.com/NousResearch/hermes-agent/pull/5823))
|
||||
- **Test suite re-architecture** + CI failure fixes — @alt-glitch ([#5946](https://github.com/NousResearch/hermes-agent/pull/5946))
|
||||
- **Codebase-wide lint cleanup** — unused imports, dead code, and inefficient patterns ([#5821](https://github.com/NousResearch/hermes-agent/pull/5821))
|
||||
- **browser_close tool removed** — auto-cleanup handles it ([#5792](https://github.com/NousResearch/hermes-agent/pull/5792))
|
||||
|
||||
---
|
||||
|
||||
## 📚 Documentation
|
||||
|
||||
- **Comprehensive documentation audit** — fix stale info, expand thin pages, add depth ([#5393](https://github.com/NousResearch/hermes-agent/pull/5393))
|
||||
- **40+ discrepancies fixed** between documentation and codebase ([#5818](https://github.com/NousResearch/hermes-agent/pull/5818))
|
||||
- **13 features documented** from last week's PRs ([#5815](https://github.com/NousResearch/hermes-agent/pull/5815))
|
||||
- **Guides section overhaul** — fix existing + add 3 new tutorials ([#5735](https://github.com/NousResearch/hermes-agent/pull/5735))
|
||||
- **Salvaged 4 docs PRs** — docker setup, post-update validation, local LLM guide, signal-cli install ([#5727](https://github.com/NousResearch/hermes-agent/pull/5727))
|
||||
- **Discord configuration reference** ([#5386](https://github.com/NousResearch/hermes-agent/pull/5386))
|
||||
- **Community FAQ entries** for common workflows and troubleshooting ([#4797](https://github.com/NousResearch/hermes-agent/pull/4797))
|
||||
- **WSL2 networking guide** for local model servers ([#5616](https://github.com/NousResearch/hermes-agent/pull/5616))
|
||||
- **Honcho CLI reference** + plugin CLI registration docs ([#5308](https://github.com/NousResearch/hermes-agent/pull/5308))
|
||||
- **Obsidian Headless setup** for servers in llm-wiki ([#5660](https://github.com/NousResearch/hermes-agent/pull/5660))
|
||||
- **Hermes Mod visual skin editor** added to skins page ([#6095](https://github.com/NousResearch/hermes-agent/pull/6095))
|
||||
|
||||
---
|
||||
|
||||
## 👥 Contributors
|
||||
|
||||
### Core
|
||||
- **@teknium1** — 179 PRs
|
||||
|
||||
### Top Community Contributors
|
||||
- **@SHL0MS** (7 PRs) — p5js creative coding skill, manim-video skill + 5 reference expansions, research-paper-writing, Nous OAuth fix, manim font fix
|
||||
- **@alt-glitch** (3 PRs) — Firecrawl cloud browser provider, test re-architecture + CI fixes, Nix flake fixes
|
||||
- **@benbarclay** (2 PRs) — Browser Use managed provider switch, Nous portal base URL fix
|
||||
- **@CharlieKerfoot** (2 PRs) — macOS portable base64 encoding, thread-safe PairingStore
|
||||
- **@WAXLYY** (2 PRs) — send_message secret redaction, gateway media URL sanitization
|
||||
- **@MadKangYu** (2 PRs) — Telegram log noise reduction, context compaction fix for temperature-restricted models
|
||||
|
||||
### All Contributors
|
||||
@alt-glitch, @austinpickett, @auspic7, @benbarclay, @CharlieKerfoot, @GratefulDave, @kshitijk4poor, @leotrs, @lumethegreat, @MadKangYu, @nericervin, @ryanautomated, @SHL0MS, @techguysimon, @tymrtn, @Vasanthdev2004, @WAXLYY, @xinbenlv
|
||||
|
||||
---
|
||||
|
||||
**Full Changelog**: [v2026.4.3...v2026.4.8](https://github.com/NousResearch/hermes-agent/compare/v2026.4.3...v2026.4.8)
|
||||
@@ -1102,7 +1102,15 @@ def convert_messages_to_anthropic(
|
||||
curr_content = [{"type": "text", "text": curr_content}]
|
||||
fixed[-1]["content"] = prev_content + curr_content
|
||||
else:
|
||||
# Consecutive assistant messages — merge text content
|
||||
# Consecutive assistant messages — merge text content.
|
||||
# Drop thinking blocks from the *second* message: their
|
||||
# signature was computed against a different turn boundary
|
||||
# and becomes invalid once merged.
|
||||
if isinstance(m["content"], list):
|
||||
m["content"] = [
|
||||
b for b in m["content"]
|
||||
if not (isinstance(b, dict) and b.get("type") in ("thinking", "redacted_thinking"))
|
||||
]
|
||||
prev_blocks = fixed[-1]["content"]
|
||||
curr_blocks = m["content"]
|
||||
if isinstance(prev_blocks, list) and isinstance(curr_blocks, list):
|
||||
@@ -1120,6 +1128,68 @@ def convert_messages_to_anthropic(
|
||||
fixed.append(m)
|
||||
result = fixed
|
||||
|
||||
# ── Thinking block signature management ──────────────────────────
|
||||
# Anthropic signs thinking blocks against the full turn content.
|
||||
# Any upstream mutation (context compression, session truncation,
|
||||
# orphan stripping, message merging) invalidates the signature,
|
||||
# causing HTTP 400 "Invalid signature in thinking block".
|
||||
#
|
||||
# Strategy (following clawdbot/OpenClaw pattern):
|
||||
# 1. Strip thinking/redacted_thinking from all assistant messages
|
||||
# EXCEPT the last one — preserves reasoning continuity on the
|
||||
# current tool-use chain while avoiding stale signature errors.
|
||||
# 2. Downgrade unsigned thinking blocks (no signature) to text —
|
||||
# Anthropic can't validate them and will reject them.
|
||||
# 3. Strip cache_control from thinking/redacted_thinking blocks —
|
||||
# cache markers can interfere with signature validation.
|
||||
_THINKING_TYPES = frozenset(("thinking", "redacted_thinking"))
|
||||
|
||||
last_assistant_idx = None
|
||||
for i in range(len(result) - 1, -1, -1):
|
||||
if result[i].get("role") == "assistant":
|
||||
last_assistant_idx = i
|
||||
break
|
||||
|
||||
for idx, m in enumerate(result):
|
||||
if m.get("role") != "assistant" or not isinstance(m.get("content"), list):
|
||||
continue
|
||||
|
||||
if idx != last_assistant_idx:
|
||||
# Strip ALL thinking blocks from non-latest assistant messages
|
||||
stripped = [
|
||||
b for b in m["content"]
|
||||
if not (isinstance(b, dict) and b.get("type") in _THINKING_TYPES)
|
||||
]
|
||||
m["content"] = stripped or [{"type": "text", "text": "(thinking elided)"}]
|
||||
else:
|
||||
# Latest assistant: keep signed thinking blocks for reasoning
|
||||
# continuity; downgrade unsigned ones to plain text.
|
||||
new_content = []
|
||||
for b in m["content"]:
|
||||
if not isinstance(b, dict) or b.get("type") not in _THINKING_TYPES:
|
||||
new_content.append(b)
|
||||
continue
|
||||
if b.get("type") == "redacted_thinking":
|
||||
# Redacted blocks use 'data' for the signature payload
|
||||
if b.get("data"):
|
||||
new_content.append(b)
|
||||
# else: drop — no data means it can't be validated
|
||||
elif b.get("signature"):
|
||||
# Signed thinking block — keep it
|
||||
new_content.append(b)
|
||||
else:
|
||||
# Unsigned thinking — downgrade to text so it's not lost
|
||||
thinking_text = b.get("thinking", "")
|
||||
if thinking_text:
|
||||
new_content.append({"type": "text", "text": thinking_text})
|
||||
m["content"] = new_content or [{"type": "text", "text": "(empty)"}]
|
||||
|
||||
# Strip cache_control from any remaining thinking/redacted_thinking
|
||||
# blocks — cache markers interfere with signature validation.
|
||||
for b in m["content"]:
|
||||
if isinstance(b, dict) and b.get("type") in _THINKING_TYPES:
|
||||
b.pop("cache_control", None)
|
||||
|
||||
return system, result
|
||||
|
||||
|
||||
@@ -1224,9 +1294,9 @@ def build_anthropic_kwargs(
|
||||
# Map reasoning_config to Anthropic's thinking parameter.
|
||||
# Claude 4.6 models use adaptive thinking + output_config.effort.
|
||||
# Older models use manual thinking with budget_tokens.
|
||||
# Haiku models do NOT support extended thinking at all — skip entirely.
|
||||
# Haiku and MiniMax models do NOT support extended thinking — skip entirely.
|
||||
if reasoning_config and isinstance(reasoning_config, dict):
|
||||
if reasoning_config.get("enabled") is not False and "haiku" not in model.lower():
|
||||
if reasoning_config.get("enabled") is not False and "haiku" not in model.lower() and "minimax" not in model.lower():
|
||||
effort = str(reasoning_config.get("effort", "medium")).lower()
|
||||
budget = THINKING_BUDGET.get(effort, 8000)
|
||||
if _supports_adaptive_thinking(model):
|
||||
|
||||
+129
-42
@@ -59,13 +59,48 @@ from hermes_constants import OPENROUTER_BASE_URL
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
_PROVIDER_ALIASES = {
|
||||
"google": "gemini",
|
||||
"google-gemini": "gemini",
|
||||
"google-ai-studio": "gemini",
|
||||
"glm": "zai",
|
||||
"z-ai": "zai",
|
||||
"z.ai": "zai",
|
||||
"zhipu": "zai",
|
||||
"kimi": "kimi-coding",
|
||||
"moonshot": "kimi-coding",
|
||||
"minimax-china": "minimax-cn",
|
||||
"minimax_cn": "minimax-cn",
|
||||
"claude": "anthropic",
|
||||
"claude-code": "anthropic",
|
||||
}
|
||||
|
||||
|
||||
def _normalize_aux_provider(provider: Optional[str], *, for_vision: bool = False) -> str:
|
||||
normalized = (provider or "auto").strip().lower()
|
||||
if normalized.startswith("custom:"):
|
||||
suffix = normalized.split(":", 1)[1].strip()
|
||||
if not suffix:
|
||||
return "custom"
|
||||
normalized = suffix if not for_vision else "custom"
|
||||
if normalized == "codex":
|
||||
return "openai-codex"
|
||||
if normalized == "main":
|
||||
# Resolve to the user's actual main provider so named custom providers
|
||||
# and non-aggregator providers (DeepSeek, Alibaba, etc.) work correctly.
|
||||
main_prov = _read_main_provider()
|
||||
if main_prov and main_prov not in ("auto", "main", ""):
|
||||
return main_prov
|
||||
return "custom"
|
||||
return _PROVIDER_ALIASES.get(normalized, normalized)
|
||||
|
||||
# Default auxiliary models for direct API-key providers (cheap/fast for side tasks)
|
||||
_API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = {
|
||||
"gemini": "gemini-3-flash-preview",
|
||||
"zai": "glm-4.5-flash",
|
||||
"kimi-coding": "kimi-k2-turbo-preview",
|
||||
"minimax": "MiniMax-M2.7-highspeed",
|
||||
"minimax-cn": "MiniMax-M2.7-highspeed",
|
||||
"minimax": "MiniMax-M2.7",
|
||||
"minimax-cn": "MiniMax-M2.7",
|
||||
"anthropic": "claude-haiku-4-5-20251001",
|
||||
"ai-gateway": "google/gemini-3-flash",
|
||||
"opencode-zen": "gemini-3-flash",
|
||||
@@ -92,6 +127,7 @@ auxiliary_is_nous: bool = False
|
||||
_OPENROUTER_MODEL = "google/gemini-3-flash-preview"
|
||||
_NOUS_MODEL = "google/gemini-3-flash-preview"
|
||||
_NOUS_FREE_TIER_VISION_MODEL = "xiaomi/mimo-v2-omni"
|
||||
_NOUS_FREE_TIER_AUX_MODEL = "xiaomi/mimo-v2-pro"
|
||||
_NOUS_DEFAULT_BASE_URL = "https://inference-api.nousresearch.com/v1"
|
||||
_ANTHROPIC_DEFAULT_BASE_URL = "https://api.anthropic.com"
|
||||
_AUTH_JSON_PATH = get_hermes_home() / "auth.json"
|
||||
@@ -105,6 +141,23 @@ _CODEX_AUX_MODEL = "gpt-5.2-codex"
|
||||
_CODEX_AUX_BASE_URL = "https://chatgpt.com/backend-api/codex"
|
||||
|
||||
|
||||
def _to_openai_base_url(base_url: str) -> str:
|
||||
"""Normalize an Anthropic-style base URL to OpenAI-compatible format.
|
||||
|
||||
Some providers (MiniMax, MiniMax-CN) expose an ``/anthropic`` endpoint for
|
||||
the Anthropic Messages API and a separate ``/v1`` endpoint for OpenAI chat
|
||||
completions. The auxiliary client uses the OpenAI SDK, so it must hit the
|
||||
``/v1`` surface. Passing the raw ``inference_base_url`` causes requests to
|
||||
land on ``/anthropic/chat/completions`` — a 404.
|
||||
"""
|
||||
url = str(base_url or "").strip().rstrip("/")
|
||||
if url.endswith("/anthropic"):
|
||||
rewritten = url[: -len("/anthropic")] + "/v1"
|
||||
logger.debug("Auxiliary client: rewrote base URL %s → %s", url, rewritten)
|
||||
return rewritten
|
||||
return url
|
||||
|
||||
|
||||
def _select_pool_entry(provider: str) -> Tuple[bool, Optional[Any]]:
|
||||
"""Return (pool_exists_for_provider, selected_entry)."""
|
||||
try:
|
||||
@@ -634,7 +687,9 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
|
||||
if not api_key:
|
||||
continue
|
||||
|
||||
base_url = _pool_runtime_base_url(entry, pconfig.inference_base_url) or pconfig.inference_base_url
|
||||
base_url = _to_openai_base_url(
|
||||
_pool_runtime_base_url(entry, pconfig.inference_base_url) or pconfig.inference_base_url
|
||||
)
|
||||
model = _API_KEY_PROVIDER_AUX_MODELS.get(provider_id, "default")
|
||||
logger.debug("Auxiliary text client: %s (%s) via pool", pconfig.name, model)
|
||||
extra = {}
|
||||
@@ -651,7 +706,9 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
|
||||
if not api_key:
|
||||
continue
|
||||
|
||||
base_url = str(creds.get("base_url", "")).strip().rstrip("/") or pconfig.inference_base_url
|
||||
base_url = _to_openai_base_url(
|
||||
str(creds.get("base_url", "")).strip().rstrip("/") or pconfig.inference_base_url
|
||||
)
|
||||
model = _API_KEY_PROVIDER_AUX_MODELS.get(provider_id, "default")
|
||||
logger.debug("Auxiliary text client: %s (%s)", pconfig.name, model)
|
||||
extra = {}
|
||||
@@ -713,7 +770,7 @@ def _try_openrouter() -> Tuple[Optional[OpenAI], Optional[str]]:
|
||||
default_headers=_OR_HEADERS), _OPENROUTER_MODEL
|
||||
|
||||
|
||||
def _try_nous() -> Tuple[Optional[OpenAI], Optional[str]]:
|
||||
def _try_nous(vision: bool = False) -> Tuple[Optional[OpenAI], Optional[str]]:
|
||||
nous = _read_nous_auth()
|
||||
if not nous:
|
||||
return None, None
|
||||
@@ -725,12 +782,13 @@ def _try_nous() -> Tuple[Optional[OpenAI], Optional[str]]:
|
||||
else:
|
||||
model = _NOUS_MODEL
|
||||
# Free-tier users can't use paid auxiliary models — use the free
|
||||
# multimodal model instead so vision/browser-vision still works.
|
||||
# models instead: mimo-v2-omni for vision, mimo-v2-pro for text tasks.
|
||||
try:
|
||||
from hermes_cli.models import check_nous_free_tier
|
||||
if check_nous_free_tier():
|
||||
model = _NOUS_FREE_TIER_VISION_MODEL
|
||||
logger.debug("Free-tier Nous account — using %s for auxiliary/vision", model)
|
||||
model = _NOUS_FREE_TIER_VISION_MODEL if vision else _NOUS_FREE_TIER_AUX_MODEL
|
||||
logger.debug("Free-tier Nous account — using %s for auxiliary/%s",
|
||||
model, "vision" if vision else "text")
|
||||
except Exception:
|
||||
pass
|
||||
return (
|
||||
@@ -776,7 +834,7 @@ def _read_main_provider() -> str:
|
||||
if isinstance(model_cfg, dict):
|
||||
provider = model_cfg.get("provider", "")
|
||||
if isinstance(provider, str) and provider.strip():
|
||||
return provider.strip().lower()
|
||||
return _normalize_aux_provider(provider)
|
||||
except Exception:
|
||||
pass
|
||||
return ""
|
||||
@@ -1138,11 +1196,7 @@ def resolve_provider_client(
|
||||
(client, resolved_model) or (None, None) if auth is unavailable.
|
||||
"""
|
||||
# Normalise aliases
|
||||
provider = (provider or "auto").strip().lower()
|
||||
if provider == "codex":
|
||||
provider = "openai-codex"
|
||||
if provider == "main":
|
||||
provider = "custom"
|
||||
provider = _normalize_aux_provider(provider)
|
||||
|
||||
# ── Auto: try all providers in priority order ────────────────────
|
||||
if provider == "auto":
|
||||
@@ -1238,6 +1292,28 @@ def resolve_provider_client(
|
||||
"but no endpoint credentials found")
|
||||
return None, None
|
||||
|
||||
# ── Named custom providers (config.yaml custom_providers list) ───
|
||||
try:
|
||||
from hermes_cli.runtime_provider import _get_named_custom_provider
|
||||
custom_entry = _get_named_custom_provider(provider)
|
||||
if custom_entry:
|
||||
custom_base = custom_entry.get("base_url", "").strip()
|
||||
custom_key = custom_entry.get("api_key", "").strip() or "no-key-required"
|
||||
if custom_base:
|
||||
final_model = model or _read_main_model() or "gpt-4o-mini"
|
||||
client = OpenAI(api_key=custom_key, base_url=custom_base)
|
||||
logger.debug(
|
||||
"resolve_provider_client: named custom provider %r (%s)",
|
||||
provider, final_model)
|
||||
return (_to_async_client(client, final_model) if async_mode
|
||||
else (client, final_model))
|
||||
logger.warning(
|
||||
"resolve_provider_client: named custom provider %r has no base_url",
|
||||
provider)
|
||||
return None, None
|
||||
except ImportError:
|
||||
pass
|
||||
|
||||
# ── API-key providers from PROVIDER_REGISTRY ─────────────────────
|
||||
try:
|
||||
from hermes_cli.auth import PROVIDER_REGISTRY, resolve_api_key_provider_credentials
|
||||
@@ -1270,7 +1346,9 @@ def resolve_provider_client(
|
||||
provider, ", ".join(tried_sources))
|
||||
return None, None
|
||||
|
||||
base_url = str(creds.get("base_url", "")).strip().rstrip("/") or pconfig.inference_base_url
|
||||
base_url = _to_openai_base_url(
|
||||
str(creds.get("base_url", "")).strip().rstrip("/") or pconfig.inference_base_url
|
||||
)
|
||||
|
||||
default_model = _API_KEY_PROVIDER_AUX_MODELS.get(provider, "")
|
||||
final_model = model or default_model
|
||||
@@ -1347,19 +1425,11 @@ def get_async_text_auxiliary_client(task: str = ""):
|
||||
_VISION_AUTO_PROVIDER_ORDER = (
|
||||
"openrouter",
|
||||
"nous",
|
||||
"openai-codex",
|
||||
"anthropic",
|
||||
"custom",
|
||||
)
|
||||
|
||||
|
||||
def _normalize_vision_provider(provider: Optional[str]) -> str:
|
||||
provider = (provider or "auto").strip().lower()
|
||||
if provider == "codex":
|
||||
return "openai-codex"
|
||||
if provider == "main":
|
||||
return "custom"
|
||||
return provider
|
||||
return _normalize_aux_provider(provider, for_vision=True)
|
||||
|
||||
|
||||
def _resolve_strict_vision_backend(provider: str) -> Tuple[Optional[Any], Optional[str]]:
|
||||
@@ -1367,7 +1437,7 @@ def _resolve_strict_vision_backend(provider: str) -> Tuple[Optional[Any], Option
|
||||
if provider == "openrouter":
|
||||
return _try_openrouter()
|
||||
if provider == "nous":
|
||||
return _try_nous()
|
||||
return _try_nous(vision=True)
|
||||
if provider == "openai-codex":
|
||||
return _try_codex()
|
||||
if provider == "anthropic":
|
||||
@@ -1400,17 +1470,20 @@ def _preferred_main_vision_provider() -> Optional[str]:
|
||||
def get_available_vision_backends() -> List[str]:
|
||||
"""Return the currently available vision backends in auto-selection order.
|
||||
|
||||
This is the single source of truth for setup, tool gating, and runtime
|
||||
auto-routing of vision tasks. The selected main provider is preferred when
|
||||
it is also a known-good vision backend; otherwise Hermes falls back through
|
||||
the standard conservative order.
|
||||
Order: OpenRouter → Nous → active provider. This is the single source
|
||||
of truth for setup, tool gating, and runtime auto-routing of vision tasks.
|
||||
"""
|
||||
ordered = list(_VISION_AUTO_PROVIDER_ORDER)
|
||||
preferred = _preferred_main_vision_provider()
|
||||
if preferred in ordered:
|
||||
ordered.remove(preferred)
|
||||
ordered.insert(0, preferred)
|
||||
return [provider for provider in ordered if _strict_vision_backend_available(provider)]
|
||||
available = [p for p in _VISION_AUTO_PROVIDER_ORDER
|
||||
if _strict_vision_backend_available(p)]
|
||||
# Also check the user's active provider (may be DeepSeek, Alibaba, named
|
||||
# custom, etc.) — resolve_provider_client handles all provider types.
|
||||
main_provider = _read_main_provider()
|
||||
if (main_provider and main_provider not in ("auto", "")
|
||||
and main_provider not in available):
|
||||
client, _ = resolve_provider_client(main_provider, _read_main_model())
|
||||
if client is not None:
|
||||
available.append(main_provider)
|
||||
return available
|
||||
|
||||
|
||||
def resolve_vision_provider_client(
|
||||
@@ -1455,16 +1528,30 @@ def resolve_vision_provider_client(
|
||||
return "custom", client, final_model
|
||||
|
||||
if requested == "auto":
|
||||
ordered = list(_VISION_AUTO_PROVIDER_ORDER)
|
||||
preferred = _preferred_main_vision_provider()
|
||||
if preferred in ordered:
|
||||
ordered.remove(preferred)
|
||||
ordered.insert(0, preferred)
|
||||
|
||||
for candidate in ordered:
|
||||
# Vision auto-detection order:
|
||||
# 1. OpenRouter (known vision-capable default model)
|
||||
# 2. Nous Portal (known vision-capable default model)
|
||||
# 3. Active provider + model (user's main chat config)
|
||||
# 4. Stop
|
||||
for candidate in _VISION_AUTO_PROVIDER_ORDER:
|
||||
sync_client, default_model = _resolve_strict_vision_backend(candidate)
|
||||
if sync_client is not None:
|
||||
return _finalize(candidate, sync_client, default_model)
|
||||
|
||||
# Fall back to the user's active provider + model.
|
||||
main_provider = _read_main_provider()
|
||||
main_model = _read_main_model()
|
||||
if main_provider and main_provider not in ("auto", ""):
|
||||
sync_client, resolved_model = resolve_provider_client(
|
||||
main_provider, main_model)
|
||||
if sync_client is not None:
|
||||
logger.info(
|
||||
"Vision auto-detect: using active provider %s (%s)",
|
||||
main_provider, resolved_model or main_model,
|
||||
)
|
||||
return _finalize(
|
||||
main_provider, sync_client, resolved_model or main_model)
|
||||
|
||||
logger.debug("Auxiliary vision client: none available")
|
||||
return None, None, None
|
||||
|
||||
|
||||
+63
-3
@@ -113,8 +113,15 @@ DEFAULT_CONTEXT_LENGTHS = {
|
||||
"llama": 131072,
|
||||
# Qwen
|
||||
"qwen": 131072,
|
||||
# MiniMax
|
||||
"minimax": 204800,
|
||||
# MiniMax (lowercase — lookup lowercases model names at line 973)
|
||||
"minimax-m1-256k": 1000000,
|
||||
"minimax-m1-128k": 1000000,
|
||||
"minimax-m1-80k": 1000000,
|
||||
"minimax-m1-40k": 1000000,
|
||||
"minimax-m1": 1000000,
|
||||
"minimax-m2.5": 1048576,
|
||||
"minimax-m2.7": 1048576,
|
||||
"minimax": 1048576,
|
||||
# GLM
|
||||
"glm": 202752,
|
||||
# Kimi
|
||||
@@ -127,7 +134,7 @@ DEFAULT_CONTEXT_LENGTHS = {
|
||||
"deepseek-ai/DeepSeek-V3.2": 65536,
|
||||
"moonshotai/Kimi-K2.5": 262144,
|
||||
"moonshotai/Kimi-K2-Thinking": 262144,
|
||||
"MiniMaxAI/MiniMax-M2.5": 204800,
|
||||
"minimaxai/minimax-m2.5": 1048576,
|
||||
"XiaomiMiMo/MiMo-V2-Flash": 32768,
|
||||
"mimo-v2-pro": 1048576,
|
||||
"mimo-v2-omni": 1048576,
|
||||
@@ -611,6 +618,59 @@ def _model_id_matches(candidate_id: str, lookup_model: str) -> bool:
|
||||
return False
|
||||
|
||||
|
||||
def query_ollama_num_ctx(model: str, base_url: str) -> Optional[int]:
|
||||
"""Query an Ollama server for the model's context length.
|
||||
|
||||
Returns the model's maximum context from GGUF metadata via ``/api/show``,
|
||||
or the explicit ``num_ctx`` from the Modelfile if set. Returns None if
|
||||
the server is unreachable or not Ollama.
|
||||
|
||||
This is the value that should be passed as ``num_ctx`` in Ollama chat
|
||||
requests to override the default 2048.
|
||||
"""
|
||||
import httpx
|
||||
|
||||
bare_model = _strip_provider_prefix(model)
|
||||
server_url = base_url.rstrip("/")
|
||||
if server_url.endswith("/v1"):
|
||||
server_url = server_url[:-3]
|
||||
|
||||
try:
|
||||
server_type = detect_local_server_type(base_url)
|
||||
except Exception:
|
||||
return None
|
||||
if server_type != "ollama":
|
||||
return None
|
||||
|
||||
try:
|
||||
with httpx.Client(timeout=3.0) as client:
|
||||
resp = client.post(f"{server_url}/api/show", json={"name": bare_model})
|
||||
if resp.status_code != 200:
|
||||
return None
|
||||
data = resp.json()
|
||||
|
||||
# Prefer explicit num_ctx from Modelfile parameters (user override)
|
||||
params = data.get("parameters", "")
|
||||
if "num_ctx" in params:
|
||||
for line in params.split("\n"):
|
||||
if "num_ctx" in line:
|
||||
parts = line.strip().split()
|
||||
if len(parts) >= 2:
|
||||
try:
|
||||
return int(parts[-1])
|
||||
except ValueError:
|
||||
pass
|
||||
|
||||
# Fall back to GGUF model_info context_length (training max)
|
||||
model_info = data.get("model_info", {})
|
||||
for key, value in model_info.items():
|
||||
if "context_length" in key and isinstance(value, (int, float)):
|
||||
return int(value)
|
||||
except Exception:
|
||||
pass
|
||||
return None
|
||||
|
||||
|
||||
def _query_local_context_length(model: str, base_url: str) -> Optional[int]:
|
||||
"""Query a local server for the model's context length."""
|
||||
import httpx
|
||||
|
||||
@@ -204,6 +204,30 @@ OPENAI_MODEL_EXECUTION_GUIDANCE = (
|
||||
"the result.\n"
|
||||
"</tool_persistence>\n"
|
||||
"\n"
|
||||
"<mandatory_tool_use>\n"
|
||||
"NEVER answer these from memory or mental computation — ALWAYS use a tool:\n"
|
||||
"- Arithmetic, math, calculations → use terminal or execute_code\n"
|
||||
"- Hashes, encodings, checksums → use terminal (e.g. sha256sum, base64)\n"
|
||||
"- Current time, date, timezone → use terminal (e.g. date)\n"
|
||||
"- System state: OS, CPU, memory, disk, ports, processes → use terminal\n"
|
||||
"- File contents, sizes, line counts → use read_file, search_files, or terminal\n"
|
||||
"- Git history, branches, diffs → use terminal\n"
|
||||
"- Current facts (weather, news, versions) → use web_search\n"
|
||||
"Your memory and user profile describe the USER, not the system you are "
|
||||
"running on. The execution environment may differ from what the user profile "
|
||||
"says about their personal setup.\n"
|
||||
"</mandatory_tool_use>\n"
|
||||
"\n"
|
||||
"<act_dont_ask>\n"
|
||||
"When a question has an obvious default interpretation, act on it immediately "
|
||||
"instead of asking for clarification. Examples:\n"
|
||||
"- 'Is port 443 open?' → check THIS machine (don't ask 'open where?')\n"
|
||||
"- 'What OS am I running?' → check the live system (don't use user profile)\n"
|
||||
"- 'What time is it?' → run `date` (don't guess)\n"
|
||||
"Only ask for clarification when the ambiguity genuinely changes what tool "
|
||||
"you would call.\n"
|
||||
"</act_dont_ask>\n"
|
||||
"\n"
|
||||
"<prerequisite_checks>\n"
|
||||
"- Before taking an action, check whether prerequisite discovery, lookup, or "
|
||||
"context-gathering steps are needed.\n"
|
||||
|
||||
@@ -0,0 +1,57 @@
|
||||
"""Retry utilities — jittered backoff for decorrelated retries.
|
||||
|
||||
Replaces fixed exponential backoff with jittered delays to prevent
|
||||
thundering-herd retry spikes when multiple sessions hit the same
|
||||
rate-limited provider concurrently.
|
||||
"""
|
||||
|
||||
import random
|
||||
import threading
|
||||
import time
|
||||
|
||||
# Monotonic counter for jitter seed uniqueness within the same process.
|
||||
# Protected by a lock to avoid race conditions in concurrent retry paths
|
||||
# (e.g. multiple gateway sessions retrying simultaneously).
|
||||
_jitter_counter = 0
|
||||
_jitter_lock = threading.Lock()
|
||||
|
||||
|
||||
def jittered_backoff(
|
||||
attempt: int,
|
||||
*,
|
||||
base_delay: float = 5.0,
|
||||
max_delay: float = 120.0,
|
||||
jitter_ratio: float = 0.5,
|
||||
) -> float:
|
||||
"""Compute a jittered exponential backoff delay.
|
||||
|
||||
Args:
|
||||
attempt: 1-based retry attempt number.
|
||||
base_delay: Base delay in seconds for attempt 1.
|
||||
max_delay: Maximum delay cap in seconds.
|
||||
jitter_ratio: Fraction of computed delay to use as random jitter
|
||||
range. 0.5 means jitter is uniform in [0, 0.5 * delay].
|
||||
|
||||
Returns:
|
||||
Delay in seconds: min(base * 2^(attempt-1), max_delay) + jitter.
|
||||
|
||||
The jitter decorrelates concurrent retries so multiple sessions
|
||||
hitting the same provider don't all retry at the same instant.
|
||||
"""
|
||||
global _jitter_counter
|
||||
with _jitter_lock:
|
||||
_jitter_counter += 1
|
||||
tick = _jitter_counter
|
||||
|
||||
exponent = max(0, attempt - 1)
|
||||
if exponent >= 63 or base_delay <= 0:
|
||||
delay = max_delay
|
||||
else:
|
||||
delay = min(base_delay * (2 ** exponent), max_delay)
|
||||
|
||||
# Seed from time + counter for decorrelation even with coarse clocks.
|
||||
seed = (time.time_ns() ^ (tick * 0x9E3779B9)) & 0xFFFFFFFF
|
||||
rng = random.Random(seed)
|
||||
jitter = rng.uniform(0, jitter_ratio * delay)
|
||||
|
||||
return delay + jitter
|
||||
@@ -63,7 +63,7 @@ from agent.usage_pricing import (
|
||||
format_duration_compact,
|
||||
format_token_count_compact,
|
||||
)
|
||||
from hermes_cli.banner import _format_context_length
|
||||
from hermes_cli.banner import _format_context_length, format_banner_version_label
|
||||
|
||||
_COMMAND_SPINNER_FRAMES = ("⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏")
|
||||
|
||||
@@ -612,6 +612,11 @@ def _run_cleanup():
|
||||
pass
|
||||
# Shut down memory provider (on_session_end + shutdown_all) at actual
|
||||
# session boundary — NOT per-turn inside run_conversation().
|
||||
try:
|
||||
from hermes_cli.plugins import invoke_hook as _invoke_hook
|
||||
_invoke_hook("on_session_finalize", session_id=_active_agent_ref.session_id if _active_agent_ref else None, platform="cli")
|
||||
except Exception:
|
||||
pass
|
||||
try:
|
||||
if _active_agent_ref and hasattr(_active_agent_ref, 'shutdown_memory_provider'):
|
||||
_active_agent_ref.shutdown_memory_provider(
|
||||
@@ -755,7 +760,10 @@ def _setup_worktree(repo_root: str = None) -> Optional[Dict[str, str]]:
|
||||
def _cleanup_worktree(info: Dict[str, str] = None) -> None:
|
||||
"""Remove a worktree and its branch on exit.
|
||||
|
||||
If the worktree has uncommitted changes, warn and keep it.
|
||||
Preserves the worktree only if it has unpushed commits (real work
|
||||
that hasn't been pushed to any remote). Uncommitted changes alone
|
||||
(untracked files, test artifacts) are not enough to keep it — agent
|
||||
work lives in commits/PRs, not the working tree.
|
||||
"""
|
||||
global _active_worktree
|
||||
info = info or _active_worktree
|
||||
@@ -771,23 +779,27 @@ def _cleanup_worktree(info: Dict[str, str] = None) -> None:
|
||||
if not Path(wt_path).exists():
|
||||
return
|
||||
|
||||
# Check for uncommitted changes
|
||||
# Check for unpushed commits — commits reachable from HEAD but not
|
||||
# from any remote branch. These represent real work the agent did
|
||||
# but didn't push.
|
||||
has_unpushed = False
|
||||
try:
|
||||
status = subprocess.run(
|
||||
["git", "status", "--porcelain"],
|
||||
result = subprocess.run(
|
||||
["git", "log", "--oneline", "HEAD", "--not", "--remotes"],
|
||||
capture_output=True, text=True, timeout=10, cwd=wt_path,
|
||||
)
|
||||
has_changes = bool(status.stdout.strip())
|
||||
has_unpushed = bool(result.stdout.strip())
|
||||
except Exception:
|
||||
has_changes = True # Assume dirty on error — don't delete
|
||||
has_unpushed = True # Assume unpushed on error — don't delete
|
||||
|
||||
if has_changes:
|
||||
print(f"\n\033[33m⚠ Worktree has uncommitted changes, keeping: {wt_path}\033[0m")
|
||||
print(f" To clean up manually: git worktree remove {wt_path}")
|
||||
if has_unpushed:
|
||||
print(f"\n\033[33m⚠ Worktree has unpushed commits, keeping: {wt_path}\033[0m")
|
||||
print(f" To clean up manually: git worktree remove --force {wt_path}")
|
||||
_active_worktree = None
|
||||
return
|
||||
|
||||
# Remove worktree
|
||||
# Remove worktree (even if working tree is dirty — uncommitted
|
||||
# changes without unpushed commits are just artifacts)
|
||||
try:
|
||||
subprocess.run(
|
||||
["git", "worktree", "remove", wt_path, "--force"],
|
||||
@@ -796,7 +808,7 @@ def _cleanup_worktree(info: Dict[str, str] = None) -> None:
|
||||
except Exception as e:
|
||||
logger.debug("Failed to remove worktree: %s", e)
|
||||
|
||||
# Delete the branch (only if it was never pushed / has no upstream)
|
||||
# Delete the branch
|
||||
try:
|
||||
subprocess.run(
|
||||
["git", "branch", "-D", branch],
|
||||
@@ -810,19 +822,27 @@ def _cleanup_worktree(info: Dict[str, str] = None) -> None:
|
||||
|
||||
|
||||
def _prune_stale_worktrees(repo_root: str, max_age_hours: int = 24) -> None:
|
||||
"""Remove worktrees older than max_age_hours that have no uncommitted changes.
|
||||
"""Remove stale worktrees and orphaned branches on startup.
|
||||
|
||||
Runs silently on startup to clean up after crashed/killed sessions.
|
||||
Age-based tiers:
|
||||
- Under max_age_hours (24h): skip — session may still be active.
|
||||
- 24h–72h: remove if no unpushed commits.
|
||||
- Over 72h: force remove regardless (nothing should sit this long).
|
||||
|
||||
Also prunes orphaned ``hermes/*`` and ``pr-*`` local branches that
|
||||
have no corresponding worktree.
|
||||
"""
|
||||
import subprocess
|
||||
import time
|
||||
|
||||
worktrees_dir = Path(repo_root) / ".worktrees"
|
||||
if not worktrees_dir.exists():
|
||||
_prune_orphaned_branches(repo_root)
|
||||
return
|
||||
|
||||
now = time.time()
|
||||
cutoff = now - (max_age_hours * 3600)
|
||||
soft_cutoff = now - (max_age_hours * 3600) # 24h default
|
||||
hard_cutoff = now - (max_age_hours * 3 * 3600) # 72h default
|
||||
|
||||
for entry in worktrees_dir.iterdir():
|
||||
if not entry.is_dir() or not entry.name.startswith("hermes-"):
|
||||
@@ -831,21 +851,24 @@ def _prune_stale_worktrees(repo_root: str, max_age_hours: int = 24) -> None:
|
||||
# Check age
|
||||
try:
|
||||
mtime = entry.stat().st_mtime
|
||||
if mtime > cutoff:
|
||||
if mtime > soft_cutoff:
|
||||
continue # Too recent — skip
|
||||
except Exception:
|
||||
continue
|
||||
|
||||
# Check for uncommitted changes
|
||||
try:
|
||||
status = subprocess.run(
|
||||
["git", "status", "--porcelain"],
|
||||
capture_output=True, text=True, timeout=5, cwd=str(entry),
|
||||
)
|
||||
if status.stdout.strip():
|
||||
continue # Has changes — skip
|
||||
except Exception:
|
||||
continue # Can't check — skip
|
||||
force = mtime <= hard_cutoff # Over 72h — force remove
|
||||
|
||||
if not force:
|
||||
# 24h–72h tier: only remove if no unpushed commits
|
||||
try:
|
||||
result = subprocess.run(
|
||||
["git", "log", "--oneline", "HEAD", "--not", "--remotes"],
|
||||
capture_output=True, text=True, timeout=5, cwd=str(entry),
|
||||
)
|
||||
if result.stdout.strip():
|
||||
continue # Has unpushed commits — skip
|
||||
except Exception:
|
||||
continue # Can't check — skip
|
||||
|
||||
# Safe to remove
|
||||
try:
|
||||
@@ -864,10 +887,81 @@ def _prune_stale_worktrees(repo_root: str, max_age_hours: int = 24) -> None:
|
||||
["git", "branch", "-D", branch],
|
||||
capture_output=True, text=True, timeout=10, cwd=repo_root,
|
||||
)
|
||||
logger.debug("Pruned stale worktree: %s", entry.name)
|
||||
logger.debug("Pruned stale worktree: %s (force=%s)", entry.name, force)
|
||||
except Exception as e:
|
||||
logger.debug("Failed to prune worktree %s: %s", entry.name, e)
|
||||
|
||||
_prune_orphaned_branches(repo_root)
|
||||
|
||||
|
||||
def _prune_orphaned_branches(repo_root: str) -> None:
|
||||
"""Delete local ``hermes/hermes-*`` and ``pr-*`` branches with no worktree.
|
||||
|
||||
These are auto-generated by ``hermes -w`` sessions and PR review
|
||||
workflows respectively. Once their worktree is gone they serve no
|
||||
purpose and just accumulate.
|
||||
"""
|
||||
import subprocess
|
||||
|
||||
try:
|
||||
result = subprocess.run(
|
||||
["git", "branch", "--format=%(refname:short)"],
|
||||
capture_output=True, text=True, timeout=10, cwd=repo_root,
|
||||
)
|
||||
if result.returncode != 0:
|
||||
return
|
||||
all_branches = [b.strip() for b in result.stdout.strip().split("\n") if b.strip()]
|
||||
except Exception:
|
||||
return
|
||||
|
||||
# Collect branches that are actively checked out in a worktree
|
||||
active_branches: set = set()
|
||||
try:
|
||||
wt_result = subprocess.run(
|
||||
["git", "worktree", "list", "--porcelain"],
|
||||
capture_output=True, text=True, timeout=10, cwd=repo_root,
|
||||
)
|
||||
for line in wt_result.stdout.split("\n"):
|
||||
if line.startswith("branch refs/heads/"):
|
||||
active_branches.add(line.split("branch refs/heads/", 1)[-1].strip())
|
||||
except Exception:
|
||||
return # Can't determine active branches — bail
|
||||
|
||||
# Also protect the currently checked-out branch and main
|
||||
try:
|
||||
head_result = subprocess.run(
|
||||
["git", "branch", "--show-current"],
|
||||
capture_output=True, text=True, timeout=5, cwd=repo_root,
|
||||
)
|
||||
current = head_result.stdout.strip()
|
||||
if current:
|
||||
active_branches.add(current)
|
||||
except Exception:
|
||||
pass
|
||||
active_branches.add("main")
|
||||
|
||||
orphaned = [
|
||||
b for b in all_branches
|
||||
if b not in active_branches
|
||||
and (b.startswith("hermes/hermes-") or b.startswith("pr-"))
|
||||
]
|
||||
|
||||
if not orphaned:
|
||||
return
|
||||
|
||||
# Delete in batches
|
||||
for i in range(0, len(orphaned), 50):
|
||||
batch = orphaned[i:i + 50]
|
||||
try:
|
||||
subprocess.run(
|
||||
["git", "branch", "-D"] + batch,
|
||||
capture_output=True, text=True, timeout=30, cwd=repo_root,
|
||||
)
|
||||
except Exception as e:
|
||||
logger.debug("Failed to prune orphaned branches: %s", e)
|
||||
|
||||
logger.debug("Pruned %d orphaned branches", len(orphaned))
|
||||
|
||||
# ============================================================================
|
||||
# ASCII Art & Branding
|
||||
# ============================================================================
|
||||
@@ -1036,21 +1130,44 @@ COMPACT_BANNER = """
|
||||
|
||||
def _build_compact_banner() -> str:
|
||||
"""Build a compact banner that fits the current terminal width."""
|
||||
w = min(shutil.get_terminal_size().columns - 2, 64)
|
||||
try:
|
||||
from hermes_cli.skin_engine import get_active_skin
|
||||
_skin = get_active_skin()
|
||||
except Exception:
|
||||
_skin = None
|
||||
|
||||
skin_name = getattr(_skin, "name", "default") if _skin else "default"
|
||||
border_color = _skin.get_color("banner_border", "#FFD700") if _skin else "#FFD700"
|
||||
title_color = _skin.get_color("banner_title", "#FFBF00") if _skin else "#FFBF00"
|
||||
dim_color = _skin.get_color("banner_dim", "#B8860B") if _skin else "#B8860B"
|
||||
|
||||
if skin_name == "default":
|
||||
line1 = "⚕ NOUS HERMES - AI Agent Framework"
|
||||
tiny_line = "⚕ NOUS HERMES"
|
||||
else:
|
||||
agent_name = _skin.get_branding("agent_name", "Hermes Agent") if _skin else "Hermes Agent"
|
||||
line1 = f"{agent_name} - AI Agent Framework"
|
||||
tiny_line = agent_name
|
||||
|
||||
version_line = format_banner_version_label()
|
||||
|
||||
w = min(shutil.get_terminal_size().columns - 2, 88)
|
||||
if w < 30:
|
||||
return "\n[#FFBF00]⚕ NOUS HERMES[/] [dim #B8860B]- Nous Research[/]\n"
|
||||
return f"\n[{title_color}]{tiny_line}[/] [dim {dim_color}]- Nous Research[/]\n"
|
||||
|
||||
inner = w - 2 # inside the box border
|
||||
bar = "═" * w
|
||||
line1 = "⚕ NOUS HERMES - AI Agent Framework"
|
||||
line2 = "Messenger of the Digital Gods · Nous Research"
|
||||
content_width = inner - 2
|
||||
|
||||
# Truncate and pad to fit
|
||||
line1 = line1[:inner - 2].ljust(inner - 2)
|
||||
line2 = line2[:inner - 2].ljust(inner - 2)
|
||||
line1 = line1[:content_width].ljust(content_width)
|
||||
line2 = version_line[:content_width].ljust(content_width)
|
||||
|
||||
return (
|
||||
f"\n[bold #FFD700]╔{bar}╗[/]\n"
|
||||
f"[bold #FFD700]║[/] [#FFBF00]{line1}[/] [bold #FFD700]║[/]\n"
|
||||
f"[bold #FFD700]║[/] [dim #B8860B]{line2}[/] [bold #FFD700]║[/]\n"
|
||||
f"[bold #FFD700]╚{bar}╝[/]\n"
|
||||
f"\n[bold {border_color}]╔{bar}╗[/]\n"
|
||||
f"[bold {border_color}]║[/] [{title_color}]{line1}[/] [bold {border_color}]║[/]\n"
|
||||
f"[bold {border_color}]║[/] [dim {dim_color}]{line2}[/] [bold {border_color}]║[/]\n"
|
||||
f"[bold {border_color}]╚{bar}╝[/]\n"
|
||||
)
|
||||
|
||||
|
||||
@@ -2163,7 +2280,7 @@ class HermesCLI:
|
||||
)
|
||||
except Exception as exc:
|
||||
message = format_runtime_provider_error(exc)
|
||||
self.console.print(f"[bold red]{message}[/]")
|
||||
ChatConsole().print(f"[bold red]{message}[/]")
|
||||
return False
|
||||
|
||||
api_key = runtime.get("api_key")
|
||||
@@ -2378,7 +2495,7 @@ class HermesCLI:
|
||||
self._pending_title = None
|
||||
return True
|
||||
except Exception as e:
|
||||
self.console.print(f"[bold red]Failed to initialize agent: {e}[/]")
|
||||
ChatConsole().print(f"[bold red]Failed to initialize agent: {e}[/]")
|
||||
return False
|
||||
|
||||
def show_banner(self):
|
||||
@@ -3291,6 +3408,22 @@ class HermesCLI:
|
||||
flush_tool_summary()
|
||||
print()
|
||||
|
||||
def _notify_session_boundary(self, event_type: str) -> None:
|
||||
"""Fire a session-boundary plugin hook (on_session_finalize or on_session_reset).
|
||||
|
||||
Non-blocking — errors are caught and logged. Safe to call from any
|
||||
lifecycle point (shutdown, /new, /reset).
|
||||
"""
|
||||
try:
|
||||
from hermes_cli.plugins import invoke_hook as _invoke_hook
|
||||
_invoke_hook(
|
||||
event_type,
|
||||
session_id=self.agent.session_id if self.agent else None,
|
||||
platform=getattr(self, "platform", None) or "cli",
|
||||
)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
def new_session(self, silent=False):
|
||||
"""Start a fresh session with a new session ID and cleared agent state."""
|
||||
if self.agent and self.conversation_history:
|
||||
@@ -3298,6 +3431,10 @@ class HermesCLI:
|
||||
self.agent.flush_memories(self.conversation_history)
|
||||
except (Exception, KeyboardInterrupt):
|
||||
pass
|
||||
self._notify_session_boundary("on_session_finalize")
|
||||
elif self.agent:
|
||||
# First session or empty history — still finalize the old session
|
||||
self._notify_session_boundary("on_session_finalize")
|
||||
|
||||
old_session_id = self.session_id
|
||||
if self._session_db and old_session_id:
|
||||
@@ -3342,6 +3479,7 @@ class HermesCLI:
|
||||
)
|
||||
except Exception:
|
||||
pass
|
||||
self._notify_session_boundary("on_session_reset")
|
||||
|
||||
if not silent:
|
||||
print("(^_^)v New session started!")
|
||||
@@ -4530,13 +4668,13 @@ class HermesCLI:
|
||||
if output:
|
||||
self.console.print(_rich_text_from_ansi(output))
|
||||
else:
|
||||
self.console.print("[dim]Command returned no output[/]")
|
||||
ChatConsole().print("[dim]Command returned no output[/]")
|
||||
except subprocess.TimeoutExpired:
|
||||
self.console.print("[bold red]Quick command timed out (30s)[/]")
|
||||
ChatConsole().print("[bold red]Quick command timed out (30s)[/]")
|
||||
except Exception as e:
|
||||
self.console.print(f"[bold red]Quick command error: {e}[/]")
|
||||
ChatConsole().print(f"[bold red]Quick command error: {e}[/]")
|
||||
else:
|
||||
self.console.print(f"[bold red]Quick command '{base_cmd}' has no command defined[/]")
|
||||
ChatConsole().print(f"[bold red]Quick command '{base_cmd}' has no command defined[/]")
|
||||
elif qcmd.get("type") == "alias":
|
||||
target = qcmd.get("target", "").strip()
|
||||
if target:
|
||||
@@ -4545,9 +4683,9 @@ class HermesCLI:
|
||||
aliased_command = f"{target} {user_args}".strip()
|
||||
return self.process_command(aliased_command)
|
||||
else:
|
||||
self.console.print(f"[bold red]Quick command '{base_cmd}' has no target defined[/]")
|
||||
ChatConsole().print(f"[bold red]Quick command '{base_cmd}' has no target defined[/]")
|
||||
else:
|
||||
self.console.print(f"[bold red]Quick command '{base_cmd}' has unsupported type (supported: 'exec', 'alias')[/]")
|
||||
ChatConsole().print(f"[bold red]Quick command '{base_cmd}' has unsupported type (supported: 'exec', 'alias')[/]")
|
||||
# Check for plugin-registered slash commands
|
||||
elif base_cmd.lstrip("/") in _get_plugin_cmd_handler_names():
|
||||
from hermes_cli.plugins import get_plugin_command_handler
|
||||
@@ -4572,7 +4710,7 @@ class HermesCLI:
|
||||
if hasattr(self, '_pending_input'):
|
||||
self._pending_input.put(msg)
|
||||
else:
|
||||
self.console.print(f"[bold red]Failed to load skill for {base_cmd}[/]")
|
||||
ChatConsole().print(f"[bold red]Failed to load skill for {base_cmd}[/]")
|
||||
else:
|
||||
# Prefix matching: if input uniquely identifies one command, execute it.
|
||||
# Matches against both built-in COMMANDS and installed skill commands so
|
||||
@@ -4633,14 +4771,14 @@ class HermesCLI:
|
||||
)
|
||||
|
||||
if not msg:
|
||||
self.console.print("[bold red]Failed to load the bundled /plan skill[/]")
|
||||
ChatConsole().print("[bold red]Failed to load the bundled /plan skill[/]")
|
||||
return
|
||||
|
||||
_cprint(f" 📝 Plan mode queued via skill. Markdown plan target: {plan_path}")
|
||||
if hasattr(self, '_pending_input'):
|
||||
self._pending_input.put(msg)
|
||||
else:
|
||||
self.console.print("[bold red]Plan mode unavailable: input queue not initialized[/]")
|
||||
ChatConsole().print("[bold red]Plan mode unavailable: input queue not initialized[/]")
|
||||
|
||||
def _handle_background_command(self, cmd: str):
|
||||
"""Handle /background <prompt> — run a prompt in a separate background session.
|
||||
|
||||
+7
-1
@@ -574,12 +574,16 @@ def remove_job(job_id: str) -> bool:
|
||||
return False
|
||||
|
||||
|
||||
def mark_job_run(job_id: str, success: bool, error: Optional[str] = None):
|
||||
def mark_job_run(job_id: str, success: bool, error: Optional[str] = None,
|
||||
delivery_error: Optional[str] = None):
|
||||
"""
|
||||
Mark a job as having been run.
|
||||
|
||||
Updates last_run_at, last_status, increments completed count,
|
||||
computes next_run_at, and auto-deletes if repeat limit reached.
|
||||
|
||||
``delivery_error`` is tracked separately from the agent error — a job
|
||||
can succeed (agent produced output) but fail delivery (platform down).
|
||||
"""
|
||||
jobs = load_jobs()
|
||||
for i, job in enumerate(jobs):
|
||||
@@ -588,6 +592,8 @@ def mark_job_run(job_id: str, success: bool, error: Optional[str] = None):
|
||||
job["last_run_at"] = now
|
||||
job["last_status"] = "ok" if success else "error"
|
||||
job["last_error"] = error if not success else None
|
||||
# Track delivery failures separately — cleared on successful delivery
|
||||
job["last_delivery_error"] = delivery_error
|
||||
|
||||
# Increment completed count
|
||||
if job.get("repeat"):
|
||||
|
||||
+32
-25
@@ -196,7 +196,7 @@ def _send_media_via_adapter(adapter, chat_id: str, media_files: list, metadata:
|
||||
logger.warning("Job '%s': failed to send media %s: %s", job.get("id", "?"), media_path, e)
|
||||
|
||||
|
||||
def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> None:
|
||||
def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Optional[str]:
|
||||
"""
|
||||
Deliver job output to the configured target (origin chat, specific platform, etc.).
|
||||
|
||||
@@ -204,16 +204,16 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> None:
|
||||
use the live adapter first — this supports E2EE rooms (e.g. Matrix) where
|
||||
the standalone HTTP path cannot encrypt. Falls back to standalone send if
|
||||
the adapter path fails or is unavailable.
|
||||
|
||||
Returns None on success, or an error string on failure.
|
||||
"""
|
||||
target = _resolve_delivery_target(job)
|
||||
if not target:
|
||||
if job.get("deliver", "local") != "local":
|
||||
logger.warning(
|
||||
"Job '%s' deliver=%s but no concrete delivery target could be resolved",
|
||||
job["id"],
|
||||
job.get("deliver", "local"),
|
||||
)
|
||||
return
|
||||
msg = f"no delivery target resolved for deliver={job.get('deliver', 'local')}"
|
||||
logger.warning("Job '%s': %s", job["id"], msg)
|
||||
return msg
|
||||
return None # local-only jobs don't deliver — not a failure
|
||||
|
||||
platform_name = target["platform"]
|
||||
chat_id = target["chat_id"]
|
||||
@@ -239,19 +239,22 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> None:
|
||||
}
|
||||
platform = platform_map.get(platform_name.lower())
|
||||
if not platform:
|
||||
logger.warning("Job '%s': unknown platform '%s' for delivery", job["id"], platform_name)
|
||||
return
|
||||
msg = f"unknown platform '{platform_name}'"
|
||||
logger.warning("Job '%s': %s", job["id"], msg)
|
||||
return msg
|
||||
|
||||
try:
|
||||
config = load_gateway_config()
|
||||
except Exception as e:
|
||||
logger.error("Job '%s': failed to load gateway config for delivery: %s", job["id"], e)
|
||||
return
|
||||
msg = f"failed to load gateway config: {e}"
|
||||
logger.error("Job '%s': %s", job["id"], msg)
|
||||
return msg
|
||||
|
||||
pconfig = config.platforms.get(platform)
|
||||
if not pconfig or not pconfig.enabled:
|
||||
logger.warning("Job '%s': platform '%s' not configured/enabled", job["id"], platform_name)
|
||||
return
|
||||
msg = f"platform '{platform_name}' not configured/enabled"
|
||||
logger.warning("Job '%s': %s", job["id"], msg)
|
||||
return msg
|
||||
|
||||
# Optionally wrap the content with a header/footer so the user knows this
|
||||
# is a cron delivery. Wrapping is on by default; set cron.wrap_response: false
|
||||
@@ -307,7 +310,7 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> None:
|
||||
|
||||
if adapter_ok:
|
||||
logger.info("Job '%s': delivered to %s:%s via live adapter", job["id"], platform_name, chat_id)
|
||||
return
|
||||
return None
|
||||
except Exception as e:
|
||||
logger.warning(
|
||||
"Job '%s': live adapter delivery to %s:%s failed (%s), falling back to standalone",
|
||||
@@ -329,13 +332,17 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> None:
|
||||
future = pool.submit(asyncio.run, _send_to_platform(platform, pconfig, chat_id, cleaned_delivery_content, thread_id=thread_id, media_files=media_files))
|
||||
result = future.result(timeout=30)
|
||||
except Exception as e:
|
||||
logger.error("Job '%s': delivery to %s:%s failed: %s", job["id"], platform_name, chat_id, e)
|
||||
return
|
||||
msg = f"delivery to {platform_name}:{chat_id} failed: {e}"
|
||||
logger.error("Job '%s': %s", job["id"], msg)
|
||||
return msg
|
||||
|
||||
if result and result.get("error"):
|
||||
logger.error("Job '%s': delivery error: %s", job["id"], result["error"])
|
||||
else:
|
||||
logger.info("Job '%s': delivered to %s:%s", job["id"], platform_name, chat_id)
|
||||
msg = f"delivery error: {result['error']}"
|
||||
logger.error("Job '%s': %s", job["id"], msg)
|
||||
return msg
|
||||
|
||||
logger.info("Job '%s': delivered to %s:%s", job["id"], platform_name, chat_id)
|
||||
return None
|
||||
|
||||
|
||||
_SCRIPT_TIMEOUT = 120 # seconds
|
||||
@@ -578,11 +585,9 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
|
||||
except Exception as e:
|
||||
logger.warning("Job '%s': failed to load config.yaml, using defaults: %s", job_id, e)
|
||||
|
||||
# Reasoning config from env or config.yaml
|
||||
# Reasoning config from config.yaml
|
||||
from hermes_constants import parse_reasoning_effort
|
||||
effort = os.getenv("HERMES_REASONING_EFFORT", "")
|
||||
if not effort:
|
||||
effort = str(_cfg.get("agent", {}).get("reasoning_effort", "")).strip()
|
||||
effort = str(_cfg.get("agent", {}).get("reasoning_effort", "")).strip()
|
||||
reasoning_config = parse_reasoning_effort(effort)
|
||||
|
||||
# Prefill messages from env or config.yaml
|
||||
@@ -868,13 +873,15 @@ def tick(verbose: bool = True, adapters=None, loop=None) -> int:
|
||||
logger.info("Job '%s': agent returned %s — skipping delivery", job["id"], SILENT_MARKER)
|
||||
should_deliver = False
|
||||
|
||||
delivery_error = None
|
||||
if should_deliver:
|
||||
try:
|
||||
_deliver_result(job, deliver_content, adapters=adapters, loop=loop)
|
||||
delivery_error = _deliver_result(job, deliver_content, adapters=adapters, loop=loop)
|
||||
except Exception as de:
|
||||
delivery_error = str(de)
|
||||
logger.error("Delivery failed for job %s: %s", job["id"], de)
|
||||
|
||||
mark_job_run(job["id"], success, error)
|
||||
mark_job_run(job["id"], success, error, delivery_error=delivery_error)
|
||||
executed += 1
|
||||
|
||||
except Exception as e:
|
||||
|
||||
@@ -21,6 +21,8 @@ from dataclasses import dataclass, field
|
||||
from typing import Any, Dict, List, Optional, Set
|
||||
|
||||
from model_tools import handle_function_call
|
||||
from tools.terminal_tool import get_active_env
|
||||
from tools.tool_result_storage import maybe_persist_tool_result, enforce_turn_budget
|
||||
|
||||
# Thread pool for running sync tool calls that internally use asyncio.run()
|
||||
# (e.g., the Modal/Docker/Daytona terminal backends). Running them in a separate
|
||||
@@ -138,6 +140,7 @@ class HermesAgentLoop:
|
||||
temperature: float = 1.0,
|
||||
max_tokens: Optional[int] = None,
|
||||
extra_body: Optional[Dict[str, Any]] = None,
|
||||
budget_config: Optional["BudgetConfig"] = None,
|
||||
):
|
||||
"""
|
||||
Initialize the agent loop.
|
||||
@@ -154,7 +157,11 @@ class HermesAgentLoop:
|
||||
extra_body: Extra parameters passed to the OpenAI client's create() call.
|
||||
Used for OpenRouter provider preferences, transforms, etc.
|
||||
e.g. {"provider": {"ignore": ["DeepInfra"]}}
|
||||
budget_config: Tool result persistence budget. Controls per-tool
|
||||
thresholds, per-turn aggregate budget, and preview size.
|
||||
If None, uses DEFAULT_BUDGET (current hardcoded values).
|
||||
"""
|
||||
from tools.budget_config import DEFAULT_BUDGET
|
||||
self.server = server
|
||||
self.tool_schemas = tool_schemas
|
||||
self.valid_tool_names = valid_tool_names
|
||||
@@ -163,6 +170,7 @@ class HermesAgentLoop:
|
||||
self.temperature = temperature
|
||||
self.max_tokens = max_tokens
|
||||
self.extra_body = extra_body
|
||||
self.budget_config = budget_config or DEFAULT_BUDGET
|
||||
|
||||
async def run(self, messages: List[Dict[str, Any]]) -> AgentResult:
|
||||
"""
|
||||
@@ -446,8 +454,15 @@ class HermesAgentLoop:
|
||||
except (json.JSONDecodeError, TypeError):
|
||||
pass
|
||||
|
||||
# Add tool response to conversation
|
||||
tc_id = tc.get("id", "") if isinstance(tc, dict) else tc.id
|
||||
tool_result = maybe_persist_tool_result(
|
||||
content=tool_result,
|
||||
tool_name=tool_name,
|
||||
tool_use_id=tc_id,
|
||||
env=get_active_env(self.task_id),
|
||||
config=self.budget_config,
|
||||
)
|
||||
|
||||
messages.append(
|
||||
{
|
||||
"role": "tool",
|
||||
@@ -456,6 +471,14 @@ class HermesAgentLoop:
|
||||
}
|
||||
)
|
||||
|
||||
num_tcs = len(assistant_msg.tool_calls)
|
||||
if num_tcs > 0:
|
||||
enforce_turn_budget(
|
||||
messages[-num_tcs:],
|
||||
env=get_active_env(self.task_id),
|
||||
config=self.budget_config,
|
||||
)
|
||||
|
||||
turn_elapsed = _time.monotonic() - turn_start
|
||||
logger.info(
|
||||
"[%s] turn %d: api=%.1fs, %d tools, turn_total=%.1fs",
|
||||
|
||||
@@ -1048,6 +1048,7 @@ class AgenticOPDEnv(HermesAgentBaseEnv):
|
||||
temperature=0.0,
|
||||
max_tokens=self.config.max_token_length,
|
||||
extra_body=self.config.extra_body,
|
||||
budget_config=self.config.build_budget_config(),
|
||||
)
|
||||
result = await agent.run(messages)
|
||||
|
||||
|
||||
@@ -44,7 +44,7 @@ import tempfile
|
||||
import time
|
||||
import uuid
|
||||
from collections import defaultdict
|
||||
from pathlib import Path
|
||||
from pathlib import Path, PurePosixPath, PureWindowsPath
|
||||
from typing import Any, Dict, List, Optional, Tuple, Union
|
||||
|
||||
# Ensure repo root is on sys.path for imports
|
||||
@@ -148,6 +148,62 @@ MODAL_INCOMPATIBLE_TASKS = {
|
||||
# Tar extraction helper
|
||||
# =============================================================================
|
||||
|
||||
def _normalize_tar_member_parts(member_name: str) -> list:
|
||||
"""Return safe path components for a tar member or raise ValueError."""
|
||||
normalized_name = member_name.replace("\\", "/")
|
||||
posix_path = PurePosixPath(normalized_name)
|
||||
windows_path = PureWindowsPath(member_name)
|
||||
|
||||
if (
|
||||
not normalized_name
|
||||
or posix_path.is_absolute()
|
||||
or windows_path.is_absolute()
|
||||
or windows_path.drive
|
||||
):
|
||||
raise ValueError(f"Unsafe archive member path: {member_name}")
|
||||
|
||||
parts = [part for part in posix_path.parts if part not in ("", ".")]
|
||||
if not parts or any(part == ".." for part in parts):
|
||||
raise ValueError(f"Unsafe archive member path: {member_name}")
|
||||
return parts
|
||||
|
||||
|
||||
def _safe_extract_tar(tar: tarfile.TarFile, target_dir: Path) -> None:
|
||||
"""Extract a tar archive without allowing traversal or link entries."""
|
||||
target_dir.mkdir(parents=True, exist_ok=True)
|
||||
target_root = target_dir.resolve()
|
||||
|
||||
for member in tar.getmembers():
|
||||
parts = _normalize_tar_member_parts(member.name)
|
||||
target = target_dir.joinpath(*parts)
|
||||
target_real = target.resolve(strict=False)
|
||||
|
||||
try:
|
||||
target_real.relative_to(target_root)
|
||||
except ValueError as exc:
|
||||
raise ValueError(f"Unsafe archive member path: {member.name}") from exc
|
||||
|
||||
if member.isdir():
|
||||
target_real.mkdir(parents=True, exist_ok=True)
|
||||
continue
|
||||
|
||||
if not member.isfile():
|
||||
raise ValueError(f"Unsupported archive member type: {member.name}")
|
||||
|
||||
target_real.parent.mkdir(parents=True, exist_ok=True)
|
||||
extracted = tar.extractfile(member)
|
||||
if extracted is None:
|
||||
raise ValueError(f"Cannot read archive member: {member.name}")
|
||||
|
||||
with extracted, open(target_real, "wb") as dst:
|
||||
shutil.copyfileobj(extracted, dst)
|
||||
|
||||
try:
|
||||
os.chmod(target_real, member.mode & 0o777)
|
||||
except OSError:
|
||||
pass
|
||||
|
||||
|
||||
def _extract_base64_tar(b64_data: str, target_dir: Path):
|
||||
"""Extract a base64-encoded tar.gz archive into target_dir."""
|
||||
if not b64_data:
|
||||
@@ -155,7 +211,7 @@ def _extract_base64_tar(b64_data: str, target_dir: Path):
|
||||
raw = base64.b64decode(b64_data)
|
||||
buf = io.BytesIO(raw)
|
||||
with tarfile.open(fileobj=buf, mode="r:gz") as tar:
|
||||
tar.extractall(path=str(target_dir))
|
||||
_safe_extract_tar(tar, target_dir)
|
||||
|
||||
|
||||
# =============================================================================
|
||||
@@ -485,6 +541,7 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
|
||||
temperature=self.config.agent_temperature,
|
||||
max_tokens=self.config.max_token_length,
|
||||
extra_body=self.config.extra_body,
|
||||
budget_config=self.config.build_budget_config(),
|
||||
)
|
||||
result = await agent.run(messages)
|
||||
else:
|
||||
@@ -497,6 +554,7 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
|
||||
temperature=self.config.agent_temperature,
|
||||
max_tokens=self.config.max_token_length,
|
||||
extra_body=self.config.extra_body,
|
||||
budget_config=self.config.build_budget_config(),
|
||||
)
|
||||
result = await agent.run(messages)
|
||||
|
||||
|
||||
@@ -549,6 +549,7 @@ class YCBenchEvalEnv(HermesAgentBaseEnv):
|
||||
temperature=self.config.agent_temperature,
|
||||
max_tokens=self.config.max_token_length,
|
||||
extra_body=self.config.extra_body,
|
||||
budget_config=self.config.build_budget_config(),
|
||||
)
|
||||
result = await agent.run(messages)
|
||||
|
||||
|
||||
@@ -62,6 +62,11 @@ from atroposlib.type_definitions import Item
|
||||
|
||||
from environments.agent_loop import AgentResult, HermesAgentLoop
|
||||
from environments.tool_context import ToolContext
|
||||
from tools.budget_config import (
|
||||
DEFAULT_RESULT_SIZE_CHARS,
|
||||
DEFAULT_TURN_BUDGET_CHARS,
|
||||
DEFAULT_PREVIEW_SIZE_CHARS,
|
||||
)
|
||||
|
||||
# Import hermes-agent toolset infrastructure
|
||||
from model_tools import get_tool_definitions
|
||||
@@ -160,6 +165,32 @@ class HermesAgentEnvConfig(BaseEnvConfig):
|
||||
"Options: hermes, mistral, llama3_json, qwen, deepseek_v3, etc.",
|
||||
)
|
||||
|
||||
# --- Tool result budget ---
|
||||
# Defaults imported from tools.budget_config (single source of truth).
|
||||
default_result_size_chars: int = Field(
|
||||
default=DEFAULT_RESULT_SIZE_CHARS,
|
||||
description="Default per-tool threshold (chars) for persisting large results "
|
||||
"to sandbox. Results exceeding this are written to /tmp/hermes-results/ "
|
||||
"and replaced with a preview. Per-tool registry values take precedence "
|
||||
"unless overridden via tool_result_overrides.",
|
||||
)
|
||||
turn_budget_chars: int = Field(
|
||||
default=DEFAULT_TURN_BUDGET_CHARS,
|
||||
description="Aggregate char budget per assistant turn. If all tool results "
|
||||
"in a single turn exceed this, the largest are persisted to disk first.",
|
||||
)
|
||||
preview_size_chars: int = Field(
|
||||
default=DEFAULT_PREVIEW_SIZE_CHARS,
|
||||
description="Size of the inline preview shown after a tool result is persisted.",
|
||||
)
|
||||
tool_result_overrides: Optional[Dict[str, int]] = Field(
|
||||
default=None,
|
||||
description="Per-tool threshold overrides (chars). Keys are tool names, "
|
||||
"values are char thresholds. Overrides both the default and registry "
|
||||
"per-tool values. Example: {'terminal': 10000, 'search_files': 5000}. "
|
||||
"Note: read_file is pinned to infinity and cannot be overridden.",
|
||||
)
|
||||
|
||||
# --- Provider-specific parameters ---
|
||||
# Passed as extra_body to the OpenAI client's chat.completions.create() call.
|
||||
# Useful for OpenRouter provider preferences, transforms, route settings, etc.
|
||||
@@ -176,6 +207,16 @@ class HermesAgentEnvConfig(BaseEnvConfig):
|
||||
"transforms, and other provider-specific settings.",
|
||||
)
|
||||
|
||||
def build_budget_config(self):
|
||||
"""Build a BudgetConfig from env config fields."""
|
||||
from tools.budget_config import BudgetConfig
|
||||
return BudgetConfig(
|
||||
default_result_size=self.default_result_size_chars,
|
||||
turn_budget=self.turn_budget_chars,
|
||||
preview_size=self.preview_size_chars,
|
||||
tool_overrides=dict(self.tool_result_overrides) if self.tool_result_overrides else {},
|
||||
)
|
||||
|
||||
|
||||
class HermesAgentBaseEnv(BaseEnv):
|
||||
"""
|
||||
@@ -490,6 +531,7 @@ class HermesAgentBaseEnv(BaseEnv):
|
||||
temperature=self.config.agent_temperature,
|
||||
max_tokens=self.config.max_token_length,
|
||||
extra_body=self.config.extra_body,
|
||||
budget_config=self.config.build_budget_config(),
|
||||
)
|
||||
result = await agent.run(messages)
|
||||
except NotImplementedError:
|
||||
@@ -507,6 +549,7 @@ class HermesAgentBaseEnv(BaseEnv):
|
||||
temperature=self.config.agent_temperature,
|
||||
max_tokens=self.config.max_token_length,
|
||||
extra_body=self.config.extra_body,
|
||||
budget_config=self.config.build_budget_config(),
|
||||
)
|
||||
result = await agent.run(messages)
|
||||
else:
|
||||
@@ -520,6 +563,7 @@ class HermesAgentBaseEnv(BaseEnv):
|
||||
temperature=self.config.agent_temperature,
|
||||
max_tokens=self.config.max_token_length,
|
||||
extra_body=self.config.extra_body,
|
||||
budget_config=self.config.build_budget_config(),
|
||||
)
|
||||
result = await agent.run(messages)
|
||||
|
||||
|
||||
@@ -472,6 +472,7 @@ class WebResearchEnv(HermesAgentBaseEnv):
|
||||
temperature=0.0, # Deterministic for eval
|
||||
max_tokens=self.config.max_token_length,
|
||||
extra_body=self.config.extra_body,
|
||||
budget_config=self.config.build_budget_config(),
|
||||
)
|
||||
result = await agent.run(messages)
|
||||
|
||||
|
||||
@@ -556,6 +556,18 @@ def load_gateway_config() -> GatewayConfig:
|
||||
os.environ["DISCORD_AUTO_THREAD"] = str(discord_cfg["auto_thread"]).lower()
|
||||
if "reactions" in discord_cfg and not os.getenv("DISCORD_REACTIONS"):
|
||||
os.environ["DISCORD_REACTIONS"] = str(discord_cfg["reactions"]).lower()
|
||||
# ignored_channels: channels where bot never responds (even when mentioned)
|
||||
ic = discord_cfg.get("ignored_channels")
|
||||
if ic is not None and not os.getenv("DISCORD_IGNORED_CHANNELS"):
|
||||
if isinstance(ic, list):
|
||||
ic = ",".join(str(v) for v in ic)
|
||||
os.environ["DISCORD_IGNORED_CHANNELS"] = str(ic)
|
||||
# no_thread_channels: channels where bot responds directly without creating thread
|
||||
ntc = discord_cfg.get("no_thread_channels")
|
||||
if ntc is not None and not os.getenv("DISCORD_NO_THREAD_CHANNELS"):
|
||||
if isinstance(ntc, list):
|
||||
ntc = ",".join(str(v) for v in ntc)
|
||||
os.environ["DISCORD_NO_THREAD_CHANNELS"] = str(ntc)
|
||||
|
||||
# Telegram settings → env vars (env vars take precedence)
|
||||
telegram_cfg = yaml_cfg.get("telegram", {})
|
||||
@@ -570,6 +582,8 @@ def load_gateway_config() -> GatewayConfig:
|
||||
if isinstance(frc, list):
|
||||
frc = ",".join(str(v) for v in frc)
|
||||
os.environ["TELEGRAM_FREE_RESPONSE_CHATS"] = str(frc)
|
||||
if "reactions" in telegram_cfg and not os.getenv("TELEGRAM_REACTIONS"):
|
||||
os.environ["TELEGRAM_REACTIONS"] = str(telegram_cfg["reactions"]).lower()
|
||||
|
||||
whatsapp_cfg = yaml_cfg.get("whatsapp", {})
|
||||
if isinstance(whatsapp_cfg, dict):
|
||||
|
||||
@@ -20,6 +20,7 @@ Requires:
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import hmac
|
||||
import json
|
||||
import logging
|
||||
import os
|
||||
@@ -370,7 +371,7 @@ class APIServerAdapter(BasePlatformAdapter):
|
||||
auth_header = request.headers.get("Authorization", "")
|
||||
if auth_header.startswith("Bearer "):
|
||||
token = auth_header[7:].strip()
|
||||
if token == self._api_key:
|
||||
if hmac.compare_digest(token, self._api_key):
|
||||
return None # Auth OK
|
||||
|
||||
return web.json_response(
|
||||
@@ -563,8 +564,10 @@ class APIServerAdapter(BasePlatformAdapter):
|
||||
if delta is not None:
|
||||
_stream_q.put(delta)
|
||||
|
||||
def _on_tool_progress(name, preview, args):
|
||||
def _on_tool_progress(event_type, name, preview, args, **kwargs):
|
||||
"""Inject tool progress into the SSE stream for Open WebUI."""
|
||||
if event_type != "tool.started":
|
||||
return # Only show tool start events in chat stream
|
||||
if name.startswith("_"):
|
||||
return # Skip internal events (_thinking)
|
||||
from agent.display import get_tool_emoji
|
||||
@@ -815,9 +818,29 @@ class APIServerAdapter(BasePlatformAdapter):
|
||||
else:
|
||||
return web.json_response(_openai_error("'input' must be a string or array"), status=400)
|
||||
|
||||
# Reconstruct conversation history from previous_response_id
|
||||
# Accept explicit conversation_history from the request body.
|
||||
# This lets stateless clients supply their own history instead of
|
||||
# relying on server-side response chaining via previous_response_id.
|
||||
# Precedence: explicit conversation_history > previous_response_id.
|
||||
conversation_history: List[Dict[str, str]] = []
|
||||
if previous_response_id:
|
||||
raw_history = body.get("conversation_history")
|
||||
if raw_history:
|
||||
if not isinstance(raw_history, list):
|
||||
return web.json_response(
|
||||
_openai_error("'conversation_history' must be an array of message objects"),
|
||||
status=400,
|
||||
)
|
||||
for i, entry in enumerate(raw_history):
|
||||
if not isinstance(entry, dict) or "role" not in entry or "content" not in entry:
|
||||
return web.json_response(
|
||||
_openai_error(f"conversation_history[{i}] must have 'role' and 'content' fields"),
|
||||
status=400,
|
||||
)
|
||||
conversation_history.append({"role": str(entry["role"]), "content": str(entry["content"])})
|
||||
if previous_response_id:
|
||||
logger.debug("Both conversation_history and previous_response_id provided; using conversation_history")
|
||||
|
||||
if not conversation_history and previous_response_id:
|
||||
stored = self._response_store.get(previous_response_id)
|
||||
if stored is None:
|
||||
return web.json_response(_openai_error(f"Previous response not found: {previous_response_id}"), status=404)
|
||||
@@ -1403,14 +1426,49 @@ class APIServerAdapter(BasePlatformAdapter):
|
||||
|
||||
instructions = body.get("instructions")
|
||||
previous_response_id = body.get("previous_response_id")
|
||||
|
||||
# Accept explicit conversation_history from the request body.
|
||||
# Precedence: explicit conversation_history > previous_response_id.
|
||||
conversation_history: List[Dict[str, str]] = []
|
||||
if previous_response_id:
|
||||
raw_history = body.get("conversation_history")
|
||||
if raw_history:
|
||||
if not isinstance(raw_history, list):
|
||||
return web.json_response(
|
||||
_openai_error("'conversation_history' must be an array of message objects"),
|
||||
status=400,
|
||||
)
|
||||
for i, entry in enumerate(raw_history):
|
||||
if not isinstance(entry, dict) or "role" not in entry or "content" not in entry:
|
||||
return web.json_response(
|
||||
_openai_error(f"conversation_history[{i}] must have 'role' and 'content' fields"),
|
||||
status=400,
|
||||
)
|
||||
conversation_history.append({"role": str(entry["role"]), "content": str(entry["content"])})
|
||||
if previous_response_id:
|
||||
logger.debug("Both conversation_history and previous_response_id provided; using conversation_history")
|
||||
|
||||
if not conversation_history and previous_response_id:
|
||||
stored = self._response_store.get(previous_response_id)
|
||||
if stored:
|
||||
conversation_history = list(stored.get("conversation_history", []))
|
||||
if instructions is None:
|
||||
instructions = stored.get("instructions")
|
||||
|
||||
# When input is a multi-message array, extract all but the last
|
||||
# message as conversation history (the last becomes user_message).
|
||||
# Only fires when no explicit history was provided.
|
||||
if not conversation_history and isinstance(raw_input, list) and len(raw_input) > 1:
|
||||
for msg in raw_input[:-1]:
|
||||
if isinstance(msg, dict) and msg.get("role") and msg.get("content"):
|
||||
content = msg["content"]
|
||||
if isinstance(content, list):
|
||||
# Flatten multi-part content blocks to text
|
||||
content = " ".join(
|
||||
part.get("text", "") for part in content
|
||||
if isinstance(part, dict) and part.get("type") == "text"
|
||||
)
|
||||
conversation_history.append({"role": msg["role"], "content": str(content)})
|
||||
|
||||
session_id = body.get("session_id") or run_id
|
||||
ephemeral_system_prompt = instructions
|
||||
|
||||
|
||||
@@ -124,7 +124,14 @@ async def cache_image_from_url(url: str, ext: str = ".jpg", retries: int = 2) ->
|
||||
|
||||
Returns:
|
||||
Absolute path to the cached image file as a string.
|
||||
|
||||
Raises:
|
||||
ValueError: If the URL targets a private/internal network (SSRF protection).
|
||||
"""
|
||||
from tools.url_safety import is_safe_url
|
||||
if not is_safe_url(url):
|
||||
raise ValueError(f"Blocked unsafe URL (SSRF protection): {_safe_url_for_log(url)}")
|
||||
|
||||
import asyncio
|
||||
import httpx
|
||||
import logging as _logging
|
||||
@@ -232,7 +239,14 @@ async def cache_audio_from_url(url: str, ext: str = ".ogg", retries: int = 2) ->
|
||||
|
||||
Returns:
|
||||
Absolute path to the cached audio file as a string.
|
||||
|
||||
Raises:
|
||||
ValueError: If the URL targets a private/internal network (SSRF protection).
|
||||
"""
|
||||
from tools.url_safety import is_safe_url
|
||||
if not is_safe_url(url):
|
||||
raise ValueError(f"Blocked unsafe URL (SSRF protection): {_safe_url_for_log(url)}")
|
||||
|
||||
import asyncio
|
||||
import httpx
|
||||
import logging as _logging
|
||||
@@ -1105,6 +1119,22 @@ class BasePlatformAdapter(ABC):
|
||||
logger.error("[%s] Fallback send also failed: %s", self.name, fallback_result.error)
|
||||
return fallback_result
|
||||
|
||||
@staticmethod
|
||||
def _merge_caption(existing_text: Optional[str], new_text: str) -> str:
|
||||
"""Merge a new caption into existing text, avoiding duplicates.
|
||||
|
||||
Uses line-by-line exact match (not substring) to prevent false positives
|
||||
where a shorter caption is silently dropped because it appears as a
|
||||
substring of a longer one (e.g. "Meeting" inside "Meeting agenda").
|
||||
Whitespace is normalised for comparison.
|
||||
"""
|
||||
if not existing_text:
|
||||
return new_text
|
||||
existing_captions = [c.strip() for c in existing_text.split("\n\n")]
|
||||
if new_text.strip() not in existing_captions:
|
||||
return f"{existing_text}\n\n{new_text}".strip()
|
||||
return existing_text
|
||||
|
||||
async def handle_message(self, event: MessageEvent) -> None:
|
||||
"""
|
||||
Process an incoming message.
|
||||
@@ -1164,10 +1194,7 @@ class BasePlatformAdapter(ABC):
|
||||
existing.media_urls.extend(event.media_urls)
|
||||
existing.media_types.extend(event.media_types)
|
||||
if event.text:
|
||||
if not existing.text:
|
||||
existing.text = event.text
|
||||
elif event.text not in existing.text:
|
||||
existing.text = f"{existing.text}\n\n{event.text}".strip()
|
||||
existing.text = self._merge_caption(existing.text, event.text)
|
||||
else:
|
||||
self._pending_messages[session_key] = event
|
||||
return # Don't interrupt now - will run after current task completes
|
||||
|
||||
@@ -55,6 +55,7 @@ from gateway.platforms.base import (
|
||||
cache_document_from_bytes,
|
||||
SUPPORTED_DOCUMENT_TYPES,
|
||||
)
|
||||
from tools.url_safety import is_safe_url
|
||||
|
||||
|
||||
def _clean_discord_id(entry: str) -> str:
|
||||
@@ -1285,6 +1286,10 @@ class DiscordAdapter(BasePlatformAdapter):
|
||||
if not self._client:
|
||||
return SendResult(success=False, error="Not connected")
|
||||
|
||||
if not is_safe_url(image_url):
|
||||
logger.warning("[%s] Blocked unsafe image URL during Discord send_image", self.name)
|
||||
return await super().send_image(chat_id, image_url, caption, reply_to, metadata=metadata)
|
||||
|
||||
try:
|
||||
import aiohttp
|
||||
|
||||
@@ -2188,9 +2193,11 @@ class DiscordAdapter(BasePlatformAdapter):
|
||||
# UNLESS the channel is in the free-response list or the message is
|
||||
# in a thread where the bot has already participated.
|
||||
#
|
||||
# Config (all settable via discord.* in config.yaml):
|
||||
# Config (all settable via discord.* in config.yaml or DISCORD_* env vars):
|
||||
# discord.require_mention: Require @mention in server channels (default: true)
|
||||
# discord.free_response_channels: Channel IDs where bot responds without mention
|
||||
# discord.ignored_channels: Channel IDs where bot NEVER responds (even when mentioned)
|
||||
# discord.no_thread_channels: Channel IDs where bot responds directly without creating thread
|
||||
# discord.auto_thread: Auto-create thread on @mention in channels (default: true)
|
||||
|
||||
thread_id = None
|
||||
@@ -2201,9 +2208,18 @@ class DiscordAdapter(BasePlatformAdapter):
|
||||
parent_channel_id = self._get_parent_channel_id(message.channel)
|
||||
|
||||
if not isinstance(message.channel, discord.DMChannel):
|
||||
# Check ignored channels first - never respond even when mentioned
|
||||
ignored_channels_raw = os.getenv("DISCORD_IGNORED_CHANNELS", "")
|
||||
ignored_channels = {ch.strip() for ch in ignored_channels_raw.split(",") if ch.strip()}
|
||||
channel_ids = {str(message.channel.id)}
|
||||
if parent_channel_id:
|
||||
channel_ids.add(parent_channel_id)
|
||||
if channel_ids & ignored_channels:
|
||||
logger.debug("[%s] Ignoring message in ignored channel: %s", self.name, channel_ids)
|
||||
return
|
||||
|
||||
free_channels_raw = os.getenv("DISCORD_FREE_RESPONSE_CHANNELS", "")
|
||||
free_channels = {ch.strip() for ch in free_channels_raw.split(",") if ch.strip()}
|
||||
channel_ids = {str(message.channel.id)}
|
||||
if parent_channel_id:
|
||||
channel_ids.add(parent_channel_id)
|
||||
|
||||
@@ -2225,10 +2241,14 @@ class DiscordAdapter(BasePlatformAdapter):
|
||||
# Auto-thread: when enabled, automatically create a thread for every
|
||||
# @mention in a text channel so each conversation is isolated (like Slack).
|
||||
# Messages already inside threads or DMs are unaffected.
|
||||
# no_thread_channels: channels where bot responds directly without thread.
|
||||
auto_threaded_channel = None
|
||||
if not is_thread and not isinstance(message.channel, discord.DMChannel):
|
||||
no_thread_channels_raw = os.getenv("DISCORD_NO_THREAD_CHANNELS", "")
|
||||
no_thread_channels = {ch.strip() for ch in no_thread_channels_raw.split(",") if ch.strip()}
|
||||
skip_thread = bool(channel_ids & no_thread_channels)
|
||||
auto_thread = os.getenv("DISCORD_AUTO_THREAD", "true").lower() in ("true", "1", "yes")
|
||||
if auto_thread:
|
||||
if auto_thread and not skip_thread:
|
||||
thread = await self._auto_create_thread(message)
|
||||
if thread:
|
||||
is_thread = True
|
||||
|
||||
+153
-4
@@ -20,6 +20,7 @@ from __future__ import annotations
|
||||
import asyncio
|
||||
import hashlib
|
||||
import hmac
|
||||
import itertools
|
||||
import json
|
||||
import logging
|
||||
import mimetypes
|
||||
@@ -1052,6 +1053,9 @@ class FeishuAdapter(BasePlatformAdapter):
|
||||
self._media_batch_state = FeishuBatchState()
|
||||
self._pending_media_batches = self._media_batch_state.events
|
||||
self._pending_media_batch_tasks = self._media_batch_state.tasks
|
||||
# Exec approval button state (approval_id → {session_key, message_id, chat_id})
|
||||
self._approval_state: Dict[int, Dict[str, str]] = {}
|
||||
self._approval_counter = itertools.count(1)
|
||||
self._load_seen_message_ids()
|
||||
|
||||
@staticmethod
|
||||
@@ -1394,6 +1398,104 @@ class FeishuAdapter(BasePlatformAdapter):
|
||||
logger.error("[Feishu] Failed to edit message %s: %s", message_id, exc, exc_info=True)
|
||||
return SendResult(success=False, error=str(exc))
|
||||
|
||||
async def send_exec_approval(
|
||||
self, chat_id: str, command: str, session_key: str,
|
||||
description: str = "dangerous command",
|
||||
metadata: Optional[Dict[str, Any]] = None,
|
||||
) -> SendResult:
|
||||
"""Send an interactive card with approval buttons.
|
||||
|
||||
The buttons carry ``hermes_action`` in their value dict so that
|
||||
``_handle_card_action_event`` can intercept them and call
|
||||
``resolve_gateway_approval()`` to unblock the waiting agent thread.
|
||||
"""
|
||||
if not self._client:
|
||||
return SendResult(success=False, error="Not connected")
|
||||
|
||||
try:
|
||||
approval_id = next(self._approval_counter)
|
||||
cmd_preview = command[:3000] + "..." if len(command) > 3000 else command
|
||||
|
||||
def _btn(label: str, action_name: str, btn_type: str = "default") -> dict:
|
||||
return {
|
||||
"tag": "button",
|
||||
"text": {"tag": "plain_text", "content": label},
|
||||
"type": btn_type,
|
||||
"value": {"hermes_action": action_name, "approval_id": approval_id},
|
||||
}
|
||||
|
||||
card = {
|
||||
"config": {"wide_screen_mode": True},
|
||||
"header": {
|
||||
"title": {"content": "⚠️ Command Approval Required", "tag": "plain_text"},
|
||||
"template": "orange",
|
||||
},
|
||||
"elements": [
|
||||
{
|
||||
"tag": "markdown",
|
||||
"content": f"```\n{cmd_preview}\n```\n**Reason:** {description}",
|
||||
},
|
||||
{
|
||||
"tag": "action",
|
||||
"actions": [
|
||||
_btn("✅ Allow Once", "approve_once", "primary"),
|
||||
_btn("✅ Session", "approve_session"),
|
||||
_btn("✅ Always", "approve_always"),
|
||||
_btn("❌ Deny", "deny", "danger"),
|
||||
],
|
||||
},
|
||||
],
|
||||
}
|
||||
|
||||
payload = json.dumps(card, ensure_ascii=False)
|
||||
response = await self._feishu_send_with_retry(
|
||||
chat_id=chat_id,
|
||||
msg_type="interactive",
|
||||
payload=payload,
|
||||
reply_to=None,
|
||||
metadata=metadata,
|
||||
)
|
||||
|
||||
result = self._finalize_send_result(response, "send_exec_approval failed")
|
||||
if result.success:
|
||||
self._approval_state[approval_id] = {
|
||||
"session_key": session_key,
|
||||
"message_id": result.message_id or "",
|
||||
"chat_id": chat_id,
|
||||
}
|
||||
return result
|
||||
except Exception as exc:
|
||||
logger.warning("[Feishu] send_exec_approval failed: %s", exc)
|
||||
return SendResult(success=False, error=str(exc))
|
||||
|
||||
async def _update_approval_card(
|
||||
self, message_id: str, label: str, user_name: str, choice: str,
|
||||
) -> None:
|
||||
"""Replace the approval card with a resolved status card."""
|
||||
if not self._client or not message_id:
|
||||
return
|
||||
icon = "❌" if choice == "deny" else "✅"
|
||||
card = {
|
||||
"config": {"wide_screen_mode": True},
|
||||
"header": {
|
||||
"title": {"content": f"{icon} {label}", "tag": "plain_text"},
|
||||
"template": "red" if choice == "deny" else "green",
|
||||
},
|
||||
"elements": [
|
||||
{
|
||||
"tag": "markdown",
|
||||
"content": f"{icon} **{label}** by {user_name}",
|
||||
},
|
||||
],
|
||||
}
|
||||
try:
|
||||
payload = json.dumps(card, ensure_ascii=False)
|
||||
body = self._build_update_message_body(msg_type="interactive", content=payload)
|
||||
request = self._build_update_message_request(message_id=message_id, request_body=body)
|
||||
await asyncio.to_thread(self._client.im.v1.message.update, request)
|
||||
except Exception as exc:
|
||||
logger.warning("[Feishu] Failed to update approval card %s: %s", message_id, exc)
|
||||
|
||||
async def send_voice(
|
||||
self,
|
||||
chat_id: str,
|
||||
@@ -1820,6 +1922,52 @@ class FeishuAdapter(BasePlatformAdapter):
|
||||
action = getattr(event, "action", None)
|
||||
action_tag = str(getattr(action, "tag", "") or "button")
|
||||
action_value = getattr(action, "value", {}) or {}
|
||||
|
||||
# --- Exec approval button intercept ---
|
||||
hermes_action = action_value.get("hermes_action") if isinstance(action_value, dict) else None
|
||||
if hermes_action:
|
||||
approval_id = action_value.get("approval_id")
|
||||
state = self._approval_state.pop(approval_id, None)
|
||||
if not state:
|
||||
logger.debug("[Feishu] Approval %s already resolved or unknown", approval_id)
|
||||
return
|
||||
|
||||
choice_map = {
|
||||
"approve_once": "once",
|
||||
"approve_session": "session",
|
||||
"approve_always": "always",
|
||||
"deny": "deny",
|
||||
}
|
||||
choice = choice_map.get(hermes_action, "deny")
|
||||
|
||||
label_map = {
|
||||
"once": "Approved once",
|
||||
"session": "Approved for session",
|
||||
"always": "Approved permanently",
|
||||
"deny": "Denied",
|
||||
}
|
||||
label = label_map.get(choice, "Resolved")
|
||||
|
||||
# Resolve sender name for the status card
|
||||
sender_id = SimpleNamespace(open_id=open_id, user_id=None, union_id=None)
|
||||
sender_profile = await self._resolve_sender_profile(sender_id)
|
||||
user_name = sender_profile.get("user_name") or open_id
|
||||
|
||||
# Resolve the approval — unblocks the agent thread
|
||||
try:
|
||||
from tools.approval import resolve_gateway_approval
|
||||
count = resolve_gateway_approval(state["session_key"], choice)
|
||||
logger.info(
|
||||
"Feishu button resolved %d approval(s) for session %s (choice=%s, user=%s)",
|
||||
count, state["session_key"], choice, user_name,
|
||||
)
|
||||
except Exception as exc:
|
||||
logger.error("Failed to resolve gateway approval from Feishu button: %s", exc)
|
||||
|
||||
# Update the card to show the decision
|
||||
await self._update_approval_card(state.get("message_id", ""), label, user_name, choice)
|
||||
return
|
||||
|
||||
synthetic_text = f"/card {action_tag}"
|
||||
if action_value:
|
||||
try:
|
||||
@@ -2065,10 +2213,7 @@ class FeishuAdapter(BasePlatformAdapter):
|
||||
existing.media_urls.extend(event.media_urls)
|
||||
existing.media_types.extend(event.media_types)
|
||||
if event.text:
|
||||
if not existing.text:
|
||||
existing.text = event.text
|
||||
elif event.text not in existing.text.split("\n\n"):
|
||||
existing.text = f"{existing.text}\n\n{event.text}"
|
||||
existing.text = self._merge_caption(existing.text, event.text)
|
||||
existing.timestamp = event.timestamp
|
||||
if event.message_id:
|
||||
existing.message_id = event.message_id
|
||||
@@ -2112,6 +2257,10 @@ class FeishuAdapter(BasePlatformAdapter):
|
||||
default_ext: str,
|
||||
preferred_name: str,
|
||||
) -> tuple[str, str]:
|
||||
from tools.url_safety import is_safe_url
|
||||
if not is_safe_url(file_url):
|
||||
raise ValueError(f"Blocked unsafe URL (SSRF protection): {file_url[:80]}")
|
||||
|
||||
import httpx
|
||||
|
||||
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
|
||||
|
||||
@@ -586,6 +586,11 @@ class MatrixAdapter(BasePlatformAdapter):
|
||||
metadata: Optional[Dict[str, Any]] = None,
|
||||
) -> SendResult:
|
||||
"""Download an image URL and upload it to Matrix."""
|
||||
from tools.url_safety import is_safe_url
|
||||
if not is_safe_url(image_url):
|
||||
logger.warning("Matrix: blocked unsafe image URL (SSRF protection)")
|
||||
return await super().send_image(chat_id, image_url, caption, reply_to, metadata=metadata)
|
||||
|
||||
try:
|
||||
# Try aiohttp first (always available), fall back to httpx
|
||||
try:
|
||||
|
||||
@@ -407,6 +407,11 @@ class MattermostAdapter(BasePlatformAdapter):
|
||||
kind: str = "file",
|
||||
) -> SendResult:
|
||||
"""Download a URL and upload it as a file attachment."""
|
||||
from tools.url_safety import is_safe_url
|
||||
if not is_safe_url(url):
|
||||
logger.warning("Mattermost: blocked unsafe URL (SSRF protection)")
|
||||
return await self.send(chat_id, f"{caption or ''}\n{url}".strip(), reply_to)
|
||||
|
||||
import asyncio
|
||||
import aiohttp
|
||||
|
||||
|
||||
@@ -595,6 +595,11 @@ class SlackAdapter(BasePlatformAdapter):
|
||||
if not self._app:
|
||||
return SendResult(success=False, error="Not connected")
|
||||
|
||||
from tools.url_safety import is_safe_url
|
||||
if not is_safe_url(image_url):
|
||||
logger.warning("[Slack] Blocked unsafe image URL (SSRF protection)")
|
||||
return await super().send_image(chat_id, image_url, caption, reply_to, metadata=metadata)
|
||||
|
||||
try:
|
||||
import httpx
|
||||
|
||||
|
||||
@@ -1632,7 +1632,12 @@ class TelegramAdapter(BasePlatformAdapter):
|
||||
"""
|
||||
if not self._bot:
|
||||
return SendResult(success=False, error="Not connected")
|
||||
|
||||
|
||||
from tools.url_safety import is_safe_url
|
||||
if not is_safe_url(image_url):
|
||||
logger.warning("[%s] Blocked unsafe image URL (SSRF protection)", self.name)
|
||||
return await super().send_image(chat_id, image_url, caption, reply_to, metadata=metadata)
|
||||
|
||||
try:
|
||||
# Telegram can send photos directly from URLs (up to ~5MB)
|
||||
_photo_thread = metadata.get("thread_id") if metadata else None
|
||||
@@ -2222,10 +2227,7 @@ class TelegramAdapter(BasePlatformAdapter):
|
||||
existing.media_urls.extend(event.media_urls)
|
||||
existing.media_types.extend(event.media_types)
|
||||
if event.text:
|
||||
if not existing.text:
|
||||
existing.text = event.text
|
||||
elif event.text not in existing.text:
|
||||
existing.text = f"{existing.text}\n\n{event.text}".strip()
|
||||
existing.text = self._merge_caption(existing.text, event.text)
|
||||
|
||||
prior_task = self._pending_photo_batch_tasks.get(batch_key)
|
||||
if prior_task and not prior_task.done():
|
||||
@@ -2415,11 +2417,7 @@ class TelegramAdapter(BasePlatformAdapter):
|
||||
existing.media_urls.extend(event.media_urls)
|
||||
existing.media_types.extend(event.media_types)
|
||||
if event.text:
|
||||
if existing.text:
|
||||
if event.text not in existing.text.split("\n\n"):
|
||||
existing.text = f"{existing.text}\n\n{event.text}"
|
||||
else:
|
||||
existing.text = event.text
|
||||
existing.text = self._merge_caption(existing.text, event.text)
|
||||
|
||||
prior_task = self._media_group_tasks.get(media_group_id)
|
||||
if prior_task:
|
||||
@@ -2675,3 +2673,46 @@ class TelegramAdapter(BasePlatformAdapter):
|
||||
auto_skill=topic_skill,
|
||||
timestamp=message.date,
|
||||
)
|
||||
|
||||
# ── Message reactions (processing lifecycle) ──────────────────────────
|
||||
|
||||
def _reactions_enabled(self) -> bool:
|
||||
"""Check if message reactions are enabled via config/env."""
|
||||
return os.getenv("TELEGRAM_REACTIONS", "false").lower() not in ("false", "0", "no")
|
||||
|
||||
async def _set_reaction(self, chat_id: str, message_id: str, emoji: str) -> bool:
|
||||
"""Set a single emoji reaction on a Telegram message."""
|
||||
if not self._bot:
|
||||
return False
|
||||
try:
|
||||
await self._bot.set_message_reaction(
|
||||
chat_id=int(chat_id),
|
||||
message_id=int(message_id),
|
||||
reaction=emoji,
|
||||
)
|
||||
return True
|
||||
except Exception as e:
|
||||
logger.debug("[%s] set_message_reaction failed (%s): %s", self.name, emoji, e)
|
||||
return False
|
||||
|
||||
async def on_processing_start(self, event: MessageEvent) -> None:
|
||||
"""Add an in-progress reaction when message processing begins."""
|
||||
if not self._reactions_enabled():
|
||||
return
|
||||
chat_id = getattr(event.source, "chat_id", None)
|
||||
message_id = getattr(event, "message_id", None)
|
||||
if chat_id and message_id:
|
||||
await self._set_reaction(chat_id, message_id, "\U0001f440")
|
||||
|
||||
async def on_processing_complete(self, event: MessageEvent, success: bool) -> None:
|
||||
"""Swap the in-progress reaction for a final success/failure reaction.
|
||||
|
||||
Unlike Discord (additive reactions), Telegram's set_message_reaction
|
||||
replaces all existing reactions in one call — no remove step needed.
|
||||
"""
|
||||
if not self._reactions_enabled():
|
||||
return
|
||||
chat_id = getattr(event.source, "chat_id", None)
|
||||
message_id = getattr(event, "message_id", None)
|
||||
if chat_id and message_id:
|
||||
await self._set_reaction(chat_id, message_id, "\u2705" if success else "\u274c")
|
||||
|
||||
@@ -76,8 +76,17 @@ class WebhookAdapter(BasePlatformAdapter):
|
||||
self._routes: Dict[str, dict] = dict(self._static_routes)
|
||||
self._runner = None
|
||||
|
||||
# Delivery info keyed by session chat_id — consumed by send()
|
||||
# Delivery info keyed by session chat_id.
|
||||
#
|
||||
# Read by every send() invocation for the chat_id (status messages
|
||||
# AND the final response). Cleaned up via TTL on each POST so the
|
||||
# dict stays bounded — see _prune_delivery_info(). Do NOT pop on
|
||||
# send(), or interim status messages (e.g. fallback notifications,
|
||||
# context-pressure warnings) will consume the entry before the
|
||||
# final response arrives, causing the response to silently fall
|
||||
# back to the "log" deliver type.
|
||||
self._delivery_info: Dict[str, dict] = {}
|
||||
self._delivery_info_created: Dict[str, float] = {}
|
||||
|
||||
# Reference to gateway runner for cross-platform delivery (set externally)
|
||||
self.gateway_runner = None
|
||||
@@ -160,10 +169,14 @@ class WebhookAdapter(BasePlatformAdapter):
|
||||
) -> SendResult:
|
||||
"""Deliver the agent's response to the configured destination.
|
||||
|
||||
chat_id is ``webhook:{route}:{delivery_id}`` — we pop the delivery
|
||||
info stored during webhook receipt so it doesn't leak memory.
|
||||
chat_id is ``webhook:{route}:{delivery_id}``. The delivery info
|
||||
stored during webhook receipt is read with ``.get()`` (not popped)
|
||||
so that interim status messages emitted before the final response
|
||||
— fallback-model notifications, context-pressure warnings, etc. —
|
||||
do not consume the entry and silently downgrade the final response
|
||||
to the ``log`` deliver type. TTL cleanup happens on POST.
|
||||
"""
|
||||
delivery = self._delivery_info.pop(chat_id, {})
|
||||
delivery = self._delivery_info.get(chat_id, {})
|
||||
deliver_type = delivery.get("deliver", "log")
|
||||
|
||||
if deliver_type == "log":
|
||||
@@ -190,6 +203,23 @@ class WebhookAdapter(BasePlatformAdapter):
|
||||
success=False, error=f"Unknown deliver type: {deliver_type}"
|
||||
)
|
||||
|
||||
def _prune_delivery_info(self, now: float) -> None:
|
||||
"""Drop delivery_info entries older than the idempotency TTL.
|
||||
|
||||
Mirrors the cleanup pattern used for ``_seen_deliveries``. Called
|
||||
on each POST so the dict size is bounded by ``rate_limit * TTL``
|
||||
even if many webhooks fire and never receive a final response.
|
||||
"""
|
||||
cutoff = now - self._idempotency_ttl
|
||||
stale = [
|
||||
k
|
||||
for k, t in self._delivery_info_created.items()
|
||||
if t < cutoff
|
||||
]
|
||||
for k in stale:
|
||||
self._delivery_info.pop(k, None)
|
||||
self._delivery_info_created.pop(k, None)
|
||||
|
||||
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
|
||||
return {"name": chat_id, "type": "webhook"}
|
||||
|
||||
@@ -382,7 +412,9 @@ class WebhookAdapter(BasePlatformAdapter):
|
||||
# same route get independent agent runs (not queued/interrupted).
|
||||
session_chat_id = f"webhook:{route_name}:{delivery_id}"
|
||||
|
||||
# Store delivery info for send() — consumed (popped) on delivery
|
||||
# Store delivery info for send(). Read by every send() invocation
|
||||
# for this chat_id (interim status messages and the final response),
|
||||
# so we do NOT pop on send. TTL-based cleanup keeps the dict bounded.
|
||||
deliver_config = {
|
||||
"deliver": route_config.get("deliver", "log"),
|
||||
"deliver_extra": self._render_delivery_extra(
|
||||
@@ -391,6 +423,8 @@ class WebhookAdapter(BasePlatformAdapter):
|
||||
"payload": payload,
|
||||
}
|
||||
self._delivery_info[session_chat_id] = deliver_config
|
||||
self._delivery_info_created[session_chat_id] = now
|
||||
self._prune_delivery_info(now)
|
||||
|
||||
# Build source and event
|
||||
source = self.build_source(
|
||||
|
||||
@@ -910,6 +910,10 @@ class WeComAdapter(BasePlatformAdapter):
|
||||
url: str,
|
||||
max_bytes: int,
|
||||
) -> Tuple[bytes, Dict[str, str]]:
|
||||
from tools.url_safety import is_safe_url
|
||||
if not is_safe_url(url):
|
||||
raise ValueError(f"Blocked unsafe URL (SSRF protection): {url[:80]}")
|
||||
|
||||
if not HTTPX_AVAILABLE:
|
||||
raise RuntimeError("httpx is required for WeCom media download")
|
||||
|
||||
|
||||
+65
-24
@@ -921,12 +921,11 @@ class GatewayRunner:
|
||||
|
||||
@staticmethod
|
||||
def _load_reasoning_config() -> dict | None:
|
||||
"""Load reasoning effort from config with env fallback.
|
||||
"""Load reasoning effort from config.yaml.
|
||||
|
||||
Checks agent.reasoning_effort in config.yaml first, then
|
||||
HERMES_REASONING_EFFORT as a fallback. Valid: "xhigh", "high",
|
||||
"medium", "low", "minimal", "none". Returns None to use default
|
||||
(medium).
|
||||
Reads agent.reasoning_effort from config.yaml. Valid: "xhigh",
|
||||
"high", "medium", "low", "minimal", "none". Returns None to use
|
||||
default (medium).
|
||||
"""
|
||||
from hermes_constants import parse_reasoning_effort
|
||||
effort = ""
|
||||
@@ -939,8 +938,6 @@ class GatewayRunner:
|
||||
effort = str(cfg.get("agent", {}).get("reasoning_effort", "") or "").strip()
|
||||
except Exception:
|
||||
pass
|
||||
if not effort:
|
||||
effort = os.getenv("HERMES_REASONING_EFFORT", "")
|
||||
result = parse_reasoning_effort(effort)
|
||||
if effort and effort.strip() and result is None:
|
||||
logger.warning("Unknown reasoning_effort '%s', using default (medium)", effort)
|
||||
@@ -1484,6 +1481,14 @@ class GatewayRunner:
|
||||
logger.debug("Interrupted running agent for session %s during shutdown", session_key[:20])
|
||||
except Exception as e:
|
||||
logger.debug("Failed interrupting agent during shutdown: %s", e)
|
||||
# Fire plugin on_session_finalize hook before memory shutdown
|
||||
try:
|
||||
from hermes_cli.plugins import invoke_hook as _invoke_hook
|
||||
_invoke_hook("on_session_finalize",
|
||||
session_id=getattr(agent, 'session_id', None),
|
||||
platform="gateway")
|
||||
except Exception:
|
||||
pass
|
||||
# Shut down memory provider at actual session boundary
|
||||
try:
|
||||
if hasattr(agent, 'shutdown_memory_provider'):
|
||||
@@ -1987,10 +1992,7 @@ class GatewayRunner:
|
||||
existing.media_urls.extend(event.media_urls)
|
||||
existing.media_types.extend(event.media_types)
|
||||
if event.text:
|
||||
if not existing.text:
|
||||
existing.text = event.text
|
||||
elif event.text not in existing.text:
|
||||
existing.text = f"{existing.text}\n\n{event.text}".strip()
|
||||
existing.text = BasePlatformAdapter._merge_caption(existing.text, event.text)
|
||||
else:
|
||||
adapter._pending_messages[_quick_key] = event
|
||||
else:
|
||||
@@ -3280,6 +3282,15 @@ class GatewayRunner:
|
||||
# the configured default instead of the previously switched model.
|
||||
self._session_model_overrides.pop(session_key, None)
|
||||
|
||||
# Fire plugin on_session_finalize hook (session boundary)
|
||||
try:
|
||||
from hermes_cli.plugins import invoke_hook as _invoke_hook
|
||||
_old_sid = old_entry.session_id if old_entry else None
|
||||
_invoke_hook("on_session_finalize", session_id=_old_sid,
|
||||
platform=source.platform.value if source.platform else "")
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Emit session:end hook (session is ending)
|
||||
await self.hooks.emit("session:end", {
|
||||
"platform": source.platform.value if source.platform else "",
|
||||
@@ -3293,7 +3304,7 @@ class GatewayRunner:
|
||||
"user_id": source.user_id,
|
||||
"session_key": session_key,
|
||||
})
|
||||
|
||||
|
||||
# Resolve session config info to surface to the user
|
||||
try:
|
||||
session_info = self._format_session_info()
|
||||
@@ -3304,9 +3315,18 @@ class GatewayRunner:
|
||||
header = "✨ Session reset! Starting fresh."
|
||||
else:
|
||||
# No existing session, just create one
|
||||
self.session_store.get_or_create_session(source, force_new=True)
|
||||
new_entry = self.session_store.get_or_create_session(source, force_new=True)
|
||||
header = "✨ New session started!"
|
||||
|
||||
# Fire plugin on_session_reset hook (new session guaranteed to exist)
|
||||
try:
|
||||
from hermes_cli.plugins import invoke_hook as _invoke_hook
|
||||
_new_sid = new_entry.session_id if new_entry else None
|
||||
_invoke_hook("on_session_reset", session_id=_new_sid,
|
||||
platform=source.platform.value if source.platform else "")
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
if session_info:
|
||||
return f"{header}\n\n{session_info}"
|
||||
return header
|
||||
@@ -3345,25 +3365,36 @@ class GatewayRunner:
|
||||
"""Handle /status command."""
|
||||
source = event.source
|
||||
session_entry = self.session_store.get_or_create_session(source)
|
||||
|
||||
|
||||
connected_platforms = [p.value for p in self.adapters.keys()]
|
||||
|
||||
|
||||
# Check if there's an active agent
|
||||
session_key = session_entry.session_key
|
||||
is_running = session_key in self._running_agents
|
||||
|
||||
|
||||
title = None
|
||||
if self._session_db:
|
||||
try:
|
||||
title = self._session_db.get_session_title(session_entry.session_id)
|
||||
except Exception:
|
||||
title = None
|
||||
|
||||
lines = [
|
||||
"📊 **Hermes Gateway Status**",
|
||||
"",
|
||||
f"**Session ID:** `{session_entry.session_id[:12]}...`",
|
||||
f"**Session ID:** `{session_entry.session_id}`",
|
||||
]
|
||||
if title:
|
||||
lines.append(f"**Title:** {title}")
|
||||
lines.extend([
|
||||
f"**Created:** {session_entry.created_at.strftime('%Y-%m-%d %H:%M')}",
|
||||
f"**Last Activity:** {session_entry.updated_at.strftime('%Y-%m-%d %H:%M')}",
|
||||
f"**Tokens:** {session_entry.total_tokens:,}",
|
||||
f"**Agent Running:** {'Yes ⚡' if is_running else 'No'}",
|
||||
"",
|
||||
f"**Connected Platforms:** {', '.join(connected_platforms)}",
|
||||
]
|
||||
|
||||
])
|
||||
|
||||
return "\n".join(lines)
|
||||
|
||||
async def _handle_stop_command(self, event: MessageEvent) -> str:
|
||||
@@ -4913,8 +4944,8 @@ class GatewayRunner:
|
||||
cycle = ["off", "new", "all", "verbose"]
|
||||
descriptions = {
|
||||
"off": "⚙️ Tool progress: **OFF** — no tool activity shown.",
|
||||
"new": "⚙️ Tool progress: **NEW** — shown when tool changes (short previews).",
|
||||
"all": "⚙️ Tool progress: **ALL** — every tool call shown (short previews).",
|
||||
"new": "⚙️ Tool progress: **NEW** — shown when tool changes (preview length: `display.tool_preview_length`, default 40).",
|
||||
"all": "⚙️ Tool progress: **ALL** — every tool call shown (preview length: `display.tool_preview_length`, default 40).",
|
||||
"verbose": "⚙️ Tool progress: **VERBOSE** — every tool call with full arguments.",
|
||||
}
|
||||
|
||||
@@ -6036,6 +6067,11 @@ class GatewayRunner:
|
||||
|
||||
if enriched_parts:
|
||||
prefix = "\n\n".join(enriched_parts)
|
||||
# Strip the empty-content placeholder from the Discord adapter
|
||||
# when we successfully transcribed the audio — it's redundant.
|
||||
_placeholder = "(The user sent a message with no text content)"
|
||||
if user_text and user_text.strip() == _placeholder:
|
||||
return prefix
|
||||
if user_text:
|
||||
return f"{prefix}\n\n{user_text}"
|
||||
return prefix
|
||||
@@ -6327,10 +6363,15 @@ class GatewayRunner:
|
||||
progress_queue.put(msg)
|
||||
return
|
||||
|
||||
# "all" / "new" modes: short preview, always truncated (40 chars)
|
||||
# "all" / "new" modes: short preview, respects tool_preview_length
|
||||
# config (defaults to 40 chars when unset to keep gateway messages
|
||||
# compact — unlike CLI spinners, these persist as permanent messages).
|
||||
if preview:
|
||||
if len(preview) > 40:
|
||||
preview = preview[:37] + "..."
|
||||
from agent.display import get_tool_preview_max_len
|
||||
_pl = get_tool_preview_max_len()
|
||||
_cap = _pl if _pl > 0 else 40
|
||||
if len(preview) > _cap:
|
||||
preview = preview[:_cap - 3] + "..."
|
||||
msg = f"{emoji} {tool_name}: \"{preview}\""
|
||||
else:
|
||||
msg = f"{emoji} {tool_name}..."
|
||||
|
||||
+108
-7
@@ -74,6 +74,8 @@ class GatewayStreamConsumer:
|
||||
self._edit_supported = True # Disabled on first edit failure (Signal/Email/HA)
|
||||
self._last_edit_time = 0.0
|
||||
self._last_sent_text = "" # Track last-sent text to skip redundant edits
|
||||
self._fallback_final_send = False
|
||||
self._fallback_prefix = ""
|
||||
|
||||
@property
|
||||
def already_sent(self) -> bool:
|
||||
@@ -138,12 +140,19 @@ class GatewayStreamConsumer:
|
||||
while (
|
||||
len(self._accumulated) > _safe_limit
|
||||
and self._message_id is not None
|
||||
and self._edit_supported
|
||||
):
|
||||
split_at = self._accumulated.rfind("\n", 0, _safe_limit)
|
||||
if split_at < _safe_limit // 2:
|
||||
split_at = _safe_limit
|
||||
chunk = self._accumulated[:split_at]
|
||||
await self._send_or_edit(chunk)
|
||||
if self._fallback_final_send:
|
||||
# Edit failed while attempting to split an oversized
|
||||
# message. Keep the full accumulated text intact so
|
||||
# the fallback final-send path can deliver the
|
||||
# remaining continuation without dropping content.
|
||||
break
|
||||
self._accumulated = self._accumulated[split_at:].lstrip("\n")
|
||||
self._message_id = None
|
||||
self._last_sent_text = ""
|
||||
@@ -156,9 +165,17 @@ class GatewayStreamConsumer:
|
||||
self._last_edit_time = time.monotonic()
|
||||
|
||||
if got_done:
|
||||
# Final edit without cursor
|
||||
if self._accumulated and self._message_id:
|
||||
await self._send_or_edit(self._accumulated)
|
||||
# Final edit without cursor. If progressive editing failed
|
||||
# mid-stream, send a single continuation/fallback message
|
||||
# here instead of letting the base gateway path send the
|
||||
# full response again.
|
||||
if self._accumulated:
|
||||
if self._fallback_final_send:
|
||||
await self._send_fallback_final(self._accumulated)
|
||||
elif self._message_id:
|
||||
await self._send_or_edit(self._accumulated)
|
||||
elif not self._already_sent:
|
||||
await self._send_or_edit(self._accumulated)
|
||||
return
|
||||
|
||||
# Tool boundary: the should_edit block above already flushed
|
||||
@@ -169,6 +186,8 @@ class GatewayStreamConsumer:
|
||||
self._message_id = None
|
||||
self._accumulated = ""
|
||||
self._last_sent_text = ""
|
||||
self._fallback_final_send = False
|
||||
self._fallback_prefix = ""
|
||||
|
||||
await asyncio.sleep(0.05) # Small yield to not busy-loop
|
||||
|
||||
@@ -207,6 +226,86 @@ class GatewayStreamConsumer:
|
||||
# Strip trailing whitespace/newlines but preserve leading content
|
||||
return cleaned.rstrip()
|
||||
|
||||
def _visible_prefix(self) -> str:
|
||||
"""Return the visible text already shown in the streamed message."""
|
||||
prefix = self._last_sent_text or ""
|
||||
if self.cfg.cursor and prefix.endswith(self.cfg.cursor):
|
||||
prefix = prefix[:-len(self.cfg.cursor)]
|
||||
return self._clean_for_display(prefix)
|
||||
|
||||
def _continuation_text(self, final_text: str) -> str:
|
||||
"""Return only the part of final_text the user has not already seen."""
|
||||
prefix = self._fallback_prefix or self._visible_prefix()
|
||||
if prefix and final_text.startswith(prefix):
|
||||
return final_text[len(prefix):].lstrip()
|
||||
return final_text
|
||||
|
||||
@staticmethod
|
||||
def _split_text_chunks(text: str, limit: int) -> list[str]:
|
||||
"""Split text into reasonably sized chunks for fallback sends."""
|
||||
if len(text) <= limit:
|
||||
return [text]
|
||||
chunks: list[str] = []
|
||||
remaining = text
|
||||
while len(remaining) > limit:
|
||||
split_at = remaining.rfind("\n", 0, limit)
|
||||
if split_at < limit // 2:
|
||||
split_at = limit
|
||||
chunks.append(remaining[:split_at])
|
||||
remaining = remaining[split_at:].lstrip("\n")
|
||||
if remaining:
|
||||
chunks.append(remaining)
|
||||
return chunks
|
||||
|
||||
async def _send_fallback_final(self, text: str) -> None:
|
||||
"""Send the final continuation after streaming edits stop working."""
|
||||
final_text = self._clean_for_display(text)
|
||||
continuation = self._continuation_text(final_text)
|
||||
self._fallback_final_send = False
|
||||
if not continuation.strip():
|
||||
# Nothing new to send — the visible partial already matches final text.
|
||||
self._already_sent = True
|
||||
return
|
||||
|
||||
raw_limit = getattr(self.adapter, "MAX_MESSAGE_LENGTH", 4096)
|
||||
safe_limit = max(500, raw_limit - 100)
|
||||
chunks = self._split_text_chunks(continuation, safe_limit)
|
||||
|
||||
last_message_id: Optional[str] = None
|
||||
last_successful_chunk = ""
|
||||
sent_any_chunk = False
|
||||
for chunk in chunks:
|
||||
result = await self.adapter.send(
|
||||
chat_id=self.chat_id,
|
||||
content=chunk,
|
||||
metadata=self.metadata,
|
||||
)
|
||||
if not result.success:
|
||||
if sent_any_chunk:
|
||||
# Some continuation text already reached the user. Suppress
|
||||
# the base gateway final-send path so we don't resend the
|
||||
# full response and create another duplicate.
|
||||
self._already_sent = True
|
||||
self._message_id = last_message_id
|
||||
self._last_sent_text = last_successful_chunk
|
||||
self._fallback_prefix = ""
|
||||
return
|
||||
# No fallback chunk reached the user — allow the normal gateway
|
||||
# final-send path to try one more time.
|
||||
self._already_sent = False
|
||||
self._message_id = None
|
||||
self._last_sent_text = ""
|
||||
self._fallback_prefix = ""
|
||||
return
|
||||
sent_any_chunk = True
|
||||
last_successful_chunk = chunk
|
||||
last_message_id = result.message_id or last_message_id
|
||||
|
||||
self._message_id = last_message_id
|
||||
self._already_sent = True
|
||||
self._last_sent_text = chunks[-1]
|
||||
self._fallback_prefix = ""
|
||||
|
||||
async def _send_or_edit(self, text: str) -> None:
|
||||
"""Send or edit the streaming message."""
|
||||
# Strip MEDIA: directives so they don't appear as visible text.
|
||||
@@ -232,14 +331,16 @@ class GatewayStreamConsumer:
|
||||
self._last_sent_text = text
|
||||
else:
|
||||
# If an edit fails mid-stream (especially Telegram flood control),
|
||||
# stop progressive edits and let the normal final send path deliver
|
||||
# the complete answer instead of leaving the user with a partial.
|
||||
# stop progressive edits and send only the missing tail once the
|
||||
# final response is available.
|
||||
logger.debug("Edit failed, disabling streaming for this adapter")
|
||||
self._fallback_prefix = self._visible_prefix()
|
||||
self._fallback_final_send = True
|
||||
self._edit_supported = False
|
||||
self._already_sent = False
|
||||
self._already_sent = True
|
||||
else:
|
||||
# Editing not supported — skip intermediate updates.
|
||||
# The final response will be sent by the normal path.
|
||||
# The final response will be sent by the fallback path.
|
||||
pass
|
||||
else:
|
||||
# First message — send new
|
||||
|
||||
@@ -11,5 +11,5 @@ Provides subcommands for:
|
||||
- hermes cron - Manage cron jobs
|
||||
"""
|
||||
|
||||
__version__ = "0.7.0"
|
||||
__release_date__ = "2026.4.3"
|
||||
__version__ = "0.8.0"
|
||||
__release_date__ = "2026.4.8"
|
||||
|
||||
+4
-15
@@ -37,7 +37,7 @@ from typing import Any, Dict, List, Optional
|
||||
import httpx
|
||||
import yaml
|
||||
|
||||
from hermes_cli.config import get_hermes_home, get_config_path
|
||||
from hermes_cli.config import get_hermes_home, get_config_path, read_raw_config
|
||||
from hermes_constants import OPENROUTER_BASE_URL
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
@@ -2214,14 +2214,7 @@ def _update_config_for_provider(
|
||||
config_path = get_config_path()
|
||||
config_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
config: Dict[str, Any] = {}
|
||||
if config_path.exists():
|
||||
try:
|
||||
loaded = yaml.safe_load(config_path.read_text()) or {}
|
||||
if isinstance(loaded, dict):
|
||||
config = loaded
|
||||
except Exception:
|
||||
config = {}
|
||||
config = read_raw_config()
|
||||
|
||||
current_model = config.get("model")
|
||||
if isinstance(current_model, dict):
|
||||
@@ -2258,12 +2251,8 @@ def _reset_config_provider() -> Path:
|
||||
if not config_path.exists():
|
||||
return config_path
|
||||
|
||||
try:
|
||||
config = yaml.safe_load(config_path.read_text()) or {}
|
||||
except Exception:
|
||||
return config_path
|
||||
|
||||
if not isinstance(config, dict):
|
||||
config = read_raw_config()
|
||||
if not config:
|
||||
return config_path
|
||||
|
||||
model = config.get("model")
|
||||
|
||||
+75
-1
@@ -5,6 +5,7 @@ Pure display functions with no HermesCLI state dependency.
|
||||
|
||||
import json
|
||||
import logging
|
||||
import os
|
||||
import shutil
|
||||
import subprocess
|
||||
import threading
|
||||
@@ -189,6 +190,79 @@ def check_for_updates() -> Optional[int]:
|
||||
return behind
|
||||
|
||||
|
||||
def _resolve_repo_dir() -> Optional[Path]:
|
||||
"""Return the active Hermes git checkout, or None if this isn't a git install."""
|
||||
hermes_home = get_hermes_home()
|
||||
repo_dir = hermes_home / "hermes-agent"
|
||||
if not (repo_dir / ".git").exists():
|
||||
repo_dir = Path(__file__).parent.parent.resolve()
|
||||
return repo_dir if (repo_dir / ".git").exists() else None
|
||||
|
||||
|
||||
def _git_short_hash(repo_dir: Path, rev: str) -> Optional[str]:
|
||||
"""Resolve a git revision to an 8-character short hash."""
|
||||
try:
|
||||
result = subprocess.run(
|
||||
["git", "rev-parse", "--short=8", rev],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
timeout=5,
|
||||
cwd=str(repo_dir),
|
||||
)
|
||||
except Exception:
|
||||
return None
|
||||
if result.returncode != 0:
|
||||
return None
|
||||
value = (result.stdout or "").strip()
|
||||
return value or None
|
||||
|
||||
|
||||
def get_git_banner_state(repo_dir: Optional[Path] = None) -> Optional[dict]:
|
||||
"""Return upstream/local git hashes for the startup banner."""
|
||||
repo_dir = repo_dir or _resolve_repo_dir()
|
||||
if repo_dir is None:
|
||||
return None
|
||||
|
||||
upstream = _git_short_hash(repo_dir, "origin/main")
|
||||
local = _git_short_hash(repo_dir, "HEAD")
|
||||
if not upstream or not local:
|
||||
return None
|
||||
|
||||
ahead = 0
|
||||
try:
|
||||
result = subprocess.run(
|
||||
["git", "rev-list", "--count", "origin/main..HEAD"],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
timeout=5,
|
||||
cwd=str(repo_dir),
|
||||
)
|
||||
if result.returncode == 0:
|
||||
ahead = int((result.stdout or "0").strip() or "0")
|
||||
except Exception:
|
||||
ahead = 0
|
||||
|
||||
return {"upstream": upstream, "local": local, "ahead": max(ahead, 0)}
|
||||
|
||||
|
||||
def format_banner_version_label() -> str:
|
||||
"""Return the version label shown in the startup banner title."""
|
||||
base = f"Hermes Agent v{VERSION} ({RELEASE_DATE})"
|
||||
state = get_git_banner_state()
|
||||
if not state:
|
||||
return base
|
||||
|
||||
upstream = state["upstream"]
|
||||
local = state["local"]
|
||||
ahead = int(state.get("ahead") or 0)
|
||||
|
||||
if ahead <= 0 or upstream == local:
|
||||
return f"{base} · upstream {upstream}"
|
||||
|
||||
carried_word = "commit" if ahead == 1 else "commits"
|
||||
return f"{base} · upstream {upstream} · local {local} (+{ahead} carried {carried_word})"
|
||||
|
||||
|
||||
# =========================================================================
|
||||
# Non-blocking update check
|
||||
# =========================================================================
|
||||
@@ -448,7 +522,7 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
|
||||
border_color = _skin_color("banner_border", "#CD7F32")
|
||||
outer_panel = Panel(
|
||||
layout_table,
|
||||
title=f"[bold {title_color}]{agent_name} v{VERSION} ({RELEASE_DATE})[/]",
|
||||
title=f"[bold {title_color}]{format_banner_version_label()}[/]",
|
||||
border_style=border_color,
|
||||
padding=(0, 2),
|
||||
)
|
||||
|
||||
@@ -293,14 +293,8 @@ def _resolve_config_gates() -> set[str]:
|
||||
if not gated:
|
||||
return set()
|
||||
try:
|
||||
import yaml
|
||||
from hermes_constants import get_hermes_home
|
||||
config_path = str(get_hermes_home() / "config.yaml")
|
||||
if os.path.exists(config_path):
|
||||
with open(config_path, encoding="utf-8") as f:
|
||||
cfg = yaml.safe_load(f) or {}
|
||||
else:
|
||||
cfg = {}
|
||||
from hermes_cli.config import read_raw_config
|
||||
cfg = read_raw_config()
|
||||
except Exception:
|
||||
return set()
|
||||
result: set[str] = set()
|
||||
|
||||
@@ -416,6 +416,7 @@ DEFAULT_CONFIG = {
|
||||
"provider": "local", # "local" (free, faster-whisper) | "groq" | "openai" (Whisper API)
|
||||
"local": {
|
||||
"model": "base", # tiny, base, small, medium, large-v3
|
||||
"language": "", # auto-detect by default; set to "en", "es", "fr", etc. to force
|
||||
},
|
||||
"openai": {
|
||||
"model": "whisper-1", # whisper-1, gpt-4o-mini-transcribe, gpt-4o-transcribe
|
||||
|
||||
@@ -93,6 +93,21 @@ def cron_list(show_all: bool = False):
|
||||
script = job.get("script")
|
||||
if script:
|
||||
print(f" Script: {script}")
|
||||
|
||||
# Execution history
|
||||
last_status = job.get("last_status")
|
||||
if last_status:
|
||||
last_run = job.get("last_run_at", "?")
|
||||
if last_status == "ok":
|
||||
status_display = color("ok", Colors.GREEN)
|
||||
else:
|
||||
status_display = color(f"{last_status}: {job.get('last_error', '?')}", Colors.RED)
|
||||
print(f" Last run: {last_run} {status_display}")
|
||||
|
||||
delivery_err = job.get("last_delivery_error")
|
||||
if delivery_err:
|
||||
print(f" {color('⚠ Delivery failed:', Colors.YELLOW)} {delivery_err}")
|
||||
|
||||
print()
|
||||
|
||||
from hermes_cli.gateway import find_gateway_pids
|
||||
|
||||
+51
-9
@@ -267,6 +267,34 @@ def _profile_suffix() -> str:
|
||||
return hashlib.sha256(str(home).encode()).hexdigest()[:8]
|
||||
|
||||
|
||||
def _profile_arg(hermes_home: str | None = None) -> str:
|
||||
"""Return ``--profile <name>`` only when HERMES_HOME is a named profile.
|
||||
|
||||
For ``~/.hermes/profiles/<name>``, returns ``"--profile <name>"``.
|
||||
For the default profile or hash-based custom paths, returns the empty string.
|
||||
|
||||
Args:
|
||||
hermes_home: Optional explicit HERMES_HOME path. Defaults to the current
|
||||
``get_hermes_home()`` value. Should be passed when generating a
|
||||
service definition for a different user (e.g. system service).
|
||||
"""
|
||||
import re
|
||||
from pathlib import Path as _Path
|
||||
home = Path(hermes_home or str(get_hermes_home())).resolve()
|
||||
default = (_Path.home() / ".hermes").resolve()
|
||||
if home == default:
|
||||
return ""
|
||||
profiles_root = (default / "profiles").resolve()
|
||||
try:
|
||||
rel = home.relative_to(profiles_root)
|
||||
parts = rel.parts
|
||||
if len(parts) == 1 and re.match(r"^[a-z0-9][a-z0-9_-]{0,63}$", parts[0]):
|
||||
return f"--profile {parts[0]}"
|
||||
except ValueError:
|
||||
pass
|
||||
return ""
|
||||
|
||||
|
||||
def get_service_name() -> str:
|
||||
"""Derive a systemd service name scoped to this HERMES_HOME.
|
||||
|
||||
@@ -626,6 +654,7 @@ def generate_systemd_unit(system: bool = False, run_as_user: str | None = None)
|
||||
if system:
|
||||
username, group_name, home_dir = _system_service_identity(run_as_user)
|
||||
hermes_home = _hermes_home_for_target_user(home_dir)
|
||||
profile_arg = _profile_arg(hermes_home)
|
||||
path_entries.extend(_build_user_local_paths(Path(home_dir), path_entries))
|
||||
path_entries.extend(common_bin_paths)
|
||||
sane_path = ":".join(path_entries)
|
||||
@@ -640,7 +669,7 @@ StartLimitBurst=5
|
||||
Type=simple
|
||||
User={username}
|
||||
Group={group_name}
|
||||
ExecStart={python_path} -m hermes_cli.main gateway run --replace
|
||||
ExecStart={python_path} -m hermes_cli.main{f" {profile_arg}" if profile_arg else ""} gateway run --replace
|
||||
WorkingDirectory={working_dir}
|
||||
Environment="HOME={home_dir}"
|
||||
Environment="USER={username}"
|
||||
@@ -661,6 +690,7 @@ WantedBy=multi-user.target
|
||||
"""
|
||||
|
||||
hermes_home = str(get_hermes_home().resolve())
|
||||
profile_arg = _profile_arg(hermes_home)
|
||||
path_entries.extend(_build_user_local_paths(Path.home(), path_entries))
|
||||
path_entries.extend(common_bin_paths)
|
||||
sane_path = ":".join(path_entries)
|
||||
@@ -672,7 +702,7 @@ StartLimitBurst=5
|
||||
|
||||
[Service]
|
||||
Type=simple
|
||||
ExecStart={python_path} -m hermes_cli.main gateway run --replace
|
||||
ExecStart={python_path} -m hermes_cli.main{f" {profile_arg}" if profile_arg else ""} gateway run --replace
|
||||
WorkingDirectory={working_dir}
|
||||
Environment="PATH={sane_path}"
|
||||
Environment="VIRTUAL_ENV={venv_dir}"
|
||||
@@ -965,6 +995,7 @@ def generate_launchd_plist() -> str:
|
||||
log_dir = get_hermes_home() / "logs"
|
||||
log_dir.mkdir(parents=True, exist_ok=True)
|
||||
label = get_launchd_label()
|
||||
profile_arg = _profile_arg(hermes_home)
|
||||
# Build a sane PATH for the launchd plist. launchd provides only a
|
||||
# minimal default (/usr/bin:/bin:/usr/sbin:/sbin) which misses Homebrew,
|
||||
# nvm, cargo, etc. We prepend venv/bin and node_modules/.bin (matching
|
||||
@@ -986,21 +1017,32 @@ def generate_launchd_plist() -> str:
|
||||
dict.fromkeys(priority_dirs + [p for p in os.environ.get("PATH", "").split(":") if p])
|
||||
)
|
||||
|
||||
# Build ProgramArguments array, including --profile when using a named profile
|
||||
prog_args = [
|
||||
f"<string>{python_path}</string>",
|
||||
"<string>-m</string>",
|
||||
"<string>hermes_cli.main</string>",
|
||||
]
|
||||
if profile_arg:
|
||||
for part in profile_arg.split():
|
||||
prog_args.append(f"<string>{part}</string>")
|
||||
prog_args.extend([
|
||||
"<string>gateway</string>",
|
||||
"<string>run</string>",
|
||||
"<string>--replace</string>",
|
||||
])
|
||||
prog_args_xml = "\n ".join(prog_args)
|
||||
|
||||
return f"""<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
|
||||
<plist version="1.0">
|
||||
<dict>
|
||||
<key>Label</key>
|
||||
<string>{label}</string>
|
||||
|
||||
|
||||
<key>ProgramArguments</key>
|
||||
<array>
|
||||
<string>{python_path}</string>
|
||||
<string>-m</string>
|
||||
<string>hermes_cli.main</string>
|
||||
<string>gateway</string>
|
||||
<string>run</string>
|
||||
<string>--replace</string>
|
||||
{prog_args_xml}
|
||||
</array>
|
||||
|
||||
<key>WorkingDirectory</key>
|
||||
|
||||
@@ -791,12 +791,12 @@ def list_authenticated_providers(
|
||||
if overlay.auth_type in ("oauth_device_code", "oauth_external", "external_process"):
|
||||
# These use auth stores, not env vars — check for auth.json entries
|
||||
try:
|
||||
from hermes_cli.auth import _read_auth_store
|
||||
store = _read_auth_store()
|
||||
if store and pid in store:
|
||||
from hermes_cli.auth import _load_auth_store
|
||||
store = _load_auth_store()
|
||||
if store and (pid in store.get("providers", {}) or pid in store.get("credential_pool", {})):
|
||||
has_creds = True
|
||||
except Exception:
|
||||
pass
|
||||
except Exception as exc:
|
||||
logger.debug("Auth store check failed for %s: %s", pid, exc)
|
||||
if not has_creds:
|
||||
continue
|
||||
|
||||
|
||||
+12
-8
@@ -144,18 +144,22 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
|
||||
"kimi-k2-0905-preview",
|
||||
],
|
||||
"minimax": [
|
||||
"MiniMax-M2.7",
|
||||
"MiniMax-M2.7-highspeed",
|
||||
"MiniMax-M1",
|
||||
"MiniMax-M1-40k",
|
||||
"MiniMax-M1-80k",
|
||||
"MiniMax-M1-128k",
|
||||
"MiniMax-M1-256k",
|
||||
"MiniMax-M2.5",
|
||||
"MiniMax-M2.5-highspeed",
|
||||
"MiniMax-M2.1",
|
||||
"MiniMax-M2.7",
|
||||
],
|
||||
"minimax-cn": [
|
||||
"MiniMax-M2.7",
|
||||
"MiniMax-M2.7-highspeed",
|
||||
"MiniMax-M1",
|
||||
"MiniMax-M1-40k",
|
||||
"MiniMax-M1-80k",
|
||||
"MiniMax-M1-128k",
|
||||
"MiniMax-M1-256k",
|
||||
"MiniMax-M2.5",
|
||||
"MiniMax-M2.5-highspeed",
|
||||
"MiniMax-M2.1",
|
||||
"MiniMax-M2.7",
|
||||
],
|
||||
"anthropic": [
|
||||
"claude-opus-4-6",
|
||||
|
||||
@@ -61,6 +61,8 @@ VALID_HOOKS: Set[str] = {
|
||||
"post_api_request",
|
||||
"on_session_start",
|
||||
"on_session_end",
|
||||
"on_session_finalize",
|
||||
"on_session_reset",
|
||||
}
|
||||
|
||||
ENTRY_POINTS_GROUP = "hermes_agent.plugins"
|
||||
|
||||
@@ -163,6 +163,16 @@ def _resolve_runtime_from_pool_entry(
|
||||
api_mode = _copilot_runtime_api_mode(model_cfg, getattr(entry, "runtime_api_key", ""))
|
||||
else:
|
||||
configured_provider = str(model_cfg.get("provider") or "").strip().lower()
|
||||
# Honour model.base_url from config.yaml when the configured provider
|
||||
# matches this provider — same pattern as the Anthropic branch above.
|
||||
# Only override when the pool entry has no explicit base_url (i.e. it
|
||||
# fell back to the hardcoded default). Env var overrides win (#6039).
|
||||
pconfig = PROVIDER_REGISTRY.get(provider)
|
||||
pool_url_is_default = pconfig and base_url.rstrip("/") == pconfig.inference_base_url.rstrip("/")
|
||||
if configured_provider == provider and pool_url_is_default:
|
||||
cfg_base_url = str(model_cfg.get("base_url") or "").strip().rstrip("/")
|
||||
if cfg_base_url:
|
||||
base_url = cfg_base_url
|
||||
configured_mode = _parse_api_mode(model_cfg.get("api_mode"))
|
||||
if configured_mode and _provider_supports_explicit_api_mode(provider, configured_provider):
|
||||
api_mode = configured_mode
|
||||
@@ -724,7 +734,15 @@ def resolve_runtime_provider(
|
||||
pconfig = PROVIDER_REGISTRY.get(provider)
|
||||
if pconfig and pconfig.auth_type == "api_key":
|
||||
creds = resolve_api_key_provider_credentials(provider)
|
||||
base_url = creds.get("base_url", "").rstrip("/")
|
||||
# Honour model.base_url from config.yaml when the configured provider
|
||||
# matches this provider — mirrors the Anthropic path above. Without
|
||||
# this, users who set model.base_url to e.g. api.minimaxi.com/anthropic
|
||||
# (China endpoint) still get the hardcoded api.minimax.io default (#6039).
|
||||
cfg_provider = str(model_cfg.get("provider") or "").strip().lower()
|
||||
cfg_base_url = ""
|
||||
if cfg_provider == provider:
|
||||
cfg_base_url = (model_cfg.get("base_url") or "").strip().rstrip("/")
|
||||
base_url = cfg_base_url or creds.get("base_url", "").rstrip("/")
|
||||
api_mode = "chat_completions"
|
||||
if provider == "copilot":
|
||||
api_mode = _copilot_runtime_api_mode(model_cfg, creds.get("api_key", ""))
|
||||
|
||||
+17
-5
@@ -105,8 +105,8 @@ _DEFAULT_PROVIDER_MODELS = {
|
||||
],
|
||||
"zai": ["glm-5", "glm-4.7", "glm-4.5", "glm-4.5-flash"],
|
||||
"kimi-coding": ["kimi-k2.5", "kimi-k2-thinking", "kimi-k2-turbo-preview"],
|
||||
"minimax": ["MiniMax-M2.7", "MiniMax-M2.7-highspeed", "MiniMax-M2.5", "MiniMax-M2.5-highspeed", "MiniMax-M2.1"],
|
||||
"minimax-cn": ["MiniMax-M2.7", "MiniMax-M2.7-highspeed", "MiniMax-M2.5", "MiniMax-M2.5-highspeed", "MiniMax-M2.1"],
|
||||
"minimax": ["MiniMax-M1", "MiniMax-M1-40k", "MiniMax-M1-80k", "MiniMax-M1-128k", "MiniMax-M1-256k", "MiniMax-M2.5", "MiniMax-M2.7"],
|
||||
"minimax-cn": ["MiniMax-M1", "MiniMax-M1-40k", "MiniMax-M1-80k", "MiniMax-M1-128k", "MiniMax-M1-256k", "MiniMax-M2.5", "MiniMax-M2.7"],
|
||||
"ai-gateway": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5", "google/gemini-3-flash"],
|
||||
"kilocode": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5.4", "google/gemini-3-pro-preview", "google/gemini-3-flash-preview"],
|
||||
"opencode-zen": ["gpt-5.4", "gpt-5.3-codex", "claude-sonnet-4-6", "gemini-3-flash", "glm-5", "kimi-k2.5", "minimax-m2.7"],
|
||||
@@ -421,10 +421,22 @@ def _curses_prompt_choice(question: str, choices: list, default: int = 0) -> int
|
||||
curses.init_pair(1, curses.COLOR_GREEN, -1)
|
||||
curses.init_pair(2, curses.COLOR_YELLOW, -1)
|
||||
cursor = default
|
||||
scroll_offset = 0
|
||||
|
||||
while True:
|
||||
stdscr.clear()
|
||||
max_y, max_x = stdscr.getmaxyx()
|
||||
|
||||
# Rows available for list items: rows 2..(max_y-2) inclusive.
|
||||
visible = max(1, max_y - 3)
|
||||
|
||||
# Scroll the viewport so the cursor is always visible.
|
||||
if cursor < scroll_offset:
|
||||
scroll_offset = cursor
|
||||
elif cursor >= scroll_offset + visible:
|
||||
scroll_offset = cursor - visible + 1
|
||||
scroll_offset = max(0, min(scroll_offset, max(0, len(choices) - visible)))
|
||||
|
||||
try:
|
||||
stdscr.addnstr(
|
||||
0,
|
||||
@@ -436,12 +448,12 @@ def _curses_prompt_choice(question: str, choices: list, default: int = 0) -> int
|
||||
except curses.error:
|
||||
pass
|
||||
|
||||
for i, choice in enumerate(choices):
|
||||
y = i + 2
|
||||
for row, i in enumerate(range(scroll_offset, min(scroll_offset + visible, len(choices)))):
|
||||
y = row + 2
|
||||
if y >= max_y - 1:
|
||||
break
|
||||
arrow = "→" if i == cursor else " "
|
||||
line = f" {arrow} {choice}"
|
||||
line = f" {arrow} {choices[i]}"
|
||||
attr = curses.A_NORMAL
|
||||
if i == cursor:
|
||||
attr = curses.A_BOLD
|
||||
|
||||
@@ -554,6 +554,7 @@ def _get_platform_tools(
|
||||
# MCP servers are expected to be available on all platforms by default.
|
||||
# If the platform explicitly lists one or more MCP server names, treat that
|
||||
# as an allowlist. Otherwise include every globally enabled MCP server.
|
||||
# Special sentinel: "no_mcp" in the toolset list disables all MCP servers.
|
||||
mcp_servers = config.get("mcp_servers") or {}
|
||||
enabled_mcp_servers = {
|
||||
name
|
||||
@@ -561,10 +562,15 @@ def _get_platform_tools(
|
||||
if isinstance(server_cfg, dict)
|
||||
and _parse_enabled_flag(server_cfg.get("enabled", True), default=True)
|
||||
}
|
||||
explicit_mcp_servers = explicit_passthrough & enabled_mcp_servers
|
||||
enabled_toolsets.update(explicit_passthrough - enabled_mcp_servers)
|
||||
# Allow "no_mcp" sentinel to opt out of all MCP servers for this platform
|
||||
if "no_mcp" in toolset_names:
|
||||
explicit_mcp_servers = set()
|
||||
enabled_toolsets.update(explicit_passthrough - enabled_mcp_servers - {"no_mcp"})
|
||||
else:
|
||||
explicit_mcp_servers = explicit_passthrough & enabled_mcp_servers
|
||||
enabled_toolsets.update(explicit_passthrough - enabled_mcp_servers)
|
||||
if include_default_mcp_servers:
|
||||
if explicit_mcp_servers:
|
||||
if explicit_mcp_servers or "no_mcp" in toolset_names:
|
||||
enabled_toolsets.update(explicit_mcp_servers)
|
||||
else:
|
||||
enabled_toolsets.update(enabled_mcp_servers)
|
||||
|
||||
@@ -73,6 +73,7 @@ Config file: `~/.hermes/hindsight/config.json`
|
||||
|-----|---------|-------------|
|
||||
| `llm_provider` | `openai` | LLM provider: `openai`, `anthropic`, `gemini`, `groq`, `minimax`, `ollama` |
|
||||
| `llm_model` | per-provider | Model name (e.g. `gpt-4o-mini`, `openai/gpt-oss-120b`) |
|
||||
| `llm_base_url` | — | LLM Base URL override (e.g. `https://openrouter.ai/api/v1`) |
|
||||
|
||||
The LLM API key is stored in `~/.hermes/.env` as `HINDSIGHT_LLM_API_KEY`.
|
||||
|
||||
@@ -92,6 +93,7 @@ Available in `hybrid` and `tools` memory modes:
|
||||
|----------|-------------|
|
||||
| `HINDSIGHT_API_KEY` | API key for Hindsight Cloud |
|
||||
| `HINDSIGHT_LLM_API_KEY` | LLM API key for local mode |
|
||||
| `HINDSIGHT_API_LLM_BASE_URL` | LLM Base URL for local mode (e.g. OpenRouter) |
|
||||
| `HINDSIGHT_API_URL` | Override API endpoint |
|
||||
| `HINDSIGHT_BANK_ID` | Override bank name |
|
||||
| `HINDSIGHT_BUDGET` | Override recall budget |
|
||||
|
||||
@@ -23,6 +23,8 @@ import json
|
||||
import logging
|
||||
import os
|
||||
import threading
|
||||
|
||||
from hermes_constants import get_hermes_home
|
||||
from typing import Any, Dict, List
|
||||
|
||||
from agent.memory_provider import MemoryProvider
|
||||
@@ -142,7 +144,6 @@ def _load_config() -> dict:
|
||||
3. Environment variables
|
||||
"""
|
||||
from pathlib import Path
|
||||
from hermes_constants import get_hermes_home
|
||||
|
||||
# Profile-scoped path (preferred)
|
||||
profile_path = get_hermes_home() / "hindsight" / "config.json"
|
||||
@@ -234,6 +235,7 @@ class HindsightMemoryProvider(MemoryProvider):
|
||||
{"key": "api_key", "description": "Hindsight Cloud API key", "secret": True, "env_var": "HINDSIGHT_API_KEY", "url": "https://ui.hindsight.vectorize.io", "when": {"mode": "cloud"}},
|
||||
{"key": "llm_provider", "description": "LLM provider for local mode", "default": "openai", "choices": ["openai", "anthropic", "gemini", "groq", "minimax", "ollama"], "when": {"mode": "local"}},
|
||||
{"key": "llm_api_key", "description": "LLM API key for local Hindsight", "secret": True, "env_var": "HINDSIGHT_LLM_API_KEY", "when": {"mode": "local"}},
|
||||
{"key": "llm_base_url", "description": "LLM Base URL (e.g. for OpenRouter)", "default": "", "env_var": "HINDSIGHT_API_LLM_BASE_URL", "when": {"mode": "local"}},
|
||||
{"key": "llm_model", "description": "LLM model for local mode", "default": "gpt-4o-mini", "default_from": {"field": "llm_provider", "map": _PROVIDER_DEFAULT_MODELS}, "when": {"mode": "local"}},
|
||||
{"key": "bank_id", "description": "Memory bank name", "default": "hermes"},
|
||||
{"key": "budget", "description": "Recall thoroughness", "default": "mid", "choices": ["low", "mid", "high"]},
|
||||
@@ -250,12 +252,16 @@ class HindsightMemoryProvider(MemoryProvider):
|
||||
# different loop" errors during GC — we handle cleanup in
|
||||
# shutdown() instead.
|
||||
HindsightEmbedded.__del__ = lambda self: None
|
||||
self._client = HindsightEmbedded(
|
||||
kwargs = dict(
|
||||
profile=self._config.get("profile", "hermes"),
|
||||
llm_provider=self._config.get("llm_provider", ""),
|
||||
llm_api_key=self._config.get("llmApiKey") or os.environ.get("HINDSIGHT_LLM_API_KEY", ""),
|
||||
llm_api_key=self._config.get("llm_api_key") or os.environ.get("HINDSIGHT_LLM_API_KEY", ""),
|
||||
llm_model=self._config.get("llm_model", ""),
|
||||
)
|
||||
base_url = self._config.get("llm_base_url") or os.environ.get("HINDSIGHT_API_LLM_BASE_URL", "")
|
||||
if base_url:
|
||||
kwargs["llm_base_url"] = base_url
|
||||
self._client = HindsightEmbedded(**kwargs)
|
||||
else:
|
||||
from hindsight_client import Hindsight
|
||||
kwargs = {"base_url": self._api_url, "timeout": 30.0}
|
||||
@@ -310,9 +316,10 @@ class HindsightMemoryProvider(MemoryProvider):
|
||||
# If the config changed and the daemon is running, stop it.
|
||||
from pathlib import Path as _Path
|
||||
profile_env = _Path.home() / ".hindsight" / "profiles" / f"{profile}.env"
|
||||
current_key = self._config.get("llmApiKey") or os.environ.get("HINDSIGHT_LLM_API_KEY", "")
|
||||
current_key = self._config.get("llm_api_key") or os.environ.get("HINDSIGHT_LLM_API_KEY", "")
|
||||
current_provider = self._config.get("llm_provider", "")
|
||||
current_model = self._config.get("llm_model", "")
|
||||
current_base_url = self._config.get("llm_base_url") or os.environ.get("HINDSIGHT_API_LLM_BASE_URL", "")
|
||||
|
||||
# Read saved profile config
|
||||
saved = {}
|
||||
@@ -325,18 +332,22 @@ class HindsightMemoryProvider(MemoryProvider):
|
||||
config_changed = (
|
||||
saved.get("HINDSIGHT_API_LLM_PROVIDER") != current_provider or
|
||||
saved.get("HINDSIGHT_API_LLM_MODEL") != current_model or
|
||||
saved.get("HINDSIGHT_API_LLM_API_KEY") != current_key
|
||||
saved.get("HINDSIGHT_API_LLM_API_KEY") != current_key or
|
||||
saved.get("HINDSIGHT_API_LLM_BASE_URL", "") != current_base_url
|
||||
)
|
||||
|
||||
if config_changed:
|
||||
# Write updated profile .env
|
||||
profile_env.parent.mkdir(parents=True, exist_ok=True)
|
||||
profile_env.write_text(
|
||||
env_lines = (
|
||||
f"HINDSIGHT_API_LLM_PROVIDER={current_provider}\n"
|
||||
f"HINDSIGHT_API_LLM_API_KEY={current_key}\n"
|
||||
f"HINDSIGHT_API_LLM_MODEL={current_model}\n"
|
||||
f"HINDSIGHT_API_LOG_LEVEL=info\n"
|
||||
)
|
||||
if current_base_url:
|
||||
env_lines += f"HINDSIGHT_API_LLM_BASE_URL={current_base_url}\n"
|
||||
profile_env.write_text(env_lines)
|
||||
if client._manager.is_running(profile):
|
||||
with open(log_path, "a") as f:
|
||||
f.write("\n=== Config changed, restarting daemon ===\n")
|
||||
|
||||
@@ -17,7 +17,7 @@ Or manually:
|
||||
|
||||
```bash
|
||||
hermes config set memory.provider supermemory
|
||||
echo 'SUPERMEMORY_API_KEY=your-key-here' >> ~/.hermes/.env
|
||||
echo 'SUPERMEMORY_API_KEY=***' >> ~/.hermes/.env
|
||||
```
|
||||
|
||||
## Config
|
||||
@@ -26,15 +26,23 @@ Config file: `$HERMES_HOME/supermemory.json`
|
||||
|
||||
| Key | Default | Description |
|
||||
|-----|---------|-------------|
|
||||
| `container_tag` | `hermes` | Container tag used for search and writes |
|
||||
| `container_tag` | `hermes` | Container tag used for search and writes. Supports `{identity}` template for profile-scoped tags (e.g. `hermes-{identity}` → `hermes-coder`). |
|
||||
| `auto_recall` | `true` | Inject relevant memory context before turns |
|
||||
| `auto_capture` | `true` | Store cleaned user-assistant turns after each response |
|
||||
| `max_recall_results` | `10` | Max recalled items to format into context |
|
||||
| `profile_frequency` | `50` | Include profile facts on first turn and every N turns |
|
||||
| `capture_mode` | `all` | Skip tiny or trivial turns by default |
|
||||
| `search_mode` | `hybrid` | Search mode: `hybrid` (profile + memories), `memories` (memories only), `documents` (documents only) |
|
||||
| `entity_context` | built-in default | Extraction guidance passed to Supermemory |
|
||||
| `api_timeout` | `5.0` | Timeout for SDK and ingest requests |
|
||||
|
||||
### Environment Variables
|
||||
|
||||
| Variable | Description |
|
||||
|----------|-------------|
|
||||
| `SUPERMEMORY_API_KEY` | API key (required) |
|
||||
| `SUPERMEMORY_CONTAINER_TAG` | Override container tag (takes priority over config file) |
|
||||
|
||||
## Tools
|
||||
|
||||
| Tool | Description |
|
||||
@@ -52,3 +60,40 @@ When enabled, Hermes can:
|
||||
- store cleaned conversation turns after each completed response
|
||||
- ingest the full session on session end for richer graph updates
|
||||
- expose explicit tools for search, store, forget, and profile access
|
||||
|
||||
## Profile-Scoped Containers
|
||||
|
||||
Use `{identity}` in the `container_tag` to scope memories per Hermes profile:
|
||||
|
||||
```json
|
||||
{
|
||||
"container_tag": "hermes-{identity}"
|
||||
}
|
||||
```
|
||||
|
||||
For a profile named `coder`, this resolves to `hermes-coder`. The default profile resolves to `hermes-default`. Without `{identity}`, all profiles share the same container.
|
||||
|
||||
## Multi-Container Mode
|
||||
|
||||
For advanced setups (e.g. OpenClaw-style multi-workspace), you can enable custom container tags so the agent can read/write across multiple named containers:
|
||||
|
||||
```json
|
||||
{
|
||||
"container_tag": "hermes",
|
||||
"enable_custom_container_tags": true,
|
||||
"custom_containers": ["project-alpha", "project-beta", "shared-knowledge"],
|
||||
"custom_container_instructions": "Use project-alpha for coding tasks, project-beta for research, and shared-knowledge for team-wide facts."
|
||||
}
|
||||
```
|
||||
|
||||
When enabled:
|
||||
- `supermemory_search`, `supermemory_store`, `supermemory_forget`, and `supermemory_profile` accept an optional `container_tag` parameter
|
||||
- The tag must be in the whitelist: primary container + `custom_containers`
|
||||
- Automatic operations (turn sync, prefetch, memory write mirroring, session ingest) always use the **primary** container only
|
||||
- Custom container instructions are injected into the system prompt
|
||||
|
||||
## Support
|
||||
|
||||
- [Supermemory Discord](https://supermemory.link/discord)
|
||||
- [support@supermemory.com](mailto:support@supermemory.com)
|
||||
- [supermemory.ai](https://supermemory.ai)
|
||||
|
||||
@@ -26,6 +26,8 @@ _DEFAULT_CONTAINER_TAG = "hermes"
|
||||
_DEFAULT_MAX_RECALL_RESULTS = 10
|
||||
_DEFAULT_PROFILE_FREQUENCY = 50
|
||||
_DEFAULT_CAPTURE_MODE = "all"
|
||||
_DEFAULT_SEARCH_MODE = "hybrid"
|
||||
_VALID_SEARCH_MODES = ("hybrid", "memories", "documents")
|
||||
_DEFAULT_API_TIMEOUT = 5.0
|
||||
_MIN_CAPTURE_LENGTH = 10
|
||||
_MAX_ENTITY_CONTEXT_LENGTH = 1500
|
||||
@@ -59,8 +61,12 @@ def _default_config() -> dict:
|
||||
"max_recall_results": _DEFAULT_MAX_RECALL_RESULTS,
|
||||
"profile_frequency": _DEFAULT_PROFILE_FREQUENCY,
|
||||
"capture_mode": _DEFAULT_CAPTURE_MODE,
|
||||
"search_mode": _DEFAULT_SEARCH_MODE,
|
||||
"entity_context": _DEFAULT_ENTITY_CONTEXT,
|
||||
"api_timeout": _DEFAULT_API_TIMEOUT,
|
||||
"enable_custom_container_tags": False,
|
||||
"custom_containers": [],
|
||||
"custom_container_instructions": "",
|
||||
}
|
||||
|
||||
|
||||
@@ -100,7 +106,10 @@ def _load_supermemory_config(hermes_home: str) -> dict:
|
||||
except Exception:
|
||||
logger.debug("Failed to parse %s", config_path, exc_info=True)
|
||||
|
||||
config["container_tag"] = _sanitize_tag(str(config.get("container_tag", _DEFAULT_CONTAINER_TAG)))
|
||||
# Keep raw container_tag — template variables like {identity} are resolved
|
||||
# in initialize(), and _sanitize_tag runs AFTER resolution.
|
||||
raw_tag = str(config.get("container_tag", _DEFAULT_CONTAINER_TAG)).strip()
|
||||
config["container_tag"] = raw_tag if raw_tag else _DEFAULT_CONTAINER_TAG
|
||||
config["auto_recall"] = _as_bool(config.get("auto_recall"), True)
|
||||
config["auto_capture"] = _as_bool(config.get("auto_capture"), True)
|
||||
try:
|
||||
@@ -112,11 +121,23 @@ def _load_supermemory_config(hermes_home: str) -> dict:
|
||||
except Exception:
|
||||
config["profile_frequency"] = _DEFAULT_PROFILE_FREQUENCY
|
||||
config["capture_mode"] = "everything" if config.get("capture_mode") == "everything" else "all"
|
||||
raw_search_mode = str(config.get("search_mode", _DEFAULT_SEARCH_MODE)).strip().lower()
|
||||
config["search_mode"] = raw_search_mode if raw_search_mode in _VALID_SEARCH_MODES else _DEFAULT_SEARCH_MODE
|
||||
config["entity_context"] = _clamp_entity_context(str(config.get("entity_context", _DEFAULT_ENTITY_CONTEXT)))
|
||||
try:
|
||||
config["api_timeout"] = max(0.5, min(15.0, float(config.get("api_timeout", _DEFAULT_API_TIMEOUT))))
|
||||
except Exception:
|
||||
config["api_timeout"] = _DEFAULT_API_TIMEOUT
|
||||
|
||||
# Multi-container support
|
||||
config["enable_custom_container_tags"] = _as_bool(config.get("enable_custom_container_tags"), False)
|
||||
raw_containers = config.get("custom_containers", [])
|
||||
if isinstance(raw_containers, list):
|
||||
config["custom_containers"] = [_sanitize_tag(str(t)) for t in raw_containers if t]
|
||||
else:
|
||||
config["custom_containers"] = []
|
||||
config["custom_container_instructions"] = str(config.get("custom_container_instructions", "")).strip()
|
||||
|
||||
return config
|
||||
|
||||
|
||||
@@ -240,28 +261,41 @@ def _is_trivial_message(text: str) -> bool:
|
||||
|
||||
|
||||
class _SupermemoryClient:
|
||||
def __init__(self, api_key: str, timeout: float, container_tag: str):
|
||||
def __init__(self, api_key: str, timeout: float, container_tag: str, search_mode: str = "hybrid"):
|
||||
from supermemory import Supermemory
|
||||
|
||||
self._api_key = api_key
|
||||
self._container_tag = container_tag
|
||||
self._search_mode = search_mode if search_mode in _VALID_SEARCH_MODES else _DEFAULT_SEARCH_MODE
|
||||
self._timeout = timeout
|
||||
self._client = Supermemory(api_key=api_key, timeout=timeout, max_retries=0)
|
||||
|
||||
def add_memory(self, content: str, metadata: Optional[dict] = None, *, entity_context: str = "") -> dict:
|
||||
kwargs = {
|
||||
def add_memory(self, content: str, metadata: Optional[dict] = None, *,
|
||||
entity_context: str = "", container_tag: Optional[str] = None,
|
||||
custom_id: Optional[str] = None) -> dict:
|
||||
tag = container_tag or self._container_tag
|
||||
kwargs: dict[str, Any] = {
|
||||
"content": content.strip(),
|
||||
"container_tags": [self._container_tag],
|
||||
"container_tags": [tag],
|
||||
}
|
||||
if metadata:
|
||||
kwargs["metadata"] = metadata
|
||||
if entity_context:
|
||||
kwargs["entity_context"] = _clamp_entity_context(entity_context)
|
||||
if custom_id:
|
||||
kwargs["custom_id"] = custom_id
|
||||
result = self._client.documents.add(**kwargs)
|
||||
return {"id": getattr(result, "id", "")}
|
||||
|
||||
def search_memories(self, query: str, *, limit: int = 5) -> list[dict]:
|
||||
response = self._client.search.memories(q=query, container_tag=self._container_tag, limit=limit)
|
||||
def search_memories(self, query: str, *, limit: int = 5,
|
||||
container_tag: Optional[str] = None,
|
||||
search_mode: Optional[str] = None) -> list[dict]:
|
||||
tag = container_tag or self._container_tag
|
||||
mode = search_mode or self._search_mode
|
||||
kwargs: dict[str, Any] = {"q": query, "container_tag": tag, "limit": limit}
|
||||
if mode in _VALID_SEARCH_MODES:
|
||||
kwargs["search_mode"] = mode
|
||||
response = self._client.search.memories(**kwargs)
|
||||
results = []
|
||||
for item in (getattr(response, "results", None) or []):
|
||||
results.append({
|
||||
@@ -273,8 +307,10 @@ class _SupermemoryClient:
|
||||
})
|
||||
return results
|
||||
|
||||
def get_profile(self, query: Optional[str] = None) -> dict:
|
||||
kwargs = {"container_tag": self._container_tag}
|
||||
def get_profile(self, query: Optional[str] = None, *,
|
||||
container_tag: Optional[str] = None) -> dict:
|
||||
tag = container_tag or self._container_tag
|
||||
kwargs: dict[str, Any] = {"container_tag": tag}
|
||||
if query:
|
||||
kwargs["q"] = query
|
||||
response = self._client.profile(**kwargs)
|
||||
@@ -296,18 +332,19 @@ class _SupermemoryClient:
|
||||
})
|
||||
return {"static": static, "dynamic": dynamic, "search_results": search_results}
|
||||
|
||||
def forget_memory(self, memory_id: str) -> None:
|
||||
self._client.memories.forget(container_tag=self._container_tag, id=memory_id)
|
||||
def forget_memory(self, memory_id: str, *, container_tag: Optional[str] = None) -> None:
|
||||
tag = container_tag or self._container_tag
|
||||
self._client.memories.forget(container_tag=tag, id=memory_id)
|
||||
|
||||
def forget_by_query(self, query: str) -> dict:
|
||||
results = self.search_memories(query, limit=5)
|
||||
def forget_by_query(self, query: str, *, container_tag: Optional[str] = None) -> dict:
|
||||
results = self.search_memories(query, limit=5, container_tag=container_tag)
|
||||
if not results:
|
||||
return {"success": False, "message": "No matching memory found to forget."}
|
||||
target = results[0]
|
||||
memory_id = target.get("id", "")
|
||||
if not memory_id:
|
||||
return {"success": False, "message": "Best matching memory has no id."}
|
||||
self.forget_memory(memory_id)
|
||||
self.forget_memory(memory_id, container_tag=container_tag)
|
||||
preview = (target.get("memory") or "")[:100]
|
||||
return {"success": True, "message": f'Forgot: "{preview}"', "id": memory_id}
|
||||
|
||||
@@ -398,11 +435,17 @@ class SupermemoryMemoryProvider(MemoryProvider):
|
||||
self._max_recall_results = _DEFAULT_MAX_RECALL_RESULTS
|
||||
self._profile_frequency = _DEFAULT_PROFILE_FREQUENCY
|
||||
self._capture_mode = _DEFAULT_CAPTURE_MODE
|
||||
self._search_mode = _DEFAULT_SEARCH_MODE
|
||||
self._entity_context = _DEFAULT_ENTITY_CONTEXT
|
||||
self._api_timeout = _DEFAULT_API_TIMEOUT
|
||||
self._hermes_home = ""
|
||||
self._write_enabled = True
|
||||
self._active = False
|
||||
# Multi-container support
|
||||
self._enable_custom_containers = False
|
||||
self._custom_containers: List[str] = []
|
||||
self._custom_container_instructions = ""
|
||||
self._allowed_containers: List[str] = []
|
||||
|
||||
@property
|
||||
def name(self) -> str:
|
||||
@@ -419,16 +462,11 @@ class SupermemoryMemoryProvider(MemoryProvider):
|
||||
return False
|
||||
|
||||
def get_config_schema(self):
|
||||
# Only prompt for the API key during `hermes memory setup`.
|
||||
# All other options are documented for $HERMES_HOME/supermemory.json
|
||||
# or the SUPERMEMORY_CONTAINER_TAG env var.
|
||||
return [
|
||||
{"key": "api_key", "description": "Supermemory API key", "secret": True, "required": True, "env_var": "SUPERMEMORY_API_KEY", "url": "https://supermemory.ai"},
|
||||
{"key": "container_tag", "description": "Container tag for reads and writes", "default": _DEFAULT_CONTAINER_TAG},
|
||||
{"key": "auto_recall", "description": "Enable automatic recall before each turn", "default": "true", "choices": ["true", "false"]},
|
||||
{"key": "auto_capture", "description": "Enable automatic capture after each completed turn", "default": "true", "choices": ["true", "false"]},
|
||||
{"key": "max_recall_results", "description": "Maximum recalled items to inject", "default": str(_DEFAULT_MAX_RECALL_RESULTS)},
|
||||
{"key": "profile_frequency", "description": "Include profile facts on first turn and every N turns", "default": str(_DEFAULT_PROFILE_FREQUENCY)},
|
||||
{"key": "capture_mode", "description": "Capture mode", "default": _DEFAULT_CAPTURE_MODE, "choices": ["all", "everything"]},
|
||||
{"key": "entity_context", "description": "Extraction guidance passed to Supermemory", "default": _DEFAULT_ENTITY_CONTEXT},
|
||||
{"key": "api_timeout", "description": "Timeout in seconds for SDK and ingest calls", "default": str(_DEFAULT_API_TIMEOUT)},
|
||||
]
|
||||
|
||||
def save_config(self, values, hermes_home):
|
||||
@@ -446,14 +484,29 @@ class SupermemoryMemoryProvider(MemoryProvider):
|
||||
self._turn_count = 0
|
||||
self._config = _load_supermemory_config(self._hermes_home)
|
||||
self._api_key = os.environ.get("SUPERMEMORY_API_KEY", "")
|
||||
self._container_tag = self._config["container_tag"]
|
||||
|
||||
# Resolve container tag: env var > config > default.
|
||||
# Supports {identity} template for profile-scoped containers.
|
||||
env_tag = os.environ.get("SUPERMEMORY_CONTAINER_TAG", "").strip()
|
||||
raw_tag = env_tag or self._config["container_tag"]
|
||||
identity = kwargs.get("agent_identity", "default")
|
||||
self._container_tag = _sanitize_tag(raw_tag.replace("{identity}", identity))
|
||||
|
||||
self._auto_recall = self._config["auto_recall"]
|
||||
self._auto_capture = self._config["auto_capture"]
|
||||
self._max_recall_results = self._config["max_recall_results"]
|
||||
self._profile_frequency = self._config["profile_frequency"]
|
||||
self._capture_mode = self._config["capture_mode"]
|
||||
self._search_mode = self._config["search_mode"]
|
||||
self._entity_context = self._config["entity_context"]
|
||||
self._api_timeout = self._config["api_timeout"]
|
||||
|
||||
# Multi-container setup
|
||||
self._enable_custom_containers = self._config["enable_custom_container_tags"]
|
||||
self._custom_containers = self._config["custom_containers"]
|
||||
self._custom_container_instructions = self._config["custom_container_instructions"]
|
||||
self._allowed_containers = [self._container_tag] + list(self._custom_containers)
|
||||
|
||||
agent_context = kwargs.get("agent_context", "")
|
||||
self._write_enabled = agent_context not in ("cron", "flush", "subagent")
|
||||
self._active = bool(self._api_key)
|
||||
@@ -464,6 +517,7 @@ class SupermemoryMemoryProvider(MemoryProvider):
|
||||
api_key=self._api_key,
|
||||
timeout=self._api_timeout,
|
||||
container_tag=self._container_tag,
|
||||
search_mode=self._search_mode,
|
||||
)
|
||||
except Exception:
|
||||
logger.warning("Supermemory initialization failed", exc_info=True)
|
||||
@@ -476,11 +530,18 @@ class SupermemoryMemoryProvider(MemoryProvider):
|
||||
def system_prompt_block(self) -> str:
|
||||
if not self._active:
|
||||
return ""
|
||||
return (
|
||||
"# Supermemory\n"
|
||||
f"Active. Container: {self._container_tag}.\n"
|
||||
"Use supermemory_search, supermemory_store, supermemory_forget, and supermemory_profile for explicit memory operations."
|
||||
)
|
||||
lines = [
|
||||
"# Supermemory",
|
||||
f"Active. Container: {self._container_tag}.",
|
||||
"Use supermemory_search, supermemory_store, supermemory_forget, and supermemory_profile for explicit memory operations.",
|
||||
]
|
||||
if self._enable_custom_containers and self._custom_containers:
|
||||
tags_str = ", ".join(self._allowed_containers)
|
||||
lines.append(f"\nMulti-container mode enabled. Available containers: {tags_str}.")
|
||||
lines.append("Pass an optional container_tag to supermemory_search, supermemory_store, supermemory_forget, and supermemory_profile to target a specific container.")
|
||||
if self._custom_container_instructions:
|
||||
lines.append(f"\n{self._custom_container_instructions}")
|
||||
return "\n".join(lines)
|
||||
|
||||
def prefetch(self, query: str, *, session_id: str = "") -> str:
|
||||
if not self._active or not self._auto_recall or not self._client or not query.strip():
|
||||
@@ -582,22 +643,62 @@ class SupermemoryMemoryProvider(MemoryProvider):
|
||||
thread.join(timeout=5.0)
|
||||
setattr(self, attr_name, None)
|
||||
|
||||
def _resolve_tool_container_tag(self, args: dict) -> Optional[str]:
|
||||
"""Validate and resolve container_tag from tool call args.
|
||||
|
||||
Returns None (use primary) if multi-container is disabled or no tag provided.
|
||||
Returns the validated tag if it's in the allowed list.
|
||||
Raises ValueError if the tag is not whitelisted.
|
||||
"""
|
||||
if not self._enable_custom_containers:
|
||||
return None
|
||||
tag = str(args.get("container_tag") or "").strip()
|
||||
if not tag:
|
||||
return None
|
||||
sanitized = _sanitize_tag(tag)
|
||||
if sanitized not in self._allowed_containers:
|
||||
raise ValueError(
|
||||
f"Container tag '{sanitized}' is not allowed. "
|
||||
f"Allowed: {', '.join(self._allowed_containers)}"
|
||||
)
|
||||
return sanitized
|
||||
|
||||
def get_tool_schemas(self) -> List[Dict[str, Any]]:
|
||||
return [STORE_SCHEMA, SEARCH_SCHEMA, FORGET_SCHEMA, PROFILE_SCHEMA]
|
||||
if not self._enable_custom_containers:
|
||||
return [STORE_SCHEMA, SEARCH_SCHEMA, FORGET_SCHEMA, PROFILE_SCHEMA]
|
||||
|
||||
# When multi-container is enabled, add optional container_tag to relevant tools
|
||||
container_param = {
|
||||
"type": "string",
|
||||
"description": f"Optional container tag. Allowed: {', '.join(self._allowed_containers)}. Defaults to primary ({self._container_tag}).",
|
||||
}
|
||||
schemas = []
|
||||
for base in [STORE_SCHEMA, SEARCH_SCHEMA, FORGET_SCHEMA, PROFILE_SCHEMA]:
|
||||
schema = json.loads(json.dumps(base)) # deep copy
|
||||
schema["parameters"]["properties"]["container_tag"] = container_param
|
||||
schemas.append(schema)
|
||||
return schemas
|
||||
|
||||
def _tool_store(self, args: dict) -> str:
|
||||
content = str(args.get("content") or "").strip()
|
||||
if not content:
|
||||
return tool_error("content is required")
|
||||
try:
|
||||
tag = self._resolve_tool_container_tag(args)
|
||||
except ValueError as exc:
|
||||
return tool_error(str(exc))
|
||||
metadata = args.get("metadata") or {}
|
||||
if not isinstance(metadata, dict):
|
||||
metadata = {}
|
||||
metadata.setdefault("type", _detect_category(content))
|
||||
metadata["source"] = "hermes_tool"
|
||||
try:
|
||||
result = self._client.add_memory(content, metadata=metadata, entity_context=self._entity_context)
|
||||
result = self._client.add_memory(content, metadata=metadata, entity_context=self._entity_context, container_tag=tag)
|
||||
preview = content[:80] + ("..." if len(content) > 80 else "")
|
||||
return json.dumps({"saved": True, "id": result.get("id", ""), "preview": preview})
|
||||
resp: dict[str, Any] = {"saved": True, "id": result.get("id", ""), "preview": preview}
|
||||
if tag:
|
||||
resp["container_tag"] = tag
|
||||
return json.dumps(resp)
|
||||
except Exception as exc:
|
||||
return tool_error(f"Failed to store memory: {exc}")
|
||||
|
||||
@@ -605,22 +706,29 @@ class SupermemoryMemoryProvider(MemoryProvider):
|
||||
query = str(args.get("query") or "").strip()
|
||||
if not query:
|
||||
return tool_error("query is required")
|
||||
try:
|
||||
tag = self._resolve_tool_container_tag(args)
|
||||
except ValueError as exc:
|
||||
return tool_error(str(exc))
|
||||
try:
|
||||
limit = max(1, min(20, int(args.get("limit", 5) or 5)))
|
||||
except Exception:
|
||||
limit = 5
|
||||
try:
|
||||
results = self._client.search_memories(query, limit=limit)
|
||||
results = self._client.search_memories(query, limit=limit, container_tag=tag)
|
||||
formatted = []
|
||||
for item in results:
|
||||
entry = {"id": item.get("id", ""), "content": item.get("memory", "")}
|
||||
entry: dict[str, Any] = {"id": item.get("id", ""), "content": item.get("memory", "")}
|
||||
if item.get("similarity") is not None:
|
||||
try:
|
||||
entry["similarity"] = round(float(item["similarity"]) * 100)
|
||||
except Exception:
|
||||
pass
|
||||
formatted.append(entry)
|
||||
return json.dumps({"results": formatted, "count": len(formatted)})
|
||||
resp: dict[str, Any] = {"results": formatted, "count": len(formatted)}
|
||||
if tag:
|
||||
resp["container_tag"] = tag
|
||||
return json.dumps(resp)
|
||||
except Exception as exc:
|
||||
return tool_error(f"Search failed: {exc}")
|
||||
|
||||
@@ -629,28 +737,39 @@ class SupermemoryMemoryProvider(MemoryProvider):
|
||||
query = str(args.get("query") or "").strip()
|
||||
if not memory_id and not query:
|
||||
return tool_error("Provide either id or query")
|
||||
try:
|
||||
tag = self._resolve_tool_container_tag(args)
|
||||
except ValueError as exc:
|
||||
return tool_error(str(exc))
|
||||
try:
|
||||
if memory_id:
|
||||
self._client.forget_memory(memory_id)
|
||||
self._client.forget_memory(memory_id, container_tag=tag)
|
||||
return json.dumps({"forgotten": True, "id": memory_id})
|
||||
return json.dumps(self._client.forget_by_query(query))
|
||||
return json.dumps(self._client.forget_by_query(query, container_tag=tag))
|
||||
except Exception as exc:
|
||||
return tool_error(f"Forget failed: {exc}")
|
||||
|
||||
def _tool_profile(self, args: dict) -> str:
|
||||
query = str(args.get("query") or "").strip() or None
|
||||
try:
|
||||
profile = self._client.get_profile(query=query)
|
||||
tag = self._resolve_tool_container_tag(args)
|
||||
except ValueError as exc:
|
||||
return tool_error(str(exc))
|
||||
try:
|
||||
profile = self._client.get_profile(query=query, container_tag=tag)
|
||||
sections = []
|
||||
if profile["static"]:
|
||||
sections.append("## User Profile (Persistent)\n" + "\n".join(f"- {item}" for item in profile["static"]))
|
||||
if profile["dynamic"]:
|
||||
sections.append("## Recent Context\n" + "\n".join(f"- {item}" for item in profile["dynamic"]))
|
||||
return json.dumps({
|
||||
resp: dict[str, Any] = {
|
||||
"profile": "\n\n".join(sections),
|
||||
"static_count": len(profile["static"]),
|
||||
"dynamic_count": len(profile["dynamic"]),
|
||||
})
|
||||
}
|
||||
if tag:
|
||||
resp["container_tag"] = tag
|
||||
return json.dumps(resp)
|
||||
except Exception as exc:
|
||||
return tool_error(f"Profile failed: {exc}")
|
||||
|
||||
|
||||
+1
-1
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
|
||||
|
||||
[project]
|
||||
name = "hermes-agent"
|
||||
version = "0.7.0"
|
||||
version = "0.8.0"
|
||||
description = "The self-improving AI agent — creates skills from experience, improves them during use, and runs anywhere"
|
||||
readme = "README.md"
|
||||
requires-python = ">=3.11"
|
||||
|
||||
+123
-82
@@ -66,7 +66,8 @@ from model_tools import (
|
||||
handle_function_call,
|
||||
check_toolset_requirements,
|
||||
)
|
||||
from tools.terminal_tool import cleanup_vm
|
||||
from tools.terminal_tool import cleanup_vm, get_active_env
|
||||
from tools.tool_result_storage import maybe_persist_tool_result, enforce_turn_budget
|
||||
from tools.interrupt import set_interrupt as _set_interrupt
|
||||
from tools.browser_tool import cleanup_browser
|
||||
|
||||
@@ -75,6 +76,7 @@ from hermes_constants import OPENROUTER_BASE_URL
|
||||
|
||||
# Agent internals extracted to agent/ package for modularity
|
||||
from agent.memory_manager import build_memory_context_block
|
||||
from agent.retry_utils import jittered_backoff
|
||||
from agent.prompt_builder import (
|
||||
DEFAULT_AGENT_IDENTITY, PLATFORM_HINTS,
|
||||
MEMORY_GUIDANCE, SESSION_SEARCH_GUIDANCE, SKILLS_GUIDANCE,
|
||||
@@ -85,6 +87,7 @@ from agent.model_metadata import (
|
||||
estimate_tokens_rough, estimate_messages_tokens_rough, estimate_request_tokens_rough,
|
||||
get_next_probe_tier, parse_context_limit_from_error,
|
||||
save_context_length, is_local_endpoint,
|
||||
query_ollama_num_ctx,
|
||||
)
|
||||
from agent.context_compressor import ContextCompressor
|
||||
from agent.subdirectory_hints import SubdirectoryHintTracker
|
||||
@@ -409,63 +412,6 @@ def _strip_budget_warnings_from_history(messages: list) -> None:
|
||||
# Large tool result handler — save oversized output to temp file
|
||||
# =========================================================================
|
||||
|
||||
# Threshold at which tool results are saved to a file instead of kept inline.
|
||||
# 100K chars ≈ 25K tokens — generous for any reasonable output but prevents
|
||||
# catastrophic context explosions.
|
||||
_LARGE_RESULT_CHARS = 100_000
|
||||
|
||||
# How many characters of the original result to include as an inline preview
|
||||
# so the model has immediate context about what the tool returned.
|
||||
_LARGE_RESULT_PREVIEW_CHARS = 1_500
|
||||
|
||||
|
||||
def _save_oversized_tool_result(function_name: str, function_result: str) -> str:
|
||||
"""Replace oversized tool results with a file reference + preview.
|
||||
|
||||
When a tool returns more than ``_LARGE_RESULT_CHARS`` characters, the full
|
||||
content is written to a temporary file under ``HERMES_HOME/cache/tool_responses/``
|
||||
and the result sent to the model is replaced with:
|
||||
• a brief head preview (first ``_LARGE_RESULT_PREVIEW_CHARS`` chars)
|
||||
• the file path so the model can use ``read_file`` / ``search_files``
|
||||
|
||||
Falls back to destructive truncation if the file write fails.
|
||||
"""
|
||||
original_len = len(function_result)
|
||||
if original_len <= _LARGE_RESULT_CHARS:
|
||||
return function_result
|
||||
|
||||
# Build the target directory
|
||||
try:
|
||||
response_dir = os.path.join(get_hermes_home(), "cache", "tool_responses")
|
||||
os.makedirs(response_dir, exist_ok=True)
|
||||
|
||||
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
|
||||
# Sanitize tool name for use in filename
|
||||
safe_name = re.sub(r"[^\w\-]", "_", function_name)[:40]
|
||||
filename = f"{safe_name}_{timestamp}.txt"
|
||||
filepath = os.path.join(response_dir, filename)
|
||||
|
||||
with open(filepath, "w", encoding="utf-8") as f:
|
||||
f.write(function_result)
|
||||
|
||||
preview = function_result[:_LARGE_RESULT_PREVIEW_CHARS]
|
||||
return (
|
||||
f"{preview}\n\n"
|
||||
f"[Large tool response: {original_len:,} characters total — "
|
||||
f"only the first {_LARGE_RESULT_PREVIEW_CHARS:,} shown above. "
|
||||
f"Full output saved to: {filepath}\n"
|
||||
f"Use read_file or search_files on that path to access the rest.]"
|
||||
)
|
||||
except Exception as exc:
|
||||
# Fall back to destructive truncation if file write fails
|
||||
logger.warning("Failed to save large tool result to file: %s", exc)
|
||||
return (
|
||||
function_result[:_LARGE_RESULT_CHARS]
|
||||
+ f"\n\n[Truncated: tool response was {original_len:,} chars, "
|
||||
f"exceeding the {_LARGE_RESULT_CHARS:,} char limit. "
|
||||
f"File save failed: {exc}]"
|
||||
)
|
||||
|
||||
|
||||
class AIAgent:
|
||||
"""
|
||||
@@ -1216,6 +1162,33 @@ class AIAgent:
|
||||
self.session_cost_status = "unknown"
|
||||
self.session_cost_source = "none"
|
||||
|
||||
# ── Ollama num_ctx injection ──
|
||||
# Ollama defaults to 2048 context regardless of the model's capabilities.
|
||||
# When running against an Ollama server, detect the model's max context
|
||||
# and pass num_ctx on every chat request so the full window is used.
|
||||
# User override: set model.ollama_num_ctx in config.yaml to cap VRAM use.
|
||||
self._ollama_num_ctx: int | None = None
|
||||
_ollama_num_ctx_override = None
|
||||
if isinstance(_model_cfg, dict):
|
||||
_ollama_num_ctx_override = _model_cfg.get("ollama_num_ctx")
|
||||
if _ollama_num_ctx_override is not None:
|
||||
try:
|
||||
self._ollama_num_ctx = int(_ollama_num_ctx_override)
|
||||
except (TypeError, ValueError):
|
||||
logger.debug("Invalid ollama_num_ctx config value: %r", _ollama_num_ctx_override)
|
||||
if self._ollama_num_ctx is None and self.base_url and is_local_endpoint(self.base_url):
|
||||
try:
|
||||
_detected = query_ollama_num_ctx(self.model, self.base_url)
|
||||
if _detected and _detected > 0:
|
||||
self._ollama_num_ctx = _detected
|
||||
except Exception as exc:
|
||||
logger.debug("Ollama num_ctx detection failed: %s", exc)
|
||||
if self._ollama_num_ctx and not self.quiet_mode:
|
||||
logger.info(
|
||||
"Ollama num_ctx: will request %d tokens (model max from /api/show)",
|
||||
self._ollama_num_ctx,
|
||||
)
|
||||
|
||||
if not self.quiet_mode:
|
||||
if compression_enabled:
|
||||
print(f"📊 Context limit: {self.context_compressor.context_length:,} tokens (compress at {int(compression_threshold*100)}% = {self.context_compressor.threshold_tokens:,})")
|
||||
@@ -5456,6 +5429,15 @@ class AIAgent:
|
||||
if _is_nous:
|
||||
extra_body["tags"] = ["product=hermes-agent"]
|
||||
|
||||
# Ollama num_ctx: override the 2048 default so the model actually
|
||||
# uses the context window it was trained for. Passed via the OpenAI
|
||||
# SDK's extra_body → options.num_ctx, which Ollama's OpenAI-compat
|
||||
# endpoint forwards to the runner as --ctx-size.
|
||||
if self._ollama_num_ctx:
|
||||
options = extra_body.get("options", {})
|
||||
options["num_ctx"] = self._ollama_num_ctx
|
||||
extra_body["options"] = options
|
||||
|
||||
if extra_body:
|
||||
api_kwargs["extra_body"] = extra_body
|
||||
|
||||
@@ -6224,15 +6206,17 @@ class AIAgent:
|
||||
except Exception as cb_err:
|
||||
logging.debug(f"Tool complete callback error: {cb_err}")
|
||||
|
||||
# Save oversized results to file instead of destructive truncation
|
||||
function_result = _save_oversized_tool_result(name, function_result)
|
||||
function_result = maybe_persist_tool_result(
|
||||
content=function_result,
|
||||
tool_name=name,
|
||||
tool_use_id=tc.id,
|
||||
env=get_active_env(effective_task_id),
|
||||
)
|
||||
|
||||
# Discover subdirectory context files from tool arguments
|
||||
subdir_hints = self._subdirectory_hints.check_tool_call(name, args)
|
||||
if subdir_hints:
|
||||
function_result += subdir_hints
|
||||
|
||||
# Append tool result message in order
|
||||
tool_msg = {
|
||||
"role": "tool",
|
||||
"content": function_result,
|
||||
@@ -6240,6 +6224,12 @@ class AIAgent:
|
||||
}
|
||||
messages.append(tool_msg)
|
||||
|
||||
# ── Per-turn aggregate budget enforcement ─────────────────────────
|
||||
num_tools = len(parsed_calls)
|
||||
if num_tools > 0:
|
||||
turn_tool_msgs = messages[-num_tools:]
|
||||
enforce_turn_budget(turn_tool_msgs, env=get_active_env(effective_task_id))
|
||||
|
||||
# ── Budget pressure injection ────────────────────────────────────
|
||||
budget_warning = self._get_budget_warning(api_call_count)
|
||||
if budget_warning and messages and messages[-1].get("role") == "tool":
|
||||
@@ -6524,8 +6514,12 @@ class AIAgent:
|
||||
except Exception as cb_err:
|
||||
logging.debug(f"Tool complete callback error: {cb_err}")
|
||||
|
||||
# Save oversized results to file instead of destructive truncation
|
||||
function_result = _save_oversized_tool_result(function_name, function_result)
|
||||
function_result = maybe_persist_tool_result(
|
||||
content=function_result,
|
||||
tool_name=function_name,
|
||||
tool_use_id=tool_call.id,
|
||||
env=get_active_env(effective_task_id),
|
||||
)
|
||||
|
||||
# Discover subdirectory context files from tool arguments
|
||||
subdir_hints = self._subdirectory_hints.check_tool_call(function_name, function_args)
|
||||
@@ -6563,6 +6557,11 @@ class AIAgent:
|
||||
if self.tool_delay > 0 and i < len(assistant_message.tool_calls):
|
||||
time.sleep(self.tool_delay)
|
||||
|
||||
# ── Per-turn aggregate budget enforcement ─────────────────────────
|
||||
num_tools_seq = len(assistant_message.tool_calls)
|
||||
if num_tools_seq > 0:
|
||||
enforce_turn_budget(messages[-num_tools_seq:], env=get_active_env(effective_task_id))
|
||||
|
||||
# ── Budget pressure injection ─────────────────────────────────
|
||||
# After all tool calls in this turn are processed, check if we're
|
||||
# approaching max_iterations. If so, inject a warning into the LAST
|
||||
@@ -7289,6 +7288,7 @@ class AIAgent:
|
||||
codex_auth_retry_attempted=False
|
||||
anthropic_auth_retry_attempted=False
|
||||
nous_auth_retry_attempted=False
|
||||
thinking_sig_retry_attempted = False
|
||||
has_retried_429 = False
|
||||
restart_with_compressed_messages = False
|
||||
restart_with_length_continuation = False
|
||||
@@ -7391,20 +7391,30 @@ class AIAgent:
|
||||
response_invalid = True
|
||||
error_details.append("response.output is not a list")
|
||||
elif not output_items:
|
||||
# If we reach here, _run_codex_stream's backfill
|
||||
# from output_item.done events and text-delta
|
||||
# synthesis both failed to populate output.
|
||||
_resp_status = getattr(response, "status", None)
|
||||
_resp_incomplete = getattr(response, "incomplete_details", None)
|
||||
logging.warning(
|
||||
"Codex response.output is empty after stream backfill "
|
||||
"(status=%s, incomplete_details=%s, model=%s). %s",
|
||||
_resp_status, _resp_incomplete,
|
||||
getattr(response, "model", None),
|
||||
f"api_mode={self.api_mode} provider={self.provider}",
|
||||
)
|
||||
response_invalid = True
|
||||
error_details.append("response.output is empty")
|
||||
# Stream backfill may have failed, but
|
||||
# _normalize_codex_response can still recover
|
||||
# from response.output_text. Only mark invalid
|
||||
# when that fallback is also absent.
|
||||
_out_text = getattr(response, "output_text", None)
|
||||
_out_text_stripped = _out_text.strip() if isinstance(_out_text, str) else ""
|
||||
if _out_text_stripped:
|
||||
logger.debug(
|
||||
"Codex response.output is empty but output_text is present "
|
||||
"(%d chars); deferring to normalization.",
|
||||
len(_out_text_stripped),
|
||||
)
|
||||
else:
|
||||
_resp_status = getattr(response, "status", None)
|
||||
_resp_incomplete = getattr(response, "incomplete_details", None)
|
||||
logger.warning(
|
||||
"Codex response.output is empty after stream backfill "
|
||||
"(status=%s, incomplete_details=%s, model=%s). %s",
|
||||
_resp_status, _resp_incomplete,
|
||||
getattr(response, "model", None),
|
||||
f"api_mode={self.api_mode} provider={self.provider}",
|
||||
)
|
||||
response_invalid = True
|
||||
error_details.append("response.output is empty")
|
||||
elif self.api_mode == "anthropic_messages":
|
||||
content_blocks = getattr(response, "content", None) if response is not None else None
|
||||
if response is None:
|
||||
@@ -7494,7 +7504,8 @@ class AIAgent:
|
||||
}
|
||||
|
||||
# Longer backoff for rate limiting (likely cause of None choices)
|
||||
wait_time = min(5 * (2 ** (retry_count - 1)), 120) # 5s, 10s, 20s, 40s, 80s, 120s
|
||||
# Jittered exponential: 5s base, 120s cap + random jitter
|
||||
wait_time = jittered_backoff(retry_count, base_delay=5.0, max_delay=120.0)
|
||||
self._vprint(f"{self.log_prefix}⏳ Retrying in {wait_time}s (extended backoff for possible rate limit)...", force=True)
|
||||
logging.warning(f"Invalid API response (retry {retry_count}/{max_retries}): {', '.join(error_details)} | Provider: {provider_name}")
|
||||
|
||||
@@ -7867,8 +7878,38 @@ class AIAgent:
|
||||
print(f"{self.log_prefix} • Check ANTHROPIC_API_KEY in {_dhh}/.env for API keys or legacy token values")
|
||||
print(f"{self.log_prefix} • For API keys: verify at https://console.anthropic.com/settings/keys")
|
||||
print(f"{self.log_prefix} • For Claude Code: run 'claude /login' to refresh, then retry")
|
||||
print(f"{self.log_prefix} • Clear stale keys: hermes config set ANTHROPIC_TOKEN \"\"")
|
||||
print(f"{self.log_prefix} • Legacy cleanup: hermes config set ANTHROPIC_API_KEY \"\"")
|
||||
print(f"{self.log_prefix} • Legacy cleanup: hermes config set ANTHROPIC_TOKEN \"\"")
|
||||
print(f"{self.log_prefix} • Clear stale keys: hermes config set ANTHROPIC_API_KEY \"\"")
|
||||
|
||||
# ── Thinking block signature recovery ─────────────────
|
||||
# Anthropic signs thinking blocks against the full turn
|
||||
# content. Any upstream mutation (context compression,
|
||||
# session truncation, message merging) invalidates the
|
||||
# signature → HTTP 400. Recovery: strip reasoning_details
|
||||
# from all messages so the next retry sends no thinking
|
||||
# blocks at all. One-shot — don't retry infinitely.
|
||||
if (
|
||||
self.api_mode == "anthropic_messages"
|
||||
and status_code == 400
|
||||
and not thinking_sig_retry_attempted
|
||||
):
|
||||
_err_msg_lower = str(api_error).lower()
|
||||
if "signature" in _err_msg_lower and "thinking" in _err_msg_lower:
|
||||
thinking_sig_retry_attempted = True
|
||||
for _m in messages:
|
||||
if isinstance(_m, dict):
|
||||
_m.pop("reasoning_details", None)
|
||||
self._vprint(
|
||||
f"{self.log_prefix}⚠️ Thinking block signature invalid — "
|
||||
f"stripped all thinking blocks, retrying...",
|
||||
force=True,
|
||||
)
|
||||
logging.warning(
|
||||
"%sThinking block signature recovery: stripped "
|
||||
"reasoning_details from %d messages",
|
||||
self.log_prefix, len(messages),
|
||||
)
|
||||
continue
|
||||
|
||||
retry_count += 1
|
||||
elapsed_time = time.time() - api_start_time
|
||||
@@ -8351,7 +8392,7 @@ class AIAgent:
|
||||
_retry_after = min(int(_ra_raw), 120) # Cap at 2 minutes
|
||||
except (TypeError, ValueError):
|
||||
pass
|
||||
wait_time = _retry_after if _retry_after else min(2 ** retry_count, 60)
|
||||
wait_time = _retry_after if _retry_after else jittered_backoff(retry_count, base_delay=2.0, max_delay=60.0)
|
||||
if is_rate_limited:
|
||||
self._emit_status(f"⏱️ Rate limit reached. Waiting {wait_time}s before retry (attempt {retry_count + 1}/{max_retries})...")
|
||||
else:
|
||||
|
||||
@@ -1276,6 +1276,258 @@ class TestRoleAlternation:
|
||||
assert [m["role"] for m in result] == ["user", "assistant", "user"]
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Thinking block signature management
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestThinkingBlockSignatureManagement:
|
||||
"""Tests for the thinking block handling strategy:
|
||||
strip from old turns, preserve latest signed, downgrade unsigned."""
|
||||
|
||||
def test_thinking_stripped_from_non_last_assistant(self):
|
||||
"""Thinking blocks are removed from all assistant messages except the last."""
|
||||
messages = [
|
||||
{
|
||||
"role": "assistant",
|
||||
"content": "",
|
||||
"tool_calls": [
|
||||
{"id": "tc_1", "function": {"name": "tool1", "arguments": "{}"}},
|
||||
],
|
||||
"reasoning_details": [
|
||||
{"type": "thinking", "thinking": "Old reasoning.", "signature": "sig_old"},
|
||||
],
|
||||
},
|
||||
{"role": "tool", "tool_call_id": "tc_1", "content": "result 1"},
|
||||
{
|
||||
"role": "assistant",
|
||||
"content": "",
|
||||
"tool_calls": [
|
||||
{"id": "tc_2", "function": {"name": "tool2", "arguments": "{}"}},
|
||||
],
|
||||
"reasoning_details": [
|
||||
{"type": "thinking", "thinking": "Latest reasoning.", "signature": "sig_new"},
|
||||
],
|
||||
},
|
||||
{"role": "tool", "tool_call_id": "tc_2", "content": "result 2"},
|
||||
]
|
||||
_, result = convert_messages_to_anthropic(messages)
|
||||
|
||||
# Find both assistant messages
|
||||
assistants = [m for m in result if m["role"] == "assistant"]
|
||||
assert len(assistants) == 2
|
||||
|
||||
# First (non-last) assistant: no thinking blocks
|
||||
first_types = [b.get("type") for b in assistants[0]["content"]]
|
||||
assert "thinking" not in first_types
|
||||
assert "redacted_thinking" not in first_types
|
||||
assert "tool_use" in first_types # tool_use should survive
|
||||
|
||||
# Last assistant: thinking block preserved with signature
|
||||
last_blocks = assistants[1]["content"]
|
||||
thinking_blocks = [b for b in last_blocks if b.get("type") == "thinking"]
|
||||
assert len(thinking_blocks) == 1
|
||||
assert thinking_blocks[0]["thinking"] == "Latest reasoning."
|
||||
assert thinking_blocks[0]["signature"] == "sig_new"
|
||||
|
||||
def test_signed_thinking_preserved_on_last_turn(self):
|
||||
"""A signed thinking block on the last assistant message is kept."""
|
||||
messages = [
|
||||
{
|
||||
"role": "assistant",
|
||||
"content": "The answer is 42.",
|
||||
"reasoning_details": [
|
||||
{"type": "thinking", "thinking": "Deep thought.", "signature": "sig_valid"},
|
||||
],
|
||||
},
|
||||
]
|
||||
_, result = convert_messages_to_anthropic(messages)
|
||||
blocks = result[0]["content"]
|
||||
thinking = [b for b in blocks if b.get("type") == "thinking"]
|
||||
assert len(thinking) == 1
|
||||
assert thinking[0]["signature"] == "sig_valid"
|
||||
|
||||
def test_unsigned_thinking_downgraded_to_text_on_last_turn(self):
|
||||
"""Unsigned thinking blocks on the last turn become text blocks."""
|
||||
messages = [
|
||||
{
|
||||
"role": "assistant",
|
||||
"content": "Response text.",
|
||||
"reasoning_details": [
|
||||
{"type": "thinking", "thinking": "Unsigned reasoning."},
|
||||
# No 'signature' field
|
||||
],
|
||||
},
|
||||
]
|
||||
_, result = convert_messages_to_anthropic(messages)
|
||||
blocks = result[0]["content"]
|
||||
|
||||
# No thinking blocks should remain
|
||||
assert not any(b.get("type") == "thinking" for b in blocks)
|
||||
# The reasoning text should be preserved as a text block
|
||||
text_contents = [b.get("text", "") for b in blocks if b.get("type") == "text"]
|
||||
assert "Unsigned reasoning." in text_contents
|
||||
|
||||
def test_redacted_thinking_with_data_preserved(self):
|
||||
"""Redacted thinking with 'data' field is kept on last turn."""
|
||||
messages = [
|
||||
{
|
||||
"role": "assistant",
|
||||
"content": "Response.",
|
||||
"reasoning_details": [
|
||||
{"type": "redacted_thinking", "data": "opaque_signature_data"},
|
||||
],
|
||||
},
|
||||
]
|
||||
_, result = convert_messages_to_anthropic(messages)
|
||||
blocks = result[0]["content"]
|
||||
redacted = [b for b in blocks if b.get("type") == "redacted_thinking"]
|
||||
assert len(redacted) == 1
|
||||
assert redacted[0]["data"] == "opaque_signature_data"
|
||||
|
||||
def test_redacted_thinking_without_data_dropped(self):
|
||||
"""Redacted thinking without 'data' is dropped — can't be validated."""
|
||||
messages = [
|
||||
{
|
||||
"role": "assistant",
|
||||
"content": "Response.",
|
||||
"reasoning_details": [
|
||||
{"type": "redacted_thinking"},
|
||||
# No 'data' field
|
||||
],
|
||||
},
|
||||
]
|
||||
_, result = convert_messages_to_anthropic(messages)
|
||||
blocks = result[0]["content"]
|
||||
assert not any(b.get("type") == "redacted_thinking" for b in blocks)
|
||||
|
||||
def test_cache_control_stripped_from_thinking_blocks(self):
|
||||
"""cache_control markers are removed from thinking/redacted_thinking blocks."""
|
||||
messages = [
|
||||
{
|
||||
"role": "assistant",
|
||||
"content": "",
|
||||
"tool_calls": [
|
||||
{"id": "tc_1", "function": {"name": "t", "arguments": "{}"}},
|
||||
],
|
||||
"reasoning_details": [
|
||||
{
|
||||
"type": "thinking",
|
||||
"thinking": "Reasoning.",
|
||||
"signature": "sig_1",
|
||||
"cache_control": {"type": "ephemeral"},
|
||||
},
|
||||
],
|
||||
},
|
||||
{"role": "tool", "tool_call_id": "tc_1", "content": "result"},
|
||||
]
|
||||
_, result = convert_messages_to_anthropic(messages)
|
||||
assistant = next(m for m in result if m["role"] == "assistant")
|
||||
for block in assistant["content"]:
|
||||
if block.get("type") in ("thinking", "redacted_thinking"):
|
||||
assert "cache_control" not in block
|
||||
|
||||
def test_thinking_stripped_from_merged_consecutive_assistants(self):
|
||||
"""When consecutive assistants are merged, second one's thinking is dropped."""
|
||||
messages = [
|
||||
{
|
||||
"role": "assistant",
|
||||
"content": "First response.",
|
||||
"reasoning_details": [
|
||||
{"type": "thinking", "thinking": "First thought.", "signature": "sig_1"},
|
||||
],
|
||||
},
|
||||
{
|
||||
"role": "assistant",
|
||||
"content": "Second response.",
|
||||
"reasoning_details": [
|
||||
{"type": "thinking", "thinking": "Second thought.", "signature": "sig_2"},
|
||||
],
|
||||
},
|
||||
]
|
||||
_, result = convert_messages_to_anthropic(messages)
|
||||
|
||||
# Should be merged into one assistant message
|
||||
assistants = [m for m in result if m["role"] == "assistant"]
|
||||
assert len(assistants) == 1
|
||||
|
||||
# Only the first thinking block should remain (signed, on the last/only assistant)
|
||||
blocks = assistants[0]["content"]
|
||||
thinking = [b for b in blocks if b.get("type") == "thinking"]
|
||||
assert len(thinking) == 1
|
||||
assert thinking[0]["thinking"] == "First thought."
|
||||
|
||||
def test_empty_content_after_strip_gets_placeholder(self):
|
||||
"""If stripping thinking leaves an empty message, a placeholder is added."""
|
||||
messages = [
|
||||
{
|
||||
"role": "assistant",
|
||||
"content": "",
|
||||
"reasoning_details": [
|
||||
{"type": "thinking", "thinking": "Only thinking, no text."},
|
||||
# Unsigned — will be downgraded, but content was empty string
|
||||
],
|
||||
},
|
||||
{"role": "user", "content": "Next message."},
|
||||
{"role": "assistant", "content": "Final."},
|
||||
]
|
||||
_, result = convert_messages_to_anthropic(messages)
|
||||
# First assistant is non-last, so thinking is stripped completely.
|
||||
# The original content was empty and thinking was unsigned → placeholder
|
||||
first_assistant = result[0]
|
||||
assert first_assistant["role"] == "assistant"
|
||||
assert len(first_assistant["content"]) >= 1
|
||||
|
||||
def test_multi_turn_conversation_preserves_only_last(self):
|
||||
"""Full multi-turn conversation: only last assistant keeps thinking."""
|
||||
messages = [
|
||||
{"role": "user", "content": "Question 1"},
|
||||
{
|
||||
"role": "assistant",
|
||||
"content": "Answer 1",
|
||||
"reasoning_details": [
|
||||
{"type": "thinking", "thinking": "Thought 1", "signature": "sig_1"},
|
||||
],
|
||||
},
|
||||
{"role": "user", "content": "Question 2"},
|
||||
{
|
||||
"role": "assistant",
|
||||
"content": "Answer 2",
|
||||
"reasoning_details": [
|
||||
{"type": "thinking", "thinking": "Thought 2", "signature": "sig_2"},
|
||||
],
|
||||
},
|
||||
{"role": "user", "content": "Question 3"},
|
||||
{
|
||||
"role": "assistant",
|
||||
"content": "Answer 3",
|
||||
"reasoning_details": [
|
||||
{"type": "thinking", "thinking": "Thought 3", "signature": "sig_3"},
|
||||
],
|
||||
},
|
||||
]
|
||||
_, result = convert_messages_to_anthropic(messages)
|
||||
|
||||
assistants = [m for m in result if m["role"] == "assistant"]
|
||||
assert len(assistants) == 3
|
||||
|
||||
# First two: no thinking blocks
|
||||
for a in assistants[:2]:
|
||||
assert not any(
|
||||
b.get("type") in ("thinking", "redacted_thinking")
|
||||
for b in a["content"]
|
||||
if isinstance(b, dict)
|
||||
)
|
||||
|
||||
# Last one: thinking preserved
|
||||
last_thinking = [
|
||||
b for b in assistants[2]["content"]
|
||||
if isinstance(b, dict) and b.get("type") == "thinking"
|
||||
]
|
||||
assert len(last_thinking) == 1
|
||||
assert last_thinking[0]["signature"] == "sig_3"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Tool choice
|
||||
# ---------------------------------------------------------------------------
|
||||
@@ -471,6 +471,23 @@ class TestExplicitProviderRouting:
|
||||
client, model = resolve_provider_client("zai")
|
||||
assert client is not None
|
||||
|
||||
def test_explicit_google_alias_uses_gemini_credentials(self):
|
||||
"""provider='google' should route through the gemini API-key provider."""
|
||||
with (
|
||||
patch("hermes_cli.auth.resolve_api_key_provider_credentials", return_value={
|
||||
"api_key": "gemini-key",
|
||||
"base_url": "https://generativelanguage.googleapis.com/v1beta/openai",
|
||||
}),
|
||||
patch("agent.auxiliary_client.OpenAI") as mock_openai,
|
||||
):
|
||||
mock_openai.return_value = MagicMock()
|
||||
client, model = resolve_provider_client("google", model="gemini-3.1-pro-preview")
|
||||
|
||||
assert client is not None
|
||||
assert model == "gemini-3.1-pro-preview"
|
||||
assert mock_openai.call_args.kwargs["api_key"] == "gemini-key"
|
||||
assert mock_openai.call_args.kwargs["base_url"] == "https://generativelanguage.googleapis.com/v1beta/openai"
|
||||
|
||||
def test_explicit_unknown_returns_none(self, monkeypatch):
|
||||
"""Unknown provider should return None."""
|
||||
client, model = resolve_provider_client("nonexistent-provider")
|
||||
@@ -624,12 +641,15 @@ class TestVisionClientFallback:
|
||||
assert client is None
|
||||
assert model is None
|
||||
|
||||
def test_vision_auto_includes_anthropic_when_configured(self, monkeypatch):
|
||||
monkeypatch.setenv("ANTHROPIC_API_KEY", "sk-ant-api03-key")
|
||||
def test_vision_auto_includes_active_provider_when_configured(self, monkeypatch):
|
||||
"""Active provider appears in available backends when credentials exist."""
|
||||
monkeypatch.setenv("ANTHROPIC_API_KEY", "***")
|
||||
with (
|
||||
patch("agent.auxiliary_client._read_nous_auth", return_value=None),
|
||||
patch("agent.auxiliary_client._read_main_provider", return_value="anthropic"),
|
||||
patch("agent.auxiliary_client._read_main_model", return_value="claude-sonnet-4"),
|
||||
patch("agent.anthropic_adapter.build_anthropic_client", return_value=MagicMock()),
|
||||
patch("agent.anthropic_adapter.resolve_anthropic_token", return_value="sk-ant-api03-key"),
|
||||
patch("agent.anthropic_adapter.resolve_anthropic_token", return_value="***"),
|
||||
):
|
||||
backends = get_available_vision_backends()
|
||||
|
||||
@@ -702,88 +722,50 @@ class TestAuxiliaryPoolAwareness:
|
||||
assert call_kwargs["base_url"] == "https://api.githubcopilot.com"
|
||||
assert call_kwargs["default_headers"]["Editor-Version"]
|
||||
|
||||
def test_vision_auto_uses_anthropic_when_no_higher_priority_backend(self, monkeypatch):
|
||||
monkeypatch.setenv("ANTHROPIC_API_KEY", "sk-ant-api03-key")
|
||||
def test_vision_auto_uses_active_provider_as_fallback(self, monkeypatch):
|
||||
"""When no OpenRouter/Nous available, vision auto falls back to active provider."""
|
||||
monkeypatch.setenv("ANTHROPIC_API_KEY", "***")
|
||||
with (
|
||||
patch("agent.auxiliary_client._read_nous_auth", return_value=None),
|
||||
patch("agent.auxiliary_client._read_main_provider", return_value="anthropic"),
|
||||
patch("agent.auxiliary_client._read_main_model", return_value="claude-sonnet-4"),
|
||||
patch("agent.anthropic_adapter.build_anthropic_client", return_value=MagicMock()),
|
||||
patch("agent.anthropic_adapter.resolve_anthropic_token", return_value="sk-ant-api03-key"),
|
||||
patch("agent.anthropic_adapter.resolve_anthropic_token", return_value="***"),
|
||||
):
|
||||
client, model = get_vision_auxiliary_client()
|
||||
|
||||
assert client is not None
|
||||
assert client.__class__.__name__ == "AnthropicAuxiliaryClient"
|
||||
assert model == "claude-haiku-4-5-20251001"
|
||||
|
||||
def test_selected_anthropic_provider_is_preferred_for_vision_auto(self, monkeypatch):
|
||||
def test_vision_auto_prefers_openrouter_over_active_provider(self, monkeypatch):
|
||||
"""OpenRouter is tried before the active provider in vision auto."""
|
||||
monkeypatch.setenv("OPENROUTER_API_KEY", "or-key")
|
||||
monkeypatch.setenv("ANTHROPIC_API_KEY", "sk-ant-api03-key")
|
||||
|
||||
def fake_load_config():
|
||||
return {"model": {"provider": "anthropic", "default": "claude-sonnet-4-6"}}
|
||||
monkeypatch.setenv("ANTHROPIC_API_KEY", "***")
|
||||
|
||||
with (
|
||||
patch("agent.auxiliary_client._read_nous_auth", return_value=None),
|
||||
patch("agent.anthropic_adapter.build_anthropic_client", return_value=MagicMock()),
|
||||
patch("agent.anthropic_adapter.resolve_anthropic_token", return_value="sk-ant-api03-key"),
|
||||
patch("agent.auxiliary_client._read_main_provider", return_value="anthropic"),
|
||||
patch("agent.auxiliary_client._read_main_model", return_value="claude-sonnet-4"),
|
||||
patch("agent.auxiliary_client.OpenAI") as mock_openai,
|
||||
patch("hermes_cli.config.load_config", fake_load_config),
|
||||
):
|
||||
client, model = get_vision_auxiliary_client()
|
||||
|
||||
assert client is not None
|
||||
assert client.__class__.__name__ == "AnthropicAuxiliaryClient"
|
||||
assert model == "claude-haiku-4-5-20251001"
|
||||
|
||||
def test_selected_codex_provider_short_circuits_vision_auto(self, monkeypatch):
|
||||
def fake_load_config():
|
||||
return {"model": {"provider": "openai-codex", "default": "gpt-5.2-codex"}}
|
||||
|
||||
codex_client = MagicMock()
|
||||
with (
|
||||
patch("hermes_cli.config.load_config", fake_load_config),
|
||||
patch("agent.auxiliary_client._try_codex", return_value=(codex_client, "gpt-5.2-codex")) as mock_codex,
|
||||
patch("agent.auxiliary_client._try_openrouter") as mock_openrouter,
|
||||
patch("agent.auxiliary_client._try_nous") as mock_nous,
|
||||
patch("agent.auxiliary_client._try_anthropic") as mock_anthropic,
|
||||
patch("agent.auxiliary_client._try_custom_endpoint") as mock_custom,
|
||||
):
|
||||
provider, client, model = resolve_vision_provider_client()
|
||||
|
||||
assert provider == "openai-codex"
|
||||
assert client is codex_client
|
||||
assert model == "gpt-5.2-codex"
|
||||
mock_codex.assert_called_once()
|
||||
mock_openrouter.assert_not_called()
|
||||
mock_nous.assert_not_called()
|
||||
mock_anthropic.assert_not_called()
|
||||
mock_custom.assert_not_called()
|
||||
# OpenRouter should win over anthropic active provider
|
||||
assert provider == "openrouter"
|
||||
|
||||
def test_vision_auto_includes_codex(self, codex_auth_dir):
|
||||
"""Codex supports vision (gpt-5.3-codex), so auto mode should use it."""
|
||||
with patch("agent.auxiliary_client._read_nous_auth", return_value=None), \
|
||||
patch("agent.auxiliary_client.OpenAI"):
|
||||
client, model = get_vision_auxiliary_client()
|
||||
from agent.auxiliary_client import CodexAuxiliaryClient
|
||||
assert isinstance(client, CodexAuxiliaryClient)
|
||||
assert model == "gpt-5.2-codex"
|
||||
|
||||
def test_vision_auto_falls_back_to_custom_endpoint(self, monkeypatch):
|
||||
"""Custom endpoint is used as fallback in vision auto mode.
|
||||
|
||||
Many local models (Qwen-VL, LLaVA, etc.) support vision.
|
||||
When no OpenRouter/Nous/Codex is available, try the custom endpoint.
|
||||
"""
|
||||
def test_vision_auto_uses_named_custom_as_active_provider(self, monkeypatch):
|
||||
"""Named custom provider works as active provider fallback in vision auto."""
|
||||
monkeypatch.delenv("OPENROUTER_API_KEY", raising=False)
|
||||
monkeypatch.delenv("ANTHROPIC_API_KEY", raising=False)
|
||||
with patch("agent.auxiliary_client._read_nous_auth", return_value=None), \
|
||||
patch("agent.auxiliary_client._select_pool_entry", return_value=(False, None)), \
|
||||
patch("agent.auxiliary_client._read_codex_access_token", return_value=None), \
|
||||
patch("agent.auxiliary_client._resolve_custom_runtime",
|
||||
return_value=("http://localhost:1234/v1", "local-key")), \
|
||||
patch("agent.auxiliary_client.OpenAI") as mock_openai:
|
||||
client, model = get_vision_auxiliary_client()
|
||||
assert client is not None # Custom endpoint picked up as fallback
|
||||
patch("agent.auxiliary_client._read_main_provider", return_value="custom:local"), \
|
||||
patch("agent.auxiliary_client._read_main_model", return_value="my-local-model"), \
|
||||
patch("agent.auxiliary_client.resolve_provider_client",
|
||||
return_value=(MagicMock(), "my-local-model")) as mock_resolve:
|
||||
provider, client, model = resolve_vision_provider_client()
|
||||
assert client is not None
|
||||
assert provider == "custom:local"
|
||||
|
||||
def test_vision_direct_endpoint_override(self, monkeypatch):
|
||||
monkeypatch.setenv("OPENROUTER_API_KEY", "or-key")
|
||||
@@ -822,6 +804,31 @@ class TestAuxiliaryPoolAwareness:
|
||||
assert model == "google/gemini-3-flash-preview"
|
||||
assert client is not None
|
||||
|
||||
def test_vision_config_google_provider_uses_gemini_credentials(self, monkeypatch):
|
||||
config = {
|
||||
"auxiliary": {
|
||||
"vision": {
|
||||
"provider": "google",
|
||||
"model": "gemini-3.1-pro-preview",
|
||||
}
|
||||
}
|
||||
}
|
||||
monkeypatch.setattr("hermes_cli.config.load_config", lambda: config)
|
||||
with (
|
||||
patch("hermes_cli.auth.resolve_api_key_provider_credentials", return_value={
|
||||
"api_key": "gemini-key",
|
||||
"base_url": "https://generativelanguage.googleapis.com/v1beta/openai",
|
||||
}),
|
||||
patch("agent.auxiliary_client.OpenAI") as mock_openai,
|
||||
):
|
||||
resolved_provider, client, model = resolve_vision_provider_client()
|
||||
|
||||
assert resolved_provider == "gemini"
|
||||
assert client is not None
|
||||
assert model == "gemini-3.1-pro-preview"
|
||||
assert mock_openai.call_args.kwargs["api_key"] == "gemini-key"
|
||||
assert mock_openai.call_args.kwargs["base_url"] == "https://generativelanguage.googleapis.com/v1beta/openai"
|
||||
|
||||
def test_vision_forced_main_uses_custom_endpoint(self, monkeypatch):
|
||||
"""When explicitly forced to 'main', vision CAN use custom endpoint."""
|
||||
config = {
|
||||
@@ -846,7 +853,14 @@ class TestAuxiliaryPoolAwareness:
|
||||
monkeypatch.setenv("AUXILIARY_VISION_PROVIDER", "main")
|
||||
monkeypatch.delenv("OPENAI_BASE_URL", raising=False)
|
||||
monkeypatch.delenv("OPENAI_API_KEY", raising=False)
|
||||
# Clear client cache to avoid stale entries from previous tests
|
||||
from agent.auxiliary_client import _client_cache
|
||||
_client_cache.clear()
|
||||
with patch("agent.auxiliary_client._read_nous_auth", return_value=None), \
|
||||
patch("agent.auxiliary_client._read_main_provider", return_value=""), \
|
||||
patch("agent.auxiliary_client._read_main_model", return_value=""), \
|
||||
patch("agent.auxiliary_client._select_pool_entry", return_value=(False, None)), \
|
||||
patch("agent.auxiliary_client._resolve_custom_runtime", return_value=(None, None)), \
|
||||
patch("agent.auxiliary_client._read_codex_access_token", return_value=None), \
|
||||
patch("agent.auxiliary_client._resolve_api_key_provider", return_value=(None, None)):
|
||||
client, model = get_vision_auxiliary_client()
|
||||
|
||||
@@ -13,7 +13,7 @@ from unittest.mock import patch, MagicMock
|
||||
import pytest
|
||||
import yaml
|
||||
|
||||
sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
|
||||
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", ".."))
|
||||
|
||||
|
||||
def _run_auxiliary_bridge(config_dict, monkeypatch):
|
||||
@@ -199,7 +199,7 @@ class TestGatewayBridgeCodeParity:
|
||||
|
||||
def test_gateway_has_auxiliary_bridge(self):
|
||||
"""The gateway config bridge must include auxiliary.* bridging."""
|
||||
gateway_path = Path(__file__).parent.parent / "gateway" / "run.py"
|
||||
gateway_path = Path(__file__).parent.parent.parent / "gateway" / "run.py"
|
||||
content = gateway_path.read_text()
|
||||
# Check for key patterns that indicate the bridge is present
|
||||
assert "AUXILIARY_VISION_PROVIDER" in content
|
||||
@@ -213,7 +213,7 @@ class TestGatewayBridgeCodeParity:
|
||||
|
||||
def test_gateway_no_compression_env_bridge(self):
|
||||
"""Gateway should NOT bridge compression config to env vars (config-only)."""
|
||||
gateway_path = Path(__file__).parent.parent / "gateway" / "run.py"
|
||||
gateway_path = Path(__file__).parent.parent.parent / "gateway" / "run.py"
|
||||
content = gateway_path.read_text()
|
||||
assert "CONTEXT_COMPRESSION_PROVIDER" not in content
|
||||
assert "CONTEXT_COMPRESSION_MODEL" not in content
|
||||
@@ -0,0 +1,151 @@
|
||||
"""Tests for named custom provider and 'main' alias resolution in auxiliary_client."""
|
||||
|
||||
import os
|
||||
from unittest.mock import patch, MagicMock
|
||||
|
||||
import pytest
|
||||
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def _isolate(tmp_path, monkeypatch):
|
||||
"""Redirect HERMES_HOME and clear module caches."""
|
||||
hermes_home = tmp_path / ".hermes"
|
||||
hermes_home.mkdir()
|
||||
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
|
||||
# Write a minimal config so load_config doesn't fail
|
||||
(hermes_home / "config.yaml").write_text("model:\n default: test-model\n")
|
||||
|
||||
|
||||
def _write_config(tmp_path, config_dict):
|
||||
"""Write a config.yaml to the test HERMES_HOME."""
|
||||
import yaml
|
||||
config_path = tmp_path / ".hermes" / "config.yaml"
|
||||
config_path.write_text(yaml.dump(config_dict))
|
||||
|
||||
|
||||
class TestNormalizeVisionProvider:
|
||||
"""_normalize_vision_provider should resolve 'main' to actual main provider."""
|
||||
|
||||
def test_main_resolves_to_named_custom(self, tmp_path):
|
||||
_write_config(tmp_path, {
|
||||
"model": {"default": "my-model", "provider": "custom:beans"},
|
||||
"custom_providers": [{"name": "beans", "base_url": "http://localhost/v1"}],
|
||||
})
|
||||
from agent.auxiliary_client import _normalize_vision_provider
|
||||
assert _normalize_vision_provider("main") == "custom:beans"
|
||||
|
||||
def test_main_resolves_to_openrouter(self, tmp_path):
|
||||
_write_config(tmp_path, {
|
||||
"model": {"default": "anthropic/claude-sonnet-4", "provider": "openrouter"},
|
||||
})
|
||||
from agent.auxiliary_client import _normalize_vision_provider
|
||||
assert _normalize_vision_provider("main") == "openrouter"
|
||||
|
||||
def test_main_resolves_to_deepseek(self, tmp_path):
|
||||
_write_config(tmp_path, {
|
||||
"model": {"default": "deepseek-chat", "provider": "deepseek"},
|
||||
})
|
||||
from agent.auxiliary_client import _normalize_vision_provider
|
||||
assert _normalize_vision_provider("main") == "deepseek"
|
||||
|
||||
def test_main_falls_back_to_custom_when_no_provider(self, tmp_path):
|
||||
_write_config(tmp_path, {"model": {"default": "gpt-4o"}})
|
||||
from agent.auxiliary_client import _normalize_vision_provider
|
||||
assert _normalize_vision_provider("main") == "custom"
|
||||
|
||||
def test_bare_provider_name_unchanged(self):
|
||||
from agent.auxiliary_client import _normalize_vision_provider
|
||||
assert _normalize_vision_provider("beans") == "beans"
|
||||
assert _normalize_vision_provider("deepseek") == "deepseek"
|
||||
|
||||
def test_codex_alias_still_works(self):
|
||||
from agent.auxiliary_client import _normalize_vision_provider
|
||||
assert _normalize_vision_provider("codex") == "openai-codex"
|
||||
|
||||
def test_auto_unchanged(self):
|
||||
from agent.auxiliary_client import _normalize_vision_provider
|
||||
assert _normalize_vision_provider("auto") == "auto"
|
||||
assert _normalize_vision_provider(None) == "auto"
|
||||
|
||||
|
||||
class TestResolveProviderClientMainAlias:
|
||||
"""resolve_provider_client('main', ...) should resolve to actual main provider."""
|
||||
|
||||
def test_main_resolves_to_named_custom_provider(self, tmp_path):
|
||||
_write_config(tmp_path, {
|
||||
"model": {"default": "my-model", "provider": "beans"},
|
||||
"custom_providers": [
|
||||
{"name": "beans", "base_url": "http://beans.local/v1", "api_key": "k"},
|
||||
],
|
||||
})
|
||||
from agent.auxiliary_client import resolve_provider_client
|
||||
client, model = resolve_provider_client("main", "override-model")
|
||||
assert client is not None
|
||||
assert model == "override-model"
|
||||
assert "beans.local" in str(client.base_url)
|
||||
|
||||
def test_main_with_custom_colon_prefix(self, tmp_path):
|
||||
_write_config(tmp_path, {
|
||||
"model": {"default": "my-model", "provider": "custom:beans"},
|
||||
"custom_providers": [
|
||||
{"name": "beans", "base_url": "http://beans.local/v1", "api_key": "k"},
|
||||
],
|
||||
})
|
||||
from agent.auxiliary_client import resolve_provider_client
|
||||
client, model = resolve_provider_client("main", "test")
|
||||
assert client is not None
|
||||
assert "beans.local" in str(client.base_url)
|
||||
|
||||
|
||||
class TestResolveProviderClientNamedCustom:
|
||||
"""resolve_provider_client should resolve named custom providers directly."""
|
||||
|
||||
def test_named_custom_provider(self, tmp_path):
|
||||
_write_config(tmp_path, {
|
||||
"model": {"default": "test-model"},
|
||||
"custom_providers": [
|
||||
{"name": "beans", "base_url": "http://beans.local/v1", "api_key": "k"},
|
||||
],
|
||||
})
|
||||
from agent.auxiliary_client import resolve_provider_client
|
||||
client, model = resolve_provider_client("beans", "my-model")
|
||||
assert client is not None
|
||||
assert model == "my-model"
|
||||
assert "beans.local" in str(client.base_url)
|
||||
|
||||
def test_named_custom_provider_default_model(self, tmp_path):
|
||||
_write_config(tmp_path, {
|
||||
"model": {"default": "main-model"},
|
||||
"custom_providers": [
|
||||
{"name": "beans", "base_url": "http://beans.local/v1", "api_key": "k"},
|
||||
],
|
||||
})
|
||||
from agent.auxiliary_client import resolve_provider_client
|
||||
client, model = resolve_provider_client("beans")
|
||||
assert client is not None
|
||||
# Should use _read_main_model() fallback
|
||||
assert model == "main-model"
|
||||
|
||||
def test_named_custom_no_api_key_uses_fallback(self, tmp_path):
|
||||
_write_config(tmp_path, {
|
||||
"model": {"default": "test"},
|
||||
"custom_providers": [
|
||||
{"name": "local", "base_url": "http://localhost:8080/v1"},
|
||||
],
|
||||
})
|
||||
from agent.auxiliary_client import resolve_provider_client
|
||||
client, model = resolve_provider_client("local", "test")
|
||||
assert client is not None
|
||||
# no-key-required should be used
|
||||
|
||||
def test_nonexistent_named_custom_falls_through(self, tmp_path):
|
||||
_write_config(tmp_path, {
|
||||
"model": {"default": "test"},
|
||||
"custom_providers": [
|
||||
{"name": "beans", "base_url": "http://beans.local/v1"},
|
||||
],
|
||||
})
|
||||
from agent.auxiliary_client import resolve_provider_client
|
||||
# "coffee" doesn't exist in custom_providers
|
||||
client, model = resolve_provider_client("coffee", "test")
|
||||
assert client is None
|
||||
@@ -0,0 +1,42 @@
|
||||
"""Tests for MiniMax auxiliary client URL normalization.
|
||||
|
||||
MiniMax and MiniMax-CN set inference_base_url to the /anthropic path.
|
||||
The auxiliary client uses the OpenAI SDK, which needs /v1 instead.
|
||||
"""
|
||||
|
||||
import sys
|
||||
import os
|
||||
|
||||
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", ".."))
|
||||
|
||||
from agent.auxiliary_client import _to_openai_base_url
|
||||
|
||||
|
||||
class TestToOpenaiBaseUrl:
|
||||
def test_minimax_global_anthropic_suffix_replaced(self):
|
||||
assert _to_openai_base_url("https://api.minimax.io/anthropic") == "https://api.minimax.io/v1"
|
||||
|
||||
def test_minimax_cn_anthropic_suffix_replaced(self):
|
||||
assert _to_openai_base_url("https://api.minimaxi.com/anthropic") == "https://api.minimaxi.com/v1"
|
||||
|
||||
def test_trailing_slash_stripped_before_replace(self):
|
||||
assert _to_openai_base_url("https://api.minimax.io/anthropic/") == "https://api.minimax.io/v1"
|
||||
|
||||
def test_v1_url_unchanged(self):
|
||||
assert _to_openai_base_url("https://api.openai.com/v1") == "https://api.openai.com/v1"
|
||||
|
||||
def test_openrouter_url_unchanged(self):
|
||||
assert _to_openai_base_url("https://openrouter.ai/api/v1") == "https://openrouter.ai/api/v1"
|
||||
|
||||
def test_anthropic_domain_unchanged(self):
|
||||
"""api.anthropic.com doesn't end with /anthropic — should be untouched."""
|
||||
assert _to_openai_base_url("https://api.anthropic.com") == "https://api.anthropic.com"
|
||||
|
||||
def test_anthropic_in_subpath_unchanged(self):
|
||||
assert _to_openai_base_url("https://example.com/anthropic/extra") == "https://example.com/anthropic/extra"
|
||||
|
||||
def test_empty_string(self):
|
||||
assert _to_openai_base_url("") == ""
|
||||
|
||||
def test_none(self):
|
||||
assert _to_openai_base_url(None) == ""
|
||||
@@ -0,0 +1,105 @@
|
||||
"""Tests for MiniMax provider hardening — context lengths, thinking guard, catalog."""
|
||||
|
||||
|
||||
class TestMinimaxContextLengths:
|
||||
"""Verify per-model context length entries for MiniMax models."""
|
||||
|
||||
def test_m1_variants_have_1m_context(self):
|
||||
from agent.model_metadata import DEFAULT_CONTEXT_LENGTHS
|
||||
# Keys are lowercase because the lookup lowercases model names
|
||||
for model in ("minimax-m1", "minimax-m1-40k", "minimax-m1-80k",
|
||||
"minimax-m1-128k", "minimax-m1-256k"):
|
||||
assert model in DEFAULT_CONTEXT_LENGTHS, f"{model} missing from context lengths"
|
||||
assert DEFAULT_CONTEXT_LENGTHS[model] == 1_000_000, f"{model} expected 1M"
|
||||
|
||||
def test_m2_variants_have_1m_context(self):
|
||||
from agent.model_metadata import DEFAULT_CONTEXT_LENGTHS
|
||||
# Keys are lowercase because the lookup lowercases model names
|
||||
for model in ("minimax-m2.5", "minimax-m2.7"):
|
||||
assert model in DEFAULT_CONTEXT_LENGTHS, f"{model} missing from context lengths"
|
||||
assert DEFAULT_CONTEXT_LENGTHS[model] == 1_048_576, f"{model} expected 1048576"
|
||||
|
||||
def test_minimax_prefix_fallback(self):
|
||||
from agent.model_metadata import DEFAULT_CONTEXT_LENGTHS
|
||||
# The generic "minimax" prefix entry should be 1M for unknown models
|
||||
assert DEFAULT_CONTEXT_LENGTHS["minimax"] == 1_048_576
|
||||
|
||||
|
||||
|
||||
class TestMinimaxThinkingGuard:
|
||||
"""Verify that build_anthropic_kwargs does NOT add thinking params for MiniMax models."""
|
||||
|
||||
def test_no_thinking_for_minimax_m27(self):
|
||||
from agent.anthropic_adapter import build_anthropic_kwargs
|
||||
kwargs = build_anthropic_kwargs(
|
||||
model="MiniMax-M2.7",
|
||||
messages=[{"role": "user", "content": "hello"}],
|
||||
tools=None,
|
||||
max_tokens=4096,
|
||||
reasoning_config={"enabled": True, "effort": "medium"},
|
||||
)
|
||||
assert "thinking" not in kwargs
|
||||
assert "output_config" not in kwargs
|
||||
|
||||
def test_no_thinking_for_minimax_m1(self):
|
||||
from agent.anthropic_adapter import build_anthropic_kwargs
|
||||
kwargs = build_anthropic_kwargs(
|
||||
model="MiniMax-M1-128k",
|
||||
messages=[{"role": "user", "content": "hello"}],
|
||||
tools=None,
|
||||
max_tokens=4096,
|
||||
reasoning_config={"enabled": True, "effort": "high"},
|
||||
)
|
||||
assert "thinking" not in kwargs
|
||||
|
||||
def test_thinking_still_works_for_claude(self):
|
||||
from agent.anthropic_adapter import build_anthropic_kwargs
|
||||
kwargs = build_anthropic_kwargs(
|
||||
model="claude-sonnet-4-20250514",
|
||||
messages=[{"role": "user", "content": "hello"}],
|
||||
tools=None,
|
||||
max_tokens=4096,
|
||||
reasoning_config={"enabled": True, "effort": "medium"},
|
||||
)
|
||||
assert "thinking" in kwargs
|
||||
|
||||
|
||||
class TestMinimaxAuxModel:
|
||||
"""Verify auxiliary model is standard (not highspeed)."""
|
||||
|
||||
def test_minimax_aux_is_standard(self):
|
||||
from agent.auxiliary_client import _API_KEY_PROVIDER_AUX_MODELS
|
||||
assert _API_KEY_PROVIDER_AUX_MODELS["minimax"] == "MiniMax-M2.7"
|
||||
assert _API_KEY_PROVIDER_AUX_MODELS["minimax-cn"] == "MiniMax-M2.7"
|
||||
|
||||
def test_minimax_aux_not_highspeed(self):
|
||||
from agent.auxiliary_client import _API_KEY_PROVIDER_AUX_MODELS
|
||||
assert "highspeed" not in _API_KEY_PROVIDER_AUX_MODELS["minimax"]
|
||||
assert "highspeed" not in _API_KEY_PROVIDER_AUX_MODELS["minimax-cn"]
|
||||
|
||||
|
||||
class TestMinimaxModelCatalog:
|
||||
"""Verify the model catalog includes M1 family and excludes deprecated models."""
|
||||
|
||||
def test_catalog_includes_m1_family(self):
|
||||
from hermes_cli.models import _PROVIDER_MODELS
|
||||
for provider in ("minimax", "minimax-cn"):
|
||||
models = _PROVIDER_MODELS[provider]
|
||||
assert "MiniMax-M1" in models
|
||||
assert "MiniMax-M1-40k" in models
|
||||
assert "MiniMax-M1-80k" in models
|
||||
assert "MiniMax-M1-128k" in models
|
||||
assert "MiniMax-M1-256k" in models
|
||||
|
||||
def test_catalog_excludes_deprecated(self):
|
||||
from hermes_cli.models import _PROVIDER_MODELS
|
||||
for provider in ("minimax", "minimax-cn"):
|
||||
models = _PROVIDER_MODELS[provider]
|
||||
assert "MiniMax-M2.1" not in models
|
||||
|
||||
def test_catalog_excludes_highspeed(self):
|
||||
from hermes_cli.models import _PROVIDER_MODELS
|
||||
for provider in ("minimax", "minimax-cn"):
|
||||
models = _PROVIDER_MODELS[provider]
|
||||
assert "MiniMax-M2.7-highspeed" not in models
|
||||
assert "MiniMax-M2.5-highspeed" not in models
|
||||
@@ -330,7 +330,7 @@ def test_model_flow_nous_prints_subscription_guidance_without_mutating_explicit_
|
||||
"hermes_cli.auth.fetch_nous_models",
|
||||
lambda *args, **kwargs: ["claude-opus-4-6"],
|
||||
)
|
||||
monkeypatch.setattr("hermes_cli.auth._prompt_model_selection", lambda model_ids, current_model="", pricing=None: "claude-opus-4-6")
|
||||
monkeypatch.setattr("hermes_cli.auth._prompt_model_selection", lambda model_ids, current_model="", pricing=None, **kw: "claude-opus-4-6")
|
||||
monkeypatch.setattr("hermes_cli.auth._save_model_choice", lambda model: None)
|
||||
monkeypatch.setattr("hermes_cli.auth._update_config_for_provider", lambda provider, url: None)
|
||||
monkeypatch.setattr(
|
||||
@@ -368,7 +368,7 @@ def test_model_flow_nous_applies_managed_tts_default_when_unconfigured(monkeypat
|
||||
"hermes_cli.auth.fetch_nous_models",
|
||||
lambda *args, **kwargs: ["claude-opus-4-6"],
|
||||
)
|
||||
monkeypatch.setattr("hermes_cli.auth._prompt_model_selection", lambda model_ids, current_model="", pricing=None: "claude-opus-4-6")
|
||||
monkeypatch.setattr("hermes_cli.auth._prompt_model_selection", lambda model_ids, current_model="", pricing=None, **kw: "claude-opus-4-6")
|
||||
monkeypatch.setattr("hermes_cli.auth._save_model_choice", lambda model: None)
|
||||
monkeypatch.setattr("hermes_cli.auth._update_config_for_provider", lambda provider, url: None)
|
||||
monkeypatch.setattr(
|
||||
@@ -1,6 +1,6 @@
|
||||
"""Regression tests for CLI /retry history replacement semantics."""
|
||||
|
||||
from tests.test_cli_init import _make_cli
|
||||
from tests.cli.test_cli_init import _make_cli
|
||||
|
||||
|
||||
def test_retry_last_truncates_history_before_requeueing_message():
|
||||
@@ -0,0 +1,98 @@
|
||||
from types import SimpleNamespace
|
||||
from unittest.mock import MagicMock, patch
|
||||
|
||||
from cli import HermesCLI, _rich_text_from_ansi
|
||||
from hermes_cli.skin_engine import get_active_skin, set_active_skin
|
||||
|
||||
|
||||
def _make_cli_stub():
|
||||
cli = HermesCLI.__new__(HermesCLI)
|
||||
cli._sudo_state = None
|
||||
cli._secret_state = None
|
||||
cli._approval_state = None
|
||||
cli._clarify_state = None
|
||||
cli._clarify_freetext = False
|
||||
cli._command_running = False
|
||||
cli._agent_running = False
|
||||
cli._voice_recording = False
|
||||
cli._voice_processing = False
|
||||
cli._voice_mode = False
|
||||
cli._command_spinner_frame = lambda: "⟳"
|
||||
cli._tui_style_base = {
|
||||
"prompt": "#fff",
|
||||
"input-area": "#fff",
|
||||
"input-rule": "#aaa",
|
||||
"prompt-working": "#888 italic",
|
||||
}
|
||||
cli._app = SimpleNamespace(style=None)
|
||||
cli._invalidate = MagicMock()
|
||||
return cli
|
||||
|
||||
|
||||
class TestCliSkinPromptIntegration:
|
||||
def test_default_prompt_fragments_use_default_symbol(self):
|
||||
cli = _make_cli_stub()
|
||||
|
||||
set_active_skin("default")
|
||||
assert cli._get_tui_prompt_fragments() == [("class:prompt", "❯ ")]
|
||||
|
||||
def test_ares_prompt_fragments_use_skin_symbol(self):
|
||||
cli = _make_cli_stub()
|
||||
|
||||
set_active_skin("ares")
|
||||
assert cli._get_tui_prompt_fragments() == [("class:prompt", "⚔ ❯ ")]
|
||||
|
||||
def test_secret_prompt_fragments_preserve_secret_state(self):
|
||||
cli = _make_cli_stub()
|
||||
cli._secret_state = {"response_queue": object()}
|
||||
|
||||
set_active_skin("ares")
|
||||
assert cli._get_tui_prompt_fragments() == [("class:sudo-prompt", "🔑 ❯ ")]
|
||||
|
||||
def test_icon_only_skin_symbol_still_visible_in_special_states(self):
|
||||
cli = _make_cli_stub()
|
||||
cli._secret_state = {"response_queue": object()}
|
||||
|
||||
with patch("hermes_cli.skin_engine.get_active_prompt_symbol", return_value="⚔ "):
|
||||
assert cli._get_tui_prompt_fragments() == [("class:sudo-prompt", "🔑 ⚔ ")]
|
||||
|
||||
def test_build_tui_style_dict_uses_skin_overrides(self):
|
||||
cli = _make_cli_stub()
|
||||
|
||||
set_active_skin("ares")
|
||||
skin = get_active_skin()
|
||||
style_dict = cli._build_tui_style_dict()
|
||||
|
||||
assert style_dict["prompt"] == skin.get_color("prompt")
|
||||
assert style_dict["input-rule"] == skin.get_color("input_rule")
|
||||
assert style_dict["prompt-working"] == f"{skin.get_color('banner_dim')} italic"
|
||||
assert style_dict["approval-title"] == f"{skin.get_color('ui_warn')} bold"
|
||||
|
||||
def test_apply_tui_skin_style_updates_running_app(self):
|
||||
cli = _make_cli_stub()
|
||||
|
||||
set_active_skin("ares")
|
||||
assert cli._apply_tui_skin_style() is True
|
||||
assert cli._app.style is not None
|
||||
cli._invalidate.assert_called_once_with(min_interval=0.0)
|
||||
|
||||
def test_handle_skin_command_refreshes_live_tui(self, capsys):
|
||||
cli = _make_cli_stub()
|
||||
|
||||
with patch("cli.save_config_value", return_value=True):
|
||||
cli._handle_skin_command("/skin ares")
|
||||
|
||||
output = capsys.readouterr().out
|
||||
assert "Skin set to: ares (saved)" in output
|
||||
assert "Prompt + TUI colors updated." in output
|
||||
assert cli._app.style is not None
|
||||
|
||||
|
||||
class TestAnsiRichTextHelper:
|
||||
def test_preserves_literal_brackets(self):
|
||||
text = _rich_text_from_ansi("[notatag] literal")
|
||||
assert text.plain == "[notatag] literal"
|
||||
|
||||
def test_strips_ansi_but_keeps_plain_text(self):
|
||||
text = _rich_text_from_ansi("\x1b[31mred\x1b[0m")
|
||||
assert text.plain == "red"
|
||||
@@ -0,0 +1,66 @@
|
||||
import pytest
|
||||
from unittest.mock import MagicMock, patch
|
||||
from hermes_cli.plugins import VALID_HOOKS, PluginManager
|
||||
import os
|
||||
import shutil
|
||||
import tempfile
|
||||
from cli import HermesCLI
|
||||
|
||||
|
||||
def test_session_hooks_in_valid_hooks():
|
||||
"""Verify on_session_finalize and on_session_reset are registered as valid hooks."""
|
||||
assert "on_session_finalize" in VALID_HOOKS
|
||||
assert "on_session_reset" in VALID_HOOKS
|
||||
|
||||
|
||||
@patch("hermes_cli.plugins.invoke_hook")
|
||||
def test_session_finalize_on_reset(mock_invoke_hook):
|
||||
"""Verify on_session_finalize fires when /new or /reset is used."""
|
||||
cli = HermesCLI()
|
||||
cli.agent = MagicMock()
|
||||
cli.agent.session_id = "test-session-id"
|
||||
|
||||
# Simulate /new command which triggers on_session_finalize for the old session
|
||||
cli.new_session(silent=True)
|
||||
|
||||
# Check if on_session_finalize was called for the old session
|
||||
mock_invoke_hook.assert_any_call(
|
||||
"on_session_finalize", session_id="test-session-id", platform="cli"
|
||||
)
|
||||
# Check if on_session_reset was called for the new session
|
||||
mock_invoke_hook.assert_any_call(
|
||||
"on_session_reset", session_id=cli.session_id, platform="cli"
|
||||
)
|
||||
|
||||
|
||||
@patch("hermes_cli.plugins.invoke_hook")
|
||||
def test_session_finalize_on_cleanup(mock_invoke_hook):
|
||||
"""Verify on_session_finalize fires during CLI exit cleanup."""
|
||||
import cli as cli_mod
|
||||
|
||||
mock_agent = MagicMock()
|
||||
mock_agent.session_id = "cleanup-session-id"
|
||||
cli_mod._active_agent_ref = mock_agent
|
||||
cli_mod._cleanup_done = False
|
||||
|
||||
cli_mod._run_cleanup()
|
||||
|
||||
mock_invoke_hook.assert_any_call(
|
||||
"on_session_finalize", session_id="cleanup-session-id", platform="cli"
|
||||
)
|
||||
|
||||
|
||||
@patch("hermes_cli.plugins.invoke_hook")
|
||||
def test_hook_errors_are_caught(mock_invoke_hook):
|
||||
"""Verify hook exceptions are caught and don't crash the agent."""
|
||||
mgr = PluginManager()
|
||||
|
||||
# Register a hook that raises
|
||||
def bad_callback(**kwargs):
|
||||
raise Exception("Hook failed")
|
||||
|
||||
mgr._hooks["on_session_finalize"] = [bad_callback]
|
||||
|
||||
# This should not raise
|
||||
results = mgr.invoke_hook("on_session_finalize", session_id="test", platform="cli")
|
||||
assert results == []
|
||||
@@ -33,6 +33,13 @@ def git_repo(tmp_path):
|
||||
["git", "commit", "-m", "Initial commit"],
|
||||
cwd=repo, capture_output=True,
|
||||
)
|
||||
# Add a fake remote ref so cleanup logic sees the initial commit as
|
||||
# "pushed". Without this, `git log HEAD --not --remotes` treats every
|
||||
# commit as unpushed and cleanup refuses to delete worktrees.
|
||||
subprocess.run(
|
||||
["git", "update-ref", "refs/remotes/origin/main", "HEAD"],
|
||||
cwd=repo, capture_output=True,
|
||||
)
|
||||
return repo
|
||||
|
||||
|
||||
@@ -81,7 +88,11 @@ def _setup_worktree(repo_root):
|
||||
|
||||
|
||||
def _cleanup_worktree(info):
|
||||
"""Test version of _cleanup_worktree."""
|
||||
"""Test version of _cleanup_worktree.
|
||||
|
||||
Preserves the worktree only if it has unpushed commits.
|
||||
Dirty working tree alone is not enough to keep it.
|
||||
"""
|
||||
wt_path = info["path"]
|
||||
branch = info["branch"]
|
||||
repo_root = info["repo_root"]
|
||||
@@ -89,15 +100,15 @@ def _cleanup_worktree(info):
|
||||
if not Path(wt_path).exists():
|
||||
return
|
||||
|
||||
# Check for uncommitted changes
|
||||
status = subprocess.run(
|
||||
["git", "status", "--porcelain"],
|
||||
# Check for unpushed commits
|
||||
result = subprocess.run(
|
||||
["git", "log", "--oneline", "HEAD", "--not", "--remotes"],
|
||||
capture_output=True, text=True, timeout=10, cwd=wt_path,
|
||||
)
|
||||
has_changes = bool(status.stdout.strip())
|
||||
has_unpushed = bool(result.stdout.strip())
|
||||
|
||||
if has_changes:
|
||||
return False # Did not clean up
|
||||
if has_unpushed:
|
||||
return False # Did not clean up — has unpushed commits
|
||||
|
||||
subprocess.run(
|
||||
["git", "worktree", "remove", wt_path, "--force"],
|
||||
@@ -204,20 +215,45 @@ class TestWorktreeCleanup:
|
||||
assert result is True
|
||||
assert not Path(info["path"]).exists()
|
||||
|
||||
def test_dirty_worktree_kept(self, git_repo):
|
||||
def test_dirty_worktree_cleaned_when_no_unpushed(self, git_repo):
|
||||
"""Dirty working tree without unpushed commits is cleaned up.
|
||||
|
||||
Agent sessions typically leave untracked files / artifacts behind.
|
||||
Since all real work is in pushed commits, these don't warrant
|
||||
keeping the worktree.
|
||||
"""
|
||||
info = _setup_worktree(str(git_repo))
|
||||
assert info is not None
|
||||
|
||||
# Make uncommitted changes
|
||||
# Make uncommitted changes (untracked file)
|
||||
(Path(info["path"]) / "new-file.txt").write_text("uncommitted")
|
||||
subprocess.run(
|
||||
["git", "add", "new-file.txt"],
|
||||
cwd=info["path"], capture_output=True,
|
||||
)
|
||||
|
||||
# The git_repo fixture already has a fake remote ref so the initial
|
||||
# commit is seen as "pushed". No unpushed commits → cleanup proceeds.
|
||||
result = _cleanup_worktree(info)
|
||||
assert result is False
|
||||
assert Path(info["path"]).exists() # Still there
|
||||
assert result is True # Cleaned up despite dirty working tree
|
||||
assert not Path(info["path"]).exists()
|
||||
|
||||
def test_worktree_with_unpushed_commits_kept(self, git_repo):
|
||||
"""Worktree with unpushed commits is preserved."""
|
||||
info = _setup_worktree(str(git_repo))
|
||||
assert info is not None
|
||||
|
||||
# Make a commit that is NOT on any remote
|
||||
(Path(info["path"]) / "work.txt").write_text("real work")
|
||||
subprocess.run(["git", "add", "work.txt"], cwd=info["path"], capture_output=True)
|
||||
subprocess.run(
|
||||
["git", "commit", "-m", "agent work"],
|
||||
cwd=info["path"], capture_output=True,
|
||||
)
|
||||
|
||||
result = _cleanup_worktree(info)
|
||||
assert result is False # Kept — has unpushed commits
|
||||
assert Path(info["path"]).exists()
|
||||
|
||||
def test_branch_deleted_on_cleanup(self, git_repo):
|
||||
info = _setup_worktree(str(git_repo))
|
||||
@@ -367,7 +403,7 @@ class TestMultipleWorktrees:
|
||||
lines = [l for l in result.stdout.strip().splitlines() if l.strip()]
|
||||
assert len(lines) == 11
|
||||
|
||||
# Cleanup all
|
||||
# Cleanup all (git_repo fixture has a fake remote ref so cleanup works)
|
||||
for info in worktrees:
|
||||
# Discard changes first so cleanup works
|
||||
subprocess.run(
|
||||
@@ -492,33 +528,77 @@ class TestStaleWorktreePruning:
|
||||
assert not pruned
|
||||
assert Path(info["path"]).exists()
|
||||
|
||||
def test_keeps_dirty_old_worktree(self, git_repo):
|
||||
"""Old worktrees with uncommitted changes should NOT be pruned."""
|
||||
def test_keeps_old_worktree_with_unpushed_commits(self, git_repo):
|
||||
"""Old worktrees (24-72h) with unpushed commits should NOT be pruned."""
|
||||
import time
|
||||
|
||||
info = _setup_worktree(str(git_repo))
|
||||
assert info is not None
|
||||
|
||||
# Make it dirty
|
||||
(Path(info["path"]) / "dirty.txt").write_text("uncommitted")
|
||||
# Make an unpushed commit
|
||||
(Path(info["path"]) / "work.txt").write_text("real work")
|
||||
subprocess.run(["git", "add", "work.txt"], cwd=info["path"], capture_output=True)
|
||||
subprocess.run(
|
||||
["git", "add", "dirty.txt"],
|
||||
["git", "commit", "-m", "agent work"],
|
||||
cwd=info["path"], capture_output=True,
|
||||
)
|
||||
|
||||
# Make it old
|
||||
# Make it old (25h — in the 24-72h soft tier)
|
||||
old_time = time.time() - (25 * 3600)
|
||||
os.utime(info["path"], (old_time, old_time))
|
||||
|
||||
# Check if it would be pruned
|
||||
status = subprocess.run(
|
||||
["git", "status", "--porcelain"],
|
||||
# Check for unpushed commits (simulates prune logic)
|
||||
result = subprocess.run(
|
||||
["git", "log", "--oneline", "HEAD", "--not", "--remotes"],
|
||||
capture_output=True, text=True, cwd=info["path"],
|
||||
)
|
||||
has_changes = bool(status.stdout.strip())
|
||||
assert has_changes # Should be dirty → not pruned
|
||||
has_unpushed = bool(result.stdout.strip())
|
||||
assert has_unpushed # Has unpushed commits → not pruned in soft tier
|
||||
assert Path(info["path"]).exists()
|
||||
|
||||
def test_force_prunes_very_old_worktree(self, git_repo):
|
||||
"""Worktrees older than 72h should be force-pruned regardless."""
|
||||
import time
|
||||
|
||||
info = _setup_worktree(str(git_repo))
|
||||
assert info is not None
|
||||
|
||||
# Make an unpushed commit (would normally protect it)
|
||||
(Path(info["path"]) / "work.txt").write_text("stale work")
|
||||
subprocess.run(["git", "add", "work.txt"], cwd=info["path"], capture_output=True)
|
||||
subprocess.run(
|
||||
["git", "commit", "-m", "old agent work"],
|
||||
cwd=info["path"], capture_output=True,
|
||||
)
|
||||
|
||||
# Make it very old (73h — beyond the 72h hard threshold)
|
||||
old_time = time.time() - (73 * 3600)
|
||||
os.utime(info["path"], (old_time, old_time))
|
||||
|
||||
# Simulate the force-prune tier check
|
||||
hard_cutoff = time.time() - (72 * 3600)
|
||||
mtime = Path(info["path"]).stat().st_mtime
|
||||
assert mtime <= hard_cutoff # Should qualify for force removal
|
||||
|
||||
# Actually remove it (simulates _prune_stale_worktrees force path)
|
||||
branch_result = subprocess.run(
|
||||
["git", "branch", "--show-current"],
|
||||
capture_output=True, text=True, timeout=5, cwd=info["path"],
|
||||
)
|
||||
branch = branch_result.stdout.strip()
|
||||
|
||||
subprocess.run(
|
||||
["git", "worktree", "remove", info["path"], "--force"],
|
||||
capture_output=True, text=True, timeout=15, cwd=str(git_repo),
|
||||
)
|
||||
if branch:
|
||||
subprocess.run(
|
||||
["git", "branch", "-D", branch],
|
||||
capture_output=True, text=True, timeout=10, cwd=str(git_repo),
|
||||
)
|
||||
|
||||
assert not Path(info["path"]).exists()
|
||||
|
||||
|
||||
class TestEdgeCases:
|
||||
"""Test edge cases for robustness."""
|
||||
@@ -611,6 +691,133 @@ class TestTerminalCWDIntegration:
|
||||
assert result.stdout.strip() == "true"
|
||||
|
||||
|
||||
class TestOrphanedBranchPruning:
|
||||
"""Test cleanup of orphaned hermes/* and pr-* branches."""
|
||||
|
||||
def test_prunes_orphaned_hermes_branch(self, git_repo):
|
||||
"""hermes/hermes-* branches with no worktree should be deleted."""
|
||||
# Create a branch that looks like a worktree branch but has no worktree
|
||||
subprocess.run(
|
||||
["git", "branch", "hermes/hermes-deadbeef", "HEAD"],
|
||||
cwd=str(git_repo), capture_output=True,
|
||||
)
|
||||
|
||||
# Verify it exists
|
||||
result = subprocess.run(
|
||||
["git", "branch", "--list", "hermes/hermes-deadbeef"],
|
||||
capture_output=True, text=True, cwd=str(git_repo),
|
||||
)
|
||||
assert "hermes/hermes-deadbeef" in result.stdout
|
||||
|
||||
# Simulate _prune_orphaned_branches logic
|
||||
result = subprocess.run(
|
||||
["git", "branch", "--format=%(refname:short)"],
|
||||
capture_output=True, text=True, cwd=str(git_repo),
|
||||
)
|
||||
all_branches = [b.strip() for b in result.stdout.strip().split("\n") if b.strip()]
|
||||
|
||||
wt_result = subprocess.run(
|
||||
["git", "worktree", "list", "--porcelain"],
|
||||
capture_output=True, text=True, cwd=str(git_repo),
|
||||
)
|
||||
active_branches = {"main"}
|
||||
for line in wt_result.stdout.split("\n"):
|
||||
if line.startswith("branch refs/heads/"):
|
||||
active_branches.add(line.split("branch refs/heads/", 1)[-1].strip())
|
||||
|
||||
orphaned = [
|
||||
b for b in all_branches
|
||||
if b not in active_branches
|
||||
and (b.startswith("hermes/hermes-") or b.startswith("pr-"))
|
||||
]
|
||||
assert "hermes/hermes-deadbeef" in orphaned
|
||||
|
||||
# Delete them
|
||||
if orphaned:
|
||||
subprocess.run(
|
||||
["git", "branch", "-D"] + orphaned,
|
||||
capture_output=True, text=True, cwd=str(git_repo),
|
||||
)
|
||||
|
||||
# Verify gone
|
||||
result = subprocess.run(
|
||||
["git", "branch", "--list", "hermes/hermes-deadbeef"],
|
||||
capture_output=True, text=True, cwd=str(git_repo),
|
||||
)
|
||||
assert "hermes/hermes-deadbeef" not in result.stdout
|
||||
|
||||
def test_prunes_orphaned_pr_branch(self, git_repo):
|
||||
"""pr-* branches should be deleted during pruning."""
|
||||
subprocess.run(
|
||||
["git", "branch", "pr-1234", "HEAD"],
|
||||
cwd=str(git_repo), capture_output=True,
|
||||
)
|
||||
subprocess.run(
|
||||
["git", "branch", "pr-5678", "HEAD"],
|
||||
cwd=str(git_repo), capture_output=True,
|
||||
)
|
||||
|
||||
result = subprocess.run(
|
||||
["git", "branch", "--format=%(refname:short)"],
|
||||
capture_output=True, text=True, cwd=str(git_repo),
|
||||
)
|
||||
all_branches = [b.strip() for b in result.stdout.strip().split("\n") if b.strip()]
|
||||
|
||||
active_branches = {"main"}
|
||||
orphaned = [
|
||||
b for b in all_branches
|
||||
if b not in active_branches and b.startswith("pr-")
|
||||
]
|
||||
assert "pr-1234" in orphaned
|
||||
assert "pr-5678" in orphaned
|
||||
|
||||
subprocess.run(
|
||||
["git", "branch", "-D"] + orphaned,
|
||||
capture_output=True, text=True, cwd=str(git_repo),
|
||||
)
|
||||
|
||||
# Verify gone
|
||||
result = subprocess.run(
|
||||
["git", "branch", "--format=%(refname:short)"],
|
||||
capture_output=True, text=True, cwd=str(git_repo),
|
||||
)
|
||||
remaining = result.stdout.strip()
|
||||
assert "pr-1234" not in remaining
|
||||
assert "pr-5678" not in remaining
|
||||
|
||||
def test_preserves_active_worktree_branch(self, git_repo):
|
||||
"""Branches with active worktrees should NOT be pruned."""
|
||||
info = _setup_worktree(str(git_repo))
|
||||
assert info is not None
|
||||
|
||||
result = subprocess.run(
|
||||
["git", "worktree", "list", "--porcelain"],
|
||||
capture_output=True, text=True, cwd=str(git_repo),
|
||||
)
|
||||
active_branches = set()
|
||||
for line in result.stdout.split("\n"):
|
||||
if line.startswith("branch refs/heads/"):
|
||||
active_branches.add(line.split("branch refs/heads/", 1)[-1].strip())
|
||||
|
||||
assert info["branch"] in active_branches # Protected
|
||||
|
||||
def test_preserves_main_branch(self, git_repo):
|
||||
"""main branch should never be pruned."""
|
||||
result = subprocess.run(
|
||||
["git", "branch", "--format=%(refname:short)"],
|
||||
capture_output=True, text=True, cwd=str(git_repo),
|
||||
)
|
||||
all_branches = [b.strip() for b in result.stdout.strip().split("\n") if b.strip()]
|
||||
active_branches = {"main"}
|
||||
|
||||
orphaned = [
|
||||
b for b in all_branches
|
||||
if b not in active_branches
|
||||
and (b.startswith("hermes/hermes-") or b.startswith("pr-"))
|
||||
]
|
||||
assert "main" not in orphaned
|
||||
|
||||
|
||||
class TestSystemPromptInjection:
|
||||
"""Test that the agent gets worktree context in its system prompt."""
|
||||
|
||||
@@ -625,7 +832,7 @@ class TestSystemPromptInjection:
|
||||
f"{info['path']}. Your branch is `{info['branch']}`. "
|
||||
f"Changes here do not affect the main working tree or other agents. "
|
||||
f"Remember to commit and push your changes, and create a PR if appropriate. "
|
||||
f"The original repo is at {info['repo_root']}.]"
|
||||
f"The original repo is at {info['repo_root']}.]\n"
|
||||
)
|
||||
|
||||
assert info["path"] in wt_note
|
||||
@@ -339,6 +339,36 @@ class TestMarkJobRun:
|
||||
assert updated["last_status"] == "error"
|
||||
assert updated["last_error"] == "timeout"
|
||||
|
||||
def test_delivery_error_tracked_separately(self, tmp_cron_dir):
|
||||
"""Agent succeeds but delivery fails — both tracked independently."""
|
||||
job = create_job(prompt="Report", schedule="every 1h")
|
||||
mark_job_run(job["id"], success=True, delivery_error="platform 'telegram' not configured")
|
||||
updated = get_job(job["id"])
|
||||
assert updated["last_status"] == "ok"
|
||||
assert updated["last_error"] is None
|
||||
assert updated["last_delivery_error"] == "platform 'telegram' not configured"
|
||||
|
||||
def test_delivery_error_cleared_on_success(self, tmp_cron_dir):
|
||||
"""Successful delivery clears the previous delivery error."""
|
||||
job = create_job(prompt="Report", schedule="every 1h")
|
||||
mark_job_run(job["id"], success=True, delivery_error="network timeout")
|
||||
updated = get_job(job["id"])
|
||||
assert updated["last_delivery_error"] == "network timeout"
|
||||
# Next run delivers successfully
|
||||
mark_job_run(job["id"], success=True, delivery_error=None)
|
||||
updated = get_job(job["id"])
|
||||
assert updated["last_delivery_error"] is None
|
||||
|
||||
def test_both_agent_and_delivery_error(self, tmp_cron_dir):
|
||||
"""Agent fails AND delivery fails — both errors recorded."""
|
||||
job = create_job(prompt="Report", schedule="every 1h")
|
||||
mark_job_run(job["id"], success=False, error="model timeout",
|
||||
delivery_error="platform 'discord' not enabled")
|
||||
updated = get_job(job["id"])
|
||||
assert updated["last_status"] == "error"
|
||||
assert updated["last_error"] == "model timeout"
|
||||
assert updated["last_delivery_error"] == "platform 'discord' not enabled"
|
||||
|
||||
|
||||
class TestAdvanceNextRun:
|
||||
"""Tests for advance_next_run() — crash-safety for recurring jobs."""
|
||||
|
||||
@@ -508,6 +508,90 @@ class TestDeliverResultWrapping:
|
||||
assert send_mock.call_args.kwargs["thread_id"] == "17585"
|
||||
|
||||
|
||||
class TestDeliverResultErrorReturns:
|
||||
"""Verify _deliver_result returns error strings on failure, None on success."""
|
||||
|
||||
def test_returns_none_on_successful_delivery(self):
|
||||
from gateway.config import Platform
|
||||
|
||||
pconfig = MagicMock()
|
||||
pconfig.enabled = True
|
||||
mock_cfg = MagicMock()
|
||||
mock_cfg.platforms = {Platform.TELEGRAM: pconfig}
|
||||
|
||||
with patch("gateway.config.load_gateway_config", return_value=mock_cfg), \
|
||||
patch("tools.send_message_tool._send_to_platform", new=AsyncMock(return_value={"success": True})):
|
||||
job = {
|
||||
"id": "ok-job",
|
||||
"deliver": "origin",
|
||||
"origin": {"platform": "telegram", "chat_id": "123"},
|
||||
}
|
||||
result = _deliver_result(job, "Output.")
|
||||
assert result is None
|
||||
|
||||
def test_returns_none_for_local_delivery(self):
|
||||
"""local-only jobs don't deliver — not a failure."""
|
||||
job = {"id": "local-job", "deliver": "local"}
|
||||
result = _deliver_result(job, "Output.")
|
||||
assert result is None
|
||||
|
||||
def test_returns_error_for_unknown_platform(self):
|
||||
job = {
|
||||
"id": "bad-platform",
|
||||
"deliver": "origin",
|
||||
"origin": {"platform": "fax", "chat_id": "123"},
|
||||
}
|
||||
with patch("gateway.config.load_gateway_config"):
|
||||
result = _deliver_result(job, "Output.")
|
||||
assert result is not None
|
||||
assert "unknown platform" in result
|
||||
|
||||
def test_returns_error_when_platform_disabled(self):
|
||||
from gateway.config import Platform
|
||||
|
||||
pconfig = MagicMock()
|
||||
pconfig.enabled = False
|
||||
mock_cfg = MagicMock()
|
||||
mock_cfg.platforms = {Platform.TELEGRAM: pconfig}
|
||||
|
||||
with patch("gateway.config.load_gateway_config", return_value=mock_cfg):
|
||||
job = {
|
||||
"id": "disabled",
|
||||
"deliver": "origin",
|
||||
"origin": {"platform": "telegram", "chat_id": "123"},
|
||||
}
|
||||
result = _deliver_result(job, "Output.")
|
||||
assert result is not None
|
||||
assert "not configured" in result
|
||||
|
||||
def test_returns_error_on_send_failure(self):
|
||||
from gateway.config import Platform
|
||||
|
||||
pconfig = MagicMock()
|
||||
pconfig.enabled = True
|
||||
mock_cfg = MagicMock()
|
||||
mock_cfg.platforms = {Platform.TELEGRAM: pconfig}
|
||||
|
||||
with patch("gateway.config.load_gateway_config", return_value=mock_cfg), \
|
||||
patch("tools.send_message_tool._send_to_platform", new=AsyncMock(return_value={"error": "rate limited"})):
|
||||
job = {
|
||||
"id": "rate-limited",
|
||||
"deliver": "origin",
|
||||
"origin": {"platform": "telegram", "chat_id": "123"},
|
||||
}
|
||||
result = _deliver_result(job, "Output.")
|
||||
assert result is not None
|
||||
assert "rate limited" in result
|
||||
|
||||
def test_returns_error_for_unresolved_target(self, monkeypatch):
|
||||
"""Non-local delivery with no resolvable target should return an error."""
|
||||
monkeypatch.delenv("TELEGRAM_HOME_CHANNEL", raising=False)
|
||||
job = {"id": "no-target", "deliver": "telegram"}
|
||||
result = _deliver_result(job, "Output.")
|
||||
assert result is not None
|
||||
assert "no delivery target" in result
|
||||
|
||||
|
||||
class TestRunJobSessionPersistence:
|
||||
def test_run_job_passes_session_db_and_cron_platform(self, tmp_path):
|
||||
job = {
|
||||
|
||||
@@ -0,0 +1,164 @@
|
||||
"""Security tests for Terminal-Bench 2 archive extraction."""
|
||||
|
||||
import base64
|
||||
import importlib
|
||||
import io
|
||||
import sys
|
||||
import tarfile
|
||||
import types
|
||||
|
||||
import pytest
|
||||
|
||||
|
||||
def _stub_module(name: str, **attrs):
|
||||
module = types.ModuleType(name)
|
||||
for key, value in attrs.items():
|
||||
setattr(module, key, value)
|
||||
return module
|
||||
|
||||
|
||||
def _load_terminalbench_module(monkeypatch):
|
||||
class _EvalHandlingEnum:
|
||||
STOP_TRAIN = "stop_train"
|
||||
|
||||
class _APIServerConfig:
|
||||
def __init__(self, *args, **kwargs):
|
||||
self.args = args
|
||||
self.kwargs = kwargs
|
||||
|
||||
class _AgentResult:
|
||||
pass
|
||||
|
||||
class _HermesAgentLoop:
|
||||
pass
|
||||
|
||||
class _HermesAgentBaseEnv:
|
||||
pass
|
||||
|
||||
class _HermesAgentEnvConfig:
|
||||
pass
|
||||
|
||||
class _ToolContext:
|
||||
pass
|
||||
|
||||
stub_modules = {
|
||||
"atroposlib": _stub_module("atroposlib"),
|
||||
"atroposlib.envs": _stub_module("atroposlib.envs"),
|
||||
"atroposlib.envs.base": _stub_module(
|
||||
"atroposlib.envs.base",
|
||||
EvalHandlingEnum=_EvalHandlingEnum,
|
||||
),
|
||||
"atroposlib.envs.server_handling": _stub_module("atroposlib.envs.server_handling"),
|
||||
"atroposlib.envs.server_handling.server_manager": _stub_module(
|
||||
"atroposlib.envs.server_handling.server_manager",
|
||||
APIServerConfig=_APIServerConfig,
|
||||
),
|
||||
"environments.agent_loop": _stub_module(
|
||||
"environments.agent_loop",
|
||||
AgentResult=_AgentResult,
|
||||
HermesAgentLoop=_HermesAgentLoop,
|
||||
),
|
||||
"environments.hermes_base_env": _stub_module(
|
||||
"environments.hermes_base_env",
|
||||
HermesAgentBaseEnv=_HermesAgentBaseEnv,
|
||||
HermesAgentEnvConfig=_HermesAgentEnvConfig,
|
||||
),
|
||||
"environments.tool_context": _stub_module(
|
||||
"environments.tool_context",
|
||||
ToolContext=_ToolContext,
|
||||
),
|
||||
"tools.terminal_tool": _stub_module(
|
||||
"tools.terminal_tool",
|
||||
register_task_env_overrides=lambda *args, **kwargs: None,
|
||||
clear_task_env_overrides=lambda *args, **kwargs: None,
|
||||
cleanup_vm=lambda *args, **kwargs: None,
|
||||
),
|
||||
}
|
||||
|
||||
stub_modules["atroposlib"].envs = stub_modules["atroposlib.envs"]
|
||||
stub_modules["atroposlib.envs"].base = stub_modules["atroposlib.envs.base"]
|
||||
stub_modules["atroposlib.envs"].server_handling = stub_modules["atroposlib.envs.server_handling"]
|
||||
stub_modules["atroposlib.envs.server_handling"].server_manager = stub_modules[
|
||||
"atroposlib.envs.server_handling.server_manager"
|
||||
]
|
||||
|
||||
for name, module in stub_modules.items():
|
||||
monkeypatch.setitem(sys.modules, name, module)
|
||||
|
||||
module_name = "environments.benchmarks.terminalbench_2.terminalbench2_env"
|
||||
sys.modules.pop(module_name, None)
|
||||
return importlib.import_module(module_name)
|
||||
|
||||
|
||||
def _build_tar_b64(entries):
|
||||
buf = io.BytesIO()
|
||||
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
|
||||
for entry in entries:
|
||||
kind = entry["kind"]
|
||||
info = tarfile.TarInfo(entry["name"])
|
||||
|
||||
if kind == "dir":
|
||||
info.type = tarfile.DIRTYPE
|
||||
tar.addfile(info)
|
||||
continue
|
||||
|
||||
if kind == "file":
|
||||
data = entry["data"].encode("utf-8")
|
||||
info.size = len(data)
|
||||
tar.addfile(info, io.BytesIO(data))
|
||||
continue
|
||||
|
||||
if kind == "symlink":
|
||||
info.type = tarfile.SYMTYPE
|
||||
info.linkname = entry["target"]
|
||||
tar.addfile(info)
|
||||
continue
|
||||
|
||||
raise ValueError(f"Unknown tar entry kind: {kind}")
|
||||
|
||||
return base64.b64encode(buf.getvalue()).decode("ascii")
|
||||
|
||||
|
||||
def test_extract_base64_tar_allows_safe_files(tmp_path, monkeypatch):
|
||||
module = _load_terminalbench_module(monkeypatch)
|
||||
archive = _build_tar_b64(
|
||||
[
|
||||
{"kind": "dir", "name": "nested"},
|
||||
{"kind": "file", "name": "nested/hello.txt", "data": "hello"},
|
||||
]
|
||||
)
|
||||
|
||||
target = tmp_path / "extract"
|
||||
module._extract_base64_tar(archive, target)
|
||||
|
||||
assert (target / "nested" / "hello.txt").read_text(encoding="utf-8") == "hello"
|
||||
|
||||
|
||||
def test_extract_base64_tar_rejects_path_traversal(tmp_path, monkeypatch):
|
||||
module = _load_terminalbench_module(monkeypatch)
|
||||
archive = _build_tar_b64(
|
||||
[
|
||||
{"kind": "file", "name": "../escape.txt", "data": "owned"},
|
||||
]
|
||||
)
|
||||
|
||||
target = tmp_path / "extract"
|
||||
with pytest.raises(ValueError, match="Unsafe archive member path"):
|
||||
module._extract_base64_tar(archive, target)
|
||||
|
||||
assert not (tmp_path / "escape.txt").exists()
|
||||
|
||||
|
||||
def test_extract_base64_tar_rejects_symlinks(tmp_path, monkeypatch):
|
||||
module = _load_terminalbench_module(monkeypatch)
|
||||
archive = _build_tar_b64(
|
||||
[
|
||||
{"kind": "symlink", "name": "link", "target": "../../escape.txt"},
|
||||
]
|
||||
)
|
||||
|
||||
target = tmp_path / "extract"
|
||||
with pytest.raises(ValueError, match="Unsupported archive member type"):
|
||||
module._extract_base64_tar(archive, target)
|
||||
|
||||
assert not (target / "link").exists()
|
||||
@@ -439,7 +439,7 @@ class TestChatCompletionsEndpoint:
|
||||
tp_cb = kwargs.get("tool_progress_callback")
|
||||
# Simulate tool progress before streaming content
|
||||
if tp_cb:
|
||||
tp_cb("terminal", "ls -la", {"command": "ls -la"})
|
||||
tp_cb("tool.started", "terminal", "ls -la", {"command": "ls -la"})
|
||||
if cb:
|
||||
await asyncio.sleep(0.05)
|
||||
cb("Here are the files.")
|
||||
@@ -476,8 +476,8 @@ class TestChatCompletionsEndpoint:
|
||||
cb = kwargs.get("stream_delta_callback")
|
||||
tp_cb = kwargs.get("tool_progress_callback")
|
||||
if tp_cb:
|
||||
tp_cb("_thinking", "some internal state", {})
|
||||
tp_cb("web_search", "Python docs", {"query": "Python docs"})
|
||||
tp_cb("tool.started", "_thinking", "some internal state", {})
|
||||
tp_cb("tool.started", "web_search", "Python docs", {"query": "Python docs"})
|
||||
if cb:
|
||||
await asyncio.sleep(0.05)
|
||||
cb("Found it.")
|
||||
|
||||
@@ -0,0 +1,343 @@
|
||||
"""Tests for Discord ignored_channels and no_thread_channels config."""
|
||||
|
||||
from types import SimpleNamespace
|
||||
from datetime import datetime, timezone
|
||||
from unittest.mock import AsyncMock, MagicMock
|
||||
import sys
|
||||
|
||||
import pytest
|
||||
|
||||
from gateway.config import PlatformConfig
|
||||
|
||||
|
||||
def _ensure_discord_mock():
|
||||
"""Install a mock discord module when discord.py isn't available."""
|
||||
if "discord" in sys.modules and hasattr(sys.modules["discord"], "__file__"):
|
||||
return
|
||||
|
||||
discord_mod = MagicMock()
|
||||
discord_mod.Intents.default.return_value = MagicMock()
|
||||
discord_mod.Client = MagicMock
|
||||
discord_mod.File = MagicMock
|
||||
discord_mod.DMChannel = type("DMChannel", (), {})
|
||||
discord_mod.Thread = type("Thread", (), {})
|
||||
discord_mod.ForumChannel = type("ForumChannel", (), {})
|
||||
discord_mod.ui = SimpleNamespace(View=object, button=lambda *a, **k: (lambda fn: fn), Button=object)
|
||||
discord_mod.ButtonStyle = SimpleNamespace(success=1, primary=2, secondary=2, danger=3, green=1, grey=2, blurple=2, red=3)
|
||||
discord_mod.Color = SimpleNamespace(orange=lambda: 1, green=lambda: 2, blue=lambda: 3, red=lambda: 4, purple=lambda: 5)
|
||||
discord_mod.Interaction = object
|
||||
discord_mod.Embed = MagicMock
|
||||
discord_mod.app_commands = SimpleNamespace(
|
||||
describe=lambda **kwargs: (lambda fn: fn),
|
||||
choices=lambda **kwargs: (lambda fn: fn),
|
||||
Choice=lambda **kwargs: SimpleNamespace(**kwargs),
|
||||
)
|
||||
|
||||
ext_mod = MagicMock()
|
||||
commands_mod = MagicMock()
|
||||
commands_mod.Bot = MagicMock
|
||||
ext_mod.commands = commands_mod
|
||||
|
||||
sys.modules.setdefault("discord", discord_mod)
|
||||
sys.modules.setdefault("discord.ext", ext_mod)
|
||||
sys.modules.setdefault("discord.ext.commands", commands_mod)
|
||||
|
||||
|
||||
_ensure_discord_mock()
|
||||
|
||||
import gateway.platforms.discord as discord_platform # noqa: E402
|
||||
from gateway.platforms.discord import DiscordAdapter # noqa: E402
|
||||
|
||||
|
||||
class FakeDMChannel:
|
||||
def __init__(self, channel_id: int = 1, name: str = "dm"):
|
||||
self.id = channel_id
|
||||
self.name = name
|
||||
|
||||
|
||||
class FakeTextChannel:
|
||||
def __init__(self, channel_id: int = 1, name: str = "general", guild_name: str = "Hermes Server"):
|
||||
self.id = channel_id
|
||||
self.name = name
|
||||
self.guild = SimpleNamespace(name=guild_name)
|
||||
self.topic = None
|
||||
|
||||
|
||||
class FakeThread:
|
||||
def __init__(self, channel_id: int = 1, name: str = "thread", parent=None, guild_name: str = "Hermes Server"):
|
||||
self.id = channel_id
|
||||
self.name = name
|
||||
self.parent = parent
|
||||
self.parent_id = getattr(parent, "id", None)
|
||||
self.guild = getattr(parent, "guild", None) or SimpleNamespace(name=guild_name)
|
||||
self.topic = None
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def adapter(monkeypatch):
|
||||
monkeypatch.setattr(discord_platform.discord, "DMChannel", FakeDMChannel, raising=False)
|
||||
monkeypatch.setattr(discord_platform.discord, "Thread", FakeThread, raising=False)
|
||||
|
||||
config = PlatformConfig(enabled=True, token="fake-token")
|
||||
adapter = DiscordAdapter(config)
|
||||
adapter._client = SimpleNamespace(user=SimpleNamespace(id=999))
|
||||
adapter.handle_message = AsyncMock()
|
||||
return adapter
|
||||
|
||||
|
||||
def make_message(*, channel, content: str, mentions=None):
|
||||
author = SimpleNamespace(id=42, display_name="TestUser", name="TestUser")
|
||||
return SimpleNamespace(
|
||||
id=123,
|
||||
content=content,
|
||||
mentions=list(mentions or []),
|
||||
attachments=[],
|
||||
reference=None,
|
||||
created_at=datetime.now(timezone.utc),
|
||||
channel=channel,
|
||||
author=author,
|
||||
)
|
||||
|
||||
|
||||
# ── ignored_channels ─────────────────────────────────────────────────
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_ignored_channel_blocks_message(adapter, monkeypatch):
|
||||
"""Messages in ignored channels are silently dropped."""
|
||||
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "false")
|
||||
monkeypatch.setenv("DISCORD_IGNORED_CHANNELS", "500")
|
||||
monkeypatch.delenv("DISCORD_FREE_RESPONSE_CHANNELS", raising=False)
|
||||
|
||||
message = make_message(channel=FakeTextChannel(channel_id=500), content="hello")
|
||||
await adapter._handle_message(message)
|
||||
|
||||
adapter.handle_message.assert_not_awaited()
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_ignored_channel_blocks_even_with_mention(adapter, monkeypatch):
|
||||
"""Ignored channels take priority — even @mentions are dropped."""
|
||||
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "true")
|
||||
monkeypatch.setenv("DISCORD_IGNORED_CHANNELS", "500")
|
||||
|
||||
bot_user = adapter._client.user
|
||||
message = make_message(
|
||||
channel=FakeTextChannel(channel_id=500),
|
||||
content=f"<@{bot_user.id}> hello",
|
||||
mentions=[bot_user],
|
||||
)
|
||||
await adapter._handle_message(message)
|
||||
|
||||
adapter.handle_message.assert_not_awaited()
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_non_ignored_channel_processes_normally(adapter, monkeypatch):
|
||||
"""Channels not in the ignored list process normally."""
|
||||
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "false")
|
||||
monkeypatch.setenv("DISCORD_IGNORED_CHANNELS", "500,600")
|
||||
monkeypatch.delenv("DISCORD_FREE_RESPONSE_CHANNELS", raising=False)
|
||||
|
||||
message = make_message(channel=FakeTextChannel(channel_id=700), content="hello")
|
||||
await adapter._handle_message(message)
|
||||
|
||||
adapter.handle_message.assert_awaited_once()
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_ignored_channels_csv_parsing(adapter, monkeypatch):
|
||||
"""Multiple channel IDs are parsed correctly from CSV."""
|
||||
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "false")
|
||||
monkeypatch.setenv("DISCORD_IGNORED_CHANNELS", "500, 600 , 700")
|
||||
monkeypatch.delenv("DISCORD_FREE_RESPONSE_CHANNELS", raising=False)
|
||||
|
||||
for ch_id in (500, 600, 700):
|
||||
adapter.handle_message.reset_mock()
|
||||
message = make_message(channel=FakeTextChannel(channel_id=ch_id), content="hello")
|
||||
await adapter._handle_message(message)
|
||||
adapter.handle_message.assert_not_awaited()
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_ignored_channels_empty_string_ignores_nothing(adapter, monkeypatch):
|
||||
"""Empty DISCORD_IGNORED_CHANNELS means nothing is ignored."""
|
||||
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "false")
|
||||
monkeypatch.setenv("DISCORD_IGNORED_CHANNELS", "")
|
||||
monkeypatch.delenv("DISCORD_FREE_RESPONSE_CHANNELS", raising=False)
|
||||
|
||||
message = make_message(channel=FakeTextChannel(channel_id=500), content="hello")
|
||||
await adapter._handle_message(message)
|
||||
|
||||
adapter.handle_message.assert_awaited_once()
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_ignored_channel_thread_parent_match(adapter, monkeypatch):
|
||||
"""Thread whose parent channel is ignored should also be ignored."""
|
||||
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "false")
|
||||
monkeypatch.setenv("DISCORD_IGNORED_CHANNELS", "500")
|
||||
monkeypatch.delenv("DISCORD_FREE_RESPONSE_CHANNELS", raising=False)
|
||||
|
||||
parent = FakeTextChannel(channel_id=500, name="ignored-channel")
|
||||
thread = FakeThread(channel_id=501, name="thread-in-ignored", parent=parent)
|
||||
message = make_message(channel=thread, content="hello from thread")
|
||||
await adapter._handle_message(message)
|
||||
|
||||
adapter.handle_message.assert_not_awaited()
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_dms_unaffected_by_ignored_channels(adapter, monkeypatch):
|
||||
"""DMs should never be affected by ignored_channels."""
|
||||
monkeypatch.setenv("DISCORD_IGNORED_CHANNELS", "500")
|
||||
monkeypatch.delenv("DISCORD_FREE_RESPONSE_CHANNELS", raising=False)
|
||||
|
||||
message = make_message(channel=FakeDMChannel(channel_id=500), content="dm hello")
|
||||
await adapter._handle_message(message)
|
||||
|
||||
adapter.handle_message.assert_awaited_once()
|
||||
|
||||
|
||||
# ── no_thread_channels ───────────────────────────────────────────────
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_no_thread_channel_skips_auto_thread(adapter, monkeypatch):
|
||||
"""Channels in no_thread_channels should not auto-create threads."""
|
||||
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "false")
|
||||
monkeypatch.setenv("DISCORD_NO_THREAD_CHANNELS", "800")
|
||||
monkeypatch.delenv("DISCORD_AUTO_THREAD", raising=False)
|
||||
monkeypatch.delenv("DISCORD_IGNORED_CHANNELS", raising=False)
|
||||
monkeypatch.delenv("DISCORD_FREE_RESPONSE_CHANNELS", raising=False)
|
||||
|
||||
adapter._auto_create_thread = AsyncMock(return_value=FakeThread(channel_id=999))
|
||||
|
||||
message = make_message(channel=FakeTextChannel(channel_id=800), content="hello")
|
||||
await adapter._handle_message(message)
|
||||
|
||||
adapter._auto_create_thread.assert_not_awaited()
|
||||
adapter.handle_message.assert_awaited_once()
|
||||
event = adapter.handle_message.await_args.args[0]
|
||||
assert event.source.chat_type == "group"
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_normal_channel_still_auto_threads(adapter, monkeypatch):
|
||||
"""Channels NOT in no_thread_channels still get auto-threading."""
|
||||
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "false")
|
||||
monkeypatch.setenv("DISCORD_NO_THREAD_CHANNELS", "800")
|
||||
monkeypatch.delenv("DISCORD_AUTO_THREAD", raising=False)
|
||||
monkeypatch.delenv("DISCORD_IGNORED_CHANNELS", raising=False)
|
||||
monkeypatch.delenv("DISCORD_FREE_RESPONSE_CHANNELS", raising=False)
|
||||
|
||||
fake_thread = FakeThread(channel_id=999, name="auto-thread")
|
||||
adapter._auto_create_thread = AsyncMock(return_value=fake_thread)
|
||||
|
||||
message = make_message(channel=FakeTextChannel(channel_id=900), content="hello")
|
||||
await adapter._handle_message(message)
|
||||
|
||||
adapter._auto_create_thread.assert_awaited_once()
|
||||
adapter.handle_message.assert_awaited_once()
|
||||
event = adapter.handle_message.await_args.args[0]
|
||||
assert event.source.chat_type == "thread"
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_no_thread_channels_csv_parsing(adapter, monkeypatch):
|
||||
"""Multiple no_thread channel IDs parsed from CSV."""
|
||||
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "false")
|
||||
monkeypatch.setenv("DISCORD_NO_THREAD_CHANNELS", "800, 900")
|
||||
monkeypatch.delenv("DISCORD_AUTO_THREAD", raising=False)
|
||||
monkeypatch.delenv("DISCORD_IGNORED_CHANNELS", raising=False)
|
||||
monkeypatch.delenv("DISCORD_FREE_RESPONSE_CHANNELS", raising=False)
|
||||
|
||||
adapter._auto_create_thread = AsyncMock(return_value=FakeThread(channel_id=999))
|
||||
|
||||
for ch_id in (800, 900):
|
||||
adapter._auto_create_thread.reset_mock()
|
||||
adapter.handle_message.reset_mock()
|
||||
message = make_message(channel=FakeTextChannel(channel_id=ch_id), content="hello")
|
||||
await adapter._handle_message(message)
|
||||
adapter._auto_create_thread.assert_not_awaited()
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_no_thread_with_auto_thread_disabled_is_noop(adapter, monkeypatch):
|
||||
"""no_thread_channels is a no-op when auto_thread is globally disabled."""
|
||||
monkeypatch.setenv("DISCORD_REQUIRE_MENTION", "false")
|
||||
monkeypatch.setenv("DISCORD_AUTO_THREAD", "false")
|
||||
monkeypatch.setenv("DISCORD_NO_THREAD_CHANNELS", "800")
|
||||
monkeypatch.delenv("DISCORD_IGNORED_CHANNELS", raising=False)
|
||||
monkeypatch.delenv("DISCORD_FREE_RESPONSE_CHANNELS", raising=False)
|
||||
|
||||
adapter._auto_create_thread = AsyncMock()
|
||||
|
||||
message = make_message(channel=FakeTextChannel(channel_id=800), content="hello")
|
||||
await adapter._handle_message(message)
|
||||
|
||||
adapter._auto_create_thread.assert_not_awaited()
|
||||
adapter.handle_message.assert_awaited_once()
|
||||
|
||||
|
||||
# ── config.py bridging ───────────────────────────────────────────────
|
||||
|
||||
|
||||
def test_config_bridges_ignored_channels(monkeypatch, tmp_path):
|
||||
"""gateway/config.py bridges discord.ignored_channels to env var."""
|
||||
import yaml
|
||||
config_file = tmp_path / "config.yaml"
|
||||
config_file.write_text(yaml.dump({
|
||||
"discord": {
|
||||
"ignored_channels": ["111", "222"],
|
||||
},
|
||||
}))
|
||||
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
|
||||
# Use setenv (not delenv) so monkeypatch registers cleanup even when
|
||||
# the var doesn't exist yet — load_gateway_config will overwrite it.
|
||||
monkeypatch.setenv("DISCORD_IGNORED_CHANNELS", "")
|
||||
|
||||
from gateway.config import load_gateway_config
|
||||
load_gateway_config()
|
||||
|
||||
import os
|
||||
assert os.getenv("DISCORD_IGNORED_CHANNELS") == "111,222"
|
||||
|
||||
|
||||
def test_config_bridges_no_thread_channels(monkeypatch, tmp_path):
|
||||
"""gateway/config.py bridges discord.no_thread_channels to env var."""
|
||||
import yaml
|
||||
config_file = tmp_path / "config.yaml"
|
||||
config_file.write_text(yaml.dump({
|
||||
"discord": {
|
||||
"no_thread_channels": ["333"],
|
||||
},
|
||||
}))
|
||||
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
|
||||
monkeypatch.setenv("DISCORD_NO_THREAD_CHANNELS", "")
|
||||
|
||||
from gateway.config import load_gateway_config
|
||||
load_gateway_config()
|
||||
|
||||
import os
|
||||
assert os.getenv("DISCORD_NO_THREAD_CHANNELS") == "333"
|
||||
|
||||
|
||||
def test_config_env_var_takes_precedence(monkeypatch, tmp_path):
|
||||
"""Env vars should take precedence over config.yaml values."""
|
||||
import yaml
|
||||
config_file = tmp_path / "config.yaml"
|
||||
config_file.write_text(yaml.dump({
|
||||
"discord": {
|
||||
"ignored_channels": ["111"],
|
||||
},
|
||||
}))
|
||||
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
|
||||
monkeypatch.setenv("DISCORD_IGNORED_CHANNELS", "999")
|
||||
|
||||
from gateway.config import load_gateway_config
|
||||
load_gateway_config()
|
||||
|
||||
import os
|
||||
# Env var should NOT be overwritten
|
||||
assert os.getenv("DISCORD_IGNORED_CHANNELS") == "999"
|
||||
@@ -0,0 +1,432 @@
|
||||
"""Tests for Feishu interactive card approval buttons."""
|
||||
|
||||
import asyncio
|
||||
import json
|
||||
import os
|
||||
import sys
|
||||
from pathlib import Path
|
||||
from types import SimpleNamespace
|
||||
from unittest.mock import AsyncMock, MagicMock, Mock, patch
|
||||
|
||||
import pytest
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Ensure the repo root is importable
|
||||
# ---------------------------------------------------------------------------
|
||||
_repo = str(Path(__file__).resolve().parents[2])
|
||||
if _repo not in sys.path:
|
||||
sys.path.insert(0, _repo)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Minimal Feishu mock so FeishuAdapter can be imported without lark-oapi
|
||||
# ---------------------------------------------------------------------------
|
||||
def _ensure_feishu_mocks():
|
||||
"""Provide stubs for lark-oapi / aiohttp.web so the import succeeds."""
|
||||
if "lark_oapi" not in sys.modules:
|
||||
mod = MagicMock()
|
||||
for name in (
|
||||
"lark_oapi", "lark_oapi.api.im.v1",
|
||||
"lark_oapi.event", "lark_oapi.event.callback_type",
|
||||
):
|
||||
sys.modules.setdefault(name, mod)
|
||||
if "aiohttp" not in sys.modules:
|
||||
aio = MagicMock()
|
||||
sys.modules.setdefault("aiohttp", aio)
|
||||
sys.modules.setdefault("aiohttp.web", aio.web)
|
||||
|
||||
|
||||
_ensure_feishu_mocks()
|
||||
|
||||
from gateway.config import PlatformConfig
|
||||
from gateway.platforms.feishu import FeishuAdapter
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Helpers
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def _make_adapter() -> FeishuAdapter:
|
||||
"""Create a FeishuAdapter with mocked internals."""
|
||||
config = PlatformConfig(enabled=True)
|
||||
adapter = FeishuAdapter(config)
|
||||
adapter._client = MagicMock()
|
||||
return adapter
|
||||
|
||||
|
||||
def _make_card_action_data(
|
||||
action_value: dict,
|
||||
chat_id: str = "oc_12345",
|
||||
open_id: str = "ou_user1",
|
||||
token: str = "tok_abc",
|
||||
) -> SimpleNamespace:
|
||||
"""Create a mock Feishu card action callback data object."""
|
||||
return SimpleNamespace(
|
||||
event=SimpleNamespace(
|
||||
token=token,
|
||||
context=SimpleNamespace(open_chat_id=chat_id),
|
||||
operator=SimpleNamespace(open_id=open_id),
|
||||
action=SimpleNamespace(
|
||||
tag="button",
|
||||
value=action_value,
|
||||
),
|
||||
),
|
||||
)
|
||||
|
||||
|
||||
# ===========================================================================
|
||||
# send_exec_approval — interactive card with buttons
|
||||
# ===========================================================================
|
||||
|
||||
class TestFeishuExecApproval:
|
||||
"""Test send_exec_approval sends an interactive card."""
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_sends_interactive_card(self):
|
||||
adapter = _make_adapter()
|
||||
|
||||
mock_response = SimpleNamespace(
|
||||
success=lambda: True,
|
||||
data=SimpleNamespace(message_id="msg_001"),
|
||||
)
|
||||
with patch.object(
|
||||
adapter, "_feishu_send_with_retry", new_callable=AsyncMock,
|
||||
return_value=mock_response,
|
||||
) as mock_send:
|
||||
result = await adapter.send_exec_approval(
|
||||
chat_id="oc_12345",
|
||||
command="rm -rf /important",
|
||||
session_key="agent:main:feishu:group:oc_12345",
|
||||
description="dangerous deletion",
|
||||
)
|
||||
|
||||
assert result.success is True
|
||||
assert result.message_id == "msg_001"
|
||||
|
||||
mock_send.assert_called_once()
|
||||
kwargs = mock_send.call_args[1]
|
||||
assert kwargs["chat_id"] == "oc_12345"
|
||||
assert kwargs["msg_type"] == "interactive"
|
||||
|
||||
# Verify card payload contains the command and buttons
|
||||
card = json.loads(kwargs["payload"])
|
||||
assert card["header"]["template"] == "orange"
|
||||
assert "rm -rf /important" in card["elements"][0]["content"]
|
||||
assert "dangerous deletion" in card["elements"][0]["content"]
|
||||
|
||||
# Check buttons
|
||||
actions = card["elements"][1]["actions"]
|
||||
assert len(actions) == 4
|
||||
action_names = [a["value"]["hermes_action"] for a in actions]
|
||||
assert action_names == [
|
||||
"approve_once", "approve_session", "approve_always", "deny"
|
||||
]
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_stores_approval_state(self):
|
||||
adapter = _make_adapter()
|
||||
|
||||
mock_response = SimpleNamespace(
|
||||
success=lambda: True,
|
||||
data=SimpleNamespace(message_id="msg_002"),
|
||||
)
|
||||
with patch.object(
|
||||
adapter, "_feishu_send_with_retry", new_callable=AsyncMock,
|
||||
return_value=mock_response,
|
||||
):
|
||||
await adapter.send_exec_approval(
|
||||
chat_id="oc_12345",
|
||||
command="echo test",
|
||||
session_key="my-session-key",
|
||||
)
|
||||
|
||||
assert len(adapter._approval_state) == 1
|
||||
approval_id = list(adapter._approval_state.keys())[0]
|
||||
state = adapter._approval_state[approval_id]
|
||||
assert state["session_key"] == "my-session-key"
|
||||
assert state["message_id"] == "msg_002"
|
||||
assert state["chat_id"] == "oc_12345"
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_not_connected(self):
|
||||
adapter = _make_adapter()
|
||||
adapter._client = None
|
||||
result = await adapter.send_exec_approval(
|
||||
chat_id="oc_12345", command="ls", session_key="s"
|
||||
)
|
||||
assert result.success is False
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_truncates_long_command(self):
|
||||
adapter = _make_adapter()
|
||||
|
||||
mock_response = SimpleNamespace(
|
||||
success=lambda: True,
|
||||
data=SimpleNamespace(message_id="msg_003"),
|
||||
)
|
||||
with patch.object(
|
||||
adapter, "_feishu_send_with_retry", new_callable=AsyncMock,
|
||||
return_value=mock_response,
|
||||
) as mock_send:
|
||||
long_cmd = "x" * 5000
|
||||
await adapter.send_exec_approval(
|
||||
chat_id="oc_12345", command=long_cmd, session_key="s"
|
||||
)
|
||||
|
||||
card = json.loads(mock_send.call_args[1]["payload"])
|
||||
content = card["elements"][0]["content"]
|
||||
assert "..." in content
|
||||
assert len(content) < 5000
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_multiple_approvals_get_unique_ids(self):
|
||||
adapter = _make_adapter()
|
||||
|
||||
mock_response = SimpleNamespace(
|
||||
success=lambda: True,
|
||||
data=SimpleNamespace(message_id="msg_x"),
|
||||
)
|
||||
with patch.object(
|
||||
adapter, "_feishu_send_with_retry", new_callable=AsyncMock,
|
||||
return_value=mock_response,
|
||||
):
|
||||
await adapter.send_exec_approval(
|
||||
chat_id="oc_1", command="cmd1", session_key="s1"
|
||||
)
|
||||
await adapter.send_exec_approval(
|
||||
chat_id="oc_2", command="cmd2", session_key="s2"
|
||||
)
|
||||
|
||||
assert len(adapter._approval_state) == 2
|
||||
ids = list(adapter._approval_state.keys())
|
||||
assert ids[0] != ids[1]
|
||||
|
||||
|
||||
# ===========================================================================
|
||||
# _handle_card_action_event — approval button clicks
|
||||
# ===========================================================================
|
||||
|
||||
class TestFeishuApprovalCallback:
|
||||
"""Test the approval intercept in _handle_card_action_event."""
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_resolves_approval_on_click(self):
|
||||
adapter = _make_adapter()
|
||||
adapter._approval_state[1] = {
|
||||
"session_key": "agent:main:feishu:group:oc_12345",
|
||||
"message_id": "msg_001",
|
||||
"chat_id": "oc_12345",
|
||||
}
|
||||
|
||||
data = _make_card_action_data(
|
||||
action_value={"hermes_action": "approve_once", "approval_id": 1},
|
||||
)
|
||||
|
||||
with (
|
||||
patch.object(
|
||||
adapter, "_resolve_sender_profile", new_callable=AsyncMock,
|
||||
return_value={"user_id": "ou_user1", "user_name": "Norbert", "user_id_alt": None},
|
||||
),
|
||||
patch.object(adapter, "_update_approval_card", new_callable=AsyncMock) as mock_update,
|
||||
patch("tools.approval.resolve_gateway_approval", return_value=1) as mock_resolve,
|
||||
):
|
||||
await adapter._handle_card_action_event(data)
|
||||
|
||||
mock_resolve.assert_called_once_with("agent:main:feishu:group:oc_12345", "once")
|
||||
mock_update.assert_called_once_with("msg_001", "Approved once", "Norbert", "once")
|
||||
|
||||
# State should be cleaned up
|
||||
assert 1 not in adapter._approval_state
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_deny_button(self):
|
||||
adapter = _make_adapter()
|
||||
adapter._approval_state[2] = {
|
||||
"session_key": "some-session",
|
||||
"message_id": "msg_002",
|
||||
"chat_id": "oc_12345",
|
||||
}
|
||||
|
||||
data = _make_card_action_data(
|
||||
action_value={"hermes_action": "deny", "approval_id": 2},
|
||||
token="tok_deny",
|
||||
)
|
||||
|
||||
with (
|
||||
patch.object(
|
||||
adapter, "_resolve_sender_profile", new_callable=AsyncMock,
|
||||
return_value={"user_id": "ou_alice", "user_name": "Alice", "user_id_alt": None},
|
||||
),
|
||||
patch.object(adapter, "_update_approval_card", new_callable=AsyncMock) as mock_update,
|
||||
patch("tools.approval.resolve_gateway_approval", return_value=1) as mock_resolve,
|
||||
):
|
||||
await adapter._handle_card_action_event(data)
|
||||
|
||||
mock_resolve.assert_called_once_with("some-session", "deny")
|
||||
mock_update.assert_called_once_with("msg_002", "Denied", "Alice", "deny")
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_session_approval(self):
|
||||
adapter = _make_adapter()
|
||||
adapter._approval_state[3] = {
|
||||
"session_key": "sess-3",
|
||||
"message_id": "msg_003",
|
||||
"chat_id": "oc_99",
|
||||
}
|
||||
|
||||
data = _make_card_action_data(
|
||||
action_value={"hermes_action": "approve_session", "approval_id": 3},
|
||||
token="tok_ses",
|
||||
)
|
||||
|
||||
with (
|
||||
patch.object(
|
||||
adapter, "_resolve_sender_profile", new_callable=AsyncMock,
|
||||
return_value={"user_id": "ou_u", "user_name": "Bob", "user_id_alt": None},
|
||||
),
|
||||
patch.object(adapter, "_update_approval_card", new_callable=AsyncMock) as mock_update,
|
||||
patch("tools.approval.resolve_gateway_approval", return_value=1) as mock_resolve,
|
||||
):
|
||||
await adapter._handle_card_action_event(data)
|
||||
|
||||
mock_resolve.assert_called_once_with("sess-3", "session")
|
||||
mock_update.assert_called_once_with("msg_003", "Approved for session", "Bob", "session")
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_always_approval(self):
|
||||
adapter = _make_adapter()
|
||||
adapter._approval_state[4] = {
|
||||
"session_key": "sess-4",
|
||||
"message_id": "msg_004",
|
||||
"chat_id": "oc_55",
|
||||
}
|
||||
|
||||
data = _make_card_action_data(
|
||||
action_value={"hermes_action": "approve_always", "approval_id": 4},
|
||||
token="tok_alw",
|
||||
)
|
||||
|
||||
with (
|
||||
patch.object(
|
||||
adapter, "_resolve_sender_profile", new_callable=AsyncMock,
|
||||
return_value={"user_id": "ou_u", "user_name": "Carol", "user_id_alt": None},
|
||||
),
|
||||
patch.object(adapter, "_update_approval_card", new_callable=AsyncMock),
|
||||
patch("tools.approval.resolve_gateway_approval", return_value=1) as mock_resolve,
|
||||
):
|
||||
await adapter._handle_card_action_event(data)
|
||||
|
||||
mock_resolve.assert_called_once_with("sess-4", "always")
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_already_resolved_drops_silently(self):
|
||||
adapter = _make_adapter()
|
||||
# No state for approval_id 99 — already resolved
|
||||
|
||||
data = _make_card_action_data(
|
||||
action_value={"hermes_action": "approve_once", "approval_id": 99},
|
||||
token="tok_gone",
|
||||
)
|
||||
|
||||
with patch("tools.approval.resolve_gateway_approval") as mock_resolve:
|
||||
await adapter._handle_card_action_event(data)
|
||||
|
||||
# Should NOT resolve — already handled
|
||||
mock_resolve.assert_not_called()
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_non_approval_actions_route_normally(self):
|
||||
"""Non-approval card actions should still become synthetic commands."""
|
||||
adapter = _make_adapter()
|
||||
|
||||
data = _make_card_action_data(
|
||||
action_value={"custom_action": "something_else"},
|
||||
token="tok_normal",
|
||||
)
|
||||
|
||||
with (
|
||||
patch.object(
|
||||
adapter, "_resolve_sender_profile", new_callable=AsyncMock,
|
||||
return_value={"user_id": "ou_u", "user_name": "Dave", "user_id_alt": None},
|
||||
),
|
||||
patch.object(adapter, "get_chat_info", new_callable=AsyncMock, return_value={"name": "Test Chat"}),
|
||||
patch.object(adapter, "_handle_message_with_guards", new_callable=AsyncMock) as mock_handle,
|
||||
patch("tools.approval.resolve_gateway_approval") as mock_resolve,
|
||||
):
|
||||
await adapter._handle_card_action_event(data)
|
||||
|
||||
# Should NOT resolve any approval
|
||||
mock_resolve.assert_not_called()
|
||||
# Should have routed as synthetic command
|
||||
mock_handle.assert_called_once()
|
||||
event = mock_handle.call_args[0][0]
|
||||
assert "/card button" in event.text
|
||||
|
||||
|
||||
# ===========================================================================
|
||||
# _update_approval_card — card replacement after resolution
|
||||
# ===========================================================================
|
||||
|
||||
class TestFeishuUpdateApprovalCard:
|
||||
"""Test the card update after approval resolution."""
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_updates_card_on_approve(self):
|
||||
adapter = _make_adapter()
|
||||
|
||||
mock_update = AsyncMock()
|
||||
adapter._client.im.v1.message.update = MagicMock()
|
||||
|
||||
with patch("asyncio.to_thread", new_callable=AsyncMock) as mock_thread:
|
||||
await adapter._update_approval_card(
|
||||
"msg_001", "Approved once", "Norbert", "once"
|
||||
)
|
||||
|
||||
mock_thread.assert_called_once()
|
||||
# Verify the update request was built
|
||||
call_args = mock_thread.call_args
|
||||
assert call_args[0][0] == adapter._client.im.v1.message.update
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_updates_card_on_deny(self):
|
||||
adapter = _make_adapter()
|
||||
|
||||
with patch("asyncio.to_thread", new_callable=AsyncMock) as mock_thread:
|
||||
await adapter._update_approval_card(
|
||||
"msg_002", "Denied", "Alice", "deny"
|
||||
)
|
||||
|
||||
mock_thread.assert_called_once()
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_skips_update_when_not_connected(self):
|
||||
adapter = _make_adapter()
|
||||
adapter._client = None
|
||||
|
||||
with patch("asyncio.to_thread", new_callable=AsyncMock) as mock_thread:
|
||||
await adapter._update_approval_card(
|
||||
"msg_001", "Approved", "Bob", "once"
|
||||
)
|
||||
|
||||
mock_thread.assert_not_called()
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_skips_update_when_no_message_id(self):
|
||||
adapter = _make_adapter()
|
||||
|
||||
with patch("asyncio.to_thread", new_callable=AsyncMock) as mock_thread:
|
||||
await adapter._update_approval_card(
|
||||
"", "Approved", "Bob", "once"
|
||||
)
|
||||
|
||||
mock_thread.assert_not_called()
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_swallows_update_errors(self):
|
||||
adapter = _make_adapter()
|
||||
|
||||
with patch("asyncio.to_thread", new_callable=AsyncMock, side_effect=Exception("API error")):
|
||||
# Should not raise
|
||||
await adapter._update_approval_card(
|
||||
"msg_001", "Approved", "Bob", "once"
|
||||
)
|
||||
Some files were not shown because too many files have changed in this diff Show More
Reference in New Issue
Block a user