Compare commits
1 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| 2f7e5db456 |
@@ -1,13 +0,0 @@
|
||||
# Git
|
||||
.git
|
||||
.gitignore
|
||||
.gitmodules
|
||||
|
||||
# Dependencies
|
||||
node_modules
|
||||
|
||||
# CI/CD
|
||||
.github
|
||||
|
||||
# Environment files
|
||||
.env
|
||||
@@ -59,25 +59,12 @@ OPENCODE_ZEN_API_KEY=
|
||||
# OpenCode Go provides access to open models (GLM-5, Kimi K2.5, MiniMax M2.5)
|
||||
# $10/month subscription. Get your key at: https://opencode.ai/auth
|
||||
OPENCODE_GO_API_KEY=
|
||||
|
||||
# =============================================================================
|
||||
# LLM PROVIDER (Hugging Face Inference Providers)
|
||||
# =============================================================================
|
||||
# Hugging Face routes to 20+ open models via unified OpenAI-compatible endpoint.
|
||||
# Free tier included ($0.10/month), no markup on provider rates.
|
||||
# Get your token at: https://huggingface.co/settings/tokens
|
||||
# Required permission: "Make calls to Inference Providers"
|
||||
HF_TOKEN=
|
||||
# OPENCODE_GO_BASE_URL=https://opencode.ai/zen/go/v1 # Override default base URL
|
||||
|
||||
# =============================================================================
|
||||
# TOOL API KEYS
|
||||
# =============================================================================
|
||||
|
||||
# Exa API Key - AI-native web search and contents
|
||||
# Get at: https://exa.ai
|
||||
EXA_API_KEY=
|
||||
|
||||
# Parallel API Key - AI-native web search and extract
|
||||
# Get at: https://parallel.ai
|
||||
PARALLEL_API_KEY=
|
||||
|
||||
@@ -1,61 +0,0 @@
|
||||
name: Docker Build and Publish
|
||||
|
||||
on:
|
||||
push:
|
||||
branches: [main]
|
||||
pull_request:
|
||||
branches: [main]
|
||||
|
||||
concurrency:
|
||||
group: docker-${{ github.ref }}
|
||||
cancel-in-progress: true
|
||||
|
||||
jobs:
|
||||
build-and-push:
|
||||
runs-on: ubuntu-latest
|
||||
timeout-minutes: 30
|
||||
steps:
|
||||
- name: Checkout code
|
||||
uses: actions/checkout@v4
|
||||
with:
|
||||
submodules: recursive
|
||||
|
||||
- name: Set up Docker Buildx
|
||||
uses: docker/setup-buildx-action@v3
|
||||
|
||||
- name: Build image
|
||||
uses: docker/build-push-action@v6
|
||||
with:
|
||||
context: .
|
||||
file: Dockerfile
|
||||
load: true
|
||||
tags: nousresearch/hermes-agent:test
|
||||
cache-from: type=gha
|
||||
cache-to: type=gha,mode=max
|
||||
|
||||
- name: Test image starts
|
||||
run: |
|
||||
docker run --rm \
|
||||
-v /tmp/hermes-test:/opt/data \
|
||||
--entrypoint /opt/hermes/docker/entrypoint.sh \
|
||||
nousresearch/hermes-agent:test --help
|
||||
|
||||
- name: Log in to Docker Hub
|
||||
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
|
||||
uses: docker/login-action@v3
|
||||
with:
|
||||
username: ${{ secrets.DOCKERHUB_USERNAME }}
|
||||
password: ${{ secrets.DOCKERHUB_TOKEN }}
|
||||
|
||||
- name: Push image
|
||||
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
|
||||
uses: docker/build-push-action@v6
|
||||
with:
|
||||
context: .
|
||||
file: Dockerfile
|
||||
push: true
|
||||
tags: |
|
||||
nousresearch/hermes-agent:latest
|
||||
nousresearch/hermes-agent:${{ github.sha }}
|
||||
cache-from: type=gha
|
||||
cache-to: type=gha,mode=max
|
||||
-20
@@ -1,20 +0,0 @@
|
||||
FROM debian:13.4
|
||||
|
||||
RUN apt-get update
|
||||
RUN apt-get install -y nodejs npm python3 python3-pip ripgrep ffmpeg gcc python3-dev libffi-dev
|
||||
|
||||
COPY . /opt/hermes
|
||||
WORKDIR /opt/hermes
|
||||
|
||||
RUN pip install -e ".[all]" --break-system-packages
|
||||
RUN npm install
|
||||
RUN npx playwright install --with-deps chromium
|
||||
WORKDIR /opt/hermes/scripts/whatsapp-bridge
|
||||
RUN npm install
|
||||
|
||||
WORKDIR /opt/hermes
|
||||
RUN chmod +x /opt/hermes/docker/entrypoint.sh
|
||||
|
||||
ENV HERMES_HOME=/opt/data
|
||||
VOLUME [ "/opt/data" ]
|
||||
ENTRYPOINT [ "/opt/hermes/docker/entrypoint.sh" ]
|
||||
@@ -1,348 +0,0 @@
|
||||
# Hermes Agent v0.5.0 (v2026.3.28)
|
||||
|
||||
**Release Date:** March 28, 2026
|
||||
|
||||
> The hardening release — Hugging Face provider, /model command overhaul, Telegram Private Chat Topics, native Modal SDK, plugin lifecycle hooks, tool-use enforcement for GPT models, Nix flake, 50+ security and reliability fixes, and a comprehensive supply chain audit.
|
||||
|
||||
---
|
||||
|
||||
## ✨ Highlights
|
||||
|
||||
- **Nous Portal now supports 400+ models** — The Nous Research inference portal has expanded dramatically, giving Hermes Agent users access to over 400 models through a single provider endpoint
|
||||
|
||||
- **Hugging Face as a first-class inference provider** — Full integration with HF Inference API including curated agentic model picker that maps to OpenRouter analogues, live `/models` endpoint probe, and setup wizard flow ([#3419](https://github.com/NousResearch/hermes-agent/pull/3419), [#3440](https://github.com/NousResearch/hermes-agent/pull/3440))
|
||||
|
||||
- **Telegram Private Chat Topics** — Project-based conversations with functional skill binding per topic, enabling isolated workflows within a single Telegram chat ([#3163](https://github.com/NousResearch/hermes-agent/pull/3163))
|
||||
|
||||
- **Native Modal SDK backend** — Replaced swe-rex dependency with native Modal SDK (`Sandbox.create.aio` + `exec.aio`), eliminating tunnels and simplifying the Modal terminal backend ([#3538](https://github.com/NousResearch/hermes-agent/pull/3538))
|
||||
|
||||
- **Plugin lifecycle hooks activated** — `pre_llm_call`, `post_llm_call`, `on_session_start`, and `on_session_end` hooks now fire in the agent loop and CLI/gateway, completing the plugin hook system ([#3542](https://github.com/NousResearch/hermes-agent/pull/3542))
|
||||
|
||||
- **Improved OpenAI Model Reliability** — Added `GPT_TOOL_USE_GUIDANCE` to prevent GPT models from describing intended actions instead of making tool calls, plus automatic stripping of stale budget warnings from conversation history that caused models to avoid tools across turns ([#3528](https://github.com/NousResearch/hermes-agent/pull/3528))
|
||||
|
||||
- **Nix flake** — Full uv2nix build, NixOS module with persistent container mode, auto-generated config keys from Python source, and suffix PATHs for agent-friendliness ([#20](https://github.com/NousResearch/hermes-agent/pull/20), [#3274](https://github.com/NousResearch/hermes-agent/pull/3274), [#3061](https://github.com/NousResearch/hermes-agent/pull/3061)) by @alt-glitch
|
||||
|
||||
- **Supply chain hardening** — Removed compromised `litellm` dependency, pinned all dependency version ranges, regenerated `uv.lock` with hashes, added CI workflow scanning PRs for supply chain attack patterns, and bumped deps to fix CVEs ([#2796](https://github.com/NousResearch/hermes-agent/pull/2796), [#2810](https://github.com/NousResearch/hermes-agent/pull/2810), [#2812](https://github.com/NousResearch/hermes-agent/pull/2812), [#2816](https://github.com/NousResearch/hermes-agent/pull/2816), [#3073](https://github.com/NousResearch/hermes-agent/pull/3073))
|
||||
|
||||
- **Anthropic output limits fix** — Replaced hardcoded 16K `max_tokens` with per-model native output limits (128K for Opus 4.6, 64K for Sonnet 4.6), fixing "Response truncated" and thinking-budget exhaustion on direct Anthropic API ([#3426](https://github.com/NousResearch/hermes-agent/pull/3426), [#3444](https://github.com/NousResearch/hermes-agent/pull/3444))
|
||||
|
||||
---
|
||||
|
||||
## 🏗️ Core Agent & Architecture
|
||||
|
||||
### New Provider: Hugging Face
|
||||
- First-class Hugging Face Inference API integration with auth, setup wizard, and model picker ([#3419](https://github.com/NousResearch/hermes-agent/pull/3419))
|
||||
- Curated model list mapping OpenRouter agentic defaults to HF equivalents — providers with 8+ curated models skip live `/models` probe for speed ([#3440](https://github.com/NousResearch/hermes-agent/pull/3440))
|
||||
- Added glm-5-turbo to Z.AI provider model list ([#3095](https://github.com/NousResearch/hermes-agent/pull/3095))
|
||||
|
||||
### Provider & Model Improvements
|
||||
- `/model` command overhaul — extracted shared `switch_model()` pipeline for CLI and gateway, custom endpoint support, provider-aware routing ([#2795](https://github.com/NousResearch/hermes-agent/pull/2795), [#2799](https://github.com/NousResearch/hermes-agent/pull/2799))
|
||||
- Removed `/model` slash command from CLI and gateway in favor of `hermes model` subcommand ([#3080](https://github.com/NousResearch/hermes-agent/pull/3080))
|
||||
- Preserve `custom` provider instead of silently remapping to `openrouter` ([#2792](https://github.com/NousResearch/hermes-agent/pull/2792))
|
||||
- Read root-level `provider` and `base_url` from config.yaml into model config ([#3112](https://github.com/NousResearch/hermes-agent/pull/3112))
|
||||
- Align Nous Portal model slugs with OpenRouter naming ([#3253](https://github.com/NousResearch/hermes-agent/pull/3253))
|
||||
- Fix Alibaba provider default endpoint and model list ([#3484](https://github.com/NousResearch/hermes-agent/pull/3484))
|
||||
- Allow MiniMax users to override `/v1` → `/anthropic` auto-correction ([#3553](https://github.com/NousResearch/hermes-agent/pull/3553))
|
||||
- Migrate OAuth token refresh to `platform.claude.com` with fallback ([#3246](https://github.com/NousResearch/hermes-agent/pull/3246))
|
||||
|
||||
### Agent Loop & Conversation
|
||||
- **Improved OpenAI model reliability** — `GPT_TOOL_USE_GUIDANCE` prevents GPT models from describing actions instead of calling tools + automatic budget warning stripping from history ([#3528](https://github.com/NousResearch/hermes-agent/pull/3528))
|
||||
- **Surface lifecycle events** — All retry, fallback, and compression events now surface to the user as formatted messages ([#3153](https://github.com/NousResearch/hermes-agent/pull/3153))
|
||||
- **Anthropic output limits** — Per-model native output limits instead of hardcoded 16K `max_tokens` ([#3426](https://github.com/NousResearch/hermes-agent/pull/3426))
|
||||
- **Thinking-budget exhaustion detection** — Skip useless continuation retries when model uses all output tokens on reasoning ([#3444](https://github.com/NousResearch/hermes-agent/pull/3444))
|
||||
- Always prefer streaming for API calls to prevent hung subagents ([#3120](https://github.com/NousResearch/hermes-agent/pull/3120))
|
||||
- Restore safe non-streaming fallback after stream failures ([#3020](https://github.com/NousResearch/hermes-agent/pull/3020))
|
||||
- Give subagents independent iteration budgets ([#3004](https://github.com/NousResearch/hermes-agent/pull/3004))
|
||||
- Update `api_key` in `_try_activate_fallback` for subagent auth ([#3103](https://github.com/NousResearch/hermes-agent/pull/3103))
|
||||
- Graceful return on max retries instead of crashing thread ([untagged commit](https://github.com/NousResearch/hermes-agent))
|
||||
- Count compression restarts toward retry limit ([#3070](https://github.com/NousResearch/hermes-agent/pull/3070))
|
||||
- Include tool tokens in preflight estimate, guard context probe persistence ([#3164](https://github.com/NousResearch/hermes-agent/pull/3164))
|
||||
- Update context compressor limits after fallback activation ([#3305](https://github.com/NousResearch/hermes-agent/pull/3305))
|
||||
- Validate empty user messages to prevent Anthropic API 400 errors ([#3322](https://github.com/NousResearch/hermes-agent/pull/3322))
|
||||
- GLM reasoning-only and max-length handling ([#3010](https://github.com/NousResearch/hermes-agent/pull/3010))
|
||||
- Increase API timeout default from 900s to 1800s for slow-thinking models ([#3431](https://github.com/NousResearch/hermes-agent/pull/3431))
|
||||
- Send `max_tokens` for Claude/OpenRouter + retry SSE connection errors ([#3497](https://github.com/NousResearch/hermes-agent/pull/3497))
|
||||
- Prevent AsyncOpenAI/httpx cross-loop deadlock in gateway mode ([#2701](https://github.com/NousResearch/hermes-agent/pull/2701)) by @ctlst
|
||||
|
||||
### Streaming & Reasoning
|
||||
- **Persist reasoning across gateway session turns** with new schema v6 columns (`reasoning`, `reasoning_details`, `codex_reasoning_items`) ([#2974](https://github.com/NousResearch/hermes-agent/pull/2974))
|
||||
- Detect and kill stale SSE connections ([untagged commit](https://github.com/NousResearch/hermes-agent))
|
||||
- Fix stale stream detector race causing spurious `RemoteProtocolError` ([untagged commit](https://github.com/NousResearch/hermes-agent))
|
||||
- Skip duplicate callback for `<think>`-extracted reasoning during streaming ([#3116](https://github.com/NousResearch/hermes-agent/pull/3116))
|
||||
- Preserve reasoning fields in `rewrite_transcript` ([#3311](https://github.com/NousResearch/hermes-agent/pull/3311))
|
||||
- Preserve Gemini thought signatures in streamed tool calls ([#2997](https://github.com/NousResearch/hermes-agent/pull/2997))
|
||||
- Ensure first delta is fired during reasoning updates ([untagged commit](https://github.com/NousResearch/hermes-agent))
|
||||
|
||||
### Session & Memory
|
||||
- **Session search recent sessions mode** — Omit query to browse recent sessions with titles, previews, and timestamps ([#2533](https://github.com/NousResearch/hermes-agent/pull/2533))
|
||||
- **Session config surfacing** on `/new`, `/reset`, and auto-reset ([#3321](https://github.com/NousResearch/hermes-agent/pull/3321))
|
||||
- **Third-party session isolation** — `--source` flag for isolating sessions by origin ([#3255](https://github.com/NousResearch/hermes-agent/pull/3255))
|
||||
- Add `/resume` CLI handler, session log truncation guard, `reopen_session` API ([#3315](https://github.com/NousResearch/hermes-agent/pull/3315))
|
||||
- Clear compressor summary and turn counter on `/clear` and `/new` ([#3102](https://github.com/NousResearch/hermes-agent/pull/3102))
|
||||
- Surface silent SessionDB failures that cause session data loss ([#2999](https://github.com/NousResearch/hermes-agent/pull/2999))
|
||||
- Session search fallback preview on summarization failure ([#3478](https://github.com/NousResearch/hermes-agent/pull/3478))
|
||||
- Prevent stale memory overwrites by flush agent ([#2687](https://github.com/NousResearch/hermes-agent/pull/2687))
|
||||
|
||||
### Context Compression
|
||||
- Replace dead `summary_target_tokens` with ratio-based scaling ([#2554](https://github.com/NousResearch/hermes-agent/pull/2554))
|
||||
- Expose `compression.target_ratio`, `protect_last_n`, and `threshold` in `DEFAULT_CONFIG` ([untagged commit](https://github.com/NousResearch/hermes-agent))
|
||||
- Restore sane defaults and cap summary at 12K tokens ([untagged commit](https://github.com/NousResearch/hermes-agent))
|
||||
- Preserve transcript on `/compress` and hygiene compression ([#3556](https://github.com/NousResearch/hermes-agent/pull/3556))
|
||||
- Update context pressure warnings and token estimates after compaction ([untagged commit](https://github.com/NousResearch/hermes-agent))
|
||||
|
||||
### Architecture & Dependencies
|
||||
- **Remove mini-swe-agent dependency** — Inline Docker and Modal backends directly ([#2804](https://github.com/NousResearch/hermes-agent/pull/2804))
|
||||
- **Replace swe-rex with native Modal SDK** for Modal backend ([#3538](https://github.com/NousResearch/hermes-agent/pull/3538))
|
||||
- **Plugin lifecycle hooks** — `pre_llm_call`, `post_llm_call`, `on_session_start`, `on_session_end` now fire in the agent loop ([#3542](https://github.com/NousResearch/hermes-agent/pull/3542))
|
||||
- Fix plugin toolsets invisible in `hermes tools` and standalone processes ([#3457](https://github.com/NousResearch/hermes-agent/pull/3457))
|
||||
- Consolidate `get_hermes_home()` and `parse_reasoning_effort()` ([#3062](https://github.com/NousResearch/hermes-agent/pull/3062))
|
||||
- Remove unused Hermes-native PKCE OAuth flow ([#3107](https://github.com/NousResearch/hermes-agent/pull/3107))
|
||||
- Remove ~100 unused imports across 55 files ([#3016](https://github.com/NousResearch/hermes-agent/pull/3016))
|
||||
- Fix 154 f-strings, simplify getattr/URL patterns, remove dead code ([#3119](https://github.com/NousResearch/hermes-agent/pull/3119))
|
||||
|
||||
---
|
||||
|
||||
## 📱 Messaging Platforms (Gateway)
|
||||
|
||||
### Telegram
|
||||
- **Private Chat Topics** — Project-based conversations with functional skill binding per topic, enabling isolated workflows within a single Telegram chat ([#3163](https://github.com/NousResearch/hermes-agent/pull/3163))
|
||||
- **Auto-discover fallback IPs via DNS-over-HTTPS** when `api.telegram.org` is unreachable ([#3376](https://github.com/NousResearch/hermes-agent/pull/3376))
|
||||
- **Configurable reply threading mode** ([#2907](https://github.com/NousResearch/hermes-agent/pull/2907))
|
||||
- Fall back to no `thread_id` on "Message thread not found" BadRequest ([#3390](https://github.com/NousResearch/hermes-agent/pull/3390))
|
||||
- Self-reschedule reconnect when `start_polling` fails after 502 ([#3268](https://github.com/NousResearch/hermes-agent/pull/3268))
|
||||
|
||||
### Discord
|
||||
- Stop phantom typing indicator after agent turn completes ([#3003](https://github.com/NousResearch/hermes-agent/pull/3003))
|
||||
|
||||
### Slack
|
||||
- Send tool call progress messages to correct Slack thread ([#3063](https://github.com/NousResearch/hermes-agent/pull/3063))
|
||||
- Scope progress thread fallback to Slack only ([#3488](https://github.com/NousResearch/hermes-agent/pull/3488))
|
||||
|
||||
### WhatsApp
|
||||
- Download documents, audio, and video media from messages ([#2978](https://github.com/NousResearch/hermes-agent/pull/2978))
|
||||
|
||||
### Matrix
|
||||
- Add missing Matrix entry in `PLATFORMS` dict ([#3473](https://github.com/NousResearch/hermes-agent/pull/3473))
|
||||
- Harden e2ee access-token handling ([#3562](https://github.com/NousResearch/hermes-agent/pull/3562))
|
||||
- Add backoff for `SyncError` in sync loop ([#3280](https://github.com/NousResearch/hermes-agent/pull/3280))
|
||||
|
||||
### Signal
|
||||
- Track SSE keepalive comments as connection activity ([#3316](https://github.com/NousResearch/hermes-agent/pull/3316))
|
||||
|
||||
### Email
|
||||
- Prevent unbounded growth of `_seen_uids` in EmailAdapter ([#3490](https://github.com/NousResearch/hermes-agent/pull/3490))
|
||||
|
||||
### Gateway Core
|
||||
- **Config-gated `/verbose` command** for messaging platforms — toggle tool output verbosity from chat ([#3262](https://github.com/NousResearch/hermes-agent/pull/3262))
|
||||
- **Background review notifications** delivered to user chat ([#3293](https://github.com/NousResearch/hermes-agent/pull/3293))
|
||||
- **Retry transient send failures** and notify user on exhaustion ([#3288](https://github.com/NousResearch/hermes-agent/pull/3288))
|
||||
- Recover from hung agents — `/stop` hard-kills session lock ([#3104](https://github.com/NousResearch/hermes-agent/pull/3104))
|
||||
- Thread-safe `SessionStore` — protect `_entries` with `threading.Lock` ([#3052](https://github.com/NousResearch/hermes-agent/pull/3052))
|
||||
- Fix gateway token double-counting with cached agents — use absolute set instead of increment ([#3306](https://github.com/NousResearch/hermes-agent/pull/3306), [#3317](https://github.com/NousResearch/hermes-agent/pull/3317))
|
||||
- Fingerprint full auth token in agent cache signature ([#3247](https://github.com/NousResearch/hermes-agent/pull/3247))
|
||||
- Silence background agent terminal output ([#3297](https://github.com/NousResearch/hermes-agent/pull/3297))
|
||||
- Include per-platform `ALLOW_ALL` and `SIGNAL_GROUP` in startup allowlist check ([#3313](https://github.com/NousResearch/hermes-agent/pull/3313))
|
||||
- Include user-local bin paths in systemd unit PATH ([#3527](https://github.com/NousResearch/hermes-agent/pull/3527))
|
||||
- Track background task references in `GatewayRunner` ([#3254](https://github.com/NousResearch/hermes-agent/pull/3254))
|
||||
- Add request timeouts to HA, Email, Mattermost, SMS adapters ([#3258](https://github.com/NousResearch/hermes-agent/pull/3258))
|
||||
- Add media download retry to Mattermost, Slack, and base cache ([#3323](https://github.com/NousResearch/hermes-agent/pull/3323))
|
||||
- Detect virtualenv path instead of hardcoding `venv/` ([#2797](https://github.com/NousResearch/hermes-agent/pull/2797))
|
||||
- Use `TERMINAL_CWD` for context file discovery, not process cwd ([untagged commit](https://github.com/NousResearch/hermes-agent))
|
||||
- Stop loading hermes repo AGENTS.md into gateway sessions (~10k wasted tokens) ([#2891](https://github.com/NousResearch/hermes-agent/pull/2891))
|
||||
|
||||
---
|
||||
|
||||
## 🖥️ CLI & User Experience
|
||||
|
||||
### Interactive CLI
|
||||
- **Configurable busy input mode** + fix `/queue` always working ([#3298](https://github.com/NousResearch/hermes-agent/pull/3298))
|
||||
- **Preserve user input on multiline paste** ([#3065](https://github.com/NousResearch/hermes-agent/pull/3065))
|
||||
- **Tool generation callback** — streaming "preparing terminal…" updates during tool argument generation ([untagged commit](https://github.com/NousResearch/hermes-agent))
|
||||
- Show tool progress for substantive tools, not just "preparing" ([untagged commit](https://github.com/NousResearch/hermes-agent))
|
||||
- Buffer reasoning preview chunks and fix duplicate display ([#3013](https://github.com/NousResearch/hermes-agent/pull/3013))
|
||||
- Prevent reasoning box from rendering 3x during tool-calling loops ([#3405](https://github.com/NousResearch/hermes-agent/pull/3405))
|
||||
- Eliminate "Event loop is closed" / "Press ENTER to continue" during idle — three-layer fix with `neuter_async_httpx_del()`, custom exception handler, and stale client cleanup ([#3398](https://github.com/NousResearch/hermes-agent/pull/3398))
|
||||
- Fix status bar shows 26K instead of 260K for token counts with trailing zeros ([#3024](https://github.com/NousResearch/hermes-agent/pull/3024))
|
||||
- Fix status bar duplicates and degrades during long sessions ([#3291](https://github.com/NousResearch/hermes-agent/pull/3291))
|
||||
- Refresh TUI before background task output to prevent status bar overlap ([#3048](https://github.com/NousResearch/hermes-agent/pull/3048))
|
||||
- Suppress KawaiiSpinner animation under `patch_stdout` ([#2994](https://github.com/NousResearch/hermes-agent/pull/2994))
|
||||
- Skip KawaiiSpinner when TUI handles tool progress ([#2973](https://github.com/NousResearch/hermes-agent/pull/2973))
|
||||
- Guard `isatty()` against closed streams via `_is_tty` property ([#3056](https://github.com/NousResearch/hermes-agent/pull/3056))
|
||||
- Ensure single closure of streaming boxes during tool generation ([untagged commit](https://github.com/NousResearch/hermes-agent))
|
||||
- Cap context pressure percentage at 100% in display ([#3480](https://github.com/NousResearch/hermes-agent/pull/3480))
|
||||
- Clean up HTML error messages in CLI display ([#3069](https://github.com/NousResearch/hermes-agent/pull/3069))
|
||||
- Show HTTP status code and 400 body in API error output ([#3096](https://github.com/NousResearch/hermes-agent/pull/3096))
|
||||
- Extract useful info from HTML error pages, dump debug on max retries ([untagged commit](https://github.com/NousResearch/hermes-agent))
|
||||
- Prevent TypeError on startup when `base_url` is None ([#3068](https://github.com/NousResearch/hermes-agent/pull/3068))
|
||||
- Prevent update crash in non-TTY environments ([#3094](https://github.com/NousResearch/hermes-agent/pull/3094))
|
||||
- Handle EOFError in sessions delete/prune confirmation prompts ([#3101](https://github.com/NousResearch/hermes-agent/pull/3101))
|
||||
- Catch KeyboardInterrupt during `flush_memories` on exit and in exit cleanup handlers ([#3025](https://github.com/NousResearch/hermes-agent/pull/3025), [#3257](https://github.com/NousResearch/hermes-agent/pull/3257))
|
||||
- Guard `.strip()` against None values from YAML config ([#3552](https://github.com/NousResearch/hermes-agent/pull/3552))
|
||||
- Guard `config.get()` against YAML null values to prevent AttributeError ([#3377](https://github.com/NousResearch/hermes-agent/pull/3377))
|
||||
- Store asyncio task references to prevent GC mid-execution ([#3267](https://github.com/NousResearch/hermes-agent/pull/3267))
|
||||
|
||||
### Setup & Configuration
|
||||
- Use explicit key mapping for returning-user menu dispatch instead of positional index ([#3083](https://github.com/NousResearch/hermes-agent/pull/3083))
|
||||
- Use `sys.executable` for pip in update commands to fix PEP 668 ([#3099](https://github.com/NousResearch/hermes-agent/pull/3099))
|
||||
- Harden `hermes update` against diverged history, non-main branches, and gateway edge cases ([#3492](https://github.com/NousResearch/hermes-agent/pull/3492))
|
||||
- OpenClaw migration overwrites defaults and setup wizard skips imported sections — fixed ([#3282](https://github.com/NousResearch/hermes-agent/pull/3282))
|
||||
- Stop recursive AGENTS.md walk, load top-level only ([#3110](https://github.com/NousResearch/hermes-agent/pull/3110))
|
||||
- Add macOS Homebrew paths to browser and terminal PATH resolution ([#2713](https://github.com/NousResearch/hermes-agent/pull/2713))
|
||||
- YAML boolean handling for `tool_progress` config ([#3300](https://github.com/NousResearch/hermes-agent/pull/3300))
|
||||
- Reset default SOUL.md to baseline identity text ([#3159](https://github.com/NousResearch/hermes-agent/pull/3159))
|
||||
- Reject relative cwd paths for container terminal backends ([untagged commit](https://github.com/NousResearch/hermes-agent))
|
||||
- Add explicit `hermes-api-server` toolset for API server platform ([#3304](https://github.com/NousResearch/hermes-agent/pull/3304))
|
||||
- Reorder setup wizard providers — OpenRouter first ([untagged commit](https://github.com/NousResearch/hermes-agent))
|
||||
|
||||
---
|
||||
|
||||
## 🔧 Tool System
|
||||
|
||||
### API Server
|
||||
- **Idempotency-Key support**, body size limit, and OpenAI error envelope ([#2903](https://github.com/NousResearch/hermes-agent/pull/2903))
|
||||
- Allow Idempotency-Key in CORS headers ([#3530](https://github.com/NousResearch/hermes-agent/pull/3530))
|
||||
- Cancel orphaned agent + true interrupt on SSE disconnect ([#3427](https://github.com/NousResearch/hermes-agent/pull/3427))
|
||||
- Fix streaming breaks when agent makes tool calls ([#2985](https://github.com/NousResearch/hermes-agent/pull/2985))
|
||||
|
||||
### Terminal & File Operations
|
||||
- Handle addition-only hunks in V4A patch parser ([#3325](https://github.com/NousResearch/hermes-agent/pull/3325))
|
||||
- Exponential backoff for persistent shell polling ([#2996](https://github.com/NousResearch/hermes-agent/pull/2996))
|
||||
- Add timeout to subprocess calls in `context_references` ([#3469](https://github.com/NousResearch/hermes-agent/pull/3469))
|
||||
|
||||
### Browser & Vision
|
||||
- Handle 402 insufficient credits error in vision tool ([#2802](https://github.com/NousResearch/hermes-agent/pull/2802))
|
||||
- Fix `browser_vision` ignores `auxiliary.vision.timeout` config ([#2901](https://github.com/NousResearch/hermes-agent/pull/2901))
|
||||
- Make browser command timeout configurable via config.yaml ([#2801](https://github.com/NousResearch/hermes-agent/pull/2801))
|
||||
|
||||
### MCP
|
||||
- MCP toolset resolution for runtime and config ([#3252](https://github.com/NousResearch/hermes-agent/pull/3252))
|
||||
- Add MCP tool name collision protection ([#3077](https://github.com/NousResearch/hermes-agent/pull/3077))
|
||||
|
||||
### Auxiliary LLM
|
||||
- Guard aux LLM calls against None content + reasoning fallback + retry ([#3449](https://github.com/NousResearch/hermes-agent/pull/3449))
|
||||
- Catch ImportError from `build_anthropic_client` in vision auto-detection ([#3312](https://github.com/NousResearch/hermes-agent/pull/3312))
|
||||
|
||||
### Other Tools
|
||||
- Add request timeouts to `send_message_tool` HTTP calls ([#3162](https://github.com/NousResearch/hermes-agent/pull/3162)) by @memosr
|
||||
- Auto-repair `jobs.json` with invalid control characters ([#3537](https://github.com/NousResearch/hermes-agent/pull/3537))
|
||||
- Enable fine-grained tool streaming for Claude/OpenRouter ([#3497](https://github.com/NousResearch/hermes-agent/pull/3497))
|
||||
|
||||
---
|
||||
|
||||
## 🧩 Skills Ecosystem
|
||||
|
||||
### Skills System
|
||||
- **Env var passthrough** for skills and user config — skills can declare environment variables to pass through ([#2807](https://github.com/NousResearch/hermes-agent/pull/2807))
|
||||
- Cache skills prompt with shared `skill_utils` module for faster TTFT ([#3421](https://github.com/NousResearch/hermes-agent/pull/3421))
|
||||
- Avoid redundant file re-read for skill conditions ([#2992](https://github.com/NousResearch/hermes-agent/pull/2992))
|
||||
- Use Git Trees API to prevent silent subdirectory loss during install ([#2995](https://github.com/NousResearch/hermes-agent/pull/2995))
|
||||
- Fix skills-sh install for deeply nested repo structures ([#2980](https://github.com/NousResearch/hermes-agent/pull/2980))
|
||||
- Handle null metadata in skill frontmatter ([untagged commit](https://github.com/NousResearch/hermes-agent))
|
||||
- Preserve trust for skills-sh identifiers + reduce resolution churn ([#3251](https://github.com/NousResearch/hermes-agent/pull/3251))
|
||||
- Agent-created skills were incorrectly treated as untrusted community content — fixed ([untagged commit](https://github.com/NousResearch/hermes-agent))
|
||||
|
||||
### New Skills
|
||||
- **G0DM0D3 godmode jailbreaking skill** + docs ([#3157](https://github.com/NousResearch/hermes-agent/pull/3157))
|
||||
- **Docker management skill** added to optional-skills ([#3060](https://github.com/NousResearch/hermes-agent/pull/3060))
|
||||
- **OpenClaw migration v2** — 17 new modules, terminal recap for migrating from OpenClaw to Hermes ([#2906](https://github.com/NousResearch/hermes-agent/pull/2906))
|
||||
|
||||
---
|
||||
|
||||
## 🔒 Security & Reliability
|
||||
|
||||
### Security Hardening
|
||||
- **SSRF protection** added to `browser_navigate` ([#3058](https://github.com/NousResearch/hermes-agent/pull/3058))
|
||||
- **SSRF protection** added to `vision_tools` and `web_tools` (hardened) ([#2679](https://github.com/NousResearch/hermes-agent/pull/2679))
|
||||
- **Restrict subagent toolsets** to parent's enabled set ([#3269](https://github.com/NousResearch/hermes-agent/pull/3269))
|
||||
- **Prevent zip-slip path traversal** in self-update ([#3250](https://github.com/NousResearch/hermes-agent/pull/3250))
|
||||
- **Prevent shell injection** in `_expand_path` via `~user` path suffix ([#2685](https://github.com/NousResearch/hermes-agent/pull/2685))
|
||||
- **Normalize input** before dangerous command detection ([#3260](https://github.com/NousResearch/hermes-agent/pull/3260))
|
||||
- Make tirith block verdicts approvable instead of hard-blocking ([#3428](https://github.com/NousResearch/hermes-agent/pull/3428))
|
||||
- Remove compromised `litellm`/`typer`/`platformdirs` from deps ([#2796](https://github.com/NousResearch/hermes-agent/pull/2796))
|
||||
- Pin all dependency version ranges ([#2810](https://github.com/NousResearch/hermes-agent/pull/2810))
|
||||
- Regenerate `uv.lock` with hashes, use lockfile in setup ([#2812](https://github.com/NousResearch/hermes-agent/pull/2812))
|
||||
- Bump dependencies to fix CVEs + regenerate `uv.lock` ([#3073](https://github.com/NousResearch/hermes-agent/pull/3073))
|
||||
- Supply chain audit CI workflow for PR scanning ([#2816](https://github.com/NousResearch/hermes-agent/pull/2816))
|
||||
|
||||
### Reliability
|
||||
- **SQLite WAL write-lock contention** causing 15-20s TUI freeze — fixed ([#3385](https://github.com/NousResearch/hermes-agent/pull/3385))
|
||||
- **SQLite concurrency hardening** + session transcript integrity ([#3249](https://github.com/NousResearch/hermes-agent/pull/3249))
|
||||
- Prevent recurring cron job re-fire on gateway crash/restart loop ([#3396](https://github.com/NousResearch/hermes-agent/pull/3396))
|
||||
- Mark cron session as ended after job completes ([#2998](https://github.com/NousResearch/hermes-agent/pull/2998))
|
||||
|
||||
---
|
||||
|
||||
## ⚡ Performance
|
||||
|
||||
- **TTFT startup optimizations** — salvaged easy-win startup improvements ([#3395](https://github.com/NousResearch/hermes-agent/pull/3395))
|
||||
- Cache skills prompt with shared `skill_utils` module ([#3421](https://github.com/NousResearch/hermes-agent/pull/3421))
|
||||
- Avoid redundant file re-read for skill conditions in prompt builder ([#2992](https://github.com/NousResearch/hermes-agent/pull/2992))
|
||||
|
||||
---
|
||||
|
||||
## 🐛 Notable Bug Fixes
|
||||
|
||||
- Fix gateway token double-counting with cached agents ([#3306](https://github.com/NousResearch/hermes-agent/pull/3306), [#3317](https://github.com/NousResearch/hermes-agent/pull/3317))
|
||||
- Fix "Event loop is closed" / "Press ENTER to continue" during idle sessions ([#3398](https://github.com/NousResearch/hermes-agent/pull/3398))
|
||||
- Fix reasoning box rendering 3x during tool-calling loops ([#3405](https://github.com/NousResearch/hermes-agent/pull/3405))
|
||||
- Fix status bar shows 26K instead of 260K for token counts ([#3024](https://github.com/NousResearch/hermes-agent/pull/3024))
|
||||
- Fix `/queue` always working regardless of config ([#3298](https://github.com/NousResearch/hermes-agent/pull/3298))
|
||||
- Fix phantom Discord typing indicator after agent turn ([#3003](https://github.com/NousResearch/hermes-agent/pull/3003))
|
||||
- Fix Slack progress messages appearing in wrong thread ([#3063](https://github.com/NousResearch/hermes-agent/pull/3063))
|
||||
- Fix WhatsApp media downloads (documents, audio, video) ([#2978](https://github.com/NousResearch/hermes-agent/pull/2978))
|
||||
- Fix Telegram "Message thread not found" killing progress messages ([#3390](https://github.com/NousResearch/hermes-agent/pull/3390))
|
||||
- Fix OpenClaw migration overwriting defaults ([#3282](https://github.com/NousResearch/hermes-agent/pull/3282))
|
||||
- Fix returning-user setup menu dispatching wrong section ([#3083](https://github.com/NousResearch/hermes-agent/pull/3083))
|
||||
- Fix `hermes update` PEP 668 "externally-managed-environment" error ([#3099](https://github.com/NousResearch/hermes-agent/pull/3099))
|
||||
- Fix subagents hitting `max_iterations` prematurely via shared budget ([#3004](https://github.com/NousResearch/hermes-agent/pull/3004))
|
||||
- Fix YAML boolean handling for `tool_progress` config ([#3300](https://github.com/NousResearch/hermes-agent/pull/3300))
|
||||
- Fix `config.get()` crashes on YAML null values ([#3377](https://github.com/NousResearch/hermes-agent/pull/3377))
|
||||
- Fix `.strip()` crash on None values from YAML config ([#3552](https://github.com/NousResearch/hermes-agent/pull/3552))
|
||||
- Fix hung agents on gateway — `/stop` now hard-kills session lock ([#3104](https://github.com/NousResearch/hermes-agent/pull/3104))
|
||||
- Fix `_custom` provider silently remapped to `openrouter` ([#2792](https://github.com/NousResearch/hermes-agent/pull/2792))
|
||||
- Fix Matrix missing from `PLATFORMS` dict ([#3473](https://github.com/NousResearch/hermes-agent/pull/3473))
|
||||
- Fix Email adapter unbounded `_seen_uids` growth ([#3490](https://github.com/NousResearch/hermes-agent/pull/3490))
|
||||
|
||||
---
|
||||
|
||||
## 🧪 Testing
|
||||
|
||||
- Pin `agent-client-protocol` < 0.9 to handle breaking upstream release ([#3320](https://github.com/NousResearch/hermes-agent/pull/3320))
|
||||
- Catch anthropic ImportError in vision auto-detection tests ([#3312](https://github.com/NousResearch/hermes-agent/pull/3312))
|
||||
- Update retry-exhaust test for new graceful return behavior ([#3320](https://github.com/NousResearch/hermes-agent/pull/3320))
|
||||
- Add regression tests for null metadata frontmatter ([untagged commit](https://github.com/NousResearch/hermes-agent))
|
||||
|
||||
---
|
||||
|
||||
## 📚 Documentation
|
||||
|
||||
- Update all docs for `/model` command overhaul and custom provider support ([#2800](https://github.com/NousResearch/hermes-agent/pull/2800))
|
||||
- Fix stale and incorrect documentation across 18 files ([#2805](https://github.com/NousResearch/hermes-agent/pull/2805))
|
||||
- Document 9 previously undocumented features ([#2814](https://github.com/NousResearch/hermes-agent/pull/2814))
|
||||
- Add missing skills, CLI commands, and messaging env vars to docs ([#2809](https://github.com/NousResearch/hermes-agent/pull/2809))
|
||||
- Fix api-server response storage documentation — SQLite, not in-memory ([#2819](https://github.com/NousResearch/hermes-agent/pull/2819))
|
||||
- Quote pip install extras to fix zsh glob errors ([#2815](https://github.com/NousResearch/hermes-agent/pull/2815))
|
||||
- Unify hooks documentation — add plugin hooks to hooks page, add `session:end` event ([untagged commit](https://github.com/NousResearch/hermes-agent))
|
||||
- Clarify two-mode behavior in `session_search` schema description ([untagged commit](https://github.com/NousResearch/hermes-agent))
|
||||
- Fix Discord Public Bot setting for Discord-provided invite link ([#3519](https://github.com/NousResearch/hermes-agent/pull/3519)) by @mehmoodosman
|
||||
- Revise v0.4.0 changelog — fix feature attribution, reorder sections ([untagged commit](https://github.com/NousResearch/hermes-agent))
|
||||
|
||||
---
|
||||
|
||||
## 👥 Contributors
|
||||
|
||||
### Core
|
||||
- **@teknium1** — 157 PRs covering the full scope of this release
|
||||
|
||||
### Community Contributors
|
||||
- **@alt-glitch** (Siddharth Balyan) — 2 PRs: Nix flake with uv2nix build, NixOS module, and persistent container mode ([#20](https://github.com/NousResearch/hermes-agent/pull/20)); auto-generated config keys and suffix PATHs for Nix builds ([#3061](https://github.com/NousResearch/hermes-agent/pull/3061), [#3274](https://github.com/NousResearch/hermes-agent/pull/3274))
|
||||
- **@ctlst** — 1 PR: Prevent AsyncOpenAI/httpx cross-loop deadlock in gateway mode ([#2701](https://github.com/NousResearch/hermes-agent/pull/2701))
|
||||
- **@memosr** (memosr.eth) — 1 PR: Add request timeouts to `send_message_tool` HTTP calls ([#3162](https://github.com/NousResearch/hermes-agent/pull/3162))
|
||||
- **@mehmoodosman** (Osman Mehmood) — 1 PR: Fix Discord docs for Public Bot setting ([#3519](https://github.com/NousResearch/hermes-agent/pull/3519))
|
||||
|
||||
### All Contributors
|
||||
@alt-glitch, @ctlst, @mehmoodosman, @memosr, @teknium1
|
||||
|
||||
---
|
||||
|
||||
**Full Changelog**: [v2026.3.23...v2026.3.28](https://github.com/NousResearch/hermes-agent/compare/v2026.3.23...v2026.3.28)
|
||||
@@ -35,54 +35,6 @@ ADAPTIVE_EFFORT_MAP = {
|
||||
"minimal": "low",
|
||||
}
|
||||
|
||||
# ── Max output token limits per Anthropic model ───────────────────────
|
||||
# Source: Anthropic docs + Cline model catalog. Anthropic's API requires
|
||||
# max_tokens as a mandatory field. Previously we hardcoded 16384, which
|
||||
# starves thinking-enabled models (thinking tokens count toward the limit).
|
||||
_ANTHROPIC_OUTPUT_LIMITS = {
|
||||
# Claude 4.6
|
||||
"claude-opus-4-6": 128_000,
|
||||
"claude-sonnet-4-6": 64_000,
|
||||
# Claude 4.5
|
||||
"claude-opus-4-5": 64_000,
|
||||
"claude-sonnet-4-5": 64_000,
|
||||
"claude-haiku-4-5": 64_000,
|
||||
# Claude 4
|
||||
"claude-opus-4": 32_000,
|
||||
"claude-sonnet-4": 64_000,
|
||||
# Claude 3.7
|
||||
"claude-3-7-sonnet": 128_000,
|
||||
# Claude 3.5
|
||||
"claude-3-5-sonnet": 8_192,
|
||||
"claude-3-5-haiku": 8_192,
|
||||
# Claude 3
|
||||
"claude-3-opus": 4_096,
|
||||
"claude-3-sonnet": 4_096,
|
||||
"claude-3-haiku": 4_096,
|
||||
}
|
||||
|
||||
# For any model not in the table, assume the highest current limit.
|
||||
# Future Anthropic models are unlikely to have *less* output capacity.
|
||||
_ANTHROPIC_DEFAULT_OUTPUT_LIMIT = 128_000
|
||||
|
||||
|
||||
def _get_anthropic_max_output(model: str) -> int:
|
||||
"""Look up the max output token limit for an Anthropic model.
|
||||
|
||||
Uses substring matching against _ANTHROPIC_OUTPUT_LIMITS so date-stamped
|
||||
model IDs (claude-sonnet-4-5-20250929) and variant suffixes (:1m, :fast)
|
||||
resolve correctly. Longest-prefix match wins to avoid e.g. "claude-3-5"
|
||||
matching before "claude-3-5-sonnet".
|
||||
"""
|
||||
m = model.lower()
|
||||
best_key = ""
|
||||
best_val = _ANTHROPIC_DEFAULT_OUTPUT_LIMIT
|
||||
for key, val in _ANTHROPIC_OUTPUT_LIMITS.items():
|
||||
if key in m and len(key) > len(best_key):
|
||||
best_key = key
|
||||
best_val = val
|
||||
return best_val
|
||||
|
||||
|
||||
def _supports_adaptive_thinking(model: str) -> bool:
|
||||
"""Return True for Claude 4.6 models that support adaptive thinking."""
|
||||
@@ -107,7 +59,6 @@ _OAUTH_ONLY_BETAS = [
|
||||
# The version must stay reasonably current — Anthropic rejects OAuth requests
|
||||
# when the spoofed user-agent version is too far behind the actual release.
|
||||
_CLAUDE_CODE_VERSION_FALLBACK = "2.1.74"
|
||||
_claude_code_version_cache: Optional[str] = None
|
||||
|
||||
|
||||
def _detect_claude_code_version() -> str:
|
||||
@@ -135,18 +86,11 @@ def _detect_claude_code_version() -> str:
|
||||
return _CLAUDE_CODE_VERSION_FALLBACK
|
||||
|
||||
|
||||
_CLAUDE_CODE_VERSION = _detect_claude_code_version()
|
||||
_CLAUDE_CODE_SYSTEM_PREFIX = "You are Claude Code, Anthropic's official CLI for Claude."
|
||||
_MCP_TOOL_PREFIX = "mcp_"
|
||||
|
||||
|
||||
def _get_claude_code_version() -> str:
|
||||
"""Lazily detect the installed Claude Code version when OAuth headers need it."""
|
||||
global _claude_code_version_cache
|
||||
if _claude_code_version_cache is None:
|
||||
_claude_code_version_cache = _detect_claude_code_version()
|
||||
return _claude_code_version_cache
|
||||
|
||||
|
||||
def _is_oauth_token(key: str) -> bool:
|
||||
"""Check if the key is an OAuth/setup token (not a regular Console API key).
|
||||
|
||||
@@ -188,7 +132,7 @@ def build_anthropic_client(api_key: str, base_url: str = None):
|
||||
kwargs["auth_token"] = api_key
|
||||
kwargs["default_headers"] = {
|
||||
"anthropic-beta": ",".join(all_betas),
|
||||
"user-agent": f"claude-cli/{_get_claude_code_version()} (external, cli)",
|
||||
"user-agent": f"claude-cli/{_CLAUDE_CODE_VERSION} (external, cli)",
|
||||
"x-app": "cli",
|
||||
}
|
||||
else:
|
||||
@@ -297,7 +241,7 @@ def _refresh_oauth_token(creds: Dict[str, Any]) -> Optional[str]:
|
||||
|
||||
headers = {
|
||||
"Content-Type": "application/json",
|
||||
"User-Agent": f"claude-cli/{_get_claude_code_version()} (external, cli)",
|
||||
"User-Agent": f"claude-cli/{_CLAUDE_CODE_VERSION} (external, cli)",
|
||||
}
|
||||
|
||||
for endpoint in token_endpoints:
|
||||
@@ -762,21 +706,14 @@ def convert_messages_to_anthropic(
|
||||
result.append({"role": "user", "content": [tool_result]})
|
||||
continue
|
||||
|
||||
# Regular user message — validate non-empty content (Anthropic rejects empty)
|
||||
# Regular user message
|
||||
if isinstance(content, list):
|
||||
converted_blocks = _convert_content_to_anthropic(content)
|
||||
# Check if all text blocks are empty
|
||||
if not converted_blocks or all(
|
||||
b.get("text", "").strip() == ""
|
||||
for b in converted_blocks
|
||||
if isinstance(b, dict) and b.get("type") == "text"
|
||||
):
|
||||
converted_blocks = [{"type": "text", "text": "(empty message)"}]
|
||||
result.append({"role": "user", "content": converted_blocks})
|
||||
result.append({
|
||||
"role": "user",
|
||||
"content": converted_blocks or [{"type": "text", "text": ""}],
|
||||
})
|
||||
else:
|
||||
# Validate string content is non-empty
|
||||
if not content or (isinstance(content, str) and not content.strip()):
|
||||
content = "(empty message)"
|
||||
result.append({"role": "user", "content": content})
|
||||
|
||||
# Strip orphaned tool_use blocks (no matching tool_result follows)
|
||||
@@ -866,15 +803,9 @@ def build_anthropic_kwargs(
|
||||
tool_choice: Optional[str] = None,
|
||||
is_oauth: bool = False,
|
||||
preserve_dots: bool = False,
|
||||
context_length: Optional[int] = None,
|
||||
) -> Dict[str, Any]:
|
||||
"""Build kwargs for anthropic.messages.create().
|
||||
|
||||
When *max_tokens* is None, the model's native output limit is used
|
||||
(e.g. 128K for Opus 4.6, 64K for Sonnet 4.6). If *context_length*
|
||||
is provided, the effective limit is clamped so it doesn't exceed
|
||||
the context window.
|
||||
|
||||
When *is_oauth* is True, applies Claude Code compatibility transforms:
|
||||
system prompt prefix, tool name prefixing, and prompt sanitization.
|
||||
|
||||
@@ -885,12 +816,7 @@ def build_anthropic_kwargs(
|
||||
anthropic_tools = convert_tools_to_anthropic(tools) if tools else []
|
||||
|
||||
model = normalize_model_name(model, preserve_dots=preserve_dots)
|
||||
effective_max_tokens = max_tokens or _get_anthropic_max_output(model)
|
||||
|
||||
# Clamp to context window if the user set a lower context_length
|
||||
# (e.g. custom endpoint with limited capacity).
|
||||
if context_length and effective_max_tokens > context_length:
|
||||
effective_max_tokens = max(context_length - 1, 1)
|
||||
effective_max_tokens = max_tokens or 16384
|
||||
|
||||
# ── OAuth: Claude Code identity ──────────────────────────────────
|
||||
if is_oauth:
|
||||
|
||||
+7
-154
@@ -693,13 +693,7 @@ def _try_anthropic() -> Tuple[Optional[Any], Optional[str]]:
|
||||
is_oauth = _is_oauth_token(token)
|
||||
model = _API_KEY_PROVIDER_AUX_MODELS.get("anthropic", "claude-haiku-4-5-20251001")
|
||||
logger.debug("Auxiliary client: Anthropic native (%s) at %s (oauth=%s)", model, base_url, is_oauth)
|
||||
try:
|
||||
real_client = build_anthropic_client(token, base_url)
|
||||
except ImportError:
|
||||
# The anthropic_adapter module imports fine but the SDK itself is
|
||||
# missing — build_anthropic_client raises ImportError at call time
|
||||
# when _anthropic_sdk is None. Treat as unavailable.
|
||||
return None, None
|
||||
real_client = build_anthropic_client(token, base_url)
|
||||
return AnthropicAuxiliaryClient(real_client, model, token, base_url, is_oauth=is_oauth), model
|
||||
|
||||
|
||||
@@ -1137,13 +1131,7 @@ def resolve_vision_provider_client(
|
||||
return "custom", client, final_model
|
||||
|
||||
if requested == "auto":
|
||||
ordered = list(_VISION_AUTO_PROVIDER_ORDER)
|
||||
preferred = _preferred_main_vision_provider()
|
||||
if preferred in ordered:
|
||||
ordered.remove(preferred)
|
||||
ordered.insert(0, preferred)
|
||||
|
||||
for candidate in ordered:
|
||||
for candidate in get_available_vision_backends():
|
||||
sync_client, default_model = _resolve_strict_vision_backend(candidate)
|
||||
if sync_client is not None:
|
||||
return _finalize(candidate, sync_client, default_model)
|
||||
@@ -1216,39 +1204,6 @@ _client_cache: Dict[tuple, tuple] = {}
|
||||
_client_cache_lock = threading.Lock()
|
||||
|
||||
|
||||
def neuter_async_httpx_del() -> None:
|
||||
"""Monkey-patch ``AsyncHttpxClientWrapper.__del__`` to be a no-op.
|
||||
|
||||
The OpenAI SDK's ``AsyncHttpxClientWrapper.__del__`` schedules
|
||||
``self.aclose()`` via ``asyncio.get_running_loop().create_task()``.
|
||||
When an ``AsyncOpenAI`` client is garbage-collected while
|
||||
prompt_toolkit's event loop is running (the common CLI idle state),
|
||||
the ``aclose()`` task runs on prompt_toolkit's loop but the
|
||||
underlying TCP transport is bound to a *different* loop (the worker
|
||||
thread's loop that the client was originally created on). If that
|
||||
loop is closed or its thread is dead, the transport's
|
||||
``self._loop.call_soon()`` raises ``RuntimeError("Event loop is
|
||||
closed")``, which prompt_toolkit surfaces as "Unhandled exception
|
||||
in event loop ... Press ENTER to continue...".
|
||||
|
||||
Neutering ``__del__`` is safe because:
|
||||
- Cached clients are explicitly cleaned via ``_force_close_async_httpx``
|
||||
on stale-loop detection and ``shutdown_cached_clients`` on exit.
|
||||
- Uncached clients' TCP connections are cleaned up by the OS when the
|
||||
process exits.
|
||||
- The OpenAI SDK itself marks this as a TODO (``# TODO(someday):
|
||||
support non asyncio runtimes here``).
|
||||
|
||||
Call this once at CLI startup, before any ``AsyncOpenAI`` clients are
|
||||
created.
|
||||
"""
|
||||
try:
|
||||
from openai._base_client import AsyncHttpxClientWrapper
|
||||
AsyncHttpxClientWrapper.__del__ = lambda self: None # type: ignore[assignment]
|
||||
except (ImportError, AttributeError):
|
||||
pass # Graceful degradation if the SDK changes its internals
|
||||
|
||||
|
||||
def _force_close_async_httpx(client: Any) -> None:
|
||||
"""Mark the httpx AsyncClient inside an AsyncOpenAI client as closed.
|
||||
|
||||
@@ -1296,25 +1251,6 @@ def shutdown_cached_clients() -> None:
|
||||
_client_cache.clear()
|
||||
|
||||
|
||||
def cleanup_stale_async_clients() -> None:
|
||||
"""Force-close cached async clients whose event loop is closed.
|
||||
|
||||
Call this after each agent turn to proactively clean up stale clients
|
||||
before GC can trigger ``AsyncHttpxClientWrapper.__del__`` on them.
|
||||
This is defense-in-depth — the primary fix is ``neuter_async_httpx_del``
|
||||
which disables ``__del__`` entirely.
|
||||
"""
|
||||
with _client_cache_lock:
|
||||
stale_keys = []
|
||||
for key, entry in _client_cache.items():
|
||||
client, _default, cached_loop = entry
|
||||
if cached_loop is not None and cached_loop.is_closed():
|
||||
_force_close_async_httpx(client)
|
||||
stale_keys.append(key)
|
||||
for key in stale_keys:
|
||||
del _client_cache[key]
|
||||
|
||||
|
||||
def _get_cached_client(
|
||||
provider: str,
|
||||
model: str = None,
|
||||
@@ -1458,29 +1394,6 @@ def _resolve_task_provider_model(
|
||||
return "auto", resolved_model, None, None
|
||||
|
||||
|
||||
_DEFAULT_AUX_TIMEOUT = 30.0
|
||||
|
||||
|
||||
def _get_task_timeout(task: str, default: float = _DEFAULT_AUX_TIMEOUT) -> float:
|
||||
"""Read timeout from auxiliary.{task}.timeout in config, falling back to *default*."""
|
||||
if not task:
|
||||
return default
|
||||
try:
|
||||
from hermes_cli.config import load_config
|
||||
config = load_config()
|
||||
except ImportError:
|
||||
return default
|
||||
aux = config.get("auxiliary", {}) if isinstance(config, dict) else {}
|
||||
task_config = aux.get(task, {}) if isinstance(aux, dict) else {}
|
||||
raw = task_config.get("timeout")
|
||||
if raw is not None:
|
||||
try:
|
||||
return float(raw)
|
||||
except (ValueError, TypeError):
|
||||
pass
|
||||
return default
|
||||
|
||||
|
||||
def _build_call_kwargs(
|
||||
provider: str,
|
||||
model: str,
|
||||
@@ -1538,7 +1451,7 @@ def call_llm(
|
||||
temperature: float = None,
|
||||
max_tokens: int = None,
|
||||
tools: list = None,
|
||||
timeout: float = None,
|
||||
timeout: float = 30.0,
|
||||
extra_body: dict = None,
|
||||
) -> Any:
|
||||
"""Centralized synchronous LLM call.
|
||||
@@ -1556,7 +1469,7 @@ def call_llm(
|
||||
temperature: Sampling temperature (None = provider default).
|
||||
max_tokens: Max output tokens (handles max_tokens vs max_completion_tokens).
|
||||
tools: Tool definitions (for function calling).
|
||||
timeout: Request timeout in seconds (None = read from auxiliary.{task}.timeout config).
|
||||
timeout: Request timeout in seconds.
|
||||
extra_body: Additional request body fields.
|
||||
|
||||
Returns:
|
||||
@@ -1621,12 +1534,10 @@ def call_llm(
|
||||
f"No LLM provider configured for task={task} provider={resolved_provider}. "
|
||||
f"Run: hermes setup")
|
||||
|
||||
effective_timeout = timeout if timeout is not None else _get_task_timeout(task)
|
||||
|
||||
kwargs = _build_call_kwargs(
|
||||
resolved_provider, final_model, messages,
|
||||
temperature=temperature, max_tokens=max_tokens,
|
||||
tools=tools, timeout=effective_timeout, extra_body=extra_body,
|
||||
tools=tools, timeout=timeout, extra_body=extra_body,
|
||||
base_url=resolved_base_url)
|
||||
|
||||
# Handle max_tokens vs max_completion_tokens retry
|
||||
@@ -1641,62 +1552,6 @@ def call_llm(
|
||||
raise
|
||||
|
||||
|
||||
def extract_content_or_reasoning(response) -> str:
|
||||
"""Extract content from an LLM response, falling back to reasoning fields.
|
||||
|
||||
Mirrors the main agent loop's behavior when a reasoning model (DeepSeek-R1,
|
||||
Qwen-QwQ, etc.) returns ``content=None`` with reasoning in structured fields.
|
||||
|
||||
Resolution order:
|
||||
1. ``message.content`` — strip inline think/reasoning blocks, check for
|
||||
remaining non-whitespace text.
|
||||
2. ``message.reasoning`` / ``message.reasoning_content`` — direct
|
||||
structured reasoning fields (DeepSeek, Moonshot, Novita, etc.).
|
||||
3. ``message.reasoning_details`` — OpenRouter unified array format.
|
||||
|
||||
Returns the best available text, or ``""`` if nothing found.
|
||||
"""
|
||||
import re
|
||||
|
||||
msg = response.choices[0].message
|
||||
content = (msg.content or "").strip()
|
||||
|
||||
if content:
|
||||
# Strip inline think/reasoning blocks (mirrors _strip_think_blocks)
|
||||
cleaned = re.sub(
|
||||
r"<(?:think|thinking|reasoning|REASONING_SCRATCHPAD)>"
|
||||
r".*?"
|
||||
r"</(?:think|thinking|reasoning|REASONING_SCRATCHPAD)>",
|
||||
"", content, flags=re.DOTALL | re.IGNORECASE,
|
||||
).strip()
|
||||
if cleaned:
|
||||
return cleaned
|
||||
|
||||
# Content is empty or reasoning-only — try structured reasoning fields
|
||||
reasoning_parts: list[str] = []
|
||||
for field in ("reasoning", "reasoning_content"):
|
||||
val = getattr(msg, field, None)
|
||||
if val and isinstance(val, str) and val.strip() and val not in reasoning_parts:
|
||||
reasoning_parts.append(val.strip())
|
||||
|
||||
details = getattr(msg, "reasoning_details", None)
|
||||
if details and isinstance(details, list):
|
||||
for detail in details:
|
||||
if isinstance(detail, dict):
|
||||
summary = (
|
||||
detail.get("summary")
|
||||
or detail.get("content")
|
||||
or detail.get("text")
|
||||
)
|
||||
if summary and summary not in reasoning_parts:
|
||||
reasoning_parts.append(summary.strip() if isinstance(summary, str) else str(summary))
|
||||
|
||||
if reasoning_parts:
|
||||
return "\n\n".join(reasoning_parts)
|
||||
|
||||
return ""
|
||||
|
||||
|
||||
async def async_call_llm(
|
||||
task: str = None,
|
||||
*,
|
||||
@@ -1708,7 +1563,7 @@ async def async_call_llm(
|
||||
temperature: float = None,
|
||||
max_tokens: int = None,
|
||||
tools: list = None,
|
||||
timeout: float = None,
|
||||
timeout: float = 30.0,
|
||||
extra_body: dict = None,
|
||||
) -> Any:
|
||||
"""Centralized asynchronous LLM call.
|
||||
@@ -1769,12 +1624,10 @@ async def async_call_llm(
|
||||
f"No LLM provider configured for task={task} provider={resolved_provider}. "
|
||||
f"Run: hermes setup")
|
||||
|
||||
effective_timeout = timeout if timeout is not None else _get_task_timeout(task)
|
||||
|
||||
kwargs = _build_call_kwargs(
|
||||
resolved_provider, final_model, messages,
|
||||
temperature=temperature, max_tokens=max_tokens,
|
||||
tools=tools, timeout=effective_timeout, extra_body=extra_body,
|
||||
tools=tools, timeout=timeout, extra_body=extra_body,
|
||||
base_url=resolved_base_url)
|
||||
|
||||
try:
|
||||
|
||||
@@ -141,7 +141,7 @@ class ContextCompressor:
|
||||
"last_prompt_tokens": self.last_prompt_tokens,
|
||||
"threshold_tokens": self.threshold_tokens,
|
||||
"context_length": self.context_length,
|
||||
"usage_percent": min(100, (self.last_prompt_tokens / self.context_length * 100)) if self.context_length else 0,
|
||||
"usage_percent": (self.last_prompt_tokens / self.context_length * 100) if self.context_length else 0,
|
||||
"compression_count": self.compression_count,
|
||||
}
|
||||
|
||||
@@ -347,7 +347,7 @@ Write only the summary body. Do not include any preamble or prefix."""
|
||||
"messages": [{"role": "user", "content": prompt}],
|
||||
"temperature": 0.3,
|
||||
"max_tokens": summary_budget * 2,
|
||||
# timeout resolved from auxiliary.compression.timeout config by call_llm
|
||||
"timeout": 45.0,
|
||||
}
|
||||
if self.summary_model:
|
||||
call_kwargs["model"] = self.summary_model
|
||||
|
||||
@@ -286,16 +286,12 @@ def _expand_git_reference(
|
||||
args: list[str],
|
||||
label: str,
|
||||
) -> tuple[str | None, str | None]:
|
||||
try:
|
||||
result = subprocess.run(
|
||||
["git", *args],
|
||||
cwd=cwd,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
timeout=30,
|
||||
)
|
||||
except subprocess.TimeoutExpired:
|
||||
return f"{ref.raw}: git command timed out (30s)", None
|
||||
result = subprocess.run(
|
||||
["git", *args],
|
||||
cwd=cwd,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
)
|
||||
if result.returncode != 0:
|
||||
stderr = (result.stderr or "").strip() or "git command failed"
|
||||
return f"{ref.raw}: {stderr}", None
|
||||
@@ -453,12 +449,9 @@ def _rg_files(path: Path, cwd: Path, limit: int) -> list[Path] | None:
|
||||
cwd=cwd,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
timeout=10,
|
||||
)
|
||||
except FileNotFoundError:
|
||||
return None
|
||||
except subprocess.TimeoutExpired:
|
||||
return None
|
||||
if result.returncode != 0:
|
||||
return None
|
||||
files = [Path(line.strip()) for line in result.stdout.splitlines() if line.strip()]
|
||||
|
||||
+7
-7
@@ -284,11 +284,11 @@ class KawaiiSpinner:
|
||||
The CLI already drives a TUI widget (_spinner_text) for spinner display,
|
||||
so KawaiiSpinner's \\r-based animation is redundant under StdoutProxy.
|
||||
"""
|
||||
try:
|
||||
from prompt_toolkit.patch_stdout import StdoutProxy
|
||||
return isinstance(self._out, StdoutProxy)
|
||||
except ImportError:
|
||||
return False
|
||||
out = self._out
|
||||
# StdoutProxy has a 'raw' attribute (bool) that plain file objects lack.
|
||||
if hasattr(out, 'raw') and type(out).__name__ == 'StdoutProxy':
|
||||
return True
|
||||
return False
|
||||
|
||||
def _animate(self):
|
||||
# When stdout is not a real terminal (e.g. Docker, systemd, pipe),
|
||||
@@ -699,7 +699,7 @@ def format_context_pressure(
|
||||
threshold_percent: Compaction threshold as a fraction of context window.
|
||||
compression_enabled: Whether auto-compression is active.
|
||||
"""
|
||||
pct_int = min(int(compaction_progress * 100), 100)
|
||||
pct_int = int(compaction_progress * 100)
|
||||
filled = min(int(compaction_progress * _BAR_WIDTH), _BAR_WIDTH)
|
||||
bar = _BAR_FILLED * filled + _BAR_EMPTY * (_BAR_WIDTH - filled)
|
||||
|
||||
@@ -729,7 +729,7 @@ def format_context_pressure_gateway(
|
||||
No ANSI — just Unicode and plain text suitable for Telegram/Discord/etc.
|
||||
The percentage shows progress toward the compaction threshold.
|
||||
"""
|
||||
pct_int = min(int(compaction_progress * 100), 100)
|
||||
pct_int = int(compaction_progress * 100)
|
||||
filled = min(int(compaction_progress * _BAR_WIDTH), _BAR_WIDTH)
|
||||
bar = _BAR_FILLED * filled + _BAR_EMPTY * (_BAR_WIDTH - filled)
|
||||
|
||||
|
||||
@@ -113,15 +113,6 @@ DEFAULT_CONTEXT_LENGTHS = {
|
||||
"glm": 202752,
|
||||
# Kimi
|
||||
"kimi": 262144,
|
||||
# Hugging Face Inference Providers — model IDs use org/name format
|
||||
"Qwen/Qwen3.5-397B-A17B": 131072,
|
||||
"Qwen/Qwen3.5-35B-A3B": 131072,
|
||||
"deepseek-ai/DeepSeek-V3.2": 65536,
|
||||
"moonshotai/Kimi-K2.5": 262144,
|
||||
"moonshotai/Kimi-K2-Thinking": 262144,
|
||||
"MiniMaxAI/MiniMax-M2.5": 204800,
|
||||
"XiaomiMiMo/MiMo-V2-Flash": 32768,
|
||||
"zai-org/GLM-5": 202752,
|
||||
}
|
||||
|
||||
_CONTEXT_LENGTH_KEYS = (
|
||||
|
||||
+4
-4
@@ -15,8 +15,6 @@ import time
|
||||
from pathlib import Path
|
||||
from typing import Any, Dict, Optional
|
||||
|
||||
from utils import atomic_json_write
|
||||
|
||||
import requests
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
@@ -66,10 +64,12 @@ def _load_disk_cache() -> Dict[str, Any]:
|
||||
|
||||
|
||||
def _save_disk_cache(data: Dict[str, Any]) -> None:
|
||||
"""Save models.dev data to disk cache atomically."""
|
||||
"""Save models.dev data to disk cache."""
|
||||
try:
|
||||
cache_path = _get_cache_path()
|
||||
atomic_json_write(cache_path, data, indent=None, separators=(",", ":"))
|
||||
cache_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
with open(cache_path, "w", encoding="utf-8") as f:
|
||||
json.dump(data, f, separators=(",", ":"))
|
||||
except Exception as e:
|
||||
logger.debug("Failed to save models.dev disk cache: %s", e)
|
||||
|
||||
|
||||
+106
-271
@@ -4,27 +4,14 @@ All functions are stateless. AIAgent._build_system_prompt() calls these to
|
||||
assemble pieces, then combines them with memory and ephemeral prompts.
|
||||
"""
|
||||
|
||||
import json
|
||||
import logging
|
||||
import os
|
||||
import re
|
||||
import threading
|
||||
from collections import OrderedDict
|
||||
from pathlib import Path
|
||||
|
||||
from hermes_constants import get_hermes_home
|
||||
from typing import Optional
|
||||
|
||||
from agent.skill_utils import (
|
||||
extract_skill_conditions,
|
||||
extract_skill_description,
|
||||
get_disabled_skill_names,
|
||||
iter_skill_index_files,
|
||||
parse_frontmatter,
|
||||
skill_matches_platform,
|
||||
)
|
||||
from utils import atomic_json_write
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
@@ -169,25 +156,6 @@ SKILLS_GUIDANCE = (
|
||||
"Skills that aren't maintained become liabilities."
|
||||
)
|
||||
|
||||
TOOL_USE_ENFORCEMENT_GUIDANCE = (
|
||||
"# Tool-use enforcement\n"
|
||||
"You MUST use your tools to take action — do not describe what you would do "
|
||||
"or plan to do without actually doing it. When you say you will perform an "
|
||||
"action (e.g. 'I will run the tests', 'Let me check the file', 'I will create "
|
||||
"the project'), you MUST immediately make the corresponding tool call in the same "
|
||||
"response. Never end your turn with a promise of future action — execute it now.\n"
|
||||
"Keep working until the task is actually complete. Do not stop with a summary of "
|
||||
"what you plan to do next time. If you have tools available that can accomplish "
|
||||
"the task, use them instead of telling the user what you would do.\n"
|
||||
"Every response should either (a) contain tool calls that make progress, or "
|
||||
"(b) deliver a final result to the user. Responses that only describe intentions "
|
||||
"without acting are not acceptable."
|
||||
)
|
||||
|
||||
# Model name substrings that trigger tool-use enforcement guidance.
|
||||
# Add new patterns here when a model family needs explicit steering.
|
||||
TOOL_USE_ENFORCEMENT_MODELS = ("gpt", "codex")
|
||||
|
||||
PLATFORM_HINTS = {
|
||||
"whatsapp": (
|
||||
"You are on a text messaging communication platform, WhatsApp. "
|
||||
@@ -262,111 +230,6 @@ CONTEXT_TRUNCATE_HEAD_RATIO = 0.7
|
||||
CONTEXT_TRUNCATE_TAIL_RATIO = 0.2
|
||||
|
||||
|
||||
# =========================================================================
|
||||
# Skills prompt cache
|
||||
# =========================================================================
|
||||
|
||||
_SKILLS_PROMPT_CACHE_MAX = 8
|
||||
_SKILLS_PROMPT_CACHE: OrderedDict[tuple, str] = OrderedDict()
|
||||
_SKILLS_PROMPT_CACHE_LOCK = threading.Lock()
|
||||
_SKILLS_SNAPSHOT_VERSION = 1
|
||||
|
||||
|
||||
def _skills_prompt_snapshot_path() -> Path:
|
||||
return get_hermes_home() / ".skills_prompt_snapshot.json"
|
||||
|
||||
|
||||
def clear_skills_system_prompt_cache(*, clear_snapshot: bool = False) -> None:
|
||||
"""Drop the in-process skills prompt cache (and optionally the disk snapshot)."""
|
||||
with _SKILLS_PROMPT_CACHE_LOCK:
|
||||
_SKILLS_PROMPT_CACHE.clear()
|
||||
if clear_snapshot:
|
||||
try:
|
||||
_skills_prompt_snapshot_path().unlink(missing_ok=True)
|
||||
except OSError as e:
|
||||
logger.debug("Could not remove skills prompt snapshot: %s", e)
|
||||
|
||||
|
||||
def _build_skills_manifest(skills_dir: Path) -> dict[str, list[int]]:
|
||||
"""Build an mtime/size manifest of all SKILL.md and DESCRIPTION.md files."""
|
||||
manifest: dict[str, list[int]] = {}
|
||||
for filename in ("SKILL.md", "DESCRIPTION.md"):
|
||||
for path in iter_skill_index_files(skills_dir, filename):
|
||||
try:
|
||||
st = path.stat()
|
||||
except OSError:
|
||||
continue
|
||||
manifest[str(path.relative_to(skills_dir))] = [st.st_mtime_ns, st.st_size]
|
||||
return manifest
|
||||
|
||||
|
||||
def _load_skills_snapshot(skills_dir: Path) -> Optional[dict]:
|
||||
"""Load the disk snapshot if it exists and its manifest still matches."""
|
||||
snapshot_path = _skills_prompt_snapshot_path()
|
||||
if not snapshot_path.exists():
|
||||
return None
|
||||
try:
|
||||
snapshot = json.loads(snapshot_path.read_text(encoding="utf-8"))
|
||||
except Exception:
|
||||
return None
|
||||
if not isinstance(snapshot, dict):
|
||||
return None
|
||||
if snapshot.get("version") != _SKILLS_SNAPSHOT_VERSION:
|
||||
return None
|
||||
if snapshot.get("manifest") != _build_skills_manifest(skills_dir):
|
||||
return None
|
||||
return snapshot
|
||||
|
||||
|
||||
def _write_skills_snapshot(
|
||||
skills_dir: Path,
|
||||
manifest: dict[str, list[int]],
|
||||
skill_entries: list[dict],
|
||||
category_descriptions: dict[str, str],
|
||||
) -> None:
|
||||
"""Persist skill metadata to disk for fast cold-start reuse."""
|
||||
payload = {
|
||||
"version": _SKILLS_SNAPSHOT_VERSION,
|
||||
"manifest": manifest,
|
||||
"skills": skill_entries,
|
||||
"category_descriptions": category_descriptions,
|
||||
}
|
||||
try:
|
||||
atomic_json_write(_skills_prompt_snapshot_path(), payload)
|
||||
except Exception as e:
|
||||
logger.debug("Could not write skills prompt snapshot: %s", e)
|
||||
|
||||
|
||||
def _build_snapshot_entry(
|
||||
skill_file: Path,
|
||||
skills_dir: Path,
|
||||
frontmatter: dict,
|
||||
description: str,
|
||||
) -> dict:
|
||||
"""Build a serialisable metadata dict for one skill."""
|
||||
rel_path = skill_file.relative_to(skills_dir)
|
||||
parts = rel_path.parts
|
||||
if len(parts) >= 2:
|
||||
skill_name = parts[-2]
|
||||
category = "/".join(parts[:-2]) if len(parts) > 2 else parts[0]
|
||||
else:
|
||||
category = "general"
|
||||
skill_name = skill_file.parent.name
|
||||
|
||||
platforms = frontmatter.get("platforms") or []
|
||||
if isinstance(platforms, str):
|
||||
platforms = [platforms]
|
||||
|
||||
return {
|
||||
"skill_name": skill_name,
|
||||
"category": category,
|
||||
"frontmatter_name": str(frontmatter.get("name", skill_name)),
|
||||
"description": description,
|
||||
"platforms": [str(p).strip() for p in platforms if str(p).strip()],
|
||||
"conditions": extract_skill_conditions(frontmatter),
|
||||
}
|
||||
|
||||
|
||||
# =========================================================================
|
||||
# Skills index
|
||||
# =========================================================================
|
||||
@@ -378,13 +241,22 @@ def _parse_skill_file(skill_file: Path) -> tuple[bool, dict, str]:
|
||||
(True, {}, "") to err on the side of showing the skill.
|
||||
"""
|
||||
try:
|
||||
from tools.skills_tool import _parse_frontmatter, skill_matches_platform
|
||||
|
||||
raw = skill_file.read_text(encoding="utf-8")[:2000]
|
||||
frontmatter, _ = parse_frontmatter(raw)
|
||||
frontmatter, _ = _parse_frontmatter(raw)
|
||||
|
||||
if not skill_matches_platform(frontmatter):
|
||||
return False, frontmatter, ""
|
||||
return False, {}, ""
|
||||
|
||||
return True, frontmatter, extract_skill_description(frontmatter)
|
||||
desc = ""
|
||||
raw_desc = frontmatter.get("description", "")
|
||||
if raw_desc:
|
||||
desc = str(raw_desc).strip().strip("'\"")
|
||||
if len(desc) > 60:
|
||||
desc = desc[:57] + "..."
|
||||
|
||||
return True, frontmatter, desc
|
||||
except Exception as e:
|
||||
logger.debug("Failed to parse skill file %s: %s", skill_file, e)
|
||||
return True, {}, ""
|
||||
@@ -393,9 +265,16 @@ def _parse_skill_file(skill_file: Path) -> tuple[bool, dict, str]:
|
||||
def _read_skill_conditions(skill_file: Path) -> dict:
|
||||
"""Extract conditional activation fields from SKILL.md frontmatter."""
|
||||
try:
|
||||
from tools.skills_tool import _parse_frontmatter
|
||||
raw = skill_file.read_text(encoding="utf-8")[:2000]
|
||||
frontmatter, _ = parse_frontmatter(raw)
|
||||
return extract_skill_conditions(frontmatter)
|
||||
frontmatter, _ = _parse_frontmatter(raw)
|
||||
hermes = frontmatter.get("metadata", {}).get("hermes", {})
|
||||
return {
|
||||
"fallback_for_toolsets": hermes.get("fallback_for_toolsets", []),
|
||||
"requires_toolsets": hermes.get("requires_toolsets", []),
|
||||
"fallback_for_tools": hermes.get("fallback_for_tools", []),
|
||||
"requires_tools": hermes.get("requires_tools", []),
|
||||
}
|
||||
except Exception as e:
|
||||
logger.debug("Failed to read skill conditions from %s: %s", skill_file, e)
|
||||
return {}
|
||||
@@ -438,12 +317,10 @@ def build_skills_system_prompt(
|
||||
) -> str:
|
||||
"""Build a compact skill index for the system prompt.
|
||||
|
||||
Two-layer cache:
|
||||
1. In-process LRU dict keyed by (skills_dir, tools, toolsets)
|
||||
2. Disk snapshot (``.skills_prompt_snapshot.json``) validated by
|
||||
mtime/size manifest — survives process restarts
|
||||
|
||||
Falls back to a full filesystem scan when both layers miss.
|
||||
Scans ~/.hermes/skills/ for SKILL.md files grouped by category.
|
||||
Includes per-skill descriptions from frontmatter so the model can
|
||||
match skills by meaning, not just name.
|
||||
Filters out skills incompatible with the current OS platform.
|
||||
"""
|
||||
hermes_home = get_hermes_home()
|
||||
skills_dir = hermes_home / "skills"
|
||||
@@ -451,140 +328,98 @@ def build_skills_system_prompt(
|
||||
if not skills_dir.exists():
|
||||
return ""
|
||||
|
||||
# ── Layer 1: in-process LRU cache ─────────────────────────────────
|
||||
cache_key = (
|
||||
str(skills_dir.resolve()),
|
||||
tuple(sorted(str(t) for t in (available_tools or set()))),
|
||||
tuple(sorted(str(ts) for ts in (available_toolsets or set()))),
|
||||
)
|
||||
with _SKILLS_PROMPT_CACHE_LOCK:
|
||||
cached = _SKILLS_PROMPT_CACHE.get(cache_key)
|
||||
if cached is not None:
|
||||
_SKILLS_PROMPT_CACHE.move_to_end(cache_key)
|
||||
return cached
|
||||
|
||||
disabled = get_disabled_skill_names()
|
||||
|
||||
# ── Layer 2: disk snapshot ────────────────────────────────────────
|
||||
snapshot = _load_skills_snapshot(skills_dir)
|
||||
# Collect skills with descriptions, grouped by category.
|
||||
# Each entry: (skill_name, description)
|
||||
# Supports sub-categories: skills/mlops/training/axolotl/SKILL.md
|
||||
# -> category "mlops/training", skill "axolotl"
|
||||
# Load disabled skill names once for the entire scan
|
||||
try:
|
||||
from tools.skills_tool import _get_disabled_skill_names
|
||||
disabled = _get_disabled_skill_names()
|
||||
except Exception:
|
||||
disabled = set()
|
||||
|
||||
skills_by_category: dict[str, list[tuple[str, str]]] = {}
|
||||
category_descriptions: dict[str, str] = {}
|
||||
|
||||
if snapshot is not None:
|
||||
# Fast path: use pre-parsed metadata from disk
|
||||
for entry in snapshot.get("skills", []):
|
||||
if not isinstance(entry, dict):
|
||||
continue
|
||||
skill_name = entry.get("skill_name") or ""
|
||||
category = entry.get("category") or "general"
|
||||
frontmatter_name = entry.get("frontmatter_name") or skill_name
|
||||
platforms = entry.get("platforms") or []
|
||||
if not skill_matches_platform({"platforms": platforms}):
|
||||
continue
|
||||
if frontmatter_name in disabled or skill_name in disabled:
|
||||
continue
|
||||
if not _skill_should_show(
|
||||
entry.get("conditions") or {},
|
||||
available_tools,
|
||||
available_toolsets,
|
||||
):
|
||||
continue
|
||||
skills_by_category.setdefault(category, []).append(
|
||||
(skill_name, entry.get("description", ""))
|
||||
)
|
||||
category_descriptions = {
|
||||
str(k): str(v)
|
||||
for k, v in (snapshot.get("category_descriptions") or {}).items()
|
||||
for skill_file in skills_dir.rglob("SKILL.md"):
|
||||
is_compatible, frontmatter, desc = _parse_skill_file(skill_file)
|
||||
if not is_compatible:
|
||||
continue
|
||||
rel_path = skill_file.relative_to(skills_dir)
|
||||
parts = rel_path.parts
|
||||
if len(parts) >= 2:
|
||||
skill_name = parts[-2]
|
||||
category = "/".join(parts[:-2]) if len(parts) > 2 else parts[0]
|
||||
else:
|
||||
category = "general"
|
||||
skill_name = skill_file.parent.name
|
||||
# Respect user's disabled skills config
|
||||
fm_name = frontmatter.get("name", skill_name)
|
||||
if fm_name in disabled or skill_name in disabled:
|
||||
continue
|
||||
# Extract conditions inline from already-parsed frontmatter
|
||||
# (avoids redundant file re-read that _read_skill_conditions would do)
|
||||
hermes_meta = (frontmatter.get("metadata") or {}).get("hermes") or {}
|
||||
conditions = {
|
||||
"fallback_for_toolsets": hermes_meta.get("fallback_for_toolsets", []),
|
||||
"requires_toolsets": hermes_meta.get("requires_toolsets", []),
|
||||
"fallback_for_tools": hermes_meta.get("fallback_for_tools", []),
|
||||
"requires_tools": hermes_meta.get("requires_tools", []),
|
||||
}
|
||||
else:
|
||||
# Cold path: full filesystem scan + write snapshot for next time
|
||||
skill_entries: list[dict] = []
|
||||
for skill_file in iter_skill_index_files(skills_dir, "SKILL.md"):
|
||||
is_compatible, frontmatter, desc = _parse_skill_file(skill_file)
|
||||
entry = _build_snapshot_entry(skill_file, skills_dir, frontmatter, desc)
|
||||
skill_entries.append(entry)
|
||||
if not is_compatible:
|
||||
continue
|
||||
skill_name = entry["skill_name"]
|
||||
if entry["frontmatter_name"] in disabled or skill_name in disabled:
|
||||
continue
|
||||
if not _skill_should_show(
|
||||
extract_skill_conditions(frontmatter),
|
||||
available_tools,
|
||||
available_toolsets,
|
||||
):
|
||||
continue
|
||||
skills_by_category.setdefault(entry["category"], []).append(
|
||||
(skill_name, entry["description"])
|
||||
)
|
||||
if not _skill_should_show(conditions, available_tools, available_toolsets):
|
||||
continue
|
||||
skills_by_category.setdefault(category, []).append((skill_name, desc))
|
||||
|
||||
# Read category-level DESCRIPTION.md files
|
||||
for desc_file in iter_skill_index_files(skills_dir, "DESCRIPTION.md"):
|
||||
if not skills_by_category:
|
||||
return ""
|
||||
|
||||
# Read category-level descriptions from DESCRIPTION.md
|
||||
# Checks both the exact category path and parent directories
|
||||
category_descriptions = {}
|
||||
for category in skills_by_category:
|
||||
cat_path = Path(category)
|
||||
desc_file = skills_dir / cat_path / "DESCRIPTION.md"
|
||||
if desc_file.exists():
|
||||
try:
|
||||
content = desc_file.read_text(encoding="utf-8")
|
||||
fm, _ = parse_frontmatter(content)
|
||||
cat_desc = fm.get("description")
|
||||
if not cat_desc:
|
||||
continue
|
||||
rel = desc_file.relative_to(skills_dir)
|
||||
cat = "/".join(rel.parts[:-1]) if len(rel.parts) > 1 else "general"
|
||||
category_descriptions[cat] = str(cat_desc).strip().strip("'\"")
|
||||
match = re.search(r"^---\s*\n.*?description:\s*(.+?)\s*\n.*?^---", content, re.MULTILINE | re.DOTALL)
|
||||
if match:
|
||||
category_descriptions[category] = match.group(1).strip()
|
||||
except Exception as e:
|
||||
logger.debug("Could not read skill description %s: %s", desc_file, e)
|
||||
|
||||
_write_skills_snapshot(
|
||||
skills_dir,
|
||||
_build_skills_manifest(skills_dir),
|
||||
skill_entries,
|
||||
category_descriptions,
|
||||
)
|
||||
|
||||
if not skills_by_category:
|
||||
result = ""
|
||||
else:
|
||||
index_lines = []
|
||||
for category in sorted(skills_by_category.keys()):
|
||||
cat_desc = category_descriptions.get(category, "")
|
||||
if cat_desc:
|
||||
index_lines.append(f" {category}: {cat_desc}")
|
||||
index_lines = []
|
||||
for category in sorted(skills_by_category.keys()):
|
||||
cat_desc = category_descriptions.get(category, "")
|
||||
if cat_desc:
|
||||
index_lines.append(f" {category}: {cat_desc}")
|
||||
else:
|
||||
index_lines.append(f" {category}:")
|
||||
# Deduplicate and sort skills within each category
|
||||
seen = set()
|
||||
for name, desc in sorted(skills_by_category[category], key=lambda x: x[0]):
|
||||
if name in seen:
|
||||
continue
|
||||
seen.add(name)
|
||||
if desc:
|
||||
index_lines.append(f" - {name}: {desc}")
|
||||
else:
|
||||
index_lines.append(f" {category}:")
|
||||
# Deduplicate and sort skills within each category
|
||||
seen = set()
|
||||
for name, desc in sorted(skills_by_category[category], key=lambda x: x[0]):
|
||||
if name in seen:
|
||||
continue
|
||||
seen.add(name)
|
||||
if desc:
|
||||
index_lines.append(f" - {name}: {desc}")
|
||||
else:
|
||||
index_lines.append(f" - {name}")
|
||||
index_lines.append(f" - {name}")
|
||||
|
||||
result = (
|
||||
"## Skills (mandatory)\n"
|
||||
"Before replying, scan the skills below. If one clearly matches your task, "
|
||||
"load it with skill_view(name) and follow its instructions. "
|
||||
"If a skill has issues, fix it with skill_manage(action='patch').\n"
|
||||
"After difficult/iterative tasks, offer to save as a skill. "
|
||||
"If a skill you loaded was missing steps, had wrong commands, or needed "
|
||||
"pitfalls you discovered, update it before finishing.\n"
|
||||
"\n"
|
||||
"<available_skills>\n"
|
||||
+ "\n".join(index_lines) + "\n"
|
||||
"</available_skills>\n"
|
||||
"\n"
|
||||
"If none match, proceed normally without loading a skill."
|
||||
)
|
||||
|
||||
# ── Store in LRU cache ────────────────────────────────────────────
|
||||
with _SKILLS_PROMPT_CACHE_LOCK:
|
||||
_SKILLS_PROMPT_CACHE[cache_key] = result
|
||||
_SKILLS_PROMPT_CACHE.move_to_end(cache_key)
|
||||
while len(_SKILLS_PROMPT_CACHE) > _SKILLS_PROMPT_CACHE_MAX:
|
||||
_SKILLS_PROMPT_CACHE.popitem(last=False)
|
||||
|
||||
return result
|
||||
return (
|
||||
"## Skills (mandatory)\n"
|
||||
"Before replying, scan the skills below. If one clearly matches your task, "
|
||||
"load it with skill_view(name) and follow its instructions. "
|
||||
"If a skill has issues, fix it with skill_manage(action='patch').\n"
|
||||
"After difficult/iterative tasks, offer to save as a skill. "
|
||||
"If a skill you loaded was missing steps, had wrong commands, or needed "
|
||||
"pitfalls you discovered, update it before finishing.\n"
|
||||
"\n"
|
||||
"<available_skills>\n"
|
||||
+ "\n".join(index_lines) + "\n"
|
||||
"</available_skills>\n"
|
||||
"\n"
|
||||
"If none match, proceed normally without loading a skill."
|
||||
)
|
||||
|
||||
|
||||
# =========================================================================
|
||||
|
||||
@@ -1,203 +0,0 @@
|
||||
"""Lightweight skill metadata utilities shared by prompt_builder and skills_tool.
|
||||
|
||||
This module intentionally avoids importing the tool registry, CLI config, or any
|
||||
heavy dependency chain. It is safe to import at module level without triggering
|
||||
tool registration or provider resolution.
|
||||
"""
|
||||
|
||||
import logging
|
||||
import os
|
||||
import re
|
||||
import sys
|
||||
from pathlib import Path
|
||||
from typing import Any, Dict, List, Optional, Set, Tuple
|
||||
|
||||
from hermes_constants import get_hermes_home
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# ── Platform mapping ──────────────────────────────────────────────────────
|
||||
|
||||
PLATFORM_MAP = {
|
||||
"macos": "darwin",
|
||||
"linux": "linux",
|
||||
"windows": "win32",
|
||||
}
|
||||
|
||||
EXCLUDED_SKILL_DIRS = frozenset((".git", ".github", ".hub"))
|
||||
|
||||
# ── Lazy YAML loader ─────────────────────────────────────────────────────
|
||||
|
||||
_yaml_load_fn = None
|
||||
|
||||
|
||||
def yaml_load(content: str):
|
||||
"""Parse YAML with lazy import and CSafeLoader preference."""
|
||||
global _yaml_load_fn
|
||||
if _yaml_load_fn is None:
|
||||
import yaml
|
||||
|
||||
loader = getattr(yaml, "CSafeLoader", None) or yaml.SafeLoader
|
||||
|
||||
def _load(value: str):
|
||||
return yaml.load(value, Loader=loader)
|
||||
|
||||
_yaml_load_fn = _load
|
||||
return _yaml_load_fn(content)
|
||||
|
||||
|
||||
# ── Frontmatter parsing ──────────────────────────────────────────────────
|
||||
|
||||
|
||||
def parse_frontmatter(content: str) -> Tuple[Dict[str, Any], str]:
|
||||
"""Parse YAML frontmatter from a markdown string.
|
||||
|
||||
Uses yaml with CSafeLoader for full YAML support (nested metadata, lists)
|
||||
with a fallback to simple key:value splitting for robustness.
|
||||
|
||||
Returns:
|
||||
(frontmatter_dict, remaining_body)
|
||||
"""
|
||||
frontmatter: Dict[str, Any] = {}
|
||||
body = content
|
||||
|
||||
if not content.startswith("---"):
|
||||
return frontmatter, body
|
||||
|
||||
end_match = re.search(r"\n---\s*\n", content[3:])
|
||||
if not end_match:
|
||||
return frontmatter, body
|
||||
|
||||
yaml_content = content[3 : end_match.start() + 3]
|
||||
body = content[end_match.end() + 3 :]
|
||||
|
||||
try:
|
||||
parsed = yaml_load(yaml_content)
|
||||
if isinstance(parsed, dict):
|
||||
frontmatter = parsed
|
||||
except Exception:
|
||||
# Fallback: simple key:value parsing for malformed YAML
|
||||
for line in yaml_content.strip().split("\n"):
|
||||
if ":" not in line:
|
||||
continue
|
||||
key, value = line.split(":", 1)
|
||||
frontmatter[key.strip()] = value.strip()
|
||||
|
||||
return frontmatter, body
|
||||
|
||||
|
||||
# ── Platform matching ─────────────────────────────────────────────────────
|
||||
|
||||
|
||||
def skill_matches_platform(frontmatter: Dict[str, Any]) -> bool:
|
||||
"""Return True when the skill is compatible with the current OS.
|
||||
|
||||
Skills declare platform requirements via a top-level ``platforms`` list
|
||||
in their YAML frontmatter::
|
||||
|
||||
platforms: [macos] # macOS only
|
||||
platforms: [macos, linux] # macOS and Linux
|
||||
|
||||
If the field is absent or empty the skill is compatible with **all**
|
||||
platforms (backward-compatible default).
|
||||
"""
|
||||
platforms = frontmatter.get("platforms")
|
||||
if not platforms:
|
||||
return True
|
||||
if not isinstance(platforms, list):
|
||||
platforms = [platforms]
|
||||
current = sys.platform
|
||||
for platform in platforms:
|
||||
normalized = str(platform).lower().strip()
|
||||
mapped = PLATFORM_MAP.get(normalized, normalized)
|
||||
if current.startswith(mapped):
|
||||
return True
|
||||
return False
|
||||
|
||||
|
||||
# ── Disabled skills ───────────────────────────────────────────────────────
|
||||
|
||||
|
||||
def get_disabled_skill_names() -> Set[str]:
|
||||
"""Read disabled skill names from config.yaml.
|
||||
|
||||
Resolves platform from ``HERMES_PLATFORM`` env var, falls back to
|
||||
the global disabled list. Reads the config file directly (no CLI
|
||||
config imports) to stay lightweight.
|
||||
"""
|
||||
config_path = get_hermes_home() / "config.yaml"
|
||||
if not config_path.exists():
|
||||
return set()
|
||||
try:
|
||||
parsed = yaml_load(config_path.read_text(encoding="utf-8"))
|
||||
except Exception as e:
|
||||
logger.debug("Could not read skill config %s: %s", config_path, e)
|
||||
return set()
|
||||
if not isinstance(parsed, dict):
|
||||
return set()
|
||||
|
||||
skills_cfg = parsed.get("skills")
|
||||
if not isinstance(skills_cfg, dict):
|
||||
return set()
|
||||
|
||||
resolved_platform = os.getenv("HERMES_PLATFORM")
|
||||
if resolved_platform:
|
||||
platform_disabled = (skills_cfg.get("platform_disabled") or {}).get(
|
||||
resolved_platform
|
||||
)
|
||||
if platform_disabled is not None:
|
||||
return _normalize_string_set(platform_disabled)
|
||||
return _normalize_string_set(skills_cfg.get("disabled"))
|
||||
|
||||
|
||||
def _normalize_string_set(values) -> Set[str]:
|
||||
if values is None:
|
||||
return set()
|
||||
if isinstance(values, str):
|
||||
values = [values]
|
||||
return {str(v).strip() for v in values if str(v).strip()}
|
||||
|
||||
|
||||
# ── Condition extraction ──────────────────────────────────────────────────
|
||||
|
||||
|
||||
def extract_skill_conditions(frontmatter: Dict[str, Any]) -> Dict[str, List]:
|
||||
"""Extract conditional activation fields from parsed frontmatter."""
|
||||
hermes = (frontmatter.get("metadata") or {}).get("hermes") or {}
|
||||
return {
|
||||
"fallback_for_toolsets": hermes.get("fallback_for_toolsets", []),
|
||||
"requires_toolsets": hermes.get("requires_toolsets", []),
|
||||
"fallback_for_tools": hermes.get("fallback_for_tools", []),
|
||||
"requires_tools": hermes.get("requires_tools", []),
|
||||
}
|
||||
|
||||
|
||||
# ── Description extraction ────────────────────────────────────────────────
|
||||
|
||||
|
||||
def extract_skill_description(frontmatter: Dict[str, Any]) -> str:
|
||||
"""Extract a truncated description from parsed frontmatter."""
|
||||
raw_desc = frontmatter.get("description", "")
|
||||
if not raw_desc:
|
||||
return ""
|
||||
desc = str(raw_desc).strip().strip("'\"")
|
||||
if len(desc) > 60:
|
||||
return desc[:57] + "..."
|
||||
return desc
|
||||
|
||||
|
||||
# ── File iteration ────────────────────────────────────────────────────────
|
||||
|
||||
|
||||
def iter_skill_index_files(skills_dir: Path, filename: str):
|
||||
"""Walk skills_dir yielding sorted paths matching *filename*.
|
||||
|
||||
Excludes ``.git``, ``.github``, ``.hub`` directories.
|
||||
"""
|
||||
matches = []
|
||||
for root, dirs, files in os.walk(skills_dir):
|
||||
dirs[:] = [d for d in dirs if d not in EXCLUDED_SKILL_DIRS]
|
||||
if filename in files:
|
||||
matches.append(Path(root) / filename)
|
||||
for path in sorted(matches, key=lambda p: str(p.relative_to(skills_dir))):
|
||||
yield path
|
||||
@@ -19,7 +19,7 @@ _TITLE_PROMPT = (
|
||||
)
|
||||
|
||||
|
||||
def generate_title(user_message: str, assistant_response: str, timeout: float = 30.0) -> Optional[str]:
|
||||
def generate_title(user_message: str, assistant_response: str, timeout: float = 15.0) -> Optional[str]:
|
||||
"""Generate a session title from the first exchange.
|
||||
|
||||
Uses the auxiliary LLM client (cheapest/fastest available model).
|
||||
|
||||
@@ -7,7 +7,6 @@
|
||||
# =============================================================================
|
||||
model:
|
||||
# Default model to use (can be overridden with --model flag)
|
||||
# Both "default" and "model" work as the key name here.
|
||||
default: "anthropic/claude-opus-4.6"
|
||||
|
||||
# Inference provider selection:
|
||||
|
||||
@@ -449,17 +449,6 @@ try:
|
||||
except Exception:
|
||||
pass # Skin engine is optional — default skin used if unavailable
|
||||
|
||||
# Neuter AsyncHttpxClientWrapper.__del__ before any AsyncOpenAI clients are
|
||||
# created. The SDK's __del__ schedules aclose() on asyncio.get_running_loop()
|
||||
# which, during CLI idle time, finds prompt_toolkit's event loop and tries to
|
||||
# close TCP transports bound to dead worker loops — producing
|
||||
# "Event loop is closed" / "Press ENTER to continue..." errors.
|
||||
try:
|
||||
from agent.auxiliary_client import neuter_async_httpx_del
|
||||
neuter_async_httpx_del()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
from rich import box as rich_box
|
||||
from rich.console import Console
|
||||
from rich.markup import escape as _escape
|
||||
@@ -1078,12 +1067,12 @@ class HermesCLI:
|
||||
# authoritative. This avoids conflicts in multi-agent setups where
|
||||
# env vars would stomp each other.
|
||||
_model_config = CLI_CONFIG.get("model", {})
|
||||
_config_model = (_model_config.get("default") or _model_config.get("model") or "") if isinstance(_model_config, dict) else (_model_config or "")
|
||||
_config_model = _model_config.get("default", "") if isinstance(_model_config, dict) else (_model_config or "")
|
||||
_FALLBACK_MODEL = "anthropic/claude-opus-4.6"
|
||||
self.model = model or _config_model or _FALLBACK_MODEL
|
||||
# Auto-detect model from local server if still on fallback
|
||||
if self.model == _FALLBACK_MODEL:
|
||||
_base_url = (_model_config.get("base_url") or "") if isinstance(_model_config, dict) else ""
|
||||
_base_url = _model_config.get("base_url", "") if isinstance(_model_config, dict) else ""
|
||||
if "localhost" in _base_url or "127.0.0.1" in _base_url:
|
||||
from hermes_cli.runtime_provider import _auto_detect_local_model
|
||||
_detected = _auto_detect_local_model(_base_url)
|
||||
@@ -1625,7 +1614,6 @@ class HermesCLI:
|
||||
if not text:
|
||||
return
|
||||
self._reasoning_stream_started = True
|
||||
self._reasoning_shown_this_turn = True
|
||||
if getattr(self, "_stream_box_opened", False):
|
||||
return
|
||||
|
||||
@@ -2961,82 +2949,6 @@ class HermesCLI:
|
||||
if not silent:
|
||||
print("(^_^)v New session started!")
|
||||
|
||||
def _handle_resume_command(self, cmd_original: str) -> None:
|
||||
"""Handle /resume <session_id_or_title> — switch to a previous session mid-conversation."""
|
||||
parts = cmd_original.split(None, 1)
|
||||
target = parts[1].strip() if len(parts) > 1 else ""
|
||||
|
||||
if not target:
|
||||
_cprint(" Usage: /resume <session_id_or_title>")
|
||||
_cprint(" Tip: Use /history or `hermes sessions list` to find sessions.")
|
||||
return
|
||||
|
||||
if not self._session_db:
|
||||
_cprint(" Session database not available.")
|
||||
return
|
||||
|
||||
# Resolve title or ID
|
||||
from hermes_cli.main import _resolve_session_by_name_or_id
|
||||
resolved = _resolve_session_by_name_or_id(target)
|
||||
target_id = resolved or target
|
||||
|
||||
session_meta = self._session_db.get_session(target_id)
|
||||
if not session_meta:
|
||||
_cprint(f" Session not found: {target}")
|
||||
_cprint(" Use /history or `hermes sessions list` to see available sessions.")
|
||||
return
|
||||
|
||||
if target_id == self.session_id:
|
||||
_cprint(" Already on that session.")
|
||||
return
|
||||
|
||||
# End current session
|
||||
try:
|
||||
self._session_db.end_session(self.session_id, "resumed_other")
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Switch to the target session
|
||||
self.session_id = target_id
|
||||
self._resumed = True
|
||||
self._pending_title = None
|
||||
|
||||
# Load conversation history
|
||||
restored = self._session_db.get_messages_as_conversation(target_id)
|
||||
self.conversation_history = restored or []
|
||||
|
||||
# Re-open the target session so it's not marked as ended
|
||||
try:
|
||||
self._session_db.reopen_session(target_id)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Sync the agent if already initialised
|
||||
if self.agent:
|
||||
self.agent.session_id = target_id
|
||||
self.agent.reset_session_state()
|
||||
if hasattr(self.agent, "_last_flushed_db_idx"):
|
||||
self.agent._last_flushed_db_idx = len(self.conversation_history)
|
||||
if hasattr(self.agent, "_todo_store"):
|
||||
try:
|
||||
from tools.todo_tool import TodoStore
|
||||
self.agent._todo_store = TodoStore()
|
||||
except Exception:
|
||||
pass
|
||||
if hasattr(self.agent, "_invalidate_system_prompt"):
|
||||
self.agent._invalidate_system_prompt()
|
||||
|
||||
title_part = f" \"{session_meta['title']}\"" if session_meta.get("title") else ""
|
||||
msg_count = len([m for m in self.conversation_history if m.get("role") == "user"])
|
||||
if self.conversation_history:
|
||||
_cprint(
|
||||
f" ↻ Resumed session {target_id}{title_part}"
|
||||
f" ({msg_count} user message{'s' if msg_count != 1 else ''},"
|
||||
f" {len(self.conversation_history)} total)"
|
||||
)
|
||||
else:
|
||||
_cprint(f" ↻ Resumed session {target_id}{title_part} — no messages, starting fresh.")
|
||||
|
||||
def reset_conversation(self):
|
||||
"""Reset the conversation by starting a new session."""
|
||||
self.new_session()
|
||||
@@ -3755,8 +3667,6 @@ class HermesCLI:
|
||||
_cprint(" Session database not available.")
|
||||
elif canonical == "new":
|
||||
self.new_session()
|
||||
elif canonical == "resume":
|
||||
self._handle_resume_command(cmd_original)
|
||||
elif canonical == "provider":
|
||||
self._show_model_and_providers()
|
||||
elif canonical == "prompt":
|
||||
@@ -4034,17 +3944,6 @@ class HermesCLI:
|
||||
provider_data_collection=self._provider_data_collection,
|
||||
fallback_model=self._fallback_model,
|
||||
)
|
||||
# Silence raw spinner; route thinking through TUI widget when no foreground agent is active.
|
||||
bg_agent._print_fn = lambda *_a, **_kw: None
|
||||
|
||||
def _bg_thinking(text: str) -> None:
|
||||
# Concurrent bg tasks may race on _spinner_text; acceptable for best-effort UI.
|
||||
if not self._agent_running:
|
||||
self._spinner_text = text
|
||||
if self._app:
|
||||
self._app.invalidate()
|
||||
|
||||
bg_agent.thinking_callback = _bg_thinking
|
||||
|
||||
result = bg_agent.run_conversation(
|
||||
user_message=prompt,
|
||||
@@ -4107,9 +4006,6 @@ class HermesCLI:
|
||||
_cprint(f" ❌ Background task #{task_num} failed: {e}")
|
||||
finally:
|
||||
self._background_tasks.pop(task_id, None)
|
||||
# Clear spinner only if no foreground agent owns it
|
||||
if not self._agent_running:
|
||||
self._spinner_text = ""
|
||||
if self._app:
|
||||
self._invalidate(min_interval=0)
|
||||
|
||||
@@ -4520,7 +4416,7 @@ class HermesCLI:
|
||||
compressor = agent.context_compressor
|
||||
last_prompt = compressor.last_prompt_tokens
|
||||
ctx_len = compressor.context_length
|
||||
pct = min(100, (last_prompt / ctx_len * 100)) if ctx_len else 0
|
||||
pct = (last_prompt / ctx_len * 100) if ctx_len else 0
|
||||
compressions = compressor.compression_count
|
||||
|
||||
msg_count = len(self.conversation_history)
|
||||
@@ -5548,13 +5444,6 @@ class HermesCLI:
|
||||
except Exception as e:
|
||||
logging.debug("@ context reference expansion failed: %s", e)
|
||||
|
||||
# Sanitize surrogate characters that can arrive via clipboard paste from
|
||||
# rich-text editors (Google Docs, Word, etc.). Lone surrogates are invalid
|
||||
# UTF-8 and crash JSON serialization in the OpenAI SDK.
|
||||
if isinstance(message, str):
|
||||
from run_agent import _sanitize_surrogates
|
||||
message = _sanitize_surrogates(message)
|
||||
|
||||
# Add user message to history
|
||||
self.conversation_history.append({"role": "user", "content": message})
|
||||
|
||||
@@ -5567,10 +5456,6 @@ class HermesCLI:
|
||||
|
||||
# Reset streaming display state for this turn
|
||||
self._reset_stream_state()
|
||||
# Separate from _reset_stream_state because this must persist
|
||||
# across intermediate turn boundaries (tool-calling loops) — only
|
||||
# reset at the start of each user turn.
|
||||
self._reasoning_shown_this_turn = False
|
||||
|
||||
# --- Streaming TTS setup ---
|
||||
# When ElevenLabs is the TTS provider and sounddevice is available,
|
||||
@@ -5715,16 +5600,6 @@ class HermesCLI:
|
||||
|
||||
agent_thread.join() # Ensure agent thread completes
|
||||
|
||||
# Proactively clean up async clients whose event loop is dead.
|
||||
# The agent thread may have created AsyncOpenAI clients bound
|
||||
# to a per-thread event loop; if that loop is now closed, those
|
||||
# clients' __del__ would crash prompt_toolkit's loop on GC.
|
||||
try:
|
||||
from agent.auxiliary_client import cleanup_stale_async_clients
|
||||
cleanup_stale_async_clients()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Flush any remaining streamed text and close the box
|
||||
self._flush_stream()
|
||||
|
||||
@@ -5785,13 +5660,8 @@ class HermesCLI:
|
||||
response_previewed = result.get("response_previewed", False) if result else False
|
||||
|
||||
# Display reasoning (thinking) box if enabled and available.
|
||||
# Skip when streaming already showed reasoning live. Use the
|
||||
# turn-persistent flag (_reasoning_shown_this_turn) instead of
|
||||
# _reasoning_stream_started — the latter gets reset during
|
||||
# intermediate turn boundaries (tool-calling loops), which caused
|
||||
# the reasoning box to re-render after the final response.
|
||||
_reasoning_already_shown = getattr(self, '_reasoning_shown_this_turn', False)
|
||||
if self.show_reasoning and result and not _reasoning_already_shown:
|
||||
# Skip when streaming already showed reasoning live.
|
||||
if self.show_reasoning and result and not self._reasoning_stream_started:
|
||||
reasoning = result.get("last_reasoning")
|
||||
if reasoning:
|
||||
w = shutil.get_terminal_size().columns
|
||||
@@ -5912,22 +5782,10 @@ class HermesCLI:
|
||||
else:
|
||||
duration_str = f"{seconds}s"
|
||||
|
||||
# Look up session title for resume-by-name hint
|
||||
session_title = None
|
||||
if self._session_db:
|
||||
try:
|
||||
session_title = self._session_db.get_session_title(self.session_id)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
print("Resume this session with:")
|
||||
print(f" hermes --resume {self.session_id}")
|
||||
if session_title:
|
||||
print(f" hermes -c \"{session_title}\"")
|
||||
print()
|
||||
print(f"Session: {self.session_id}")
|
||||
if session_title:
|
||||
print(f"Title: {session_title}")
|
||||
print(f"Duration: {duration_str}")
|
||||
print(f"Messages: {msg_count} ({user_msgs} user, {tool_calls} tool calls)")
|
||||
else:
|
||||
@@ -6103,7 +5961,7 @@ class HermesCLI:
|
||||
from honcho_integration.client import HonchoClientConfig
|
||||
from agent.display import honcho_session_line, write_tty
|
||||
hcfg = HonchoClientConfig.from_global_config()
|
||||
if hcfg.enabled and (hcfg.api_key or hcfg.base_url) and hcfg.explicitly_configured:
|
||||
if hcfg.enabled and hcfg.api_key and hcfg.explicitly_configured:
|
||||
sname = hcfg.resolve_session_name(session_id=self.session_id)
|
||||
if sname:
|
||||
write_tty(honcho_session_line(hcfg.workspace_id, sname) + "\n")
|
||||
@@ -6190,18 +6048,10 @@ class HermesCLI:
|
||||
set_approval_callback(self._approval_callback)
|
||||
set_secret_capture_callback(self._secret_capture_callback)
|
||||
|
||||
# Ensure tirith security scanner is available (downloads if needed).
|
||||
# Warn the user if tirith is enabled in config but not available,
|
||||
# so they know command security scanning is degraded.
|
||||
# Ensure tirith security scanner is available (downloads if needed)
|
||||
try:
|
||||
from tools.tirith_security import ensure_installed
|
||||
tirith_path = ensure_installed(log_failures=False)
|
||||
if tirith_path is None:
|
||||
security_cfg = self.config.get("security", {}) or {}
|
||||
tirith_enabled = security_cfg.get("tirith_enabled", True)
|
||||
if tirith_enabled:
|
||||
_cprint(f" {_DIM}⚠ tirith security scanner enabled but not available "
|
||||
f"— command scanning will use pattern matching only{_RST}")
|
||||
ensure_installed(log_failures=False)
|
||||
except Exception:
|
||||
pass # Non-fatal — fail-open at scan time if unavailable
|
||||
|
||||
@@ -6677,7 +6527,6 @@ class HermesCLI:
|
||||
# Paste collapsing: detect large pastes and save to temp file
|
||||
_paste_counter = [0]
|
||||
_prev_text_len = [0]
|
||||
_prev_newline_count = [0]
|
||||
_paste_just_collapsed = [False]
|
||||
|
||||
def _on_text_changed(buf):
|
||||
@@ -6686,27 +6535,18 @@ class HermesCLI:
|
||||
When bracketed paste is available, handle_paste collapses
|
||||
large pastes directly. This handler is a fallback for
|
||||
terminals without bracketed paste support.
|
||||
|
||||
Two heuristics (either triggers collapse):
|
||||
1. Many characters added at once (chars_added > 1) — works
|
||||
when the terminal delivers the paste in one event-loop tick.
|
||||
2. Newline count jumped by 4+ in a single text-change event —
|
||||
catches terminals that feed characters individually but
|
||||
still batch newlines. Alt+Enter only adds 1 newline per
|
||||
event so it never triggers this.
|
||||
"""
|
||||
text = buf.text
|
||||
chars_added = len(text) - _prev_text_len[0]
|
||||
_prev_text_len[0] = len(text)
|
||||
if _paste_just_collapsed[0]:
|
||||
_paste_just_collapsed[0] = False
|
||||
_prev_newline_count[0] = text.count('\n')
|
||||
return
|
||||
line_count = text.count('\n')
|
||||
newlines_added = line_count - _prev_newline_count[0]
|
||||
_prev_newline_count[0] = line_count
|
||||
is_paste = chars_added > 1 or newlines_added >= 4
|
||||
if line_count >= 5 and is_paste and not text.startswith('/'):
|
||||
# Heuristic: a real paste adds many characters at once (not just a
|
||||
# single newline from Alt+Enter) AND the result has 5+ lines.
|
||||
# Fallback for terminals without bracketed paste support.
|
||||
if line_count >= 5 and chars_added > 1 and not text.startswith('/'):
|
||||
_paste_counter[0] += 1
|
||||
# Save to temp file
|
||||
paste_dir = _hermes_home / "pastes"
|
||||
@@ -6714,7 +6554,6 @@ class HermesCLI:
|
||||
paste_file = paste_dir / f"paste_{_paste_counter[0]}_{datetime.now().strftime('%H%M%S')}.txt"
|
||||
paste_file.write_text(text, encoding="utf-8")
|
||||
# Replace buffer with compact reference
|
||||
_paste_just_collapsed[0] = True
|
||||
buf.text = f"[Pasted text #{_paste_counter[0]}: {line_count + 1} lines \u2192 {paste_file}]"
|
||||
buf.cursor_position = len(buf.text)
|
||||
|
||||
@@ -7324,28 +7163,9 @@ class HermesCLI:
|
||||
# Register atexit cleanup so resources are freed even on unexpected exit
|
||||
atexit.register(_run_cleanup)
|
||||
|
||||
# Install a custom asyncio exception handler that suppresses the
|
||||
# "Event loop is closed" RuntimeError from httpx transport cleanup.
|
||||
# This is defense-in-depth — the primary fix is neuter_async_httpx_del
|
||||
# which disables __del__ entirely, but older clients or SDK upgrades
|
||||
# could bypass it.
|
||||
def _suppress_closed_loop_errors(loop, context):
|
||||
exc = context.get("exception")
|
||||
if isinstance(exc, RuntimeError) and "Event loop is closed" in str(exc):
|
||||
return # silently suppress
|
||||
# Fall back to default handler for everything else
|
||||
loop.default_exception_handler(context)
|
||||
|
||||
# Run the application with patch_stdout for proper output handling
|
||||
try:
|
||||
with patch_stdout():
|
||||
# Set the custom handler on prompt_toolkit's event loop
|
||||
try:
|
||||
import asyncio as _aio
|
||||
_loop = _aio.get_event_loop()
|
||||
_loop.set_exception_handler(_suppress_closed_loop_errors)
|
||||
except Exception:
|
||||
pass
|
||||
app.run()
|
||||
except (EOFError, KeyboardInterrupt):
|
||||
pass
|
||||
|
||||
+1
-42
@@ -327,20 +327,7 @@ def load_jobs() -> List[Dict[str, Any]]:
|
||||
with open(JOBS_FILE, 'r', encoding='utf-8') as f:
|
||||
data = json.load(f)
|
||||
return data.get("jobs", [])
|
||||
except json.JSONDecodeError:
|
||||
# Retry with strict=False to handle bare control chars in string values
|
||||
try:
|
||||
with open(JOBS_FILE, 'r', encoding='utf-8') as f:
|
||||
data = json.loads(f.read(), strict=False)
|
||||
jobs = data.get("jobs", [])
|
||||
if jobs:
|
||||
# Auto-repair: rewrite with proper escaping
|
||||
save_jobs(jobs)
|
||||
logger.warning("Auto-repaired jobs.json (had invalid control characters)")
|
||||
return jobs
|
||||
except Exception:
|
||||
return []
|
||||
except IOError:
|
||||
except (json.JSONDecodeError, IOError):
|
||||
return []
|
||||
|
||||
|
||||
@@ -611,34 +598,6 @@ def mark_job_run(job_id: str, success: bool, error: Optional[str] = None):
|
||||
save_jobs(jobs)
|
||||
|
||||
|
||||
def advance_next_run(job_id: str) -> bool:
|
||||
"""Preemptively advance next_run_at for a recurring job before execution.
|
||||
|
||||
Call this BEFORE run_job() so that if the process crashes mid-execution,
|
||||
the job won't re-fire on the next gateway restart. This converts the
|
||||
scheduler from at-least-once to at-most-once for recurring jobs — missing
|
||||
one run is far better than firing dozens of times in a crash loop.
|
||||
|
||||
One-shot jobs are left unchanged so they can still retry on restart.
|
||||
|
||||
Returns True if next_run_at was advanced, False otherwise.
|
||||
"""
|
||||
jobs = load_jobs()
|
||||
for job in jobs:
|
||||
if job["id"] == job_id:
|
||||
kind = job.get("schedule", {}).get("kind")
|
||||
if kind not in ("cron", "interval"):
|
||||
return False
|
||||
now = _hermes_now().isoformat()
|
||||
new_next = compute_next_run(job["schedule"], now)
|
||||
if new_next and new_next != job.get("next_run_at"):
|
||||
job["next_run_at"] = new_next
|
||||
save_jobs(jobs)
|
||||
return True
|
||||
return False
|
||||
return False
|
||||
|
||||
|
||||
def get_due_jobs() -> List[Dict[str, Any]]:
|
||||
"""Get all jobs that are due to run now.
|
||||
|
||||
|
||||
+1
-7
@@ -35,7 +35,7 @@ logger = logging.getLogger(__name__)
|
||||
# Add parent directory to path for imports
|
||||
sys.path.insert(0, str(Path(__file__).parent.parent))
|
||||
|
||||
from cron.jobs import get_due_jobs, mark_job_run, save_job_output, advance_next_run
|
||||
from cron.jobs import get_due_jobs, mark_job_run, save_job_output
|
||||
|
||||
# Sentinel: when a cron agent has nothing new to report, it can start its
|
||||
# response with this marker to suppress delivery. Output is still saved
|
||||
@@ -524,12 +524,6 @@ def tick(verbose: bool = True) -> int:
|
||||
executed = 0
|
||||
for job in due_jobs:
|
||||
try:
|
||||
# For recurring jobs (cron/interval), advance next_run_at to the
|
||||
# next future occurrence BEFORE execution. This way, if the
|
||||
# process crashes mid-run, the job won't re-fire on restart.
|
||||
# One-shot jobs are left alone so they can retry on restart.
|
||||
advance_next_run(job["id"])
|
||||
|
||||
success, output, final_response, error = run_job(job)
|
||||
|
||||
output_file = save_job_output(job["id"], output)
|
||||
|
||||
@@ -1,15 +0,0 @@
|
||||
# Hermes Agent Persona
|
||||
|
||||
<!--
|
||||
This file defines the agent's personality and tone.
|
||||
The agent will embody whatever you write here.
|
||||
Edit this to customize how Hermes communicates with you.
|
||||
|
||||
Examples:
|
||||
- "You are a warm, playful assistant who uses kaomoji occasionally."
|
||||
- "You are a concise technical expert. No fluff, just facts."
|
||||
- "You speak like a friendly coworker who happens to know everything."
|
||||
|
||||
This file is loaded fresh each message -- no restart needed.
|
||||
Delete the contents (or this file) to use the default personality.
|
||||
-->
|
||||
@@ -1,31 +0,0 @@
|
||||
#!/bin/bash
|
||||
# Docker entrypoint: bootstrap config files into the mounted volume, then run hermes.
|
||||
set -e
|
||||
|
||||
HERMES_HOME="/opt/data"
|
||||
INSTALL_DIR="/opt/hermes"
|
||||
|
||||
# Create directory structure
|
||||
mkdir -p "$HERMES_HOME"/{cron,sessions,logs,pairing,hooks,image_cache,audio_cache,memories,skills,whatsapp/session}
|
||||
|
||||
# .env
|
||||
if [ ! -f "$HERMES_HOME/.env" ]; then
|
||||
cp "$INSTALL_DIR/.env.example" "$HERMES_HOME/.env"
|
||||
fi
|
||||
|
||||
# config.yaml
|
||||
if [ ! -f "$HERMES_HOME/config.yaml" ]; then
|
||||
cp "$INSTALL_DIR/cli-config.yaml.example" "$HERMES_HOME/config.yaml"
|
||||
fi
|
||||
|
||||
# SOUL.md
|
||||
if [ ! -f "$HERMES_HOME/SOUL.md" ]; then
|
||||
cp "$INSTALL_DIR/docker/SOUL.md" "$HERMES_HOME/SOUL.md"
|
||||
fi
|
||||
|
||||
# Sync bundled skills (manifest-based so user edits are preserved)
|
||||
if [ -d "$INSTALL_DIR/skills" ]; then
|
||||
python3 "$INSTALL_DIR/tools/skills_sync.py"
|
||||
fi
|
||||
|
||||
exec hermes "$@"
|
||||
@@ -1,56 +0,0 @@
|
||||
# Hermes Agent — Docker
|
||||
|
||||
Want to run Hermes Agent, but without installing packages on your host? This'll sort you out.
|
||||
|
||||
This will let you run the agent in a container, with the most relevant modes outlined below.
|
||||
|
||||
The container stores all user data (config, API keys, sessions, skills, memories) in a single directory mounted from the host at `/opt/data`. The image itself is stateless and can be upgraded by pulling a new version without losing any configuration.
|
||||
|
||||
## Quick start
|
||||
|
||||
If this is your first time running Hermes Agent, create a data directory on the host and start the container interactively to run the setup wizard:
|
||||
|
||||
```sh
|
||||
mkdir -p ~/.hermes
|
||||
docker run -it --rm \
|
||||
-v ~/.hermes:/opt/data \
|
||||
nousresearch/hermes-agent
|
||||
```
|
||||
|
||||
This drops you into the setup wizard, which will prompt you for your API keys and write them to `~/.hermes/.env`. You only need to do this once. It is highly recommended to set up a chat system for the gateway to work with at this point.
|
||||
|
||||
## Running in gateway mode
|
||||
|
||||
Once configured, run the container in the background as a persistent gateway (Telegram, Discord, Slack, WhatsApp, etc.):
|
||||
|
||||
```sh
|
||||
docker run -d \
|
||||
--name hermes \
|
||||
--restart unless-stopped \
|
||||
-v ~/.hermes:/opt/data \
|
||||
nousresearch/hermes-agent gateway run
|
||||
```
|
||||
|
||||
## Running interactively (CLI chat)
|
||||
|
||||
To open an interactive chat session against a running data directory:
|
||||
|
||||
```sh
|
||||
docker run -it --rm \
|
||||
-v ~/.hermes:/opt/data \
|
||||
nousresearch/hermes-agent
|
||||
```
|
||||
|
||||
## Upgrading
|
||||
|
||||
Pull the latest image and recreate the container. Your data directory is untouched.
|
||||
|
||||
```sh
|
||||
docker pull nousresearch/hermes-agent:latest
|
||||
docker rm -f hermes
|
||||
docker run -d \
|
||||
--name hermes \
|
||||
--restart unless-stopped \
|
||||
-v ~/.hermes:/opt/data \
|
||||
nousresearch/hermes-agent
|
||||
```
|
||||
+13
-3
@@ -101,11 +101,21 @@ Available methods:
|
||||
|
||||
### Patches (`patches.py`)
|
||||
|
||||
**Problem**: Some hermes-agent tools use `asyncio.run()` internally (e.g., the Modal backend). This crashes when called from inside Atropos's event loop because `asyncio.run()` cannot be nested.
|
||||
**Problem**: Some hermes-agent tools use `asyncio.run()` internally (e.g., the Modal backend via SWE-ReX). This crashes when called from inside Atropos's event loop because `asyncio.run()` cannot be nested.
|
||||
|
||||
**Solution**: `ModalEnvironment` uses a dedicated `_AsyncWorker` background thread with its own event loop. The calling code sees a sync interface, but internally all async Modal SDK calls happen on the worker thread so they don't conflict with Atropos's loop. This is built directly into `tools/environments/modal.py` — no monkey-patching required.
|
||||
**Solution**: `patches.py` monkey-patches `SwerexModalEnvironment` to use a dedicated background thread (`_AsyncWorker`) with its own event loop. The calling code sees the same sync interface, but internally the async work happens on a separate thread that doesn't conflict with Atropos's loop.
|
||||
|
||||
`patches.py` is now a no-op (kept for backward compatibility with imports).
|
||||
What gets patched:
|
||||
- `SwerexModalEnvironment.__init__` -- creates Modal deployment on a background thread
|
||||
- `SwerexModalEnvironment.execute` -- runs commands on the same background thread
|
||||
- `SwerexModalEnvironment.stop` -- stops deployment on the background thread
|
||||
|
||||
The patches are:
|
||||
- **Idempotent** -- calling `apply_patches()` multiple times is safe
|
||||
- **Transparent** -- same interface and behavior, only the internal async execution changes
|
||||
- **Universal** -- works identically in normal CLI use (no running event loop)
|
||||
|
||||
Applied automatically at import time by `hermes_base_env.py`.
|
||||
|
||||
### Tool Call Parsers (`tool_call_parsers/`)
|
||||
|
||||
|
||||
@@ -601,14 +601,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
|
||||
config.platforms[Platform.TELEGRAM] = PlatformConfig()
|
||||
config.platforms[Platform.TELEGRAM].reply_to_mode = telegram_reply_mode
|
||||
|
||||
telegram_fallback_ips = os.getenv("TELEGRAM_FALLBACK_IPS", "")
|
||||
if telegram_fallback_ips:
|
||||
if Platform.TELEGRAM not in config.platforms:
|
||||
config.platforms[Platform.TELEGRAM] = PlatformConfig()
|
||||
config.platforms[Platform.TELEGRAM].extra["fallback_ips"] = [
|
||||
ip.strip() for ip in telegram_fallback_ips.split(",") if ip.strip()
|
||||
]
|
||||
|
||||
telegram_home = os.getenv("TELEGRAM_HOME_CHANNEL")
|
||||
if telegram_home and Platform.TELEGRAM in config.platforms:
|
||||
config.platforms[Platform.TELEGRAM].home_channel = HomeChannel(
|
||||
|
||||
+2
-2
@@ -25,7 +25,7 @@ import time
|
||||
from pathlib import Path
|
||||
from typing import Optional
|
||||
|
||||
from hermes_constants import get_hermes_dir
|
||||
from hermes_cli.config import get_hermes_home
|
||||
|
||||
|
||||
# Unambiguous alphabet -- excludes 0/O, 1/I to prevent confusion
|
||||
@@ -41,7 +41,7 @@ LOCKOUT_SECONDS = 3600 # Lockout duration after too many failures
|
||||
MAX_PENDING_PER_PLATFORM = 3 # Max pending codes per platform
|
||||
MAX_FAILED_ATTEMPTS = 5 # Failed approvals before lockout
|
||||
|
||||
PAIRING_DIR = get_hermes_dir("platforms/pairing", "pairing")
|
||||
PAIRING_DIR = get_hermes_home() / "pairing"
|
||||
|
||||
|
||||
def _secure_write(path: Path, data: str) -> None:
|
||||
|
||||
+67
-126
@@ -166,7 +166,7 @@ class ResponseStore:
|
||||
|
||||
_CORS_HEADERS = {
|
||||
"Access-Control-Allow-Methods": "GET, POST, DELETE, OPTIONS",
|
||||
"Access-Control-Allow-Headers": "Authorization, Content-Type, Idempotency-Key",
|
||||
"Access-Control-Allow-Headers": "Authorization, Content-Type",
|
||||
}
|
||||
|
||||
|
||||
@@ -223,23 +223,6 @@ if AIOHTTP_AVAILABLE:
|
||||
else:
|
||||
body_limit_middleware = None # type: ignore[assignment]
|
||||
|
||||
_SECURITY_HEADERS = {
|
||||
"X-Content-Type-Options": "nosniff",
|
||||
"Referrer-Policy": "no-referrer",
|
||||
}
|
||||
|
||||
|
||||
if AIOHTTP_AVAILABLE:
|
||||
@web.middleware
|
||||
async def security_headers_middleware(request, handler):
|
||||
"""Add security headers to all responses (including errors)."""
|
||||
response = await handler(request)
|
||||
for k, v in _SECURITY_HEADERS.items():
|
||||
response.headers.setdefault(k, v)
|
||||
return response
|
||||
else:
|
||||
security_headers_middleware = None # type: ignore[assignment]
|
||||
|
||||
|
||||
class _IdempotencyCache:
|
||||
"""In-memory idempotency cache with TTL and basic LRU semantics."""
|
||||
@@ -324,7 +307,6 @@ class APIServerAdapter(BasePlatformAdapter):
|
||||
if "*" in self._cors_origins:
|
||||
headers = dict(_CORS_HEADERS)
|
||||
headers["Access-Control-Allow-Origin"] = "*"
|
||||
headers["Access-Control-Max-Age"] = "600"
|
||||
return headers
|
||||
|
||||
if origin not in self._cors_origins:
|
||||
@@ -333,7 +315,6 @@ class APIServerAdapter(BasePlatformAdapter):
|
||||
headers = dict(_CORS_HEADERS)
|
||||
headers["Access-Control-Allow-Origin"] = origin
|
||||
headers["Vary"] = "Origin"
|
||||
headers["Access-Control-Max-Age"] = "600"
|
||||
return headers
|
||||
|
||||
def _origin_allowed(self, origin: str) -> bool:
|
||||
@@ -514,21 +495,17 @@ class APIServerAdapter(BasePlatformAdapter):
|
||||
if delta is not None:
|
||||
_stream_q.put(delta)
|
||||
|
||||
# Start agent in background. agent_ref is a mutable container
|
||||
# so the SSE writer can interrupt the agent on client disconnect.
|
||||
agent_ref = [None]
|
||||
# Start agent in background
|
||||
agent_task = asyncio.ensure_future(self._run_agent(
|
||||
user_message=user_message,
|
||||
conversation_history=history,
|
||||
ephemeral_system_prompt=system_prompt,
|
||||
session_id=session_id,
|
||||
stream_delta_callback=_on_delta,
|
||||
agent_ref=agent_ref,
|
||||
))
|
||||
|
||||
return await self._write_sse_chat_completion(
|
||||
request, completion_id, model_name, created, _stream_q,
|
||||
agent_task, agent_ref,
|
||||
request, completion_id, model_name, created, _stream_q, agent_task
|
||||
)
|
||||
|
||||
# Non-streaming: run the agent (with optional Idempotency-Key)
|
||||
@@ -591,107 +568,80 @@ class APIServerAdapter(BasePlatformAdapter):
|
||||
|
||||
async def _write_sse_chat_completion(
|
||||
self, request: "web.Request", completion_id: str, model: str,
|
||||
created: int, stream_q, agent_task, agent_ref=None,
|
||||
created: int, stream_q, agent_task,
|
||||
) -> "web.StreamResponse":
|
||||
"""Write real streaming SSE from agent's stream_delta_callback queue.
|
||||
|
||||
If the client disconnects mid-stream (network drop, browser tab close),
|
||||
the agent is interrupted via ``agent.interrupt()`` so it stops making
|
||||
LLM API calls, and the asyncio task wrapper is cancelled.
|
||||
"""
|
||||
"""Write real streaming SSE from agent's stream_delta_callback queue."""
|
||||
import queue as _q
|
||||
|
||||
sse_headers = {"Content-Type": "text/event-stream", "Cache-Control": "no-cache"}
|
||||
# CORS middleware can't inject headers into StreamResponse after
|
||||
# prepare() flushes them, so resolve CORS headers up front.
|
||||
origin = request.headers.get("Origin", "")
|
||||
cors = self._cors_headers_for_origin(origin) if origin else None
|
||||
if cors:
|
||||
sse_headers.update(cors)
|
||||
response = web.StreamResponse(status=200, headers=sse_headers)
|
||||
response = web.StreamResponse(
|
||||
status=200,
|
||||
headers={"Content-Type": "text/event-stream", "Cache-Control": "no-cache"},
|
||||
)
|
||||
await response.prepare(request)
|
||||
|
||||
try:
|
||||
# Role chunk
|
||||
role_chunk = {
|
||||
"id": completion_id, "object": "chat.completion.chunk",
|
||||
"created": created, "model": model,
|
||||
"choices": [{"index": 0, "delta": {"role": "assistant"}, "finish_reason": None}],
|
||||
}
|
||||
await response.write(f"data: {json.dumps(role_chunk)}\n\n".encode())
|
||||
# Role chunk
|
||||
role_chunk = {
|
||||
"id": completion_id, "object": "chat.completion.chunk",
|
||||
"created": created, "model": model,
|
||||
"choices": [{"index": 0, "delta": {"role": "assistant"}, "finish_reason": None}],
|
||||
}
|
||||
await response.write(f"data: {json.dumps(role_chunk)}\n\n".encode())
|
||||
|
||||
# Stream content chunks as they arrive from the agent
|
||||
loop = asyncio.get_event_loop()
|
||||
while True:
|
||||
try:
|
||||
delta = await loop.run_in_executor(None, lambda: stream_q.get(timeout=0.5))
|
||||
except _q.Empty:
|
||||
if agent_task.done():
|
||||
# Drain any remaining items
|
||||
while True:
|
||||
try:
|
||||
delta = stream_q.get_nowait()
|
||||
if delta is None:
|
||||
break
|
||||
content_chunk = {
|
||||
"id": completion_id, "object": "chat.completion.chunk",
|
||||
"created": created, "model": model,
|
||||
"choices": [{"index": 0, "delta": {"content": delta}, "finish_reason": None}],
|
||||
}
|
||||
await response.write(f"data: {json.dumps(content_chunk)}\n\n".encode())
|
||||
except _q.Empty:
|
||||
break
|
||||
break
|
||||
continue
|
||||
|
||||
if delta is None: # End of stream sentinel
|
||||
break
|
||||
|
||||
content_chunk = {
|
||||
"id": completion_id, "object": "chat.completion.chunk",
|
||||
"created": created, "model": model,
|
||||
"choices": [{"index": 0, "delta": {"content": delta}, "finish_reason": None}],
|
||||
}
|
||||
await response.write(f"data: {json.dumps(content_chunk)}\n\n".encode())
|
||||
|
||||
# Get usage from completed agent
|
||||
usage = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
|
||||
# Stream content chunks as they arrive from the agent
|
||||
loop = asyncio.get_event_loop()
|
||||
while True:
|
||||
try:
|
||||
result, agent_usage = await agent_task
|
||||
usage = agent_usage or usage
|
||||
except Exception:
|
||||
pass
|
||||
delta = await loop.run_in_executor(None, lambda: stream_q.get(timeout=0.5))
|
||||
except _q.Empty:
|
||||
if agent_task.done():
|
||||
# Drain any remaining items
|
||||
while True:
|
||||
try:
|
||||
delta = stream_q.get_nowait()
|
||||
if delta is None:
|
||||
break
|
||||
content_chunk = {
|
||||
"id": completion_id, "object": "chat.completion.chunk",
|
||||
"created": created, "model": model,
|
||||
"choices": [{"index": 0, "delta": {"content": delta}, "finish_reason": None}],
|
||||
}
|
||||
await response.write(f"data: {json.dumps(content_chunk)}\n\n".encode())
|
||||
except _q.Empty:
|
||||
break
|
||||
break
|
||||
continue
|
||||
|
||||
# Finish chunk
|
||||
finish_chunk = {
|
||||
if delta is None: # End of stream sentinel
|
||||
break
|
||||
|
||||
content_chunk = {
|
||||
"id": completion_id, "object": "chat.completion.chunk",
|
||||
"created": created, "model": model,
|
||||
"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}],
|
||||
"usage": {
|
||||
"prompt_tokens": usage.get("input_tokens", 0),
|
||||
"completion_tokens": usage.get("output_tokens", 0),
|
||||
"total_tokens": usage.get("total_tokens", 0),
|
||||
},
|
||||
"choices": [{"index": 0, "delta": {"content": delta}, "finish_reason": None}],
|
||||
}
|
||||
await response.write(f"data: {json.dumps(finish_chunk)}\n\n".encode())
|
||||
await response.write(b"data: [DONE]\n\n")
|
||||
except (ConnectionResetError, ConnectionAbortedError, BrokenPipeError, OSError):
|
||||
# Client disconnected mid-stream. Interrupt the agent so it
|
||||
# stops making LLM API calls at the next loop iteration, then
|
||||
# cancel the asyncio task wrapper.
|
||||
agent = agent_ref[0] if agent_ref else None
|
||||
if agent is not None:
|
||||
try:
|
||||
agent.interrupt("SSE client disconnected")
|
||||
except Exception:
|
||||
pass
|
||||
if not agent_task.done():
|
||||
agent_task.cancel()
|
||||
try:
|
||||
await agent_task
|
||||
except (asyncio.CancelledError, Exception):
|
||||
pass
|
||||
logger.info("SSE client disconnected; interrupted agent task %s", completion_id)
|
||||
await response.write(f"data: {json.dumps(content_chunk)}\n\n".encode())
|
||||
|
||||
# Get usage from completed agent
|
||||
usage = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
|
||||
try:
|
||||
result, agent_usage = await agent_task
|
||||
usage = agent_usage or usage
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Finish chunk
|
||||
finish_chunk = {
|
||||
"id": completion_id, "object": "chat.completion.chunk",
|
||||
"created": created, "model": model,
|
||||
"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}],
|
||||
"usage": {
|
||||
"prompt_tokens": usage.get("input_tokens", 0),
|
||||
"completion_tokens": usage.get("output_tokens", 0),
|
||||
"total_tokens": usage.get("total_tokens", 0),
|
||||
},
|
||||
}
|
||||
await response.write(f"data: {json.dumps(finish_chunk)}\n\n".encode())
|
||||
await response.write(b"data: [DONE]\n\n")
|
||||
|
||||
return response
|
||||
|
||||
@@ -1194,18 +1144,12 @@ class APIServerAdapter(BasePlatformAdapter):
|
||||
ephemeral_system_prompt: Optional[str] = None,
|
||||
session_id: Optional[str] = None,
|
||||
stream_delta_callback=None,
|
||||
agent_ref: Optional[list] = None,
|
||||
) -> tuple:
|
||||
"""
|
||||
Create an agent and run a conversation in a thread executor.
|
||||
|
||||
Returns ``(result_dict, usage_dict)`` where *usage_dict* contains
|
||||
``input_tokens``, ``output_tokens`` and ``total_tokens``.
|
||||
|
||||
If *agent_ref* is a one-element list, the AIAgent instance is stored
|
||||
at ``agent_ref[0]`` before ``run_conversation`` begins. This allows
|
||||
callers (e.g. the SSE writer) to call ``agent.interrupt()`` from
|
||||
another thread to stop in-progress LLM calls.
|
||||
"""
|
||||
loop = asyncio.get_event_loop()
|
||||
|
||||
@@ -1215,8 +1159,6 @@ class APIServerAdapter(BasePlatformAdapter):
|
||||
session_id=session_id,
|
||||
stream_delta_callback=stream_delta_callback,
|
||||
)
|
||||
if agent_ref is not None:
|
||||
agent_ref[0] = agent
|
||||
result = agent.run_conversation(
|
||||
user_message=user_message,
|
||||
conversation_history=conversation_history,
|
||||
@@ -1241,11 +1183,10 @@ class APIServerAdapter(BasePlatformAdapter):
|
||||
return False
|
||||
|
||||
try:
|
||||
mws = [mw for mw in (cors_middleware, body_limit_middleware, security_headers_middleware) if mw is not None]
|
||||
mws = [mw for mw in (cors_middleware, body_limit_middleware) if mw is not None]
|
||||
self._app = web.Application(middlewares=mws)
|
||||
self._app["api_server_adapter"] = self
|
||||
self._app.router.add_get("/health", self._handle_health)
|
||||
self._app.router.add_get("/v1/health", self._handle_health)
|
||||
self._app.router.add_get("/v1/models", self._handle_models)
|
||||
self._app.router.add_post("/v1/chat/completions", self._handle_chat_completions)
|
||||
self._app.router.add_post("/v1/responses", self._handle_responses)
|
||||
|
||||
+26
-72
@@ -27,7 +27,6 @@ sys.path.insert(0, str(_Path(__file__).resolve().parents[2]))
|
||||
from gateway.config import Platform, PlatformConfig
|
||||
from gateway.session import SessionSource, build_session_key
|
||||
from hermes_cli.config import get_hermes_home
|
||||
from hermes_constants import get_hermes_dir
|
||||
|
||||
|
||||
GATEWAY_SECRET_CAPTURE_UNSUPPORTED_MESSAGE = (
|
||||
@@ -45,8 +44,8 @@ GATEWAY_SECRET_CAPTURE_UNSUPPORTED_MESSAGE = (
|
||||
# (e.g. Telegram file URLs expire after ~1 hour).
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
# Default location: {HERMES_HOME}/cache/images/ (legacy: image_cache/)
|
||||
IMAGE_CACHE_DIR = get_hermes_dir("cache/images", "image_cache")
|
||||
# Default location: {HERMES_HOME}/image_cache/
|
||||
IMAGE_CACHE_DIR = get_hermes_home() / "image_cache"
|
||||
|
||||
|
||||
def get_image_cache_dir() -> Path:
|
||||
@@ -73,51 +72,31 @@ def cache_image_from_bytes(data: bytes, ext: str = ".jpg") -> str:
|
||||
return str(filepath)
|
||||
|
||||
|
||||
async def cache_image_from_url(url: str, ext: str = ".jpg", retries: int = 2) -> str:
|
||||
async def cache_image_from_url(url: str, ext: str = ".jpg") -> str:
|
||||
"""
|
||||
Download an image from a URL and save it to the local cache.
|
||||
|
||||
Retries on transient failures (timeouts, 429, 5xx) with exponential
|
||||
backoff so a single slow CDN response doesn't lose the media.
|
||||
Uses httpx for async download with a reasonable timeout.
|
||||
|
||||
Args:
|
||||
url: The HTTP/HTTPS URL to download from.
|
||||
ext: File extension including the dot (e.g. ".jpg", ".png").
|
||||
retries: Number of retry attempts on transient failures.
|
||||
|
||||
Returns:
|
||||
Absolute path to the cached image file as a string.
|
||||
"""
|
||||
import asyncio
|
||||
import httpx
|
||||
import logging as _logging
|
||||
_log = _logging.getLogger(__name__)
|
||||
|
||||
last_exc = None
|
||||
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
|
||||
for attempt in range(retries + 1):
|
||||
try:
|
||||
response = await client.get(
|
||||
url,
|
||||
headers={
|
||||
"User-Agent": "Mozilla/5.0 (compatible; HermesAgent/1.0)",
|
||||
"Accept": "image/*,*/*;q=0.8",
|
||||
},
|
||||
)
|
||||
response.raise_for_status()
|
||||
return cache_image_from_bytes(response.content, ext)
|
||||
except (httpx.TimeoutException, httpx.HTTPStatusError) as exc:
|
||||
last_exc = exc
|
||||
if isinstance(exc, httpx.HTTPStatusError) and exc.response.status_code < 429:
|
||||
raise
|
||||
if attempt < retries:
|
||||
wait = 1.5 * (attempt + 1)
|
||||
_log.debug("Media cache retry %d/%d for %s (%.1fs): %s",
|
||||
attempt + 1, retries, url[:80], wait, exc)
|
||||
await asyncio.sleep(wait)
|
||||
continue
|
||||
raise
|
||||
raise last_exc
|
||||
response = await client.get(
|
||||
url,
|
||||
headers={
|
||||
"User-Agent": "Mozilla/5.0 (compatible; HermesAgent/1.0)",
|
||||
"Accept": "image/*,*/*;q=0.8",
|
||||
},
|
||||
)
|
||||
response.raise_for_status()
|
||||
return cache_image_from_bytes(response.content, ext)
|
||||
|
||||
|
||||
def cleanup_image_cache(max_age_hours: int = 24) -> int:
|
||||
@@ -148,7 +127,7 @@ def cleanup_image_cache(max_age_hours: int = 24) -> int:
|
||||
# here so the STT tool (OpenAI Whisper) can transcribe them from local files.
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
AUDIO_CACHE_DIR = get_hermes_dir("cache/audio", "audio_cache")
|
||||
AUDIO_CACHE_DIR = get_hermes_home() / "audio_cache"
|
||||
|
||||
|
||||
def get_audio_cache_dir() -> Path:
|
||||
@@ -175,51 +154,29 @@ def cache_audio_from_bytes(data: bytes, ext: str = ".ogg") -> str:
|
||||
return str(filepath)
|
||||
|
||||
|
||||
async def cache_audio_from_url(url: str, ext: str = ".ogg", retries: int = 2) -> str:
|
||||
async def cache_audio_from_url(url: str, ext: str = ".ogg") -> str:
|
||||
"""
|
||||
Download an audio file from a URL and save it to the local cache.
|
||||
|
||||
Retries on transient failures (timeouts, 429, 5xx) with exponential
|
||||
backoff so a single slow CDN response doesn't lose the media.
|
||||
|
||||
Args:
|
||||
url: The HTTP/HTTPS URL to download from.
|
||||
ext: File extension including the dot (e.g. ".ogg", ".mp3").
|
||||
retries: Number of retry attempts on transient failures.
|
||||
|
||||
Returns:
|
||||
Absolute path to the cached audio file as a string.
|
||||
"""
|
||||
import asyncio
|
||||
import httpx
|
||||
import logging as _logging
|
||||
_log = _logging.getLogger(__name__)
|
||||
|
||||
last_exc = None
|
||||
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
|
||||
for attempt in range(retries + 1):
|
||||
try:
|
||||
response = await client.get(
|
||||
url,
|
||||
headers={
|
||||
"User-Agent": "Mozilla/5.0 (compatible; HermesAgent/1.0)",
|
||||
"Accept": "audio/*,*/*;q=0.8",
|
||||
},
|
||||
)
|
||||
response.raise_for_status()
|
||||
return cache_audio_from_bytes(response.content, ext)
|
||||
except (httpx.TimeoutException, httpx.HTTPStatusError) as exc:
|
||||
last_exc = exc
|
||||
if isinstance(exc, httpx.HTTPStatusError) and exc.response.status_code < 429:
|
||||
raise
|
||||
if attempt < retries:
|
||||
wait = 1.5 * (attempt + 1)
|
||||
_log.debug("Audio cache retry %d/%d for %s (%.1fs): %s",
|
||||
attempt + 1, retries, url[:80], wait, exc)
|
||||
await asyncio.sleep(wait)
|
||||
continue
|
||||
raise
|
||||
raise last_exc
|
||||
response = await client.get(
|
||||
url,
|
||||
headers={
|
||||
"User-Agent": "Mozilla/5.0 (compatible; HermesAgent/1.0)",
|
||||
"Accept": "audio/*,*/*;q=0.8",
|
||||
},
|
||||
)
|
||||
response.raise_for_status()
|
||||
return cache_audio_from_bytes(response.content, ext)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
@@ -229,7 +186,7 @@ async def cache_audio_from_url(url: str, ext: str = ".ogg", retries: int = 2) ->
|
||||
# here so the agent can reference them by local file path.
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
DOCUMENT_CACHE_DIR = get_hermes_dir("cache/documents", "document_cache")
|
||||
DOCUMENT_CACHE_DIR = get_hermes_home() / "document_cache"
|
||||
|
||||
SUPPORTED_DOCUMENT_TYPES = {
|
||||
".pdf": "application/pdf",
|
||||
@@ -356,10 +313,7 @@ class MessageEvent:
|
||||
return None
|
||||
# Split on space and get first word, strip the /
|
||||
parts = self.text.split(maxsplit=1)
|
||||
raw = parts[0][1:].lower() if parts else None
|
||||
if raw and "@" in raw:
|
||||
raw = raw.split("@", 1)[0]
|
||||
return raw
|
||||
return parts[0][1:].lower() if parts else None
|
||||
|
||||
def get_command_args(self) -> str:
|
||||
"""Get the arguments after a command."""
|
||||
|
||||
@@ -550,22 +550,6 @@ class DiscordAdapter(BasePlatformAdapter):
|
||||
return
|
||||
# "all" falls through to handle_message
|
||||
|
||||
# If the message @mentions other users but NOT the bot, the
|
||||
# sender is talking to someone else — stay silent. Only
|
||||
# applies in server channels; in DMs the user is always
|
||||
# talking to the bot (mentions are just references).
|
||||
# Controlled by DISCORD_IGNORE_NO_MENTION (default: true).
|
||||
_ignore_no_mention = os.getenv(
|
||||
"DISCORD_IGNORE_NO_MENTION", "true"
|
||||
).lower() in ("true", "1", "yes")
|
||||
if _ignore_no_mention and message.mentions and not isinstance(message.channel, discord.DMChannel):
|
||||
_bot_mentioned = (
|
||||
self._client.user is not None
|
||||
and self._client.user in message.mentions
|
||||
)
|
||||
if not _bot_mentioned:
|
||||
return # Talking to someone else, don't interrupt
|
||||
|
||||
await self._handle_message(message)
|
||||
|
||||
@self._client.event
|
||||
@@ -2112,11 +2096,6 @@ class DiscordAdapter(BasePlatformAdapter):
|
||||
if pending_text_injection:
|
||||
event_text = f"{pending_text_injection}\n\n{event_text}" if event_text else pending_text_injection
|
||||
|
||||
# Defense-in-depth: prevent empty user messages from entering session
|
||||
# (can happen when user sends @mention-only with no other text)
|
||||
if not event_text or not event_text.strip():
|
||||
event_text = "(The user sent a message with no text content)"
|
||||
|
||||
event = MessageEvent(
|
||||
text=event_text,
|
||||
message_type=msg_type,
|
||||
|
||||
@@ -43,20 +43,6 @@ from gateway.platforms.base import (
|
||||
from gateway.config import Platform, PlatformConfig
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
# Automated sender patterns — emails from these are silently ignored
|
||||
_NOREPLY_PATTERNS = (
|
||||
"noreply", "no-reply", "no_reply", "donotreply", "do-not-reply",
|
||||
"mailer-daemon", "postmaster", "bounce", "notifications@",
|
||||
"automated@", "auto-confirm", "auto-reply", "automailer",
|
||||
)
|
||||
|
||||
# RFC headers that indicate bulk/automated mail
|
||||
_AUTOMATED_HEADERS = {
|
||||
"Auto-Submitted": lambda v: v.lower() != "no",
|
||||
"Precedence": lambda v: v.lower() in ("bulk", "list", "junk"),
|
||||
"X-Auto-Response-Suppress": lambda v: bool(v),
|
||||
"List-Unsubscribe": lambda v: bool(v),
|
||||
}
|
||||
|
||||
# Gmail-safe max length per email body
|
||||
MAX_MESSAGE_LENGTH = 50_000
|
||||
@@ -64,17 +50,7 @@ MAX_MESSAGE_LENGTH = 50_000
|
||||
# Supported image extensions for inline detection
|
||||
_IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}
|
||||
|
||||
def _is_automated_sender(address: str, headers: dict) -> bool:
|
||||
"""Return True if this email is from an automated/noreply source."""
|
||||
addr = address.lower()
|
||||
if any(pattern in addr for pattern in _NOREPLY_PATTERNS):
|
||||
return True
|
||||
for header, check in _AUTOMATED_HEADERS.items():
|
||||
value = headers.get(header, "")
|
||||
if value and check(value):
|
||||
return True
|
||||
return False
|
||||
|
||||
|
||||
def check_email_requirements() -> bool:
|
||||
"""Check if email platform dependencies are available."""
|
||||
addr = os.getenv("EMAIL_ADDRESS")
|
||||
@@ -237,7 +213,6 @@ class EmailAdapter(BasePlatformAdapter):
|
||||
|
||||
# Track message IDs we've already processed to avoid duplicates
|
||||
self._seen_uids: set = set()
|
||||
self._seen_uids_max: int = 2000 # cap to prevent unbounded memory growth
|
||||
self._poll_task: Optional[asyncio.Task] = None
|
||||
|
||||
# Map chat_id (sender email) -> last subject + message-id for threading
|
||||
@@ -245,26 +220,6 @@ class EmailAdapter(BasePlatformAdapter):
|
||||
|
||||
logger.info("[Email] Adapter initialized for %s", self._address)
|
||||
|
||||
def _trim_seen_uids(self) -> None:
|
||||
"""Keep only the most recent UIDs to prevent unbounded memory growth.
|
||||
|
||||
IMAP UIDs are monotonically increasing integers. When the set grows
|
||||
beyond the cap, we keep only the highest half — old UIDs are safe to
|
||||
drop because new messages always have higher UIDs and IMAP's UNSEEN
|
||||
flag prevents re-delivery regardless.
|
||||
"""
|
||||
if len(self._seen_uids) <= self._seen_uids_max:
|
||||
return
|
||||
try:
|
||||
# UIDs are bytes like b'1234' — sort numerically and keep top half
|
||||
sorted_uids = sorted(self._seen_uids, key=lambda u: int(u))
|
||||
keep = self._seen_uids_max // 2
|
||||
self._seen_uids = set(sorted_uids[-keep:])
|
||||
logger.debug("[Email] Trimmed seen UIDs to %d entries", len(self._seen_uids))
|
||||
except (ValueError, TypeError):
|
||||
# Fallback: just clear old entries if sort fails
|
||||
self._seen_uids = set(list(self._seen_uids)[-self._seen_uids_max // 2:])
|
||||
|
||||
async def connect(self) -> bool:
|
||||
"""Connect to the IMAP server and start polling for new messages."""
|
||||
try:
|
||||
@@ -277,8 +232,6 @@ class EmailAdapter(BasePlatformAdapter):
|
||||
if status == "OK" and data and data[0]:
|
||||
for uid in data[0].split():
|
||||
self._seen_uids.add(uid)
|
||||
# Keep only the most recent UIDs to prevent unbounded growth
|
||||
self._trim_seen_uids()
|
||||
imap.logout()
|
||||
logger.info("[Email] IMAP connection test passed. %d existing messages skipped.", len(self._seen_uids))
|
||||
except Exception as e:
|
||||
@@ -349,9 +302,6 @@ class EmailAdapter(BasePlatformAdapter):
|
||||
if uid in self._seen_uids:
|
||||
continue
|
||||
self._seen_uids.add(uid)
|
||||
# Trim periodically to prevent unbounded memory growth
|
||||
if len(self._seen_uids) > self._seen_uids_max:
|
||||
self._trim_seen_uids()
|
||||
|
||||
status, msg_data = imap.uid("fetch", uid, "(RFC822)")
|
||||
if status != "OK":
|
||||
@@ -370,11 +320,6 @@ class EmailAdapter(BasePlatformAdapter):
|
||||
subject = _decode_header_value(msg.get("Subject", "(no subject)"))
|
||||
message_id = msg.get("Message-ID", "")
|
||||
in_reply_to = msg.get("In-Reply-To", "")
|
||||
# Skip automated/noreply senders before any processing
|
||||
msg_headers = dict(msg.items())
|
||||
if _is_automated_sender(sender_addr, msg_headers):
|
||||
logger.debug("[Email] Skipping automated sender: %s", sender_addr)
|
||||
continue
|
||||
body = _extract_text_body(msg)
|
||||
attachments = _extract_attachments(msg, skip_attachments=self._skip_attachments)
|
||||
|
||||
@@ -403,11 +348,6 @@ class EmailAdapter(BasePlatformAdapter):
|
||||
if sender_addr == self._address.lower():
|
||||
return
|
||||
|
||||
# Never reply to automated senders
|
||||
if _is_automated_sender(sender_addr, {}):
|
||||
logger.debug("[Email] Dropping automated sender at dispatch: %s", sender_addr)
|
||||
return
|
||||
|
||||
subject = msg_data["subject"]
|
||||
body = msg_data["body"].strip()
|
||||
attachments = msg_data["attachments"]
|
||||
|
||||
+21
-126
@@ -40,9 +40,7 @@ logger = logging.getLogger(__name__)
|
||||
MAX_MESSAGE_LENGTH = 4000
|
||||
|
||||
# Store directory for E2EE keys and sync state.
|
||||
# Uses get_hermes_home() so each profile gets its own Matrix store.
|
||||
from hermes_constants import get_hermes_dir as _get_hermes_dir
|
||||
_STORE_DIR = _get_hermes_dir("platforms/matrix/store", "matrix/store")
|
||||
_STORE_DIR = Path.home() / ".hermes" / "matrix" / "store"
|
||||
|
||||
# Grace period: ignore messages older than this many seconds before startup.
|
||||
_STARTUP_GRACE_SECONDS = 5
|
||||
@@ -163,49 +161,22 @@ class MatrixAdapter(BasePlatformAdapter):
|
||||
# Authenticate.
|
||||
if self._access_token:
|
||||
client.access_token = self._access_token
|
||||
|
||||
# With access-token auth, always resolve whoami so we validate the
|
||||
# token and learn the device_id. The device_id matters for E2EE:
|
||||
# without it, matrix-nio can send plain messages but may fail to
|
||||
# decrypt inbound encrypted events or encrypt outbound room sends.
|
||||
resp = await client.whoami()
|
||||
if isinstance(resp, nio.WhoamiResponse):
|
||||
resolved_user_id = getattr(resp, "user_id", "") or self._user_id
|
||||
resolved_device_id = getattr(resp, "device_id", "")
|
||||
if resolved_user_id:
|
||||
self._user_id = resolved_user_id
|
||||
|
||||
# restore_login() is the matrix-nio path that binds the access
|
||||
# token to a specific device and loads the crypto store.
|
||||
if resolved_device_id and hasattr(client, "restore_login"):
|
||||
client.restore_login(
|
||||
self._user_id or resolved_user_id,
|
||||
resolved_device_id,
|
||||
self._access_token,
|
||||
)
|
||||
# Resolve user_id if not set.
|
||||
if not self._user_id:
|
||||
resp = await client.whoami()
|
||||
if isinstance(resp, nio.WhoamiResponse):
|
||||
self._user_id = resp.user_id
|
||||
client.user_id = resp.user_id
|
||||
logger.info("Matrix: authenticated as %s", self._user_id)
|
||||
else:
|
||||
if self._user_id:
|
||||
client.user_id = self._user_id
|
||||
if resolved_device_id:
|
||||
client.device_id = resolved_device_id
|
||||
client.access_token = self._access_token
|
||||
if self._encryption:
|
||||
logger.warning(
|
||||
"Matrix: access-token login did not restore E2EE state; "
|
||||
"encrypted rooms may fail until a device_id is available"
|
||||
)
|
||||
|
||||
logger.info(
|
||||
"Matrix: using access token for %s%s",
|
||||
self._user_id or "(unknown user)",
|
||||
f" (device {resolved_device_id})" if resolved_device_id else "",
|
||||
)
|
||||
logger.error(
|
||||
"Matrix: whoami failed — check MATRIX_ACCESS_TOKEN and MATRIX_HOMESERVER"
|
||||
)
|
||||
await client.close()
|
||||
return False
|
||||
else:
|
||||
logger.error(
|
||||
"Matrix: whoami failed — check MATRIX_ACCESS_TOKEN and MATRIX_HOMESERVER"
|
||||
)
|
||||
await client.close()
|
||||
return False
|
||||
client.user_id = self._user_id
|
||||
logger.info("Matrix: using access token for %s", self._user_id)
|
||||
elif self._password and self._user_id:
|
||||
resp = await client.login(
|
||||
self._password,
|
||||
@@ -223,18 +194,13 @@ class MatrixAdapter(BasePlatformAdapter):
|
||||
return False
|
||||
|
||||
# If E2EE is enabled, load the crypto store.
|
||||
if self._encryption and getattr(client, "olm", None):
|
||||
if self._encryption and hasattr(client, "olm"):
|
||||
try:
|
||||
if client.should_upload_keys:
|
||||
await client.keys_upload()
|
||||
logger.info("Matrix: E2EE crypto initialized")
|
||||
except Exception as exc:
|
||||
logger.warning("Matrix: crypto init issue: %s", exc)
|
||||
elif self._encryption:
|
||||
logger.warning(
|
||||
"Matrix: E2EE requested but crypto store is not loaded; "
|
||||
"encrypted rooms may fail"
|
||||
)
|
||||
|
||||
# Register event callbacks.
|
||||
client.add_event_callback(self._on_room_message, nio.RoomMessageText)
|
||||
@@ -264,7 +230,6 @@ class MatrixAdapter(BasePlatformAdapter):
|
||||
)
|
||||
# Build DM room cache from m.direct account data.
|
||||
await self._refresh_dm_cache()
|
||||
await self._run_e2ee_maintenance()
|
||||
else:
|
||||
logger.warning("Matrix: initial sync returned %s", type(resp).__name__)
|
||||
|
||||
@@ -336,48 +301,13 @@ class MatrixAdapter(BasePlatformAdapter):
|
||||
relates_to["m.in_reply_to"] = {"event_id": reply_to}
|
||||
msg_content["m.relates_to"] = relates_to
|
||||
|
||||
async def _room_send_once(*, ignore_unverified_devices: bool = False):
|
||||
return await asyncio.wait_for(
|
||||
self._client.room_send(
|
||||
chat_id,
|
||||
"m.room.message",
|
||||
msg_content,
|
||||
ignore_unverified_devices=ignore_unverified_devices,
|
||||
),
|
||||
timeout=45,
|
||||
)
|
||||
|
||||
try:
|
||||
resp = await _room_send_once(ignore_unverified_devices=False)
|
||||
except Exception as exc:
|
||||
retryable = isinstance(exc, asyncio.TimeoutError)
|
||||
olm_unverified = getattr(nio, "OlmUnverifiedDeviceError", None)
|
||||
send_retry = getattr(nio, "SendRetryError", None)
|
||||
if isinstance(olm_unverified, type) and isinstance(exc, olm_unverified):
|
||||
retryable = True
|
||||
if isinstance(send_retry, type) and isinstance(exc, send_retry):
|
||||
retryable = True
|
||||
|
||||
if not retryable:
|
||||
logger.error("Matrix: failed to send to %s: %s", chat_id, exc)
|
||||
return SendResult(success=False, error=str(exc))
|
||||
|
||||
logger.warning(
|
||||
"Matrix: initial encrypted send to %s failed (%s); "
|
||||
"retrying after E2EE maintenance with ignored unverified devices",
|
||||
chat_id,
|
||||
exc,
|
||||
)
|
||||
await self._run_e2ee_maintenance()
|
||||
try:
|
||||
resp = await _room_send_once(ignore_unverified_devices=True)
|
||||
except Exception as retry_exc:
|
||||
logger.error("Matrix: failed to send to %s after retry: %s", chat_id, retry_exc)
|
||||
return SendResult(success=False, error=str(retry_exc))
|
||||
|
||||
resp = await self._client.room_send(
|
||||
chat_id,
|
||||
"m.room.message",
|
||||
msg_content,
|
||||
)
|
||||
if isinstance(resp, nio.RoomSendResponse):
|
||||
last_event_id = resp.event_id
|
||||
logger.info("Matrix: sent event %s to %s", last_event_id, chat_id)
|
||||
else:
|
||||
err = getattr(resp, "message", str(resp))
|
||||
logger.error("Matrix: failed to send to %s: %s", chat_id, err)
|
||||
@@ -635,9 +565,6 @@ class MatrixAdapter(BasePlatformAdapter):
|
||||
getattr(resp, "message", resp),
|
||||
)
|
||||
await asyncio.sleep(5)
|
||||
continue
|
||||
|
||||
await self._run_e2ee_maintenance()
|
||||
except asyncio.CancelledError:
|
||||
return
|
||||
except Exception as exc:
|
||||
@@ -646,38 +573,6 @@ class MatrixAdapter(BasePlatformAdapter):
|
||||
logger.warning("Matrix: sync error: %s — retrying in 5s", exc)
|
||||
await asyncio.sleep(5)
|
||||
|
||||
async def _run_e2ee_maintenance(self) -> None:
|
||||
"""Run matrix-nio E2EE housekeeping between syncs.
|
||||
|
||||
Hermes uses a custom sync loop instead of matrix-nio's sync_forever(),
|
||||
so we need to explicitly drive the key management work that sync_forever()
|
||||
normally handles for encrypted rooms.
|
||||
"""
|
||||
client = self._client
|
||||
if not client or not self._encryption or not getattr(client, "olm", None):
|
||||
return
|
||||
|
||||
tasks = [asyncio.create_task(client.send_to_device_messages())]
|
||||
|
||||
if client.should_upload_keys:
|
||||
tasks.append(asyncio.create_task(client.keys_upload()))
|
||||
|
||||
if client.should_query_keys:
|
||||
tasks.append(asyncio.create_task(client.keys_query()))
|
||||
|
||||
if client.should_claim_keys:
|
||||
users = client.get_users_for_key_claiming()
|
||||
if users:
|
||||
tasks.append(asyncio.create_task(client.keys_claim(users)))
|
||||
|
||||
for task in asyncio.as_completed(tasks):
|
||||
try:
|
||||
await task
|
||||
except asyncio.CancelledError:
|
||||
raise
|
||||
except Exception as exc:
|
||||
logger.warning("Matrix: E2EE maintenance task failed: %s", exc)
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Event callbacks
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
@@ -407,38 +407,18 @@ class MattermostAdapter(BasePlatformAdapter):
|
||||
kind: str = "file",
|
||||
) -> SendResult:
|
||||
"""Download a URL and upload it as a file attachment."""
|
||||
import asyncio
|
||||
import aiohttp
|
||||
|
||||
last_exc = None
|
||||
file_data = None
|
||||
ct = "application/octet-stream"
|
||||
fname = url.rsplit("/", 1)[-1].split("?")[0] or f"{kind}.png"
|
||||
|
||||
for attempt in range(3):
|
||||
try:
|
||||
async with self._session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as resp:
|
||||
if resp.status >= 500 or resp.status == 429:
|
||||
if attempt < 2:
|
||||
logger.debug("Mattermost download retry %d/2 for %s (status %d)",
|
||||
attempt + 1, url[:80], resp.status)
|
||||
await asyncio.sleep(1.5 * (attempt + 1))
|
||||
continue
|
||||
if resp.status >= 400:
|
||||
return await self.send(chat_id, f"{caption or ''}\n{url}".strip(), reply_to)
|
||||
file_data = await resp.read()
|
||||
ct = resp.content_type or "application/octet-stream"
|
||||
break
|
||||
except (aiohttp.ClientError, asyncio.TimeoutError) as exc:
|
||||
last_exc = exc
|
||||
if attempt < 2:
|
||||
await asyncio.sleep(1.5 * (attempt + 1))
|
||||
continue
|
||||
logger.warning("Mattermost: failed to download %s after %d attempts: %s", url, attempt + 1, exc)
|
||||
return await self.send(chat_id, f"{caption or ''}\n{url}".strip(), reply_to)
|
||||
|
||||
if file_data is None:
|
||||
logger.warning("Mattermost: download returned no data for %s", url)
|
||||
try:
|
||||
async with self._session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as resp:
|
||||
if resp.status >= 400:
|
||||
# Fall back to sending the URL as text.
|
||||
return await self.send(chat_id, f"{caption or ''}\n{url}".strip(), reply_to)
|
||||
file_data = await resp.read()
|
||||
ct = resp.content_type or "application/octet-stream"
|
||||
# Derive filename from URL.
|
||||
fname = url.rsplit("/", 1)[-1].split("?")[0] or f"{kind}.png"
|
||||
except Exception as exc:
|
||||
logger.warning("Mattermost: failed to download %s: %s", url, exc)
|
||||
return await self.send(chat_id, f"{caption or ''}\n{url}".strip(), reply_to)
|
||||
|
||||
file_id = await self._upload_file(chat_id, file_data, fname, ct)
|
||||
|
||||
@@ -279,12 +279,6 @@ class SignalAdapter(BasePlatformAdapter):
|
||||
line = line.strip()
|
||||
if not line:
|
||||
continue
|
||||
# SSE keepalive comments (":") prove the connection
|
||||
# is alive — update activity so the health monitor
|
||||
# doesn't report false idle warnings.
|
||||
if line.startswith(":"):
|
||||
self._last_sse_activity = time.time()
|
||||
continue
|
||||
# Parse SSE data lines
|
||||
if line.startswith("data:"):
|
||||
data_str = line[5:].strip()
|
||||
|
||||
+19
-51
@@ -819,65 +819,33 @@ class SlackAdapter(BasePlatformAdapter):
|
||||
await self.handle_message(event)
|
||||
|
||||
async def _download_slack_file(self, url: str, ext: str, audio: bool = False) -> str:
|
||||
"""Download a Slack file using the bot token for auth, with retry."""
|
||||
import asyncio
|
||||
"""Download a Slack file using the bot token for auth."""
|
||||
import httpx
|
||||
|
||||
bot_token = self.config.token
|
||||
last_exc = None
|
||||
|
||||
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
|
||||
for attempt in range(3):
|
||||
try:
|
||||
response = await client.get(
|
||||
url,
|
||||
headers={"Authorization": f"Bearer {bot_token}"},
|
||||
)
|
||||
response.raise_for_status()
|
||||
response = await client.get(
|
||||
url,
|
||||
headers={"Authorization": f"Bearer {bot_token}"},
|
||||
)
|
||||
response.raise_for_status()
|
||||
|
||||
if audio:
|
||||
from gateway.platforms.base import cache_audio_from_bytes
|
||||
return cache_audio_from_bytes(response.content, ext)
|
||||
else:
|
||||
from gateway.platforms.base import cache_image_from_bytes
|
||||
return cache_image_from_bytes(response.content, ext)
|
||||
except (httpx.TimeoutException, httpx.HTTPStatusError) as exc:
|
||||
last_exc = exc
|
||||
if isinstance(exc, httpx.HTTPStatusError) and exc.response.status_code < 429:
|
||||
raise
|
||||
if attempt < 2:
|
||||
logger.debug("Slack file download retry %d/2 for %s: %s",
|
||||
attempt + 1, url[:80], exc)
|
||||
await asyncio.sleep(1.5 * (attempt + 1))
|
||||
continue
|
||||
raise
|
||||
raise last_exc
|
||||
if audio:
|
||||
from gateway.platforms.base import cache_audio_from_bytes
|
||||
return cache_audio_from_bytes(response.content, ext)
|
||||
else:
|
||||
from gateway.platforms.base import cache_image_from_bytes
|
||||
return cache_image_from_bytes(response.content, ext)
|
||||
|
||||
async def _download_slack_file_bytes(self, url: str) -> bytes:
|
||||
"""Download a Slack file and return raw bytes, with retry."""
|
||||
import asyncio
|
||||
"""Download a Slack file and return raw bytes."""
|
||||
import httpx
|
||||
|
||||
bot_token = self.config.token
|
||||
last_exc = None
|
||||
|
||||
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
|
||||
for attempt in range(3):
|
||||
try:
|
||||
response = await client.get(
|
||||
url,
|
||||
headers={"Authorization": f"Bearer {bot_token}"},
|
||||
)
|
||||
response.raise_for_status()
|
||||
return response.content
|
||||
except (httpx.TimeoutException, httpx.HTTPStatusError) as exc:
|
||||
last_exc = exc
|
||||
if isinstance(exc, httpx.HTTPStatusError) and exc.response.status_code < 429:
|
||||
raise
|
||||
if attempt < 2:
|
||||
logger.debug("Slack file download retry %d/2 for %s: %s",
|
||||
attempt + 1, url[:80], exc)
|
||||
await asyncio.sleep(1.5 * (attempt + 1))
|
||||
continue
|
||||
raise
|
||||
raise last_exc
|
||||
response = await client.get(
|
||||
url,
|
||||
headers={"Authorization": f"Bearer {bot_token}"},
|
||||
)
|
||||
response.raise_for_status()
|
||||
return response.content
|
||||
|
||||
@@ -11,7 +11,7 @@ import asyncio
|
||||
import logging
|
||||
import os
|
||||
import re
|
||||
from typing import Dict, List, Optional, Any
|
||||
from typing import Dict, Optional, Any
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -25,7 +25,6 @@ try:
|
||||
filters,
|
||||
)
|
||||
from telegram.constants import ParseMode, ChatType
|
||||
from telegram.request import HTTPXRequest
|
||||
TELEGRAM_AVAILABLE = True
|
||||
except ImportError:
|
||||
TELEGRAM_AVAILABLE = False
|
||||
@@ -35,7 +34,6 @@ except ImportError:
|
||||
Application = Any
|
||||
CommandHandler = Any
|
||||
TelegramMessageHandler = Any
|
||||
HTTPXRequest = Any
|
||||
filters = None
|
||||
ParseMode = None
|
||||
ChatType = None
|
||||
@@ -61,11 +59,6 @@ from gateway.platforms.base import (
|
||||
cache_document_from_bytes,
|
||||
SUPPORTED_DOCUMENT_TYPES,
|
||||
)
|
||||
from gateway.platforms.telegram_network import (
|
||||
TelegramFallbackTransport,
|
||||
discover_fallback_ips,
|
||||
parse_fallback_ip_env,
|
||||
)
|
||||
|
||||
|
||||
def check_telegram_requirements() -> bool:
|
||||
@@ -145,13 +138,6 @@ class TelegramAdapter(BasePlatformAdapter):
|
||||
# DM Topics config from extra.dm_topics
|
||||
self._dm_topics_config: List[Dict[str, Any]] = self.config.extra.get("dm_topics", [])
|
||||
|
||||
def _fallback_ips(self) -> list[str]:
|
||||
"""Return validated fallback IPs from config (populated by _apply_env_overrides)."""
|
||||
configured = self.config.extra.get("fallback_ips", []) if getattr(self.config, "extra", None) else []
|
||||
if isinstance(configured, str):
|
||||
configured = configured.split(",")
|
||||
return parse_fallback_ip_env(",".join(str(v) for v in configured) if configured else None)
|
||||
|
||||
@staticmethod
|
||||
def _looks_like_polling_conflict(error: Exception) -> bool:
|
||||
text = str(error).lower()
|
||||
@@ -345,8 +331,7 @@ class TelegramAdapter(BasePlatformAdapter):
|
||||
def _persist_dm_topic_thread_id(self, chat_id: int, topic_name: str, thread_id: int) -> None:
|
||||
"""Save a newly created thread_id back into config.yaml so it persists across restarts."""
|
||||
try:
|
||||
from hermes_constants import get_hermes_home
|
||||
config_path = get_hermes_home() / "config.yaml"
|
||||
config_path = _Path.home() / ".hermes" / "config.yaml"
|
||||
if not config_path.exists():
|
||||
logger.warning("[%s] Config file not found at %s, cannot persist thread_id", self.name, config_path)
|
||||
return
|
||||
@@ -489,26 +474,7 @@ class TelegramAdapter(BasePlatformAdapter):
|
||||
return False
|
||||
|
||||
# Build the application
|
||||
builder = Application.builder().token(self.config.token)
|
||||
fallback_ips = self._fallback_ips()
|
||||
if not fallback_ips:
|
||||
fallback_ips = await discover_fallback_ips()
|
||||
logger.info(
|
||||
"[%s] Auto-discovered Telegram fallback IPs: %s",
|
||||
self.name,
|
||||
", ".join(fallback_ips),
|
||||
)
|
||||
if fallback_ips:
|
||||
logger.warning(
|
||||
"[%s] Telegram fallback IPs active: %s",
|
||||
self.name,
|
||||
", ".join(fallback_ips),
|
||||
)
|
||||
transport = TelegramFallbackTransport(fallback_ips)
|
||||
request = HTTPXRequest(httpx_kwargs={"transport": transport})
|
||||
get_updates_request = HTTPXRequest(httpx_kwargs={"transport": transport})
|
||||
builder = builder.request(request).get_updates_request(get_updates_request)
|
||||
self._app = builder.build()
|
||||
self._app = Application.builder().token(self.config.token).build()
|
||||
self._bot = self._app.bot
|
||||
|
||||
# Register handlers
|
||||
@@ -708,15 +674,9 @@ class TelegramAdapter(BasePlatformAdapter):
|
||||
except ImportError:
|
||||
_NetErr = OSError # type: ignore[misc,assignment]
|
||||
|
||||
try:
|
||||
from telegram.error import BadRequest as _BadReq
|
||||
except ImportError:
|
||||
_BadReq = None # type: ignore[assignment,misc]
|
||||
|
||||
for i, chunk in enumerate(chunks):
|
||||
should_thread = self._should_thread_reply(reply_to, i)
|
||||
reply_to_id = int(reply_to) if should_thread else None
|
||||
effective_thread_id = int(thread_id) if thread_id else None
|
||||
|
||||
msg = None
|
||||
for _send_attempt in range(3):
|
||||
@@ -728,7 +688,7 @@ class TelegramAdapter(BasePlatformAdapter):
|
||||
text=chunk,
|
||||
parse_mode=ParseMode.MARKDOWN_V2,
|
||||
reply_to_message_id=reply_to_id,
|
||||
message_thread_id=effective_thread_id,
|
||||
message_thread_id=int(thread_id) if thread_id else None,
|
||||
)
|
||||
except Exception as md_error:
|
||||
# Markdown parsing failed, try plain text
|
||||
@@ -740,30 +700,12 @@ class TelegramAdapter(BasePlatformAdapter):
|
||||
text=plain_chunk,
|
||||
parse_mode=None,
|
||||
reply_to_message_id=reply_to_id,
|
||||
message_thread_id=effective_thread_id,
|
||||
message_thread_id=int(thread_id) if thread_id else None,
|
||||
)
|
||||
else:
|
||||
raise
|
||||
break # success
|
||||
except _NetErr as send_err:
|
||||
# BadRequest is a subclass of NetworkError in
|
||||
# python-telegram-bot but represents permanent errors
|
||||
# (not transient network issues). Detect and handle
|
||||
# specific cases instead of blindly retrying.
|
||||
if _BadReq and isinstance(send_err, _BadReq):
|
||||
err_lower = str(send_err).lower()
|
||||
if "thread not found" in err_lower and effective_thread_id is not None:
|
||||
# Thread doesn't exist — retry without
|
||||
# message_thread_id so the message still
|
||||
# reaches the chat.
|
||||
logger.warning(
|
||||
"[%s] Thread %s not found, retrying without message_thread_id",
|
||||
self.name, effective_thread_id,
|
||||
)
|
||||
effective_thread_id = None
|
||||
continue
|
||||
# Other BadRequest errors are permanent — don't retry
|
||||
raise
|
||||
if _send_attempt < 2:
|
||||
wait = 2 ** _send_attempt
|
||||
logger.warning("[%s] Network error on send (attempt %d/3), retrying in %ds: %s",
|
||||
@@ -1758,8 +1700,7 @@ class TelegramAdapter(BasePlatformAdapter):
|
||||
recognized without a gateway restart.
|
||||
"""
|
||||
try:
|
||||
from hermes_constants import get_hermes_home
|
||||
config_path = get_hermes_home() / "config.yaml"
|
||||
config_path = _Path.home() / ".hermes" / "config.yaml"
|
||||
if not config_path.exists():
|
||||
return
|
||||
|
||||
|
||||
@@ -1,245 +0,0 @@
|
||||
"""Telegram-specific network helpers.
|
||||
|
||||
Provides a hostname-preserving fallback transport for networks where
|
||||
api.telegram.org resolves to an endpoint that is unreachable from the current
|
||||
host. The transport keeps the logical request host and TLS SNI as
|
||||
api.telegram.org while retrying the TCP connection against one or more fallback
|
||||
IPv4 addresses.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import ipaddress
|
||||
import logging
|
||||
import os
|
||||
import socket
|
||||
from typing import Iterable, Optional
|
||||
|
||||
import httpx
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
_TELEGRAM_API_HOST = "api.telegram.org"
|
||||
|
||||
# DNS-over-HTTPS providers used to discover Telegram API IPs that may differ
|
||||
# from the (potentially unreachable) IP returned by the local system resolver.
|
||||
_DOH_TIMEOUT = 4.0 # seconds — bounded so connect() isn't noticeably delayed
|
||||
|
||||
_DOH_PROVIDERS: list[dict] = [
|
||||
{
|
||||
"url": "https://dns.google/resolve",
|
||||
"params": {"name": _TELEGRAM_API_HOST, "type": "A"},
|
||||
"headers": {},
|
||||
},
|
||||
{
|
||||
"url": "https://cloudflare-dns.com/dns-query",
|
||||
"params": {"name": _TELEGRAM_API_HOST, "type": "A"},
|
||||
"headers": {"Accept": "application/dns-json"},
|
||||
},
|
||||
]
|
||||
|
||||
# Last-resort IPs when DoH is also blocked. These are stable Telegram Bot API
|
||||
# endpoints in the 149.154.160.0/20 block (same seed used by OpenClaw).
|
||||
_SEED_FALLBACK_IPS: list[str] = ["149.154.167.220"]
|
||||
|
||||
|
||||
def _resolve_proxy_url() -> str | None:
|
||||
for key in ("HTTPS_PROXY", "HTTP_PROXY", "ALL_PROXY", "https_proxy", "http_proxy", "all_proxy"):
|
||||
value = (os.environ.get(key) or "").strip()
|
||||
if value:
|
||||
return value
|
||||
return None
|
||||
|
||||
|
||||
class TelegramFallbackTransport(httpx.AsyncBaseTransport):
|
||||
"""Retry Telegram Bot API requests via fallback IPs while preserving TLS/SNI.
|
||||
|
||||
Requests continue to target https://api.telegram.org/... logically, but on
|
||||
connect failures the underlying TCP connection is retried against a known
|
||||
reachable IP. This is effectively the programmatic equivalent of
|
||||
``curl --resolve api.telegram.org:443:<ip>``.
|
||||
"""
|
||||
|
||||
def __init__(self, fallback_ips: Iterable[str], **transport_kwargs):
|
||||
self._fallback_ips = [ip for ip in dict.fromkeys(_normalize_fallback_ips(fallback_ips))]
|
||||
proxy_url = _resolve_proxy_url()
|
||||
if proxy_url and "proxy" not in transport_kwargs:
|
||||
transport_kwargs["proxy"] = proxy_url
|
||||
self._primary = httpx.AsyncHTTPTransport(**transport_kwargs)
|
||||
self._fallbacks = {
|
||||
ip: httpx.AsyncHTTPTransport(**transport_kwargs) for ip in self._fallback_ips
|
||||
}
|
||||
self._sticky_ip: Optional[str] = None
|
||||
self._sticky_lock = asyncio.Lock()
|
||||
|
||||
async def handle_async_request(self, request: httpx.Request) -> httpx.Response:
|
||||
if request.url.host != _TELEGRAM_API_HOST or not self._fallback_ips:
|
||||
return await self._primary.handle_async_request(request)
|
||||
|
||||
sticky_ip = self._sticky_ip
|
||||
attempt_order: list[Optional[str]] = [sticky_ip] if sticky_ip else [None]
|
||||
for ip in self._fallback_ips:
|
||||
if ip != sticky_ip:
|
||||
attempt_order.append(ip)
|
||||
|
||||
last_error: Exception | None = None
|
||||
for ip in attempt_order:
|
||||
candidate = request if ip is None else _rewrite_request_for_ip(request, ip)
|
||||
transport = self._primary if ip is None else self._fallbacks[ip]
|
||||
try:
|
||||
response = await transport.handle_async_request(candidate)
|
||||
if ip is not None and self._sticky_ip != ip:
|
||||
async with self._sticky_lock:
|
||||
if self._sticky_ip != ip:
|
||||
self._sticky_ip = ip
|
||||
logger.warning(
|
||||
"[Telegram] Primary api.telegram.org path unreachable; using sticky fallback IP %s",
|
||||
ip,
|
||||
)
|
||||
return response
|
||||
except Exception as exc:
|
||||
last_error = exc
|
||||
if not _is_retryable_connect_error(exc):
|
||||
raise
|
||||
if ip is None:
|
||||
logger.warning(
|
||||
"[Telegram] Primary api.telegram.org connection failed (%s); trying fallback IPs %s",
|
||||
exc,
|
||||
", ".join(self._fallback_ips),
|
||||
)
|
||||
continue
|
||||
logger.warning("[Telegram] Fallback IP %s failed: %s", ip, exc)
|
||||
continue
|
||||
|
||||
assert last_error is not None
|
||||
raise last_error
|
||||
|
||||
async def aclose(self) -> None:
|
||||
await self._primary.aclose()
|
||||
for transport in self._fallbacks.values():
|
||||
await transport.aclose()
|
||||
|
||||
|
||||
def _normalize_fallback_ips(values: Iterable[str]) -> list[str]:
|
||||
normalized: list[str] = []
|
||||
for value in values:
|
||||
raw = str(value).strip()
|
||||
if not raw:
|
||||
continue
|
||||
try:
|
||||
addr = ipaddress.ip_address(raw)
|
||||
except ValueError:
|
||||
logger.warning("Ignoring invalid Telegram fallback IP: %r", raw)
|
||||
continue
|
||||
if addr.version != 4:
|
||||
logger.warning("Ignoring non-IPv4 Telegram fallback IP: %s", raw)
|
||||
continue
|
||||
normalized.append(str(addr))
|
||||
return normalized
|
||||
|
||||
|
||||
def parse_fallback_ip_env(value: str | None) -> list[str]:
|
||||
if not value:
|
||||
return []
|
||||
parts = [part.strip() for part in value.split(",")]
|
||||
return _normalize_fallback_ips(parts)
|
||||
|
||||
|
||||
def _resolve_system_dns() -> set[str]:
|
||||
"""Return the IPv4 addresses that the OS resolver gives for api.telegram.org."""
|
||||
try:
|
||||
results = socket.getaddrinfo(_TELEGRAM_API_HOST, 443, socket.AF_INET)
|
||||
return {addr[4][0] for addr in results}
|
||||
except Exception:
|
||||
return set()
|
||||
|
||||
|
||||
async def _query_doh_provider(
|
||||
client: httpx.AsyncClient, provider: dict
|
||||
) -> list[str]:
|
||||
"""Query one DoH provider and return A-record IPs."""
|
||||
try:
|
||||
resp = await client.get(
|
||||
provider["url"], params=provider["params"], headers=provider["headers"]
|
||||
)
|
||||
resp.raise_for_status()
|
||||
data = resp.json()
|
||||
ips: list[str] = []
|
||||
for answer in data.get("Answer", []):
|
||||
if answer.get("type") != 1: # A record
|
||||
continue
|
||||
raw = answer.get("data", "").strip()
|
||||
try:
|
||||
ipaddress.ip_address(raw)
|
||||
ips.append(raw)
|
||||
except ValueError:
|
||||
continue
|
||||
return ips
|
||||
except Exception as exc:
|
||||
logger.debug("DoH query to %s failed: %s", provider["url"], exc)
|
||||
return []
|
||||
|
||||
|
||||
async def discover_fallback_ips() -> list[str]:
|
||||
"""Auto-discover Telegram API IPs via DNS-over-HTTPS.
|
||||
|
||||
Resolves api.telegram.org through Google and Cloudflare DoH, collects all
|
||||
unique IPs, and excludes the system-DNS-resolved IP (which is presumably
|
||||
unreachable on this network). Falls back to a hardcoded seed list when DoH
|
||||
is also unavailable.
|
||||
"""
|
||||
async with httpx.AsyncClient(timeout=httpx.Timeout(_DOH_TIMEOUT)) as client:
|
||||
doh_tasks = [_query_doh_provider(client, p) for p in _DOH_PROVIDERS]
|
||||
system_dns_task = asyncio.to_thread(_resolve_system_dns)
|
||||
results = await asyncio.gather(system_dns_task, *doh_tasks, return_exceptions=True)
|
||||
|
||||
# results[0] = system DNS IPs (set), results[1:] = DoH IP lists
|
||||
system_ips: set[str] = results[0] if isinstance(results[0], set) else set()
|
||||
|
||||
doh_ips: list[str] = []
|
||||
for r in results[1:]:
|
||||
if isinstance(r, list):
|
||||
doh_ips.extend(r)
|
||||
|
||||
# Deduplicate preserving order, exclude system-DNS IPs
|
||||
seen: set[str] = set()
|
||||
candidates: list[str] = []
|
||||
for ip in doh_ips:
|
||||
if ip not in seen and ip not in system_ips:
|
||||
seen.add(ip)
|
||||
candidates.append(ip)
|
||||
|
||||
# Validate through existing normalization
|
||||
validated = _normalize_fallback_ips(candidates)
|
||||
|
||||
if validated:
|
||||
logger.debug("Discovered Telegram fallback IPs via DoH: %s", ", ".join(validated))
|
||||
return validated
|
||||
|
||||
logger.info(
|
||||
"DoH discovery yielded no new IPs (system DNS: %s); using seed fallback IPs %s",
|
||||
", ".join(system_ips) or "unknown",
|
||||
", ".join(_SEED_FALLBACK_IPS),
|
||||
)
|
||||
return list(_SEED_FALLBACK_IPS)
|
||||
|
||||
|
||||
def _rewrite_request_for_ip(request: httpx.Request, ip: str) -> httpx.Request:
|
||||
original_host = request.url.host or _TELEGRAM_API_HOST
|
||||
url = request.url.copy_with(host=ip)
|
||||
headers = request.headers.copy()
|
||||
headers["host"] = original_host
|
||||
extensions = dict(request.extensions)
|
||||
extensions["sni_hostname"] = original_host
|
||||
return httpx.Request(
|
||||
method=request.method,
|
||||
url=url,
|
||||
headers=headers,
|
||||
stream=request.stream,
|
||||
extensions=extensions,
|
||||
)
|
||||
|
||||
|
||||
def _is_retryable_connect_error(exc: Exception) -> bool:
|
||||
return isinstance(exc, (httpx.ConnectTimeout, httpx.ConnectError))
|
||||
@@ -27,7 +27,6 @@ import hashlib
|
||||
import hmac
|
||||
import json
|
||||
import logging
|
||||
import os
|
||||
import re
|
||||
import subprocess
|
||||
import time
|
||||
@@ -54,7 +53,6 @@ logger = logging.getLogger(__name__)
|
||||
DEFAULT_HOST = "0.0.0.0"
|
||||
DEFAULT_PORT = 8644
|
||||
_INSECURE_NO_AUTH = "INSECURE_NO_AUTH"
|
||||
_DYNAMIC_ROUTES_FILENAME = "webhook_subscriptions.json"
|
||||
|
||||
|
||||
def check_webhook_requirements() -> bool:
|
||||
@@ -70,10 +68,7 @@ class WebhookAdapter(BasePlatformAdapter):
|
||||
self._host: str = config.extra.get("host", DEFAULT_HOST)
|
||||
self._port: int = int(config.extra.get("port", DEFAULT_PORT))
|
||||
self._global_secret: str = config.extra.get("secret", "")
|
||||
self._static_routes: Dict[str, dict] = config.extra.get("routes", {})
|
||||
self._dynamic_routes: Dict[str, dict] = {}
|
||||
self._dynamic_routes_mtime: float = 0.0
|
||||
self._routes: Dict[str, dict] = dict(self._static_routes)
|
||||
self._routes: Dict[str, dict] = config.extra.get("routes", {})
|
||||
self._runner = None
|
||||
|
||||
# Delivery info keyed by session chat_id — consumed by send()
|
||||
@@ -101,9 +96,6 @@ class WebhookAdapter(BasePlatformAdapter):
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
async def connect(self) -> bool:
|
||||
# Load agent-created subscriptions before validating
|
||||
self._reload_dynamic_routes()
|
||||
|
||||
# Validate routes at startup — secret is required per route
|
||||
for name, route in self._routes.items():
|
||||
secret = route.get("secret", self._global_secret)
|
||||
@@ -190,46 +182,8 @@ class WebhookAdapter(BasePlatformAdapter):
|
||||
"""GET /health — simple health check."""
|
||||
return web.json_response({"status": "ok", "platform": "webhook"})
|
||||
|
||||
def _reload_dynamic_routes(self) -> None:
|
||||
"""Reload agent-created subscriptions from disk if the file changed."""
|
||||
from pathlib import Path as _Path
|
||||
hermes_home = _Path(
|
||||
os.getenv("HERMES_HOME", str(_Path.home() / ".hermes"))
|
||||
).expanduser()
|
||||
subs_path = hermes_home / _DYNAMIC_ROUTES_FILENAME
|
||||
if not subs_path.exists():
|
||||
if self._dynamic_routes:
|
||||
self._dynamic_routes = {}
|
||||
self._routes = dict(self._static_routes)
|
||||
logger.debug("[webhook] Dynamic subscriptions file removed, cleared dynamic routes")
|
||||
return
|
||||
try:
|
||||
mtime = subs_path.stat().st_mtime
|
||||
if mtime <= self._dynamic_routes_mtime:
|
||||
return # No change
|
||||
data = json.loads(subs_path.read_text(encoding="utf-8"))
|
||||
if not isinstance(data, dict):
|
||||
return
|
||||
# Merge: static routes take precedence over dynamic ones
|
||||
self._dynamic_routes = {
|
||||
k: v for k, v in data.items()
|
||||
if k not in self._static_routes
|
||||
}
|
||||
self._routes = {**self._dynamic_routes, **self._static_routes}
|
||||
self._dynamic_routes_mtime = mtime
|
||||
logger.info(
|
||||
"[webhook] Reloaded %d dynamic route(s): %s",
|
||||
len(self._dynamic_routes),
|
||||
", ".join(self._dynamic_routes.keys()) or "(none)",
|
||||
)
|
||||
except Exception as e:
|
||||
logger.warning("[webhook] Failed to reload dynamic routes: %s", e)
|
||||
|
||||
async def _handle_webhook(self, request: "web.Request") -> "web.Response":
|
||||
"""POST /webhooks/{route_name} — receive and process a webhook event."""
|
||||
# Hot-reload dynamic subscriptions on each request (mtime-gated, cheap)
|
||||
self._reload_dynamic_routes()
|
||||
|
||||
route_name = request.match_info.get("route_name", "")
|
||||
route_config = self._routes.get(route_name)
|
||||
|
||||
|
||||
@@ -26,7 +26,6 @@ from pathlib import Path
|
||||
from typing import Dict, Optional, Any
|
||||
|
||||
from hermes_cli.config import get_hermes_home
|
||||
from hermes_constants import get_hermes_dir
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -135,7 +134,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
|
||||
)
|
||||
self._session_path: Path = Path(config.extra.get(
|
||||
"session_path",
|
||||
get_hermes_dir("platforms/whatsapp/session", "whatsapp/session")
|
||||
get_hermes_home() / "whatsapp" / "session"
|
||||
))
|
||||
self._reply_prefix: Optional[str] = config.extra.get("reply_prefix")
|
||||
self._message_queue: asyncio.Queue = asyncio.Queue()
|
||||
@@ -527,7 +526,6 @@ class WhatsAppAdapter(BasePlatformAdapter):
|
||||
image_path: str,
|
||||
caption: Optional[str] = None,
|
||||
reply_to: Optional[str] = None,
|
||||
**kwargs,
|
||||
) -> SendResult:
|
||||
"""Send a local image file natively via bridge."""
|
||||
return await self._send_media_to_bridge(chat_id, image_path, "image", caption)
|
||||
@@ -538,7 +536,6 @@ class WhatsAppAdapter(BasePlatformAdapter):
|
||||
video_path: str,
|
||||
caption: Optional[str] = None,
|
||||
reply_to: Optional[str] = None,
|
||||
**kwargs,
|
||||
) -> SendResult:
|
||||
"""Send a video natively via bridge — plays inline in WhatsApp."""
|
||||
return await self._send_media_to_bridge(chat_id, video_path, "video", caption)
|
||||
@@ -550,7 +547,6 @@ class WhatsAppAdapter(BasePlatformAdapter):
|
||||
caption: Optional[str] = None,
|
||||
file_name: Optional[str] = None,
|
||||
reply_to: Optional[str] = None,
|
||||
**kwargs,
|
||||
) -> SendResult:
|
||||
"""Send a document/file as a downloadable attachment via bridge."""
|
||||
return await self._send_media_to_bridge(
|
||||
|
||||
+19
-149
@@ -288,7 +288,7 @@ def _resolve_gateway_model(config: dict | None = None) -> str:
|
||||
if isinstance(model_cfg, str):
|
||||
model = model_cfg
|
||||
elif isinstance(model_cfg, dict):
|
||||
model = model_cfg.get("default") or model_cfg.get("model") or model
|
||||
model = model_cfg.get("default", model)
|
||||
return model
|
||||
|
||||
|
||||
@@ -432,7 +432,7 @@ class GatewayRunner:
|
||||
from honcho_integration.session import HonchoSessionManager
|
||||
|
||||
hcfg = HonchoClientConfig.from_global_config()
|
||||
if not hcfg.enabled or not (hcfg.api_key or hcfg.base_url):
|
||||
if not hcfg.enabled or not hcfg.api_key:
|
||||
return None, hcfg
|
||||
|
||||
client = get_honcho_client(hcfg)
|
||||
@@ -745,22 +745,10 @@ class GatewayRunner:
|
||||
logger.error("No connected messaging platforms remain. Shutting down gateway cleanly.")
|
||||
await self.stop()
|
||||
elif not self.adapters and self._failed_platforms:
|
||||
# All platforms are down and queued for background reconnection.
|
||||
# If the error is retryable, exit with failure so systemd Restart=on-failure
|
||||
# can restart the process. Otherwise stay alive and keep retrying in background.
|
||||
if adapter.fatal_error_retryable:
|
||||
self._exit_reason = adapter.fatal_error_message or "All messaging platforms failed with retryable errors"
|
||||
self._exit_with_failure = True
|
||||
logger.error(
|
||||
"All messaging platforms failed with retryable errors. "
|
||||
"Shutting down gateway for service restart (systemd will retry)."
|
||||
)
|
||||
await self.stop()
|
||||
else:
|
||||
logger.warning(
|
||||
"No connected messaging platforms remain, but %d platform(s) queued for reconnection",
|
||||
len(self._failed_platforms),
|
||||
)
|
||||
logger.warning(
|
||||
"No connected messaging platforms remain, but %d platform(s) queued for reconnection",
|
||||
len(self._failed_platforms),
|
||||
)
|
||||
|
||||
def _request_clean_exit(self, reason: str) -> None:
|
||||
self._exit_cleanly = True
|
||||
@@ -1994,12 +1982,6 @@ class GatewayRunner:
|
||||
f"Use /resume to browse and restore a previous session.\n"
|
||||
f"Adjust reset timing in config.yaml under session_reset."
|
||||
)
|
||||
try:
|
||||
session_info = self._format_session_info()
|
||||
if session_info:
|
||||
notice = f"{notice}\n\n{session_info}"
|
||||
except Exception:
|
||||
pass
|
||||
await adapter.send(
|
||||
source.chat_id, notice,
|
||||
metadata=getattr(event, 'metadata', None),
|
||||
@@ -2093,7 +2075,7 @@ class GatewayRunner:
|
||||
if isinstance(_model_cfg, str):
|
||||
_hyg_model = _model_cfg
|
||||
elif isinstance(_model_cfg, dict):
|
||||
_hyg_model = _model_cfg.get("default") or _model_cfg.get("model") or _hyg_model
|
||||
_hyg_model = _model_cfg.get("default", _hyg_model)
|
||||
# Read explicit context_length override from model config
|
||||
# (same as run_agent.py lines 995-1005)
|
||||
_raw_ctx = _model_cfg.get("context_length")
|
||||
@@ -2216,15 +2198,6 @@ class GatewayRunner:
|
||||
),
|
||||
)
|
||||
|
||||
# _compress_context ends the old session and creates
|
||||
# a new session_id. Write compressed messages into
|
||||
# the NEW session so the old transcript stays intact
|
||||
# and searchable via session_search.
|
||||
_hyg_new_sid = _hyg_agent.session_id
|
||||
if _hyg_new_sid != session_entry.session_id:
|
||||
session_entry.session_id = _hyg_new_sid
|
||||
self.session_store._save()
|
||||
|
||||
self.session_store.rewrite_transcript(
|
||||
session_entry.session_id, _compressed
|
||||
)
|
||||
@@ -2776,85 +2749,6 @@ class GatewayRunner:
|
||||
# Clear session env
|
||||
self._clear_session_env()
|
||||
|
||||
def _format_session_info(self) -> str:
|
||||
"""Resolve current model config and return a formatted info block.
|
||||
|
||||
Surfaces model, provider, context length, and endpoint so gateway
|
||||
users can immediately see if context detection went wrong (e.g.
|
||||
local models falling to the 128K default).
|
||||
"""
|
||||
from agent.model_metadata import get_model_context_length, DEFAULT_FALLBACK_CONTEXT
|
||||
|
||||
model = _resolve_gateway_model()
|
||||
config_context_length = None
|
||||
provider = None
|
||||
base_url = None
|
||||
api_key = None
|
||||
|
||||
try:
|
||||
cfg_path = _hermes_home / "config.yaml"
|
||||
if cfg_path.exists():
|
||||
import yaml as _info_yaml
|
||||
with open(cfg_path, encoding="utf-8") as f:
|
||||
data = _info_yaml.safe_load(f) or {}
|
||||
model_cfg = data.get("model", {})
|
||||
if isinstance(model_cfg, dict):
|
||||
raw_ctx = model_cfg.get("context_length")
|
||||
if raw_ctx is not None:
|
||||
try:
|
||||
config_context_length = int(raw_ctx)
|
||||
except (TypeError, ValueError):
|
||||
pass
|
||||
provider = model_cfg.get("provider") or None
|
||||
base_url = model_cfg.get("base_url") or None
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Resolve runtime credentials for probing
|
||||
try:
|
||||
runtime = _resolve_runtime_agent_kwargs()
|
||||
provider = provider or runtime.get("provider")
|
||||
base_url = base_url or runtime.get("base_url")
|
||||
api_key = runtime.get("api_key")
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
context_length = get_model_context_length(
|
||||
model,
|
||||
base_url=base_url or "",
|
||||
api_key=api_key or "",
|
||||
config_context_length=config_context_length,
|
||||
provider=provider or "",
|
||||
)
|
||||
|
||||
# Format context source hint
|
||||
if config_context_length is not None:
|
||||
ctx_source = "config"
|
||||
elif context_length == DEFAULT_FALLBACK_CONTEXT:
|
||||
ctx_source = "default — set model.context_length in config to override"
|
||||
else:
|
||||
ctx_source = "detected"
|
||||
|
||||
# Format context length for display
|
||||
if context_length >= 1_000_000:
|
||||
ctx_display = f"{context_length / 1_000_000:.1f}M"
|
||||
elif context_length >= 1_000:
|
||||
ctx_display = f"{context_length // 1_000}K"
|
||||
else:
|
||||
ctx_display = str(context_length)
|
||||
|
||||
lines = [
|
||||
f"◆ Model: `{model}`",
|
||||
f"◆ Provider: {provider or 'openrouter'}",
|
||||
f"◆ Context: {ctx_display} tokens ({ctx_source})",
|
||||
]
|
||||
|
||||
# Show endpoint for local/custom setups
|
||||
if base_url and ("localhost" in base_url or "127.0.0.1" in base_url or "0.0.0.0" in base_url):
|
||||
lines.append(f"◆ Endpoint: {base_url}")
|
||||
|
||||
return "\n".join(lines)
|
||||
|
||||
async def _handle_reset_command(self, event: MessageEvent) -> str:
|
||||
"""Handle /new or /reset command."""
|
||||
source = event.source
|
||||
@@ -2895,22 +2789,12 @@ class GatewayRunner:
|
||||
"session_key": session_key,
|
||||
})
|
||||
|
||||
# Resolve session config info to surface to the user
|
||||
try:
|
||||
session_info = self._format_session_info()
|
||||
except Exception:
|
||||
session_info = ""
|
||||
|
||||
if new_entry:
|
||||
header = "✨ Session reset! Starting fresh."
|
||||
return "✨ Session reset! I've started fresh with no memory of our previous conversation."
|
||||
else:
|
||||
# No existing session, just create one
|
||||
self.session_store.get_or_create_session(source, force_new=True)
|
||||
header = "✨ New session started!"
|
||||
|
||||
if session_info:
|
||||
return f"{header}\n\n{session_info}"
|
||||
return header
|
||||
return "✨ New session started!"
|
||||
|
||||
async def _handle_status_command(self, event: MessageEvent) -> str:
|
||||
"""Handle /status command."""
|
||||
@@ -4019,22 +3903,13 @@ class GatewayRunner:
|
||||
loop = asyncio.get_event_loop()
|
||||
compressed, _ = await loop.run_in_executor(
|
||||
None,
|
||||
lambda: tmp_agent._compress_context(msgs, "", approx_tokens=approx_tokens)
|
||||
lambda: tmp_agent._compress_context(msgs, "", approx_tokens=approx_tokens),
|
||||
)
|
||||
|
||||
# _compress_context already calls end_session() on the old session
|
||||
# (preserving its full transcript in SQLite) and creates a new
|
||||
# session_id for the continuation. Write the compressed messages
|
||||
# into the NEW session so the original history stays searchable.
|
||||
new_session_id = tmp_agent.session_id
|
||||
if new_session_id != session_entry.session_id:
|
||||
session_entry.session_id = new_session_id
|
||||
self.session_store._save()
|
||||
|
||||
self.session_store.rewrite_transcript(new_session_id, compressed)
|
||||
self.session_store.rewrite_transcript(session_entry.session_id, compressed)
|
||||
# Reset stored token count — transcript changed, old value is stale
|
||||
self.session_store.update_session(
|
||||
session_entry.session_key, last_prompt_tokens=0
|
||||
session_entry.session_key, last_prompt_tokens=0,
|
||||
)
|
||||
new_count = len(compressed)
|
||||
new_tokens = estimate_messages_tokens_rough(compressed)
|
||||
@@ -4190,7 +4065,7 @@ class GatewayRunner:
|
||||
]
|
||||
ctx = agent.context_compressor
|
||||
if ctx.last_prompt_tokens:
|
||||
pct = min(100, ctx.last_prompt_tokens / ctx.context_length * 100) if ctx.context_length else 0
|
||||
pct = ctx.last_prompt_tokens / ctx.context_length * 100 if ctx.context_length else 0
|
||||
lines.append(f"Context: {ctx.last_prompt_tokens:,} / {ctx.context_length:,} ({pct:.0f}%)")
|
||||
if ctx.compression_count:
|
||||
lines.append(f"Compressions: {ctx.compression_count}")
|
||||
@@ -5004,17 +4879,12 @@ class GatewayRunner:
|
||||
progress_queue.put(msg)
|
||||
|
||||
# Background task to send progress messages
|
||||
# Accumulates tool lines into a single message that gets edited.
|
||||
#
|
||||
# Threading metadata is platform-specific:
|
||||
# - Slack DM threading needs event_message_id fallback (reply thread)
|
||||
# - Telegram uses message_thread_id only for forum topics; passing a
|
||||
# normal DM/group message id as thread_id causes send failures
|
||||
# - Other platforms should use explicit source.thread_id only
|
||||
if source.platform == Platform.SLACK:
|
||||
_progress_thread_id = source.thread_id or event_message_id
|
||||
else:
|
||||
_progress_thread_id = source.thread_id
|
||||
# Accumulates tool lines into a single message that gets edited
|
||||
# For DM top-level Slack messages, source.thread_id is None but the
|
||||
# final reply will be threaded under the original message via reply_to.
|
||||
# Use event_message_id as fallback so progress messages land in the
|
||||
# same thread as the final response instead of going to the DM root.
|
||||
_progress_thread_id = source.thread_id or event_message_id
|
||||
_progress_metadata = {"thread_id": _progress_thread_id} if _progress_thread_id else None
|
||||
|
||||
async def send_progress_messages():
|
||||
|
||||
+6
-9
@@ -762,16 +762,14 @@ class SessionStore:
|
||||
if session_key in self._entries:
|
||||
entry = self._entries[session_key]
|
||||
entry.updated_at = _now()
|
||||
# Direct assignment — the gateway receives cumulative totals
|
||||
# from the cached agent, not per-call deltas.
|
||||
entry.input_tokens = input_tokens
|
||||
entry.output_tokens = output_tokens
|
||||
entry.cache_read_tokens = cache_read_tokens
|
||||
entry.cache_write_tokens = cache_write_tokens
|
||||
entry.input_tokens += input_tokens
|
||||
entry.output_tokens += output_tokens
|
||||
entry.cache_read_tokens += cache_read_tokens
|
||||
entry.cache_write_tokens += cache_write_tokens
|
||||
if last_prompt_tokens is not None:
|
||||
entry.last_prompt_tokens = last_prompt_tokens
|
||||
if estimated_cost_usd is not None:
|
||||
entry.estimated_cost_usd = estimated_cost_usd
|
||||
entry.estimated_cost_usd += estimated_cost_usd
|
||||
if cost_status:
|
||||
entry.cost_status = cost_status
|
||||
entry.total_tokens = (
|
||||
@@ -785,7 +783,7 @@ class SessionStore:
|
||||
|
||||
if self._db and db_session_id:
|
||||
try:
|
||||
self._db.set_token_counts(
|
||||
self._db.update_token_counts(
|
||||
db_session_id,
|
||||
input_tokens=input_tokens,
|
||||
output_tokens=output_tokens,
|
||||
@@ -797,7 +795,6 @@ class SessionStore:
|
||||
billing_provider=provider,
|
||||
billing_base_url=base_url,
|
||||
model=model,
|
||||
absolute=True,
|
||||
)
|
||||
except Exception as e:
|
||||
logger.debug("Session DB operation failed: %s", e)
|
||||
|
||||
@@ -11,5 +11,5 @@ Provides subcommands for:
|
||||
- hermes cron - Manage cron jobs
|
||||
"""
|
||||
|
||||
__version__ = "0.5.0"
|
||||
__release_date__ = "2026.3.28"
|
||||
__version__ = "0.4.0"
|
||||
__release_date__ = "2026.3.23"
|
||||
|
||||
+1
-10
@@ -160,7 +160,7 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
|
||||
id="alibaba",
|
||||
name="Alibaba Cloud (DashScope)",
|
||||
auth_type="api_key",
|
||||
inference_base_url="https://coding-intl.dashscope.aliyuncs.com/v1",
|
||||
inference_base_url="https://dashscope-intl.aliyuncs.com/apps/anthropic",
|
||||
api_key_env_vars=("DASHSCOPE_API_KEY",),
|
||||
base_url_env_var="DASHSCOPE_BASE_URL",
|
||||
),
|
||||
@@ -212,14 +212,6 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
|
||||
api_key_env_vars=("KILOCODE_API_KEY",),
|
||||
base_url_env_var="KILOCODE_BASE_URL",
|
||||
),
|
||||
"huggingface": ProviderConfig(
|
||||
id="huggingface",
|
||||
name="Hugging Face",
|
||||
auth_type="api_key",
|
||||
inference_base_url="https://router.huggingface.co/v1",
|
||||
api_key_env_vars=("HF_TOKEN",),
|
||||
base_url_env_var="HF_BASE_URL",
|
||||
),
|
||||
}
|
||||
|
||||
|
||||
@@ -693,7 +685,6 @@ def resolve_provider(
|
||||
"github-copilot-acp": "copilot-acp", "copilot-acp-agent": "copilot-acp",
|
||||
"aigateway": "ai-gateway", "vercel": "ai-gateway", "vercel-ai-gateway": "ai-gateway",
|
||||
"opencode": "opencode-zen", "zen": "opencode-zen",
|
||||
"hf": "huggingface", "hugging-face": "huggingface", "huggingface-hub": "huggingface",
|
||||
"go": "opencode-go", "opencode-go-sub": "opencode-go",
|
||||
"kilo": "kilocode", "kilo-code": "kilocode", "kilo-gateway": "kilocode",
|
||||
}
|
||||
|
||||
+3
-40
@@ -138,12 +138,6 @@ DEFAULT_CONFIG = {
|
||||
"toolsets": ["hermes-cli"],
|
||||
"agent": {
|
||||
"max_turns": 90,
|
||||
# Tool-use enforcement: injects system prompt guidance that tells the
|
||||
# model to actually call tools instead of describing intended actions.
|
||||
# Values: "auto" (default — applies to gpt/codex models), true/false
|
||||
# (force on/off for all models), or a list of model-name substrings
|
||||
# to match (e.g. ["gpt", "codex", "gemini", "qwen"]).
|
||||
"tool_use_enforcement": "auto",
|
||||
},
|
||||
|
||||
"terminal": {
|
||||
@@ -227,49 +221,42 @@ DEFAULT_CONFIG = {
|
||||
"model": "",
|
||||
"base_url": "",
|
||||
"api_key": "",
|
||||
"timeout": 30, # seconds — increase for slow local models
|
||||
},
|
||||
"compression": {
|
||||
"provider": "auto",
|
||||
"model": "",
|
||||
"base_url": "",
|
||||
"api_key": "",
|
||||
"timeout": 120, # seconds — compression summarises large contexts; increase for local models
|
||||
},
|
||||
"session_search": {
|
||||
"provider": "auto",
|
||||
"model": "",
|
||||
"base_url": "",
|
||||
"api_key": "",
|
||||
"timeout": 30,
|
||||
},
|
||||
"skills_hub": {
|
||||
"provider": "auto",
|
||||
"model": "",
|
||||
"base_url": "",
|
||||
"api_key": "",
|
||||
"timeout": 30,
|
||||
},
|
||||
"approval": {
|
||||
"provider": "auto",
|
||||
"model": "", # fast/cheap model recommended (e.g. gemini-flash, haiku)
|
||||
"base_url": "",
|
||||
"api_key": "",
|
||||
"timeout": 30,
|
||||
},
|
||||
"mcp": {
|
||||
"provider": "auto",
|
||||
"model": "",
|
||||
"base_url": "",
|
||||
"api_key": "",
|
||||
"timeout": 30,
|
||||
},
|
||||
"flush_memories": {
|
||||
"provider": "auto",
|
||||
"model": "",
|
||||
"base_url": "",
|
||||
"api_key": "",
|
||||
"timeout": 30,
|
||||
},
|
||||
},
|
||||
|
||||
@@ -560,14 +547,14 @@ OPTIONAL_ENV_VARS = {
|
||||
"category": "provider",
|
||||
},
|
||||
"DASHSCOPE_API_KEY": {
|
||||
"description": "Alibaba Cloud DashScope API key (Qwen + multi-provider models)",
|
||||
"description": "Alibaba Cloud DashScope API key for Qwen models",
|
||||
"prompt": "DashScope API Key",
|
||||
"url": "https://modelstudio.console.alibabacloud.com/",
|
||||
"password": True,
|
||||
"category": "provider",
|
||||
},
|
||||
"DASHSCOPE_BASE_URL": {
|
||||
"description": "Custom DashScope base URL (default: coding-intl OpenAI-compat endpoint)",
|
||||
"description": "Custom DashScope base URL (default: international endpoint)",
|
||||
"prompt": "DashScope Base URL",
|
||||
"url": "",
|
||||
"password": False,
|
||||
@@ -606,31 +593,8 @@ OPTIONAL_ENV_VARS = {
|
||||
"category": "provider",
|
||||
"advanced": True,
|
||||
},
|
||||
"HF_TOKEN": {
|
||||
"description": "Hugging Face token for Inference Providers (20+ open models via router.huggingface.co)",
|
||||
"prompt": "Hugging Face Token",
|
||||
"url": "https://huggingface.co/settings/tokens",
|
||||
"password": True,
|
||||
"category": "provider",
|
||||
},
|
||||
"HF_BASE_URL": {
|
||||
"description": "Hugging Face Inference Providers base URL override",
|
||||
"prompt": "HF base URL (leave empty for default)",
|
||||
"url": None,
|
||||
"password": False,
|
||||
"category": "provider",
|
||||
"advanced": True,
|
||||
},
|
||||
|
||||
# ── Tool API keys ──
|
||||
"EXA_API_KEY": {
|
||||
"description": "Exa API key for AI-native web search and contents",
|
||||
"prompt": "Exa API key",
|
||||
"url": "https://exa.ai/",
|
||||
"tools": ["web_search", "web_extract"],
|
||||
"password": True,
|
||||
"category": "tool",
|
||||
},
|
||||
"PARALLEL_API_KEY": {
|
||||
"description": "Parallel API key for AI-native web search and extract",
|
||||
"prompt": "Parallel API key",
|
||||
@@ -1687,7 +1651,6 @@ def show_config():
|
||||
keys = [
|
||||
("OPENROUTER_API_KEY", "OpenRouter"),
|
||||
("VOICE_TOOLS_OPENAI_KEY", "OpenAI (STT/TTS)"),
|
||||
("EXA_API_KEY", "Exa"),
|
||||
("PARALLEL_API_KEY", "Parallel"),
|
||||
("FIRECRAWL_API_KEY", "Firecrawl"),
|
||||
("TAVILY_API_KEY", "Tavily"),
|
||||
@@ -1847,7 +1810,7 @@ def set_config_value(key: str, value: str):
|
||||
# Check if it's an API key (goes to .env)
|
||||
api_keys = [
|
||||
'OPENROUTER_API_KEY', 'OPENAI_API_KEY', 'ANTHROPIC_API_KEY', 'VOICE_TOOLS_OPENAI_KEY',
|
||||
'EXA_API_KEY', 'PARALLEL_API_KEY', 'FIRECRAWL_API_KEY', 'FIRECRAWL_API_URL', 'TAVILY_API_KEY',
|
||||
'PARALLEL_API_KEY', 'FIRECRAWL_API_KEY', 'FIRECRAWL_API_URL', 'TAVILY_API_KEY',
|
||||
'BROWSERBASE_API_KEY', 'BROWSERBASE_PROJECT_ID', 'BROWSER_USE_API_KEY',
|
||||
'FAL_KEY', 'TELEGRAM_BOT_TOKEN', 'DISCORD_BOT_TOKEN',
|
||||
'TERMINAL_SSH_HOST', 'TERMINAL_SSH_USER', 'TERMINAL_SSH_KEY',
|
||||
|
||||
@@ -56,7 +56,7 @@ def _honcho_is_configured_for_doctor() -> bool:
|
||||
from honcho_integration.client import HonchoClientConfig
|
||||
|
||||
cfg = HonchoClientConfig.from_global_config()
|
||||
return bool(cfg.enabled and (cfg.api_key or cfg.base_url))
|
||||
return bool(cfg.enabled and cfg.api_key)
|
||||
except Exception:
|
||||
return False
|
||||
|
||||
@@ -708,8 +708,8 @@ def run_doctor(args):
|
||||
check_warn("Honcho config not found", "run: hermes honcho setup")
|
||||
elif not hcfg.enabled:
|
||||
check_info(f"Honcho disabled (set enabled: true in {_honcho_cfg_path} to activate)")
|
||||
elif not (hcfg.api_key or hcfg.base_url):
|
||||
check_fail("Honcho API key or base URL not set", "run: hermes honcho setup")
|
||||
elif not hcfg.api_key:
|
||||
check_fail("Honcho API key not set", "run: hermes honcho setup")
|
||||
issues.append("No Honcho API key — run 'hermes honcho setup'")
|
||||
else:
|
||||
from honcho_integration.client import get_honcho_client, reset_honcho_client
|
||||
|
||||
+20
-119
@@ -125,43 +125,20 @@ _SERVICE_BASE = "hermes-gateway"
|
||||
SERVICE_DESCRIPTION = "Hermes Agent Gateway - Messaging Platform Integration"
|
||||
|
||||
|
||||
def _profile_suffix() -> str:
|
||||
"""Derive a service-name suffix from the current HERMES_HOME.
|
||||
|
||||
Returns ``""`` for the default ``~/.hermes``, the profile name for
|
||||
``~/.hermes/profiles/<name>``, or a short hash for any other custom
|
||||
HERMES_HOME path.
|
||||
"""
|
||||
import hashlib
|
||||
import re
|
||||
from pathlib import Path as _Path
|
||||
home = get_hermes_home().resolve()
|
||||
default = (_Path.home() / ".hermes").resolve()
|
||||
if home == default:
|
||||
return ""
|
||||
# Detect ~/.hermes/profiles/<name> pattern → use the profile name
|
||||
profiles_root = (default / "profiles").resolve()
|
||||
try:
|
||||
rel = home.relative_to(profiles_root)
|
||||
parts = rel.parts
|
||||
if len(parts) == 1 and re.match(r"^[a-z0-9][a-z0-9_-]{0,63}$", parts[0]):
|
||||
return parts[0]
|
||||
except ValueError:
|
||||
pass
|
||||
# Fallback: short hash for arbitrary HERMES_HOME paths
|
||||
return hashlib.sha256(str(home).encode()).hexdigest()[:8]
|
||||
|
||||
|
||||
def get_service_name() -> str:
|
||||
"""Derive a systemd service name scoped to this HERMES_HOME.
|
||||
|
||||
Default ``~/.hermes`` returns ``hermes-gateway`` (backward compatible).
|
||||
Profile ``~/.hermes/profiles/coder`` returns ``hermes-gateway-coder``.
|
||||
Any other HERMES_HOME appends a short hash for uniqueness.
|
||||
Any other HERMES_HOME appends a short hash so multiple installations
|
||||
can each have their own systemd service without conflicting.
|
||||
"""
|
||||
suffix = _profile_suffix()
|
||||
if not suffix:
|
||||
import hashlib
|
||||
from pathlib import Path as _Path # local import to avoid monkeypatch interference
|
||||
home = get_hermes_home().resolve()
|
||||
default = (_Path.home() / ".hermes").resolve()
|
||||
if home == default:
|
||||
return _SERVICE_BASE
|
||||
suffix = hashlib.sha256(str(home).encode()).hexdigest()[:8]
|
||||
return f"{_SERVICE_BASE}-{suffix}"
|
||||
|
||||
|
||||
@@ -392,14 +369,7 @@ def print_systemd_linger_guidance() -> None:
|
||||
print(" sudo loginctl enable-linger $USER")
|
||||
|
||||
def get_launchd_plist_path() -> Path:
|
||||
"""Return the launchd plist path, scoped per profile.
|
||||
|
||||
Default ``~/.hermes`` → ``ai.hermes.gateway.plist`` (backward compatible).
|
||||
Profile ``~/.hermes/profiles/coder`` → ``ai.hermes.gateway-coder.plist``.
|
||||
"""
|
||||
suffix = _profile_suffix()
|
||||
name = f"ai.hermes.gateway-{suffix}" if suffix else "ai.hermes.gateway"
|
||||
return Path.home() / "Library" / "LaunchAgents" / f"{name}.plist"
|
||||
return Path.home() / "Library" / "LaunchAgents" / "ai.hermes.gateway.plist"
|
||||
|
||||
def _detect_venv_dir() -> Path | None:
|
||||
"""Detect the active virtualenv directory.
|
||||
@@ -450,17 +420,6 @@ def get_hermes_cli_path() -> str:
|
||||
# Systemd (Linux)
|
||||
# =============================================================================
|
||||
|
||||
def _build_user_local_paths(home: Path, path_entries: list[str]) -> list[str]:
|
||||
"""Return user-local bin dirs that exist and aren't already in *path_entries*."""
|
||||
candidates = [
|
||||
str(home / ".local" / "bin"), # uv, uvx, pip-installed CLIs
|
||||
str(home / ".cargo" / "bin"), # Rust/cargo tools
|
||||
str(home / "go" / "bin"), # Go tools
|
||||
str(home / ".npm-global" / "bin"), # npm global packages
|
||||
]
|
||||
return [p for p in candidates if p not in path_entries and Path(p).exists()]
|
||||
|
||||
|
||||
def generate_systemd_unit(system: bool = False, run_as_user: str | None = None) -> str:
|
||||
python_path = get_python_path()
|
||||
working_dir = str(PROJECT_ROOT)
|
||||
@@ -475,16 +434,13 @@ def generate_systemd_unit(system: bool = False, run_as_user: str | None = None)
|
||||
resolved_node_dir = str(Path(resolved_node).resolve().parent)
|
||||
if resolved_node_dir not in path_entries:
|
||||
path_entries.append(resolved_node_dir)
|
||||
path_entries.extend(["/usr/local/sbin", "/usr/local/bin", "/usr/sbin", "/usr/bin", "/sbin", "/bin"])
|
||||
sane_path = ":".join(path_entries)
|
||||
|
||||
hermes_home = str(get_hermes_home().resolve())
|
||||
|
||||
common_bin_paths = ["/usr/local/sbin", "/usr/local/bin", "/usr/sbin", "/usr/bin", "/sbin", "/bin"]
|
||||
|
||||
if system:
|
||||
username, group_name, home_dir = _system_service_identity(run_as_user)
|
||||
path_entries.extend(_build_user_local_paths(Path(home_dir), path_entries))
|
||||
path_entries.extend(common_bin_paths)
|
||||
sane_path = ":".join(path_entries)
|
||||
return f"""[Unit]
|
||||
Description={SERVICE_DESCRIPTION}
|
||||
After=network-online.target
|
||||
@@ -516,9 +472,6 @@ StandardError=journal
|
||||
WantedBy=multi-user.target
|
||||
"""
|
||||
|
||||
path_entries.extend(_build_user_local_paths(Path.home(), path_entries))
|
||||
path_entries.extend(common_bin_paths)
|
||||
sane_path = ":".join(path_entries)
|
||||
return f"""[Unit]
|
||||
Description={SERVICE_DESCRIPTION}
|
||||
After=network.target
|
||||
@@ -799,46 +752,18 @@ def systemd_status(deep: bool = False, system: bool = False):
|
||||
# Launchd (macOS)
|
||||
# =============================================================================
|
||||
|
||||
def get_launchd_label() -> str:
|
||||
"""Return the launchd service label, scoped per profile."""
|
||||
suffix = _profile_suffix()
|
||||
return f"ai.hermes.gateway-{suffix}" if suffix else "ai.hermes.gateway"
|
||||
|
||||
|
||||
def generate_launchd_plist() -> str:
|
||||
python_path = get_python_path()
|
||||
working_dir = str(PROJECT_ROOT)
|
||||
hermes_home = str(get_hermes_home().resolve())
|
||||
log_dir = get_hermes_home() / "logs"
|
||||
log_dir.mkdir(parents=True, exist_ok=True)
|
||||
label = get_launchd_label()
|
||||
# Build a sane PATH for the launchd plist. launchd provides only a
|
||||
# minimal default (/usr/bin:/bin:/usr/sbin:/sbin) which misses Homebrew,
|
||||
# nvm, cargo, etc. We prepend venv/bin and node_modules/.bin (matching
|
||||
# the systemd unit), then capture the user's full shell PATH so every
|
||||
# user-installed tool (node, ffmpeg, …) is reachable.
|
||||
detected_venv = _detect_venv_dir()
|
||||
venv_bin = str(detected_venv / "bin") if detected_venv else str(PROJECT_ROOT / "venv" / "bin")
|
||||
venv_dir = str(detected_venv) if detected_venv else str(PROJECT_ROOT / "venv")
|
||||
node_bin = str(PROJECT_ROOT / "node_modules" / ".bin")
|
||||
# Resolve the directory containing the node binary (e.g. Homebrew, nvm)
|
||||
# so it's explicitly in PATH even if the user's shell PATH changes later.
|
||||
priority_dirs = [venv_bin, node_bin]
|
||||
resolved_node = shutil.which("node")
|
||||
if resolved_node:
|
||||
resolved_node_dir = str(Path(resolved_node).resolve().parent)
|
||||
if resolved_node_dir not in priority_dirs:
|
||||
priority_dirs.append(resolved_node_dir)
|
||||
sane_path = ":".join(
|
||||
dict.fromkeys(priority_dirs + [p for p in os.environ.get("PATH", "").split(":") if p])
|
||||
)
|
||||
|
||||
|
||||
return f"""<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
|
||||
<plist version="1.0">
|
||||
<dict>
|
||||
<key>Label</key>
|
||||
<string>{label}</string>
|
||||
<string>ai.hermes.gateway</string>
|
||||
|
||||
<key>ProgramArguments</key>
|
||||
<array>
|
||||
@@ -853,16 +778,6 @@ def generate_launchd_plist() -> str:
|
||||
<key>WorkingDirectory</key>
|
||||
<string>{working_dir}</string>
|
||||
|
||||
<key>EnvironmentVariables</key>
|
||||
<dict>
|
||||
<key>PATH</key>
|
||||
<string>{sane_path}</string>
|
||||
<key>VIRTUAL_ENV</key>
|
||||
<string>{venv_dir}</string>
|
||||
<key>HERMES_HOME</key>
|
||||
<string>{hermes_home}</string>
|
||||
</dict>
|
||||
|
||||
<key>RunAtLoad</key>
|
||||
<true/>
|
||||
|
||||
@@ -948,33 +863,20 @@ def launchd_uninstall():
|
||||
print("✓ Service uninstalled")
|
||||
|
||||
def launchd_start():
|
||||
plist_path = get_launchd_plist_path()
|
||||
label = get_launchd_label()
|
||||
|
||||
# Self-heal if the plist is missing entirely (e.g., manual cleanup, failed upgrade)
|
||||
if not plist_path.exists():
|
||||
print("↻ launchd plist missing; regenerating service definition")
|
||||
plist_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
plist_path.write_text(generate_launchd_plist(), encoding="utf-8")
|
||||
subprocess.run(["launchctl", "load", str(plist_path)], check=True)
|
||||
subprocess.run(["launchctl", "start", label], check=True)
|
||||
print("✓ Service started")
|
||||
return
|
||||
|
||||
refresh_launchd_plist_if_needed()
|
||||
plist_path = get_launchd_plist_path()
|
||||
try:
|
||||
subprocess.run(["launchctl", "start", label], check=True)
|
||||
subprocess.run(["launchctl", "start", "ai.hermes.gateway"], check=True)
|
||||
except subprocess.CalledProcessError as e:
|
||||
if e.returncode != 3:
|
||||
if e.returncode != 3 or not plist_path.exists():
|
||||
raise
|
||||
print("↻ launchd job was unloaded; reloading service definition")
|
||||
subprocess.run(["launchctl", "load", str(plist_path)], check=True)
|
||||
subprocess.run(["launchctl", "start", label], check=True)
|
||||
subprocess.run(["launchctl", "start", "ai.hermes.gateway"], check=True)
|
||||
print("✓ Service started")
|
||||
|
||||
def launchd_stop():
|
||||
label = get_launchd_label()
|
||||
subprocess.run(["launchctl", "stop", label], check=True)
|
||||
subprocess.run(["launchctl", "stop", "ai.hermes.gateway"], check=True)
|
||||
print("✓ Service stopped")
|
||||
|
||||
def _wait_for_gateway_exit(timeout: float = 10.0, force_after: float = 5.0):
|
||||
@@ -1029,9 +931,8 @@ def launchd_restart():
|
||||
|
||||
def launchd_status(deep: bool = False):
|
||||
plist_path = get_launchd_plist_path()
|
||||
label = get_launchd_label()
|
||||
result = subprocess.run(
|
||||
["launchctl", "list", label],
|
||||
["launchctl", "list", "ai.hermes.gateway"],
|
||||
capture_output=True,
|
||||
text=True
|
||||
)
|
||||
@@ -1536,7 +1437,7 @@ def _is_service_running() -> bool:
|
||||
return False
|
||||
elif is_macos() and get_launchd_plist_path().exists():
|
||||
result = subprocess.run(
|
||||
["launchctl", "list", get_launchd_label()],
|
||||
["launchctl", "list", "ai.hermes.gateway"],
|
||||
capture_output=True, text=True
|
||||
)
|
||||
return result.returncode == 0
|
||||
|
||||
+57
-224
@@ -795,7 +795,6 @@ def cmd_model(args):
|
||||
"ai-gateway": "AI Gateway",
|
||||
"kilocode": "Kilo Code",
|
||||
"alibaba": "Alibaba Cloud (DashScope)",
|
||||
"huggingface": "Hugging Face",
|
||||
"custom": "Custom endpoint",
|
||||
}
|
||||
active_label = provider_labels.get(active, active)
|
||||
@@ -821,8 +820,7 @@ def cmd_model(args):
|
||||
("opencode-zen", "OpenCode Zen (35+ curated models, pay-as-you-go)"),
|
||||
("opencode-go", "OpenCode Go (open models, $10/month subscription)"),
|
||||
("ai-gateway", "AI Gateway (Vercel — 200+ models, pay-per-use)"),
|
||||
("alibaba", "Alibaba Cloud / DashScope Coding (Qwen + multi-provider)"),
|
||||
("huggingface", "Hugging Face Inference Providers (20+ open models)"),
|
||||
("alibaba", "Alibaba Cloud / DashScope (Qwen models, Anthropic-compatible)"),
|
||||
]
|
||||
|
||||
# Add user-defined custom providers from config.yaml
|
||||
@@ -832,8 +830,8 @@ def cmd_model(args):
|
||||
for entry in custom_providers_cfg:
|
||||
if not isinstance(entry, dict):
|
||||
continue
|
||||
name = (entry.get("name") or "").strip()
|
||||
base_url = (entry.get("base_url") or "").strip()
|
||||
name = entry.get("name", "").strip()
|
||||
base_url = entry.get("base_url", "").strip()
|
||||
if not name or not base_url:
|
||||
continue
|
||||
# Generate a stable key from the name
|
||||
@@ -895,7 +893,7 @@ def cmd_model(args):
|
||||
_model_flow_anthropic(config, current_model)
|
||||
elif selected_provider == "kimi-coding":
|
||||
_model_flow_kimi(config, current_model)
|
||||
elif selected_provider in ("zai", "minimax", "minimax-cn", "kilocode", "opencode-zen", "opencode-go", "ai-gateway", "alibaba", "huggingface"):
|
||||
elif selected_provider in ("zai", "minimax", "minimax-cn", "kilocode", "opencode-zen", "opencode-go", "ai-gateway", "alibaba"):
|
||||
_model_flow_api_key_provider(config, selected_provider, current_model)
|
||||
|
||||
|
||||
@@ -1504,18 +1502,6 @@ _PROVIDER_MODELS = {
|
||||
"google/gemini-3-pro-preview",
|
||||
"google/gemini-3-flash-preview",
|
||||
],
|
||||
# Curated HF model list — only agentic models that map to OpenRouter defaults.
|
||||
# Format: HF model ID → OpenRouter equivalent noted in comment
|
||||
"huggingface": [
|
||||
"Qwen/Qwen3.5-397B-A17B", # ↔ qwen/qwen3.5-plus
|
||||
"Qwen/Qwen3.5-35B-A3B", # ↔ qwen/qwen3.5-35b-a3b
|
||||
"deepseek-ai/DeepSeek-V3.2", # ↔ deepseek/deepseek-chat
|
||||
"moonshotai/Kimi-K2.5", # ↔ moonshotai/kimi-k2.5
|
||||
"MiniMaxAI/MiniMax-M2.5", # ↔ minimax/minimax-m2.5
|
||||
"zai-org/GLM-5", # ↔ z-ai/glm-5
|
||||
"XiaomiMiMo/MiMo-V2-Flash", # ↔ xiaomi/mimo-v2-pro
|
||||
"moonshotai/Kimi-K2-Thinking", # ↔ moonshotai/kimi-k2-thinking
|
||||
],
|
||||
}
|
||||
|
||||
|
||||
@@ -2045,25 +2031,19 @@ def _model_flow_api_key_provider(config, provider_id, current_model=""):
|
||||
save_env_value(base_url_env, override)
|
||||
effective_base = override
|
||||
|
||||
# Model selection — try live /models endpoint first, fall back to defaults.
|
||||
# Providers with large live catalogs (100+ models) use a curated list instead
|
||||
# so users see familiar model names rather than an overwhelming dump.
|
||||
curated = _PROVIDER_MODELS.get(provider_id, [])
|
||||
if curated and len(curated) >= 8:
|
||||
# Curated list is substantial — use it directly, skip live probe
|
||||
live_models = None
|
||||
else:
|
||||
from hermes_cli.models import fetch_api_models
|
||||
api_key_for_probe = existing_key or (get_env_value(key_env) if key_env else "")
|
||||
live_models = fetch_api_models(api_key_for_probe, effective_base)
|
||||
# Model selection — try live /models endpoint first, fall back to defaults
|
||||
from hermes_cli.models import fetch_api_models
|
||||
api_key_for_probe = existing_key or (get_env_value(key_env) if key_env else "")
|
||||
live_models = fetch_api_models(api_key_for_probe, effective_base)
|
||||
|
||||
if live_models:
|
||||
model_list = live_models
|
||||
print(f" Found {len(model_list)} model(s) from {pconfig.name} API")
|
||||
else:
|
||||
model_list = curated
|
||||
model_list = _PROVIDER_MODELS.get(provider_id, [])
|
||||
if model_list:
|
||||
print(f" Showing {len(model_list)} curated models — use \"Enter custom model name\" for others.")
|
||||
print(" ⚠ Could not auto-detect models from API — showing defaults.")
|
||||
print(" Use \"Enter custom model name\" if you don't see your model.")
|
||||
# else: no defaults either, will fall through to raw input
|
||||
|
||||
if model_list:
|
||||
@@ -2339,12 +2319,6 @@ def cmd_cron(args):
|
||||
cron_command(args)
|
||||
|
||||
|
||||
def cmd_webhook(args):
|
||||
"""Webhook subscription management."""
|
||||
from hermes_cli.webhook import webhook_command
|
||||
webhook_command(args)
|
||||
|
||||
|
||||
def cmd_doctor(args):
|
||||
"""Check configuration and dependencies."""
|
||||
from hermes_cli.doctor import run_doctor
|
||||
@@ -2476,18 +2450,8 @@ def _update_via_zip(args):
|
||||
)
|
||||
else:
|
||||
# Use sys.executable to explicitly call the venv's pip module,
|
||||
# avoiding PEP 668 'externally-managed-environment' errors on Debian/Ubuntu.
|
||||
# Some environments lose pip inside the venv; bootstrap it back with
|
||||
# ensurepip before trying the editable install.
|
||||
# avoiding PEP 668 'externally-managed-environment' errors on Debian/Ubuntu
|
||||
pip_cmd = [sys.executable, "-m", "pip"]
|
||||
try:
|
||||
subprocess.run(pip_cmd + ["--version"], cwd=PROJECT_ROOT, check=True, capture_output=True)
|
||||
except subprocess.CalledProcessError:
|
||||
subprocess.run(
|
||||
[sys.executable, "-m", "ensurepip", "--upgrade", "--default-pip"],
|
||||
cwd=PROJECT_ROOT,
|
||||
check=True,
|
||||
)
|
||||
try:
|
||||
subprocess.run(pip_cmd + ["install", "-e", ".[all]", "--quiet"], cwd=PROJECT_ROOT, check=True)
|
||||
except subprocess.CalledProcessError:
|
||||
@@ -2648,12 +2612,7 @@ def _restore_stashed_changes(
|
||||
print("Resolve conflicts manually, then run: git stash drop")
|
||||
|
||||
print(f"Restore your changes with: git stash apply {stash_ref}")
|
||||
# In non-interactive mode (gateway /update), don't abort — the code
|
||||
# update itself succeeded, only the stash restore had conflicts.
|
||||
# Aborting would report the entire update as failed.
|
||||
if prompt_user:
|
||||
sys.exit(1)
|
||||
return False
|
||||
sys.exit(1)
|
||||
|
||||
stash_selector = _resolve_stash_selector(git_cmd, cwd, stash_ref)
|
||||
if stash_selector is None:
|
||||
@@ -2727,60 +2686,30 @@ def cmd_update(args):
|
||||
|
||||
# Fetch and pull
|
||||
try:
|
||||
print("→ Fetching updates...")
|
||||
git_cmd = ["git"]
|
||||
if sys.platform == "win32":
|
||||
git_cmd = ["git", "-c", "windows.appendAtomically=false"]
|
||||
|
||||
print("→ Fetching updates...")
|
||||
fetch_result = subprocess.run(
|
||||
git_cmd + ["fetch", "origin"],
|
||||
cwd=PROJECT_ROOT,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
)
|
||||
if fetch_result.returncode != 0:
|
||||
stderr = fetch_result.stderr.strip()
|
||||
if "Could not resolve host" in stderr or "unable to access" in stderr:
|
||||
print("✗ Network error — cannot reach the remote repository.")
|
||||
print(f" {stderr.splitlines()[0]}" if stderr else "")
|
||||
elif "Authentication failed" in stderr or "could not read Username" in stderr:
|
||||
print("✗ Authentication failed — check your git credentials or SSH key.")
|
||||
else:
|
||||
print(f"✗ Failed to fetch updates from origin.")
|
||||
if stderr:
|
||||
print(f" {stderr.splitlines()[0]}")
|
||||
sys.exit(1)
|
||||
|
||||
# Get current branch (returns literal "HEAD" when detached)
|
||||
|
||||
subprocess.run(git_cmd + ["fetch", "origin"], cwd=PROJECT_ROOT, check=True)
|
||||
|
||||
# Get current branch
|
||||
result = subprocess.run(
|
||||
git_cmd + ["rev-parse", "--abbrev-ref", "HEAD"],
|
||||
cwd=PROJECT_ROOT,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
check=True,
|
||||
check=True
|
||||
)
|
||||
current_branch = result.stdout.strip()
|
||||
branch = result.stdout.strip()
|
||||
|
||||
# Always update against main
|
||||
branch = "main"
|
||||
|
||||
# If user is on a non-main branch or detached HEAD, switch to main
|
||||
if current_branch != "main":
|
||||
label = "detached HEAD" if current_branch == "HEAD" else f"branch '{current_branch}'"
|
||||
print(f" ⚠ Currently on {label} — switching to main for update...")
|
||||
# Stash before checkout so uncommitted work isn't lost
|
||||
auto_stash_ref = _stash_local_changes_if_needed(git_cmd, PROJECT_ROOT)
|
||||
subprocess.run(
|
||||
git_cmd + ["checkout", "main"],
|
||||
cwd=PROJECT_ROOT,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
check=True,
|
||||
)
|
||||
else:
|
||||
auto_stash_ref = _stash_local_changes_if_needed(git_cmd, PROJECT_ROOT)
|
||||
|
||||
prompt_for_restore = auto_stash_ref is not None and sys.stdin.isatty() and sys.stdout.isatty()
|
||||
# Fall back to main if the current branch doesn't exist on the remote
|
||||
verify = subprocess.run(
|
||||
git_cmd + ["rev-parse", "--verify", f"origin/{branch}"],
|
||||
cwd=PROJECT_ROOT, capture_output=True, text=True,
|
||||
)
|
||||
if verify.returncode != 0:
|
||||
branch = "main"
|
||||
|
||||
# Check if there are updates
|
||||
result = subprocess.run(
|
||||
@@ -2788,69 +2717,31 @@ def cmd_update(args):
|
||||
cwd=PROJECT_ROOT,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
check=True,
|
||||
check=True
|
||||
)
|
||||
commit_count = int(result.stdout.strip())
|
||||
|
||||
|
||||
if commit_count == 0:
|
||||
_invalidate_update_cache()
|
||||
# Restore stash and switch back to original branch if we moved
|
||||
if auto_stash_ref is not None:
|
||||
_restore_stashed_changes(
|
||||
git_cmd, PROJECT_ROOT, auto_stash_ref,
|
||||
prompt_user=prompt_for_restore,
|
||||
)
|
||||
if current_branch not in ("main", "HEAD"):
|
||||
subprocess.run(
|
||||
git_cmd + ["checkout", current_branch],
|
||||
cwd=PROJECT_ROOT, capture_output=True, text=True, check=False,
|
||||
)
|
||||
print("✓ Already up to date!")
|
||||
return
|
||||
|
||||
|
||||
print(f"→ Found {commit_count} new commit(s)")
|
||||
|
||||
auto_stash_ref = _stash_local_changes_if_needed(git_cmd, PROJECT_ROOT)
|
||||
prompt_for_restore = auto_stash_ref is not None and sys.stdin.isatty() and sys.stdout.isatty()
|
||||
|
||||
print("→ Pulling updates...")
|
||||
update_succeeded = False
|
||||
try:
|
||||
pull_result = subprocess.run(
|
||||
git_cmd + ["pull", "--ff-only", "origin", branch],
|
||||
cwd=PROJECT_ROOT,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
)
|
||||
if pull_result.returncode != 0:
|
||||
# ff-only failed — local and remote have diverged (e.g. upstream
|
||||
# force-pushed or rebase). Since local changes are already
|
||||
# stashed, reset to match the remote exactly.
|
||||
print(" ⚠ Fast-forward not possible (history diverged), resetting to match remote...")
|
||||
reset_result = subprocess.run(
|
||||
git_cmd + ["reset", "--hard", f"origin/{branch}"],
|
||||
cwd=PROJECT_ROOT,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
)
|
||||
if reset_result.returncode != 0:
|
||||
print(f"✗ Failed to reset to origin/{branch}.")
|
||||
if reset_result.stderr.strip():
|
||||
print(f" {reset_result.stderr.strip()}")
|
||||
print(" Try manually: git fetch origin && git reset --hard origin/main")
|
||||
sys.exit(1)
|
||||
update_succeeded = True
|
||||
subprocess.run(git_cmd + ["pull", "--ff-only", "origin", branch], cwd=PROJECT_ROOT, check=True)
|
||||
finally:
|
||||
if auto_stash_ref is not None:
|
||||
# Don't attempt stash restore if the code update itself failed —
|
||||
# working tree is in an unknown state.
|
||||
if not update_succeeded:
|
||||
print(f" ℹ️ Local changes preserved in stash (ref: {auto_stash_ref})")
|
||||
print(f" Restore manually with: git stash apply")
|
||||
else:
|
||||
_restore_stashed_changes(
|
||||
git_cmd,
|
||||
PROJECT_ROOT,
|
||||
auto_stash_ref,
|
||||
prompt_user=prompt_for_restore,
|
||||
)
|
||||
_restore_stashed_changes(
|
||||
git_cmd,
|
||||
PROJECT_ROOT,
|
||||
auto_stash_ref,
|
||||
prompt_user=prompt_for_restore,
|
||||
)
|
||||
|
||||
_invalidate_update_cache()
|
||||
|
||||
@@ -2873,18 +2764,8 @@ def cmd_update(args):
|
||||
)
|
||||
else:
|
||||
# Use sys.executable to explicitly call the venv's pip module,
|
||||
# avoiding PEP 668 'externally-managed-environment' errors on Debian/Ubuntu.
|
||||
# Some environments lose pip inside the venv; bootstrap it back with
|
||||
# ensurepip before trying the editable install.
|
||||
# avoiding PEP 668 'externally-managed-environment' errors on Debian/Ubuntu
|
||||
pip_cmd = [sys.executable, "-m", "pip"]
|
||||
try:
|
||||
subprocess.run(pip_cmd + ["--version"], cwd=PROJECT_ROOT, check=True, capture_output=True)
|
||||
except subprocess.CalledProcessError:
|
||||
subprocess.run(
|
||||
[sys.executable, "-m", "ensurepip", "--upgrade", "--default-pip"],
|
||||
cwd=PROJECT_ROOT,
|
||||
check=True,
|
||||
)
|
||||
try:
|
||||
subprocess.run(pip_cmd + ["install", "-e", ".[all]", "--quiet"], cwd=PROJECT_ROOT, check=True)
|
||||
except subprocess.CalledProcessError:
|
||||
@@ -2943,15 +2824,10 @@ def cmd_update(args):
|
||||
print(f" ℹ️ {len(missing_config)} new config option(s) available")
|
||||
|
||||
print()
|
||||
if not (sys.stdin.isatty() and sys.stdout.isatty()):
|
||||
print(" ℹ Non-interactive session — skipping config migration prompt.")
|
||||
print(" Run 'hermes config migrate' later to apply any new config/env options.")
|
||||
response = "n"
|
||||
if sys.stdin.isatty():
|
||||
response = input("Would you like to configure them now? [Y/n]: ").strip().lower()
|
||||
else:
|
||||
try:
|
||||
response = input("Would you like to configure them now? [Y/n]: ").strip().lower()
|
||||
except EOFError:
|
||||
response = "n"
|
||||
response = "n"
|
||||
|
||||
if response in ('', 'y', 'yes'):
|
||||
print()
|
||||
@@ -2999,11 +2875,10 @@ def cmd_update(args):
|
||||
# Check for macOS launchd service
|
||||
if is_macos():
|
||||
try:
|
||||
from hermes_cli.gateway import get_launchd_label
|
||||
plist_path = get_launchd_plist_path()
|
||||
if plist_path.exists():
|
||||
check = subprocess.run(
|
||||
["launchctl", "list", get_launchd_label()],
|
||||
["launchctl", "list", "ai.hermes.gateway"],
|
||||
capture_output=True, text=True, timeout=5,
|
||||
)
|
||||
has_launchd_service = check.returncode == 0
|
||||
@@ -3059,13 +2934,12 @@ def cmd_update(args):
|
||||
# after a manual SIGTERM, which would race with the
|
||||
# PID file cleanup.
|
||||
print("→ Restarting gateway service...")
|
||||
_launchd_label = get_launchd_label()
|
||||
stop = subprocess.run(
|
||||
["launchctl", "stop", _launchd_label],
|
||||
["launchctl", "stop", "ai.hermes.gateway"],
|
||||
capture_output=True, text=True, timeout=10,
|
||||
)
|
||||
start = subprocess.run(
|
||||
["launchctl", "start", _launchd_label],
|
||||
["launchctl", "start", "ai.hermes.gateway"],
|
||||
capture_output=True, text=True, timeout=10,
|
||||
)
|
||||
if start.returncode == 0:
|
||||
@@ -3248,7 +3122,7 @@ For more help on a command:
|
||||
)
|
||||
chat_parser.add_argument(
|
||||
"--provider",
|
||||
choices=["auto", "openrouter", "nous", "openai-codex", "copilot-acp", "copilot", "anthropic", "huggingface", "zai", "kimi-coding", "minimax", "minimax-cn", "kilocode"],
|
||||
choices=["auto", "openrouter", "nous", "openai-codex", "copilot-acp", "copilot", "anthropic", "zai", "kimi-coding", "minimax", "minimax-cn", "kilocode"],
|
||||
default=None,
|
||||
help="Inference provider (default: auto)"
|
||||
)
|
||||
@@ -3549,38 +3423,7 @@ For more help on a command:
|
||||
cron_subparsers.add_parser("tick", help="Run due jobs once and exit")
|
||||
|
||||
cron_parser.set_defaults(func=cmd_cron)
|
||||
|
||||
# =========================================================================
|
||||
# webhook command
|
||||
# =========================================================================
|
||||
webhook_parser = subparsers.add_parser(
|
||||
"webhook",
|
||||
help="Manage dynamic webhook subscriptions",
|
||||
description="Create, list, and remove webhook subscriptions for event-driven agent activation",
|
||||
)
|
||||
webhook_subparsers = webhook_parser.add_subparsers(dest="webhook_action")
|
||||
|
||||
wh_sub = webhook_subparsers.add_parser("subscribe", aliases=["add"], help="Create a webhook subscription")
|
||||
wh_sub.add_argument("name", help="Route name (used in URL: /webhooks/<name>)")
|
||||
wh_sub.add_argument("--prompt", default="", help="Prompt template with {dot.notation} payload refs")
|
||||
wh_sub.add_argument("--events", default="", help="Comma-separated event types to accept")
|
||||
wh_sub.add_argument("--description", default="", help="What this subscription does")
|
||||
wh_sub.add_argument("--skills", default="", help="Comma-separated skill names to load")
|
||||
wh_sub.add_argument("--deliver", default="log", help="Delivery target: log, telegram, discord, slack, etc.")
|
||||
wh_sub.add_argument("--deliver-chat-id", default="", help="Target chat ID for cross-platform delivery")
|
||||
wh_sub.add_argument("--secret", default="", help="HMAC secret (auto-generated if omitted)")
|
||||
|
||||
webhook_subparsers.add_parser("list", aliases=["ls"], help="List all dynamic subscriptions")
|
||||
|
||||
wh_rm = webhook_subparsers.add_parser("remove", aliases=["rm"], help="Remove a subscription")
|
||||
wh_rm.add_argument("name", help="Subscription name to remove")
|
||||
|
||||
wh_test = webhook_subparsers.add_parser("test", help="Send a test POST to a webhook route")
|
||||
wh_test.add_argument("name", help="Subscription name to test")
|
||||
wh_test.add_argument("--payload", default="", help="JSON payload to send (default: test payload)")
|
||||
|
||||
webhook_parser.set_defaults(func=cmd_webhook)
|
||||
|
||||
|
||||
# =========================================================================
|
||||
# doctor command
|
||||
# =========================================================================
|
||||
@@ -3713,7 +3556,7 @@ For more help on a command:
|
||||
skills_snapshot = skills_subparsers.add_parser("snapshot", help="Export/import skill configurations")
|
||||
snapshot_subparsers = skills_snapshot.add_subparsers(dest="snapshot_action")
|
||||
snap_export = snapshot_subparsers.add_parser("export", help="Export installed skills to a file")
|
||||
snap_export.add_argument("output", help="Output JSON file path (use - for stdout)")
|
||||
snap_export.add_argument("output", help="Output JSON file path")
|
||||
snap_import = snapshot_subparsers.add_parser("import", help="Import and install skills from a file")
|
||||
snap_import.add_argument("input", help="Input JSON file path")
|
||||
snap_import.add_argument("--force", action="store_true", help="Force install despite caution verdict")
|
||||
@@ -3990,7 +3833,7 @@ For more help on a command:
|
||||
sessions_list.add_argument("--limit", type=int, default=20, help="Max sessions to show")
|
||||
|
||||
sessions_export = sessions_subparsers.add_parser("export", help="Export sessions to a JSONL file")
|
||||
sessions_export.add_argument("output", help="Output JSONL file path (use - for stdout)")
|
||||
sessions_export.add_argument("output", help="Output JSONL file path")
|
||||
sessions_export.add_argument("--source", help="Filter by source")
|
||||
sessions_export.add_argument("--session-id", help="Export a specific session")
|
||||
|
||||
@@ -4071,25 +3914,15 @@ For more help on a command:
|
||||
if not data:
|
||||
print(f"Session '{args.session_id}' not found.")
|
||||
return
|
||||
line = _json.dumps(data, ensure_ascii=False) + "\n"
|
||||
if args.output == "-":
|
||||
import sys
|
||||
sys.stdout.write(line)
|
||||
else:
|
||||
with open(args.output, "w", encoding="utf-8") as f:
|
||||
f.write(line)
|
||||
print(f"Exported 1 session to {args.output}")
|
||||
with open(args.output, "w", encoding="utf-8") as f:
|
||||
f.write(_json.dumps(data, ensure_ascii=False) + "\n")
|
||||
print(f"Exported 1 session to {args.output}")
|
||||
else:
|
||||
sessions = db.export_all(source=args.source)
|
||||
if args.output == "-":
|
||||
import sys
|
||||
with open(args.output, "w", encoding="utf-8") as f:
|
||||
for s in sessions:
|
||||
sys.stdout.write(_json.dumps(s, ensure_ascii=False) + "\n")
|
||||
else:
|
||||
with open(args.output, "w", encoding="utf-8") as f:
|
||||
for s in sessions:
|
||||
f.write(_json.dumps(s, ensure_ascii=False) + "\n")
|
||||
print(f"Exported {len(sessions)} sessions to {args.output}")
|
||||
f.write(_json.dumps(s, ensure_ascii=False) + "\n")
|
||||
print(f"Exported {len(sessions)} sessions to {args.output}")
|
||||
|
||||
elif action == "delete":
|
||||
resolved_session_id = db.resolve_session_id(args.session_id)
|
||||
|
||||
+5
-26
@@ -208,31 +208,14 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
|
||||
"google/gemini-3-pro-preview",
|
||||
"google/gemini-3-flash-preview",
|
||||
],
|
||||
# Alibaba DashScope Coding platform (coding-intl) — default endpoint.
|
||||
# Supports Qwen models + third-party providers (GLM, Kimi, MiniMax).
|
||||
# Users with classic DashScope keys should override DASHSCOPE_BASE_URL
|
||||
# to https://dashscope-intl.aliyuncs.com/compatible-mode/v1 (OpenAI-compat)
|
||||
# or https://dashscope-intl.aliyuncs.com/apps/anthropic (Anthropic-compat).
|
||||
"alibaba": [
|
||||
"qwen3.5-plus",
|
||||
"qwen3-max",
|
||||
"qwen3-coder-plus",
|
||||
"qwen3-coder-next",
|
||||
# Third-party models available on coding-intl
|
||||
"glm-5",
|
||||
"glm-4.7",
|
||||
"kimi-k2.5",
|
||||
"MiniMax-M2.5",
|
||||
],
|
||||
# Curated HF model list — only agentic models that map to OpenRouter defaults.
|
||||
"huggingface": [
|
||||
"Qwen/Qwen3.5-397B-A17B",
|
||||
"Qwen/Qwen3.5-35B-A3B",
|
||||
"deepseek-ai/DeepSeek-V3.2",
|
||||
"moonshotai/Kimi-K2.5",
|
||||
"MiniMaxAI/MiniMax-M2.5",
|
||||
"zai-org/GLM-5",
|
||||
"XiaomiMiMo/MiMo-V2-Flash",
|
||||
"moonshotai/Kimi-K2-Thinking",
|
||||
"qwen-plus-latest",
|
||||
"qwen3.5-flash",
|
||||
"qwen-vl-max",
|
||||
],
|
||||
}
|
||||
|
||||
@@ -253,7 +236,6 @@ _PROVIDER_LABELS = {
|
||||
"ai-gateway": "AI Gateway",
|
||||
"kilocode": "Kilo Code",
|
||||
"alibaba": "Alibaba Cloud (DashScope)",
|
||||
"huggingface": "Hugging Face",
|
||||
"custom": "Custom endpoint",
|
||||
}
|
||||
|
||||
@@ -289,9 +271,6 @@ _PROVIDER_ALIASES = {
|
||||
"aliyun": "alibaba",
|
||||
"qwen": "alibaba",
|
||||
"alibaba-cloud": "alibaba",
|
||||
"hf": "huggingface",
|
||||
"hugging-face": "huggingface",
|
||||
"huggingface-hub": "huggingface",
|
||||
}
|
||||
|
||||
|
||||
@@ -325,7 +304,7 @@ def list_available_providers() -> list[dict[str, str]]:
|
||||
# Canonical providers in display order
|
||||
_PROVIDER_ORDER = [
|
||||
"openrouter", "nous", "openai-codex", "copilot", "copilot-acp",
|
||||
"huggingface", "zai", "kimi-coding", "minimax", "minimax-cn", "kilocode", "anthropic", "alibaba",
|
||||
"zai", "kimi-coding", "minimax", "minimax-cn", "kilocode", "anthropic", "alibaba",
|
||||
"opencode-zen", "opencode-go",
|
||||
"ai-gateway", "deepseek", "custom",
|
||||
]
|
||||
|
||||
+5
-16
@@ -385,23 +385,16 @@ class PluginManager:
|
||||
# Hook invocation
|
||||
# -----------------------------------------------------------------------
|
||||
|
||||
def invoke_hook(self, hook_name: str, **kwargs: Any) -> List[Any]:
|
||||
def invoke_hook(self, hook_name: str, **kwargs: Any) -> None:
|
||||
"""Call all registered callbacks for *hook_name*.
|
||||
|
||||
Each callback is wrapped in its own try/except so a misbehaving
|
||||
plugin cannot break the core agent loop.
|
||||
|
||||
Returns a list of non-``None`` return values from callbacks.
|
||||
This allows hooks like ``pre_llm_call`` to contribute context
|
||||
that the agent core can collect and inject.
|
||||
"""
|
||||
callbacks = self._hooks.get(hook_name, [])
|
||||
results: List[Any] = []
|
||||
for cb in callbacks:
|
||||
try:
|
||||
ret = cb(**kwargs)
|
||||
if ret is not None:
|
||||
results.append(ret)
|
||||
cb(**kwargs)
|
||||
except Exception as exc:
|
||||
logger.warning(
|
||||
"Hook '%s' callback %s raised: %s",
|
||||
@@ -409,7 +402,6 @@ class PluginManager:
|
||||
getattr(cb, "__name__", repr(cb)),
|
||||
exc,
|
||||
)
|
||||
return results
|
||||
|
||||
# -----------------------------------------------------------------------
|
||||
# Introspection
|
||||
@@ -454,12 +446,9 @@ def discover_plugins() -> None:
|
||||
get_plugin_manager().discover_and_load()
|
||||
|
||||
|
||||
def invoke_hook(hook_name: str, **kwargs: Any) -> List[Any]:
|
||||
"""Invoke a lifecycle hook on all loaded plugins.
|
||||
|
||||
Returns a list of non-``None`` return values from plugin callbacks.
|
||||
"""
|
||||
return get_plugin_manager().invoke_hook(hook_name, **kwargs)
|
||||
def invoke_hook(hook_name: str, **kwargs: Any) -> None:
|
||||
"""Invoke a lifecycle hook on all loaded plugins."""
|
||||
get_plugin_manager().invoke_hook(hook_name, **kwargs)
|
||||
|
||||
|
||||
def get_plugin_tool_names() -> Set[str]:
|
||||
|
||||
@@ -63,11 +63,8 @@ def _get_model_config() -> Dict[str, Any]:
|
||||
model_cfg = config.get("model")
|
||||
if isinstance(model_cfg, dict):
|
||||
cfg = dict(model_cfg)
|
||||
# Accept "model" as alias for "default" (users intuitively write model.model)
|
||||
if not cfg.get("default") and cfg.get("model"):
|
||||
cfg["default"] = cfg["model"]
|
||||
default = (cfg.get("default") or "").strip()
|
||||
base_url = (cfg.get("base_url") or "").strip()
|
||||
default = cfg.get("default", "").strip()
|
||||
base_url = cfg.get("base_url", "").strip()
|
||||
is_local = "localhost" in base_url or "127.0.0.1" in base_url
|
||||
is_fallback = not default or default == "anthropic/claude-opus-4.6"
|
||||
if is_local and is_fallback and base_url:
|
||||
@@ -206,7 +203,7 @@ def _resolve_named_custom_runtime(
|
||||
or _detect_api_mode_for_url(base_url)
|
||||
or "chat_completions",
|
||||
"base_url": base_url,
|
||||
"api_key": api_key or "no-key-required",
|
||||
"api_key": api_key,
|
||||
"source": f"custom_provider:{custom_provider.get('name', requested_provider)}",
|
||||
}
|
||||
|
||||
@@ -410,6 +407,12 @@ def resolve_runtime_provider(
|
||||
# (e.g. https://api.minimax.io/anthropic, https://dashscope.../anthropic)
|
||||
elif base_url.rstrip("/").endswith("/anthropic"):
|
||||
api_mode = "anthropic_messages"
|
||||
# MiniMax providers always use Anthropic Messages API.
|
||||
# Auto-correct stale /v1 URLs (from old .env or config) to /anthropic.
|
||||
elif provider in ("minimax", "minimax-cn"):
|
||||
api_mode = "anthropic_messages"
|
||||
if base_url.rstrip("/").endswith("/v1"):
|
||||
base_url = base_url.rstrip("/")[:-3] + "/anthropic"
|
||||
return {
|
||||
"provider": provider,
|
||||
"api_mode": api_mode,
|
||||
|
||||
+11
-36
@@ -80,11 +80,6 @@ _DEFAULT_PROVIDER_MODELS = {
|
||||
"minimax-cn": ["MiniMax-M2.7", "MiniMax-M2.7-highspeed", "MiniMax-M2.5", "MiniMax-M2.5-highspeed", "MiniMax-M2.1"],
|
||||
"ai-gateway": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5", "google/gemini-3-flash"],
|
||||
"kilocode": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5.4", "google/gemini-3-pro-preview", "google/gemini-3-flash-preview"],
|
||||
"huggingface": [
|
||||
"Qwen/Qwen3.5-397B-A17B", "Qwen/Qwen3-235B-A22B-Thinking-2507",
|
||||
"Qwen/Qwen3-Coder-480B-A35B-Instruct", "deepseek-ai/DeepSeek-R1-0528",
|
||||
"deepseek-ai/DeepSeek-V3.2", "moonshotai/Kimi-K2.5",
|
||||
],
|
||||
}
|
||||
|
||||
|
||||
@@ -585,11 +580,11 @@ def _print_setup_summary(config: dict, hermes_home):
|
||||
else:
|
||||
tool_status.append(("Mixture of Agents", False, "OPENROUTER_API_KEY"))
|
||||
|
||||
# Web tools (Exa, Parallel, Firecrawl, or Tavily)
|
||||
if get_env_value("EXA_API_KEY") or get_env_value("PARALLEL_API_KEY") or get_env_value("FIRECRAWL_API_KEY") or get_env_value("FIRECRAWL_API_URL") or get_env_value("TAVILY_API_KEY"):
|
||||
# Web tools (Parallel, Firecrawl, or Tavily)
|
||||
if get_env_value("PARALLEL_API_KEY") or get_env_value("FIRECRAWL_API_KEY") or get_env_value("FIRECRAWL_API_URL") or get_env_value("TAVILY_API_KEY"):
|
||||
tool_status.append(("Web Search & Extract", True, None))
|
||||
else:
|
||||
tool_status.append(("Web Search & Extract", False, "EXA_API_KEY, PARALLEL_API_KEY, FIRECRAWL_API_KEY, or TAVILY_API_KEY"))
|
||||
tool_status.append(("Web Search & Extract", False, "PARALLEL_API_KEY, FIRECRAWL_API_KEY, or TAVILY_API_KEY"))
|
||||
|
||||
# Browser tools (local Chromium or Browserbase cloud)
|
||||
import shutil
|
||||
@@ -889,7 +884,6 @@ def setup_model_provider(config: dict):
|
||||
"OpenCode Go (open models, $10/month subscription)",
|
||||
"GitHub Copilot (uses GITHUB_TOKEN or gh auth token)",
|
||||
"GitHub Copilot ACP (spawns `copilot --acp --stdio`)",
|
||||
"Hugging Face Inference Providers (20+ open models)",
|
||||
]
|
||||
if keep_label:
|
||||
provider_choices.append(keep_label)
|
||||
@@ -1534,26 +1528,7 @@ def setup_model_provider(config: dict):
|
||||
_set_model_provider(config, "copilot-acp", pconfig.inference_base_url)
|
||||
selected_base_url = pconfig.inference_base_url
|
||||
|
||||
elif provider_idx == 16: # Hugging Face Inference Providers
|
||||
selected_provider = "huggingface"
|
||||
print()
|
||||
print_header("Hugging Face API Token")
|
||||
pconfig = PROVIDER_REGISTRY["huggingface"]
|
||||
print_info(f"Provider: {pconfig.name}")
|
||||
print_info("Get your token at: https://huggingface.co/settings/tokens")
|
||||
print_info("Required permission: 'Make calls to Inference Providers'")
|
||||
print()
|
||||
|
||||
api_key = prompt(" HF Token", password=True)
|
||||
if api_key:
|
||||
save_env_value("HF_TOKEN", api_key)
|
||||
# Clear OpenRouter env vars to prevent routing confusion
|
||||
save_env_value("OPENAI_BASE_URL", "")
|
||||
save_env_value("OPENAI_API_KEY", "")
|
||||
_set_model_provider(config, "huggingface", pconfig.inference_base_url)
|
||||
selected_base_url = pconfig.inference_base_url
|
||||
|
||||
# else: provider_idx == 17 (Keep current) — only shown when a provider already exists
|
||||
# else: provider_idx == 16 (Keep current) — only shown when a provider already exists
|
||||
# Normalize "keep current" to an explicit provider so downstream logic
|
||||
# doesn't fall back to the generic OpenRouter/static-model path.
|
||||
if selected_provider is None:
|
||||
@@ -2092,11 +2067,11 @@ def setup_terminal_backend(config: dict):
|
||||
print_info("Serverless cloud sandboxes. Each session gets its own container.")
|
||||
print_info("Requires a Modal account: https://modal.com")
|
||||
|
||||
# Check if modal SDK is installed
|
||||
# Check if swe-rex[modal] is installed
|
||||
try:
|
||||
__import__("modal")
|
||||
__import__("swe_rex")
|
||||
except ImportError:
|
||||
print_info("Installing modal SDK...")
|
||||
print_info("Installing swe-rex[modal]...")
|
||||
import subprocess
|
||||
|
||||
uv_bin = shutil.which("uv")
|
||||
@@ -2108,22 +2083,22 @@ def setup_terminal_backend(config: dict):
|
||||
"install",
|
||||
"--python",
|
||||
sys.executable,
|
||||
"modal",
|
||||
"swe-rex[modal]",
|
||||
],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
)
|
||||
else:
|
||||
result = subprocess.run(
|
||||
[sys.executable, "-m", "pip", "install", "modal"],
|
||||
[sys.executable, "-m", "pip", "install", "swe-rex[modal]"],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
)
|
||||
if result.returncode == 0:
|
||||
print_success("modal SDK installed")
|
||||
print_success("swe-rex[modal] installed")
|
||||
else:
|
||||
print_warning(
|
||||
"Install failed — run manually: pip install modal"
|
||||
"Install failed — run manually: pip install 'swe-rex[modal]'"
|
||||
)
|
||||
|
||||
# Modal token
|
||||
|
||||
@@ -24,10 +24,6 @@ PLATFORMS = {
|
||||
"whatsapp": "📱 WhatsApp",
|
||||
"signal": "📡 Signal",
|
||||
"email": "📧 Email",
|
||||
"homeassistant": "🏠 Home Assistant",
|
||||
"mattermost": "💬 Mattermost",
|
||||
"matrix": "💬 Matrix",
|
||||
"dingtalk": "💬 DingTalk",
|
||||
}
|
||||
|
||||
# ─── Config Helpers ───────────────────────────────────────────────────────────
|
||||
|
||||
+14
-48
@@ -304,8 +304,7 @@ def do_browse(page: int = 1, page_size: int = 20, source: str = "all",
|
||||
|
||||
|
||||
def do_install(identifier: str, category: str = "", force: bool = False,
|
||||
console: Optional[Console] = None, skip_confirm: bool = False,
|
||||
invalidate_cache: bool = True) -> None:
|
||||
console: Optional[Console] = None, skip_confirm: bool = False) -> None:
|
||||
"""Fetch, quarantine, scan, confirm, and install a skill."""
|
||||
from tools.skills_hub import (
|
||||
GitHubAuth, create_source_router, ensure_hub_dirs,
|
||||
@@ -418,17 +417,6 @@ def do_install(identifier: str, category: str = "", force: bool = False,
|
||||
c.print(f"[bold green]Installed:[/] {install_dir.relative_to(SKILLS_DIR)}")
|
||||
c.print(f"[dim]Files: {', '.join(bundle.files.keys())}[/]\n")
|
||||
|
||||
if invalidate_cache:
|
||||
# Invalidate the skills prompt cache so the new skill appears immediately
|
||||
try:
|
||||
from agent.prompt_builder import clear_skills_system_prompt_cache
|
||||
clear_skills_system_prompt_cache(clear_snapshot=True)
|
||||
except Exception:
|
||||
pass
|
||||
else:
|
||||
c.print("[dim]Skill will be available in your next session.[/]")
|
||||
c.print("[dim]Use /reset to start a new session now, or --now to activate immediately (invalidates prompt cache).[/]\n")
|
||||
|
||||
|
||||
def do_inspect(identifier: str, console: Optional[Console] = None) -> None:
|
||||
"""Preview a skill's SKILL.md content without installing."""
|
||||
@@ -615,8 +603,7 @@ def do_audit(name: Optional[str] = None, console: Optional[Console] = None) -> N
|
||||
|
||||
|
||||
def do_uninstall(name: str, console: Optional[Console] = None,
|
||||
skip_confirm: bool = False,
|
||||
invalidate_cache: bool = True) -> None:
|
||||
skip_confirm: bool = False) -> None:
|
||||
"""Remove a hub-installed skill with confirmation."""
|
||||
from tools.skills_hub import uninstall_skill
|
||||
|
||||
@@ -636,15 +623,6 @@ def do_uninstall(name: str, console: Optional[Console] = None,
|
||||
success, msg = uninstall_skill(name)
|
||||
if success:
|
||||
c.print(f"[bold green]{msg}[/]\n")
|
||||
if invalidate_cache:
|
||||
try:
|
||||
from agent.prompt_builder import clear_skills_system_prompt_cache
|
||||
clear_skills_system_prompt_cache(clear_snapshot=True)
|
||||
except Exception:
|
||||
pass
|
||||
else:
|
||||
c.print("[dim]Change will take effect in your next session.[/]")
|
||||
c.print("[dim]Use /reset to start a new session now, or --now to apply immediately (invalidates prompt cache).[/]\n")
|
||||
else:
|
||||
c.print(f"[bold red]Error:[/] {msg}\n")
|
||||
|
||||
@@ -887,15 +865,10 @@ def do_snapshot_export(output_path: str, console: Optional[Console] = None) -> N
|
||||
"taps": tap_list,
|
||||
}
|
||||
|
||||
payload = json.dumps(snapshot, indent=2, ensure_ascii=False) + "\n"
|
||||
if output_path == "-":
|
||||
import sys
|
||||
sys.stdout.write(payload)
|
||||
else:
|
||||
out = Path(output_path)
|
||||
out.write_text(payload)
|
||||
c.print(f"[bold green]Snapshot exported:[/] {out}")
|
||||
c.print(f"[dim]{len(installed)} skill(s), {len(tap_list)} tap(s)[/]\n")
|
||||
out = Path(output_path)
|
||||
out.write_text(json.dumps(snapshot, indent=2, ensure_ascii=False) + "\n")
|
||||
c.print(f"[bold green]Snapshot exported:[/] {out}")
|
||||
c.print(f"[dim]{len(installed)} skill(s), {len(tap_list)} tap(s)[/]\n")
|
||||
|
||||
|
||||
def do_snapshot_import(input_path: str, force: bool = False,
|
||||
@@ -1086,23 +1059,19 @@ def handle_skills_slash(cmd: str, console: Optional[Console] = None) -> None:
|
||||
|
||||
elif action == "install":
|
||||
if not args:
|
||||
c.print("[bold red]Usage:[/] /skills install <identifier> [--category <cat>] [--force] [--now]\n")
|
||||
c.print("[bold red]Usage:[/] /skills install <identifier> [--category <cat>] [--force|--yes]\n")
|
||||
return
|
||||
identifier = args[0]
|
||||
category = ""
|
||||
# Slash commands run inside prompt_toolkit where input() hangs.
|
||||
# Always skip confirmation — the user typing the command is implicit consent.
|
||||
skip_confirm = True
|
||||
# --yes / -y bypasses confirmation prompt (needed in TUI mode)
|
||||
# --force handles reinstall override
|
||||
skip_confirm = any(flag in args for flag in ("--yes", "-y"))
|
||||
force = "--force" in args
|
||||
# --now invalidates prompt cache immediately (costs more money).
|
||||
# Default: defer to next session to preserve cache.
|
||||
invalidate_cache = "--now" in args
|
||||
for i, a in enumerate(args):
|
||||
if a == "--category" and i + 1 < len(args):
|
||||
category = args[i + 1]
|
||||
do_install(identifier, category=category, force=force,
|
||||
skip_confirm=skip_confirm, invalidate_cache=invalidate_cache,
|
||||
console=c)
|
||||
skip_confirm=skip_confirm, console=c)
|
||||
|
||||
elif action == "inspect":
|
||||
if not args:
|
||||
@@ -1132,13 +1101,10 @@ def handle_skills_slash(cmd: str, console: Optional[Console] = None) -> None:
|
||||
|
||||
elif action == "uninstall":
|
||||
if not args:
|
||||
c.print("[bold red]Usage:[/] /skills uninstall <name> [--now]\n")
|
||||
c.print("[bold red]Usage:[/] /skills uninstall <name> [--yes]\n")
|
||||
return
|
||||
# Slash commands run inside prompt_toolkit where input() hangs.
|
||||
skip_confirm = True
|
||||
invalidate_cache = "--now" in args
|
||||
do_uninstall(args[0], console=c, skip_confirm=skip_confirm,
|
||||
invalidate_cache=invalidate_cache)
|
||||
skip_confirm = any(flag in args for flag in ("--yes", "-y"))
|
||||
do_uninstall(args[0], console=c, skip_confirm=skip_confirm)
|
||||
|
||||
elif action == "publish":
|
||||
if not args:
|
||||
|
||||
@@ -292,9 +292,8 @@ def show_status(args):
|
||||
print(" Manager: systemd (user)")
|
||||
|
||||
elif sys.platform == 'darwin':
|
||||
from hermes_cli.gateway import get_launchd_label
|
||||
result = subprocess.run(
|
||||
["launchctl", "list", get_launchd_label()],
|
||||
["launchctl", "list", "ai.hermes.gateway"],
|
||||
capture_output=True,
|
||||
text=True
|
||||
)
|
||||
|
||||
@@ -108,8 +108,7 @@ def _get_effective_configurable_toolsets():
|
||||
"""
|
||||
result = list(CONFIGURABLE_TOOLSETS)
|
||||
try:
|
||||
from hermes_cli.plugins import discover_plugins, get_plugin_toolsets
|
||||
discover_plugins() # idempotent — ensures plugins are loaded
|
||||
from hermes_cli.plugins import get_plugin_toolsets
|
||||
result.extend(get_plugin_toolsets())
|
||||
except Exception:
|
||||
pass
|
||||
@@ -119,8 +118,7 @@ def _get_effective_configurable_toolsets():
|
||||
def _get_plugin_toolset_keys() -> set:
|
||||
"""Return the set of toolset keys provided by plugins."""
|
||||
try:
|
||||
from hermes_cli.plugins import discover_plugins, get_plugin_toolsets
|
||||
discover_plugins() # idempotent — ensures plugins are loaded
|
||||
from hermes_cli.plugins import get_plugin_toolsets
|
||||
return {ts_key for ts_key, _, _ in get_plugin_toolsets()}
|
||||
except Exception:
|
||||
return set()
|
||||
@@ -135,10 +133,8 @@ PLATFORMS = {
|
||||
"signal": {"label": "📡 Signal", "default_toolset": "hermes-signal"},
|
||||
"homeassistant": {"label": "🏠 Home Assistant", "default_toolset": "hermes-homeassistant"},
|
||||
"email": {"label": "📧 Email", "default_toolset": "hermes-email"},
|
||||
"matrix": {"label": "💬 Matrix", "default_toolset": "hermes-matrix"},
|
||||
"dingtalk": {"label": "💬 DingTalk", "default_toolset": "hermes-dingtalk"},
|
||||
"api_server": {"label": "🌐 API Server", "default_toolset": "hermes-api-server"},
|
||||
"mattermost": {"label": "💬 Mattermost", "default_toolset": "hermes-mattermost"},
|
||||
}
|
||||
|
||||
|
||||
@@ -190,14 +186,6 @@ TOOL_CATEGORIES = {
|
||||
{"key": "FIRECRAWL_API_KEY", "prompt": "Firecrawl API key", "url": "https://firecrawl.dev"},
|
||||
],
|
||||
},
|
||||
{
|
||||
"name": "Exa",
|
||||
"tag": "AI-native search and contents",
|
||||
"web_backend": "exa",
|
||||
"env_vars": [
|
||||
{"key": "EXA_API_KEY", "prompt": "Exa API key", "url": "https://exa.ai"},
|
||||
],
|
||||
},
|
||||
{
|
||||
"name": "Parallel",
|
||||
"tag": "AI-native search and extract",
|
||||
|
||||
@@ -1,256 +0,0 @@
|
||||
"""hermes webhook — manage dynamic webhook subscriptions from the CLI.
|
||||
|
||||
Usage:
|
||||
hermes webhook subscribe <name> [options]
|
||||
hermes webhook list
|
||||
hermes webhook remove <name>
|
||||
hermes webhook test <name> [--payload '{"key": "value"}']
|
||||
|
||||
Subscriptions persist to ~/.hermes/webhook_subscriptions.json and are
|
||||
hot-reloaded by the webhook adapter without a gateway restart.
|
||||
"""
|
||||
|
||||
import json
|
||||
import os
|
||||
import re
|
||||
import secrets
|
||||
import time
|
||||
from pathlib import Path
|
||||
from typing import Dict, Optional
|
||||
|
||||
|
||||
_SUBSCRIPTIONS_FILENAME = "webhook_subscriptions.json"
|
||||
|
||||
|
||||
def _hermes_home() -> Path:
|
||||
return Path(
|
||||
os.getenv("HERMES_HOME", str(Path.home() / ".hermes"))
|
||||
).expanduser()
|
||||
|
||||
|
||||
def _subscriptions_path() -> Path:
|
||||
return _hermes_home() / _SUBSCRIPTIONS_FILENAME
|
||||
|
||||
|
||||
def _load_subscriptions() -> Dict[str, dict]:
|
||||
path = _subscriptions_path()
|
||||
if not path.exists():
|
||||
return {}
|
||||
try:
|
||||
data = json.loads(path.read_text(encoding="utf-8"))
|
||||
return data if isinstance(data, dict) else {}
|
||||
except Exception:
|
||||
return {}
|
||||
|
||||
|
||||
def _save_subscriptions(subs: Dict[str, dict]) -> None:
|
||||
path = _subscriptions_path()
|
||||
path.parent.mkdir(parents=True, exist_ok=True)
|
||||
tmp_path = path.with_suffix(".tmp")
|
||||
tmp_path.write_text(
|
||||
json.dumps(subs, indent=2, ensure_ascii=False),
|
||||
encoding="utf-8",
|
||||
)
|
||||
os.replace(str(tmp_path), str(path))
|
||||
|
||||
|
||||
def _get_webhook_config() -> dict:
|
||||
"""Load webhook platform config. Returns {} if not configured."""
|
||||
try:
|
||||
from hermes_cli.config import load_config
|
||||
cfg = load_config()
|
||||
return cfg.get("platforms", {}).get("webhook", {})
|
||||
except Exception:
|
||||
return {}
|
||||
|
||||
|
||||
def _is_webhook_enabled() -> bool:
|
||||
return bool(_get_webhook_config().get("enabled"))
|
||||
|
||||
|
||||
def _get_webhook_base_url() -> str:
|
||||
wh = _get_webhook_config().get("extra", {})
|
||||
host = wh.get("host", "0.0.0.0")
|
||||
port = wh.get("port", 8644)
|
||||
display_host = "localhost" if host == "0.0.0.0" else host
|
||||
return f"http://{display_host}:{port}"
|
||||
|
||||
|
||||
_SETUP_HINT = """
|
||||
Webhook platform is not enabled. To set it up:
|
||||
|
||||
1. Run the gateway setup wizard:
|
||||
hermes gateway setup
|
||||
|
||||
2. Or manually add to ~/.hermes/config.yaml:
|
||||
platforms:
|
||||
webhook:
|
||||
enabled: true
|
||||
extra:
|
||||
host: "0.0.0.0"
|
||||
port: 8644
|
||||
secret: "your-global-hmac-secret"
|
||||
|
||||
3. Or set environment variables in ~/.hermes/.env:
|
||||
WEBHOOK_ENABLED=true
|
||||
WEBHOOK_PORT=8644
|
||||
WEBHOOK_SECRET=your-global-secret
|
||||
|
||||
Then start the gateway: hermes gateway run
|
||||
"""
|
||||
|
||||
|
||||
def _require_webhook_enabled() -> bool:
|
||||
"""Check webhook is enabled. Print setup guide and return False if not."""
|
||||
if _is_webhook_enabled():
|
||||
return True
|
||||
print(_SETUP_HINT)
|
||||
return False
|
||||
|
||||
|
||||
def webhook_command(args):
|
||||
"""Entry point for 'hermes webhook' subcommand."""
|
||||
sub = getattr(args, "webhook_action", None)
|
||||
|
||||
if not sub:
|
||||
print("Usage: hermes webhook {subscribe|list|remove|test}")
|
||||
print("Run 'hermes webhook --help' for details.")
|
||||
return
|
||||
|
||||
if not _require_webhook_enabled():
|
||||
return
|
||||
|
||||
if sub in ("subscribe", "add"):
|
||||
_cmd_subscribe(args)
|
||||
elif sub in ("list", "ls"):
|
||||
_cmd_list(args)
|
||||
elif sub in ("remove", "rm"):
|
||||
_cmd_remove(args)
|
||||
elif sub == "test":
|
||||
_cmd_test(args)
|
||||
|
||||
|
||||
def _cmd_subscribe(args):
|
||||
name = args.name.strip().lower().replace(" ", "-")
|
||||
if not re.match(r'^[a-z0-9][a-z0-9_-]*$', name):
|
||||
print(f"Error: Invalid name '{name}'. Use lowercase alphanumeric with hyphens/underscores.")
|
||||
return
|
||||
|
||||
subs = _load_subscriptions()
|
||||
is_update = name in subs
|
||||
|
||||
secret = args.secret or secrets.token_urlsafe(32)
|
||||
events = [e.strip() for e in args.events.split(",")] if args.events else []
|
||||
|
||||
route = {
|
||||
"description": args.description or f"Agent-created subscription: {name}",
|
||||
"events": events,
|
||||
"secret": secret,
|
||||
"prompt": args.prompt or "",
|
||||
"skills": [s.strip() for s in args.skills.split(",")] if args.skills else [],
|
||||
"deliver": args.deliver or "log",
|
||||
"created_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
|
||||
}
|
||||
|
||||
if args.deliver_chat_id:
|
||||
route["deliver_extra"] = {"chat_id": args.deliver_chat_id}
|
||||
|
||||
subs[name] = route
|
||||
_save_subscriptions(subs)
|
||||
|
||||
base_url = _get_webhook_base_url()
|
||||
status = "Updated" if is_update else "Created"
|
||||
|
||||
print(f"\n {status} webhook subscription: {name}")
|
||||
print(f" URL: {base_url}/webhooks/{name}")
|
||||
print(f" Secret: {secret}")
|
||||
if events:
|
||||
print(f" Events: {', '.join(events)}")
|
||||
else:
|
||||
print(" Events: (all)")
|
||||
print(f" Deliver: {route['deliver']}")
|
||||
if route.get("prompt"):
|
||||
prompt_preview = route["prompt"][:80] + ("..." if len(route["prompt"]) > 80 else "")
|
||||
print(f" Prompt: {prompt_preview}")
|
||||
print(f"\n Configure your service to POST to the URL above.")
|
||||
print(f" Use the secret for HMAC-SHA256 signature validation.")
|
||||
print(f" The gateway must be running to receive events (hermes gateway run).\n")
|
||||
|
||||
|
||||
def _cmd_list(args):
|
||||
subs = _load_subscriptions()
|
||||
if not subs:
|
||||
print(" No dynamic webhook subscriptions.")
|
||||
print(" Create one with: hermes webhook subscribe <name>")
|
||||
return
|
||||
|
||||
base_url = _get_webhook_base_url()
|
||||
print(f"\n {len(subs)} webhook subscription(s):\n")
|
||||
for name, route in subs.items():
|
||||
events = ", ".join(route.get("events", [])) or "(all)"
|
||||
deliver = route.get("deliver", "log")
|
||||
desc = route.get("description", "")
|
||||
print(f" ◆ {name}")
|
||||
if desc:
|
||||
print(f" {desc}")
|
||||
print(f" URL: {base_url}/webhooks/{name}")
|
||||
print(f" Events: {events}")
|
||||
print(f" Deliver: {deliver}")
|
||||
print()
|
||||
|
||||
|
||||
def _cmd_remove(args):
|
||||
name = args.name.strip().lower()
|
||||
subs = _load_subscriptions()
|
||||
|
||||
if name not in subs:
|
||||
print(f" No subscription named '{name}'.")
|
||||
print(" Note: Static routes from config.yaml cannot be removed here.")
|
||||
return
|
||||
|
||||
del subs[name]
|
||||
_save_subscriptions(subs)
|
||||
print(f" Removed webhook subscription: {name}")
|
||||
|
||||
|
||||
def _cmd_test(args):
|
||||
"""Send a test POST to a webhook route."""
|
||||
name = args.name.strip().lower()
|
||||
subs = _load_subscriptions()
|
||||
|
||||
if name not in subs:
|
||||
print(f" No subscription named '{name}'.")
|
||||
return
|
||||
|
||||
route = subs[name]
|
||||
secret = route.get("secret", "")
|
||||
base_url = _get_webhook_base_url()
|
||||
url = f"{base_url}/webhooks/{name}"
|
||||
|
||||
payload = args.payload or '{"test": true, "event_type": "test", "message": "Hello from hermes webhook test"}'
|
||||
|
||||
import hmac
|
||||
import hashlib
|
||||
sig = "sha256=" + hmac.new(
|
||||
secret.encode(), payload.encode(), hashlib.sha256
|
||||
).hexdigest()
|
||||
|
||||
print(f" Sending test POST to {url}")
|
||||
try:
|
||||
import urllib.request
|
||||
req = urllib.request.Request(
|
||||
url,
|
||||
data=payload.encode(),
|
||||
headers={
|
||||
"Content-Type": "application/json",
|
||||
"X-Hub-Signature-256": sig,
|
||||
"X-GitHub-Event": "test",
|
||||
},
|
||||
method="POST",
|
||||
)
|
||||
with urllib.request.urlopen(req, timeout=10) as resp:
|
||||
body = resp.read().decode()
|
||||
print(f" Response ({resp.status}): {body}")
|
||||
except Exception as e:
|
||||
print(f" Error: {e}")
|
||||
print(" Is the gateway running? (hermes gateway run)")
|
||||
@@ -17,27 +17,6 @@ def get_hermes_home() -> Path:
|
||||
return Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
|
||||
|
||||
|
||||
def get_hermes_dir(new_subpath: str, old_name: str) -> Path:
|
||||
"""Resolve a Hermes subdirectory with backward compatibility.
|
||||
|
||||
New installs get the consolidated layout (e.g. ``cache/images``).
|
||||
Existing installs that already have the old path (e.g. ``image_cache``)
|
||||
keep using it — no migration required.
|
||||
|
||||
Args:
|
||||
new_subpath: Preferred path relative to HERMES_HOME (e.g. ``"cache/images"``).
|
||||
old_name: Legacy path relative to HERMES_HOME (e.g. ``"image_cache"``).
|
||||
|
||||
Returns:
|
||||
Absolute ``Path`` — old location if it exists on disk, otherwise the new one.
|
||||
"""
|
||||
home = get_hermes_home()
|
||||
old_path = home / old_name
|
||||
if old_path.exists():
|
||||
return old_path
|
||||
return home / new_subpath
|
||||
|
||||
|
||||
VALID_REASONING_EFFORTS = ("xhigh", "high", "medium", "low", "minimal")
|
||||
|
||||
|
||||
|
||||
+84
-304
@@ -15,20 +15,15 @@ Key design decisions:
|
||||
"""
|
||||
|
||||
import json
|
||||
import logging
|
||||
import os
|
||||
import random
|
||||
import re
|
||||
import sqlite3
|
||||
import threading
|
||||
import time
|
||||
from pathlib import Path
|
||||
from hermes_constants import get_hermes_home
|
||||
from typing import Any, Callable, Dict, List, Optional, TypeVar
|
||||
from typing import Dict, Any, List, Optional
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
T = TypeVar("T")
|
||||
|
||||
DEFAULT_DB_PATH = get_hermes_home() / "state.db"
|
||||
|
||||
@@ -121,38 +116,18 @@ class SessionDB:
|
||||
single writer via WAL mode). Each method opens its own cursor.
|
||||
"""
|
||||
|
||||
# ── Write-contention tuning ──
|
||||
# With multiple hermes processes (gateway + CLI sessions + worktree agents)
|
||||
# all sharing one state.db, WAL write-lock contention causes visible TUI
|
||||
# freezes. SQLite's built-in busy handler uses a deterministic sleep
|
||||
# schedule that causes convoy effects under high concurrency.
|
||||
#
|
||||
# Instead, we keep the SQLite timeout short (1s) and handle retries at the
|
||||
# application level with random jitter, which naturally staggers competing
|
||||
# writers and avoids the convoy.
|
||||
_WRITE_MAX_RETRIES = 15
|
||||
_WRITE_RETRY_MIN_S = 0.020 # 20ms
|
||||
_WRITE_RETRY_MAX_S = 0.150 # 150ms
|
||||
# Attempt a PASSIVE WAL checkpoint every N successful writes.
|
||||
_CHECKPOINT_EVERY_N_WRITES = 50
|
||||
|
||||
def __init__(self, db_path: Path = None):
|
||||
self.db_path = db_path or DEFAULT_DB_PATH
|
||||
self.db_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
self._lock = threading.Lock()
|
||||
self._write_count = 0
|
||||
self._conn = sqlite3.connect(
|
||||
str(self.db_path),
|
||||
check_same_thread=False,
|
||||
# Short timeout — application-level retry with random jitter
|
||||
# handles contention instead of sitting in SQLite's internal
|
||||
# busy handler for up to 30s.
|
||||
timeout=1.0,
|
||||
# Autocommit mode: Python's default isolation_level="" auto-starts
|
||||
# transactions on DML, which conflicts with our explicit
|
||||
# BEGIN IMMEDIATE. None = we manage transactions ourselves.
|
||||
isolation_level=None,
|
||||
# 30s gives the WAL writer (CLI or gateway) time to finish a batch
|
||||
# flush before the concurrent reader/writer gives up. 10s was too
|
||||
# short when the CLI is doing frequent memory flushes.
|
||||
timeout=30.0,
|
||||
)
|
||||
self._conn.row_factory = sqlite3.Row
|
||||
self._conn.execute("PRAGMA journal_mode=WAL")
|
||||
@@ -160,96 +135,6 @@ class SessionDB:
|
||||
|
||||
self._init_schema()
|
||||
|
||||
# ── Core write helper ──
|
||||
|
||||
def _execute_write(self, fn: Callable[[sqlite3.Connection], T]) -> T:
|
||||
"""Execute a write transaction with BEGIN IMMEDIATE and jitter retry.
|
||||
|
||||
*fn* receives the connection and should perform INSERT/UPDATE/DELETE
|
||||
statements. The caller must NOT call ``commit()`` — that's handled
|
||||
here after *fn* returns.
|
||||
|
||||
BEGIN IMMEDIATE acquires the WAL write lock at transaction start
|
||||
(not at commit time), so lock contention surfaces immediately.
|
||||
On ``database is locked``, we release the Python lock, sleep a
|
||||
random 20-150ms, and retry — breaking the convoy pattern that
|
||||
SQLite's built-in deterministic backoff creates.
|
||||
|
||||
Returns whatever *fn* returns.
|
||||
"""
|
||||
last_err: Optional[Exception] = None
|
||||
for attempt in range(self._WRITE_MAX_RETRIES):
|
||||
try:
|
||||
with self._lock:
|
||||
self._conn.execute("BEGIN IMMEDIATE")
|
||||
try:
|
||||
result = fn(self._conn)
|
||||
self._conn.commit()
|
||||
except BaseException:
|
||||
try:
|
||||
self._conn.rollback()
|
||||
except Exception:
|
||||
pass
|
||||
raise
|
||||
# Success — periodic best-effort checkpoint.
|
||||
self._write_count += 1
|
||||
if self._write_count % self._CHECKPOINT_EVERY_N_WRITES == 0:
|
||||
self._try_wal_checkpoint()
|
||||
return result
|
||||
except sqlite3.OperationalError as exc:
|
||||
err_msg = str(exc).lower()
|
||||
if "locked" in err_msg or "busy" in err_msg:
|
||||
last_err = exc
|
||||
if attempt < self._WRITE_MAX_RETRIES - 1:
|
||||
jitter = random.uniform(
|
||||
self._WRITE_RETRY_MIN_S,
|
||||
self._WRITE_RETRY_MAX_S,
|
||||
)
|
||||
time.sleep(jitter)
|
||||
continue
|
||||
# Non-lock error or retries exhausted — propagate.
|
||||
raise
|
||||
# Retries exhausted (shouldn't normally reach here).
|
||||
raise last_err or sqlite3.OperationalError(
|
||||
"database is locked after max retries"
|
||||
)
|
||||
|
||||
def _try_wal_checkpoint(self) -> None:
|
||||
"""Best-effort PASSIVE WAL checkpoint. Never blocks, never raises.
|
||||
|
||||
Flushes committed WAL frames back into the main DB file for any
|
||||
frames that no other connection currently needs. Keeps the WAL
|
||||
from growing unbounded when many processes hold persistent
|
||||
connections.
|
||||
"""
|
||||
try:
|
||||
with self._lock:
|
||||
result = self._conn.execute(
|
||||
"PRAGMA wal_checkpoint(PASSIVE)"
|
||||
).fetchone()
|
||||
if result and result[1] > 0:
|
||||
logger.debug(
|
||||
"WAL checkpoint: %d/%d pages checkpointed",
|
||||
result[2], result[1],
|
||||
)
|
||||
except Exception:
|
||||
pass # Best effort — never fatal.
|
||||
|
||||
def close(self):
|
||||
"""Close the database connection.
|
||||
|
||||
Attempts a PASSIVE WAL checkpoint first so that exiting processes
|
||||
help keep the WAL file from growing unbounded.
|
||||
"""
|
||||
with self._lock:
|
||||
if self._conn:
|
||||
try:
|
||||
self._conn.execute("PRAGMA wal_checkpoint(PASSIVE)")
|
||||
except Exception:
|
||||
pass
|
||||
self._conn.close()
|
||||
self._conn = None
|
||||
|
||||
def _init_schema(self):
|
||||
"""Create tables and FTS if they don't exist, run migrations."""
|
||||
cursor = self._conn.cursor()
|
||||
@@ -371,8 +256,8 @@ class SessionDB:
|
||||
parent_session_id: str = None,
|
||||
) -> str:
|
||||
"""Create a new session record. Returns the session_id."""
|
||||
def _do(conn):
|
||||
conn.execute(
|
||||
with self._lock:
|
||||
self._conn.execute(
|
||||
"""INSERT OR IGNORE INTO sessions (id, source, user_id, model, model_config,
|
||||
system_prompt, parent_session_id, started_at)
|
||||
VALUES (?, ?, ?, ?, ?, ?, ?, ?)""",
|
||||
@@ -387,35 +272,26 @@ class SessionDB:
|
||||
time.time(),
|
||||
),
|
||||
)
|
||||
self._execute_write(_do)
|
||||
self._conn.commit()
|
||||
return session_id
|
||||
|
||||
def end_session(self, session_id: str, end_reason: str) -> None:
|
||||
"""Mark a session as ended."""
|
||||
def _do(conn):
|
||||
conn.execute(
|
||||
with self._lock:
|
||||
self._conn.execute(
|
||||
"UPDATE sessions SET ended_at = ?, end_reason = ? WHERE id = ?",
|
||||
(time.time(), end_reason, session_id),
|
||||
)
|
||||
self._execute_write(_do)
|
||||
|
||||
def reopen_session(self, session_id: str) -> None:
|
||||
"""Clear ended_at/end_reason so a session can be resumed."""
|
||||
def _do(conn):
|
||||
conn.execute(
|
||||
"UPDATE sessions SET ended_at = NULL, end_reason = NULL WHERE id = ?",
|
||||
(session_id,),
|
||||
)
|
||||
self._execute_write(_do)
|
||||
self._conn.commit()
|
||||
|
||||
def update_system_prompt(self, session_id: str, system_prompt: str) -> None:
|
||||
"""Store the full assembled system prompt snapshot."""
|
||||
def _do(conn):
|
||||
conn.execute(
|
||||
with self._lock:
|
||||
self._conn.execute(
|
||||
"UPDATE sessions SET system_prompt = ? WHERE id = ?",
|
||||
(system_prompt, session_id),
|
||||
)
|
||||
self._execute_write(_do)
|
||||
self._conn.commit()
|
||||
|
||||
def update_token_counts(
|
||||
self,
|
||||
@@ -434,39 +310,11 @@ class SessionDB:
|
||||
billing_provider: Optional[str] = None,
|
||||
billing_base_url: Optional[str] = None,
|
||||
billing_mode: Optional[str] = None,
|
||||
absolute: bool = False,
|
||||
) -> None:
|
||||
"""Update token counters and backfill model if not already set.
|
||||
|
||||
When *absolute* is False (default), values are **incremented** — use
|
||||
this for per-API-call deltas (CLI path).
|
||||
|
||||
When *absolute* is True, values are **set directly** — use this when
|
||||
the caller already holds cumulative totals (gateway path, where the
|
||||
cached agent accumulates across messages).
|
||||
"""
|
||||
if absolute:
|
||||
sql = """UPDATE sessions SET
|
||||
input_tokens = ?,
|
||||
output_tokens = ?,
|
||||
cache_read_tokens = ?,
|
||||
cache_write_tokens = ?,
|
||||
reasoning_tokens = ?,
|
||||
estimated_cost_usd = COALESCE(?, 0),
|
||||
actual_cost_usd = CASE
|
||||
WHEN ? IS NULL THEN actual_cost_usd
|
||||
ELSE ?
|
||||
END,
|
||||
cost_status = COALESCE(?, cost_status),
|
||||
cost_source = COALESCE(?, cost_source),
|
||||
pricing_version = COALESCE(?, pricing_version),
|
||||
billing_provider = COALESCE(billing_provider, ?),
|
||||
billing_base_url = COALESCE(billing_base_url, ?),
|
||||
billing_mode = COALESCE(billing_mode, ?),
|
||||
model = COALESCE(model, ?)
|
||||
WHERE id = ?"""
|
||||
else:
|
||||
sql = """UPDATE sessions SET
|
||||
"""Increment token counters and backfill model if not already set."""
|
||||
with self._lock:
|
||||
self._conn.execute(
|
||||
"""UPDATE sessions SET
|
||||
input_tokens = input_tokens + ?,
|
||||
output_tokens = output_tokens + ?,
|
||||
cache_read_tokens = cache_read_tokens + ?,
|
||||
@@ -484,94 +332,6 @@ class SessionDB:
|
||||
billing_base_url = COALESCE(billing_base_url, ?),
|
||||
billing_mode = COALESCE(billing_mode, ?),
|
||||
model = COALESCE(model, ?)
|
||||
WHERE id = ?"""
|
||||
params = (
|
||||
input_tokens,
|
||||
output_tokens,
|
||||
cache_read_tokens,
|
||||
cache_write_tokens,
|
||||
reasoning_tokens,
|
||||
estimated_cost_usd,
|
||||
actual_cost_usd,
|
||||
actual_cost_usd,
|
||||
cost_status,
|
||||
cost_source,
|
||||
pricing_version,
|
||||
billing_provider,
|
||||
billing_base_url,
|
||||
billing_mode,
|
||||
model,
|
||||
session_id,
|
||||
)
|
||||
def _do(conn):
|
||||
conn.execute(sql, params)
|
||||
self._execute_write(_do)
|
||||
|
||||
def ensure_session(
|
||||
self,
|
||||
session_id: str,
|
||||
source: str = "unknown",
|
||||
model: str = None,
|
||||
) -> None:
|
||||
"""Ensure a session row exists, creating it with minimal metadata if absent.
|
||||
|
||||
Used by _flush_messages_to_session_db to recover from a failed
|
||||
create_session() call (e.g. transient SQLite lock at agent startup).
|
||||
INSERT OR IGNORE is safe to call even when the row already exists.
|
||||
"""
|
||||
def _do(conn):
|
||||
conn.execute(
|
||||
"""INSERT OR IGNORE INTO sessions
|
||||
(id, source, model, started_at)
|
||||
VALUES (?, ?, ?, ?)""",
|
||||
(session_id, source, model, time.time()),
|
||||
)
|
||||
self._execute_write(_do)
|
||||
|
||||
def set_token_counts(
|
||||
self,
|
||||
session_id: str,
|
||||
input_tokens: int = 0,
|
||||
output_tokens: int = 0,
|
||||
model: str = None,
|
||||
cache_read_tokens: int = 0,
|
||||
cache_write_tokens: int = 0,
|
||||
reasoning_tokens: int = 0,
|
||||
estimated_cost_usd: Optional[float] = None,
|
||||
actual_cost_usd: Optional[float] = None,
|
||||
cost_status: Optional[str] = None,
|
||||
cost_source: Optional[str] = None,
|
||||
pricing_version: Optional[str] = None,
|
||||
billing_provider: Optional[str] = None,
|
||||
billing_base_url: Optional[str] = None,
|
||||
billing_mode: Optional[str] = None,
|
||||
) -> None:
|
||||
"""Set token counters to absolute values (not increment).
|
||||
|
||||
Use this when the caller provides cumulative totals from a completed
|
||||
conversation run (e.g. the gateway, where the cached agent's
|
||||
session_prompt_tokens already reflects the running total).
|
||||
"""
|
||||
def _do(conn):
|
||||
conn.execute(
|
||||
"""UPDATE sessions SET
|
||||
input_tokens = ?,
|
||||
output_tokens = ?,
|
||||
cache_read_tokens = ?,
|
||||
cache_write_tokens = ?,
|
||||
reasoning_tokens = ?,
|
||||
estimated_cost_usd = ?,
|
||||
actual_cost_usd = CASE
|
||||
WHEN ? IS NULL THEN actual_cost_usd
|
||||
ELSE ?
|
||||
END,
|
||||
cost_status = COALESCE(?, cost_status),
|
||||
cost_source = COALESCE(?, cost_source),
|
||||
pricing_version = COALESCE(?, pricing_version),
|
||||
billing_provider = COALESCE(billing_provider, ?),
|
||||
billing_base_url = COALESCE(billing_base_url, ?),
|
||||
billing_mode = COALESCE(billing_mode, ?),
|
||||
model = COALESCE(model, ?)
|
||||
WHERE id = ?""",
|
||||
(
|
||||
input_tokens,
|
||||
@@ -592,7 +352,28 @@ class SessionDB:
|
||||
session_id,
|
||||
),
|
||||
)
|
||||
self._execute_write(_do)
|
||||
self._conn.commit()
|
||||
|
||||
def ensure_session(
|
||||
self,
|
||||
session_id: str,
|
||||
source: str = "unknown",
|
||||
model: str = None,
|
||||
) -> None:
|
||||
"""Ensure a session row exists, creating it with minimal metadata if absent.
|
||||
|
||||
Used by _flush_messages_to_session_db to recover from a failed
|
||||
create_session() call (e.g. transient SQLite lock at agent startup).
|
||||
INSERT OR IGNORE is safe to call even when the row already exists.
|
||||
"""
|
||||
with self._lock:
|
||||
self._conn.execute(
|
||||
"""INSERT OR IGNORE INTO sessions
|
||||
(id, source, model, started_at)
|
||||
VALUES (?, ?, ?, ?)""",
|
||||
(session_id, source, model, time.time()),
|
||||
)
|
||||
self._conn.commit()
|
||||
|
||||
def get_session(self, session_id: str) -> Optional[Dict[str, Any]]:
|
||||
"""Get a session by ID."""
|
||||
@@ -686,10 +467,10 @@ class SessionDB:
|
||||
Empty/whitespace-only strings are normalized to None (clearing the title).
|
||||
"""
|
||||
title = self.sanitize_title(title)
|
||||
def _do(conn):
|
||||
with self._lock:
|
||||
if title:
|
||||
# Check uniqueness (allow the same session to keep its own title)
|
||||
cursor = conn.execute(
|
||||
cursor = self._conn.execute(
|
||||
"SELECT id FROM sessions WHERE title = ? AND id != ?",
|
||||
(title, session_id),
|
||||
)
|
||||
@@ -698,12 +479,12 @@ class SessionDB:
|
||||
raise ValueError(
|
||||
f"Title '{title}' is already in use by session {conflict['id']}"
|
||||
)
|
||||
cursor = conn.execute(
|
||||
cursor = self._conn.execute(
|
||||
"UPDATE sessions SET title = ? WHERE id = ?",
|
||||
(title, session_id),
|
||||
)
|
||||
return cursor.rowcount
|
||||
rowcount = self._execute_write(_do)
|
||||
self._conn.commit()
|
||||
rowcount = cursor.rowcount
|
||||
return rowcount > 0
|
||||
|
||||
def get_session_title(self, session_id: str) -> Optional[str]:
|
||||
@@ -875,24 +656,17 @@ class SessionDB:
|
||||
Also increments the session's message_count (and tool_call_count
|
||||
if role is 'tool' or tool_calls is present).
|
||||
"""
|
||||
# Serialize structured fields to JSON before entering the write txn
|
||||
reasoning_details_json = (
|
||||
json.dumps(reasoning_details)
|
||||
if reasoning_details else None
|
||||
)
|
||||
codex_items_json = (
|
||||
json.dumps(codex_reasoning_items)
|
||||
if codex_reasoning_items else None
|
||||
)
|
||||
tool_calls_json = json.dumps(tool_calls) if tool_calls else None
|
||||
|
||||
# Pre-compute tool call count
|
||||
num_tool_calls = 0
|
||||
if tool_calls is not None:
|
||||
num_tool_calls = len(tool_calls) if isinstance(tool_calls, list) else 1
|
||||
|
||||
def _do(conn):
|
||||
cursor = conn.execute(
|
||||
with self._lock:
|
||||
# Serialize structured fields to JSON for storage
|
||||
reasoning_details_json = (
|
||||
json.dumps(reasoning_details)
|
||||
if reasoning_details else None
|
||||
)
|
||||
codex_items_json = (
|
||||
json.dumps(codex_reasoning_items)
|
||||
if codex_reasoning_items else None
|
||||
)
|
||||
cursor = self._conn.execute(
|
||||
"""INSERT INTO messages (session_id, role, content, tool_call_id,
|
||||
tool_calls, tool_name, timestamp, token_count, finish_reason,
|
||||
reasoning, reasoning_details, codex_reasoning_items)
|
||||
@@ -902,7 +676,7 @@ class SessionDB:
|
||||
role,
|
||||
content,
|
||||
tool_call_id,
|
||||
tool_calls_json,
|
||||
json.dumps(tool_calls) if tool_calls else None,
|
||||
tool_name,
|
||||
time.time(),
|
||||
token_count,
|
||||
@@ -915,20 +689,25 @@ class SessionDB:
|
||||
msg_id = cursor.lastrowid
|
||||
|
||||
# Update counters
|
||||
# Count actual tool calls from the tool_calls list (not from tool responses).
|
||||
# A single assistant message can contain multiple parallel tool calls.
|
||||
num_tool_calls = 0
|
||||
if tool_calls is not None:
|
||||
num_tool_calls = len(tool_calls) if isinstance(tool_calls, list) else 1
|
||||
if num_tool_calls > 0:
|
||||
conn.execute(
|
||||
self._conn.execute(
|
||||
"""UPDATE sessions SET message_count = message_count + 1,
|
||||
tool_call_count = tool_call_count + ? WHERE id = ?""",
|
||||
(num_tool_calls, session_id),
|
||||
)
|
||||
else:
|
||||
conn.execute(
|
||||
self._conn.execute(
|
||||
"UPDATE sessions SET message_count = message_count + 1 WHERE id = ?",
|
||||
(session_id,),
|
||||
)
|
||||
return msg_id
|
||||
|
||||
return self._execute_write(_do)
|
||||
self._conn.commit()
|
||||
return msg_id
|
||||
|
||||
def get_messages(self, session_id: str) -> List[Dict[str, Any]]:
|
||||
"""Load all messages for a session, ordered by timestamp."""
|
||||
@@ -1222,53 +1001,54 @@ class SessionDB:
|
||||
|
||||
def clear_messages(self, session_id: str) -> None:
|
||||
"""Delete all messages for a session and reset its counters."""
|
||||
def _do(conn):
|
||||
conn.execute(
|
||||
with self._lock:
|
||||
self._conn.execute(
|
||||
"DELETE FROM messages WHERE session_id = ?", (session_id,)
|
||||
)
|
||||
conn.execute(
|
||||
self._conn.execute(
|
||||
"UPDATE sessions SET message_count = 0, tool_call_count = 0 WHERE id = ?",
|
||||
(session_id,),
|
||||
)
|
||||
self._execute_write(_do)
|
||||
self._conn.commit()
|
||||
|
||||
def delete_session(self, session_id: str) -> bool:
|
||||
"""Delete a session and all its messages. Returns True if found."""
|
||||
def _do(conn):
|
||||
cursor = conn.execute(
|
||||
with self._lock:
|
||||
cursor = self._conn.execute(
|
||||
"SELECT COUNT(*) FROM sessions WHERE id = ?", (session_id,)
|
||||
)
|
||||
if cursor.fetchone()[0] == 0:
|
||||
return False
|
||||
conn.execute("DELETE FROM messages WHERE session_id = ?", (session_id,))
|
||||
conn.execute("DELETE FROM sessions WHERE id = ?", (session_id,))
|
||||
self._conn.execute("DELETE FROM messages WHERE session_id = ?", (session_id,))
|
||||
self._conn.execute("DELETE FROM sessions WHERE id = ?", (session_id,))
|
||||
self._conn.commit()
|
||||
return True
|
||||
return self._execute_write(_do)
|
||||
|
||||
def prune_sessions(self, older_than_days: int = 90, source: str = None) -> int:
|
||||
"""
|
||||
Delete sessions older than N days. Returns count of deleted sessions.
|
||||
Only prunes ended sessions (not active ones).
|
||||
"""
|
||||
cutoff = time.time() - (older_than_days * 86400)
|
||||
import time as _time
|
||||
cutoff = _time.time() - (older_than_days * 86400)
|
||||
|
||||
def _do(conn):
|
||||
with self._lock:
|
||||
if source:
|
||||
cursor = conn.execute(
|
||||
cursor = self._conn.execute(
|
||||
"""SELECT id FROM sessions
|
||||
WHERE started_at < ? AND ended_at IS NOT NULL AND source = ?""",
|
||||
(cutoff, source),
|
||||
)
|
||||
else:
|
||||
cursor = conn.execute(
|
||||
cursor = self._conn.execute(
|
||||
"SELECT id FROM sessions WHERE started_at < ? AND ended_at IS NOT NULL",
|
||||
(cutoff,),
|
||||
)
|
||||
session_ids = [row["id"] for row in cursor.fetchall()]
|
||||
|
||||
for sid in session_ids:
|
||||
conn.execute("DELETE FROM messages WHERE session_id = ?", (sid,))
|
||||
conn.execute("DELETE FROM sessions WHERE id = ?", (sid,))
|
||||
return len(session_ids)
|
||||
self._conn.execute("DELETE FROM messages WHERE session_id = ?", (sid,))
|
||||
self._conn.execute("DELETE FROM sessions WHERE id = ?", (sid,))
|
||||
|
||||
return self._execute_write(_do)
|
||||
self._conn.commit()
|
||||
return len(session_ids)
|
||||
|
||||
@@ -270,7 +270,7 @@ def cmd_status(args) -> None:
|
||||
print(f" {peer}: {mode}")
|
||||
print(f" Write freq: {hcfg.write_frequency}")
|
||||
|
||||
if hcfg.enabled and (hcfg.api_key or hcfg.base_url):
|
||||
if hcfg.enabled and hcfg.api_key:
|
||||
print("\n Connection... ", end="", flush=True)
|
||||
try:
|
||||
get_honcho_client(hcfg)
|
||||
@@ -278,7 +278,7 @@ def cmd_status(args) -> None:
|
||||
except Exception as e:
|
||||
print(f"FAILED ({e})\n")
|
||||
else:
|
||||
reason = "disabled" if not hcfg.enabled else "no API key or base URL"
|
||||
reason = "disabled" if not hcfg.enabled else "no API key"
|
||||
print(f"\n Not connected ({reason})\n")
|
||||
|
||||
|
||||
|
||||
@@ -417,18 +417,9 @@ def get_honcho_client(config: HonchoClientConfig | None = None) -> Honcho:
|
||||
else:
|
||||
logger.info("Initializing Honcho client (host: %s, workspace: %s)", config.host, config.workspace_id)
|
||||
|
||||
# Local Honcho instances don't require an API key, but the SDK
|
||||
# expects a non-empty string. Use a placeholder for local URLs.
|
||||
_is_local = resolved_base_url and (
|
||||
"localhost" in resolved_base_url
|
||||
or "127.0.0.1" in resolved_base_url
|
||||
or "::1" in resolved_base_url
|
||||
)
|
||||
effective_api_key = config.api_key or ("local" if _is_local else None)
|
||||
|
||||
kwargs: dict = {
|
||||
"workspace_id": config.workspace_id,
|
||||
"api_key": effective_api_key,
|
||||
"api_key": config.api_key,
|
||||
"environment": config.environment,
|
||||
}
|
||||
if resolved_base_url:
|
||||
|
||||
+8
-46
@@ -10,12 +10,6 @@
|
||||
# container recreation. Environment variables are written to $HERMES_HOME/.env
|
||||
# and read by hermes at startup — no container recreation needed for env changes.
|
||||
#
|
||||
# Tool resolution: the hermes wrapper uses --suffix PATH for nix store tools,
|
||||
# so apt/uv-installed versions take priority. The container entrypoint provisions
|
||||
# extensible tools on first boot: nodejs/npm via apt, uv via curl, and a Python
|
||||
# 3.11 venv (bootstrapped entirely by uv) at ~/.venv with pip seeded. Agents get
|
||||
# writable tool prefixes for npm i -g, pip install, uv tool install, etc.
|
||||
#
|
||||
# Usage:
|
||||
# services.hermes-agent = {
|
||||
# enable = true;
|
||||
@@ -111,52 +105,22 @@
|
||||
fi
|
||||
mkdir -p "$TARGET_HOME"
|
||||
chown "$HERMES_UID:$HERMES_GID" "$TARGET_HOME"
|
||||
chmod 0750 "$TARGET_HOME"
|
||||
|
||||
# Ensure HERMES_HOME is owned by the target user
|
||||
if [ -n "''${HERMES_HOME:-}" ] && [ -d "$HERMES_HOME" ]; then
|
||||
chown -R "$HERMES_UID:$HERMES_GID" "$HERMES_HOME"
|
||||
fi
|
||||
|
||||
# ── Provision apt packages (first boot only, cached in writable layer) ──
|
||||
# sudo: agent self-modification
|
||||
# nodejs/npm: writable node so npm i -g works (nix store copies are read-only)
|
||||
# curl: needed for uv installer
|
||||
if [ ! -f /var/lib/hermes-tools-provisioned ] && command -v apt-get >/dev/null 2>&1; then
|
||||
echo "First boot: provisioning agent tools..."
|
||||
apt-get update -qq
|
||||
apt-get install -y -qq sudo nodejs npm curl
|
||||
touch /var/lib/hermes-tools-provisioned
|
||||
# Install sudo on Debian/Ubuntu if missing (first boot only, cached in writable layer)
|
||||
if command -v apt-get >/dev/null 2>&1 && ! command -v sudo >/dev/null 2>&1; then
|
||||
apt-get update -qq >/dev/null 2>&1 && apt-get install -y -qq sudo >/dev/null 2>&1 || true
|
||||
fi
|
||||
|
||||
if command -v sudo >/dev/null 2>&1 && [ ! -f /etc/sudoers.d/hermes ]; then
|
||||
mkdir -p /etc/sudoers.d
|
||||
echo "$TARGET_USER ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/hermes
|
||||
chmod 0440 /etc/sudoers.d/hermes
|
||||
fi
|
||||
|
||||
# uv (Python manager) — not in Ubuntu repos, retry-safe outside the sentinel
|
||||
if ! command -v uv >/dev/null 2>&1 && [ ! -x "$TARGET_HOME/.local/bin/uv" ] && command -v curl >/dev/null 2>&1; then
|
||||
su -s /bin/sh "$TARGET_USER" -c 'curl -LsSf https://astral.sh/uv/install.sh | sh' || true
|
||||
fi
|
||||
|
||||
# Python 3.11 venv — gives the agent a writable Python with pip.
|
||||
# Uses uv to install Python 3.11 (Ubuntu 24.04 ships 3.12).
|
||||
# --seed includes pip/setuptools so bare `pip install` works.
|
||||
_UV_BIN="$TARGET_HOME/.local/bin/uv"
|
||||
if [ ! -d "$TARGET_HOME/.venv" ] && [ -x "$_UV_BIN" ]; then
|
||||
su -s /bin/sh "$TARGET_USER" -c "
|
||||
export PATH=\"\$HOME/.local/bin:\$PATH\"
|
||||
uv python install 3.11
|
||||
uv venv --python 3.11 --seed \"\$HOME/.venv\"
|
||||
" || true
|
||||
fi
|
||||
|
||||
# Put the agent venv first on PATH so python/pip resolve to writable copies
|
||||
if [ -d "$TARGET_HOME/.venv/bin" ]; then
|
||||
export PATH="$TARGET_HOME/.venv/bin:$PATH"
|
||||
fi
|
||||
|
||||
if command -v setpriv >/dev/null 2>&1; then
|
||||
exec setpriv --reuid="$HERMES_UID" --regid="$HERMES_GID" --init-groups "$@"
|
||||
elif command -v su >/dev/null 2>&1; then
|
||||
@@ -552,8 +516,8 @@
|
||||
# ── Directories ───────────────────────────────────────────────────
|
||||
{
|
||||
systemd.tmpfiles.rules = [
|
||||
"d ${cfg.stateDir} 0750 ${cfg.user} ${cfg.group} - -"
|
||||
"d ${cfg.stateDir}/.hermes 0750 ${cfg.user} ${cfg.group} - -"
|
||||
"d ${cfg.stateDir} 0755 ${cfg.user} ${cfg.group} - -"
|
||||
"d ${cfg.stateDir}/.hermes 0755 ${cfg.user} ${cfg.group} - -"
|
||||
"d ${cfg.stateDir}/home 0750 ${cfg.user} ${cfg.group} - -"
|
||||
"d ${cfg.workingDirectory} 0750 ${cfg.user} ${cfg.group} - -"
|
||||
];
|
||||
@@ -567,23 +531,21 @@
|
||||
mkdir -p ${cfg.stateDir}/home
|
||||
mkdir -p ${cfg.workingDirectory}
|
||||
chown ${cfg.user}:${cfg.group} ${cfg.stateDir} ${cfg.stateDir}/.hermes ${cfg.stateDir}/home ${cfg.workingDirectory}
|
||||
chmod 0750 ${cfg.stateDir} ${cfg.stateDir}/.hermes ${cfg.stateDir}/home ${cfg.workingDirectory}
|
||||
|
||||
# Merge Nix settings into existing config.yaml.
|
||||
# Preserves user-added keys (skills, streaming, etc.); Nix keys win.
|
||||
# If configFile is user-provided (not generated), overwrite instead of merge.
|
||||
${if cfg.configFile != null then ''
|
||||
install -o ${cfg.user} -g ${cfg.group} -m 0640 -D ${configFile} ${cfg.stateDir}/.hermes/config.yaml
|
||||
install -o ${cfg.user} -g ${cfg.group} -m 0644 -D ${configFile} ${cfg.stateDir}/.hermes/config.yaml
|
||||
'' else ''
|
||||
${configMergeScript} ${generatedConfigFile} ${cfg.stateDir}/.hermes/config.yaml
|
||||
chown ${cfg.user}:${cfg.group} ${cfg.stateDir}/.hermes/config.yaml
|
||||
chmod 0640 ${cfg.stateDir}/.hermes/config.yaml
|
||||
chmod 0644 ${cfg.stateDir}/.hermes/config.yaml
|
||||
''}
|
||||
|
||||
# Managed mode marker (so interactive shells also detect NixOS management)
|
||||
touch ${cfg.stateDir}/.hermes/.managed
|
||||
chown ${cfg.user}:${cfg.group} ${cfg.stateDir}/.hermes/.managed
|
||||
chmod 0644 ${cfg.stateDir}/.hermes/.managed
|
||||
|
||||
# Seed auth file if provided
|
||||
${lib.optionalString (cfg.authFile != null) ''
|
||||
@@ -615,7 +577,7 @@ HERMES_NIX_ENV_EOF
|
||||
|
||||
# Link documents into workspace
|
||||
${lib.concatStringsSep "\n" (lib.mapAttrsToList (name: _value: ''
|
||||
install -o ${cfg.user} -g ${cfg.group} -m 0640 ${documentDerivation}/${name} ${cfg.workingDirectory}/${name}
|
||||
install -o ${cfg.user} -g ${cfg.group} -m 0644 ${documentDerivation}/${name} ${cfg.workingDirectory}/${name}
|
||||
'') cfg.documents)}
|
||||
'';
|
||||
}
|
||||
|
||||
+1
-1
@@ -35,7 +35,7 @@
|
||||
|
||||
${pkgs.lib.concatMapStringsSep "\n" (name: ''
|
||||
makeWrapper ${hermesVenv}/bin/${name} $out/bin/${name} \
|
||||
--suffix PATH : "${runtimePath}" \
|
||||
--prefix PATH : "${runtimePath}" \
|
||||
--set HERMES_BUNDLED_SKILLS $out/share/hermes-agent/skills
|
||||
'') [ "hermes" "hermes-agent" "hermes-acp" ]}
|
||||
|
||||
|
||||
+3
-4
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
|
||||
|
||||
[project]
|
||||
name = "hermes-agent"
|
||||
version = "0.5.0"
|
||||
version = "0.4.0"
|
||||
description = "The self-improving AI agent — creates skills from experience, improves them during use, and runs anywhere"
|
||||
readme = "README.md"
|
||||
requires-python = ">=3.11"
|
||||
@@ -26,7 +26,6 @@ dependencies = [
|
||||
# Interactive CLI (prompt_toolkit is used directly by cli.py)
|
||||
"prompt_toolkit>=3.0.52,<4",
|
||||
# Tools
|
||||
"exa-py>=2.9.0,<3",
|
||||
"firecrawl-py>=4.16.0,<5",
|
||||
"parallel-web>=0.4.2,<1",
|
||||
"fal-client>=0.13.1,<1",
|
||||
@@ -38,7 +37,7 @@ dependencies = [
|
||||
]
|
||||
|
||||
[project.optional-dependencies]
|
||||
modal = ["modal>=1.0.0,<2"]
|
||||
modal = ["swe-rex[modal]>=1.4.0,<2"]
|
||||
daytona = ["daytona>=0.148.0,<1"]
|
||||
dev = ["pytest>=9.0.2,<10", "pytest-asyncio>=1.3.0,<2", "pytest-xdist>=3.0,<4", "mcp>=1.2.0,<2"]
|
||||
messaging = ["python-telegram-bot>=22.6,<23", "discord.py[voice]>=2.7.1,<3", "aiohttp>=3.13.3,<4", "slack-bolt>=1.18.0,<2", "slack-sdk>=3.27.0,<4"]
|
||||
@@ -56,7 +55,7 @@ honcho = ["honcho-ai>=2.0.1,<3"]
|
||||
mcp = ["mcp>=1.2.0,<2"]
|
||||
homeassistant = ["aiohttp>=3.9.0,<4"]
|
||||
sms = ["aiohttp>=3.9.0,<4"]
|
||||
acp = ["agent-client-protocol>=0.8.1,<0.9"]
|
||||
acp = ["agent-client-protocol>=0.8.1,<1.0"]
|
||||
dingtalk = ["dingtalk-stream>=0.1.0,<1"]
|
||||
rl = [
|
||||
"atroposlib @ git+https://github.com/NousResearch/atropos.git",
|
||||
|
||||
+31
-532
@@ -62,12 +62,7 @@ else:
|
||||
|
||||
|
||||
# Import our tool system
|
||||
from model_tools import (
|
||||
get_tool_definitions,
|
||||
get_toolset_for_tool,
|
||||
handle_function_call,
|
||||
check_toolset_requirements,
|
||||
)
|
||||
from model_tools import get_tool_definitions, handle_function_call, check_toolset_requirements
|
||||
from tools.terminal_tool import cleanup_vm
|
||||
from tools.interrupt import set_interrupt as _set_interrupt
|
||||
from tools.browser_tool import cleanup_browser
|
||||
@@ -88,7 +83,7 @@ from agent.model_metadata import (
|
||||
)
|
||||
from agent.context_compressor import ContextCompressor
|
||||
from agent.prompt_caching import apply_anthropic_cache_control
|
||||
from agent.prompt_builder import build_skills_system_prompt, build_context_files_prompt, load_soul_md, TOOL_USE_ENFORCEMENT_GUIDANCE, TOOL_USE_ENFORCEMENT_MODELS
|
||||
from agent.prompt_builder import build_skills_system_prompt, build_context_files_prompt, load_soul_md
|
||||
from agent.usage_pricing import estimate_usage_cost, normalize_usage
|
||||
from agent.display import (
|
||||
KawaiiSpinner, build_tool_preview as _build_tool_preview,
|
||||
@@ -361,85 +356,6 @@ def _inject_honcho_turn_context(content, turn_context: str):
|
||||
return f"{text}\n\n{note}"
|
||||
|
||||
|
||||
# Budget warning text patterns injected by _get_budget_warning().
|
||||
_BUDGET_WARNING_RE = re.compile(
|
||||
r"\[BUDGET(?:\s+WARNING)?:\s+Iteration\s+\d+/\d+\..*?\]",
|
||||
re.DOTALL,
|
||||
)
|
||||
|
||||
|
||||
# Regex to match lone surrogate code points (U+D800..U+DFFF).
|
||||
# These are invalid in UTF-8 and cause UnicodeEncodeError when the OpenAI SDK
|
||||
# serialises messages to JSON. Common source: clipboard paste from Google Docs
|
||||
# or other rich-text editors on some platforms.
|
||||
_SURROGATE_RE = re.compile(r'[\ud800-\udfff]')
|
||||
|
||||
|
||||
def _sanitize_surrogates(text: str) -> str:
|
||||
"""Replace lone surrogate code points with U+FFFD (replacement character).
|
||||
|
||||
Surrogates are invalid in UTF-8 and will crash ``json.dumps()`` inside the
|
||||
OpenAI SDK. This is a fast no-op when the text contains no surrogates.
|
||||
"""
|
||||
if _SURROGATE_RE.search(text):
|
||||
return _SURROGATE_RE.sub('\ufffd', text)
|
||||
return text
|
||||
|
||||
|
||||
def _sanitize_messages_surrogates(messages: list) -> bool:
|
||||
"""Sanitize surrogate characters from all string content in a messages list.
|
||||
|
||||
Walks message dicts in-place. Returns True if any surrogates were found
|
||||
and replaced, False otherwise.
|
||||
"""
|
||||
found = False
|
||||
for msg in messages:
|
||||
if not isinstance(msg, dict):
|
||||
continue
|
||||
content = msg.get("content")
|
||||
if isinstance(content, str) and _SURROGATE_RE.search(content):
|
||||
msg["content"] = _SURROGATE_RE.sub('\ufffd', content)
|
||||
found = True
|
||||
elif isinstance(content, list):
|
||||
for part in content:
|
||||
if isinstance(part, dict):
|
||||
text = part.get("text")
|
||||
if isinstance(text, str) and _SURROGATE_RE.search(text):
|
||||
part["text"] = _SURROGATE_RE.sub('\ufffd', text)
|
||||
found = True
|
||||
return found
|
||||
|
||||
|
||||
def _strip_budget_warnings_from_history(messages: list) -> None:
|
||||
"""Remove budget pressure warnings from tool-result messages in-place.
|
||||
|
||||
Budget warnings are turn-scoped signals that must not leak into replayed
|
||||
history. They live in tool-result ``content`` either as a JSON key
|
||||
(``_budget_warning``) or appended plain text.
|
||||
"""
|
||||
for msg in messages:
|
||||
if not isinstance(msg, dict) or msg.get("role") != "tool":
|
||||
continue
|
||||
content = msg.get("content")
|
||||
if not isinstance(content, str) or "_budget_warning" not in content and "[BUDGET" not in content:
|
||||
continue
|
||||
|
||||
# Try JSON first (the common case: _budget_warning key in a dict)
|
||||
try:
|
||||
parsed = json.loads(content)
|
||||
if isinstance(parsed, dict) and "_budget_warning" in parsed:
|
||||
del parsed["_budget_warning"]
|
||||
msg["content"] = json.dumps(parsed, ensure_ascii=False)
|
||||
continue
|
||||
except (json.JSONDecodeError, TypeError):
|
||||
pass
|
||||
|
||||
# Fallback: strip the text pattern from plain-text tool results
|
||||
cleaned = _BUDGET_WARNING_RE.sub("", content).strip()
|
||||
if cleaned != content:
|
||||
msg["content"] = cleaned
|
||||
|
||||
|
||||
class AIAgent:
|
||||
"""
|
||||
AI Agent with tool calling capabilities.
|
||||
@@ -618,7 +534,6 @@ class AIAgent:
|
||||
self.tool_progress_callback = tool_progress_callback
|
||||
self.thinking_callback = thinking_callback
|
||||
self.reasoning_callback = reasoning_callback
|
||||
self._reasoning_deltas_fired = False # Set by _fire_reasoning_delta, reset per API call
|
||||
self.clarify_callback = clarify_callback
|
||||
self.step_callback = step_callback
|
||||
self.stream_delta_callback = stream_delta_callback
|
||||
@@ -861,25 +776,6 @@ class AIAgent:
|
||||
}
|
||||
|
||||
self._client_kwargs = client_kwargs # stored for rebuilding after interrupt
|
||||
|
||||
# Enable fine-grained tool streaming for Claude on OpenRouter.
|
||||
# Without this, Anthropic buffers the entire tool call and goes
|
||||
# silent for minutes while thinking — OpenRouter's upstream proxy
|
||||
# times out during the silence. The beta header makes Anthropic
|
||||
# stream tool call arguments token-by-token, keeping the
|
||||
# connection alive.
|
||||
_effective_base = str(client_kwargs.get("base_url", "")).lower()
|
||||
if "openrouter" in _effective_base and "claude" in (self.model or "").lower():
|
||||
headers = client_kwargs.get("default_headers") or {}
|
||||
existing_beta = headers.get("x-anthropic-beta", "")
|
||||
_FINE_GRAINED = "fine-grained-tool-streaming-2025-05-14"
|
||||
if _FINE_GRAINED not in existing_beta:
|
||||
if existing_beta:
|
||||
headers["x-anthropic-beta"] = f"{existing_beta},{_FINE_GRAINED}"
|
||||
else:
|
||||
headers["x-anthropic-beta"] = _FINE_GRAINED
|
||||
client_kwargs["default_headers"] = headers
|
||||
|
||||
self.api_key = client_kwargs.get("api_key", "")
|
||||
try:
|
||||
self.client = self._create_openai_client(client_kwargs, reason="agent_init", shared=True)
|
||||
@@ -1084,8 +980,8 @@ class AIAgent:
|
||||
else:
|
||||
if not hcfg.enabled:
|
||||
logger.debug("Honcho disabled in global config")
|
||||
elif not (hcfg.api_key or hcfg.base_url):
|
||||
logger.debug("Honcho enabled but no API key or base URL configured")
|
||||
elif not hcfg.api_key:
|
||||
logger.debug("Honcho enabled but no API key configured")
|
||||
else:
|
||||
logger.debug("Honcho enabled but missing API key or disabled in config")
|
||||
except Exception as e:
|
||||
@@ -1122,13 +1018,6 @@ class AIAgent:
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Tool-use enforcement config: "auto" (default — matches hardcoded
|
||||
# model list), true (always), false (never), or list of substrings.
|
||||
_agent_section = _agent_cfg.get("agent", {})
|
||||
if not isinstance(_agent_section, dict):
|
||||
_agent_section = {}
|
||||
self._tool_use_enforcement = _agent_section.get("tool_use_enforcement", "auto")
|
||||
|
||||
# Initialize context compressor for automatic context management
|
||||
# Compresses conversation when approaching model's context limit
|
||||
# Configuration via config.yaml (compression section)
|
||||
@@ -2166,23 +2055,6 @@ class AIAgent:
|
||||
msg["content"] = self._clean_session_content(msg["content"])
|
||||
cleaned.append(msg)
|
||||
|
||||
# Guard: never overwrite a larger session log with fewer messages.
|
||||
# This protects against data loss when --resume loads a session whose
|
||||
# messages weren't fully written to SQLite — the resumed agent starts
|
||||
# with partial history and would otherwise clobber the full JSON log.
|
||||
if self.session_log_file.exists():
|
||||
try:
|
||||
existing = json.loads(self.session_log_file.read_text(encoding="utf-8"))
|
||||
existing_count = existing.get("message_count", len(existing.get("messages", [])))
|
||||
if existing_count > len(cleaned):
|
||||
logging.debug(
|
||||
"Skipping session log overwrite: existing has %d messages, current has %d",
|
||||
existing_count, len(cleaned),
|
||||
)
|
||||
return
|
||||
except Exception:
|
||||
pass # corrupted existing file — allow the overwrite
|
||||
|
||||
entry = {
|
||||
"session_id": self.session_id,
|
||||
"model": self.model,
|
||||
@@ -2292,14 +2164,8 @@ class AIAgent:
|
||||
# ── Honcho integration helpers ──
|
||||
|
||||
def _honcho_should_activate(self, hcfg) -> bool:
|
||||
"""Return True when Honcho should be active.
|
||||
|
||||
Self-hosted Honcho may be configured with a base_url and no API key,
|
||||
so activation should accept either credential style.
|
||||
"""
|
||||
if not hcfg or not hcfg.enabled:
|
||||
return False
|
||||
if not (hcfg.api_key or hcfg.base_url):
|
||||
"""Return True when remote Honcho should be active."""
|
||||
if not hcfg or not hcfg.enabled or not hcfg.api_key:
|
||||
return False
|
||||
return True
|
||||
|
||||
@@ -2565,30 +2431,6 @@ class AIAgent:
|
||||
if tool_guidance:
|
||||
prompt_parts.append(" ".join(tool_guidance))
|
||||
|
||||
# Tool-use enforcement: tells the model to actually call tools instead
|
||||
# of describing intended actions. Controlled by config.yaml
|
||||
# agent.tool_use_enforcement:
|
||||
# "auto" (default) — matches TOOL_USE_ENFORCEMENT_MODELS
|
||||
# true — always inject (all models)
|
||||
# false — never inject
|
||||
# list — custom model-name substrings to match
|
||||
if self.valid_tool_names:
|
||||
_enforce = self._tool_use_enforcement
|
||||
_inject = False
|
||||
if _enforce is True or (isinstance(_enforce, str) and _enforce.lower() in ("true", "always", "yes", "on")):
|
||||
_inject = True
|
||||
elif _enforce is False or (isinstance(_enforce, str) and _enforce.lower() in ("false", "never", "no", "off")):
|
||||
_inject = False
|
||||
elif isinstance(_enforce, list):
|
||||
model_lower = (self.model or "").lower()
|
||||
_inject = any(p.lower() in model_lower for p in _enforce if isinstance(p, str))
|
||||
else:
|
||||
# "auto" or any unrecognised value — use hardcoded defaults
|
||||
model_lower = (self.model or "").lower()
|
||||
_inject = any(p in model_lower for p in TOOL_USE_ENFORCEMENT_MODELS)
|
||||
if _inject:
|
||||
prompt_parts.append(TOOL_USE_ENFORCEMENT_GUIDANCE)
|
||||
|
||||
# Honcho CLI awareness: tell Hermes about its own management commands
|
||||
# so it can refer the user to them rather than reinventing answers.
|
||||
if self._honcho and self._honcho_session_key:
|
||||
@@ -2661,13 +2503,7 @@ class AIAgent:
|
||||
|
||||
has_skills_tools = any(name in self.valid_tool_names for name in ['skills_list', 'skill_view', 'skill_manage'])
|
||||
if has_skills_tools:
|
||||
avail_toolsets = {
|
||||
toolset
|
||||
for toolset in (
|
||||
get_toolset_for_tool(tool_name) for tool_name in self.valid_tool_names
|
||||
)
|
||||
if toolset
|
||||
}
|
||||
avail_toolsets = {ts for ts, avail in check_toolset_requirements().items() if avail}
|
||||
skills_prompt = build_skills_system_prompt(
|
||||
available_tools=self.valid_tool_names,
|
||||
available_toolsets=avail_toolsets,
|
||||
@@ -3551,7 +3387,6 @@ class AIAgent:
|
||||
max_stream_retries = 1
|
||||
has_tool_calls = False
|
||||
first_delta_fired = False
|
||||
self._reasoning_deltas_fired = False
|
||||
for attempt in range(max_stream_retries + 1):
|
||||
try:
|
||||
with active_client.responses.stream(**api_kwargs) as stream:
|
||||
@@ -3828,7 +3663,6 @@ class AIAgent:
|
||||
|
||||
def _fire_reasoning_delta(self, text: str) -> None:
|
||||
"""Fire reasoning callback if registered."""
|
||||
self._reasoning_deltas_fired = True
|
||||
cb = self.reasoning_callback
|
||||
if cb is not None:
|
||||
try:
|
||||
@@ -3907,7 +3741,7 @@ class AIAgent:
|
||||
def _call_chat_completions():
|
||||
"""Stream a chat completions response."""
|
||||
import httpx as _httpx
|
||||
_base_timeout = float(os.getenv("HERMES_API_TIMEOUT", 1800.0))
|
||||
_base_timeout = float(os.getenv("HERMES_API_TIMEOUT", 900.0))
|
||||
_stream_read_timeout = float(os.getenv("HERMES_STREAM_READ_TIMEOUT", 60.0))
|
||||
stream_kwargs = {
|
||||
**api_kwargs,
|
||||
@@ -3923,28 +3757,16 @@ class AIAgent:
|
||||
request_client_holder["client"] = self._create_request_openai_client(
|
||||
reason="chat_completion_stream_request"
|
||||
)
|
||||
# Reset stale-stream timer so the detector measures from this
|
||||
# attempt's start, not a previous attempt's last chunk.
|
||||
last_chunk_time["t"] = time.time()
|
||||
stream = request_client_holder["client"].chat.completions.create(**stream_kwargs)
|
||||
|
||||
content_parts: list = []
|
||||
tool_calls_acc: dict = {}
|
||||
tool_gen_notified: set = set()
|
||||
# Ollama-compatible endpoints reuse index 0 for every tool call
|
||||
# in a parallel batch, distinguishing them only by id. Track
|
||||
# the last seen id per raw index so we can detect a new tool
|
||||
# call starting at the same index and redirect it to a fresh slot.
|
||||
_last_id_at_idx: dict = {} # raw_index -> last seen non-empty id
|
||||
_active_slot_by_idx: dict = {} # raw_index -> current slot in tool_calls_acc
|
||||
finish_reason = None
|
||||
model_name = None
|
||||
role = "assistant"
|
||||
reasoning_parts: list = []
|
||||
usage_obj = None
|
||||
# Reset per-call reasoning tracking so _build_assistant_message
|
||||
# knows whether reasoning was already displayed during streaming.
|
||||
self._reasoning_deltas_fired = False
|
||||
|
||||
for chunk in stream:
|
||||
last_chunk_time["t"] = time.time()
|
||||
@@ -3978,45 +3800,11 @@ class AIAgent:
|
||||
_fire_first_delta()
|
||||
self._fire_stream_delta(delta.content)
|
||||
deltas_were_sent["yes"] = True
|
||||
else:
|
||||
# Tool calls suppress regular content streaming (avoids
|
||||
# displaying chatty "I'll use the tool..." text alongside
|
||||
# tool calls). But reasoning tags embedded in suppressed
|
||||
# content should still reach the display — otherwise the
|
||||
# reasoning box only appears as a post-response fallback,
|
||||
# rendering it confusingly after the already-streamed
|
||||
# response. Route suppressed content through the stream
|
||||
# delta callback so its tag extraction can fire the
|
||||
# reasoning display. Non-reasoning text is harmlessly
|
||||
# suppressed by the CLI's _stream_delta when the stream
|
||||
# box is already closed (tool boundary flush).
|
||||
if self.stream_delta_callback:
|
||||
try:
|
||||
self.stream_delta_callback(delta.content)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Accumulate tool call deltas — notify display on first name
|
||||
if delta and delta.tool_calls:
|
||||
for tc_delta in delta.tool_calls:
|
||||
raw_idx = tc_delta.index if tc_delta.index is not None else 0
|
||||
delta_id = tc_delta.id or ""
|
||||
|
||||
# Ollama fix: detect a new tool call reusing the same
|
||||
# raw index (different id) and redirect to a fresh slot.
|
||||
if raw_idx not in _active_slot_by_idx:
|
||||
_active_slot_by_idx[raw_idx] = raw_idx
|
||||
if (
|
||||
delta_id
|
||||
and raw_idx in _last_id_at_idx
|
||||
and delta_id != _last_id_at_idx[raw_idx]
|
||||
):
|
||||
new_slot = max(tool_calls_acc, default=-1) + 1
|
||||
_active_slot_by_idx[raw_idx] = new_slot
|
||||
if delta_id:
|
||||
_last_id_at_idx[raw_idx] = delta_id
|
||||
idx = _active_slot_by_idx[raw_idx]
|
||||
|
||||
idx = tc_delta.index if tc_delta.index is not None else 0
|
||||
if idx not in tool_calls_acc:
|
||||
tool_calls_acc[idx] = {
|
||||
"id": tc_delta.id or "",
|
||||
@@ -4098,10 +3886,7 @@ class AIAgent:
|
||||
works unchanged.
|
||||
"""
|
||||
has_tool_use = False
|
||||
self._reasoning_deltas_fired = False
|
||||
|
||||
# Reset stale-stream timer for this attempt
|
||||
last_chunk_time["t"] = time.time()
|
||||
# Use the Anthropic SDK's streaming context manager
|
||||
with self._anthropic_client.messages.stream(**api_kwargs) as stream:
|
||||
for event in stream:
|
||||
@@ -4169,37 +3954,7 @@ class AIAgent:
|
||||
e, (_httpx.ConnectError, _httpx.RemoteProtocolError, ConnectionError)
|
||||
)
|
||||
|
||||
# SSE error events from proxies (e.g. OpenRouter sends
|
||||
# {"error":{"message":"Network connection lost."}}) are
|
||||
# raised as APIError by the OpenAI SDK. These are
|
||||
# semantically identical to httpx connection drops —
|
||||
# the upstream stream died — and should be retried with
|
||||
# a fresh connection. Distinguish from HTTP errors:
|
||||
# APIError from SSE has no status_code, while
|
||||
# APIStatusError (4xx/5xx) always has one.
|
||||
_is_sse_conn_err = False
|
||||
if not _is_timeout and not _is_conn_err:
|
||||
from openai import APIError as _APIError
|
||||
if isinstance(e, _APIError) and not getattr(e, "status_code", None):
|
||||
_err_lower_sse = str(e).lower()
|
||||
_SSE_CONN_PHRASES = (
|
||||
"connection lost",
|
||||
"connection reset",
|
||||
"connection closed",
|
||||
"connection terminated",
|
||||
"network error",
|
||||
"network connection",
|
||||
"terminated",
|
||||
"peer closed",
|
||||
"broken pipe",
|
||||
"upstream connect error",
|
||||
)
|
||||
_is_sse_conn_err = any(
|
||||
phrase in _err_lower_sse
|
||||
for phrase in _SSE_CONN_PHRASES
|
||||
)
|
||||
|
||||
if _is_timeout or _is_conn_err or _is_sse_conn_err:
|
||||
if _is_timeout or _is_conn_err:
|
||||
# Transient network / timeout error. Retry the
|
||||
# streaming request with a fresh connection first.
|
||||
if _stream_attempt < _max_stream_retries:
|
||||
@@ -4244,10 +3999,6 @@ class AIAgent:
|
||||
)
|
||||
|
||||
try:
|
||||
# Reset stale timer — the non-streaming fallback
|
||||
# uses its own client; prevent the stale detector
|
||||
# from firing on stale timestamps from failed streams.
|
||||
last_chunk_time["t"] = time.time()
|
||||
result["response"] = self._interruptible_api_call(api_kwargs)
|
||||
except Exception as fallback_err:
|
||||
result["error"] = fallback_err
|
||||
@@ -4257,19 +4008,7 @@ class AIAgent:
|
||||
if request_client is not None:
|
||||
self._close_request_openai_client(request_client, reason="stream_request_complete")
|
||||
|
||||
_stream_stale_timeout_base = float(os.getenv("HERMES_STREAM_STALE_TIMEOUT", 180.0))
|
||||
# Scale the stale timeout for large contexts: slow models (like Opus)
|
||||
# can legitimately think for minutes before producing the first token
|
||||
# when the context is large. Without this, the stale detector kills
|
||||
# healthy connections during the model's thinking phase, producing
|
||||
# spurious RemoteProtocolError ("peer closed connection").
|
||||
_est_tokens = sum(len(str(v)) for v in api_kwargs.get("messages", [])) // 4
|
||||
if _est_tokens > 100_000:
|
||||
_stream_stale_timeout = max(_stream_stale_timeout_base, 300.0)
|
||||
elif _est_tokens > 50_000:
|
||||
_stream_stale_timeout = max(_stream_stale_timeout_base, 240.0)
|
||||
else:
|
||||
_stream_stale_timeout = _stream_stale_timeout_base
|
||||
_stream_stale_timeout = float(os.getenv("HERMES_STREAM_STALE_TIMEOUT", 90.0))
|
||||
|
||||
t = threading.Thread(target=_call, daemon=True)
|
||||
t.start()
|
||||
@@ -4583,10 +4322,6 @@ class AIAgent:
|
||||
if self.api_mode == "anthropic_messages":
|
||||
from agent.anthropic_adapter import build_anthropic_kwargs
|
||||
anthropic_messages = self._prepare_anthropic_messages_for_api(api_messages)
|
||||
# Pass context_length so the adapter can clamp max_tokens if the
|
||||
# user configured a smaller context window than the model's output limit.
|
||||
ctx_len = getattr(self, "context_compressor", None)
|
||||
ctx_len = ctx_len.context_length if ctx_len else None
|
||||
return build_anthropic_kwargs(
|
||||
model=self.model,
|
||||
messages=anthropic_messages,
|
||||
@@ -4595,7 +4330,6 @@ class AIAgent:
|
||||
reasoning_config=self.reasoning_config,
|
||||
is_oauth=self._is_anthropic_oauth,
|
||||
preserve_dots=self._anthropic_preserve_dots(),
|
||||
context_length=ctx_len,
|
||||
)
|
||||
|
||||
if self.api_mode == "codex_responses":
|
||||
@@ -4707,25 +4441,11 @@ class AIAgent:
|
||||
"model": self.model,
|
||||
"messages": sanitized_messages,
|
||||
"tools": self.tools if self.tools else None,
|
||||
"timeout": float(os.getenv("HERMES_API_TIMEOUT", 1800.0)),
|
||||
"timeout": float(os.getenv("HERMES_API_TIMEOUT", 900.0)),
|
||||
}
|
||||
|
||||
if self.max_tokens is not None:
|
||||
api_kwargs.update(self._max_tokens_param(self.max_tokens))
|
||||
elif self._is_openrouter_url() and "claude" in (self.model or "").lower():
|
||||
# OpenRouter translates requests to Anthropic's Messages API,
|
||||
# which requires max_tokens as a mandatory field. When we omit
|
||||
# it, OpenRouter picks a default that can be too low — the model
|
||||
# spends its output budget on thinking and has almost nothing
|
||||
# left for the actual response (especially large tool calls like
|
||||
# write_file). Sending the model's real output limit ensures
|
||||
# full capacity. Other providers handle the default fine.
|
||||
try:
|
||||
from agent.anthropic_adapter import _get_anthropic_max_output
|
||||
_model_output_limit = _get_anthropic_max_output(self.model)
|
||||
api_kwargs["max_tokens"] = _model_output_limit
|
||||
except Exception:
|
||||
pass # fail open — let OpenRouter pick its default
|
||||
|
||||
extra_body = {}
|
||||
|
||||
@@ -4861,15 +4581,11 @@ class AIAgent:
|
||||
logging.debug(f"Captured reasoning ({len(reasoning_text)} chars): {reasoning_text}")
|
||||
|
||||
if reasoning_text and self.reasoning_callback:
|
||||
# Skip callback when streaming is active — reasoning was already
|
||||
# displayed during the stream via one of two paths:
|
||||
# (a) _fire_reasoning_delta (structured reasoning_content deltas)
|
||||
# (b) _stream_delta tag extraction (<think>/<REASONING_SCRATCHPAD>)
|
||||
# When streaming is NOT active, always fire so non-streaming modes
|
||||
# (gateway, batch, quiet) still get reasoning.
|
||||
# Any reasoning that wasn't shown during streaming is caught by the
|
||||
# CLI post-response display fallback (cli.py _reasoning_shown_this_turn).
|
||||
if not self.stream_delta_callback:
|
||||
# Skip callback for <think>-extracted reasoning when streaming is active.
|
||||
# _stream_delta() already displayed <think> blocks during streaming;
|
||||
# firing the callback again would cause duplicate display.
|
||||
# Structured reasoning (from reasoning_content field) always fires.
|
||||
if _from_structured or not self.stream_delta_callback:
|
||||
try:
|
||||
self.reasoning_callback(reasoning_text)
|
||||
except Exception:
|
||||
@@ -6007,14 +5723,6 @@ class AIAgent:
|
||||
# Installed once, transparent when streams are healthy, prevents crash on write.
|
||||
_install_safe_stdio()
|
||||
|
||||
# Sanitize surrogate characters from user input. Clipboard paste from
|
||||
# rich-text editors (Google Docs, Word, etc.) can inject lone surrogates
|
||||
# that are invalid UTF-8 and crash JSON serialization in the OpenAI SDK.
|
||||
if isinstance(user_message, str):
|
||||
user_message = _sanitize_surrogates(user_message)
|
||||
if isinstance(persist_user_message, str):
|
||||
persist_user_message = _sanitize_surrogates(persist_user_message)
|
||||
|
||||
# Store stream callback for _interruptible_api_call to pick up
|
||||
self._stream_callback = stream_callback
|
||||
self._persist_user_message_idx = None
|
||||
@@ -6031,7 +5739,6 @@ class AIAgent:
|
||||
self._codex_incomplete_retries = 0
|
||||
self._last_content_with_tools = None
|
||||
self._mute_post_response = False
|
||||
self._surrogate_sanitized = False
|
||||
# NOTE: _turns_since_memory and _iters_since_skill are NOT reset here.
|
||||
# They are initialized in __init__ and must persist across run_conversation
|
||||
# calls so that nudge logic accumulates correctly in CLI mode.
|
||||
@@ -6039,14 +5746,6 @@ class AIAgent:
|
||||
|
||||
# Initialize conversation (copy to avoid mutating the caller's list)
|
||||
messages = list(conversation_history) if conversation_history else []
|
||||
|
||||
# Strip budget pressure warnings from previous turns. These are
|
||||
# turn-scoped signals injected by _get_budget_warning() into tool
|
||||
# result content. If left in the replayed history, models (especially
|
||||
# GPT-family) interpret them as still-active instructions and avoid
|
||||
# making tool calls in ALL subsequent turns.
|
||||
if messages:
|
||||
_strip_budget_warnings_from_history(messages)
|
||||
|
||||
# Hydrate todo store from conversation history (gateway creates a fresh
|
||||
# AIAgent per message, so the in-memory store is empty -- we need to
|
||||
@@ -6142,22 +5841,6 @@ class AIAgent:
|
||||
self._cached_system_prompt = (
|
||||
self._cached_system_prompt + "\n\n" + self._honcho_context
|
||||
).strip()
|
||||
|
||||
# Plugin hook: on_session_start
|
||||
# Fired once when a brand-new session is created (not on
|
||||
# continuation). Plugins can use this to initialise
|
||||
# session-scoped state (e.g. warm a memory cache).
|
||||
try:
|
||||
from hermes_cli.plugins import invoke_hook as _invoke_hook
|
||||
_invoke_hook(
|
||||
"on_session_start",
|
||||
session_id=self.session_id,
|
||||
model=self.model,
|
||||
platform=getattr(self, "platform", None) or "",
|
||||
)
|
||||
except Exception as exc:
|
||||
logger.warning("on_session_start hook failed: %s", exc)
|
||||
|
||||
# Store the system prompt snapshot in SQLite
|
||||
if self._session_db:
|
||||
try:
|
||||
@@ -6219,34 +5902,6 @@ class AIAgent:
|
||||
if _preflight_tokens < self.context_compressor.threshold_tokens:
|
||||
break # Under threshold
|
||||
|
||||
# Plugin hook: pre_llm_call
|
||||
# Fired once per turn before the tool-calling loop. Plugins can
|
||||
# return a dict with a ``context`` key whose value is a string
|
||||
# that will be appended to the ephemeral system prompt for every
|
||||
# API call in this turn (not persisted to session DB or cache).
|
||||
_plugin_turn_context = ""
|
||||
try:
|
||||
from hermes_cli.plugins import invoke_hook as _invoke_hook
|
||||
_pre_results = _invoke_hook(
|
||||
"pre_llm_call",
|
||||
session_id=self.session_id,
|
||||
user_message=original_user_message,
|
||||
conversation_history=list(messages),
|
||||
is_first_turn=(not bool(conversation_history)),
|
||||
model=self.model,
|
||||
platform=getattr(self, "platform", None) or "",
|
||||
)
|
||||
_ctx_parts = []
|
||||
for r in _pre_results:
|
||||
if isinstance(r, dict) and r.get("context"):
|
||||
_ctx_parts.append(str(r["context"]))
|
||||
elif isinstance(r, str) and r.strip():
|
||||
_ctx_parts.append(r)
|
||||
if _ctx_parts:
|
||||
_plugin_turn_context = "\n\n".join(_ctx_parts)
|
||||
except Exception as exc:
|
||||
logger.warning("pre_llm_call hook failed: %s", exc)
|
||||
|
||||
# Main conversation loop
|
||||
api_call_count = 0
|
||||
final_response = None
|
||||
@@ -6344,9 +5999,6 @@ class AIAgent:
|
||||
effective_system = active_system_prompt or ""
|
||||
if self.ephemeral_system_prompt:
|
||||
effective_system = (effective_system + "\n\n" + self.ephemeral_system_prompt).strip()
|
||||
# Plugin context from pre_llm_call hooks — ephemeral, not cached.
|
||||
if _plugin_turn_context:
|
||||
effective_system = (effective_system + "\n\n" + _plugin_turn_context).strip()
|
||||
if effective_system:
|
||||
api_messages = [{"role": "system", "content": effective_system}] + api_messages
|
||||
|
||||
@@ -6623,62 +6275,6 @@ class AIAgent:
|
||||
if finish_reason == "length":
|
||||
self._vprint(f"{self.log_prefix}⚠️ Response truncated (finish_reason='length') - model hit max output tokens", force=True)
|
||||
|
||||
# ── Detect thinking-budget exhaustion ──────────────
|
||||
# When the model spends ALL output tokens on reasoning
|
||||
# and has none left for the response, continuation
|
||||
# retries are pointless. Detect this early and give a
|
||||
# targeted error instead of wasting 3 API calls.
|
||||
_trunc_content = None
|
||||
if self.api_mode == "chat_completions":
|
||||
_trunc_msg = response.choices[0].message if (hasattr(response, "choices") and response.choices) else None
|
||||
_trunc_content = getattr(_trunc_msg, "content", None) if _trunc_msg else None
|
||||
elif self.api_mode == "anthropic_messages":
|
||||
# Anthropic response.content is a list of blocks
|
||||
_text_parts = []
|
||||
for _blk in getattr(response, "content", []):
|
||||
if getattr(_blk, "type", None) == "text":
|
||||
_text_parts.append(getattr(_blk, "text", ""))
|
||||
_trunc_content = "\n".join(_text_parts) if _text_parts else None
|
||||
|
||||
_thinking_exhausted = (
|
||||
_trunc_content is not None
|
||||
and not self._has_content_after_think_block(_trunc_content)
|
||||
) or _trunc_content is None
|
||||
|
||||
if _thinking_exhausted:
|
||||
_exhaust_error = (
|
||||
"Model used all output tokens on reasoning with none left "
|
||||
"for the response. Try lowering reasoning effort or "
|
||||
"increasing max_tokens."
|
||||
)
|
||||
self._vprint(
|
||||
f"{self.log_prefix}💭 Reasoning exhausted the output token budget — "
|
||||
f"no visible response was produced.",
|
||||
force=True,
|
||||
)
|
||||
# Return a user-friendly message as the response so
|
||||
# CLI (response box) and gateway (chat message) both
|
||||
# display it naturally instead of a suppressed error.
|
||||
_exhaust_response = (
|
||||
"⚠️ **Thinking Budget Exhausted**\n\n"
|
||||
"The model used all its output tokens on reasoning "
|
||||
"and had none left for the actual response.\n\n"
|
||||
"To fix this:\n"
|
||||
"→ Lower reasoning effort: `/thinkon low` or `/thinkon minimal`\n"
|
||||
"→ Increase the output token limit: "
|
||||
"set `model.max_tokens` in config.yaml"
|
||||
)
|
||||
self._cleanup_task_resources(effective_task_id)
|
||||
self._persist_session(messages, conversation_history)
|
||||
return {
|
||||
"final_response": _exhaust_response,
|
||||
"messages": messages,
|
||||
"api_calls": api_call_count,
|
||||
"completed": False,
|
||||
"partial": True,
|
||||
"error": _exhaust_error,
|
||||
}
|
||||
|
||||
if self.api_mode == "chat_completions":
|
||||
assistant_message = response.choices[0].message
|
||||
if not assistant_message.tool_calls:
|
||||
@@ -6867,24 +6463,6 @@ class AIAgent:
|
||||
if self.thinking_callback:
|
||||
self.thinking_callback("")
|
||||
|
||||
# -----------------------------------------------------------
|
||||
# Surrogate character recovery. UnicodeEncodeError happens
|
||||
# when the messages contain lone surrogates (U+D800..U+DFFF)
|
||||
# that are invalid UTF-8. Common source: clipboard paste
|
||||
# from Google Docs or similar rich-text editors. We sanitize
|
||||
# the entire messages list in-place and retry once.
|
||||
# -----------------------------------------------------------
|
||||
if isinstance(api_error, UnicodeEncodeError) and not getattr(self, '_surrogate_sanitized', False):
|
||||
self._surrogate_sanitized = True
|
||||
if _sanitize_messages_surrogates(messages):
|
||||
self._vprint(
|
||||
f"{self.log_prefix}⚠️ Stripped invalid surrogate characters from messages. Retrying...",
|
||||
force=True,
|
||||
)
|
||||
continue
|
||||
# Surrogates weren't in messages — might be in system
|
||||
# prompt or prefill. Fall through to normal error path.
|
||||
|
||||
status_code = getattr(api_error, "status_code", None)
|
||||
if (
|
||||
self.api_mode == "codex_responses"
|
||||
@@ -7153,13 +6731,8 @@ class AIAgent:
|
||||
# 529 (Anthropic overloaded) is also transient.
|
||||
# Also catch local validation errors (ValueError, TypeError) — these
|
||||
# are programming bugs, not transient failures.
|
||||
# Exclude UnicodeEncodeError — it's a ValueError subclass but is
|
||||
# handled separately by the surrogate sanitization path above.
|
||||
_RETRYABLE_STATUS_CODES = {413, 429, 529}
|
||||
is_local_validation_error = (
|
||||
isinstance(api_error, (ValueError, TypeError))
|
||||
and not isinstance(api_error, UnicodeEncodeError)
|
||||
)
|
||||
is_local_validation_error = isinstance(api_error, (ValueError, TypeError))
|
||||
# Detect generic 400s from Anthropic OAuth (transient server-side failures).
|
||||
# Real invalid_request_error responses include a descriptive message;
|
||||
# transient ones contain only "Error" or are empty. (ref: issue #1608)
|
||||
@@ -7229,36 +6802,6 @@ class AIAgent:
|
||||
_final_summary = self._summarize_api_error(api_error)
|
||||
self._vprint(f"{self.log_prefix}❌ Max retries ({max_retries}) exceeded. Giving up.", force=True)
|
||||
self._vprint(f"{self.log_prefix} 💀 Final error: {_final_summary}", force=True)
|
||||
|
||||
# Detect SSE stream-drop pattern (e.g. "Network
|
||||
# connection lost") and surface actionable guidance.
|
||||
# This typically happens when the model generates a
|
||||
# very large tool call (write_file with huge content)
|
||||
# and the proxy/CDN drops the stream mid-response.
|
||||
_is_stream_drop = (
|
||||
not getattr(api_error, "status_code", None)
|
||||
and any(p in error_msg for p in (
|
||||
"connection lost", "connection reset",
|
||||
"connection closed", "network connection",
|
||||
"network error", "terminated",
|
||||
))
|
||||
)
|
||||
if _is_stream_drop:
|
||||
self._vprint(
|
||||
f"{self.log_prefix} 💡 The provider's stream "
|
||||
f"connection keeps dropping. This often happens "
|
||||
f"when the model tries to write a very large "
|
||||
f"file in a single tool call.",
|
||||
force=True,
|
||||
)
|
||||
self._vprint(
|
||||
f"{self.log_prefix} Try asking the model "
|
||||
f"to use execute_code with Python's open() for "
|
||||
f"large files, or to write the file in smaller "
|
||||
f"sections.",
|
||||
force=True,
|
||||
)
|
||||
|
||||
logging.error(
|
||||
"%sAPI call failed after %s retries. %s | provider=%s model=%s msgs=%s tokens=~%s",
|
||||
self.log_prefix, max_retries, _final_summary,
|
||||
@@ -7268,18 +6811,8 @@ class AIAgent:
|
||||
api_kwargs, reason="max_retries_exhausted", error=api_error,
|
||||
)
|
||||
self._persist_session(messages, conversation_history)
|
||||
_final_response = f"API call failed after {max_retries} retries: {_final_summary}"
|
||||
if _is_stream_drop:
|
||||
_final_response += (
|
||||
"\n\nThe provider's stream connection keeps "
|
||||
"dropping — this often happens when generating "
|
||||
"very large tool call responses (e.g. write_file "
|
||||
"with long content). Try asking me to use "
|
||||
"execute_code with Python's open() for large "
|
||||
"files, or to write in smaller sections."
|
||||
)
|
||||
return {
|
||||
"final_response": _final_response,
|
||||
"final_response": f"API call failed after {max_retries} retries: {_final_summary}",
|
||||
"messages": messages,
|
||||
"api_calls": api_call_count,
|
||||
"completed": False,
|
||||
@@ -7657,6 +7190,7 @@ class AIAgent:
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
_msg_count_before_tools = len(messages)
|
||||
self._execute_tool_calls(assistant_message, messages, effective_task_id, api_call_count)
|
||||
|
||||
# Signal that a paragraph break is needed before the next
|
||||
@@ -7674,18 +7208,18 @@ class AIAgent:
|
||||
if _tc_names == {"execute_code"}:
|
||||
self.iteration_budget.refund()
|
||||
|
||||
# Use real token counts from the API response to decide
|
||||
# compression. prompt_tokens + completion_tokens is the
|
||||
# actual context size the provider reported plus the
|
||||
# assistant turn — a tight lower bound for the next prompt.
|
||||
# Tool results appended above aren't counted yet, but the
|
||||
# threshold (default 50%) leaves ample headroom; if tool
|
||||
# results push past it, the next API call will report the
|
||||
# real total and trigger compression then.
|
||||
# Estimate next prompt size using real token counts from the
|
||||
# last API response + rough estimate of newly appended tool
|
||||
# results. This catches cases where tool results push the
|
||||
# context past the limit that last_prompt_tokens alone misses
|
||||
# (e.g. large file reads, web extractions).
|
||||
_compressor = self.context_compressor
|
||||
_real_tokens = (
|
||||
_new_tool_msgs = messages[_msg_count_before_tools:]
|
||||
_new_chars = sum(len(str(m.get("content", "") or "")) for m in _new_tool_msgs)
|
||||
_estimated_next_prompt = (
|
||||
_compressor.last_prompt_tokens
|
||||
+ _compressor.last_completion_tokens
|
||||
+ _new_chars // 3 # conservative: JSON-heavy tool results ≈ 3 chars/token
|
||||
)
|
||||
|
||||
# ── Context pressure warnings (user-facing only) ──────────
|
||||
@@ -7695,12 +7229,12 @@ class AIAgent:
|
||||
# Does not inject into messages — just prints to CLI output
|
||||
# and fires status_callback for gateway platforms.
|
||||
if _compressor.threshold_tokens > 0:
|
||||
_compaction_progress = _real_tokens / _compressor.threshold_tokens
|
||||
_compaction_progress = _estimated_next_prompt / _compressor.threshold_tokens
|
||||
if _compaction_progress >= 0.85 and not self._context_pressure_warned:
|
||||
self._context_pressure_warned = True
|
||||
self._emit_context_pressure(_compaction_progress, _compressor)
|
||||
|
||||
if self.compression_enabled and _compressor.should_compress(_real_tokens):
|
||||
if self.compression_enabled and _compressor.should_compress(_estimated_next_prompt):
|
||||
messages, active_system_prompt = self._compress_context(
|
||||
messages, system_message,
|
||||
approx_tokens=self.context_compressor.last_prompt_tokens,
|
||||
@@ -7947,25 +7481,6 @@ class AIAgent:
|
||||
self._honcho_sync(original_user_message, final_response)
|
||||
self._queue_honcho_prefetch(original_user_message)
|
||||
|
||||
# Plugin hook: post_llm_call
|
||||
# Fired once per turn after the tool-calling loop completes.
|
||||
# Plugins can use this to persist conversation data (e.g. sync
|
||||
# to an external memory system).
|
||||
if final_response and not interrupted:
|
||||
try:
|
||||
from hermes_cli.plugins import invoke_hook as _invoke_hook
|
||||
_invoke_hook(
|
||||
"post_llm_call",
|
||||
session_id=self.session_id,
|
||||
user_message=original_user_message,
|
||||
assistant_response=final_response,
|
||||
conversation_history=list(messages),
|
||||
model=self.model,
|
||||
platform=getattr(self, "platform", None) or "",
|
||||
)
|
||||
except Exception as exc:
|
||||
logger.warning("post_llm_call hook failed: %s", exc)
|
||||
|
||||
# Extract reasoning from the last assistant message (if any)
|
||||
last_reasoning = None
|
||||
for msg in reversed(messages):
|
||||
@@ -8031,22 +7546,6 @@ class AIAgent:
|
||||
except Exception:
|
||||
pass # Background review is best-effort
|
||||
|
||||
# Plugin hook: on_session_end
|
||||
# Fired at the very end of every run_conversation call.
|
||||
# Plugins can use this for cleanup, flushing buffers, etc.
|
||||
try:
|
||||
from hermes_cli.plugins import invoke_hook as _invoke_hook
|
||||
_invoke_hook(
|
||||
"on_session_end",
|
||||
session_id=self.session_id,
|
||||
completed=completed,
|
||||
interrupted=interrupted,
|
||||
model=self.model,
|
||||
platform=getattr(self, "platform", None) or "",
|
||||
)
|
||||
except Exception as exc:
|
||||
logger.warning("on_session_end hook failed: %s", exc)
|
||||
|
||||
return result
|
||||
|
||||
def chat(self, message: str, stream_callback: Optional[callable] = None) -> str:
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
# Kill all running Modal apps (sandboxes, deployments, etc.)
|
||||
#
|
||||
# Usage:
|
||||
# bash scripts/kill_modal.sh # Stop hermes-agent sandboxes
|
||||
# bash scripts/kill_modal.sh # Stop swe-rex (the sandbox app)
|
||||
# bash scripts/kill_modal.sh --all # Stop ALL Modal apps
|
||||
|
||||
set -uo pipefail
|
||||
@@ -17,10 +17,10 @@ if [[ "${1:-}" == "--all" ]]; then
|
||||
modal app stop "$app_id" 2>/dev/null || true
|
||||
done
|
||||
else
|
||||
echo "Stopping hermes-agent sandboxes..."
|
||||
APPS=$(echo "$APP_LIST" | grep 'hermes-agent' | grep -oE 'ap-[A-Za-z0-9]+' || true)
|
||||
echo "Stopping swe-rex sandboxes..."
|
||||
APPS=$(echo "$APP_LIST" | grep 'swe-rex' | grep -oE 'ap-[A-Za-z0-9]+' || true)
|
||||
if [[ -z "$APPS" ]]; then
|
||||
echo " No hermes-agent apps found."
|
||||
echo " No swe-rex apps found."
|
||||
else
|
||||
echo "$APPS" | while read app_id; do
|
||||
echo " Stopping $app_id"
|
||||
@@ -30,5 +30,5 @@ else
|
||||
fi
|
||||
|
||||
echo ""
|
||||
echo "Current hermes-agent status:"
|
||||
modal app list 2>/dev/null | grep -E 'State|hermes-agent' || echo " (none)"
|
||||
echo "Current swe-rex status:"
|
||||
modal app list 2>/dev/null | grep -E 'State|swe-rex' || echo " (none)"
|
||||
|
||||
@@ -1,180 +0,0 @@
|
||||
---
|
||||
name: webhook-subscriptions
|
||||
description: Create and manage webhook subscriptions for event-driven agent activation. Use when the user wants external services to trigger agent runs automatically.
|
||||
version: 1.0.0
|
||||
metadata:
|
||||
hermes:
|
||||
tags: [webhook, events, automation, integrations]
|
||||
---
|
||||
|
||||
# Webhook Subscriptions
|
||||
|
||||
Create dynamic webhook subscriptions so external services (GitHub, GitLab, Stripe, CI/CD, IoT sensors, monitoring tools) can trigger Hermes agent runs by POSTing events to a URL.
|
||||
|
||||
## Setup (Required First)
|
||||
|
||||
The webhook platform must be enabled before subscriptions can be created. Check with:
|
||||
```bash
|
||||
hermes webhook list
|
||||
```
|
||||
|
||||
If it says "Webhook platform is not enabled", set it up:
|
||||
|
||||
### Option 1: Setup wizard
|
||||
```bash
|
||||
hermes gateway setup
|
||||
```
|
||||
Follow the prompts to enable webhooks, set the port, and set a global HMAC secret.
|
||||
|
||||
### Option 2: Manual config
|
||||
Add to `~/.hermes/config.yaml`:
|
||||
```yaml
|
||||
platforms:
|
||||
webhook:
|
||||
enabled: true
|
||||
extra:
|
||||
host: "0.0.0.0"
|
||||
port: 8644
|
||||
secret: "generate-a-strong-secret-here"
|
||||
```
|
||||
|
||||
### Option 3: Environment variables
|
||||
Add to `~/.hermes/.env`:
|
||||
```bash
|
||||
WEBHOOK_ENABLED=true
|
||||
WEBHOOK_PORT=8644
|
||||
WEBHOOK_SECRET=generate-a-strong-secret-here
|
||||
```
|
||||
|
||||
After configuration, start (or restart) the gateway:
|
||||
```bash
|
||||
hermes gateway run
|
||||
# Or if using systemd:
|
||||
systemctl --user restart hermes-gateway
|
||||
```
|
||||
|
||||
Verify it's running:
|
||||
```bash
|
||||
curl http://localhost:8644/health
|
||||
```
|
||||
|
||||
## Commands
|
||||
|
||||
All management is via the `hermes webhook` CLI command:
|
||||
|
||||
### Create a subscription
|
||||
```bash
|
||||
hermes webhook subscribe <name> \
|
||||
--prompt "Prompt template with {payload.fields}" \
|
||||
--events "event1,event2" \
|
||||
--description "What this does" \
|
||||
--skills "skill1,skill2" \
|
||||
--deliver telegram \
|
||||
--deliver-chat-id "12345" \
|
||||
--secret "optional-custom-secret"
|
||||
```
|
||||
|
||||
Returns the webhook URL and HMAC secret. The user configures their service to POST to that URL.
|
||||
|
||||
### List subscriptions
|
||||
```bash
|
||||
hermes webhook list
|
||||
```
|
||||
|
||||
### Remove a subscription
|
||||
```bash
|
||||
hermes webhook remove <name>
|
||||
```
|
||||
|
||||
### Test a subscription
|
||||
```bash
|
||||
hermes webhook test <name>
|
||||
hermes webhook test <name> --payload '{"key": "value"}'
|
||||
```
|
||||
|
||||
## Prompt Templates
|
||||
|
||||
Prompts support `{dot.notation}` for accessing nested payload fields:
|
||||
|
||||
- `{issue.title}` — GitHub issue title
|
||||
- `{pull_request.user.login}` — PR author
|
||||
- `{data.object.amount}` — Stripe payment amount
|
||||
- `{sensor.temperature}` — IoT sensor reading
|
||||
|
||||
If no prompt is specified, the full JSON payload is dumped into the agent prompt.
|
||||
|
||||
## Common Patterns
|
||||
|
||||
### GitHub: new issues
|
||||
```bash
|
||||
hermes webhook subscribe github-issues \
|
||||
--events "issues" \
|
||||
--prompt "New GitHub issue #{issue.number}: {issue.title}\n\nAction: {action}\nAuthor: {issue.user.login}\nBody:\n{issue.body}\n\nPlease triage this issue." \
|
||||
--deliver telegram \
|
||||
--deliver-chat-id "-100123456789"
|
||||
```
|
||||
|
||||
Then in GitHub repo Settings → Webhooks → Add webhook:
|
||||
- Payload URL: the returned webhook_url
|
||||
- Content type: application/json
|
||||
- Secret: the returned secret
|
||||
- Events: "Issues"
|
||||
|
||||
### GitHub: PR reviews
|
||||
```bash
|
||||
hermes webhook subscribe github-prs \
|
||||
--events "pull_request" \
|
||||
--prompt "PR #{pull_request.number} {action}: {pull_request.title}\nBy: {pull_request.user.login}\nBranch: {pull_request.head.ref}\n\n{pull_request.body}" \
|
||||
--skills "github-code-review" \
|
||||
--deliver github_comment
|
||||
```
|
||||
|
||||
### Stripe: payment events
|
||||
```bash
|
||||
hermes webhook subscribe stripe-payments \
|
||||
--events "payment_intent.succeeded,payment_intent.payment_failed" \
|
||||
--prompt "Payment {data.object.status}: {data.object.amount} cents from {data.object.receipt_email}" \
|
||||
--deliver telegram \
|
||||
--deliver-chat-id "-100123456789"
|
||||
```
|
||||
|
||||
### CI/CD: build notifications
|
||||
```bash
|
||||
hermes webhook subscribe ci-builds \
|
||||
--events "pipeline" \
|
||||
--prompt "Build {object_attributes.status} on {project.name} branch {object_attributes.ref}\nCommit: {commit.message}" \
|
||||
--deliver discord \
|
||||
--deliver-chat-id "1234567890"
|
||||
```
|
||||
|
||||
### Generic monitoring alert
|
||||
```bash
|
||||
hermes webhook subscribe alerts \
|
||||
--prompt "Alert: {alert.name}\nSeverity: {alert.severity}\nMessage: {alert.message}\n\nPlease investigate and suggest remediation." \
|
||||
--deliver origin
|
||||
```
|
||||
|
||||
## Security
|
||||
|
||||
- Each subscription gets an auto-generated HMAC-SHA256 secret (or provide your own with `--secret`)
|
||||
- The webhook adapter validates signatures on every incoming POST
|
||||
- Static routes from config.yaml cannot be overwritten by dynamic subscriptions
|
||||
- Subscriptions persist to `~/.hermes/webhook_subscriptions.json`
|
||||
|
||||
## How It Works
|
||||
|
||||
1. `hermes webhook subscribe` writes to `~/.hermes/webhook_subscriptions.json`
|
||||
2. The webhook adapter hot-reloads this file on each incoming request (mtime-gated, negligible overhead)
|
||||
3. When a POST arrives matching a route, the adapter formats the prompt and triggers an agent run
|
||||
4. The agent's response is delivered to the configured target (Telegram, Discord, GitHub comment, etc.)
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
If webhooks aren't working:
|
||||
|
||||
1. **Is the gateway running?** Check with `systemctl --user status hermes-gateway` or `ps aux | grep gateway`
|
||||
2. **Is the webhook server listening?** `curl http://localhost:8644/health` should return `{"status": "ok"}`
|
||||
3. **Check gateway logs:** `grep webhook ~/.hermes/logs/gateway.log | tail -20`
|
||||
4. **Signature mismatch?** Verify the secret in your service matches the one from `hermes webhook list`. GitHub sends `X-Hub-Signature-256`, GitLab sends `X-Gitlab-Token`.
|
||||
5. **Firewall/NAT?** The webhook URL must be reachable from the service. For local development, use a tunnel (ngrok, cloudflared).
|
||||
6. **Wrong event type?** Check `--events` filter matches what the service sends. Use `hermes webhook test <name>` to verify the route works.
|
||||
@@ -219,9 +219,6 @@ if command -v gh &>/dev/null && gh auth status &>/dev/null; then
|
||||
echo "AUTH_METHOD=gh"
|
||||
elif [ -n "$GITHUB_TOKEN" ]; then
|
||||
echo "AUTH_METHOD=curl"
|
||||
elif [ -f ~/.hermes/.env ] && grep -q "^GITHUB_TOKEN=" ~/.hermes/.env; then
|
||||
export GITHUB_TOKEN=$(grep "^GITHUB_TOKEN=" ~/.hermes/.env | head -1 | cut -d= -f2 | tr -d '\n\r')
|
||||
echo "AUTH_METHOD=curl"
|
||||
elif grep -q "github.com" ~/.git-credentials 2>/dev/null; then
|
||||
export GITHUB_TOKEN=$(grep "github.com" ~/.git-credentials | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|')
|
||||
echo "AUTH_METHOD=curl"
|
||||
|
||||
@@ -23,11 +23,6 @@ if command -v gh &>/dev/null && gh auth status &>/dev/null 2>&1; then
|
||||
GH_USER=$(gh api user --jq '.login' 2>/dev/null)
|
||||
elif [ -n "$GITHUB_TOKEN" ]; then
|
||||
GH_AUTH_METHOD="curl"
|
||||
elif [ -f "$HOME/.hermes/.env" ] && grep -q "^GITHUB_TOKEN=" "$HOME/.hermes/.env" 2>/dev/null; then
|
||||
GITHUB_TOKEN=$(grep "^GITHUB_TOKEN=" "$HOME/.hermes/.env" | head -1 | cut -d= -f2 | tr -d '\n\r')
|
||||
if [ -n "$GITHUB_TOKEN" ]; then
|
||||
GH_AUTH_METHOD="curl"
|
||||
fi
|
||||
elif [ -f "$HOME/.git-credentials" ] && grep -q "github.com" "$HOME/.git-credentials" 2>/dev/null; then
|
||||
GITHUB_TOKEN=$(grep "github.com" "$HOME/.git-credentials" | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|')
|
||||
if [ -n "$GITHUB_TOKEN" ]; then
|
||||
|
||||
@@ -27,11 +27,7 @@ if command -v gh &>/dev/null && gh auth status &>/dev/null; then
|
||||
else
|
||||
AUTH="git"
|
||||
if [ -z "$GITHUB_TOKEN" ]; then
|
||||
if [ -f ~/.hermes/.env ] && grep -q "^GITHUB_TOKEN=" ~/.hermes/.env; then
|
||||
GITHUB_TOKEN=$(grep "^GITHUB_TOKEN=" ~/.hermes/.env | head -1 | cut -d= -f2 | tr -d '\n\r')
|
||||
elif grep -q "github.com" ~/.git-credentials 2>/dev/null; then
|
||||
GITHUB_TOKEN=$(grep "github.com" ~/.git-credentials 2>/dev/null | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|')
|
||||
fi
|
||||
GITHUB_TOKEN=$(grep "github.com" ~/.git-credentials 2>/dev/null | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|')
|
||||
fi
|
||||
fi
|
||||
|
||||
|
||||
@@ -27,11 +27,7 @@ if command -v gh &>/dev/null && gh auth status &>/dev/null; then
|
||||
else
|
||||
AUTH="git"
|
||||
if [ -z "$GITHUB_TOKEN" ]; then
|
||||
if [ -f ~/.hermes/.env ] && grep -q "^GITHUB_TOKEN=" ~/.hermes/.env; then
|
||||
GITHUB_TOKEN=$(grep "^GITHUB_TOKEN=" ~/.hermes/.env | head -1 | cut -d= -f2 | tr -d '\n\r')
|
||||
elif grep -q "github.com" ~/.git-credentials 2>/dev/null; then
|
||||
GITHUB_TOKEN=$(grep "github.com" ~/.git-credentials 2>/dev/null | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|')
|
||||
fi
|
||||
GITHUB_TOKEN=$(grep "github.com" ~/.git-credentials 2>/dev/null | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|')
|
||||
fi
|
||||
fi
|
||||
|
||||
|
||||
@@ -29,11 +29,7 @@ else
|
||||
AUTH="git"
|
||||
# Ensure we have a token for API calls
|
||||
if [ -z "$GITHUB_TOKEN" ]; then
|
||||
if [ -f ~/.hermes/.env ] && grep -q "^GITHUB_TOKEN=" ~/.hermes/.env; then
|
||||
GITHUB_TOKEN=$(grep "^GITHUB_TOKEN=" ~/.hermes/.env | head -1 | cut -d= -f2 | tr -d '\n\r')
|
||||
elif grep -q "github.com" ~/.git-credentials 2>/dev/null; then
|
||||
GITHUB_TOKEN=$(grep "github.com" ~/.git-credentials 2>/dev/null | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|')
|
||||
fi
|
||||
GITHUB_TOKEN=$(grep "github.com" ~/.git-credentials 2>/dev/null | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|')
|
||||
fi
|
||||
fi
|
||||
echo "Using: $AUTH"
|
||||
|
||||
@@ -26,11 +26,7 @@ if command -v gh &>/dev/null && gh auth status &>/dev/null; then
|
||||
else
|
||||
AUTH="git"
|
||||
if [ -z "$GITHUB_TOKEN" ]; then
|
||||
if [ -f ~/.hermes/.env ] && grep -q "^GITHUB_TOKEN=" ~/.hermes/.env; then
|
||||
GITHUB_TOKEN=$(grep "^GITHUB_TOKEN=" ~/.hermes/.env | head -1 | cut -d= -f2 | tr -d '\n\r')
|
||||
elif grep -q "github.com" ~/.git-credentials 2>/dev/null; then
|
||||
GITHUB_TOKEN=$(grep "github.com" ~/.git-credentials 2>/dev/null | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|')
|
||||
fi
|
||||
GITHUB_TOKEN=$(grep "github.com" ~/.git-credentials 2>/dev/null | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|')
|
||||
fi
|
||||
fi
|
||||
|
||||
|
||||
@@ -11,7 +11,6 @@ from agent.auxiliary_client import (
|
||||
get_text_auxiliary_client,
|
||||
get_vision_auxiliary_client,
|
||||
get_available_vision_backends,
|
||||
resolve_vision_provider_client,
|
||||
resolve_provider_client,
|
||||
auxiliary_max_tokens_param,
|
||||
_read_codex_access_token,
|
||||
@@ -639,30 +638,6 @@ class TestVisionClientFallback:
|
||||
assert client.__class__.__name__ == "AnthropicAuxiliaryClient"
|
||||
assert model == "claude-haiku-4-5-20251001"
|
||||
|
||||
def test_selected_codex_provider_short_circuits_vision_auto(self, monkeypatch):
|
||||
def fake_load_config():
|
||||
return {"model": {"provider": "openai-codex", "default": "gpt-5.2-codex"}}
|
||||
|
||||
codex_client = MagicMock()
|
||||
with (
|
||||
patch("hermes_cli.config.load_config", fake_load_config),
|
||||
patch("agent.auxiliary_client._try_codex", return_value=(codex_client, "gpt-5.2-codex")) as mock_codex,
|
||||
patch("agent.auxiliary_client._try_openrouter") as mock_openrouter,
|
||||
patch("agent.auxiliary_client._try_nous") as mock_nous,
|
||||
patch("agent.auxiliary_client._try_anthropic") as mock_anthropic,
|
||||
patch("agent.auxiliary_client._try_custom_endpoint") as mock_custom,
|
||||
):
|
||||
provider, client, model = resolve_vision_provider_client()
|
||||
|
||||
assert provider == "openai-codex"
|
||||
assert client is codex_client
|
||||
assert model == "gpt-5.2-codex"
|
||||
mock_codex.assert_called_once()
|
||||
mock_openrouter.assert_not_called()
|
||||
mock_nous.assert_not_called()
|
||||
mock_anthropic.assert_not_called()
|
||||
mock_custom.assert_not_called()
|
||||
|
||||
def test_vision_auto_includes_codex(self, codex_auth_dir):
|
||||
"""Codex supports vision (gpt-5.3-codex), so auto mode should use it."""
|
||||
with patch("agent.auxiliary_client._read_nous_auth", return_value=None), \
|
||||
|
||||
@@ -18,8 +18,6 @@ from agent.prompt_builder import (
|
||||
build_context_files_prompt,
|
||||
CONTEXT_FILE_MAX_CHARS,
|
||||
DEFAULT_AGENT_IDENTITY,
|
||||
TOOL_USE_ENFORCEMENT_GUIDANCE,
|
||||
TOOL_USE_ENFORCEMENT_MODELS,
|
||||
MEMORY_GUIDANCE,
|
||||
SESSION_SEARCH_GUIDANCE,
|
||||
PLATFORM_HINTS,
|
||||
@@ -234,18 +232,7 @@ class TestPromptBuilderImports:
|
||||
# =========================================================================
|
||||
|
||||
|
||||
import pytest
|
||||
|
||||
|
||||
class TestBuildSkillsSystemPrompt:
|
||||
@pytest.fixture(autouse=True)
|
||||
def _clear_skills_cache(self):
|
||||
"""Ensure the in-process skills prompt cache doesn't leak between tests."""
|
||||
from agent.prompt_builder import clear_skills_system_prompt_cache
|
||||
clear_skills_system_prompt_cache(clear_snapshot=True)
|
||||
yield
|
||||
clear_skills_system_prompt_cache(clear_snapshot=True)
|
||||
|
||||
def test_empty_when_no_skills_dir(self, monkeypatch, tmp_path):
|
||||
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
|
||||
result = build_skills_system_prompt()
|
||||
@@ -315,7 +302,7 @@ class TestBuildSkillsSystemPrompt:
|
||||
|
||||
from unittest.mock import patch
|
||||
|
||||
with patch("agent.skill_utils.sys") as mock_sys:
|
||||
with patch("tools.skills_tool.sys") as mock_sys:
|
||||
mock_sys.platform = "darwin"
|
||||
result = build_skills_system_prompt()
|
||||
|
||||
@@ -343,7 +330,7 @@ class TestBuildSkillsSystemPrompt:
|
||||
from unittest.mock import patch
|
||||
|
||||
with patch(
|
||||
"agent.prompt_builder.get_disabled_skill_names",
|
||||
"tools.skills_tool._get_disabled_skill_names",
|
||||
return_value={"old-tool"},
|
||||
):
|
||||
result = build_skills_system_prompt()
|
||||
@@ -817,13 +804,6 @@ class TestSkillShouldShow:
|
||||
|
||||
|
||||
class TestBuildSkillsSystemPromptConditional:
|
||||
@pytest.fixture(autouse=True)
|
||||
def _clear_skills_cache(self):
|
||||
from agent.prompt_builder import clear_skills_system_prompt_cache
|
||||
clear_skills_system_prompt_cache(clear_snapshot=True)
|
||||
yield
|
||||
clear_skills_system_prompt_cache(clear_snapshot=True)
|
||||
|
||||
def test_fallback_skill_hidden_when_primary_available(self, monkeypatch, tmp_path):
|
||||
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
|
||||
skill_dir = tmp_path / "skills" / "search" / "duckduckgo"
|
||||
@@ -928,98 +908,3 @@ class TestBuildSkillsSystemPromptConditional:
|
||||
available_toolsets=set(),
|
||||
)
|
||||
assert "nested-null" in result
|
||||
|
||||
|
||||
# =========================================================================
|
||||
# Tool-use enforcement guidance
|
||||
# =========================================================================
|
||||
|
||||
|
||||
class TestToolUseEnforcementGuidance:
|
||||
def test_guidance_mentions_tool_calls(self):
|
||||
assert "tool call" in TOOL_USE_ENFORCEMENT_GUIDANCE.lower()
|
||||
|
||||
def test_guidance_forbids_description_only(self):
|
||||
assert "describe" in TOOL_USE_ENFORCEMENT_GUIDANCE.lower()
|
||||
assert "promise" in TOOL_USE_ENFORCEMENT_GUIDANCE.lower()
|
||||
|
||||
def test_guidance_requires_action(self):
|
||||
assert "MUST" in TOOL_USE_ENFORCEMENT_GUIDANCE
|
||||
|
||||
def test_enforcement_models_includes_gpt(self):
|
||||
assert "gpt" in TOOL_USE_ENFORCEMENT_MODELS
|
||||
|
||||
def test_enforcement_models_includes_codex(self):
|
||||
assert "codex" in TOOL_USE_ENFORCEMENT_MODELS
|
||||
|
||||
def test_enforcement_models_is_tuple(self):
|
||||
assert isinstance(TOOL_USE_ENFORCEMENT_MODELS, tuple)
|
||||
|
||||
|
||||
# =========================================================================
|
||||
# Budget warning history stripping
|
||||
# =========================================================================
|
||||
|
||||
|
||||
class TestStripBudgetWarningsFromHistory:
|
||||
def test_strips_json_budget_warning_key(self):
|
||||
import json
|
||||
from run_agent import _strip_budget_warnings_from_history
|
||||
|
||||
messages = [
|
||||
{"role": "tool", "tool_call_id": "c1", "content": json.dumps({
|
||||
"output": "hello",
|
||||
"exit_code": 0,
|
||||
"_budget_warning": "[BUDGET: Iteration 55/60. 5 iterations left. Start consolidating your work.]",
|
||||
})},
|
||||
]
|
||||
_strip_budget_warnings_from_history(messages)
|
||||
parsed = json.loads(messages[0]["content"])
|
||||
assert "_budget_warning" not in parsed
|
||||
assert parsed["output"] == "hello"
|
||||
assert parsed["exit_code"] == 0
|
||||
|
||||
def test_strips_text_budget_warning(self):
|
||||
from run_agent import _strip_budget_warnings_from_history
|
||||
|
||||
messages = [
|
||||
{"role": "tool", "tool_call_id": "c1",
|
||||
"content": "some result\n\n[BUDGET WARNING: Iteration 58/60. Only 2 iteration(s) left. Provide your final response NOW. No more tool calls unless absolutely critical.]"},
|
||||
]
|
||||
_strip_budget_warnings_from_history(messages)
|
||||
assert messages[0]["content"] == "some result"
|
||||
|
||||
def test_leaves_non_tool_messages_unchanged(self):
|
||||
from run_agent import _strip_budget_warnings_from_history
|
||||
|
||||
messages = [
|
||||
{"role": "assistant", "content": "[BUDGET WARNING: Iteration 58/60. Only 2 iteration(s) left. Provide your final response NOW. No more tool calls unless absolutely critical.]"},
|
||||
{"role": "user", "content": "hello"},
|
||||
]
|
||||
original_contents = [m["content"] for m in messages]
|
||||
_strip_budget_warnings_from_history(messages)
|
||||
assert [m["content"] for m in messages] == original_contents
|
||||
|
||||
def test_handles_empty_and_missing_content(self):
|
||||
from run_agent import _strip_budget_warnings_from_history
|
||||
|
||||
messages = [
|
||||
{"role": "tool", "tool_call_id": "c1", "content": ""},
|
||||
{"role": "tool", "tool_call_id": "c2"},
|
||||
]
|
||||
_strip_budget_warnings_from_history(messages)
|
||||
assert messages[0]["content"] == ""
|
||||
|
||||
def test_strips_caution_variant(self):
|
||||
import json
|
||||
from run_agent import _strip_budget_warnings_from_history
|
||||
|
||||
messages = [
|
||||
{"role": "tool", "tool_call_id": "c1", "content": json.dumps({
|
||||
"output": "ok",
|
||||
"_budget_warning": "[BUDGET: Iteration 42/60. 18 iterations left. Start consolidating your work.]",
|
||||
})},
|
||||
]
|
||||
_strip_budget_warnings_from_history(messages)
|
||||
parsed = json.loads(messages[0]["content"])
|
||||
assert "_budget_warning" not in parsed
|
||||
|
||||
@@ -54,7 +54,7 @@ class TestScanSkillCommands:
|
||||
"""macOS-only skills should not register slash commands on Linux."""
|
||||
with (
|
||||
patch("tools.skills_tool.SKILLS_DIR", tmp_path),
|
||||
patch("agent.skill_utils.sys") as mock_sys,
|
||||
patch("tools.skills_tool.sys") as mock_sys,
|
||||
):
|
||||
mock_sys.platform = "linux"
|
||||
_make_skill(tmp_path, "imessage", frontmatter_extra="platforms: [macos]\n")
|
||||
@@ -67,7 +67,7 @@ class TestScanSkillCommands:
|
||||
"""macOS-only skills should register slash commands on macOS."""
|
||||
with (
|
||||
patch("tools.skills_tool.SKILLS_DIR", tmp_path),
|
||||
patch("agent.skill_utils.sys") as mock_sys,
|
||||
patch("tools.skills_tool.sys") as mock_sys,
|
||||
):
|
||||
mock_sys.platform = "darwin"
|
||||
_make_skill(tmp_path, "imessage", frontmatter_extra="platforms: [macos]\n")
|
||||
@@ -78,7 +78,7 @@ class TestScanSkillCommands:
|
||||
"""Skills without platforms field should register on any platform."""
|
||||
with (
|
||||
patch("tools.skills_tool.SKILLS_DIR", tmp_path),
|
||||
patch("agent.skill_utils.sys") as mock_sys,
|
||||
patch("tools.skills_tool.sys") as mock_sys,
|
||||
):
|
||||
mock_sys.platform = "win32"
|
||||
_make_skill(tmp_path, "generic-tool")
|
||||
|
||||
@@ -20,7 +20,6 @@ from cron.jobs import (
|
||||
resume_job,
|
||||
remove_job,
|
||||
mark_job_run,
|
||||
advance_next_run,
|
||||
get_due_jobs,
|
||||
save_job_output,
|
||||
)
|
||||
@@ -340,90 +339,6 @@ class TestMarkJobRun:
|
||||
assert updated["last_error"] == "timeout"
|
||||
|
||||
|
||||
class TestAdvanceNextRun:
|
||||
"""Tests for advance_next_run() — crash-safety for recurring jobs."""
|
||||
|
||||
def test_advances_interval_job(self, tmp_cron_dir):
|
||||
"""Interval jobs should have next_run_at bumped to the next future occurrence."""
|
||||
job = create_job(prompt="Recurring check", schedule="every 1h")
|
||||
# Force next_run_at to 5 minutes ago (i.e. the job is due)
|
||||
jobs = load_jobs()
|
||||
old_next = (datetime.now() - timedelta(minutes=5)).isoformat()
|
||||
jobs[0]["next_run_at"] = old_next
|
||||
save_jobs(jobs)
|
||||
|
||||
result = advance_next_run(job["id"])
|
||||
assert result is True
|
||||
|
||||
updated = get_job(job["id"])
|
||||
from cron.jobs import _ensure_aware, _hermes_now
|
||||
new_next_dt = _ensure_aware(datetime.fromisoformat(updated["next_run_at"]))
|
||||
assert new_next_dt > _hermes_now(), "next_run_at should be in the future after advance"
|
||||
|
||||
def test_advances_cron_job(self, tmp_cron_dir):
|
||||
"""Cron-expression jobs should have next_run_at bumped to the next occurrence."""
|
||||
pytest.importorskip("croniter")
|
||||
job = create_job(prompt="Daily wakeup", schedule="15 6 * * *")
|
||||
# Force next_run_at to 30 minutes ago
|
||||
jobs = load_jobs()
|
||||
old_next = (datetime.now() - timedelta(minutes=30)).isoformat()
|
||||
jobs[0]["next_run_at"] = old_next
|
||||
save_jobs(jobs)
|
||||
|
||||
result = advance_next_run(job["id"])
|
||||
assert result is True
|
||||
|
||||
updated = get_job(job["id"])
|
||||
from cron.jobs import _ensure_aware, _hermes_now
|
||||
new_next_dt = _ensure_aware(datetime.fromisoformat(updated["next_run_at"]))
|
||||
assert new_next_dt > _hermes_now(), "next_run_at should be in the future after advance"
|
||||
|
||||
def test_skips_oneshot_job(self, tmp_cron_dir):
|
||||
"""One-shot jobs should NOT be advanced — they need to retry on restart."""
|
||||
job = create_job(prompt="Run once", schedule="30m")
|
||||
original_next = get_job(job["id"])["next_run_at"]
|
||||
|
||||
result = advance_next_run(job["id"])
|
||||
assert result is False
|
||||
|
||||
updated = get_job(job["id"])
|
||||
assert updated["next_run_at"] == original_next, "one-shot next_run_at should be unchanged"
|
||||
|
||||
def test_nonexistent_job_returns_false(self, tmp_cron_dir):
|
||||
result = advance_next_run("nonexistent-id")
|
||||
assert result is False
|
||||
|
||||
def test_already_future_stays_future(self, tmp_cron_dir):
|
||||
"""If next_run_at is already in the future, advance keeps it in the future (no harm)."""
|
||||
job = create_job(prompt="Future job", schedule="every 1h")
|
||||
# next_run_at is already set to ~1h from now by create_job
|
||||
advance_next_run(job["id"])
|
||||
# Regardless of return value, the job should still be in the future
|
||||
updated = get_job(job["id"])
|
||||
from cron.jobs import _ensure_aware, _hermes_now
|
||||
new_next_dt = _ensure_aware(datetime.fromisoformat(updated["next_run_at"]))
|
||||
assert new_next_dt > _hermes_now(), "next_run_at should remain in the future"
|
||||
|
||||
def test_crash_safety_scenario(self, tmp_cron_dir):
|
||||
"""Simulate the crash-loop scenario: after advance, the job should NOT be due."""
|
||||
job = create_job(prompt="Crash test", schedule="every 1h")
|
||||
# Force next_run_at to 5 minutes ago (job is due)
|
||||
jobs = load_jobs()
|
||||
jobs[0]["next_run_at"] = (datetime.now() - timedelta(minutes=5)).isoformat()
|
||||
save_jobs(jobs)
|
||||
|
||||
# Job should be due before advance
|
||||
due_before = get_due_jobs()
|
||||
assert len(due_before) == 1
|
||||
|
||||
# Advance (simulating what tick() does before run_job)
|
||||
advance_next_run(job["id"])
|
||||
|
||||
# Now the job should NOT be due (simulates restart after crash)
|
||||
due_after = get_due_jobs()
|
||||
assert len(due_after) == 0, "Job should not be due after advance_next_run"
|
||||
|
||||
|
||||
class TestGetDueJobs:
|
||||
def test_past_due_within_window_returned(self, tmp_cron_dir):
|
||||
"""Jobs within the dynamic grace window are still considered due (not stale).
|
||||
|
||||
@@ -687,41 +687,3 @@ class TestBuildJobPromptMissingSkill:
|
||||
result = _build_job_prompt({"skills": ["ghost-skill", "real-skill"], "prompt": "go"})
|
||||
assert "Real skill content." in result
|
||||
assert "go" in result
|
||||
|
||||
|
||||
class TestTickAdvanceBeforeRun:
|
||||
"""Verify that tick() calls advance_next_run before run_job for crash safety."""
|
||||
|
||||
def test_advance_called_before_run_job(self, tmp_path):
|
||||
"""advance_next_run must be called before run_job to prevent crash-loop re-fires."""
|
||||
call_order = []
|
||||
|
||||
def fake_advance(job_id):
|
||||
call_order.append(("advance", job_id))
|
||||
return True
|
||||
|
||||
def fake_run_job(job):
|
||||
call_order.append(("run", job["id"]))
|
||||
return True, "output", "response", None
|
||||
|
||||
fake_job = {
|
||||
"id": "test-advance",
|
||||
"name": "test",
|
||||
"prompt": "hello",
|
||||
"enabled": True,
|
||||
"schedule": {"kind": "cron", "expr": "15 6 * * *"},
|
||||
}
|
||||
|
||||
with patch("cron.scheduler.get_due_jobs", return_value=[fake_job]), \
|
||||
patch("cron.scheduler.advance_next_run", side_effect=fake_advance) as adv_mock, \
|
||||
patch("cron.scheduler.run_job", side_effect=fake_run_job), \
|
||||
patch("cron.scheduler.save_job_output", return_value=tmp_path / "out.md"), \
|
||||
patch("cron.scheduler.mark_job_run"), \
|
||||
patch("cron.scheduler._deliver_result"):
|
||||
from cron.scheduler import tick
|
||||
executed = tick(verbose=False)
|
||||
|
||||
assert executed == 1
|
||||
adv_mock.assert_called_once_with("test-advance")
|
||||
# advance must happen before run
|
||||
assert call_order == [("advance", "test-advance"), ("run", "test-advance")]
|
||||
|
||||
@@ -28,7 +28,6 @@ from gateway.platforms.api_server import (
|
||||
_CORS_HEADERS,
|
||||
check_api_server_requirements,
|
||||
cors_middleware,
|
||||
security_headers_middleware,
|
||||
)
|
||||
|
||||
|
||||
@@ -215,11 +214,9 @@ def _make_adapter(api_key: str = "", cors_origins=None) -> APIServerAdapter:
|
||||
|
||||
def _create_app(adapter: APIServerAdapter) -> web.Application:
|
||||
"""Create the aiohttp app from the adapter (without starting the full server)."""
|
||||
mws = [mw for mw in (cors_middleware, security_headers_middleware) if mw is not None]
|
||||
app = web.Application(middlewares=mws)
|
||||
app = web.Application(middlewares=[cors_middleware])
|
||||
app["api_server_adapter"] = adapter
|
||||
app.router.add_get("/health", adapter._handle_health)
|
||||
app.router.add_get("/v1/health", adapter._handle_health)
|
||||
app.router.add_get("/v1/models", adapter._handle_models)
|
||||
app.router.add_post("/v1/chat/completions", adapter._handle_chat_completions)
|
||||
app.router.add_post("/v1/responses", adapter._handle_responses)
|
||||
@@ -244,16 +241,6 @@ def auth_adapter():
|
||||
|
||||
|
||||
class TestHealthEndpoint:
|
||||
@pytest.mark.asyncio
|
||||
async def test_security_headers_present(self, adapter):
|
||||
"""Responses should include basic security headers."""
|
||||
app = _create_app(adapter)
|
||||
async with TestClient(TestServer(app)) as cli:
|
||||
resp = await cli.get("/health")
|
||||
assert resp.status == 200
|
||||
assert resp.headers.get("X-Content-Type-Options") == "nosniff"
|
||||
assert resp.headers.get("Referrer-Policy") == "no-referrer"
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_health_returns_ok(self, adapter):
|
||||
app = _create_app(adapter)
|
||||
@@ -264,17 +251,6 @@ class TestHealthEndpoint:
|
||||
assert data["status"] == "ok"
|
||||
assert data["platform"] == "hermes-agent"
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_v1_health_alias_returns_ok(self, adapter):
|
||||
"""GET /v1/health should return the same response as /health."""
|
||||
app = _create_app(adapter)
|
||||
async with TestClient(TestServer(app)) as cli:
|
||||
resp = await cli.get("/v1/health")
|
||||
assert resp.status == 200
|
||||
data = await resp.json()
|
||||
assert data["status"] == "ok"
|
||||
assert data["platform"] == "hermes-agent"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# /v1/models endpoint
|
||||
@@ -1324,31 +1300,6 @@ class TestCORS:
|
||||
assert "POST" in resp.headers.get("Access-Control-Allow-Methods", "")
|
||||
assert "DELETE" in resp.headers.get("Access-Control-Allow-Methods", "")
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_cors_allows_idempotency_key_header(self):
|
||||
adapter = _make_adapter(cors_origins=["http://localhost:3000"])
|
||||
app = _create_app(adapter)
|
||||
async with TestClient(TestServer(app)) as cli:
|
||||
resp = await cli.options(
|
||||
"/v1/chat/completions",
|
||||
headers={
|
||||
"Origin": "http://localhost:3000",
|
||||
"Access-Control-Request-Method": "POST",
|
||||
"Access-Control-Request-Headers": "Idempotency-Key",
|
||||
},
|
||||
)
|
||||
assert resp.status == 200
|
||||
assert "Idempotency-Key" in resp.headers.get("Access-Control-Allow-Headers", "")
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_cors_sets_vary_origin_header(self):
|
||||
adapter = _make_adapter(cors_origins=["http://localhost:3000"])
|
||||
app = _create_app(adapter)
|
||||
async with TestClient(TestServer(app)) as cli:
|
||||
resp = await cli.get("/health", headers={"Origin": "http://localhost:3000"})
|
||||
assert resp.status == 200
|
||||
assert resp.headers.get("Vary") == "Origin"
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_cors_options_preflight_allowed_for_configured_origin(self):
|
||||
"""Configured origins can complete browser preflight."""
|
||||
@@ -1368,21 +1319,6 @@ class TestCORS:
|
||||
assert "Authorization" in resp.headers.get("Access-Control-Allow-Headers", "")
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_cors_preflight_sets_max_age(self):
|
||||
adapter = _make_adapter(cors_origins=["http://localhost:3000"])
|
||||
app = _create_app(adapter)
|
||||
async with TestClient(TestServer(app)) as cli:
|
||||
resp = await cli.options(
|
||||
"/v1/chat/completions",
|
||||
headers={
|
||||
"Origin": "http://localhost:3000",
|
||||
"Access-Control-Request-Method": "POST",
|
||||
"Access-Control-Request-Headers": "Authorization, Content-Type",
|
||||
},
|
||||
)
|
||||
assert resp.status == 200
|
||||
assert resp.headers.get("Access-Control-Max-Age") == "600"
|
||||
# ---------------------------------------------------------------------------
|
||||
# Conversation parameter
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@@ -10,7 +10,6 @@ Covers:
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import os
|
||||
import sys
|
||||
from pathlib import Path
|
||||
from types import SimpleNamespace
|
||||
@@ -33,7 +32,7 @@ def _ensure_telegram_mock():
|
||||
telegram_mod.constants.ChatType.CHANNEL = "channel"
|
||||
telegram_mod.constants.ChatType.PRIVATE = "private"
|
||||
|
||||
for name in ("telegram", "telegram.ext", "telegram.constants", "telegram.request"):
|
||||
for name in ("telegram", "telegram.ext", "telegram.constants"):
|
||||
sys.modules.setdefault(name, telegram_mod)
|
||||
|
||||
|
||||
@@ -228,8 +227,7 @@ def test_persist_dm_topic_thread_id_writes_config(tmp_path):
|
||||
|
||||
adapter = _make_adapter()
|
||||
|
||||
with patch.object(Path, "home", return_value=tmp_path), \
|
||||
patch.dict(os.environ, {"HERMES_HOME": str(tmp_path / ".hermes")}):
|
||||
with patch.object(Path, "home", return_value=tmp_path):
|
||||
adapter._persist_dm_topic_thread_id(111, "General", 999)
|
||||
|
||||
with open(config_file) as f:
|
||||
@@ -368,8 +366,7 @@ def test_get_dm_topic_info_hot_reloads_from_config(tmp_path):
|
||||
with open(config_file, "w") as f:
|
||||
yaml.dump(config_data, f)
|
||||
|
||||
with patch.object(Path, "home", return_value=tmp_path), \
|
||||
patch.dict(os.environ, {"HERMES_HOME": str(tmp_path / ".hermes")}):
|
||||
with patch.object(Path, "home", return_value=tmp_path):
|
||||
result = adapter._get_dm_topic_info("111", "555")
|
||||
|
||||
assert result is not None
|
||||
|
||||
@@ -1,5 +1,4 @@
|
||||
"""Tests for Matrix platform adapter."""
|
||||
import asyncio
|
||||
import json
|
||||
import re
|
||||
import pytest
|
||||
@@ -447,199 +446,3 @@ class TestMatrixRequirements:
|
||||
monkeypatch.delenv("MATRIX_HOMESERVER", raising=False)
|
||||
from gateway.platforms.matrix import check_matrix_requirements
|
||||
assert check_matrix_requirements() is False
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Access-token auth / E2EE bootstrap
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class TestMatrixAccessTokenAuth:
|
||||
@pytest.mark.asyncio
|
||||
async def test_connect_fetches_device_id_from_whoami_for_access_token(self):
|
||||
from gateway.platforms.matrix import MatrixAdapter
|
||||
|
||||
config = PlatformConfig(
|
||||
enabled=True,
|
||||
token="syt_test_access_token",
|
||||
extra={
|
||||
"homeserver": "https://matrix.example.org",
|
||||
"user_id": "@bot:example.org",
|
||||
"encryption": True,
|
||||
},
|
||||
)
|
||||
adapter = MatrixAdapter(config)
|
||||
|
||||
class FakeWhoamiResponse:
|
||||
def __init__(self, user_id, device_id):
|
||||
self.user_id = user_id
|
||||
self.device_id = device_id
|
||||
|
||||
class FakeSyncResponse:
|
||||
def __init__(self):
|
||||
self.rooms = MagicMock(join={})
|
||||
|
||||
fake_client = MagicMock()
|
||||
fake_client.whoami = AsyncMock(return_value=FakeWhoamiResponse("@bot:example.org", "DEV123"))
|
||||
fake_client.sync = AsyncMock(return_value=FakeSyncResponse())
|
||||
fake_client.keys_upload = AsyncMock()
|
||||
fake_client.keys_query = AsyncMock()
|
||||
fake_client.keys_claim = AsyncMock()
|
||||
fake_client.send_to_device_messages = AsyncMock(return_value=[])
|
||||
fake_client.get_users_for_key_claiming = MagicMock(return_value={})
|
||||
fake_client.close = AsyncMock()
|
||||
fake_client.add_event_callback = MagicMock()
|
||||
fake_client.rooms = {}
|
||||
fake_client.account_data = {}
|
||||
fake_client.olm = object()
|
||||
fake_client.should_upload_keys = False
|
||||
fake_client.should_query_keys = False
|
||||
fake_client.should_claim_keys = False
|
||||
|
||||
def _restore_login(user_id, device_id, access_token):
|
||||
fake_client.user_id = user_id
|
||||
fake_client.device_id = device_id
|
||||
fake_client.access_token = access_token
|
||||
fake_client.olm = object()
|
||||
|
||||
fake_client.restore_login = MagicMock(side_effect=_restore_login)
|
||||
|
||||
fake_nio = MagicMock()
|
||||
fake_nio.AsyncClient = MagicMock(return_value=fake_client)
|
||||
fake_nio.WhoamiResponse = FakeWhoamiResponse
|
||||
fake_nio.SyncResponse = FakeSyncResponse
|
||||
fake_nio.LoginResponse = type("LoginResponse", (), {})
|
||||
fake_nio.RoomMessageText = type("RoomMessageText", (), {})
|
||||
fake_nio.RoomMessageImage = type("RoomMessageImage", (), {})
|
||||
fake_nio.RoomMessageAudio = type("RoomMessageAudio", (), {})
|
||||
fake_nio.RoomMessageVideo = type("RoomMessageVideo", (), {})
|
||||
fake_nio.RoomMessageFile = type("RoomMessageFile", (), {})
|
||||
fake_nio.InviteMemberEvent = type("InviteMemberEvent", (), {})
|
||||
fake_nio.MegolmEvent = type("MegolmEvent", (), {})
|
||||
|
||||
with patch.dict("sys.modules", {"nio": fake_nio}):
|
||||
with patch.object(adapter, "_refresh_dm_cache", AsyncMock()):
|
||||
with patch.object(adapter, "_sync_loop", AsyncMock(return_value=None)):
|
||||
assert await adapter.connect() is True
|
||||
|
||||
fake_client.restore_login.assert_called_once_with(
|
||||
"@bot:example.org", "DEV123", "syt_test_access_token"
|
||||
)
|
||||
assert fake_client.access_token == "syt_test_access_token"
|
||||
assert fake_client.user_id == "@bot:example.org"
|
||||
assert fake_client.device_id == "DEV123"
|
||||
fake_client.whoami.assert_awaited_once()
|
||||
|
||||
await adapter.disconnect()
|
||||
|
||||
|
||||
class TestMatrixE2EEMaintenance:
|
||||
@pytest.mark.asyncio
|
||||
async def test_sync_loop_runs_e2ee_maintenance_requests(self):
|
||||
adapter = _make_adapter()
|
||||
adapter._encryption = True
|
||||
adapter._closing = False
|
||||
|
||||
class FakeSyncError:
|
||||
pass
|
||||
|
||||
async def _sync_once(timeout=30000):
|
||||
adapter._closing = True
|
||||
return MagicMock()
|
||||
|
||||
fake_client = MagicMock()
|
||||
fake_client.sync = AsyncMock(side_effect=_sync_once)
|
||||
fake_client.send_to_device_messages = AsyncMock(return_value=[])
|
||||
fake_client.keys_upload = AsyncMock()
|
||||
fake_client.keys_query = AsyncMock()
|
||||
fake_client.get_users_for_key_claiming = MagicMock(
|
||||
return_value={"@alice:example.org": ["DEVICE1"]}
|
||||
)
|
||||
fake_client.keys_claim = AsyncMock()
|
||||
fake_client.olm = object()
|
||||
fake_client.should_upload_keys = True
|
||||
fake_client.should_query_keys = True
|
||||
fake_client.should_claim_keys = True
|
||||
|
||||
adapter._client = fake_client
|
||||
|
||||
fake_nio = MagicMock()
|
||||
fake_nio.SyncError = FakeSyncError
|
||||
|
||||
with patch.dict("sys.modules", {"nio": fake_nio}):
|
||||
await adapter._sync_loop()
|
||||
|
||||
fake_client.sync.assert_awaited_once_with(timeout=30000)
|
||||
fake_client.send_to_device_messages.assert_awaited_once()
|
||||
fake_client.keys_upload.assert_awaited_once()
|
||||
fake_client.keys_query.assert_awaited_once()
|
||||
fake_client.keys_claim.assert_awaited_once_with(
|
||||
{"@alice:example.org": ["DEVICE1"]}
|
||||
)
|
||||
|
||||
|
||||
class TestMatrixEncryptedSendFallback:
|
||||
@pytest.mark.asyncio
|
||||
async def test_send_retries_with_ignored_unverified_devices(self):
|
||||
adapter = _make_adapter()
|
||||
adapter._encryption = True
|
||||
|
||||
class FakeRoomSendResponse:
|
||||
def __init__(self, event_id):
|
||||
self.event_id = event_id
|
||||
|
||||
class FakeOlmUnverifiedDeviceError(Exception):
|
||||
pass
|
||||
|
||||
fake_client = MagicMock()
|
||||
fake_client.room_send = AsyncMock(side_effect=[
|
||||
FakeOlmUnverifiedDeviceError("unverified"),
|
||||
FakeRoomSendResponse("$event123"),
|
||||
])
|
||||
adapter._client = fake_client
|
||||
adapter._run_e2ee_maintenance = AsyncMock()
|
||||
|
||||
fake_nio = MagicMock()
|
||||
fake_nio.RoomSendResponse = FakeRoomSendResponse
|
||||
fake_nio.OlmUnverifiedDeviceError = FakeOlmUnverifiedDeviceError
|
||||
|
||||
with patch.dict("sys.modules", {"nio": fake_nio}):
|
||||
result = await adapter.send("!room:example.org", "hello")
|
||||
|
||||
assert result.success is True
|
||||
assert result.message_id == "$event123"
|
||||
adapter._run_e2ee_maintenance.assert_awaited_once()
|
||||
assert fake_client.room_send.await_count == 2
|
||||
first_call = fake_client.room_send.await_args_list[0]
|
||||
second_call = fake_client.room_send.await_args_list[1]
|
||||
assert first_call.kwargs.get("ignore_unverified_devices") is False
|
||||
assert second_call.kwargs.get("ignore_unverified_devices") is True
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_send_retries_after_timeout_in_encrypted_room(self):
|
||||
adapter = _make_adapter()
|
||||
adapter._encryption = True
|
||||
|
||||
class FakeRoomSendResponse:
|
||||
def __init__(self, event_id):
|
||||
self.event_id = event_id
|
||||
|
||||
fake_client = MagicMock()
|
||||
fake_client.room_send = AsyncMock(side_effect=[
|
||||
asyncio.TimeoutError(),
|
||||
FakeRoomSendResponse("$event456"),
|
||||
])
|
||||
adapter._client = fake_client
|
||||
adapter._run_e2ee_maintenance = AsyncMock()
|
||||
|
||||
fake_nio = MagicMock()
|
||||
fake_nio.RoomSendResponse = FakeRoomSendResponse
|
||||
|
||||
with patch.dict("sys.modules", {"nio": fake_nio}):
|
||||
result = await adapter.send("!room:example.org", "hello")
|
||||
|
||||
assert result.success is True
|
||||
assert result.message_id == "$event456"
|
||||
adapter._run_e2ee_maintenance.assert_awaited_once()
|
||||
assert fake_client.room_send.await_count == 2
|
||||
second_call = fake_client.room_send.await_args_list[1]
|
||||
assert second_call.kwargs.get("ignore_unverified_devices") is True
|
||||
|
||||
@@ -1,722 +0,0 @@
|
||||
"""
|
||||
Tests for media download retry logic added in PR #2982.
|
||||
|
||||
Covers:
|
||||
- gateway/platforms/base.py: cache_image_from_url
|
||||
- gateway/platforms/slack.py: SlackAdapter._download_slack_file
|
||||
SlackAdapter._download_slack_file_bytes
|
||||
- gateway/platforms/mattermost.py: MattermostAdapter._send_url_as_file
|
||||
|
||||
All async tests use asyncio.run() directly — pytest-asyncio is not installed
|
||||
in this environment.
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import sys
|
||||
from unittest.mock import AsyncMock, MagicMock, patch
|
||||
|
||||
import pytest
|
||||
import httpx
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Helpers for building httpx exceptions
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def _make_http_status_error(status_code: int) -> httpx.HTTPStatusError:
|
||||
request = httpx.Request("GET", "http://example.com/img.jpg")
|
||||
response = httpx.Response(status_code=status_code, request=request)
|
||||
return httpx.HTTPStatusError(
|
||||
f"HTTP {status_code}", request=request, response=response
|
||||
)
|
||||
|
||||
|
||||
def _make_timeout_error() -> httpx.TimeoutException:
|
||||
return httpx.TimeoutException("timed out")
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# cache_image_from_url (base.py)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class TestCacheImageFromUrl:
|
||||
"""Tests for gateway.platforms.base.cache_image_from_url"""
|
||||
|
||||
def test_success_on_first_attempt(self, tmp_path, monkeypatch):
|
||||
"""A clean 200 response caches the image and returns a path."""
|
||||
monkeypatch.setattr("gateway.platforms.base.IMAGE_CACHE_DIR", tmp_path / "img")
|
||||
|
||||
fake_response = MagicMock()
|
||||
fake_response.content = b"\xff\xd8\xff fake jpeg"
|
||||
fake_response.raise_for_status = MagicMock()
|
||||
|
||||
mock_client = AsyncMock()
|
||||
mock_client.get = AsyncMock(return_value=fake_response)
|
||||
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
|
||||
mock_client.__aexit__ = AsyncMock(return_value=False)
|
||||
|
||||
async def run():
|
||||
with patch("httpx.AsyncClient", return_value=mock_client):
|
||||
from gateway.platforms.base import cache_image_from_url
|
||||
return await cache_image_from_url(
|
||||
"http://example.com/img.jpg", ext=".jpg"
|
||||
)
|
||||
|
||||
path = asyncio.run(run())
|
||||
assert path.endswith(".jpg")
|
||||
mock_client.get.assert_called_once()
|
||||
|
||||
def test_retries_on_timeout_then_succeeds(self, tmp_path, monkeypatch):
|
||||
"""A timeout on the first attempt is retried; second attempt succeeds."""
|
||||
monkeypatch.setattr("gateway.platforms.base.IMAGE_CACHE_DIR", tmp_path / "img")
|
||||
|
||||
fake_response = MagicMock()
|
||||
fake_response.content = b"image data"
|
||||
fake_response.raise_for_status = MagicMock()
|
||||
|
||||
mock_client = AsyncMock()
|
||||
mock_client.get = AsyncMock(
|
||||
side_effect=[_make_timeout_error(), fake_response]
|
||||
)
|
||||
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
|
||||
mock_client.__aexit__ = AsyncMock(return_value=False)
|
||||
|
||||
mock_sleep = AsyncMock()
|
||||
|
||||
async def run():
|
||||
with patch("httpx.AsyncClient", return_value=mock_client), \
|
||||
patch("asyncio.sleep", mock_sleep):
|
||||
from gateway.platforms.base import cache_image_from_url
|
||||
return await cache_image_from_url(
|
||||
"http://example.com/img.jpg", ext=".jpg", retries=2
|
||||
)
|
||||
|
||||
path = asyncio.run(run())
|
||||
assert path.endswith(".jpg")
|
||||
assert mock_client.get.call_count == 2
|
||||
mock_sleep.assert_called_once()
|
||||
|
||||
def test_retries_on_429_then_succeeds(self, tmp_path, monkeypatch):
|
||||
"""A 429 response on the first attempt is retried; second attempt succeeds."""
|
||||
monkeypatch.setattr("gateway.platforms.base.IMAGE_CACHE_DIR", tmp_path / "img")
|
||||
|
||||
ok_response = MagicMock()
|
||||
ok_response.content = b"image data"
|
||||
ok_response.raise_for_status = MagicMock()
|
||||
|
||||
mock_client = AsyncMock()
|
||||
mock_client.get = AsyncMock(
|
||||
side_effect=[_make_http_status_error(429), ok_response]
|
||||
)
|
||||
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
|
||||
mock_client.__aexit__ = AsyncMock(return_value=False)
|
||||
|
||||
async def run():
|
||||
with patch("httpx.AsyncClient", return_value=mock_client), \
|
||||
patch("asyncio.sleep", new_callable=AsyncMock):
|
||||
from gateway.platforms.base import cache_image_from_url
|
||||
return await cache_image_from_url(
|
||||
"http://example.com/img.jpg", ext=".jpg", retries=2
|
||||
)
|
||||
|
||||
path = asyncio.run(run())
|
||||
assert path.endswith(".jpg")
|
||||
assert mock_client.get.call_count == 2
|
||||
|
||||
def test_raises_after_max_retries_exhausted(self, tmp_path, monkeypatch):
|
||||
"""Timeout on every attempt raises after all retries are consumed."""
|
||||
monkeypatch.setattr("gateway.platforms.base.IMAGE_CACHE_DIR", tmp_path / "img")
|
||||
|
||||
mock_client = AsyncMock()
|
||||
mock_client.get = AsyncMock(side_effect=_make_timeout_error())
|
||||
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
|
||||
mock_client.__aexit__ = AsyncMock(return_value=False)
|
||||
|
||||
async def run():
|
||||
with patch("httpx.AsyncClient", return_value=mock_client), \
|
||||
patch("asyncio.sleep", new_callable=AsyncMock):
|
||||
from gateway.platforms.base import cache_image_from_url
|
||||
await cache_image_from_url(
|
||||
"http://example.com/img.jpg", ext=".jpg", retries=2
|
||||
)
|
||||
|
||||
with pytest.raises(httpx.TimeoutException):
|
||||
asyncio.run(run())
|
||||
|
||||
# 3 total calls: initial + 2 retries
|
||||
assert mock_client.get.call_count == 3
|
||||
|
||||
def test_non_retryable_4xx_raises_immediately(self, tmp_path, monkeypatch):
|
||||
"""A 404 (non-retryable) is raised immediately without any retry."""
|
||||
monkeypatch.setattr("gateway.platforms.base.IMAGE_CACHE_DIR", tmp_path / "img")
|
||||
|
||||
mock_sleep = AsyncMock()
|
||||
mock_client = AsyncMock()
|
||||
mock_client.get = AsyncMock(side_effect=_make_http_status_error(404))
|
||||
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
|
||||
mock_client.__aexit__ = AsyncMock(return_value=False)
|
||||
|
||||
async def run():
|
||||
with patch("httpx.AsyncClient", return_value=mock_client), \
|
||||
patch("asyncio.sleep", mock_sleep):
|
||||
from gateway.platforms.base import cache_image_from_url
|
||||
await cache_image_from_url(
|
||||
"http://example.com/img.jpg", ext=".jpg", retries=2
|
||||
)
|
||||
|
||||
with pytest.raises(httpx.HTTPStatusError):
|
||||
asyncio.run(run())
|
||||
|
||||
# Only 1 attempt, no sleep
|
||||
assert mock_client.get.call_count == 1
|
||||
mock_sleep.assert_not_called()
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# cache_audio_from_url (base.py)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class TestCacheAudioFromUrl:
|
||||
"""Tests for gateway.platforms.base.cache_audio_from_url"""
|
||||
|
||||
def test_success_on_first_attempt(self, tmp_path, monkeypatch):
|
||||
"""A clean 200 response caches the audio and returns a path."""
|
||||
monkeypatch.setattr("gateway.platforms.base.AUDIO_CACHE_DIR", tmp_path / "audio")
|
||||
|
||||
fake_response = MagicMock()
|
||||
fake_response.content = b"\x00\x01 fake audio"
|
||||
fake_response.raise_for_status = MagicMock()
|
||||
|
||||
mock_client = AsyncMock()
|
||||
mock_client.get = AsyncMock(return_value=fake_response)
|
||||
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
|
||||
mock_client.__aexit__ = AsyncMock(return_value=False)
|
||||
|
||||
async def run():
|
||||
with patch("httpx.AsyncClient", return_value=mock_client):
|
||||
from gateway.platforms.base import cache_audio_from_url
|
||||
return await cache_audio_from_url(
|
||||
"http://example.com/voice.ogg", ext=".ogg"
|
||||
)
|
||||
|
||||
path = asyncio.run(run())
|
||||
assert path.endswith(".ogg")
|
||||
mock_client.get.assert_called_once()
|
||||
|
||||
def test_retries_on_timeout_then_succeeds(self, tmp_path, monkeypatch):
|
||||
"""A timeout on the first attempt is retried; second attempt succeeds."""
|
||||
monkeypatch.setattr("gateway.platforms.base.AUDIO_CACHE_DIR", tmp_path / "audio")
|
||||
|
||||
fake_response = MagicMock()
|
||||
fake_response.content = b"audio data"
|
||||
fake_response.raise_for_status = MagicMock()
|
||||
|
||||
mock_client = AsyncMock()
|
||||
mock_client.get = AsyncMock(
|
||||
side_effect=[_make_timeout_error(), fake_response]
|
||||
)
|
||||
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
|
||||
mock_client.__aexit__ = AsyncMock(return_value=False)
|
||||
|
||||
mock_sleep = AsyncMock()
|
||||
|
||||
async def run():
|
||||
with patch("httpx.AsyncClient", return_value=mock_client), \
|
||||
patch("asyncio.sleep", mock_sleep):
|
||||
from gateway.platforms.base import cache_audio_from_url
|
||||
return await cache_audio_from_url(
|
||||
"http://example.com/voice.ogg", ext=".ogg", retries=2
|
||||
)
|
||||
|
||||
path = asyncio.run(run())
|
||||
assert path.endswith(".ogg")
|
||||
assert mock_client.get.call_count == 2
|
||||
mock_sleep.assert_called_once()
|
||||
|
||||
def test_retries_on_429_then_succeeds(self, tmp_path, monkeypatch):
|
||||
"""A 429 response on the first attempt is retried; second attempt succeeds."""
|
||||
monkeypatch.setattr("gateway.platforms.base.AUDIO_CACHE_DIR", tmp_path / "audio")
|
||||
|
||||
ok_response = MagicMock()
|
||||
ok_response.content = b"audio data"
|
||||
ok_response.raise_for_status = MagicMock()
|
||||
|
||||
mock_client = AsyncMock()
|
||||
mock_client.get = AsyncMock(
|
||||
side_effect=[_make_http_status_error(429), ok_response]
|
||||
)
|
||||
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
|
||||
mock_client.__aexit__ = AsyncMock(return_value=False)
|
||||
|
||||
async def run():
|
||||
with patch("httpx.AsyncClient", return_value=mock_client), \
|
||||
patch("asyncio.sleep", new_callable=AsyncMock):
|
||||
from gateway.platforms.base import cache_audio_from_url
|
||||
return await cache_audio_from_url(
|
||||
"http://example.com/voice.ogg", ext=".ogg", retries=2
|
||||
)
|
||||
|
||||
path = asyncio.run(run())
|
||||
assert path.endswith(".ogg")
|
||||
assert mock_client.get.call_count == 2
|
||||
|
||||
def test_retries_on_500_then_succeeds(self, tmp_path, monkeypatch):
|
||||
"""A 500 response on the first attempt is retried; second attempt succeeds."""
|
||||
monkeypatch.setattr("gateway.platforms.base.AUDIO_CACHE_DIR", tmp_path / "audio")
|
||||
|
||||
ok_response = MagicMock()
|
||||
ok_response.content = b"audio data"
|
||||
ok_response.raise_for_status = MagicMock()
|
||||
|
||||
mock_client = AsyncMock()
|
||||
mock_client.get = AsyncMock(
|
||||
side_effect=[_make_http_status_error(500), ok_response]
|
||||
)
|
||||
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
|
||||
mock_client.__aexit__ = AsyncMock(return_value=False)
|
||||
|
||||
async def run():
|
||||
with patch("httpx.AsyncClient", return_value=mock_client), \
|
||||
patch("asyncio.sleep", new_callable=AsyncMock):
|
||||
from gateway.platforms.base import cache_audio_from_url
|
||||
return await cache_audio_from_url(
|
||||
"http://example.com/voice.ogg", ext=".ogg", retries=2
|
||||
)
|
||||
|
||||
path = asyncio.run(run())
|
||||
assert path.endswith(".ogg")
|
||||
assert mock_client.get.call_count == 2
|
||||
|
||||
def test_raises_after_max_retries_exhausted(self, tmp_path, monkeypatch):
|
||||
"""Timeout on every attempt raises after all retries are consumed."""
|
||||
monkeypatch.setattr("gateway.platforms.base.AUDIO_CACHE_DIR", tmp_path / "audio")
|
||||
|
||||
mock_client = AsyncMock()
|
||||
mock_client.get = AsyncMock(side_effect=_make_timeout_error())
|
||||
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
|
||||
mock_client.__aexit__ = AsyncMock(return_value=False)
|
||||
|
||||
async def run():
|
||||
with patch("httpx.AsyncClient", return_value=mock_client), \
|
||||
patch("asyncio.sleep", new_callable=AsyncMock):
|
||||
from gateway.platforms.base import cache_audio_from_url
|
||||
await cache_audio_from_url(
|
||||
"http://example.com/voice.ogg", ext=".ogg", retries=2
|
||||
)
|
||||
|
||||
with pytest.raises(httpx.TimeoutException):
|
||||
asyncio.run(run())
|
||||
|
||||
# 3 total calls: initial + 2 retries
|
||||
assert mock_client.get.call_count == 3
|
||||
|
||||
def test_non_retryable_4xx_raises_immediately(self, tmp_path, monkeypatch):
|
||||
"""A 404 (non-retryable) is raised immediately without any retry."""
|
||||
monkeypatch.setattr("gateway.platforms.base.AUDIO_CACHE_DIR", tmp_path / "audio")
|
||||
|
||||
mock_sleep = AsyncMock()
|
||||
mock_client = AsyncMock()
|
||||
mock_client.get = AsyncMock(side_effect=_make_http_status_error(404))
|
||||
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
|
||||
mock_client.__aexit__ = AsyncMock(return_value=False)
|
||||
|
||||
async def run():
|
||||
with patch("httpx.AsyncClient", return_value=mock_client), \
|
||||
patch("asyncio.sleep", mock_sleep):
|
||||
from gateway.platforms.base import cache_audio_from_url
|
||||
await cache_audio_from_url(
|
||||
"http://example.com/voice.ogg", ext=".ogg", retries=2
|
||||
)
|
||||
|
||||
with pytest.raises(httpx.HTTPStatusError):
|
||||
asyncio.run(run())
|
||||
|
||||
# Only 1 attempt, no sleep
|
||||
assert mock_client.get.call_count == 1
|
||||
mock_sleep.assert_not_called()
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Slack mock setup (mirrors existing test_slack.py approach)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def _ensure_slack_mock():
|
||||
if "slack_bolt" in sys.modules and hasattr(sys.modules["slack_bolt"], "__file__"):
|
||||
return
|
||||
slack_bolt = MagicMock()
|
||||
slack_bolt.async_app.AsyncApp = MagicMock
|
||||
slack_bolt.adapter.socket_mode.async_handler.AsyncSocketModeHandler = MagicMock
|
||||
slack_sdk = MagicMock()
|
||||
slack_sdk.web.async_client.AsyncWebClient = MagicMock
|
||||
for name, mod in [
|
||||
("slack_bolt", slack_bolt),
|
||||
("slack_bolt.async_app", slack_bolt.async_app),
|
||||
("slack_bolt.adapter", slack_bolt.adapter),
|
||||
("slack_bolt.adapter.socket_mode", slack_bolt.adapter.socket_mode),
|
||||
("slack_bolt.adapter.socket_mode.async_handler",
|
||||
slack_bolt.adapter.socket_mode.async_handler),
|
||||
("slack_sdk", slack_sdk),
|
||||
("slack_sdk.web", slack_sdk.web),
|
||||
("slack_sdk.web.async_client", slack_sdk.web.async_client),
|
||||
]:
|
||||
sys.modules.setdefault(name, mod)
|
||||
|
||||
|
||||
_ensure_slack_mock()
|
||||
|
||||
import gateway.platforms.slack as _slack_mod # noqa: E402
|
||||
_slack_mod.SLACK_AVAILABLE = True
|
||||
|
||||
from gateway.platforms.slack import SlackAdapter # noqa: E402
|
||||
from gateway.config import Platform, PlatformConfig # noqa: E402
|
||||
|
||||
|
||||
def _make_slack_adapter():
|
||||
config = PlatformConfig(enabled=True, token="xoxb-fake-token")
|
||||
adapter = SlackAdapter(config)
|
||||
adapter._app = MagicMock()
|
||||
adapter._app.client = AsyncMock()
|
||||
adapter._bot_user_id = "U_BOT"
|
||||
adapter._running = True
|
||||
return adapter
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# SlackAdapter._download_slack_file
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class TestSlackDownloadSlackFile:
|
||||
"""Tests for SlackAdapter._download_slack_file"""
|
||||
|
||||
def test_success_on_first_attempt(self, tmp_path, monkeypatch):
|
||||
"""Successful download on first try returns a cached file path."""
|
||||
monkeypatch.setattr("gateway.platforms.base.IMAGE_CACHE_DIR", tmp_path / "img")
|
||||
adapter = _make_slack_adapter()
|
||||
|
||||
fake_response = MagicMock()
|
||||
fake_response.content = b"fake image bytes"
|
||||
fake_response.raise_for_status = MagicMock()
|
||||
|
||||
mock_client = AsyncMock()
|
||||
mock_client.get = AsyncMock(return_value=fake_response)
|
||||
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
|
||||
mock_client.__aexit__ = AsyncMock(return_value=False)
|
||||
|
||||
async def run():
|
||||
with patch("httpx.AsyncClient", return_value=mock_client):
|
||||
return await adapter._download_slack_file(
|
||||
"https://files.slack.com/img.jpg", ext=".jpg"
|
||||
)
|
||||
|
||||
path = asyncio.run(run())
|
||||
assert path.endswith(".jpg")
|
||||
mock_client.get.assert_called_once()
|
||||
|
||||
def test_retries_on_timeout_then_succeeds(self, tmp_path, monkeypatch):
|
||||
"""Timeout on first attempt triggers retry; success on second."""
|
||||
monkeypatch.setattr("gateway.platforms.base.IMAGE_CACHE_DIR", tmp_path / "img")
|
||||
adapter = _make_slack_adapter()
|
||||
|
||||
fake_response = MagicMock()
|
||||
fake_response.content = b"image bytes"
|
||||
fake_response.raise_for_status = MagicMock()
|
||||
|
||||
mock_client = AsyncMock()
|
||||
mock_client.get = AsyncMock(
|
||||
side_effect=[_make_timeout_error(), fake_response]
|
||||
)
|
||||
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
|
||||
mock_client.__aexit__ = AsyncMock(return_value=False)
|
||||
|
||||
mock_sleep = AsyncMock()
|
||||
|
||||
async def run():
|
||||
with patch("httpx.AsyncClient", return_value=mock_client), \
|
||||
patch("asyncio.sleep", mock_sleep):
|
||||
return await adapter._download_slack_file(
|
||||
"https://files.slack.com/img.jpg", ext=".jpg"
|
||||
)
|
||||
|
||||
path = asyncio.run(run())
|
||||
assert path.endswith(".jpg")
|
||||
assert mock_client.get.call_count == 2
|
||||
mock_sleep.assert_called_once()
|
||||
|
||||
def test_raises_after_max_retries(self, tmp_path, monkeypatch):
|
||||
"""Timeout on every attempt eventually raises after 3 total tries."""
|
||||
monkeypatch.setattr("gateway.platforms.base.IMAGE_CACHE_DIR", tmp_path / "img")
|
||||
adapter = _make_slack_adapter()
|
||||
|
||||
mock_client = AsyncMock()
|
||||
mock_client.get = AsyncMock(side_effect=_make_timeout_error())
|
||||
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
|
||||
mock_client.__aexit__ = AsyncMock(return_value=False)
|
||||
|
||||
async def run():
|
||||
with patch("httpx.AsyncClient", return_value=mock_client), \
|
||||
patch("asyncio.sleep", new_callable=AsyncMock):
|
||||
await adapter._download_slack_file(
|
||||
"https://files.slack.com/img.jpg", ext=".jpg"
|
||||
)
|
||||
|
||||
with pytest.raises(httpx.TimeoutException):
|
||||
asyncio.run(run())
|
||||
|
||||
assert mock_client.get.call_count == 3
|
||||
|
||||
def test_non_retryable_403_raises_immediately(self, tmp_path, monkeypatch):
|
||||
"""A 403 is not retried; it raises immediately."""
|
||||
monkeypatch.setattr("gateway.platforms.base.IMAGE_CACHE_DIR", tmp_path / "img")
|
||||
adapter = _make_slack_adapter()
|
||||
|
||||
mock_sleep = AsyncMock()
|
||||
mock_client = AsyncMock()
|
||||
mock_client.get = AsyncMock(side_effect=_make_http_status_error(403))
|
||||
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
|
||||
mock_client.__aexit__ = AsyncMock(return_value=False)
|
||||
|
||||
async def run():
|
||||
with patch("httpx.AsyncClient", return_value=mock_client), \
|
||||
patch("asyncio.sleep", mock_sleep):
|
||||
await adapter._download_slack_file(
|
||||
"https://files.slack.com/img.jpg", ext=".jpg"
|
||||
)
|
||||
|
||||
with pytest.raises(httpx.HTTPStatusError):
|
||||
asyncio.run(run())
|
||||
|
||||
assert mock_client.get.call_count == 1
|
||||
mock_sleep.assert_not_called()
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# SlackAdapter._download_slack_file_bytes
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class TestSlackDownloadSlackFileBytes:
|
||||
"""Tests for SlackAdapter._download_slack_file_bytes"""
|
||||
|
||||
def test_success_returns_bytes(self):
|
||||
"""Successful download returns raw bytes."""
|
||||
adapter = _make_slack_adapter()
|
||||
|
||||
fake_response = MagicMock()
|
||||
fake_response.content = b"raw bytes here"
|
||||
fake_response.raise_for_status = MagicMock()
|
||||
|
||||
mock_client = AsyncMock()
|
||||
mock_client.get = AsyncMock(return_value=fake_response)
|
||||
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
|
||||
mock_client.__aexit__ = AsyncMock(return_value=False)
|
||||
|
||||
async def run():
|
||||
with patch("httpx.AsyncClient", return_value=mock_client):
|
||||
return await adapter._download_slack_file_bytes(
|
||||
"https://files.slack.com/file.bin"
|
||||
)
|
||||
|
||||
result = asyncio.run(run())
|
||||
assert result == b"raw bytes here"
|
||||
|
||||
def test_retries_on_429_then_succeeds(self):
|
||||
"""429 on first attempt is retried; raw bytes returned on second."""
|
||||
adapter = _make_slack_adapter()
|
||||
|
||||
ok_response = MagicMock()
|
||||
ok_response.content = b"final bytes"
|
||||
ok_response.raise_for_status = MagicMock()
|
||||
|
||||
mock_client = AsyncMock()
|
||||
mock_client.get = AsyncMock(
|
||||
side_effect=[_make_http_status_error(429), ok_response]
|
||||
)
|
||||
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
|
||||
mock_client.__aexit__ = AsyncMock(return_value=False)
|
||||
|
||||
async def run():
|
||||
with patch("httpx.AsyncClient", return_value=mock_client), \
|
||||
patch("asyncio.sleep", new_callable=AsyncMock):
|
||||
return await adapter._download_slack_file_bytes(
|
||||
"https://files.slack.com/file.bin"
|
||||
)
|
||||
|
||||
result = asyncio.run(run())
|
||||
assert result == b"final bytes"
|
||||
assert mock_client.get.call_count == 2
|
||||
|
||||
def test_raises_after_max_retries(self):
|
||||
"""Persistent timeouts raise after all 3 attempts are exhausted."""
|
||||
adapter = _make_slack_adapter()
|
||||
|
||||
mock_client = AsyncMock()
|
||||
mock_client.get = AsyncMock(side_effect=_make_timeout_error())
|
||||
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
|
||||
mock_client.__aexit__ = AsyncMock(return_value=False)
|
||||
|
||||
async def run():
|
||||
with patch("httpx.AsyncClient", return_value=mock_client), \
|
||||
patch("asyncio.sleep", new_callable=AsyncMock):
|
||||
await adapter._download_slack_file_bytes(
|
||||
"https://files.slack.com/file.bin"
|
||||
)
|
||||
|
||||
with pytest.raises(httpx.TimeoutException):
|
||||
asyncio.run(run())
|
||||
|
||||
assert mock_client.get.call_count == 3
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# MattermostAdapter._send_url_as_file
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def _make_mm_adapter():
|
||||
"""Build a minimal MattermostAdapter with mocked internals."""
|
||||
from gateway.platforms.mattermost import MattermostAdapter
|
||||
config = PlatformConfig(
|
||||
enabled=True, token="mm-token-fake",
|
||||
extra={"url": "https://mm.example.com"},
|
||||
)
|
||||
adapter = MattermostAdapter(config)
|
||||
adapter._session = MagicMock()
|
||||
adapter._upload_file = AsyncMock(return_value="file-id-123")
|
||||
adapter._api_post = AsyncMock(return_value={"id": "post-id-abc"})
|
||||
adapter.send = AsyncMock(return_value=MagicMock(success=True))
|
||||
return adapter
|
||||
|
||||
|
||||
def _make_aiohttp_resp(status: int, content: bytes = b"file bytes",
|
||||
content_type: str = "image/jpeg"):
|
||||
"""Build a context-manager mock for an aiohttp response."""
|
||||
resp = MagicMock()
|
||||
resp.status = status
|
||||
resp.content_type = content_type
|
||||
resp.read = AsyncMock(return_value=content)
|
||||
resp.__aenter__ = AsyncMock(return_value=resp)
|
||||
resp.__aexit__ = AsyncMock(return_value=False)
|
||||
return resp
|
||||
|
||||
|
||||
class TestMattermostSendUrlAsFile:
|
||||
"""Tests for MattermostAdapter._send_url_as_file"""
|
||||
|
||||
def test_success_on_first_attempt(self):
|
||||
"""200 on first attempt → file uploaded and post created."""
|
||||
adapter = _make_mm_adapter()
|
||||
resp = _make_aiohttp_resp(200)
|
||||
adapter._session.get = MagicMock(return_value=resp)
|
||||
|
||||
async def run():
|
||||
with patch("asyncio.sleep", new_callable=AsyncMock):
|
||||
return await adapter._send_url_as_file(
|
||||
"C123", "http://cdn.example.com/img.png", "caption", None
|
||||
)
|
||||
|
||||
result = asyncio.run(run())
|
||||
assert result.success
|
||||
adapter._upload_file.assert_called_once()
|
||||
adapter._api_post.assert_called_once()
|
||||
|
||||
def test_retries_on_429_then_succeeds(self):
|
||||
"""429 on first attempt is retried; 200 on second attempt succeeds."""
|
||||
adapter = _make_mm_adapter()
|
||||
|
||||
resp_429 = _make_aiohttp_resp(429)
|
||||
resp_200 = _make_aiohttp_resp(200)
|
||||
adapter._session.get = MagicMock(side_effect=[resp_429, resp_200])
|
||||
|
||||
mock_sleep = AsyncMock()
|
||||
|
||||
async def run():
|
||||
with patch("asyncio.sleep", mock_sleep):
|
||||
return await adapter._send_url_as_file(
|
||||
"C123", "http://cdn.example.com/img.png", None, None
|
||||
)
|
||||
|
||||
result = asyncio.run(run())
|
||||
assert result.success
|
||||
assert adapter._session.get.call_count == 2
|
||||
mock_sleep.assert_called_once()
|
||||
|
||||
def test_retries_on_500_then_succeeds(self):
|
||||
"""5xx on first attempt is retried; 200 on second attempt succeeds."""
|
||||
adapter = _make_mm_adapter()
|
||||
|
||||
resp_500 = _make_aiohttp_resp(500)
|
||||
resp_200 = _make_aiohttp_resp(200)
|
||||
adapter._session.get = MagicMock(side_effect=[resp_500, resp_200])
|
||||
|
||||
async def run():
|
||||
with patch("asyncio.sleep", new_callable=AsyncMock):
|
||||
return await adapter._send_url_as_file(
|
||||
"C123", "http://cdn.example.com/img.png", None, None
|
||||
)
|
||||
|
||||
result = asyncio.run(run())
|
||||
assert result.success
|
||||
assert adapter._session.get.call_count == 2
|
||||
|
||||
def test_falls_back_to_text_after_max_retries_on_5xx(self):
|
||||
"""Three consecutive 500s exhaust retries; falls back to send() with URL text."""
|
||||
adapter = _make_mm_adapter()
|
||||
|
||||
resp_500 = _make_aiohttp_resp(500)
|
||||
adapter._session.get = MagicMock(return_value=resp_500)
|
||||
|
||||
async def run():
|
||||
with patch("asyncio.sleep", new_callable=AsyncMock):
|
||||
return await adapter._send_url_as_file(
|
||||
"C123", "http://cdn.example.com/img.png", "my caption", None
|
||||
)
|
||||
|
||||
asyncio.run(run())
|
||||
|
||||
adapter.send.assert_called_once()
|
||||
text_arg = adapter.send.call_args[0][1]
|
||||
assert "http://cdn.example.com/img.png" in text_arg
|
||||
|
||||
def test_falls_back_on_client_error(self):
|
||||
"""aiohttp.ClientError on every attempt falls back to send() with URL."""
|
||||
import aiohttp
|
||||
|
||||
adapter = _make_mm_adapter()
|
||||
|
||||
error_resp = MagicMock()
|
||||
error_resp.__aenter__ = AsyncMock(
|
||||
side_effect=aiohttp.ClientConnectionError("connection refused")
|
||||
)
|
||||
error_resp.__aexit__ = AsyncMock(return_value=False)
|
||||
adapter._session.get = MagicMock(return_value=error_resp)
|
||||
|
||||
async def run():
|
||||
with patch("asyncio.sleep", new_callable=AsyncMock):
|
||||
return await adapter._send_url_as_file(
|
||||
"C123", "http://cdn.example.com/img.png", None, None
|
||||
)
|
||||
|
||||
asyncio.run(run())
|
||||
|
||||
adapter.send.assert_called_once()
|
||||
text_arg = adapter.send.call_args[0][1]
|
||||
assert "http://cdn.example.com/img.png" in text_arg
|
||||
|
||||
def test_non_retryable_404_falls_back_immediately(self):
|
||||
"""404 is non-retryable (< 500, != 429); send() is called right away."""
|
||||
adapter = _make_mm_adapter()
|
||||
|
||||
resp_404 = _make_aiohttp_resp(404)
|
||||
adapter._session.get = MagicMock(return_value=resp_404)
|
||||
|
||||
mock_sleep = AsyncMock()
|
||||
|
||||
async def run():
|
||||
with patch("asyncio.sleep", mock_sleep):
|
||||
return await adapter._send_url_as_file(
|
||||
"C123", "http://cdn.example.com/img.png", None, None
|
||||
)
|
||||
|
||||
asyncio.run(run())
|
||||
|
||||
adapter.send.assert_called_once()
|
||||
# No sleep — fell back on first attempt
|
||||
mock_sleep.assert_not_called()
|
||||
assert adapter._session.get.call_count == 1
|
||||
@@ -62,18 +62,6 @@ class TestMessageEventGetCommand:
|
||||
event = MessageEvent(text="/")
|
||||
assert event.get_command() == ""
|
||||
|
||||
def test_command_with_at_botname(self):
|
||||
event = MessageEvent(text="/new@TigerNanoBot")
|
||||
assert event.get_command() == "new"
|
||||
|
||||
def test_command_with_at_botname_and_args(self):
|
||||
event = MessageEvent(text="/compress@TigerNanoBot")
|
||||
assert event.get_command() == "compress"
|
||||
|
||||
def test_command_mixed_case_with_at_botname(self):
|
||||
event = MessageEvent(text="/RESET@TigerNanoBot")
|
||||
assert event.get_command() == "reset"
|
||||
|
||||
|
||||
class TestMessageEventGetCommandArgs:
|
||||
def test_command_with_args(self):
|
||||
|
||||
@@ -344,7 +344,6 @@ class TestRuntimeDisconnectQueuing:
|
||||
async def test_retryable_runtime_error_queued_for_reconnect(self):
|
||||
"""Retryable runtime errors should add the platform to _failed_platforms."""
|
||||
runner = _make_runner()
|
||||
runner.stop = AsyncMock()
|
||||
|
||||
adapter = StubAdapter(succeed=True)
|
||||
adapter._set_fatal_error("network_error", "DNS failure", retryable=True)
|
||||
@@ -372,12 +371,8 @@ class TestRuntimeDisconnectQueuing:
|
||||
assert Platform.TELEGRAM not in runner._failed_platforms
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_retryable_error_exits_for_service_restart_when_all_down(self):
|
||||
"""Gateway should exit with failure when all platforms fail with retryable errors.
|
||||
|
||||
This lets systemd Restart=on-failure restart the process, which is more
|
||||
reliable than in-process background reconnection after exhausted retries.
|
||||
"""
|
||||
async def test_retryable_error_prevents_shutdown_when_queued(self):
|
||||
"""Gateway should not shut down if failed platforms are queued for reconnection."""
|
||||
runner = _make_runner()
|
||||
runner.stop = AsyncMock()
|
||||
|
||||
@@ -387,28 +382,7 @@ class TestRuntimeDisconnectQueuing:
|
||||
|
||||
await runner._handle_adapter_fatal_error(adapter)
|
||||
|
||||
# stop() SHOULD be called — gateway exits for systemd restart
|
||||
runner.stop.assert_called_once()
|
||||
assert runner._exit_with_failure is True
|
||||
assert Platform.TELEGRAM in runner._failed_platforms
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_retryable_error_no_exit_when_other_adapters_still_connected(self):
|
||||
"""Gateway should NOT exit if some adapters are still connected."""
|
||||
runner = _make_runner()
|
||||
runner.stop = AsyncMock()
|
||||
|
||||
failing_adapter = StubAdapter(succeed=True)
|
||||
failing_adapter._set_fatal_error("network_error", "DNS failure", retryable=True)
|
||||
runner.adapters[Platform.TELEGRAM] = failing_adapter
|
||||
|
||||
# Another adapter is still connected
|
||||
healthy_adapter = StubAdapter(succeed=True)
|
||||
runner.adapters[Platform.DISCORD] = healthy_adapter
|
||||
|
||||
await runner._handle_adapter_fatal_error(failing_adapter)
|
||||
|
||||
# stop() should NOT have been called — Discord is still up
|
||||
# stop() should NOT have been called since we have platforms queued
|
||||
runner.stop.assert_not_called()
|
||||
assert Platform.TELEGRAM in runner._failed_platforms
|
||||
|
||||
|
||||
@@ -14,8 +14,8 @@ from gateway.session import SessionSource
|
||||
|
||||
|
||||
class ProgressCaptureAdapter(BasePlatformAdapter):
|
||||
def __init__(self, platform=Platform.TELEGRAM):
|
||||
super().__init__(PlatformConfig(enabled=True, token="***"), platform)
|
||||
def __init__(self):
|
||||
super().__init__(PlatformConfig(enabled=True, token="fake-token"), Platform.TELEGRAM)
|
||||
self.sent = []
|
||||
self.edits = []
|
||||
self.typing = []
|
||||
@@ -76,7 +76,7 @@ def _make_runner(adapter):
|
||||
GatewayRunner = gateway_run.GatewayRunner
|
||||
|
||||
runner = object.__new__(GatewayRunner)
|
||||
runner.adapters = {adapter.platform: adapter}
|
||||
runner.adapters = {Platform.TELEGRAM: adapter}
|
||||
runner._voice_mode = {}
|
||||
runner._prefill_messages = []
|
||||
runner._ephemeral_system_prompt = ""
|
||||
@@ -133,87 +133,3 @@ async def test_run_agent_progress_stays_in_originating_topic(monkeypatch, tmp_pa
|
||||
]
|
||||
assert adapter.edits
|
||||
assert all(call["metadata"] == {"thread_id": "17585"} for call in adapter.typing)
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_run_agent_progress_does_not_use_event_message_id_for_telegram_dm(monkeypatch, tmp_path):
|
||||
"""Telegram DM progress must not reuse event message id as thread metadata."""
|
||||
monkeypatch.setenv("HERMES_TOOL_PROGRESS_MODE", "all")
|
||||
|
||||
fake_dotenv = types.ModuleType("dotenv")
|
||||
fake_dotenv.load_dotenv = lambda *args, **kwargs: None
|
||||
monkeypatch.setitem(sys.modules, "dotenv", fake_dotenv)
|
||||
|
||||
fake_run_agent = types.ModuleType("run_agent")
|
||||
fake_run_agent.AIAgent = FakeAgent
|
||||
monkeypatch.setitem(sys.modules, "run_agent", fake_run_agent)
|
||||
|
||||
adapter = ProgressCaptureAdapter(platform=Platform.TELEGRAM)
|
||||
runner = _make_runner(adapter)
|
||||
gateway_run = importlib.import_module("gateway.run")
|
||||
monkeypatch.setattr(gateway_run, "_hermes_home", tmp_path)
|
||||
monkeypatch.setattr(gateway_run, "_resolve_runtime_agent_kwargs", lambda: {"api_key": "***"})
|
||||
|
||||
source = SessionSource(
|
||||
platform=Platform.TELEGRAM,
|
||||
chat_id="12345",
|
||||
chat_type="dm",
|
||||
thread_id=None,
|
||||
)
|
||||
|
||||
result = await runner._run_agent(
|
||||
message="hello",
|
||||
context_prompt="",
|
||||
history=[],
|
||||
source=source,
|
||||
session_id="sess-2",
|
||||
session_key="agent:main:telegram:dm:12345",
|
||||
event_message_id="777",
|
||||
)
|
||||
|
||||
assert result["final_response"] == "done"
|
||||
assert adapter.sent
|
||||
assert adapter.sent[0]["metadata"] is None
|
||||
assert all(call["metadata"] is None for call in adapter.typing)
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_run_agent_progress_uses_event_message_id_for_slack_dm(monkeypatch, tmp_path):
|
||||
"""Slack DM progress should keep event ts fallback threading."""
|
||||
monkeypatch.setenv("HERMES_TOOL_PROGRESS_MODE", "all")
|
||||
|
||||
fake_dotenv = types.ModuleType("dotenv")
|
||||
fake_dotenv.load_dotenv = lambda *args, **kwargs: None
|
||||
monkeypatch.setitem(sys.modules, "dotenv", fake_dotenv)
|
||||
|
||||
fake_run_agent = types.ModuleType("run_agent")
|
||||
fake_run_agent.AIAgent = FakeAgent
|
||||
monkeypatch.setitem(sys.modules, "run_agent", fake_run_agent)
|
||||
|
||||
adapter = ProgressCaptureAdapter(platform=Platform.SLACK)
|
||||
runner = _make_runner(adapter)
|
||||
gateway_run = importlib.import_module("gateway.run")
|
||||
monkeypatch.setattr(gateway_run, "_hermes_home", tmp_path)
|
||||
monkeypatch.setattr(gateway_run, "_resolve_runtime_agent_kwargs", lambda: {"api_key": "***"})
|
||||
|
||||
source = SessionSource(
|
||||
platform=Platform.SLACK,
|
||||
chat_id="D123",
|
||||
chat_type="dm",
|
||||
thread_id=None,
|
||||
)
|
||||
|
||||
result = await runner._run_agent(
|
||||
message="hello",
|
||||
context_prompt="",
|
||||
history=[],
|
||||
source=source,
|
||||
session_id="sess-3",
|
||||
session_key="agent:main:slack:dm:D123",
|
||||
event_message_id="1234567890.000001",
|
||||
)
|
||||
|
||||
assert result["final_response"] == "done"
|
||||
assert adapter.sent
|
||||
assert adapter.sent[0]["metadata"] == {"thread_id": "1234567890.000001"}
|
||||
assert all(call["metadata"] == {"thread_id": "1234567890.000001"} for call in adapter.typing)
|
||||
|
||||
@@ -89,8 +89,7 @@ async def test_runner_queues_retryable_runtime_fatal_for_reconnection(monkeypatc
|
||||
|
||||
await runner._handle_adapter_fatal_error(adapter)
|
||||
|
||||
# Should shut down with failure — systemd Restart=on-failure will restart
|
||||
runner.stop.assert_awaited_once()
|
||||
assert runner._exit_with_failure is True
|
||||
# Should NOT shut down — platform is queued for reconnection
|
||||
runner.stop.assert_not_awaited()
|
||||
assert Platform.WHATSAPP in runner._failed_platforms
|
||||
assert runner._failed_platforms[Platform.WHATSAPP]["attempts"] == 0
|
||||
|
||||
@@ -76,7 +76,7 @@ def _ensure_telegram_mock():
|
||||
telegram_mod.constants.ChatType.CHANNEL = "channel"
|
||||
telegram_mod.constants.ChatType.PRIVATE = "private"
|
||||
|
||||
for name in ("telegram", "telegram.ext", "telegram.constants", "telegram.request"):
|
||||
for name in ("telegram", "telegram.ext", "telegram.constants"):
|
||||
sys.modules.setdefault(name, telegram_mod)
|
||||
|
||||
|
||||
|
||||
@@ -846,7 +846,7 @@ class TestLastPromptTokens:
|
||||
|
||||
store.update_session("k1", model="openai/gpt-5.4")
|
||||
|
||||
store._db.set_token_counts.assert_called_once_with(
|
||||
store._db.update_token_counts.assert_called_once_with(
|
||||
"s1",
|
||||
input_tokens=0,
|
||||
output_tokens=0,
|
||||
@@ -858,7 +858,6 @@ class TestLastPromptTokens:
|
||||
billing_provider=None,
|
||||
billing_base_url=None,
|
||||
model="openai/gpt-5.4",
|
||||
absolute=True,
|
||||
)
|
||||
|
||||
|
||||
|
||||
@@ -304,12 +304,8 @@ async def test_session_hygiene_messages_stay_in_originating_topic(monkeypatch, t
|
||||
class FakeCompressAgent:
|
||||
def __init__(self, **kwargs):
|
||||
self.model = kwargs.get("model")
|
||||
self.session_id = kwargs.get("session_id", "fake-session")
|
||||
self._print_fn = None
|
||||
|
||||
def _compress_context(self, messages, *_args, **_kwargs):
|
||||
# Simulate real _compress_context: create a new session_id
|
||||
self.session_id = f"{self.session_id}_compressed"
|
||||
return ([{"role": "assistant", "content": "compressed"}], None)
|
||||
|
||||
fake_run_agent = types.ModuleType("run_agent")
|
||||
|
||||
@@ -1,110 +0,0 @@
|
||||
"""Tests for GatewayRunner._format_session_info — session config surfacing."""
|
||||
|
||||
import pytest
|
||||
from unittest.mock import patch, MagicMock
|
||||
from pathlib import Path
|
||||
|
||||
from gateway.run import GatewayRunner
|
||||
|
||||
|
||||
@pytest.fixture()
|
||||
def runner():
|
||||
"""Create a bare GatewayRunner without __init__."""
|
||||
return GatewayRunner.__new__(GatewayRunner)
|
||||
|
||||
|
||||
def _patch_info(tmp_path, config_yaml, model, runtime):
|
||||
"""Return a context-manager stack that patches _format_session_info deps."""
|
||||
cfg_path = tmp_path / "config.yaml"
|
||||
if config_yaml is not None:
|
||||
cfg_path.write_text(config_yaml)
|
||||
return (
|
||||
patch("gateway.run._hermes_home", tmp_path),
|
||||
patch("gateway.run._resolve_gateway_model", return_value=model),
|
||||
patch("gateway.run._resolve_runtime_agent_kwargs", return_value=runtime),
|
||||
)
|
||||
|
||||
|
||||
class TestFormatSessionInfo:
|
||||
|
||||
def test_includes_model_name(self, runner, tmp_path):
|
||||
p1, p2, p3 = _patch_info(tmp_path, "model:\n default: anthropic/claude-opus-4.6\n provider: openrouter\n",
|
||||
"anthropic/claude-opus-4.6",
|
||||
{"provider": "openrouter", "base_url": "https://openrouter.ai/api/v1", "api_key": "k"})
|
||||
with p1, p2, p3:
|
||||
info = runner._format_session_info()
|
||||
assert "claude-opus-4.6" in info
|
||||
|
||||
def test_includes_provider(self, runner, tmp_path):
|
||||
p1, p2, p3 = _patch_info(tmp_path, "model:\n default: test-model\n provider: openrouter\n",
|
||||
"test-model",
|
||||
{"provider": "openrouter", "base_url": "", "api_key": ""})
|
||||
with p1, p2, p3:
|
||||
info = runner._format_session_info()
|
||||
assert "openrouter" in info
|
||||
|
||||
def test_config_context_length(self, runner, tmp_path):
|
||||
p1, p2, p3 = _patch_info(tmp_path, "model:\n default: test-model\n context_length: 32768\n",
|
||||
"test-model",
|
||||
{"provider": "custom", "base_url": "", "api_key": ""})
|
||||
with p1, p2, p3:
|
||||
info = runner._format_session_info()
|
||||
assert "32K" in info
|
||||
assert "config" in info
|
||||
|
||||
def test_default_fallback_hint(self, runner, tmp_path):
|
||||
p1, p2, p3 = _patch_info(tmp_path, "model:\n default: unknown-model-xyz\n",
|
||||
"unknown-model-xyz",
|
||||
{"provider": "", "base_url": "", "api_key": ""})
|
||||
with p1, p2, p3:
|
||||
info = runner._format_session_info()
|
||||
assert "128K" in info
|
||||
assert "model.context_length" in info
|
||||
|
||||
def test_local_endpoint_shown(self, runner, tmp_path):
|
||||
p1, p2, p3 = _patch_info(
|
||||
tmp_path,
|
||||
"model:\n default: qwen3:8b\n provider: custom\n base_url: http://localhost:11434/v1\n context_length: 8192\n",
|
||||
"qwen3:8b",
|
||||
{"provider": "custom", "base_url": "http://localhost:11434/v1", "api_key": ""})
|
||||
with p1, p2, p3:
|
||||
info = runner._format_session_info()
|
||||
assert "localhost:11434" in info
|
||||
assert "8K" in info
|
||||
|
||||
def test_cloud_endpoint_hidden(self, runner, tmp_path):
|
||||
p1, p2, p3 = _patch_info(tmp_path, "model:\n default: test-model\n provider: openrouter\n",
|
||||
"test-model",
|
||||
{"provider": "openrouter", "base_url": "https://openrouter.ai/api/v1", "api_key": "k"})
|
||||
with p1, p2, p3:
|
||||
info = runner._format_session_info()
|
||||
assert "Endpoint" not in info
|
||||
|
||||
def test_million_context_format(self, runner, tmp_path):
|
||||
p1, p2, p3 = _patch_info(tmp_path, "model:\n default: test-model\n context_length: 1000000\n",
|
||||
"test-model",
|
||||
{"provider": "", "base_url": "", "api_key": ""})
|
||||
with p1, p2, p3:
|
||||
info = runner._format_session_info()
|
||||
assert "1.0M" in info
|
||||
|
||||
def test_missing_config(self, runner, tmp_path):
|
||||
"""No config.yaml should not crash."""
|
||||
p1, p2, p3 = _patch_info(tmp_path, None, # don't create config
|
||||
"anthropic/claude-sonnet-4.6",
|
||||
{"provider": "openrouter", "base_url": "", "api_key": ""})
|
||||
with p1, p2, p3:
|
||||
info = runner._format_session_info()
|
||||
assert "Model" in info
|
||||
assert "Context" in info
|
||||
|
||||
def test_runtime_resolution_failure_doesnt_crash(self, runner, tmp_path):
|
||||
"""If runtime resolution raises, should still produce output."""
|
||||
cfg_path = tmp_path / "config.yaml"
|
||||
cfg_path.write_text("model:\n default: test-model\n context_length: 4096\n")
|
||||
with patch("gateway.run._hermes_home", tmp_path), \
|
||||
patch("gateway.run._resolve_gateway_model", return_value="test-model"), \
|
||||
patch("gateway.run._resolve_runtime_agent_kwargs", side_effect=RuntimeError("no creds")):
|
||||
info = runner._format_session_info()
|
||||
assert "4K" in info
|
||||
assert "config" in info
|
||||
@@ -1,280 +0,0 @@
|
||||
"""Tests for SSE client disconnect → agent task cancellation.
|
||||
|
||||
When a streaming /v1/chat/completions client disconnects mid-stream
|
||||
(network drop, browser tab close), the agent is interrupted via
|
||||
agent.interrupt() so it stops making LLM API calls, and the asyncio
|
||||
task wrapper is cancelled.
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import json
|
||||
import queue
|
||||
from unittest.mock import AsyncMock, MagicMock, patch
|
||||
|
||||
import pytest
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Helpers
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def _make_adapter():
|
||||
"""Build a minimal APIServerAdapter with mocked internals."""
|
||||
from gateway.platforms.api_server import APIServerAdapter
|
||||
from gateway.config import PlatformConfig
|
||||
|
||||
config = PlatformConfig(enabled=True, token="test-key")
|
||||
adapter = APIServerAdapter(config)
|
||||
return adapter
|
||||
|
||||
|
||||
def _make_request():
|
||||
"""Build a mock aiohttp request."""
|
||||
req = MagicMock()
|
||||
req.headers = {}
|
||||
return req
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Tests
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class TestSSEAgentCancelOnDisconnect:
|
||||
"""gateway/platforms/api_server.py — _write_sse_chat_completion()"""
|
||||
|
||||
def test_agent_task_cancelled_on_client_disconnect(self):
|
||||
"""When response.write raises ConnectionResetError (client dropped),
|
||||
the agent task must be cancelled."""
|
||||
adapter = _make_adapter()
|
||||
|
||||
stream_q = queue.Queue()
|
||||
stream_q.put("hello ") # Some data already queued
|
||||
|
||||
# Agent task that runs forever (simulates a long LLM call)
|
||||
agent_done = asyncio.Event()
|
||||
|
||||
async def fake_agent():
|
||||
await agent_done.wait()
|
||||
return {"final_response": "done"}, {"input_tokens": 10, "output_tokens": 5, "total_tokens": 15}
|
||||
|
||||
async def run():
|
||||
from aiohttp import web
|
||||
|
||||
agent_task = asyncio.ensure_future(fake_agent())
|
||||
|
||||
# Mock response that raises ConnectionResetError on second write
|
||||
mock_response = AsyncMock(spec=web.StreamResponse)
|
||||
call_count = 0
|
||||
|
||||
async def write_side_effect(data):
|
||||
nonlocal call_count
|
||||
call_count += 1
|
||||
if call_count >= 2:
|
||||
raise ConnectionResetError("client disconnected")
|
||||
|
||||
mock_response.write = AsyncMock(side_effect=write_side_effect)
|
||||
mock_response.prepare = AsyncMock()
|
||||
|
||||
with patch.object(type(adapter), '_write_sse_chat_completion',
|
||||
adapter._write_sse_chat_completion):
|
||||
# Patch StreamResponse creation
|
||||
with patch("gateway.platforms.api_server.web.StreamResponse",
|
||||
return_value=mock_response):
|
||||
await adapter._write_sse_chat_completion(
|
||||
_make_request(), "cmpl-123", "gpt-4", 1234567890,
|
||||
stream_q, agent_task,
|
||||
)
|
||||
|
||||
# The critical assertion: agent_task must be cancelled
|
||||
assert agent_task.cancelled() or agent_task.done()
|
||||
# Clean up
|
||||
agent_done.set()
|
||||
|
||||
asyncio.run(run())
|
||||
|
||||
def test_agent_task_not_cancelled_on_normal_completion(self):
|
||||
"""On normal stream completion, agent task should NOT be cancelled."""
|
||||
adapter = _make_adapter()
|
||||
|
||||
stream_q = queue.Queue()
|
||||
stream_q.put("hello")
|
||||
stream_q.put(None) # End-of-stream sentinel
|
||||
|
||||
async def fake_agent():
|
||||
return {"final_response": "done"}, {"input_tokens": 10, "output_tokens": 5, "total_tokens": 15}
|
||||
|
||||
async def run():
|
||||
from aiohttp import web
|
||||
|
||||
agent_task = asyncio.ensure_future(fake_agent())
|
||||
await asyncio.sleep(0) # Let agent complete
|
||||
|
||||
mock_response = AsyncMock(spec=web.StreamResponse)
|
||||
mock_response.write = AsyncMock()
|
||||
mock_response.prepare = AsyncMock()
|
||||
|
||||
with patch("gateway.platforms.api_server.web.StreamResponse",
|
||||
return_value=mock_response):
|
||||
await adapter._write_sse_chat_completion(
|
||||
_make_request(), "cmpl-456", "gpt-4", 1234567890,
|
||||
stream_q, agent_task,
|
||||
)
|
||||
|
||||
# Agent should have completed normally, not been cancelled
|
||||
assert agent_task.done()
|
||||
assert not agent_task.cancelled()
|
||||
|
||||
asyncio.run(run())
|
||||
|
||||
def test_broken_pipe_also_cancels_agent(self):
|
||||
"""BrokenPipeError (another disconnect variant) also cancels the task."""
|
||||
adapter = _make_adapter()
|
||||
|
||||
stream_q = queue.Queue()
|
||||
|
||||
async def fake_agent():
|
||||
await asyncio.sleep(999) # Never completes
|
||||
return {}, {}
|
||||
|
||||
async def run():
|
||||
from aiohttp import web
|
||||
|
||||
agent_task = asyncio.ensure_future(fake_agent())
|
||||
|
||||
mock_response = AsyncMock(spec=web.StreamResponse)
|
||||
mock_response.write = AsyncMock(side_effect=BrokenPipeError("pipe broken"))
|
||||
mock_response.prepare = AsyncMock()
|
||||
|
||||
with patch("gateway.platforms.api_server.web.StreamResponse",
|
||||
return_value=mock_response):
|
||||
await adapter._write_sse_chat_completion(
|
||||
_make_request(), "cmpl-789", "gpt-4", 1234567890,
|
||||
stream_q, agent_task,
|
||||
)
|
||||
|
||||
assert agent_task.cancelled() or agent_task.done()
|
||||
|
||||
asyncio.run(run())
|
||||
|
||||
def test_already_done_task_not_cancelled_on_disconnect(self):
|
||||
"""If agent already finished before disconnect, don't try to cancel."""
|
||||
adapter = _make_adapter()
|
||||
|
||||
stream_q = queue.Queue()
|
||||
stream_q.put("data")
|
||||
|
||||
async def fake_agent():
|
||||
return {"final_response": "done"}, {}
|
||||
|
||||
async def run():
|
||||
from aiohttp import web
|
||||
|
||||
agent_task = asyncio.ensure_future(fake_agent())
|
||||
await asyncio.sleep(0) # Let agent complete
|
||||
|
||||
mock_response = AsyncMock(spec=web.StreamResponse)
|
||||
call_count = 0
|
||||
|
||||
async def write_side_effect(data):
|
||||
nonlocal call_count
|
||||
call_count += 1
|
||||
if call_count >= 2:
|
||||
raise ConnectionResetError("late disconnect")
|
||||
|
||||
mock_response.write = AsyncMock(side_effect=write_side_effect)
|
||||
mock_response.prepare = AsyncMock()
|
||||
|
||||
with patch("gateway.platforms.api_server.web.StreamResponse",
|
||||
return_value=mock_response):
|
||||
await adapter._write_sse_chat_completion(
|
||||
_make_request(), "cmpl-done", "gpt-4", 1234567890,
|
||||
stream_q, agent_task,
|
||||
)
|
||||
|
||||
# Task was already done — should not be cancelled
|
||||
assert agent_task.done()
|
||||
assert not agent_task.cancelled()
|
||||
|
||||
asyncio.run(run())
|
||||
|
||||
def test_agent_interrupt_called_on_disconnect(self):
|
||||
"""When the client disconnects, agent.interrupt() must be called
|
||||
so the agent thread stops making LLM API calls."""
|
||||
adapter = _make_adapter()
|
||||
|
||||
stream_q = queue.Queue()
|
||||
stream_q.put("hello ")
|
||||
|
||||
agent_done = asyncio.Event()
|
||||
|
||||
async def fake_agent():
|
||||
await agent_done.wait()
|
||||
return {"final_response": "done"}, {}
|
||||
|
||||
# Mock agent with an interrupt method
|
||||
mock_agent = MagicMock()
|
||||
mock_agent.interrupt = MagicMock()
|
||||
|
||||
async def run():
|
||||
from aiohttp import web
|
||||
|
||||
agent_task = asyncio.ensure_future(fake_agent())
|
||||
agent_ref = [mock_agent]
|
||||
|
||||
mock_response = AsyncMock(spec=web.StreamResponse)
|
||||
call_count = 0
|
||||
|
||||
async def write_side_effect(data):
|
||||
nonlocal call_count
|
||||
call_count += 1
|
||||
if call_count >= 2:
|
||||
raise ConnectionResetError("client disconnected")
|
||||
|
||||
mock_response.write = AsyncMock(side_effect=write_side_effect)
|
||||
mock_response.prepare = AsyncMock()
|
||||
|
||||
with patch("gateway.platforms.api_server.web.StreamResponse",
|
||||
return_value=mock_response):
|
||||
await adapter._write_sse_chat_completion(
|
||||
_make_request(), "cmpl-int", "gpt-4", 1234567890,
|
||||
stream_q, agent_task, agent_ref,
|
||||
)
|
||||
|
||||
# agent.interrupt() must have been called
|
||||
mock_agent.interrupt.assert_called_once_with("SSE client disconnected")
|
||||
# Clean up
|
||||
agent_done.set()
|
||||
|
||||
asyncio.run(run())
|
||||
|
||||
def test_agent_ref_none_still_cancels_task(self):
|
||||
"""When agent_ref is not provided (None), the task is still cancelled
|
||||
on disconnect — just without the interrupt() call."""
|
||||
adapter = _make_adapter()
|
||||
|
||||
stream_q = queue.Queue()
|
||||
|
||||
async def fake_agent():
|
||||
await asyncio.sleep(999)
|
||||
return {}, {}
|
||||
|
||||
async def run():
|
||||
from aiohttp import web
|
||||
|
||||
agent_task = asyncio.ensure_future(fake_agent())
|
||||
|
||||
mock_response = AsyncMock(spec=web.StreamResponse)
|
||||
mock_response.write = AsyncMock(side_effect=BrokenPipeError("gone"))
|
||||
mock_response.prepare = AsyncMock()
|
||||
|
||||
with patch("gateway.platforms.api_server.web.StreamResponse",
|
||||
return_value=mock_response):
|
||||
# No agent_ref passed — should still handle disconnect cleanly
|
||||
await adapter._write_sse_chat_completion(
|
||||
_make_request(), "cmpl-noref", "gpt-4", 1234567890,
|
||||
stream_q, agent_task,
|
||||
)
|
||||
|
||||
assert agent_task.cancelled() or agent_task.done()
|
||||
|
||||
asyncio.run(run())
|
||||
@@ -20,7 +20,7 @@ def _ensure_telegram_mock():
|
||||
telegram_mod.constants.ChatType.CHANNEL = "channel"
|
||||
telegram_mod.constants.ChatType.PRIVATE = "private"
|
||||
|
||||
for name in ("telegram", "telegram.ext", "telegram.constants", "telegram.request"):
|
||||
for name in ("telegram", "telegram.ext", "telegram.constants"):
|
||||
sys.modules.setdefault(name, telegram_mod)
|
||||
|
||||
|
||||
@@ -29,14 +29,6 @@ _ensure_telegram_mock()
|
||||
from gateway.platforms.telegram import TelegramAdapter # noqa: E402
|
||||
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def _no_auto_discovery(monkeypatch):
|
||||
"""Disable DoH auto-discovery so connect() uses the plain builder chain."""
|
||||
async def _noop():
|
||||
return []
|
||||
monkeypatch.setattr("gateway.platforms.telegram.discover_fallback_ips", _noop)
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_connect_rejects_same_host_token_lock(monkeypatch):
|
||||
adapter = TelegramAdapter(PlatformConfig(enabled=True, token="secret-token"))
|
||||
|
||||
@@ -45,7 +45,7 @@ def _ensure_telegram_mock():
|
||||
telegram_mod.constants.ChatType.CHANNEL = "channel"
|
||||
telegram_mod.constants.ChatType.PRIVATE = "private"
|
||||
|
||||
for name in ("telegram", "telegram.ext", "telegram.constants", "telegram.request"):
|
||||
for name in ("telegram", "telegram.ext", "telegram.constants"):
|
||||
sys.modules.setdefault(name, telegram_mod)
|
||||
|
||||
|
||||
|
||||
@@ -28,7 +28,7 @@ def _ensure_telegram_mock():
|
||||
mod.constants.ChatType.SUPERGROUP = "supergroup"
|
||||
mod.constants.ChatType.CHANNEL = "channel"
|
||||
mod.constants.ChatType.PRIVATE = "private"
|
||||
for name in ("telegram", "telegram.ext", "telegram.constants", "telegram.request"):
|
||||
for name in ("telegram", "telegram.ext", "telegram.constants"):
|
||||
sys.modules.setdefault(name, mod)
|
||||
|
||||
|
||||
|
||||
@@ -1,644 +0,0 @@
|
||||
"""Tests for gateway.platforms.telegram_network – fallback transport layer.
|
||||
|
||||
Background
|
||||
----------
|
||||
api.telegram.org resolves to an IP (e.g. 149.154.166.110) that is unreachable
|
||||
from some networks. The workaround: route TCP through a different IP in the
|
||||
same Telegram-owned 149.154.160.0/20 block (e.g. 149.154.167.220) while
|
||||
keeping TLS SNI and the Host header as api.telegram.org so Telegram's edge
|
||||
servers still accept the request. This is the programmatic equivalent of:
|
||||
|
||||
curl --resolve api.telegram.org:443:149.154.167.220 https://api.telegram.org/bot<token>/getMe
|
||||
|
||||
The TelegramFallbackTransport implements this: try the primary (DNS-resolved)
|
||||
path first, and on ConnectTimeout / ConnectError fall through to configured
|
||||
fallback IPs in order, then "stick" to whichever IP works.
|
||||
"""
|
||||
|
||||
import httpx
|
||||
import pytest
|
||||
|
||||
from gateway.platforms import telegram_network as tnet
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Helpers
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class FakeTransport(httpx.AsyncBaseTransport):
|
||||
"""Records calls and raises / returns based on a host→action mapping."""
|
||||
|
||||
def __init__(self, calls, behavior):
|
||||
self.calls = calls
|
||||
self.behavior = behavior
|
||||
self.closed = False
|
||||
|
||||
async def handle_async_request(self, request: httpx.Request) -> httpx.Response:
|
||||
self.calls.append(
|
||||
{
|
||||
"url_host": request.url.host,
|
||||
"host_header": request.headers.get("host"),
|
||||
"sni_hostname": request.extensions.get("sni_hostname"),
|
||||
"path": request.url.path,
|
||||
}
|
||||
)
|
||||
action = self.behavior.get(request.url.host, "ok")
|
||||
if action == "timeout":
|
||||
raise httpx.ConnectTimeout("timed out")
|
||||
if action == "connect_error":
|
||||
raise httpx.ConnectError("connect error")
|
||||
if isinstance(action, Exception):
|
||||
raise action
|
||||
return httpx.Response(200, request=request, text="ok")
|
||||
|
||||
async def aclose(self) -> None:
|
||||
self.closed = True
|
||||
|
||||
|
||||
def _fake_transport_factory(calls, behavior):
|
||||
"""Returns a factory that creates FakeTransport instances."""
|
||||
instances = []
|
||||
|
||||
def factory(**kwargs):
|
||||
t = FakeTransport(calls, behavior)
|
||||
instances.append(t)
|
||||
return t
|
||||
|
||||
factory.instances = instances
|
||||
return factory
|
||||
|
||||
|
||||
def _telegram_request(path="/botTOKEN/getMe"):
|
||||
return httpx.Request("GET", f"https://api.telegram.org{path}")
|
||||
|
||||
|
||||
# ═══════════════════════════════════════════════════════════════════════════
|
||||
# IP parsing & validation
|
||||
# ═══════════════════════════════════════════════════════════════════════════
|
||||
|
||||
class TestParseFallbackIpEnv:
|
||||
def test_filters_invalid_and_ipv6(self, caplog):
|
||||
ips = tnet.parse_fallback_ip_env("149.154.167.220, bad, 2001:67c:4e8:f004::9,149.154.167.220")
|
||||
assert ips == ["149.154.167.220", "149.154.167.220"]
|
||||
assert "Ignoring invalid Telegram fallback IP" in caplog.text
|
||||
assert "Ignoring non-IPv4 Telegram fallback IP" in caplog.text
|
||||
|
||||
def test_none_returns_empty(self):
|
||||
assert tnet.parse_fallback_ip_env(None) == []
|
||||
|
||||
def test_empty_string_returns_empty(self):
|
||||
assert tnet.parse_fallback_ip_env("") == []
|
||||
|
||||
def test_whitespace_only_returns_empty(self):
|
||||
assert tnet.parse_fallback_ip_env(" , , ") == []
|
||||
|
||||
def test_single_valid_ip(self):
|
||||
assert tnet.parse_fallback_ip_env("149.154.167.220") == ["149.154.167.220"]
|
||||
|
||||
def test_multiple_valid_ips(self):
|
||||
ips = tnet.parse_fallback_ip_env("149.154.167.220, 149.154.167.221")
|
||||
assert ips == ["149.154.167.220", "149.154.167.221"]
|
||||
|
||||
def test_rejects_leading_zeros(self, caplog):
|
||||
"""Leading zeros are ambiguous (octal?) so ipaddress rejects them."""
|
||||
ips = tnet.parse_fallback_ip_env("149.154.167.010")
|
||||
assert ips == []
|
||||
assert "Ignoring invalid" in caplog.text
|
||||
|
||||
|
||||
class TestNormalizeFallbackIps:
|
||||
def test_deduplication_happens_at_transport_level(self):
|
||||
"""_normalize does not dedup; TelegramFallbackTransport.__init__ does."""
|
||||
raw = ["149.154.167.220", "149.154.167.220"]
|
||||
assert tnet._normalize_fallback_ips(raw) == ["149.154.167.220", "149.154.167.220"]
|
||||
|
||||
def test_empty_strings_skipped(self):
|
||||
assert tnet._normalize_fallback_ips(["", " ", "149.154.167.220"]) == ["149.154.167.220"]
|
||||
|
||||
|
||||
# ═══════════════════════════════════════════════════════════════════════════
|
||||
# Request rewriting
|
||||
# ═══════════════════════════════════════════════════════════════════════════
|
||||
|
||||
class TestRewriteRequestForIp:
|
||||
def test_preserves_host_and_sni(self):
|
||||
request = _telegram_request()
|
||||
rewritten = tnet._rewrite_request_for_ip(request, "149.154.167.220")
|
||||
|
||||
assert rewritten.url.host == "149.154.167.220"
|
||||
assert rewritten.headers["host"] == "api.telegram.org"
|
||||
assert rewritten.extensions["sni_hostname"] == "api.telegram.org"
|
||||
assert rewritten.url.path == "/botTOKEN/getMe"
|
||||
|
||||
def test_preserves_method_and_path(self):
|
||||
request = httpx.Request("POST", "https://api.telegram.org/botTOKEN/sendMessage")
|
||||
rewritten = tnet._rewrite_request_for_ip(request, "149.154.167.220")
|
||||
|
||||
assert rewritten.method == "POST"
|
||||
assert rewritten.url.path == "/botTOKEN/sendMessage"
|
||||
|
||||
|
||||
# ═══════════════════════════════════════════════════════════════════════════
|
||||
# Fallback transport – core behavior
|
||||
# ═══════════════════════════════════════════════════════════════════════════
|
||||
|
||||
class TestFallbackTransport:
|
||||
"""Primary path fails → try fallback IPs → stick to whichever works."""
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_falls_back_on_connect_timeout_and_becomes_sticky(self, monkeypatch):
|
||||
calls = []
|
||||
behavior = {"api.telegram.org": "timeout", "149.154.167.220": "ok"}
|
||||
monkeypatch.setattr(tnet.httpx, "AsyncHTTPTransport", _fake_transport_factory(calls, behavior))
|
||||
|
||||
transport = tnet.TelegramFallbackTransport(["149.154.167.220"])
|
||||
resp = await transport.handle_async_request(_telegram_request())
|
||||
|
||||
assert resp.status_code == 200
|
||||
assert transport._sticky_ip == "149.154.167.220"
|
||||
# First attempt was primary (api.telegram.org), second was fallback
|
||||
assert calls[0]["url_host"] == "api.telegram.org"
|
||||
assert calls[1]["url_host"] == "149.154.167.220"
|
||||
assert calls[1]["host_header"] == "api.telegram.org"
|
||||
assert calls[1]["sni_hostname"] == "api.telegram.org"
|
||||
|
||||
# Second request goes straight to sticky IP
|
||||
calls.clear()
|
||||
resp2 = await transport.handle_async_request(_telegram_request())
|
||||
assert resp2.status_code == 200
|
||||
assert calls[0]["url_host"] == "149.154.167.220"
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_falls_back_on_connect_error(self, monkeypatch):
|
||||
calls = []
|
||||
behavior = {"api.telegram.org": "connect_error", "149.154.167.220": "ok"}
|
||||
monkeypatch.setattr(tnet.httpx, "AsyncHTTPTransport", _fake_transport_factory(calls, behavior))
|
||||
|
||||
transport = tnet.TelegramFallbackTransport(["149.154.167.220"])
|
||||
resp = await transport.handle_async_request(_telegram_request())
|
||||
|
||||
assert resp.status_code == 200
|
||||
assert transport._sticky_ip == "149.154.167.220"
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_does_not_fallback_on_non_connect_error(self, monkeypatch):
|
||||
"""Errors like ReadTimeout are not connection issues — don't retry."""
|
||||
calls = []
|
||||
behavior = {"api.telegram.org": httpx.ReadTimeout("read timeout"), "149.154.167.220": "ok"}
|
||||
monkeypatch.setattr(tnet.httpx, "AsyncHTTPTransport", _fake_transport_factory(calls, behavior))
|
||||
|
||||
transport = tnet.TelegramFallbackTransport(["149.154.167.220"])
|
||||
|
||||
with pytest.raises(httpx.ReadTimeout):
|
||||
await transport.handle_async_request(_telegram_request())
|
||||
|
||||
assert [c["url_host"] for c in calls] == ["api.telegram.org"]
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_all_ips_fail_raises_last_error(self, monkeypatch):
|
||||
calls = []
|
||||
behavior = {"api.telegram.org": "timeout", "149.154.167.220": "timeout"}
|
||||
monkeypatch.setattr(tnet.httpx, "AsyncHTTPTransport", _fake_transport_factory(calls, behavior))
|
||||
|
||||
transport = tnet.TelegramFallbackTransport(["149.154.167.220"])
|
||||
|
||||
with pytest.raises(httpx.ConnectTimeout):
|
||||
await transport.handle_async_request(_telegram_request())
|
||||
|
||||
assert [c["url_host"] for c in calls] == ["api.telegram.org", "149.154.167.220"]
|
||||
assert transport._sticky_ip is None
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_multiple_fallback_ips_tried_in_order(self, monkeypatch):
|
||||
calls = []
|
||||
behavior = {
|
||||
"api.telegram.org": "timeout",
|
||||
"149.154.167.220": "timeout",
|
||||
"149.154.167.221": "ok",
|
||||
}
|
||||
monkeypatch.setattr(tnet.httpx, "AsyncHTTPTransport", _fake_transport_factory(calls, behavior))
|
||||
|
||||
transport = tnet.TelegramFallbackTransport(["149.154.167.220", "149.154.167.221"])
|
||||
resp = await transport.handle_async_request(_telegram_request())
|
||||
|
||||
assert resp.status_code == 200
|
||||
assert transport._sticky_ip == "149.154.167.221"
|
||||
assert [c["url_host"] for c in calls] == [
|
||||
"api.telegram.org",
|
||||
"149.154.167.220",
|
||||
"149.154.167.221",
|
||||
]
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_sticky_ip_tried_first_but_falls_through_if_stale(self, monkeypatch):
|
||||
"""If the sticky IP stops working, the transport retries others."""
|
||||
calls = []
|
||||
behavior = {
|
||||
"api.telegram.org": "timeout",
|
||||
"149.154.167.220": "ok",
|
||||
"149.154.167.221": "ok",
|
||||
}
|
||||
monkeypatch.setattr(tnet.httpx, "AsyncHTTPTransport", _fake_transport_factory(calls, behavior))
|
||||
|
||||
transport = tnet.TelegramFallbackTransport(["149.154.167.220", "149.154.167.221"])
|
||||
|
||||
# First request: primary fails → .220 works → becomes sticky
|
||||
await transport.handle_async_request(_telegram_request())
|
||||
assert transport._sticky_ip == "149.154.167.220"
|
||||
|
||||
# Now .220 goes bad too
|
||||
calls.clear()
|
||||
behavior["149.154.167.220"] = "timeout"
|
||||
|
||||
resp = await transport.handle_async_request(_telegram_request())
|
||||
assert resp.status_code == 200
|
||||
# Tried sticky (.220) first, then fell through to .221
|
||||
assert [c["url_host"] for c in calls] == ["149.154.167.220", "149.154.167.221"]
|
||||
assert transport._sticky_ip == "149.154.167.221"
|
||||
|
||||
|
||||
class TestFallbackTransportPassthrough:
|
||||
"""Requests that don't need fallback behavior."""
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_non_telegram_host_bypasses_fallback(self, monkeypatch):
|
||||
calls = []
|
||||
behavior = {}
|
||||
monkeypatch.setattr(tnet.httpx, "AsyncHTTPTransport", _fake_transport_factory(calls, behavior))
|
||||
|
||||
transport = tnet.TelegramFallbackTransport(["149.154.167.220"])
|
||||
request = httpx.Request("GET", "https://example.com/path")
|
||||
resp = await transport.handle_async_request(request)
|
||||
|
||||
assert resp.status_code == 200
|
||||
assert calls[0]["url_host"] == "example.com"
|
||||
assert transport._sticky_ip is None
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_empty_fallback_list_uses_primary_only(self, monkeypatch):
|
||||
calls = []
|
||||
behavior = {}
|
||||
monkeypatch.setattr(tnet.httpx, "AsyncHTTPTransport", _fake_transport_factory(calls, behavior))
|
||||
|
||||
transport = tnet.TelegramFallbackTransport([])
|
||||
resp = await transport.handle_async_request(_telegram_request())
|
||||
|
||||
assert resp.status_code == 200
|
||||
assert calls[0]["url_host"] == "api.telegram.org"
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_primary_succeeds_no_fallback_needed(self, monkeypatch):
|
||||
calls = []
|
||||
behavior = {"api.telegram.org": "ok"}
|
||||
monkeypatch.setattr(tnet.httpx, "AsyncHTTPTransport", _fake_transport_factory(calls, behavior))
|
||||
|
||||
transport = tnet.TelegramFallbackTransport(["149.154.167.220"])
|
||||
resp = await transport.handle_async_request(_telegram_request())
|
||||
|
||||
assert resp.status_code == 200
|
||||
assert transport._sticky_ip is None
|
||||
assert len(calls) == 1
|
||||
|
||||
|
||||
class TestFallbackTransportInit:
|
||||
def test_deduplicates_fallback_ips(self, monkeypatch):
|
||||
monkeypatch.setattr(
|
||||
tnet.httpx, "AsyncHTTPTransport", lambda **kw: FakeTransport([], {})
|
||||
)
|
||||
transport = tnet.TelegramFallbackTransport(["149.154.167.220", "149.154.167.220"])
|
||||
assert transport._fallback_ips == ["149.154.167.220"]
|
||||
|
||||
def test_filters_invalid_ips_at_init(self, monkeypatch):
|
||||
monkeypatch.setattr(
|
||||
tnet.httpx, "AsyncHTTPTransport", lambda **kw: FakeTransport([], {})
|
||||
)
|
||||
transport = tnet.TelegramFallbackTransport(["149.154.167.220", "not-an-ip"])
|
||||
assert transport._fallback_ips == ["149.154.167.220"]
|
||||
|
||||
def test_uses_proxy_env_for_primary_and_fallback_transports(self, monkeypatch):
|
||||
seen_kwargs = []
|
||||
|
||||
def factory(**kwargs):
|
||||
seen_kwargs.append(kwargs.copy())
|
||||
return FakeTransport([], {})
|
||||
|
||||
for key in ("HTTPS_PROXY", "HTTP_PROXY", "ALL_PROXY", "https_proxy", "http_proxy", "all_proxy"):
|
||||
monkeypatch.delenv(key, raising=False)
|
||||
monkeypatch.setenv("HTTPS_PROXY", "http://proxy.example:8080")
|
||||
monkeypatch.setattr(tnet.httpx, "AsyncHTTPTransport", factory)
|
||||
|
||||
transport = tnet.TelegramFallbackTransport(["149.154.167.220"])
|
||||
|
||||
assert transport._fallback_ips == ["149.154.167.220"]
|
||||
assert len(seen_kwargs) == 2
|
||||
assert all(kwargs["proxy"] == "http://proxy.example:8080" for kwargs in seen_kwargs)
|
||||
|
||||
|
||||
class TestFallbackTransportClose:
|
||||
@pytest.mark.asyncio
|
||||
async def test_aclose_closes_all_transports(self, monkeypatch):
|
||||
factory = _fake_transport_factory([], {})
|
||||
monkeypatch.setattr(tnet.httpx, "AsyncHTTPTransport", factory)
|
||||
|
||||
transport = tnet.TelegramFallbackTransport(["149.154.167.220", "149.154.167.221"])
|
||||
await transport.aclose()
|
||||
|
||||
# 1 primary + 2 fallback transports
|
||||
assert len(factory.instances) == 3
|
||||
assert all(t.closed for t in factory.instances)
|
||||
|
||||
|
||||
# ═══════════════════════════════════════════════════════════════════════════
|
||||
# Config layer – TELEGRAM_FALLBACK_IPS env → config.extra
|
||||
# ═══════════════════════════════════════════════════════════════════════════
|
||||
|
||||
class TestConfigFallbackIps:
|
||||
def test_env_var_populates_config_extra(self, monkeypatch):
|
||||
from gateway.config import GatewayConfig, Platform, PlatformConfig, _apply_env_overrides
|
||||
|
||||
monkeypatch.setenv("TELEGRAM_FALLBACK_IPS", "149.154.167.220,149.154.167.221")
|
||||
config = GatewayConfig(platforms={Platform.TELEGRAM: PlatformConfig(enabled=True, token="tok")})
|
||||
_apply_env_overrides(config)
|
||||
|
||||
assert config.platforms[Platform.TELEGRAM].extra["fallback_ips"] == [
|
||||
"149.154.167.220", "149.154.167.221",
|
||||
]
|
||||
|
||||
def test_env_var_creates_platform_if_missing(self, monkeypatch):
|
||||
from gateway.config import GatewayConfig, Platform, _apply_env_overrides
|
||||
|
||||
monkeypatch.setenv("TELEGRAM_FALLBACK_IPS", "149.154.167.220")
|
||||
config = GatewayConfig(platforms={})
|
||||
_apply_env_overrides(config)
|
||||
|
||||
assert Platform.TELEGRAM in config.platforms
|
||||
assert config.platforms[Platform.TELEGRAM].extra["fallback_ips"] == ["149.154.167.220"]
|
||||
|
||||
def test_env_var_strips_whitespace(self, monkeypatch):
|
||||
from gateway.config import GatewayConfig, Platform, PlatformConfig, _apply_env_overrides
|
||||
|
||||
monkeypatch.setenv("TELEGRAM_FALLBACK_IPS", " 149.154.167.220 , 149.154.167.221 ")
|
||||
config = GatewayConfig(platforms={Platform.TELEGRAM: PlatformConfig(enabled=True, token="tok")})
|
||||
_apply_env_overrides(config)
|
||||
|
||||
assert config.platforms[Platform.TELEGRAM].extra["fallback_ips"] == [
|
||||
"149.154.167.220", "149.154.167.221",
|
||||
]
|
||||
|
||||
def test_empty_env_var_does_not_populate(self, monkeypatch):
|
||||
from gateway.config import GatewayConfig, Platform, PlatformConfig, _apply_env_overrides
|
||||
|
||||
monkeypatch.setenv("TELEGRAM_FALLBACK_IPS", "")
|
||||
config = GatewayConfig(platforms={Platform.TELEGRAM: PlatformConfig(enabled=True, token="tok")})
|
||||
_apply_env_overrides(config)
|
||||
|
||||
assert "fallback_ips" not in config.platforms[Platform.TELEGRAM].extra
|
||||
|
||||
|
||||
# ═══════════════════════════════════════════════════════════════════════════
|
||||
# Adapter layer – _fallback_ips() reads config correctly
|
||||
# ═══════════════════════════════════════════════════════════════════════════
|
||||
|
||||
class TestAdapterFallbackIps:
|
||||
def _make_adapter(self, extra=None):
|
||||
import sys
|
||||
from unittest.mock import MagicMock
|
||||
|
||||
# Ensure telegram mock is in place
|
||||
if "telegram" not in sys.modules or not hasattr(sys.modules["telegram"], "__file__"):
|
||||
mod = MagicMock()
|
||||
mod.ext.ContextTypes.DEFAULT_TYPE = type(None)
|
||||
mod.constants.ParseMode.MARKDOWN_V2 = "MarkdownV2"
|
||||
mod.constants.ChatType.GROUP = "group"
|
||||
mod.constants.ChatType.SUPERGROUP = "supergroup"
|
||||
mod.constants.ChatType.CHANNEL = "channel"
|
||||
mod.constants.ChatType.PRIVATE = "private"
|
||||
for name in ("telegram", "telegram.ext", "telegram.constants", "telegram.request"):
|
||||
sys.modules.setdefault(name, mod)
|
||||
|
||||
from gateway.config import PlatformConfig
|
||||
from gateway.platforms.telegram import TelegramAdapter
|
||||
|
||||
config = PlatformConfig(enabled=True, token="test-token")
|
||||
if extra:
|
||||
config.extra.update(extra)
|
||||
return TelegramAdapter(config)
|
||||
|
||||
def test_list_in_extra(self):
|
||||
adapter = self._make_adapter(extra={"fallback_ips": ["149.154.167.220"]})
|
||||
assert adapter._fallback_ips() == ["149.154.167.220"]
|
||||
|
||||
def test_csv_string_in_extra(self):
|
||||
adapter = self._make_adapter(extra={"fallback_ips": "149.154.167.220,149.154.167.221"})
|
||||
assert adapter._fallback_ips() == ["149.154.167.220", "149.154.167.221"]
|
||||
|
||||
def test_empty_extra(self):
|
||||
adapter = self._make_adapter()
|
||||
assert adapter._fallback_ips() == []
|
||||
|
||||
def test_no_extra_attr(self):
|
||||
adapter = self._make_adapter()
|
||||
adapter.config.extra = None
|
||||
assert adapter._fallback_ips() == []
|
||||
|
||||
def test_invalid_ips_filtered(self):
|
||||
adapter = self._make_adapter(extra={"fallback_ips": ["149.154.167.220", "not-valid"]})
|
||||
assert adapter._fallback_ips() == ["149.154.167.220"]
|
||||
|
||||
|
||||
# ═══════════════════════════════════════════════════════════════════════════
|
||||
# DoH auto-discovery
|
||||
# ═══════════════════════════════════════════════════════════════════════════
|
||||
|
||||
def _doh_answer(*ips: str) -> dict:
|
||||
"""Build a minimal DoH JSON response with A records."""
|
||||
return {"Answer": [{"type": 1, "data": ip} for ip in ips]}
|
||||
|
||||
|
||||
class FakeDoHClient:
|
||||
"""Mock httpx.AsyncClient for DoH queries."""
|
||||
|
||||
def __init__(self, responses: dict):
|
||||
# responses: URL prefix → (status, json_body) | Exception
|
||||
self._responses = responses
|
||||
self.requests_made: list[dict] = []
|
||||
|
||||
@staticmethod
|
||||
def _make_response(status, body, url):
|
||||
"""Build an httpx.Response with a request attached (needed for raise_for_status)."""
|
||||
request = httpx.Request("GET", url)
|
||||
return httpx.Response(status, json=body, request=request)
|
||||
|
||||
async def get(self, url, *, params=None, headers=None, **kwargs):
|
||||
self.requests_made.append({"url": url, "params": params, "headers": headers})
|
||||
for prefix, action in self._responses.items():
|
||||
if url.startswith(prefix):
|
||||
if isinstance(action, Exception):
|
||||
raise action
|
||||
status, body = action
|
||||
return self._make_response(status, body, url)
|
||||
return self._make_response(200, {}, url)
|
||||
|
||||
async def __aenter__(self):
|
||||
return self
|
||||
|
||||
async def __aexit__(self, *args):
|
||||
pass
|
||||
|
||||
|
||||
class TestDiscoverFallbackIps:
|
||||
"""Tests for discover_fallback_ips() — DoH-based auto-discovery."""
|
||||
|
||||
def _patch_doh(self, monkeypatch, responses, system_dns_ips=None):
|
||||
"""Wire up fake DoH client and system DNS."""
|
||||
client = FakeDoHClient(responses)
|
||||
monkeypatch.setattr(tnet.httpx, "AsyncClient", lambda **kw: client)
|
||||
|
||||
if system_dns_ips is not None:
|
||||
addrs = [(None, None, None, None, (ip, 443)) for ip in system_dns_ips]
|
||||
monkeypatch.setattr(tnet.socket, "getaddrinfo", lambda *a, **kw: addrs)
|
||||
else:
|
||||
def _fail(*a, **kw):
|
||||
raise OSError("dns failed")
|
||||
monkeypatch.setattr(tnet.socket, "getaddrinfo", _fail)
|
||||
return client
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_google_and_cloudflare_ips_collected(self, monkeypatch):
|
||||
self._patch_doh(monkeypatch, {
|
||||
"https://dns.google": (200, _doh_answer("149.154.167.220")),
|
||||
"https://cloudflare-dns.com": (200, _doh_answer("149.154.167.221")),
|
||||
}, system_dns_ips=["149.154.166.110"])
|
||||
|
||||
ips = await tnet.discover_fallback_ips()
|
||||
assert "149.154.167.220" in ips
|
||||
assert "149.154.167.221" in ips
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_system_dns_ip_excluded(self, monkeypatch):
|
||||
"""The IP from system DNS is the one that doesn't work — exclude it."""
|
||||
self._patch_doh(monkeypatch, {
|
||||
"https://dns.google": (200, _doh_answer("149.154.166.110", "149.154.167.220")),
|
||||
"https://cloudflare-dns.com": (200, _doh_answer("149.154.166.110")),
|
||||
}, system_dns_ips=["149.154.166.110"])
|
||||
|
||||
ips = await tnet.discover_fallback_ips()
|
||||
assert ips == ["149.154.167.220"]
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_doh_results_deduplicated(self, monkeypatch):
|
||||
self._patch_doh(monkeypatch, {
|
||||
"https://dns.google": (200, _doh_answer("149.154.167.220")),
|
||||
"https://cloudflare-dns.com": (200, _doh_answer("149.154.167.220")),
|
||||
}, system_dns_ips=["149.154.166.110"])
|
||||
|
||||
ips = await tnet.discover_fallback_ips()
|
||||
assert ips == ["149.154.167.220"]
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_doh_timeout_falls_back_to_seed(self, monkeypatch):
|
||||
self._patch_doh(monkeypatch, {
|
||||
"https://dns.google": httpx.TimeoutException("timeout"),
|
||||
"https://cloudflare-dns.com": httpx.TimeoutException("timeout"),
|
||||
}, system_dns_ips=["149.154.166.110"])
|
||||
|
||||
ips = await tnet.discover_fallback_ips()
|
||||
assert ips == tnet._SEED_FALLBACK_IPS
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_doh_connect_error_falls_back_to_seed(self, monkeypatch):
|
||||
self._patch_doh(monkeypatch, {
|
||||
"https://dns.google": httpx.ConnectError("refused"),
|
||||
"https://cloudflare-dns.com": httpx.ConnectError("refused"),
|
||||
}, system_dns_ips=["149.154.166.110"])
|
||||
|
||||
ips = await tnet.discover_fallback_ips()
|
||||
assert ips == tnet._SEED_FALLBACK_IPS
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_doh_malformed_json_falls_back_to_seed(self, monkeypatch):
|
||||
self._patch_doh(monkeypatch, {
|
||||
"https://dns.google": (200, {"Status": 0}), # no Answer key
|
||||
"https://cloudflare-dns.com": (200, {"garbage": True}),
|
||||
}, system_dns_ips=["149.154.166.110"])
|
||||
|
||||
ips = await tnet.discover_fallback_ips()
|
||||
assert ips == tnet._SEED_FALLBACK_IPS
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_one_provider_fails_other_succeeds(self, monkeypatch):
|
||||
self._patch_doh(monkeypatch, {
|
||||
"https://dns.google": httpx.TimeoutException("timeout"),
|
||||
"https://cloudflare-dns.com": (200, _doh_answer("149.154.167.220")),
|
||||
}, system_dns_ips=["149.154.166.110"])
|
||||
|
||||
ips = await tnet.discover_fallback_ips()
|
||||
assert ips == ["149.154.167.220"]
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_system_dns_failure_keeps_all_doh_ips(self, monkeypatch):
|
||||
"""If system DNS fails, nothing gets excluded — all DoH IPs kept."""
|
||||
self._patch_doh(monkeypatch, {
|
||||
"https://dns.google": (200, _doh_answer("149.154.166.110", "149.154.167.220")),
|
||||
"https://cloudflare-dns.com": (200, _doh_answer()),
|
||||
}, system_dns_ips=None) # triggers OSError
|
||||
|
||||
ips = await tnet.discover_fallback_ips()
|
||||
assert "149.154.166.110" in ips
|
||||
assert "149.154.167.220" in ips
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_all_doh_ips_same_as_system_dns_uses_seed(self, monkeypatch):
|
||||
"""DoH returns only the same blocked IP — seed list is the fallback."""
|
||||
self._patch_doh(monkeypatch, {
|
||||
"https://dns.google": (200, _doh_answer("149.154.166.110")),
|
||||
"https://cloudflare-dns.com": (200, _doh_answer("149.154.166.110")),
|
||||
}, system_dns_ips=["149.154.166.110"])
|
||||
|
||||
ips = await tnet.discover_fallback_ips()
|
||||
assert ips == tnet._SEED_FALLBACK_IPS
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_cloudflare_gets_accept_header(self, monkeypatch):
|
||||
client = self._patch_doh(monkeypatch, {
|
||||
"https://dns.google": (200, _doh_answer("149.154.167.220")),
|
||||
"https://cloudflare-dns.com": (200, _doh_answer("149.154.167.221")),
|
||||
}, system_dns_ips=["149.154.166.110"])
|
||||
|
||||
await tnet.discover_fallback_ips()
|
||||
|
||||
cf_reqs = [r for r in client.requests_made if "cloudflare" in r["url"]]
|
||||
assert cf_reqs
|
||||
assert cf_reqs[0]["headers"]["Accept"] == "application/dns-json"
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_non_a_records_ignored(self, monkeypatch):
|
||||
"""AAAA records (type 28) and CNAME (type 5) should be skipped."""
|
||||
answer = {
|
||||
"Answer": [
|
||||
{"type": 5, "data": "telegram.org"}, # CNAME
|
||||
{"type": 28, "data": "2001:67c:4e8:f004::9"}, # AAAA
|
||||
{"type": 1, "data": "149.154.167.220"}, # A ✓
|
||||
]
|
||||
}
|
||||
self._patch_doh(monkeypatch, {
|
||||
"https://dns.google": (200, answer),
|
||||
"https://cloudflare-dns.com": (200, _doh_answer()),
|
||||
}, system_dns_ips=["149.154.166.110"])
|
||||
|
||||
ips = await tnet.discover_fallback_ips()
|
||||
assert ips == ["149.154.167.220"]
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_invalid_ip_in_doh_response_skipped(self, monkeypatch):
|
||||
answer = {"Answer": [
|
||||
{"type": 1, "data": "not-an-ip"},
|
||||
{"type": 1, "data": "149.154.167.220"},
|
||||
]}
|
||||
self._patch_doh(monkeypatch, {
|
||||
"https://dns.google": (200, answer),
|
||||
"https://cloudflare-dns.com": (200, _doh_answer()),
|
||||
}, system_dns_ips=["149.154.166.110"])
|
||||
|
||||
ips = await tnet.discover_fallback_ips()
|
||||
assert ips == ["149.154.167.220"]
|
||||
@@ -27,7 +27,7 @@ def _ensure_telegram_mock():
|
||||
telegram_mod.constants.ChatType.CHANNEL = "channel"
|
||||
telegram_mod.constants.ChatType.PRIVATE = "private"
|
||||
|
||||
for name in ("telegram", "telegram.ext", "telegram.constants", "telegram.request"):
|
||||
for name in ("telegram", "telegram.ext", "telegram.constants"):
|
||||
sys.modules.setdefault(name, telegram_mod)
|
||||
|
||||
|
||||
@@ -36,14 +36,6 @@ _ensure_telegram_mock()
|
||||
from gateway.platforms.telegram import TelegramAdapter # noqa: E402
|
||||
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def _no_auto_discovery(monkeypatch):
|
||||
"""Disable DoH auto-discovery so connect() uses the plain builder chain."""
|
||||
async def _noop():
|
||||
return []
|
||||
monkeypatch.setattr("gateway.platforms.telegram.discover_fallback_ips", _noop)
|
||||
|
||||
|
||||
def _make_adapter() -> TelegramAdapter:
|
||||
return TelegramAdapter(PlatformConfig(enabled=True, token="test-token"))
|
||||
|
||||
|
||||
@@ -25,7 +25,7 @@ def _ensure_telegram_mock():
|
||||
mod.constants.ChatType.SUPERGROUP = "supergroup"
|
||||
mod.constants.ChatType.CHANNEL = "channel"
|
||||
mod.constants.ChatType.PRIVATE = "private"
|
||||
for name in ("telegram", "telegram.ext", "telegram.constants", "telegram.request"):
|
||||
for name in ("telegram", "telegram.ext", "telegram.constants"):
|
||||
sys.modules.setdefault(name, mod)
|
||||
|
||||
|
||||
|
||||
@@ -1,199 +0,0 @@
|
||||
"""Tests for Telegram send() thread_id fallback.
|
||||
|
||||
When message_thread_id points to a non-existent thread, Telegram returns
|
||||
BadRequest('Message thread not found'). Since BadRequest is a subclass of
|
||||
NetworkError in python-telegram-bot, the old retry loop treated this as a
|
||||
transient error and retried 3 times before silently failing — killing all
|
||||
tool progress messages, streaming responses, and typing indicators.
|
||||
|
||||
The fix detects "thread not found" BadRequest errors and retries the send
|
||||
WITHOUT message_thread_id so the message still reaches the chat.
|
||||
"""
|
||||
|
||||
import sys
|
||||
import types
|
||||
from types import SimpleNamespace
|
||||
|
||||
import pytest
|
||||
|
||||
from gateway.config import PlatformConfig, Platform
|
||||
from gateway.platforms.base import SendResult
|
||||
|
||||
|
||||
# ── Fake telegram.error hierarchy ──────────────────────────────────────
|
||||
# Mirrors the real python-telegram-bot hierarchy:
|
||||
# BadRequest → NetworkError → TelegramError → Exception
|
||||
|
||||
|
||||
class FakeNetworkError(Exception):
|
||||
pass
|
||||
|
||||
|
||||
class FakeBadRequest(FakeNetworkError):
|
||||
pass
|
||||
|
||||
|
||||
# Build a fake telegram module tree so the adapter's internal imports work
|
||||
_fake_telegram = types.ModuleType("telegram")
|
||||
_fake_telegram_error = types.ModuleType("telegram.error")
|
||||
_fake_telegram_error.NetworkError = FakeNetworkError
|
||||
_fake_telegram_error.BadRequest = FakeBadRequest
|
||||
_fake_telegram.error = _fake_telegram_error
|
||||
_fake_telegram_constants = types.ModuleType("telegram.constants")
|
||||
_fake_telegram_constants.ParseMode = SimpleNamespace(MARKDOWN_V2="MarkdownV2")
|
||||
_fake_telegram.constants = _fake_telegram_constants
|
||||
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def _inject_fake_telegram(monkeypatch):
|
||||
"""Inject fake telegram modules so the adapter can import from them."""
|
||||
monkeypatch.setitem(sys.modules, "telegram", _fake_telegram)
|
||||
monkeypatch.setitem(sys.modules, "telegram.error", _fake_telegram_error)
|
||||
monkeypatch.setitem(sys.modules, "telegram.constants", _fake_telegram_constants)
|
||||
|
||||
|
||||
def _make_adapter():
|
||||
from gateway.platforms.telegram import TelegramAdapter
|
||||
|
||||
config = PlatformConfig(enabled=True, token="fake-token")
|
||||
adapter = object.__new__(TelegramAdapter)
|
||||
adapter._config = config
|
||||
adapter._platform = Platform.TELEGRAM
|
||||
adapter._connected = True
|
||||
adapter._dm_topics = {}
|
||||
adapter._dm_topics_config = []
|
||||
adapter._reply_to_mode = "first"
|
||||
adapter._fallback_ips = []
|
||||
adapter._polling_conflict_count = 0
|
||||
adapter._polling_network_error_count = 0
|
||||
adapter._polling_error_callback_ref = None
|
||||
adapter.platform = Platform.TELEGRAM
|
||||
return adapter
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_send_retries_without_thread_on_thread_not_found():
|
||||
"""When message_thread_id causes 'thread not found', retry without it."""
|
||||
adapter = _make_adapter()
|
||||
|
||||
call_log = []
|
||||
|
||||
async def mock_send_message(**kwargs):
|
||||
call_log.append(dict(kwargs))
|
||||
tid = kwargs.get("message_thread_id")
|
||||
if tid is not None:
|
||||
raise FakeBadRequest("Message thread not found")
|
||||
return SimpleNamespace(message_id=42)
|
||||
|
||||
adapter._bot = SimpleNamespace(send_message=mock_send_message)
|
||||
|
||||
result = await adapter.send(
|
||||
chat_id="123",
|
||||
content="test message",
|
||||
metadata={"thread_id": "99999"},
|
||||
)
|
||||
|
||||
assert result.success is True
|
||||
assert result.message_id == "42"
|
||||
# First call has thread_id, second call retries without
|
||||
assert len(call_log) == 2
|
||||
assert call_log[0]["message_thread_id"] == 99999
|
||||
assert call_log[1]["message_thread_id"] is None
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_send_raises_on_other_bad_request():
|
||||
"""Non-thread BadRequest errors should NOT be retried — they fail immediately."""
|
||||
adapter = _make_adapter()
|
||||
|
||||
async def mock_send_message(**kwargs):
|
||||
raise FakeBadRequest("Chat not found")
|
||||
|
||||
adapter._bot = SimpleNamespace(send_message=mock_send_message)
|
||||
|
||||
result = await adapter.send(
|
||||
chat_id="123",
|
||||
content="test message",
|
||||
metadata={"thread_id": "99999"},
|
||||
)
|
||||
|
||||
assert result.success is False
|
||||
assert "Chat not found" in result.error
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_send_without_thread_id_unaffected():
|
||||
"""Normal sends without thread_id should work as before."""
|
||||
adapter = _make_adapter()
|
||||
|
||||
call_log = []
|
||||
|
||||
async def mock_send_message(**kwargs):
|
||||
call_log.append(dict(kwargs))
|
||||
return SimpleNamespace(message_id=100)
|
||||
|
||||
adapter._bot = SimpleNamespace(send_message=mock_send_message)
|
||||
|
||||
result = await adapter.send(
|
||||
chat_id="123",
|
||||
content="test message",
|
||||
)
|
||||
|
||||
assert result.success is True
|
||||
assert len(call_log) == 1
|
||||
assert call_log[0]["message_thread_id"] is None
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_send_retries_network_errors_normally():
|
||||
"""Real transient network errors (not BadRequest) should still be retried."""
|
||||
adapter = _make_adapter()
|
||||
|
||||
attempt = [0]
|
||||
|
||||
async def mock_send_message(**kwargs):
|
||||
attempt[0] += 1
|
||||
if attempt[0] < 3:
|
||||
raise FakeNetworkError("Connection reset")
|
||||
return SimpleNamespace(message_id=200)
|
||||
|
||||
adapter._bot = SimpleNamespace(send_message=mock_send_message)
|
||||
|
||||
result = await adapter.send(
|
||||
chat_id="123",
|
||||
content="test message",
|
||||
)
|
||||
|
||||
assert result.success is True
|
||||
assert attempt[0] == 3 # Two retries then success
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_thread_fallback_only_fires_once():
|
||||
"""After clearing thread_id, subsequent chunks should also use None."""
|
||||
adapter = _make_adapter()
|
||||
|
||||
call_log = []
|
||||
|
||||
async def mock_send_message(**kwargs):
|
||||
call_log.append(dict(kwargs))
|
||||
tid = kwargs.get("message_thread_id")
|
||||
if tid is not None:
|
||||
raise FakeBadRequest("Message thread not found")
|
||||
return SimpleNamespace(message_id=42)
|
||||
|
||||
adapter._bot = SimpleNamespace(send_message=mock_send_message)
|
||||
|
||||
# Send a long message that gets split into chunks
|
||||
long_msg = "A" * 5000 # Exceeds Telegram's 4096 limit
|
||||
result = await adapter.send(
|
||||
chat_id="123",
|
||||
content=long_msg,
|
||||
metadata={"thread_id": "99999"},
|
||||
)
|
||||
|
||||
assert result.success is True
|
||||
# First chunk: attempt with thread → fail → retry without → succeed
|
||||
# Second chunk: should use thread_id=None directly (effective_thread_id
|
||||
# was cleared per-chunk but the metadata doesn't change between chunks)
|
||||
# The key point: the message was delivered despite the invalid thread
|
||||
@@ -1,87 +0,0 @@
|
||||
"""Tests for webhook adapter dynamic route loading."""
|
||||
|
||||
import json
|
||||
import os
|
||||
import pytest
|
||||
from pathlib import Path
|
||||
|
||||
from gateway.config import PlatformConfig
|
||||
from gateway.platforms.webhook import WebhookAdapter, _DYNAMIC_ROUTES_FILENAME
|
||||
|
||||
|
||||
def _make_adapter(routes=None, extra=None):
|
||||
_extra = extra or {}
|
||||
if routes:
|
||||
_extra["routes"] = routes
|
||||
_extra.setdefault("secret", "test-global-secret")
|
||||
config = PlatformConfig(enabled=True, extra=_extra)
|
||||
return WebhookAdapter(config)
|
||||
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def _isolate(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
|
||||
|
||||
|
||||
class TestDynamicRouteLoading:
|
||||
def test_no_dynamic_file(self):
|
||||
adapter = _make_adapter(routes={"static": {"secret": "s"}})
|
||||
adapter._reload_dynamic_routes()
|
||||
assert "static" in adapter._routes
|
||||
assert len(adapter._dynamic_routes) == 0
|
||||
|
||||
def test_loads_dynamic_routes(self, tmp_path):
|
||||
subs = {"my-hook": {"secret": "dynamic-secret", "prompt": "test", "events": []}}
|
||||
(tmp_path / _DYNAMIC_ROUTES_FILENAME).write_text(json.dumps(subs))
|
||||
|
||||
adapter = _make_adapter(routes={"static": {"secret": "s"}})
|
||||
adapter._reload_dynamic_routes()
|
||||
assert "my-hook" in adapter._routes
|
||||
assert "static" in adapter._routes
|
||||
|
||||
def test_static_takes_precedence(self, tmp_path):
|
||||
(tmp_path / _DYNAMIC_ROUTES_FILENAME).write_text(
|
||||
json.dumps({"conflict": {"secret": "dynamic", "prompt": "dyn"}})
|
||||
)
|
||||
adapter = _make_adapter(routes={"conflict": {"secret": "static", "prompt": "stat"}})
|
||||
adapter._reload_dynamic_routes()
|
||||
assert adapter._routes["conflict"]["secret"] == "static"
|
||||
|
||||
def test_mtime_gated(self, tmp_path):
|
||||
import time
|
||||
path = tmp_path / _DYNAMIC_ROUTES_FILENAME
|
||||
path.write_text(json.dumps({"v1": {"secret": "s"}}))
|
||||
|
||||
adapter = _make_adapter()
|
||||
adapter._reload_dynamic_routes()
|
||||
assert "v1" in adapter._dynamic_routes
|
||||
|
||||
# Same mtime — no reload
|
||||
adapter._dynamic_routes["injected"] = True
|
||||
adapter._reload_dynamic_routes()
|
||||
assert "injected" in adapter._dynamic_routes
|
||||
|
||||
# New write — reloads
|
||||
time.sleep(0.05)
|
||||
path.write_text(json.dumps({"v2": {"secret": "s"}}))
|
||||
adapter._reload_dynamic_routes()
|
||||
assert "v2" in adapter._dynamic_routes
|
||||
assert "v1" not in adapter._dynamic_routes
|
||||
|
||||
def test_file_removal_clears(self, tmp_path):
|
||||
path = tmp_path / _DYNAMIC_ROUTES_FILENAME
|
||||
path.write_text(json.dumps({"temp": {"secret": "s"}}))
|
||||
adapter = _make_adapter()
|
||||
adapter._reload_dynamic_routes()
|
||||
assert "temp" in adapter._dynamic_routes
|
||||
|
||||
path.unlink()
|
||||
adapter._reload_dynamic_routes()
|
||||
assert len(adapter._dynamic_routes) == 0
|
||||
|
||||
def test_corrupted_file(self, tmp_path):
|
||||
(tmp_path / _DYNAMIC_ROUTES_FILENAME).write_text("not json")
|
||||
adapter = _make_adapter(routes={"static": {"secret": "s"}})
|
||||
adapter._reload_dynamic_routes()
|
||||
assert "static" in adapter._routes
|
||||
assert len(adapter._dynamic_routes) == 0
|
||||
@@ -105,24 +105,3 @@ class TestCmdUpdateBranchFallback:
|
||||
commands = [" ".join(str(a) for a in c.args[0]) for c in mock_run.call_args_list]
|
||||
pull_cmds = [c for c in commands if "pull" in c]
|
||||
assert len(pull_cmds) == 0
|
||||
|
||||
def test_update_non_interactive_skips_migration_prompt(self, mock_args, capsys):
|
||||
"""When stdin/stdout aren't TTYs, config migration prompt is skipped."""
|
||||
with patch("shutil.which", return_value=None), patch(
|
||||
"subprocess.run"
|
||||
) as mock_run, patch("builtins.input") as mock_input, patch(
|
||||
"hermes_cli.config.get_missing_env_vars", return_value=["MISSING_KEY"]
|
||||
), patch("hermes_cli.config.get_missing_config_fields", return_value=[]), patch(
|
||||
"hermes_cli.config.check_config_version", return_value=(1, 2)
|
||||
), patch("hermes_cli.main.sys") as mock_sys:
|
||||
mock_sys.stdin.isatty.return_value = False
|
||||
mock_sys.stdout.isatty.return_value = False
|
||||
mock_run.side_effect = _make_run_side_effect(
|
||||
branch="main", verify_ok=True, commit_count="1"
|
||||
)
|
||||
|
||||
cmd_update(mock_args)
|
||||
|
||||
mock_input.assert_not_called()
|
||||
captured = capsys.readouterr()
|
||||
assert "Non-interactive session" in captured.out
|
||||
|
||||
@@ -1,7 +1,6 @@
|
||||
"""Tests for gateway service management helpers."""
|
||||
|
||||
import os
|
||||
from pathlib import Path
|
||||
from types import SimpleNamespace
|
||||
|
||||
import hermes_cli.gateway as gateway_cli
|
||||
@@ -153,13 +152,12 @@ class TestLaunchdServiceRecovery:
|
||||
def test_launchd_start_reloads_unloaded_job_and_retries(self, tmp_path, monkeypatch):
|
||||
plist_path = tmp_path / "ai.hermes.gateway.plist"
|
||||
plist_path.write_text(gateway_cli.generate_launchd_plist(), encoding="utf-8")
|
||||
label = gateway_cli.get_launchd_label()
|
||||
|
||||
calls = []
|
||||
|
||||
def fake_run(cmd, check=False, **kwargs):
|
||||
calls.append(cmd)
|
||||
if cmd == ["launchctl", "start", label] and calls.count(cmd) == 1:
|
||||
if cmd == ["launchctl", "start", "ai.hermes.gateway"] and calls.count(cmd) == 1:
|
||||
raise gateway_cli.subprocess.CalledProcessError(3, cmd, stderr="Could not find service")
|
||||
return SimpleNamespace(returncode=0, stdout="", stderr="")
|
||||
|
||||
@@ -169,9 +167,9 @@ class TestLaunchdServiceRecovery:
|
||||
gateway_cli.launchd_start()
|
||||
|
||||
assert calls == [
|
||||
["launchctl", "start", label],
|
||||
["launchctl", "start", "ai.hermes.gateway"],
|
||||
["launchctl", "load", str(plist_path)],
|
||||
["launchctl", "start", label],
|
||||
["launchctl", "start", "ai.hermes.gateway"],
|
||||
]
|
||||
|
||||
def test_launchd_status_reports_local_stale_plist_when_unloaded(self, tmp_path, monkeypatch, capsys):
|
||||
@@ -356,20 +354,6 @@ class TestGeneratedUnitUsesDetectedVenv:
|
||||
assert "/venv/" not in unit or "/.venv/" in unit
|
||||
|
||||
|
||||
class TestGeneratedUnitIncludesLocalBin:
|
||||
"""~/.local/bin must be in PATH so uvx/pipx tools are discoverable."""
|
||||
|
||||
def test_user_unit_includes_local_bin_in_path(self):
|
||||
unit = gateway_cli.generate_systemd_unit(system=False)
|
||||
home = str(Path.home())
|
||||
assert f"{home}/.local/bin" in unit
|
||||
|
||||
def test_system_unit_includes_local_bin_in_path(self):
|
||||
unit = gateway_cli.generate_systemd_unit(system=True)
|
||||
# System unit uses the resolved home dir from _system_service_identity
|
||||
assert "/.local/bin" in unit
|
||||
|
||||
|
||||
class TestEnsureUserSystemdEnv:
|
||||
"""Tests for _ensure_user_systemd_env() D-Bus session bus auto-detection."""
|
||||
|
||||
|
||||
@@ -1,13 +1,10 @@
|
||||
"""
|
||||
Tests for skip_confirm and invalidate_cache behavior in /skills install
|
||||
and /skills uninstall slash commands.
|
||||
Tests for skip_confirm behavior in /skills install and /skills uninstall.
|
||||
|
||||
Slash commands always skip confirmation (input() hangs in TUI).
|
||||
Cache invalidation is deferred by default; --now opts into immediate
|
||||
invalidation (at the cost of breaking prompt cache mid-session).
|
||||
Verifies that --yes / -y bypasses the interactive confirmation prompt
|
||||
that hangs inside prompt_toolkit's TUI.
|
||||
|
||||
Based on PR #1595 by 333Alden333 (salvaged).
|
||||
Updated for PR #3586 (cache-aware install/uninstall).
|
||||
"""
|
||||
|
||||
from unittest.mock import patch, MagicMock
|
||||
@@ -35,43 +32,23 @@ class TestHandleSkillsSlashInstallFlags:
|
||||
_, kwargs = mock_install.call_args
|
||||
assert kwargs.get("skip_confirm") is True
|
||||
|
||||
def test_force_flag_sets_force(self):
|
||||
def test_force_flag_sets_force_not_skip(self):
|
||||
from hermes_cli.skills_hub import handle_skills_slash
|
||||
with patch("hermes_cli.skills_hub.do_install") as mock_install:
|
||||
handle_skills_slash("/skills install test/skill --force")
|
||||
mock_install.assert_called_once()
|
||||
_, kwargs = mock_install.call_args
|
||||
assert kwargs.get("force") is True
|
||||
# Slash commands always skip confirmation (input() hangs in TUI)
|
||||
assert kwargs.get("skip_confirm") is True
|
||||
assert kwargs.get("skip_confirm") is False
|
||||
|
||||
def test_no_flags_still_skips_confirm(self):
|
||||
"""Slash commands always skip confirmation — input() hangs in TUI."""
|
||||
def test_no_flags(self):
|
||||
from hermes_cli.skills_hub import handle_skills_slash
|
||||
with patch("hermes_cli.skills_hub.do_install") as mock_install:
|
||||
handle_skills_slash("/skills install test/skill")
|
||||
mock_install.assert_called_once()
|
||||
_, kwargs = mock_install.call_args
|
||||
assert kwargs.get("force") is False
|
||||
assert kwargs.get("skip_confirm") is True
|
||||
|
||||
def test_default_defers_cache_invalidation(self):
|
||||
"""Without --now, cache invalidation is deferred to next session."""
|
||||
from hermes_cli.skills_hub import handle_skills_slash
|
||||
with patch("hermes_cli.skills_hub.do_install") as mock_install:
|
||||
handle_skills_slash("/skills install test/skill")
|
||||
mock_install.assert_called_once()
|
||||
_, kwargs = mock_install.call_args
|
||||
assert kwargs.get("invalidate_cache") is False
|
||||
|
||||
def test_now_flag_invalidates_cache(self):
|
||||
"""--now opts into immediate cache invalidation."""
|
||||
from hermes_cli.skills_hub import handle_skills_slash
|
||||
with patch("hermes_cli.skills_hub.do_install") as mock_install:
|
||||
handle_skills_slash("/skills install test/skill --now")
|
||||
mock_install.assert_called_once()
|
||||
_, kwargs = mock_install.call_args
|
||||
assert kwargs.get("invalidate_cache") is True
|
||||
assert kwargs.get("skip_confirm") is False
|
||||
|
||||
|
||||
class TestHandleSkillsSlashUninstallFlags:
|
||||
@@ -93,32 +70,13 @@ class TestHandleSkillsSlashUninstallFlags:
|
||||
_, kwargs = mock_uninstall.call_args
|
||||
assert kwargs.get("skip_confirm") is True
|
||||
|
||||
def test_no_flags_still_skips_confirm(self):
|
||||
"""Slash commands always skip confirmation — input() hangs in TUI."""
|
||||
def test_no_flags(self):
|
||||
from hermes_cli.skills_hub import handle_skills_slash
|
||||
with patch("hermes_cli.skills_hub.do_uninstall") as mock_uninstall:
|
||||
handle_skills_slash("/skills uninstall test-skill")
|
||||
mock_uninstall.assert_called_once()
|
||||
_, kwargs = mock_uninstall.call_args
|
||||
assert kwargs.get("skip_confirm") is True
|
||||
|
||||
def test_default_defers_cache_invalidation(self):
|
||||
"""Without --now, cache invalidation is deferred to next session."""
|
||||
from hermes_cli.skills_hub import handle_skills_slash
|
||||
with patch("hermes_cli.skills_hub.do_uninstall") as mock_uninstall:
|
||||
handle_skills_slash("/skills uninstall test-skill")
|
||||
mock_uninstall.assert_called_once()
|
||||
_, kwargs = mock_uninstall.call_args
|
||||
assert kwargs.get("invalidate_cache") is False
|
||||
|
||||
def test_now_flag_invalidates_cache(self):
|
||||
"""--now opts into immediate cache invalidation."""
|
||||
from hermes_cli.skills_hub import handle_skills_slash
|
||||
with patch("hermes_cli.skills_hub.do_uninstall") as mock_uninstall:
|
||||
handle_skills_slash("/skills uninstall test-skill --now")
|
||||
mock_uninstall.assert_called_once()
|
||||
_, kwargs = mock_uninstall.call_args
|
||||
assert kwargs.get("invalidate_cache") is True
|
||||
assert kwargs.get("skip_confirm", False) is False
|
||||
|
||||
|
||||
class TestDoInstallSkipConfirm:
|
||||
|
||||
@@ -237,53 +237,3 @@ def test_save_platform_tools_still_preserves_mcp_with_platform_default_present()
|
||||
|
||||
# Deselected configurable toolset removed
|
||||
assert "terminal" not in saved
|
||||
|
||||
|
||||
# ── Platform / toolset consistency ────────────────────────────────────────────
|
||||
|
||||
|
||||
class TestPlatformToolsetConsistency:
|
||||
"""Every platform in tools_config.PLATFORMS must have a matching toolset."""
|
||||
|
||||
def test_all_platforms_have_toolset_definitions(self):
|
||||
"""Each platform's default_toolset must exist in TOOLSETS."""
|
||||
from hermes_cli.tools_config import PLATFORMS
|
||||
from toolsets import TOOLSETS
|
||||
|
||||
for platform, meta in PLATFORMS.items():
|
||||
ts_name = meta["default_toolset"]
|
||||
assert ts_name in TOOLSETS, (
|
||||
f"Platform {platform!r} references toolset {ts_name!r} "
|
||||
f"which is not defined in toolsets.py"
|
||||
)
|
||||
|
||||
def test_gateway_toolset_includes_all_messaging_platforms(self):
|
||||
"""hermes-gateway includes list should cover all messaging platforms."""
|
||||
from hermes_cli.tools_config import PLATFORMS
|
||||
from toolsets import TOOLSETS
|
||||
|
||||
gateway_includes = set(TOOLSETS["hermes-gateway"]["includes"])
|
||||
# Exclude non-messaging platforms from the check
|
||||
non_messaging = {"cli", "api_server"}
|
||||
for platform, meta in PLATFORMS.items():
|
||||
if platform in non_messaging:
|
||||
continue
|
||||
ts_name = meta["default_toolset"]
|
||||
assert ts_name in gateway_includes, (
|
||||
f"Platform {platform!r} toolset {ts_name!r} missing from "
|
||||
f"hermes-gateway includes"
|
||||
)
|
||||
|
||||
def test_skills_config_covers_tools_config_platforms(self):
|
||||
"""skills_config.PLATFORMS should have entries for all gateway platforms."""
|
||||
from hermes_cli.tools_config import PLATFORMS as TOOLS_PLATFORMS
|
||||
from hermes_cli.skills_config import PLATFORMS as SKILLS_PLATFORMS
|
||||
|
||||
non_messaging = {"api_server"}
|
||||
for platform in TOOLS_PLATFORMS:
|
||||
if platform in non_messaging:
|
||||
continue
|
||||
assert platform in SKILLS_PLATFORMS, (
|
||||
f"Platform {platform!r} in tools_config but missing from "
|
||||
f"skills_config PLATFORMS"
|
||||
)
|
||||
|
||||
Some files were not shown because too many files have changed in this diff Show More
Reference in New Issue
Block a user