Compare commits

75 Commits

Author SHA1 Message Date
helix4u 526fbcf66e fix(run_agent): recover primary client on openai transport errors 2026-04-10 03:18:35 -07:00
Teknium 6d5f607e48 fix: add all platforms to webhook cross-platform delivery
The delivery tuple in webhook.py only had 5 of 14 platforms with
gateway adapters. Adds whatsapp, matrix, mattermost, homeassistant,
email, dingtalk, feishu, wecom, and bluebubbles so webhooks can
deliver to any connected platform.

Updates docs delivery options table to list all platforms.

Follow-up to cherry-picked fix from olafthiele (PR #7035).
2026-04-10 03:16:24 -07:00
olafthiele 52bd3bd200 mattermost added as deliver to webhook gateway 2026-04-10 03:16:24 -07:00
Teknium 568be71003 fix: extract custom_provider_slug() helper, harden gateway test
- Add custom_provider_slug() to hermes_cli/providers.py as the single
  source of truth for building 'custom:<name>' slugs.
- Use it in resolve_custom_provider() and list_authenticated_providers()
  instead of duplicated inline slug construction.
- Add _session_model_overrides and _voice_mode to gateway test runner
  for object.__new__() safety.
2026-04-10 03:07:00 -07:00
donrhmexe a2f46e4665 fix: include custom_providers in /model command listings and resolution
Custom providers defined in config.yaml under  were
completely invisible to the /model command in both gateway (Telegram,
Discord, etc.) and CLI. The provider listing skipped them and explicit
switching via --provider failed with "Unknown provider".

Root cause: gateway/run.py, cli.py, and model_switch.py only read the
 dict from config, ignoring  entirely.

Changes:
- providers.py: add resolve_custom_provider() and extend
  resolve_provider_full() to check custom_providers after user_providers
- model_switch.py: propagate custom_providers through switch_model(),
  list_authenticated_providers(), and get_authenticated_provider_slugs();
  add custom provider section to provider listings
- gateway/run.py: read custom_providers from config, pass to all
  model-switch calls
- cli.py: hoist config loading, pass custom_providers to listing and
  switch calls

Tests: 4 new regression tests covering listing, resolution, and gateway
command handler. All 71 tests pass.
2026-04-10 03:07:00 -07:00
Teknium 7d426e6536 test: update session ID tests to require auth (follow-up to #6930)
Session continuation now requires API_SERVER_KEY to be configured.
Update TestSessionIdHeader tests to use auth_adapter with Bearer token.
2026-04-10 03:05:04 -07:00
Teknium 30ae68dd33 fix: apply hidden_div regex newline bypass fix to skills_guard.py
The same .* pattern vulnerable to newline bypass that was fixed in
prompt_builder.py (PR #6925) also existed in skills_guard.py. Changed
to [\s\S]*? to match across newlines.
2026-04-10 03:05:04 -07:00
aaronagent 9afe1784bd fix: hidden_div regex bypass with newlines, credential config silent failure, webhook route error severity
prompt_builder.py: The `hidden_div` detection pattern uses `.*` which does not
match newlines in Python regex (re.DOTALL is not passed).  An attacker can bypass
detection by splitting the style attribute across lines:
  `<div style="color:red;\ndisplay: none">injected content</div>`
Replace `.*` with `[\s\S]*?` to match across line boundaries.
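
A minimal sketch of the bypass described above. The two patterns are simplified stand-ins written for illustration, not the actual expressions in prompt_builder.py:

```python
import re

# Hypothetical, simplified patterns; the real hidden_div pattern may differ.
OLD_PATTERN = re.compile(r'<div\s+style=".*display:\s*none', re.IGNORECASE)
NEW_PATTERN = re.compile(r'<div\s+style="[\s\S]*?display:\s*none', re.IGNORECASE)

# Payload from the commit message: the style attribute is split across lines.
payload = '<div style="color:red;\ndisplay: none">injected content</div>'

# Without re.DOTALL, "." stops at the newline, so the old pattern misses it.
old_hit = OLD_PATTERN.search(payload) is not None
# [\s\S] matches any character including newlines, so the new pattern catches it.
new_hit = NEW_PATTERN.search(payload) is not None

print(old_hit, new_hit)
```

Passing re.DOTALL would be another way to let `.` cross line boundaries; the commit chose the character-class form instead.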

credential_files.py: `_load_config_files()` catches all exceptions at DEBUG level
(line 171), making YAML parse failures invisible in production logs.  Users whose
credential files silently fail to mount into sandboxes have no diagnostic clue.
Promote to WARNING to match the severity pattern used by the path validation
warnings at lines 150 and 158 in the same function.

webhook.py: `_reload_dynamic_routes()` logs JSON parse failures at WARNING (line
265) but the impact — stale/corrupted dynamic routes persisting silently — warrants
ERROR level to ensure operator visibility in alerting pipelines.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 03:05:04 -07:00
aaronagent 94f5979cc2 fix(approval,mcp): log silent exception handlers, narrow OAuth catches, close server on error
Three silent `except Exception` blocks in approval.py (lines 345, 387, 469) return
fallback values with zero logging — making it impossible to debug callback failures,
allowlist load errors, or config read issues.  Add logger.warning/error calls that
match the pattern already used by save_permanent_allowlist() and _smart_approve()
in the same file.

In mcp_oauth.py, narrow the overly-broad `except Exception` in get_tokens() and
get_client_info() to the specific exceptions Pydantic's model_validate() can raise
(ValueError, TypeError, KeyError), and include the exception message in the warning.
Also wrap the _wait_for_callback() polling loop in try/finally so the HTTPServer is
always closed — previously an asyncio.CancelledError or any exception in the loop
would leak the server socket.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 03:05:04 -07:00
aaronagent 738f0bac13 fix: align auth-by-message classification with status-code path, decode URLs before secret check
error_classifier.py: Message-only auth errors ("invalid api key", "unauthorized",
etc.) were classified as retryable=True (line 707), inconsistent with the HTTP 401
path (line 432) which correctly uses retryable=False + should_fallback=True.  The
mismatch causes 3 wasted retries with the same broken credential before fallback,
while 401 errors immediately attempt fallback.  Align the message-based path to
match: retryable=False, should_fallback=True.
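
The aligned behavior might look roughly like the sketch below. The `Classification` type, the pattern list, and the default for unmatched errors are assumptions for illustration; only the retryable=False / should_fallback=True outcome for auth errors comes from the description above:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Classification:
    reason: str
    retryable: bool
    should_fallback: bool


# Hypothetical subset of the message patterns treated as auth failures.
AUTH_MESSAGE_PATTERNS = ("invalid api key", "unauthorized")


def classify(status_code: Optional[int], message: str) -> Classification:
    # Status-code path: a 401 skips retries and goes straight to fallback.
    if status_code == 401:
        return Classification("auth", retryable=False, should_fallback=True)
    # Message-only path: now returns the same result as the 401 path.
    lowered = message.lower()
    if any(pattern in lowered for pattern in AUTH_MESSAGE_PATTERNS):
        return Classification("auth", retryable=False, should_fallback=True)
    # Assumed default for anything else (not taken from the source).
    return Classification("unknown", retryable=True, should_fallback=False)
```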

web_tools.py: The _PREFIX_RE secret-detection check in web_extract_tool() runs
against the raw URL string (line 1196).  URL-encoded secrets like %73k-1234... (
sk-1234...) bypass the filter because the regex expects literal ASCII.  Add
urllib.parse.unquote() before the check so percent-encoded variants are also caught.
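
A small sketch of why decoding first matters. `PREFIX_RE` here is a hypothetical stand-in for the real _PREFIX_RE, and the example URL is made up:

```python
import re
from urllib.parse import unquote

# Hypothetical stand-in for _PREFIX_RE; the real pattern covers more prefixes.
PREFIX_RE = re.compile(r"sk-[A-Za-z0-9]{4,}")


def contains_secret_raw(url: str) -> bool:
    """Old behavior: check the raw URL string only."""
    return PREFIX_RE.search(url) is not None


def contains_secret(url: str) -> bool:
    """New behavior: percent-decode first, then run the same check."""
    return PREFIX_RE.search(unquote(url)) is not None


# "%73" is the percent-encoding of "s", so this hides "sk-1234abcd".
encoded_url = "https://example.com/?q=%73k-1234abcd"
print(contains_secret_raw(encoded_url), contains_secret(encoded_url))
```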

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 03:05:04 -07:00
aaronagent 37bb4f807b fix(dingtalk,api): validate session webhook URL origin, cap webhook cache, reject header injection
dingtalk.py: The session_webhook URL from incoming DingTalk messages is POSTed to
without any origin validation (line 290), enabling SSRF attacks via crafted webhook
URLs (e.g. http://169.254.169.254/ to reach cloud metadata).  Add a regex check
that only accepts the official DingTalk API origin (https://api.dingtalk.com/).
Also cap _session_webhooks dict at 500 entries with FIFO eviction to prevent
unbounded memory growth from long-running gateway instances.
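
One way to combine the origin check with a FIFO-capped cache is sketched below. The class name, method names, and URL paths are hypothetical; the allowed origin and the 500-entry cap come from the description above:

```python
import re
from collections import OrderedDict

# Only the official DingTalk API origin is accepted. The trailing slash in the
# pattern keeps look-alike hosts such as api.dingtalk.com.evil.com out.
ALLOWED_ORIGIN_RE = re.compile(r"^https://api\.dingtalk\.com/")
MAX_SESSION_WEBHOOKS = 500


class SessionWebhookCache:
    """Hypothetical helper: validates webhook URLs and evicts oldest-first."""

    def __init__(self, max_entries: int = MAX_SESSION_WEBHOOKS) -> None:
        self._entries: "OrderedDict[str, str]" = OrderedDict()
        self._max_entries = max_entries

    def put(self, chat_id: str, url: str) -> bool:
        # Reject anything outside the allowed origin (SSRF guard).
        if not ALLOWED_ORIGIN_RE.match(url):
            return False
        # Updating an existing key keeps its original insertion position.
        self._entries[chat_id] = url
        # FIFO eviction: drop the oldest entries once over the cap.
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)
        return True

    def keys(self) -> list:
        return list(self._entries)
```

With `max_entries=2`, inserting three chats leaves only the two most recently inserted ones.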

api_server.py: The X-Hermes-Session-Id request header is accepted and echoed back
into response headers (lines 675, 697) without sanitization.  A session ID
containing \r\n enables HTTP response splitting / header injection.  Add a check
that rejects session IDs containing control characters (\r, \n, \x00).
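
The check could be as small as the following sketch. The function name is hypothetical; the rejected characters are the three listed above:

```python
# Characters that must never appear in a value echoed into response headers.
FORBIDDEN_CHARS = ("\r", "\n", "\x00")


def is_safe_session_id(session_id: str) -> bool:
    """Return False if the session ID could split or inject response headers."""
    return not any(ch in session_id for ch in FORBIDDEN_CHARS)
```

A caller would reject the request when this returns False instead of echoing the value back.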

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 03:05:04 -07:00
Julien Talbot b577697189 fix(model_metadata): add xAI Grok context length fallbacks
xAI /v1/models does not return context_length metadata, so Hermes
probes down to the 128k default whenever a user configures a custom
provider pointing at https://api.x.ai/v1. This forces every xAI user
to manually override model.context_length in config.yaml (2M for
Grok 4.20 / 4.1-fast / 4-fast) or lose most of the usable context
window.

Add DEFAULT_CONTEXT_LENGTHS entries for the Grok family so the
fallback lookup returns the correct value via substring matching.
Values sourced from models.dev (2026-04) and cross-checked against
the xAI /v1/models listing:

  - grok-4.20-*          2,000,000  (reasoning, non-reasoning, multi-agent)
  - grok-4-1-fast-*      2,000,000
  - grok-4-fast-*        2,000,000
  - grok-4 / grok-4-0709   256,000
  - grok-code-fast-1       256,000
  - grok-3*                131,072
  - grok-2 / latest        131,072
  - grok-2-vision*           8,192
  - grok (catch-all)       131,072

Keys are ordered longest-first so that specific variants match before
the catch-all, consistent with the existing Claude/Gemma/MiniMax entries.
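
A sketch of longest-first substring matching, using a subset of the values listed above. The function name and the fallback value are assumptions, and the sketch sorts keys by length at lookup time rather than relying on dict order as the real table does:

```python
# Subset of the Grok entries from the table above.
DEFAULT_CONTEXT_LENGTHS = {
    "grok-4.20": 2_000_000,
    "grok-4-1-fast": 2_000_000,
    "grok-4-fast": 2_000_000,
    "grok-code-fast": 256_000,
    "grok-2-vision": 8_192,
    "grok-4": 256_000,
    "grok-3": 131_072,
    "grok-2": 131_072,
    "grok": 131_072,
}


def lookup_context_length(model: str, fallback: int = 128_000) -> int:
    """Return the context length for the longest key contained in `model`."""
    name = model.lower()
    # Longest keys first, so "grok-4-fast" wins over "grok-4" and "grok".
    for key in sorted(DEFAULT_CONTEXT_LENGTHS, key=len, reverse=True):
        if key in name:
            return DEFAULT_CONTEXT_LENGTHS[key]
    # Assumed fallback; the real default lives elsewhere in Hermes.
    return fallback
```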

Add TestDefaultContextLengths.test_grok_models_context_lengths and
test_grok_substring_matching to pin the values and verify the full
lookup path. All 77 tests in test_model_metadata.py pass.
2026-04-10 03:04:19 -07:00
Jeff Davis 5b22e61cfa feat(discord): add allowed_channels whitelist config
Add DISCORD_ALLOWED_CHANNELS (env var) / discord.allowed_channels (config.yaml)
support to restrict the bot to only respond in specified channels.

When set, messages from any channel NOT in the allowed list are silently
ignored — even if the bot is @mentioned. This provides a secure default-
deny posture vs the existing ignored_channels which is default-allow.

This is especially useful when bots in other channels may create new
channels dynamically (e.g., project bots) — a blacklist requires constant
maintenance while a whitelist is set-and-forget.

Follows the same config pattern as ignored_channels and free_response_channels:
- Env var: DISCORD_ALLOWED_CHANNELS (comma-separated channel IDs)
- Config: discord.allowed_channels (string or list of channel IDs)
- Env var takes precedence over config.yaml
- Empty/unset = no restriction (backward compatible)
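
The precedence rules above can be sketched as follows. The function names are hypothetical, and the real adapter in gateway/platforms/discord.py may structure this differently:

```python
from typing import Iterable, Mapping, Optional, Set, Union


def parse_channel_ids(raw: Optional[Union[str, Iterable]]) -> Set[str]:
    """Accept a comma-separated string or a list of channel IDs."""
    if raw is None:
        return set()
    items = raw.split(",") if isinstance(raw, str) else raw
    return {str(item).strip() for item in items if str(item).strip()}


def resolve_allowed_channels(env: Mapping[str, str], config: Mapping) -> Set[str]:
    # The env var takes precedence over config.yaml when it is non-empty.
    env_value = env.get("DISCORD_ALLOWED_CHANNELS", "")
    if env_value.strip():
        return parse_channel_ids(env_value)
    return parse_channel_ids(config.get("discord", {}).get("allowed_channels"))


def should_respond(channel_id: str, allowed: Set[str], ignored: Set[str]) -> bool:
    # The allow list is checked first; an empty allow list means no restriction.
    if allowed and channel_id not in allowed:
        return False
    return channel_id not in ignored
```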

Files changed:
- gateway/platforms/discord.py: check allowed_channels before ignored_channels
- gateway/config.py: map discord.allowed_channels → DISCORD_ALLOWED_CHANNELS
- hermes_cli/config.py: add allowed_channels to DEFAULT_CONFIG
2026-04-10 03:02:42 -07:00
Teknium b39ea46488 fix(gateway): remove DM thread session seeding to prevent cross-thread contamination (#7084)
The session store was copying the ENTIRE parent DM transcript into new
thread sessions. This caused unrelated conversations to bleed across
threads in Slack DMs.

The Slack adapter already handles thread context correctly via
_fetch_thread_context() (conversations.replies API), which fetches
only the actual thread messages. The session-level seeding was both
redundant and harmful.

No other platform (Telegram, Discord) uses DM threads, so the seeding
code path was only triggered by Slack — where it conflicted with the
adapter-level context.

Tests updated to assert thread isolation: all thread sessions start
empty, platform adapters are responsible for injecting thread context.

Salvage of PR #5868 (jarvisxyz). Reported by norbert on Discord.
2026-04-10 03:01:59 -07:00
alt-glitch aad40f6d0c fix(tests): update mocks for file sync changes
- Modal snapshot tests: accept **kw in iter_skills_files/iter_cache_files
  mock lambdas to match new container_base kwarg
- SSH preflight test: mock _detect_remote_home, _ensure_remote_dirs,
  init_session, and FileSyncManager added in file sync PR
2026-04-10 03:01:46 -07:00
alt-glitch 41c233cb99 test: add reproducible perf benchmark for file sync overhead
Direct env.execute() timing — no LLM in the loop.
Measures per-command wall-clock including sync check.

Results on SSH:
- echo median: 617ms (pure SSH round-trip + spawn overhead)
- sync-triggered after 6s wait: 621ms (mtime skip adds ~0ms)
- within-interval (no sync): 618ms

Confirms mtime skip makes sync overhead unmeasurable.
2026-04-10 03:01:46 -07:00
alt-glitch 1f1f297528 feat(environments): unified file sync with change tracking and deletion
Replace per-backend ad-hoc file sync with a shared FileSyncManager
that handles mtime-based change detection, remote deletion of
locally-removed files, and transactional state updates.

- New FileSyncManager class (tools/environments/file_sync.py)
  with callbacks for upload/delete, rate limiting, and rollback
- Shared iter_sync_files() eliminates 3 duplicate implementations
- SSH: replace unconditional rsync with scp + mtime skip
- Modal/Daytona: replace inline _synced_files dict with manager
- All 3 backends now sync credentials + skills + cache uniformly
- Remote deletion: files removed locally are cleaned from remote
- HERMES_FORCE_FILE_SYNC=1 env var for debugging
- Base class _before_execute() simplified to empty hook
- 12 unit tests covering mtime skip, deletion, rollback, rate limiting
2026-04-10 03:01:46 -07:00
buray 1495647636 fix(config): allow HERMES_HOME_MODE env var to override _secure_dir() permissions (#6993)
Operators running a web server (nginx, caddy) that needs to traverse ~/.hermes/ can now set HERMES_HOME_MODE=0701 (or any octal mode) instead of having _secure_dir() revert their manual chmod on every gateway restart. Default behavior (0o700) is unchanged. Fixes #6991. Contributed by @ygd58.
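
A rough sketch of how such an override could be parsed. The function name and the fall-back-to-default behavior on invalid input are assumptions, not taken from the actual _secure_dir() implementation:

```python
from typing import Mapping


def resolve_home_mode(env: Mapping[str, str], default: int = 0o700) -> int:
    """Parse HERMES_HOME_MODE as an octal mode, keeping the default if unset."""
    raw = env.get("HERMES_HOME_MODE", "").strip()
    if not raw:
        return default
    try:
        # Base 8 accepts "0701", "701", and "0o701".
        return int(raw, 8)
    except ValueError:
        # Assumed behavior: ignore an unparsable value rather than fail.
        return default
```

The resulting integer can be passed straight to `os.chmod(path, mode)`.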
2026-04-10 03:00:15 -07:00
Teknium 4e78963fe8 fix(acp): remove dead nested usage dict path
run_conversation() never returns a result["usage"] nested dict —
token counters are always at the top level. The nested path used
the wrong key name ("cached_tokens" vs "cache_read_tokens") and
was never reachable. Remove it.
2026-04-10 03:00:12 -07:00
Yuhan Lei f92298fe95 fix(acp): populate usage from top-level result fields 2026-04-10 03:00:12 -07:00
Kamil Gwóźdź eaa21a8275 fix(copilot): add missing Copilot-Integration-Id header
The GitHub Copilot API now requires a Copilot-Integration-Id header
on all requests. Without it, every API call fails with HTTP 400:
"missing required Copilot-Integration-Id header".

Uses vscode-chat as the integration ID, matching opencode which
shares the same OAuth client ID (Ov23li8tweQw6odWQebz).

Fixes: Copilot provider fails with "missing required Copilot-Integration-Id header" (HTTP 400)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-10 02:59:02 -07:00
Teknium a420235b66 fix: reject foreground timeout above cap instead of clamping
Change behavior from silent clamping to returning an error when the
model requests a foreground timeout exceeding FOREGROUND_MAX_TIMEOUT.
This forces the model to use background=true for long-running commands
rather than silently changing its intent.

- Config default timeouts above the cap are NOT rejected (user's choice)
- Only explicit model-requested timeouts trigger rejection
- Added boundary test for timeout exactly at the limit
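
The reject-instead-of-clamp rule might be sketched like this. The function name, return shape, and error wording are hypothetical; the 600s cap is the default named in the related commit:

```python
from typing import Optional, Tuple

FOREGROUND_MAX_TIMEOUT = 600  # seconds


def resolve_foreground_timeout(
    requested: Optional[int], config_default: int
) -> Tuple[Optional[int], Optional[str]]:
    """Return (effective_timeout, error). Exactly one of the two is None."""
    # No explicit request: use the config default, even if it exceeds the cap.
    if requested is None:
        return config_default, None
    # An explicit model request above the cap is rejected, not clamped.
    if requested > FOREGROUND_MAX_TIMEOUT:
        return None, (
            f"Requested timeout {requested}s exceeds the foreground limit of "
            f"{FOREGROUND_MAX_TIMEOUT}s; use background=true instead."
        )
    return requested, None
```

A request of exactly 600 is accepted, which is the boundary the added test pins down.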
2026-04-10 02:58:54 -07:00
kshitijk4poor 6c3565df57 fix(terminal): cap foreground timeout to prevent session deadlocks
When the model calls terminal() in foreground mode without background=true
(e.g. to start a server), the tool call blocks until the command exits or
the timeout expires. Without an upper bound the model can request arbitrarily
high timeouts (the schema had minimum=1 but no maximum), blocking the entire
agent session for hours until the gateway idle watchdog kills it.

Changes:
- Add FOREGROUND_MAX_TIMEOUT (600s, configurable via
  TERMINAL_MAX_FOREGROUND_TIMEOUT env var) that caps foreground timeout
- Clamp effective_timeout to the cap when background=false and timeout
  exceeds the limit
- Include a timeout_note in the tool result when clamped, nudging the
  model to use background=true for long-running processes
- Update schema description to show the max timeout value
- Remove dead clamping code in the background branch that could never
  fire (max_timeout was set to effective_timeout, so timeout > max_timeout
  was always false)
- Add 7 tests covering clamping, no-clamping, config-default-exceeds-cap
  edge case, background bypass, default timeout, constant value, and
  schema content

Self-review fixes:
- Fixed bug where timeout_note said 'Requested timeout Nones' when
  clamping fired from config default exceeding cap (timeout param is
  None). Now uses unclamped_timeout instead of the raw timeout param.
- Removed unused pytest import from test file
- Extracted test config dict into _make_env_config() helper
- Fixed tautological test_default_value assertion
- Added missing test for config default > cap with no model timeout
2026-04-10 02:58:54 -07:00
kshitijk4poor 51d826f889 fix(gateway): apply /model session overrides so switch persists across messages
The gateway /model command stored session overrides in
_session_model_overrides but run_sync() never consulted them when
resolving the model and runtime for the next message.  It always read
from config.yaml, so the switch was lost as soon as a new agent was
created.

Two fixes:

1. In run_sync(), apply _session_model_overrides after resolving from
   config.yaml/env — the override takes precedence for model, provider,
   api_key, base_url, and api_mode.

2. In post-run fallback detection, check whether the model mismatch
   (agent.model != config_model) is due to an intentional /model switch
   before evicting the cached agent.  Without this, the first message
   after /model would work (cached agent reused) but the fallback
   detector would evict it, causing the next message to revert.

Affects all gateway platforms (Telegram, Discord, Slack, WhatsApp,
Signal, Matrix, BlueBubbles, HomeAssistant) since they all share
GatewayRunner._run_agent().

Fixes #6213
2026-04-10 02:58:42 -07:00
coffee a04854800f fix(security): require auth for session continuation and warn on missing API key
Two security hardening changes for the API server:

1. **Startup warning when no API key is configured.**
   When `API_SERVER_KEY` is not set, all endpoints accept unauthenticated
   requests.  This is the default configuration, but operators may not
   realize the security implications.  A prominent warning at startup
   makes the risk visible.

2. **Require authentication for session continuation.**
   The `X-Hermes-Session-Id` header allows callers to load and continue
   any session stored in state.db.  Without authentication, an attacker
   who can reach the API server (e.g. via CORS from a malicious page,
   or on a shared host) could enumerate session IDs and read conversation
   history — which may contain API keys, passwords, code, or other
   sensitive data shared with the agent.

   Session continuation now returns 403 when no API key is configured,
   with a clear error message explaining how to enable the feature.
   When a key IS configured, the existing Bearer token check already
   gates access.

This is defense-in-depth: the API server is intended for local use,
but defense against cross-origin and shared-host attacks is important
since the default binding is 127.0.0.1 which is reachable from
browsers via DNS rebinding or localhost CORS.
2026-04-10 02:58:21 -07:00
Young 940237c6fd fix(cli): prevent stale image attachment on text paste and voice input
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-10 02:58:18 -07:00
Teknium 95ee453bc0 docs: add cron script timeout and provider recovery documentation
- Add HERMES_CRON_TIMEOUT and HERMES_CRON_SCRIPT_TIMEOUT to env vars reference
- Add script timeout and provider recovery sections to cron features page
- Add timeout resolution chain and credential pool details to cron internals
2026-04-10 02:57:57 -07:00
Dominic Grieco 38cce22e2c fix: harden cron script timeout and provider recovery 2026-04-10 02:57:57 -07:00
Carlos 7368854398 Refresh OpenRouter model catalog
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-10 02:57:39 -07:00
Carlos 38ccd9eb95 Harden setup provider flows
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-10 02:57:39 -07:00
Cocoon-Break 45034b746f fix: set retryable=False for message-based auth errors in _classify_by_message() (#7027)
Auth errors matched by message pattern were incorrectly marked retryable=True, causing futile retry loops. Aligns with _classify_by_status() which already sets retryable=False for 401/403. Fixes #7026. Contributed by @kuishou68.
2026-04-10 02:48:45 -07:00
JiayuWang(王嘉宇) a7588830d4 fix(cli): add missing os and platform imports in uninstall.py (#7034)
Fixes #6983. Contributed by @JiayuuWang.
2026-04-10 02:41:33 -07:00
kshitijk4poor 9431f82aff fix: update Kimi Coding User-Agent to KimiCLI/1.30.0
The hardcoded User-Agent 'KimiCLI/1.3' is outdated — Kimi CLI is now at
v1.30.0. The stale version string causes intermittent 403 errors from
Kimi's coding endpoint ('only available for Coding Agents').

Update all 8 occurrences across run_agent.py, auxiliary_client.py, and
doctor.py to 'KimiCLI/1.30.0' to match the current official Kimi CLI.
2026-04-10 02:37:28 -07:00
Teknium 6da952bc50 fix(gateway): /usage now shows rate limits, cost, and token details between turns (#7038)
The gateway /usage handler only looked in _running_agents for the agent
object, which is only populated while the agent is actively processing a
message. Between turns (when users actually type /usage), the dict is
empty and the handler fell through to a rough message-count estimate.

The agent object actually lives in _agent_cache between turns (kept for
prompt caching). This fix checks both dicts, with _running_agents taking
priority (mid-turn) and _agent_cache as the between-turns fallback.

Also brings the gateway output to parity with the CLI /usage:
- Model name
- Detailed token breakdown (input, output, cache read, cache write)
- Cost estimation (estimated amount or 'included' for subscriptions)
- Cache token lines hidden when zero (cleaner output)

This fixes Nous Portal rate limit headers not showing up for gateway
users — the data was being captured correctly but the handler could
never see it.
2026-04-10 02:33:01 -07:00
Teknium 8779a268a7 feat: add Anthropic Fast Mode support to /fast command (#7037)
Extends the /fast command to support Anthropic's Fast Mode beta in addition
to OpenAI Priority Processing. When enabled on Claude Opus 4.6, adds
speed:"fast" and the fast-mode-2026-02-01 beta header to API requests for
~2.5x faster output token throughput.

Changes:
- hermes_cli/models.py: Add _ANTHROPIC_FAST_MODE_MODELS registry,
  model_supports_fast_mode() now recognizes Claude Opus 4.6,
  resolve_fast_mode_overrides() returns {speed: fast} for Anthropic
  vs {service_tier: priority} for OpenAI
- agent/anthropic_adapter.py: Add _FAST_MODE_BETA constant,
  build_anthropic_kwargs() accepts fast_mode=True which injects
  speed:fast + beta header via extra_headers (skipped for third-party
  Anthropic-compatible endpoints like MiniMax)
- run_agent.py: Pass fast_mode to build_anthropic_kwargs in the
  anthropic_messages path of _build_api_kwargs()
- cli.py: Update _handle_fast_command with provider-aware messaging
  (shows 'Anthropic Fast Mode' vs 'Priority Processing')
- hermes_cli/commands.py: Update /fast description to mention both
  providers
- tests: 13 new tests covering Anthropic model detection, override
  resolution, CLI availability, routing, adapter kwargs, and
  third-party endpoint safety
2026-04-10 02:32:15 -07:00
Teknium 0848a79476 fix(update): always reset on stash conflict — never leave conflict markers (#7010)
When `hermes update` stashes local changes and the restore hits merge
conflicts, the old code prompted the user to reset or keep conflict
markers.  If the user declined the reset, git conflict markers
(<<<<<<< Updated upstream) were left in source files, making hermes
completely unrunnable with a SyntaxError on the next invocation.

Additionally, the interactive path called sys.exit(1), which killed
the entire update process before pip dependency install, skill sync,
and gateway restart could finish — even though the code pull itself
had succeeded.

Changes:
- Always auto-reset to clean state when stash restore conflicts
- Remove the "Reset working tree?" prompt (footgun)
- Remove sys.exit(1) — return False so cmd_update continues normally
- User's changes remain safely in the stash for manual recovery

Also fixes a secondary bug where the conflict handling prompt used
bare input() instead of the input_fn parameter, which would hang
in gateway mode.

Tests updated: replaced prompt/sys.exit assertions with auto-reset
behavior checks; removed the "user declines reset" test (path no
longer exists).
2026-04-10 00:32:20 -07:00
Teknium 871313ae2d fix: clear conversation_history after mid-loop compression to prevent empty sessions (#7001)
After mid-loop compression (triggered by 413, context_overflow, or Anthropic
long-context tier errors), _compress_context() creates a new session in SQLite
and resets _last_flushed_db_idx=0. However, conversation_history was not cleared,
so _flush_messages_to_session_db() computed:

    flush_from = max(len(conversation_history=200), _last_flushed_db_idx=0) = 200
    messages[200:] → empty (compressed messages < 200)

This resulted in zero messages being written to the new session's SQLite store.
On resume, the user would see 'Session found but has no messages.'

The preflight compression path (line 7311) already had the fix:
    conversation_history = None

This commit adds the same clearing to the three mid-loop compression sites:
- Anthropic long-context tier overflow
- HTTP 413 payload too large
- Generic context_overflow error
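
The arithmetic behind the empty session can be reproduced with a simplified sketch. The function below is a hypothetical reduction of the flush logic, not the actual _flush_messages_to_session_db():

```python
from typing import List, Optional


def messages_to_flush(
    messages: List[str],
    conversation_history: Optional[List[str]],
    last_flushed_db_idx: int,
) -> List[str]:
    """Return the slice of `messages` that would be written to the session DB."""
    history_len = len(conversation_history) if conversation_history else 0
    flush_from = max(history_len, last_flushed_db_idx)
    return messages[flush_from:]


stale_history = [f"old-{i}" for i in range(200)]   # pre-compression history
compressed = [f"new-{i}" for i in range(12)]       # post-compression messages

# Bug: stale history makes flush_from = 200, so nothing is written.
before_fix = messages_to_flush(compressed, stale_history, 0)
# Fix: clearing conversation_history makes flush_from = 0.
after_fix = messages_to_flush(compressed, None, 0)

print(len(before_fix), len(after_fix))
```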

Reported by Aaryan (Nous community).
2026-04-10 00:14:59 -07:00
Teknium 13d7ff3420 fix(gateway): bypass text batching when delay is 0 (#6996)
The text batching feature routes TEXT messages through
asyncio.create_task() + asyncio.sleep(delay). Even with delay=0,
the task fires asynchronously and won't complete before synchronous
test assertions. This broke 33 tests across Discord, Matrix, and
WeCom adapters.

When _text_batch_delay_seconds is 0 (the test fixture setting),
dispatch directly to handle_message() instead of going through
the async batching path. This preserves the pre-batching behavior
for tests while keeping batching active in production (default
delay 0.6s).
2026-04-09 23:59:20 -07:00
Teknium d5023d36d8 docs: document streaming timeout auto-detection for local LLMs (#6990)
Add streaming timeout documentation to three pages:

- guides/local-llm-on-mac.md: New 'Timeouts' section with table of all
  three timeouts, their defaults, local auto-adjustments, and env var
  overrides
- reference/faq.md: Tip box in the local models FAQ section
- user-guide/configuration.md: 'Streaming Timeouts' subsection under
  the agent config section

Follow-up to #6967.
2026-04-09 23:28:25 -07:00
Sahil 0602ff8f58 fix(docker): use uv for dependency resolution to fix resolution-too-deep error 2026-04-09 23:25:56 -07:00
Teknium 8104f400f8 test: disable text batching in existing adapter tests
Set _text_batch_delay_seconds = 0 on test adapter fixtures so messages
dispatch immediately (bypassing async batching). This preserves the
existing synchronous assertion patterns while the batching logic is
tested separately in test_text_batching.py.
2026-04-09 23:25:27 -07:00
Teknium 1ed00496f2 test: add text batching tests for Discord, Matrix, WeCom, Telegram, Feishu
22 tests covering:
- Single message dispatch after delay
- Split message aggregation (2-way and 3-way)
- Different chats/rooms not merged
- Adaptive delay for near-limit chunks
- State cleanup after flush
- Split continuation merging

All 5 platform adapters tested.
2026-04-09 23:25:27 -07:00
Teknium f92a0b8596 fix(feishu): add adaptive batch delay for split long messages
Feishu already had text batching with a static 0.6s delay. This adds
adaptive delay: waits 2.0s when a chunk is near the ~4096-char split
point since a continuation is almost certain.

Tracks _last_chunk_len on each queued event to determine the delay.
Configurable via HERMES_FEISHU_TEXT_BATCH_SPLIT_DELAY_SECONDS (default 2.0).

Ref #6892
2026-04-09 23:25:27 -07:00
Teknium 1723e8e998 fix(wecom): add text batching to merge split long messages
Ports the adaptive batching pattern from the Telegram adapter.
WeCom clients split messages around 4000 chars. Adaptive delay waits
2.0s when a chunk is near the limit, 0.6s otherwise. Only text messages
are batched; commands/media dispatch immediately.

Ref #6892
2026-04-09 23:25:27 -07:00
Teknium 07148cac9a fix(matrix): add text batching to merge split long messages
Ports the adaptive batching pattern from the Telegram adapter.
Matrix clients split messages around 4000 chars. Adaptive delay waits
2.0s when a chunk is near the limit, 0.6s otherwise. Only text messages
are batched; commands dispatch immediately.

Ref #6892
2026-04-09 23:25:27 -07:00
Teknium 0fc0c1c83b fix(discord): add text batching to merge split long messages
Cherry-picked from PR #6894 by SHL0MS with fixes:
- Only batch TEXT messages; commands/media dispatch immediately
- Use build_session_key() for proper session-scoped batch keys
- Consistent naming (_text_batch_delay_seconds)
- Proper Dict[str, MessageEvent] typing

Discord splits at 2000 chars (lowest of all platforms). Adaptive delay
waits 2.0s when a chunk is near the limit, 0.6s otherwise.
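
The adaptive delay can be sketched as a single function. How close to the limit counts as "near" is not stated in the commit, so the `near_margin` value below is an assumption, as is the function name:

```python
def text_batch_delay(
    chunk_len: int,
    split_limit: int,
    near_margin: int = 100,      # assumed threshold, not from the source
    base_delay: float = 0.6,
    split_delay: float = 2.0,
) -> float:
    """Wait longer when a chunk is close enough to the platform split limit
    that a continuation message is likely to follow."""
    if chunk_len >= split_limit - near_margin:
        return split_delay
    return base_delay
```

For Discord the split limit is 2000 characters; a 1990-character chunk waits 2.0s while a 500-character chunk waits 0.6s.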
2026-04-09 23:25:27 -07:00
Teknium 5075717949 fix(telegram): adaptive batch delay for split long messages
Cherry-picked from PR #6891 by SHL0MS.
When a chunk is near the 4096-char split point, wait 2.0s instead of 0.6s
since a continuation is almost certain.
2026-04-09 23:25:27 -07:00
Teknium f783986f5a fix: increase stream read timeout default to 120s, auto-raise for local LLMs (#6967)
Raise the default httpx stream read timeout from 60s to 120s for all
providers. Additionally, auto-detect local LLM endpoints (Ollama,
llama.cpp, vLLM) and raise the read timeout to HERMES_API_TIMEOUT
(1800s) since local models can take minutes for prefill on large
contexts before producing the first token.

The stale stream timeout already had this local auto-detection pattern;
the httpx read timeout was missing it — causing a hard 60s wall that
users couldn't find (HERMES_STREAM_READ_TIMEOUT was undocumented).

Changes:
- Default HERMES_STREAM_READ_TIMEOUT: 60s -> 120s
- Auto-detect local endpoints -> raise to 1800s (user override respected)
- Document HERMES_STREAM_READ_TIMEOUT and HERMES_STREAM_STALE_TIMEOUT
- Add 10 parametrized tests

Reported-by: Pavan Srinivas (@pavanandums)
2026-04-09 22:35:30 -07:00
emozilla bda9aa17cb fix(streaming): prevent <think> in prose from suppressing response output
When the model mentions <think> as literal text in its response (e.g.
"(/think not producing <think> tags)"), the streaming display treated it
as a reasoning block opener and suppressed everything after it. The
response box would close with truncated content and no error — the API
response was complete but the display ate it.

Root cause: _stream_delta() matched <think> anywhere in the text stream
regardless of position. Real reasoning blocks always start at the
beginning of a line; mentions in prose appear mid-sentence.

Fix: track line position across streaming deltas with a
_stream_last_was_newline flag. Only enter reasoning suppression when
the tag appears at a block boundary (start of stream, after a newline,
or after only whitespace on the current line). Add a _flush_stream()
safety net that recovers buffered content if no closing tag is found
by end-of-stream.
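
The block-boundary rule can be illustrated on already-accumulated text. This is a simplification: the real fix tracks line position across streaming deltas with a flag, and the function names here are hypothetical:

```python
THINK_OPEN = "<think>"


def is_block_boundary(text: str, tag_index: int) -> bool:
    """True if only whitespace precedes the tag on its line (or in the stream)."""
    # rfind returns -1 when there is no earlier newline, giving line_start = 0.
    line_start = text.rfind("\n", 0, tag_index) + 1
    return text[line_start:tag_index].strip() == ""


def find_reasoning_open(text: str) -> int:
    """Index of the first <think> at a block boundary, or -1 if none."""
    index = text.find(THINK_OPEN)
    while index != -1:
        if is_block_boundary(text, index):
            return index
        # A mid-sentence mention is prose; keep scanning.
        index = text.find(THINK_OPEN, index + 1)
    return -1
```

With this rule, the prose example from the commit message no longer opens a reasoning block, while a tag at the start of the stream or of a line still does.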

Also fixes three related issues discovered during investigation:

- anthropic_adapter: _get_anthropic_max_output() now normalizes dots to
  hyphens so 'claude-opus-4.6' matches the 'claude-opus-4-6' table key
  (was returning 32K instead of 128K)

- run_agent: send explicit max_tokens for Claude models on Nous Portal,
  same as OpenRouter — both proxy to Anthropic's API which requires it.
  Without it the backend defaults to a low limit that truncates responses.

- run_agent: reset truncated_tool_call_retries after successful tool
  execution so a single truncation doesn't poison the entire conversation.
2026-04-09 22:16:36 -07:00
Teknium 8394b5ddd2 feat: expand /fast to all OpenAI Priority Processing models (#6960)
Previously /fast only supported gpt-5.4 and forced a provider switch to
openai-codex. Now supports all 13 models from OpenAI's Priority Processing
pricing table (gpt-5.4, gpt-5.4-mini, gpt-5.2, gpt-5.1, gpt-5, gpt-5-mini,
gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4o, gpt-4o-mini, o3, o4-mini).

Key changes:
- Replaced _FAST_MODE_BACKEND_CONFIG with _PRIORITY_PROCESSING_MODELS frozenset
- Removed provider-forcing logic — service_tier is now injected into whatever
  API path the user is already on (Codex Responses, Chat Completions, or
  OpenRouter passthrough)
- Added request_overrides support to chat_completions path in run_agent.py
- Updated messaging from 'Codex inference tier' to 'Priority Processing'
- Expanded test coverage for all supported models
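
A rough sketch of the membership check and `service_tier` injection described in the key changes. The function name `with_service_tier` and the vendor-prefix stripping are assumptions; only the model list and the `priority` tier name come from the commit messages.

```python
PRIORITY_PROCESSING_MODELS = frozenset({
    "gpt-5.4", "gpt-5.4-mini", "gpt-5.2", "gpt-5.1", "gpt-5", "gpt-5-mini",
    "gpt-4.1", "gpt-4.1-mini", "gpt-4.1-nano", "gpt-4o", "gpt-4o-mini",
    "o3", "o4-mini",
})


def with_service_tier(model: str, api_kwargs: dict, fast_enabled: bool) -> dict:
    """Return a copy of ``api_kwargs`` with service_tier set when supported.

    Assumption: a vendor prefix such as "openai/" is stripped before lookup.
    """
    out = dict(api_kwargs)
    bare_model = model.rsplit("/", 1)[-1].lower()
    if fast_enabled and bare_model in PRIORITY_PROCESSING_MODELS:
        out["service_tier"] = "priority"
    return out
```

Because the override is merged into whatever request kwargs already exist, no provider switch is needed; unsupported models simply pass through unchanged.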
2026-04-09 22:06:30 -07:00
g-guthrie d416a69288 feat: add Codex fast mode toggle (/fast command)
Add /fast slash command to toggle OpenAI Codex service_tier between
normal and priority ('fast') inference. Only exposed for models
registered in _FAST_MODE_BACKEND_CONFIG (currently gpt-5.4).

- Registry-based backend config for extensibility
- Dynamic command visibility (hidden from help/autocomplete for
  non-supported models) via command_filter on SlashCommandCompleter
- service_tier flows through request_overrides from route resolution
- Omit max_output_tokens for Codex backend (rejects it)
- Persists to config.yaml under agent.service_tier

Salvage cleanup: removed simple_term_menu/input() menu (banned),
bare /fast now shows status like /reasoning. Removed redundant
override resolution in _build_api_kwargs — single source of truth
via request_overrides from route.

Co-authored-by: Hermes Agent <hermes@nousresearch.com>
2026-04-09 21:54:32 -07:00
Teknium 4caa635803 fix: add auth.json write-back for Codex retry and valid-token early-return paths
The Codex retry block and valid-token short-circuit in _refresh_entry()
both return early, bypassing the auth.json sync at the end of the method.
This adds _sync_device_code_entry_to_auth_store() calls on both paths
so refreshed/synced tokens are written back to auth.json regardless of
which code path succeeds.
2026-04-09 21:48:50 -07:00
Ben Barclay a64d8a83e1 fix: proactive Codex CLI sync before refresh + retry on failure 2026-04-09 21:48:50 -07:00
Ben Barclay dfde4058cf fix: sync refreshed OAuth tokens from pool back to auth.json providers 2026-04-09 21:48:50 -07:00
Ben Barclay 13b3ea6484 fix: skip stale Nous pool entry when agent_key is expired 2026-04-09 21:48:50 -07:00
Teknium b87d00288d fix: add actionable hint for OpenRouter 'no tool endpoints' error
When OpenRouter returns 'No endpoints found that support tool use'
(HTTP 404), display a hint explaining that provider routing restrictions
may be filtering out tool-capable providers. Links the user directly
to the model's OpenRouter page to check which providers support tools.

The hint fires in the error display block that runs regardless of whether
fallback succeeds — so the user always understands WHY the model failed,
not just that it fell back.

Reported via Discord: GLM-5.1 on OpenRouter with US-based provider
restrictions eliminated all 4 tool-supporting endpoints (DeepInfra,
Z.AI, Friendli, Venice), leaving only 7 non-tool providers.
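
One way the hint could be produced, as a sketch under stated assumptions: the function name, the hint wording, and the `https://openrouter.ai/<model>` page URL pattern are all hypothetical and not taken from the commit.

```python
from typing import Optional

NO_TOOL_ENDPOINTS_MARKER = "no endpoints found that support tool use"


def tool_endpoint_hint(status_code: Optional[int], error_message: str,
                       model: str) -> Optional[str]:
    """Return a user-facing hint for OpenRouter's 'no tool endpoints' 404."""
    if status_code != 404:
        return None
    if NO_TOOL_ENDPOINTS_MARKER not in error_message.lower():
        return None
    return (
        "Hint: provider routing restrictions may be filtering out every "
        "provider that supports tool use for this model. "
        # Assumed URL pattern for the model's OpenRouter page.
        f"Check which providers support tools: https://openrouter.ai/{model}"
    )
```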
2026-04-09 18:03:09 -07:00
kshitijk4poor 08e2a1a51e fix(anthropic): omit tool-streaming beta on MiniMax endpoints
MiniMax's Anthropic-compatible endpoints reject requests that include
the fine-grained-tool-streaming beta header — every tool-use message
triggers a connection error (~18s timeout). Regular chat works fine.

Add _common_betas_for_base_url() that filters out the tool-streaming
beta for Bearer-auth (MiniMax) endpoints while keeping all other betas.
All four client-construction branches now use the filtered list.

Based on #6528 by @HiddenPuppy.
Original cherry-picked from PR #6688 by kshitijk4poor.
Fixes #6510, fixes #6555.
2026-04-09 17:53:52 -07:00
Teknium 9634e20e15 feat: API server model name derived from profile name (#6857)
* feat: API server model name derived from profile name

For multi-user setups (e.g. OpenWebUI), each profile's API server now
advertises a distinct model name on /v1/models:

- Profile 'lucas' -> model ID 'lucas'
- Profile 'admin' -> model ID 'admin'
- Default profile -> 'hermes-agent' (unchanged)

Explicit override via API_SERVER_MODEL_NAME env var or
platforms.api_server.model_name config for custom names.

Resolves friction where OpenWebUI couldn't distinguish multiple
hermes-agent connections all advertising the same model name.

* docs: multi-user setup with profiles for API server + Open WebUI

- api-server.md: added Multi-User Setup section, API_SERVER_MODEL_NAME
  to config table, updated /v1/models description
- open-webui.md: added Multi-User Setup with Profiles section with
  step-by-step guide, updated model name references
- environment-variables.md: added API_SERVER_MODEL_NAME entry
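
The naming priority from the feature description can be sketched like this. The real adapter looks up the active profile itself; here the profile is passed in as a parameter, and the trimmed `/v1/models` payload is a simplification for illustration.

```python
from typing import Optional


def resolve_model_name(explicit: Optional[str], profile: Optional[str]) -> str:
    """Explicit override first, then profile name, then 'hermes-agent'."""
    if explicit and explicit.strip():
        return explicit.strip()
    if profile and profile not in ("default", "custom"):
        return profile
    return "hermes-agent"


def models_payload(model_name: str) -> dict:
    """Trimmed-down /v1/models response body (illustrative fields only)."""
    return {
        "object": "list",
        "data": [{"id": model_name, "object": "model", "owned_by": "hermes"}],
    }
```

A client such as Open WebUI would then see one distinct model ID per profile instead of several identical `hermes-agent` entries.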
2026-04-09 17:07:29 -07:00
AIandI0x1 2d0d05a337 fix(agent): detect truncated streaming tool calls before execution
When a streaming response is cut mid-tool-call (connection drop, timeout),
the accumulated function.arguments is invalid JSON. The mock response
builder defaulted finish_reason to 'stop', so the agent loop treated it
as a valid completed turn and tried to execute tools with broken args.

Fix: validate tool call arguments with json.loads() during mock response
reconstruction. If any are invalid JSON, override finish_reason to
'length'. In the main loop's length handler, if tool calls are present,
refuse to execute and return partial=True with a clear error instead of
silently failing or wasting retries.

Also fixes _thinking_exhausted to not short-circuit when tool calls are
present — truncated tool calls are not thinking exhaustion.

Original cherry-picked from PR #6776 by AIandI0x1.
Closes #6638.
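
A minimal sketch of the validation step, assuming tool calls are plain dicts in the OpenAI `function.arguments` shape. The function name is hypothetical, and treating an empty argument string as "no arguments" is an assumption rather than confirmed behavior.

```python
import json
from typing import List, Optional


def resolve_finish_reason(tool_calls: List[dict],
                          finish_reason: Optional[str]) -> str:
    """Override finish_reason to 'length' if any tool call has broken JSON."""
    for call in tool_calls:
        # Assumption: an empty string means the tool takes no arguments.
        args = call.get("function", {}).get("arguments") or "{}"
        try:
            json.loads(args)
        except (json.JSONDecodeError, TypeError):
            return "length"   # stream was cut mid-tool-call
    return finish_reason or "stop"
```

The caller can then refuse to execute tools whenever the result is `length` and tool calls are present, instead of running them with truncated arguments.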
2026-04-09 17:03:54 -07:00
Teknium 3b554bf839 fix: test for suppress_status_output should capture stdout, not mock _vprint
The test was mocking _vprint entirely, bypassing the suppress guard.
Switch to capturing _print_fn output so the real _vprint runs and
the guard suppresses retry noise as intended.
2026-04-09 16:24:53 -07:00
Teknium 69a0092c38 fix: deduplicate _is_termux() into hermes_constants.is_termux()
Replace 6 identical copies of the Termux detection function across
cli.py, browser_tool.py, voice_mode.py, status.py, doctor.py, and
gateway.py with a single shared implementation in hermes_constants.py.

Each call site imports with its original local name to preserve all
existing callers (internal references and test monkeypatches).
2026-04-09 16:24:53 -07:00
adybag14-cyber c3141429b7 fix(termux): tighten voice setup and mobile chat UX 2026-04-09 16:24:53 -07:00
adybag14-cyber 769ec1ee1a fix(termux): deepen browser, voice, and tui support 2026-04-09 16:24:53 -07:00
adybag14-cyber 3237733ca5 fix(termux): harden execute_code and mobile browser/audio UX 2026-04-09 16:24:53 -07:00
adybag14-cyber 54d5138a54 fix(termux): harden env-backed background jobs 2026-04-09 16:24:53 -07:00
adybag14-cyber 6dcb3c4774 fix(termux): compact narrow-screen tui chrome 2026-04-09 16:24:53 -07:00
adybag14-cyber 096b3f9f12 fix(termux): add local image chat route 2026-04-09 16:24:53 -07:00
adybag14-cyber a3aed1bd26 fix(termux): keep quiet chat output parseable 2026-04-09 16:24:53 -07:00
adybag14-cyber 4970705ed3 fix(termux): silence quiet chat tool previews 2026-04-09 16:24:53 -07:00
adybag14-cyber 2194425918 fix(termux): make setup-hermes use android path 2026-04-09 16:24:53 -07:00
adybag14-cyber 3878495972 fix(termux): disable gateway service flows on android 2026-04-09 16:24:53 -07:00
adybag14-cyber 4e40e93b98 fix(termux): improve status and install UX 2026-04-09 16:24:53 -07:00
adybag14-cyber 122925a6f2 fix(termux): honor temp dirs for local temp artifacts 2026-04-09 16:24:53 -07:00
adybag14-cyber e79cc88985 feat: add tested Termux install path and EOF-aware gh auth 2026-04-09 16:24:53 -07:00
sprmn24 e053433c84 fix(error_classifier): disambiguate usage-limit patterns in _classify_by_message
_classify_by_message had no handling for _USAGE_LIMIT_PATTERNS, so
messages like 'usage limit exceeded, try again in 5 minutes' arriving
without an HTTP status code fell through to FailoverReason.unknown
instead of rate_limit.

Apply the same billing/rate-limit disambiguation that _classify_402
already uses: _USAGE_LIMIT_PATTERNS + transient signal → rate_limit,
_USAGE_LIMIT_PATTERNS alone → billing.

Add 4 tests covering the no-status-code usage-limit path.
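
The disambiguation can be sketched as below. The pattern tuples are illustrative stand-ins for `_USAGE_LIMIT_PATTERNS` and the transient-signal list, and the bare enum replaces the real classifier result object.

```python
from enum import Enum


class FailoverReason(Enum):
    rate_limit = "rate_limit"
    billing = "billing"
    unknown = "unknown"


# Illustrative patterns only; the real lists are longer.
USAGE_LIMIT_PATTERNS = ("usage limit",)
TRANSIENT_SIGNALS = ("try again", "resets at")


def classify_by_message(error_msg: str) -> FailoverReason:
    """Classify a status-less error message by its wording."""
    msg = error_msg.lower()
    if any(p in msg for p in USAGE_LIMIT_PATTERNS):
        if any(s in msg for s in TRANSIENT_SIGNALS):
            return FailoverReason.rate_limit   # periodic quota, retry later
        return FailoverReason.billing          # quota exhausted
    return FailoverReason.unknown
```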
2026-04-09 16:24:13 -07:00
132 changed files with 8086 additions and 893 deletions
+2 -1
@@ -13,7 +13,8 @@ COPY . /opt/hermes
WORKDIR /opt/hermes
# Install Python and Node dependencies in one layer, no cache
RUN pip install --no-cache-dir -e ".[all]" --break-system-packages && \
RUN pip install --no-cache-dir uv --break-system-packages && \
uv pip install --system --break-system-packages --no-cache -e ".[all]" && \
npm install --prefer-offline --no-audit && \
npx playwright install --with-deps chromium --only-shell && \
cd /opt/hermes/scripts/whatsapp-bridge && \
+3 -1
@@ -33,8 +33,10 @@ Use any model you want — [Nous Portal](https://portal.nousresearch.com), [Open
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
```
Works on Linux, macOS, and WSL2. The installer handles everything — Python, Node.js, dependencies, and the `hermes` command. No prerequisites except git.
Works on Linux, macOS, WSL2, and Android via Termux. The installer handles the platform-specific setup for you.
> **Android / Termux:** The tested manual path is documented in the [Termux guide](https://hermes-agent.nousresearch.com/docs/getting-started/termux). On Termux, Hermes installs a curated `.[termux]` extra because the full `.[all]` extra currently pulls Android-incompatible voice dependencies.
>
> **Windows:** Native Windows is not supported. Please install [WSL2](https://learn.microsoft.com/en-us/windows/wsl/install) and run the command above.
After installation:
+6 -7
@@ -451,14 +451,13 @@ class HermesACPAgent(acp.Agent):
await conn.session_update(session_id, update)
usage = None
usage_data = result.get("usage")
if usage_data and isinstance(usage_data, dict):
if any(result.get(key) is not None for key in ("prompt_tokens", "completion_tokens", "total_tokens")):
usage = Usage(
input_tokens=usage_data.get("prompt_tokens", 0),
output_tokens=usage_data.get("completion_tokens", 0),
total_tokens=usage_data.get("total_tokens", 0),
thought_tokens=usage_data.get("reasoning_tokens"),
cached_read_tokens=usage_data.get("cached_tokens"),
input_tokens=result.get("prompt_tokens", 0),
output_tokens=result.get("completion_tokens", 0),
total_tokens=result.get("total_tokens", 0),
thought_tokens=result.get("reasoning_tokens"),
cached_read_tokens=result.get("cache_read_tokens"),
)
stop_reason = "cancelled" if state.cancel_event and state.cancel_event.is_set() else "end_turn"
+54 -9
@@ -74,8 +74,11 @@ def _get_anthropic_max_output(model: str) -> int:
model IDs (claude-sonnet-4-5-20250929) and variant suffixes (:1m, :fast)
resolve correctly. Longest-prefix match wins to avoid e.g. "claude-3-5"
matching before "claude-3-5-sonnet".
Normalizes dots to hyphens so that model names like
``anthropic/claude-opus-4.6`` match the ``claude-opus-4-6`` table key.
"""
m = model.lower()
m = model.lower().replace(".", "-")
best_key = ""
best_val = _ANTHROPIC_DEFAULT_OUTPUT_LIMIT
for key, val in _ANTHROPIC_OUTPUT_LIMITS.items():
@@ -95,6 +98,15 @@ _COMMON_BETAS = [
"interleaved-thinking-2025-05-14",
"fine-grained-tool-streaming-2025-05-14",
]
# MiniMax's Anthropic-compatible endpoints fail tool-use requests when
# the fine-grained tool streaming beta is present. Omit it so tool calls
# fall back to the provider's default response path.
_TOOL_STREAMING_BETA = "fine-grained-tool-streaming-2025-05-14"
# Fast mode beta — enables the ``speed: "fast"`` request parameter for
# significantly higher output token throughput on Opus 4.6 (~2.5x).
# See https://platform.claude.com/docs/en/build-with-claude/fast-mode
_FAST_MODE_BETA = "fast-mode-2026-02-01"
# Additional beta headers required for OAuth/subscription auth.
# Matches what Claude Code (and pi-ai / OpenCode) send.
@@ -204,6 +216,19 @@ def _requires_bearer_auth(base_url: str | None) -> bool:
return normalized.startswith(("https://api.minimax.io/anthropic", "https://api.minimaxi.com/anthropic"))
def _common_betas_for_base_url(base_url: str | None) -> list[str]:
"""Return the beta headers that are safe for the configured endpoint.
MiniMax's Anthropic-compatible endpoints (Bearer-auth) reject requests
that include Anthropic's ``fine-grained-tool-streaming`` beta — every
tool-use message triggers a connection error. Strip that beta for
Bearer-auth endpoints while keeping all other betas intact.
"""
if _requires_bearer_auth(base_url):
return [b for b in _COMMON_BETAS if b != _TOOL_STREAMING_BETA]
return _COMMON_BETAS
def build_anthropic_client(api_key: str, base_url: str = None):
"""Create an Anthropic client, auto-detecting setup-tokens vs API keys.
@@ -222,6 +247,7 @@ def build_anthropic_client(api_key: str, base_url: str = None):
}
if normalized_base_url:
kwargs["base_url"] = normalized_base_url
common_betas = _common_betas_for_base_url(normalized_base_url)
if _requires_bearer_auth(normalized_base_url):
# Some Anthropic-compatible providers (e.g. MiniMax) expect the API key in
@@ -231,21 +257,21 @@ def build_anthropic_client(api_key: str, base_url: str = None):
# not use Anthropic's sk-ant-api prefix and would otherwise be misread as
# Anthropic OAuth/setup tokens.
kwargs["auth_token"] = api_key
if _COMMON_BETAS:
kwargs["default_headers"] = {"anthropic-beta": ",".join(_COMMON_BETAS)}
if common_betas:
kwargs["default_headers"] = {"anthropic-beta": ",".join(common_betas)}
elif _is_third_party_anthropic_endpoint(base_url):
# Third-party proxies (Azure AI Foundry, AWS Bedrock, etc.) use their
# own API keys with x-api-key auth. Skip OAuth detection — their keys
# don't follow Anthropic's sk-ant-* prefix convention and would be
# misclassified as OAuth tokens.
kwargs["api_key"] = api_key
if _COMMON_BETAS:
kwargs["default_headers"] = {"anthropic-beta": ",".join(_COMMON_BETAS)}
if common_betas:
kwargs["default_headers"] = {"anthropic-beta": ",".join(common_betas)}
elif _is_oauth_token(api_key):
# OAuth access token / setup-token → Bearer auth + Claude Code identity.
# Anthropic routes OAuth requests based on user-agent and headers;
# without Claude Code's fingerprint, requests get intermittent 500s.
all_betas = _COMMON_BETAS + _OAUTH_ONLY_BETAS
all_betas = common_betas + _OAUTH_ONLY_BETAS
kwargs["auth_token"] = api_key
kwargs["default_headers"] = {
"anthropic-beta": ",".join(all_betas),
@@ -255,8 +281,8 @@ def build_anthropic_client(api_key: str, base_url: str = None):
else:
# Regular API key → x-api-key header + common betas
kwargs["api_key"] = api_key
if _COMMON_BETAS:
kwargs["default_headers"] = {"anthropic-beta": ",".join(_COMMON_BETAS)}
if common_betas:
kwargs["default_headers"] = {"anthropic-beta": ",".join(common_betas)}
return _anthropic_sdk.Anthropic(**kwargs)
@@ -1235,6 +1261,7 @@ def build_anthropic_kwargs(
preserve_dots: bool = False,
context_length: Optional[int] = None,
base_url: str | None = None,
fast_mode: bool = False,
) -> Dict[str, Any]:
"""Build kwargs for anthropic.messages.create().
@@ -1268,6 +1295,10 @@ def build_anthropic_kwargs(
When *base_url* points to a third-party Anthropic-compatible endpoint,
thinking block signatures are stripped (they are Anthropic-proprietary).
When *fast_mode* is True, adds ``speed: "fast"`` and the fast-mode beta
header for ~2.5x faster output throughput on Opus 4.6. Currently only
supported on native Anthropic endpoints (not third-party compatible ones).
"""
system, anthropic_messages = convert_messages_to_anthropic(messages, base_url=base_url)
anthropic_tools = convert_tools_to_anthropic(tools) if tools else []
@@ -1366,6 +1397,20 @@ def build_anthropic_kwargs(
kwargs["temperature"] = 1
kwargs["max_tokens"] = max(effective_max_tokens, budget + 4096)
# ── Fast mode (Opus 4.6 only) ────────────────────────────────────
# Adds speed:"fast" + the fast-mode beta header for ~2.5x output speed.
# Only for native Anthropic endpoints — third-party providers would
# reject the unknown beta header and speed parameter.
if fast_mode and not _is_third_party_anthropic_endpoint(base_url):
kwargs["speed"] = "fast"
# Build extra_headers with ALL applicable betas (the per-request
# extra_headers override the client-level anthropic-beta header).
betas = list(_common_betas_for_base_url(base_url))
if is_oauth:
betas.extend(_OAUTH_ONLY_BETAS)
betas.append(_FAST_MODE_BETA)
kwargs["extra_headers"] = {"anthropic-beta": ",".join(betas)}
return kwargs
@@ -1427,4 +1472,4 @@ def normalize_anthropic_response(
reasoning_details=reasoning_details or None,
),
finish_reason,
)
)
+5 -5
@@ -702,7 +702,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
logger.debug("Auxiliary text client: %s (%s) via pool", pconfig.name, model)
extra = {}
if "api.kimi.com" in base_url.lower():
extra["default_headers"] = {"User-Agent": "KimiCLI/1.3"}
extra["default_headers"] = {"User-Agent": "KimiCLI/1.30.0"}
elif "api.githubcopilot.com" in base_url.lower():
from hermes_cli.models import copilot_default_headers
@@ -721,7 +721,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
logger.debug("Auxiliary text client: %s (%s)", pconfig.name, model)
extra = {}
if "api.kimi.com" in base_url.lower():
extra["default_headers"] = {"User-Agent": "KimiCLI/1.3"}
extra["default_headers"] = {"User-Agent": "KimiCLI/1.30.0"}
elif "api.githubcopilot.com" in base_url.lower():
from hermes_cli.models import copilot_default_headers
@@ -1195,7 +1195,7 @@ def _to_async_client(sync_client, model: str):
async_kwargs["default_headers"] = copilot_default_headers()
elif "api.kimi.com" in base_lower:
async_kwargs["default_headers"] = {"User-Agent": "KimiCLI/1.3"}
async_kwargs["default_headers"] = {"User-Agent": "KimiCLI/1.30.0"}
return AsyncOpenAI(**async_kwargs), model
@@ -1317,7 +1317,7 @@ def resolve_provider_client(
final_model = model or _read_main_model() or "gpt-4o-mini"
extra = {}
if "api.kimi.com" in custom_base.lower():
extra["default_headers"] = {"User-Agent": "KimiCLI/1.3"}
extra["default_headers"] = {"User-Agent": "KimiCLI/1.30.0"}
elif "api.githubcopilot.com" in custom_base.lower():
from hermes_cli.models import copilot_default_headers
extra["default_headers"] = copilot_default_headers()
@@ -1400,7 +1400,7 @@ def resolve_provider_client(
# Provider-specific headers
headers = {}
if "api.kimi.com" in base_url.lower():
headers["User-Agent"] = "KimiCLI/1.3"
headers["User-Agent"] = "KimiCLI/1.30.0"
elif "api.githubcopilot.com" in base_url.lower():
from hermes_cli.models import copilot_default_headers
+106
@@ -20,6 +20,7 @@ from hermes_cli.auth import (
DEFAULT_AGENT_KEY_MIN_TTL_SECONDS,
KIMI_CODE_BASE_URL,
PROVIDER_REGISTRY,
_auth_store_lock,
_codex_access_token_is_expiring,
_decode_jwt_claims,
_import_codex_cli_tokens,
@@ -27,6 +28,8 @@ from hermes_cli.auth import (
_load_provider_state,
_resolve_kimi_base_url,
_resolve_zai_base_url,
_save_auth_store,
_save_provider_state,
read_credential_pool,
write_credential_pool,
)
@@ -479,6 +482,67 @@ class CredentialPool:
logger.debug("Failed to sync from ~/.codex/auth.json: %s", exc)
return entry
def _sync_device_code_entry_to_auth_store(self, entry: PooledCredential) -> None:
"""Write refreshed pool entry tokens back to auth.json providers.
After a pool-level refresh, the pool entry has fresh tokens but
auth.json's ``providers.<id>`` still holds the pre-refresh state.
On the next ``load_pool()``, ``_seed_from_singletons()`` reads that
stale state and can overwrite the fresh pool entry — potentially
re-seeding a consumed single-use refresh token.
Applies to any OAuth provider whose singleton lives in auth.json
(currently Nous and OpenAI Codex).
"""
if entry.source != "device_code":
return
try:
with _auth_store_lock():
auth_store = _load_auth_store()
if self.provider == "nous":
state = _load_provider_state(auth_store, "nous")
if state is None:
return
state["access_token"] = entry.access_token
if entry.refresh_token:
state["refresh_token"] = entry.refresh_token
if entry.expires_at:
state["expires_at"] = entry.expires_at
if entry.agent_key:
state["agent_key"] = entry.agent_key
if entry.agent_key_expires_at:
state["agent_key_expires_at"] = entry.agent_key_expires_at
for extra_key in ("obtained_at", "expires_in", "agent_key_id",
"agent_key_expires_in", "agent_key_reused",
"agent_key_obtained_at"):
val = entry.extra.get(extra_key)
if val is not None:
state[extra_key] = val
if entry.inference_base_url:
state["inference_base_url"] = entry.inference_base_url
_save_provider_state(auth_store, "nous", state)
elif self.provider == "openai-codex":
state = _load_provider_state(auth_store, "openai-codex")
if not isinstance(state, dict):
return
tokens = state.get("tokens")
if not isinstance(tokens, dict):
return
tokens["access_token"] = entry.access_token
if entry.refresh_token:
tokens["refresh_token"] = entry.refresh_token
if entry.last_refresh:
state["last_refresh"] = entry.last_refresh
_save_provider_state(auth_store, "openai-codex", state)
else:
return
_save_auth_store(auth_store)
except Exception as exc:
logger.debug("Failed to sync %s pool entry back to auth store: %s", self.provider, exc)
def _refresh_entry(self, entry: PooledCredential, *, force: bool) -> Optional[PooledCredential]:
if entry.auth_type != AUTH_TYPE_OAUTH or not entry.refresh_token:
if force:
@@ -513,6 +577,13 @@ class CredentialPool:
except Exception as wexc:
logger.debug("Failed to write refreshed token to credentials file: %s", wexc)
elif self.provider == "openai-codex":
# Proactively sync from ~/.codex/auth.json before refresh.
# The Codex CLI (or another Hermes profile) may have already
# consumed our refresh_token. Syncing first avoids a
# "refresh_token_reused" error when the CLI has a newer pair.
synced = self._sync_codex_entry_from_cli(entry)
if synced is not entry:
entry = synced
refreshed = auth_mod.refresh_codex_oauth_pure(
entry.access_token,
entry.refresh_token,
@@ -598,6 +669,37 @@ class CredentialPool:
# Credentials file had a valid (non-expired) token — use it directly
logger.debug("Credentials file has valid token, using without refresh")
return synced
# For openai-codex: the refresh_token may have been consumed by
# the Codex CLI between our proactive sync and the refresh call.
# Re-sync and retry once.
if self.provider == "openai-codex":
synced = self._sync_codex_entry_from_cli(entry)
if synced.refresh_token != entry.refresh_token:
logger.debug("Retrying Codex refresh with synced token from ~/.codex/auth.json")
try:
refreshed = auth_mod.refresh_codex_oauth_pure(
synced.access_token,
synced.refresh_token,
)
updated = replace(
synced,
access_token=refreshed["access_token"],
refresh_token=refreshed["refresh_token"],
last_refresh=refreshed.get("last_refresh"),
last_status=STATUS_OK,
last_status_at=None,
last_error_code=None,
)
self._replace_entry(synced, updated)
self._persist()
self._sync_device_code_entry_to_auth_store(updated)
return updated
except Exception as retry_exc:
logger.debug("Codex retry refresh also failed: %s", retry_exc)
elif not self._entry_needs_refresh(synced):
logger.debug("Codex CLI has valid token, using without refresh")
self._sync_device_code_entry_to_auth_store(synced)
return synced
self._mark_exhausted(entry, None)
return None
@@ -612,6 +714,10 @@ class CredentialPool:
)
self._replace_entry(entry, updated)
self._persist()
# Sync refreshed tokens back to auth.json providers so that
# _seed_from_singletons() on the next load_pool() sees fresh state
# instead of re-seeding stale/consumed tokens.
self._sync_device_code_entry_to_auth_store(updated)
return updated
def _entry_needs_refresh(self, entry: PooledCredential) -> bool:
+27 -1
@@ -677,6 +677,27 @@ def _classify_by_message(
should_compress=True,
)
# Usage-limit patterns need the same disambiguation as 402: some providers
# surface "usage limit" errors without an HTTP status code. A transient
# signal ("try again", "resets at", …) means it's a periodic quota, not
# billing exhaustion.
has_usage_limit = any(p in error_msg for p in _USAGE_LIMIT_PATTERNS)
if has_usage_limit:
has_transient_signal = any(p in error_msg for p in _USAGE_LIMIT_TRANSIENT_SIGNALS)
if has_transient_signal:
return result_fn(
FailoverReason.rate_limit,
retryable=True,
should_rotate_credential=True,
should_fallback=True,
)
return result_fn(
FailoverReason.billing,
retryable=False,
should_rotate_credential=True,
should_fallback=True,
)
# Billing patterns
if any(p in error_msg for p in _BILLING_PATTERNS):
return result_fn(
@@ -704,11 +725,16 @@ def _classify_by_message(
)
# Auth patterns
# Auth errors should NOT be retried directly — the credential is invalid and
# retrying with the same key will always fail. Set retryable=False so the
# caller triggers credential rotation (should_rotate_credential=True) or
# provider fallback rather than an immediate retry loop.
if any(p in error_msg for p in _AUTH_PATTERNS):
return result_fn(
FailoverReason.auth,
retryable=True,
retryable=False,
should_rotate_credential=True,
should_fallback=True,
)
# Model not found patterns
+15
@@ -126,6 +126,21 @@ DEFAULT_CONTEXT_LENGTHS = {
"minimax": 1048576,
# GLM
"glm": 202752,
# xAI Grok — xAI /v1/models does not return context_length metadata,
# so these hardcoded fallbacks prevent Hermes from probing-down to
# the default 128k when the user points at https://api.x.ai/v1
# via a custom provider. Values sourced from models.dev (2026-04).
# Keys use substring matching (longest-first), so e.g. "grok-4.20"
# matches "grok-4.20-0309-reasoning" / "-non-reasoning" / "-multi-agent-0309".
"grok-code-fast": 256000, # grok-code-fast-1
"grok-4-1-fast": 2000000, # grok-4-1-fast-(non-)reasoning
"grok-2-vision": 8192, # grok-2-vision, -1212, -latest
"grok-4-fast": 2000000, # grok-4-fast-(non-)reasoning
"grok-4.20": 2000000, # grok-4.20-0309-(non-)reasoning, -multi-agent-0309
"grok-4": 256000, # grok-4, grok-4-0709
"grok-3": 131072, # grok-3, grok-3-mini, grok-3-fast, grok-3-mini-fast
"grok-2": 131072, # grok-2, grok-2-1212, grok-2-latest
"grok": 131072, # catch-all (grok-beta, unknown grok-*)
# Kimi
"kimi": 262144,
# Arcee
+1 -1
@@ -40,7 +40,7 @@ _CONTEXT_THREAT_PATTERNS = [
(r'disregard\s+(your|all|any)\s+(instructions|rules|guidelines)', "disregard_rules"),
(r'act\s+as\s+(if|though)\s+you\s+(have\s+no|don\'t\s+have)\s+(restrictions|limits|rules)', "bypass_restrictions"),
(r'<!--[^>]*(?:ignore|override|system|secret|hidden)[^>]*-->', "html_comment_injection"),
(r'<\s*div\s+style\s*=\s*["\'].*display\s*:\s*none', "hidden_div"),
(r'<\s*div\s+style\s*=\s*["\'][\s\S]*?display\s*:\s*none', "hidden_div"),
(r'translate\s+.*\s+into\s+.*\s+and\s+(execute|run|eval)', "translate_execute"),
(r'curl\s+[^\n]*\$\{?\w*(KEY|TOKEN|SECRET|PASSWORD|CREDENTIAL|API)', "exfil_curl"),
(r'cat\s+[^\n]*(\.env|credentials|\.netrc|\.pgpass)', "read_secrets"),
+559 -105
File diff suppressed because it is too large.
+15
@@ -0,0 +1,15 @@
# Termux / Android dependency constraints for Hermes Agent.
#
# Usage:
# python -m pip install -e '.[termux]' -c constraints-termux.txt
#
# These pins keep the tested Android install path stable when upstream packages
# move faster than Termux-compatible wheels / sdists.
ipython<10
jedi>=0.18.1,<0.20
parso>=0.8.4,<0.9
stack-data>=0.6,<0.7
pexpect>4.3,<5
matplotlib-inline>=0.1.7,<0.2
asttokens>=2.1,<3
+60 -3
@@ -346,7 +346,42 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
return None
_SCRIPT_TIMEOUT = 120 # seconds
_DEFAULT_SCRIPT_TIMEOUT = 120 # seconds
# Backward-compatible module override used by tests and emergency monkeypatches.
_SCRIPT_TIMEOUT = _DEFAULT_SCRIPT_TIMEOUT
def _get_script_timeout() -> int:
"""Resolve cron pre-run script timeout from module/env/config with a safe default."""
if _SCRIPT_TIMEOUT != _DEFAULT_SCRIPT_TIMEOUT:
try:
timeout = int(float(_SCRIPT_TIMEOUT))
if timeout > 0:
return timeout
except Exception:
logger.warning("Invalid patched _SCRIPT_TIMEOUT=%r; using env/config/default", _SCRIPT_TIMEOUT)
env_value = os.getenv("HERMES_CRON_SCRIPT_TIMEOUT", "").strip()
if env_value:
try:
timeout = int(float(env_value))
if timeout > 0:
return timeout
except Exception:
logger.warning("Invalid HERMES_CRON_SCRIPT_TIMEOUT=%r; using config/default", env_value)
try:
cfg = load_config() or {}
cron_cfg = cfg.get("cron", {}) if isinstance(cfg, dict) else {}
configured = cron_cfg.get("script_timeout_seconds")
if configured is not None:
timeout = int(float(configured))
if timeout > 0:
return timeout
except Exception as exc:
logger.debug("Failed to load cron script timeout from config: %s", exc)
return _DEFAULT_SCRIPT_TIMEOUT
def _run_job_script(script_path: str) -> tuple[bool, str]:
@@ -393,12 +428,14 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
if not path.is_file():
return False, f"Script path is not a file: {path}"
script_timeout = _get_script_timeout()
try:
result = subprocess.run(
[sys.executable, str(path)],
capture_output=True,
text=True,
timeout=_SCRIPT_TIMEOUT,
timeout=script_timeout,
cwd=str(path.parent),
)
stdout = (result.stdout or "").strip()
@@ -422,7 +459,7 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
return True, stdout
except subprocess.TimeoutExpired:
return False, f"Script timed out after {_SCRIPT_TIMEOUT}s: {path}"
return False, f"Script timed out after {script_timeout}s: {path}"
except Exception as exc:
return False, f"Script execution failed: {exc}"
@@ -646,6 +683,24 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
},
)
fallback_model = _cfg.get("fallback_providers") or _cfg.get("fallback_model") or None
credential_pool = None
runtime_provider = str(turn_route["runtime"].get("provider") or "").strip().lower()
if runtime_provider:
try:
from agent.credential_pool import load_pool
pool = load_pool(runtime_provider)
if pool.has_credentials():
credential_pool = pool
logger.info(
"Job '%s': loaded credential pool for provider %s with %d entries",
job_id,
runtime_provider,
len(pool.entries()),
)
except Exception as e:
logger.debug("Job '%s': failed to load credential pool for %s: %s", job_id, runtime_provider, e)
agent = AIAgent(
model=turn_route["model"],
api_key=turn_route["runtime"].get("api_key"),
@@ -657,6 +712,8 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
max_iterations=max_iterations,
reasoning_config=reasoning_config,
prefill_messages=prefill_messages,
fallback_model=fallback_model,
credential_pool=credential_pool,
providers_allowed=pr.get("only"),
providers_ignored=pr.get("ignore"),
providers_order=pr.get("order"),
+9
@@ -581,6 +581,12 @@ def load_gateway_config() -> GatewayConfig:
if isinstance(ic, list):
ic = ",".join(str(v) for v in ic)
os.environ["DISCORD_IGNORED_CHANNELS"] = str(ic)
# allowed_channels: if set, bot ONLY responds in these channels (whitelist)
ac = discord_cfg.get("allowed_channels")
if ac is not None and not os.getenv("DISCORD_ALLOWED_CHANNELS"):
if isinstance(ac, list):
ac = ",".join(str(v) for v in ac)
os.environ["DISCORD_ALLOWED_CHANNELS"] = str(ac)
# no_thread_channels: channels where bot responds directly without creating thread
ntc = discord_cfg.get("no_thread_channels")
if ntc is not None and not os.getenv("DISCORD_NO_THREAD_CHANNELS"):
@@ -901,6 +907,9 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
pass
if api_server_host:
config.platforms[Platform.API_SERVER].extra["host"] = api_server_host
api_server_model_name = os.getenv("API_SERVER_MODEL_NAME", "")
if api_server_model_name:
config.platforms[Platform.API_SERVER].extra["model_name"] = api_server_model_name
# Webhook platform
webhook_enabled = os.getenv("WEBHOOK_ENABLED", "").lower() in ("true", "1", "yes")
+62 -6
@@ -24,6 +24,7 @@ import hmac
import json
import logging
import os
import re
import sqlite3
import time
import uuid
@@ -299,6 +300,9 @@ class APIServerAdapter(BasePlatformAdapter):
self._cors_origins: tuple[str, ...] = self._parse_cors_origins(
extra.get("cors_origins", os.getenv("API_SERVER_CORS_ORIGINS", "")),
)
self._model_name: str = self._resolve_model_name(
extra.get("model_name", os.getenv("API_SERVER_MODEL_NAME", "")),
)
self._app: Optional["web.Application"] = None
self._runner: Optional["web.AppRunner"] = None
self._site: Optional["web.TCPSite"] = None
@@ -324,6 +328,26 @@ class APIServerAdapter(BasePlatformAdapter):
return tuple(str(item).strip() for item in items if str(item).strip())
@staticmethod
def _resolve_model_name(explicit: str) -> str:
"""Derive the advertised model name for /v1/models.
Priority:
1. Explicit override (config extra or API_SERVER_MODEL_NAME env var)
2. Active profile name (so each profile advertises a distinct model)
3. Fallback: "hermes-agent"
"""
if explicit and explicit.strip():
return explicit.strip()
try:
from hermes_cli.profiles import get_active_profile_name
profile = get_active_profile_name()
if profile and profile not in ("default", "custom"):
return profile
except Exception:
pass
return "hermes-agent"
def _cors_headers_for_origin(self, origin: str) -> Optional[Dict[str, str]]:
"""Return CORS headers for an allowed browser origin."""
if not origin or not self._cors_origins:
@@ -468,12 +492,12 @@ class APIServerAdapter(BasePlatformAdapter):
"object": "list",
"data": [
{
-"id": "hermes-agent",
+"id": self._model_name,
"object": "model",
"created": int(time.time()),
"owned_by": "hermes",
"permission": [],
-"root": "hermes-agent",
+"root": self._model_name,
"parent": None,
}
],
@@ -531,8 +555,32 @@ class APIServerAdapter(BasePlatformAdapter):
# Allow caller to continue an existing session by passing X-Hermes-Session-Id.
# When provided, history is loaded from state.db instead of from the request body.
#
# Security: session continuation exposes conversation history, so it is
# only allowed when the API key is configured and the request is
# authenticated. Without this gate, any unauthenticated client could
# read arbitrary session history by guessing/enumerating session IDs.
provided_session_id = request.headers.get("X-Hermes-Session-Id", "").strip()
if provided_session_id:
if not self._api_key:
logger.warning(
"Session continuation via X-Hermes-Session-Id rejected: "
"no API key configured. Set API_SERVER_KEY to enable "
"session continuity."
)
return web.json_response(
_openai_error(
"Session continuation requires API key authentication. "
"Configure API_SERVER_KEY to enable this feature."
),
status=403,
)
# Sanitize: reject control characters that could enable header injection.
if re.search(r'[\r\n\x00]', provided_session_id):
return web.json_response(
{"error": {"message": "Invalid session ID", "type": "invalid_request_error"}},
status=400,
)
session_id = provided_session_id
try:
db = self._ensure_session_db()
@@ -546,7 +594,7 @@ class APIServerAdapter(BasePlatformAdapter):
# history already set from request body above
completion_id = f"chatcmpl-{uuid.uuid4().hex[:29]}"
-model_name = body.get("model", "hermes-agent")
+model_name = body.get("model", self._model_name)
created = int(time.time())
if stream:
@@ -923,7 +971,7 @@ class APIServerAdapter(BasePlatformAdapter):
"object": "response",
"status": "completed",
"created_at": created_at,
-"model": body.get("model", "hermes-agent"),
+"model": body.get("model", self._model_name),
"output": output_items,
"usage": {
"input_tokens": usage.get("input_tokens", 0),
@@ -1652,9 +1700,17 @@ class APIServerAdapter(BasePlatformAdapter):
await self._site.start()
self._mark_connected()
if not self._api_key:
logger.warning(
"[%s] ⚠️ No API key configured (API_SERVER_KEY / platforms.api_server.key). "
"All requests will be accepted without authentication. "
"Set an API key for production deployments to prevent "
"unauthorized access to sessions, responses, and cron jobs.",
self.name,
)
logger.info(
-"[%s] API server listening on http://%s:%d",
-self.name, self._host, self._port,
+"[%s] API server listening on http://%s:%d (model: %s)",
+self.name, self._host, self._port, self._model_name,
)
return True
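The `X-Hermes-Session-Id` hunk in this file applies two checks before a caller-supplied session ID is accepted: continuation is refused with 403 when no API key is configured, and IDs containing CR, LF, or NUL are refused with 400. A minimal sketch of that decision order, assuming a hypothetical pure function `check_session_header` (the real handler returns `web.json_response` objects and performs request authentication elsewhere, which is not modelled here):

```python
import re
from typing import Optional, Tuple

# Same character class as the diff: CR, LF, and NUL.
_CONTROL_CHARS = re.compile(r"[\r\n\x00]")


def check_session_header(
    raw_header: str, api_key: Optional[str]
) -> Tuple[int, Optional[str]]:
    """Return (http_status, session_id); session_id is None unless accepted."""
    session_id = raw_header.strip()
    if not session_id:
        return 200, None  # no continuation requested, nothing to validate
    if not api_key:
        return 403, None  # continuation requires a configured API key
    if _CONTROL_CHARS.search(session_id):
        return 400, None  # reject characters usable for header injection
    return 200, session_id
```

Note that `str.strip()` already removes leading and trailing CR/LF, so the regular expression mainly catches control characters in the interior of the value and NUL bytes anywhere.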
+11 -2
@@ -20,6 +20,7 @@ Configuration in config.yaml:
import asyncio
import logging
import os
import re
import time
import uuid
from datetime import datetime, timezone
@@ -54,6 +55,8 @@ MAX_MESSAGE_LENGTH = 20000
DEDUP_WINDOW_SECONDS = 300
DEDUP_MAX_SIZE = 1000
RECONNECT_BACKOFF = [2, 5, 10, 30, 60]
_SESSION_WEBHOOKS_MAX = 500
_DINGTALK_WEBHOOK_RE = re.compile(r'^https://api\.dingtalk\.com/')
def check_dingtalk_requirements() -> bool:
@@ -195,9 +198,15 @@ class DingTalkAdapter(BasePlatformAdapter):
chat_id = conversation_id or sender_id
chat_type = "group" if is_group else "dm"
-# Store session webhook for reply routing
+# Store session webhook for reply routing (validate origin to prevent SSRF)
session_webhook = getattr(message, "session_webhook", None) or ""
-if session_webhook and chat_id:
+if session_webhook and chat_id and _DINGTALK_WEBHOOK_RE.match(session_webhook):
if len(self._session_webhooks) >= _SESSION_WEBHOOKS_MAX:
# Evict oldest entry to cap memory growth
try:
self._session_webhooks.pop(next(iter(self._session_webhooks)))
except StopIteration:
pass
self._session_webhooks[chat_id] = session_webhook
source = self.build_source(
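The DingTalk hunk above combines two safeguards: a webhook URL is stored only if it starts with `https://api.dingtalk.com/`, and the store is capped by evicting the oldest entry once it is full. A minimal sketch of both, assuming a hypothetical free function `remember_webhook` that operates on a plain dict (the adapter keeps this state on `self._session_webhooks`), and relying on dicts preserving insertion order (Python 3.7+):

```python
import re
from typing import Dict

# Same origin pattern as the diff.
_WEBHOOK_RE = re.compile(r"^https://api\.dingtalk\.com/")


def remember_webhook(
    store: Dict[str, str], chat_id: str, url: str, max_entries: int = 500
) -> bool:
    """Store url under chat_id if its origin is trusted; return True if stored."""
    if not (url and chat_id and _WEBHOOK_RE.match(url)):
        return False  # untrusted or empty input is silently dropped
    if store and len(store) >= max_entries:
        # The first key yielded by iteration is the oldest inserted one.
        store.pop(next(iter(store)))
    store[chat_id] = url
    return True
```

As in the diff, eviction is triggered by the size check alone, so refreshing the webhook of an existing chat while the store is full still drops the oldest entry.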
+93 -4
@@ -422,6 +422,7 @@ class DiscordAdapter(BasePlatformAdapter):
# Discord message limits
MAX_MESSAGE_LENGTH = 2000
_SPLIT_THRESHOLD = 1900 # near the 2000-char split point
# Auto-disconnect from voice channel after this many seconds of inactivity
VOICE_TIMEOUT = 300
@@ -433,6 +434,11 @@ class DiscordAdapter(BasePlatformAdapter):
self._allowed_user_ids: set = set() # For button approval authorization
# Voice channel state (per-guild)
self._voice_clients: Dict[int, Any] = {} # guild_id -> VoiceClient
# Text batching: merge rapid successive messages (Telegram-style)
self._text_batch_delay_seconds = float(os.getenv("HERMES_DISCORD_TEXT_BATCH_DELAY_SECONDS", "0.6"))
self._text_batch_split_delay_seconds = float(os.getenv("HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS", "2.0"))
self._pending_text_batches: Dict[str, MessageEvent] = {}
self._pending_text_batch_tasks: Dict[str, asyncio.Task] = {}
self._voice_text_channels: Dict[int, int] = {} # guild_id -> text_channel_id
self._voice_timeout_tasks: Dict[int, asyncio.Task] = {} # guild_id -> timeout task
# Phase 2: voice listening
@@ -2228,6 +2234,7 @@ class DiscordAdapter(BasePlatformAdapter):
# discord.require_mention: Require @mention in server channels (default: true)
# discord.free_response_channels: Channel IDs where bot responds without mention
# discord.ignored_channels: Channel IDs where bot NEVER responds (even when mentioned)
# discord.allowed_channels: If set, bot ONLY responds in these channels (whitelist)
# discord.no_thread_channels: Channel IDs where bot responds directly without creating thread
# discord.auto_thread: Auto-create thread on @mention in channels (default: true)
@@ -2239,12 +2246,21 @@ class DiscordAdapter(BasePlatformAdapter):
parent_channel_id = self._get_parent_channel_id(message.channel)
if not isinstance(message.channel, discord.DMChannel):
-# Check ignored channels first - never respond even when mentioned
-ignored_channels_raw = os.getenv("DISCORD_IGNORED_CHANNELS", "")
-ignored_channels = {ch.strip() for ch in ignored_channels_raw.split(",") if ch.strip()}
channel_ids = {str(message.channel.id)}
if parent_channel_id:
channel_ids.add(parent_channel_id)
# Check allowed channels - if set, only respond in these channels
allowed_channels_raw = os.getenv("DISCORD_ALLOWED_CHANNELS", "")
if allowed_channels_raw:
allowed_channels = {ch.strip() for ch in allowed_channels_raw.split(",") if ch.strip()}
if not (channel_ids & allowed_channels):
logger.debug("[%s] Ignoring message in non-allowed channel: %s", self.name, channel_ids)
return
# Check ignored channels - never respond even when mentioned
ignored_channels_raw = os.getenv("DISCORD_IGNORED_CHANNELS", "")
ignored_channels = {ch.strip() for ch in ignored_channels_raw.split(",") if ch.strip()}
if channel_ids & ignored_channels:
logger.debug("[%s] Ignoring message in ignored channel: %s", self.name, channel_ids)
return
@@ -2466,7 +2482,80 @@ class DiscordAdapter(BasePlatformAdapter):
if thread_id:
self._track_thread(thread_id)
-await self.handle_message(event)
# Only batch plain text messages — commands, media, etc. dispatch
# immediately since they won't be split by the Discord client.
if msg_type == MessageType.TEXT and self._text_batch_delay_seconds > 0:
self._enqueue_text_event(event)
else:
await self.handle_message(event)
# ------------------------------------------------------------------
# Text message aggregation (handles Discord client-side splits)
# ------------------------------------------------------------------
def _text_batch_key(self, event: MessageEvent) -> str:
"""Session-scoped key for text message batching."""
from gateway.session import build_session_key
return build_session_key(
event.source,
group_sessions_per_user=self.config.extra.get("group_sessions_per_user", True),
thread_sessions_per_user=self.config.extra.get("thread_sessions_per_user", False),
)
def _enqueue_text_event(self, event: MessageEvent) -> None:
"""Buffer a text event and reset the flush timer.
When Discord splits a long user message at 2000 chars, the chunks
arrive within a few hundred milliseconds. This merges them into
a single event before dispatching.
"""
key = self._text_batch_key(event)
existing = self._pending_text_batches.get(key)
chunk_len = len(event.text or "")
if existing is None:
event._last_chunk_len = chunk_len # type: ignore[attr-defined]
self._pending_text_batches[key] = event
else:
if event.text:
existing.text = f"{existing.text}\n{event.text}" if existing.text else event.text
existing._last_chunk_len = chunk_len # type: ignore[attr-defined]
if event.media_urls:
existing.media_urls.extend(event.media_urls)
existing.media_types.extend(event.media_types)
prior_task = self._pending_text_batch_tasks.get(key)
if prior_task and not prior_task.done():
prior_task.cancel()
self._pending_text_batch_tasks[key] = asyncio.create_task(
self._flush_text_batch(key)
)
async def _flush_text_batch(self, key: str) -> None:
"""Wait for the quiet period then dispatch the aggregated text.
Uses a longer delay when the latest chunk is near Discord's 2000-char
split point, since a continuation chunk is almost certain.
"""
current_task = asyncio.current_task()
try:
pending = self._pending_text_batches.get(key)
last_len = getattr(pending, "_last_chunk_len", 0) if pending else 0
if last_len >= self._SPLIT_THRESHOLD:
delay = self._text_batch_split_delay_seconds
else:
delay = self._text_batch_delay_seconds
await asyncio.sleep(delay)
event = self._pending_text_batches.pop(key, None)
if not event:
return
logger.info(
"[Discord] Flushing text batch %s (%d chars)",
key, len(event.text or ""),
)
await self.handle_message(event)
finally:
if self._pending_text_batch_tasks.get(key) is current_task:
self._pending_text_batch_tasks.pop(key, None)
# ---------------------------------------------------------------------------
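The batching added to the Discord adapter (and mirrored in the Matrix, WeCom, Telegram, and Feishu hunks) is a per-session debounce whose quiet period grows when the latest chunk is close to the platform's split length. A minimal, self-contained sketch of the idea is shown below. The `TextBatcher` class, its parameter names, and the `dispatch` callback are hypothetical stand-ins for the adapter's `_pending_text_batches`, `_pending_text_batch_tasks`, and `handle_message`; plain strings replace `MessageEvent`.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple


class TextBatcher:
    """Merge rapid text chunks per key; wait longer after a near-limit chunk."""

    def __init__(
        self,
        dispatch: Callable[[str, str], Awaitable[None]],
        delay: float = 0.6,
        split_delay: float = 2.0,
        split_threshold: int = 1900,
    ) -> None:
        self.dispatch = dispatch
        self.delay = delay
        self.split_delay = split_delay
        self.split_threshold = split_threshold
        self._pending: Dict[str, str] = {}
        self._last_len: Dict[str, int] = {}
        self._tasks: Dict[str, asyncio.Task] = {}

    def enqueue(self, key: str, text: str) -> None:
        """Buffer a chunk and restart the flush timer for this key."""
        if key in self._pending:
            self._pending[key] = f"{self._pending[key]}\n{text}"
        else:
            self._pending[key] = text
        self._last_len[key] = len(text)
        prior = self._tasks.get(key)
        if prior and not prior.done():
            prior.cancel()
        self._tasks[key] = asyncio.create_task(self._flush(key))

    async def _flush(self, key: str) -> None:
        current = asyncio.current_task()
        try:
            near_split = self._last_len.get(key, 0) >= self.split_threshold
            await asyncio.sleep(self.split_delay if near_split else self.delay)
            text = self._pending.pop(key, None)
            self._last_len.pop(key, None)
            if text is not None:
                await self.dispatch(key, text)
        finally:
            # Only clear the slot if a newer task has not replaced this one.
            if self._tasks.get(key) is current:
                self._tasks.pop(key, None)


async def demo() -> List[Tuple[str, str]]:
    """Send a near-limit chunk, then a short tail after the normal delay."""
    out: List[Tuple[str, str]] = []

    async def dispatch(key: str, text: str) -> None:
        out.append((key, text))

    batcher = TextBatcher(dispatch, delay=0.02, split_delay=0.5, split_threshold=10)
    batcher.enqueue("chat-1", "x" * 10)  # at the threshold: long quiet period
    await asyncio.sleep(0.1)             # longer than `delay`, shorter than `split_delay`
    batcher.enqueue("chat-1", "tail")    # still merged into the same batch
    await asyncio.sleep(0.3)
    return out


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In the demo the second chunk arrives after the normal delay has elapsed, so without the longer split delay the first chunk would already have been dispatched on its own.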
+26 -2
@@ -264,6 +264,7 @@ class FeishuAdapterSettings:
bot_name: str
dedup_cache_size: int
text_batch_delay_seconds: float
text_batch_split_delay_seconds: float
text_batch_max_messages: int
text_batch_max_chars: int
media_batch_delay_seconds: float
@@ -1014,6 +1015,10 @@ class FeishuAdapter(BasePlatformAdapter):
"""Feishu/Lark bot adapter."""
MAX_MESSAGE_LENGTH = 8000
# Threshold for detecting Feishu client-side message splits.
# When a chunk is near the ~4096-char practical limit, a continuation
# is almost certain.
_SPLIT_THRESHOLD = 4000
# =========================================================================
# Lifecycle — init / settings / connect / disconnect
@@ -1105,6 +1110,9 @@ class FeishuAdapter(BasePlatformAdapter):
text_batch_delay_seconds=float(
os.getenv("HERMES_FEISHU_TEXT_BATCH_DELAY_SECONDS", str(_DEFAULT_TEXT_BATCH_DELAY_SECONDS))
),
text_batch_split_delay_seconds=float(
os.getenv("HERMES_FEISHU_TEXT_BATCH_SPLIT_DELAY_SECONDS", "2.0")
),
text_batch_max_messages=max(
1,
int(os.getenv("HERMES_FEISHU_TEXT_BATCH_MAX_MESSAGES", str(_DEFAULT_TEXT_BATCH_MAX_MESSAGES))),
@@ -1152,6 +1160,7 @@ class FeishuAdapter(BasePlatformAdapter):
self._bot_name = settings.bot_name
self._dedup_cache_size = settings.dedup_cache_size
self._text_batch_delay_seconds = settings.text_batch_delay_seconds
self._text_batch_split_delay_seconds = settings.text_batch_split_delay_seconds
self._text_batch_max_messages = settings.text_batch_max_messages
self._text_batch_max_chars = settings.text_batch_max_chars
self._media_batch_delay_seconds = settings.media_batch_delay_seconds
@@ -2478,8 +2487,10 @@ class FeishuAdapter(BasePlatformAdapter):
async def _enqueue_text_event(self, event: MessageEvent) -> None:
"""Debounce rapid Feishu text bursts into a single MessageEvent."""
key = self._text_batch_key(event)
chunk_len = len(event.text or "")
existing = self._pending_text_batches.get(key)
if existing is None:
event._last_chunk_len = chunk_len # type: ignore[attr-defined]
self._pending_text_batches[key] = event
self._pending_text_batch_counts[key] = 1
self._schedule_text_batch_flush(key)
@@ -2504,6 +2515,7 @@ class FeishuAdapter(BasePlatformAdapter):
return
existing.text = next_text
existing._last_chunk_len = chunk_len # type: ignore[attr-defined]
existing.timestamp = event.timestamp
if event.message_id:
existing.message_id = event.message_id
@@ -2530,10 +2542,22 @@ class FeishuAdapter(BasePlatformAdapter):
task_map[key] = asyncio.create_task(flush_fn(key))
async def _flush_text_batch(self, key: str) -> None:
-"""Flush a pending text batch after the quiet period."""
+"""Flush a pending text batch after the quiet period.
Uses a longer delay when the latest chunk is near Feishu's ~4096-char
split point, since a continuation chunk is almost certain.
"""
current_task = asyncio.current_task()
try:
-await asyncio.sleep(self._text_batch_delay_seconds)
# Adaptive delay: if the latest chunk is near the split threshold,
# a continuation is almost certain — wait longer.
pending = self._pending_text_batches.get(key)
last_len = getattr(pending, "_last_chunk_len", 0) if pending else 0
if last_len >= self._SPLIT_THRESHOLD:
delay = self._text_batch_split_delay_seconds
else:
delay = self._text_batch_delay_seconds
await asyncio.sleep(delay)
await self._flush_text_batch_now(key)
finally:
if self._pending_text_batch_tasks.get(key) is current_task:
+87 -1
@@ -120,6 +120,11 @@ def check_matrix_requirements() -> bool:
class MatrixAdapter(BasePlatformAdapter):
"""Gateway adapter for Matrix (any homeserver)."""
# Threshold for detecting Matrix client-side message splits.
# When a chunk is near the ~4000-char practical limit, a continuation
# is almost certain.
_SPLIT_THRESHOLD = 3900
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.MATRIX)
@@ -172,6 +177,13 @@ class MatrixAdapter(BasePlatformAdapter):
"MATRIX_REACTIONS", "true"
).lower() not in ("false", "0", "no")
# Text batching: merge rapid successive messages (Telegram-style).
# Matrix clients split long messages around 4000 chars.
self._text_batch_delay_seconds = float(os.getenv("HERMES_MATRIX_TEXT_BATCH_DELAY_SECONDS", "0.6"))
self._text_batch_split_delay_seconds = float(os.getenv("HERMES_MATRIX_TEXT_BATCH_SPLIT_DELAY_SECONDS", "2.0"))
self._pending_text_batches: Dict[str, MessageEvent] = {}
self._pending_text_batch_tasks: Dict[str, asyncio.Task] = {}
def _is_duplicate_event(self, event_id) -> bool:
"""Return True if this event was already processed. Tracks the ID otherwise."""
if not event_id:
@@ -1088,7 +1100,81 @@ class MatrixAdapter(BasePlatformAdapter):
# Acknowledge receipt so the room shows as read (fire-and-forget).
self._background_read_receipt(room.room_id, event.event_id)
-await self.handle_message(msg_event)
# Only batch plain text messages — commands dispatch immediately.
if msg_type == MessageType.TEXT and self._text_batch_delay_seconds > 0:
self._enqueue_text_event(msg_event)
else:
await self.handle_message(msg_event)
# ------------------------------------------------------------------
# Text message aggregation (handles Matrix client-side splits)
# ------------------------------------------------------------------
def _text_batch_key(self, event: MessageEvent) -> str:
"""Session-scoped key for text message batching."""
from gateway.session import build_session_key
return build_session_key(
event.source,
group_sessions_per_user=self.config.extra.get("group_sessions_per_user", True),
thread_sessions_per_user=self.config.extra.get("thread_sessions_per_user", False),
)
def _enqueue_text_event(self, event: MessageEvent) -> None:
"""Buffer a text event and reset the flush timer.
When a Matrix client splits a long message, the chunks arrive within
a few hundred milliseconds. This merges them into a single event
before dispatching.
"""
key = self._text_batch_key(event)
existing = self._pending_text_batches.get(key)
chunk_len = len(event.text or "")
if existing is None:
event._last_chunk_len = chunk_len # type: ignore[attr-defined]
self._pending_text_batches[key] = event
else:
if event.text:
existing.text = f"{existing.text}\n{event.text}" if existing.text else event.text
existing._last_chunk_len = chunk_len # type: ignore[attr-defined]
# Merge any media that might be attached
if event.media_urls:
existing.media_urls.extend(event.media_urls)
existing.media_types.extend(event.media_types)
# Cancel any pending flush and restart the timer
prior_task = self._pending_text_batch_tasks.get(key)
if prior_task and not prior_task.done():
prior_task.cancel()
self._pending_text_batch_tasks[key] = asyncio.create_task(
self._flush_text_batch(key)
)
async def _flush_text_batch(self, key: str) -> None:
"""Wait for the quiet period then dispatch the aggregated text.
Uses a longer delay when the latest chunk is near Matrix's ~4000-char
split point, since a continuation chunk is almost certain.
"""
current_task = asyncio.current_task()
try:
pending = self._pending_text_batches.get(key)
last_len = getattr(pending, "_last_chunk_len", 0) if pending else 0
if last_len >= self._SPLIT_THRESHOLD:
delay = self._text_batch_split_delay_seconds
else:
delay = self._text_batch_delay_seconds
await asyncio.sleep(delay)
event = self._pending_text_batches.pop(key, None)
if not event:
return
logger.info(
"[Matrix] Flushing text batch %s (%d chars)",
key, len(event.text or ""),
)
await self.handle_message(event)
finally:
if self._pending_text_batch_tasks.get(key) is current_task:
self._pending_text_batch_tasks.pop(key, None)
async def _on_room_message_media(self, room: Any, event: Any) -> None:
"""Handle incoming media messages (images, audio, video, files)."""
+21 -2
@@ -121,6 +121,9 @@ class TelegramAdapter(BasePlatformAdapter):
# Telegram message limits
MAX_MESSAGE_LENGTH = 4096
# Threshold for detecting Telegram client-side message splits.
# When a chunk is near this limit, a continuation is almost certain.
_SPLIT_THRESHOLD = 4000
MEDIA_GROUP_WAIT_SECONDS = 0.8
def __init__(self, config: PlatformConfig):
@@ -140,6 +143,7 @@ class TelegramAdapter(BasePlatformAdapter):
# Buffer rapid text messages so Telegram client-side splits of long
# messages are aggregated into a single MessageEvent.
self._text_batch_delay_seconds = float(os.getenv("HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS", "0.6"))
self._text_batch_split_delay_seconds = float(os.getenv("HERMES_TELEGRAM_TEXT_BATCH_SPLIT_DELAY_SECONDS", "2.0"))
self._pending_text_batches: Dict[str, MessageEvent] = {}
self._pending_text_batch_tasks: Dict[str, asyncio.Task] = {}
self._token_lock_identity: Optional[str] = None
@@ -2160,12 +2164,15 @@ class TelegramAdapter(BasePlatformAdapter):
"""
key = self._text_batch_key(event)
existing = self._pending_text_batches.get(key)
chunk_len = len(event.text or "")
if existing is None:
event._last_chunk_len = chunk_len # type: ignore[attr-defined]
self._pending_text_batches[key] = event
else:
# Append text from the follow-up chunk
if event.text:
existing.text = f"{existing.text}\n{event.text}" if existing.text else event.text
existing._last_chunk_len = chunk_len # type: ignore[attr-defined]
# Merge any media that might be attached
if event.media_urls:
existing.media_urls.extend(event.media_urls)
@@ -2180,10 +2187,22 @@ class TelegramAdapter(BasePlatformAdapter):
)
async def _flush_text_batch(self, key: str) -> None:
-"""Wait for the quiet period then dispatch the aggregated text."""
+"""Wait for the quiet period then dispatch the aggregated text.
Uses a longer delay when the latest chunk is near Telegram's 4096-char
split point, since a continuation chunk is almost certain.
"""
current_task = asyncio.current_task()
try:
-await asyncio.sleep(self._text_batch_delay_seconds)
# Adaptive delay: if the latest chunk is near Telegram's 4096-char
# split point, a continuation is almost certain — wait longer.
pending = self._pending_text_batches.get(key)
last_len = getattr(pending, "_last_chunk_len", 0) if pending else 0
if last_len >= self._SPLIT_THRESHOLD:
delay = self._text_batch_split_delay_seconds
else:
delay = self._text_batch_delay_seconds
await asyncio.sleep(delay)
event = self._pending_text_batches.pop(key, None)
if not event:
return
+11 -2
@@ -186,13 +186,22 @@ class WebhookAdapter(BasePlatformAdapter):
if deliver_type == "github_comment":
return await self._deliver_github_comment(content, delivery)
# Cross-platform delivery (telegram, discord, etc.)
# Cross-platform delivery — any platform with a gateway adapter
if self.gateway_runner and deliver_type in (
"telegram",
"discord",
"slack",
"signal",
"sms",
"whatsapp",
"matrix",
"mattermost",
"homeassistant",
"email",
"dingtalk",
"feishu",
"wecom",
"bluebubbles",
):
return await self._deliver_cross_platform(
deliver_type, content, delivery
@@ -262,7 +271,7 @@ class WebhookAdapter(BasePlatformAdapter):
", ".join(self._dynamic_routes.keys()) or "(none)",
)
except Exception as e:
-logger.warning("[webhook] Failed to reload dynamic routes: %s", e)
+logger.error("[webhook] Failed to reload dynamic routes: %s", e)
async def _handle_webhook(self, request: "web.Request") -> "web.Response":
"""POST /webhooks/{route_name} — receive and process a webhook event."""
+86 -1
@@ -143,6 +143,9 @@ class WeComAdapter(BasePlatformAdapter):
"""WeCom AI Bot adapter backed by a persistent WebSocket connection."""
MAX_MESSAGE_LENGTH = MAX_MESSAGE_LENGTH
# Threshold for detecting WeCom client-side message splits.
# When a chunk is near the 4000-char limit, a continuation is almost certain.
_SPLIT_THRESHOLD = 3900
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.WECOM)
@@ -172,6 +175,13 @@ class WeComAdapter(BasePlatformAdapter):
self._seen_messages: Dict[str, float] = {}
self._reply_req_ids: Dict[str, str] = {}
# Text batching: merge rapid successive messages (Telegram-style).
# WeCom clients split long messages around 4000 chars.
self._text_batch_delay_seconds = float(os.getenv("HERMES_WECOM_TEXT_BATCH_DELAY_SECONDS", "0.6"))
self._text_batch_split_delay_seconds = float(os.getenv("HERMES_WECOM_TEXT_BATCH_SPLIT_DELAY_SECONDS", "2.0"))
self._pending_text_batches: Dict[str, MessageEvent] = {}
self._pending_text_batch_tasks: Dict[str, asyncio.Task] = {}
# ------------------------------------------------------------------
# Connection lifecycle
# ------------------------------------------------------------------
@@ -519,7 +529,82 @@ class WeComAdapter(BasePlatformAdapter):
timestamp=datetime.now(tz=timezone.utc),
)
-await self.handle_message(event)
# Only batch plain text messages — commands, media, etc. dispatch
# immediately since they won't be split by the WeCom client.
if message_type == MessageType.TEXT and self._text_batch_delay_seconds > 0:
self._enqueue_text_event(event)
else:
await self.handle_message(event)
# ------------------------------------------------------------------
# Text message aggregation (handles WeCom client-side splits)
# ------------------------------------------------------------------
def _text_batch_key(self, event: MessageEvent) -> str:
"""Session-scoped key for text message batching."""
from gateway.session import build_session_key
return build_session_key(
event.source,
group_sessions_per_user=self.config.extra.get("group_sessions_per_user", True),
thread_sessions_per_user=self.config.extra.get("thread_sessions_per_user", False),
)
def _enqueue_text_event(self, event: MessageEvent) -> None:
"""Buffer a text event and reset the flush timer.
When WeCom splits a long user message at 4000 chars, the chunks
arrive within a few hundred milliseconds. This merges them into
a single event before dispatching.
"""
key = self._text_batch_key(event)
existing = self._pending_text_batches.get(key)
chunk_len = len(event.text or "")
if existing is None:
event._last_chunk_len = chunk_len # type: ignore[attr-defined]
self._pending_text_batches[key] = event
else:
if event.text:
existing.text = f"{existing.text}\n{event.text}" if existing.text else event.text
existing._last_chunk_len = chunk_len # type: ignore[attr-defined]
# Merge any media that might be attached
if event.media_urls:
existing.media_urls.extend(event.media_urls)
existing.media_types.extend(event.media_types)
# Cancel any pending flush and restart the timer
prior_task = self._pending_text_batch_tasks.get(key)
if prior_task and not prior_task.done():
prior_task.cancel()
self._pending_text_batch_tasks[key] = asyncio.create_task(
self._flush_text_batch(key)
)
async def _flush_text_batch(self, key: str) -> None:
"""Wait for the quiet period then dispatch the aggregated text.
Uses a longer delay when the latest chunk is near WeCom's 4000-char
split point, since a continuation chunk is almost certain.
"""
current_task = asyncio.current_task()
try:
pending = self._pending_text_batches.get(key)
last_len = getattr(pending, "_last_chunk_len", 0) if pending else 0
if last_len >= self._SPLIT_THRESHOLD:
delay = self._text_batch_split_delay_seconds
else:
delay = self._text_batch_delay_seconds
await asyncio.sleep(delay)
event = self._pending_text_batches.pop(key, None)
if not event:
return
logger.info(
"[WeCom] Flushing text batch %s (%d chars)",
key, len(event.text or ""),
)
await self.handle_message(event)
finally:
if self._pending_text_batch_tasks.get(key) is current_task:
self._pending_text_batch_tasks.pop(key, None)
@staticmethod
def _extract_text(body: Dict[str, Any]) -> Tuple[str, Optional[str]]:
+98 -9
@@ -3546,6 +3546,7 @@ class GatewayRunner:
current_base_url = ""
current_api_key = ""
user_provs = None
custom_provs = None
config_path = _hermes_home / "config.yaml"
try:
if config_path.exists():
@@ -3557,6 +3558,7 @@ class GatewayRunner:
current_provider = model_cfg.get("provider", current_provider)
current_base_url = model_cfg.get("base_url", "")
user_provs = cfg.get("providers")
custom_provs = cfg.get("custom_providers")
except Exception:
pass
@@ -3584,6 +3586,7 @@ class GatewayRunner:
providers = list_authenticated_providers(
current_provider=current_provider,
user_providers=user_provs,
custom_providers=custom_provs,
max_models=50,
)
except Exception:
@@ -3611,6 +3614,8 @@ class GatewayRunner:
current_api_key=_cur_api_key,
is_global=False,
explicit_provider=provider_slug,
user_providers=user_provs,
custom_providers=custom_provs,
)
if not result.success:
return f"Error: {result.error_message}"
@@ -3689,6 +3694,7 @@ class GatewayRunner:
providers = list_authenticated_providers(
current_provider=current_provider,
user_providers=user_provs,
custom_providers=custom_provs,
max_models=5,
)
for p in providers:
@@ -3718,6 +3724,8 @@ class GatewayRunner:
current_api_key=current_api_key,
is_global=persist_global,
explicit_provider=explicit_provider,
user_providers=user_provs,
custom_providers=custom_provs,
)
if not result.success:
@@ -5274,27 +5282,76 @@ class GatewayRunner:
)
async def _handle_usage_command(self, event: MessageEvent) -> str:
-"""Handle /usage command -- show token usage for the session's last agent run."""
+"""Handle /usage command -- show token usage for the current session.
Checks both _running_agents (mid-turn) and _agent_cache (between turns)
so that rate limits, cost estimates, and detailed token breakdowns are
available whenever the user asks, not only while the agent is running.
"""
source = event.source
session_key = self._session_key_for_source(source)
# Try running agent first (mid-turn), then cached agent (between turns)
agent = self._running_agents.get(session_key)
if not agent or agent is _AGENT_PENDING_SENTINEL:
_cache_lock = getattr(self, "_agent_cache_lock", None)
_cache = getattr(self, "_agent_cache", None)
if _cache_lock and _cache is not None:
with _cache_lock:
cached = _cache.get(session_key)
if cached:
agent = cached[0]
if agent and hasattr(agent, "session_total_tokens") and agent.session_api_calls > 0:
lines = []
-# Rate limits first (when available from provider headers)
+# Rate limits (when available from provider headers)
rl_state = agent.get_rate_limit_state()
if rl_state and rl_state.has_data:
from agent.rate_limit_tracker import format_rate_limit_compact
lines.append(f"⏱️ **Rate Limits:** {format_rate_limit_compact(rl_state)}")
lines.append("")
# Session token usage
# Session token usage — detailed breakdown matching CLI
input_tokens = getattr(agent, "session_input_tokens", 0) or 0
output_tokens = getattr(agent, "session_output_tokens", 0) or 0
cache_read = getattr(agent, "session_cache_read_tokens", 0) or 0
cache_write = getattr(agent, "session_cache_write_tokens", 0) or 0
lines.append("📊 **Session Token Usage**")
-lines.append(f"Prompt (input): {agent.session_prompt_tokens:,}")
-lines.append(f"Completion (output): {agent.session_completion_tokens:,}")
lines.append(f"Model: `{agent.model}`")
lines.append(f"Input tokens: {input_tokens:,}")
if cache_read:
lines.append(f"Cache read tokens: {cache_read:,}")
if cache_write:
lines.append(f"Cache write tokens: {cache_write:,}")
lines.append(f"Output tokens: {output_tokens:,}")
lines.append(f"Total: {agent.session_total_tokens:,}")
lines.append(f"API calls: {agent.session_api_calls}")
# Cost estimation
try:
from agent.usage_pricing import CanonicalUsage, estimate_usage_cost
cost_result = estimate_usage_cost(
agent.model,
CanonicalUsage(
input_tokens=input_tokens,
output_tokens=output_tokens,
cache_read_tokens=cache_read,
cache_write_tokens=cache_write,
),
provider=getattr(agent, "provider", None),
base_url=getattr(agent, "base_url", None),
)
if cost_result.amount_usd is not None:
prefix = "~" if cost_result.status == "estimated" else ""
lines.append(f"Cost: {prefix}${float(cost_result.amount_usd):.4f}")
elif cost_result.status == "included":
lines.append("Cost: included")
except Exception:
pass
# Context window and compressions
ctx = agent.context_compressor
if ctx.last_prompt_tokens:
pct = min(100, ctx.last_prompt_tokens / ctx.context_length * 100) if ctx.context_length else 0
@@ -5304,7 +5361,7 @@ class GatewayRunner:
return "\n".join(lines)
# No running agent -- check session history for a rough count
# No agent at all -- check session history for a rough count
session_entry = self.session_store.get_or_create_session(source)
history = self.session_store.load_transcript(session_entry.session_id)
if history:
@@ -5315,7 +5372,7 @@ class GatewayRunner:
f"📊 **Session Info**\n"
f"Messages: {len(msgs)}\n"
f"Estimated context: ~{approx:,} tokens\n"
-f"_(Detailed usage available during active conversations)_"
+f"_(Detailed usage available after the first agent response)_"
)
return "No usage data available for this session."
@@ -6283,6 +6340,32 @@ class GatewayRunner:
)
return hashlib.sha256(blob.encode()).hexdigest()[:16]
def _apply_session_model_override(
self, session_key: str, model: str, runtime_kwargs: dict
) -> tuple:
"""Apply /model session overrides if present, returning (model, runtime_kwargs).
The gateway /model command stores per-session overrides in
``_session_model_overrides``. These must take precedence over
config.yaml defaults so the switched model is actually used for
subsequent messages. Fields with ``None`` values are skipped so
partial overrides don't clobber valid config defaults.
"""
override = self._session_model_overrides.get(session_key)
if not override:
return model, runtime_kwargs
model = override.get("model", model)
for key in ("provider", "api_key", "base_url", "api_mode"):
val = override.get(key)
if val is not None:
runtime_kwargs[key] = val
return model, runtime_kwargs
def _is_intentional_model_switch(self, session_key: str, agent_model: str) -> bool:
"""Return True if *agent_model* matches an active /model session override."""
override = self._session_model_overrides.get(session_key)
return override is not None and override.get("model") == agent_model
def _evict_cached_agent(self, session_key: str) -> None:
"""Remove a cached agent for a session (called on /new, /model, etc)."""
_lock = getattr(self, "_agent_cache_lock", None)
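Review note: the merge rule in `_apply_session_model_override` is easiest to check in isolation. The sketch below is a standalone approximation, not the gateway code itself; the free function `apply_session_model_override`, the `overrides` dict, and the sample model names are hypothetical stand-ins for the `GatewayRunner` method and its `_session_model_overrides` attribute.

```python
from typing import Any, Dict, Tuple

OVERRIDE_KEYS = ("provider", "api_key", "base_url", "api_mode")


def apply_session_model_override(
    overrides: Dict[str, Dict[str, Any]],
    session_key: str,
    model: str,
    runtime_kwargs: Dict[str, Any],
) -> Tuple[str, Dict[str, Any]]:
    """Standalone approximation of the method in the hunk above."""
    override = overrides.get(session_key)
    if not override:
        # No /model override stored for this session: keep config defaults.
        return model, runtime_kwargs
    model = override.get("model", model)
    for key in OVERRIDE_KEYS:
        val = override.get(key)
        if val is not None:
            # Only non-None fields replace the config.yaml value.
            runtime_kwargs[key] = val
    return model, runtime_kwargs


# Hypothetical session data: the override sets a provider but no api_key.
overrides = {
    "session-1": {"model": "example/model-b", "provider": "custom:local", "api_key": None},
}
config_kwargs = {"provider": "openrouter", "api_key": "cfg-key"}

model, kwargs = apply_session_model_override(
    overrides, "session-1", "example/model-a", config_kwargs
)
untouched_model, _ = apply_session_model_override(
    overrides, "session-2", "example/model-a", {}
)
```

As in the hunk, `runtime_kwargs` is mutated in place, so a caller that reuses the same dict across sessions would need to copy it first.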
@@ -6660,6 +6743,11 @@ class GatewayRunner:
"tools": [],
}
# /model overrides take precedence over config.yaml defaults.
model, runtime_kwargs = self._apply_session_model_override(
session_key, model, runtime_kwargs
)
pr = self._provider_routing
reasoning_config = self._load_reasoning_config()
self._reasoning_config = reasoning_config
@@ -7279,14 +7367,15 @@ class GatewayRunner:
_agent = agent_holder[0]
if _agent is not None and hasattr(_agent, 'model'):
_cfg_model = _resolve_gateway_model()
if _agent.model != _cfg_model:
if _agent.model != _cfg_model and not self._is_intentional_model_switch(session_key, _agent.model):
self._effective_model = _agent.model
self._effective_provider = getattr(_agent, 'provider', None)
# Fallback activated — evict cached agent so the next
# message starts fresh and retries the primary model.
self._evict_cached_agent(session_key)
else:
# Primary model worked — clear any stale fallback state
# Primary model worked (or intentional /model switch)
# — clear any stale fallback state.
self._effective_model = None
self._effective_provider = None
-35
@@ -770,41 +770,6 @@ class SessionStore:
except Exception as e:
print(f"[gateway] Warning: Failed to create SQLite session: {e}")
# Seed new DM thread sessions with parent DM session history.
# When a bot reply creates a Slack thread and the user responds in it,
# the thread gets a new session (keyed by thread_ts). Without seeding,
# the thread session starts with zero context — the user's original
# question and the bot's answer are invisible. Fix: copy the parent
# DM session's transcript into the new thread session so context carries
# over while still keeping threads isolated from each other.
if (
source.chat_type == "dm"
and source.thread_id
and entry.created_at == entry.updated_at # brand-new session
and not was_auto_reset
):
parent_source = SessionSource(
platform=source.platform,
chat_id=source.chat_id,
chat_type="dm",
user_id=source.user_id,
# no thread_id — this is the parent DM session
)
parent_key = self._generate_session_key(parent_source)
with self._lock:
parent_entry = self._entries.get(parent_key)
if parent_entry and parent_entry.session_id != entry.session_id:
try:
parent_history = self.load_transcript(parent_entry.session_id)
if parent_history:
self.rewrite_transcript(entry.session_id, parent_history)
logger.info(
"[Session] Seeded DM thread session %s with %d messages from parent %s",
entry.session_id, len(parent_history), parent_entry.session_id,
)
except Exception as e:
logger.warning("[Session] Failed to seed thread session: %s", e)
return entry
def update_session(
+1 -1
@@ -2581,7 +2581,7 @@ def _prompt_model_selection(
custom = input("Enter model name: ").strip()
return custom if custom else None
return None
except (ImportError, NotImplementedError):
except (ImportError, NotImplementedError, OSError, subprocess.SubprocessError):
pass
# Fallback: numbered list
+22 -1
@@ -100,6 +100,9 @@ COMMAND_REGISTRY: list[CommandDef] = [
CommandDef("reasoning", "Manage reasoning effort and display", "Configuration",
args_hint="[level|show|hide]",
subcommands=("none", "minimal", "low", "medium", "high", "xhigh", "show", "hide", "on", "off")),
CommandDef("fast", "Toggle fast mode — OpenAI Priority Processing / Anthropic Fast Mode (Normal/Fast)", "Configuration",
cli_only=True, args_hint="[normal|fast|status]",
subcommands=("normal", "fast", "status", "on", "off")),
CommandDef("skin", "Show or change the display skin/theme", "Configuration",
cli_only=True, args_hint="[name]"),
CommandDef("voice", "Toggle voice mode", "Configuration",
@@ -135,6 +138,8 @@ COMMAND_REGISTRY: list[CommandDef] = [
cli_only=True, aliases=("gateway",)),
CommandDef("paste", "Check clipboard for an image and attach it", "Info",
cli_only=True),
CommandDef("image", "Attach a local image file for your next prompt", "Info",
cli_only=True, args_hint="<path>"),
CommandDef("update", "Update Hermes Agent to the latest version", "Info",
gateway_only=True),
@@ -637,8 +642,18 @@ class SlashCommandCompleter(Completer):
def __init__(
self,
skill_commands_provider: Callable[[], Mapping[str, dict[str, Any]]] | None = None,
command_filter: Callable[[str], bool] | None = None,
) -> None:
self._skill_commands_provider = skill_commands_provider
self._command_filter = command_filter
def _command_allowed(self, slash_command: str) -> bool:
if self._command_filter is None:
return True
try:
return bool(self._command_filter(slash_command))
except Exception:
return True
def _iter_skill_commands(self) -> Mapping[str, dict[str, Any]]:
if self._skill_commands_provider is None:
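Review note: `_command_allowed` fails open, so a filter that raises never hides a command from completion. A minimal sketch of that behaviour, assuming a free function `command_allowed` in place of the method and two hypothetical filters:

```python
from typing import Callable, Optional


def command_allowed(
    command_filter: Optional[Callable[[str], bool]], slash_command: str
) -> bool:
    """Mirror of the fail-open check: no filter or a broken filter allows all."""
    if command_filter is None:
        return True
    try:
        return bool(command_filter(slash_command))
    except Exception:
        # A buggy filter must not make every completion disappear.
        return True


def hide_fast(cmd: str) -> bool:
    # Hypothetical filter that hides a single command.
    return cmd != "/fast"


def broken_filter(cmd: str) -> bool:
    # Hypothetical filter that always fails.
    raise RuntimeError("filter failed")


visible = [c for c in ("/fast", "/skin", "/voice") if command_allowed(hide_fast, c)]
```

Failing open suits completion, where a hidden command is only a usability problem; the same pattern would be the wrong choice for an authorization check.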
@@ -916,7 +931,7 @@ class SlashCommandCompleter(Completer):
return
# Static subcommand completions
if " " not in sub_text and base_cmd in SUBCOMMANDS:
if " " not in sub_text and base_cmd in SUBCOMMANDS and self._command_allowed(base_cmd):
for sub in SUBCOMMANDS[base_cmd]:
if sub.startswith(sub_lower) and sub != sub_lower:
yield Completion(
@@ -929,6 +944,8 @@ class SlashCommandCompleter(Completer):
word = text[1:]
for cmd, desc in COMMANDS.items():
if not self._command_allowed(cmd):
continue
cmd_name = cmd[1:]
if cmd_name.startswith(word):
yield Completion(
@@ -987,6 +1004,8 @@ class SlashCommandAutoSuggest(AutoSuggest):
# Still typing the command name: /upd → suggest "ate"
word = text[1:].lower()
for cmd in COMMANDS:
if self._completer is not None and not self._completer._command_allowed(cmd):
continue
cmd_name = cmd[1:] # strip leading /
if cmd_name.startswith(word) and cmd_name != word:
return Suggestion(cmd_name[len(word):])
@@ -997,6 +1016,8 @@ class SlashCommandAutoSuggest(AutoSuggest):
sub_lower = sub_text.lower()
# Static subcommands
if self._completer is not None and not self._completer._command_allowed(base_cmd):
return None
if base_cmd in SUBCOMMANDS and SUBCOMMANDS[base_cmd]:
if " " not in sub_text:
for sub in SUBCOMMANDS[base_cmd]:
+23 -2
@@ -158,16 +158,27 @@ def get_project_root() -> Path:
return Path(__file__).parent.parent.resolve()
def _secure_dir(path):
"""Set directory to owner-only access (0700). No-op on Windows.
"""Set directory to owner-only access (0700 by default). No-op on Windows.
Skipped in managed mode: the NixOS module sets group-readable
permissions (0750) so interactive users in the hermes group can
share state with the gateway service.
The mode can be overridden via the HERMES_HOME_MODE environment variable
(e.g. HERMES_HOME_MODE=0701) for deployments where a web server (nginx,
caddy, etc.) needs to traverse HERMES_HOME to reach a served subdirectory.
The execute-only bit on a directory permits cd-through without exposing
directory listings.
"""
if is_managed():
return
try:
os.chmod(path, 0o700)
mode_str = os.environ.get("HERMES_HOME_MODE", "").strip()
mode = int(mode_str, 8) if mode_str else 0o700
except ValueError:
mode = 0o700
try:
os.chmod(path, mode)
except (OSError, NotImplementedError):
pass
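Review note: the `HERMES_HOME_MODE` parsing can be exercised without touching the filesystem. The helper name `parse_dir_mode` below is hypothetical; it only reproduces the octal parse and the fallback to `0o700` from the hunk above.

```python
def parse_dir_mode(raw: str, default: int = 0o700) -> int:
    """Parse an octal mode string, falling back to the default on bad input."""
    mode_str = (raw or "").strip()
    try:
        return int(mode_str, 8) if mode_str else default
    except ValueError:
        # Non-octal input such as "rwx" or "9" keeps the owner-only default.
        return default


traverse_only = parse_dir_mode("0701")   # owner rwx, others execute-only
unset = parse_dir_mode("")               # variable not set
invalid = parse_dir_mode("rwx")          # not an octal number
```

Note that the parse accepts any valid octal value, including permissive ones such as `0777`; the hunk does not clamp the result.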
@@ -255,6 +266,7 @@ DEFAULT_CONFIG = {
# tools or receiving API responses. Only fires when the agent has
# been completely idle for this duration. 0 = unlimited.
"gateway_timeout": 1800,
"service_tier": "",
# Tool-use enforcement: injects system prompt guidance that tells the
# model to actually call tools instead of describing intended actions.
# Values: "auto" (default — applies to gpt/codex models), true/false
@@ -540,6 +552,7 @@ DEFAULT_CONFIG = {
"discord": {
"require_mention": True, # Require @mention to respond in server channels
"free_response_channels": "", # Comma-separated channel IDs where bot responds without mention
"allowed_channels": "", # If set, bot ONLY responds in these channel IDs (whitelist)
"auto_thread": True, # Auto-create threads on @mention in channels (like Slack)
"reactions": True, # Add 👀/✅/❌ reactions to messages during processing
},
@@ -1216,6 +1229,14 @@ OPTIONAL_ENV_VARS = {
"category": "messaging",
"advanced": True,
},
"API_SERVER_MODEL_NAME": {
"description": "Model name advertised on /v1/models. Defaults to the profile name (or 'hermes-agent' for the default profile). Useful for multi-user setups with OpenWebUI.",
"prompt": "API server model name",
"url": None,
"password": False,
"category": "messaging",
"advanced": True,
},
"WEBHOOK_ENABLED": {
"description": "Enable the webhook platform adapter for receiving events from GitHub, GitLab, etc.",
"prompt": "Enable webhooks (true/false)",
+1
@@ -285,6 +285,7 @@ def copilot_request_headers(
headers: dict[str, str] = {
"Editor-Version": "vscode/1.104.1",
"User-Agent": "HermesAgent/1.0",
"Copilot-Integration-Id": "vscode-chat",
"Openai-Intent": "conversation-edits",
"x-initiator": "agent" if is_agent_turn else "user",
}
+52 -8
@@ -54,6 +54,32 @@ _PROVIDER_ENV_HINTS = (
)
from hermes_constants import is_termux as _is_termux
def _python_install_cmd() -> str:
return "python -m pip install" if _is_termux() else "uv pip install"
def _system_package_install_cmd(pkg: str) -> str:
if _is_termux():
return f"pkg install {pkg}"
if sys.platform == "darwin":
return f"brew install {pkg}"
return f"sudo apt install {pkg}"
def _termux_browser_setup_steps(node_installed: bool) -> list[str]:
steps: list[str] = []
step = 1
if not node_installed:
steps.append(f"{step}) pkg install nodejs")
step += 1
steps.append(f"{step}) npm install -g agent-browser")
steps.append(f"{step + 1}) agent-browser install")
return steps
def _has_provider_env_config(content: str) -> bool:
"""Return True when ~/.hermes/.env contains provider auth/base URL settings."""
return any(key in content for key in _PROVIDER_ENV_HINTS)
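Review note: the step numbering in `_termux_browser_setup_steps` shifts depending on whether Node.js is already present. The sketch below copies the function body from the hunk with indentation restored; the public name `termux_browser_setup_steps` is only used here so the example can run standalone.

```python
from typing import List


def termux_browser_setup_steps(node_installed: bool) -> List[str]:
    """Numbered setup steps; the Node.js step is skipped when already installed."""
    steps: List[str] = []
    step = 1
    if not node_installed:
        steps.append(f"{step}) pkg install nodejs")
        step += 1
    steps.append(f"{step}) npm install -g agent-browser")
    steps.append(f"{step + 1}) agent-browser install")
    return steps


with_node = termux_browser_setup_steps(node_installed=True)
without_node = termux_browser_setup_steps(node_installed=False)
```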
@@ -200,7 +226,7 @@ def run_doctor(args):
check_ok(name)
except ImportError:
check_fail(name, "(missing)")
issues.append(f"Install {name}: uv pip install {module}")
issues.append(f"Install {name}: {_python_install_cmd()} {module}")
for module, name in optional_packages:
try:
@@ -503,7 +529,7 @@ def run_doctor(args):
check_ok("ripgrep (rg)", "(faster file search)")
else:
check_warn("ripgrep (rg) not found", "(file search uses grep fallback)")
check_info("Install for faster search: sudo apt install ripgrep")
check_info(f"Install for faster search: {_system_package_install_cmd('ripgrep')}")
# Docker (optional)
terminal_env = os.getenv("TERMINAL_ENV", "local")
@@ -526,7 +552,10 @@ def run_doctor(args):
if shutil.which("docker"):
check_ok("docker", "(optional)")
else:
check_warn("docker not found", "(optional)")
if _is_termux():
check_info("Docker backend is not available inside Termux (expected on Android)")
else:
check_warn("docker not found", "(optional)")
# SSH (if using ssh backend)
if terminal_env == "ssh":
@@ -574,9 +603,23 @@ def run_doctor(args):
if agent_browser_path.exists():
check_ok("agent-browser (Node.js)", "(browser automation)")
else:
check_warn("agent-browser not installed", "(run: npm install)")
if _is_termux():
check_info("agent-browser is not installed (expected in the tested Termux path)")
check_info("Install it manually later with: npm install -g agent-browser && agent-browser install")
check_info("Termux browser setup:")
for step in _termux_browser_setup_steps(node_installed=True):
check_info(step)
else:
check_warn("agent-browser not installed", "(run: npm install)")
else:
check_warn("Node.js not found", "(optional, needed for browser tools)")
if _is_termux():
check_info("Node.js not found (browser tools are optional in the tested Termux path)")
check_info("Install Node.js on Termux with: pkg install nodejs")
check_info("Termux browser setup:")
for step in _termux_browser_setup_steps(node_installed=False):
check_info(step)
else:
check_warn("Node.js not found", "(optional, needed for browser tools)")
# npm audit for all Node.js packages
if shutil.which("npm"):
@@ -709,7 +752,7 @@ def run_doctor(args):
_url = (_base.rstrip("/") + "/models") if _base else _default_url
_headers = {"Authorization": f"Bearer {_key}"}
if "api.kimi.com" in _url.lower():
_headers["User-Agent"] = "KimiCLI/1.0"
_headers["User-Agent"] = "KimiCLI/1.30.0"
_resp = httpx.get(
_url,
headers=_headers,
@@ -739,8 +782,9 @@ def run_doctor(args):
__import__("tinker_atropos")
check_ok("tinker-atropos", "(RL training backend)")
except ImportError:
check_warn("tinker-atropos found but not installed", "(run: uv pip install -e ./tinker-atropos)")
issues.append("Install tinker-atropos: uv pip install -e ./tinker-atropos")
install_cmd = f"{_python_install_cmd()} -e ./tinker-atropos"
check_warn("tinker-atropos found but not installed", f"(run: {install_cmd})")
issues.append(f"Install tinker-atropos: {install_cmd}")
else:
check_warn("tinker-atropos requires Python 3.11+", f"(current: {py_version.major}.{py_version.minor})")
else:
+64 -29
@@ -39,7 +39,7 @@ def _get_service_pids() -> set:
pids: set = set()
# --- systemd (Linux): user and system scopes ---
if is_linux():
if supports_systemd_services():
for scope_args in [["systemctl", "--user"], ["systemctl"]]:
try:
result = subprocess.run(
@@ -225,6 +225,14 @@ def stop_profile_gateway() -> bool:
def is_linux() -> bool:
return sys.platform.startswith('linux')
from hermes_constants import is_termux
def supports_systemd_services() -> bool:
return is_linux() and not is_termux()
def is_macos() -> bool:
return sys.platform == 'darwin'
@@ -477,13 +485,15 @@ def install_linux_gateway_from_setup(force: bool = False) -> tuple[str | None, b
def get_systemd_linger_status() -> tuple[bool | None, str]:
"""Return whether systemd user lingering is enabled for the current user.
"""Return systemd linger status for the current user.
Returns:
(True, "") when linger is enabled.
(False, "") when linger is disabled.
(None, detail) when the status could not be determined.
"""
if is_termux():
return None, "not supported in Termux"
if not is_linux():
return None, "not supported on this platform"
@@ -766,7 +776,7 @@ def _print_linger_enable_warning(username: str, detail: str | None = None) -> No
def _ensure_linger_enabled() -> None:
"""Enable linger when possible so the user gateway survives logout."""
if not is_linux():
if is_termux() or not is_linux():
return
import getpass
@@ -1801,7 +1811,7 @@ def _setup_whatsapp():
def _is_service_installed() -> bool:
"""Check if the gateway is installed as a system service."""
if is_linux():
if supports_systemd_services():
return get_systemd_unit_path(system=False).exists() or get_systemd_unit_path(system=True).exists()
elif is_macos():
return get_launchd_plist_path().exists()
@@ -1810,7 +1820,7 @@ def _is_service_installed() -> bool:
def _is_service_running() -> bool:
"""Check if the gateway service is currently running."""
if is_linux():
if supports_systemd_services():
user_unit_exists = get_systemd_unit_path(system=False).exists()
system_unit_exists = get_systemd_unit_path(system=True).exists()
@@ -1983,7 +1993,7 @@ def gateway_setup():
service_installed = _is_service_installed()
service_running = _is_service_running()
if is_linux() and has_conflicting_systemd_units():
if supports_systemd_services() and has_conflicting_systemd_units():
print_systemd_scope_conflict_warning()
print()
@@ -1993,7 +2003,7 @@ def gateway_setup():
print_warning("Gateway service is installed but not running.")
if prompt_yes_no(" Start it now?", True):
try:
if is_linux():
if supports_systemd_services():
systemd_start()
elif is_macos():
launchd_start()
@@ -2044,7 +2054,7 @@ def gateway_setup():
if service_running:
if prompt_yes_no(" Restart the gateway to pick up changes?", True):
try:
if is_linux():
if supports_systemd_services():
systemd_restart()
elif is_macos():
launchd_restart()
@@ -2056,7 +2066,7 @@ def gateway_setup():
elif service_installed:
if prompt_yes_no(" Start the gateway service?", True):
try:
if is_linux():
if supports_systemd_services():
systemd_start()
elif is_macos():
launchd_start()
@@ -2064,13 +2074,13 @@ def gateway_setup():
print_error(f" Start failed: {e}")
else:
print()
if is_linux() or is_macos():
platform_name = "systemd" if is_linux() else "launchd"
if supports_systemd_services() or is_macos():
platform_name = "systemd" if supports_systemd_services() else "launchd"
if prompt_yes_no(f" Install the gateway as a {platform_name} service? (runs in background, starts on boot)", True):
try:
installed_scope = None
did_install = False
if is_linux():
if supports_systemd_services():
installed_scope, did_install = install_linux_gateway_from_setup(force=False)
else:
launchd_install(force=False)
@@ -2078,7 +2088,7 @@ def gateway_setup():
print()
if did_install and prompt_yes_no(" Start the service now?", True):
try:
if is_linux():
if supports_systemd_services():
systemd_start(system=installed_scope == "system")
else:
launchd_start()
@@ -2089,12 +2099,18 @@ def gateway_setup():
print_info(" You can try manually: hermes gateway install")
else:
print_info(" You can install later: hermes gateway install")
if is_linux():
if supports_systemd_services():
print_info(" Or as a boot-time service: sudo hermes gateway install --system")
print_info(" Or run in foreground: hermes gateway")
else:
print_info(" Service install not supported on this platform.")
print_info(" Run in foreground: hermes gateway")
if is_termux():
from hermes_constants import display_hermes_home as _dhh
print_info(" Termux does not use systemd/launchd services.")
print_info(" Run in foreground: hermes gateway")
print_info(f" Or start it manually in the background (best effort): nohup hermes gateway >{_dhh()}/logs/gateway.log 2>&1 &")
else:
print_info(" Service install not supported on this platform.")
print_info(" Run in foreground: hermes gateway")
else:
print()
print_info("No platforms configured. Run 'hermes gateway setup' when ready.")
@@ -2130,7 +2146,11 @@ def gateway_command(args):
force = getattr(args, 'force', False)
system = getattr(args, 'system', False)
run_as_user = getattr(args, 'run_as_user', None)
if is_linux():
if is_termux():
print("Gateway service installation is not supported on Termux.")
print("Run manually: hermes gateway")
sys.exit(1)
if supports_systemd_services():
systemd_install(force=force, system=system, run_as_user=run_as_user)
elif is_macos():
launchd_install(force)
@@ -2144,7 +2164,11 @@ def gateway_command(args):
managed_error("uninstall gateway service (managed by NixOS)")
return
system = getattr(args, 'system', False)
if is_linux():
if is_termux():
print("Gateway service uninstall is not supported on Termux because there is no managed service to remove.")
print("Stop manual runs with: hermes gateway stop")
sys.exit(1)
if supports_systemd_services():
systemd_uninstall(system=system)
elif is_macos():
launchd_uninstall()
@@ -2154,7 +2178,11 @@ def gateway_command(args):
elif subcmd == "start":
system = getattr(args, 'system', False)
if is_linux():
if is_termux():
print("Gateway service start is not supported on Termux because there is no system service manager.")
print("Run manually: hermes gateway")
sys.exit(1)
if supports_systemd_services():
systemd_start(system=system)
elif is_macos():
launchd_start()
@@ -2169,7 +2197,7 @@ def gateway_command(args):
if stop_all:
# --all: kill every gateway process on the machine
service_available = False
if is_linux() and (get_systemd_unit_path(system=False).exists() or get_systemd_unit_path(system=True).exists()):
if supports_systemd_services() and (get_systemd_unit_path(system=False).exists() or get_systemd_unit_path(system=True).exists()):
try:
systemd_stop(system=system)
service_available = True
@@ -2190,7 +2218,7 @@ def gateway_command(args):
else:
# Default: stop only the current profile's gateway
service_available = False
if is_linux() and (get_systemd_unit_path(system=False).exists() or get_systemd_unit_path(system=True).exists()):
if supports_systemd_services() and (get_systemd_unit_path(system=False).exists() or get_systemd_unit_path(system=True).exists()):
try:
systemd_stop(system=system)
service_available = True
@@ -2218,7 +2246,7 @@ def gateway_command(args):
system = getattr(args, 'system', False)
service_configured = False
if is_linux() and (get_systemd_unit_path(system=False).exists() or get_systemd_unit_path(system=True).exists()):
if supports_systemd_services() and (get_systemd_unit_path(system=False).exists() or get_systemd_unit_path(system=True).exists()):
service_configured = True
try:
systemd_restart(system=system)
@@ -2235,7 +2263,7 @@ def gateway_command(args):
if not service_available:
# systemd/launchd restart failed — check if linger is the issue
if is_linux():
if supports_systemd_services():
linger_ok, _detail = get_systemd_linger_status()
if linger_ok is not True:
import getpass
@@ -2272,7 +2300,7 @@ def gateway_command(args):
system = getattr(args, 'system', False)
# Check for service first
if is_linux() and (get_systemd_unit_path(system=False).exists() or get_systemd_unit_path(system=True).exists()):
if supports_systemd_services() and (get_systemd_unit_path(system=False).exists() or get_systemd_unit_path(system=True).exists()):
systemd_status(deep, system=system)
elif is_macos() and get_launchd_plist_path().exists():
launchd_status(deep)
@@ -2289,9 +2317,13 @@ def gateway_command(args):
for line in runtime_lines:
print(f" {line}")
print()
print("To install as a service:")
print(" hermes gateway install")
print(" sudo hermes gateway install --system")
if is_termux():
print("Termux note:")
print(" Android may stop background jobs when Termux is suspended")
else:
print("To install as a service:")
print(" hermes gateway install")
print(" sudo hermes gateway install --system")
else:
print("✗ Gateway is not running")
runtime_lines = _runtime_health_lines()
@@ -2303,5 +2335,8 @@ def gateway_command(args):
print()
print("To start:")
print(" hermes gateway # Run in foreground")
print(" hermes gateway install # Install as user service")
print(" sudo hermes gateway install --system # Install as boot-time system service")
if is_termux():
print(" nohup hermes gateway > ~/.hermes/logs/gateway.log 2>&1 & # Best-effort background start")
else:
print(" hermes gateway install # Install as user service")
print(" sudo hermes gateway install --system # Install as boot-time system service")
+54 -49
@@ -646,6 +646,7 @@ def cmd_chat(args):
"verbose": args.verbose,
"quiet": getattr(args, "quiet", False),
"query": args.query,
"image": getattr(args, "image", None),
"resume": getattr(args, "resume", None),
"worktree": getattr(args, "worktree", False),
"checkpoints": getattr(args, "checkpoints", False),
@@ -857,7 +858,6 @@ def cmd_whatsapp(args):
def cmd_setup(args):
"""Interactive setup wizard."""
_require_tty("setup")
from hermes_cli.setup import run_setup_wizard
run_setup_wizard(args)
@@ -967,10 +967,11 @@ def select_provider_and_model(args=None):
("alibaba", "Alibaba Cloud / DashScope Coding (Qwen + multi-provider)"),
]
# Add user-defined custom providers from config.yaml
custom_providers_cfg = config.get("custom_providers") or []
_custom_provider_map = {} # key → {name, base_url, api_key}
if isinstance(custom_providers_cfg, list):
def _named_custom_provider_map(cfg) -> dict[str, dict[str, str]]:
custom_providers_cfg = cfg.get("custom_providers") or []
custom_provider_map = {}
if not isinstance(custom_providers_cfg, list):
return custom_provider_map
for entry in custom_providers_cfg:
if not isinstance(entry, dict):
continue
@@ -979,16 +980,23 @@ def select_provider_and_model(args=None):
if not name or not base_url:
continue
key = "custom:" + name.lower().replace(" ", "-")
short_url = base_url.replace("https://", "").replace("http://", "").rstrip("/")
saved_model = entry.get("model", "")
model_hint = f"{saved_model}" if saved_model else ""
top_providers.append((key, f"{name} ({short_url}){model_hint}"))
_custom_provider_map[key] = {
custom_provider_map[key] = {
"name": name,
"base_url": base_url,
"api_key": entry.get("api_key", ""),
"model": saved_model,
"model": entry.get("model", ""),
}
return custom_provider_map
# Add user-defined custom providers from config.yaml
_custom_provider_map = _named_custom_provider_map(config) # key → {name, base_url, api_key}
for key, provider_info in _custom_provider_map.items():
name = provider_info["name"]
base_url = provider_info["base_url"]
short_url = base_url.replace("https://", "").replace("http://", "").rstrip("/")
saved_model = provider_info.get("model", "")
model_hint = f"{saved_model}" if saved_model else ""
top_providers.append((key, f"{name} ({short_url}){model_hint}"))
top_keys = {k for k, _ in top_providers}
extended_keys = {k for k, _ in extended_providers}
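Review note: a standalone sketch of how `_named_custom_provider_map` turns `custom_providers` entries into `custom:<name>` keys. The slug rule comes from the hunk above; the way `name` and `base_url` are read from each entry falls outside the visible hunk, so the `.strip()` handling here is an assumption, and the config values are made up.

```python
from typing import Any, Dict


def named_custom_provider_map(cfg: Dict[str, Any]) -> Dict[str, Dict[str, str]]:
    """Build {slug: provider_info} from the custom_providers list in a config dict."""
    entries = cfg.get("custom_providers") or []
    provider_map: Dict[str, Dict[str, str]] = {}
    if not isinstance(entries, list):
        return provider_map
    for entry in entries:
        if not isinstance(entry, dict):
            continue
        # Assumption: fields are stripped strings; the real extraction is not shown.
        name = (entry.get("name") or "").strip()
        base_url = (entry.get("base_url") or "").strip()
        if not name or not base_url:
            continue
        key = "custom:" + name.lower().replace(" ", "-")
        provider_map[key] = {
            "name": name,
            "base_url": base_url,
            "api_key": entry.get("api_key", ""),
            "model": entry.get("model", ""),
        }
    return provider_map


# Hypothetical config: one valid entry, one without a name, one of the wrong type.
config = {
    "custom_providers": [
        {"name": "My Local LLM", "base_url": "http://localhost:8000/v1", "model": "example-model"},
        {"name": "", "base_url": "http://localhost:9000/v1"},
        "not-a-dict",
    ]
}
provider_map = named_custom_provider_map(config)
```

Two entries whose names differ only in case or spacing map to the same slug, and the later one wins in this sketch.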
@@ -1053,8 +1061,15 @@ def select_provider_and_model(args=None):
_model_flow_copilot(config, current_model)
elif selected_provider == "custom":
_model_flow_custom(config)
elif selected_provider.startswith("custom:") and selected_provider in _custom_provider_map:
_model_flow_named_custom(config, _custom_provider_map[selected_provider])
elif selected_provider.startswith("custom:"):
provider_info = _named_custom_provider_map(load_config()).get(selected_provider)
if provider_info is None:
print(
"Warning: the selected saved custom provider is no longer available. "
"It may have been removed from config.yaml. No change."
)
return
_model_flow_named_custom(config, provider_info)
elif selected_provider == "remove-custom":
_remove_custom_provider(config)
elif selected_provider == "anthropic":
@@ -1127,10 +1142,10 @@ def _model_flow_openrouter(config, current_model=""):
print()
from hermes_cli.models import model_ids, get_pricing_for_provider
openrouter_models = model_ids()
openrouter_models = model_ids(force_refresh=True)
# Fetch live pricing (non-blocking — returns empty dict on failure)
pricing = get_pricing_for_provider("openrouter")
pricing = get_pricing_for_provider("openrouter", force_refresh=True)
selected = _prompt_model_selection(openrouter_models, current_model=current_model, pricing=pricing)
if selected:
@@ -1658,7 +1673,7 @@ def _remove_custom_provider(config):
)
idx = menu.show()
print()
except (ImportError, NotImplementedError):
except (ImportError, NotImplementedError, OSError, subprocess.SubprocessError):
for i, c in enumerate(choices, 1):
print(f" {i}. {c}")
print()
@@ -1739,7 +1754,7 @@ def _model_flow_named_custom(config, provider_info):
print("Cancelled.")
return
model_name = models[idx]
except (ImportError, NotImplementedError):
except (ImportError, NotImplementedError, OSError, subprocess.SubprocessError):
for i, m in enumerate(models, 1):
print(f" {i}. {m}")
print(f" {len(models) + 1}. Cancel")
@@ -1860,7 +1875,7 @@ def _prompt_reasoning_effort_selection(efforts, current_effort=""):
if idx == len(ordered):
return "none"
return None
except (ImportError, NotImplementedError):
except (ImportError, NotImplementedError, OSError, subprocess.SubprocessError):
pass
print("Select reasoning effort:")
@@ -3021,33 +3036,19 @@ def _restore_stashed_changes(
print("\nYour stashed changes are preserved — nothing is lost.")
print(f" Stash ref: {stash_ref}")
# Ask before resetting (if interactive)
do_reset = True
if prompt_user:
print("\nReset working tree to clean state so Hermes can run?")
print(" (You can re-apply your changes later with: git stash apply)")
print("[Y/n] ", end="", flush=True)
response = input().strip().lower()
if response not in ("", "y", "yes"):
do_reset = False
if do_reset:
subprocess.run(
git_cmd + ["reset", "--hard", "HEAD"],
cwd=cwd,
capture_output=True,
)
print("Working tree reset to clean state.")
else:
print("Working tree left as-is (may have conflict markers).")
print("Resolve conflicts manually, then run: git stash drop")
print(f"Restore your changes with: git stash apply {stash_ref}")
# In non-interactive mode (gateway /update), don't abort — the code
# update itself succeeded, only the stash restore had conflicts.
# Aborting would report the entire update as failed.
if prompt_user:
sys.exit(1)
# Always reset to clean state — leaving conflict markers in source
# files makes hermes completely unrunnable (SyntaxError on import).
# The user's changes are safe in the stash for manual recovery.
subprocess.run(
git_cmd + ["reset", "--hard", "HEAD"],
cwd=cwd,
capture_output=True,
)
print("Working tree reset to clean state.")
print(f"Restore your changes later with: git stash apply {stash_ref}")
# Don't sys.exit — the code update itself succeeded, only the stash
# restore had conflicts. Let cmd_update continue with pip install,
# skill sync, and gateway restart.
return False
stash_selector = _resolve_stash_selector(git_cmd, cwd, stash_ref)
@@ -3763,7 +3764,7 @@ def cmd_update(args):
# running gateway needs restarting to pick up the new code.
try:
from hermes_cli.gateway import (
is_macos, is_linux, _ensure_user_systemd_env,
is_macos, supports_systemd_services, _ensure_user_systemd_env,
find_gateway_pids,
_get_service_pids,
)
@@ -3774,7 +3775,7 @@ def cmd_update(args):
# --- Systemd services (Linux) ---
# Discover all hermes-gateway* units (default + profiles)
if is_linux():
if supports_systemd_services():
try:
_ensure_user_systemd_env()
except Exception:
@@ -4291,6 +4292,10 @@ For more help on a command:
"-q", "--query",
help="Single query (non-interactive mode)"
)
chat_parser.add_argument(
"--image",
help="Optional local image path to attach to a single query"
)
chat_parser.add_argument(
"-m", "--model",
help="Model to use (e.g., anthropic/claude-sonnet-4)"
@@ -4481,12 +4486,12 @@ For more help on a command:
"setup",
help="Interactive setup wizard",
description="Configure Hermes Agent with an interactive wizard. "
"Run a specific section: hermes setup model|terminal|gateway|tools|agent"
"Run a specific section: hermes setup model|tts|terminal|gateway|tools|agent"
)
setup_parser.add_argument(
"section",
nargs="?",
choices=["model", "terminal", "gateway", "tools", "agent"],
choices=["model", "tts", "terminal", "gateway", "tools", "agent"],
default=None,
help="Run a specific setup section instead of the full wizard"
)
+57 -1
@@ -25,6 +25,7 @@ from dataclasses import dataclass
from typing import List, NamedTuple, Optional
from hermes_cli.providers import (
custom_provider_slug,
determine_api_mode,
get_label,
is_aggregator,
@@ -336,6 +337,7 @@ def resolve_alias(
def get_authenticated_provider_slugs(
current_provider: str = "",
user_providers: dict = None,
custom_providers: list | None = None,
) -> list[str]:
"""Return slugs of providers that have credentials.
@@ -346,6 +348,7 @@ def get_authenticated_provider_slugs(
providers = list_authenticated_providers(
current_provider=current_provider,
user_providers=user_providers,
custom_providers=custom_providers,
max_models=0,
)
return [p["slug"] for p in providers]
@@ -383,6 +386,7 @@ def switch_model(
is_global: bool = False,
explicit_provider: str = "",
user_providers: dict = None,
custom_providers: list | None = None,
) -> ModelSwitchResult:
"""Core model-switching pipeline shared between CLI and gateway.
@@ -416,6 +420,7 @@ def switch_model(
is_global: Whether to persist the switch.
explicit_provider: From --provider flag (empty = no explicit provider).
user_providers: The ``providers:`` dict from config.yaml (for user endpoints).
custom_providers: The ``custom_providers:`` list from config.yaml.
Returns:
ModelSwitchResult with all information the caller needs.
@@ -436,7 +441,11 @@ def switch_model(
# =================================================================
if explicit_provider:
# Resolve the provider
pdef = resolve_provider_full(explicit_provider, user_providers)
pdef = resolve_provider_full(
explicit_provider,
user_providers,
custom_providers,
)
if pdef is None:
_switch_err = (
f"Unknown provider '{explicit_provider}'. "
@@ -516,6 +525,7 @@ def switch_model(
authed = get_authenticated_provider_slugs(
current_provider=current_provider,
user_providers=user_providers,
custom_providers=custom_providers,
)
fallback_result = _resolve_alias_fallback(raw_input, authed)
if fallback_result is not None:
@@ -590,6 +600,14 @@ def switch_model(
provider_changed = target_provider != current_provider
provider_label = get_label(target_provider)
if target_provider.startswith("custom:"):
custom_pdef = resolve_provider_full(
target_provider,
user_providers,
custom_providers,
)
if custom_pdef is not None:
provider_label = custom_pdef.name
# --- Resolve credentials ---
api_key = current_api_key
@@ -708,6 +726,7 @@ def switch_model(
def list_authenticated_providers(
current_provider: str = "",
user_providers: dict = None,
custom_providers: list | None = None,
max_models: int = 8,
) -> List[dict]:
"""Detect which providers have credentials and list their curated models.
@@ -853,6 +872,43 @@ def list_authenticated_providers(
"api_url": api_url,
})
# --- 4. Saved custom providers from config ---
if custom_providers and isinstance(custom_providers, list):
for entry in custom_providers:
if not isinstance(entry, dict):
continue
display_name = (entry.get("name") or "").strip()
api_url = (
entry.get("base_url", "")
or entry.get("url", "")
or entry.get("api", "")
or ""
).strip()
if not display_name or not api_url:
continue
slug = custom_provider_slug(display_name)
if slug in seen_slugs:
continue
models_list = []
default_model = (entry.get("model") or "").strip()
if default_model:
models_list.append(default_model)
results.append({
"slug": slug,
"name": display_name,
"is_current": slug == current_provider,
"is_user_defined": True,
"models": models_list,
"total_models": len(models_list),
"source": "user-config",
"api_url": api_url,
})
seen_slugs.add(slug)
# Sort: current provider first, then by model count descending
results.sort(key=lambda r: (not r["is_current"], -r["total_models"]))
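The ordering rule in the sort above can be checked in isolation. This is a minimal sketch, assuming result dicts that carry only the two keys the sort reads; the sample slugs and model counts are made up for illustration.

```python
# Illustrative data only; real entries carry more keys (name, models, api_url, ...).
results = [
    {"slug": "openrouter", "is_current": False, "total_models": 8},
    {"slug": "custom:my-lab", "is_current": True, "total_models": 1},
    {"slug": "nous", "is_current": False, "total_models": 3},
]

# False sorts before True, so `not is_current` puts the current provider first;
# negating total_models orders the remaining entries from most to fewest models.
results.sort(key=lambda r: (not r["is_current"], -r["total_models"]))

order = [r["slug"] for r in results]
print(order)
```

A saved custom provider with a single default model therefore lands at the bottom of the list unless it is the current provider.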
+160 -14
@@ -24,18 +24,19 @@ COPILOT_REASONING_EFFORTS_O_SERIES = ["low", "medium", "high"]
GITHUB_MODELS_BASE_URL = COPILOT_BASE_URL
GITHUB_MODELS_CATALOG_URL = COPILOT_MODELS_URL
# Fallback OpenRouter snapshot used when the live catalog is unavailable.
# (model_id, display description shown in menus)
OPENROUTER_MODELS: list[tuple[str, str]] = [
("anthropic/claude-opus-4.6", "recommended"),
("anthropic/claude-sonnet-4.6", ""),
("qwen/qwen3.6-plus:free", "free"),
("qwen/qwen3.6-plus", ""),
("anthropic/claude-sonnet-4.5", ""),
("anthropic/claude-haiku-4.5", ""),
("openai/gpt-5.4", ""),
("openai/gpt-5.4-mini", ""),
("xiaomi/mimo-v2-pro", ""),
("openai/gpt-5.3-codex", ""),
("google/gemini-3-pro-preview", ""),
("google/gemini-3-pro-image-preview", ""),
("google/gemini-3-flash-preview", ""),
("google/gemini-3.1-pro-preview", ""),
("google/gemini-3.1-flash-lite-preview", ""),
@@ -47,7 +48,7 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
("z-ai/glm-5.1", ""),
("z-ai/glm-5-turbo", ""),
("moonshotai/kimi-k2.5", ""),
("x-ai/grok-4.20-beta", ""),
("x-ai/grok-4.20", ""),
("nvidia/nemotron-3-super-120b-a12b", ""),
("nvidia/nemotron-3-super-120b-a12b:free", "free"),
("arcee-ai/trinity-large-preview:free", "free"),
@@ -56,6 +57,8 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
("openai/gpt-5.4-nano", ""),
]
_openrouter_catalog_cache: list[tuple[str, str]] | None = None
_PROVIDER_MODELS: dict[str, list[str]] = {
"nous": [
"anthropic/claude-opus-4.6",
@@ -530,15 +533,79 @@ _PROVIDER_ALIASES = {
}
def model_ids() -> list[str]:
def _openrouter_model_is_free(pricing: Any) -> bool:
"""Return True when both prompt and completion pricing are zero."""
if not isinstance(pricing, dict):
return False
try:
return float(pricing.get("prompt", "0")) == 0 and float(pricing.get("completion", "0")) == 0
except (TypeError, ValueError):
return False
def fetch_openrouter_models(
timeout: float = 8.0,
*,
force_refresh: bool = False,
) -> list[tuple[str, str]]:
"""Return the curated OpenRouter picker list, refreshed from the live catalog when possible."""
global _openrouter_catalog_cache
if _openrouter_catalog_cache is not None and not force_refresh:
return list(_openrouter_catalog_cache)
fallback = list(OPENROUTER_MODELS)
preferred_ids = [mid for mid, _ in fallback]
try:
req = urllib.request.Request(
"https://openrouter.ai/api/v1/models",
headers={"Accept": "application/json"},
)
with urllib.request.urlopen(req, timeout=timeout) as resp:
payload = json.loads(resp.read().decode())
except Exception:
return list(_openrouter_catalog_cache or fallback)
live_items = payload.get("data", [])
if not isinstance(live_items, list):
return list(_openrouter_catalog_cache or fallback)
live_by_id: dict[str, dict[str, Any]] = {}
for item in live_items:
if not isinstance(item, dict):
continue
mid = str(item.get("id") or "").strip()
if not mid:
continue
live_by_id[mid] = item
curated: list[tuple[str, str]] = []
for preferred_id in preferred_ids:
live_item = live_by_id.get(preferred_id)
if live_item is None:
continue
desc = "free" if _openrouter_model_is_free(live_item.get("pricing")) else ""
curated.append((preferred_id, desc))
if not curated:
return list(_openrouter_catalog_cache or fallback)
first_id, _ = curated[0]
curated[0] = (first_id, "recommended")
_openrouter_catalog_cache = curated
return list(curated)
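The curation step in `fetch_openrouter_models` can be exercised offline. The sketch below is an approximation rather than the function itself: `is_free` and `curate` are hypothetical names, the payload is hand-written, and the prices are placeholders, not real OpenRouter pricing.

```python
from typing import Any


def is_free(pricing: Any) -> bool:
    """True when both prompt and completion pricing parse to zero."""
    if not isinstance(pricing, dict):
        return False
    try:
        return (
            float(pricing.get("prompt", "0")) == 0
            and float(pricing.get("completion", "0")) == 0
        )
    except (TypeError, ValueError):
        return False


def curate(preferred_ids: list[str], payload: dict[str, Any]) -> list[tuple[str, str]]:
    """Keep preferred IDs that exist in the live payload, in preferred order."""
    live_by_id = {
        str(item.get("id") or "").strip(): item
        for item in payload.get("data", [])
        if isinstance(item, dict)
    }
    curated = [
        (mid, "free" if is_free(live_by_id[mid].get("pricing")) else "")
        for mid in preferred_ids
        if mid in live_by_id
    ]
    if curated:
        # The first surviving entry is always relabelled, even if it was "free".
        curated[0] = (curated[0][0], "recommended")
    return curated


# Hand-written payload; prices are placeholders for illustration.
payload = {
    "data": [
        {"id": "qwen/qwen3.6-plus:free", "pricing": {"prompt": "0", "completion": "0"}},
        {"id": "anthropic/claude-opus-4.6", "pricing": {"prompt": "0.01", "completion": "0.02"}},
        {"id": "unlisted/model", "pricing": {"prompt": "0", "completion": "0"}},
    ]
}
preferred = ["anthropic/claude-opus-4.6", "openai/gpt-5.4", "qwen/qwen3.6-plus:free"]
curated = curate(preferred, payload)
print(curated)
```

The static snapshot only supplies the ID order; the "free" tag comes from the live pricing, and IDs missing from the live catalog are dropped from the picker.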
def model_ids(*, force_refresh: bool = False) -> list[str]:
"""Return just the OpenRouter model-id strings."""
return [mid for mid, _ in OPENROUTER_MODELS]
return [mid for mid, _ in fetch_openrouter_models(force_refresh=force_refresh)]
def menu_labels() -> list[str]:
def menu_labels(*, force_refresh: bool = False) -> list[str]:
"""Return display labels like 'anthropic/claude-opus-4.6 (recommended)'."""
labels = []
for mid, desc in OPENROUTER_MODELS:
for mid, desc in fetch_openrouter_models(force_refresh=force_refresh):
labels.append(f"{mid} ({desc})" if desc else mid)
return labels
@@ -727,13 +794,14 @@ def _resolve_nous_pricing_credentials() -> tuple[str, str]:
return ("", "")
def get_pricing_for_provider(provider: str) -> dict[str, dict[str, str]]:
def get_pricing_for_provider(provider: str, *, force_refresh: bool = False) -> dict[str, dict[str, str]]:
"""Return live pricing for providers that support it (openrouter, nous)."""
normalized = normalize_provider(provider)
if normalized == "openrouter":
return fetch_models_with_pricing(
api_key=_resolve_openrouter_api_key(),
base_url="https://openrouter.ai/api",
force_refresh=force_refresh,
)
if normalized == "nous":
api_key, base_url = _resolve_nous_pricing_credentials()
@@ -746,6 +814,7 @@ def get_pricing_for_provider(provider: str) -> dict[str, dict[str, str]]:
return fetch_models_with_pricing(
api_key=api_key,
base_url=stripped,
force_refresh=force_refresh,
)
return {}
@@ -854,7 +923,11 @@ def _get_custom_base_url() -> str:
return ""
def curated_models_for_provider(provider: Optional[str]) -> list[tuple[str, str]]:
def curated_models_for_provider(
provider: Optional[str],
*,
force_refresh: bool = False,
) -> list[tuple[str, str]]:
"""Return ``(model_id, description)`` tuples for a provider's model list.
Tries to fetch the live model list from the provider's API first,
@@ -863,7 +936,7 @@ def curated_models_for_provider(provider: Optional[str]) -> list[tuple[str, str]
"""
normalized = normalize_provider(provider)
if normalized == "openrouter":
return list(OPENROUTER_MODELS)
return fetch_openrouter_models(force_refresh=force_refresh)
# Try live API first (Codex, Nous, etc. all support /models)
live = provider_model_ids(normalized)
@@ -982,12 +1055,12 @@ def _find_openrouter_slug(model_name: str) -> Optional[str]:
return None
# Exact match (already has provider/ prefix)
for mid, _ in OPENROUTER_MODELS:
for mid in model_ids():
if name_lower == mid.lower():
return mid
# Try matching just the model part (after the /)
for mid, _ in OPENROUTER_MODELS:
for mid in model_ids():
if "/" in mid:
_, model_part = mid.split("/", 1)
if name_lower == model_part.lower():
@@ -1017,6 +1090,79 @@ def provider_label(provider: Optional[str]) -> str:
return _PROVIDER_LABELS.get(normalized, original or "OpenRouter")
# Models that support OpenAI Priority Processing (service_tier="priority").
# See https://openai.com/api-priority-processing/ for the canonical list.
# Only the bare model slug is stored (no vendor prefix).
_PRIORITY_PROCESSING_MODELS: frozenset[str] = frozenset({
"gpt-5.4",
"gpt-5.4-mini",
"gpt-5.2",
"gpt-5.1",
"gpt-5",
"gpt-5-mini",
"gpt-4.1",
"gpt-4.1-mini",
"gpt-4.1-nano",
"gpt-4o",
"gpt-4o-mini",
"o3",
"o4-mini",
})
# Models that support Anthropic Fast Mode (speed="fast").
# See https://platform.claude.com/docs/en/build-with-claude/fast-mode
# Currently only Claude Opus 4.6. Both hyphen and dot variants are stored
# to handle native Anthropic (claude-opus-4-6) and OpenRouter (claude-opus-4.6).
_ANTHROPIC_FAST_MODE_MODELS: frozenset[str] = frozenset({
"claude-opus-4-6",
"claude-opus-4.6",
})
def _strip_vendor_prefix(model_id: str) -> str:
"""Strip vendor/ prefix from a model ID (e.g. 'anthropic/claude-opus-4-6' -> 'claude-opus-4-6')."""
raw = str(model_id or "").strip().lower()
if "/" in raw:
raw = raw.split("/", 1)[1]
return raw
def model_supports_fast_mode(model_id: Optional[str]) -> bool:
"""Return whether Hermes should expose the /fast toggle for this model."""
raw = _strip_vendor_prefix(str(model_id or ""))
if raw in _PRIORITY_PROCESSING_MODELS:
return True
# Anthropic fast mode: strip OpenRouter variant tags (:fast, :beta) before
# matching. Date-suffixed IDs (e.g. claude-opus-4-6-20260401) are not normalized here.
base = raw.split(":")[0]
return base in _ANTHROPIC_FAST_MODE_MODELS
def _is_anthropic_fast_model(model_id: Optional[str]) -> bool:
"""Return True if the model supports Anthropic's fast mode (speed='fast')."""
raw = _strip_vendor_prefix(str(model_id or ""))
base = raw.split(":")[0]
return base in _ANTHROPIC_FAST_MODE_MODELS
def resolve_fast_mode_overrides(model_id: Optional[str]) -> dict[str, Any] | None:
"""Return request_overrides for fast/priority mode, or None if unsupported.
Returns provider-appropriate overrides:
- OpenAI models: ``{"service_tier": "priority"}`` (Priority Processing)
- Anthropic models: ``{"speed": "fast"}`` (Anthropic Fast Mode beta)
The overrides are injected into the API request kwargs by
``_build_api_kwargs`` in run_agent.py; each API path handles its own
keys (service_tier for OpenAI/Codex, speed for Anthropic Messages).
"""
if not model_supports_fast_mode(model_id):
return None
if _is_anthropic_fast_model(model_id):
return {"speed": "fast"}
return {"service_tier": "priority"}
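To make the override selection concrete, here is a condensed sketch of the same decision. It is not the code from this diff: the model sets are abbreviated, and `strip_vendor_prefix` and `resolve_overrides` are illustrative names.

```python
from typing import Any, Optional

# Abbreviated stand-ins for the two frozensets in the diff.
PRIORITY_MODELS = frozenset({"gpt-5.4", "gpt-4.1", "o3"})
ANTHROPIC_FAST_MODELS = frozenset({"claude-opus-4-6", "claude-opus-4.6"})


def strip_vendor_prefix(model_id: Optional[str]) -> str:
    """Lowercase the ID and drop a leading 'vendor/' segment if present."""
    raw = str(model_id or "").strip().lower()
    return raw.split("/", 1)[1] if "/" in raw else raw


def resolve_overrides(model_id: Optional[str]) -> Optional[dict[str, Any]]:
    """Return the request overrides for fast/priority mode, or None."""
    raw = strip_vendor_prefix(model_id)
    if raw in PRIORITY_MODELS:
        return {"service_tier": "priority"}
    # Only the ':variant' tag is removed before the Anthropic lookup.
    if raw.split(":")[0] in ANTHROPIC_FAST_MODELS:
        return {"speed": "fast"}
    return None


print(resolve_overrides("openai/gpt-5.4"))
print(resolve_overrides("anthropic/claude-opus-4.6:beta"))
print(resolve_overrides("claude-opus-4-6-20260401"))
```

With this lookup, a date-suffixed Anthropic ID does not match, because splitting on `:` leaves the date suffix in place.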
def _resolve_copilot_catalog_api_key() -> str:
"""Best-effort GitHub token for fetching the Copilot model catalog."""
try:
@@ -1028,7 +1174,7 @@ def _resolve_copilot_catalog_api_key() -> str:
return ""
def provider_model_ids(provider: Optional[str]) -> list[str]:
def provider_model_ids(provider: Optional[str], *, force_refresh: bool = False) -> list[str]:
"""Return the best known model catalog for a provider.
Tries live API endpoints for providers that support them (Codex, Nous),
@@ -1036,7 +1182,7 @@ def provider_model_ids(provider: Optional[str]) -> list[str]:
"""
normalized = normalize_provider(provider)
if normalized == "openrouter":
return model_ids()
return model_ids(force_refresh=force_refresh)
if normalized == "openai-codex":
from hermes_cli.codex_models import get_codex_model_ids
+61
@@ -452,9 +452,64 @@ def resolve_user_provider(name: str, user_config: Dict[str, Any]) -> Optional[Pr
)
def custom_provider_slug(display_name: str) -> str:
"""Build a canonical slug for a custom_providers entry.
Matches the convention used by runtime_provider and credential_pool
(``custom:<normalized-name>``). Centralised here so all call-sites
produce identical slugs.
"""
return "custom:" + display_name.strip().lower().replace(" ", "-")
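A short usage sketch of the slug helper, with its one-line body copied from the diff so the example runs on its own. The `matches` helper is hypothetical and mirrors the membership test used by `resolve_custom_provider`; the provider names are made up.

```python
def custom_provider_slug(display_name: str) -> str:
    """Same one-line body as in the diff above."""
    return "custom:" + display_name.strip().lower().replace(" ", "-")


def matches(requested: str, display_name: str) -> bool:
    """Hypothetical helper: accept either the display name or the slug."""
    requested = (requested or "").strip().lower()
    return requested in {display_name.lower(), custom_provider_slug(display_name)}


print(custom_provider_slug("  My Lab "))    # custom:my-lab
print(custom_provider_slug("Local  vLLM"))  # custom:local--vllm (each space becomes a hyphen)
print(matches("custom:my-lab", "My Lab"))   # True
print(matches("my-lab", "My Lab"))          # False: bare slug without the prefix
```

Only spaces are replaced, so names containing other punctuation keep it in the slug.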
def resolve_custom_provider(
name: str,
custom_providers: Optional[List[Dict[str, Any]]],
) -> Optional[ProviderDef]:
"""Resolve a provider from the user's config.yaml ``custom_providers`` list."""
if not custom_providers or not isinstance(custom_providers, list):
return None
requested = (name or "").strip().lower()
if not requested:
return None
for entry in custom_providers:
if not isinstance(entry, dict):
continue
display_name = (entry.get("name") or "").strip()
api_url = (
entry.get("base_url", "")
or entry.get("url", "")
or entry.get("api", "")
or ""
).strip()
if not display_name or not api_url:
continue
slug = custom_provider_slug(display_name)
if requested not in {display_name.lower(), slug}:
continue
return ProviderDef(
id=slug,
name=display_name,
transport="openai_chat",
api_key_env_vars=(),
base_url=api_url,
is_aggregator=False,
auth_type="api_key",
source="user-config",
)
return None
def resolve_provider_full(
name: str,
user_providers: Optional[Dict[str, Any]] = None,
custom_providers: Optional[List[Dict[str, Any]]] = None,
) -> Optional[ProviderDef]:
"""Full resolution chain: built-in → models.dev → user config.
@@ -463,6 +518,7 @@ def resolve_provider_full(
Args:
name: Provider name or alias.
user_providers: The ``providers:`` dict from config.yaml (optional).
custom_providers: The ``custom_providers:`` list from config.yaml (optional).
Returns:
ProviderDef if found, else None.
@@ -485,6 +541,11 @@ def resolve_provider_full(
if user_pdef is not None:
return user_pdef
# 2b. Saved custom providers from config
custom_pdef = resolve_custom_provider(name, custom_providers)
if custom_pdef is not None:
return custom_pdef
# 3. Try models.dev directly (for providers not in our ALIASES)
try:
from agent.models_dev import get_provider_info as _mdev_provider
+16
@@ -16,6 +16,7 @@ from hermes_cli.auth import (
DEFAULT_CODEX_BASE_URL,
DEFAULT_QWEN_BASE_URL,
PROVIDER_REGISTRY,
_agent_key_is_usable,
format_auth_error,
resolve_provider,
resolve_nous_runtime_credentials,
@@ -644,6 +645,21 @@ def resolve_runtime_provider(
getattr(entry, "runtime_api_key", None)
or getattr(entry, "access_token", "")
)
# For Nous, the pool entry's runtime_api_key is the agent_key — a
# short-lived inference credential (~30 min TTL). The pool doesn't
# refresh it during selection (that would trigger network calls in
# non-runtime contexts like `hermes auth list`). If the key is
# expired, clear pool_api_key so we fall through to
# resolve_nous_runtime_credentials() which handles refresh + mint.
if provider == "nous" and entry is not None and pool_api_key:
min_ttl = max(60, int(os.getenv("HERMES_NOUS_MIN_KEY_TTL_SECONDS", "1800")))
nous_state = {
"agent_key": getattr(entry, "agent_key", None),
"agent_key_expires_at": getattr(entry, "agent_key_expires_at", None),
}
if not _agent_key_is_usable(nous_state, min_ttl):
logger.debug("Nous pool entry agent_key expired/missing, falling through to runtime resolution")
pool_api_key = ""
if entry is not None and pool_api_key:
return _resolve_runtime_from_pool_entry(
provider=provider,
+15 -16
@@ -16,6 +16,7 @@ import logging
import os
import shutil
import sys
import copy
from pathlib import Path
from typing import Optional, Dict, Any
@@ -316,6 +317,7 @@ def _setup_provider_model_selection(config, provider_id, current_model, prompt_c
# Import config helpers
from hermes_cli.config import (
DEFAULT_CONFIG,
get_hermes_home,
get_config_path,
get_env_path,
@@ -921,8 +923,10 @@ def setup_model_provider(config: dict, *, quick: bool = False):
# changes with stale values (#4172).
_refreshed = load_config()
config["model"] = _refreshed.get("model", config.get("model"))
if _refreshed.get("custom_providers"):
if "custom_providers" in _refreshed:
config["custom_providers"] = _refreshed["custom_providers"]
else:
config.pop("custom_providers", None)
# Derive the selected provider for downstream steps (vision setup).
selected_provider = None
@@ -1006,8 +1010,6 @@ def setup_model_provider(config: dict, *, quick: bool = False):
strategy_value = ["fill_first", "round_robin", "random"][strategy_idx]
_set_credential_pool_strategy(config, selected_provider, strategy_value)
print_success(f"Saved {selected_provider} rotation strategy: {strategy_value}")
else:
_set_credential_pool_strategy(config, selected_provider, "fill_first")
except Exception as exc:
logger.debug("Could not configure same-provider fallback in setup: %s", exc)
@@ -2844,6 +2846,7 @@ def run_setup_wizard(args):
Supports full, quick, and section-specific setup:
hermes setup            - full or quick (auto-detected)
hermes setup model      - just model/provider
hermes setup tts        - just text-to-speech
hermes setup terminal   - just terminal backend
hermes setup gateway    - just messaging platforms
hermes setup tools      - just tool configuration
@@ -2855,6 +2858,11 @@ def run_setup_wizard(args):
return
ensure_hermes_home()
reset_requested = bool(getattr(args, "reset", False))
if reset_requested:
save_config(copy.deepcopy(DEFAULT_CONFIG))
print_success("Configuration reset to defaults.")
config = load_config()
hermes_home = get_hermes_home()
@@ -2955,18 +2963,13 @@ def run_setup_wizard(args):
menu_choices = [
"Quick Setup - configure missing items only",
"Full Setup - reconfigure everything",
"---",
"Model & Provider",
"Terminal Backend",
"Messaging Platforms (Gateway)",
"Tools",
"Agent Settings",
"---",
"Exit",
]
# Separator indices (not selectable, but prompt_choice doesn't filter them,
# so we handle them below)
choice = prompt_choice("What would you like to do?", menu_choices, 0)
if choice == 0:
@@ -2976,18 +2979,14 @@ def run_setup_wizard(args):
elif choice == 1:
# Full setup — fall through to run all sections
pass
elif choice in (2, 8):
# Separator — treat as exit
elif choice == 7:
print_info("Exiting. Run 'hermes setup' again when ready.")
return
elif choice == 9:
print_info("Exiting. Run 'hermes setup' again when ready.")
return
elif 3 <= choice <= 7:
elif 2 <= choice <= 6:
# Individual section — map by key, not by position.
# SETUP_SECTIONS includes TTS but the returning-user menu skips it,
# so positional indexing (choice - 3) would dispatch the wrong section.
section_key = RETURNING_USER_MENU_SECTION_KEYS[choice - 3]
# so positional indexing (choice - 2) would dispatch the wrong section.
section_key = RETURNING_USER_MENU_SECTION_KEYS[choice - 2]
section = next((s for s in SETUP_SECTIONS if s[0] == section_key), None)
if section:
_, label, func = section
+23 -2
@@ -79,6 +79,9 @@ def _effective_provider_label() -> str:
return provider_label(effective)
from hermes_constants import is_termux as _is_termux
def show_status(args):
"""Show status of all Hermes Agent components."""
show_all = getattr(args, 'all', False)
@@ -325,7 +328,25 @@ def show_status(args):
print()
print(color("◆ Gateway Service", Colors.CYAN, Colors.BOLD))
if sys.platform.startswith('linux'):
if _is_termux():
try:
from hermes_cli.gateway import find_gateway_pids
gateway_pids = find_gateway_pids()
except Exception:
gateway_pids = []
is_running = bool(gateway_pids)
print(f" Status: {check_mark(is_running)} {'running' if is_running else 'stopped'}")
print(" Manager: Termux / manual process")
if gateway_pids:
rendered = ", ".join(str(pid) for pid in gateway_pids[:3])
if len(gateway_pids) > 3:
rendered += ", ..."
print(f" PID(s): {rendered}")
else:
print(" Start with: hermes gateway")
print(" Note: Android may stop background jobs when Termux is suspended")
elif sys.platform.startswith('linux'):
try:
from hermes_cli.gateway import get_service_name
_gw_svc = get_service_name()
@@ -339,7 +360,7 @@ def show_status(args):
timeout=5
)
is_active = result.stdout.strip() == "active"
except subprocess.TimeoutExpired:
except (FileNotFoundError, subprocess.TimeoutExpired):
is_active = False
print(f" Status: {check_mark(is_active)} {'running' if is_active else 'stopped'}")
print(" Manager: systemd (user)")
+6
@@ -6,6 +6,8 @@ Provides options for:
- Keep data: Remove code but keep ~/.hermes/ (configs, sessions, logs)
"""
import os
import platform
import shutil
import subprocess
from pathlib import Path
@@ -122,6 +124,10 @@ def uninstall_gateway_service():
if platform.system() != "Linux":
return False
prefix = os.getenv("PREFIX", "")
if os.getenv("TERMUX_VERSION") or "com.termux/files/usr" in prefix:
return False
try:
from hermes_cli.gateway import get_service_name
+10
@@ -93,6 +93,16 @@ def parse_reasoning_effort(effort: str) -> dict | None:
return None
def is_termux() -> bool:
"""Return True when running inside a Termux (Android) environment.
Checks ``TERMUX_VERSION`` (set by Termux) or the Termux-specific
``PREFIX`` path. Import-safe: no heavy deps.
"""
prefix = os.getenv("PREFIX", "")
return bool(os.getenv("TERMUX_VERSION") or "com.termux/files/usr" in prefix)
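The detection rule is easy to test if the environment is passed in explicitly. The sketch below is a variant for illustration only: the `env` parameter is an assumption and is not part of the helper in this diff, and the sample values are made up.

```python
import os
from typing import Mapping, Optional


def is_termux(env: Optional[Mapping[str, str]] = None) -> bool:
    """Same two checks as the diff, with an injectable environment (assumed parameter)."""
    env = os.environ if env is None else env
    prefix = env.get("PREFIX", "")
    return bool(env.get("TERMUX_VERSION") or "com.termux/files/usr" in prefix)


print(is_termux({"TERMUX_VERSION": "0.118.0"}))                 # True
print(is_termux({"PREFIX": "/data/data/com.termux/files/usr"}))  # True
print(is_termux({"PREFIX": "/usr"}))                             # False
```

Either signal alone is sufficient, which is why `uninstall_gateway_service` in the same diff repeats both checks.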
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"
OPENROUTER_MODELS_URL = f"{OPENROUTER_BASE_URL}/models"
OPENROUTER_CHAT_URL = f"{OPENROUTER_BASE_URL}/chat/completions"
+11
@@ -63,6 +63,17 @@ homeassistant = ["aiohttp>=3.9.0,<4"]
sms = ["aiohttp>=3.9.0,<4"]
acp = ["agent-client-protocol>=0.9.0,<1.0"]
mistral = ["mistralai>=2.3.0,<3"]
termux = [
# Tested Android / Termux path: keeps the core CLI feature-rich while
# avoiding extras that currently depend on non-Android wheels (notably
# faster-whisper -> ctranslate2 via the voice extra).
"hermes-agent[cron]",
"hermes-agent[cli]",
"hermes-agent[pty]",
"hermes-agent[mcp]",
"hermes-agent[honcho]",
"hermes-agent[acp]",
]
dingtalk = ["dingtalk-stream>=0.1.0,<1"]
feishu = ["lark-oapi>=1.5.3,<2"]
rl = [
+169 -32
@@ -500,6 +500,8 @@ class AIAgent:
status_callback: callable = None,
max_tokens: int = None,
reasoning_config: Dict[str, Any] = None,
service_tier: str = None,
request_overrides: Dict[str, Any] = None,
prefill_messages: List[Dict[str, Any]] = None,
platform: str = None,
user_id: str = None,
@@ -622,6 +624,7 @@ class AIAgent:
self.tool_progress_callback = tool_progress_callback
self.tool_start_callback = tool_start_callback
self.tool_complete_callback = tool_complete_callback
self.suppress_status_output = False
self.thinking_callback = thinking_callback
self.reasoning_callback = reasoning_callback
self._reasoning_deltas_fired = False # Set by _fire_reasoning_delta, reset per API call
@@ -661,6 +664,8 @@ class AIAgent:
# Model response configuration
self.max_tokens = max_tokens # None = use model default
self.reasoning_config = reasoning_config # None = use default (medium for OpenRouter)
self.service_tier = service_tier
self.request_overrides = dict(request_overrides or {})
self.prefill_messages = prefill_messages or [] # Prefilled conversation turns
# Anthropic prompt caching: auto-enabled for Claude models via OpenRouter.
@@ -789,7 +794,7 @@ class AIAgent:
client_kwargs["default_headers"] = copilot_default_headers()
elif "api.kimi.com" in effective_base.lower():
client_kwargs["default_headers"] = {
"User-Agent": "KimiCLI/1.3",
"User-Agent": "KimiCLI/1.30.0",
}
elif "portal.qwen.ai" in effective_base.lower():
client_kwargs["default_headers"] = _qwen_portal_headers()
@@ -1460,7 +1465,14 @@ class AIAgent:
After the main response has been delivered and the remaining tool
calls are post-response housekeeping (``_mute_post_response``),
all non-forced output is suppressed.
``suppress_status_output`` is a stricter CLI automation mode used by
parseable single-query flows such as ``hermes chat -q``. In that mode,
all status/diagnostic prints routed through ``_vprint`` are suppressed
so stdout stays machine-readable.
"""
if getattr(self, "suppress_status_output", False):
return
if not force and getattr(self, "_mute_post_response", False):
return
if not force and self._has_stream_consumers() and not self._executing_tools:
@@ -1486,6 +1498,17 @@ class AIAgent:
except (AttributeError, ValueError, OSError):
return False
def _should_emit_quiet_tool_messages(self) -> bool:
"""Return True when quiet-mode tool summaries should print directly.
When the caller provides ``tool_progress_callback`` (for example the CLI
TUI or a gateway progress renderer), that callback owns progress display.
Emitting quiet-mode summary lines here duplicates progress and leaks tool
previews into flows that are expected to stay silent, such as
``hermes chat -q``.
"""
return self.quiet_mode and not self.tool_progress_callback
def _emit_status(self, message: str) -> None:
"""Emit a lifecycle status message to both CLI and gateway channels.
@@ -3324,7 +3347,7 @@ class AIAgent:
allowed_keys = {
"model", "instructions", "input", "tools", "store",
"reasoning", "include", "max_output_tokens", "temperature",
"tool_choice", "parallel_tool_calls", "prompt_cache_key",
"tool_choice", "parallel_tool_calls", "prompt_cache_key", "service_tier",
}
normalized: Dict[str, Any] = {
"model": model,
@@ -3342,6 +3365,9 @@ class AIAgent:
include = api_kwargs.get("include")
if isinstance(include, list):
normalized["include"] = include
service_tier = api_kwargs.get("service_tier")
if isinstance(service_tier, str) and service_tier.strip():
normalized["service_tier"] = service_tier.strip()
# Pass through max_output_tokens and temperature
max_output_tokens = api_kwargs.get("max_output_tokens")
@@ -4155,7 +4181,7 @@ class AIAgent:
self._client_kwargs["default_headers"] = copilot_default_headers()
elif "api.kimi.com" in normalized:
self._client_kwargs["default_headers"] = {"User-Agent": "KimiCLI/1.3"}
self._client_kwargs["default_headers"] = {"User-Agent": "KimiCLI/1.30.0"}
elif "portal.qwen.ai" in normalized:
self._client_kwargs["default_headers"] = _qwen_portal_headers()
else:
@@ -4407,7 +4433,17 @@ class AIAgent:
"""Stream a chat completions response."""
import httpx as _httpx
_base_timeout = float(os.getenv("HERMES_API_TIMEOUT", 1800.0))
_stream_read_timeout = float(os.getenv("HERMES_STREAM_READ_TIMEOUT", 60.0))
_stream_read_timeout = float(os.getenv("HERMES_STREAM_READ_TIMEOUT", 120.0))
# Local providers (Ollama, llama.cpp, vLLM) can take minutes for
# prefill on large contexts before producing the first token.
# Auto-increase the httpx read timeout unless the user explicitly
# overrode HERMES_STREAM_READ_TIMEOUT.
if _stream_read_timeout == 120.0 and self.base_url and is_local_endpoint(self.base_url):
_stream_read_timeout = _base_timeout
logger.debug(
"Local provider detected (%s) — stream read timeout raised to %.0fs",
self.base_url, _stream_read_timeout,
)
stream_kwargs = {
**api_kwargs,
"stream": True,
@@ -4565,20 +4601,31 @@ class AIAgent:
# Build mock response matching non-streaming shape
full_content = "".join(content_parts) or None
mock_tool_calls = None
has_truncated_tool_args = False
if tool_calls_acc:
mock_tool_calls = []
for idx in sorted(tool_calls_acc):
tc = tool_calls_acc[idx]
arguments = tc["function"]["arguments"]
if arguments and arguments.strip():
try:
json.loads(arguments)
except json.JSONDecodeError:
has_truncated_tool_args = True
mock_tool_calls.append(SimpleNamespace(
id=tc["id"],
type=tc["type"],
extra_content=tc.get("extra_content"),
function=SimpleNamespace(
name=tc["function"]["name"],
arguments=tc["function"]["arguments"],
arguments=arguments,
),
))
effective_finish_reason = finish_reason or "stop"
if has_truncated_tool_args:
effective_finish_reason = "length"
full_reasoning = "".join(reasoning_parts) or None
mock_message = SimpleNamespace(
role=role,
@@ -4589,7 +4636,7 @@ class AIAgent:
mock_choice = SimpleNamespace(
index=0,
message=mock_message,
finish_reason=finish_reason or "stop",
finish_reason=effective_finish_reason,
)
return SimpleNamespace(
id="stream-" + str(uuid.uuid4()),
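The truncated-argument check in the hunk above boils down to a small rule: if any accumulated tool-call argument string is non-empty but not valid JSON, report the finish reason as "length". A standalone sketch, where the function name and flat argument list are illustrative rather than the shape used in `run_agent.py`:

```python
import json
from typing import Iterable, Optional


def effective_finish_reason(
    finish_reason: Optional[str],
    tool_call_arguments: Iterable[str],
) -> str:
    """Return 'length' when any non-empty argument string fails to parse as JSON."""
    for arguments in tool_call_arguments:
        if arguments and arguments.strip():
            try:
                json.loads(arguments)
            except json.JSONDecodeError:
                # Stream ended mid-argument; treat it like an output-limit cutoff.
                return "length"
    return finish_reason or "stop"


print(effective_finish_reason(None, ['{"path": "a.txt"}']))
print(effective_finish_reason("tool_calls", ['{"path": "a.tx']))
print(effective_finish_reason("tool_calls", [""]))
```

Empty argument strings are skipped rather than flagged, so a tool call that takes no arguments does not change the finish reason.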
@@ -5096,6 +5143,7 @@ class AIAgent:
_TRANSIENT_TRANSPORT_ERRORS = frozenset({
"ReadTimeout", "ConnectTimeout", "PoolTimeout",
"ConnectError", "RemoteProtocolError",
"APIConnectionError", "APITimeoutError",
})
def _try_recover_primary_transport(
@@ -5419,6 +5467,7 @@ class AIAgent:
preserve_dots=self._anthropic_preserve_dots(),
context_length=ctx_len,
base_url=getattr(self, "_anthropic_base_url", None),
fast_mode=self.request_overrides.get("speed") == "fast",
)
if self.api_mode == "codex_responses":
@@ -5434,6 +5483,10 @@ class AIAgent:
"models.github.ai" in self.base_url.lower()
or "api.githubcopilot.com" in self.base_url.lower()
)
is_codex_backend = (
self.provider == "openai-codex"
or "chatgpt.com/backend-api/codex" in self.base_url.lower()
)
# Resolve reasoning effort: config > default (medium)
reasoning_effort = "medium"
@@ -5471,7 +5524,10 @@ class AIAgent:
elif not is_github_responses:
kwargs["include"] = []
if self.max_tokens is not None:
if self.request_overrides:
kwargs.update(self.request_overrides)
if self.max_tokens is not None and not is_codex_backend:
kwargs["max_output_tokens"] = self.max_tokens
return kwargs
@@ -5566,20 +5622,20 @@ class AIAgent:
if self.max_tokens is not None:
if not self._is_qwen_portal():
api_kwargs.update(self._max_tokens_param(self.max_tokens))
elif self._is_openrouter_url() and "claude" in (self.model or "").lower():
# OpenRouter translates requests to Anthropic's Messages API,
# which requires max_tokens as a mandatory field. When we omit
# it, OpenRouter picks a default that can be too low — the model
# spends its output budget on thinking and has almost nothing
# left for the actual response (especially large tool calls like
# write_file). Sending the model's real output limit ensures
# full capacity. Other providers handle the default fine.
elif (self._is_openrouter_url() or "nousresearch" in self._base_url_lower) and "claude" in (self.model or "").lower():
# OpenRouter and Nous Portal translate requests to Anthropic's
# Messages API, which requires max_tokens as a mandatory field.
# When we omit it, the proxy picks a default that can be too
# low — the model spends its output budget on thinking and has
# almost nothing left for the actual response (especially large
# tool calls like write_file). Sending the model's real output
# limit ensures full capacity.
try:
from agent.anthropic_adapter import _get_anthropic_max_output
_model_output_limit = _get_anthropic_max_output(self.model)
api_kwargs["max_tokens"] = _model_output_limit
except Exception:
pass # fail open — let OpenRouter pick its default
pass # fail open — let the proxy pick its default
extra_body = {}
@@ -5642,6 +5698,11 @@ class AIAgent:
if "x.ai" in self._base_url_lower and hasattr(self, "session_id") and self.session_id:
api_kwargs["extra_headers"] = {"x-grok-conv-id": self.session_id}
# Priority Processing / generic request overrides (e.g. service_tier).
# Applied last so overrides win over any defaults set above.
if self.request_overrides:
api_kwargs.update(self.request_overrides)
return api_kwargs
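Because the overrides are merged with `dict.update` after every default is set, an override replaces any earlier value for the same key and leaves other keys alone. A minimal illustration with made-up request values:

```python
# Illustrative kwargs; the real dict is assembled by _build_api_kwargs.
api_kwargs = {"model": "gpt-5.4", "max_tokens": 4096, "service_tier": "default"}
request_overrides = {"service_tier": "priority"}

# Applied last, so the override wins over the earlier default.
if request_overrides:
    api_kwargs.update(request_overrides)

print(api_kwargs)
```

The same ordering means an override can also replace keys such as `max_tokens` if a caller supplies them, so the overrides dict is best kept to the keys it is meant to control.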
def _supports_reasoning_extra_body(self) -> bool:
@@ -6347,7 +6408,7 @@ class AIAgent:
# Start spinner for CLI mode (skip when TUI handles tool progress)
spinner = None
if self.quiet_mode and not self.tool_progress_callback and self._should_start_quiet_spinner():
if self._should_emit_quiet_tool_messages() and self._should_start_quiet_spinner():
face = random.choice(KawaiiSpinner.KAWAII_WAITING)
spinner = KawaiiSpinner(f"{face} ⚡ running {num_tools} tools concurrently", spinner_type='dots', print_fn=self._print_fn)
spinner.start()
@@ -6397,7 +6458,7 @@ class AIAgent:
logging.debug(f"Tool result ({len(function_result)} chars): {function_result}")
# Print cute message per tool
if self.quiet_mode:
if self._should_emit_quiet_tool_messages():
cute_msg = _get_cute_tool_message_impl(name, args, tool_duration, result=function_result)
self._safe_print(f" {cute_msg}")
elif not self.quiet_mode:
@@ -6554,7 +6615,7 @@ class AIAgent:
store=self._todo_store,
)
tool_duration = time.time() - tool_start_time
if self.quiet_mode:
if self._should_emit_quiet_tool_messages():
self._vprint(f" {_get_cute_tool_message_impl('todo', function_args, tool_duration, result=function_result)}")
elif function_name == "session_search":
if not self._session_db:
@@ -6569,7 +6630,7 @@ class AIAgent:
current_session_id=self.session_id,
)
tool_duration = time.time() - tool_start_time
if self.quiet_mode:
if self._should_emit_quiet_tool_messages():
self._vprint(f" {_get_cute_tool_message_impl('session_search', function_args, tool_duration, result=function_result)}")
elif function_name == "memory":
target = function_args.get("target", "memory")
@@ -6582,7 +6643,7 @@ class AIAgent:
store=self._memory_store,
)
tool_duration = time.time() - tool_start_time
if self.quiet_mode:
if self._should_emit_quiet_tool_messages():
self._vprint(f" {_get_cute_tool_message_impl('memory', function_args, tool_duration, result=function_result)}")
elif function_name == "clarify":
from tools.clarify_tool import clarify_tool as _clarify_tool
@@ -6592,7 +6653,7 @@ class AIAgent:
callback=self.clarify_callback,
)
tool_duration = time.time() - tool_start_time
if self.quiet_mode:
if self._should_emit_quiet_tool_messages():
self._vprint(f" {_get_cute_tool_message_impl('clarify', function_args, tool_duration, result=function_result)}")
elif function_name == "delegate_task":
from tools.delegate_tool import delegate_task as _delegate_task
@@ -6603,7 +6664,7 @@ class AIAgent:
goal_preview = (function_args.get("goal") or "")[:30]
spinner_label = f"🔀 {goal_preview}" if goal_preview else "🔀 delegating"
spinner = None
if self.quiet_mode and not self.tool_progress_callback and self._should_start_quiet_spinner():
if self._should_emit_quiet_tool_messages() and self._should_start_quiet_spinner():
face = random.choice(KawaiiSpinner.KAWAII_WAITING)
spinner = KawaiiSpinner(f"{face} {spinner_label}", spinner_type='dots', print_fn=self._print_fn)
spinner.start()
@@ -6625,13 +6686,13 @@ class AIAgent:
cute_msg = _get_cute_tool_message_impl('delegate_task', function_args, tool_duration, result=_delegate_result)
if spinner:
spinner.stop(cute_msg)
elif self.quiet_mode:
elif self._should_emit_quiet_tool_messages():
self._vprint(f" {cute_msg}")
elif self._memory_manager and self._memory_manager.has_tool(function_name):
# Memory provider tools (hindsight_retain, honcho_search, etc.)
# These are not in the tool registry — route through MemoryManager.
spinner = None
if self.quiet_mode and not self.tool_progress_callback:
if self._should_emit_quiet_tool_messages() and self._should_start_quiet_spinner():
face = random.choice(KawaiiSpinner.KAWAII_WAITING)
emoji = _get_tool_emoji(function_name)
preview = _build_tool_preview(function_name, function_args) or function_name
@@ -6649,11 +6710,11 @@ class AIAgent:
cute_msg = _get_cute_tool_message_impl(function_name, function_args, tool_duration, result=_mem_result)
if spinner:
spinner.stop(cute_msg)
elif self.quiet_mode:
elif self._should_emit_quiet_tool_messages():
self._vprint(f" {cute_msg}")
elif self.quiet_mode:
spinner = None
if not self.tool_progress_callback:
if self._should_emit_quiet_tool_messages() and self._should_start_quiet_spinner():
face = random.choice(KawaiiSpinner.KAWAII_WAITING)
emoji = _get_tool_emoji(function_name)
preview = _build_tool_preview(function_name, function_args) or function_name
@@ -6676,7 +6737,7 @@ class AIAgent:
cute_msg = _get_cute_tool_message_impl(function_name, function_args, tool_duration, result=_spinner_result)
if spinner:
spinner.stop(cute_msg)
else:
elif self._should_emit_quiet_tool_messages():
self._vprint(f" {cute_msg}")
else:
try:
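The hunks above replace the repeated `self.quiet_mode` checks with a single `_should_emit_quiet_tool_messages()` predicate. Its body is not part of this diff. The sketch below is one plausible shape, assuming it combines the two conditions the old call sites tested (`quiet_mode` set and no `tool_progress_callback`). The `QuietGate` class is hypothetical and exists only for illustration.

```python
from typing import Callable, Optional


class QuietGate:
    """Hypothetical stand-in for the AIAgent attributes used by the check."""

    def __init__(self, quiet_mode: bool,
                 tool_progress_callback: Optional[Callable[..., None]] = None):
        self.quiet_mode = quiet_mode
        self.tool_progress_callback = tool_progress_callback

    def should_emit_quiet_tool_messages(self) -> bool:
        # Assumption: CLI-style tool messages are printed only in quiet mode,
        # and only when no callback (for example a TUI) reports tool progress.
        return self.quiet_mode and self.tool_progress_callback is None
```

Centralizing the condition means every call site (spinner start, per-tool message, fallback print) changes together if the rule is later refined.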
@@ -7300,6 +7361,7 @@ class AIAgent:
interrupted = False
codex_ack_continuations = 0
length_continue_retries = 0
truncated_tool_call_retries = 0
truncated_response_prefix = ""
compression_attempts = 0
_turn_exit_reason = "unknown" # Diagnostic: why the loop ended
@@ -7768,9 +7830,11 @@ class AIAgent:
# retries are pointless. Detect this early and give a
# targeted error instead of wasting 3 API calls.
_trunc_content = None
_trunc_has_tool_calls = False
if self.api_mode == "chat_completions":
_trunc_msg = response.choices[0].message if (hasattr(response, "choices") and response.choices) else None
_trunc_content = getattr(_trunc_msg, "content", None) if _trunc_msg else None
_trunc_has_tool_calls = bool(getattr(_trunc_msg, "tool_calls", None)) if _trunc_msg else False
elif self.api_mode == "anthropic_messages":
# Anthropic response.content is a list of blocks
_text_parts = []
@@ -7780,9 +7844,11 @@ class AIAgent:
_trunc_content = "\n".join(_text_parts) if _text_parts else None
_thinking_exhausted = (
_trunc_content is not None
and not self._has_content_after_think_block(_trunc_content)
) or _trunc_content is None
not _trunc_has_tool_calls and (
(_trunc_content is not None and not self._has_content_after_think_block(_trunc_content))
or _trunc_content is None
)
)
if _thinking_exhausted:
_exhaust_error = (
@@ -7858,6 +7924,34 @@ class AIAgent:
"error": "Response remained truncated after 3 continuation attempts",
}
if self.api_mode == "chat_completions":
assistant_message = response.choices[0].message
if assistant_message.tool_calls:
if truncated_tool_call_retries < 1:
truncated_tool_call_retries += 1
self._vprint(
f"{self.log_prefix}⚠️ Truncated tool call detected — retrying API call...",
force=True,
)
# Don't append the broken response to messages;
# just re-run the same API call from the current
# message state, giving the model another chance.
continue
self._vprint(
f"{self.log_prefix}⚠️ Truncated tool call response detected again — refusing to execute incomplete tool arguments.",
force=True,
)
self._cleanup_task_resources(effective_task_id)
self._persist_session(messages, conversation_history)
return {
"final_response": None,
"messages": messages,
"api_calls": api_call_count,
"completed": False,
"partial": True,
"error": "Response truncated due to output length limit",
}
# If we have prior messages, roll back to last complete state
if len(messages) > 1:
self._vprint(f"{self.log_prefix} ⏪ Rolling back to last complete assistant turn")
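The truncated tool call handling in this hunk follows a retry-once-then-refuse pattern. A minimal standalone sketch of that control flow is shown below. The response shape (a dict with `finish_reason` and `tool_calls` keys) and the helper name `call_with_truncation_retry` are assumptions for illustration, not the agent's real types.

```python
from typing import Callable, Optional


def call_with_truncation_retry(call_api: Callable[[], dict],
                               max_retries: int = 1) -> Optional[dict]:
    """Return a usable response, or None if tool calls stay truncated.

    Assumption: a response is truncated when finish_reason == "length".
    """
    retries = 0
    while True:
        response = call_api()
        truncated = (response.get("finish_reason") == "length"
                     and bool(response.get("tool_calls")))
        if not truncated:
            return response
        if retries < max_retries:
            retries += 1
            # Discard the broken response and repeat the same call.
            continue
        # Still truncated: refuse to execute incomplete tool arguments.
        return None
```

The broken response is never stored, so the retry starts from the same message state as the first attempt.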
@@ -8170,7 +8264,33 @@ class AIAgent:
if _err_body_str:
self._vprint(f"{self.log_prefix} 📋 Details: {_err_body_str}", force=True)
self._vprint(f"{self.log_prefix} ⏱️ Elapsed: {elapsed_time:.2f}s Context: {len(api_messages)} msgs, ~{approx_tokens:,} tokens")
# Actionable hint for OpenRouter "no tool endpoints" error.
# This fires regardless of whether fallback succeeds — the
# user needs to know WHY their model failed so they can fix
# their provider routing, not just silently fall back.
if (
self._is_openrouter_url()
and "support tool use" in error_msg
):
self._vprint(
f"{self.log_prefix} 💡 No OpenRouter providers for {_model} support tool calling with your current settings.",
force=True,
)
if self.providers_allowed:
self._vprint(
f"{self.log_prefix} Your provider_routing.only restriction is filtering out tool-capable providers.",
force=True,
)
self._vprint(
f"{self.log_prefix} Try removing the restriction or adding providers that support tools for this model.",
force=True,
)
self._vprint(
f"{self.log_prefix} Check which providers support tools: https://openrouter.ai/models/{_model}",
force=True,
)
# Check for interrupt before deciding to retry
if self._interrupt_requested:
self._vprint(f"{self.log_prefix}⚡ Interrupt detected during error handling, aborting retries.", force=True)
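The OpenRouter hint added in this hunk can be read as a pure function from the error text and routing settings to a list of messages. The sketch below restates it that way. The function name `openrouter_tool_hints` is hypothetical, and the `_is_openrouter_url()` check from the diff is left to the caller.

```python
from typing import List, Optional


def openrouter_tool_hints(model: str, error_msg: str,
                          providers_allowed: Optional[List[str]]) -> List[str]:
    """Build the hint lines shown when no provider supports tool use."""
    if "support tool use" not in error_msg:
        return []
    hints = [
        f"No OpenRouter providers for {model} support tool calling "
        "with your current settings."
    ]
    if providers_allowed:
        # A provider_routing.only restriction may be the cause.
        hints.append("Your provider_routing.only restriction is filtering "
                     "out tool-capable providers.")
        hints.append("Try removing the restriction or adding providers "
                     "that support tools for this model.")
    hints.append("Check which providers support tools: "
                 f"https://openrouter.ai/models/{model}")
    return hints
```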
@@ -8226,6 +8346,10 @@ class AIAgent:
approx_tokens=approx_tokens,
task_id=effective_task_id,
)
# Compression created a new session — clear history
# so _flush_messages_to_session_db writes compressed
# messages to the new session, not skipping them.
conversation_history = None
if len(messages) < original_len or old_ctx > _reduced_ctx:
self._emit_status(
f"🗜️ Context reduced to {_reduced_ctx:,} tokens "
@@ -8283,6 +8407,10 @@ class AIAgent:
messages, system_message, approx_tokens=approx_tokens,
task_id=effective_task_id,
)
# Compression created a new session — clear history
# so _flush_messages_to_session_db writes compressed
# messages to the new session, not skipping them.
conversation_history = None
if len(messages) < original_len:
self._emit_status(f"🗜️ Compressed {original_len} → {len(messages)} messages, retrying...")
@@ -8401,6 +8529,10 @@ class AIAgent:
messages, system_message, approx_tokens=approx_tokens,
task_id=effective_task_id,
)
# Compression created a new session — clear history
# so _flush_messages_to_session_db writes compressed
# messages to the new session, not skipping them.
conversation_history = None
if len(messages) < original_len or new_ctx and new_ctx < old_ctx:
if len(messages) < original_len:
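The three `conversation_history = None` additions share one rationale, stated in their comments: after compression, the flush must not skip the compressed messages. The sketch below illustrates why, under the assumption that the flush skips as many leading messages as the prior history contains. `messages_to_flush` is a hypothetical helper, not the real `_flush_messages_to_session_db`.

```python
from typing import List, Optional


def messages_to_flush(messages: List[dict],
                      conversation_history: Optional[List[dict]]) -> List[dict]:
    """Return the messages that still need to be written to the session DB."""
    # Assumption: messages already present in the prior history are skipped.
    skip = len(conversation_history) if conversation_history else 0
    return messages[skip:]
```

With a stale history of three messages and a compressed list of two, nothing would be written. Clearing the history to `None` makes both compressed messages eligible.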
@@ -9008,6 +9140,11 @@ class AIAgent:
self._execute_tool_calls(assistant_message, messages, effective_task_id, api_call_count)
# Reset per-turn retry counters after successful tool
# execution so a single truncation doesn't poison the
# entire conversation.
truncated_tool_call_retries = 0
# Signal that a paragraph break is needed before the next
# streamed text. We don't emit it immediately because
# multiple consecutive tool iterations would stack up
+250 -35
@@ -2,8 +2,8 @@
# ============================================================================
# Hermes Agent Installer
# ============================================================================
# Installation script for Linux and macOS.
# Uses uv for fast Python provisioning and package management.
# Installation script for Linux, macOS, and Android/Termux.
# Uses uv for desktop/server installs and Python's stdlib venv + pip on Termux.
#
# Usage:
# curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
@@ -117,6 +117,36 @@ log_error() {
echo -e "${RED}${NC} $1"
}
is_termux() {
[ -n "${TERMUX_VERSION:-}" ] || [[ "${PREFIX:-}" == *"com.termux/files/usr"* ]]
}
get_command_link_dir() {
if is_termux && [ -n "${PREFIX:-}" ]; then
echo "$PREFIX/bin"
else
echo "$HOME/.local/bin"
fi
}
get_command_link_display_dir() {
if is_termux && [ -n "${PREFIX:-}" ]; then
echo '$PREFIX/bin'
else
echo '~/.local/bin'
fi
}
get_hermes_command_path() {
local link_dir
link_dir="$(get_command_link_dir)"
if [ -x "$link_dir/hermes" ]; then
echo "$link_dir/hermes"
else
echo "hermes"
fi
}
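For readers following the Python side of this change set, the Termux detection and link-directory helpers above can be restated in Python. This is an illustrative translation of the shell logic only. The function names mirror the shell helpers, and passing the environment as a parameter is an assumption made to keep the sketch testable.

```python
import os
from typing import Mapping


def is_termux(env: Mapping[str, str] = os.environ) -> bool:
    """True when TERMUX_VERSION is set or PREFIX points into Termux."""
    return bool(env.get("TERMUX_VERSION")) or \
        "com.termux/files/usr" in env.get("PREFIX", "")


def get_command_link_dir(env: Mapping[str, str] = os.environ) -> str:
    """Directory where the hermes symlink is created."""
    prefix = env.get("PREFIX", "")
    if is_termux(env) and prefix:
        return f"{prefix}/bin"
    # Same fallback as the shell helper: $HOME/.local/bin
    return f"{env.get('HOME', '')}/.local/bin"
```

As in the shell version, a set `TERMUX_VERSION` with an empty `PREFIX` still falls back to `$HOME/.local/bin`.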
# ============================================================================
# System detection
# ============================================================================
@@ -124,12 +154,17 @@ log_error() {
detect_os() {
case "$(uname -s)" in
Linux*)
OS="linux"
if [ -f /etc/os-release ]; then
. /etc/os-release
DISTRO="$ID"
if is_termux; then
OS="android"
DISTRO="termux"
else
DISTRO="unknown"
OS="linux"
if [ -f /etc/os-release ]; then
. /etc/os-release
DISTRO="$ID"
else
DISTRO="unknown"
fi
fi
;;
Darwin*)
@@ -158,6 +193,12 @@ detect_os() {
# ============================================================================
install_uv() {
if [ "$DISTRO" = "termux" ]; then
log_info "Termux detected — using Python's stdlib venv + pip instead of uv"
UV_CMD=""
return 0
fi
log_info "Checking for uv package manager..."
# Check common locations for uv
@@ -209,6 +250,25 @@ install_uv() {
}
check_python() {
if [ "$DISTRO" = "termux" ]; then
log_info "Checking Termux Python..."
if command -v python >/dev/null 2>&1; then
PYTHON_PATH="$(command -v python)"
if "$PYTHON_PATH" -c 'import sys; raise SystemExit(0 if sys.version_info >= (3, 11) else 1)' 2>/dev/null; then
PYTHON_FOUND_VERSION=$($PYTHON_PATH --version 2>/dev/null)
log_success "Python found: $PYTHON_FOUND_VERSION"
return 0
fi
fi
log_info "Installing Python via pkg..."
pkg install -y python >/dev/null
PYTHON_PATH="$(command -v python)"
PYTHON_FOUND_VERSION=$($PYTHON_PATH --version 2>/dev/null)
log_success "Python installed: $PYTHON_FOUND_VERSION"
return 0
fi
log_info "Checking Python $PYTHON_VERSION..."
# Let uv handle Python — it can download and manage Python versions
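The Termux branch checks the interpreter with an inline `sys.version_info >= (3, 11)` comparison. The sketch below shows the same tuple comparison as a small function. The name `meets_minimum` is hypothetical.

```python
import sys
from typing import Sequence, Tuple


def meets_minimum(version: Sequence[int] = tuple(sys.version_info[:3]),
                  minimum: Tuple[int, int] = (3, 11)) -> bool:
    """Compare (major, minor) lexicographically against the minimum."""
    return tuple(version[:2]) >= minimum
```

Tuple comparison handles the major and minor components together, so a 3.10 interpreter is rejected while a 3.12 or 4.0 interpreter passes.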
@@ -243,6 +303,17 @@ check_git() {
fi
log_error "Git not found"
if [ "$DISTRO" = "termux" ]; then
log_info "Installing Git via pkg..."
pkg install -y git >/dev/null
if command -v git >/dev/null 2>&1; then
GIT_VERSION=$(git --version | awk '{print $3}')
log_success "Git $GIT_VERSION installed"
return 0
fi
fi
log_info "Please install Git:"
case "$OS" in
@@ -262,6 +333,9 @@ check_git() {
;;
esac
;;
android)
log_info " pkg install git"
;;
macos)
log_info " xcode-select --install"
log_info " Or: brew install git"
@@ -290,11 +364,29 @@ check_node() {
return 0
fi
log_info "Node.js not found — installing Node.js $NODE_VERSION LTS..."
if [ "$DISTRO" = "termux" ]; then
log_info "Node.js not found — installing Node.js via pkg..."
else
log_info "Node.js not found — installing Node.js $NODE_VERSION LTS..."
fi
install_node
}
install_node() {
if [ "$DISTRO" = "termux" ]; then
log_info "Installing Node.js via pkg..."
if pkg install -y nodejs >/dev/null; then
local installed_ver
installed_ver=$(node --version 2>/dev/null)
log_success "Node.js $installed_ver installed via pkg"
HAS_NODE=true
else
log_warn "Failed to install Node.js via pkg"
HAS_NODE=false
fi
return 0
fi
local arch=$(uname -m)
local node_arch
case "$arch" in
@@ -413,6 +505,30 @@ install_system_packages() {
need_ffmpeg=true
fi
# Termux always needs the Android build toolchain for the tested pip path,
# even when ripgrep/ffmpeg are already present.
if [ "$DISTRO" = "termux" ]; then
local termux_pkgs=(clang rust make pkg-config libffi openssl)
if [ "$need_ripgrep" = true ]; then
termux_pkgs+=("ripgrep")
fi
if [ "$need_ffmpeg" = true ]; then
termux_pkgs+=("ffmpeg")
fi
log_info "Installing Termux packages: ${termux_pkgs[*]}"
if pkg install -y "${termux_pkgs[@]}" >/dev/null; then
[ "$need_ripgrep" = true ] && HAS_RIPGREP=true && log_success "ripgrep installed"
[ "$need_ffmpeg" = true ] && HAS_FFMPEG=true && log_success "ffmpeg installed"
log_success "Termux build dependencies installed"
return 0
fi
log_warn "Could not auto-install all Termux packages"
log_info "Install manually: pkg install ${termux_pkgs[*]}"
return 0
fi
# Nothing to install — done
if [ "$need_ripgrep" = false ] && [ "$need_ffmpeg" = false ]; then
return 0
@@ -550,6 +666,9 @@ show_manual_install_hint() {
*) log_info " Use your package manager or visit the project homepage" ;;
esac
;;
android)
log_info " pkg install $pkg"
;;
macos) log_info " brew install $pkg" ;;
esac
}
@@ -646,6 +765,19 @@ setup_venv() {
return 0
fi
if [ "$DISTRO" = "termux" ]; then
log_info "Creating virtual environment with Termux Python..."
if [ -d "venv" ]; then
log_info "Virtual environment already exists, recreating..."
rm -rf venv
fi
"$PYTHON_PATH" -m venv venv
log_success "Virtual environment ready ($(./venv/bin/python --version 2>/dev/null))"
return 0
fi
log_info "Creating virtual environment with Python $PYTHON_VERSION..."
if [ -d "venv" ]; then
@@ -662,6 +794,46 @@ setup_venv() {
install_deps() {
log_info "Installing dependencies..."
if [ "$DISTRO" = "termux" ]; then
if [ "$USE_VENV" = true ]; then
export VIRTUAL_ENV="$INSTALL_DIR/venv"
PIP_PYTHON="$INSTALL_DIR/venv/bin/python"
else
PIP_PYTHON="$PYTHON_PATH"
fi
if [ -z "${ANDROID_API_LEVEL:-}" ]; then
ANDROID_API_LEVEL="$(getprop ro.build.version.sdk 2>/dev/null || true)"
if [ -z "$ANDROID_API_LEVEL" ]; then
ANDROID_API_LEVEL=24
fi
export ANDROID_API_LEVEL
log_info "Using ANDROID_API_LEVEL=$ANDROID_API_LEVEL for Android wheel builds"
fi
"$PIP_PYTHON" -m pip install --upgrade pip setuptools wheel >/dev/null
if ! "$PIP_PYTHON" -m pip install -e '.[termux]' -c constraints-termux.txt; then
log_warn "Termux feature install (.[termux]) failed, trying base install..."
if ! "$PIP_PYTHON" -m pip install -e '.' -c constraints-termux.txt; then
log_error "Package installation failed on Termux."
log_info "Ensure these packages are installed: pkg install clang rust make pkg-config libffi openssl"
log_info "Then re-run: cd $INSTALL_DIR && python -m pip install -e '.[termux]' -c constraints-termux.txt"
exit 1
fi
fi
log_success "Main package installed"
log_info "Termux note: browser/WhatsApp tooling is not installed by default; see the Termux guide for optional follow-up steps."
if [ -d "tinker-atropos" ] && [ -f "tinker-atropos/pyproject.toml" ]; then
log_info "tinker-atropos submodule found — skipping install (optional, for RL training)"
log_info " To install later: $PIP_PYTHON -m pip install -e \"./tinker-atropos\""
fi
log_success "All dependencies installed"
return 0
fi
if [ "$USE_VENV" = true ]; then
# Tell uv to install into our venv (no need to activate)
export VIRTUAL_ENV="$INSTALL_DIR/venv"
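The `ANDROID_API_LEVEL` resolution in the Termux branch has three tiers: an existing environment value, the `getprop ro.build.version.sdk` output, then a default of 24. The sketch below restates that order in Python. The function name and the idea of passing the `getprop` output in as a string are assumptions for illustration.

```python
from typing import Mapping, Optional


def resolve_android_api_level(env: Mapping[str, str],
                              getprop_output: Optional[str],
                              default: str = "24") -> str:
    """Pick the API level used for Android wheel builds."""
    existing = env.get("ANDROID_API_LEVEL", "")
    if existing:
        # A value supplied by the user always wins.
        return existing
    detected = (getprop_output or "").strip()
    # Fall back to the default when getprop is missing or prints nothing.
    return detected or default
```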
@@ -743,19 +915,35 @@ setup_path() {
if [ ! -x "$HERMES_BIN" ]; then
log_warn "hermes entry point not found at $HERMES_BIN"
log_info "This usually means the pip install didn't complete successfully."
log_info "Try: cd $INSTALL_DIR && uv pip install -e '.[all]'"
if [ "$DISTRO" = "termux" ]; then
log_info "Try: cd $INSTALL_DIR && python -m pip install -e '.[termux]' -c constraints-termux.txt"
else
log_info "Try: cd $INSTALL_DIR && uv pip install -e '.[all]'"
fi
return 0
fi
# Create symlink in ~/.local/bin (standard user binary location, usually on PATH)
mkdir -p "$HOME/.local/bin"
ln -sf "$HERMES_BIN" "$HOME/.local/bin/hermes"
log_success "Symlinked hermes → ~/.local/bin/hermes"
local command_link_dir
local command_link_display_dir
command_link_dir="$(get_command_link_dir)"
command_link_display_dir="$(get_command_link_display_dir)"
# Create a user-facing shim for the hermes command.
mkdir -p "$command_link_dir"
ln -sf "$HERMES_BIN" "$command_link_dir/hermes"
log_success "Symlinked hermes → $command_link_display_dir/hermes"
if [ "$DISTRO" = "termux" ]; then
export PATH="$command_link_dir:$PATH"
log_info "$command_link_display_dir is the native Termux command path"
log_success "hermes command ready"
return 0
fi
# Check if ~/.local/bin is on PATH; if not, add it to shell config.
# Detect the user's actual login shell (not the shell running this script,
# which is always bash when piped from curl).
if ! echo "$PATH" | tr ':' '\n' | grep -q "^$HOME/.local/bin$"; then
if ! echo "$PATH" | tr ':' '\n' | grep -q "^$command_link_dir$"; then
SHELL_CONFIGS=()
LOGIN_SHELL="$(basename "${SHELL:-/bin/bash}")"
case "$LOGIN_SHELL" in
@@ -801,7 +989,7 @@ setup_path() {
fi
# Export for current session so hermes works immediately
export PATH="$HOME/.local/bin:$PATH"
export PATH="$command_link_dir:$PATH"
log_success "hermes command ready"
}
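`setup_path` decides whether to edit shell config files by testing whether the link directory is already a `PATH` entry. A Python restatement of that membership test is sketched below. The function name `dir_on_path` is hypothetical.

```python
def dir_on_path(directory: str, path_value: str) -> bool:
    """True when directory is an exact entry of a colon-separated PATH."""
    return directory in path_value.split(":")
```

The comparison here is an exact string match per entry. The shell version passes the directory to `grep` as a regular expression, so a `.` in the directory name matches any character there.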
@@ -878,6 +1066,13 @@ install_node_deps() {
return 0
fi
if [ "$DISTRO" = "termux" ]; then
log_info "Skipping automatic Node/browser dependency setup on Termux"
log_info "Browser automation and WhatsApp bridge are not part of the tested Termux install path yet."
log_info "If you want to experiment manually later, run: cd $INSTALL_DIR && npm install"
return 0
fi
if [ -f "$INSTALL_DIR/package.json" ]; then
log_info "Installing Node.js dependencies (browser tools)..."
cd "$INSTALL_DIR"
@@ -992,8 +1187,7 @@ maybe_start_gateway() {
read -p "Pair WhatsApp now? [Y/n] " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]] || [[ -z $REPLY ]]; then
HERMES_CMD="$HOME/.local/bin/hermes"
[ ! -x "$HERMES_CMD" ] && HERMES_CMD="hermes"
HERMES_CMD="$(get_hermes_command_path)"
$HERMES_CMD whatsapp || true
fi
else
@@ -1007,16 +1201,17 @@ maybe_start_gateway() {
fi
echo ""
read -p "Would you like to install the gateway as a background service? [Y/n] " -n 1 -r < /dev/tty
if [ "$DISTRO" = "termux" ]; then
read -p "Would you like to start the gateway in the background? [Y/n] " -n 1 -r < /dev/tty
else
read -p "Would you like to install the gateway as a background service? [Y/n] " -n 1 -r < /dev/tty
fi
echo
if [[ $REPLY =~ ^[Yy]$ ]] || [[ -z $REPLY ]]; then
HERMES_CMD="$HOME/.local/bin/hermes"
if [ ! -x "$HERMES_CMD" ]; then
HERMES_CMD="hermes"
fi
HERMES_CMD="$(get_hermes_command_path)"
if command -v systemctl &> /dev/null; then
if [ "$DISTRO" != "termux" ] && command -v systemctl &> /dev/null; then
log_info "Installing systemd service..."
if $HERMES_CMD gateway install 2>/dev/null; then
log_success "Gateway service installed"
@@ -1029,12 +1224,19 @@ maybe_start_gateway() {
log_warn "Systemd install failed. You can start manually: hermes gateway"
fi
else
log_info "systemd not available — starting gateway in background..."
if [ "$DISTRO" = "termux" ]; then
log_info "Termux detected — starting gateway in best-effort background mode..."
else
log_info "systemd not available — starting gateway in background..."
fi
nohup $HERMES_CMD gateway > "$HERMES_HOME/logs/gateway.log" 2>&1 &
GATEWAY_PID=$!
log_success "Gateway started (PID $GATEWAY_PID). Logs: ~/.hermes/logs/gateway.log"
log_info "To stop: kill $GATEWAY_PID"
log_info "To restart later: hermes gateway"
if [ "$DISTRO" = "termux" ]; then
log_warn "Android may stop background processes when Termux is suspended or the system reclaims resources."
fi
fi
else
log_info "Skipped. Start the gateway later with: hermes gateway"
@@ -1073,24 +1275,33 @@ print_success() {
echo -e "${CYAN}─────────────────────────────────────────────────────────${NC}"
echo ""
echo -e "${YELLOW}⚡ Reload your shell to use 'hermes' command:${NC}"
echo ""
LOGIN_SHELL="$(basename "${SHELL:-/bin/bash}")"
if [ "$LOGIN_SHELL" = "zsh" ]; then
echo " source ~/.zshrc"
elif [ "$LOGIN_SHELL" = "bash" ]; then
echo " source ~/.bashrc"
if [ "$DISTRO" = "termux" ]; then
echo -e "${YELLOW}⚡ 'hermes' was linked into $(get_command_link_display_dir), which is already on PATH in Termux.${NC}"
echo ""
else
echo " source ~/.bashrc # or ~/.zshrc"
echo -e "${YELLOW}⚡ Reload your shell to use 'hermes' command:${NC}"
echo ""
LOGIN_SHELL="$(basename "${SHELL:-/bin/bash}")"
if [ "$LOGIN_SHELL" = "zsh" ]; then
echo " source ~/.zshrc"
elif [ "$LOGIN_SHELL" = "bash" ]; then
echo " source ~/.bashrc"
else
echo " source ~/.bashrc # or ~/.zshrc"
fi
echo ""
fi
echo ""
# Show Node.js warning if auto-install failed
if [ "$HAS_NODE" = false ]; then
echo -e "${YELLOW}"
echo "Note: Node.js could not be installed automatically."
echo "Browser tools need Node.js. Install manually:"
echo " https://nodejs.org/en/download/"
if [ "$DISTRO" = "termux" ]; then
echo " pkg install nodejs"
else
echo " https://nodejs.org/en/download/"
fi
echo -e "${NC}"
fi
@@ -1099,7 +1310,11 @@ print_success() {
echo -e "${YELLOW}"
echo "Note: ripgrep (rg) was not found. File search will use"
echo "grep as a fallback. For faster search in large codebases,"
echo "install ripgrep: sudo apt install ripgrep (or brew install ripgrep)"
if [ "$DISTRO" = "termux" ]; then
echo "install ripgrep: pkg install ripgrep"
else
echo "install ripgrep: sudo apt install ripgrep (or brew install ripgrep)"
fi
echo -e "${NC}"
fi
}
+212 -120
@@ -3,17 +3,17 @@
# Hermes Agent Setup Script
# ============================================================================
# Quick setup for developers who cloned the repo manually.
# Uses uv for fast Python provisioning and package management.
# Uses uv for desktop/server setup and Python's stdlib venv + pip on Termux.
#
# Usage:
# ./setup-hermes.sh
#
# This script:
# 1. Installs uv if not present
# 2. Creates a virtual environment with Python 3.11 via uv
# 3. Installs all dependencies (main package + submodules)
# 1. Detects desktop/server vs Android/Termux setup path
# 2. Creates a Python 3.11 virtual environment
# 3. Installs the appropriate dependency set for the platform
# 4. Creates .env from template (if not exists)
# 5. Symlinks the 'hermes' CLI command into ~/.local/bin
# 5. Symlinks the 'hermes' CLI command into a user-facing bin dir
# 6. Runs the setup wizard (optional)
# ============================================================================
@@ -31,6 +31,26 @@ cd "$SCRIPT_DIR"
PYTHON_VERSION="3.11"
is_termux() {
[ -n "${TERMUX_VERSION:-}" ] || [[ "${PREFIX:-}" == *"com.termux/files/usr"* ]]
}
get_command_link_dir() {
if is_termux && [ -n "${PREFIX:-}" ]; then
echo "$PREFIX/bin"
else
echo "$HOME/.local/bin"
fi
}
get_command_link_display_dir() {
if is_termux && [ -n "${PREFIX:-}" ]; then
echo '$PREFIX/bin'
else
echo '~/.local/bin'
fi
}
echo ""
echo -e "${CYAN}⚕ Hermes Agent Setup${NC}"
echo ""
@@ -42,36 +62,40 @@ echo ""
echo -e "${CYAN}${NC} Checking for uv..."
UV_CMD=""
if command -v uv &> /dev/null; then
UV_CMD="uv"
elif [ -x "$HOME/.local/bin/uv" ]; then
UV_CMD="$HOME/.local/bin/uv"
elif [ -x "$HOME/.cargo/bin/uv" ]; then
UV_CMD="$HOME/.cargo/bin/uv"
fi
if [ -n "$UV_CMD" ]; then
UV_VERSION=$($UV_CMD --version 2>/dev/null)
echo -e "${GREEN}${NC} uv found ($UV_VERSION)"
if is_termux; then
echo -e "${CYAN}${NC} Termux detected — using Python's stdlib venv + pip instead of uv"
else
echo -e "${CYAN}${NC} Installing uv..."
if curl -LsSf https://astral.sh/uv/install.sh | sh 2>/dev/null; then
if [ -x "$HOME/.local/bin/uv" ]; then
UV_CMD="$HOME/.local/bin/uv"
elif [ -x "$HOME/.cargo/bin/uv" ]; then
UV_CMD="$HOME/.cargo/bin/uv"
fi
if [ -n "$UV_CMD" ]; then
UV_VERSION=$($UV_CMD --version 2>/dev/null)
echo -e "${GREEN}${NC} uv installed ($UV_VERSION)"
if command -v uv &> /dev/null; then
UV_CMD="uv"
elif [ -x "$HOME/.local/bin/uv" ]; then
UV_CMD="$HOME/.local/bin/uv"
elif [ -x "$HOME/.cargo/bin/uv" ]; then
UV_CMD="$HOME/.cargo/bin/uv"
fi
if [ -n "$UV_CMD" ]; then
UV_VERSION=$($UV_CMD --version 2>/dev/null)
echo -e "${GREEN}${NC} uv found ($UV_VERSION)"
else
echo -e "${CYAN}${NC} Installing uv..."
if curl -LsSf https://astral.sh/uv/install.sh | sh 2>/dev/null; then
if [ -x "$HOME/.local/bin/uv" ]; then
UV_CMD="$HOME/.local/bin/uv"
elif [ -x "$HOME/.cargo/bin/uv" ]; then
UV_CMD="$HOME/.cargo/bin/uv"
fi
if [ -n "$UV_CMD" ]; then
UV_VERSION=$($UV_CMD --version 2>/dev/null)
echo -e "${GREEN}${NC} uv installed ($UV_VERSION)"
else
echo -e "${RED}${NC} uv installed but not found. Add ~/.local/bin to PATH and retry."
exit 1
fi
else
echo -e "${RED}${NC} uv installed but not found. Add ~/.local/bin to PATH and retry."
echo -e "${RED}${NC} Failed to install uv. Visit https://docs.astral.sh/uv/"
exit 1
fi
else
echo -e "${RED}${NC} Failed to install uv. Visit https://docs.astral.sh/uv/"
exit 1
fi
fi
@@ -81,16 +105,34 @@ fi
echo -e "${CYAN}${NC} Checking Python $PYTHON_VERSION..."
if $UV_CMD python find "$PYTHON_VERSION" &> /dev/null; then
PYTHON_PATH=$($UV_CMD python find "$PYTHON_VERSION")
PYTHON_FOUND_VERSION=$($PYTHON_PATH --version 2>/dev/null)
echo -e "${GREEN}${NC} $PYTHON_FOUND_VERSION found"
if is_termux; then
if command -v python >/dev/null 2>&1; then
PYTHON_PATH="$(command -v python)"
if "$PYTHON_PATH" -c 'import sys; raise SystemExit(0 if sys.version_info >= (3, 11) else 1)' 2>/dev/null; then
PYTHON_FOUND_VERSION=$($PYTHON_PATH --version 2>/dev/null)
echo -e "${GREEN}${NC} $PYTHON_FOUND_VERSION found"
else
echo -e "${RED}${NC} Termux Python must be 3.11+"
echo " Run: pkg install python"
exit 1
fi
else
echo -e "${RED}${NC} Python not found in Termux"
echo " Run: pkg install python"
exit 1
fi
else
echo -e "${CYAN}${NC} Python $PYTHON_VERSION not found, installing via uv..."
$UV_CMD python install "$PYTHON_VERSION"
PYTHON_PATH=$($UV_CMD python find "$PYTHON_VERSION")
PYTHON_FOUND_VERSION=$($PYTHON_PATH --version 2>/dev/null)
echo -e "${GREEN}${NC} $PYTHON_FOUND_VERSION installed"
if $UV_CMD python find "$PYTHON_VERSION" &> /dev/null; then
PYTHON_PATH=$($UV_CMD python find "$PYTHON_VERSION")
PYTHON_FOUND_VERSION=$($PYTHON_PATH --version 2>/dev/null)
echo -e "${GREEN}${NC} $PYTHON_FOUND_VERSION found"
else
echo -e "${CYAN}${NC} Python $PYTHON_VERSION not found, installing via uv..."
$UV_CMD python install "$PYTHON_VERSION"
PYTHON_PATH=$($UV_CMD python find "$PYTHON_VERSION")
PYTHON_FOUND_VERSION=$($PYTHON_PATH --version 2>/dev/null)
echo -e "${GREEN}${NC} $PYTHON_FOUND_VERSION installed"
fi
fi
# ============================================================================
@@ -104,11 +146,16 @@ if [ -d "venv" ]; then
rm -rf venv
fi
$UV_CMD venv venv --python "$PYTHON_VERSION"
echo -e "${GREEN}${NC} venv created (Python $PYTHON_VERSION)"
if is_termux; then
"$PYTHON_PATH" -m venv venv
echo -e "${GREEN}${NC} venv created with stdlib venv"
else
$UV_CMD venv venv --python "$PYTHON_VERSION"
echo -e "${GREEN}${NC} venv created (Python $PYTHON_VERSION)"
fi
# Tell uv to install into this venv (no activation needed for uv)
export VIRTUAL_ENV="$SCRIPT_DIR/venv"
SETUP_PYTHON="$SCRIPT_DIR/venv/bin/python"
# ============================================================================
# Dependencies
@@ -116,19 +163,34 @@ export VIRTUAL_ENV="$SCRIPT_DIR/venv"
echo -e "${CYAN}${NC} Installing dependencies..."
# Prefer uv sync with lockfile (hash-verified installs) when available,
# fall back to pip install for compatibility or when lockfile is stale.
if [ -f "uv.lock" ]; then
echo -e "${CYAN}${NC} Using uv.lock for hash-verified installation..."
UV_PROJECT_ENVIRONMENT="$SCRIPT_DIR/venv" $UV_CMD sync --all-extras --locked 2>/dev/null && \
echo -e "${GREEN}${NC} Dependencies installed (lockfile verified)" || {
echo -e "${YELLOW}${NC} Lockfile install failed (may be outdated), falling back to pip install..."
if is_termux; then
export ANDROID_API_LEVEL="$(getprop ro.build.version.sdk 2>/dev/null || printf '%s' "${ANDROID_API_LEVEL:-}")"
echo -e "${CYAN}${NC} Termux detected — installing the tested Android bundle"
"$SETUP_PYTHON" -m pip install --upgrade pip setuptools wheel
if [ -f "constraints-termux.txt" ]; then
"$SETUP_PYTHON" -m pip install -e ".[termux]" -c constraints-termux.txt || {
echo -e "${YELLOW}${NC} Termux bundle install failed, falling back to base install..."
"$SETUP_PYTHON" -m pip install -e "." -c constraints-termux.txt
}
else
"$SETUP_PYTHON" -m pip install -e ".[termux]" || "$SETUP_PYTHON" -m pip install -e "."
fi
echo -e "${GREEN}${NC} Dependencies installed"
else
# Prefer uv sync with lockfile (hash-verified installs) when available,
# fall back to pip install for compatibility or when lockfile is stale.
if [ -f "uv.lock" ]; then
echo -e "${CYAN}${NC} Using uv.lock for hash-verified installation..."
UV_PROJECT_ENVIRONMENT="$SCRIPT_DIR/venv" $UV_CMD sync --all-extras --locked 2>/dev/null && \
echo -e "${GREEN}${NC} Dependencies installed (lockfile verified)" || {
echo -e "${YELLOW}${NC} Lockfile install failed (may be outdated), falling back to pip install..."
$UV_CMD pip install -e ".[all]" || $UV_CMD pip install -e "."
echo -e "${GREEN}${NC} Dependencies installed"
}
else
$UV_CMD pip install -e ".[all]" || $UV_CMD pip install -e "."
echo -e "${GREEN}${NC} Dependencies installed"
}
else
$UV_CMD pip install -e ".[all]" || $UV_CMD pip install -e "."
echo -e "${GREEN}${NC} Dependencies installed"
fi
fi
# ============================================================================
@@ -138,7 +200,9 @@ fi
echo -e "${CYAN}${NC} Installing optional submodules..."
# tinker-atropos (RL training backend)
if [ -d "tinker-atropos" ] && [ -f "tinker-atropos/pyproject.toml" ]; then
if is_termux; then
echo -e "${CYAN}${NC} Skipping tinker-atropos on Termux (not part of the tested Android path)"
elif [ -d "tinker-atropos" ] && [ -f "tinker-atropos/pyproject.toml" ]; then
$UV_CMD pip install -e "./tinker-atropos" && \
echo -e "${GREEN}${NC} tinker-atropos installed" || \
echo -e "${YELLOW}${NC} tinker-atropos install failed (RL tools may not work)"
@@ -160,34 +224,42 @@ else
echo
if [[ $REPLY =~ ^[Yy]$ ]] || [[ -z $REPLY ]]; then
INSTALLED=false
# Check if sudo is available
if command -v sudo &> /dev/null && sudo -n true 2>/dev/null; then
if command -v apt &> /dev/null; then
sudo apt install -y ripgrep && INSTALLED=true
elif command -v dnf &> /dev/null; then
sudo dnf install -y ripgrep && INSTALLED=true
if is_termux; then
pkg install -y ripgrep && INSTALLED=true
else
# Check if sudo is available
if command -v sudo &> /dev/null && sudo -n true 2>/dev/null; then
if command -v apt &> /dev/null; then
sudo apt install -y ripgrep && INSTALLED=true
elif command -v dnf &> /dev/null; then
sudo dnf install -y ripgrep && INSTALLED=true
fi
fi
# Try brew (no sudo needed)
if [ "$INSTALLED" = false ] && command -v brew &> /dev/null; then
brew install ripgrep && INSTALLED=true
fi
# Try cargo (no sudo needed)
if [ "$INSTALLED" = false ] && command -v cargo &> /dev/null; then
echo -e "${CYAN}${NC} Trying cargo install (no sudo required)..."
cargo install ripgrep && INSTALLED=true
fi
fi
# Try brew (no sudo needed)
if [ "$INSTALLED" = false ] && command -v brew &> /dev/null; then
brew install ripgrep && INSTALLED=true
fi
# Try cargo (no sudo needed)
if [ "$INSTALLED" = false ] && command -v cargo &> /dev/null; then
echo -e "${CYAN}${NC} Trying cargo install (no sudo required)..."
cargo install ripgrep && INSTALLED=true
fi
if [ "$INSTALLED" = true ]; then
echo -e "${GREEN}${NC} ripgrep installed"
else
echo -e "${YELLOW}${NC} Auto-install failed. Install options:"
echo " sudo apt install ripgrep # Debian/Ubuntu"
echo " brew install ripgrep # macOS"
echo " cargo install ripgrep # With Rust (no sudo)"
if is_termux; then
echo " pkg install ripgrep # Termux / Android"
else
echo " sudo apt install ripgrep # Debian/Ubuntu"
echo " brew install ripgrep # macOS"
echo " cargo install ripgrep # With Rust (no sudo)"
fi
echo " https://github.com/BurntSushi/ripgrep#installation"
fi
fi
@@ -207,49 +279,56 @@ else
fi
# ============================================================================
# PATH setup — symlink hermes into ~/.local/bin
# PATH setup — symlink hermes into a user-facing bin dir
# ============================================================================
echo -e "${CYAN}${NC} Setting up hermes command..."
HERMES_BIN="$SCRIPT_DIR/venv/bin/hermes"
mkdir -p "$HOME/.local/bin"
ln -sf "$HERMES_BIN" "$HOME/.local/bin/hermes"
echo -e "${GREEN}${NC} Symlinked hermes → ~/.local/bin/hermes"
COMMAND_LINK_DIR="$(get_command_link_dir)"
COMMAND_LINK_DISPLAY_DIR="$(get_command_link_display_dir)"
mkdir -p "$COMMAND_LINK_DIR"
ln -sf "$HERMES_BIN" "$COMMAND_LINK_DIR/hermes"
echo -e "${GREEN}${NC} Symlinked hermes → $COMMAND_LINK_DISPLAY_DIR/hermes"
# Determine the appropriate shell config file
SHELL_CONFIG=""
if [[ "$SHELL" == *"zsh"* ]]; then
SHELL_CONFIG="$HOME/.zshrc"
elif [[ "$SHELL" == *"bash"* ]]; then
SHELL_CONFIG="$HOME/.bashrc"
[ ! -f "$SHELL_CONFIG" ] && SHELL_CONFIG="$HOME/.bash_profile"
if is_termux; then
export PATH="$COMMAND_LINK_DIR:$PATH"
echo -e "${GREEN}${NC} $COMMAND_LINK_DISPLAY_DIR is already on PATH in Termux"
else
# Fallback to checking existing files
if [ -f "$HOME/.zshrc" ]; then
# Determine the appropriate shell config file
SHELL_CONFIG=""
if [[ "$SHELL" == *"zsh"* ]]; then
SHELL_CONFIG="$HOME/.zshrc"
elif [ -f "$HOME/.bashrc" ]; then
elif [[ "$SHELL" == *"bash"* ]]; then
SHELL_CONFIG="$HOME/.bashrc"
elif [ -f "$HOME/.bash_profile" ]; then
SHELL_CONFIG="$HOME/.bash_profile"
fi
fi
if [ -n "$SHELL_CONFIG" ]; then
# Touch the file just in case it doesn't exist yet but was selected
touch "$SHELL_CONFIG" 2>/dev/null || true
if ! echo "$PATH" | tr ':' '\n' | grep -q "^$HOME/.local/bin$"; then
if ! grep -q '\.local/bin' "$SHELL_CONFIG" 2>/dev/null; then
echo "" >> "$SHELL_CONFIG"
echo "# Hermes Agent — ensure ~/.local/bin is on PATH" >> "$SHELL_CONFIG"
echo 'export PATH="$HOME/.local/bin:$PATH"' >> "$SHELL_CONFIG"
echo -e "${GREEN}${NC} Added ~/.local/bin to PATH in $SHELL_CONFIG"
else
echo -e "${GREEN}${NC} ~/.local/bin already in $SHELL_CONFIG"
fi
[ ! -f "$SHELL_CONFIG" ] && SHELL_CONFIG="$HOME/.bash_profile"
else
echo -e "${GREEN}${NC} ~/.local/bin already on PATH"
# Fallback to checking existing files
if [ -f "$HOME/.zshrc" ]; then
SHELL_CONFIG="$HOME/.zshrc"
elif [ -f "$HOME/.bashrc" ]; then
SHELL_CONFIG="$HOME/.bashrc"
elif [ -f "$HOME/.bash_profile" ]; then
SHELL_CONFIG="$HOME/.bash_profile"
fi
fi
if [ -n "$SHELL_CONFIG" ]; then
# Touch the file just in case it doesn't exist yet but was selected
touch "$SHELL_CONFIG" 2>/dev/null || true
if ! echo "$PATH" | tr ':' '\n' | grep -q "^$HOME/.local/bin$"; then
if ! grep -q '\.local/bin' "$SHELL_CONFIG" 2>/dev/null; then
echo "" >> "$SHELL_CONFIG"
echo "# Hermes Agent — ensure ~/.local/bin is on PATH" >> "$SHELL_CONFIG"
echo 'export PATH="$HOME/.local/bin:$PATH"' >> "$SHELL_CONFIG"
echo -e "${GREEN}${NC} Added ~/.local/bin to PATH in $SHELL_CONFIG"
else
echo -e "${GREEN}${NC} ~/.local/bin already in $SHELL_CONFIG"
fi
else
echo -e "${GREEN}${NC} ~/.local/bin already on PATH"
fi
fi
fi
@@ -281,18 +360,31 @@ echo -e "${GREEN}✓ Setup complete!${NC}"
echo ""
echo "Next steps:"
echo ""
echo " 1. Reload your shell:"
echo " source $SHELL_CONFIG"
echo ""
echo " 2. Run the setup wizard to configure API keys:"
echo " hermes setup"
echo ""
echo " 3. Start chatting:"
echo " hermes"
echo ""
if is_termux; then
echo " 1. Run the setup wizard to configure API keys:"
echo " hermes setup"
echo ""
echo " 2. Start chatting:"
echo " hermes"
echo ""
else
echo " 1. Reload your shell:"
echo " source $SHELL_CONFIG"
echo ""
echo " 2. Run the setup wizard to configure API keys:"
echo " hermes setup"
echo ""
echo " 3. Start chatting:"
echo " hermes"
echo ""
fi
echo "Other commands:"
echo " hermes status # Check configuration"
echo " hermes gateway install # Install gateway service (messaging + cron)"
if is_termux; then
echo " hermes gateway # Run gateway in foreground"
else
echo " hermes gateway install # Install gateway service (messaging + cron)"
fi
echo " hermes cron list # View scheduled jobs"
echo " hermes doctor # Diagnose issues"
echo ""
+31
@@ -410,6 +410,37 @@ class TestPrompt:
update = last_call[1].get("update") or last_call[0][1]
assert update.session_update == "agent_message_chunk"
@pytest.mark.asyncio
async def test_prompt_populates_usage_from_top_level_run_conversation_fields(self, agent):
"""ACP should map top-level token fields into PromptResponse.usage."""
new_resp = await agent.new_session(cwd=".")
state = agent.session_manager.get_session(new_resp.session_id)
state.agent.run_conversation = MagicMock(return_value={
"final_response": "usage attached",
"messages": [],
"prompt_tokens": 123,
"completion_tokens": 45,
"total_tokens": 168,
"reasoning_tokens": 7,
"cache_read_tokens": 11,
})
mock_conn = MagicMock(spec=acp.Client)
mock_conn.session_update = AsyncMock()
agent._conn = mock_conn
prompt = [TextContentBlock(type="text", text="show usage")]
resp = await agent.prompt(prompt=prompt, session_id=new_resp.session_id)
assert isinstance(resp, PromptResponse)
assert resp.usage is not None
assert resp.usage.input_tokens == 123
assert resp.usage.output_tokens == 45
assert resp.usage.total_tokens == 168
assert resp.usage.thought_tokens == 7
assert resp.usage.cached_read_tokens == 11
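The test above pins down a field-by-field mapping from the `run_conversation` result dict to `PromptResponse.usage`. A minimal sketch of that mapping follows. `UsageSketch` and `usage_from_result` are hypothetical names used only for illustration (the real usage type comes from the `acp` package and is not reproduced here), and returning `None` when no token counts are present is an assumption, not something the test confirms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UsageSketch:
    """Hypothetical stand-in for the ACP usage object checked by the test."""
    input_tokens: int
    output_tokens: int
    total_tokens: int
    thought_tokens: Optional[int] = None
    cached_read_tokens: Optional[int] = None


def usage_from_result(result: dict) -> Optional[UsageSketch]:
    """Map top-level token fields of a run_conversation-style dict to usage."""
    core_keys = ("prompt_tokens", "completion_tokens", "total_tokens")
    if not any(key in result for key in core_keys):
        return None  # assumption: no counts reported means no usage object
    prompt = int(result.get("prompt_tokens") or 0)
    completion = int(result.get("completion_tokens") or 0)
    # Fall back to the sum when the provider omits total_tokens.
    total = int(result.get("total_tokens") or prompt + completion)
    return UsageSketch(
        input_tokens=prompt,
        output_tokens=completion,
        total_tokens=total,
        thought_tokens=result.get("reasoning_tokens"),
        cached_read_tokens=result.get("cache_read_tokens"),
    )
```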
@pytest.mark.asyncio
async def test_prompt_cancelled_returns_cancelled_stop_reason(self, agent):
"""If cancel is called during prompt, stop_reason should be 'cancelled'."""
+17 -1
@@ -81,6 +81,9 @@ class TestBuildAnthropicClient:
build_anthropic_client("sk-ant-api03-x", base_url="https://custom.api.com")
kwargs = mock_sdk.Anthropic.call_args[1]
assert kwargs["base_url"] == "https://custom.api.com"
assert kwargs["default_headers"] == {
"anthropic-beta": "interleaved-thinking-2025-05-14,fine-grained-tool-streaming-2025-05-14"
}
def test_minimax_anthropic_endpoint_uses_bearer_auth_for_regular_api_keys(self):
with patch("agent.anthropic_adapter._anthropic_sdk") as mock_sdk:
@@ -92,7 +95,20 @@ class TestBuildAnthropicClient:
assert kwargs["auth_token"] == "minimax-secret-123"
assert "api_key" not in kwargs
assert kwargs["default_headers"] == {
"anthropic-beta": "interleaved-thinking-2025-05-14,fine-grained-tool-streaming-2025-05-14"
"anthropic-beta": "interleaved-thinking-2025-05-14"
}
def test_minimax_cn_anthropic_endpoint_omits_tool_streaming_beta(self):
with patch("agent.anthropic_adapter._anthropic_sdk") as mock_sdk:
build_anthropic_client(
"minimax-cn-secret-123",
base_url="https://api.minimaxi.com/anthropic",
)
kwargs = mock_sdk.Anthropic.call_args[1]
assert kwargs["auth_token"] == "minimax-cn-secret-123"
assert "api_key" not in kwargs
assert kwargs["default_headers"] == {
"anthropic-beta": "interleaved-thinking-2025-05-14"
}
+33
@@ -480,6 +480,39 @@ class TestClassifyApiError:
result = classify_api_error(e)
assert result.reason == FailoverReason.context_overflow
# ── Message-only usage limit disambiguation (no status code) ──
def test_message_usage_limit_transient_is_rate_limit(self):
"""'usage limit' + 'try again' with no status code → rate_limit, not billing."""
e = Exception("usage limit exceeded, try again in 5 minutes")
result = classify_api_error(e)
assert result.reason == FailoverReason.rate_limit
assert result.retryable is True
assert result.should_rotate_credential is True
assert result.should_fallback is True
def test_message_usage_limit_no_retry_signal_is_billing(self):
"""'usage limit' with no transient signal and no status code → billing."""
e = Exception("usage limit reached")
result = classify_api_error(e)
assert result.reason == FailoverReason.billing
assert result.retryable is False
assert result.should_rotate_credential is True
def test_message_quota_with_reset_window_is_rate_limit(self):
"""'quota' + 'resets at' with no status code → rate_limit."""
e = Exception("quota exceeded, resets at midnight UTC")
result = classify_api_error(e)
assert result.reason == FailoverReason.rate_limit
assert result.retryable is True
def test_message_limit_exceeded_with_wait_is_rate_limit(self):
"""'limit exceeded' + 'wait' with no status code → rate_limit."""
e = Exception("key limit exceeded, please wait before retrying")
result = classify_api_error(e)
assert result.reason == FailoverReason.rate_limit
assert result.retryable is True
# ── Unknown / fallback ──
def test_generic_exception_is_unknown(self):
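The message-only usage-limit tests in this hunk describe a two-step decision: first recognise a limit-style message, then look for a transient signal to choose between `rate_limit` and `billing`. The sketch below illustrates that decision under stated assumptions. The phrase lists, the `Reason` enum and `classify_limit_message` are hypothetical and chosen only to satisfy the four test messages; the actual `classify_api_error` may use different patterns.

```python
from enum import Enum


class Reason(Enum):
    """Hypothetical subset of FailoverReason, for illustration only."""
    rate_limit = "rate_limit"
    billing = "billing"
    unknown = "unknown"


# Assumed phrase lists, derived from the messages used in the tests.
LIMIT_PHRASES = ("usage limit", "quota", "limit exceeded")
TRANSIENT_PHRASES = ("try again", "resets at", "wait", "retry")


def classify_limit_message(message: str) -> Reason:
    """Classify an error that carries no status code, using its text only."""
    text = message.lower()
    if not any(phrase in text for phrase in LIMIT_PHRASES):
        return Reason.unknown
    # A retry hint or reset window suggests the limit clears on its own.
    if any(phrase in text for phrase in TRANSIENT_PHRASES):
        return Reason.rate_limit
    # No transient signal: treat as an exhausted plan or balance.
    return Reason.billing
```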
+70
@@ -0,0 +1,70 @@
"""Tests for local provider stream read timeout auto-detection.
When a local LLM provider is detected (Ollama, llama.cpp, vLLM, etc.),
the httpx stream read timeout should be automatically increased from the
default 120s to HERMES_API_TIMEOUT (1800s) to avoid premature connection
kills during long prefill phases.
"""
import os
import pytest
from unittest.mock import patch
from agent.model_metadata import is_local_endpoint
class TestLocalStreamReadTimeout:
"""Verify stream read timeout auto-detection logic."""
@pytest.mark.parametrize("base_url", [
"http://localhost:11434",
"http://127.0.0.1:8080",
"http://0.0.0.0:5000",
"http://192.168.1.100:8000",
"http://10.0.0.5:1234",
])
def test_local_endpoint_bumps_read_timeout(self, base_url):
"""Local endpoint + default timeout -> bumps to base_timeout."""
with patch.dict(os.environ, {}, clear=False):
os.environ.pop("HERMES_STREAM_READ_TIMEOUT", None)
_base_timeout = float(os.getenv("HERMES_API_TIMEOUT", 1800.0))
_stream_read_timeout = float(os.getenv("HERMES_STREAM_READ_TIMEOUT", 120.0))
if _stream_read_timeout == 120.0 and base_url and is_local_endpoint(base_url):
_stream_read_timeout = _base_timeout
assert _stream_read_timeout == 1800.0
def test_user_override_respected_for_local(self):
"""User sets HERMES_STREAM_READ_TIMEOUT -> keep their value even for local."""
with patch.dict(os.environ, {"HERMES_STREAM_READ_TIMEOUT": "300"}, clear=False):
_base_timeout = float(os.getenv("HERMES_API_TIMEOUT", 1800.0))
_stream_read_timeout = float(os.getenv("HERMES_STREAM_READ_TIMEOUT", 120.0))
base_url = "http://localhost:11434"
if _stream_read_timeout == 120.0 and base_url and is_local_endpoint(base_url):
_stream_read_timeout = _base_timeout
assert _stream_read_timeout == 300.0
@pytest.mark.parametrize("base_url", [
"https://api.openai.com",
"https://openrouter.ai/api",
"https://api.anthropic.com",
])
def test_remote_endpoint_keeps_default(self, base_url):
"""Remote endpoint -> keep 120s default."""
with patch.dict(os.environ, {}, clear=False):
os.environ.pop("HERMES_STREAM_READ_TIMEOUT", None)
_base_timeout = float(os.getenv("HERMES_API_TIMEOUT", 1800.0))
_stream_read_timeout = float(os.getenv("HERMES_STREAM_READ_TIMEOUT", 120.0))
if _stream_read_timeout == 120.0 and base_url and is_local_endpoint(base_url):
_stream_read_timeout = _base_timeout
assert _stream_read_timeout == 120.0
def test_empty_base_url_keeps_default(self):
"""No base_url set -> keep 120s default."""
with patch.dict(os.environ, {}, clear=False):
os.environ.pop("HERMES_STREAM_READ_TIMEOUT", None)
_base_timeout = float(os.getenv("HERMES_API_TIMEOUT", 1800.0))
_stream_read_timeout = float(os.getenv("HERMES_STREAM_READ_TIMEOUT", 120.0))
base_url = ""
if _stream_read_timeout == 120.0 and base_url and is_local_endpoint(base_url):
_stream_read_timeout = _base_timeout
assert _stream_read_timeout == 120.0
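Each test above repeats the same inline decision. A minimal sketch of that decision as one function is shown below. `resolve_stream_read_timeout` and `looks_local` are hypothetical names; `looks_local` is a simplified stand-in for `agent.model_metadata.is_local_endpoint`, whose real implementation is not shown in this diff and may differ.

```python
import ipaddress
import os
from urllib.parse import urlparse

DEFAULT_STREAM_READ_TIMEOUT = 120.0  # default value used by the tests


def looks_local(base_url: str) -> bool:
    """Hypothetical stand-in: localhost, loopback, unspecified or private IPs."""
    host = urlparse(base_url).hostname or ""
    if host == "localhost":
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # any other DNS name is treated as remote
    return ip.is_loopback or ip.is_private or ip.is_unspecified


def resolve_stream_read_timeout(base_url, env=None) -> float:
    """Return the stream read timeout, bumping it for local endpoints."""
    env = os.environ if env is None else env
    base_timeout = float(env.get("HERMES_API_TIMEOUT", 1800.0))
    read_timeout = float(
        env.get("HERMES_STREAM_READ_TIMEOUT", DEFAULT_STREAM_READ_TIMEOUT)
    )
    # Only bump when the value still equals the default, so an explicit
    # user override (for example 300) is kept even for local endpoints.
    if (
        read_timeout == DEFAULT_STREAM_READ_TIMEOUT
        and base_url
        and looks_local(base_url)
    ):
        return base_timeout
    return read_timeout
```

One consequence of comparing against the default value: a user who explicitly sets `HERMES_STREAM_READ_TIMEOUT=120` is indistinguishable from one who set nothing, so a local endpoint would still be bumped. The tests share this behaviour.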
+100 -1
@@ -1,4 +1,6 @@
"""Tests for MiniMax provider hardening — context lengths, thinking guard, catalog."""
"""Tests for MiniMax provider hardening — context lengths, thinking guard, catalog, beta headers."""
from unittest.mock import patch
class TestMinimaxContextLengths:
@@ -103,3 +105,100 @@ class TestMinimaxModelCatalog:
models = _PROVIDER_MODELS[provider]
assert "MiniMax-M2.7-highspeed" not in models
assert "MiniMax-M2.5-highspeed" not in models
class TestMinimaxBetaHeaders:
"""MiniMax Anthropic-compat endpoints reject fine-grained-tool-streaming beta.
Verify that build_anthropic_client omits the tool-streaming beta for MiniMax
(both global and China domains) while keeping it for native Anthropic and
other third-party endpoints. Covers the fix for #6510 / #6555.
"""
_TOOL_BETA = "fine-grained-tool-streaming-2025-05-14"
_THINKING_BETA = "interleaved-thinking-2025-05-14"
# -- helper ----------------------------------------------------------
def _build_and_get_betas(self, api_key, base_url=None):
"""Build client, return the anthropic-beta header string."""
from agent.anthropic_adapter import build_anthropic_client
with patch("agent.anthropic_adapter._anthropic_sdk") as mock_sdk:
build_anthropic_client(api_key, base_url=base_url)
kwargs = mock_sdk.Anthropic.call_args[1]
headers = kwargs.get("default_headers", {})
return headers.get("anthropic-beta", "")
# -- MiniMax global --------------------------------------------------
def test_minimax_global_omits_tool_streaming(self):
betas = self._build_and_get_betas(
"mm-key-123", base_url="https://api.minimax.io/anthropic"
)
assert self._TOOL_BETA not in betas
assert self._THINKING_BETA in betas
def test_minimax_global_trailing_slash(self):
betas = self._build_and_get_betas(
"mm-key-123", base_url="https://api.minimax.io/anthropic/"
)
assert self._TOOL_BETA not in betas
# -- MiniMax China ---------------------------------------------------
def test_minimax_cn_omits_tool_streaming(self):
betas = self._build_and_get_betas(
"mm-cn-key-456", base_url="https://api.minimaxi.com/anthropic"
)
assert self._TOOL_BETA not in betas
assert self._THINKING_BETA in betas
def test_minimax_cn_trailing_slash(self):
betas = self._build_and_get_betas(
"mm-cn-key-456", base_url="https://api.minimaxi.com/anthropic/"
)
assert self._TOOL_BETA not in betas
# -- Non-MiniMax keeps full betas ------------------------------------
def test_native_anthropic_keeps_tool_streaming(self):
betas = self._build_and_get_betas("sk-ant-api03-real-key-here")
assert self._TOOL_BETA in betas
assert self._THINKING_BETA in betas
def test_third_party_proxy_keeps_tool_streaming(self):
betas = self._build_and_get_betas(
"custom-key", base_url="https://my-proxy.example.com/anthropic"
)
assert self._TOOL_BETA in betas
def test_custom_base_url_keeps_tool_streaming(self):
betas = self._build_and_get_betas(
"custom-key", base_url="https://custom.api.com"
)
assert self._TOOL_BETA in betas
# -- _common_betas_for_base_url unit tests ---------------------------
def test_common_betas_none_url(self):
from agent.anthropic_adapter import _common_betas_for_base_url, _COMMON_BETAS
assert _common_betas_for_base_url(None) == _COMMON_BETAS
def test_common_betas_empty_url(self):
from agent.anthropic_adapter import _common_betas_for_base_url, _COMMON_BETAS
assert _common_betas_for_base_url("") == _COMMON_BETAS
def test_common_betas_minimax_url(self):
from agent.anthropic_adapter import _common_betas_for_base_url, _TOOL_STREAMING_BETA
betas = _common_betas_for_base_url("https://api.minimax.io/anthropic")
assert _TOOL_STREAMING_BETA not in betas
assert len(betas) > 0 # still has other betas
def test_common_betas_minimax_cn_url(self):
from agent.anthropic_adapter import _common_betas_for_base_url, _TOOL_STREAMING_BETA
betas = _common_betas_for_base_url("https://api.minimaxi.com/anthropic")
assert _TOOL_STREAMING_BETA not in betas
def test_common_betas_regular_url(self):
from agent.anthropic_adapter import _common_betas_for_base_url, _COMMON_BETAS
assert _common_betas_for_base_url("https://api.anthropic.com") == _COMMON_BETAS
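The tests above constrain how the `anthropic-beta` header is chosen per base URL. The sketch below shows one way such a selector could look. It assumes host-based matching on the two MiniMax domains that appear in the test URLs; the names are hypothetical and the real `_common_betas_for_base_url` in `agent/anthropic_adapter.py` may match differently.

```python
from urllib.parse import urlparse

THINKING_BETA = "interleaved-thinking-2025-05-14"
TOOL_STREAMING_BETA = "fine-grained-tool-streaming-2025-05-14"
COMMON_BETAS = [THINKING_BETA, TOOL_STREAMING_BETA]

# Assumption: these hosts (taken from the test URLs) reject the
# tool-streaming beta.
MINIMAX_HOSTS = {"api.minimax.io", "api.minimaxi.com"}


def common_betas_for_base_url(base_url):
    """Return the beta flags to send for a given base URL."""
    if not base_url:
        return list(COMMON_BETAS)  # native Anthropic: full set
    host = (urlparse(base_url).hostname or "").lower()
    if host in MINIMAX_HOSTS:
        return [beta for beta in COMMON_BETAS if beta != TOOL_STREAMING_BETA]
    return list(COMMON_BETAS)


def beta_header(base_url=None) -> str:
    """Join the selected betas into the comma-separated header value."""
    return ",".join(common_betas_for_base_url(base_url))
```

Matching on the parsed hostname rather than the raw string keeps a trailing slash or extra path segment from affecting the result, which is what the two `trailing_slash` tests check.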
+55
@@ -132,6 +132,61 @@ class TestDefaultContextLengths:
if "gemini" in key:
assert value == 1048576, f"{key} should be 1048576"
def test_grok_models_context_lengths(self):
# xAI /v1/models does not return context_length metadata, so
# DEFAULT_CONTEXT_LENGTHS must cover the Grok family explicitly.
# Values sourced from models.dev (2026-04).
expected = {
"grok-4.20": 2000000,
"grok-4-1-fast": 2000000,
"grok-4-fast": 2000000,
"grok-4": 256000,
"grok-code-fast": 256000,
"grok-3": 131072,
"grok-2": 131072,
"grok-2-vision": 8192,
"grok": 131072,
}
for key, value in expected.items():
assert key in DEFAULT_CONTEXT_LENGTHS, f"{key} missing from DEFAULT_CONTEXT_LENGTHS"
assert DEFAULT_CONTEXT_LENGTHS[key] == value, (
f"{key} should be {value}, got {DEFAULT_CONTEXT_LENGTHS[key]}"
)
def test_grok_substring_matching(self):
# Longest-first substring matching must resolve the real xAI model
# IDs to the correct fallback entries without 128k probe-down.
from agent.model_metadata import get_model_context_length
from unittest.mock import patch as mock_patch
# Fake the provider/API/cache layers so the lookup falls through
# to DEFAULT_CONTEXT_LENGTHS.
with mock_patch("agent.model_metadata.fetch_model_metadata", return_value={}), mock_patch("agent.model_metadata.fetch_endpoint_model_metadata", return_value={}), mock_patch("agent.model_metadata.get_cached_context_length", return_value=None):
cases = [
("grok-4.20-0309-reasoning", 2000000),
("grok-4.20-0309-non-reasoning", 2000000),
("grok-4.20-multi-agent-0309", 2000000),
("grok-4-1-fast-reasoning", 2000000),
("grok-4-1-fast-non-reasoning", 2000000),
("grok-4-fast-reasoning", 2000000),
("grok-4-fast-non-reasoning", 2000000),
("grok-4", 256000),
("grok-4-0709", 256000),
("grok-code-fast-1", 256000),
("grok-3", 131072),
("grok-3-mini", 131072),
("grok-3-mini-fast", 131072),
("grok-2", 131072),
("grok-2-vision", 8192),
("grok-2-vision-1212", 8192),
("grok-beta", 131072),
]
for model_id, expected_ctx in cases:
actual = get_model_context_length(model_id)
assert actual == expected_ctx, (
f"{model_id}: expected {expected_ctx}, got {actual}"
)
def test_all_values_positive(self):
for key, value in DEFAULT_CONTEXT_LENGTHS.items():
assert value > 0, f"{key} has non-positive context length"
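`test_grok_substring_matching` relies on longest-first substring matching, so that a specific key such as `grok-4-fast` wins over the shorter `grok-4` and the catch-all `grok`. A minimal sketch of that lookup is below. The table copies the values from the test; `fallback_context_length` is a hypothetical helper, not the project's `get_model_context_length`, which also consults provider metadata and a cache before falling back.

```python
# Values copied from test_grok_models_context_lengths above.
GROK_CONTEXT_LENGTHS = {
    "grok-4.20": 2000000,
    "grok-4-1-fast": 2000000,
    "grok-4-fast": 2000000,
    "grok-4": 256000,
    "grok-code-fast": 256000,
    "grok-3": 131072,
    "grok-2": 131072,
    "grok-2-vision": 8192,
    "grok": 131072,
}


def fallback_context_length(model_id, table=GROK_CONTEXT_LENGTHS, default=None):
    """Return the value of the longest table key contained in model_id."""
    model = model_id.lower()
    # Try longer keys first so specific entries shadow generic ones.
    for key in sorted(table, key=len, reverse=True):
        if key in model:
            return table[key]
    return default
```

Without the length ordering, `grok-2-vision-1212` could match `grok-2` or `grok` first and report 131072 instead of 8192.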
+14
@@ -147,6 +147,20 @@ class TestEscapedSpaces:
assert result["path"] == tmp_image_with_spaces
assert result["remainder"] == "what is this?"
def test_tilde_prefixed_path(self, tmp_path, monkeypatch):
home = tmp_path / "home"
img = home / "storage" / "shared" / "Pictures" / "cat.png"
img.parent.mkdir(parents=True, exist_ok=True)
img.write_bytes(b"\x89PNG\r\n\x1a\n")
monkeypatch.setenv("HOME", str(home))
result = _detect_file_drop("~/storage/shared/Pictures/cat.png what is this?")
assert result is not None
assert result["path"] == img
assert result["is_image"] is True
assert result["remainder"] == "what is this?"
# ---------------------------------------------------------------------------
# Tests: edge cases
+109
@@ -0,0 +1,109 @@
from pathlib import Path
from unittest.mock import patch
from cli import (
HermesCLI,
_collect_query_images,
_format_image_attachment_badges,
_termux_example_image_path,
)
def _make_cli():
cli_obj = HermesCLI.__new__(HermesCLI)
cli_obj._attached_images = []
return cli_obj
def _make_image(path: Path) -> Path:
path.parent.mkdir(parents=True, exist_ok=True)
path.write_bytes(b"\x89PNG\r\n\x1a\n")
return path
class TestImageCommand:
def test_handle_image_command_attaches_local_image(self, tmp_path):
img = _make_image(tmp_path / "photo.png")
cli_obj = _make_cli()
with patch("cli._cprint"):
cli_obj._handle_image_command(f"/image {img}")
assert cli_obj._attached_images == [img]
def test_handle_image_command_supports_quoted_path_with_spaces(self, tmp_path):
img = _make_image(tmp_path / "my photo.png")
cli_obj = _make_cli()
with patch("cli._cprint"):
cli_obj._handle_image_command(f'/image "{img}"')
assert cli_obj._attached_images == [img]
def test_handle_image_command_rejects_non_image_file(self, tmp_path):
file_path = tmp_path / "notes.txt"
file_path.write_text("hello\n", encoding="utf-8")
cli_obj = _make_cli()
with patch("cli._cprint") as mock_print:
cli_obj._handle_image_command(f"/image {file_path}")
assert cli_obj._attached_images == []
rendered = " ".join(str(arg) for call in mock_print.call_args_list for arg in call.args)
assert "Not a supported image file" in rendered
class TestCollectQueryImages:
def test_collect_query_images_accepts_explicit_image_arg(self, tmp_path):
img = _make_image(tmp_path / "diagram.png")
message, images = _collect_query_images("describe this", str(img))
assert message == "describe this"
assert images == [img]
def test_collect_query_images_extracts_leading_path(self, tmp_path):
img = _make_image(tmp_path / "camera.png")
message, images = _collect_query_images(f"{img} what do you see?")
assert message == "what do you see?"
assert images == [img]
def test_collect_query_images_supports_tilde_paths(self, tmp_path, monkeypatch):
home = tmp_path / "home"
img = _make_image(home / "storage" / "shared" / "Pictures" / "cat.png")
monkeypatch.setenv("HOME", str(home))
message, images = _collect_query_images("describe this", "~/storage/shared/Pictures/cat.png")
assert message == "describe this"
assert images == [img]
class TestTermuxImageHints:
def test_termux_example_image_path_prefers_real_shared_storage_root(self, monkeypatch):
existing = {"/sdcard", "/storage/emulated/0"}
monkeypatch.setattr("cli.os.path.isdir", lambda path: path in existing)
hint = _termux_example_image_path()
assert hint == "/sdcard/Pictures/cat.png"
class TestImageBadgeFormatting:
def test_compact_badges_use_filename_on_narrow_terminals(self, tmp_path):
img = _make_image(tmp_path / "Screenshot 2026-04-09 at 11.22.33 AM.png")
badges = _format_image_attachment_badges([img], image_counter=1, width=40)
assert badges.startswith("[📎 ")
assert "Image #1" not in badges
def test_compact_badges_summarize_multiple_images(self, tmp_path):
img1 = _make_image(tmp_path / "one.png")
img2 = _make_image(tmp_path / "two.png")
badges = _format_image_attachment_badges([img1, img2], image_counter=2, width=45)
assert badges == "[📎 2 images attached]"
+19
@@ -49,6 +49,25 @@ class TestCliSkinPromptIntegration:
set_active_skin("ares")
assert cli._get_tui_prompt_fragments() == [("class:sudo-prompt", "🔑 ")]
def test_narrow_terminals_compact_voice_prompt_fragments(self):
cli = _make_cli_stub()
cli._voice_mode = True
with patch.object(HermesCLI, "_get_tui_terminal_width", return_value=50):
assert cli._get_tui_prompt_fragments() == [("class:voice-prompt", "🎤 ")]
def test_narrow_terminals_compact_voice_recording_prompt_fragments(self):
cli = _make_cli_stub()
cli._voice_recording = True
cli._voice_recorder = SimpleNamespace(current_rms=3000)
with patch.object(HermesCLI, "_get_tui_terminal_width", return_value=50):
frags = cli._get_tui_prompt_fragments()
assert frags[0][0] == "class:voice-recording"
assert frags[0][1].startswith("")
assert "" not in frags[0][1]
def test_icon_only_skin_symbol_still_visible_in_special_states(self):
cli = _make_cli_stub()
cli._secret_state = {"response_queue": object()}
+53
@@ -206,6 +206,59 @@ class TestCLIStatusBar:
assert "" in text
assert "claude-sonnet-4-20250514" in text
def test_minimal_tui_chrome_threshold(self):
cli_obj = _make_cli()
assert cli_obj._use_minimal_tui_chrome(width=63) is True
assert cli_obj._use_minimal_tui_chrome(width=64) is False
def test_bottom_input_rule_hides_on_narrow_terminals(self):
cli_obj = _make_cli()
assert cli_obj._tui_input_rule_height("top", width=50) == 1
assert cli_obj._tui_input_rule_height("bottom", width=50) == 0
assert cli_obj._tui_input_rule_height("bottom", width=90) == 1
def test_agent_spacer_reclaimed_on_narrow_terminals(self):
cli_obj = _make_cli()
cli_obj._agent_running = True
assert cli_obj._agent_spacer_height(width=50) == 0
assert cli_obj._agent_spacer_height(width=90) == 1
cli_obj._agent_running = False
assert cli_obj._agent_spacer_height(width=90) == 0
def test_spinner_line_hidden_on_narrow_terminals(self):
cli_obj = _make_cli()
cli_obj._spinner_text = "thinking"
assert cli_obj._spinner_widget_height(width=50) == 0
assert cli_obj._spinner_widget_height(width=90) == 1
cli_obj._spinner_text = ""
assert cli_obj._spinner_widget_height(width=90) == 0
def test_voice_status_bar_compacts_on_narrow_terminals(self):
cli_obj = _make_cli()
cli_obj._voice_mode = True
cli_obj._voice_recording = False
cli_obj._voice_processing = False
cli_obj._voice_tts = True
cli_obj._voice_continuous = True
fragments = cli_obj._get_voice_status_fragments(width=50)
assert fragments == [("class:voice-status", " 🎤 Ctrl+B ")]
def test_voice_recording_status_bar_compacts_on_narrow_terminals(self):
cli_obj = _make_cli()
cli_obj._voice_mode = True
cli_obj._voice_recording = True
cli_obj._voice_processing = False
fragments = cli_obj._get_voice_status_fragments(width=50)
assert fragments == [("class:voice-status-recording", " ● REC ")]
class TestCLIUsageReport:
def test_show_usage_includes_estimated_cost(self, capsys):
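The narrow-terminal tests in this hunk imply a single width threshold (63 columns is minimal, 64 is not) that several layout helpers consult. The sketch below expresses those rules as free functions. The names and the constant are hypothetical; the real logic lives in `HermesCLI` methods and may read additional state.

```python
MINIMAL_CHROME_WIDTH = 64  # assumed from the 63/64 boundary in the test


def use_minimal_chrome(width: int) -> bool:
    """Terminals narrower than the threshold get reduced TUI chrome."""
    return width < MINIMAL_CHROME_WIDTH


def input_rule_height(position: str, width: int) -> int:
    """The top rule is always drawn; the bottom rule is dropped when narrow."""
    if position == "bottom" and use_minimal_chrome(width):
        return 0
    return 1


def agent_spacer_height(agent_running: bool, width: int) -> int:
    """Spacer row shown only while the agent runs on a wide terminal."""
    return 1 if agent_running and not use_minimal_chrome(width) else 0


def spinner_height(spinner_text: str, width: int) -> int:
    """Spinner row shown only when there is text and the terminal is wide."""
    return 1 if spinner_text and not use_minimal_chrome(width) else 0
```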
+413
@@ -0,0 +1,413 @@
"""Tests for the /fast CLI command and service-tier config handling."""
import unittest
from types import SimpleNamespace
from unittest.mock import MagicMock, patch
def _import_cli():
import hermes_cli.config as config_mod
if not hasattr(config_mod, "save_env_value_secure"):
config_mod.save_env_value_secure = lambda key, value: {
"success": True,
"stored_as": key,
"validated": False,
}
import cli as cli_mod
return cli_mod
class TestParseServiceTierConfig(unittest.TestCase):
def _parse(self, raw):
cli_mod = _import_cli()
return cli_mod._parse_service_tier_config(raw)
def test_fast_maps_to_priority(self):
self.assertEqual(self._parse("fast"), "priority")
self.assertEqual(self._parse("priority"), "priority")
def test_normal_disables_service_tier(self):
self.assertIsNone(self._parse("normal"))
self.assertIsNone(self._parse("off"))
self.assertIsNone(self._parse(""))
class TestHandleFastCommand(unittest.TestCase):
def _make_cli(self, service_tier=None):
return SimpleNamespace(
service_tier=service_tier,
provider="openai-codex",
requested_provider="openai-codex",
model="gpt-5.4",
_fast_command_available=lambda: True,
agent=MagicMock(),
)
def test_no_args_shows_status(self):
cli_mod = _import_cli()
stub = self._make_cli(service_tier=None)
with (
patch.object(cli_mod, "_cprint") as mock_cprint,
patch.object(cli_mod, "save_config_value") as mock_save,
):
cli_mod.HermesCLI._handle_fast_command(stub, "/fast")
# Bare /fast shows status, does not change config
mock_save.assert_not_called()
# Should have printed the status line
printed = " ".join(str(c) for c in mock_cprint.call_args_list)
self.assertIn("normal", printed)
def test_no_args_shows_fast_when_enabled(self):
cli_mod = _import_cli()
stub = self._make_cli(service_tier="priority")
with (
patch.object(cli_mod, "_cprint") as mock_cprint,
patch.object(cli_mod, "save_config_value") as mock_save,
):
cli_mod.HermesCLI._handle_fast_command(stub, "/fast")
mock_save.assert_not_called()
printed = " ".join(str(c) for c in mock_cprint.call_args_list)
self.assertIn("fast", printed)
def test_normal_argument_clears_service_tier(self):
cli_mod = _import_cli()
stub = self._make_cli(service_tier="priority")
with (
patch.object(cli_mod, "_cprint"),
patch.object(cli_mod, "save_config_value", return_value=True) as mock_save,
):
cli_mod.HermesCLI._handle_fast_command(stub, "/fast normal")
mock_save.assert_called_once_with("agent.service_tier", "normal")
self.assertIsNone(stub.service_tier)
self.assertIsNone(stub.agent)
def test_unsupported_model_does_not_expose_fast(self):
cli_mod = _import_cli()
stub = SimpleNamespace(
service_tier=None,
provider="openai-codex",
requested_provider="openai-codex",
model="gpt-5.3-codex",
_fast_command_available=lambda: False,
agent=MagicMock(),
)
with (
patch.object(cli_mod, "_cprint") as mock_cprint,
patch.object(cli_mod, "save_config_value") as mock_save,
):
cli_mod.HermesCLI._handle_fast_command(stub, "/fast")
mock_save.assert_not_called()
self.assertTrue(mock_cprint.called)
class TestPriorityProcessingModels(unittest.TestCase):
"""Verify the expanded Priority Processing model registry."""
def test_all_documented_models_supported(self):
from hermes_cli.models import model_supports_fast_mode
# All models from OpenAI's Priority Processing pricing table
supported = [
"gpt-5.4", "gpt-5.4-mini", "gpt-5.2",
"gpt-5.1", "gpt-5", "gpt-5-mini",
"gpt-4.1", "gpt-4.1-mini", "gpt-4.1-nano",
"gpt-4o", "gpt-4o-mini",
"o3", "o4-mini",
]
for model in supported:
assert model_supports_fast_mode(model), f"{model} should support fast mode"
def test_vendor_prefix_stripped(self):
from hermes_cli.models import model_supports_fast_mode
assert model_supports_fast_mode("openai/gpt-5.4") is True
assert model_supports_fast_mode("openai/gpt-4.1") is True
assert model_supports_fast_mode("openai/o3") is True
def test_non_priority_models_rejected(self):
from hermes_cli.models import model_supports_fast_mode
assert model_supports_fast_mode("gpt-5.3-codex") is False
assert model_supports_fast_mode("claude-sonnet-4") is False
assert model_supports_fast_mode("") is False
assert model_supports_fast_mode(None) is False
def test_resolve_overrides_returns_service_tier(self):
from hermes_cli.models import resolve_fast_mode_overrides
result = resolve_fast_mode_overrides("gpt-5.4")
assert result == {"service_tier": "priority"}
result = resolve_fast_mode_overrides("gpt-4.1")
assert result == {"service_tier": "priority"}
def test_resolve_overrides_none_for_unsupported(self):
from hermes_cli.models import resolve_fast_mode_overrides
assert resolve_fast_mode_overrides("gpt-5.3-codex") is None
assert resolve_fast_mode_overrides("claude-sonnet-4") is None
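The `TestPriorityProcessingModels` cases above describe a registry lookup with vendor-prefix stripping for the OpenAI side of fast mode. A minimal sketch under that reading is below. The function names are hypothetical, the model set is copied from the test list, and the sketch deliberately covers only the `service_tier` path; it does not model any other provider's fast-mode handling in `hermes_cli.models`.

```python
# Model IDs copied from test_all_documented_models_supported above.
PRIORITY_MODELS = frozenset({
    "gpt-5.4", "gpt-5.4-mini", "gpt-5.2",
    "gpt-5.1", "gpt-5", "gpt-5-mini",
    "gpt-4.1", "gpt-4.1-mini", "gpt-4.1-nano",
    "gpt-4o", "gpt-4o-mini",
    "o3", "o4-mini",
})


def normalize_model_id(model) -> str:
    """Lowercase the ID and drop a leading 'vendor/' prefix, if any."""
    if not model:
        return ""
    name = model.strip().lower()
    return name.split("/", 1)[-1]


def supports_priority_tier(model) -> bool:
    """True when the normalized ID is in the priority registry."""
    return normalize_model_id(model) in PRIORITY_MODELS


def priority_overrides(model):
    """Return request overrides for supported models, otherwise None."""
    if supports_priority_tier(model):
        return {"service_tier": "priority"}
    return None
```

An exact-match set is used rather than prefix matching so that `gpt-5.3-codex` is not accidentally accepted because it starts with `gpt-5`.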
class TestFastModeRouting(unittest.TestCase):
def test_fast_command_exposed_for_model_even_when_provider_is_auto(self):
cli_mod = _import_cli()
stub = SimpleNamespace(provider="auto", requested_provider="auto", model="gpt-5.4", agent=None)
assert cli_mod.HermesCLI._fast_command_available(stub) is True
def test_fast_command_exposed_for_non_codex_models(self):
cli_mod = _import_cli()
stub = SimpleNamespace(provider="openai", requested_provider="openai", model="gpt-4.1", agent=None)
assert cli_mod.HermesCLI._fast_command_available(stub) is True
stub = SimpleNamespace(provider="openrouter", requested_provider="openrouter", model="o3", agent=None)
assert cli_mod.HermesCLI._fast_command_available(stub) is True
def test_turn_route_injects_overrides_without_provider_switch(self):
"""Fast mode should add request_overrides but NOT change the provider/runtime."""
cli_mod = _import_cli()
stub = SimpleNamespace(
model="gpt-5.4",
api_key="primary-key",
base_url="https://openrouter.ai/api/v1",
provider="openrouter",
api_mode="chat_completions",
acp_command=None,
acp_args=[],
_credential_pool=None,
_smart_model_routing={},
service_tier="priority",
)
original_runtime = {
"api_key": "***",
"base_url": "https://openrouter.ai/api/v1",
"provider": "openrouter",
"api_mode": "chat_completions",
"command": None,
"args": [],
"credential_pool": None,
}
with patch("agent.smart_model_routing.resolve_turn_route", return_value={
"model": "gpt-5.4",
"runtime": dict(original_runtime),
"label": None,
"signature": ("gpt-5.4", "openrouter", "https://openrouter.ai/api/v1", "chat_completions", None, ()),
}):
route = cli_mod.HermesCLI._resolve_turn_agent_config(stub, "hi")
# Provider should NOT have changed
assert route["runtime"]["provider"] == "openrouter"
assert route["runtime"]["api_mode"] == "chat_completions"
# But request_overrides should be set
assert route["request_overrides"] == {"service_tier": "priority"}
def test_turn_route_keeps_primary_runtime_when_model_has_no_fast_backend(self):
cli_mod = _import_cli()
stub = SimpleNamespace(
model="gpt-5.3-codex",
api_key="primary-key",
base_url="https://openrouter.ai/api/v1",
provider="openrouter",
api_mode="chat_completions",
acp_command=None,
acp_args=[],
_credential_pool=None,
_smart_model_routing={},
service_tier="priority",
)
primary_route = {
"model": "gpt-5.3-codex",
"runtime": {
"api_key": "***",
"base_url": "https://openrouter.ai/api/v1",
"provider": "openrouter",
"api_mode": "chat_completions",
"command": None,
"args": [],
"credential_pool": None,
},
"label": None,
"signature": ("gpt-5.3-codex", "openrouter", "https://openrouter.ai/api/v1", "chat_completions", None, ()),
}
with patch("agent.smart_model_routing.resolve_turn_route", return_value=primary_route):
route = cli_mod.HermesCLI._resolve_turn_agent_config(stub, "hi")
assert route["runtime"]["provider"] == "openrouter"
assert route.get("request_overrides") is None
class TestAnthropicFastMode(unittest.TestCase):
"""Verify Anthropic Fast Mode model support and override resolution."""
def test_anthropic_opus_supported(self):
from hermes_cli.models import model_supports_fast_mode
# Native Anthropic format (hyphens)
assert model_supports_fast_mode("claude-opus-4-6") is True
# OpenRouter format (dots)
assert model_supports_fast_mode("claude-opus-4.6") is True
# With vendor prefix
assert model_supports_fast_mode("anthropic/claude-opus-4-6") is True
assert model_supports_fast_mode("anthropic/claude-opus-4.6") is True
def test_anthropic_non_opus_rejected(self):
from hermes_cli.models import model_supports_fast_mode
assert model_supports_fast_mode("claude-sonnet-4-6") is False
assert model_supports_fast_mode("claude-sonnet-4.6") is False
assert model_supports_fast_mode("claude-haiku-4-5") is False
assert model_supports_fast_mode("anthropic/claude-sonnet-4.6") is False
def test_anthropic_variant_tags_stripped(self):
from hermes_cli.models import model_supports_fast_mode
# OpenRouter variant tags after colon should be stripped
assert model_supports_fast_mode("claude-opus-4.6:fast") is True
assert model_supports_fast_mode("claude-opus-4.6:beta") is True
def test_resolve_overrides_returns_speed_for_anthropic(self):
from hermes_cli.models import resolve_fast_mode_overrides
result = resolve_fast_mode_overrides("claude-opus-4-6")
assert result == {"speed": "fast"}
result = resolve_fast_mode_overrides("anthropic/claude-opus-4.6")
assert result == {"speed": "fast"}
def test_resolve_overrides_returns_service_tier_for_openai(self):
"""OpenAI models should still get service_tier, not speed."""
from hermes_cli.models import resolve_fast_mode_overrides
result = resolve_fast_mode_overrides("gpt-5.4")
assert result == {"service_tier": "priority"}
def test_is_anthropic_fast_model(self):
from hermes_cli.models import _is_anthropic_fast_model
assert _is_anthropic_fast_model("claude-opus-4-6") is True
assert _is_anthropic_fast_model("claude-opus-4.6") is True
assert _is_anthropic_fast_model("anthropic/claude-opus-4-6") is True
assert _is_anthropic_fast_model("gpt-5.4") is False
assert _is_anthropic_fast_model("claude-sonnet-4-6") is False
def test_fast_command_exposed_for_anthropic_model(self):
cli_mod = _import_cli()
stub = SimpleNamespace(
provider="anthropic", requested_provider="anthropic",
model="claude-opus-4-6", agent=None,
)
assert cli_mod.HermesCLI._fast_command_available(stub) is True
def test_fast_command_hidden_for_anthropic_sonnet(self):
cli_mod = _import_cli()
stub = SimpleNamespace(
provider="anthropic", requested_provider="anthropic",
model="claude-sonnet-4-6", agent=None,
)
assert cli_mod.HermesCLI._fast_command_available(stub) is False
def test_turn_route_injects_speed_for_anthropic(self):
"""Anthropic models should get speed:'fast' override, not service_tier."""
cli_mod = _import_cli()
stub = SimpleNamespace(
model="claude-opus-4-6",
api_key="sk-ant-test",
base_url="https://api.anthropic.com",
provider="anthropic",
api_mode="anthropic_messages",
acp_command=None,
acp_args=[],
_credential_pool=None,
_smart_model_routing={},
service_tier="priority",
)
original_runtime = {
"api_key": "***",
"base_url": "https://api.anthropic.com",
"provider": "anthropic",
"api_mode": "anthropic_messages",
"command": None,
"args": [],
"credential_pool": None,
}
with patch("agent.smart_model_routing.resolve_turn_route", return_value={
"model": "claude-opus-4-6",
"runtime": dict(original_runtime),
"label": None,
"signature": ("claude-opus-4-6", "anthropic", "https://api.anthropic.com", "anthropic_messages", None, ()),
}):
route = cli_mod.HermesCLI._resolve_turn_agent_config(stub, "hi")
assert route["runtime"]["provider"] == "anthropic"
assert route["request_overrides"] == {"speed": "fast"}
class TestAnthropicFastModeAdapter(unittest.TestCase):
"""Verify build_anthropic_kwargs handles fast_mode parameter."""
def test_fast_mode_adds_speed_and_beta(self):
from agent.anthropic_adapter import build_anthropic_kwargs, _FAST_MODE_BETA
kwargs = build_anthropic_kwargs(
model="claude-opus-4-6",
messages=[{"role": "user", "content": [{"type": "text", "text": "hi"}]}],
tools=None,
max_tokens=None,
reasoning_config=None,
fast_mode=True,
)
assert kwargs.get("speed") == "fast"
assert "extra_headers" in kwargs
assert _FAST_MODE_BETA in kwargs["extra_headers"].get("anthropic-beta", "")
def test_fast_mode_off_no_speed(self):
from agent.anthropic_adapter import build_anthropic_kwargs
kwargs = build_anthropic_kwargs(
model="claude-opus-4-6",
messages=[{"role": "user", "content": [{"type": "text", "text": "hi"}]}],
tools=None,
max_tokens=None,
reasoning_config=None,
fast_mode=False,
)
assert "speed" not in kwargs
assert "extra_headers" not in kwargs
def test_fast_mode_skipped_for_third_party_endpoint(self):
from agent.anthropic_adapter import build_anthropic_kwargs
kwargs = build_anthropic_kwargs(
model="claude-opus-4-6",
messages=[{"role": "user", "content": [{"type": "text", "text": "hi"}]}],
tools=None,
max_tokens=None,
reasoning_config=None,
fast_mode=True,
base_url="https://api.minimax.io/anthropic/v1",
)
# Third-party endpoints should NOT get speed or fast-mode beta
assert "speed" not in kwargs
assert "extra_headers" not in kwargs
class TestConfigDefault(unittest.TestCase):
def test_default_config_has_service_tier(self):
from hermes_cli.config import DEFAULT_CONFIG
agent = DEFAULT_CONFIG.get("agent", {})
self.assertIn("service_tier", agent)
self.assertEqual(agent["service_tier"], "")
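Reviewer sketch: the fast-mode tests above imply a normalization step (vendor prefix dropped, `:variant` tag stripped, dots and hyphens treated alike) followed by a per-vendor override. The block below is a minimal, hypothetical reconstruction of that behavior based only on the assertions in these tests. The function names, the allow-lists, and the `None` return for unsupported models are assumptions, not the actual `hermes_cli.models` code.

```python
# Hypothetical allow-lists, inferred only from the test assertions above.
_SKETCH_ANTHROPIC_FAST = ("claude-opus-4-6",)
_SKETCH_OPENAI_FAST = {"gpt-5-4"}  # stored in normalized form


def sketch_normalize_model_id(model: str) -> str:
    """Drop the vendor prefix and ':variant' tag, then map dots to hyphens."""
    name = model.strip().lower()
    name = name.rsplit("/", 1)[-1]  # "anthropic/claude-opus-4.6" -> "claude-opus-4.6"
    name = name.split(":", 1)[0]    # "claude-opus-4.6:fast" -> "claude-opus-4.6"
    return name.replace(".", "-")


def sketch_is_anthropic_fast_model(model: str) -> bool:
    """True only for the Opus family named in the tests (assumed allow-list)."""
    return sketch_normalize_model_id(model).startswith(_SKETCH_ANTHROPIC_FAST)


def sketch_resolve_fast_mode_overrides(model: str):
    """Return the request overrides fast mode would add, or None if unsupported."""
    if sketch_is_anthropic_fast_model(model):
        return {"speed": "fast"}
    if sketch_normalize_model_id(model) in _SKETCH_OPENAI_FAST:
        return {"service_tier": "priority"}
    return None
```

Keeping the overrides separate from the runtime dict matches what the routing tests check: the provider and `api_mode` stay untouched and only `request_overrides` changes.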
+138
@@ -0,0 +1,138 @@
"""Tests for _stream_delta's handling of <think> tags in prose vs real reasoning blocks."""
import sys
import os
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", ".."))
import pytest
def _make_cli_stub():
"""Create a minimal HermesCLI-like object with stream state."""
from cli import HermesCLI
cli = HermesCLI.__new__(HermesCLI)
cli.show_reasoning = False
cli._stream_buf = ""
cli._stream_started = False
cli._stream_box_opened = False
cli._stream_prefilt = ""
cli._in_reasoning_block = False
cli._reasoning_stream_started = False
cli._reasoning_box_opened = False
cli._reasoning_buf = ""
cli._reasoning_preview_buf = ""
cli._deferred_content = ""
cli._stream_text_ansi = ""
cli._stream_needs_break = False
cli._emitted = []
# Mock _emit_stream_text to capture output
def mock_emit(text):
cli._emitted.append(text)
cli._emit_stream_text = mock_emit
# Mock _stream_reasoning_delta
cli._reasoning_emitted = []
def mock_reasoning(text):
cli._reasoning_emitted.append(text)
cli._stream_reasoning_delta = mock_reasoning
return cli
class TestThinkTagInProse:
"""<think> mentioned in prose should NOT trigger reasoning suppression."""
def test_think_tag_mid_sentence(self):
"""'(/think not producing <think> tags)' should pass through."""
cli = _make_cli_stub()
tokens = [
" 1. Fix reasoning mode in eval ",
"(/think not producing ",
"<think>",
" tags — ~2% gap)",
"\n 2. Launch production",
]
for t in tokens:
cli._stream_delta(t)
assert not cli._in_reasoning_block, "<think> in prose should not enter reasoning block"
full = "".join(cli._emitted)
assert "<think>" in full, "The literal <think> tag should be in the emitted text"
assert "Launch production" in full
def test_think_tag_after_text_on_same_line(self):
"""'some text <think>' should NOT trigger reasoning."""
cli = _make_cli_stub()
cli._stream_delta("Here is the <think> tag explanation")
assert not cli._in_reasoning_block
full = "".join(cli._emitted)
assert "<think>" in full
def test_think_tag_in_backticks(self):
"""'`<think>`' should NOT trigger reasoning."""
cli = _make_cli_stub()
cli._stream_delta("Use the `<think>` tag for reasoning")
assert not cli._in_reasoning_block
class TestRealReasoningBlock:
"""Real <think> tags at block boundaries should still be caught."""
def test_think_at_start_of_stream(self):
"""'<think>reasoning</think>answer' should suppress reasoning."""
cli = _make_cli_stub()
cli._stream_delta("<think>")
assert cli._in_reasoning_block
cli._stream_delta("I need to analyze this")
cli._stream_delta("</think>")
assert not cli._in_reasoning_block
cli._stream_delta("Here is my answer")
full = "".join(cli._emitted)
assert "Here is my answer" in full
assert "I need to analyze" not in full # reasoning was suppressed
def test_think_after_newline(self):
"""'text\\n<think>' should trigger reasoning block."""
cli = _make_cli_stub()
cli._stream_delta("Some preamble\n<think>")
assert cli._in_reasoning_block
full = "".join(cli._emitted)
assert "Some preamble" in full
def test_think_after_newline_with_whitespace(self):
"""'text\\n <think>' should trigger reasoning block."""
cli = _make_cli_stub()
cli._stream_delta("Some preamble\n <think>")
assert cli._in_reasoning_block
def test_think_with_only_whitespace_before(self):
"""' <think>' (whitespace only prefix) should trigger."""
cli = _make_cli_stub()
cli._stream_delta(" <think>")
assert cli._in_reasoning_block
class TestFlushRecovery:
"""_flush_stream should recover content from false-positive reasoning blocks."""
def test_flush_recovers_buffered_content(self):
"""If somehow in reasoning block at flush, content is recovered."""
cli = _make_cli_stub()
# Manually set up a false-positive state
cli._in_reasoning_block = True
cli._stream_prefilt = " tags — ~2% gap)\n 2. Launch production"
cli._stream_box_opened = True
# Mock _close_reasoning_box and box closing
cli._close_reasoning_box = lambda: None
# Call flush
from unittest.mock import patch
import shutil
with patch.object(shutil, "get_terminal_size", return_value=os.terminal_size((80, 24))):
with patch("cli._cprint"):
cli._flush_stream()
assert not cli._in_reasoning_block
full = "".join(cli._emitted)
assert "Launch production" in full
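Reviewer sketch: the tests above distinguish a `<think>` tag that opens a reasoning block (only whitespace before it on its line) from one mentioned in prose. A minimal way to express that boundary rule is shown below. It works on the accumulated text rather than on individual stream deltas, and `sketch_opens_reasoning_block` is a hypothetical helper, not the logic inside `_stream_delta`.

```python
def sketch_opens_reasoning_block(text_so_far: str, tag: str = "<think>") -> bool:
    """True if the first `tag` is preceded only by whitespace on its own line."""
    idx = text_so_far.find(tag)
    if idx == -1:
        return False
    # Start of the line containing the tag (0 when there is no earlier newline).
    line_start = text_so_far.rfind("\n", 0, idx) + 1
    return text_so_far[line_start:idx].strip() == ""
```

A real streaming implementation also has to cope with a tag split across tokens, which this sketch leaves out.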
+49 -15
@@ -294,6 +294,40 @@ class TestModelsEndpoint:
assert data["data"][0]["id"] == "hermes-agent"
assert data["data"][0]["owned_by"] == "hermes"
@pytest.mark.asyncio
async def test_models_returns_profile_name(self):
"""When running under a named profile, /v1/models advertises the profile name."""
with patch("gateway.platforms.api_server.APIServerAdapter._resolve_model_name", return_value="lucas"):
adapter = _make_adapter()
app = _create_app(adapter)
async with TestClient(TestServer(app)) as cli:
resp = await cli.get("/v1/models")
assert resp.status == 200
data = await resp.json()
assert data["data"][0]["id"] == "lucas"
assert data["data"][0]["root"] == "lucas"
@pytest.mark.asyncio
async def test_models_returns_explicit_model_name(self):
"""Explicit model_name in config overrides profile name."""
extra = {"model_name": "my-custom-agent"}
config = PlatformConfig(enabled=True, extra=extra)
adapter = APIServerAdapter(config)
assert adapter._model_name == "my-custom-agent"
def test_resolve_model_name_explicit(self):
assert APIServerAdapter._resolve_model_name("my-bot") == "my-bot"
def test_resolve_model_name_default_profile(self):
"""Default profile falls back to 'hermes-agent'."""
with patch("hermes_cli.profiles.get_active_profile_name", return_value="default"):
assert APIServerAdapter._resolve_model_name("") == "hermes-agent"
def test_resolve_model_name_named_profile(self):
"""Named profile uses the profile name as model name."""
with patch("hermes_cli.profiles.get_active_profile_name", return_value="lucas"):
assert APIServerAdapter._resolve_model_name("") == "lucas"
@pytest.mark.asyncio
async def test_models_requires_auth(self, auth_adapter):
app = _create_app(auth_adapter)
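Reviewer sketch: the three `_resolve_model_name` tests in this hunk describe a precedence order: an explicit `model_name` wins, then a named profile, then the `hermes-agent` default. The function below restates that order under the assumption that the active profile name is passed in as a parameter; the tests show the real method obtaining it via `hermes_cli.profiles.get_active_profile_name`, and its actual body is not visible in this diff.

```python
def sketch_resolve_model_name(explicit: str, active_profile: str) -> str:
    """Pick the advertised model id: explicit > named profile > default."""
    if explicit:
        return explicit
    # The "default" profile is treated as "no profile" (per the test above).
    if active_profile and active_profile != "default":
        return active_profile
    return "hermes-agent"
```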
@@ -1600,7 +1634,7 @@ class TestSessionIdHeader:
assert resp.headers.get("X-Hermes-Session-Id") is not None
@pytest.mark.asyncio
- async def test_provided_session_id_is_used_and_echoed(self, adapter):
+ async def test_provided_session_id_is_used_and_echoed(self, auth_adapter):
"""When X-Hermes-Session-Id is provided, it's passed to the agent and echoed in the response."""
mock_result = {"final_response": "Continuing!", "messages": [], "api_calls": 1}
mock_db = MagicMock()
@@ -1608,15 +1642,15 @@ class TestSessionIdHeader:
{"role": "user", "content": "previous message"},
{"role": "assistant", "content": "previous reply"},
]
- adapter._session_db = mock_db
- app = _create_app(adapter)
+ auth_adapter._session_db = mock_db
+ app = _create_app(auth_adapter)
async with TestClient(TestServer(app)) as cli:
- with patch.object(adapter, "_run_agent", new_callable=AsyncMock) as mock_run:
+ with patch.object(auth_adapter, "_run_agent", new_callable=AsyncMock) as mock_run:
mock_run.return_value = (mock_result, {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0})
resp = await cli.post(
"/v1/chat/completions",
- headers={"X-Hermes-Session-Id": "my-session-123"},
+ headers={"X-Hermes-Session-Id": "my-session-123", "Authorization": "Bearer sk-secret"},
json={"model": "hermes-agent", "messages": [{"role": "user", "content": "Continue"}]},
)
@@ -1626,7 +1660,7 @@ class TestSessionIdHeader:
assert call_kwargs["session_id"] == "my-session-123"
@pytest.mark.asyncio
- async def test_provided_session_id_loads_history_from_db(self, adapter):
+ async def test_provided_session_id_loads_history_from_db(self, auth_adapter):
"""When X-Hermes-Session-Id is provided, history comes from SessionDB not request body."""
mock_result = {"final_response": "OK", "messages": [], "api_calls": 1}
db_history = [
@@ -1635,15 +1669,15 @@ class TestSessionIdHeader:
]
mock_db = MagicMock()
mock_db.get_messages_as_conversation.return_value = db_history
- adapter._session_db = mock_db
- app = _create_app(adapter)
+ auth_adapter._session_db = mock_db
+ app = _create_app(auth_adapter)
async with TestClient(TestServer(app)) as cli:
- with patch.object(adapter, "_run_agent", new_callable=AsyncMock) as mock_run:
+ with patch.object(auth_adapter, "_run_agent", new_callable=AsyncMock) as mock_run:
mock_run.return_value = (mock_result, {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0})
resp = await cli.post(
"/v1/chat/completions",
- headers={"X-Hermes-Session-Id": "existing-session"},
+ headers={"X-Hermes-Session-Id": "existing-session", "Authorization": "Bearer sk-secret"},
# Request body has different history — should be ignored
json={
"model": "hermes-agent",
@@ -1662,20 +1696,20 @@ class TestSessionIdHeader:
assert call_kwargs["user_message"] == "new question"
@pytest.mark.asyncio
- async def test_db_failure_falls_back_to_empty_history(self, adapter):
+ async def test_db_failure_falls_back_to_empty_history(self, auth_adapter):
"""If SessionDB raises, history falls back to empty and request still succeeds."""
mock_result = {"final_response": "OK", "messages": [], "api_calls": 1}
# Simulate DB failure: _session_db is None and SessionDB() constructor raises
- adapter._session_db = None
- app = _create_app(adapter)
+ auth_adapter._session_db = None
+ app = _create_app(auth_adapter)
async with TestClient(TestServer(app)) as cli:
- with patch.object(adapter, "_run_agent", new_callable=AsyncMock) as mock_run, \
+ with patch.object(auth_adapter, "_run_agent", new_callable=AsyncMock) as mock_run, \
patch("hermes_state.SessionDB", side_effect=Exception("DB unavailable")):
mock_run.return_value = (mock_result, {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0})
resp = await cli.post(
"/v1/chat/completions",
- headers={"X-Hermes-Session-Id": "some-session"},
+ headers={"X-Hermes-Session-Id": "some-session", "Authorization": "Bearer sk-secret"},
json={"model": "hermes-agent", "messages": [{"role": "user", "content": "Hi"}]},
)
@@ -81,6 +81,7 @@ def adapter(monkeypatch):
config = PlatformConfig(enabled=True, token="fake-token")
adapter = DiscordAdapter(config)
adapter._client = SimpleNamespace(user=SimpleNamespace(id=999))
adapter._text_batch_delay_seconds = 0 # disable batching for tests
adapter.handle_message = AsyncMock()
return adapter
@@ -91,6 +91,7 @@ def adapter(monkeypatch):
config = PlatformConfig(enabled=True, token="fake-token")
adapter = DiscordAdapter(config)
adapter._client = SimpleNamespace(user=SimpleNamespace(id=999))
adapter._text_batch_delay_seconds = 0 # disable batching for tests
adapter.handle_message = AsyncMock()
return adapter
@@ -62,6 +62,7 @@ def adapter():
fetch_channel=AsyncMock(),
user=SimpleNamespace(id=99999, name="HermesBot"),
)
adapter._text_batch_delay_seconds = 0 # disable batching for tests
return adapter
+1
@@ -44,6 +44,7 @@ def _make_adapter(tmp_path=None):
},
)
adapter = MatrixAdapter(config)
adapter._text_batch_delay_seconds = 0 # disable batching for tests
adapter.handle_message = AsyncMock()
adapter._startup_ts = time.time() - 10 # avoid startup grace filter
return adapter
@@ -0,0 +1,63 @@
"""Regression tests for gateway /model support of config.yaml custom_providers."""
import yaml
import pytest
from gateway.config import Platform
from gateway.platforms.base import MessageEvent, MessageType
from gateway.run import GatewayRunner
from gateway.session import SessionSource
def _make_runner():
runner = object.__new__(GatewayRunner)
runner.adapters = {}
runner._voice_mode = {}
runner._session_model_overrides = {}
return runner
def _make_event(text="/model"):
return MessageEvent(
text=text,
message_type=MessageType.TEXT,
source=SessionSource(platform=Platform.TELEGRAM, chat_id="12345", chat_type="dm"),
)
@pytest.mark.asyncio
async def test_handle_model_command_lists_saved_custom_provider(tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
(hermes_home / "config.yaml").write_text(
yaml.safe_dump(
{
"model": {
"default": "gpt-5.4",
"provider": "openai-codex",
"base_url": "https://chatgpt.com/backend-api/codex",
},
"providers": {},
"custom_providers": [
{
"name": "Local (127.0.0.1:4141)",
"base_url": "http://127.0.0.1:4141/v1",
"model": "rotator-openrouter-coding",
}
],
}
),
encoding="utf-8",
)
import gateway.run as gateway_run
monkeypatch.setattr(gateway_run, "_hermes_home", hermes_home)
monkeypatch.setattr("agent.models_dev.fetch_models_dev", lambda: {})
result = await _make_runner()._handle_model_command(_make_event())
assert result is not None
assert "Local (127.0.0.1:4141)" in result
assert "custom:local-(127.0.0.1:4141)" in result
assert "rotator-openrouter-coding" in result
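Reviewer sketch: the regression test above expects the provider named `Local (127.0.0.1:4141)` to be listed under the slug `custom:local-(127.0.0.1:4141)`. One transformation consistent with that single example is "lowercase, spaces to hyphens, `custom:` prefix". The helpers below are hypothetical and derived from that one data point only; the real `custom_provider_slug()` in `hermes_cli/providers.py` may normalize other characters differently.

```python
def sketch_custom_provider_slug(name: str) -> str:
    """Build a 'custom:<name>' slug (assumed rule: lowercase, spaces -> hyphens)."""
    return "custom:" + name.strip().lower().replace(" ", "-")


def sketch_find_custom_provider(slug: str, custom_providers: list):
    """Return the config entry whose slug matches, or None if there is none."""
    for entry in custom_providers:
        if sketch_custom_provider_slug(entry.get("name", "")) == slug:
            return entry
    return None
```

Deriving the slug in one place and reusing it for both listing and lookup is the point of the "single source of truth" helper described in the commit message.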
@@ -0,0 +1,245 @@
"""Tests that gateway /model switch persists across messages.
The gateway /model command stores session overrides in
``_session_model_overrides``. These must:
1. Be applied in ``run_sync()`` so the next agent uses the switched model.
2. Not be mistaken for fallback activation (which evicts the cached agent).
3. Survive across multiple messages until /reset clears them.
Tests exercise the real ``_apply_session_model_override()`` and
``_is_intentional_model_switch()`` methods on ``GatewayRunner``.
"""
from datetime import datetime
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock
import pytest
from gateway.config import GatewayConfig, Platform, PlatformConfig
from gateway.session import SessionEntry, SessionSource, build_session_key
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _make_source() -> SessionSource:
return SessionSource(
platform=Platform.TELEGRAM,
user_id="u1",
chat_id="c1",
user_name="tester",
chat_type="dm",
)
def _make_runner():
"""Create a minimal GatewayRunner with stubbed internals."""
from gateway.run import GatewayRunner
runner = object.__new__(GatewayRunner)
runner.config = GatewayConfig(
platforms={Platform.TELEGRAM: PlatformConfig(enabled=True, token="tok")}
)
adapter = MagicMock()
adapter.send = AsyncMock()
runner.adapters = {Platform.TELEGRAM: adapter}
runner._voice_mode = {}
runner.hooks = SimpleNamespace(emit=AsyncMock(), loaded_hooks=False)
runner._session_model_overrides = {}
runner._pending_model_notes = {}
runner._background_tasks = set()
runner._running_agents = {}
runner._pending_messages = {}
runner._pending_approvals = {}
runner._session_db = None
runner._agent_cache = {}
runner._agent_cache_lock = None
runner._effective_model = None
runner._effective_provider = None
runner.session_store = MagicMock()
session_key = build_session_key(_make_source())
session_entry = SessionEntry(
session_key=session_key,
session_id="sess-1",
created_at=datetime.now(),
updated_at=datetime.now(),
platform=Platform.TELEGRAM,
chat_type="dm",
)
runner.session_store.get_or_create_session.return_value = session_entry
runner.session_store._entries = {session_key: session_entry}
return runner
# ---------------------------------------------------------------------------
# Tests: _apply_session_model_override
# ---------------------------------------------------------------------------
class TestApplySessionModelOverride:
"""Verify _apply_session_model_override replaces config defaults."""
def test_override_replaces_all_fields(self):
runner = _make_runner()
sk = build_session_key(_make_source())
runner._session_model_overrides[sk] = {
"model": "gpt-5.4-turbo",
"provider": "openrouter",
"api_key": "or-key-123",
"base_url": "https://openrouter.ai/api/v1",
"api_mode": "chat_completions",
}
model, rt = runner._apply_session_model_override(
sk,
"anthropic/claude-sonnet-4",
{"provider": "anthropic", "api_key": "ant-key", "base_url": "https://api.anthropic.com", "api_mode": "anthropic_messages"},
)
assert model == "gpt-5.4-turbo"
assert rt["provider"] == "openrouter"
assert rt["api_key"] == "or-key-123"
assert rt["base_url"] == "https://openrouter.ai/api/v1"
assert rt["api_mode"] == "chat_completions"
def test_no_override_returns_originals(self):
runner = _make_runner()
sk = build_session_key(_make_source())
orig_model = "anthropic/claude-sonnet-4"
orig_rt = {"provider": "anthropic", "api_key": "key", "base_url": "https://api.anthropic.com", "api_mode": "anthropic_messages"}
model, rt = runner._apply_session_model_override(sk, orig_model, dict(orig_rt))
assert model == orig_model
assert rt == orig_rt
def test_none_values_do_not_overwrite(self):
"""Override with None api_key/base_url should preserve config defaults."""
runner = _make_runner()
sk = build_session_key(_make_source())
runner._session_model_overrides[sk] = {
"model": "gpt-5.4",
"provider": "openai",
"api_key": None,
"base_url": None,
"api_mode": "chat_completions",
}
model, rt = runner._apply_session_model_override(
sk,
"anthropic/claude-sonnet-4",
{"provider": "anthropic", "api_key": "ant-key", "base_url": "https://api.anthropic.com", "api_mode": "anthropic_messages"},
)
assert model == "gpt-5.4"
assert rt["provider"] == "openai"
assert rt["api_key"] == "ant-key" # preserved — None didn't overwrite
assert rt["base_url"] == "https://api.anthropic.com" # preserved
assert rt["api_mode"] == "chat_completions" # overwritten (not None)
def test_empty_string_overwrites(self):
"""Empty string is not None — it should overwrite the config value."""
runner = _make_runner()
sk = build_session_key(_make_source())
runner._session_model_overrides[sk] = {
"model": "local-model",
"provider": "custom",
"api_key": "local-key",
"base_url": "",
"api_mode": "chat_completions",
}
_, rt = runner._apply_session_model_override(
sk,
"anthropic/claude-sonnet-4",
{"provider": "anthropic", "api_key": "ant-key", "base_url": "https://api.anthropic.com", "api_mode": "anthropic_messages"},
)
assert rt["base_url"] == "" # empty string overwrites
def test_different_session_key_not_affected(self):
runner = _make_runner()
sk = build_session_key(_make_source())
other_sk = "other_session"
runner._session_model_overrides[other_sk] = {
"model": "gpt-5.4",
"provider": "openai",
"api_key": "key",
"base_url": "",
"api_mode": "chat_completions",
}
model, rt = runner._apply_session_model_override(
sk,
"anthropic/claude-sonnet-4",
{"provider": "anthropic", "api_key": "ant-key", "base_url": "url", "api_mode": "anthropic_messages"},
)
assert model == "anthropic/claude-sonnet-4" # unchanged — wrong session key
# ---------------------------------------------------------------------------
# Tests: _is_intentional_model_switch
# ---------------------------------------------------------------------------
class TestIsIntentionalModelSwitch:
"""Verify fallback detection respects intentional /model overrides."""
def test_matches_override(self):
runner = _make_runner()
sk = build_session_key(_make_source())
runner._session_model_overrides[sk] = {
"model": "gpt-5.4",
"provider": "openai",
"api_key": "key",
"base_url": "",
"api_mode": "chat_completions",
}
assert runner._is_intentional_model_switch(sk, "gpt-5.4") is True
def test_no_override_returns_false(self):
runner = _make_runner()
sk = build_session_key(_make_source())
assert runner._is_intentional_model_switch(sk, "gpt-5.4") is False
def test_different_model_returns_false(self):
"""Agent fell back to a different model than the override."""
runner = _make_runner()
sk = build_session_key(_make_source())
runner._session_model_overrides[sk] = {
"model": "gpt-5.4",
"provider": "openai",
"api_key": "key",
"base_url": "",
"api_mode": "chat_completions",
}
assert runner._is_intentional_model_switch(sk, "gpt-5.4-mini") is False
def test_wrong_session_key(self):
runner = _make_runner()
sk = build_session_key(_make_source())
runner._session_model_overrides["other_session"] = {
"model": "gpt-5.4",
"provider": "openai",
"api_key": "key",
"base_url": "",
"api_mode": "chat_completions",
}
assert runner._is_intentional_model_switch(sk, "gpt-5.4") is False
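Reviewer sketch: taken together, the tests in this file pin down two rules. An override replaces a runtime field only when its value is not `None` (so an empty string still overwrites), and a model change counts as intentional only when the session has an override whose `model` equals the agent's current model. The standalone functions below restate those rules. They take the overrides dict as a parameter and are hypothetical stand-ins, not the `GatewayRunner` methods themselves.

```python
from typing import Any, Dict, Tuple

# Runtime fields the tests show being carried by a session override.
_SKETCH_RUNTIME_FIELDS = ("provider", "api_key", "base_url", "api_mode")


def sketch_apply_override(
    overrides: Dict[str, Dict[str, Any]],
    session_key: str,
    model: str,
    runtime: Dict[str, Any],
) -> Tuple[str, Dict[str, Any]]:
    """Merge a session override into (model, runtime); None values are skipped."""
    override = overrides.get(session_key)
    if not override:
        return model, runtime
    merged = dict(runtime)  # do not mutate the caller's dict
    for field in _SKETCH_RUNTIME_FIELDS:
        value = override.get(field)
        if value is not None:  # "" is a real value and does overwrite
            merged[field] = value
    return override.get("model") or model, merged


def sketch_is_intentional_switch(
    overrides: Dict[str, Dict[str, Any]], session_key: str, agent_model: str
) -> bool:
    """True only if this session has an override naming exactly this model."""
    override = overrides.get(session_key)
    return bool(override) and override.get("model") == agent_model
```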
+43 -72
@@ -1,19 +1,17 @@
- """Tests for DM thread session seeding.
+ """Tests for DM thread session isolation.
- When a bot reply creates a thread in a DM (e.g. Slack), the user's reply
- in that thread gets a new session (keyed by thread_ts). The seeding logic
- copies the parent DM session's transcript into the new thread session so
- the bot retains context of the original conversation.
+ DM thread sessions must start empty, with no parent transcript seeding.
+ Thread context is handled by platform adapters (e.g. Slack's
+ _fetch_thread_context fetches actual thread replies via the API).
+ Session-level seeding was removed because it copied the ENTIRE parent
+ DM transcript, causing unrelated conversations to bleed across threads.
Covers:
- - Basic seeding: parent transcript copied to new thread session
- - No seeding for group/channel chats
- - No seeding when parent session doesn't exist
- - No seeding on auto-reset sessions
- - No seeding on existing (non-new) thread sessions
- - Parent transcript is not mutated by seeding
- - Multiple threads from same parent each get independent copies
- - Cross-platform: works for any platform with DM threads (Slack, Telegram, Discord)
+ - Thread sessions start empty (no parent seeding)
+ - Group/channel thread sessions also start empty
+ - Multiple threads from same parent are independent
+ - Existing thread sessions are not mutated on re-access
+ - Cross-platform: consistent behavior for Slack, Telegram, Discord
"""
import pytest
@@ -60,48 +58,41 @@ PARENT_HISTORY = [
]
class TestDMThreadSeeding:
"""Core seeding behavior."""
class TestDMThreadIsolation:
"""Thread sessions must start empty — no parent transcript seeding."""
def test_thread_session_seeded_from_parent(self, store):
"""New DM thread session should contain the parent's transcript."""
# Create parent DM session with history
def test_thread_session_starts_empty(self, store):
"""New DM thread session should NOT inherit parent's transcript."""
parent_source = _dm_source()
parent_entry = store.get_or_create_session(parent_source)
for msg in PARENT_HISTORY:
store.append_to_transcript(parent_entry.session_id, msg)
# Create thread session (user replied in thread)
thread_source = _dm_source(thread_id="1234567890.000001")
thread_entry = store.get_or_create_session(thread_source)
# Thread should have parent's history
thread_transcript = store.load_transcript(thread_entry.session_id)
assert len(thread_transcript) == 2
assert thread_transcript[0]["content"] == "What's the weather?"
assert thread_transcript[1]["content"] == "It's sunny and 72°F."
assert len(thread_transcript) == 0
def test_parent_transcript_not_mutated(self, store):
"""Seeding should not alter the parent session's transcript."""
def test_parent_transcript_unaffected_by_thread(self, store):
"""Creating a thread session should not alter parent's transcript."""
parent_source = _dm_source()
parent_entry = store.get_or_create_session(parent_source)
for msg in PARENT_HISTORY:
store.append_to_transcript(parent_entry.session_id, msg)
# Create thread and add a message to it
thread_source = _dm_source(thread_id="1234567890.000001")
thread_entry = store.get_or_create_session(thread_source)
store.append_to_transcript(thread_entry.session_id, {
"role": "user", "content": "thread-only message"
})
# Parent should still have only its original messages
parent_transcript = store.load_transcript(parent_entry.session_id)
assert len(parent_transcript) == 2
assert all(m["content"] != "thread-only message" for m in parent_transcript)
def test_multiple_threads_get_independent_copies(self, store):
"""Each thread from the same parent gets its own copy."""
def test_multiple_threads_are_independent(self, store):
"""Each thread from the same parent starts empty and stays independent."""
parent_source = _dm_source()
parent_entry = store.get_or_create_session(parent_source)
for msg in PARENT_HISTORY:
@@ -118,49 +109,43 @@ class TestDMThreadSeeding:
thread_b_source = _dm_source(thread_id="2222.000002")
thread_b_entry = store.get_or_create_session(thread_b_source)
# Thread B should have parent history, not thread A's additions
# Thread B starts empty
thread_b_transcript = store.load_transcript(thread_b_entry.session_id)
assert len(thread_b_transcript) == 2
assert all(m["content"] != "thread A message" for m in thread_b_transcript)
assert len(thread_b_transcript) == 0
# Thread A should have parent history + its own message
# Thread A has only its own message
thread_a_transcript = store.load_transcript(thread_a_entry.session_id)
assert len(thread_a_transcript) == 3
assert len(thread_a_transcript) == 1
assert thread_a_transcript[0]["content"] == "thread A message"
def test_existing_thread_session_not_reseeded(self, store):
"""Returning to an existing thread session should not re-copy parent history."""
def test_existing_thread_session_preserved(self, store):
"""Returning to an existing thread session should not reset it."""
parent_source = _dm_source()
parent_entry = store.get_or_create_session(parent_source)
for msg in PARENT_HISTORY:
store.append_to_transcript(parent_entry.session_id, msg)
# Create thread session
thread_source = _dm_source(thread_id="1234567890.000001")
thread_entry = store.get_or_create_session(thread_source)
store.append_to_transcript(thread_entry.session_id, {
"role": "user", "content": "follow-up"
})
# Add more to parent after thread was created
store.append_to_transcript(parent_entry.session_id, {
"role": "user", "content": "new parent message"
})
# Get the same thread session again (not new — created_at != updated_at)
# Get the same thread session again
thread_entry_again = store.get_or_create_session(thread_source)
assert thread_entry_again.session_id == thread_entry.session_id
# Should still have 3 messages (2 seeded + 1 follow-up), not re-seeded
# Should still have only its own message
thread_transcript = store.load_transcript(thread_entry_again.session_id)
assert len(thread_transcript) == 3
assert thread_transcript[2]["content"] == "follow-up"
assert len(thread_transcript) == 1
assert thread_transcript[0]["content"] == "follow-up"
class TestDMThreadSeedingEdgeCases:
"""Edge cases and conditions where seeding should NOT happen."""
class TestDMThreadIsolationEdgeCases:
"""Edge cases — threads always start empty regardless of context."""
def test_no_seeding_for_group_threads(self, store):
"""Group/channel threads should not trigger seeding."""
def test_group_thread_starts_empty(self, store):
"""Group/channel threads should also start empty."""
parent_source = _group_source()
parent_entry = store.get_or_create_session(parent_source)
for msg in PARENT_HISTORY:
@@ -172,7 +157,7 @@ class TestDMThreadSeedingEdgeCases:
thread_transcript = store.load_transcript(thread_entry.session_id)
assert len(thread_transcript) == 0
-def test_no_seeding_without_parent_session(self, store):
+def test_thread_without_parent_session_starts_empty(self, store):
"""Thread session without a parent DM session should start empty."""
thread_source = _dm_source(thread_id="1234567890.000001")
thread_entry = store.get_or_create_session(thread_source)
@@ -180,34 +165,21 @@ class TestDMThreadSeedingEdgeCases:
thread_transcript = store.load_transcript(thread_entry.session_id)
assert len(thread_transcript) == 0
-def test_no_seeding_with_empty_parent(self, store):
-"""If parent session exists but has no transcript, thread starts empty."""
-parent_source = _dm_source()
-store.get_or_create_session(parent_source)
-# No messages appended to parent
-thread_source = _dm_source(thread_id="1234567890.000001")
-thread_entry = store.get_or_create_session(thread_source)
-thread_transcript = store.load_transcript(thread_entry.session_id)
-assert len(thread_transcript) == 0
-def test_no_seeding_for_dm_without_thread_id(self, store):
-"""Top-level DMs (no thread_id) should not trigger seeding."""
+def test_dm_without_thread_starts_empty(self, store):
+"""Top-level DMs (no thread_id) should start empty as always."""
source = _dm_source()
entry = store.get_or_create_session(source)
# Should just be a normal empty session
transcript = store.load_transcript(entry.session_id)
assert len(transcript) == 0
-class TestDMThreadSeedingCrossPlatform:
-"""Verify seeding works for platforms beyond Slack."""
+class TestDMThreadIsolationCrossPlatform:
+"""Verify thread isolation is consistent across all platforms."""
@pytest.mark.parametrize("platform", [Platform.SLACK, Platform.TELEGRAM, Platform.DISCORD])
-def test_seeding_works_across_platforms(self, store, platform):
-"""DM thread seeding should work for any platform that uses thread_id."""
+def test_thread_starts_empty_across_platforms(self, store, platform):
+"""DM thread sessions start empty regardless of platform."""
parent_source = _dm_source(platform=platform)
parent_entry = store.get_or_create_session(parent_source)
for msg in PARENT_HISTORY:
@@ -217,5 +189,4 @@ class TestDMThreadSeedingCrossPlatform:
thread_entry = store.get_or_create_session(thread_source)
thread_transcript = store.load_transcript(thread_entry.session_id)
-assert len(thread_transcript) == 2
-assert thread_transcript[0]["content"] == "What's the weather?"
+assert len(thread_transcript) == 0
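The renamed tests above pin down one contract: a thread gets its own session, keyed by `thread_id`, and that session starts empty whatever the parent DM already holds. A minimal in-memory sketch of that contract follows. `Source`, `ToyStore`, and the key layout are hypothetical illustrations and are not the gateway's actual session store.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Source:
    """Hypothetical stand-in for the gateway's SessionSource."""
    platform: str
    chat_id: str
    thread_id: Optional[str] = None


@dataclass
class ToyStore:
    """Illustrative store: one transcript per (platform, chat, thread) key."""
    transcripts: dict = field(default_factory=dict)

    def get_or_create_session(self, source: Source) -> tuple:
        key = (source.platform, source.chat_id, source.thread_id)
        # A new key always starts empty; nothing is copied from the parent.
        self.transcripts.setdefault(key, [])
        return key

    def append_to_transcript(self, key: tuple, msg: dict) -> None:
        self.transcripts[key].append(msg)

    def load_transcript(self, key: tuple) -> list:
        return list(self.transcripts[key])


store = ToyStore()
parent = store.get_or_create_session(Source("slack", "D123"))
store.append_to_transcript(parent, {"role": "user", "content": "What's the weather?"})

thread = store.get_or_create_session(Source("slack", "D123", thread_id="1234567890.000001"))
store.append_to_transcript(thread, {"role": "user", "content": "follow-up"})

# Returning to the thread yields the same session, holding only its own message.
again = store.get_or_create_session(Source("slack", "D123", thread_id="1234567890.000001"))
```

Keying on the full tuple is the design choice that makes isolation automatic: the parent DM and each thread simply never share a transcript list.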
+448
@@ -0,0 +1,448 @@
"""Tests for text message batching across all gateway adapters.
When a user sends a long message, the messaging client splits it at the
platform's character limit. Each adapter should buffer rapid successive
text messages from the same session and aggregate them before dispatching.
Covers: Discord, Matrix, WeCom, and the adaptive delay logic for
Telegram and Feishu.
"""
import asyncio
import os
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import MessageEvent, MessageType, SessionSource
# =====================================================================
# Helpers
# =====================================================================
def _make_event(
text: str,
platform: Platform,
chat_id: str = "12345",
msg_type: MessageType = MessageType.TEXT,
) -> MessageEvent:
return MessageEvent(
text=text,
message_type=msg_type,
source=SessionSource(platform=platform, chat_id=chat_id, chat_type="dm"),
)
# =====================================================================
# Discord text batching
# =====================================================================
def _make_discord_adapter():
"""Create a minimal DiscordAdapter for testing text batching."""
from gateway.platforms.discord import DiscordAdapter
config = PlatformConfig(enabled=True, token="test-token")
adapter = object.__new__(DiscordAdapter)
adapter._platform = Platform.DISCORD
adapter.config = config
adapter._pending_text_batches = {}
adapter._pending_text_batch_tasks = {}
adapter._text_batch_delay_seconds = 0.1 # fast for tests
adapter._text_batch_split_delay_seconds = 0.3 # fast for tests
adapter._active_sessions = {}
adapter._pending_messages = {}
adapter._message_handler = AsyncMock()
adapter.handle_message = AsyncMock()
return adapter
class TestDiscordTextBatching:
@pytest.mark.asyncio
async def test_single_message_dispatched_after_delay(self):
adapter = _make_discord_adapter()
event = _make_event("hello world", Platform.DISCORD)
adapter._enqueue_text_event(event)
# Not dispatched yet
adapter.handle_message.assert_not_called()
# Wait for flush
await asyncio.sleep(0.2)
adapter.handle_message.assert_called_once()
dispatched = adapter.handle_message.call_args[0][0]
assert dispatched.text == "hello world"
@pytest.mark.asyncio
async def test_split_messages_aggregated(self):
"""Two rapid messages from the same chat should be merged."""
adapter = _make_discord_adapter()
adapter._enqueue_text_event(_make_event("Part one of a long", Platform.DISCORD))
await asyncio.sleep(0.02)
adapter._enqueue_text_event(_make_event("message that was split.", Platform.DISCORD))
adapter.handle_message.assert_not_called()
await asyncio.sleep(0.2)
adapter.handle_message.assert_called_once()
text = adapter.handle_message.call_args[0][0].text
assert "Part one" in text
assert "split" in text
@pytest.mark.asyncio
async def test_three_way_split_aggregated(self):
adapter = _make_discord_adapter()
adapter._enqueue_text_event(_make_event("chunk 1", Platform.DISCORD))
await asyncio.sleep(0.02)
adapter._enqueue_text_event(_make_event("chunk 2", Platform.DISCORD))
await asyncio.sleep(0.02)
adapter._enqueue_text_event(_make_event("chunk 3", Platform.DISCORD))
await asyncio.sleep(0.2)
adapter.handle_message.assert_called_once()
text = adapter.handle_message.call_args[0][0].text
assert "chunk 1" in text
assert "chunk 2" in text
assert "chunk 3" in text
@pytest.mark.asyncio
async def test_different_chats_not_merged(self):
adapter = _make_discord_adapter()
adapter._enqueue_text_event(_make_event("from A", Platform.DISCORD, chat_id="111"))
adapter._enqueue_text_event(_make_event("from B", Platform.DISCORD, chat_id="222"))
await asyncio.sleep(0.2)
assert adapter.handle_message.call_count == 2
@pytest.mark.asyncio
async def test_batch_cleans_up_after_flush(self):
adapter = _make_discord_adapter()
adapter._enqueue_text_event(_make_event("test", Platform.DISCORD))
await asyncio.sleep(0.2)
assert len(adapter._pending_text_batches) == 0
@pytest.mark.asyncio
async def test_adaptive_delay_for_near_limit_chunk(self):
"""Chunks near the 2000-char limit should trigger longer delay."""
adapter = _make_discord_adapter()
# Simulate a chunk near Discord's 2000-char split point
long_text = "x" * 1950
adapter._enqueue_text_event(_make_event(long_text, Platform.DISCORD))
# After the short delay (0.1s), should NOT have flushed yet (split delay is 0.3s)
await asyncio.sleep(0.15)
adapter.handle_message.assert_not_called()
# After the split delay, should be flushed
await asyncio.sleep(0.25)
adapter.handle_message.assert_called_once()
# =====================================================================
# Matrix text batching
# =====================================================================
def _make_matrix_adapter():
"""Create a minimal MatrixAdapter for testing text batching."""
from gateway.platforms.matrix import MatrixAdapter
config = PlatformConfig(enabled=True, token="test-token")
adapter = object.__new__(MatrixAdapter)
adapter._platform = Platform.MATRIX
adapter.config = config
adapter._pending_text_batches = {}
adapter._pending_text_batch_tasks = {}
adapter._text_batch_delay_seconds = 0.1
adapter._text_batch_split_delay_seconds = 0.3
adapter._active_sessions = {}
adapter._pending_messages = {}
adapter._message_handler = AsyncMock()
adapter.handle_message = AsyncMock()
return adapter
class TestMatrixTextBatching:
@pytest.mark.asyncio
async def test_single_message_dispatched_after_delay(self):
adapter = _make_matrix_adapter()
event = _make_event("hello world", Platform.MATRIX)
adapter._enqueue_text_event(event)
adapter.handle_message.assert_not_called()
await asyncio.sleep(0.2)
adapter.handle_message.assert_called_once()
assert adapter.handle_message.call_args[0][0].text == "hello world"
@pytest.mark.asyncio
async def test_split_messages_aggregated(self):
adapter = _make_matrix_adapter()
adapter._enqueue_text_event(_make_event("first part", Platform.MATRIX))
await asyncio.sleep(0.02)
adapter._enqueue_text_event(_make_event("second part", Platform.MATRIX))
adapter.handle_message.assert_not_called()
await asyncio.sleep(0.2)
adapter.handle_message.assert_called_once()
text = adapter.handle_message.call_args[0][0].text
assert "first part" in text
assert "second part" in text
@pytest.mark.asyncio
async def test_different_rooms_not_merged(self):
adapter = _make_matrix_adapter()
adapter._enqueue_text_event(_make_event("room A", Platform.MATRIX, chat_id="!aaa:matrix.org"))
adapter._enqueue_text_event(_make_event("room B", Platform.MATRIX, chat_id="!bbb:matrix.org"))
await asyncio.sleep(0.2)
assert adapter.handle_message.call_count == 2
@pytest.mark.asyncio
async def test_adaptive_delay_for_near_limit_chunk(self):
"""Chunks near the 4000-char limit should trigger longer delay."""
adapter = _make_matrix_adapter()
long_text = "x" * 3950
adapter._enqueue_text_event(_make_event(long_text, Platform.MATRIX))
await asyncio.sleep(0.15)
adapter.handle_message.assert_not_called()
await asyncio.sleep(0.25)
adapter.handle_message.assert_called_once()
@pytest.mark.asyncio
async def test_batch_cleans_up_after_flush(self):
adapter = _make_matrix_adapter()
adapter._enqueue_text_event(_make_event("test", Platform.MATRIX))
await asyncio.sleep(0.2)
assert len(adapter._pending_text_batches) == 0
# =====================================================================
# WeCom text batching
# =====================================================================
def _make_wecom_adapter():
"""Create a minimal WeComAdapter for testing text batching."""
from gateway.platforms.wecom import WeComAdapter
config = PlatformConfig(enabled=True, token="test-token")
adapter = object.__new__(WeComAdapter)
adapter._platform = Platform.WECOM
adapter.config = config
adapter._pending_text_batches = {}
adapter._pending_text_batch_tasks = {}
adapter._text_batch_delay_seconds = 0.1
adapter._text_batch_split_delay_seconds = 0.3
adapter._active_sessions = {}
adapter._pending_messages = {}
adapter._message_handler = AsyncMock()
adapter.handle_message = AsyncMock()
return adapter
class TestWeComTextBatching:
@pytest.mark.asyncio
async def test_single_message_dispatched_after_delay(self):
adapter = _make_wecom_adapter()
event = _make_event("hello world", Platform.WECOM)
adapter._enqueue_text_event(event)
adapter.handle_message.assert_not_called()
await asyncio.sleep(0.2)
adapter.handle_message.assert_called_once()
assert adapter.handle_message.call_args[0][0].text == "hello world"
@pytest.mark.asyncio
async def test_split_messages_aggregated(self):
adapter = _make_wecom_adapter()
adapter._enqueue_text_event(_make_event("first part", Platform.WECOM))
await asyncio.sleep(0.02)
adapter._enqueue_text_event(_make_event("second part", Platform.WECOM))
adapter.handle_message.assert_not_called()
await asyncio.sleep(0.2)
adapter.handle_message.assert_called_once()
text = adapter.handle_message.call_args[0][0].text
assert "first part" in text
assert "second part" in text
@pytest.mark.asyncio
async def test_different_chats_not_merged(self):
adapter = _make_wecom_adapter()
adapter._enqueue_text_event(_make_event("chat A", Platform.WECOM, chat_id="chat_a"))
adapter._enqueue_text_event(_make_event("chat B", Platform.WECOM, chat_id="chat_b"))
await asyncio.sleep(0.2)
assert adapter.handle_message.call_count == 2
@pytest.mark.asyncio
async def test_adaptive_delay_for_near_limit_chunk(self):
"""Chunks near the 4000-char limit should trigger longer delay."""
adapter = _make_wecom_adapter()
long_text = "x" * 3950
adapter._enqueue_text_event(_make_event(long_text, Platform.WECOM))
await asyncio.sleep(0.15)
adapter.handle_message.assert_not_called()
await asyncio.sleep(0.25)
adapter.handle_message.assert_called_once()
@pytest.mark.asyncio
async def test_batch_cleans_up_after_flush(self):
adapter = _make_wecom_adapter()
adapter._enqueue_text_event(_make_event("test", Platform.WECOM))
await asyncio.sleep(0.2)
assert len(adapter._pending_text_batches) == 0
# =====================================================================
# Telegram adaptive delay (PR #6891)
# =====================================================================
def _make_telegram_adapter():
"""Create a minimal TelegramAdapter for testing adaptive delay."""
from gateway.platforms.telegram import TelegramAdapter
config = PlatformConfig(enabled=True, token="test-token")
adapter = object.__new__(TelegramAdapter)
adapter._platform = Platform.TELEGRAM
adapter.config = config
adapter._pending_text_batches = {}
adapter._pending_text_batch_tasks = {}
adapter._text_batch_delay_seconds = 0.1
adapter._text_batch_split_delay_seconds = 0.3
adapter._active_sessions = {}
adapter._pending_messages = {}
adapter._message_handler = AsyncMock()
adapter.handle_message = AsyncMock()
return adapter
class TestTelegramAdaptiveDelay:
@pytest.mark.asyncio
async def test_short_chunk_uses_normal_delay(self):
adapter = _make_telegram_adapter()
adapter._enqueue_text_event(_make_event("short msg", Platform.TELEGRAM))
# Should flush after the normal 0.1s delay
await asyncio.sleep(0.15)
adapter.handle_message.assert_called_once()
@pytest.mark.asyncio
async def test_near_limit_chunk_uses_split_delay(self):
"""A chunk near the 4096-char limit should trigger longer delay."""
adapter = _make_telegram_adapter()
long_text = "x" * 4050 # near the 4096 limit
adapter._enqueue_text_event(_make_event(long_text, Platform.TELEGRAM))
# After the short delay, should NOT have flushed yet
await asyncio.sleep(0.15)
adapter.handle_message.assert_not_called()
# After the split delay, should be flushed
await asyncio.sleep(0.25)
adapter.handle_message.assert_called_once()
@pytest.mark.asyncio
async def test_split_continuation_merged(self):
"""Two near-limit chunks should both be merged."""
adapter = _make_telegram_adapter()
adapter._enqueue_text_event(_make_event("x" * 4050, Platform.TELEGRAM))
await asyncio.sleep(0.05)
adapter._enqueue_text_event(_make_event("continuation text", Platform.TELEGRAM))
# Short chunk arrived → should use normal delay now
await asyncio.sleep(0.15)
adapter.handle_message.assert_called_once()
text = adapter.handle_message.call_args[0][0].text
assert "continuation text" in text
# =====================================================================
# Feishu adaptive delay
# =====================================================================
def _make_feishu_adapter():
"""Create a minimal FeishuAdapter for testing adaptive delay."""
from gateway.platforms.feishu import FeishuAdapter, FeishuBatchState
config = PlatformConfig(enabled=True, token="test-token")
adapter = object.__new__(FeishuAdapter)
adapter._platform = Platform.FEISHU
adapter.config = config
batch_state = FeishuBatchState()
adapter._pending_text_batches = batch_state.events
adapter._pending_text_batch_tasks = batch_state.tasks
adapter._pending_text_batch_counts = batch_state.counts
adapter._text_batch_delay_seconds = 0.1
adapter._text_batch_split_delay_seconds = 0.3
adapter._text_batch_max_messages = 20
adapter._text_batch_max_chars = 50000
adapter._active_sessions = {}
adapter._pending_messages = {}
adapter._message_handler = AsyncMock()
adapter._handle_message_with_guards = AsyncMock()
return adapter
class TestFeishuAdaptiveDelay:
@pytest.mark.asyncio
async def test_short_chunk_uses_normal_delay(self):
adapter = _make_feishu_adapter()
event = _make_event("short msg", Platform.FEISHU)
await adapter._enqueue_text_event(event)
await asyncio.sleep(0.15)
adapter._handle_message_with_guards.assert_called_once()
@pytest.mark.asyncio
async def test_near_limit_chunk_uses_split_delay(self):
"""A chunk near the 4096-char limit should trigger longer delay."""
adapter = _make_feishu_adapter()
long_text = "x" * 4050
event = _make_event(long_text, Platform.FEISHU)
await adapter._enqueue_text_event(event)
await asyncio.sleep(0.15)
adapter._handle_message_with_guards.assert_not_called()
await asyncio.sleep(0.25)
adapter._handle_message_with_guards.assert_called_once()
@pytest.mark.asyncio
async def test_split_continuation_merged(self):
adapter = _make_feishu_adapter()
await adapter._enqueue_text_event(_make_event("x" * 4050, Platform.FEISHU))
await asyncio.sleep(0.05)
await adapter._enqueue_text_event(_make_event("continuation text", Platform.FEISHU))
await asyncio.sleep(0.15)
adapter._handle_message_with_guards.assert_called_once()
text = adapter._handle_message_with_guards.call_args[0][0].text
assert "continuation text" in text
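The batching tests above all exercise the same idea: hold incoming text for a short quiet period, restart the timer on each new chunk, and wait longer when a chunk is close to the platform limit because the client probably split it. The sketch below is one way to get that behaviour, under stated assumptions. `ToyBatcher`, its `margin` threshold, and the newline join are hypothetical; the adapters' real `_enqueue_text_event` implementations are not shown in this diff.

```python
import asyncio


class ToyBatcher:
    """Illustrative batcher; names and thresholds are assumptions, not adapter code."""

    def __init__(self, dispatch, limit=4096, delay=0.05, split_delay=0.3, margin=100):
        self.dispatch = dispatch        # async callable receiving (chat_id, merged text)
        self.limit = limit              # platform character limit (assumed value)
        self.delay = delay              # normal quiet period in seconds
        self.split_delay = split_delay  # longer wait after a near-limit chunk
        self.margin = margin            # how close to the limit counts as "near"
        self.pending = {}               # chat_id -> list of buffered chunks
        self.tasks = {}                 # chat_id -> scheduled flush task

    def pick_delay(self, chunk: str) -> float:
        # A chunk close to the limit was probably split by the client, so wait longer.
        near_limit = len(chunk) >= self.limit - self.margin
        return self.split_delay if near_limit else self.delay

    def enqueue(self, chat_id: str, chunk: str) -> None:
        self.pending.setdefault(chat_id, []).append(chunk)
        old = self.tasks.pop(chat_id, None)
        if old is not None:
            old.cancel()                # restart the timer on every new chunk
        wait = self.pick_delay(chunk)
        self.tasks[chat_id] = asyncio.create_task(self._flush_later(chat_id, wait))

    async def _flush_later(self, chat_id: str, wait: float) -> None:
        await asyncio.sleep(wait)
        chunks = self.pending.pop(chat_id, [])
        self.tasks.pop(chat_id, None)
        if chunks:
            await self.dispatch(chat_id, "\n".join(chunks))


async def demo():
    sent = []

    async def dispatch(chat_id, text):
        sent.append((chat_id, text))

    batcher = ToyBatcher(dispatch)
    batcher.enqueue("111", "x" * 4050)           # near-limit chunk: long wait
    await asyncio.sleep(0.01)
    batcher.enqueue("111", "continuation text")  # short chunk: back to the normal wait
    batcher.enqueue("222", "other chat")         # different chat: separate batch
    await asyncio.sleep(0.5)
    return sent, batcher


sent, batcher = asyncio.run(demo())
```

Because the delay is chosen from the most recent chunk, a short continuation after a near-limit chunk flushes on the normal delay, which is the behaviour `test_split_continuation_merged` checks.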
+177
@@ -0,0 +1,177 @@
"""Tests for gateway /usage command — agent cache lookup and output fields."""
import asyncio
import threading
from unittest.mock import MagicMock, patch
import pytest
def _make_mock_agent(**overrides):
"""Create a mock AIAgent with realistic session counters."""
agent = MagicMock()
defaults = {
"model": "anthropic/claude-sonnet-4.6",
"provider": "openrouter",
"base_url": None,
"session_total_tokens": 50_000,
"session_api_calls": 5,
"session_prompt_tokens": 40_000,
"session_completion_tokens": 10_000,
"session_input_tokens": 35_000,
"session_output_tokens": 10_000,
"session_cache_read_tokens": 5_000,
"session_cache_write_tokens": 2_000,
}
defaults.update(overrides)
for k, v in defaults.items():
setattr(agent, k, v)
# Rate limit state
rl = MagicMock()
rl.has_data = True
agent.get_rate_limit_state.return_value = rl
# Context compressor
ctx = MagicMock()
ctx.last_prompt_tokens = 30_000
ctx.context_length = 200_000
ctx.compression_count = 1
agent.context_compressor = ctx
return agent
def _make_runner(session_key, agent=None, cached_agent=None):
"""Build a bare GatewayRunner with just the fields _handle_usage_command needs."""
from gateway.run import GatewayRunner, _AGENT_PENDING_SENTINEL
runner = object.__new__(GatewayRunner)
runner._running_agents = {}
runner._running_agents_ts = {}
runner._agent_cache = {}
runner._agent_cache_lock = threading.Lock()
runner.session_store = MagicMock()
if agent is not None:
runner._running_agents[session_key] = agent
if cached_agent is not None:
runner._agent_cache[session_key] = (cached_agent, "sig")
# Wire helper
runner._session_key_for_source = MagicMock(return_value=session_key)
return runner
SK = "agent:main:telegram:private:12345"
class TestUsageCachedAgent:
"""The main fix: /usage should find agents in _agent_cache between turns."""
@pytest.mark.asyncio
async def test_cached_agent_shows_detailed_usage(self):
agent = _make_mock_agent()
runner = _make_runner(SK, cached_agent=agent)
event = MagicMock()
with patch("agent.rate_limit_tracker.format_rate_limit_compact", return_value="RPM: 50/60"), \
patch("agent.usage_pricing.estimate_usage_cost") as mock_cost:
mock_cost.return_value = MagicMock(amount_usd=0.1234, status="estimated")
result = await runner._handle_usage_command(event)
assert "claude-sonnet-4.6" in result
assert "35,000" in result # input tokens
assert "10,000" in result # output tokens
assert "5,000" in result # cache read
assert "2,000" in result # cache write
assert "50,000" in result # total
assert "$0.1234" in result
assert "30,000" in result # context
assert "Compressions: 1" in result
@pytest.mark.asyncio
async def test_running_agent_preferred_over_cache(self):
"""When agent is in both dicts, the running one wins."""
running = _make_mock_agent(session_api_calls=10, session_total_tokens=80_000)
cached = _make_mock_agent(session_api_calls=5, session_total_tokens=50_000)
runner = _make_runner(SK, agent=running, cached_agent=cached)
event = MagicMock()
with patch("agent.rate_limit_tracker.format_rate_limit_compact", return_value="RPM: 50/60"), \
patch("agent.usage_pricing.estimate_usage_cost") as mock_cost:
mock_cost.return_value = MagicMock(amount_usd=None, status="unknown")
result = await runner._handle_usage_command(event)
assert "80,000" in result # running agent's total
assert "API calls: 10" in result
@pytest.mark.asyncio
async def test_sentinel_skipped_uses_cache(self):
"""PENDING sentinel in _running_agents should fall through to cache."""
from gateway.run import _AGENT_PENDING_SENTINEL
cached = _make_mock_agent()
runner = _make_runner(SK, cached_agent=cached)
runner._running_agents[SK] = _AGENT_PENDING_SENTINEL
event = MagicMock()
with patch("agent.rate_limit_tracker.format_rate_limit_compact", return_value="RPM: 50/60"), \
patch("agent.usage_pricing.estimate_usage_cost") as mock_cost:
mock_cost.return_value = MagicMock(amount_usd=None, status="unknown")
result = await runner._handle_usage_command(event)
assert "claude-sonnet-4.6" in result
assert "Session Token Usage" in result
@pytest.mark.asyncio
async def test_no_agent_anywhere_falls_to_history(self):
"""No running or cached agent → rough estimate from transcript."""
runner = _make_runner(SK)
event = MagicMock()
session_entry = MagicMock()
session_entry.session_id = "sess123"
runner.session_store.get_or_create_session.return_value = session_entry
runner.session_store.load_transcript.return_value = [
{"role": "user", "content": "hello"},
{"role": "assistant", "content": "hi there"},
]
with patch("agent.model_metadata.estimate_messages_tokens_rough", return_value=500):
result = await runner._handle_usage_command(event)
assert "Session Info" in result
assert "Messages: 2" in result
assert "~500" in result
@pytest.mark.asyncio
async def test_cache_read_write_hidden_when_zero(self):
"""Cache token lines should be omitted when zero."""
agent = _make_mock_agent(session_cache_read_tokens=0, session_cache_write_tokens=0)
runner = _make_runner(SK, cached_agent=agent)
event = MagicMock()
with patch("agent.rate_limit_tracker.format_rate_limit_compact", return_value="RPM: 50/60"), \
patch("agent.usage_pricing.estimate_usage_cost") as mock_cost:
mock_cost.return_value = MagicMock(amount_usd=None, status="unknown")
result = await runner._handle_usage_command(event)
assert "Cache read" not in result
assert "Cache write" not in result
@pytest.mark.asyncio
async def test_cost_included_status(self):
"""Subscription-included providers show 'included' instead of dollar amount."""
agent = _make_mock_agent(provider="openai-codex")
runner = _make_runner(SK, cached_agent=agent)
event = MagicMock()
with patch("agent.rate_limit_tracker.format_rate_limit_compact", return_value="RPM: 50/60"), \
patch("agent.usage_pricing.estimate_usage_cost") as mock_cost:
mock_cost.return_value = MagicMock(amount_usd=None, status="included")
result = await runner._handle_usage_command(event)
assert "Cost: included" in result
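The lookup order these `/usage` tests describe is: prefer a running agent, skip the pending sentinel, fall back to the cached agent, and otherwise report a rough estimate from the transcript. A minimal sketch of that order follows. `find_agent`, `format_tokens`, and `PENDING` are hypothetical names used for illustration; the real logic lives in `_handle_usage_command`, which this diff does not show.

```python
PENDING = object()  # hypothetical stand-in for _AGENT_PENDING_SENTINEL


def find_agent(running: dict, cache: dict, session_key: str):
    """Return the agent to report on, or None to fall back to transcript estimates."""
    agent = running.get(session_key)
    if agent is not None and agent is not PENDING:
        return agent                      # a running agent has the freshest counters
    entry = cache.get(session_key)
    if entry is not None:
        cached_agent, _signature = entry  # cache values are (agent, signature) pairs
        return cached_agent
    return None


def format_tokens(count: int) -> str:
    """Thousands separators, matching strings such as '35,000' asserted above."""
    return f"{count:,}"
```

The identity check (`is not PENDING`) matters here: the sentinel marks a session whose agent is still being set up, so it has no counters to read.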
+2
@@ -508,6 +508,7 @@ class TestInboundMessages:
from gateway.platforms.wecom import WeComAdapter
adapter = WeComAdapter(PlatformConfig(enabled=True))
adapter._text_batch_delay_seconds = 0 # disable batching for tests
adapter.handle_message = AsyncMock()
adapter._extract_media = AsyncMock(return_value=(["/tmp/test.png"], ["image/png"]))
@@ -539,6 +540,7 @@ class TestInboundMessages:
from gateway.platforms.wecom import WeComAdapter
adapter = WeComAdapter(PlatformConfig(enabled=True))
adapter._text_batch_delay_seconds = 0 # disable batching for tests
adapter.handle_message = AsyncMock()
adapter._extract_media = AsyncMock(return_value=([], []))
@@ -633,6 +633,7 @@ class TestHasAnyProviderConfigured:
hermes_home.mkdir()
monkeypatch.setattr(config_module, "get_env_path", lambda: hermes_home / ".env")
monkeypatch.setattr(config_module, "get_hermes_home", lambda: hermes_home)
monkeypatch.setattr("hermes_cli.copilot_auth.resolve_copilot_token", lambda: ("", ""))
# Clear all provider env vars so earlier checks don't short-circuit
_all_vars = {"OPENROUTER_API_KEY", "OPENAI_API_KEY", "ANTHROPIC_API_KEY",
"ANTHROPIC_TOKEN", "OPENAI_BASE_URL"}
@@ -727,6 +728,7 @@ class TestHasAnyProviderConfigured:
monkeypatch.setattr(config_module, "get_env_path", lambda: hermes_home / ".env")
monkeypatch.setattr(config_module, "get_hermes_home", lambda: hermes_home)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
monkeypatch.setattr("hermes_cli.copilot_auth.resolve_copilot_token", lambda: ("", ""))
_all_vars = {"OPENROUTER_API_KEY", "OPENAI_API_KEY", "ANTHROPIC_API_KEY",
"ANTHROPIC_TOKEN", "OPENAI_BASE_URL"}
for pconfig in PROVIDER_REGISTRY.values():
+24
@@ -49,6 +49,30 @@ def test_chat_subcommand_accepts_skills_flag(monkeypatch):
}
def test_chat_subcommand_accepts_image_flag(monkeypatch):
import hermes_cli.main as main_mod
captured = {}
def fake_cmd_chat(args):
captured["query"] = args.query
captured["image"] = args.image
monkeypatch.setattr(main_mod, "cmd_chat", fake_cmd_chat)
monkeypatch.setattr(
sys,
"argv",
["hermes", "chat", "-q", "hello", "--image", "~/storage/shared/Pictures/cat.png"],
)
main_mod.main()
assert captured == {
"query": "hello",
"image": "~/storage/shared/Pictures/cat.png",
}
def test_continue_worktree_and_skills_flags_work_together(monkeypatch):
import hermes_cli.main as main_mod
+28
@@ -446,6 +446,13 @@ class TestSubcommands:
assert "show" in subs
assert "hide" in subs
def test_fast_has_subcommands(self):
assert "/fast" in SUBCOMMANDS
subs = SUBCOMMANDS["/fast"]
assert "fast" in subs
assert "normal" in subs
assert "status" in subs
def test_voice_has_subcommands(self):
assert "/voice" in SUBCOMMANDS
assert "on" in SUBCOMMANDS["/voice"]
@@ -474,6 +481,20 @@ class TestSubcommandCompletion:
assert "high" in texts
assert "show" in texts
def test_fast_subcommand_completion_after_space(self):
completions = _completions(SlashCommandCompleter(), "/fast ")
texts = {c.text for c in completions}
assert "fast" in texts
assert "normal" in texts
def test_fast_command_filtered_out_when_unavailable(self):
completions = _completions(
SlashCommandCompleter(command_filter=lambda cmd: cmd != "/fast"),
"/fa",
)
texts = {c.text for c in completions}
assert "fast" not in texts
def test_subcommand_prefix_filters(self):
"""Typing '/reasoning sh' should only show 'show'."""
completions = _completions(SlashCommandCompleter(), "/reasoning sh")
@@ -527,6 +548,13 @@ class TestGhostText:
"""/reasoning sh → 'ow'"""
assert _suggestion("/reasoning sh") == "ow"
def test_fast_subcommand_suggestion(self):
assert _suggestion("/fast f") == "ast"
def test_fast_subcommand_suggestion_hidden_when_filtered(self):
completer = SlashCommandCompleter(command_filter=lambda cmd: cmd != "/fast")
assert _suggestion("/fa", completer=completer) is None
def test_no_suggestion_for_non_slash(self):
assert _suggestion("hello") is None
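The `/fast` tests above combine two behaviours: ghost text completes a subcommand prefix (`/fast f` suggests `ast`), and a `command_filter` hides a command entirely when it is unavailable. The sketch below reproduces both in plain Python. `suggest` and the trimmed `SUBCOMMANDS` table are hypothetical; the real `SlashCommandCompleter` is not shown in this diff and may work differently.

```python
# Trimmed, illustrative table; the real SUBCOMMANDS mapping holds more entries.
SUBCOMMANDS = {
    "/fast": ["fast", "normal", "status"],
    "/reasoning": ["high", "show", "hide"],
}


def suggest(text, command_filter=None):
    """Return the ghost-text suffix for `text`, or None when nothing applies."""
    if not text.startswith("/"):
        return None
    allowed = command_filter or (lambda name: True)
    command, has_space, partial = text.partition(" ")
    if not has_space:
        # Still typing the command name: complete it unless it is filtered out.
        for name in sorted(SUBCOMMANDS):
            if allowed(name) and name.startswith(command) and name != command:
                return name[len(command):]
        return None
    if not allowed(command):
        return None
    for sub in SUBCOMMANDS.get(command, []):
        if sub.startswith(partial) and sub != partial:
            return sub[len(partial):]
    return None
```

Applying the filter before any prefix matching keeps a hidden command from leaking through either the command-name path or the subcommand path.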
+86
@@ -14,6 +14,23 @@ from hermes_cli import doctor as doctor_mod
from hermes_cli.doctor import _has_provider_env_config
class TestDoctorPlatformHints:
def test_termux_package_hint(self, monkeypatch):
monkeypatch.setenv("TERMUX_VERSION", "0.118.3")
monkeypatch.setenv("PREFIX", "/data/data/com.termux/files/usr")
assert doctor._is_termux() is True
assert doctor._python_install_cmd() == "python -m pip install"
assert doctor._system_package_install_cmd("ripgrep") == "pkg install ripgrep"
def test_non_termux_package_hint_defaults_to_apt(self, monkeypatch):
monkeypatch.delenv("TERMUX_VERSION", raising=False)
monkeypatch.setenv("PREFIX", "/usr")
monkeypatch.setattr(sys, "platform", "linux")
assert doctor._is_termux() is False
assert doctor._python_install_cmd() == "uv pip install"
assert doctor._system_package_install_cmd("ripgrep") == "sudo apt install ripgrep"
class TestProviderEnvDetection:
def test_detects_openai_api_key(self):
content = "OPENAI_BASE_URL=http://localhost:1234/v1\nOPENAI_API_KEY=***"
@@ -206,3 +223,72 @@ class TestDoctorMemoryProviderSection:
out = self._run_doctor_and_capture(monkeypatch, tmp_path, provider="mem0")
assert "Memory Provider" in out
assert "Built-in memory active" not in out
def test_run_doctor_termux_treats_docker_and_browser_warnings_as_expected(monkeypatch, tmp_path):
helper = TestDoctorMemoryProviderSection()
monkeypatch.setenv("TERMUX_VERSION", "0.118.3")
monkeypatch.setenv("PREFIX", "/data/data/com.termux/files/usr")
real_which = doctor_mod.shutil.which
def fake_which(cmd):
if cmd in {"docker", "node", "npm"}:
return None
return real_which(cmd)
monkeypatch.setattr(doctor_mod.shutil, "which", fake_which)
out = helper._run_doctor_and_capture(monkeypatch, tmp_path, provider="")
assert "Docker backend is not available inside Termux" in out
assert "Node.js not found (browser tools are optional in the tested Termux path)" in out
assert "Install Node.js on Termux with: pkg install nodejs" in out
assert "Termux browser setup:" in out
assert "1) pkg install nodejs" in out
assert "2) npm install -g agent-browser" in out
assert "3) agent-browser install" in out
assert "docker not found (optional)" not in out
def test_run_doctor_termux_does_not_mark_browser_available_without_agent_browser(monkeypatch, tmp_path):
home = tmp_path / ".hermes"
home.mkdir(parents=True, exist_ok=True)
(home / "config.yaml").write_text("memory: {}\n", encoding="utf-8")
project = tmp_path / "project"
project.mkdir(exist_ok=True)
monkeypatch.setenv("TERMUX_VERSION", "0.118.3")
monkeypatch.setenv("PREFIX", "/data/data/com.termux/files/usr")
monkeypatch.setattr(doctor_mod, "HERMES_HOME", home)
monkeypatch.setattr(doctor_mod, "PROJECT_ROOT", project)
monkeypatch.setattr(doctor_mod, "_DHH", str(home))
monkeypatch.setattr(doctor_mod.shutil, "which", lambda cmd: "/data/data/com.termux/files/usr/bin/node" if cmd in {"node", "npm"} else None)
fake_model_tools = types.SimpleNamespace(
check_tool_availability=lambda *a, **kw: (["terminal"], [{"name": "browser", "env_vars": [], "tools": ["browser_navigate"]}]),
TOOLSET_REQUIREMENTS={
"terminal": {"name": "terminal"},
"browser": {"name": "browser"},
},
)
monkeypatch.setitem(sys.modules, "model_tools", fake_model_tools)
try:
from hermes_cli import auth as _auth_mod
monkeypatch.setattr(_auth_mod, "get_nous_auth_status", lambda: {})
monkeypatch.setattr(_auth_mod, "get_codex_auth_status", lambda: {})
except Exception:
pass
import io, contextlib
buf = io.StringIO()
with contextlib.redirect_stdout(buf):
doctor_mod.run_doctor(Namespace(fix=False))
out = buf.getvalue()
assert "✓ browser" not in out
assert "browser" in out
assert "system dependency not met" in out
assert "agent-browser is not installed (expected in the tested Termux path)" in out
assert "npm install -g agent-browser && agent-browser install" in out
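The doctor tests above switch between Termux and non-Termux hints by setting `TERMUX_VERSION` and `PREFIX`. A minimal sketch of such a check follows, assuming detection relies on exactly those two variables. `looks_like_termux` and `system_package_install_cmd` are hypothetical names; the real `_is_termux()` and `_system_package_install_cmd()` are not shown in this diff, and this sketch covers only the two cases the tests exercise.

```python
import os


def looks_like_termux(env=None) -> bool:
    """Hypothetical check based on the variables the tests above set."""
    env = os.environ if env is None else env
    prefix = env.get("PREFIX", "")
    return bool(env.get("TERMUX_VERSION")) or "com.termux" in prefix


def system_package_install_cmd(package: str, env=None) -> str:
    """Pick an install hint for the two environments covered by the tests."""
    if looks_like_termux(env):
        return f"pkg install {package}"
    return f"sudo apt install {package}"
```

Passing `env` explicitly keeps the helpers testable without monkeypatching the process environment.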
+7
@@ -10,6 +10,7 @@ import hermes_cli.gateway as gateway
class TestSystemdLingerStatus:
def test_reports_enabled(self, monkeypatch):
monkeypatch.setattr(gateway, "is_linux", lambda: True)
monkeypatch.setattr(gateway, "is_termux", lambda: False)
monkeypatch.setenv("USER", "alice")
monkeypatch.setattr(
gateway.subprocess,
@@ -22,6 +23,7 @@ class TestSystemdLingerStatus:
def test_reports_disabled(self, monkeypatch):
monkeypatch.setattr(gateway, "is_linux", lambda: True)
monkeypatch.setattr(gateway, "is_termux", lambda: False)
monkeypatch.setenv("USER", "alice")
monkeypatch.setattr(
gateway.subprocess,
@@ -32,6 +34,11 @@ class TestSystemdLingerStatus:
assert gateway.get_systemd_linger_status() == (False, "")
def test_reports_termux_as_not_supported(self, monkeypatch):
monkeypatch.setattr(gateway, "is_termux", lambda: True)
assert gateway.get_systemd_linger_status() == (None, "not supported in Termux")
def test_systemd_status_warns_when_linger_disabled(monkeypatch, tmp_path, capsys):
unit_path = tmp_path / "hermes-gateway.service"
+5
@@ -8,6 +8,7 @@ import hermes_cli.gateway as gateway
class TestEnsureLingerEnabled:
def test_linger_already_enabled_via_file(self, monkeypatch, capsys):
monkeypatch.setattr(gateway, "is_linux", lambda: True)
monkeypatch.setattr(gateway, "is_termux", lambda: False)
monkeypatch.setattr("getpass.getuser", lambda: "testuser")
monkeypatch.setattr(gateway, "Path", lambda _path: SimpleNamespace(exists=lambda: True))
@@ -22,6 +23,7 @@ class TestEnsureLingerEnabled:
def test_status_enabled_skips_enable(self, monkeypatch, capsys):
monkeypatch.setattr(gateway, "is_linux", lambda: True)
monkeypatch.setattr(gateway, "is_termux", lambda: False)
monkeypatch.setattr("getpass.getuser", lambda: "testuser")
monkeypatch.setattr(gateway, "Path", lambda _path: SimpleNamespace(exists=lambda: False))
monkeypatch.setattr(gateway, "get_systemd_linger_status", lambda: (True, ""))
@@ -37,6 +39,7 @@ class TestEnsureLingerEnabled:
def test_loginctl_success_enables_linger(self, monkeypatch, capsys):
monkeypatch.setattr(gateway, "is_linux", lambda: True)
monkeypatch.setattr(gateway, "is_termux", lambda: False)
monkeypatch.setattr("getpass.getuser", lambda: "testuser")
monkeypatch.setattr(gateway, "Path", lambda _path: SimpleNamespace(exists=lambda: False))
monkeypatch.setattr(gateway, "get_systemd_linger_status", lambda: (False, ""))
@@ -59,6 +62,7 @@ class TestEnsureLingerEnabled:
def test_missing_loginctl_shows_manual_guidance(self, monkeypatch, capsys):
monkeypatch.setattr(gateway, "is_linux", lambda: True)
monkeypatch.setattr(gateway, "is_termux", lambda: False)
monkeypatch.setattr("getpass.getuser", lambda: "testuser")
monkeypatch.setattr(gateway, "Path", lambda _path: SimpleNamespace(exists=lambda: False))
monkeypatch.setattr(gateway, "get_systemd_linger_status", lambda: (None, "loginctl not found"))
@@ -76,6 +80,7 @@ class TestEnsureLingerEnabled:
def test_loginctl_failure_shows_manual_guidance(self, monkeypatch, capsys):
monkeypatch.setattr(gateway, "is_linux", lambda: True)
monkeypatch.setattr(gateway, "is_termux", lambda: False)
monkeypatch.setattr("getpass.getuser", lambda: "testuser")
monkeypatch.setattr(gateway, "Path", lambda _path: SimpleNamespace(exists=lambda: False))
monkeypatch.setattr(gateway, "get_systemd_linger_status", lambda: (False, ""))
+55 -8
@@ -109,7 +109,8 @@ class TestGatewayStopCleanup:
unit_path = tmp_path / "hermes-gateway.service"
unit_path.write_text("unit\n", encoding="utf-8")
monkeypatch.setattr(gateway_cli, "is_linux", lambda: True)
monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: True)
monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
monkeypatch.setattr(gateway_cli, "is_macos", lambda: False)
monkeypatch.setattr(gateway_cli, "get_systemd_unit_path", lambda system=False: unit_path)
@@ -134,7 +135,8 @@ class TestGatewayStopCleanup:
unit_path = tmp_path / "hermes-gateway.service"
unit_path.write_text("unit\n", encoding="utf-8")
monkeypatch.setattr(gateway_cli, "is_linux", lambda: True)
monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: True)
monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
monkeypatch.setattr(gateway_cli, "is_macos", lambda: False)
monkeypatch.setattr(gateway_cli, "get_systemd_unit_path", lambda system=False: unit_path)
@@ -256,7 +258,8 @@ class TestGatewayServiceDetection:
user_unit = SimpleNamespace(exists=lambda: True)
system_unit = SimpleNamespace(exists=lambda: True)
monkeypatch.setattr(gateway_cli, "is_linux", lambda: True)
monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: True)
monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
monkeypatch.setattr(gateway_cli, "is_macos", lambda: False)
monkeypatch.setattr(
gateway_cli,
@@ -278,7 +281,8 @@ class TestGatewayServiceDetection:
class TestGatewaySystemServiceRouting:
def test_gateway_install_passes_system_flags(self, monkeypatch):
monkeypatch.setattr(gateway_cli, "is_linux", lambda: True)
monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: True)
monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
monkeypatch.setattr(gateway_cli, "is_macos", lambda: False)
calls = []
@@ -294,11 +298,30 @@ class TestGatewaySystemServiceRouting:
assert calls == [(True, True, "alice")]
def test_gateway_install_reports_termux_manual_mode(self, monkeypatch, capsys):
monkeypatch.setattr(gateway_cli, "is_termux", lambda: True)
monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: False)
monkeypatch.setattr(gateway_cli, "is_macos", lambda: False)
try:
gateway_cli.gateway_command(
SimpleNamespace(gateway_command="install", force=False, system=False, run_as_user=None)
)
except SystemExit as exc:
assert exc.code == 1
else:
raise AssertionError("Expected gateway_command to exit on unsupported Termux service install")
out = capsys.readouterr().out
assert "not supported on Termux" in out
assert "Run manually: hermes gateway" in out
def test_gateway_status_prefers_system_service_when_only_system_unit_exists(self, monkeypatch):
user_unit = SimpleNamespace(exists=lambda: False)
system_unit = SimpleNamespace(exists=lambda: True)
monkeypatch.setattr(gateway_cli, "is_linux", lambda: True)
monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: True)
monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
monkeypatch.setattr(gateway_cli, "is_macos", lambda: False)
monkeypatch.setattr(
gateway_cli,
@@ -313,6 +336,20 @@ class TestGatewaySystemServiceRouting:
assert calls == [(False, False)]
def test_gateway_status_on_termux_shows_manual_guidance(self, monkeypatch, capsys):
monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: False)
monkeypatch.setattr(gateway_cli, "is_termux", lambda: True)
monkeypatch.setattr(gateway_cli, "is_macos", lambda: False)
monkeypatch.setattr(gateway_cli, "find_gateway_pids", lambda exclude_pids=None: [])
monkeypatch.setattr(gateway_cli, "_runtime_health_lines", lambda: [])
gateway_cli.gateway_command(SimpleNamespace(gateway_command="status", deep=False, system=False))
out = capsys.readouterr().out
assert "Gateway is not running" in out
assert "nohup hermes gateway" in out
assert "install as user service" not in out
def test_gateway_restart_does_not_fallback_to_foreground_when_launchd_restart_fails(self, tmp_path, monkeypatch):
plist_path = tmp_path / "ai.hermes.gateway.plist"
plist_path.write_text("plist\n", encoding="utf-8")
@@ -513,12 +550,22 @@ class TestGeneratedUnitUsesDetectedVenv:
class TestGeneratedUnitIncludesLocalBin:
"""~/.local/bin must be in PATH so uvx/pipx tools are discoverable."""
-def test_user_unit_includes_local_bin_in_path(self):
+def test_user_unit_includes_local_bin_in_path(self, monkeypatch):
+home = Path.home()
+monkeypatch.setattr(
+gateway_cli,
+"_build_user_local_paths",
+lambda home_path, existing: [str(home / ".local" / "bin")],
+)
unit = gateway_cli.generate_systemd_unit(system=False)
-home = str(Path.home())
assert f"{home}/.local/bin" in unit
-def test_system_unit_includes_local_bin_in_path(self):
+def test_system_unit_includes_local_bin_in_path(self, monkeypatch):
+monkeypatch.setattr(
+gateway_cli,
+"_build_user_local_paths",
+lambda home_path, existing: [str(home_path / ".local" / "bin")],
+)
unit = gateway_cli.generate_systemd_unit(system=True)
# System unit uses the resolved home dir from _system_service_identity
assert "/.local/bin" in unit
@@ -0,0 +1,104 @@
"""Regression tests for /model support of config.yaml custom_providers.
The terminal `hermes model` flow already exposes `custom_providers`, but the
shared slash-command pipeline (`/model` in CLI/gateway/Telegram) historically
only looked at `providers:`.
"""
import hermes_cli.providers as providers_mod
from hermes_cli.model_switch import list_authenticated_providers, switch_model
from hermes_cli.providers import resolve_provider_full
_MOCK_VALIDATION = {
"accepted": True,
"persist": True,
"recognized": True,
"message": None,
}
def test_list_authenticated_providers_includes_custom_providers(monkeypatch):
"""No-args /model menus should include saved custom_providers entries."""
monkeypatch.setattr("agent.models_dev.fetch_models_dev", lambda: {})
monkeypatch.setattr(providers_mod, "HERMES_OVERLAYS", {})
providers = list_authenticated_providers(
current_provider="openai-codex",
user_providers={},
custom_providers=[
{
"name": "Local (127.0.0.1:4141)",
"base_url": "http://127.0.0.1:4141/v1",
"model": "rotator-openrouter-coding",
}
],
max_models=50,
)
assert any(
p["slug"] == "custom:local-(127.0.0.1:4141)"
and p["name"] == "Local (127.0.0.1:4141)"
and p["models"] == ["rotator-openrouter-coding"]
and p["api_url"] == "http://127.0.0.1:4141/v1"
for p in providers
)
def test_resolve_provider_full_finds_named_custom_provider():
"""Explicit /model --provider should resolve saved custom_providers entries."""
resolved = resolve_provider_full(
"custom:local-(127.0.0.1:4141)",
user_providers={},
custom_providers=[
{
"name": "Local (127.0.0.1:4141)",
"base_url": "http://127.0.0.1:4141/v1",
}
],
)
assert resolved is not None
assert resolved.id == "custom:local-(127.0.0.1:4141)"
assert resolved.name == "Local (127.0.0.1:4141)"
assert resolved.base_url == "http://127.0.0.1:4141/v1"
assert resolved.source == "user-config"
def test_switch_model_accepts_explicit_named_custom_provider(monkeypatch):
"""Shared /model switch pipeline should accept --provider for custom_providers."""
monkeypatch.setattr(
"hermes_cli.runtime_provider.resolve_runtime_provider",
lambda requested: {
"api_key": "no-key-required",
"base_url": "http://127.0.0.1:4141/v1",
"api_mode": "chat_completions",
},
)
monkeypatch.setattr("hermes_cli.models.validate_requested_model", lambda *a, **k: _MOCK_VALIDATION)
monkeypatch.setattr("hermes_cli.model_switch.get_model_info", lambda *a, **k: None)
monkeypatch.setattr("hermes_cli.model_switch.get_model_capabilities", lambda *a, **k: None)
result = switch_model(
raw_input="rotator-openrouter-coding",
current_provider="openai-codex",
current_model="gpt-5.4",
current_base_url="https://chatgpt.com/backend-api/codex",
current_api_key="",
explicit_provider="custom:local-(127.0.0.1:4141)",
user_providers={},
custom_providers=[
{
"name": "Local (127.0.0.1:4141)",
"base_url": "http://127.0.0.1:4141/v1",
"model": "rotator-openrouter-coding",
}
],
)
assert result.success is True
assert result.target_provider == "custom:local-(127.0.0.1:4141)"
assert result.provider_label == "Local (127.0.0.1:4141)"
assert result.new_model == "rotator-openrouter-coding"
assert result.base_url == "http://127.0.0.1:4141/v1"
assert result.api_key == "no-key-required"
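Reviewer note: the tests above pin the slug format for named custom providers, with the display name "Local (127.0.0.1:4141)" expected to resolve as "custom:local-(127.0.0.1:4141)". The following is a minimal sketch of a helper that would satisfy those expectations. It assumes the slug is simply the trimmed, lower-cased name with spaces replaced by hyphens; the actual custom_provider_slug() in hermes_cli/providers.py is not shown in this diff and may normalize differently. The function name is hypothetical.

```python
def sketch_custom_provider_slug(display_name: str) -> str:
    """Hypothetical stand-in for custom_provider_slug(); not the project's code."""
    # Assumption: trim, lower-case, and turn spaces into hyphens.
    # Parentheses, dots, and colons are left untouched, as the tests expect.
    normalized = display_name.strip().lower().replace(" ", "-")
    return f"custom:{normalized}"


if __name__ == "__main__":
    print(sketch_custom_provider_slug("Local (127.0.0.1:4141)"))
```

Keeping the slug construction in one function, as the commit message describes, means the listing and resolution paths cannot drift apart on how a name maps to its slug.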
+16 -2
@@ -124,7 +124,14 @@ class TestParseModelInput:
class TestCuratedModelsForProvider:
def test_openrouter_returns_curated_list(self):
-models = curated_models_for_provider("openrouter")
+with patch(
+"hermes_cli.models.fetch_openrouter_models",
+return_value=[
+("anthropic/claude-opus-4.6", "recommended"),
+("qwen/qwen3.6-plus", ""),
+],
+):
+models = curated_models_for_provider("openrouter")
assert len(models) > 0
assert any("claude" in m[0] for m in models)
@@ -169,7 +176,14 @@ class TestProviderLabel:
class TestProviderModelIds:
def test_openrouter_returns_curated_list(self):
-ids = provider_model_ids("openrouter")
+with patch(
+"hermes_cli.models.fetch_openrouter_models",
+return_value=[
+("anthropic/claude-opus-4.6", "recommended"),
+("qwen/qwen3.6-plus", ""),
+],
+):
+ids = provider_model_ids("openrouter")
assert len(ids) > 0
assert all("/" in mid for mid in ids)
+75 -22
@@ -3,7 +3,7 @@
from unittest.mock import patch, MagicMock
from hermes_cli.models import (
-OPENROUTER_MODELS, menu_labels, model_ids, detect_provider_for_model,
+OPENROUTER_MODELS, fetch_openrouter_models, menu_labels, model_ids, detect_provider_for_model,
filter_nous_free_models, _NOUS_ALLOWED_FREE_MODELS,
is_nous_free_tier, partition_nous_models_by_tier,
check_nous_free_tier, clear_nous_free_tier_cache,
@@ -11,43 +11,57 @@ from hermes_cli.models import (
)
import hermes_cli.models as _models_mod
LIVE_OPENROUTER_MODELS = [
("anthropic/claude-opus-4.6", "recommended"),
("qwen/qwen3.6-plus", ""),
("nvidia/nemotron-3-super-120b-a12b:free", "free"),
]
class TestModelIds:
def test_returns_non_empty_list(self):
-ids = model_ids()
+with patch("hermes_cli.models.fetch_openrouter_models", return_value=LIVE_OPENROUTER_MODELS):
+ids = model_ids()
assert isinstance(ids, list)
assert len(ids) > 0
-def test_ids_match_models_list(self):
-ids = model_ids()
-expected = [mid for mid, _ in OPENROUTER_MODELS]
+def test_ids_match_fetched_catalog(self):
+with patch("hermes_cli.models.fetch_openrouter_models", return_value=LIVE_OPENROUTER_MODELS):
+ids = model_ids()
+expected = [mid for mid, _ in LIVE_OPENROUTER_MODELS]
assert ids == expected
def test_all_ids_contain_provider_slash(self):
"""Model IDs should follow the provider/model format."""
-for mid in model_ids():
-assert "/" in mid, f"Model ID '{mid}' missing provider/ prefix"
+with patch("hermes_cli.models.fetch_openrouter_models", return_value=LIVE_OPENROUTER_MODELS):
+for mid in model_ids():
+assert "/" in mid, f"Model ID '{mid}' missing provider/ prefix"
def test_no_duplicate_ids(self):
-ids = model_ids()
+with patch("hermes_cli.models.fetch_openrouter_models", return_value=LIVE_OPENROUTER_MODELS):
+ids = model_ids()
assert len(ids) == len(set(ids)), "Duplicate model IDs found"
class TestMenuLabels:
def test_same_length_as_model_ids(self):
-assert len(menu_labels()) == len(model_ids())
+with patch("hermes_cli.models.fetch_openrouter_models", return_value=LIVE_OPENROUTER_MODELS):
+assert len(menu_labels()) == len(model_ids())
def test_first_label_marked_recommended(self):
-labels = menu_labels()
+with patch("hermes_cli.models.fetch_openrouter_models", return_value=LIVE_OPENROUTER_MODELS):
+labels = menu_labels()
assert "recommended" in labels[0].lower()
def test_each_label_contains_its_model_id(self):
-for label, mid in zip(menu_labels(), model_ids()):
-assert mid in label, f"Label '{label}' doesn't contain model ID '{mid}'"
+with patch("hermes_cli.models.fetch_openrouter_models", return_value=LIVE_OPENROUTER_MODELS):
+for label, mid in zip(menu_labels(), model_ids()):
+assert mid in label, f"Label '{label}' doesn't contain model ID '{mid}'"
def test_non_recommended_labels_have_no_tag(self):
"""Only the first model should have (recommended)."""
-labels = menu_labels()
+with patch("hermes_cli.models.fetch_openrouter_models", return_value=LIVE_OPENROUTER_MODELS):
+labels = menu_labels()
for label in labels[1:]:
assert "recommended" not in label.lower(), f"Unexpected 'recommended' in '{label}'"
@@ -65,30 +79,65 @@ class TestOpenRouterModels:
assert len(OPENROUTER_MODELS) >= 5
class TestFetchOpenRouterModels:
def test_live_fetch_recomputes_free_tags(self, monkeypatch):
class _Resp:
def __enter__(self):
return self
def __exit__(self, exc_type, exc, tb):
return False
def read(self):
return b'{"data":[{"id":"anthropic/claude-opus-4.6","pricing":{"prompt":"0.000015","completion":"0.000075"}},{"id":"qwen/qwen3.6-plus","pricing":{"prompt":"0.000000325","completion":"0.00000195"}},{"id":"nvidia/nemotron-3-super-120b-a12b:free","pricing":{"prompt":"0","completion":"0"}}]}'
monkeypatch.setattr(_models_mod, "_openrouter_catalog_cache", None)
with patch("hermes_cli.models.urllib.request.urlopen", return_value=_Resp()):
models = fetch_openrouter_models(force_refresh=True)
assert models == [
("anthropic/claude-opus-4.6", "recommended"),
("qwen/qwen3.6-plus", ""),
("nvidia/nemotron-3-super-120b-a12b:free", "free"),
]
def test_falls_back_to_static_snapshot_on_fetch_failure(self, monkeypatch):
monkeypatch.setattr(_models_mod, "_openrouter_catalog_cache", None)
with patch("hermes_cli.models.urllib.request.urlopen", side_effect=OSError("boom")):
models = fetch_openrouter_models(force_refresh=True)
assert models == OPENROUTER_MODELS
class TestFindOpenrouterSlug:
def test_exact_match(self):
from hermes_cli.models import _find_openrouter_slug
-assert _find_openrouter_slug("anthropic/claude-opus-4.6") == "anthropic/claude-opus-4.6"
+with patch("hermes_cli.models.fetch_openrouter_models", return_value=LIVE_OPENROUTER_MODELS):
+assert _find_openrouter_slug("anthropic/claude-opus-4.6") == "anthropic/claude-opus-4.6"
def test_bare_name_match(self):
from hermes_cli.models import _find_openrouter_slug
-result = _find_openrouter_slug("claude-opus-4.6")
+with patch("hermes_cli.models.fetch_openrouter_models", return_value=LIVE_OPENROUTER_MODELS):
+result = _find_openrouter_slug("claude-opus-4.6")
assert result == "anthropic/claude-opus-4.6"
def test_case_insensitive(self):
from hermes_cli.models import _find_openrouter_slug
-result = _find_openrouter_slug("Anthropic/Claude-Opus-4.6")
+with patch("hermes_cli.models.fetch_openrouter_models", return_value=LIVE_OPENROUTER_MODELS):
+result = _find_openrouter_slug("Anthropic/Claude-Opus-4.6")
assert result is not None
def test_unknown_returns_none(self):
from hermes_cli.models import _find_openrouter_slug
-assert _find_openrouter_slug("totally-fake-model-xyz") is None
+with patch("hermes_cli.models.fetch_openrouter_models", return_value=LIVE_OPENROUTER_MODELS):
+assert _find_openrouter_slug("totally-fake-model-xyz") is None
class TestDetectProviderForModel:
def test_anthropic_model_detected(self):
"""claude-opus-4-6 should resolve to anthropic provider."""
-result = detect_provider_for_model("claude-opus-4-6", "openai-codex")
+with patch("hermes_cli.models.fetch_openrouter_models", return_value=LIVE_OPENROUTER_MODELS):
+result = detect_provider_for_model("claude-opus-4-6", "openai-codex")
assert result is not None
assert result[0] == "anthropic"
@@ -105,7 +154,8 @@ class TestDetectProviderForModel:
def test_openrouter_slug_match(self):
"""Models in the OpenRouter catalog should be found."""
-result = detect_provider_for_model("anthropic/claude-opus-4.6", "openai-codex")
+with patch("hermes_cli.models.fetch_openrouter_models", return_value=LIVE_OPENROUTER_MODELS):
+result = detect_provider_for_model("anthropic/claude-opus-4.6", "openai-codex")
assert result is not None
assert result[0] == "openrouter"
assert result[1] == "anthropic/claude-opus-4.6"
@@ -119,18 +169,21 @@ class TestDetectProviderForModel:
):
monkeypatch.delenv(env_var, raising=False)
"""Bare model names should get mapped to full OpenRouter slugs."""
-result = detect_provider_for_model("claude-opus-4.6", "openai-codex")
+with patch("hermes_cli.models.fetch_openrouter_models", return_value=LIVE_OPENROUTER_MODELS):
+result = detect_provider_for_model("claude-opus-4.6", "openai-codex")
assert result is not None
# Should find it on OpenRouter with full slug
assert result[1] == "anthropic/claude-opus-4.6"
def test_unknown_model_returns_none(self):
"""Completely unknown model names should return None."""
-assert detect_provider_for_model("nonexistent-model-xyz", "openai-codex") is None
+with patch("hermes_cli.models.fetch_openrouter_models", return_value=LIVE_OPENROUTER_MODELS):
+assert detect_provider_for_model("nonexistent-model-xyz", "openai-codex") is None
def test_aggregator_not_suggested(self):
"""nous/openrouter should never be auto-suggested as target provider."""
-result = detect_provider_for_model("claude-opus-4-6", "openai-codex")
+with patch("hermes_cli.models.fetch_openrouter_models", return_value=LIVE_OPENROUTER_MODELS):
+result = detect_provider_for_model("claude-opus-4-6", "openai-codex")
assert result is not None
assert result[0] not in ("nous",) # nous has claude models but shouldn't be suggested
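Reviewer note: TestFetchOpenRouterModels above expects two behaviours from the live catalog fetch: "free" tags are recomputed from pricing, and a failed fetch falls back to the static snapshot. A rough sketch of that shape follows. Everything in it is an assumption made for illustration (the function names, the injected reader, the rule that non-free tags such as "recommended" are carried over from the snapshot); the real fetch_openrouter_models() is not part of this diff.

```python
import json
from typing import Callable, List, Tuple

Catalog = List[Tuple[str, str]]

# Hypothetical static snapshot, used only by this sketch.
STATIC_SNAPSHOT: Catalog = [("anthropic/claude-opus-4.6", "recommended")]

SAMPLE_PAYLOAD = json.dumps({"data": [
    {"id": "anthropic/claude-opus-4.6", "pricing": {"prompt": "0.000015", "completion": "0.000075"}},
    {"id": "qwen/qwen3.6-plus", "pricing": {"prompt": "0.000000325", "completion": "0.00000195"}},
    {"id": "nvidia/nemotron-3-super-120b-a12b:free", "pricing": {"prompt": "0", "completion": "0"}},
]}).encode("utf-8")


def is_free(pricing: dict) -> bool:
    """Assumption: a model is free when prompt and completion prices are both zero."""
    try:
        return float(pricing.get("prompt", "1")) == 0.0 and float(pricing.get("completion", "1")) == 0.0
    except (TypeError, ValueError):
        return False


def load_catalog(read_payload: Callable[[], bytes], snapshot: Catalog = STATIC_SNAPSHOT) -> Catalog:
    """Build (model_id, tag) pairs from a fetched payload, or return the snapshot on failure."""
    try:
        payload = json.loads(read_payload())
    except (OSError, ValueError):
        # Network or decode failure: fall back to the static snapshot.
        return list(snapshot)
    if not isinstance(payload, dict):
        return list(snapshot)
    # Carry over curated tags, but never a stale "free" tag.
    curated = {mid: tag for mid, tag in snapshot if tag and tag != "free"}
    catalog: Catalog = []
    for item in payload.get("data", []):
        model_id = item.get("id")
        if not model_id:
            continue
        tag = "free" if is_free(item.get("pricing") or {}) else curated.get(model_id, "")
        catalog.append((model_id, tag))
    return catalog


def failing_reader() -> bytes:
    """Simulates a transport error, like the OSError("boom") in the test above."""
    raise OSError("boom")
```

Injecting the reader keeps the sketch testable without patching urllib, which is the role the patch("hermes_cli.models.urllib.request.urlopen", ...) calls play in the real tests.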
+57
@@ -142,6 +142,31 @@ def test_setup_custom_providers_synced(tmp_path, monkeypatch):
assert reloaded.get("custom_providers") == [{"name": "Local", "base_url": "http://localhost:8080/v1"}]
def test_setup_syncs_custom_provider_removal_from_disk(tmp_path, monkeypatch):
"""Removing the last custom provider in model setup should persist."""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
_clear_provider_env(monkeypatch)
_stub_tts(monkeypatch)
config = load_config()
config["custom_providers"] = [{"name": "Local", "base_url": "http://localhost:8080/v1"}]
save_config(config)
def fake_select():
cfg = load_config()
cfg["model"] = {"provider": "openrouter", "default": "anthropic/claude-opus-4.6"}
cfg["custom_providers"] = []
save_config(cfg)
monkeypatch.setattr("hermes_cli.main.select_provider_and_model", fake_select)
setup_model_provider(config)
save_config(config)
reloaded = load_config()
assert reloaded.get("custom_providers") == []
def test_setup_cancel_preserves_existing_config(tmp_path, monkeypatch):
"""When the user cancels provider selection, existing config is preserved."""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
@@ -201,6 +226,38 @@ def test_setup_keyboard_interrupt_gracefully_handled(tmp_path, monkeypatch):
setup_model_provider(config)
def test_select_provider_and_model_warns_if_named_custom_provider_disappears(
tmp_path, monkeypatch, capsys
):
"""If a saved custom provider is deleted mid-selection, show a warning instead of silently doing nothing."""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
_clear_provider_env(monkeypatch)
cfg = load_config()
cfg["custom_providers"] = [{"name": "Local", "base_url": "http://localhost:8080/v1"}]
save_config(cfg)
def fake_prompt_provider_choice(choices, default=0):
current = load_config()
current["custom_providers"] = []
save_config(current)
return next(i for i, label in enumerate(choices) if label.startswith("Local (localhost:8080/v1)"))
monkeypatch.setattr("hermes_cli.auth.resolve_provider", lambda provider: None)
monkeypatch.setattr("hermes_cli.main._prompt_provider_choice", fake_prompt_provider_choice)
monkeypatch.setattr(
"hermes_cli.main._model_flow_named_custom",
lambda *args, **kwargs: (_ for _ in ()).throw(AssertionError("named custom flow should not run")),
)
from hermes_cli.main import select_provider_and_model
select_provider_and_model()
out = capsys.readouterr().out
assert "selected saved custom provider is no longer available" in out
def test_codex_setup_uses_runtime_access_token_for_live_model_list(tmp_path, monkeypatch):
"""Codex model list fetching uses the runtime access token."""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
@@ -0,0 +1,21 @@
from pathlib import Path
import subprocess
REPO_ROOT = Path(__file__).resolve().parents[2]
SETUP_SCRIPT = REPO_ROOT / "setup-hermes.sh"
def test_setup_hermes_script_is_valid_shell():
result = subprocess.run(["bash", "-n", str(SETUP_SCRIPT)], capture_output=True, text=True)
assert result.returncode == 0, result.stderr
def test_setup_hermes_script_has_termux_path():
content = SETUP_SCRIPT.read_text(encoding="utf-8")
assert "is_termux()" in content
assert ".[termux]" in content
assert "constraints-termux.txt" in content
assert "$PREFIX/bin" in content
assert "Skipping tinker-atropos on Termux" in content
@@ -230,6 +230,39 @@ def test_setup_same_provider_fallback_can_add_another_credential(tmp_path, monke
assert config.get("credential_pool_strategies", {}).get("openrouter") == "fill_first"
def test_setup_same_provider_single_credential_keeps_existing_rotation_strategy(tmp_path, monkeypatch):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
_clear_provider_env(monkeypatch)
save_env_value("OPENROUTER_API_KEY", "or-key")
_write_model_config("openrouter", "", "anthropic/claude-opus-4.6")
config = load_config()
config["credential_pool_strategies"] = {"openrouter": "round_robin"}
save_config(config)
class _Entry:
def __init__(self, label):
self.label = label
class _Pool:
def entries(self):
return [_Entry("primary")]
def fake_select():
pass
monkeypatch.setattr("hermes_cli.main.select_provider_and_model", fake_select)
_stub_tts(monkeypatch)
monkeypatch.setattr("hermes_cli.setup.prompt", lambda *args, **kwargs: "")
monkeypatch.setattr("agent.credential_pool.load_pool", lambda provider: _Pool())
monkeypatch.setattr("agent.auxiliary_client.get_available_vision_backends", lambda: [])
setup_model_provider(config)
assert config.get("credential_pool_strategies", {}).get("openrouter") == "round_robin"
def test_setup_pool_step_shows_manual_vs_auto_detected_counts(tmp_path, monkeypatch, capsys):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
_clear_provider_env(monkeypatch)
+108 -1
@@ -4,6 +4,7 @@ from argparse import Namespace
from unittest.mock import MagicMock, patch
import pytest
from hermes_cli.config import DEFAULT_CONFIG, load_config, save_config
def _make_setup_args(**overrides):
@@ -34,6 +35,36 @@ def _make_chat_args(**overrides):
class TestNonInteractiveSetup:
"""Verify setup paths exit cleanly in headless/non-interactive environments."""
def test_cmd_setup_allows_noninteractive_flag_without_tty(self):
"""The CLI entrypoint should not block --non-interactive before setup.py handles it."""
from hermes_cli.main import cmd_setup
args = _make_setup_args(non_interactive=True)
with (
patch("hermes_cli.setup.run_setup_wizard") as mock_run_setup,
patch("sys.stdin") as mock_stdin,
):
mock_stdin.isatty.return_value = False
cmd_setup(args)
mock_run_setup.assert_called_once_with(args)
def test_cmd_setup_defers_no_tty_handling_to_setup_wizard(self):
"""Bare `hermes setup` should reach setup.py, which prints headless guidance."""
from hermes_cli.main import cmd_setup
args = _make_setup_args(non_interactive=False)
with (
patch("hermes_cli.setup.run_setup_wizard") as mock_run_setup,
patch("sys.stdin") as mock_stdin,
):
mock_stdin.isatty.return_value = False
cmd_setup(args)
mock_run_setup.assert_called_once_with(args)
def test_non_interactive_flag_skips_wizard(self, capsys):
"""--non-interactive should print guidance and not enter the wizard."""
from hermes_cli.setup import run_setup_wizard
@@ -72,6 +103,26 @@ class TestNonInteractiveSetup:
out = capsys.readouterr().out
assert "hermes config set model.provider custom" in out
def test_reset_flag_rewrites_config_before_noninteractive_exit(self, tmp_path, monkeypatch, capsys):
"""--reset should rewrite config.yaml even when the wizard cannot run interactively."""
from hermes_cli.setup import run_setup_wizard
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
cfg = load_config()
cfg["model"] = {"provider": "custom", "base_url": "http://localhost:8080/v1", "default": "llama3"}
cfg["agent"]["max_turns"] = 12
save_config(cfg)
args = _make_setup_args(non_interactive=True, reset=True)
run_setup_wizard(args)
reloaded = load_config()
assert reloaded["model"] == DEFAULT_CONFIG["model"]
assert reloaded["agent"]["max_turns"] == DEFAULT_CONFIG["agent"]["max_turns"]
out = capsys.readouterr().out
assert "Configuration reset to defaults." in out
def test_chat_first_run_headless_skips_setup_prompt(self, capsys):
"""Bare `hermes` should not prompt for input when no provider exists and stdin is headless."""
from hermes_cli.main import cmd_chat
@@ -117,7 +168,7 @@ class TestNonInteractiveSetup:
side_effect=lambda key: "sk-test" if key == "OPENROUTER_API_KEY" else "",
),
patch("hermes_cli.auth.get_active_provider", return_value=None),
-patch.object(setup_mod, "prompt_choice", return_value=4),
+patch.object(setup_mod, "prompt_choice", return_value=3),
patch.object(
setup_mod,
"SETUP_SECTIONS",
@@ -137,3 +188,59 @@ class TestNonInteractiveSetup:
terminal_section.assert_called_once_with(config)
tts_section.assert_not_called()
def test_returning_user_menu_does_not_show_separator_rows(self, tmp_path):
"""Returning-user menu should only show selectable actions."""
from hermes_cli import setup as setup_mod
args = _make_setup_args()
captured = {}
def fake_prompt_choice(question, choices, default=0):
captured["question"] = question
captured["choices"] = list(choices)
return len(choices) - 1
with (
patch.object(setup_mod, "ensure_hermes_home"),
patch.object(setup_mod, "load_config", return_value={}),
patch.object(setup_mod, "get_hermes_home", return_value=tmp_path),
patch.object(setup_mod, "is_interactive_stdin", return_value=True),
patch.object(
setup_mod,
"get_env_value",
side_effect=lambda key: "sk-test" if key == "OPENROUTER_API_KEY" else "",
),
patch("hermes_cli.auth.get_active_provider", return_value=None),
patch.object(setup_mod, "prompt_choice", side_effect=fake_prompt_choice),
):
setup_mod.run_setup_wizard(args)
assert captured["question"] == "What would you like to do?"
assert "---" not in captured["choices"]
assert captured["choices"] == [
"Quick Setup - configure missing items only",
"Full Setup - reconfigure everything",
"Model & Provider",
"Terminal Backend",
"Messaging Platforms (Gateway)",
"Tools",
"Agent Settings",
"Exit",
]
def test_main_accepts_tts_setup_section(self, monkeypatch):
"""`hermes setup tts` should parse and dispatch like other setup sections."""
from hermes_cli import main as main_mod
received = {}
def fake_cmd_setup(args):
received["section"] = args.section
monkeypatch.setattr(main_mod, "cmd_setup", fake_cmd_setup)
monkeypatch.setattr("sys.argv", ["hermes", "setup", "tts"])
main_mod.main()
assert received["section"] == "tts"
+30
@@ -12,3 +12,33 @@ def test_show_status_includes_tavily_key(monkeypatch, capsys, tmp_path):
output = capsys.readouterr().out
assert "Tavily" in output
assert "tvly...cdef" in output
def test_show_status_termux_gateway_section_skips_systemctl(monkeypatch, capsys, tmp_path):
from hermes_cli import status as status_mod
import hermes_cli.auth as auth_mod
import hermes_cli.gateway as gateway_mod
monkeypatch.setenv("TERMUX_VERSION", "0.118.3")
monkeypatch.setenv("PREFIX", "/data/data/com.termux/files/usr")
monkeypatch.setattr(status_mod, "get_env_path", lambda: tmp_path / ".env", raising=False)
monkeypatch.setattr(status_mod, "get_hermes_home", lambda: tmp_path, raising=False)
monkeypatch.setattr(status_mod, "load_config", lambda: {"model": "gpt-5.4"}, raising=False)
monkeypatch.setattr(status_mod, "resolve_requested_provider", lambda requested=None: "openai-codex", raising=False)
monkeypatch.setattr(status_mod, "resolve_provider", lambda requested=None, **kwargs: "openai-codex", raising=False)
monkeypatch.setattr(status_mod, "provider_label", lambda provider: "OpenAI Codex", raising=False)
monkeypatch.setattr(auth_mod, "get_nous_auth_status", lambda: {}, raising=False)
monkeypatch.setattr(auth_mod, "get_codex_auth_status", lambda: {}, raising=False)
monkeypatch.setattr(gateway_mod, "find_gateway_pids", lambda exclude_pids=None: [], raising=False)
def _unexpected_systemctl(*args, **kwargs):
raise AssertionError("systemctl should not be called in the Termux status view")
monkeypatch.setattr(status_mod.subprocess, "run", _unexpected_systemctl)
status_mod.show_status(SimpleNamespace(all=False, deep=False))
output = capsys.readouterr().out
assert "Manager: Termux / manual process" in output
assert "Start with: hermes gateway" in output
assert "systemd (user)" not in output
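Reviewer note: the Termux tests in this diff set TERMUX_VERSION and a PREFIX under /data/data/com.termux/files/usr, and expect systemd-specific paths to be skipped, with the linger check reporting (None, "not supported in Termux"). A minimal sketch of that guard is below. The detection rule and both function names are assumptions for illustration; the project's is_termux() and get_systemd_linger_status() are not shown here and may check other signals.

```python
from typing import Callable, Mapping, Optional, Tuple

LingerStatus = Tuple[Optional[bool], str]


def looks_like_termux(env: Mapping[str, str]) -> bool:
    """Assumption: Termux is identified by TERMUX_VERSION or a com.termux PREFIX."""
    if env.get("TERMUX_VERSION"):
        return True
    return "com.termux/files/usr" in env.get("PREFIX", "")


def linger_status(env: Mapping[str, str], query_loginctl: Callable[[], LingerStatus]) -> LingerStatus:
    """Short-circuit before touching loginctl/systemctl when running under Termux."""
    if looks_like_termux(env):
        return (None, "not supported in Termux")
    return query_loginctl()


if __name__ == "__main__":
    import os
    # On a non-Termux host this simply reports whatever the stub returns.
    print(linger_status(os.environ, lambda: (False, "")))
```

Returning early is what lets the status test above assert that subprocess.run is never reached on Termux.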
@@ -0,0 +1,106 @@
"""Regression tests for numbered fallbacks when TerminalMenu cannot initialize."""
import subprocess
import sys
import types
from hermes_cli.config import load_config, save_config
class _BrokenTerminalMenu:
def __init__(self, *args, **kwargs):
raise subprocess.CalledProcessError(2, ["tput", "clear"])
def test_prompt_model_selection_falls_back_on_terminalmenu_runtime_error(monkeypatch):
from hermes_cli.auth import _prompt_model_selection
monkeypatch.setitem(
sys.modules,
"simple_term_menu",
types.SimpleNamespace(TerminalMenu=_BrokenTerminalMenu),
)
responses = iter(["2"])
monkeypatch.setattr("builtins.input", lambda _prompt="": next(responses))
selected = _prompt_model_selection(["model-a", "model-b"])
assert selected == "model-b"
def test_prompt_reasoning_effort_falls_back_on_terminalmenu_runtime_error(monkeypatch):
from hermes_cli.main import _prompt_reasoning_effort_selection
monkeypatch.setitem(
sys.modules,
"simple_term_menu",
types.SimpleNamespace(TerminalMenu=_BrokenTerminalMenu),
)
responses = iter(["3"])
monkeypatch.setattr("builtins.input", lambda _prompt="": next(responses))
selected = _prompt_reasoning_effort_selection(["low", "medium", "high"], current_effort="")
assert selected == "high"
def test_remove_custom_provider_falls_back_on_terminalmenu_runtime_error(tmp_path, monkeypatch):
from hermes_cli.main import _remove_custom_provider
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
monkeypatch.setitem(
sys.modules,
"simple_term_menu",
types.SimpleNamespace(TerminalMenu=_BrokenTerminalMenu),
)
cfg = load_config()
cfg["custom_providers"] = [
{"name": "Local A", "base_url": "http://localhost:8001/v1"},
{"name": "Local B", "base_url": "http://localhost:8002/v1"},
]
save_config(cfg)
responses = iter(["1"])
monkeypatch.setattr("builtins.input", lambda _prompt="": next(responses))
_remove_custom_provider(cfg)
reloaded = load_config()
assert reloaded["custom_providers"] == [
{"name": "Local B", "base_url": "http://localhost:8002/v1"},
]
def test_named_custom_provider_model_picker_falls_back_on_terminalmenu_runtime_error(tmp_path, monkeypatch):
from hermes_cli.main import _model_flow_named_custom
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
monkeypatch.setitem(
sys.modules,
"simple_term_menu",
types.SimpleNamespace(TerminalMenu=_BrokenTerminalMenu),
)
monkeypatch.setattr("hermes_cli.models.fetch_api_models", lambda *args, **kwargs: ["model-a", "model-b"])
monkeypatch.setattr("hermes_cli.auth.deactivate_provider", lambda: None)
cfg = load_config()
save_config(cfg)
responses = iter(["2"])
monkeypatch.setattr("builtins.input", lambda _prompt="": next(responses))
_model_flow_named_custom(
cfg,
{
"name": "Local",
"base_url": "http://localhost:8000/v1",
"api_key": "",
"model": "",
},
)
reloaded = load_config()
assert reloaded["model"]["provider"] == "custom"
assert reloaded["model"]["base_url"] == "http://localhost:8000/v1"
assert reloaded["model"]["default"] == "model-b"
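The four tests above all exercise one pattern: when constructing the terminal menu raises, the picker falls back to a numbered prompt read through `input`. A minimal sketch of that pattern follows. `pick_with_fallback`, `menu_factory`, and `read_line` are hypothetical names for illustration, and the set of caught exceptions is an assumption; the real helpers in `hermes_cli` are not shown in this diff and may differ.

```python
import subprocess
from typing import Callable, Sequence


def pick_with_fallback(
    options: Sequence[str],
    menu_factory: Callable[..., object],
    read_line: Callable[[str], str] = input,
) -> str:
    """Return one of `options`, preferring the interactive menu (hypothetical helper)."""
    try:
        menu = menu_factory(list(options))
        index = menu.show()
        if index is not None:
            return options[index]
    except (subprocess.CalledProcessError, OSError, RuntimeError):
        # The terminal cannot drive the menu (for example `tput` failed),
        # so fall through to the numbered prompt below.
        pass

    for number, option in enumerate(options, start=1):
        print(f"  {number}. {option}")
    choice = int(read_line("Choice: "))
    if not 1 <= choice <= len(options):
        raise ValueError(f"choice {choice} is out of range")
    return options[choice - 1]
```

Passing the reader as a parameter keeps the sketch testable without patching `builtins.input`; the tests in the diff patch `builtins.input` instead because the real functions call it directly.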
+9 -31
@@ -213,8 +213,12 @@ def test_restore_stashed_changes_keeps_going_when_drop_fails(monkeypatch, tmp_pa
assert "git stash drop stash@{0}" in out
def test_restore_stashed_changes_prompts_before_reset_on_conflict(monkeypatch, tmp_path, capsys):
"""When conflicts occur interactively, user is prompted before reset."""
def test_restore_stashed_changes_always_resets_on_conflict(monkeypatch, tmp_path, capsys):
"""Conflicts always auto-reset (no prompt) and return False, even interactively.
Leaving conflict markers in source files makes hermes unrunnable (SyntaxError).
The stash is preserved for manual recovery; cmd_update continues normally.
"""
calls = []
def fake_run(cmd, **kwargs):
@@ -230,45 +234,19 @@ def test_restore_stashed_changes_prompts_before_reset_on_conflict(monkeypatch, t
monkeypatch.setattr(hermes_main.subprocess, "run", fake_run)
monkeypatch.setattr("builtins.input", lambda: "y")
with pytest.raises(SystemExit, match="1"):
hermes_main._restore_stashed_changes(["git"], tmp_path, "abc123", prompt_user=True)
result = hermes_main._restore_stashed_changes(["git"], tmp_path, "abc123", prompt_user=True)
assert result is False
out = capsys.readouterr().out
assert "Conflicted files:" in out
assert "hermes_cli/main.py" in out
assert "stashed changes are preserved" in out
assert "Reset working tree to clean state" in out
assert "Working tree reset to clean state" in out
assert "git stash apply abc123" in out
reset_calls = [c for c, _ in calls if c[1:3] == ["reset", "--hard"]]
assert len(reset_calls) == 1
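The behavior asserted above (list the conflicted files, hard-reset the tree, keep the stash, return `False`) can be sketched as below. This is an illustration under stated assumptions, not the code in `hermes_main`: `restore_stash_or_reset` and the injected `run` callable are hypothetical, the messages only approximate the ones the test checks, and dropping the stash after a clean apply is left out.

```python
from types import SimpleNamespace
from typing import Callable, List


def restore_stash_or_reset(
    run: Callable[[List[str]], SimpleNamespace],
    stash_ref: str,
) -> bool:
    """Apply a stash; on conflict, reset the tree and report failure (hypothetical)."""
    applied = run(["git", "stash", "apply", stash_ref])
    if applied.returncode == 0:
        return True

    # Conflict markers left in source files would make the program
    # unrunnable, so the tree is always reset and the stash is kept.
    unmerged = run(["git", "diff", "--name-only", "--diff-filter=U"])
    print("Conflicted files:")
    for name in unmerged.stdout.split():
        print(f"  {name}")
    run(["git", "reset", "--hard", "HEAD"])
    print("Working tree reset to clean state")
    print(f"Your stashed changes are preserved; recover with: git stash apply {stash_ref}")
    return False
```

Returning `False` rather than exiting lets the caller continue with the update, which is what the rewritten test asserts.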
def test_restore_stashed_changes_user_declines_reset(monkeypatch, tmp_path, capsys):
"""When user declines reset, working tree is left as-is."""
calls = []
def fake_run(cmd, **kwargs):
calls.append((cmd, kwargs))
if cmd[1:3] == ["stash", "apply"]:
return SimpleNamespace(stdout="", stderr="conflict\n", returncode=1)
if cmd[1:3] == ["diff", "--name-only"]:
return SimpleNamespace(stdout="cli.py\n", stderr="", returncode=0)
raise AssertionError(f"unexpected command: {cmd}")
monkeypatch.setattr(hermes_main.subprocess, "run", fake_run)
# First input: "y" to restore, second input: "n" to decline reset
inputs = iter(["y", "n"])
monkeypatch.setattr("builtins.input", lambda: next(inputs))
with pytest.raises(SystemExit, match="1"):
hermes_main._restore_stashed_changes(["git"], tmp_path, "abc123", prompt_user=True)
out = capsys.readouterr().out
assert "left as-is" in out
reset_calls = [c for c, _ in calls if c[1:3] == ["reset", "--hard"]]
assert len(reset_calls) == 0
def test_restore_stashed_changes_auto_resets_non_interactive(monkeypatch, tmp_path, capsys):
"""Non-interactive mode auto-resets without prompting and returns False
instead of sys.exit(1) so the update can continue (gateway /update path)."""
@@ -368,9 +368,8 @@ class TestCmdUpdateLaunchdRestart:
monkeypatch.setattr(
gateway_cli, "is_macos", lambda: False,
)
monkeypatch.setattr(
gateway_cli, "is_linux", lambda: True,
)
monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: True)
monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
mock_run.side_effect = _make_run_side_effect(
commit_count="3",
@@ -429,7 +428,8 @@ class TestCmdUpdateSystemService:
):
"""When user systemd is inactive but a system service exists, restart via system scope."""
monkeypatch.setattr(gateway_cli, "is_macos", lambda: False)
monkeypatch.setattr(gateway_cli, "is_linux", lambda: True)
monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: True)
monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
mock_run.side_effect = _make_run_side_effect(
commit_count="3",
@@ -458,7 +458,8 @@ class TestCmdUpdateSystemService:
):
"""When system service restart fails, show the failure message."""
monkeypatch.setattr(gateway_cli, "is_macos", lambda: False)
monkeypatch.setattr(gateway_cli, "is_linux", lambda: True)
monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: True)
monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
mock_run.side_effect = _make_run_side_effect(
commit_count="3",
@@ -480,7 +481,8 @@ class TestCmdUpdateSystemService:
):
"""When both user and system services are active, both are restarted."""
monkeypatch.setattr(gateway_cli, "is_macos", lambda: False)
monkeypatch.setattr(gateway_cli, "is_linux", lambda: True)
monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: True)
monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
mock_run.side_effect = _make_run_side_effect(
commit_count="3",
@@ -563,7 +565,8 @@ class TestServicePidExclusion:
):
"""After systemd restart, the sweep must exclude the service PID."""
monkeypatch.setattr(gateway_cli, "is_macos", lambda: False)
monkeypatch.setattr(gateway_cli, "is_linux", lambda: True)
monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: True)
monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
SERVICE_PID = 55000
@@ -642,7 +645,8 @@ class TestGetServicePids:
"""Unit tests for _get_service_pids()."""
def test_returns_systemd_main_pid(self, monkeypatch):
monkeypatch.setattr(gateway_cli, "is_linux", lambda: True)
monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: True)
monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
monkeypatch.setattr(gateway_cli, "is_macos", lambda: False)
def fake_run(cmd, **kwargs):
@@ -691,7 +695,8 @@ class TestGetServicePids:
def test_excludes_zero_pid(self, monkeypatch):
"""systemd returns MainPID=0 for stopped services; skip those."""
monkeypatch.setattr(gateway_cli, "is_linux", lambda: True)
monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: True)
monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
monkeypatch.setattr(gateway_cli, "is_macos", lambda: False)
def fake_run(cmd, **kwargs):
+81
@@ -172,6 +172,87 @@ class TestHTTP413Compression:
mock_compress.assert_called_once()
assert result["completed"] is True
def test_413_clears_conversation_history_on_persist(self, agent):
"""After 413-triggered compression, _persist_session must receive None history.
Bug: _compress_context() creates a new session and resets _last_flushed_db_idx=0,
but if conversation_history still holds the original (pre-compression) list,
_flush_messages_to_session_db computes flush_from = max(len(history), 0) which
exceeds len(compressed_messages), so messages[flush_from:] is empty and nothing
is written to the new session, so resume reports "Session found but has no messages".
"""
err_413 = _make_413_error()
ok_resp = _mock_response(content="OK", finish_reason="stop")
agent.client.chat.completions.create.side_effect = [err_413, ok_resp]
big_history = [
{"role": "user" if i % 2 == 0 else "assistant", "content": f"msg {i}"}
for i in range(200)
]
persist_calls = []
with (
patch.object(agent, "_compress_context") as mock_compress,
patch.object(
agent, "_persist_session",
side_effect=lambda msgs, hist: persist_calls.append(hist),
),
patch.object(agent, "_save_trajectory"),
patch.object(agent, "_cleanup_task_resources"),
):
mock_compress.return_value = (
[{"role": "user", "content": "summary"}],
"compressed prompt",
)
agent.run_conversation("hello", conversation_history=big_history)
assert len(persist_calls) >= 1, "Expected at least one _persist_session call"
for hist in persist_calls:
assert hist is None, (
f"conversation_history should be None after mid-loop compression, "
f"got list with {len(hist)} items"
)
def test_context_overflow_clears_conversation_history_on_persist(self, agent):
"""After context-overflow compression, _persist_session must receive None history."""
err_400 = Exception(
"Error code: 400 - This endpoint's maximum context length is 128000 tokens. "
"However, you requested about 270460 tokens."
)
err_400.status_code = 400
ok_resp = _mock_response(content="OK", finish_reason="stop")
agent.client.chat.completions.create.side_effect = [err_400, ok_resp]
big_history = [
{"role": "user" if i % 2 == 0 else "assistant", "content": f"msg {i}"}
for i in range(200)
]
persist_calls = []
with (
patch.object(agent, "_compress_context") as mock_compress,
patch.object(
agent, "_persist_session",
side_effect=lambda msgs, hist: persist_calls.append(hist),
),
patch.object(agent, "_save_trajectory"),
patch.object(agent, "_cleanup_task_resources"),
):
mock_compress.return_value = (
[{"role": "user", "content": "summary"}],
"compressed prompt",
)
agent.run_conversation("hello", conversation_history=big_history)
assert len(persist_calls) >= 1
for hist in persist_calls:
assert hist is None, (
f"conversation_history should be None after context-overflow compression, "
f"got list with {len(hist)} items"
)
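The slicing problem described in the docstring of `test_413_clears_conversation_history_on_persist` can be reproduced in isolation. The sketch below is an assumption about the shape of the flush logic, based only on that docstring; `messages_to_flush` is a hypothetical stand-in for `_flush_messages_to_session_db`, whose real code is not part of this diff.

```python
from typing import List, Optional


def messages_to_flush(
    messages: List[dict],
    conversation_history: Optional[List[dict]],
    last_flushed_idx: int,
) -> List[dict]:
    """Return the tail of `messages` that has not been written yet (hypothetical)."""
    start = len(conversation_history) if conversation_history else 0
    flush_from = max(start, last_flushed_idx)
    return messages[flush_from:]


stale_history = [{"role": "user", "content": f"msg {i}"} for i in range(200)]
compressed = [{"role": "user", "content": "summary"}]

# Stale 200-item history: the slice starts past the end, so nothing is flushed.
assert messages_to_flush(compressed, stale_history, 0) == []

# History cleared after compression: the summary message is flushed.
assert messages_to_flush(compressed, None, 0) == compressed
```

This is why both tests require `_persist_session` to receive `None` once compression has run mid-loop.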
def test_400_context_length_triggers_compression(self, agent):
"""A 400 with 'maximum context length' should trigger compression, not abort as generic 4xx.
@@ -262,6 +262,30 @@ class TestTryRecoverPrimaryTransport:
assert result is True
def test_recovers_on_openai_api_connection_error(self):
agent = _make_agent(provider="custom")
error = _make_transport_error("APIConnectionError")
with patch("run_agent.OpenAI", return_value=MagicMock()), \
patch("time.sleep"):
result = agent._try_recover_primary_transport(
error, retry_count=3, max_retries=3,
)
assert result is True
def test_recovers_on_openai_api_timeout_error(self):
agent = _make_agent(provider="custom")
error = _make_transport_error("APITimeoutError")
with patch("run_agent.OpenAI", return_value=MagicMock()), \
patch("time.sleep"):
result = agent._try_recover_primary_transport(
error, retry_count=3, max_retries=3,
)
assert result is True
def test_skipped_when_already_on_fallback(self):
agent = _make_agent(provider="custom")
agent._fallback_activated = True
+39
@@ -225,6 +225,26 @@ class TestDeveloperRoleSwap:
assert kwargs["messages"][0]["role"] == "developer"
class TestBuildApiKwargsChatCompletionsServiceTier:
"""service_tier via request_overrides works on the chat_completions path."""
def test_includes_service_tier_via_request_overrides(self, monkeypatch):
agent = _make_agent(monkeypatch, "openrouter")
agent.model = "gpt-4.1"
agent.request_overrides = {"service_tier": "priority"}
messages = [{"role": "user", "content": "hi"}]
kwargs = agent._build_api_kwargs(messages)
assert kwargs["service_tier"] == "priority"
def test_no_service_tier_when_overrides_empty(self, monkeypatch):
agent = _make_agent(monkeypatch, "openrouter")
agent.model = "gpt-4.1"
agent.request_overrides = {}
messages = [{"role": "user", "content": "hi"}]
kwargs = agent._build_api_kwargs(messages)
assert "service_tier" not in kwargs
class TestBuildApiKwargsAIGateway:
def test_uses_chat_completions_format(self, monkeypatch):
agent = _make_agent(monkeypatch, "ai-gateway", base_url="https://ai-gateway.vercel.sh/v1")
@@ -356,6 +376,25 @@ class TestBuildApiKwargsCodex:
assert "reasoning" in kwargs
assert kwargs["reasoning"]["effort"] == "medium"
def test_includes_service_tier_via_request_overrides(self, monkeypatch):
agent = _make_agent(monkeypatch, "openai-codex", api_mode="codex_responses",
base_url="https://chatgpt.com/backend-api/codex")
agent.model = "gpt-5.4"
agent.service_tier = "priority"
agent.request_overrides = {"service_tier": "priority"}
messages = [{"role": "user", "content": "hi"}]
kwargs = agent._build_api_kwargs(messages)
assert kwargs["service_tier"] == "priority"
def test_omits_max_output_tokens_for_codex_backend(self, monkeypatch):
agent = _make_agent(monkeypatch, "openai-codex", api_mode="codex_responses",
base_url="https://chatgpt.com/backend-api/codex")
agent.model = "gpt-5.4"
agent.max_tokens = 20
messages = [{"role": "user", "content": "hi"}]
kwargs = agent._build_api_kwargs(messages)
assert "max_output_tokens" not in kwargs
def test_includes_encrypted_content_in_include(self, monkeypatch):
agent = _make_agent(monkeypatch, "openai-codex", api_mode="codex_responses",
base_url="https://chatgpt.com/backend-api/codex")
+148
@@ -5,6 +5,7 @@ pieces. The OpenAI client and tool loading are mocked so no network calls
are made.
"""
import io
import json
import logging
import re
@@ -1061,6 +1062,77 @@ class TestExecuteToolCalls:
assert len(messages[0]["content"]) < 150_000
assert ("Truncated" in messages[0]["content"] or "<persisted-output>" in messages[0]["content"])
def test_quiet_tool_output_suppressed_when_progress_callback_present(self, agent):
tc = _mock_tool_call(name="web_search", arguments='{"q":"test"}', call_id="c1")
mock_msg = _mock_assistant_msg(content="", tool_calls=[tc])
messages = []
agent.tool_progress_callback = lambda *args, **kwargs: None
with patch("run_agent.handle_function_call", return_value="search result"), \
patch.object(agent, "_safe_print") as mock_print:
agent._execute_tool_calls(mock_msg, messages, "task-1")
mock_print.assert_not_called()
assert len(messages) == 1
assert messages[0]["role"] == "tool"
def test_quiet_tool_output_prints_without_progress_callback(self, agent):
tc = _mock_tool_call(name="web_search", arguments='{"q":"test"}', call_id="c1")
mock_msg = _mock_assistant_msg(content="", tool_calls=[tc])
messages = []
agent.tool_progress_callback = None
with patch("run_agent.handle_function_call", return_value="search result"), \
patch.object(agent, "_safe_print") as mock_print:
agent._execute_tool_calls(mock_msg, messages, "task-1")
mock_print.assert_called_once()
assert "search" in str(mock_print.call_args.args[0]).lower()
assert len(messages) == 1
assert messages[0]["role"] == "tool"
def test_vprint_suppressed_in_parseable_quiet_mode(self, agent):
agent.suppress_status_output = True
with patch.object(agent, "_safe_print") as mock_print:
agent._vprint("status line", force=True)
agent._vprint("normal line")
mock_print.assert_not_called()
def test_run_conversation_suppresses_retry_noise_in_parseable_quiet_mode(self, agent):
class _RateLimitError(Exception):
status_code = 429
def __str__(self):
return "Error code: 429 - Rate limit exceeded."
responses = [_RateLimitError(), _mock_response(content="Recovered")]
def _fake_api_call(api_kwargs):
result = responses.pop(0)
if isinstance(result, Exception):
raise result
return result
agent.suppress_status_output = True
agent._interruptible_api_call = _fake_api_call
agent._persist_session = lambda *args, **kwargs: None
agent._save_trajectory = lambda *args, **kwargs: None
agent._save_session_log = lambda *args, **kwargs: None
captured = io.StringIO()
agent._print_fn = lambda *args, **kw: print(*args, file=captured, **kw)
with patch("run_agent.time.sleep", return_value=None):
result = agent.run_conversation("hello")
assert result["completed"] is True
assert result["final_response"] == "Recovered"
output = captured.getvalue()
assert "API call failed" not in output
assert "Rate limit reached" not in output
class TestConcurrentToolExecution:
"""Tests for _execute_tool_calls_concurrent and dispatch logic."""
@@ -1877,6 +1949,68 @@ class TestRunConversation:
assert result["final_response"] is not None
assert "Thinking Budget Exhausted" in result["final_response"]
def test_length_with_tool_calls_returns_partial_without_executing_tools(self, agent):
self._setup_agent(agent)
bad_tc = _mock_tool_call(
name="write_file",
arguments='{"path":"report.md","content":"partial',
call_id="c1",
)
resp = _mock_response(content="", finish_reason="length", tool_calls=[bad_tc])
agent.client.chat.completions.create.return_value = resp
with (
patch("run_agent.handle_function_call") as mock_handle_function_call,
patch.object(agent, "_persist_session"),
patch.object(agent, "_save_trajectory"),
patch.object(agent, "_cleanup_task_resources"),
):
result = agent.run_conversation("write the report")
assert result["completed"] is False
assert result["partial"] is True
assert "truncated due to output length limit" in result["error"]
mock_handle_function_call.assert_not_called()
def test_truncated_tool_call_retries_once_before_refusing(self, agent):
"""When tool call args are truncated, the agent retries the API call
once. If the retry succeeds (valid JSON args), tool execution proceeds."""
self._setup_agent(agent)
agent.valid_tool_names.add("write_file")
bad_tc = _mock_tool_call(
name="write_file",
arguments='{"path":"report.md","content":"partial',
call_id="c1",
)
truncated_resp = _mock_response(
content="", finish_reason="length", tool_calls=[bad_tc],
)
good_tc = _mock_tool_call(
name="write_file",
arguments='{"path":"report.md","content":"full content"}',
call_id="c2",
)
good_resp = _mock_response(
content="", finish_reason="stop", tool_calls=[good_tc],
)
with (
patch("run_agent.handle_function_call", return_value='{"success":true}') as mock_hfc,
patch.object(agent, "_persist_session"),
patch.object(agent, "_save_trajectory"),
patch.object(agent, "_cleanup_task_resources"),
):
# First call: truncated → retry. Second: valid → execute tool.
# Third: final text response.
final_resp = _mock_response(content="Done!", finish_reason="stop")
agent.client.chat.completions.create.side_effect = [
truncated_resp, good_resp, final_resp,
]
result = agent.run_conversation("write the report")
# Tool was executed on the retry (good_resp)
mock_hfc.assert_called_once()
assert result["final_response"] == "Done!"
class TestRetryExhaustion:
"""Regression: retry_count > max_retries was dead code (off-by-one).
@@ -3010,6 +3144,20 @@ class TestStreamingApiCall:
assert tc[0].function.name == "search"
assert tc[1].function.name == "read"
def test_truncated_tool_call_args_upgrade_finish_reason_to_length(self, agent):
chunks = [
_make_chunk(tool_calls=[_make_tc_delta(0, "call_1", "write_file", '{"path":"x.txt","content":"hel')]),
]
agent.client.chat.completions.create.return_value = iter(chunks)
resp = agent._interruptible_streaming_api_call({"messages": []})
tc = resp.choices[0].message.tool_calls
assert len(tc) == 1
assert tc[0].function.name == "write_file"
assert tc[0].function.arguments == '{"path":"x.txt","content":"hel'
assert resp.choices[0].finish_reason == "length"
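The streaming test above expects a response whose tool-call arguments stop mid-string to be reported with `finish_reason == "length"`. One way to detect that condition is to check whether the accumulated arguments parse as JSON. The sketch below assumes that approach; `args_look_truncated` and `effective_finish_reason` are hypothetical names, and treating empty arguments as complete is an assumption, not something the diff states.

```python
import json
from typing import Iterable, Optional


def args_look_truncated(arguments: str) -> bool:
    """True when non-empty tool-call arguments fail to parse as JSON."""
    if not arguments.strip():
        # Assumption: a tool call without arguments is not a truncation.
        return False
    try:
        json.loads(arguments)
    except json.JSONDecodeError:
        return True
    return False


def effective_finish_reason(
    reported: Optional[str],
    tool_call_arguments: Iterable[str],
) -> Optional[str]:
    """Upgrade the finish reason to "length" if any argument string is cut off."""
    if any(args_look_truncated(args) for args in tool_call_arguments):
        return "length"
    return reported
```

With the chunk used in the test, `effective_finish_reason(None, ['{"path":"x.txt","content":"hel'])` yields `"length"`, which sends the response down the truncation path instead of executing a half-written `write_file` call.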
def test_ollama_reused_index_separate_tool_calls(self, agent):
"""Ollama sends every tool call at index 0 with different ids.
@@ -648,6 +648,15 @@ def test_preflight_codex_api_kwargs_allows_reasoning_and_temperature(monkeypatch
assert result["max_output_tokens"] == 4096
def test_preflight_codex_api_kwargs_allows_service_tier(monkeypatch):
agent = _build_agent(monkeypatch)
kwargs = _codex_request_kwargs()
kwargs["service_tier"] = "priority"
result = agent._preflight_codex_api_kwargs(kwargs)
assert result["service_tier"] == "priority"
def test_run_conversation_codex_replay_payload_keeps_call_id(monkeypatch):
agent = _build_agent(monkeypatch)
responses = [_codex_tool_call_response(), _codex_message_response("done")]
@@ -13,6 +13,7 @@ from tools.browser_tool import (
_find_agent_browser,
_run_browser_command,
_SANE_PATH,
check_browser_requirements,
)
@@ -149,6 +150,31 @@ class TestFindAgentBrowser:
_find_agent_browser()
class TestBrowserRequirements:
def test_termux_requires_real_agent_browser_install_not_npx_fallback(self, monkeypatch):
monkeypatch.setenv("TERMUX_VERSION", "0.118.3")
monkeypatch.setenv("PREFIX", "/data/data/com.termux/files/usr")
monkeypatch.setattr("tools.browser_tool._is_camofox_mode", lambda: False)
monkeypatch.setattr("tools.browser_tool._get_cloud_provider", lambda: None)
monkeypatch.setattr("tools.browser_tool._find_agent_browser", lambda: "npx agent-browser")
assert check_browser_requirements() is False
class TestRunBrowserCommandTermuxFallback:
def test_termux_local_mode_rejects_bare_npx_fallback(self, monkeypatch):
monkeypatch.setenv("TERMUX_VERSION", "0.118.3")
monkeypatch.setenv("PREFIX", "/data/data/com.termux/files/usr")
monkeypatch.setattr("tools.browser_tool._find_agent_browser", lambda: "npx agent-browser")
monkeypatch.setattr("tools.browser_tool._get_cloud_provider", lambda: None)
result = _run_browser_command("task-1", "navigate", ["https://example.com"])
assert result["success"] is False
assert "bare npx fallback" in result["error"]
assert "agent-browser install" in result["error"]
class TestRunBrowserCommandPathConstruction:
"""Verify _run_browser_command() includes Homebrew node dirs in subprocess PATH."""
+43
@@ -35,6 +35,7 @@ from hermes_cli.clipboard import (
_windows_has_image,
_convert_to_png,
)
from cli import _should_auto_attach_clipboard_image_on_paste
FAKE_PNG = b"\x89PNG\r\n\x1a\n" + b"\x00" * 100
FAKE_BMP = b"BM" + b"\x00" * 100
@@ -919,6 +920,48 @@ class TestTryAttachClipboardImage:
assert path.suffix == ".png"
class TestAutoAttachClipboardImageOnPaste:
def test_skips_auto_attach_for_plain_text_paste(self):
assert _should_auto_attach_clipboard_image_on_paste("hello world") is False
def test_skips_auto_attach_for_whitespace_and_text_paste(self):
assert _should_auto_attach_clipboard_image_on_paste(" hello world ") is False
def test_allows_auto_attach_for_empty_paste(self):
assert _should_auto_attach_clipboard_image_on_paste("") is True
def test_allows_auto_attach_for_whitespace_only_paste(self):
assert _should_auto_attach_clipboard_image_on_paste(" \n\t ") is True
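The four assertions in `TestAutoAttachClipboardImageOnPaste` are satisfied by a predicate that only allows auto-attach when the pasted text is empty or whitespace. The sketch below is one such predicate; `should_auto_attach_on_paste` is a hypothetical name, and the real `_should_auto_attach_clipboard_image_on_paste` in `cli.py` is not shown here and may check more than this.

```python
def should_auto_attach_on_paste(pasted_text: str) -> bool:
    """Allow clipboard-image auto-attach only when the paste carries no text."""
    return not pasted_text.strip()
```

The idea is that a paste containing real text is a text paste, so a stale image on the clipboard should not be attached alongside it.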
class TestVoiceSubmission:
@pytest.fixture
def cli(self):
from cli import HermesCLI
cli_obj = HermesCLI.__new__(HermesCLI)
cli_obj._attached_images = [Path("/tmp/stale.png")]
cli_obj._pending_input = queue.Queue()
cli_obj._voice_lock = MagicMock()
cli_obj._voice_processing = True
cli_obj._voice_recording = True
cli_obj._voice_continuous = False
cli_obj._no_speech_count = 0
cli_obj._voice_recorder = MagicMock()
cli_obj._voice_recorder.stop.return_value = "/tmp/fake.wav"
cli_obj._app = None
return cli_obj
def test_voice_transcript_clears_stale_attached_images(self, cli):
with patch("tools.voice_mode.play_beep"):
with patch("tools.voice_mode.transcribe_recording", return_value={"success": True, "transcript": "hello"}):
with patch("os.path.isfile", return_value=False):
with patch("cli._cprint"):
cli._voice_stop_and_transcribe()
assert cli._attached_images == []
assert cli._pending_input.get_nowait() == "hello"
# ═════════════════════════════════════════════════════════════════════════
# Level 4: Queue routing — tuple unpacking in process_loop
# ═════════════════════════════════════════════════════════════════════════
+43
@@ -44,6 +44,7 @@ from tools.code_execution_tool import (
build_execute_code_schema,
EXECUTE_CODE_SCHEMA,
_TOOL_DOC_LINES,
_execute_remote,
)
@@ -115,6 +116,48 @@ class TestHermesToolsGeneration(unittest.TestCase):
self.assertIn("def retry(", src)
self.assertIn("import json, os, socket, shlex, time", src)
def test_file_transport_uses_tempfile_fallback_for_rpc_dir(self):
src = generate_hermes_tools_module(["terminal"], transport="file")
self.assertIn("import json, os, shlex, tempfile, time", src)
self.assertIn("os.path.join(tempfile.gettempdir(), \"hermes_rpc\")", src)
self.assertNotIn('os.environ.get("HERMES_RPC_DIR", "/tmp/hermes_rpc")', src)
class TestExecuteCodeRemoteTempDir(unittest.TestCase):
def test_execute_remote_uses_backend_temp_dir_for_sandbox(self):
class FakeEnv:
def __init__(self):
self.commands = []
def get_temp_dir(self):
return "/data/data/com.termux/files/usr/tmp"
def execute(self, command, cwd=None, timeout=None):
self.commands.append((command, cwd, timeout))
if "command -v python3" in command:
return {"output": "OK\n"}
if "python3 script.py" in command:
return {"output": "hello\n", "returncode": 0}
return {"output": ""}
env = FakeEnv()
fake_thread = MagicMock()
with patch("tools.code_execution_tool._load_config", return_value={"timeout": 30, "max_tool_calls": 5}), \
patch("tools.code_execution_tool._get_or_create_env", return_value=(env, "ssh")), \
patch("tools.code_execution_tool._ship_file_to_remote"), \
patch("tools.code_execution_tool.threading.Thread", return_value=fake_thread):
result = json.loads(_execute_remote("print('hello')", "task-1", ["terminal"]))
self.assertEqual(result["status"], "success")
mkdir_cmd = env.commands[1][0]
run_cmd = next(cmd for cmd, _, _ in env.commands if "python3 script.py" in cmd)
cleanup_cmd = env.commands[-1][0]
self.assertIn("mkdir -p /data/data/com.termux/files/usr/tmp/hermes_exec_", mkdir_cmd)
self.assertIn("HERMES_RPC_DIR=/data/data/com.termux/files/usr/tmp/hermes_exec_", run_cmd)
self.assertIn("rm -rf /data/data/com.termux/files/usr/tmp/hermes_exec_", cleanup_cmd)
self.assertNotIn("mkdir -p /tmp/hermes_exec_", mkdir_cmd)
@unittest.skipIf(sys.platform == "win32", "UDS not available on Windows")
class TestExecuteCode(unittest.TestCase):
+257
@@ -0,0 +1,257 @@
"""Tests for FileSyncManager — mtime tracking, deletion detection, transactional rollback."""
import os
import time
from pathlib import Path
from unittest.mock import MagicMock, patch
import pytest
from tools.environments.file_sync import FileSyncManager, _FORCE_SYNC_ENV
@pytest.fixture
def tmp_files(tmp_path):
"""Create a few temp files to use as sync sources."""
files = {}
for name in ("cred_a.json", "cred_b.json", "skill_main.py"):
p = tmp_path / name
p.write_text(f"content of {name}")
files[name] = str(p)
return files
def _make_get_files(tmp_files, remote_base="/root/.hermes"):
"""Return a get_files_fn that maps local files to remote paths."""
mapping = [(hp, f"{remote_base}/{name}") for name, hp in tmp_files.items()]
def get_files():
return [(hp, rp) for hp, rp in mapping if Path(hp).exists()]
return get_files
def _make_manager(tmp_files, remote_base="/root/.hermes", upload=None, delete=None):
"""Create a FileSyncManager with test callbacks."""
return FileSyncManager(
get_files_fn=_make_get_files(tmp_files, remote_base),
upload_fn=upload or MagicMock(),
delete_fn=delete or MagicMock(),
)
class TestMtimeSkip:
def test_unchanged_files_not_re_uploaded(self, tmp_files):
upload = MagicMock()
mgr = _make_manager(tmp_files, upload=upload)
mgr.sync(force=True)
assert upload.call_count == 3
upload.reset_mock()
mgr.sync(force=True)
assert upload.call_count == 0, "unchanged files should not be re-uploaded"
def test_changed_file_re_uploaded(self, tmp_files):
upload = MagicMock()
mgr = _make_manager(tmp_files, upload=upload)
mgr.sync(force=True)
upload.reset_mock()
# Touch one file
time.sleep(0.05)
Path(tmp_files["cred_a.json"]).write_text("updated content")
mgr.sync(force=True)
assert upload.call_count == 1
assert tmp_files["cred_a.json"] in upload.call_args[0][0]
def test_new_file_detected(self, tmp_files, tmp_path):
upload = MagicMock()
mgr = FileSyncManager(
get_files_fn=_make_get_files(tmp_files),
upload_fn=upload,
delete_fn=MagicMock(),
)
mgr.sync(force=True)
assert upload.call_count == 3
# Add a new file
new_file = tmp_path / "new_skill.py"
new_file.write_text("new content")
tmp_files["new_skill.py"] = str(new_file)
# Point the existing manager at the updated file list
mgr._get_files_fn = _make_get_files(tmp_files)
upload.reset_mock()
mgr.sync(force=True)
assert upload.call_count == 1
class TestDeletion:
def test_removed_file_triggers_delete(self, tmp_files):
upload = MagicMock()
delete = MagicMock()
mgr = _make_manager(tmp_files, upload=upload, delete=delete)
mgr.sync(force=True)
delete.assert_not_called()
# Remove a file locally
os.unlink(tmp_files["cred_b.json"])
del tmp_files["cred_b.json"]
mgr._get_files_fn = _make_get_files(tmp_files)
mgr.sync(force=True)
delete.assert_called_once()
deleted_paths = delete.call_args[0][0]
assert any("cred_b.json" in p for p in deleted_paths)
def test_no_delete_when_no_removals(self, tmp_files):
delete = MagicMock()
mgr = _make_manager(tmp_files, delete=delete)
mgr.sync(force=True)
mgr.sync(force=True)
delete.assert_not_called()
class TestTransactionalRollback:
def test_upload_failure_rolls_back(self, tmp_files):
call_count = 0
def failing_upload(host_path, remote_path):
nonlocal call_count
call_count += 1
if call_count == 2:
raise RuntimeError("upload failed")
mgr = _make_manager(tmp_files, upload=failing_upload)
# First sync fails (swallowed, logged, state rolled back)
mgr.sync(force=True)
# State should be empty (rolled back) — next sync retries all files
good_upload = MagicMock()
mgr._upload_fn = good_upload
mgr.sync(force=True)
assert good_upload.call_count == 3, "all files should be retried after rollback"
def test_delete_failure_rolls_back(self, tmp_files):
upload = MagicMock()
mgr = _make_manager(tmp_files, upload=upload)
# Initial sync
mgr.sync(force=True)
# Remove a file
os.unlink(tmp_files["skill_main.py"])
del tmp_files["skill_main.py"]
mgr._get_files_fn = _make_get_files(tmp_files)
# Delete fails (swallowed, state rolled back)
mgr._delete_fn = MagicMock(side_effect=RuntimeError("delete failed"))
mgr.sync(force=True)
# Next sync should retry the delete
good_delete = MagicMock()
mgr._delete_fn = good_delete
upload.reset_mock()
mgr.sync(force=True)
good_delete.assert_called_once()
class TestRateLimiting:
def test_sync_skipped_within_interval(self, tmp_files):
upload = MagicMock()
mgr = FileSyncManager(
get_files_fn=_make_get_files(tmp_files),
upload_fn=upload,
delete_fn=MagicMock(),
sync_interval=10.0,
)
mgr.sync(force=True)
assert upload.call_count == 3
upload.reset_mock()
# Without force, should skip due to rate limit
mgr.sync()
assert upload.call_count == 0
def test_force_bypasses_rate_limit(self, tmp_files, tmp_path):
upload = MagicMock()
mgr = FileSyncManager(
get_files_fn=_make_get_files(tmp_files),
upload_fn=upload,
delete_fn=MagicMock(),
sync_interval=10.0,
)
mgr.sync(force=True)
upload.reset_mock()
# Add a new file and force sync
new_file = tmp_path / "forced.txt"
new_file.write_text("forced")
tmp_files["forced.txt"] = str(new_file)
mgr._get_files_fn = _make_get_files(tmp_files)
mgr.sync(force=True)
assert upload.call_count == 1
def test_env_var_forces_sync(self, tmp_files, tmp_path):
upload = MagicMock()
mgr = FileSyncManager(
get_files_fn=_make_get_files(tmp_files),
upload_fn=upload,
delete_fn=MagicMock(),
sync_interval=10.0,
)
mgr.sync(force=True)
upload.reset_mock()
new_file = tmp_path / "env_forced.txt"
new_file.write_text("env forced")
tmp_files["env_forced.txt"] = str(new_file)
mgr._get_files_fn = _make_get_files(tmp_files)
with patch.dict(os.environ, {_FORCE_SYNC_ENV: "1"}):
mgr.sync()
assert upload.call_count == 1
class TestEdgeCases:
def test_empty_file_list(self):
upload = MagicMock()
delete = MagicMock()
mgr = FileSyncManager(
get_files_fn=lambda: [],
upload_fn=upload,
delete_fn=delete,
)
mgr.sync(force=True)
upload.assert_not_called()
delete.assert_not_called()
def test_file_disappears_between_list_and_upload(self, tmp_path):
"""File listed by get_files but deleted before _file_mtime_key reads it."""
f = tmp_path / "ephemeral.txt"
f.write_text("here now")
upload = MagicMock()
mgr = FileSyncManager(
get_files_fn=lambda: [(str(f), "/root/.hermes/ephemeral.txt")],
upload_fn=upload,
delete_fn=MagicMock(),
)
# Delete the file before sync can stat it
os.unlink(str(f))
mgr.sync(force=True)
upload.assert_not_called() # _file_mtime_key returns None, skipped
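The tests in this file describe four behaviors: skip unchanged files by mtime, delete remote files that vanished locally, retry everything after a failed sync, and rate-limit unless forced. The sketch below shows one way to get all four. It is not the `FileSyncManager` from `tools/environments/file_sync.py`: `MiniFileSync` and its attribute names are hypothetical, and it uses commit-on-success in place of an explicit rollback, which gives the same retry behavior.

```python
import os
import time
from typing import Callable, Dict, List, Optional, Tuple

FileList = List[Tuple[str, str]]  # (host path, remote path)


class MiniFileSync:
    """Hypothetical mtime-tracking sync helper, for illustration only."""

    def __init__(
        self,
        get_files: Callable[[], FileList],
        upload: Callable[[str, str], None],
        delete: Callable[[List[str]], None],
        interval: float = 0.0,
    ) -> None:
        self._get_files = get_files
        self._upload = upload
        self._delete = delete
        self._interval = interval
        self._synced: Dict[str, Tuple[float, int]] = {}  # remote -> (mtime, size)
        self._last_ok: Optional[float] = None

    def sync(self, force: bool = False) -> None:
        now = time.monotonic()
        recent = self._last_ok is not None and now - self._last_ok < self._interval
        if recent and not force:
            return
        try:
            current: Dict[str, Tuple[float, int]] = {}
            for host, remote in self._get_files():
                try:
                    st = os.stat(host)
                except OSError:
                    continue  # file vanished between listing and stat
                key = (st.st_mtime, st.st_size)
                current[remote] = key
                if self._synced.get(remote) != key:
                    self._upload(host, remote)
            removed = [r for r in self._synced if r not in current]
            if removed:
                self._delete(removed)
        except Exception:
            # Tracked state is untouched, so the next sync retries the work.
            return
        self._synced = current
        self._last_ok = now
```

Because `_synced` is replaced only after every upload and delete has succeeded, a failure partway through leaves the previous state in place and the next call repeats the whole batch.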
+127
@@ -0,0 +1,127 @@
"""Reproducible perf benchmark for file sync overhead.
Measures actual env.execute() wall-clock time, no LLM in the loop.
Run with: uv run pytest tests/tools/test_file_sync_perf.py -v -o "addopts=" -s
Requires backends to be configured (SSH host, Modal creds, etc.).
Skip markers gate each backend.
"""
import statistics
import time
import pytest
# ---------------------------------------------------------------------------
# Backend fixtures
# ---------------------------------------------------------------------------
@pytest.fixture
def local_env():
from tools.environments.local import LocalEnvironment
env = LocalEnvironment(cwd="/tmp", timeout=30)
yield env
env.cleanup()
@pytest.fixture
def ssh_env():
import os
host = os.environ.get("TERMINAL_SSH_HOST")
user = os.environ.get("TERMINAL_SSH_USER")
if not host or not user:
pytest.skip("TERMINAL_SSH_HOST and TERMINAL_SSH_USER required")
from tools.environments.ssh import SSHEnvironment
env = SSHEnvironment(host=host, user=user, cwd="/tmp", timeout=30)
yield env
env.cleanup()
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _time_executions(env, command: str, n: int = 10) -> list[float]:
    """Run *command* n times and return per-call wall-clock durations."""
    durations = []
    for _ in range(n):
        t0 = time.monotonic()
        result = env.execute(command, timeout=10)
        elapsed = time.monotonic() - t0
        durations.append(elapsed)
        assert result.get("returncode", result.get("exit_code", -1)) == 0, \
            f"command failed: {result}"
    return durations


def _report(label: str, durations: list[float]):
    """Print timing stats."""
    med = statistics.median(durations)
    mean = statistics.mean(durations)
    p95 = sorted(durations)[int(len(durations) * 0.95)]
    print(f"\n {label}:")
    print(f" n={len(durations)} median={med*1000:.0f}ms mean={mean*1000:.0f}ms p95={p95*1000:.0f}ms")
    print(f" raw: {[f'{d*1000:.0f}ms' for d in durations]}")
    return med
# ---------------------------------------------------------------------------
# Tests
# ---------------------------------------------------------------------------
class TestLocalPerf:
"""Local baseline — no file sync, no network. Sets the floor."""
def test_echo_latency(self, local_env):
durations = _time_executions(local_env, "echo hello", n=20)
med = _report("local echo", durations)
# Spawn-per-call overhead should be < 500ms
assert med < 0.5, f"local echo median {med*1000:.0f}ms exceeds 500ms"
@pytest.mark.ssh
class TestSSHPerf:
"""SSH with FileSyncManager — mtime skip should make sync ~0ms."""
def test_echo_latency(self, ssh_env):
"""Sequential echo commands — measures per-command overhead including sync check."""
durations = _time_executions(ssh_env, "echo hello", n=20)
med = _report("ssh echo (with sync check)", durations)
# SSH round-trip + spawn-per-call, but sync should be ~0ms (rate limited)
assert med < 2.0, f"ssh echo median {med*1000:.0f}ms exceeds 2000ms"
def test_sync_overhead_after_interval(self, ssh_env):
"""Measure sync cost when the rate-limit window has expired.
Sleep past the 5s interval, then time the next command which
triggers a real sync cycle (but with mtime skip, should be fast).
"""
# Warm up
ssh_env.execute("echo warmup", timeout=10)
# Wait for sync interval to expire
time.sleep(6)
# This command will trigger a real sync cycle
t0 = time.monotonic()
result = ssh_env.execute("echo after-interval", timeout=10)
elapsed = time.monotonic() - t0
print(f"\n ssh echo after 6s wait (sync triggered): {elapsed*1000:.0f}ms")
assert result.get("returncode", result.get("exit_code", -1)) == 0
# Even with sync triggered, mtime skip should keep it fast
# Old rsync approach: ~2-3s. New mtime skip: should be < 1.5s
assert elapsed < 1.5, f"sync-triggered command took {elapsed*1000:.0f}ms (expected < 1500ms)"
def test_no_sync_within_interval(self, ssh_env):
"""Rapid sequential commands within 5s window — no sync at all."""
# First command triggers sync
ssh_env.execute("echo prime", timeout=10)
# Immediately run 10 more — all within rate-limit window
durations = _time_executions(ssh_env, "echo rapid", n=10)
med = _report("ssh echo (within interval, no sync)", durations)
# Should be pure SSH overhead, no sync
assert med < 1.5, f"within-interval median {med*1000:.0f}ms exceeds 1500ms"
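Reviewer note: the benchmark docstrings above rely on two behaviours, a rate-limit window ("5s interval") and an mtime skip. The sketch below shows one way the two could combine. It is not the `FileSyncManager` implementation from this PR: `SketchSyncManager`, `file_mtime_key`, the injectable `clock` parameter, and the `(local, remote)` upload signature are all assumptions made for illustration.

```python
import os
import time
from typing import Callable, Optional


def file_mtime_key(path: str) -> Optional[tuple[float, int]]:
    """Return (mtime, size) for *path*, or None if it cannot be stat'ed."""
    try:
        st = os.stat(path)
    except OSError:
        return None
    return (st.st_mtime, st.st_size)


class SketchSyncManager:
    """Hypothetical rate-limited sync loop with an mtime-based skip."""

    def __init__(
        self,
        get_files_fn: Callable[[], list],
        upload_fn: Callable[[str, str], None],
        sync_interval: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._get_files_fn = get_files_fn
        self._upload_fn = upload_fn
        self._sync_interval = sync_interval
        self._clock = clock
        self._last_sync: Optional[float] = None
        self._synced: dict = {}  # remote path -> key of the last uploaded version

    def sync(self, force: bool = False) -> int:
        """Upload changed files and return how many uploads happened."""
        now = self._clock()
        within_window = (
            self._last_sync is not None
            and now - self._last_sync < self._sync_interval
        )
        if within_window and not force:
            return 0  # rate limited: no listing, no stat calls
        self._last_sync = now

        uploaded = 0
        for local, remote in self._get_files_fn():
            key = file_mtime_key(local)
            if key is None:
                continue  # file vanished between listing and stat
            if self._synced.get(remote) == key:
                continue  # unchanged since the last upload
            self._upload_fn(local, remote)
            self._synced[remote] = key
            uploaded += 1
        return uploaded
```

Under these assumptions, a call inside the window costs one clock read, and a call after the window costs one `os.stat` per tracked file. That matches the kind of overhead the SSH thresholds above are meant to bound.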
@@ -0,0 +1,51 @@
from unittest.mock import patch
from tools.environments.local import LocalEnvironment
class TestLocalTempDir:
def test_uses_os_tmpdir_for_session_artifacts(self, monkeypatch):
monkeypatch.setenv("TMPDIR", "/data/data/com.termux/files/usr/tmp")
monkeypatch.delenv("TMP", raising=False)
monkeypatch.delenv("TEMP", raising=False)
with patch.object(LocalEnvironment, "init_session", autospec=True, return_value=None):
env = LocalEnvironment(cwd=".", timeout=10)
assert env.get_temp_dir() == "/data/data/com.termux/files/usr/tmp"
assert env._snapshot_path == f"/data/data/com.termux/files/usr/tmp/hermes-snap-{env._session_id}.sh"
assert env._cwd_file == f"/data/data/com.termux/files/usr/tmp/hermes-cwd-{env._session_id}.txt"
def test_prefers_backend_env_tmpdir_override(self, monkeypatch):
monkeypatch.delenv("TMPDIR", raising=False)
monkeypatch.delenv("TMP", raising=False)
monkeypatch.delenv("TEMP", raising=False)
with patch.object(LocalEnvironment, "init_session", autospec=True, return_value=None):
env = LocalEnvironment(
cwd=".",
timeout=10,
env={"TMPDIR": "/data/data/com.termux/files/home/.cache/hermes-tmp/"},
)
assert env.get_temp_dir() == "/data/data/com.termux/files/home/.cache/hermes-tmp"
assert env._snapshot_path == (
f"/data/data/com.termux/files/home/.cache/hermes-tmp/hermes-snap-{env._session_id}.sh"
)
assert env._cwd_file == (
f"/data/data/com.termux/files/home/.cache/hermes-tmp/hermes-cwd-{env._session_id}.txt"
)
def test_falls_back_to_tempfile_when_tmp_missing(self, monkeypatch):
monkeypatch.delenv("TMPDIR", raising=False)
monkeypatch.delenv("TMP", raising=False)
monkeypatch.delenv("TEMP", raising=False)
with patch("tools.environments.local.os.path.isdir", return_value=False), \
patch("tools.environments.local.os.access", return_value=False), \
patch("tools.environments.local.tempfile.gettempdir", return_value="/cache/tmp"), \
patch.object(LocalEnvironment, "init_session", autospec=True, return_value=None):
env = LocalEnvironment(cwd=".", timeout=10)
assert env.get_temp_dir() == "/cache/tmp"
assert env._snapshot_path == f"/cache/tmp/hermes-snap-{env._session_id}.sh"
assert env._cwd_file == f"/cache/tmp/hermes-cwd-{env._session_id}.txt"
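Reviewer note: the three tests above imply a lookup order for the session temp directory: a backend `env` override first, then the process environment, then `tempfile.gettempdir()`. The sketch below is one reading of that order, not the code in `tools/environments/local.py`. The function name `resolve_temp_dir`, the `/tmp` probe, and the exact variable order are assumptions inferred from the assertions.

```python
import os
import tempfile
from typing import Mapping, Optional

_TEMP_VARS = ("TMPDIR", "TMP", "TEMP")


def resolve_temp_dir(
    backend_env: Optional[Mapping[str, str]] = None,
    os_env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick a temp directory; hypothetical helper for illustration."""
    os_env = os.environ if os_env is None else os_env
    # The backend override wins over the process environment.
    for source in (backend_env or {}, os_env):
        for name in _TEMP_VARS:
            value = source.get(name)
            if value:
                # Drop a trailing slash so joined paths have no "//".
                return value.rstrip("/") or "/"
    # Assumed probe: use /tmp only when it exists and is writable.
    if os.path.isdir("/tmp") and os.access("/tmp", os.W_OK | os.X_OK):
        return "/tmp"
    return tempfile.gettempdir()
```

Stripping the trailing slash mirrors the second test, where an override ending in `hermes-tmp/` is reported as `hermes-tmp`.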
@@ -124,8 +124,8 @@ def _install_modal_test_modules(
sys.modules["tools.interrupt"] = types.SimpleNamespace(is_interrupted=lambda: False)
sys.modules["tools.credential_files"] = types.SimpleNamespace(
get_credential_file_mounts=lambda: [],
-iter_skills_files=lambda: [],
-iter_cache_files=lambda: [],
+iter_skills_files=lambda **kw: [],
+iter_cache_files=lambda **kw: [],
)
from_id_calls: list[str] = []
@@ -135,6 +135,64 @@ class TestReadLog:
assert "5 lines" in result["showing"]
# =========================================================================
# Stdin helpers
# =========================================================================
class TestStdinHelpers:
def test_close_stdin_not_found(self, registry):
result = registry.close_stdin("nonexistent")
assert result["status"] == "not_found"
def test_close_stdin_pipe_mode(self, registry):
proc = MagicMock()
proc.stdin = MagicMock()
s = _make_session()
s.process = proc
registry._running[s.id] = s
result = registry.close_stdin(s.id)
proc.stdin.close.assert_called_once()
assert result["status"] == "ok"
def test_close_stdin_pty_mode(self, registry):
pty = MagicMock()
s = _make_session()
s._pty = pty
registry._running[s.id] = s
result = registry.close_stdin(s.id)
pty.sendeof.assert_called_once()
assert result["status"] == "ok"
def test_close_stdin_allows_eof_driven_process_to_finish(self, registry, tmp_path):
session = registry.spawn_local(
'python3 -c "import sys; print(sys.stdin.read().strip())"',
cwd=str(tmp_path),
use_pty=False,
)
try:
time.sleep(0.5)
assert registry.submit_stdin(session.id, "hello")["status"] == "ok"
assert registry.close_stdin(session.id)["status"] == "ok"
deadline = time.monotonic() + 5
while time.monotonic() < deadline:
poll = registry.poll(session.id)
if poll["status"] == "exited":
assert poll["exit_code"] == 0
assert "hello" in poll["output_preview"]
return
time.sleep(0.2)
pytest.fail("process did not exit after stdin was closed")
finally:
registry.kill_process(session.id)
# =========================================================================
# List sessions
# =========================================================================
@@ -282,6 +340,67 @@ class TestSpawnEnvSanitization:
assert f"{_HERMES_PROVIDER_ENV_FORCE_PREFIX}TELEGRAM_BOT_TOKEN" not in env
assert env["PYTHONUNBUFFERED"] == "1"
def test_spawn_via_env_uses_backend_temp_dir_for_artifacts(self, registry):
class FakeEnv:
def __init__(self):
self.commands = []
def get_temp_dir(self):
return "/data/data/com.termux/files/usr/tmp"
def execute(self, command, timeout=None):
self.commands.append((command, timeout))
return {"output": "4321\n"}
env = FakeEnv()
fake_thread = MagicMock()
with patch("tools.process_registry.threading.Thread", return_value=fake_thread), \
patch.object(registry, "_write_checkpoint"):
session = registry.spawn_via_env(env, "echo hello")
bg_command = env.commands[0][0]
assert session.pid == 4321
assert "/data/data/com.termux/files/usr/tmp/hermes_bg_" in bg_command
assert ".exit" in bg_command
assert "rc=$?;" in bg_command
assert " > /tmp/hermes_bg_" not in bg_command
assert "cat /tmp/hermes_bg_" not in bg_command
fake_thread.start.assert_called_once()
def test_env_poller_quotes_temp_paths_with_spaces(self, registry):
session = _make_session(sid="proc_space")
session.exited = False
class FakeEnv:
def __init__(self):
self.commands = []
self._responses = iter([
{"output": "hello\n"},
{"output": "1\n"},
{"output": "0\n"},
])
def execute(self, command, timeout=None):
self.commands.append((command, timeout))
return next(self._responses)
env = FakeEnv()
with patch("tools.process_registry.time.sleep", return_value=None), \
patch.object(registry, "_move_to_finished"):
registry._env_poller_loop(
session,
env,
"/path with spaces/hermes_bg.log",
"/path with spaces/hermes_bg.pid",
"/path with spaces/hermes_bg.exit",
)
assert env.commands[0][0] == "cat '/path with spaces/hermes_bg.log' 2>/dev/null"
assert env.commands[1][0] == "kill -0 \"$(cat '/path with spaces/hermes_bg.pid' 2>/dev/null)\" 2>/dev/null; echo $?"
assert env.commands[2][0] == "cat '/path with spaces/hermes_bg.exit' 2>/dev/null"
# =========================================================================
# Checkpoint
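Reviewer note: `test_env_poller_quotes_temp_paths_with_spaces` pins the exact shell strings the poller sends. One way to produce them is `shlex.quote` from the standard library, as sketched below. Whether `_env_poller_loop` actually uses `shlex.quote` is an assumption, and `poller_commands` is a hypothetical helper name.

```python
import shlex


def poller_commands(log_path: str, pid_path: str, exit_path: str) -> list[str]:
    """Build the three polling commands with shell-safe paths."""
    q_log = shlex.quote(log_path)
    q_pid = shlex.quote(pid_path)
    q_exit = shlex.quote(exit_path)
    return [
        # Read the output captured so far.
        f"cat {q_log} 2>/dev/null",
        # kill -0 only checks that the PID is alive; echo reports the result.
        f'kill -0 "$(cat {q_pid} 2>/dev/null)" 2>/dev/null; echo $?',
        # Read the recorded exit code once the process is gone.
        f"cat {q_exit} 2>/dev/null",
    ]
```

`shlex.quote` leaves simple paths unquoted and wraps anything with spaces or shell metacharacters in single quotes, so a Termux-style temp directory and a path with spaces go through the same code path.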
@@ -121,6 +121,10 @@ class TestSSHPreflight:
called["count"] += 1
monkeypatch.setattr(ssh_env.SSHEnvironment, "_establish_connection", _fake_establish)
monkeypatch.setattr(ssh_env.SSHEnvironment, "_detect_remote_home", lambda self: "/home/alice")
monkeypatch.setattr(ssh_env.SSHEnvironment, "_ensure_remote_dirs", lambda self: None)
monkeypatch.setattr(ssh_env.SSHEnvironment, "init_session", lambda self: None)
monkeypatch.setattr(ssh_env, "FileSyncManager", lambda **kw: type("M", (), {"sync": lambda self, **k: None})())
env = ssh_env.SSHEnvironment(host="example.com", user="alice")
@@ -0,0 +1,187 @@
"""Tests for foreground timeout cap in terminal_tool.
Ensures that foreground commands with timeout > FOREGROUND_MAX_TIMEOUT
are rejected with an error suggesting background=true.
"""
import json
import os
from unittest.mock import patch, MagicMock
# ---------------------------------------------------------------------------
# Shared test config dict — mirrors _get_env_config() return shape.
# ---------------------------------------------------------------------------
def _make_env_config(**overrides):
    """Return a minimal _get_env_config()-shaped dict with optional overrides."""
    config = {
        "env_type": "local",
        "timeout": 180,
        "cwd": "/tmp",
        "host_cwd": None,
        "modal_mode": "auto",
        "docker_image": "",
        "singularity_image": "",
        "modal_image": "",
        "daytona_image": "",
    }
    config.update(overrides)
    return config
class TestForegroundTimeoutCap:
"""FOREGROUND_MAX_TIMEOUT rejects foreground commands that exceed it."""
def test_foreground_timeout_rejected_above_max(self):
"""When model requests timeout > FOREGROUND_MAX_TIMEOUT, return error."""
from tools.terminal_tool import terminal_tool, FOREGROUND_MAX_TIMEOUT
with patch("tools.terminal_tool._get_env_config", return_value=_make_env_config()), \
patch("tools.terminal_tool._start_cleanup_thread"):
result = json.loads(terminal_tool(
command="echo hello",
timeout=9999, # Way above max
))
assert "error" in result
assert "9999" in result["error"]
assert str(FOREGROUND_MAX_TIMEOUT) in result["error"]
assert "background=true" in result["error"]
def test_foreground_timeout_within_max_executes(self):
"""When model requests timeout <= FOREGROUND_MAX_TIMEOUT, execute normally."""
from tools.terminal_tool import terminal_tool
with patch("tools.terminal_tool._get_env_config", return_value=_make_env_config()), \
patch("tools.terminal_tool._start_cleanup_thread"):
mock_env = MagicMock()
mock_env.execute.return_value = {"output": "done", "returncode": 0}
with patch("tools.terminal_tool._active_environments", {"default": mock_env}), \
patch("tools.terminal_tool._last_activity", {"default": 0}), \
patch("tools.terminal_tool._check_all_guards", return_value={"approved": True}):
result = json.loads(terminal_tool(
command="echo hello",
timeout=300, # Within max
))
call_kwargs = mock_env.execute.call_args
assert call_kwargs[1]["timeout"] == 300
assert "error" not in result or result["error"] is None
def test_config_default_above_cap_not_rejected(self):
"""When config default timeout > cap but model passes no timeout, execute normally.
Only the model's explicit timeout parameter triggers rejection,
not the user's configured default.
"""
from tools.terminal_tool import terminal_tool
# User configured TERMINAL_TIMEOUT=900 in their env
with patch("tools.terminal_tool._get_env_config",
return_value=_make_env_config(timeout=900)), \
patch("tools.terminal_tool._start_cleanup_thread"):
mock_env = MagicMock()
mock_env.execute.return_value = {"output": "done", "returncode": 0}
with patch("tools.terminal_tool._active_environments", {"default": mock_env}), \
patch("tools.terminal_tool._last_activity", {"default": 0}), \
patch("tools.terminal_tool._check_all_guards", return_value={"approved": True}):
result = json.loads(terminal_tool(command="make build"))
# Should execute with the config default, NOT be rejected
call_kwargs = mock_env.execute.call_args
assert call_kwargs[1]["timeout"] == 900
assert "error" not in result or result["error"] is None
def test_background_not_rejected(self):
"""Background commands should NOT be subject to foreground timeout cap."""
from tools.terminal_tool import terminal_tool
with patch("tools.terminal_tool._get_env_config", return_value=_make_env_config()), \
patch("tools.terminal_tool._start_cleanup_thread"):
mock_env = MagicMock()
mock_env.env = {}
mock_proc_session = MagicMock()
mock_proc_session.id = "test-123"
mock_proc_session.pid = 1234
mock_registry = MagicMock()
mock_registry.spawn_local.return_value = mock_proc_session
with patch("tools.terminal_tool._active_environments", {"default": mock_env}), \
patch("tools.terminal_tool._last_activity", {"default": 0}), \
patch("tools.terminal_tool._check_all_guards", return_value={"approved": True}), \
patch("tools.process_registry.process_registry", mock_registry), \
patch("tools.approval.get_current_session_key", return_value=""):
result = json.loads(terminal_tool(
command="python server.py",
background=True,
timeout=9999,
))
# Background should NOT be rejected
assert "error" not in result or result["error"] is None
def test_default_timeout_not_rejected(self):
"""Default timeout (180s) should not trigger rejection."""
from tools.terminal_tool import terminal_tool, FOREGROUND_MAX_TIMEOUT
# 180 < 600, so no rejection
assert 180 < FOREGROUND_MAX_TIMEOUT
with patch("tools.terminal_tool._get_env_config", return_value=_make_env_config()), \
patch("tools.terminal_tool._start_cleanup_thread"):
mock_env = MagicMock()
mock_env.execute.return_value = {"output": "done", "returncode": 0}
with patch("tools.terminal_tool._active_environments", {"default": mock_env}), \
patch("tools.terminal_tool._last_activity", {"default": 0}), \
patch("tools.terminal_tool._check_all_guards", return_value={"approved": True}):
result = json.loads(terminal_tool(command="echo hello"))
call_kwargs = mock_env.execute.call_args
assert call_kwargs[1]["timeout"] == 180
assert "error" not in result or result["error"] is None
def test_exactly_at_max_not_rejected(self):
"""Timeout exactly at FOREGROUND_MAX_TIMEOUT should execute normally."""
from tools.terminal_tool import terminal_tool, FOREGROUND_MAX_TIMEOUT
with patch("tools.terminal_tool._get_env_config", return_value=_make_env_config()), \
patch("tools.terminal_tool._start_cleanup_thread"):
mock_env = MagicMock()
mock_env.execute.return_value = {"output": "done", "returncode": 0}
with patch("tools.terminal_tool._active_environments", {"default": mock_env}), \
patch("tools.terminal_tool._last_activity", {"default": 0}), \
patch("tools.terminal_tool._check_all_guards", return_value={"approved": True}):
result = json.loads(terminal_tool(
command="echo hello",
timeout=FOREGROUND_MAX_TIMEOUT, # Exactly at limit
))
call_kwargs = mock_env.execute.call_args
assert call_kwargs[1]["timeout"] == FOREGROUND_MAX_TIMEOUT
assert "error" not in result or result["error"] is None
class TestForegroundMaxTimeoutConstant:
"""Verify the FOREGROUND_MAX_TIMEOUT constant and schema."""
def test_default_value_is_600(self):
"""Default FOREGROUND_MAX_TIMEOUT is 600 when env var is not set."""
from tools.terminal_tool import FOREGROUND_MAX_TIMEOUT
assert FOREGROUND_MAX_TIMEOUT == 600
def test_schema_mentions_max(self):
"""Tool schema description should mention the max timeout."""
from tools.terminal_tool import TERMINAL_SCHEMA, FOREGROUND_MAX_TIMEOUT
timeout_desc = TERMINAL_SCHEMA["parameters"]["properties"]["timeout"]["description"]
assert str(FOREGROUND_MAX_TIMEOUT) in timeout_desc
assert "background=true" in timeout_desc
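Reviewer note: taken together, the tests above describe a small decision table: only an explicit foreground timeout above the cap is rejected, while the configured default and background commands pass through. The sketch below restates that table as a pure function. It is not the code in `tools/terminal_tool.py`; `resolve_timeout`, its return shape, and the error wording are assumptions. Only the value 600 and the `background=true` hint come from the tests.

```python
from typing import Optional

FOREGROUND_MAX_TIMEOUT = 600  # default value asserted by the tests above


def resolve_timeout(
    requested: Optional[int],
    config_default: int,
    background: bool = False,
    max_timeout: int = FOREGROUND_MAX_TIMEOUT,
) -> tuple[Optional[int], Optional[str]]:
    """Return (timeout, error); exactly one of the two is None."""
    if requested is None:
        # No explicit request: the user's configured default is trusted,
        # even when it is above the cap.
        return config_default, None
    if not background and requested > max_timeout:
        return None, (
            f"Foreground timeout {requested}s exceeds the maximum of "
            f"{max_timeout}s; use background=true for long-running commands."
        )
    # At or below the cap, or a background command: pass through unchanged.
    return requested, None
```

Keeping the check separate from command execution makes the boundary case (`requested == max_timeout`) cheap to test without mocking an environment.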
@@ -0,0 +1,91 @@
import json
from types import SimpleNamespace
import tools.terminal_tool as terminal_tool_module
from tools import process_registry as process_registry_module
def _base_config(tmp_path):
return {
"env_type": "local",
"docker_image": "",
"singularity_image": "",
"modal_image": "",
"daytona_image": "",
"cwd": str(tmp_path),
"timeout": 30,
}
def test_command_requires_pipe_stdin_detects_gh_with_token():
assert terminal_tool_module._command_requires_pipe_stdin(
"gh auth login --hostname github.com --git-protocol https --with-token"
) is True
assert terminal_tool_module._command_requires_pipe_stdin(
"gh auth login --web"
) is False
def test_terminal_background_disables_pty_for_gh_with_token(monkeypatch, tmp_path):
config = _base_config(tmp_path)
dummy_env = SimpleNamespace(env={})
captured = {}
def fake_spawn_local(**kwargs):
captured.update(kwargs)
return SimpleNamespace(id="proc_test", pid=1234, notify_on_complete=False)
monkeypatch.setattr(terminal_tool_module, "_get_env_config", lambda: config)
monkeypatch.setattr(terminal_tool_module, "_start_cleanup_thread", lambda: None)
monkeypatch.setattr(terminal_tool_module, "_check_all_guards", lambda *_args, **_kwargs: {"approved": True})
monkeypatch.setattr(process_registry_module.process_registry, "spawn_local", fake_spawn_local)
monkeypatch.setitem(terminal_tool_module._active_environments, "default", dummy_env)
monkeypatch.setitem(terminal_tool_module._last_activity, "default", 0.0)
try:
result = json.loads(
terminal_tool_module.terminal_tool(
command="gh auth login --hostname github.com --git-protocol https --with-token",
background=True,
pty=True,
)
)
finally:
terminal_tool_module._active_environments.pop("default", None)
terminal_tool_module._last_activity.pop("default", None)
assert captured["use_pty"] is False
assert result["session_id"] == "proc_test"
assert "PTY disabled" in result["pty_note"]
def test_terminal_background_keeps_pty_for_regular_interactive_commands(monkeypatch, tmp_path):
config = _base_config(tmp_path)
dummy_env = SimpleNamespace(env={})
captured = {}
def fake_spawn_local(**kwargs):
captured.update(kwargs)
return SimpleNamespace(id="proc_test", pid=1234, notify_on_complete=False)
monkeypatch.setattr(terminal_tool_module, "_get_env_config", lambda: config)
monkeypatch.setattr(terminal_tool_module, "_start_cleanup_thread", lambda: None)
monkeypatch.setattr(terminal_tool_module, "_check_all_guards", lambda *_args, **_kwargs: {"approved": True})
monkeypatch.setattr(process_registry_module.process_registry, "spawn_local", fake_spawn_local)
monkeypatch.setitem(terminal_tool_module._active_environments, "default", dummy_env)
monkeypatch.setitem(terminal_tool_module._last_activity, "default", 0.0)
try:
result = json.loads(
terminal_tool_module.terminal_tool(
command="python3 -c \"print(input())\"",
background=True,
pty=True,
)
)
finally:
terminal_tool_module._active_environments.pop("default", None)
terminal_tool_module._last_activity.pop("default", None)
assert captured["use_pty"] is True
assert "pty_note" not in result
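Reviewer note: the tests above only fix two data points for `_command_requires_pipe_stdin`: `gh auth login ... --with-token` needs a pipe, `gh auth login --web` does not. The sketch below is one detector consistent with those points. The real helper may match more commands or use a different strategy, and `requires_pipe_stdin` is a hypothetical name.

```python
import shlex


def requires_pipe_stdin(command: str) -> bool:
    """Guess whether *command* reads a secret from piped stdin."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: do not guess, keep the caller's PTY choice.
        return False
    # `gh auth login --with-token` reads the token from stdin until EOF,
    # which a PTY session does not deliver the same way a closed pipe does.
    return tokens[:3] == ["gh", "auth", "login"] and "--with-token" in tokens
```

Tokenising with `shlex.split` instead of substring matching avoids false positives when `--with-token` appears inside a quoted argument to some other program.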
@@ -16,6 +16,7 @@ from tools.tool_result_storage import (
STORAGE_DIR,
_build_persisted_message,
_heredoc_marker,
_resolve_storage_dir,
_write_to_sandbox,
enforce_turn_budget,
generate_preview,
@@ -115,6 +116,24 @@ class TestWriteToSandbox:
_write_to_sandbox("content", "/tmp/hermes-results/abc.txt", env)
assert env.execute.call_args[1]["timeout"] == 30
def test_uses_parent_dir_of_remote_path(self):
env = MagicMock()
env.execute.return_value = {"output": "", "returncode": 0}
remote_path = "/data/data/com.termux/files/usr/tmp/hermes-results/abc.txt"
_write_to_sandbox("content", remote_path, env)
cmd = env.execute.call_args[0][0]
assert "mkdir -p /data/data/com.termux/files/usr/tmp/hermes-results" in cmd
class TestResolveStorageDir:
def test_defaults_to_storage_dir_without_env(self):
assert _resolve_storage_dir(None) == STORAGE_DIR
def test_uses_env_temp_dir_when_available(self):
env = MagicMock()
env.get_temp_dir.return_value = "/data/data/com.termux/files/usr/tmp"
assert _resolve_storage_dir(env) == "/data/data/com.termux/files/usr/tmp/hermes-results"
# ── _build_persisted_message ──────────────────────────────────────────
@@ -341,6 +360,22 @@ class TestMaybePersistToolResult:
)
assert "DISTINCTIVE_START_MARKER" in result
def test_env_temp_dir_changes_persisted_path(self):
env = MagicMock()
env.execute.return_value = {"output": "", "returncode": 0}
env.get_temp_dir.return_value = "/data/data/com.termux/files/usr/tmp"
content = "x" * 60_000
result = maybe_persist_tool_result(
content=content,
tool_name="terminal",
tool_use_id="tc_termux",
env=env,
threshold=30_000,
)
assert "/data/data/com.termux/files/usr/tmp/hermes-results/tc_termux.txt" in result
cmd = env.execute.call_args[0][0]
assert "mkdir -p /data/data/com.termux/files/usr/tmp/hermes-results" in cmd
def test_threshold_zero_forces_persist(self):
env = MagicMock()
env.execute.return_value = {"output": "", "returncode": 0}
