Compare commits

...

78 Commits

Author SHA1 Message Date
Test 205891c9c8 fix: support Anthropic-compatible endpoints for third-party providers
Three bugs prevented providers like MiniMax from using their
Anthropic-compatible endpoints (e.g. api.minimax.io/anthropic):

1. _VALID_API_MODES was missing 'anthropic_messages', so explicit
   api_mode config was silently rejected and defaulted to
   chat_completions.

2. API-key provider resolution hardcoded api_mode to 'chat_completions'
   without checking model config or detecting Anthropic-compatible URLs.

3. run_agent.py auto-detection only recognized api.anthropic.com, not
   third-party endpoints using the /anthropic URL convention.

Fixes:
- Add 'anthropic_messages' to _VALID_API_MODES
- API-key providers now check model config api_mode and auto-detect
  URLs ending in /anthropic
- run_agent.py and fallback logic detect /anthropic URL convention
- 5 new tests covering all scenarios

Users can now either:
- Set MINIMAX_BASE_URL=https://api.minimax.io/anthropic (auto-detected)
- Set api_mode: anthropic_messages in model config (explicit)
- Use custom_providers with api_mode: anthropic_messages
2026-03-18 16:07:32 -07:00
Teknium f24db23458 fix: custom provider uses config base_url and api_key over env vars (#1760) (#1994)
When provider: custom is set in config.yaml with base_url and api_key,
those values are now used instead of falling back to OPENAI_BASE_URL and
OPENAI_API_KEY env vars. Also reads the 'api' field as an alternative to
'api_key' for config compatibility.

Cherry-picked from PR #1762 by crazywriter1.

Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>
2026-03-18 16:00:14 -07:00
Teknium d132e344d7 fix(agent): prevent silent tool result loss during context compression (#1993)
_align_boundary_backward only checked messages[idx-1] to decide if
the compress-end boundary splits a tool_call/result group. When an
assistant issues 3+ parallel tool calls, their results span multiple
consecutive messages. If the boundary fell in the middle of that group,
the parent assistant was summarized away and orphaned tool results were
silently deleted by _sanitize_tool_pairs.

Now walks backward through all consecutive tool results to find the
parent assistant, then pulls the boundary before the entire group.

6 regression tests added in tests/test_compression_boundary.py.

Co-authored-by: Guts <Gutslabs@users.noreply.github.com>
2026-03-18 15:22:51 -07:00
Teknium 22f41daded fix: send error details to user in gateway outer exception handler
Previously, if an error occurred during response processing in
_process_message_background (e.g. during extract_media, send, or
any uncaught exception from the handler), the error was only logged
to server console and the user was left with radio silence — typing
indicator stops but no message arrives.

Now the outer except block attempts to send the error type and detail
(truncated to 300 chars) to the user's chat, matching the format
already used by the inner handler in gateway/run.py.

Co-authored-by: Test <test@test.com>
2026-03-18 10:42:43 -07:00
Teknium 7c7feaa033 Merge pull request #1929 from NousResearch/hermes/hermes-b29f73b2
feat: inject model and provider into system prompt
2026-03-18 04:18:41 -07:00
Teknium 2f80bd9f87 fix: whatsapp reply_prefix config.yaml bridging was dead code (#1923)
The whatsapp reply_prefix bridging referenced config.platforms before
the config object was constructed, making it a silent NameError caught
by except Exception: pass.

Fix: fold reply_prefix into the per-platform bridging loop (introduced
in #1919) which correctly writes to gw_data dict pre-construction.
Removes the broken standalone whatsapp bridging block.

Co-authored-by: Test <test@test.com>
2026-03-18 04:18:33 -07:00
Teknium 23e5e8dde9 Merge pull request #1928 from NousResearch/hermes/hermes-ba3c8fa1
chore: trim huggingface-hub skill description
2026-03-18 04:18:27 -07:00
Test e99aca98ab feat: inject model and provider into system prompt
Adds model name and provider to the system prompt metadata block,
alongside the existing session ID and timestamp. These are frozen
at session start and don't change mid-conversation, so they won't
break prompt caching.
2026-03-18 04:18:26 -07:00
Test 7e30e97a59 chore: trim redundant trigger sentence from huggingface-hub description 2026-03-18 04:18:13 -07:00
Teknium db4dfea7ec docs: document SOUL.md as primary agent identity (#1927)
Update all SOUL.md documentation to reflect that it now occupies
slot #1 in the system prompt, replacing the hardcoded default identity.

Updated pages:
- user-guide/features/personality.md — SOUL.md is primary identity, not just a layer
- developer-guide/prompt-assembly.md — updated prompt layer order, context files list
- guides/use-soul-with-hermes.md — SOUL.md replaces built-in identity
- user-guide/configuration.md — updated context files table and directory tree

Co-authored-by: Test <test@test.com>
2026-03-18 04:18:08 -07:00
Teknium 17254a7692 Merge pull request #1926 from NousResearch/hermes/hermes-ba3c8fa1
chore: add search to huggingface-hub skill description
2026-03-18 04:15:17 -07:00
Test adf188c439 chore: add search to huggingface-hub skill description 2026-03-18 04:15:03 -07:00
Teknium 21958a55d1 Merge pull request #1925 from NousResearch/hermes/hermes-ba3c8fa1
chore: tighten huggingface-hub skill description
2026-03-18 04:11:43 -07:00
Test 947827bba0 chore: tighten huggingface-hub skill description 2026-03-18 04:11:33 -07:00
Teknium e4a3ffa9c1 feat: use SOUL.md as primary agent identity instead of hardcoded default (#1922)
SOUL.md now loads in slot #1 of the system prompt, replacing the
hardcoded DEFAULT_AGENT_IDENTITY. This lets users fully customize
the agent's identity and personality by editing ~/.hermes/SOUL.md
without it conflicting with the built-in identity text.

When SOUL.md is loaded as identity, it's excluded from the context
files section to avoid appearing twice. When SOUL.md is missing,
empty, unreadable, or skip_context_files is set, the hardcoded
DEFAULT_AGENT_IDENTITY is used as a fallback.

The default SOUL.md (seeded on first run) already contains the full
Hermes personality, so existing installs are unaffected.

Co-authored-by: Test <test@test.com>
2026-03-18 04:11:20 -07:00
Teknium 1fa3737134 feat: GitHub Copilot provider integration (#1924)
feat: GitHub Copilot provider integration with OAuth auth, API routing, and docs
2026-03-18 04:09:30 -07:00
Test e7844e9c8d Merge origin/main, resolve conflicts (self._base_url_lower) 2026-03-18 04:09:00 -07:00
Teknium 1c761ae042 feat: add huggingface-hub bundled skill (#1921)
feat: add huggingface-hub bundled skill
2026-03-18 04:08:00 -07:00
Test 56ca84f243 feat: add huggingface-hub bundled skill
Adds the Hugging Face CLI (hf) reference as a built-in skill under
mlops/. Covers downloading/uploading models and datasets, repo
management, SQL queries on datasets, inference endpoints, Spaces,
buckets, and more.

Based on the official HF skill from huggingface/skills.
2026-03-18 04:07:41 -07:00
Test 04101bc59e docs: comprehensive GitHub Copilot provider documentation
- Add dedicated GitHub Copilot section in configuration guide with:
  - Auth options (OAuth device code, env vars, gh CLI)
  - Token type table (supported vs unsupported)
  - API routing explanation (GPT-5+ → Responses, others → Chat)
  - Copilot ACP setup instructions
  - Environment variable reference
- Add all Copilot env vars to environment-variables.md:
  COPILOT_GITHUB_TOKEN, HERMES_COPILOT_ACP_COMMAND, etc.
- Add copilot-acp to --provider list in cli-commands.md
- Docs build verified
2026-03-18 04:07:34 -07:00
Teknium 0a247a50f2 feat: support ignoring unauthorized gateway DMs (#1919)
Add unauthorized_dm_behavior config (pair|ignore) with global default
and per-platform override. WhatsApp can silently drop unknown DMs
instead of sending pairing codes.

Adapted config bridging to work with gw_data dict (pre-construction)
rather than config object. Dropped implementation plan document.

Co-authored-by: Frederico Ribeiro <fr@tecompanytea.com>
2026-03-18 04:06:08 -07:00
Teknium 0e2714acea fix(cron): recover recent one-shot jobs (#1918)
Co-authored-by: Frederico Ribeiro <fr@tecompanytea.com>
2026-03-18 04:06:02 -07:00
Test 36921a3e98 fix: correct Copilot API mode selection to match opencode
The previous copilot_model_api_mode() checked the catalog's
supported_endpoints first and picked /chat/completions when a model
supported both endpoints. This is wrong — GPT-5+ models should use
the Responses API even when the catalog lists both.

Replicate opencode's shouldUseCopilotResponsesApi() logic:
- GPT-5+ models (gpt-5.4, gpt-5.3-codex, etc.) → Responses API
- gpt-5-mini → Chat Completions (explicit exception)
- Everything else (gpt-4o, claude, gemini, etc.) → Chat Completions
- Model ID pattern is the primary signal, catalog is secondary

The catalog fallback now only matters for non-GPT-5 models that might
exclusively support /v1/messages (e.g. Claude via Copilot).

Models are auto-detected from the live catalog at
api.githubcopilot.com/models — no hardcoded list required for
supported models, only a static fallback for when the API is
unreachable.
2026-03-18 03:54:50 -07:00
Teknium c1a127c87c Merge pull request #1917 from NousResearch/hermes/hermes-b29f73b2
feat(cli): add /statusbar command to toggle context bar
2026-03-18 03:50:05 -07:00
Test c1750bb32d feat(cli): add /statusbar command to toggle context bar
Adds /statusbar (alias /sb) to show/hide the bottom status bar that
displays model name, context usage, and session duration.

Uses ConditionalContainer so the bar takes zero space when hidden
rather than leaving a blank line.
2026-03-18 03:49:49 -07:00
Teknium 4699c226da chore: reorder OpenRouter model catalog (#1916)
chore: reorder OpenRouter model catalog
2026-03-18 03:31:19 -07:00
Test b05f9b6256 chore: reorder OpenRouter catalog — glm-5-turbo under glm-5, minimax under stepfun 2026-03-18 03:31:04 -07:00
Teknium 0679712d26 feat: reorder OpenRouter catalog, add haiku-4.5, fix minimax slug (#1915)
feat: reorder OpenRouter catalog, add haiku-4.5, fix minimax slug
2026-03-18 03:26:22 -07:00
Test cb54750e07 feat: reorder OpenRouter catalog, add haiku-4.5, fix minimax slug
- Add anthropic/claude-haiku-4.5
- Move gpt-5.4-pro and gpt-5.4-nano to bottom
- Fix minimax/minimax-m2.7 → minimax-m2.5 (m2.7 not on OpenRouter)
- Tag hunter-alpha and healer-alpha as free
- Place hunter/healer-alpha right below gpt-5.4-mini
2026-03-18 03:26:06 -07:00
Test 21c45ba0ac feat: proper Copilot auth with OAuth device code flow and token validation
Builds on PR #1879's Copilot integration with critical auth improvements
modeled after opencode's implementation:

- Add hermes_cli/copilot_auth.py with:
  - OAuth device code flow (copilot_device_code_login) using the same
    client_id (Ov23li8tweQw6odWQebz) as opencode and Copilot CLI
  - Token type validation: reject classic PATs (ghp_*) with a clear
    error message explaining supported token types
  - Proper env var priority: COPILOT_GITHUB_TOKEN > GH_TOKEN > GITHUB_TOKEN
    (matching Copilot CLI documentation)
  - copilot_request_headers() with Openai-Intent, x-initiator, and
    Copilot-Vision-Request headers (matching opencode)

- Update auth.py:
  - PROVIDER_REGISTRY copilot entry uses correct env var order
  - _resolve_api_key_provider_secret delegates to copilot_auth for
    the copilot provider with proper token validation

- Update models.py:
  - copilot_default_headers() now includes Openai-Intent and x-initiator

- Update main.py:
  - _model_flow_copilot offers OAuth device code login when no token
    is found, with manual token entry as fallback
  - Shows supported vs unsupported token types

- 22 new tests covering token validation, env var priority, header
  generation, and integration with existing auth infrastructure
2026-03-18 03:25:58 -07:00
Teknium c0c14e60b4 fix: make concurrent tool batching path-aware for file mutations (#1914)
* Improve tool batching independence checks

* fix: address review feedback on path-aware batching

- Log malformed/non-dict tool arguments at debug level before
  falling back to sequential, instead of silently swallowing
  the error into an empty dict
- Guard empty paths in _paths_overlap (unreachable in practice
  due to upstream filtering, but makes the invariant explicit)
- Add tests: malformed JSON args, non-dict args, _paths_overlap
  unit tests including empty path edge cases
- web_crawl is not a registered tool (only web_search/web_extract
  are); no addition needed to _PARALLEL_SAFE_TOOLS

---------

Co-authored-by: kshitij <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-18 03:25:38 -07:00
Teknium 050b43108c feat: add gpt-5.4-mini, gpt-5.4-nano, healer-alpha to OpenRouter catalog (#1913)
feat: add gpt-5.4-mini, gpt-5.4-nano, healer-alpha to OpenRouter catalog
2026-03-18 03:23:36 -07:00
Test 00cc0c6a28 feat: add gpt-5.4-mini, gpt-5.4-nano, healer-alpha to OpenRouter catalog 2026-03-18 03:23:20 -07:00
Teknium bee13d9921 Merge pull request #1912 from NousResearch/hermes/hermes-b29f73b2
fix(banner): normalize toolset labels and use skin colors
2026-03-18 03:23:15 -07:00
Test f814787144 fix(banner): normalize toolset labels and use skin colors
- Strip '_tools' suffix from internal toolset identifiers in the banner
  (e.g. 'web_tools' -> 'web', 'homeassistant_tools' -> 'homeassistant')
- Stop appending '_tools' to unavailable toolset names
- Replace 6 hardcoded hex colors (#B8860B, #FFBF00, #FFF8DC) in toolset
  rows, overflow line, and MCP server rows with the skin variables
  (dim, accent, text) already resolved at the top of the function

Inspired by PR #1871 by @kshitijk4poor.
Adds 4 tests.
2026-03-18 03:22:58 -07:00
Teknium c9bb0c587f fix: direct user message on STT failure + hermes-agent-setup skill (#1905)
fix: direct user message on STT failure + hermes-agent-setup skill
2026-03-18 03:21:12 -07:00
Test 8422196e89 Merge PR #1879: feat: integrate GitHub Copilot providers 2026-03-18 03:18:33 -07:00
Teknium b70dd51cfa fix: disabled skills respected across banner, system prompt, slash commands, and skill_view (#1897)
* fix: banner skill count now respects disabled skills and platform filtering

The banner's get_available_skills() was doing a raw rglob scan of
~/.hermes/skills/ without checking:
- Whether skills are disabled (skills.disabled config)
- Whether skills match the current platform (platforms: frontmatter)

This caused the banner to show inflated skill counts (e.g. '100 skills'
when many are disabled) and list macOS-only skills on Linux.

Fix: delegate to _find_all_skills() from tools/skills_tool which already
handles both platform gating and disabled-skill filtering.

* fix: system prompt and slash commands now respect disabled skills

Two more places where disabled skills were still surfaced:

1. build_skills_system_prompt() in prompt_builder.py — disabled skills
   appeared in the <available_skills> system prompt section, causing
   the agent to suggest/load them despite being disabled.

2. scan_skill_commands() in skill_commands.py — disabled skills still
   registered as /skill-name slash commands in CLI help and could be
   invoked.

Both now load _get_disabled_skill_names() and filter accordingly.

* fix: skill_view blocks disabled skills

skill_view() checked platform compatibility but not disabled state,
so the agent could still load and read disabled skills directly.

Now returns a clear error when a disabled skill is requested, telling
the user to enable it via hermes skills or inspect the files manually.

---------

Co-authored-by: Test <test@test.com>
2026-03-18 03:17:37 -07:00
Test 190c07975d fix: check skill availability before hinting at hermes-agent-setup
Only mention the hermes-agent-setup skill in STT failure notes (both
the direct user message and the agent context note) when the skill is
actually installed. Uses _find_skill() from skill_manager_tool.

Also confirmed: STT is the only user-facing failure case where the
setup skill hint helps. Vision failures are transient API issues,
runtime transcription errors indicate a configured-but-broken provider,
and platform startup warnings are server logs.
2026-03-18 03:17:23 -07:00
Teknium 011ed540dd Merge pull request #1909 from NousResearch/hermes/hermes-b29f73b2
docs: fix MCP install commands — use uv, not bare pip
2026-03-18 03:15:15 -07:00
Test a9c405fac9 docs: fix MCP install commands — use uv, not bare pip
The standard install already includes MCP via .[all]. For users who
need to add it separately, the correct command is:
  cd ~/.hermes/hermes-agent && uv pip install -e ".[mcp]"

The venv is created by uv, so bare 'pip' isn't available. All four
occurrences across 3 docs pages updated.
2026-03-18 03:14:58 -07:00
Teknium 9c174e0940 Merge pull request #1908 from NousResearch/hermes/hermes-b29f73b2
fix(gateway): detect script-style gateway processes for --replace
2026-03-18 03:13:21 -07:00
TheSameCat2 5c4c4b8b7d fix(gateway): detect script-style gateway processes for --replace
Recognize hermes_cli/main.py gateway command lines in gateway
process detection and PID validation so --replace reliably finds
existing gateway instances.

Adds a regression test covering script-style cmdline detection.

Closes #1830
2026-03-18 03:12:59 -07:00
Test 764825bbff feat: expand hermes-agent-setup skill + tell agent about it in STT notes
Skill now covers full CLI usage (hermes setup, hermes skills, hermes
tools, hermes config, session management, etc.), config file reference,
and expanded gateway commands.

Agent context notes for STT failure now mention the hermes-agent-setup
skill is available to help users configure Hermes features.
2026-03-18 03:05:17 -07:00
Teknium ee4cc8ee3b Merge pull request #1907 from NousResearch/hermes/hermes-b29f73b2
feat(mcp): expose MCP servers as standalone toolsets
2026-03-18 03:04:34 -07:00
Test 4b53b89f09 feat(mcp): expose MCP servers as standalone toolsets
Each configured MCP server now registers as its own toolset in TOOLSETS
(e.g. TOOLSETS['github'] = {tools: ['mcp_github_list_files', ...]}),
making raw server names resolvable in platform_toolsets overrides.

Previously MCP tools were only injected into hermes-* umbrella toolsets,
so gateway sessions using raw toolset names like ['terminal', 'github']
in platform_toolsets couldn't resolve MCP tools.

Skips server names that collide with built-in toolsets. Also handles
idempotent reloads (syncs toolsets even when no new servers connect).

Inspired by PR #1876 by @kshitijk4poor.
Adds 2 tests (standalone toolset creation + built-in collision guard).
2026-03-18 03:04:17 -07:00
Teknium a2440f72f6 feat: use endpoint metadata for custom model context and pricing (#1906)
* perf: cache base_url.lower() via property, consolidate triple load_config(), hoist set constant

run_agent.py:
- Add base_url property that auto-caches _base_url_lower on every
  assignment, eliminating 12+ redundant .lower() calls per API cycle
  across __init__, _build_api_kwargs, _supports_reasoning_extra_body,
  and the main conversation loop
- Consolidate three separate load_config() disk reads in __init__
  (memory, skills, compression) into a single call, reusing the
  result dict for all three config sections

model_tools.py:
- Hoist _READ_SEARCH_TOOLS set to module level (was rebuilt inside
  handle_function_call on every tool invocation)

* Use endpoint metadata for custom model context and pricing

---------

Co-authored-by: kshitij <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-18 03:04:07 -07:00
Test 9c0f346258 fix: direct user message on STT failure + hermes-agent-setup skill
When a user sends a voice message and STT isn't configured, the gateway
now sends a clear message directly to the user explaining how to set up
voice transcription, rather than relying on the agent to relay an
injected context note (which often gets misinterpreted).

Also adds a hermes-agent-setup bundled skill covering STT/TTS setup,
tool configuration, dependency installation, and troubleshooting.
2026-03-18 03:01:41 -07:00
Teknium 11f029c311 fix(tts): document NeuTTS provider and align install guidance (#1903)
Co-authored-by: charles-édouard <59705750+ccbbccbb@users.noreply.github.com>
2026-03-18 02:55:30 -07:00
Teknium fb923d5efc Merge pull request #1902 from NousResearch/hermes/hermes-b29f73b2
fix(gateway): PID-based wait with force-kill for gateway restart
2026-03-18 02:54:38 -07:00
Test ace2cc6257 fix(gateway): PID-based wait with force-kill for gateway restart
Add _wait_for_gateway_exit() that polls get_running_pid() to confirm
the old gateway process has actually exited before starting a new one.
If the process doesn't exit within 5s, sends SIGKILL to the specific
PID. Uses the saved PID from gateway.pid (not launchd labels) so it
works correctly with multiple gateway instances under separate
HERMES_HOME directories.

Applied to both launchd_restart() and the manual restart path (replaces
the blind time.sleep(2)).

Inspired by PR #1881 by @AzothZephyr (race condition diagnosis).
Adds 4 tests.
2026-03-18 02:54:18 -07:00
Teknium 24ac577046 fix: respect model.default from config.yaml for openai-codex provider (#1896)
When config.yaml had a non-default model (e.g. gpt-5.3-codex) and the
provider was openai-codex, _normalize_model_for_provider() would replace
it with the latest available codex model because _model_is_default only
checked the CLI argument, not the config value.

Now _model_is_default is False when config.yaml has a model that differs
from the global fallback (anthropic/claude-opus-4.6), so the user's
explicit config choice is preserved.

Fixes #1887

Co-authored-by: Test <test@test.com>
2026-03-18 02:50:31 -07:00
Teknium e86bfd7667 feat: upgrade MiniMax default to M2.7 + add new OpenRouter models (#1900)
feat: upgrade MiniMax default to M2.7 + add new OpenRouter models
2026-03-18 02:43:19 -07:00
octo-patch e4043633fc feat: upgrade MiniMax default to M2.7 + add new OpenRouter models
MiniMax: Add M2.7 and M2.7-highspeed as new defaults across provider
model lists, auxiliary client, metadata, setup wizard, RL training tool,
fallback tests, and docs. Retain M2.5/M2.1 as alternatives.

OpenRouter: Add grok-4.20-beta, nemotron-3-super-120b-a12b:free,
trinity-large-preview:free, glm-5-turbo, and hunter-alpha to the
model catalog.

MiniMax changes based on PR #1882 by @octo-patch (applied manually
due to stale conflicts in refactored pricing module).
2026-03-18 02:42:58 -07:00
Test a8132d1252 fix: respect model.default from config.yaml for openai-codex provider
When config.yaml had a non-default model (e.g. gpt-5.3-codex) and the
provider was openai-codex, _normalize_model_for_provider() would replace
it with the latest available codex model because _model_is_default only
checked the CLI argument, not the config value.

Now _model_is_default is False when config.yaml has a model that differs
from the global fallback (anthropic/claude-opus-4.6), so the user's
explicit config choice is preserved.

Fixes #1887
2026-03-18 02:24:41 -07:00
Teknium 927f4d3a37 fix(matrix): use correct reply_to_message_id parameter name (#1895)
fix(matrix): use correct reply_to_message_id parameter name
2026-03-18 02:23:38 -07:00
Bartok9 66f71c1836 fix(matrix): use correct reply_to_message_id parameter name
Fixes #1842

The MessageEvent dataclass expects 'reply_to_message_id' but the Matrix
connector was passing 'reply_to'. This caused replies to fail with:

    MessageEvent.__init__() got an unexpected keyword argument 'reply_to'

Changed the parameter name to match the dataclass definition.
2026-03-18 02:23:21 -07:00
Teknium b1069196a6 Merge pull request #1894 from NousResearch/hermes/hermes-b29f73b2
fix(delegate): move _saved_tool_names save/restore to _run_single_child scope
2026-03-18 02:23:14 -07:00
Bartok9 ba7248c669 fix(delegate): move _saved_tool_names save/restore to _run_single_child scope
Fixes #1802

The v0.3.0 refactor split child agent construction (_build_child_agent)
and execution (_run_single_child) into separate functions. This created
a scope bug where _saved_tool_names was defined in _build_child_agent
but referenced in _run_single_child's finally block, causing a NameError
on every delegate_task call.

Solution: Move the save/restore logic entirely into _run_single_child,
keeping the save and restore in the same scope as the try/finally block.
This is cleaner than passing the variable through and removes the dead
save from _build_child_agent.
2026-03-18 02:22:46 -07:00
Teknium 6fc4e36625 fix: search all sources by default in session_search (#1892)
* fix: include ACP sessions in default search sources

* fix: remove hardcoded source allowlist from session search

The default source_filter was a hardcoded list that silently excluded
any platform not explicitly listed. Instead of maintaining an ever-growing
allowlist, remove it entirely so all sources are searched by default.
Callers can still pass source_filter explicitly to narrow results.

Follow-up to cherry-picked PR #1817.

---------

Co-authored-by: someoneexistsontheinternet <154079416+someoneexistsontheinternet@users.noreply.github.com>
Co-authored-by: Test <test@test.com>
2026-03-18 02:21:29 -07:00
Teknium 7d7c2a62dd Merge pull request #1890 from NousResearch/hermes/hermes-b29f73b2
fix: OAuth flag stale after refresh/fallback, memory nudge never fires, dead code
2026-03-18 02:20:19 -07:00
Test 5b74df2bfc fix: OAuth flag stale after refresh/fallback, memory nudge never fires, dead code
- Update _is_anthropic_oauth in _try_refresh_anthropic_client_credentials()
  when token type changes during credential refresh
- Set _is_anthropic_oauth in _try_activate_fallback() Anthropic path
- Move _turns_since_memory and _iters_since_skill init to __init__ so
  nudge counters accumulate across run_conversation() calls in CLI mode
- Remove unreachable retry_count >= max_retries block after raise

Adds 7 regression tests. Salvaged from PR #1797 by @0xbyt4.
2026-03-18 02:19:57 -07:00
max 0c392e7a87 feat: integrate GitHub Copilot providers across Hermes
Add first-class GitHub Copilot and Copilot ACP provider support across
model selection, runtime provider resolution, CLI sessions, delegated
subagents, cron jobs, and the Telegram gateway.

This also normalizes Copilot model catalogs and API modes, introduces a
Copilot ACP OpenAI-compatible shim, and fixes service-mode auth by
resolving Homebrew-installed gh binaries under launchd.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-17 23:40:22 -07:00
Teknium f656dfcb32 Merge pull request #1840 from NousResearch/hermes/hermes-b29f73b2
fix: allow agent-created skills with caution-level findings
2026-03-17 16:33:04 -07:00
Test 0fab46f65c fix: allow agent-created skills with caution-level findings
Agent-created skills were using the same policy as community hub
installs, blocking any skill with medium/high severity findings
(e.g. docker pull, pip install, git clone). This meant the agent
couldn't create skills that reference Docker or other common tools.

Changed agent-created policy from (allow, block, block) to
(allow, allow, block) — matching the trusted policy. Caution-level
findings (medium/high severity) are now allowed through, while
dangerous findings (critical severity like exfiltration, prompt
injection, reverse shells) remain blocked.

Added 4 tests covering the agent-created policy: safe allowed,
caution allowed, dangerous blocked, force override.
2026-03-17 16:32:25 -07:00
Teknium 37dceb043e fix: improve gateway error handling for 429 usage limits and 500 context overflow (#1839)
fix: improve gateway error handling for 429 usage limits and 500 context overflow
2026-03-17 16:32:20 -07:00
silentconsensus 7ce374d3b9 Improve gateway error handling for 429 usage limits and 500 context overflow
- Distinguish plan usage limits (429 with usage_limit_reached) from transient rate limits
- Show approximate reset time in hours for plan limits
- Treat HTTP 500 with large sessions as context overflow (same as 400)
- Move history length check earlier for reuse across status codes
2026-03-17 16:32:01 -07:00
Teknium 6e4415e865 Merge pull request #1838 from NousResearch/hermes/hermes-b29f73b2
fix(context_compressor): replace print() calls with logger
2026-03-17 16:31:32 -07:00
Test 45bad9771d fix(context_compressor): replace print() calls with logger
Replaces all remaining print() calls in compress() with logger.info()
and logger.warning() for consistency with the rest of the module.

Inspired by PR #1822.
2026-03-17 16:31:01 -07:00
Teknium 8d60db0f6f fix(discord): remove bugged followup messages + remove /ask command (#1836)
fix(discord): remove bugged followup messages + remove /ask command
2026-03-17 16:28:36 -07:00
Test 1bee519a6f fix(discord): remove redundant /ask slash command
/ask was just 'send a message to the bot' via the slash command menu —
completely redundant since Discord bots already listen to channel messages.
Removed as part of salvaging PR #1827.
2026-03-17 16:25:09 -07:00
charliekerfoot 72bfa115a0 fix(discord): removebugged follow up messages from discord slash commands 2026-03-17 16:24:17 -07:00
Teknium 7f85b2914d Merge pull request #1824 from cutepawss/fix/search-files-pagination
Clean fix — adds pagination args to search_key for parity with read_file. Thanks @cutepawss!
2026-03-17 16:16:47 -07:00
Teknium b8076bb0bd feat: cron agents can suppress delivery with [SILENT] response (#1833)
feat: cron agents can suppress delivery with [SILENT] response
2026-03-17 16:09:24 -07:00
Test d35d923c76 feat: cron agents can suppress delivery with [SILENT] response
Every cron job prompt now includes guidance that the agent can respond
with [SILENT] when it has nothing new or noteworthy to report. The
scheduler checks for this marker and skips delivery, while still saving
output to disk for audit. Failed jobs always deliver regardless.

This replaces the notify parameter approach from PR #1807 with a simpler
always-on design — the model is smart enough to decide when there's
nothing worth reporting without needing a per-job flag.
2026-03-17 16:06:49 -07:00
darya a654bc04f7 fix(file_tools): include pagination args in repeated search key 2026-03-18 01:19:05 +03:00
Test a71e3f4d98 fix: add /browser to COMMAND_REGISTRY so it shows in help and autocomplete
The /browser command handler existed in cli.py but was never added to
COMMAND_REGISTRY after the centralized command registry refactor. This
meant:
- /browser didn't appear in /help
- No tab-completion or subcommand suggestions
- Dispatch used _base_word fallback instead of canonical resolution

Added CommandDef with connect/disconnect/status subcommands and
switched dispatch to use canonical instead of _base_word.
2026-03-17 13:29:36 -07:00
Teknium 588962d24e docs: escape {id} in api-server.md headings to fix MDX build (#1787)
MDX v2+ interprets curly braces in regular markdown as JSX
expressions. The headings 'GET /v1/responses/{id}' and
'DELETE /v1/responses/{id}' caused a ReferenceError during
Docusaurus static site generation because 'id' is not a
defined JavaScript variable. Escaped with backslashes.

Co-authored-by: Test <test@test.com>
2026-03-17 11:04:37 -07:00
87 changed files with 6175 additions and 439 deletions
+2
View File
@@ -194,6 +194,8 @@ class SessionManager:
"api_mode": runtime.get("api_mode"),
"base_url": runtime.get("base_url"),
"api_key": runtime.get("api_key"),
"command": runtime.get("command"),
"args": list(runtime.get("args") or []),
}
)
except Exception:
+31 -41
View File
@@ -55,8 +55,8 @@ logger = logging.getLogger(__name__)
_API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = {
"zai": "glm-4.5-flash",
"kimi-coding": "kimi-k2-turbo-preview",
"minimax": "MiniMax-M2.5-highspeed",
"minimax-cn": "MiniMax-M2.5-highspeed",
"minimax": "MiniMax-M2.7-highspeed",
"minimax-cn": "MiniMax-M2.7-highspeed",
"anthropic": "claude-haiku-4-5-20251001",
"ai-gateway": "google/gemini-3-flash",
"opencode-zen": "gemini-3-flash",
@@ -480,11 +480,11 @@ def _read_codex_access_token() -> Optional[str]:
def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
"""Try each API-key provider in PROVIDER_REGISTRY order.
Returns (client, model) for the first provider whose env var is set,
or (None, None) if none are configured.
Returns (client, model) for the first provider with usable runtime
credentials, or (None, None) if none are configured.
"""
try:
from hermes_cli.auth import PROVIDER_REGISTRY
from hermes_cli.auth import PROVIDER_REGISTRY, resolve_api_key_provider_credentials
except ImportError:
logger.debug("Could not import PROVIDER_REGISTRY for API-key fallback")
return None, None
@@ -492,34 +492,24 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
for provider_id, pconfig in PROVIDER_REGISTRY.items():
if pconfig.auth_type != "api_key":
continue
# Check if any of the provider's env vars are set
api_key = ""
for env_var in pconfig.api_key_env_vars:
val = os.getenv(env_var, "").strip()
if val:
api_key = val
break
if not api_key:
continue
if provider_id == "anthropic":
return _try_anthropic()
# Resolve base URL (with optional env-var override)
# Kimi Code keys (sk-kimi-) need api.kimi.com/coding/v1
env_url = ""
if pconfig.base_url_env_var:
env_url = os.getenv(pconfig.base_url_env_var, "").strip()
if env_url:
base_url = env_url.rstrip("/")
elif provider_id == "kimi-coding" and api_key.startswith("sk-kimi-"):
base_url = "https://api.kimi.com/coding/v1"
else:
base_url = pconfig.inference_base_url
creds = resolve_api_key_provider_credentials(provider_id)
api_key = str(creds.get("api_key", "")).strip()
if not api_key:
continue
base_url = str(creds.get("base_url", "")).strip().rstrip("/") or pconfig.inference_base_url
model = _API_KEY_PROVIDER_AUX_MODELS.get(provider_id, "default")
logger.debug("Auxiliary text client: %s (%s)", pconfig.name, model)
extra = {}
if "api.kimi.com" in base_url.lower():
extra["default_headers"] = {"User-Agent": "KimiCLI/1.0"}
elif "api.githubcopilot.com" in base_url.lower():
from hermes_cli.models import copilot_default_headers
extra["default_headers"] = copilot_default_headers()
return OpenAI(api_key=api_key, base_url=base_url, **extra), model
return None, None
@@ -744,6 +734,10 @@ def _to_async_client(sync_client, model: str):
base_lower = str(sync_client.base_url).lower()
if "openrouter" in base_lower:
async_kwargs["default_headers"] = dict(_OR_HEADERS)
elif "api.githubcopilot.com" in base_lower:
from hermes_cli.models import copilot_default_headers
async_kwargs["default_headers"] = copilot_default_headers()
elif "api.kimi.com" in base_lower:
async_kwargs["default_headers"] = {"User-Agent": "KimiCLI/1.0"}
return AsyncOpenAI(**async_kwargs), model
@@ -885,7 +879,7 @@ def resolve_provider_client(
# ── API-key providers from PROVIDER_REGISTRY ─────────────────────
try:
from hermes_cli.auth import PROVIDER_REGISTRY, _resolve_kimi_base_url
from hermes_cli.auth import PROVIDER_REGISTRY, resolve_api_key_provider_credentials
except ImportError:
logger.debug("hermes_cli.auth not available for provider %s", provider)
return None, None
@@ -904,26 +898,18 @@ def resolve_provider_client(
final_model = model or default_model
return (_to_async_client(client, final_model) if async_mode else (client, final_model))
# Find the first configured API key
api_key = ""
for env_var in pconfig.api_key_env_vars:
api_key = os.getenv(env_var, "").strip()
if api_key:
break
creds = resolve_api_key_provider_credentials(provider)
api_key = str(creds.get("api_key", "")).strip()
if not api_key:
tried_sources = list(pconfig.api_key_env_vars)
if provider == "copilot":
tried_sources.append("gh auth token")
logger.warning("resolve_provider_client: provider %s has no API "
"key configured (tried: %s)",
provider, ", ".join(pconfig.api_key_env_vars))
provider, ", ".join(tried_sources))
return None, None
# Resolve base URL (env override → provider-specific logic → default)
base_url_override = os.getenv(pconfig.base_url_env_var, "").strip() if pconfig.base_url_env_var else ""
if provider == "kimi-coding":
base_url = _resolve_kimi_base_url(api_key, pconfig.inference_base_url, base_url_override)
elif base_url_override:
base_url = base_url_override
else:
base_url = pconfig.inference_base_url
base_url = str(creds.get("base_url", "")).strip().rstrip("/") or pconfig.inference_base_url
default_model = _API_KEY_PROVIDER_AUX_MODELS.get(provider, "")
final_model = model or default_model
@@ -932,6 +918,10 @@ def resolve_provider_client(
headers = {}
if "api.kimi.com" in base_url.lower():
headers["User-Agent"] = "KimiCLI/1.0"
elif "api.githubcopilot.com" in base_url.lower():
from hermes_cli.models import copilot_default_headers
headers.update(copilot_default_headers())
client = OpenAI(api_key=api_key, base_url=base_url,
**({"default_headers": headers} if headers else {}))
+48 -19
View File
@@ -45,16 +45,18 @@ class ContextCompressor:
quiet_mode: bool = False,
summary_model_override: str = None,
base_url: str = "",
api_key: str = "",
):
self.model = model
self.base_url = base_url
self.api_key = api_key
self.threshold_percent = threshold_percent
self.protect_first_n = protect_first_n
self.protect_last_n = protect_last_n
self.summary_target_tokens = summary_target_tokens
self.quiet_mode = quiet_mode
self.context_length = get_model_context_length(model, base_url=base_url)
self.context_length = get_model_context_length(model, base_url=base_url, api_key=api_key)
self.threshold_tokens = int(self.context_length * threshold_percent)
self.compression_count = 0
self._context_probed = False # True after a step-down from context error
@@ -251,18 +253,24 @@ Write only the summary body. Do not include any preamble or prefix; the system w
"""Pull a compress-end boundary backward to avoid splitting a
tool_call / result group.
If the message just before ``idx`` is an assistant message with
tool_calls, those tool results will start at ``idx`` and would be
separated from their parent. Move backwards to include the whole
group in the summarised region.
If the boundary falls in the middle of a tool-result group (i.e.
there are consecutive tool messages before ``idx``), walk backward
past all of them to find the parent assistant message. If found,
move the boundary before the assistant so the entire
assistant + tool_results group is included in the summarised region
rather than being split (which causes silent data loss when
``_sanitize_tool_pairs`` removes the orphaned tail results).
"""
if idx <= 0 or idx >= len(messages):
return idx
prev = messages[idx - 1]
if prev.get("role") == "assistant" and prev.get("tool_calls"):
# The results for this assistant turn sit at idx..idx+k.
# Include the assistant message in the summarised region too.
idx -= 1
# Walk backward past consecutive tool results
check = idx - 1
while check >= 0 and messages[check].get("role") == "tool":
check -= 1
# If we landed on the parent assistant with tool_calls, pull the
# boundary before it so the whole group gets summarised together.
if check >= 0 and messages[check].get("role") == "assistant" and messages[check].get("tool_calls"):
idx = check
return idx
def compress(self, messages: List[Dict[str, Any]], current_tokens: int = None) -> List[Dict[str, Any]]:
@@ -275,7 +283,11 @@ Write only the summary body. Do not include any preamble or prefix; the system w
n_messages = len(messages)
if n_messages <= self.protect_first_n + self.protect_last_n + 1:
if not self.quiet_mode:
print(f"⚠️ Cannot compress: only {n_messages} messages (need > {self.protect_first_n + self.protect_last_n + 1})")
logger.warning(
"Cannot compress: only %d messages (need > %d)",
n_messages,
self.protect_first_n + self.protect_last_n + 1,
)
return messages
compress_start = self.protect_first_n
@@ -293,11 +305,23 @@ Write only the summary body. Do not include any preamble or prefix; the system w
display_tokens = current_tokens if current_tokens else self.last_prompt_tokens or estimate_messages_tokens_rough(messages)
if not self.quiet_mode:
print(f"\n📦 Context compression triggered ({display_tokens:,} tokens ≥ {self.threshold_tokens:,} threshold)")
print(f" 📊 Model context limit: {self.context_length:,} tokens ({self.threshold_percent*100:.0f}% = {self.threshold_tokens:,})")
if not self.quiet_mode:
print(f" 🗜️ Summarizing turns {compress_start+1}-{compress_end} ({len(turns_to_summarize)} turns)")
logger.info(
"Context compression triggered (%d tokens >= %d threshold)",
display_tokens,
self.threshold_tokens,
)
logger.info(
"Model context limit: %d tokens (%.0f%% = %d)",
self.context_length,
self.threshold_percent * 100,
self.threshold_tokens,
)
logger.info(
"Summarizing turns %d-%d (%d turns)",
compress_start + 1,
compress_end,
len(turns_to_summarize),
)
summary = self._generate_summary(turns_to_summarize)
@@ -337,7 +361,7 @@ Write only the summary body. Do not include any preamble or prefix; the system w
compressed.append({"role": summary_role, "content": summary})
else:
if not self.quiet_mode:
print(" ⚠️ No summary model available — middle turns dropped without summary")
logger.warning("No summary model available — middle turns dropped without summary")
for i in range(compress_end, n_messages):
msg = messages[i].copy()
@@ -354,7 +378,12 @@ Write only the summary body. Do not include any preamble or prefix; the system w
if not self.quiet_mode:
new_estimate = estimate_messages_tokens_rough(compressed)
saved_estimate = display_tokens - new_estimate
print(f" ✅ Compressed: {n_messages}{len(compressed)} messages (~{saved_estimate:,} tokens saved)")
print(f" 💡 Compression #{self.compression_count} complete")
logger.info(
"Compressed: %d -> %d messages (~%d tokens saved)",
n_messages,
len(compressed),
saved_estimate,
)
logger.info("Compression #%d complete", self.compression_count)
return compressed
+447
View File
@@ -0,0 +1,447 @@
"""OpenAI-compatible shim that forwards Hermes requests to `copilot --acp`.
This adapter lets Hermes treat the GitHub Copilot ACP server as a chat-style
backend. Each request starts a short-lived ACP session, sends the formatted
conversation as a single prompt, collects text chunks, and converts the result
back into the minimal shape Hermes expects from an OpenAI client.
"""
from __future__ import annotations
import json
import os
import queue
import shlex
import subprocess
import threading
import time
from collections import deque
from pathlib import Path
from types import SimpleNamespace
from typing import Any
ACP_MARKER_BASE_URL = "acp://copilot"
_DEFAULT_TIMEOUT_SECONDS = 900.0
def _resolve_command() -> str:
return (
os.getenv("HERMES_COPILOT_ACP_COMMAND", "").strip()
or os.getenv("COPILOT_CLI_PATH", "").strip()
or "copilot"
)
def _resolve_args() -> list[str]:
raw = os.getenv("HERMES_COPILOT_ACP_ARGS", "").strip()
if not raw:
return ["--acp", "--stdio"]
return shlex.split(raw)
def _jsonrpc_error(message_id: Any, code: int, message: str) -> dict[str, Any]:
return {
"jsonrpc": "2.0",
"id": message_id,
"error": {
"code": code,
"message": message,
},
}
def _format_messages_as_prompt(messages: list[dict[str, Any]], model: str | None = None) -> str:
sections: list[str] = [
"You are being used as the active ACP agent backend for Hermes.",
"Use your own ACP capabilities and respond directly in natural language.",
"Do not emit OpenAI tool-call JSON.",
]
if model:
sections.append(f"Hermes requested model hint: {model}")
transcript: list[str] = []
for message in messages:
if not isinstance(message, dict):
continue
role = str(message.get("role") or "unknown").strip().lower()
if role == "tool":
role = "tool"
elif role not in {"system", "user", "assistant"}:
role = "context"
content = message.get("content")
rendered = _render_message_content(content)
if not rendered:
continue
label = {
"system": "System",
"user": "User",
"assistant": "Assistant",
"tool": "Tool",
"context": "Context",
}.get(role, role.title())
transcript.append(f"{label}:\n{rendered}")
if transcript:
sections.append("Conversation transcript:\n\n" + "\n\n".join(transcript))
sections.append("Continue the conversation from the latest user request.")
return "\n\n".join(section.strip() for section in sections if section and section.strip())
def _render_message_content(content: Any) -> str:
if content is None:
return ""
if isinstance(content, str):
return content.strip()
if isinstance(content, dict):
if "text" in content:
return str(content.get("text") or "").strip()
if "content" in content and isinstance(content.get("content"), str):
return str(content.get("content") or "").strip()
return json.dumps(content, ensure_ascii=True)
if isinstance(content, list):
parts: list[str] = []
for item in content:
if isinstance(item, str):
parts.append(item)
elif isinstance(item, dict):
text = item.get("text")
if isinstance(text, str) and text.strip():
parts.append(text.strip())
return "\n".join(parts).strip()
return str(content).strip()
def _ensure_path_within_cwd(path_text: str, cwd: str) -> Path:
candidate = Path(path_text)
if not candidate.is_absolute():
raise PermissionError("ACP file-system paths must be absolute.")
resolved = candidate.resolve()
root = Path(cwd).resolve()
try:
resolved.relative_to(root)
except ValueError as exc:
raise PermissionError(f"Path '{resolved}' is outside the session cwd '{root}'.") from exc
return resolved
class _ACPChatCompletions:
def __init__(self, client: "CopilotACPClient"):
self._client = client
def create(self, **kwargs: Any) -> Any:
return self._client._create_chat_completion(**kwargs)
class _ACPChatNamespace:
def __init__(self, client: "CopilotACPClient"):
self.completions = _ACPChatCompletions(client)
class CopilotACPClient:
"""Minimal OpenAI-client-compatible facade for Copilot ACP."""
def __init__(
self,
*,
api_key: str | None = None,
base_url: str | None = None,
default_headers: dict[str, str] | None = None,
acp_command: str | None = None,
acp_args: list[str] | None = None,
acp_cwd: str | None = None,
command: str | None = None,
args: list[str] | None = None,
**_: Any,
):
self.api_key = api_key or "copilot-acp"
self.base_url = base_url or ACP_MARKER_BASE_URL
self._default_headers = dict(default_headers or {})
self._acp_command = acp_command or command or _resolve_command()
self._acp_args = list(acp_args or args or _resolve_args())
self._acp_cwd = str(Path(acp_cwd or os.getcwd()).resolve())
self.chat = _ACPChatNamespace(self)
self.is_closed = False
self._active_process: subprocess.Popen[str] | None = None
self._active_process_lock = threading.Lock()
def close(self) -> None:
proc: subprocess.Popen[str] | None
with self._active_process_lock:
proc = self._active_process
self._active_process = None
self.is_closed = True
if proc is None:
return
try:
proc.terminate()
proc.wait(timeout=2)
except Exception:
try:
proc.kill()
except Exception:
pass
def _create_chat_completion(
self,
*,
model: str | None = None,
messages: list[dict[str, Any]] | None = None,
timeout: float | None = None,
**_: Any,
) -> Any:
prompt_text = _format_messages_as_prompt(messages or [], model=model)
response_text, reasoning_text = self._run_prompt(
prompt_text,
timeout_seconds=float(timeout or _DEFAULT_TIMEOUT_SECONDS),
)
usage = SimpleNamespace(
prompt_tokens=0,
completion_tokens=0,
total_tokens=0,
prompt_tokens_details=SimpleNamespace(cached_tokens=0),
)
assistant_message = SimpleNamespace(
content=response_text,
tool_calls=[],
reasoning=reasoning_text or None,
reasoning_content=reasoning_text or None,
reasoning_details=None,
)
choice = SimpleNamespace(message=assistant_message, finish_reason="stop")
return SimpleNamespace(
choices=[choice],
usage=usage,
model=model or "copilot-acp",
)
def _run_prompt(self, prompt_text: str, *, timeout_seconds: float) -> tuple[str, str]:
try:
proc = subprocess.Popen(
[self._acp_command] + self._acp_args,
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
bufsize=1,
cwd=self._acp_cwd,
)
except FileNotFoundError as exc:
raise RuntimeError(
f"Could not start Copilot ACP command '{self._acp_command}'. "
"Install GitHub Copilot CLI or set HERMES_COPILOT_ACP_COMMAND/COPILOT_CLI_PATH."
) from exc
if proc.stdin is None or proc.stdout is None:
proc.kill()
raise RuntimeError("Copilot ACP process did not expose stdin/stdout pipes.")
self.is_closed = False
with self._active_process_lock:
self._active_process = proc
inbox: queue.Queue[dict[str, Any]] = queue.Queue()
stderr_tail: deque[str] = deque(maxlen=40)
def _stdout_reader() -> None:
for line in proc.stdout:
try:
inbox.put(json.loads(line))
except Exception:
inbox.put({"raw": line.rstrip("\n")})
def _stderr_reader() -> None:
if proc.stderr is None:
return
for line in proc.stderr:
stderr_tail.append(line.rstrip("\n"))
out_thread = threading.Thread(target=_stdout_reader, daemon=True)
err_thread = threading.Thread(target=_stderr_reader, daemon=True)
out_thread.start()
err_thread.start()
next_id = 0
def _request(method: str, params: dict[str, Any], *, text_parts: list[str] | None = None, reasoning_parts: list[str] | None = None) -> Any:
nonlocal next_id
next_id += 1
request_id = next_id
payload = {
"jsonrpc": "2.0",
"id": request_id,
"method": method,
"params": params,
}
proc.stdin.write(json.dumps(payload) + "\n")
proc.stdin.flush()
deadline = time.time() + timeout_seconds
while time.time() < deadline:
if proc.poll() is not None:
break
try:
msg = inbox.get(timeout=0.1)
except queue.Empty:
continue
if self._handle_server_message(
msg,
process=proc,
cwd=self._acp_cwd,
text_parts=text_parts,
reasoning_parts=reasoning_parts,
):
continue
if msg.get("id") != request_id:
continue
if "error" in msg:
err = msg.get("error") or {}
raise RuntimeError(
f"Copilot ACP {method} failed: {err.get('message') or err}"
)
return msg.get("result")
stderr_text = "\n".join(stderr_tail).strip()
if proc.poll() is not None and stderr_text:
raise RuntimeError(f"Copilot ACP process exited early: {stderr_text}")
raise TimeoutError(f"Timed out waiting for Copilot ACP response to {method}.")
try:
_request(
"initialize",
{
"protocolVersion": 1,
"clientCapabilities": {
"fs": {
"readTextFile": True,
"writeTextFile": True,
}
},
"clientInfo": {
"name": "hermes-agent",
"title": "Hermes Agent",
"version": "0.0.0",
},
},
)
session = _request(
"session/new",
{
"cwd": self._acp_cwd,
"mcpServers": [],
},
) or {}
session_id = str(session.get("sessionId") or "").strip()
if not session_id:
raise RuntimeError("Copilot ACP did not return a sessionId.")
text_parts: list[str] = []
reasoning_parts: list[str] = []
_request(
"session/prompt",
{
"sessionId": session_id,
"prompt": [
{
"type": "text",
"text": prompt_text,
}
],
},
text_parts=text_parts,
reasoning_parts=reasoning_parts,
)
return "".join(text_parts).strip(), "".join(reasoning_parts).strip()
finally:
self.close()
def _handle_server_message(
self,
msg: dict[str, Any],
*,
process: subprocess.Popen[str],
cwd: str,
text_parts: list[str] | None,
reasoning_parts: list[str] | None,
) -> bool:
method = msg.get("method")
if not isinstance(method, str):
return False
if method == "session/update":
params = msg.get("params") or {}
update = params.get("update") or {}
kind = str(update.get("sessionUpdate") or "").strip()
content = update.get("content") or {}
chunk_text = ""
if isinstance(content, dict):
chunk_text = str(content.get("text") or "").strip()
if kind == "agent_message_chunk" and chunk_text and text_parts is not None:
text_parts.append(chunk_text)
elif kind == "agent_thought_chunk" and chunk_text and reasoning_parts is not None:
reasoning_parts.append(chunk_text)
return True
if process.stdin is None:
return True
message_id = msg.get("id")
params = msg.get("params") or {}
if method == "session/request_permission":
response = {
"jsonrpc": "2.0",
"id": message_id,
"result": {
"outcome": {
"outcome": "allow_once",
}
},
}
elif method == "fs/read_text_file":
try:
path = _ensure_path_within_cwd(str(params.get("path") or ""), cwd)
content = path.read_text() if path.exists() else ""
line = params.get("line")
limit = params.get("limit")
if isinstance(line, int) and line > 1:
lines = content.splitlines(keepends=True)
start = line - 1
end = start + limit if isinstance(limit, int) and limit > 0 else None
content = "".join(lines[start:end])
response = {
"jsonrpc": "2.0",
"id": message_id,
"result": {
"content": content,
},
}
except Exception as exc:
response = _jsonrpc_error(message_id, -32602, str(exc))
elif method == "fs/write_text_file":
try:
path = _ensure_path_within_cwd(str(params.get("path") or ""), cwd)
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(str(params.get("content") or ""))
response = {
"jsonrpc": "2.0",
"id": message_id,
"result": None,
}
except Exception as exc:
response = _jsonrpc_error(message_id, -32602, str(exc))
else:
response = _jsonrpc_error(
message_id,
-32601,
f"ACP client method '{method}' is not supported by Hermes yet.",
)
process.stdin.write(json.dumps(response) + "\n")
process.stdin.flush()
return True
+220 -9
View File
@@ -10,6 +10,7 @@ import re
import time
from pathlib import Path
from typing import Any, Dict, List, Optional
from urllib.parse import urlparse
import requests
import yaml
@@ -21,6 +22,9 @@ logger = logging.getLogger(__name__)
_model_metadata_cache: Dict[str, Dict[str, Any]] = {}
_model_metadata_cache_time: float = 0
_MODEL_CACHE_TTL = 3600
_endpoint_model_metadata_cache: Dict[str, Dict[str, Dict[str, Any]]] = {}
_endpoint_model_metadata_cache_time: Dict[str, float] = {}
_ENDPOINT_MODEL_CACHE_TTL = 300
# Descending tiers for context length probing when the model is unknown.
# We start high and step down on context-length errors until one works.
@@ -77,6 +81,8 @@ DEFAULT_CONTEXT_LENGTHS = {
"kimi-k2-thinking-turbo": 262144,
"kimi-k2-turbo-preview": 262144,
"kimi-k2-0905-preview": 131072,
"MiniMax-M2.7": 204800,
"MiniMax-M2.7-highspeed": 204800,
"MiniMax-M2.5": 204800,
"MiniMax-M2.5-highspeed": 204800,
"MiniMax-M2.1": 204800,
@@ -121,6 +127,128 @@ DEFAULT_CONTEXT_LENGTHS = {
"qwen-vl-max": 32768,
}
_CONTEXT_LENGTH_KEYS = (
"context_length",
"context_window",
"max_context_length",
"max_position_embeddings",
"max_model_len",
"max_input_tokens",
"max_sequence_length",
"max_seq_len",
)
_MAX_COMPLETION_KEYS = (
"max_completion_tokens",
"max_output_tokens",
"max_tokens",
)
def _normalize_base_url(base_url: str) -> str:
return (base_url or "").strip().rstrip("/")
def _is_openrouter_base_url(base_url: str) -> bool:
return "openrouter.ai" in _normalize_base_url(base_url).lower()
def _is_custom_endpoint(base_url: str) -> bool:
normalized = _normalize_base_url(base_url)
return bool(normalized) and not _is_openrouter_base_url(normalized)
def _is_known_provider_base_url(base_url: str) -> bool:
normalized = _normalize_base_url(base_url)
if not normalized:
return False
parsed = urlparse(normalized if "://" in normalized else f"https://{normalized}")
host = parsed.netloc.lower() or parsed.path.lower()
known_hosts = (
"api.openai.com",
"chatgpt.com",
"api.anthropic.com",
"api.z.ai",
"api.moonshot.ai",
"api.kimi.com",
"api.minimax",
)
return any(known_host in host for known_host in known_hosts)
def _iter_nested_dicts(value: Any):
if isinstance(value, dict):
yield value
for nested in value.values():
yield from _iter_nested_dicts(nested)
elif isinstance(value, list):
for item in value:
yield from _iter_nested_dicts(item)
def _coerce_reasonable_int(value: Any, minimum: int = 1024, maximum: int = 10_000_000) -> Optional[int]:
try:
if isinstance(value, bool):
return None
if isinstance(value, str):
value = value.strip().replace(",", "")
result = int(value)
except (TypeError, ValueError):
return None
if minimum <= result <= maximum:
return result
return None
def _extract_first_int(payload: Dict[str, Any], keys: tuple[str, ...]) -> Optional[int]:
keyset = {key.lower() for key in keys}
for mapping in _iter_nested_dicts(payload):
for key, value in mapping.items():
if str(key).lower() not in keyset:
continue
coerced = _coerce_reasonable_int(value)
if coerced is not None:
return coerced
return None
def _extract_context_length(payload: Dict[str, Any]) -> Optional[int]:
return _extract_first_int(payload, _CONTEXT_LENGTH_KEYS)
def _extract_max_completion_tokens(payload: Dict[str, Any]) -> Optional[int]:
return _extract_first_int(payload, _MAX_COMPLETION_KEYS)
def _extract_pricing(payload: Dict[str, Any]) -> Dict[str, Any]:
alias_map = {
"prompt": ("prompt", "input", "input_cost_per_token", "prompt_token_cost"),
"completion": ("completion", "output", "output_cost_per_token", "completion_token_cost"),
"request": ("request", "request_cost"),
"cache_read": ("cache_read", "cached_prompt", "input_cache_read", "cache_read_cost_per_token"),
"cache_write": ("cache_write", "cache_creation", "input_cache_write", "cache_write_cost_per_token"),
}
for mapping in _iter_nested_dicts(payload):
normalized = {str(key).lower(): value for key, value in mapping.items()}
if not any(any(alias in normalized for alias in aliases) for aliases in alias_map.values()):
continue
pricing: Dict[str, Any] = {}
for target, aliases in alias_map.items():
for alias in aliases:
if alias in normalized and normalized[alias] not in (None, ""):
pricing[target] = normalized[alias]
break
if pricing:
return pricing
return {}
def _add_model_aliases(cache: Dict[str, Dict[str, Any]], model_id: str, entry: Dict[str, Any]) -> None:
cache[model_id] = entry
if "/" in model_id:
bare_model = model_id.split("/", 1)[1]
cache.setdefault(bare_model, entry)
def fetch_model_metadata(force_refresh: bool = False) -> Dict[str, Dict[str, Any]]:
"""Fetch model metadata from OpenRouter (cached for 1 hour)."""
@@ -137,15 +265,16 @@ def fetch_model_metadata(force_refresh: bool = False) -> Dict[str, Dict[str, Any
cache = {}
for model in data.get("data", []):
model_id = model.get("id", "")
cache[model_id] = {
entry = {
"context_length": model.get("context_length", 128000),
"max_completion_tokens": model.get("top_provider", {}).get("max_completion_tokens", 4096),
"name": model.get("name", model_id),
"pricing": model.get("pricing", {}),
}
_add_model_aliases(cache, model_id, entry)
canonical = model.get("canonical_slug", "")
if canonical and canonical != model_id:
cache[canonical] = cache[model_id]
_add_model_aliases(cache, canonical, entry)
_model_metadata_cache = cache
_model_metadata_cache_time = time.time()
@@ -157,6 +286,75 @@ def fetch_model_metadata(force_refresh: bool = False) -> Dict[str, Dict[str, Any
return _model_metadata_cache or {}
def fetch_endpoint_model_metadata(
base_url: str,
api_key: str = "",
force_refresh: bool = False,
) -> Dict[str, Dict[str, Any]]:
"""Fetch model metadata from an OpenAI-compatible ``/models`` endpoint.
This is used for explicit custom endpoints where hardcoded global model-name
defaults are unreliable. Results are cached in memory per base URL.
"""
normalized = _normalize_base_url(base_url)
if not normalized or _is_openrouter_base_url(normalized):
return {}
if not force_refresh:
cached = _endpoint_model_metadata_cache.get(normalized)
cached_at = _endpoint_model_metadata_cache_time.get(normalized, 0)
if cached is not None and (time.time() - cached_at) < _ENDPOINT_MODEL_CACHE_TTL:
return cached
candidates = [normalized]
if normalized.endswith("/v1"):
alternate = normalized[:-3].rstrip("/")
else:
alternate = normalized + "/v1"
if alternate and alternate not in candidates:
candidates.append(alternate)
headers = {"Authorization": f"Bearer {api_key}"} if api_key else {}
last_error: Optional[Exception] = None
for candidate in candidates:
url = candidate.rstrip("/") + "/models"
try:
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()
payload = response.json()
cache: Dict[str, Dict[str, Any]] = {}
for model in payload.get("data", []):
if not isinstance(model, dict):
continue
model_id = model.get("id")
if not model_id:
continue
entry: Dict[str, Any] = {"name": model.get("name", model_id)}
context_length = _extract_context_length(model)
if context_length is not None:
entry["context_length"] = context_length
max_completion_tokens = _extract_max_completion_tokens(model)
if max_completion_tokens is not None:
entry["max_completion_tokens"] = max_completion_tokens
pricing = _extract_pricing(model)
if pricing:
entry["pricing"] = pricing
_add_model_aliases(cache, model_id, entry)
_endpoint_model_metadata_cache[normalized] = cache
_endpoint_model_metadata_cache_time[normalized] = time.time()
return cache
except Exception as exc:
last_error = exc
if last_error:
logger.debug("Failed to fetch model metadata from %s/models: %s", normalized, last_error)
_endpoint_model_metadata_cache[normalized] = {}
_endpoint_model_metadata_cache_time[normalized] = time.time()
return {}
def _get_context_cache_path() -> Path:
"""Return path to the persistent context length cache file."""
hermes_home = Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes"))
@@ -241,14 +439,15 @@ def parse_context_limit_from_error(error_msg: str) -> Optional[int]:
return None
def get_model_context_length(model: str, base_url: str = "") -> int:
def get_model_context_length(model: str, base_url: str = "", api_key: str = "") -> int:
"""Get the context length for a model.
Resolution order:
1. Persistent cache (previously discovered via probing)
2. OpenRouter API metadata
3. Hardcoded DEFAULT_CONTEXT_LENGTHS (fuzzy match)
4. First probe tier (2M) — will be narrowed on first context error
2. Active endpoint metadata (/models for explicit custom endpoints)
3. OpenRouter API metadata
4. Hardcoded DEFAULT_CONTEXT_LENGTHS (fuzzy match for hosted routes only)
5. First probe tier (2M) — will be narrowed on first context error
"""
# 1. Check persistent cache (model+provider)
if base_url:
@@ -256,19 +455,31 @@ def get_model_context_length(model: str, base_url: str = "") -> int:
if cached is not None:
return cached
# 2. OpenRouter API metadata
# 2. Active endpoint metadata for explicit custom routes
if _is_custom_endpoint(base_url):
endpoint_metadata = fetch_endpoint_model_metadata(base_url, api_key=api_key)
if model in endpoint_metadata:
context_length = endpoint_metadata[model].get("context_length")
if isinstance(context_length, int):
return context_length
if not _is_known_provider_base_url(base_url):
# Explicit third-party endpoints should not borrow fuzzy global
# defaults from unrelated providers with similarly named models.
return CONTEXT_PROBE_TIERS[0]
# 3. OpenRouter API metadata
metadata = fetch_model_metadata()
if model in metadata:
return metadata[model].get("context_length", 128000)
# 3. Hardcoded defaults (fuzzy match — longest key first for specificity)
# 4. Hardcoded defaults (fuzzy match — longest key first for specificity)
for default_model, length in sorted(
DEFAULT_CONTEXT_LENGTHS.items(), key=lambda x: len(x[0]), reverse=True
):
if default_model in model or model in default_model:
return length
# 4. Unknown model — start at highest probe tier
# 5. Unknown model — start at highest probe tier
return CONTEXT_PROBE_TIERS[0]
+53 -28
View File
@@ -330,28 +330,34 @@ def build_skills_system_prompt(
# Each entry: (skill_name, description)
# Supports sub-categories: skills/mlops/training/axolotl/SKILL.md
# -> category "mlops/training", skill "axolotl"
# Load disabled skill names once for the entire scan
try:
from tools.skills_tool import _get_disabled_skill_names
disabled = _get_disabled_skill_names()
except Exception:
disabled = set()
skills_by_category: dict[str, list[tuple[str, str]]] = {}
for skill_file in skills_dir.rglob("SKILL.md"):
is_compatible, _, desc = _parse_skill_file(skill_file)
is_compatible, frontmatter, desc = _parse_skill_file(skill_file)
if not is_compatible:
continue
# Skip skills whose conditional activation rules exclude them
conditions = _read_skill_conditions(skill_file)
if not _skill_should_show(conditions, available_tools, available_toolsets):
continue
rel_path = skill_file.relative_to(skills_dir)
parts = rel_path.parts
if len(parts) >= 2:
# Category is everything between skills_dir and the skill folder
# e.g. parts = ("mlops", "training", "axolotl", "SKILL.md")
# → category = "mlops/training", skill_name = "axolotl"
# e.g. parts = ("github", "github-auth", "SKILL.md")
# → category = "github", skill_name = "github-auth"
skill_name = parts[-2]
category = "/".join(parts[:-2]) if len(parts) > 2 else parts[0]
else:
category = "general"
skill_name = skill_file.parent.name
# Respect user's disabled skills config
fm_name = frontmatter.get("name", skill_name)
if fm_name in disabled or skill_name in disabled:
continue
# Skip skills whose conditional activation rules exclude them
conditions = _read_skill_conditions(skill_file)
if not _skill_should_show(conditions, available_tools, available_toolsets):
continue
skills_by_category.setdefault(category, []).append((skill_name, desc))
if not skills_by_category:
@@ -423,11 +429,42 @@ def _truncate_content(content: str, filename: str, max_chars: int = CONTEXT_FILE
return head + marker + tail
def build_context_files_prompt(cwd: Optional[str] = None) -> str:
def load_soul_md() -> Optional[str]:
"""Load SOUL.md from HERMES_HOME and return its content, or None.
Used as the agent identity (slot #1 in the system prompt). When this
returns content, ``build_context_files_prompt`` should be called with
``skip_soul=True`` so SOUL.md isn't injected twice.
"""
try:
from hermes_cli.config import ensure_hermes_home
ensure_hermes_home()
except Exception as e:
logger.debug("Could not ensure HERMES_HOME before loading SOUL.md: %s", e)
soul_path = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes")) / "SOUL.md"
if not soul_path.exists():
return None
try:
content = soul_path.read_text(encoding="utf-8").strip()
if not content:
return None
content = _scan_context_content(content, "SOUL.md")
content = _truncate_content(content, "SOUL.md")
return content
except Exception as e:
logger.debug("Could not read SOUL.md from %s: %s", soul_path, e)
return None
def build_context_files_prompt(cwd: Optional[str] = None, skip_soul: bool = False) -> str:
"""Discover and load context files for the system prompt.
Discovery: AGENTS.md (recursive), .cursorrules / .cursor/rules/*.mdc,
and SOUL.md from HERMES_HOME only. Each capped at 20,000 chars.
When *skip_soul* is True, SOUL.md is not included here (it was already
loaded via ``load_soul_md()`` for the identity slot).
"""
if cwd is None:
cwd = os.getcwd()
@@ -517,23 +554,11 @@ def build_context_files_prompt(cwd: Optional[str] = None) -> str:
hermes_md_content = _truncate_content(hermes_md_content, ".hermes.md")
sections.append(hermes_md_content)
# SOUL.md from HERMES_HOME only
try:
from hermes_cli.config import ensure_hermes_home
ensure_hermes_home()
except Exception as e:
logger.debug("Could not ensure HERMES_HOME before loading SOUL.md: %s", e)
soul_path = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes")) / "SOUL.md"
if soul_path.exists():
try:
content = soul_path.read_text(encoding="utf-8").strip()
if content:
content = _scan_context_content(content, "SOUL.md")
content = _truncate_content(content, "SOUL.md")
sections.append(content)
except Exception as e:
logger.debug("Could not read SOUL.md from %s: %s", soul_path, e)
# SOUL.md from HERMES_HOME only — skip when already loaded as identity
if not skip_soul:
soul_content = load_soul_md()
if soul_content:
sections.append(soul_content)
if not sections:
return ""
+5 -1
View File
@@ -157,9 +157,10 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
global _skill_commands
_skill_commands = {}
try:
from tools.skills_tool import SKILLS_DIR, _parse_frontmatter, skill_matches_platform
from tools.skills_tool import SKILLS_DIR, _parse_frontmatter, skill_matches_platform, _get_disabled_skill_names
if not SKILLS_DIR.exists():
return _skill_commands
disabled = _get_disabled_skill_names()
for skill_md in SKILLS_DIR.rglob("SKILL.md"):
if any(part in ('.git', '.github', '.hub') for part in skill_md.parts):
continue
@@ -170,6 +171,9 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
if not skill_matches_platform(frontmatter):
continue
name = frontmatter.get('name', skill_md.parent.name)
# Respect user's disabled skills config
if name in disabled:
continue
description = frontmatter.get('description', '')
if not description:
for line in body.strip().split('\n'):
+12
View File
@@ -125,6 +125,8 @@ def resolve_turn_route(user_message: str, routing_config: Optional[Dict[str, Any
"base_url": primary.get("base_url"),
"provider": primary.get("provider"),
"api_mode": primary.get("api_mode"),
"command": primary.get("command"),
"args": list(primary.get("args") or []),
},
"label": None,
"signature": (
@@ -132,6 +134,8 @@ def resolve_turn_route(user_message: str, routing_config: Optional[Dict[str, Any
primary.get("provider"),
primary.get("base_url"),
primary.get("api_mode"),
primary.get("command"),
tuple(primary.get("args") or ()),
),
}
@@ -156,6 +160,8 @@ def resolve_turn_route(user_message: str, routing_config: Optional[Dict[str, Any
"base_url": primary.get("base_url"),
"provider": primary.get("provider"),
"api_mode": primary.get("api_mode"),
"command": primary.get("command"),
"args": list(primary.get("args") or []),
},
"label": None,
"signature": (
@@ -163,6 +169,8 @@ def resolve_turn_route(user_message: str, routing_config: Optional[Dict[str, Any
primary.get("provider"),
primary.get("base_url"),
primary.get("api_mode"),
primary.get("command"),
tuple(primary.get("args") or ()),
),
}
@@ -173,6 +181,8 @@ def resolve_turn_route(user_message: str, routing_config: Optional[Dict[str, Any
"base_url": runtime.get("base_url"),
"provider": runtime.get("provider"),
"api_mode": runtime.get("api_mode"),
"command": runtime.get("command"),
"args": list(runtime.get("args") or []),
},
"label": f"smart route → {route.get('model')} ({runtime.get('provider')})",
"signature": (
@@ -180,5 +190,7 @@ def resolve_turn_route(user_message: str, routing_config: Optional[Dict[str, Any
runtime.get("provider"),
runtime.get("base_url"),
runtime.get("api_mode"),
runtime.get("command"),
tuple(runtime.get("args") or ()),
),
}
+37 -8
View File
@@ -5,7 +5,7 @@ from datetime import datetime, timezone
from decimal import Decimal
from typing import Any, Dict, Literal, Optional
from agent.model_metadata import fetch_model_metadata
from agent.model_metadata import fetch_endpoint_model_metadata, fetch_model_metadata
DEFAULT_PRICING = {"input": 0.0, "output": 0.0}
@@ -335,8 +335,21 @@ def _lookup_official_docs_pricing(route: BillingRoute) -> Optional[PricingEntry]
def _openrouter_pricing_entry(route: BillingRoute) -> Optional[PricingEntry]:
metadata = fetch_model_metadata()
model_id = route.model
return _pricing_entry_from_metadata(
fetch_model_metadata(),
route.model,
source_url="https://openrouter.ai/docs/api/api-reference/models/get-models",
pricing_version="openrouter-models-api",
)
def _pricing_entry_from_metadata(
metadata: Dict[str, Dict[str, Any]],
model_id: str,
*,
source_url: str,
pricing_version: str,
) -> Optional[PricingEntry]:
if model_id not in metadata:
return None
pricing = metadata[model_id].get("pricing") or {}
@@ -355,6 +368,7 @@ def _openrouter_pricing_entry(route: BillingRoute) -> Optional[PricingEntry]:
)
if prompt is None and completion is None and request is None:
return None
def _per_token_to_per_million(value: Optional[Decimal]) -> Optional[Decimal]:
if value is None:
return None
@@ -367,8 +381,8 @@ def _openrouter_pricing_entry(route: BillingRoute) -> Optional[PricingEntry]:
cache_write_cost_per_million=_per_token_to_per_million(cache_write),
request_cost=request,
source="provider_models_api",
source_url="https://openrouter.ai/docs/api/api-reference/models/get-models",
pricing_version="openrouter-models-api",
source_url=source_url,
pricing_version=pricing_version,
fetched_at=_UTC_NOW(),
)
@@ -377,6 +391,7 @@ def get_pricing_entry(
model_name: str,
provider: Optional[str] = None,
base_url: Optional[str] = None,
api_key: Optional[str] = None,
) -> Optional[PricingEntry]:
route = resolve_billing_route(model_name, provider=provider, base_url=base_url)
if route.billing_mode == "subscription_included":
@@ -390,6 +405,15 @@ def get_pricing_entry(
)
if route.provider == "openrouter":
return _openrouter_pricing_entry(route)
if route.base_url:
entry = _pricing_entry_from_metadata(
fetch_endpoint_model_metadata(route.base_url, api_key=api_key or ""),
route.model,
source_url=f"{route.base_url.rstrip('/')}/models",
pricing_version="openai-compatible-models-api",
)
if entry:
return entry
return _lookup_official_docs_pricing(route)
@@ -460,6 +484,7 @@ def estimate_usage_cost(
*,
provider: Optional[str] = None,
base_url: Optional[str] = None,
api_key: Optional[str] = None,
) -> CostResult:
route = resolve_billing_route(model_name, provider=provider, base_url=base_url)
if route.billing_mode == "subscription_included":
@@ -471,7 +496,7 @@ def estimate_usage_cost(
pricing_version="included-route",
)
entry = get_pricing_entry(model_name, provider=provider, base_url=base_url)
entry = get_pricing_entry(model_name, provider=provider, base_url=base_url, api_key=api_key)
if not entry:
return CostResult(amount_usd=None, status="unknown", source="none", label="n/a")
@@ -536,6 +561,7 @@ def has_known_pricing(
model_name: str,
provider: Optional[str] = None,
base_url: Optional[str] = None,
api_key: Optional[str] = None,
) -> bool:
"""Check whether we have pricing data for this model+route.
@@ -545,7 +571,7 @@ def has_known_pricing(
route = resolve_billing_route(model_name, provider=provider, base_url=base_url)
if route.billing_mode == "subscription_included":
return True
entry = get_pricing_entry(model_name, provider=provider, base_url=base_url)
entry = get_pricing_entry(model_name, provider=provider, base_url=base_url, api_key=api_key)
return entry is not None
@@ -553,13 +579,14 @@ def get_pricing(
model_name: str,
provider: Optional[str] = None,
base_url: Optional[str] = None,
api_key: Optional[str] = None,
) -> Dict[str, float]:
"""Backward-compatible thin wrapper for legacy callers.
Returns only non-cache input/output fields when a pricing entry exists.
Unknown routes return zeroes.
"""
entry = get_pricing_entry(model_name, provider=provider, base_url=base_url)
entry = get_pricing_entry(model_name, provider=provider, base_url=base_url, api_key=api_key)
if not entry:
return {"input": 0.0, "output": 0.0}
return {
@@ -575,6 +602,7 @@ def estimate_cost_usd(
*,
provider: Optional[str] = None,
base_url: Optional[str] = None,
api_key: Optional[str] = None,
) -> float:
"""Backward-compatible helper for legacy callers.
@@ -586,6 +614,7 @@ def estimate_cost_usd(
CanonicalUsage(input_tokens=input_tokens, output_tokens=output_tokens),
provider=provider,
base_url=base_url,
api_key=api_key,
)
return float(result.amount_usd or _ZERO)
+68 -24
View File
@@ -1044,11 +1044,17 @@ class HermesCLI:
# env vars would stomp each other.
_model_config = CLI_CONFIG.get("model", {})
_config_model = _model_config.get("default", "") if isinstance(_model_config, dict) else (_model_config or "")
self.model = model or _config_model or "anthropic/claude-opus-4.6"
_FALLBACK_MODEL = "anthropic/claude-opus-4.6"
self.model = model or _config_model or _FALLBACK_MODEL
# Track whether model was explicitly chosen by the user or fell back
# to the global default. Provider-specific normalisation may override
# the default silently but should warn when overriding an explicit choice.
self._model_is_default = not model
# A config model that matches the global fallback is NOT considered an
# explicit choice — the user just never changed it. But a config model
# like "gpt-5.3-codex" IS explicit and must be preserved.
self._model_is_default = not model and (
not _config_model or _config_model == _FALLBACK_MODEL
)
self._explicit_api_key = api_key
self._explicit_base_url = base_url
@@ -1063,6 +1069,8 @@ class HermesCLI:
self._provider_source: Optional[str] = None
self.provider = self.requested_provider
self.api_mode = "chat_completions"
self.acp_command: Optional[str] = None
self.acp_args: list[str] = []
self.base_url = (
base_url
or os.getenv("OPENAI_BASE_URL")
@@ -1209,6 +1217,9 @@ class HermesCLI:
self._voice_tts_done = threading.Event()
self._voice_tts_done.set()
# Status bar visibility (toggled via /statusbar)
self._status_bar_visible = True
# Background task tracking: {task_id: threading.Thread}
self._background_tasks: Dict[str, threading.Thread] = {}
self._background_task_counter = 0
@@ -1316,6 +1327,8 @@ class HermesCLI:
return f"{self.model if getattr(self, 'model', None) else 'Hermes'}"
def _get_status_bar_fragments(self):
if not self._status_bar_visible:
return []
try:
snapshot = self._get_status_bar_snapshot()
width = shutil.get_terminal_size((80, 24)).columns
@@ -1374,27 +1387,35 @@ class HermesCLI:
return [("class:status-bar", f" {self._build_status_bar_text()} ")]
def _normalize_model_for_provider(self, resolved_provider: str) -> bool:
"""Strip provider prefixes and swap the default model for Codex.
When the resolved provider is ``openai-codex``:
1. Strip any ``provider/`` prefix (the Codex Responses API only
accepts bare model slugs like ``gpt-5.4``, not ``openai/gpt-5.4``).
2. If the active model is still the *untouched default* (user never
explicitly chose a model), replace it with a Codex-compatible
default so the first session doesn't immediately error.
If the user explicitly chose a model *any* model we trust them
and let the API be the judge. No allowlists, no slug checks.
Returns True when the active model was changed.
"""
if resolved_provider != "openai-codex":
return False
"""Normalize provider-specific model IDs and routing."""
current_model = (self.model or "").strip()
changed = False
if resolved_provider == "copilot":
try:
from hermes_cli.models import copilot_model_api_mode, normalize_copilot_model_id
canonical = normalize_copilot_model_id(current_model, api_key=self.api_key)
if canonical and canonical != current_model:
if not self._model_is_default:
self.console.print(
f"[yellow]⚠️ Normalized Copilot model '{current_model}' to '{canonical}'.[/]"
)
self.model = canonical
current_model = canonical
changed = True
resolved_mode = copilot_model_api_mode(current_model, api_key=self.api_key)
if resolved_mode != self.api_mode:
self.api_mode = resolved_mode
changed = True
except Exception:
pass
return changed
if resolved_provider != "openai-codex":
return False
# 1. Strip provider prefix ("openai/gpt-5.4" → "gpt-5.4")
if "/" in current_model:
slug = current_model.split("/", 1)[1]
@@ -1670,6 +1691,8 @@ class HermesCLI:
base_url = runtime.get("base_url")
resolved_provider = runtime.get("provider", "openrouter")
resolved_api_mode = runtime.get("api_mode", self.api_mode)
resolved_acp_command = runtime.get("command")
resolved_acp_args = list(runtime.get("args") or [])
if not isinstance(api_key, str) or not api_key:
self.console.print("[bold red]Provider resolver returned an empty API key.[/]")
return False
@@ -1681,9 +1704,13 @@ class HermesCLI:
routing_changed = (
resolved_provider != self.provider
or resolved_api_mode != self.api_mode
or resolved_acp_command != self.acp_command
or resolved_acp_args != self.acp_args
)
self.provider = resolved_provider
self.api_mode = resolved_api_mode
self.acp_command = resolved_acp_command
self.acp_args = resolved_acp_args
self._provider_source = runtime.get("source")
self.api_key = api_key
self.base_url = base_url
@@ -1713,6 +1740,8 @@ class HermesCLI:
"base_url": self.base_url,
"provider": self.provider,
"api_mode": self.api_mode,
"command": self.acp_command,
"args": list(self.acp_args or []),
},
)
@@ -1781,6 +1810,8 @@ class HermesCLI:
"base_url": self.base_url,
"provider": self.provider,
"api_mode": self.api_mode,
"command": self.acp_command,
"args": list(self.acp_args or []),
}
effective_model = model_override or self.model
self.agent = AIAgent(
@@ -1789,6 +1820,8 @@ class HermesCLI:
base_url=runtime.get("base_url"),
provider=runtime.get("provider"),
api_mode=runtime.get("api_mode"),
acp_command=runtime.get("command"),
acp_args=runtime.get("args"),
max_iterations=self.max_turns,
enabled_toolsets=self.enabled_toolsets,
verbose_logging=self.verbose,
@@ -1825,6 +1858,8 @@ class HermesCLI:
runtime.get("provider"),
runtime.get("base_url"),
runtime.get("api_mode"),
runtime.get("command"),
tuple(runtime.get("args") or ()),
)
if self._pending_title and self._session_db:
@@ -3545,6 +3580,10 @@ class HermesCLI:
self._handle_skills_command(cmd_original)
elif canonical == "platforms":
self._show_gateway_status()
elif canonical == "statusbar":
self._status_bar_visible = not self._status_bar_visible
state = "visible" if self._status_bar_visible else "hidden"
self.console.print(f" Status bar {state}")
elif canonical == "verbose":
self._toggle_verbose()
elif canonical == "reasoning":
@@ -3560,7 +3599,7 @@ class HermesCLI:
elif canonical == "reload-mcp":
with self._busy_command(self._slow_command_status(cmd_original)):
self._reload_mcp()
elif _base_word == "browser":
elif canonical == "browser":
self._handle_browser_command(cmd_original)
elif canonical == "plugins":
try:
@@ -3750,6 +3789,8 @@ class HermesCLI:
base_url=turn_route["runtime"].get("base_url"),
provider=turn_route["runtime"].get("provider"),
api_mode=turn_route["runtime"].get("api_mode"),
acp_command=turn_route["runtime"].get("command"),
acp_args=turn_route["runtime"].get("args"),
max_iterations=self.max_turns,
enabled_toolsets=self.enabled_toolsets,
quiet_mode=True,
@@ -6581,9 +6622,12 @@ class HermesCLI:
filter=Condition(lambda: cli_ref._voice_mode),
)
status_bar = Window(
content=FormattedTextControl(lambda: cli_ref._get_status_bar_fragments()),
height=1,
status_bar = ConditionalContainer(
Window(
content=FormattedTextControl(lambda: cli_ref._get_status_bar_fragments()),
height=1,
),
filter=Condition(lambda: cli_ref._status_bar_visible),
)
# Layout: interactive prompt widgets + ruled input at bottom.
+49 -4
View File
@@ -34,6 +34,7 @@ HERMES_DIR = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
CRON_DIR = HERMES_DIR / "cron"
JOBS_FILE = CRON_DIR / "jobs.json"
OUTPUT_DIR = CRON_DIR / "output"
ONESHOT_GRACE_SECONDS = 120
def _normalize_skill_list(skill: Optional[str] = None, skills: Optional[Any] = None) -> List[str]:
@@ -220,6 +221,33 @@ def _ensure_aware(dt: datetime) -> datetime:
return dt.astimezone(target_tz)
def _recoverable_oneshot_run_at(
schedule: Dict[str, Any],
now: datetime,
*,
last_run_at: Optional[str] = None,
) -> Optional[str]:
"""Return a one-shot run time if it is still eligible to fire.
One-shot jobs get a small grace window so jobs created a few seconds after
their requested minute still run on the next tick. Once a one-shot has
already run, it is never eligible again.
"""
if schedule.get("kind") != "once":
return None
if last_run_at:
return None
run_at = schedule.get("run_at")
if not run_at:
return None
run_at_dt = _ensure_aware(datetime.fromisoformat(run_at))
if run_at_dt >= now - timedelta(seconds=ONESHOT_GRACE_SECONDS):
return run_at
return None
def compute_next_run(schedule: Dict[str, Any], last_run_at: Optional[str] = None) -> Optional[str]:
"""
Compute the next run time for a schedule.
@@ -229,9 +257,7 @@ def compute_next_run(schedule: Dict[str, Any], last_run_at: Optional[str] = None
now = _hermes_now()
if schedule["kind"] == "once":
run_at = _ensure_aware(datetime.fromisoformat(schedule["run_at"]))
# If in the future, return it; if in the past, no more runs
return schedule["run_at"] if run_at > now else None
return _recoverable_oneshot_run_at(schedule, now, last_run_at=last_run_at)
elif schedule["kind"] == "interval":
minutes = schedule["minutes"]
@@ -555,7 +581,26 @@ def get_due_jobs() -> List[Dict[str, Any]]:
next_run = job.get("next_run_at")
if not next_run:
continue
recovered_next = _recoverable_oneshot_run_at(
job.get("schedule", {}),
now,
last_run_at=job.get("last_run_at"),
)
if not recovered_next:
continue
job["next_run_at"] = recovered_next
next_run = recovered_next
logger.info(
"Job '%s' had no next_run_at; recovering one-shot run at %s",
job.get("name", job["id"]),
recovered_next,
)
for rj in raw_jobs:
if rj["id"] == job["id"]:
rj["next_run_at"] = recovered_next
needs_save = True
break
next_run_dt = _ensure_aware(datetime.fromisoformat(next_run))
if next_run_dt <= now:
+29 -2
View File
@@ -37,6 +37,11 @@ sys.path.insert(0, str(Path(__file__).parent.parent))
from cron.jobs import get_due_jobs, mark_job_run, save_job_output
# Sentinel: when a cron agent has nothing new to report, it can start its
# response with this marker to suppress delivery. Output is still saved
# locally for audit.
SILENT_MARKER = "[SILENT]"
# Resolve Hermes home directory (respects HERMES_HOME override)
_hermes_home = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
@@ -180,6 +185,17 @@ def _build_job_prompt(job: dict) -> str:
"""Build the effective prompt for a cron job, optionally loading one or more skills first."""
prompt = job.get("prompt", "")
skills = job.get("skills")
# Always prepend [SILENT] guidance so the cron agent can suppress
# delivery when it has nothing new or noteworthy to report.
silent_hint = (
"[SYSTEM: If you have nothing new or noteworthy to report, respond "
"with exactly \"[SILENT]\" (optionally followed by a brief internal "
"note). This suppresses delivery to the user while still saving "
"output locally. Only use [SILENT] when there are genuinely no "
"changes worth reporting.]\n\n"
)
prompt = silent_hint + prompt
if skills is None:
legacy = job.get("skill")
skills = [legacy] if legacy else []
@@ -343,6 +359,8 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
"base_url": runtime.get("base_url"),
"provider": runtime.get("provider"),
"api_mode": runtime.get("api_mode"),
"command": runtime.get("command"),
"args": list(runtime.get("args") or []),
},
)
@@ -352,6 +370,8 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
base_url=turn_route["runtime"].get("base_url"),
provider=turn_route["runtime"].get("provider"),
api_mode=turn_route["runtime"].get("api_mode"),
acp_command=turn_route["runtime"].get("command"),
acp_args=turn_route["runtime"].get("args"),
max_iterations=max_iterations,
reasoning_config=reasoning_config,
prefill_messages=prefill_messages,
@@ -480,9 +500,16 @@ def tick(verbose: bool = True) -> int:
if verbose:
logger.info("Output saved to: %s", output_file)
# Deliver the final response to the origin/target chat
# Deliver the final response to the origin/target chat.
# If the agent responded with [SILENT], skip delivery (but
# output is already saved above). Failed jobs always deliver.
deliver_content = final_response if success else f"⚠️ Cron job '{job.get('name', job['id'])}' failed:\n{error}"
if deliver_content:
should_deliver = bool(deliver_content)
if should_deliver and success and deliver_content.strip().upper().startswith(SILENT_MARKER):
logger.info("Job '%s': agent returned %s — skipping delivery", job["id"], SILENT_MARKER)
should_deliver = False
if should_deliver:
try:
_deliver_result(job, deliver_content)
except Exception as de:
+67 -7
View File
@@ -32,6 +32,15 @@ def _coerce_bool(value: Any, default: bool = True) -> bool:
return bool(value)
def _normalize_unauthorized_dm_behavior(value: Any, default: str = "pair") -> str:
"""Normalize unauthorized DM behavior to a supported value."""
if isinstance(value, str):
normalized = value.strip().lower()
if normalized in {"pair", "ignore"}:
return normalized
return default
class Platform(Enum):
"""Supported messaging platforms."""
LOCAL = "local"
@@ -215,6 +224,9 @@ class GatewayConfig:
# Session isolation in shared chats
group_sessions_per_user: bool = True # Isolate group/channel sessions per participant when user IDs are available
# Unauthorized DM policy
unauthorized_dm_behavior: str = "pair" # "pair" or "ignore"
# Streaming configuration
streaming: StreamingConfig = field(default_factory=StreamingConfig)
@@ -289,6 +301,7 @@ class GatewayConfig:
"always_log_local": self.always_log_local,
"stt_enabled": self.stt_enabled,
"group_sessions_per_user": self.group_sessions_per_user,
"unauthorized_dm_behavior": self.unauthorized_dm_behavior,
"streaming": self.streaming.to_dict(),
}
@@ -331,6 +344,10 @@ class GatewayConfig:
stt_enabled = data.get("stt", {}).get("enabled") if isinstance(data.get("stt"), dict) else None
group_sessions_per_user = data.get("group_sessions_per_user")
unauthorized_dm_behavior = _normalize_unauthorized_dm_behavior(
data.get("unauthorized_dm_behavior"),
"pair",
)
return cls(
platforms=platforms,
@@ -343,9 +360,21 @@ class GatewayConfig:
always_log_local=data.get("always_log_local", True),
stt_enabled=_coerce_bool(stt_enabled, True),
group_sessions_per_user=_coerce_bool(group_sessions_per_user, True),
unauthorized_dm_behavior=unauthorized_dm_behavior,
streaming=StreamingConfig.from_dict(data.get("streaming", {})),
)
def get_unauthorized_dm_behavior(self, platform: Optional[Platform] = None) -> str:
"""Return the effective unauthorized-DM behavior for a platform."""
if platform:
platform_cfg = self.platforms.get(platform)
if platform_cfg and "unauthorized_dm_behavior" in platform_cfg.extra:
return _normalize_unauthorized_dm_behavior(
platform_cfg.extra.get("unauthorized_dm_behavior"),
self.unauthorized_dm_behavior,
)
return self.unauthorized_dm_behavior
def load_gateway_config() -> GatewayConfig:
"""
@@ -416,6 +445,44 @@ def load_gateway_config() -> GatewayConfig:
if "always_log_local" in yaml_cfg:
gw_data["always_log_local"] = yaml_cfg["always_log_local"]
if "unauthorized_dm_behavior" in yaml_cfg:
gw_data["unauthorized_dm_behavior"] = _normalize_unauthorized_dm_behavior(
yaml_cfg.get("unauthorized_dm_behavior"),
"pair",
)
# Bridge per-platform settings from config.yaml into gw_data
platforms_data = gw_data.setdefault("platforms", {})
if not isinstance(platforms_data, dict):
platforms_data = {}
gw_data["platforms"] = platforms_data
for plat in Platform:
if plat == Platform.LOCAL:
continue
platform_cfg = yaml_cfg.get(plat.value)
if not isinstance(platform_cfg, dict):
continue
# Collect bridgeable keys from this platform section
bridged = {}
if "unauthorized_dm_behavior" in platform_cfg:
bridged["unauthorized_dm_behavior"] = _normalize_unauthorized_dm_behavior(
platform_cfg.get("unauthorized_dm_behavior"),
gw_data.get("unauthorized_dm_behavior", "pair"),
)
if "reply_prefix" in platform_cfg:
bridged["reply_prefix"] = platform_cfg["reply_prefix"]
if not bridged:
continue
plat_data = platforms_data.setdefault(plat.value, {})
if not isinstance(plat_data, dict):
plat_data = {}
platforms_data[plat.value] = plat_data
extra = plat_data.setdefault("extra", {})
if not isinstance(extra, dict):
extra = {}
plat_data["extra"] = extra
extra.update(bridged)
# Discord settings → env vars (env vars take precedence)
discord_cfg = yaml_cfg.get("discord", {})
if isinstance(discord_cfg, dict):
@@ -428,13 +495,6 @@ def load_gateway_config() -> GatewayConfig:
os.environ["DISCORD_FREE_RESPONSE_CHANNELS"] = str(frc)
if "auto_thread" in discord_cfg and not os.getenv("DISCORD_AUTO_THREAD"):
os.environ["DISCORD_AUTO_THREAD"] = str(discord_cfg["auto_thread"]).lower()
# Bridge whatsapp settings from config.yaml into platform config
whatsapp_cfg = yaml_cfg.get("whatsapp", {})
if isinstance(whatsapp_cfg, dict) and "reply_prefix" in whatsapp_cfg:
if Platform.WHATSAPP not in config.platforms:
config.platforms[Platform.WHATSAPP] = PlatformConfig()
config.platforms[Platform.WHATSAPP].extra["reply_prefix"] = whatsapp_cfg["reply_prefix"]
except Exception:
pass
+16
View File
@@ -1099,6 +1099,22 @@ class BasePlatformAdapter(ABC):
print(f"[{self.name}] Error handling message: {e}")
import traceback
traceback.print_exc()
# Send the error to the user so they aren't left with radio silence
try:
error_type = type(e).__name__
error_detail = str(e)[:300] if str(e) else "no details available"
_thread_metadata = {"thread_id": event.source.thread_id} if event.source.thread_id else None
await self.send(
chat_id=event.source.chat_id,
content=(
f"Sorry, I encountered an error ({error_type}).\n"
f"{error_detail}\n"
"Try again or use /reset to start a fresh session."
),
metadata=_thread_metadata,
)
except Exception:
pass # Last resort — don't let error reporting crash the handler
finally:
# Stop typing indicator
typing_task.cancel()
+6 -26
View File
@@ -1364,16 +1364,17 @@ class DiscordAdapter(BasePlatformAdapter):
self,
interaction: discord.Interaction,
command_text: str,
followup_msg: str = "Done~",
followup_msg: str | None = None,
) -> None:
"""Common handler for simple slash commands that dispatch a command string."""
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, command_text)
await self.handle_message(event)
try:
await interaction.followup.send(followup_msg, ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
if followup_msg:
try:
await interaction.followup.send(followup_msg, ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
def _register_slash_commands(self) -> None:
"""Register Discord slash commands on the command tree."""
@@ -1382,19 +1383,6 @@ class DiscordAdapter(BasePlatformAdapter):
tree = self._client.tree
@tree.command(name="ask", description="Ask Hermes a question")
@discord.app_commands.describe(question="Your question for Hermes")
async def slash_ask(interaction: discord.Interaction, question: str):
await interaction.response.defer()
event = self._build_slash_event(interaction, question)
await self.handle_message(event)
# The response is sent via the normal send() flow
# Send a followup to close the interaction if needed
try:
await interaction.followup.send("Processing complete~", ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
@tree.command(name="new", description="Start a new conversation")
async def slash_new(interaction: discord.Interaction):
await self._run_simple_slash(interaction, "/reset", "New conversation started~")
@@ -1414,10 +1402,6 @@ class DiscordAdapter(BasePlatformAdapter):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, f"/reasoning {effort}".strip())
await self.handle_message(event)
try:
await interaction.followup.send("Done~", ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
@tree.command(name="personality", description="Set a personality")
@discord.app_commands.describe(name="Personality name. Leave empty to list available.")
@@ -1493,10 +1477,6 @@ class DiscordAdapter(BasePlatformAdapter):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, f"/voice {mode}".strip())
await self.handle_message(event)
try:
await interaction.followup.send("Done~", ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
@tree.command(name="update", description="Update Hermes Agent to the latest version")
async def slash_update(interaction: discord.Interaction):
+1 -1
View File
@@ -635,7 +635,7 @@ class MatrixAdapter(BasePlatformAdapter):
source=source,
raw_message=getattr(event, "source", {}),
message_id=event.event_id,
reply_to=reply_to,
reply_to_message_id=reply_to,
)
await self.handle_message(msg_event)
+96 -11
View File
@@ -242,6 +242,8 @@ def _resolve_runtime_agent_kwargs() -> dict:
"base_url": runtime.get("base_url"),
"provider": runtime.get("provider"),
"api_mode": runtime.get("api_mode"),
"command": runtime.get("command"),
"args": list(runtime.get("args") or []),
}
@@ -432,6 +434,16 @@ class GatewayRunner:
for session_key in list(managers.keys()):
self._shutdown_gateway_honcho(session_key)
# -- Setup skill availability ----------------------------------------
def _has_setup_skill(self) -> bool:
"""Check if the hermes-agent-setup skill is installed."""
try:
from tools.skill_manager_tool import _find_skill
return _find_skill("hermes-agent-setup") is not None
except Exception:
return False
# -- Voice mode persistence ------------------------------------------
_VOICE_MODE_PATH = _hermes_home / "gateway_voice_mode.json"
@@ -601,6 +613,8 @@ class GatewayRunner:
"base_url": runtime_kwargs.get("base_url"),
"provider": runtime_kwargs.get("provider"),
"api_mode": runtime_kwargs.get("api_mode"),
"command": runtime_kwargs.get("command"),
"args": list(runtime_kwargs.get("args") or []),
}
return resolve_turn_route(user_message, getattr(self, "_smart_model_routing", {}), primary)
@@ -1247,6 +1261,13 @@ class GatewayRunner:
if "@" in user_id:
check_ids.add(user_id.split("@")[0])
return bool(check_ids & allowed_ids)
def _get_unauthorized_dm_behavior(self, platform: Optional[Platform]) -> str:
"""Return how unauthorized DMs should be handled for a platform."""
config = getattr(self, "config", None)
if config and hasattr(config, "get_unauthorized_dm_behavior"):
return config.get_unauthorized_dm_behavior(platform)
return "pair"
async def _handle_message(self, event: MessageEvent) -> Optional[str]:
"""
@@ -1267,7 +1288,7 @@ class GatewayRunner:
if not self._is_user_authorized(source):
logger.warning("Unauthorized user: %s (%s) on %s", source.user_id, source.user_name, source.platform.value)
# In DMs: offer pairing code. In groups: silently ignore.
if source.chat_type == "dm":
if source.chat_type == "dm" and self._get_unauthorized_dm_behavior(source.platform) == "pair":
platform_name = source.platform.value if source.platform else "unknown"
code = self.pairing_store.generate_code(
platform_name, source.user_id, source.user_name or ""
@@ -1870,6 +1891,37 @@ class GatewayRunner:
message_text = await self._enrich_message_with_transcription(
message_text, audio_paths
)
# If STT failed, send a direct message to the user so they
# know voice isn't configured — don't rely on the agent to
# relay the error clearly.
_stt_fail_markers = (
"No STT provider",
"STT is disabled",
"can't listen",
"VOICE_TOOLS_OPENAI_KEY",
)
if any(m in message_text for m in _stt_fail_markers):
_stt_adapter = self.adapters.get(source.platform)
_stt_meta = {"thread_id": source.thread_id} if source.thread_id else None
if _stt_adapter:
try:
_stt_msg = (
"🎤 I received your voice message but can't transcribe it — "
"no speech-to-text provider is configured.\n\n"
"To enable voice: install faster-whisper "
"(`pip install faster-whisper` in the Hermes venv) "
"and set `stt.enabled: true` in config.yaml, "
"then /restart the gateway."
)
# Point to setup skill if it's installed
if self._has_setup_skill():
_stt_msg += "\n\nFor full setup instructions, type: `/skill hermes-agent-setup`"
await _stt_adapter.send(
source.chat_id, _stt_msg,
metadata=_stt_meta,
)
except Exception:
pass
# -----------------------------------------------------------------
# Enrich document messages with context notes for the agent
@@ -2122,23 +2174,41 @@ class GatewayRunner:
error_detail = str(e)[:300] if str(e) else "no details available"
status_hint = ""
status_code = getattr(e, "status_code", None)
_hist_len = len(history) if 'history' in locals() else 0
if status_code == 401:
status_hint = " Check your API key or run `claude /login` to refresh OAuth credentials."
elif status_code == 429:
status_hint = " You are being rate-limited. Please wait a moment and try again."
# Check if this is a plan usage limit (resets on a schedule) vs a transient rate limit
_err_body = getattr(e, "response", None)
_err_json = {}
try:
if _err_body is not None:
_err_json = _err_body.json().get("error", {})
except Exception:
pass
if _err_json.get("type") == "usage_limit_reached":
_resets_in = _err_json.get("resets_in_seconds")
if _resets_in and _resets_in > 0:
import math
_hours = math.ceil(_resets_in / 3600)
status_hint = f" Your plan's usage limit has been reached. It resets in ~{_hours}h."
else:
status_hint = " Your plan's usage limit has been reached. Please wait until it resets."
else:
status_hint = " You are being rate-limited. Please wait a moment and try again."
elif status_code == 529:
status_hint = " The API is temporarily overloaded. Please try again shortly."
elif status_code == 400:
# 400 with a large session is almost always a context overflow.
# Give specific guidance instead of a generic error. (#1630)
_hist_len = len(history) if 'history' in locals() else 0
elif status_code in (400, 500):
# 400 with a large session is context overflow.
# 500 with a large session often means the payload is too large
# for the API to process — treat it the same way.
if _hist_len > 50:
return (
"⚠️ Session too large for the model's context window.\n"
"Use /compact to compress the conversation, or "
"/reset to start fresh."
)
else:
elif status_code == 400:
status_hint = " The request was rejected by the API."
return (
f"Sorry, I encountered an error ({error_type}).\n"
@@ -3921,7 +3991,13 @@ class GatewayRunner:
The enriched message string with transcriptions prepended.
"""
if not getattr(self.config, "stt_enabled", True):
disabled_note = "[The user sent voice message(s), but transcription is disabled in config.]"
disabled_note = "[The user sent voice message(s), but transcription is disabled in config."
if self._has_setup_skill():
disabled_note += (
" You have a skill called hermes-agent-setup that can help "
"users configure Hermes features including voice, tools, and more."
)
disabled_note += "]"
if user_text:
return f"{disabled_note}\n\n{user_text}"
return disabled_note
@@ -3948,11 +4024,20 @@ class GatewayRunner:
"No STT provider" in error
or error.startswith("Neither VOICE_TOOLS_OPENAI_KEY nor OPENAI_API_KEY is set")
):
enriched_parts.append(
_no_stt_note = (
"[The user sent a voice message but I can't listen "
"to it right now~ No STT provider is configured "
"(';w;') Let them know!]"
"to it right now — no STT provider is configured. "
"A direct message has already been sent to the user "
"with setup instructions."
)
if self._has_setup_skill():
_no_stt_note += (
" You have a skill called hermes-agent-setup "
"that can help users configure Hermes features "
"including voice, tools, and more."
)
_no_stt_note += "]"
enriched_parts.append(_no_stt_note)
else:
enriched_parts.append(
"[The user sent a voice message but I had trouble "
+2
View File
@@ -87,6 +87,7 @@ def _looks_like_gateway_process(pid: int) -> bool:
patterns = (
"hermes_cli.main gateway",
"hermes_cli/main.py gateway",
"hermes gateway",
"gateway/run.py",
)
@@ -105,6 +106,7 @@ def _record_looks_like_gateway(record: dict[str, Any]) -> bool:
cmdline = " ".join(str(part) for part in argv)
patterns = (
"hermes_cli.main gateway",
"hermes_cli/main.py gateway",
"hermes gateway",
"gateway/run.py",
)
+2 -2
View File
@@ -11,5 +11,5 @@ Provides subcommands for:
- hermes cron - Manage cron jobs
"""
__version__ = "0.3.0"
__release_date__ = "2026.3.17"
__version__ = "0.4.0"
__release_date__ = "2026.3.18"
+163 -12
View File
@@ -19,6 +19,7 @@ import json
import logging
import os
import shutil
import shlex
import stat
import base64
import hashlib
@@ -66,6 +67,8 @@ DEFAULT_AGENT_KEY_MIN_TTL_SECONDS = 30 * 60 # 30 minutes
ACCESS_TOKEN_REFRESH_SKEW_SECONDS = 120 # refresh 2 min before expiry
DEVICE_AUTH_POLL_INTERVAL_CAP_SECONDS = 1 # poll at most every 1s
DEFAULT_CODEX_BASE_URL = "https://chatgpt.com/backend-api/codex"
DEFAULT_GITHUB_MODELS_BASE_URL = "https://api.githubcopilot.com"
DEFAULT_COPILOT_ACP_BASE_URL = "acp://copilot"
CODEX_OAUTH_CLIENT_ID = "app_EMoamEEZ73f0CkXaXp7hrann"
CODEX_OAUTH_TOKEN_URL = "https://auth.openai.com/oauth/token"
CODEX_ACCESS_TOKEN_REFRESH_SKEW_SECONDS = 120
@@ -108,6 +111,20 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
auth_type="oauth_external",
inference_base_url=DEFAULT_CODEX_BASE_URL,
),
"copilot": ProviderConfig(
id="copilot",
name="GitHub Copilot",
auth_type="api_key",
inference_base_url=DEFAULT_GITHUB_MODELS_BASE_URL,
api_key_env_vars=("COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN"),
),
"copilot-acp": ProviderConfig(
id="copilot-acp",
name="GitHub Copilot ACP",
auth_type="external_process",
inference_base_url=DEFAULT_COPILOT_ACP_BASE_URL,
base_url_env_var="COPILOT_ACP_BASE_URL",
),
"zai": ProviderConfig(
id="zai",
name="Z.AI / GLM",
@@ -222,6 +239,70 @@ def _resolve_kimi_base_url(api_key: str, default_url: str, env_override: str) ->
return default_url
def _gh_cli_candidates() -> list[str]:
"""Return candidate ``gh`` binary paths, including common Homebrew installs."""
candidates: list[str] = []
resolved = shutil.which("gh")
if resolved:
candidates.append(resolved)
for candidate in (
"/opt/homebrew/bin/gh",
"/usr/local/bin/gh",
str(Path.home() / ".local" / "bin" / "gh"),
):
if candidate in candidates:
continue
if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
candidates.append(candidate)
return candidates
def _try_gh_cli_token() -> Optional[str]:
"""Return a token from ``gh auth token`` when the GitHub CLI is available."""
for gh_path in _gh_cli_candidates():
try:
result = subprocess.run(
[gh_path, "auth", "token"],
capture_output=True,
text=True,
timeout=5,
)
except (FileNotFoundError, subprocess.TimeoutExpired) as exc:
logger.debug("gh CLI token lookup failed (%s): %s", gh_path, exc)
continue
if result.returncode == 0 and result.stdout.strip():
return result.stdout.strip()
return None
def _resolve_api_key_provider_secret(
provider_id: str, pconfig: ProviderConfig
) -> tuple[str, str]:
"""Resolve an API-key provider's token and indicate where it came from."""
if provider_id == "copilot":
# Use the dedicated copilot auth module for proper token validation
try:
from hermes_cli.copilot_auth import resolve_copilot_token
token, source = resolve_copilot_token()
if token:
return token, source
except ValueError as exc:
logger.warning("Copilot token validation failed: %s", exc)
except Exception:
pass
return "", ""
for env_var in pconfig.api_key_env_vars:
val = os.getenv(env_var, "").strip()
if val:
return val, env_var
return "", ""
# =============================================================================
# Z.AI Endpoint Detection
# =============================================================================
@@ -572,6 +653,9 @@ def resolve_provider(
"kimi": "kimi-coding", "moonshot": "kimi-coding",
"minimax-china": "minimax-cn", "minimax_cn": "minimax-cn",
"claude": "anthropic", "claude-code": "anthropic",
"github": "copilot", "github-copilot": "copilot",
"github-models": "copilot", "github-model": "copilot",
"github-copilot-acp": "copilot-acp", "copilot-acp-agent": "copilot-acp",
"aigateway": "ai-gateway", "vercel": "ai-gateway", "vercel-ai-gateway": "ai-gateway",
"opencode": "opencode-zen", "zen": "opencode-zen",
"go": "opencode-go", "opencode-go-sub": "opencode-go",
@@ -611,6 +695,11 @@ def resolve_provider(
for pid, pconfig in PROVIDER_REGISTRY.items():
if pconfig.auth_type != "api_key":
continue
# GitHub tokens are commonly present for repo/tool access but should not
# hijack inference auto-selection unless the user explicitly chooses
# Copilot/GitHub Models as the provider.
if pid == "copilot":
continue
for env_var in pconfig.api_key_env_vars:
if os.getenv(env_var, "").strip():
return pid
@@ -1479,12 +1568,7 @@ def get_api_key_provider_status(provider_id: str) -> Dict[str, Any]:
api_key = ""
key_source = ""
for env_var in pconfig.api_key_env_vars:
val = os.getenv(env_var, "").strip()
if val:
api_key = val
key_source = env_var
break
api_key, key_source = _resolve_api_key_provider_secret(provider_id, pconfig)
env_url = ""
if pconfig.base_url_env_var:
@@ -1507,6 +1591,36 @@ def get_api_key_provider_status(provider_id: str) -> Dict[str, Any]:
}
def get_external_process_provider_status(provider_id: str) -> Dict[str, Any]:
"""Status snapshot for providers that run a local subprocess."""
pconfig = PROVIDER_REGISTRY.get(provider_id)
if not pconfig or pconfig.auth_type != "external_process":
return {"configured": False}
command = (
os.getenv("HERMES_COPILOT_ACP_COMMAND", "").strip()
or os.getenv("COPILOT_CLI_PATH", "").strip()
or "copilot"
)
raw_args = os.getenv("HERMES_COPILOT_ACP_ARGS", "").strip()
args = shlex.split(raw_args) if raw_args else ["--acp", "--stdio"]
base_url = os.getenv(pconfig.base_url_env_var, "").strip() if pconfig.base_url_env_var else ""
if not base_url:
base_url = pconfig.inference_base_url
resolved_command = shutil.which(command) if command else None
return {
"configured": bool(resolved_command or base_url.startswith("acp+tcp://")),
"provider": provider_id,
"name": pconfig.name,
"command": command,
"args": args,
"resolved_command": resolved_command,
"base_url": base_url,
"logged_in": bool(resolved_command or base_url.startswith("acp+tcp://")),
}
def get_auth_status(provider_id: Optional[str] = None) -> Dict[str, Any]:
"""Generic auth status dispatcher."""
target = provider_id or get_active_provider()
@@ -1514,6 +1628,8 @@ def get_auth_status(provider_id: Optional[str] = None) -> Dict[str, Any]:
return get_nous_auth_status()
if target == "openai-codex":
return get_codex_auth_status()
if target == "copilot-acp":
return get_external_process_provider_status(target)
# API-key providers
pconfig = PROVIDER_REGISTRY.get(target)
if pconfig and pconfig.auth_type == "api_key":
@@ -1536,12 +1652,7 @@ def resolve_api_key_provider_credentials(provider_id: str) -> Dict[str, Any]:
api_key = ""
key_source = ""
for env_var in pconfig.api_key_env_vars:
val = os.getenv(env_var, "").strip()
if val:
api_key = val
key_source = env_var
break
api_key, key_source = _resolve_api_key_provider_secret(provider_id, pconfig)
env_url = ""
if pconfig.base_url_env_var:
@@ -1562,6 +1673,46 @@ def resolve_api_key_provider_credentials(provider_id: str) -> Dict[str, Any]:
}
def resolve_external_process_provider_credentials(provider_id: str) -> Dict[str, Any]:
"""Resolve runtime details for local subprocess-backed providers."""
pconfig = PROVIDER_REGISTRY.get(provider_id)
if not pconfig or pconfig.auth_type != "external_process":
raise AuthError(
f"Provider '{provider_id}' is not an external-process provider.",
provider=provider_id,
code="invalid_provider",
)
base_url = os.getenv(pconfig.base_url_env_var, "").strip() if pconfig.base_url_env_var else ""
if not base_url:
base_url = pconfig.inference_base_url
command = (
os.getenv("HERMES_COPILOT_ACP_COMMAND", "").strip()
or os.getenv("COPILOT_CLI_PATH", "").strip()
or "copilot"
)
raw_args = os.getenv("HERMES_COPILOT_ACP_ARGS", "").strip()
args = shlex.split(raw_args) if raw_args else ["--acp", "--stdio"]
resolved_command = shutil.which(command) if command else None
if not resolved_command and not base_url.startswith("acp+tcp://"):
raise AuthError(
f"Could not find the Copilot CLI command '{command}'. "
"Install GitHub Copilot CLI or set HERMES_COPILOT_ACP_COMMAND/COPILOT_CLI_PATH.",
provider=provider_id,
code="missing_copilot_cli",
)
return {
"provider": provider_id,
"api_key": "copilot-acp",
"base_url": base_url.rstrip("/"),
"command": resolved_command or command,
"args": args,
"source": "process",
}
# =============================================================================
# External credential detection
# =============================================================================
+32 -26
View File
@@ -102,27 +102,22 @@ COMPACT_BANNER = """
# =========================================================================
def get_available_skills() -> Dict[str, List[str]]:
"""Scan ~/.hermes/skills/ and return skills grouped by category."""
import os
"""Return skills grouped by category, filtered by platform and disabled state.
hermes_home = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
skills_dir = hermes_home / "skills"
skills_by_category = {}
if not skills_dir.exists():
return skills_by_category
for skill_file in skills_dir.rglob("SKILL.md"):
rel_path = skill_file.relative_to(skills_dir)
parts = rel_path.parts
if len(parts) >= 2:
category = parts[0]
skill_name = parts[-2]
else:
category = "general"
skill_name = skill_file.parent.name
skills_by_category.setdefault(category, []).append(skill_name)
Delegates to ``_find_all_skills()`` from ``tools/skills_tool`` which already
handles platform gating (``platforms:`` frontmatter) and respects the
user's ``skills.disabled`` config list.
"""
try:
from tools.skills_tool import _find_all_skills
all_skills = _find_all_skills() # already filtered
except Exception:
return {}
skills_by_category: Dict[str, List[str]] = {}
for skill in all_skills:
category = skill.get("category") or "general"
skills_by_category.setdefault(category, []).append(skill["name"])
return skills_by_category
@@ -233,6 +228,17 @@ def _format_context_length(tokens: int) -> str:
return str(tokens)
def _display_toolset_name(toolset_name: str) -> str:
"""Normalize internal/legacy toolset identifiers for banner display."""
if not toolset_name:
return "unknown"
return (
toolset_name[:-6]
if toolset_name.endswith("_tools")
else toolset_name
)
def build_welcome_banner(console: Console, model: str, cwd: str,
tools: List[dict] = None,
enabled_toolsets: List[str] = None,
@@ -297,12 +303,12 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
for tool in tools:
tool_name = tool["function"]["name"]
toolset = get_toolset_for_tool(tool_name) or "other"
toolset = _display_toolset_name(get_toolset_for_tool(tool_name) or "other")
toolsets_dict.setdefault(toolset, []).append(tool_name)
for item in unavailable_toolsets:
toolset_id = item.get("id", item.get("name", "unknown"))
display_name = f"{toolset_id}_tools" if not toolset_id.endswith("_tools") else toolset_id
display_name = _display_toolset_name(toolset_id)
if display_name not in toolsets_dict:
toolsets_dict[display_name] = []
for tool_name in item.get("tools", []):
@@ -342,10 +348,10 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
colored_names.append(f"[{text}]{name}[/]")
tools_str = ", ".join(colored_names)
right_lines.append(f"[dim #B8860B]{toolset}:[/] {tools_str}")
right_lines.append(f"[dim {dim}]{toolset}:[/] {tools_str}")
if remaining_toolsets > 0:
right_lines.append(f"[dim #B8860B](and {remaining_toolsets} more toolsets...)[/]")
right_lines.append(f"[dim {dim}](and {remaining_toolsets} more toolsets...)[/]")
# MCP Servers section (only if configured)
try:
@@ -356,12 +362,12 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
if mcp_status:
right_lines.append("")
right_lines.append("[bold #FFBF00]MCP Servers[/]")
right_lines.append(f"[bold {accent}]MCP Servers[/]")
for srv in mcp_status:
if srv["connected"]:
right_lines.append(
f"[dim #B8860B]{srv['name']}[/] [#FFF8DC]({srv['transport']})[/] "
f"[dim #B8860B]—[/] [#FFF8DC]{srv['tools']} tool(s)[/]"
f"[dim {dim}]{srv['name']}[/] [{text}]({srv['transport']})[/] "
f"[dim {dim}]—[/] [{text}]{srv['tools']} tool(s)[/]"
)
else:
right_lines.append(
+5
View File
@@ -81,6 +81,8 @@ COMMAND_REGISTRY: list[CommandDef] = [
cli_only=True, args_hint="[text]", subcommands=("clear",)),
CommandDef("personality", "Set a predefined personality", "Configuration",
args_hint="[name]"),
CommandDef("statusbar", "Toggle the context/model status bar", "Configuration",
cli_only=True, aliases=("sb",)),
CommandDef("verbose", "Cycle tool progress display: off -> new -> all -> verbose",
"Configuration", cli_only=True),
CommandDef("reasoning", "Manage reasoning effort and display", "Configuration",
@@ -104,6 +106,9 @@ COMMAND_REGISTRY: list[CommandDef] = [
subcommands=("list", "add", "create", "edit", "pause", "resume", "run", "remove")),
CommandDef("reload-mcp", "Reload MCP servers from config", "Tools & Skills",
aliases=("reload_mcp",)),
CommandDef("browser", "Connect browser tools to your live Chrome via CDP", "Tools & Skills",
cli_only=True, args_hint="[connect|disconnect|status]",
subcommands=("connect", "disconnect", "status")),
CommandDef("plugins", "List installed plugins and their status",
"Tools & Skills", cli_only=True),
+295
View File
@@ -0,0 +1,295 @@
"""GitHub Copilot authentication utilities.
Implements the OAuth device code flow used by the Copilot CLI and handles
token validation/exchange for the Copilot API.
Token type support (per GitHub docs):
gho_ OAuth token (default via copilot login)
github_pat_ Fine-grained PAT (needs Copilot Requests permission)
ghu_ GitHub App token (via environment variable)
ghp_ Classic PAT NOT SUPPORTED
Credential search order (matching Copilot CLI behaviour):
1. COPILOT_GITHUB_TOKEN env var
2. GH_TOKEN env var
3. GITHUB_TOKEN env var
4. gh auth token CLI fallback
"""
from __future__ import annotations
import json
import logging
import os
import re
import shutil
import subprocess
import time
from pathlib import Path
from typing import Any, Optional
logger = logging.getLogger(__name__)
# OAuth device code flow constants (same client ID as opencode/Copilot CLI)
COPILOT_OAUTH_CLIENT_ID = "Ov23li8tweQw6odWQebz"
COPILOT_DEVICE_CODE_URL = "https://github.com/login/device/code"
COPILOT_ACCESS_TOKEN_URL = "https://github.com/login/oauth/access_token"
# Copilot API constants
COPILOT_TOKEN_EXCHANGE_URL = "https://api.github.com/copilot_internal/v2/token"
COPILOT_API_BASE_URL = "https://api.githubcopilot.com"
# Token type prefixes
_CLASSIC_PAT_PREFIX = "ghp_"
_SUPPORTED_PREFIXES = ("gho_", "github_pat_", "ghu_")
# Env var search order (matches Copilot CLI)
COPILOT_ENV_VARS = ("COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN")
# Polling constants
_DEVICE_CODE_POLL_INTERVAL = 5 # seconds
_DEVICE_CODE_POLL_SAFETY_MARGIN = 3 # seconds
def is_classic_pat(token: str) -> bool:
"""Check if a token is a classic PAT (ghp_*), which Copilot doesn't support."""
return token.strip().startswith(_CLASSIC_PAT_PREFIX)
def validate_copilot_token(token: str) -> tuple[bool, str]:
"""Validate that a token is usable with the Copilot API.
Returns (valid, message).
"""
token = token.strip()
if not token:
return False, "Empty token"
if token.startswith(_CLASSIC_PAT_PREFIX):
return False, (
"Classic Personal Access Tokens (ghp_*) are not supported by the "
"Copilot API. Use one of:\n"
" → `copilot login` or `hermes model` to authenticate via OAuth\n"
" → A fine-grained PAT (github_pat_*) with Copilot Requests permission\n"
" → `gh auth login` with the default device code flow (produces gho_* tokens)"
)
return True, "OK"
def resolve_copilot_token() -> tuple[str, str]:
"""Resolve a GitHub token suitable for Copilot API use.
Returns (token, source) where source describes where the token came from.
Raises ValueError if only a classic PAT is available.
"""
# 1. Check env vars in priority order
for env_var in COPILOT_ENV_VARS:
val = os.getenv(env_var, "").strip()
if val:
valid, msg = validate_copilot_token(val)
if not valid:
logger.warning(
"Token from %s is not supported: %s", env_var, msg
)
continue
return val, env_var
# 2. Fall back to gh auth token
token = _try_gh_cli_token()
if token:
valid, msg = validate_copilot_token(token)
if not valid:
raise ValueError(
f"Token from `gh auth token` is a classic PAT (ghp_*). {msg}"
)
return token, "gh auth token"
return "", ""
def _gh_cli_candidates() -> list[str]:
"""Return candidate ``gh`` binary paths, including common Homebrew installs."""
candidates: list[str] = []
resolved = shutil.which("gh")
if resolved:
candidates.append(resolved)
for candidate in (
"/opt/homebrew/bin/gh",
"/usr/local/bin/gh",
str(Path.home() / ".local" / "bin" / "gh"),
):
if candidate in candidates:
continue
if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
candidates.append(candidate)
return candidates
def _try_gh_cli_token() -> Optional[str]:
"""Return a token from ``gh auth token`` when the GitHub CLI is available."""
for gh_path in _gh_cli_candidates():
try:
result = subprocess.run(
[gh_path, "auth", "token"],
capture_output=True,
text=True,
timeout=5,
)
except (FileNotFoundError, subprocess.TimeoutExpired) as exc:
logger.debug("gh CLI token lookup failed (%s): %s", gh_path, exc)
continue
if result.returncode == 0 and result.stdout.strip():
return result.stdout.strip()
return None
# ─── OAuth Device Code Flow ────────────────────────────────────────────────
def copilot_device_code_login(
*,
host: str = "github.com",
timeout_seconds: float = 300,
) -> Optional[str]:
"""Run the GitHub OAuth device code flow for Copilot.
Prints instructions for the user, polls for completion, and returns
the OAuth access token on success, or None on failure/cancellation.
This replicates the flow used by opencode and the Copilot CLI.
"""
import urllib.request
import urllib.parse
domain = host.rstrip("/")
device_code_url = f"https://{domain}/login/device/code"
access_token_url = f"https://{domain}/login/oauth/access_token"
# Step 1: Request device code
data = urllib.parse.urlencode({
"client_id": COPILOT_OAUTH_CLIENT_ID,
"scope": "read:user",
}).encode()
req = urllib.request.Request(
device_code_url,
data=data,
headers={
"Accept": "application/json",
"Content-Type": "application/x-www-form-urlencoded",
"User-Agent": "HermesAgent/1.0",
},
)
try:
with urllib.request.urlopen(req, timeout=15) as resp:
device_data = json.loads(resp.read().decode())
except Exception as exc:
logger.error("Failed to initiate device authorization: %s", exc)
print(f" ✗ Failed to start device authorization: {exc}")
return None
verification_uri = device_data.get("verification_uri", "https://github.com/login/device")
user_code = device_data.get("user_code", "")
device_code = device_data.get("device_code", "")
interval = max(device_data.get("interval", _DEVICE_CODE_POLL_INTERVAL), 1)
if not device_code or not user_code:
print(" ✗ GitHub did not return a device code.")
return None
# Step 2: Show instructions
print()
print(f" Open this URL in your browser: {verification_uri}")
print(f" Enter this code: {user_code}")
print()
print(" Waiting for authorization...", end="", flush=True)
# Step 3: Poll for completion
deadline = time.time() + timeout_seconds
while time.time() < deadline:
time.sleep(interval + _DEVICE_CODE_POLL_SAFETY_MARGIN)
poll_data = urllib.parse.urlencode({
"client_id": COPILOT_OAUTH_CLIENT_ID,
"device_code": device_code,
"grant_type": "urn:ietf:params:oauth:grant-type:device_code",
}).encode()
poll_req = urllib.request.Request(
access_token_url,
data=poll_data,
headers={
"Accept": "application/json",
"Content-Type": "application/x-www-form-urlencoded",
"User-Agent": "HermesAgent/1.0",
},
)
try:
with urllib.request.urlopen(poll_req, timeout=10) as resp:
result = json.loads(resp.read().decode())
except Exception:
print(".", end="", flush=True)
continue
if result.get("access_token"):
print("")
return result["access_token"]
error = result.get("error", "")
if error == "authorization_pending":
print(".", end="", flush=True)
continue
elif error == "slow_down":
# RFC 8628: add 5 seconds to polling interval
server_interval = result.get("interval")
if isinstance(server_interval, (int, float)) and server_interval > 0:
interval = int(server_interval)
else:
interval += 5
print(".", end="", flush=True)
continue
elif error == "expired_token":
print()
print(" ✗ Device code expired. Please try again.")
return None
elif error == "access_denied":
print()
print(" ✗ Authorization was denied.")
return None
elif error:
print()
print(f" ✗ Authorization failed: {error}")
return None
print()
print(" ✗ Timed out waiting for authorization.")
return None
# ─── Copilot API Headers ───────────────────────────────────────────────────
def copilot_request_headers(
*,
is_agent_turn: bool = True,
is_vision: bool = False,
) -> dict[str, str]:
"""Build the standard headers for Copilot API requests.
Replicates the header set used by opencode and the Copilot CLI.
"""
headers: dict[str, str] = {
"Editor-Version": "vscode/1.104.1",
"User-Agent": "HermesAgent/1.0",
"Openai-Intent": "conversation-edits",
"x-initiator": "agent" if is_agent_turn else "user",
}
if is_vision:
headers["Copilot-Vision-Request"] = "true"
return headers
+45 -4
View File
@@ -31,6 +31,7 @@ def find_gateway_pids() -> list:
pids = []
patterns = [
"hermes_cli.main gateway",
"hermes_cli/main.py gateway",
"hermes gateway",
"gateway/run.py",
]
@@ -849,6 +850,46 @@ def launchd_stop():
subprocess.run(["launchctl", "stop", "ai.hermes.gateway"], check=True)
print("✓ Service stopped")
def _wait_for_gateway_exit(timeout: float = 10.0, force_after: float = 5.0):
"""Wait for the gateway process (by saved PID) to exit.
Uses the PID from the gateway.pid file not launchd labels so this
works correctly when multiple gateway instances run under separate
HERMES_HOME directories.
Args:
timeout: Total seconds to wait before giving up.
force_after: Seconds of graceful waiting before sending SIGKILL.
"""
import time
from gateway.status import get_running_pid
deadline = time.monotonic() + timeout
force_deadline = time.monotonic() + force_after
force_sent = False
while time.monotonic() < deadline:
pid = get_running_pid()
if pid is None:
return # Process exited cleanly.
if not force_sent and time.monotonic() >= force_deadline:
# Grace period expired — force-kill the specific PID.
try:
os.kill(pid, signal.SIGKILL)
print(f"⚠ Gateway PID {pid} did not exit gracefully; sent SIGKILL")
except (ProcessLookupError, PermissionError):
return # Already gone or we can't touch it.
force_sent = True
time.sleep(0.3)
# Timed out even after SIGKILL.
remaining_pid = get_running_pid()
if remaining_pid is not None:
print(f"⚠ Gateway PID {remaining_pid} still running after {timeout}s — restart may fail")
def launchd_restart():
try:
launchd_stop()
@@ -856,6 +897,7 @@ def launchd_restart():
if e.returncode != 3:
raise
print("↻ launchd job was unloaded; skipping stop")
_wait_for_gateway_exit()
launchd_start()
def launchd_status(deep: bool = False):
@@ -1753,10 +1795,9 @@ def gateway_command(args):
killed = kill_gateway_processes()
if killed:
print(f"✓ Stopped {killed} gateway process(es)")
import time
time.sleep(2)
_wait_for_gateway_exit(timeout=10.0, force_after=5.0)
# Start fresh
print("Starting gateway...")
run_gateway(verbose=False)
+409 -1
View File
@@ -125,6 +125,17 @@ def _has_any_provider_configured() -> bool:
except Exception:
pass
# Check provider-specific auth fallbacks (for example, Copilot via gh auth).
try:
for provider_id, pconfig in PROVIDER_REGISTRY.items():
if pconfig.auth_type != "api_key":
continue
status = get_auth_status(provider_id)
if status.get("logged_in"):
return True
except Exception:
pass
# Check for Nous Portal OAuth credentials
auth_file = get_hermes_home() / "auth.json"
if auth_file.exists():
@@ -775,6 +786,8 @@ def cmd_model(args):
"openrouter": "OpenRouter",
"nous": "Nous Portal",
"openai-codex": "OpenAI Codex",
"copilot-acp": "GitHub Copilot ACP",
"copilot": "GitHub Copilot",
"anthropic": "Anthropic",
"zai": "Z.AI / GLM",
"kimi-coding": "Kimi / Moonshot",
@@ -799,6 +812,8 @@ def cmd_model(args):
("openrouter", "OpenRouter (100+ models, pay-per-use)"),
("nous", "Nous Portal (Nous Research subscription)"),
("openai-codex", "OpenAI Codex"),
("copilot-acp", "GitHub Copilot ACP (spawns `copilot --acp --stdio`)"),
("copilot", "GitHub Copilot (uses GITHUB_TOKEN or gh auth token)"),
("anthropic", "Anthropic (Claude models — API key or Claude Code)"),
("zai", "Z.AI / GLM (Zhipu AI direct API)"),
("kimi-coding", "Kimi / Moonshot (Moonshot AI direct API)"),
@@ -867,6 +882,10 @@ def cmd_model(args):
_model_flow_nous(config, current_model)
elif selected_provider == "openai-codex":
_model_flow_openai_codex(config, current_model)
elif selected_provider == "copilot-acp":
_model_flow_copilot_acp(config, current_model)
elif selected_provider == "copilot":
_model_flow_copilot(config, current_model)
elif selected_provider == "custom":
_model_flow_custom(config)
elif selected_provider.startswith("custom:") and selected_provider in _custom_provider_map:
@@ -1407,6 +1426,25 @@ def _model_flow_named_custom(config, provider_info):
# Curated model lists for direct API-key providers
_PROVIDER_MODELS = {
"copilot-acp": [
"copilot-acp",
],
"copilot": [
"gpt-5.4",
"gpt-5.4-mini",
"gpt-5-mini",
"gpt-5.3-codex",
"gpt-5.2-codex",
"gpt-4.1",
"gpt-4o",
"gpt-4o-mini",
"claude-opus-4.6",
"claude-sonnet-4.6",
"claude-sonnet-4.5",
"claude-haiku-4.5",
"gemini-2.5-pro",
"grok-code-fast-1",
],
"zai": [
"glm-5",
"glm-4.7",
@@ -1447,6 +1485,376 @@ _PROVIDER_MODELS = {
}
def _current_reasoning_effort(config) -> str:
agent_cfg = config.get("agent")
if isinstance(agent_cfg, dict):
return str(agent_cfg.get("reasoning_effort") or "").strip().lower()
return ""
def _set_reasoning_effort(config, effort: str) -> None:
agent_cfg = config.get("agent")
if not isinstance(agent_cfg, dict):
agent_cfg = {}
config["agent"] = agent_cfg
agent_cfg["reasoning_effort"] = effort
def _prompt_reasoning_effort_selection(efforts, current_effort=""):
"""Prompt for a reasoning effort. Returns effort, 'none', or None to keep current."""
ordered = list(dict.fromkeys(str(effort).strip().lower() for effort in efforts if str(effort).strip()))
if not ordered:
return None
def _label(effort):
if effort == current_effort:
return f"{effort} ← currently in use"
return effort
disable_label = "Disable reasoning"
skip_label = "Skip (keep current)"
if current_effort == "none":
default_idx = len(ordered)
elif current_effort in ordered:
default_idx = ordered.index(current_effort)
elif "medium" in ordered:
default_idx = ordered.index("medium")
else:
default_idx = 0
try:
from simple_term_menu import TerminalMenu
choices = [f" {_label(effort)}" for effort in ordered]
choices.append(f" {disable_label}")
choices.append(f" {skip_label}")
menu = TerminalMenu(
choices,
cursor_index=default_idx,
menu_cursor="-> ",
menu_cursor_style=("fg_green", "bold"),
menu_highlight_style=("fg_green",),
cycle_cursor=True,
clear_screen=False,
title="Select reasoning effort:",
)
idx = menu.show()
if idx is None:
return None
print()
if idx < len(ordered):
return ordered[idx]
if idx == len(ordered):
return "none"
return None
except (ImportError, NotImplementedError):
pass
print("Select reasoning effort:")
for i, effort in enumerate(ordered, 1):
print(f" {i}. {_label(effort)}")
n = len(ordered)
print(f" {n + 1}. {disable_label}")
print(f" {n + 2}. {skip_label}")
print()
while True:
try:
choice = input(f"Choice [1-{n + 2}] (default: keep current): ").strip()
if not choice:
return None
idx = int(choice)
if 1 <= idx <= n:
return ordered[idx - 1]
if idx == n + 1:
return "none"
if idx == n + 2:
return None
print(f"Please enter 1-{n + 2}")
except ValueError:
print("Please enter a number")
except (KeyboardInterrupt, EOFError):
return None
def _model_flow_copilot(config, current_model=""):
"""GitHub Copilot flow using env vars, gh CLI, or OAuth device code."""
from hermes_cli.auth import (
PROVIDER_REGISTRY,
_prompt_model_selection,
_save_model_choice,
deactivate_provider,
resolve_api_key_provider_credentials,
)
from hermes_cli.config import get_env_value, save_env_value, load_config, save_config
from hermes_cli.models import (
fetch_api_models,
fetch_github_model_catalog,
github_model_reasoning_efforts,
copilot_model_api_mode,
normalize_copilot_model_id,
)
provider_id = "copilot"
pconfig = PROVIDER_REGISTRY[provider_id]
creds = resolve_api_key_provider_credentials(provider_id)
api_key = creds.get("api_key", "")
source = creds.get("source", "")
if not api_key:
print("No GitHub token configured for GitHub Copilot.")
print()
print(" Supported token types:")
print(" → OAuth token (gho_*) via `copilot login` or device code flow")
print(" → Fine-grained PAT (github_pat_*) with Copilot Requests permission")
print(" → GitHub App token (ghu_*) via environment variable")
print(" ✗ Classic PAT (ghp_*) NOT supported by Copilot API")
print()
print(" Options:")
print(" 1. Login with GitHub (OAuth device code flow)")
print(" 2. Enter a token manually")
print(" 3. Cancel")
print()
try:
choice = input(" Choice [1-3]: ").strip()
except (KeyboardInterrupt, EOFError):
print()
return
if choice == "1":
try:
from hermes_cli.copilot_auth import copilot_device_code_login
token = copilot_device_code_login()
if token:
save_env_value("COPILOT_GITHUB_TOKEN", token)
print(" Copilot token saved.")
print()
else:
print(" Login cancelled or failed.")
return
except Exception as exc:
print(f" Login failed: {exc}")
return
elif choice == "2":
try:
new_key = input(" Token (COPILOT_GITHUB_TOKEN): ").strip()
except (KeyboardInterrupt, EOFError):
print()
return
if not new_key:
print(" Cancelled.")
return
# Validate token type
try:
from hermes_cli.copilot_auth import validate_copilot_token
valid, msg = validate_copilot_token(new_key)
if not valid:
print(f"{msg}")
return
except ImportError:
pass
save_env_value("COPILOT_GITHUB_TOKEN", new_key)
print(" Token saved.")
print()
else:
print(" Cancelled.")
return
creds = resolve_api_key_provider_credentials(provider_id)
api_key = creds.get("api_key", "")
source = creds.get("source", "")
else:
if source in ("GITHUB_TOKEN", "GH_TOKEN"):
print(f" GitHub token: {api_key[:8]}... ✓ ({source})")
elif source == "gh auth token":
print(" GitHub token: ✓ (from `gh auth token`)")
else:
print(" GitHub token: ✓")
print()
effective_base = pconfig.inference_base_url
catalog = fetch_github_model_catalog(api_key)
live_models = [item.get("id", "") for item in catalog if item.get("id")] if catalog else fetch_api_models(api_key, effective_base)
normalized_current_model = normalize_copilot_model_id(
current_model,
catalog=catalog,
api_key=api_key,
) or current_model
if live_models:
model_list = [model_id for model_id in live_models if model_id]
print(f" Found {len(model_list)} model(s) from GitHub Copilot")
else:
model_list = _PROVIDER_MODELS.get(provider_id, [])
if model_list:
print(" ⚠ Could not auto-detect models from GitHub Copilot — showing defaults.")
print(' Use "Enter custom model name" if you do not see your model.')
if model_list:
selected = _prompt_model_selection(model_list, current_model=normalized_current_model)
else:
try:
selected = input("Model name: ").strip()
except (KeyboardInterrupt, EOFError):
selected = None
if selected:
selected = normalize_copilot_model_id(
selected,
catalog=catalog,
api_key=api_key,
) or selected
# Clear stale custom-endpoint overrides so the Copilot provider wins cleanly.
if get_env_value("OPENAI_BASE_URL"):
save_env_value("OPENAI_BASE_URL", "")
save_env_value("OPENAI_API_KEY", "")
initial_cfg = load_config()
current_effort = _current_reasoning_effort(initial_cfg)
reasoning_efforts = github_model_reasoning_efforts(
selected,
catalog=catalog,
api_key=api_key,
)
selected_effort = None
if reasoning_efforts:
print(f" {selected} supports reasoning controls.")
selected_effort = _prompt_reasoning_effort_selection(
reasoning_efforts, current_effort=current_effort
)
_save_model_choice(selected)
cfg = load_config()
model = cfg.get("model")
if not isinstance(model, dict):
model = {"default": model} if model else {}
cfg["model"] = model
model["provider"] = provider_id
model["base_url"] = effective_base
model["api_mode"] = copilot_model_api_mode(
selected,
catalog=catalog,
api_key=api_key,
)
if selected_effort is not None:
_set_reasoning_effort(cfg, selected_effort)
save_config(cfg)
deactivate_provider()
print(f"Default model set to: {selected} (via {pconfig.name})")
if reasoning_efforts:
if selected_effort == "none":
print("Reasoning disabled for this model.")
elif selected_effort:
print(f"Reasoning effort set to: {selected_effort}")
else:
print("No change.")
def _model_flow_copilot_acp(config, current_model=""):
"""GitHub Copilot ACP flow using the local Copilot CLI."""
from hermes_cli.auth import (
PROVIDER_REGISTRY,
_prompt_model_selection,
_save_model_choice,
deactivate_provider,
get_external_process_provider_status,
resolve_api_key_provider_credentials,
resolve_external_process_provider_credentials,
)
from hermes_cli.models import (
fetch_github_model_catalog,
normalize_copilot_model_id,
)
from hermes_cli.config import load_config, save_config
del config
provider_id = "copilot-acp"
pconfig = PROVIDER_REGISTRY[provider_id]
status = get_external_process_provider_status(provider_id)
resolved_command = status.get("resolved_command") or status.get("command") or "copilot"
effective_base = status.get("base_url") or pconfig.inference_base_url
print(" GitHub Copilot ACP delegates Hermes turns to `copilot --acp`.")
print(" Hermes currently starts its own ACP subprocess for each request.")
print(" Hermes uses your selected model as a hint for the Copilot ACP session.")
print(f" Command: {resolved_command}")
print(f" Backend marker: {effective_base}")
print()
try:
creds = resolve_external_process_provider_credentials(provider_id)
except Exception as exc:
print(f"{exc}")
print(" Set HERMES_COPILOT_ACP_COMMAND or COPILOT_CLI_PATH if Copilot CLI is installed elsewhere.")
return
effective_base = creds.get("base_url") or effective_base
catalog_api_key = ""
try:
catalog_creds = resolve_api_key_provider_credentials("copilot")
catalog_api_key = catalog_creds.get("api_key", "")
except Exception:
pass
catalog = fetch_github_model_catalog(catalog_api_key)
normalized_current_model = normalize_copilot_model_id(
current_model,
catalog=catalog,
api_key=catalog_api_key,
) or current_model
if catalog:
model_list = [item.get("id", "") for item in catalog if item.get("id")]
print(f" Found {len(model_list)} model(s) from GitHub Copilot")
else:
model_list = _PROVIDER_MODELS.get("copilot", [])
if model_list:
print(" ⚠ Could not auto-detect models from GitHub Copilot — showing defaults.")
print(' Use "Enter custom model name" if you do not see your model.')
if model_list:
selected = _prompt_model_selection(
model_list,
current_model=normalized_current_model,
)
else:
try:
selected = input("Model name: ").strip()
except (KeyboardInterrupt, EOFError):
selected = None
if not selected:
print("No change.")
return
selected = normalize_copilot_model_id(
selected,
catalog=catalog,
api_key=catalog_api_key,
) or selected
_save_model_choice(selected)
cfg = load_config()
model = cfg.get("model")
if not isinstance(model, dict):
model = {"default": model} if model else {}
cfg["model"] = model
model["provider"] = provider_id
model["base_url"] = effective_base
model["api_mode"] = "chat_completions"
save_config(cfg)
deactivate_provider()
print(f"Default model set to: {selected} (via {pconfig.name})")
def _model_flow_kimi(config, current_model=""):
"""Kimi / Moonshot model selection with automatic endpoint routing.
@@ -2642,7 +3050,7 @@ For more help on a command:
)
chat_parser.add_argument(
"--provider",
choices=["auto", "openrouter", "nous", "openai-codex", "anthropic", "zai", "kimi-coding", "minimax", "minimax-cn", "kilocode"],
choices=["auto", "openrouter", "nous", "openai-codex", "copilot-acp", "copilot", "anthropic", "zai", "kimi-coding", "minimax", "minimax-cn", "kilocode"],
default=None,
help="Inference provider (default: auto)"
)
+394 -6
View File
@@ -14,21 +14,40 @@ import urllib.error
from difflib import get_close_matches
from typing import Any, Optional
COPILOT_BASE_URL = "https://api.githubcopilot.com"
COPILOT_MODELS_URL = f"{COPILOT_BASE_URL}/models"
COPILOT_EDITOR_VERSION = "vscode/1.104.1"
COPILOT_REASONING_EFFORTS_GPT5 = ["minimal", "low", "medium", "high"]
COPILOT_REASONING_EFFORTS_O_SERIES = ["low", "medium", "high"]
# Backward-compatible aliases for the earlier GitHub Models-backed Copilot work.
GITHUB_MODELS_BASE_URL = COPILOT_BASE_URL
GITHUB_MODELS_CATALOG_URL = COPILOT_MODELS_URL
# (model_id, display description shown in menus)
OPENROUTER_MODELS: list[tuple[str, str]] = [
("anthropic/claude-opus-4.6", "recommended"),
("anthropic/claude-sonnet-4.5", ""),
("openai/gpt-5.4-pro", ""),
("anthropic/claude-haiku-4.5", ""),
("openai/gpt-5.4", ""),
("openai/gpt-5.4-mini", ""),
("openrouter/hunter-alpha", "free"),
("openrouter/healer-alpha", "free"),
("openai/gpt-5.3-codex", ""),
("google/gemini-3-pro-preview", ""),
("google/gemini-3-flash-preview", ""),
("qwen/qwen3.5-plus-02-15", ""),
("qwen/qwen3.5-35b-a3b", ""),
("stepfun/step-3.5-flash", ""),
("z-ai/glm-5", ""),
("moonshotai/kimi-k2.5", ""),
("minimax/minimax-m2.5", ""),
("z-ai/glm-5", ""),
("z-ai/glm-5-turbo", ""),
("moonshotai/kimi-k2.5", ""),
("x-ai/grok-4.20-beta", ""),
("nvidia/nemotron-3-super-120b-a12b:free", "free"),
("arcee-ai/trinity-large-preview:free", "free"),
("openai/gpt-5.4-pro", ""),
("openai/gpt-5.4-nano", ""),
]
_PROVIDER_MODELS: dict[str, list[str]] = {
@@ -46,6 +65,25 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"gpt-5.1-codex-mini",
"gpt-5.1-codex-max",
],
"copilot-acp": [
"copilot-acp",
],
"copilot": [
"gpt-5.4",
"gpt-5.4-mini",
"gpt-5-mini",
"gpt-5.3-codex",
"gpt-5.2-codex",
"gpt-4.1",
"gpt-4o",
"gpt-4o-mini",
"claude-opus-4.6",
"claude-sonnet-4.6",
"claude-sonnet-4.5",
"claude-haiku-4.5",
"gemini-2.5-pro",
"grok-code-fast-1",
],
"zai": [
"glm-5",
"glm-4.7",
@@ -61,11 +99,15 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"kimi-k2-0905-preview",
],
"minimax": [
"MiniMax-M2.7",
"MiniMax-M2.7-highspeed",
"MiniMax-M2.5",
"MiniMax-M2.5-highspeed",
"MiniMax-M2.1",
],
"minimax-cn": [
"MiniMax-M2.7",
"MiniMax-M2.7-highspeed",
"MiniMax-M2.5",
"MiniMax-M2.5-highspeed",
"MiniMax-M2.1",
@@ -160,7 +202,9 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
_PROVIDER_LABELS = {
"openrouter": "OpenRouter",
"openai-codex": "OpenAI Codex",
"copilot-acp": "GitHub Copilot ACP",
"nous": "Nous Portal",
"copilot": "GitHub Copilot",
"zai": "Z.AI / GLM",
"kimi-coding": "Kimi / Moonshot",
"minimax": "MiniMax",
@@ -180,6 +224,12 @@ _PROVIDER_ALIASES = {
"z-ai": "zai",
"z.ai": "zai",
"zhipu": "zai",
"github": "copilot",
"github-copilot": "copilot",
"github-models": "copilot",
"github-model": "copilot",
"github-copilot-acp": "copilot-acp",
"copilot-acp-agent": "copilot-acp",
"kimi": "kimi-coding",
"moonshot": "kimi-coding",
"minimax-china": "minimax-cn",
@@ -233,7 +283,7 @@ def list_available_providers() -> list[dict[str, str]]:
"""
# Canonical providers in display order
_PROVIDER_ORDER = [
"openrouter", "nous", "openai-codex",
"openrouter", "nous", "openai-codex", "copilot", "copilot-acp",
"zai", "kimi-coding", "minimax", "minimax-cn", "kilocode", "anthropic", "alibaba",
"opencode-zen", "opencode-go",
"ai-gateway", "deepseek", "custom",
@@ -454,6 +504,17 @@ def provider_label(provider: Optional[str]) -> str:
return _PROVIDER_LABELS.get(normalized, original or "OpenRouter")
def _resolve_copilot_catalog_api_key() -> str:
"""Best-effort GitHub token for fetching the Copilot model catalog."""
try:
from hermes_cli.auth import resolve_api_key_provider_credentials
creds = resolve_api_key_provider_credentials("copilot")
return str(creds.get("api_key") or "").strip()
except Exception:
return ""
def provider_model_ids(provider: Optional[str]) -> list[str]:
"""Return the best known model catalog for a provider.
@@ -467,6 +528,15 @@ def provider_model_ids(provider: Optional[str]) -> list[str]:
from hermes_cli.codex_models import get_codex_model_ids
return get_codex_model_ids()
if normalized in {"copilot", "copilot-acp"}:
try:
live = _fetch_github_models(_resolve_copilot_catalog_api_key())
if live:
return live
except Exception:
pass
if normalized == "copilot-acp":
return list(_PROVIDER_MODELS.get("copilot", []))
if normalized == "nous":
# Try live Nous Portal /models endpoint
try:
@@ -545,6 +615,306 @@ def _fetch_anthropic_models(timeout: float = 5.0) -> Optional[list[str]]:
return None
def _payload_items(payload: Any) -> list[dict[str, Any]]:
if isinstance(payload, list):
return [item for item in payload if isinstance(item, dict)]
if isinstance(payload, dict):
data = payload.get("data", [])
if isinstance(data, list):
return [item for item in data if isinstance(item, dict)]
return []
def _extract_model_ids(payload: Any) -> list[str]:
return [item.get("id", "") for item in _payload_items(payload) if item.get("id")]
def copilot_default_headers() -> dict[str, str]:
"""Standard headers for Copilot API requests.
Includes Openai-Intent and x-initiator headers that opencode and the
Copilot CLI send on every request.
"""
try:
from hermes_cli.copilot_auth import copilot_request_headers
return copilot_request_headers(is_agent_turn=True)
except ImportError:
return {
"Editor-Version": COPILOT_EDITOR_VERSION,
"User-Agent": "HermesAgent/1.0",
"Openai-Intent": "conversation-edits",
"x-initiator": "agent",
}
def _copilot_catalog_item_is_text_model(item: dict[str, Any]) -> bool:
model_id = str(item.get("id") or "").strip()
if not model_id:
return False
if item.get("model_picker_enabled") is False:
return False
capabilities = item.get("capabilities")
if isinstance(capabilities, dict):
model_type = str(capabilities.get("type") or "").strip().lower()
if model_type and model_type != "chat":
return False
supported_endpoints = item.get("supported_endpoints")
if isinstance(supported_endpoints, list):
normalized_endpoints = {
str(endpoint).strip()
for endpoint in supported_endpoints
if str(endpoint).strip()
}
if normalized_endpoints and not normalized_endpoints.intersection(
{"/chat/completions", "/responses", "/v1/messages"}
):
return False
return True
def fetch_github_model_catalog(
api_key: Optional[str] = None, timeout: float = 5.0
) -> Optional[list[dict[str, Any]]]:
"""Fetch the live GitHub Copilot model catalog for this account."""
attempts: list[dict[str, str]] = []
if api_key:
attempts.append({
**copilot_default_headers(),
"Authorization": f"Bearer {api_key}",
})
attempts.append(copilot_default_headers())
for headers in attempts:
req = urllib.request.Request(COPILOT_MODELS_URL, headers=headers)
try:
with urllib.request.urlopen(req, timeout=timeout) as resp:
data = json.loads(resp.read().decode())
items = _payload_items(data)
models: list[dict[str, Any]] = []
seen_ids: set[str] = set()
for item in items:
if not _copilot_catalog_item_is_text_model(item):
continue
model_id = str(item.get("id") or "").strip()
if not model_id or model_id in seen_ids:
continue
seen_ids.add(model_id)
models.append(item)
if models:
return models
except Exception:
continue
return None
def _is_github_models_base_url(base_url: Optional[str]) -> bool:
normalized = (base_url or "").strip().rstrip("/").lower()
return (
normalized.startswith(COPILOT_BASE_URL)
or normalized.startswith("https://models.github.ai/inference")
)
def _fetch_github_models(api_key: Optional[str] = None, timeout: float = 5.0) -> Optional[list[str]]:
catalog = fetch_github_model_catalog(api_key=api_key, timeout=timeout)
if not catalog:
return None
return [item.get("id", "") for item in catalog if item.get("id")]
_COPILOT_MODEL_ALIASES = {
"openai/gpt-5": "gpt-5-mini",
"openai/gpt-5-chat": "gpt-5-mini",
"openai/gpt-5-mini": "gpt-5-mini",
"openai/gpt-5-nano": "gpt-5-mini",
"openai/gpt-4.1": "gpt-4.1",
"openai/gpt-4.1-mini": "gpt-4.1",
"openai/gpt-4.1-nano": "gpt-4.1",
"openai/gpt-4o": "gpt-4o",
"openai/gpt-4o-mini": "gpt-4o-mini",
"openai/o1": "gpt-5.2",
"openai/o1-mini": "gpt-5-mini",
"openai/o1-preview": "gpt-5.2",
"openai/o3": "gpt-5.3-codex",
"openai/o3-mini": "gpt-5-mini",
"openai/o4-mini": "gpt-5-mini",
"anthropic/claude-opus-4.6": "claude-opus-4.6",
"anthropic/claude-sonnet-4.6": "claude-sonnet-4.6",
"anthropic/claude-sonnet-4.5": "claude-sonnet-4.5",
"anthropic/claude-haiku-4.5": "claude-haiku-4.5",
}
def _copilot_catalog_ids(
catalog: Optional[list[dict[str, Any]]] = None,
api_key: Optional[str] = None,
) -> set[str]:
if catalog is None and api_key:
catalog = fetch_github_model_catalog(api_key=api_key)
if not catalog:
return set()
return {
str(item.get("id") or "").strip()
for item in catalog
if str(item.get("id") or "").strip()
}
def normalize_copilot_model_id(
model_id: Optional[str],
*,
catalog: Optional[list[dict[str, Any]]] = None,
api_key: Optional[str] = None,
) -> str:
raw = str(model_id or "").strip()
if not raw:
return ""
catalog_ids = _copilot_catalog_ids(catalog=catalog, api_key=api_key)
alias = _COPILOT_MODEL_ALIASES.get(raw)
if alias:
return alias
candidates = [raw]
if "/" in raw:
candidates.append(raw.split("/", 1)[1].strip())
if raw.endswith("-mini"):
candidates.append(raw[:-5])
if raw.endswith("-nano"):
candidates.append(raw[:-5])
if raw.endswith("-chat"):
candidates.append(raw[:-5])
seen: set[str] = set()
for candidate in candidates:
if not candidate or candidate in seen:
continue
seen.add(candidate)
if candidate in _COPILOT_MODEL_ALIASES:
return _COPILOT_MODEL_ALIASES[candidate]
if candidate in catalog_ids:
return candidate
if "/" in raw:
return raw.split("/", 1)[1].strip()
return raw
def _github_reasoning_efforts_for_model_id(model_id: str) -> list[str]:
raw = (model_id or "").strip().lower()
if raw.startswith(("openai/o1", "openai/o3", "openai/o4", "o1", "o3", "o4")):
return list(COPILOT_REASONING_EFFORTS_O_SERIES)
normalized = normalize_copilot_model_id(model_id).lower()
if normalized.startswith("gpt-5"):
return list(COPILOT_REASONING_EFFORTS_GPT5)
return []
def _should_use_copilot_responses_api(model_id: str) -> bool:
"""Decide whether a Copilot model should use the Responses API.
Replicates opencode's ``shouldUseCopilotResponsesApi`` logic:
GPT-5+ models use Responses API, except ``gpt-5-mini`` which uses
Chat Completions. All non-GPT models (Claude, Gemini, etc.) use
Chat Completions.
"""
import re
match = re.match(r"^gpt-(\d+)", model_id)
if not match:
return False
major = int(match.group(1))
return major >= 5 and not model_id.startswith("gpt-5-mini")
def copilot_model_api_mode(
model_id: Optional[str],
*,
catalog: Optional[list[dict[str, Any]]] = None,
api_key: Optional[str] = None,
) -> str:
"""Determine the API mode for a Copilot model.
Uses the model ID pattern (matching opencode's approach) as the
primary signal. Falls back to the catalog's ``supported_endpoints``
only for models not covered by the pattern check.
"""
normalized = normalize_copilot_model_id(model_id, catalog=catalog, api_key=api_key)
if not normalized:
return "chat_completions"
# Primary: model ID pattern (matches opencode's shouldUseCopilotResponsesApi)
if _should_use_copilot_responses_api(normalized):
return "codex_responses"
# Secondary: check catalog for non-GPT-5 models (Claude via /v1/messages, etc.)
if catalog is None and api_key:
catalog = fetch_github_model_catalog(api_key=api_key)
if catalog:
catalog_entry = next((item for item in catalog if item.get("id") == normalized), None)
if isinstance(catalog_entry, dict):
supported_endpoints = {
str(endpoint).strip()
for endpoint in (catalog_entry.get("supported_endpoints") or [])
if str(endpoint).strip()
}
# For non-GPT-5 models, check if they only support messages API
if "/v1/messages" in supported_endpoints and "/chat/completions" not in supported_endpoints:
return "anthropic_messages"
return "chat_completions"
def github_model_reasoning_efforts(
model_id: Optional[str],
*,
catalog: Optional[list[dict[str, Any]]] = None,
api_key: Optional[str] = None,
) -> list[str]:
"""Return supported reasoning-effort levels for a Copilot-visible model."""
normalized = normalize_copilot_model_id(model_id, catalog=catalog, api_key=api_key)
if not normalized:
return []
catalog_entry = None
if catalog is not None:
catalog_entry = next((item for item in catalog if item.get("id") == normalized), None)
elif api_key:
fetched_catalog = fetch_github_model_catalog(api_key=api_key)
if fetched_catalog:
catalog_entry = next((item for item in fetched_catalog if item.get("id") == normalized), None)
if catalog_entry is not None:
capabilities = catalog_entry.get("capabilities")
if isinstance(capabilities, dict):
supports = capabilities.get("supports")
if isinstance(supports, dict):
efforts = supports.get("reasoning_effort")
if isinstance(efforts, list):
normalized_efforts = [
str(effort).strip().lower()
for effort in efforts
if str(effort).strip()
]
return list(dict.fromkeys(normalized_efforts))
return []
legacy_capabilities = {
str(capability).strip().lower()
for capability in catalog_entry.get("capabilities", [])
if str(capability).strip()
}
if "reasoning" not in legacy_capabilities:
return []
return _github_reasoning_efforts_for_model_id(str(model_id or normalized))
def probe_api_models(
api_key: Optional[str],
base_url: Optional[str],
@@ -561,6 +931,16 @@ def probe_api_models(
"used_fallback": False,
}
if _is_github_models_base_url(normalized):
models = _fetch_github_models(api_key=api_key, timeout=timeout)
return {
"models": models,
"probed_url": COPILOT_MODELS_URL,
"resolved_base_url": COPILOT_BASE_URL,
"suggested_base_url": None,
"used_fallback": False,
}
if normalized.endswith("/v1"):
alternate_base = normalized[:-3].rstrip("/")
else:
@@ -574,6 +954,8 @@ def probe_api_models(
headers: dict[str, str] = {}
if api_key:
headers["Authorization"] = f"Bearer {api_key}"
if normalized.startswith(COPILOT_BASE_URL):
headers.update(copilot_default_headers())
for candidate_base, is_fallback in candidates:
url = candidate_base.rstrip("/") + "/models"
@@ -664,6 +1046,12 @@ def validate_requested_model(
normalized = normalize_provider(provider)
if normalized == "openrouter" and base_url and "openrouter.ai" not in base_url:
normalized = "custom"
requested_for_lookup = requested
if normalized == "copilot":
requested_for_lookup = normalize_copilot_model_id(
requested,
api_key=api_key,
) or requested
if not requested:
return {
@@ -685,7 +1073,7 @@ def validate_requested_model(
probe = probe_api_models(api_key, base_url)
api_models = probe.get("models")
if api_models is not None:
if requested in set(api_models):
if requested_for_lookup in set(api_models):
return {
"accepted": True,
"persist": True,
@@ -734,7 +1122,7 @@ def validate_requested_model(
api_models = fetch_api_models(api_key, base_url)
if api_models is not None:
if requested in set(api_models):
if requested_for_lookup in set(api_models):
# API confirmed the model exists
return {
"accepted": True,
+63 -12
View File
@@ -14,6 +14,7 @@ from hermes_cli.auth import (
resolve_nous_runtime_credentials,
resolve_codex_runtime_credentials,
resolve_api_key_provider_credentials,
resolve_external_process_provider_credentials,
)
from hermes_cli.config import load_config
from hermes_constants import OPENROUTER_BASE_URL
@@ -33,7 +34,24 @@ def _get_model_config() -> Dict[str, Any]:
return {}
_VALID_API_MODES = {"chat_completions", "codex_responses"}
def _copilot_runtime_api_mode(model_cfg: Dict[str, Any], api_key: str) -> str:
configured_mode = _parse_api_mode(model_cfg.get("api_mode"))
if configured_mode:
return configured_mode
model_name = str(model_cfg.get("default") or "").strip()
if not model_name:
return "chat_completions"
try:
from hermes_cli.models import copilot_model_api_mode
return copilot_model_api_mode(model_name, api_key=api_key)
except Exception:
return "chat_completions"
_VALID_API_MODES = {"chat_completions", "codex_responses", "anthropic_messages"}
def _parse_api_mode(raw: Any) -> Optional[str]:
@@ -153,6 +171,12 @@ def _resolve_openrouter_runtime(
model_cfg = _get_model_config()
cfg_base_url = model_cfg.get("base_url") if isinstance(model_cfg.get("base_url"), str) else ""
cfg_provider = model_cfg.get("provider") if isinstance(model_cfg.get("provider"), str) else ""
cfg_api_key = ""
for k in ("api_key", "api"):
v = model_cfg.get(k)
if isinstance(v, str) and v.strip():
cfg_api_key = v.strip()
break
requested_norm = (requested_provider or "").strip().lower()
cfg_provider = cfg_provider.strip().lower()
@@ -160,26 +184,24 @@ def _resolve_openrouter_runtime(
env_openrouter_base_url = os.getenv("OPENROUTER_BASE_URL", "").strip()
use_config_base_url = False
if cfg_base_url.strip() and not explicit_base_url and not env_openai_base_url:
if cfg_base_url.strip() and not explicit_base_url:
if requested_norm == "auto":
if not cfg_provider or cfg_provider == "auto":
use_config_base_url = True
elif requested_norm == "custom":
# Persisted custom endpoints store their base URL in config.yaml.
# If OPENAI_BASE_URL is not currently set in the environment, keep
# honoring that saved endpoint instead of falling back to OpenRouter.
if cfg_provider == "custom":
if (not cfg_provider or cfg_provider == "auto") and not env_openai_base_url:
use_config_base_url = True
elif requested_norm == "custom" and cfg_provider == "custom":
# provider: custom — use base_url from config (Fixes #1760).
use_config_base_url = True
# When the user explicitly requested the openrouter provider, skip
# OPENAI_BASE_URL — it typically points to a custom / non-OpenRouter
# endpoint and would prevent switching back to OpenRouter (#874).
skip_openai_base = requested_norm == "openrouter"
# For custom, prefer config base_url over env so config.yaml is honored (#1760).
base_url = (
(explicit_base_url or "").strip()
or ("" if skip_openai_base else env_openai_base_url)
or (cfg_base_url.strip() if use_config_base_url else "")
or ("" if skip_openai_base else env_openai_base_url)
or env_openrouter_base_url
or OPENROUTER_BASE_URL
).rstrip("/")
@@ -198,8 +220,10 @@ def _resolve_openrouter_runtime(
or ""
)
else:
# Custom endpoint: use api_key from config when using config base_url (#1760).
api_key = (
explicit_api_key
or (cfg_api_key if use_config_base_url else "")
or os.getenv("OPENAI_API_KEY")
or os.getenv("OPENROUTER_API_KEY")
or ""
@@ -267,6 +291,19 @@ def resolve_runtime_provider(
"requested_provider": requested_provider,
}
if provider == "copilot-acp":
creds = resolve_external_process_provider_credentials(provider)
return {
"provider": "copilot-acp",
"api_mode": "chat_completions",
"base_url": creds.get("base_url", "").rstrip("/"),
"api_key": creds.get("api_key", ""),
"command": creds.get("command", ""),
"args": list(creds.get("args") or []),
"source": creds.get("source", "process"),
"requested_provider": requested_provider,
}
# Anthropic (native Messages API)
if provider == "anthropic":
from agent.anthropic_adapter import resolve_anthropic_token
@@ -302,10 +339,24 @@ def resolve_runtime_provider(
pconfig = PROVIDER_REGISTRY.get(provider)
if pconfig and pconfig.auth_type == "api_key":
creds = resolve_api_key_provider_credentials(provider)
model_cfg = _get_model_config()
base_url = creds.get("base_url", "").rstrip("/")
api_mode = "chat_completions"
if provider == "copilot":
api_mode = _copilot_runtime_api_mode(model_cfg, creds.get("api_key", ""))
else:
# Check explicit api_mode from model config first
configured_mode = _parse_api_mode(model_cfg.get("api_mode"))
if configured_mode:
api_mode = configured_mode
# Auto-detect Anthropic-compatible endpoints by URL convention
# (e.g. https://api.minimax.io/anthropic, https://dashscope.../anthropic)
elif base_url.rstrip("/").endswith("/anthropic"):
api_mode = "anthropic_messages"
return {
"provider": provider,
"api_mode": "chat_completions",
"base_url": creds.get("base_url", "").rstrip("/"),
"api_mode": api_mode,
"base_url": base_url,
"api_key": creds.get("api_key", ""),
"source": creds.get("source", "env"),
"requested_provider": requested_provider,
+227 -20
View File
@@ -55,15 +55,87 @@ def _set_default_model(config: Dict[str, Any], model_name: str) -> None:
# Default model lists per provider — used as fallback when the live
# /models endpoint can't be reached.
_DEFAULT_PROVIDER_MODELS = {
"copilot-acp": [
"copilot-acp",
],
"copilot": [
"gpt-5.4",
"gpt-5.4-mini",
"gpt-5-mini",
"gpt-5.3-codex",
"gpt-5.2-codex",
"gpt-4.1",
"gpt-4o",
"gpt-4o-mini",
"claude-opus-4.6",
"claude-sonnet-4.6",
"claude-sonnet-4.5",
"claude-haiku-4.5",
"gemini-2.5-pro",
"grok-code-fast-1",
],
"zai": ["glm-5", "glm-4.7", "glm-4.5", "glm-4.5-flash"],
"kimi-coding": ["kimi-k2.5", "kimi-k2-thinking", "kimi-k2-turbo-preview"],
"minimax": ["MiniMax-M2.5", "MiniMax-M2.5-highspeed", "MiniMax-M2.1"],
"minimax-cn": ["MiniMax-M2.5", "MiniMax-M2.5-highspeed", "MiniMax-M2.1"],
"minimax": ["MiniMax-M2.7", "MiniMax-M2.7-highspeed", "MiniMax-M2.5", "MiniMax-M2.5-highspeed", "MiniMax-M2.1"],
"minimax-cn": ["MiniMax-M2.7", "MiniMax-M2.7-highspeed", "MiniMax-M2.5", "MiniMax-M2.5-highspeed", "MiniMax-M2.1"],
"ai-gateway": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5", "google/gemini-3-flash"],
"kilocode": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5.4", "google/gemini-3-pro-preview", "google/gemini-3-flash-preview"],
}
def _current_reasoning_effort(config: Dict[str, Any]) -> str:
agent_cfg = config.get("agent")
if isinstance(agent_cfg, dict):
return str(agent_cfg.get("reasoning_effort") or "").strip().lower()
return ""
def _set_reasoning_effort(config: Dict[str, Any], effort: str) -> None:
agent_cfg = config.get("agent")
if not isinstance(agent_cfg, dict):
agent_cfg = {}
config["agent"] = agent_cfg
agent_cfg["reasoning_effort"] = effort
def _setup_copilot_reasoning_selection(
config: Dict[str, Any],
model_id: str,
prompt_choice,
*,
catalog: Optional[list[dict[str, Any]]] = None,
api_key: str = "",
) -> None:
from hermes_cli.models import github_model_reasoning_efforts, normalize_copilot_model_id
normalized_model = normalize_copilot_model_id(
model_id,
catalog=catalog,
api_key=api_key,
) or model_id
efforts = github_model_reasoning_efforts(normalized_model, catalog=catalog, api_key=api_key)
if not efforts:
return
current_effort = _current_reasoning_effort(config)
choices = list(efforts) + ["Disable reasoning", f"Keep current ({current_effort or 'default'})"]
if current_effort == "none":
default_idx = len(efforts)
elif current_effort in efforts:
default_idx = efforts.index(current_effort)
elif "medium" in efforts:
default_idx = efforts.index("medium")
else:
default_idx = len(choices) - 1
effort_idx = prompt_choice("Select reasoning effort:", choices, default_idx)
if effort_idx < len(efforts):
_set_reasoning_effort(config, efforts[effort_idx])
elif effort_idx == len(efforts):
_set_reasoning_effort(config, "none")
def _setup_provider_model_selection(config, provider_id, current_model, prompt_choice, prompt_fn):
"""Model selection for API-key providers with live /models detection.
@@ -71,29 +143,60 @@ def _setup_provider_model_selection(config, provider_id, current_model, prompt_c
hardcoded default list with a warning if the endpoint is unreachable.
Always offers a 'Custom model' escape hatch.
"""
from hermes_cli.auth import PROVIDER_REGISTRY
from hermes_cli.auth import PROVIDER_REGISTRY, resolve_api_key_provider_credentials
from hermes_cli.config import get_env_value
from hermes_cli.models import fetch_api_models
from hermes_cli.models import (
copilot_model_api_mode,
fetch_api_models,
fetch_github_model_catalog,
normalize_copilot_model_id,
)
pconfig = PROVIDER_REGISTRY[provider_id]
is_copilot_catalog_provider = provider_id in {"copilot", "copilot-acp"}
# Resolve API key and base URL for the probe
api_key = ""
for ev in pconfig.api_key_env_vars:
api_key = get_env_value(ev) or os.getenv(ev, "")
if api_key:
break
base_url_env = pconfig.base_url_env_var or ""
base_url = (get_env_value(base_url_env) if base_url_env else "") or pconfig.inference_base_url
if is_copilot_catalog_provider:
api_key = ""
if provider_id == "copilot":
creds = resolve_api_key_provider_credentials(provider_id)
api_key = creds.get("api_key", "")
base_url = creds.get("base_url", "") or pconfig.inference_base_url
else:
try:
creds = resolve_api_key_provider_credentials("copilot")
api_key = creds.get("api_key", "")
except Exception:
pass
base_url = pconfig.inference_base_url
catalog = fetch_github_model_catalog(api_key)
current_model = normalize_copilot_model_id(
current_model,
catalog=catalog,
api_key=api_key,
) or current_model
else:
api_key = ""
for ev in pconfig.api_key_env_vars:
api_key = get_env_value(ev) or os.getenv(ev, "")
if api_key:
break
base_url_env = pconfig.base_url_env_var or ""
base_url = (get_env_value(base_url_env) if base_url_env else "") or pconfig.inference_base_url
catalog = None
# Try live /models endpoint
live_models = fetch_api_models(api_key, base_url)
if is_copilot_catalog_provider and catalog:
live_models = [item.get("id", "") for item in catalog if item.get("id")]
else:
live_models = fetch_api_models(api_key, base_url)
if live_models:
provider_models = live_models
print_info(f"Found {len(live_models)} model(s) from {pconfig.name} API")
else:
provider_models = _DEFAULT_PROVIDER_MODELS.get(provider_id, [])
fallback_provider_id = "copilot" if provider_id == "copilot-acp" else provider_id
provider_models = _DEFAULT_PROVIDER_MODELS.get(fallback_provider_id, [])
if provider_models:
print_warning(
f"Could not auto-detect models from {pconfig.name} API — showing defaults.\n"
@@ -107,12 +210,29 @@ def _setup_provider_model_selection(config, provider_id, current_model, prompt_c
keep_idx = len(model_choices) - 1
model_idx = prompt_choice("Select default model:", model_choices, keep_idx)
selected_model = current_model
if model_idx < len(provider_models):
_set_default_model(config, provider_models[model_idx])
selected_model = provider_models[model_idx]
if is_copilot_catalog_provider:
selected_model = normalize_copilot_model_id(
selected_model,
catalog=catalog,
api_key=api_key,
) or selected_model
_set_default_model(config, selected_model)
elif model_idx == len(provider_models):
custom = prompt_fn("Enter model name")
if custom:
_set_default_model(config, custom)
if is_copilot_catalog_provider:
selected_model = normalize_copilot_model_id(
custom,
catalog=catalog,
api_key=api_key,
) or custom
else:
selected_model = custom
_set_default_model(config, selected_model)
else:
# "Keep current" selected — validate it's compatible with the new
# provider. OpenRouter-formatted names (containing "/") won't work
@@ -123,8 +243,25 @@ def _setup_provider_model_selection(config, provider_id, current_model, prompt_c
f"and won't work with {pconfig.name}. "
f"Switching to {provider_models[0]}."
)
selected_model = provider_models[0]
_set_default_model(config, provider_models[0])
if provider_id == "copilot" and selected_model:
model_cfg = _model_config_dict(config)
model_cfg["api_mode"] = copilot_model_api_mode(
selected_model,
catalog=catalog,
api_key=api_key,
)
config["model"] = model_cfg
_setup_copilot_reasoning_selection(
config,
selected_model,
prompt_choice,
catalog=catalog,
api_key=api_key,
)
def _sync_model_from_disk(config: Dict[str, Any]) -> None:
disk_model = load_config().get("model")
@@ -673,6 +810,8 @@ def setup_model_provider(config: dict):
resolve_codex_runtime_credentials,
DEFAULT_CODEX_BASE_URL,
detect_external_credentials,
get_auth_status,
resolve_api_key_provider_credentials,
)
print_header("Inference Provider")
@@ -682,6 +821,8 @@ def setup_model_provider(config: dict):
existing_or = get_env_value("OPENROUTER_API_KEY")
active_oauth = get_active_provider()
existing_custom = get_env_value("OPENAI_BASE_URL")
copilot_status = get_auth_status("copilot")
copilot_acp_status = get_auth_status("copilot-acp")
model_cfg = config.get("model") if isinstance(config.get("model"), dict) else {}
current_config_provider = str(model_cfg.get("provider") or "").strip().lower() or None
@@ -702,7 +843,12 @@ def setup_model_provider(config: dict):
# Detect if any provider is already configured
has_any_provider = bool(
current_config_provider or active_oauth or existing_custom or existing_or
current_config_provider
or active_oauth
or existing_custom
or existing_or
or copilot_status.get("logged_in")
or copilot_acp_status.get("logged_in")
)
# Build "keep current" label
@@ -741,6 +887,8 @@ def setup_model_provider(config: dict):
"Alibaba Cloud / DashScope (Qwen models via Anthropic-compatible API)",
"OpenCode Zen (35+ curated models, pay-as-you-go)",
"OpenCode Go (open models, $10/month subscription)",
"GitHub Copilot (uses GITHUB_TOKEN or gh auth token)",
"GitHub Copilot ACP (spawns `copilot --acp --stdio`)",
]
if keep_label:
provider_choices.append(keep_label)
@@ -1412,7 +1560,56 @@ def setup_model_provider(config: dict):
_set_model_provider(config, "opencode-go", pconfig.inference_base_url)
selected_base_url = pconfig.inference_base_url
# else: provider_idx == 14 (Keep current) — only shown when a provider already exists
elif provider_idx == 14: # GitHub Copilot
selected_provider = "copilot"
print()
print_header("GitHub Copilot")
pconfig = PROVIDER_REGISTRY["copilot"]
print_info("Hermes can use GITHUB_TOKEN, GH_TOKEN, or your gh CLI login.")
print_info(f"Base URL: {pconfig.inference_base_url}")
print()
copilot_creds = resolve_api_key_provider_credentials("copilot")
source = copilot_creds.get("source", "")
token = copilot_creds.get("api_key", "")
if token:
if source in ("GITHUB_TOKEN", "GH_TOKEN"):
print_info(f"Current: {token[:8]}... ({source})")
elif source == "gh auth token":
print_info("Current: authenticated via `gh auth token`")
else:
print_info("Current: GitHub token configured")
else:
api_key = prompt(" GitHub token", password=True)
if api_key:
save_env_value("GITHUB_TOKEN", api_key)
print_success("GitHub token saved")
else:
print_warning("Skipped - agent won't work without a GitHub token or gh auth login")
if existing_custom:
save_env_value("OPENAI_BASE_URL", "")
save_env_value("OPENAI_API_KEY", "")
_set_model_provider(config, "copilot", pconfig.inference_base_url)
selected_base_url = pconfig.inference_base_url
elif provider_idx == 15: # GitHub Copilot ACP
selected_provider = "copilot-acp"
print()
print_header("GitHub Copilot ACP")
pconfig = PROVIDER_REGISTRY["copilot-acp"]
print_info("Hermes will start `copilot --acp --stdio` for each request.")
print_info("Use HERMES_COPILOT_ACP_COMMAND or COPILOT_CLI_PATH to override the command.")
print_info(f"Base marker: {pconfig.inference_base_url}")
print()
if existing_custom:
save_env_value("OPENAI_BASE_URL", "")
save_env_value("OPENAI_API_KEY", "")
_set_model_provider(config, "copilot-acp", pconfig.inference_base_url)
selected_base_url = pconfig.inference_base_url
# else: provider_idx == 16 (Keep current) — only shown when a provider already exists
# Normalize "keep current" to an explicit provider so downstream logic
# doesn't fall back to the generic OpenRouter/static-model path.
if selected_provider is None:
@@ -1444,6 +1641,8 @@ def setup_model_provider(config: dict):
if _vision_needs_setup:
_prov_names = {
"nous-api": "Nous Portal API key",
"copilot": "GitHub Copilot",
"copilot-acp": "GitHub Copilot ACP",
"zai": "Z.AI / GLM",
"kimi-coding": "Kimi / Moonshot",
"minimax": "MiniMax",
@@ -1583,7 +1782,15 @@ def setup_model_provider(config: dict):
_set_default_model(config, custom)
_update_config_for_provider("openai-codex", DEFAULT_CODEX_BASE_URL)
_set_model_provider(config, "openai-codex", DEFAULT_CODEX_BASE_URL)
elif selected_provider in ("zai", "kimi-coding", "minimax", "minimax-cn", "kilocode", "ai-gateway"):
elif selected_provider == "copilot-acp":
_setup_provider_model_selection(
config, selected_provider, current_model,
prompt_choice, prompt,
)
model_cfg = _model_config_dict(config)
model_cfg["api_mode"] = "chat_completions"
config["model"] = model_cfg
elif selected_provider in ("copilot", "zai", "kimi-coding", "minimax", "minimax-cn", "kilocode", "ai-gateway"):
_setup_provider_model_selection(
config, selected_provider, current_model,
prompt_choice, prompt,
@@ -1644,7 +1851,7 @@ def setup_model_provider(config: dict):
# Write provider+base_url to config.yaml only after model selection is complete.
# This prevents a race condition where the gateway picks up a new provider
# before the model name has been updated to match.
if selected_provider in ("zai", "kimi-coding", "minimax", "minimax-cn", "kilocode", "anthropic") and selected_base_url is not None:
if selected_provider in ("copilot-acp", "copilot", "zai", "kimi-coding", "minimax", "minimax-cn", "kilocode", "anthropic") and selected_base_url is not None:
_update_config_for_provider(selected_provider, selected_base_url)
save_config(config)
@@ -1710,7 +1917,7 @@ def _install_neutts_deps() -> bool:
return True
except (subprocess.CalledProcessError, subprocess.TimeoutExpired) as e:
print_error(f"Failed to install neutts: {e}")
print_info("Try manually: pip install neutts[all]")
print_info("Try manually: python -m pip install -U neutts[all]")
return False
+4 -6
View File
@@ -757,16 +757,14 @@ class SessionDB:
if not query:
return []
if source_filter is None:
source_filter = ["cli", "telegram", "discord", "whatsapp", "slack"]
# Build WHERE clauses dynamically
where_clauses = ["messages_fts MATCH ?"]
params: list = [query]
source_placeholders = ",".join("?" for _ in source_filter)
where_clauses.append(f"s.source IN ({source_placeholders})")
params.extend(source_filter)
if source_filter is not None:
source_placeholders = ",".join("?" for _ in source_filter)
where_clauses.append(f"s.source IN ({source_placeholders})")
params.extend(source_filter)
if role_filter:
role_placeholders = ",".join("?" for _ in role_filter)
+1 -1
View File
@@ -276,6 +276,7 @@ def get_tool_definitions(
# The registry still holds their schemas; dispatch just returns a stub error
# so if something slips through, the LLM sees a sensible message.
_AGENT_LOOP_TOOLS = {"todo", "memory", "session_search", "delegate_task"}
_READ_SEARCH_TOOLS = {"read_file", "search_files"}
def handle_function_call(
@@ -305,7 +306,6 @@ def handle_function_call(
"""
# Notify the read-loop tracker when a non-read/search tool runs,
# so the *consecutive* counter resets (reads after other work are fine).
_READ_SEARCH_TOOLS = {"read_file", "search_files"}
if function_name not in _READ_SEARCH_TOOLS:
try:
from tools.file_tools import notify_other_tool_call
+1 -1
View File
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "hermes-agent"
version = "0.3.0"
version = "0.4.0"
description = "The self-improving AI agent — creates skills from experience, improves them during use, and runs anywhere"
readme = "README.md"
requires-python = ">=3.11"
+290 -80
View File
@@ -85,7 +85,7 @@ from agent.model_metadata import (
)
from agent.context_compressor import ContextCompressor
from agent.prompt_caching import apply_anthropic_cache_control
from agent.prompt_builder import build_skills_system_prompt, build_context_files_prompt
from agent.prompt_builder import build_skills_system_prompt, build_context_files_prompt, load_soul_md
from agent.usage_pricing import estimate_usage_cost, normalize_usage
from agent.display import (
KawaiiSpinner, build_tool_preview as _build_tool_preview,
@@ -203,6 +203,27 @@ class IterationBudget:
# When any of these appear in a batch, we fall back to sequential execution.
_NEVER_PARALLEL_TOOLS = frozenset({"clarify"})
# Read-only tools with no shared mutable session state.
_PARALLEL_SAFE_TOOLS = frozenset({
"ha_get_state",
"ha_list_entities",
"ha_list_services",
"honcho_context",
"honcho_profile",
"honcho_search",
"read_file",
"search_files",
"session_search",
"skill_view",
"skills_list",
"vision_analyze",
"web_extract",
"web_search",
})
# File tools can run concurrently when they target independent paths.
_PATH_SCOPED_TOOLS = frozenset({"read_file", "write_file", "patch"})
# Maximum number of concurrent worker threads for parallel tool execution.
_MAX_TOOL_WORKERS = 8
@@ -234,6 +255,74 @@ def _is_destructive_command(cmd: str) -> bool:
return False
def _should_parallelize_tool_batch(tool_calls) -> bool:
"""Return True when a tool-call batch is safe to run concurrently."""
if len(tool_calls) <= 1:
return False
tool_names = [tc.function.name for tc in tool_calls]
if any(name in _NEVER_PARALLEL_TOOLS for name in tool_names):
return False
reserved_paths: list[Path] = []
for tool_call in tool_calls:
tool_name = tool_call.function.name
try:
function_args = json.loads(tool_call.function.arguments)
except Exception:
logging.debug(
"Could not parse args for %s — defaulting to sequential; raw=%s",
tool_name,
tool_call.function.arguments[:200],
)
return False
if not isinstance(function_args, dict):
logging.debug(
"Non-dict args for %s (%s) — defaulting to sequential",
tool_name,
type(function_args).__name__,
)
return False
if tool_name in _PATH_SCOPED_TOOLS:
scoped_path = _extract_parallel_scope_path(tool_name, function_args)
if scoped_path is None:
return False
if any(_paths_overlap(scoped_path, existing) for existing in reserved_paths):
return False
reserved_paths.append(scoped_path)
continue
if tool_name not in _PARALLEL_SAFE_TOOLS:
return False
return True
def _extract_parallel_scope_path(tool_name: str, function_args: dict) -> Path | None:
"""Return the normalized file target for path-scoped tools."""
if tool_name not in _PATH_SCOPED_TOOLS:
return None
raw_path = function_args.get("path")
if not isinstance(raw_path, str) or not raw_path.strip():
return None
# Avoid resolve(); the file may not exist yet.
return Path(raw_path).expanduser()
def _paths_overlap(left: Path, right: Path) -> bool:
"""Return True when two paths may refer to the same subtree."""
left_parts = left.parts
right_parts = right.parts
if not left_parts or not right_parts:
# Empty paths shouldn't reach here (guarded upstream), but be safe.
return bool(left_parts) == bool(right_parts) and bool(left_parts)
common_len = min(len(left_parts), len(right_parts))
return left_parts[:common_len] == right_parts[:common_len]
def _inject_honcho_turn_context(content, turn_context: str):
"""Append Honcho recall to the current-turn user message without mutating history.
@@ -263,17 +352,30 @@ def _inject_honcho_turn_context(content, turn_context: str):
class AIAgent:
"""
AI Agent with tool calling capabilities.
This class manages the conversation flow, tool execution, and response handling
for AI models that support function calling.
"""
@property
def base_url(self) -> str:
return self._base_url
@base_url.setter
def base_url(self, value: str) -> None:
self._base_url = value
self._base_url_lower = value.lower() if value else ""
def __init__(
self,
base_url: str = None,
api_key: str = None,
provider: str = None,
api_mode: str = None,
acp_command: str = None,
acp_args: list[str] | None = None,
command: str = None,
args: list[str] | None = None,
model: str = "anthropic/claude-opus-4.6", # OpenRouter format
max_iterations: int = 90, # Default tool-calling iterations (shared with subagents)
tool_delay: float = 1.0,
@@ -379,23 +481,30 @@ class AIAgent:
self.base_url = base_url or OPENROUTER_BASE_URL
provider_name = provider.strip().lower() if isinstance(provider, str) and provider.strip() else None
self.provider = provider_name or "openrouter"
self.acp_command = acp_command or command
self.acp_args = list(acp_args or args or [])
if api_mode in {"chat_completions", "codex_responses", "anthropic_messages"}:
self.api_mode = api_mode
elif self.provider == "openai-codex":
self.api_mode = "codex_responses"
elif (provider_name is None) and "chatgpt.com/backend-api/codex" in self.base_url.lower():
elif (provider_name is None) and "chatgpt.com/backend-api/codex" in self._base_url_lower:
self.api_mode = "codex_responses"
self.provider = "openai-codex"
elif self.provider == "anthropic" or (provider_name is None and "api.anthropic.com" in self.base_url.lower()):
elif self.provider == "anthropic" or (provider_name is None and "api.anthropic.com" in self._base_url_lower):
self.api_mode = "anthropic_messages"
self.provider = "anthropic"
elif self._base_url_lower.rstrip("/").endswith("/anthropic"):
# Third-party Anthropic-compatible endpoints (e.g. MiniMax, DashScope)
# use a URL convention ending in /anthropic. Auto-detect these so the
# Anthropic Messages API adapter is used instead of chat completions.
self.api_mode = "anthropic_messages"
else:
self.api_mode = "chat_completions"
# Pre-warm OpenRouter model metadata cache in a background thread.
# fetch_model_metadata() is cached for 1 hour; this avoids a blocking
# HTTP request on the first API response when pricing is estimated.
if self.provider == "openrouter" or "openrouter" in self.base_url.lower():
if self.provider == "openrouter" or "openrouter" in self._base_url_lower:
threading.Thread(
target=lambda: fetch_model_metadata(),
daemon=True,
@@ -439,7 +548,7 @@ class AIAgent:
# Anthropic prompt caching: auto-enabled for Claude models via OpenRouter.
# Reduces input costs by ~75% on multi-turn conversations by caching the
# conversation prefix. Uses system_and_3 strategy (4 breakpoints).
is_openrouter = "openrouter" in self.base_url.lower()
is_openrouter = "openrouter" in self._base_url_lower
is_claude = "claude" in self.model.lower()
is_native_anthropic = self.api_mode == "anthropic_messages"
self._use_prompt_caching = (is_openrouter and is_claude) or is_native_anthropic
@@ -555,6 +664,7 @@ class AIAgent:
if self.api_mode == "anthropic_messages":
from agent.anthropic_adapter import build_anthropic_client, resolve_anthropic_token
effective_key = api_key or resolve_anthropic_token() or ""
self.api_key = effective_key
self._anthropic_api_key = effective_key
self._anthropic_base_url = base_url
from agent.anthropic_adapter import _is_oauth_token as _is_oat
@@ -572,6 +682,9 @@ class AIAgent:
# Explicit credentials from CLI/gateway — construct directly.
# The runtime provider resolver already handled auth for us.
client_kwargs = {"api_key": api_key, "base_url": base_url}
if self.provider == "copilot-acp":
client_kwargs["command"] = self.acp_command
client_kwargs["args"] = self.acp_args
effective_base = base_url
if "openrouter" in effective_base.lower():
client_kwargs["default_headers"] = {
@@ -579,6 +692,10 @@ class AIAgent:
"X-OpenRouter-Title": "Hermes Agent",
"X-OpenRouter-Categories": "productivity,cli-agent",
}
elif "api.githubcopilot.com" in effective_base.lower():
from hermes_cli.models import copilot_default_headers
client_kwargs["default_headers"] = copilot_default_headers()
elif "api.kimi.com" in effective_base.lower():
client_kwargs["default_headers"] = {
"User-Agent": "KimiCLI/1.3",
@@ -609,6 +726,7 @@ class AIAgent:
}
self._client_kwargs = client_kwargs # stored for rebuilding after interrupt
self.api_key = client_kwargs.get("api_key", "")
try:
self.client = self._create_openai_client(client_kwargs, reason="agent_init", shared=True)
if not self.quiet_mode:
@@ -732,16 +850,24 @@ class AIAgent:
from tools.todo_tool import TodoStore
self._todo_store = TodoStore()
# Load config once for memory, skills, and compression sections
try:
from hermes_cli.config import load_config as _load_agent_config
_agent_cfg = _load_agent_config()
except Exception:
_agent_cfg = {}
# Persistent memory (MEMORY.md + USER.md) -- loaded from disk
self._memory_store = None
self._memory_enabled = False
self._user_profile_enabled = False
self._memory_nudge_interval = 10
self._memory_flush_min_turns = 6
self._turns_since_memory = 0
self._iters_since_skill = 0
if not skip_memory:
try:
from hermes_cli.config import load_config as _load_mem_config
mem_config = _load_mem_config().get("memory", {})
mem_config = _agent_cfg.get("memory", {})
self._memory_enabled = mem_config.get("memory_enabled", False)
self._user_profile_enabled = mem_config.get("user_profile_enabled", False)
self._memory_nudge_interval = int(mem_config.get("nudge_interval", 10))
@@ -829,21 +955,16 @@ class AIAgent:
# Skills config: nudge interval for skill creation reminders
self._skill_nudge_interval = 10
try:
from hermes_cli.config import load_config as _load_skills_config
skills_config = _load_skills_config().get("skills", {})
skills_config = _agent_cfg.get("skills", {})
self._skill_nudge_interval = int(skills_config.get("creation_nudge_interval", 15))
except Exception:
pass
# Initialize context compressor for automatic context management
# Compresses conversation when approaching model's context limit
# Configuration via config.yaml (compression section)
try:
from hermes_cli.config import load_config as _load_compression_config
_compression_cfg = _load_compression_config().get("compression", {})
if not isinstance(_compression_cfg, dict):
_compression_cfg = {}
except ImportError:
_compression_cfg = _agent_cfg.get("compression", {})
if not isinstance(_compression_cfg, dict):
_compression_cfg = {}
compression_threshold = float(_compression_cfg.get("threshold", 0.50))
compression_enabled = str(_compression_cfg.get("enabled", True)).lower() in ("true", "1", "yes")
@@ -858,6 +979,7 @@ class AIAgent:
summary_model_override=compression_summary_model,
quiet_mode=self.quiet_mode,
base_url=self.base_url,
api_key=getattr(self, "api_key", ""),
)
self.compression_enabled = compression_enabled
self._user_turn_count = 0
@@ -913,8 +1035,8 @@ class AIAgent:
OpenAI models use 'max_tokens'.
"""
_is_direct_openai = (
"api.openai.com" in self.base_url.lower()
and "openrouter" not in self.base_url.lower()
"api.openai.com" in self._base_url_lower
and "openrouter" not in self._base_url_lower
)
if _is_direct_openai:
return {"max_completion_tokens": value}
@@ -1831,28 +1953,38 @@ class AIAgent:
is stable across all turns in a session, maximizing prefix cache hits.
"""
# Layers (in order):
# 1. Default agent identity (always present)
# 1. Agent identity — SOUL.md when available, else DEFAULT_AGENT_IDENTITY
# 2. User / gateway system prompt (if provided)
# 3. Persistent memory (frozen snapshot)
# 4. Skills guidance (if skills tools are loaded)
# 5. Context files (SOUL.md, AGENTS.md, .cursorrules)
# 5. Context files (AGENTS.md, .cursorrules — SOUL.md excluded here when used as identity)
# 6. Current date & time (frozen at build time)
# 7. Platform-specific formatting hint
# If an AI peer name is configured in Honcho, personalise the identity line.
_ai_peer_name = (
self._honcho_config.ai_peer
if self._honcho_config and self._honcho_config.ai_peer != "hermes"
else None
)
if _ai_peer_name:
_identity = DEFAULT_AGENT_IDENTITY.replace(
"You are Hermes Agent",
f"You are {_ai_peer_name}",
1,
# Try SOUL.md as primary identity (unless context files are skipped)
_soul_loaded = False
if not self.skip_context_files:
_soul_content = load_soul_md()
if _soul_content:
prompt_parts = [_soul_content]
_soul_loaded = True
if not _soul_loaded:
# Fallback to hardcoded identity
_ai_peer_name = (
self._honcho_config.ai_peer
if self._honcho_config and self._honcho_config.ai_peer != "hermes"
else None
)
else:
_identity = DEFAULT_AGENT_IDENTITY
prompt_parts = [_identity]
if _ai_peer_name:
_identity = DEFAULT_AGENT_IDENTITY.replace(
"You are Hermes Agent",
f"You are {_ai_peer_name}",
1,
)
else:
_identity = DEFAULT_AGENT_IDENTITY
prompt_parts = [_identity]
# Tool-aware behavioral guidance: only inject when the tools are loaded
tool_guidance = []
@@ -1948,7 +2080,7 @@ class AIAgent:
prompt_parts.append(skills_prompt)
if not self.skip_context_files:
context_files_prompt = build_context_files_prompt()
context_files_prompt = build_context_files_prompt(skip_soul=_soul_loaded)
if context_files_prompt:
prompt_parts.append(context_files_prompt)
@@ -1957,6 +2089,10 @@ class AIAgent:
timestamp_line = f"Conversation started: {now.strftime('%A, %B %d, %Y %I:%M %p')}"
if self.pass_session_id and self.session_id:
timestamp_line += f"\nSession ID: {self.session_id}"
if self.model:
timestamp_line += f"\nModel: {self.model}"
if self.provider:
timestamp_line += f"\nProvider: {self.provider}"
prompt_parts.append(timestamp_line)
platform_key = (self.platform or "").lower().strip()
@@ -2685,10 +2821,23 @@ class AIAgent:
if isinstance(client, Mock):
return False
if bool(getattr(client, "is_closed", False)):
return True
http_client = getattr(client, "_client", None)
return bool(getattr(http_client, "is_closed", False))
def _create_openai_client(self, client_kwargs: dict, *, reason: str, shared: bool) -> Any:
if self.provider == "copilot-acp" or str(client_kwargs.get("base_url", "")).startswith("acp://copilot"):
from agent.copilot_acp_client import CopilotACPClient
client = CopilotACPClient(**client_kwargs)
logger.info(
"Copilot ACP client created (%s, shared=%s) %s",
reason,
shared,
self._client_log_context(),
)
return client
client = OpenAI(**client_kwargs)
logger.info(
"OpenAI client created (%s, shared=%s) %s",
@@ -2951,6 +3100,9 @@ class AIAgent:
return False
self._anthropic_api_key = new_token
# Update OAuth flag — token type may have changed (API key ↔ OAuth)
from agent.anthropic_adapter import _is_oauth_token
self._is_anthropic_oauth = _is_oauth_token(new_token)
return True
def _anthropic_messages_create(self, api_kwargs: dict):
@@ -3327,11 +3479,11 @@ class AIAgent:
# Determine api_mode from provider
fb_api_mode = "chat_completions"
fb_base_url = str(fb_client.base_url)
if fb_provider == "openai-codex":
fb_api_mode = "codex_responses"
elif fb_provider == "anthropic":
elif fb_provider == "anthropic" or fb_base_url.rstrip("/").lower().endswith("/anthropic"):
fb_api_mode = "anthropic_messages"
fb_base_url = str(fb_client.base_url)
old_model = self.model
self.model = fb_model
@@ -3342,11 +3494,12 @@ class AIAgent:
if fb_api_mode == "anthropic_messages":
# Build native Anthropic client instead of using OpenAI client
from agent.anthropic_adapter import build_anthropic_client, resolve_anthropic_token
from agent.anthropic_adapter import build_anthropic_client, resolve_anthropic_token, _is_oauth_token
effective_key = fb_client.api_key or resolve_anthropic_token() or ""
self._anthropic_api_key = effective_key
self._anthropic_base_url = getattr(fb_client, "base_url", None)
self._anthropic_client = build_anthropic_client(effective_key, self._anthropic_base_url)
self._is_anthropic_oauth = _is_oauth_token(effective_key)
self.client = None
self._client_kwargs = {}
else:
@@ -3544,6 +3697,11 @@ class AIAgent:
if not instructions:
instructions = DEFAULT_AGENT_IDENTITY
is_github_responses = (
"models.github.ai" in self.base_url.lower()
or "api.githubcopilot.com" in self.base_url.lower()
)
# Resolve reasoning effort: config > default (medium)
reasoning_effort = "medium"
reasoning_enabled = True
@@ -3561,13 +3719,23 @@ class AIAgent:
"tool_choice": "auto",
"parallel_tool_calls": True,
"store": False,
"prompt_cache_key": self.session_id,
}
if not is_github_responses:
kwargs["prompt_cache_key"] = self.session_id
if reasoning_enabled:
kwargs["reasoning"] = {"effort": reasoning_effort, "summary": "auto"}
kwargs["include"] = ["reasoning.encrypted_content"]
else:
if is_github_responses:
# Copilot's Responses route advertises reasoning-effort support,
# but not OpenAI-specific prompt cache or encrypted reasoning
# fields. Keep the payload to the documented subset.
github_reasoning = self._github_models_reasoning_extra_body()
if github_reasoning is not None:
kwargs["reasoning"] = github_reasoning
else:
kwargs["reasoning"] = {"effort": reasoning_effort, "summary": "auto"}
kwargs["include"] = ["reasoning.encrypted_content"]
elif not is_github_responses:
kwargs["include"] = []
if self.max_tokens is not None:
@@ -3637,7 +3805,11 @@ class AIAgent:
extra_body = {}
_is_openrouter = "openrouter" in self.base_url.lower()
_is_openrouter = "openrouter" in self._base_url_lower
_is_github_models = (
"models.github.ai" in self._base_url_lower
or "api.githubcopilot.com" in self._base_url_lower
)
# Provider preferences (only, ignore, order, sort) are OpenRouter-
# specific. Only send to OpenRouter-compatible endpoints.
@@ -3645,22 +3817,27 @@ class AIAgent:
# for _is_nous when their backend is updated.
if provider_preferences and _is_openrouter:
extra_body["provider"] = provider_preferences
_is_nous = "nousresearch" in self.base_url.lower()
_is_nous = "nousresearch" in self._base_url_lower
if self._supports_reasoning_extra_body():
if self.reasoning_config is not None:
rc = dict(self.reasoning_config)
# Nous Portal requires reasoning enabled — don't send
# enabled=false to it (would cause 400).
if _is_nous and rc.get("enabled") is False:
pass # omit reasoning entirely for Nous when disabled
else:
extra_body["reasoning"] = rc
if _is_github_models:
github_reasoning = self._github_models_reasoning_extra_body()
if github_reasoning is not None:
extra_body["reasoning"] = github_reasoning
else:
extra_body["reasoning"] = {
"enabled": True,
"effort": "medium"
}
if self.reasoning_config is not None:
rc = dict(self.reasoning_config)
# Nous Portal requires reasoning enabled — don't send
# enabled=false to it (would cause 400).
if _is_nous and rc.get("enabled") is False:
pass # omit reasoning entirely for Nous when disabled
else:
extra_body["reasoning"] = rc
else:
extra_body["reasoning"] = {
"enabled": True,
"effort": "medium"
}
# Nous Portal product attribution
if _is_nous:
@@ -3678,14 +3855,20 @@ class AIAgent:
Some providers/routes reject `reasoning` with 400s, so gate it to
known reasoning-capable model families and direct Nous Portal.
"""
base_url = (self.base_url or "").lower()
if "nousresearch" in base_url:
if "nousresearch" in self._base_url_lower:
return True
if "ai-gateway.vercel.sh" in base_url:
if "ai-gateway.vercel.sh" in self._base_url_lower:
return True
if "openrouter" not in base_url:
if "models.github.ai" in self._base_url_lower or "api.githubcopilot.com" in self._base_url_lower:
try:
from hermes_cli.models import github_model_reasoning_efforts
return bool(github_model_reasoning_efforts(self.model))
except Exception:
return False
if "openrouter" not in self._base_url_lower:
return False
if "api.mistral.ai" in base_url:
if "api.mistral.ai" in self._base_url_lower:
return False
model = (self.model or "").lower()
@@ -3699,6 +3882,38 @@ class AIAgent:
)
return any(model.startswith(prefix) for prefix in reasoning_model_prefixes)
def _github_models_reasoning_extra_body(self) -> dict | None:
"""Format reasoning payload for GitHub Models/OpenAI-compatible routes."""
try:
from hermes_cli.models import github_model_reasoning_efforts
except Exception:
return None
supported_efforts = github_model_reasoning_efforts(self.model)
if not supported_efforts:
return None
if self.reasoning_config and isinstance(self.reasoning_config, dict):
if self.reasoning_config.get("enabled") is False:
return None
requested_effort = str(
self.reasoning_config.get("effort", "medium")
).strip().lower()
else:
requested_effort = "medium"
if requested_effort == "xhigh" and "high" in supported_efforts:
requested_effort = "high"
elif requested_effort not in supported_efforts:
if requested_effort == "minimal" and "low" in supported_efforts:
requested_effort = "low"
elif "medium" in supported_efforts:
requested_effort = "medium"
else:
requested_effort = supported_efforts[0]
return {"effort": requested_effort}
def _build_assistant_message(self, assistant_message, finish_reason: str) -> dict:
"""Build a normalized assistant message dict from an API response message.
@@ -3871,7 +4086,7 @@ class AIAgent:
try:
# Build API messages for the flush call
_is_strict_api = "api.mistral.ai" in self.base_url.lower()
_is_strict_api = "api.mistral.ai" in self._base_url_lower
api_messages = []
for msg in messages:
api_msg = msg.copy()
@@ -4060,20 +4275,17 @@ class AIAgent:
def _execute_tool_calls(self, assistant_message, messages: list, effective_task_id: str, api_call_count: int = 0) -> None:
"""Execute tool calls from the assistant message and append results to messages.
Dispatches to concurrent execution when multiple independent tool calls
are present, falling back to sequential execution for single calls or
when interactive tools (e.g. clarify) are in the batch.
Dispatches to concurrent execution only for batches that look
independent: read-only tools may always share the parallel path, while
file reads/writes may do so only when their target paths do not overlap.
"""
tool_calls = assistant_message.tool_calls
# Single tool call or interactive tool present → sequential
if (len(tool_calls) <= 1
or any(tc.function.name in _NEVER_PARALLEL_TOOLS for tc in tool_calls)):
if not _should_parallelize_tool_batch(tool_calls):
return self._execute_tool_calls_sequential(
assistant_message, messages, effective_task_id, api_call_count
)
# Multiple non-interactive tools → concurrent
return self._execute_tool_calls_concurrent(
assistant_message, messages, effective_task_id, api_call_count
)
@@ -4647,7 +4859,7 @@ class AIAgent:
try:
# Build API messages, stripping internal-only fields
# (finish_reason, reasoning) that strict APIs like Mistral reject with 422
_is_strict_api = "api.mistral.ai" in self.base_url.lower()
_is_strict_api = "api.mistral.ai" in self._base_url_lower
api_messages = []
for msg in messages:
api_msg = msg.copy()
@@ -4668,7 +4880,7 @@ class AIAgent:
api_messages.insert(sys_offset + idx, pfm.copy())
summary_extra_body = {}
_is_nous = "nousresearch" in self.base_url.lower()
_is_nous = "nousresearch" in self._base_url_lower
if self._supports_reasoning_extra_body():
if self.reasoning_config is not None:
summary_extra_body["reasoning"] = self.reasoning_config
@@ -4831,8 +5043,9 @@ class AIAgent:
self._incomplete_scratchpad_retries = 0
self._codex_incomplete_retries = 0
self._last_content_with_tools = None
self._turns_since_memory = 0
self._iters_since_skill = 0
# NOTE: _turns_since_memory and _iters_since_skill are NOT reset here.
# They are initialized in __init__ and must persist across run_conversation
# calls so that nudge logic accumulates correctly in CLI mode.
self.iteration_budget = IterationBudget(self.max_iterations)
# Initialize conversation (copy to avoid mutating the caller's list)
@@ -5085,7 +5298,7 @@ class AIAgent:
# strict providers like Mistral that reject unknown fields with 422.
# Uses new dicts so the internal messages list retains the fields
# for Codex Responses compatibility.
if "api.mistral.ai" in self.base_url.lower():
if "api.mistral.ai" in self._base_url_lower:
self._sanitize_tool_calls_for_strict_api(api_msg)
# Keep 'reasoning_details' - OpenRouter uses this for multi-turn reasoning context
# The signature field helps maintain reasoning continuity
@@ -5457,6 +5670,7 @@ class AIAgent:
canonical_usage,
provider=self.provider,
base_url=self.base_url,
api_key=getattr(self, "api_key", ""),
)
if cost_result.amount_usd is not None:
self.session_estimated_cost_usd += float(cost_result.amount_usd)
@@ -5850,10 +6064,6 @@ class AIAgent:
self._client_log_context(),
api_error,
)
if retry_count >= max_retries:
self._vprint(f"{self.log_prefix}⚠️ API call failed after {retry_count} attempts: {str(api_error)[:100]}")
self._vprint(f"{self.log_prefix}⏳ Final retry in {wait_time}s...")
# Sleep in small increments so we can respond to interrupts quickly
# instead of blocking the entire wait_time in one sleep() call
sleep_end = time.time() + wait_time
+300
View File
@@ -0,0 +1,300 @@
---
name: hermes-agent-setup
description: Help users configure Hermes Agent — CLI usage, setup wizard, model/provider selection, tools, skills, voice/STT/TTS, gateway, and troubleshooting. Use when someone asks to enable features, configure settings, or needs help with Hermes itself.
version: 1.1.0
author: Hermes Agent
tags: [setup, configuration, tools, stt, tts, voice, hermes, cli, skills]
---
# Hermes Agent Setup & Configuration
Use this skill when a user asks about configuring Hermes, enabling features, setting up voice, managing tools/skills, or troubleshooting.
## Key Paths
- Config: `~/.hermes/config.yaml`
- API keys: `~/.hermes/.env`
- Skills: `~/.hermes/skills/`
- Hermes install: `~/.hermes/hermes-agent/`
- Venv: `~/.hermes/hermes-agent/.venv/` (or `venv/`)
## CLI Overview
Hermes is used via the `hermes` command (or `python -m hermes_cli.main` from the repo).
### Core commands:
```
hermes Interactive chat (default)
hermes chat -q "question" Single query, then exit
hermes chat -m MODEL Chat with a specific model
hermes -c Resume most recent session
hermes -c "project name" Resume session by name
hermes --resume SESSION_ID Resume by exact ID
hermes -w Isolated git worktree mode
hermes -s skill1,skill2 Preload skills for the session
hermes --yolo Skip dangerous command approval
```
### Configuration & setup:
```
hermes setup Interactive setup wizard (provider, API keys, model)
hermes model Interactive model/provider selection
hermes config View current configuration
hermes config edit Open config.yaml in $EDITOR
hermes config set KEY VALUE Set a config value directly
hermes login Authenticate with a provider
hermes logout Clear stored auth
hermes doctor Check configuration and dependencies
```
### Tools & skills:
```
hermes tools Interactive tool enable/disable per platform
hermes skills list List installed skills
hermes skills search QUERY Search the skills hub
hermes skills install NAME Install a skill from the hub
hermes skills config Enable/disable skills per platform
```
### Gateway (messaging platforms):
```
hermes gateway run Start the messaging gateway
hermes gateway install Install gateway as background service
hermes gateway status Check gateway status
```
### Session management:
```
hermes sessions list List past sessions
hermes sessions browse Interactive session picker
hermes sessions rename ID TITLE Rename a session
hermes sessions export ID Export session as markdown
hermes sessions prune Clean up old sessions
```
### Other:
```
hermes status Show status of all components
hermes cron list List cron jobs
hermes insights Usage analytics
hermes update Update to latest version
hermes pairing Manage DM authorization codes
```
## Setup Wizard (`hermes setup`)
The interactive setup wizard walks through:
1. **Provider selection** — OpenRouter, Anthropic, OpenAI, Google, DeepSeek, and many more
2. **API key entry** — stores securely in the env file
3. **Model selection** — picks from available models for the chosen provider
4. **Basic settings** — reasoning effort, tool preferences
Run it from terminal:
```bash
cd ~/.hermes/hermes-agent
source .venv/bin/activate
python -m hermes_cli.main setup
```
To change just the model/provider later: `hermes model`
## Skills Configuration (`hermes skills`)
Skills are reusable instruction sets that extend what Hermes can do.
### Managing skills:
```bash
hermes skills list # Show installed skills
hermes skills search "docker" # Search the hub
hermes skills install NAME # Install from hub
hermes skills config # Enable/disable per platform
```
### Per-platform skill control:
`hermes skills config` opens an interactive UI where you can enable or disable specific skills for each platform (cli, telegram, discord, etc.). Disabled skills won't appear in the agent's available skills list for that platform.
### Loading skills in a session:
- CLI: `hermes -s skill-name` or `hermes -s skill1,skill2`
- Chat: `/skill skill-name`
- Gateway: type `/skill skill-name` in any chat
## Voice Messages (STT)
Voice messages from Telegram/Discord/WhatsApp/Slack/Signal are auto-transcribed when an STT provider is available.
### Provider priority (auto-detected):
1. **Local faster-whisper** — free, no API key, runs on CPU/GPU
2. **Groq Whisper** — free tier, needs GROQ_API_KEY
3. **OpenAI Whisper** — paid, needs VOICE_TOOLS_OPENAI_KEY
### Setup local STT (recommended):
```bash
cd ~/.hermes/hermes-agent
source .venv/bin/activate # or: source venv/bin/activate
pip install faster-whisper
```
Add to config.yaml under the `stt:` section:
```yaml
stt:
enabled: true
provider: local
local:
model: base # Options: tiny, base, small, medium, large-v3
```
Model downloads automatically on first use (~150 MB for base).
### Setup Groq STT (free cloud):
1. Get free key from https://console.groq.com
2. Add GROQ_API_KEY to the env file
3. Set provider to groq in config.yaml stt section
### Verify STT:
After config changes, restart the gateway (send /restart in chat, or restart `hermes gateway run`). Then send a voice message.
## Voice Replies (TTS)
Hermes can reply with voice when users send voice messages.
### TTS providers (set API key in env file):
| Provider | Env var | Free? |
|----------|---------|-------|
| ElevenLabs | ELEVENLABS_API_KEY | Free tier |
| OpenAI | VOICE_TOOLS_OPENAI_KEY | Paid |
| Kokoro (local) | None needed | Free |
| Fish Audio | FISH_AUDIO_API_KEY | Free tier |
### Voice commands (in any chat):
- `/voice on` — voice reply to voice messages only
- `/voice tts` — voice reply to all messages
- `/voice off` — text only (default)
## Enabling/Disabling Tools (`hermes tools`)
### Interactive tool config:
```bash
cd ~/.hermes/hermes-agent
source .venv/bin/activate
python -m hermes_cli.main tools
```
This opens a curses UI to enable/disable toolsets per platform (cli, telegram, discord, slack, etc.).
### After changing tools:
Use `/reset` in the chat to start a fresh session with the new toolset. Tool changes do NOT take effect mid-conversation (this preserves prompt caching and avoids cost spikes).
### Common toolsets:
| Toolset | What it provides |
|---------|-----------------|
| terminal | Shell command execution |
| file | File read/write/search/patch |
| web | Web search and extraction |
| browser | Browser automation (needs Browserbase) |
| image_gen | AI image generation |
| mcp | MCP server connections |
| voice | Text-to-speech output |
| cronjob | Scheduled tasks |
## Installing Dependencies
Some tools need extra packages:
```bash
cd ~/.hermes/hermes-agent && source .venv/bin/activate
pip install faster-whisper # Local STT (voice transcription)
pip install browserbase # Browser automation
pip install mcp # MCP server connections
```
## Config File Reference
The main config file is `~/.hermes/config.yaml`. Key sections:
```yaml
# Model and provider
model:
default: anthropic/claude-opus-4.6
provider: openrouter
# Agent behavior
agent:
max_turns: 90
reasoning_effort: high # xhigh, high, medium, low, minimal, none
# Voice
stt:
enabled: true
provider: local # local, groq, openai
tts:
provider: elevenlabs # elevenlabs, openai, kokoro, fish
# Display
display:
skin: default # default, ares, mono, slate
tool_progress: full # full, compact, off
background_process_notifications: all # all, result, error, off
```
Edit with `hermes config edit` or `hermes config set KEY VALUE`.
## Gateway Commands (Messaging Platforms)
| Command | What it does |
|---------|-------------|
| /reset or /new | Fresh session (picks up new tool config) |
| /help | Show all commands |
| /model [name] | Show or change model |
| /compact | Compress conversation to save context |
| /voice [mode] | Configure voice replies |
| /reasoning [effort] | Set reasoning level |
| /sethome | Set home channel for cron/notifications |
| /restart | Restart the gateway (picks up config changes) |
| /status | Show session info |
| /retry | Retry last message |
| /undo | Remove last exchange |
| /personality [name] | Set agent personality |
| /skill [name] | Load a skill |
## Troubleshooting
### Voice messages not working
1. Check stt.enabled is true in config.yaml
2. Check a provider is available (faster-whisper installed, or API key set)
3. Restart gateway after config changes (/restart)
### Tool not available
1. Run `hermes tools` to check if the toolset is enabled for your platform
2. Some tools need env vars — check the env file
3. Use /reset after enabling tools
### Model/provider issues
1. Run `hermes doctor` to check configuration
2. Run `hermes login` to re-authenticate
3. Check the env file has the right API key
### Changes not taking effect
- Gateway: /reset for tool changes, /restart for config changes
- CLI: start a new session
### Skills not showing up
1. Check `hermes skills list` shows the skill
2. Check `hermes skills config` has it enabled for your platform
3. Load explicitly with `/skill name` or `hermes -s name`
+80
View File
@@ -0,0 +1,80 @@
---
name: huggingface-hub
description: Hugging Face Hub CLI (hf) — search, download, and upload models and datasets, manage repos, query datasets with SQL, deploy inference endpoints, manage Spaces and buckets.
version: 1.0.0
author: Hugging Face
license: MIT
tags: [huggingface, hf, models, datasets, hub, mlops]
---
# Hugging Face CLI (`hf`) Reference Guide
The `hf` command is the modern command-line interface for interacting with the Hugging Face Hub, providing tools to manage repositories, models, datasets, and Spaces.
> **IMPORTANT:** The `hf` command replaces the now deprecated `huggingface-cli` command.
## Quick Start
* **Installation:** `curl -LsSf https://hf.co/cli/install.sh | bash -s`
* **Help:** Use `hf --help` to view all available functions and real-world examples.
* **Authentication:** Recommended via `HF_TOKEN` environment variable or the `--token` flag.
---
## Core Commands
### General Operations
* `hf download REPO_ID`: Download files from the Hub.
* `hf upload REPO_ID`: Upload files/folders (recommended for single-commit).
* `hf upload-large-folder REPO_ID LOCAL_PATH`: Recommended for resumable uploads of large directories.
* `hf sync`: Sync files between a local directory and a bucket.
* `hf env` / `hf version`: View environment and version details.
### Authentication (`hf auth`)
* `login` / `logout`: Manage sessions using tokens from [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens).
* `list` / `switch`: Manage and toggle between multiple stored access tokens.
* `whoami`: Identify the currently logged-in account.
### Repository Management (`hf repos`)
* `create` / `delete`: Create or permanently remove repositories.
* `duplicate`: Clone a model, dataset, or Space to a new ID.
* `move`: Transfer a repository between namespaces.
* `branch` / `tag`: Manage Git-like references.
* `delete-files`: Remove specific files using patterns.
---
## Specialized Hub Interactions
### Datasets & Models
* **Datasets:** `hf datasets list`, `info`, and `parquet` (list parquet URLs).
* **SQL Queries:** `hf datasets sql SQL` — Execute raw SQL via DuckDB against dataset parquet URLs.
* **Models:** `hf models list` and `info`.
* **Papers:** `hf papers list` — View daily papers.
### Discussions & Pull Requests (`hf discussions`)
* Manage the lifecycle of Hub contributions: `list`, `create`, `info`, `comment`, `close`, `reopen`, and `rename`.
* `diff`: View changes in a PR.
* `merge`: Finalize pull requests.
### Infrastructure & Compute
* **Endpoints:** Deploy and manage Inference Endpoints (`deploy`, `pause`, `resume`, `scale-to-zero`, `catalog`).
* **Jobs:** Run compute tasks on HF infrastructure. Includes `hf jobs uv` for running Python scripts with inline dependencies and `stats` for resource monitoring.
* **Spaces:** Manage interactive apps. Includes `dev-mode` and `hot-reload` for Python files without full restarts.
### Storage & Automation
* **Buckets:** Full S3-like bucket management (`create`, `cp`, `mv`, `rm`, `sync`).
* **Cache:** Manage local storage with `list`, `prune` (remove detached revisions), and `verify` (checksum checks).
* **Webhooks:** Automate workflows by managing Hub webhooks (`create`, `watch`, `enable`/`disable`).
* **Collections:** Organize Hub items into collections (`add-item`, `update`, `list`).
---
## Advanced Usage & Tips
### Global Flags
* `--format json`: Produces machine-readable output for automation.
* `-q` / `--quiet`: Limits output to IDs only.
### Extensions & Skills
* **Extensions:** Extend CLI functionality via GitHub repositories using `hf extensions install REPO_ID`.
* **Skills:** Manage AI assistant skills with `hf skills add`.
+25
View File
@@ -248,6 +248,31 @@ class TestVisionClientFallback:
assert client.__class__.__name__ == "AnthropicAuxiliaryClient"
assert model == "claude-haiku-4-5-20251001"
def test_resolve_provider_client_copilot_uses_runtime_credentials(self, monkeypatch):
monkeypatch.delenv("GITHUB_TOKEN", raising=False)
monkeypatch.delenv("GH_TOKEN", raising=False)
with (
patch(
"hermes_cli.auth.resolve_api_key_provider_credentials",
return_value={
"provider": "copilot",
"api_key": "gh-cli-token",
"base_url": "https://api.githubcopilot.com",
"source": "gh auth token",
},
),
patch("agent.auxiliary_client.OpenAI") as mock_openai,
):
client, model = resolve_provider_client("copilot", model="gpt-5.4")
assert client is not None
assert model == "gpt-5.4"
call_kwargs = mock_openai.call_args.kwargs
assert call_kwargs["api_key"] == "gh-cli-token"
assert call_kwargs["base_url"] == "https://api.githubcopilot.com"
assert call_kwargs["default_headers"]["Editor-Version"]
def test_vision_auto_uses_anthropic_when_no_higher_priority_backend(self, monkeypatch):
monkeypatch.setenv("ANTHROPIC_API_KEY", "sk-ant-api03-key")
with (
+49
View File
@@ -188,6 +188,36 @@ class TestGetModelContextLength:
result = get_model_context_length("custom/model")
assert result == CONTEXT_PROBE_TIERS[0]
@patch("agent.model_metadata.fetch_model_metadata")
@patch("agent.model_metadata.fetch_endpoint_model_metadata")
def test_custom_endpoint_metadata_beats_fuzzy_default(self, mock_endpoint_fetch, mock_fetch):
mock_fetch.return_value = {}
mock_endpoint_fetch.return_value = {
"zai-org/GLM-5-TEE": {"context_length": 65536}
}
result = get_model_context_length(
"zai-org/GLM-5-TEE",
base_url="https://llm.chutes.ai/v1",
api_key="test-key",
)
assert result == 65536
@patch("agent.model_metadata.fetch_model_metadata")
@patch("agent.model_metadata.fetch_endpoint_model_metadata")
def test_custom_endpoint_without_metadata_skips_name_based_default(self, mock_endpoint_fetch, mock_fetch):
mock_fetch.return_value = {}
mock_endpoint_fetch.return_value = {}
result = get_model_context_length(
"zai-org/GLM-5-TEE",
base_url="https://llm.chutes.ai/v1",
api_key="test-key",
)
assert result == CONTEXT_PROBE_TIERS[0]
# =========================================================================
# fetch_model_metadata — caching, TTL, slugs, failures
@@ -258,6 +288,25 @@ class TestFetchModelMetadata:
assert "anthropic/claude-3.5-sonnet" in result
assert result["anthropic/claude-3.5-sonnet"]["context_length"] == 200000
@patch("agent.model_metadata.requests.get")
def test_provider_prefixed_models_get_bare_aliases(self, mock_get):
self._reset_cache()
mock_response = MagicMock()
mock_response.json.return_value = {
"data": [{
"id": "provider/test-model",
"context_length": 123456,
"name": "Provider: Test Model",
}]
}
mock_response.raise_for_status = MagicMock()
mock_get.return_value = mock_response
result = fetch_model_metadata(force_refresh=True)
assert result["provider/test-model"]["context_length"] == 123456
assert result["test-model"]["context_length"] == 123456
@patch("agent.model_metadata.requests.get")
def test_ttl_expiry_triggers_refetch(self, mock_get):
"""Cache expires after _MODEL_CACHE_TTL seconds."""
+29
View File
@@ -309,6 +309,35 @@ class TestBuildSkillsSystemPrompt:
assert "imessage" in result
assert "Send iMessages" in result
def test_excludes_disabled_skills(self, monkeypatch, tmp_path):
"""Skills in the user's disabled list should not appear in the system prompt."""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
skills_dir = tmp_path / "skills" / "tools"
skills_dir.mkdir(parents=True)
enabled_skill = skills_dir / "web-search"
enabled_skill.mkdir()
(enabled_skill / "SKILL.md").write_text(
"---\nname: web-search\ndescription: Search the web\n---\n"
)
disabled_skill = skills_dir / "old-tool"
disabled_skill.mkdir()
(disabled_skill / "SKILL.md").write_text(
"---\nname: old-tool\ndescription: Deprecated tool\n---\n"
)
from unittest.mock import patch
with patch(
"tools.skills_tool._get_disabled_skill_names",
return_value={"old-tool"},
):
result = build_skills_system_prompt()
assert "web-search" in result
assert "old-tool" not in result
def test_includes_setup_needed_skills(self, monkeypatch, tmp_path):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
monkeypatch.delenv("MISSING_API_KEY_XYZ", raising=False)
+15
View File
@@ -85,6 +85,21 @@ class TestScanSkillCommands:
result = scan_skill_commands()
assert "/generic-tool" in result
def test_excludes_disabled_skills(self, tmp_path):
"""Disabled skills should not register slash commands."""
with (
patch("tools.skills_tool.SKILLS_DIR", tmp_path),
patch(
"tools.skills_tool._get_disabled_skill_names",
return_value={"disabled-skill"},
),
):
_make_skill(tmp_path, "enabled-skill")
_make_skill(tmp_path, "disabled-skill")
result = scan_skill_commands()
assert "/enabled-skill" in result
assert "/disabled-skill" not in result
class TestBuildPreloadedSkillsPrompt:
def test_builds_prompt_for_multiple_named_skills(self, tmp_path):
+24
View File
@@ -99,3 +99,27 @@ def test_estimate_usage_cost_refuses_cache_pricing_without_official_cache_rate(m
)
assert result.status == "unknown"
def test_custom_endpoint_models_api_pricing_is_supported(monkeypatch):
monkeypatch.setattr(
"agent.usage_pricing.fetch_endpoint_model_metadata",
lambda base_url, api_key=None: {
"zai-org/GLM-5-TEE": {
"pricing": {
"prompt": "0.0000005",
"completion": "0.000002",
}
}
},
)
entry = get_pricing_entry(
"zai-org/GLM-5-TEE",
provider="custom",
base_url="https://llm.chutes.ai/v1",
api_key="test-key",
)
assert float(entry.input_cost_per_million) == 0.5
assert float(entry.output_cost_per_million) == 2.0
+80 -1
View File
@@ -2,7 +2,7 @@
import json
import pytest
from datetime import datetime, timedelta
from datetime import datetime, timedelta, timezone
from pathlib import Path
from unittest.mock import patch
@@ -122,11 +122,29 @@ class TestComputeNextRun:
schedule = {"kind": "once", "run_at": future}
assert compute_next_run(schedule) == future
def test_once_recent_past_within_grace_returns_time(self, monkeypatch):
now = datetime(2026, 3, 18, 4, 22, 3, tzinfo=timezone.utc)
run_at = "2026-03-18T04:22:00+00:00"
monkeypatch.setattr("cron.jobs._hermes_now", lambda: now)
schedule = {"kind": "once", "run_at": run_at}
assert compute_next_run(schedule) == run_at
def test_once_past_returns_none(self):
past = (datetime.now() - timedelta(hours=1)).isoformat()
schedule = {"kind": "once", "run_at": past}
assert compute_next_run(schedule) is None
def test_once_with_last_run_returns_none_even_within_grace(self, monkeypatch):
now = datetime(2026, 3, 18, 4, 22, 3, tzinfo=timezone.utc)
run_at = "2026-03-18T04:22:00+00:00"
monkeypatch.setattr("cron.jobs._hermes_now", lambda: now)
schedule = {"kind": "once", "run_at": run_at}
assert compute_next_run(schedule, last_run_at=now.isoformat()) is None
def test_interval_first_run(self):
schedule = {"kind": "interval", "minutes": 60}
result = compute_next_run(schedule)
@@ -347,6 +365,67 @@ class TestGetDueJobs:
due = get_due_jobs()
assert len(due) == 0
def test_broken_recent_one_shot_without_next_run_is_recovered(self, tmp_cron_dir, monkeypatch):
now = datetime(2026, 3, 18, 4, 22, 30, tzinfo=timezone.utc)
monkeypatch.setattr("cron.jobs._hermes_now", lambda: now)
run_at = "2026-03-18T04:22:00+00:00"
save_jobs(
[{
"id": "oneshot-recover",
"name": "Recover me",
"prompt": "Word of the day",
"schedule": {"kind": "once", "run_at": run_at, "display": "once at 2026-03-18 04:22"},
"schedule_display": "once at 2026-03-18 04:22",
"repeat": {"times": 1, "completed": 0},
"enabled": True,
"state": "scheduled",
"paused_at": None,
"paused_reason": None,
"created_at": "2026-03-18T04:21:00+00:00",
"next_run_at": None,
"last_run_at": None,
"last_status": None,
"last_error": None,
"deliver": "local",
"origin": None,
}]
)
due = get_due_jobs()
assert [job["id"] for job in due] == ["oneshot-recover"]
assert get_job("oneshot-recover")["next_run_at"] == run_at
def test_broken_stale_one_shot_without_next_run_is_not_recovered(self, tmp_cron_dir, monkeypatch):
now = datetime(2026, 3, 18, 4, 30, 0, tzinfo=timezone.utc)
monkeypatch.setattr("cron.jobs._hermes_now", lambda: now)
save_jobs(
[{
"id": "oneshot-stale",
"name": "Too old",
"prompt": "Word of the day",
"schedule": {"kind": "once", "run_at": "2026-03-18T04:22:00+00:00", "display": "once at 2026-03-18 04:22"},
"schedule_display": "once at 2026-03-18 04:22",
"repeat": {"times": 1, "completed": 0},
"enabled": True,
"state": "scheduled",
"paused_at": None,
"paused_reason": None,
"created_at": "2026-03-18T04:21:00+00:00",
"next_run_at": None,
"last_run_at": None,
"last_status": None,
"last_error": None,
"deliver": "local",
"origin": None,
}]
)
assert get_due_jobs() == []
assert get_job("oneshot-stale")["next_run_at"] is None
class TestSaveJobOutput:
def test_creates_output_file(self, tmp_cron_dir):
+95 -1
View File
@@ -7,7 +7,7 @@ from unittest.mock import AsyncMock, patch, MagicMock
import pytest
from cron.scheduler import _resolve_origin, _resolve_delivery_target, _deliver_result, run_job
from cron.scheduler import _resolve_origin, _resolve_delivery_target, _deliver_result, run_job, SILENT_MARKER
class TestResolveOrigin:
@@ -449,3 +449,97 @@ class TestRunJobSkillBacked:
assert "Instructions for blogwatcher." in prompt_arg
assert "Instructions for find-nearby." in prompt_arg
assert "Combine the results." in prompt_arg
class TestSilentDelivery:
"""Verify that [SILENT] responses suppress delivery while still saving output."""
def _make_job(self):
return {
"id": "monitor-job",
"name": "monitor",
"deliver": "origin",
"origin": {"platform": "telegram", "chat_id": "123"},
}
def test_normal_response_delivers(self):
with patch("cron.scheduler.get_due_jobs", return_value=[self._make_job()]), \
patch("cron.scheduler.run_job", return_value=(True, "# output", "Results here", None)), \
patch("cron.scheduler.save_job_output", return_value="/tmp/out.md"), \
patch("cron.scheduler._deliver_result") as deliver_mock, \
patch("cron.scheduler.mark_job_run"):
from cron.scheduler import tick
tick(verbose=False)
deliver_mock.assert_called_once()
def test_silent_response_suppresses_delivery(self, caplog):
with patch("cron.scheduler.get_due_jobs", return_value=[self._make_job()]), \
patch("cron.scheduler.run_job", return_value=(True, "# output", "[SILENT]", None)), \
patch("cron.scheduler.save_job_output", return_value="/tmp/out.md"), \
patch("cron.scheduler._deliver_result") as deliver_mock, \
patch("cron.scheduler.mark_job_run"):
from cron.scheduler import tick
with caplog.at_level(logging.INFO, logger="cron.scheduler"):
tick(verbose=False)
deliver_mock.assert_not_called()
assert any(SILENT_MARKER in r.message for r in caplog.records)
def test_silent_with_note_suppresses_delivery(self):
with patch("cron.scheduler.get_due_jobs", return_value=[self._make_job()]), \
patch("cron.scheduler.run_job", return_value=(True, "# output", "[SILENT] No changes detected", None)), \
patch("cron.scheduler.save_job_output", return_value="/tmp/out.md"), \
patch("cron.scheduler._deliver_result") as deliver_mock, \
patch("cron.scheduler.mark_job_run"):
from cron.scheduler import tick
tick(verbose=False)
deliver_mock.assert_not_called()
def test_silent_is_case_insensitive(self):
with patch("cron.scheduler.get_due_jobs", return_value=[self._make_job()]), \
patch("cron.scheduler.run_job", return_value=(True, "# output", "[silent] nothing new", None)), \
patch("cron.scheduler.save_job_output", return_value="/tmp/out.md"), \
patch("cron.scheduler._deliver_result") as deliver_mock, \
patch("cron.scheduler.mark_job_run"):
from cron.scheduler import tick
tick(verbose=False)
deliver_mock.assert_not_called()
def test_failed_job_always_delivers(self):
"""Failed jobs deliver regardless of [SILENT] in output."""
with patch("cron.scheduler.get_due_jobs", return_value=[self._make_job()]), \
patch("cron.scheduler.run_job", return_value=(False, "# output", "", "some error")), \
patch("cron.scheduler.save_job_output", return_value="/tmp/out.md"), \
patch("cron.scheduler._deliver_result") as deliver_mock, \
patch("cron.scheduler.mark_job_run"):
from cron.scheduler import tick
tick(verbose=False)
deliver_mock.assert_called_once()
def test_output_saved_even_when_delivery_suppressed(self):
with patch("cron.scheduler.get_due_jobs", return_value=[self._make_job()]), \
patch("cron.scheduler.run_job", return_value=(True, "# full output", "[SILENT]", None)), \
patch("cron.scheduler.save_job_output") as save_mock, \
patch("cron.scheduler._deliver_result") as deliver_mock, \
patch("cron.scheduler.mark_job_run"):
save_mock.return_value = "/tmp/out.md"
from cron.scheduler import tick
tick(verbose=False)
save_mock.assert_called_once_with("monitor-job", "# full output")
deliver_mock.assert_not_called()
class TestBuildJobPromptSilentHint:
"""Verify _build_job_prompt always injects [SILENT] guidance."""
def test_hint_always_present(self):
from cron.scheduler import _build_job_prompt
job = {"prompt": "Check for updates"}
result = _build_job_prompt(job)
assert "[SILENT]" in result
assert "Check for updates" in result
def test_hint_present_even_without_prompt(self):
from cron.scheduler import _build_job_prompt
job = {"prompt": ""}
result = _build_job_prompt(job)
assert "[SILENT]" in result
+34
View File
@@ -115,6 +115,22 @@ class TestGatewayConfigRoundtrip:
assert restored.quick_commands == {"limits": {"type": "exec", "command": "echo ok"}}
assert restored.group_sessions_per_user is False
def test_roundtrip_preserves_unauthorized_dm_behavior(self):
config = GatewayConfig(
unauthorized_dm_behavior="ignore",
platforms={
Platform.WHATSAPP: PlatformConfig(
enabled=True,
extra={"unauthorized_dm_behavior": "pair"},
),
},
)
restored = GatewayConfig.from_dict(config.to_dict())
assert restored.unauthorized_dm_behavior == "ignore"
assert restored.platforms[Platform.WHATSAPP].extra["unauthorized_dm_behavior"] == "pair"
class TestLoadGatewayConfig:
def test_bridges_quick_commands_from_config_yaml(self, tmp_path, monkeypatch):
@@ -158,3 +174,21 @@ class TestLoadGatewayConfig:
config = load_gateway_config()
assert config.quick_commands == {}
def test_bridges_unauthorized_dm_behavior_from_config_yaml(self, tmp_path, monkeypatch):
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
config_path = hermes_home / "config.yaml"
config_path.write_text(
"unauthorized_dm_behavior: ignore\n"
"whatsapp:\n"
" unauthorized_dm_behavior: pair\n",
encoding="utf-8",
)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
config = load_gateway_config()
assert config.unauthorized_dm_behavior == "ignore"
assert config.platforms[Platform.WHATSAPP].extra["unauthorized_dm_behavior"] == "pair"
+20
View File
@@ -42,6 +42,26 @@ class TestGatewayPidState:
assert status.get_running_pid() == os.getpid()
def test_get_running_pid_accepts_script_style_gateway_cmdline(self, tmp_path, monkeypatch):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
pid_path = tmp_path / "gateway.pid"
pid_path.write_text(json.dumps({
"pid": os.getpid(),
"kind": "hermes-gateway",
"argv": ["/venv/bin/python", "/repo/hermes_cli/main.py", "gateway", "run", "--replace"],
"start_time": 123,
}))
monkeypatch.setattr(status.os, "kill", lambda pid, sig: None)
monkeypatch.setattr(status, "_get_process_start_time", lambda pid: 123)
monkeypatch.setattr(
status,
"_read_process_cmdline",
lambda pid: "/venv/bin/python /repo/hermes_cli/main.py gateway run --replace",
)
assert status.get_running_pid() == os.getpid()
class TestGatewayRuntimeStatus:
def test_write_runtime_status_overwrites_stale_pid_on_restart(self, tmp_path, monkeypatch):
@@ -0,0 +1,137 @@
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock
import pytest
from gateway.config import GatewayConfig, Platform, PlatformConfig
from gateway.platforms.base import MessageEvent
from gateway.session import SessionSource
def _clear_auth_env(monkeypatch) -> None:
for key in (
"TELEGRAM_ALLOWED_USERS",
"DISCORD_ALLOWED_USERS",
"WHATSAPP_ALLOWED_USERS",
"SLACK_ALLOWED_USERS",
"SIGNAL_ALLOWED_USERS",
"EMAIL_ALLOWED_USERS",
"SMS_ALLOWED_USERS",
"MATTERMOST_ALLOWED_USERS",
"MATRIX_ALLOWED_USERS",
"DINGTALK_ALLOWED_USERS",
"GATEWAY_ALLOWED_USERS",
"TELEGRAM_ALLOW_ALL_USERS",
"DISCORD_ALLOW_ALL_USERS",
"WHATSAPP_ALLOW_ALL_USERS",
"SLACK_ALLOW_ALL_USERS",
"SIGNAL_ALLOW_ALL_USERS",
"EMAIL_ALLOW_ALL_USERS",
"SMS_ALLOW_ALL_USERS",
"MATTERMOST_ALLOW_ALL_USERS",
"MATRIX_ALLOW_ALL_USERS",
"DINGTALK_ALLOW_ALL_USERS",
"GATEWAY_ALLOW_ALL_USERS",
):
monkeypatch.delenv(key, raising=False)
def _make_event(platform: Platform, user_id: str, chat_id: str) -> MessageEvent:
return MessageEvent(
text="hello",
message_id="m1",
source=SessionSource(
platform=platform,
user_id=user_id,
chat_id=chat_id,
user_name="tester",
chat_type="dm",
),
)
def _make_runner(platform: Platform, config: GatewayConfig):
from gateway.run import GatewayRunner
runner = object.__new__(GatewayRunner)
runner.config = config
adapter = SimpleNamespace(send=AsyncMock())
runner.adapters = {platform: adapter}
runner.pairing_store = MagicMock()
runner.pairing_store.is_approved.return_value = False
return runner, adapter
@pytest.mark.asyncio
async def test_unauthorized_dm_pairs_by_default(monkeypatch):
_clear_auth_env(monkeypatch)
config = GatewayConfig(
platforms={Platform.WHATSAPP: PlatformConfig(enabled=True)},
)
runner, adapter = _make_runner(Platform.WHATSAPP, config)
runner.pairing_store.generate_code.return_value = "ABC12DEF"
result = await runner._handle_message(
_make_event(
Platform.WHATSAPP,
"15551234567@s.whatsapp.net",
"15551234567@s.whatsapp.net",
)
)
assert result is None
runner.pairing_store.generate_code.assert_called_once_with(
"whatsapp",
"15551234567@s.whatsapp.net",
"tester",
)
adapter.send.assert_awaited_once()
assert "ABC12DEF" in adapter.send.await_args.args[1]
@pytest.mark.asyncio
async def test_unauthorized_whatsapp_dm_can_be_ignored(monkeypatch):
_clear_auth_env(monkeypatch)
config = GatewayConfig(
platforms={
Platform.WHATSAPP: PlatformConfig(
enabled=True,
extra={"unauthorized_dm_behavior": "ignore"},
),
},
)
runner, adapter = _make_runner(Platform.WHATSAPP, config)
result = await runner._handle_message(
_make_event(
Platform.WHATSAPP,
"15551234567@s.whatsapp.net",
"15551234567@s.whatsapp.net",
)
)
assert result is None
runner.pairing_store.generate_code.assert_not_called()
adapter.send.assert_not_awaited()
@pytest.mark.asyncio
async def test_global_ignore_suppresses_pairing_reply(monkeypatch):
_clear_auth_env(monkeypatch)
config = GatewayConfig(
unauthorized_dm_behavior="ignore",
platforms={Platform.TELEGRAM: PlatformConfig(enabled=True, token="***")},
)
runner, adapter = _make_runner(Platform.TELEGRAM, config)
result = await runner._handle_message(
_make_event(
Platform.TELEGRAM,
"12345",
"12345",
)
)
assert result is None
runner.pairing_store.generate_code.assert_not_called()
adapter.send.assert_not_awaited()
+70
View File
@@ -0,0 +1,70 @@
"""Tests for banner toolset name normalization and skin color usage."""
from unittest.mock import patch
from rich.console import Console
import hermes_cli.banner as banner
import model_tools
import tools.mcp_tool
def test_display_toolset_name_strips_legacy_suffix():
assert banner._display_toolset_name("homeassistant_tools") == "homeassistant"
assert banner._display_toolset_name("honcho_tools") == "honcho"
assert banner._display_toolset_name("web_tools") == "web"
def test_display_toolset_name_preserves_clean_names():
assert banner._display_toolset_name("browser") == "browser"
assert banner._display_toolset_name("file") == "file"
assert banner._display_toolset_name("terminal") == "terminal"
def test_display_toolset_name_handles_empty():
assert banner._display_toolset_name("") == "unknown"
assert banner._display_toolset_name(None) == "unknown"
def test_build_welcome_banner_uses_normalized_toolset_names():
"""Unavailable toolsets should not have '_tools' appended in banner output."""
with (
patch.object(
model_tools,
"check_tool_availability",
return_value=(
["web"],
[
{"name": "homeassistant", "tools": ["ha_call_service"]},
{"name": "honcho", "tools": ["honcho_conclude"]},
],
),
),
patch.object(banner, "get_available_skills", return_value={}),
patch.object(banner, "get_update_result", return_value=None),
patch.object(tools.mcp_tool, "get_mcp_status", return_value=[]),
):
console = Console(
record=True, force_terminal=False, color_system=None, width=160
)
banner.build_welcome_banner(
console=console,
model="anthropic/test-model",
cwd="/tmp/project",
tools=[
{"function": {"name": "web_search"}},
{"function": {"name": "read_file"}},
],
get_toolset_for_tool=lambda name: {
"web_search": "web_tools",
"read_file": "file",
}.get(name),
)
output = console.export_text()
assert "homeassistant:" in output
assert "honcho:" in output
assert "web:" in output
assert "homeassistant_tools:" not in output
assert "honcho_tools:" not in output
assert "web_tools:" not in output
+68
View File
@@ -0,0 +1,68 @@
"""Tests for banner get_available_skills() — disabled and platform filtering."""
from unittest.mock import patch
import pytest
_MOCK_SKILLS = [
{"name": "skill-a", "description": "A skill", "category": "tools"},
{"name": "skill-b", "description": "B skill", "category": "tools"},
{"name": "skill-c", "description": "C skill", "category": "creative"},
]
def test_get_available_skills_delegates_to_find_all_skills():
"""get_available_skills should call _find_all_skills (which handles filtering)."""
with patch("tools.skills_tool._find_all_skills", return_value=list(_MOCK_SKILLS)):
from hermes_cli.banner import get_available_skills
result = get_available_skills()
assert "tools" in result
assert "creative" in result
assert sorted(result["tools"]) == ["skill-a", "skill-b"]
assert result["creative"] == ["skill-c"]
def test_get_available_skills_excludes_disabled():
"""Disabled skills should not appear in the banner count."""
# _find_all_skills already filters disabled skills, so if we give it
# a filtered list, get_available_skills should reflect that.
filtered = [s for s in _MOCK_SKILLS if s["name"] != "skill-b"]
with patch("tools.skills_tool._find_all_skills", return_value=filtered):
from hermes_cli.banner import get_available_skills
result = get_available_skills()
all_names = [n for names in result.values() for n in names]
assert "skill-b" not in all_names
assert "skill-a" in all_names
assert len(all_names) == 2
def test_get_available_skills_empty_when_no_skills():
"""No skills installed returns empty dict."""
with patch("tools.skills_tool._find_all_skills", return_value=[]):
from hermes_cli.banner import get_available_skills
result = get_available_skills()
assert result == {}
def test_get_available_skills_handles_import_failure():
"""If _find_all_skills import fails, return empty dict gracefully."""
with patch("tools.skills_tool._find_all_skills", side_effect=ImportError("boom")):
from hermes_cli.banner import get_available_skills
result = get_available_skills()
assert result == {}
def test_get_available_skills_null_category_becomes_general():
"""Skills with None category should be grouped under 'general'."""
skills = [{"name": "orphan-skill", "description": "No cat", "category": None}]
with patch("tools.skills_tool._find_all_skills", return_value=skills):
from hermes_cli.banner import get_available_skills
result = get_available_skills()
assert "general" in result
assert result["general"] == ["orphan-skill"]
+208
View File
@@ -0,0 +1,208 @@
"""Tests for hermes_cli.copilot_auth — Copilot token validation and resolution."""
import os
import pytest
from unittest.mock import patch, MagicMock
class TestTokenValidation:
"""Token type validation."""
def test_classic_pat_rejected(self):
from hermes_cli.copilot_auth import validate_copilot_token
valid, msg = validate_copilot_token("ghp_abcdefghijklmnop1234")
assert valid is False
assert "Classic Personal Access Tokens" in msg
assert "ghp_" in msg
def test_oauth_token_accepted(self):
from hermes_cli.copilot_auth import validate_copilot_token
valid, msg = validate_copilot_token("gho_abcdefghijklmnop1234")
assert valid is True
def test_fine_grained_pat_accepted(self):
from hermes_cli.copilot_auth import validate_copilot_token
valid, msg = validate_copilot_token("github_pat_abcdefghijklmnop1234")
assert valid is True
def test_github_app_token_accepted(self):
from hermes_cli.copilot_auth import validate_copilot_token
valid, msg = validate_copilot_token("ghu_abcdefghijklmnop1234")
assert valid is True
def test_empty_token_rejected(self):
from hermes_cli.copilot_auth import validate_copilot_token
valid, msg = validate_copilot_token("")
assert valid is False
def test_is_classic_pat(self):
from hermes_cli.copilot_auth import is_classic_pat
assert is_classic_pat("ghp_abc123") is True
assert is_classic_pat("gho_abc123") is False
assert is_classic_pat("github_pat_abc") is False
assert is_classic_pat("") is False
class TestResolveToken:
"""Token resolution with env var priority."""
def test_copilot_github_token_first_priority(self, monkeypatch):
from hermes_cli.copilot_auth import resolve_copilot_token
monkeypatch.setenv("COPILOT_GITHUB_TOKEN", "gho_copilot_first")
monkeypatch.setenv("GH_TOKEN", "gho_gh_second")
monkeypatch.setenv("GITHUB_TOKEN", "gho_github_third")
token, source = resolve_copilot_token()
assert token == "gho_copilot_first"
assert source == "COPILOT_GITHUB_TOKEN"
def test_gh_token_second_priority(self, monkeypatch):
from hermes_cli.copilot_auth import resolve_copilot_token
monkeypatch.delenv("COPILOT_GITHUB_TOKEN", raising=False)
monkeypatch.setenv("GH_TOKEN", "gho_gh_second")
monkeypatch.setenv("GITHUB_TOKEN", "gho_github_third")
token, source = resolve_copilot_token()
assert token == "gho_gh_second"
assert source == "GH_TOKEN"
def test_github_token_third_priority(self, monkeypatch):
from hermes_cli.copilot_auth import resolve_copilot_token
monkeypatch.delenv("COPILOT_GITHUB_TOKEN", raising=False)
monkeypatch.delenv("GH_TOKEN", raising=False)
monkeypatch.setenv("GITHUB_TOKEN", "gho_github_third")
token, source = resolve_copilot_token()
assert token == "gho_github_third"
assert source == "GITHUB_TOKEN"
def test_classic_pat_in_env_skipped(self, monkeypatch):
"""Classic PATs in env vars should be skipped, not returned."""
from hermes_cli.copilot_auth import resolve_copilot_token
monkeypatch.setenv("COPILOT_GITHUB_TOKEN", "ghp_classic_pat_nope")
monkeypatch.delenv("GH_TOKEN", raising=False)
monkeypatch.setenv("GITHUB_TOKEN", "gho_valid_oauth")
token, source = resolve_copilot_token()
# Should skip the ghp_ token and find the gho_ one
assert token == "gho_valid_oauth"
assert source == "GITHUB_TOKEN"
def test_gh_cli_fallback(self, monkeypatch):
from hermes_cli.copilot_auth import resolve_copilot_token
monkeypatch.delenv("COPILOT_GITHUB_TOKEN", raising=False)
monkeypatch.delenv("GH_TOKEN", raising=False)
monkeypatch.delenv("GITHUB_TOKEN", raising=False)
with patch("hermes_cli.copilot_auth._try_gh_cli_token", return_value="gho_from_cli"):
token, source = resolve_copilot_token()
assert token == "gho_from_cli"
assert source == "gh auth token"
def test_gh_cli_classic_pat_raises(self, monkeypatch):
from hermes_cli.copilot_auth import resolve_copilot_token
monkeypatch.delenv("COPILOT_GITHUB_TOKEN", raising=False)
monkeypatch.delenv("GH_TOKEN", raising=False)
monkeypatch.delenv("GITHUB_TOKEN", raising=False)
with patch("hermes_cli.copilot_auth._try_gh_cli_token", return_value="ghp_classic"):
with pytest.raises(ValueError, match="classic PAT"):
resolve_copilot_token()
def test_no_token_returns_empty(self, monkeypatch):
from hermes_cli.copilot_auth import resolve_copilot_token
monkeypatch.delenv("COPILOT_GITHUB_TOKEN", raising=False)
monkeypatch.delenv("GH_TOKEN", raising=False)
monkeypatch.delenv("GITHUB_TOKEN", raising=False)
with patch("hermes_cli.copilot_auth._try_gh_cli_token", return_value=None):
token, source = resolve_copilot_token()
assert token == ""
assert source == ""
class TestRequestHeaders:
"""Copilot API header generation."""
def test_default_headers_include_openai_intent(self):
from hermes_cli.copilot_auth import copilot_request_headers
headers = copilot_request_headers()
assert headers["Openai-Intent"] == "conversation-edits"
assert headers["User-Agent"] == "HermesAgent/1.0"
assert "Editor-Version" in headers
def test_agent_turn_sets_initiator(self):
from hermes_cli.copilot_auth import copilot_request_headers
headers = copilot_request_headers(is_agent_turn=True)
assert headers["x-initiator"] == "agent"
def test_user_turn_sets_initiator(self):
from hermes_cli.copilot_auth import copilot_request_headers
headers = copilot_request_headers(is_agent_turn=False)
assert headers["x-initiator"] == "user"
def test_vision_header(self):
from hermes_cli.copilot_auth import copilot_request_headers
headers = copilot_request_headers(is_vision=True)
assert headers["Copilot-Vision-Request"] == "true"
def test_no_vision_header_by_default(self):
from hermes_cli.copilot_auth import copilot_request_headers
headers = copilot_request_headers()
assert "Copilot-Vision-Request" not in headers
class TestCopilotDefaultHeaders:
"""The models.py copilot_default_headers uses copilot_auth."""
def test_includes_openai_intent(self):
from hermes_cli.models import copilot_default_headers
headers = copilot_default_headers()
assert "Openai-Intent" in headers
assert headers["Openai-Intent"] == "conversation-edits"
def test_includes_x_initiator(self):
from hermes_cli.models import copilot_default_headers
headers = copilot_default_headers()
assert "x-initiator" in headers
class TestApiModeSelection:
"""API mode selection matching opencode's shouldUseCopilotResponsesApi."""
def test_gpt5_uses_responses(self):
from hermes_cli.models import _should_use_copilot_responses_api
assert _should_use_copilot_responses_api("gpt-5.4") is True
assert _should_use_copilot_responses_api("gpt-5.4-mini") is True
assert _should_use_copilot_responses_api("gpt-5.3-codex") is True
assert _should_use_copilot_responses_api("gpt-5.2-codex") is True
assert _should_use_copilot_responses_api("gpt-5.2") is True
assert _should_use_copilot_responses_api("gpt-5.1-codex-max") is True
def test_gpt5_mini_excluded(self):
from hermes_cli.models import _should_use_copilot_responses_api
assert _should_use_copilot_responses_api("gpt-5-mini") is False
def test_gpt4_uses_chat(self):
from hermes_cli.models import _should_use_copilot_responses_api
assert _should_use_copilot_responses_api("gpt-4.1") is False
assert _should_use_copilot_responses_api("gpt-4o") is False
assert _should_use_copilot_responses_api("gpt-4o-mini") is False
def test_non_gpt_uses_chat(self):
from hermes_cli.models import _should_use_copilot_responses_api
assert _should_use_copilot_responses_api("claude-sonnet-4.6") is False
assert _should_use_copilot_responses_api("claude-opus-4.6") is False
assert _should_use_copilot_responses_api("gemini-2.5-pro") is False
assert _should_use_copilot_responses_api("grok-code-fast-1") is False
class TestEnvVarOrder:
"""PROVIDER_REGISTRY has correct env var order."""
def test_copilot_env_vars_include_copilot_github_token(self):
from hermes_cli.auth import PROVIDER_REGISTRY
copilot = PROVIDER_REGISTRY["copilot"]
assert "COPILOT_GITHUB_TOKEN" in copilot.api_key_env_vars
# COPILOT_GITHUB_TOKEN should be first
assert copilot.api_key_env_vars[0] == "COPILOT_GITHUB_TOKEN"
def test_copilot_env_vars_order_matches_docs(self):
from hermes_cli.auth import PROVIDER_REGISTRY
copilot = PROVIDER_REGISTRY["copilot"]
assert copilot.api_key_env_vars == (
"COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN"
)
+83
View File
@@ -1,6 +1,8 @@
"""Tests for hermes_cli.gateway."""
import signal
from types import SimpleNamespace
from unittest.mock import patch, call
import hermes_cli.gateway as gateway
@@ -169,3 +171,84 @@ def test_install_linux_gateway_from_setup_system_choice_as_root_installs(monkeyp
assert (scope, did_install) == ("system", True)
assert calls == [(True, True, "alice")]
# ---------------------------------------------------------------------------
# _wait_for_gateway_exit
# ---------------------------------------------------------------------------
class TestWaitForGatewayExit:
"""PID-based wait with force-kill on timeout."""
def test_returns_immediately_when_no_pid(self, monkeypatch):
"""If get_running_pid returns None, exit instantly."""
monkeypatch.setattr("gateway.status.get_running_pid", lambda: None)
# Should return without sleeping at all.
gateway._wait_for_gateway_exit(timeout=1.0, force_after=0.5)
def test_returns_when_process_exits_gracefully(self, monkeypatch):
"""Process exits after a couple of polls — no SIGKILL needed."""
poll_count = 0
def mock_get_running_pid():
nonlocal poll_count
poll_count += 1
return 12345 if poll_count <= 2 else None
monkeypatch.setattr("gateway.status.get_running_pid", mock_get_running_pid)
monkeypatch.setattr("time.sleep", lambda _: None)
gateway._wait_for_gateway_exit(timeout=10.0, force_after=999.0)
# Should have polled until None was returned.
assert poll_count == 3
def test_force_kills_after_grace_period(self, monkeypatch):
"""When the process doesn't exit, SIGKILL the saved PID."""
import time as _time
# Simulate monotonic time advancing past force_after
call_num = 0
def fake_monotonic():
nonlocal call_num
call_num += 1
# First two calls: initial deadline + force_deadline setup (time 0)
# Then each loop iteration advances time
return call_num * 2.0 # 2, 4, 6, 8, ...
kills = []
def mock_kill(pid, sig):
kills.append((pid, sig))
# get_running_pid returns the PID until kill is sent, then None
def mock_get_running_pid():
return None if kills else 42
monkeypatch.setattr("time.monotonic", fake_monotonic)
monkeypatch.setattr("time.sleep", lambda _: None)
monkeypatch.setattr("gateway.status.get_running_pid", mock_get_running_pid)
monkeypatch.setattr("os.kill", mock_kill)
gateway._wait_for_gateway_exit(timeout=10.0, force_after=5.0)
assert (42, signal.SIGKILL) in kills
def test_handles_process_already_gone_on_kill(self, monkeypatch):
"""ProcessLookupError during SIGKILL is not fatal."""
import time as _time
call_num = 0
def fake_monotonic():
nonlocal call_num
call_num += 1
return call_num * 3.0 # Jump past force_after quickly
def mock_kill(pid, sig):
raise ProcessLookupError
monkeypatch.setattr("time.monotonic", fake_monotonic)
monkeypatch.setattr("time.sleep", lambda _: None)
monkeypatch.setattr("gateway.status.get_running_pid", lambda: 99)
monkeypatch.setattr("os.kill", mock_kill)
# Should not raise — ProcessLookupError means it's already gone.
gateway._wait_for_gateway_exit(timeout=10.0, force_after=2.0)
+131
View File
@@ -3,8 +3,12 @@
from unittest.mock import patch
from hermes_cli.models import (
copilot_model_api_mode,
fetch_github_model_catalog,
curated_models_for_provider,
fetch_api_models,
github_model_reasoning_efforts,
normalize_copilot_model_id,
normalize_provider,
parse_model_input,
probe_api_models,
@@ -116,6 +120,7 @@ class TestNormalizeProvider:
assert normalize_provider("glm") == "zai"
assert normalize_provider("kimi") == "kimi-coding"
assert normalize_provider("moonshot") == "kimi-coding"
assert normalize_provider("github-copilot") == "copilot"
def test_case_insensitive(self):
assert normalize_provider("OpenRouter") == "openrouter"
@@ -125,6 +130,8 @@ class TestProviderLabel:
def test_known_labels_and_auto(self):
assert provider_label("anthropic") == "Anthropic"
assert provider_label("kimi") == "Kimi / Moonshot"
assert provider_label("copilot") == "GitHub Copilot"
assert provider_label("copilot-acp") == "GitHub Copilot ACP"
assert provider_label("auto") == "Auto"
def test_unknown_provider_preserves_original_name(self):
@@ -145,6 +152,24 @@ class TestProviderModelIds:
def test_zai_returns_glm_models(self):
assert "glm-5" in provider_model_ids("zai")
def test_copilot_prefers_live_catalog(self):
with patch("hermes_cli.auth.resolve_api_key_provider_credentials", return_value={"api_key": "gh-token"}), \
patch("hermes_cli.models._fetch_github_models", return_value=["gpt-5.4", "claude-sonnet-4.6"]):
assert provider_model_ids("copilot") == ["gpt-5.4", "claude-sonnet-4.6"]
def test_copilot_acp_reuses_copilot_catalog(self):
with patch("hermes_cli.auth.resolve_api_key_provider_credentials", return_value={"api_key": "gh-token"}), \
patch("hermes_cli.models._fetch_github_models", return_value=["gpt-5.4", "claude-sonnet-4.6"]):
assert provider_model_ids("copilot-acp") == ["gpt-5.4", "claude-sonnet-4.6"]
def test_copilot_acp_falls_back_to_copilot_defaults(self):
with patch("hermes_cli.auth.resolve_api_key_provider_credentials", side_effect=Exception("no token")), \
patch("hermes_cli.models._fetch_github_models", return_value=None):
ids = provider_model_ids("copilot-acp")
assert "gpt-5.4" in ids
assert "copilot-acp" not in ids
# -- fetch_api_models --------------------------------------------------------
@@ -183,6 +208,112 @@ class TestFetchApiModels:
assert probe["resolved_base_url"] == "http://localhost:8000/v1"
assert probe["used_fallback"] is True
def test_probe_api_models_uses_copilot_catalog(self):
class _Resp:
def __enter__(self):
return self
def __exit__(self, exc_type, exc, tb):
return False
def read(self):
return b'{"data": [{"id": "gpt-5.4", "model_picker_enabled": true, "supported_endpoints": ["/responses"], "capabilities": {"type": "chat", "supports": {"reasoning_effort": ["low", "medium", "high"]}}}, {"id": "claude-sonnet-4.6", "model_picker_enabled": true, "supported_endpoints": ["/chat/completions"], "capabilities": {"type": "chat", "supports": {"reasoning_effort": ["low", "medium", "high"]}}}, {"id": "text-embedding-3-small", "model_picker_enabled": true, "capabilities": {"type": "embedding"}}]}'
with patch("hermes_cli.models.urllib.request.urlopen", return_value=_Resp()) as mock_urlopen:
probe = probe_api_models("gh-token", "https://api.githubcopilot.com")
assert mock_urlopen.call_args[0][0].full_url == "https://api.githubcopilot.com/models"
assert probe["models"] == ["gpt-5.4", "claude-sonnet-4.6"]
assert probe["resolved_base_url"] == "https://api.githubcopilot.com"
assert probe["used_fallback"] is False
def test_fetch_github_model_catalog_filters_non_chat_models(self):
class _Resp:
def __enter__(self):
return self
def __exit__(self, exc_type, exc, tb):
return False
def read(self):
return b'{"data": [{"id": "gpt-5.4", "model_picker_enabled": true, "supported_endpoints": ["/responses"], "capabilities": {"type": "chat", "supports": {"reasoning_effort": ["low", "medium", "high"]}}}, {"id": "text-embedding-3-small", "model_picker_enabled": true, "capabilities": {"type": "embedding"}}]}'
with patch("hermes_cli.models.urllib.request.urlopen", return_value=_Resp()):
catalog = fetch_github_model_catalog("gh-token")
assert catalog is not None
assert [item["id"] for item in catalog] == ["gpt-5.4"]
class TestGithubReasoningEfforts:
def test_gpt5_supports_minimal_to_high(self):
catalog = [{
"id": "gpt-5.4",
"capabilities": {"type": "chat", "supports": {"reasoning_effort": ["low", "medium", "high"]}},
"supported_endpoints": ["/responses"],
}]
assert github_model_reasoning_efforts("gpt-5.4", catalog=catalog) == [
"low",
"medium",
"high",
]
def test_legacy_catalog_reasoning_still_supported(self):
catalog = [{"id": "openai/o3", "capabilities": ["reasoning"]}]
assert github_model_reasoning_efforts("openai/o3", catalog=catalog) == [
"low",
"medium",
"high",
]
def test_non_reasoning_model_returns_empty(self):
catalog = [{"id": "gpt-4.1", "capabilities": {"type": "chat", "supports": {}}}]
assert github_model_reasoning_efforts("gpt-4.1", catalog=catalog) == []
class TestCopilotNormalization:
def test_normalize_old_github_models_slug(self):
catalog = [{"id": "gpt-4.1"}, {"id": "gpt-5.4"}]
assert normalize_copilot_model_id("openai/gpt-4.1-mini", catalog=catalog) == "gpt-4.1"
def test_copilot_api_mode_gpt5_uses_responses(self):
"""GPT-5+ models should use Responses API (matching opencode)."""
assert copilot_model_api_mode("gpt-5.4") == "codex_responses"
assert copilot_model_api_mode("gpt-5.4-mini") == "codex_responses"
assert copilot_model_api_mode("gpt-5.3-codex") == "codex_responses"
assert copilot_model_api_mode("gpt-5.2-codex") == "codex_responses"
assert copilot_model_api_mode("gpt-5.2") == "codex_responses"
def test_copilot_api_mode_gpt5_mini_uses_chat(self):
"""gpt-5-mini is the exception — uses Chat Completions."""
assert copilot_model_api_mode("gpt-5-mini") == "chat_completions"
def test_copilot_api_mode_non_gpt5_uses_chat(self):
"""Non-GPT-5 models use Chat Completions."""
assert copilot_model_api_mode("gpt-4.1") == "chat_completions"
assert copilot_model_api_mode("gpt-4o") == "chat_completions"
assert copilot_model_api_mode("gpt-4o-mini") == "chat_completions"
assert copilot_model_api_mode("claude-sonnet-4.6") == "chat_completions"
assert copilot_model_api_mode("claude-opus-4.6") == "chat_completions"
assert copilot_model_api_mode("gemini-2.5-pro") == "chat_completions"
def test_copilot_api_mode_with_catalog_both_endpoints(self):
"""When catalog shows both endpoints, model ID pattern wins."""
catalog = [{
"id": "gpt-5.4",
"supported_endpoints": ["/chat/completions", "/responses"],
}]
# GPT-5.4 should use responses even though chat/completions is listed
assert copilot_model_api_mode("gpt-5.4", catalog=catalog) == "codex_responses"
def test_copilot_api_mode_with_catalog_only_responses(self):
catalog = [{
"id": "gpt-5.4",
"supported_endpoints": ["/responses"],
"capabilities": {"type": "chat"},
}]
assert copilot_model_api_mode("gpt-5.4", catalog=catalog) == "codex_responses"
# -- validate — format checks -----------------------------------------------
@@ -32,6 +32,8 @@ def _clear_provider_env(monkeypatch):
"OPENAI_BASE_URL",
"OPENAI_API_KEY",
"OPENROUTER_API_KEY",
"GITHUB_TOKEN",
"GH_TOKEN",
"GLM_API_KEY",
"KIMI_API_KEY",
"MINIMAX_API_KEY",
@@ -231,6 +233,152 @@ def test_setup_keep_current_anthropic_can_configure_openai_vision_default(tmp_pa
assert env.get("AUXILIARY_VISION_MODEL") == "gpt-4o-mini"
def test_setup_copilot_uses_gh_auth_and_saves_provider(tmp_path, monkeypatch):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
_clear_provider_env(monkeypatch)
config = load_config()
def fake_prompt_choice(question, choices, default=0):
if question == "Select your inference provider:":
assert choices[14] == "GitHub Copilot (uses GITHUB_TOKEN or gh auth token)"
return 14
if question == "Select default model:":
assert "gpt-4.1" in choices
assert "gpt-5.4" in choices
return choices.index("gpt-5.4")
if question == "Select reasoning effort:":
assert "low" in choices
assert "high" in choices
return choices.index("high")
if question == "Configure vision:":
return len(choices) - 1
tts_idx = _maybe_keep_current_tts(question, choices)
if tts_idx is not None:
return tts_idx
raise AssertionError(f"Unexpected prompt_choice call: {question}")
def fake_prompt(message, *args, **kwargs):
raise AssertionError(f"Unexpected prompt call: {message}")
def fake_get_auth_status(provider_id):
if provider_id == "copilot":
return {"logged_in": True}
return {"logged_in": False}
monkeypatch.setattr("hermes_cli.setup.prompt_choice", fake_prompt_choice)
monkeypatch.setattr("hermes_cli.setup.prompt", fake_prompt)
monkeypatch.setattr("hermes_cli.setup.prompt_yes_no", lambda *args, **kwargs: False)
monkeypatch.setattr("hermes_cli.auth.get_active_provider", lambda: None)
monkeypatch.setattr("hermes_cli.auth.detect_external_credentials", lambda: [])
monkeypatch.setattr("hermes_cli.auth.get_auth_status", fake_get_auth_status)
monkeypatch.setattr(
"hermes_cli.auth.resolve_api_key_provider_credentials",
lambda provider_id: {
"provider": provider_id,
"api_key": "gh-cli-token",
"base_url": "https://api.githubcopilot.com",
"source": "gh auth token",
},
)
monkeypatch.setattr(
"hermes_cli.models.fetch_github_model_catalog",
lambda api_key: [
{
"id": "gpt-4.1",
"capabilities": {"type": "chat", "supports": {}},
"supported_endpoints": ["/chat/completions"],
},
{
"id": "gpt-5.4",
"capabilities": {"type": "chat", "supports": {"reasoning_effort": ["low", "medium", "high"]}},
"supported_endpoints": ["/responses"],
},
],
)
monkeypatch.setattr("agent.auxiliary_client.get_available_vision_backends", lambda: [])
setup_model_provider(config)
save_config(config)
env = _read_env(tmp_path)
reloaded = load_config()
assert env.get("GITHUB_TOKEN") is None
assert reloaded["model"]["provider"] == "copilot"
assert reloaded["model"]["base_url"] == "https://api.githubcopilot.com"
assert reloaded["model"]["default"] == "gpt-5.4"
assert reloaded["model"]["api_mode"] == "codex_responses"
assert reloaded["agent"]["reasoning_effort"] == "high"
def test_setup_copilot_acp_uses_model_picker_and_saves_provider(tmp_path, monkeypatch):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
_clear_provider_env(monkeypatch)
config = load_config()
def fake_prompt_choice(question, choices, default=0):
if question == "Select your inference provider:":
assert choices[15] == "GitHub Copilot ACP (spawns `copilot --acp --stdio`)"
return 15
if question == "Select default model:":
assert "gpt-4.1" in choices
assert "gpt-5.4" in choices
return choices.index("gpt-5.4")
if question == "Configure vision:":
return len(choices) - 1
tts_idx = _maybe_keep_current_tts(question, choices)
if tts_idx is not None:
return tts_idx
raise AssertionError(f"Unexpected prompt_choice call: {question}")
def fake_prompt(message, *args, **kwargs):
raise AssertionError(f"Unexpected prompt call: {message}")
monkeypatch.setattr("hermes_cli.setup.prompt_choice", fake_prompt_choice)
monkeypatch.setattr("hermes_cli.setup.prompt", fake_prompt)
monkeypatch.setattr("hermes_cli.setup.prompt_yes_no", lambda *args, **kwargs: False)
monkeypatch.setattr("hermes_cli.auth.get_active_provider", lambda: None)
monkeypatch.setattr("hermes_cli.auth.detect_external_credentials", lambda: [])
monkeypatch.setattr("hermes_cli.auth.get_auth_status", lambda provider_id: {"logged_in": provider_id == "copilot-acp"})
monkeypatch.setattr(
"hermes_cli.auth.resolve_api_key_provider_credentials",
lambda provider_id: {
"provider": "copilot",
"api_key": "gh-cli-token",
"base_url": "https://api.githubcopilot.com",
"source": "gh auth token",
},
)
monkeypatch.setattr(
"hermes_cli.models.fetch_github_model_catalog",
lambda api_key: [
{
"id": "gpt-4.1",
"capabilities": {"type": "chat", "supports": {}},
"supported_endpoints": ["/chat/completions"],
},
{
"id": "gpt-5.4",
"capabilities": {"type": "chat", "supports": {"reasoning_effort": ["low", "medium", "high"]}},
"supported_endpoints": ["/responses"],
},
],
)
monkeypatch.setattr("agent.auxiliary_client.get_available_vision_backends", lambda: [])
setup_model_provider(config)
save_config(config)
reloaded = load_config()
assert reloaded["model"]["provider"] == "copilot-acp"
assert reloaded["model"]["base_url"] == "acp://copilot"
assert reloaded["model"]["default"] == "gpt-5.4"
assert reloaded["model"]["api_mode"] == "chat_completions"
def test_setup_switch_custom_to_codex_clears_custom_endpoint_and_updates_config(tmp_path, monkeypatch):
"""Switching from custom to Codex should clear custom endpoint overrides."""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
+170 -2
View File
@@ -18,9 +18,12 @@ from hermes_cli.auth import (
resolve_provider,
get_api_key_provider_status,
resolve_api_key_provider_credentials,
get_external_process_provider_status,
resolve_external_process_provider_credentials,
get_auth_status,
AuthError,
KIMI_CODE_BASE_URL,
_try_gh_cli_token,
_resolve_kimi_base_url,
)
@@ -33,6 +36,8 @@ class TestProviderRegistry:
"""Test that new providers are correctly registered."""
@pytest.mark.parametrize("provider_id,name,auth_type", [
("copilot-acp", "GitHub Copilot ACP", "external_process"),
("copilot", "GitHub Copilot", "api_key"),
("zai", "Z.AI / GLM", "api_key"),
("kimi-coding", "Kimi / Moonshot", "api_key"),
("minimax", "MiniMax", "api_key"),
@@ -52,6 +57,11 @@ class TestProviderRegistry:
assert pconfig.api_key_env_vars == ("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY")
assert pconfig.base_url_env_var == "GLM_BASE_URL"
def test_copilot_env_vars(self):
pconfig = PROVIDER_REGISTRY["copilot"]
assert pconfig.api_key_env_vars == ("COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN")
assert pconfig.base_url_env_var == ""
def test_kimi_env_vars(self):
pconfig = PROVIDER_REGISTRY["kimi-coding"]
assert pconfig.api_key_env_vars == ("KIMI_API_KEY",)
@@ -78,6 +88,8 @@ class TestProviderRegistry:
assert pconfig.base_url_env_var == "KILOCODE_BASE_URL"
def test_base_urls(self):
assert PROVIDER_REGISTRY["copilot"].inference_base_url == "https://api.githubcopilot.com"
assert PROVIDER_REGISTRY["copilot-acp"].inference_base_url == "acp://copilot"
assert PROVIDER_REGISTRY["zai"].inference_base_url == "https://api.z.ai/api/paas/v4"
assert PROVIDER_REGISTRY["kimi-coding"].inference_base_url == "https://api.moonshot.ai/v1"
assert PROVIDER_REGISTRY["minimax"].inference_base_url == "https://api.minimax.io/v1"
@@ -105,8 +117,9 @@ PROVIDER_ENV_VARS = (
"AI_GATEWAY_API_KEY", "AI_GATEWAY_BASE_URL",
"KILOCODE_API_KEY", "KILOCODE_BASE_URL",
"DASHSCOPE_API_KEY", "OPENCODE_ZEN_API_KEY", "OPENCODE_GO_API_KEY",
"NOUS_API_KEY",
"OPENAI_BASE_URL",
"NOUS_API_KEY", "GITHUB_TOKEN", "GH_TOKEN",
"OPENAI_BASE_URL", "HERMES_COPILOT_ACP_COMMAND", "COPILOT_CLI_PATH",
"HERMES_COPILOT_ACP_ARGS", "COPILOT_ACP_BASE_URL",
)
@@ -176,6 +189,16 @@ class TestResolveProvider:
assert resolve_provider("Z-AI") == "zai"
assert resolve_provider("Kimi") == "kimi-coding"
def test_alias_github_copilot(self):
assert resolve_provider("github-copilot") == "copilot"
def test_alias_github_models(self):
assert resolve_provider("github-models") == "copilot"
def test_alias_github_copilot_acp(self):
assert resolve_provider("github-copilot-acp") == "copilot-acp"
assert resolve_provider("copilot-acp-agent") == "copilot-acp"
def test_unknown_provider_raises(self):
with pytest.raises(AuthError):
resolve_provider("nonexistent-provider-xyz")
@@ -218,6 +241,10 @@ class TestResolveProvider:
monkeypatch.setenv("GLM_API_KEY", "glm-key")
assert resolve_provider("auto") == "openrouter"
def test_auto_does_not_select_copilot_from_github_token(self, monkeypatch):
monkeypatch.setenv("GITHUB_TOKEN", "gh-test-token")
assert resolve_provider("auto") == "openrouter"
# =============================================================================
# API Key Provider Status tests
@@ -251,12 +278,41 @@ class TestApiKeyProviderStatus:
status = get_api_key_provider_status("kimi-coding")
assert status["base_url"] == "https://custom.kimi.example/v1"
def test_copilot_status_uses_gh_cli_token(self, monkeypatch):
monkeypatch.setattr("hermes_cli.copilot_auth._try_gh_cli_token", lambda: "gho_gh_cli_token")
status = get_api_key_provider_status("copilot")
assert status["configured"] is True
assert status["logged_in"] is True
assert status["key_source"] == "gh auth token"
assert status["base_url"] == "https://api.githubcopilot.com"
def test_get_auth_status_dispatches_to_api_key(self, monkeypatch):
monkeypatch.setenv("MINIMAX_API_KEY", "mm-key")
status = get_auth_status("minimax")
assert status["configured"] is True
assert status["provider"] == "minimax"
def test_copilot_acp_status_detects_local_cli(self, monkeypatch):
monkeypatch.setenv("HERMES_COPILOT_ACP_ARGS", "--acp --stdio --debug")
monkeypatch.setattr("hermes_cli.auth.shutil.which", lambda command: f"/usr/local/bin/{command}")
status = get_external_process_provider_status("copilot-acp")
assert status["configured"] is True
assert status["logged_in"] is True
assert status["command"] == "copilot"
assert status["resolved_command"] == "/usr/local/bin/copilot"
assert status["args"] == ["--acp", "--stdio", "--debug"]
assert status["base_url"] == "acp://copilot"
def test_get_auth_status_dispatches_to_external_process(self, monkeypatch):
monkeypatch.setattr("hermes_cli.auth.shutil.which", lambda command: f"/opt/bin/{command}")
status = get_auth_status("copilot-acp")
assert status["configured"] is True
assert status["provider"] == "copilot-acp"
def test_non_api_key_provider(self):
status = get_api_key_provider_status("nous")
assert status["configured"] is False
@@ -276,6 +332,61 @@ class TestResolveApiKeyProviderCredentials:
assert creds["base_url"] == "https://api.z.ai/api/paas/v4"
assert creds["source"] == "GLM_API_KEY"
def test_resolve_copilot_with_github_token(self, monkeypatch):
monkeypatch.setenv("GITHUB_TOKEN", "gh-env-secret")
creds = resolve_api_key_provider_credentials("copilot")
assert creds["provider"] == "copilot"
assert creds["api_key"] == "gh-env-secret"
assert creds["base_url"] == "https://api.githubcopilot.com"
assert creds["source"] == "GITHUB_TOKEN"
def test_resolve_copilot_with_gh_cli_fallback(self, monkeypatch):
monkeypatch.setattr("hermes_cli.copilot_auth._try_gh_cli_token", lambda: "gho_cli_secret")
creds = resolve_api_key_provider_credentials("copilot")
assert creds["provider"] == "copilot"
assert creds["api_key"] == "gho_cli_secret"
assert creds["base_url"] == "https://api.githubcopilot.com"
assert creds["source"] == "gh auth token"
def test_try_gh_cli_token_uses_homebrew_path_when_not_on_path(self, monkeypatch):
monkeypatch.setattr("hermes_cli.auth.shutil.which", lambda command: None)
monkeypatch.setattr(
"hermes_cli.auth.os.path.isfile",
lambda path: path == "/opt/homebrew/bin/gh",
)
monkeypatch.setattr(
"hermes_cli.auth.os.access",
lambda path, mode: path == "/opt/homebrew/bin/gh" and mode == os.X_OK,
)
calls = []
class _Result:
returncode = 0
stdout = "gh-cli-secret\n"
def _fake_run(cmd, capture_output, text, timeout):
calls.append(cmd)
return _Result()
monkeypatch.setattr("hermes_cli.auth.subprocess.run", _fake_run)
assert _try_gh_cli_token() == "gh-cli-secret"
assert calls == [["/opt/homebrew/bin/gh", "auth", "token"]]
def test_resolve_copilot_acp_with_local_cli(self, monkeypatch):
monkeypatch.setenv("HERMES_COPILOT_ACP_ARGS", "--acp --stdio")
monkeypatch.setattr("hermes_cli.auth.shutil.which", lambda command: f"/usr/local/bin/{command}")
creds = resolve_external_process_provider_credentials("copilot-acp")
assert creds["provider"] == "copilot-acp"
assert creds["api_key"] == "copilot-acp"
assert creds["base_url"] == "acp://copilot"
assert creds["command"] == "/usr/local/bin/copilot"
assert creds["args"] == ["--acp", "--stdio"]
assert creds["source"] == "process"
def test_resolve_kimi_with_key(self, monkeypatch):
monkeypatch.setenv("KIMI_API_KEY", "kimi-secret-key")
creds = resolve_api_key_provider_credentials("kimi-coding")
@@ -403,6 +514,53 @@ class TestRuntimeProviderResolution:
assert result["provider"] == "kimi-coding"
assert result["api_key"] == "auto-kimi-key"
def test_runtime_copilot_uses_gh_cli_token(self, monkeypatch):
monkeypatch.setattr("hermes_cli.copilot_auth._try_gh_cli_token", lambda: "gho_cli_secret")
from hermes_cli.runtime_provider import resolve_runtime_provider
result = resolve_runtime_provider(requested="copilot")
assert result["provider"] == "copilot"
assert result["api_mode"] == "chat_completions"
assert result["api_key"] == "gho_cli_secret"
assert result["base_url"] == "https://api.githubcopilot.com"
def test_runtime_copilot_uses_responses_for_gpt_5_4(self, monkeypatch):
monkeypatch.setattr("hermes_cli.copilot_auth._try_gh_cli_token", lambda: "gho_cli_secret")
monkeypatch.setattr(
"hermes_cli.runtime_provider._get_model_config",
lambda: {"provider": "copilot", "default": "gpt-5.4"},
)
monkeypatch.setattr(
"hermes_cli.models.fetch_github_model_catalog",
lambda api_key=None, timeout=5.0: [
{
"id": "gpt-5.4",
"supported_endpoints": ["/responses"],
"capabilities": {"type": "chat"},
}
],
)
from hermes_cli.runtime_provider import resolve_runtime_provider
result = resolve_runtime_provider(requested="copilot")
assert result["provider"] == "copilot"
assert result["api_mode"] == "codex_responses"
def test_runtime_copilot_acp_uses_process_runtime(self, monkeypatch):
monkeypatch.setattr("hermes_cli.auth.shutil.which", lambda command: f"/usr/local/bin/{command}")
monkeypatch.setenv("HERMES_COPILOT_ACP_ARGS", "--acp --stdio --debug")
from hermes_cli.runtime_provider import resolve_runtime_provider
result = resolve_runtime_provider(requested="copilot-acp")
assert result["provider"] == "copilot-acp"
assert result["api_mode"] == "chat_completions"
assert result["api_key"] == "copilot-acp"
assert result["base_url"] == "acp://copilot"
assert result["command"] == "/usr/local/bin/copilot"
assert result["args"] == ["--acp", "--stdio", "--debug"]
# =============================================================================
# _has_any_provider_configured tests
@@ -430,6 +588,16 @@ class TestHasAnyProviderConfigured:
from hermes_cli.main import _has_any_provider_configured
assert _has_any_provider_configured() is True
def test_gh_cli_token_counts(self, monkeypatch, tmp_path):
from hermes_cli import config as config_module
monkeypatch.setattr("hermes_cli.copilot_auth._try_gh_cli_token", lambda: "gho_cli_secret")
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
monkeypatch.setattr(config_module, "get_env_path", lambda: hermes_home / ".env")
monkeypatch.setattr(config_module, "get_hermes_home", lambda: hermes_home)
from hermes_cli.main import _has_any_provider_configured
assert _has_any_provider_configured() is True
# =============================================================================
# Kimi Code auto-detection tests
+43
View File
@@ -312,6 +312,49 @@ def test_codex_provider_uses_config_model(monkeypatch):
assert shell.model != "should-be-ignored"
def test_codex_config_model_not_replaced_by_normalization(monkeypatch):
"""When the user sets model.default in config.yaml to a specific codex
model, _normalize_model_for_provider must NOT replace it with the latest
available model from the API. Regression test for #1887."""
cli = _import_cli()
monkeypatch.delenv("LLM_MODEL", raising=False)
monkeypatch.delenv("OPENAI_MODEL", raising=False)
# User explicitly configured gpt-5.3-codex in config.yaml
monkeypatch.setitem(cli.CLI_CONFIG, "model", {
"default": "gpt-5.3-codex",
"provider": "openai-codex",
"base_url": "https://chatgpt.com/backend-api/codex",
})
def _runtime_resolve(**kwargs):
return {
"provider": "openai-codex",
"api_mode": "codex_responses",
"base_url": "https://chatgpt.com/backend-api/codex",
"api_key": "fake-key",
"source": "env/config",
}
monkeypatch.setattr("hermes_cli.runtime_provider.resolve_runtime_provider", _runtime_resolve)
monkeypatch.setattr("hermes_cli.runtime_provider.format_runtime_provider_error", lambda exc: str(exc))
# API returns a DIFFERENT model than what the user configured
monkeypatch.setattr(
"hermes_cli.codex_models.get_codex_model_ids",
lambda access_token=None: ["gpt-5.4", "gpt-5.3-codex"],
)
shell = cli.HermesCLI(compact=True, max_turns=1)
# Config model is NOT the global default — user made a deliberate choice
assert shell._model_is_default is False
assert shell._ensure_runtime_credentials() is True
assert shell.provider == "openai-codex"
# Model must stay as user configured, not replaced by gpt-5.4
assert shell.model == "gpt-5.3-codex"
def test_codex_provider_preserves_explicit_codex_model(monkeypatch):
"""If the user explicitly passes a Codex-compatible model, it must be
preserved even when the provider resolves to openai-codex."""
+199
View File
@@ -0,0 +1,199 @@
"""Tests for context compression boundary alignment.
Verifies that _align_boundary_backward correctly handles tool result groups
so that parallel tool calls are never split during compression.
"""
import pytest
from unittest.mock import patch, MagicMock
from agent.context_compressor import ContextCompressor
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _tc(call_id: str) -> dict:
"""Create a minimal tool_call dict."""
return {"id": call_id, "type": "function", "function": {"name": "test", "arguments": "{}"}}
def _tool_result(call_id: str, content: str = "result") -> dict:
"""Create a tool result message."""
return {"role": "tool", "tool_call_id": call_id, "content": content}
def _assistant_with_tools(*call_ids: str) -> dict:
"""Create an assistant message with tool_calls."""
return {"role": "assistant", "tool_calls": [_tc(cid) for cid in call_ids], "content": None}
def _make_compressor(**kwargs) -> ContextCompressor:
defaults = dict(
model="test-model",
threshold_percent=0.75,
protect_first_n=3,
protect_last_n=4,
quiet_mode=True,
)
defaults.update(kwargs)
with patch("agent.context_compressor.get_model_context_length", return_value=8000):
return ContextCompressor(**defaults)
# ---------------------------------------------------------------------------
# _align_boundary_backward tests
# ---------------------------------------------------------------------------
class TestAlignBoundaryBackward:
"""Test that compress-end boundary never splits a tool_call/result group."""
def test_boundary_at_clean_position(self):
"""Boundary after a user message — no adjustment needed."""
comp = _make_compressor()
messages = [
{"role": "system", "content": "sys"},
{"role": "user", "content": "hello"},
{"role": "assistant", "content": "hi"},
{"role": "user", "content": "do something"},
_assistant_with_tools("tc_1"),
_tool_result("tc_1", "done"),
{"role": "user", "content": "thanks"}, # idx=6
{"role": "assistant", "content": "np"},
]
# Boundary at 7, messages[6] = user — no adjustment
assert comp._align_boundary_backward(messages, 7) == 7
def test_boundary_after_assistant_with_tools(self):
"""Original case: boundary right after assistant with tool_calls."""
comp = _make_compressor()
messages = [
{"role": "system", "content": "sys"},
{"role": "user", "content": "hello"},
{"role": "assistant", "content": "hi"},
_assistant_with_tools("tc_1", "tc_2"), # idx=3
_tool_result("tc_1"), # idx=4
_tool_result("tc_2"), # idx=5
{"role": "user", "content": "next"},
]
# Boundary at 4, messages[3] = assistant with tool_calls → pull back to 3
assert comp._align_boundary_backward(messages, 4) == 3
def test_boundary_in_middle_of_tool_results(self):
"""THE BUG: boundary falls between tool results of the same group."""
comp = _make_compressor()
messages = [
{"role": "system", "content": "sys"},
{"role": "user", "content": "hello"},
{"role": "assistant", "content": "hi"},
{"role": "user", "content": "do 5 things"},
_assistant_with_tools("tc_A", "tc_B", "tc_C", "tc_D", "tc_E"), # idx=4
_tool_result("tc_A", "result A"), # idx=5
_tool_result("tc_B", "result B"), # idx=6
_tool_result("tc_C", "result C"), # idx=7
_tool_result("tc_D", "result D"), # idx=8
_tool_result("tc_E", "result E"), # idx=9
{"role": "user", "content": "ok"},
{"role": "assistant", "content": "done"},
]
# Boundary at 8 — in middle of tool results. messages[7] = tool result.
# Must walk back to idx=4 (the parent assistant).
assert comp._align_boundary_backward(messages, 8) == 4
def test_boundary_at_last_tool_result(self):
"""Boundary right after last tool result — messages[idx-1] is tool."""
comp = _make_compressor()
messages = [
{"role": "system", "content": "sys"},
{"role": "user", "content": "hello"},
{"role": "assistant", "content": "hi"},
_assistant_with_tools("tc_1", "tc_2", "tc_3"), # idx=3
_tool_result("tc_1"), # idx=4
_tool_result("tc_2"), # idx=5
_tool_result("tc_3"), # idx=6
{"role": "user", "content": "next"},
]
# Boundary at 7 — messages[6] is last tool result.
# Walk back: [6]=tool, [5]=tool, [4]=tool, [3]=assistant with tools → idx=3
assert comp._align_boundary_backward(messages, 7) == 3
def test_boundary_with_consecutive_tool_groups(self):
"""Two consecutive tool groups — only walk back to the nearest parent."""
comp = _make_compressor()
messages = [
{"role": "system", "content": "sys"},
{"role": "user", "content": "hello"},
_assistant_with_tools("tc_1"), # idx=2
_tool_result("tc_1"), # idx=3
{"role": "user", "content": "more"},
_assistant_with_tools("tc_2", "tc_3"), # idx=5
_tool_result("tc_2"), # idx=6
_tool_result("tc_3"), # idx=7
{"role": "user", "content": "done"},
]
# Boundary at 7 — messages[6] = tool result for tc_2 group
# Walk back: [6]=tool, [5]=assistant with tools → idx=5
assert comp._align_boundary_backward(messages, 7) == 5
# ---------------------------------------------------------------------------
# End-to-end: compression must not lose tool results
# ---------------------------------------------------------------------------
class TestCompressionToolResultPreservation:
"""Verify that compress() never silently drops tool results."""
def test_parallel_tool_results_not_lost(self):
"""The exact scenario that triggered silent data loss before the fix."""
comp = _make_compressor(protect_first_n=3, protect_last_n=4)
messages = [
{"role": "system", "content": "You are helpful."}, # 0
{"role": "user", "content": "Hello"}, # 1
{"role": "assistant", "content": "Hi there!"}, # 2 (end of head)
{"role": "user", "content": "Read 7 files for me"}, # 3
_assistant_with_tools("tc_A", "tc_B", "tc_C", "tc_D", "tc_E", "tc_F", "tc_G"), # 4
_tool_result("tc_A", "content of file A"), # 5
_tool_result("tc_B", "content of file B"), # 6
_tool_result("tc_C", "content of file C"), # 7
_tool_result("tc_D", "content of file D"), # 8
_tool_result("tc_E", "content of file E"), # 9
_tool_result("tc_F", "content of file F"), # 10
_tool_result("tc_G", "CRITICAL DATA in file G"), # 11 ← compress_end=15-4=11
{"role": "user", "content": "Now summarize them"}, # 12
{"role": "assistant", "content": "Here is the summary..."}, # 13
{"role": "user", "content": "Thanks"}, # 14
]
# 15 messages. compress_end = 15 - 4 = 11 (before fix: splits tool group)
fake_summary = "[Summary of earlier conversation]"
with patch.object(comp, "_generate_summary", return_value=fake_summary):
result = comp.compress(messages, current_tokens=7000)
# After compression, no tool results should be orphaned/lost.
# All tool results in the result must have a matching assistant tool_call.
assistant_call_ids = set()
for msg in result:
if msg.get("role") == "assistant":
for tc in msg.get("tool_calls") or []:
cid = tc.get("id", "")
if cid:
assistant_call_ids.add(cid)
tool_result_ids = set()
for msg in result:
if msg.get("role") == "tool":
cid = msg.get("tool_call_id")
if cid:
tool_result_ids.add(cid)
# Every tool result must have a parent — no orphans
orphaned = tool_result_ids - assistant_call_ids
assert not orphaned, f"Orphaned tool results found (data loss!): {orphaned}"
# Every assistant tool_call must have a real result (not a stub)
for msg in result:
if msg.get("role") == "tool":
assert msg["content"] != "[Result from earlier conversation — see context summary above]", \
f"Stub result found for {msg.get('tool_call_id')} — real result was lost"
+4 -4
View File
@@ -131,7 +131,7 @@ class TestTryActivateFallback:
def test_activates_minimax_fallback(self):
agent = _make_agent(
fallback_model={"provider": "minimax", "model": "MiniMax-M2.5"},
fallback_model={"provider": "minimax", "model": "MiniMax-M2.7"},
)
mock_client = _mock_resolve(
api_key="sk-mm-key",
@@ -139,10 +139,10 @@ class TestTryActivateFallback:
)
with patch(
"agent.auxiliary_client.resolve_provider_client",
return_value=(mock_client, "MiniMax-M2.5"),
return_value=(mock_client, "MiniMax-M2.7"),
):
assert agent._try_activate_fallback() is True
assert agent.model == "MiniMax-M2.5"
assert agent.model == "MiniMax-M2.7"
assert agent.provider == "minimax"
assert agent.client is mock_client
@@ -165,7 +165,7 @@ class TestTryActivateFallback:
def test_returns_false_when_no_api_key(self):
"""Fallback should fail gracefully when the API key env var is unset."""
agent = _make_agent(
fallback_model={"provider": "minimax", "model": "MiniMax-M2.5"},
fallback_model={"provider": "minimax", "model": "MiniMax-M2.7"},
)
with patch(
"agent.auxiliary_client.resolve_provider_client",
+19
View File
@@ -210,6 +210,25 @@ class TestFTS5Search:
sources = [r["source"] for r in results]
assert all(s == "telegram" for s in sources)
def test_search_default_sources_include_acp(self, db):
db.create_session(session_id="s1", source="acp")
db.append_message("s1", role="user", content="ACP question about Python")
results = db.search_messages("Python")
sources = [r["source"] for r in results]
assert "acp" in sources
def test_search_default_includes_all_platforms(self, db):
"""Default search (no source_filter) should find sessions from any platform."""
for src in ("cli", "telegram", "signal", "homeassistant", "acp", "matrix"):
sid = f"s-{src}"
db.create_session(session_id=sid, source=src)
db.append_message(sid, role="user", content=f"universal search test from {src}")
results = db.search_messages("universal search test")
found_sources = {r["source"] for r in results}
assert found_sources == {"cli", "telegram", "signal", "homeassistant", "acp", "matrix"}
def test_search_with_role_filter(self, db):
db.create_session(session_id="s1", source="cli")
db.append_message("s1", role="user", content="What is FastAPI?")
+113
View File
@@ -27,6 +27,8 @@ def config_home(tmp_path, monkeypatch):
monkeypatch.delenv("HERMES_MODEL", raising=False)
monkeypatch.delenv("LLM_MODEL", raising=False)
monkeypatch.delenv("HERMES_INFERENCE_PROVIDER", raising=False)
monkeypatch.delenv("GITHUB_TOKEN", raising=False)
monkeypatch.delenv("GH_TOKEN", raising=False)
monkeypatch.delenv("OPENAI_BASE_URL", raising=False)
monkeypatch.delenv("OPENAI_API_KEY", raising=False)
monkeypatch.delenv("OPENROUTER_API_KEY", raising=False)
@@ -97,3 +99,114 @@ class TestProviderPersistsAfterModelSave:
f"provider should be 'kimi-coding', got {model.get('provider')}"
)
assert model.get("default") == "kimi-k2.5"
def test_copilot_provider_saved_when_selected(self, config_home):
"""_model_flow_copilot should persist provider/base_url/model together."""
from hermes_cli.main import _model_flow_copilot
from hermes_cli.config import load_config
with patch(
"hermes_cli.auth.resolve_api_key_provider_credentials",
return_value={
"provider": "copilot",
"api_key": "gh-cli-token",
"base_url": "https://api.githubcopilot.com",
"source": "gh auth token",
},
), patch(
"hermes_cli.models.fetch_github_model_catalog",
return_value=[
{
"id": "gpt-4.1",
"capabilities": {"type": "chat", "supports": {}},
"supported_endpoints": ["/chat/completions"],
},
{
"id": "gpt-5.4",
"capabilities": {"type": "chat", "supports": {"reasoning_effort": ["low", "medium", "high"]}},
"supported_endpoints": ["/responses"],
},
],
), patch(
"hermes_cli.auth._prompt_model_selection",
return_value="gpt-5.4",
), patch(
"hermes_cli.main._prompt_reasoning_effort_selection",
return_value="high",
), patch(
"hermes_cli.auth.deactivate_provider",
):
_model_flow_copilot(load_config(), "old-model")
import yaml
config = yaml.safe_load((config_home / "config.yaml").read_text()) or {}
model = config.get("model")
assert isinstance(model, dict), f"model should be dict, got {type(model)}"
assert model.get("provider") == "copilot"
assert model.get("base_url") == "https://api.githubcopilot.com"
assert model.get("default") == "gpt-5.4"
assert model.get("api_mode") == "codex_responses"
assert config["agent"]["reasoning_effort"] == "high"
def test_copilot_acp_provider_saved_when_selected(self, config_home):
"""_model_flow_copilot_acp should persist provider/base_url/model together."""
from hermes_cli.main import _model_flow_copilot_acp
from hermes_cli.config import load_config
with patch(
"hermes_cli.auth.get_external_process_provider_status",
return_value={
"resolved_command": "/usr/local/bin/copilot",
"command": "copilot",
"base_url": "acp://copilot",
},
), patch(
"hermes_cli.auth.resolve_external_process_provider_credentials",
return_value={
"provider": "copilot-acp",
"api_key": "copilot-acp",
"base_url": "acp://copilot",
"command": "/usr/local/bin/copilot",
"args": ["--acp", "--stdio"],
"source": "process",
},
), patch(
"hermes_cli.auth.resolve_api_key_provider_credentials",
return_value={
"provider": "copilot",
"api_key": "gh-cli-token",
"base_url": "https://api.githubcopilot.com",
"source": "gh auth token",
},
), patch(
"hermes_cli.models.fetch_github_model_catalog",
return_value=[
{
"id": "gpt-4.1",
"capabilities": {"type": "chat", "supports": {}},
"supported_endpoints": ["/chat/completions"],
},
{
"id": "gpt-5.4",
"capabilities": {"type": "chat", "supports": {"reasoning_effort": ["low", "medium", "high"]}},
"supported_endpoints": ["/responses"],
},
],
), patch(
"hermes_cli.auth._prompt_model_selection",
return_value="gpt-5.4",
), patch(
"hermes_cli.auth.deactivate_provider",
):
_model_flow_copilot_acp(load_config(), "old-model")
import yaml
config = yaml.safe_load((config_home / "config.yaml").read_text()) or {}
model = config.get("model")
assert isinstance(model, dict), f"model should be dict, got {type(model)}"
assert model.get("provider") == "copilot-acp"
assert model.get("base_url") == "acp://copilot"
assert model.get("default") == "gpt-5.4"
assert model.get("api_mode") == "chat_completions"
+311 -1
View File
@@ -631,6 +631,28 @@ class TestBuildApiKwargs:
kwargs = agent._build_api_kwargs(messages)
assert kwargs["extra_body"]["reasoning"]["effort"] == "medium"
def test_reasoning_sent_for_copilot_gpt5(self, agent):
agent.base_url = "https://api.githubcopilot.com"
agent.model = "gpt-5.4"
messages = [{"role": "user", "content": "hi"}]
kwargs = agent._build_api_kwargs(messages)
assert kwargs["extra_body"]["reasoning"] == {"effort": "medium"}
def test_reasoning_xhigh_normalized_for_copilot(self, agent):
agent.base_url = "https://api.githubcopilot.com"
agent.model = "gpt-5.4"
agent.reasoning_config = {"enabled": True, "effort": "xhigh"}
messages = [{"role": "user", "content": "hi"}]
kwargs = agent._build_api_kwargs(messages)
assert kwargs["extra_body"]["reasoning"] == {"effort": "high"}
def test_reasoning_omitted_for_non_reasoning_copilot_model(self, agent):
agent.base_url = "https://api.githubcopilot.com"
agent.model = "gpt-4.1"
messages = [{"role": "user", "content": "hi"}]
kwargs = agent._build_api_kwargs(messages)
assert "reasoning" not in kwargs.get("extra_body", {})
def test_max_tokens_injected(self, agent):
agent.max_tokens = 4096
messages = [{"role": "user", "content": "hi"}]
@@ -806,7 +828,7 @@ class TestConcurrentToolExecution:
mock_con.assert_not_called()
def test_multiple_tools_uses_concurrent_path(self, agent):
"""Multiple non-interactive tools should use concurrent path."""
"""Multiple read-only tools should use concurrent path."""
tc1 = _mock_tool_call(name="web_search", arguments='{}', call_id="c1")
tc2 = _mock_tool_call(name="read_file", arguments='{"path":"x.py"}', call_id="c2")
mock_msg = _mock_assistant_msg(content="", tool_calls=[tc1, tc2])
@@ -817,6 +839,94 @@ class TestConcurrentToolExecution:
mock_con.assert_called_once()
mock_seq.assert_not_called()
def test_terminal_batch_forces_sequential(self, agent):
"""Stateful tools should not share the concurrent execution path."""
tc1 = _mock_tool_call(name="web_search", arguments='{}', call_id="c1")
tc2 = _mock_tool_call(name="terminal", arguments='{"command":"pwd"}', call_id="c2")
mock_msg = _mock_assistant_msg(content="", tool_calls=[tc1, tc2])
messages = []
with patch.object(agent, "_execute_tool_calls_sequential") as mock_seq:
with patch.object(agent, "_execute_tool_calls_concurrent") as mock_con:
agent._execute_tool_calls(mock_msg, messages, "task-1")
mock_seq.assert_called_once()
mock_con.assert_not_called()
def test_write_batch_forces_sequential(self, agent):
"""File mutations should stay ordered within a turn."""
tc1 = _mock_tool_call(name="read_file", arguments='{"path":"x.py"}', call_id="c1")
tc2 = _mock_tool_call(name="write_file", arguments='{"path":"x.py","content":"print(1)"}', call_id="c2")
mock_msg = _mock_assistant_msg(content="", tool_calls=[tc1, tc2])
messages = []
with patch.object(agent, "_execute_tool_calls_sequential") as mock_seq:
with patch.object(agent, "_execute_tool_calls_concurrent") as mock_con:
agent._execute_tool_calls(mock_msg, messages, "task-1")
mock_seq.assert_called_once()
mock_con.assert_not_called()
def test_disjoint_write_batch_uses_concurrent_path(self, agent):
"""Independent file writes should still run concurrently."""
tc1 = _mock_tool_call(
name="write_file",
arguments='{"path":"src/a.py","content":"print(1)"}',
call_id="c1",
)
tc2 = _mock_tool_call(
name="write_file",
arguments='{"path":"src/b.py","content":"print(2)"}',
call_id="c2",
)
mock_msg = _mock_assistant_msg(content="", tool_calls=[tc1, tc2])
messages = []
with patch.object(agent, "_execute_tool_calls_sequential") as mock_seq:
with patch.object(agent, "_execute_tool_calls_concurrent") as mock_con:
agent._execute_tool_calls(mock_msg, messages, "task-1")
mock_con.assert_called_once()
mock_seq.assert_not_called()
def test_overlapping_write_batch_forces_sequential(self, agent):
"""Writes to the same file must stay ordered."""
tc1 = _mock_tool_call(
name="write_file",
arguments='{"path":"src/a.py","content":"print(1)"}',
call_id="c1",
)
tc2 = _mock_tool_call(
name="patch",
arguments='{"path":"src/a.py","old_string":"1","new_string":"2"}',
call_id="c2",
)
mock_msg = _mock_assistant_msg(content="", tool_calls=[tc1, tc2])
messages = []
with patch.object(agent, "_execute_tool_calls_sequential") as mock_seq:
with patch.object(agent, "_execute_tool_calls_concurrent") as mock_con:
agent._execute_tool_calls(mock_msg, messages, "task-1")
mock_seq.assert_called_once()
mock_con.assert_not_called()
def test_malformed_json_args_forces_sequential(self, agent):
"""Unparseable tool arguments should fall back to sequential."""
tc1 = _mock_tool_call(name="web_search", arguments='{}', call_id="c1")
tc2 = _mock_tool_call(name="web_search", arguments="NOT JSON {{{", call_id="c2")
mock_msg = _mock_assistant_msg(content="", tool_calls=[tc1, tc2])
messages = []
with patch.object(agent, "_execute_tool_calls_sequential") as mock_seq:
with patch.object(agent, "_execute_tool_calls_concurrent") as mock_con:
agent._execute_tool_calls(mock_msg, messages, "task-1")
mock_seq.assert_called_once()
mock_con.assert_not_called()
def test_non_dict_args_forces_sequential(self, agent):
"""Tool arguments that parse to a non-dict type should fall back to sequential."""
tc1 = _mock_tool_call(name="web_search", arguments='{}', call_id="c1")
tc2 = _mock_tool_call(name="web_search", arguments='"just a string"', call_id="c2")
mock_msg = _mock_assistant_msg(content="", tool_calls=[tc1, tc2])
messages = []
with patch.object(agent, "_execute_tool_calls_sequential") as mock_seq:
with patch.object(agent, "_execute_tool_calls_concurrent") as mock_con:
agent._execute_tool_calls(mock_msg, messages, "task-1")
mock_seq.assert_called_once()
mock_con.assert_not_called()
def test_concurrent_executes_all_tools(self, agent):
"""Concurrent path should execute all tools and append results in order."""
tc1 = _mock_tool_call(name="web_search", arguments='{"q":"alpha"}', call_id="c1")
@@ -943,6 +1053,39 @@ class TestConcurrentToolExecution:
assert "ok" in result
class TestPathsOverlap:
"""Unit tests for the _paths_overlap helper."""
def test_same_path_overlaps(self):
from run_agent import _paths_overlap
assert _paths_overlap(Path("src/a.py"), Path("src/a.py"))
def test_siblings_do_not_overlap(self):
from run_agent import _paths_overlap
assert not _paths_overlap(Path("src/a.py"), Path("src/b.py"))
def test_parent_child_overlap(self):
from run_agent import _paths_overlap
assert _paths_overlap(Path("src"), Path("src/sub/a.py"))
def test_different_roots_do_not_overlap(self):
from run_agent import _paths_overlap
assert not _paths_overlap(Path("src/a.py"), Path("other/a.py"))
def test_nested_vs_flat_do_not_overlap(self):
from run_agent import _paths_overlap
assert not _paths_overlap(Path("src/sub/a.py"), Path("src/a.py"))
def test_empty_paths_do_not_overlap(self):
from run_agent import _paths_overlap
assert not _paths_overlap(Path(""), Path(""))
def test_one_empty_path_does_not_overlap(self):
from run_agent import _paths_overlap
assert not _paths_overlap(Path(""), Path("src/a.py"))
assert not _paths_overlap(Path("src/a.py"), Path(""))
class TestHandleMaxIterations:
def test_returns_summary(self, agent):
resp = _mock_response(content="Here is a summary of what I did.")
@@ -2172,6 +2315,41 @@ class TestFallbackAnthropicProvider:
assert agent.client is mock_client
def test_aiagent_uses_copilot_acp_client():
with (
patch("run_agent.get_tool_definitions", return_value=_make_tool_defs("web_search")),
patch("run_agent.check_toolset_requirements", return_value={}),
patch("run_agent.OpenAI") as mock_openai,
patch("agent.copilot_acp_client.CopilotACPClient") as mock_acp_client,
):
acp_client = MagicMock()
mock_acp_client.return_value = acp_client
agent = AIAgent(
api_key="copilot-acp",
base_url="acp://copilot",
provider="copilot-acp",
acp_command="/usr/local/bin/copilot",
acp_args=["--acp", "--stdio"],
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
)
assert agent.client is acp_client
mock_openai.assert_not_called()
mock_acp_client.assert_called_once()
assert mock_acp_client.call_args.kwargs["base_url"] == "acp://copilot"
assert mock_acp_client.call_args.kwargs["api_key"] == "copilot-acp"
assert mock_acp_client.call_args.kwargs["command"] == "/usr/local/bin/copilot"
assert mock_acp_client.call_args.kwargs["args"] == ["--acp", "--stdio"]
def test_is_openai_client_closed_honors_custom_client_flag():
assert AIAgent._is_openai_client_closed(SimpleNamespace(is_closed=True)) is True
assert AIAgent._is_openai_client_closed(SimpleNamespace(is_closed=False)) is False
class TestAnthropicBaseUrlPassthrough:
"""Bug fix: base_url was filtered with 'anthropic in base_url', blocking proxies."""
@@ -2717,3 +2895,135 @@ class TestNormalizeCodexDictArguments:
msg, _ = agent._normalize_codex_response(response)
tc = msg.tool_calls[0]
assert tc.function.arguments == args_str
# ---------------------------------------------------------------------------
# OAuth flag and nudge counter fixes (salvaged from PR #1797)
# ---------------------------------------------------------------------------
class TestOAuthFlagAfterCredentialRefresh:
"""_is_anthropic_oauth must update when token type changes during refresh."""
def test_oauth_flag_updates_api_key_to_oauth(self, agent):
"""Refreshing from API key to OAuth token must set flag to True."""
agent.api_mode = "anthropic_messages"
agent._anthropic_api_key = "sk-ant-api-old"
agent._anthropic_client = MagicMock()
agent._is_anthropic_oauth = False
with (
patch("agent.anthropic_adapter.resolve_anthropic_token",
return_value="sk-ant-setup-oauth-token"),
patch("agent.anthropic_adapter.build_anthropic_client",
return_value=MagicMock()),
):
result = agent._try_refresh_anthropic_client_credentials()
assert result is True
assert agent._is_anthropic_oauth is True
def test_oauth_flag_updates_oauth_to_api_key(self, agent):
"""Refreshing from OAuth to API key must set flag to False."""
agent.api_mode = "anthropic_messages"
agent._anthropic_api_key = "sk-ant-setup-old"
agent._anthropic_client = MagicMock()
agent._is_anthropic_oauth = True
with (
patch("agent.anthropic_adapter.resolve_anthropic_token",
return_value="sk-ant-api03-new-key"),
patch("agent.anthropic_adapter.build_anthropic_client",
return_value=MagicMock()),
):
result = agent._try_refresh_anthropic_client_credentials()
assert result is True
assert agent._is_anthropic_oauth is False
class TestFallbackSetsOAuthFlag:
"""_try_activate_fallback must set _is_anthropic_oauth for Anthropic fallbacks."""
def test_fallback_to_anthropic_oauth_sets_flag(self, agent):
agent._fallback_activated = False
agent._fallback_model = {"provider": "anthropic", "model": "claude-sonnet-4-6"}
mock_client = MagicMock()
mock_client.base_url = "https://api.anthropic.com/v1"
mock_client.api_key = "sk-ant-setup-oauth-token"
with (
patch("agent.auxiliary_client.resolve_provider_client",
return_value=(mock_client, None)),
patch("agent.anthropic_adapter.build_anthropic_client",
return_value=MagicMock()),
patch("agent.anthropic_adapter.resolve_anthropic_token",
return_value=None),
):
result = agent._try_activate_fallback()
assert result is True
assert agent._is_anthropic_oauth is True
def test_fallback_to_anthropic_api_key_clears_flag(self, agent):
agent._fallback_activated = False
agent._fallback_model = {"provider": "anthropic", "model": "claude-sonnet-4-6"}
mock_client = MagicMock()
mock_client.base_url = "https://api.anthropic.com/v1"
mock_client.api_key = "sk-ant-api03-regular-key"
with (
patch("agent.auxiliary_client.resolve_provider_client",
return_value=(mock_client, None)),
patch("agent.anthropic_adapter.build_anthropic_client",
return_value=MagicMock()),
patch("agent.anthropic_adapter.resolve_anthropic_token",
return_value=None),
):
result = agent._try_activate_fallback()
assert result is True
assert agent._is_anthropic_oauth is False
class TestMemoryNudgeCounterPersistence:
"""_turns_since_memory must persist across run_conversation calls."""
def test_counters_initialized_in_init(self):
"""Counters must exist on the agent after __init__."""
with patch("run_agent.get_tool_definitions", return_value=[]):
a = AIAgent(
model="test", api_key="test-key", provider="openrouter",
skip_context_files=True, skip_memory=True,
)
assert hasattr(a, "_turns_since_memory")
assert hasattr(a, "_iters_since_skill")
assert a._turns_since_memory == 0
assert a._iters_since_skill == 0
def test_counters_not_reset_in_preamble(self):
"""The run_conversation preamble must not zero the nudge counters."""
import inspect
src = inspect.getsource(AIAgent.run_conversation)
# The preamble resets many fields (retry counts, budget, etc.)
# before the main loop. Find that reset block and verify our
# counters aren't in it. The reset block ends at iteration_budget.
preamble_end = src.index("self.iteration_budget = IterationBudget")
preamble = src[:preamble_end]
assert "self._turns_since_memory = 0" not in preamble
assert "self._iters_since_skill = 0" not in preamble
class TestDeadRetryCode:
"""Unreachable retry_count >= max_retries after raise must not exist."""
def test_no_unreachable_max_retries_after_backoff(self):
import inspect
source = inspect.getsource(AIAgent.run_conversation)
occurrences = source.count("if retry_count >= max_retries:")
assert occurrences == 2, (
f"Expected 2 occurrences of 'if retry_count >= max_retries:' "
f"but found {occurrences}"
)
+43
View File
@@ -49,6 +49,27 @@ def _build_agent(monkeypatch):
return agent
def _build_copilot_agent(monkeypatch, *, model="gpt-5.4"):
_patch_agent_bootstrap(monkeypatch)
agent = run_agent.AIAgent(
model=model,
provider="copilot",
api_mode="codex_responses",
base_url="https://api.githubcopilot.com",
api_key="gh-token",
quiet_mode=True,
max_iterations=4,
skip_context_files=True,
skip_memory=True,
)
agent._cleanup_task_resources = lambda task_id: None
agent._persist_session = lambda messages, history=None: None
agent._save_trajectory = lambda messages, user_message, completed: None
agent._save_session_log = lambda messages: None
return agent
def _codex_message_response(text: str):
return SimpleNamespace(
output=[
@@ -244,6 +265,28 @@ def test_build_api_kwargs_codex(monkeypatch):
assert "extra_body" not in kwargs
def test_build_api_kwargs_copilot_responses_omits_openai_only_fields(monkeypatch):
agent = _build_copilot_agent(monkeypatch)
kwargs = agent._build_api_kwargs([{"role": "user", "content": "hi"}])
assert kwargs["model"] == "gpt-5.4"
assert kwargs["store"] is False
assert kwargs["tool_choice"] == "auto"
assert kwargs["parallel_tool_calls"] is True
assert kwargs["reasoning"] == {"effort": "medium"}
assert "prompt_cache_key" not in kwargs
assert "include" not in kwargs
def test_build_api_kwargs_copilot_responses_omits_reasoning_for_non_reasoning_model(monkeypatch):
agent = _build_copilot_agent(monkeypatch, model="gpt-4.1")
kwargs = agent._build_api_kwargs([{"role": "user", "content": "hi"}])
assert "reasoning" not in kwargs
assert "include" not in kwargs
assert "prompt_cache_key" not in kwargs
def test_run_codex_stream_retries_when_completed_event_missing(monkeypatch):
agent = _build_agent(monkeypatch)
calls = {"stream": 0}
+110 -1
View File
@@ -177,6 +177,50 @@ def test_custom_endpoint_uses_saved_config_base_url_when_env_missing(monkeypatch
assert resolved["api_key"] == "local-key"
def test_custom_endpoint_uses_config_api_key_over_env(monkeypatch):
"""provider: custom with base_url and api_key in config uses them (#1760)."""
monkeypatch.setattr(rp, "resolve_provider", lambda *a, **k: "openrouter")
monkeypatch.setattr(
rp,
"_get_model_config",
lambda: {
"provider": "custom",
"base_url": "https://my-api.example.com/v1",
"api_key": "config-api-key",
},
)
monkeypatch.setenv("OPENAI_BASE_URL", "https://other.example.com/v1")
monkeypatch.setenv("OPENAI_API_KEY", "env-key")
monkeypatch.delenv("OPENROUTER_BASE_URL", raising=False)
resolved = rp.resolve_runtime_provider(requested="custom")
assert resolved["base_url"] == "https://my-api.example.com/v1"
assert resolved["api_key"] == "config-api-key"
def test_custom_endpoint_uses_config_api_field_when_no_api_key(monkeypatch):
"""provider: custom with 'api' in config uses it as api_key (#1760)."""
monkeypatch.setattr(rp, "resolve_provider", lambda *a, **k: "openrouter")
monkeypatch.setattr(
rp,
"_get_model_config",
lambda: {
"provider": "custom",
"base_url": "https://custom.example.com/v1",
"api": "config-api-field",
},
)
monkeypatch.delenv("OPENAI_BASE_URL", raising=False)
monkeypatch.delenv("OPENAI_API_KEY", raising=False)
monkeypatch.delenv("OPENROUTER_API_KEY", raising=False)
resolved = rp.resolve_runtime_provider(requested="custom")
assert resolved["base_url"] == "https://custom.example.com/v1"
assert resolved["api_key"] == "config-api-field"
def test_custom_endpoint_auto_provider_prefers_openai_key(monkeypatch):
"""Auto provider with non-OpenRouter base_url should prefer OPENAI_API_KEY.
@@ -394,10 +438,75 @@ def test_named_custom_provider_without_api_mode_defaults(monkeypatch):
lambda p: {
"name": "my-server",
"base_url": "http://localhost:8000/v1",
"api_key": "sk-test",
"api_key": "***",
},
)
resolved = rp.resolve_runtime_provider(requested="my-server")
assert resolved["api_mode"] == "chat_completions"
def test_anthropic_messages_in_valid_api_modes():
"""anthropic_messages should be accepted by _parse_api_mode."""
assert rp._parse_api_mode("anthropic_messages") == "anthropic_messages"
def test_api_key_provider_anthropic_url_auto_detection(monkeypatch):
"""API-key providers with /anthropic base URL should auto-detect anthropic_messages mode."""
monkeypatch.setattr(rp, "resolve_provider", lambda *a, **k: "minimax")
monkeypatch.setattr(rp, "_get_model_config", lambda: {})
monkeypatch.setenv("MINIMAX_API_KEY", "test-minimax-key")
monkeypatch.setenv("MINIMAX_BASE_URL", "https://api.minimax.io/anthropic")
resolved = rp.resolve_runtime_provider(requested="minimax")
assert resolved["provider"] == "minimax"
assert resolved["api_mode"] == "anthropic_messages"
assert resolved["base_url"] == "https://api.minimax.io/anthropic"
def test_api_key_provider_explicit_api_mode_config(monkeypatch):
"""API-key providers should respect api_mode from model config."""
monkeypatch.setattr(rp, "resolve_provider", lambda *a, **k: "minimax")
monkeypatch.setattr(rp, "_get_model_config", lambda: {"api_mode": "anthropic_messages"})
monkeypatch.setenv("MINIMAX_API_KEY", "test-minimax-key")
monkeypatch.delenv("MINIMAX_BASE_URL", raising=False)
resolved = rp.resolve_runtime_provider(requested="minimax")
assert resolved["provider"] == "minimax"
assert resolved["api_mode"] == "anthropic_messages"
def test_api_key_provider_default_url_stays_chat_completions(monkeypatch):
"""API-key providers with default /v1 URL should stay on chat_completions."""
monkeypatch.setattr(rp, "resolve_provider", lambda *a, **k: "minimax")
monkeypatch.setattr(rp, "_get_model_config", lambda: {})
monkeypatch.setenv("MINIMAX_API_KEY", "test-minimax-key")
monkeypatch.delenv("MINIMAX_BASE_URL", raising=False)
resolved = rp.resolve_runtime_provider(requested="minimax")
assert resolved["provider"] == "minimax"
assert resolved["api_mode"] == "chat_completions"
assert resolved["base_url"] == "https://api.minimax.io/v1"
def test_named_custom_provider_anthropic_api_mode(monkeypatch):
"""Custom providers should accept api_mode: anthropic_messages."""
monkeypatch.setattr(rp, "resolve_provider", lambda *a, **k: "my-anthropic-proxy")
monkeypatch.setattr(
rp, "_get_named_custom_provider",
lambda p: {
"name": "my-anthropic-proxy",
"base_url": "https://proxy.example.com/anthropic",
"api_key": "test-key",
"api_mode": "anthropic_messages",
},
)
resolved = rp.resolve_runtime_provider(requested="my-anthropic-proxy")
assert resolved["api_mode"] == "anthropic_messages"
assert resolved["base_url"] == "https://proxy.example.com/anthropic"
+36
View File
@@ -505,6 +505,42 @@ class TestToolsetInjection:
assert "mcp_fs_list_files" not in fake_toolsets["non-hermes"]["tools"]
# Original tools preserved
assert "terminal" in fake_toolsets["hermes-cli"]["tools"]
# Server name becomes a standalone toolset
assert "fs" in fake_toolsets
assert "mcp_fs_list_files" in fake_toolsets["fs"]["tools"]
assert fake_toolsets["fs"]["description"].startswith("MCP server '")
def test_server_toolset_skips_builtin_collision(self):
"""MCP server named after a built-in toolset shouldn't overwrite it."""
from tools.mcp_tool import MCPServerTask
mock_tools = [_make_mcp_tool("run", "Run command")]
mock_session = MagicMock()
fresh_servers = {}
async def fake_connect(name, config):
server = MCPServerTask(name)
server.session = mock_session
server._tools = mock_tools
return server
fake_toolsets = {
"hermes-cli": {"tools": ["terminal"], "description": "CLI", "includes": []},
# Built-in toolset named "terminal" — must not be overwritten
"terminal": {"tools": ["terminal"], "description": "Terminal tools", "includes": []},
}
fake_config = {"terminal": {"command": "npx", "args": []}}
with patch("tools.mcp_tool._MCP_AVAILABLE", True), \
patch("tools.mcp_tool._servers", fresh_servers), \
patch("tools.mcp_tool._load_mcp_config", return_value=fake_config), \
patch("tools.mcp_tool._connect_server", side_effect=fake_connect), \
patch("toolsets.TOOLSETS", fake_toolsets):
from tools.mcp_tool import discover_mcp_tools
discover_mcp_tools()
# Built-in toolset preserved — description unchanged
assert fake_toolsets["terminal"]["description"] == "Terminal tools"
def test_server_connection_failure_skipped(self):
"""If one server fails to connect, others still proceed."""
+8
View File
@@ -441,6 +441,14 @@ class TestSearchLoopDetection(unittest.TestCase):
self.assertNotIn("_warning", result)
self.assertNotIn("error", result)
@patch("tools.file_tools._get_file_ops", return_value=_make_fake_file_ops())
def test_pagination_offset_does_not_count_as_repeat(self, _mock_ops):
"""Paginating truncated results should not be blocked as a repeat search."""
for offset in (0, 50, 100, 150):
result = json.loads(search_tool("def main", task_id="t1", offset=offset, limit=50))
self.assertNotIn("_warning", result)
self.assertNotIn("error", result)
@patch("tools.file_tools._get_file_ops", return_value=_make_fake_file_ops())
def test_read_between_searches_resets_consecutive(self, _mock_ops):
"""A read_file call between searches resets search consecutive counter."""
+28
View File
@@ -154,6 +154,34 @@ class TestShouldAllowInstall:
assert allowed is True
assert "Force-installed" in reason
# -- agent-created policy --
def test_safe_agent_created_allowed(self):
allowed, _ = should_allow_install(self._result("agent-created", "safe"))
assert allowed is True
def test_caution_agent_created_allowed(self):
"""Agent-created skills with caution verdict (e.g. docker refs) should pass."""
f = [Finding("docker_pull", "medium", "supply_chain", "SKILL.md", 1, "docker pull img", "pulls Docker image")]
allowed, reason = should_allow_install(self._result("agent-created", "caution", f))
assert allowed is True
assert "agent-created" in reason
def test_dangerous_agent_created_blocked(self):
"""Agent-created skills with dangerous verdict (critical findings) stay blocked."""
f = [Finding("env_exfil_curl", "critical", "exfiltration", "SKILL.md", 1, "curl $TOKEN", "exfiltration")]
allowed, reason = should_allow_install(self._result("agent-created", "dangerous", f))
assert allowed is False
assert "Blocked" in reason
def test_force_overrides_dangerous_for_agent_created(self):
f = [Finding("x", "critical", "c", "f", 1, "m", "d")]
allowed, reason = should_allow_install(
self._result("agent-created", "dangerous", f), force=True
)
assert allowed is True
assert "Force-installed" in reason
# ---------------------------------------------------------------------------
# scan_file — pattern detection
+29
View File
@@ -374,6 +374,35 @@ class TestSkillView:
result = json.loads(raw)
assert result["success"] is False
def test_view_disabled_skill_blocked(self, tmp_path):
"""Disabled skills should not be viewable via skill_view."""
with (
patch("tools.skills_tool.SKILLS_DIR", tmp_path),
patch(
"tools.skills_tool._is_skill_disabled",
return_value=True,
),
):
_make_skill(tmp_path, "hidden-skill")
raw = skill_view("hidden-skill")
result = json.loads(raw)
assert result["success"] is False
assert "disabled" in result["error"].lower()
def test_view_enabled_skill_allowed(self, tmp_path):
"""Non-disabled skills should be viewable normally."""
with (
patch("tools.skills_tool.SKILLS_DIR", tmp_path),
patch(
"tools.skills_tool._is_skill_disabled",
return_value=False,
),
):
_make_skill(tmp_path, "active-skill")
raw = skill_view("active-skill")
result = json.loads(raw)
assert result["success"] is True
class TestSkillViewSecureSetupOnLoad:
def test_requests_missing_required_env_and_continues(self, tmp_path, monkeypatch):
+19 -5
View File
@@ -173,10 +173,6 @@ def _build_child_agent(
from run_agent import AIAgent
import model_tools
# Save the parent's resolved tool names before the child agent can
# overwrite the process-global via get_tool_definitions().
_saved_tool_names = list(model_tools._last_resolved_tool_names)
# When no explicit toolsets given, inherit from parent's enabled toolsets
# so disabled tools (e.g. web) don't leak to subagents.
if toolsets:
@@ -205,6 +201,8 @@ def _build_child_agent(
effective_base_url = override_base_url or parent_agent.base_url
effective_api_key = override_api_key or parent_api_key
effective_api_mode = override_api_mode or getattr(parent_agent, "api_mode", None)
effective_acp_command = getattr(parent_agent, "acp_command", None)
effective_acp_args = list(getattr(parent_agent, "acp_args", []) or [])
child = AIAgent(
base_url=effective_base_url,
@@ -212,6 +210,8 @@ def _build_child_agent(
model=effective_model,
provider=effective_provider,
api_mode=effective_api_mode,
acp_command=effective_acp_command,
acp_args=effective_acp_args,
max_iterations=max_iterations,
max_tokens=getattr(parent_agent, "max_tokens", None),
reasoning_config=getattr(parent_agent, "reasoning_config", None),
@@ -232,6 +232,7 @@ def _build_child_agent(
tool_progress_callback=child_progress_cb,
iteration_budget=shared_budget,
)
child._delegate_saved_tool_names = list(_saved_tool_names)
# Set delegation depth so children can't spawn grandchildren
child._delegate_depth = getattr(parent_agent, '_delegate_depth', 0) + 1
@@ -263,6 +264,13 @@ def _run_single_child(
# Get the progress callback from the child agent
child_progress_cb = getattr(child, 'tool_progress_callback', None)
# Save the parent's resolved tool names before the child agent can
# overwrite the process-global via get_tool_definitions().
# This must be in _run_single_child (not _build_child_agent) so the
# save/restore happens in the same scope as the try/finally.
import model_tools
_saved_tool_names = list(model_tools._last_resolved_tool_names)
try:
result = child.run_conversation(user_message=goal)
@@ -372,7 +380,11 @@ def _run_single_child(
finally:
# Restore the parent's tool names so the process-global is correct
# for any subsequent execute_code calls or other consumers.
model_tools._last_resolved_tool_names = _saved_tool_names
import model_tools
saved_tool_names = getattr(child, "_delegate_saved_tool_names", None)
if isinstance(saved_tool_names, list):
model_tools._last_resolved_tool_names = list(saved_tool_names)
# Unregister child from interrupt propagation
if hasattr(parent_agent, '_active_children'):
@@ -623,6 +635,8 @@ def _resolve_delegation_credentials(cfg: dict, parent_agent) -> dict:
"base_url": runtime.get("base_url"),
"api_key": api_key,
"api_mode": runtime.get("api_mode"),
"command": runtime.get("command"),
"args": list(runtime.get("args") or []),
}
+11 -1
View File
@@ -337,7 +337,17 @@ def search_tool(pattern: str, target: str = "content", path: str = ".",
"""Search for content or files."""
try:
# Track searches to detect *consecutive* repeated search loops.
search_key = ("search", pattern, target, str(path), file_glob or "")
# Include pagination args so users can page through truncated
# results without tripping the repeated-search guard.
search_key = (
"search",
pattern,
target,
str(path),
file_glob or "",
limit,
offset,
)
with _read_tracker_lock:
task_data = _read_tracker.setdefault(task_id, {
"last_key": None, "consecutive": 0, "read_history": set(),
+53 -8
View File
@@ -1238,6 +1238,57 @@ def _convert_mcp_schema(server_name: str, mcp_tool) -> dict:
}
def _sync_mcp_toolsets(server_names: Optional[List[str]] = None) -> None:
"""Expose each MCP server as a standalone toolset and inject into hermes-* sets.
Creates a real toolset entry in TOOLSETS for each server name (e.g.
TOOLSETS["github"] = {"tools": ["mcp_github_list_files", ...]}). This
makes raw server names resolvable in platform_toolsets overrides.
Also injects all MCP tools into hermes-* umbrella toolsets for the
default behavior.
Skips server names that collide with built-in toolsets.
"""
from toolsets import TOOLSETS
if server_names is None:
server_names = list(_load_mcp_config().keys())
existing = _existing_tool_names()
all_mcp_tools: List[str] = []
for server_name in server_names:
safe_prefix = f"mcp_{server_name.replace('-', '_').replace('.', '_')}_"
server_tools = sorted(
t for t in existing if t.startswith(safe_prefix)
)
all_mcp_tools.extend(server_tools)
# Don't overwrite a built-in toolset that happens to share the name.
existing_ts = TOOLSETS.get(server_name)
if existing_ts and not str(existing_ts.get("description", "")).startswith("MCP server '"):
logger.warning(
"Skipping MCP toolset alias '%s' — a built-in toolset already uses that name",
server_name,
)
continue
TOOLSETS[server_name] = {
"description": f"MCP server '{server_name}' tools",
"tools": server_tools,
"includes": [],
}
# Also inject into hermes-* umbrella toolsets for default behavior.
for ts_name, ts in TOOLSETS.items():
if not ts_name.startswith("hermes-"):
continue
for tool_name in all_mcp_tools:
if tool_name not in ts["tools"]:
ts["tools"].append(tool_name)
def _build_utility_schemas(server_name: str) -> List[dict]:
"""Build schemas for the MCP utility tools (resources & prompts).
@@ -1523,6 +1574,7 @@ def discover_mcp_tools() -> List[str]:
}
if not new_servers:
_sync_mcp_toolsets(list(servers.keys()))
return _existing_tool_names()
# Start the background event loop for MCP connections
@@ -1562,14 +1614,7 @@ def discover_mcp_tools() -> List[str]:
# The outer timeout is generous: 120s total for parallel discovery.
_run_on_mcp_loop(_discover_all(), timeout=120)
if all_tools:
# Dynamically inject into all hermes-* platform toolsets
from toolsets import TOOLSETS
for ts_name, ts in TOOLSETS.items():
if ts_name.startswith("hermes-"):
for tool_name in all_tools:
if tool_name not in ts["tools"]:
ts["tools"].append(tool_name)
_sync_mcp_toolsets(list(servers.keys()))
# Print summary
total_servers = len(new_servers)
+2 -2
View File
@@ -8,7 +8,7 @@ Usage:
python -m tools.neutts_synth --text "Hello" --out output.wav \
--ref-audio samples/jo.wav --ref-text samples/jo.txt
Requires: pip install neutts[all]
Requires: python -m pip install -U neutts[all]
System: apt install espeak-ng (or brew install espeak-ng)
"""
@@ -75,7 +75,7 @@ def main():
try:
from neutts import NeuTTS
except ImportError:
print("Error: neutts not installed. Run: pip install neutts[all]", file=sys.stderr)
print("Error: neutts not installed. Run: python -m pip install -U neutts[all]", file=sys.stderr)
sys.exit(1)
tts = NeuTTS(
+2 -2
View File
@@ -1009,7 +1009,7 @@ async def rl_list_runs() -> str:
TEST_MODELS = [
{"id": "qwen/qwen3-8b", "name": "Qwen3 8B", "scale": "small"},
{"id": "z-ai/glm-4.7-flash", "name": "GLM-4.7 Flash", "scale": "medium"},
{"id": "minimax/minimax-m2.5", "name": "MiniMax M2.5", "scale": "large"},
{"id": "minimax/minimax-m2.7", "name": "MiniMax M2.7", "scale": "large"},
]
# Default test parameters - quick but representative
@@ -1370,7 +1370,7 @@ RL_CHECK_STATUS_SCHEMA = {"name": "rl_check_status", "description": "Get status
RL_STOP_TRAINING_SCHEMA = {"name": "rl_stop_training", "description": "Stop a running training job. Use if metrics look bad, training is stagnant, or you want to try different settings.", "parameters": {"type": "object", "properties": {"run_id": {"type": "string", "description": "The run ID to stop"}}, "required": ["run_id"]}}
RL_GET_RESULTS_SCHEMA = {"name": "rl_get_results", "description": "Get final results and metrics for a completed training run. Returns final metrics and path to trained weights.", "parameters": {"type": "object", "properties": {"run_id": {"type": "string", "description": "The run ID to get results for"}}, "required": ["run_id"]}}
RL_LIST_RUNS_SCHEMA = {"name": "rl_list_runs", "description": "List all training runs (active and completed) with their status.", "parameters": {"type": "object", "properties": {}, "required": []}}
RL_TEST_INFERENCE_SCHEMA = {"name": "rl_test_inference", "description": "Quick inference test for any environment. Runs a few steps of inference + scoring using OpenRouter. Default: 3 steps x 16 completions = 48 rollouts per model, testing 3 models = 144 total. Tests environment loading, prompt construction, inference parsing, and verifier logic. Use BEFORE training to catch issues.", "parameters": {"type": "object", "properties": {"num_steps": {"type": "integer", "description": "Number of steps to run (default: 3, recommended max for testing)", "default": 3}, "group_size": {"type": "integer", "description": "Completions per step (default: 16, like training)", "default": 16}, "models": {"type": "array", "items": {"type": "string"}, "description": "Optional list of OpenRouter model IDs. Default: qwen/qwen3-8b, z-ai/glm-4.7-flash, minimax/minimax-m2.5"}}, "required": []}}
RL_TEST_INFERENCE_SCHEMA = {"name": "rl_test_inference", "description": "Quick inference test for any environment. Runs a few steps of inference + scoring using OpenRouter. Default: 3 steps x 16 completions = 48 rollouts per model, testing 3 models = 144 total. Tests environment loading, prompt construction, inference parsing, and verifier logic. Use BEFORE training to catch issues.", "parameters": {"type": "object", "properties": {"num_steps": {"type": "integer", "description": "Number of steps to run (default: 3, recommended max for testing)", "default": 3}, "group_size": {"type": "integer", "description": "Completions per step (default: 16, like training)", "default": 16}, "models": {"type": "array", "items": {"type": "string"}, "description": "Optional list of OpenRouter model IDs. Default: qwen/qwen3-8b, z-ai/glm-4.7-flash, minimax/minimax-m2.7"}}, "required": []}}
_rl_env = ["TINKER_API_KEY", "WANDB_API_KEY"]
+1 -1
View File
@@ -43,7 +43,7 @@ INSTALL_POLICY = {
"builtin": ("allow", "allow", "allow"),
"trusted": ("allow", "allow", "block"),
"community": ("allow", "block", "block"),
"agent-created": ("allow", "block", "block"),
"agent-created": ("allow", "allow", "block"),
}
VERDICT_INDEX = {"safe": 0, "caution": 1, "dangerous": 2}
+14
View File
@@ -920,6 +920,20 @@ def skill_view(name: str, file_path: str = None, task_id: str = None) -> str:
ensure_ascii=False,
)
# Check if the skill is disabled by the user
resolved_name = parsed_frontmatter.get("name", skill_md.parent.name)
if _is_skill_disabled(resolved_name):
return json.dumps(
{
"success": False,
"error": (
f"Skill '{resolved_name}' is disabled. "
"Enable it with `hermes skills` or inspect the files directly on disk."
),
},
ensure_ascii=False,
)
# If a specific file path is requested, read that instead
if file_path and skill_dir:
# Security: Prevent path traversal attacks
+2 -2
View File
@@ -423,8 +423,8 @@ def text_to_speech_tool(
if not _check_neutts_available():
return json.dumps({
"success": False,
"error": "NeuTTS provider selected but neutts_cli is not installed. "
"Install the NeuTTS skill and run the bootstrap helper first."
"error": "NeuTTS provider selected but neutts is not installed. "
"Run hermes setup and choose NeuTTS, or install espeak-ng and run python -m pip install -U neutts[all]."
}, ensure_ascii=False)
logger.info("Generating speech with NeuTTS (local)...")
_generate_neutts(text, file_str, tts_config)
@@ -28,17 +28,19 @@ Primary files:
The cached system prompt is assembled in roughly this order:
1. default agent identity
1. agent identity`SOUL.md` from `HERMES_HOME` when available, otherwise falls back to `DEFAULT_AGENT_IDENTITY` in `prompt_builder.py`
2. tool-aware behavior guidance
3. Honcho static block (when active)
4. optional system message
5. frozen MEMORY snapshot
6. frozen USER profile snapshot
7. skills index
8. context files (`AGENTS.md`, `SOUL.md`, `.cursorrules`, `.cursor/rules/*.mdc`)
8. context files (`AGENTS.md`, `.cursorrules`, `.cursor/rules/*.mdc`) — SOUL.md is **not** included here when it was already loaded as the identity in step 1
9. timestamp / optional session ID
10. platform hint
When `skip_context_files` is set (e.g., subagent delegation), SOUL.md is not loaded and the hardcoded `DEFAULT_AGENT_IDENTITY` is used instead.
## API-call-time-only layers
These are intentionally *not* persisted as part of the cached system prompt:
@@ -59,10 +61,11 @@ Local memory and user profile data are injected as frozen snapshots at session s
`agent/prompt_builder.py` scans and sanitizes:
- `AGENTS.md`
- `SOUL.md`
- `.cursorrules`
- `.cursor/rules/*.mdc`
`SOUL.md` is loaded separately via `load_soul_md()` for the identity slot. When it loads successfully, `build_context_files_prompt(skip_soul=True)` prevents it from appearing twice.
Long files are truncated before injection.
## Skills index
+6 -1
View File
@@ -37,8 +37,13 @@ That last part matters. Good MCP usage is not just “connect everything.” It
## Step 1: install MCP support
If you installed Hermes with the standard install script, MCP support is already included (the installer runs `uv pip install -e ".[all]"`).
If you installed without extras and need to add MCP separately:
```bash
pip install hermes-agent[mcp]
cd ~/.hermes/hermes-agent
uv pip install -e ".[mcp]"
```
For npm-based servers, make sure Node.js and `npx` are available.
+5 -5
View File
@@ -6,9 +6,9 @@ description: "How to use SOUL.md to shape Hermes Agent's default voice, what bel
# Use SOUL.md with Hermes
`SOUL.md` is the easiest way to give Hermes a stable, default voice.
`SOUL.md` is the **primary identity** for your Hermes instance. It's the first thing in the system prompt — it defines who the agent is, how it speaks, and what it avoids.
If you want Hermes to feel like the same assistant every time you talk to it — without repeating instructions in every session — this is the file to use.
If you want Hermes to feel like the same assistant every time you talk to it — or if you want to replace the Hermes persona entirely with your own — this is the file to use.
## What SOUL.md is for
@@ -65,11 +65,11 @@ Important:
## How Hermes uses it
When Hermes starts a session, it reads `SOUL.md` from `HERMES_HOME`, scans it for prompt-injection patterns, truncates it if needed, and injects the content directly into the prompt.
When Hermes starts a session, it reads `SOUL.md` from `HERMES_HOME`, scans it for prompt-injection patterns, truncates it if needed, and uses it as the **agent identity** — slot #1 in the system prompt. This means SOUL.md completely replaces the built-in default identity text.
No wrapper language is added around the file.
If SOUL.md is missing, empty, or cannot be loaded, Hermes falls back to a built-in default identity.
So the content itself matters. Write the way you want Hermes to think and speak.
No wrapper language is added around the file. The content itself matters — write the way you want your agent to think and speak.
## A good first edit
@@ -72,6 +72,12 @@ pip install hermes-agent[messaging]
pip install hermes-agent[tts-premium]
```
### Local NeuTTS (optional)
```bash
python -m pip install -U neutts[all]
```
### Everything
```bash
@@ -84,18 +90,21 @@ pip install hermes-agent[all]
```bash
brew install portaudio ffmpeg opus
brew install espeak-ng
```
### Ubuntu / Debian
```bash
sudo apt install portaudio19-dev ffmpeg libopus0
sudo apt install espeak-ng
```
Why these matter:
- `portaudio` → microphone input / playback for CLI voice mode
- `ffmpeg` → audio conversion for TTS and messaging delivery
- `opus` → Discord voice codec support
- `espeak-ng` → phonemizer backend for NeuTTS
## Step 4: choose STT and TTS providers
@@ -133,9 +142,20 @@ ELEVENLABS_API_KEY=***
#### Text-to-speech
- `edge` → free and good enough for most users
- `neutts` → free local/on-device TTS
- `elevenlabs` → best quality
- `openai` → good middle ground
### If you use `hermes setup`
If you choose NeuTTS in the setup wizard, Hermes checks whether `neutts` is already installed. If it is missing, the wizard tells you NeuTTS needs the Python package `neutts` and the system package `espeak-ng`, offers to install them for you, installs `espeak-ng` with your platform package manager, and then runs:
```bash
python -m pip install -U neutts[all]
```
If you skip that install or it fails, the wizard falls back to Edge TTS.
## Step 5: recommended config
```yaml
@@ -159,6 +179,18 @@ tts:
This is a good conservative default for most people.
If you want local TTS instead, switch the `tts` block to:
```yaml
tts:
provider: "neutts"
neutts:
ref_audio: ''
ref_text: ''
model: neuphonic/neutts-air-q4-gguf
device: cpu
```
## Use case 1: CLI voice mode
## Turn it on
+1 -1
View File
@@ -66,7 +66,7 @@ Common options:
| `-q`, `--query "..."` | One-shot, non-interactive prompt. |
| `-m`, `--model <model>` | Override the model for this run. |
| `-t`, `--toolsets <csv>` | Enable a comma-separated set of toolsets. |
| `--provider <provider>` | Force a provider: `auto`, `openrouter`, `nous`, `openai-codex`, `anthropic`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`. |
| `--provider <provider>` | Force a provider: `auto`, `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`. |
| `-v`, `--verbose` | Verbose output. |
| `-Q`, `--quiet` | Programmatic mode: suppress banner/spinner/tool previews. |
| `--resume <session>` / `--continue [name]` | Resume a session directly from `chat`. |
@@ -18,6 +18,13 @@ All variables go in `~/.hermes/.env`. You can also set them with `hermes config
| `AI_GATEWAY_BASE_URL` | Override AI Gateway base URL (default: `https://ai-gateway.vercel.sh/v1`) |
| `OPENAI_API_KEY` | API key for custom OpenAI-compatible endpoints (used with `OPENAI_BASE_URL`) |
| `OPENAI_BASE_URL` | Base URL for custom endpoint (VLLM, SGLang, etc.) |
| `COPILOT_GITHUB_TOKEN` | GitHub token for Copilot API — first priority (OAuth `gho_*` or fine-grained PAT `github_pat_*`; classic PATs `ghp_*` are **not supported**) |
| `GH_TOKEN` | GitHub token — second priority for Copilot (also used by `gh` CLI) |
| `GITHUB_TOKEN` | GitHub token — third priority for Copilot |
| `HERMES_COPILOT_ACP_COMMAND` | Override Copilot ACP CLI binary path (default: `copilot`) |
| `COPILOT_CLI_PATH` | Alias for `HERMES_COPILOT_ACP_COMMAND` |
| `HERMES_COPILOT_ACP_ARGS` | Override Copilot ACP arguments (default: `--acp --stdio`) |
| `COPILOT_ACP_BASE_URL` | Override Copilot ACP base URL |
| `GLM_API_KEY` | z.ai / ZhipuAI GLM API key ([z.ai](https://z.ai)) |
| `ZAI_API_KEY` | Alias for `GLM_API_KEY` |
| `Z_AI_API_KEY` | Alias for `GLM_API_KEY` |
@@ -48,7 +55,7 @@ For native Anthropic auth, Hermes prefers Claude Code's own credential files whe
| Variable | Description |
|----------|-------------|
| `HERMES_INFERENCE_PROVIDER` | Override provider selection: `auto`, `openrouter`, `nous`, `openai-codex`, `anthropic`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`, `kilocode`, `alibaba` (default: `auto`) |
| `HERMES_INFERENCE_PROVIDER` | Override provider selection: `auto`, `openrouter`, `nous`, `openai-codex`, `copilot`, `anthropic`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`, `kilocode`, `alibaba` (default: `auto`) |
| `HERMES_PORTAL_BASE_URL` | Override Nous Portal URL (for development/testing) |
| `NOUS_INFERENCE_BASE_URL` | Override Nous inference API URL |
| `HERMES_NOUS_MIN_KEY_TTL_SECONDS` | Min agent key TTL before re-mint (default: 1800 = 30min) |
+2 -2
View File
@@ -372,8 +372,8 @@ hermes chat --continue
**Solution:**
```bash
# Ensure MCP dependencies are installed
pip install hermes-agent[mcp]
# Ensure MCP dependencies are installed (already included in standard install)
cd ~/.hermes/hermes-agent && uv pip install -e ".[mcp]"
# For npm-based servers, ensure Node.js is available
node --version
+85 -10
View File
@@ -15,7 +15,7 @@ All settings are stored in the `~/.hermes/` directory for easy access.
├── config.yaml # Settings (model, terminal, TTS, compression, etc.)
├── .env # API keys and secrets
├── auth.json # OAuth provider credentials (Nous Portal, etc.)
├── SOUL.md # Optional: global persona (agent embodies this personality)
├── SOUL.md # Primary agent identity (slot #1 in system prompt)
├── memories/ # Persistent memory (MEMORY.md, USER.md)
├── skills/ # Agent-created skills (managed via skill_manage tool)
├── cron/ # Scheduled jobs
@@ -63,6 +63,8 @@ You need at least one way to connect to an LLM. Use `hermes model` to switch pro
|----------|-------|
| **Nous Portal** | `hermes model` (OAuth, subscription-based) |
| **OpenAI Codex** | `hermes model` (ChatGPT OAuth, uses Codex models) |
| **GitHub Copilot** | `hermes model` (OAuth device code flow, `COPILOT_GITHUB_TOKEN`, `GH_TOKEN`, or `gh auth token`) |
| **GitHub Copilot ACP** | `hermes model` (spawns local `copilot --acp --stdio`) |
| **Anthropic** | `hermes model` (Claude Pro/Max via Claude Code auth, Anthropic API key, or manual setup-token) |
| **OpenRouter** | `OPENROUTER_API_KEY` in `~/.hermes/.env` |
| **AI Gateway** | `AI_GATEWAY_API_KEY` in `~/.hermes/.env` (provider: `ai-gateway`) |
@@ -117,6 +119,59 @@ model:
`--provider claude` and `--provider claude-code` also work as shorthand for `--provider anthropic`.
:::
### GitHub Copilot
Hermes supports GitHub Copilot as a first-class provider with two modes:
**`copilot` — Direct Copilot API** (recommended). Uses your GitHub Copilot subscription to access GPT-5.x, Claude, Gemini, and other models through the Copilot API.
```bash
hermes chat --provider copilot --model gpt-5.4
```
**Authentication options** (checked in this order):
1. `COPILOT_GITHUB_TOKEN` environment variable
2. `GH_TOKEN` environment variable
3. `GITHUB_TOKEN` environment variable
4. `gh auth token` CLI fallback
If no token is found, `hermes model` offers an **OAuth device code login** — the same flow used by the Copilot CLI and opencode.
:::warning Token types
The Copilot API does **not** support classic Personal Access Tokens (`ghp_*`). Supported token types:
| Type | Prefix | How to get |
|------|--------|------------|
| OAuth token | `gho_` | `hermes model` → GitHub Copilot → Login with GitHub |
| Fine-grained PAT | `github_pat_` | GitHub Settings → Developer settings → Fine-grained tokens (needs **Copilot Requests** permission) |
| GitHub App token | `ghu_` | Via GitHub App installation |
If your `gh auth token` returns a `ghp_*` token, use `hermes model` to authenticate via OAuth instead.
:::
**API routing**: GPT-5+ models (except `gpt-5-mini`) automatically use the Responses API. All other models (GPT-4o, Claude, Gemini, etc.) use Chat Completions. Models are auto-detected from the live Copilot catalog.
**`copilot-acp` — Copilot ACP agent backend**. Spawns the local Copilot CLI as a subprocess:
```bash
hermes chat --provider copilot-acp --model copilot-acp
# Requires the GitHub Copilot CLI in PATH and an existing `copilot login` session
```
**Permanent config:**
```yaml
model:
provider: "copilot"
default: "gpt-5.4"
```
| Environment variable | Description |
|---------------------|-------------|
| `COPILOT_GITHUB_TOKEN` | GitHub token for Copilot API (first priority) |
| `HERMES_COPILOT_ACP_COMMAND` | Override the Copilot CLI binary path (default: `copilot`) |
| `HERMES_COPILOT_ACP_ARGS` | Override ACP args (default: `--acp --stdio`) |
### First-Class Chinese AI Providers
These providers have built-in support with dedicated provider IDs. Set the API key and use `--provider` to select:
@@ -131,11 +186,11 @@ hermes chat --provider kimi-coding --model moonshot-v1-auto
# Requires: KIMI_API_KEY in ~/.hermes/.env
# MiniMax (global endpoint)
hermes chat --provider minimax --model MiniMax-Text-01
hermes chat --provider minimax --model MiniMax-M2.7
# Requires: MINIMAX_API_KEY in ~/.hermes/.env
# MiniMax (China endpoint)
hermes chat --provider minimax-cn --model MiniMax-Text-01
hermes chat --provider minimax-cn --model MiniMax-M2.7
# Requires: MINIMAX_CN_API_KEY in ~/.hermes/.env
# Alibaba Cloud / DashScope (Qwen models)
@@ -443,7 +498,7 @@ fallback_model:
When activated, the fallback swaps the model and provider mid-session without losing your conversation. It fires **at most once** per session.
Supported providers: `openrouter`, `nous`, `openai-codex`, `anthropic`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`, `custom`.
Supported providers: `openrouter`, `nous`, `openai-codex`, `copilot`, `anthropic`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`, `custom`.
:::tip
Fallback is configured exclusively through `config.yaml` — there are no environment variables for it. For full details on when it triggers, supported providers, and how it interacts with auxiliary tasks and delegation, see [Fallback Providers](/docs/user-guide/features/fallback-providers).
@@ -766,7 +821,7 @@ Every model slot in Hermes — auxiliary tasks, compression, fallback — uses t
When `base_url` is set, Hermes ignores the provider and calls that endpoint directly (using `api_key` or `OPENAI_API_KEY` for auth). When only `provider` is set, Hermes uses that provider's built-in auth and base URL.
Available providers: `auto`, `openrouter`, `nous`, `codex`, `anthropic`, `main`, `zai`, `kimi-coding`, `minimax`, and any provider registered in the [provider registry](/docs/reference/environment-variables).
Available providers: `auto`, `openrouter`, `nous`, `codex`, `copilot`, `anthropic`, `main`, `zai`, `kimi-coding`, `minimax`, and any provider registered in the [provider registry](/docs/reference/environment-variables).
### Full auxiliary config reference
@@ -929,7 +984,7 @@ You can also change the reasoning effort at runtime with the `/reasoning` comman
```yaml
tts:
provider: "edge" # "edge" | "elevenlabs" | "openai"
provider: "edge" # "edge" | "elevenlabs" | "openai" | "neutts"
edge:
voice: "en-US-AriaNeural" # 322 voices, 74 languages
elevenlabs:
@@ -938,6 +993,11 @@ tts:
openai:
model: "gpt-4o-mini-tts"
voice: "alloy" # alloy, echo, fable, onyx, nova, shimmer
neutts:
ref_audio: ''
ref_text: ''
model: neuphonic/neutts-air-q4-gguf
device: cpu
```
This controls both the `text_to_speech` tool and spoken replies in voice mode (`/voice tts` in the CLI or messaging gateway).
@@ -1085,6 +1145,21 @@ group_sessions_per_user: true # true = per-user isolation in groups/channels, f
For the behavior details and examples, see [Sessions](/docs/user-guide/sessions) and the [Discord guide](/docs/user-guide/messaging/discord).
## Unauthorized DM Behavior
Control what Hermes does when an unknown user sends a direct message:
```yaml
unauthorized_dm_behavior: pair
whatsapp:
unauthorized_dm_behavior: ignore
```
- `pair` is the default. Hermes denies access, but replies with a one-time pairing code in DMs.
- `ignore` silently drops unauthorized DMs.
- Platform sections override the global default, so you can keep pairing enabled broadly while making one platform quieter.
## Quick Commands
Define custom commands that run shell commands without invoking the LLM — zero token usage, instant execution. Especially useful from messaging platforms (Telegram, Discord, etc.) for quick server checks or utility scripts.
@@ -1224,7 +1299,7 @@ delegation:
**Direct endpoint override:** If you want the obvious custom-endpoint path, set `delegation.base_url`, `delegation.api_key`, and `delegation.model`. That sends subagents directly to that OpenAI-compatible endpoint and takes precedence over `delegation.provider`. If `delegation.api_key` is omitted, Hermes falls back to `OPENAI_API_KEY` only.
The delegation provider uses the same credential resolution as CLI/gateway startup. All configured providers are supported: `openrouter`, `nous`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`. When a provider is set, the system automatically resolves the correct base URL, API key, and API mode — no manual credential wiring needed.
The delegation provider uses the same credential resolution as CLI/gateway startup. All configured providers are supported: `openrouter`, `nous`, `copilot`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`. When a provider is set, the system automatically resolves the correct base URL, API key, and API mode — no manual credential wiring needed.
**Precedence:** `delegation.base_url` in config → `delegation.provider` in config → parent provider (inherited). `delegation.model` in config → parent model (inherited). Setting just `model` without `provider` changes only the model name while keeping the parent's credentials (useful for switching models within the same provider like OpenRouter).
@@ -1243,15 +1318,15 @@ Hermes uses two different context scopes:
| File | Purpose | Scope |
|------|---------|-------|
| `SOUL.md` | **Primary agent identity** — defines who the agent is (slot #1 in the system prompt) | `~/.hermes/SOUL.md` or `$HERMES_HOME/SOUL.md` |
| `AGENTS.md` | Project-specific instructions, coding conventions | Working directory / project tree |
| `SOUL.md` | Default persona for this Hermes instance | `~/.hermes/SOUL.md` or `$HERMES_HOME/SOUL.md` |
| `.cursorrules` | Cursor IDE rules (also detected) | Working directory |
| `.cursor/rules/*.mdc` | Cursor rule files (also detected) | Working directory |
- **SOUL.md** is the agent's primary identity. It occupies slot #1 in the system prompt, completely replacing the built-in default identity. Edit it to fully customize who the agent is.
- If SOUL.md is missing, empty, or cannot be loaded, Hermes falls back to a built-in default identity.
- **AGENTS.md** is hierarchical: if subdirectories also have AGENTS.md, all are combined.
- **SOUL.md** is now global to the Hermes instance and is loaded only from `HERMES_HOME`.
- Hermes automatically seeds a default `SOUL.md` if one does not already exist.
- An empty `SOUL.md` contributes nothing to the system prompt.
- All loaded context files are capped at 20,000 characters with smart truncation.
See also:
@@ -136,11 +136,11 @@ Use the `conversation` parameter instead of tracking response IDs:
The server automatically chains to the latest response in that conversation. Like the `/title` command for gateway sessions.
### GET /v1/responses/{id}
### GET /v1/responses/\{id\}
Retrieve a previously stored response by ID.
### DELETE /v1/responses/{id}
### DELETE /v1/responses/\{id\}
Delete a stored response.
+6 -3
View File
@@ -20,10 +20,11 @@ If you have ever wanted Hermes to use a tool that already exists somewhere else,
## Quick start
1. Install MCP support:
1. Install MCP support (already included if you used the standard install script):
```bash
pip install hermes-agent[mcp]
cd ~/.hermes/hermes-agent
uv pip install -e ".[mcp]"
```
2. Add an MCP server to `~/.hermes/config.yaml`:
@@ -374,7 +375,9 @@ Inspect the project root and explain the directory layout.
Check:
```bash
pip install hermes-agent[mcp]
# Verify MCP deps are installed (already included in standard install)
cd ~/.hermes/hermes-agent && uv pip install -e ".[mcp]"
node --version
npx --version
```
+19 -16
View File
@@ -6,12 +6,12 @@ description: "Customize Hermes Agent's personality with a global SOUL.md, built-
# Personality & SOUL.md
Hermes Agent's personality is customizable, but there are two different layers that matter:
Hermes Agent's personality is fully customizable. `SOUL.md` is the **primary identity** — it's the first thing in the system prompt and defines who the agent is.
- `SOUL.md` — a durable persona file that lives in `HERMES_HOME` and is loaded automatically for that Hermes instance
- `SOUL.md` — a durable persona file that lives in `HERMES_HOME` and serves as the agent's identity (slot #1 in the system prompt)
- built-in or custom `/personality` presets — session-level system-prompt overlays
If you want a stable default voice that follows you across sessions, `SOUL.md` is the right tool.
If you want to change who Hermes is — or replace it with an entirely different agent persona — edit `SOUL.md`.
## How SOUL.md works now
@@ -29,15 +29,16 @@ $HERMES_HOME/SOUL.md
### Important behavior
- **SOUL.md is the agent's primary identity.** It occupies slot #1 in the system prompt, replacing the hardcoded default identity.
- Hermes creates a starter `SOUL.md` automatically if one does not exist yet
- Existing user `SOUL.md` files are never overwritten
- Hermes loads `SOUL.md` only from `HERMES_HOME`
- Hermes does not look in the current working directory for `SOUL.md`
- If `SOUL.md` exists but is empty, Hermes adds nothing from it to the prompt
- If `SOUL.md` exists but is empty, or cannot be loaded, Hermes falls back to a built-in default identity
- If `SOUL.md` has content, that content is injected verbatim after security scanning and truncation
- Hermes does not add wrapper language like "If SOUL.md is present..." around the file anymore
- SOUL.md is **not** duplicated in the context files section — it appears only once, as the identity
That makes `SOUL.md` a true per-user or per-instance default personality, not a repo-local trick.
That makes `SOUL.md` a true per-user or per-instance identity, not just an additive layer.
## Why this design
@@ -117,13 +118,13 @@ You optimize for truth, clarity, and usefulness over politeness theater.
## What Hermes injects into the prompt
If `SOUL.md` contains text, Hermes injects the file's text itself — not a wrapper explanation.
`SOUL.md` content goes directly into slot #1 of the system prompt — the agent identity position. No wrapper language is added around it.
So the system prompt gets the content directly, after:
The content goes through:
- prompt-injection scanning
- truncation if it is too large
If the file is empty or whitespace-only, nothing from `SOUL.md` is added.
If the file is empty, whitespace-only, or cannot be read, Hermes falls back to a built-in default identity ("You are Hermes Agent, an intelligent AI assistant created by Nous Research..."). This fallback also applies when `skip_context_files` is set (e.g., in subagent/delegation contexts).
## Security scanning
@@ -242,14 +243,16 @@ That gives you:
## How personality interacts with the full prompt
At a high level, the prompt stack includes:
1. default Hermes identity
2. memory/user context
3. skills guidance
4. context files such as `AGENTS.md`, `.cursorrules`, and global `SOUL.md`
5. platform-specific formatting hints
6. optional system-prompt overlays such as `/personality`
1. **SOUL.md** (agent identity — or built-in fallback if SOUL.md is unavailable)
2. tool-aware behavior guidance
3. memory/user context
4. skills guidance
5. context files (`AGENTS.md`, `.cursorrules`)
6. timestamp
7. platform-specific formatting hints
8. optional system-prompt overlays such as `/personality`
So `SOUL.md` is important, but it is one layer in a broader system.
`SOUL.md` is the foundation — everything else builds on top of it.
## Related docs
@@ -147,7 +147,7 @@ Default configuration:
- Tests 3 models at different scales for robustness:
- `qwen/qwen3-8b` (small)
- `z-ai/glm-4.7-flash` (medium)
- `minimax/minimax-m2.5` (large)
- `minimax/minimax-m2.7` (large)
- Total: ~144 rollouts
This validates:
+10 -3
View File
@@ -10,13 +10,14 @@ Hermes Agent supports both text-to-speech output and voice message transcription
## Text-to-Speech
Convert text to speech with three providers:
Convert text to speech with four providers:
| Provider | Quality | Cost | API Key |
|----------|---------|------|---------|
| **Edge TTS** (default) | Good | Free | None needed |
| **ElevenLabs** | Excellent | Paid | `ELEVENLABS_API_KEY` |
| **OpenAI TTS** | Good | Paid | `VOICE_TOOLS_OPENAI_KEY` |
| **NeuTTS** | Good | Free | None needed |
### Platform Delivery
@@ -32,7 +33,7 @@ Convert text to speech with three providers:
```yaml
# In ~/.hermes/config.yaml
tts:
provider: "edge" # "edge" | "elevenlabs" | "openai"
provider: "edge" # "edge" | "elevenlabs" | "openai" | "neutts"
edge:
voice: "en-US-AriaNeural" # 322 voices, 74 languages
elevenlabs:
@@ -41,6 +42,11 @@ tts:
openai:
model: "gpt-4o-mini-tts"
voice: "alloy" # alloy, echo, fable, onyx, nova, shimmer
neutts:
ref_audio: ''
ref_text: ''
model: neuphonic/neutts-air-q4-gguf
device: cpu
```
### Telegram Voice Bubbles & ffmpeg
@@ -49,6 +55,7 @@ Telegram voice bubbles require Opus/OGG audio format:
- **OpenAI and ElevenLabs** produce Opus natively — no extra setup
- **Edge TTS** (default) outputs MP3 and needs **ffmpeg** to convert:
- **NeuTTS** outputs WAV and also needs **ffmpeg** to convert for Telegram voice bubbles
```bash
# Ubuntu/Debian
@@ -61,7 +68,7 @@ brew install ffmpeg
sudo dnf install ffmpeg
```
Without ffmpeg, Edge TTS audio is sent as a regular audio file (playable, but shows as a rectangular player instead of a voice bubble).
Without ffmpeg, Edge TTS and NeuTTS audio are sent as regular audio files (playable, but shown as a rectangular player instead of a voice bubble).
:::tip
If you want voice bubbles without installing ffmpeg, switch to the OpenAI or ElevenLabs provider.
+26 -8
View File
@@ -44,6 +44,9 @@ pip install hermes-agent[messaging]
# Premium TTS (ElevenLabs)
pip install hermes-agent[tts-premium]
# Local TTS (NeuTTS, optional)
python -m pip install -U neutts[all]
# Everything at once
pip install hermes-agent[all]
```
@@ -54,6 +57,8 @@ pip install hermes-agent[all]
| `messaging` | `discord.py[voice]`, `python-telegram-bot`, `aiohttp` | Discord & Telegram bots |
| `tts-premium` | `elevenlabs` | ElevenLabs TTS provider |
Optional local TTS provider: install `neutts` separately with `python -m pip install -U neutts[all]`. On first use it downloads the model automatically.
:::info
`discord.py[voice]` installs **PyNaCl** (for voice encryption) and **opus bindings** automatically. This is required for Discord voice channel support.
:::
@@ -63,9 +68,11 @@ pip install hermes-agent[all]
```bash
# macOS
brew install portaudio ffmpeg opus
brew install espeak-ng # for NeuTTS
# Ubuntu/Debian
sudo apt install portaudio19-dev ffmpeg libopus0
sudo apt install espeak-ng # for NeuTTS
```
| Dependency | Purpose | Required For |
@@ -73,6 +80,7 @@ sudo apt install portaudio19-dev ffmpeg libopus0
| **PortAudio** | Microphone input and audio playback | CLI voice mode |
| **ffmpeg** | Audio format conversion (MP3 → Opus, PCM → WAV) | All platforms |
| **Opus** | Discord voice codec | Discord voice channels |
| **espeak-ng** | Phonemizer backend | Local NeuTTS provider |
### API Keys
@@ -84,8 +92,9 @@ Add to `~/.hermes/.env`:
GROQ_API_KEY=your-key # Groq Whisper — fast, free tier (cloud)
VOICE_TOOLS_OPENAI_KEY=your-key # OpenAI Whisper — paid (cloud)
# Text-to-Speech (optional — Edge TTS works without any key)
ELEVENLABS_API_KEY=your-key # ElevenLabs — premium quality
# Text-to-Speech (optional — Edge TTS and NeuTTS work without any key)
ELEVENLABS_API_KEY=*** # ElevenLabs — premium quality
# VOICE_TOOLS_OPENAI_KEY above also enables OpenAI TTS
```
:::tip
@@ -303,8 +312,9 @@ DISCORD_ALLOWED_USERS=your-user-id
# STT — local provider needs no key (pip install faster-whisper)
# GROQ_API_KEY=your-key # Alternative: cloud-based, fast, free tier
# TTS — optional, Edge TTS (free) is the default
# ELEVENLABS_API_KEY=your-key # Premium quality
# TTS — optional. Edge TTS and NeuTTS need no key.
# ELEVENLABS_API_KEY=*** # Premium quality
# VOICE_TOOLS_OPENAI_KEY=*** # OpenAI TTS / Whisper
```
### Start the Gateway
@@ -385,7 +395,7 @@ stt:
# Text-to-Speech
tts:
provider: "edge" # "edge" (free) | "elevenlabs" | "openai"
provider: "edge" # "edge" (free) | "elevenlabs" | "openai" | "neutts"
edge:
voice: "en-US-AriaNeural" # 322 voices, 74 languages
elevenlabs:
@@ -394,6 +404,11 @@ tts:
openai:
model: "gpt-4o-mini-tts"
voice: "alloy" # alloy, echo, fable, onyx, nova, shimmer
neutts:
ref_audio: ''
ref_text: ''
model: neuphonic/neutts-air-q4-gguf
device: cpu
```
### Environment Variables
@@ -410,9 +425,9 @@ STT_OPENAI_MODEL=whisper-1 # Override default OpenAI STT model
GROQ_BASE_URL=https://api.groq.com/openai/v1 # Custom Groq endpoint
STT_OPENAI_BASE_URL=https://api.openai.com/v1 # Custom OpenAI STT endpoint
# Text-to-Speech providers (Edge TTS needs no key)
ELEVENLABS_API_KEY=... # ElevenLabs (premium quality)
# OpenAI TTS uses VOICE_TOOLS_OPENAI_KEY
# Text-to-Speech providers (Edge TTS and NeuTTS need no key)
ELEVENLABS_API_KEY=*** # ElevenLabs (premium quality)
# VOICE_TOOLS_OPENAI_KEY above also enables OpenAI TTS
# Discord voice channel
DISCORD_BOT_TOKEN=...
@@ -440,6 +455,9 @@ Provider priority (automatic fallback): **local** > **groq** > **openai**
| **Edge TTS** | Good | Free | ~1s | No |
| **ElevenLabs** | Excellent | Paid | ~2s | Yes |
| **OpenAI TTS** | Good | Paid | ~1.5s | Yes |
| **NeuTTS** | Good | Free | Depends on CPU/GPU | No |
NeuTTS uses the `tts.neutts` config block above.
---
@@ -97,6 +97,18 @@ WHATSAPP_MODE=bot # "bot" or "self-chat"
WHATSAPP_ALLOWED_USERS=15551234567 # Comma-separated phone numbers (with country code, no +)
```
Optional behavior settings in `~/.hermes/config.yaml`:
```yaml
unauthorized_dm_behavior: pair
whatsapp:
unauthorized_dm_behavior: ignore
```
- `unauthorized_dm_behavior: pair` is the global default. Unknown DM senders get a pairing code.
- `whatsapp.unauthorized_dm_behavior: ignore` makes WhatsApp stay silent for unauthorized DMs, which is usually the better choice for a private number.
Then start the gateway:
```bash
@@ -162,6 +174,7 @@ whatsapp:
| **Bridge crashes or reconnect loops** | Restart the gateway, update Hermes, and re-pair if the session was invalidated by a WhatsApp protocol change. |
| **Bot stops working after WhatsApp update** | Update Hermes to get the latest bridge version, then re-pair. |
| **Messages not being received** | Verify `WHATSAPP_ALLOWED_USERS` includes the sender's number (with country code, no `+` or spaces). |
| **Bot replies to strangers with a pairing code** | Set `whatsapp.unauthorized_dm_behavior: ignore` in `~/.hermes/config.yaml` if you want unauthorized DMs to be silently ignored instead. |
---
@@ -173,6 +186,13 @@ of authorized users. Without this setting, the gateway will **deny all incoming
safety measure.
:::
By default, unauthorized DMs still receive a pairing code reply. If you want a private WhatsApp number to stay completely silent to strangers, set:
```yaml
whatsapp:
unauthorized_dm_behavior: ignore
```
- The `~/.hermes/whatsapp/session` directory contains full session credentials — protect it like a password
- Set file permissions: `chmod 700 ~/.hermes/whatsapp/session`
- Use a **dedicated phone number** for the bot to isolate risk from your personal account
+13
View File
@@ -151,6 +151,19 @@ For more flexible authorization, Hermes includes a code-based pairing system. In
3. The bot owner runs `hermes pairing approve <platform> <code>` on the CLI
4. The user is permanently approved for that platform
Control how unauthorized direct messages are handled in `~/.hermes/config.yaml`:
```yaml
unauthorized_dm_behavior: pair
whatsapp:
unauthorized_dm_behavior: ignore
```
- `pair` is the default. Unauthorized DMs get a pairing code reply.
- `ignore` silently drops unauthorized DMs.
- Platform sections override the global default, so you can keep pairing on Telegram while keeping WhatsApp silent.
**Security features** (based on OWASP + NIST SP 800-63-4 guidance):
| Feature | Details |