Compare commits

..

2 Commits

Author SHA1 Message Date
Brooklyn Nicholson 3d8be3e84d feat(lint): observability for the LSP bridge with steady-state silence
Adds per-call structured logging on the dedicated ``hermes.lint.lsp``
logger so an opt-in user can answer "did LSP fire on that edit?" with
``rg 'lsp\['`` against ``~/.hermes/logs/agent.log``. Levels are tuned
so a 1000-write session emits exactly ONE INFO line at the default
threshold, not 1000.

Level model
-----------
* ``DEBUG`` (invisible at the default INFO threshold) for every per-call
  steady-state event: ``clean``, ``feature off``, ``extension not
  mapped``, ``backend not local``, repeated ``no project root`` for an
  already-announced file, repeated ``server unavailable`` for an
  already-announced binary.
* ``INFO`` for state transitions worth surfacing once: ``active for
  <root>`` the first time a (language, project_root) client starts,
  ``no project root for <path>`` the first time we see that orphan
  file. Plus every ``N diags`` event — diagnostics are inherently rare
  per-edit and are exactly the failure signal users want to grep for.
* ``WARNING`` for action-required failures the first time per
  (language, binary): ``server unavailable`` (binary not on PATH),
  ``no server configured``. Per-call ``WARNING`` for timeouts, server
  errors, and unexpected bridge exceptions — these are inherently
  novel events, not steady state, and each one is its own signal.

Dedup is in-process module-level sets guarded by a lock. Sets grow at
most by the number of distinct (language, project_root) and (language,
binary) pairs touched in one Python process — a few hundred entries in
the most aggressive monorepo session, which is bytes of memory. A
bounded LRU was rejected because evicting an entry would risk
re-firing the WARNING/INFO line we explicitly want to suppress.

Why this matters
----------------
The previous draft logged every per-call event at INFO. ``agent.log``
caps at 5 MB × 3 backups (= 20 MB) via ``RotatingFileHandler``, so
nothing would crash, but a normal coding session would dwarf the actual
signal under hundreds of ``lsp[typescript] clean (...)`` lines. The
new model preserves the verification answer ("LSP active for <root>")
and the action-required signals while keeping clean steady state out
of the user's face.

Tests
-----
* ``TestLogLevelsSteadyState`` — feature off, unmapped extension, non-
  local backend, and repeated clean writes all stay at DEBUG. Exactly
  one INFO ("active for ...") survives across N calls.
* ``TestLogLevelsNovelEvents`` — diagnostics are INFO per call;
  ``active for`` fires once per (language, root).
* ``TestLogLevelsActionRequired`` — server unavailable warns once per
  binary; orphan files INFO once per path; timeouts WARN every time.
2026-05-12 00:28:30 -04:00
Brooklyn Nicholson 9a546d6b08 feat(lint): opt-in LSP-backed lint path in _check_lint
The post-edit lint hook in `tools/file_operations.py::_check_lint` has
historically resolved `.ts`/`.tsx` to `npx tsc --noEmit <single-file>`,
which has no view of the surrounding project and floods the agent with
phantom "Cannot find module" errors that disappear the moment
`tsconfig.json` is in scope. Same shape for `.go` (`go vet` orphan) and
`.rs` (`rustfmt --check` is style, not types).

This change adds an opt-in LSP path that runs *before* the in-process /
shell linter table:

- `tools/lsp_client.py` — stdio JSON-RPC client with Content-Length
  framing, per-(language, project_root, server_cmd) cache, idle reaper.
  Sync API (`diagnostics(path, content)`) returns one snapshot via
  `textDocument/didOpen` + `publishDiagnostics` with a configurable
  settle window for servers that emit "indexing" diagnostics first.
  Hand-rolled rather than pulled from `multilspy` / `sansio-lsp-client`
  / `pygls` — see the module docstring for the comparison.
- `tools/lsp_lint.py` — bridge between `_check_lint` and the client.
  Returns `None` (caller falls through to the existing shell linter)
  whenever LSP cannot or should not handle the file: feature off,
  language unmapped, env not local, no project root marker, server
  binary missing, request timed out. Never raises.
- `tools/file_operations.py::_check_lint` — calls the bridge first,
  falls through to the in-process and shell linters unchanged. With
  the flag off (the default) this is one extra `import` on the lint
  path and zero behaviour change.
- `hermes_cli/config.py` — new `lint.lsp.{enabled,servers,...}` block,
  off by default. New top-level key, so the deep-merge handles
  upgrades and no `_config_version` bump is required.

Out of scope for this PR (explicitly noted in the module docstrings):

- Container / SSH / Modal / Daytona backends. The server has to run
  inside the sandbox, which means baking it into the image or
  installing on first use; that's the gnarly half and gets its own PR.
- didChange / live editing. We re-open with the freshly written bytes
  on every call.
- Pull diagnostics (LSP 3.17). Push works on every server we care
  about today.

Tests:

- `test_lsp_client.py` — pure helpers (project root walk, URI, framing,
  Diagnostic conversion) + end-to-end coverage against an in-tree fake
  LSP server (Python script over stdio) for clean / dirty / settle /
  timeout / registry caching / shutdown.
- `test_lsp_lint_bridge.py` — feature flag, backend gating, project
  root gating, language-map gating, error containment.
- `test_check_lint_lsp_routing.py` — regression guard that the default
  (LSP disabled) path still hits in-process / shell linters, plus
  proof that an enabled bridge short-circuits the shell linter.
2026-05-12 00:13:12 -04:00
367 changed files with 7470 additions and 39136 deletions
-22
View File
@@ -14,14 +14,6 @@
# LLM_MODEL is no longer read from .env — this line is kept for reference only.
# LLM_MODEL=anthropic/claude-opus-4.6
# =============================================================================
# LLM PROVIDER (NovitaAI)
# =============================================================================
# NovitaAI — 90+ models, pay-per-use
# Get your key at: https://novita.ai/settings/key-management
# NOVITA_API_KEY=
# NOVITA_BASE_URL=https://api.novita.ai/openai/v1 # Override default base URL
# =============================================================================
# LLM PROVIDER (Google AI Studio / Gemini)
# =============================================================================
@@ -281,20 +273,6 @@ BROWSER_SESSION_TIMEOUT=300
# Browser sessions are automatically closed after this period of no activity
BROWSER_INACTIVITY_TIMEOUT=120
# Camofox local anti-detection browser (Camoufox-based Firefox).
# Set CAMOFOX_URL to route the browser tools through a local Camofox server
# instead of agent-browser/Browserbase. See docs/user-guide/features/browser.md.
# CAMOFOX_URL=http://localhost:9377
# Externally managed Camofox sessions — when another app owns the visible
# Camofox browser, set these so Hermes shares the same userId/profile instead
# of creating its own isolated session.
# CAMOFOX_USER_ID=
# CAMOFOX_SESSION_KEY=
# Set to true to reuse an already-open Camofox tab for this identity before
# creating a new one (useful for gateway restarts).
# CAMOFOX_ADOPT_EXISTING_TAB=false
# =============================================================================
# SESSION LOGGING
# =============================================================================
+34 -161
View File
@@ -28,10 +28,9 @@ permissions:
contents: read
# Concurrency: push/release runs are NEVER cancelled so every merge gets its
# own SHA-tagged image; :main and :latest are guarded separately by the
# move-main and move-latest jobs. PR runs reuse a PR-scoped group with
# cancel-in-progress: true so rapid pushes to the same PR collapse to the
# latest commit.
# own SHA-tagged image; :latest is guarded separately by the move-latest job.
# PR runs reuse a PR-scoped group with cancel-in-progress: true so rapid
# pushes to the same PR collapse to the latest commit.
concurrency:
group: docker-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: ${{ github.event_name == 'pull_request' }}
@@ -92,10 +91,10 @@ jobs:
# pattern for multi-runner multi-platform builds.
#
# We apply the OCI revision label here (and again on arm64) because
# the move-main / move-latest jobs read it off the linux/amd64
# sub-manifest config of the floating tag to decide whether it's safe
# to advance. The label must be on each per-arch image — manifest
# lists themselves don't carry image config labels.
# the move-latest job reads it off the linux/amd64 sub-manifest config
# of `:latest` to decide whether it's safe to advance. The label must
# be on each per-arch image — manifest lists themselves don't carry
# image config labels.
- name: Push amd64 by digest
id: push
if: github.event_name == 'push' && github.ref == 'refs/heads/main' || github.event_name == 'release'
@@ -218,8 +217,6 @@ jobs:
timeout-minutes: 10
outputs:
pushed_sha_tag: ${{ steps.mark_pushed.outputs.pushed }}
pushed_release_tag: ${{ steps.mark_release_pushed.outputs.pushed }}
release_tag: ${{ steps.tag.outputs.tag }}
steps:
- name: Download digests
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4
@@ -274,43 +271,33 @@ jobs:
IMAGE_NAME: ${{ env.IMAGE_NAME }}
TAG: ${{ steps.tag.outputs.tag }}
# Signal to move-main that the SHA tag is live. Only on main pushes;
# releases set pushed_release_tag instead.
# Signal to move-latest that the SHA tag is live. Only on main pushes;
# releases don't trigger move-latest (they use their own release tag).
- name: Mark SHA tag pushed
id: mark_pushed
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
run: echo "pushed=true" >> "$GITHUB_OUTPUT"
# Signal to move-latest that the release tag is live.
- name: Mark release tag pushed
id: mark_release_pushed
if: github.event_name == 'release'
run: echo "pushed=true" >> "$GITHUB_OUTPUT"
# ---------------------------------------------------------------------------
# Move :main to point at the SHA tag the merge job pushed.
#
# :main is the floating tag that tracks the tip of the main branch. Every
# merge to main retags :main forward. Users who want "latest dev build"
# pull :main; users who want stable releases pull :latest.
# Move :latest to point at the SHA tag the merge job pushed.
#
# The real serialization guarantee comes from the top-level concurrency
# group (`docker-${{ github.ref }}` with `cancel-in-progress: false`),
# which ensures at most one workflow run for this ref executes at a time.
# That means two move-main steps for the same ref cannot overlap.
# That means two move-latest steps for the same ref cannot overlap.
#
# This job has its own concurrency group as defense-in-depth: if the
# top-level group is ever loosened, queued move-mains will run serially
# top-level group is ever loosened, queued move-latests will run serially
# in arrival order, each one running the ancestor check below and either
# advancing :main or skipping. `cancel-in-progress: false` matches the
# advancing :latest or skipping. `cancel-in-progress: false` matches the
# top-level setting — we don't want rapid pushes to cancel a queued
# move-main, because the ancestor check is the real safety mechanism
# and queueing is cheap (move-main is a ~30s registry op).
# move-latest, because the ancestor check is the real safety mechanism
# and queueing is cheap (move-latest is a ~30s registry op).
#
# Combined with the ancestor check, this means :main only ever moves
# Combined with the ancestor check, this means :latest only ever moves
# forward in git history.
# ---------------------------------------------------------------------------
move-main:
move-latest:
if: |
github.repository == 'NousResearch/hermes-agent'
&& github.event_name == 'push'
@@ -320,7 +307,7 @@ jobs:
runs-on: ubuntu-latest
timeout-minutes: 10
concurrency:
group: docker-move-main-${{ github.ref }}
group: docker-move-latest-${{ github.ref }}
cancel-in-progress: false
steps:
- name: Checkout code
@@ -337,13 +324,13 @@ jobs:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
# Read the git revision label off the current :main manifest, then
# Read the git revision label off the current :latest manifest, then
# use `git merge-base --is-ancestor` to check whether our commit is a
# descendant of it. If :main doesn't exist yet, or its label is
# descendant of it. If :latest doesn't exist yet, or its label is
# missing, we treat that as "safe to publish". If another run already
# advanced :main past us (or diverged), we skip and leave it alone.
- name: Decide whether to move :main
id: main_check
# advanced :latest past us (or diverged), we skip and leave it alone.
- name: Decide whether to move :latest
id: latest_check
run: |
set -euo pipefail
image=nousresearch/hermes-agent
@@ -351,119 +338,6 @@ jobs:
# Pull the JSON for the linux/amd64 sub-manifest's config and extract
# the OCI revision label with jq — Go template field access can't
# handle dots in map keys, so using json+jq is the robust route.
image_json=$(
docker buildx imagetools inspect "${image}:main" \
--format '{{ json (index .Image "linux/amd64") }}' \
2>/dev/null || true
)
if [ -z "${image_json}" ]; then
echo "No existing :main (or inspect failed) — safe to publish."
echo "push_main=true" >> "$GITHUB_OUTPUT"
exit 0
fi
current_sha=$(
printf '%s' "${image_json}" \
| jq -r '.config.Labels."org.opencontainers.image.revision" // ""'
)
if [ -z "${current_sha}" ]; then
echo "Registry :main has no revision label — safe to publish."
echo "push_main=true" >> "$GITHUB_OUTPUT"
exit 0
fi
echo "Registry :main is at ${current_sha}"
echo "This run is at ${GITHUB_SHA}"
if [ "${current_sha}" = "${GITHUB_SHA}" ]; then
echo ":main already points at our SHA — nothing to do."
echo "push_main=false" >> "$GITHUB_OUTPUT"
exit 0
fi
# Make sure we have the :main commit locally for merge-base.
if ! git cat-file -e "${current_sha}^{commit}" 2>/dev/null; then
git fetch --no-tags --prune origin \
"+refs/heads/main:refs/remotes/origin/main" \
|| true
fi
if ! git cat-file -e "${current_sha}^{commit}" 2>/dev/null; then
echo "Registry :main points at an unknown commit (${current_sha}); refusing to overwrite."
echo "push_main=false" >> "$GITHUB_OUTPUT"
exit 0
fi
# Our SHA must be a descendant of the current :main to be safe.
if git merge-base --is-ancestor "${current_sha}" "${GITHUB_SHA}"; then
echo "Our commit is a descendant of :main — safe to advance."
echo "push_main=true" >> "$GITHUB_OUTPUT"
else
echo "Another run advanced :main past us (or diverged) — leaving it alone."
echo "push_main=false" >> "$GITHUB_OUTPUT"
fi
# Retag the already-pushed SHA manifest as :main. This is a registry-
# side operation — no rebuild, no layer re-push — so it's quick and
# atomic per-tag. The ancestor check above plus the cancel-in-progress
# concurrency on this job together guarantee we only ever move :main
# forward in git history.
- name: Move :main to this SHA
if: steps.main_check.outputs.push_main == 'true'
run: |
set -euo pipefail
image=nousresearch/hermes-agent
docker buildx imagetools create \
--tag "${image}:main" \
"${image}:sha-${GITHUB_SHA}"
# ---------------------------------------------------------------------------
# Move :latest to point at the release tag the merge job pushed.
#
# :latest is the floating tag that tracks the most recent stable release.
# Only `release: published` events advance it — never main pushes.
#
# We still run an ancestor check against the existing :latest so that a
# backport release on an older branch (e.g. patching v1.1.5 after v1.2.3
# is out) doesn't drag :latest backwards. The check is the same shape as
# move-main: read the OCI revision label off the current :latest, look up
# that commit in git, and only advance if our release commit is a strict
# descendant.
# ---------------------------------------------------------------------------
move-latest:
if: |
github.repository == 'NousResearch/hermes-agent'
&& github.event_name == 'release'
&& needs.merge.outputs.pushed_release_tag == 'true'
needs: merge
runs-on: ubuntu-latest
timeout-minutes: 10
concurrency:
group: docker-move-latest
cancel-in-progress: false
steps:
- name: Checkout code
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
fetch-depth: 1000
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3
- name: Log in to Docker Hub
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Decide whether to move :latest
id: latest_check
run: |
set -euo pipefail
image=nousresearch/hermes-agent
image_json=$(
docker buildx imagetools inspect "${image}:latest" \
--format '{{ json (index .Image "linux/amd64") }}' \
@@ -488,7 +362,7 @@ jobs:
fi
echo "Registry :latest is at ${current_sha}"
echo "This release is at ${GITHUB_SHA}"
echo "This run is at ${GITHUB_SHA}"
if [ "${current_sha}" = "${GITHUB_SHA}" ]; then
echo ":latest already points at our SHA — nothing to do."
@@ -497,7 +371,6 @@ jobs:
fi
# Make sure we have the :latest commit locally for merge-base.
# Releases can be cut from any branch, so fetch broadly.
if ! git cat-file -e "${current_sha}^{commit}" 2>/dev/null; then
git fetch --no-tags --prune origin \
"+refs/heads/main:refs/remotes/origin/main" \
@@ -510,25 +383,25 @@ jobs:
exit 0
fi
# Our release SHA must be a descendant of the current :latest.
# Backport releases on older branches won't satisfy this and will
# be left alone — :latest stays on the newer release.
# Our SHA must be a descendant of the current :latest to be safe.
if git merge-base --is-ancestor "${current_sha}" "${GITHUB_SHA}"; then
echo "Our release commit is a descendant of :latest — safe to advance."
echo "Our commit is a descendant of :latest — safe to advance."
echo "push_latest=true" >> "$GITHUB_OUTPUT"
else
echo "Existing :latest is newer than this release (likely a backport) — leaving it alone."
echo "Another run advanced :latest past us (or diverged) — leaving it alone."
echo "push_latest=false" >> "$GITHUB_OUTPUT"
fi
# Retag the already-pushed release manifest as :latest.
- name: Move :latest to this release tag
# Retag the already-pushed SHA manifest as :latest. This is a registry-
# side operation — no rebuild, no layer re-push — so it's quick and
# atomic per-tag. The ancestor check above plus the cancel-in-progress
# concurrency on this job together guarantee we only ever move :latest
# forward in git history.
- name: Move :latest to this SHA
if: steps.latest_check.outputs.push_latest == 'true'
env:
RELEASE_TAG: ${{ needs.merge.outputs.release_tag }}
run: |
set -euo pipefail
image=nousresearch/hermes-agent
docker buildx imagetools create \
--tag "${image}:latest" \
"${image}:${RELEASE_TAG}"
"${image}:sha-${GITHUB_SHA}"
+1 -4
View File
@@ -55,14 +55,11 @@ jobs:
e2e:
runs-on: ubuntu-latest
timeout-minutes: 15
timeout-minutes: 10
steps:
- name: Checkout code
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Install system dependencies
run: sudo apt-get update && sudo apt-get install -y ripgrep
- name: Install uv
uses: astral-sh/setup-uv@d4b2f3b6ecc6e67c4457f6d3e41ec42d3d0fcb86 # v5
-91
View File
@@ -513,17 +513,6 @@ generic plugin surface (new hook, new ctx method) — never hardcode
plugin-specific logic into core. PR #5295 removed 95 lines of hardcoded
honcho argparse from `main.py` for exactly this reason.
**No new in-tree memory providers (policy, May 2026):** the set of
built-in memory providers under `plugins/memory/` is closed. New memory
backends must ship as **standalone plugin repos** that users install
into `~/.hermes/plugins/` (or via pip entry points) — they implement
the same `MemoryProvider` ABC, register through the same discovery
path, and integrate via `hermes memory setup` / `post_setup()` without
landing in this tree. PRs that add a new directory under
`plugins/memory/` will be closed with a pointer to publish the
provider as its own repo. Existing in-tree providers stay; bug fixes
to them are welcome.
### Model-provider plugins (`plugins/model-providers/<name>/`)
Every inference backend (openrouter, anthropic, gmi, deepseek, nvidia, …)
@@ -591,86 +580,6 @@ during setup, injected at load time).
Top-level `tags:` and `category:` are also accepted and mirrored from
`metadata.hermes.*` by the loader.
### Skill authoring standards (HARDLINE)
Every new or modernized skill — bundled, optional, or contributed —
must meet these standards before merge. Reviewers reject PRs that
violate them.
1. **`description` ≤ 60 characters, one sentence, ends with a period.**
Long descriptions bloat skill listings and dilute the model's
attention when many skills are loaded. State the capability, not
the implementation. No marketing words ("powerful",
"comprehensive", "seamless", "advanced"). Don't repeat the skill
name. Verify with:
```python
import re, pathlib
m = re.search(r'^description: (.*)$',
pathlib.Path('skills/<cat>/<name>/SKILL.md').read_text(),
re.MULTILINE)
assert len(m.group(1)) <= 60, len(m.group(1))
```
2. **Tools referenced in SKILL.md prose must be native Hermes tools or
MCP servers the skill explicitly expects.** When the skill needs a
capability, point at the proper tool by name in backticks
(`` `terminal` ``, `` `web_extract` ``, `` `read_file` ``,
`` `patch` ``, `` `search_files` ``, `` `vision_analyze` ``,
`` `browser_navigate` ``, `` `delegate_task` ``, etc.). Do NOT
name shell utilities the agent already has wrapped — `grep` →
`search_files`, `cat`/`head`/`tail` → `read_file`, `sed`/`awk` →
`patch`, `find`/`ls` → `search_files target='files'`. If the skill
depends on an MCP server, name the MCP server and document the
expected setup in `## Prerequisites`. Anything else (third-party
CLIs, shell pipelines, etc.) is fair game inside script files but
should not be the headline interaction surface in the prose.
3. **`platforms:` gating audited against actual script imports.**
Skills that use POSIX-only primitives (`fcntl`, `termios`,
`os.setsid`, `os.kill(pid, 0)` for liveness, `/proc`, `/tmp`
hardcoded, `signal.SIGKILL`, bash heredocs, `osascript`, `apt`,
`systemctl`) must declare their supported platforms. Default
posture: try to fix it cross-platform first — `tempfile.gettempdir`,
`pathlib.Path`, `psutil.pid_exists`, Python-level filtering instead
of `grep`. Gate to a narrower set only when the dependency is
genuinely platform-bound.
4. **`author` credits the human contributor first.** For external
contributions, the contributor's real name + GitHub handle goes
first; "Hermes Agent" is the secondary collaborator. If the
contributor's commit shows "Hermes Agent" as author (because they
used Hermes to draft the skill), replace it with their actual name
— credit the human, not the tool.
5. **SKILL.md body uses the modern section order.** `# <Skill> Skill`
title, 2-3 sentence intro stating what it does and doesn't do,
`## When to Use`, `## Prerequisites`, `## How to Run`,
`## Quick Reference`, `## Procedure`, `## Pitfalls`,
`## Verification`. Target ~200 lines for a complex skill,
~100 lines for a simple one. Cut redundant intro fluff, marketing
prose, and re-explanations of env vars already in
`## Prerequisites`.
6. **Scripts go in `scripts/`, references in `references/`,
templates in `templates/`.** Don't expect the model to inline-write
parsers, XML walkers, or non-trivial logic every call — ship a
helper script. Reference it from SKILL.md by path relative to the
skill directory.
7. **Tests live at `tests/skills/test_<skill>_skill.py`** and use only
stdlib + pytest + `unittest.mock`. No live network calls. Run via
`scripts/run_tests.sh tests/skills/test_<skill>_skill.py -q`.
8. **`.env.example` additions are isolated to a clearly delimited
block.** Don't touch the surrounding file — contributor-supplied
`.env.example` versions are usually stale and edits outside the
skill's own block must be dropped during salvage.
The full salvage / modernization checklist for external skill PRs
lives in the `hermes-agent-dev` skill at
`references/new-skill-pr-salvage.md` — load it before polishing
contributor skill PRs.
---
## Toolsets
-70
View File
@@ -49,24 +49,6 @@ If your skill is specialized, community-contributed, or niche, it's better suite
---
## Memory Providers: Ship as a Standalone Plugin
**We are no longer accepting new memory providers into this repo.** The set of built-in providers under `plugins/memory/` (honcho, mem0, supermemory, byterover, hindsight, holographic, openviking, retaindb) is closed. If you want to add a new memory backend, publish it as a **standalone plugin repo** that users install into `~/.hermes/plugins/` (or via a pip entry point).
Standalone memory plugins:
- Implement the same `MemoryProvider` ABC (`agent/memory_provider.py`) — `sync_turn`, `prefetch`, `shutdown`, and optionally `post_setup(hermes_home, config)` for setup-wizard integration
- Use the same discovery system — `discover_memory_providers()` picks them up from user/project plugin directories and pip entry points
- Integrate with `hermes memory setup` via `post_setup()` — no need to touch core code
- Can register their own CLI subcommands via `register_cli(subparser)` in a `cli.py` file
- Get all the same lifecycle hooks and config plumbing as in-tree providers
PRs that add a new directory under `plugins/memory/` will be closed with a pointer to publish the provider as its own repo. Existing in-tree providers stay; bug fixes to them are welcome.
This isn't a quality bar — it's a coupling-and-maintenance decision. Memory providers are the most common plugin type and they shouldn't all live in this tree.
---
## Development Setup
### Prerequisites
@@ -479,58 +461,6 @@ Gateway and messaging sessions never collect secrets in-band; they instruct the
See `skills/gifs/gif-search/` and `skills/email/himalaya/` for examples.
### Skill authoring standards (HARDLINE)
Every new or modernized skill — bundled, optional, or contributed — must meet these standards before merge. Reviewers reject PRs that violate them.
1. **`description` ≤ 60 characters, one sentence, ends with a period.** Long descriptions bloat the skill listing UI and dilute the model's attention when many skills are loaded. State the capability, not the implementation. No marketing words ("powerful", "comprehensive", "seamless", "advanced"). Don't repeat the skill name. Verify with:
```python
import re, pathlib
m = re.search(r'^description: (.*)$',
pathlib.Path('skills/<cat>/<name>/SKILL.md').read_text(),
re.MULTILINE)
assert len(m.group(1)) <= 60, len(m.group(1))
```
Good: `Search arXiv papers by keyword, author, category, or ID.`
Bad: `A powerful and comprehensive skill that allows the agent to search arXiv for relevant academic papers using various criteria including keywords, authors, and categories.`
2. **Tools referenced in SKILL.md prose must be native Hermes tools or MCP servers the skill explicitly expects.** When the skill needs a capability, point at the proper tool by name in backticks: `` `terminal` ``, `` `web_extract` ``, `` `web_search` ``, `` `read_file` ``, `` `write_file` ``, `` `patch` ``, `` `search_files` ``, `` `vision_analyze` ``, `` `browser_navigate` ``, `` `delegate_task` ``, `` `image_generate` ``, `` `text_to_speech` ``, `` `cronjob` ``, `` `memory` ``, `` `skill_view` ``, `` `todo` ``, `` `execute_code` ``.
Do NOT name shell utilities the agent already has wrapped:
| Don't say | Say |
|---|---|
| `grep`, `rg` | `search_files` |
| `cat`, `head`, `tail` | `read_file` |
| `sed`, `awk` | `patch` |
| `find`, `ls` | `search_files` (with `target='files'`) |
| `curl` for content extraction | `web_extract` |
| `echo > file`, `cat <<EOF` | `write_file` |
If the skill depends on an MCP server, name the MCP server and document its setup in `## Prerequisites`. Third-party CLIs (e.g. `ffmpeg`, `gh`, a specific SDK) are fine to invoke from inside script files, but the prose should frame the interaction as "invoke through the `terminal` tool", not as a manual shell session.
3. **`platforms:` gating audited against actual script imports.** Skills that use POSIX-only primitives (`fcntl`, `termios`, `os.setsid`, `os.kill(pid, 0)` for liveness, `/proc`, hardcoded `/tmp` paths, `signal.SIGKILL`, bash heredocs, `osascript`, `apt`, `systemctl`) must declare their supported platforms via the `platforms:` frontmatter. Default posture is to fix it cross-platform first — `tempfile.gettempdir()`, `pathlib.Path`, `psutil.pid_exists()`, Python-level filtering instead of `grep`. Gate to a narrower set only when the dependency is genuinely platform-bound (e.g. `osascript` is macOS-only, `/proc` is Linux-only).
4. **`author` credits the human contributor first.** For external contributions, the contributor's real name + GitHub handle goes first (`Jane Doe (jane-doe)`); "Hermes Agent" is the secondary collaborator. If the contributor's commit shows "Hermes Agent" as author because they used Hermes to draft the skill, replace it with their actual name — credit the human, not the tool.
5. **SKILL.md body uses the modern section order.** `# <Skill> Skill` title, 2-3 sentence intro stating what it does and what it doesn't do, then:
- `## When to Use` — trigger conditions
- `## Prerequisites` — env vars, install steps, MCP setup, API key sourcing
- `## How to Run` — canonical invocation through the `terminal` tool
- `## Quick Reference` — flat command/API reference
- `## Procedure` — numbered steps with copy-paste commands
- `## Pitfalls` — known limits, rate limits, things that look broken but aren't
- `## Verification` — single command that proves the skill works
Target ~200 lines for a complex skill, ~100 lines for a simple one. Cut redundant intro fluff, marketing prose, and re-explanations of env vars already documented in `## Prerequisites`.
6. **Scripts go in `scripts/`, references in `references/`, templates in `templates/`.** Don't expect the model to inline-write parsers, XML walkers, or non-trivial logic every call — ship a helper script. Reference scripts from SKILL.md by path relative to the skill directory.
7. **Tests live at `tests/skills/test_<skill>_skill.py`** and use only stdlib + pytest + `unittest.mock`. No live network calls. Run via `scripts/run_tests.sh tests/skills/test_<skill>_skill.py -q`. Must pass under the hermetic CI env (no API keys leaking through). Use `monkeypatch` and `tmp_path` for any env-var or filesystem dependencies.
8. **`.env.example` additions are isolated to a clearly delimited block.** Don't touch the surrounding file — contributor-supplied `.env.example` versions are usually stale, and edits outside the skill's own block will be dropped during salvage. Comment all values with `#` (it's documentation, not live config).
### Skill guidelines
- **No external dependencies unless absolutely necessary.** Prefer stdlib Python, curl, and existing Hermes tools (`web_extract`, `terminal`, `read_file`).
+1 -5
View File
@@ -94,13 +94,9 @@ RUN cd web && npm run build && \
# hermes_cli/main.py succeeds (see #18800). /opt/hermes/web is build-time
# only (HERMES_WEB_DIST points at hermes_cli/web_dist) and is intentionally
# not chowned here.
# The .venv MUST be hermes-writable so lazy_deps.py can install platform
# packages (discord.py, telegram, slack, etc.) at first gateway boot.
# Without this, `uv pip install` fails with EACCES and all messaging
# adapters silently fail to load. See tools/lazy_deps.py.
USER root
RUN chmod -R a+rX /opt/hermes && \
chown -R hermes:hermes /opt/hermes/.venv /opt/hermes/ui-tui /opt/hermes/node_modules
chown -R hermes:hermes /opt/hermes/ui-tui /opt/hermes/node_modules
# Start as root so the entrypoint can usermod/groupmod + gosu.
# If HERMES_UID is unset, the entrypoint drops to the default hermes user (10000).
+1 -1
View File
@@ -14,7 +14,7 @@
**The self-improving AI agent built by [Nous Research](https://nousresearch.com).** It's the only agent with a built-in learning loop — it creates skills from experience, improves them during use, nudges itself to persist knowledge, searches its own past conversations, and builds a deepening model of who you are across sessions. Run it on a $5 VPS, a GPU cluster, or serverless infrastructure that costs nearly nothing when idle. It's not tied to your laptop — talk to it from Telegram while it works on a cloud VM.
Use any model you want — [Nous Portal](https://portal.nousresearch.com), [OpenRouter](https://openrouter.ai) (200+ models), [NovitaAI](https://novita.ai) (AI-native cloud for Model API, Agent Sandbox, and GPU Cloud), [NVIDIA NIM](https://build.nvidia.com) (Nemotron), [Xiaomi MiMo](https://platform.xiaomimimo.com), [z.ai/GLM](https://z.ai), [Kimi/Moonshot](https://platform.moonshot.ai), [MiniMax](https://www.minimax.io), [Hugging Face](https://huggingface.co), OpenAI, or your own endpoint. Switch with `hermes model` — no code changes, no lock-in.
Use any model you want — [Nous Portal](https://portal.nousresearch.com), [OpenRouter](https://openrouter.ai) (200+ models), [NVIDIA NIM](https://build.nvidia.com) (Nemotron), [Xiaomi MiMo](https://platform.xiaomimimo.com), [z.ai/GLM](https://z.ai), [Kimi/Moonshot](https://platform.moonshot.ai), [MiniMax](https://www.minimax.io), [Hugging Face](https://huggingface.co), OpenAI, or your own endpoint. Switch with `hermes model` — no code changes, no lock-in.
<table>
<tr><td><b>A real terminal interface</b></td><td>Full TUI with multiline editing, slash-command autocomplete, conversation history, interrupt-and-redirect, and streaming tool output.</td></tr>
+33 -94
View File
@@ -1,11 +1,10 @@
"""ACP permission bridging for Hermes dangerous-command approvals."""
"""ACP permission bridging — maps ACP approval requests to hermes approval callbacks."""
from __future__ import annotations
import asyncio
import logging
from concurrent.futures import TimeoutError as FutureTimeout
from itertools import count
from typing import Callable
from acp.schema import (
@@ -15,87 +14,24 @@ from acp.schema import (
logger = logging.getLogger(__name__)
# Maps ACP permission option ids to Hermes approval result strings.
# Option ids are stable across both the ``allow_permanent=True`` and
# ``allow_permanent=False`` paths even though the option list differs.
_OPTION_ID_TO_HERMES = {
# Maps ACP PermissionOptionKind -> hermes approval result strings
_KIND_TO_HERMES = {
"allow_once": "once",
"allow_session": "session",
"allow_always": "always",
"deny": "deny",
"reject_once": "deny",
"reject_always": "deny",
}
_PERMISSION_REQUEST_IDS = count(1)
def _build_permission_options(*, allow_permanent: bool) -> list[PermissionOption]:
"""Return ACP options that match Hermes approval semantics."""
options = [
PermissionOption(option_id="allow_once", kind="allow_once", name="Allow once"),
PermissionOption(
option_id="allow_session",
# ACP has no session-scoped kind, so use the closest persistent
# hint while keeping Hermes semantics in the option id.
kind="allow_always",
name="Allow for session",
),
]
if allow_permanent:
options.append(
PermissionOption(
option_id="allow_always",
kind="allow_always",
name="Allow always",
),
)
options.append(PermissionOption(option_id="deny", kind="reject_once", name="Deny"))
return options
def _build_permission_tool_call(command: str, description: str):
"""Return the ACP tool-call update attached to a permission request.
``request_permission`` expects a ``ToolCallUpdate`` payload — produced
by ``_acp.update_tool_call`` — not a ``ToolCallStart``. Each request
gets a unique ``perm-check-N`` id so concurrent requests don't collide.
"""
import acp as _acp
tool_call_id = f"perm-check-{next(_PERMISSION_REQUEST_IDS)}"
return _acp.update_tool_call(
tool_call_id,
title=description,
kind="execute",
status="pending",
content=[_acp.tool_content(_acp.text_block(f"$ {command}"))],
raw_input={"command": command, "description": description},
)
def _map_outcome_to_hermes(outcome: object, *, allowed_option_ids: set[str]) -> str:
"""Map an ACP permission outcome into Hermes approval strings."""
if not isinstance(outcome, AllowedOutcome):
return "deny"
option_id = outcome.option_id
if option_id not in allowed_option_ids:
logger.warning("Permission request returned unknown option_id: %s", option_id)
return "deny"
return _OPTION_ID_TO_HERMES.get(option_id, "deny")
def make_approval_callback(
request_permission_fn: Callable,
loop: asyncio.AbstractEventLoop,
session_id: str,
timeout: float = 60.0,
) -> Callable[..., str]:
) -> Callable[[str, str], str]:
"""
Return a Hermes-compatible approval callback that bridges to ACP.
The callback accepts ``command`` and ``description`` plus optional
keyword arguments such as ``allow_permanent`` used by
``tools.approval.prompt_dangerous_approval()``.
Return a hermes-compatible ``approval_callback(command, description) -> str``
that bridges to the ACP client's ``request_permission`` call.
Args:
request_permission_fn: The ACP connection's ``request_permission`` coroutine.
@@ -104,38 +40,41 @@ def make_approval_callback(
timeout: Seconds to wait for a response before auto-denying.
"""
def _callback(
command: str,
description: str,
*,
allow_permanent: bool = True,
**_: object,
) -> str:
options = _build_permission_options(allow_permanent=allow_permanent)
def _callback(command: str, description: str) -> str:
options = [
PermissionOption(option_id="allow_once", kind="allow_once", name="Allow once"),
PermissionOption(option_id="allow_always", kind="allow_always", name="Allow always"),
PermissionOption(option_id="deny", kind="reject_once", name="Deny"),
]
import acp as _acp
tool_call = _acp.start_tool_call("perm-check", command, kind="execute")
coro = request_permission_fn(
session_id=session_id,
tool_call=tool_call,
options=options,
)
future = None
try:
tool_call = _build_permission_tool_call(command, description)
coro = request_permission_fn(
session_id=session_id,
tool_call=tool_call,
options=options,
)
future = asyncio.run_coroutine_threadsafe(coro, loop)
response = future.result(timeout=timeout)
except (FutureTimeout, Exception) as exc:
if future is not None:
future.cancel()
logger.warning("Permission request timed out or failed: %s", exc)
return "deny"
if response is None:
return "deny"
allowed_option_ids = {option.option_id for option in options}
return _map_outcome_to_hermes(
response.outcome,
allowed_option_ids=allowed_option_ids,
)
outcome = response.outcome
if isinstance(outcome, AllowedOutcome):
option_id = outcome.option_id
# Look up the kind from our options list
for opt in options:
if opt.option_id == option_id:
return _KIND_TO_HERMES.get(opt.kind, "deny")
return "once" # fallback for unknown option_id
else:
return "deny"
return _callback
+3 -10
View File
@@ -35,14 +35,6 @@ def _get_anthropic_sdk():
"""Return the ``anthropic`` SDK module, importing lazily. None if not installed."""
global _anthropic_sdk
if _anthropic_sdk is ...:
try:
from tools.lazy_deps import ensure as _lazy_ensure
_lazy_ensure("provider.anthropic", prompt=False)
except ImportError:
pass
except Exception:
# FeatureUnavailable — fall through to ImportError handling below
pass
try:
import anthropic as _sdk
_anthropic_sdk = _sdk
@@ -1305,8 +1297,9 @@ def convert_tools_to_anthropic(tools: List[Dict]) -> List[Dict]:
),
}
# Forward cache_control marker when present on the OpenAI-format
# tool dict. Anthropic's tools array supports cache_control on the
# last tool to cache the entire schema cross-session.
# tool dict (set by ``mark_tools_for_long_lived_cache``). Anthropic's
# tools array supports cache_control on the last tool to cache the
# entire schema cross-session.
cache_control = t.get("cache_control")
if isinstance(cache_control, dict):
anthropic_tool["cache_control"] = dict(cache_control)
+5 -30
View File
@@ -382,28 +382,7 @@ _AI_GATEWAY_HEADERS = {
# Nous Portal extra_body for product attribution.
# Callers should pass this as extra_body in chat.completions.create()
# when the auxiliary client is backed by Nous Portal.
#
# The tags are computed from agent.portal_tags so the client= marker stays
# in lockstep with hermes_cli.__version__ across every Portal call site
# (main loop, aux, compression, web_extract). Do not inline a literal here;
# see agent/portal_tags.py for the rationale.
from agent.portal_tags import nous_portal_tags as _nous_portal_tags
def _nous_extra_body() -> dict:
"""Return a fresh Nous Portal ``extra_body`` dict.
Computed at call time so a hot-reloaded ``hermes_cli.__version__`` is
reflected without restarting long-running processes.
"""
return {"tags": _nous_portal_tags()}
# Backwards-compatible module attribute. Some callers (tests, third-party
# plugins) read ``NOUS_EXTRA_BODY`` directly; keep it as a snapshot of the
# current tags. Callers that need the freshest value should call
# ``_nous_extra_body()`` or import ``nous_portal_tags`` directly.
NOUS_EXTRA_BODY = _nous_extra_body()
NOUS_EXTRA_BODY = {"tags": ["product=hermes-agent"]}
# Set at resolve time — True if the auxiliary client points to Nous Portal
auxiliary_is_nous: bool = False
@@ -1407,7 +1386,6 @@ def _try_openrouter(explicit_api_key: str = None) -> Tuple[Optional[OpenAI], Opt
if pool_present:
or_key = explicit_api_key or _pool_runtime_api_key(entry)
if not or_key:
_mark_provider_unhealthy("openrouter", ttl=60)
return None, None
base_url = _pool_runtime_base_url(entry, OPENROUTER_BASE_URL) or OPENROUTER_BASE_URL
logger.debug("Auxiliary client: OpenRouter via pool")
@@ -1416,7 +1394,6 @@ def _try_openrouter(explicit_api_key: str = None) -> Tuple[Optional[OpenAI], Opt
or_key = explicit_api_key or os.getenv("OPENROUTER_API_KEY")
if not or_key:
_mark_provider_unhealthy("openrouter", ttl=60)
return None, None
logger.debug("Auxiliary client: OpenRouter")
return OpenAI(api_key=or_key, base_url=OPENROUTER_BASE_URL,
@@ -1448,7 +1425,6 @@ def _try_nous(vision: bool = False) -> Tuple[Optional[OpenAI], Optional[str]]:
"Auxiliary: skipping Nous Portal (rate-limited, resets in %.0fs)",
_remaining,
)
_mark_provider_unhealthy("nous", ttl=_remaining)
return None, None
except Exception:
pass
@@ -1456,7 +1432,6 @@ def _try_nous(vision: bool = False) -> Tuple[Optional[OpenAI], Optional[str]]:
nous = _read_nous_auth()
runtime = _resolve_nous_runtime_api(force_refresh=False)
if runtime is None and not nous:
_mark_provider_unhealthy("nous", ttl=60)
return None, None
global auxiliary_is_nous
auxiliary_is_nous = True
@@ -3462,7 +3437,7 @@ def get_auxiliary_extra_body() -> dict:
Includes Nous Portal product tags when the auxiliary client is backed
by Nous Portal. Returns empty dict otherwise.
"""
return _nous_extra_body() if auxiliary_is_nous else {}
return dict(NOUS_EXTRA_BODY) if auxiliary_is_nous else {}
def auxiliary_max_tokens_param(value: int) -> dict:
@@ -3853,7 +3828,7 @@ def _resolve_task_provider_model(
# (e.g. OPENROUTER_API_KEY) instead of locking into "custom".
return cfg_provider, resolved_model, cfg_base_url, None, resolved_api_mode
if cfg_provider and cfg_provider != "auto":
return cfg_provider, resolved_model, cfg_base_url, cfg_api_key, resolved_api_mode
return cfg_provider, resolved_model, None, None, resolved_api_mode
return "auto", resolved_model, None, None, resolved_api_mode
@@ -4051,7 +4026,7 @@ def _build_call_kwargs(
# Provider-specific extra_body
merged_extra = dict(extra_body or {})
if provider == "nous" or auxiliary_is_nous:
merged_extra.setdefault("tags", []).extend(_nous_portal_tags())
merged_extra.setdefault("tags", []).extend(["product=hermes-agent"])
if merged_extra:
kwargs["extra_body"] = merged_extra
@@ -4436,7 +4411,7 @@ def extract_content_or_reasoning(response) -> str:
1. ``message.content`` — strip inline think/reasoning blocks, check for
remaining non-whitespace text.
2. ``message.reasoning`` / ``message.reasoning_content`` — direct
structured reasoning fields (DeepSeek, Moonshot, NovitaAI, etc.).
structured reasoning fields (DeepSeek, Moonshot, Novita, etc.).
3. ``message.reasoning_details`` — OpenRouter unified array format.
Returns the best available text, or ``""`` if nothing found.
+3 -23
View File
@@ -1185,26 +1185,6 @@ The user has requested that this compaction PRIORITISE preserving all informatio
idx += 1
return idx
def _protect_head_size(self, messages: List[Dict[str, Any]]) -> int:
"""Total count of head messages to protect.
``protect_first_n`` is defined as *additional* messages protected
beyond the system prompt. The system prompt (if present at index 0)
is always implicitly protected it's load-bearing context that
must never be summarised away. This keeps semantics stable across
call paths where the system prompt may or may not be included in
the ``messages`` list (e.g. the gateway ``/compress`` handler
strips it before calling compress()).
Examples:
protect_first_n=0 system prompt only (or nothing if no system msg)
protect_first_n=3 system + first 3 non-system messages
"""
head = 0
if messages and messages[0].get("role") == "system":
head = 1
return head + self.protect_first_n
def _align_boundary_backward(self, messages: List[Dict[str, Any]], idx: int) -> int:
"""Pull a compress-end boundary backward to avoid splitting a
tool_call / result group.
@@ -1363,7 +1343,7 @@ The user has requested that this compaction PRIORITISE preserving all informatio
skip the LLM call when the transcript is still entirely inside
the protected head/tail.
"""
compress_start = self._align_boundary_forward(messages, self._protect_head_size(messages))
compress_start = self._align_boundary_forward(messages, self.protect_first_n)
compress_end = self._find_tail_cut_by_tokens(messages, compress_start)
return compress_start < compress_end
@@ -1399,7 +1379,7 @@ The user has requested that this compaction PRIORITISE preserving all informatio
self._last_aux_model_failure_model = None
n_messages = len(messages)
# Only need head + 3 tail messages minimum (token budget decides the real tail size)
_min_for_compress = self._protect_head_size(messages) + 3 + 1
_min_for_compress = self.protect_first_n + 3 + 1
if n_messages <= _min_for_compress:
if not self.quiet_mode:
logger.warning(
@@ -1419,7 +1399,7 @@ The user has requested that this compaction PRIORITISE preserving all informatio
logger.info("Pre-compression: pruned %d old tool result(s)", pruned_count)
# Phase 2: Determine boundaries
compress_start = self._protect_head_size(messages)
compress_start = self.protect_first_n
compress_start = self._align_boundary_forward(messages, compress_start)
# Use token-budget tail protection instead of fixed message count
-5
View File
@@ -55,11 +55,6 @@ class ContextEngine(ABC):
# These control the preflight compression check. Subclasses may
# override via __init__ or property; defaults are sensible for most
# engines.
#
# protect_first_n semantics (since PR #13754): count of non-system head
# messages always preserved verbatim, IN ADDITION to the system prompt
# which is always implicitly protected. Default 3 keeps the
# historical "system + first 3 non-system messages" head shape.
threshold_percent: float = 0.75
protect_first_n: int = 3
-3
View File
@@ -14,7 +14,6 @@ from difflib import unified_diff
from pathlib import Path
from utils import safe_json_loads
from agent.tool_result_classification import file_mutation_result_landed
# ANSI escape codes for coloring tool failure indicators
_RED = "\033[31m"
@@ -811,8 +810,6 @@ def _detect_tool_failure(tool_name: str, result: str | None) -> tuple[bool, str]
"""
if result is None:
return False, ""
if file_mutation_result_landed(tool_name, result):
return False, ""
if tool_name == "terminal":
data = safe_json_loads(result)
+1 -7
View File
@@ -450,13 +450,7 @@ def _make_stream_chunk(
finish_reason: Optional[str] = None,
reasoning: str = "",
) -> _GeminiStreamChunk:
delta_kwargs: Dict[str, Any] = {
"role": "assistant",
"content": None,
"tool_calls": None,
"reasoning": None,
"reasoning_content": None,
}
delta_kwargs: Dict[str, Any] = {"role": "assistant"}
if content:
delta_kwargs["content"] = content
if tool_call_delta is not None:
+6 -31
View File
@@ -77,17 +77,6 @@ def get_active_provider() -> Optional[ImageGenProvider]:
Reads ``image_gen.provider`` from config.yaml; falls back per the
module docstring.
**Availability semantics** (mirrors :mod:`agent.web_search_registry`):
- When ``image_gen.provider`` is explicitly set, the configured
provider is returned even if :meth:`ImageGenProvider.is_available`
reports False the dispatcher surfaces a precise "X_API_KEY is not
set" error rather than silently switching backends.
- When ``image_gen.provider`` is unset, the fallback path (single-
provider shortcut and the FAL legacy preference) is filtered by
``is_available()`` so we don't pick a provider the user has no
credentials for.
"""
configured: Optional[str] = None
try:
@@ -105,17 +94,6 @@ def get_active_provider() -> Optional[ImageGenProvider]:
with _lock:
snapshot = dict(_providers)
def _is_available_safe(p: ImageGenProvider) -> bool:
"""Wrap ``is_available()`` so a buggy provider doesn't kill resolution."""
try:
return bool(p.is_available())
except Exception as exc: # noqa: BLE001
logger.debug("image_gen provider %s.is_available() raised %s", p.name, exc)
return False
# 1. Explicit config wins — return regardless of is_available() so the
# user gets a precise downstream error message rather than a silent
# backend switch.
if configured:
provider = snapshot.get(configured)
if provider is not None:
@@ -125,16 +103,13 @@ def get_active_provider() -> Optional[ImageGenProvider]:
configured,
)
# 2. Fallback: single registered provider — but only if it's actually
# available (no credentials = don't surface it as "active").
available = [p for p in snapshot.values() if _is_available_safe(p)]
if len(available) == 1:
return available[0]
# Fallback: single-provider case
if len(snapshot) == 1:
return next(iter(snapshot.values()))
# 3. Fallback: prefer legacy FAL for backward compat, when available.
fal = snapshot.get("fal")
if fal is not None and _is_available_safe(fal):
return fal
# Fallback: prefer legacy FAL for backward compat
if "fal" in snapshot:
return snapshot["fal"]
return None
-106
View File
@@ -1,106 +0,0 @@
"""Language Server Protocol (LSP) integration for Hermes Agent.
Hermes runs full language servers (pyright, gopls, rust-analyzer,
typescript-language-server, etc.) as subprocesses and pipes their
``textDocument/publishDiagnostics`` output into the post-write lint
delta filter used by ``write_file`` and ``patch``.
LSP is **gated on git workspace detection** if the agent's cwd is
inside a git repository, LSP runs against that workspace; otherwise the
file_operations layer falls back to its existing in-process syntax
checks. This keeps users on user-home cwd's (e.g. Telegram gateway
chats) from spawning daemons they don't need.
Public API:
from agent.lsp import get_service
svc = get_service()
if svc and svc.enabled_for(path):
await svc.touch_file(path)
diags = svc.diagnostics_for(path)
The bulk of the wiring is internal most callers only need the layer
in :func:`tools.file_operations.FileOperations._check_lint_delta`,
which is already wired (see that module).
Architecture is documented in ``website/docs/user-guide/features/lsp.md``.
"""
from __future__ import annotations
import atexit
import logging
import threading
from typing import Optional
from agent.lsp.manager import LSPService
logger = logging.getLogger("agent.lsp")
_service: Optional[LSPService] = None
_atexit_registered = False
_service_lock = threading.Lock()
def get_service() -> Optional[LSPService]:
"""Return the process-wide LSP service singleton, or None when disabled.
The service is created lazily on first call. ``None`` is returned
when LSP is disabled in config, when no workspace can be detected,
or when the platform doesn't support subprocess-based LSP servers.
On first creation, registers an :mod:`atexit` handler that tears
down spawned language servers on Python exit so a long-running
CLI or gateway session doesn't leak pyright/gopls/etc. processes
when it terminates.
"""
global _service, _atexit_registered
if _service is not None:
return _service if _service.is_active() else None
with _service_lock:
if _service is not None:
return _service if _service.is_active() else None
_service = LSPService.create_from_config()
if not _atexit_registered:
# ``atexit`` handlers run in LIFO order on normal Python
# exit and on SystemExit, but NOT on os._exit() or
# uncaught signals. Language servers are stateless
# subprocesses — losing them on SIGKILL is fine; they'll
# be reaped by the kernel along with their parent. We
# care about clean exits where Python flushes stdio
# before terminating; without this hook every
# ``hermes chat`` exit would leak pyright processes that
# outlive the parent for a few seconds while their
# stdout buffers drain.
atexit.register(_atexit_shutdown)
_atexit_registered = True
return _service if (_service is not None and _service.is_active()) else None
def shutdown_service() -> None:
"""Tear down the LSP service if one was started.
Safe to call multiple times; safe to call when no service was created.
"""
global _service
with _service_lock:
svc = _service
_service = None
if svc is not None:
try:
svc.shutdown()
except Exception as e: # noqa: BLE001
logger.debug("LSP shutdown error: %s", e)
def _atexit_shutdown() -> None:
"""atexit-registered wrapper. Logs at debug because by the time
atexit fires the user has already seen the agent's final output —
a noisy shutdown line on top of that is just clutter."""
try:
shutdown_service()
except Exception as e: # noqa: BLE001
logger.debug("atexit LSP shutdown failed: %s", e)
__all__ = ["get_service", "shutdown_service", "LSPService"]
-308
View File
@@ -1,308 +0,0 @@
"""``hermes lsp`` CLI subcommand.
Subcommands:
- ``status`` show service state, configured servers, install status.
- ``install <server_id>`` eagerly install one server's binary.
- ``install-all`` try to install every server with a known recipe.
- ``restart`` tear down running clients so the next edit re-spawns.
- ``which <server_id>`` print the resolved binary path for one server.
- ``list`` print the registry of supported servers.
The handlers are kept here (rather than in
``hermes_cli/main.py``) so the LSP module ships self-contained.
"""
from __future__ import annotations
import argparse
import sys
from typing import Optional
def register_subparser(subparsers: argparse._SubParsersAction) -> None:
"""Wire the ``hermes lsp`` subcommand tree into the main argparse."""
parser = subparsers.add_parser(
"lsp",
help="Language Server Protocol management",
description=(
"Manage the LSP layer that powers post-write semantic "
"diagnostics in write_file/patch."
),
)
sub = parser.add_subparsers(dest="lsp_command")
sub_status = sub.add_parser("status", help="Show LSP service status")
sub_status.add_argument(
"--json", action="store_true", help="Emit machine-readable JSON"
)
sub_list = sub.add_parser("list", help="List supported language servers")
sub_list.add_argument(
"--installed-only",
action="store_true",
help="Only show servers whose binary is currently available",
)
sub_install = sub.add_parser("install", help="Install a server binary")
sub_install.add_argument("server", help="Server id (e.g. pyright, gopls)")
sub_install_all = sub.add_parser(
"install-all",
help="Install every server with a known auto-install recipe",
)
sub_install_all.add_argument(
"--include-manual",
action="store_true",
help="Even attempt servers marked manual-install (best effort)",
)
sub_restart = sub.add_parser(
"restart",
help="Tear down running LSP clients (next edit re-spawns)",
)
sub_which = sub.add_parser("which", help="Print binary path for a server")
sub_which.add_argument("server", help="Server id")
parser.set_defaults(func=run_lsp_command)
def run_lsp_command(args: argparse.Namespace) -> int:
"""Top-level dispatcher for ``hermes lsp <subcommand>``."""
sub = getattr(args, "lsp_command", None) or "status"
try:
if sub == "status":
return _cmd_status(getattr(args, "json", False))
if sub == "list":
return _cmd_list(getattr(args, "installed_only", False))
if sub == "install":
return _cmd_install(args.server)
if sub == "install-all":
return _cmd_install_all(getattr(args, "include_manual", False))
if sub == "restart":
return _cmd_restart()
if sub == "which":
return _cmd_which(args.server)
sys.stderr.write(f"unknown lsp subcommand: {sub}\n")
return 2
except KeyboardInterrupt:
return 130
def _cmd_status(emit_json: bool) -> int:
from agent.lsp import get_service
from agent.lsp.servers import SERVERS
from agent.lsp.install import detect_status
svc = get_service()
service_active = svc is not None
info = svc.get_status() if svc is not None else {"enabled": False}
if emit_json:
import json
payload = {
"service": info,
"registry": [
{
"server_id": s.server_id,
"extensions": list(s.extensions),
"description": s.description,
"binary_status": detect_status(_recipe_pkg_for(s.server_id)),
}
for s in SERVERS
],
}
sys.stdout.write(json.dumps(payload, indent=2) + "\n")
return 0
out = []
out.append("LSP Service")
out.append("===========")
out.append(f" enabled: {info.get('enabled', False)}")
if service_active:
out.append(f" wait_mode: {info.get('wait_mode')}")
out.append(f" wait_timeout: {info.get('wait_timeout')}s")
out.append(f" install_strategy:{info.get('install_strategy')}")
clients = info.get("clients") or []
if clients:
out.append(f" active clients: {len(clients)}")
for c in clients:
out.append(
f" - {c['server_id']:20s} state={c['state']:10s} root={c['workspace_root']}"
)
else:
out.append(" active clients: none")
broken = info.get("broken") or []
if broken:
out.append(f" broken pairs: {len(broken)}")
for b in broken:
out.append(f" - {b}")
disabled = info.get("disabled_servers") or []
if disabled:
out.append(f" disabled in cfg: {', '.join(disabled)}")
# Surface backend-tool gaps that aren't visible in the registry table:
# some servers spawn fine but emit no diagnostics without a sidecar
# binary (bash-language-server -> shellcheck).
backend_warnings = _backend_warnings()
if backend_warnings:
out.append("")
out.append("Backend warnings")
out.append("================")
for line in backend_warnings:
out.append(f" ! {line}")
out.append("")
out.append("Registered Servers")
out.append("==================")
for s in SERVERS:
pkg = _recipe_pkg_for(s.server_id)
status = detect_status(pkg)
marker = {
"installed": "",
"missing": "·",
"manual-only": "?",
}.get(status, " ")
ext_summary = ", ".join(list(s.extensions)[:5])
if len(s.extensions) > 5:
ext_summary += f", … (+{len(s.extensions) - 5})"
out.append(
f" {marker} {s.server_id:24s} [{status:11s}] {ext_summary}"
)
if s.description:
out.append(f" {s.description}")
sys.stdout.write("\n".join(out) + "\n")
return 0
def _cmd_list(installed_only: bool) -> int:
from agent.lsp.servers import SERVERS
from agent.lsp.install import detect_status
for s in SERVERS:
pkg = _recipe_pkg_for(s.server_id)
status = detect_status(pkg)
if installed_only and status != "installed":
continue
sys.stdout.write(
f"{s.server_id:24s} [{status:11s}] {','.join(s.extensions)}\n"
)
return 0
def _cmd_install(server_id: str) -> int:
from agent.lsp.install import try_install, INSTALL_RECIPES, detect_status
pkg = _recipe_pkg_for(server_id)
pre_status = detect_status(pkg)
if pre_status == "installed":
sys.stdout.write(f"{server_id} already installed\n")
return 0
sys.stdout.write(f"installing {server_id} (pkg={pkg}) ...\n")
sys.stdout.flush()
bin_path = try_install(pkg, "auto")
if bin_path is None:
recipe = INSTALL_RECIPES.get(pkg)
if recipe and recipe.get("strategy") == "manual":
sys.stderr.write(
f"{server_id}: this server requires a manual install. "
f"See documentation.\n"
)
else:
sys.stderr.write(f"{server_id}: install failed (see logs).\n")
return 1
sys.stdout.write(f"installed: {bin_path}\n")
return 0
def _cmd_install_all(include_manual: bool) -> int:
from agent.lsp.servers import SERVERS
from agent.lsp.install import try_install, INSTALL_RECIPES, detect_status
rc = 0
for s in SERVERS:
pkg = _recipe_pkg_for(s.server_id)
recipe = INSTALL_RECIPES.get(pkg)
if recipe is None:
continue
if recipe.get("strategy") == "manual" and not include_manual:
continue
if detect_status(pkg) == "installed":
sys.stdout.write(f" {s.server_id:24s} already installed\n")
continue
sys.stdout.write(f" installing {s.server_id} (pkg={pkg}) ... ")
sys.stdout.flush()
path = try_install(pkg, "auto")
if path:
sys.stdout.write(f"ok ({path})\n")
else:
sys.stdout.write("FAILED\n")
rc = 1
return rc
def _cmd_restart() -> int:
from agent.lsp import shutdown_service
shutdown_service()
sys.stdout.write("LSP service shut down. Next edit will respawn clients.\n")
return 0
def _cmd_which(server_id: str) -> int:
from agent.lsp.install import INSTALL_RECIPES, hermes_lsp_bin_dir
import os
import shutil as _shutil
recipe = INSTALL_RECIPES.get(server_id)
bin_name = (recipe or {}).get("bin", server_id)
staged = hermes_lsp_bin_dir() / bin_name
if staged.exists():
sys.stdout.write(str(staged) + "\n")
return 0
on_path = _shutil.which(bin_name)
if on_path:
sys.stdout.write(on_path + "\n")
return 0
sys.stderr.write(f"{server_id}: not installed\n")
return 1
def _recipe_pkg_for(server_id: str) -> str:
"""Map a registry ``server_id`` to its install-recipe package key."""
# The mapping lives here (not in install.py) because it's a CLI
# convenience layer. Most server_ids are also their own recipe
# key, but a few differ (e.g. ``vue-language-server`` →
# ``@vue/language-server``).
aliases = {
"vue-language-server": "@vue/language-server",
"astro-language-server": "@astrojs/language-server",
"dockerfile-ls": "dockerfile-language-server-nodejs",
"typescript": "typescript-language-server",
}
return aliases.get(server_id, server_id)
def _backend_warnings() -> list:
"""Return human-readable notes about LSP backend tools that are missing
in a way that won't surface elsewhere.
Some language servers ship as thin wrappers around an external CLI for
actual diagnostics they spawn cleanly but never emit any errors when
the sidecar binary isn't on PATH. bash-language-server / shellcheck
is the load-bearing example.
Returned strings are short, actionable, and include the install
suggestion across common platforms.
"""
import shutil as _shutil
from agent.lsp.install import hermes_lsp_bin_dir
notes: list = []
bash_installed = _shutil.which("bash-language-server") is not None or (
(hermes_lsp_bin_dir() / "bash-language-server").exists()
)
if bash_installed and _shutil.which("shellcheck") is None:
notes.append(
"bash-language-server is installed but shellcheck is missing — "
"diagnostics will be empty (apt: shellcheck, brew: shellcheck, "
"scoop: shellcheck)."
)
return notes
-930
View File
@@ -1,930 +0,0 @@
"""Async LSP client over stdin/stdout.
One :class:`LSPClient` corresponds to one ``(language_server, workspace_root)``
pair exactly what OpenCode keys clients on, and the same shape Claude
Code uses. The client owns a child process, drives the JSON-RPC
exchange, and exposes:
- :meth:`open_file` / :meth:`change_file` text document sync
- :meth:`wait_for_diagnostics` block until the server emits fresh
diagnostics for a specific file (or a timeout fires)
- :meth:`diagnostics_for` read the current per-file diagnostic store
- :meth:`shutdown` graceful close + SIGTERM/SIGKILL fallback
The class is designed for async use from a single asyncio event loop.
The :class:`agent.lsp.manager.LSPService` runs an event loop in a
background thread so the synchronous file_operations layer can call
into it via :func:`agent.lsp.manager.LSPService.touch_file`.
Implementation notes:
- Push diagnostics are stored per-URI in :attr:`_push_diagnostics` from
``textDocument/publishDiagnostics`` notifications. Pull diagnostics
go in :attr:`_pull_diagnostics`. The merged view dedupes by content.
- Whole-document sync. Even when the server advertises incremental
sync, we send a single ``contentChanges`` entry replacing the
entire document. Pretending to be incremental while sending a
full replacement is well-tolerated by every major server and saves
range bookkeeping. See OpenCode's ``client.ts:584-659`` for the
same trick.
- The "touch-file dance": every ``open_file`` call also fires a
``workspace/didChangeWatchedFiles`` notification (CREATED on the
first open, CHANGED thereafter). Some servers (clangd, eslint)
only re-scan when this notification fires, even though the LSP spec
doesn't strictly require it.
- ``ContentModified`` (-32801) errors get retried with exponential
backoff up to 3 times. This matches Claude Code's
``LSPServerInstance.sendRequest``.
"""
from __future__ import annotations
import asyncio
import logging
import os
from pathlib import Path
from typing import Any, Awaitable, Callable, Dict, List, Optional, Set
from urllib.parse import quote, unquote
from agent.lsp.protocol import (
ERROR_CONTENT_MODIFIED,
ERROR_METHOD_NOT_FOUND,
LSPProtocolError,
LSPRequestError,
classify_message,
encode_message,
make_error_response,
make_notification,
make_request,
make_response,
read_message,
)
logger = logging.getLogger("agent.lsp.client")
# Timeouts (seconds) — mirror OpenCode's constants, scaled to seconds.
INITIALIZE_TIMEOUT = 45.0
DIAGNOSTICS_DOCUMENT_WAIT = 5.0
DIAGNOSTICS_FULL_WAIT = 10.0
DIAGNOSTICS_REQUEST_TIMEOUT = 3.0
PUSH_DEBOUNCE = 0.15
SHUTDOWN_GRACE = 1.0 # seconds between SIGTERM and SIGKILL
# Retry policy for transient ContentModified errors.
MAX_CONTENT_MODIFIED_RETRIES = 3
RETRY_BASE_DELAY = 0.5 # 0.5, 1.0, 2.0 — exponential
def file_uri(path: str) -> str:
"""Return ``file://`` URI for an absolute filesystem path.
Mirrors Node's ``pathToFileURL`` — handles spaces, unicode, and
Windows drive letters (``C:\\foo`` ``file:///C:/foo``).
"""
abs_path = os.path.abspath(path)
if os.name == "nt":
# Windows: backslash → forward slash, prepend extra slash so
# the drive letter shows up as part of the path component.
abs_path = abs_path.replace("\\", "/")
if not abs_path.startswith("/"):
abs_path = "/" + abs_path
return "file://" + quote(abs_path, safe="/:")
def uri_to_path(uri: str) -> str:
"""Inverse of :func:`file_uri`."""
if not uri.startswith("file://"):
return uri
raw = uri[len("file://"):]
if os.name == "nt" and raw.startswith("/") and len(raw) > 2 and raw[2] == ":":
raw = raw[1:] # strip leading slash before drive letter
return os.path.normpath(unquote(raw))
def _end_position(text: str) -> Dict[str, int]:
"""Return the LSP Position at the end of ``text``.
Used to construct a single-range "replace whole document" change
for ``textDocument/didChange`` regardless of the server's declared
sync mode.
"""
if not text:
return {"line": 0, "character": 0}
lines = text.splitlines(keepends=False)
last_line = len(lines) - 1
last_col = len(lines[-1]) if lines else 0
# If the text ends with a trailing newline, ``splitlines`` won't
# represent it. The end position is then the start of the next
# (empty) line — line index is len(lines), column 0.
if text.endswith(("\n", "\r")):
return {"line": last_line + 1, "character": 0}
return {"line": last_line, "character": last_col}
class LSPClient:
"""Async LSP client tied to one server process and one workspace root.
Lifecycle:
c = LSPClient(server_id, workspace_root, command, args, init_options)
await c.start() # spawn + initialize
ver = await c.open_file("/path/to/foo.py")
await c.wait_for_diagnostics("/path/to/foo.py", ver)
diags = c.diagnostics_for("/path/to/foo.py")
await c.shutdown()
"""
# ------------------------------------------------------------------
# construction + lifecycle
# ------------------------------------------------------------------
def __init__(
self,
*,
server_id: str,
workspace_root: str,
command: List[str],
env: Optional[Dict[str, str]] = None,
cwd: Optional[str] = None,
initialization_options: Optional[Dict[str, Any]] = None,
seed_diagnostics_on_first_push: bool = False,
) -> None:
self.server_id = server_id
self.workspace_root = workspace_root
self._command = list(command)
self._env = env
self._cwd = cwd or workspace_root
self._init_options = initialization_options or {}
self._seed_first_push = seed_diagnostics_on_first_push
# Process + streams
self._proc: Optional[asyncio.subprocess.Process] = None
self._stderr_task: Optional[asyncio.Task] = None
self._reader_task: Optional[asyncio.Task] = None
# Request/response correlation
self._next_id: int = 0
self._pending: Dict[int, asyncio.Future] = {}
# Server-side request handlers (server → client requests).
# Kept small and explicit; everything else returns method-not-found.
self._request_handlers: Dict[str, Callable[[Any], Awaitable[Any]]] = {
"window/workDoneProgress/create": self._handle_work_done_create,
"workspace/configuration": self._handle_workspace_configuration,
"client/registerCapability": self._handle_register_capability,
"client/unregisterCapability": self._handle_unregister_capability,
"workspace/workspaceFolders": self._handle_workspace_folders,
"workspace/diagnostic/refresh": self._handle_diagnostic_refresh,
}
# Notifications (server → client) we care about.
self._notification_handlers: Dict[str, Callable[[Any], None]] = {
"textDocument/publishDiagnostics": self._handle_publish_diagnostics,
# Everything else (window/showMessage, $/progress, etc.)
# is silently dropped by default.
}
# Tracked file state — required for didChange version bumps.
self._files: Dict[str, Dict[str, Any]] = {}
# Diagnostic stores, keyed by file path (NOT URI).
self._push_diagnostics: Dict[str, List[Dict[str, Any]]] = {}
self._pull_diagnostics: Dict[str, List[Dict[str, Any]]] = {}
# Per-path "last published" time so wait-for-fresh logic works.
self._published: Dict[str, float] = {}
# Per-path version of the latest push (matches our didChange
# version when the server respects it).
self._published_version: Dict[str, int] = {}
# First-push seen flag, for typescript-style seed-on-first-push.
self._first_push_seen: Set[str] = set()
# Capability registrations — only diagnostic ones are tracked.
self._diagnostic_registrations: Dict[str, Dict[str, Any]] = {}
# State machine
self._state: str = "stopped"
self._initialize_result: Optional[Dict[str, Any]] = None
self._sync_kind: int = 1 # 1=Full, 2=Incremental
self._stopping: bool = False
# Push event for waiters.
self._push_event = asyncio.Event()
# Monotonic counter incremented on every publishDiagnostics push.
# Waiters snapshot it on entry and treat any increase as
# "something happened, recheck the predicate". Avoids the
# asyncio.Event sticky-state trap.
self._push_counter = 0
# Registration change event so wait_for_diagnostics can re-loop
# when the server announces a new dynamic provider.
self._registration_event = asyncio.Event()
@property
def is_running(self) -> bool:
return self._state == "running" and self._proc is not None and self._proc.returncode is None
@property
def state(self) -> str:
return self._state
async def start(self) -> None:
"""Spawn the server and complete the initialize handshake.
Raises any exception encountered during spawn/init. On failure
the process is killed and the client is left in state
``"error"`` re-call ``start()`` to retry.
"""
if self._state in ("running", "starting"):
return
self._state = "starting"
try:
await self._spawn()
await self._initialize()
self._state = "running"
except Exception:
self._state = "error"
await self._cleanup_process()
raise
async def _spawn(self) -> None:
env = dict(os.environ)
if self._env:
env.update(self._env)
try:
self._proc = await asyncio.create_subprocess_exec(
self._command[0],
*self._command[1:],
stdin=asyncio.subprocess.PIPE,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
env=env,
cwd=self._cwd,
)
except FileNotFoundError as e:
raise LSPProtocolError(
f"LSP server binary not found: {self._command[0]} ({e})"
) from e
# Drain stderr at debug level — if we don't, the pipe buffer
# fills and the server hangs.
self._stderr_task = asyncio.create_task(self._drain_stderr())
# Start the reader loop.
self._reader_task = asyncio.create_task(self._reader_loop())
async def _drain_stderr(self) -> None:
if self._proc is None or self._proc.stderr is None:
return
try:
while True:
line = await self._proc.stderr.readline()
if not line:
break
text = line.decode("utf-8", errors="replace").rstrip()
if text:
logger.debug("[%s] stderr: %s", self.server_id, text[:1000])
except (asyncio.CancelledError, OSError):
pass
async def _reader_loop(self) -> None:
if self._proc is None or self._proc.stdout is None:
return
try:
while True:
msg = await read_message(self._proc.stdout)
if msg is None:
logger.debug("[%s] server closed stdout cleanly", self.server_id)
break
kind, key = classify_message(msg)
if kind == "response":
self._dispatch_response(key, msg)
elif kind == "request":
asyncio.create_task(self._dispatch_request(key, msg))
elif kind == "notification":
self._dispatch_notification(key, msg)
else:
logger.warning("[%s] dropping invalid message: %r", self.server_id, msg)
except LSPProtocolError as e:
logger.warning("[%s] protocol error in reader loop: %s", self.server_id, e)
except (asyncio.CancelledError, OSError):
pass
finally:
# Wake up any pending requests so they can fail fast.
for fut in list(self._pending.values()):
if not fut.done():
fut.set_exception(LSPProtocolError("server connection closed"))
self._pending.clear()
async def _initialize(self) -> None:
params = {
"rootUri": file_uri(self.workspace_root),
"rootPath": self.workspace_root,
"processId": os.getpid(),
"workspaceFolders": [
{"name": "workspace", "uri": file_uri(self.workspace_root)}
],
"initializationOptions": self._init_options,
"capabilities": {
"window": {"workDoneProgress": True},
"workspace": {
"configuration": True,
"workspaceFolders": True,
"didChangeWatchedFiles": {"dynamicRegistration": True},
"diagnostics": {"refreshSupport": False},
},
"textDocument": {
"synchronization": {
"dynamicRegistration": False,
"didOpen": True,
"didChange": True,
"didSave": True,
"willSave": False,
"willSaveWaitUntil": False,
},
"diagnostic": {
"dynamicRegistration": True,
"relatedDocumentSupport": True,
},
"publishDiagnostics": {
"relatedInformation": True,
"tagSupport": {"valueSet": [1, 2]},
"versionSupport": True,
"codeDescriptionSupport": True,
"dataSupport": False,
},
"hover": {"contentFormat": ["markdown", "plaintext"]},
"definition": {"linkSupport": True},
"references": {},
"documentSymbol": {"hierarchicalDocumentSymbolSupport": True},
},
"general": {"positionEncodings": ["utf-16"]},
},
}
result = await asyncio.wait_for(
self._send_request("initialize", params),
timeout=INITIALIZE_TIMEOUT,
)
self._initialize_result = result
self._sync_kind = self._extract_sync_kind(result.get("capabilities") or {})
await self._send_notification("initialized", {})
if self._init_options:
# Some servers (vtsls, eslint) want config pushed via
# didChangeConfiguration even if it was sent in
# initializationOptions.
await self._send_notification(
"workspace/didChangeConfiguration",
{"settings": self._init_options},
)
@staticmethod
def _extract_sync_kind(capabilities: dict) -> int:
sync = capabilities.get("textDocumentSync")
if isinstance(sync, int):
return sync
if isinstance(sync, dict):
change = sync.get("change")
if isinstance(change, int):
return change
return 1 # default to Full
async def shutdown(self) -> None:
"""Best-effort graceful shutdown.
Sends ``shutdown`` + ``exit``, then SIGTERMs/SIGKILLs the
process if it doesn't exit cleanly. Idempotent.
"""
if self._stopping:
return
self._stopping = True
try:
if self.is_running:
try:
await asyncio.wait_for(self._send_request("shutdown", None), timeout=2.0)
except (asyncio.TimeoutError, LSPRequestError, LSPProtocolError):
pass
try:
await self._send_notification("exit", None)
except Exception:
pass
finally:
self._state = "stopped"
await self._cleanup_process()
async def _cleanup_process(self) -> None:
if self._reader_task is not None and not self._reader_task.done():
self._reader_task.cancel()
try:
await self._reader_task
except (asyncio.CancelledError, Exception): # noqa: BLE001
pass
if self._stderr_task is not None and not self._stderr_task.done():
self._stderr_task.cancel()
try:
await self._stderr_task
except (asyncio.CancelledError, Exception): # noqa: BLE001
pass
proc = self._proc
self._proc = None
if proc is None:
return
if proc.returncode is None:
try:
proc.terminate()
try:
await asyncio.wait_for(proc.wait(), timeout=SHUTDOWN_GRACE)
except asyncio.TimeoutError:
try:
proc.kill()
await proc.wait()
except ProcessLookupError:
pass
except ProcessLookupError:
pass
# ------------------------------------------------------------------
# request / notification plumbing
# ------------------------------------------------------------------
async def _send_request(self, method: str, params: Any) -> Any:
if self._proc is None or self._proc.stdin is None or self._proc.stdin.is_closing():
raise LSPProtocolError(f"cannot send {method!r}: stdin closed")
loop = asyncio.get_running_loop()
req_id = self._next_id
self._next_id += 1
fut: asyncio.Future = loop.create_future()
self._pending[req_id] = fut
try:
self._proc.stdin.write(encode_message(make_request(req_id, method, params)))
await self._proc.stdin.drain()
except (BrokenPipeError, ConnectionResetError, OSError) as e:
self._pending.pop(req_id, None)
raise LSPProtocolError(f"send failed for {method!r}: {e}") from e
try:
return await fut
finally:
self._pending.pop(req_id, None)
async def _send_request_with_retry(self, method: str, params: Any, *, timeout: float) -> Any:
"""Send a request, retrying on ``ContentModified`` (-32801).
Other errors propagate. The retry policy matches Claude Code's
``LSPServerInstance.sendRequest`` 3 attempts with delays
0.5s, 1.0s, 2.0s.
"""
for attempt in range(MAX_CONTENT_MODIFIED_RETRIES + 1):
try:
return await asyncio.wait_for(self._send_request(method, params), timeout=timeout)
except LSPRequestError as e:
if e.code == ERROR_CONTENT_MODIFIED and attempt < MAX_CONTENT_MODIFIED_RETRIES:
await asyncio.sleep(RETRY_BASE_DELAY * (2 ** attempt))
continue
raise
async def _send_notification(self, method: str, params: Any) -> None:
if self._proc is None or self._proc.stdin is None or self._proc.stdin.is_closing():
return
try:
self._proc.stdin.write(encode_message(make_notification(method, params)))
await self._proc.stdin.drain()
except (BrokenPipeError, ConnectionResetError, OSError) as e:
logger.debug("[%s] notify %s failed: %s", self.server_id, method, e)
async def _send_response(self, req_id: Any, result: Any) -> None:
if self._proc is None or self._proc.stdin is None or self._proc.stdin.is_closing():
return
try:
self._proc.stdin.write(encode_message(make_response(req_id, result)))
await self._proc.stdin.drain()
except (BrokenPipeError, ConnectionResetError, OSError):
pass
async def _send_error_response(self, req_id: Any, code: int, message: str) -> None:
if self._proc is None or self._proc.stdin is None or self._proc.stdin.is_closing():
return
try:
self._proc.stdin.write(encode_message(make_error_response(req_id, code, message)))
await self._proc.stdin.drain()
except (BrokenPipeError, ConnectionResetError, OSError):
pass
def _dispatch_response(self, req_id: int, msg: dict) -> None:
fut = self._pending.get(req_id)
if fut is None or fut.done():
return
if "error" in msg:
err = msg["error"] or {}
fut.set_exception(
LSPRequestError(
code=int(err.get("code", -32000)),
message=str(err.get("message", "unknown")),
data=err.get("data"),
)
)
else:
fut.set_result(msg.get("result"))
async def _dispatch_request(self, req_id: Any, msg: dict) -> None:
method = msg.get("method", "")
params = msg.get("params")
handler = self._request_handlers.get(method)
if handler is None:
await self._send_error_response(req_id, ERROR_METHOD_NOT_FOUND, f"method not found: {method}")
return
try:
result = await handler(params)
except Exception as e: # noqa: BLE001 — protocol must not blow up
logger.warning("[%s] request handler %s failed: %s", self.server_id, method, e)
await self._send_error_response(req_id, -32000, f"handler failed: {e}")
return
await self._send_response(req_id, result)
def _dispatch_notification(self, method: str, msg: dict) -> None:
handler = self._notification_handlers.get(method)
if handler is None:
return
try:
handler(msg.get("params"))
except Exception as e: # noqa: BLE001
logger.debug("[%s] notification handler %s failed: %s", self.server_id, method, e)
# ------------------------------------------------------------------
# built-in server-→-client request handlers
# ------------------------------------------------------------------
async def _handle_work_done_create(self, params: Any) -> Any:
# Acknowledge progress tokens — required by some servers.
return None
async def _handle_workspace_configuration(self, params: Any) -> Any:
# Walk dotted sections through initializationOptions. Mirrors
# OpenCode's `client.ts:198-220` — return null when missing.
if not isinstance(params, dict):
return [None]
items = params.get("items") or []
out: List[Any] = []
for item in items:
if not isinstance(item, dict):
out.append(None)
continue
section = item.get("section")
if not section or not self._init_options:
out.append(self._init_options or None)
continue
cur: Any = self._init_options
for part in str(section).split("."):
if isinstance(cur, dict) and part in cur:
cur = cur[part]
else:
cur = None
break
out.append(cur)
return out
async def _handle_register_capability(self, params: Any) -> Any:
if not isinstance(params, dict):
return None
for reg in params.get("registrations") or []:
if not isinstance(reg, dict):
continue
method = reg.get("method")
reg_id = reg.get("id")
if method == "textDocument/diagnostic" and reg_id:
self._diagnostic_registrations[str(reg_id)] = reg
self._registration_event.set()
return None
async def _handle_unregister_capability(self, params: Any) -> Any:
if not isinstance(params, dict):
return None
for unreg in params.get("unregisterations") or []:
if not isinstance(unreg, dict):
continue
reg_id = unreg.get("id")
if reg_id:
self._diagnostic_registrations.pop(str(reg_id), None)
return None
async def _handle_workspace_folders(self, params: Any) -> Any:
return [{"name": "workspace", "uri": file_uri(self.workspace_root)}]
async def _handle_diagnostic_refresh(self, params: Any) -> Any:
# We don't honour refresh — we re-pull on every touchFile.
return None
# ------------------------------------------------------------------
# publishDiagnostics handler
# ------------------------------------------------------------------
def _handle_publish_diagnostics(self, params: Any) -> None:
if not isinstance(params, dict):
return
uri = params.get("uri")
if not isinstance(uri, str):
return
path = uri_to_path(uri)
diagnostics = params.get("diagnostics") or []
if not isinstance(diagnostics, list):
diagnostics = []
version = params.get("version")
loop_time = asyncio.get_event_loop().time()
if self._seed_first_push and path not in self._first_push_seen:
# First push: seed without firing the event so a waiter
# doesn't resolve on the very first push (which arrives
# before the user-triggered didChange could've produced
# fresh diagnostics).
self._first_push_seen.add(path)
self._push_diagnostics[path] = diagnostics
self._published[path] = loop_time
if isinstance(version, int):
self._published_version[path] = version
return
self._push_diagnostics[path] = diagnostics
self._published[path] = loop_time
if isinstance(version, int):
self._published_version[path] = version
self._first_push_seen.add(path)
# Bump the monotonic push counter and wake every waiter. We
# keep the Event sticky-set so any wait already in progress
# resolves; waiters re-check their predicate after waking and
# decide whether to keep waiting. ``_push_counter`` is what
# they actually compare against to detect a fresh event.
self._push_counter += 1
self._push_event.set()
# ------------------------------------------------------------------
# public file-sync API
# ------------------------------------------------------------------
async def open_file(self, path: str, *, language_id: str = "plaintext") -> int:
"""Send didOpen (first time) or didChange (subsequent) for ``path``.
Returns the new document version number that the agent's
``wait_for_diagnostics`` should match against.
"""
if not self.is_running:
raise LSPProtocolError("client not running")
abs_path = os.path.abspath(path)
try:
text = Path(abs_path).read_text(encoding="utf-8", errors="replace")
except OSError as e:
raise LSPProtocolError(f"cannot read {abs_path}: {e}") from e
uri = file_uri(abs_path)
existing = self._files.get(abs_path)
if existing is not None:
# Re-open: bump version, fire didChangeWatchedFiles + didChange.
await self._send_notification(
"workspace/didChangeWatchedFiles",
{"changes": [{"uri": uri, "type": 2}]}, # 2 = CHANGED
)
new_version = existing["version"] + 1
old_text = existing["text"]
content_changes: List[Dict[str, Any]]
if self._sync_kind == 2:
content_changes = [
{
"range": {
"start": {"line": 0, "character": 0},
"end": _end_position(old_text),
},
"text": text,
}
]
else:
content_changes = [{"text": text}]
await self._send_notification(
"textDocument/didChange",
{
"textDocument": {"uri": uri, "version": new_version},
"contentChanges": content_changes,
},
)
self._files[abs_path] = {"version": new_version, "text": text}
return new_version
# First open: didChangeWatchedFiles CREATED + didOpen.
await self._send_notification(
"workspace/didChangeWatchedFiles",
{"changes": [{"uri": uri, "type": 1}]}, # 1 = CREATED
)
# Clear any stale push/pull entries — fresh open should start
# from scratch.
self._push_diagnostics.pop(abs_path, None)
self._pull_diagnostics.pop(abs_path, None)
self._published.pop(abs_path, None)
self._published_version.pop(abs_path, None)
await self._send_notification(
"textDocument/didOpen",
{
"textDocument": {
"uri": uri,
"languageId": language_id,
"version": 0,
"text": text,
}
},
)
self._files[abs_path] = {"version": 0, "text": text}
return 0
async def save_file(self, path: str) -> None:
"""Send didSave for ``path``. Some linters re-scan only on save."""
if not self.is_running:
return
abs_path = os.path.abspath(path)
await self._send_notification(
"textDocument/didSave",
{"textDocument": {"uri": file_uri(abs_path)}},
)
# ------------------------------------------------------------------
# diagnostics: pull + wait
# ------------------------------------------------------------------
async def _pull_document_diagnostics(self, path: str) -> None:
"""Send ``textDocument/diagnostic`` for one file.
Stores results into :attr:`_pull_diagnostics`. Silently
no-ops on errors (server may not support the pull endpoint).
"""
try:
params: Dict[str, Any] = {
"textDocument": {"uri": file_uri(os.path.abspath(path))}
}
result = await self._send_request_with_retry(
"textDocument/diagnostic",
params,
timeout=DIAGNOSTICS_REQUEST_TIMEOUT,
)
except (LSPRequestError, LSPProtocolError, asyncio.TimeoutError) as e:
logger.debug("[%s] document diagnostic pull failed: %s", self.server_id, e)
return
if not isinstance(result, dict):
return
items = result.get("items")
if isinstance(items, list):
self._pull_diagnostics[os.path.abspath(path)] = items
related = result.get("relatedDocuments")
if isinstance(related, dict):
for uri, sub in related.items():
if not isinstance(sub, dict):
continue
sub_items = sub.get("items")
if isinstance(sub_items, list):
self._pull_diagnostics[uri_to_path(uri)] = sub_items
async def wait_for_diagnostics(
self,
path: str,
version: int,
*,
mode: str = "document",
) -> None:
"""Wait for the server to publish diagnostics for ``path`` at ``version``.
``mode`` is ``"document"`` (5s budget, document pulls) or
``"full"`` (10s budget, also workspace pulls). Best-effort
returns silently on timeout. Does NOT throw if the server
doesn't support pull diagnostics; we still get the push side.
"""
budget = DIAGNOSTICS_FULL_WAIT if mode == "full" else DIAGNOSTICS_DOCUMENT_WAIT
deadline = asyncio.get_event_loop().time() + budget
abs_path = os.path.abspath(path)
while True:
remaining = deadline - asyncio.get_event_loop().time()
if remaining <= 0:
return
# Concurrent: document pull + push wait.
pull_task = asyncio.create_task(self._pull_document_diagnostics(abs_path))
push_task = asyncio.create_task(self._wait_for_fresh_push(abs_path, version, remaining))
done, pending = await asyncio.wait(
{pull_task, push_task},
timeout=remaining,
return_when=asyncio.FIRST_COMPLETED,
)
for t in pending:
t.cancel()
for t in pending:
try:
await t
except (asyncio.CancelledError, Exception): # noqa: BLE001
pass
# If we got a fresh push for our version, we're done.
current_v = self._published_version.get(abs_path)
if abs_path in self._published and (
current_v is None or current_v >= version
):
return
# Pull may have populated _pull_diagnostics — that's also
# success.
if abs_path in self._pull_diagnostics:
return
# Loop until budget runs out.
async def _wait_for_fresh_push(self, path: str, version: int, timeout: float) -> None:
"""Wait until a publishDiagnostics arrives for ``path`` at ``version``+."""
deadline = asyncio.get_event_loop().time() + timeout
baseline = self._push_counter
while True:
current_v = self._published_version.get(path)
if path in self._published and (current_v is None or current_v >= version):
# Debounce — wait a tick in case more diagnostics arrive
# immediately after. TS often emits in pairs. We
# snapshot the counter so we wake on a *new* push, not
# on the one that satisfied us a moment ago.
debounce_baseline = self._push_counter
debounce_deadline = asyncio.get_event_loop().time() + PUSH_DEBOUNCE
while self._push_counter == debounce_baseline:
remaining = debounce_deadline - asyncio.get_event_loop().time()
if remaining <= 0:
break
self._push_event.clear()
try:
await asyncio.wait_for(self._push_event.wait(), timeout=remaining)
except asyncio.TimeoutError:
break
return
remaining = deadline - asyncio.get_event_loop().time()
if remaining <= 0:
return
if self._push_counter > baseline:
# New event arrived but predicate still false — re-check
# immediately without waiting again.
baseline = self._push_counter
continue
self._push_event.clear()
try:
await asyncio.wait_for(self._push_event.wait(), timeout=min(remaining, 0.5))
except asyncio.TimeoutError:
continue
def diagnostics_for(self, path: str) -> List[Dict[str, Any]]:
"""Return current merged + deduped diagnostics for one file.
Diagnostics from push and pull stores are concatenated and
deduplicated by ``(severity, code, message, range)`` content
key. Empty list if the server hasn't published anything.
"""
abs_path = os.path.abspath(path)
push = self._push_diagnostics.get(abs_path) or []
pull = self._pull_diagnostics.get(abs_path) or []
return _dedupe(push, pull)
def _dedupe(*lists: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
seen: Set[str] = set()
out: List[Dict[str, Any]] = []
for lst in lists:
for d in lst:
if not isinstance(d, dict):
continue
key = _diagnostic_key(d)
if key in seen:
continue
seen.add(key)
out.append(d)
return out
def _diagnostic_key(d: Dict[str, Any]) -> str:
"""Content-equality key for a diagnostic.
Matches the structural-equality used in claude-code's
``areDiagnosticsEqual`` message + severity + source + code +
range coords. The range is reduced to a tuple to keep the key
stable across dict orderings.
"""
rng = d.get("range") or {}
start = rng.get("start") or {}
end = rng.get("end") or {}
code = d.get("code")
if code is not None and not isinstance(code, str):
code = str(code)
return "\x00".join(
[
str(d.get("severity") or 1),
str(code or ""),
str(d.get("source") or ""),
str(d.get("message") or "").strip(),
f"{start.get('line', 0)}:{start.get('character', 0)}-{end.get('line', 0)}:{end.get('character', 0)}",
]
)
__all__ = [
"LSPClient",
"file_uri",
"uri_to_path",
"INITIALIZE_TIMEOUT",
"DIAGNOSTICS_DOCUMENT_WAIT",
"DIAGNOSTICS_FULL_WAIT",
]
-213
View File
@@ -1,213 +0,0 @@
"""Structured logging with steady-state silence for the LSP layer.
The LSP layer fires on every write_file/patch. In a busy session
that's hundreds of events. We want users to be able to ``rg`` the
log for "did LSP fire on that edit?" without drowning in noise.
The level model:
- ``DEBUG`` for steady-state events that have no novel signal:
``clean``, ``feature off``, ``extension not mapped``, ``no project
root for already-announced file``, ``server unavailable for
already-announced binary``. These never reach ``agent.log`` at the
default INFO threshold.
- ``INFO`` for state transitions worth surfacing exactly once per
session: ``active for <root>`` the first time a (server_id,
workspace_root) client starts, ``no project root for <path>``
the first time we see that file. Plus every diagnostic event
(those are inherently rare and per-edit, exactly what users grep
for).
- ``WARNING`` for action-required failures: ``server unavailable``
(binary not on PATH) the first time per (server_id, binary),
``no server configured`` once per language. Per-call WARNING for
timeouts and unexpected bridge exceptions.
The dedup is in-process module-level sets. Each set grows at most by
the number of distinct (server_id, root) and (server_id, binary)
pairs touched in one Python process bytes of memory in even an
aggressive monorepo session. Bounded LRU was rejected: evicting an
entry would risk re-firing the WARNING/INFO line we explicitly want
to suppress.
Grep recipe::
tail -f ~/.hermes/logs/agent.log | rg 'lsp\\['
"""
from __future__ import annotations
import logging
import os
import threading
from typing import Tuple
# Dedicated logger name so the documented grep recipe survives a
# ``logging.getLogger(__name__)`` rename of any internal module.
event_log = logging.getLogger("hermes.lint.lsp")
# ---------------------------------------------------------------------------
# Once-per-X dedup sets
# ---------------------------------------------------------------------------
_announce_lock = threading.Lock()
_announced_active: set = set() # keys: (server_id, workspace_root)
_announced_unavailable: set = set() # keys: (server_id, binary_path_or_name)
_announced_no_root: set = set() # keys: (server_id, file_path)
_announced_no_server: set = set() # keys: (server_id,)
def _short_path(file_path: str) -> str:
"""Render *file_path* relative to the cwd when sensible, else absolute.
Keeps log lines readable for the common case (the user is inside
the project they're editing) without emitting brittle ``../../..``
chains for the cross-tree case.
"""
if not file_path:
return file_path
try:
rel = os.path.relpath(file_path)
except ValueError:
return file_path
if rel.startswith(".." + os.sep) or rel == "..":
return file_path
return rel
def _emit(server_id: str, level: int, message: str) -> None:
event_log.log(level, "lsp[%s] %s", server_id, message)
def _announce_once(bucket: set, key: Tuple) -> bool:
"""Return True if *key* has not been announced for *bucket* yet.
Atomically marks the key as announced so concurrent callers
cannot both win the race and double-log.
"""
with _announce_lock:
if key in bucket:
return False
bucket.add(key)
return True
# ---------------------------------------------------------------------------
# Public event helpers — call these from the LSP layer.
# ---------------------------------------------------------------------------
def log_clean(server_id: str, file_path: str) -> None:
"""No diagnostics emitted for *file_path*. DEBUG (silent at default)."""
_emit(server_id, logging.DEBUG, f"clean ({_short_path(file_path)})")
def log_disabled(server_id: str, file_path: str, reason: str) -> None:
"""LSP intentionally skipped for this file (feature off, ext unmapped,
backend not local, etc.). DEBUG."""
_emit(server_id, logging.DEBUG, f"skipped: {reason} ({_short_path(file_path)})")
def log_active(server_id: str, workspace_root: str) -> None:
"""A new LSP client started for (server_id, workspace_root).
INFO once per (server_id, workspace_root); DEBUG thereafter.
Lets users verify "is LSP actually running?" with a single grep.
"""
key = (server_id, workspace_root)
if _announce_once(_announced_active, key):
_emit(server_id, logging.INFO, f"active for {workspace_root}")
else:
_emit(server_id, logging.DEBUG, f"reused client for {workspace_root}")
def log_diagnostics(server_id: str, file_path: str, count: int) -> None:
"""Diagnostics arrived for a file. INFO every time — these are the
failure signals users actually want to grep for, and they are
inherently rare per edit."""
_emit(server_id, logging.INFO, f"{count} diags ({_short_path(file_path)})")
def log_no_project_root(server_id: str, file_path: str) -> None:
"""File had no recognised project marker. INFO once per file,
DEBUG thereafter."""
key = (server_id, file_path)
if _announce_once(_announced_no_root, key):
_emit(server_id, logging.INFO, f"no project root for {_short_path(file_path)}")
else:
_emit(server_id, logging.DEBUG, f"no project root for {_short_path(file_path)}")
def log_server_unavailable(server_id: str, binary_or_pkg: str) -> None:
"""The server binary couldn't be resolved. WARNING once per
(server_id, binary), DEBUG thereafter so a hundred subsequent
.py edits don't spam the log."""
key = (server_id, binary_or_pkg)
if _announce_once(_announced_unavailable, key):
_emit(
server_id,
logging.WARNING,
f"server unavailable: {binary_or_pkg} not found "
"(install via `hermes lsp install <id>` or set lsp.servers.<id>.command)",
)
else:
_emit(server_id, logging.DEBUG, f"server still unavailable: {binary_or_pkg}")
def log_no_server_configured(server_id: str) -> None:
"""No spawn recipe for this language. WARNING once."""
if _announce_once(_announced_no_server, (server_id,)):
_emit(server_id, logging.WARNING, "no server configured")
def log_timeout(server_id: str, file_path: str, kind: str = "diagnostics") -> None:
"""A request to the server timed out. WARNING every time — these are
inherently novel events worth surfacing on each occurrence."""
_emit(
server_id,
logging.WARNING,
f"{kind} timed out for {_short_path(file_path)}",
)
def log_server_error(server_id: str, file_path: str, exc: BaseException) -> None:
"""An unexpected exception bubbled out of the LSP layer. WARNING."""
_emit(
server_id,
logging.WARNING,
f"unexpected error for {_short_path(file_path)}: {type(exc).__name__}: {exc}",
)
def log_spawn_failed(server_id: str, workspace_root: str, exc: BaseException) -> None:
"""The LSP server failed to spawn or initialize. WARNING."""
_emit(
server_id,
logging.WARNING,
f"spawn/initialize failed for {workspace_root}: {type(exc).__name__}: {exc}",
)
def reset_announce_caches() -> None:
"""Test-only: clear the dedup caches. Production code never calls this."""
with _announce_lock:
_announced_active.clear()
_announced_unavailable.clear()
_announced_no_root.clear()
_announced_no_server.clear()
__all__ = [
"event_log",
"log_clean",
"log_disabled",
"log_active",
"log_diagnostics",
"log_no_project_root",
"log_server_unavailable",
"log_no_server_configured",
"log_timeout",
"log_server_error",
"log_spawn_failed",
"reset_announce_caches",
]
-376
View File
@@ -1,376 +0,0 @@
"""Auto-installation of LSP server binaries.
Tries to install missing servers using whatever package manager is
appropriate. All installs go to a Hermes-owned bin staging dir,
``<HERMES_HOME>/lsp/bin/``, so we don't pollute the user's global
toolchain.
Strategies:
- ``auto`` attempt to install with the best available package
manager. This is the default.
- ``manual`` never install; if a binary is missing, the server is
silently skipped and the user is told about it via ``hermes lsp
status``.
- ``off`` same as ``manual`` for now (kept distinct so we can
evolve behavior later, e.g. logging differently).
The actual installs happen synchronously the first time a server is
needed and concurrent calls to :func:`try_install` for the same
package are deduplicated via a per-package lock.
Failure modes are non-fatal: every install path is wrapped in
try/except and returns ``None`` on failure. The tool layer then
falls back to its in-process syntax checker, exactly as if the user
hadn't enabled LSP at all.
"""
from __future__ import annotations
import logging
import os
import shutil
import subprocess
import sys
import threading
from pathlib import Path
from typing import Any, Dict, Optional
logger = logging.getLogger("agent.lsp.install")
# Package-name → install-strategy hint registry. Each entry is a
# tuple of strategy name + package name + executable name. When the
# install completes, we look for the executable in
# ``<HERMES_HOME>/lsp/bin/`` first, then on PATH.
#
# Optional fields:
# - ``extra_pkgs``: list of sibling packages to install alongside
# ``pkg`` in the same node_modules tree. Used when an LSP server
# has a runtime peer dependency that npm doesn't auto-pull (e.g.
# typescript-language-server needs ``typescript``).
INSTALL_RECIPES: Dict[str, Dict[str, Any]] = {
# Python
"pyright": {"strategy": "npm", "pkg": "pyright", "bin": "pyright-langserver"},
# JS/TS family
"typescript-language-server": {
"strategy": "npm",
"pkg": "typescript-language-server",
"bin": "typescript-language-server",
# typescript-language-server requires the `typescript` SDK
# (tsserver) to be importable from the same node_modules tree;
# otherwise initialize() fails with "Could not find a valid
# TypeScript installation". Install them together.
"extra_pkgs": ["typescript"],
},
"@vue/language-server": {
"strategy": "npm",
"pkg": "@vue/language-server",
"bin": "vue-language-server",
},
"svelte-language-server": {
"strategy": "npm",
"pkg": "svelte-language-server",
"bin": "svelteserver",
},
"@astrojs/language-server": {
"strategy": "npm",
"pkg": "@astrojs/language-server",
"bin": "astro-ls",
},
"yaml-language-server": {
"strategy": "npm",
"pkg": "yaml-language-server",
"bin": "yaml-language-server",
},
"bash-language-server": {
"strategy": "npm",
"pkg": "bash-language-server",
"bin": "bash-language-server",
},
"intelephense": {"strategy": "npm", "pkg": "intelephense", "bin": "intelephense"},
"dockerfile-language-server-nodejs": {
"strategy": "npm",
"pkg": "dockerfile-language-server-nodejs",
"bin": "docker-langserver",
},
# Go
"gopls": {"strategy": "go", "pkg": "golang.org/x/tools/gopls@latest", "bin": "gopls"},
# Rust — too heavy (hundreds of MB to bootstrap). We do NOT
# auto-install rust-analyzer; users install via rustup.
"rust-analyzer": {"strategy": "manual", "pkg": "", "bin": "rust-analyzer"},
# C/C++ — manual (clangd ships with LLVM, very heavy)
"clangd": {"strategy": "manual", "pkg": "", "bin": "clangd"},
# Lua — manual (LuaLS is platform-specific binaries from GitHub
# releases; complex enough that we punt to the user)
"lua-language-server": {"strategy": "manual", "pkg": "", "bin": "lua-language-server"},
}
_install_locks: Dict[str, threading.Lock] = {}
_install_results: Dict[str, Optional[str]] = {}
_install_lock_meta = threading.Lock()
def hermes_lsp_bin_dir() -> Path:
"""Return the Hermes-owned bin staging dir for LSP servers."""
home = os.environ.get("HERMES_HOME")
if home is None:
home = os.path.join(os.path.expanduser("~"), ".hermes")
p = Path(home) / "lsp" / "bin"
p.mkdir(parents=True, exist_ok=True)
return p
def _existing_binary(name: str) -> Optional[str]:
"""Probe the staging dir + PATH for a binary named ``name``."""
staged = hermes_lsp_bin_dir() / name
if staged.exists() and os.access(staged, os.X_OK):
return str(staged)
on_path = shutil.which(name)
if on_path:
return on_path
return None
def _get_lock(pkg: str) -> threading.Lock:
with _install_lock_meta:
lock = _install_locks.get(pkg)
if lock is None:
lock = threading.Lock()
_install_locks[pkg] = lock
return lock
def try_install(pkg: str, strategy: str = "auto") -> Optional[str]:
"""Try to install ``pkg`` and return the binary path if successful.
``strategy`` is ``"auto"``, ``"manual"``, or ``"off"``. In
``manual``/``off`` mode, this function only probes for an
existing binary and returns ``None`` if not found.
The install is cached per-package a second call returns the
same path (or ``None``) without reinstalling. Concurrent calls
are serialized.
"""
if strategy not in ("auto",):
# Only ``auto`` triggers an actual install. In manual/off,
# we still check whether the binary already exists.
recipe = INSTALL_RECIPES.get(pkg, {})
bin_name = recipe.get("bin", pkg)
return _existing_binary(bin_name)
if pkg in _install_results:
return _install_results[pkg]
lock = _get_lock(pkg)
with lock:
# Double-check after acquiring lock.
if pkg in _install_results:
return _install_results[pkg]
result = _do_install(pkg)
_install_results[pkg] = result
return result
def _do_install(pkg: str) -> Optional[str]:
recipe = INSTALL_RECIPES.get(pkg)
if recipe is None:
# Not in our registry — best-effort: just probe PATH.
return shutil.which(pkg)
strategy = recipe.get("strategy", "manual")
bin_name = recipe.get("bin", pkg)
# Check if already present (shutil.which or staging dir)
existing = _existing_binary(bin_name)
if existing:
return existing
if strategy == "manual":
logger.debug("[install] %s requires manual install (recipe=%s)", pkg, recipe)
return None
if strategy == "npm":
return _install_npm(
recipe.get("pkg", pkg),
bin_name,
extra_pkgs=recipe.get("extra_pkgs") or [],
)
if strategy == "go":
return _install_go(recipe.get("pkg", pkg), bin_name)
if strategy == "pip":
return _install_pip(recipe.get("pkg", pkg), bin_name)
logger.warning("[install] unknown strategy %r for %s", strategy, pkg)
return None
def _install_npm(
pkg: str,
bin_name: str,
extra_pkgs: Optional[list] = None,
) -> Optional[str]:
"""Install an npm package into our staging dir.
Uses ``npm install --prefix`` so the binaries land in
``<staging>/node_modules/.bin/<bin_name>`` and we symlink them up
one level for direct PATH-style access.
``extra_pkgs`` is a list of sibling packages to install in the
same ``node_modules`` tree. Used for LSP servers with runtime
peer deps that npm doesn't auto-pull (typescript-language-server
needs ``typescript`` next to it; intelephense ships standalone).
"""
npm = shutil.which("npm")
if npm is None:
logger.info("[install] cannot install %s: npm not on PATH", pkg)
return None
staging = hermes_lsp_bin_dir().parent # <HERMES_HOME>/lsp/
install_targets = [pkg] + list(extra_pkgs or [])
try:
logger.info(
"[install] npm install --prefix %s %s",
staging,
" ".join(install_targets),
)
proc = subprocess.run(
[npm, "install", "--prefix", str(staging), "--silent", "--no-fund", "--no-audit", *install_targets],
check=False,
capture_output=True,
text=True,
timeout=300,
)
if proc.returncode != 0:
logger.warning(
"[install] npm install failed for %s: %s", pkg, proc.stderr.strip()[:500]
)
return None
except (subprocess.TimeoutExpired, OSError) as e:
logger.warning("[install] npm install errored for %s: %s", pkg, e)
return None
# Find the bin
nm_bin = staging / "node_modules" / ".bin" / bin_name
if os.name == "nt":
# On Windows npm sometimes drops `.cmd` shims
candidates = [nm_bin, nm_bin.with_suffix(".cmd")]
else:
candidates = [nm_bin]
for c in candidates:
if c.exists():
# Symlink into our `lsp/bin/` for stable PATH access.
link = hermes_lsp_bin_dir() / c.name
if not link.exists():
try:
link.symlink_to(c)
except (OSError, NotImplementedError):
# Symlinks fail on some Windows setups — copy instead.
try:
shutil.copy2(c, link)
except OSError:
return str(c)
return str(link if link.exists() else c)
logger.warning("[install] npm install for %s succeeded but bin %s not found", pkg, bin_name)
return None
def _install_go(pkg: str, bin_name: str) -> Optional[str]:
"""Install a Go module to GOBIN=<staging>."""
go = shutil.which("go")
if go is None:
logger.info("[install] cannot install %s: go not on PATH", pkg)
return None
staging = hermes_lsp_bin_dir()
env = dict(os.environ)
env["GOBIN"] = str(staging)
try:
logger.info("[install] go install %s (GOBIN=%s)", pkg, staging)
proc = subprocess.run(
[go, "install", pkg],
check=False,
capture_output=True,
text=True,
timeout=600,
env=env,
)
if proc.returncode != 0:
logger.warning(
"[install] go install failed for %s: %s", pkg, proc.stderr.strip()[:500]
)
return None
except (subprocess.TimeoutExpired, OSError) as e:
logger.warning("[install] go install errored for %s: %s", pkg, e)
return None
bin_path = staging / bin_name
if os.name == "nt":
bin_path = bin_path.with_suffix(".exe")
if bin_path.exists():
return str(bin_path)
logger.warning("[install] go install for %s succeeded but bin %s not found", pkg, bin_name)
return None
def _install_pip(pkg: str, bin_name: str) -> Optional[str]:
"""Install a Python package into a hermes-owned target dir.
We avoid polluting the user's site-packages by using
``pip install --target``. Bins go into
``<staging>/python-packages/bin/`` which we symlink into
``<staging>/bin``. Note: this only works for packages that ship a
console script.
"""
pip_target = hermes_lsp_bin_dir().parent / "python-packages"
pip_target.mkdir(parents=True, exist_ok=True)
try:
logger.info("[install] pip install --target %s %s", pip_target, pkg)
proc = subprocess.run(
[sys.executable, "-m", "pip", "install", "--target", str(pip_target), "--quiet", pkg],
check=False,
capture_output=True,
text=True,
timeout=300,
)
if proc.returncode != 0:
logger.warning(
"[install] pip install failed for %s: %s", pkg, proc.stderr.strip()[:500]
)
return None
except (subprocess.TimeoutExpired, OSError) as e:
logger.warning("[install] pip install errored for %s: %s", pkg, e)
return None
# Look for the script
bin_path = pip_target / "bin" / bin_name
if bin_path.exists():
link = hermes_lsp_bin_dir() / bin_name
if not link.exists():
try:
link.symlink_to(bin_path)
except (OSError, NotImplementedError):
try:
shutil.copy2(bin_path, link)
except OSError:
return str(bin_path)
return str(link if link.exists() else bin_path)
return None
def detect_status(pkg: str) -> str:
"""Return ``installed``, ``missing``, or ``manual-only`` for a package.
Used by the ``hermes lsp status`` CLI to give users a quick
overview of what's available without spawning anything.
"""
recipe = INSTALL_RECIPES.get(pkg)
bin_name = recipe.get("bin", pkg) if recipe else pkg
if _existing_binary(bin_name):
return "installed"
if recipe and recipe.get("strategy") == "manual":
return "manual-only"
return "missing"
__all__ = [
"INSTALL_RECIPES",
"try_install",
"detect_status",
"hermes_lsp_bin_dir",
]
-607
View File
@@ -1,607 +0,0 @@
"""Service-level orchestration for LSP clients.
The :class:`LSPService` is the bridge between the synchronous
file_operations layer and the async :class:`agent.lsp.client.LSPClient`.
Design choices:
- A **single asyncio event loop** runs in a background thread. All
client work happens on that loop. Synchronous callers from
``tools/file_operations.py`` use :meth:`get_diagnostics_sync` to
open + wait + drain in one blocking call.
- One client per ``(server_id, workspace_root)`` key. Lazy spawn:
the first request for a key spawns the client; subsequent requests
re-use it.
- A **broken-set** records ``(server_id, workspace_root)`` pairs that
failed to spawn or initialize. These are never retried for the
life of the service. Mirrors OpenCode's design.
- A **delta baseline** map keeps "diagnostics-as-of-the-last-snapshot"
per file. ``snapshot_baseline()`` is called BEFORE a write; the
next ``get_diagnostics_sync()`` returns only diagnostics that
weren't in the baseline. This is the lift from Claude Code's
``beforeFileEdited`` / ``getNewDiagnostics`` pattern, except wired
to the local LSP layer instead of MCP IDE RPC.
The service is **off by default** call :meth:`is_active` to check
whether it's actually doing anything. When LSP is disabled in
config, when no git workspace can be detected, when all configured
servers are missing binaries and auto-install is off, ``is_active``
returns False and the file_operations layer falls through to the
in-process syntax check.
"""
from __future__ import annotations
import asyncio
import logging
import os
import threading
import time
from concurrent.futures import Future as ConcurrentFuture
from typing import Any, Dict, List, Optional, Tuple
from agent.lsp import eventlog
from agent.lsp.client import (
DIAGNOSTICS_DOCUMENT_WAIT,
LSPClient,
file_uri,
)
from agent.lsp.servers import (
ServerContext,
ServerDef,
SpawnSpec,
find_server_for_file,
language_id_for,
)
from agent.lsp.workspace import (
clear_cache,
is_inside_workspace,
resolve_workspace_for_file,
)
logger = logging.getLogger("agent.lsp.manager")
DEFAULT_IDLE_TIMEOUT = 600 # seconds; servers idle for >10min get reaped
class _BackgroundLoop:
"""A daemon thread that owns one asyncio event loop.
Provides :meth:`run` for synchronous callers submits a coroutine
to the loop and blocks until it finishes (or a timeout fires).
"""
def __init__(self) -> None:
self._loop: Optional[asyncio.AbstractEventLoop] = None
self._thread: Optional[threading.Thread] = None
self._ready = threading.Event()
def start(self) -> None:
if self._thread is not None:
return
self._thread = threading.Thread(
target=self._run_forever,
name="hermes-lsp-loop",
daemon=True,
)
self._thread.start()
self._ready.wait(timeout=5.0)
def _run_forever(self) -> None:
loop = asyncio.new_event_loop()
self._loop = loop
asyncio.set_event_loop(loop)
self._ready.set()
try:
loop.run_forever()
finally:
try:
loop.close()
except Exception: # noqa: BLE001
pass
def run(self, coro, *, timeout: Optional[float] = None) -> Any:
"""Submit a coroutine to the loop and block until done.
Returns the coroutine's result, or raises its exception.
"""
if self._loop is None:
raise RuntimeError("background loop not started")
fut: ConcurrentFuture = asyncio.run_coroutine_threadsafe(coro, self._loop)
try:
return fut.result(timeout=timeout)
except Exception:
fut.cancel()
raise
def stop(self) -> None:
loop = self._loop
if loop is None:
return
try:
loop.call_soon_threadsafe(loop.stop)
except RuntimeError:
pass
if self._thread is not None:
self._thread.join(timeout=2.0)
self._loop = None
self._thread = None
class LSPService:
"""The process-wide LSP service.
Created once via :meth:`create_from_config`; the
:func:`agent.lsp.get_service` accessor manages the singleton.
Most callers should use that accessor rather than constructing
:class:`LSPService` directly.
"""
# ------------------------------------------------------------------
# construction + factory
# ------------------------------------------------------------------
def __init__(
self,
*,
enabled: bool,
wait_mode: str,
wait_timeout: float,
install_strategy: str,
binary_overrides: Optional[Dict[str, List[str]]] = None,
env_overrides: Optional[Dict[str, Dict[str, str]]] = None,
init_overrides: Optional[Dict[str, Dict[str, Any]]] = None,
disabled_servers: Optional[List[str]] = None,
idle_timeout: float = DEFAULT_IDLE_TIMEOUT,
) -> None:
self._enabled = enabled
self._wait_mode = wait_mode if wait_mode in ("document", "full") else "document"
self._wait_timeout = wait_timeout
self._install_strategy = install_strategy
self._binary_overrides = binary_overrides or {}
self._env_overrides = env_overrides or {}
self._init_overrides = init_overrides or {}
self._disabled_servers = set(disabled_servers or [])
self._idle_timeout = idle_timeout
self._loop = _BackgroundLoop()
if self._enabled:
self._loop.start()
# Per-(server_id, workspace_root) state
self._clients: Dict[Tuple[str, str], LSPClient] = {}
self._broken: set = set()
self._spawning: Dict[Tuple[str, str], asyncio.Future] = {}
self._last_used: Dict[Tuple[str, str], float] = {}
self._state_lock = threading.Lock()
# Delta baseline: file path → snapshot of diagnostics taken
# immediately before a write. ``get_diagnostics_sync`` filters
# out anything in the baseline so the agent only sees errors
# introduced by the current edit.
self._delta_baseline: Dict[str, List[Dict[str, Any]]] = {}
@classmethod
def create_from_config(cls) -> Optional["LSPService"]:
"""Build a service from ``hermes_cli.config`` settings.
Returns ``None`` if the config can't be loaded. The service
itself returns ``is_active()`` False when LSP is disabled.
"""
try:
from hermes_cli.config import load_config
cfg = load_config()
except Exception as e: # noqa: BLE001
logger.debug("LSP config load failed: %s", e)
return None
lsp_cfg = (cfg.get("lsp") or {}) if isinstance(cfg, dict) else {}
if not isinstance(lsp_cfg, dict):
lsp_cfg = {}
enabled = bool(lsp_cfg.get("enabled", True))
wait_mode = lsp_cfg.get("wait_mode", "document")
wait_timeout = float(lsp_cfg.get("wait_timeout", DIAGNOSTICS_DOCUMENT_WAIT))
install_strategy = lsp_cfg.get("install_strategy", "auto")
servers_cfg = lsp_cfg.get("servers") or {}
disabled = []
binary_overrides: Dict[str, List[str]] = {}
env_overrides: Dict[str, Dict[str, str]] = {}
init_overrides: Dict[str, Dict[str, Any]] = {}
if isinstance(servers_cfg, dict):
for name, sub in servers_cfg.items():
if not isinstance(sub, dict):
continue
if sub.get("disabled"):
disabled.append(name)
cmd = sub.get("command")
if isinstance(cmd, list) and cmd:
binary_overrides[name] = cmd
env = sub.get("env")
if isinstance(env, dict):
env_overrides[name] = {k: str(v) for k, v in env.items()}
init = sub.get("initialization_options")
if isinstance(init, dict):
init_overrides[name] = init
return cls(
enabled=enabled,
wait_mode=wait_mode,
wait_timeout=wait_timeout,
install_strategy=install_strategy,
binary_overrides=binary_overrides,
env_overrides=env_overrides,
init_overrides=init_overrides,
disabled_servers=disabled,
)
# ------------------------------------------------------------------
# public API
# ------------------------------------------------------------------
def is_active(self) -> bool:
"""Return True iff this service should be consulted at all."""
return self._enabled
def enabled_for(self, file_path: str) -> bool:
"""Return True iff LSP should run for this specific file.
Gates on workspace detection (file or cwd inside a git worktree),
on whether any registered server matches the extension, and
on whether the (server_id, workspace_root) pair is in the
broken-set from a previous spawn failure.
Files in already-broken pairs return False so the file_operations
layer skips the LSP path entirely no spawn attempts, no
timeout cost until the service is restarted (``hermes lsp
restart``) or the process exits.
"""
if not self._enabled:
return False
srv = find_server_for_file(file_path)
if srv is None or srv.server_id in self._disabled_servers:
return False
ws_root, gated_in = resolve_workspace_for_file(file_path)
if not (ws_root and gated_in):
return False
# Broken-set short-circuit. Use the per-server root if we can
# compute one cheaply; otherwise fall back to the workspace
# root as the broken key (which is what _get_or_spawn would
# have used anyway when it failed).
try:
per_server_root = srv.resolve_root(file_path, ws_root) or ws_root
except Exception: # noqa: BLE001
per_server_root = ws_root
if (srv.server_id, per_server_root) in self._broken:
return False
return True
def snapshot_baseline(self, file_path: str) -> None:
"""Snapshot current diagnostics for ``file_path`` as the delta baseline.
Called BEFORE a write so the next ``get_diagnostics_sync()``
can filter out pre-existing errors. Best-effort failures
are silently swallowed so a flaky server can't break a write.
Outer timeouts (e.g. server hangs during initialize) mark the
(server_id, workspace_root) pair as broken so subsequent edits
skip it instantly instead of re-paying the timeout cost.
"""
if not self.enabled_for(file_path):
return
try:
diags = self._loop.run(self._snapshot_async(file_path), timeout=8.0)
self._delta_baseline[os.path.abspath(file_path)] = diags or []
except Exception as e: # noqa: BLE001
logger.debug("baseline snapshot failed for %s: %s", file_path, e)
self._mark_broken_for_file(file_path, e)
self._delta_baseline[os.path.abspath(file_path)] = []
def get_diagnostics_sync(
self,
file_path: str,
*,
delta: bool = True,
timeout: Optional[float] = None,
) -> List[Dict[str, Any]]:
"""Synchronously open ``file_path`` in the right server, wait for
diagnostics, return them.
If ``delta`` is True (default), the result is filtered against
any baseline previously captured via :meth:`snapshot_baseline`.
Diagnostics present in the baseline are removed so the caller
only sees errors introduced by the current edit.
Returns an empty list when LSP is disabled, when no workspace
can be detected, when no server matches, or when the server
can't be spawned. Never raises.
"""
if not self.enabled_for(file_path):
return []
# Resolve server_id eagerly so we can emit structured logs even
# when the request errors out below.
srv = find_server_for_file(file_path)
server_id = srv.server_id if srv else "?"
try:
t = timeout if timeout is not None else self._wait_timeout + 2.0
diags = self._loop.run(self._open_and_wait_async(file_path), timeout=t) or []
except asyncio.TimeoutError as e:
eventlog.log_timeout(server_id, file_path)
logger.debug("LSP diagnostics timeout for %s: %s", file_path, e)
self._mark_broken_for_file(file_path, e)
return []
except Exception as e: # noqa: BLE001
eventlog.log_server_error(server_id, file_path, e)
logger.debug("LSP diagnostics fetch failed for %s: %s", file_path, e)
self._mark_broken_for_file(file_path, e)
return []
abs_path = os.path.abspath(file_path)
if delta:
baseline = self._delta_baseline.get(abs_path) or []
if baseline:
seen = {_diag_key(d) for d in baseline}
diags = [d for d in diags if _diag_key(d) not in seen]
# Roll baseline forward — next call returns deltas relative
# to the just-emitted state, mirroring claude-code's
# diagnosticTracking.
try:
fresh = self._loop.run(self._current_diags_async(file_path), timeout=2.0) or []
except Exception: # noqa: BLE001
fresh = []
if fresh:
self._delta_baseline[abs_path] = fresh
if diags:
eventlog.log_diagnostics(server_id, file_path, len(diags))
else:
eventlog.log_clean(server_id, file_path)
return diags
def _mark_broken_for_file(self, file_path: str, exc: BaseException) -> None:
"""Mark the (server_id, workspace_root) pair as broken so subsequent
edits skip it instantly instead of re-paying timeout cost.
Called when the outer ``_loop.run`` timeout cancels an in-flight
spawn/initialize that the inner ``_get_or_spawn`` task was still
holding open. Without this, every subsequent write would re-enter
the spawn path and re-pay the full ``snapshot_baseline``
timeout (8s) until the binary is fixed.
Also kills any orphan client process that survived the cancelled
future, and emits a single eventlog WARNING so the user knows
which server gave up.
``exc`` is whatever exception the outer wrapper caught used
only for logging, never re-raised.
"""
srv = find_server_for_file(file_path)
if srv is None:
return
ws_root, gated = resolve_workspace_for_file(file_path)
if not (ws_root and gated):
return
try:
per_server_root = srv.resolve_root(file_path, ws_root) or ws_root
except Exception: # noqa: BLE001
per_server_root = ws_root
key = (srv.server_id, per_server_root)
already_broken = key in self._broken
self._broken.add(key)
# Kill any client we managed to spawn before the timeout. The
# cancelled future never reached the broken-set add inside
# ``_get_or_spawn`` so the client may still be hanging in
# ``_clients`` with a half-initialized state.
with self._state_lock:
client = self._clients.pop(key, None)
if client is not None:
try:
# Fire-and-forget shutdown — give it a second to cleanup,
# but don't block. We're already on a slow path.
self._loop.run(client.shutdown(), timeout=1.0)
except Exception: # noqa: BLE001
pass
if not already_broken:
eventlog.log_spawn_failed(srv.server_id, per_server_root, exc)
def shutdown(self) -> None:
"""Tear down all clients and stop the background loop."""
if not self._enabled:
return
try:
self._loop.run(self._shutdown_async(), timeout=10.0)
except Exception as e: # noqa: BLE001
logger.debug("LSP shutdown error: %s", e)
self._loop.stop()
clear_cache()
# ------------------------------------------------------------------
# async internals
# ------------------------------------------------------------------
async def _snapshot_async(self, file_path: str) -> List[Dict[str, Any]]:
client = await self._get_or_spawn(file_path)
if client is None:
return []
try:
version = await client.open_file(file_path, language_id=language_id_for(file_path))
await client.wait_for_diagnostics(file_path, version, mode=self._wait_mode)
except Exception as e: # noqa: BLE001
logger.debug("snapshot open/wait failed: %s", e)
return []
self._last_used[(client.server_id, client.workspace_root)] = time.time()
return list(client.diagnostics_for(file_path))
async def _open_and_wait_async(self, file_path: str) -> List[Dict[str, Any]]:
client = await self._get_or_spawn(file_path)
if client is None:
return []
try:
version = await client.open_file(file_path, language_id=language_id_for(file_path))
await client.save_file(file_path)
await client.wait_for_diagnostics(file_path, version, mode=self._wait_mode)
except Exception as e: # noqa: BLE001
logger.debug("open/wait failed for %s: %s", file_path, e)
return []
self._last_used[(client.server_id, client.workspace_root)] = time.time()
return list(client.diagnostics_for(file_path))
async def _current_diags_async(self, file_path: str) -> List[Dict[str, Any]]:
ws, gated = resolve_workspace_for_file(file_path)
srv = find_server_for_file(file_path)
if not (ws and gated and srv):
return []
with self._state_lock:
client = self._clients.get((srv.server_id, ws))
if client is None:
return []
return list(client.diagnostics_for(file_path))
async def _get_or_spawn(self, file_path: str) -> Optional[LSPClient]:
srv = find_server_for_file(file_path)
if srv is None:
return None
if srv.server_id in self._disabled_servers:
eventlog.log_disabled(srv.server_id, file_path, "disabled in config")
return None
ws_root, gated = resolve_workspace_for_file(file_path)
if not (ws_root and gated):
eventlog.log_no_project_root(srv.server_id, file_path)
return None
per_server_root = srv.resolve_root(file_path, ws_root)
if per_server_root is None:
eventlog.log_disabled(
srv.server_id, file_path, "exclude marker hit (server gated off)"
)
return None # exclude marker hit, server gated off
key = (srv.server_id, per_server_root)
if key in self._broken:
return None
with self._state_lock:
client = self._clients.get(key)
if client is not None and client.is_running:
eventlog.log_active(srv.server_id, per_server_root)
return client
spawning = self._spawning.get(key)
if spawning is not None:
try:
return await spawning
except Exception: # noqa: BLE001
return None
# Begin spawn
loop = asyncio.get_running_loop()
spawn_future: asyncio.Future = loop.create_future()
with self._state_lock:
self._spawning[key] = spawn_future
try:
ctx = ServerContext(
workspace_root=per_server_root,
install_strategy=self._install_strategy,
binary_overrides=self._binary_overrides,
env_overrides=self._env_overrides,
init_overrides=self._init_overrides,
)
spec = srv.build_spawn(per_server_root, ctx)
if spec is None:
# ``build_spawn`` returns None when the binary can't be
# located (auto-install disabled, manual-only server,
# or install attempt failed). Surface this once via
# the structured logger so the user can act on it.
eventlog.log_server_unavailable(srv.server_id, srv.server_id)
self._broken.add(key)
spawn_future.set_result(None)
return None
client = LSPClient(
server_id=srv.server_id,
workspace_root=spec.workspace_root,
command=spec.command,
env=spec.env,
cwd=spec.cwd,
initialization_options=spec.initialization_options,
seed_diagnostics_on_first_push=spec.seed_diagnostics_on_first_push or srv.seed_first_push,
)
try:
await client.start()
except Exception as e: # noqa: BLE001
eventlog.log_spawn_failed(srv.server_id, per_server_root, e)
self._broken.add(key)
spawn_future.set_result(None)
return None
with self._state_lock:
self._clients[key] = client
self._last_used[key] = time.time()
eventlog.log_active(srv.server_id, per_server_root)
spawn_future.set_result(client)
return client
finally:
with self._state_lock:
self._spawning.pop(key, None)
async def _shutdown_async(self) -> None:
with self._state_lock:
clients = list(self._clients.values())
self._clients.clear()
self._broken.clear()
self._last_used.clear()
await asyncio.gather(
*(c.shutdown() for c in clients),
return_exceptions=True,
)
# ------------------------------------------------------------------
# status / introspection (used by ``hermes lsp status``)
# ------------------------------------------------------------------
def get_status(self) -> Dict[str, Any]:
"""Return a snapshot of the service for the CLI status command."""
with self._state_lock:
clients = [
{
"server_id": k[0],
"workspace_root": k[1],
"state": c.state,
"running": c.is_running,
}
for k, c in self._clients.items()
]
broken = list(self._broken)
return {
"enabled": self._enabled,
"wait_mode": self._wait_mode,
"wait_timeout": self._wait_timeout,
"install_strategy": self._install_strategy,
"clients": clients,
"broken": broken,
"disabled_servers": sorted(self._disabled_servers),
}
def _diag_key(d: Dict[str, Any]) -> str:
"""Content equality key used for delta filtering. Mirrors
:func:`agent.lsp.client._diagnostic_key`."""
rng = d.get("range") or {}
start = rng.get("start") or {}
end = rng.get("end") or {}
code = d.get("code")
if code is not None and not isinstance(code, str):
code = str(code)
return "\x00".join(
[
str(d.get("severity") or 1),
str(code or ""),
str(d.get("source") or ""),
str(d.get("message") or "").strip(),
f"{start.get('line', 0)}:{start.get('character', 0)}-{end.get('line', 0)}:{end.get('character', 0)}",
]
)
__all__ = ["LSPService"]
-196
View File
@@ -1,196 +0,0 @@
"""Minimal LSP JSON-RPC 2.0 framer over async streams.
LSP wire format:
Content-Length: <bytes>\\r\\n
\\r\\n
<utf-8 JSON body>
The body is a JSON-RPC 2.0 envelope: request, response, or notification.
This module replaces what ``vscode-jsonrpc/node`` would do in a
TypeScript implementation. We keep it deliberately small just the
framer + envelope helpers so :class:`agent.lsp.client.LSPClient` can
focus on protocol semantics.
"""
from __future__ import annotations
import asyncio
import json
import logging
from typing import Any, Optional, Tuple
logger = logging.getLogger("agent.lsp.protocol")
# LSP error codes we care about. Full list in
# https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#errorCodes
ERROR_CONTENT_MODIFIED = -32801
ERROR_REQUEST_CANCELLED = -32800
ERROR_METHOD_NOT_FOUND = -32601
class LSPProtocolError(Exception):
"""Raised when the wire protocol is violated.
Distinct from :class:`LSPRequestError` which represents a server
returning a JSON-RPC error response that's protocol-conformant.
This exception means the framing or envelope itself is broken.
"""
class LSPRequestError(Exception):
"""Raised when an LSP request returns an error response.
Carries the JSON-RPC ``code``, ``message``, and optional ``data``.
"""
def __init__(self, code: int, message: str, data: Any = None) -> None:
super().__init__(f"LSP error {code}: {message}")
self.code = code
self.message = message
self.data = data
def encode_message(obj: dict) -> bytes:
"""Encode a JSON-RPC envelope as a Content-Length framed byte string.
The body is encoded as compact UTF-8 JSON (no spaces between
separators) matches what ``vscode-jsonrpc`` emits and keeps the
Content-Length count exact.
"""
body = json.dumps(obj, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
return header + body
async def read_message(reader: asyncio.StreamReader) -> Optional[dict]:
"""Read one Content-Length framed JSON-RPC message from the stream.
Returns ``None`` on clean EOF (server closed stdout cleanly between
messages typical shutdown). Raises :class:`LSPProtocolError` on
malformed framing.
The reader is advanced to just past the JSON body on success.
"""
headers: dict = {}
header_bytes = 0
while True:
try:
line = await reader.readuntil(b"\r\n")
except asyncio.IncompleteReadError as e:
# EOF while reading headers. If we hadn't started a header
# block, treat as clean EOF; otherwise the framing is bad.
if not e.partial and not headers:
return None
raise LSPProtocolError(
f"unexpected EOF while reading LSP headers (partial={e.partial!r})"
) from e
# Defensive cap against a server streaming headers without ever
# emitting CRLF-CRLF. Caps total header bytes at 8 KiB — a
# well-behaved server fits in well under 200 bytes.
header_bytes += len(line)
if header_bytes > 8192:
raise LSPProtocolError(
f"LSP header block exceeded 8 KiB without terminator"
)
line = line[:-2] # strip CRLF
if not line:
break # blank line ends header block
try:
key, _, value = line.decode("ascii").partition(":")
except UnicodeDecodeError as e:
raise LSPProtocolError(f"non-ASCII LSP header: {line!r}") from e
if not key:
raise LSPProtocolError(f"malformed LSP header line: {line!r}")
headers[key.strip().lower()] = value.strip()
cl = headers.get("content-length")
if cl is None:
raise LSPProtocolError(f"LSP message missing Content-Length: {headers!r}")
try:
n = int(cl)
except ValueError as e:
raise LSPProtocolError(f"non-integer Content-Length: {cl!r}") from e
if n < 0 or n > 64 * 1024 * 1024: # 64 MiB sanity cap
raise LSPProtocolError(f"unreasonable Content-Length: {n}")
try:
body = await reader.readexactly(n)
except asyncio.IncompleteReadError as e:
raise LSPProtocolError(
f"truncated LSP body: expected {n} bytes, got {len(e.partial)}"
) from e
try:
return json.loads(body.decode("utf-8"))
except json.JSONDecodeError as e:
raise LSPProtocolError(f"invalid JSON in LSP body: {e}") from e
except UnicodeDecodeError as e:
raise LSPProtocolError(f"non-UTF-8 LSP body: {e}") from e
def make_request(req_id: int, method: str, params: Any) -> dict:
"""Build a JSON-RPC 2.0 request envelope."""
msg: dict = {"jsonrpc": "2.0", "id": req_id, "method": method}
if params is not None:
msg["params"] = params
return msg
def make_notification(method: str, params: Any) -> dict:
"""Build a JSON-RPC 2.0 notification envelope (no ``id``)."""
msg: dict = {"jsonrpc": "2.0", "method": method}
if params is not None:
msg["params"] = params
return msg
def make_response(req_id: Any, result: Any) -> dict:
"""Build a JSON-RPC 2.0 success response envelope."""
return {"jsonrpc": "2.0", "id": req_id, "result": result}
def make_error_response(req_id: Any, code: int, message: str, data: Any = None) -> dict:
"""Build a JSON-RPC 2.0 error response envelope."""
err: dict = {"code": code, "message": message}
if data is not None:
err["data"] = data
return {"jsonrpc": "2.0", "id": req_id, "error": err}
def classify_message(msg: dict) -> Tuple[str, Any]:
"""Return ``(kind, key)`` where kind is one of ``request``,
``response``, ``notification``, ``invalid``.
The key is the request id for request/response, the method name
for notifications, and ``None`` for invalid messages.
"""
if not isinstance(msg, dict):
return "invalid", None
if msg.get("jsonrpc") != "2.0":
return "invalid", None
has_id = "id" in msg
has_method = "method" in msg
if has_id and has_method:
return "request", msg["id"]
if has_id and ("result" in msg or "error" in msg):
return "response", msg["id"]
if has_method and not has_id:
return "notification", msg["method"]
return "invalid", None
__all__ = [
"ERROR_CONTENT_MODIFIED",
"ERROR_REQUEST_CANCELLED",
"ERROR_METHOD_NOT_FOUND",
"LSPProtocolError",
"LSPRequestError",
"encode_message",
"read_message",
"make_request",
"make_notification",
"make_response",
"make_error_response",
"classify_message",
]
-78
View File
@@ -1,78 +0,0 @@
"""Format LSP diagnostics for inclusion in tool output.
The model sees a compact, severity-filtered, line-bounded summary of
diagnostics introduced by the latest edit. Format matches what
OpenCode's ``lsp/diagnostic.ts`` and Claude Code's
``formatDiagnosticsSummary`` produce ``<diagnostics>`` blocks with
1-indexed line/column, capped at ``MAX_PER_FILE`` errors.
"""
from __future__ import annotations
from typing import Any, Dict, List
# Severity-1 only by default — warnings/info/hints would flood the
# agent. Lift this in config under ``lsp.severities`` if needed.
SEVERITY_NAMES = {1: "ERROR", 2: "WARN", 3: "INFO", 4: "HINT"}
DEFAULT_SEVERITIES = frozenset({1}) # ERROR only
MAX_PER_FILE = 20
MAX_TOTAL_CHARS = 4000
def format_diagnostic(d: Dict[str, Any]) -> str:
"""One-line representation of a single diagnostic."""
sev = SEVERITY_NAMES.get(d.get("severity") or 1, "ERROR")
rng = d.get("range") or {}
start = rng.get("start") or {}
line = int(start.get("line", 0)) + 1
col = int(start.get("character", 0)) + 1
msg = str(d.get("message") or "").rstrip()
code = d.get("code")
code_part = f" [{code}]" if code not in (None, "") else ""
source = d.get("source")
source_part = f" ({source})" if source else ""
return f"{sev} [{line}:{col}] {msg}{code_part}{source_part}"
def report_for_file(
file_path: str,
diagnostics: List[Dict[str, Any]],
*,
severities: frozenset = DEFAULT_SEVERITIES,
max_per_file: int = MAX_PER_FILE,
) -> str:
"""Build a ``<diagnostics file=...>`` block for one file.
Returns an empty string when no diagnostics pass the severity
filter, so callers can do ``if block:`` to skip empty cases.
"""
if not diagnostics:
return ""
filtered = [d for d in diagnostics if (d.get("severity") or 1) in severities]
if not filtered:
return ""
limited = filtered[:max_per_file]
extra = len(filtered) - len(limited)
lines = [format_diagnostic(d) for d in limited]
body = "\n".join(lines)
if extra > 0:
body += f"\n... and {extra} more"
return f"<diagnostics file=\"{file_path}\">\n{body}\n</diagnostics>"
def truncate(s: str, *, limit: int = MAX_TOTAL_CHARS) -> str:
"""Hard-cap a formatted summary string."""
if len(s) <= limit:
return s
marker = "\n…[truncated]"
return s[: limit - len(marker)] + marker
__all__ = [
"SEVERITY_NAMES",
"DEFAULT_SEVERITIES",
"MAX_PER_FILE",
"format_diagnostic",
"report_for_file",
"truncate",
]
-1040
View File
File diff suppressed because it is too large Load Diff
-223
View File
@@ -1,223 +0,0 @@
"""Workspace and project-root resolution for LSP.
Two concerns live here:
1. **Workspace gate** the upper-level "is this directory a project?"
check. Hermes only runs LSP when the cwd (or the file being edited)
sits inside a git worktree. Files outside any git root never
trigger LSP, even if a server is configured. This keeps Telegram
gateway users on user-home cwd's from spawning daemons.
2. **NearestRoot** the per-server project-root walk. Each language
server cares about a different marker (``pyproject.toml`` for
Python, ``Cargo.toml`` for Rust, ``go.mod`` for Go, etc.) and
wants the directory containing that marker. ``nearest_root()``
walks up from a starting path looking for any of a list of marker
files, optionally bailing if an exclude marker shows up first.
"""
from __future__ import annotations
import logging
import os
from pathlib import Path
from typing import Iterable, Optional, Tuple
logger = logging.getLogger("agent.lsp.workspace")
# Cache: cwd → (worktree_root, is_git) so repeated calls don't re-stat.
# Cleared on shutdown. Keyed by absolute resolved path so symlink
# folds collapse to one entry.
_workspace_cache: dict = {}
def normalize_path(path: str) -> str:
"""Normalize a path for use as a stable map key.
Resolves ``~``, makes absolute, and collapses ``.``/``..``. We do
NOT resolve symlinks here symlink stability matters for some
LSP servers (rust-analyzer cares about Cargo workspace identity)
and we want the canonical path the user typed when possible.
"""
return os.path.abspath(os.path.expanduser(path))
def find_git_worktree(start: str) -> Optional[str]:
"""Walk up from ``start`` looking for a ``.git`` entry (file or dir).
Returns the directory containing ``.git``, or ``None`` if no git
root is found before hitting the filesystem root.
A ``.git`` *file* (not directory) means we're inside a git
worktree set up via ``git worktree add`` both forms count.
"""
try:
start_path = Path(normalize_path(start))
if start_path.is_file():
start_path = start_path.parent
except (OSError, RuntimeError, ValueError):
# Pathological input (loop in symlinks, encoding error, etc.) —
# bail out rather than crash the lint hook.
return None
# Cache check
cached = _workspace_cache.get(str(start_path))
if cached is not None:
root, _is_git = cached
return root
cur = start_path
# Defensive cap: the deepest reasonable monorepo is well under 64
# levels. Caps the walk so a pathological cwd or a symlink cycle
# we somehow traverse can't keep us looping.
for _ in range(64):
git_marker = cur / ".git"
try:
if git_marker.exists():
resolved = str(cur)
_workspace_cache[str(start_path)] = (resolved, True)
return resolved
except OSError:
# Permission error on a parent dir — bail out cleanly.
break
parent = cur.parent
if parent == cur:
break
cur = parent
_workspace_cache[str(start_path)] = (None, False)
return None
def is_inside_workspace(path: str, workspace_root: str) -> bool:
"""Return True iff ``path`` is inside (or equal to) ``workspace_root``.
Uses absolute paths but does not resolve symlinks a file accessed
via a symlink that points outside the workspace still counts as
outside. This is the conservative interpretation; matches LSP
behaviour where servers reject didOpen for unrelated files.
"""
p = normalize_path(path)
root = normalize_path(workspace_root)
if p == root:
return True
# Use os.path.commonpath to handle case-insensitive filesystems
# correctly on macOS/Windows.
try:
common = os.path.commonpath([p, root])
except ValueError:
# Different drives on Windows.
return False
return common == root
def nearest_root(
start: str,
markers: Iterable[str],
*,
excludes: Optional[Iterable[str]] = None,
ceiling: Optional[str] = None,
) -> Optional[str]:
"""Walk up from ``start`` looking for any of the given marker files.
Returns the **directory containing** the first matched marker, or
``None`` if no marker is found before hitting ``ceiling`` (or the
filesystem root if no ceiling).
If ``excludes`` is provided and an exclude marker matches *first*
in the upward walk, returns ``None`` the server is gated off
for that file. Mirrors OpenCode's NearestRoot exclude semantics
(e.g. typescript skips deno projects when ``deno.json`` is found
before ``package.json``).
"""
start_path = Path(normalize_path(start))
try:
if start_path.is_file():
start_path = start_path.parent
except (OSError, RuntimeError, ValueError):
return None
ceiling_path = Path(normalize_path(ceiling)) if ceiling else None
markers_list = list(markers)
excludes_list = list(excludes) if excludes else []
cur = start_path
# Defensive cap matching ``find_git_worktree``. Bounded walk
# protects against pathological inputs even though the
# parent-equality stop normally terminates within ~10 steps.
for _ in range(64):
# Check excludes first — if an exclude is found at this level,
# the server is gated off for this file.
for exc in excludes_list:
try:
if (cur / exc).exists():
return None
except OSError:
continue
# Then check markers.
for marker in markers_list:
try:
if (cur / marker).exists():
return str(cur)
except OSError:
continue
# Stop conditions.
if ceiling_path is not None and cur == ceiling_path:
return None
parent = cur.parent
if parent == cur:
return None
cur = parent
return None
def resolve_workspace_for_file(
file_path: str,
*,
cwd: Optional[str] = None,
) -> Tuple[Optional[str], bool]:
"""Resolve the workspace root for a file.
Returns ``(workspace_root, gated_in)`` where ``gated_in`` is True
iff LSP should run for this file at all. Currently the gate is
"file is inside a git worktree found by walking up from cwd OR
from the file itself".
The cwd path takes precedence if the agent was launched in a
git project, that worktree is the workspace, and any edit inside
it (regardless of where the file lives) is in-scope. If the cwd
isn't in a git worktree, we try the file's own location as a
fallback.
Returns ``(None, False)`` when neither path is in a git worktree.
"""
cwd = cwd or os.getcwd()
cwd_root = find_git_worktree(cwd)
if cwd_root is not None:
if is_inside_workspace(file_path, cwd_root):
return cwd_root, True
# File is outside the cwd's worktree — try the file's own
# location as a secondary anchor. Useful for monorepos where
# the user opens an unrelated checkout.
file_root = find_git_worktree(file_path)
if file_root is not None:
return file_root, True
return None, False
def clear_cache() -> None:
"""Clear the workspace-resolution cache.
Called on service shutdown so a subsequent re-init doesn't pick
up stale results from a previous session.
"""
_workspace_cache.clear()
__all__ = [
"find_git_worktree",
"is_inside_workspace",
"nearest_root",
"normalize_path",
"resolve_workspace_for_file",
"clear_cache",
]
+27 -117
View File
@@ -10,7 +10,7 @@ import os
import re
import time
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
from typing import Any, Dict, List, Optional
from urllib.parse import urlparse
import requests
@@ -47,7 +47,7 @@ def _resolve_requests_verify() -> bool | str:
_PROVIDER_PREFIXES: frozenset[str] = frozenset({
"openrouter", "nous", "openai-codex", "copilot", "copilot-acp",
"gemini", "ollama-cloud", "zai", "kimi-coding", "kimi-coding-cn", "stepfun", "minimax", "minimax-oauth", "minimax-cn", "anthropic", "deepseek",
"opencode-zen", "opencode-go", "ai-gateway", "kilocode", "alibaba", "novita",
"opencode-zen", "opencode-go", "ai-gateway", "kilocode", "alibaba",
"qwen-oauth",
"xiaomi",
"arcee",
@@ -66,7 +66,7 @@ _PROVIDER_PREFIXES: frozenset[str] = frozenset({
"gmi-cloud", "gmicloud",
"xai", "x-ai", "x.ai", "grok",
"nvidia", "nim", "nvidia-nim", "nemotron",
"qwen-portal", "novita-ai", "novitaai",
"qwen-portal",
})
@@ -104,8 +104,6 @@ def _strip_provider_prefix(model: str) -> str:
_model_metadata_cache: Dict[str, Dict[str, Any]] = {}
_model_metadata_cache_time: float = 0
_novita_metadata_cache: Dict[str, Dict[str, Any]] = {}
_novita_metadata_cache_time: float = 0
_MODEL_CACHE_TTL = 3600
_endpoint_model_metadata_cache: Dict[str, Dict[str, Dict[str, Any]]] = {}
_endpoint_model_metadata_cache_time: Dict[str, float] = {}
@@ -287,7 +285,6 @@ def grok_supports_reasoning_effort(model: str) -> bool:
_CONTEXT_LENGTH_KEYS = (
"context_length",
"context_window",
"context_size",
"max_context_length",
"max_position_embeddings",
"max_model_len",
@@ -364,7 +361,6 @@ _URL_TO_PROVIDER: Dict[str, str] = {
"api.xiaomimimo.com": "xiaomi",
"xiaomimimo.com": "xiaomi",
"api.gmi-serving.com": "gmi",
"api.novita.ai": "novita",
"tokenhub.tencentmaas.com": "tencent-tokenhub",
"ollama.com": "ollama-cloud",
}
@@ -561,16 +557,6 @@ def _extract_max_completion_tokens(payload: Dict[str, Any]) -> Optional[int]:
def _extract_pricing(payload: Dict[str, Any]) -> Dict[str, Any]:
novita_input = payload.get("input_token_price_per_m")
novita_output = payload.get("output_token_price_per_m")
if novita_input is not None or novita_output is not None:
pricing: Dict[str, Any] = {}
if novita_input is not None:
pricing["prompt"] = str(float(novita_input) / 10_000 / 1_000_000)
if novita_output is not None:
pricing["completion"] = str(float(novita_output) / 10_000 / 1_000_000)
return pricing
alias_map = {
"prompt": ("prompt", "input", "input_cost_per_token", "prompt_token_cost"),
"completion": ("completion", "output", "output_cost_per_token", "completion_token_cost"),
@@ -1344,66 +1330,27 @@ def _resolve_codex_oauth_context_length(
return None
def _resolve_nous_context_length(
model: str,
base_url: str = "",
api_key: str = "",
) -> Tuple[Optional[int], str]:
"""Resolve Nous Portal model context length.
def _resolve_nous_context_length(model: str) -> Optional[int]:
"""Resolve Nous Portal model context length via OpenRouter metadata.
Tries the live Nous inference endpoint first (authoritative), then falls
back to OpenRouter metadata with suffix/version matching.
Nous model IDs are bare after prefix-stripping (e.g. 'qwen3.6-plus',
'claude-opus-4-6') while OpenRouter uses prefixed IDs (e.g.
'qwen/qwen3.6-plus', 'anthropic/claude-opus-4.6'). Version
normalization (dotdash) is applied to handle name drifts.
Returns ``(context_length, source)`` where ``source`` is one of:
- ``"portal"`` live /v1/models response (authoritative)
- ``"openrouter"`` OpenRouter cache fallback (non-authoritative;
callers must NOT persist this to the on-disk cache or a single
portal blip will freeze the wrong value in forever)
- ``""`` could not resolve
Nous model IDs are bare (e.g. 'claude-opus-4-6') while OpenRouter uses
prefixed IDs (e.g. 'anthropic/claude-opus-4.6'). Try suffix matching
with version normalization (dotdash).
"""
# Portal first — the Nous /models endpoint is authoritative for what our
# infrastructure enforces and may differ from OR (e.g. OR reports 1M for
# qwen3.6-plus; the portal correctly says 262144). Fall back to the OR
# catalog only if the portal doesn't list the model.
if base_url:
portal_ctx = _resolve_endpoint_context_length(model, base_url, api_key=api_key)
if portal_ctx is not None:
return portal_ctx, "portal"
metadata = fetch_model_metadata()
def _safe_ctx(or_id: str, entry: dict) -> Optional[int]:
ctx = entry.get("context_length")
if ctx is None:
return None
if ctx <= 32768 and _model_name_suggests_kimi(or_id):
logger.info(
"Rejecting OpenRouter metadata context=%s for %r "
"(Kimi-family underreport, Nous path); falling through to hardcoded defaults",
ctx, or_id,
)
return None
return ctx
metadata = fetch_model_metadata() # OpenRouter cache
# Exact match first
if model in metadata:
ctx = _safe_ctx(model, metadata[model])
if ctx is not None:
return ctx, "openrouter"
return metadata[model].get("context_length")
normalized = _normalize_model_version(model).lower()
for or_id, entry in metadata.items():
bare = or_id.split("/", 1)[1] if "/" in or_id else or_id
if bare.lower() == model.lower() or _normalize_model_version(bare).lower() == normalized:
ctx = _safe_ctx(or_id, entry)
if ctx is not None:
return ctx, "openrouter"
return entry.get("context_length")
# Partial prefix match for cases like gemini-3-flash → gemini-3-flash-preview
# Require match to be at a word boundary (followed by -, :, or end of string)
model_lower = model.lower()
for or_id, entry in metadata.items():
bare = or_id.split("/", 1)[1] if "/" in or_id else or_id
@@ -1411,11 +1358,9 @@ def _resolve_nous_context_length(
if candidate.startswith(query) and (
len(candidate) == len(query) or candidate[len(query)] in "-:."
):
ctx = _safe_ctx(or_id, entry)
if ctx is not None:
return ctx, "openrouter"
return entry.get("context_length")
return None, ""
return None
def get_model_context_length(
@@ -1430,18 +1375,14 @@ def get_model_context_length(
Resolution order:
0. Explicit config override (model.context_length or custom_providers per-model)
1. Persistent cache (previously discovered via probing). Nous URLs
bypass the cache here so step 5b can always reconcile against
the authoritative portal /v1/models response.
1. Persistent cache (previously discovered via probing)
1b. AWS Bedrock static table (must precede custom-endpoint probe)
2. Active endpoint metadata (/models for explicit custom endpoints)
3. Local server query (for local endpoints)
4. Anthropic /v1/models API (API-key users only, not OAuth)
5. Provider-aware lookups (before generic OpenRouter cache):
a. Copilot live /models API
b. Nous: live /v1/models probe first (authoritative), then OR
cache fallback with suffix/version normalisation. Only
portal-derived values are persisted to disk.
b. Nous suffix-match via OpenRouter cache
c. Codex OAuth /models probe
d. GMI /models endpoint
e. Ollama native /api/show probe (any base_url, provider-agnostic)
@@ -1496,28 +1437,6 @@ def get_model_context_length(
model, base_url, f"{cached:,}",
)
_invalidate_cached_context_length(model, base_url)
# Invalidate stale 32k cache entries for Kimi-family models.
elif cached <= 32768 and _model_name_suggests_kimi(model):
logger.info(
"Dropping stale Kimi cache entry %s@%s -> %s (OpenRouter underreport); "
"re-resolving via hardcoded defaults",
model, base_url, f"{cached:,}",
)
_invalidate_cached_context_length(model, base_url)
# Nous Portal: the portal /v1/models endpoint is authoritative.
# Bypass the persistent cache so step 5b can always reconcile
# against it — this corrects pre-fix entries seeded from the
# OR catalog (the same OR underreport class that the Kimi/Qwen
# DEFAULT_CONTEXT_LENGTHS overrides exist to mitigate) without
# touching the on-disk file when the portal is unreachable.
# The in-memory 300s endpoint metadata cache makes the per-call
# cost amortise to ~0 within a process.
elif _infer_provider_from_url(base_url) == "nous":
logger.debug(
"Bypassing persistent cache for %s@%s (Nous portal authoritative)",
model, base_url,
)
# Fall through; step 5b reconciles and overwrites if portal responds.
else:
return cached
@@ -1541,13 +1460,6 @@ def get_model_context_length(
except ImportError:
pass # boto3 not installed — fall through to generic resolution
if provider == "novita" or (base_url and base_url_host_matches(base_url, "api.novita.ai")):
ctx = _resolve_endpoint_context_length(model, base_url or "https://api.novita.ai/openai/v1", api_key=api_key)
if ctx is not None:
if base_url:
save_context_length(model, base_url, ctx)
return ctx
# 2. Active endpoint metadata for truly custom/unknown endpoints.
# Known providers (Copilot, OpenAI, Anthropic, etc.) skip this — their
# /models endpoint may report a provider-imposed limit (e.g. Copilot
@@ -1616,18 +1528,8 @@ def get_model_context_length(
pass # Fall through to models.dev
if effective_provider == "nous":
ctx, source = _resolve_nous_context_length(
model, base_url=base_url or "", api_key=api_key or ""
)
ctx = _resolve_nous_context_length(model)
if ctx:
# Persist ONLY portal-derived values. Caching an OR-fallback
# value here would freeze in a wrong number on the first portal
# blip / auth glitch and step-1 would short-circuit it forever.
# OR's catalog is community-maintained and is precisely why the
# Kimi/Qwen DEFAULT_CONTEXT_LENGTHS overrides exist — we don't
# want it leaking into the persistent cache for Nous URLs.
if base_url and source == "portal":
save_context_length(model, base_url, ctx)
return ctx
if effective_provider == "openai-codex":
# Codex OAuth enforces lower context limits than the direct OpenAI
@@ -1673,6 +1575,14 @@ def get_model_context_length(
if model in metadata:
or_ctx = metadata[model].get("context_length", DEFAULT_FALLBACK_CONTEXT)
# Guard against stale OpenRouter metadata for Kimi-family models.
# OpenRouter reports 32768 for moonshotai/kimi-k2.6, but the model
# actually supports 262144 (models.dev + official Kimi docs agree).
# Providers that host their own Kimi endpoints (Ollama Cloud, Kimi
# Coding, Moonshot) would otherwise trip the 64k minimum-context
# guard and reject a perfectly capable model.
# The filter is narrow: only reject exactly 32768 for Kimi-named
# models. If OpenRouter ever updates its data, the stale path
# becomes dead code with no impact.
if or_ctx == 32768 and _model_name_suggests_kimi(model):
logger.info(
"Rejecting OpenRouter metadata context=%s for %r "
-1
View File
@@ -141,7 +141,6 @@ class ProviderInfo:
# Hermes provider names → models.dev provider IDs
PROVIDER_TO_MODELS_DEV: Dict[str, str] = {
"openrouter": "openrouter",
"novita": "novita-ai",
"anthropic": "anthropic",
"openai": "openai",
"openai-codex": "openai",
-64
View File
@@ -1,64 +0,0 @@
"""Centralized Nous Portal request tags.
Every Hermes request that hits the Nous Portal main agent loop, auxiliary
client (compression / titles / vision / web_extract / session_search / etc.),
and any future code path must carry the same product-attribution tags so
Nous can attribute usage to Hermes Agent and bucket it by client release.
Tag shape (sent in OpenAI-compatible ``extra_body['tags']``):
[
"product=hermes-agent",
"client=hermes-client-v<__version__>",
]
The version is sourced live from ``hermes_cli.__version__`` so it auto-aligns
to whatever release is installed; the release script
(``scripts/release.py``) regex-bumps that single string, and every Portal
request picks up the new tag on the next process start.
Why one helper instead of inlining the literal at each site:
* Four call sites (main loop profile, aux client, run_agent compression
fallback, web_tools fallback) used to drift apart see PR #24194 which
only got the aux site, leaving the main loop sending a different tag set.
* Tests should assert the same tag list everywhere; centralizing makes that
assertion a one-liner against this module.
Do NOT pre-compute these as module-level constants in the consumers. The
version can change at runtime (editable installs, hot-reload tooling), and
``hermes_cli.__version__`` is the canonical source of truth.
"""
from __future__ import annotations
from typing import List
def _hermes_version() -> str:
"""Return the current Hermes release version, e.g. ``"0.13.0"``.
Falls back to ``"unknown"`` if ``hermes_cli`` cannot be imported (should
never happen in a real install guarded for defensive testing).
"""
try:
from hermes_cli import __version__
return __version__
except Exception:
return "unknown"
def hermes_client_tag() -> str:
"""Return the ``client=...`` tag for Nous Portal requests.
Format: ``client=hermes-client-v<MAJOR>.<MINOR>.<PATCH>``.
"""
return f"client=hermes-client-v{_hermes_version()}"
def nous_portal_tags() -> List[str]:
"""Return the canonical list of Nous Portal product tags.
Always returns a fresh list so callers can mutate it freely
(e.g. ``merged_extra.setdefault("tags", []).extend(nous_portal_tags())``).
"""
return ["product=hermes-agent", hermes_client_tag()]
+1 -1
View File
@@ -268,7 +268,7 @@ TOOL_USE_ENFORCEMENT_GUIDANCE = (
# Model name substrings that trigger tool-use enforcement guidance.
# Add new patterns here when a model family needs explicit steering.
TOOL_USE_ENFORCEMENT_MODELS = ("gpt", "codex", "gemini", "gemma", "grok", "glm")
TOOL_USE_ENFORCEMENT_MODELS = ("gpt", "codex", "gemini", "gemma", "grok")
# OpenAI GPT/Codex-specific execution guidance. Addresses known failure modes
# where GPT models abandon work on partial results, skip prerequisite lookups,
+128 -6
View File
@@ -1,15 +1,25 @@
"""Anthropic prompt caching strategy.
"""Anthropic prompt caching strategies.
Single layout: ``system_and_3``. 4 cache_control breakpoints system
prompt + last 3 non-system messages, all at the same TTL (5m or 1h).
Reduces input token costs by ~75% on multi-turn conversations within a
single session.
Two layouts:
* ``system_and_3`` (default, used everywhere except the long-lived path):
4 cache_control breakpoints system prompt + last 3 non-system messages.
All at the same TTL (5m or 1h). Reduces input token costs by ~75% on
multi-turn conversations within a single session.
* ``prefix_and_2`` (Claude on Anthropic / OpenRouter / Nous Portal):
4 breakpoints split across two TTL tiers tools[-1] (1h) +
stable system prefix (1h) + last 2 non-system messages (5m). The
long-lived prefix is byte-stable across sessions for a given user
config, so every fresh session reads the cached system+tools instead
of re-paying for them. Within-session rolling window shrinks from 3
messages to 2 to free the breakpoint budget.
Pure functions -- no class state, no AIAgent dependency.
"""
import copy
from typing import Any, Dict, List
from typing import Any, Dict, List, Optional
def _apply_cache_marker(msg: dict, cache_marker: dict, native_anthropic: bool = False) -> None:
@@ -77,3 +87,115 @@ def apply_anthropic_cache_control(
_apply_cache_marker(messages[idx], marker, native_anthropic=native_anthropic)
return messages
def _mark_system_stable_block(
messages: List[Dict[str, Any]],
long_lived_marker: Dict[str, str],
) -> bool:
"""Mark the *first* content block of the system message with the 1h marker.
The system message is expected to have been split into multiple content
blocks beforehand by the caller block[0] is the cross-session-stable
prefix, subsequent blocks carry context files + volatile suffix.
Falls back to marking the whole system message as a single block when
the message hasn't been split (preserves correctness on the fallback path).
Returns True when a marker was placed.
"""
if not messages or messages[0].get("role") != "system":
return False
sys_msg = messages[0]
content = sys_msg.get("content")
# Already a list of blocks → mark the first block.
if isinstance(content, list) and content:
first = content[0]
if isinstance(first, dict):
first["cache_control"] = long_lived_marker
return True
return False
# String content (no split) → cannot place a stable-prefix breakpoint
# without changing the byte content. Caller is responsible for
# splitting; if they didn't, fall through to envelope marker so we still
# cache *something* for this turn.
if isinstance(content, str) and content:
sys_msg["content"] = [
{"type": "text", "text": content, "cache_control": long_lived_marker}
]
return True
return False
def apply_anthropic_cache_control_long_lived(
api_messages: List[Dict[str, Any]],
long_lived_ttl: str = "1h",
rolling_ttl: str = "5m",
native_anthropic: bool = False,
) -> List[Dict[str, Any]]:
"""Apply prefix_and_2 caching: long-lived stable prefix + rolling window.
Layout (4 breakpoints total):
* Stable system prefix (block[0]) ``long_lived_ttl`` TTL
* Last 2 non-system messages ``rolling_ttl`` TTL each
NOTE: this function does NOT mark the tools array. Tools cache_control
is attached separately (see ``mark_tools_for_long_lived_cache``) because
tools live outside the messages list in the API payload.
The caller MUST have split the system message into ordered content
blocks where block[0] is the cross-session-stable portion. If the system
message is still a single string, it is wrapped into a single block and
marked this is correct, just less effective (the volatile suffix is
not isolated, so the prefix invalidates per-session).
Returns:
Deep copy of messages with cache_control breakpoints injected.
"""
messages = copy.deepcopy(api_messages)
if not messages:
return messages
long_marker = _build_marker(long_lived_ttl)
rolling_marker = _build_marker(rolling_ttl)
placed_prefix = _mark_system_stable_block(messages, long_marker)
# Reserve 1 breakpoint for the system prefix (when placed); spend the
# remaining 3 on the rolling tail. Anthropic max is 4 total —
# tools[-1] (when marked) consumes the 4th, so we cap rolling at 2 here.
rolling_budget = 2 if placed_prefix else 3
non_sys = [i for i in range(len(messages)) if messages[i].get("role") != "system"]
for idx in non_sys[-rolling_budget:]:
_apply_cache_marker(messages[idx], rolling_marker, native_anthropic=native_anthropic)
return messages
def mark_tools_for_long_lived_cache(
tools: Optional[List[Dict[str, Any]]],
long_lived_ttl: str = "1h",
) -> Optional[List[Dict[str, Any]]]:
"""Attach cache_control to the last tool in the OpenAI-format tools list.
Anthropic prefix-cache order is ``tools system messages``. Marking
the last tool dict caches the entire tools array (Anthropic's docs:
"the marker is placed on the last block you want included in the cached
prefix"). Marker is preserved across the OpenAI-wire boundary on
OpenRouter and Nous Portal (which proxies to OpenRouter); on native
Anthropic the marker is forwarded by ``convert_tools_to_anthropic``.
Returns a deep copy of the tools list with the marker attached, or the
input unchanged when tools is empty/None. Pure function does not
mutate the input.
"""
if not tools:
return tools
out = copy.deepcopy(tools)
last = out[-1]
if isinstance(last, dict):
last["cache_control"] = _build_marker(long_lived_ttl)
return out
-3
View File
@@ -14,7 +14,6 @@ from dataclasses import dataclass, field
from typing import Any, Mapping
from utils import safe_json_loads
from agent.tool_result_classification import file_mutation_result_landed
IDEMPOTENT_TOOL_NAMES = frozenset(
@@ -197,8 +196,6 @@ def classify_tool_failure(tool_name: str, result: str | None) -> tuple[bool, str
"""
if result is None:
return False, ""
if file_mutation_result_landed(tool_name, result):
return False, ""
if tool_name == "terminal":
data = safe_json_loads(result)
-26
View File
@@ -1,26 +0,0 @@
"""Shared helpers for classifying tool result payloads."""
from __future__ import annotations
import json
from typing import Any
FILE_MUTATING_TOOL_NAMES = frozenset({"write_file", "patch"})
def file_mutation_result_landed(tool_name: str, result: Any) -> bool:
"""Return True when a file mutation result proves the write landed."""
if tool_name not in FILE_MUTATING_TOOL_NAMES or not isinstance(result, str):
return False
try:
data = json.loads(result.strip())
except Exception:
return False
if not isinstance(data, dict) or data.get("error"):
return False
if tool_name == "write_file":
return "bytes_written" in data
if tool_name == "patch":
return data.get("success") is True
return False
-368
View File
@@ -1,368 +0,0 @@
"""Codex app-server JSON-RPC client.
Speaks the protocol documented in codex-rs/app-server/README.md (codex 0.125+).
Transport is newline-delimited JSON-RPC 2.0 over stdio: spawn `codex app-server`,
do an `initialize` handshake, then drive `thread/start` + `turn/start` and
consume streaming `item/*` notifications until `turn/completed`.
This module is the wire-level speaker only. Higher-level concerns (event
projection into Hermes' display, approval bridging, transcript projection into
AIAgent.messages, plugin migration) live in sibling modules.
Status: optional opt-in runtime gated behind `model.openai_runtime ==
"codex_app_server"`. Hermes' default tool dispatch is unchanged when this
runtime is not selected.
"""
from __future__ import annotations
import json
import os
import queue
import subprocess
import threading
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Optional
# Default minimum codex version we test against. The PR sets this from the
# `codex --version` parsed at install time; bumping is a one-line change here.
MIN_CODEX_VERSION = (0, 125, 0)
@dataclass
class CodexAppServerError(RuntimeError):
"""Raised on JSON-RPC errors from the app-server."""
code: int
message: str
data: Optional[Any] = None
def __str__(self) -> str: # pragma: no cover - trivial
return f"codex app-server error {self.code}: {self.message}"
@dataclass
class _Pending:
queue: queue.Queue
method: str
sent_at: float = field(default_factory=time.time)
class CodexAppServerClient:
"""Minimal JSON-RPC 2.0 client for `codex app-server` over stdio.
Threading model:
- Spawning thread (caller) drives request/response pairs synchronously.
- One reader thread parses stdout, dispatches replies to the right
pending future, and routes notifications + server-initiated requests
to bounded queues that the caller drains on their own cadence.
- One reader thread captures stderr for diagnostics; codex emits
tracing logs there at RUST_LOG-controlled levels.
Intentionally NOT async. AIAgent.run_conversation() is synchronous and
runs on the main thread; layering asyncio just to drive a stdio child
creates surprising interrupt semantics. We use blocking queues with
timeouts and rely on `turn/interrupt` for cancellation.
"""
def __init__(
self,
codex_bin: str = "codex",
codex_home: Optional[str] = None,
extra_args: Optional[list[str]] = None,
env: Optional[dict[str, str]] = None,
) -> None:
self._codex_bin = codex_bin
cmd = [codex_bin, "app-server"] + list(extra_args or [])
spawn_env = os.environ.copy()
if env:
spawn_env.update(env)
if codex_home:
spawn_env["CODEX_HOME"] = codex_home
# Codex emits tracing to stderr; default WARN keeps it quiet for users.
spawn_env.setdefault("RUST_LOG", "warn")
self._proc = subprocess.Popen(
cmd,
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
bufsize=0,
env=spawn_env,
)
self._next_id = 1
self._pending: dict[int, _Pending] = {}
self._pending_lock = threading.Lock()
self._notifications: queue.Queue = queue.Queue()
self._server_requests: queue.Queue = queue.Queue()
self._stderr_lines: list[str] = []
self._stderr_lock = threading.Lock()
self._closed = False
self._initialized = False
self._reader = threading.Thread(target=self._read_stdout, daemon=True)
self._reader.start()
self._stderr_reader = threading.Thread(target=self._read_stderr, daemon=True)
self._stderr_reader.start()
# ---------- lifecycle ----------
def initialize(
self,
client_name: str = "hermes",
client_title: str = "Hermes Agent",
client_version: str = "0.1",
capabilities: Optional[dict] = None,
timeout: float = 10.0,
) -> dict:
"""Send `initialize` + `initialized` handshake. Returns the server's
InitializeResponse (userAgent, codexHome, platformFamily, platformOs)."""
if self._initialized:
raise RuntimeError("already initialized")
params = {
"clientInfo": {
"name": client_name,
"title": client_title,
"version": client_version,
},
"capabilities": capabilities or {},
}
result = self.request("initialize", params, timeout=timeout)
self.notify("initialized")
self._initialized = True
return result
def close(self, timeout: float = 3.0) -> None:
"""Close stdin and wait for the subprocess to exit, escalating to kill."""
if self._closed:
return
self._closed = True
try:
if self._proc.stdin and not self._proc.stdin.closed:
self._proc.stdin.close()
except Exception:
pass
try:
self._proc.terminate()
self._proc.wait(timeout=timeout)
except subprocess.TimeoutExpired:
try:
self._proc.kill()
self._proc.wait(timeout=1.0)
except Exception:
pass
def __enter__(self) -> "CodexAppServerClient":
return self
def __exit__(self, *exc: Any) -> None:
self.close()
# ---------- send/receive ----------
def request(
self,
method: str,
params: Optional[dict] = None,
timeout: float = 30.0,
) -> dict:
"""Send a JSON-RPC request and block on the response. Returns `result`,
raises CodexAppServerError on `error`."""
rid = self._take_id()
q: queue.Queue = queue.Queue(maxsize=1)
with self._pending_lock:
self._pending[rid] = _Pending(queue=q, method=method)
self._send({"id": rid, "method": method, "params": params or {}})
try:
msg = q.get(timeout=timeout)
except queue.Empty:
with self._pending_lock:
self._pending.pop(rid, None)
raise TimeoutError(
f"codex app-server method {method!r} timed out after {timeout}s"
)
if "error" in msg:
err = msg["error"]
raise CodexAppServerError(
code=err.get("code", -1),
message=err.get("message", ""),
data=err.get("data"),
)
return msg.get("result", {})
def notify(self, method: str, params: Optional[dict] = None) -> None:
"""Send a JSON-RPC notification (no id, no response expected)."""
self._send({"method": method, "params": params or {}})
def respond(self, request_id: Any, result: dict) -> None:
"""Reply to a server-initiated request (e.g. approval prompts)."""
self._send({"id": request_id, "result": result})
def respond_error(
self, request_id: Any, code: int, message: str, data: Optional[Any] = None
) -> None:
"""Reply to a server-initiated request with an error."""
err: dict[str, Any] = {"code": code, "message": message}
if data is not None:
err["data"] = data
self._send({"id": request_id, "error": err})
def take_notification(self, timeout: float = 0.0) -> Optional[dict]:
"""Pop the next streaming notification, or return None on timeout.
timeout=0.0 means non-blocking. Use small positive timeouts inside the
AIAgent turn loop to interleave reads with interrupt checks."""
try:
if timeout <= 0:
return self._notifications.get_nowait()
return self._notifications.get(timeout=timeout)
except queue.Empty:
return None
def take_server_request(self, timeout: float = 0.0) -> Optional[dict]:
"""Pop the next server-initiated request (e.g. exec/applyPatch approval)."""
try:
if timeout <= 0:
return self._server_requests.get_nowait()
return self._server_requests.get(timeout=timeout)
except queue.Empty:
return None
# ---------- diagnostics ----------
def stderr_tail(self, n: int = 20) -> list[str]:
"""Return last n lines of codex's stderr (for error reports)."""
with self._stderr_lock:
return list(self._stderr_lines[-n:])
def is_alive(self) -> bool:
return self._proc.poll() is None
# ---------- internals ----------
def _take_id(self) -> int:
# JSON-RPC ids only need to be unique per-connection. A simple
# monotonically increasing int is the common choice and matches what
# codex's own clients use.
rid = self._next_id
self._next_id += 1
return rid
def _send(self, obj: dict) -> None:
if self._closed:
raise RuntimeError("codex app-server client is closed")
if self._proc.stdin is None:
raise RuntimeError("codex app-server stdin not available")
try:
self._proc.stdin.write((json.dumps(obj) + "\n").encode("utf-8"))
self._proc.stdin.flush()
except (BrokenPipeError, ValueError) as exc:
raise RuntimeError(
f"codex app-server stdin closed unexpectedly: {exc}"
) from exc
def _read_stdout(self) -> None:
if self._proc.stdout is None:
return
try:
for line in iter(self._proc.stdout.readline, b""):
if not line:
break
line = line.strip()
if not line:
continue
try:
msg = json.loads(line)
except json.JSONDecodeError:
# Non-JSON output is unexpected on stdout; tracing belongs
# on stderr. Surface it via stderr buffer for diagnostics.
with self._stderr_lock:
self._stderr_lines.append(
f"<non-json on stdout> {line[:200]!r}"
)
continue
self._dispatch(msg)
except Exception as exc:
with self._stderr_lock:
self._stderr_lines.append(f"<stdout reader error> {exc}")
def _dispatch(self, msg: dict) -> None:
# Reply (has id + result/error, no method)
if "id" in msg and ("result" in msg or "error" in msg):
with self._pending_lock:
pending = self._pending.pop(msg["id"], None)
if pending is not None:
try:
pending.queue.put_nowait(msg)
except queue.Full: # pragma: no cover - defensive
pass
return
# Server-initiated request (has id + method)
if "id" in msg and "method" in msg:
self._server_requests.put(msg)
return
# Notification (no id)
if "method" in msg:
self._notifications.put(msg)
def _read_stderr(self) -> None:
if self._proc.stderr is None:
return
try:
for line in iter(self._proc.stderr.readline, b""):
if not line:
break
with self._stderr_lock:
self._stderr_lines.append(
line.decode("utf-8", "replace").rstrip()
)
# Bound memory: keep last 500 lines.
if len(self._stderr_lines) > 500:
self._stderr_lines = self._stderr_lines[-500:]
except Exception: # pragma: no cover
pass
def parse_codex_version(output: str) -> Optional[tuple[int, int, int]]:
"""Parse `codex --version` output. Returns (major, minor, patch) or None."""
# Output format: "codex-cli 0.130.0" possibly followed by metadata.
import re
match = re.search(r"(\d+)\.(\d+)\.(\d+)", output or "")
if not match:
return None
return (int(match.group(1)), int(match.group(2)), int(match.group(3)))
def check_codex_binary(
codex_bin: str = "codex", min_version: tuple[int, int, int] = MIN_CODEX_VERSION
) -> tuple[bool, str]:
"""Verify codex CLI is installed and meets minimum version.
Returns (ok, message). Used by setup wizard and runtime startup."""
try:
proc = subprocess.run(
[codex_bin, "--version"],
capture_output=True,
text=True,
timeout=10,
)
except FileNotFoundError:
return False, (
f"codex CLI not found at {codex_bin!r}. Install with: "
f"npm i -g @openai/codex"
)
except subprocess.TimeoutExpired:
return False, "codex --version timed out"
if proc.returncode != 0:
return False, f"codex --version exited {proc.returncode}: {proc.stderr.strip()}"
version = parse_codex_version(proc.stdout)
if version is None:
return False, f"could not parse codex version from: {proc.stdout!r}"
if version < min_version:
return False, (
f"codex {'.'.join(map(str, version))} is older than required "
f"{'.'.join(map(str, min_version))}. Run: npm i -g @openai/codex"
)
return True, ".".join(map(str, version))
@@ -1,810 +0,0 @@
"""Session adapter for codex app-server runtime.
Owns one Codex thread per Hermes session. Drives `turn/start`, consumes
streaming notifications via CodexEventProjector, handles server-initiated
approval requests (apply_patch, exec command), translates cancellation,
and returns a clean turn result that AIAgent.run_conversation() can splice
into its `messages` list.
Lifecycle:
session = CodexAppServerSession(cwd="/home/x/proj")
session.ensure_started() # spawns + handshake + thread/start
result = session.run_turn(user_input="hello") # blocks until turn/completed
# result.final_text → assistant text returned to caller
# result.projected_messages → list of {role, content, ...} for messages list
# result.tool_iterations → how many tool-shaped items completed (skill nudge counter)
# result.interrupted → True if Ctrl+C / interrupt_requested fired mid-turn
session.close() # tears down subprocess
Threading model: the adapter is single-threaded from the caller's perspective.
The underlying CodexAppServerClient owns its own reader threads but exposes
blocking-with-timeout queues that this adapter polls in a loop, so the run_turn
call is synchronous and behaves like AIAgent's existing chat_completions loop.
"""
from __future__ import annotations
import logging
import os
import threading
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Optional
from agent.redact import redact_sensitive_text
from agent.transports.codex_app_server import (
CodexAppServerClient,
CodexAppServerError,
)
from agent.transports.codex_event_projector import CodexEventProjector
logger = logging.getLogger(__name__)
# How many tailing stderr lines from the codex subprocess to attach to a
# user-facing error when we don't have a more specific classification (OAuth,
# wedge watchdog, etc.). Small enough to keep error messages legible, large
# enough to surface a config/provider/auth diagnostic.
_STDERR_TAIL_LINES = 12
# Permission profile mapping mirrors the docstring in PR proposal:
# Hermes' tools.terminal.security_mode → Codex's permissions profile id.
# Defaults if config is missing → workspace-write (matches Codex's own default).
_HERMES_TO_CODEX_PERMISSION_PROFILE = {
"auto": "workspace-write",
"approval-required": "read-only-with-approval",
"unrestricted": "full-access",
# Backstop alias used by some skills/tests.
"yolo": "full-access",
}
@dataclass
class TurnResult:
"""Result of one user→assistant→tool turn through the codex app-server."""
final_text: str = ""
projected_messages: list[dict] = field(default_factory=list)
tool_iterations: int = 0
interrupted: bool = False
error: Optional[str] = None # Set if turn ended in a non-recoverable error
turn_id: Optional[str] = None
thread_id: Optional[str] = None
# Hint to the caller that the underlying codex subprocess is likely
# wedged (turn-level timeout fired, post-tool watchdog tripped, or
# token-refresh failure killed the child). The caller should retire
# the session so the next turn respawns codex from scratch instead
# of riding a CPU-spinning or auth-broken process. Mirrors openclaw
# beta.8's "retire timed-out app-server clients" fix.
should_retire: bool = False
# Markers we accept as terminal even when codex never emits turn/completed.
# Some codex versions stream `<turn_aborted>` as raw text in agentMessage
# items when an interrupt or upstream error tears the turn down before the
# normal completion path fires. Mirrors openclaw beta.8 fix.
_TURN_ABORTED_MARKERS = ("<turn_aborted>", "<turn_aborted/>")
# Substrings in codex stderr / JSON-RPC error messages that signal the
# subprocess died because its OAuth credentials are no longer valid.
# Kept conservative: we only redirect users to `codex login` when we're
# reasonably sure that's the actual failure, otherwise we surface the
# original error verbatim. Mirrors openclaw beta.8's auth-refresh
# classification.
_OAUTH_REFRESH_FAILURE_HINTS = (
"invalid_grant",
"invalid grant",
"refresh token",
"refresh_token",
"token refresh",
"token_refresh",
"token has expired",
"expired_token",
"expired token",
"not authenticated",
"unauthenticated",
"unauthorized",
"401 unauthorized",
"re-authenticate",
"reauthenticate",
"please log in",
"please login",
"auth profile",
"no auth profile",
"oauth",
)
def _classify_oauth_failure(*parts: str) -> Optional[str]:
"""Return a user-friendly re-auth hint if any of the provided strings
look like a codex OAuth/token-refresh failure; otherwise None.
Used for both `turn/start` JSON-RPC errors and post-mortem stderr
inspection when the subprocess exits unexpectedly. Conservative on
purpose we only redirect users to `codex login` when the signal
is strong, so unrelated runtime failures still surface verbatim.
"""
haystack = " ".join(p for p in parts if p).lower()
if not haystack:
return None
for needle in _OAUTH_REFRESH_FAILURE_HINTS:
if needle in haystack:
return (
"Codex authentication failed — your ChatGPT/Codex login "
"looks expired or invalid. Run `codex login` to refresh, "
"then retry. (Fall back to default runtime with "
"`/codex-runtime auto` if the issue persists.)"
)
return None
@dataclass
class _ServerRequestRouting:
"""Default policies for codex-side approval requests when no interactive
callback is wired in. These are only used by tests + cron / non-interactive
contexts; the live CLI path passes an approval_callback that defers to
tools.approval.prompt_dangerous_approval()."""
auto_approve_exec: bool = False
auto_approve_apply_patch: bool = False
class CodexAppServerSession:
"""One Codex thread per Hermes session, lifetime owned by AIAgent.
Not thread-safe one caller drives it at a time, matching how AIAgent's
run_conversation() loop is structured today. The codex client itself can
handle interleaved reads/writes via its own threads, but the adapter's
state (projector, thread_id, turn counter) is owned by the caller thread.
"""
def __init__(
self,
*,
cwd: Optional[str] = None,
codex_bin: str = "codex",
codex_home: Optional[str] = None,
permission_profile: Optional[str] = None,
approval_callback: Optional[Callable[..., str]] = None,
on_event: Optional[Callable[[dict], None]] = None,
request_routing: Optional[_ServerRequestRouting] = None,
client_factory: Optional[Callable[..., CodexAppServerClient]] = None,
) -> None:
self._cwd = cwd or os.getcwd()
self._codex_bin = codex_bin
self._codex_home = codex_home
self._permission_profile = (
permission_profile or _HERMES_TO_CODEX_PERMISSION_PROFILE.get(
os.environ.get("HERMES_TERMINAL_SECURITY_MODE", "auto"),
"workspace-write",
)
)
self._approval_callback = approval_callback
self._on_event = on_event # Display hook (kawaii spinner ticks etc.)
self._routing = request_routing or _ServerRequestRouting()
self._client_factory = client_factory or CodexAppServerClient
self._client: Optional[CodexAppServerClient] = None
self._thread_id: Optional[str] = None
self._interrupt_event = threading.Event()
# Pending file-change items, keyed by item id. Populated on
# item/started for fileChange items; consumed by the approval
# bridge when codex sends item/fileChange/requestApproval. The
# approval params don't carry the changeset, so we cache here
# to surface a real summary in the approval prompt (quirk #4).
self._pending_file_changes: dict[str, str] = {}
self._closed = False
# ---------- lifecycle ----------
def ensure_started(self) -> str:
"""Spawn the subprocess, do the initialize handshake, and start a
thread. Returns the codex thread id. Idempotent repeated calls
return the same thread id."""
if self._thread_id is not None:
return self._thread_id
if self._client is None:
self._client = self._client_factory(
codex_bin=self._codex_bin, codex_home=self._codex_home
)
self._client.initialize(
client_name="hermes",
client_title="Hermes Agent",
client_version=_get_hermes_version(),
)
# Permission selection is intentionally NOT sent on thread/start.
# Two reasons (live-tested against codex 0.130.0):
# 1. `thread/start.permissions` is gated behind the experimentalApi
# capability on this codex version — we'd have to opt in during
# initialize and accept the unstable surface.
# 2. Even with experimentalApi declared and the correct shape
# (`{"type": "profile", "id": "..."}`, not `{"profileId": ...}`),
# codex requires a matching `[permissions]` table in
# ~/.codex/config.toml or it fails the request with
# 'default_permissions requires a [permissions] table'.
# Letting codex pick its default (`:read-only` unless the user has
# configured otherwise in their codex config.toml) is the standard
# codex CLI workflow and avoids fighting codex's own validation.
# Users who want a write-capable profile configure it in their
# ~/.codex/config.toml the same way they would for any codex usage.
params: dict[str, Any] = {"cwd": self._cwd}
result = self._client.request("thread/start", params, timeout=15)
# Cross-fill thread.id/sessionId — different codex versions have
# serialized this under either key. Mirrors openclaw beta.8's
# tolerance fix so future codex drops/renames don't KeyError us
# at handshake time.
thread_obj = result.get("thread") or {}
thread_id = (
thread_obj.get("id")
or thread_obj.get("sessionId")
or result.get("sessionId")
or result.get("threadId")
)
if not thread_id:
raise CodexAppServerError(
code=-32603,
message=(
"codex thread/start returned no thread id "
f"(payload keys: {sorted(result.keys())})"
),
)
self._thread_id = thread_id
logger.info(
"codex app-server thread started: id=%s profile=%s cwd=%s",
self._thread_id[:8],
self._permission_profile,
self._cwd,
)
return self._thread_id
def close(self) -> None:
if self._closed:
return
self._closed = True
if self._client is not None:
try:
self._client.close()
except Exception: # pragma: no cover - best-effort cleanup
pass
self._client = None
self._thread_id = None
def __enter__(self) -> "CodexAppServerSession":
return self
def __exit__(self, *exc: Any) -> None:
self.close()
# ---------- interrupt ----------
def request_interrupt(self) -> None:
"""Idempotent: signal the active turn loop to issue turn/interrupt
and unwind. Called by AIAgent's _interrupt_requested path."""
self._interrupt_event.set()
# ---------- diagnostics ----------
def _format_error_with_stderr(
self,
prefix: str,
exc: Any = "",
*,
tail_lines: int = _STDERR_TAIL_LINES,
) -> str:
"""Build a user-facing error string for codex failures.
Appends the last few lines of codex's stderr buffer when available,
passed through agent.redact with force=True so secrets in provider
error responses (auth headers, query-string tokens, sk-* keys) never
leak into chat output or trajectories. The codex CLI's own error
text ('Internal error', 'turn/start failed: ...') is otherwise
opaque and forces users to re-run with verbose flags to diagnose
config / provider / auth-bridge problems.
Use this for the generic / catch-all branches. Specific
classifications (OAuth via _classify_oauth_failure, post-tool wedge
watchdog) already produce a clean hint and should be used instead.
"""
exc_str = str(exc) if exc != "" and exc is not None else ""
base = f"{prefix}: {exc_str}" if exc_str else prefix
if self._client is None:
return base
try:
tail = self._client.stderr_tail(tail_lines)
except Exception: # pragma: no cover - diagnostic best-effort
return base
if not tail:
return base
joined = "\n".join(line.rstrip() for line in tail if line)
if not joined.strip():
return base
redacted = redact_sensitive_text(joined, force=True)
return f"{base}\ncodex stderr (last {len(tail)} lines):\n{redacted}"
# ---------- per-turn ----------
def run_turn(
self,
user_input: str,
*,
turn_timeout: float = 600.0,
notification_poll_timeout: float = 0.25,
post_tool_quiet_timeout: float = 90.0,
) -> TurnResult:
"""Send a user message and block until turn/completed, while
forwarding server-initiated approval requests and projecting items
into Hermes' messages shape.
post_tool_quiet_timeout: if codex emits a tool completion and then
goes quiet for this many seconds without emitting another item or
`turn/completed`, fast-fail and mark the session for retirement.
Mirrors openclaw beta.8's post-tool completion watchdog (#81697)
so a wedged codex doesn't burn the full turn deadline.
"""
# Pre-create the result so startup failures (codex subprocess can't
# spawn, initialize handshake rejects, thread/start blows up) surface
# the same way per-turn failures do — with a TurnResult.error string
# the caller can render — instead of bubbling raw codex exceptions
# up to AIAgent.run_conversation.
result = TurnResult()
try:
self.ensure_started()
except (CodexAppServerError, TimeoutError) as exc:
result.error = self._format_error_with_stderr(
"codex app-server startup failed", exc
)
# Subprocess almost certainly unhealthy — retire so the next
# turn re-spawns cleanly.
result.should_retire = True
return result
assert self._client is not None and self._thread_id is not None
result.thread_id = self._thread_id
self._interrupt_event.clear()
projector = CodexEventProjector()
# Send turn/start with the user input. Text-only for now (codex
# supports rich content but Hermes' text path is the common case).
try:
ts = self._client.request(
"turn/start",
{
"threadId": self._thread_id,
"input": [{"type": "text", "text": user_input}],
},
timeout=10,
)
except CodexAppServerError as exc:
# Classify auth/refresh failures so the user gets a clear
# `codex login` pointer instead of a raw RPC error string.
stderr_blob = "\n".join(self._client.stderr_tail(40))
hint = _classify_oauth_failure(exc.message, stderr_blob)
if hint is not None:
result.error = hint
# Subprocess is fine on a JSON-RPC level here, but the
# token store is broken — retire so the next turn does a
# clean handshake (and the user has a chance to re-auth
# via `codex login` between turns).
result.should_retire = True
else:
result.error = self._format_error_with_stderr(
"turn/start failed", exc
)
return result
except TimeoutError as exc:
# turn/start hanging is a strong signal the subprocess is wedged.
stderr_blob = "\n".join(self._client.stderr_tail(40))
hint = _classify_oauth_failure(stderr_blob)
result.error = hint or self._format_error_with_stderr(
"turn/start timed out", exc
)
result.should_retire = True
return result
result.turn_id = (ts.get("turn") or {}).get("id")
deadline = time.time() + turn_timeout
turn_complete = False
# Post-tool watchdog state. last_tool_completion_at is set whenever
# a tool-shaped item completes; if no further notification arrives
# within post_tool_quiet_timeout and the turn hasn't completed, we
# fast-fail and retire the session.
last_tool_completion_at: Optional[float] = None
while time.time() < deadline and not turn_complete:
if self._interrupt_event.is_set():
self._issue_interrupt(result.turn_id)
result.interrupted = True
break
# Detect a dead subprocess between iterations. If codex exited
# (e.g. crashed, segfaulted, or its auth refresh thread killed
# the process), we won't get any more notifications — bail out
# rather than waiting for the full turn deadline.
if not self._client.is_alive():
stderr_blob = "\n".join(self._client.stderr_tail(60))
hint = _classify_oauth_failure(stderr_blob)
if hint is not None:
result.error = hint
else:
result.error = self._format_error_with_stderr(
"codex app-server subprocess exited unexpectedly",
tail_lines=20,
)
result.should_retire = True
break
# Post-tool watchdog: if a tool completion was the most recent
# signal and codex has been silent past the quiet timeout, give
# up on this turn instead of waiting for the outer deadline.
if (
last_tool_completion_at is not None
and (time.time() - last_tool_completion_at)
> post_tool_quiet_timeout
):
self._issue_interrupt(result.turn_id)
result.interrupted = True
result.error = (
f"codex went silent for "
f"{post_tool_quiet_timeout:.0f}s after a tool result; "
f"retiring app-server session."
)
result.should_retire = True
break
# Drain any server-initiated requests (approvals) before
# reading notifications, so the codex side isn't blocked.
sreq = self._client.take_server_request(timeout=0)
if sreq is not None:
# Drain any pending notifications first so per-turn state
# (e.g. _pending_file_changes for fileChange approvals) is
# up to date when we make the approval decision. Bounded
# to avoid starving the server-request response.
for _ in range(8):
pending = self._client.take_notification(timeout=0)
if pending is None:
break
self._track_pending_file_change(pending)
proj = projector.project(pending)
if proj.messages:
result.projected_messages.extend(proj.messages)
if proj.is_tool_iteration:
result.tool_iterations += 1
last_tool_completion_at = time.time()
if proj.final_text is not None:
result.final_text = proj.final_text
if _has_turn_aborted_marker(proj.final_text):
turn_complete = True
result.interrupted = True
result.error = (
result.error
or "codex reported turn_aborted"
)
self._handle_server_request(sreq)
# Activity counts as live signal — reset the post-tool
# quiet timer so an approval round-trip doesn't trip it.
last_tool_completion_at = None
continue
note = self._client.take_notification(
timeout=notification_poll_timeout
)
if note is None:
continue
method = note.get("method", "")
if self._on_event is not None:
try:
self._on_event(note)
except Exception: # pragma: no cover - display callback
logger.debug("on_event callback raised", exc_info=True)
# Track in-progress fileChange items so the approval bridge
# can surface a real change summary when codex requests
# approval (the approval params themselves don't carry the
# changeset). Quirk #4 fix.
self._track_pending_file_change(note)
# Project into messages
projection = projector.project(note)
if projection.messages:
result.projected_messages.extend(projection.messages)
if projection.is_tool_iteration:
result.tool_iterations += 1
# Arm/refresh the post-tool quiet watchdog whenever a
# tool-shaped item completes.
last_tool_completion_at = time.time()
else:
# Any non-tool projected activity (assistant message,
# status update, etc.) means codex is still producing
# output — clear the quiet timer so we don't fast-fail.
if projection.messages or projection.final_text is not None:
last_tool_completion_at = None
if projection.final_text is not None:
# Codex can emit multiple agentMessage items in one turn
# (e.g. partial then final). Take the last one as canonical.
result.final_text = projection.final_text
# Some codex builds tear a turn down by emitting a
# `<turn_aborted>` marker in the agent message text and
# never sending turn/completed. Treat the marker itself
# as terminal so we don't burn the full deadline.
if _has_turn_aborted_marker(projection.final_text):
turn_complete = True
result.interrupted = True
result.error = (
result.error or "codex reported turn_aborted"
)
if method == "turn/completed":
turn_complete = True
turn_status = (
(note.get("params") or {}).get("turn") or {}
).get("status")
if turn_status and turn_status not in ("completed", "interrupted"):
err_obj = (
(note.get("params") or {}).get("turn") or {}
).get("error")
if err_obj:
err_msg = err_obj.get("message") or str(err_obj)
# If the turn failed for an auth/refresh reason,
# rewrite the error into a re-auth hint AND mark
# the session for retirement.
stderr_blob = "\n".join(
self._client.stderr_tail(40)
)
hint = _classify_oauth_failure(err_msg, stderr_blob)
if hint is not None:
result.error = hint
result.should_retire = True
else:
result.error = self._format_error_with_stderr(
f"turn ended status={turn_status}", err_msg
)
if not turn_complete and not result.interrupted:
# Hit the deadline. Issue interrupt to stop wasted compute, and
# tell the caller to retire the session — a turn that never
# finished is a strong sign codex is wedged in a way the next
# turn shouldn't inherit.
self._issue_interrupt(result.turn_id)
result.interrupted = True
if not result.error:
result.error = self._format_error_with_stderr(
f"turn timed out after {turn_timeout}s"
)
result.should_retire = True
return result
# ---------- internals ----------
def _issue_interrupt(self, turn_id: Optional[str]) -> None:
if self._client is None or self._thread_id is None or turn_id is None:
return
try:
self._client.request(
"turn/interrupt",
{"threadId": self._thread_id, "turnId": turn_id},
timeout=5,
)
except CodexAppServerError as exc:
# "no active turn to interrupt" is fine — already done.
logger.debug("turn/interrupt non-fatal: %s", exc)
except TimeoutError:
logger.warning("turn/interrupt timed out")
def _handle_server_request(self, req: dict) -> None:
"""Translate a codex server request (approval) into Hermes' approval
flow, then send the response.
Method names verified live against codex 0.130.0 (Apr 2026):
item/commandExecution/requestApproval exec approvals
item/fileChange/requestApproval apply_patch approvals
item/permissions/requestApproval permissions changes
(we decline; user controls
permission profile in
~/.codex/config.toml).
"""
if self._client is None:
return
method = req.get("method", "")
rid = req.get("id")
params = req.get("params") or {}
if method == "item/commandExecution/requestApproval":
decision = self._decide_exec_approval(params)
self._client.respond(rid, {"decision": decision})
elif method == "item/fileChange/requestApproval":
decision = self._decide_apply_patch_approval(params)
self._client.respond(rid, {"decision": decision})
elif method == "item/permissions/requestApproval":
# Codex sometimes asks to escalate permissions mid-turn. We
# always decline — the user already chose their permission
# profile in ~/.codex/config.toml and surprise escalations
# shouldn't be silently accepted.
self._client.respond(rid, {"decision": "decline"})
elif method == "mcpServer/elicitation/request":
# Codex's MCP layer asks the user for structured input on
# behalf of an MCP server (e.g. tool-call confirmation,
# OAuth, form data). For our own hermes-tools callback we
# auto-accept — the user already approved Hermes' tools
# by enabling the runtime, and we never expose anything
# codex's built-in shell can't already do. For other MCP
# servers we decline so the user explicitly opts in via
# codex's own auth flow.
server_name = params.get("serverName") or ""
if server_name == "hermes-tools":
self._client.respond(
rid,
{"action": "accept", "content": None, "_meta": None},
)
else:
self._client.respond(
rid,
{"action": "decline", "content": None, "_meta": None},
)
else:
# Unknown server request — codex can extend this surface. Reject
# cleanly so codex doesn't hang waiting for us.
logger.warning("Unknown codex server request: %s", method)
self._client.respond_error(
rid, code=-32601, message=f"Unsupported method: {method}"
)
def _decide_exec_approval(self, params: dict) -> str:
if self._routing.auto_approve_exec:
return "accept"
command = params.get("command") or ""
# Codex's CommandExecutionRequestApprovalParams has cwd as Optional —
# fall back to the session's cwd when codex doesn't include it so the
# approval prompt is never empty (quirk #10 fix).
cwd = params.get("cwd") or self._cwd or "<unknown>"
reason = params.get("reason")
description = f"Codex requests exec in {cwd}"
if reason:
description += f"{reason}"
if self._approval_callback is not None:
try:
choice = self._approval_callback(
command, description, allow_permanent=False
)
return _approval_choice_to_codex_decision(choice)
except Exception:
logger.exception("approval_callback raised on exec request")
return "decline"
return "decline" # fail-closed when no callback wired
def _decide_apply_patch_approval(self, params: dict) -> str:
if self._routing.auto_approve_apply_patch:
return "accept"
if self._approval_callback is not None:
# FileChangeRequestApprovalParams gives us reason + grantRoot.
# The actual changeset lives on the corresponding fileChange
# item which the projector has already cached for us — look it
# up by item_id so the user sees what's actually changing.
reason = params.get("reason")
grant_root = params.get("grantRoot")
item_id = params.get("itemId") or ""
change_summary = self._lookup_pending_file_change(item_id)
description_parts = []
if reason:
description_parts.append(reason)
if change_summary:
description_parts.append(change_summary)
if grant_root:
description_parts.append(f"grants write to {grant_root}")
description = (
"; ".join(description_parts)
if description_parts
else "Codex requests to apply a patch"
)
command_label = (
f"apply_patch: {change_summary}" if change_summary
else f"apply_patch: {reason}" if reason
else "apply_patch"
)
try:
choice = self._approval_callback(
command_label,
description,
allow_permanent=False,
)
return _approval_choice_to_codex_decision(choice)
except Exception:
logger.exception("approval_callback raised on apply_patch")
return "decline"
return "decline"
def _track_pending_file_change(self, note: dict) -> None:
"""Maintain self._pending_file_changes from item/started + item/completed
notifications. Lets the apply_patch approval prompt show what's
actually changing codex's approval params don't carry the data."""
method = note.get("method", "")
params = note.get("params") or {}
item = params.get("item") or {}
if item.get("type") != "fileChange":
return
item_id = item.get("id") or ""
if not item_id:
return
if method == "item/started":
changes = item.get("changes") or []
if not changes:
self._pending_file_changes[item_id] = "1 change pending"
return
kinds: dict[str, int] = {}
paths: list[str] = []
for ch in changes:
if not isinstance(ch, dict):
continue
kind = (ch.get("kind") or {}).get("type") or "update"
kinds[kind] = kinds.get(kind, 0) + 1
p = ch.get("path") or ""
if p:
paths.append(p)
counts = ", ".join(f"{n} {k}" for k, n in sorted(kinds.items()))
preview = ", ".join(paths[:3])
if len(paths) > 3:
preview += f", +{len(paths) - 3} more"
self._pending_file_changes[item_id] = (
f"{counts}: {preview}" if preview else counts
)
elif method == "item/completed":
self._pending_file_changes.pop(item_id, None)
def _lookup_pending_file_change(self, item_id: str) -> Optional[str]:
"""Look up an in-progress fileChange item by id and summarize its
changes for the approval prompt. Returns None when we don't have
the item cached (e.g. approval arrived before item/started, or
fileChange item content not tracked yet)."""
if not item_id:
return None
cached = self._pending_file_changes.get(item_id)
if not cached:
return None
return cached
def _approval_choice_to_codex_decision(choice: str) -> str:
"""Map Hermes approval choices onto codex's CommandExecutionApprovalDecision
/ FileChangeApprovalDecision wire values.
Hermes returns 'once', 'session', 'always', or 'deny'.
Codex expects 'accept', 'acceptForSession', 'decline', or 'cancel'
(verified against codex-rs/app-server-protocol/src/protocol/v2/item.rs
on codex 0.130.0).
"""
if choice in ("once",):
return "accept"
if choice in ("session", "always"):
return "acceptForSession"
return "decline"
def _has_turn_aborted_marker(text: str) -> bool:
"""Return True if `text` contains any of the raw markers codex uses
to signal a turn was aborted without emitting `turn/completed`.
Codex emits `<turn_aborted>` (and sometimes `<turn_aborted/>`) as raw
text inside agentMessage items when an interrupt or upstream error
tears the turn down before the normal completion path fires. Mirrors
openclaw beta.8's terminal-marker fix so we don't burn the full turn
deadline waiting for a turn/completed that never comes.
"""
if not text:
return False
for marker in _TURN_ABORTED_MARKERS:
if marker in text:
return True
return False
def _get_hermes_version() -> str:
"""Best-effort Hermes version string for codex's userAgent line."""
try:
from importlib.metadata import version
return version("hermes-agent")
except Exception: # pragma: no cover
return "0.0.0"
-312
View File
@@ -1,312 +0,0 @@
"""Projects codex app-server events into Hermes' messages list.
The translator that lets Hermes' memory/skill review keep working under the
Codex runtime: it converts Codex `item/*` notifications into the standard
OpenAI-shaped `{role, content, tool_calls, tool_call_id}` entries that
`agent/curator.py` already knows how to read.
Codex emits items with a discriminator field `type`:
- userMessage {role: "user", content}
- agentMessage {role: "assistant", content}
- reasoning stashed in the assistant's "reasoning" field
- commandExecution assistant tool_call(name="exec") + tool result
- fileChange assistant tool_call(name="apply_patch") + tool result
- mcpToolCall assistant tool_call(name=f"mcp.{server}.{tool}") + tool result
- dynamicToolCall assistant tool_call(name=tool) + tool result
- plan/hookPrompt/collabAgentToolCall recorded as opaque assistant notes
Each item maps to AT MOST one assistant entry + one tool entry, preserving
Hermes' message-alternation invariants (system → user → assistant → user/tool
assistant ...). Multiple Codex tool calls within one Codex turn produce
multiple consecutive (assistant, tool) pairs, which is the same shape Hermes
already produces for parallel tool calls.
Counters tracked alongside projection:
- tool_iterations: ticks once per completed tool-shaped item. Used by
AIAgent._iters_since_skill (skill nudge gate, default threshold 10).
"""
from __future__ import annotations
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Optional
def _deterministic_call_id(item_type: str, item_id: str) -> str:
"""Stable id for tool_call message correlation.
Uses the codex item id directly when present (already a uuid); falls back
to a content hash so replay produces the same id across sessions and
prefix caches stay valid. See AGENTS.md Pitfall #16 (deterministic IDs in
tool call history)."""
if item_id:
return f"codex_{item_type}_{item_id}"
digest = hashlib.sha256(f"{item_type}".encode()).hexdigest()[:16]
return f"codex_{item_type}_{digest}"
def _format_tool_args(d: dict) -> str:
"""Format a dict as JSON the way Hermes' existing tool_calls path does."""
return json.dumps(d, ensure_ascii=False, sort_keys=True)
@dataclass
class ProjectionResult:
"""Output of projecting one Codex item.
`messages` is a list because some Codex items produce two messages
(assistant tool_call + tool result). Empty list = item ignored (e.g. a
streaming `outputDelta` that doesn't materialize into messages until the
`item/completed` event)."""
messages: list[dict] = field(default_factory=list)
is_tool_iteration: bool = False
final_text: Optional[str] = None # Set when an agentMessage completes
class CodexEventProjector:
"""Stateful projector consuming Codex notifications in arrival order.
Owns the in-progress reasoning content (codex emits reasoning as separate
items but Hermes stashes it on the next assistant message)."""
def __init__(self) -> None:
self._pending_reasoning: list[str] = []
def project(self, notification: dict) -> ProjectionResult:
"""Project a single notification. Idempotent for non-completion events;
only `item/completed` and `turn/completed` materialize messages."""
method = notification.get("method", "")
params = notification.get("params", {}) or {}
# We only materialize messages on `item/completed`. Streaming deltas
# (`item/<type>/outputDelta`, `item/<type>/delta`) are display-only and
# don't enter the messages list — same way Hermes already only writes
# the assistant message after the streaming completion event.
if method != "item/completed":
return ProjectionResult()
item = params.get("item") or {}
item_type = item.get("type") or ""
item_id = item.get("id") or ""
if item_type == "agentMessage":
return self._project_agent_message(item)
if item_type == "reasoning":
self._pending_reasoning.extend(item.get("summary") or [])
self._pending_reasoning.extend(item.get("content") or [])
return ProjectionResult()
if item_type == "commandExecution":
return self._project_command(item, item_id)
if item_type == "fileChange":
return self._project_file_change(item, item_id)
if item_type == "mcpToolCall":
return self._project_mcp_tool_call(item, item_id)
if item_type == "dynamicToolCall":
return self._project_dynamic_tool_call(item, item_id)
if item_type == "userMessage":
return self._project_user_message(item)
# Unknown / rare items (plan, hookPrompt, collabAgentToolCall, etc.)
# — record as opaque assistant note so memory review can still see
# *something* happened, but don't fabricate tool_call structure.
return self._project_opaque(item, item_type)
# ---------- per-type projections ----------
def _project_agent_message(self, item: dict) -> ProjectionResult:
text = item.get("text") or ""
msg: dict[str, Any] = {"role": "assistant", "content": text}
if self._pending_reasoning:
msg["reasoning"] = "\n".join(self._pending_reasoning)
self._pending_reasoning = []
return ProjectionResult(messages=[msg], final_text=text)
def _project_user_message(self, item: dict) -> ProjectionResult:
# codex's userMessage content is a list of UserInput variants. For
# projection purposes we flatten any text fragments and ignore
# non-text parts (images, etc.) — Hermes' messages store text only.
text_parts: list[str] = []
for fragment in item.get("content") or []:
if isinstance(fragment, dict):
if fragment.get("type") == "text":
text_parts.append(fragment.get("text") or "")
elif "text" in fragment:
text_parts.append(str(fragment["text"]))
return ProjectionResult(
messages=[{"role": "user", "content": "\n".join(text_parts)}]
)
def _project_command(self, item: dict, item_id: str) -> ProjectionResult:
call_id = _deterministic_call_id("exec", item_id)
args = {
"command": item.get("command") or "",
"cwd": item.get("cwd") or "",
}
assistant_msg = {
"role": "assistant",
"content": None,
"tool_calls": [
{
"id": call_id,
"type": "function",
"function": {
"name": "exec_command",
"arguments": _format_tool_args(args),
},
}
],
}
if self._pending_reasoning:
assistant_msg["reasoning"] = "\n".join(self._pending_reasoning)
self._pending_reasoning = []
output = item.get("aggregatedOutput") or ""
exit_code = item.get("exitCode")
if exit_code is not None and exit_code != 0:
output = f"[exit {exit_code}]\n{output}"
tool_msg = {
"role": "tool",
"tool_call_id": call_id,
"content": output,
}
return ProjectionResult(
messages=[assistant_msg, tool_msg], is_tool_iteration=True
)
def _project_file_change(self, item: dict, item_id: str) -> ProjectionResult:
call_id = _deterministic_call_id("apply_patch", item_id)
# Reduce the codex changes array to a digest the agent loop will
# find readable. We record per-file change kinds (Add/Update/Delete)
# without inlining full file contents — those can be huge.
changes_summary = []
for change in item.get("changes") or []:
kind = (change.get("kind") or {}).get("type") or "update"
path = change.get("path") or ""
changes_summary.append({"kind": kind, "path": path})
args = {"changes": changes_summary}
assistant_msg = {
"role": "assistant",
"content": None,
"tool_calls": [
{
"id": call_id,
"type": "function",
"function": {
"name": "apply_patch",
"arguments": _format_tool_args(args),
},
}
],
}
if self._pending_reasoning:
assistant_msg["reasoning"] = "\n".join(self._pending_reasoning)
self._pending_reasoning = []
status = item.get("status") or "unknown"
n = len(changes_summary)
tool_msg = {
"role": "tool",
"tool_call_id": call_id,
"content": f"apply_patch status={status}, {n} change(s)",
}
return ProjectionResult(
messages=[assistant_msg, tool_msg], is_tool_iteration=True
)
def _project_mcp_tool_call(self, item: dict, item_id: str) -> ProjectionResult:
server = item.get("server") or "mcp"
tool = item.get("tool") or "unknown"
call_id = _deterministic_call_id(f"mcp_{server}_{tool}", item_id)
args = item.get("arguments") or {}
if not isinstance(args, dict):
args = {"arguments": args}
assistant_msg = {
"role": "assistant",
"content": None,
"tool_calls": [
{
"id": call_id,
"type": "function",
"function": {
"name": f"mcp.{server}.{tool}",
"arguments": _format_tool_args(args),
},
}
],
}
if self._pending_reasoning:
assistant_msg["reasoning"] = "\n".join(self._pending_reasoning)
self._pending_reasoning = []
result = item.get("result")
error = item.get("error")
if error:
content = f"[error] {json.dumps(error, ensure_ascii=False)[:1000]}"
elif result is not None:
content = json.dumps(result, ensure_ascii=False)[:4000]
else:
content = ""
tool_msg = {
"role": "tool",
"tool_call_id": call_id,
"content": content,
}
return ProjectionResult(
messages=[assistant_msg, tool_msg], is_tool_iteration=True
)
def _project_dynamic_tool_call(
self, item: dict, item_id: str
) -> ProjectionResult:
tool = item.get("tool") or "unknown"
call_id = _deterministic_call_id(f"dyn_{tool}", item_id)
args = item.get("arguments") or {}
if not isinstance(args, dict):
args = {"arguments": args}
assistant_msg = {
"role": "assistant",
"content": None,
"tool_calls": [
{
"id": call_id,
"type": "function",
"function": {
"name": tool,
"arguments": _format_tool_args(args),
},
}
],
}
if self._pending_reasoning:
assistant_msg["reasoning"] = "\n".join(self._pending_reasoning)
self._pending_reasoning = []
content_items = item.get("contentItems") or []
if isinstance(content_items, list) and content_items:
content = json.dumps(content_items, ensure_ascii=False)[:4000]
else:
success = item.get("success")
content = f"success={success}"
tool_msg = {
"role": "tool",
"tool_call_id": call_id,
"content": content,
}
return ProjectionResult(
messages=[assistant_msg, tool_msg], is_tool_iteration=True
)
def _project_opaque(self, item: dict, item_type: str) -> ProjectionResult:
# Record the existence of the item without inventing tool_calls.
# Memory review will see this and may or may not save anything.
try:
payload = json.dumps(item, ensure_ascii=False)[:1500]
except (TypeError, ValueError):
payload = repr(item)[:1500]
return ProjectionResult(
messages=[
{
"role": "assistant",
"content": f"[codex {item_type}] {payload}",
}
]
)
-225
View File
@@ -1,225 +0,0 @@
"""Hermes-tools-as-MCP server for the codex_app_server runtime.
When the user runs `openai/*` turns through the codex app-server, codex
owns the loop and builds its own tool list. By default, that means
Hermes' richer tool surface — web search, browser automation,
delegate_task subagents, vision analysis, persistent memory, skills,
cross-session search, image generation, TTS is unreachable.
This module exposes a curated subset of those Hermes tools to the
spawned codex subprocess via stdio MCP. Codex registers it as a normal
MCP server (per `~/.codex/config.toml [mcp_servers.hermes-tools]`) and
the user gets full Hermes capability inside a Codex turn.
Scope (what we expose):
- web_search, web_extract Firecrawl, no codex equivalent
- browser_navigate / _click / _type / Camofox/Browserbase automation
_snapshot / _screenshot / _scroll / _back / _press / _vision
- delegate_task Hermes subagents
- vision_analyze image inspection by vision model
- image_generate image generation
- memory Hermes' persistent memory store
- skill_view, skills_list Hermes' skill library
- session_search cross-session search
- text_to_speech TTS
What we DO NOT expose (codex has equivalents):
- terminal / shell codex's own shell tool
- read_file / write_file / patch codex's apply_patch + shell
- search_files / process codex's shell
- clarify, todo codex's own UX
Run with: python -m agent.transports.hermes_tools_mcp_server
Spawned by: CodexAppServerSession.ensure_started() when the runtime is
active and config opts in.
"""
from __future__ import annotations
import json
import logging
import os
import sys
from typing import Any, Optional
logger = logging.getLogger(__name__)
# Tools we expose. Each name MUST match a registered Hermes tool that
# `model_tools.handle_function_call()` can dispatch.
#
# What we deliberately DO NOT expose:
# - terminal / shell / read_file / write_file / patch / search_files /
# process — codex's built-ins cover these and approval routes through
# codex's own UI.
# - delegate_task / memory / session_search / todo — these are
# `_AGENT_LOOP_TOOLS` in Hermes (model_tools.py:493). They require
# the running AIAgent context to dispatch (mid-loop state), so a
# stateless MCP callback can't drive them. Hermes' default runtime
# keeps these working; the codex_app_server runtime cannot.
EXPOSED_TOOLS: tuple[str, ...] = (
"web_search",
"web_extract",
"browser_navigate",
"browser_click",
"browser_type",
"browser_press",
"browser_snapshot",
"browser_scroll",
"browser_back",
"browser_get_images",
"browser_console",
"browser_vision",
"vision_analyze",
"image_generate",
"skill_view",
"skills_list",
"text_to_speech",
# Kanban worker handoff tools — gated on HERMES_KANBAN_TASK env var
# (set by the kanban dispatcher when spawning a worker). Without these
# in the callback, a worker spawned with openai_runtime=codex_app_server
# could do the work but couldn't report completion back to the kernel,
# making it hang until timeout. Stateless dispatch — they just read
# the env var and write to ~/.hermes/kanban.db.
"kanban_complete",
"kanban_block",
"kanban_comment",
"kanban_heartbeat",
"kanban_show",
"kanban_list",
# NOTE: kanban_create / kanban_unblock / kanban_link are orchestrator-
# only — the kanban tool gates them on HERMES_KANBAN_TASK being unset.
# They're exposed here for orchestrator agents running on the codex
# runtime that need to dispatch new tasks.
"kanban_create",
"kanban_unblock",
"kanban_link",
)
def _build_server() -> Any:
"""Create the FastMCP server with Hermes tools attached. Lazy imports
so the module can be imported without the mcp package installed
(we degrade to a clear error only when actually run)."""
try:
from mcp.server.fastmcp import FastMCP
except ImportError as exc: # pragma: no cover - install hint
raise ImportError(
f"hermes-tools MCP server requires the 'mcp' package: {exc}"
) from exc
# Discover Hermes tools so dispatch works.
from model_tools import (
get_tool_definitions,
handle_function_call,
)
mcp = FastMCP(
"hermes-tools",
instructions=(
"Hermes Agent's tool surface, exposed for use inside a Codex "
"session. Use these for capabilities Codex's built-in toolset "
"doesn't cover: web search/extract, browser automation, "
"subagent delegation, vision, image generation, persistent "
"memory, skills, and cross-session search."
),
)
# Pull authoritative Hermes tool schemas for the ones we expose, so
# MCP clients see the same parameter docs Hermes gives the model.
all_defs = {
td["function"]["name"]: td["function"]
for td in (get_tool_definitions(quiet_mode=True) or [])
if isinstance(td, dict) and td.get("type") == "function"
}
exposed_count = 0
for name in EXPOSED_TOOLS:
spec = all_defs.get(name)
if spec is None:
logger.debug(
"skipping %s — not registered in this Hermes process", name
)
continue
description = spec.get("description") or f"Hermes {name} tool"
params_schema = spec.get("parameters") or {"type": "object", "properties": {}}
# FastMCP wants a Python callable. Build a closure that takes the
# arguments dict, dispatches via handle_function_call, and returns
# the result string. We use add_tool() for full control over the
# input schema (FastMCP's @tool() decorator inspects type hints,
# which we can't get from a JSON schema at runtime).
def _make_handler(tool_name: str):
def _dispatch(**kwargs: Any) -> str:
try:
return handle_function_call(tool_name, kwargs or {})
except Exception as exc:
logger.exception("tool %s raised", tool_name)
return json.dumps({"error": str(exc), "tool": tool_name})
_dispatch.__name__ = tool_name
_dispatch.__doc__ = description
return _dispatch
try:
mcp.add_tool(
_make_handler(name),
name=name,
description=description,
# FastMCP accepts JSON schema directly via the
# input_schema parameter on newer versions; older
# versions use parameters_schema. Try both for compat.
)
except TypeError:
# Older mcp SDK signature — fall back to decorator-style.
handler = _make_handler(name)
handler = mcp.tool(name=name, description=description)(handler)
exposed_count += 1
logger.info(
"hermes-tools MCP server registered %d/%d tools",
exposed_count,
len(EXPOSED_TOOLS),
)
return mcp
def main(argv: Optional[list[str]] = None) -> int:
"""Entry point for `python -m agent.transports.hermes_tools_mcp_server`."""
argv = argv or sys.argv[1:]
verbose = "--verbose" in argv or "-v" in argv
log_level = logging.INFO if verbose else logging.WARNING
logging.basicConfig(
level=log_level,
stream=sys.stderr, # MCP uses stdio for protocol — logs MUST go to stderr
format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
)
# Quiet mode: keep Hermes' own banners off stdout (which is the MCP wire).
os.environ.setdefault("HERMES_QUIET", "1")
os.environ.setdefault("HERMES_REDACT_SECRETS", "true")
try:
server = _build_server()
except ImportError as exc:
sys.stderr.write(f"hermes-tools MCP server cannot start: {exc}\n")
return 2
# FastMCP runs with stdio transport by default when launched as a
# subprocess.
try:
server.run()
except KeyboardInterrupt:
return 0
except Exception as exc:
logger.exception("hermes-tools MCP server crashed")
sys.stderr.write(f"hermes-tools MCP server error: {exc}\n")
return 1
return 0
if __name__ == "__main__":
sys.exit(main())
-11
View File
@@ -370,17 +370,6 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
source_url="https://api-docs.deepseek.com/quick_start/pricing",
pricing_version="deepseek-pricing-2026-03-16",
),
(
"deepseek",
"deepseek-v4-pro",
): PricingEntry(
input_cost_per_million=Decimal("1.74"),
output_cost_per_million=Decimal("3.48"),
cache_read_cost_per_million=Decimal("0.0145"),
source="official_docs_snapshot",
source_url="https://api-docs.deepseek.com/quick_start/pricing",
pricing_version="deepseek-pricing-2026-05-12",
),
# Google Gemini
(
"google",
-299
View File
@@ -1,299 +0,0 @@
"""
Video Generation Provider ABC
=============================
Defines the pluggable-backend interface for video generation. Providers register
instances via ``PluginContext.register_video_gen_provider()``; the active one
(selected via ``video_gen.provider`` in ``config.yaml``) services every
``video_generate`` tool call.
Providers live in ``<repo>/plugins/video_gen/<name>/`` (built-in, auto-loaded
as ``kind: backend``) or ``~/.hermes/plugins/video_gen/<name>/`` (user, opt-in
via ``plugins.enabled``).
Mirrors the ``image_gen`` provider design (``agent/image_gen_provider.py``) so
the two surfaces stay learnable together.
Unified surface
---------------
One tool ``video_generate`` covers **text-to-video** and **image-to-video**.
The router is the presence of ``image_url``: if it's set, the provider routes
to its image-to-video endpoint; if it's omitted, the provider routes to
text-to-video. Users pick one **model family** (e.g. Pixverse v6, Veo 3.1,
Kling O3 Standard); the provider handles which underlying FAL/xAI endpoint
to hit.
Video edit and video extend are intentionally NOT exposed in this surface
the inconsistency across backends is too large for one unified tool. If
those use cases warrant attention later they can ship as separate tools.
Response shape
--------------
All providers return a dict built by :func:`success_response` /
:func:`error_response`. Keys:
success bool
video str | None URL or absolute file path
model str provider-specific model identifier
prompt str echoed prompt
modality str "text" | "image" (which mode was used)
aspect_ratio str provider-native (e.g. "16:9") or ""
duration int seconds (0 if not applicable)
provider str provider name (for diagnostics)
error str only when success=False
error_type str only when success=False
"""
from __future__ import annotations
import abc
import base64
import datetime
import logging
import uuid
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
logger = logging.getLogger(__name__)
# Common aspect ratios across providers (Veo / Kling / xAI / Pixverse). The
# tool schema advertises this set as an enum hint, but providers may accept
# a narrower or wider set — they are responsible for clamping.
COMMON_ASPECT_RATIOS: Tuple[str, ...] = ("16:9", "9:16", "1:1", "4:3", "3:4", "3:2", "2:3")
DEFAULT_ASPECT_RATIO = "16:9"
COMMON_RESOLUTIONS: Tuple[str, ...] = ("480p", "540p", "720p", "1080p")
DEFAULT_RESOLUTION = "720p"
# ---------------------------------------------------------------------------
# ABC
# ---------------------------------------------------------------------------
class VideoGenProvider(abc.ABC):
"""Abstract base class for a video generation backend.
Subclasses must implement :meth:`generate`. Everything else has sane
defaults override only what your provider needs.
"""
@property
@abc.abstractmethod
def name(self) -> str:
"""Stable short identifier used in ``video_gen.provider`` config.
Lowercase, no spaces. Examples: ``xai``, ``fal``, ``google``.
"""
@property
def display_name(self) -> str:
"""Human-readable label shown in ``hermes tools``. Defaults to ``name.title()``."""
return self.name.title()
def is_available(self) -> bool:
"""Return True when this provider can service calls.
Typically checks for a required API key and optional-dependency
import. Default: True.
"""
return True
def list_models(self) -> List[Dict[str, Any]]:
"""Return catalog entries for ``hermes tools`` model picker.
Each entry represents a **model family** that supports text-to-video
and/or image-to-video routing internally::
{
"id": "veo-3.1", # required
"display": "Veo 3.1", # optional; defaults to id
"speed": "~60s", # optional
"strengths": "...", # optional
"price": "$0.20/s", # optional
"modalities": ["text", "image"], # optional, advisory
}
Default: empty list (provider has no user-selectable models).
"""
return []
def get_setup_schema(self) -> Dict[str, Any]:
"""Return provider metadata for the ``hermes tools`` picker."""
return {
"name": self.display_name,
"badge": "",
"tag": "",
"env_vars": [],
}
def default_model(self) -> Optional[str]:
"""Return the default model id, or None if not applicable."""
models = self.list_models()
if models:
return models[0].get("id")
return None
def capabilities(self) -> Dict[str, Any]:
"""Return what this provider supports.
Returned dict (all keys optional)::
{
"modalities": ["text", "image"], # which inputs the backend accepts
"aspect_ratios": ["16:9", "9:16", ...],
"resolutions": ["720p", "1080p"],
"max_duration": 15, # seconds
"min_duration": 1,
"supports_audio": True,
"supports_negative_prompt": True,
"max_reference_images": 7,
}
Used by the tool layer for soft validation and by ``hermes tools``
for the picker. Default: text-only.
"""
return {
"modalities": ["text"],
"aspect_ratios": list(COMMON_ASPECT_RATIOS),
"resolutions": list(COMMON_RESOLUTIONS),
"max_duration": 10,
"min_duration": 1,
"supports_audio": False,
"supports_negative_prompt": False,
"max_reference_images": 0,
}
@abc.abstractmethod
def generate(
self,
prompt: str,
*,
model: Optional[str] = None,
image_url: Optional[str] = None,
reference_image_urls: Optional[List[str]] = None,
duration: Optional[int] = None,
aspect_ratio: str = DEFAULT_ASPECT_RATIO,
resolution: str = DEFAULT_RESOLUTION,
negative_prompt: Optional[str] = None,
audio: Optional[bool] = None,
seed: Optional[int] = None,
**kwargs: Any,
) -> Dict[str, Any]:
"""Generate a video from a prompt (text-to-video) or animate an image
(image-to-video).
Routing: if ``image_url`` is provided, the provider should route to
its image-to-video endpoint; otherwise text-to-video. The plugin
is responsible for picking the right underlying endpoint within
the user's chosen model family.
Implementations should return the dict from :func:`success_response`
or :func:`error_response`. ``kwargs`` may contain forward-compat
parameters future versions of the schema will expose
implementations MUST ignore unknown keys (no TypeError).
"""
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _videos_cache_dir() -> Path:
"""Return ``$HERMES_HOME/cache/videos/``, creating parents as needed."""
from hermes_constants import get_hermes_home
path = get_hermes_home() / "cache" / "videos"
path.mkdir(parents=True, exist_ok=True)
return path
def save_b64_video(
b64_data: str,
*,
prefix: str = "video",
extension: str = "mp4",
) -> Path:
"""Decode base64 video data and write under ``$HERMES_HOME/cache/videos/``.
Returns the absolute :class:`Path` to the saved file.
Filename format: ``<prefix>_<YYYYMMDD_HHMMSS>_<short-uuid>.<ext>``.
"""
raw = base64.b64decode(b64_data)
ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
short = uuid.uuid4().hex[:8]
path = _videos_cache_dir() / f"{prefix}_{ts}_{short}.{extension}"
path.write_bytes(raw)
return path
def save_bytes_video(
raw: bytes,
*,
prefix: str = "video",
extension: str = "mp4",
) -> Path:
"""Write raw video bytes (e.g. an HTTP download body) to the cache."""
ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
short = uuid.uuid4().hex[:8]
path = _videos_cache_dir() / f"{prefix}_{ts}_{short}.{extension}"
path.write_bytes(raw)
return path
def success_response(
*,
video: str,
model: str,
prompt: str,
modality: str = "text",
aspect_ratio: str = "",
duration: int = 0,
provider: str,
extra: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
"""Build a uniform success response dict.
``video`` may be an HTTP URL or an absolute filesystem path.
``modality`` is ``"text"`` (text-to-video) or ``"image"`` (image-to-video)
indicates which endpoint was actually hit, useful for diagnostics.
"""
payload: Dict[str, Any] = {
"success": True,
"video": video,
"model": model,
"prompt": prompt,
"modality": modality,
"aspect_ratio": aspect_ratio,
"duration": int(duration) if duration else 0,
"provider": provider,
}
if extra:
for k, v in extra.items():
payload.setdefault(k, v)
return payload
def error_response(
*,
error: str,
error_type: str = "provider_error",
provider: str = "",
model: str = "",
prompt: str = "",
aspect_ratio: str = "",
) -> Dict[str, Any]:
"""Build a uniform error response dict."""
return {
"success": False,
"video": None,
"error": error,
"error_type": error_type,
"model": model,
"prompt": prompt,
"aspect_ratio": aspect_ratio,
"provider": provider,
}
-117
View File
@@ -1,117 +0,0 @@
"""
Video Generation Provider Registry
==================================
Central map of registered providers. Populated by plugins at import-time via
``PluginContext.register_video_gen_provider()``; consumed by the
``video_generate`` tool to dispatch each call to the active backend.
Active selection
----------------
The active provider is chosen by ``video_gen.provider`` in ``config.yaml``.
If unset, :func:`get_active_provider` applies fallback logic:
1. If exactly one provider is registered, use it.
2. Otherwise return ``None`` (the tool surfaces a helpful error pointing
the user at ``hermes tools``).
Mirrors ``agent/image_gen_registry.py`` so the two surfaces behave the
same.
"""
from __future__ import annotations
import logging
import threading
from typing import Dict, List, Optional
from agent.video_gen_provider import VideoGenProvider
logger = logging.getLogger(__name__)
_providers: Dict[str, VideoGenProvider] = {}
_lock = threading.Lock()
def register_provider(provider: VideoGenProvider) -> None:
"""Register a video generation provider.
Re-registration (same ``name``) overwrites the previous entry and logs
a debug message this makes hot-reload scenarios (tests, dev loops)
behave predictably.
"""
if not isinstance(provider, VideoGenProvider):
raise TypeError(
f"register_provider() expects a VideoGenProvider instance, "
f"got {type(provider).__name__}"
)
name = provider.name
if not isinstance(name, str) or not name.strip():
raise ValueError("Video gen provider .name must be a non-empty string")
with _lock:
existing = _providers.get(name)
_providers[name] = provider
if existing is not None:
logger.debug("Video gen provider '%s' re-registered (was %r)", name, type(existing).__name__)
else:
logger.debug("Registered video gen provider '%s' (%s)", name, type(provider).__name__)
def list_providers() -> List[VideoGenProvider]:
"""Return all registered providers, sorted by name."""
with _lock:
items = list(_providers.values())
return sorted(items, key=lambda p: p.name)
def get_provider(name: str) -> Optional[VideoGenProvider]:
"""Return the provider registered under *name*, or None."""
if not isinstance(name, str):
return None
with _lock:
return _providers.get(name.strip())
def get_active_provider() -> Optional[VideoGenProvider]:
"""Resolve the currently-active provider.
Reads ``video_gen.provider`` from config.yaml; falls back per the
module docstring.
"""
configured: Optional[str] = None
try:
from hermes_cli.config import load_config
cfg = load_config()
section = cfg.get("video_gen") if isinstance(cfg, dict) else None
if isinstance(section, dict):
raw = section.get("provider")
if isinstance(raw, str) and raw.strip():
configured = raw.strip()
except Exception as exc:
logger.debug("Could not read video_gen.provider from config: %s", exc)
with _lock:
snapshot = dict(_providers)
if configured:
provider = snapshot.get(configured)
if provider is not None:
return provider
logger.debug(
"video_gen.provider='%s' configured but not registered; falling back",
configured,
)
# Fallback: single-provider case
if len(snapshot) == 1:
return next(iter(snapshot.values()))
return None
def _reset_for_tests() -> None:
"""Clear the registry. **Test-only.**"""
with _lock:
_providers.clear()
-221
View File
@@ -1,221 +0,0 @@
"""
Web Search Provider ABC
=======================
Defines the pluggable-backend interface for web search and content extraction.
Providers register instances via ``PluginContext.register_web_search_provider()``;
the active one (selected via ``web.search_backend`` / ``web.extract_backend`` /
``web.backend`` in ``config.yaml``) services every ``web_search`` /
``web_extract`` tool call.
Providers live in ``<repo>/plugins/web/<name>/`` (built-in, auto-loaded as
``kind: backend``) or ``~/.hermes/plugins/web/<name>/`` (user, opt-in via
``plugins.enabled``).
This ABC is the SINGLE plugin-facing surface for web providers every
provider in the tree (brave-free, ddgs, searxng, exa, parallel, tavily,
firecrawl) implements it. The legacy in-tree ``tools.web_providers.base``
ABCs were deleted in PR #25182 along with the per-vendor inline helpers
in ``tools/web_tools.py``; the response-shape contract documented below
is preserved bit-for-bit so the tool wrapper does not have to translate.
Response shape (preserved from the legacy contract):
Search results::
{
"success": True,
"data": {
"web": [
{"title": str, "url": str, "description": str, "position": int},
...
]
}
}
Extract results::
{
"success": True,
"data": [
{"url": str, "title": str, "content": str,
"raw_content": str, "metadata": dict},
...
]
}
On failure (either capability)::
{"success": False, "error": str}
"""
from __future__ import annotations
import abc
from typing import Any, Dict, List
# ---------------------------------------------------------------------------
# ABC
# ---------------------------------------------------------------------------
class WebSearchProvider(abc.ABC):
"""Abstract base class for a web search/extract/crawl backend.
Subclasses must implement :meth:`is_available` and at least one of
:meth:`search` / :meth:`extract` / :meth:`crawl`. The
:meth:`supports_search` / :meth:`supports_extract` / :meth:`supports_crawl`
capability flags let the registry route each tool call to the right
provider, and let multi-capability providers (Firecrawl, Tavily, Exa,
) advertise multiple capabilities from a single class.
"""
@property
@abc.abstractmethod
def name(self) -> str:
"""Stable short identifier used in ``web.search_backend`` /
``web.extract_backend`` / ``web.backend`` config keys.
Lowercase, no spaces; hyphens permitted to preserve existing
user-visible names. Examples: ``brave-free``, ``ddgs``,
``searxng``, ``firecrawl``.
"""
@property
def display_name(self) -> str:
"""Human-readable label shown in ``hermes tools``. Defaults to ``name``."""
return self.name
@abc.abstractmethod
def is_available(self) -> bool:
"""Return True when this provider can service calls.
Typically a cheap check (env var present, optional Python dep
importable, instance URL set). Must NOT make network calls this
runs at tool-registration time and on every ``hermes tools`` paint.
"""
def supports_search(self) -> bool:
"""Return True if this provider implements :meth:`search`."""
return True
def supports_extract(self) -> bool:
"""Return True if this provider implements :meth:`extract`.
Both sync and async :meth:`extract` implementations are valid the
dispatcher detects coroutine functions via
:func:`inspect.iscoroutinefunction` and awaits as needed. Sync
implementations that perform blocking I/O (HTTP, SDK calls) should
ideally wrap in :func:`asyncio.to_thread` at the call site; small
providers can keep their sync shape and let the dispatcher handle
threading.
"""
return False
def supports_crawl(self) -> bool:
"""Return True if this provider implements :meth:`crawl`.
Crawl differs from extract in that the agent provides a *seed URL*
and the provider walks linked pages on its own useful for
documentation sites where the agent doesn't know all relevant
URLs upfront. Tavily is the only built-in backend that natively
crawls today; Firecrawl provides a similar capability that we
don't currently surface as a tool.
Providers that don't crawl should leave this as False; the
dispatcher in :func:`tools.web_tools.web_crawl_tool` will fall
back to its auxiliary-model summarization path.
"""
return False
def search(self, query: str, limit: int = 5) -> Dict[str, Any]:
"""Execute a web search.
Override when :meth:`supports_search` returns True. The default
raises NotImplementedError; callers should gate on
:meth:`supports_search` before calling.
"""
raise NotImplementedError(
f"{self.name} does not support search (override supports_search)"
)
def extract(self, urls: List[str], **kwargs: Any) -> Any:
"""Extract content from one or more URLs.
Override when :meth:`supports_extract` returns True. The default
raises NotImplementedError; callers should gate on
:meth:`supports_extract` before calling.
Return shape: a list of result dicts matching what the legacy
:func:`tools.web_tools.web_extract_tool` post-processing pipeline
expects::
[
{
"url": str,
"title": str,
"content": str,
"raw_content": str,
"metadata": dict, # optional
"error": str, # optional, only on per-URL failure
},
...
]
Implementations MAY be ``async def`` the dispatcher detects
coroutines via :func:`inspect.iscoroutinefunction` and awaits.
``kwargs`` may carry forward-compat fields (``format``, ``include_raw``,
``max_chars``) implementations should ignore unknown keys.
"""
raise NotImplementedError(
f"{self.name} does not support extract (override supports_extract)"
)
def crawl(self, url: str, **kwargs: Any) -> Any:
"""Crawl a seed URL and return results.
Override when :meth:`supports_crawl` returns True. The default
raises NotImplementedError; callers should gate on
:meth:`supports_crawl` before calling.
Return shape: ``{"results": [{"url": str, "title": str,
"content": str, ...}, ...]}`` matching what
:func:`tools.web_tools.web_crawl_tool` post-processing expects.
Implementations MAY be ``async def``.
``kwargs`` may carry forward-compat fields (e.g. ``max_depth``,
``include_domains``) implementations should ignore unknown keys.
"""
raise NotImplementedError(
f"{self.name} does not support crawl (override supports_crawl)"
)
def get_setup_schema(self) -> Dict[str, Any]:
"""Return provider metadata for the ``hermes tools`` picker.
Used by ``hermes_cli/tools_config.py`` to inject this provider as a
row in the Web Search / Web Extract picker. Shape::
{
"name": "Brave Search (Free)",
"badge": "free",
"tag": "No paid tier needed — uses Brave's free API.",
"env_vars": [
{"key": "BRAVE_SEARCH_API_KEY",
"prompt": "Brave Search API key",
"url": "https://brave.com/search/api/"},
],
}
Default: minimal entry derived from ``display_name``. Override to
expose API key prompts, badges, and instance URL fields.
"""
return {
"name": self.display_name,
"badge": "",
"tag": "",
"env_vars": [],
}
-262
View File
@@ -1,262 +0,0 @@
"""
Web Search Provider Registry
============================
Central map of registered web providers. Populated by plugins at import-time
via :meth:`PluginContext.register_web_search_provider`; consumed by the
``web_search`` and ``web_extract`` tool wrappers in :mod:`tools.web_tools` to
dispatch each call to the active backend.
Active selection
----------------
The active provider is chosen by configuration with this precedence:
1. ``web.search_backend`` / ``web.extract_backend`` / ``web.crawl_backend``
(per-capability override).
2. ``web.backend`` (shared fallback).
3. If exactly one capability-eligible provider is registered AND available,
use it.
4. Legacy preference order ``firecrawl`` ``parallel`` ``tavily``
``exa`` ``searxng`` ``brave-free`` ``ddgs`` filtered by
availability. Matches the historic ``tools.web_tools._get_backend()``
candidate order so installs that never set a config key keep landing
on the same provider they did before the plugin migration.
5. Otherwise ``None`` the tool surfaces a helpful error pointing at
``hermes tools``.
The capability filter (``supports_search`` / ``supports_extract`` /
``supports_crawl``) is applied at every step so a search-only provider
(``brave-free``) configured as ``web.extract_backend`` correctly falls
through to an extract-capable backend.
"""
from __future__ import annotations
import logging
import threading
from typing import Dict, List, Optional
from agent.web_search_provider import WebSearchProvider
logger = logging.getLogger(__name__)
_providers: Dict[str, WebSearchProvider] = {}
_lock = threading.Lock()
def register_provider(provider: WebSearchProvider) -> None:
"""Register a web search/extract provider.
Re-registration (same ``name``) overwrites the previous entry and logs
a debug message makes hot-reload scenarios (tests, dev loops) behave
predictably.
"""
if not isinstance(provider, WebSearchProvider):
raise TypeError(
f"register_provider() expects a WebSearchProvider instance, "
f"got {type(provider).__name__}"
)
name = provider.name
if not isinstance(name, str) or not name.strip():
raise ValueError("Web provider .name must be a non-empty string")
with _lock:
existing = _providers.get(name)
_providers[name] = provider
if existing is not None:
logger.debug(
"Web provider '%s' re-registered (was %r)",
name, type(existing).__name__,
)
else:
logger.debug(
"Registered web provider '%s' (%s)",
name, type(provider).__name__,
)
def list_providers() -> List[WebSearchProvider]:
"""Return all registered providers, sorted by name."""
with _lock:
items = list(_providers.values())
return sorted(items, key=lambda p: p.name)
def get_provider(name: str) -> Optional[WebSearchProvider]:
"""Return the provider registered under *name*, or None."""
if not isinstance(name, str):
return None
with _lock:
return _providers.get(name.strip())
# ---------------------------------------------------------------------------
# Active-provider resolution
# ---------------------------------------------------------------------------
def _read_config_key(*path: str) -> Optional[str]:
"""Resolve a dotted config key from ``config.yaml``. Returns None on miss."""
try:
from hermes_cli.config import load_config
cfg = load_config()
cur = cfg
for segment in path:
if not isinstance(cur, dict):
return None
cur = cur.get(segment)
if isinstance(cur, str) and cur.strip():
return cur.strip()
except Exception as exc:
logger.debug("Could not read config %s: %s", ".".join(path), exc)
return None
# Legacy preference order — preserves behaviour for users who set no
# ``web.backend`` / ``web.<capability>_backend`` config key at all. Matches
# the historic candidate order in :func:`tools.web_tools._get_backend`
# (paid providers first so existing paid setups don't get downgraded to
# a free tier on upgrade). Filtered by ``is_available()`` at walk time so
# we don't surface a provider the user has no credentials for.
_LEGACY_PREFERENCE = (
"firecrawl",
"parallel",
"tavily",
"exa",
"searxng",
"brave-free",
"ddgs",
)
def _resolve(configured: Optional[str], *, capability: str) -> Optional[WebSearchProvider]:
"""Resolve the active provider for a capability ("search" | "extract" | "crawl").
Resolution rules (in order):
1. **Explicit config wins, ignoring availability.** If
``web.{capability}_backend`` or ``web.backend`` names a registered
provider that supports *capability*, return it even if its
:meth:`is_available` returns False the dispatcher will surface a
precise "X_API_KEY is not set" error to the user instead of silently
routing somewhere else. Matches legacy
:func:`tools.web_tools._get_backend` behavior for configured names.
2. **Single-provider shortcut.** When only one registered provider
supports *capability* AND ``is_available()`` reports True, return it.
3. **Legacy preference walk, filtered by availability.** Walk the
:data:`_LEGACY_PREFERENCE` order (firecrawl parallel tavily
exa searxng brave-free ddgs) looking for a provider whose
``supports_<capability>()`` is True AND whose ``is_available()`` is
True. Matches the historic ``tools.web_tools._get_backend()``
candidate order so users with credentials but no explicit config
key keep landing on the same provider as pre-migration. This is
the path that fires when no config key is set pick the
highest-priority backend the user actually has credentials for.
Returns None when no provider is configured AND no available provider
matches the legacy preference; the dispatcher then returns a "set up a
provider" error to the user.
"""
with _lock:
snapshot = dict(_providers)
def _capable(p: WebSearchProvider) -> bool:
if capability == "search":
return bool(p.supports_search())
if capability == "extract":
return bool(p.supports_extract())
if capability == "crawl":
return bool(p.supports_crawl())
return False
def _is_available_safe(p: WebSearchProvider) -> bool:
"""Wrap ``is_available()`` so a buggy provider doesn't kill resolution."""
try:
return bool(p.is_available())
except Exception as exc: # noqa: BLE001
logger.debug("provider %s.is_available() raised %s", p.name, exc)
return False
# 1. Explicit config wins — return regardless of is_available() so the
# user gets a precise downstream error message rather than a silent
# backend switch. Matches _get_backend() in web_tools.py.
if configured:
provider = snapshot.get(configured)
if provider is not None and _capable(provider):
return provider
if provider is None:
logger.debug(
"web backend '%s' configured but not registered; falling back",
configured,
)
else:
logger.debug(
"web backend '%s' configured but does not support '%s'; falling back",
configured, capability,
)
# 2. + 3. Fallback path — filter by availability so we don't surface
# a provider the user has no credentials for. Without this filter,
# a registered-but-unconfigured provider could end up "active" on
# a fresh install with no API keys at all.
eligible = [
p for p in snapshot.values()
if _capable(p) and _is_available_safe(p)
]
if len(eligible) == 1:
return eligible[0]
for legacy in _LEGACY_PREFERENCE:
provider = snapshot.get(legacy)
if (
provider is not None
and _capable(provider)
and _is_available_safe(provider)
):
return provider
return None
def get_active_search_provider() -> Optional[WebSearchProvider]:
"""Resolve the currently-active web search provider.
Reads ``web.search_backend`` (preferred) or ``web.backend`` (shared
fallback) from config.yaml; falls back per the module docstring.
"""
explicit = _read_config_key("web", "search_backend") or _read_config_key("web", "backend")
return _resolve(explicit, capability="search")
def get_active_extract_provider() -> Optional[WebSearchProvider]:
"""Resolve the currently-active web extract provider.
Reads ``web.extract_backend`` (preferred) or ``web.backend`` (shared
fallback) from config.yaml; falls back per the module docstring.
"""
explicit = _read_config_key("web", "extract_backend") or _read_config_key("web", "backend")
return _resolve(explicit, capability="extract")
def get_active_crawl_provider() -> Optional[WebSearchProvider]:
"""Resolve the currently-active web crawl provider.
Reads ``web.crawl_backend`` (preferred) or ``web.backend`` (shared
fallback) from config.yaml; falls back per the module docstring.
Crawl is a niche capability among built-in providers only Tavily and
Firecrawl implement it. Callers should expect ``None`` and fall back to
a different strategy (e.g. summarize-via-LLM) when neither is
configured.
"""
explicit = _read_config_key("web", "crawl_backend") or _read_config_key("web", "backend")
return _resolve(explicit, capability="crawl")
def _reset_for_tests() -> None:
"""Clear the registry. **Test-only.**"""
with _lock:
_providers.clear()
-12
View File
@@ -364,18 +364,6 @@ compression:
# compression of older turns.
protect_last_n: 20
# Number of non-system messages to protect at the head of the transcript, in
# ADDITION to the system prompt (which is always implicitly protected).
# Head messages are NEVER summarized — they survive every compression
# indefinitely. This gives stable early context for short/medium sessions,
# but in long-running sessions that rely on rolling compaction the pinned
# opening turns may not match how you want the session framed over time.
# Set to 0 to preserve ONLY the system prompt (plus the rolling summary
# and recent tail) — the cleanest configuration for long-running sessions.
# Default 3 preserves the system prompt plus the first three non-system
# head messages, matching the pre-feature behaviour.
protect_first_n: 3
# To pin a specific model/provider for compression summaries, use the
# auxiliary section below (auxiliary.compression.provider / model).
+90 -679
View File
File diff suppressed because it is too large Load Diff
-1
View File
@@ -111,7 +111,6 @@ _HOME_TARGET_ENV_VARS = {
"weixin": "WEIXIN_HOME_CHANNEL",
"bluebubbles": "BLUEBUBBLES_HOME_CHANNEL",
"qqbot": "QQBOT_HOME_CHANNEL",
"whatsapp": "WHATSAPP_HOME_CHANNEL",
}
# Legacy env var names kept for back-compat. Each entry is the current
-4
View File
@@ -39,10 +39,6 @@ if [ "$(id -u)" = "0" ]; then
# by the mapped user on the host side.
chown -R hermes:hermes "$HERMES_HOME" 2>/dev/null || \
echo "Warning: chown failed (rootless container?) — continuing anyway"
# The .venv must also be re-chowned when UID is remapped, otherwise
# lazy_deps.py cannot install platform packages (discord.py, etc.).
chown -R hermes:hermes "$INSTALL_DIR/.venv" 2>/dev/null || \
echo "Warning: chown .venv failed (rootless container?) — continuing anyway"
fi
# Ensure config.yaml is readable by the hermes runtime user even if it was
+1 -1
View File
@@ -2,7 +2,7 @@
Hermes Gateway - Multi-platform messaging integration.
This module provides a unified gateway for connecting the Hermes agent
to various messaging platforms (Telegram, Discord, WhatsApp, Weixin, and more) with:
to various messaging platforms (Telegram, Discord, WhatsApp) with:
- Session management (persistent conversations with reset policies)
- Dynamic context injection (agent knows where messages come from)
- Delivery routing (cron job outputs to appropriate channels)
+10 -72
View File
@@ -2,7 +2,7 @@
Gateway configuration management.
Handles loading and validating configuration for:
- Connected platforms (Telegram, Discord, WhatsApp, Weixin, and more)
- Connected platforms (Telegram, Discord, WhatsApp)
- Home channels for each platform
- Session reset policies
- Delivery preferences
@@ -74,24 +74,6 @@ def _normalize_notice_delivery(value: Any, default: str = "public") -> str:
return default
def _ensure_platform_extra_dict(platforms_data: dict, name: str) -> tuple[dict, dict]:
"""Get-or-create ``platforms_data[name]`` and its nested ``extra`` dict.
Both slots are coerced to ``{}`` if a non-dict value is encountered, so
callers can safely write keys without type-checking. Returns
``(plat_data, extra)`` for in-place mutation.
"""
plat_data = platforms_data.setdefault(name, {})
if not isinstance(plat_data, dict):
plat_data = {}
platforms_data[name] = plat_data
extra = plat_data.setdefault("extra", {})
if not isinstance(extra, dict):
extra = {}
plat_data["extra"] = extra
return plat_data, extra
# Module-level cache for bundled platform plugin names (lives outside the
# enum so it doesn't become an accidental enum member).
_Platform__bundled_plugin_names: Optional[set] = None
@@ -735,10 +717,6 @@ def load_gateway_config() -> GatewayConfig:
gw_data["thread_sessions_per_user"] = yaml_cfg["thread_sessions_per_user"]
streaming_cfg = yaml_cfg.get("streaming")
if not isinstance(streaming_cfg, dict):
# Fall back to nested gateway.streaming written by
# ``hermes config set gateway.streaming.*``
streaming_cfg = yaml_cfg.get("gateway", {}).get("streaming")
if isinstance(streaming_cfg, dict):
gw_data["streaming"] = streaming_cfg
@@ -777,27 +755,7 @@ def load_gateway_config() -> GatewayConfig:
merged["extra"] = merged_extra
platforms_data[plat_name] = merged
gw_data["platforms"] = platforms_data
# Iterate built-in platforms plus any registered plugin platforms
# so plugin authors get the same shared-key bridging (#24836).
try:
from hermes_cli.plugins import discover_plugins
discover_plugins() # idempotent
from gateway.platform_registry import platform_registry as _pr
except Exception as e:
logger.debug("plugin discovery skipped: %s", e)
_pr = None
_shared_loop_targets: list = list(Platform)
if _pr is not None:
for _entry in _pr.plugin_entries():
try:
_plat = Platform(_entry.name)
except (ValueError, KeyError):
continue
if _plat not in _shared_loop_targets:
_shared_loop_targets.append(_plat)
for plat in _shared_loop_targets:
for plat in Platform:
if plat == Platform.LOCAL:
continue
platform_cfg = yaml_cfg.get(plat.value)
@@ -852,38 +810,20 @@ def load_gateway_config() -> GatewayConfig:
enabled_was_explicit = "enabled" in platform_cfg
if not bridged and not enabled_was_explicit:
continue
plat_data, extra = _ensure_platform_extra_dict(platforms_data, plat.value)
plat_data = platforms_data.setdefault(plat.value, {})
if not isinstance(plat_data, dict):
plat_data = {}
platforms_data[plat.value] = plat_data
if enabled_was_explicit:
plat_data["enabled"] = platform_cfg["enabled"]
extra = plat_data.setdefault("extra", {})
if not isinstance(extra, dict):
extra = {}
plat_data["extra"] = extra
if plat == Platform.SLACK and enabled_was_explicit:
extra["_enabled_explicit"] = True
extra.update(bridged)
# Plugin-owned YAML→env config bridges (#24836). See
# ``PlatformEntry.apply_yaml_config_fn`` for the hook contract.
# Order: shared-key loop (above) → this dispatch → legacy hardcoded
# blocks (below; no-op when a hook already set their env var) →
# ``_apply_env_overrides()`` after ``GatewayConfig.from_dict``.
if _pr is not None:
for entry in _pr.all_entries():
if entry.apply_yaml_config_fn is None:
continue
platform_cfg = yaml_cfg.get(entry.name)
if not isinstance(platform_cfg, dict):
continue
try:
seeded = entry.apply_yaml_config_fn(yaml_cfg, platform_cfg)
except Exception as e:
logger.debug(
"apply_yaml_config_fn for %s raised: %s",
entry.name, e,
)
continue
if not isinstance(seeded, dict) or not seeded:
continue
_, extra = _ensure_platform_extra_dict(platforms_data, entry.name)
extra.update(seeded)
# Slack settings → env vars (env vars take precedence)
slack_cfg = yaml_cfg.get("slack", {})
if isinstance(slack_cfg, dict):
@@ -912,8 +852,6 @@ def load_gateway_config() -> GatewayConfig:
if isinstance(discord_cfg, dict):
if "require_mention" in discord_cfg and not os.getenv("DISCORD_REQUIRE_MENTION"):
os.environ["DISCORD_REQUIRE_MENTION"] = str(discord_cfg["require_mention"]).lower()
if "thread_require_mention" in discord_cfg and not os.getenv("DISCORD_THREAD_REQUIRE_MENTION"):
os.environ["DISCORD_THREAD_REQUIRE_MENTION"] = str(discord_cfg["thread_require_mention"]).lower()
frc = discord_cfg.get("free_response_channels")
if frc is not None and not os.getenv("DISCORD_FREE_RESPONSE_CHANNELS"):
if isinstance(frc, list):
-16
View File
@@ -119,22 +119,6 @@ class PlatformEntry:
# Signature: () -> Optional[dict[str, Any]]
env_enablement_fn: Optional[Callable[[], Optional[dict]]] = None
# ── YAML→env config bridge ──
# Optional: translate this platform's ``config.yaml`` keys into env vars
# and/or seed ``PlatformConfig.extra`` directly. Lets a plugin own its
# YAML config translation instead of forcing core ``gateway/config.py``
# to know every platform's schema.
#
# Signature: (yaml_cfg: dict, platform_cfg: dict) -> Optional[dict]
# Called from ``load_gateway_config()`` after the generic shared-key loop
# and before ``_apply_env_overrides``. Mutating ``os.environ`` is allowed
# (use ``not os.getenv(...)`` guards to preserve env > YAML precedence);
# any returned dict is merged into ``PlatformConfig.extra``. Exceptions
# are caught and logged at debug level.
# See website/docs/developer-guide/adding-platform-adapters.md for the
# full contract and a worked example.
apply_yaml_config_fn: Optional[Callable[[dict, dict], Optional[dict]]] = None
# Optional: home-channel env var name for cron/notification delivery
# (e.g. ``"IRC_HOME_CHANNEL"``). When set, ``cron.scheduler`` treats this
# platform as a valid ``deliver=<name>`` target and reads the env var to
-8
View File
@@ -21,14 +21,6 @@ status display, gateway setup, and more.
constructed. Without this, env-only setups don't surface in
`hermes gateway status` or `get_connected_platforms()` until the SDK
instantiates.
- `apply_yaml_config_fn: (yaml_cfg, platform_cfg) -> Optional[dict]`
translate this platform's `config.yaml` keys into env vars and/or seed
`PlatformConfig.extra` directly. Lets a plugin own its YAML schema
instead of growing core `gateway/config.py` boilerplate per platform.
Mutating `os.environ` is allowed (use `not os.getenv(...)` guards to
preserve env > YAML precedence); the returned dict is merged into
`PlatformConfig.extra`. Called during `load_gateway_config()` after
the generic shared-key loop and before `_apply_env_overrides()`.
- `cron_deliver_env_var: str` — name of the `*_HOME_CHANNEL` env var. When
set, `deliver=<name>` cron jobs route to this var without editing
`cron/scheduler.py`'s hardcoded sets.
-6
View File
@@ -1168,9 +1168,6 @@ class APIServerAdapter(BasePlatformAdapter):
agent_ref=agent_ref,
gateway_session_key=gateway_session_key,
))
# Ensure SSE drain loops can terminate without relying on polling
# agent_task.done(), which can race with queue timeout checks.
agent_task.add_done_callback(lambda _fut: _stream_q.put(None))
return await self._write_sse_chat_completion(
request, completion_id, model_name, created, _stream_q,
@@ -2200,9 +2197,6 @@ class APIServerAdapter(BasePlatformAdapter):
agent_ref=agent_ref,
gateway_session_key=gateway_session_key,
))
# Ensure SSE drain loops can terminate without relying on polling
# agent_task.done(), which can race with queue timeout checks.
agent_task.add_done_callback(lambda _fut: _stream_q.put(None))
response_id = f"resp_{uuid.uuid4().hex[:28]}"
model_name = body.get("model", self._model_name)
+1 -110
View File
@@ -1,7 +1,7 @@
"""
Base platform adapter interface.
All platform adapters (Telegram, Discord, WhatsApp, Weixin, and more) inherit from this
All platform adapters (Telegram, Discord, WhatsApp) inherit from this
and implement the required methods.
"""
@@ -1743,63 +1743,6 @@ class BasePlatformAdapter(ABC):
"""
return SendResult(success=False, error="Not supported")
async def send_clarify(
self,
chat_id: str,
question: str,
choices: Optional[list],
clarify_id: str,
session_key: str,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a clarify prompt to the user.
Two render modes:
* **Multiple choice** (``choices`` is a non-empty list) adapters
that override this should render inline buttons (one per choice
plus a final "Other" / free-text option). Button callbacks
MUST resolve via
``tools.clarify_gateway.resolve_gateway_clarify(clarify_id, response)``
with the chosen string. Picking the "Other" button calls
``mark_awaiting_text(clarify_id)`` so the next message in the
session is captured as the response.
* **Open-ended** (``choices`` is None or empty) render the
question as a plain text message; the next user message in the
session is captured by the gateway's text-intercept and
resolves the clarify automatically (see
``GatewayRunner._maybe_intercept_clarify_text``).
The default implementation falls back to a numbered text list,
which works on every platform the user replies with a number
("2") or with the literal choice text, and the gateway intercepts
and resolves. For the text fallback path, the default calls
``mark_awaiting_text()`` so that the gateway text-intercept
(:meth:`GatewayRunner._maybe_intercept_clarify_text`) catches the
user's reply instead of timing out.
Adapters with native button UIs (Telegram, Discord) SHOULD
override this for a richer UX.
"""
if choices:
lines = [f"{question}", ""]
for i, choice in enumerate(choices, start=1):
lines.append(f" {i}. {choice}")
lines.append("")
lines.append("Reply with the number, the option text, or your own answer.")
text = "\n".join(lines)
# Text fallback: enable text-capture so the gateway intercept
# picks up the user's typed reply (e.g. "2" or choice text).
from tools.clarify_gateway import mark_awaiting_text
mark_awaiting_text(clarify_id)
else:
text = f"{question}"
return await self.send(
chat_id=chat_id,
content=text,
metadata=metadata,
)
async def send_private_notice(
self,
chat_id: str,
@@ -2888,58 +2831,6 @@ class BasePlatformAdapter(ABC):
logger.error("[%s] Command '/%s' dispatch failed: %s", self.name, cmd, e, exc_info=True)
return
# Clarify text-capture bypass: if the agent is blocked on a
# clarify_tool call awaiting a free-form text response (open-
# ended clarify, or user picked "Other"), the next non-command
# message in this session MUST reach the runner so the
# clarify-intercept can resolve it and unblock the agent.
#
# Without this bypass: the message gets queued in
# _pending_messages AND triggers an interrupt, killing the
# agent run mid-clarify and discarding the user's answer.
# Same shape as the /approve deadlock fix (PR #4926) — both
# cases are "agent thread blocked on Event.wait, message must
# reach the resolver before being treated as a new turn."
if not cmd:
try:
from tools import clarify_gateway as _clarify_mod
_has_text_clarify = (
_clarify_mod.get_pending_for_session(session_key) is not None
)
except Exception:
_has_text_clarify = False
if _has_text_clarify:
logger.debug(
"[%s] Routing message to clarify text-intercept for %s",
self.name, session_key,
)
try:
_thread_meta = _thread_metadata_for_source(
event.source, _reply_anchor_for_event(event)
)
response = await self._message_handler(event)
_text, _eph_ttl = self._unwrap_ephemeral(response)
if _text:
_r = await self._send_with_retry(
chat_id=event.source.chat_id,
content=_text,
reply_to=_reply_anchor_for_event(event),
metadata=_thread_meta,
)
if _eph_ttl > 0 and _r.success and _r.message_id:
self._schedule_ephemeral_delete(
chat_id=event.source.chat_id,
message_id=_r.message_id,
ttl_seconds=_eph_ttl,
)
except Exception as e:
logger.error(
"[%s] Clarify text-intercept dispatch failed: %s",
self.name, e, exc_info=True,
)
return
if self._busy_session_handler is not None:
try:
if await self._busy_session_handler(event, session_key):
+2 -26
View File
@@ -111,33 +111,9 @@ DINGTALK_TYPE_MAPPING = {
def check_dingtalk_requirements() -> bool:
"""Check if DingTalk dependencies are available and configured.
Lazy-installs dingtalk-stream via ``tools.lazy_deps.ensure("platform.dingtalk")``
on first call if not present.
"""
global DINGTALK_STREAM_AVAILABLE, dingtalk_stream, ChatbotMessage, CallbackMessage, AckMessage
global HTTPX_AVAILABLE, httpx
"""Check if DingTalk dependencies are available and configured."""
if not DINGTALK_STREAM_AVAILABLE or not HTTPX_AVAILABLE:
try:
from tools.lazy_deps import ensure as _lazy_ensure
_lazy_ensure("platform.dingtalk", prompt=False)
except Exception:
return False
try:
import dingtalk_stream as _ds
from dingtalk_stream import ChatbotMessage as _CM
from dingtalk_stream.frames import CallbackMessage as _CBM, AckMessage as _AM
import httpx as _httpx
except ImportError:
return False
dingtalk_stream = _ds
ChatbotMessage = _CM
CallbackMessage = _CBM
AckMessage = _AM
httpx = _httpx
DINGTALK_STREAM_AVAILABLE = True
HTTPX_AVAILABLE = True
return False
if not os.getenv("DINGTALK_CLIENT_ID") or not os.getenv("DINGTALK_CLIENT_SECRET"):
return False
return True
+8 -334
View File
@@ -86,32 +86,8 @@ def _clean_discord_id(entry: str) -> str:
def check_discord_requirements() -> bool:
"""Check if Discord dependencies are available.
Lazy-installs discord.py via ``tools.lazy_deps.ensure("platform.discord")``
on first call if not present. After successful install, re-binds module
globals so ``DISCORD_AVAILABLE`` becomes True.
"""
global DISCORD_AVAILABLE, discord, DiscordMessage, Intents, commands
if DISCORD_AVAILABLE:
return True
try:
from tools.lazy_deps import ensure as _lazy_ensure
_lazy_ensure("platform.discord", prompt=False)
except Exception:
return False
try:
import discord as _discord
from discord import Message as _DM, Intents as _Intents
from discord.ext import commands as _commands
except ImportError:
return False
discord = _discord
DiscordMessage = _DM
Intents = _Intents
commands = _commands
DISCORD_AVAILABLE = True
return True
"""Check if Discord dependencies are available."""
return DISCORD_AVAILABLE
def _build_allowed_mentions():
@@ -3577,25 +3553,6 @@ class DiscordAdapter(BasePlatformAdapter):
return {part.strip() for part in s.split(",") if part.strip()}
return set()
def _discord_thread_require_mention(self) -> bool:
"""Return whether thread participation requires @mention to follow up.
When ``False`` (default), once the bot has participated in a thread it
keeps responding to every message in that thread without needing to be
mentioned again useful for one-on-one conversations.
When ``True``, the @mention requirement is enforced inside threads as
well. Set this when multiple bots share a thread and you want each
one to only fire on explicit @mention, avoiding bot-to-bot loops or
unwanted cross-replies.
"""
configured = self.config.extra.get("thread_require_mention")
if configured is not None:
if isinstance(configured, str):
return configured.lower() not in ("false", "0", "no", "off")
return bool(configured)
return os.getenv("DISCORD_THREAD_REQUIRE_MENTION", "false").lower() in ("true", "1", "yes", "on")
def _thread_parent_channel(self, channel: Any) -> Any:
"""Return the parent text channel when invoked from a thread."""
return getattr(channel, "parent", None) or channel
@@ -3896,84 +3853,6 @@ class DiscordAdapter(BasePlatformAdapter):
except Exception as e:
return SendResult(success=False, error=str(e))
async def send_clarify(
self,
chat_id: str,
question: str,
choices: Optional[list],
clarify_id: str,
session_key: str,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Render a clarify prompt with one Discord button per choice.
Multi-choice mode (``choices`` non-empty): renders a button per option
plus a final "✏️ Other (type answer)" button. Picking "Other" flips
the clarify entry into text-capture mode so the next user message in
the session becomes the response. Numeric clicks resolve immediately
via ``resolve_gateway_clarify(clarify_id, choice_text)``.
Open-ended mode (``choices`` empty/None): renders the question as
plain embed text no buttons. The gateway's text-intercept captures
the next message in this session and resolves the clarify.
"""
if not self._client or not DISCORD_AVAILABLE:
return SendResult(success=False, error="Not connected")
try:
target_id = chat_id
if metadata and metadata.get("thread_id"):
target_id = metadata["thread_id"]
channel = self._client.get_channel(int(target_id))
if not channel:
channel = await self._client.fetch_channel(int(target_id))
# Discord embed description limit is 4096; trim conservatively.
max_desc = 4088
body = str(question or "").strip()
if len(body) > max_desc:
body = body[: max_desc - 3] + "..."
embed = discord.Embed(
title="❓ Hermes needs your input",
description=body,
color=discord.Color.orange(),
)
clean_choices = [
str(c).strip() for c in (choices or []) if c is not None and str(c).strip()
]
# Discord allows up to 5 buttons per row, 5 rows per view = 25.
# We reserve one slot for the "Other" button, so cap at 24 choices.
clean_choices = clean_choices[:24]
if clean_choices:
embed.add_field(
name="Choices",
value="Pick one below, or click ✏️ Other to type a custom answer.",
inline=False,
)
view = ClarifyChoiceView(
choices=clean_choices,
clarify_id=clarify_id,
allowed_user_ids=self._allowed_user_ids,
allowed_role_ids=self._allowed_role_ids,
)
else:
embed.add_field(
name="Reply",
value="Reply in this channel with your answer.",
inline=False,
)
view = None
msg = await channel.send(embed=embed, view=view) if view else await channel.send(embed=embed)
return SendResult(success=True, message_id=str(msg.id))
except Exception as e:
logger.warning("[%s] send_clarify failed: %s", self.name, e)
return SendResult(success=False, error=str(e))
async def send_update_prompt(
self, chat_id: str, prompt: str, default: str = "",
session_key: str = "",
@@ -4264,17 +4143,6 @@ class DiscordAdapter(BasePlatformAdapter):
raw_content = message.content.strip()
normalized_content = raw_content
mention_prefix = False
snapshot_attachments = []
if hasattr(message, "message_snapshots") and message.message_snapshots:
snapshot_text_parts = []
for snap in message.message_snapshots:
if getattr(snap, "content", None):
snapshot_text_parts.append(snap.content.strip())
snapshot_attachments.extend(getattr(snap, "attachments", []) or [])
if snapshot_text_parts and not raw_content:
raw_content = "\n".join(snapshot_text_parts)
normalized_content = raw_content
if self._client.user and self._client.user in message.mentions:
mention_prefix = True
normalized_content = normalized_content.replace(f"<@{self._client.user.id}>", "").strip()
@@ -4317,15 +4185,8 @@ class DiscordAdapter(BasePlatformAdapter):
)
# Skip the mention check if the message is in a thread where
# the bot has previously participated (auto-created or replied in)
# — UNLESS thread_require_mention is enabled, in which case threads
# are gated the same as channels. Useful when multiple bots share
# a thread.
in_bot_thread = (
is_thread
and thread_id in self._threads
and not self._discord_thread_require_mention()
)
# the bot has previously participated (auto-created or replied in).
in_bot_thread = is_thread and thread_id in self._threads
if require_mention and not is_free_channel and not in_bot_thread:
if self._client.user not in message.mentions and not mention_prefix:
@@ -4338,7 +4199,7 @@ class DiscordAdapter(BasePlatformAdapter):
if not is_thread and not isinstance(message.channel, discord.DMChannel):
no_thread_channels_raw = os.getenv("DISCORD_NO_THREAD_CHANNELS", "")
no_thread_channels = {ch.strip() for ch in no_thread_channels_raw.split(",") if ch.strip()}
skip_thread = bool(channel_ids & no_thread_channels) or is_free_channel
skip_thread = bool(channel_ids & no_thread_channels)
auto_thread = os.getenv("DISCORD_AUTO_THREAD", "true").lower() in {"true", "1", "yes"}
is_reply_message = getattr(message, "type", None) == discord.MessageType.reply
if auto_thread and not skip_thread and not is_voice_linked_channel and not is_reply_message:
@@ -4350,15 +4211,13 @@ class DiscordAdapter(BasePlatformAdapter):
auto_threaded_channel = thread
self._threads.mark(thread_id)
all_attachments = list(message.attachments) + snapshot_attachments
# Determine message type
msg_type = MessageType.TEXT
if normalized_content.startswith("/"):
msg_type = MessageType.COMMAND
elif all_attachments:
elif message.attachments:
# Check attachment types
for att in all_attachments:
for att in message.attachments:
if att.content_type:
if att.content_type.startswith("image/"):
msg_type = MessageType.PHOTO
@@ -4417,7 +4276,7 @@ class DiscordAdapter(BasePlatformAdapter):
media_urls = []
media_types = []
pending_text_injection: Optional[str] = None
for att in all_attachments:
for att in message.attachments:
content_type = att.content_type or "unknown"
if content_type.startswith("image/"):
try:
@@ -5216,188 +5075,3 @@ if DISCORD_AVAILABLE:
async def on_timeout(self):
self.resolved = True
self.clear_items()
class ClarifyChoiceView(discord.ui.View):
"""Interactive button view for the clarify tool's multiple-choice prompts.
Renders one button per choice (max 24) plus a final `` Other`` button.
Picking a numeric choice resolves the gateway clarify entry immediately;
picking ``Other`` flips the entry into text-capture mode so the next
user message in the session becomes the response (the gateway's
text-intercept handles the resolution).
Auth gating mirrors ``ExecApprovalView`` only users/roles in the
Discord adapter's allowlist may answer. Single-use: after the first
valid click all buttons disable and the embed updates to show who
answered and what they chose.
"""
def __init__(
self,
choices: List[str],
clarify_id: str,
allowed_user_ids: set,
allowed_role_ids: Optional[set] = None,
):
super().__init__(timeout=300) # 5-minute timeout
self.choices = list(choices)[:24]
self.clarify_id = clarify_id
self.allowed_user_ids = allowed_user_ids
self.allowed_role_ids = allowed_role_ids or set()
self.resolved = False
for index, choice in enumerate(self.choices):
# Discord button labels are capped at 80 chars.
label_body = choice if len(choice) <= 75 else choice[:72] + "..."
button = discord.ui.Button(
label=f"{index + 1}. {label_body}",
style=discord.ButtonStyle.primary,
custom_id=f"clarify:{clarify_id}:{index}",
)
button.callback = self._make_choice_callback(index, choice)
self.add_item(button)
other_btn = discord.ui.Button(
label="✏️ Other (type answer)",
style=discord.ButtonStyle.secondary,
custom_id=f"clarify:{clarify_id}:other",
)
other_btn.callback = self._on_other
self.add_item(other_btn)
def _check_auth(self, interaction: "discord.Interaction") -> bool:
return _component_check_auth(
interaction, self.allowed_user_ids, self.allowed_role_ids,
)
def _make_choice_callback(self, index: int, choice: str):
async def _callback(interaction: "discord.Interaction"):
await self._resolve_choice(interaction, index, choice)
return _callback
async def _resolve_choice(
self,
interaction: "discord.Interaction",
index: int,
choice: str,
) -> None:
"""Resolve the clarify with a chosen option."""
if self.resolved:
await interaction.response.send_message(
"This prompt has already been answered~", ephemeral=True,
)
return
if not self._check_auth(interaction):
await interaction.response.send_message(
"You're not authorized to answer this prompt~", ephemeral=True,
)
return
self.resolved = True
for child in self.children:
child.disabled = True
embed = interaction.message.embeds[0] if (
interaction.message and interaction.message.embeds
) else None
if embed:
user = getattr(interaction, "user", None)
display_name = getattr(user, "display_name", "user")
embed.color = discord.Color.green()
embed.set_footer(text=f"Answered by {display_name}: {choice}")
try:
await interaction.response.edit_message(embed=embed, view=self)
except Exception:
logger.debug(
"Discord clarify edit_message failed for %s",
self.clarify_id,
exc_info=True,
)
try:
await interaction.response.defer()
except Exception:
pass
# Resolve via the gateway clarify primitive — same mechanism as
# Telegram. Look up the canonical choice text from the entry so
# we round-trip the original value, not a button-label variant.
resolved_text: Optional[str] = None
try:
from tools.clarify_gateway import _entries as _clarify_entries # type: ignore
entry = _clarify_entries.get(self.clarify_id)
if entry and entry.choices and 0 <= index < len(entry.choices):
resolved_text = entry.choices[index]
except Exception:
resolved_text = None
if resolved_text is None:
resolved_text = choice
try:
from tools.clarify_gateway import resolve_gateway_clarify
resolved = resolve_gateway_clarify(self.clarify_id, resolved_text)
logger.info(
"Discord clarify button resolved (id=%s, choice=%r, user=%s, ok=%s)",
self.clarify_id, resolved_text,
getattr(getattr(interaction, "user", None), "display_name", "?"),
resolved,
)
except Exception as exc:
logger.error(
"Discord clarify resolve_gateway_clarify failed (id=%s): %s",
self.clarify_id, exc,
)
async def _on_other(self, interaction: "discord.Interaction") -> None:
"""Flip the clarify entry into text-capture mode."""
if self.resolved:
await interaction.response.send_message(
"This prompt has already been answered~", ephemeral=True,
)
return
if not self._check_auth(interaction):
await interaction.response.send_message(
"You're not authorized to answer this prompt~", ephemeral=True,
)
return
# Don't pop the entry — the gateway's text-intercept needs it
# until the user actually types. Just mark it as awaiting text
# and disable the buttons so the user can't double-click.
try:
from tools.clarify_gateway import mark_awaiting_text
mark_awaiting_text(self.clarify_id)
except Exception as exc:
logger.warning(
"Discord clarify mark_awaiting_text failed (id=%s): %s",
self.clarify_id, exc,
)
self.resolved = True
for child in self.children:
child.disabled = True
embed = interaction.message.embeds[0] if (
interaction.message and interaction.message.embeds
) else None
if embed:
user = getattr(interaction, "user", None)
display_name = getattr(user, "display_name", "user")
embed.color = discord.Color.blue()
embed.set_footer(
text=f"Awaiting typed response from {display_name}",
)
try:
await interaction.response.edit_message(embed=embed, view=self)
except Exception:
try:
await interaction.response.defer()
except Exception:
pass
async def on_timeout(self):
self.resolved = True
for child in self.children:
child.disabled = True
+4 -61
View File
@@ -1300,12 +1300,12 @@ def _run_official_feishu_ws_client(ws_client: Any, adapter: Any) -> None:
except Exception:
logger.debug("[Feishu] Failed to apply websocket runtime overrides", exc_info=True)
def _connect_with_overrides(*args: Any, **kwargs: Any) -> Any:
async def _connect_with_overrides(*args: Any, **kwargs: Any) -> Any:
if adapter._ws_ping_interval is not None and "ping_interval" not in kwargs:
kwargs["ping_interval"] = adapter._ws_ping_interval
if adapter._ws_ping_timeout is not None and "ping_timeout" not in kwargs:
kwargs["ping_timeout"] = adapter._ws_ping_timeout
return original_connect(*args, **kwargs)
return await original_connect(*args, **kwargs)
def _configure_with_overrides(conf: Any) -> Any:
if original_configure is None:
@@ -1343,65 +1343,8 @@ def _run_official_feishu_ws_client(ws_client: Any, adapter: Any) -> None:
def check_feishu_requirements() -> bool:
"""Check if Feishu/Lark dependencies are available.
Lazy-installs lark-oapi via ``tools.lazy_deps.ensure("platform.feishu")``
on first call if not present. Rebinds all module-level globals on success.
"""
if FEISHU_AVAILABLE:
return True
def _import():
import lark_oapi as lark
from lark_oapi.api.application.v6 import GetApplicationRequest
from lark_oapi.api.im.v1 import (
CreateFileRequest, CreateFileRequestBody,
CreateImageRequest, CreateImageRequestBody,
CreateMessageRequest, CreateMessageRequestBody,
GetChatRequest, GetMessageRequest, GetMessageResourceRequest,
P2ImMessageMessageReadV1,
ReplyMessageRequest, ReplyMessageRequestBody,
UpdateMessageRequest, UpdateMessageRequestBody,
)
from lark_oapi.core import AccessTokenType, HttpMethod
from lark_oapi.core.const import FEISHU_DOMAIN, LARK_DOMAIN
from lark_oapi.core.model import BaseRequest
from lark_oapi.event.callback.model.p2_card_action_trigger import (
CallBackCard, P2CardActionTriggerResponse,
)
from lark_oapi.event.dispatcher_handler import EventDispatcherHandler
from lark_oapi.ws import Client as FeishuWSClient
return {
"lark": lark,
"GetApplicationRequest": GetApplicationRequest,
"CreateFileRequest": CreateFileRequest,
"CreateFileRequestBody": CreateFileRequestBody,
"CreateImageRequest": CreateImageRequest,
"CreateImageRequestBody": CreateImageRequestBody,
"CreateMessageRequest": CreateMessageRequest,
"CreateMessageRequestBody": CreateMessageRequestBody,
"GetChatRequest": GetChatRequest,
"GetMessageRequest": GetMessageRequest,
"GetMessageResourceRequest": GetMessageResourceRequest,
"P2ImMessageMessageReadV1": P2ImMessageMessageReadV1,
"ReplyMessageRequest": ReplyMessageRequest,
"ReplyMessageRequestBody": ReplyMessageRequestBody,
"UpdateMessageRequest": UpdateMessageRequest,
"UpdateMessageRequestBody": UpdateMessageRequestBody,
"AccessTokenType": AccessTokenType,
"HttpMethod": HttpMethod,
"FEISHU_DOMAIN": FEISHU_DOMAIN,
"LARK_DOMAIN": LARK_DOMAIN,
"BaseRequest": BaseRequest,
"CallBackCard": CallBackCard,
"P2CardActionTriggerResponse": P2CardActionTriggerResponse,
"EventDispatcherHandler": EventDispatcherHandler,
"FeishuWSClient": FeishuWSClient,
"FEISHU_AVAILABLE": True,
}
from tools.lazy_deps import ensure_and_bind
return ensure_and_bind("platform.feishu", _import, globals(), prompt=False)
"""Check if Feishu/Lark dependencies are available."""
return FEISHU_AVAILABLE
class FeishuAdapter(BasePlatformAdapter):
+5 -30
View File
@@ -224,11 +224,7 @@ def _check_e2ee_deps() -> bool:
def check_matrix_requirements() -> bool:
"""Return True if the Matrix adapter can be used.
Lazy-installs mautrix via ``tools.lazy_deps.ensure("platform.matrix")``
on first call if not present. Rebinds all module-level type globals on success.
"""
"""Return True if the Matrix adapter can be used."""
token = os.getenv("MATRIX_ACCESS_TOKEN", "")
password = os.getenv("MATRIX_PASSWORD", "")
homeserver = os.getenv("MATRIX_HOMESERVER", "")
@@ -242,31 +238,10 @@ def check_matrix_requirements() -> bool:
try:
import mautrix # noqa: F401
except ImportError:
def _import():
from mautrix.types import (
ContentURI, EventID, EventType, PaginationDirection,
PresenceState, RoomCreatePreset, RoomID, SyncToken,
TrustState, UserID,
)
return {
"ContentURI": ContentURI,
"EventID": EventID,
"EventType": EventType,
"PaginationDirection": PaginationDirection,
"PresenceState": PresenceState,
"RoomCreatePreset": RoomCreatePreset,
"RoomID": RoomID,
"SyncToken": SyncToken,
"TrustState": TrustState,
"UserID": UserID,
}
from tools.lazy_deps import ensure_and_bind
if not ensure_and_bind("platform.matrix", _import, globals(), prompt=False):
logger.warning(
"Matrix: mautrix not installed. Run: pip install 'mautrix[encryption]'"
)
return False
logger.warning(
"Matrix: mautrix not installed. Run: pip install 'mautrix[encryption]'"
)
return False
# If encryption is requested, verify E2EE deps are available at startup
# rather than silently degrading to plaintext-only at connect time.
+2 -27
View File
@@ -176,28 +176,6 @@ class QQAdapter(BasePlatformAdapter):
fut.set_exception(RuntimeError(reason))
self._pending_responses.clear()
def _mark_transport_disconnected(self) -> None:
"""Mark QQ WS down without stopping the reconnect loop.
BasePlatformAdapter uses _running for both process lifecycle and
connection status. QQBot needs to keep the listener task alive across
transient transport drops so it can continue reconnect attempts after a
short-lived gateway or network failure.
"""
if self.has_fatal_error:
return
self._write_runtime_status_safe(
"disconnected",
platform_state="disconnected",
error_code=None,
error_message=None,
)
@property
def is_connected(self) -> bool:
"""Return True only when the QQ WebSocket transport is usable."""
return bool(self._running and self._ws and not self._ws.closed)
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.QQBOT)
@@ -531,7 +509,7 @@ class QQAdapter(BasePlatformAdapter):
else:
quick_disconnect_count = 0
self._mark_transport_disconnected()
self._mark_disconnected()
self._fail_pending("Connection closed")
# Stop reconnecting for fatal codes
@@ -553,7 +531,6 @@ class QQAdapter(BasePlatformAdapter):
RATE_LIMIT_DELAY,
)
if backoff_idx >= MAX_RECONNECT_ATTEMPTS:
self._mark_disconnected()
return
await asyncio.sleep(RATE_LIMIT_DELAY)
if await self._reconnect(backoff_idx):
@@ -607,19 +584,17 @@ class QQAdapter(BasePlatformAdapter):
backoff_idx += 1
if backoff_idx >= MAX_RECONNECT_ATTEMPTS:
logger.error("[%s] Max reconnect attempts reached (QQCloseError)", self._log_tag)
self._mark_disconnected()
return
except Exception as exc:
if not self._running:
return
logger.warning("[%s] WebSocket error: %s", self._log_tag, exc)
self._mark_transport_disconnected()
self._mark_disconnected()
self._fail_pending("Connection interrupted")
if backoff_idx >= MAX_RECONNECT_ATTEMPTS:
logger.error("[%s] Max reconnect attempts reached", self._log_tag)
self._mark_disconnected()
return
if await self._reconnect(backoff_idx):
+1 -3
View File
@@ -446,9 +446,7 @@ class SignalAdapter(BasePlatformAdapter):
if sent_msg and isinstance(sent_msg, dict):
dest = sent_msg.get("destinationNumber") or sent_msg.get("destination")
sent_ts = sent_msg.get("timestamp")
sent_msg_group_info = sent_msg.get("groupInfo") or {}
sent_msg_group_id = sent_msg_group_info.get("groupId") if sent_msg_group_info else None
if dest == self._account_normalized or sent_msg_group_id:
if dest == self._account_normalized:
# Check if this is an echo of our own outbound reply
if sent_ts and sent_ts in self._recent_sent_timestamps:
self._recent_sent_timestamps.discard(sent_ts)
+2 -43
View File
@@ -73,29 +73,8 @@ class _ThreadContextCache:
def check_slack_requirements() -> bool:
"""Check if Slack dependencies are available.
Lazy-installs slack-bolt/slack-sdk via ``tools.lazy_deps.ensure("platform.slack")``
on first call if not present. Rebinds all module-level globals on success.
"""
if SLACK_AVAILABLE:
return True
def _import():
from slack_bolt.async_app import AsyncApp
from slack_bolt.adapter.socket_mode.async_handler import AsyncSocketModeHandler
from slack_sdk.web.async_client import AsyncWebClient
import aiohttp
return {
"AsyncApp": AsyncApp,
"AsyncSocketModeHandler": AsyncSocketModeHandler,
"AsyncWebClient": AsyncWebClient,
"aiohttp": aiohttp,
"SLACK_AVAILABLE": True,
}
from tools.lazy_deps import ensure_and_bind
return ensure_and_bind("platform.slack", _import, globals(), prompt=False)
"""Check if Slack dependencies are available."""
return SLACK_AVAILABLE
def _extract_text_from_slack_blocks(blocks: list) -> str:
@@ -1798,26 +1777,6 @@ class SlackAdapter(BasePlatformAdapter):
return
original_text = event.get("text", "")
# Slack blocks native slash commands inside threads ("/queue is not
# supported in threads. Sorry!"). As a workaround, recognise a
# leading ``!`` as an alternate command prefix and rewrite it to
# ``/`` so the rest of the pipeline (MessageType.COMMAND tagging,
# gateway dispatcher) handles it like a normal slash command. Only
# rewrite when the first token resolves to a known gateway command
# so casual messages like "!nice work" pass through unchanged.
if original_text.startswith("!"):
try:
from hermes_cli.commands import is_gateway_known_command
first_token = original_text[1:].split(maxsplit=1)[0]
# Strip "@suffix" the same way get_command() does, so
# forms like ``!stop@hermes`` still resolve.
cmd_name = first_token.split("@", 1)[0].lower()
if cmd_name and "/" not in cmd_name and is_gateway_known_command(cmd_name):
original_text = "/" + original_text[1:]
except Exception: # pragma: no cover - defensive
pass
text = original_text
# Extract quoted/forwarded content from Slack blocks.
+38 -317
View File
@@ -103,58 +103,8 @@ _TELEGRAM_IMAGE_EXT_TO_MIME = {
def check_telegram_requirements() -> bool:
"""Check if Telegram dependencies are available.
If python-telegram-bot is missing, attempts to lazy-install it via
``tools.lazy_deps.ensure("platform.telegram")``. After a successful
install, re-imports the SDK and flips ``TELEGRAM_AVAILABLE`` to True
so the adapter's class-level type aliases get rebound.
"""
global TELEGRAM_AVAILABLE, Update, Bot, Message, InlineKeyboardButton
global InlineKeyboardMarkup, LinkPreviewOptions, Application
global CommandHandler, CallbackQueryHandler, TelegramMessageHandler
global ContextTypes, filters, ParseMode, ChatType, HTTPXRequest
if TELEGRAM_AVAILABLE:
return True
try:
from tools.lazy_deps import ensure as _lazy_ensure
_lazy_ensure("platform.telegram", prompt=False)
except Exception:
return False
try:
from telegram import Update as _Update, Bot as _Bot, Message as _Message
from telegram import InlineKeyboardButton as _IKB, InlineKeyboardMarkup as _IKM
try:
from telegram import LinkPreviewOptions as _LPO
except ImportError:
_LPO = None
from telegram.ext import (
Application as _App, CommandHandler as _CH,
CallbackQueryHandler as _CQH,
MessageHandler as _MH,
ContextTypes as _CT, filters as _filters,
)
from telegram.constants import ParseMode as _PM, ChatType as _CtT
from telegram.request import HTTPXRequest as _HR
except ImportError:
return False
Update = _Update
Bot = _Bot
Message = _Message
InlineKeyboardButton = _IKB
InlineKeyboardMarkup = _IKM
LinkPreviewOptions = _LPO
Application = _App
CommandHandler = _CH
CallbackQueryHandler = _CQH
TelegramMessageHandler = _MH
ContextTypes = _CT
filters = _filters
ParseMode = _PM
ChatType = _CtT
HTTPXRequest = _HR
TELEGRAM_AVAILABLE = True
return True
"""Check if Telegram dependencies are available."""
return TELEGRAM_AVAILABLE
# Matches every character that MarkdownV2 requires to be backslash-escaped
@@ -332,13 +282,6 @@ class TelegramAdapter(BasePlatformAdapter):
MEDIA_GROUP_WAIT_SECONDS = 0.8
_GENERAL_TOPIC_THREAD_ID = "1"
# Telegram's edit_message applies MarkdownV2 formatting only on the
# finalize=True path. Without this flag, stream_consumer._send_or_edit
# short-circuits when the raw text is unchanged between the last streamed
# edit and the final edit, skipping the plain-text → MarkdownV2 conversion.
# Fixes #25710.
REQUIRES_EDIT_FINALIZE: bool = True
# Adaptive text-batch ingress: short messages need a tighter delay so the
# first token reaches the agent fast. Numbers tuned for "feels instant":
# ≤320 codepoints (one short paragraph) settles in ~180ms; ≤1024
@@ -434,9 +377,6 @@ class TelegramAdapter(BasePlatformAdapter):
# Slash-confirm button state: confirm_id → session_key (for /reload-mcp
# and any other slash-confirm prompts; see GatewayRunner._request_slash_confirm).
self._slash_confirm_state: Dict[str, str] = {}
# Clarify button state: clarify_id → session_key (for the clarify tool's
# multiple-choice prompts; see GatewayRunner clarify_callback wiring).
self._clarify_state: Dict[str, str] = {}
# Notification mode for message sends.
# "important" — only final responses, approvals, and slash confirmations
# trigger notifications; tool progress, streaming, status
@@ -2077,7 +2017,7 @@ class TelegramAdapter(BasePlatformAdapter):
return SendResult(success=False, error="Not connected")
try:
default_hint = f" (default: {default})" if default else ""
text = self.format_message(f"⚕ *Update needs your input:*\n\n{prompt}{default_hint}")
text = f"⚕ *Update needs your input:*\n\n{prompt}{default_hint}"
keyboard = InlineKeyboardMarkup([
[
InlineKeyboardButton("✓ Yes", callback_data="update_prompt:y"),
@@ -2089,7 +2029,7 @@ class TelegramAdapter(BasePlatformAdapter):
msg = await self._send_message_with_thread_fallback(
chat_id=int(chat_id),
text=text,
parse_mode=ParseMode.MARKDOWN_V2,
parse_mode=ParseMode.MARKDOWN,
reply_markup=keyboard,
reply_to_message_id=reply_to_id,
**self._thread_kwargs_for_send(
@@ -2225,80 +2165,6 @@ class TelegramAdapter(BasePlatformAdapter):
logger.warning("[%s] send_slash_confirm failed: %s", self.name, e)
return SendResult(success=False, error=str(e))
async def send_clarify(
self,
chat_id: str,
question: str,
choices: Optional[list],
clarify_id: str,
session_key: str,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Render a clarify prompt with one inline button per choice.
Multi-choice mode (``choices`` non-empty): renders one button per
option plus a final "✏️ Other (type answer)" button. Picking the
"Other" button flips the entry into text-capture mode so the next
message becomes the response.
Open-ended mode (``choices`` empty): renders the question as plain
text no buttons. The next message in the session is captured by
the gateway's text-intercept and resolves the clarify.
"""
if not self._bot:
return SendResult(success=False, error="Not connected")
try:
text = f"{_html.escape(question)}"
thread_id = self._metadata_thread_id(metadata)
kwargs: Dict[str, Any] = {
"chat_id": int(chat_id),
"text": text,
"parse_mode": ParseMode.HTML,
**self._link_preview_kwargs(),
}
if choices:
# Telegram caps callback_data at 64 bytes; keep "cl:<id>:<idx>"
# short. Button label is also capped (~64 chars in practice).
rows = []
for idx, choice in enumerate(choices):
label = str(choice)
if len(label) > 60:
label = label[:57] + "..."
rows.append([
InlineKeyboardButton(
f"{idx + 1}. {label}",
callback_data=f"cl:{clarify_id}:{idx}",
)
])
rows.append([
InlineKeyboardButton(
"✏️ Other (type answer)",
callback_data=f"cl:{clarify_id}:other",
)
])
kwargs["reply_markup"] = InlineKeyboardMarkup(rows)
reply_to_id = self._reply_to_message_id_for_send(None, metadata)
kwargs["reply_to_message_id"] = reply_to_id
kwargs.update(
self._thread_kwargs_for_send(
chat_id,
thread_id,
metadata,
reply_to_message_id=reply_to_id,
)
)
msg = await self._send_message_with_thread_fallback(**kwargs)
self._clarify_state[clarify_id] = session_key
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
logger.warning("[%s] send_clarify failed: %s", self.name, e)
return SendResult(success=False, error=str(e))
async def send_model_picker(
self,
chat_id: str,
@@ -2341,13 +2207,11 @@ class TelegramAdapter(BasePlatformAdapter):
keyboard = InlineKeyboardMarkup(rows)
provider_label = get_label(current_provider)
text = self.format_message(
(
f"⚙ *Model Configuration*\n\n"
f"Current model: `{current_model or 'unknown'}`\n"
f"Provider: {provider_label}\n\n"
f"Select a provider:"
)
text = (
f"⚙ *Model Configuration*\n\n"
f"Current model: `{current_model or 'unknown'}`\n"
f"Provider: {provider_label}\n\n"
f"Select a provider:"
)
thread_id = metadata.get("thread_id") if metadata else None
@@ -2355,7 +2219,7 @@ class TelegramAdapter(BasePlatformAdapter):
msg = await self._send_message_with_thread_fallback(
chat_id=int(chat_id),
text=text,
parse_mode=ParseMode.MARKDOWN_V2,
parse_mode=ParseMode.MARKDOWN,
reply_markup=keyboard,
reply_to_message_id=reply_to_id,
**self._thread_kwargs_for_send(
@@ -2465,14 +2329,12 @@ class TelegramAdapter(BasePlatformAdapter):
extra = f"\n_{total - shown} more available — type `/model <name>` directly_" if total > shown else ""
await query.edit_message_text(
text=self.format_message(
(
f"⚙ *Model Configuration*\n\n"
f"Provider: *{pname}*{page_info}\n"
f"Select a model:{extra}"
)
text=(
f"⚙ *Model Configuration*\n\n"
f"Provider: *{pname}*{page_info}\n"
f"Select a model:{extra}"
),
parse_mode=ParseMode.MARKDOWN_V2,
parse_mode=ParseMode.MARKDOWN,
reply_markup=keyboard,
)
await query.answer()
@@ -2501,14 +2363,12 @@ class TelegramAdapter(BasePlatformAdapter):
extra = f"\n_{total - shown} more available — type `/model <name>` directly_" if total > shown else ""
await query.edit_message_text(
text=self.format_message(
(
f"⚙ *Model Configuration*\n\n"
f"Provider: *{pname}*{page_info}\n"
f"Select a model:{extra}"
)
text=(
f"⚙ *Model Configuration*\n\n"
f"Provider: *{pname}*{page_info}\n"
f"Select a model:{extra}"
),
parse_mode=ParseMode.MARKDOWN_V2,
parse_mode=ParseMode.MARKDOWN,
reply_markup=keyboard,
)
await query.answer()
@@ -2543,8 +2403,8 @@ class TelegramAdapter(BasePlatformAdapter):
# Edit message to show confirmation, remove buttons
try:
await query.edit_message_text(
text=self.format_message(result_text),
parse_mode=ParseMode.MARKDOWN_V2,
text=result_text,
parse_mode=ParseMode.MARKDOWN,
reply_markup=None,
)
except Exception:
@@ -2584,15 +2444,13 @@ class TelegramAdapter(BasePlatformAdapter):
provider_label = state["current_provider"]
await query.edit_message_text(
text=self.format_message(
(
f"⚙ *Model Configuration*\n\n"
f"Current model: `{state['current_model'] or 'unknown'}`\n"
f"Provider: {provider_label}\n\n"
f"Select a provider:"
)
text=(
f"⚙ *Model Configuration*\n\n"
f"Current model: `{state['current_model'] or 'unknown'}`\n"
f"Provider: {provider_label}\n\n"
f"Select a provider:"
),
parse_mode=ParseMode.MARKDOWN_V2,
parse_mode=ParseMode.MARKDOWN,
reply_markup=keyboard,
)
await query.answer()
@@ -2675,8 +2533,8 @@ class TelegramAdapter(BasePlatformAdapter):
# Edit message to show decision, remove buttons
try:
await query.edit_message_text(
text=self.format_message(f"{label} by {user_display}"),
parse_mode=ParseMode.MARKDOWN_V2,
text=f"{label} by {user_display}",
parse_mode=ParseMode.MARKDOWN,
reply_markup=None,
)
except Exception:
@@ -2729,8 +2587,8 @@ class TelegramAdapter(BasePlatformAdapter):
try:
await query.edit_message_text(
text=self.format_message(f"{label} by {user_display}"),
parse_mode=ParseMode.MARKDOWN_V2,
text=f"{label} by {user_display}",
parse_mode=ParseMode.MARKDOWN,
reply_markup=None,
)
except Exception:
@@ -2755,8 +2613,8 @@ class TelegramAdapter(BasePlatformAdapter):
prompt_message_id = getattr(query.message, "message_id", None)
send_kwargs: Dict[str, Any] = {
"chat_id": int(query.message.chat_id),
"text": self.format_message(result_text),
"parse_mode": ParseMode.MARKDOWN_V2,
"text": result_text,
"parse_mode": ParseMode.MARKDOWN,
**self._link_preview_kwargs(),
}
chat_type_value = getattr(chat_type, "value", chat_type)
@@ -2787,116 +2645,11 @@ class TelegramAdapter(BasePlatformAdapter):
{"thread_id": str(thread_id)},
)
)
await self._send_message_with_thread_fallback(**send_kwargs)
await self._bot.send_message(**send_kwargs)
except Exception as exc:
logger.error("[%s] slash-confirm callback failed: %s", self.name, exc, exc_info=True)
return
# --- Clarify callbacks (cl:clarify_id:idx | cl:clarify_id:other) ---
if data.startswith("cl:"):
parts = data.split(":", 2)
if len(parts) == 3:
clarify_id = parts[1]
choice_token = parts[2]
caller_id = str(getattr(query.from_user, "id", ""))
if not self._is_callback_user_authorized(
caller_id,
chat_id=query_chat_id,
chat_type=str(query_chat_type) if query_chat_type is not None else None,
thread_id=str(query_thread_id) if query_thread_id is not None else None,
user_name=query_user_name,
):
await query.answer(text="⛔ You are not authorized to answer this prompt.")
return
session_key = self._clarify_state.get(clarify_id)
if not session_key:
await query.answer(text="This prompt has already been resolved.")
return
user_display = getattr(query.from_user, "first_name", "User")
if choice_token == "other":
# Flip into text-capture mode and tell the user to type
# their answer. The gateway's text-intercept will pick
# up the next message in this session and resolve the
# clarify. Do NOT pop _clarify_state yet — we still
# need it if the user is slow to respond and the entry
# is cleared by something else.
try:
from tools.clarify_gateway import mark_awaiting_text
mark_awaiting_text(clarify_id)
except Exception as exc:
logger.warning("[%s] mark_awaiting_text failed: %s", self.name, exc)
await query.answer(text="✏️ Type your answer in the chat.")
try:
await query.edit_message_text(
text=f"{query.message.text or ''}\n\n<i>Awaiting typed response from {_html.escape(user_display)}…</i>",
parse_mode=ParseMode.HTML,
reply_markup=None,
)
except Exception:
pass
return
# Numeric choice → resolve immediately with the chosen text
try:
idx = int(choice_token)
except (ValueError, TypeError):
await query.answer(text="Invalid choice.")
return
# Look up the choice text from the entry registered in the
# clarify primitive. Fall back to the index if the entry
# has been cleaned up (race with timeout / session reset).
resolved_text: Optional[str] = None
try:
from tools.clarify_gateway import _entries as _clarify_entries # type: ignore
entry = _clarify_entries.get(clarify_id)
if entry and entry.choices and 0 <= idx < len(entry.choices):
resolved_text = entry.choices[idx]
except Exception:
resolved_text = None
if resolved_text is None:
# Race: entry vanished. Echo the index as a number so
# the agent at least sees an intentional response
# rather than nothing.
resolved_text = f"choice {idx + 1}"
# Pop state and resolve
self._clarify_state.pop(clarify_id, None)
try:
from tools.clarify_gateway import resolve_gateway_clarify
resolved = resolve_gateway_clarify(clarify_id, resolved_text)
except Exception as exc:
logger.error("[%s] resolve_gateway_clarify failed: %s", self.name, exc)
resolved = False
await query.answer(text=f"{resolved_text[:60]}")
try:
await query.edit_message_text(
text=f"{_html.escape(query.message.text or '')}\n\n<b>{_html.escape(user_display)}:</b> {_html.escape(resolved_text)}",
parse_mode=ParseMode.HTML,
reply_markup=None,
)
except Exception:
pass
if resolved:
logger.info(
"Telegram clarify button resolved (id=%s, choice=%r, user=%s)",
clarify_id, resolved_text, user_display,
)
else:
logger.warning(
"Telegram clarify button: resolve_gateway_clarify returned False (id=%s)",
clarify_id,
)
return
# --- Update prompt callbacks ---
if not data.startswith("update_prompt:"):
return
@@ -2916,8 +2669,8 @@ class TelegramAdapter(BasePlatformAdapter):
label = "Yes" if answer == "y" else "No"
try:
await query.edit_message_text(
text=self.format_message(f"⚕ Update prompt answered: *{label}*"),
parse_mode=ParseMode.MARKDOWN_V2,
text=f"⚕ Update prompt answered: *{label}*",
parse_mode=ParseMode.MARKDOWN,
reply_markup=None,
)
except Exception:
@@ -4776,27 +4529,6 @@ class TelegramAdapter(BasePlatformAdapter):
logger.debug("[%s] set_message_reaction failed (%s): %s", self.name, emoji, e)
return False
async def _clear_reactions(self, chat_id: str, message_id: str) -> bool:
"""Clear all reactions from a Telegram message.
Calling ``set_message_reaction`` with ``reaction=None`` (or an empty
sequence) is the documented Bot API way to remove all bot-set
reactions on a message equivalent to Bot API 10.0's
``deleteMessageReaction`` but supported in PTB 22.6 already.
"""
if not self._bot:
return False
try:
await self._bot.set_message_reaction(
chat_id=int(chat_id),
message_id=int(message_id),
reaction=None,
)
return True
except Exception as e:
logger.debug("[%s] clear reactions failed: %s", self.name, e)
return False
async def on_processing_start(self, event: MessageEvent) -> None:
"""Add an in-progress reaction when message processing begins."""
if not self._reactions_enabled():
@@ -4811,23 +4543,12 @@ class TelegramAdapter(BasePlatformAdapter):
Unlike Discord (additive reactions), Telegram's set_message_reaction
replaces all existing reactions in one call no remove step needed.
On CANCELLED outcomes (e.g. the user runs ``/stop``, or a session is
interrupted mid-flight), we explicitly clear the 👀 in-progress
reaction so it doesn't linger on the user's message indefinitely.
Without this clear, the only way to remove the 👀 was to wait for
another agent run to swap it to 👍/👎 which never happens if the
cancellation was the last activity in the chat.
"""
if not self._reactions_enabled():
return
chat_id = getattr(event.source, "chat_id", None)
message_id = getattr(event, "message_id", None)
if not (chat_id and message_id):
return
if outcome == ProcessingOutcome.CANCELLED:
await self._clear_reactions(chat_id, message_id)
else:
if chat_id and message_id and outcome != ProcessingOutcome.CANCELLED:
await self._set_reaction(
chat_id,
message_id,
-1
View File
@@ -345,7 +345,6 @@ class WeComAdapter(BasePlatformAdapter):
try:
await self._open_connection()
backoff_idx = 0
self._mark_connected()
logger.info("[%s] Reconnected", self.name)
except Exception as reconnect_exc:
logger.warning("[%s] Reconnect failed: %s", self.name, reconnect_exc)
+2 -32
View File
@@ -322,26 +322,6 @@ class WhatsAppAdapter(BasePlatformAdapter):
return {str(part).strip() for part in raw if str(part).strip()}
return {part.strip() for part in str(raw).split(",") if part.strip()}
@staticmethod
def _is_broadcast_chat(chat_id: str) -> bool:
"""True for WhatsApp pseudo-chats that aren't real conversations.
Covers Status updates (Stories) and Channel/Newsletter broadcasts.
These show up as inbound messages on Baileys but the agent should
never reply answering a Story update spams the contact's status
feed, and Channel posts aren't addressable in the first place.
"""
if not chat_id:
return False
cid = chat_id.strip().lower()
if cid == "status@broadcast":
return True
# @broadcast suffix covers status@broadcast plus any future
# broadcast-list variants. @newsletter is the Channel JID suffix.
if cid.endswith("@broadcast") or cid.endswith("@newsletter"):
return True
return False
def _is_dm_allowed(self, sender_id: str) -> bool:
"""Check whether a DM from the given sender should be processed."""
if self._dm_policy == "disabled":
@@ -452,16 +432,9 @@ class WhatsAppAdapter(BasePlatformAdapter):
return cleaned.strip() or text
def _should_process_message(self, data: Dict[str, Any]) -> bool:
chat_id_raw = str(data.get("chatId") or "")
# WhatsApp uses pseudo-chats for Status updates (Stories) and
# Channel/Newsletter broadcasts. These are not real conversations
# and the agent should never reply to them — even in self-chat mode
# where the bridge may surface them as "fromMe" events.
if self._is_broadcast_chat(chat_id_raw):
return False
is_group = data.get("isGroup", False)
if is_group:
chat_id = chat_id_raw
chat_id = str(data.get("chatId") or "")
if not self._is_group_allowed(chat_id):
return False
else:
@@ -521,15 +494,12 @@ class WhatsAppAdapter(BasePlatformAdapter):
# plain executable path.
_npm_bin = shutil.which("npm") or "npm"
try:
# Read timeout from environment variable, default to 300 seconds (5 minutes)
# to accommodate slower systems like Unraid NAS
npm_install_timeout = int(os.environ.get("WHATSAPP_NPM_INSTALL_TIMEOUT", "300"))
install_result = subprocess.run(
[_npm_bin, "install", "--silent"],
cwd=str(bridge_dir),
capture_output=True,
text=True,
timeout=npm_install_timeout,
timeout=60,
)
if install_result.returncode != 0:
print(f"[{self.name}] npm install failed: {install_result.stderr}")
+4 -318
View File
@@ -1139,38 +1139,6 @@ def _should_clear_resume_pending_after_turn(agent_result: dict) -> bool:
return True
def _preserve_queued_followup_history_offset(
current_result: dict,
followup_result: dict,
) -> dict:
"""Carry the outer history offset through queued follow-up drains.
``_process_message_background()`` persists transcript rows only once, after the
entire in-band queued-follow-up chain returns. Each recursive ``_run_agent()``
call advances ``history_offset`` to the history it received, so without
correction the outermost persistence step sees only the *last* queued turn as
"new" and silently drops earlier turns from the same drain chain.
Preserve the earliest (outermost) history offset so the final transcript slice
still includes every queued turn that ran during the chain.
"""
if not isinstance(followup_result, dict):
return followup_result
if not isinstance(current_result, dict):
return followup_result
current_offset = current_result.get("history_offset")
followup_offset = followup_result.get("history_offset")
if not isinstance(current_offset, int):
return followup_result
if isinstance(followup_offset, int) and followup_offset <= current_offset:
return followup_result
merged = dict(followup_result)
merged["history_offset"] = current_offset
return merged
class GatewayRunner:
"""
Main gateway controller.
@@ -3307,30 +3275,6 @@ class GatewayRunner:
write_runtime_status(gateway_state="starting", exit_reason=None)
except Exception:
pass
# Log any active supply-chain security advisories. Operators see this
# in gateway.log and `hermes status` surfaces it; we do NOT block
# startup or surface it inline to user messages, since the gateway
# operator is the one who can act on it (uninstall the package,
# rotate credentials). See hermes_cli/security_advisories.py.
try:
from hermes_cli.security_advisories import (
detect_compromised,
gateway_log_message,
)
_adv_hits = detect_compromised()
_adv_msg = gateway_log_message(_adv_hits)
if _adv_msg:
logger.warning("%s", _adv_msg)
logger.warning(
"Run `hermes doctor` on the gateway host for full "
"remediation steps."
)
except Exception:
logger.debug(
"security advisory check failed at gateway startup",
exc_info=True,
)
# Warn if no user allowlists are configured and open access is not opted in
_builtin_allowed_vars = (
@@ -5860,37 +5804,6 @@ class GatewayRunner:
)
_update_prompts.pop(_quick_key, None)
# Intercept messages that are responses to a pending clarify
# request that is awaiting free-form text (either an open-ended
# clarify with no choices, or one where the user picked the
# "Other" button). The first non-empty user message in the
# session resolves the clarify and unblocks the agent thread —
# we do NOT route it to the agent as a new turn.
try:
from tools import clarify_gateway as _clarify_mod
_pending_clarify = _clarify_mod.get_pending_for_session(_quick_key)
except Exception:
_pending_clarify = None
if _pending_clarify is not None:
_raw_clarify_reply = (event.text or "").strip()
# Skip slash commands — the user clearly wanted to issue a
# command, not answer the clarify. Leave the clarify pending
# so the user can retry; if it times out, the agent unblocks
# with an empty response.
if _raw_clarify_reply and not _raw_clarify_reply.startswith("/"):
_resolved = _clarify_mod.resolve_gateway_clarify(
_pending_clarify.clarify_id, _raw_clarify_reply,
)
if _resolved:
logger.info(
"Gateway intercepted clarify text response (session=%s, id=%s)",
_quick_key, _pending_clarify.clarify_id,
)
# Acknowledge with empty string so adapters that emit
# the agent's response don't double-post. The agent
# itself will produce the next user-facing message.
return ""
# Intercept messages that are responses to a pending /reload-mcp
# (or future) slash-confirm prompt. Recognized confirm replies are
# /approve, /always, /cancel (plus short aliases). Anything else
@@ -6128,12 +6041,6 @@ class GatewayRunner:
if _cmd_def_inner and _cmd_def_inner.name == "model":
return "Agent is running — wait or /stop first, then switch models."
# /codex-runtime must not be used while the agent is running.
# Switching mid-turn would split a turn across two transports.
if _cmd_def_inner and _cmd_def_inner.name == "codex-runtime":
return ("Agent is running — wait or /stop first, then "
"change runtime.")
# /approve and /deny must bypass the running-agent interrupt path.
# The agent thread is blocked on a threading.Event inside
# tools/approval.py — sending an interrupt won't unblock it.
@@ -6173,12 +6080,6 @@ class GatewayRunner:
return await self._handle_goal_command(event)
return "Agent is running — use /goal status / pause / clear mid-run, or /stop before setting a new goal."
# /subgoal is safe mid-run — it only modifies the goal's
# subgoals list, which the judge reads at the next turn
# boundary. No race with the running turn.
if _cmd_def_inner and _cmd_def_inner.name == "subgoal":
return await self._handle_subgoal_command(event)
# Session-level toggles that are safe to run mid-agent —
# /yolo can unblock a pending approval prompt, /verbose cycles
# the tool-progress display mode for the ongoing stream.
@@ -6474,9 +6375,6 @@ class GatewayRunner:
if canonical == "model":
return await self._handle_model_command(event)
if canonical == "codex-runtime":
return await self._handle_codex_runtime_command(event)
if canonical == "personality":
return await self._handle_personality_command(event)
@@ -6560,9 +6458,6 @@ class GatewayRunner:
if canonical == "goal":
return await self._handle_goal_command(event)
if canonical == "subgoal":
return await self._handle_subgoal_command(event)
if canonical == "voice":
return await self._handle_voice_command(event)
@@ -7593,7 +7488,6 @@ class GatewayRunner:
hook_ctx = {
"platform": source.platform.value if source.platform else "",
"user_id": source.user_id,
"chat_id": source.chat_id or "",
"session_id": session_entry.session_id,
"message": message_text[:500],
}
@@ -9260,51 +9154,6 @@ class GatewayRunner:
return "\n".join(lines)
async def _handle_codex_runtime_command(self, event: MessageEvent) -> str:
"""Handle /codex-runtime command in the gateway.
Same surface as the CLI handler in cli.py:
/codex-runtime show current state
/codex-runtime auto Hermes default runtime
/codex-runtime codex_app_server codex subprocess runtime
/codex-runtime on / off synonyms
On change, the cached agent for this session is evicted so the next
message creates a fresh AIAgent with the new api_mode wired in
(avoids prompt-cache invalidation mid-session)."""
from hermes_cli import codex_runtime_switch as crs
raw_args = event.get_command_args().strip() if event else ""
new_value, errors = crs.parse_args(raw_args)
if errors:
return "" + "\n".join(errors)
# Load + persist via the same helpers used for /model and /yolo
try:
from hermes_cli.config import load_config, save_config
except Exception as exc:
return f"❌ Could not load config: {exc}"
cfg = load_config()
result = crs.apply(
cfg,
new_value,
persist_callback=(save_config if new_value is not None else None),
)
# On a real change, evict the cached agent so the new runtime takes
# effect on the next message rather than waiting for cache TTL.
if result.success and new_value is not None and result.requires_new_session:
try:
session_key = self._session_key_for_source(event.source)
self._evict_cached_agent(session_key)
except Exception:
logger.debug("could not evict cached agent after codex-runtime change",
exc_info=True)
prefix = "" if result.success else ""
return f"{prefix} {result.message}"
async def _handle_personality_command(self, event: MessageEvent) -> str:
"""Handle /personality command - list or set a personality."""
from hermes_constants import display_hermes_home
@@ -9533,57 +9382,6 @@ class GatewayRunner:
return t("gateway.goal.set", budget=state.max_turns, goal=state.goal)
async def _handle_subgoal_command(self, event: "MessageEvent") -> str:
"""Handle /subgoal for gateway platforms (mirror of CLI handler).
Subgoals are extra criteria appended to the active goal mid-loop.
They modify state read at the next turn boundary, so this is safe
to invoke while the agent is running.
"""
args = (event.get_command_args() or "").strip()
mgr, _session_entry = self._get_goal_manager_for_event(event)
if mgr is None:
return t("gateway.goal.unavailable")
if not mgr.has_goal():
return "No active goal. Set one with /goal <text>."
# No args → list current subgoals.
if not args:
return f"{mgr.status_line()}\n{mgr.render_subgoals()}"
tokens = args.split(None, 1)
verb = tokens[0].lower()
rest = tokens[1].strip() if len(tokens) > 1 else ""
if verb == "remove":
if not rest:
return "Usage: /subgoal remove <n>"
try:
idx = int(rest.split()[0])
except ValueError:
return "/subgoal remove: <n> must be an integer (1-based index)."
try:
removed = mgr.remove_subgoal(idx)
except (IndexError, RuntimeError) as exc:
return f"/subgoal remove: {exc}"
return f"✓ Removed subgoal {idx}: {removed}"
if verb == "clear":
try:
prev = mgr.clear_subgoals()
except RuntimeError as exc:
return f"/subgoal clear: {exc}"
if prev:
return f"✓ Cleared {prev} subgoal{'s' if prev != 1 else ''}."
return "No subgoals to clear."
try:
text = mgr.add_subgoal(args)
except (ValueError, RuntimeError) as exc:
return f"/subgoal: {exc}"
idx = len(mgr.state.subgoals) if mgr.state else 0
return f"✓ Added subgoal {idx}: {text}"
async def _send_goal_status_notice(self, source: Any, message: str) -> None:
"""Send a /goal judge status line back to the originating chat/thread."""
adapter = self.adapters.get(source.platform)
@@ -10355,10 +10153,6 @@ class GatewayRunner:
event_message_id = self._reply_anchor_for_event(event)
# Forward image/audio attachments so the background agent can see them.
media_urls = list(event.media_urls) if event.media_urls else []
media_types = list(event.media_types) if event.media_types else []
# Fire-and-forget the background task
_task = asyncio.create_task(
self._run_background_task(
@@ -10366,8 +10160,6 @@ class GatewayRunner:
source,
task_id,
event_message_id=event_message_id,
media_urls=media_urls,
media_types=media_types,
)
)
self._background_tasks.add(_task)
@@ -10382,15 +10174,10 @@ class GatewayRunner:
source: "SessionSource",
task_id: str,
event_message_id: Optional[str] = None,
media_urls: Optional[List[str]] = None,
media_types: Optional[List[str]] = None,
) -> None:
"""Execute a background agent task and deliver the result to the chat."""
from run_agent import AIAgent
media_urls = media_urls or []
media_types = media_types or []
adapter = self.adapters.get(source.platform)
if not adapter:
logger.warning("No adapter for platform %s in background task %s", source.platform, task_id)
@@ -10426,23 +10213,6 @@ class GatewayRunner:
self._service_tier = self._load_service_tier()
turn_route = self._resolve_turn_agent_config(prompt, model, runtime_kwargs)
# Enrich the prompt with image descriptions so the background
# agent can see user-attached images (same as the main flow).
enriched_prompt = prompt
if media_urls:
image_paths = []
for i, path in enumerate(media_urls):
mtype = media_types[i] if i < len(media_types) else ""
if mtype.startswith("image/"):
image_paths.append(path)
if image_paths:
try:
enriched_prompt = await self._enrich_message_with_vision(
prompt, image_paths,
)
except Exception as e:
logger.warning("Background task vision enrichment failed: %s", e)
def run_sync():
agent = AIAgent(
model=turn_route["model"],
@@ -10474,7 +10244,7 @@ class GatewayRunner:
)
try:
return agent.run_conversation(
user_message=enriched_prompt,
user_message=prompt,
task_id=task_id,
)
finally:
@@ -15163,76 +14933,6 @@ class GatewayRunner:
if _pdc is not None:
_pdc[session_key] = _release_bg_review_messages
# ------------------------------------------------------------------
# Clarify callback: present a clarify prompt and block on a response.
#
# Runs on the agent's worker thread (see clarify_tool's synchronous
# callback contract). Bridges sync→async by scheduling the
# adapter's send_clarify on the gateway event loop, then blocks on
# the clarify primitive's threading.Event with a configurable
# timeout. Returns the user's response string, or a sentinel
# explaining that no response arrived (so the agent can adapt
# rather than hang forever).
# ------------------------------------------------------------------
def _clarify_callback_sync(question: str, choices) -> str:
from tools import clarify_gateway as _clarify_mod
import uuid as _uuid
if not _status_adapter:
return ""
clarify_id = _uuid.uuid4().hex[:10]
_clarify_mod.register(
clarify_id=clarify_id,
session_key=session_key or "",
question=question,
choices=list(choices) if choices else None,
)
# Pause typing — like approval, we don't want a "thinking..."
# status to obscure the prompt or block the user from typing
# an "Other" response on platforms that disable input while
# typing is active (Slack Assistant API).
try:
_status_adapter.pause_typing_for_chat(_status_chat_id)
except Exception:
pass
send_ok = False
try:
fut = asyncio.run_coroutine_threadsafe(
_status_adapter.send_clarify(
chat_id=_status_chat_id,
question=question,
choices=list(choices) if choices else None,
clarify_id=clarify_id,
session_key=session_key or "",
metadata=_status_thread_metadata,
),
_loop_for_step,
)
result = fut.result(timeout=15)
send_ok = bool(getattr(result, "success", False))
except Exception as exc:
logger.warning("Clarify send failed: %s", exc)
send_ok = False
if not send_ok:
# Couldn't deliver the prompt — clean up and return
# sentinel so the agent can fall back to a sensible
# default rather than hanging.
_clarify_mod.clear_session(session_key or "")
return "[clarify prompt could not be delivered]"
timeout = _clarify_mod.get_clarify_timeout()
response = _clarify_mod.wait_for_response(clarify_id, timeout=float(timeout))
if response is None or response == "":
# Timeout or session-boundary cancellation
return f"[user did not respond within {int(timeout / 60)}m]"
return response
agent.clarify_callback = _clarify_callback_sync
# Store agent reference for interrupt support
agent_holder[0] = agent
# Capture the full tool definitions for transcript logging
@@ -15504,14 +15204,6 @@ class GatewayRunner:
result = agent.run_conversation(_run_message, conversation_history=agent_history, task_id=session_id)
finally:
unregister_gateway_notify(_approval_session_key)
# Cancel any pending clarify entries so blocked agent
# threads don't hang past the end of the run (interrupt,
# completion, gateway shutdown). Idempotent.
try:
from tools.clarify_gateway import clear_session as _clear_clarify_session
_clear_clarify_session(_approval_session_key)
except Exception:
pass
reset_current_session_key(_approval_session_token)
result_holder[0] = result
@@ -16131,7 +15823,6 @@ class GatewayRunner:
_already_streamed = bool(
(_sc and getattr(_sc, "final_response_sent", False))
or _previewed
or (_sc and getattr(_sc, "final_content_delivered", False))
)
first_response = result.get("final_response", "")
if first_response and not _already_streamed:
@@ -16217,7 +15908,7 @@ class GatewayRunner:
except Exception:
pass
followup_result = await self._run_agent(
return await self._run_agent(
message=next_message,
context_prompt=context_prompt,
history=updated_history,
@@ -16229,7 +15920,6 @@ class GatewayRunner:
event_message_id=next_message_id,
channel_prompt=next_channel_prompt,
)
return _preserve_queued_followup_history_offset(result, followup_result)
finally:
# Stop progress sender, interrupt monitor, and notification task
if progress_task:
@@ -16293,16 +15983,12 @@ class GatewayRunner:
# response_previewed means the interim_assistant_callback already
# sent the final text via the adapter (non-streaming path).
_previewed = bool(response.get("response_previewed"))
_content_delivered = bool(
_sc and getattr(_sc, "final_content_delivered", False)
)
if not _is_empty_sentinel and (_streamed or _previewed or _content_delivered):
if not _is_empty_sentinel and (_streamed or _previewed):
logger.info(
"Suppressing normal final send for session %s: final delivery already confirmed (streamed=%s previewed=%s content_delivered=%s).",
"Suppressing normal final send for session %s: final delivery already confirmed (streamed=%s previewed=%s).",
session_key or "?",
_streamed,
_previewed,
_content_delivered,
)
response["already_sent"] = True
+6 -51
View File
@@ -124,44 +124,16 @@ def get_process_start_time(pid: int) -> Optional[int]:
def _read_process_cmdline(pid: int) -> Optional[str]:
"""Return the process command line as a space-separated string.
On Linux, reads /proc/<pid>/cmdline directly. On macOS and other
platforms without /proc, falls back to ``ps -p <pid> -o command=``.
On Windows (no /proc, no ps), uses psutil.
"""
"""Return the process command line as a space-separated string."""
cmdline_path = Path(f"/proc/{pid}/cmdline")
try:
raw = cmdline_path.read_bytes()
except (FileNotFoundError, PermissionError, OSError):
pass
else:
if raw:
return raw.replace(b"\x00", b" ").decode("utf-8", errors="ignore").strip()
return None
try:
result = subprocess.run(
["ps", "-p", str(pid), "-o", "command="],
capture_output=True,
text=True,
timeout=5,
)
if result.returncode == 0 and result.stdout.strip():
return result.stdout.strip()
except (OSError, subprocess.TimeoutExpired):
pass
# Windows fallback: psutil (already used by _pid_exists)
try:
import psutil # type: ignore
proc = psutil.Process(pid)
cmdline_parts = proc.cmdline()
if cmdline_parts:
return " ".join(cmdline_parts)
except Exception:
pass
return None
if not raw:
return None
return raw.replace(b"\x00", b" ").decode("utf-8", errors="ignore").strip()
def _looks_like_gateway_process(pid: int) -> bool:
@@ -189,8 +161,7 @@ def _record_looks_like_gateway(record: dict[str, Any]) -> bool:
if not isinstance(argv, list) or not argv:
return False
# Normalize Windows backslashes so patterns match cross-platform.
cmdline = " ".join(str(part) for part in argv).replace("\\", "/")
cmdline = " ".join(str(part) for part in argv)
patterns = (
"hermes_cli.main gateway",
"hermes_cli/main.py gateway",
@@ -623,22 +594,6 @@ def acquire_scoped_lock(scope: str, identity: str, metadata: Optional[dict[str,
and current_start != existing.get("start_time")
):
stale = True
# When start_time comparison is unavailable (macOS / Windows
# have no /proc, so both sides are None), fall back to
# checking the live process command line. When cmdline is
# also unreadable (Windows has no ps), consult the lock
# record's own argv — the gateway writes it at startup and
# it's the only identity signal on platforms without ps.
# Both oracles must indicate "not a gateway" to mark stale.
if (
not stale
and existing.get("start_time") is None
and current_start is None
and not _looks_like_gateway_process(existing_pid)
):
live_cmdline = _read_process_cmdline(existing_pid)
if live_cmdline is not None or not _record_looks_like_gateway(existing):
stale = True
# Check if process is stopped (Ctrl+Z / SIGTSTP) — stopped
# processes still appear alive to _pid_exists but are not
# actually running. Treat them as stale so --replace works.
-17
View File
@@ -150,10 +150,6 @@ class GatewayStreamConsumer:
self._flood_strikes = 0 # Consecutive flood-control edit failures
self._current_edit_interval = self.cfg.edit_interval # Adaptive backoff
self._final_response_sent = False
# Set when the final response content was sent to the user via
# streaming, even if the final edit (cursor removal etc.)
# subsequently failed.
self._final_content_delivered = False
# Cache adapter lifecycle capability: only platforms that need an
# explicit finalize call (e.g. DingTalk AI Cards) force us to make
# a redundant final edit. Everyone else keeps the fast path.
@@ -191,12 +187,6 @@ class GatewayStreamConsumer:
"""True when the stream consumer delivered the final assistant reply."""
return self._final_response_sent
@property
def final_content_delivered(self) -> bool:
"""True when the final response content reached the user, even if
the subsequent cosmetic edit (cursor removal) failed."""
return self._final_content_delivered
def on_segment_break(self) -> None:
"""Finalize the current stream segment and start a fresh message."""
self._queue.put(_NEW_SEGMENT)
@@ -465,8 +455,6 @@ class GatewayStreamConsumer:
# tool-progress edits or fallback-mode promotion (#10748)
# — that doesn't mean the final answer reached the user.
self._final_response_sent = chunks_delivered
if chunks_delivered:
self._final_content_delivered = True
return
if got_segment_break:
self._message_id = None
@@ -517,11 +505,6 @@ class GatewayStreamConsumer:
self._last_edit_time = time.monotonic()
if got_done:
# Record that the final content reached the user even
# if the cosmetic final edit below fails.
if current_update_visible and self._accumulated:
self._final_content_delivered = True
# Final edit without cursor. If progressive editing failed
# mid-stream, send a single continuation/fallback message
# here instead of letting the base gateway path send the
+14 -100
View File
@@ -35,7 +35,7 @@ from dataclasses import dataclass, field
from datetime import datetime, timezone
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
from typing import Any, Dict, List, Optional
from urllib.parse import parse_qs, urlencode, urlparse
import httpx
@@ -284,7 +284,7 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
),
"alibaba": ProviderConfig(
id="alibaba",
name="Qwen Cloud",
name="Alibaba Cloud (DashScope)",
auth_type="api_key",
inference_base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
api_key_env_vars=("DASHSCOPE_API_KEY",),
@@ -3870,39 +3870,6 @@ def _snapshot_nous_pool_status() -> Dict[str, Any]:
return _empty_nous_auth_status()
# ── Process-level memo for get_nous_auth_status() ──
# get_nous_auth_status() validates state by calling resolve_nous_runtime_credentials(),
# which does a synchronous OAuth refresh POST to portal.nousresearch.com. That can take
# ~350ms even on the failure path, and read-only UI surfaces (`hermes tools`, status panels,
# subscription-feature checks) call it many times per render — `hermes tools` → "All Platforms"
# was firing the refresh ~31× during one menu paint, racking up >13s of HTTP and burning
# single-use refresh tokens. Cache the snapshot for a few seconds, keyed on the auth.json
# mtime so that `hermes auth login/logout/add/remove` invalidate naturally on the next call.
_NOUS_AUTH_STATUS_CACHE_TTL = 15.0 # seconds
_nous_auth_status_cache: Optional[Tuple[float, Optional[float], Dict[str, Any]]] = None
def _auth_file_mtime() -> Optional[float]:
try:
return _auth_file_path().stat().st_mtime
except FileNotFoundError:
return None
except Exception:
return None
def invalidate_nous_auth_status_cache() -> None:
"""Clear the get_nous_auth_status() process-level memo.
Call this from any code path that mutates Nous auth state without going
through resolve_nous_runtime_credentials() (e.g. tests). Login/logout
flows touch auth.json, so the mtime check below invalidates them
automatically explicit invalidation is the belt-and-braces option.
"""
global _nous_auth_status_cache
_nous_auth_status_cache = None
def get_nous_auth_status() -> Dict[str, Any]:
"""Status snapshot for Nous auth.
@@ -3911,32 +3878,7 @@ def get_nous_auth_status() -> Dict[str, Any]:
by resolving runtime credentials so revoked refresh sessions do not show up
as a healthy login. If provider state is absent, fall back to the credential
pool for the just-logged-in / not-yet-promoted case.
The returned snapshot is memoised for ~15s keyed on the auth.json mtime,
so menu/status surfaces that ask repeatedly don't trigger one refresh POST
per call. Login/logout flows write to auth.json and therefore invalidate
the cache automatically; tests can also call
``invalidate_nous_auth_status_cache()`` explicitly.
"""
global _nous_auth_status_cache
now = time.monotonic()
mtime = _auth_file_mtime()
cached = _nous_auth_status_cache
if cached is not None:
cached_at, cached_mtime, cached_status = cached
if (
cached_mtime == mtime
and (now - cached_at) < _NOUS_AUTH_STATUS_CACHE_TTL
):
return dict(cached_status)
status = _compute_nous_auth_status()
_nous_auth_status_cache = (now, mtime, dict(status))
return status
def _compute_nous_auth_status() -> Dict[str, Any]:
"""Uncached implementation of get_nous_auth_status(). See that function."""
state = get_provider_auth_state("nous")
if state:
base_status = {
@@ -4104,8 +4046,6 @@ def get_auth_status(provider_id: Optional[str] = None) -> Dict[str, Any]:
return get_qwen_auth_status()
if target == "google-gemini-cli":
return get_gemini_oauth_auth_status()
if target == "minimax-oauth":
return get_minimax_oauth_auth_status()
if target == "copilot-acp":
return get_external_process_provider_status(target)
# API-key providers
@@ -4817,20 +4757,6 @@ def _minimax_request_user_code(
return payload
def _minimax_expired_in_looks_like_unix_ms(expired_in: int, *, now_ms: int) -> bool:
"""True if ``expired_in`` is plausibly a unix-ms absolute time (vs TTL seconds)."""
return int(expired_in) > (now_ms // 2)
def _minimax_resolve_token_expiry_unix(expired_in: int, *, now: datetime) -> float:
"""Return access-token expiry as unix seconds (MiniMax uses ms epoch or TTL seconds)."""
raw = int(expired_in)
now_ms = int(now.timestamp() * 1000)
if _minimax_expired_in_looks_like_unix_ms(raw, now_ms=now_ms):
return raw / 1000.0
return now.timestamp() + max(1, raw)
def _minimax_poll_token(
client: httpx.Client, *, portal_base_url: str, client_id: str,
user_code: str, code_verifier: str, expired_in: int, interval_ms: Optional[int],
@@ -4839,11 +4765,12 @@ def _minimax_poll_token(
# Defensive parsing: if it's small enough to be a duration, treat as seconds.
import time as _time
now_ms = int(_time.time() * 1000)
raw = int(expired_in)
if _minimax_expired_in_looks_like_unix_ms(raw, now_ms=now_ms):
deadline = raw / 1000.0
if expired_in > now_ms // 2:
# Looks like a unix-ms timestamp.
deadline = expired_in / 1000.0
else:
deadline = _time.time() + max(1, raw)
# Treat as duration in seconds from now.
deadline = _time.time() + max(1, expired_in)
interval = max(2.0, (interval_ms or 2000) / 1000.0)
while _time.time() < deadline:
@@ -4957,10 +4884,8 @@ def _minimax_oauth_login(
)
now = datetime.now(timezone.utc)
expires_at_unix = _minimax_resolve_token_expiry_unix(
int(token_data["expired_in"]), now=now,
)
expires_in_s = max(0, int(expires_at_unix - now.timestamp()))
expires_in_s = int(token_data["expired_in"])
expires_at = now.timestamp() + expires_in_s
auth_state = {
"provider": "minimax-oauth",
@@ -4974,7 +4899,7 @@ def _minimax_oauth_login(
"refresh_token": token_data["refresh_token"],
"resource_url": token_data.get("resource_url"),
"obtained_at": now.isoformat(),
"expires_at": datetime.fromtimestamp(expires_at_unix, tz=timezone.utc).isoformat(),
"expires_at": datetime.fromtimestamp(expires_at, tz=timezone.utc).isoformat(),
"expires_in": expires_in_s,
}
@@ -5035,16 +4960,14 @@ def _refresh_minimax_oauth_state(
relogin_required=True,
)
now_dt = datetime.now(timezone.utc)
expires_at_unix = _minimax_resolve_token_expiry_unix(
int(payload["expired_in"]), now=now_dt,
)
expires_in_s = max(0, int(expires_at_unix - now_dt.timestamp()))
expires_in_s = int(payload["expired_in"])
new_state = dict(state)
new_state.update({
"access_token": payload["access_token"],
"refresh_token": payload.get("refresh_token", state["refresh_token"]),
"obtained_at": now_dt.isoformat(),
"expires_at": datetime.fromtimestamp(expires_at_unix, tz=timezone.utc).isoformat(),
"expires_at": datetime.fromtimestamp(now_dt.timestamp() + expires_in_s,
tz=timezone.utc).isoformat(),
"expires_in": expires_in_s,
})
_minimax_save_auth_state(new_state)
@@ -5329,7 +5252,6 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
get_curated_nous_model_ids, get_pricing_for_provider,
check_nous_free_tier, partition_nous_models_by_tier,
union_with_portal_free_recommendations,
union_with_portal_paid_recommendations,
)
model_ids = get_curated_nous_model_ids()
@@ -5338,27 +5260,19 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
if model_ids:
pricing = get_pricing_for_provider("nous")
free_tier = check_nous_free_tier()
_portal_for_recs = auth_state.get("portal_base_url", "")
if free_tier:
# The Portal's freeRecommendedModels endpoint is the
# source of truth for what's free *right now*. Augment
# the curated list with anything new the Portal flags
# as free so users on older Hermes builds still see
# newly-launched free models without a CLI release.
_portal_for_recs = auth_state.get("portal_base_url", "")
model_ids, pricing = union_with_portal_free_recommendations(
model_ids, pricing, _portal_for_recs,
)
model_ids, unavailable_models = partition_nous_models_by_tier(
model_ids, pricing, free_tier=True,
)
else:
# Paid-tier mirror: pull paidRecommendedModels so newly
# launched paid models surface in the picker even if
# the in-repo curated list and docs-hosted manifest
# haven't caught up yet.
model_ids, pricing = union_with_portal_paid_recommendations(
model_ids, pricing, _portal_for_recs,
)
_portal = auth_state.get("portal_base_url", "")
if model_ids:
print(f"Showing {len(model_ids)} curated models — use \"Enter custom model name\" for others.")
+5 -8
View File
@@ -375,12 +375,10 @@ def auth_add_command(args) -> None:
return
if provider == "minimax-oauth":
creds = auth_mod._minimax_oauth_login(
open_browser=not getattr(args, "no_browser", False),
timeout_seconds=getattr(args, "timeout", None) or 15.0,
)
from hermes_cli.auth import resolve_minimax_oauth_runtime_credentials
creds = resolve_minimax_oauth_runtime_credentials()
label = (getattr(args, "label", None) or "").strip() or label_from_token(
creds["access_token"],
creds["api_key"],
_oauth_default_label(provider, len(pool.entries()) + 1),
)
entry = PooledCredential(
@@ -390,9 +388,8 @@ def auth_add_command(args) -> None:
auth_type=AUTH_TYPE_OAUTH,
priority=0,
source=f"{SOURCE_MANUAL}:minimax_oauth",
access_token=creds["access_token"],
refresh_token=creds.get("refresh_token"),
base_url=creds.get("inference_base_url"),
access_token=creds["api_key"],
base_url=creds.get("base_url"),
)
pool.add_entry(entry)
print(f'Added {provider} OAuth credential #{len(pool.entries())}: "{entry.label}"')
-13
View File
@@ -581,19 +581,6 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
if mcp_connected:
summary_parts.append(f"{mcp_connected} MCP servers")
summary_parts.append("/help for commands")
# Indicate when the codex_app_server runtime is active so users
# understand why tool counts may not match what's actually reachable
# (codex builds its own tool list inside the spawned subprocess).
try:
from hermes_cli.codex_runtime_switch import get_current_runtime
from hermes_cli.config import load_config as _load_cfg
if get_current_runtime(_load_cfg()) == "codex_app_server":
right_lines.append(
f"[bold {accent}]Runtime:[/] [{text}]codex app-server[/] "
f"[dim {dim}](terminal/file ops/MCP run inside codex)[/]"
)
except Exception:
pass
# Show active profile name when not 'default'
try:
from hermes_cli.profiles import get_active_profile_name
+4 -17
View File
@@ -22,7 +22,6 @@ from pathlib import Path
from hermes_constants import is_wsl as _is_wsl
logger = logging.getLogger(__name__)
_PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
def save_clipboard_image(dest: Path) -> bool:
@@ -379,13 +378,10 @@ def _wayland_save(dest: Path) -> bool:
dest.unlink(missing_ok=True)
return False
# save_clipboard_image() promises a PNG output path. Wayland can offer
# JPEG/GIF/WebP/BMP payloads, so normalize every non-PNG result before
# returning success.
if mime != "image/png":
if not _convert_to_png(dest) or not _is_png_file(dest):
dest.unlink(missing_ok=True)
return False
# BMP needs conversion to PNG (common in WSLg where only BMP
# is bridged from Windows clipboard via RDP).
if mime == "image/bmp":
return _convert_to_png(dest)
return True
@@ -437,15 +433,6 @@ def _convert_to_png(path: Path) -> bool:
return path.exists() and path.stat().st_size > 0
def _is_png_file(path: Path) -> bool:
"""Return True when *path* starts with the PNG file signature."""
try:
with path.open("rb") as f:
return f.read(len(_PNG_SIGNATURE)) == _PNG_SIGNATURE
except OSError:
return False
# ── X11 (xclip) ─────────────────────────────────────────────────────────
def _xclip_has_image() -> bool:
@@ -1,614 +0,0 @@
"""Migrate Hermes' MCP server config and Codex's installed curated plugins
to the format Codex expects in ~/.codex/config.toml.
When the user enables the codex_app_server runtime, the codex subprocess
runs its own MCP client and its own plugin runtime (Linear, Atlassian,
Asana, plus per-account ChatGPT apps via app/list). For both of those to
be useful, the user's choices need to be visible to codex too. This
module:
1. Reads Hermes' YAML and writes equivalent [mcp_servers.<name>]
entries to ~/.codex/config.toml.
2. Queries codex's `plugin/list` for the openai-curated marketplace
and writes [plugins."<name>@<marketplace>"] entries for any plugin
the user has installed=true on their codex CLI. (This is what
OpenClaw calls "migrate native codex plugins" the YouTube-video-
worthy bit Pash highlighted: Canva, GitHub, Calendar, Gmail
pre-configured.)
3. Writes a [permissions] default profile so users on this runtime
don't get an approval prompt on every write attempt.
What translates (MCP servers):
Hermes mcp_servers.<n>.command/args/env codex stdio transport
Hermes mcp_servers.<n>.url/headers codex streamable_http transport
Hermes mcp_servers.<n>.timeout codex tool_timeout_sec
Hermes mcp_servers.<n>.connect_timeout codex startup_timeout_sec
What does NOT translate (warned + skipped):
Hermes-specific keys (sampling, etc.) codex's MCP client has no
equivalent. Listed in the per-server skipped[] field of the report.
What's NOT migrated (intentional):
AGENTS.md codex respects this file natively in its cwd. Hermes' own
AGENTS.md (project-level) is already in the worktree, so codex picks
it up without translation. No code needed.
"""
from __future__ import annotations
import logging
import os
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Optional
logger = logging.getLogger(__name__)
# Marker comments wrapping the managed section so re-runs can detect
# what's ours and what's user-edited. Both must appear or strip is a no-op.
MIGRATION_MARKER = (
"# managed by hermes-agent — `hermes codex-runtime migrate` regenerates this section"
)
MIGRATION_END_MARKER = (
"# end hermes-agent managed section"
)
@dataclass
class MigrationReport:
"""Outcome of a migration pass."""
target_path: Optional[Path] = None
migrated: list[str] = field(default_factory=list)
skipped_keys_per_server: dict[str, list[str]] = field(default_factory=dict)
migrated_plugins: list[str] = field(default_factory=list)
plugin_query_error: Optional[str] = None
wrote_permissions_default: Optional[str] = None
errors: list[str] = field(default_factory=list)
written: bool = False
dry_run: bool = False
def summary(self) -> str:
lines = []
if self.dry_run:
lines.append(f"(dry run) Would write {self.target_path}")
elif self.written:
lines.append(f"Wrote {self.target_path}")
if self.migrated:
lines.append(f"Migrated {len(self.migrated)} MCP server(s):")
for name in self.migrated:
skipped = self.skipped_keys_per_server.get(name, [])
note = (
f" (skipped: {', '.join(skipped)})" if skipped else ""
)
lines.append(f" - {name}{note}")
else:
lines.append("No MCP servers found in Hermes config.")
if self.migrated_plugins:
lines.append(
f"Migrated {len(self.migrated_plugins)} native Codex plugin(s):"
)
for name in self.migrated_plugins:
lines.append(f" - {name}")
elif self.plugin_query_error:
lines.append(f"Codex plugin discovery skipped: {self.plugin_query_error}")
if self.wrote_permissions_default:
lines.append(
f"Wrote default_permissions = "
f"{self.wrote_permissions_default!r}"
)
for err in self.errors:
lines.append(f"{err}")
return "\n".join(lines)
# Hermes keys that codex's MCP schema doesn't support — dropped during
# migration with a warning. Anything not on the keep list AND not the
# transport keys is added to skipped.
_KNOWN_HERMES_KEYS = {
# transport — stdio
"command", "args", "env", "cwd",
# transport — http
"url", "headers", "transport",
# timeouts
"timeout", "connect_timeout",
# general
"enabled", "description",
}
# Subset that have a direct codex equivalent.
_KEYS_DROPPED_WITH_WARNING = {
# Hermes' sampling subsection — codex MCP has no equivalent
"sampling",
}
def _translate_one_server(
name: str, hermes_cfg: dict
) -> tuple[Optional[dict], list[str]]:
"""Translate one Hermes MCP server config to the codex inline-table dict
representation. Returns (codex_entry, skipped_keys).
codex_entry is a dict ready for TOML serialization, or None when the
server can't be translated (e.g. neither command nor url present)."""
if not isinstance(hermes_cfg, dict):
return None, []
skipped: list[str] = []
out: dict[str, Any] = {}
has_command = bool(hermes_cfg.get("command"))
has_url = bool(hermes_cfg.get("url"))
if has_command and has_url:
skipped.append("url (both command and url set; preferring stdio)")
has_url = False
if has_command:
# Stdio transport
out["command"] = str(hermes_cfg["command"])
args = hermes_cfg.get("args") or []
if args:
out["args"] = [str(a) for a in args]
env = hermes_cfg.get("env") or {}
if env:
# Codex expects string values
out["env"] = {str(k): str(v) for k, v in env.items()}
cwd = hermes_cfg.get("cwd")
if cwd:
out["cwd"] = str(cwd)
elif has_url:
# streamable_http transport (codex covers both http and SSE here)
out["url"] = str(hermes_cfg["url"])
headers = hermes_cfg.get("headers") or {}
if headers:
out["http_headers"] = {str(k): str(v) for k, v in headers.items()}
# Hermes' transport: sse hint is informational; codex auto-negotiates
if hermes_cfg.get("transport") == "sse":
skipped.append("transport=sse (codex auto-negotiates)")
else:
return None, ["no command or url field"]
# Timeouts
if "timeout" in hermes_cfg:
try:
out["tool_timeout_sec"] = float(hermes_cfg["timeout"])
except (TypeError, ValueError):
skipped.append("timeout (not numeric)")
if "connect_timeout" in hermes_cfg:
try:
out["startup_timeout_sec"] = float(hermes_cfg["connect_timeout"])
except (TypeError, ValueError):
skipped.append("connect_timeout (not numeric)")
# Enabled flag (codex defaults to true so we only emit when explicitly false)
if hermes_cfg.get("enabled") is False:
out["enabled"] = False
# Detect keys we explicitly drop with warning
for key in hermes_cfg:
if key in _KEYS_DROPPED_WITH_WARNING:
skipped.append(f"{key} (no codex equivalent)")
elif key not in _KNOWN_HERMES_KEYS:
skipped.append(f"{key} (unknown Hermes key)")
return out, skipped
def _format_toml_value(value: Any) -> str:
"""Minimal TOML value formatter for the value types we emit.
We only emit strings, numbers, booleans, and tables of those no nested
arrays of tables. This covers everything codex's MCP schema accepts."""
if isinstance(value, bool):
return "true" if value else "false"
if isinstance(value, (int, float)):
return repr(value)
if isinstance(value, str):
# Escape per TOML basic-string rules. Order matters: backslash
# first so the other escapes don't get re-escaped.
# Control characters (newline, tab, etc.) must use \-escapes
# because TOML basic strings don't allow literal control chars
# — passing them through would produce invalid TOML that codex
# would refuse to load. Paths usually don't contain control
# chars but env-var passthrough (HERMES_HOME, PYTHONPATH) could
# in pathological cases.
escaped = (
value
.replace("\\", "\\\\")
.replace('"', '\\"')
.replace("\b", "\\b")
.replace("\t", "\\t")
.replace("\n", "\\n")
.replace("\f", "\\f")
.replace("\r", "\\r")
)
return f'"{escaped}"'
if isinstance(value, list):
items = ", ".join(_format_toml_value(v) for v in value)
return f"[{items}]"
if isinstance(value, dict):
items = ", ".join(
f'{_quote_key(k)} = {_format_toml_value(v)}' for k, v in value.items()
)
return "{ " + items + " }" if items else "{}"
raise ValueError(f"Unsupported TOML value type: {type(value).__name__}")
def _quote_key(key: str) -> str:
"""Return key bare-or-quoted depending on whether it's a valid bare key."""
if all(c.isalnum() or c in "-_" for c in key) and key:
return key
escaped = key.replace("\\", "\\\\").replace('"', '\\"')
return f'"{escaped}"'
def render_codex_toml_section(
servers: dict[str, dict],
plugins: Optional[list[dict]] = None,
default_permission_profile: Optional[str] = None,
) -> str:
"""Render the managed [mcp_servers.<n>] / [plugins.<id>] / [permissions]
block for ~/.codex/config.toml.
Args:
servers: dict of MCP server name translated codex inline-table
plugins: optional list of {name, marketplace, enabled} for native
Codex plugins to enable. (E.g. the Linear / Atlassian / Asana
curated plugins, or per-account ChatGPT apps.)
default_permission_profile: when set, write `[permissions] default`
so the user doesn't get an approval prompt on every write
attempt. Common values: "workspace-write", "read-only",
"full-access".
"""
out = [MIGRATION_MARKER]
if not servers and not plugins and not default_permission_profile:
out.append("# (no MCP servers, plugins, or permissions configured by Hermes)")
out.append(MIGRATION_END_MARKER)
return "\n".join(out) + "\n"
if default_permission_profile:
# Codex's config schema: `default_permissions` is a top-level
# string referencing a profile name. Built-in profile names start
# with ":" (":workspace-write", ":read-only", ":full-access"). The
# [permissions] table is for *user-defined* named profiles with
# structured fields — not what we want.
normalized = (
default_permission_profile
if default_permission_profile.startswith(":")
else f":{default_permission_profile}"
)
out.append("")
out.append(f"default_permissions = {_format_toml_value(normalized)}")
if servers:
for name in sorted(servers.keys()):
cfg = servers[name]
out.append("")
out.append(f"[mcp_servers.{_quote_key(name)}]")
for k, v in cfg.items():
out.append(f"{_quote_key(k)} = {_format_toml_value(v)}")
if plugins:
for plugin in sorted(plugins, key=lambda p: f"{p.get('name','')}@{p.get('marketplace','')}"):
name = plugin.get("name") or ""
marketplace = plugin.get("marketplace") or "openai-curated"
enabled = bool(plugin.get("enabled", True))
qualified = f"{name}@{marketplace}"
out.append("")
out.append(f'[plugins.{_quote_key(qualified)}]')
out.append(f"enabled = {_format_toml_value(enabled)}")
out.append("")
out.append(MIGRATION_END_MARKER)
return "\n".join(out) + "\n"
def _strip_existing_managed_block(toml_text: str) -> str:
"""Remove any prior managed section so re-runs idempotently replace it.
The managed section is everything between MIGRATION_MARKER (start) and
MIGRATION_END_MARKER (end), inclusive of both markers. User-edited
sections above or below are preserved verbatim.
Backward compatibility: if the start marker is found but no end marker
follows, we fall back to the heuristic that swallows lines until we
hit a section that's not [mcp_servers.*]/[plugins.*]/[permissions]/
a `default_permissions =` key. This matches what older versions of
this code wrote so re-runs don't break configs from prior Hermes
versions."""
lines = toml_text.splitlines(keepends=True)
out: list[str] = []
in_managed = False
saw_end_marker = False
for line in lines:
line_stripped_nl = line.rstrip("\n")
if line_stripped_nl == MIGRATION_MARKER:
in_managed = True
saw_end_marker = False
continue
if in_managed:
if line_stripped_nl == MIGRATION_END_MARKER:
in_managed = False
saw_end_marker = True
continue
stripped = line.lstrip()
if not saw_end_marker and stripped.startswith("[") and not (
stripped.startswith("[mcp_servers")
or stripped.startswith("[plugins")
or stripped.startswith("[permissions]")
or stripped.startswith("[permissions.")
):
# Old-format managed block without end marker: bail back
# to user content as soon as we see a non-managed section.
in_managed = False
out.append(line)
continue
# Otherwise swallow the line.
continue
out.append(line)
return "".join(out)
def _query_codex_plugins(
codex_home: Optional[Path] = None,
timeout: float = 8.0,
) -> tuple[list[dict], Optional[str]]:
"""Query codex's `plugin/list` for installed curated plugins.
Spawns `codex app-server` briefly, sends initialize + plugin/list,
extracts plugins where installed=true. Returns (plugins, error).
Plugins is a list of {name, marketplace, enabled} dicts ready for
render_codex_toml_section().
On any failure (codex not installed, RPC error, timeout) returns
([], error_message). Migration treats this as non-fatal MCP
servers and permissions still write through.
"""
try:
from agent.transports.codex_app_server import CodexAppServerClient
except Exception as exc:
return [], f"transport unavailable: {exc}"
try:
with CodexAppServerClient(
codex_home=str(codex_home) if codex_home else None
) as client:
client.initialize(client_name="hermes-migration")
resp = client.request("plugin/list", {}, timeout=timeout)
except Exception as exc:
return [], f"plugin/list query failed: {exc}"
out: list[dict] = []
seen: set[tuple[str, str]] = set()
marketplaces = resp.get("marketplaces") or []
if not isinstance(marketplaces, list):
return [], "plugin/list response missing 'marketplaces'"
for marketplace in marketplaces:
if not isinstance(marketplace, dict):
continue
market_name = str(marketplace.get("name") or "openai-curated")
plugins = marketplace.get("plugins") or []
if not isinstance(plugins, list):
continue
for plugin in plugins:
if not isinstance(plugin, dict):
continue
installed = bool(plugin.get("installed", False))
if not installed:
continue
# Skip plugins codex itself reports as unavailable (broken
# install, missing OAuth, removed from marketplace, etc.).
# Cf. openclaw/openclaw#80815 — OpenClaw learned to gate
# migration on app readiness to avoid writing config that
# would fail at activation time. Our migration writes to
# codex's config.toml directly, so a broken plugin would
# surface as a codex error on first use. Skipping it here
# keeps the migrated config clean and the user's first
# codex turn from failing.
availability = str(plugin.get("availability") or "").upper()
if availability and availability != "AVAILABLE":
logger.debug(
"skipping plugin %s: availability=%s",
plugin.get("name"), availability,
)
continue
name = str(plugin.get("name") or "")
if not name:
continue
key = (name, market_name)
if key in seen:
continue
seen.add(key)
# Carry forward whatever 'enabled' codex reports — defaults to
# true for installed plugins. This is the same shape OpenClaw
# writes when migrating native codex plugins.
out.append({
"name": name,
"marketplace": market_name,
"enabled": bool(plugin.get("enabled", True)),
})
return out, None
def _build_hermes_tools_mcp_entry() -> dict:
"""Build the codex stdio-transport entry that launches Hermes' own
tool surface as an MCP server. Codex's subprocess will call back into
this for browser/web/delegate_task/vision/memory/skills tools.
The command runs the worktree's Python via the current sys.executable
so a hermes installed under /opt/, /usr/local/, or a venv all work.
HERMES_HOME and PYTHONPATH are passed through so the spawned process
sees the same config + module layout the user is running."""
import sys
env: dict[str, str] = {}
# HERMES_HOME passes through if set so the MCP subprocess sees the
# same config / auth / sessions DB as the parent CLI.
hermes_home = os.environ.get("HERMES_HOME")
if hermes_home:
env["HERMES_HOME"] = hermes_home
# PYTHONPATH passes through so a worktree-launched hermes finds the
# branch's modules instead of the installed package.
pythonpath = os.environ.get("PYTHONPATH")
if pythonpath:
env["PYTHONPATH"] = pythonpath
# Quiet mode + redaction defaults so the MCP wire stays clean.
env["HERMES_QUIET"] = "1"
env["HERMES_REDACT_SECRETS"] = env.get("HERMES_REDACT_SECRETS", "true")
out: dict[str, Any] = {
"command": sys.executable,
"args": ["-m", "agent.transports.hermes_tools_mcp_server"],
}
if env:
out["env"] = env
# Generous timeouts — browser_navigate or delegate_task can take a
# while; we don't want codex's MCP client to give up too early.
out["startup_timeout_sec"] = 30.0
out["tool_timeout_sec"] = 600.0
return out
def migrate(
hermes_config: dict,
*,
codex_home: Optional[Path] = None,
dry_run: bool = False,
discover_plugins: bool = True,
default_permission_profile: Optional[str] = ":workspace",
expose_hermes_tools: bool = True,
) -> MigrationReport:
"""Translate Hermes mcp_servers config + Codex curated plugins into
~/.codex/config.toml.
Args:
hermes_config: full ~/.hermes/config.yaml dict
codex_home: override CODEX_HOME (defaults to ~/.codex)
dry_run: skip the actual write; report what would happen
discover_plugins: when True (default), query `plugin/list` against
the live codex CLI to migrate any installed curated plugins
into [plugins."<name>@<marketplace>"] entries. Set False to
skip the subprocess spawn (for tests or restricted environments).
default_permission_profile: when set (default ":workspace"), write
top-level `default_permissions = "<name>"` so users on this
runtime don't get an approval prompt on every write attempt.
Built-in codex profile names are ":workspace", ":read-only",
":danger-no-sandbox" (note the leading ":"). Also accepts a
user-defined profile name (no leading ":") that the user has
configured in their own [permissions.<name>] table. Set None
to leave permissions unset and let codex use its compiled-in
default (which is read-only).
expose_hermes_tools: when True (default), register Hermes' own
tool surface (web_search, browser_*, delegate_task, vision,
memory, skills, etc.) as an MCP server in ~/.codex/config.toml
so the codex subprocess can call back into Hermes for tools
codex doesn't have built in. Set False to opt out.
"""
report = MigrationReport(dry_run=dry_run)
codex_home = codex_home or Path.home() / ".codex"
target = codex_home / "config.toml"
report.target_path = target
hermes_servers = (hermes_config or {}).get("mcp_servers") or {}
if not isinstance(hermes_servers, dict):
report.errors.append(
"mcp_servers in Hermes config is not a dict; cannot migrate."
)
return report
translated: dict[str, dict] = {}
for name, cfg in hermes_servers.items():
out, skipped = _translate_one_server(str(name), cfg or {})
if out is None:
report.errors.append(
f"server {name!r} skipped: {', '.join(skipped) or 'no transport configured'}"
)
continue
translated[str(name)] = out
if skipped:
report.skipped_keys_per_server[str(name)] = skipped
report.migrated.append(str(name))
# Discover installed Codex curated plugins. Best-effort — never blocks
# the migration if codex is unreachable or the RPC fails.
plugins: list[dict] = []
if discover_plugins and not dry_run:
plugins, plugin_err = _query_codex_plugins(codex_home=codex_home)
if plugin_err:
report.plugin_query_error = plugin_err
for p in plugins:
report.migrated_plugins.append(f"{p['name']}@{p['marketplace']}")
# Track whether we wrote a default permission profile so the report
# surfaces it to the user.
if default_permission_profile:
report.wrote_permissions_default = default_permission_profile
# Inject Hermes' own tool surface as an MCP server so the spawned
# codex subprocess can call back into Hermes for the tools codex
# doesn't ship with — web_search, browser_*, delegate_task, vision,
# memory, skills, session_search, image_generate, text_to_speech.
# The server itself is agent/transports/hermes_tools_mcp_server.py
# and is launched on demand by codex (stdio MCP).
if expose_hermes_tools:
translated["hermes-tools"] = _build_hermes_tools_mcp_entry()
if "hermes-tools" not in report.migrated:
report.migrated.append("hermes-tools")
# Build the new managed block
managed_block = render_codex_toml_section(
translated, plugins=plugins,
default_permission_profile=default_permission_profile,
)
# Read existing codex config if any, strip the prior managed block,
# append the new one.
if target.exists():
try:
existing = target.read_text(encoding="utf-8")
except Exception as exc:
report.errors.append(f"could not read {target}: {exc}")
return report
without_managed = _strip_existing_managed_block(existing)
# Ensure exactly one blank line between user content and managed block
if without_managed and not without_managed.endswith("\n"):
without_managed += "\n"
new_text = (
without_managed.rstrip("\n") + "\n\n" + managed_block
if without_managed.strip()
else managed_block
)
else:
new_text = managed_block
if dry_run:
return report
try:
codex_home.mkdir(parents=True, exist_ok=True)
# Atomic write: write to a temp file in the same directory then
# rename. Same-directory rename is atomic on POSIX and ReplaceFile
# on Windows. Avoids leaving a half-written config.toml that
# codex would refuse to load if we crash mid-write.
import tempfile
tmp_fd, tmp_path_str = tempfile.mkstemp(
prefix=".config.toml.", dir=str(codex_home)
)
tmp_path = Path(tmp_path_str)
try:
with os.fdopen(tmp_fd, "w", encoding="utf-8") as fh:
fh.write(new_text)
tmp_path.replace(target)
except Exception:
# Clean up the temp file if the rename didn't happen.
try:
if tmp_path.exists():
tmp_path.unlink()
except Exception:
pass
raise
report.written = True
except Exception as exc:
report.errors.append(f"could not write {target}: {exc}")
return report
-266
View File
@@ -1,266 +0,0 @@
"""Shared logic for the /codex-runtime slash command.
Toggles `model.openai_runtime` between "auto" (= chat_completions, Hermes'
default) and "codex_app_server" (= hand turns to a codex subprocess).
Both CLI (cli.py) and gateway (gateway/run.py) call into this module so the
behavior stays identical across surfaces.
The actual runtime resolution happens in hermes_cli.runtime_provider's
_maybe_apply_codex_app_server_runtime() helper, which reads the persisted
config value. This module just persists the value and reports the change.
"""
from __future__ import annotations
import logging
from dataclasses import dataclass
from typing import Optional
logger = logging.getLogger(__name__)
VALID_RUNTIMES = ("auto", "codex_app_server")
@dataclass
class CodexRuntimeStatus:
"""Result of a /codex-runtime invocation. Callers render this however
suits their surface (CLI uses Rich panels, gateway sends a text message)."""
success: bool
new_value: Optional[str] = None
old_value: Optional[str] = None
message: str = ""
requires_new_session: bool = False
codex_binary_ok: bool = True
codex_version: Optional[str] = None
def parse_args(arg_string: str) -> tuple[Optional[str], list[str]]:
"""Parse the slash-command argument string. Returns (value, errors).
No args return current state (value=None)
'auto' / 'codex_app_server' / 'on' / 'off' return that value
anything else error
"""
raw = (arg_string or "").strip().lower()
if not raw:
return None, []
# Accept human-friendly synonyms
if raw in ("on", "codex", "enable"):
return "codex_app_server", []
if raw in ("off", "default", "disable", "hermes"):
return "auto", []
if raw in VALID_RUNTIMES:
return raw, []
return None, [
f"Unknown runtime {raw!r}. Use one of: auto, codex_app_server, on, off"
]
def get_current_runtime(config: dict) -> str:
"""Read the current `model.openai_runtime` value from a config dict.
Returns 'auto' for unset / empty / unrecognized values."""
if not isinstance(config, dict):
return "auto"
model_cfg = config.get("model") or {}
if not isinstance(model_cfg, dict):
return "auto"
value = str(model_cfg.get("openai_runtime") or "").strip().lower()
if value in VALID_RUNTIMES:
return value
return "auto"
def set_runtime(config: dict, new_value: str) -> str:
"""Mutate the config dict in place to persist the new runtime value.
Returns the previous value for callers that want to report a delta."""
if new_value not in VALID_RUNTIMES:
raise ValueError(
f"invalid runtime {new_value!r}; must be one of {VALID_RUNTIMES}"
)
old = get_current_runtime(config)
if not isinstance(config.get("model"), dict):
config["model"] = {}
config["model"]["openai_runtime"] = new_value
return old
def check_codex_binary_ok() -> tuple[bool, Optional[str]]:
"""Best-effort verification that codex CLI is installed at acceptable
version. Returns (ok, version_or_message)."""
try:
from agent.transports.codex_app_server import check_codex_binary
return check_codex_binary()
except Exception as exc: # pragma: no cover
return False, f"codex check failed: {exc}"
def apply(
config: dict,
new_value: Optional[str],
*,
persist_callback=None,
) -> CodexRuntimeStatus:
"""Top-level entry point used by both CLI and gateway handlers.
Args:
config: in-memory config dict (will be mutated when new_value is set)
new_value: desired runtime; None means "show current state only"
persist_callback: optional callable taking the mutated config dict
and persisting it to disk. Skipped when None (used by tests).
Returns: CodexRuntimeStatus describing the outcome.
"""
current = get_current_runtime(config)
# Cache the codex binary check for this apply() call. Subprocess spawn
# is cheap (~50ms for `codex --version`), but we'd otherwise call it up
# to 3 times in the enable path (read-only/state, gate, success message).
# None = not yet checked; (bool, str) = result.
_binary_check: Optional[tuple[bool, Optional[str]]] = None
def _check_binary_cached() -> tuple[bool, Optional[str]]:
nonlocal _binary_check
if _binary_check is None:
_binary_check = check_codex_binary_ok()
return _binary_check
# Read-only call: just report state
if new_value is None:
ok, ver = _check_binary_cached()
msg = (
f"openai_runtime: {current}\n"
f"codex CLI: {'OK ' + ver if ok else 'not available — ' + (ver or 'install with `npm i -g @openai/codex`')}"
)
return CodexRuntimeStatus(
success=True,
new_value=current,
old_value=current,
message=msg,
codex_binary_ok=ok,
codex_version=ver if ok else None,
)
# No change requested
if new_value == current:
return CodexRuntimeStatus(
success=True,
new_value=current,
old_value=current,
message=f"openai_runtime already set to {current}",
)
# If switching ON, verify codex CLI is installed before persisting —
# an opt-in toggle that silently fails on the first turn is the
# worst possible UX. Block here with a clear install hint.
if new_value == "codex_app_server":
ok, ver_or_msg = _check_binary_cached()
if not ok:
return CodexRuntimeStatus(
success=False,
new_value=None,
old_value=current,
message=(
"Cannot enable codex_app_server runtime: "
f"{ver_or_msg or 'codex CLI not available'}\n"
"Install with: npm i -g @openai/codex"
),
codex_binary_ok=False,
codex_version=None,
)
set_runtime(config, new_value)
if persist_callback is not None:
try:
persist_callback(config)
except Exception as exc:
logger.exception("failed to persist openai_runtime change")
return CodexRuntimeStatus(
success=False,
new_value=new_value,
old_value=current,
message=f"updated config in memory but persist failed: {exc}",
)
msg_lines = [
f"openai_runtime: {current}{new_value}",
]
if new_value == "codex_app_server":
ok, ver = _check_binary_cached()
if ok:
msg_lines.append(f"codex CLI: {ver}")
# Auto-migrate Hermes' MCP servers + Codex's installed curated
# plugins into ~/.codex/config.toml so the spawned codex subprocess
# sees the same tool surface AND can call back into Hermes for
# browser/web/delegate_task/vision/memory tools (#7 fix).
# Failures are non-fatal — the runtime change still proceeds.
try:
from hermes_cli.codex_runtime_plugin_migration import migrate
mig_report = migrate(config)
# Tools/MCP servers (excluding the hermes-tools callback,
# which is internal plumbing — surface separately).
user_servers = [
s for s in mig_report.migrated if s != "hermes-tools"
]
if user_servers:
msg_lines.append(
f"Migrated {len(user_servers)} MCP server(s): "
f"{', '.join(user_servers)}"
)
# Native Codex plugin migration (Linear, GitHub, etc.)
if mig_report.migrated_plugins:
msg_lines.append(
f"Migrated {len(mig_report.migrated_plugins)} native "
f"Codex plugin(s): {', '.join(mig_report.migrated_plugins)}"
)
elif mig_report.plugin_query_error:
msg_lines.append(
f"Codex plugin discovery skipped: "
f"{mig_report.plugin_query_error}"
)
# Permissions + Hermes tool callback are always-on production
# bits the user benefits from knowing about.
if mig_report.wrote_permissions_default:
msg_lines.append(
f"Default sandbox: {mig_report.wrote_permissions_default} "
f"(no approval prompt on every write)"
)
if "hermes-tools" in mig_report.migrated:
msg_lines.append(
"Hermes tool callback registered: codex can now use "
"web_search, web_extract, browser_*, vision_analyze, "
"image_generate, skill_view, skills_list, text_to_speech, "
"kanban_* (worker + orchestrator) via MCP."
)
msg_lines.append(
" (delegate_task, memory, session_search, todo run "
"only on the default Hermes runtime — they need the "
"agent loop context.)"
)
msg_lines.append(f" (config: {mig_report.target_path})")
for err in mig_report.errors:
msg_lines.append(f"⚠ MCP migration: {err}")
except Exception as exc:
msg_lines.append(f"⚠ MCP migration skipped: {exc}")
msg_lines.append(
"OpenAI/Codex turns now run through `codex app-server` "
"(terminal/file ops/patching inside Codex; "
"Hermes tools available via MCP callback)."
)
msg_lines.append(
"Effective on next session — current cached agent keeps "
"the prior runtime to preserve prompt cache."
)
else:
msg_lines.append("OpenAI/Codex turns will use the default Hermes runtime.")
msg_lines.append("Effective on next session.")
return CodexRuntimeStatus(
success=True,
new_value=new_value,
old_value=current,
message="\n".join(msg_lines),
requires_new_session=True,
)
+9 -16
View File
@@ -104,8 +104,6 @@ COMMAND_REGISTRY: list[CommandDef] = [
args_hint="<prompt>"),
CommandDef("goal", "Set a standing goal Hermes works on across turns until achieved", "Session",
args_hint="[text | pause | resume | clear | status]"),
CommandDef("subgoal", "Add or manage extra criteria on the active goal", "Session",
args_hint="[text | remove N | clear]"),
CommandDef("status", "Show session info", "Session"),
CommandDef("whoami", "Show your slash command access (admin / user)", "Info"),
CommandDef("profile", "Show active profile name and home directory", "Info"),
@@ -122,8 +120,6 @@ COMMAND_REGISTRY: list[CommandDef] = [
cli_only=True),
CommandDef("model", "Switch model for this session", "Configuration",
aliases=("provider",), args_hint="[model] [--provider name] [--global]"),
CommandDef("codex-runtime", "Toggle codex app-server runtime for OpenAI/Codex models",
"Configuration", args_hint="[auto|codex_app_server]"),
CommandDef("gquota", "Show Google Gemini Code Assist quota usage", "Info",
cli_only=True),
@@ -472,23 +468,20 @@ def telegram_bot_commands() -> list[tuple[str, str]]:
Telegram command names cannot contain hyphens, so they are replaced with
underscores. Aliases are skipped -- Telegram shows one menu entry per
canonical command.
canonical command. Commands that require arguments are skipped because
selecting a Telegram BotCommand sends only ``/command`` and would execute
an incomplete command.
Built-in commands that require arguments (e.g. /queue, /steer, /background)
are **included** because their handlers return usage text when selected
without a payload, making them discoverable via autocomplete.
Plugin-registered slash commands that require arguments are **excluded**
because plugins may not provide a no-arg usage fallback.
Plugin-registered slash commands are included so plugins get native
autocomplete in Telegram without touching core code.
"""
overrides = _resolve_config_gates()
result: list[tuple[str, str]] = []
for cmd in COMMAND_REGISTRY:
if not _is_gateway_available(cmd, overrides):
continue
# Built-in arg-taking commands are included — their handlers show
# usage text when invoked without arguments, and hiding them from
# the menu hurts discoverability (issue #24312).
if _requires_argument(cmd.args_hint):
continue
tg_name = _sanitize_telegram_name(cmd.name)
if tg_name:
result.append((tg_name, cmd.description))
@@ -1366,9 +1359,9 @@ class SlashCommandCompleter(Completer):
try:
proc = subprocess.run(
cmd, capture_output=True, text=True, timeout=2,
cwd=cwd, encoding="utf-8", errors="replace",
cwd=cwd,
)
if proc.returncode == 0 and proc.stdout and proc.stdout.strip():
if proc.returncode == 0 and proc.stdout.strip():
raw = proc.stdout.strip().split("\n")
# Store relative paths
for p in raw[:5000]:
+1 -1
View File
@@ -238,7 +238,7 @@ _hermes() {{
esac
}}
compdef _hermes hermes
_hermes "$@"
"""
+51 -160
View File
@@ -477,12 +477,6 @@ DEFAULT_CONFIG = {
# threshold before escalating to a full timeout. The warning fires
# once per run and does not interrupt the agent. 0 = disable warning.
"gateway_timeout_warning": 900,
# Maximum time (seconds) the gateway will block an agent waiting for
# a clarify-tool response from the user. Hit this and the agent
# unblocks with "[user did not respond within Xm]" so it can adapt
# rather than pinning the running-agent guard forever. CLI clarify
# blocks indefinitely (input() is synchronous) and ignores this.
"clarify_timeout": 600,
# Periodic "still working" notification interval (seconds).
# Sends a status message every N seconds so the user knows the
# agent hasn't died during long tasks. 0 = disable notifications.
@@ -634,12 +628,6 @@ DEFAULT_CONFIG = {
# so the server maps it to a persistent Firefox profile automatically.
# When false (default), each session gets a random userId (ephemeral).
"managed_persistence": False,
# Optional externally managed Camofox identity. Useful when another
# app owns the visible browser and Hermes should operate in it.
"user_id": "",
"session_key": "",
# Rehydrate tab_id from Camofox before creating a new tab.
"adopt_existing_tab": False,
},
},
@@ -731,18 +719,19 @@ DEFAULT_CONFIG = {
"target_ratio": 0.20, # fraction of threshold to preserve as recent tail
"protect_last_n": 20, # minimum recent messages to keep uncompressed
"hygiene_hard_message_limit": 400, # gateway session-hygiene force-compress threshold by message count
"protect_first_n": 3, # non-system head messages always preserved
# verbatim, in ADDITION to the system prompt
# (which is always implicitly protected). Set to
# 0 for long-running rolling-compaction sessions
# where you want nothing pinned except the
# system prompt + rolling summary + recent tail.
},
# Anthropic prompt caching (Claude via OpenRouter or native Anthropic API).
# cache_ttl must be "5m" or "1h" (Anthropic-supported tiers); other values are ignored.
# long_lived_prefix: when true (default), Claude on Anthropic / OpenRouter / Nous
# Portal uses a split layout: tools[-1] + stable system prefix at long_lived_ttl
# (cross-session cache), last 2 messages at cache_ttl (within-session rolling).
# Set false to keep the legacy "system + last 3 messages" single-tier layout.
# long_lived_ttl: TTL for the cross-session prefix tier ("5m" or "1h"; default "1h").
"prompt_caching": {
"cache_ttl": "5m",
"long_lived_prefix": True,
"long_lived_ttl": "1h",
},
# OpenRouter-specific settings.
@@ -928,14 +917,6 @@ DEFAULT_CONFIG = {
"persistent_output": True,
"persistent_output_max_lines": 200,
"inline_diffs": True, # Show inline diff previews for write actions (write_file, patch, skill_manage)
# File-mutation verifier footer. When true (default), the agent
# appends a one-line advisory to its final response whenever a
# write_file / patch call failed during the turn and was never
# superseded by a successful write to the same path. This catches
# the "batch of parallel patches, half fail, model claims success"
# class of over-claim that otherwise forces users to run
# `git status` to verify edits landed. Set false to suppress.
"file_mutation_verifier": True,
"show_cost": False, # Show $ cost in the status bar (off by default)
"skin": "default",
# UI language for static user-facing messages (approval prompts, a
@@ -977,21 +958,6 @@ DEFAULT_CONFIG = {
# Web dashboard settings
"dashboard": {
"theme": "default", # Dashboard visual theme: "default", "midnight", "ember", "mono", "cyberpunk", "rose"
# Hide the token/cost analytics surfaces (Analytics page, token bars and
# cost figures on the Models page) by default. The numbers shown there
# are a local debug estimate: they only count successful main-agent
# responses with a usable ``response.usage``, and silently exclude every
# auxiliary call (context compression, title generation, vision,
# session search, web extract, smart approval, MCP routing, plugin LLM
# access) plus provider-side retries, fallback attempts, and any call
# whose usage block didn't come back. Cache writes are also missing
# from the API response. On models with heavy auxiliary traffic
# (Kimi K2.6, MiniMax M2.7) the local total can be 10x-100x lower than
# the provider bill, which is worse than hiding the numbers entirely
# because they look precise enough to compare against the provider.
# Set this to True to re-enable the surfaces with the understanding
# that the numbers are a local lower-bound estimate, not billing.
"show_token_analytics": False,
},
# Privacy settings
@@ -1227,6 +1193,45 @@ DEFAULT_CONFIG = {
},
},
# Post-edit lint behaviour. Hermes runs a syntax check after every
# write/patch and surfaces only the errors that edit introduced (see
# ``ShellFileOperations._check_lint_delta`` in tools/file_operations.py).
# The defaults below leave the legacy shell linters (``npx tsc --noEmit``,
# ``go vet``, ``rustfmt --check``, ``py_compile``) in charge — the LSP
# path is opt-in until container/SSH backends grow a way to host the
# language server inside the sandbox.
"lint": {
"lsp": {
# Master switch. When false, ``_check_lint`` skips LSP entirely
# and the in-process / shell linter table runs as before.
"enabled": False,
# Per-language server launch command. Each value is either a
# whitespace-separated string or a list. Missing languages
# fall back to ``tools.lsp_lint._DEFAULT_SERVERS``.
"servers": {
"typescript": "typescript-language-server --stdio",
"typescriptreact": "typescript-language-server --stdio",
"javascript": "typescript-language-server --stdio",
"javascriptreact": "typescript-language-server --stdio",
"rust": "rust-analyzer",
"go": "gopls",
},
# Wall-clock timeout for a single ``didOpen`` → diagnostics
# round-trip. Servers that take longer than this on cold start
# (notably rust-analyzer indexing a fresh checkout) cause the
# caller to fall through to the shell linter.
"diagnostic_timeout": 10,
# Settle window (milliseconds). Typescript-language-server
# publishes an empty diagnostics batch while the program graph
# loads, then re-publishes once the file is fully analysed.
# Re-snapshotting after this delay catches the real verdict.
"settle_ms": 400,
# Shut down a long-lived server process after this many seconds
# of inactivity. The reaper runs in-process; no daemons.
"idle_shutdown": 600,
},
},
# Honcho AI-native memory -- reads ~/.honcho/config.json as single source of truth.
# This section is only needed for hermes-specific overrides; everything else
# (apiKey, workspace, peerName, sessions, enabled) comes from the global config.
@@ -1250,7 +1255,6 @@ DEFAULT_CONFIG = {
"free_response_channels": "", # Comma-separated channel IDs where bot responds without mention
"allowed_channels": "", # If set, bot ONLY responds in these channel IDs (whitelist)
"auto_thread": True, # Auto-create threads on @mention in channels (like Slack)
"thread_require_mention": False, # If True, require @mention in threads too (multi-bot threads)
"reactions": True, # Add 👀/✅/❌ reactions to messages during processing
"channel_prompts": {}, # Per-channel ephemeral system prompts (forum parents apply to child threads)
# Opt-in DM role-based auth (#12136). By default, DISCORD_ALLOWED_ROLES
@@ -1367,21 +1371,6 @@ DEFAULT_CONFIG = {
"domains": [],
"shared_files": [],
},
# Acknowledged supply-chain security advisories. Each entry is the
# ID of an advisory the user has read and acted on (uninstalled the
# compromised package, rotated credentials). Acked advisories no
# longer trigger the startup banner. Add via `hermes doctor --ack
# <id>`; remove by editing the list directly. See
# ``hermes_cli/security_advisories.py`` for the catalog.
"acked_advisories": [],
# Allow Hermes to lazy-install opt-in backend packages from PyPI
# the first time the user enables a backend that needs them
# (e.g. installing ``elevenlabs`` when the user picks ElevenLabs as
# their TTS provider). Set to false to require explicit
# ``pip install`` for everything beyond the base set — appropriate
# for restricted networks, audited environments, or air-gapped
# systems where any runtime install is unacceptable.
"allow_lazy_installs": True,
},
"cron": {
@@ -1520,53 +1509,6 @@ DEFAULT_CONFIG = {
"backup_keep": 5,
},
# Language Server Protocol — semantic diagnostics from real
# language servers (pyright, gopls, rust-analyzer, etc.) wired
# into the post-write lint check used by ``write_file`` and
# ``patch``.
#
# LSP is gated on git-workspace detection: when the agent's
# cwd (or the file being edited) is inside a git worktree, LSP
# runs against that workspace. When neither is in a git repo,
# LSP stays dormant and the in-process syntax check is the only
# tier — handy for Telegram/Discord chats where the cwd is the
# user's home directory.
"lsp": {
# Master toggle. Setting this to false disables the entire
# subsystem — no servers spawn, no background event loop, no
# cost.
"enabled": True,
# Diagnostic-wait mode for the post-write check.
# ``"document"`` waits up to ``wait_timeout`` seconds for the
# current file's diagnostics; ``"full"`` additionally requests
# workspace-wide diagnostics (slower).
"wait_mode": "document",
"wait_timeout": 5.0,
# How to handle missing server binaries.
# ``"auto"`` — try to install via npm/go/pip into
# ``<HERMES_HOME>/lsp/bin/`` on first use.
# ``"manual"`` — only use binaries already on PATH.
# ``"off"`` — alias for ``manual``.
"install_strategy": "auto",
# Per-server overrides. Each key is a server_id from the
# registry (``pyright``, ``typescript``, ``gopls``,
# ``rust-analyzer``, etc.) and accepts:
# disabled: true
# — skip this server even when its extensions match
# command: ["full/path/to/server", "--stdio"]
# — pin a custom binary path; bypasses auto-install
# env: {"KEY": "value"}
# — extra env vars passed to the spawned process
# initialization_options: {...}
# — merged into the LSP ``initializationOptions``
# Empty by default; the registry defaults work for typical
# setups.
"servers": {},
},
# Config schema version - bump this when adding new required fields
"_config_version": 23,
}
@@ -2129,10 +2071,10 @@ OPTIONAL_ENV_VARS = {
"category": "tool",
},
"FAL_KEY": {
"description": "FAL API key for image and video generation",
"description": "FAL API key for image generation",
"prompt": "FAL API key",
"url": "https://fal.ai/",
"tools": ["image_generate", "video_generate"],
"tools": ["image_generate"],
"password": True,
"category": "tool",
},
@@ -4341,34 +4283,10 @@ def load_env() -> Dict[str, str]:
concatenated KEY=VALUE pairs on a single line) are handled
gracefully instead of producing mangled values such as duplicated
bot tokens. See #8908.
The parsed dict is memoised keyed on the .env file mtime, because
``get_env_value()`` is called dozens-to-hundreds of times per
interactive menu render (`hermes tools`, `hermes setup`, status
panels). Sanitisation is O(lines × known-keys), so re-parsing the
same file on every call was burning ~300ms of CPU per `hermes tools`
menu paint on top of the OAuth-refresh slowness. The mtime check
invalidates the cache when the user edits .env mid-process.
"""
global _env_cache
env_path = get_env_path()
try:
mtime = env_path.stat().st_mtime
size = env_path.stat().st_size
cache_key = (str(env_path), mtime, size)
except FileNotFoundError:
cache_key = (str(env_path), None, None)
except Exception:
cache_key = None
if cache_key is not None and _env_cache is not None:
cached_key, cached_vars = _env_cache
if cached_key == cache_key:
return dict(cached_vars)
env_vars: Dict[str, str] = {}
env_vars = {}
if env_path.exists():
# On Windows, open() defaults to the system locale (cp1252) which can
# fail on UTF-8 .env files. Always use explicit UTF-8; tolerate BOM
@@ -4384,33 +4302,10 @@ def load_env() -> Dict[str, str]:
if line and not line.startswith('#') and '=' in line:
key, _, value = line.partition('=')
env_vars[key.strip()] = value.strip().strip('"\'')
if cache_key is not None:
_env_cache = (cache_key, dict(env_vars))
return env_vars
# Module-level memo for load_env(), keyed on (path, mtime, size).
# Editing .env bumps mtime → next load_env() rebuilds. invalidate_env_cache()
# is the explicit knob for writers that update .env via this module
# (set_env_value, save_env, etc.) without relying on filesystem mtime
# resolution.
_env_cache: Optional[Tuple[Tuple[str, Optional[float], Optional[int]], Dict[str, str]]] = None
def invalidate_env_cache() -> None:
"""Clear the load_env() process-level memo.
Writers that mutate .env (set_env_value, save_env, etc.) call this
to guarantee the next load_env() sees their change even on
filesystems with coarse mtime resolution. Reads invalidate naturally
via the mtime/size check.
"""
global _env_cache
_env_cache = None
def _sanitize_env_lines(lines: list) -> list:
"""Fix corrupted .env lines before reading or writing.
@@ -4513,7 +4408,6 @@ def sanitize_env_file() -> int:
pass
raise
_secure_file(env_path)
invalidate_env_cache()
return fixes
@@ -4625,7 +4519,6 @@ def save_env_value(key: str, value: str):
_secure_file(env_path)
os.environ[key] = value
invalidate_env_cache()
def remove_env_value(key: str) -> bool:
@@ -4681,7 +4574,6 @@ def remove_env_value(key: str) -> bool:
_secure_file(env_path)
os.environ.pop(key, None)
invalidate_env_cache()
return found
@@ -4868,7 +4760,6 @@ def show_config():
print(f" Threshold: {compression.get('threshold', 0.50) * 100:.0f}%")
print(f" Target ratio: {compression.get('target_ratio', 0.20) * 100:.0f}% of threshold preserved")
print(f" Protect last: {compression.get('protect_last_n', 20)} messages")
print(f" Protect first: {compression.get('protect_first_n', 3)} non-system head messages")
_aux_comp = config.get('auxiliary', {}).get('compression', {})
_sm = _aux_comp.get('model', '') or '(auto)'
print(f" Model: {_sm}")
+3 -86
View File
@@ -287,8 +287,7 @@ def _build_apikey_providers_list() -> list:
(_pp.models_url or (_pp.base_url.rstrip("/") + "/models"))
if _pp.base_url else None
)
_hc = getattr(_pp, "supports_health_check", True)
_static.append((_label, _key_vars, _models_url, _base_var, _hc))
_static.append((_label, _key_vars, _models_url, _base_var, True))
except Exception:
pass
return _static
@@ -297,101 +296,19 @@ def _build_apikey_providers_list() -> list:
def run_doctor(args):
"""Run diagnostic checks."""
should_fix = getattr(args, 'fix', False)
ack_target = getattr(args, 'ack', None)
# Doctor runs from the interactive CLI, so CLI-gated tool availability
# checks (like cronjob management) should see the same context as `hermes`.
os.environ.setdefault("HERMES_INTERACTIVE", "1")
# Handle `hermes doctor --ack <id>` as a fast path. Persist the ack and
# return without running the rest of the diagnostics — the user has
# already seen the advisory and just wants to silence it.
if ack_target:
from hermes_cli.security_advisories import (
ADVISORIES,
ack_advisory,
)
valid_ids = {a.id for a in ADVISORIES}
if ack_target not in valid_ids:
print(color(
f"Unknown advisory ID: {ack_target!r}. Known IDs: "
f"{', '.join(sorted(valid_ids)) or '(none)'}",
Colors.RED,
))
sys.exit(2)
if ack_advisory(ack_target):
print(color(
f" ✓ Acknowledged advisory {ack_target}. "
f"It will no longer trigger startup banners.",
Colors.GREEN,
))
else:
print(color(
f" ✗ Failed to persist ack for {ack_target}. "
f"Check ~/.hermes/config.yaml is writable.",
Colors.RED,
))
sys.exit(1)
return
issues = []
manual_issues = [] # issues that can't be auto-fixed
fixed_count = 0
print()
print(color("┌─────────────────────────────────────────────────────────┐", Colors.CYAN))
print(color("│ 🩺 Hermes Doctor │", Colors.CYAN))
print(color("└─────────────────────────────────────────────────────────┘", Colors.CYAN))
# =========================================================================
# Check: Security advisories (RUNS FIRST — these are the most urgent)
# =========================================================================
print()
print(color("◆ Security Advisories", Colors.CYAN, Colors.BOLD))
try:
from hermes_cli.security_advisories import (
detect_compromised,
filter_unacked,
full_remediation_text,
get_acked_ids,
)
all_hits = detect_compromised()
fresh_hits = filter_unacked(all_hits)
if fresh_hits:
for hit in fresh_hits:
check_fail(
f"{hit.advisory.title}",
f"({hit.package}=={hit.installed_version})",
)
# Print the full remediation block, indented under the
# check_fail header so it reads as a single section.
for line in full_remediation_text(hit):
if line:
print(f" {color(line, Colors.YELLOW)}")
else:
print()
# Funnel into the action list so the summary block surfaces it
# for users who scroll past the section.
manual_issues.append(
f"Resolve security advisory {hit.advisory.id}: "
f"uninstall {hit.package}=={hit.installed_version} and "
f"rotate credentials, then run "
f"`hermes doctor --ack {hit.advisory.id}`."
)
# Acked-but-still-installed: show as informational so the user
# knows the package is still on disk after the ack.
acked_ids = get_acked_ids()
for h in all_hits:
if h.advisory.id in acked_ids:
check_warn(
f"{h.package}=={h.installed_version} still installed "
f"(advisory {h.advisory.id} acknowledged)",
)
else:
check_ok("No active security advisories")
except Exception as e:
# Never let a bug in the advisory check block the rest of doctor.
check_warn(f"Security advisory check failed: {e}")
# =========================================================================
# Check: Python version
+2 -16
View File
@@ -2164,7 +2164,7 @@ Environment="PATH={sane_path}"
Environment="VIRTUAL_ENV={venv_dir}"
Environment="HERMES_HOME={hermes_home}"
Restart=always
RestartSec=5
RestartSec=60
RestartMaxDelaySec=300
RestartSteps=5
RestartForceExitStatus={GATEWAY_SERVICE_RESTART_EXIT_CODE}
@@ -2199,7 +2199,7 @@ Environment="PATH={sane_path}"
Environment="VIRTUAL_ENV={venv_dir}"
Environment="HERMES_HOME={hermes_home}"
Restart=always
RestartSec=5
RestartSec=60
RestartMaxDelaySec=300
RestartSteps=5
RestartForceExitStatus={GATEWAY_SERVICE_RESTART_EXIT_CODE}
@@ -3658,15 +3658,6 @@ def _all_platforms() -> list[dict]:
``hermes setup gateway`` without needing the gateway to be running.
Built-ins keep their dict shape; plugin entries are adapted to the same
shape with ``_registry_entry`` holding the source.
Platform-specific gating: some platforms can't be configured on
every host. Currently:
- Matrix is hidden on Windows. The [matrix] extra pulls
``mautrix[encryption]`` -> ``python-olm``, which has no Windows
wheel and needs ``make`` + libolm to build from sdist. There's
no native Windows path that works, so we don't offer it in the
picker. Users who want Matrix on Windows can run hermes under
WSL.
"""
# Populate the registry so plugin platforms are visible. Idempotent.
# Bundled platform plugins (``kind: platform``) auto-load unconditionally,
@@ -3680,11 +3671,6 @@ def _all_platforms() -> list[dict]:
logger.debug("plugin discovery failed during platform enumeration: %s", e)
platforms = [dict(p) for p in _PLATFORMS]
# Drop platforms that can't function on this host. See docstring.
if sys.platform == "win32":
platforms = [p for p in platforms if p.get("key") != "matrix"]
by_key = {p["key"]: p for p in platforms}
try:
+12 -141
View File
@@ -33,8 +33,8 @@ import json
import logging
import re
import time
from dataclasses import dataclass, field, asdict
from typing import Any, Dict, List, Optional, Tuple
from dataclasses import dataclass, asdict
from typing import Any, Dict, Optional, Tuple
logger = logging.getLogger(__name__)
@@ -65,21 +65,6 @@ CONTINUATION_PROMPT_TEMPLATE = (
"If you are blocked and need input from the user, say so clearly and stop."
)
# Used when the user has added one or more /subgoal criteria. Surfaced
# to the agent verbatim so it sees what to target on the next turn,
# and surfaced to the judge so the verdict considers them too.
CONTINUATION_PROMPT_WITH_SUBGOALS_TEMPLATE = (
"[Continuing toward your standing goal]\n"
"Goal: {goal}\n\n"
"Additional criteria the user added mid-loop:\n"
"{subgoals_block}\n\n"
"Continue working toward the goal AND all additional criteria. Take "
"the next concrete step. If you believe the goal and every "
"additional criterion are complete, state so explicitly and stop. "
"If you are blocked and need input from the user, say so clearly "
"and stop."
)
JUDGE_SYSTEM_PROMPT = (
"You are a strict judge evaluating whether an autonomous agent has "
@@ -103,23 +88,6 @@ JUDGE_USER_PROMPT_TEMPLATE = (
"Is the goal satisfied?"
)
# Used when the user has added /subgoal criteria. The judge must
# evaluate ALL of them being met, not just the original goal.
JUDGE_USER_PROMPT_WITH_SUBGOALS_TEMPLATE = (
"Goal:\n{goal}\n\n"
"Additional criteria the user added mid-loop (all must also be "
"satisfied for the goal to be DONE):\n{subgoals_block}\n\n"
"Agent's most recent response:\n{response}\n\n"
"Decision: For each numbered criterion above, find concrete "
"evidence in the agent's response that the criterion is "
"satisfied. Do not accept generic phrases like 'all requirements "
"met' or 'implying it was done' — require specific evidence (a "
"file contents excerpt, an output line, a command result). If "
"ANY criterion lacks specific evidence in the response, the goal "
"is NOT done — return CONTINUE.\n\n"
"Is the goal AND every additional criterion satisfied?"
)
# ──────────────────────────────────────────────────────────────────────
# Dataclass
@@ -140,12 +108,6 @@ class GoalState:
last_reason: Optional[str] = None
paused_reason: Optional[str] = None # why we auto-paused (budget, etc.)
consecutive_parse_failures: int = 0 # judge-output parse failures in a row
# User-added criteria appended mid-loop via the /subgoal command.
# When non-empty the judge prompt and continuation prompt both
# include them so the agent works toward them and the judge factors
# them into the verdict. Backwards-compatible: defaults to empty so
# old state_meta rows load unchanged.
subgoals: List[str] = field(default_factory=list)
def to_json(self) -> str:
return json.dumps(asdict(self), ensure_ascii=False)
@@ -153,10 +115,6 @@ class GoalState:
@classmethod
def from_json(cls, raw: str) -> "GoalState":
data = json.loads(raw)
raw_subgoals = data.get("subgoals") or []
subgoals: List[str] = []
if isinstance(raw_subgoals, list):
subgoals = [str(s).strip() for s in raw_subgoals if str(s).strip()]
return cls(
goal=data.get("goal", ""),
status=data.get("status", "active"),
@@ -168,18 +126,8 @@ class GoalState:
last_reason=data.get("last_reason"),
paused_reason=data.get("paused_reason"),
consecutive_parse_failures=int(data.get("consecutive_parse_failures", 0) or 0),
subgoals=subgoals,
)
# --- subgoals helpers -------------------------------------------------
def render_subgoals_block(self) -> str:
"""Render the subgoals as a numbered ``- N. text`` block. Empty
when no subgoals exist."""
if not self.subgoals:
return ""
return "\n".join(f"- {i}. {text}" for i, text in enumerate(self.subgoals, start=1))
# ──────────────────────────────────────────────────────────────────────
# Persistence (SessionDB state_meta)
@@ -336,7 +284,6 @@ def judge_goal(
last_response: str,
*,
timeout: float = DEFAULT_JUDGE_TIMEOUT,
subgoals: Optional[List[str]] = None,
) -> Tuple[str, str, bool]:
"""Ask the auxiliary model whether the goal is satisfied.
@@ -349,11 +296,6 @@ def judge_goal(
auto-pause after N consecutive parse failures (see
``DEFAULT_MAX_CONSECUTIVE_PARSE_FAILURES``).
``subgoals`` is an optional list of user-added criteria (from
``/subgoal``) that the judge must also factor into its DONE/CONTINUE
decision. When non-empty the prompt switches to the with-subgoals
template; otherwise behavior is identical to the original judge.
This is deliberately fail-open: any error returns ``("continue", "...", False)``
so a broken judge doesn't wedge progress — the turn budget and the
consecutive-parse-failures auto-pause are the backstops.
@@ -365,7 +307,7 @@ def judge_goal(
return "continue", "empty response (nothing to evaluate)", False
try:
from agent.auxiliary_client import get_auxiliary_extra_body, get_text_auxiliary_client
from agent.auxiliary_client import get_text_auxiliary_client
except Exception as exc:
logger.debug("goal judge: auxiliary client import failed: %s", exc)
return "continue", "auxiliary client unavailable", False
@@ -379,22 +321,10 @@ def judge_goal(
if client is None or not model:
return "continue", "no auxiliary client configured", False
# Build the prompt — pick the with-subgoals variant when applicable.
clean_subgoals = [s.strip() for s in (subgoals or []) if s and s.strip()]
if clean_subgoals:
subgoals_block = "\n".join(
f"- {i}. {text}" for i, text in enumerate(clean_subgoals, start=1)
)
prompt = JUDGE_USER_PROMPT_WITH_SUBGOALS_TEMPLATE.format(
goal=_truncate(goal, 2000),
subgoals_block=_truncate(subgoals_block, 2000),
response=_truncate(last_response, _JUDGE_RESPONSE_SNIPPET_CHARS),
)
else:
prompt = JUDGE_USER_PROMPT_TEMPLATE.format(
goal=_truncate(goal, 2000),
response=_truncate(last_response, _JUDGE_RESPONSE_SNIPPET_CHARS),
)
prompt = JUDGE_USER_PROMPT_TEMPLATE.format(
goal=_truncate(goal, 2000),
response=_truncate(last_response, _JUDGE_RESPONSE_SNIPPET_CHARS),
)
try:
resp = client.chat.completions.create(
@@ -406,7 +336,6 @@ def judge_goal(
temperature=0,
max_tokens=200,
timeout=timeout,
extra_body=get_auxiliary_extra_body() or None,
)
except Exception as exc:
logger.info("goal judge: API call failed (%s) — falling through to continue", exc)
@@ -467,15 +396,14 @@ class GoalManager:
if s is None or s.status in {"cleared",}:
return "No active goal. Set one with /goal <text>."
turns = f"{s.turns_used}/{s.max_turns} turns"
sub = f", {len(s.subgoals)} subgoal{'s' if len(s.subgoals) != 1 else ''}" if s.subgoals else ""
if s.status == "active":
return f"⊙ Goal (active, {turns}{sub}): {s.goal}"
return f"⊙ Goal (active, {turns}): {s.goal}"
if s.status == "paused":
extra = f"{s.paused_reason}" if s.paused_reason else ""
return f"⏸ Goal (paused, {turns}{sub}{extra}): {s.goal}"
return f"⏸ Goal (paused, {turns}{extra}): {s.goal}"
if s.status == "done":
return f"✓ Goal done ({turns}{sub}): {s.goal}"
return f"Goal ({s.status}, {turns}{sub}): {s.goal}"
return f"✓ Goal done ({turns}): {s.goal}"
return f"Goal ({s.status}, {turns}): {s.goal}"
# --- mutation -----------------------------------------------------
@@ -528,53 +456,6 @@ class GoalManager:
self._state.last_reason = reason
save_goal(self.session_id, self._state)
# --- /subgoal user controls ---------------------------------------
def add_subgoal(self, text: str) -> str:
"""Append a user-added criterion to the active goal. Requires
``has_goal()``; raises ``RuntimeError`` otherwise.
Returns the cleaned text so the caller can show it back to the user.
"""
if self._state is None or not self.has_goal():
raise RuntimeError("no active goal")
text = (text or "").strip()
if not text:
raise ValueError("subgoal text is empty")
self._state.subgoals.append(text)
save_goal(self.session_id, self._state)
return text
def remove_subgoal(self, index_1based: int) -> str:
"""Remove a subgoal by 1-based index. Returns the removed text."""
if self._state is None or not self.has_goal():
raise RuntimeError("no active goal")
idx = int(index_1based) - 1
if idx < 0 or idx >= len(self._state.subgoals):
raise IndexError(
f"index out of range (1..{len(self._state.subgoals)})"
)
removed = self._state.subgoals.pop(idx)
save_goal(self.session_id, self._state)
return removed
def clear_subgoals(self) -> int:
"""Wipe all subgoals. Returns the previous count."""
if self._state is None or not self.has_goal():
raise RuntimeError("no active goal")
prev = len(self._state.subgoals)
self._state.subgoals = []
save_goal(self.session_id, self._state)
return prev
def render_subgoals(self) -> str:
"""Public helper for the /subgoal slash command."""
if self._state is None:
return "(no active goal)"
if not self._state.subgoals:
return "(no subgoals — use /subgoal <text> to add criteria)"
return self._state.render_subgoals_block()
# --- the main entry point called after every turn -----------------
def evaluate_after_turn(
@@ -612,9 +493,7 @@ class GoalManager:
state.turns_used += 1
state.last_turn_at = time.time()
verdict, reason, parse_failed = judge_goal(
state.goal, last_response, subgoals=state.subgoals or None
)
verdict, reason, parse_failed = judge_goal(state.goal, last_response)
state.last_verdict = verdict
state.last_reason = reason
@@ -699,11 +578,6 @@ class GoalManager:
def next_continuation_prompt(self) -> Optional[str]:
if not self._state or self._state.status != "active":
return None
if self._state.subgoals:
return CONTINUATION_PROMPT_WITH_SUBGOALS_TEMPLATE.format(
goal=self._state.goal,
subgoals_block=self._state.render_subgoals_block(),
)
return CONTINUATION_PROMPT_TEMPLATE.format(goal=self._state.goal)
@@ -711,9 +585,6 @@ __all__ = [
"GoalState",
"GoalManager",
"CONTINUATION_PROMPT_TEMPLATE",
"CONTINUATION_PROMPT_WITH_SUBGOALS_TEMPLATE",
"JUDGE_USER_PROMPT_TEMPLATE",
"JUDGE_USER_PROMPT_WITH_SUBGOALS_TEMPLATE",
"DEFAULT_MAX_TURNS",
"load_goal",
"save_goal",
-240
View File
@@ -1,240 +0,0 @@
"""Provider/model inventory context — shared substrate for the dashboard
``/api/model/options``, the TUI ``model.options``/``model.save_key``
JSON-RPC handlers, and the interactive picker.
Before this module the three call-sites each duplicated:
1. The 17-LOC config-slice that pulls ``model.{default,name,provider,base_url}``,
``providers:``, and ``custom_providers:`` out of ``load_config()``;
2. The call into ``list_authenticated_providers`` with the resulting kwargs;
3. (TUI only) a 45-LOC post-pass that merges authenticated rows with
unconfigured ``CANONICAL_PROVIDERS`` rows and emits ``authenticated``/
``auth_type``/``key_env``/``warning`` hints for the picker UI.
Consolidating those three steps into one entry point eliminates two bugs
the duplicates were hiding:
- The dashboard read ``cfg.get("custom_providers")`` directly, missing the
v12+ keyed ``providers:`` form (which the TUI handled via
``get_compatible_custom_providers``).
- The TUI's canonical-merge keyed on ``is_user_defined`` to decide
ordering. Section 3 of ``list_authenticated_providers`` sets
``is_user_defined=True`` even for canonical slugs that appear in the
``providers:`` config dict, which silently demoted them to the tail of
the picker. ``_reorder_canonical`` keys on slug membership instead.
Substrate facts (verified May 2026):
- ``list_authenticated_providers`` already populates each row's
``models`` from the curated catalog (same source as the picker). Do
NOT call ``provider_model_ids()`` per row to "freshen" that bypasses
curation and pulls in non-agentic models (Nous /models returns ~400
IDs including TTS, embeddings, rerankers, image/video generators).
"""
from __future__ import annotations
from dataclasses import dataclass, replace
from typing import Optional
# ─── Public types ───────────────────────────────────────────────────────
@dataclass(frozen=True)
class ConfigContext:
"""Snapshot of the model + provider config every inventory caller
needs. Built once via ``load_picker_context()``; the TUI overlays
live agent state via ``with_overrides()`` before passing through.
"""
current_provider: str
current_model: str
current_base_url: str
user_providers: dict
custom_providers: list
def with_overrides(
self,
*,
current_provider: Optional[str] = None,
current_model: Optional[str] = None,
current_base_url: Optional[str] = None,
) -> "ConfigContext":
"""Return a copy with truthy overrides applied.
Truthy-only because the TUI reads agent attributes that may be
empty strings before an agent is spawned empties must NOT
clobber the disk-config values.
"""
kw: dict = {}
if current_provider:
kw["current_provider"] = current_provider
if current_model:
kw["current_model"] = current_model
if current_base_url:
kw["current_base_url"] = current_base_url
return replace(self, **kw) if kw else self
def load_picker_context() -> ConfigContext:
"""Load the disk-config snapshot every consumer needs.
Replaces the inline 17-LOC config-slice that ``web_server.py`` and
``tui_gateway/server.py`` (×2 sites) used to do.
"""
from hermes_cli.config import get_compatible_custom_providers, load_config
cfg = load_config()
model_cfg = cfg.get("model", {})
if isinstance(model_cfg, dict):
current_model = model_cfg.get("default", model_cfg.get("name", "")) or ""
current_provider = model_cfg.get("provider", "") or ""
current_base_url = model_cfg.get("base_url", "") or ""
else:
# config.model can be a bare string in older configs.
current_model = str(model_cfg) if model_cfg else ""
current_provider = ""
current_base_url = ""
raw = cfg.get("providers")
return ConfigContext(
current_provider=current_provider,
current_model=current_model,
current_base_url=current_base_url,
user_providers=raw if isinstance(raw, dict) else {},
custom_providers=get_compatible_custom_providers(cfg),
)
# ─── Public: payload builder ────────────────────────────────────────────
def build_models_payload(
ctx: ConfigContext,
*,
include_unconfigured: bool = False,
picker_hints: bool = False,
canonical_order: bool = False,
max_models: int = 50,
) -> dict:
"""Build the ``{providers, model, provider}`` shape every consumer
needs from a single substrate call.
Flags:
- ``include_unconfigured``: append ``CANONICAL_PROVIDERS`` rows that
``list_authenticated_providers`` didn't emit (TUI uses this to show
the full provider universe in the picker).
- ``picker_hints``: add ``authenticated``/``auth_type``/``key_env``/
``warning`` per row (TUI ``ModelPickerDialog`` shape).
- ``canonical_order``: reorder canonical-slug rows to
``CANONICAL_PROVIDERS`` declaration order; truly-custom rows go
last (TUI display order).
"""
from hermes_cli.model_switch import list_authenticated_providers
rows = list_authenticated_providers(
current_provider=ctx.current_provider,
current_base_url=ctx.current_base_url,
current_model=ctx.current_model,
user_providers=ctx.user_providers,
custom_providers=ctx.custom_providers,
max_models=max_models,
)
if include_unconfigured:
rows = list(rows) + _append_unconfigured_rows(rows, ctx)
if picker_hints:
_apply_picker_hints(rows)
if canonical_order:
rows = _reorder_canonical(rows)
return {
"providers": rows,
"model": ctx.current_model,
"provider": ctx.current_provider,
}
# ─── Internal: row post-processing ──────────────────────────────────────
def _append_unconfigured_rows(rows: list[dict], ctx: ConfigContext) -> list[dict]:
"""Build skeleton rows for canonical providers missing from ``rows``."""
from hermes_cli.models import CANONICAL_PROVIDERS, _PROVIDER_LABELS
seen = {r["slug"].lower() for r in rows}
cur = (ctx.current_provider or "").lower()
extras: list[dict] = []
for entry in CANONICAL_PROVIDERS:
if entry.slug.lower() in seen:
continue
extras.append(
{
"slug": entry.slug,
"name": _PROVIDER_LABELS.get(entry.slug, entry.label),
"is_current": entry.slug.lower() == cur,
"is_user_defined": False,
"models": [],
"total_models": 0,
"source": "canonical",
}
)
return extras
def _apply_picker_hints(rows: list[dict]) -> None:
"""Add ``authenticated``/``auth_type``/``key_env``/``warning`` per row.
Mutates ``rows`` in-place. Rows already from
``list_authenticated_providers`` are marked ``authenticated=True``;
the unconfigured skeleton rows from ``_append_unconfigured_rows`` get
the picker's setup-hint shape.
"""
from hermes_cli.auth import PROVIDER_REGISTRY
for row in rows:
if "authenticated" in row:
continue
# Distinguish authenticated rows (returned by
# list_authenticated_providers) from skeleton rows (from
# _append_unconfigured_rows). The skeleton rows have empty
# `models` AND source="canonical"; authenticated rows have
# populated `models` OR a non-canonical source.
is_skeleton = row.get("source") == "canonical" and not row.get("models")
row["authenticated"] = not is_skeleton
if not is_skeleton or row.get("is_user_defined"):
continue
cfg = PROVIDER_REGISTRY.get(row["slug"])
auth_type = cfg.auth_type if cfg else "api_key"
key_env = (
cfg.api_key_env_vars[0]
if (cfg and cfg.api_key_env_vars)
else ""
)
row["auth_type"] = auth_type
row["key_env"] = key_env
row["warning"] = (
f"paste {key_env} to activate"
if auth_type == "api_key" and key_env
else f"run `hermes model` to configure ({auth_type})"
)
def _reorder_canonical(rows: list[dict]) -> list[dict]:
"""Canonical slugs in ``CANONICAL_PROVIDERS`` declaration order;
truly-custom rows last.
Keys on slug membership, NOT ``is_user_defined`` section 3 of
``list_authenticated_providers`` sets ``is_user_defined=True`` on
rows from the ``providers:`` config dict even when the slug is
canonical. Keying on the flag would silently demote canonical
providers configured via the new keyed schema.
"""
from hermes_cli.models import CANONICAL_PROVIDERS
order = {e.slug: i for i, e in enumerate(CANONICAL_PROVIDERS)}
canon = sorted(
(r for r in rows if r["slug"] in order),
key=lambda r: order[r["slug"]],
)
extras = [r for r in rows if r["slug"] not in order]
return canon + extras
+1 -2
View File
@@ -155,7 +155,7 @@ def specify_task(
)
try:
from agent.auxiliary_client import get_auxiliary_extra_body, get_text_auxiliary_client
from agent.auxiliary_client import get_text_auxiliary_client
except Exception as exc: # pragma: no cover — import smoke test
logger.debug("specify: auxiliary client import failed: %s", exc)
return SpecifyOutcome(task_id, False, "auxiliary client unavailable")
@@ -187,7 +187,6 @@ def specify_task(
temperature=0.3,
max_tokens=1500,
timeout=timeout or 120,
extra_body=get_auxiliary_extra_body() or None,
)
except Exception as exc:
logger.info(
+45 -271
View File
@@ -2414,31 +2414,30 @@ def _prompt_provider_choice(choices, *, default=0):
def _model_flow_openrouter(config, current_model=""):
"""OpenRouter provider: ensure API key, then pick model."""
from hermes_cli.auth import (
ProviderConfig,
_prompt_model_selection,
_save_model_choice,
deactivate_provider,
)
from hermes_cli.config import get_env_value
from hermes_cli.config import get_env_value, save_env_value
# Route through _prompt_api_key so users can replace a stale/broken key
# in-flow (K/R/C) instead of having to edit ~/.hermes/.env by hand. The
# previous bypass-when-key-exists branch left no way to recover from a
# bad paste short of re-running `hermes setup` from scratch. OpenRouter
# isn't in PROVIDER_REGISTRY so we synthesize a minimal pconfig.
pconfig = ProviderConfig(
id="openrouter",
name="OpenRouter",
auth_type="api_key",
api_key_env_vars=("OPENROUTER_API_KEY",),
)
existing_key = get_env_value("OPENROUTER_API_KEY") or ""
if not existing_key:
api_key = get_env_value("OPENROUTER_API_KEY")
if not api_key:
print("No OpenRouter API key configured.")
print("Get one at: https://openrouter.ai/keys")
print()
_resolved, abort = _prompt_api_key(pconfig, existing_key, provider_id="openrouter")
if abort:
return
try:
import getpass
key = getpass.getpass("OpenRouter API key (or Enter to cancel): ").strip()
except (KeyboardInterrupt, EOFError):
print()
return
if not key:
print("Cancelled.")
return
save_env_value("OPENROUTER_API_KEY", key)
print("API key saved.")
print()
from hermes_cli.models import model_ids, get_pricing_for_provider
@@ -2474,26 +2473,33 @@ def _model_flow_openrouter(config, current_model=""):
def _model_flow_ai_gateway(config, current_model=""):
"""Vercel AI Gateway provider: ensure API key, then pick model with pricing."""
from hermes_cli.auth import (
PROVIDER_REGISTRY,
_prompt_model_selection,
_save_model_choice,
deactivate_provider,
)
from hermes_cli.config import get_env_value
from hermes_cli.config import get_env_value, save_env_value
# Route through _prompt_api_key so users can replace a stale/broken key
# in-flow (K/R/C) instead of having to edit ~/.hermes/.env by hand.
pconfig = PROVIDER_REGISTRY["ai-gateway"]
existing_key = get_env_value("AI_GATEWAY_API_KEY") or ""
if not existing_key:
api_key = get_env_value("AI_GATEWAY_API_KEY")
if not api_key:
print("No Vercel AI Gateway API key configured.")
print(
"Create API key here: https://vercel.com/d?to=%2F%5Bteam%5D%2F%7E%2Fai-gateway&title=AI+Gateway"
)
print("Add a payment method to get $5 in free credits.")
print()
_resolved, abort = _prompt_api_key(pconfig, existing_key, provider_id="ai-gateway")
if abort:
return
try:
import getpass
key = getpass.getpass("AI Gateway API key (or Enter to cancel): ").strip()
except (KeyboardInterrupt, EOFError):
print()
return
if not key:
print("Cancelled.")
return
save_env_value("AI_GATEWAY_API_KEY", key)
print("API key saved.")
print()
from hermes_cli.models import ai_gateway_model_ids, get_pricing_for_provider
@@ -2584,7 +2590,6 @@ def _model_flow_nous(config, current_model="", args=None):
check_nous_free_tier,
partition_nous_models_by_tier,
union_with_portal_free_recommendations,
union_with_portal_paid_recommendations,
)
model_ids = get_curated_nous_model_ids()
@@ -2640,10 +2645,6 @@ def _model_flow_nous(config, current_model="", args=None):
# with the Portal's freeRecommendedModels list so newly-launched free
# models show up even if this CLI build's hardcoded curated list and
# docs-hosted manifest haven't caught up yet.
#
# For paid users: mirror the same idea with paidRecommendedModels so
# newly-launched paid models surface in the picker too — independent
# of CLI release cadence.
unavailable_models: list[str] = []
if free_tier:
model_ids, pricing = union_with_portal_free_recommendations(
@@ -2652,10 +2653,6 @@ def _model_flow_nous(config, current_model="", args=None):
model_ids, unavailable_models = partition_nous_models_by_tier(
model_ids, pricing, free_tier=True
)
else:
model_ids, pricing = union_with_portal_paid_recommendations(
model_ids, pricing, _nous_portal_url,
)
if not model_ids and not unavailable_models:
print("No models available for Nous Portal after filtering.")
@@ -3073,21 +3070,6 @@ def _model_flow_custom(config):
else:
print(f" If /v1 should not be in the base URL, try: {suggested}")
# Prompt for API compatibility mode explicitly so codex-compatible custom
# providers don't silently fall back to chat_completions.
current_model_cfg = config.get("model")
current_api_mode = ""
if isinstance(current_model_cfg, dict):
current_api_mode = str(current_model_cfg.get("api_mode") or "").strip()
api_mode = _prompt_custom_api_mode_selection(
effective_url,
current_api_mode=current_api_mode,
)
if api_mode:
print(f" API mode: {api_mode}")
else:
print(" API mode: auto-detect")
# Select model — use probe results when available, fall back to manual input
model_name = ""
detected_models = probe.get("models") or []
@@ -3151,10 +3133,7 @@ def _model_flow_custom(config):
model["base_url"] = effective_url
if effective_key:
model["api_key"] = effective_key
if api_mode:
model["api_mode"] = api_mode
else:
model.pop("api_mode", None)
model.pop("api_mode", None) # let runtime auto-detect from URL
save_config(cfg)
deactivate_provider()
@@ -3177,10 +3156,7 @@ def _model_flow_custom(config):
_caller_model["base_url"] = effective_url
if effective_key:
_caller_model["api_key"] = effective_key
if api_mode:
_caller_model["api_mode"] = api_mode
else:
_caller_model.pop("api_mode", None)
_caller_model.pop("api_mode", None)
config["model"] = _caller_model
print("Endpoint saved. Use `/model` in chat or `hermes model` to set a model.")
@@ -3191,80 +3167,9 @@ def _model_flow_custom(config):
model_name or "",
context_length=context_length,
name=display_name,
api_mode=api_mode,
)
def _prompt_custom_api_mode_selection(base_url: str, current_api_mode: str = "") -> Optional[str]:
"""Prompt for a custom provider API mode.
Returns an explicit mode string, or None to keep auto-detect behavior.
"""
from hermes_cli.runtime_provider import _detect_api_mode_for_url
detected_mode = _detect_api_mode_for_url(base_url)
normalized_current = str(current_api_mode or "").strip().lower()
default_mode = normalized_current or detected_mode or ""
mode_options = [
(
"",
"Auto-detect",
"Use Hermes URL heuristics; best for standard OpenAI-compatible endpoints.",
),
(
"chat_completions",
"Chat Completions",
"Use /chat/completions for standard OpenAI-compatible servers.",
),
(
"codex_responses",
"Responses / Codex",
"Use /responses for Codex-compatible tool-calling backends.",
),
(
"anthropic_messages",
"Anthropic Messages",
"Use /v1/messages for Anthropic-compatible endpoints.",
),
]
print()
print("Select API compatibility mode:")
for idx, (value, label, description) in enumerate(mode_options, 1):
markers = []
if value == detected_mode:
markers.append("detected")
if value == default_mode:
markers.append("current")
suffix = f" [{' / '.join(markers)}]" if markers else ""
print(f" {idx}. {label}{suffix}")
print(f" {description}")
try:
raw = input(
"Choice [1-4, Enter to keep current/detected]: "
).strip().lower()
except (KeyboardInterrupt, EOFError):
print("\nCancelled.")
raise
if not raw:
return default_mode or None
if raw in {"1", "auto", "detect", "auto-detect"}:
return None
if raw in {"2", "chat", "chat_completions", "completions"}:
return "chat_completions"
if raw in {"3", "responses", "codex", "codex_responses"}:
return "codex_responses"
if raw in {"4", "anthropic", "anthropic_messages", "messages"}:
return "anthropic_messages"
print(f"Invalid API mode choice: {raw}. Falling back to auto-detect.")
return None
def _auto_provider_name(base_url: str) -> str:
"""Generate a display name from a custom endpoint URL.
@@ -3300,12 +3205,12 @@ def _custom_provider_api_key_config_value(provider_info, resolved_api_key=""):
def _save_custom_provider(
base_url, api_key="", model="", context_length=None, name=None, api_mode=None
base_url, api_key="", model="", context_length=None, name=None
):
"""Save a custom endpoint to custom_providers in config.yaml.
Deduplicates by base_url if the URL already exists, updates the
model name, context_length, and api_mode but doesn't add a duplicate entry.
model name and context_length but doesn't add a duplicate entry.
Uses *name* when provided, otherwise auto-generates from the URL.
"""
from hermes_cli.config import load_config, save_config
@@ -3331,13 +3236,6 @@ def _save_custom_provider(
models_cfg[model] = {"context_length": context_length}
entry["models"] = models_cfg
changed = True
if api_mode:
if entry.get("api_mode") != api_mode:
entry["api_mode"] = api_mode
changed = True
elif "api_mode" in entry:
entry.pop("api_mode", None)
changed = True
if changed:
cfg["custom_providers"] = providers
save_config(cfg)
@@ -3352,8 +3250,6 @@ def _save_custom_provider(
entry["api_key"] = api_key
if model:
entry["model"] = model
if api_mode:
entry["api_mode"] = api_mode
if model and context_length:
entry["models"] = {model: {"context_length": context_length}}
@@ -3807,7 +3703,7 @@ def _model_flow_named_custom(config, provider_info):
save_config(cfg)
else:
# Save model name to the custom_providers entry for next time
_save_custom_provider(base_url, config_api_key, model_name, api_mode=api_mode)
_save_custom_provider(base_url, config_api_key, model_name)
print(f"\n✅ Model set to: {model_name}")
print(f" Provider: {name} ({base_url})")
@@ -4964,37 +4860,6 @@ def _model_flow_api_key_provider(config, provider_id, current_model=""):
)
if model_list:
print(f" Found {len(model_list)} model(s) from Ollama Cloud")
elif provider_id == "novita":
from hermes_cli.models import fetch_api_models
api_key_for_probe = existing_key or (get_env_value(key_env) if key_env else "")
curated = _PROVIDER_MODELS.get(provider_id, [])
live_models = fetch_api_models(api_key_for_probe, effective_base)
if live_models:
model_list = live_models
print(f" Found {len(model_list)} model(s) from {pconfig.name} API")
else:
mdev_models: list = []
try:
from agent.models_dev import list_agentic_models
mdev_models = list_agentic_models(provider_id)
except Exception:
pass
if mdev_models:
seen = {m.lower() for m in mdev_models}
model_list = list(mdev_models)
for m in curated:
if m.lower() not in seen:
model_list.append(m)
seen.add(m.lower())
print(f" Found {len(model_list)} model(s) from models.dev registry")
else:
model_list = curated
if model_list:
print(
f' Showing {len(model_list)} curated models — use "Enter custom model name" for others.'
)
else:
curated = _PROVIDER_MODELS.get(provider_id, [])
@@ -6827,74 +6692,6 @@ def _cleanup_quarantined_exes(scripts_dir: Path | None = None) -> None:
pass
def _refresh_active_lazy_features() -> None:
"""Refresh lazy-installed backends after a code update.
When pyproject.toml's ``[all]`` extra was slimmed down (May 2026), most
optional backends moved to ``tools/lazy_deps.py`` and only install on
first use. ``hermes update`` runs ``uv pip install -e .[all]`` which
leaves those packages untouched so if we bump a pin in
:data:`LAZY_DEPS` (CVE response, transitive bug fix), users who already
activated the backend keep the stale version forever.
This function asks lazy_deps which features the user has previously
activated and reinstalls them under the current pins. Features the
user never enabled stay quiet no churn for cold backends.
Never raises. A failure here must not block the rest of the update.
"""
try:
from tools import lazy_deps
except Exception as exc:
logger.debug("Lazy refresh skipped (import failed): %s", exc)
return
try:
active = lazy_deps.active_features()
except Exception as exc:
logger.debug("Lazy refresh skipped (active_features failed): %s", exc)
return
if not active:
return
print()
print(f"→ Refreshing {len(active)} active lazy backend(s)...")
try:
results = lazy_deps.refresh_active_features(prompt=False)
except Exception as exc:
# refresh_active_features is documented as never-raise, but defend
# the update flow against future regressions.
print(f" ⚠ Lazy refresh failed unexpectedly: {exc}")
return
refreshed = [f for f, s in results.items() if s == "refreshed"]
current = [f for f, s in results.items() if s == "current"]
failed = [(f, s) for f, s in results.items() if s.startswith("failed:")]
skipped = [(f, s) for f, s in results.items() if s.startswith("skipped:")]
if refreshed:
print(f"{len(refreshed)} refreshed: {', '.join(refreshed)}")
if current:
print(f"{len(current)} already current")
if skipped:
# Most common reason: security.allow_lazy_installs=false. Show one
# line so the user knows why; not an error.
names = ", ".join(f for f, _ in skipped)
reason = skipped[0][1].split(": ", 1)[-1]
print(f" · {len(skipped)} skipped ({reason}): {names}")
if failed:
for feature, status in failed:
reason = status.split(": ", 1)[-1]
# Clip noisy pip stderr to keep update output legible.
if len(reason) > 200:
reason = reason[:200] + "..."
print(f"{feature} failed to refresh: {reason}")
print(" Backends keep their previously-installed version; rerun")
print(" `hermes update` once the upstream issue is resolved.")
def _install_python_dependencies_with_optional_fallback(
install_cmd_prefix: list[str],
*,
@@ -7817,8 +7614,6 @@ def _cmd_update_impl(args, gateway_mode: bool):
_install_psutil_android_compat(pip_cmd)
_install_python_dependencies_with_optional_fallback(pip_cmd, group=install_group)
_refresh_active_lazy_features()
_update_node_dependencies()
_build_web_ui(PROJECT_ROOT / "web")
@@ -9364,7 +9159,7 @@ def _build_provider_choices() -> list[str]:
"auto", "openrouter", "nous", "openai-codex", "copilot-acp", "copilot",
"anthropic", "gemini", "google-gemini-cli", "xai", "bedrock", "azure-foundry",
"ollama-cloud", "huggingface", "zai", "kimi-coding", "kimi-coding-cn",
"stepfun", "minimax", "minimax-cn", "kilocode", "novita", "xiaomi", "arcee",
"stepfun", "minimax", "minimax-cn", "kilocode", "xiaomi", "arcee",
"nvidia", "deepseek", "alibaba", "qwen-oauth", "opencode-zen", "opencode-go",
]
@@ -9384,10 +9179,10 @@ _BUILTIN_SUBCOMMANDS = frozenset(
"computer-use",
"config", "cron", "curator", "dashboard", "debug", "doctor",
"dump", "fallback", "gateway", "hooks", "import", "insights",
"kanban", "login", "logout", "logs", "lsp", "mcp", "memory",
"model", "pairing", "plugins", "profile", "sessions", "setup",
"skills", "slack", "status", "tools", "uninstall", "update",
"version", "webhook", "whatsapp", "chat",
"kanban", "login", "logout", "logs", "mcp", "memory", "model",
"pairing", "plugins", "profile", "sessions", "setup", "skills",
"slack", "status", "tools", "uninstall", "update", "version",
"webhook", "whatsapp", "chat",
# Help-ish invocations — plugin commands not being listed in
# top-level --help is an acceptable trade-off for skipping an
# expensive eager import of every bundled plugin module.
@@ -9586,7 +9381,7 @@ def main():
gateway_parser = subparsers.add_parser(
"gateway",
help="Messaging gateway management",
description="Manage the messaging gateway (Telegram, Discord, WhatsApp, Weixin, and more)",
description="Manage the messaging gateway (Telegram, Discord, WhatsApp)",
)
gateway_subparsers = gateway_parser.add_subparsers(dest="gateway_command")
@@ -9729,17 +9524,6 @@ def main():
gateway_parser.set_defaults(func=cmd_gateway)
# =========================================================================
# lsp command
# =========================================================================
try:
from agent.lsp.cli import register_subparser as _lsp_register
_lsp_register(subparsers)
except Exception as _lsp_err: # noqa: BLE001
# LSP is optional infrastructure — never let a registration
# failure break the CLI overall.
logger.debug("LSP CLI registration failed: %s", _lsp_err)
# =========================================================================
# setup command
# =========================================================================
@@ -10302,16 +10086,6 @@ def main():
doctor_parser.add_argument(
"--fix", action="store_true", help="Attempt to fix issues automatically"
)
doctor_parser.add_argument(
"--ack",
metavar="ADVISORY_ID",
default=None,
help=(
"Acknowledge a security advisory by ID and exit. After ack, the "
"advisory will no longer trigger startup banners. Run `hermes "
"doctor` first to see active advisories and their IDs."
),
)
doctor_parser.set_defaults(func=cmd_doctor)
# =========================================================================
+1 -8
View File
@@ -10,7 +10,6 @@ from __future__ import annotations
import getpass
import os
import sys
import shlex
from pathlib import Path
from hermes_constants import get_hermes_home
@@ -135,7 +134,7 @@ def _install_dependencies(provider_name: str) -> None:
if check_cmd:
try:
subprocess.run(
shlex.split(check_cmd), check=True, capture_output=True, timeout=5
check_cmd, shell=True, capture_output=True, timeout=5
)
except Exception:
if install_cmd:
@@ -379,12 +378,6 @@ def _write_env_vars(env_path: Path, env_writes: dict) -> None:
new_lines.append(f"{key}={val}")
env_path.write_text("\n".join(new_lines) + "\n", encoding="utf-8")
# Restrict permissions — .env holds API keys and tokens.
try:
import stat
env_path.chmod(stat.S_IRUSR | stat.S_IWUSR) # 0600
except OSError:
pass # Windows or read-only FS
# ---------------------------------------------------------------------------
+3 -140
View File
@@ -445,14 +445,6 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
# Azure Foundry: user-provided endpoint and model.
# Empty list because models depend on the endpoint configuration.
"azure-foundry": [],
"novita": [
"moonshotai/kimi-k2.5",
"minimax/minimax-m2.7",
"zai-org/glm-5",
"deepseek/deepseek-v3-0324",
"deepseek/deepseek-r1-0528",
"qwen/qwen3-235b-a22b-fp8",
],
}
# Vercel AI Gateway: derive the bare-model-id catalog from the curated
@@ -629,71 +621,6 @@ def union_with_portal_free_recommendations(
return (augmented_ids, augmented_pricing)
def union_with_portal_paid_recommendations(
curated_ids: list[str],
pricing: dict[str, dict[str, str]],
portal_base_url: str = "",
*,
force_refresh: bool = False,
) -> tuple[list[str], dict[str, dict[str, str]]]:
"""Augment curated list with the Portal's ``paidRecommendedModels``.
Mirror of :func:`union_with_portal_free_recommendations` for paid-tier
users. The Portal's ``/api/nous/recommended-models`` endpoint advertises
which paid models are blessed *right now* independent of what the
in-repo ``_PROVIDER_MODELS["nous"]`` list happens to contain or whether
the docs-hosted catalog manifest has been rebuilt since the last release.
For paid-tier users this lets newly-launched paid models surface in the
picker even if the user is running an older Hermes that doesn't ship
them in its hardcoded curated list. This function returns an augmented
``(model_ids, pricing)`` pair where:
* Portal paid recommendations missing from ``curated_ids`` are
appended at the front (so the picker shows them first).
* ``pricing`` is left untouched we deliberately do NOT synthesize
pricing entries for paid models. Live pricing is fetched separately
via :func:`get_pricing_for_provider`; if the live endpoint hasn't
published pricing yet, the picker shows a blank price column rather
than fabricating numbers. (The free helper synthesizes ``$0`` so
:func:`partition_nous_models_by_tier` keeps free models selectable;
no equivalent gating applies on the paid side, so synthesis would
only mislead the user.)
Failures (network, parse, missing field) are silent and degrade to
returning the inputs unchanged never block the picker on a
Portal-side hiccup.
"""
try:
payload = fetch_nous_recommended_models(
portal_base_url, force_refresh=force_refresh
)
except Exception:
return (list(curated_ids), dict(pricing))
paid_block = payload.get("paidRecommendedModels") if isinstance(payload, dict) else None
if not isinstance(paid_block, list) or not paid_block:
return (list(curated_ids), dict(pricing))
portal_paid_ids: list[str] = []
for entry in paid_block:
name = _extract_model_name(entry)
if name:
portal_paid_ids.append(name)
if not portal_paid_ids:
return (list(curated_ids), dict(pricing))
augmented_ids = list(curated_ids)
seen = set(augmented_ids)
# Prepend Portal paid recommendations that aren't already curated, so
# the Portal-blessed picks surface first in the picker.
new_ones = [mid for mid in portal_paid_ids if mid not in seen]
if new_ones:
augmented_ids = new_ones + augmented_ids
return (augmented_ids, dict(pricing))
# ---------------------------------------------------------------------------
# TTL cache for free-tier detection — avoids repeated API calls within a
# session while still picking up upgrades quickly.
@@ -913,14 +840,13 @@ class ProviderEntry(NamedTuple):
CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("nous", "Nous Portal", "Nous Portal (Nous Research subscription)"),
ProviderEntry("openrouter", "OpenRouter", "OpenRouter (100+ models, pay-per-use)"),
ProviderEntry("novita", "NovitaAI", "NovitaAI (AI-native cloud: Model API, Agent Sandbox, GPU Cloud)"),
ProviderEntry("lmstudio", "LM Studio", "LM Studio (local desktop app with built-in model server)"),
ProviderEntry("anthropic", "Anthropic", "Anthropic (Claude models — API key or Claude Code)"),
ProviderEntry("openai-codex", "OpenAI Codex", "OpenAI Codex"),
ProviderEntry("alibaba", "Qwen Cloud", "Qwen Cloud / DashScope Coding (Qwen + multi-provider)"),
ProviderEntry("xiaomi", "Xiaomi MiMo", "Xiaomi MiMo (MiMo-V2.5 and V2 models — pro, omni, flash)"),
ProviderEntry("tencent-tokenhub", "Tencent TokenHub", "Tencent TokenHub (Hy3 Preview — direct API via tokenhub.tencentmaas.com)"),
ProviderEntry("nvidia", "NVIDIA NIM", "NVIDIA NIM (Nemotron models — build.nvidia.com or local NIM)"),
ProviderEntry("qwen-oauth", "Qwen OAuth (Portal)", "Qwen OAuth (reuses local Qwen CLI login)"),
ProviderEntry("copilot", "GitHub Copilot", "GitHub Copilot (uses GITHUB_TOKEN or gh auth token)"),
ProviderEntry("copilot-acp", "GitHub Copilot ACP", "GitHub Copilot ACP (spawns `copilot --acp --stdio`)"),
ProviderEntry("huggingface", "Hugging Face", "Hugging Face Inference Providers (20+ open models)"),
@@ -935,6 +861,7 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("minimax", "MiniMax", "MiniMax (global direct API)"),
ProviderEntry("minimax-oauth", "MiniMax (OAuth)", "MiniMax via OAuth browser login (Coding Plan, minimax.io)"),
ProviderEntry("minimax-cn", "MiniMax (China)", "MiniMax China (domestic direct API)"),
ProviderEntry("alibaba", "Alibaba Cloud (DashScope)","Alibaba Cloud / DashScope Coding (Qwen + multi-provider)"),
ProviderEntry("ollama-cloud", "Ollama Cloud", "Ollama Cloud (cloud-hosted open models — ollama.com)"),
ProviderEntry("arcee", "Arcee AI", "Arcee AI (Trinity models — direct API)"),
ProviderEntry("gmi", "GMI Cloud", "GMI Cloud (multi-model direct API)"),
@@ -944,7 +871,6 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("bedrock", "AWS Bedrock", "AWS Bedrock (Claude, Nova, Llama, DeepSeek — IAM or API key)"),
ProviderEntry("azure-foundry", "Azure Foundry", "Azure Foundry (OpenAI-style or Anthropic-style endpoint — your Azure AI deployment)"),
ProviderEntry("ai-gateway", "Vercel AI Gateway", "Vercel AI Gateway"),
ProviderEntry("qwen-oauth", "Qwen OAuth (Portal)", "Qwen OAuth (reuses local Qwen CLI login)"),
]
# Auto-extend CANONICAL_PROVIDERS with any provider registered in providers/
@@ -1023,8 +949,6 @@ _PROVIDER_ALIASES = {
"hf": "huggingface",
"hugging-face": "huggingface",
"huggingface-hub": "huggingface",
"novita-ai": "novita",
"novitaai": "novita",
"mimo": "xiaomi",
"xiaomi-mimo": "xiaomi",
"tencent": "tencent-tokenhub",
@@ -1505,7 +1429,7 @@ def _resolve_nous_pricing_credentials() -> tuple[str, str]:
def get_pricing_for_provider(provider: str, *, force_refresh: bool = False) -> dict[str, dict[str, str]]:
"""Return live pricing for providers that support it (openrouter, nous, ai-gateway, novita)."""
"""Return live pricing for providers that support it (openrouter, nous, ai-gateway)."""
normalized = normalize_provider(provider)
if normalized == "openrouter":
return fetch_models_with_pricing(
@@ -1515,8 +1439,6 @@ def get_pricing_for_provider(provider: str, *, force_refresh: bool = False) -> d
)
if normalized == "ai-gateway":
return fetch_ai_gateway_pricing(force_refresh=force_refresh)
if normalized == "novita":
return _fetch_novita_pricing(force_refresh=force_refresh)
if normalized == "nous":
api_key, base_url = _resolve_nous_pricing_credentials()
if base_url:
@@ -1533,65 +1455,6 @@ def get_pricing_for_provider(provider: str, *, force_refresh: bool = False) -> d
return {}
def _fetch_novita_pricing(
timeout: float = 8.0,
*,
force_refresh: bool = False,
) -> dict[str, dict[str, str]]:
"""Fetch pricing from NovitaAI /v1/models.
NovitaAI returns input/output prices per million tokens in units of
0.0001 USD. Convert them to the per-token strings used by the shared
pricing formatter.
Results are cached in ``_pricing_cache`` keyed on the resolved base URL,
matching the pattern used by ``fetch_ai_gateway_pricing`` without this,
every menu render or pricing lookup re-hits the network.
"""
api_key = os.getenv("NOVITA_API_KEY", "").strip()
if not api_key:
return {}
base_url = os.getenv("NOVITA_BASE_URL", "").strip() or "https://api.novita.ai/openai/v1"
cache_key = base_url.rstrip("/")
if not force_refresh and cache_key in _pricing_cache:
return _pricing_cache[cache_key]
url = cache_key + "/models"
headers = {
"Authorization": f"Bearer {api_key}",
"Accept": "application/json",
"User-Agent": _HERMES_USER_AGENT,
}
try:
req = urllib.request.Request(url, headers=headers)
with urllib.request.urlopen(req, timeout=timeout) as resp:
payload = json.loads(resp.read().decode())
except Exception:
_pricing_cache[cache_key] = {}
return {}
result: dict[str, dict[str, str]] = {}
for item in payload.get("data", []):
if not isinstance(item, dict):
continue
mid = item.get("id")
if not mid:
continue
inp = item.get("input_token_price_per_m")
out = item.get("output_token_price_per_m")
if inp is None and out is None:
continue
result[str(mid)] = {
"prompt": str(float(inp or 0) / 10_000 / 1_000_000),
"completion": str(float(out or 0) / 10_000 / 1_000_000),
}
_pricing_cache[cache_key] = result
return result
# All provider IDs and aliases that are valid for the provider:model syntax.
_KNOWN_PROVIDER_NAMES: set[str] = (
set(_PROVIDER_LABELS.keys())
-75
View File
@@ -542,61 +542,6 @@ class PluginContext:
self.manifest.name, provider.name,
)
# -- video gen provider registration -------------------------------------
def register_video_gen_provider(self, provider) -> None:
"""Register a video generation backend.
``provider`` must be an instance of
:class:`agent.video_gen_provider.VideoGenProvider`. The
``provider.name`` attribute is what ``video_gen.provider`` in
``config.yaml`` matches against when routing ``video_generate``
tool calls.
"""
from agent.video_gen_provider import VideoGenProvider
from agent.video_gen_registry import register_provider as _register_video_provider
if not isinstance(provider, VideoGenProvider):
logger.warning(
"Plugin '%s' tried to register a video_gen provider that does "
"not inherit from VideoGenProvider. Ignoring.",
self.manifest.name,
)
return
_register_video_provider(provider)
logger.info(
"Plugin '%s' registered video_gen provider: %s",
self.manifest.name, provider.name,
)
# -- web search/extract provider registration ----------------------------
def register_web_search_provider(self, provider) -> None:
"""Register a web search/extract backend.
``provider`` must be an instance of
:class:`agent.web_search_provider.WebSearchProvider`. The
``provider.name`` attribute is what ``web.search_backend`` /
``web.extract_backend`` / ``web.backend`` in ``config.yaml``
matches against when routing ``web_search`` / ``web_extract``
tool calls.
"""
from agent.web_search_provider import WebSearchProvider
from agent.web_search_registry import register_provider as _register_web_provider
if not isinstance(provider, WebSearchProvider):
logger.warning(
"Plugin '%s' tried to register a web provider that does "
"not inherit from WebSearchProvider. Ignoring.",
self.manifest.name,
)
return
_register_web_provider(provider)
logger.info(
"Plugin '%s' registered web provider: %s",
self.manifest.name, provider.name,
)
# -- platform adapter registration ---------------------------------------
def register_platform(
@@ -1367,21 +1312,6 @@ def invoke_hook(hook_name: str, **kwargs: Any) -> List[Any]:
_thread_tool_whitelist = threading.local()
def set_thread_tool_whitelist(
allowed: Optional[Set[str]],
deny_msg_fmt: str = "Tool '{tool_name}' denied: not in this thread's tool whitelist",
) -> None:
_thread_tool_whitelist.allowed = allowed
_thread_tool_whitelist.fmt = deny_msg_fmt
def clear_thread_tool_whitelist() -> None:
_thread_tool_whitelist.allowed = None
def get_pre_tool_call_block_message(
tool_name: str,
args: Optional[Dict[str, Any]],
@@ -1400,11 +1330,6 @@ def get_pre_tool_call_block_message(
directive wins. Invalid or irrelevant hook return values are
silently ignored so existing observer-only hooks are unaffected.
"""
allowed = getattr(_thread_tool_whitelist, "allowed", None)
if allowed is not None and tool_name not in allowed:
fmt = getattr(_thread_tool_whitelist, "fmt", "Tool '{tool_name}' denied")
return fmt.format(tool_name=tool_name)
hook_results = invoke_hook(
"pre_tool_call",
tool_name=tool_name,
+85
View File
@@ -1295,6 +1295,91 @@ def rename_profile(old_name: str, new_name: str) -> Path:
return new_dir
# ---------------------------------------------------------------------------
# Tab completion
# ---------------------------------------------------------------------------
def generate_bash_completion() -> str:
"""Generate a bash completion script for hermes profile names."""
return '''# Hermes Agent profile completion
# Add to ~/.bashrc: eval "$(hermes completion bash)"
_hermes_profiles() {
local profiles_dir="$HOME/.hermes/profiles"
local profiles="default"
if [ -d "$profiles_dir" ]; then
profiles="$profiles $(ls "$profiles_dir" 2>/dev/null)"
fi
echo "$profiles"
}
_hermes_completion() {
local cur prev
cur="${COMP_WORDS[COMP_CWORD]}"
prev="${COMP_WORDS[COMP_CWORD-1]}"
# Complete profile names after -p / --profile
if [[ "$prev" == "-p" || "$prev" == "--profile" ]]; then
COMPREPLY=($(compgen -W "$(_hermes_profiles)" -- "$cur"))
return
fi
# Complete profile subcommands
if [[ "${COMP_WORDS[1]}" == "profile" ]]; then
case "$prev" in
profile)
COMPREPLY=($(compgen -W "list use create delete show alias rename export import" -- "$cur"))
return
;;
use|delete|show|alias|rename|export)
COMPREPLY=($(compgen -W "$(_hermes_profiles)" -- "$cur"))
return
;;
esac
fi
# Top-level subcommands
if [[ "$COMP_CWORD" == 1 ]]; then
local commands="chat model gateway setup status cron doctor dump config skills tools mcp sessions profile update version"
COMPREPLY=($(compgen -W "$commands" -- "$cur"))
fi
}
complete -F _hermes_completion hermes
'''
def generate_zsh_completion() -> str:
"""Generate a zsh completion script for hermes profile names."""
return '''#compdef hermes
# Hermes Agent profile completion
# Add to ~/.zshrc: eval "$(hermes completion zsh)"
_hermes() {
local -a profiles
profiles=(default)
if [[ -d "$HOME/.hermes/profiles" ]]; then
profiles+=("${(@f)$(ls $HOME/.hermes/profiles 2>/dev/null)}")
fi
_arguments \\
'-p[Profile name]:profile:($profiles)' \\
'--profile[Profile name]:profile:($profiles)' \\
'1:command:(chat model gateway setup status cron doctor dump config skills tools mcp sessions profile update version)' \\
'*::arg:->args'
case $words[1] in
profile)
_arguments '1:action:(list use create delete show alias rename export import)' \\
'2:profile:($profiles)'
;;
esac
}
_hermes "$@"
'''
# ---------------------------------------------------------------------------
# Profile env resolution (called from _apply_profile_override)
# ---------------------------------------------------------------------------
-9
View File
@@ -156,11 +156,6 @@ HERMES_OVERLAYS: Dict[str, HermesOverlay] = {
is_aggregator=True,
base_url_env_var="HF_BASE_URL",
),
"novita": HermesOverlay(
transport="openai_chat",
is_aggregator=True,
base_url_env_var="NOVITA_BASE_URL",
),
"xai": HermesOverlay(
transport="codex_responses",
base_url_override="https://api.x.ai/v1",
@@ -314,10 +309,6 @@ ALIASES: Dict[str, str] = {
"hugging-face": "huggingface",
"huggingface-hub": "huggingface",
# novita
"novita-ai": "novita",
"novitaai": "novita",
# xiaomi
"mimo": "xiaomi",
"xiaomi-mimo": "xiaomi",
+1 -52
View File
@@ -164,18 +164,7 @@ def _copilot_runtime_api_mode(model_cfg: Dict[str, Any], api_key: str) -> str:
return "chat_completions"
_VALID_API_MODES = {
"chat_completions",
"codex_responses",
"anthropic_messages",
"bedrock_converse",
# Optional opt-in: hand the entire turn to a `codex app-server` subprocess
# so terminal/file-ops/patching/sandboxing run inside Codex's own runtime
# instead of Hermes' tool dispatch. Gated behind config key
# `model.openai_runtime == "codex_app_server"` AND provider in
# {"openai", "openai-codex"}. Default is unchanged.
"codex_app_server",
}
_VALID_API_MODES = {"chat_completions", "codex_responses", "anthropic_messages", "bedrock_converse"}
def _parse_api_mode(raw: Any) -> Optional[str]:
@@ -187,32 +176,6 @@ def _parse_api_mode(raw: Any) -> Optional[str]:
return None
def _maybe_apply_codex_app_server_runtime(
*,
provider: str,
api_mode: str,
model_cfg: Optional[Dict[str, Any]],
) -> str:
"""Optional opt-in: rewrite api_mode → "codex_app_server" for OpenAI/Codex
providers when the user has explicitly enabled that runtime via
`model.openai_runtime: codex_app_server` in config.yaml.
Default behavior is preserved: when the key is unset, "auto", or empty,
this function is a no-op. Only providers in {"openai", "openai-codex"}
are eligible other providers (anthropic, openrouter, etc.) cannot be
rerouted through codex.
Returns the (possibly-rewritten) api_mode."""
if not model_cfg:
return api_mode
if provider not in ("openai", "openai-codex"):
return api_mode
runtime = str(model_cfg.get("openai_runtime") or "").strip().lower()
if runtime == "codex_app_server":
return "codex_app_server"
return api_mode
def _resolve_runtime_from_pool_entry(
*,
provider: str,
@@ -242,14 +205,6 @@ def _resolve_runtime_from_pool_entry(
elif provider == "google-gemini-cli":
api_mode = "chat_completions"
base_url = base_url or "cloudcode-pa://google"
elif provider == "minimax-oauth":
# MiniMax OAuth tokens are valid only against the Anthropic Messages
# compatible endpoint. Do not honor stale model.api_mode values from a
# prior OpenAI-compatible provider, or the client will hit
# /chat/completions under /anthropic and receive a bare nginx 404.
api_mode = "anthropic_messages"
pconfig = PROVIDER_REGISTRY.get(provider)
base_url = base_url or (pconfig.inference_base_url if pconfig else "")
elif provider == "anthropic":
api_mode = "anthropic_messages"
cfg_provider = str(model_cfg.get("provider") or "").strip().lower()
@@ -330,12 +285,6 @@ def _resolve_runtime_from_pool_entry(
if api_mode == "anthropic_messages" and provider in {"opencode-zen", "opencode-go"}:
base_url = re.sub(r"/v1/?$", "", base_url)
# Optional opt-in: route OpenAI/Codex turns through `codex app-server`.
# Inert when `model.openai_runtime` is unset or "auto".
api_mode = _maybe_apply_codex_app_server_runtime(
provider=provider, api_mode=api_mode, model_cfg=model_cfg
)
return {
"provider": provider,
"api_mode": api_mode,
-451
View File
@@ -1,451 +0,0 @@
"""
Security advisory checker for Hermes Agent.
Detects known-compromised Python packages installed in the active venv
(supply-chain attacks like the Mini Shai-Hulud worm of May 2026 that
poisoned ``mistralai 2.4.6`` on PyPI) and surfaces remediation guidance to
the user.
Design goals:
- **Cheap.** A single ``importlib.metadata.version()`` call per advisory
package. Safe to run on every CLI startup.
- **Loud when it matters, silent otherwise.** If no compromised package is
installed, the user sees nothing.
- **Acknowledgeable.** Once the user has read and acted on an advisory they
can dismiss it via ``hermes doctor --ack <id>``; the ack is persisted to
``config.security.acked_advisories`` and survives restart.
- **Extensible.** Adding a new advisory is one entry in ``ADVISORIES``;
adding a new compromised version is a one-line edit. No code changes
needed when the next worm hits.
The check is invoked from three places:
1. ``hermes doctor`` (and ``hermes doctor --ack <id>``)
2. CLI startup banner (one short line, then full guidance via
``hermes doctor``)
3. Gateway startup (logged to gateway.log; first interactive message gets
a one-line operator banner)
This module is intentionally dependency-free beyond the stdlib so it can
run in environments where the rest of Hermes failed to import.
"""
from __future__ import annotations
import logging
import os
import sys
from dataclasses import dataclass, field
from pathlib import Path
from typing import Iterable, Optional
logger = logging.getLogger(__name__)
# =============================================================================
# Advisory catalog
#
# Each advisory is a community-facing security warning about one or more
# specific package versions that are known to be compromised. To add a new
# advisory:
#
# 1. Append a new ``Advisory`` to ``ADVISORIES`` below
# 2. Set ``compromised`` to a tuple of ``(pkg_name, frozenset_of_versions)``
# — version strings must match what ``importlib.metadata.version()``
# returns. Use an empty frozenset to flag *any installed version*
# (rare; only when the maintainer namespace itself is compromised).
# 3. Write 2-4 short ``remediation`` lines a non-expert can copy/paste.
#
# Do NOT remove old advisories. Once an advisory ships, leave it in place so
# users running an older release with the compromised package still get
# warned. Mark superseded ones via ``superseded_by`` if needed.
# =============================================================================
@dataclass(frozen=True)
class Advisory:
"""One security advisory entry.
Attributes:
id: stable identifier used for acks (e.g. ``shai-hulud-2026-05``).
Lowercase-hyphen, never reused.
title: one-line headline shown in banners.
summary: 1-3 sentence description of what was compromised and how.
url: reference URL (Socket advisory, GitHub advisory, PyPI page).
compromised: tuple of ``(package_name, frozenset_of_versions)``
pairs. Empty frozenset means "any version of this package is
considered suspect" — use sparingly.
remediation: ordered list of steps the user should take. First step
should be the uninstall command; subsequent steps the credential
audit / rotation guidance.
published: ISO date string for sort order.
"""
id: str
title: str
summary: str
url: str
compromised: tuple[tuple[str, frozenset[str]], ...]
remediation: tuple[str, ...]
published: str = ""
severity: str = "high" # low / medium / high / critical
ADVISORIES: tuple[Advisory, ...] = (
Advisory(
id="shai-hulud-2026-05",
title="Mini Shai-Hulud worm — mistralai 2.4.6 compromised on PyPI",
summary=(
"PyPI quarantined the mistralai package on 2026-05-12 after a "
"malicious 2.4.6 release. The worm steals credentials from "
"environment variables and credential files (~/.npmrc, ~/.pypirc, "
"~/.aws/credentials, GitHub PATs, cloud SDK tokens) and exfils "
"them to a hardcoded webhook. If you ran any Python process that "
"imported mistralai 2.4.6 — including hermes when configured "
"with provider=mistral for TTS or STT — assume those credentials "
"are exposed."
),
url="https://socket.dev/blog/mini-shai-hulud-worm-pypi",
compromised=(
("mistralai", frozenset({"2.4.6"})),
),
remediation=(
"Run: pip uninstall -y mistralai (or: uv pip uninstall mistralai)",
"Rotate API keys in ~/.hermes/.env (OpenRouter, Anthropic, OpenAI, "
"Nous, GitHub, AWS, Google, Mistral, etc.).",
"Audit ~/.npmrc, ~/.pypirc, ~/.aws/credentials, ~/.config/gh/hosts.yml, "
"and any other credential files for tokens that may have been read.",
"Check GitHub for unexpected new SSH keys, deploy keys, or webhook "
"additions on repos you have admin on.",
"After cleanup: hermes doctor --ack shai-hulud-2026-05 to dismiss "
"this warning.",
),
published="2026-05-12",
severity="critical",
),
)
# =============================================================================
# Detection
# =============================================================================
@dataclass(frozen=True)
class AdvisoryHit:
"""One package-version match against an advisory."""
advisory: Advisory
package: str
installed_version: str
def _installed_version(pkg_name: str) -> Optional[str]:
"""Return the installed version of ``pkg_name``, or None if not installed.
Uses ``importlib.metadata`` so we don't depend on pip being importable
inside the active venv (uv-created venvs may lack pip).
"""
try:
from importlib.metadata import PackageNotFoundError, version
except ImportError: # py<3.8 — Hermes requires 3.10+ but defensive.
return None
try:
return version(pkg_name)
except PackageNotFoundError:
return None
except Exception:
# Some metadata corruption modes raise ValueError or OSError. Don't
# let advisory checking crash the CLI startup path.
logger.debug("importlib.metadata.version(%s) raised", pkg_name, exc_info=True)
return None
def detect_compromised(
advisories: Iterable[Advisory] = ADVISORIES,
) -> list[AdvisoryHit]:
"""Scan installed packages and return all advisory hits.
A "hit" means an advisory's listed package is installed AND the version
is in the compromised set (or the compromised set is empty, meaning
*any* version is suspect).
"""
hits: list[AdvisoryHit] = []
for advisory in advisories:
for pkg_name, bad_versions in advisory.compromised:
installed = _installed_version(pkg_name)
if installed is None:
continue
if not bad_versions or installed in bad_versions:
hits.append(AdvisoryHit(
advisory=advisory,
package=pkg_name,
installed_version=installed,
))
return hits
# =============================================================================
# Acknowledgement persistence
#
# Acks live under ``security.acked_advisories`` in config.yaml as a list of
# advisory IDs. The list is the only state — no per-host data, no
# timestamps, no fingerprints. Users sharing a config.yaml across machines
# (rare but possible) get the same dismissal everywhere, which is the
# correct behavior for a global advisory.
# =============================================================================
def get_acked_ids() -> set[str]:
"""Return the set of advisory IDs the user has dismissed.
Returns an empty set if config can't be loaded (don't block startup
just because config is broken the advisory will keep firing until
config is repaired, which is fine).
"""
try:
from hermes_cli.config import load_config
cfg = load_config()
except Exception:
logger.debug("Could not load config for advisory acks", exc_info=True)
return set()
sec = cfg.get("security") or {}
raw = sec.get("acked_advisories") or []
if not isinstance(raw, list):
return set()
return {str(x).strip() for x in raw if str(x).strip()}
def ack_advisory(advisory_id: str) -> bool:
"""Persist an ack for ``advisory_id``. Returns True on success.
Idempotent acking an already-acked ID is a no-op.
"""
advisory_id = advisory_id.strip()
if not advisory_id:
return False
try:
from hermes_cli.config import load_config, save_config
except Exception:
logger.warning("Could not import config module to persist ack")
return False
try:
cfg = load_config()
sec = cfg.setdefault("security", {})
existing = sec.get("acked_advisories") or []
if not isinstance(existing, list):
existing = []
if advisory_id not in existing:
existing.append(advisory_id)
sec["acked_advisories"] = existing
save_config(cfg)
return True
except Exception:
logger.exception("Failed to persist advisory ack for %s", advisory_id)
return False
def filter_unacked(hits: list[AdvisoryHit]) -> list[AdvisoryHit]:
"""Return only hits whose advisories the user has not dismissed."""
if not hits:
return []
acked = get_acked_ids()
return [h for h in hits if h.advisory.id not in acked]
# =============================================================================
# Rendering helpers
# =============================================================================
def _term_supports_color() -> bool:
if os.environ.get("NO_COLOR"):
return False
if not sys.stdout.isatty():
return False
return True
def short_banner_lines(hits: list[AdvisoryHit]) -> list[str]:
"""Return 1-3 short lines suitable for a startup banner.
Caller is responsible for color/styling. Always names the worst hit
explicitly so the user knows what's wrong without running doctor.
"""
if not hits:
return []
primary = hits[0]
lines = [
f"SECURITY ADVISORY [{primary.advisory.id}]: {primary.advisory.title}",
f" Detected: {primary.package}=={primary.installed_version}",
" Run 'hermes doctor' for remediation steps.",
]
if len(hits) > 1:
lines.insert(1, f" ({len(hits) - 1} additional advisor"
f"{'ies' if len(hits) > 2 else 'y'} also active.)")
return lines
def full_remediation_text(hit: AdvisoryHit) -> list[str]:
"""Return a multi-line block describing the advisory + remediation."""
a = hit.advisory
lines = [
f"=== {a.title} ===",
f"ID: {a.id} Severity: {a.severity} Published: {a.published}",
f"Detected: {hit.package}=={hit.installed_version}",
f"Reference: {a.url}",
"",
a.summary,
"",
"Remediation:",
]
for i, step in enumerate(a.remediation, 1):
lines.append(f" {i}. {step}")
return lines
# =============================================================================
# Startup-banner gating
#
# We do NOT want to hammer the user with the banner on every command. Once
# they've seen it inside a 24h window we cache that fact in
# ``~/.hermes/cache/advisory_banner_seen`` (a single line per advisory ID:
# ``<id> <iso8601_timestamp>``).
#
# Acked advisories never re-banner. Cached-but-not-acked advisories
# re-banner after 24h so the user doesn't fully forget.
# =============================================================================
_BANNER_CACHE_FILE = "advisory_banner_seen"
_BANNER_REPEAT_HOURS = 24
def _banner_cache_path() -> Optional[Path]:
try:
from hermes_constants import get_hermes_home
cache_dir = Path(get_hermes_home()) / "cache"
cache_dir.mkdir(parents=True, exist_ok=True)
return cache_dir / _BANNER_CACHE_FILE
except Exception:
return None
def _read_banner_cache() -> dict[str, float]:
p = _banner_cache_path()
if p is None or not p.exists():
return {}
out: dict[str, float] = {}
try:
for line in p.read_text(encoding="utf-8").splitlines():
line = line.strip()
if not line:
continue
parts = line.split(None, 1)
if len(parts) != 2:
continue
advisory_id, ts = parts
try:
out[advisory_id] = float(ts)
except ValueError:
continue
except Exception:
return {}
return out
def _write_banner_cache(seen: dict[str, float]) -> None:
p = _banner_cache_path()
if p is None:
return
try:
lines = [f"{aid} {ts}" for aid, ts in seen.items()]
p.write_text("\n".join(lines) + "\n", encoding="utf-8")
except Exception:
logger.debug("Could not write advisory banner cache", exc_info=True)
def hits_due_for_banner(
hits: list[AdvisoryHit],
*,
repeat_hours: int = _BANNER_REPEAT_HOURS,
) -> list[AdvisoryHit]:
"""Return only hits whose banner is due (not acked, not recently shown).
Side effect: stamps the banner cache for any hit that's about to be
shown. Callers should subsequently render the result.
"""
import time
fresh = filter_unacked(hits)
if not fresh:
return []
now = time.time()
cache = _read_banner_cache()
cutoff = now - (repeat_hours * 3600)
due: list[AdvisoryHit] = []
for hit in fresh:
last = cache.get(hit.advisory.id, 0.0)
if last < cutoff:
due.append(hit)
cache[hit.advisory.id] = now
if due:
_write_banner_cache(cache)
return due
# =============================================================================
# Public entry points used by doctor / CLI / gateway
# =============================================================================
def render_doctor_section(hits: list[AdvisoryHit]) -> tuple[bool, list[str]]:
"""Render the security-advisory section for ``hermes doctor``.
Returns ``(has_problems, lines)``. Caller is responsible for printing
with whatever color scheme it uses.
"""
fresh = filter_unacked(hits)
if not fresh:
return False, ["No active security advisories. ✓"]
lines: list[str] = []
for i, hit in enumerate(fresh):
if i:
lines.append("")
lines.extend(full_remediation_text(hit))
return True, lines
def startup_banner(hits: list[AdvisoryHit]) -> Optional[str]:
"""Return a printable startup banner, or None if nothing is due.
Updates the banner cache as a side effect (so the next call within
24h returns None for the same hit).
"""
due = hits_due_for_banner(hits)
if not due:
return None
lines = short_banner_lines(due)
if _term_supports_color():
red = "\x1b[1;31m"
reset = "\x1b[0m"
return red + "\n".join(lines) + reset
return "\n".join(lines)
def gateway_log_message(hits: list[AdvisoryHit]) -> Optional[str]:
"""Return a one-line log message for gateway operators, or None."""
fresh = filter_unacked(hits)
if not fresh:
return None
if len(fresh) == 1:
h = fresh[0]
return (f"Security advisory [{h.advisory.id}] active: "
f"{h.package}=={h.installed_version} matches {h.advisory.title}. "
f"See {h.advisory.url}")
return (f"{len(fresh)} security advisories active "
f"(IDs: {', '.join(h.advisory.id for h in fresh)}). "
f"Run `hermes doctor` on the gateway host for details.")
+14 -20
View File
@@ -454,26 +454,6 @@ def _print_setup_summary(config: dict, hermes_home):
else:
tool_status.append(("Image Generation", False, "FAL_KEY or OPENAI_API_KEY"))
# Video generation — opt-in via `hermes tools` → Video Generation.
# Only show the row when a plugin reports available so we don't badger
# users who don't care about video gen with a "missing" status line.
try:
from agent.video_gen_registry import list_providers as _list_video_providers
from hermes_cli.plugins import _ensure_plugins_discovered as _ensure_plugins
_ensure_plugins()
_video_backend = None
for _vp in _list_video_providers():
try:
if _vp.is_available():
_video_backend = _vp.display_name
break
except Exception:
continue
except Exception:
_video_backend = None
if _video_backend:
tool_status.append((f"Video Generation ({_video_backend})", True, None))
# TTS — show configured provider
tts_provider = cfg_get(config, "tts", "provider", default="edge")
if subscription_features.tts.managed_by_nous:
@@ -3266,6 +3246,18 @@ def run_setup_wizard(args):
print_info(f" cp {_backup_path} {config_path}")
_print_setup_summary(config, hermes_home)
_offer_launch_chat()
def _offer_launch_chat():
"""Prompt the user to jump straight into chat after setup."""
print()
if not prompt_yes_no("Launch hermes chat now?", True):
return
from hermes_cli.relaunch import relaunch
relaunch(["chat"])
def _run_first_time_quick_setup(config: dict, hermes_home, is_existing: bool):
"""Streamlined first-time setup: provider, model, terminal & messaging.
@@ -3309,6 +3301,8 @@ def _run_first_time_quick_setup(config: dict, hermes_home, is_existing: bool):
_print_setup_summary(config, hermes_home)
_offer_launch_chat()
def _run_quick_setup(config: dict, hermes_home):
"""Quick setup — only configure items that are missing."""
+8 -37
View File
@@ -666,46 +666,25 @@ def _load_skin_from_yaml(path: Path) -> Optional[Dict[str, Any]]:
return None
def _mapping_or_empty(value: Any, *, section: str, skin_name: str) -> Dict[str, Any]:
"""Return a mapping value or an empty dict when the section type is invalid."""
if isinstance(value, dict):
return value
if value is None:
return {}
logger.warning(
"Skin '%s' has invalid '%s' section type (%s); ignoring section",
skin_name,
section,
type(value).__name__,
)
return {}
def _build_skin_config(data: Dict[str, Any]) -> SkinConfig:
"""Build a SkinConfig from a raw dict (built-in or loaded from YAML)."""
# Start with default values as base for missing keys
default = _BUILTIN_SKINS["default"]
skin_name = str(data.get("name", "unknown"))
color_overrides = _mapping_or_empty(data.get("colors"), section="colors", skin_name=skin_name)
spinner_overrides = _mapping_or_empty(data.get("spinner"), section="spinner", skin_name=skin_name)
branding_overrides = _mapping_or_empty(data.get("branding"), section="branding", skin_name=skin_name)
emoji_overrides = _mapping_or_empty(data.get("tool_emojis"), section="tool_emojis", skin_name=skin_name)
colors = dict(default.get("colors", {}))
colors.update(color_overrides)
colors.update(data.get("colors", {}))
spinner = dict(default.get("spinner", {}))
spinner.update(spinner_overrides)
spinner.update(data.get("spinner", {}))
branding = dict(default.get("branding", {}))
branding.update(branding_overrides)
branding.update(data.get("branding", {}))
return SkinConfig(
name=skin_name,
name=data.get("name", "unknown"),
description=data.get("description", ""),
colors=colors,
spinner=spinner,
branding=branding,
tool_prefix=data.get("tool_prefix", default.get("tool_prefix", "")),
tool_emojis=emoji_overrides,
tool_emojis=data.get("tool_emojis", {}),
banner_logo=data.get("banner_logo", ""),
banner_hero=data.get("banner_hero", ""),
)
@@ -849,14 +828,10 @@ def get_prompt_toolkit_style_overrides() -> Dict[str, str]:
except Exception:
return {}
# Input/prompt: leave unset by default so the typed text inherits
# the terminal's foreground color (readable in both light and dark
# color schemes). Skins can opt into a colored prompt by setting
# `prompt` explicitly in their YAML.
prompt = skin.get_color("prompt", "")
prompt = skin.get_color("prompt", "#FFF8DC")
input_rule = skin.get_color("input_rule", "#CD7F32")
title = skin.get_color("banner_title", "#FFD700")
text = skin.get_color("banner_text", "#FFF8DC")
text = skin.get_color("banner_text", prompt)
dim = skin.get_color("banner_dim", "#555555")
label = skin.get_color("ui_label", title)
warn = skin.get_color("ui_warn", "#FF8C00")
@@ -876,11 +851,7 @@ def get_prompt_toolkit_style_overrides() -> Dict[str, str]:
menu_meta_current_bg = skin.get_color("completion_menu_meta_current_bg", menu_current_bg)
return {
# Typed input always uses terminal default fg/bg so it's
# readable in both light and dark Terminal.app modes. The
# skin's `prompt` color (if any) only styles the prompt symbol,
# NOT the user's typed text.
"input-area": "",
"input-area": prompt,
"placeholder": f"{dim} italic",
"prompt": prompt,
"prompt-working": f"{dim} italic",
+72 -273
View File
@@ -60,7 +60,6 @@ CONFIGURABLE_TOOLSETS = [
("vision", "👁️ Vision / Image Analysis", "vision_analyze"),
("video", "🎬 Video Analysis", "video_analyze (requires video-capable model)"),
("image_gen", "🎨 Image Generation", "image_generate"),
("video_gen", "🎬 Video Generation", "video_generate (text-to-video + image-to-video)"),
("moa", "🧠 Mixture of Agents", "mixture_of_agents"),
("tts", "🔊 Text-to-Speech", "text_to_speech"),
("skills", "📚 Skills", "list, view, manage"),
@@ -83,11 +82,7 @@ CONFIGURABLE_TOOLSETS = [
# Toolsets that are OFF by default for new installs.
# They're still in _HERMES_CORE_TOOLS (available at runtime if enabled),
# but the setup checklist won't pre-select them for first-time users.
#
# Video gen is off by default — it's a niche, paid, slow feature. Users
# who want it opt in via `hermes tools` → Video Generation, which walks
# them through provider + model selection.
_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl", "spotify", "discord", "discord_admin", "video", "video_gen"}
_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl", "spotify", "discord", "discord_admin", "video"}
# Platform-scoped toolsets: only appear in the `hermes tools` checklist for
# these platforms, and only resolve/save for these platforms. A toolset
@@ -210,9 +205,15 @@ TOOL_CATEGORIES = {
],
"tts_provider": "elevenlabs",
},
# Mistral (Voxtral TTS) temporarily hidden — `mistralai` PyPI
# package is currently quarantined (malicious 2.4.6 release on
# 2026-05-12). Restore this entry once PyPI un-quarantines.
{
"name": "Mistral (Voxtral TTS)",
"badge": "paid",
"tag": "Multilingual, native Opus",
"env_vars": [
{"key": "MISTRAL_API_KEY", "prompt": "Mistral API key", "url": "https://console.mistral.ai/"},
],
"tts_provider": "mistral",
},
{
"name": "Google Gemini TTS",
"badge": "preview",
@@ -245,15 +246,6 @@ TOOL_CATEGORIES = {
"setup_title": "Select Search Provider",
"setup_note": "A free DuckDuckGo search skill is also included — skip this if you don't need a premium provider.",
"icon": "🔍",
# Per-provider rows are injected at runtime from
# plugins.web.<vendor>.provider via _plugin_web_search_providers()
# in _visible_providers(). Only non-provider UX setup-flow rows
# for the firecrawl backend are listed here:
# - "Nous Subscription" — managed Firecrawl billed via Nous
# subscription (requires_nous_auth + override_env_vars).
# - "Firecrawl Self-Hosted" — points firecrawl at a private
# Docker instance via FIRECRAWL_API_URL only.
# See PR #25182 for the migration rationale.
"providers": [
{
"name": "Nous Subscription",
@@ -265,6 +257,42 @@ TOOL_CATEGORIES = {
"managed_nous_feature": "web",
"override_env_vars": ["FIRECRAWL_API_KEY", "FIRECRAWL_API_URL"],
},
{
"name": "Firecrawl Cloud",
"badge": "★ recommended",
"tag": "Full-featured search, extract, and crawl",
"web_backend": "firecrawl",
"env_vars": [
{"key": "FIRECRAWL_API_KEY", "prompt": "Firecrawl API key", "url": "https://firecrawl.dev"},
],
},
{
"name": "Exa",
"badge": "paid",
"tag": "Neural search with semantic understanding",
"web_backend": "exa",
"env_vars": [
{"key": "EXA_API_KEY", "prompt": "Exa API key", "url": "https://exa.ai"},
],
},
{
"name": "Parallel",
"badge": "paid",
"tag": "AI-powered search and extract",
"web_backend": "parallel",
"env_vars": [
{"key": "PARALLEL_API_KEY", "prompt": "Parallel API key", "url": "https://parallel.ai"},
],
},
{
"name": "Tavily",
"badge": "free tier",
"tag": "Search, extract, and crawl — 1000 free searches/mo",
"web_backend": "tavily",
"env_vars": [
{"key": "TAVILY_API_KEY", "prompt": "Tavily API key", "url": "https://app.tavily.com/home"},
],
},
{
"name": "Firecrawl Self-Hosted",
"badge": "free · self-hosted",
@@ -274,6 +302,32 @@ TOOL_CATEGORIES = {
{"key": "FIRECRAWL_API_URL", "prompt": "Your Firecrawl instance URL (e.g., http://localhost:3002)"},
],
},
{
"name": "SearXNG",
"badge": "free · self-hosted · search only",
"tag": "Privacy-respecting metasearch engine — search only (pair with any extract provider)",
"web_backend": "searxng",
"env_vars": [
{"key": "SEARXNG_URL", "prompt": "Your SearXNG instance URL (e.g., http://localhost:8080)", "url": "https://searxng.github.io/searxng/"},
],
},
{
"name": "Brave Search (Free Tier)",
"badge": "free tier · search only",
"tag": "2,000 queries/mo free — search only (pair with any extract provider)",
"web_backend": "brave-free",
"env_vars": [
{"key": "BRAVE_SEARCH_API_KEY", "prompt": "Brave Search subscription token", "url": "https://brave.com/search/api/"},
],
},
{
"name": "DuckDuckGo (ddgs)",
"badge": "free · no key · search only",
"tag": "Search via the ddgs Python package — no API key (pair with any extract provider)",
"web_backend": "ddgs",
"env_vars": [],
"post_setup": "ddgs",
},
],
},
"image_gen": {
@@ -301,15 +355,6 @@ TOOL_CATEGORIES = {
},
],
},
"video_gen": {
"name": "Video Generation",
"icon": "🎬",
# Providers list is intentionally empty — every video gen backend
# is a plugin, surfaced by ``_plugin_video_gen_providers()`` and
# injected by ``_visible_providers``. Mirrors the design we'll
# converge image_gen toward.
"providers": [],
},
"browser": {
"name": "Browser Automation",
"icon": "🌐",
@@ -1486,101 +1531,6 @@ def _plugin_image_gen_providers() -> list[dict]:
return rows
def _plugin_video_gen_providers() -> list[dict]:
"""Build picker-row dicts from plugin-registered video gen providers.
Mirrors ``_plugin_image_gen_providers`` exactly every video backend
is a plugin, so this function is the *only* source of provider rows
for the Video Generation category. The hardcoded ``TOOL_CATEGORIES``
entry for ``video_gen`` keeps an empty providers list.
"""
try:
from agent.video_gen_registry import list_providers
from hermes_cli.plugins import _ensure_plugins_discovered
_ensure_plugins_discovered()
providers = list_providers()
except Exception:
return []
rows: list[dict] = []
for provider in providers:
try:
schema = provider.get_setup_schema()
except Exception:
continue
if not isinstance(schema, dict):
continue
rows.append(
{
"name": schema.get("name", provider.display_name),
"badge": schema.get("badge", ""),
"tag": schema.get("tag", ""),
"env_vars": schema.get("env_vars", []),
"video_gen_plugin_name": provider.name,
}
)
return rows
# Mirror of _plugin_image_gen_providers for web search backends. Surfaces
# every plugin-registered web provider so it appears in the
# "Web Search & Extract" picker. All seven providers (brave-free, ddgs,
# searxng, exa, parallel, tavily, firecrawl) live as plugins after
# PR #25182 — this helper is the sole source of truth for the category's
# provider rows. The hardcoded entries that used to drive the category
# were deleted in the same PR; only the two non-provider UX rows
# ("Nous Subscription" managed-gateway entry, "Firecrawl Self-Hosted")
# remain in TOOL_CATEGORIES because they describe alternative *setup
# flows* for the firecrawl backend rather than distinct providers.
def _plugin_web_search_providers() -> list[dict]:
"""Build picker-row dicts from plugin-registered web search providers.
Each returned dict is a regular ``TOOL_CATEGORIES`` provider row. It
populates both ``web_backend`` (legacy field consumed by setup +
selection helpers) and ``web_search_plugin_name`` (informational
marker) so the picker behaves identically whether a provider is
hardcoded or plugin-registered.
After PR #25182, all seven web providers (brave-free, ddgs, searxng,
exa, parallel, tavily, firecrawl) are plugins; this helper is the sole
source of provider rows for the Web Search & Extract category.
"""
try:
from agent.web_search_registry import list_providers as _list_web_providers
from hermes_cli.plugins import _ensure_plugins_discovered
_ensure_plugins_discovered()
providers = _list_web_providers()
except Exception:
return []
rows: list[dict] = []
for provider in providers:
name = getattr(provider, "name", None)
if not name:
continue
try:
schema = provider.get_setup_schema()
except Exception:
continue
if not isinstance(schema, dict):
continue
row = {
"name": schema.get("name", provider.display_name),
"badge": schema.get("badge", ""),
"tag": schema.get("tag", ""),
"env_vars": schema.get("env_vars", []),
"web_backend": name,
"web_search_plugin_name": name,
}
# Optional pass-through fields the schema can opt into.
if schema.get("post_setup"):
row["post_setup"] = schema["post_setup"]
rows.append(row)
return rows
def _visible_providers(cat: dict, config: dict) -> list[dict]:
"""Return provider entries visible for the current auth/config state."""
features = get_nous_subscription_features(config)
@@ -1597,19 +1547,6 @@ def _visible_providers(cat: dict, config: dict) -> list[dict]:
if cat.get("name") == "Image Generation":
visible.extend(_plugin_image_gen_providers())
# Inject plugin-registered video_gen backends. Unlike image_gen,
# video_gen has NO hardcoded providers — every backend is a plugin.
if cat.get("name") == "Video Generation":
visible.extend(_plugin_video_gen_providers())
# Inject plugin-registered web search backends. After PR #25182, this
# is the SOLE source of provider rows for the Web Search & Extract
# category — the per-provider hardcoded entries were deleted. The two
# remaining hardcoded rows ("Nous Subscription", "Firecrawl
# Self-Hosted") are non-provider UX setup-flow rows for firecrawl.
if cat.get("name") == "Web Search & Extract":
visible.extend(_plugin_web_search_providers())
return visible
@@ -1677,23 +1614,6 @@ def _toolset_needs_configuration_prompt(ts_key: str, config: dict) -> bool:
from agent.image_gen_registry import list_providers
from hermes_cli.plugins import _ensure_plugins_discovered
_ensure_plugins_discovered()
for provider in list_providers():
try:
if provider.is_available():
return False
except Exception:
continue
except Exception:
pass
return True
if ts_key == "video_gen":
# Satisfied when any plugin-registered video gen provider reports
# available — no in-tree fallback (every backend is a plugin).
try:
from agent.video_gen_registry import list_providers
from hermes_cli.plugins import _ensure_plugins_discovered
_ensure_plugins_discovered()
for provider in list_providers():
try:
@@ -2038,106 +1958,6 @@ def _select_plugin_image_gen_provider(plugin_name: str, config: dict) -> None:
_configure_imagegen_model_for_plugin(plugin_name, config)
# ─── Video Generation Model Pickers ───────────────────────────────────────────
def _plugin_video_gen_catalog(plugin_name: str):
"""Return ``(catalog_dict, default_model_id)`` for a video gen plugin.
Mirrors :func:`_plugin_image_gen_catalog`. Returns ``({}, None)`` when
the plugin isn't registered or has no models.
"""
try:
from agent.video_gen_registry import get_provider
from hermes_cli.plugins import _ensure_plugins_discovered
_ensure_plugins_discovered()
provider = get_provider(plugin_name)
except Exception:
return {}, None
if provider is None:
return {}, None
try:
models = provider.list_models() or []
default = provider.default_model()
except Exception:
return {}, None
catalog = {m["id"]: m for m in models if isinstance(m, dict) and "id" in m}
return catalog, default
def _configure_videogen_model_for_plugin(plugin_name: str, config: dict) -> None:
"""Prompt for a video gen model from a plugin's catalog.
Mirrors :func:`_configure_imagegen_model_for_plugin`. Writes the
selection to ``video_gen.model``.
"""
catalog, default_model = _plugin_video_gen_catalog(plugin_name)
if not catalog:
return
cur_cfg = config.setdefault("video_gen", {})
if not isinstance(cur_cfg, dict):
cur_cfg = {}
config["video_gen"] = cur_cfg
current_model = cur_cfg.get("model") or default_model
if current_model not in catalog:
current_model = default_model
model_ids = list(catalog.keys())
ordered = [current_model] + [m for m in model_ids if m != current_model]
widths = {
"model": max(len(m) for m in model_ids),
"speed": max((len(catalog[m].get("speed", "")) for m in model_ids), default=6),
"strengths": max((len(catalog[m].get("strengths", "")) for m in model_ids), default=0),
}
print()
header = (
f" {'Model':<{widths['model']}} "
f"{'Speed':<{widths['speed']}} "
f"{'Strengths':<{widths['strengths']}} "
f"Price"
)
print(color(header, Colors.CYAN))
rows = []
for mid in ordered:
meta = catalog[mid]
row = (
f" {mid:<{widths['model']}} "
f"{meta.get('speed', ''):<{widths['speed']}} "
f"{meta.get('strengths', ''):<{widths['strengths']}} "
f"{meta.get('price', '')}"
)
if mid == current_model:
row += " ← currently in use"
rows.append(row)
idx = _prompt_choice(
f" Choose {plugin_name} model:",
rows,
default=0,
)
chosen = ordered[idx]
cur_cfg["model"] = chosen
_print_success(f" Model set to: {chosen}")
def _select_plugin_video_gen_provider(plugin_name: str, config: dict) -> None:
"""Persist a plugin-backed video generation provider selection."""
vid_cfg = config.setdefault("video_gen", {})
if not isinstance(vid_cfg, dict):
vid_cfg = {}
config["video_gen"] = vid_cfg
vid_cfg["provider"] = plugin_name
vid_cfg["use_gateway"] = False
_print_success(f" video_gen.provider set to: {plugin_name}")
_configure_videogen_model_for_plugin(plugin_name, config)
def _configure_provider(provider: dict, config: dict):
"""Configure a single provider - prompt for API keys and set config."""
env_vars = provider.get("env_vars", [])
@@ -2200,12 +2020,6 @@ def _configure_provider(provider: dict, config: dict):
if plugin_name:
_select_plugin_image_gen_provider(plugin_name, config)
return
# Plugin-registered video_gen provider — same flow, different
# registry.
video_plugin = provider.get("video_gen_plugin_name")
if video_plugin:
_select_plugin_video_gen_provider(video_plugin, config)
return
# Imagegen backends prompt for model selection after backend pick.
backend = provider.get("imagegen_backend")
if backend:
@@ -2254,10 +2068,6 @@ def _configure_provider(provider: dict, config: dict):
if plugin_name:
_select_plugin_image_gen_provider(plugin_name, config)
return
video_plugin = provider.get("video_gen_plugin_name")
if video_plugin:
_select_plugin_video_gen_provider(video_plugin, config)
return
# Imagegen backends prompt for model selection after env vars are in.
backend = provider.get("imagegen_backend")
if backend:
@@ -2482,11 +2292,6 @@ def _reconfigure_provider(provider: dict, config: dict):
if plugin_name:
_select_plugin_image_gen_provider(plugin_name, config)
return
# Plugin-registered video_gen provider — same flow, different registry.
video_plugin = provider.get("video_gen_plugin_name")
if video_plugin:
_select_plugin_video_gen_provider(video_plugin, config)
return
# Imagegen backends prompt for model selection on reconfig too.
backend = provider.get("imagegen_backend")
if backend:
@@ -2519,12 +2324,6 @@ def _reconfigure_provider(provider: dict, config: dict):
_select_plugin_image_gen_provider(plugin_name, config)
return
# Plugin-registered video_gen provider — same flow, different registry.
video_plugin = provider.get("video_gen_plugin_name")
if video_plugin:
_select_plugin_video_gen_provider(video_plugin, config)
return
backend = provider.get("imagegen_backend")
if backend:
_configure_imagegen_model(backend, config)
+43 -55
View File
@@ -56,22 +56,10 @@ try:
from fastapi.staticfiles import StaticFiles
from pydantic import BaseModel
except ImportError:
# First try lazy-installing the dashboard extras. Only the user actually
# running `hermes dashboard` needs fastapi+uvicorn; lazy install keeps
# them out of every other install path. After install, re-import.
try:
from tools.lazy_deps import ensure as _lazy_ensure
_lazy_ensure("tool.dashboard", prompt=False)
from fastapi import FastAPI, HTTPException, Request, WebSocket, WebSocketDisconnect
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import FileResponse, HTMLResponse, JSONResponse, Response
from fastapi.staticfiles import StaticFiles
from pydantic import BaseModel
except Exception:
raise SystemExit(
"Web UI requires fastapi and uvicorn.\n"
f"Install with: {sys.executable} -m pip install 'fastapi' 'uvicorn[standard]'"
)
raise SystemExit(
"Web UI requires fastapi and uvicorn.\n"
f"Install with: {sys.executable} -m pip install 'fastapi' 'uvicorn[standard]'"
)
WEB_DIST = Path(os.environ["HERMES_WEB_DIST"]) if "HERMES_WEB_DIST" in os.environ else Path(__file__).parent / "web_dist"
_log = logging.getLogger(__name__)
@@ -285,9 +273,7 @@ _SCHEMA_OVERRIDES: Dict[str, Dict[str, Any]] = {
"stt.provider": {
"type": "select",
"description": "Speech-to-text provider",
# "mistral" temporarily removed — mistralai PyPI package quarantined
# (malicious 2.4.6 release on 2026-05-12). Restore once available.
"options": ["local", "openai"],
"options": ["local", "openai", "mistral"],
},
"display.skin": {
"type": "select",
@@ -994,9 +980,39 @@ def get_model_options():
can share the same types.
"""
try:
from hermes_cli.inventory import build_models_payload, load_picker_context
from hermes_cli.model_switch import list_authenticated_providers
return build_models_payload(load_picker_context(), max_models=50)
cfg = load_config()
model_cfg = cfg.get("model", {})
if isinstance(model_cfg, dict):
current_model = model_cfg.get("default", model_cfg.get("name", "")) or ""
current_provider = model_cfg.get("provider", "") or ""
current_base_url = model_cfg.get("base_url", "") or ""
else:
current_model = str(model_cfg) if model_cfg else ""
current_provider = ""
current_base_url = ""
user_providers = cfg.get("providers") if isinstance(cfg.get("providers"), dict) else {}
custom_providers = (
cfg.get("custom_providers")
if isinstance(cfg.get("custom_providers"), list)
else []
)
providers = list_authenticated_providers(
current_provider=current_provider,
current_base_url=current_base_url,
current_model=current_model,
user_providers=user_providers,
custom_providers=custom_providers,
max_models=50,
)
return {
"providers": providers,
"model": current_model,
"provider": current_provider,
}
except Exception:
_log.exception("GET /api/model/options failed")
raise HTTPException(status_code=500, detail="Failed to list model options")
@@ -2037,7 +2053,6 @@ def _minimax_poller(session_id: str) -> None:
"""
from hermes_cli.auth import (
_minimax_poll_token,
_minimax_resolve_token_expiry_unix,
_minimax_save_auth_state,
MINIMAX_OAUTH_GLOBAL_INFERENCE,
MINIMAX_OAUTH_SCOPE,
@@ -2075,10 +2090,8 @@ def _minimax_poller(session_id: str) -> None:
# dashboard path; cn-region operators can still use the CLI
# flow which supports `--region cn`.
now = datetime.now(timezone.utc)
expires_at_ts = _minimax_resolve_token_expiry_unix(
int(token_data["expired_in"]), now=now,
)
expires_in_s = max(0, int(expires_at_ts - now.timestamp()))
expires_in_s = int(token_data["expired_in"])
expires_at_ts = now.timestamp() + expires_in_s
auth_state = {
"provider": "minimax-oauth",
"region": sess.get("region", "global"),
@@ -3991,9 +4004,6 @@ def _get_dashboard_plugins(force_rescan: bool = False) -> list:
global _dashboard_plugins_cache
if _dashboard_plugins_cache is None or force_rescan:
_dashboard_plugins_cache = _discover_dashboard_plugins()
elif _dashboard_plugins_cache:
if any(not Path(p["_dir"]).is_dir() for p in _dashboard_plugins_cache):
_dashboard_plugins_cache = _discover_dashboard_plugins()
return _dashboard_plugins_cache
@@ -4405,33 +4415,11 @@ def start_server(
if open_browser:
import webbrowser
# On headless Linux (no DISPLAY or WAYLAND_DISPLAY) some registered
# browsers are TUI programs (links, lynx, www-browser) that try to
# take over the terminal. That can send SIGHUP to the server process
# and cause an immediate exit even though uvicorn bound successfully.
# Skip the auto-open attempt on headless systems and let the user
# open the URL manually. macOS and Windows are always considered
# display-capable.
_has_display = (
sys.platform != "linux"
or bool(os.environ.get("DISPLAY"))
or bool(os.environ.get("WAYLAND_DISPLAY"))
)
def _open():
time.sleep(1.0)
webbrowser.open(f"http://{host}:{port}")
if _has_display:
def _open():
try:
time.sleep(1.0)
webbrowser.open(f"http://{host}:{port}")
except Exception:
pass
threading.Thread(target=_open, daemon=True).start()
else:
_log.debug(
"Skipping browser-open: no DISPLAY or WAYLAND_DISPLAY detected "
"(headless Linux). Pass --no-open to suppress this detection."
)
threading.Thread(target=_open, daemon=True).start()
print(f" Hermes Web UI → http://{host}:{port}")
uvicorn.run(app, host=host, port=port, log_level="warning")
+3 -3
View File
@@ -1597,10 +1597,10 @@ class SessionDB:
self._execute_write(_do)
def get_messages(self, session_id: str) -> List[Dict[str, Any]]:
"""Load all messages for a session, ordered by insertion order."""
"""Load all messages for a session, ordered by timestamp."""
with self._lock:
cursor = self._conn.execute(
"SELECT * FROM messages WHERE session_id = ? ORDER BY id",
"SELECT * FROM messages WHERE session_id = ? ORDER BY timestamp, id",
(session_id,),
)
rows = cursor.fetchall()
@@ -1700,7 +1700,7 @@ class SessionDB:
"SELECT role, content, tool_call_id, tool_calls, tool_name, "
"finish_reason, reasoning, reasoning_content, reasoning_details, "
"codex_reasoning_items, codex_message_items "
f"FROM messages WHERE session_id IN ({placeholders}) ORDER BY id",
f"FROM messages WHERE session_id IN ({placeholders}) ORDER BY timestamp, id",
tuple(session_ids),
).fetchall()
+232
View File
@@ -0,0 +1,232 @@
---
name: base
description: Query Base (Ethereum L2) blockchain data with USD pricing — wallet balances, token info, transaction details, gas analysis, contract inspection, whale detection, and live network stats. Uses Base RPC + CoinGecko. No API key required.
version: 0.1.0
author: youssefea
license: MIT
platforms: [linux, macos, windows]
metadata:
hermes:
tags: [Base, Blockchain, Crypto, Web3, RPC, DeFi, EVM, L2, Ethereum]
related_skills: []
---
# Base Blockchain Skill
Query Base (Ethereum L2) on-chain data enriched with USD pricing via CoinGecko.
8 commands: wallet portfolio, token info, transactions, gas analysis,
contract inspection, whale detection, network stats, and price lookup.
No API key needed. Uses only Python standard library (urllib, json, argparse).
---
## When to Use
- User asks for a Base wallet balance, token holdings, or portfolio value
- User wants to inspect a specific transaction by hash
- User wants ERC-20 token metadata, price, supply, or market cap
- User wants to understand Base gas costs and L1 data fees
- User wants to inspect a contract (ERC type detection, proxy resolution)
- User wants to find large ETH transfers (whale detection)
- User wants Base network health, gas price, or ETH price
- User asks "what's the price of USDC/AERO/DEGEN/ETH?"
---
## Prerequisites
The helper script uses only Python standard library (urllib, json, argparse).
No external packages required.
Pricing data comes from CoinGecko's free API (no key needed, rate-limited
to ~10-30 requests/minute). For faster lookups, use `--no-prices` flag.
---
## Quick Reference
RPC endpoint (default): https://mainnet.base.org
Override: export BASE_RPC_URL=https://your-private-rpc.com
Helper script path: ~/.hermes/skills/blockchain/base/scripts/base_client.py
```
python3 base_client.py wallet <address> [--limit N] [--all] [--no-prices]
python3 base_client.py tx <hash>
python3 base_client.py token <contract_address>
python3 base_client.py gas
python3 base_client.py contract <address>
python3 base_client.py whales [--min-eth N]
python3 base_client.py stats
python3 base_client.py price <contract_address_or_symbol>
```
---
## Procedure
### 0. Setup Check
```bash
python3 --version
# Optional: set a private RPC for better rate limits
export BASE_RPC_URL="https://mainnet.base.org"
# Confirm connectivity
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py stats
```
### 1. Wallet Portfolio
Get ETH balance and ERC-20 token holdings with USD values.
Checks ~15 well-known Base tokens (USDC, WETH, AERO, DEGEN, etc.)
via on-chain `balanceOf` calls. Tokens sorted by value, dust filtered.
```bash
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py \
wallet 0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045
```
Flags:
- `--limit N` — show top N tokens (default: 20)
- `--all` — show all tokens, no dust filter, no limit
- `--no-prices` — skip CoinGecko price lookups (faster, RPC-only)
Output includes: ETH balance + USD value, token list with prices sorted
by value, dust count, total portfolio value in USD.
Note: Only checks known tokens. Unknown ERC-20s are not discovered.
Use the `token` command with a specific contract address for any token.
### 2. Transaction Details
Inspect a full transaction by its hash. Shows ETH value transferred,
gas used, fee in ETH/USD, status, and decoded ERC-20/ERC-721 transfers.
```bash
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py \
tx 0xabc123...your_tx_hash_here
```
Output: hash, block, from, to, value (ETH + USD), gas price, gas used,
fee, status, contract creation address (if any), token transfers.
### 3. Token Info
Get ERC-20 token metadata: name, symbol, decimals, total supply, price,
market cap, and contract code size.
```bash
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py \
token 0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913
```
Output: name, symbol, decimals, total supply, price, market cap.
Reads name/symbol/decimals directly from the contract via eth_call.
### 4. Gas Analysis
Detailed gas analysis with cost estimates for common operations.
Shows current gas price, base fee trends over 10 blocks, block
utilization, and estimated costs for ETH transfers, ERC-20 transfers,
and swaps.
```bash
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py gas
```
Output: current gas price, base fee, block utilization, 10-block trend,
cost estimates in ETH and USD.
Note: Base is an L2 — actual transaction costs include an L1 data
posting fee that depends on calldata size and L1 gas prices. The
estimates shown are for L2 execution only.
### 5. Contract Inspection
Inspect an address: determine if it's an EOA or contract, detect
ERC-20/ERC-721/ERC-1155 interfaces, resolve EIP-1967 proxy
implementation addresses.
```bash
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py \
contract 0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913
```
Output: is_contract, code size, ETH balance, detected interfaces
(ERC-20, ERC-721, ERC-1155), ERC-20 metadata, proxy implementation
address.
### 6. Whale Detector
Scan the most recent block for large ETH transfers with USD values.
```bash
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py \
whales --min-eth 1.0
```
Note: scans the latest block only — point-in-time snapshot, not historical.
Default threshold is 1.0 ETH (lower than Solana's default since ETH
values are higher).
### 7. Network Stats
Live Base network health: latest block, chain ID, gas price, base fee,
block utilization, transaction count, and ETH price.
```bash
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py stats
```
### 8. Price Lookup
Quick price check for any token by contract address or known symbol.
```bash
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py price ETH
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py price USDC
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py price AERO
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py price DEGEN
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py price 0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913
```
Known symbols: ETH, WETH, USDC, cbETH, AERO, DEGEN, TOSHI, BRETT,
WELL, wstETH, rETH, cbBTC.
---
## Pitfalls
- **CoinGecko rate-limits** — free tier allows ~10-30 requests/minute.
Price lookups use 1 request per token. Use `--no-prices` for speed.
- **Public RPC rate-limits** — Base's public RPC limits requests.
For production use, set BASE_RPC_URL to a private endpoint
(Alchemy, QuickNode, Infura).
- **Wallet shows known tokens only** — unlike Solana, EVM chains have no
built-in "get all tokens" RPC. The wallet command checks ~15 popular
Base tokens via `balanceOf`. Unknown ERC-20s won't appear. Use the
`token` command for any specific contract.
- **Token names read from contract** — if a contract doesn't implement
`name()` or `symbol()`, these fields may be empty. Known tokens have
hardcoded labels as fallback.
- **Gas estimates are L2 only** — Base transaction costs include an L1
data posting fee (depends on calldata size and L1 gas prices). The gas
command estimates L2 execution cost only.
- **Whale detector scans latest block only** — not historical. Results
vary by the moment you query. Default threshold is 1.0 ETH.
- **Proxy detection** — only EIP-1967 proxies are detected. Other proxy
patterns (EIP-1167 minimal proxy, custom storage slots) are not checked.
- **Retry on 429** — both RPC and CoinGecko calls retry up to 2 times
with exponential backoff on rate-limit errors.
---
## Verification
```bash
# Should print Base chain ID (8453), latest block, gas price, and ETH price
python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py stats
```
File diff suppressed because it is too large Load Diff
-211
View File
@@ -1,211 +0,0 @@
---
name: evm
description: "Read-only EVM client: wallets, tokens, gas across 8 chains."
version: 1.0.0
author: Mibayy (@Mibayy), youssefea (@youssefea), ethernet8023 (@ethernet8023), Hermes Agent
license: MIT
platforms: [linux, macos, windows]
metadata:
hermes:
tags: [EVM, Ethereum, BNB, BSC, Base, Arbitrum, Polygon, Optimism, Avalanche, zkSync, Blockchain, Crypto, Web3, DeFi, NFT, ENS, Whale, Security]
category: blockchain
related_skills: [solana]
requires_toolsets: [terminal]
---
# EVM Blockchain Skill
Query EVM-compatible blockchain data across 8 chains with USD pricing.
14 commands: wallet portfolio, token info, transactions, activity, gas tracker,
network stats, price lookup, multi-chain scan, whale detection, ENS resolution,
allowance checker, contract inspector, and transaction decoder.
Supports 8 chains: Ethereum, BNB Chain (BSC), Base, Arbitrum One, Polygon,
Optimism, Avalanche (C-Chain), zkSync Era.
No API key needed. Zero external dependencies — Python standard library only
(urllib, json, argparse, threading).
> **Supersedes the standalone `base` skill.** Base-specific tokens (AERO, DEGEN,
> TOSHI, BRETT, WELL, cbETH, cbBTC, wstETH, rETH) and all Base RPC functionality
> previously living under `optional-skills/blockchain/base/` have been folded
> into this skill. Pass `--chain base` to any command for Base coverage.
---
## When to Use
- User asks for a wallet balance or portfolio on any EVM chain
- User wants to check the same wallet across ALL chains at once
- User wants to inspect a transaction by hash (or decode what it did)
- User wants ERC-20 token metadata, price, supply, or market cap
- User wants recent transaction history for an address
- User wants current gas prices or to compare fees across chains
- User wants to find large whale transfers in recent blocks
- User asks to resolve an ENS name (vitalik.eth) or reverse-lookup an address
- User wants to check if a contract has dangerous token approvals
- User wants to inspect a smart contract (proxy? ERC-20? ERC-721? bytecode size?)
- User wants to compare gas costs across chains before a transaction
---
## Prerequisites
Python 3.8+ standard library only. No pip installs required.
Pricing: CoinGecko free API (rate-limited, ~10-30 req/min).
ENS: ensideas.com public API.
Tx decoding: 4byte.directory public API.
Override RPC endpoint: `export EVM_RPC_URL=https://your-rpc.com`
Helper script path: `~/.hermes/skills/blockchain/evm/scripts/evm_client.py`
---
## Quick Reference
```
SCRIPT=~/.hermes/skills/blockchain/evm/scripts/evm_client.py
# Network & prices
python3 $SCRIPT stats # Ethereum stats
python3 $SCRIPT stats --chain arbitrum # Arbitrum stats
python3 $SCRIPT compare # Gas + prices ALL 8 chains
# Wallet
python3 $SCRIPT wallet 0xd8dA...96045 # Portfolio (ETH + ERC-20)
python3 $SCRIPT wallet 0xd8dA...96045 --chain bsc
python3 $SCRIPT multichain 0xd8dA...96045 # Same wallet on ALL chains
# Tokens & prices
python3 $SCRIPT price ETH
python3 $SCRIPT price 0xdAC1...1ec7 # By contract address
python3 $SCRIPT token 0xdAC1...1ec7 # ERC-20 metadata + market cap
# Transactions
python3 $SCRIPT tx 0x5c50...f060 # Transaction details
python3 $SCRIPT decode 0x5c50...f060 # Decode input data (4byte.directory)
python3 $SCRIPT activity 0xd8dA...96045 # Recent transactions
# Gas
python3 $SCRIPT gas # Gas prices + cost estimates
python3 $SCRIPT gas --chain optimism
# Security
python3 $SCRIPT allowance 0xd8dA...96045 # Dangerous ERC-20 approvals
python3 $SCRIPT contract 0xdAC1...1ec7 # Contract inspection (proxy? standards?)
# ENS
python3 $SCRIPT ens vitalik.eth # Name -> address + profile
python3 $SCRIPT ens 0xd8dA...96045 # Address -> ENS name
# Whale detection
python3 $SCRIPT whale # Large transfers (last 20 blocks, >$10k)
python3 $SCRIPT whale --blocks 50 --min-usd 100000 --chain arbitrum
```
---
## Procedure
### 0. Setup Check
```bash
python3 --version # 3.8+ required
python3 ~/.hermes/skills/blockchain/evm/scripts/evm_client.py stats
```
### 1. Wallet Portfolio
Native balance + known ERC-20 tokens, sorted by USD value.
```bash
python3 $SCRIPT wallet 0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045
python3 $SCRIPT wallet 0xd8dA... --chain bsc --no-prices # faster
```
### 2. Multi-Chain Scan
Scans all 8 chains simultaneously for the same address using threads.
```bash
python3 $SCRIPT multichain 0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045
```
Output: per-chain native balance + token holdings + grand total USD.
### 3. Compare (Gas + Prices)
All 8 chains queried in parallel. Shows cheapest/most expensive chain.
```bash
python3 $SCRIPT compare
```
### 4. Transaction Details & Decode
```bash
python3 $SCRIPT tx 0x5c504ed432cb51138bcf09aa5e8a410dd4a1e204ef84bfed1be16dfba1b22060
python3 $SCRIPT decode 0x5c504ed... # Shows human-readable function signature
```
Decode uses 4byte.directory to translate 0xa9059cbb -> transfer(address,uint256).
### 5. ENS Resolution
```bash
python3 $SCRIPT ens vitalik.eth # -> 0xd8dA... + avatar + social links
python3 $SCRIPT ens 0xd8dA...96045 # -> vitalik.eth
```
### 6. Allowance Checker (Security)
Checks ERC-20 approvals granted to known DEX/bridge contracts.
```bash
python3 $SCRIPT allowance 0xYourWallet
```
Flags UNLIMITED approvals as HIGH risk.
### 7. Contract Inspector
```bash
python3 $SCRIPT contract 0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48 # USDC (proxy)
python3 $SCRIPT contract 0xdAC17F958D2ee523a2206206994597C13D831ec7 # USDT (ERC-20)
```
Detects: proxy (EIP-1967/EIP-1167), ERC-20, ERC-721, ERC-165. Shows bytecode size and implementation address for proxies.
### 8. Whale Detection
```bash
python3 $SCRIPT whale # ETH, last 20 blocks, >$10k
python3 $SCRIPT whale --blocks 50 --min-usd 50000 --chain bsc
```
### 9. Gas Tracker
```bash
python3 $SCRIPT gas
python3 $SCRIPT gas --chain polygon
```
Shows gwei price + USD cost for: transfer, ERC-20 transfer, approve, swap, NFT mint, NFT transfer.
---
## Supported Chains
| Key | Name | Native | Chain ID |
|-----------|----------------|--------|----------|
| ethereum | Ethereum | ETH | 1 |
| bsc | BNB Chain | BNB | 56 |
| base | Base | ETH | 8453 |
| arbitrum | Arbitrum One | ETH | 42161 |
| polygon | Polygon | POL | 137 |
| optimism | Optimism | ETH | 10 |
| avalanche | Avalanche C | AVAX | 43114 |
| zksync | zkSync Era | ETH | 324 |
---
## Pitfalls
- CoinGecko free tier: ~10-30 req/min. Use `--no-prices` for faster wallet scans.
- Public RPCs may throttle. Set EVM_RPC_URL to a private endpoint for production.
- `wallet` and `allowance` only check known token list (~30 tokens per chain). Use a block explorer for complete token discovery.
- `activity` scans recent blocks only (max 200). For full history, use Etherscan API.
- `multichain` runs 8 parallel threads — can trigger rate limits on public RPCs.
- ENS resolution depends on a single public endpoint (ensideas.com / ens.vitalik.ca) with no fallback. If that endpoint is down, `ens` will fail — re-run later or use a block explorer.
- Tx decoding depends on a single public endpoint (4byte.directory) with no fallback. Selectors not in their database show up as `unknown`.
- **L2 gas estimates are L2-execution only.** On rollups like Base, Arbitrum, Optimism, and zkSync, the actual transaction cost also includes an L1 data-posting fee that depends on calldata size and current L1 gas prices. The `gas` command does not estimate that L1 component. For Base specifically, see the network's L1 fee oracle (contract `0x420000000000000000000000000000000000000F`).
- Address / tx-hash inputs are validated for 0x-prefix + correct length + hex, but EIP-55 checksum casing is **not** enforced (RPC endpoints accept any-case hex).
---
## Verification
```bash
# Should print current block, gas price, ETH price
python3 ~/.hermes/skills/blockchain/evm/scripts/evm_client.py stats
# Should resolve vitalik.eth to 0xd8dA...
python3 ~/.hermes/skills/blockchain/evm/scripts/evm_client.py ens vitalik.eth
```
File diff suppressed because it is too large Load Diff
@@ -1,14 +0,0 @@
{
"name": "example",
"label": "Example",
"description": "Example dashboard plugin — used by test suite for auth coverage",
"icon": "Sparkles",
"version": "1.0.0",
"tab": {
"path": "/example",
"position": "after:skills"
},
"slots": [],
"entry": "dist/index.js",
"api": "plugin_api.py"
}
@@ -1,17 +0,0 @@
"""Example dashboard plugin — backend API routes.
Mounted at /api/plugins/example/ by the dashboard plugin system.
This minimal plugin exists so the test suite has a stable, side-effect-free
GET endpoint to verify that plugin API routes work with auth.
"""
from fastapi import APIRouter
router = APIRouter()
@router.get("/hello")
async def hello():
"""Simple greeting endpoint to demonstrate plugin API routes."""
return {"message": "Hello from the example plugin!", "plugin": "example", "version": "1.0.0"}
-7
View File
@@ -875,13 +875,6 @@ class HindsightMemoryProvider(MemoryProvider):
"Hindsight local runtime is unavailable"
+ (f": {reason}" if reason else "")
)
try:
from tools.lazy_deps import ensure as _lazy_ensure
_lazy_ensure("memory.hindsight", prompt=False)
except ImportError:
pass
except Exception as _e:
raise ImportError(str(_e))
from hindsight import HindsightEmbedded
HindsightEmbedded.__del__ = lambda self: None
llm_provider = self._config.get("llm_provider", "")
+2 -19
View File
@@ -21,7 +21,6 @@ from dataclasses import dataclass, field
from pathlib import Path
from hermes_constants import get_hermes_home
from hermes_cli.profiles import _get_default_hermes_home
from typing import Any, TYPE_CHECKING
if TYPE_CHECKING:
@@ -74,7 +73,7 @@ def resolve_config_path() -> Path:
return local_path
# Default profile's config — host blocks accumulate here via setup/clone
default_path = _get_default_hermes_home() / "honcho.json"
default_path = Path.home() / ".hermes" / "honcho.json"
if default_path != local_path and default_path.exists():
return default_path
@@ -688,28 +687,12 @@ def get_honcho_client(config: HonchoClientConfig | None = None) -> Honcho:
"For local instances, set HONCHO_BASE_URL instead."
)
# Lazy-install the honcho SDK on demand. ensure() honors
# security.allow_lazy_installs (default true). On failure we surface
# the original ImportError-shape message so existing callers still get
# the "go run hermes honcho setup" hint they used to.
try:
from tools.lazy_deps import FeatureUnavailable, ensure as _lazy_ensure
_lazy_ensure("memory.honcho", prompt=False)
except ImportError:
# lazy_deps module missing — fall through to the raw import below.
pass
except Exception:
# FeatureUnavailable or unexpected error. Don't crash here; let the
# actual import attempt produce the canonical error message.
pass
try:
from honcho import Honcho
except ImportError:
raise ImportError(
"honcho-ai is required for Honcho integration. "
"Install it with: pip install honcho-ai "
"(or run `hermes honcho setup` to configure)."
"Install it with: pip install honcho-ai"
)
# Allow config.yaml honcho.base_url to override the SDK's environment
-7
View File
@@ -336,17 +336,10 @@ ADD_RESOURCE_SCHEMA = {
def _zip_directory(dir_path: Path) -> Path:
"""Create a temporary zip file containing a directory tree."""
root = dir_path.resolve()
zip_path = Path(tempfile.gettempdir()) / f"openviking_upload_{uuid.uuid4().hex}.zip"
with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zipf:
for file_path in dir_path.rglob("*"):
if file_path.is_symlink():
continue
if file_path.is_file():
try:
file_path.resolve().relative_to(root)
except ValueError:
continue
arcname = str(file_path.relative_to(dir_path)).replace("\\", "/")
zipf.write(file_path, arcname=arcname)
return zip_path

Some files were not shown because too many files have changed in this diff Show More