Compare commits

...

39 Commits

Author SHA1 Message Date
kshitijk4poor 965d2fec98 feat(provider): add codex-cli external-process provider
Add an external-process inference provider that shells out to the
Codex CLI (codex exec --json) for inference.  This lets users
delegate Hermes requests to their local Codex CLI installation,
leveraging Codex's agent loop while keeping Hermes as the driver.

Key design:
- Text-in/text-out MVP — Hermes tools are disabled (Codex handles its
  own tool calling internally).
- Streaming is disabled (subprocess stdio returns a single
  SimpleNamespace, not an iterable generator).
- Follows the copilot-acp external-process pattern for routing,
  streaming exclusion, and credential resolution.

Files:
- agent/codex_cli_client.py  — Client facade, parses JSONL events
- hermes_cli/auth.py  — ProviderConfig, status helper, cred resolver
- hermes_cli/runtime_provider.py  — Runtime resolution
- run_agent.py  — Client routing, tool disable, streaming exclusion
- hermes_cli/models.py  — Provider entry, aliases, model list
- hermes_cli/main.py  — --provider choices

Env var support: HERMES_CODEX_CLI_COMMAND, CODEX_CLI_PATH,
HERMES_CODEX_CLI_ARGS.
2026-05-09 21:02:32 +05:30
kshitijk4poor f6d45e5df4 chore: add nik1t7n to AUTHOR_MAP
Nikita Nosov (nik1t7n, PR #22264) — first-time contributor email
and noreply alias.
2026-05-09 04:34:55 -07:00
Nikita Nosov 1ac8deb3ca feat(gateway): stream Telegram edits safely 2026-05-09 04:34:55 -07:00
fahdad cca2869d78 fix(banner): resolve update-check repo from running code, not profile-scoped path
check_for_updates() and _resolve_repo_dir() were preferring
$HERMES_HOME/hermes-agent/ over Path(__file__).parent.parent.resolve()
when looking for a .git checkout.  For profiles created with
--clone-all, $HERMES_HOME/hermes-agent/ points to a stale copy
with a frozen HEAD, causing persistent "N commits behind" banners
that never resolved.

Flip the resolution order: prefer the running code's location first,
fall back to $HERMES_HOME/hermes-agent/ only when the live checkout
doesn't have a .git (system-wide pip installs, distro packages).

The embedded-rev branch (HERMES_REVISION env var, set by nix builds)
is unaffected — it uses git ls-remote against upstream, never reads
the local checkout's HEAD.

Based on PR #21728 by @fahdad
2026-05-09 04:10:35 -07:00
donrhmexe f7e514d4ad fix(profiles): exclude infrastructure artifacts when cloning with --clone-all
When the source profile is the default (~/.hermes), shutil.copytree()
was copying multi-GB infrastructure alongside the ~40 MB of actual
profile data: hermes-agent/ (repo checkout + 3 GB venv), .worktrees/,
profiles/ (sibling profiles — recursive!), bin/ (installed binaries),
node_modules/ (hundreds of MB).

Add _CLONE_ALL_DEFAULT_EXCLUDE_ROOT frozenset with these five entries
and pass an ignore callback to copytree().  Exclusions are gated on
the source actually being the default profile (is_default_source) so
named-profile sources are never affected.

Also exclude at any depth: __pycache__/, *.pyc, *.pyo, *.sock, *.tmp.
Profile data (config.yaml, .env, auth.json, state.db, sessions/,
skills/, logs/) is preserved intact — clone-all means 'complete
snapshot minus infrastructure'.

Mirrors the approach already used by _default_export_ignore() and
_DEFAULT_EXPORT_EXCLUDE_ROOT (the export-side exclusion set which is
broader because it produces a portable archive, not a live clone).

Co-authored-by: MustafaKara7 <karamusti912@gmail.com>
Co-authored-by: fahdad <30740087+fahdad@users.noreply.github.com>
Fixes #5022
Based on PRs #5025, #5026, and #21728
2026-05-09 04:10:35 -07:00
GodsBoy 93e25ceb13 feat(plugins): add standalone_sender_fn for out-of-process cron delivery
Plugin platforms (IRC, Teams, Google Chat) currently fail with
`No live adapter for platform '<name>'` when a `deliver=<plugin>` cron
job runs in a separate process from the gateway, even though the
platforms are eligible cron targets via `cron_deliver_env_var` (added
in #21306). Built-in platforms (Telegram, Discord, Slack, etc.) use
direct REST helpers in `tools/send_message_tool.py` so cron can deliver
without holding the gateway in the same process; plugin platforms
historically depended on `_gateway_runner_ref()` which returns `None`
out of process.

This change adds an optional `standalone_sender_fn` field to
`PlatformEntry` so plugins can register an ephemeral send path that
opens its own connection, sends, and closes without needing the live
adapter. The dispatch site in `_send_via_adapter` falls through to the
hook when the gateway runner is unavailable, with a descriptive error
when neither path applies. The hook is optional, so existing plugins
are unaffected.

Reference migrations land in the same change for IRC, Teams, and
Google Chat, exercising the hook across stdlib (asyncio + IRC protocol),
Bot Framework OAuth client_credentials, and Google service-account
flows respectively.

Security hardening on the new code paths:
* IRC: control-character stripping on chat_id and message body to
  block CRLF command injection; bounded nick-collision retries; JOIN
  before PRIVMSG so channels with the default `+n` mode accept the
  delivery.
* Teams: TEAMS_SERVICE_URL validated against an allowlist of known
  Bot Framework hosts (`smba.trafficmanager.net`,
  `smba.infra.gov.teams.microsoft.us`) to block SSRF; chat_id and
  tenant_id constrained to the documented Bot Framework character set;
  per-request timeouts so a slow STS endpoint cannot starve the
  activity POST.
* Google Chat: chat_id and thread_id validated against strict
  resource-name regexes; service-account refresh wrapped in
  `asyncio.wait_for` so a hung token endpoint cannot stall the
  scheduler.

Test coverage: 20 new tests covering happy path, missing-config errors,
network failure modes, and each defensive validation. Existing tests
unchanged. `bash scripts/run_tests.sh tests/tools/test_send_message_tool.py
tests/gateway/test_irc_adapter.py tests/gateway/test_teams.py
tests/gateway/test_google_chat.py` reports 341 passed, 0 regressions.

Documentation: new "Out-of-process cron delivery" section in
website/docs/developer-guide/adding-platform-adapters.md and an entry
in gateway/platforms/ADDING_A_PLATFORM.md naming the hook.
2026-05-09 02:56:29 -07:00
obafemiferanmi1999 3801825efd fix(tests): pin UTF-8 encoding when reading source files on Windows
Three tests in tests/agent/test_auxiliary_config_bridge.py read
in-tree source files (gateway/run.py and cli.py) via
Path.read_text() with no encoding argument.  The default falls
back to the system locale, which on Western Windows installs is
cp1252, and the read fails as soon as the source contains any
byte that isn't valid cp1252 (e.g. an em-dash in a comment):

    UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f
    in position 41190: character maps to <undefined>

Linux CI doesn't catch this because the default Linux locale is
UTF-8.  Windows contributors hit it on every run of the test suite.

Pin encoding="utf-8" on the three call sites that read repo
source files.  This matches the existing precedent in
hermes_cli/doctor.py:363, where the same pattern (with an
explanatory comment) was applied to fix the .env read on
non-UTF-8 Windows locales.

Affected tests now pass on Windows + Python 3.12:
  - TestGatewayBridgeCodeParity.test_gateway_has_auxiliary_bridge
  - TestGatewayBridgeCodeParity.test_gateway_no_compression_env_bridge
  - TestCLIDefaultsHaveAuxiliaryKeys.test_cli_defaults_can_merge_auxiliary
2026-05-09 02:47:28 -07:00
kshitij 5d2a75ddf2 chore(release): add KvnGz to AUTHOR_MAP (#22458)
Maps obafemiferanmi1999@gmail.com (the commit-author email used on
PR #21473's branch) to GitHub login KvnGz (the PR/branch owner) so
contributor_audit.py recognizes the authored commit in the upcoming
salvage PR.
2026-05-09 02:47:14 -07:00
Zhekinmaksim 4a1840e683 fix(async): replace get_event_loop() with get_running_loop() in async contexts
Follow-up to PR #21293 (cli.py), which fixed the same anti-pattern.
`asyncio.get_event_loop()` is documented as effectively "always returns
the running loop when called from a coroutine" and emits
DeprecationWarning/RuntimeWarning in some interpreter configurations.
The Python docs explicitly recommend get_running_loop() inside coroutines.

Replaces the remaining 9 call sites that are unconditionally inside
async def bodies:

- tools/browser_cdp_tool.py — _cdp_call() (4 sites): deadline + remaining
  computations inside the async websockets.connect context manager.
- hermes_cli/web_server.py — get_status, _start_device_code_flow,
  submit_oauth_code (3 sites): all FastAPI async endpoints offloading
  blocking httpx / PKCE work to run_in_executor.
- environments/agent_loop.py — HermesAgentLoop (1 site): tool dispatch
  inside the async rollout loop.
- environments/benchmarks/terminalbench_2/terminalbench2_env.py —
  rollout_and_score_eval (1 site): test verification thread offload.

All 9 sites are unconditionally inside async def bodies, so a running
loop is guaranteed and no try/except RuntimeError fallback is needed
(unlike the cli.py case in #21293, which ran from a background thread).

Behavior is identical on supported Python versions; aligns the codebase
with the post-#21293 idiom and avoids future warnings as the deprecation
hardens.

Salvaged from PR #21930 by @Zhekinmaksim onto current main (the
original branch was 109 commits behind and carried unintended
stale-branch reverts of unrelated landed changes — _tail_lines
encoding=utf-8 and the Windows PTY bridge guard). Only the 9 swaps
from the PR's intended scope are applied here.
2026-05-09 02:34:19 -07:00
kshitij b7d8e280e8 chore(release): add Zhekinmaksim to AUTHOR_MAP (#22449)
Maps zhekinmaksim@gmail.com to GitHub login Zhekinmaksim so
contributor_audit.py recognizes their authored commit in the
upcoming #21930 salvage PR.
2026-05-09 02:33:49 -07:00
heathley 7e578f02c8 feat(feishu): add native update prompt cards 2026-05-09 02:32:55 -07:00
kshitijk4poor e3ebaa19ba test(kanban): cover kanban_comment author hardening + cross-task policy
- Renames test_comment_custom_author -> test_comment_ignores_caller_supplied_author
  and inverts its assertion: an args['author'] override is silently
  ignored; the author always comes from HERMES_PROFILE.
- Adds test_comment_schema_omits_author_override to assert the
  'author' property is gone from KANBAN_COMMENT_SCHEMA so the
  forgery surface stays closed if someone re-adds the schema field
  by accident.
- Adds test_worker_can_comment_on_foreign_task to pin the #19713
  policy decision: cross-task commenting must remain unrestricted.
  Without this guard, a future change accidentally adding
  _enforce_worker_task_ownership to _handle_comment would close the
  documented handoff channel between tasks.
2026-05-09 02:32:16 -07:00
memosr 9bbad3cc10 fix(security): drop caller-controlled author override in kanban_comment
Comments are injected into the next worker's system prompt by
build_worker_context() as '**{author}** (timestamp): {body}'. The
previous code accepted args['author'] as a free-form override and
exposed it on KANBAN_COMMENT_SCHEMA, which let a worker:

  1. Receive a prompt-injection in a malicious task body.
  2. Call kanban_comment with author='hermes-system' (or any other
     authoritative-looking name) on a sibling task.
  3. The next worker assigned to that sibling task sees the forged
     comment in its boot context as what reads like a system-authored
     directive.

Always derive author from HERMES_PROFILE (the dispatcher already sets
this per worker at hermes_cli/kanban_db.py:3718), and remove the
'author' property from the tool schema so the LLM can't see the
override surface.

Cross-task commenting itself remains unrestricted (see #19713) —
comments are the deliberate handoff channel between tasks; only the
author-override surface is closed.

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-05-09 02:32:16 -07:00
kshitij e3cd4e401d chore(release): add heathley email to AUTHOR_MAP for PR #21911 salvage (#22446) 2026-05-09 02:31:34 -07:00
kshitijk4poor 8578f898cb test(google-chat): cover relay-declared sender_type honoring
Adds five regression tests for the Format 3 (Cloud Run relay) envelope
path:

- test_relay_flat_honors_declared_sender_type_bot: BOT sender_type
  propagates to msg['sender']['type'].
- test_relay_flat_defaults_sender_type_human_when_absent: backward
  compat \u2014 missing field still flows as HUMAN.
- test_relay_flat_coerces_unknown_sender_type_to_human: defensive
  coercion \u2014 strip+upper normalizes whitespace/case, anything outside
  {HUMAN, BOT} falls back to HUMAN.
- test_relay_flat_bot_sender_is_filtered_end_to_end: end-to-end
  through _on_pubsub_message \u2014 a relay envelope with sender_type=BOT
  is dropped by the BOT self-filter without dispatch.
- test_relay_flat_human_sender_dispatches: end-to-end negative
  control \u2014 human relay envelopes still reach the agent loop.

Also clarifies the operator contract in the adapter comment: the
relay must forward upstream sender.type as envelope.sender_type,
otherwise bot replies forwarded as HUMAN cannot be distinguished
from genuine humans by this filter.
2026-05-09 02:31:31 -07:00
memosr c386400040 fix(security): honor relay-declared sender_type in Google Chat adapter to prevent BOT filter bypass 2026-05-09 02:31:31 -07:00
obafemiferanmi1999 0f1d41a88c fix(transports): use PEP 604 annotation for ToolCall.extra_content
`ToolCall.extra_content` was annotated `Optional[Dict[str, Any]]`,
but neither `Optional` nor `Dict` are imported at the top of
`agent/transports/types.py` — only `Any` is.  The rest of the file
consistently uses PEP 604 / 585 syntax (e.g. `str | None`,
`dict[str, Any] | None`).

The file has `from __future__ import annotations`, so the missing
names don't crash class definition.  But the annotation IS evaluated
when anything calls `typing.get_type_hints(ToolCall)` —
introspection raises `NameError: name 'Optional' is not defined`.

ruff catches it cleanly:

    F821 Undefined name `Optional`  agent/transports/types.py:65:32
    F821 Undefined name `Dict`      agent/transports/types.py:65:41

Switch the annotation to `dict[str, Any] | None` to match the
rest of the file's style.  No new imports needed.

Verified:
  - ruff F-checks now pass on the file
  - `typing.get_type_hints(ToolCall)` succeeds where it raised before
  - 166/166 tests in tests/agent/transports/ pass on Windows + Python 3.12
2026-05-09 02:25:37 -07:00
qWaitCrypto 2c8c48fbc7 fix(webui): clarify MEDIA absolute-path hint 2026-05-09 02:22:40 -07:00
qWaitCrypto aad5490e74 fix(webui): add platform hint for MEDIA rendering
WebUI sessions construct AIAgent(platform="webui") but PLATFORM_HINTS
had no "webui" entry, so the agent received no platform hint at all.
The WebUI frontend supports rich MEDIA:/absolute/path previews for
images, audio, video, PDF, HTML, CSV, diffs, and Excalidraw, but
without a hint the agent either ignores MEDIA: or falls back to
Markdown image syntax which silently fails for local files.

Add a webui hint that documents the MEDIA: render path and warns
against ![alt](/path) for local files.

Fixes #21883
2026-05-09 02:22:40 -07:00
uzunkuyruk 7330183d08 fix(model_tools): log warnings for failed JSON-array coercion
When _coerce_json fails to parse a string as JSON or parses to the wrong
type, log a clear WARNING instead of silently returning the original
value. When coerce_tool_args wraps a bare string into a single-element
list AND the string looks like a JSON array (starts with '['), warn
that the model likely emitted a JSON-encoded string instead of a
native array.

This improves diagnostics for the open-weight model output drift
described in #21933 (JSON-array-as-string), as well as any other tool
whose array-typed argument arrives stringified through
handle_function_call.

Note: delegate_task does NOT go through coerce_tool_args (it is in
_AGENT_LOOP_TOOLS and dispatched directly from run_agent.py with raw
function_args from json.loads). The actual delegate_task fix for #21933
is the previous commit. These logging changes apply to all other
array-typed arguments coerced via the shared pipeline.

Salvaged from PR #22092.
2026-05-09 02:18:57 -07:00
Bartok 326ca754ad fix(delegate): accept JSON string batch tasks
Recover delegate_task batch inputs when open-weight models emit tasks as a JSON-encoded array string, and return clear errors for malformed task lists.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-09 02:18:57 -07:00
kshitij 4632be123d chore(release): add uzunkuyruk to AUTHOR_MAP (#22434)
Maps egitimviscara@gmail.com to GitHub login uzunkuyruk so that
contributor_audit.py recognizes their authored commits in upcoming
salvage PRs (e.g. #21933 fix).
2026-05-09 02:18:35 -07:00
kshitij 2a7047c2ed fix(sqlite): fall back to journal_mode=DELETE on NFS/SMB/FUSE (#22043)
SQLite's WAL mode requires shared-memory (mmap) coordination and fcntl
byte-range locks that don't reliably work on network filesystems. Upstream
documents this explicitly:
  https://www.sqlite.org/wal.html#sometimes_queries_return_sqlite_busy_in_wal_mode

On NFS / SMB / some FUSE mounts / WSL1, 'PRAGMA journal_mode=WAL' raises
'sqlite3.OperationalError: locking protocol' (SQLITE_PROTOCOL). Before
this change, every feature backed by state.db or kanban.db broke silently:
  - /resume, /title, /history, /branch returned 'Session database not
    available.' with no cause
  - gateway logged the init failure at DEBUG (invisible in errors.log)
  - kanban dispatcher crashed every 60s, driving the known migration race
    (duplicate column name: consecutive_failures, #21708 / #21374)

Changes:
  - hermes_state.apply_wal_with_fallback(): shared helper that tries WAL
    and falls back to DELETE on SQLITE_PROTOCOL-style errors with one
    WARNING explaining why
  - hermes_state.get_last_init_error() + format_session_db_unavailable():
    capture the init failure cause and surface it in user-facing strings
    (with an NFS/SMB pointer for 'locking protocol')
  - hermes_cli/kanban_db.connect(): use the shared helper
  - gateway/run.py: bump SessionDB init failure log DEBUG -> WARNING
    (matches cli.py's existing correct behavior)
  - cli.py (4 sites) + gateway/run.py (5 sites): replace bare
    'Session database not available.' with format_session_db_unavailable()

Tests: 12 new tests in tests/test_hermes_state_wal_fallback.py + 1 new
test in tests/hermes_cli/test_kanban_db.py. Existing suites (state,
kanban, gateway, cli) remain green for all tests unrelated to pre-existing
failures on main.

Evidence: real-world user on NFSv3 mount (172.26.224.200:d2dfac12/home,
local_lock=none) reporting 'Session database not available.' on /resume;
'locking protocol' appears in 4 distinct log entries across backup,
kanban, TUI, and CLI paths in the same session.

closes #22032
2026-05-09 02:09:35 -07:00
kshitij ae005ec588 fix(send_message): map Telegram General topic id to None for forum groups (#22423)
Telegram forum supergroups address the General topic as
`message_thread_id="1"` on incoming updates, but the Bot API rejects
sends with `message_thread_id=1` ("Message thread not found"). The
gateway adapter has a `_message_thread_id_for_send` helper that maps
"1" to None for that reason; the standalone `_send_telegram` helper
used by the `send_message` tool never got the same mapping, so any
`send_message` call to a Topics-enabled group's General topic
(target shape `telegram:<chat_id>:1`) failed with "Message thread
not found."

Reuse the adapter's helper when available, with an explicit fallback
to the same mapping for environments where the adapter import path
fails (e.g. python-telegram-bot missing in this venv).

Fixes #22267
2026-05-09 01:58:33 -07:00
kshitij 8fb3e2d63a fix: always send tenant headers in OpenViking _headers() when account/user are set
OpenViking 0.3.x requires X-OpenViking-Account and X-OpenViking-User headers for ROOT API key requests to tenant-scoped APIs. Previously the `!="default"` guard skipped these headers when account/user were the literal string "default", causing INVALID_ARGUMENT errors.

Remove the `!="default"` guard so headers are sent whenever account/user are truthy. Empty strings are still correctly skipped since `""` is falsy.

Update tests to reflect the new behavior:
- test_viking_client_headers_send_tenant_when_default: asserts "default" headers ARE present
- test_viking_client_headers_send_tenant_when_empty_falls_back_to_default: asserts "default" headers ARE present from constructor fallback

Based on #21775 by @happy5318
2026-05-09 01:53:19 -07:00
kshitij c7e8add120 fix(context): handle JSON decode errors in compression — salvage of #22248 (#22416)
When an auxiliary LLM provider (or an upstream proxy) returns a non-JSON
body with `Content-Type: application/json` — e.g. an HTML 502 page from a
misconfigured gateway — the OpenAI SDK's `response.json()` raises a raw
`json.JSONDecodeError` (or wraps it in `APIResponseValidationError` whose
message contains "expecting value"). Previously this fell through to the
unknown-error branch and entered a 60s cooldown without retrying on the
main model, dropping the middle conversation turns instead.

This change folds JSON-decode detection into the existing fast-path
fallback chain: detect by `isinstance(e, JSONDecodeError)` OR substring
match for "expecting value", retry once on the main model, and use a
shorter 30s cooldown when already on main (the body shape tends to flip
back to valid quickly when the upstream proxy recovers).

The three duplicated fallback bodies (model-not-found, unknown-error,
JSON-decode) are consolidated into a single `_fallback_to_main_for_compression`
helper that handles the shared bookkeeping (record aux-model failure for
`/usage`-style callers, clear summary_model, clear cooldown).

Also adds three unit tests covering: raw `JSONDecodeError` retries on main,
substring-match for wrapped exceptions, and the 30s cooldown when already
on main.

Salvage of #22248 by @0xharryriddle. Closes #22244.

Co-authored-by: Harry Riddle <ntconguit@gmail.com>
2026-05-09 01:47:15 -07:00
kshitijk4poor aef297a45e fix(telegram): skip send_chat_action for DM topic reply-fallback lanes
The send path uses Hermes' reply-anchor fallback for DM topic lanes
(message_thread_id + reply_to_message_id), but send_chat_action only
accepts message_thread_id — Telegram's Bot API 10.0 rejects it for
these lanes. Without this short-circuit, every typing tick (~every 2s
during agent runs) makes a doomed API call that gets logged as a
'thread not found' debug warning. Skip the call entirely when the
metadata indicates a DM topic reply-fallback lane; the user-visible
behavior is unchanged (no typing indicator either way for these
lanes), but the logs stay clean.

Identified during salvage review of #22053.
2026-05-09 01:39:37 -07:00
Jhin Lee b3239572f0 fix(telegram): preserve DM topic routing via reply fallback 2026-05-09 01:39:37 -07:00
kshitij 28b5bd7e93 chore(release): add leehack to AUTHOR_MAP for PR #22053 salvage (#22409)
Adds jhin.lee@unity3d.com → leehack so contributor_audit.py strict
mode passes when the salvage of #22053 (telegram DM topic reply
fallback) lands on main.
2026-05-09 01:39:16 -07:00
kshitijk4poor 96dc272623 fix(cron): use getJobState helper in handlePauseResume
Self-review follow-up: handlePauseResume read job.state directly while
the rest of the page goes through getJobState(), which falls back to
the enabled flag when state is null/undefined. With the backend
normalizer in this PR, state is always populated on the wire, so this
has no observable effect today — but using the helper keeps the page
consistent and resilient against older Hermes backends that don't run
the normalizer.
2026-05-09 01:11:41 -07:00
LeonSGP43 e572737274 Fix cron dashboard rendering for partial jobs 2026-05-09 01:11:41 -07:00
helix4u e407376c50 fix(cron): normalize partial job records 2026-05-09 01:11:41 -07:00
kshitijk4poor f2afa68a4a chore(release): add oferlaor to AUTHOR_MAP for PR #22356 salvage 2026-05-09 00:57:27 -07:00
Ofer LaOr dbafa083b5 fix(cron): avoid delivery origin as sender identity 2026-05-09 00:57:27 -07:00
brooklyn! a7e7921dbc fix(tui): trim markdown wrap spaces (#22062)
* fix(tui): trim markdown wrap spaces

Use trim-aware wrapping for markdown prose so word-wrapped continuation lines do not keep boundary spaces.

* fix(tui): simplify markdown wrap nodes

Keep trim-aware wrapping on the rendered markdown text node while leaving nested inline segments as plain virtual text.

* fix(tui): trim definition row wrapping

Apply trim-aware wrapping to markdown definition rows so continuation lines match other prose rows.

* fix(tui): trim list and quote wrapping

Put trim-aware wrapping on the rendered list and quote rows that own markdown inline layout.

* fix(tui): preserve markdown nesting with trim wrap

Move list and quote indentation into layout padding so trim-aware wrapping does not erase nested markdown structure.

* fix(tui): trim only soft wrap spaces

Change trim-aware wrapping to remove whitespace only at soft-wrap boundaries so original leading inline spaces stay verbatim.

* fix(tui): preserve extra boundary whitespace

Trim only one soft-wrap boundary whitespace character so wrap-trim avoids leading continuations without collapsing intentional spacing.

* fix(tui): align styled wrap-trim mapping

Update styled text remapping to skip the single whitespace removed at soft-wrap boundaries without dropping preserved indentation.

* fix(tui): clean wrap trim test helpers

Clarify boundary-trim wording and strip OSC escapes from markdown render test output.

* fix(tui): strip osc before ansi in markdown tests

Remove OSC escapes from raw render output before SGR/CSI cleanup so markdown render assertions stay plain text.
2026-05-08 20:51:34 -07:00
teknium1 78b0008f44 fix(gateway): also catch restart TimeoutExpired; friendly message
Extends #19994 to the restart path. Dashboard spawns 'hermes gateway
restart' in the background; when a wedged adapter websocket pushes
drain past the 90s CLI timeout, the dashboard previously surfaced a
raw subprocess.TimeoutExpired traceback.

Mirror systemd_stop()'s TimeoutExpired catch onto both forcing-restart
sites in systemd_restart(). Adds a test that exercises the no-active-pid
branch end-to-end.
2026-05-08 18:50:25 -07:00
LeonSGP43 dccf1fb6e0 fix(gateway): cap adapter disconnect during stop 2026-05-08 18:50:25 -07:00
Teknium 524cbabd89 chore(release): add dandacompany to AUTHOR_MAP for salvaged PR #20503 2026-05-08 17:01:12 -07:00
dante 24d3216175 fix(slack): enable writable app home DMs in manifest 2026-05-08 17:01:12 -07:00
76 changed files with 5863 additions and 525 deletions
+334
View File
@@ -0,0 +1,334 @@
"""OpenAI-compatible shim that forwards Hermes requests to ``codex exec --json``.
This adapter lets Hermes treat the OpenAI Codex CLI as a chat-style backend.
Each request spawns ``codex exec --json --ephemeral --dangerously-bypass-approvals-and-sandbox``,
parses the JSONL event stream, extracts the agent message text and token usage,
and converts the result into the minimal shape Hermes expects from an OpenAI client.
"""
from __future__ import annotations
import json
import logging
import os
import subprocess
import threading
import time
from pathlib import Path
from types import SimpleNamespace
from typing import Any
logger = logging.getLogger(__name__)
_CODEX_CLI_BASE_URL = "codex-cli://local"
_DEFAULT_TIMEOUT_SECONDS = 900.0
def _resolve_command() -> str:
return (
os.getenv("HERMES_CODEX_CLI_COMMAND", "").strip()
or os.getenv("CODEX_CLI_PATH", "").strip()
or "codex"
)
def _resolve_args() -> list[str]:
raw = os.getenv("HERMES_CODEX_CLI_ARGS", "").strip()
if not raw:
return [
"exec",
"--json",
"--ephemeral",
"--dangerously-bypass-approvals-and-sandbox",
"--skip-git-repo-check",
]
import shlex
return shlex.split(raw)
def _build_subprocess_env() -> dict[str, str]:
env = os.environ.copy()
# Preserve HOME so codex can find ~/.codex/auth.json
home = os.environ.get("HOME", "")
if not home:
home = os.path.expanduser("~")
if home and home != "~":
env["HOME"] = home
return env
def _parse_turn_completed_usage(event: dict[str, Any]) -> SimpleNamespace:
usage = event.get("usage") or {}
input_tokens = int(usage.get("input_tokens") or 0)
cached_tokens = int(usage.get("cached_input_tokens") or 0)
output_tokens = int(usage.get("output_tokens") or 0)
reasoning_tokens = int(usage.get("reasoning_output_tokens") or 0)
return SimpleNamespace(
prompt_tokens=input_tokens,
completion_tokens=output_tokens + reasoning_tokens,
total_tokens=input_tokens + output_tokens + reasoning_tokens,
prompt_tokens_details=SimpleNamespace(cached_tokens=cached_tokens),
)
class _CodexCLIChatCompletions:
def __init__(self, client: "CodexCLIClient"):
self._client = client
def create(self, **kwargs: Any) -> Any:
return self._client._create_chat_completion(**kwargs)
class _CodexCLIChatNamespace:
def __init__(self, client: "CodexCLIClient"):
self.completions = _CodexCLIChatCompletions(client)
class CodexCLIClient:
"""Minimal OpenAI-client-compatible facade for Codex CLI."""
def __init__(
self,
*,
api_key: str | None = None,
base_url: str | None = None,
default_headers: dict[str, str] | None = None,
command: str | None = None,
args: list[str] | None = None,
**_: Any,
):
self.api_key = api_key or "codex-cli"
self.base_url = base_url or _CODEX_CLI_BASE_URL
self._default_headers = dict(default_headers or {})
self._command = command or _resolve_command()
self._args = list(args or _resolve_args())
self.chat = _CodexCLIChatNamespace(self)
self.is_closed = False
self._active_process: subprocess.Popen[str] | None = None
self._active_process_lock = threading.Lock()
def close(self) -> None:
proc: subprocess.Popen[str] | None
with self._active_process_lock:
proc = self._active_process
self._active_process = None
self.is_closed = True
if proc is None:
return
try:
proc.terminate()
proc.wait(timeout=2)
except Exception:
try:
proc.kill()
except Exception:
pass
def _build_prompt(self, messages: list[dict[str, Any]], model: str | None = None) -> str:
sections: list[str] = [
"You are being used as the active Codex CLI agent backend for Hermes.",
"Respond to the user's request directly. Do NOT call tools — Hermes handles tools.",
]
if model:
sections.append(f"Hermes requested model hint: {model}")
transcript: list[str] = []
for message in messages:
if not isinstance(message, dict):
continue
role = str(message.get("role") or "unknown").strip().lower()
content = message.get("content")
if content is None:
continue
if isinstance(content, list):
parts = []
for item in content:
if isinstance(item, str):
parts.append(item)
elif isinstance(item, dict) and "text" in item:
parts.append(str(item["text"]))
content = "\n".join(parts).strip()
if not content:
continue
label = {
"system": "System",
"user": "User",
"assistant": "Assistant",
"tool": "Tool",
}.get(role, role.title())
transcript.append(f"{label}:\n{content}")
if transcript:
sections.append("Conversation transcript:\n\n" + "\n\n".join(transcript))
sections.append("Continue the conversation from the latest user request.")
return "\n\n".join(s.strip() for s in sections if s and s.strip())
def _create_chat_completion(
self,
*,
model: str | None = None,
messages: list[dict[str, Any]] | None = None,
timeout: float | None = None,
tools: list[dict[str, Any]] | None = None,
tool_choice: Any = None,
**_: Any,
) -> Any:
prompt_text = self._build_prompt(messages or [], model=model)
# Normalise timeout: run_agent.py may pass an httpx.Timeout object
if timeout is None:
effective_timeout = _DEFAULT_TIMEOUT_SECONDS
elif isinstance(timeout, (int, float)):
effective_timeout = float(timeout)
else:
candidates = [
getattr(timeout, attr, None)
for attr in ("read", "write", "connect", "pool", "timeout")
]
numeric = [float(v) for v in candidates if isinstance(v, (int, float))]
effective_timeout = max(numeric) if numeric else _DEFAULT_TIMEOUT_SECONDS
response_text, usage = self._run_prompt(prompt_text, timeout_seconds=effective_timeout)
assistant_message = SimpleNamespace(
content=response_text,
tool_calls=[],
reasoning=None,
reasoning_content=None,
reasoning_details=None,
)
choice = SimpleNamespace(message=assistant_message, finish_reason="stop")
return SimpleNamespace(
choices=[choice],
usage=usage,
model=model or "codex-cli",
)
def _run_prompt(self, prompt_text: str, *, timeout_seconds: float) -> tuple[str, SimpleNamespace]:
cmd = [self._command] + self._args
# The prompt is a positional arg — pass it via stdin with pipe
try:
proc = subprocess.Popen(
cmd,
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
bufsize=1,
env=_build_subprocess_env(),
)
except FileNotFoundError as exc:
raise RuntimeError(
f"Could not start Codex CLI command '{self._command}'. "
"Install Codex CLI (npm install -g @openai/codex) or set "
f"HERMES_CODEX_CLI_COMMAND / CODEX_CLI_PATH."
) from exc
if proc.stdin is None or proc.stdout is None:
proc.kill()
raise RuntimeError("Codex CLI process did not expose stdin/stdout pipes.")
self.is_closed = False
with self._active_process_lock:
self._active_process = proc
response_parts: list[str] = []
usage = SimpleNamespace(
prompt_tokens=0,
completion_tokens=0,
total_tokens=0,
prompt_tokens_details=SimpleNamespace(cached_tokens=0),
)
stderr_lines: list[str] = []
try:
# Write prompt to stdin and close it to signal end of input
proc.stdin.write(prompt_text)
proc.stdin.close()
deadline = time.monotonic() + timeout_seconds
stdout_thread = threading.Thread(target=lambda: None, daemon=True)
# Collect stdout lines
stdout_lines: list[str] = []
def _read_stdout():
if proc.stdout is None:
return
for line in proc.stdout:
stdout_lines.append(line.rstrip("\n"))
stdout_thread = threading.Thread(target=_read_stdout, daemon=True)
stdout_thread.start()
# We'll also collect stderr
stderr_output: list[str] = []
def _read_stderr():
if proc.stderr is None:
return
for line in proc.stderr:
stderr_output.append(line.rstrip("\n"))
stderr_thread = threading.Thread(target=_read_stderr, daemon=True)
stderr_thread.start()
# Wait for process to complete or timeout
remaining = deadline - time.monotonic()
while remaining > 0:
if proc.poll() is not None:
break
time.sleep(0.1)
remaining = deadline - time.monotonic()
if proc.poll() is None:
proc.kill()
raise TimeoutError("Timed out waiting for Codex CLI response.")
# Wait for threads to finish reading
stdout_thread.join(timeout=5)
stderr_thread.join(timeout=5)
# Parse JSONL output
agent_text = ""
for line in stdout_lines:
try:
event = json.loads(line)
except Exception:
# Non-JSON line (banner, status) — skip
continue
event_type = event.get("type", "")
if event_type == "item.completed":
item = event.get("item") or {}
if item.get("type") == "agent_message":
text = item.get("text") or ""
if text:
agent_text += text
elif event_type == "turn.completed":
usage = _parse_turn_completed_usage(event)
if agent_text:
response_parts.append(agent_text)
# Stderr with useful diagnostics
for line in stderr_output:
if line.strip():
stderr_lines.append(line)
if stderr_lines and not agent_text:
raise RuntimeError(
"Codex CLI produced no agent message. "
f"stderr: {'; '.join(stderr_lines[-5:])}"
)
return "\n".join(response_parts).strip(), usage
finally:
if proc.poll() is None:
try:
proc.kill()
except Exception:
pass
with self._active_process_lock:
if self._active_process is proc:
self._active_process = None
+63 -35
View File
@@ -763,6 +763,33 @@ class ContextCompressor(ContextEngine):
return "\n\n".join(parts)
def _fallback_to_main_for_compression(self, e: Exception, reason: str) -> None:
"""Switch from a separate ``summary_model`` back to the main model.
Centralises the bookkeeping shared by every fallback branch in
:meth:`_generate_summary` (model-not-found, timeout, JSON decode,
unknown error): record the aux-model failure for ``/usage``-style
callers, clear the summary model so the next call uses the main one,
and clear the cooldown so the immediate retry can run.
``reason`` is a short human-readable phrase ("unavailable",
"timed out", "returned invalid JSON", "failed") that is interpolated
into the warning log.
"""
self._summary_model_fallen_back = True
logging.warning(
"Summary model '%s' %s (%s). "
"Falling back to main model '%s' for compression.",
self.summary_model, reason, e, self.model,
)
_err_text = str(e).strip() or e.__class__.__name__
if len(_err_text) > 220:
_err_text = _err_text[:217].rstrip() + "..."
self._last_aux_model_failure_error = _err_text
self._last_aux_model_failure_model = self.summary_model
self.summary_model = "" # empty = use main model
self._summary_failure_cooldown_until = 0.0 # no cooldown — retry immediately
def _generate_summary(self, turns_to_summarize: List[Dict[str, Any]], focus_topic: str = None) -> Optional[str]:
"""Generate a structured summary of conversation turns.
@@ -961,28 +988,42 @@ The user has requested that this compaction PRIORITISE preserving all informatio
_status in (408, 429, 502, 504)
or "timeout" in _err_str
)
# Non-JSON / malformed-body responses from misconfigured providers
# or proxies (e.g. an HTML 502 page returned with
# ``Content-Type: application/json``) bubble up as
# ``json.JSONDecodeError`` from the OpenAI SDK's ``response.json()``,
# or as a wrapping ``APIResponseValidationError`` whose message
# carries the substring "expecting value". Treat these like a
# transient provider failure: one retry on the main model, then a
# short cooldown. Issue #22244.
_is_json_decode = (
isinstance(e, json.JSONDecodeError)
or "expecting value" in _err_str
)
if _is_json_decode and not _is_model_not_found and not _is_timeout:
logger.error(
"Context compression failed: auxiliary LLM returned a "
"non-JSON response. provider=%s summary_model=%s "
"main_model=%s base_url=%s err=%s",
self.provider or "auto",
self.summary_model or "(main)",
self.model,
self.base_url or "default",
e,
)
if (
(_is_model_not_found or _is_timeout)
(_is_model_not_found or _is_timeout or _is_json_decode)
and self.summary_model
and self.summary_model != self.model
and not getattr(self, "_summary_model_fallen_back", False)
):
self._summary_model_fallen_back = True
logging.warning(
"Summary model '%s' unavailable (%s). "
"Falling back to main model '%s' for compression.",
self.summary_model, e, self.model,
)
# Record the aux-model failure so callers can warn the user
# even if the retry-on-main succeeds — a misconfigured aux
# model is something the user needs to fix.
_err_text = str(e).strip() or e.__class__.__name__
if len(_err_text) > 220:
_err_text = _err_text[:217].rstrip() + "..."
self._last_aux_model_failure_error = _err_text
self._last_aux_model_failure_model = self.summary_model
self.summary_model = "" # empty = use main model
self._summary_failure_cooldown_until = 0.0 # no cooldown
if _is_json_decode:
_reason = "returned invalid JSON"
elif _is_model_not_found:
_reason = "unavailable"
else:
_reason = "timed out"
self._fallback_to_main_for_compression(e, _reason)
return self._generate_summary(turns_to_summarize, focus_topic=focus_topic) # retry immediately
# Unknown-error best-effort retry on main model. Losing N turns of
@@ -999,26 +1040,13 @@ The user has requested that this compaction PRIORITISE preserving all informatio
and self.summary_model != self.model
and not getattr(self, "_summary_model_fallen_back", False)
):
self._summary_model_fallen_back = True
logging.warning(
"Summary model '%s' failed (%s). "
"Retrying on main model '%s' before giving up.",
self.summary_model, e, self.model,
)
# Record the aux-model failure (see 404 branch above) — user
# should know their configured model is broken even if main
# recovers the call.
_err_text = str(e).strip() or e.__class__.__name__
if len(_err_text) > 220:
_err_text = _err_text[:217].rstrip() + "..."
self._last_aux_model_failure_error = _err_text
self._last_aux_model_failure_model = self.summary_model
self.summary_model = "" # empty = use main model
self._summary_failure_cooldown_until = 0.0
self._fallback_to_main_for_compression(e, "failed")
return self._generate_summary(turns_to_summarize, focus_topic=focus_topic)
# Transient errors (timeout, rate limit, network) — shorter cooldown
_transient_cooldown = 60
# Transient errors (timeout, rate limit, network, JSON decode) —
# shorter cooldown for JSON decode since the body shape can flip
# back to valid quickly when an upstream proxy recovers.
_transient_cooldown = 30 if _is_json_decode else 60
self._summary_failure_cooldown_until = time.monotonic() + _transient_cooldown
err_text = str(e).strip() or e.__class__.__name__
if len(err_text) > 220:
+12
View File
@@ -564,6 +564,18 @@ PLATFORM_HINTS = {
"code fences). Treat this like a conversation, not a document. Keep responses "
"brief and natural."
),
"webui": (
"You are in the Hermes WebUI, a browser-based chat interface. "
"Full Markdown rendering is supported — headings, bold, italic, code "
"blocks, tables, math (LaTeX), and Mermaid diagrams all render natively. "
"To display local or remote media/files inline, include "
"MEDIA:/absolute/path/to/file or MEDIA:https://... in your response. "
"Local file paths must be absolute. Images, audio (with playback speed "
"controls), video, PDFs, HTML, CSV, diffs/patches, and Excalidraw files "
"render as rich previews. Do not use Markdown image syntax like "
"![alt](/path) for local files; local paths are not served that way. "
"Use MEDIA:/absolute/path instead."
),
}
# ---------------------------------------------------------------------------
+1 -1
View File
@@ -62,7 +62,7 @@ class ToolCall:
return (self.provider_data or {}).get("response_item_id")
@property
def extra_content(self) -> Optional[Dict[str, Any]]:
def extra_content(self) -> dict[str, Any] | None:
"""Gemini extra_content (thought_signature) from provider_data.
Gemini 3 thinking models attach ``extra_content`` with a
+1
View File
@@ -500,6 +500,7 @@ group_sessions_per_user: true
# Stream tokens to messaging platforms in real-time. The bot sends a message
# on first token, then progressively edits it as more tokens arrive.
# Disabled by default — enable to try the streaming UX on Telegram/Discord/Slack.
# For Telegram, partial edits are sent as plain text and only the final edit uses MarkdownV2.
streaming:
enabled: false
# transport: edit # "edit" = progressive editMessageText
+8 -4
View File
@@ -5463,7 +5463,8 @@ class HermesCLI:
return
if not self._session_db:
_cprint(" Session database not available.")
from hermes_state import format_session_db_unavailable
_cprint(f" {format_session_db_unavailable()}")
return
# Resolve title or ID
@@ -5574,7 +5575,8 @@ class HermesCLI:
return
if not self._session_db:
_cprint(" Session database not available.")
from hermes_state import format_session_db_unavailable
_cprint(f" {format_session_db_unavailable()}")
return
parts = cmd_original.split(None, 1)
@@ -6850,7 +6852,8 @@ class HermesCLI:
self._pending_title = new_title
_cprint(f" Session title queued: {new_title} (will be saved on first message)")
else:
_cprint(" Session database not available.")
from hermes_state import format_session_db_unavailable
_cprint(f" {format_session_db_unavailable()}")
else:
_cprint(" Usage: /title <your session title>")
else:
@@ -6865,7 +6868,8 @@ class HermesCLI:
else:
_cprint(" No title set. Usage: /title <your session title>")
else:
_cprint(" Session database not available.")
from hermes_state import format_session_db_unavailable
_cprint(f" {format_session_db_unavailable()}")
elif canonical == "new":
parts = cmd_original.split(maxsplit=1)
title = parts[1].strip() if len(parts) > 1 else None
+65 -5
View File
@@ -72,6 +72,65 @@ def _apply_skill_fields(job: Dict[str, Any]) -> Dict[str, Any]:
return normalized
def _coerce_job_text(value: Any, fallback: str = "") -> str:
"""Coerce legacy/hand-edited nullable cron fields to strings for readers."""
if value is None:
return fallback
return str(value)
def _schedule_display_for_job(job: Dict[str, Any]) -> str:
display = _coerce_job_text(job.get("schedule_display")).strip()
if display:
return display
schedule = job.get("schedule")
if isinstance(schedule, dict):
for key in ("display", "value", "expr", "run_at"):
text = _coerce_job_text(schedule.get(key)).strip()
if text:
return text
elif schedule is not None:
return str(schedule)
return "?"
def _normalize_job_record(job: Dict[str, Any]) -> Dict[str, Any]:
"""Return a read-safe cron job shape for UI/API/tool/scheduler consumers.
Older or hand-edited jobs can have nullable fields like ``prompt``,
``name``, or ``schedule_display``. Keep storage untouched on read, but
ensure consumers never crash while formatting or running those records.
"""
normalized = _apply_skill_fields(job)
job_id = _coerce_job_text(normalized.get("id"), "unknown")
prompt = _coerce_job_text(normalized.get("prompt"))
normalized["id"] = job_id
normalized["prompt"] = prompt
name = _coerce_job_text(normalized.get("name")).strip()
if not name:
script = _coerce_job_text(normalized.get("script")).strip()
label_source = (
prompt
or (normalized["skills"][0] if normalized.get("skills") else "")
or script
or job_id
or "cron job"
)
name = label_source[:50].strip() or "cron job"
normalized["name"] = name
normalized["schedule_display"] = _schedule_display_for_job(normalized)
state = _coerce_job_text(normalized.get("state")).strip()
if not state:
state = "scheduled" if normalized.get("enabled", True) else "paused"
normalized["state"] = state
return normalized
def _secure_dir(path: Path):
"""Set directory to owner-only access (0700). No-op on Windows."""
try:
@@ -533,11 +592,12 @@ def create_job(
else:
context_from = None
label_source = (prompt or (normalized_skills[0] if normalized_skills else None) or (normalized_script if normalized_no_agent else None)) or "cron job"
prompt_text = _coerce_job_text(prompt)
label_source = (prompt_text or (normalized_skills[0] if normalized_skills else None) or (normalized_script if normalized_no_agent else None)) or "cron job"
job = {
"id": job_id,
"name": name or label_source[:50].strip(),
"prompt": prompt,
"prompt": prompt_text,
"skills": normalized_skills,
"skill": normalized_skills[0] if normalized_skills else None,
"model": normalized_model,
@@ -581,13 +641,13 @@ def get_job(job_id: str) -> Optional[Dict[str, Any]]:
jobs = load_jobs()
for job in jobs:
if job["id"] == job_id:
return _apply_skill_fields(job)
return _normalize_job_record(job)
return None
def list_jobs(include_disabled: bool = False) -> List[Dict[str, Any]]:
"""List all jobs, optionally including disabled ones."""
jobs = [_apply_skill_fields(j) for j in load_jobs()]
jobs = [_normalize_job_record(j) for j in load_jobs()]
if not include_disabled:
jobs = [j for j in jobs if j.get("enabled", True)]
return jobs
@@ -637,7 +697,7 @@ def update_job(job_id: str, updates: Dict[str, Any]) -> Optional[Dict[str, Any]]
jobs[i] = updated
save_jobs(jobs)
return _apply_skill_fields(jobs[i])
return _normalize_job_record(jobs[i])
return None
+28 -5
View File
@@ -845,7 +845,7 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
result is used for prompt injection. When omitted, the script
(if any) runs inline as before.
"""
prompt = job.get("prompt", "")
prompt = str(job.get("prompt") or "")
skills = job.get("skills")
# Run data-collection script if configured, inject output as context.
@@ -933,6 +933,8 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
if skills is None:
legacy = job.get("skill")
skills = [legacy] if legacy else []
elif isinstance(skills, str):
skills = [skills]
skill_names = [str(name).strip() for name in skills if str(name).strip()]
if not skill_names:
@@ -1015,7 +1017,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
Tuple of (success, full_output_doc, final_response, error_message)
"""
job_id = job["id"]
job_name = job["name"]
job_name = str(job.get("name") or job.get("prompt") or job_id or "cron job")
# ---------------------------------------------------------------
# no_agent short-circuit — the script IS the job, no LLM involvement.
@@ -1204,10 +1206,31 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
# don't clobber each other's targets (os.environ is process-global).
from gateway.session_context import set_session_vars, clear_session_vars, _VAR_MAP
# Cron execution is an internal scheduler context, not a live inbound
# gateway message. Do not seed HERMES_SESSION_* contextvars from the
# stored ``origin`` (which is delivery routing metadata, not a sender
# identity). Several tool consumers branch on these vars during job
# execution and would otherwise behave as if a real user from the
# origin chat was driving the agent:
# - tools/terminal_tool.py: background-process notification routing
# (notify_on_complete / watch_patterns) reads HERMES_SESSION_PLATFORM
# and HERMES_SESSION_CHAT_ID to populate watcher_platform / chat_id,
# which would route completion notifications to the origin chat
# instead of via HERMES_CRON_AUTO_DELIVER_* below.
# - tools/tts_tool.py: picks Opus vs MP3 based on
# HERMES_SESSION_PLATFORM == "telegram".
# - tools/skills_tool.py + agent/prompt_builder.py: per-platform
# skill-disable lists and the system-prompt cache key both consume
# HERMES_SESSION_PLATFORM.
# - tools/send_message_tool.py: mirror source labelling and the
# send_message gate read HERMES_SESSION_PLATFORM.
# Cron output delivery itself reads job["origin"] directly via
# _resolve_origin(job) and the HERMES_CRON_AUTO_DELIVER_* vars set
# below, so clearing HERMES_SESSION_* here does not affect delivery.
_ctx_tokens = set_session_vars(
platform=origin["platform"] if origin else "",
chat_id=str(origin["chat_id"]) if origin else "",
chat_name=origin.get("chat_name", "") if origin else "",
platform="",
chat_id="",
chat_name="",
)
_cron_delivery_vars = (
"HERMES_CRON_AUTO_DELIVER_PLATFORM",
+1 -1
View File
@@ -403,7 +403,7 @@ class HermesAgentLoop:
# Run tool calls in a thread pool so backends that
# use asyncio.run() internally (modal, docker, daytona) get
# a clean event loop instead of deadlocking.
loop = asyncio.get_event_loop()
loop = asyncio.get_running_loop()
# Capture current tool_name/args for the lambda
_tn, _ta, _tid = tool_name, args, self.task_id
tool_result = await loop.run_in_executor(
@@ -575,7 +575,7 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
# other tasks, tqdm updates, and timeout timers).
ctx = ToolContext(task_id)
try:
loop = asyncio.get_event_loop()
loop = asyncio.get_running_loop()
reward = await loop.run_in_executor(
None, # default thread pool
self._run_tests, eval_item, ctx, task_name,
+18 -1
View File
@@ -30,7 +30,7 @@ Usage (gateway side):
import logging
from dataclasses import dataclass, field
from typing import Any, Callable, Optional
from typing import Any, Awaitable, Callable, Optional
logger = logging.getLogger(__name__)
@@ -125,6 +125,23 @@ class PlatformEntry:
# resolve the default chat/room ID. Empty = no cron home-channel support.
cron_deliver_env_var: str = ""
# ── Standalone (out-of-process) sending ──
# Optional: async coroutine that delivers a message without a live
# gateway adapter. Called by ``tools/send_message_tool._send_via_adapter``
# when ``cron`` runs in a separate process from the gateway and the
# in-process adapter weakref is therefore ``None``.
#
# Signature:
# async (pconfig, chat_id, message, *, thread_id=None,
# media_files=None, force_document=False) -> dict
#
# Returns ``{"success": True, "message_id": ...}`` on success or
# ``{"error": str}`` on failure. Plugin authors typically open an
# ephemeral connection / acquire a fresh OAuth token, send, and close.
# Without this hook, plugin platforms cannot serve as cron ``deliver=``
# targets when the gateway is not co-resident with the cron process.
standalone_sender_fn: Optional[Callable[..., Awaitable[dict]]] = None
class PlatformRegistry:
"""Central registry of platform adapters.
+6 -1
View File
@@ -14,7 +14,7 @@ The plugin system automatically handles: adapter creation, config parsing,
user authorization, cron delivery, send_message routing, system prompt hints,
status display, gateway setup, and more.
**Three optional hooks cover the edges most adapters need:**
**Optional hooks cover the edges most adapters need:**
- `env_enablement_fn: () -> Optional[dict]` — seeds `PlatformConfig.extra`
(and an optional `home_channel` dict) from env vars BEFORE the adapter is
@@ -24,6 +24,11 @@ status display, gateway setup, and more.
- `cron_deliver_env_var: str` — name of the `*_HOME_CHANNEL` env var. When
set, `deliver=<name>` cron jobs route to this var without editing
`cron/scheduler.py`'s hardcoded sets.
- `standalone_sender_fn: async (...) -> dict`: out-of-process delivery
for cron jobs that run separately from the gateway. Without this, a
`deliver=<name>` job fires correctly but the actual send returns
`No live adapter for platform '<name>'`. Pair with `cron_deliver_env_var`
for end-to-end cron support. See the docsite for the signature.
- `plugin.yaml` `requires_env` / `optional_env` rich-dict entries —
auto-populate `OPTIONAL_ENV_VARS` in `hermes_cli/config.py` so the setup
wizard surfaces proper descriptions, prompts, password flags, and URLs.
+6 -1
View File
@@ -312,7 +312,12 @@ class ResponseStore:
self._conn = sqlite3.connect(db_path, check_same_thread=False)
except Exception:
self._conn = sqlite3.connect(":memory:", check_same_thread=False)
self._conn.execute("PRAGMA journal_mode=WAL")
# Use shared WAL-fallback helper so response_store.db degrades
# gracefully on NFS/SMB/FUSE-mounted HERMES_HOME (same filesystem
# issue addressed for state.db/kanban.db — see
# hermes_state._WAL_INCOMPAT_MARKERS).
from hermes_state import apply_wal_with_fallback
apply_wal_with_fallback(self._conn, db_label="response_store.db")
self._conn.execute(
"""CREATE TABLE IF NOT EXISTS responses (
response_id TEXT PRIMARY KEY,
+62 -28
View File
@@ -40,6 +40,52 @@ def _platform_name(platform) -> str:
return str(value or "").lower()
def _thread_metadata_for_source(source, reply_to_message_id: str | None = None) -> dict | None:
"""Build platform-aware thread metadata for adapter sends.
Most platforms route threaded sends with a generic ``thread_id`` metadata
value. Telegram private-chat topics created through Hermes' DM-topic helper
are exposed in updates as ``message_thread_id`` plus a reply anchor, but
outbound sends only render in the correct Telegram lane when the adapter
supplies both ``message_thread_id`` and ``reply_to_message_id``. Mark those
lanes so the Telegram adapter can avoid the known-bad partial routes.
"""
thread_id = getattr(source, "thread_id", None)
if thread_id is None:
return None
metadata = {"thread_id": thread_id}
if _platform_name(getattr(source, "platform", None)) == "telegram" and getattr(source, "chat_type", None) == "dm":
metadata["telegram_dm_topic_reply_fallback"] = True
anchor = reply_to_message_id or getattr(source, "message_id", None)
if anchor is not None:
metadata["telegram_reply_to_message_id"] = str(anchor)
return metadata
def _reply_anchor_for_event(event) -> str | None:
"""Return reply_to id for platforms that need reply semantics.
Telegram forum/supergroup topics should be routed by topic metadata, not by
replying to the triggering message. Hermes-created Telegram private-chat
topic lanes are different: Bot API sends reject their ``message_thread_id``
and do not route with ``direct_messages_topic_id``. Those lanes only remain
visible when sent with both the private topic thread id and a reply to the
triggering user message.
"""
source = getattr(event, "source", None)
platform = _platform_name(getattr(source, "platform", None))
thread_id = getattr(source, "thread_id", None)
if platform == "telegram" and thread_id and getattr(source, "chat_type", None) == "dm":
# Reply to the triggering user message. Replying to Telegram's earlier
# topic seed/anchor can render the bot response outside the active lane.
return getattr(event, "message_id", None) or getattr(event, "reply_to_message_id", None)
if platform == "telegram" and thread_id:
return None
if platform == "feishu" and thread_id and getattr(event, "reply_to_message_id", None):
return getattr(event, "reply_to_message_id", None)
return getattr(event, "message_id", None)
def should_send_media_as_audio(platform, ext: str, is_voice: bool = False) -> bool:
"""Return True when a media file should use the platform's audio sender.
@@ -1719,7 +1765,7 @@ class BasePlatformAdapter(ABC):
"""
# Fallback: send URL as text (subclasses override for native images)
text = f"{caption}\n{image_url}" if caption else image_url
return await self.send(chat_id=chat_id, content=text, reply_to=reply_to)
return await self.send(chat_id=chat_id, content=text, reply_to=reply_to, metadata=metadata)
async def send_animation(
self,
@@ -1798,6 +1844,7 @@ class BasePlatformAdapter(ABC):
audio_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs,
) -> SendResult:
"""
@@ -1810,7 +1857,7 @@ class BasePlatformAdapter(ABC):
text = f"🔊 Audio: {audio_path}"
if caption:
text = f"{caption}\n{text}"
return await self.send(chat_id=chat_id, content=text, reply_to=reply_to)
return await self.send(chat_id=chat_id, content=text, reply_to=reply_to, metadata=metadata)
async def play_tts(
self,
@@ -1832,6 +1879,7 @@ class BasePlatformAdapter(ABC):
video_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs,
) -> SendResult:
"""
@@ -1843,7 +1891,7 @@ class BasePlatformAdapter(ABC):
text = f"🎬 Video: {video_path}"
if caption:
text = f"{caption}\n{text}"
return await self.send(chat_id=chat_id, content=text, reply_to=reply_to)
return await self.send(chat_id=chat_id, content=text, reply_to=reply_to, metadata=metadata)
async def send_document(
self,
@@ -1852,6 +1900,7 @@ class BasePlatformAdapter(ABC):
caption: Optional[str] = None,
file_name: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs,
) -> SendResult:
"""
@@ -1863,7 +1912,7 @@ class BasePlatformAdapter(ABC):
text = f"📎 File: {file_path}"
if caption:
text = f"{caption}\n{text}"
return await self.send(chat_id=chat_id, content=text, reply_to=reply_to)
return await self.send(chat_id=chat_id, content=text, reply_to=reply_to, metadata=metadata)
async def send_image_file(
self,
@@ -1871,6 +1920,7 @@ class BasePlatformAdapter(ABC):
image_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs,
) -> SendResult:
"""
@@ -1883,7 +1933,7 @@ class BasePlatformAdapter(ABC):
text = f"🖼️ Image: {image_path}"
if caption:
text = f"{caption}\n{text}"
return await self.send(chat_id=chat_id, content=text, reply_to=reply_to)
return await self.send(chat_id=chat_id, content=text, reply_to=reply_to, metadata=metadata)
@staticmethod
def extract_media(content: str) -> Tuple[List[Tuple[str, bool]], str]:
@@ -2558,7 +2608,7 @@ class BasePlatformAdapter(ABC):
current_guard = self._active_sessions.get(session_key)
command_guard = asyncio.Event()
self._active_sessions[session_key] = command_guard
thread_meta = {"thread_id": event.source.thread_id} if event.source.thread_id else None
thread_meta = _thread_metadata_for_source(event.source, _reply_anchor_for_event(event))
try:
response = await self._message_handler(event)
@@ -2579,13 +2629,7 @@ class BasePlatformAdapter(ABC):
_r = await self._send_with_retry(
chat_id=event.source.chat_id,
content=_text,
reply_to=(
event.reply_to_message_id
if event.source.platform == Platform.FEISHU
and event.source.thread_id
and event.reply_to_message_id
else event.message_id
),
reply_to=_reply_anchor_for_event(event),
metadata=thread_meta,
)
if _eph_ttl > 0 and _r.success and _r.message_id:
@@ -2678,20 +2722,14 @@ class BasePlatformAdapter(ABC):
self.name, cmd, session_key,
)
try:
_thread_meta = {"thread_id": event.source.thread_id} if event.source.thread_id else None
_thread_meta = _thread_metadata_for_source(event.source, _reply_anchor_for_event(event))
response = await self._message_handler(event)
_text, _eph_ttl = self._unwrap_ephemeral(response)
if _text:
_r = await self._send_with_retry(
chat_id=event.source.chat_id,
content=_text,
reply_to=(
event.reply_to_message_id
if event.source.platform == Platform.FEISHU
and event.source.thread_id
and event.reply_to_message_id
else event.message_id
),
reply_to=_reply_anchor_for_event(event),
metadata=_thread_meta,
)
if _eph_ttl > 0 and _r.success and _r.message_id:
@@ -2783,7 +2821,7 @@ class BasePlatformAdapter(ABC):
self._active_sessions[session_key] = interrupt_event
# Start continuous typing indicator (refreshes every 2 seconds)
_thread_metadata = {"thread_id": event.source.thread_id} if event.source.thread_id else None
_thread_metadata = _thread_metadata_for_source(event.source, _reply_anchor_for_event(event))
_keep_typing_kwargs = {"metadata": _thread_metadata}
try:
_keep_typing_sig = inspect.signature(self._keep_typing)
@@ -2911,11 +2949,7 @@ class BasePlatformAdapter(ABC):
# Send the text portion
if text_content:
logger.info("[%s] Sending response (%d chars) to %s", self.name, len(text_content), event.source.chat_id)
_reply_anchor = (
event.reply_to_message_id
if event.source.platform == Platform.FEISHU and event.source.thread_id and event.reply_to_message_id
else event.message_id
)
_reply_anchor = _reply_anchor_for_event(event)
result = await self._send_with_retry(
chat_id=event.source.chat_id,
content=text_content,
@@ -3108,7 +3142,7 @@ class BasePlatformAdapter(ABC):
try:
error_type = type(e).__name__
error_detail = str(e)[:300] if str(e) else "no details available"
_thread_metadata = {"thread_id": event.source.thread_id} if event.source.thread_id else None
_thread_metadata = _thread_metadata_for_source(event.source, _reply_anchor_for_event(event))
await self.send(
chat_id=event.source.chat_id,
content=(
+173 -3
View File
@@ -1404,6 +1404,9 @@ class FeishuAdapter(BasePlatformAdapter):
# Exec approval button state (approval_id → {session_key, message_id, chat_id})
self._approval_state: Dict[int, Dict[str, str]] = {}
self._approval_counter = itertools.count(1)
# Update prompt button state (prompt_id → {session_key, message_id, chat_id})
self._update_prompt_state: Dict[int, Dict[str, str]] = {}
self._update_prompt_counter = itertools.count(1)
# Feishu reaction deletion requires the opaque reaction_id returned
# by create, so we cache it per message_id.
self._pending_processing_reactions: "OrderedDict[str, str]" = OrderedDict()
@@ -1856,6 +1859,74 @@ class FeishuAdapter(BasePlatformAdapter):
logger.warning("[Feishu] send_exec_approval failed: %s", exc)
return SendResult(success=False, error=str(exc))
@staticmethod
def _build_update_prompt_card(*, prompt: str, default: str, prompt_id: int) -> Dict[str, Any]:
default_hint = f"\n\nDefault: `{default}`" if default else ""
def _btn(label: str, answer: str, btn_type: str) -> dict:
return {
"tag": "button",
"text": {"tag": "plain_text", "content": label},
"type": btn_type,
"value": {
"hermes_update_prompt_action": answer,
"update_prompt_id": prompt_id,
},
}
return {
"config": {"wide_screen_mode": True},
"header": {
"title": {"content": "⚕ Update Needs Your Input", "tag": "plain_text"},
"template": "orange",
},
"elements": [
{"tag": "markdown", "content": f"{prompt}{default_hint}"},
{
"tag": "action",
"actions": [
_btn("✓ Yes", "y", "primary"),
_btn("✗ No", "n", "danger"),
],
},
],
}
async def send_update_prompt(
self, chat_id: str, prompt: str, default: str = "",
session_key: str = "",
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send an interactive update prompt with Yes/No buttons."""
if not self._client:
return SendResult(success=False, error="Not connected")
try:
prompt_id = next(self._update_prompt_counter)
payload = json.dumps(
self._build_update_prompt_card(prompt=prompt, default=default, prompt_id=prompt_id),
ensure_ascii=False,
)
response = await self._feishu_send_with_retry(
chat_id=chat_id,
msg_type="interactive",
payload=payload,
reply_to=None,
metadata=metadata,
)
result = self._finalize_send_result(response, "send_update_prompt failed")
if result.success:
self._update_prompt_state[prompt_id] = {
"session_key": session_key,
"message_id": result.message_id or "",
"chat_id": chat_id,
}
return result
except Exception as exc:
logger.warning("[Feishu] send_update_prompt failed: %s", exc)
return SendResult(success=False, error=str(exc))
@staticmethod
def _build_resolved_approval_card(*, choice: str, user_name: str) -> Dict[str, Any]:
"""Build raw card JSON for a resolved approval action."""
@@ -1875,6 +1946,28 @@ class FeishuAdapter(BasePlatformAdapter):
],
}
@staticmethod
def _build_resolved_update_prompt_card(*, answer: str, user_name: str) -> Dict[str, Any]:
yes = answer == "y"
label = "Yes" if yes else "No"
return {
"config": {"wide_screen_mode": True},
"header": {
"title": {"content": f"{'' if yes else ''} Update prompt answered: {label}", "tag": "plain_text"},
"template": "green" if yes else "red",
},
"elements": [
{"tag": "markdown", "content": f"Answered by **{user_name}**"},
],
}
@staticmethod
def _write_update_prompt_response(answer: str) -> None:
response_path = get_hermes_home() / ".update_response"
tmp_path = response_path.with_suffix(".tmp")
tmp_path.write_text(answer)
tmp_path.replace(response_path)
async def send_voice(
self,
chat_id: str,
@@ -2372,9 +2465,19 @@ class FeishuAdapter(BasePlatformAdapter):
action = getattr(event, "action", None)
action_value = getattr(action, "value", {}) or {}
hermes_action = action_value.get("hermes_action") if isinstance(action_value, dict) else None
update_prompt_action = (
action_value.get("hermes_update_prompt_action")
if isinstance(action_value, dict) else None
)
if hermes_action:
return self._handle_approval_card_action(event=event, action_value=action_value, loop=loop)
if update_prompt_action:
return self._handle_update_prompt_card_action(
event=event,
action_value=action_value,
loop=loop,
)
self._submit_on_loop(loop, self._handle_card_action_event(data))
if P2CardActionTriggerResponse is None:
@@ -2386,10 +2489,26 @@ class FeishuAdapter(BasePlatformAdapter):
"""Return True when the adapter loop can accept thread-safe submissions."""
return loop is not None and not bool(getattr(loop, "is_closed", lambda: False)())
def _submit_on_loop(self, loop: Any, coro: Any) -> None:
def _submit_on_loop(self, loop: Any, coro: Any) -> bool:
"""Schedule background work on the adapter loop with shared failure logging."""
future = asyncio.run_coroutine_threadsafe(coro, loop)
try:
future = asyncio.run_coroutine_threadsafe(coro, loop)
except Exception:
coro.close()
logger.warning("[Feishu] Failed to schedule background callback work", exc_info=True)
return False
future.add_done_callback(self._log_background_failure)
return True
def _is_interactive_operator_authorized(self, open_id: str) -> bool:
"""Return whether this card-action operator may answer gated prompts."""
normalized = str(open_id or "").strip()
if not normalized:
return False
allowed_ids = set(self._admins) | set(self._allowed_group_users)
if not allowed_ids:
return True
return "*" in allowed_ids or normalized in allowed_ids
def _handle_approval_card_action(self, *, event: Any, action_value: Dict[str, Any], loop: Any) -> Any:
"""Schedule approval resolution and build the synchronous callback response."""
@@ -2403,7 +2522,8 @@ class FeishuAdapter(BasePlatformAdapter):
open_id = str(getattr(operator, "open_id", "") or "")
user_name = self._get_cached_sender_name(open_id) or open_id
self._submit_on_loop(loop, self._resolve_approval(approval_id, choice, user_name))
if not self._submit_on_loop(loop, self._resolve_approval(approval_id, choice, user_name)):
return P2CardActionTriggerResponse() if P2CardActionTriggerResponse else None
if P2CardActionTriggerResponse is None:
return None
@@ -2415,6 +2535,41 @@ class FeishuAdapter(BasePlatformAdapter):
response.card = card
return response
def _handle_update_prompt_card_action(self, *, event: Any, action_value: Dict[str, Any], loop: Any) -> Any:
"""Schedule update prompt resolution and build the synchronous callback response."""
prompt_id = action_value.get("update_prompt_id")
if prompt_id is None:
logger.debug("[Feishu] Card action missing update_prompt_id, ignoring")
return P2CardActionTriggerResponse() if P2CardActionTriggerResponse else None
if prompt_id not in self._update_prompt_state:
logger.debug("[Feishu] Update prompt %s already resolved or unknown", prompt_id)
return P2CardActionTriggerResponse() if P2CardActionTriggerResponse else None
answer = str(action_value.get("hermes_update_prompt_action", "") or "").strip().lower()
if answer not in {"y", "n"}:
logger.debug("[Feishu] Card action has invalid update prompt answer=%r", answer)
return P2CardActionTriggerResponse() if P2CardActionTriggerResponse else None
operator = getattr(event, "operator", None)
open_id = str(getattr(operator, "open_id", "") or "")
if not self._is_interactive_operator_authorized(open_id):
logger.warning("[Feishu] Unauthorized update prompt click by %s", open_id or "<unknown>")
return P2CardActionTriggerResponse() if P2CardActionTriggerResponse else None
user_name = self._get_cached_sender_name(open_id) or open_id
if not self._submit_on_loop(loop, self._resolve_update_prompt(prompt_id, answer, user_name)):
return P2CardActionTriggerResponse() if P2CardActionTriggerResponse else None
if P2CardActionTriggerResponse is None:
return None
response = P2CardActionTriggerResponse()
if CallBackCard is not None:
card = CallBackCard()
card.type = "raw"
card.data = self._build_resolved_update_prompt_card(answer=answer, user_name=user_name)
response.card = card
return response
async def _resolve_approval(self, approval_id: Any, choice: str, user_name: str) -> None:
"""Pop approval state and unblock the waiting agent thread."""
state = self._approval_state.pop(approval_id, None)
@@ -2431,6 +2586,21 @@ class FeishuAdapter(BasePlatformAdapter):
except Exception as exc:
logger.error("Failed to resolve gateway approval from Feishu button: %s", exc)
async def _resolve_update_prompt(self, prompt_id: Any, answer: str, user_name: str) -> None:
"""Persist an update prompt answer for the detached update process."""
state = self._update_prompt_state.pop(prompt_id, None)
if not state:
logger.debug("[Feishu] Update prompt %s already resolved or unknown", prompt_id)
return
try:
self._write_update_prompt_response(answer)
logger.info(
"Feishu update prompt resolved for session %s (answer=%s, user=%s)",
state["session_key"], answer, user_name,
)
except Exception as exc:
logger.error("Failed to resolve Feishu update prompt: %s", exc)
async def _handle_reaction_event(self, event_type: str, data: Any) -> None:
"""Fetch the reacted-to message; if it was sent by this bot, emit a synthetic text event."""
if not self._client:
+427 -81
View File
@@ -361,6 +361,63 @@ class TelegramAdapter(BasePlatformAdapter):
thread_id = metadata.get("thread_id") or metadata.get("message_thread_id")
return str(thread_id) if thread_id is not None else None
@classmethod
def _metadata_direct_messages_topic_id(cls, metadata: Optional[Dict[str, Any]]) -> Optional[str]:
if not metadata:
return None
topic_id = metadata.get("direct_messages_topic_id") or metadata.get("telegram_direct_messages_topic_id")
return str(topic_id) if topic_id is not None else None
@classmethod
def _metadata_reply_to_message_id(cls, metadata: Optional[Dict[str, Any]]) -> Optional[int]:
if not metadata:
return None
reply_to = metadata.get("telegram_reply_to_message_id")
return int(reply_to) if reply_to is not None else None
@classmethod
def _reply_to_message_id_for_send(
cls,
reply_to: Optional[str],
metadata: Optional[Dict[str, Any]] = None,
) -> Optional[int]:
if reply_to:
return int(reply_to)
if metadata and metadata.get("telegram_dm_topic_reply_fallback"):
return cls._metadata_reply_to_message_id(metadata)
return None
@classmethod
def _thread_kwargs_for_send(
cls,
chat_id: str,
thread_id: Optional[str],
metadata: Optional[Dict[str, Any]] = None,
reply_to_message_id: Optional[int] = None,
) -> Dict[str, Any]:
"""Return Telegram send kwargs for forum and direct-message topic routing.
Supergroup/forum topics use ``message_thread_id``. True Bot API Direct
Messages topics can opt in with explicit ``direct_messages_topic_id``
metadata. Hermes-created private-chat topic lanes are marked with
``telegram_dm_topic_reply_fallback`` and must send the private topic
thread id together with a reply anchor. Live testing showed that either
parameter alone can render outside the visible lane.
"""
if metadata and metadata.get("telegram_dm_topic_reply_fallback"):
if reply_to_message_id is None:
reply_to_message_id = cls._metadata_reply_to_message_id(metadata)
if reply_to_message_id is None:
return {}
return {"message_thread_id": cls._message_thread_id_for_send(thread_id)}
direct_topic_id = cls._metadata_direct_messages_topic_id(metadata)
if direct_topic_id is not None:
return {
"message_thread_id": None,
"direct_messages_topic_id": int(direct_topic_id),
}
return {"message_thread_id": cls._message_thread_id_for_send(thread_id)}
@classmethod
def _message_thread_id_for_send(cls, thread_id: Optional[str]) -> Optional[int]:
if not thread_id or str(thread_id) == cls._GENERAL_TOPIC_THREAD_ID:
@@ -384,6 +441,65 @@ class TelegramAdapter(BasePlatformAdapter):
def _is_thread_not_found_error(error: Exception) -> bool:
return "thread not found" in str(error).lower()
@staticmethod
def _is_bad_request_error(error: Exception) -> bool:
name = error.__class__.__name__.lower()
if name == "badrequest" or name.endswith("badrequest"):
return True
try:
from telegram.error import BadRequest
return isinstance(error, BadRequest)
except ImportError:
return False
@classmethod
def _should_retry_without_dm_topic_reply_anchor(
cls,
error: Exception,
metadata: Optional[Dict[str, Any]],
reply_to_message_id: Optional[int],
) -> bool:
return (
bool(metadata and metadata.get("telegram_dm_topic_reply_fallback"))
and reply_to_message_id is not None
and cls._is_bad_request_error(error)
and "message to be replied not found" in str(error).lower()
)
async def _send_with_dm_topic_reply_anchor_retry(
self,
send_fn: Any,
send_kwargs: Dict[str, Any],
metadata: Optional[Dict[str, Any]],
reply_to_message_id: Optional[int],
media_label: str,
reset_media: Optional[Any] = None,
) -> Any:
"""Retry stale private-topic media replies once without the topic anchor."""
try:
return await send_fn(**send_kwargs)
except Exception as send_err:
if not self._should_retry_without_dm_topic_reply_anchor(
send_err,
metadata,
reply_to_message_id,
):
raise
logger.warning(
"[%s] Reply target deleted for Telegram %s, "
"retrying without reply/topic anchor: %s",
self.name,
media_label,
send_err,
)
if reset_media is not None:
reset_media()
retry_kwargs = dict(send_kwargs)
retry_kwargs["reply_to_message_id"] = None
retry_kwargs.pop("message_thread_id", None)
retry_kwargs.pop("direct_messages_topic_id", None)
return await send_fn(**retry_kwargs)
def _fallback_ips(self) -> list[str]:
"""Return validated fallback IPs from config (populated by _apply_env_overrides)."""
configured = self.config.extra.get("fallback_ips", []) if getattr(self.config, "extra", None) else []
@@ -1254,9 +1370,23 @@ class TelegramAdapter(BasePlatformAdapter):
_TimedOut = None # type: ignore[assignment,misc]
for i, chunk in enumerate(chunks):
should_thread = self._should_thread_reply(reply_to, i)
reply_to_id = int(reply_to) if should_thread else None
effective_thread_id = self._message_thread_id_for_send(thread_id)
metadata_reply_to = self._metadata_reply_to_message_id(metadata)
reply_to_source = reply_to or (
str(metadata_reply_to)
if metadata and metadata.get("telegram_dm_topic_reply_fallback") and metadata_reply_to is not None else None
)
if metadata and metadata.get("telegram_dm_topic_reply_fallback"):
should_thread = reply_to_source is not None
else:
should_thread = self._should_thread_reply(reply_to_source, i)
reply_to_id = int(reply_to_source) if should_thread and reply_to_source else None
thread_kwargs = self._thread_kwargs_for_send(
chat_id,
thread_id,
metadata,
reply_to_message_id=reply_to_id,
)
effective_thread_id = thread_kwargs.get("message_thread_id")
msg = None
for _send_attempt in range(3):
@@ -1268,7 +1398,7 @@ class TelegramAdapter(BasePlatformAdapter):
text=chunk,
parse_mode=ParseMode.MARKDOWN_V2,
reply_to_message_id=reply_to_id,
message_thread_id=effective_thread_id,
**thread_kwargs,
**self._link_preview_kwargs(),
)
except Exception as md_error:
@@ -1281,7 +1411,7 @@ class TelegramAdapter(BasePlatformAdapter):
text=plain_chunk,
parse_mode=None,
reply_to_message_id=reply_to_id,
message_thread_id=effective_thread_id,
**thread_kwargs,
**self._link_preview_kwargs(),
)
else:
@@ -1302,17 +1432,30 @@ class TelegramAdapter(BasePlatformAdapter):
self.name, effective_thread_id,
)
effective_thread_id = None
thread_kwargs = {"message_thread_id": None}
continue
err_lower = str(send_err).lower()
if "message to be replied not found" in err_lower and reply_to_id is not None:
# Original message was deleted before we
# could reply — clear reply target and retry
# so the response is still delivered.
# could reply. For private-topic fallback
# sends, message_thread_id is only valid with
# the reply anchor, so drop both together.
logger.warning(
"[%s] Reply target deleted, retrying without reply_to: %s",
self.name, send_err,
)
reply_to_id = None
if metadata and metadata.get("telegram_dm_topic_reply_fallback"):
thread_kwargs = {}
effective_thread_id = None
else:
thread_kwargs = self._thread_kwargs_for_send(
chat_id,
thread_id,
metadata,
reply_to_message_id=reply_to_id,
)
effective_thread_id = thread_kwargs.get("message_thread_id")
continue
# Other BadRequest errors are permanent — don't retry
raise
@@ -1372,6 +1515,14 @@ class TelegramAdapter(BasePlatformAdapter):
if not self._bot:
return SendResult(success=False, error="Not connected")
try:
if not finalize:
await self._bot.edit_message_text(
chat_id=int(chat_id),
message_id=int(message_id),
text=content,
)
return SendResult(success=True, message_id=message_id)
formatted = self.format_message(content)
try:
await self._bot.edit_message_text(
@@ -1494,13 +1645,19 @@ class TelegramAdapter(BasePlatformAdapter):
]
])
thread_id = self._metadata_thread_id(metadata)
message_thread_id = self._message_thread_id_for_send(thread_id)
reply_to_id = self._reply_to_message_id_for_send(None, metadata)
msg = await self._bot.send_message(
chat_id=int(chat_id),
text=text,
parse_mode=ParseMode.MARKDOWN,
reply_markup=keyboard,
message_thread_id=message_thread_id,
reply_to_message_id=reply_to_id,
**self._thread_kwargs_for_send(
chat_id,
thread_id,
metadata,
reply_to_message_id=reply_to_id,
),
**self._link_preview_kwargs(),
)
return SendResult(success=True, message_id=str(msg.message_id))
@@ -1558,9 +1715,16 @@ class TelegramAdapter(BasePlatformAdapter):
"reply_markup": keyboard,
**self._link_preview_kwargs(),
}
message_thread_id = self._message_thread_id_for_send(thread_id)
if message_thread_id is not None:
kwargs["message_thread_id"] = message_thread_id
reply_to_id = self._reply_to_message_id_for_send(None, metadata)
kwargs["reply_to_message_id"] = reply_to_id
kwargs.update(
self._thread_kwargs_for_send(
chat_id,
thread_id,
metadata,
reply_to_message_id=reply_to_id,
)
)
msg = await self._bot.send_message(**kwargs)
@@ -1603,9 +1767,16 @@ class TelegramAdapter(BasePlatformAdapter):
"reply_markup": keyboard,
**self._link_preview_kwargs(),
}
message_thread_id = self._message_thread_id_for_send(thread_id)
if message_thread_id is not None:
kwargs["message_thread_id"] = message_thread_id
reply_to_id = self._reply_to_message_id_for_send(None, metadata)
kwargs["reply_to_message_id"] = reply_to_id
kwargs.update(
self._thread_kwargs_for_send(
chat_id,
thread_id,
metadata,
reply_to_message_id=reply_to_id,
)
)
msg = await self._bot.send_message(**kwargs)
self._slash_confirm_state[confirm_id] = session_key
@@ -1664,12 +1835,19 @@ class TelegramAdapter(BasePlatformAdapter):
)
thread_id = metadata.get("thread_id") if metadata else None
reply_to_id = self._reply_to_message_id_for_send(None, metadata)
msg = await self._bot.send_message(
chat_id=int(chat_id),
text=text,
parse_mode=ParseMode.MARKDOWN,
reply_markup=keyboard,
message_thread_id=int(thread_id) if thread_id else None,
reply_to_message_id=reply_to_id,
**self._thread_kwargs_for_send(
chat_id,
thread_id,
metadata,
reply_to_message_id=reply_to_id,
),
**self._link_preview_kwargs(),
)
@@ -2046,17 +2224,47 @@ class TelegramAdapter(BasePlatformAdapter):
session_key, confirm_id, choice,
)
if result_text and query.message:
# Inherit the prompt message's thread so the reply
# lands in the same supergroup topic / reply chain.
# Inherit the prompt message's topic. Supergroup forums
# use message_thread_id; Telegram private DM-topic lanes
# need both the private topic id and the prompt reply anchor.
thread_id = getattr(query.message, "message_thread_id", None)
chat = getattr(query.message, "chat", None)
chat_type = getattr(chat, "type", None)
prompt_message_id = getattr(query.message, "message_id", None)
send_kwargs: Dict[str, Any] = {
"chat_id": int(query.message.chat_id),
"text": result_text,
"parse_mode": ParseMode.MARKDOWN,
**self._link_preview_kwargs(),
}
if thread_id is not None:
send_kwargs["message_thread_id"] = thread_id
chat_type_value = getattr(chat_type, "value", chat_type)
is_private_chat = str(chat_type_value).lower() in {
"private",
str(ChatType.PRIVATE).lower(),
str(getattr(ChatType.PRIVATE, "value", ChatType.PRIVATE)).lower(),
}
if thread_id is not None and is_private_chat and prompt_message_id is not None:
reply_to_id = int(prompt_message_id)
send_kwargs["reply_to_message_id"] = reply_to_id
send_kwargs.update(
self._thread_kwargs_for_send(
str(query.message.chat_id),
str(thread_id),
{
"thread_id": str(thread_id),
"telegram_dm_topic_reply_fallback": True,
},
reply_to_message_id=reply_to_id,
)
)
elif thread_id is not None:
send_kwargs.update(
self._thread_kwargs_for_send(
str(query.message.chat_id),
str(thread_id),
{"thread_id": str(thread_id)},
)
)
await self._bot.send_message(**send_kwargs)
except Exception as exc:
logger.error("[%s] slash-confirm callback failed: %s", self.name, exc, exc_info=True)
@@ -2137,22 +2345,50 @@ class TelegramAdapter(BasePlatformAdapter):
# .ogg / .opus files -> send as voice (round playable bubble)
if ext in (".ogg", ".opus"):
_voice_thread = self._metadata_thread_id(metadata)
msg = await self._bot.send_voice(
chat_id=int(chat_id),
voice=audio_file,
caption=caption[:1024] if caption else None,
reply_to_message_id=int(reply_to) if reply_to else None,
message_thread_id=self._message_thread_id_for_send(_voice_thread),
reply_to_id = self._reply_to_message_id_for_send(reply_to, metadata)
voice_thread_kwargs = self._thread_kwargs_for_send(
chat_id,
_voice_thread,
metadata,
reply_to_message_id=reply_to_id,
)
msg = await self._send_with_dm_topic_reply_anchor_retry(
self._bot.send_voice,
{
"chat_id": int(chat_id),
"voice": audio_file,
"caption": caption[:1024] if caption else None,
"reply_to_message_id": reply_to_id,
**voice_thread_kwargs,
},
metadata,
reply_to_id,
"voice",
reset_media=lambda: audio_file.seek(0),
)
elif ext in (".mp3", ".m4a"):
# Telegram's Bot API sendAudio only accepts MP3 / M4A.
_audio_thread = self._metadata_thread_id(metadata)
msg = await self._bot.send_audio(
chat_id=int(chat_id),
audio=audio_file,
caption=caption[:1024] if caption else None,
reply_to_message_id=int(reply_to) if reply_to else None,
message_thread_id=self._message_thread_id_for_send(_audio_thread),
reply_to_id = self._reply_to_message_id_for_send(reply_to, metadata)
audio_thread_kwargs = self._thread_kwargs_for_send(
chat_id,
_audio_thread,
metadata,
reply_to_message_id=reply_to_id,
)
msg = await self._send_with_dm_topic_reply_anchor_retry(
self._bot.send_audio,
{
"chat_id": int(chat_id),
"audio": audio_file,
"caption": caption[:1024] if caption else None,
"reply_to_message_id": reply_to_id,
**audio_thread_kwargs,
},
metadata,
reply_to_id,
"audio",
reset_media=lambda: audio_file.seek(0),
)
else:
# Formats Telegram can't play natively (.wav, .flac, ...)
@@ -2172,7 +2408,7 @@ class TelegramAdapter(BasePlatformAdapter):
e,
exc_info=True,
)
return await super().send_voice(chat_id, audio_path, caption, reply_to)
return await super().send_voice(chat_id, audio_path, caption, reply_to, metadata=metadata)
async def send_multiple_images(
self,
@@ -2227,7 +2463,6 @@ class TelegramAdapter(BasePlatformAdapter):
from urllib.parse import unquote as _unquote
_thread = self._metadata_thread_id(metadata)
_thread_id = self._message_thread_id_for_send(_thread)
# Chunk into groups of 10 (Telegram's album limit)
CHUNK = 10
@@ -2263,10 +2498,33 @@ class TelegramAdapter(BasePlatformAdapter):
"[%s] Sending media group of %d photo(s) (chunk %d/%d)",
self.name, len(media), chunk_idx + 1, len(chunks),
)
await self._bot.send_media_group(
chat_id=int(chat_id),
media=media,
message_thread_id=_thread_id,
reply_to_id = self._reply_to_message_id_for_send(None, metadata)
thread_kwargs = self._thread_kwargs_for_send(
chat_id,
_thread,
metadata,
reply_to_message_id=reply_to_id,
)
def _reset_opened_files() -> None:
for fh in opened_files:
try:
fh.seek(0)
except Exception:
pass
await self._send_with_dm_topic_reply_anchor_retry(
self._bot.send_media_group,
{
"chat_id": int(chat_id),
"media": media,
"reply_to_message_id": reply_to_id,
**thread_kwargs,
},
metadata,
reply_to_id,
"media group",
reset_media=_reset_opened_files,
)
except Exception as e:
logger.warning(
@@ -2303,13 +2561,27 @@ class TelegramAdapter(BasePlatformAdapter):
return SendResult(success=False, error=self._missing_media_path_error("Image", image_path))
_thread = self._metadata_thread_id(metadata)
reply_to_id = self._reply_to_message_id_for_send(reply_to, metadata)
thread_kwargs = self._thread_kwargs_for_send(
chat_id,
_thread,
metadata,
reply_to_message_id=reply_to_id,
)
with open(image_path, "rb") as image_file:
msg = await self._bot.send_photo(
chat_id=int(chat_id),
photo=image_file,
caption=caption[:1024] if caption else None,
reply_to_message_id=int(reply_to) if reply_to else None,
message_thread_id=self._message_thread_id_for_send(_thread),
msg = await self._send_with_dm_topic_reply_anchor_retry(
self._bot.send_photo,
{
"chat_id": int(chat_id),
"photo": image_file,
"caption": caption[:1024] if caption else None,
"reply_to_message_id": reply_to_id,
**thread_kwargs,
},
metadata,
reply_to_id,
"photo",
reset_media=lambda: image_file.seek(0),
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
@@ -2360,7 +2632,7 @@ class TelegramAdapter(BasePlatformAdapter):
doc_err,
exc_info=True,
)
return await super().send_image_file(chat_id, image_path, caption, reply_to)
return await super().send_image_file(chat_id, image_path, caption, reply_to, metadata=metadata)
async def send_document(
self,
@@ -2382,20 +2654,34 @@ class TelegramAdapter(BasePlatformAdapter):
display_name = file_name or os.path.basename(file_path)
_thread = self._metadata_thread_id(metadata)
reply_to_id = self._reply_to_message_id_for_send(reply_to, metadata)
thread_kwargs = self._thread_kwargs_for_send(
chat_id,
_thread,
metadata,
reply_to_message_id=reply_to_id,
)
with open(file_path, "rb") as f:
msg = await self._bot.send_document(
chat_id=int(chat_id),
document=f,
filename=display_name,
caption=caption[:1024] if caption else None,
reply_to_message_id=int(reply_to) if reply_to else None,
message_thread_id=self._message_thread_id_for_send(_thread),
msg = await self._send_with_dm_topic_reply_anchor_retry(
self._bot.send_document,
{
"chat_id": int(chat_id),
"document": f,
"filename": display_name,
"caption": caption[:1024] if caption else None,
"reply_to_message_id": reply_to_id,
**thread_kwargs,
},
metadata,
reply_to_id,
"document",
reset_media=lambda: f.seek(0),
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
print(f"[{self.name}] Failed to send document: {e}")
return await super().send_document(chat_id, file_path, caption, file_name, reply_to)
return await super().send_document(chat_id, file_path, caption, file_name, reply_to, metadata=metadata)
async def send_video(
self,
@@ -2415,18 +2701,32 @@ class TelegramAdapter(BasePlatformAdapter):
return SendResult(success=False, error=self._missing_media_path_error("Video", video_path))
_thread = self._metadata_thread_id(metadata)
reply_to_id = self._reply_to_message_id_for_send(reply_to, metadata)
thread_kwargs = self._thread_kwargs_for_send(
chat_id,
_thread,
metadata,
reply_to_message_id=reply_to_id,
)
with open(video_path, "rb") as f:
msg = await self._bot.send_video(
chat_id=int(chat_id),
video=f,
caption=caption[:1024] if caption else None,
reply_to_message_id=int(reply_to) if reply_to else None,
message_thread_id=self._message_thread_id_for_send(_thread),
msg = await self._send_with_dm_topic_reply_anchor_retry(
self._bot.send_video,
{
"chat_id": int(chat_id),
"video": f,
"caption": caption[:1024] if caption else None,
"reply_to_message_id": reply_to_id,
**thread_kwargs,
},
metadata,
reply_to_id,
"video",
reset_media=lambda: f.seek(0),
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
print(f"[{self.name}] Failed to send video: {e}")
return await super().send_video(chat_id, video_path, caption, reply_to)
return await super().send_video(chat_id, video_path, caption, reply_to, metadata=metadata)
async def send_image(
self,
@@ -2452,12 +2752,25 @@ class TelegramAdapter(BasePlatformAdapter):
try:
# Telegram can send photos directly from URLs (up to ~5MB)
_photo_thread = self._metadata_thread_id(metadata)
msg = await self._bot.send_photo(
chat_id=int(chat_id),
photo=image_url,
caption=caption[:1024] if caption else None, # Telegram caption limit
reply_to_message_id=int(reply_to) if reply_to else None,
message_thread_id=self._message_thread_id_for_send(_photo_thread),
reply_to_id = self._reply_to_message_id_for_send(reply_to, metadata)
photo_thread_kwargs = self._thread_kwargs_for_send(
chat_id,
_photo_thread,
metadata,
reply_to_message_id=reply_to_id,
)
msg = await self._send_with_dm_topic_reply_anchor_retry(
self._bot.send_photo,
{
"chat_id": int(chat_id),
"photo": image_url,
"caption": caption[:1024] if caption else None,
"reply_to_message_id": reply_to_id,
**photo_thread_kwargs,
},
metadata,
reply_to_id,
"URL photo",
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
@@ -2474,13 +2787,25 @@ class TelegramAdapter(BasePlatformAdapter):
resp = await client.get(image_url)
resp.raise_for_status()
image_data = resp.content
msg = await self._bot.send_photo(
chat_id=int(chat_id),
photo=image_data,
caption=caption[:1024] if caption else None,
reply_to_message_id=int(reply_to) if reply_to else None,
message_thread_id=self._message_thread_id_for_send(_photo_thread),
upload_thread_kwargs = self._thread_kwargs_for_send(
chat_id,
_photo_thread,
metadata,
reply_to_message_id=reply_to_id,
)
msg = await self._send_with_dm_topic_reply_anchor_retry(
self._bot.send_photo,
{
"chat_id": int(chat_id),
"photo": image_data,
"caption": caption[:1024] if caption else None,
"reply_to_message_id": reply_to_id,
**upload_thread_kwargs,
},
metadata,
reply_to_id,
"uploaded photo",
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e2:
@@ -2491,7 +2816,7 @@ class TelegramAdapter(BasePlatformAdapter):
exc_info=True,
)
# Final fallback: send URL as text
return await super().send_image(chat_id, image_url, caption, reply_to)
return await super().send_image(chat_id, image_url, caption, reply_to, metadata=metadata)
async def send_animation(
self,
@@ -2507,12 +2832,25 @@ class TelegramAdapter(BasePlatformAdapter):
try:
_anim_thread = self._metadata_thread_id(metadata)
msg = await self._bot.send_animation(
chat_id=int(chat_id),
animation=animation_url,
caption=caption[:1024] if caption else None,
reply_to_message_id=int(reply_to) if reply_to else None,
message_thread_id=self._message_thread_id_for_send(_anim_thread),
reply_to_id = self._reply_to_message_id_for_send(reply_to, metadata)
animation_thread_kwargs = self._thread_kwargs_for_send(
chat_id,
_anim_thread,
metadata,
reply_to_message_id=reply_to_id,
)
msg = await self._send_with_dm_topic_reply_anchor_retry(
self._bot.send_animation,
{
"chat_id": int(chat_id),
"animation": animation_url,
"caption": caption[:1024] if caption else None,
"reply_to_message_id": reply_to_id,
**animation_thread_kwargs,
},
metadata,
reply_to_id,
"animation",
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
@@ -2523,13 +2861,21 @@ class TelegramAdapter(BasePlatformAdapter):
exc_info=True,
)
# Fallback: try as a regular photo
return await self.send_image(chat_id, animation_url, caption, reply_to)
return await self.send_image(chat_id, animation_url, caption, reply_to, metadata=metadata)
async def send_typing(self, chat_id: str, metadata: Optional[Dict[str, Any]] = None) -> None:
"""Send typing indicator."""
if self._bot:
try:
_typing_thread = self._metadata_thread_id(metadata)
# Skip the Bot API call entirely for Hermes-created DM topic
# lanes: send_chat_action only accepts message_thread_id, which
# Telegram's Bot API 10.0 rejects for these lanes. The send
# path uses the reply-anchor fallback instead, but typing has
# no equivalent — skipping avoids noisy "thread not found"
# debug logs on every typing tick.
if metadata and metadata.get("telegram_dm_topic_reply_fallback"):
return
message_thread_id = self._message_thread_id_for_typing(_typing_thread)
# No retry-without-thread fallback here: _message_thread_id_for_typing
# already maps the forum General topic to None, so any non-None value
+125 -37
View File
@@ -61,6 +61,7 @@ from hermes_cli.config import cfg_get
_AGENT_CACHE_MAX_SIZE = 128
_AGENT_CACHE_IDLE_TTL_SECS = 3600.0 # evict agents idle for >1h
_PLATFORM_CONNECT_TIMEOUT_SECS_DEFAULT = 30.0
_ADAPTER_DISCONNECT_TIMEOUT_SECS_DEFAULT = 5.0
_TELEGRAM_COMMAND_MENTION_RE = re.compile(r"(?<![\w:/])/([A-Za-z0-9][A-Za-z0-9_-]*)")
@@ -570,6 +571,7 @@ from gateway.platforms.base import (
EphemeralReply,
MessageEvent,
MessageType,
_reply_anchor_for_event,
merge_pending_message_event,
)
from gateway.restart import (
@@ -1216,7 +1218,13 @@ class GatewayRunner:
from hermes_state import SessionDB
self._session_db = SessionDB()
except Exception as e:
logger.debug("SQLite session store not available: %s", e)
# WARNING (not DEBUG) so the failure appears in errors.log — matches
# cli.py's handling of the same init path. Users hitting NFS-mounted
# HERMES_HOME silently lost /resume, /title, /history, /branch, and
# session search without this. The underlying cause (usually
# "locking protocol" from NFS) is now also captured by
# hermes_state.get_last_init_error() for slash-command error strings.
logger.warning("SQLite session store not available: %s", e)
# Opportunistic state.db maintenance: prune ended sessions older
# than sessions.retention_days + optional VACUUM. Tracks last-run
@@ -1494,8 +1502,18 @@ class GatewayRunner:
Must tolerate partial-init state and never raise, since callers
use it inside error-handling blocks.
"""
timeout = self._adapter_disconnect_timeout_secs()
try:
await adapter.disconnect()
if timeout <= 0:
await adapter.disconnect()
else:
await asyncio.wait_for(adapter.disconnect(), timeout=timeout)
except asyncio.TimeoutError:
logger.warning(
"Timed out after %.1fs while disconnecting %s adapter; continuing shutdown",
timeout,
platform.value if platform is not None else "adapter",
)
except Exception as e:
logger.debug(
"Defensive %s disconnect after failed connect raised: %s",
@@ -1503,6 +1521,21 @@ class GatewayRunner:
e,
)
def _adapter_disconnect_timeout_secs(self) -> float:
"""Return the per-adapter disconnect timeout used during shutdown."""
raw = os.getenv("HERMES_GATEWAY_ADAPTER_DISCONNECT_TIMEOUT", "").strip()
if raw:
try:
timeout = float(raw)
except ValueError:
logger.warning(
"Ignoring invalid HERMES_GATEWAY_ADAPTER_DISCONNECT_TIMEOUT=%r",
raw,
)
else:
return max(0.0, timeout)
return _ADAPTER_DISCONNECT_TIMEOUT_SECS_DEFAULT
def _platform_connect_timeout_secs(self) -> float:
"""Return the per-platform connect timeout used during startup/retry."""
raw = os.getenv("HERMES_GATEWAY_PLATFORM_CONNECT_TIMEOUT", "").strip()
@@ -2380,7 +2413,8 @@ class GatewayRunner:
if not adapter:
return True
thread_meta = {"thread_id": event.source.thread_id} if event.source.thread_id else None
reply_anchor = self._reply_anchor_for_event(event)
thread_meta = self._thread_metadata_for_source(event.source, reply_anchor)
if self._queue_during_drain_enabled():
self._queue_or_replace_pending_event(session_key, event)
message = f"⏳ Gateway {self._status_action_gerund()} — queued for the next turn after it comes back."
@@ -2390,7 +2424,13 @@ class GatewayRunner:
await adapter._send_with_retry(
chat_id=event.source.chat_id,
content=message,
reply_to=event.message_id,
reply_to=(
reply_anchor
if event.source.platform == Platform.TELEGRAM
and event.source.chat_type == "dm"
and event.source.thread_id
else (None if event.source.platform == Platform.TELEGRAM and event.source.thread_id else event.message_id)
),
metadata=thread_meta,
)
return True
@@ -2527,12 +2567,19 @@ class GatewayRunner:
except Exception as _onb_err:
logger.debug("Failed to apply busy-input onboarding hint: %s", _onb_err)
thread_meta = {"thread_id": event.source.thread_id} if event.source.thread_id else None
reply_anchor = self._reply_anchor_for_event(event)
thread_meta = self._thread_metadata_for_source(event.source, reply_anchor)
try:
await adapter._send_with_retry(
chat_id=event.source.chat_id,
content=message,
reply_to=event.message_id,
reply_to=(
reply_anchor
if event.source.platform == Platform.TELEGRAM
and event.source.chat_type == "dm"
and event.source.thread_id
else (None if event.source.platform == Platform.TELEGRAM and event.source.thread_id else event.message_id)
),
metadata=thread_meta,
)
except Exception as e:
@@ -5037,7 +5084,7 @@ class GatewayRunner:
if config and hasattr(config, "get_notice_delivery"):
notice_delivery = config.get_notice_delivery(source.platform)
metadata = {"thread_id": source.thread_id} if getattr(source, "thread_id", None) else None
metadata = self._thread_metadata_for_source(source)
if notice_delivery == "private" and getattr(source, "user_id", None):
try:
result = await adapter.send_private_notice(
@@ -6132,7 +6179,7 @@ class GatewayRunner:
)
if any(marker in message_text for marker in _stt_fail_markers):
_stt_adapter = self.adapters.get(source.platform)
_stt_meta = {"thread_id": source.thread_id} if source.thread_id else None
_stt_meta = self._thread_metadata_for_source(source, self._reply_anchor_for_event(event))
if _stt_adapter:
try:
_stt_msg = (
@@ -6653,7 +6700,7 @@ class GatewayRunner:
f"{_compress_token_threshold:,}",
)
_hyg_meta = {"thread_id": source.thread_id} if source.thread_id else None
_hyg_meta = self._thread_metadata_for_source(source, self._reply_anchor_for_event(event))
try:
from run_agent import AIAgent
@@ -6882,7 +6929,7 @@ class GatewayRunner:
session_id=session_entry.session_id,
session_key=session_key,
run_generation=run_generation,
event_message_id=event.message_id,
event_message_id=self._reply_anchor_for_event(event),
channel_prompt=event.channel_prompt,
)
@@ -7223,7 +7270,11 @@ class GatewayRunner:
try:
_foot_adapter = self.adapters.get(source.platform)
if _foot_adapter:
await _foot_adapter.send(source.chat_id, _footer_line)
await _foot_adapter.send(
source.chat_id,
_footer_line,
metadata=self._thread_metadata_for_source(source, self._reply_anchor_for_event(event)),
)
except Exception as _e:
logger.debug("trailing footer send failed: %s", _e)
return None
@@ -8238,7 +8289,7 @@ class GatewayRunner:
lines.append("_(session only — use `/model <name> --global` to persist)_")
return "\n".join(lines)
metadata = {"thread_id": source.thread_id} if source.thread_id else None
metadata = self._thread_metadata_for_source(source, self._reply_anchor_for_event(event))
result = await adapter.send_model_picker(
chat_id=source.chat_id,
providers=providers,
@@ -8659,7 +8710,7 @@ class GatewayRunner:
try:
metadata = self._thread_metadata_for_source(source)
except Exception:
metadata = {"thread_id": source.thread_id} if getattr(source, "thread_id", None) else None
metadata = None
result = await adapter.send(source.chat_id, message, metadata=metadata)
if result is not None and not getattr(result, "success", True):
@@ -9224,13 +9275,15 @@ class GatewayRunner:
and adapter.is_in_voice_channel(guild_id)):
await adapter.play_in_voice_channel(guild_id, actual_path)
elif adapter and hasattr(adapter, "send_voice"):
reply_anchor = self._reply_anchor_for_event(event)
thread_meta = self._thread_metadata_for_source(event.source, reply_anchor)
send_kwargs: Dict[str, Any] = {
"chat_id": event.source.chat_id,
"audio_path": actual_path,
"reply_to": event.message_id,
"reply_to": reply_anchor,
}
if event.source.thread_id:
send_kwargs["metadata"] = {"thread_id": event.source.thread_id}
if thread_meta:
send_kwargs["metadata"] = thread_meta
await adapter.send_voice(**send_kwargs)
except Exception as e:
logger.warning("Auto voice reply failed: %s", e, exc_info=True)
@@ -9267,7 +9320,7 @@ class GatewayRunner:
_, cleaned = adapter.extract_images(response)
local_files, _ = adapter.extract_local_files(cleaned)
_thread_meta = {"thread_id": event.source.thread_id} if event.source.thread_id else None
_thread_meta = self._thread_metadata_for_source(event.source, self._reply_anchor_for_event(event))
from gateway.platforms.base import should_send_media_as_audio
@@ -9431,9 +9484,16 @@ class GatewayRunner:
source = event.source
task_id = f"bg_{datetime.now().strftime('%H%M%S')}_{os.urandom(3).hex()}"
event_message_id = self._reply_anchor_for_event(event)
# Fire-and-forget the background task
_task = asyncio.create_task(
self._run_background_task(prompt, source, task_id)
self._run_background_task(
prompt,
source,
task_id,
event_message_id=event_message_id,
)
)
self._background_tasks.add(_task)
_task.add_done_callback(self._background_tasks.discard)
@@ -9442,7 +9502,11 @@ class GatewayRunner:
return f'🔄 Background task started: "{preview}"\nTask ID: {task_id}\nYou can keep chatting — results will appear when done.'
async def _run_background_task(
self, prompt: str, source: "SessionSource", task_id: str
self,
prompt: str,
source: "SessionSource",
task_id: str,
event_message_id: Optional[str] = None,
) -> None:
"""Execute a background agent task and deliver the result to the chat."""
from run_agent import AIAgent
@@ -9452,7 +9516,7 @@ class GatewayRunner:
logger.warning("No adapter for platform %s in background task %s", source.platform, task_id)
return
_thread_metadata = {"thread_id": source.thread_id} if source.thread_id else None
_thread_metadata = self._thread_metadata_for_source(source, event_message_id)
try:
user_config = _load_gateway_config()
@@ -10316,7 +10380,8 @@ class GatewayRunner:
def _disable_telegram_topic_mode_for_chat(self, source: SessionSource) -> str:
"""Cleanly disable topic mode for a chat via /topic off."""
if not self._session_db:
return "Session database not available."
from hermes_state import format_session_db_unavailable
return format_session_db_unavailable()
chat_id = str(source.chat_id or "")
if not chat_id:
return "Could not determine chat ID."
@@ -10354,7 +10419,8 @@ class GatewayRunner:
if source.platform != Platform.TELEGRAM or source.chat_type != "dm":
return "The /topic command is only available in Telegram private chats."
if not self._session_db:
return "Session database not available."
from hermes_state import format_session_db_unavailable
return format_session_db_unavailable()
# Authorization: /topic activates multi-session mode and mutates
# SQLite side tables. Unauthorized senders (not in allowlist) must
@@ -10568,7 +10634,8 @@ class GatewayRunner:
session_id = session_entry.session_id
if not self._session_db:
return "Session database not available."
from hermes_state import format_session_db_unavailable
return format_session_db_unavailable()
# Ensure session exists in SQLite DB (it may only exist in session_store
# if this is the first command in a new session)
@@ -10612,7 +10679,8 @@ class GatewayRunner:
async def _handle_resume_command(self, event: MessageEvent) -> str:
"""Handle /resume command — switch to a previously-named session."""
if not self._session_db:
return "Session database not available."
from hermes_state import format_session_db_unavailable
return format_session_db_unavailable()
source = event.source
session_key = self._session_key_for_source(source)
@@ -10699,7 +10767,8 @@ class GatewayRunner:
import uuid as _uuid
if not self._session_db:
return "Session database not available."
from hermes_state import format_session_db_unavailable
return format_session_db_unavailable()
source = event.source
session_key = self._session_key_for_source(source)
@@ -11267,7 +11336,7 @@ class GatewayRunner:
_slash_confirm_mod.register(session_key, confirm_id, command, handler)
adapter = self.adapters.get(source.platform)
metadata = self._thread_metadata_for_source(source)
metadata = self._thread_metadata_for_source(source, self._reply_anchor_for_event(event))
used_buttons = False
if adapter is not None:
@@ -11307,12 +11376,30 @@ class GatewayRunner:
except Exception:
return {}
def _thread_metadata_for_source(self, source) -> Optional[Dict[str, Any]]:
def _thread_metadata_for_source(
self,
source,
reply_to_message_id: Optional[str] = None,
) -> Optional[Dict[str, Any]]:
"""Build the metadata dict platforms need for thread-aware replies."""
thread_id = getattr(source, "thread_id", None)
if thread_id is None:
return None
return {"thread_id": thread_id}
metadata: Dict[str, Any] = {"thread_id": thread_id}
if (
getattr(source, "platform", None) == Platform.TELEGRAM
and getattr(source, "chat_type", None) == "dm"
):
metadata["telegram_dm_topic_reply_fallback"] = True
anchor = reply_to_message_id or getattr(source, "message_id", None)
if anchor is not None:
metadata["telegram_reply_to_message_id"] = str(anchor)
return metadata
@staticmethod
def _reply_anchor_for_event(event: MessageEvent) -> Optional[str]:
"""Return the platform-specific reply anchor for GatewayRunner sends."""
return _reply_anchor_for_event(event)
# ------------------------------------------------------------------
@@ -13105,10 +13192,7 @@ class GatewayRunner:
else bool(_plat_streaming)
)
if source.thread_id:
_thread_metadata: Optional[Dict[str, Any]] = {"thread_id": source.thread_id}
else:
_thread_metadata = None
_thread_metadata: Optional[Dict[str, Any]] = self._thread_metadata_for_source(source, event_message_id)
if _streaming_enabled:
try:
@@ -13538,8 +13622,8 @@ class GatewayRunner:
#
# Threading metadata is platform-specific:
# - Slack DM threading needs event_message_id fallback (reply thread)
# - Telegram uses message_thread_id only for forum topics; passing a
# normal DM/group message id as thread_id causes send failures
# - Telegram forum topics use message_thread_id; Hermes-created private
# DM topic lanes require both thread metadata and a reply anchor
# - Feishu only honors reply_in_thread when sending a reply, so topic
# progress uses the triggering event message as the reply target
# - Other platforms should use explicit source.thread_id only
@@ -13547,7 +13631,11 @@ class GatewayRunner:
_progress_thread_id = source.thread_id or event_message_id
else:
_progress_thread_id = source.thread_id
_progress_metadata = {"thread_id": _progress_thread_id} if _progress_thread_id else None
_progress_metadata = (
self._thread_metadata_for_source(source, event_message_id)
if _progress_thread_id == source.thread_id
else {"thread_id": _progress_thread_id}
) if _progress_thread_id else None
_progress_reply_to = (
event_message_id
if source.platform == Platform.FEISHU and source.thread_id and event_message_id
@@ -13807,7 +13895,7 @@ class GatewayRunner:
"reply_to_message_id": event_message_id,
}
else:
_status_thread_metadata = {"thread_id": _progress_thread_id} if _progress_thread_id else None
_status_thread_metadata = self._thread_metadata_for_source(source, event_message_id) if _progress_thread_id else None
def _status_callback_sync(event_type: str, message: str) -> None:
if not _status_adapter or not _run_still_current():
@@ -15092,7 +15180,7 @@ class GatewayRunner:
)
if next_message is None:
return result
next_message_id = getattr(pending_event, "message_id", None)
next_message_id = self._reply_anchor_for_event(pending_event)
next_channel_prompt = getattr(pending_event, "channel_prompt", None)
# Restart typing indicator so the user sees activity while
+124 -43
View File
@@ -197,6 +197,13 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
inference_base_url=DEFAULT_COPILOT_ACP_BASE_URL,
base_url_env_var="COPILOT_ACP_BASE_URL",
),
"codex-cli": ProviderConfig(
id="codex-cli",
name="OpenAI Codex CLI",
auth_type="external_process",
inference_base_url="codex-cli://local",
base_url_env_var="CODEX_CLI_BASE_URL",
),
"gemini": ProviderConfig(
id="gemini",
name="Google AI Studio",
@@ -1377,6 +1384,7 @@ def resolve_provider(
"github": "copilot", "github-copilot": "copilot",
"github-models": "copilot", "github-model": "copilot",
"github-copilot-acp": "copilot-acp", "copilot-acp-agent": "copilot-acp",
"codexcli": "codex-cli", "openai-codex-cli": "codex-cli",
"aigateway": "ai-gateway", "vercel": "ai-gateway", "vercel-ai-gateway": "ai-gateway",
"opencode": "opencode-zen", "zen": "opencode-zen",
"qwen-portal": "qwen-oauth", "qwen-cli": "qwen-oauth", "qwen-oauth": "qwen-oauth", "google-gemini-cli": "google-gemini-cli", "gemini-cli": "google-gemini-cli", "gemini-oauth": "google-gemini-cli",
@@ -4009,28 +4017,60 @@ def get_external_process_provider_status(provider_id: str) -> Dict[str, Any]:
if not pconfig or pconfig.auth_type != "external_process":
return {"configured": False}
command = (
os.getenv("HERMES_COPILOT_ACP_COMMAND", "").strip()
or os.getenv("COPILOT_CLI_PATH", "").strip()
or "copilot"
)
raw_args = os.getenv("HERMES_COPILOT_ACP_ARGS", "").strip()
args = shlex.split(raw_args) if raw_args else ["--acp", "--stdio"]
base_url = os.getenv(pconfig.base_url_env_var, "").strip() if pconfig.base_url_env_var else ""
if not base_url:
base_url = pconfig.inference_base_url
if provider_id == "copilot-acp":
command = (
os.getenv("HERMES_COPILOT_ACP_COMMAND", "").strip()
or os.getenv("COPILOT_CLI_PATH", "").strip()
or "copilot"
)
raw_args = os.getenv("HERMES_COPILOT_ACP_ARGS", "").strip()
args = shlex.split(raw_args) if raw_args else ["--acp", "--stdio"]
base_url = os.getenv(pconfig.base_url_env_var, "").strip() if pconfig.base_url_env_var else ""
if not base_url:
base_url = pconfig.inference_base_url
resolved_command = shutil.which(command) if command else None
return {
"configured": bool(resolved_command or base_url.startswith("acp+tcp://")),
"provider": provider_id,
"name": pconfig.name,
"command": command,
"args": args,
"resolved_command": resolved_command,
"base_url": base_url,
"logged_in": bool(resolved_command or base_url.startswith("acp+tcp://")),
}
resolved_command = shutil.which(command) if command else None
return {
"configured": bool(resolved_command or base_url.startswith("acp+tcp://")),
"provider": provider_id,
"name": pconfig.name,
"command": command,
"args": args,
"resolved_command": resolved_command,
"base_url": base_url,
"logged_in": bool(resolved_command or base_url.startswith("acp+tcp://")),
}
if provider_id == "codex-cli":
command = (
os.getenv("HERMES_CODEX_CLI_COMMAND", "").strip()
or os.getenv("CODEX_CLI_PATH", "").strip()
or "codex"
)
raw_args = os.getenv("HERMES_CODEX_CLI_ARGS", "").strip()
default_args = [
"exec",
"--json",
"--ephemeral",
"--dangerously-bypass-approvals-and-sandbox",
"--skip-git-repo-check",
]
args = shlex.split(raw_args) if raw_args else default_args
base_url = os.getenv(pconfig.base_url_env_var, "").strip() if pconfig.base_url_env_var else ""
if not base_url:
base_url = pconfig.inference_base_url
resolved_command = shutil.which(command) if command else None
return {
"configured": bool(resolved_command),
"provider": provider_id,
"name": pconfig.name,
"command": command,
"args": args,
"resolved_command": resolved_command,
"base_url": base_url,
"logged_in": bool(resolved_command),
}
return {"configured": False}
def get_auth_status(provider_id: Optional[str] = None) -> Dict[str, Any]:
@@ -4048,6 +4088,8 @@ def get_auth_status(provider_id: Optional[str] = None) -> Dict[str, Any]:
return get_gemini_oauth_auth_status()
if target == "copilot-acp":
return get_external_process_provider_status(target)
if target == "codex-cli":
return get_external_process_provider_status(target)
# API-key providers
pconfig = PROVIDER_REGISTRY.get(target)
if pconfig and pconfig.auth_type == "api_key":
@@ -4121,30 +4163,69 @@ def resolve_external_process_provider_credentials(provider_id: str) -> Dict[str,
if not base_url:
base_url = pconfig.inference_base_url
command = (
os.getenv("HERMES_COPILOT_ACP_COMMAND", "").strip()
or os.getenv("COPILOT_CLI_PATH", "").strip()
or "copilot"
)
raw_args = os.getenv("HERMES_COPILOT_ACP_ARGS", "").strip()
args = shlex.split(raw_args) if raw_args else ["--acp", "--stdio"]
resolved_command = shutil.which(command) if command else None
if not resolved_command and not base_url.startswith("acp+tcp://"):
raise AuthError(
f"Could not find the Copilot CLI command '{command}'. "
"Install GitHub Copilot CLI or set HERMES_COPILOT_ACP_COMMAND/COPILOT_CLI_PATH.",
provider=provider_id,
code="missing_copilot_cli",
if provider_id == "copilot-acp":
command = (
os.getenv("HERMES_COPILOT_ACP_COMMAND", "").strip()
or os.getenv("COPILOT_CLI_PATH", "").strip()
or "copilot"
)
raw_args = os.getenv("HERMES_COPILOT_ACP_ARGS", "").strip()
args = shlex.split(raw_args) if raw_args else ["--acp", "--stdio"]
resolved_command = shutil.which(command) if command else None
if not resolved_command and not base_url.startswith("acp+tcp://"):
raise AuthError(
f"Could not find the Copilot CLI command '{command}'. "
"Install GitHub Copilot CLI or set HERMES_COPILOT_ACP_COMMAND/COPILOT_CLI_PATH.",
provider=provider_id,
code="missing_copilot_cli",
)
return {
"provider": provider_id,
"api_key": "copilot-acp",
"base_url": base_url.rstrip("/"),
"command": resolved_command or command,
"args": args,
"source": "process",
}
return {
"provider": provider_id,
"api_key": "copilot-acp",
"base_url": base_url.rstrip("/"),
"command": resolved_command or command,
"args": args,
"source": "process",
}
if provider_id == "codex-cli":
command = (
os.getenv("HERMES_CODEX_CLI_COMMAND", "").strip()
or os.getenv("CODEX_CLI_PATH", "").strip()
or "codex"
)
raw_args = os.getenv("HERMES_CODEX_CLI_ARGS", "").strip()
default_args = [
"exec",
"--json",
"--ephemeral",
"--dangerously-bypass-approvals-and-sandbox",
"--skip-git-repo-check",
]
args = shlex.split(raw_args) if raw_args else default_args
resolved_command = shutil.which(command) if command else None
if not resolved_command:
raise AuthError(
f"Could not find the Codex CLI command '{command}'. "
"Install Codex CLI (npm install -g @openai/codex) or set "
"HERMES_CODEX_CLI_COMMAND / CODEX_CLI_PATH.",
provider=provider_id,
code="missing_codex_cli",
)
return {
"provider": provider_id,
"api_key": "codex-cli",
"base_url": base_url.rstrip("/"),
"command": resolved_command or command,
"args": args,
"source": "process",
}
raise AuthError(
f"Unknown external-process provider '{provider_id}'.",
provider=provider_id,
code="unknown_external_process_provider",
)
# =============================================================================
+14 -6
View File
@@ -206,9 +206,12 @@ def check_for_updates() -> Optional[int]:
if embedded_rev:
behind = _check_via_rev(embedded_rev)
else:
repo_dir = hermes_home / "hermes-agent"
# Prefer the running code's location over the profile-scoped path.
# $HERMES_HOME/hermes-agent/ may be a stale copy from --clone-all;
# Path(__file__) always resolves to the actual installed checkout.
repo_dir = Path(__file__).parent.parent.resolve()
if not (repo_dir / ".git").exists():
repo_dir = Path(__file__).parent.parent.resolve()
repo_dir = hermes_home / "hermes-agent"
if not (repo_dir / ".git").exists():
return None
behind = _check_via_local_git(repo_dir)
@@ -222,11 +225,16 @@ def check_for_updates() -> Optional[int]:
def _resolve_repo_dir() -> Optional[Path]:
"""Return the active Hermes git checkout, or None if this isn't a git install."""
hermes_home = get_hermes_home()
repo_dir = hermes_home / "hermes-agent"
"""Return the active Hermes git checkout, or None if this isn't a git install.
Prefers the running code's location over the profile-scoped path
because ``$HERMES_HOME/hermes-agent/`` may be a stale copy carried
over by ``--clone-all``.
"""
repo_dir = Path(__file__).parent.parent.resolve()
if not (repo_dir / ".git").exists():
repo_dir = Path(__file__).parent.parent.resolve()
hermes_home = get_hermes_home()
repo_dir = hermes_home / "hermes-agent"
return repo_dir if (repo_dir / ".git").exists() else None
+23 -1
View File
@@ -2387,7 +2387,15 @@ def systemd_stop(system: bool = False):
write_planned_stop_marker(pid)
except Exception:
pass
_run_systemctl(["stop", get_service_name()], system=system, check=True, timeout=90)
try:
_run_systemctl(["stop", get_service_name()], system=system, check=True, timeout=90)
except subprocess.TimeoutExpired:
label = _service_scope_label(system)
print(
f"Gateway {label} service is still stopping after 90s; "
"check `hermes gateway status` or logs for final shutdown state."
)
return
print(f"{_service_scope_label(system).capitalize()} service stopped")
@@ -2448,6 +2456,13 @@ def systemd_restart(system: bool = False):
_print_systemd_start_limit_wait(system=system)
return
raise
except subprocess.TimeoutExpired:
label = _service_scope_label(system)
print(
f"Gateway {label} service is still restarting after 90s; "
"check `hermes gateway status` or logs for final state."
)
return
_wait_for_systemd_service_restart(system=system, previous_pid=pid)
return
@@ -2467,6 +2482,13 @@ def systemd_restart(system: bool = False):
_print_systemd_start_limit_wait(system=system)
return
raise
except subprocess.TimeoutExpired:
label = _service_scope_label(system)
print(
f"Gateway {label} service is still restarting after 90s; "
"check `hermes gateway status` or logs for final state."
)
return
_wait_for_systemd_service_restart(system=system, previous_pid=pid)
+5 -1
View File
@@ -917,7 +917,11 @@ def connect(
needs_init = resolved not in _INITIALIZED_PATHS
conn = sqlite3.connect(str(path), isolation_level=None, timeout=30)
conn.row_factory = sqlite3.Row
conn.execute("PRAGMA journal_mode=WAL")
# WAL doesn't work on network filesystems (NFS/SMB/FUSE). Shared helper
# falls back to DELETE with one WARNING so kanban stays usable there.
# See hermes_state._WAL_INCOMPAT_MARKERS for detection logic.
from hermes_state import apply_wal_with_fallback
apply_wal_with_fallback(conn, db_label=f"kanban.db ({path.name})")
conn.execute("PRAGMA synchronous=NORMAL")
conn.execute("PRAGMA foreign_keys=ON")
if needs_init:
+1 -1
View File
@@ -8858,7 +8858,7 @@ def _build_provider_choices() -> list[str]:
except Exception:
# Fallback: static list guarantees the CLI always works
return [
"auto", "openrouter", "nous", "openai-codex", "copilot-acp", "copilot",
"auto", "openrouter", "nous", "openai-codex", "copilot-acp", "codex-cli", "copilot",
"anthropic", "gemini", "google-gemini-cli", "xai", "bedrock", "azure-foundry",
"ollama-cloud", "huggingface", "zai", "kimi-coding", "kimi-coding-cn",
"stepfun", "minimax", "minimax-cn", "kilocode", "xiaomi", "arcee",
+14
View File
@@ -207,6 +207,17 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"copilot-acp": [
"copilot-acp",
],
"codex-cli": [
"gpt-5.5",
"gpt-5.4",
"gpt-5.4-mini",
"gpt-5.3-codex",
"gpt-5.2-codex",
"gpt-5.1-codex-max",
"gpt-5.1-codex-mini",
"o3",
"o4-mini",
],
"copilot": [
"gpt-5.4",
"gpt-5.4-mini",
@@ -799,6 +810,7 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("qwen-oauth", "Qwen OAuth (Portal)", "Qwen OAuth (reuses local Qwen CLI login)"),
ProviderEntry("copilot", "GitHub Copilot", "GitHub Copilot (uses GITHUB_TOKEN or gh auth token)"),
ProviderEntry("copilot-acp", "GitHub Copilot ACP", "GitHub Copilot ACP (spawns `copilot --acp --stdio`)"),
ProviderEntry("codex-cli", "OpenAI Codex CLI", "OpenAI Codex CLI (spawns `codex exec --json` — text-only MVP, Hermes tools disabled)"),
ProviderEntry("huggingface", "Hugging Face", "Hugging Face Inference Providers (20+ open models)"),
ProviderEntry("gemini", "Google AI Studio", "Google AI Studio (Gemini models — native Gemini API)"),
ProviderEntry("google-gemini-cli", "Google Gemini (OAuth)", "Google Gemini via OAuth + Code Assist (free tier supported; no API key needed)"),
@@ -858,6 +870,8 @@ _PROVIDER_ALIASES = {
"github-model": "copilot",
"github-copilot-acp": "copilot-acp",
"copilot-acp-agent": "copilot-acp",
"codexcli": "codex-cli",
"openai-codex-cli": "codex-cli",
"google": "gemini",
"google-gemini": "gemini",
"google-ai-studio": "gemini",
+65 -14
View File
@@ -64,13 +64,39 @@ _CLONE_SUBDIR_FILES = [
"memories/USER.md",
]
# Runtime files stripped after --clone-all (shouldn't carry over)
_CLONE_ALL_STRIP = [
# Runtime files stripped after --clone-all (shouldn't carry over).
# Kept as a post-copy step rather than in the ignore filter because they
# are created dynamically during normal use and may be absent at copy time.
_CLONE_ALL_STRIP: list[str] = [
"gateway.pid",
"gateway_state.json",
"processes.json",
]
# Infrastructure artifacts excluded from --clone-all when the source is the
# default profile (``~/.hermes``). Named profiles never contain these
# directories at root, so the exclusion is gated to avoid silently dropping
# user data from a named-profile source.
#
# Rationale per item:
# hermes-agent — git repo checkout (~84 MB source + ~3 GB venv)
# .worktrees — git worktrees
# profiles — sibling named profiles (recursive copy never intended)
# bin — installed binaries (tirith etc., ~10 MB) shared per-host
# node_modules — npm packages (hundreds of MB)
#
# See ``_DEFAULT_EXPORT_EXCLUDE_ROOT`` below for the broader export-side
# exclusion list (export drops state.db / logs / caches too because the
# archive is a portable snapshot; clone-all keeps those because the cloned
# profile is meant to keep working immediately).
_CLONE_ALL_DEFAULT_EXCLUDE_ROOT: frozenset[str] = frozenset({
"hermes-agent",
".worktrees",
"profiles",
"bin",
"node_modules",
})
# Marker file written by `hermes profile create --no-skills`. When present in
# a profile's root, callers of seed_profile_skills() (fresh-create, `hermes
# update`'s all-profile sync, the web dashboard) skip bundled-skill seeding
@@ -89,23 +115,48 @@ def has_bundled_skills_opt_out(profile_dir: Path) -> bool:
def _clone_all_copytree_ignore(source_dir: Path):
"""Ignore ``profiles/`` at the root of *source_dir* only.
"""Exclude infrastructure artifacts when cloning a profile via --clone-all.
``~/.hermes`` contains ``profiles/<name>/`` for sibling named profiles.
``shutil.copytree`` would otherwise duplicate that entire tree inside the
new profile (recursive ``.../profiles/.../profiles/...``). Export already
excludes ``profiles`` via ``_DEFAULT_EXPORT_EXCLUDE_ROOT`` match that
behavior for ``--clone-all``.
Two categories:
1. Root-level entries in ``_CLONE_ALL_DEFAULT_EXCLUDE_ROOT`` known
Hermes infrastructure directories that only the default profile
(``~/.hermes``) ever contains. Gated on ``source_dir`` actually
being the default profile so a named-profile source never has its
own data silently dropped.
2. Universal exclusions at any depth Python bytecode caches that
are stale or regenerable (``__pycache__``, ``*.pyc``, ``*.pyo``)
and runtime sockets / temp files (``*.sock``, ``*.tmp``).
The export-side ignore (``_default_export_ignore``) uses the same
two-tier pattern with the broader ``_DEFAULT_EXPORT_EXCLUDE_ROOT`` set
because the export archive is a portable snapshot rather than a live
clone.
"""
source_resolved = source_dir.resolve()
is_default_source = source_resolved == _get_default_hermes_home().resolve()
def _ignore(directory: str, names: List[str]) -> List[str]:
try:
if Path(directory).resolve() == source_resolved:
return [n for n in names if n == "profiles"]
except (OSError, ValueError):
pass
return []
ignored: list[str] = []
for entry in names:
# Universal exclusions at any depth.
if (
entry == "__pycache__"
or entry.endswith((".pyc", ".pyo", ".sock", ".tmp"))
):
ignored.append(entry)
continue
# Root-level exclusions only apply when cloning the default profile.
if is_default_source:
try:
if Path(directory).resolve() == source_resolved:
if entry in _CLONE_ALL_DEFAULT_EXCLUDE_ROOT:
ignored.append(entry)
except (OSError, ValueError):
# ``resolve()`` can fail on unusual FS layouts (broken
# symlinks, missing parents). Fail open — better to
# over-copy than silently drop user data.
pass
return ignored
return _ignore
+13
View File
@@ -1137,6 +1137,19 @@ def resolve_runtime_provider(
"requested_provider": requested_provider,
}
if provider == "codex-cli":
creds = resolve_external_process_provider_credentials(provider)
return {
"provider": "codex-cli",
"api_mode": "chat_completions",
"base_url": creds.get("base_url", "").rstrip("/"),
"api_key": creds.get("api_key", ""),
"command": creds.get("command", ""),
"args": list(creds.get("args") or []),
"source": creds.get("source", "process"),
"requested_provider": requested_provider,
}
# Anthropic (native Messages API)
if provider == "anthropic":
# Allow base URL override from config.yaml model.base_url, but only
+6
View File
@@ -48,6 +48,11 @@ def _build_full_manifest(bot_name: str, bot_description: str) -> dict:
"background_color": "#1a1a2e",
},
"features": {
"app_home": {
"home_tab_enabled": False,
"messages_tab_enabled": True,
"messages_tab_read_only_enabled": False,
},
"bot_user": {
"display_name": bot_name[:80],
"always_online": True,
@@ -69,6 +74,7 @@ def _build_full_manifest(bot_name: str, bot_description: str) -> dict:
"files:read",
"files:write",
"groups:history",
"groups:read",
"im:history",
"im:read",
"im:write",
+3 -3
View File
@@ -533,7 +533,7 @@ async def get_status():
remote_health_body: dict | None = None
if not gateway_running and _GATEWAY_HEALTH_URL:
loop = asyncio.get_event_loop()
loop = asyncio.get_running_loop()
alive, remote_health_body = await loop.run_in_executor(
None, _probe_gateway_health
)
@@ -1845,7 +1845,7 @@ async def _start_device_code_flow(provider_id: str) -> Dict[str, Any]:
client_id=client_id,
scope=scope,
)
device_data = await asyncio.get_event_loop().run_in_executor(None, _do_nous_device_request)
device_data = await asyncio.get_running_loop().run_in_executor(None, _do_nous_device_request)
sid, sess = _new_oauth_session("nous", "device_code")
sess["device_code"] = str(device_data["device_code"])
sess["interval"] = int(device_data["interval"])
@@ -2134,7 +2134,7 @@ async def submit_oauth_code(provider_id: str, body: OAuthSubmitBody, request: Re
"""Submit the auth code for PKCE flows. Token-protected."""
_require_token(request)
if provider_id == "anthropic":
return await asyncio.get_event_loop().run_in_executor(
return await asyncio.get_running_loop().run_in_executor(
None, _submit_anthropic_pkce, body.session_id, body.code,
)
raise HTTPException(status_code=400, detail=f"submit not supported for {provider_id}")
+180 -16
View File
@@ -35,6 +35,153 @@ DEFAULT_DB_PATH = get_hermes_home() / "state.db"
SCHEMA_VERSION = 11
# ---------------------------------------------------------------------------
# WAL-compatibility fallback
# ---------------------------------------------------------------------------
# SQLite's WAL mode requires shared-memory (mmap) coordination and fcntl
# byte-range locks that don't reliably work on network filesystems (NFS,
# SMB/CIFS, some FUSE mounts, WSL1). Upstream documents this explicitly:
# https://www.sqlite.org/wal.html#sometimes_queries_return_sqlite_busy_in_wal_mode
#
# On those filesystems ``PRAGMA journal_mode=WAL`` raises
# ``sqlite3.OperationalError: locking protocol`` (SQLITE_PROTOCOL). If we
# propagate that, every feature backed by state.db / kanban.db breaks
# silently — /resume, /title, /history, /branch, kanban dispatcher, etc.
#
# Instead, fall back to ``journal_mode=DELETE`` (the pre-WAL default) which
# works on NFS. Concurrency drops — concurrent readers are blocked during
# a write — but the feature works.
_WAL_INCOMPAT_MARKERS = (
"locking protocol", # SQLITE_PROTOCOL on NFS/SMB
"not authorized", # Some FUSE mounts block WAL pragma outright
"disk i/o error", # Flaky network FS during WAL setup
)
# Last SessionDB() init error, per-process. Surfaced in /resume and
# related slash-command error strings so users know WHY the DB is
# unavailable instead of getting a bare "Session database not available."
# Only SessionDB.__init__ writes to this; kanban_db.connect() failures
# do not update it (by design — kanban failures are reported via their
# own caller's error handling, not via /resume-style slash commands).
_last_init_error: Optional[str] = None
_last_init_error_lock = threading.Lock()
# Paths for which we've already logged a WAL-fallback WARNING. Without
# this, kanban_db.connect() (called on every kanban operation — see
# hermes_cli/kanban_db.py for ~30 call sites) would re-log the same
# filesystem-incompat warning on every connection, filling errors.log.
_wal_fallback_warned_paths: set[str] = set()
_wal_fallback_warned_lock = threading.Lock()
def _set_last_init_error(msg: Optional[str]) -> None:
"""Record (or clear) the most recent state.db init failure.
Thread-safe via _last_init_error_lock. Callers pass a message to
record a failure or None to clear. SessionDB.__init__ only calls
this to SET on failure it deliberately does NOT clear on success,
because in a multi-threaded caller (e.g. gateway / web_server per-
request SessionDB() instantiation), a concurrent successful open
racing past a different thread's failure would erase the cause
string that thread's /resume handler is about to format. Explicit
clears (e.g. test fixtures) are still supported by passing None.
"""
global _last_init_error
with _last_init_error_lock:
_last_init_error = msg
def get_last_init_error() -> Optional[str]:
"""Return the most recent state.db init failure, if any.
Slash-command handlers (``/resume``, ``/title``, ``/history``, ``/branch``)
call this to surface the underlying cause in their error messages when
``_session_db is None``. Returns ``None`` if SessionDB initialized
successfully (or hasn't been attempted).
"""
return _last_init_error
def format_session_db_unavailable(prefix: str = "Session database not available") -> str:
"""Format a user-facing 'session DB unavailable' message with cause.
When ``SessionDB()`` init fails, callers set ``_session_db = None`` and
several slash commands (/resume, /title, /history, /branch) previously
responded with a bare ``"Session database not available."`` no
indication of WHY. This helper includes the captured cause (typically
``"locking protocol"`` from NFS/SMB) and points users at the known
culprit so they can fix it themselves.
Example output:
Session database not available: locking protocol (state.db may be
on NFS/SMB see https://www.sqlite.org/wal.html).
"""
cause = get_last_init_error()
if not cause:
return f"{prefix}."
hint = ""
if any(marker in cause.lower() for marker in _WAL_INCOMPAT_MARKERS):
hint = " (state.db may be on NFS/SMB/FUSE — see https://www.sqlite.org/wal.html)"
return f"{prefix}: {cause}{hint}."
def apply_wal_with_fallback(
conn: sqlite3.Connection,
*,
db_label: str = "state.db",
) -> str:
"""Set ``journal_mode=WAL`` on ``conn``, falling back to DELETE on failure.
Returns the journal mode actually set (``"wal"`` or ``"delete"``).
On WAL-incompatible filesystems (NFS, SMB, some FUSE), SQLite raises
``OperationalError("locking protocol")`` when setting WAL. We fall
back to DELETE mode the pre-WAL default, which works on NFS and
log one WARNING explaining why.
The WARNING is deduplicated per ``db_label``: repeated connections
to the same underlying DB (e.g. kanban_db.connect() which is called
on every kanban operation) log once per process, not once per call.
Different db_labels log independently, so state.db and kanban.db
each get one warning on the same NFS mount.
Shared by :class:`SessionDB` and ``hermes_cli.kanban_db.connect`` so
both databases get identical fallback behavior.
"""
try:
conn.execute("PRAGMA journal_mode=WAL")
return "wal"
except sqlite3.OperationalError as exc:
msg = str(exc).lower()
if not any(marker in msg for marker in _WAL_INCOMPAT_MARKERS):
# Unrelated OperationalError — don't silently swallow.
raise
_log_wal_fallback_once(db_label, exc)
conn.execute("PRAGMA journal_mode=DELETE")
return "delete"
def _log_wal_fallback_once(db_label: str, exc: Exception) -> None:
"""Log a single WARNING per (process, db_label) about WAL fallback.
Without this dedup, NFS users running kanban (which opens a fresh
connection on every operation see hermes_cli/kanban_db.py) would
fill errors.log with hundreds of identical warnings per hour.
"""
with _wal_fallback_warned_lock:
if db_label in _wal_fallback_warned_paths:
return
_wal_fallback_warned_paths.add(db_label)
logger.warning(
"%s: WAL journal_mode unsupported on this filesystem (%s) — "
"falling back to journal_mode=DELETE (slower rollback-journal "
"mode; reduces concurrency but works on NFS/SMB/FUSE). See "
"https://www.sqlite.org/wal.html for details. This warning "
"fires once per process per database.",
db_label,
exc,
)
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS schema_version (
version INTEGER NOT NULL
@@ -185,23 +332,40 @@ class SessionDB:
self._lock = threading.Lock()
self._write_count = 0
self._conn = sqlite3.connect(
str(self.db_path),
check_same_thread=False,
# Short timeout — application-level retry with random jitter
# handles contention instead of sitting in SQLite's internal
# busy handler for up to 30s.
timeout=1.0,
# Autocommit mode: Python's default isolation_level="" auto-starts
# transactions on DML, which conflicts with our explicit
# BEGIN IMMEDIATE. None = we manage transactions ourselves.
isolation_level=None,
)
self._conn.row_factory = sqlite3.Row
self._conn.execute("PRAGMA journal_mode=WAL")
self._conn.execute("PRAGMA foreign_keys=ON")
try:
self._conn = sqlite3.connect(
str(self.db_path),
check_same_thread=False,
# Short timeout — application-level retry with random jitter
# handles contention instead of sitting in SQLite's internal
# busy handler for up to 30s.
timeout=1.0,
# Autocommit mode: Python's default isolation_level=""
# auto-starts transactions on DML, which conflicts with our
# explicit BEGIN IMMEDIATE. None = we manage transactions
# ourselves.
isolation_level=None,
)
self._conn.row_factory = sqlite3.Row
apply_wal_with_fallback(self._conn, db_label="state.db")
self._conn.execute("PRAGMA foreign_keys=ON")
self._init_schema()
self._init_schema()
except Exception as exc:
# Capture the cause so /resume and friends can surface WHY the
# session DB is unavailable instead of a bare "Session database
# not available." Callers that catch this exception keep their
# existing ``self._session_db = None`` degradation path.
#
# Note: we deliberately do NOT clear _last_init_error on the
# success path (no else branch). In multi-threaded callers
# (gateway, web_server per-request SessionDB()), a concurrent
# successful open racing past this failure would erase the
# cause that another thread's /resume is about to format.
# Tests that need to reset the state can call
# ``hermes_state._set_last_init_error(None)`` explicitly.
_set_last_init_error(f"{type(exc).__name__}: {exc}")
raise
# ── Core write helper ──
+21 -1
View File
@@ -550,6 +550,16 @@ def coerce_tool_args(tool_name: str, args: Dict[str, Any]) -> Dict[str, Any]:
# nullable "null" → None).
args[key] = coerced
continue
# If the string looks like a JSON array but _coerce_value
# failed to parse it, warn clearly instead of silently wrapping.
if value.strip().startswith("["):
logger.warning(
"coerce_tool_args: %s.%s looks like a JSON array string "
"but could not be parsed — model may have emitted a "
"JSON-encoded string instead of a native array. "
"Falling back to single-element list.",
tool_name, key,
)
args[key] = [value]
logger.info(
"coerce_tool_args: wrapped bare string in list for %s.%s",
@@ -637,7 +647,12 @@ def _coerce_json(value: str, expected_python_type: type):
"""
try:
parsed = json.loads(value)
except (ValueError, TypeError):
except (ValueError, TypeError) as exc:
logger.warning(
"coerce_tool_args: failed to parse string as JSON for expected type %s: %s",
expected_python_type.__name__,
exc,
)
return value
if isinstance(parsed, expected_python_type):
logger.debug(
@@ -645,6 +660,11 @@ def _coerce_json(value: str, expected_python_type: type):
expected_python_type.__name__,
)
return parsed
logger.warning(
"coerce_tool_args: JSON-parsed value is %s, expected %s — skipping coercion",
type(parsed).__name__,
expected_python_type.__name__,
)
return value
+5 -1
View File
@@ -127,7 +127,11 @@ class MemoryStore:
def _init_db(self) -> None:
"""Create tables, indexes, and triggers if they do not exist. Enable WAL mode."""
self._conn.execute("PRAGMA journal_mode=WAL")
# Use the shared WAL-fallback helper so memory_store.db degrades
# gracefully on NFS/SMB/FUSE-mounted HERMES_HOME (same issue as
# state.db / kanban.db — see hermes_state._WAL_INCOMPAT_MARKERS).
from hermes_state import apply_wal_with_fallback
apply_wal_with_fallback(self._conn, db_label="memory_store.db (holographic)")
self._conn.executescript(_SCHEMA)
# Migrate: add hrr_vector column if missing (safe for existing databases)
columns = {row[1] for row in self._conn.execute("PRAGMA table_info(facts)").fetchall()}
+8 -7
View File
@@ -100,18 +100,19 @@ class _VikingClient:
raise ImportError("httpx is required for OpenViking: pip install httpx")
def _headers(self) -> dict:
# Only send tenant headers when the user actually configured them.
# Legacy installs had account/user defaulted to the literal string
# "default" — treat that as unset so authenticated remote servers
# that derive tenancy from the Bearer key aren't overridden by a
# bogus tenant value.
# Always send tenant headers when account/user are configured.
# OpenViking 0.3.x requires X-OpenViking-Account and X-OpenViking-User
# for ROOT API key requests to tenant-scoped APIs — omitting them
# causes INVALID_ARGUMENT errors even when account="default".
# User-level keys can omit them (server derives tenancy from the key),
# but ROOT keys must always include them explicitly.
h = {
"Content-Type": "application/json",
"X-OpenViking-Agent": self._agent,
}
if self._account and self._account != "default":
if self._account:
h["X-OpenViking-Account"] = self._account
if self._user and self._user != "default":
if self._user:
h["X-OpenViking-User"] = self._user
if self._api_key:
h["X-API-Key"] = self._api_key
+181 -1
View File
@@ -1010,13 +1010,30 @@ class GoogleChatAdapter(BasePlatformAdapter):
+ (sender_email or "unknown").replace("@", "_at_").replace(".", "_")
)
text = envelope.get("text", "") or ""
# Honor the relay's declared sender_type when present so the
# downstream BOT self-filter (sender_type == "BOT") fires for
# bot-originated messages forwarded by the relay. Hardcoding
# "HUMAN" here meant the bot would re-process its own replies
# if the relay forwarded them, and allowed a relay envelope to
# impersonate any allowlisted user without ever being marked
# as a bot. Default to "HUMAN" for backward compatibility when
# the relay does not provide the field.
#
# Operator contract: the relay MUST forward sender.type from
# the upstream Chat event as ``sender_type``. Relays that
# forward bot replies as HUMAN (or omit the field) cannot be
# distinguished from genuine humans here.
sender_type_raw = (envelope.get("sender_type") or "HUMAN")
sender_type = str(sender_type_raw).strip().upper() or "HUMAN"
if sender_type not in {"HUMAN", "BOT"}:
sender_type = "HUMAN"
msg: Dict[str, Any] = {
"name": envelope.get("message_name", "") or "",
"sender": {
"name": sender_name_surrogate,
"email": sender_email,
"displayName": sender_display,
"type": "HUMAN",
"type": sender_type,
},
"text": text,
"argumentText": text,
@@ -3019,6 +3036,165 @@ def interactive_setup() -> None:
print_info("Restart the gateway: hermes gateway restart")
# Strict resource-name pattern. ``spaces/<id>`` and ``users/<id>`` must
# only contain Google Chat's documented character set; anything else
# means a tampered chat_id trying to break out of the REST URL path
# (path traversal, ``?`` query injection, ``#`` fragment truncation).
_GCHAT_CHAT_ID_RE = re.compile(r"^(?:spaces|users)/[A-Za-z0-9_-]+$")
async def _standalone_send(
pconfig,
chat_id: str,
message: str,
*,
thread_id: Optional[str] = None,
media_files: Optional[List[str]] = None,
force_document: bool = False,
) -> Dict[str, Any]:
"""POST a single Google Chat message via the REST API without the SDK.
Used by ``tools/send_message_tool._send_via_adapter`` when the gateway
runner is not in this process (e.g. ``hermes cron`` running as a
separate process from ``hermes gateway``). Without this hook,
``deliver=google_chat`` cron jobs fail with ``No live adapter for
platform``.
Configuration: requires service-account credentials via
``GOOGLE_CHAT_SERVICE_ACCOUNT_JSON``, ``GOOGLE_APPLICATION_CREDENTIALS``,
or Application Default Credentials, and a space resource name as
``chat_id`` (e.g. ``spaces/AAAA-BBBB`` or ``users/<id>``).
Security: ``chat_id`` is validated against the documented Google Chat
resource-name character set before substitution into the REST URL so
a tampered value cannot path-traverse or query-inject.
``media_files`` and ``force_document`` are accepted for signature
parity but are not implemented for the standalone path; messages with
attachments send as text-only. The live adapter handles attachments.
"""
if not chat_id:
return {"error": "Google Chat standalone send: chat_id (space resource) is required"}
if not _GCHAT_CHAT_ID_RE.match(chat_id):
return {"error": (
f"Google Chat standalone send: chat_id {chat_id!r} must match "
f"'spaces/<id>' or 'users/<id>' with only [A-Za-z0-9_-] in the id"
)}
if thread_id is not None and not re.match(r"^spaces/[A-Za-z0-9_-]+/threads/[A-Za-z0-9_-]+$", thread_id):
return {"error": (
f"Google Chat standalone send: thread_id {thread_id!r} must match "
f"'spaces/<id>/threads/<id>'"
)}
extra = getattr(pconfig, "extra", {}) or {}
sa_value = (
extra.get("service_account_json")
or os.getenv("GOOGLE_CHAT_SERVICE_ACCOUNT_JSON")
or os.getenv("GOOGLE_APPLICATION_CREDENTIALS")
)
if service_account is None:
return {"error": "Google Chat standalone send: google-auth not installed"}
try:
from google.auth.transport.requests import Request as _GoogleAuthRequest
except Exception as e:
return {"error": f"Google Chat standalone send: google-auth import failed: {e}"}
try:
if sa_value:
stripped = sa_value.lstrip()
if stripped.startswith("{"):
try:
info = json.loads(sa_value)
except json.JSONDecodeError as exc:
return {"error": f"Google Chat standalone send: inline SA JSON is invalid: {exc}"}
creds = service_account.Credentials.from_service_account_info(info, scopes=_CHAT_SCOPES)
else:
if not os.path.exists(sa_value):
return {"error": f"Google Chat standalone send: SA JSON file not found at {sa_value}"}
try:
with open(sa_value, "r", encoding="utf-8") as fh:
info = json.load(fh)
except json.JSONDecodeError as exc:
return {"error": f"Google Chat standalone send: SA JSON file is invalid: {exc}"}
creds = service_account.Credentials.from_service_account_info(info, scopes=_CHAT_SCOPES)
else:
try:
import google.auth as _google_auth
except ImportError:
return {"error": (
"Google Chat standalone send: no SA credentials configured "
"and google-auth is not installed for ADC fallback"
)}
try:
creds, _project = _google_auth.default(scopes=_CHAT_SCOPES)
except Exception as exc:
return {"error": (
f"Google Chat standalone send: no SA credentials configured "
f"and Application Default Credentials are unavailable: {exc}"
)}
except asyncio.CancelledError:
raise
except Exception as e:
return {"error": f"Google Chat standalone send: credential load failed: {e}"}
# Bound the synchronous urllib3-backed token refresh so a hung Google
# STS endpoint cannot stall the cron scheduler indefinitely.
try:
await asyncio.wait_for(
asyncio.to_thread(creds.refresh, _GoogleAuthRequest()),
timeout=10.0,
)
except asyncio.TimeoutError:
return {"error": "Google Chat standalone send: token refresh timed out"}
except asyncio.CancelledError:
raise
except Exception as e:
return {"error": f"Google Chat standalone send: token refresh failed: {e}"}
token = getattr(creds, "token", None)
if not token:
return {"error": "Google Chat standalone send: refreshed credentials have no token"}
body: Dict[str, Any] = {"text": message}
if thread_id:
body["thread"] = {"name": thread_id}
url = f"https://chat.googleapis.com/v1/{chat_id}/messages"
try:
import aiohttp as _aiohttp
except ImportError:
return {"error": "Google Chat standalone send: aiohttp not installed"}
try:
async with _aiohttp.ClientSession(timeout=_aiohttp.ClientTimeout(total=30.0)) as session:
async with session.post(
url,
json=body,
headers={
"Authorization": f"Bearer {token}",
"Content-Type": "application/json",
},
) as resp:
if resp.status >= 400:
text = await resp.text()
return {"error": (
f"Google Chat standalone send: API returned "
f"{resp.status}: {text[:300]}"
)}
payload = await resp.json()
return {
"success": True,
"message_id": payload.get("name"),
}
except asyncio.CancelledError:
raise
except Exception as e:
logger.debug("Google Chat standalone send raised", exc_info=True)
return {"error": f"Google Chat standalone send failed: {e}"}
def register(ctx) -> None:
"""Plugin entry point — called by the Hermes plugin system at startup.
@@ -3052,6 +3228,10 @@ def register(ctx) -> None:
# cron jobs route to the configured home space without editing
# cron/scheduler.py's hardcoded sets.
cron_deliver_env_var="GOOGLE_CHAT_HOME_CHANNEL",
# Out-of-process cron delivery via the Chat REST API. Without this
# hook, deliver=google_chat cron jobs fail with "No live adapter"
# when cron runs separately from the gateway.
standalone_sender_fn=_standalone_send,
# Auth env vars for _is_user_authorized() integration.
allowed_users_env="GOOGLE_CHAT_ALLOWED_USERS",
allow_all_env="GOOGLE_CHAT_ALLOW_ALL_USERS",
+231 -7
View File
@@ -53,11 +53,6 @@ from gateway.session import SessionSource
from gateway.config import PlatformConfig, Platform
def _ensure_imports():
"""No-op — kept for backward compatibility with any call sites."""
pass
# ---------------------------------------------------------------------------
# IRC protocol helpers
# ---------------------------------------------------------------------------
@@ -704,8 +699,233 @@ def _env_enablement() -> dict | None:
return seed
def _strip_irc_control_chars(text: str) -> str:
"""Strip IRC line terminators and the NUL byte from ``text``.
IRC commands are CRLF-delimited; a bare ``\\r`` or ``\\n`` in user
content lets an attacker inject arbitrary IRC commands (CTCP, JOIN,
KICK). ``\\x00`` is a protocol-illegal byte. Everything else is
valid in PRIVMSG payloads.
"""
return text.replace("\r", " ").replace("\n", " ").replace("\x00", "")
def _is_irc_channel(target: str) -> bool:
return bool(target) and target[0] in "#&+!"
async def _standalone_send(
pconfig,
chat_id: str,
message: str,
*,
thread_id: Optional[str] = None,
media_files: Optional[List[str]] = None,
force_document: bool = False,
) -> Dict[str, Any]:
"""Open an ephemeral IRC connection, send a PRIVMSG, and quit.
Used by ``tools/send_message_tool._send_via_adapter`` when the gateway
runner is not in this process (e.g. ``hermes cron`` running as a
separate process from ``hermes gateway``). Without this hook,
``deliver=irc`` cron jobs fail with ``No live adapter for platform``.
The standalone client uses a distinct nick suffix (``-cron``) so it
does not collide with the long-running gateway adapter that may already
be holding the configured nickname on the same network. When the
target is a channel, the client JOINs it before sending PRIVMSG so
networks with the default ``+n`` (no external messages) channel mode
accept the delivery.
``thread_id`` and ``media_files`` are accepted for signature parity but
are not meaningful on IRC: IRC has no native thread or attachment
primitive.
"""
extra = getattr(pconfig, "extra", {}) or {}
server = os.getenv("IRC_SERVER") or extra.get("server", "")
channel = os.getenv("IRC_CHANNEL") or extra.get("channel", "")
if not server or not channel:
return {"error": "IRC standalone send: IRC_SERVER and IRC_CHANNEL must be configured"}
port_value = os.getenv("IRC_PORT") or extra.get("port", 6697)
try:
port = int(port_value)
except (TypeError, ValueError):
return {"error": f"IRC standalone send: invalid port {port_value!r}"}
nickname = os.getenv("IRC_NICKNAME") or extra.get("nickname", "hermes-bot")
use_tls_env = os.getenv("IRC_USE_TLS")
if use_tls_env is not None:
use_tls = use_tls_env.lower() in ("1", "true", "yes")
else:
use_tls = bool(extra.get("use_tls", True))
server_password = os.getenv("IRC_SERVER_PASSWORD") or extra.get("server_password", "")
nickserv_password = os.getenv("IRC_NICKSERV_PASSWORD") or extra.get("nickserv_password", "")
# Reject control characters in chat_id to block IRC command injection.
raw_target = chat_id or channel
if any(ch in raw_target for ch in ("\r", "\n", "\x00", " ")):
return {"error": "IRC standalone send: chat_id contains illegal IRC characters"}
target = raw_target
# Distinct nick prevents NICK collision with a live gateway adapter
# that may already be holding the configured nickname. Cap to 24 chars
# so subsequent collision retries do not overflow the 30-char NICKLEN
# most networks enforce.
nick_base = nickname.rstrip("_0123456789-")[:24] or "hermes-bot"
standalone_nick = f"{nick_base}-cron"[:30]
plain = IRCAdapter._strip_markdown(message)
ssl_ctx = ssl.create_default_context() if use_tls else None
try:
reader, writer = await asyncio.wait_for(
asyncio.open_connection(server, port, ssl=ssl_ctx),
timeout=15.0,
)
except asyncio.CancelledError:
raise
except Exception as e:
return {"error": f"IRC standalone connect failed: {e}"}
async def _raw(line: str) -> None:
writer.write((line + "\r\n").encode("utf-8"))
await writer.drain()
nick_attempts = 0
max_nick_attempts = 5
try:
if server_password:
await _raw(f"PASS {_strip_irc_control_chars(server_password)}")
await _raw(f"NICK {standalone_nick}")
await _raw(f"USER {standalone_nick} 0 * :Hermes Agent (cron)")
loop = asyncio.get_running_loop()
deadline = loop.time() + 15.0
registered = False
while not registered:
remaining = deadline - loop.time()
if remaining <= 0:
return {"error": "IRC standalone send: registration timeout (no RPL_WELCOME)"}
try:
raw_line = await asyncio.wait_for(reader.readuntil(b"\r\n"), timeout=remaining)
except asyncio.TimeoutError:
return {"error": "IRC standalone send: registration timeout (no RPL_WELCOME)"}
except asyncio.IncompleteReadError:
return {"error": "IRC standalone send: server closed connection during registration"}
decoded = raw_line.decode("utf-8", errors="replace").rstrip("\r\n")
msg = _parse_irc_message(decoded)
cmd = msg["command"]
if cmd == "PING":
payload = msg["params"][0] if msg["params"] else ""
await _raw(f"PONG :{payload}")
elif cmd == "001":
registered = True
elif cmd in ("432", "433"):
nick_attempts += 1
if nick_attempts > max_nick_attempts:
return {"error": "IRC standalone send: too many nick collisions"}
# Build the next nick from the stable base, not the
# mutated value, so the suffix stays bounded.
standalone_nick = f"{nick_base}-cron-{nick_attempts}"[:30]
await _raw(f"NICK {standalone_nick}")
elif cmd in ("464", "465"):
return {"error": f"IRC standalone send: server rejected client ({cmd})"}
if nickserv_password:
await _raw(f"PRIVMSG NickServ :IDENTIFY {_strip_irc_control_chars(nickserv_password)}")
await asyncio.sleep(2)
# JOIN before PRIVMSG. IRC channels with the default ``+n`` mode
# (no external messages: Libera, OFTC, EFnet, IRCNet, undernet)
# silently drop PRIVMSG from non-members. Do not JOIN bare nicks
# (DM target) or server queries.
if _is_irc_channel(target):
await _raw(f"JOIN {target}")
join_deadline = loop.time() + 5.0
joined = False
while not joined:
remaining = join_deadline - loop.time()
if remaining <= 0:
# Timed out waiting for a JOIN ack: proceed anyway, the
# server may still deliver the PRIVMSG depending on mode.
break
try:
raw_line = await asyncio.wait_for(reader.readuntil(b"\r\n"), timeout=remaining)
except (asyncio.TimeoutError, asyncio.IncompleteReadError):
break
decoded = raw_line.decode("utf-8", errors="replace").rstrip("\r\n")
jmsg = _parse_irc_message(decoded)
jcmd = jmsg["command"]
if jcmd == "PING":
payload = jmsg["params"][0] if jmsg["params"] else ""
await _raw(f"PONG :{payload}")
elif jcmd in ("366", "JOIN"):
joined = True
elif jcmd in ("403", "405", "471", "473", "474", "475"):
return {"error": f"IRC standalone send: JOIN {target} rejected ({jcmd})"}
# Bytes-aware per-line splitting so multi-line plain text never
# exceeds the IRC 510-byte protocol limit. Reuses the same
# algorithm as IRCAdapter._split_message, with control-character
# stripping per line to block CRLF injection from message content.
overhead = len(f"PRIVMSG {target} :".encode("utf-8")) + 2
max_bytes = 510 - overhead
sent_any = False
for paragraph in plain.split("\n"):
paragraph = _strip_irc_control_chars(paragraph).rstrip()
if not paragraph:
continue
while paragraph:
encoded = paragraph.encode("utf-8")
if len(encoded) <= max_bytes:
await _raw(f"PRIVMSG {target} :{paragraph}")
await asyncio.sleep(0.3)
sent_any = True
break
# Binary search for largest prefix that fits within max_bytes
low, high, best = 1, len(paragraph), 0
while low <= high:
mid = (low + high) // 2
if len(paragraph[:mid].encode("utf-8")) <= max_bytes:
best = mid
low = mid + 1
else:
high = mid - 1
split_at = best
space = paragraph.rfind(" ", 0, split_at)
if space > split_at // 3:
split_at = space
await _raw(f"PRIVMSG {target} :{paragraph[:split_at].rstrip()}")
await asyncio.sleep(0.3)
sent_any = True
paragraph = paragraph[split_at:].lstrip()
if not sent_any:
return {"error": "IRC standalone send: empty message after stripping"}
await _raw("QUIT :delivered")
try:
await asyncio.wait_for(reader.read(1024), timeout=2.0)
except asyncio.TimeoutError:
pass
return {"success": True, "message_id": str(int(time.time() * 1000))}
except asyncio.CancelledError:
raise
except Exception as e:
logger.debug("IRC standalone send raised", exc_info=True)
return {"error": f"IRC standalone send failed: {e}"}
finally:
try:
writer.close()
await asyncio.wait_for(writer.wait_closed(), timeout=5.0)
except (asyncio.TimeoutError, Exception):
pass
def register(ctx):
"""Plugin entry point called by the Hermes plugin system."""
"""Plugin entry point: called by the Hermes plugin system."""
ctx.register_platform(
name="irc",
label="IRC",
@@ -716,7 +936,7 @@ def register(ctx):
required_env=["IRC_SERVER", "IRC_CHANNEL", "IRC_NICKNAME"],
install_hint="No extra packages needed (stdlib only)",
setup_fn=interactive_setup,
# Env-driven auto-configuration seeds PlatformConfig.extra with
# Env-driven auto-configuration: seeds PlatformConfig.extra with
# server/channel/port/tls + home_channel so env-only setups show
# up in gateway status without instantiating the adapter.
env_enablement_fn=_env_enablement,
@@ -724,6 +944,10 @@ def register(ctx):
# IRC_CHANNEL (see _env_enablement), so cron jobs with
# deliver=irc route to the joined channel by default.
cron_deliver_env_var="IRC_HOME_CHANNEL",
# Out-of-process cron delivery. Without this hook, deliver=irc
# cron jobs fail with "No live adapter" when cron runs separately
# from the gateway.
standalone_sender_fn=_standalone_send,
# Auth env vars for _is_user_authorized() integration
allowed_users_env="IRC_ALLOWED_USERS",
allow_all_env="IRC_ALLOW_ALL_USERS",
+174
View File
@@ -418,6 +418,9 @@ def _env_enablement() -> dict | None:
seed["port"] = int(port)
except ValueError:
pass
service_url = os.getenv("TEAMS_SERVICE_URL", "").strip()
if service_url:
seed["service_url"] = service_url
home = os.getenv("TEAMS_HOME_CHANNEL", "").strip()
if home:
seed["home_channel"] = {
@@ -427,6 +430,173 @@ def _env_enablement() -> dict | None:
return seed
# Bot Framework default service URL for the global Teams endpoint. Some
# regional/government tenants need a different host (e.g.
# ``https://smba.infra.gov.teams.microsoft.us/``) which can be supplied via
# ``TEAMS_SERVICE_URL`` or ``extra['service_url']``.
_DEFAULT_TEAMS_SERVICE_URL = "https://smba.trafficmanager.net/teams/"
# Allowlist of Bot Framework service hosts that may receive a freshly
# minted bearer token. Operator-supplied URLs are matched against this
# allowlist to block SSRF / token-exfiltration via a tampered env var.
_ALLOWED_TEAMS_SERVICE_HOSTS = frozenset({
"smba.trafficmanager.net",
"smba.infra.gov.teams.microsoft.us",
})
# Conservative pattern for Bot Framework conversation IDs. Real values
# combine digits, colons, hyphens, dots, '@', and the ``thread.skype`` /
# ``thread.tacv2`` suffixes; reject anything outside this set so a hostile
# value cannot path-traverse out of ``/v3/conversations/<id>/activities``.
import re as _re_teams
_TEAMS_CONV_ID_RE = _re_teams.compile(r"^[A-Za-z0-9:@\-_.]+$")
def _validate_teams_service_url(raw: str) -> Optional[str]:
"""Return a normalized service URL or ``None`` if it is not allowed.
Requires ``https://`` and a host in ``_ALLOWED_TEAMS_SERVICE_HOSTS``.
The trailing slash is added if absent so callers can append
``v3/conversations/...`` without double slashes.
"""
if not raw:
return None
try:
from urllib.parse import urlparse
parsed = urlparse(raw)
except Exception:
return None
if parsed.scheme != "https":
return None
if parsed.hostname not in _ALLOWED_TEAMS_SERVICE_HOSTS:
return None
normalized = raw if raw.endswith("/") else raw + "/"
return normalized
async def _standalone_send(
pconfig,
chat_id: str,
message: str,
*,
thread_id: Optional[str] = None,
media_files: Optional[list] = None,
force_document: bool = False,
) -> Dict[str, Any]:
"""Acquire a Bot Framework bearer token and POST a single message activity.
Used by ``tools/send_message_tool._send_via_adapter`` when the gateway
runner is not in this process (e.g. ``hermes cron`` running as a
separate process from ``hermes gateway``). Without this hook,
``deliver=teams`` cron jobs fail with ``No live adapter for platform``.
Configuration: requires ``TEAMS_CLIENT_ID``, ``TEAMS_CLIENT_SECRET``,
``TEAMS_TENANT_ID``, ``TEAMS_HOME_CHANNEL`` (the conversation ID), and
optionally ``TEAMS_SERVICE_URL`` (Bot Framework service host; must be
a known Bot Framework endpoint, see ``_ALLOWED_TEAMS_SERVICE_HOSTS``).
Security: ``service_url`` is validated against an allowlist of known
Bot Framework hosts to block SSRF / token-exfiltration via a tampered
env var. ``chat_id`` is validated to match the documented Bot
Framework ID character set so it cannot escape the URL path.
``media_files`` and ``force_document`` are accepted for signature
parity but not implemented for the standalone path; messages with
attachments will send as text-only. The live adapter handles
attachments via the SDK.
"""
extra = getattr(pconfig, "extra", {}) or {}
client_id = os.getenv("TEAMS_CLIENT_ID") or extra.get("client_id", "")
client_secret = os.getenv("TEAMS_CLIENT_SECRET") or extra.get("client_secret", "")
tenant_id = os.getenv("TEAMS_TENANT_ID") or extra.get("tenant_id", "")
if not (client_id and client_secret and tenant_id):
return {"error": "Teams standalone send: TEAMS_CLIENT_ID, TEAMS_CLIENT_SECRET, and TEAMS_TENANT_ID are all required"}
raw_service_url = (
os.getenv("TEAMS_SERVICE_URL")
or extra.get("service_url", "")
or _DEFAULT_TEAMS_SERVICE_URL
)
service_url = _validate_teams_service_url(raw_service_url)
if service_url is None:
return {"error": (
f"Teams standalone send: TEAMS_SERVICE_URL host is not on the "
f"Bot Framework allowlist; expected one of "
f"{sorted(_ALLOWED_TEAMS_SERVICE_HOSTS)}"
)}
# Bot Framework conversation IDs are restricted to a known character
# set; anything else means a tampered chat_id trying to break out of
# the URL path.
if not chat_id:
return {"error": "Teams standalone send: chat_id (conversation ID) is required"}
if not _TEAMS_CONV_ID_RE.match(chat_id):
return {"error": "Teams standalone send: chat_id contains characters outside the Bot Framework conversation ID set"}
if not _TEAMS_CONV_ID_RE.match(tenant_id):
return {"error": "Teams standalone send: TEAMS_TENANT_ID contains characters outside the expected set"}
token_url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
activities_url = f"{service_url}v3/conversations/{chat_id}/activities"
if not AIOHTTP_AVAILABLE:
return {"error": "Teams standalone send: aiohttp not installed"}
try:
import aiohttp as _aiohttp
# Per-request timeouts so a slow STS endpoint cannot starve the
# subsequent activity POST of its budget.
per_request_timeout = _aiohttp.ClientTimeout(total=15.0)
async with _aiohttp.ClientSession() as session:
async with session.post(
token_url,
data={
"grant_type": "client_credentials",
"client_id": client_id,
"client_secret": client_secret,
"scope": "https://api.botframework.com/.default",
},
headers={"Content-Type": "application/x-www-form-urlencoded"},
timeout=per_request_timeout,
) as token_resp:
if token_resp.status >= 400:
body = await token_resp.text()
return {"error": f"Teams standalone send: token request failed ({token_resp.status}): {body[:300]}"}
token_payload = await token_resp.json()
access_token = token_payload.get("access_token")
if not access_token:
return {"error": "Teams standalone send: token response missing access_token"}
activity = {
"type": "message",
"text": message,
"textFormat": "markdown",
}
async with session.post(
activities_url,
json=activity,
headers={
"Authorization": f"Bearer {access_token}",
"Content-Type": "application/json",
},
timeout=per_request_timeout,
) as send_resp:
if send_resp.status >= 400:
body = await send_resp.text()
return {"error": f"Teams standalone send: activity post failed ({send_resp.status}): {body[:300]}"}
send_payload = await send_resp.json()
return {
"success": True,
"message_id": send_payload.get("id"),
}
except asyncio.CancelledError:
raise
except Exception as e:
logger.debug("Teams standalone send raised", exc_info=True)
return {"error": f"Teams standalone send failed: {e}"}
# Keep the old name as an alias so existing test imports don't break.
check_teams_requirements = check_requirements
@@ -985,6 +1155,10 @@ def register(ctx) -> None:
# jobs route to the configured Teams chat/channel without editing
# cron/scheduler.py's hardcoded sets.
cron_deliver_env_var="TEAMS_HOME_CHANNEL",
# Out-of-process cron delivery via Bot Framework REST. Without
# this hook, deliver=teams cron jobs fail with "No live adapter"
# when cron runs separately from the gateway.
standalone_sender_fn=_standalone_send,
# Auth env vars for _is_user_authorized() integration
allowed_users_env="TEAMS_ALLOWED_USERS",
allow_all_env="TEAMS_ALLOW_ALL_USERS",
+26 -2
View File
@@ -1264,6 +1264,7 @@ class AIAgent:
api_mode is None
and self.api_mode == "chat_completions"
and self.provider != "copilot-acp"
and self.provider != "codex-cli"
and not str(self.base_url or "").lower().startswith("acp://copilot")
and not str(self.base_url or "").lower().startswith("acp+tcp://")
and not self._is_azure_openai_url()
@@ -1587,6 +1588,9 @@ class AIAgent:
if self.provider == "copilot-acp":
client_kwargs["command"] = self.acp_command
client_kwargs["args"] = self.acp_args
if self.provider == "codex-cli":
client_kwargs["command"] = self.acp_command
client_kwargs["args"] = self.acp_args
effective_base = base_url
if base_url_host_matches(effective_base, "openrouter.ai"):
from agent.auxiliary_client import build_or_headers
@@ -1761,6 +1765,11 @@ class AIAgent:
disabled_toolsets=disabled_toolsets,
quiet_mode=self.quiet_mode,
)
# Codex CLI provider is text-in/text-out MVP — Hermes tools are disabled
# because Codex handles its own tool calling internally via `codex exec`.
if self.provider == "codex-cli":
self.tools = []
# Show tool configuration and store valid tool names for validation
self.valid_tool_names = set()
@@ -5959,6 +5968,17 @@ class AIAgent:
self._client_log_context(),
)
return client
if self.provider == "codex-cli" or str(client_kwargs.get("base_url", "")).startswith("codex-cli://"):
from agent.codex_cli_client import CodexCLIClient
client = CodexCLIClient(**client_kwargs)
logger.info(
"Codex CLI client created (%s, shared=%s) %s",
reason,
shared,
self._client_log_context(),
)
return client
if self.provider == "google-gemini-cli" or str(client_kwargs.get("base_url", "")).startswith("cloudcode-pa://"):
from agent.gemini_cloudcode_adapter import GeminiCloudCodeClient
@@ -9869,7 +9889,8 @@ class AIAgent:
)
elif function_name == "session_search":
if not self._session_db:
return json.dumps({"success": False, "error": "Session database not available."})
from hermes_state import format_session_db_unavailable
return json.dumps({"success": False, "error": format_session_db_unavailable()})
from tools.session_search_tool import session_search as _session_search
return _session_search(
query=function_args.get("query", ""),
@@ -10492,7 +10513,8 @@ class AIAgent:
self._vprint(f" {_get_cute_tool_message_impl('todo', function_args, tool_duration, result=function_result)}")
elif function_name == "session_search":
if not self._session_db:
function_result = json.dumps({"success": False, "error": "Session database not available."})
from hermes_state import format_session_db_unavailable
function_result = json.dumps({"success": False, "error": format_session_db_unavailable()})
else:
from tools.session_search_tool import session_search as _session_search
function_result = _session_search(
@@ -11807,8 +11829,10 @@ class AIAgent:
# API upgrade (lines ~1083-1085).
elif (
self.provider == "copilot-acp"
or self.provider == "codex-cli"
or str(self.base_url or "").lower().startswith("acp://copilot")
or str(self.base_url or "").lower().startswith("acp+tcp://")
or str(self.base_url or "").lower().startswith("codex-cli://")
):
_use_streaming = False
elif not self._has_stream_consumers():
+9
View File
@@ -53,6 +53,7 @@ AUTHOR_MAP = {
"harish.kukreja@gmail.com": "counterposition",
"cleo@edaphic.xyz": "curiouscleo",
"hirokazu.ogawa@kwansei.ac.jp": "hrkzogw",
"datapod.k@gmail.com": "dandacompany",
"127238744+teknium1@users.noreply.github.com": "teknium1",
"128259593+Gutslabs@users.noreply.github.com": "Gutslabs",
"50326054+nocturnum91@users.noreply.github.com": "nocturnum91",
@@ -63,6 +64,9 @@ AUTHOR_MAP = {
"ytchen0719@gmail.com": "liquidchen",
"am@studio1.tailb672fe.ts.net": "subtract0",
"axmaiqiu@gmail.com": "qWaitCrypto",
"egitimviscara@gmail.com": "uzunkuyruk",
"zhekinmaksim@gmail.com": "Zhekinmaksim",
"obafemiferanmi1999@gmail.com": "KvnGz",
"159539633+MottledShadow@users.noreply.github.com": "MottledShadow",
"aludwin+gh@gmail.com": "adamludwin",
"ngusev@astralinux.ru": "NikolayGusev-astra",
@@ -142,6 +146,7 @@ AUTHOR_MAP = {
"luwinyang@deepseek.com": "lsdsjy",
"season.saw@gmail.com": "season179",
"heathley@Heathley-MacBook-Air.local": "heathley",
"maliyldzhn@gmail.com": "heathley",
"vlad19@gmail.com": "dandaka",
"adamrummer@gmail.com": "cyclingwithelephants",
# Temporary tool-progress cleanup salvage (May 2026)
@@ -165,6 +170,8 @@ AUTHOR_MAP = {
"momowind@gmail.com": "momowind",
"clockwork-codex@users.noreply.github.com": "misery-hl",
"207811921+misery-hl@users.noreply.github.com": "misery-hl",
"20nik.nosov21@gmail.com": "nik1t7n",
"90299797+nik1t7n@users.noreply.github.com": "nik1t7n",
"suncokret@protonmail.com": "suncokret12",
"mio.imoto.ai@gmail.com": "mioimotoai-lgtm",
"aamirjawaid@microsoft.com": "heyitsaamir",
@@ -908,6 +915,8 @@ AUTHOR_MAP = {
"promptsiren@gmail.com": "firefly", # PR #18123 salvage of #16660 (ContextVars)
"wtyopenclaw@gmail.com": "WuTianyi123", # PR #20275 salvage of #13723 (feishu markdown)
"zhicheng.han@mathematik.uni-goettingen.de": "hanzckernel", # PR #20311 (api-server approval events)
"agentsmithlaor@gmail.com": "oferlaor", # PR #22356 salvage (cron origin sender identity)
"jhin.lee@unity3d.com": "leehack", # PR #22053 salvage (telegram DM topic reply fallback)
# pander: empty email, salvaged via PR #19665 from #16126 by @ms-alan
}
+11 -3
View File
@@ -200,7 +200,11 @@ class TestGatewayBridgeCodeParity:
def test_gateway_has_auxiliary_bridge(self):
"""The gateway config bridge must include auxiliary.* bridging."""
gateway_path = Path(__file__).parent.parent.parent / "gateway" / "run.py"
content = gateway_path.read_text()
# Pin encoding to UTF-8: source files in this repo are UTF-8, but
# Path.read_text() defaults to the system locale — which is cp1252
# on most Western Windows installs and crashes as soon as the file
# contains any non-ASCII byte (e.g. an em-dash in a comment).
content = gateway_path.read_text(encoding="utf-8")
# Check for key patterns that indicate the bridge is present
assert "AUXILIARY_VISION_PROVIDER" in content
assert "AUXILIARY_VISION_MODEL" in content
@@ -214,7 +218,9 @@ class TestGatewayBridgeCodeParity:
def test_gateway_no_compression_env_bridge(self):
"""Gateway should NOT bridge compression config to env vars (config-only)."""
gateway_path = Path(__file__).parent.parent.parent / "gateway" / "run.py"
content = gateway_path.read_text()
# See note in test_gateway_has_auxiliary_bridge — pin UTF-8 so the
# test runs on Windows where the default locale is cp1252.
content = gateway_path.read_text(encoding="utf-8")
assert "CONTEXT_COMPRESSION_PROVIDER" not in content
assert "CONTEXT_COMPRESSION_MODEL" not in content
@@ -289,7 +295,9 @@ class TestCLIDefaultsHaveAuxiliaryKeys:
# So auxiliary config from config.yaml gets merged even though
# cli.py's defaults dict doesn't define it.
import cli as _cli_mod
source = Path(_cli_mod.__file__).read_text()
# See note in test_gateway_has_auxiliary_bridge — pin UTF-8 so the
# test runs on Windows where the default locale is cp1252.
source = Path(_cli_mod.__file__).read_text(encoding="utf-8")
assert "auxiliary_config = defaults.get(\"auxiliary\"" in source
assert "AUXILIARY_VISION_PROVIDER" in source
assert "AUXILIARY_VISION_MODEL" in source
+98
View File
@@ -400,6 +400,104 @@ class TestSummaryFallbackToMainModel:
assert result is None
assert c._summary_model_fallen_back is True
def test_json_decode_error_falls_back_to_main_and_succeeds(self):
"""JSONDecodeError from the OpenAI SDK's ``response.json()`` (raised
when a misconfigured proxy returns HTML/plain-text with
``Content-Type: application/json``) should trigger the same
retry-on-main path as 404/timeout. Issue #22244."""
import json as _json
mock_ok = MagicMock()
mock_ok.choices = [MagicMock()]
mock_ok.choices[0].message.content = "summary via main model"
# Simulate the SDK raising a raw JSONDecodeError with a realistic
# error message ("Expecting value: line X column Y char Z").
err_json = _json.JSONDecodeError(
"Expecting value", "<!DOCTYPE html><html>...</html>", 0
)
with patch("agent.context_compressor.get_model_context_length", return_value=100000):
c = ContextCompressor(
model="main-model",
summary_model_override="aux-via-broken-proxy",
quiet_mode=True,
)
with patch(
"agent.context_compressor.call_llm",
side_effect=[err_json, mock_ok],
) as mock_call:
result = c._generate_summary(self._msgs())
assert mock_call.call_count == 2
assert mock_call.call_args_list[0].kwargs.get("model") == "aux-via-broken-proxy"
assert "model" not in mock_call.call_args_list[1].kwargs
assert result is not None
assert "summary via main model" in result
# Aux-model failure recorded so /usage / gateway warnings can surface it
assert c._last_aux_model_failure_model == "aux-via-broken-proxy"
assert c._last_aux_model_failure_error is not None
# The 220-char cap is shared with other fallback branches
assert len(c._last_aux_model_failure_error) <= 220
def test_json_decode_error_substring_match_in_wrapped_exception(self):
"""When the OpenAI SDK wraps the raw JSONDecodeError inside its own
``APIResponseValidationError`` (or similar), ``isinstance`` no longer
matches but the substring "expecting value" still appears in
``str(e)``. We detect this case by string match and fall back the
same way."""
mock_ok = MagicMock()
mock_ok.choices = [MagicMock()]
mock_ok.choices[0].message.content = "summary via main model"
# A plain Exception with the canonical JSON decode error text — what
# the SDK's APIResponseValidationError looks like at str() time.
err_wrapped = Exception("Expecting value: line 1 column 1 (char 0)")
with patch("agent.context_compressor.get_model_context_length", return_value=100000):
c = ContextCompressor(
model="main-model",
summary_model_override="aux-model",
quiet_mode=True,
)
with patch(
"agent.context_compressor.call_llm",
side_effect=[err_wrapped, mock_ok],
) as mock_call:
result = c._generate_summary(self._msgs())
assert mock_call.call_count == 2
assert result is not None
assert "summary via main model" in result
def test_json_decode_error_on_main_uses_short_cooldown(self):
"""When already on the main model (no separate summary_model, or
fallback already happened), a JSONDecodeError should set the short
30s cooldown, not the default 60s provider bodies tend to
recover quickly when an upstream proxy comes back online."""
import json as _json
err_json = _json.JSONDecodeError("Expecting value", "<html/>", 0)
with patch("agent.context_compressor.get_model_context_length", return_value=100000):
c = ContextCompressor(
model="main-model",
# No summary_model_override → already on main, no fallback path.
quiet_mode=True,
)
with patch(
"agent.context_compressor.call_llm",
side_effect=err_json,
), patch("agent.context_compressor.time.monotonic", return_value=1000.0):
result = c._generate_summary(self._msgs())
assert result is None
# Short JSON-decode cooldown is 30s, not the default 60s.
assert c._summary_failure_cooldown_until == 1030.0
class TestAuxModelFallbackSurfacedToCallers:
"""When summary_model fails but retry-on-main succeeds, compress() must
+8
View File
@@ -789,6 +789,7 @@ class TestPromptBuilderConstants:
assert "cron" in PLATFORM_HINTS
assert "cli" in PLATFORM_HINTS
assert "api_server" in PLATFORM_HINTS
assert "webui" in PLATFORM_HINTS
def test_cli_hint_does_not_suggest_media_tags(self):
# Regression: MEDIA:/path tags are intercepted only by messaging
@@ -826,6 +827,13 @@ class TestPromptBuilderConstants:
assert "MEDIA:" in hint
assert "Markdown" in hint
def test_platform_hints_webui(self):
hint = PLATFORM_HINTS["webui"]
assert "WebUI" in hint
assert "MEDIA:" in hint
assert "Markdown" in hint
assert "absolute" in hint
# =========================================================================
# Environment hints
+20
View File
@@ -207,6 +207,26 @@ class TestJobCRUD:
jobs = list_jobs()
assert len(jobs) == 2
def test_list_jobs_normalizes_partial_legacy_records(self, tmp_cron_dir):
save_jobs([
{
"id": "abc123deadbe",
"name": None,
"prompt": None,
"schedule_display": None,
"schedule": {"kind": "interval", "minutes": 60, "display": "every 60m"},
"enabled": True,
}
])
jobs = list_jobs()
assert jobs[0]["id"] == "abc123deadbe"
assert jobs[0]["name"] == "abc123deadbe"
assert jobs[0]["prompt"] == ""
assert jobs[0]["schedule_display"] == "every 60m"
assert jobs[0]["state"] == "scheduled"
def test_remove_job(self, tmp_cron_dir):
job = create_job(prompt="Temp job", schedule="30m")
assert remove_job(job["id"]) is True
+5
View File
@@ -1788,6 +1788,11 @@ class TestBuildJobPromptSilentHint:
result = _build_job_prompt(job)
assert "[SILENT]" in result
def test_hint_present_when_legacy_prompt_is_null(self):
job = {"id": "abc123deadbe", "name": None, "prompt": None}
result = _build_job_prompt(job)
assert "[SILENT]" in result
def test_delivery_guidance_present(self):
"""Cron hint tells agents their final response is auto-delivered."""
job = {"prompt": "Generate a report"}
+83
View File
@@ -108,6 +108,38 @@ class TestHandleBackgroundCommand:
assert "Summarize the top HN stories" in result
assert len(created_tasks) == 1 # background task was created
@pytest.mark.asyncio
async def test_telegram_dm_topic_passes_trigger_anchor_to_task(self):
"""Telegram private-topic completion sends need the original command message id."""
runner = _make_runner()
runner._run_background_task = AsyncMock()
def capture_task(coro, *args, **kwargs):
coro.close()
mock_task = MagicMock()
return mock_task
source = SessionSource(
platform=Platform.TELEGRAM,
user_id="12345",
chat_id="67890",
chat_type="dm",
thread_id="20197",
)
event = MessageEvent(
text="/background summarize",
source=source,
message_id="463",
reply_to_message_id="462",
)
with patch("gateway.run.asyncio.create_task", side_effect=capture_task):
result = await runner._handle_background_command(event)
assert "Background task started" in result
runner._run_background_task.assert_called_once()
assert runner._run_background_task.call_args.kwargs["event_message_id"] == "463"
@pytest.mark.asyncio
async def test_prompt_truncated_in_preview(self):
"""Long prompts are truncated to 60 chars in the confirmation message."""
@@ -236,6 +268,57 @@ class TestRunBackgroundTask:
mock_agent_instance.shutdown_memory_provider.assert_called_once()
mock_agent_instance.close.assert_called_once()
@pytest.mark.asyncio
async def test_telegram_dm_topic_completion_preserves_reply_anchor_metadata(self, monkeypatch):
"""Background completion metadata must let Telegram send thread id plus reply id."""
from gateway import run as gateway_run
runner = _make_runner()
runner._resolve_session_agent_runtime = MagicMock(
return_value=("test-model", {"api_key": "test-key"})
)
runner._resolve_session_reasoning_config = MagicMock(return_value=None)
runner._load_service_tier = MagicMock(return_value=None)
runner._resolve_turn_agent_config = MagicMock(
return_value={
"model": "test-model",
"runtime": {"api_key": "test-key"},
"request_overrides": None,
}
)
runner._run_in_executor_with_context = AsyncMock(
return_value={"final_response": "done", "messages": []}
)
monkeypatch.setattr(gateway_run, "_load_gateway_config", lambda: {})
mock_adapter = AsyncMock()
mock_adapter.send = AsyncMock()
mock_adapter.extract_media = MagicMock(return_value=([], "done"))
mock_adapter.extract_images = MagicMock(return_value=([], "done"))
runner.adapters[Platform.TELEGRAM] = mock_adapter
source = SessionSource(
platform=Platform.TELEGRAM,
user_id="12345",
chat_id="67890",
chat_type="dm",
thread_id="20197",
)
await runner._run_background_task(
"say hello",
source,
"bg_test",
event_message_id="463",
)
mock_adapter.send.assert_called_once()
assert mock_adapter.send.call_args.kwargs["metadata"] == {
"thread_id": "20197",
"telegram_dm_topic_reply_fallback": True,
"telegram_reply_to_message_id": "463",
}
@pytest.mark.asyncio
async def test_agent_cleanup_runs_when_background_agent_raises(self):
"""Temporary background agents must be cleaned up on error paths too."""
@@ -208,6 +208,101 @@ class TestFeishuExecApproval:
assert ids[0] != ids[1]
# ===========================================================================
# send_update_prompt — interactive card with buttons
# ===========================================================================
class TestFeishuUpdatePrompt:
"""Test send_update_prompt sends an interactive card."""
@pytest.mark.asyncio
async def test_sends_interactive_card(self):
adapter = _make_adapter()
mock_response = SimpleNamespace(
success=lambda: True,
data=SimpleNamespace(message_id="msg_up_001"),
)
with patch.object(
adapter, "_feishu_send_with_retry", new_callable=AsyncMock,
return_value=mock_response,
) as mock_send:
result = await adapter.send_update_prompt(
chat_id="oc_12345",
prompt="Restore stashed changes after update?",
default="y",
session_key="agent:main:feishu:group:oc_12345",
metadata={"thread_id": "th_1"},
)
assert result.success is True
assert result.message_id == "msg_up_001"
kwargs = mock_send.call_args[1]
assert kwargs["chat_id"] == "oc_12345"
assert kwargs["msg_type"] == "interactive"
assert kwargs["metadata"] == {"thread_id": "th_1"}
card = json.loads(kwargs["payload"])
assert card["header"]["template"] == "orange"
assert "Restore stashed changes after update?" in card["elements"][0]["content"]
assert "Default: `y`" in card["elements"][0]["content"]
actions = card["elements"][1]["actions"]
assert [a["value"]["hermes_update_prompt_action"] for a in actions] == ["y", "n"]
@pytest.mark.asyncio
async def test_stores_prompt_state(self):
adapter = _make_adapter()
mock_response = SimpleNamespace(
success=lambda: True,
data=SimpleNamespace(message_id="msg_up_002"),
)
with patch.object(
adapter, "_feishu_send_with_retry", new_callable=AsyncMock,
return_value=mock_response,
):
await adapter.send_update_prompt(
chat_id="oc_12345",
prompt="Continue update?",
session_key="my-session-key",
)
assert len(adapter._update_prompt_state) == 1
prompt_id = list(adapter._update_prompt_state.keys())[0]
state = adapter._update_prompt_state[prompt_id]
assert state["session_key"] == "my-session-key"
assert state["message_id"] == "msg_up_002"
assert state["chat_id"] == "oc_12345"
@pytest.mark.asyncio
async def test_not_connected(self):
adapter = _make_adapter()
adapter._client = None
result = await adapter.send_update_prompt(
chat_id="oc_12345",
prompt="Continue update?",
session_key="s",
)
assert result.success is False
@pytest.mark.asyncio
async def test_send_failure_returns_error(self):
adapter = _make_adapter()
with patch.object(
adapter, "_feishu_send_with_retry", new_callable=AsyncMock,
side_effect=TimeoutError("timed out"),
):
result = await adapter.send_update_prompt(
chat_id="oc_12345",
prompt="Continue update?",
session_key="s",
)
assert result.success is False
assert "timed out" in (result.error or "")
# ===========================================================================
# _resolve_approval — approval state pop + gateway resolution
# ===========================================================================
@@ -442,3 +537,166 @@ class TestCardActionCallbackResponse:
card = response.card.data
assert "Old Name" not in card["elements"][0]["content"]
assert "ou_expired" in card["elements"][0]["content"]
def test_returns_card_for_update_prompt_yes(self, _patch_callback_card_types):
adapter = _make_adapter()
adapter._loop = MagicMock()
adapter._loop.is_closed = MagicMock(return_value=False)
adapter._update_prompt_state[1] = {
"session_key": "sess-up-1",
"message_id": "msg_up_003",
"chat_id": "oc_12345",
}
data = _make_card_action_data(
{"hermes_update_prompt_action": "y", "update_prompt_id": 1},
open_id="ou_bob",
)
adapter._sender_name_cache["ou_bob"] = ("Bob", 9999999999)
with patch("asyncio.run_coroutine_threadsafe", side_effect=_close_submitted_coro):
response = adapter._on_card_action_trigger(data)
assert response is not None
assert response.card is not None
card = response.card.data
assert card["header"]["template"] == "green"
assert "answered: Yes" in card["header"]["title"]["content"]
assert "Bob" in card["elements"][0]["content"]
def test_returns_card_for_update_prompt_no(self, _patch_callback_card_types):
adapter = _make_adapter()
adapter._loop = MagicMock()
adapter._loop.is_closed = MagicMock(return_value=False)
adapter._update_prompt_state[2] = {
"session_key": "sess-up-2",
"message_id": "msg_up_004",
"chat_id": "oc_12345",
}
data = _make_card_action_data(
{"hermes_update_prompt_action": "n", "update_prompt_id": 2},
)
with patch("asyncio.run_coroutine_threadsafe", side_effect=_close_submitted_coro):
response = adapter._on_card_action_trigger(data)
assert response is not None
assert response.card is not None
card = response.card.data
assert card["header"]["template"] == "red"
assert "answered: No" in card["header"]["title"]["content"]
def test_ignores_missing_update_prompt_id(self, _patch_callback_card_types):
adapter = _make_adapter()
adapter._loop = MagicMock()
adapter._loop.is_closed = MagicMock(return_value=False)
data = _make_card_action_data({"hermes_update_prompt_action": "y"})
with patch("asyncio.run_coroutine_threadsafe") as mock_submit:
response = adapter._on_card_action_trigger(data)
assert response is not None
assert response.card is None
mock_submit.assert_not_called()
def test_already_resolved_update_prompt_returns_no_card(self, _patch_callback_card_types):
adapter = _make_adapter()
adapter._loop = MagicMock()
adapter._loop.is_closed = MagicMock(return_value=False)
data = _make_card_action_data(
{"hermes_update_prompt_action": "y", "update_prompt_id": 99},
)
with patch("asyncio.run_coroutine_threadsafe") as mock_submit:
response = adapter._on_card_action_trigger(data)
assert response is not None
assert response.card is None
mock_submit.assert_not_called()
def test_update_prompt_schedule_failure_returns_no_card(self, _patch_callback_card_types):
adapter = _make_adapter()
adapter._loop = MagicMock()
adapter._loop.is_closed = MagicMock(return_value=False)
adapter._update_prompt_state[1] = {
"session_key": "sess-up-1",
"message_id": "msg_up_005",
"chat_id": "oc_12345",
}
data = _make_card_action_data(
{"hermes_update_prompt_action": "y", "update_prompt_id": 1},
)
with patch("asyncio.run_coroutine_threadsafe", side_effect=RuntimeError("loop closed")):
response = adapter._on_card_action_trigger(data)
assert response is not None
assert response.card is None
def test_update_prompt_unauthorized_operator_returns_no_card(self, _patch_callback_card_types):
adapter = _make_adapter()
adapter._loop = MagicMock()
adapter._loop.is_closed = MagicMock(return_value=False)
adapter._update_prompt_state[1] = {
"session_key": "sess-up-1",
"message_id": "msg_up_006",
"chat_id": "oc_12345",
}
adapter._allowed_group_users = {"ou_allowed"}
data = _make_card_action_data(
{"hermes_update_prompt_action": "y", "update_prompt_id": 1},
open_id="ou_intruder",
)
with patch("asyncio.run_coroutine_threadsafe") as mock_submit:
response = adapter._on_card_action_trigger(data)
assert response is not None
assert response.card is None
mock_submit.assert_not_called()
class TestResolveUpdatePrompt:
"""Test update prompt resolution persists the response file."""
@pytest.mark.asyncio
async def test_writes_response_file(self, tmp_path, monkeypatch):
adapter = _make_adapter()
monkeypatch.setenv("HERMES_HOME", str(tmp_path / ".hermes"))
(tmp_path / ".hermes").mkdir()
adapter._update_prompt_state[1] = {
"session_key": "sess-up-1",
"message_id": "msg_up_003",
"chat_id": "oc_12345",
}
await adapter._resolve_update_prompt(1, "y", "Alice")
assert (tmp_path / ".hermes" / ".update_response").read_text() == "y"
assert 1 not in adapter._update_prompt_state
@pytest.mark.asyncio
async def test_overwrites_existing_response_file(self, tmp_path, monkeypatch):
adapter = _make_adapter()
monkeypatch.setenv("HERMES_HOME", str(tmp_path / ".hermes"))
home = tmp_path / ".hermes"
home.mkdir()
(home / ".update_response").write_text("n")
adapter._update_prompt_state[2] = {
"session_key": "sess-up-2",
"message_id": "msg_up_004",
"chat_id": "oc_12345",
}
await adapter._resolve_update_prompt(2, "y", "Alice")
assert (home / ".update_response").read_text() == "y"
@pytest.mark.asyncio
async def test_unknown_prompt_id_drops_silently(self, tmp_path, monkeypatch):
adapter = _make_adapter()
monkeypatch.setenv("HERMES_HOME", str(tmp_path / ".hermes"))
(tmp_path / ".hermes").mkdir()
await adapter._resolve_update_prompt(99, "n", "Nobody")
assert not (tmp_path / ".hermes" / ".update_response").exists()
+281
View File
@@ -485,6 +485,49 @@ class TestOnPubsubMessage:
submit.assert_not_called()
msg.ack.assert_called_once()
def test_relay_flat_bot_sender_is_filtered_end_to_end(self, adapter):
"""Format 3 end-to-end: a relay envelope declaring sender_type=BOT
flows through ``_extract_message_payload`` ``_on_pubsub_message``
and is dropped by the BOT self-filter without dispatch. This is
the actual security contract (the unit tests on
``_extract_message_payload`` only assert the intermediate dict
shape; this test asserts the dispatch is suppressed).
"""
envelope = {
"event_type": "MESSAGE",
"sender_email": "bot@bots.example.com",
"sender_display_name": "HermesBot",
"sender_type": "BOT",
"text": "reply from bot",
"space_name": "spaces/RELAY",
"message_name": "spaces/RELAY/messages/M.M",
}
msg = _make_pubsub_message(envelope)
with patch.object(adapter, "_submit_on_loop") as submit:
adapter._on_pubsub_message(msg)
submit.assert_not_called()
msg.ack.assert_called_once()
def test_relay_flat_human_sender_dispatches(self, adapter):
"""Format 3 negative control: an envelope without sender_type
(or with sender_type=HUMAN) still dispatches to the agent loop,
confirming the BOT-filter doesn't accidentally drop legitimate
human messages from a relay.
"""
envelope = {
"event_type": "MESSAGE",
"sender_email": "alice@example.com",
"sender_display_name": "Alice",
"text": "hello agent",
"space_name": "spaces/RELAY",
"message_name": "spaces/RELAY/messages/M.M",
}
msg = _make_pubsub_message(envelope)
with patch.object(adapter, "_submit_on_loop") as submit:
adapter._on_pubsub_message(msg)
submit.assert_called_once()
msg.ack.assert_called_once()
def test_duplicate_message_dropped(self, adapter):
env = _make_chat_envelope(msg_name="spaces/S/messages/DUP.DUP")
# Prime dedup
@@ -603,6 +646,74 @@ class TestExtractMessagePayload:
assert msg["name"] == "spaces/RELAY/messages/M.M"
assert space["name"] == "spaces/RELAY"
def test_relay_flat_honors_declared_sender_type_bot(self):
"""Format 3 propagates ``envelope.sender_type`` so the downstream
BOT self-filter fires for relay-forwarded bot replies.
Without this, a relay misconfigured to forward the bot's own
replies into the same Pub/Sub topic produced a feedback loop:
the adapter would mark the synthesized sender ``HUMAN`` and the
``sender.type == "BOT"`` self-filter would never fire.
"""
envelope = {
"event_type": "MESSAGE",
"sender_email": "bot@bots.example.com",
"sender_display_name": "HermesBot",
"sender_type": "BOT",
"text": "reply from bot",
"space_name": "spaces/RELAY",
"message_name": "spaces/RELAY/messages/M.M",
}
result = GoogleChatAdapter._extract_message_payload(envelope)
assert result is not None
msg, _space, fmt = result
assert fmt == "relay_flat"
assert msg["sender"]["type"] == "BOT"
def test_relay_flat_defaults_sender_type_human_when_absent(self):
"""Backward compatibility: relays that don't declare sender_type
continue to flow as HUMAN exactly as before this change."""
envelope = {
"event_type": "MESSAGE",
"sender_email": "alice@example.com",
"text": "hi",
"space_name": "spaces/RELAY",
"message_name": "spaces/RELAY/messages/M.M",
}
result = GoogleChatAdapter._extract_message_payload(envelope)
assert result is not None
msg, _space, _fmt = result
assert msg["sender"]["type"] == "HUMAN"
def test_relay_flat_coerces_unknown_sender_type_to_human(self):
"""Defensive coercion: only ``HUMAN`` and ``BOT`` are accepted;
any other value (including stray casing on those two) is either
normalized or falls back to ``HUMAN`` so a malformed relay can't
slip an unrecognized type through to the downstream filter."""
# Lower / mixed case is normalized to upper.
envelope_lower = {
"event_type": "MESSAGE",
"sender_email": "bot@example.com",
"sender_type": " bot ",
"text": "hi",
"space_name": "spaces/RELAY",
"message_name": "spaces/RELAY/messages/M.M",
}
msg, _space, _fmt = GoogleChatAdapter._extract_message_payload(envelope_lower)
assert msg["sender"]["type"] == "BOT"
# Unknown value falls back to HUMAN, not the raw string.
envelope_bogus = {
"event_type": "MESSAGE",
"sender_email": "alice@example.com",
"sender_type": "ROBOT",
"text": "hi",
"space_name": "spaces/RELAY",
"message_name": "spaces/RELAY/messages/M.M",
}
msg, _space, _fmt = GoogleChatAdapter._extract_message_payload(envelope_bogus)
assert msg["sender"]["type"] == "HUMAN"
def test_unrecognized_envelope_returns_none(self):
"""Random JSON with no known shape returns None (caller acks)."""
envelope = {"foo": "bar", "baz": 123}
@@ -2585,3 +2696,173 @@ class TestCronSchedulerRegistry:
from cron.scheduler import _resolve_home_env_var
assert _resolve_home_env_var("google_chat") == "GOOGLE_CHAT_HOME_CHANNEL"
# ── _standalone_send (out-of-process cron delivery) ──────────────────────
class _FakeAiohttpResponse:
def __init__(self, status: int, payload, text_body: str = ""):
self.status = status
self._payload = payload
self._text = text_body or (str(payload) if payload is not None else "")
async def json(self):
return self._payload
async def text(self):
return self._text
async def __aenter__(self):
return self
async def __aexit__(self, exc_type, exc, tb):
return None
class _FakeAiohttpSession:
def __init__(self, scripts):
self._scripts = list(scripts)
self.calls: list[tuple[str, dict]] = []
async def __aenter__(self):
return self
async def __aexit__(self, exc_type, exc, tb):
return None
def post(self, url, **kwargs):
self.calls.append((url, kwargs))
if not self._scripts:
raise AssertionError(f"No scripted response for POST {url}")
return self._scripts.pop(0)
def _install_fake_aiohttp(monkeypatch, session):
fake_aiohttp = types.SimpleNamespace(
ClientSession=lambda timeout=None: session,
ClientTimeout=lambda total=None: None,
)
monkeypatch.setitem(sys.modules, "aiohttp", fake_aiohttp)
def _install_fake_google_auth_transport(monkeypatch):
fake_request_module = types.SimpleNamespace(Request=lambda: object())
monkeypatch.setitem(sys.modules, "google.auth.transport", types.SimpleNamespace(requests=fake_request_module))
monkeypatch.setitem(sys.modules, "google.auth.transport.requests", fake_request_module)
class TestGoogleChatStandaloneSend:
@pytest.mark.asyncio
async def test_standalone_send_refreshes_token_and_posts_message(
self, monkeypatch, tmp_path
):
sa_file = tmp_path / "sa.json"
sa_file.write_text(json.dumps({
"type": "service_account",
"client_email": "bot@example.iam.gserviceaccount.com",
"private_key": "fake",
"token_uri": "https://example/token",
}))
monkeypatch.setenv("GOOGLE_CHAT_SERVICE_ACCOUNT_JSON", str(sa_file))
fake_creds = MagicMock()
fake_creds.token = "the-token"
fake_creds.refresh = MagicMock(return_value=None)
original = _gc_mod.service_account.Credentials.from_service_account_info
_gc_mod.service_account.Credentials.from_service_account_info = MagicMock(
return_value=fake_creds
)
try:
_install_fake_google_auth_transport(monkeypatch)
send_resp = _FakeAiohttpResponse(200, {"name": "spaces/AAA/messages/MMM"})
session = _FakeAiohttpSession([send_resp])
_install_fake_aiohttp(monkeypatch, session)
result = await _gc_mod._standalone_send(
PlatformConfig(enabled=True, extra={}),
"spaces/AAAA-BBBB",
"hello cron",
)
finally:
_gc_mod.service_account.Credentials.from_service_account_info = original
assert result == {
"success": True,
"message_id": "spaces/AAA/messages/MMM",
}
fake_creds.refresh.assert_called_once()
assert len(session.calls) == 1
url, kwargs = session.calls[0]
assert url == "https://chat.googleapis.com/v1/spaces/AAAA-BBBB/messages"
assert kwargs["headers"]["Authorization"] == "Bearer the-token"
assert kwargs["json"] == {"text": "hello cron"}
@pytest.mark.asyncio
async def test_standalone_send_returns_error_on_invalid_chat_id(self, monkeypatch):
monkeypatch.delenv("GOOGLE_CHAT_SERVICE_ACCOUNT_JSON", raising=False)
result = await _gc_mod._standalone_send(
PlatformConfig(enabled=True, extra={}),
"not-a-resource-name",
"hi",
)
assert "error" in result
assert "spaces/" in result["error"] or "users/" in result["error"]
@pytest.mark.asyncio
async def test_standalone_send_propagates_api_failure(self, monkeypatch, tmp_path):
sa_file = tmp_path / "sa.json"
sa_file.write_text(json.dumps({
"type": "service_account",
"client_email": "bot@example.iam.gserviceaccount.com",
"private_key": "fake",
"token_uri": "https://example/token",
}))
monkeypatch.setenv("GOOGLE_CHAT_SERVICE_ACCOUNT_JSON", str(sa_file))
fake_creds = MagicMock()
fake_creds.token = "the-token"
fake_creds.refresh = MagicMock(return_value=None)
original = _gc_mod.service_account.Credentials.from_service_account_info
_gc_mod.service_account.Credentials.from_service_account_info = MagicMock(
return_value=fake_creds
)
try:
_install_fake_google_auth_transport(monkeypatch)
send_resp = _FakeAiohttpResponse(
403,
{"error": {"code": 403, "message": "forbidden"}},
text_body='{"error":{"code":403,"message":"forbidden"}}',
)
session = _FakeAiohttpSession([send_resp])
_install_fake_aiohttp(monkeypatch, session)
result = await _gc_mod._standalone_send(
PlatformConfig(enabled=True, extra={}),
"spaces/AAAA-BBBB",
"hi",
)
finally:
_gc_mod.service_account.Credentials.from_service_account_info = original
assert "error" in result
assert "403" in result["error"]
@pytest.mark.asyncio
async def test_standalone_send_rejects_chat_id_with_path_traversal(self, monkeypatch):
monkeypatch.delenv("GOOGLE_CHAT_SERVICE_ACCOUNT_JSON", raising=False)
# Attempt to inject extra path segments after the prefix passes the
# startswith check. The strict regex must reject this.
result = await _gc_mod._standalone_send(
PlatformConfig(enabled=True, extra={}),
"spaces/AAAA/messages?messageReplyOption=REPLY_MESSAGE_FALLBACK_TO_NEW_THREAD",
"hi",
)
assert "error" in result
# The error names the expected resource shape so plugin authors can self-correct
assert "spaces/" in result["error"] or "users/" in result["error"]
+222
View File
@@ -20,6 +20,7 @@ IRCAdapter = _irc_mod.IRCAdapter
check_requirements = _irc_mod.check_requirements
validate_config = _irc_mod.validate_config
register = _irc_mod.register
_standalone_send = _irc_mod._standalone_send
class TestIRCProtocolHelpers:
@@ -500,3 +501,224 @@ class TestIRCPluginRegistration:
ctx.register_platform.assert_called_once()
call_kwargs = ctx.register_platform.call_args
assert call_kwargs[1]["name"] == "irc" or call_kwargs[0][0] == "irc" if call_kwargs[0] else call_kwargs[1]["name"] == "irc"
# ── _standalone_send (out-of-process cron delivery) ──────────────────────
class _FakeIRCConnection:
"""A scripted reader/writer pair used to simulate an IRC server.
Construct with the lines the server should respond with (already
framed by ``\\r\\n``). Captures every line written by the client so
tests can assert NICK/USER/PRIVMSG/QUIT order.
"""
def __init__(self, scripted_lines):
self.writes: list[bytes] = []
self._closed = False
self._scripted = list(scripted_lines)
self._buffer = b""
# writer side ────────────────────────────────────────────────────
def write(self, data: bytes) -> None:
self.writes.append(data)
async def drain(self) -> None:
return None
def close(self) -> None:
self._closed = True
async def wait_closed(self) -> None:
return None
def is_closing(self) -> bool:
return self._closed
# reader side ────────────────────────────────────────────────────
async def readuntil(self, separator: bytes = b"\r\n") -> bytes:
if not self._scripted:
raise asyncio.IncompleteReadError(b"", None)
line = self._scripted.pop(0)
if not line.endswith(b"\r\n"):
line = line + b"\r\n"
return line
async def read(self, n: int = -1) -> bytes:
return b""
class TestIRCStandaloneSend:
@pytest.mark.asyncio
async def test_standalone_send_completes_handshake_and_sends_privmsg(self, monkeypatch):
from gateway.config import PlatformConfig
monkeypatch.setenv("IRC_SERVER", "irc.test.net")
monkeypatch.setenv("IRC_CHANNEL", "#cron")
monkeypatch.setenv("IRC_NICKNAME", "hermesbot")
monkeypatch.setenv("IRC_USE_TLS", "false")
# Server greets us with 001 RPL_WELCOME, then nothing for QUIT drain.
conn = _FakeIRCConnection([b":server 001 hermesbot-cron :Welcome"])
async def _fake_open(host, port, **kwargs):
return conn, conn # reader and writer share the same fake
monkeypatch.setattr(_irc_mod.asyncio, "open_connection", _fake_open)
result = await _standalone_send(
PlatformConfig(enabled=True, extra={}),
"#cron",
"hello from cron",
)
assert result["success"] is True
assert "message_id" in result
sent_lines = b"".join(conn.writes).decode("utf-8").splitlines()
# NICK uses the cron-suffixed identity to avoid colliding with the
# long-running gateway adapter that may already hold the nickname.
assert any(line.startswith("NICK hermesbot-cron") for line in sent_lines)
assert any(line.startswith("USER hermesbot-cron 0 * :Hermes Agent (cron)")
for line in sent_lines)
assert any(line == "PRIVMSG #cron :hello from cron" for line in sent_lines)
assert any(line.startswith("QUIT ") for line in sent_lines)
@pytest.mark.asyncio
async def test_standalone_send_returns_error_when_unconfigured(self, monkeypatch):
from gateway.config import PlatformConfig
for var in ("IRC_SERVER", "IRC_CHANNEL"):
monkeypatch.delenv(var, raising=False)
result = await _standalone_send(
PlatformConfig(enabled=True, extra={}),
"",
"hi",
)
assert "error" in result
assert "IRC_SERVER" in result["error"] or "IRC_CHANNEL" in result["error"]
@pytest.mark.asyncio
async def test_standalone_send_returns_error_on_registration_timeout(self, monkeypatch):
from gateway.config import PlatformConfig
monkeypatch.setenv("IRC_SERVER", "irc.test.net")
monkeypatch.setenv("IRC_CHANNEL", "#cron")
monkeypatch.setenv("IRC_NICKNAME", "hermesbot")
monkeypatch.setenv("IRC_USE_TLS", "false")
# No 001 response: the readuntil call returns IncompleteReadError so
# the registration loop times out via the asyncio wait_for inside.
conn = _FakeIRCConnection([])
async def _fake_open(host, port, **kwargs):
return conn, conn
monkeypatch.setattr(_irc_mod.asyncio, "open_connection", _fake_open)
# Patch wait_for to raise TimeoutError immediately so the test is fast
async def _fast_timeout(coro, timeout):
try:
return await coro
except asyncio.IncompleteReadError:
raise asyncio.TimeoutError()
monkeypatch.setattr(_irc_mod.asyncio, "wait_for", _fast_timeout)
result = await _standalone_send(
PlatformConfig(enabled=True, extra={}),
"#cron",
"hi",
)
assert "error" in result
assert "registration" in result["error"].lower() or "timeout" in result["error"].lower()
@pytest.mark.asyncio
async def test_standalone_send_rejects_crlf_in_chat_id(self, monkeypatch):
from gateway.config import PlatformConfig
monkeypatch.setenv("IRC_SERVER", "irc.test.net")
monkeypatch.setenv("IRC_CHANNEL", "#cron")
monkeypatch.setenv("IRC_NICKNAME", "hermesbot")
monkeypatch.setenv("IRC_USE_TLS", "false")
# Attempt to inject a second IRC command via CRLF in chat_id
result = await _standalone_send(
PlatformConfig(enabled=True, extra={}),
"#cron\r\nKICK #cron hermesbot",
"hi",
)
assert "error" in result
assert "illegal IRC characters" in result["error"]
@pytest.mark.asyncio
async def test_standalone_send_strips_crlf_from_message_body(self, monkeypatch):
from gateway.config import PlatformConfig
monkeypatch.setenv("IRC_SERVER", "irc.test.net")
monkeypatch.setenv("IRC_CHANNEL", "#cron")
monkeypatch.setenv("IRC_NICKNAME", "hermesbot")
monkeypatch.setenv("IRC_USE_TLS", "false")
conn = _FakeIRCConnection([b":server 001 hermesbot-cron :Welcome"])
async def _fake_open(host, port, **kwargs):
return conn, conn
monkeypatch.setattr(_irc_mod.asyncio, "open_connection", _fake_open)
# A bare \r in message content tries to inject a NICK command.
# Our control-char stripper must blank \r so the line stays one PRIVMSG.
result = await _standalone_send(
PlatformConfig(enabled=True, extra={}),
"#cron",
"hello\rNICK eviltwin",
)
sent_lines = b"".join(conn.writes).decode("utf-8").splitlines()
# No injected NICK command after the legitimate registration NICK
nick_lines = [line for line in sent_lines if line.startswith("NICK ")]
# Only the original registration NICK should be present (no injected one)
assert all(line.startswith("NICK hermesbot-cron") for line in nick_lines)
# The PRIVMSG should contain "hello NICK eviltwin" as one line (with \r blanked)
assert any("PRIVMSG #cron :hello NICK eviltwin" in line for line in sent_lines)
@pytest.mark.asyncio
async def test_standalone_send_joins_channel_before_privmsg(self, monkeypatch):
from gateway.config import PlatformConfig
monkeypatch.setenv("IRC_SERVER", "irc.test.net")
monkeypatch.setenv("IRC_CHANNEL", "#cron")
monkeypatch.setenv("IRC_NICKNAME", "hermesbot")
monkeypatch.setenv("IRC_USE_TLS", "false")
# Register, then accept JOIN with 366 RPL_ENDOFNAMES, then PRIVMSG.
conn = _FakeIRCConnection([
b":server 001 hermesbot-cron :Welcome",
b":server 366 hermesbot-cron #cron :End of /NAMES list.",
])
async def _fake_open(host, port, **kwargs):
return conn, conn
monkeypatch.setattr(_irc_mod.asyncio, "open_connection", _fake_open)
result = await _standalone_send(
PlatformConfig(enabled=True, extra={}),
"#cron",
"hello",
)
assert result["success"] is True
sent_lines = b"".join(conn.writes).decode("utf-8").splitlines()
join_idx = next((i for i, line in enumerate(sent_lines) if line.startswith("JOIN #cron")), None)
privmsg_idx = next((i for i, line in enumerate(sent_lines) if line.startswith("PRIVMSG #cron")), None)
assert join_idx is not None, "JOIN must be sent for channel targets"
assert privmsg_idx is not None
assert join_idx < privmsg_idx, "JOIN must precede PRIVMSG"
@@ -10,6 +10,8 @@ The fix: gateway/run.py wraps each adapter connect() with a safety-net
call to _safe_adapter_disconnect() in the failure branches.
"""
import asyncio
import logging
from unittest.mock import AsyncMock, MagicMock
import pytest
@@ -57,3 +59,21 @@ async def test_safe_disconnect_handles_none_platform(bare_runner):
await bare_runner._safe_adapter_disconnect(adapter, None)
adapter.disconnect.assert_awaited_once()
@pytest.mark.asyncio
async def test_safe_disconnect_times_out_and_continues(bare_runner, monkeypatch, caplog):
"""A wedged adapter disconnect must not block gateway shutdown."""
monkeypatch.setenv("HERMES_GATEWAY_ADAPTER_DISCONNECT_TIMEOUT", "0.001")
adapter = MagicMock()
async def hang():
await asyncio.sleep(60)
adapter.disconnect = AsyncMock(side_effect=hang)
with caplog.at_level(logging.WARNING, logger="gateway.run"):
await bare_runner._safe_adapter_disconnect(adapter, Platform.FEISHU)
adapter.disconnect.assert_awaited_once()
assert "Timed out after 0.0s while disconnecting feishu adapter" in caplog.text
+174
View File
@@ -703,3 +703,177 @@ class TestTeamsMessageHandling:
await adapter._on_message(ctx)
assert adapter.handle_message.await_count == 1
# ── _standalone_send (out-of-process cron delivery) ──────────────────────
class _FakeAiohttpResponse:
def __init__(self, status: int, payload, text_body: str = ""):
self.status = status
self._payload = payload
self._text = text_body or (str(payload) if payload is not None else "")
async def json(self):
return self._payload
async def text(self):
return self._text
async def __aenter__(self):
return self
async def __aexit__(self, exc_type, exc, tb):
return None
class _FakeAiohttpSession:
"""Scripted aiohttp.ClientSession with a queue of responses so tests
can assert calls in order."""
def __init__(self, scripts):
self._scripts = list(scripts)
self.calls: list[tuple[str, dict]] = []
async def __aenter__(self):
return self
async def __aexit__(self, exc_type, exc, tb):
return None
def post(self, url, **kwargs):
self.calls.append((url, kwargs))
if not self._scripts:
raise AssertionError(f"No scripted response for POST {url}")
return self._scripts.pop(0)
def _install_fake_aiohttp(monkeypatch, session):
"""Replace ``aiohttp`` in ``sys.modules`` so ``import aiohttp as _aiohttp``
inside ``_standalone_send`` picks up our fake."""
fake_aiohttp = types.SimpleNamespace(
ClientSession=lambda timeout=None: session,
ClientTimeout=lambda total=None: None,
)
monkeypatch.setitem(sys.modules, "aiohttp", fake_aiohttp)
class TestTeamsStandaloneSend:
@pytest.mark.asyncio
async def test_standalone_send_acquires_token_and_posts_activity(self, monkeypatch):
monkeypatch.setenv("TEAMS_CLIENT_ID", "client-id")
monkeypatch.setenv("TEAMS_CLIENT_SECRET", "secret")
monkeypatch.setenv("TEAMS_TENANT_ID", "tenant")
monkeypatch.delenv("TEAMS_SERVICE_URL", raising=False)
token_resp = _FakeAiohttpResponse(200, {"access_token": "the-token"})
activity_resp = _FakeAiohttpResponse(200, {"id": "msg-99"})
session = _FakeAiohttpSession([token_resp, activity_resp])
_install_fake_aiohttp(monkeypatch, session)
result = await _teams_mod._standalone_send(
PlatformConfig(enabled=True, extra={}),
"19:abc@thread.skype",
"hello cron",
)
assert result == {"success": True, "message_id": "msg-99"}
assert len(session.calls) == 2
token_url, token_kwargs = session.calls[0]
assert "login.microsoftonline.com/tenant/oauth2/v2.0/token" in token_url
assert token_kwargs["data"]["client_id"] == "client-id"
assert token_kwargs["data"]["client_secret"] == "secret"
assert token_kwargs["data"]["scope"] == "https://api.botframework.com/.default"
activity_url, activity_kwargs = session.calls[1]
# Default service URL when TEAMS_SERVICE_URL is unset
assert "smba.trafficmanager.net" in activity_url
assert "/v3/conversations/19:abc@thread.skype/activities" in activity_url
assert activity_kwargs["headers"]["Authorization"] == "Bearer the-token"
assert activity_kwargs["json"]["text"] == "hello cron"
assert activity_kwargs["json"]["type"] == "message"
@pytest.mark.asyncio
async def test_standalone_send_returns_error_when_unconfigured(self, monkeypatch):
for var in ("TEAMS_CLIENT_ID", "TEAMS_CLIENT_SECRET", "TEAMS_TENANT_ID"):
monkeypatch.delenv(var, raising=False)
result = await _teams_mod._standalone_send(
PlatformConfig(enabled=True, extra={}),
"19:abc@thread.skype",
"hi",
)
assert "error" in result
assert "TEAMS_CLIENT_ID" in result["error"]
@pytest.mark.asyncio
async def test_standalone_send_propagates_token_failure(self, monkeypatch):
monkeypatch.setenv("TEAMS_CLIENT_ID", "client-id")
monkeypatch.setenv("TEAMS_CLIENT_SECRET", "secret")
monkeypatch.setenv("TEAMS_TENANT_ID", "tenant")
token_resp = _FakeAiohttpResponse(
401,
{"error": "unauthorized_client"},
text_body='{"error":"unauthorized_client"}',
)
session = _FakeAiohttpSession([token_resp])
_install_fake_aiohttp(monkeypatch, session)
result = await _teams_mod._standalone_send(
PlatformConfig(enabled=True, extra={}),
"19:abc@thread.skype",
"hi",
)
assert "error" in result
assert "401" in result["error"]
assert "token" in result["error"].lower()
@pytest.mark.asyncio
async def test_standalone_send_rejects_off_allowlist_service_url(self, monkeypatch):
monkeypatch.setenv("TEAMS_CLIENT_ID", "client-id")
monkeypatch.setenv("TEAMS_CLIENT_SECRET", "secret")
monkeypatch.setenv("TEAMS_TENANT_ID", "tenant")
# SSRF attempt: point us at an attacker-controlled host
monkeypatch.setenv("TEAMS_SERVICE_URL", "https://attacker.example.com/teams/")
# If the allowlist check fails to fire, the fake session will assert
# because no scripts are queued; a passing test means we returned
# before any HTTP call.
session = _FakeAiohttpSession([])
_install_fake_aiohttp(monkeypatch, session)
result = await _teams_mod._standalone_send(
PlatformConfig(enabled=True, extra={}),
"19:abc@thread.skype",
"hi",
)
assert "error" in result
assert "allowlist" in result["error"].lower()
assert len(session.calls) == 0, "must not call any HTTP endpoint with a tampered service URL"
@pytest.mark.asyncio
async def test_standalone_send_rejects_chat_id_with_path_traversal(self, monkeypatch):
monkeypatch.setenv("TEAMS_CLIENT_ID", "client-id")
monkeypatch.setenv("TEAMS_CLIENT_SECRET", "secret")
monkeypatch.setenv("TEAMS_TENANT_ID", "tenant")
monkeypatch.delenv("TEAMS_SERVICE_URL", raising=False)
session = _FakeAiohttpSession([])
_install_fake_aiohttp(monkeypatch, session)
# Attempt to break out of /v3/conversations/<id>/activities via a `/`
result = await _teams_mod._standalone_send(
PlatformConfig(enabled=True, extra={}),
"19:abc/activities/19:other@thread.skype",
"hi",
)
assert "error" in result
assert "Bot Framework conversation ID" in result["error"]
assert len(session.calls) == 0
+41
View File
@@ -716,3 +716,44 @@ async def test_send_escapes_chunk_indicator_for_markdownv2(adapter):
assert len(sent_texts) > 1
assert re.search(r" \\\([0-9]+/[0-9]+\\\)$", sent_texts[0])
assert re.search(r" \\\([0-9]+/[0-9]+\\\)$", sent_texts[-1])
# =========================================================================
# edit_message — streaming Markdown safety
# =========================================================================
class TestEditMessageStreamingSafety:
@pytest.mark.asyncio
async def test_non_final_edit_uses_plain_text_without_markdown(self):
adapter = TelegramAdapter(PlatformConfig(enabled=True, token="fake-token"))
adapter._bot = MagicMock()
adapter._bot.edit_message_text = AsyncMock()
result = await adapter.edit_message("123", "456", "partial **bold", finalize=False)
assert result.success is True
adapter._bot.edit_message_text.assert_awaited_once_with(
chat_id=123,
message_id=456,
text="partial **bold",
)
@pytest.mark.asyncio
async def test_final_edit_uses_markdownv2_with_plain_fallback(self):
adapter = TelegramAdapter(PlatformConfig(enabled=True, token="fake-token"))
adapter._bot = MagicMock()
adapter._bot.edit_message_text = AsyncMock(side_effect=[Exception("bad markdown"), None])
result = await adapter.edit_message("123", "456", "final **bold**", finalize=True)
assert result.success is True
first_call = adapter._bot.edit_message_text.await_args_list[0].kwargs
second_call = adapter._bot.edit_message_text.await_args_list[1].kwargs
assert "parse_mode" in first_call
assert first_call["text"] == "final *bold*"
assert second_call == {
"chat_id": 123,
"message_id": 456,
"text": "final **bold**",
}
+693 -13
View File
@@ -1,13 +1,11 @@
"""Tests for Telegram send() thread_id fallback.
"""Tests for Telegram topic/thread routing fallbacks.
When message_thread_id points to a non-existent thread, Telegram returns
BadRequest('Message thread not found'). Since BadRequest is a subclass of
NetworkError in python-telegram-bot, the old retry loop treated this as a
transient error and retried 3 times before silently failing killing all
tool progress messages, streaming responses, and typing indicators.
The fix detects "thread not found" BadRequest errors and retries the send
WITHOUT message_thread_id so the message still reaches the chat.
Supergroup forum topics route with ``message_thread_id``. Hermes-created
private DM topic lanes are different: live Telegram testing showed they only
stay in the expected lane when sends include both the private topic
``message_thread_id`` and a ``reply_to_message_id`` anchor to the triggering
user message. If either anchor is unavailable or rejected, the adapter must
avoid retrying with a partial topic route that can render outside the lane.
"""
import sys
@@ -17,7 +15,14 @@ from types import SimpleNamespace
import pytest
from gateway.config import PlatformConfig, Platform
from gateway.platforms.base import SendResult
from gateway.platforms.base import (
MessageEvent,
MessageType,
SendResult,
_reply_anchor_for_event,
_thread_metadata_for_source,
)
from gateway.session import build_session_key
# ── Fake telegram.error hierarchy ──────────────────────────────────────
@@ -44,23 +49,48 @@ class FakeRetryAfter(Exception):
# Build a fake telegram module tree so the adapter's internal imports work
class _FakeInlineKeyboardButton:
def __init__(self, text, callback_data=None, **kwargs):
self.text = text
self.callback_data = callback_data
self.kwargs = kwargs
class _FakeInlineKeyboardMarkup:
def __init__(self, inline_keyboard):
self.inline_keyboard = inline_keyboard
class _FakeInputMediaPhoto:
def __init__(self, media, caption=None, **kwargs):
self.media = media
self.caption = caption
self.kwargs = kwargs
_fake_telegram = types.ModuleType("telegram")
_fake_telegram.Update = object
_fake_telegram.Bot = object
_fake_telegram.Message = object
_fake_telegram.InlineKeyboardButton = object
_fake_telegram.InlineKeyboardMarkup = object
_fake_telegram.InlineKeyboardButton = _FakeInlineKeyboardButton
_fake_telegram.InlineKeyboardMarkup = _FakeInlineKeyboardMarkup
_fake_telegram.InputMediaPhoto = _FakeInputMediaPhoto
_fake_telegram_error = types.ModuleType("telegram.error")
_fake_telegram_error.NetworkError = FakeNetworkError
_fake_telegram_error.BadRequest = FakeBadRequest
_fake_telegram_error.TimedOut = FakeTimedOut
_fake_telegram.error = _fake_telegram_error
_fake_telegram_constants = types.ModuleType("telegram.constants")
_fake_telegram_constants.ParseMode = SimpleNamespace(MARKDOWN_V2="MarkdownV2")
_fake_telegram_constants.ParseMode = SimpleNamespace(
MARKDOWN_V2="MarkdownV2",
MARKDOWN="Markdown",
HTML="HTML",
)
_fake_telegram_constants.ChatType = SimpleNamespace(
GROUP="group",
SUPERGROUP="supergroup",
CHANNEL="channel",
PRIVATE="private",
)
_fake_telegram.constants = _fake_telegram_constants
_fake_telegram_ext = types.ModuleType("telegram.ext")
@@ -205,6 +235,36 @@ async def test_send_typing_does_not_fall_back_to_root_for_dm_topic():
]
@pytest.mark.asyncio
async def test_send_typing_skips_api_call_for_dm_topic_reply_fallback():
"""Hermes-created DM topic lanes have no working Bot API typing route.
``send_chat_action`` only accepts ``message_thread_id``, which Telegram's
Bot API 10.0 rejects for these lanes the call would silently fail and
log a "thread not found" warning every typing tick (every 2s). Skipping
the call entirely keeps logs clean while preserving the user-visible
behavior (no typing indicator either way for these lanes).
"""
adapter = _make_adapter()
call_log = []
async def mock_send_chat_action(**kwargs):
call_log.append(dict(kwargs))
adapter._bot = SimpleNamespace(send_chat_action=mock_send_chat_action)
await adapter.send_typing(
"12345",
metadata={
"thread_id": "20197",
"telegram_dm_topic_reply_fallback": True,
"telegram_reply_to_message_id": "462",
},
)
assert call_log == []
@pytest.mark.asyncio
async def test_send_retries_without_thread_on_thread_not_found():
"""When message_thread_id causes 'thread not found', retry without it."""
@@ -235,6 +295,626 @@ async def test_send_retries_without_thread_on_thread_not_found():
assert call_log[1]["message_thread_id"] is None
@pytest.mark.asyncio
async def test_send_private_dm_topic_uses_direct_messages_topic_id():
"""Private Telegram topics route sends via direct_messages_topic_id."""
adapter = _make_adapter()
call_log = []
async def mock_send_message(**kwargs):
call_log.append(dict(kwargs))
return SimpleNamespace(message_id=42)
adapter._bot = SimpleNamespace(send_message=mock_send_message)
result = await adapter.send(
chat_id="123",
content="test message",
metadata={"thread_id": "99999", "direct_messages_topic_id": "99999"},
)
assert result.success is True
assert call_log[0]["message_thread_id"] is None
assert call_log[0]["direct_messages_topic_id"] == 99999
def test_base_gateway_metadata_marks_telegram_dm_topics_as_reply_fallback():
source = SimpleNamespace(
platform=Platform.TELEGRAM,
chat_type="dm",
thread_id="20189",
)
metadata = _thread_metadata_for_source(source, "462")
assert metadata == {
"thread_id": "20189",
"telegram_dm_topic_reply_fallback": True,
"telegram_reply_to_message_id": "462",
}
def test_base_gateway_replies_to_triggering_message_for_telegram_dm_topic():
"""Private DM topic lanes should anchor replies to the active user message."""
event = SimpleNamespace(
message_id="463",
reply_to_message_id="462",
source=SimpleNamespace(
platform=Platform.TELEGRAM,
chat_type="dm",
thread_id="20189",
),
)
assert _reply_anchor_for_event(event) == "463"
@pytest.mark.asyncio
async def test_gateway_runner_busy_ack_replies_to_triggering_message_for_telegram_dm_topic(monkeypatch, tmp_path):
"""GatewayRunner's duplicate thread metadata must match the base helper."""
from gateway import run as gateway_run
monkeypatch.setattr(gateway_run, "_hermes_home", tmp_path)
GatewayRunner = gateway_run.GatewayRunner
class BusyAdapter:
def __init__(self):
self._pending_messages = {}
self.calls = []
async def _send_with_retry(self, **kwargs):
self.calls.append(kwargs)
return SendResult(success=True, message_id="ack-1")
class BusyAgent:
def interrupt(self, _text):
return None
def get_activity_summary(self):
return {}
source = SimpleNamespace(
platform=Platform.TELEGRAM,
chat_id="12345",
chat_type="dm",
thread_id="20197",
user_id="user-1",
)
event = MessageEvent(
text="busy follow-up",
message_type=MessageType.TEXT,
source=source,
message_id="463",
reply_to_message_id="462",
)
session_key = build_session_key(source)
adapter = BusyAdapter()
runner = object.__new__(GatewayRunner)
runner.adapters = {Platform.TELEGRAM: adapter}
runner._running_agents = {session_key: BusyAgent()}
runner._running_agents_ts = {}
runner._pending_messages = {}
runner._busy_ack_ts = {}
runner._draining = False
runner._busy_input_mode = "interrupt"
runner._is_user_authorized = lambda _source: True
assert await runner._handle_active_session_busy_message(event, session_key) is True
assert adapter.calls
assert adapter.calls[0]["reply_to"] == "463"
assert adapter.calls[0]["metadata"] == {
"thread_id": "20197",
"telegram_dm_topic_reply_fallback": True,
"telegram_reply_to_message_id": "463",
}
@pytest.mark.asyncio
async def test_send_uses_reply_fallback_for_hermes_dm_topics():
"""Hermes-created Telegram DM topics route with thread id plus reply anchor."""
adapter = _make_adapter()
call_log = []
async def mock_send_message(**kwargs):
call_log.append(kwargs)
return SimpleNamespace(message_id=777)
adapter._bot = SimpleNamespace(send_message=mock_send_message)
result = await adapter.send(
chat_id="123",
content="test message",
reply_to="462",
metadata={
"thread_id": "20197",
"telegram_dm_topic_reply_fallback": True,
},
)
assert result.success is True
assert call_log[0]["reply_to_message_id"] == 462
assert call_log[0]["message_thread_id"] == 20197
assert "direct_messages_topic_id" not in call_log[0]
@pytest.mark.asyncio
async def test_send_uses_metadata_reply_fallback_for_streaming_dm_topics():
"""Metadata-only sends still stay in Hermes-created Telegram DM topics."""
adapter = _make_adapter()
call_log = []
async def mock_send_message(**kwargs):
call_log.append(kwargs)
return SimpleNamespace(message_id=778)
adapter._bot = SimpleNamespace(send_message=mock_send_message)
result = await adapter.send(
chat_id="123",
content="streamed text",
metadata={
"thread_id": "20197",
"telegram_dm_topic_reply_fallback": True,
"telegram_reply_to_message_id": "462",
},
)
assert result.success is True
assert call_log[0]["reply_to_message_id"] == 462
assert call_log[0]["message_thread_id"] == 20197
assert "direct_messages_topic_id" not in call_log[0]
@pytest.mark.asyncio
async def test_send_reply_fallback_applies_to_every_chunk_for_dm_topics():
"""Long Telegram DM-topic fallback sends must anchor every chunk."""
adapter = _make_adapter()
call_log = []
async def mock_send_message(**kwargs):
call_log.append(dict(kwargs))
return SimpleNamespace(message_id=len(call_log))
adapter._bot = SimpleNamespace(send_message=mock_send_message)
result = await adapter.send(
chat_id="123",
content="A" * 5000,
metadata={
"thread_id": "20197",
"telegram_dm_topic_reply_fallback": True,
"telegram_reply_to_message_id": "462",
},
)
assert result.success is True
assert len(call_log) > 1
assert all(call["reply_to_message_id"] == 462 for call in call_log)
assert all(call["message_thread_id"] == 20197 for call in call_log)
assert all("direct_messages_topic_id" not in call for call in call_log)
@pytest.mark.asyncio
async def test_send_model_picker_uses_metadata_reply_fallback_for_dm_topics():
"""Inline keyboard sends also consume the metadata reply fallback."""
adapter = _make_adapter()
adapter._model_picker_state = {}
call_log = []
async def mock_send_message(**kwargs):
call_log.append(kwargs)
return SimpleNamespace(message_id=779)
adapter._bot = SimpleNamespace(send_message=mock_send_message)
result = await adapter.send_model_picker(
chat_id="123",
providers=[{"name": "OpenAI", "slug": "openai", "models": [], "total_models": 0}],
current_model="gpt-test",
current_provider="openai",
session_key="telegram:123:20197",
on_model_selected=lambda *_: None,
metadata={
"thread_id": "20197",
"telegram_dm_topic_reply_fallback": True,
"telegram_reply_to_message_id": "462",
},
)
assert result.success is True
assert call_log[0]["reply_to_message_id"] == 462
assert call_log[0]["message_thread_id"] == 20197
assert "direct_messages_topic_id" not in call_log[0]
@pytest.mark.asyncio
async def test_send_dm_topic_fallback_without_anchor_does_not_crash():
"""DM-topic fallback without an anchor must not use message_thread_id alone."""
adapter = _make_adapter()
call_log = []
async def mock_send_message(**kwargs):
call_log.append(dict(kwargs))
return SimpleNamespace(message_id=780)
adapter._bot = SimpleNamespace(send_message=mock_send_message)
result = await adapter.send(
chat_id="123",
content="source-only send",
metadata={
"thread_id": "20197",
"telegram_dm_topic_reply_fallback": True,
},
)
assert result.success is True
assert call_log[0]["reply_to_message_id"] is None
assert "message_thread_id" not in call_log[0]
assert "direct_messages_topic_id" not in call_log[0]
@pytest.mark.asyncio
async def test_send_dm_topic_reply_not_found_retry_drops_thread_id():
"""If Telegram deletes the reply anchor, private-topic retry must drop thread id too."""
adapter = _make_adapter()
call_log = []
async def mock_send_message(**kwargs):
call_log.append(dict(kwargs))
if len(call_log) == 1:
raise FakeBadRequest("Message to be replied not found")
return SimpleNamespace(message_id=781)
adapter._bot = SimpleNamespace(send_message=mock_send_message)
result = await adapter.send(
chat_id="123",
content="anchor disappeared",
metadata={
"thread_id": "20197",
"telegram_dm_topic_reply_fallback": True,
"telegram_reply_to_message_id": "462",
},
)
assert result.success is True
assert call_log[0]["reply_to_message_id"] == 462
assert call_log[0]["message_thread_id"] == 20197
assert call_log[1]["reply_to_message_id"] is None
assert "message_thread_id" not in call_log[1]
assert "direct_messages_topic_id" not in call_log[1]
@pytest.mark.asyncio
@pytest.mark.parametrize(
("method_name", "bot_method_name", "path_kw", "filename", "payload"),
[
("send_image_file", "send_photo", "image_path", "photo.png", b"png-data"),
("send_document", "send_document", "file_path", "report.txt", b"report-data"),
("send_video", "send_video", "video_path", "clip.mp4", b"video-data"),
("send_voice", "send_voice", "audio_path", "clip.ogg", b"ogg-data"),
("send_voice", "send_audio", "audio_path", "clip.mp3", b"mp3-data"),
],
)
async def test_native_media_dm_topic_reply_not_found_retry_drops_thread_id(
tmp_path,
method_name,
bot_method_name,
path_kw,
filename,
payload,
):
adapter = _make_adapter()
media_path = tmp_path / filename
media_path.write_bytes(payload)
call_log = []
async def mock_send_media(**kwargs):
call_log.append(dict(kwargs))
if len(call_log) == 1:
raise FakeBadRequest("Message to be replied not found")
return SimpleNamespace(message_id=782)
adapter._bot = SimpleNamespace(**{bot_method_name: mock_send_media})
result = await getattr(adapter, method_name)(
chat_id="123",
**{path_kw: str(media_path)},
metadata={
"thread_id": "20197",
"telegram_dm_topic_reply_fallback": True,
"telegram_reply_to_message_id": "462",
},
)
assert result.success is True
assert call_log[0]["reply_to_message_id"] == 462
assert call_log[0]["message_thread_id"] == 20197
assert call_log[1]["reply_to_message_id"] is None
assert "message_thread_id" not in call_log[1]
assert "direct_messages_topic_id" not in call_log[1]
@pytest.mark.asyncio
async def test_animation_dm_topic_reply_not_found_retry_drops_thread_id():
adapter = _make_adapter()
call_log = []
async def mock_send_animation(**kwargs):
call_log.append(dict(kwargs))
if len(call_log) == 1:
raise FakeBadRequest("Message to be replied not found")
return SimpleNamespace(message_id=786)
adapter._bot = SimpleNamespace(send_animation=mock_send_animation)
result = await adapter.send_animation(
chat_id="123",
animation_url="https://example.com/anim.gif",
metadata={
"thread_id": "20197",
"telegram_dm_topic_reply_fallback": True,
"telegram_reply_to_message_id": "462",
},
)
assert result.success is True
assert call_log[0]["reply_to_message_id"] == 462
assert call_log[0]["message_thread_id"] == 20197
assert call_log[1]["reply_to_message_id"] is None
assert "message_thread_id" not in call_log[1]
assert "direct_messages_topic_id" not in call_log[1]
@pytest.mark.asyncio
async def test_media_group_dm_topic_reply_not_found_retry_drops_thread_id(tmp_path):
adapter = _make_adapter()
image_path = tmp_path / "photo.png"
image_path.write_bytes(b"png-data")
call_log = []
async def mock_send_media_group(**kwargs):
call_log.append(dict(kwargs))
if len(call_log) == 1:
raise FakeBadRequest("Message to be replied not found")
return [SimpleNamespace(message_id=783)]
adapter._bot = SimpleNamespace(send_media_group=mock_send_media_group)
await adapter.send_multiple_images(
chat_id="123",
images=[(f"file://{image_path}", "caption")],
metadata={
"thread_id": "20197",
"telegram_dm_topic_reply_fallback": True,
"telegram_reply_to_message_id": "462",
},
)
assert call_log[0]["reply_to_message_id"] == 462
assert call_log[0]["message_thread_id"] == 20197
assert call_log[1]["reply_to_message_id"] is None
assert "message_thread_id" not in call_log[1]
assert "direct_messages_topic_id" not in call_log[1]
@pytest.mark.asyncio
async def test_send_image_url_dm_topic_reply_not_found_retry_drops_thread_id(monkeypatch):
adapter = _make_adapter()
call_log = []
async def mock_send_photo(**kwargs):
call_log.append(dict(kwargs))
if len(call_log) == 1:
raise FakeBadRequest("Message to be replied not found")
return SimpleNamespace(message_id=784)
adapter._bot = SimpleNamespace(send_photo=mock_send_photo)
import tools.url_safety as url_safety
monkeypatch.setattr(url_safety, "is_safe_url", lambda _url: True)
result = await adapter.send_image(
chat_id="123",
image_url="https://example.com/photo.png",
metadata={
"thread_id": "20197",
"telegram_dm_topic_reply_fallback": True,
"telegram_reply_to_message_id": "462",
},
)
assert result.success is True
assert call_log[0]["reply_to_message_id"] == 462
assert call_log[0]["message_thread_id"] == 20197
assert call_log[1]["reply_to_message_id"] is None
assert "message_thread_id" not in call_log[1]
assert "direct_messages_topic_id" not in call_log[1]
@pytest.mark.asyncio
async def test_send_image_upload_dm_topic_reply_not_found_retry_drops_thread_id(monkeypatch):
adapter = _make_adapter()
call_log = []
async def mock_send_photo(**kwargs):
call_log.append(dict(kwargs))
if len(call_log) == 1:
raise RuntimeError("URL is too large")
if len(call_log) == 2:
raise FakeBadRequest("Message to be replied not found")
return SimpleNamespace(message_id=785)
class _FakeResponse:
content = b"image-data"
def raise_for_status(self):
return None
class _FakeAsyncClient:
def __init__(self, *args, **kwargs):
pass
async def __aenter__(self):
return self
async def __aexit__(self, *args):
return None
async def get(self, _url):
return _FakeResponse()
monkeypatch.setitem(
sys.modules,
"httpx",
SimpleNamespace(AsyncClient=_FakeAsyncClient),
)
adapter._bot = SimpleNamespace(send_photo=mock_send_photo)
import tools.url_safety as url_safety
monkeypatch.setattr(url_safety, "is_safe_url", lambda _url: True)
result = await adapter.send_image(
chat_id="123",
image_url="https://example.com/photo.png",
metadata={
"thread_id": "20197",
"telegram_dm_topic_reply_fallback": True,
"telegram_reply_to_message_id": "462",
},
)
assert result.success is True
assert call_log[0]["reply_to_message_id"] == 462
assert call_log[0]["message_thread_id"] == 20197
assert call_log[1]["reply_to_message_id"] == 462
assert call_log[1]["message_thread_id"] == 20197
assert call_log[2]["reply_to_message_id"] is None
assert "message_thread_id" not in call_log[2]
assert "direct_messages_topic_id" not in call_log[2]
@pytest.mark.asyncio
async def test_slash_confirm_private_topic_callback_followup_sends_thread_and_reply(monkeypatch):
adapter = _make_adapter()
adapter._slash_confirm_state = {"confirm-1": "session-1"}
adapter._is_callback_user_authorized = lambda *args, **kwargs: True
call_log = []
async def mock_send_message(**kwargs):
call_log.append(dict(kwargs))
return SimpleNamespace(message_id=9001)
async def resolve(_session_key, _confirm_id, _choice):
return "done"
from tools import slash_confirm
monkeypatch.setattr(slash_confirm, "resolve", resolve)
adapter._bot = SimpleNamespace(send_message=mock_send_message)
class Query:
data = "sc:once:confirm-1"
from_user = SimpleNamespace(id=42, first_name="Alice")
message = SimpleNamespace(
chat_id=12345,
chat=SimpleNamespace(type=_fake_telegram_constants.ChatType.PRIVATE),
message_thread_id=20197,
message_id=462,
)
async def answer(self, **kwargs):
return None
async def edit_message_text(self, **kwargs):
return None
await adapter._handle_callback_query(SimpleNamespace(callback_query=Query()), SimpleNamespace())
assert call_log
assert call_log[0]["message_thread_id"] == 20197
assert call_log[0]["reply_to_message_id"] == 462
@pytest.mark.asyncio
async def test_slash_confirm_forum_callback_followup_keeps_existing_thread_behavior(monkeypatch):
adapter = _make_adapter()
adapter._slash_confirm_state = {"confirm-1": "session-1"}
adapter._is_callback_user_authorized = lambda *args, **kwargs: True
call_log = []
async def mock_send_message(**kwargs):
call_log.append(dict(kwargs))
return SimpleNamespace(message_id=9001)
async def resolve(_session_key, _confirm_id, _choice):
return "done"
from tools import slash_confirm
monkeypatch.setattr(slash_confirm, "resolve", resolve)
adapter._bot = SimpleNamespace(send_message=mock_send_message)
class Query:
data = "sc:once:confirm-1"
from_user = SimpleNamespace(id=42, first_name="Alice")
message = SimpleNamespace(
chat_id=-100123,
chat=SimpleNamespace(type=_fake_telegram_constants.ChatType.SUPERGROUP),
message_thread_id=20197,
message_id=462,
)
async def answer(self, **kwargs):
return None
async def edit_message_text(self, **kwargs):
return None
await adapter._handle_callback_query(SimpleNamespace(callback_query=Query()), SimpleNamespace())
assert call_log
assert call_log[0]["message_thread_id"] == 20197
assert "reply_to_message_id" not in call_log[0]
assert "direct_messages_topic_id" not in call_log[0]
@pytest.mark.asyncio
async def test_base_send_image_fallback_preserves_metadata():
"""Base image fallback should pass metadata through instead of referencing kwargs."""
from gateway.platforms.base import BasePlatformAdapter
class _ConcreteBaseAdapter(BasePlatformAdapter):
async def connect(self):
return True
async def disconnect(self):
return None
async def send(self, **kwargs):
call_log.append(kwargs)
return SendResult(success=True, message_id="781")
async def get_chat_info(self, chat_id):
return None
call_log = []
adapter = _ConcreteBaseAdapter(Platform.TELEGRAM, None)
metadata = {"thread_id": "20197"}
result = await adapter.send_image(
chat_id="123",
image_url="https://example.invalid/image.png",
metadata=metadata,
)
assert result.success is True
assert call_log[0]["metadata"] is metadata
@pytest.mark.asyncio
async def test_send_raises_on_other_bad_request():
"""Non-thread BadRequest errors should NOT be retried — they fail immediately."""
+31
View File
@@ -433,6 +433,37 @@ class TestSendVoiceReply:
call_args = mock_adapter.send_voice.call_args
assert call_args.kwargs.get("chat_id") == "123"
@pytest.mark.asyncio
async def test_auto_voice_reply_uses_thread_metadata_helper(self, runner):
from gateway.config import Platform
mock_adapter = AsyncMock()
mock_adapter.send_voice = AsyncMock()
event = _make_event()
event.source.platform = Platform.TELEGRAM
event.source.chat_type = "dm"
event.source.thread_id = "20197"
event.message_id = "462"
runner.adapters[event.source.platform] = mock_adapter
tts_result = json.dumps({"success": True, "file_path": "/tmp/test.ogg"})
with patch("tools.tts_tool.text_to_speech_tool", return_value=tts_result), \
patch("tools.tts_tool._strip_markdown_for_tts", side_effect=lambda t: t), \
patch("os.path.isfile", return_value=True), \
patch("os.unlink"), \
patch("os.makedirs"):
await runner._send_voice_reply(event, "Hello world")
mock_adapter.send_voice.assert_called_once()
call_kwargs = mock_adapter.send_voice.call_args.kwargs
assert call_kwargs["reply_to"] == "462"
assert call_kwargs["metadata"] == {
"thread_id": "20197",
"telegram_dm_topic_reply_fallback": True,
"telegram_reply_to_message_id": "462",
}
@pytest.mark.asyncio
async def test_empty_text_after_strip_skips(self, runner):
event = _make_event()
+107
View File
@@ -0,0 +1,107 @@
"""Tests for the codex-cli external-process provider."""
from __future__ import annotations
import os
from unittest.mock import patch
import pytest
# CRITICAL: import directly from the module to avoid module-level side effects
from hermes_cli.auth import (
PROVIDER_REGISTRY,
get_external_process_provider_status,
get_auth_status,
resolve_external_process_provider_credentials,
)
class TestCodexCLIProviderRegistry:
"""Test that the codex-cli provider is correctly registered."""
def test_provider_registered(self):
assert "codex-cli" in PROVIDER_REGISTRY
pconfig = PROVIDER_REGISTRY["codex-cli"]
assert pconfig.name == "OpenAI Codex CLI"
assert pconfig.auth_type == "external_process"
assert pconfig.inference_base_url == "codex-cli://local"
assert pconfig.base_url_env_var == "CODEX_CLI_BASE_URL"
def test_aliases_resolve(self):
from hermes_cli.auth import resolve_provider
assert resolve_provider("codexcli") == "codex-cli"
assert resolve_provider("openai-codex-cli") == "codex-cli"
class TestCodexCLIStatus:
"""Test the external-process status helper for codex-cli."""
def test_status_not_configured_when_codex_missing(self):
with patch.dict(os.environ, {}, clear=True):
status = get_external_process_provider_status("codex-cli")
assert status["configured"] is False
assert status["provider"] == "codex-cli"
def test_status_configured_when_codex_exists(self):
with patch.dict(os.environ, {"PATH": "/usr/bin:/bin"}):
with patch("shutil.which", return_value="/opt/homebrew/bin/codex"):
status = get_external_process_provider_status("codex-cli")
assert status["configured"] is True
assert status["provider"] == "codex-cli"
assert status["resolved_command"] == "/opt/homebrew/bin/codex"
assert status["command"] == "codex"
def test_auth_status_dispatches(self):
with patch.dict(os.environ, {}, clear=True):
status = get_auth_status("codex-cli")
# Should not throw, returns a dict even when not configured
assert isinstance(status, dict)
assert "configured" in status or "logged_in" in status
def test_status_with_custom_command_env(self):
with patch.dict(os.environ, {"HERMES_CODEX_CLI_COMMAND": "/usr/local/bin/my-codex"}, clear=False):
status = get_external_process_provider_status("codex-cli")
assert status["command"] == "/usr/local/bin/my-codex"
assert status["command"] == "/usr/local/bin/my-codex"
def test_status_with_custom_args_env(self):
with patch.dict(os.environ, {
"HERMES_CODEX_CLI_ARGS": "exec --json --model gpt-5.5",
}, clear=False):
status = get_external_process_provider_status("codex-cli")
assert "exec" in status["args"]
assert "--json" in status["args"]
assert "--model" in status["args"]
def test_status_unknown_provider(self):
status = get_external_process_provider_status("nonexistent")
assert status == {"configured": False}
class TestCodexCLICredentials:
"""Test the credential resolver for codex-cli."""
def test_resolves_command_path_when_available(self):
with patch.dict(os.environ, {}, clear=True):
with patch("shutil.which", return_value="/opt/homebrew/bin/codex"):
creds = resolve_external_process_provider_credentials("codex-cli")
assert creds["provider"] == "codex-cli"
assert creds["command"] == "/opt/homebrew/bin/codex"
assert creds["api_key"] == "codex-cli"
assert creds["base_url"] == "codex-cli://local"
assert "--json" in creds["args"]
assert "--ephemeral" in creds["args"]
def test_raises_when_command_missing(self):
with patch.dict(os.environ, {}, clear=True):
with patch("shutil.which", return_value=None):
with pytest.raises(Exception) as exc_info:
resolve_external_process_provider_credentials("codex-cli")
assert "codex-cli" in str(exc_info.value).lower() or "codex" in str(exc_info.value).lower()
def test_custom_command_from_env(self):
with patch.dict(os.environ, {"HERMES_CODEX_CLI_COMMAND": "/usr/local/bin/custom-codex"}, clear=False):
with patch("shutil.which", return_value="/usr/local/bin/custom-codex"):
creds = resolve_external_process_provider_credentials("codex-cli")
assert creds["command"] == "/usr/local/bin/custom-codex"
+62
View File
@@ -140,6 +140,68 @@ class TestSystemdServiceRefresh:
assert markers == [321]
assert calls == [["stop", gateway_cli.get_service_name()]]
def test_systemd_stop_timeout_prints_status_guidance(self, monkeypatch, capsys):
markers = []
monkeypatch.setattr(gateway_cli, "_select_systemd_scope", lambda system=False: False)
monkeypatch.setattr(gateway_cli, "_require_service_installed", lambda action, system=False: None)
monkeypatch.setattr(status, "get_running_pid", lambda cleanup_stale=True: 321)
monkeypatch.setattr(
status,
"write_planned_stop_marker",
lambda pid: markers.append(pid) or True,
)
def fake_run_systemctl(args, **kwargs):
raise subprocess.TimeoutExpired(args, kwargs.get("timeout"))
monkeypatch.setattr(gateway_cli, "_run_systemctl", fake_run_systemctl)
gateway_cli.systemd_stop()
assert markers == [321]
output = capsys.readouterr().out
assert "still stopping after 90s" in output
assert "hermes gateway status" in output
def test_systemd_restart_timeout_prints_status_guidance(self, monkeypatch, capsys):
"""`hermes gateway restart` must not surface a raw TimeoutExpired traceback.
The dashboard spawns `hermes gateway restart` in the background; when a
wedged adapter websocket pushes drain past the 90s CLI timeout, the
dashboard would previously show a Python traceback (issue #19937
follow-up: the same failure mode applies to restart, not just stop).
"""
monkeypatch.setattr(gateway_cli, "_select_systemd_scope", lambda system=False: False)
monkeypatch.setattr(gateway_cli, "_require_service_installed", lambda action, system=False: None)
monkeypatch.setattr(gateway_cli, "_preflight_user_systemd", lambda: None)
monkeypatch.setattr(gateway_cli, "refresh_systemd_unit_if_needed", lambda system=False: None)
monkeypatch.setattr(status, "get_running_pid", lambda cleanup_stale=True: None)
monkeypatch.setattr(gateway_cli, "_systemd_main_pid", lambda system=False: None)
monkeypatch.setattr(
gateway_cli,
"_recover_pending_systemd_restart",
lambda system=False, previous_pid=None: False,
)
monkeypatch.setattr(
gateway_cli,
"_systemd_service_is_start_limited",
lambda system=False: False,
)
def fake_run_systemctl(args, **kwargs):
# reset-failed is a pre-step (check=False, 30s) — let it pass.
if args and args[0] == "reset-failed":
return SimpleNamespace(returncode=0, stdout="", stderr="")
raise subprocess.TimeoutExpired(args, kwargs.get("timeout"))
monkeypatch.setattr(gateway_cli, "_run_systemctl", fake_run_systemctl)
gateway_cli.systemd_restart()
output = capsys.readouterr().out
assert "still restarting after 90s" in output
assert "hermes gateway status" in output
def test_run_gateway_refreshes_outdated_unit_on_boot(self, tmp_path, monkeypatch):
"""run_gateway() should refresh the systemd unit on boot so that
+52
View File
@@ -914,3 +914,55 @@ def test_latest_summaries_batch_omits_tasks_without_summary(kanban_home):
assert out == {t1: "alpha", t3: "charlie"}
# Empty input → empty dict, no SQL syntax error from "IN ()".
assert kb.latest_summaries(conn, []) == {}
# ---------------------------------------------------------------------------
# NFS / network-filesystem fallback (see hermes_state.apply_wal_with_fallback)
# ---------------------------------------------------------------------------
def test_connect_falls_back_to_delete_on_locking_protocol(kanban_home, caplog):
"""kanban_db.connect() must handle ``locking protocol`` on NFS/SMB.
Without this fallback, the gateway's kanban dispatcher crashes every
60s and the kanban migration (``consecutive_failures`` ADD COLUMN) is
retried forever which is what the real-world user report shows
(see hermes-agent issue #22032).
"""
import sqlite3 as _sqlite3
from unittest.mock import patch as _patch
# Clear module cache so a fresh connect() is attempted
kb._INITIALIZED_PATHS.clear()
real_connect = _sqlite3.connect
class _WalBlockingConnection(_sqlite3.Connection):
def execute(self, sql, *args, **kwargs): # type: ignore[override]
if "journal_mode=wal" in sql.lower().replace(" ", ""):
raise _sqlite3.OperationalError("locking protocol")
return super().execute(sql, *args, **kwargs)
def wal_blocking_connect(*args, **kwargs):
return real_connect(
*args, factory=_WalBlockingConnection, **kwargs
)
with _patch("hermes_cli.kanban_db.sqlite3.connect", side_effect=wal_blocking_connect):
with caplog.at_level("WARNING", logger="hermes_state"):
conn = kb.connect()
# One fallback warning, naming kanban.db
warnings = [
r for r in caplog.records
if r.levelname == "WARNING" and "kanban.db" in r.getMessage()
]
assert len(warnings) >= 1, (
f"Expected a kanban.db WARNING, got: {[r.getMessage() for r in caplog.records]}"
)
# DB still usable end-to-end — create + list a task
t = kb.create_task(conn, title="post-fallback task")
tasks = kb.list_tasks(conn)
assert any(row.id == t for row in tasks)
conn.close()
+58
View File
@@ -244,6 +244,64 @@ class TestCreateProfile:
assert (profile_dir / "memories" / "note.md").read_text() == "remember this"
assert not (profile_dir / "profiles").exists()
def test_clone_all_excludes_default_infrastructure(self, profile_env):
"""--clone-all from default profile excludes hermes-agent, .worktrees,
bin, node_modules at root, plus __pycache__/*.pyc/*.pyo/*.sock/*.tmp
at any depth. Profile data (config, env, skills, sessions, logs,
state.db) must be preserved clone-all means "complete snapshot
minus infrastructure."
"""
tmp_path = profile_env
default_home = tmp_path / ".hermes"
# Simulate infrastructure dirs that only the default profile has
(default_home / "hermes-agent" / ".git").mkdir(parents=True)
(default_home / "hermes-agent" / "venv" / "bin").mkdir(parents=True)
(default_home / "hermes-agent" / "README.md").write_text("repo")
(default_home / ".worktrees" / "some-tree").mkdir(parents=True)
(default_home / "profiles" / "other").mkdir(parents=True)
(default_home / "profiles" / "other" / "config.yaml").write_text("x")
(default_home / "bin").mkdir(exist_ok=True)
(default_home / "bin" / "tool").write_text("binary")
(default_home / "node_modules" / ".package-lock.json").mkdir(parents=True)
# Bytecode + temp files at nested depth (universal exclusion)
(default_home / "skills" / "my-skill" / "__pycache__").mkdir(parents=True)
(default_home / "skills" / "my-skill" / "__pycache__" / "module.cpython-311.pyc").write_text("stale")
(default_home / "skills" / "my-skill" / "module.pyc").write_text("stale")
(default_home / "skills" / "my-skill" / "module.pyo").write_text("stale")
(default_home / "data.sock").write_text("socket")
(default_home / "data.tmp").write_text("tmp")
# Profile data that SHOULD be copied
(default_home / "skills" / "my-skill").mkdir(parents=True, exist_ok=True)
(default_home / "skills" / "my-skill" / "SKILL.md").write_text("skill")
(default_home / "config.yaml").write_text("model: gpt-4")
(default_home / ".env").write_text("KEY=val")
(default_home / "state.db").write_text("sessions-data")
(default_home / "sessions").mkdir(exist_ok=True)
(default_home / "logs").mkdir(exist_ok=True)
(default_home / "logs" / "gateway.log").write_text("log")
profile_dir = create_profile("cloned", clone_all=True, no_alias=True)
# Infrastructure must be excluded
assert not (profile_dir / "hermes-agent").exists()
assert not (profile_dir / ".worktrees").exists()
assert not (profile_dir / "profiles").exists()
assert not (profile_dir / "bin").exists()
assert not (profile_dir / "node_modules").exists()
# Universal exclusions at any depth
assert not (profile_dir / "data.sock").exists()
assert not (profile_dir / "data.tmp").exists()
assert not (profile_dir / "skills" / "my-skill" / "__pycache__").exists()
assert not (profile_dir / "skills" / "my-skill" / "module.pyc").exists()
assert not (profile_dir / "skills" / "my-skill" / "module.pyo").exists()
# All profile data must be present
assert (profile_dir / "skills" / "my-skill" / "SKILL.md").read_text() == "skill"
assert (profile_dir / "config.yaml").read_text() == "model: gpt-4"
assert (profile_dir / ".env").read_text() == "KEY=val"
assert (profile_dir / "state.db").read_text() == "sessions-data"
assert (profile_dir / "sessions").exists()
assert (profile_dir / "logs" / "gateway.log").read_text() == "log"
def test_clone_config_missing_files_skipped(self, profile_env):
"""Clone config gracefully skips files that don't exist in source."""
profile_dir = create_profile("coder", clone_config=True, no_alias=True)
+30
View File
@@ -0,0 +1,30 @@
"""Tests for Slack CLI helpers."""
from hermes_cli.slack_cli import _build_full_manifest
class TestSlackFullManifest:
"""Generated full Slack app manifest used by `hermes slack manifest`."""
def test_app_home_messages_are_writable(self):
manifest = _build_full_manifest("Hermes", "Your Hermes agent on Slack")
assert manifest["features"]["app_home"] == {
"home_tab_enabled": False,
"messages_tab_enabled": True,
"messages_tab_read_only_enabled": False,
}
def test_private_channel_directory_scope_is_included(self):
manifest = _build_full_manifest("Hermes", "Your Hermes agent on Slack")
bot_scopes = manifest["oauth_config"]["scopes"]["bot"]
assert "groups:read" in bot_scopes
def test_assistant_features_remain_enabled(self):
manifest = _build_full_manifest("Hermes", "Your Hermes agent on Slack")
assert "assistant_view" in manifest["features"]
assert "assistant:write" in manifest["oauth_config"]["scopes"]["bot"]
bot_events = manifest["settings"]["event_subscriptions"]["bot_events"]
assert "assistant_thread_started" in bot_events
@@ -314,10 +314,11 @@ def test_viking_client_headers_include_bearer_when_api_key_set():
assert headers["Authorization"] == "Bearer test-key"
def test_viking_client_headers_omit_tenant_when_legacy_default():
# Existing installs have account/user set to the literal string "default".
# Those should NOT be sent as headers — the server would interpret that
# as a real tenant override and reject/misroute requests.
def test_viking_client_headers_send_tenant_when_default():
# account/user set to the literal string "default". OpenViking 0.3.x
# requires X-OpenViking-Account and X-OpenViking-User for ROOT API key
# requests to tenant-scoped APIs — omitting them causes
# INVALID_ARGUMENT errors even when account="default".
client = _VikingClient(
"https://example.com",
api_key="test-key",
@@ -326,13 +327,15 @@ def test_viking_client_headers_omit_tenant_when_legacy_default():
agent="hermes",
)
headers = client._headers()
assert "X-OpenViking-Account" not in headers
assert "X-OpenViking-User" not in headers
assert headers["X-OpenViking-Account"] == "default"
assert headers["X-OpenViking-User"] == "default"
assert headers["X-OpenViking-Agent"] == "hermes"
assert headers["Authorization"] == "Bearer test-key"
def test_viking_client_headers_omit_tenant_when_empty():
def test_viking_client_headers_send_tenant_when_empty_falls_back_to_default():
# Empty account/user strings fall back to "default" via the constructor.
# Headers are sent even for the default value — ROOT API keys need them.
client = _VikingClient(
"https://example.com",
api_key="",
@@ -341,8 +344,8 @@ def test_viking_client_headers_omit_tenant_when_empty():
agent="hermes",
)
headers = client._headers()
assert "X-OpenViking-Account" not in headers
assert "X-OpenViking-User" not in headers
assert headers["X-OpenViking-Account"] == "default"
assert headers["X-OpenViking-User"] == "default"
assert "Authorization" not in headers
assert "X-API-Key" not in headers
+305
View File
@@ -0,0 +1,305 @@
"""Tests for the WAL→DELETE journal-mode fallback on NFS / SMB / FUSE.
When ``PRAGMA journal_mode=WAL`` raises ``OperationalError("locking protocol")``
(SQLITE_PROTOCOL typical on NFS/SMB), Hermes must fall back to
``journal_mode=DELETE`` so ``state.db`` / ``kanban.db`` remain usable.
Without this fallback, users on NFS-mounted ``HERMES_HOME`` silently lose
``/resume``, ``/title``, ``/history``, ``/branch``, session search, and the
kanban dispatcher because ``SessionDB()`` init propagates the error and
every caller swallows it, leaving ``_session_db = None``.
See: https://www.sqlite.org/wal.html "WAL does not work over a network
filesystem".
"""
import sqlite3
from unittest.mock import patch
import pytest
import hermes_state
from hermes_state import (
SessionDB,
apply_wal_with_fallback,
format_session_db_unavailable,
get_last_init_error,
)
# ``sqlite3.Connection.execute`` is a C-level slot and can't be monkeypatched
# directly (``'sqlite3.Connection' object attribute 'execute' is read-only``).
# A factory-built subclass lets us intercept journal_mode=WAL per-test with
# its own mutable counter, avoiding the xdist-parallel class-state race.
def _make_blocking_factory(reason: str, attempt_counter: list):
"""Return a sqlite3.Connection subclass that raises on PRAGMA journal_mode=WAL."""
class _WalBlockingConnection(sqlite3.Connection):
def execute(self, sql, *args, **kwargs): # type: ignore[override]
if "journal_mode=wal" in sql.lower().replace(" ", ""):
attempt_counter[0] += 1
raise sqlite3.OperationalError(reason)
return super().execute(sql, *args, **kwargs)
return _WalBlockingConnection
def _open_blocking(path, reason="locking protocol", **kwargs):
"""Open a connection whose WAL pragma raises ``reason``.
Returns ``(conn, attempt_counter_list)`` so callers can assert how many
times WAL was attempted.
"""
attempts = [0]
factory = _make_blocking_factory(reason, attempts)
return sqlite3.connect(str(path), factory=factory, **kwargs), attempts
@pytest.fixture(autouse=True)
def _reset_last_init_error():
"""Reset the module-global last-error before and after each test."""
hermes_state._set_last_init_error(None)
yield
hermes_state._set_last_init_error(None)
@pytest.fixture(autouse=True)
def _reset_wal_fallback_warned_paths():
"""Reset the WAL-fallback warned-paths set so dedup doesn't leak between tests."""
hermes_state._wal_fallback_warned_paths.clear()
yield
hermes_state._wal_fallback_warned_paths.clear()
class TestApplyWalWithFallback:
def test_succeeds_on_local_fs(self, tmp_path):
"""Happy path: WAL works on a normal filesystem."""
conn = sqlite3.connect(str(tmp_path / "ok.db"), isolation_level=None)
mode = apply_wal_with_fallback(conn)
assert mode == "wal"
cur = conn.execute("PRAGMA journal_mode")
assert cur.fetchone()[0].lower() == "wal"
conn.close()
def test_falls_back_to_delete_on_locking_protocol(self, tmp_path, caplog):
"""NFS-style ``locking protocol`` error → DELETE mode + one WARNING."""
conn, _ = _open_blocking(tmp_path / "nfs.db", isolation_level=None)
with caplog.at_level("WARNING", logger="hermes_state"):
mode = apply_wal_with_fallback(conn, db_label="test.db")
assert mode == "delete"
warnings = [r for r in caplog.records if r.levelname == "WARNING"]
assert len(warnings) == 1
msg = warnings[0].getMessage()
assert "test.db" in msg
assert "journal_mode=DELETE" in msg
assert "locking protocol" in msg
# Post-fallback the DB is still usable for real writes
conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (1)")
assert list(conn.execute("SELECT x FROM t"))[0][0] == 1
conn.close()
def test_falls_back_on_not_authorized(self, tmp_path):
"""Some FUSE mounts block WAL pragma outright ('not authorized')."""
conn, _ = _open_blocking(
tmp_path / "fuse.db", reason="not authorized", isolation_level=None
)
mode = apply_wal_with_fallback(conn)
assert mode == "delete"
conn.close()
def test_falls_back_on_disk_io_error(self, tmp_path):
"""Flaky network FS → disk I/O error → still fall back."""
conn, _ = _open_blocking(
tmp_path / "flaky.db", reason="disk I/O error", isolation_level=None
)
mode = apply_wal_with_fallback(conn)
assert mode == "delete"
conn.close()
def test_reraises_unrelated_operational_error(self, tmp_path):
"""Non-WAL-compat errors must NOT be silently swallowed by the fallback."""
conn, _ = _open_blocking(
tmp_path / "other.db",
reason="no such table: nope",
isolation_level=None,
)
with pytest.raises(sqlite3.OperationalError, match="no such table"):
apply_wal_with_fallback(conn)
conn.close()
def test_warning_deduplicated_per_db_label(self, tmp_path, caplog):
"""Repeated calls with the same db_label log exactly ONE warning.
Prevents log spam when NFS users run kanban (which opens a fresh
connection on every operation see hermes_cli/kanban_db.py).
Regression guard: the fix for #22032 ran apply_wal_with_fallback()
on every kb.connect() call; without dedup, errors.log fills with
hundreds of identical warnings per hour.
"""
with caplog.at_level("WARNING", logger="hermes_state"):
# Three separate connections to "the same DB" via the same label
for i in range(3):
conn, _ = _open_blocking(
tmp_path / f"dup-{i}.db", isolation_level=None
)
mode = apply_wal_with_fallback(conn, db_label="shared.db")
assert mode == "delete"
conn.close()
# Exactly one warning across all three calls
warnings = [
r for r in caplog.records
if r.levelname == "WARNING" and "shared.db" in r.getMessage()
]
assert len(warnings) == 1, (
f"Expected 1 deduplicated warning, got {len(warnings)}: "
f"{[r.getMessage() for r in warnings]}"
)
def test_warning_fires_independently_per_db_label(self, tmp_path, caplog):
"""Different db_labels each get their own one warning (not globally dedup'd)."""
with caplog.at_level("WARNING", logger="hermes_state"):
conn1, _ = _open_blocking(tmp_path / "a.db", isolation_level=None)
apply_wal_with_fallback(conn1, db_label="state.db")
conn1.close()
conn2, _ = _open_blocking(tmp_path / "b.db", isolation_level=None)
apply_wal_with_fallback(conn2, db_label="kanban.db")
conn2.close()
warnings = [r for r in caplog.records if r.levelname == "WARNING"]
labels_warned = {
lbl for r in warnings for lbl in ("state.db", "kanban.db")
if lbl in r.getMessage()
}
assert labels_warned == {"state.db", "kanban.db"}, (
f"Each db_label should warn once; got {labels_warned}"
)
class TestGetLastInitError:
def test_none_on_successful_init(self, tmp_path):
"""Happy-path SessionDB init does NOT clear a stale error from a prior thread.
We deliberately don't clear on success so that in multi-threaded
callers (gateway / web_server per-request SessionDB()), a concurrent
successful open racing past a different thread's failure won't
erase the cause string the failing thread's /resume is about to
format. The caller or test fixture is responsible for explicitly
calling _set_last_init_error(None) to reset.
"""
# Autouse fixture starts at None — success-path leaves it None
db = SessionDB(db_path=tmp_path / "ok.db")
try:
assert get_last_init_error() is None
finally:
db.close()
def test_success_does_not_clear_prior_error(self, tmp_path):
"""Thread-safety guard: a successful init must not erase a pre-existing error.
Simulates the multi-threaded race: thread A fails, records cause;
thread B succeeds concurrently. thread A's /resume handler must
still see A's cause — not B's None.
"""
hermes_state._set_last_init_error("OperationalError: locking protocol")
# Now a "successful" init happens on another path — must NOT clear
db = SessionDB(db_path=tmp_path / "ok2.db")
try:
assert get_last_init_error() == "OperationalError: locking protocol"
finally:
db.close()
def test_captures_cause_on_failed_init(self, tmp_path):
"""When SessionDB() raises, the cause is preserved for slash commands.
Simulates a filesystem where BOTH WAL and DELETE journal modes fail
e.g. a read-only mount where no ``PRAGMA journal_mode=X`` works. The
fallback tries DELETE and also gets rejected; the exception bubbles
out of ``SessionDB.__init__`` and the cause is captured.
"""
target = tmp_path / "broken.db"
real_connect = sqlite3.connect
class _BothPragmasFailConnection(sqlite3.Connection):
def execute(self, sql, *args, **kwargs): # type: ignore[override]
if "journal_mode" in sql.lower():
raise sqlite3.OperationalError(
"locking protocol: read-only filesystem"
)
return super().execute(sql, *args, **kwargs)
def gated_connect(*args, **kwargs):
return real_connect(str(target), factory=_BothPragmasFailConnection, **kwargs)
with patch("hermes_state.sqlite3.connect", side_effect=gated_connect):
with pytest.raises(sqlite3.OperationalError):
SessionDB(db_path=target)
cause = get_last_init_error()
assert cause is not None
assert "OperationalError" in cause
assert "locking protocol" in cause
class TestFormatSessionDbUnavailable:
def test_bare_message_when_no_cause(self):
"""No init error recorded → generic message."""
hermes_state._set_last_init_error(None)
assert format_session_db_unavailable() == "Session database not available."
def test_includes_cause(self):
"""Cause is surfaced for slash-command error strings."""
hermes_state._set_last_init_error("OperationalError: generic SQLite error")
msg = format_session_db_unavailable()
assert "generic SQLite error" in msg
assert msg.startswith("Session database not available:")
assert msg.endswith(".")
def test_adds_nfs_hint_for_locking_protocol(self):
"""Locking-protocol cause gets an NFS/SMB pointer for the user."""
hermes_state._set_last_init_error("OperationalError: locking protocol")
msg = format_session_db_unavailable()
assert "locking protocol" in msg
assert "NFS/SMB" in msg
assert "sqlite.org/wal.html" in msg
def test_custom_prefix(self):
"""Callers can customize the prefix for context-specific messages."""
hermes_state._set_last_init_error("OperationalError: locking protocol")
msg = format_session_db_unavailable(prefix="Cannot /resume")
assert msg.startswith("Cannot /resume:")
class TestSessionDbUsesWalFallback:
def test_sessiondb_works_when_wal_unavailable(self, tmp_path):
"""E2E: SessionDB initializes and performs a write on a WAL-blocked FS."""
target = tmp_path / "nfs_style.db"
real_connect = sqlite3.connect
attempts = [0]
factory = _make_blocking_factory("locking protocol", attempts)
def gated_connect(*args, **kwargs):
return real_connect(str(target), factory=factory, **kwargs)
with patch("hermes_state.sqlite3.connect", side_effect=gated_connect):
db = SessionDB(db_path=target)
try:
# WAL was attempted and rejected — fallback kicked in
assert attempts[0] >= 1, (
"WAL pragma was never executed — check the patch target"
)
# SessionDB is usable end-to-end: create a session, read it back
db.create_session(session_id="s1", source="cli", model="test")
sess = db.get_session("s1")
assert sess is not None
assert sess["source"] == "cli"
# No init error was recorded since init succeeded via the fallback
assert get_last_init_error() is None
finally:
db.close()
+22
View File
@@ -122,6 +122,28 @@ class TestUnifiedCronjobTool:
assert listing["jobs"][0]["name"] == "Server Check"
assert listing["jobs"][0]["state"] == "scheduled"
def test_list_handles_partial_legacy_job_records(self):
from cron.jobs import save_jobs
save_jobs([
{
"id": "abc123deadbe",
"name": None,
"prompt": None,
"schedule_display": None,
"schedule": {"kind": "interval", "minutes": 60, "display": "every 60m"},
"repeat": {"times": None, "completed": 0},
"enabled": True,
}
])
listing = json.loads(cronjob(action="list"))
assert listing["success"] is True
assert listing["jobs"][0]["name"] == "abc123deadbe"
assert listing["jobs"][0]["prompt_preview"] == ""
assert listing["jobs"][0]["schedule"] == "every 60m"
def test_pause_and_resume(self):
created = json.loads(cronjob(action="create", prompt="Check", schedule="every 1h"))
job_id = created["job_id"]
+57
View File
@@ -167,6 +167,63 @@ class TestDelegateTask(unittest.TestCase):
self.assertEqual(result["results"][1]["summary"], "Result B")
self.assertIn("total_duration_seconds", result)
@patch("tools.delegate_tool._run_single_child")
def test_batch_mode_accepts_json_string_tasks(self, mock_run):
mock_run.side_effect = [
{
"task_index": 0,
"status": "completed",
"summary": "Result A",
"api_calls": 2,
"duration_seconds": 3.0,
},
{
"task_index": 1,
"status": "completed",
"summary": "Result B",
"api_calls": 4,
"duration_seconds": 6.0,
},
]
parent = _make_mock_parent()
tasks = json.dumps(
[
{"goal": "Research topic A"},
{"goal": "Research topic B"},
]
)
result = json.loads(delegate_task(tasks=tasks, parent_agent=parent))
self.assertIn("results", result)
self.assertEqual(len(result["results"]), 2)
self.assertEqual(result["results"][0]["summary"], "Result A")
self.assertEqual(result["results"][1]["summary"], "Result B")
@patch("tools.delegate_tool._run_single_child")
def test_batch_mode_rejects_non_object_tasks(self, mock_run):
parent = _make_mock_parent()
result = json.loads(
delegate_task(tasks=["not a task object"], parent_agent=parent)
)
self.assertIn("error", result)
self.assertIn("Task 0 must be an object", result["error"])
mock_run.assert_not_called()
@patch("tools.delegate_tool._run_single_child")
def test_batch_mode_rejects_malformed_json_string_tasks(self, mock_run):
parent = _make_mock_parent()
result = json.loads(
delegate_task(tasks='[{"goal": "bad}', parent_agent=parent)
)
self.assertIn("error", result)
self.assertIn("could not be parsed as JSON", result["error"])
mock_run.assert_not_called()
@patch("tools.delegate_tool._run_single_child")
def test_batch_capped_at_3(self, mock_run):
mock_run.return_value = {
+58 -3
View File
@@ -296,21 +296,40 @@ def test_comment_rejects_empty_body(worker_env):
assert json.loads(out).get("error")
def test_comment_custom_author(worker_env):
def test_comment_ignores_caller_supplied_author(worker_env):
"""``args["author"]`` is no longer honored — the author is always
derived from ``HERMES_PROFILE`` so a worker can't forge a comment
under an authoritative-looking name like ``hermes-system`` and
poison the next worker's prompt context. Cross-task commenting
itself remains unrestricted (see #19713); only the author override
is removed.
"""
from tools import kanban_tools as kt
out = kt._handle_comment({
"task_id": worker_env, "body": "hi", "author": "custom-bot",
"task_id": worker_env, "body": "hi", "author": "hermes-system",
})
assert json.loads(out)["ok"]
from hermes_cli import kanban_db as kb
conn = kb.connect()
try:
comments = kb.list_comments(conn, worker_env)
assert comments[0].author == "custom-bot"
# Author comes from HERMES_PROFILE in the fixture, not the
# caller-supplied "hermes-system" override.
assert comments[0].author == "test-worker"
finally:
conn.close()
def test_comment_schema_omits_author_override():
"""The ``author`` property must not appear on KANBAN_COMMENT_SCHEMA;
exposing it to the LLM would re-introduce the forgery surface this
handler is hardened against.
"""
from tools.kanban_tools import KANBAN_COMMENT_SCHEMA
props = KANBAN_COMMENT_SCHEMA["parameters"]["properties"]
assert "author" not in props
def test_create_happy_path(worker_env):
from tools import kanban_tools as kt
out = kt._handle_create({
@@ -657,6 +676,42 @@ def test_worker_heartbeat_rejects_foreign_task_id(worker_env):
assert "refusing to mutate" in d.get("error", "")
def test_worker_can_comment_on_foreign_task(worker_env):
"""Cross-task commenting must remain unrestricted (#19713 policy).
The author-forgery hardening removed args['author'] but deliberately
did NOT add an ownership gate to kanban_comment comments are the
documented handoff channel between tasks. This test pins that policy
so a future change accidentally adding ``_enforce_worker_task_ownership``
to ``_handle_comment`` would fail CI immediately.
"""
from hermes_cli import kanban_db as kb
conn = kb.connect()
try:
other = kb.create_task(conn, title="sibling")
finally:
conn.close()
from tools import kanban_tools as kt
out = kt._handle_comment({
"task_id": other,
"body": "handoff: see prior findings before starting",
})
d = json.loads(out)
assert d.get("ok") is True, f"cross-task comment must succeed: {d}"
# The comment lands on the foreign task, attributed to the worker's
# HERMES_PROFILE — never to a caller-controlled string.
conn = kb.connect()
try:
comments = kb.list_comments(conn, other)
assert len(comments) == 1
assert comments[0].author == "test-worker"
assert comments[0].body.startswith("handoff:")
finally:
conn.close()
def test_worker_complete_own_task_still_works(worker_env):
"""The ownership check doesn't break the normal own-task happy path."""
from tools import kanban_tools as kt
+235
View File
@@ -742,6 +742,64 @@ class TestSendTelegramHtmlDetection:
sleep_mock.assert_awaited_once()
class TestSendTelegramThreadIdMapping:
"""General-topic mapping in _send_telegram (issue #22267).
Telegram forum supergroups address the General topic as
``message_thread_id="1"`` on incoming updates, but the Bot API rejects
sends with ``message_thread_id=1`` ("Message thread not found"). The
gateway adapter's ``_message_thread_id_for_send`` helper maps "1" to
``None`` for that reason; the standalone ``_send_telegram`` helper used
by the ``send_message`` tool needs the same mapping.
"""
def _make_bot(self):
bot = MagicMock()
bot.send_message = AsyncMock(return_value=SimpleNamespace(message_id=1))
return bot
def test_general_topic_thread_id_omitted(self, monkeypatch):
"""thread_id="1" must be dropped before calling the Bot API."""
bot = self._make_bot()
_install_telegram_mock(monkeypatch, bot)
asyncio.run(_send_telegram("tok", "-1001234567890", "hello", thread_id="1"))
bot.send_message.assert_awaited_once()
kwargs = bot.send_message.await_args.kwargs
assert "message_thread_id" not in kwargs
def test_non_general_topic_thread_id_preserved(self, monkeypatch):
"""Real forum-topic thread ids (>1) still pass through as ints."""
bot = self._make_bot()
_install_telegram_mock(monkeypatch, bot)
asyncio.run(_send_telegram("tok", "-1001234567890", "hello", thread_id="17585"))
kwargs = bot.send_message.await_args.kwargs
assert kwargs["message_thread_id"] == 17585
def test_no_thread_id_no_kwarg(self, monkeypatch):
"""With no thread_id, message_thread_id must not appear in kwargs."""
bot = self._make_bot()
_install_telegram_mock(monkeypatch, bot)
asyncio.run(_send_telegram("tok", "-1001234567890", "hello"))
kwargs = bot.send_message.await_args.kwargs
assert "message_thread_id" not in kwargs
def test_general_topic_thread_id_int_input_also_dropped(self, monkeypatch):
"""thread_id passed as the int 1 (not str) must still be dropped."""
bot = self._make_bot()
_install_telegram_mock(monkeypatch, bot)
asyncio.run(_send_telegram("tok", "-1001234567890", "hello", thread_id=1))
kwargs = bot.send_message.await_args.kwargs
assert "message_thread_id" not in kwargs
# ---------------------------------------------------------------------------
# Tests for Discord thread_id support
# ---------------------------------------------------------------------------
@@ -1994,3 +2052,180 @@ class TestSendSignalChunking:
# Only the existing file made it into the RPC
params = fake.calls[0]["payload"]["params"]
assert len(params["attachments"]) == 1
# ── _send_via_adapter standalone fallback ────────────────────────────────
class _FakePlatform:
"""Stand-in for the gateway.config.Platform enum. Holds the .value
attribute consulted by ``_send_via_adapter`` for registry lookups."""
def __init__(self, value):
self.value = value
class TestSendViaAdapterStandaloneFallback:
"""Coverage for the out-of-process plugin-platform send path.
When the gateway runner is not in this process (e.g. ``hermes cron``
runs separately from ``hermes gateway``), ``_send_via_adapter`` should
fall through to the plugin's ``standalone_sender_fn`` registered on
its ``PlatformEntry``. Without the hook, the existing error string
is returned (with a more helpful tail).
"""
@staticmethod
def _make_entry(send_fn):
from gateway.platform_registry import PlatformEntry
return PlatformEntry(
name="fakeplatform",
label="Fake",
adapter_factory=lambda cfg: None,
check_fn=lambda: True,
standalone_sender_fn=send_fn,
)
@pytest.mark.asyncio
async def test_standalone_sender_fn_called_when_no_adapter(self, monkeypatch):
"""Registry has hook, runner ref returns None: the hook is awaited."""
from tools.send_message_tool import _send_via_adapter
from gateway.platform_registry import platform_registry
recorded = {}
async def fake_send(pconfig, chat_id, message, **kwargs):
recorded["pconfig"] = pconfig
recorded["chat_id"] = chat_id
recorded["message"] = message
recorded["kwargs"] = kwargs
return {"success": True, "message_id": "msg-42"}
platform_registry.register(self._make_entry(fake_send))
try:
monkeypatch.setattr("gateway.run._gateway_runner_ref", lambda: None)
pconfig = SimpleNamespace(extra={})
result = await _send_via_adapter(
_FakePlatform("fakeplatform"),
pconfig,
"room/123",
"hello cron",
)
finally:
platform_registry.unregister("fakeplatform")
assert result == {"success": True, "message_id": "msg-42"}
assert recorded["chat_id"] == "room/123"
assert recorded["message"] == "hello cron"
assert recorded["pconfig"] is pconfig
@pytest.mark.asyncio
async def test_standalone_sender_fn_kwargs_forwarded(self, monkeypatch):
"""thread_id, media_files, and force_document all reach the hook."""
from tools.send_message_tool import _send_via_adapter
from gateway.platform_registry import platform_registry
recorded = {}
async def fake_send(pconfig, chat_id, message, *, thread_id=None,
media_files=None, force_document=False):
recorded["thread_id"] = thread_id
recorded["media_files"] = media_files
recorded["force_document"] = force_document
return {"success": True, "message_id": "x"}
platform_registry.register(self._make_entry(fake_send))
try:
monkeypatch.setattr("gateway.run._gateway_runner_ref", lambda: None)
await _send_via_adapter(
_FakePlatform("fakeplatform"),
SimpleNamespace(extra={}),
"chat-1",
"hi",
thread_id="thread-7",
media_files=["/tmp/a.png"],
force_document=True,
)
finally:
platform_registry.unregister("fakeplatform")
assert recorded["thread_id"] == "thread-7"
assert recorded["media_files"] == ["/tmp/a.png"]
assert recorded["force_document"] is True
@pytest.mark.asyncio
async def test_standalone_sender_fn_absent_returns_helpful_error(self, monkeypatch):
"""Registry entry has no hook: the fall-through error explains both
options (gateway-running and standalone hook)."""
from tools.send_message_tool import _send_via_adapter
from gateway.platform_registry import platform_registry
platform_registry.register(self._make_entry(None))
try:
monkeypatch.setattr("gateway.run._gateway_runner_ref", lambda: None)
result = await _send_via_adapter(
_FakePlatform("fakeplatform"),
SimpleNamespace(extra={}),
"chat-1",
"hi",
)
finally:
platform_registry.unregister("fakeplatform")
assert "error" in result
assert "fakeplatform" in result["error"]
assert "standalone_sender_fn" in result["error"]
@pytest.mark.asyncio
async def test_standalone_sender_fn_raises_is_caught_and_formatted(self, monkeypatch):
"""Hook raises: error dict has 'Plugin standalone send failed: ...'"""
from tools.send_message_tool import _send_via_adapter
from gateway.platform_registry import platform_registry
async def boom(pconfig, chat_id, message, **kwargs):
raise ValueError("boom!")
platform_registry.register(self._make_entry(boom))
try:
monkeypatch.setattr("gateway.run._gateway_runner_ref", lambda: None)
result = await _send_via_adapter(
_FakePlatform("fakeplatform"),
SimpleNamespace(extra={}),
"chat-1",
"hi",
)
finally:
platform_registry.unregister("fakeplatform")
assert result == {"error": "Plugin standalone send failed: boom!"}
@pytest.mark.asyncio
async def test_standalone_sender_fn_return_shape_passed_through(self, monkeypatch):
"""Hook returns success dict: passed through unchanged."""
from tools.send_message_tool import _send_via_adapter
from gateway.platform_registry import platform_registry
async def fake_send(pconfig, chat_id, message, **kwargs):
return {"success": True, "message_id": "abc-123", "extra_field": "preserved"}
platform_registry.register(self._make_entry(fake_send))
try:
monkeypatch.setattr("gateway.run._gateway_runner_ref", lambda: None)
result = await _send_via_adapter(
_FakePlatform("fakeplatform"),
SimpleNamespace(extra={}),
"chat-1",
"hi",
)
finally:
platform_registry.unregister("fakeplatform")
assert result["success"] is True
assert result["message_id"] == "abc-123"
assert result["extra_field"] == "preserved"
+4 -4
View File
@@ -132,9 +132,9 @@ async def _cdp_call(
}
)
)
deadline = asyncio.get_event_loop().time() + timeout
deadline = asyncio.get_running_loop().time() + timeout
while True:
remaining = deadline - asyncio.get_event_loop().time()
remaining = deadline - asyncio.get_running_loop().time()
if remaining <= 0:
raise TimeoutError(
f"Timed out attaching to target {target_id}"
@@ -166,9 +166,9 @@ async def _cdp_call(
req["sessionId"] = session_id
await ws.send(json.dumps(req))
deadline = asyncio.get_event_loop().time() + timeout
deadline = asyncio.get_running_loop().time() + timeout
while True:
remaining = deadline - asyncio.get_event_loop().time()
remaining = deadline - asyncio.get_running_loop().time()
if remaining <= 0:
raise TimeoutError(
f"Timed out waiting for response to {method}"
+6 -4
View File
@@ -220,18 +220,20 @@ def _validate_cron_script_path(script: Optional[str]) -> Optional[str]:
def _format_job(job: Dict[str, Any]) -> Dict[str, Any]:
prompt = job.get("prompt", "")
prompt = str(job.get("prompt") or "")
skills = _canonical_skills(job.get("skill"), job.get("skills"))
job_id = str(job.get("id") or "unknown")
name = str(job.get("name") or prompt[:50] or (skills[0] if skills else "") or job_id or "cron job")
result = {
"job_id": job["id"],
"name": job["name"],
"job_id": job_id,
"name": name,
"skill": skills[0] if skills else None,
"skills": skills,
"prompt_preview": prompt[:100] + "..." if len(prompt) > 100 else prompt,
"model": job.get("model"),
"provider": job.get("provider"),
"base_url": job.get("base_url"),
"schedule": job.get("schedule_display"),
"schedule": job.get("schedule_display") or "?",
"repeat": _repeat_display(job),
"deliver": job.get("deliver", "local"),
"next_run_at": job.get("next_run_at"),
+33
View File
@@ -1867,6 +1867,29 @@ def _run_single_child(
logger.debug("Failed to close child agent after delegation")
def _recover_tasks_from_json_string(
tasks: Any,
) -> tuple[Optional[List[Dict[str, Any]]], Optional[str]]:
if not isinstance(tasks, str):
return None, None
raw = tasks.strip()
if not raw:
return None, "Provide either 'goal' (single task) or 'tasks' (batch)."
try:
parsed = json.loads(raw)
except json.JSONDecodeError as exc:
return None, (
"tasks must be a JSON array of task objects; received a string "
f"that could not be parsed as JSON ({exc.msg})."
)
if not isinstance(parsed, list):
return None, (
f"tasks must be a JSON array of task objects; parsed "
f"{type(parsed).__name__} instead."
)
return parsed, None
def delegate_task(
goal: Optional[str] = None,
context: Optional[str] = None,
@@ -1951,6 +1974,12 @@ def delegate_task(
# Normalize to task list
max_children = _get_max_concurrent_children()
recovered_tasks, tasks_error = _recover_tasks_from_json_string(tasks)
if tasks_error:
return tool_error(tasks_error)
if recovered_tasks is not None:
tasks = recovered_tasks
if tasks and isinstance(tasks, list):
if len(tasks) > max_children:
return tool_error(
@@ -1973,6 +2002,10 @@ def delegate_task(
# Validate each task has a goal
for i, task in enumerate(task_list):
if not isinstance(task, dict):
return tool_error(
f"Task {i} must be an object, got {type(task).__name__}."
)
if not task.get("goal", "").strip():
return tool_error(f"Task {i} is missing a 'goal'.")
+10 -8
View File
@@ -373,7 +373,16 @@ def _handle_comment(args: dict, **kw) -> str:
body = args.get("body")
if not body or not str(body).strip():
return tool_error("body is required")
author = args.get("author") or os.environ.get("HERMES_PROFILE") or "worker"
# Author is intentionally derived from the worker's own runtime
# identity, NOT from caller-supplied args. Comments are injected
# into the next worker's system prompt by ``build_worker_context``
# as ``**{author}** (timestamp): {body}`` — accepting an
# ``args["author"]`` override let a worker forge a comment from
# an authoritative-looking name like ``hermes-system`` and poison
# the future-worker context with what reads as a system directive.
# Cross-task commenting itself remains unrestricted (see #19713) —
# comments are the deliberate handoff channel between tasks.
author = os.environ.get("HERMES_PROFILE") or "worker"
try:
kb, conn = _connect()
try:
@@ -656,13 +665,6 @@ KANBAN_COMMENT_SCHEMA = {
"type": "string",
"description": "Markdown-supported comment body.",
},
"author": {
"type": "string",
"description": (
"Override author name. Defaults to the current "
"profile (HERMES_PROFILE env)."
),
},
},
"required": ["task_id", "body"],
},
+111 -16
View File
@@ -423,25 +423,92 @@ def _maybe_skip_cron_duplicate_send(platform_name: str, chat_id: str, thread_id:
}
async def _send_via_adapter(platform, pconfig, chat_id, chunk):
"""Send a message via a live gateway adapter (for plugin platforms).
async def _send_via_adapter(
platform,
pconfig,
chat_id,
chunk,
*,
thread_id=None,
media_files=None,
force_document=False,
):
"""Send a message via a live gateway adapter, with a standalone fallback
for out-of-process callers (e.g. cron running separately from the gateway).
Falls back to error if no adapter is connected for this platform.
Order of attempts:
1. Live in-process adapter via ``_gateway_runner_ref()`` (the path that
existed before this change).
2. The plugin's ``standalone_sender_fn`` registered on its
``PlatformEntry`` (used when the gateway is not in this process, so
the runner weakref is ``None``).
3. A descriptive error explaining both options.
"""
runner = None
try:
from gateway.run import _gateway_runner_ref
runner = _gateway_runner_ref()
if runner:
except Exception:
runner = None
if runner is not None:
try:
adapter = runner.adapters.get(platform)
if adapter:
from gateway.platforms.base import SendResult
except Exception:
adapter = None
if adapter is not None:
try:
result = await adapter.send(chat_id=chat_id, content=chunk)
if result.success:
return {"success": True, "message_id": result.message_id}
return {"error": f"Adapter send failed: {result.error}"}
except Exception as e:
return {"error": f"Plugin platform send failed: {e}"}
return {"error": f"No live adapter for platform '{platform.value}'. Is the gateway running with this platform connected?"}
except asyncio.CancelledError:
raise
except Exception as e:
return {"error": f"Plugin platform send failed: {e}"}
if result.success:
return {"success": True, "message_id": result.message_id}
return {"error": f"Adapter send failed: {result.error}"}
platform_name = platform.value if hasattr(platform, "value") else str(platform)
entry = None
try:
from gateway.platform_registry import platform_registry
entry = platform_registry.get(platform_name)
except Exception:
entry = None
if entry is not None and entry.standalone_sender_fn is not None:
try:
result = await entry.standalone_sender_fn(
pconfig,
chat_id,
chunk,
thread_id=thread_id,
media_files=media_files,
force_document=force_document,
)
except asyncio.CancelledError:
raise
except Exception as e:
logger.debug("Plugin standalone send for %s raised", platform_name, exc_info=True)
return {"error": f"Plugin standalone send failed: {e}"}
if isinstance(result, dict) and (result.get("success") or result.get("error")):
return result
return {
"error": (
f"Plugin standalone send for '{platform_name}' returned an "
f"invalid result: expected a dict with 'success' or 'error' "
f"keys, got {type(result).__name__}"
)
}
return {
"error": (
f"No live adapter for platform '{platform_name}'. Is the gateway "
f"running with this platform connected? For out-of-process delivery "
f"(e.g. cron in a separate process), the platform plugin must "
f"register a standalone_sender_fn on its PlatformEntry."
)
}
async def _send_to_platform(platform, pconfig, chat_id, message, thread_id=None, media_files=None, force_document=False):
@@ -660,9 +727,17 @@ async def _send_to_platform(platform, pconfig, chat_id, message, thread_id=None,
elif platform == Platform.YUANBAO:
result = await _send_yuanbao(chat_id, chunk)
else:
# Plugin platform route through the gateway's live adapter
# if available, otherwise report the error.
result = await _send_via_adapter(platform, pconfig, chat_id, chunk)
# Plugin platform: route through the gateway's live adapter if
# available, otherwise the plugin's standalone_sender_fn.
result = await _send_via_adapter(
platform,
pconfig,
chat_id,
chunk,
thread_id=thread_id,
media_files=media_files,
force_document=force_document,
)
if isinstance(result, dict) and result.get("error"):
return result
@@ -710,7 +785,27 @@ async def _send_telegram(token, chat_id, message, media_files=None, thread_id=No
media_files = media_files or []
thread_kwargs = {}
if thread_id is not None:
thread_kwargs["message_thread_id"] = int(thread_id)
# Reuse the gateway adapter's General-topic mapping: in Telegram
# forum supergroups, the General topic is addressed as
# message_thread_id="1" on incoming updates, but Bot API
# sendMessage rejects message_thread_id=1 with "Message thread
# not found". The adapter's helper maps "1" to None for that
# reason; the send_message tool needs the same mapping or a
# send to a forum group's General topic always errors out
# (see issue #22267).
try:
from gateway.platforms.telegram import TelegramAdapter
effective_thread_id = TelegramAdapter._message_thread_id_for_send(
str(thread_id)
)
except Exception:
# Fallback: explicit mapping in case the adapter import
# fails (e.g. python-telegram-bot missing in this venv).
effective_thread_id = (
None if str(thread_id) == "1" else int(thread_id)
)
if effective_thread_id is not None:
thread_kwargs["message_thread_id"] = effective_thread_id
if disable_link_previews:
thread_kwargs["disable_web_page_preview"] = True
+2 -1
View File
@@ -337,7 +337,8 @@ def session_search(
The current session is excluded from results since the agent already has that context.
"""
if db is None:
return tool_error("Session database not available.", success=False)
from hermes_state import format_session_db_unavailable
return tool_error(format_session_db_unavailable(), success=False)
# Defensive: models (especially open-source) may send non-int limit values
# (None when JSON null, string "int", or even a type object). Coerce to a
@@ -260,23 +260,6 @@ function applyStylesToWrappedText(
for (let lineIdx = 0; lineIdx < lines.length; lineIdx++) {
const line = lines[lineIdx]!
// In trim mode, skip leading whitespace that was trimmed from this line.
// Only skip if the original has whitespace but the output line doesn't start
// with whitespace (meaning it was trimmed). If both have whitespace, the
// whitespace was preserved and we shouldn't skip.
if (trimEnabled && line.length > 0) {
const lineStartsWithWhitespace = /\s/.test(line[0]!)
const originalHasWhitespace = charIndex < originalPlain.length && /\s/.test(originalPlain[charIndex]!)
// Only skip if original has whitespace but line doesn't
if (originalHasWhitespace && !lineStartsWithWhitespace) {
while (charIndex < originalPlain.length && /\s/.test(originalPlain[charIndex]!)) {
charIndex++
}
}
}
let styledLine = ''
let runStart = 0
let runSegmentIndex = charToSegment[charIndex] ?? 0
@@ -333,26 +316,10 @@ function applyStylesToWrappedText(
// split lines.
if (charIndex < originalPlain.length && originalPlain[charIndex] === '\n') {
charIndex++
}
// In trim mode, skip whitespace that was replaced by newline when wrapping.
// We skip whitespace in the original until we reach a character that matches
// the first character of the next line. This handles cases like:
// - "AB \tD" wrapped to "AB\n\tD" - skip spaces until we hit the tab
// In non-trim mode, whitespace is preserved so no skipping is needed.
if (trimEnabled && lineIdx < lines.length - 1) {
const nextLine = lines[lineIdx + 1]!
const nextLineFirstChar = nextLine.length > 0 ? nextLine[0] : null
// Skip whitespace until we hit a char that matches the next line's first char
while (charIndex < originalPlain.length && /\s/.test(originalPlain[charIndex]!)) {
// Stop if we found the character that starts the next line
if (nextLineFirstChar !== null && originalPlain[charIndex] === nextLineFirstChar) {
break
}
charIndex++
}
} else if (trimEnabled && lineIdx < lines.length - 1 && /\s/.test(originalPlain[charIndex] ?? '')) {
// wrap-trim removes exactly one whitespace character at each soft-wrap boundary.
// Keep the style map aligned without eating preserved indentation/spaces.
charIndex++
}
}
@@ -0,0 +1,17 @@
import { describe, expect, it } from 'vitest'
import wrapText from './wrap-text.js'
describe('wrapText wrap-trim', () => {
it('removes a single soft-wrap boundary space', () => {
expect(wrapText('Let me', 5, 'wrap-trim')).toBe('Let\nme')
})
it('preserves extra original spacing at soft-wrap boundaries', () => {
expect(wrapText('foo bar', 5, 'wrap-trim')).toBe('foo \nbar')
})
it('preserves leading whitespace on unwrapped source lines', () => {
expect(wrapText(' indented', 20, 'wrap-trim')).toBe(' indented')
})
})
@@ -77,6 +77,32 @@ function truncate(text: string, columns: number, position: 'start' | 'middle' |
return sliceFit(text, 0, columns - 1) + ELLIPSIS
}
function trimSoftWrapBoundaries(text: string, maxWidth: number): string {
return text
.split('\n')
.map(line => {
const pieces = wrapAnsi(line, maxWidth, { trim: false, hard: true }).split('\n')
if (pieces.length === 1) {
return pieces[0]!
}
for (let index = 0; index < pieces.length - 1; index++) {
const current = pieces[index]!
const next = pieces[index + 1]!
if (/\s$/.test(current)) {
pieces[index] = current.replace(/\s$/, '')
} else if (/^\s/.test(next)) {
pieces[index + 1] = next.replace(/^\s/, '')
}
}
return pieces.join('\n')
})
.join('\n')
}
function computeWrap(text: string, maxWidth: number, wrapType: Styles['textWrap']): string {
if (wrapType === 'wrap') {
return wrapAnsi(text, maxWidth, { trim: false, hard: true })
@@ -87,7 +113,7 @@ function computeWrap(text: string, maxWidth: number, wrapType: Styles['textWrap'
}
if (wrapType === 'wrap-trim') {
return wrapAnsi(text, maxWidth, { trim: true, hard: true })
return trimSoftWrapBoundaries(text, maxWidth)
}
if (wrapType!.startsWith('truncate')) {
+74 -1
View File
@@ -1,8 +1,47 @@
import { PassThrough } from 'stream'
import { Box, renderSync } from '@hermes/ink'
import React from 'react'
import { describe, expect, it } from 'vitest'
import { AUDIO_DIRECTIVE_RE, INLINE_RE, MEDIA_LINE_RE, stripInlineMarkup } from '../components/markdown.js'
import { AUDIO_DIRECTIVE_RE, INLINE_RE, Md, MEDIA_LINE_RE, stripInlineMarkup } from '../components/markdown.js'
import { stripAnsi } from '../lib/text.js'
import { DEFAULT_THEME } from '../theme.js'
const matches = (text: string) => [...text.matchAll(INLINE_RE)].map(m => m[0])
const BEL = String.fromCharCode(7)
const ESC = String.fromCharCode(27)
const CSI_RE = new RegExp(`${ESC}\\[[0-?]*[ -/]*[@-~]`, 'g')
const OSC_RE = new RegExp(`${ESC}\\][\\s\\S]*?(?:${BEL}|${ESC}\\\\)`, 'g')
const renderPlain = (node: React.ReactNode) => {
const stdout = new PassThrough()
const stdin = new PassThrough()
const stderr = new PassThrough()
let output = ''
Object.assign(stdout, { columns: 80, isTTY: false, rows: 24 })
Object.assign(stdin, { isTTY: false })
Object.assign(stderr, { isTTY: false })
stdout.on('data', chunk => {
output += chunk.toString()
})
const instance = renderSync(node, {
patchConsole: false,
stderr: stderr as NodeJS.WriteStream,
stdin: stdin as NodeJS.ReadStream,
stdout: stdout as NodeJS.WriteStream
})
instance.unmount()
instance.cleanup()
return output
.replace(OSC_RE, '')
.split('\n')
.map(line => stripAnsi(line).replace(CSI_RE, '').trimEnd())
}
describe('INLINE_RE emphasis', () => {
it('matches word-boundary italic/bold', () => {
@@ -144,3 +183,37 @@ describe('protocol sentinels', () => {
expect(AUDIO_DIRECTIVE_RE.test('audio_as_voice')).toBe(false)
})
})
describe('Md wrapping', () => {
it('trims spaces from word-wrap continuation lines', () => {
const lines = renderPlain(
React.createElement(Box, { width: 5 }, React.createElement(Md, { t: DEFAULT_THEME, text: 'Let me' }))
)
expect(lines).toContain('Let')
expect(lines).toContain('me')
expect(lines).not.toContain(' me')
})
it('keeps nested list and quote indentation out of trim-sensitive text', () => {
const lines = renderPlain(
React.createElement(
Box,
{ flexDirection: 'column', width: 24 },
React.createElement(Md, { t: DEFAULT_THEME, text: ' - nested bullet' }),
React.createElement(Md, { t: DEFAULT_THEME, text: '>> nested quote' })
)
)
expect(lines).toContain(' • nested bullet')
expect(lines).toContain(' │ nested quote')
})
it('preserves original inline-code edge spaces', () => {
const lines = renderPlain(
React.createElement(Box, { width: 24 }, React.createElement(Md, { t: DEFAULT_THEME, text: '` hi ` ok' }))
)
expect(lines.some(line => line.startsWith(' hi ok'))).toBe(true)
})
})
+25 -29
View File
@@ -323,7 +323,7 @@ function MdInline({ t, text }: { t: Theme; text: string }) {
parts.push(<Text key={parts.length}>{text.slice(last)}</Text>)
}
return <Text>{parts.length ? parts : <Text>{text}</Text>}</Text>
return <Text wrap="wrap-trim">{parts.length ? parts : text}</Text>
}
// Cross-instance parsed-children cache: useMemo's per-instance cache dies
@@ -420,7 +420,7 @@ function MdImpl({ compact, t, text }: MdProps) {
if (media) {
start('paragraph')
nodes.push(
<Text color={t.color.muted} key={key}>
<Text color={t.color.muted} key={key} wrap="wrap-trim">
{'▸ '}
<Link url={/^(?:\/|[a-z]:[\\/])/i.test(media) ? `file://${media}` : media}>
@@ -594,7 +594,7 @@ function MdImpl({ compact, t, text }: MdProps) {
if (heading) {
start('heading')
nodes.push(
<Text bold color={t.color.accent} key={key}>
<Text bold color={t.color.accent} key={key} wrap="wrap-trim">
<MdInline t={t} text={heading} />
</Text>
)
@@ -606,7 +606,7 @@ function MdImpl({ compact, t, text }: MdProps) {
if (i + 1 < lines.length && SETEXT_RE.test(lines[i + 1]!)) {
start('heading')
nodes.push(
<Text bold color={t.color.accent} key={key}>
<Text bold color={t.color.accent} key={key} wrap="wrap-trim">
<MdInline t={t} text={line.trim()} />
</Text>
)
@@ -632,7 +632,7 @@ function MdImpl({ compact, t, text }: MdProps) {
if (footnote) {
start('list')
nodes.push(
<Text color={t.color.muted} key={key}>
<Text color={t.color.muted} key={key} wrap="wrap-trim">
[{footnote[1]}] <MdInline t={t} text={footnote[2] ?? ''} />
</Text>
)
@@ -641,7 +641,7 @@ function MdImpl({ compact, t, text }: MdProps) {
while (i < lines.length && /^\s{2,}\S/.test(lines[i]!)) {
nodes.push(
<Box key={`${key}-cont-${i}`} paddingLeft={2}>
<Text color={t.color.muted}>
<Text color={t.color.muted} wrap="wrap-trim">
<MdInline t={t} text={lines[i]!.trim()} />
</Text>
</Box>
@@ -655,7 +655,7 @@ function MdImpl({ compact, t, text }: MdProps) {
if (i + 1 < lines.length && DEF_RE.test(lines[i + 1]!)) {
start('list')
nodes.push(
<Text bold key={key}>
<Text bold key={key} wrap="wrap-trim">
{line.trim()}
</Text>
)
@@ -669,7 +669,7 @@ function MdImpl({ compact, t, text }: MdProps) {
}
nodes.push(
<Text key={`${key}-def-${i}`}>
<Text key={`${key}-def-${i}`} wrap="wrap-trim">
<Text color={t.color.muted}> · </Text>
<MdInline t={t} text={def} />
</Text>
@@ -689,14 +689,12 @@ function MdImpl({ compact, t, text }: MdProps) {
const marker = task ? (task[1]!.toLowerCase() === 'x' ? '☑' : '☐') : '•'
nodes.push(
<Text key={key}>
<Text color={t.color.muted}>
{' '.repeat(indentDepth(bullet[1]!) * 2)}
{marker}{' '}
<Box key={key} paddingLeft={indentDepth(bullet[1]!) * 2}>
<Text wrap="wrap-trim">
<Text color={t.color.muted}>{marker} </Text>
<MdInline t={t} text={task ? task[2]! : bullet[2]!} />
</Text>
<MdInline t={t} text={task ? task[2]! : bullet[2]!} />
</Text>
</Box>
)
i++
@@ -708,14 +706,12 @@ function MdImpl({ compact, t, text }: MdProps) {
if (numbered) {
start('list')
nodes.push(
<Text key={key}>
<Text color={t.color.muted}>
{' '.repeat(indentDepth(numbered[1]!) * 2)}
{numbered[2]}.{' '}
<Box key={key} paddingLeft={indentDepth(numbered[1]!) * 2}>
<Text wrap="wrap-trim">
<Text color={t.color.muted}>{numbered[2]}. </Text>
<MdInline t={t} text={numbered[3]!} />
</Text>
<MdInline t={t} text={numbered[3]!} />
</Text>
</Box>
)
i++
@@ -737,11 +733,11 @@ function MdImpl({ compact, t, text }: MdProps) {
nodes.push(
<Box flexDirection="column" key={key}>
{quoteLines.map((ql, qi) => (
<Text color={t.color.muted} key={qi}>
{' '.repeat(Math.max(0, ql.depth - 1) * 2)}
{'│ '}
<MdInline t={t} text={ql.text} />
</Text>
<Box key={qi} paddingLeft={Math.max(0, ql.depth - 1) * 2}>
<Text color={t.color.muted} wrap="wrap-trim">
<MdInline t={t} text={ql.text} />
</Text>
</Box>
))}
</Box>
)
@@ -774,7 +770,7 @@ function MdImpl({ compact, t, text }: MdProps) {
if (summary) {
start('paragraph')
nodes.push(
<Text color={t.color.muted} key={key}>
<Text color={t.color.muted} key={key} wrap="wrap-trim">
{summary}
</Text>
)
@@ -786,7 +782,7 @@ function MdImpl({ compact, t, text }: MdProps) {
if (/^<\/?[^>]+>$/.test(line.trim())) {
start('paragraph')
nodes.push(
<Text color={t.color.muted} key={key}>
<Text color={t.color.muted} key={key} wrap="wrap-trim">
{line.trim()}
</Text>
)
+7 -6
View File
@@ -553,13 +553,14 @@ export interface ModelsAnalyticsResponse {
export interface CronJob {
id: string;
name?: string;
prompt: string;
schedule: { kind: string; expr: string; display: string };
schedule_display: string;
name?: string | null;
prompt?: string | null;
script?: string | null;
schedule?: { kind?: string; expr?: string; display?: string };
schedule_display?: string | null;
enabled: boolean;
state: string;
deliver?: string;
state?: string | null;
deliver?: string | null;
last_run_at?: string | null;
next_run_at?: string | null;
last_error?: string | null;
+133 -82
View File
@@ -23,6 +23,50 @@ function formatTime(iso?: string | null): string {
return d.toLocaleString();
}
function asText(value: unknown): string {
return typeof value === "string" ? value : "";
}
function truncateText(value: string, maxLength: number): string {
return value.length > maxLength
? value.slice(0, maxLength) + "..."
: value;
}
function getJobPrompt(job: CronJob): string {
return asText(job.prompt);
}
function getJobName(job: CronJob): string {
return asText(job.name).trim();
}
function getJobTitle(job: CronJob): string {
const name = getJobName(job);
if (name) return name;
const prompt = getJobPrompt(job);
if (prompt) return truncateText(prompt, 60);
const script = asText(job.script);
if (script) return truncateText(script, 60);
return job.id || "Cron job";
}
function getJobScheduleDisplay(job: CronJob): string {
return (
asText(job.schedule_display) ||
asText(job.schedule?.display) ||
asText(job.schedule?.expr) ||
"—"
);
}
function getJobState(job: CronJob): string {
return asText(job.state) || (job.enabled === false ? "disabled" : "scheduled");
}
const STATUS_TONE: Record<string, "success" | "warning" | "destructive"> = {
enabled: "success",
scheduled: "success",
@@ -84,17 +128,17 @@ export default function CronPage() {
const handlePauseResume = async (job: CronJob) => {
try {
const isPaused = job.state === "paused";
const isPaused = getJobState(job) === "paused";
if (isPaused) {
await api.resumeCronJob(job.id);
showToast(
`${t.cron.resume}: "${job.name || job.prompt.slice(0, 30)}"`,
`${t.cron.resume}: "${truncateText(getJobTitle(job), 30)}"`,
"success",
);
} else {
await api.pauseCronJob(job.id);
showToast(
`${t.cron.pause}: "${job.name || job.prompt.slice(0, 30)}"`,
`${t.cron.pause}: "${truncateText(getJobTitle(job), 30)}"`,
"success",
);
}
@@ -108,7 +152,7 @@ export default function CronPage() {
try {
await api.triggerCronJob(job.id);
showToast(
`${t.cron.triggerNow}: "${job.name || job.prompt.slice(0, 30)}"`,
`${t.cron.triggerNow}: "${truncateText(getJobTitle(job), 30)}"`,
"success",
);
loadJobs();
@@ -124,7 +168,7 @@ export default function CronPage() {
try {
await api.deleteCronJob(id);
showToast(
`${t.common.delete}: "${job?.name || (job?.prompt ?? "").slice(0, 30) || id}"`,
`${t.common.delete}: "${job ? truncateText(getJobTitle(job), 30) : id}"`,
"success",
);
loadJobs();
@@ -161,7 +205,9 @@ export default function CronPage() {
title={t.cron.confirmDeleteTitle}
description={
pendingJob
? `"${pendingJob.name || pendingJob.prompt.slice(0, 40)}" — ${t.cron.confirmDeleteMessage}`
? `"${truncateText(getJobTitle(pendingJob), 40)}" — ${
t.cron.confirmDeleteMessage
}`
: t.cron.confirmDeleteMessage
}
loading={jobDelete.isDeleting}
@@ -265,85 +311,90 @@ export default function CronPage() {
</Card>
)}
{jobs.map((job) => (
<Card key={job.id}>
<CardContent className="flex items-center gap-4 py-4">
<div className="flex-1 min-w-0">
<div className="flex items-center gap-2 mb-1">
<span className="font-medium text-sm truncate">
{job.name ||
job.prompt.slice(0, 60) +
(job.prompt.length > 60 ? "..." : "")}
</span>
<Badge tone={STATUS_TONE[job.state] ?? "secondary"}>
{job.state}
</Badge>
{job.deliver && job.deliver !== "local" && (
<Badge tone="outline">{job.deliver}</Badge>
{jobs.map((job) => {
const state = getJobState(job);
const promptText = getJobPrompt(job);
const title = getJobTitle(job);
const hasName = Boolean(getJobName(job));
const deliver = asText(job.deliver);
return (
<Card key={job.id}>
<CardContent className="flex items-center gap-4 py-4">
<div className="flex-1 min-w-0">
<div className="flex items-center gap-2 mb-1">
<span className="font-medium text-sm truncate">
{title}
</span>
<Badge tone={STATUS_TONE[state] ?? "secondary"}>
{state}
</Badge>
{deliver && deliver !== "local" && (
<Badge tone="outline">{deliver}</Badge>
)}
</div>
{hasName && promptText && (
<p className="text-xs text-muted-foreground truncate mb-1">
{truncateText(promptText, 100)}
</p>
)}
<div className="flex items-center gap-4 text-xs text-muted-foreground">
<span className="font-mono">{getJobScheduleDisplay(job)}</span>
<span>
{t.cron.last}: {formatTime(job.last_run_at)}
</span>
<span>
{t.cron.next}: {formatTime(job.next_run_at)}
</span>
</div>
{job.last_error && (
<p className="text-xs text-destructive mt-1">
{job.last_error}
</p>
)}
</div>
{job.name && (
<p className="text-xs text-muted-foreground truncate mb-1">
{job.prompt.slice(0, 100)}
{job.prompt.length > 100 ? "..." : ""}
</p>
)}
<div className="flex items-center gap-4 text-xs text-muted-foreground">
<span className="font-mono">{job.schedule_display}</span>
<span>
{t.cron.last}: {formatTime(job.last_run_at)}
</span>
<span>
{t.cron.next}: {formatTime(job.next_run_at)}
</span>
<div className="flex items-center gap-1 shrink-0">
<Button
ghost
size="icon"
title={state === "paused" ? t.cron.resume : t.cron.pause}
aria-label={
state === "paused" ? t.cron.resume : t.cron.pause
}
onClick={() => handlePauseResume(job)}
className={
state === "paused" ? "text-success" : "text-warning"
}
>
{state === "paused" ? <Play /> : <Pause />}
</Button>
<Button
ghost
size="icon"
title={t.cron.triggerNow}
aria-label={t.cron.triggerNow}
onClick={() => handleTrigger(job)}
>
<Zap />
</Button>
<Button
ghost
destructive
size="icon"
title={t.common.delete}
aria-label={t.common.delete}
onClick={() => jobDelete.requestDelete(job.id)}
>
<Trash2 />
</Button>
</div>
{job.last_error && (
<p className="text-xs text-destructive mt-1">
{job.last_error}
</p>
)}
</div>
<div className="flex items-center gap-1 shrink-0">
<Button
ghost
size="icon"
title={job.state === "paused" ? t.cron.resume : t.cron.pause}
aria-label={
job.state === "paused" ? t.cron.resume : t.cron.pause
}
onClick={() => handlePauseResume(job)}
className={
job.state === "paused" ? "text-success" : "text-warning"
}
>
{job.state === "paused" ? <Play /> : <Pause />}
</Button>
<Button
ghost
size="icon"
title={t.cron.triggerNow}
aria-label={t.cron.triggerNow}
onClick={() => handleTrigger(job)}
>
<Zap />
</Button>
<Button
ghost
destructive
size="icon"
title={t.common.delete}
aria-label={t.common.delete}
onClick={() => jobDelete.requestDelete(job.id)}
>
<Trash2 />
</Button>
</div>
</CardContent>
</Card>
))}
</CardContent>
</Card>
);
})}
</div>
<PluginSlot name="cron:bottom" />
@@ -253,6 +253,37 @@ ctx.register_platform(
The scheduler reads this env var when resolving the home target for `deliver=my_platform` jobs, and also treats the platform as a valid cron target in `_KNOWN_DELIVERY_PLATFORMS`-style checks. If your `env_enablement_fn` seeds a `home_channel` dict (see above), that takes precedence — `cron_deliver_env_var` is the fallback for cron jobs that run before env seeding.
### Out-of-process cron delivery
`cron_deliver_env_var` makes your platform a recognized `deliver=` target. To make the actual send succeed when the cron job runs in a separate process from the gateway (i.e., `hermes cron run` separate from `hermes gateway`), register a `standalone_sender_fn`:
```python
async def _standalone_send(
pconfig,
chat_id,
message,
*,
thread_id=None,
media_files=None,
force_document=False,
):
"""Open an ephemeral connection / acquire a fresh token, send, and close."""
# ... open connection, send message, return result ...
return {"success": True, "message_id": "..."}
# or {"error": "..."}
ctx.register_platform(
name="my_platform",
...
cron_deliver_env_var="MY_PLATFORM_HOME_CHANNEL",
standalone_sender_fn=_standalone_send,
)
```
Why this hook is necessary: built-in platforms (Telegram, Discord, Slack, etc.) ship direct REST helpers in `tools/send_message_tool.py` so cron can deliver without holding the gateway in the same process. Plugin platforms historically depended on `_gateway_runner_ref()`, which returns `None` outside the gateway process, so without `standalone_sender_fn` the cron-side send fails with `No live adapter for platform '<name>'`.
The function receives the same `pconfig` and `chat_id` that the live adapter would, plus optional `thread_id`, `media_files`, and `force_document` keyword arguments. Returning `{"success": True, "message_id": ...}` is treated as a successful delivery; returning `{"error": "..."}` surfaces the message in cron's `delivery_errors`. Exceptions raised inside the function are caught by the dispatcher and reported as `Plugin standalone send failed: <reason>`. Reference implementations live in `plugins/platforms/{irc,teams,google_chat}/adapter.py`.
## Surfacing Env Vars in `hermes config`
`hermes_cli/config.py` scans `plugins/platforms/*/plugin.yaml` at import time and auto-populates `OPTIONAL_ENV_VARS` from `requires_env` and (optional) `optional_env` blocks. Use the rich-dict form to contribute proper descriptions, prompts, password flags, and URLs — the CLI setup UI picks them up for free.
@@ -249,6 +249,8 @@ When users click buttons or interact with interactive cards sent by the bot, the
- The action's `value` payload from the card definition is included as JSON.
- Card actions are deduplicated with a 15-minute window to prevent double processing.
Gateway-driven update prompts use a native Feishu `Yes` / `No` card instead of falling back to plain text replies. When `hermes update --gateway` needs confirmation, the adapter records the selected answer in Hermes's `.update_response` file and replaces the card inline with a resolved state.
Card action events are dispatched with `MessageType.COMMAND`, so they flow through the normal command processing pipeline.
This is also how **command approval** works — when the agent needs to run a dangerous command, it sends an interactive card with Allow Once / Session / Always / Deny buttons. The user clicks a button, and the card action callback delivers the approval decision back to the agent.