Compare commits

..

22 Commits

Author SHA1 Message Date
emozilla 854206e59e fix(plugins): register dynamically-loaded modules in sys.modules before exec
Dashboard plugin API routes (web_server._mount_plugin_api_routes) and
gateway event hooks (gateway.hooks.HookRegistry.discover_and_load) both
loaded Python files via importlib.util.spec_from_file_location +
exec_module without registering the resulting module in sys.modules.

That breaks any plugin or hook handler that uses `from __future__ import
annotations` together with a Pydantic BaseModel / dataclass / anything
that introspects `__module__`: at first request Pydantic tries to
resolve string-form type hints against the defining module's namespace,
can't find it by name, and raises:

  PydanticUserError: TypeAdapter[...] is not fully defined;
  you should define ... and all referenced types,
  then call `.rebuild()` on the instance.

This is what broke the kanban dashboard's 'triage' button — POST
/api/plugins/kanban/tasks validated against CreateTaskBody (a Pydantic
model in a file using `from __future__ import annotations`) and
returned 500 on every click.

The fix, applied symmetrically to both loaders:

  1. Compute module_name once.
  2. Register the module in sys.modules BEFORE exec_module.
  3. On exec_module failure, pop the half-initialized stub so subsequent
     reloads don't pick up broken state.

GETs were unaffected because they don't build a body TypeAdapter, which
is why this only surfaced when users started POSTing.
2026-04-28 13:24:29 -04:00
teknium1 dd83173621 feat(kanban): per-task force-loaded skills
Tasks can now pin extra skills to load into their worker alongside the
built-in kanban-worker. Use cases: translation tasks that need a
translation skill, review tasks that need github-code-review, security
audits that need security-pr-audit — without editing the assignee's
profile config.

Changes:
- tasks.skills column (JSON array), idempotent migration, Task.skills
  dataclass field.
- create_task(skills=[...]) normalises (strip/dedupe), rejects commas
  in a single name.
- _default_spawn emits one `--skills X` pair per task skill, in
  addition to the built-in `--skills kanban-worker`. Deduped against
  the built-in.
- CLI: `hermes kanban create --skill <name>` (repeatable).
- Tool: `kanban_create(skills=[...])` (accepts list or single string).
- Dashboard: POST /tasks accepts `skills`; inline create form has a
  comma-separated skills input; drawer shows a Skills row.
- `hermes kanban show` prints a skills row when present.
- Docs: kanban.md has a new 'Pinning extra skills to a specific task'
  section; CLI reference shows `--skill`.
- Tests: 16 new (kernel round-trip, dedupe, comma-rejection, dispatcher
  argv with multiple skills, built-in dedupe, CLI flag repeatable, CLI
  no-flag stays None, show renders skills, idempotent migration on
  legacy DB, tool list/string/non-list, dashboard REST round-trip,
  dashboard default empty list).
2026-04-28 07:50:36 -07:00
teknium1 c65c1ddf21 feat(kanban): dispatcher auto-loads kanban-worker skill on every spawn
`_default_spawn` now passes `--skills kanban-worker` in the child's
argv, so every dispatched worker gets the skill loaded automatically
regardless of the profile's default skill config.

Why
  The system prompt carries the MANDATORY lifecycle (via
  KANBAN_GUIDANCE). The skill carries the deeper reference material:
  good summary/metadata patterns, retry diagnostics, block-reason
  examples, workspace handling, CLI fallback. Both are useful;
  requiring users to wire the skill into skills config per-profile
  is a footgun — they'd hit tasks where workers use the lifecycle
  correctly but not the patterns, producing weaker handoffs.

  Auto-loading makes kanban-worker the baseline. Profiles can still
  add more skills via the normal config; --skills is additive.

Test
  New test_default_spawn_auto_loads_kanban_worker_skill intercepts
  subprocess.Popen to capture the argv without actually spawning a
  hermes subprocess. Asserts '--skills kanban-worker' appears in the
  cmd and the env still carries HERMES_KANBAN_TASK + HERMES_PROFILE.

Skill note updated
  Worker skill's opening note changed from "you don't need to load
  this" to "you're seeing this because the dispatcher loaded it for
  you" — reflects the new always-loaded reality.

218/218 kanban + tools suite green.
2026-04-28 07:12:34 -07:00
teknium1 1970bcf5a5 feat(kanban): system-prompt guidance block + reshape skills to reference-only
Moves the kanban worker lifecycle out of the skills and into the system
prompt as a conditionally-injected guidance block, matching the
existing MEMORY_GUIDANCE / SESSION_SEARCH_GUIDANCE / SKILLS_GUIDANCE
pattern in `agent/prompt_builder.py`. Workers now know the full
lifecycle even if no skill was loaded; skills become the deeper
playbook for edge cases.

Why
  Previously the 6-step lifecycle (orient → work → heartbeat → block/
  complete → fan-out) only reached the model if the kanban-worker
  skill was loaded. That's fragile: users configure skills-per-
  platform, profiles can forget to include them, skill loading order
  matters. The lifecycle is load-bearing for correct board behavior
  — it belongs in the prompt path.

What
  - New `KANBAN_GUIDANCE` constant in agent/prompt_builder.py (~3 KB):
    identity framing ("You are a Kanban worker"), 6-step lifecycle
    with exact tool-call shapes (`kanban_show()`, `kanban_complete(
    summary=..., metadata=...)`, `kanban_block(reason=...)`, etc.),
    orchestrator mode note, and explicit DO NOTs including "don't
    shell out to `hermes kanban <verb>`".
  - Wired into `AIAgent._build_system_prompt` gated on `kanban_show
    in self.valid_tool_names`. Because `kanban_show`'s `check_fn`
    gates on `HERMES_KANBAN_TASK` env var, this indirection
    guarantees guidance appears iff a worker was dispatched. Zero
    cost to normal sessions.

Gating verified live
  Normal `hermes chat` session: 2329-char prompt, zero kanban content.
  Worker session (HERMES_KANBAN_TASK set):  5272-char prompt, +2943
  chars of KANBAN_GUIDANCE present. Regression test asserts the
  header phrase and each tool name.

Skills reshape (both from ~6 KB lifecycles to ~7 KB references)
  - `skills/devops/kanban-worker/SKILL.md`: drops the lifecycle
    (now in the guidance block), keeps good-summary patterns,
    block-reason examples, retry diagnostics, pitfalls, CLI
    fallback cheatsheet. Adds explicit note that you don't need to
    load the skill to work a task anymore.
  - `skills/devops/kanban-orchestrator/SKILL.md`: drops the
    "don't execute" rule (now in guidance), keeps the full
    decomposition playbook, specialist roster with typical
    workspaces, a concrete Postgres-migration example with
    `kanban_create` + `parents=[...]` linking, and pitfalls
    (reassignment vs new task, link argument order, tenant
    inheritance).

Tests (+3)
  - `test_kanban_guidance_not_in_normal_prompt`: verifies a regular
    AIAgent with no env var produces a prompt that contains none of
    the kanban lifecycle phrases.
  - `test_kanban_guidance_in_worker_prompt`: spawns an AIAgent with
    HERMES_KANBAN_TASK set, verifies the header + all 4 tool-call
    examples + anti-shell guidance appear.
  - `test_kanban_guidance_prompt_size_bounded`: sanity check that
    the guidance is 1.5-4 KB so it doesn't balloon the cached prompt.

217/217 kanban + tools suite green; 303/303 run_agent suite green.
The guidance block sits alongside the other tool-gated blocks, so
prompt caching remains intact — the system prompt is stable across
every turn of a kanban worker's run.
2026-04-28 05:29:31 -07:00
teknium1 832ecde4b0 feat(kanban): structured tool surface for worker + orchestrator agents
Seven new tools in `tools/kanban_tools.py` that give kanban workers a
backend-portable, schema-filtered way to interact with the board from
inside their own Python process — no shelling out to `hermes kanban`.

Motivation
  The CLI path (`hermes kanban complete \$TASK --summary ...`) breaks
  on any remote terminal backend (Docker, Modal, Singularity, SSH).
  The terminal tool runs `hermes kanban` inside the container, where
  `hermes` isn't installed and `~/.hermes/kanban.db` isn't mounted.
  Tools run in the agent's own Python process, so they always reach
  the board regardless of backend. Also skips shell-quoting fragility
  on --metadata JSON and gives structured error returns the model can
  reason about.

The seven tools
  kanban_show        read current task (defaults to HERMES_KANBAN_TASK)
  kanban_complete    structured handoff: summary + metadata
  kanban_block       ask for human input
  kanban_heartbeat   signal liveness during long operations
  kanban_comment     append to task thread
  kanban_create      fan out into child tasks (orchestrator path)
  kanban_link        add parent→child dependency after the fact

Gating
  Each tool's check_fn returns True iff HERMES_KANBAN_TASK is set in
  the process env. The dispatcher sets it when spawning a worker;
  normal `hermes chat` sessions never have it. Empirically verified:
  a baseline hermes-cli schema is 27 tools; with HERMES_KANBAN_TASK
  set it grows to exactly 34 (+7). Zero leak into normal sessions.

Also set HERMES_PROFILE in the spawn env so the kanban_comment tool's
author default works cleanly (it's what the tool reads to attribute
comments).

Skill updates
  - `skills/devops/kanban-worker/SKILL.md`: lifecycle rewritten to use
    kanban_show / kanban_heartbeat / kanban_block / kanban_complete /
    kanban_comment / kanban_create directly. CLI fallback section
    added for human operators / scripts.
  - `skills/devops/kanban-orchestrator/SKILL.md`: all examples ported
    from CLI to tool form; top-banner note explaining tools are the
    primary surface. kanban_create / kanban_link throughout.

Docs
  `website/docs/user-guide/features/kanban.md`:
  new "How workers interact with the board" section explaining the
  tool surface, gating mechanism, and why tools vs CLI. The worker
  skill / orchestrator skill subsections are now nested under it.

Tests (+25 in tests/tools/test_kanban_tools.py)
  - Schema gating: kanban_tools_hidden_without_env_var,
    kanban_tools_visible_with_env_var.
  - Happy paths: show (default + explicit task_id), complete (with
    summary+metadata, with result only), block, heartbeat (with and
    without note), comment (default + custom author), create (with
    list parents, with string parent), link.
  - Error paths: complete rejects no-handoff and non-dict metadata,
    block rejects empty reason, comment rejects empty body, create
    rejects no title / no assignee / non-list parents, link rejects
    self-reference / missing args / cycles.
  - End-to-end: full worker lifecycle driven entirely through the
    tools, verified against DB state.

214/214 kanban suite pass under scripts/run_tests.sh.
2026-04-28 04:30:22 -07:00
Teknium be184aa5fa fix(kanban): close the two v2-flagged issues in v1
Both items the atypical-scenarios pass flagged as "v2 follow-up"
actually belong in v1. Fixed now.

Fix 1: workspace path traversal
  resolve_workspace now rejects non-absolute paths for all three
  workspace_kinds (scratch-with-explicit-path, dir:, worktree). A
  relative path like '../../../tmp/attacker' was being silently
  resolved against the dispatcher's CWD — a confused-deputy escape.
  Error message points users at the absolute-path requirement.
  Storage remains verbatim (kernel doesn't rewrite user input);
  the refusal happens at resolution time, so the dispatcher's
  existing spawn-failure circuit breaker correctly categorizes it.

  Threat model documented in website/docs/user-guide/features/kanban.md:
  single-host, trusted-local-user. The absolute-path rule prevents
  ambiguity-driven escape, not malicious access — kanban runs as you,
  with your uid, on your filesystem.

Fix 2: build_worker_context unbounded
  Added per-section caps so worker prompts stay bounded on pathological
  boards:
    _CTX_MAX_PRIOR_ATTEMPTS = 10   most-recent N runs shown; older
                                   collapsed into "N earlier attempts
                                   omitted" marker. Attempt numbering
                                   preserved (shows "Attempt 16" not
                                   renumbered).
    _CTX_MAX_COMMENTS       = 30   same pattern for comments.
    _CTX_MAX_FIELD_BYTES    = 4 KB per summary / error / metadata / result.
    _CTX_MAX_BODY_BYTES     = 8 KB per task.body (opening post).
    _CTX_MAX_COMMENT_BYTES  = 2 KB per comment.
  Truncation uses a visible ellipsis + char-count so the worker knows
  it's been truncated.

  Effect on atypical-scenario runs:
    huge_run_count_on_one_task (1000 runs):  63 KB  →    820 chars
    comment_storm       (1000 comments):     50 KB  →  1,671 chars

Tests (+6 in main suite)
  test_resolve_workspace_rejects_relative_dir_path — relative dir:
    path stored verbatim but refused at resolve.
  test_resolve_workspace_accepts_absolute_dir_path — legitimate
    absolute paths are created and returned.
  test_resolve_workspace_rejects_relative_worktree_path — same guard
    for worktree kind.
  test_build_worker_context_caps_prior_attempts — 25 runs → exactly
    _CTX_MAX_PRIOR_ATTEMPTS shown, omitted marker present,
    attempt numbering preserves original index.
  test_build_worker_context_caps_comments — 100 comments → 30 shown,
    70 in the omitted marker.
  test_build_worker_context_caps_huge_summary — 1 MB summary on a
    prior run → context under 10 KB total, truncation marker visible.

189/189 kanban suite pass. Atypical-scenarios stress script still
passes all 28 scenarios with the new caps in effect.
2026-04-28 01:20:08 -07:00
Teknium 63b7b6d5bd test(kanban): atypical-scenario stress suite + clock-skew elapsed clamp fix
Added tests/stress/test_atypical_scenarios.py — 28 scenarios covering
atypical user inputs and environments that the normal tests assume
away. Surfaced one real UI bug in the process.

Bug found and fixed
  - CLI display of negative elapsed time when NTP jumps backward
    between claim_task and complete_task. The kernel faithfully stores
    started_at > ended_at; both CLI display sites (_cmd_show's Runs
    section at line 685, _cmd_runs's table at line 1153) now clamp
    elapsed to max(0, end - start), matching the dashboard JS which
    already had the same Math.max(0, ...) clamp. Regression test
    test_cli_show_clamps_negative_elapsed forces a future started_at
    via raw UPDATE and verifies neither show nor runs prints a
    `-<digits>s` token.

Atypical scenarios covered (all passing, 28 total)
  Data:
    - unicode_and_emoji: CJK, RTL (Hebrew/Arabic), ZWJ emoji sequences,
      control chars, null bytes in titles + metadata
    - huge_strings: 1 MB body + 1 MB summary + 50-level nested metadata
    - sql_injection_attempts: 6 classic payloads, parameterized queries
      hold across every string field
    - newlines_in_summary: full preserved on run, first line in event
      payload for notifier brevity
    - malformed_metadata_via_cli: 4 bad JSON values each cleanly
      rejected with stderr error, no partial task mutation
    - empty_string_fields: empty title rejected, whitespace-only title
      rejected, empty body accepted
    - tenant_with_newlines: multiline tenant strings survive
      board_stats

  Dependency graphs:
    - dependency_cycle: A→B→A refused
    - self_parent: cannot depend on itself
    - diamond_dependency: child promotes only when both parents done
    - wide_fan_out: 500 children promoted in 4ms via complete_task's
      internal recompute_ready
    - wide_fan_in: 500 parents → 1 child, promotion gated correctly
    - parent_in_different_status_states: child stays todo for parent
      in ready/running/blocked/triage/archived; only 'done' unblocks

  Workspace:
    - workspace_path_traversal: dir: workspaces are intentionally
      arbitrary paths (documented threat model)
    - workspace_nonexistent_path: spawn_failure counter increments,
      task returns to ready

  Clock:
    - clock_skew_start_greater_than_end: kernel stores faithfully,
      CLI now clamps at display (see Bug above)

  Filesystem:
    - hermes_home_with_spaces: works
    - hermes_home_with_unicode: works
    - hermes_home_via_symlink: two symlinks to same dir share DB
      (Path.resolve() in _INITIALIZED_PATHS)

  Scale extremes:
    - huge_run_count_on_one_task: 1000 runs → list_runs in <1ms,
      build_worker_context 3ms/63KB (flagged: unbounded on
      retry-heavy tasks; v2 cap candidate)
    - hundred_tenants: 5000 tasks / 100 tenants → stats 1ms, list 26ms
    - comment_storm: 1000 comments → 50KB worker context (same
      unbounded-context flag)

  Lifecycle:
    - completed_task_reclaim_attempt: done tasks can't be
      re-claimed/re-completed/blocked
    - archived_task_resurrection_attempt: archived tasks invisible to
      default list + all ops refused
    - unassigned_task_never_claims: dispatcher skips, task untouched

  Assignees:
    - assignee_with_special_chars: @-signs, dots, CJK, emoji, 200-char
      names, empty strings all handled

  Concurrency:
    - idempotency_key_race: two concurrent processes calling
      create_task with same key both get back the SAME task id,
      exactly one row in DB

  Dashboard:
    - dashboard_rest_with_weird_inputs: empty/huge/unicode titles,
      unknown fields, type mismatches handled correctly (200 for
      valid, 400/422 for invalid)

Observations flagged for v2 (not fixed, not blocking)
  - build_worker_context is unbounded on retry-heavy tasks (1000 runs
    → 63KB) and comment-heavy tasks (1000 comments → 50KB). A
    `--max-prior-attempts` / `--max-comments` cap would be appropriate.
  - `dir:` workspace paths are intentionally arbitrary; docs should
    note the threat model (trusted local user; path is stored
    verbatim, not sandboxed).

183/183 main kanban suite pass. Stress suite still opted out by
default; enable with `pytest --run-stress` or run scripts directly.
2026-04-28 01:13:17 -07:00
Teknium 123f8d0fed feat(kanban): battle-test suite + 2 real bugs it found
Added tests/stress/ — an opt-in suite that stresses the kernel with
real concurrency, real subprocesses, random property fuzzing, and scale
benchmarks. Exposed two real bugs the 4 audit passes missed.

Bugs found and fixed
  - _pid_alive returned True for zombie processes. os.kill(pid, 0)
    succeeds against zombies (process table entry exists until parent
    reaps), so a worker that exited normally but wasn't yet reaped
    would look alive to detect_crashed_workers indefinitely. Fixed by
    peeking at /proc/<pid>/status on Linux and treating State: Z as
    dead. No-op on other POSIX; Windows unchanged.
  - Task ID generator used 2 hex bytes (65k space). By birthday
    paradox, ~5% collision probability at 1k tasks, ~50% at 10k.
    Would raise UNIQUE constraint errors on large boards without a
    retry path. Bumped to 4 hex bytes (4.3B space). Surfaced by
    benchmarks trying to seed 10k tasks.

Stress suite (tests/stress/, opt-in via --run-stress)
  - test_concurrency.py — 5 processes race for 100 tasks. 1
    lost-claim race observed, 0 double-claims, 0 orphan runs.
  - test_concurrency_mixed.py — 10 workers + 1 reclaimer, 500
    tasks, random ops (claim/complete/block/unblock/archive).
    1518 events, 43 lost-claim races, zero invariant violations.
  - test_concurrency_reclaim_race.py — TTL < work duration so the
    reclaimer intentionally yanks tasks mid-work. 82 claims, 44
    reclaimed, 50 completed, 32 complete_refused (CAS correctly
    blocked late completes on reclaimed tasks).
  - test_subprocess_e2e.py — dispatcher spawns real python
    subprocess workers that heartbeat + complete via the CLI.
    Also exercises crash detection against a real dead PID via
    double-fork.
  - test_property_fuzzing.py — 500 randomized sequences, ~40k
    operations, 9 invariant checks after each step. Zero
    violations.
  - test_benchmarks.py — latency at 100/1k/10k tasks. Key numbers:
    dispatch_once @ 10k = 4ms, recompute_ready @ 10k = 47ms,
    list_tasks @ 10k = 50ms, build_worker_context w/ 50 parents
    = 1ms. Erosika's 10k starvation flag is pessimistic by roughly
    an order of magnitude.

Regression tests added to the main suite (+2)
  - test_pid_alive_detects_zombie: /proc check against a real
    zombified process (Linux-only, skipped elsewhere).
  - test_task_ids_dont_collide_at_scale: 500 creates, verify
    uniqueness + format.

New harness
  - tests/stress/conftest.py skips the suite by default; opt in with
    pytest --run-stress. Scripts are also __main__-runnable directly
    since they were developed as standalone stress tests.
  - tests/stress/_fake_worker.py: minimal Python worker that
    exercises the real subprocess contract (reads HERMES_KANBAN_TASK,
    heartbeats, completes via CLI).
  - tests/stress/README.md explains opt-in usage.

Test count: 182/182 kanban suite still pass under scripts/run_tests.sh;
stress suite is additive and out-of-band.
2026-04-27 21:37:34 -07:00
Teknium a24c6e191f fix(kanban): address @erosika's pre-merge review (issue #16102)
Six concrete bugs + two cheap v2 extensions from the review at
https://github.com/NousResearch/hermes-agent/issues/16102#issuecomment-4331125835
Larger items (structured comments as session substrate, taxonomy
reorg) deferred to v2 with reply posted on the issue.

Pre-merge bug fixes
  - unblock_task: close any stale current_run_id pointer with a
    reclaimed run inside the unblock txn. Defensive; the invariant
    holds under current data paths (block_task already closes the
    run) but a future or external write that leaves the pointer
    dangling would otherwise persist across the ready->blocked->
    ready cycle. Mirrors the same pattern in claim_task +
    archive_task.
  - Migration backfill: wrap the in-flight backfill loop in
    write_txn and add a CAS guard (`current_run_id IS NULL`) on
    the pointer UPDATE, with a cleanup path that marks any orphan
    run row reclaimed if the CAS fails. Prevents races against a
    concurrent dispatcher between SELECT and INSERT.
  - Notifier sub leak on non-done terminals: unsub on the last
    delivered event's kind being terminal (completed / blocked /
    gave_up / crashed / timed_out), not just on task.status in
    (done, archived). blocked / gave_up / crashed / timed_out used
    to fire one ping then strand the subscription row forever.
  - Notifier thrashes dead chats: per-subscription send-failure
    counter keyed on (task_id, platform, chat_id, thread_id).
    After 3 consecutive adapter.send exceptions, drop the sub
    automatically. Counter resets on any successful send.

Daemon ops visibility
  - run_daemon on_tick now tracks consecutive ticks where the
    ready queue is non-empty but 0 spawns succeeded. After 6 such
    ticks (default ~30s at interval=5), emits a WARN line to
    stderr pointing at profile health (venv, PATH, credentials)
    and `hermes kanban list --status blocked`. Rate-limited to
    one message per 5 minutes so a persistent outage doesn't
    spam logs.

v2 extensions shipped in scope (pure upside)
  - build_worker_context: new "Recent work by @assignee" section
    surfacing the 5 most-recent completed runs for the current
    task's assignee (excluding this task). Bounded, cached by
    the natural LIMIT, no new dependencies. Skipped when the
    task has no assignee.
  - Gateway notifier message prefix: terminal pings now lead
    with `@<assignee>` so fleets (one chat subscribing to many
    tasks with different workers) stay legible at a glance.
    One-line template change.

Deferred to v2 (noted in reply to erosika)
  - recompute_ready full-scan starvation at 10k+ tasks: dirty-set
    approach is a real refactor; fine as follow-up.
  - Skill ↔ assignee validation for routing: depends on skill
    introspection surface that isn't nailed down.
  - Structured comments (in_reply_to / addressed_to / kind) as
    multi-peer session substrate: schema-affecting, exactly the
    v2-scope design vulcan flagged shouldn't cram into this PR.
  - Pattern vs mechanism taxonomy split in docs: pure docs reorg,
    low urgency.

Tests (+6 in core functionality)
  - unblock_invariant_recovery (engineered leak, defensive close)
  - unblock_normal_path_no_spurious_run (no run created on happy
    block->unblock; erosika's main concern)
  - migration_backfill_idempotent_under_re_run (3x init_db on a
    legacy-shape DB yields exactly 1 run row, not 3)
  - build_worker_context_includes_role_history (role continuity)
  - build_worker_context_role_history_skipped_when_no_assignee
  - build_worker_context_role_history_bounded_to_5

180/180 kanban suite pass under scripts/run_tests.sh. Live-smoke
exercised all three kernel fixes end-to-end with isolated
HERMES_HOME.
2026-04-27 20:40:49 -07:00
Teknium 7206eed319 docs(kanban): add step-by-step tutorial with 10 dashboard screenshots
New website/docs/user-guide/features/kanban-tutorial.md walks four
user stories end-to-end, each backed by a real screenshot of the
dashboard running against seeded data.

Stories
  1. Solo dev shipping a feature (parent->child dependencies,
     structured handoff, run history rendering).
  2. Fleet farming (parallel independent tasks across 3 assignees,
     lanes-by-profile grouping, dispatcher daemon).
  3. Role pipeline with retry (PM spec -> eng implements -> review
     blocks -> eng retries -> review approves; two-run history
     visible in the drawer; downstream workers pull parent
     summary+metadata).
  4. Circuit breaker + crash recovery (2 spawn_failed + 1 gave_up
     for a deploy with missing creds; 1 crashed + 1 completed for
     an OOM-killed migration that recovered on retry).

Each story shows both CLI commands and the dashboard drawer
equivalent. Screenshots captured via playwright + chromium at 2x
device scale, then repalettized with PIL (22MB -> 6.1MB for the
10-image set, no visible quality loss verified against vision).

Side updates
  - website/sidebars.ts: added kanban-tutorial under features.
  - website/docs/user-guide/features/kanban.md: prefix banner
    linking new readers to the tutorial before the reference.

All image references validate: `/img/kanban-tutorial/*` maps to
website/static/img/kanban-tutorial/ (10 files). Docusaurus build
not run locally (no node_modules in worktree); CI build on merge
will confirm.
2026-04-27 20:28:53 -07:00
Teknium 1619c0e503 fix(kanban): third pass — auto-init on first use, show --json carries runs[]
Found during full-stack live-test of the kanban system: two bugs where
the kernel and CLI didn't match the documented contract.

Kernel
  - `connect()` now auto-runs schema creation + migrations on the
    first connection to a given DB path, matching its docstring
    ('Open (and initialize if needed)' — which was aspirational until
    now). Module-level _INITIALIZED_PATHS cache keeps subsequent
    connects cheap. Previously the docstring lied: every path that
    went through connect() on a fresh HERMES_HOME raised 'no such
    table: tasks' — only `hermes kanban init` and `daemon` triggered
    schema creation.
  - `init_db()` always re-runs the migration pass (clears the cache
    entry first). Callers that know the on-disk schema may have
    drifted — tests writing legacy event kinds, external tools
    upgrading an old DB file — can force re-migration.

CLI
  - `kanban_command()` entry point auto-inits the DB before
    dispatching any subcommand. Idempotent; the underlying
    connect-based init pattern makes this a one-line SELECT against
    sqlite_master after the first call.
  - `hermes kanban show --json` now includes:
      - `runs`: full attempt history (id, profile, step_key, status,
        outcome, summary, error, metadata, worker_pid, started_at,
        ended_at)
      - `run_id` on every event object
    Dashboard API already had both; CLI was behind. Now scripts that
    inspect a task can use `show --json` alone.
  - `hermes kanban show` (human-readable) prints a Runs section
    matching the `runs` subcommand's format, and each Event line
    prefixes its run_id when present. Makes attempt attribution
    visible at a glance without a second command.

Tests (+3)
  - cli_create_on_fresh_home_auto_inits (subprocess, no init_db call,
    must succeed — covers the most common first-user path)
  - connect_auto_inits_fresh_db (direct kernel use without init_db)
  - cli_show_json_carries_runs (runs[] present; events carry run_id)

174/174 kanban suite pass under scripts/run_tests.sh.

Live-tested end-to-end
  - Phases 1-6: CLI subprocess (create/claim/complete/bulk-guard/
    synthetic-run/reclaim-via-TTL/multi-attempt history)
  - Phases 7-10: Dashboard FastAPI TestClient (POST/PATCH with
    summary+metadata, drag-drop running->ready, archive-while-
    running, mark-done-with-handoff)
  - Phases 11-13: Dispatcher with stub spawn_fn (3 tasks success,
    3 failures -> gave_up circuit breaker, event.run_id attribution
    across retries)
  - Phases 14-15: WebSocket /events (run_id payload, auth rejects
    wrong tokens with code 1008)
  - Phase 16: Gateway notifier (unseen_events_for_sub returns
    run_id on events, message renders 'done — title' + handoff
    summary from event payload, crashed event path, sub cleanup)

Every surface — kernel, dispatcher, CLI, dashboard REST, dashboard
WebSocket, gateway notifier — exercised end-to-end against a live
FastAPI app with a real SQLite DB in an isolated HERMES_HOME. All
passed.
2026-04-27 19:41:08 -07:00
Teknium e27c819de3 fix(kanban): deep-scan pass 2 — synthetic runs, event.run_id plumbing, invariant recovery, live drawer refresh
Second integration audit covering surfaces the first pass didn't hit.
Found eight issues spanning kernel, dashboard frontend, notifier, and CLI.
All behavioral / UX fixes; no schema change.

Kernel
  - complete_task on a never-claimed task (ready/blocked → done with no
    run in flight) was silently dropping the summary/metadata/result
    onto a non-existent run. Now synthesizes a zero-duration run
    (started_at == ended_at) so attempt history is complete. Only
    fires when there's actually handoff data to persist — bare
    complete_task(tid) remains a no-op for run creation.
  - block_task on a never-claimed task had the same bug for --reason.
    Same fix: synthesize a zero-duration run when a reason is passed.
  - Event dataclass gained a `run_id: Optional[int] = None` field.
    list_events, unseen_events_for_sub, and the dashboard _event_dict
    were all SELECTing the column but dropping it on the way out,
    so downstream consumers couldn't group events by attempt. Every
    read path now surfaces run_id.
  - claim_task got a defensive invariant-recovery step: if somehow
    `current_run_id` is non-NULL on a task in 'ready' status (invariant
    violation from an unknown code path), close the leaked run as
    'reclaimed' inside the same txn as the new claim. No-op in the
    common case; belt-and-suspenders in case a future code path forgets
    to clear the pointer.

Dashboard
  - GET /tasks/:id events array now carries run_id per event (via
    _event_dict).
  - WebSocket /events SELECT now includes run_id in the pushed event
    payload.
  - TaskDrawer reloads itself on live events for its own task id. New
    `taskEventTick[taskId]` state in the Board, incremented on every
    WS event, passed down as `eventTick` prop; drawer's useEffect
    depends on it. Previously, background workers completing a task
    the user was viewing left the drawer showing stale data until
    manual close/reopen.
  - CSS: added `.hermes-kanban-run--ended` rule for the fallback class
    the JS emits when outcome is unset. Harmless before; just
    inconsistent.

CLI
  - `hermes kanban watch --kinds` help text listed the legacy event
    name `spawn_auto_blocked`. The kernel migration renames it to
    `gave_up`, so users typing the documented name got zero matches.
    Now shows the current lexicon (`completed,blocked,gave_up,
    crashed,timed_out`).

Tests (+6 in core functionality, +1 in dashboard plugin)
  - complete_never_claimed_task_synthesizes_run
  - block_never_claimed_task_synthesizes_run
  - complete_never_claimed_without_handoff_skips_synthesis
  - event_dataclass_carries_run_id (created.run_id None, completed.run_id matches)
  - unseen_events_for_sub_includes_run_id (notifier path)
  - claim_task_recovers_from_invariant_leak (engineer the leak, verify recovery)
  - event_dict_includes_run_id (dashboard API shape)

171/171 kanban suite pass under scripts/run_tests.sh. Live-smoke (isolated
HERMES_HOME via execute_code) exercised all six fixed paths plus the
claim-after-leak recovery sequence.

Docs
  - Runs section: new 'Synthetic runs for never-claimed completions'
    and 'Live drawer refresh' paragraphs explaining the invariants.
  - Event reference: `created` / `promoted` / `unblocked` entries now
    explicitly note `run_id` is `NULL`; `completed` / `blocked`
    describe synthetic-run fallback.
2026-04-27 19:23:49 -07:00
Teknium 1c78f6627a docs(kanban): document audit-pass invariants — bulk-close guard, reclaimed-on-status-change, completed event carries summary
- Runs section: dashboard PATCH parity (summary/metadata forward),
  `completed` event embeds first-line summary for notifiers, bulk
  --summary/--metadata refused, archive/drag-drop reclaim semantics.
- Event reference: added Payload column to Lifecycle and Edits
  tables; called out the invariant that `status` carries run_id
  when closing a reclaimed run.
2026-04-27 08:46:08 -07:00
Teknium 8ef2ae6502 fix(kanban): audit pass — close orphaned runs on archive / dashboard direct-status / drag-drop
Integration audit of the runs-as-first-class work (0146cb2bd) found five
bugs where structured runs got orphaned or dashboard parity was missing.
All behavioral fixes; no schema change needed.

Kernel
  - archive_task: when called on a running task, now closes the
    in-flight run with outcome='reclaimed' and clears current_run_id.
    Previously, dashboard bulk-archive or CLI `kanban archive <running>`
    would leave the task_runs row open with ended_at=NULL forever and
    strand the pointer. Adds the claim_lock / claim_expires / worker_pid
    clearing to the UPDATE so the task row is clean too.
  - complete_task: embeds the first-line handoff summary in the
    `completed` event payload (capped at 400 chars). Notifier can now
    render `✔ task done — <title>\n<summary>` without a second SQL hit,
    and the full summary still lives on the run row.

Dashboard plugin
  - _set_status_direct: drag-drop OFF 'running' (to 'ready', 'todo',
    'triage', 'done' — anywhere except back to 'running') now closes
    the active run with outcome='reclaimed'. Clears worker_pid too.
    Snapshots previous status + current_run_id before the UPDATE so
    the decision has the right before-state. status event rows now
    carry run_id when closing a run, NULL otherwise.
  - UpdateTaskBody: adds `summary` and `metadata` fields. PATCH
    /tasks/:id with status='done' now forwards them to complete_task,
    giving the dashboard parity with `hermes kanban complete --summary
    ... --metadata ...`. Previously these fields only existed on the
    CLI.

CLI
  - `hermes kanban complete a b c --summary X` or `--metadata Y`:
    refused with a clear stderr message instead of silently applying
    the same handoff to every task. Bulk-close without handoff flags
    still works. (Note: hermes_cli.main discards subcommand exit
    codes via `args.func(args)` without propagating; tracked
    separately. Side-effect check is the real guard.)

Gateway notifier
  - Completion message prefers run.summary (carried in event payload)
    over task.result. task.result remains the fallback for legacy rows
    written before runs shipped.
  - Docstring: renamed stale `spawn_auto_blocked` reference to
    `gave_up` / `timed_out` — matches the actual TERMINAL_KINDS
    tuple, which was already correct in code.

Tests (+8 in core functionality, +3 in dashboard plugin)
  - archive_of_running_task_closes_run
  - archive_of_ready_task_does_not_create_spurious_run
  - dashboard_direct_status_change_off_running_closes_run
  - dashboard_direct_status_change_within_same_state_is_noop_for_runs
  - cli_bulk_complete_with_summary_rejects (side-effect assertion)
  - cli_bulk_complete_without_summary_still_works
  - completed_event_payload_carries_summary
  - completed_event_payload_summary_none_when_missing
  - patch_status_done_with_summary_and_metadata
  - patch_status_done_without_summary_still_works (legacy path)
  - patch_status_archive_closes_running_run (E2E through FastAPI TestClient)

164/164 kanban suite pass under scripts/run_tests.sh. Live smoke
(execute_code with isolated HERMES_HOME) covered all five fixed paths
plus a re-claim-after-drag-drop to confirm the fresh run is tracked
correctly after the orphan close.
2026-04-27 07:44:39 -07:00
Teknium 0146cb2bd2 feat(kanban): runs as first-class (v1); structured handoffs; forward-compat for v2 workflows
Addresses vulcan-artivus's RFC review on issue #16102. Picks up the
structural changes that are expensive to retrofit later and zero-cost
to land now; defers workflow-template routing + per-stage lanes to v2
(kept forward-compat hooks in the schema).

Kernel
  - New `task_runs` table. Each claim opens a run (pid, claim_lock,
    heartbeat, max_runtime, started_at), each terminal transition
    closes it with an outcome (completed / blocked / crashed /
    timed_out / spawn_failed / gave_up / reclaimed). Multiple rows per
    task when retries happen, preserving full attempt history.
  - `tasks.current_run_id` points at the active run (NULL when idle);
    denormalised for cheap reads.
  - `task_events.run_id` carries the run a given event belongs to so
    UIs group events by attempt. claim/spawned/complete/block/crash/
    timeout/spawn_fail/gave_up/heartbeat events are all run-scoped;
    created/promoted/assigned/edited stay task-scoped (run_id=NULL).
  - Legacy DBs: migration adds the columns + indexes + synthesizes a
    run row for any task that's 'running' before the runs table
    existed, so subsequent complete/heartbeat/reclaim calls have a
    target. Idempotent.

Structured handoff
  - `complete_task(summary=, metadata=)` persists both on the closing
    run. `summary` falls back to `result` when omitted so single-run
    callers don't duplicate. `metadata` is a free-form dict
    ({changed_files, tests_run, findings, ...}).
  - `build_worker_context` rewrites: now reads "Prior attempts on this
    task" (closed runs: outcome, summary, error, metadata) and
    "Parent task results" pulls run.summary + run.metadata of the
    most-recent completed run per parent, falling back to task.result
    for legacy rows without runs. Retrying workers see why earlier
    attempts failed; downstream workers see parent handoffs
    structurally, not as loose `result` strings.

CLI
  - `hermes kanban complete <id> --summary "..." --metadata '{"files":1}'`.
    JSON is parsed and rejected with exit-2 if malformed.
  - New `hermes kanban runs <id> [--json]` verb. Shows per-run rows:
    outcome, profile, elapsed, summary, error. JSON mode serializes
    the full run dataclass for scripting.

Dashboard plugin
  - GET /tasks/:id now carries a runs[] array alongside task / events /
    comments / links. Each run serialised with outcome, summary,
    metadata, worker_pid, elapsed fields.
  - New Run History section in the drawer. Outcome-coloured left
    border (green=active, blue=completed, amber=reclaimed,
    red=crashed/timed_out/gave_up/blocked). Collapsed when >3 runs
    with a '+N earlier' toggle. Shows summary + error + metadata
    inline.

Forward-compat for v2 (vulcan's workflow templates + stages)
  - `tasks.workflow_template_id` and `tasks.current_step_key` added as
    nullable columns. v1 kernel ignores them for routing; v2 will add
    workflow_templates + workflow_steps tables and wire the dispatcher
    to consult them. task_runs has a matching `step_key` column. Lets
    a v2 release land additively without another schema migration.

Tests (+22 in test_kanban_core_functionality.py, +2 in dashboard)
  - run_created_on_claim / run_closed_on_complete_with_summary
  - run_summary_falls_back_to_result
  - multiple_attempts_preserved_as_runs (3 attempts: reclaimed →
    crashed → completed, all visible in list_runs)
  - run_on_block_with_reason / run_on_spawn_failure_records_failed_runs
    (5 spawn_failed runs + 1 gave_up run)
  - event_rows_carry_run_id (task-scoped vs run-scoped split)
  - build_worker_context_includes_prior_attempts
  - build_worker_context_uses_parent_run_summary (metadata JSON in context)
  - migration_backfills_inflight_run_for_legacy_db (simulates a
    pre-migration running task, re-runs init_db, asserts backfill)
  - forward_compat_columns_writable
  - cli_runs_verb + cli_runs_json
  - cli_complete_with_summary_and_metadata (JSON round-trip through
    shlex + argparse)
  - cli_complete_bad_metadata_exits_nonzero
  - task_detail_includes_runs / task_detail_runs_empty_before_claim

269/269 kanban suite pass under scripts/run_tests.sh. Live-smoke
covered: single-attempt complete → run closed + summary persisted;
retry scenario → two runs visible (blocked + completed); parent run
summary + metadata surfaced to child via build_worker_context;
forward-compat columns writable via UPDATE; GET /tasks/:id returns
runs[].

Docs
  - New 'Runs — one row per attempt' section in kanban.md: the
    why (full attempt history, structured metadata), the two-table
    model (task is logical, run is execution), the structured handoff
    shape (--summary / --metadata), example CLI + dashboard output,
    forward-compat note for v2.
  - Event reference updated to mention task_events.run_id.
  - CLI reference gains 'hermes kanban runs <id>'.

Not in v1 (deferred to v2):
  - Workflow templates (workflow_templates + workflow_steps tables,
    stage-based routing, success/failure step links).
  - 'stage' as a distinct axis from status in the UI.
  - Shared-by-default workspace binding across stages of the same
    workflow run.
  - Pipeline replacement for the kanban-orchestrator skill (the
    orchestrator's 'decompose, don't execute' guidance is still
    correct; it becomes partly redundant once workflows land).
2026-04-27 06:54:19 -07:00
Teknium da7d09c3b6 feat(kanban): max-runtime timeouts, worker heartbeats, assignees picker, event vocab cleanup
Ports four items from the Multica audit (https://github.com/multica-ai/multica).
Dropped their cross-host server/daemon architecture and their Postgres+pgvector
skill search — both the wrong shape for our single-host SQLite kernel.

1. Per-task max-runtime (`max_runtime_seconds` column)
   - New kernel function `enforce_max_runtime(conn)` runs in every dispatch
     tick. When a running task's elapsed time exceeds the cap, we SIGTERM
     the worker, wait a 5 s grace (polling _pid_alive), then SIGKILL. The
     task goes back to 'ready' with a `timed_out` event and re-queues
     on the next tick (unless the spawn-failure circuit breaker has
     already parked it).
   - Host-local only: lock prefix must match this host's claimer_id so we
     never signal a PID on another machine.
   - CLI: `hermes kanban create --max-runtime 30m | 2h | 1d | <seconds>`.
     New `_parse_duration` helper accepts s/m/h/d suffixes or bare
     integers.
   - Dashboard POST body + the card's `max_runtime_seconds` field.

2. Worker heartbeat (`last_heartbeat_at` column, `heartbeat` event)
   - `heartbeat_worker(conn, task_id, note=None)` emits the event and
     touches last_heartbeat_at. Refused when the task isn't running.
   - CLI: `hermes kanban heartbeat <id> [--note "..."]`.
   - kanban-worker skill instructs workers to heartbeat during long
     loops (training runs, encodes, crawls, batch uploads).
   - Separate signal from PID crash detection: a worker's Python can
     still be alive while the actual work process is stuck. Heartbeat
     absence is diagnostic; future work can auto-block on stale
     heartbeats but v1 just surfaces the signal.

3. Assignee enumeration (`known_assignees`, `list_profiles_on_disk`)
   - Scans ~/.hermes/profiles/ for dirs containing config.yaml + unions
     with current assignees on the board. Each entry returns
     {name, on_disk, counts: {status: n}}.
   - CLI: `hermes kanban assignees [--json]`. Also hooked into
     `hermes kanban init` which now prints discovered profiles so new
     installs see 'these are the assignees you can target' immediately.
   - Dashboard: GET /api/plugins/kanban/assignees for the picker.

4. Event vocab cleanup (three renames + three new kinds)
   - `ready` → `promoted` (fires when deps clear; clearer semantic).
   - `priority` → `reprioritized` (past-tense verb, matches others).
   - `spawn_auto_blocked` → `gave_up` (short, memorable; the circuit
     breaker gave up on this task).
   - New: `spawned` (emitted with {pid} on successful spawn),
     `heartbeat` ({note?}), `timed_out`
     ({pid, elapsed_seconds, limit_seconds, sigkill}).
   - One-shot migration in `_migrate_add_optional_columns` renames
     legacy rows in-place on init_db(), so existing DBs upgrade cleanly.
   - Gateway notifier's TERMINAL_KINDS set updated; timed_out gets its
     own ⏱ message template, gave_up renamed from 'auto-blocked'.
   - Plugin_api.py's two 'priority' emit sites renamed to
     'reprioritized'.
   - Documented in a new 'Event reference' section in kanban.md,
     grouped into three clusters (lifecycle / edits / worker
     telemetry) with payload shapes.

Tests (+18 in tests/hermes_cli/test_kanban_core_functionality.py,
136/136 pass):
  - max_runtime_terminates_overrun_worker: real SIGTERM flow with
    _pid_alive stub, verifies event payload + state reset.
  - max_runtime_none_means_no_cap: unbounded tasks aren't timed out.
  - create_task_persists_max_runtime.
  - enforce_max_runtime_integrates_with_dispatch: kernel-level +
    dispatch_once chaining.
  - heartbeat_on_running_task + heartbeat_refused_when_not_running.
  - cli_heartbeat_verb with --note round-trip.
  - recompute_ready_emits_promoted_not_ready.
  - spawn_failure_circuit_breaker_emits_gave_up.
  - spawned_event_emitted_with_pid.
  - migration_renames_legacy_event_kinds (injects old rows, re-runs
    init_db, asserts rename).
  - list_profiles_on_disk (tmp_path + config.yaml filter).
  - known_assignees_merges_disk_and_board (profiles on disk + board
    assignees + per-status counts).
  - cli_assignees_json.
  - parse_duration_accepts_formats (s/m/h/d/float).
  - parse_duration_rejects_garbage.
  - cli_create_max_runtime_via_duration (2h → 7200).
  - cli_create_max_runtime_bad_format_exits_nonzero.

Live smoke: POST /tasks with max_runtime_seconds round-trips;
/assignees returns the union of on-disk + board-assigned names;
PATCH priority produces 'reprioritized' events (not 'priority');
board cards expose max_runtime_seconds + last_heartbeat_at.

Docs (website/docs/user-guide/features/kanban.md):
  - New 'Event reference' section with three-cluster table
    (lifecycle / edits / worker telemetry) + payload shapes.
  - CLI reference updated for --max-runtime, heartbeat, assignees.
  - Gateway notifications section updated for the new TERMINAL_KINDS.

Not ported from Multica (deliberate, documented in the out-of-scope
section already): Postgres+pgvector skill search (heavy deps conflict
with SQLite kernel), server+daemon cross-host model (we're
single-host on purpose), first-class agent identity with threaded
comments (we keep the board profile-agnostic).
2026-04-27 06:32:17 -07:00
Teknium af8d43dbbb feat(kanban): core hardening — daemon, circuit breaker, crash detect, logs, notify, bulk, stats
Eliminates every 'known broken on day one' item in the core functionality
audit. The board is now self-driving (daemon, not cron), self-healing
(crash detection, spawn-failure circuit breaker), and self-reporting
(logs, stats, gateway notifications).

Dispatcher
  - New `hermes kanban daemon` long-lived loop with --interval, --max,
    --failure-limit, --pidfile, --verbose, signal-clean shutdown
    (SIGINT/SIGTERM via threading.Event). A kb.run_daemon() entry point
    lets tests drive it inline without subprocess.
  - `hermes kanban init` now prints the dispatcher setup hint so users
    don't leave the board off-by-default. Ships a systemd user unit at
    plugins/kanban/systemd/hermes-kanban-dispatcher.service.
  - Removed the old 'add this to cron' doc path. Cron runs agent
    prompts (LLM cost per tick) — unacceptable for a per-minute
    coordination loop.

Worker aliveness / safety
  - Spawn returns the child's PID; dispatcher stores it on the task row
    and calls detect_crashed_workers() every tick. If the PID is gone
    but the claim TTL hasn't expired, the task drops back to ready with
    a 'crashed' event. Host-local only — cross-host PIDs are ignored
    per the single-host design.
  - Spawn-failure circuit breaker: after N consecutive spawn_failed
    events on the same task (default 5), the dispatcher auto-blocks
    with the last error as the reason. Success resets the counter.
    Workspace-resolution failures count against the same budget.
  - Log rotation: _rotate_worker_log trims at 2 MiB, keeps one
    generation (.log.1), bounds per-task disk usage at ~4 MiB.

Idempotency / dedup
  - create_task(idempotency_key=...) returns the existing non-archived
    task id for retried webhooks. --idempotency-key on the CLI, json
    body field on the dashboard plugin. Archived tasks don't block a
    fresh create with the same key.

CLI surface
  - Bulk verbs: complete, unblock, archive accept multiple ids;
    block accepts --ids for sibling blocks with the same reason.
  - New verbs: daemon, watch (live event tail filtered by
    assignee/tenant/kinds), stats, log, notify-subscribe,
    notify-list, notify-unsubscribe.
  - dispatch gains --failure-limit + crashed/auto_blocked columns in
    JSON output and human-readable output.
  - gc accepts --event-retention-days / --log-retention-days; prunes
    task_events for terminal tasks and old log files.

Gateway integration
  - New GatewayRunner._kanban_notifier_watcher: polls
    kanban_notify_subs every 5s, pushes ✔/⏸/✖ messages to subscribed
    chats for completed/blocked/spawn_auto_blocked/crashed events.
    Cursor-advanced per-sub; auto-removed when the task reaches
    done/archived. Runs alongside the session expiry and platform
    reconnect watchers — SQLite work in asyncio.to_thread so the
    event loop never blocks.
  - /kanban create in the gateway auto-subscribes the originating
    chat (platform + chat_id + thread_id). Users see
    '(subscribed — you'll be notified when t_abcd completes or
    blocks)' appended to the response.

Dashboard plugin
  - GET /stats returns board_stats (by_status, by_assignee,
    oldest_ready_age_seconds).
  - GET /tasks/:id/log returns the worker log with optional ?tail=N
    cap. 404 on unknown task, exists=false when the task has never
    spawned.
  - POST /tasks accepts idempotency_key; both Pydantic body and the
    create_task kwarg now round-trip.
  - /board attaches task.age (created/started/time_to_complete in
    seconds) so the UI can colour stale cards without recomputing.
  - Card CSS: amber border after N minutes, red border when clearly
    stuck (tier per status: running 10m/60m, ready 1h/24h, todo
    7d/30d, blocked 1h/24h).
  - Drawer: new Worker log section, auto-loads on mount, last 100 KB
    cap with on-disk path surfaced when truncated.

Kernel
  - Schema additions: tasks.idempotency_key, tasks.spawn_failures,
    tasks.worker_pid, tasks.last_spawn_error; new
    kanban_notify_subs table. All gated by _migrate_add_optional_columns
    so legacy DBs upgrade cleanly.
  - release_stale_claims / complete_task / block_task now all clear
    worker_pid so crash detection doesn't false-positive on reclaimed
    tasks.
  - read_worker_log fixed: tail-skip no longer eats one-giant-line
    logs (common with child processes that don't flush newlines
    before dying).

Tests (tests/hermes_cli/test_kanban_core_functionality.py, 28 new)
  - Idempotency: same key returns existing, archived doesn't block,
    no key never collides
  - Circuit breaker: auto-blocks after limit, success resets counter,
    workspace-resolution failure counts against budget
  - Aliveness: _pid_alive helper, detect_crashed_workers reclaims
    exited child
  - Daemon: runs and stops cleanly via stop_event, survives a tick
    exception
  - Stats + task_age helpers
  - Notify subs: CRUD, cursor advances, distinct-thread is a separate row
  - GC: events-only-for-terminal-tasks, old worker logs deleted
  - Log: rotation keeps one generation, read_worker_log tail
  - CLI: bulk complete/archive/unblock/block, create with
    --idempotency-key, stats --json, notify-subscribe+list, log
    missing task, gc reports counts
  - run_slash parity: smoke-tests every registered verb (23
    invocations); none may raise or return empty string

Full kanban test suite: 234/234 pass under scripts/run_tests.sh
(60 original + 30 dashboard plugin + 28 new core + 116 command
registry). Live smoke covers /stats, idempotency, age, log endpoint
with and without content, log?tail= truncation signal, 404 on unknown
task.

Docs (website/docs/user-guide/features/kanban.md)
  - 'Core concepts' rewritten: new statuses (triage), idempotency key,
    dispatcher-as-daemon-not-cron with circuit breaker behaviour
    documented.
  - Quick start swapped to daemon. New systemd section covers user
    service install.
  - New sections: idempotent create, bulk verbs, gateway
    notifications, out-of-scope single-host note (kanban.db is local;
    don't expect multi-host).
  - CLI reference updated for every new verb, every new flag.
2026-04-26 13:01:09 -07:00
Teknium 27fc6c1086 feat(kanban): bulk ops, drawer edit, dep editor, markdown, touch, config
The dashboard plugin gets the last layer of features that turn it from a
'usable read surface with drag-drop' into a 'full kanban UI' — no more
'drop to CLI to do X' moments from inside the tab.

Plugin backend
  - POST /tasks/bulk — apply the same patch (status / archive / assignee
    / priority) to every id in the request body. Each id runs
    independently: one bad id reports {ok: false, error: ...} without
    aborting siblings. Status transitions that aren't legal for the
    current state are surfaced per-id ('transition to done refused').
    Used by the multi-select bulk action bar.
  - GET /config — returns the dashboard.kanban section of config.yaml
    (default_tenant, lane_by_profile, include_archived_by_default,
    render_markdown) with sensible defaults when the section is absent.
    Loaded once by the SPA to preselect filters and toggle markdown
    rendering.
  - _conn() helper — every handler now goes through it, calling
    kanban_db.init_db() (idempotent) before every connection. Fresh
    installs work whether the first hit is GET /board, POST /tasks, or
    any other endpoint — no more 'no such table: tasks' when the CLI
    or a script hits the plugin before the dashboard has ever loaded.

Plugin UI (plugin bundle, +~12 KB)
  - Multi-select: per-card checkbox; shift/ctrl-click also toggles
    without opening the drawer. A BulkActionBar appears above the
    columns with batch → ready / complete / archive / reassign
    (profile dropdown + unassign option). Destructive batches confirm
    first. Partial failures from the backend are surfaced inline.
  - Drawer inline editing:
    - Click the title → TitleEditor swaps in an input, Enter saves,
      Escape cancels.
    - Click the Assignee meta row → AssigneeEditor input (empty string
      unassigns).
    - Click the Priority meta row → PriorityEditor numeric input.
    - New 'edit' button on Description → full-width textarea; Save /
      Cancel switch back to rendered view.
  - Dependency editor: chip list of parents + children with per-chip
    × button (calls DELETE /links). Add-parent / add-child dropdowns
    filter out self + already-linked tasks so you cannot re-add a
    duplicate edge or a self-loop. Cycle rejections from the server
    surface cleanly via the existing error banner.
  - Parent selection in InlineCreate: new dropdown listing every task
    on the board ('{id} — {title}') — picking one sends parents=[id]
    with the create payload, so the task lands in todo (or triage if
    created from the Triage column) with the dependency wired up.
  - Safe markdown rendering for description, comment bodies, and
    result. A small in-bundle renderer handles headings, bold, italic,
    inline code, fenced code, bullet lists, and http(s)/mailto links.
    Every substitution runs on HTML-escaped input (no raw HTML), links
    get target=_blank + rel=noopener,noreferrer. Disabled by config
    key dashboard.kanban.render_markdown=false (falls back to <pre>).
  - Touch drag-drop: attachTouchDrag() installs a pointerdown handler
    that spawns a drag proxy, tracks elementFromPoint under the finger,
    and dispatches a hermes-kanban:drop CustomEvent on the column when
    released. Desktop continues to use native HTML5 DnD. Columns
    listen for both.
  - ErrorBoundary already present from the prior commit catches any
    renderer throw; markdown escape + touch-proxy cleanup both have
    their own try/finally.

Tests (tests/plugins/test_kanban_dashboard_plugin.py — 90/90 pass)
  - bulk_status_ready: 3 tasks blocked, batch → ready, all move
  - bulk_archive hides all ids from default board
  - bulk_reassign changes every assignee
  - bulk_unassign_via_empty_string sets assignee back to None
  - bulk_partial_failure_doesnt_abort_siblings: bogus id in middle,
    good siblings still get priority=7
  - bulk_empty_ids_400
  - config_returns_defaults_when_section_missing
  - config_reads_dashboard_kanban_section (writes config.yaml, verifies
    every key round-trips)

Live smoke (real FastAPI app + isolated HERMES_HOME):
  - /config without section returns defaults
  - /config with dashboard.kanban section returns the configured values
  - POST /tasks as the first-ever request (no prior /board) succeeds —
    auto-init handles it
  - Link add + remove via POST /links + DELETE /links round-trip
  - Bulk priority bump on 2 ids, both get priority=5
  - Bulk archive hides ids from default board
  - PATCH {title, body} updates the task, markdown source survives
    the round trip
  - POST /tasks {triage: true, parents: [id]} lands in triage, not todo
  - Bulk partial: 2 good + 1 bogus returns per-id outcome

Docs (website/docs/user-guide/features/kanban.md)
  - 'What the plugin gives you' rewritten to reflect bulk, drawer
    edit, dep editor, parent-on-create, markdown, touch drag-drop.
  - New 'Dashboard config' subsection with a YAML example for
    dashboard.kanban.*.
  - REST table gains /tasks/bulk and /config rows.
2026-04-26 12:36:23 -07:00
Teknium 45806629c5 feat(kanban): Triage column, progress rollup, WS auth, lanes, polish
Follows up on the initial dashboard plugin with the items called out
during self-review — ships the GUI-reality claims the PR body made,
closes the WebSocket auth gap, and lands the 'Triage' status the design
spec's Fusion-style screenshot leads with.

Kernel changes
  - kanban_db.VALID_STATUSES gains 'triage'. status is TEXT without a
    CHECK constraint so no schema migration is needed.
  - create_task(triage=True) forces the initial status to 'triage'
    regardless of parents, and parent ids are still validated so the
    eventual link rows don't dangle. recompute_ready() only promotes
    'todo' -> 'ready', so triage tasks are naturally isolated from the
    dispatcher pipeline.
  - hermes kanban create gains --triage.
  Patterns table (docs) gains P9 'Triage specifier'.

Plugin backend (plugins/kanban/dashboard/plugin_api.py)
  - GET /board now auto-init's kanban.db on first read (idempotent).
    A fresh install shows an empty board instead of 'failed to load'.
  - GET /board returns a new 'progress' field per task — {done, total}
    of child-task completion, or None if the task has no children.
  - BOARD_COLUMNS prepends 'triage'.
  - POST /tasks accepts {triage: bool}; PATCH /tasks/:id accepts
    {status: 'triage'}.
  - WebSocket /events now requires ?token=<session_token> as a query
    param — browsers can't set Authorization on a WS upgrade, so this
    matches the pattern the in-browser PTY bridge uses. Constant-time
    compare against hermes_cli.web_server._SESSION_TOKEN. In bare-test
    contexts (no dashboard module) the check no-ops so the tail loop
    stays testable. Security boundary documented in the module header
    and in website/docs/user-guide/features/kanban.md.

Plugin UI (plugins/kanban/dashboard/dist/index.js + style.css)
  - Adds the Triage column (lilac dot) with helper text
    'Raw ideas — a specifier will flesh out the spec'. Inline-create
    from the Triage column parks new tasks in triage.
  - Status action row in the drawer gains '→ triage'.
  - Progress pill (N/M) on cards that have children. Full-complete
    state tints the pill green.
  - 'Lanes by profile' toolbar toggle — sub-groups the Running column
    by assignee so you see at a glance which specialist is busy on
    what.
  - Destructive status moves (done / archived / blocked) via drag-drop
    OR via the drawer action row now prompt for confirmation.
  - Escape closes the drawer.
  - Live-update reloads are debounced (250ms) so a burst of
    task_events triggers one refetch, not N.
  - WebSocket includes ?token= built from window.__HERMES_SESSION_TOKEN__.
  - WebSocket reconnect uses exponential backoff capped at 30s, not
    a fixed 1.5s spin loop, and surfaces a user-visible error on
    code-1008 (auth rejected) instead of reconnecting forever.
  - ErrorBoundary wraps the page — a bad card render shows a
    'rendering error, reload view' card instead of crashing the tab.

Tests (tests/plugins/test_kanban_dashboard_plugin.py, +5 tests = 21)
  - empty-board shape now asserts all 6 columns including 'triage'
  - create_triage_lands_in_triage_column
  - triage_task_not_promoted_to_ready (dispatcher bypasses triage)
  - patch_status_triage_works (both into triage and out of it)
  - board_progress_rollup (0/2 -> 1/2 -> childless cards = None)
  - board_auto_initializes_missing_db
  - ws_events_rejects_when_token_required (three sub-assertions:
    missing → 1008, wrong → 1008, correct → handshake accepted)

All 82 kanban tests pass under scripts/run_tests.sh.

Docs
  - kanban.md 'What the plugin gives you' fully rewritten to match
    shipped reality (triage, progress pill, assignee lanes,
    destructive-confirm, Escape-close, debounce).
  - New 'Security model' subsection documents the explicit-plugin-
    route-bypass, the WS token requirement, and the --host 0.0.0.0
    warning; also notes that kanban.db is profile-agnostic on purpose
    (the coordination primitive) so cross-profile visibility is
    expected.
  - CLI command reference shows --triage.
  - Collaboration patterns table adds P9 'Triage specifier'.
2026-04-26 12:26:43 -07:00
Teknium 4093201c47 feat(kanban): dashboard plugin — Linear/Fusion-style board UI
Ships plugins/kanban/dashboard/ as a bundled dashboard plugin. No core
changes — uses the standard dashboard plugin contract (manifest.json +
dist/index.js + plugin_api.py) documented in 'Extending the Dashboard'.

What the tab gives you:
- One column per kanban status (todo / ready / running / blocked / done;
  archived behind a toggle), column counts, coloured status dots.
- Cards with id, title, priority badge, tenant tag, assignee,
  comment/link counts, 'created N ago'.
- HTML5 drag-drop between columns — status change routes through the
  same kanban_db code the CLI /kanban verbs use, so the three surfaces
  (CLI, gateway, dashboard) can never drift.
- Inline create per-column (title, assignee, priority).
- Side drawer on card click: description, status action row
  (→ ready / → running / block / unblock / complete / archive),
  dependency links, comment thread with Enter-to-submit,
  last 20 events.
- Toolbar: search, tenant filter, assignee filter, show-archived,
  nudge-dispatcher (skip the 60s wait), refresh.
- Live updates via WebSocket tailing task_events — the board reflects
  CLI or gateway actions in real time.

REST surface under /api/plugins/kanban/: GET /board, GET /tasks/:id,
POST /tasks, PATCH /tasks/:id, POST /tasks/:id/comments, POST /links,
DELETE /links, POST /dispatch, WS /events. Every handler is a thin
wrapper around kanban_db — no new business logic.

Visually theme-aware: the plugin CSS reads only --color-*, --radius,
--font-mono etc. so it reskins with whichever dashboard theme is active.

Tests (tests/plugins/test_kanban_dashboard_plugin.py, 16 tests):
- empty board shape
- create + appears in ready column with tenant/assignee rollups
- tenant filter
- detail includes parents/children/events
- 404 on unknown task
- PATCH status: complete / block / unblock / ready drag-drop / running
- PATCH reassign, priority, edit, invalid-status rejection
- POST comment (plus empty-body rejection)
- POST link + DELETE link + cycle rejection
- POST dispatch (dry run)

All 76 kanban tests pass under scripts/run_tests.sh.

Docs: website/docs/user-guide/features/kanban.md gains a full
'Dashboard (GUI)' section covering install, architecture, REST surface,
live-updates mechanism, extending, and scope boundary.
2026-04-26 12:08:47 -07:00
Teknium 9f610aa8f3 docs(kanban): add GUI/Dashboard plugin section
The /kanban CLI + slash command are enough to run the board
headlessly, but triage and cross-profile supervision want a
visual board. Document the design as a dashboard plugin that:

- reads live state from kanban.db over a WebSocket on
  task_events (no polling)
- writes through run_slash() so CLI/gateway/GUI cannot drift
- mounts under /api/plugins/kanban/ following the existing
  'Extending the Dashboard' plugin shape

The plugin is strictly a thin layer over kanban_db — no new
business logic, nothing to merge into the kernel.
2026-04-26 11:57:04 -07:00
Teknium e1c5e741ad feat(kanban): durable multi-profile collaboration board (#16081)
New `hermes kanban` CLI subcommand + `/kanban` slash command + skills for
worker and orchestrator profiles. SQLite-backed task board
(~/.hermes/kanban.db) shared across all profiles on the host. Zero
changes to run_agent.py, no new core tools, no tool-schema bloat.

Motivation: delegate_task is a function call — sync fork/join, anonymous
subagent, no resumability, no human-in-the-loop. Kanban is the durable
shape needed for research triage, scheduled ops, digital twins,
engineering pipelines, and fleet work. They coexist (workers may call
delegate_task internally).

What this adds
- hermes_cli/kanban_db.py — schema, CAS claim, dependency resolution,
  dispatcher, workspace resolution, worker-context builder.
- hermes_cli/kanban.py — 15-verb CLI surface and shared run_slash()
  entry point used by both CLI and gateway.
- skills/devops/kanban-worker — how a profile should work a claimed task.
- skills/devops/kanban-orchestrator — "you are a dispatcher, not a
  worker" template with anti-temptation rules.
- /kanban slash command wired into cli.py and gateway/run.py. Bypasses
  the running-agent guard (board writes don't touch agent state), so
  /kanban unblock can free a stuck worker mid-conversation.
- Design spec at docs/hermes-kanban-v1-spec.pdf — comparative analysis
  vs Cline Kanban, Paperclip, NanoClaw, Gemini Enterprise; 8 patterns;
  4 user stories; implementation plan; concurrency correctness.
- Docs: website/docs/user-guide/features/kanban.md, CLI reference
  updated, sidebar entry added.

Architecture highlights
- Three planes: control (user + gateway), state (board + dispatcher),
  execution (pool of profile processes).
- Every worker is a full OS process, spawned as `hermes -p <profile>`.
  No in-process subagent swarms — solves NanoClaw's SDK-lifecycle
  failure class.
- Atomic claim via SQLite CAS in a BEGIN IMMEDIATE transaction; stale
  claims reclaimed 15 min after their TTL expires.
- Tenant namespacing via one nullable column — one specialist fleet
  can serve many businesses with data isolation by workspace path.

Tests: 60 targeted tests (schema, CAS atomicity, dependency resolution,
dispatcher, workspace kinds, tenancy, CLI + slash surface). All pass
hermetic via scripts/run_tests.sh.
2026-04-26 08:29:46 -07:00
1243 changed files with 10473 additions and 182518 deletions
-12
View File
@@ -5,15 +5,7 @@
# Dependencies
node_modules
**/node_modules
.venv
**/.venv
# Built artifacts that are regenerated inside the image. Excluded so local
# rebuilds on the developer's machine don't invalidate the npm-install layer
# that now depends on the full ui-tui/packages/hermes-ink/ tree being present.
ui-tui/dist/
ui-tui/packages/hermes-ink/dist/
# CI/CD
.github
@@ -25,7 +17,3 @@ ui-tui/packages/hermes-ink/dist/
# Runtime data (bind-mounted at /opt/data; must not leak into build context)
data/
# Compose/profile runtime state (bind-mounted; avoid ownership/secret issues)
hermes-config/
runtime/
-16
View File
@@ -398,19 +398,3 @@ IMAGE_TOOLS_DEBUG=false
# Override STT provider endpoints (for proxies or self-hosted instances)
# GROQ_BASE_URL=https://api.groq.com/openai/v1
# STT_OPENAI_BASE_URL=https://api.openai.com/v1
# =============================================================================
# MICROSOFT TEAMS INTEGRATION
# =============================================================================
# Register a Bot in Azure: https://dev.botframework.com/ → "Register a bot"
# Or use Azure Portal: Azure Active Directory → App registrations → New registration
# Then add the bot to Teams via the Bot Framework or App Studio.
#
# TEAMS_CLIENT_ID= # Azure AD App (client) ID
# TEAMS_CLIENT_SECRET= # Azure AD client secret value
# TEAMS_TENANT_ID= # Azure AD tenant ID (or "common" for multi-tenant)
# TEAMS_ALLOWED_USERS= # Comma-separated AAD object IDs or UPNs
# TEAMS_ALLOW_ALL_USERS=false # Set true to skip the allowlist
# TEAMS_HOME_CHANNEL= # Default channel/chat ID for cron delivery
# TEAMS_HOME_CHANNEL_NAME= # Display name for the home channel
# TEAMS_PORT=3978 # Webhook listen port (Bot Framework default)
+2 -12
View File
@@ -1,18 +1,8 @@
name: 'Setup Nix'
description: 'Install Nix and configure Cachix binary cache'
inputs:
cachix-auth-token:
description: 'Cachix auth token (enables push). Omit for read-only.'
required: false
default: ''
description: 'Install Nix with DeterminateSystems and enable magic-nix-cache'
runs:
using: composite
steps:
- uses: DeterminateSystems/nix-installer-action@ef8a148080ab6020fd15196c2084a2eea5ff2d25 # v22
- uses: cachix/cachix-action@1eb2ef646ac0255473d23a5907ad7b04ce94065c # v17
with:
name: hermes-agent
authToken: ${{ inputs.cachix-auth-token }}
continue-on-error: true
- uses: DeterminateSystems/magic-nix-cache-action@565684385bcd71bad329742eefe8d12f2e765b39 # v13
-44
View File
@@ -1,44 +0,0 @@
# Dependabot configuration for hermes-agent.
#
# Deliberately scoped to github-actions only.
#
# We do NOT enable Dependabot for pip / npm / any source-dependency ecosystem
# because we pin source dependencies exactly (uv.lock, package-lock.json) as
# part of our supply-chain posture. Automatic version-bump PRs against those
# pins would undermine the strategy — pins are moved deliberately, after
# review, not on a schedule.
#
# github-actions is the exception: action pins (we use full commit SHAs per
# supply-chain policy) must be updated when upstream actions publish
# patches — usually themselves security fixes. Dependabot opens a PR with
# the new SHA and release notes; we review and merge like any other PR.
#
# Security-update PRs for source dependencies (opened ONLY when a CVE is
# published affecting a currently-pinned version) are enabled separately
# via the repo's Dependabot security updates setting
# (Settings → Code security → Dependabot → Dependabot security updates).
# Those are CVE-only, not schedule-driven, and do not conflict with our
# pinning strategy — they fire when a pinned version becomes known-bad,
# which is exactly when we want to move the pin.
version: 2
updates:
- package-ecosystem: "github-actions"
directory: "/"
schedule:
interval: "weekly"
day: "monday"
open-pull-requests-limit: 5
labels:
- "dependencies"
- "github-actions"
commit-message:
prefix: "chore(actions)"
include: "scope"
groups:
# Batch routine action bumps into one PR per week to reduce noise.
# Security updates still open individually and bypass grouping.
actions-minor-patch:
update-types:
- "minor"
- "patch"
-10
View File
@@ -76,16 +76,6 @@ jobs:
run: |
mkdir -p _site/docs
cp -r website/build/* _site/docs/
# llms.txt / llms-full.txt are also published at the site root
# (https://hermes-agent.nousresearch.com/llms.txt) because some
# agents and IDE plugins probe the classic root-level path rather
# than /docs/llms.txt. Same file, two URLs, one source of truth.
if [ -f website/build/llms.txt ]; then
cp website/build/llms.txt _site/llms.txt
fi
if [ -f website/build/llms-full.txt ]; then
cp website/build/llms-full.txt _site/llms-full.txt
fi
- name: Upload artifact
uses: actions/upload-pages-artifact@56afc609e74202658d3ffba0e8f6dda462b719fa # v3
+68
View File
@@ -0,0 +1,68 @@
name: Nix Lockfile Check
on:
pull_request:
workflow_dispatch:
permissions:
contents: read
pull-requests: write
concurrency:
group: nix-lockfile-check-${{ github.ref }}
cancel-in-progress: true
jobs:
check:
runs-on: ubuntu-latest
timeout-minutes: 20
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: ./.github/actions/nix-setup
- name: Resolve head SHA
id: sha
shell: bash
run: |
FULL="${{ github.event.pull_request.head.sha || github.sha }}"
echo "full=$FULL" >> "$GITHUB_OUTPUT"
echo "short=${FULL:0:7}" >> "$GITHUB_OUTPUT"
- name: Check lockfile hashes
id: check
continue-on-error: true
env:
LINK_SHA: ${{ steps.sha.outputs.full }}
run: nix run .#fix-lockfiles -- --check
- name: Post sticky PR comment (stale)
if: steps.check.outputs.stale == 'true' && github.event_name == 'pull_request'
uses: marocchino/sticky-pull-request-comment@52423e01640425a022ef5fd42c6fb5f633a02728 # v2.9.1
with:
header: nix-lockfile-check
message: |
### ⚠️ npm lockfile hash out of date
Checked against commit [`${{ steps.sha.outputs.short }}`](${{ github.server_url }}/${{ github.repository }}/commit/${{ steps.sha.outputs.full }}) (PR head at check time).
The `hash = "sha256-..."` line in these nix files no longer matches the committed `package-lock.json`:
${{ steps.check.outputs.report }}
#### Apply the fix
- [ ] **Apply lockfile fix** — tick to push a commit with the correct hashes to this PR branch
- Or [run the Nix Lockfile Fix workflow](${{ github.server_url }}/${{ github.repository }}/actions/workflows/nix-lockfile-fix.yml) manually (pass PR `#${{ github.event.pull_request.number }}`)
- Or locally: `nix run .#fix-lockfiles -- --apply` and commit the diff
- name: Clear sticky PR comment (resolved)
if: steps.check.outputs.stale == 'false' && github.event_name == 'pull_request'
uses: marocchino/sticky-pull-request-comment@52423e01640425a022ef5fd42c6fb5f633a02728 # v2.9.1
with:
header: nix-lockfile-check
delete: true
- name: Fail if stale
if: steps.check.outputs.stale == 'true'
run: exit 1
+3 -108
View File
@@ -1,13 +1,6 @@
name: Nix Lockfile Fix
on:
push:
branches: [main]
paths:
- 'ui-tui/package-lock.json'
- 'ui-tui/package.json'
- 'web/package-lock.json'
- 'web/package.json'
workflow_dispatch:
inputs:
pr_number:
@@ -26,105 +19,9 @@ concurrency:
cancel-in-progress: false
jobs:
# ── Auto-fix on main ───────────────────────────────────────────────
# Fires when a push to main touches package.json or package-lock.json
# in ui-tui/ or web/. Runs fix-lockfiles and pushes the hash
# update commit directly to main so Nix builds never stay broken.
#
# Safety invariants:
# 1. The fix commit only touches nix/*.nix files, which are NOT in
# the paths filter above, so this cannot re-trigger itself.
# 2. An explicit file-whitelist check before commit aborts if
# fix-lockfiles ever modifies unexpected files.
# 3. Job-level concurrency with cancel-in-progress: true ensures
# back-to-back pushes collapse to the newest; ref: main checkout
# always operates on the latest branch state.
# 4. Uses a GitHub App token (not GITHUB_TOKEN) so the fix commit
# triggers downstream nix.yml verification.
auto-fix-main:
if: github.event_name == 'push'
runs-on: ubuntu-latest
timeout-minutes: 25
concurrency:
group: auto-fix-main
cancel-in-progress: true
steps:
- name: Generate GitHub App token
id: app-token
uses: actions/create-github-app-token@7bfa3a4717ef143a604ee0a99d859b8886a96d00 # v1.9.3
with:
app-id: ${{ secrets.APP_ID }}
private-key: ${{ secrets.APP_PRIVATE_KEY }}
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
ref: main
token: ${{ steps.app-token.outputs.token }}
- uses: ./.github/actions/nix-setup
with:
cachix-auth-token: ${{ secrets.CACHIX_AUTH_TOKEN }}
- name: Apply lockfile hashes
id: apply
run: nix run .#fix-lockfiles -- --apply
- name: Commit & push
if: steps.apply.outputs.changed == 'true'
shell: bash
run: |
set -euo pipefail
# Ensure only nix files were modified — prevents accidental
# self-triggering if fix-lockfiles ever touches package files.
unexpected="$(git diff --name-only | grep -Ev '^nix/(tui|web)\.nix$' || true)"
if [ -n "$unexpected" ]; then
echo "::error::Unexpected modified files: $unexpected"
exit 1
fi
# Record the base SHA before committing — used to detect package
# file changes if we need to rebase after a non-fast-forward push.
BASE_SHA="$(git rev-parse HEAD)"
git config user.name 'github-actions[bot]'
git config user.email '41898282+github-actions[bot]@users.noreply.github.com'
git add nix/tui.nix nix/web.nix
git commit -m "fix(nix): auto-refresh npm lockfile hashes" \
-m "Source: $GITHUB_SHA" \
-m "Run: $GITHUB_SERVER_URL/$GITHUB_REPOSITORY/actions/runs/$GITHUB_RUN_ID"
# Retry push with rebase in case main advanced with an unrelated
# commit during the nix build. Without this, a non-fast-forward
# rejection silently loses the fix. If package files changed during
# the rebase, abort — a fresh auto-fix run will handle the new state.
for attempt in 1 2 3; do
if git push origin HEAD:main; then
exit 0
fi
echo "::warning::Push attempt $attempt failed (non-fast-forward?), rebasing…"
git fetch origin main
# If package files changed between our base and the new main,
# our computed hashes are stale. Abort and let the next triggered
# run recompute from the correct package-lock state.
pkg_changed="$(git diff --name-only "$BASE_SHA"..origin/main -- \
'ui-tui/package-lock.json' 'ui-tui/package.json' \
'web/package-lock.json' 'web/package.json' || true)"
if [ -n "$pkg_changed" ]; then
echo "::warning::Package files changed since hash computation — aborting; a fresh run will recompute"
exit 0
fi
git rebase origin/main
done
echo "::error::Failed to push after 3 rebase attempts"
exit 1
# ── PR fix (manual / checkbox) ─────────────────────────────────────
# Existing behavior: run on manual dispatch OR when a task-list
# checkbox in the sticky lockfile-check comment flips from [ ] to [x].
fix:
# Run on manual dispatch OR when a task-list checkbox in the sticky
# lockfile-check comment flips from `[ ]` to `[x]`.
if: |
github.event_name == 'workflow_dispatch' ||
(github.event_name == 'issue_comment'
@@ -202,12 +99,10 @@ jobs:
fetch-depth: 0
- uses: ./.github/actions/nix-setup
with:
cachix-auth-token: ${{ secrets.CACHIX_AUTH_TOKEN }}
- name: Apply lockfile hashes
id: apply
run: nix run .#fix-lockfiles
run: nix run .#fix-lockfiles -- --apply
- name: Commit & push
if: steps.apply.outputs.changed == 'true'
-84
View File
@@ -7,7 +7,6 @@ on:
permissions:
contents: read
pull-requests: write
concurrency:
group: nix-${{ github.ref }}
@@ -23,95 +22,12 @@ jobs:
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: ./.github/actions/nix-setup
with:
cachix-auth-token: ${{ secrets.CACHIX_AUTH_TOKEN }}
- name: Resolve head SHA
if: github.event_name == 'pull_request'
id: sha
shell: bash
run: |
FULL="${{ github.event.pull_request.head.sha || github.sha }}"
echo "full=$FULL" >> "$GITHUB_OUTPUT"
echo "short=${FULL:0:7}" >> "$GITHUB_OUTPUT"
- name: Check flake
id: flake
if: runner.os == 'Linux'
continue-on-error: true
run: nix flake check --print-build-logs
- name: Build package
id: build
if: runner.os == 'Linux'
continue-on-error: true
run: nix build --print-build-logs
# When the real Nix build fails, run a targeted diagnostic to see if
# the failure is specifically a stale npm lockfile hash in one of the
# known npm subpackages (tui / web). This avoids surfacing a generic
# "build failed" message when the fix is a single known command.
- name: Diagnose npm lockfile hashes
id: hash_check
if: (steps.flake.outcome == 'failure' || steps.build.outcome == 'failure') && runner.os == 'Linux'
continue-on-error: true
env:
LINK_SHA: ${{ steps.sha.outputs.full }}
run: nix run .#fix-lockfiles -- --check
# If fix-lockfiles itself crashes (infrastructure blip, cache throttle,
# etc.) it won't set stale=true/false. Treat that as a distinct failure
# mode rather than silently ignoring it.
- name: Fail if hash check crashed without reporting
if: steps.hash_check.outcome == 'failure' && steps.hash_check.outputs.stale != 'true' && steps.hash_check.outputs.stale != 'false'
run: |
echo "::error::fix-lockfiles exited without reporting stale status — likely an infrastructure or script failure"
exit 1
- name: Post sticky PR comment (stale hashes)
if: steps.hash_check.outputs.stale == 'true' && github.event_name == 'pull_request'
uses: marocchino/sticky-pull-request-comment@52423e01640425a022ef5fd42c6fb5f633a02728 # v2.9.1
with:
header: nix-lockfile-check
message: |
### ⚠️ npm lockfile hash out of date
Checked against commit [`${{ steps.sha.outputs.short }}`](${{ github.server_url }}/${{ github.repository }}/commit/${{ steps.sha.outputs.full }}) (PR head at check time).
The `hash = "sha256-..."` line in these nix files no longer matches the committed `package-lock.json`:
${{ steps.hash_check.outputs.report }}
#### Apply the fix
- [ ] **Apply lockfile fix** — tick to push a commit with the correct hashes to this PR branch
- Or [run the Nix Lockfile Fix workflow](${{ github.server_url }}/${{ github.repository }}/actions/workflows/nix-lockfile-fix.yml) manually (pass PR `#${{ github.event.pull_request.number }}`)
- Or locally: `nix run .#fix-lockfiles` and commit the diff
# Clear the sticky comment when either the build passed outright (no
# hash check needed) or the hash check explicitly returned stale=false
# (build failed for a non-hash reason).
- name: Clear sticky PR comment (resolved)
if: |
github.event_name == 'pull_request' &&
runner.os == 'Linux' &&
(steps.hash_check.outputs.stale == 'false' ||
(steps.flake.outcome == 'success' && steps.build.outcome == 'success'))
uses: marocchino/sticky-pull-request-comment@52423e01640425a022ef5fd42c6fb5f633a02728 # v2.9.1
with:
header: nix-lockfile-check
delete: true
- name: Final fail if build or flake failed
if: steps.flake.outcome == 'failure' || steps.build.outcome == 'failure'
run: |
if [ "${{ steps.hash_check.outputs.stale }}" == "true" ]; then
echo "::error::Nix build failed due to stale npm lockfile hash. Run: nix run .#fix-lockfiles"
else
echo "::error::Nix build/flake check failed. See logs above."
fi
exit 1
- name: Evaluate flake (macOS)
if: runner.os == 'macOS'
run: nix flake show --json > /dev/null
-67
View File
@@ -1,67 +0,0 @@
name: OSV-Scanner
# Scans lockfiles (uv.lock, package-lock.json) against the OSV vulnerability
# database. Runs on every PR that touches a lockfile and on a weekly schedule
# against main.
#
# This is detection-only — OSV-Scanner does NOT open PRs or modify pins.
# It reports known CVEs in currently-pinned dependency versions so we can
# decide when and how to patch on our own schedule. Our pinning strategy
# (full SHA / exact version) is preserved; only the notification signal
# is added.
#
# Complements the existing supply-chain-audit.yml workflow (which scans
# for malicious code patterns in PR diffs) by covering the orthogonal
# "currently-pinned dep became known-vulnerable" case.
#
# Uses Google's officially-recommended reusable workflow, pinned by SHA.
# Findings land in the repo's Security tab (Code Scanning > OSV-Scanner).
# fail-on-vuln is disabled so the job does not block merges on pre-existing
# vulnerabilities in pinned deps that we may need to patch deliberately.
on:
pull_request:
branches: [main]
paths:
- 'uv.lock'
- 'pyproject.toml'
- 'package.json'
- 'package-lock.json'
- 'ui-tui/package.json'
- 'ui-tui/package-lock.json'
- 'website/package.json'
- 'website/package-lock.json'
- '.github/workflows/osv-scanner.yml'
push:
branches: [main]
paths:
- 'uv.lock'
- 'pyproject.toml'
- 'package.json'
- 'package-lock.json'
- 'ui-tui/package-lock.json'
- 'website/package-lock.json'
schedule:
# Weekly scan against main — catches CVEs published after merge for
# deps that haven't changed since.
- cron: '0 9 * * 1'
workflow_dispatch:
permissions:
# Required by the reusable workflow to upload SARIF to the Security tab.
actions: read
contents: read
security-events: write
jobs:
scan:
name: Scan lockfiles
uses: google/osv-scanner-action/.github/workflows/osv-scanner-reusable.yml@c51854704019a247608d928f370c98740469d4b5 # v2.3.5
with:
# Scan explicit lockfiles rather than recursing, so we only look at
# the three sources of truth and skip vendored / test / worktree dirs.
scan-args: |-
--lockfile=uv.lock
--lockfile=ui-tui/package-lock.json
--lockfile=website/package-lock.json
fail-on-vuln: false
-1
View File
@@ -69,4 +69,3 @@ mini-swe-agent/
.nix-stamps/
result
website/static/api/skills-index.json
models-dev-upstream/
+2 -11
View File
@@ -38,7 +38,7 @@ hermes-agent/
│ │ # homeassistant, signal, matrix, mattermost, email, sms,
│ │ # dingtalk, wecom, weixin, feishu, qqbot, bluebubbles,
│ │ # webhook, api_server, ...). See ADDING_A_PLATFORM.md.
│ └── builtin_hooks/ # Extension point for always-registered gateway hooks (none shipped)
│ └── builtin_hooks/ # Always-registered gateway hooks (boot-md, ...)
├── plugins/ # Plugin system (see "Plugins" section below)
│ ├── memory/ # Memory-provider plugins (honcho, mem0, supermemory, ...)
│ ├── context_engine/ # Context-engine plugins
@@ -257,16 +257,7 @@ The dashboard embeds the real `hermes --tui` — **not** a rewrite. See `hermes
## Adding New Tools
For most custom or local-only tools, do **not** edit Hermes core. Use the plugin
route instead: create `~/.hermes/plugins/<name>/plugin.yaml` and
`~/.hermes/plugins/<name>/__init__.py`, then register tools with
`ctx.register_tool(...)`. Plugin toolsets are discovered automatically and can be
enabled or disabled without touching `tools/` or `toolsets.py`.
Use the built-in route below only when the user is explicitly contributing a new
core Hermes tool that should ship in the base system.
Built-in/core tools require changes in **2 files**:
Requires changes in **2 files**:
**1. Create `tools/your_tool.py`:**
```python
+1 -1
View File
@@ -494,7 +494,7 @@ branding:
agent_name: "My Agent"
welcome: "Welcome message"
response_label: " ⚔ Agent "
prompt_symbol: "⚔"
prompt_symbol: "⚔ "
tool_prefix: "╎" # Tool output line prefix
```
+3 -23
View File
@@ -14,7 +14,7 @@ ENV PLAYWRIGHT_BROWSERS_PATH=/opt/hermes/.playwright
# that would otherwise accumulate when hermes runs as PID 1. See #15012.
RUN apt-get update && \
apt-get install -y --no-install-recommends \
build-essential curl nodejs npm python3 ripgrep ffmpeg gcc python3-dev libffi-dev procps git openssh-client docker-cli tini && \
build-essential nodejs npm python3 ripgrep ffmpeg gcc python3-dev libffi-dev procps git openssh-client docker-cli tini && \
rm -rf /var/lib/apt/lists/*
# Non-root user for runtime; UID can be overridden via HERMES_UID at runtime
@@ -28,40 +28,20 @@ WORKDIR /opt/hermes
# ---------- Layer-cached dependency install ----------
# Copy only package manifests first so npm install + Playwright are cached
# unless the lockfiles themselves change.
#
# ui-tui/packages/hermes-ink/ is copied IN FULL (not just its manifests)
# because it is referenced as a `file:` workspace dependency from
# ui-tui/package.json. Copying the tree up front lets npm resolve the
# workspace to real content instead of stopping at a bare package.json.
COPY package.json package-lock.json ./
COPY web/package.json web/package-lock.json web/
COPY ui-tui/package.json ui-tui/package-lock.json ui-tui/
COPY ui-tui/packages/hermes-ink/ ui-tui/packages/hermes-ink/
# `npm_config_install_links=false` forces npm to install `file:` deps as
# symlinks (the npm 10+ default) even on Debian's older bundled npm 9.x,
# which defaults to `install-links=true` and installs file deps as *copies*.
# The host-side package-lock.json is generated with a newer npm that uses
# symlinks, so an install-as-copy produces a hidden node_modules/.package-lock.json
# that permanently disagrees with the root lock on the @hermes/ink entry.
# That disagreement trips the TUI launcher's `_tui_need_npm_install()`
# check on every startup and triggers a runtime `npm install` that then
# fails with EACCES (node_modules/ is root-owned from build time).
ENV npm_config_install_links=false
RUN npm install --prefer-offline --no-audit && \
npx playwright install --with-deps chromium --only-shell && \
(cd web && npm install --prefer-offline --no-audit) && \
(cd ui-tui && npm install --prefer-offline --no-audit) && \
npm cache clean --force
# ---------- Source code ----------
# .dockerignore excludes node_modules, so the installs above survive.
COPY --chown=hermes:hermes . .
# Build browser dashboard and terminal UI assets.
RUN cd web && npm run build && \
cd ../ui-tui && npm run build
# Build web dashboard (Vite outputs to hermes_cli/web_dist/)
RUN cd web && npm run build
# ---------- Permissions ----------
# Make install dir world-readable so any HERMES_UID can read it at runtime.
-505
View File
@@ -1,505 +0,0 @@
# Hermes Agent v0.12.0 (v2026.4.30)
**Release Date:** April 30, 2026
**Since v0.11.0:** 1,096 commits · 550 merged PRs · 1,270 files changed · 217,776 insertions · 213 community contributors (including co-authors)
> The Curator release — Hermes Agent now maintains itself. An autonomous background Curator grades, prunes, and consolidates your skill library on its own schedule. The self-improvement loop that reviews what to save got a substantial upgrade. Four new inference providers, a 18th messaging platform, a 19th via Teams plugin, native Spotify + Google Meet integrations, ComfyUI and TouchDesigner-MCP moved from optional to bundled-by-default, and a ~57% cut to visible TUI cold start.
---
## ✨ Highlights
- **Autonomous Curator** — `hermes curator` runs as a background agent on the gateway's cron ticker (7-day cycle default). It grades your skill library, consolidates related skills, prunes dead ones, and writes per-run reports to `logs/curator/run.json` + `REPORT.md`. Archived skills are classified consolidated-vs-pruned via model + heuristic. Defense-in-depth gates protect bundled/hub skills from mutation. Unified under `auxiliary.curator` — pick the curator's model in `hermes model`, manage it from the dashboard. `hermes curator status` ranks skills by usage (most-used / least-used). ([#17277](https://github.com/NousResearch/hermes-agent/pull/17277), [#17307](https://github.com/NousResearch/hermes-agent/pull/17307), [#17941](https://github.com/NousResearch/hermes-agent/pull/17941), [#17868](https://github.com/NousResearch/hermes-agent/pull/17868), [#18033](https://github.com/NousResearch/hermes-agent/pull/18033))
- **Self-improvement loop — substantially upgraded** — The background review fork (the core of Hermes' self-improvement: after each turn it decides what memories/skills to save or update) is now class-first (rubric-based rather than free-form), active-update biased (prefers the skill the agent just loaded), handles `references/`/`templates/` sub-files, and properly inherits the parent's live runtime (provider, model, credentials actually propagate). Restricted to memory + skills toolsets so it can't sprawl. Memory providers shut down cleanly. Prior-turn tool messages excluded from the summary so the fork sees a clean context. ([#16026](https://github.com/NousResearch/hermes-agent/pull/16026), [#17213](https://github.com/NousResearch/hermes-agent/pull/17213), [#16099](https://github.com/NousResearch/hermes-agent/pull/16099), [#16569](https://github.com/NousResearch/hermes-agent/pull/16569), [#16204](https://github.com/NousResearch/hermes-agent/pull/16204), [#15057](https://github.com/NousResearch/hermes-agent/pull/15057))
- **Skill integrations — major expansion** — **ComfyUI v5** with official CLI + REST + hardware-gated local install, moved from optional to **built-in by default** ([#17610](https://github.com/NousResearch/hermes-agent/pull/17610), [#17631](https://github.com/NousResearch/hermes-agent/pull/17631), [#17734](https://github.com/NousResearch/hermes-agent/pull/17734)). **TouchDesigner-MCP** bundled by default, expanded with GLSL, post-FX, audio, geometry, and 9 new reference docs ([#16753](https://github.com/NousResearch/hermes-agent/pull/16753), [#16624](https://github.com/NousResearch/hermes-agent/pull/16624), [#16768](https://github.com/NousResearch/hermes-agent/pull/16768) — @kshitijk4poor + @SHL0MS). **Humanizer** skill ports a text-cleaner that strips AI-isms ([#16787](https://github.com/NousResearch/hermes-agent/pull/16787)). **claude-design** HTML artifact skill + design-md (Google DESIGN.md spec) + airtable salvage + `skill_manage` edits in `external_dirs` + direct-URL skill install + `/reload-skills` slash command. ([#16358](https://github.com/NousResearch/hermes-agent/pull/16358), [#14876](https://github.com/NousResearch/hermes-agent/pull/14876), [#16291](https://github.com/NousResearch/hermes-agent/pull/16291), [#17512](https://github.com/NousResearch/hermes-agent/pull/17512), [#16323](https://github.com/NousResearch/hermes-agent/pull/16323), [#17744](https://github.com/NousResearch/hermes-agent/pull/17744))
- **LM Studio — first-class provider** — upgraded from a custom-endpoint alias to a full-blown native provider: dedicated auth, `hermes doctor` checks, reasoning transport, live `/models` listing. (Salvage of @kshitijk4poor's #17061.) ([#17102](https://github.com/NousResearch/hermes-agent/pull/17102))
- **Four more new inference providers** — **GMI Cloud** (first-class, salvage of #11955@isaachuangGMICLOUD), **Azure AI Foundry** with auto-detection, **MiniMax OAuth** with PKCE browser flow (salvage #15203), **Tencent Tokenhub** (salvage of #16860). ([#16663](https://github.com/NousResearch/hermes-agent/pull/16663), [#15845](https://github.com/NousResearch/hermes-agent/pull/15845), [#17524](https://github.com/NousResearch/hermes-agent/pull/17524), [#16960](https://github.com/NousResearch/hermes-agent/pull/16960))
- **Pluggable gateway platforms + Microsoft Teams** — the gateway is now a plugin host. Drop-in messaging adapters live outside the core, and Microsoft Teams is the first plugin-shipped platform. (Salvage of #17664.) ([#17751](https://github.com/NousResearch/hermes-agent/pull/17751), [#17828](https://github.com/NousResearch/hermes-agent/pull/17828))
- **Tencent 元宝 (Yuanbao) — 18th messaging platform** — native gateway adapter with text + media delivery. ([#16298](https://github.com/NousResearch/hermes-agent/pull/16298), [#17424](https://github.com/NousResearch/hermes-agent/pull/17424))
- **Spotify — native tools + bundled skill + wizard** — 7 tools (play, search, queue, playlists, devices) behind PKCE OAuth, interactive setup wizard, bundled skill, surfacing in `hermes tools`, cron usage documented. ([#15121](https://github.com/NousResearch/hermes-agent/pull/15121), [#15130](https://github.com/NousResearch/hermes-agent/pull/15130), [#15154](https://github.com/NousResearch/hermes-agent/pull/15154), [#15180](https://github.com/NousResearch/hermes-agent/pull/15180))
- **Google Meet plugin** — join calls, transcribe, speak, follow up. Realtime OpenAI transport + Node bot server, full pipeline bundled as a plugin. ([#16364](https://github.com/NousResearch/hermes-agent/pull/16364))
- **`hermes -z` one-shot mode + `hermes update --check`** — non-interactive `hermes -z <prompt>` with `--model`/`--provider`/`HERMES_INFERENCE_MODEL`. `hermes update --check` preflight. Opt-in pre-update HERMES_HOME backup. ([#15702](https://github.com/NousResearch/hermes-agent/pull/15702), [#15704](https://github.com/NousResearch/hermes-agent/pull/15704), [#15841](https://github.com/NousResearch/hermes-agent/pull/15841), [#16539](https://github.com/NousResearch/hermes-agent/pull/16539), [#16566](https://github.com/NousResearch/hermes-agent/pull/16566))
- **Models dashboard tab + in-browser model config** — rich per-model analytics, switch main + auxiliary models from the dashboard. ([#17745](https://github.com/NousResearch/hermes-agent/pull/17745), [#17802](https://github.com/NousResearch/hermes-agent/pull/17802))
- **Remote model catalog manifest** — OpenRouter + Nous Portal model catalogs are now pulled from a remote manifest so new models show up without a release. ([#16033](https://github.com/NousResearch/hermes-agent/pull/16033))
- **Native multimodal image routing** — images now route based on the model's actual vision capability rather than provider defaults. ([#16506](https://github.com/NousResearch/hermes-agent/pull/16506))
- **Gateway media parity** — native multi-image sending across Telegram, Discord, Slack, Mattermost, Email, and Signal; centralized audio routing with FLAC support + Telegram document fallback. ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909), [#17833](https://github.com/NousResearch/hermes-agent/pull/17833))
- **TUI catches up to (and past) the classic CLI** — LaTeX rendering (@austinpickett), `/reload` .env hot-reload, pluggable busy-indicator styles (@OutThisLife, #13610), opt-in auto-resume of last session, expanded light-terminal auto-detection, session delete from `/resume` picker with `d`, modified mouse-wheel line scroll, and a `/mouse` toggle that kills ConPTY's phantom mouse injection (@kevin-ho). ([#17175](https://github.com/NousResearch/hermes-agent/pull/17175), [#17286](https://github.com/NousResearch/hermes-agent/pull/17286), [#17150](https://github.com/NousResearch/hermes-agent/pull/17150), [#17130](https://github.com/NousResearch/hermes-agent/pull/17130), [#17113](https://github.com/NousResearch/hermes-agent/pull/17113), [#17668](https://github.com/NousResearch/hermes-agent/pull/17668), [#17669](https://github.com/NousResearch/hermes-agent/pull/17669), [#15488](https://github.com/NousResearch/hermes-agent/pull/15488))
- **Observability + achievements plugins** — bundled Langfuse observability plugin (salvage #16845) + bundled hermes-achievements plugin that scans full session history. ([#16917](https://github.com/NousResearch/hermes-agent/pull/16917), [#17754](https://github.com/NousResearch/hermes-agent/pull/17754))
- **TTS provider registry + Piper local TTS** — pluggable `tts.providers.<name>` registry; Piper ships as a native local TTS provider. (Closes #8508.) ([#17843](https://github.com/NousResearch/hermes-agent/pull/17843), [#17885](https://github.com/NousResearch/hermes-agent/pull/17885))
- **Vercel Sandbox backend** — Vercel sandboxes as an execute_code/terminal backend (@kshitijk4poor). ([#17445](https://github.com/NousResearch/hermes-agent/pull/17445))
- **Secret redaction off by default** — default flipped to off. Prevents the long-standing patch-corruption incidents where fake secret-shaped substrings mangled tool outputs. Opt in via `redaction.enabled: true` when you need it. ([#16794](https://github.com/NousResearch/hermes-agent/pull/16794))
- **Cold-start performance** — visible TUI cold start cut **~57%** via lazy agent init (@OutThisLife), lazy imports of OpenAI / Anthropic / Firecrawl / account_usage, mtime-cached `load_config()`, memoized `get_tool_definitions()` with TTL-cached `check_fn` results, precompiled dangerous-command patterns. ([#17190](https://github.com/NousResearch/hermes-agent/pull/17190), [#17046](https://github.com/NousResearch/hermes-agent/pull/17046), [#17041](https://github.com/NousResearch/hermes-agent/pull/17041), [#17098](https://github.com/NousResearch/hermes-agent/pull/17098), [#17206](https://github.com/NousResearch/hermes-agent/pull/17206))
- **Configurable prompt cache TTL** — `prompt_caching.cache_ttl` (5m default, 1h opt-in — cost savings for bursty sessions that keep cache warm). Salvage of #12659. ([#15065](https://github.com/NousResearch/hermes-agent/pull/15065))
---
## 🧠 Autonomous Curator & Self-Improvement Loop
### Curator — autonomous skill maintenance
- **`hermes curator` as a background agent** — runs on the gateway's cron ticker, 7-day cycle by default, umbrella-first prompt, inherits parent config, unbounded iterations ([#17277](https://github.com/NousResearch/hermes-agent/pull/17277) — issue #7816)
- **Per-run reports** — `logs/curator/run.json` + `REPORT.md` per cycle ([#17307](https://github.com/NousResearch/hermes-agent/pull/17307))
- **Consolidated vs pruned classification** — archived skills split with model + heuristic ([#17941](https://github.com/NousResearch/hermes-agent/pull/17941))
- **`hermes curator status`** — ranks skills by usage, shows most-used and least-used ([#18033](https://github.com/NousResearch/hermes-agent/pull/18033))
- **Unified under `auxiliary.curator`** — pick the model in `hermes model`, configure from the dashboard ([#17868](https://github.com/NousResearch/hermes-agent/pull/17868))
- **Documentation** — dedicated curator feature page on the docs site ([#17563](https://github.com/NousResearch/hermes-agent/pull/17563))
- Fix: seed defaults on update, create `logs/curator/` directory, defer fire import ([#17927](https://github.com/NousResearch/hermes-agent/pull/17927))
- Fix: scan nested archive subdirs in `restore_skill` (@0xDevNinja) ([#17951](https://github.com/NousResearch/hermes-agent/pull/17951))
- Fix: use actual skill activity in curator status (@y0shua1ee) ([#17953](https://github.com/NousResearch/hermes-agent/pull/17953))
- Fix: `skill_manage` refuses writes on pinned skills; pinning now blocks curator writes ([#17562](https://github.com/NousResearch/hermes-agent/pull/17562), [#17578](https://github.com/NousResearch/hermes-agent/pull/17578))
- Fix: `bump_use()` wired into skill invocation + preload + skill_view (salvage #17782) ([#17932](https://github.com/NousResearch/hermes-agent/pull/17932))
### Self-improvement loop (background review fork)
- **Class-first skill-review prompt** — rubric-based grading rather than free-form "should this update" ([#16026](https://github.com/NousResearch/hermes-agent/pull/16026))
- **Active-update bias** — prefers updating skills the agent just loaded, handles `references/` + `templates/` sub-files ([#17213](https://github.com/NousResearch/hermes-agent/pull/17213))
- **Fork inherits parent's live runtime** — provider, model, credentials actually propagate now ([#16099](https://github.com/NousResearch/hermes-agent/pull/16099))
- **Scoped toolsets** — review fork restricted to memory + skills (no shell, no web) ([#16569](https://github.com/NousResearch/hermes-agent/pull/16569))
- **Clean shutdown** — background review memory providers exit properly (salvage #15289) ([#16204](https://github.com/NousResearch/hermes-agent/pull/16204))
- **Clean context** — prior-history tool messages excluded from review summary (salvage #14967) ([#15057](https://github.com/NousResearch/hermes-agent/pull/15057))
---
## 🧩 Skills Ecosystem
### Skill integrations — newly bundled or promoted
- **ComfyUI v5** — official CLI + REST + hardware-gated local install; **moved from optional to built-in** ([#17610](https://github.com/NousResearch/hermes-agent/pull/17610), [#17631](https://github.com/NousResearch/hermes-agent/pull/17631), [#17734](https://github.com/NousResearch/hermes-agent/pull/17734), [#17612](https://github.com/NousResearch/hermes-agent/pull/17612))
- **TouchDesigner-MCP** — **bundled by default** ([#16753](https://github.com/NousResearch/hermes-agent/pull/16753) — @kshitijk4poor), expanded with GLSL, post-FX, audio, geometry references ([#16624](https://github.com/NousResearch/hermes-agent/pull/16624)), 9 new reference docs ([#16768](https://github.com/NousResearch/hermes-agent/pull/16768) — @SHL0MS)
- **Humanizer** — strips AI-isms from text ([#16787](https://github.com/NousResearch/hermes-agent/pull/16787))
- **claude-design** — HTML artifact skill with disambiguation from other design skills ([#16358](https://github.com/NousResearch/hermes-agent/pull/16358))
- **design-md** — Google's DESIGN.md spec skill ([#14876](https://github.com/NousResearch/hermes-agent/pull/14876))
- **airtable** — salvaged skill + skill API keys wired into `.env` (#15838) ([#16291](https://github.com/NousResearch/hermes-agent/pull/16291))
- **pretext** — creative browser demos with @chenglou/pretext ([#17259](https://github.com/NousResearch/hermes-agent/pull/17259))
- **spike** + **sketch** — throwaway experiments + HTML mockups, adapted from gsd-build ([#17421](https://github.com/NousResearch/hermes-agent/pull/17421))
### Skills UX
- **Install skills from a direct HTTP(S) URL** — `hermes skills install <url>` ([#16323](https://github.com/NousResearch/hermes-agent/pull/16323))
- **`/reload-skills`** slash command (salvage #17670) ([#17744](https://github.com/NousResearch/hermes-agent/pull/17744))
- **`hermes skills list`** shows enabled/disabled status ([#16129](https://github.com/NousResearch/hermes-agent/pull/16129))
- **`skill_manage` refuses writes on pinned skills** ([#17562](https://github.com/NousResearch/hermes-agent/pull/17562))
- **`skill_manage` edits external_dirs skills in place** (salvage #9966) ([#17512](https://github.com/NousResearch/hermes-agent/pull/17512), [#17289](https://github.com/NousResearch/hermes-agent/pull/17289))
- Fix: inline-shell rendering in `skill_view` ([#15376](https://github.com/NousResearch/hermes-agent/pull/15376))
- Fix: exclude `.archive/` from skill index walk (salvage #17639) ([#17931](https://github.com/NousResearch/hermes-agent/pull/17931))
- Fix: dedicated docs page per bundled + optional skill ([#14929](https://github.com/NousResearch/hermes-agent/pull/14929))
- Fix: `google-workspace` shared HERMES_HOME helper + ship deps as optional extra ([#15405](https://github.com/NousResearch/hermes-agent/pull/15405))
- Fix: auto-wrap ASCII-art code blocks in generated skill pages ([#16497](https://github.com/NousResearch/hermes-agent/pull/16497))
- Point agent at `hermes-agent` skill + docs site for Hermes questions ([#16535](https://github.com/NousResearch/hermes-agent/pull/16535))
---
## 🏗️ Core Agent & Architecture
### Provider & Model Support
#### New providers
- **GMI Cloud** — first-class API-key provider on par with Arcee/Kilocode/Xiaomi (salvage of #11955@isaachuangGMICLOUD) ([#16663](https://github.com/NousResearch/hermes-agent/pull/16663))
- **Azure AI Foundry** — auto-detection, full wiring ([#15845](https://github.com/NousResearch/hermes-agent/pull/15845))
- **LM Studio** — upgraded from custom-endpoint alias to first-class provider: dedicated auth, doctor checks, reasoning transport, live `/models` (salvage of #17061@kshitijk4poor) ([#17102](https://github.com/NousResearch/hermes-agent/pull/17102))
- **MiniMax OAuth** — PKCE browser flow with full OAuth integration (salvage #15203) ([#17524](https://github.com/NousResearch/hermes-agent/pull/17524))
- **Tencent Tokenhub** — new provider (salvage of #16860) ([#16960](https://github.com/NousResearch/hermes-agent/pull/16960))
#### Model catalog
- **Remote model catalog manifest** — OpenRouter + Nous Portal catalogs pulled from remote manifest so new models show up without a release ([#16033](https://github.com/NousResearch/hermes-agent/pull/16033))
- `openai/gpt-5.5` and `gpt-5.5-pro` added to OpenRouter + Nous Portal ([#15343](https://github.com/NousResearch/hermes-agent/pull/15343))
- `deepseek-v4-pro` and `deepseek-v4-flash` added ([#14934](https://github.com/NousResearch/hermes-agent/pull/14934))
- `qwen3.6-plus` added to Alibaba-supported models ([#16896](https://github.com/NousResearch/hermes-agent/pull/16896))
- Gemini free-tier keys blocked at setup with 429 guidance surfacing ([#15100](https://github.com/NousResearch/hermes-agent/pull/15100))
#### Model configuration
- **Configurable `prompt_caching.cache_ttl`** — 5m default, 1h opt-in (salvage #12659) ([#15065](https://github.com/NousResearch/hermes-agent/pull/15065))
- `/fast` whitelist broadened to all OpenAI + Anthropic models ([#16883](https://github.com/NousResearch/hermes-agent/pull/16883))
- `auxiliary.extra_body.reasoning` translates into Codex Responses API ([#17004](https://github.com/NousResearch/hermes-agent/pull/17004))
- `hermes fallback` command for managing fallback providers ([#16052](https://github.com/NousResearch/hermes-agent/pull/16052))
### Agent Loop & Conversation
- **Native multimodal image routing** — based on model vision capability, not provider defaults ([#16506](https://github.com/NousResearch/hermes-agent/pull/16506))
- **Delegate `child_timeout_seconds` default bumped to 600s** ([#14809](https://github.com/NousResearch/hermes-agent/pull/14809))
- **Diagnostic dump when subagent times out with 0 API calls** ([#15105](https://github.com/NousResearch/hermes-agent/pull/15105))
- **Gateway busts cached agent on compression/context_length config edits** ([#17008](https://github.com/NousResearch/hermes-agent/pull/17008))
- **Opt-in runtime-metadata footer on final replies** ([#17026](https://github.com/NousResearch/hermes-agent/pull/17026))
- `/reload-mcp` awareness — rebuild cached agents + prompt-cache cost confirmation ([#17729](https://github.com/NousResearch/hermes-agent/pull/17729))
- Fix: repair CamelCase + `_tool` suffix tool-call emissions ([#15124](https://github.com/NousResearch/hermes-agent/pull/15124))
- Fix: retry on `json.JSONDecodeError` instead of treating as local validation error ([#15107](https://github.com/NousResearch/hermes-agent/pull/15107))
- Fix: handle unescaped control chars in `tool_call.arguments` ([#15356](https://github.com/NousResearch/hermes-agent/pull/15356))
- Fix: ordering fix in `_copy_reasoning_content_for_api` — cross-provider reasoning isolation (@Zjianru) ([#15749](https://github.com/NousResearch/hermes-agent/pull/15749))
- Fix: inject empty `reasoning_content` for DeepSeek/Kimi `tool_calls` unconditionally (@Zjianru) ([#15762](https://github.com/NousResearch/hermes-agent/pull/15762))
- Fix: persist streamed `reasoning_content` on assistant turns (#16844) ([#16892](https://github.com/NousResearch/hermes-agent/pull/16892))
- Fix: cancel coroutine on timeout so worker thread exits; full traceback on tool failure ([#17428](https://github.com/NousResearch/hermes-agent/pull/17428))
- Fix: isolate `get_tool_definitions` quiet_mode cache + dedup LCM injection (#17335) ([#17889](https://github.com/NousResearch/hermes-agent/pull/17889))
- Fix: serialize concurrent `hermes_tools` RPC calls from `execute_code` (#17770) ([#17894](https://github.com/NousResearch/hermes-agent/pull/17894), [#17902](https://github.com/NousResearch/hermes-agent/pull/17902))
- Fix: rename `[SYSTEM:``[IMPORTANT:` in all user-injected markers (dodges Azure content filter) ([#16114](https://github.com/NousResearch/hermes-agent/pull/16114))
### Compression
- **Retry summary on main model for unknown errors before giving up** ([#16774](https://github.com/NousResearch/hermes-agent/pull/16774))
- **Notify users when configured aux model fails even if main-model fallback recovers** ([#16775](https://github.com/NousResearch/hermes-agent/pull/16775))
- `/compress` wrapped in `_busy_command` to block input during compression ([#15388](https://github.com/NousResearch/hermes-agent/pull/15388))
- Fix: reserve system + tools headroom when aux binds threshold ([#15631](https://github.com/NousResearch/hermes-agent/pull/15631))
- Fix: use text-char sum for multimodal token estimation in `_find_tail_cut_by_tokens` ([#16369](https://github.com/NousResearch/hermes-agent/pull/16369))
### Session, Memory & State
- **Trigram FTS5 index for CJK search, replace LIKE fallback** (@alt-glitch) ([#16651](https://github.com/NousResearch/hermes-agent/pull/16651))
- **Index `tool_name` + `tool_calls` in FTS5, with repair + migration** (salvages #16866) ([#16914](https://github.com/NousResearch/hermes-agent/pull/16914))
- **Checkpoints: auto-prune orphan and stale shadow repos at startup** ([#16303](https://github.com/NousResearch/hermes-agent/pull/16303))
- **Memory providers notified on mid-process session_id rotation** (#6672) ([#17409](https://github.com/NousResearch/hermes-agent/pull/17409))
- Fix: quote underscored terms in FTS5 query sanitization ([#16915](https://github.com/NousResearch/hermes-agent/pull/16915))
- Fix: resolve viking_read 500/412 on file URIs + pseudo-summary URIs (salvage #5886) ([#17869](https://github.com/NousResearch/hermes-agent/pull/17869))
- Fix: skip external-provider sync on interrupted turns ([#15395](https://github.com/NousResearch/hermes-agent/pull/15395))
- Fix: close embedded Hindsight async client cleanly (salvage #14605) ([#16209](https://github.com/NousResearch/hermes-agent/pull/16209))
- Fix: pass session transcript to `shutdown_memory_provider` on gateway + CLI (#15165) ([#16571](https://github.com/NousResearch/hermes-agent/pull/16571))
- Fix: write-origin metadata seam ([#15346](https://github.com/NousResearch/hermes-agent/pull/15346))
- Fix: preserve symlinks during atomic file writes ([#16980](https://github.com/NousResearch/hermes-agent/pull/16980))
- Refactor: remove `flush_memories` entirely ([#15696](https://github.com/NousResearch/hermes-agent/pull/15696))
### Auxiliary models
- Fix: surface auxiliary failures in UI (previously silent) ([#15324](https://github.com/NousResearch/hermes-agent/pull/15324))
- Fix: surface title-gen auxiliary failures instead of silently dropping ([#16371](https://github.com/NousResearch/hermes-agent/pull/16371))
- Fix: generalize unsupported-parameter detector and harden `max_tokens` retry ([#15633](https://github.com/NousResearch/hermes-agent/pull/15633))
---
## 📱 Messaging Platforms (Gateway)
### New Platforms
- **Microsoft Teams (19th platform)** — as a plugin, + xdist collision guard ([#17828](https://github.com/NousResearch/hermes-agent/pull/17828))
- **Yuanbao (Tencent 元宝, 18th platform)** — native adapter with text + media delivery ([#16298](https://github.com/NousResearch/hermes-agent/pull/16298), [#17424](https://github.com/NousResearch/hermes-agent/pull/17424), [#16880](https://github.com/NousResearch/hermes-agent/pull/16880))
### Pluggable Gateway Platforms
- **Drop-in messaging adapters** — the gateway is now a plugin host for platforms (salvage of #17664) ([#17751](https://github.com/NousResearch/hermes-agent/pull/17751))
### Telegram
- **Chat allowlists for groups and forums** (@web3blind) ([#15027](https://github.com/NousResearch/hermes-agent/pull/15027))
- **Send fresh finals for stale preview streams** (port openclaw#72038) ([#16261](https://github.com/NousResearch/hermes-agent/pull/16261))
- **Render markdown tables as row-group bullets + prompt hint** ([#16997](https://github.com/NousResearch/hermes-agent/pull/16997))
- Document fallback in centralized audio routing ([#17833](https://github.com/NousResearch/hermes-agent/pull/17833))
- Native multi-image sending ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909))
### Discord
- **Opt-in toolsets + ID injection + tool split + Feishu wiring** (salvage #15457, #15458) ([#15610](https://github.com/NousResearch/hermes-agent/pull/15610), [#15613](https://github.com/NousResearch/hermes-agent/pull/15613))
- Fix: coerce `limit` parameter to int before `min()` call ([#16319](https://github.com/NousResearch/hermes-agent/pull/16319))
### Slack
- **Register every gateway command as a native slash (Discord/Telegram parity)** ([#16164](https://github.com/NousResearch/hermes-agent/pull/16164))
- **`strict_mention` config** — prevents thread auto-engagement ([#16193](https://github.com/NousResearch/hermes-agent/pull/16193))
- **`channel_skill_bindings`** — bind specific skills to specific Slack channels ([#16283](https://github.com/NousResearch/hermes-agent/pull/16283))
### Signal
- **Native formatting** — markdown → bodyRanges, reply quotes, reactions ([#17417](https://github.com/NousResearch/hermes-agent/pull/17417))
- Native multi-image sending ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909))
### Feishu / Mattermost / Email / Signal
- All participate in **native multi-image sending** ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909))
### Gateway Core
- **Centralized audio routing + FLAC support + Telegram doc fallback** ([#17833](https://github.com/NousResearch/hermes-agent/pull/17833))
- **Native multi-image sending** across Telegram, Discord, Slack, Mattermost, Email, Signal ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909))
- **Make hygiene hard message limit configurable** ([#17000](https://github.com/NousResearch/hermes-agent/pull/17000))
- **Opt-in runtime-metadata footer on final replies** ([#17026](https://github.com/NousResearch/hermes-agent/pull/17026))
- **`pre_gateway_dispatch` hook** — plugins can intercept before dispatch ([#15050](https://github.com/NousResearch/hermes-agent/pull/15050))
- **`pre_approval_request` / `post_approval_response` hooks** ([#16776](https://github.com/NousResearch/hermes-agent/pull/16776))
- Fix: timeouts — guard `load_config()` call against runtime exceptions ([#16318](https://github.com/NousResearch/hermes-agent/pull/16318))
- Fix: support passing handler tools via registry ([#15613](https://github.com/NousResearch/hermes-agent/pull/15613))
---
## 🔧 Tool System
### Plugin-first architecture
- **Pluggable gateway platforms** — platforms can ship as plugins ([#17751](https://github.com/NousResearch/hermes-agent/pull/17751))
- **Microsoft Teams as first plugin-shipped platform** ([#17828](https://github.com/NousResearch/hermes-agent/pull/17828))
- **`pre_gateway_dispatch` hook** ([#15050](https://github.com/NousResearch/hermes-agent/pull/15050))
- **`pre_approval_request` + `post_approval_response` hooks** ([#16776](https://github.com/NousResearch/hermes-agent/pull/16776))
- **`duration_ms` on `post_tool_call`** (inspired by Claude Code 2.1.119) ([#15429](https://github.com/NousResearch/hermes-agent/pull/15429))
- **Bundled plugins**: Spotify ([#15174](https://github.com/NousResearch/hermes-agent/pull/15174)), Google Meet ([#16364](https://github.com/NousResearch/hermes-agent/pull/16364)), Langfuse observability ([#16917](https://github.com/NousResearch/hermes-agent/pull/16917)), hermes-achievements ([#17754](https://github.com/NousResearch/hermes-agent/pull/17754))
- **Page-scoped plugin slots for built-in dashboard pages** ([#15658](https://github.com/NousResearch/hermes-agent/pull/15658))
- **Declarative plugin installation for NixOS module** (@alt-glitch) ([#15953](https://github.com/NousResearch/hermes-agent/pull/15953))
### Browser
- **CDP supervisor** — dialog detection + response + cross-origin iframe eval ([#14540](https://github.com/NousResearch/hermes-agent/pull/14540))
- **Auto-spawn local Chromium for LAN/localhost URLs** when cloud provider is configured ([#16136](https://github.com/NousResearch/hermes-agent/pull/16136))
### Execute code / Terminal
- **Vercel Sandbox backend** for `execute_code` / terminal (@kshitijk4poor) ([#17445](https://github.com/NousResearch/hermes-agent/pull/17445))
- **Collapse subagent `task_id`s to shared container** ([#16177](https://github.com/NousResearch/hermes-agent/pull/16177))
- **Docker: run container as host user** to avoid root-owned bind mounts (@benbarclay) ([#17305](https://github.com/NousResearch/hermes-agent/pull/17305))
- Fix: safely quote `~/` subpaths in wrapped `cd` commands ([#15394](https://github.com/NousResearch/hermes-agent/pull/15394))
- Fix: close file descriptor in `LocalEnvironment._update_cwd` ([#17300](https://github.com/NousResearch/hermes-agent/pull/17300))
- Fix: SSH — prevent tar from overwriting remote home dir permissions ([#17898](https://github.com/NousResearch/hermes-agent/pull/17898), [#17867](https://github.com/NousResearch/hermes-agent/pull/17867))
### Image generation
- See Provider section for updates; no new image providers this window.
### TTS / Voice
- **Pluggable TTS provider registry** under `tts.providers.<name>` ([#17843](https://github.com/NousResearch/hermes-agent/pull/17843))
- **Piper** as native local TTS provider (closes #8508) ([#17885](https://github.com/NousResearch/hermes-agent/pull/17885))
- **Voice mode CLI parity in the TUI** — VAD loop + TTS + crash forensics ([#14810](https://github.com/NousResearch/hermes-agent/pull/14810))
- Fix: vision — use HERMES_HOME-based cache dir instead of cwd ([#17719](https://github.com/NousResearch/hermes-agent/pull/17719))
### Cron
- **Honor `hermes tools` config for the cron platform** ([#14798](https://github.com/NousResearch/hermes-agent/pull/14798))
- **Per-job `workdir`** — project-aware cron runs ([#15110](https://github.com/NousResearch/hermes-agent/pull/15110))
- **`context_from` field** — chain cron job outputs ([#15606](https://github.com/NousResearch/hermes-agent/pull/15606))
- Fix: promote `croniter` to a core dependency ([#17577](https://github.com/NousResearch/hermes-agent/pull/17577))
### Web search
- **Expose `limit` for `web_search`** ([#16934](https://github.com/NousResearch/hermes-agent/pull/16934))
### Maps
- Fix: include seconds in timezone UTC offset output ([#16300](https://github.com/NousResearch/hermes-agent/pull/16300))
### Approvals
- **Hardline blocklist for unrecoverable commands** ([#15878](https://github.com/NousResearch/hermes-agent/pull/15878))
- Perf: precompile DANGEROUS_PATTERNS and HARDLINE_PATTERNS ([#17206](https://github.com/NousResearch/hermes-agent/pull/17206))
### ACP
- **Advertise and forward image prompts** ([#18030](https://github.com/NousResearch/hermes-agent/pull/18030))
### API Server
- **POST `/v1/runs/{run_id}/stop`** (salvage of #15656) ([#15842](https://github.com/NousResearch/hermes-agent/pull/15842))
- **Expose run status for external UIs** (#17085) ([#17458](https://github.com/NousResearch/hermes-agent/pull/17458))
### Nix
- **Declarative plugin installation for NixOS module** (@alt-glitch) ([#15953](https://github.com/NousResearch/hermes-agent/pull/15953))
- Fix: use `--rebuild` in fix-lockfiles to bypass cached FOD store paths ([#15444](https://github.com/NousResearch/hermes-agent/pull/15444))
- Fix: `extraPackages` now actually works via per-user profile ([#17047](https://github.com/NousResearch/hermes-agent/pull/17047))
- Fix: refresh web/ npm-deps hash to unblock main builds ([#17174](https://github.com/NousResearch/hermes-agent/pull/17174))
- Fix: replace magic-nix-cache with Cachix ([#17928](https://github.com/NousResearch/hermes-agent/pull/17928))
---
## 🖥️ TUI
### New features
- **LaTeX rendering** (@austinpickett) ([#17175](https://github.com/NousResearch/hermes-agent/pull/17175))
- **`/reload` .env hot-reload** — ported from the classic CLI ([#17286](https://github.com/NousResearch/hermes-agent/pull/17286))
- **Pluggable busy-indicator styles** (@OutThisLife, #13610) ([#17150](https://github.com/NousResearch/hermes-agent/pull/17150))
- **Opt-in auto-resume of the most recent session** (@OutThisLife) ([#17130](https://github.com/NousResearch/hermes-agent/pull/17130))
- **Expanded light-terminal auto-detection** — `HERMES_TUI_THEME` + background hex (@OutThisLife) ([#17113](https://github.com/NousResearch/hermes-agent/pull/17113))
- **Delete sessions from `/resume` picker with `d`** (@OutThisLife) ([#17668](https://github.com/NousResearch/hermes-agent/pull/17668))
- **Line-by-line scroll on modified mouse wheel** (@OutThisLife) ([#17669](https://github.com/NousResearch/hermes-agent/pull/17669))
- **Delete queued message while editing with ctrl-x / cancel with esc** (@OutThisLife) ([#16707](https://github.com/NousResearch/hermes-agent/pull/16707))
- **Per-section visibility for the details accordion** (@OutThisLife) ([#14968](https://github.com/NousResearch/hermes-agent/pull/14968))
- **Voice mode CLI parity** — VAD loop + TTS + crash forensics ([#14810](https://github.com/NousResearch/hermes-agent/pull/14810))
- **Contextual first-touch hints ported to TUI** — `/busy`, `/verbose` ([#16054](https://github.com/NousResearch/hermes-agent/pull/16054))
- **Mini help menu on `?` in the input field** (@ethernet8023) ([#18043](https://github.com/NousResearch/hermes-agent/pull/18043))
### Fixes
- Fix: proactive mouse disable on ConPTY + `/mouse` toggle command (@kevin-ho, WSL2 ghost-mouse fix) ([#15488](https://github.com/NousResearch/hermes-agent/pull/15488))
- Fix: restore skills search RPC ([#15870](https://github.com/NousResearch/hermes-agent/pull/15870))
- Perf: cache text measurements across yoga flex re-passes ([#14818](https://github.com/NousResearch/hermes-agent/pull/14818))
- Perf: stabilize long-session scrolling ([#15926](https://github.com/NousResearch/hermes-agent/pull/15926))
- Perf: lazily seed virtual history heights ([#16523](https://github.com/NousResearch/hermes-agent/pull/16523))
- Perf: cut visible cold start ~57% with lazy agent init ([#17190](https://github.com/NousResearch/hermes-agent/pull/17190))
---
## 🖱️ CLI & User Experience
### New commands
- **`hermes -z <prompt>`** — non-interactive one-shot mode ([#15702](https://github.com/NousResearch/hermes-agent/pull/15702))
- **`hermes -z` with `--model` / `--provider` / `HERMES_INFERENCE_MODEL`** ([#15704](https://github.com/NousResearch/hermes-agent/pull/15704))
- **`hermes update --check`** preflight flag ([#15841](https://github.com/NousResearch/hermes-agent/pull/15841))
- **`hermes fallback`** command for managing fallback providers ([#16052](https://github.com/NousResearch/hermes-agent/pull/16052))
- **`/busy`** slash command for busy input mode ([#15382](https://github.com/NousResearch/hermes-agent/pull/15382))
- **`/busy` input mode 'steer'** as a third option ([#16279](https://github.com/NousResearch/hermes-agent/pull/16279))
- **`/btw` as alias for `/background`** ([#16053](https://github.com/NousResearch/hermes-agent/pull/16053))
- **`/reload-skills`** slash command (salvage #17670) ([#17744](https://github.com/NousResearch/hermes-agent/pull/17744))
- **Surface `/queue`, `/bg`, `/steer` in agent-running placeholder** ([#16118](https://github.com/NousResearch/hermes-agent/pull/16118))
### Setup / onboarding
- **Auto-reconfigure on existing installs** ([#15879](https://github.com/NousResearch/hermes-agent/pull/15879))
- **Contextual first-touch hints for `/busy` and `/verbose`** ([#16046](https://github.com/NousResearch/hermes-agent/pull/16046))
- **Cost-saving tips from the April 30 tip-of-the-day** ([#17841](https://github.com/NousResearch/hermes-agent/pull/17841))
- **Hyperlink startup banner title to the latest GitHub Release** ([#14945](https://github.com/NousResearch/hermes-agent/pull/14945))
### Update / backup
- **Snapshot pairing data before `git pull`** ([#16383](https://github.com/NousResearch/hermes-agent/pull/16383))
- **Auto-backup HERMES_HOME before `hermes update`** (opt-in, off by default) ([#16539](https://github.com/NousResearch/hermes-agent/pull/16539), [#16566](https://github.com/NousResearch/hermes-agent/pull/16566))
- **Exclude `checkpoints/` from backups** ([#16572](https://github.com/NousResearch/hermes-agent/pull/16572))
- **Exclude SQLite WAL/SHM/journal sidecars from backups** ([#16576](https://github.com/NousResearch/hermes-agent/pull/16576))
- **Installer FHS layout for root installs on Linux** ([#15608](https://github.com/NousResearch/hermes-agent/pull/15608))
- Fix: kill stale dashboards instead of warning ([#17832](https://github.com/NousResearch/hermes-agent/pull/17832))
- Fix: show correct update status on nix-built hermes ([#17550](https://github.com/NousResearch/hermes-agent/pull/17550))
### Slash-command housekeeping
- Refactor: drop `/provider`, `/plan` handler, and clean up slash registry ([#15047](https://github.com/NousResearch/hermes-agent/pull/15047))
- Refactor: drop `persist_session` plumbing + fix broken `/btw` mid-turn bypass ([#16075](https://github.com/NousResearch/hermes-agent/pull/16075))
### OpenClaw migration (for folks coming from OpenClaw)
- **Hardened OpenClaw import** — plan-first apply, redaction, pre-migration backup ([#16911](https://github.com/NousResearch/hermes-agent/pull/16911))
- Fix: case-preserving brand rewrite + one-time `~/.openclaw` residue banner ([#16327](https://github.com/NousResearch/hermes-agent/pull/16327))
- Fix: resolve `openclaw` workspace files from `agents.defaults.workspace` ([#16879](https://github.com/NousResearch/hermes-agent/pull/16879))
- Fix: resolve model aliases against real OpenClaw catalog schema (salvage #16778) ([#16977](https://github.com/NousResearch/hermes-agent/pull/16977))
---
## 📊 Web Dashboard
- **Models tab** — rich per-model analytics ([#17745](https://github.com/NousResearch/hermes-agent/pull/17745))
- **Configure main + auxiliary models from the Models page** ([#17802](https://github.com/NousResearch/hermes-agent/pull/17802))
- **Dashboard Chat tab — xterm.js + JSON-RPC sidecar** (supersedes #12710 + #13379, @OutThisLife) ([#14890](https://github.com/NousResearch/hermes-agent/pull/14890))
- **Dashboard layout refresh** (@austinpickett) ([#14899](https://github.com/NousResearch/hermes-agent/pull/14899))
- **`--stop` and `--status` flags** on the dashboard CLI ([#17840](https://github.com/NousResearch/hermes-agent/pull/17840))
- **Page-scoped plugin slots for built-in pages** ([#15658](https://github.com/NousResearch/hermes-agent/pull/15658))
- Fix: replace all buttons for design system buttons ([#17007](https://github.com/NousResearch/hermes-agent/pull/17007))
---
## ⚡ Performance
- **TUI visible cold start cut ~57%** via lazy agent init ([#17190](https://github.com/NousResearch/hermes-agent/pull/17190))
- **Lazy-import OpenAI, Anthropic, Firecrawl, account_usage** ([#17046](https://github.com/NousResearch/hermes-agent/pull/17046))
- **mtime-cache `load_config()` and `read_raw_config()`** ([#17041](https://github.com/NousResearch/hermes-agent/pull/17041))
- **Memoize `get_tool_definitions()` + TTL-cache `check_fn` results** ([#17098](https://github.com/NousResearch/hermes-agent/pull/17098))
- **Precompile DANGEROUS_PATTERNS and HARDLINE_PATTERNS** ([#17206](https://github.com/NousResearch/hermes-agent/pull/17206))
- **Cache Ink text measurements across yoga flex re-passes** ([#14818](https://github.com/NousResearch/hermes-agent/pull/14818))
- **Stabilize long-session scrolling** ([#15926](https://github.com/NousResearch/hermes-agent/pull/15926))
- **Lazily seed virtual history heights** ([#16523](https://github.com/NousResearch/hermes-agent/pull/16523))
---
## 🔒 Security & Reliability
- **Secret redaction off by default** — stops corrupting patches / API payloads with fake-key substitutions. Opt in via `redaction.enabled: true` ([#16794](https://github.com/NousResearch/hermes-agent/pull/16794))
- **`[SYSTEM:``[IMPORTANT:`** in all user-injected markers (Azure content filter dodge) ([#16114](https://github.com/NousResearch/hermes-agent/pull/16114))
- **Hardline blocklist for unrecoverable commands** ([#15878](https://github.com/NousResearch/hermes-agent/pull/15878))
- **Canonical `mask_secret` helper; fix status.py DIM drift** ([#17207](https://github.com/NousResearch/hermes-agent/pull/17207))
- **Sweep expired paste.rs uploads on a real timer** ([#16431](https://github.com/NousResearch/hermes-agent/pull/16431))
- **Preserve symlinks during atomic file writes** ([#16980](https://github.com/NousResearch/hermes-agent/pull/16980))
- **Probe `/dev/tty` by opening it, not bare existence** ([#17024](https://github.com/NousResearch/hermes-agent/pull/17024))
---
## 🐛 Notable Bug Fixes
This window includes 360 `fix:` PRs. Selected highlights from across the stack:
- **Background review fork inherits parent's live runtime** — provider/model/creds now propagate correctly ([#16099](https://github.com/NousResearch/hermes-agent/pull/16099))
- **Hindsight configurable `HINDSIGHT_TIMEOUT` env var** ([#15077](https://github.com/NousResearch/hermes-agent/pull/15077))
- **Tools: normalize numeric entries + clear stale `no_mcp` in `_save_platform_tools`** ([#15607](https://github.com/NousResearch/hermes-agent/pull/15607))
- **MCP: rewrite `definitions` refs to `$defs` in input schemas** — closes provider-side 400s
- **Azure content filter compatibility** — renamed `[SYSTEM:` markers so Azure's content filter stops flagging them ([#16114](https://github.com/NousResearch/hermes-agent/pull/16114))
- **Vision cache uses HERMES_HOME instead of cwd** ([#17719](https://github.com/NousResearch/hermes-agent/pull/17719))
- **FTS5 search** — tool_name + tool_calls indexing with repair + migration ([#16914](https://github.com/NousResearch/hermes-agent/pull/16914))
- **Streaming reasoning persists on assistant turns** ([#16892](https://github.com/NousResearch/hermes-agent/pull/16892))
- **execute_code concurrent RPC serialization** (#17770) ([#17894](https://github.com/NousResearch/hermes-agent/pull/17894), [#17902](https://github.com/NousResearch/hermes-agent/pull/17902))
- **Background reviewer scoped to memory + skills toolsets** — no more accidental web/shell escapes ([#16569](https://github.com/NousResearch/hermes-agent/pull/16569))
- **Compression recovery** — retry on main before giving up; notify user when aux fails ([#16774](https://github.com/NousResearch/hermes-agent/pull/16774), [#16775](https://github.com/NousResearch/hermes-agent/pull/16775))
- **`croniter` promoted to a core dependency** ([#17577](https://github.com/NousResearch/hermes-agent/pull/17577))
- **Discord tool `limit` parameter coerced to int** before `min()` call ([#16319](https://github.com/NousResearch/hermes-agent/pull/16319))
- **Yuanbao messaging platform entrance fix** ([#16880](https://github.com/NousResearch/hermes-agent/pull/16880))
- **ACP advertise and forward image prompts** ([#18030](https://github.com/NousResearch/hermes-agent/pull/18030))
- **DeepSeek / Kimi reasoning content isolation** across cross-provider histories (@Zjianru) ([#15749](https://github.com/NousResearch/hermes-agent/pull/15749), [#15762](https://github.com/NousResearch/hermes-agent/pull/15762))
- **Preserve reasoning_content replay on DeepSeek v4 + Kimi/Moonshot thinking** ([#18045](https://github.com/NousResearch/hermes-agent/pull/18045))
The vast majority of the 360 fixes landed in the streaming/compression/tool-calling paths across all providers — DeepSeek, Kimi, Moonshot, GLM, Qwen, MiniMax, Gemini, Anthropic, OpenAI — alongside TUI polish (resize, scroll, sticky-prompt) and gateway platform-specific edge cases.
---
## 🧪 Testing & CI
- Hermetic test parity (`scripts/run_tests.sh`) held across this window
- **Microsoft Teams xdist collision guard** — prevents worker collisions when Teams platform tests run in parallel ([#17828](https://github.com/NousResearch/hermes-agent/pull/17828))
- Chore: remove unused imports and dead locals (ruff F401, F841) ([#17010](https://github.com/NousResearch/hermes-agent/pull/17010))
---
## 📚 Documentation
- **Curator feature page** added to docs site ([#17563](https://github.com/NousResearch/hermes-agent/pull/17563))
- **Document pin also blocking `skill_manage` writes** ([#17578](https://github.com/NousResearch/hermes-agent/pull/17578))
- **Direct-URL skill install documented** across features, reference, guide, and `hermes-agent` skill ([#16355](https://github.com/NousResearch/hermes-agent/pull/16355))
- **Hooks tutorial — build a BOOT.md startup checklist** (replaces the removed built-in hook) ([#17202](https://github.com/NousResearch/hermes-agent/pull/17202))
- **ComfyUI docs: ask local vs cloud FIRST before hardware check** ([#17612](https://github.com/NousResearch/hermes-agent/pull/17612))
- **Obliteratus skill: link YouTube video guide in SKILL.md** ([#15808](https://github.com/NousResearch/hermes-agent/pull/15808))
- Per-skill docs pages generated for bundled + optional skills; ASCII art code blocks auto-wrapped ([#14929](https://github.com/NousResearch/hermes-agent/pull/14929), [#16497](https://github.com/NousResearch/hermes-agent/pull/16497))
---
## ⚖️ Removed / Reverted
- **Kanban multi-profile collaboration board** — landed in #16081, reverted in ([#16098](https://github.com/NousResearch/hermes-agent/pull/16098)) while the design is reworked
- **computer-use cua-driver** — 3 preparatory PRs landed then were reverted in ([#16927](https://github.com/NousResearch/hermes-agent/pull/16927))
- **BOOT.md built-in hook** removed ([#17093](https://github.com/NousResearch/hermes-agent/pull/17093)); the hooks tutorial ([#17202](https://github.com/NousResearch/hermes-agent/pull/17202)) shows how to build the same workflow yourself with a shell hook
- **`/provider` + `/plan` slash commands dropped** ([#15047](https://github.com/NousResearch/hermes-agent/pull/15047))
- **`flush_memories` removed entirely** ([#15696](https://github.com/NousResearch/hermes-agent/pull/15696))
---
## 👥 Contributors
### Core
- **@teknium1** (Teknium)
### Top Community Contributors (by merged PR count since v0.11.0)
- **@OutThisLife** (Brooklyn) — 52 PRs · TUI — light-terminal detection + pluggable busy styles + auto-resume + session-delete from /resume + mouse-wheel scrolling + xterm.js dashboard Chat tab + cold-start cut + accordion polish
- **@kshitijk4poor** — 12 PRs · LM Studio first-class provider (salvage), Vercel Sandbox backend, GMI Cloud salvage, bundled-by-default touchdesigner-mcp, many tool-call / reasoning fixes
- **@helix4u** — 10 PRs · MCP schema robustness, assorted stability fixes
- **@alt-glitch** — 8 PRs · trigram FTS5 CJK search, declarative Nix plugin install, matrix/feishu hints and fixes
- **@ethernet8023** — 4 PRs
- **@austinpickett** — 4 PRs · LaTeX rendering in TUI, dashboard layout refresh
- **@benbarclay** — 3 PRs · Docker run-as-host-user so bind mounts don't get root-owned
- **@vominh1919** — 2 PRs
- **@stephenschoettler** — 2 PRs
- **@kevin-ho** — ConPTY mouse-injection fix (#15488)
- **@Zjianru** — cross-provider reasoning_content isolation + DeepSeek/Kimi empty-reasoning injection (#15749, #15762)
- **@web3blind** — Telegram chat allowlists for groups and forums (#15027)
- **@SHL0MS** — 9 new TouchDesigner-MCP reference docs (#16768)
- **@0xDevNinja** — curator `restore_skill` nested-archive fix (#17951)
- **@y0shua1ee** — curator `use` activity fix (#17953)
### Also contributing
Salvaged or co-authored work from **@isaachuangGMICLOUD** (GMI Cloud), earlier upstream PRs from the original author of each salvage chain, and a long tail of one-shot fixes, documentation nudges, and skill contributions from the community.
### All Contributors (alphabetical, excluding @teknium1)
@0xbyt4, @0xharryriddle, @0xDevNinja, @0z1-ghb, @5park1e, @A-FdL-Prog, @aj-nt, @akhater, @alblez, @alexg0bot,
@alexzhu0, @AllardQuek, @alt-glitch, @amanning3390, @amanuel2, @AndreKurait, @andrewhosf, @Andy283, @andyylin,
@angel12, @AntAISecurityLab, @ash, @austinpickett, @badgerbees, @BadTechBandit, @Bartok9, @beenherebefore,
@beesrsj2500, @BeliefanX, @benbarclay, @benjaminsehl, @BlackishGreen33, @bloodcarter, @BlueBirdBack,
@briandevans, @brooklynnicholson, @bsgdigital, @buray, @bwjoke, @camaragon, @cdanis, @cgarwood82,
@charles-brooks, @chen1749144759, @chengoak, @ching-kaching, @Contentment003111, @crayfish-ai, @CruxExperts,
@cyclingwithelephants, @dandaka, @danklynn, @ddupont808, @dhabibi, @difujia, @dimitrovi, @dlkakbs,
@dontcallmejames, @EKKOLearnAI, @emozilla, @ericnicolaides, @Erosika, @ethernet8023, @exiao, @Feranmi10,
@flobo3, @foxion37, @georgeglessner, @georgex8001, @ghostmfr, @H-Ali13381, @HangGlidersRule, @harryplusplus,
@haru398801, @heathley, @hejuntt1014, @hekaru-agent, @helix4u, @Heltman, @HenkDz, @heyitsaamir, @hharry11,
@hhhonzik, @hhuang91, @HiddenPuppy, @htsh, @iamagenius00, @in-liberty420, @innocarpe, @irispillars, @iRonin,
@isaachuangGMICLOUD, @Ito-69, @j3ffffff, @jackjin1997, @jakubkrcmar, @Jason2031, @JayGwod, @jerome-benoit,
@johnncenae, @Kailigithub, @keiravoss94, @kevin-ho, @knockyai, @konsisumer, @kshitijk4poor, @kunlabs, @l0hde,
@Leihb, @leoneparise, @LeonSGP43, @liizfq, @liuhao1024, @loongzhao, @lsdsjy, @luyao618, @ma-pony, @Magaav,
@MagicRay1217, @math0r-be, @MattMaximo, @maxims-oss, @MaxyMoos, @maymuneth, @mcndjxlefnd, @memosr,
@MestreY0d4-Uninter, @mewwts, @Mirac1eSky, @MorAlekss, @mrhwick, @mrunmayee17, @mssteuer, @Nanako0129,
@nazirulhafiy, @Nerijusas, @Nicecsh, @nicoloboschi, @nightq, @ningfangbin, @octo-patch, @Octopus,
@OutThisLife, @Paperclip, @pein892, @perlowja, @prasadus92, @qike-ms, @qiyin-code, @Readon, @ReginaldasR,
@revaraver, @rfilgueiras, @rmoen, @romanornr, @rugvedS07, @rylena, @samrusani, @Sanjays2402, @sasha-id,
@Satoshi-agi, @scheidti, @scotttrinh, @season179, @SeeYangZhi, @sgaofen, @shamork, @shannonsands, @SHL0MS,
@simbam99, @Societus, @socrates1024, @Sonoyunchu, @sprmn24, @stephenschoettler, @tangyuanjc, @TechPrototyper,
@tekgnosis-net, @ThomassJonax, @tmimmanuel, @tochukwuada, @Tosko4, @Tranquil-Flow, @twozle, @txbxxx,
@UgwujaGeorge, @Versun, @vlwkaos, @voidborne-d, @vominh1919, @Wang-tianhao, @Wangshengyang2004, @web3blind,
@westers, @Wysie, @xandersbell, @xiahu88988, @XieNBi, @xinbenlv, @xnbi, @y0shua1ee, @yatesjalex, @yes999zc,
@yeyitech, @Yoimex, @YueLich, @Yukipukii1, @zhiyanliu, @zicochaos, @Zjianru, @zkl2333, @zons-zhaozhy,
@ztexydt-cqh.
Also: @Siddharth Balyan, @YuShu.
---
**Full Changelog**: [v2026.4.23...v2026.4.30](https://github.com/NousResearch/hermes-agent/compare/v2026.4.23...v2026.4.30)
-11
View File
@@ -112,17 +112,6 @@ def main() -> None:
import acp
from .server import HermesACPAgent
# MCP tool discovery from config.yaml — run before asyncio.run() so
# it's safe to use blocking waits. (ACP also registers per-session
# MCP servers dynamically via asyncio.to_thread inside the event
# loop; that path is unaffected.) Moved from model_tools.py module
# scope to avoid freezing the gateway's loop on lazy import (#16856).
try:
from tools.mcp_tool import discover_mcp_tools
discover_mcp_tools()
except Exception:
logger.debug("MCP tool discovery failed at ACP startup", exc_info=True)
agent = HermesACPAgent()
try:
asyncio.run(acp.run_agent(agent, use_unstable_protocol=True))
+21 -537
View File
@@ -3,8 +3,6 @@
from __future__ import annotations
import asyncio
import contextvars
import json
import logging
import os
from collections import defaultdict, deque
@@ -14,7 +12,6 @@ from typing import Any, Deque, Optional
import acp
from acp.schema import (
AgentCapabilities,
AgentMessageChunk,
AuthenticateResponse,
AvailableCommand,
AvailableCommandsUpdate,
@@ -32,7 +29,6 @@ from acp.schema import (
McpServerStdio,
ModelInfo,
NewSessionResponse,
PromptCapabilities,
PromptResponse,
ResumeSessionResponse,
SetSessionConfigOptionResponse,
@@ -48,8 +44,6 @@ from acp.schema import (
TextContentBlock,
UnstructuredCommandInput,
Usage,
UsageUpdate,
UserMessageChunk,
)
# AuthMethodAgent was renamed from AuthMethod in agent-client-protocol 0.9.0
@@ -67,7 +61,6 @@ from acp_adapter.events import (
)
from acp_adapter.permissions import make_approval_callback
from acp_adapter.session import SessionManager, SessionState, _expand_acp_enabled_toolsets
from acp_adapter.tools import build_tool_complete, build_tool_start
logger = logging.getLogger(__name__)
@@ -94,69 +87,17 @@ def _extract_text(
| EmbeddedResourceContentBlock
],
) -> str:
"""Extract plain text from ACP content blocks for display/commands."""
"""Extract plain text from ACP content blocks."""
parts: list[str] = []
for block in prompt:
if isinstance(block, TextContentBlock):
parts.append(block.text)
elif hasattr(block, "text"):
parts.append(str(block.text))
# Non-text blocks are ignored for now.
return "\n".join(parts)
def _image_block_to_openai_part(block: ImageContentBlock) -> dict[str, Any] | None:
"""Convert an ACP image content block to OpenAI-style multimodal content."""
data = str(getattr(block, "data", "") or "").strip()
uri = str(getattr(block, "uri", "") or "").strip()
mime_type = str(getattr(block, "mime_type", "") or "image/png").strip() or "image/png"
if data:
url = data if data.startswith("data:") else f"data:{mime_type};base64,{data}"
elif uri:
url = uri
else:
return None
return {"type": "image_url", "image_url": {"url": url}}
def _content_blocks_to_openai_user_content(
prompt: list[
TextContentBlock
| ImageContentBlock
| AudioContentBlock
| ResourceContentBlock
| EmbeddedResourceContentBlock
],
) -> str | list[dict[str, Any]]:
"""Convert ACP prompt blocks into a Hermes/OpenAI-compatible user content payload."""
parts: list[dict[str, Any]] = []
text_parts: list[str] = []
for block in prompt:
if isinstance(block, TextContentBlock):
if block.text:
parts.append({"type": "text", "text": block.text})
text_parts.append(block.text)
continue
if isinstance(block, ImageContentBlock):
image_part = _image_block_to_openai_part(block)
if image_part is not None:
parts.append(image_part)
continue
if not parts:
return _extract_text(prompt)
# Keep pure text prompts as strings so slash-command handling and text-only
# providers keep the exact legacy path. Switch to structured content only
# when an actual non-text block is present.
if all(part.get("type") == "text" for part in parts):
return "\n".join(text_parts)
return parts
class HermesACPAgent(acp.Agent):
"""ACP Agent implementation wrapping Hermes AIAgent."""
@@ -167,8 +108,6 @@ class HermesACPAgent(acp.Agent):
"context": "Show conversation context info",
"reset": "Clear conversation history",
"compact": "Compress conversation context",
"steer": "Inject guidance into the currently running agent turn",
"queue": "Queue a prompt to run after the current turn finishes",
"version": "Show Hermes version",
}
@@ -198,16 +137,6 @@ class HermesACPAgent(acp.Agent):
"name": "compact",
"description": "Compress conversation context",
},
{
"name": "steer",
"description": "Inject guidance into the currently running agent turn",
"input_hint": "guidance for the active turn",
},
{
"name": "queue",
"description": "Queue a prompt to run after the current turn finishes",
"input_hint": "prompt to run next",
},
{
"name": "version",
"description": "Show Hermes version",
@@ -318,66 +247,6 @@ class HermesACPAgent(acp.Agent):
return target_provider, new_model
@staticmethod
def _build_usage_update(state: SessionState) -> UsageUpdate | None:
"""Build ACP native context-usage data for clients like Zed.
Zed's circular context indicator is driven by ACP ``usage_update``
session updates: ``size`` is the model context window and ``used`` is
the current request pressure. Hermes estimates ``used`` from the same
buckets it sends to providers: system prompt, conversation history, and
tool schemas.
"""
agent = state.agent
compressor = getattr(agent, "context_compressor", None)
size = int(getattr(compressor, "context_length", 0) or 0)
if size <= 0:
return None
try:
from agent.model_metadata import estimate_request_tokens_rough
used = estimate_request_tokens_rough(
state.history,
system_prompt=getattr(agent, "_cached_system_prompt", "") or "",
tools=getattr(agent, "tools", None) or None,
)
except Exception:
logger.debug("Could not estimate ACP native context usage", exc_info=True)
used = int(getattr(compressor, "last_prompt_tokens", 0) or 0)
return UsageUpdate(
session_update="usage_update",
size=max(size, 0),
used=max(used, 0),
)
async def _send_usage_update(self, state: SessionState) -> None:
"""Send ACP native context usage to the connected client."""
if not self._conn:
return
update = self._build_usage_update(state)
if update is None:
return
try:
await self._conn.session_update(
session_id=state.session_id,
update=update,
)
except Exception:
logger.warning(
"Failed to send ACP usage update for session %s",
state.session_id,
exc_info=True,
)
def _schedule_usage_update(self, state: SessionState) -> None:
"""Schedule native context indicator refresh after ACP responses."""
if not self._conn:
return
loop = asyncio.get_running_loop()
loop.call_soon(asyncio.create_task, self._send_usage_update(state))
async def _register_session_mcp_servers(
self,
state: SessionState,
@@ -482,7 +351,6 @@ class HermesACPAgent(acp.Agent):
agent_info=Implementation(name="hermes-agent", version=HERMES_VERSION),
agent_capabilities=AgentCapabilities(
load_session=True,
prompt_capabilities=PromptCapabilities(image=True),
session_capabilities=SessionCapabilities(
fork=SessionForkCapabilities(),
list=SessionListCapabilities(),
@@ -508,140 +376,6 @@ class HermesACPAgent(acp.Agent):
# ---- Session management -------------------------------------------------
@staticmethod
def _history_message_text(message: dict[str, Any]) -> str:
"""Extract displayable text from a persisted OpenAI-style message."""
content = message.get("content")
if isinstance(content, str):
return content.strip()
if isinstance(content, list):
parts: list[str] = []
for item in content:
if isinstance(item, dict):
text = item.get("text")
if isinstance(text, str):
parts.append(text)
elif item.get("type") == "text" and isinstance(item.get("content"), str):
parts.append(item["content"])
elif isinstance(item, str):
parts.append(item)
return "\n".join(part.strip() for part in parts if part and part.strip()).strip()
return ""
@staticmethod
def _history_message_update(
*,
role: str,
text: str,
) -> UserMessageChunk | AgentMessageChunk | None:
"""Build an ACP history replay update for a user/assistant message."""
block = TextContentBlock(type="text", text=text)
if role == "user":
return UserMessageChunk(
session_update="user_message_chunk",
content=block,
)
if role == "assistant":
return AgentMessageChunk(
session_update="agent_message_chunk",
content=block,
)
return None
@staticmethod
def _history_tool_call_name_args(tool_call: dict[str, Any]) -> tuple[str, dict[str, Any]]:
"""Extract function name/arguments from an OpenAI-style tool_call."""
function = tool_call.get("function") if isinstance(tool_call.get("function"), dict) else {}
name = str(function.get("name") or tool_call.get("name") or "unknown_tool")
raw_args = function.get("arguments") or tool_call.get("arguments") or tool_call.get("args") or {}
if isinstance(raw_args, str):
try:
parsed = json.loads(raw_args)
except Exception:
parsed = {"raw": raw_args}
raw_args = parsed
if not isinstance(raw_args, dict):
raw_args = {}
return name, raw_args
@staticmethod
def _history_tool_call_id(tool_call: dict[str, Any]) -> str:
"""Return the stable provider tool call id for ACP history replay."""
return str(
tool_call.get("id")
or tool_call.get("call_id")
or tool_call.get("tool_call_id")
or ""
).strip()
async def _replay_session_history(self, state: SessionState) -> None:
"""Send persisted user/assistant history to clients during session/load.
Zed's ACP history UI calls ``session/load`` after the user picks an item
from the Agents sidebar. The agent must then replay the full conversation
as user/assistant chunks plus reconstructed tool-call start/completion
notifications; merely restoring server-side state makes Hermes remember
context, but leaves the editor looking like a clean thread.
"""
if not self._conn or not state.history:
return
active_tool_calls: dict[str, tuple[str, dict[str, Any]]] = {}
async def _send(update: Any) -> bool:
try:
await self._conn.session_update(session_id=state.session_id, update=update)
return True
except Exception:
logger.warning(
"Failed to replay ACP history for session %s",
state.session_id,
exc_info=True,
)
return False
for message in state.history:
role = str(message.get("role") or "")
if role in {"user", "assistant"}:
text = self._history_message_text(message)
if text:
update = self._history_message_update(role=role, text=text)
if update is not None and not await _send(update):
return
if role == "assistant" and isinstance(message.get("tool_calls"), list):
for tool_call in message["tool_calls"]:
if not isinstance(tool_call, dict):
continue
tool_call_id = self._history_tool_call_id(tool_call)
if not tool_call_id:
continue
tool_name, args = self._history_tool_call_name_args(tool_call)
active_tool_calls[tool_call_id] = (tool_name, args)
if not await _send(build_tool_start(tool_call_id, tool_name, args)):
return
continue
if role == "tool":
tool_call_id = str(message.get("tool_call_id") or "").strip()
tool_name = str(message.get("tool_name") or "").strip()
function_args: dict[str, Any] | None = None
if tool_call_id in active_tool_calls:
tool_name, function_args = active_tool_calls.pop(tool_call_id)
if not tool_call_id or not tool_name:
continue
result = message.get("content")
if not await _send(
build_tool_complete(
tool_call_id,
tool_name,
result=result if isinstance(result, str) else None,
function_args=function_args,
)
):
return
async def new_session(
self,
cwd: str,
@@ -652,24 +386,11 @@ class HermesACPAgent(acp.Agent):
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("New session %s (cwd=%s)", state.session_id, cwd)
self._schedule_available_commands_update(state.session_id)
self._schedule_usage_update(state)
return NewSessionResponse(
session_id=state.session_id,
models=self._build_model_state(state),
)
def _schedule_history_replay(self, state: SessionState) -> None:
"""Replay persisted history after session/load or session/resume returns.
Zed only attaches streamed transcript/tool updates once the load/resume
response has completed. Sending replay notifications while the request is
still in-flight can make the server look correct in logs while the editor
drops or fails to attach the tool-call history.
"""
loop = asyncio.get_running_loop()
replay_coro = self._replay_session_history(state)
loop.call_soon(asyncio.create_task, replay_coro)
async def load_session(
self,
cwd: str,
@@ -683,9 +404,7 @@ class HermesACPAgent(acp.Agent):
return None
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("Loaded session %s", session_id)
self._schedule_history_replay(state)
self._schedule_available_commands_update(session_id)
self._schedule_usage_update(state)
return LoadSessionResponse(models=self._build_model_state(state))
async def resume_session(
@@ -701,17 +420,12 @@ class HermesACPAgent(acp.Agent):
state = self.session_manager.create_session(cwd=cwd)
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("Resumed session %s", state.session_id)
self._schedule_history_replay(state)
self._schedule_available_commands_update(state.session_id)
self._schedule_usage_update(state)
return ResumeSessionResponse(models=self._build_model_state(state))
async def cancel(self, session_id: str, **kwargs: Any) -> None:
state = self.session_manager.get_session(session_id)
if state and state.cancel_event:
with state.runtime_lock:
if state.is_running and state.current_prompt_text:
state.interrupted_prompt_text = state.current_prompt_text
state.cancel_event.set()
try:
if getattr(state, "agent", None) and hasattr(state.agent, "interrupt"):
@@ -802,77 +516,18 @@ class HermesACPAgent(acp.Agent):
return PromptResponse(stop_reason="refusal")
user_text = _extract_text(prompt).strip()
user_content = _content_blocks_to_openai_user_content(prompt)
has_content = bool(user_text) or (
isinstance(user_content, list) and bool(user_content)
)
if not has_content:
if not user_text:
return PromptResponse(stop_reason="end_turn")
# /steer on an idle session has no in-flight tool call to inject into.
# Rewrite it so the payload runs as a normal user prompt, matching the
# gateway's behavior (gateway/run.py ~L4898). Two sub-cases:
# 1. Zed-interrupt salvage — a prior prompt was cancelled by the
# client right before /steer arrived; replay it with the steer
# text attached as explicit correction/guidance so the user's
# in-flight work isn't lost.
# 2. Plain idle — no prior work to salvage; just run the steer
# payload as a regular prompt. Without this, _cmd_steer would
# silently append to state.queued_prompts and respond with
# "No active turn — queued for the next turn", which looks like
# /queue even though the user never typed /queue.
if isinstance(user_content, str) and user_text.startswith("/steer"):
steer_text = user_text.split(maxsplit=1)[1].strip() if len(user_text.split(maxsplit=1)) > 1 else ""
interrupted_prompt = ""
rewrite_idle = False
with state.runtime_lock:
if not state.is_running and steer_text:
if state.interrupted_prompt_text:
interrupted_prompt = state.interrupted_prompt_text
state.interrupted_prompt_text = ""
else:
rewrite_idle = True
if interrupted_prompt:
user_text = (
f"{interrupted_prompt}\n\n"
f"User correction/guidance after interrupt: {steer_text}"
)
user_content = user_text
elif rewrite_idle:
user_text = steer_text
user_content = steer_text
# Intercept slash commands — handle locally without calling the LLM.
# Slash commands are text-only; if the client included images/resources,
# send the whole multimodal prompt to the agent instead of treating it as
# an ACP command.
if isinstance(user_content, str) and user_text.startswith("/"):
# Intercept slash commands — handle locally without calling the LLM
if user_text.startswith("/"):
response_text = self._handle_slash_command(user_text, state)
if response_text is not None:
if self._conn:
update = acp.update_agent_message_text(response_text)
await self._conn.session_update(session_id, update)
await self._send_usage_update(state)
return PromptResponse(stop_reason="end_turn")
# If Zed sends another regular prompt while the same ACP session is
# still running, queue it instead of racing two AIAgent loops against
# the same state.history. /steer and /queue are handled above and can
# land immediately.
with state.runtime_lock:
if state.is_running:
queued_text = user_text or "[Image attachment]"
state.queued_prompts.append(queued_text)
depth = len(state.queued_prompts)
if self._conn:
update = acp.update_agent_message_text(
f"Queued for the next turn. ({depth} queued)"
)
await self._conn.session_update(session_id, update)
return PromptResponse(stop_reason="end_turn")
state.is_running = True
state.current_prompt_text = user_text or "[Image attachment]"
logger.info("Prompt on session %s: %s", session_id, user_text[:100])
conn = self._conn
@@ -885,37 +540,24 @@ class HermesACPAgent(acp.Agent):
tool_call_meta: dict[str, dict[str, Any]] = {}
previous_approval_cb = None
streamed_message = False
if conn:
tool_progress_cb = make_tool_progress_cb(conn, session_id, loop, tool_call_ids, tool_call_meta)
reasoning_cb = make_thinking_cb(conn, session_id, loop)
thinking_cb = make_thinking_cb(conn, session_id, loop)
step_cb = make_step_cb(conn, session_id, loop, tool_call_ids, tool_call_meta)
message_cb = make_message_cb(conn, session_id, loop)
def stream_delta_cb(text: str) -> None:
nonlocal streamed_message
if text:
streamed_message = True
message_cb(text)
approval_cb = make_approval_callback(conn.request_permission, loop, session_id)
else:
tool_progress_cb = None
reasoning_cb = None
thinking_cb = None
step_cb = None
stream_delta_cb = None
message_cb = None
approval_cb = None
agent = state.agent
agent.tool_progress_callback = tool_progress_cb
# ACP thought panes should not receive Hermes' local kawaii waiting/status
# updates. Route provider/model reasoning deltas instead; if the provider
# emits no reasoning, Zed should not get a fake "thinking" accordion.
agent.thinking_callback = None
agent.reasoning_callback = reasoning_cb
agent.thinking_callback = thinking_cb
agent.step_callback = step_cb
agent.stream_delta_callback = stream_delta_cb
agent.message_callback = message_cb
# Approval callback is per-thread (thread-local, GHSA-qg5c-hvr5-hjgr).
# Set it INSIDE _run_agent so the TLS write happens in the executor
@@ -932,22 +574,6 @@ class HermesACPAgent(acp.Agent):
def _run_agent() -> dict:
nonlocal previous_approval_cb, previous_interactive
# Bind HERMES_SESSION_KEY for this session so per-session caches
# (e.g. the interactive sudo password cache in tools.terminal_tool)
# scope to the ACP session rather than leaking across sessions
# that land on the same reused executor thread. This call runs
# inside a contextvars.copy_context() below, so the ContextVar
# write is isolated from other concurrent ACP sessions.
try:
from gateway.session_context import (
clear_session_vars,
set_session_vars,
)
session_tokens = set_session_vars(session_key=session_id)
except Exception:
session_tokens = None
clear_session_vars = None # type: ignore[assignment]
logger.debug("Could not set ACP session context", exc_info=True)
if approval_cb:
try:
from tools import terminal_tool as _terminal_tool
@@ -961,10 +587,9 @@ class HermesACPAgent(acp.Agent):
os.environ["HERMES_INTERACTIVE"] = "1"
try:
result = agent.run_conversation(
user_message=user_content,
user_message=user_text,
conversation_history=state.history,
task_id=session_id,
persist_user_message=user_text or "[Image attachment]",
)
return result
except Exception as e:
@@ -982,24 +607,11 @@ class HermesACPAgent(acp.Agent):
_terminal_tool.set_approval_callback(previous_approval_cb)
except Exception:
logger.debug("Could not restore approval callback", exc_info=True)
if session_tokens is not None and clear_session_vars is not None:
try:
clear_session_vars(session_tokens)
except Exception:
logger.debug("Could not clear ACP session context", exc_info=True)
try:
# Wrap the executor call in a fresh copy of the current context so
# concurrent ACP sessions on the shared ThreadPoolExecutor don't
# stomp on each other's ContextVar writes (HERMES_SESSION_KEY in
# particular — used by the interactive sudo password cache scope).
ctx = contextvars.copy_context()
result = await loop.run_in_executor(_executor, ctx.run, _run_agent)
result = await loop.run_in_executor(_executor, _run_agent)
except Exception:
logger.exception("Executor error for session %s", session_id)
with state.runtime_lock:
state.is_running = False
state.current_prompt_text = ""
return PromptResponse(stop_reason="end_turn")
if result.get("messages"):
@@ -1021,32 +633,10 @@ class HermesACPAgent(acp.Agent):
)
except Exception:
logger.debug("Failed to auto-title ACP session %s", session_id, exc_info=True)
if final_response and conn and not streamed_message:
if final_response and conn:
update = acp.update_agent_message_text(final_response)
await conn.session_update(session_id, update)
# Mark this turn idle before draining queued work so recursive prompt()
# calls can acquire the session. Queued turns are intentionally run as
# normal follow-up user prompts, preserving role alternation and history.
with state.runtime_lock:
state.is_running = False
state.current_prompt_text = ""
while True:
with state.runtime_lock:
if not state.queued_prompts:
break
next_prompt = state.queued_prompts.pop(0)
if conn:
await conn.session_update(
session_id,
acp.update_user_message_text(next_prompt),
)
await self.prompt(
prompt=[TextContentBlock(type="text", text=next_prompt)],
session_id=session_id,
)
usage = None
if any(result.get(key) is not None for key in ("prompt_tokens", "completion_tokens", "total_tokens")):
usage = Usage(
@@ -1057,8 +647,6 @@ class HermesACPAgent(acp.Agent):
cached_read_tokens=result.get("cache_read_tokens"),
)
await self._send_usage_update(state)
stop_reason = "cancelled" if state.cancel_event and state.cancel_event.is_set() else "end_turn"
return PromptResponse(stop_reason=stop_reason, usage=usage)
@@ -1126,8 +714,6 @@ class HermesACPAgent(acp.Agent):
"context": self._cmd_context,
"reset": self._cmd_reset,
"compact": self._cmd_compact,
"steer": self._cmd_steer,
"queue": self._cmd_queue,
"version": self._cmd_version,
}.get(cmd)
@@ -1191,84 +777,22 @@ class HermesACPAgent(acp.Agent):
return f"Could not list tools: {e}"
def _cmd_context(self, args: str, state: SessionState) -> str:
"""Show ACP session context pressure and compression guidance."""
n_messages = len(state.history)
# Count by role.
if n_messages == 0:
return "Conversation is empty (no messages yet)."
# Count by role
roles: dict[str, int] = {}
for msg in state.history:
role = msg.get("role", "unknown")
roles[role] = roles.get(role, 0) + 1
agent = state.agent
model = state.model or getattr(agent, "model", "")
provider = getattr(agent, "provider", None) or "auto"
compressor = getattr(agent, "context_compressor", None)
context_length = int(getattr(compressor, "context_length", 0) or 0)
threshold_tokens = int(getattr(compressor, "threshold_tokens", 0) or 0)
try:
from agent.model_metadata import estimate_request_tokens_rough
system_prompt = getattr(agent, "_cached_system_prompt", "") or ""
tools = getattr(agent, "tools", None) or None
approx_tokens = estimate_request_tokens_rough(
state.history,
system_prompt=system_prompt,
tools=tools,
)
except Exception:
logger.debug("Could not estimate ACP context usage", exc_info=True)
approx_tokens = 0
if threshold_tokens <= 0 and context_length > 0:
threshold_tokens = int(context_length * 0.80)
lines = [
f"Conversation: {n_messages} messages"
if n_messages
else "Conversation is empty (no messages yet).",
f"Conversation: {n_messages} messages",
f" user: {roles.get('user', 0)}, assistant: {roles.get('assistant', 0)}, "
f"tool: {roles.get('tool', 0)}, system: {roles.get('system', 0)}",
]
model = state.model or getattr(state.agent, "model", "")
if model:
lines.append(f"Model: {model}")
lines.append(f"Provider: {provider}")
if approx_tokens > 0:
if context_length > 0:
usage_pct = (approx_tokens / context_length) * 100
lines.append(
f"Context usage: ~{approx_tokens:,} / {context_length:,} tokens ({usage_pct:.1f}%)"
)
else:
lines.append(f"Context usage: ~{approx_tokens:,} tokens")
if threshold_tokens > 0:
if approx_tokens > 0:
threshold_pct = (threshold_tokens / context_length) * 100 if context_length > 0 else 0
remaining = max(threshold_tokens - approx_tokens, 0)
if approx_tokens >= threshold_tokens:
lines.append(
f"Compression: due now (threshold ~{threshold_tokens:,}"
+ (f", {threshold_pct:.0f}%" if threshold_pct else "")
+ "). Run /compact."
)
else:
lines.append(
f"Compression: ~{remaining:,} tokens until threshold "
f"(~{threshold_tokens:,}"
+ (f", {threshold_pct:.0f}%" if threshold_pct else "")
+ ")."
)
else:
lines.append(f"Compression threshold: ~{threshold_tokens:,} tokens")
if getattr(agent, "compression_enabled", True) is False:
lines.append("Compression is disabled for this agent.")
else:
lines.append("Tip: run /compact to compress manually before the threshold.")
return "\n".join(lines)
def _cmd_reset(self, args: str, state: SessionState) -> str:
@@ -1286,16 +810,10 @@ class HermesACPAgent(acp.Agent):
if not hasattr(agent, "_compress_context"):
return "Context compression not available for this agent."
from agent.model_metadata import estimate_request_tokens_rough
from agent.model_metadata import estimate_messages_tokens_rough
original_count = len(state.history)
# Include system prompt + tool schemas so the figure reflects real
# request pressure, not a transcript-only underestimate (#6217).
_sys_prompt = getattr(agent, "_cached_system_prompt", "") or ""
_tools = getattr(agent, "tools", None) or None
approx_tokens = estimate_request_tokens_rough(
state.history, system_prompt=_sys_prompt, tools=_tools
)
approx_tokens = estimate_messages_tokens_rough(state.history)
original_session_db = getattr(agent, "_session_db", None)
try:
@@ -1315,13 +833,7 @@ class HermesACPAgent(acp.Agent):
self.session_manager.save_session(state.session_id)
new_count = len(state.history)
_sys_prompt_after = getattr(agent, "_cached_system_prompt", "") or _sys_prompt
_tools_after = getattr(agent, "tools", None) or _tools
new_tokens = estimate_request_tokens_rough(
state.history,
system_prompt=_sys_prompt_after,
tools=_tools_after,
)
new_tokens = estimate_messages_tokens_rough(state.history)
return (
f"Context compressed: {original_count} -> {new_count} messages\n"
f"~{approx_tokens:,} -> ~{new_tokens:,} tokens"
@@ -1329,34 +841,6 @@ class HermesACPAgent(acp.Agent):
except Exception as e:
return f"Compression failed: {e}"
def _cmd_steer(self, args: str, state: SessionState) -> str:
steer_text = args.strip()
if not steer_text:
return "Usage: /steer <guidance>"
if state.is_running and hasattr(state.agent, "steer"):
try:
if state.agent.steer(steer_text):
preview = steer_text[:80] + ("..." if len(steer_text) > 80 else "")
return f"⏩ Steer queued for the active turn: {preview}"
except Exception as exc:
logger.warning("ACP steer failed for session %s: %s", state.session_id, exc)
return f"⚠️ Steer failed: {exc}"
with state.runtime_lock:
state.queued_prompts.append(steer_text)
depth = len(state.queued_prompts)
return f"No active turn — queued for the next turn. ({depth} queued)"
def _cmd_queue(self, args: str, state: SessionState) -> str:
queued_text = args.strip()
if not queued_text:
return "Usage: /queue <prompt>"
with state.runtime_lock:
state.queued_prompts.append(queued_text)
depth = len(state.queued_prompts)
return f"Queued for the next turn. ({depth} queued)"
def _cmd_version(self, args: str, state: SessionState) -> str:
return f"Hermes Agent v{HERMES_VERSION}"
+7 -46
View File
@@ -26,33 +26,6 @@ from typing import Any, Dict, List, Optional
logger = logging.getLogger(__name__)
def _win_path_to_wsl(path: str) -> str | None:
"""Convert a Windows drive path to its WSL /mnt/<drive>/... equivalent."""
match = re.match(r"^([A-Za-z]):[\\/](.*)$", path)
if not match:
return None
drive = match.group(1).lower()
tail = match.group(2).replace("\\", "/")
return f"/mnt/{drive}/{tail}"
def _translate_acp_cwd(cwd: str) -> str:
"""Translate Windows ACP cwd values when Hermes itself is running in WSL.
Windows ACP clients can launch ``hermes acp`` inside WSL while still sending
editor workspaces as Windows drive paths such as ``E:\\Projects``. Store
and execute against the WSL mount path so agents, tools, and persisted ACP
sessions all agree on the usable workspace. Native Linux/macOS keeps the
original cwd unchanged.
"""
from hermes_constants import is_wsl
if not is_wsl():
return cwd
translated = _win_path_to_wsl(str(cwd))
return translated if translated is not None else cwd
def _normalize_cwd_for_compare(cwd: str | None) -> str:
raw = str(cwd or ".").strip()
if not raw:
@@ -61,9 +34,11 @@ def _normalize_cwd_for_compare(cwd: str | None) -> str:
# Normalize Windows drive paths into the equivalent WSL mount form so
# ACP history filters match the same workspace across Windows and WSL.
translated = _win_path_to_wsl(expanded)
if translated is not None:
expanded = translated
match = re.match(r"^([A-Za-z]):[\\/](.*)$", expanded)
if match:
drive = match.group(1).lower()
tail = match.group(2).replace("\\", "/")
expanded = f"/mnt/{drive}/{tail}"
elif re.match(r"^/mnt/[A-Za-z]/", expanded):
expanded = f"/mnt/{expanded[5].lower()}/{expanded[7:]}"
@@ -121,18 +96,12 @@ def _acp_stderr_print(*args, **kwargs) -> None:
def _register_task_cwd(task_id: str, cwd: str) -> None:
"""Bind a task/session id to the editor's working directory for tools.
Zed can launch Hermes from a Windows workspace while the ACP process runs
inside WSL. In that case ACP sends cwd as e.g. ``E:\\Projects\\POTI``;
local tools need the WSL mount equivalent or subprocess creation fails
before the command can run.
"""
"""Bind a task/session id to the editor's working directory for tools."""
if not task_id:
return
try:
from tools.terminal_tool import register_task_env_overrides
register_task_env_overrides(task_id, {"cwd": _translate_acp_cwd(cwd)})
register_task_env_overrides(task_id, {"cwd": cwd})
except Exception:
logger.debug("Failed to register ACP task cwd override", exc_info=True)
@@ -176,11 +145,6 @@ class SessionState:
model: str = ""
history: List[Dict[str, Any]] = field(default_factory=list)
cancel_event: Any = None # threading.Event
is_running: bool = False
queued_prompts: List[str] = field(default_factory=list)
runtime_lock: Any = field(default_factory=Lock)
current_prompt_text: str = ""
interrupted_prompt_text: str = ""
class SessionManager:
@@ -211,7 +175,6 @@ class SessionManager:
"""Create a new session with a unique ID and a fresh AIAgent."""
import threading
cwd = _translate_acp_cwd(cwd)
session_id = str(uuid.uuid4())
agent = self._make_agent(session_id=session_id, cwd=cwd)
state = SessionState(
@@ -254,7 +217,6 @@ class SessionManager:
"""Deep-copy a session's history into a new session."""
import threading
cwd = _translate_acp_cwd(cwd)
original = self.get_session(session_id) # checks DB too
if original is None:
return None
@@ -356,7 +318,6 @@ class SessionManager:
def update_cwd(self, session_id: str, cwd: str) -> Optional[SessionState]:
"""Update the working directory for a session and its tool overrides."""
cwd = _translate_acp_cwd(cwd)
state = self.get_session(session_id) # checks DB too
if state is None:
return None
+21 -822
View File
@@ -28,11 +28,6 @@ TOOL_KIND_MAP: Dict[str, ToolKind] = {
"terminal": "execute",
"process": "execute",
"execute_code": "execute",
# Session/meta tools
"todo": "other",
"skill_view": "read",
"skills_list": "read",
"skill_manage": "edit",
# Web / fetch
"web_search": "fetch",
"web_extract": "fetch",
@@ -56,28 +51,6 @@ TOOL_KIND_MAP: Dict[str, ToolKind] = {
}
_POLISHED_TOOLS = {
# Core operator loop
"todo", "memory", "session_search", "delegate_task",
# Files / execution
"read_file", "write_file", "patch", "search_files", "terminal", "process", "execute_code",
# Skills / web / browser / media
"skill_view", "skills_list", "skill_manage", "web_search", "web_extract",
"browser_navigate", "browser_click", "browser_type", "browser_press", "browser_scroll",
"browser_back", "browser_snapshot", "browser_console", "browser_get_images", "browser_vision",
"vision_analyze", "image_generate", "text_to_speech",
# Schedulers / platform integrations
"cronjob", "send_message", "clarify", "discord", "discord_admin",
"ha_list_entities", "ha_get_state", "ha_list_services", "ha_call_service",
"feishu_doc_read", "feishu_drive_list_comments", "feishu_drive_list_comment_replies",
"feishu_drive_reply_comment", "feishu_drive_add_comment",
"kanban_create", "kanban_show", "kanban_comment", "kanban_complete",
"kanban_block", "kanban_link", "kanban_heartbeat",
"yb_query_group_info", "yb_query_group_members", "yb_search_sticker",
"yb_send_dm", "yb_send_sticker", "mixture_of_agents",
}
def get_tool_kind(tool_name: str) -> ToolKind:
"""Return the ACP ToolKind for a hermes tool, defaulting to 'other'."""
return TOOL_KIND_MAP.get(tool_name, "other")
@@ -112,645 +85,18 @@ def build_tool_title(tool_name: str, args: Dict[str, Any]) -> str:
if urls:
return f"extract: {urls[0]}" + (f" (+{len(urls)-1})" if len(urls) > 1 else "")
return "web extract"
if tool_name == "process":
action = str(args.get("action") or "").strip() or "manage"
sid = str(args.get("session_id") or "").strip()
return f"process {action}: {sid}" if sid else f"process {action}"
if tool_name == "delegate_task":
tasks = args.get("tasks")
if isinstance(tasks, list) and tasks:
return f"delegate batch ({len(tasks)} tasks)"
goal = args.get("goal", "")
if goal and len(goal) > 60:
goal = goal[:57] + "..."
return f"delegate: {goal}" if goal else "delegate task"
if tool_name == "session_search":
query = str(args.get("query") or "").strip()
return f"session search: {query}" if query else "recent sessions"
if tool_name == "memory":
action = str(args.get("action") or "manage").strip() or "manage"
target = str(args.get("target") or "memory").strip() or "memory"
return f"memory {action}: {target}"
if tool_name == "execute_code":
code = str(args.get("code") or "").strip()
first_line = next((line.strip() for line in code.splitlines() if line.strip()), "")
if first_line:
if len(first_line) > 70:
first_line = first_line[:67] + "..."
return f"python: {first_line}"
return "python code"
if tool_name == "todo":
items = args.get("todos")
if isinstance(items, list):
return f"todo ({len(items)} item{'s' if len(items) != 1 else ''})"
return "todo"
if tool_name == "skill_view":
name = str(args.get("name") or "?").strip() or "?"
file_path = str(args.get("file_path") or "").strip()
suffix = f"/{file_path}" if file_path else ""
return f"skill view ({name}{suffix})"
if tool_name == "skills_list":
category = str(args.get("category") or "").strip()
return f"skills list ({category})" if category else "skills list"
if tool_name == "skill_manage":
action = str(args.get("action") or "manage").strip() or "manage"
name = str(args.get("name") or "?").strip() or "?"
file_path = str(args.get("file_path") or "").strip()
target = f"{name}/{file_path}" if file_path else name
if len(target) > 64:
target = target[:61] + "..."
return f"skill {action}: {target}"
if tool_name == "browser_navigate":
return f"navigate: {args.get('url', '?')}"
if tool_name == "browser_snapshot":
return "browser snapshot"
if tool_name == "browser_vision":
return f"browser vision: {str(args.get('question', '?'))[:50]}"
if tool_name == "browser_get_images":
return "browser images"
return "execute code"
if tool_name == "vision_analyze":
return f"analyze image: {str(args.get('question', '?'))[:50]}"
if tool_name == "image_generate":
prompt = str(args.get("prompt") or args.get("description") or "").strip()
return f"generate image: {prompt[:50]}" if prompt else "generate image"
if tool_name == "cronjob":
action = str(args.get("action") or "manage").strip() or "manage"
job_id = str(args.get("job_id") or args.get("id") or "").strip()
return f"cron {action}: {job_id}" if job_id else f"cron {action}"
return f"analyze image: {args.get('question', '?')[:50]}"
return tool_name
def _text(content: str) -> Any:
return acp.tool_content(acp.text_block(content))
def _json_loads_maybe(value: Optional[str]) -> Any:
if not isinstance(value, str):
return value
try:
return json.loads(value)
except Exception:
pass
# Some Hermes tools append a human hint after a JSON payload, e.g.
# ``{...}\n\n[Hint: Results truncated...]``. Keep the structured rendering path
# by decoding the first JSON value instead of falling back to raw text.
try:
decoded, _ = json.JSONDecoder().raw_decode(value.lstrip())
return decoded
except Exception:
return None
def _truncate_text(text: str, limit: int = 5000) -> str:
if len(text) <= limit:
return text
return text[: max(0, limit - 100)] + f"\n... ({len(text)} chars total, truncated)"
def _fenced_text(text: str, language: str = "") -> str:
"""Return a Markdown fence that cannot be broken by backticks in text."""
longest = max((len(run) for run in text.split("`")[1::2]), default=0)
fence = "`" * max(3, longest + 1)
return f"{fence}{language}\n{text}\n{fence}"
def _format_todo_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict) or not isinstance(data.get("todos"), list):
return None
summary = data.get("summary") if isinstance(data.get("summary"), dict) else {}
icon = {
"completed": "",
"in_progress": "🔄",
"pending": "",
"cancelled": "",
}
lines = ["**Todo list**", ""]
for item in data["todos"]:
if not isinstance(item, dict):
continue
status = str(item.get("status") or "pending")
content = str(item.get("content") or item.get("id") or "").strip()
if content:
lines.append(f"- {icon.get(status, '')} {content}")
if summary:
cancelled = summary.get("cancelled", 0)
lines.extend([
"",
"**Progress:** "
f"{summary.get('completed', 0)} completed, "
f"{summary.get('in_progress', 0)} in progress, "
f"{summary.get('pending', 0)} pending"
+ (f", {cancelled} cancelled" if cancelled else ""),
])
return "\n".join(lines)
def _format_read_file_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("error") and not data.get("content"):
return f"Read failed: {data.get('error')}"
content = data.get("content")
if not isinstance(content, str):
return None
path = str((args or {}).get("path") or data.get("path") or "file").strip()
offset = (args or {}).get("offset")
limit = (args or {}).get("limit")
range_bits = []
if offset:
range_bits.append(f"from line {offset}")
if limit:
range_bits.append(f"limit {limit}")
suffix = f" ({', '.join(range_bits)})" if range_bits else ""
header = f"Read {path}{suffix}"
if data.get("total_lines") is not None:
header += f"{data.get('total_lines')} total lines"
# Hermes read_file output is line-numbered with `|`. If we send it as raw
# Markdown, Zed can interpret pipes as tables and collapse the layout.
# Fence the payload so file lines stay readable and literal.
return _truncate_text(f"{header}\n\n{_fenced_text(content)}")
def _format_search_files_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
matches = data.get("matches")
if not isinstance(matches, list):
return None
total = data.get("total_count", len(matches))
shown = min(len(matches), 12)
truncated = bool(data.get("truncated")) or len(matches) > shown
lines = [
"Search results",
f"Found {total} match{'es' if total != 1 else ''}; showing {shown}.",
"",
]
for match in matches[:shown]:
if not isinstance(match, dict):
lines.append(f"- {match}")
continue
path = str(match.get("path") or match.get("file") or match.get("filename") or "?")
line = match.get("line") or match.get("line_number")
content = str(match.get("content") or match.get("text") or "").strip()
loc = f"{path}:{line}" if line else path
lines.append(f"- {loc}")
if content:
snippet = _truncate_text(" ".join(content.split()), 300)
lines.append(f" {snippet}")
if truncated:
lines.extend([
"",
"Results truncated. Narrow the search, add file_glob, or use offset to page.",
])
return _truncate_text("\n".join(lines), limit=7000)
def _format_execute_code_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return result if isinstance(result, str) and result.strip() else None
output = str(data.get("output") or "")
error = str(data.get("error") or "")
exit_code = data.get("exit_code")
parts = [f"Exit code: {exit_code}" if exit_code is not None else "Execution complete"]
if output:
parts.extend(["", "Output:", output])
if error:
parts.extend(["", "Error:", error])
return _truncate_text("\n".join(parts))
def _extract_markdown_headings(content: str, limit: int = 8) -> list[str]:
headings: list[str] = []
for line in content.splitlines():
stripped = line.strip()
if stripped.startswith("#"):
heading = stripped.lstrip("#").strip()
if heading:
headings.append(heading)
if len(headings) >= limit:
break
return headings
def _format_skill_view_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("success") is False:
return f"Skill view failed: {data.get('error', 'unknown error')}"
name = str(data.get("name") or "skill")
file_path = str(data.get("file") or data.get("path") or "SKILL.md")
description = str(data.get("description") or "").strip()
content = str(data.get("content") or "")
linked = data.get("linked_files") if isinstance(data.get("linked_files"), dict) else None
lines = ["**Skill loaded**", "", f"- **Name:** `{name}`", f"- **File:** `{file_path}`"]
if description:
lines.append(f"- **Description:** {description}")
if content:
lines.append(f"- **Content:** {len(content):,} chars loaded into agent context")
if linked:
linked_count = sum(len(v) for v in linked.values() if isinstance(v, list))
lines.append(f"- **Linked files:** {linked_count}")
headings = _extract_markdown_headings(content)
if headings:
lines.extend(["", "**Sections**"])
lines.extend(f"- {heading}" for heading in headings)
lines.extend([
"",
"_Full skill content is available to the agent but hidden here to keep ACP readable._",
])
return "\n".join(lines)
def _format_skill_manage_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
action = str((args or {}).get("action") or "manage").strip() or "manage"
name = str((args or {}).get("name") or data.get("name") or "skill").strip() or "skill"
file_path = str((args or {}).get("file_path") or data.get("file_path") or "SKILL.md").strip() or "SKILL.md"
success = data.get("success")
status = "✅ Skill updated" if success is not False else "✗ Skill update failed"
lines = [f"**{status}**", "", f"- **Action:** `{action}`", f"- **Skill:** `{name}`"]
if action not in {"delete"}:
lines.append(f"- **File:** `{file_path}`")
message = str(data.get("message") or data.get("error") or "").strip()
if message:
lines.append(f"- **Result:** {message}")
replacements = data.get("replacements") or data.get("replacement_count")
if replacements is not None:
lines.append(f"- **Replacements:** {replacements}")
path = str(data.get("path") or "").strip()
if path:
lines.append(f"- **Path:** `{path}`")
return "\n".join(lines)
def _format_web_search_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
web = data.get("data", {}).get("web") if isinstance(data.get("data"), dict) else data.get("web")
if not isinstance(web, list):
return None
lines = [f"Web results: {len(web)}"]
for item in web[:10]:
if not isinstance(item, dict):
continue
title = str(item.get("title") or item.get("url") or "result").strip()
url = str(item.get("url") or "").strip()
desc = str(item.get("description") or "").strip()
lines.append(f"{title}" + (f"{url}" if url else ""))
if desc:
lines.append(f" {desc}")
return _truncate_text("\n".join(lines))
def _format_web_extract_result(result: Optional[str]) -> Optional[str]:
"""Return only web_extract errors for ACP; success stays compact via title."""
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("success") is False and data.get("error"):
return f"Web extract failed: {data.get('error')}"
results = data.get("results")
if not isinstance(results, list):
return None
failures: list[str] = []
for item in results[:10]:
if not isinstance(item, dict):
continue
error = str(item.get("error") or "").strip()
if not error or error in {"None", "null"}:
continue
url = str(item.get("url") or "").strip()
title = str(item.get("title") or url or "Untitled").strip()
failures.append(
f"- {title}" + (f"{url}" if url and url != title else "") + f"\n Error: {_truncate_text(error, limit=500)}"
)
if not failures:
return None
lines = [f"Web extract failed for {len(failures)} URL{'s' if len(failures) != 1 else ''}"]
lines.extend(failures)
return "\n".join(lines)
def _format_process_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return result if isinstance(result, str) and result.strip() else None
if data.get("success") is False and data.get("error"):
return f"Process error: {data.get('error')}"
action = str((args or {}).get("action") or "process").strip() or "process"
if isinstance(data.get("processes"), list):
processes = data["processes"]
lines = [f"Processes: {len(processes)}"]
for proc in processes[:20]:
if not isinstance(proc, dict):
lines.append(f"- {proc}")
continue
sid = str(proc.get("session_id") or proc.get("id") or "?")
status = str(proc.get("status") or ("exited" if proc.get("exited") else "running"))
cmd = str(proc.get("command") or "").strip()
pid = proc.get("pid")
code = proc.get("exit_code")
bits = [status]
if pid is not None:
bits.append(f"pid {pid}")
if code is not None:
bits.append(f"exit {code}")
lines.append(f"- `{sid}` — {', '.join(bits)}" + (f"{cmd[:120]}" if cmd else ""))
if len(processes) > 20:
lines.append(f"... {len(processes) - 20} more process(es)")
return "\n".join(lines)
status = str(data.get("status") or data.get("state") or action).strip()
sid = str(data.get("session_id") or (args or {}).get("session_id") or "").strip()
lines = [f"Process {action}: {status}" + (f" (`{sid}`)" if sid else "")]
for key, label in (("command", "Command"), ("pid", "PID"), ("exit_code", "Exit code"), ("returncode", "Exit code"), ("lines", "Lines")):
if data.get(key) is not None:
lines.append(f"- **{label}:** {data.get(key)}")
output = data.get("output") or data.get("new_output") or data.get("log") or data.get("stdout")
error = data.get("error") or data.get("stderr")
if output:
lines.extend(["", "Output:", _truncate_text(str(output), limit=5000)])
if error:
lines.extend(["", "Error:", _truncate_text(str(error), limit=2000)])
msg = data.get("message")
if msg and not output and not error:
lines.append(str(msg))
return _truncate_text("\n".join(lines), limit=7000)
def _format_delegate_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("error") and not isinstance(data.get("results"), list):
return f"Delegation failed: {data.get('error')}"
results = data.get("results")
if not isinstance(results, list):
return None
total = data.get("total_duration_seconds")
lines = [f"Delegation results: {len(results)} task{'s' if len(results) != 1 else ''}" + (f" in {total}s" if total is not None else "")]
icon = {"completed": "", "failed": "", "error": "", "timeout": "", "interrupted": ""}
for item in results:
if not isinstance(item, dict):
lines.append(f"- {item}")
continue
idx = item.get("task_index")
status = str(item.get("status") or "unknown")
model = item.get("model")
dur = item.get("duration_seconds")
role = item.get("_child_role")
header = f"{icon.get(status, '')} Task {idx + 1 if isinstance(idx, int) else '?'}: {status}"
bits = []
if model:
bits.append(str(model))
if role:
bits.append(f"role={role}")
if dur is not None:
bits.append(f"{dur}s")
if bits:
header += " (" + ", ".join(bits) + ")"
lines.extend(["", header])
summary = str(item.get("summary") or "").strip()
error = str(item.get("error") or "").strip()
if summary:
lines.append(_truncate_text(summary, limit=1200))
if error:
lines.append("Error: " + _truncate_text(error, limit=800))
trace = item.get("tool_trace")
if isinstance(trace, list) and trace:
names = [str(t.get("tool") or "?") for t in trace if isinstance(t, dict)]
if names:
lines.append("Tools: " + ", ".join(names[:12]) + (f" (+{len(names)-12})" if len(names) > 12 else ""))
return _truncate_text("\n".join(lines), limit=8000)
def _format_session_search_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("success") is False:
return f"Session search failed: {data.get('error', 'unknown error')}"
results = data.get("results")
if not isinstance(results, list):
return None
mode = data.get("mode") or "search"
query = data.get("query")
lines = ["Recent sessions" if mode == "recent" else f"Session search results" + (f" for `{query}`" if query else "")]
if not results:
lines.append(str(data.get("message") or "No matching sessions found."))
return "\n".join(lines)
for item in results:
if not isinstance(item, dict):
continue
sid = str(item.get("session_id") or "?")
title = str(item.get("title") or item.get("when") or "Untitled session").strip()
when = str(item.get("last_active") or item.get("started_at") or item.get("when") or "").strip()
count = item.get("message_count")
source = str(item.get("source") or "").strip()
meta = ", ".join(str(x) for x in [when, source, f"{count} msgs" if count is not None else ""] if x)
lines.append(f"- **{title}** (`{sid}`)" + (f"{meta}" if meta else ""))
summary = str(item.get("summary") or item.get("preview") or "").strip()
if summary:
lines.append(" " + _truncate_text(" ".join(summary.split()), limit=500))
return _truncate_text("\n".join(lines), limit=7000)
def _format_memory_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
action = str((args or {}).get("action") or "memory").strip() or "memory"
target = str(data.get("target") or (args or {}).get("target") or "memory")
if data.get("success") is False:
lines = [f"✗ Memory {action} failed ({target})", str(data.get("error") or "unknown error")]
matches = data.get("matches")
if isinstance(matches, list) and matches:
lines.append("Matches:")
lines.extend(f"- {_truncate_text(str(m), 160)}" for m in matches[:5])
return "\n".join(lines)
lines = [f"✅ Memory {action} saved ({target})"]
if data.get("message"):
lines.append(str(data.get("message")))
if data.get("entry_count") is not None:
lines.append(f"Entries: {data.get('entry_count')}")
if data.get("usage"):
lines.append(f"Usage: {data.get('usage')}")
# Avoid dumping all memory entries into ACP UI; show only the explicit new value preview.
preview = str((args or {}).get("content") or (args or {}).get("old_text") or "").strip()
if preview:
lines.append("Preview: " + _truncate_text(preview, limit=300))
return "\n".join(lines)
def _format_edit_result(tool_name: str, result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
path = str((args or {}).get("path") or "file").strip()
if isinstance(data, dict):
if data.get("success") is False or data.get("error"):
return f"{tool_name} failed for {path}: {data.get('error', 'unknown error')}"
message = str(data.get("message") or "").strip()
replacements = data.get("replacements") or data.get("replacement_count")
lines = [f"{tool_name} completed" + (f" for `{path}`" if path else "")]
if message:
lines.append(message)
if replacements is not None:
lines.append(f"Replacements: {replacements}")
if data.get("files_modified"):
files = data.get("files_modified")
if isinstance(files, list):
lines.append("Files: " + ", ".join(f"`{f}`" for f in files[:8]))
return "\n".join(lines)
if isinstance(result, str) and result.strip():
return _truncate_text(result, limit=3000)
return f"{tool_name} completed" + (f" for `{path}`" if path else "")
def _format_browser_result(tool_name: str, result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return result if isinstance(result, str) and result.strip() else None
if data.get("success") is False or data.get("error"):
return f"{tool_name} failed: {data.get('error', 'unknown error')}"
if tool_name == "browser_get_images":
images = data.get("images") or data.get("data")
if isinstance(images, list):
lines = [f"Images found: {len(images)}"]
for img in images[:12]:
if isinstance(img, dict):
alt = str(img.get("alt") or "").strip()
url = str(img.get("url") or img.get("src") or "").strip()
lines.append(f"- {alt or 'image'}" + (f"{url}" if url else ""))
return _truncate_text("\n".join(lines), limit=5000)
title = str(data.get("title") or data.get("url") or data.get("status") or tool_name)
text = str(data.get("text") or data.get("content") or data.get("snapshot") or data.get("analysis") or data.get("message") or "").strip()
lines = [title]
if data.get("url") and data.get("url") != title:
lines.append(str(data.get("url")))
if text:
lines.extend(["", _truncate_text(text, limit=5000)])
return _truncate_text("\n".join(lines), limit=7000)
def _format_media_or_cron_result(tool_name: str, result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return result if isinstance(result, str) and result.strip() else None
if data.get("success") is False or data.get("error"):
return f"{tool_name} failed: {data.get('error', 'unknown error')}"
lines = [f"{tool_name} completed"]
for key in ("file_path", "path", "url", "image_url", "job_id", "id", "status", "message", "next_run"):
if data.get(key):
lines.append(f"- **{key}:** {data.get(key)}")
return "\n".join(lines)
def _format_generic_structured_result(tool_name: str, result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, (dict, list)):
return result if isinstance(result, str) and result.strip() else None
if isinstance(data, list):
lines = [f"{tool_name}: {len(data)} item{'s' if len(data) != 1 else ''}"]
for item in data[:12]:
lines.append(f"- {_truncate_text(str(item), limit=240)}")
return _truncate_text("\n".join(lines), limit=5000)
if data.get("success") is False or data.get("error"):
return f"{tool_name} failed: {data.get('error', 'unknown error')}"
lines = [f"{tool_name} completed" if data.get("success") is True else f"{tool_name} result"]
priority_keys = (
"message", "status", "id", "task_id", "issue_id", "title", "name", "entity_id",
"state", "service", "url", "path", "file_path", "count", "total", "next_run",
)
seen = set()
for key in priority_keys:
value = data.get(key)
if value in (None, "", [], {}):
continue
seen.add(key)
lines.append(f"- **{key}:** {_truncate_text(str(value), limit=500)}")
for key, value in data.items():
if key in seen or key in {"success", "raw", "content", "entries"}:
continue
if value in (None, "", [], {}):
continue
if isinstance(value, (dict, list)):
preview = json.dumps(value, ensure_ascii=False, default=str)
else:
preview = str(value)
lines.append(f"- **{key}:** {_truncate_text(preview, limit=500)}")
if len(lines) >= 14:
break
content = data.get("content")
if isinstance(content, str) and content.strip():
lines.extend(["", _truncate_text(content.strip(), limit=1500)])
return _truncate_text("\n".join(lines), limit=7000)
def _build_polished_completion_content(
tool_name: str,
result: Optional[str],
function_args: Optional[Dict[str, Any]],
) -> Optional[List[Any]]:
formatter = {
"todo": lambda: _format_todo_result(result),
"read_file": lambda: _format_read_file_result(result, function_args),
"write_file": lambda: _format_edit_result(tool_name, result, function_args),
"patch": lambda: _format_edit_result(tool_name, result, function_args),
"search_files": lambda: _format_search_files_result(result),
"execute_code": lambda: _format_execute_code_result(result),
"process": lambda: _format_process_result(result, function_args),
"delegate_task": lambda: _format_delegate_result(result),
"session_search": lambda: _format_session_search_result(result),
"memory": lambda: _format_memory_result(result, function_args),
"skill_view": lambda: _format_skill_view_result(result),
"skill_manage": lambda: _format_skill_manage_result(result, function_args),
"web_search": lambda: _format_web_search_result(result),
"web_extract": lambda: _format_web_extract_result(result),
"browser_navigate": lambda: _format_browser_result(tool_name, result, function_args),
"browser_snapshot": lambda: _format_browser_result(tool_name, result, function_args),
"browser_vision": lambda: _format_browser_result(tool_name, result, function_args),
"browser_get_images": lambda: _format_browser_result(tool_name, result, function_args),
"vision_analyze": lambda: _format_media_or_cron_result(tool_name, result),
"image_generate": lambda: _format_media_or_cron_result(tool_name, result),
"cronjob": lambda: _format_media_or_cron_result(tool_name, result),
}.get(tool_name)
if formatter is None and tool_name in _POLISHED_TOOLS:
formatter = lambda: _format_generic_structured_result(tool_name, result)
if formatter is None:
return None
text = formatter()
if not text:
return None
return [_text(text)]
def _build_patch_mode_content(patch_text: str) -> List[Any]:
"""Parse V4A patch mode input into ACP diff blocks when possible."""
if not patch_text:
@@ -912,11 +258,7 @@ def _build_tool_complete_content(
except Exception:
pass
polished_content = _build_polished_completion_content(tool_name, result, function_args)
if polished_content:
return polished_content
return [_text(display_result)]
return [acp.tool_content(acp.text_block(display_result))]
# ---------------------------------------------------------------------------
@@ -946,6 +288,7 @@ def build_tool_start(
content = _build_patch_mode_content(patch_text)
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
)
if tool_name == "write_file":
@@ -954,172 +297,32 @@ def build_tool_start(
content = [acp.tool_diff_content(path=path, new_text=file_content)]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
)
if tool_name == "terminal":
command = arguments.get("command", "")
content = [_text(f"$ {command}")]
content = [acp.tool_content(acp.text_block(f"$ {command}"))]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
)
if tool_name == "read_file":
# The title and location already identify the file. Sending a synthetic
# "Reading ..." content block makes Zed render an unhelpful Output
# section before the real file contents arrive on completion.
path = arguments.get("path", "")
content = [acp.tool_content(acp.text_block(f"Reading {path}"))]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=None, locations=locations,
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
)
if tool_name == "search_files":
pattern = arguments.get("pattern", "")
target = arguments.get("target", "content")
search_path = arguments.get("path")
where = f" in {search_path}" if search_path else ""
content = [_text(f"Searching for '{pattern}' ({target}){where}")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "todo":
items = arguments.get("todos")
if isinstance(items, list):
preview_lines = ["Updating todo list", ""]
for item in items[:8]:
if isinstance(item, dict):
preview_lines.append(f"- {item.get('status', 'pending')}: {item.get('content', item.get('id', ''))}")
if len(items) > 8:
preview_lines.append(f"... {len(items) - 8} more")
content = [_text("\n".join(preview_lines))]
else:
content = [_text("Reading todo list")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "skill_view":
name = str(arguments.get("name") or "?").strip() or "?"
file_path = str(arguments.get("file_path") or "SKILL.md").strip() or "SKILL.md"
content = [_text(f"Loading skill '{name}' ({file_path})")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "skill_manage":
action = str(arguments.get("action") or "manage").strip() or "manage"
name = str(arguments.get("name") or "?").strip() or "?"
file_path = str(arguments.get("file_path") or "SKILL.md").strip() or "SKILL.md"
path = f"skills/{name}/{file_path}" if file_path else f"skills/{name}"
if action == "patch":
old = str(arguments.get("old_string") or "")
new = str(arguments.get("new_string") or "")
content = [acp.tool_diff_content(path=path, old_text=old or None, new_text=new)]
elif action in {"edit", "create"}:
content = [
acp.tool_diff_content(
path=path,
new_text=str(arguments.get("content") or ""),
)
]
elif action == "write_file":
target = str(arguments.get("file_path") or "file")
content = [
acp.tool_diff_content(
path=f"skills/{name}/{target}",
new_text=str(arguments.get("file_content") or ""),
)
]
elif action in {"delete", "remove_file"}:
target = str(arguments.get("file_path") or file_path or name)
content = [_text(f"Removing {target} from skill '{name}'")]
else:
content = [_text(f"Running skill_manage action '{action}' on skill '{name}' ({file_path})")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "execute_code":
code = str(arguments.get("code") or "").strip()
preview = code[:1200] + (f"\n... ({len(code)} chars total, truncated)" if len(code) > 1200 else "")
content = [_text(f"Running Python helper script:\n\n```python\n{preview}\n```" if preview else "Running Python helper script")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "web_search":
query = str(arguments.get("query") or "").strip()
content = [_text(f"Searching the web for: {query}" if query else "Searching the web")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "web_extract":
# The title identifies the URL(s). Avoid a duplicate content block so
# Zed renders this like read_file: compact start, concise completion.
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=None, locations=locations,
)
if tool_name == "process":
action = str(arguments.get("action") or "").strip() or "manage"
sid = str(arguments.get("session_id") or "").strip()
data_preview = str(arguments.get("data") or "").strip()
text = f"Process action: {action}" + (f"\nSession: {sid}" if sid else "")
if data_preview:
text += "\nInput: " + _truncate_text(data_preview, limit=500)
content = [_text(text)]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "delegate_task":
tasks = arguments.get("tasks")
if isinstance(tasks, list) and tasks:
lines = [f"Delegating {len(tasks)} tasks", ""]
for i, task in enumerate(tasks[:8], 1):
if isinstance(task, dict):
goal = str(task.get("goal") or "").strip()
role = str(task.get("role") or "").strip()
lines.append(f"{i}. " + _truncate_text(goal, limit=160) + (f" ({role})" if role else ""))
if len(tasks) > 8:
lines.append(f"... {len(tasks) - 8} more")
content = [_text("\n".join(lines))]
else:
goal = str(arguments.get("goal") or "").strip()
content = [_text("Delegating task" + (f":\n{_truncate_text(goal, limit=800)}" if goal else ""))]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "session_search":
query = str(arguments.get("query") or "").strip()
content = [_text(f"Searching past sessions for: {query}" if query else "Loading recent sessions")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "memory":
action = str(arguments.get("action") or "manage").strip() or "manage"
target = str(arguments.get("target") or "memory").strip() or "memory"
preview = str(arguments.get("content") or arguments.get("old_text") or "").strip()
text = f"Memory {action} ({target})"
if preview:
text += "\nPreview: " + _truncate_text(preview, limit=500)
content = [_text(text)]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name in _POLISHED_TOOLS:
try:
args_text = json.dumps(arguments, indent=2, default=str)
except (TypeError, ValueError):
args_text = str(arguments)
content = [_text(_truncate_text(args_text, limit=1200))]
content = [acp.tool_content(acp.text_block(f"Searching for '{pattern}' ({target})"))]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
)
# Generic fallback
@@ -1131,7 +334,7 @@ def build_tool_start(
content = [acp.tool_content(acp.text_block(args_text))]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=None if tool_name in _POLISHED_TOOLS else arguments,
raw_input=arguments,
)
@@ -1144,22 +347,18 @@ def build_tool_complete(
) -> ToolCallProgress:
"""Create a ToolCallUpdate (progress) event for a completed tool call."""
kind = get_tool_kind(tool_name)
if tool_name == "web_extract":
error_text = _format_web_extract_result(result)
content = [_text(error_text)] if error_text else None
else:
content = _build_tool_complete_content(
tool_name,
result,
function_args=function_args,
snapshot=snapshot,
)
content = _build_tool_complete_content(
tool_name,
result,
function_args=function_args,
snapshot=snapshot,
)
return acp.update_tool_call(
tool_call_id,
kind=kind,
status="completed",
content=content,
raw_output=None if tool_name in _POLISHED_TOOLS else result,
raw_output=result,
)
+34 -289
View File
@@ -20,27 +20,12 @@ from pathlib import Path
from hermes_constants import get_hermes_home
from typing import Any, Dict, List, Optional, Tuple
from utils import base_url_host_matches, normalize_proxy_env_vars
from utils import normalize_proxy_env_vars
# NOTE: `import anthropic` is deliberately NOT at module top — the SDK pulls
# ~220 ms of imports (anthropic.types, anthropic.lib.tools._beta_runner, etc.)
# and the 3 usage sites (build_anthropic_client, build_anthropic_bedrock_client,
# read_claude_code_credentials_from_keychain) are all on cold user-triggered
# paths. Access via the `_get_anthropic_sdk()` accessor below, which caches
# the module after the first call and returns None on ImportError.
_anthropic_sdk: Any = ... # sentinel — None means "tried and missing"
def _get_anthropic_sdk():
"""Return the ``anthropic`` SDK module, importing lazily. None if not installed."""
global _anthropic_sdk
if _anthropic_sdk is ...:
try:
import anthropic as _sdk
_anthropic_sdk = _sdk
except ImportError:
_anthropic_sdk = None
return _anthropic_sdk
try:
import anthropic as _anthropic_sdk
except ImportError:
_anthropic_sdk = None # type: ignore[assignment]
logger = logging.getLogger(__name__)
@@ -76,7 +61,6 @@ _ADAPTIVE_THINKING_SUBSTRINGS = ("4-6", "4.6", "4-7", "4.7")
# Models where temperature/top_p/top_k return 400 if set to non-default values.
# This is the Opus 4.7 contract; future 4.x+ models are expected to follow it.
_NO_SAMPLING_PARAMS_SUBSTRINGS = ("4-7", "4.7")
_FAST_MODE_SUPPORTED_SUBSTRINGS = ("opus-4-6", "opus-4.6")
# ── Max output token limits per Anthropic model ───────────────────────
# Source: Anthropic docs + Cline model catalog. Anthropic's API requires
@@ -106,9 +90,6 @@ _ANTHROPIC_OUTPUT_LIMITS = {
"claude-3-haiku": 4_096,
# Third-party Anthropic-compatible providers
"minimax": 131_072,
# Qwen models via DashScope Anthropic-compatible endpoint
# DashScope enforces max_tokens ∈ [1, 65536]
"qwen3": 65_536,
}
# For any model not in the table, assume the highest current limit.
@@ -220,45 +201,20 @@ def _forbids_sampling_params(model: str) -> bool:
return any(v in model for v in _NO_SAMPLING_PARAMS_SUBSTRINGS)
def _supports_fast_mode(model: str) -> bool:
"""Return True for models that support Anthropic Fast Mode (speed=fast).
Per Anthropic docs, fast mode is currently supported on Opus 4.6 only.
Sending ``speed: "fast"`` to any other Claude model (including Opus 4.7)
returns HTTP 400. This guard prevents silently 400'ing when stale config
or older callers leave fast mode enabled across a model upgrade.
"""
return any(v in model for v in _FAST_MODE_SUPPORTED_SUBSTRINGS)
# Beta headers for enhanced features (sent with ALL auth types).
# As of Opus 4.7 (2026-04-16), the first two are GA on Claude 4.6+ — the
# As of Opus 4.7 (2026-04-16), both of these are GA on Claude 4.6+ — the
# beta headers are still accepted (harmless no-op) but not required. Kept
# here so older Claude (4.5, 4.1) + third-party Anthropic-compat endpoints
# that still gate on the headers continue to get the enhanced features.
#
# ``context-1m-2025-08-07`` unlocks the 1M context window on Claude Opus 4.6/4.7
# and Sonnet 4.6 when served via AWS Bedrock or Azure AI Foundry. 1M is GA on
# native Anthropic (api.anthropic.com) for Opus 4.6+, but Bedrock/Azure still
# gate it behind this beta header as of 2026-04 — without it Bedrock caps Opus
# at 200K even though model_metadata.py advertises 1M. The header is a harmless
# no-op on endpoints where 1M is GA.
#
# Migration guide: remove these if you no longer support ≤4.5 models or once
# Bedrock/Azure promote 1M to GA.
# Migration guide: remove these if you no longer support ≤4.5 models.
_COMMON_BETAS = [
"interleaved-thinking-2025-05-14",
"fine-grained-tool-streaming-2025-05-14",
"context-1m-2025-08-07",
]
# MiniMax's Anthropic-compatible endpoints fail tool-use requests when
# the fine-grained tool streaming beta is present. Omit it so tool calls
# fall back to the provider's default response path.
_TOOL_STREAMING_BETA = "fine-grained-tool-streaming-2025-05-14"
# 1M context beta — see comment on _COMMON_BETAS above. Stripped for
# Bearer-auth (MiniMax) endpoints since they host their own models and
# unknown Anthropic beta headers risk request rejection.
_CONTEXT_1M_BETA = "context-1m-2025-08-07"
# Fast mode beta — enables the ``speed: "fast"`` request parameter for
# significantly higher output token throughput on Opus 4.6 (~2.5x).
@@ -380,88 +336,6 @@ def _is_kimi_coding_endpoint(base_url: str | None) -> bool:
return normalized.rstrip("/").lower().startswith("https://api.kimi.com/coding")
# Model-name prefixes that identify the Kimi / Moonshot family. Covers
# - official slugs: ``kimi-k2.5``, ``kimi_thinking``, ``moonshot-v1-8k``
# - common release lines: ``k1.5-...``, ``k2-thinking``, ``k25-...``, ``k2.5-...``
# Matched case-insensitively against the post-``normalize_model_name`` form,
# so a caller's ``provider/vendor/model`` slug is handled the same as a
# bare name.
_KIMI_FAMILY_MODEL_PREFIXES = (
"kimi-", "kimi_",
"moonshot-", "moonshot_",
"k1.", "k1-",
"k2.", "k2-",
"k25", "k2.5",
)
def _model_name_is_kimi_family(model: str | None) -> bool:
if not isinstance(model, str):
return False
m = model.strip().lower()
if not m:
return False
# Strip vendor prefix (e.g. ``moonshotai/kimi-k2.5`` → ``kimi-k2.5``)
if "/" in m:
m = m.rsplit("/", 1)[-1]
return m.startswith(_KIMI_FAMILY_MODEL_PREFIXES)
def _is_kimi_family_endpoint(base_url: str | None, model: str | None = None) -> bool:
"""Return True for any Kimi / Moonshot Anthropic-Messages-speaking endpoint.
Broader than ``_is_kimi_coding_endpoint`` — matches:
- Kimi's official ``/coding`` URL (legacy check, preserved)
- Any ``api.kimi.com`` / ``moonshot.ai`` / ``moonshot.cn`` host
- Custom or proxied endpoints whose *model* name is in the Kimi / Moonshot
family (``kimi-*``, ``moonshot-*``, ``k1.*``, ``k2.*``, …). Users with
``api_mode: anthropic_messages`` on a private gateway fronting Kimi
fall into this branch — the upstream still enforces Kimi's thinking
semantics (reasoning_content required on every replayed tool-call
message) regardless of the gateway's hostname.
Used to decide whether to drop Anthropic's ``thinking`` kwarg and to
preserve unsigned reasoning_content-derived thinking blocks on replay.
See hermes-agent#13848, #17057.
"""
if _is_kimi_coding_endpoint(base_url):
return True
for _domain in ("api.kimi.com", "moonshot.ai", "moonshot.cn"):
if base_url_host_matches(base_url or "", _domain):
return True
if _model_name_is_kimi_family(model):
return True
return False
def _is_deepseek_anthropic_endpoint(base_url: str | None) -> bool:
"""Return True for DeepSeek's Anthropic-compatible endpoint.
DeepSeek's ``/anthropic`` route speaks the Anthropic Messages protocol
but, when thinking mode is enabled, requires the ``thinking`` blocks
from prior assistant turns to round-trip on subsequent requests — the
generic third-party path strips them and triggers HTTP 400::
The content[].thinking in the thinking mode must be passed back
to the API.
Per DeepSeek's published compatibility matrix the blocks are unsigned
(no Anthropic-proprietary signature, no ``redacted_thinking`` support),
so this endpoint is handled with the same strip-signed / keep-unsigned
policy used for Kimi's ``/coding`` endpoint. The match is pinned to
the ``/anthropic`` path so the OpenAI-compatible ``api.deepseek.com``
base URL (which never reaches this adapter) is not misclassified.
See hermes-agent#16748.
"""
if not base_url_host_matches(base_url or "", "api.deepseek.com"):
return False
normalized = _normalize_base_url_text(base_url)
if not normalized:
return False
return "/anthropic" in normalized.rstrip("/").lower()
def _requires_bearer_auth(base_url: str | None) -> bool:
"""Return True for Anthropic-compatible providers that require Bearer auth.
@@ -476,45 +350,20 @@ def _requires_bearer_auth(base_url: str | None) -> bool:
return normalized.startswith(("https://api.minimax.io/anthropic", "https://api.minimaxi.com/anthropic"))
def _common_betas_for_base_url(
base_url: str | None,
*,
drop_context_1m_beta: bool = False,
) -> list[str]:
def _common_betas_for_base_url(base_url: str | None) -> list[str]:
"""Return the beta headers that are safe for the configured endpoint.
MiniMax's Anthropic-compatible endpoints (Bearer-auth) reject requests
that include Anthropic's ``fine-grained-tool-streaming`` beta — every
tool-use message triggers a connection error. Strip that beta for
Bearer-auth endpoints while keeping all other betas intact.
The ``context-1m-2025-08-07`` beta is also stripped for Bearer-auth
endpoints — MiniMax hosts its own models, not Claude, so the header is
irrelevant at best and risks request rejection at worst.
``drop_context_1m_beta=True`` additionally strips the 1M-context beta on
otherwise-unrelated endpoints. The OAuth retry path flips this flag after
a subscription rejects the beta with
"The long context beta is not yet available for this subscription" so
subsequent requests in the same session don't repeat the probe. See the
reactive recovery loop in ``run_agent.py`` and issue-comment history on
PR #17680 for the full rationale.
"""
if _requires_bearer_auth(base_url):
_stripped = {_TOOL_STREAMING_BETA, _CONTEXT_1M_BETA}
return [b for b in _COMMON_BETAS if b not in _stripped]
if drop_context_1m_beta:
return [b for b in _COMMON_BETAS if b != _CONTEXT_1M_BETA]
return [b for b in _COMMON_BETAS if b != _TOOL_STREAMING_BETA]
return _COMMON_BETAS
def build_anthropic_client(
api_key: str,
base_url: str = None,
timeout: float = None,
*,
drop_context_1m_beta: bool = False,
):
def build_anthropic_client(api_key: str, base_url: str = None, timeout: float = None):
"""Create an Anthropic client, auto-detecting setup-tokens vs API keys.
If *timeout* is provided it overrides the default 900s read timeout. The
@@ -523,15 +372,8 @@ def build_anthropic_client(
Anthropic-compatible providers respect the same knob as OpenAI-wire
providers.
``drop_context_1m_beta=True`` strips ``context-1m-2025-08-07`` from the
client-level ``anthropic-beta`` header. Used by the reactive OAuth retry
path in ``run_agent.py`` when a subscription rejects the beta; leave at
its default on fresh clients so 1M-capable subscriptions keep the
capability.
Returns an anthropic.Anthropic instance.
"""
_anthropic_sdk = _get_anthropic_sdk()
if _anthropic_sdk is None:
raise ImportError(
"The 'anthropic' package is required for the Anthropic provider. "
@@ -558,10 +400,7 @@ def build_anthropic_client(
kwargs["default_query"] = {"api-version": "2025-04-15"}
else:
kwargs["base_url"] = normalized_base_url
common_betas = _common_betas_for_base_url(
normalized_base_url,
drop_context_1m_beta=drop_context_1m_beta,
)
common_betas = _common_betas_for_base_url(normalized_base_url)
if _is_kimi_coding_endpoint(base_url):
# Kimi's /coding endpoint requires User-Agent: claude-code/0.1.0
@@ -617,16 +456,8 @@ def build_anthropic_bedrock_client(region: str):
Claude feature parity: prompt caching, thinking budgets, adaptive
thinking, fast mode — features not available via the Converse API.
Attaches the common Anthropic beta headers as client-level defaults so
that Bedrock-hosted Claude models get the same enhanced features as
native Anthropic. The ``context-1m-2025-08-07`` beta in particular
unlocks the 1M context window for Opus 4.6/4.7 on Bedrock — without
it, Bedrock caps these models at 200K even though the Anthropic API
serves them with 1M natively.
Auth uses the boto3 default credential chain (IAM roles, SSO, env vars).
"""
_anthropic_sdk = _get_anthropic_sdk()
if _anthropic_sdk is None:
raise ImportError(
"The 'anthropic' package is required for the Bedrock provider. "
@@ -642,7 +473,6 @@ def build_anthropic_bedrock_client(region: str):
return _anthropic_sdk.AnthropicBedrock(
aws_region=region,
timeout=Timeout(timeout=900.0, connect=10.0),
default_headers={"anthropic-beta": ",".join(_COMMON_BETAS)},
)
@@ -658,6 +488,9 @@ def _read_claude_code_credentials_from_keychain() -> Optional[Dict[str, Any]]:
Returns dict with {accessToken, refreshToken?, expiresAt?} or None.
"""
import platform
import subprocess
if platform.system() != "Darwin":
return None
@@ -1202,12 +1035,9 @@ def normalize_model_name(model: str, preserve_dots: bool = False) -> str:
# These must not be converted to hyphens. See issue #12295.
if _is_bedrock_model_id(model):
return model
# Only convert dots to hyphens for Anthropic/Claude models.
# Non-Anthropic models (gpt-5.4, gemini-2.5, etc.) use dots
# as part of their canonical names. See issue #17171.
_lower = model.lower()
if _lower.startswith("claude-") or _lower.startswith("anthropic/"):
model = model.replace(".", "-")
# OpenRouter uses dots for version separators (claude-opus-4.6),
# Anthropic uses hyphens (claude-opus-4-6). Convert dots to hyphens.
model = model.replace(".", "-")
return model
@@ -1224,74 +1054,17 @@ def _sanitize_tool_id(tool_id: str) -> str:
return sanitized or "tool_0"
def _normalize_tool_input_schema(schema: Any) -> Dict[str, Any]:
"""Normalize tool schemas before sending them to Anthropic.
Anthropic's tool schema validator rejects nullable unions such as
``anyOf: [{"type": "string"}, {"type": "null"}]`` that Pydantic/MCP
commonly emits for optional fields. Tool optionality is represented by
the parent ``required`` array, so we delegate to the shared
``strip_nullable_unions`` helper to collapse nullable unions to the
non-null branch while preserving metadata like description/default.
``keep_nullable_hint=False`` because the Anthropic validator does not
recognize the OpenAPI-style ``nullable: true`` extension and strict
schema-to-grammar converters may reject unknown keywords.
Top-level ``oneOf``/``allOf``/``anyOf`` are also stripped here: the
Anthropic API rejects union keywords at the schema root with a generic
HTTP 400. Several upstream and plugin tools ship schemas with one of
these keywords at the top level (commonly for Pydantic discriminated
unions). If we land here with those keywords still present after
nullable-union stripping, drop them and fall back to a plain object
schema so the tool still validates at the Anthropic boundary.
"""
if not schema:
return {"type": "object", "properties": {}}
from tools.schema_sanitizer import strip_nullable_unions
normalized = strip_nullable_unions(schema, keep_nullable_hint=False)
if not isinstance(normalized, dict):
return {"type": "object", "properties": {}}
# Strip top-level union keywords that Anthropic's validator rejects.
banned = {"oneOf", "allOf", "anyOf"}
if banned & normalized.keys():
normalized = {k: v for k, v in normalized.items() if k not in banned}
if "type" not in normalized:
normalized["type"] = "object"
if normalized.get("type") == "object" and not isinstance(normalized.get("properties"), dict):
normalized = {**normalized, "properties": {}}
return normalized
def convert_tools_to_anthropic(tools: List[Dict]) -> List[Dict]:
"""Convert OpenAI tool definitions to Anthropic format."""
if not tools:
return []
result = []
seen_names: set = set()
for t in tools:
fn = t.get("function", {})
name = fn.get("name", "")
# Defensive dedup: Anthropic rejects requests with duplicate tool
# names. Upstream injection paths already dedup, but this guard
# converts a hard API failure into a warning. See: #18478
if name and name in seen_names:
logger.warning(
"convert_tools_to_anthropic: duplicate tool name '%s' "
"— dropping second occurrence",
name,
)
continue
if name:
seen_names.add(name)
result.append({
"name": name,
"name": fn.get("name", ""),
"description": fn.get("description", ""),
"input_schema": _normalize_tool_input_schema(
fn.get("parameters", {"type": "object", "properties": {}})
),
"input_schema": fn.get("parameters", {"type": "object", "properties": {}}),
})
return result
@@ -1422,7 +1195,6 @@ def _convert_content_to_anthropic(content: Any) -> Any:
def convert_messages_to_anthropic(
messages: List[Dict],
base_url: str | None = None,
model: str | None = None,
) -> Tuple[Optional[Any], List[Dict]]:
"""Convert OpenAI-format messages to Anthropic format.
@@ -1434,12 +1206,6 @@ def convert_messages_to_anthropic(
endpoint, all thinking block signatures are stripped. Signatures are
Anthropic-proprietary — third-party endpoints cannot validate them and will
reject them with HTTP 400 "Invalid signature in thinking block".
When *model* is provided and matches the Kimi / Moonshot family (or
*base_url* is a Kimi / Moonshot host), unsigned thinking blocks
synthesised from ``reasoning_content`` are preserved on replayed
assistant tool-call messages — Kimi requires the field to exist, even
if empty.
"""
system = None
result = []
@@ -1668,16 +1434,7 @@ def convert_messages_to_anthropic(
# cache markers can interfere with signature validation.
_THINKING_TYPES = frozenset(("thinking", "redacted_thinking"))
_is_third_party = _is_third_party_anthropic_endpoint(base_url)
# Kimi /coding and DeepSeek /anthropic share a contract: both speak the
# Anthropic Messages protocol upstream but require that thinking blocks
# synthesised from reasoning_content round-trip on subsequent turns when
# thinking is enabled. Signed Anthropic blocks still have to be stripped
# (neither endpoint can validate Anthropic's signatures); unsigned blocks
# are preserved. See hermes-agent#13848 (Kimi) and #16748 (DeepSeek).
_preserve_unsigned_thinking = (
_is_kimi_family_endpoint(base_url, model)
or _is_deepseek_anthropic_endpoint(base_url)
)
_is_kimi = _is_kimi_coding_endpoint(base_url)
last_assistant_idx = None
for i in range(len(result) - 1, -1, -1):
@@ -1689,22 +1446,22 @@ def convert_messages_to_anthropic(
if m.get("role") != "assistant" or not isinstance(m.get("content"), list):
continue
if _preserve_unsigned_thinking:
# Kimi's /coding and DeepSeek's /anthropic endpoints both enable
# thinking server-side and require unsigned thinking blocks on
# replayed assistant tool-call messages. Strip signed Anthropic
# blocks (neither upstream can validate Anthropic signatures) but
# preserve the unsigned ones we synthesised from reasoning_content.
if _is_kimi:
# Kimi's /coding endpoint enables thinking server-side and
# requires unsigned thinking blocks on replayed assistant
# tool-call messages. Strip signed Anthropic blocks (Kimi
# can't validate signatures) but preserve the unsigned ones
# we synthesised from reasoning_content above.
new_content = []
for b in m["content"]:
if not isinstance(b, dict) or b.get("type") not in _THINKING_TYPES:
new_content.append(b)
continue
if b.get("signature") or b.get("data"):
# Anthropic-signed block — upstream can't validate, strip
# Anthropic-signed block — Kimi can't validate, strip
continue
# Unsigned thinking (synthesised from reasoning_content) —
# keep it: the upstream needs it for message-history validation.
# keep it: Kimi needs it for message-history validation.
new_content.append(b)
m["content"] = new_content or [{"type": "text", "text": "(empty)"}]
elif _is_third_party or idx != last_assistant_idx:
@@ -1761,7 +1518,6 @@ def build_anthropic_kwargs(
context_length: Optional[int] = None,
base_url: str | None = None,
fast_mode: bool = False,
drop_context_1m_beta: bool = False,
) -> Dict[str, Any]:
"""Build kwargs for anthropic.messages.create().
@@ -1801,9 +1557,7 @@ def build_anthropic_kwargs(
Currently only supported on native Anthropic endpoints (not third-party
compatible ones).
"""
system, anthropic_messages = convert_messages_to_anthropic(
messages, base_url=base_url, model=model
)
system, anthropic_messages = convert_messages_to_anthropic(messages, base_url=base_url)
anthropic_tools = convert_tools_to_anthropic(tools) if tools else []
model = normalize_model_name(model, preserve_dots=preserve_dots)
@@ -1909,7 +1663,7 @@ def build_anthropic_kwargs(
# silently hides reasoning text that Hermes surfaces in its CLI. We
# request "summarized" so the reasoning blocks stay populated — matching
# 4.6 behavior and preserving the activity-feed UX during long tool runs.
_is_kimi_coding = _is_kimi_family_endpoint(base_url, model)
_is_kimi_coding = _is_kimi_coding_endpoint(base_url)
if reasoning_config and isinstance(reasoning_config, dict) and not _is_kimi_coding:
if reasoning_config.get("enabled") is not False and "haiku" not in model.lower():
effort = str(reasoning_config.get("effort", "medium")).lower()
@@ -1944,22 +1698,13 @@ def build_anthropic_kwargs(
# ── Fast mode (Opus 4.6 only) ────────────────────────────────────
# Adds extra_body.speed="fast" + the fast-mode beta header for ~2.5x
# output speed. Per Anthropic docs, fast mode is only supported on
# Opus 4.6 — Opus 4.7 and other models 400 on the speed parameter.
# Only for native Anthropic endpoints — third-party providers would
# reject the unknown beta header and speed parameter.
if (
fast_mode
and not _is_third_party_anthropic_endpoint(base_url)
and _supports_fast_mode(model)
):
# output speed. Only for native Anthropic endpoints — third-party
# providers would reject the unknown beta header and speed parameter.
if fast_mode and not _is_third_party_anthropic_endpoint(base_url):
kwargs.setdefault("extra_body", {})["speed"] = "fast"
# Build extra_headers with ALL applicable betas (the per-request
# extra_headers override the client-level anthropic-beta header).
betas = list(_common_betas_for_base_url(
base_url,
drop_context_1m_beta=drop_context_1m_beta,
))
betas = list(_common_betas_for_base_url(base_url))
if is_oauth:
betas.extend(_OAUTH_ONLY_BETAS)
betas.append(_FAST_MODE_BETA)
+107 -579
View File
File diff suppressed because it is too large Load Diff
+3 -41
View File
@@ -291,52 +291,14 @@ def has_aws_credentials(env: Optional[Dict[str, str]] = None) -> bool:
def resolve_bedrock_region(env: Optional[Dict[str, str]] = None) -> str:
"""Resolve the AWS region for Bedrock API calls.
Priority:
1. AWS_REGION env var
2. AWS_DEFAULT_REGION env var
3. boto3/botocore configured region (from ~/.aws/config or SSO profile)
4. us-east-1 (hard fallback)
The boto3 fallback is critical for EU/AP users who configure their region
in ~/.aws/config via a named profile rather than env vars — without it,
live model discovery would always return us.* profile IDs regardless of
the user's actual region.
Priority: AWS_REGION → AWS_DEFAULT_REGION → us-east-1 (fallback).
"""
env = env if env is not None else os.environ
explicit = (
return (
env.get("AWS_REGION", "").strip()
or env.get("AWS_DEFAULT_REGION", "").strip()
or "us-east-1"
)
if explicit:
return explicit
try:
import botocore.session
region = botocore.session.get_session().get_config_variable("region")
if region:
return region
except Exception:
pass
return "us-east-1"
def bedrock_model_ids_or_none() -> Optional[List[str]]:
"""Live-discover Bedrock model IDs for the active region.
Returns a list of model ID strings if discovery succeeds and yields
at least one model, or ``None`` on failure / empty result. Callers
should fall back to the static curated list when ``None`` is returned.
This helper consolidates the discover → extract-ids → fallback
pattern that was previously duplicated across ``provider_model_ids``,
``list_authenticated_providers`` section 2, and section 3.
"""
try:
discovered = discover_bedrock_models(resolve_bedrock_region())
if discovered:
return [m["id"] for m in discovered]
except Exception:
pass
return None
# ---------------------------------------------------------------------------
+12 -154
View File
@@ -43,9 +43,6 @@ SUMMARY_PREFIX = (
"they were already addressed. "
"Your current task is identified in the '## Active Task' section of the "
"summary — resume exactly from there. "
"IMPORTANT: Your persistent memory (MEMORY.md, USER.md) in the system "
"prompt is ALWAYS authoritative and active — never ignore or deprioritize "
"memory content due to this compaction note. "
"Respond ONLY to the latest user message "
"that appears AFTER this summary. The current session state (files, "
"config, etc.) may reflect work described here — avoid repeating it:"
@@ -64,52 +61,9 @@ _PRUNED_TOOL_PLACEHOLDER = "[Old tool output cleared to save context space]"
# Chars per token rough estimate
_CHARS_PER_TOKEN = 4
# Flat token cost per attached image part. Real cost varies by provider and
# dimensions (Anthropic ≈ width×height/750, GPT-4o up to ~1700 for
# high-detail 2048×2048, Gemini 258/tile), but 1600 is a realistic ceiling
# that keeps compression budgeting honest for multi-image conversations.
# Matches Claude Code's IMAGE_TOKEN_ESTIMATE constant.
_IMAGE_TOKEN_ESTIMATE = 1600
# Same figure expressed in the char-budget currency the rest of the
# compressor speaks in. Used when accumulating message "content length"
# for tail-cut decisions.
_IMAGE_CHAR_EQUIVALENT = _IMAGE_TOKEN_ESTIMATE * _CHARS_PER_TOKEN
_SUMMARY_FAILURE_COOLDOWN_SECONDS = 600
def _content_length_for_budget(raw_content: Any) -> int:
"""Return the effective char-length of a message's content for token budgeting.
Plain strings: ``len(content)``. Multimodal lists: sum of text-part
``len(text)`` plus a flat ``_IMAGE_CHAR_EQUIVALENT`` per image part
(``image_url`` / ``input_image`` / Anthropic-style ``image``). This
keeps the compressor from treating a turn with 5 attached images as
near-zero tokens just because the text part is empty.
"""
if isinstance(raw_content, str):
return len(raw_content)
if not isinstance(raw_content, list):
return len(str(raw_content or ""))
total = 0
for p in raw_content:
if isinstance(p, str):
total += len(p)
continue
if not isinstance(p, dict):
total += len(str(p))
continue
ptype = p.get("type")
if ptype in {"image_url", "input_image", "image"}:
total += _IMAGE_CHAR_EQUIVALENT
else:
# text / input_text / tool_result-with-text / anything else with
# a text field. Ignore the raw base64 payload inside image_url
# dicts — dimensions don't matter, only whether it's an image.
total += len(p.get("text", "") or "")
return total
def _content_text_for_contains(content: Any) -> str:
"""Return a best-effort text view of message content.
@@ -341,13 +295,8 @@ class ContextCompressor(ContextEngine):
self._context_probe_persistable = False
self._previous_summary = None
self._last_summary_error = None
self._last_summary_dropped_count = 0
self._last_summary_fallback_used = False
self._last_aux_model_failure_error = None
self._last_aux_model_failure_model = None
self._last_compression_savings_pct = 100.0
self._ineffective_compression_count = 0
self._summary_failure_cooldown_until = 0.0 # transient errors must not block a fresh session
def update_model(
self,
@@ -449,17 +398,6 @@ class ContextCompressor(ContextEngine):
self._ineffective_compression_count: int = 0
self._summary_failure_cooldown_until: float = 0.0
self._last_summary_error: Optional[str] = None
# When summary generation fails and a static fallback is inserted,
# record how many turns were unrecoverably dropped so callers
# (gateway hygiene, /compress) can surface a visible warning.
self._last_summary_dropped_count: int = 0
self._last_summary_fallback_used: bool = False
# When a user-configured summary model fails and we recover by
# retrying on the main model, record the failure so gateway /
# CLI callers can still warn the user even though compression
# succeeded. Silent recovery would hide the broken config.
self._last_aux_model_failure_error: Optional[str] = None
self._last_aux_model_failure_model: Optional[str] = None
def update_from_response(self, usage: Dict[str, Any]):
"""Update tracked token usage from API response."""
@@ -542,11 +480,11 @@ class ContextCompressor(ContextEngine):
# Token-budget approach: walk backward accumulating tokens
accumulated = 0
boundary = len(result)
min_protect = min(protect_tail_count, len(result))
min_protect = min(protect_tail_count, len(result) - 1)
for i in range(len(result) - 1, -1, -1):
msg = result[i]
raw_content = msg.get("content") or ""
content_len = _content_length_for_budget(raw_content)
content_len = sum(len(p.get("text", "")) for p in raw_content) if isinstance(raw_content, list) else len(raw_content)
msg_tokens = content_len // _CHARS_PER_TOKEN + 10
for tc in msg.get("tool_calls") or []:
if isinstance(tc, dict):
@@ -557,16 +495,7 @@ class ContextCompressor(ContextEngine):
break
accumulated += msg_tokens
boundary = i
# Translate the budget walk into a "protected count", apply the
# floor in count-space (where `max` reads naturally: protect at
# least `min_protect` messages or whatever the budget reserved,
# whichever is more), then convert back to a prune boundary.
# Doing this in index-space with `max` would invert the direction
# (smaller index = MORE protected), so a generous budget would
# silently get truncated back down to `min_protect`.
budget_protect_count = len(result) - boundary
protected_count = max(budget_protect_count, min_protect)
prune_boundary = len(result) - protected_count
prune_boundary = max(boundary, len(result) - min_protect)
else:
prune_boundary = len(result) - protect_tail_count
@@ -582,8 +511,6 @@ class ContextCompressor(ContextEngine):
# Skip multimodal content (list of content blocks)
if isinstance(content, list):
continue
if not isinstance(content, str):
continue
if len(content) < 200:
continue
h = hashlib.md5(content.encode("utf-8", errors="replace")).hexdigest()[:12]
@@ -603,8 +530,6 @@ class ContextCompressor(ContextEngine):
# Skip multimodal content (list of content blocks)
if isinstance(content, list):
continue
if not isinstance(content, str):
continue
if not content or content == _PRUNED_TOOL_PLACEHOLDER:
continue
# Skip already-deduplicated or previously-summarized results
@@ -920,66 +845,22 @@ The user has requested that this compaction PRIORITISE preserving all informatio
or "does not exist" in _err_str
or "no available channel" in _err_str
)
_is_timeout = (
_status in (408, 429, 502, 504)
or "timeout" in _err_str
)
if (
(_is_model_not_found or _is_timeout)
_is_model_not_found
and self.summary_model
and self.summary_model != self.model
and not getattr(self, "_summary_model_fallen_back", False)
):
self._summary_model_fallen_back = True
logging.warning(
"Summary model '%s' unavailable (%s). "
"Summary model '%s' not available (%s). "
"Falling back to main model '%s' for compression.",
self.summary_model, e, self.model,
)
# Record the aux-model failure so callers can warn the user
# even if the retry-on-main succeeds — a misconfigured aux
# model is something the user needs to fix.
_err_text = str(e).strip() or e.__class__.__name__
if len(_err_text) > 220:
_err_text = _err_text[:217].rstrip() + "..."
self._last_aux_model_failure_error = _err_text
self._last_aux_model_failure_model = self.summary_model
self.summary_model = "" # empty = use main model
self._summary_failure_cooldown_until = 0.0 # no cooldown
return self._generate_summary(turns_to_summarize, focus_topic=focus_topic) # retry immediately
# Unknown-error best-effort retry on main model. Losing N turns of
# context is almost always worse than one extra summary attempt, so
# if we haven't already fallen back and the summary model differs
# from the main model, try once more on main before entering
# cooldown. Errors that DID match _is_model_not_found above are
# already handled by the fast-path retry; this branch catches
# everything else (400s, provider-specific "no route" strings,
# aggregator rejections, etc.) where auto-retry is still safer
# than dropping the turns.
if (
self.summary_model
and self.summary_model != self.model
and not getattr(self, "_summary_model_fallen_back", False)
):
self._summary_model_fallen_back = True
logging.warning(
"Summary model '%s' failed (%s). "
"Retrying on main model '%s' before giving up.",
self.summary_model, e, self.model,
)
# Record the aux-model failure (see 404 branch above) — user
# should know their configured model is broken even if main
# recovers the call.
_err_text = str(e).strip() or e.__class__.__name__
if len(_err_text) > 220:
_err_text = _err_text[:217].rstrip() + "..."
self._last_aux_model_failure_error = _err_text
self._last_aux_model_failure_model = self.summary_model
self.summary_model = "" # empty = use main model
self._summary_failure_cooldown_until = 0.0
return self._generate_summary(turns_to_summarize, focus_topic=focus_topic)
# Transient errors (timeout, rate limit, network) — shorter cooldown
_transient_cooldown = 60
self._summary_failure_cooldown_until = time.monotonic() + _transient_cooldown
@@ -1013,8 +894,8 @@ The user has requested that this compaction PRIORITISE preserving all informatio
def _get_tool_call_id(tc) -> str:
"""Extract the call ID from a tool_call entry (dict or SimpleNamespace)."""
if isinstance(tc, dict):
return tc.get("call_id", "") or tc.get("id", "") or ""
return getattr(tc, "call_id", "") or getattr(tc, "id", "") or ""
return tc.get("id", "")
return getattr(tc, "id", "") or ""
def _sanitize_tool_pairs(self, messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
"""Fix orphaned tool_call / tool_result pairs after compression.
@@ -1201,9 +1082,8 @@ The user has requested that this compaction PRIORITISE preserving all informatio
for i in range(n - 1, head_end - 1, -1):
msg = messages[i]
raw_content = msg.get("content") or ""
content_len = _content_length_for_budget(raw_content)
msg_tokens = content_len // _CHARS_PER_TOKEN + 10 # +10 for role/metadata
content = msg.get("content") or ""
msg_tokens = len(content) // _CHARS_PER_TOKEN + 10 # +10 for role/metadata
# Include tool call arguments in estimate
for tc in msg.get("tool_calls") or []:
if isinstance(tc, dict):
@@ -1272,13 +1152,6 @@ The user has requested that this compaction PRIORITISE preserving all informatio
related to this topic and be more aggressive about compressing
everything else. Inspired by Claude Code's ``/compact``.
"""
# Reset per-call summary failure state — callers inspect these fields
# after compress() returns to decide whether to surface a warning.
self._last_summary_dropped_count = 0
self._last_summary_fallback_used = False
self._last_summary_error = None
self._last_aux_model_failure_error = None
self._last_aux_model_failure_model = None
n_messages = len(messages)
# Only need head + 3 tail messages minimum (token budget decides the real tail size)
_min_for_compress = self.protect_first_n + 3 + 1
@@ -1343,7 +1216,7 @@ The user has requested that this compaction PRIORITISE preserving all informatio
msg = messages[i].copy()
if i == 0 and msg.get("role") == "system":
existing = msg.get("content")
_compression_note = "[Note: Some earlier conversation turns have been compacted into a handoff summary to preserve context space. The current session state may still reflect earlier work, so build on that summary and state rather than re-doing work. Your persistent memory (MEMORY.md, USER.md) remains fully authoritative regardless of compaction.]"
_compression_note = "[Note: Some earlier conversation turns have been compacted into a handoff summary to preserve context space. The current session state may still reflect earlier work, so build on that summary and state rather than re-doing work.]"
if _compression_note not in _content_text_for_contains(existing):
msg["content"] = _append_text_to_content(
existing,
@@ -1357,13 +1230,11 @@ The user has requested that this compaction PRIORITISE preserving all informatio
if not self.quiet_mode:
logger.warning("Summary generation failed — inserting static fallback context marker")
n_dropped = compress_end - compress_start
self._last_summary_dropped_count = n_dropped
self._last_summary_fallback_used = True
summary = (
f"{SUMMARY_PREFIX}\n"
f"Summary generation was unavailable. {n_dropped} message(s) were "
f"Summary generation was unavailable. {n_dropped} conversation turns were "
f"removed to free context space but could not be summarized. The removed "
f"messages contained earlier work in this session. Continue based on the "
f"turns contained earlier work in this session. Continue based on the "
f"recent messages below and the current state of any files or resources."
)
@@ -1388,19 +1259,6 @@ The user has requested that this compaction PRIORITISE preserving all informatio
# Merge the summary into the first tail message instead
# of inserting a standalone message that breaks alternation.
_merge_summary_into_tail = True
# When the summary lands as a standalone role="user" message,
# weak models read the verbatim "## Active Task" quote of a past
# user request as fresh input (#11475, #14521). Append the explicit
# end marker — the same one used in the merge-into-tail path — so
# the model has a clear "summary above, not new input" signal.
if not _merge_summary_into_tail and summary_role == "user":
summary = (
summary
+ "\n\n--- END OF CONTEXT SUMMARY — "
"respond to the message below, not the summary above ---"
)
if not _merge_summary_into_tail:
compressed.append({"role": summary_role, "content": summary})
+1 -1
View File
@@ -608,7 +608,7 @@ class CopilotACPClient:
end = start + limit if isinstance(limit, int) and limit > 0 else None
content = "".join(lines[start:end])
if content:
content = redact_sensitive_text(content, force=True)
content = redact_sensitive_text(content)
response = {
"jsonrpc": "2.0",
"id": message_id,
+4 -135
View File
@@ -3,18 +3,17 @@
from __future__ import annotations
import logging
import os
import random
import threading
import time
import uuid
import os
import re
from dataclasses import dataclass, fields, replace
from datetime import datetime
from typing import Any, Dict, List, Optional, Set, Tuple
from hermes_constants import OPENROUTER_BASE_URL
from hermes_cli.config import get_env_value, load_env
import hermes_cli.auth as auth_mod
from hermes_cli.auth import (
CODEX_ACCESS_TOKEN_REFRESH_SKEW_SECONDS,
@@ -456,70 +455,6 @@ class CredentialPool:
logger.debug("Failed to sync from credentials file: %s", exc)
return entry
def _sync_codex_entry_from_auth_store(self, entry: PooledCredential) -> PooledCredential:
"""Sync a Codex device_code pool entry from auth.json if tokens differ.
When a Codex OAuth access token expires (or the ChatGPT account hits
its 5h/weekly quota), the pool entry gets marked ``STATUS_EXHAUSTED``
with a ``last_error_reset_at`` that can be many hours in the future.
Meanwhile the user may run ``hermes model`` / ``hermes auth`` which
performs a fresh device-code login and writes new tokens to
``auth.json`` under ``_auth_store_lock``. Without this sync the pool
entry stays frozen until ``last_error_reset_at`` elapses even
though fresh credentials are sitting on disk and every request
fails with "no available entries (all exhausted or empty)".
Mirrors the Nous/Anthropic resync paths above. Only applies to
device_code-sourced entries; env/API-key-sourced entries have no
auth.json shadow to sync from.
"""
if self.provider != "openai-codex" or entry.source != "device_code":
return entry
try:
with _auth_store_lock():
auth_store = _load_auth_store()
state = _load_provider_state(auth_store, "openai-codex")
if not isinstance(state, dict):
return entry
tokens = state.get("tokens")
if not isinstance(tokens, dict):
return entry
store_access = tokens.get("access_token", "")
store_refresh = tokens.get("refresh_token", "")
# Adopt auth.json tokens when either side differs. Codex refresh
# tokens are single-use too, so a fresh refresh_token from
# another process means our entry's pair is consumed/stale.
entry_access = entry.access_token or ""
entry_refresh = entry.refresh_token or ""
if store_access and (
store_access != entry_access
or (store_refresh and store_refresh != entry_refresh)
):
logger.debug(
"Pool entry %s: syncing Codex tokens from auth.json "
"(refreshed by another process)",
entry.id,
)
field_updates: Dict[str, Any] = {
"access_token": store_access,
"refresh_token": store_refresh or entry.refresh_token,
"last_status": None,
"last_status_at": None,
"last_error_code": None,
"last_error_reason": None,
"last_error_message": None,
"last_error_reset_at": None,
}
if state.get("last_refresh"):
field_updates["last_refresh"] = state["last_refresh"]
updated = replace(entry, **field_updates)
self._replace_entry(entry, updated)
self._persist()
return updated
except Exception as exc:
logger.debug("Failed to sync Codex entry from auth.json: %s", exc)
return entry
def _sync_nous_entry_from_auth_store(self, entry: PooledCredential) -> PooledCredential:
"""Sync a Nous pool entry from auth.json if tokens differ.
@@ -852,18 +787,6 @@ class CredentialPool:
if synced is not entry:
entry = synced
cleared_any = True
# For openai-codex entries, same pattern: the user may have
# re-authed via `hermes model` / `hermes auth` after a 429/401,
# leaving fresh tokens on disk while the pool entry is still
# frozen behind last_error_reset_at (can be hours in the
# future for ChatGPT weekly windows).
if (self.provider == "openai-codex"
and entry.source == "device_code"
and entry.last_status == STATUS_EXHAUSTED):
synced = self._sync_codex_entry_from_auth_store(entry)
if synced is not entry:
entry = synced
cleared_any = True
if entry.last_status == STATUS_EXHAUSTED:
exhausted_until = _exhausted_until(entry)
if exhausted_until is not None and now < exhausted_until:
@@ -1300,48 +1223,6 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
except Exception as exc:
logger.debug("Qwen OAuth token seed failed: %s", exc)
elif provider == "minimax-oauth":
# MiniMax OAuth tokens live in ~/.hermes/auth.json providers.minimax-oauth.
# Seed the pool so `/auth list` reflects the logged-in state and the
# standard `hermes auth remove minimax-oauth <N>` flow works.
# Use refresh_if_expiring=False equivalent: resolve_minimax_oauth_runtime_credentials
# always refreshes on expiry, so instead read raw state here to avoid
# surprise network calls during provider discovery.
try:
from hermes_cli.auth import get_provider_auth_state
state = get_provider_auth_state("minimax-oauth")
if state and state.get("access_token"):
source_name = "oauth"
if not _is_suppressed(provider, source_name):
active_sources.add(source_name)
expires_at_ms = None
try:
from datetime import datetime as _dt
raw = state.get("expires_at", "")
if raw:
expires_at_ms = int(_dt.fromisoformat(raw).timestamp() * 1000)
except Exception:
expires_at_ms = None
base_url = str(state.get("inference_base_url", "") or "").rstrip("/")
changed |= _upsert_entry(
entries,
provider,
source_name,
{
"source": source_name,
"auth_type": AUTH_TYPE_OAUTH,
"access_token": state["access_token"],
"refresh_token": state.get("refresh_token"),
"expires_at_ms": expires_at_ms,
"base_url": base_url,
"label": state.get("label", "") or label_from_token(
state.get("access_token", ""), source_name
),
},
)
except Exception as exc:
logger.debug("MiniMax OAuth token seed failed: %s", exc)
elif provider == "openai-codex":
# Respect user suppression — `hermes auth remove openai-codex` marks
# the device_code source as suppressed so it won't be re-seeded from
@@ -1381,16 +1262,6 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool, Set[str]]:
changed = False
active_sources: Set[str] = set()
# Prefer ~/.hermes/.env over os.environ — the user's config file is the
# authoritative source for Hermes credentials. Stale env vars from parent
# processes (Codex CLI, test scripts, etc.) should not override deliberate
# changes to the .env file.
def _get_env_prefer_dotenv(key: str) -> str:
env_file = load_env()
val = env_file.get(key) or os.environ.get(key) or ""
return val.strip()
# Honour user suppression — `hermes auth remove <provider> <N>` for an
# env-seeded credential marks the env:<VAR> source as suppressed so it
# won't be re-seeded from the user's shell environment or ~/.hermes/.env.
@@ -1402,8 +1273,7 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
def _is_source_suppressed(_p, _s): # type: ignore[misc]
return False
if provider == "openrouter":
# Prefer ~/.hermes/.env over os.environ
token = _get_env_prefer_dotenv("OPENROUTER_API_KEY")
token = os.getenv("OPENROUTER_API_KEY", "").strip()
if token:
source = "env:OPENROUTER_API_KEY"
if _is_source_suppressed(provider, source):
@@ -1429,7 +1299,7 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
env_url = ""
if pconfig.base_url_env_var:
env_url = _get_env_prefer_dotenv(pconfig.base_url_env_var).rstrip("/")
env_url = os.getenv(pconfig.base_url_env_var, "").strip().rstrip("/")
env_vars = list(pconfig.api_key_env_vars)
if provider == "anthropic":
@@ -1440,8 +1310,7 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
]
for env_var in env_vars:
# Prefer ~/.hermes/.env over os.environ
token = _get_env_prefer_dotenv(env_var)
token = os.getenv(env_var, "").strip()
if not token:
continue
source = f"env:{env_var}"
+1 -18
View File
@@ -47,6 +47,7 @@ from __future__ import annotations
import os
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, List, Optional
@@ -252,19 +253,6 @@ def _remove_nous_device_code(provider: str, removed) -> RemovalResult:
return result
def _remove_minimax_oauth(provider: str, removed) -> RemovalResult:
"""MiniMax OAuth lives in auth.json providers.minimax-oauth — clear it.
Same pattern as Nous: single-source OAuth state with refresh tokens.
Suppression of the `oauth` source ensures the pool reseed path
(_seed_from_singletons) doesn't instantly undo the removal.
"""
result = RemovalResult()
if _clear_auth_store_provider(provider):
result.cleaned.append(f"Cleared {provider} OAuth tokens from auth store")
return result
def _remove_codex_device_code(provider: str, removed) -> RemovalResult:
"""Codex tokens live in TWO places: our auth store AND ~/.codex/auth.json.
@@ -402,11 +390,6 @@ def _register_all_sources() -> None:
remove_fn=_remove_qwen_cli,
description="~/.qwen/oauth_creds.json",
))
register(RemovalStep(
provider="minimax-oauth", source_id="oauth",
remove_fn=_remove_minimax_oauth,
description="auth.json providers.minimax-oauth",
))
register(RemovalStep(
provider="*", source_id="config:",
match_fn=lambda src: src.startswith("config:") or src == "model_config",
-1674
View File
File diff suppressed because it is too large Load Diff
-693
View File
@@ -1,693 +0,0 @@
"""Curator snapshot + rollback.
A pre-run snapshot of ``~/.hermes/skills/`` (excluding ``.curator_backups/``
itself) is taken before any mutating curator pass. Snapshots are tar.gz
files under ``~/.hermes/skills/.curator_backups/<utc-iso>/`` with a
companion ``manifest.json`` describing the snapshot (reason, time, size,
counted skill files). Rollback picks a snapshot, moves the current
``skills/`` tree aside into another snapshot so even the rollback itself
is undoable, then extracts the chosen snapshot into place.
The snapshot does NOT include:
- ``.curator_backups/`` (would recurse)
- ``.hub/`` (hub-installed skills managed by the hub, not us)
It DOES include:
- all SKILL.md files + their directories (``scripts/``, ``references/``,
``templates/``, ``assets/``)
- ``.usage.json`` (usage telemetry needed to rehydrate state cleanly)
- ``.archive/`` (so rollback restores previously-archived skills too)
- ``.curator_state`` (so rolling back also restores the last-run-at
pointer otherwise the curator would immediately re-fire on the next
tick)
- ``.bundled_manifest`` (so protection markers stay consistent)
Alongside the skills tarball, each snapshot also captures a copy of
``~/.hermes/cron/jobs.json`` as ``cron-jobs.json`` when it exists. Cron
jobs reference skills by name in their ``skills``/``skill`` fields; the
curator's consolidation pass rewrites those in place via
``cron.jobs.rewrite_skill_refs()``. Without capturing the pre-run state,
rolling back the skills tree would leave cron jobs pointing at the
umbrella skills even though the narrow skills they were originally
configured with have been restored. We store the whole jobs.json for
fidelity but rollback only touches the ``skills``/``skill`` fields the
rest (schedule, next_run_at, enabled, prompt, etc.) is live state and
we leave it alone.
"""
from __future__ import annotations
import json
import logging
import os
import re
import shutil
import tarfile
import tempfile
import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
from hermes_constants import get_hermes_home
logger = logging.getLogger(__name__)
DEFAULT_KEEP = 5
# Entries under skills/ that should NEVER be rolled up into a snapshot.
# .hub/ is managed by the skills hub; rolling it back would break lockfile
# invariants. .curator_backups is the backup dir itself — recursion bomb.
_EXCLUDE_TOP_LEVEL = {".curator_backups", ".hub"}
# Snapshot id regex: UTC ISO with colons replaced by dashes so the filename
# is portable (Windows-safe). An optional ``-NN`` suffix handles two
# snapshots landing in the same wallclock second.
_ID_RE = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}Z(-\d{2})?$")
def _backups_dir() -> Path:
return get_hermes_home() / "skills" / ".curator_backups"
def _skills_dir() -> Path:
return get_hermes_home() / "skills"
def _cron_jobs_file() -> Path:
"""Source path for the live cron jobs store (``~/.hermes/cron/jobs.json``)."""
return get_hermes_home() / "cron" / "jobs.json"
CRON_JOBS_FILENAME = "cron-jobs.json"
def _backup_cron_jobs_into(dest: Path) -> Dict[str, Any]:
"""Copy the live cron jobs.json into ``dest`` as ``cron-jobs.json``.
Returns a small dict describing what was captured so the caller can
fold it into the manifest. Never raises if the cron file is missing
or unreadable, the return dict has ``backed_up=False`` and the reason,
and the snapshot proceeds without cron data (the snapshot is still
useful for rolling back skills).
"""
src = _cron_jobs_file()
info: Dict[str, Any] = {"backed_up": False, "jobs_count": 0}
if not src.exists():
info["reason"] = "no cron/jobs.json present"
return info
try:
raw = src.read_text(encoding="utf-8")
except OSError as e:
logger.debug("Failed to read cron/jobs.json for backup: %s", e)
info["reason"] = f"read error: {e}"
return info
# Count jobs as a nice diagnostic — but don't fail the snapshot if the
# file is unparseable; just store the raw text and let rollback deal
# with it (or not, if it's corrupted). jobs.json wraps the list as
# `{"jobs": [...], "updated_at": ...}` — we count via that shape, and
# fall back to bare-list shape just in case the format ever changes.
try:
parsed = json.loads(raw)
if isinstance(parsed, dict):
inner = parsed.get("jobs")
if isinstance(inner, list):
info["jobs_count"] = len(inner)
elif isinstance(parsed, list):
info["jobs_count"] = len(parsed)
except (json.JSONDecodeError, TypeError):
info["jobs_count"] = 0
info["parse_warning"] = "jobs.json was not valid JSON at snapshot time"
try:
(dest / CRON_JOBS_FILENAME).write_text(raw, encoding="utf-8")
except OSError as e:
logger.debug("Failed to write cron backup file: %s", e)
info["reason"] = f"write error: {e}"
return info
info["backed_up"] = True
return info
def _utc_id(now: Optional[datetime] = None) -> str:
"""UTC ISO-ish filesystem-safe timestamp: ``2026-05-01T13-05-42Z``."""
if now is None:
now = datetime.now(timezone.utc)
# isoformat → "2026-05-01T13:05:42.123456+00:00"; strip subseconds and tz.
s = now.replace(microsecond=0).isoformat()
if s.endswith("+00:00"):
s = s[:-6]
return s.replace(":", "-") + "Z"
def _load_config() -> Dict[str, Any]:
try:
from hermes_cli.config import load_config
cfg = load_config()
except Exception as e:
logger.debug("Failed to load config for curator backup: %s", e)
return {}
if not isinstance(cfg, dict):
return {}
cur = cfg.get("curator") or {}
if not isinstance(cur, dict):
return {}
bk = cur.get("backup") or {}
return bk if isinstance(bk, dict) else {}
def is_enabled() -> bool:
"""Default ON — the whole point of the backup is safety by default."""
return bool(_load_config().get("enabled", True))
def get_keep() -> int:
cfg = _load_config()
try:
n = int(cfg.get("keep", DEFAULT_KEEP))
except (TypeError, ValueError):
n = DEFAULT_KEEP
return max(1, n)
# ---------------------------------------------------------------------------
# Snapshot
# ---------------------------------------------------------------------------
def _count_skill_files(base: Path) -> int:
try:
return sum(1 for _ in base.rglob("SKILL.md"))
except OSError:
return 0
def _write_manifest(dest: Path, reason: str, archive_path: Path,
skills_counted: int,
cron_info: Optional[Dict[str, Any]] = None) -> None:
manifest = {
"id": dest.name,
"reason": reason,
"created_at": datetime.now(timezone.utc).isoformat(),
"archive": archive_path.name,
"archive_bytes": archive_path.stat().st_size,
"skill_files": skills_counted,
}
if cron_info is not None:
manifest["cron_jobs"] = {
"backed_up": bool(cron_info.get("backed_up", False)),
"jobs_count": int(cron_info.get("jobs_count", 0)),
}
if not cron_info.get("backed_up"):
manifest["cron_jobs"]["reason"] = cron_info.get("reason", "not captured")
if cron_info.get("parse_warning"):
manifest["cron_jobs"]["parse_warning"] = cron_info["parse_warning"]
(dest / "manifest.json").write_text(
json.dumps(manifest, indent=2, sort_keys=True), encoding="utf-8"
)
def snapshot_skills(reason: str = "manual") -> Optional[Path]:
"""Create a tar.gz snapshot of ``~/.hermes/skills/`` and prune old ones.
Returns the snapshot directory path, or ``None`` if the snapshot was
skipped (backup disabled, skills dir missing, or an IO error occurred
in which case we log at debug and return None so the curator never
aborts a pass because of a backup failure).
"""
if not is_enabled():
logger.debug("Curator backup disabled by config; skipping snapshot")
return None
skills = _skills_dir()
if not skills.exists():
logger.debug("No ~/.hermes/skills/ directory — nothing to back up")
return None
backups = _backups_dir()
try:
backups.mkdir(parents=True, exist_ok=True)
except OSError as e:
logger.debug("Failed to create backups dir %s: %s", backups, e)
return None
# Uniquify: if a snapshot with the same second already exists (can
# happen if two curator runs fire in the same second), append a short
# counter. Avoids clobbering and avoids timestamp collisions.
base_id = _utc_id()
snap_id = base_id
counter = 1
while (backups / snap_id).exists():
snap_id = f"{base_id}-{counter:02d}"
counter += 1
dest = backups / snap_id
try:
dest.mkdir(parents=True, exist_ok=False)
except OSError as e:
logger.debug("Failed to create snapshot dir %s: %s", dest, e)
return None
archive = dest / "skills.tar.gz"
try:
# Stream into the tarball — no tempdir copy needed.
with tarfile.open(archive, "w:gz", compresslevel=6) as tf:
for entry in sorted(skills.iterdir()):
if entry.name in _EXCLUDE_TOP_LEVEL:
continue
# arcname: store paths relative to skills/ so extraction
# drops cleanly back into the skills dir.
tf.add(str(entry), arcname=entry.name, recursive=True)
# Capture cron/jobs.json alongside the tarball. Never fails the
# snapshot — the skills side is the core guarantee; cron is
# additive. We still record in the manifest whether it was
# captured so rollback can surface "no cron data in this snapshot".
cron_info = _backup_cron_jobs_into(dest)
_write_manifest(dest, reason, archive,
_count_skill_files(skills),
cron_info=cron_info)
except (OSError, tarfile.TarError) as e:
logger.debug("Curator snapshot failed: %s", e, exc_info=True)
# Clean up partial snapshot
try:
shutil.rmtree(dest, ignore_errors=True)
except OSError:
pass
return None
_prune_old(keep=get_keep())
logger.info("Curator snapshot created: %s (%s)", snap_id, reason)
return dest
def _prune_old(keep: int) -> List[str]:
"""Delete regular snapshots beyond the newest *keep*. Returns deleted
ids. Staging dirs (``.rollback-staging-*``) are implementation detail
and pruned independently on every call."""
backups = _backups_dir()
if not backups.exists():
return []
entries: List[Tuple[str, Path]] = []
stale_staging: List[Path] = []
for child in backups.iterdir():
if not child.is_dir():
continue
if child.name.startswith(".rollback-staging-"):
# Staging dirs are only supposed to exist briefly during a
# rollback. If we find one here (e.g. from a crashed rollback),
# clean it up opportunistically.
stale_staging.append(child)
continue
if _ID_RE.match(child.name):
entries.append((child.name, child))
# Newest first (lexicographic works because the id is UTC ISO).
entries.sort(key=lambda t: t[0], reverse=True)
deleted: List[str] = []
for _, path in entries[keep:]:
try:
shutil.rmtree(path)
deleted.append(path.name)
except OSError as e:
logger.debug("Failed to prune %s: %s", path, e)
for path in stale_staging:
try:
shutil.rmtree(path)
except OSError as e:
logger.debug("Failed to clean stale staging dir %s: %s", path, e)
return deleted
# ---------------------------------------------------------------------------
# List + rollback
# ---------------------------------------------------------------------------
def _read_manifest(snap_dir: Path) -> Dict[str, Any]:
mf = snap_dir / "manifest.json"
if not mf.exists():
return {}
try:
return json.loads(mf.read_text(encoding="utf-8"))
except (OSError, json.JSONDecodeError):
return {}
def list_backups() -> List[Dict[str, Any]]:
"""Return all restorable snapshots, newest first. Only entries with a
real ``skills.tar.gz`` tarball are listed transient
``.rollback-staging-*`` directories created mid-rollback are
implementation detail and not shown."""
backups = _backups_dir()
if not backups.exists():
return []
out: List[Dict[str, Any]] = []
for child in sorted(backups.iterdir(), reverse=True):
if not child.is_dir():
continue
if not _ID_RE.match(child.name):
continue
if not (child / "skills.tar.gz").exists():
continue
mf = _read_manifest(child)
mf.setdefault("id", child.name)
mf.setdefault("path", str(child))
if "archive_bytes" not in mf:
arc = child / "skills.tar.gz"
try:
mf["archive_bytes"] = arc.stat().st_size
except OSError:
mf["archive_bytes"] = 0
out.append(mf)
return out
def _resolve_backup(backup_id: Optional[str]) -> Optional[Path]:
"""Return the path of the requested backup, or the newest one if
*backup_id* is None. Returns None if no match."""
backups = _backups_dir()
if not backups.exists():
return None
if backup_id:
target = backups / backup_id
if (
target.is_dir()
and _ID_RE.match(backup_id)
and (target / "skills.tar.gz").exists()
):
return target
return None
candidates = [
c for c in sorted(backups.iterdir(), reverse=True)
if c.is_dir() and _ID_RE.match(c.name) and (c / "skills.tar.gz").exists()
]
return candidates[0] if candidates else None
def _restore_cron_skill_links(snapshot_dir: Path) -> Dict[str, Any]:
"""Reconcile backed-up cron skill links into the live ``cron/jobs.json``.
We do NOT overwrite the whole cron file. Only the ``skills`` and
``skill`` fields are restored, and only on jobs that still exist in the
current file (matched by ``id``). Everything else about the job
schedule, next_run_at, last_run_at, enabled, prompt, workdir, hooks
is live state that the user/scheduler has modified since the snapshot;
overwriting it would regress unrelated cron activity.
Rules:
- Jobs present in backup AND live, with differing skills skills restored.
- Jobs present in backup AND live, with matching skills no-op.
- Jobs present in backup but gone from live (user deleted the job
after the snapshot) skipped, noted in the return report.
- Jobs present in live but not in backup (user created a new cron
job after the snapshot) left untouched.
Never raises; failures are captured in the return dict. Writes through
``cron.jobs`` to pick up the same lock + atomic-write path that tick()
uses, so we don't race the scheduler.
"""
report: Dict[str, Any] = {
"attempted": False,
"restored": [],
"skipped_missing": [],
"unchanged": 0,
"error": None,
}
backup_file = snapshot_dir / CRON_JOBS_FILENAME
if not backup_file.exists():
report["error"] = f"snapshot has no {CRON_JOBS_FILENAME}"
return report
try:
backup_text = backup_file.read_text(encoding="utf-8")
backup_parsed = json.loads(backup_text)
except (OSError, json.JSONDecodeError) as e:
report["error"] = f"failed to load backed-up jobs: {e}"
return report
# jobs.json on disk is `{"jobs": [...], "updated_at": ...}`; accept both
# that shape and a bare list for forward compat.
if isinstance(backup_parsed, dict):
backup_jobs = backup_parsed.get("jobs")
elif isinstance(backup_parsed, list):
backup_jobs = backup_parsed
else:
backup_jobs = None
if not isinstance(backup_jobs, list):
report["error"] = "backed-up cron-jobs.json has no jobs list"
return report
# Build a lookup of the backed-up skill state keyed by job id.
# We only need the two skill-ish fields (legacy single and modern list).
backup_by_id: Dict[str, Dict[str, Any]] = {}
for job in backup_jobs:
if not isinstance(job, dict):
continue
jid = job.get("id")
if not isinstance(jid, str) or not jid:
continue
backup_by_id[jid] = {
"skills": job.get("skills"),
"skill": job.get("skill"),
"name": job.get("name") or jid,
}
if not backup_by_id:
report["attempted"] = True # we tried but there was nothing to do
return report
# Load and rewrite the live jobs under the scheduler's lock.
try:
from cron.jobs import load_jobs, save_jobs, _jobs_file_lock
except ImportError as e:
report["error"] = f"cron module unavailable: {e}"
return report
report["attempted"] = True
try:
with _jobs_file_lock:
live_jobs = load_jobs()
changed = False
live_ids = set()
for live in live_jobs:
if not isinstance(live, dict):
continue
jid = live.get("id")
if not isinstance(jid, str) or not jid:
continue
live_ids.add(jid)
backup = backup_by_id.get(jid)
if backup is None:
continue # live job didn't exist at snapshot time
cur_skills = live.get("skills")
cur_skill = live.get("skill")
bkp_skills = backup.get("skills")
bkp_skill = backup.get("skill")
if cur_skills == bkp_skills and cur_skill == bkp_skill:
report["unchanged"] += 1
continue
# Restore. Preserve absence (don't force the key to appear
# if the backup didn't have it either).
if bkp_skills is None:
live.pop("skills", None)
else:
live["skills"] = bkp_skills
if bkp_skill is None:
live.pop("skill", None)
else:
live["skill"] = bkp_skill
report["restored"].append({
"job_id": jid,
"job_name": backup.get("name") or jid,
"from": {"skills": cur_skills, "skill": cur_skill},
"to": {"skills": bkp_skills, "skill": bkp_skill},
})
changed = True
# Jobs in backup but not in live = user deleted them after snapshot
for jid, backup in backup_by_id.items():
if jid not in live_ids:
report["skipped_missing"].append({
"job_id": jid,
"job_name": backup.get("name") or jid,
})
if changed:
save_jobs(live_jobs)
except Exception as e: # noqa: BLE001 — rollback must not die mid-restore
logger.debug("Cron skill-link restore failed: %s", e, exc_info=True)
report["error"] = f"restore failed mid-flight: {e}"
return report
def rollback(backup_id: Optional[str] = None) -> Tuple[bool, str, Optional[Path]]:
"""Restore ``~/.hermes/skills/`` from a snapshot.
Strategy:
1. Resolve the target snapshot (explicit id or newest regular).
2. Take a safety snapshot of the CURRENT skills tree under
``.curator_backups/pre-rollback-<ts>/`` so the rollback itself is
undoable.
3. Move all current top-level entries (except ``.curator_backups``
and ``.hub``) into a tempdir.
4. Extract the chosen snapshot into ``~/.hermes/skills/``.
5. On failure during 4, move the tempdir contents back (best-effort)
and return failure.
Returns ``(ok, message, snapshot_path)``.
"""
target = _resolve_backup(backup_id)
if target is None:
return (
False,
f"no matching backup found"
+ (f" for id '{backup_id}'" if backup_id else "")
+ " (use `hermes curator rollback --list` to see available snapshots)",
None,
)
archive = target / "skills.tar.gz"
if not archive.exists():
return (False, f"snapshot {target.name} has no skills.tar.gz — corrupted?", None)
skills = _skills_dir()
skills.mkdir(parents=True, exist_ok=True)
backups = _backups_dir()
backups.mkdir(parents=True, exist_ok=True)
# Step 2: safety snapshot of current state FIRST. If this fails we bail
# out before touching anything — otherwise a failed extract could leave
# the user with no skills.
try:
snapshot_skills(reason=f"pre-rollback to {target.name}")
except Exception as e:
return (False, f"pre-rollback safety snapshot failed: {e}", None)
# Additionally move current entries into an internal staging dir so
# the extract happens into an empty skills tree (predictable result).
# This dir is implementation detail — not listed as a restorable
# backup. The safety snapshot above is the user-facing undo handle.
staged = backups / f".rollback-staging-{_utc_id()}"
try:
staged.mkdir(parents=True, exist_ok=False)
except OSError as e:
return (False, f"failed to create staging dir: {e}", None)
moved: List[Tuple[Path, Path]] = []
try:
for entry in list(skills.iterdir()):
if entry.name in _EXCLUDE_TOP_LEVEL:
continue
dest = staged / entry.name
shutil.move(str(entry), str(dest))
moved.append((entry, dest))
except OSError as e:
# Best-effort rollback of the move
for orig, dest in moved:
try:
shutil.move(str(dest), str(orig))
except OSError:
pass
try:
shutil.rmtree(staged, ignore_errors=True)
except OSError:
pass
return (False, f"failed to stage current skills: {e}", None)
# Step 4: extract the snapshot into skills/
try:
with tarfile.open(archive, "r:gz") as tf:
# Python 3.12+ supports filter='data' for safer extraction.
# Fall back to the unfiltered call for older interpreters but
# still reject absolute paths and .. components defensively.
for member in tf.getmembers():
name = member.name
if name.startswith("/") or ".." in Path(name).parts:
raise tarfile.TarError(
f"refusing to extract unsafe path: {name!r}"
)
try:
tf.extractall(str(skills), filter="data") # type: ignore[call-arg]
except TypeError:
# Python < 3.12 — no filter kwarg
tf.extractall(str(skills))
except (OSError, tarfile.TarError) as e:
# Best-effort recover: move staged contents back
for orig, dest in moved:
try:
shutil.move(str(dest), str(orig))
except OSError:
pass
try:
shutil.rmtree(staged, ignore_errors=True)
except OSError:
pass
return (False, f"snapshot extract failed (state restored): {e}", None)
# Extract succeeded — the staging dir has served its purpose. The
# user's undo handle is the safety snapshot tarball we took earlier.
try:
shutil.rmtree(staged, ignore_errors=True)
except OSError:
pass
# Reconcile cron skill-links. Surgical: only the skills/skill fields
# on jobs matched by id. Everything else in jobs.json is live state
# (schedule, next_run_at, enabled, prompt, etc.) and we leave it
# alone. Failures here don't fail the overall rollback — the skills
# tree is already restored, which is the main guarantee.
cron_report = _restore_cron_skill_links(target)
summary_bits = [f"restored from snapshot {target.name}"]
if cron_report.get("attempted"):
restored_n = len(cron_report.get("restored") or [])
skipped_n = len(cron_report.get("skipped_missing") or [])
if cron_report.get("error"):
summary_bits.append(f"cron links: error — {cron_report['error']}")
elif restored_n == 0 and skipped_n == 0 and cron_report.get("unchanged", 0) == 0:
# Attempted but nothing matched — empty snapshot or no overlapping ids.
pass
else:
parts = []
if restored_n:
parts.append(f"{restored_n} job(s) had skill links restored")
if skipped_n:
parts.append(f"{skipped_n} backed-up job(s) no longer exist (skipped)")
if cron_report.get("unchanged"):
parts.append(f"{cron_report['unchanged']} already matched")
summary_bits.append("cron links: " + ", ".join(parts))
logger.info("Curator rollback: restored from %s (cron_report=%s)",
target.name, cron_report)
return (True, "; ".join(summary_bits), target)
# ---------------------------------------------------------------------------
# Human-readable summary for CLI
# ---------------------------------------------------------------------------
def format_size(n: int) -> str:
for unit in ("B", "KB", "MB", "GB"):
if n < 1024 or unit == "GB":
return f"{n:.1f} {unit}" if unit != "B" else f"{n} B"
n /= 1024
return f"{n:.1f} GB"
def summarize_backups() -> str:
rows = list_backups()
if not rows:
return "No curator snapshots yet."
lines = [f"{'id':<24} {'reason':<40} {'skills':>6} {'size':>8}"]
lines.append("" * len(lines[0]))
for r in rows:
lines.append(
f"{r.get('id','?'):<24} "
f"{(r.get('reason','?') or '?')[:40]:<40} "
f"{r.get('skill_files', 0):>6} "
f"{format_size(int(r.get('archive_bytes', 0))):>8}"
)
return "\n".join(lines)
+2 -64
View File
@@ -42,7 +42,6 @@ class FailoverReason(enum.Enum):
# Context / payload
context_overflow = "context_overflow" # Context too large — compress, not failover
payload_too_large = "payload_too_large" # 413 — compress payload
image_too_large = "image_too_large" # Native image part exceeds provider's per-image limit — shrink and retry
# Model
model_not_found = "model_not_found" # 404 or invalid model — fallback to different model
@@ -54,7 +53,6 @@ class FailoverReason(enum.Enum):
# Provider-specific
thinking_signature = "thinking_signature" # Anthropic thinking block sig invalid
long_context_tier = "long_context_tier" # Anthropic "extra usage" tier gate
oauth_long_context_beta_forbidden = "oauth_long_context_beta_forbidden" # Anthropic OAuth subscription rejects 1M context beta — disable beta and retry
# Catch-all
unknown = "unknown" # Unclassifiable — retry with backoff
@@ -92,7 +90,6 @@ class ClassifiedError:
_BILLING_PATTERNS = [
"insufficient credits",
"insufficient_quota",
"insufficient balance",
"credit balance",
"credits have been exhausted",
"top up your credits",
@@ -150,20 +147,6 @@ _PAYLOAD_TOO_LARGE_PATTERNS = [
"error code: 413",
]
# Image-size patterns. Matched against 400 bodies (not 413) because most
# providers return a 400 with a specific image-too-big message before the
# whole request hits the 413 size limit. Anthropic's wording is the most
# important here (hard 5 MB per image, returned as
# "messages.N.content.K.image.source.base64: image exceeds 5 MB maximum").
_IMAGE_TOO_LARGE_PATTERNS = [
"image exceeds", # Anthropic: "image exceeds 5 MB maximum"
"image too large", # generic
"image_too_large", # error_code variant
"image size exceeds", # variant
# "request_too_large" on a request known to contain an image → image is
# the likely culprit; we still try the shrink path before giving up.
]
# Context overflow patterns
_CONTEXT_OVERFLOW_PATTERNS = [
"context length",
@@ -451,25 +434,6 @@ def classify_api_error(
should_compress=True,
)
# Anthropic OAuth subscription rejects the 1M-context beta header.
# Observed error body: "The long context beta is not yet available for
# this subscription." Returned as HTTP 400 from native Anthropic when
# the subscription doesn't include 1M context, even though the request
# carries ``anthropic-beta: context-1m-2025-08-07``. The recovery path
# in run_agent.py rebuilds the Anthropic client with the beta stripped
# and retries once. Pattern is narrow enough that it won't collide with
# the 429 tier-gate pattern above (different status, different phrase).
if (
status_code == 400
and "long context beta" in error_msg
and "not yet available" in error_msg
):
return _result(
FailoverReason.oauth_long_context_beta_forbidden,
retryable=True,
should_compress=False,
)
# ── 2. HTTP status code classification ──────────────────────────
if status_code is not None:
@@ -520,12 +484,7 @@ def classify_api_error(
is_disconnect = any(p in error_msg for p in _SERVER_DISCONNECT_PATTERNS)
if is_disconnect and not status_code:
# Absolute token/message-count thresholds are only a proxy for smaller
# context windows. Large-context sessions can have hundreds of
# messages while still being far below their actual token budget.
is_large = approx_tokens > context_length * 0.6 or (
context_length <= 256000 and (approx_tokens > 120000 or num_messages > 200)
)
is_large = approx_tokens > context_length * 0.6 or approx_tokens > 120000 or num_messages > 200
if is_large:
return _result(
FailoverReason.context_overflow,
@@ -712,15 +671,6 @@ def _classify_400(
) -> ClassifiedError:
"""Classify 400 Bad Request — context overflow, format error, or generic."""
# Image-too-large from 400 (Anthropic's 5 MB per-image check fires this way).
# Must be checked BEFORE context_overflow because messages can trip both
# patterns ("exceeds" + "image") and image-shrink is a cheaper recovery.
if any(p in error_msg for p in _IMAGE_TOO_LARGE_PATTERNS):
return result_fn(
FailoverReason.image_too_large,
retryable=True,
)
# Context overflow from 400
if any(p in error_msg for p in _CONTEXT_OVERFLOW_PATTERNS):
return result_fn(
@@ -771,12 +721,7 @@ def _classify_400(
if not err_body_msg:
err_body_msg = str(body.get("message") or "").strip().lower()
is_generic = len(err_body_msg) < 30 or err_body_msg in ("error", "")
# Absolute token/message-count thresholds are only a proxy for smaller
# context windows. Large-context sessions can have many messages while
# still being far below their actual token budget.
is_large = approx_tokens > context_length * 0.4 or (
context_length <= 256000 and (approx_tokens > 80000 or num_messages > 80)
)
is_large = approx_tokens > context_length * 0.4 or approx_tokens > 80000 or num_messages > 80
if is_generic and is_large:
return result_fn(
@@ -853,13 +798,6 @@ def _classify_by_message(
should_compress=True,
)
# Image-too-large patterns (from message text when no status_code)
if any(p in error_msg for p in _IMAGE_TOO_LARGE_PATTERNS):
return result_fn(
FailoverReason.image_too_large,
retryable=True,
)
# Usage-limit patterns need the same disambiguation as 402: some providers
# surface "usage limit" errors without an HTTP status code. A transient
# signal ("try again", "resets at", …) means it's a periodic quota, not
+2
View File
@@ -30,6 +30,7 @@ from __future__ import annotations
import json
import logging
import os
import time
import uuid
from types import SimpleNamespace
@@ -41,6 +42,7 @@ from agent import google_oauth
from agent.gemini_schema import sanitize_gemini_tool_parameters
from agent.google_code_assist import (
CODE_ASSIST_ENDPOINT,
FREE_TIER_ID,
CodeAssistError,
ProjectContext,
resolve_project_context,
+1 -15
View File
@@ -679,21 +679,7 @@ def translate_stream_event(event: Dict[str, Any], model: str, tool_call_indices:
finish_reason_raw = str(cand.get("finishReason") or "")
if finish_reason_raw:
mapped = "tool_calls" if tool_call_indices else _map_gemini_finish_reason(finish_reason_raw)
finish_chunk = _make_stream_chunk(model=model, finish_reason=mapped)
# Attach usage from this event's usageMetadata so the streaming
# loop in run_agent.py can record token counts (mirrors the
# non-streaming path in translate_gemini_response).
usage_meta = event.get("usageMetadata") or {}
if usage_meta:
finish_chunk.usage = SimpleNamespace(
prompt_tokens=int(usage_meta.get("promptTokenCount") or 0),
completion_tokens=int(usage_meta.get("candidatesTokenCount") or 0),
total_tokens=int(usage_meta.get("totalTokenCount") or 0),
prompt_tokens_details=SimpleNamespace(
cached_tokens=int(usage_meta.get("cachedContentTokenCount") or 0),
),
)
chunks.append(finish_chunk)
chunks.append(_make_stream_chunk(model=model, finish_reason=mapped))
return chunks
+1 -1
View File
@@ -2,7 +2,7 @@
from __future__ import annotations
from typing import Any, Dict
from typing import Any, Dict, List
# Gemini's ``FunctionDeclaration.parameters`` field accepts the ``Schema``
# object, which is only a subset of OpenAPI 3.0 / JSON Schema. Strip fields
+1
View File
@@ -29,6 +29,7 @@ from __future__ import annotations
import json
import logging
import os
import time
import urllib.error
import urllib.parse
+5 -18
View File
@@ -49,13 +49,14 @@ import json
import logging
import os
import secrets
import socket
import stat
import threading
import time
import urllib.error
import urllib.parse
import urllib.request
from dataclasses import dataclass
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, Optional, Tuple
@@ -97,7 +98,6 @@ _DEFAULT_CLIENT_SECRET = f"GOCSPX-{_PUBLIC_CLIENT_SECRET_SUFFIX}"
# Regex patterns for fallback scraping from an installed gemini-cli.
import re as _re
from utils import atomic_replace
_CLIENT_ID_PATTERN = _re.compile(
r"OAUTH_CLIENT_ID\s*=\s*['\"]([0-9]+-[a-z0-9]+\.apps\.googleusercontent\.com)['\"]"
)
@@ -489,30 +489,17 @@ def save_credentials(creds: GoogleCredentials) -> Path:
"""Atomically write creds to disk with 0o600 permissions."""
path = _credentials_path()
path.parent.mkdir(parents=True, exist_ok=True)
# Tighten parent dir to 0o700 so siblings can't traverse to the creds file.
# On Windows this is a no-op (POSIX mode bits aren't enforced); ignore failures.
try:
os.chmod(path.parent, 0o700)
except OSError:
pass
payload = json.dumps(creds.to_dict(), indent=2, sort_keys=True) + "\n"
with _credentials_lock():
tmp_path = path.with_suffix(f".tmp.{os.getpid()}.{secrets.token_hex(4)}")
try:
# Create with 0o600 atomically to close the TOCTOU window where the
# default umask (often 0o644) would briefly expose tokens to other
# local users between open() and chmod().
fd = os.open(
str(tmp_path),
os.O_WRONLY | os.O_CREAT | os.O_EXCL,
stat.S_IRUSR | stat.S_IWUSR,
)
with os.fdopen(fd, "w", encoding="utf-8") as fh:
with open(tmp_path, "w", encoding="utf-8") as fh:
fh.write(payload)
fh.flush()
os.fsync(fh.fileno())
atomic_replace(tmp_path, path)
os.chmod(tmp_path, stat.S_IRUSR | stat.S_IWUSR)
os.replace(tmp_path, path)
finally:
try:
if tmp_path.exists():
-236
View File
@@ -1,236 +0,0 @@
"""Routing helpers for inbound user-attached images.
Two modes:
native attach images as OpenAI-style ``image_url`` content parts on the
user turn. Provider adapters (Anthropic, Gemini, Bedrock, Codex,
OpenAI chat.completions) already translate these into their
vendor-specific multimodal formats.
text run ``vision_analyze`` on each image up-front and prepend the
description to the user's text. The model never sees the pixels;
it only sees a lossy text summary. This is the pre-existing
behaviour and still the right choice for non-vision models.
The decision is made once per message turn by :func:`decide_image_input_mode`.
It reads ``agent.image_input_mode`` from config.yaml (``auto`` | ``native``
| ``text``, default ``auto``) and the active model's capability metadata.
In ``auto`` mode:
- If the user has explicitly configured ``auxiliary.vision.provider``
(i.e. not ``auto`` and not empty), we assume they want the text pipeline
regardless of the main model they've opted in to a specific vision
backend for a reason (cost, quality, local-only, etc.).
- Otherwise, if the active model reports ``supports_vision=True`` in its
models.dev metadata, we attach natively.
- Otherwise (non-vision model, no explicit override), we fall back to text.
This keeps ``vision_analyze`` surfaced as a tool in every session skills
and agent flows that chain it (browser screenshots, deeper inspection of
URL-referenced images, style-gating loops) keep working. The routing only
affects *how user-attached images on the current turn* are presented to the
main model.
"""
from __future__ import annotations
import base64
import logging
import mimetypes
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
logger = logging.getLogger(__name__)
_VALID_MODES = frozenset({"auto", "native", "text"})
def _coerce_mode(raw: Any) -> str:
"""Normalize a config value into one of the valid modes."""
if not isinstance(raw, str):
return "auto"
val = raw.strip().lower()
if val in _VALID_MODES:
return val
return "auto"
def _explicit_aux_vision_override(cfg: Optional[Dict[str, Any]]) -> bool:
"""True when the user configured a specific auxiliary vision backend.
An explicit override means the user *wants* the text pipeline (they're
paying for a dedicated vision model), so we don't silently bypass it.
"""
if not isinstance(cfg, dict):
return False
aux = cfg.get("auxiliary") or {}
if not isinstance(aux, dict):
return False
vision = aux.get("vision") or {}
if not isinstance(vision, dict):
return False
provider = str(vision.get("provider") or "").strip().lower()
model = str(vision.get("model") or "").strip()
base_url = str(vision.get("base_url") or "").strip()
# "auto" / "" / blank = not explicit
if provider in ("", "auto") and not model and not base_url:
return False
return True
def _lookup_supports_vision(provider: str, model: str) -> Optional[bool]:
"""Return True/False if we can resolve caps, None if unknown."""
if not provider or not model:
return None
try:
from agent.models_dev import get_model_capabilities
caps = get_model_capabilities(provider, model)
except Exception as exc: # pragma: no cover - defensive
logger.debug("image_routing: caps lookup failed for %s:%s%s", provider, model, exc)
return None
if caps is None:
return None
return bool(caps.supports_vision)
def decide_image_input_mode(
provider: str,
model: str,
cfg: Optional[Dict[str, Any]],
) -> str:
"""Return ``"native"`` or ``"text"`` for the given turn.
Args:
provider: active inference provider ID (e.g. ``"anthropic"``, ``"openrouter"``).
model: active model slug as it would be sent to the provider.
cfg: loaded config.yaml dict, or None. When None, behaves as auto.
"""
mode_cfg = "auto"
if isinstance(cfg, dict):
agent_cfg = cfg.get("agent") or {}
if isinstance(agent_cfg, dict):
mode_cfg = _coerce_mode(agent_cfg.get("image_input_mode"))
if mode_cfg == "native":
return "native"
if mode_cfg == "text":
return "text"
# auto
if _explicit_aux_vision_override(cfg):
return "text"
supports = _lookup_supports_vision(provider, model)
if supports is True:
return "native"
return "text"
# Image size handling is REACTIVE rather than proactive: we attempt native
# attachment at full size regardless of provider, and rely on
# ``run_agent._try_shrink_image_parts_in_messages`` to shrink + retry if
# the provider rejects the request (e.g. Anthropic's hard 5 MB per-image
# ceiling returned as HTTP 400 "image exceeds 5 MB maximum").
#
# Why reactive: our knowledge of provider ceilings is partial and evolving
# (OpenAI accepts 49 MB+, Anthropic 5 MB, Gemini 100 MB, others unknown).
# A proactive per-provider table would be stale the moment a provider raises
# or lowers its limit, and silently degrading quality for users on providers
# that would have accepted the full image is the worse failure mode.
# The shrink-on-reject path loses 1 API call + maybe 1s of Pillow work when
# it fires, which is cheaper than permanent quality loss.
def _guess_mime(path: Path) -> str:
mime, _ = mimetypes.guess_type(str(path))
if mime and mime.startswith("image/"):
return mime
# mimetypes on some Linux distros mis-maps .jpg; default to jpeg when
# the suffix looks imagey.
suffix = path.suffix.lower()
return {
".jpg": "image/jpeg",
".jpeg": "image/jpeg",
".png": "image/png",
".gif": "image/gif",
".webp": "image/webp",
".bmp": "image/bmp",
}.get(suffix, "image/jpeg")
def _file_to_data_url(path: Path) -> Optional[str]:
"""Encode a local image as a base64 data URL at its native size.
Size limits are NOT enforced here the agent retry loop
(``run_agent._try_shrink_image_parts_in_messages``) shrinks on the
provider's first rejection. Keeping this simple means providers that
accept large images (OpenAI 49 MB+, Gemini 100 MB) don't pay a silent
quality tax just because one other provider is stricter.
Returns None only if the file can't be read (missing, permission
denied, etc.); the caller reports those paths in ``skipped``.
"""
try:
raw = path.read_bytes()
except Exception as exc:
logger.warning("image_routing: failed to read %s%s", path, exc)
return None
mime = _guess_mime(path)
b64 = base64.b64encode(raw).decode("ascii")
return f"data:{mime};base64,{b64}"
def build_native_content_parts(
user_text: str,
image_paths: List[str],
) -> Tuple[List[Dict[str, Any]], List[str]]:
"""Build an OpenAI-style ``content`` list for a user turn.
Shape:
[{"type": "text", "text": "..."},
{"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}},
...]
Images are attached at their native size. If a provider rejects the
request because an image is too large (e.g. Anthropic's 5 MB per-image
ceiling), the agent's retry loop transparently shrinks and retries
once see ``run_agent._try_shrink_image_parts_in_messages``.
Returns (content_parts, skipped_paths). Skipped paths are files that
couldn't be read from disk.
"""
parts: List[Dict[str, Any]] = []
skipped: List[str] = []
text = (user_text or "").strip()
if text:
parts.append({"type": "text", "text": text})
for raw_path in image_paths:
p = Path(raw_path)
if not p.exists() or not p.is_file():
skipped.append(str(raw_path))
continue
data_url = _file_to_data_url(p)
if not data_url:
skipped.append(str(raw_path))
continue
parts.append({
"type": "image_url",
"image_url": {"url": data_url},
})
# If the text was empty, add a neutral prompt so the turn isn't just images.
if not text and any(p.get("type") == "image_url" for p in parts):
parts.insert(0, {"type": "text", "text": "What do you see in this image?"})
return parts, skipped
__all__ = [
"decide_image_input_mode",
"build_native_content_parts",
]
-48
View File
@@ -1,48 +0,0 @@
"""LM Studio reasoning-effort resolution shared by the chat-completions
transport and run_agent's iteration-limit summary path.
LM Studio publishes per-model ``capabilities.reasoning.allowed_options`` (e.g.
``["off","on"]`` for toggle-style models, ``["off","minimal","low"]`` for
graduated models). We map the user's ``reasoning_config`` onto LM Studio's
OpenAI-compatible vocabulary, then clamp against the model's allowed set so
the server doesn't 400 on an unsupported effort.
"""
from __future__ import annotations
from typing import List, Optional
# LM Studio accepts these top-level reasoning_effort values via its
# OpenAI-compatible chat.completions endpoint.
_LM_VALID_EFFORTS = {"none", "minimal", "low", "medium", "high", "xhigh"}
# Toggle-style models publish allowed_options as ["off","on"] in /api/v1/models.
# Map them onto the OpenAI-compatible request vocabulary.
_LM_EFFORT_ALIASES = {"off": "none", "on": "medium"}
def resolve_lmstudio_effort(
reasoning_config: Optional[dict],
allowed_options: Optional[List[str]],
) -> Optional[str]:
"""Return the ``reasoning_effort`` string to send to LM Studio, or ``None``.
``None`` means "omit the field": the user picked a level the model can't
honor, so let LM Studio fall back to the model's declared default rather
than silently substituting a different effort. When ``allowed_options`` is
falsy (probe failed), skip clamping and send the resolved effort anyway.
"""
effort = "medium"
if reasoning_config and isinstance(reasoning_config, dict):
if reasoning_config.get("enabled") is False:
effort = "none"
else:
raw = (reasoning_config.get("effort") or "").strip().lower()
raw = _LM_EFFORT_ALIASES.get(raw, raw)
if raw in _LM_VALID_EFFORTS:
effort = raw
if allowed_options:
allowed = {_LM_EFFORT_ALIASES.get(opt, opt) for opt in allowed_options}
if effort not in allowed:
return None
return effort
+5 -5
View File
@@ -20,25 +20,25 @@ def summarize_manual_compression(
headline = f"No changes from compression: {before_count} messages"
if after_tokens == before_tokens:
token_line = (
f"Approx request size: ~{before_tokens:,} tokens (unchanged)"
f"Rough transcript estimate: ~{before_tokens:,} tokens (unchanged)"
)
else:
token_line = (
f"Approx request size: ~{before_tokens:,}"
f"Rough transcript estimate: ~{before_tokens:,}"
f"~{after_tokens:,} tokens"
)
else:
headline = f"Compressed: {before_count}{after_count} messages"
token_line = (
f"Approx request size: ~{before_tokens:,}"
f"Rough transcript estimate: ~{before_tokens:,}"
f"~{after_tokens:,} tokens"
)
note = None
if not noop and after_count < before_count and after_tokens > before_tokens:
note = (
"Note: fewer messages can still raise this estimate when "
"compression rewrites the transcript into denser summaries."
"Note: fewer messages can still raise this rough transcript estimate "
"when compression rewrites the transcript into denser summaries."
)
return {
+8 -152
View File
@@ -28,6 +28,7 @@ Usage in run_agent.py:
from __future__ import annotations
import json
import logging
import re
import inspect
@@ -49,7 +50,7 @@ _INTERNAL_CONTEXT_RE = re.compile(
re.IGNORECASE,
)
_INTERNAL_NOTE_RE = re.compile(
r'\[System note:\s*The following is recalled memory context,\s*NOT new user input\.\s*Treat as (?:informational background data|authoritative reference data[^\]]*)\.\]\s*',
r'\[System note:\s*The following is recalled memory context,\s*NOT new user input\.\s*Treat as informational background data\.\]\s*',
re.IGNORECASE,
)
@@ -62,129 +63,19 @@ def sanitize_context(text: str) -> str:
return text
class StreamingContextScrubber:
"""Stateful scrubber for streaming text that may contain split memory-context spans.
The one-shot ``sanitize_context`` regex cannot survive chunk boundaries:
a ``<memory-context>`` opened in one delta and closed in a later delta
leaks its payload to the UI because the non-greedy block regex needs
both tags in one string. This scrubber runs a small state machine
across deltas, holding back partial-tag tails and discarding
everything inside a span (including the system-note line).
Usage::
scrubber = StreamingContextScrubber()
for delta in stream:
visible = scrubber.feed(delta)
if visible:
emit(visible)
trailing = scrubber.flush() # at end of stream
if trailing:
emit(trailing)
The scrubber is re-entrant per agent instance. Callers building new
top-level responses (new turn) should create a fresh scrubber or call
``reset()``.
"""
_OPEN_TAG = "<memory-context>"
_CLOSE_TAG = "</memory-context>"
def __init__(self) -> None:
self._in_span: bool = False
self._buf: str = ""
def reset(self) -> None:
self._in_span = False
self._buf = ""
def feed(self, text: str) -> str:
"""Return the visible portion of ``text`` after scrubbing.
Any trailing fragment that could be the start of an open/close tag
is held back in the internal buffer and surfaced on the next
``feed()`` call or discarded/emitted by ``flush()``.
"""
if not text:
return ""
buf = self._buf + text
self._buf = ""
out: list[str] = []
while buf:
if self._in_span:
idx = buf.lower().find(self._CLOSE_TAG)
if idx == -1:
# Hold back a potential partial close tag; drop the rest
held = self._max_partial_suffix(buf, self._CLOSE_TAG)
self._buf = buf[-held:] if held else ""
return "".join(out)
# Found close — skip span content + tag, continue
buf = buf[idx + len(self._CLOSE_TAG):]
self._in_span = False
else:
idx = buf.lower().find(self._OPEN_TAG)
if idx == -1:
# No open tag — hold back a potential partial open tag
held = self._max_partial_suffix(buf, self._OPEN_TAG)
if held:
out.append(buf[:-held])
self._buf = buf[-held:]
else:
out.append(buf)
return "".join(out)
# Emit text before the tag, enter span
if idx > 0:
out.append(buf[:idx])
buf = buf[idx + len(self._OPEN_TAG):]
self._in_span = True
return "".join(out)
def flush(self) -> str:
"""Emit any held-back buffer at end-of-stream.
If we're still inside an unterminated span the remaining content is
discarded (safer: leaking partial memory context is worse than a
truncated answer). Otherwise the held-back partial-tag tail is
emitted verbatim (it turned out not to be a real tag).
"""
if self._in_span:
self._buf = ""
self._in_span = False
return ""
tail = self._buf
self._buf = ""
return tail
@staticmethod
def _max_partial_suffix(buf: str, tag: str) -> int:
"""Return the length of the longest buf-suffix that is a tag-prefix.
Case-insensitive. Returns 0 if no suffix could start the tag.
"""
tag_lower = tag.lower()
buf_lower = buf.lower()
max_check = min(len(buf_lower), len(tag_lower) - 1)
for i in range(max_check, 0, -1):
if tag_lower.startswith(buf_lower[-i:]):
return i
return 0
def build_memory_context_block(raw_context: str) -> str:
"""Wrap prefetched memory in a fenced block with system note."""
"""Wrap prefetched memory in a fenced block with system note.
The fence prevents the model from treating recalled context as user
discourse. Injected at API-call time only never persisted.
"""
if not raw_context or not raw_context.strip():
return ""
clean = sanitize_context(raw_context)
if clean != raw_context:
logger.warning("memory provider returned pre-wrapped context; stripped")
return (
"<memory-context>\n"
"[System note: The following is recalled memory context, "
"NOT new user input. Treat as authoritative reference data — "
"this is the agent's persistent memory and should inform all responses.]\n\n"
"NOT new user input. Treat as informational background data.]\n\n"
f"{clean}\n"
"</memory-context>"
)
@@ -403,41 +294,6 @@ class MemoryManager:
provider.name, e,
)
def on_session_switch(
self,
new_session_id: str,
*,
parent_session_id: str = "",
reset: bool = False,
**kwargs,
) -> None:
"""Notify all providers that the agent's session_id has rotated.
Fires on ``/resume``, ``/branch``, ``/reset``, ``/new``, and
context compression any path that reassigns
``AIAgent.session_id`` without tearing the provider down.
Providers keep running; they only need to refresh cached
per-session state so subsequent writes land in the correct
session's record. See ``MemoryProvider.on_session_switch`` for
the full contract.
"""
if not new_session_id:
return
for provider in self._providers:
try:
provider.on_session_switch(
new_session_id,
parent_session_id=parent_session_id,
reset=reset,
**kwargs,
)
except Exception as e:
logger.debug(
"Memory provider '%s' on_session_switch failed: %s",
provider.name, e,
)
def on_pre_compress(self, messages: List[Dict[str, Any]]) -> str:
"""Notify all providers before context compression.
-40
View File
@@ -25,7 +25,6 @@ Lifecycle (called by MemoryManager, wired in run_agent.py):
Optional hooks (override to opt in):
on_turn_start(turn, message, **kwargs) per-turn tick with runtime context
on_session_end(messages) end-of-session extraction
on_session_switch(new_session_id, **kwargs) mid-process session_id rotation
on_pre_compress(messages) -> str extract before context compression
on_memory_write(action, target, content, metadata=None) mirror built-in memory writes
on_delegation(task, result, **kwargs) parent-side observation of subagent work
@@ -161,45 +160,6 @@ class MemoryProvider(ABC):
(CLI exit, /reset, gateway session expiry).
"""
def on_session_switch(
self,
new_session_id: str,
*,
parent_session_id: str = "",
reset: bool = False,
**kwargs,
) -> None:
"""Called when the agent switches session_id mid-process.
Fires on ``/resume``, ``/branch``, ``/reset``, ``/new`` (CLI), the
gateway equivalents, and context compression any path that
reassigns ``AIAgent.session_id`` without tearing the provider down.
Providers that cache per-session state in ``initialize()``
(``_session_id``, ``_document_id``, accumulated turn buffers,
counters) should update or reset that state here so subsequent
writes land in the correct session's record.
Parameters
----------
new_session_id:
The session_id the agent just switched to.
parent_session_id:
The previous session_id, if meaningful set for ``/branch``
(fork lineage), context compression (continuation lineage),
and ``/resume`` (the session we're leaving). Empty string
when no lineage applies.
reset:
``True`` when this is a genuinely new conversation, not a
resumption of an existing one. Fired by ``/reset`` / ``/new``.
Providers should flush accumulated per-session buffers
(``_session_turns``, ``_turn_counter``, etc.) when this is
set. ``False`` for ``/resume`` / ``/branch`` / compression
where the logical conversation continues under the new id.
Default is no-op for backward compatibility.
"""
def on_pre_compress(self, messages: List[Dict[str, Any]]) -> str:
"""Called before context compression discards old messages.
+28 -52
View File
@@ -46,13 +46,11 @@ def _resolve_requests_verify() -> bool | str:
# are preserved so the full model name reaches cache lookups and server queries.
_PROVIDER_PREFIXES: frozenset[str] = frozenset({
"openrouter", "nous", "openai-codex", "copilot", "copilot-acp",
"gemini", "ollama-cloud", "zai", "kimi-coding", "kimi-coding-cn", "stepfun", "minimax", "minimax-oauth", "minimax-cn", "anthropic", "deepseek",
"gemini", "ollama-cloud", "zai", "kimi-coding", "kimi-coding-cn", "stepfun", "minimax", "minimax-cn", "anthropic", "deepseek",
"opencode-zen", "opencode-go", "ai-gateway", "kilocode", "alibaba",
"qwen-oauth",
"xiaomi",
"arcee",
"gmi",
"tencent-tokenhub",
"custom", "local",
# Common aliases
"google", "google-gemini", "google-ai-studio",
@@ -61,9 +59,7 @@ _PROVIDER_PREFIXES: frozenset[str] = frozenset({
"ollama",
"stepfun", "opencode", "zen", "go", "vercel", "kilo", "dashscope", "aliyun", "qwen",
"mimo", "xiaomi-mimo",
"tencent", "tokenhub", "tencent-cloud", "tencentmaas",
"arcee-ai", "arceeai",
"gmi-cloud", "gmicloud",
"xai", "x-ai", "x.ai", "grok",
"nvidia", "nim", "nvidia-nim", "nemotron",
"qwen-portal",
@@ -210,8 +206,6 @@ DEFAULT_CONTEXT_LENGTHS = {
"grok": 131072, # catch-all (grok-beta, unknown grok-*)
# Kimi
"kimi": 262144,
# Tencent — Hy3 Preview (Hunyuan) with 256K context window
"hy3-preview": 256000,
# Nemotron — NVIDIA's open-weights series (128K context across all sizes)
"nemotron": 131072,
# Arcee
@@ -313,8 +307,6 @@ _URL_TO_PROVIDER: Dict[str, str] = {
"integrate.api.nvidia.com": "nvidia",
"api.xiaomimimo.com": "xiaomi",
"xiaomimimo.com": "xiaomi",
"api.gmi-serving.com": "gmi",
"tokenhub.tencentmaas.com": "tencent-tokenhub",
"ollama.com": "ollama-cloud",
}
@@ -625,6 +617,8 @@ def fetch_endpoint_model_metadata(
if isinstance(ctx, int) and ctx > 0:
context_length = ctx
break
if context_length is None:
context_length = _extract_context_length(model)
if context_length is not None:
entry["context_length"] = context_length
@@ -708,29 +702,6 @@ def fetch_endpoint_model_metadata(
return {}
def _resolve_endpoint_context_length(
model: str,
base_url: str,
api_key: str = "",
) -> Optional[int]:
"""Resolve context length from an endpoint's live ``/models`` metadata."""
endpoint_metadata = fetch_endpoint_model_metadata(base_url, api_key=api_key)
matched = endpoint_metadata.get(model)
if not matched:
if len(endpoint_metadata) == 1:
matched = next(iter(endpoint_metadata.values()))
else:
for key, entry in endpoint_metadata.items():
if model in key or key in model:
matched = entry
break
if matched:
context_length = matched.get("context_length")
if isinstance(context_length, int):
return context_length
return None
def _get_context_cache_path() -> Path:
"""Return path to the persistent context length cache file."""
from hermes_constants import get_hermes_home
@@ -1014,7 +985,10 @@ def _query_local_context_length(model: str, base_url: str, api_key: str = "") ->
ctx = cfg.get("context_length")
if ctx and isinstance(ctx, (int, float)):
return int(ctx)
break
# Fall back to max_context_length (theoretical model max)
ctx = m.get("max_context_length") or m.get("context_length")
if ctx and isinstance(ctx, (int, float)):
return int(ctx)
# LM Studio / vLLM / llama.cpp: try /v1/models/{model}
resp = client.get(f"{server_url}/v1/models/{model}")
@@ -1247,7 +1221,7 @@ def get_model_context_length(
6. Nous suffix-match via OpenRouter cache
7. models.dev registry lookup (provider-aware)
8. Thin hardcoded defaults (broad family patterns)
9. Default fallback (256K)
9. Default fallback (128K)
"""
# 0. Explicit config override — user knows best
if config_context_length is not None and isinstance(config_context_length, int) and config_context_length > 0:
@@ -1276,10 +1250,7 @@ def get_model_context_length(
model = _strip_provider_prefix(model)
# 1. Check persistent cache (model+provider)
# LM Studio is excluded — its loaded context length is transient (the
# user can reload the model with a different context_length at any time
# via /api/v1/models/load), so a stale cached value would mask reloads.
if base_url and provider != "lmstudio":
if base_url:
cached = get_cached_context_length(model, base_url)
if cached is not None:
# Invalidate stale Codex OAuth cache entries: pre-PR #14935 builds
@@ -1324,16 +1295,28 @@ def get_model_context_length(
# returns 128k) instead of the model's full context (400k). models.dev
# has the correct per-provider values and is checked at step 5+.
if _is_custom_endpoint(base_url) and not _is_known_provider_base_url(base_url):
context_length = _resolve_endpoint_context_length(model, base_url, api_key=api_key)
if context_length is not None:
return context_length
endpoint_metadata = fetch_endpoint_model_metadata(base_url, api_key=api_key)
matched = endpoint_metadata.get(model)
if not matched:
# Single-model servers: if only one model is loaded, use it
if len(endpoint_metadata) == 1:
matched = next(iter(endpoint_metadata.values()))
else:
# Fuzzy match: substring in either direction
for key, entry in endpoint_metadata.items():
if model in key or key in model:
matched = entry
break
if matched:
context_length = matched.get("context_length")
if isinstance(context_length, int):
return context_length
if not _is_known_provider_base_url(base_url):
# 3. Try querying local server directly
if is_local_endpoint(base_url):
local_ctx = _query_local_context_length(model, base_url, api_key=api_key)
if local_ctx and local_ctx > 0:
if provider != "lmstudio":
save_context_length(model, base_url, local_ctx)
save_context_length(model, base_url, local_ctx)
return local_ctx
logger.info(
"Could not detect context length for model %r at %s"
@@ -1391,12 +1374,6 @@ def get_model_context_length(
if base_url:
save_context_length(model, base_url, codex_ctx)
return codex_ctx
if effective_provider == "gmi" and base_url:
# GMI exposes authoritative context_length via /models, but it is not
# in models.dev yet. Preserve that higher-fidelity endpoint lookup.
ctx = _resolve_endpoint_context_length(model, base_url, api_key=api_key)
if ctx is not None:
return ctx
if effective_provider:
from agent.models_dev import lookup_models_dev_context
ctx = lookup_models_dev_context(effective_provider, model)
@@ -1423,11 +1400,10 @@ def get_model_context_length(
if base_url and is_local_endpoint(base_url):
local_ctx = _query_local_context_length(model, base_url, api_key=api_key)
if local_ctx and local_ctx > 0:
if provider != "lmstudio":
save_context_length(model, base_url, local_ctx)
save_context_length(model, base_url, local_ctx)
return local_ctx
# 10. Default fallback — 256K
# 10. Default fallback — 128K
return DEFAULT_FALLBACK_CONTEXT
-1
View File
@@ -149,7 +149,6 @@ PROVIDER_TO_MODELS_DEV: Dict[str, str] = {
"stepfun": "stepfun",
"kimi-coding-cn": "kimi-for-coding",
"minimax": "minimax",
"minimax-oauth": "minimax",
"minimax-cn": "minimax-cn",
"deepseek": "deepseek",
"alibaba": "alibaba",
+4 -45
View File
@@ -81,56 +81,15 @@ def _repair_schema(node: Any, is_schema: bool = True) -> Any:
return repaired
# Rule 2: when anyOf is present, type belongs only on the children.
# Additionally, Moonshot rejects null-type branches inside anyOf
# (enum value (<nil>) does not match any type in [string]).
# Collapse the anyOf to the first non-null branch and infer its type.
if "anyOf" in repaired and isinstance(repaired["anyOf"], list):
repaired.pop("type", None)
non_null = [b for b in repaired["anyOf"]
if isinstance(b, dict) and b.get("type") != "null"]
if non_null and len(non_null) < len(repaired["anyOf"]):
# Drop the anyOf wrapper — keep only the non-null branch.
# If there's a single non-null branch, promote it and fall
# through to Rules 1/3 so nullable/enum cleanup still applies
# to the merged node.
if len(non_null) == 1:
merge = {k: v for k, v in repaired.items() if k != "anyOf"}
merge.update(non_null[0])
repaired = merge
else:
repaired["anyOf"] = non_null
return repaired
else:
# Nothing to collapse — parent type stripped, children already
# repaired by the recursive walk above.
return repaired
# Moonshot also rejects non-standard keywords like ``nullable`` on
# parameter schemas — strip it.
repaired.pop("nullable", None)
return repaired
# Rule 1: property schemas without type need one. $ref nodes are exempt
# — their type comes from the referenced definition.
# Fill missing type BEFORE Rule 3 so enum cleanup can check the type.
if "$ref" not in repaired:
repaired = _fill_missing_type(repaired)
# Rule 3: Moonshot rejects null/empty-string values inside enum arrays
# when the parent type is a scalar (string, integer, etc.). The error:
# "enum value (<nil>) does not match any type in [string]"
# Strip null and empty-string from enum values, and if the enum becomes
# empty, drop it entirely.
if "enum" in repaired and isinstance(repaired["enum"], list):
node_type = repaired.get("type")
if node_type in ("string", "integer", "number", "boolean"):
cleaned = [v for v in repaired["enum"]
if v is not None and v != ""]
if cleaned:
repaired["enum"] = cleaned
else:
repaired.pop("enum")
return repaired
if "$ref" in repaired:
return repaired
return _fill_missing_type(repaired)
def _fill_missing_type(node: Dict[str, Any]) -> Dict[str, Any]:
+1 -2
View File
@@ -18,7 +18,6 @@ import os
import tempfile
import time
from typing import Any, Mapping, Optional
from utils import atomic_replace
logger = logging.getLogger(__name__)
@@ -119,7 +118,7 @@ def record_nous_rate_limit(
try:
with os.fdopen(fd, "w") as f:
json.dump(state, f)
atomic_replace(tmp_path, path)
os.replace(tmp_path, path)
except Exception:
# Clean up temp file on failure
try:
+5 -54
View File
@@ -25,7 +25,6 @@ logger = logging.getLogger(__name__)
BUSY_INPUT_FLAG = "busy_input_prompt"
TOOL_PROGRESS_FLAG = "tool_progress_prompt"
OPENCLAW_RESIDUE_FLAG = "openclaw_residue_cleanup"
# -------------------------------------------------------------------------
@@ -44,18 +43,10 @@ def busy_input_hint_gateway(mode: str) -> str:
"Send `/busy interrupt` to make new messages stop the current task "
"immediately, or `/busy status` to check. This notice won't appear again."
)
if mode == "steer":
return (
"💡 First-time tip — I steered your message into the current run; "
"it will arrive after the next tool call instead of interrupting. "
"Send `/busy interrupt` or `/busy queue` to change this, or "
"`/busy status` to check. This notice won't appear again."
)
return (
"💡 First-time tip — I just interrupted my current task to answer you. "
"Send `/busy queue` to queue follow-ups for after the current task instead, "
"`/busy steer` to inject them mid-run without interrupting, or "
"`/busy status` to check. This notice won't appear again."
"or `/busy status` to check. This notice won't appear again."
)
@@ -64,19 +55,13 @@ def busy_input_hint_cli(mode: str) -> str:
if mode == "queue":
return (
"(tip) Your message was queued for the next turn. "
"Use /busy interrupt to make Enter stop the current run instead, "
"or /busy steer to inject mid-run. This tip only shows once."
)
if mode == "steer":
return (
"(tip) Your message was steered into the current run; it arrives "
"after the next tool call. Use /busy interrupt or /busy queue to "
"change this. This tip only shows once."
"Use /busy interrupt to make Enter stop the current run instead. "
"This tip only shows once."
)
return (
"(tip) Your message interrupted the current run. "
"Use /busy queue to queue messages for the next turn instead, "
"or /busy steer to inject mid-run. This tip only shows once."
"Use /busy queue to queue messages for the next turn instead. "
"This tip only shows once."
)
@@ -95,37 +80,6 @@ def tool_progress_hint_cli() -> str:
)
def openclaw_residue_hint_cli() -> str:
"""Banner shown the first time Hermes starts and finds ``~/.openclaw/``.
Points users at ``hermes claw migrate`` (non-destructive port of config,
memory, and skills) first. ``hermes claw cleanup`` is mentioned as the
follow-up step for users who have already migrated and want to archive
the old directory with a warning that archiving breaks OpenClaw.
"""
return (
"A legacy OpenClaw directory was detected at ~/.openclaw/.\n"
"To port your config, memory, and skills over to Hermes, run "
"`hermes claw migrate`.\n"
"If you've already migrated and want to archive the old directory, "
"run `hermes claw cleanup` (renames it to ~/.openclaw.pre-migration — "
"OpenClaw will stop working after this).\n"
"This tip only shows once."
)
def detect_openclaw_residue(home: Optional[Path] = None) -> bool:
"""Return True if an OpenClaw workspace directory is present in ``$HOME``.
Pure filesystem check no side effects. ``home`` override exists for tests.
"""
base = home or Path.home()
try:
return (base / ".openclaw").is_dir()
except OSError:
return False
# -------------------------------------------------------------------------
# State read / write
# -------------------------------------------------------------------------
@@ -181,13 +135,10 @@ def mark_seen(config_path: Path, flag: str) -> bool:
__all__ = [
"BUSY_INPUT_FLAG",
"TOOL_PROGRESS_FLAG",
"OPENCLAW_RESIDUE_FLAG",
"busy_input_hint_gateway",
"busy_input_hint_cli",
"tool_progress_hint_gateway",
"tool_progress_hint_cli",
"openclaw_residue_hint_cli",
"detect_openclaw_residue",
"is_seen",
"mark_seen",
]
+2 -40
View File
@@ -141,12 +141,6 @@ DEFAULT_AGENT_IDENTITY = (
"Be targeted and efficient in your exploration and investigations."
)
HERMES_AGENT_HELP_GUIDANCE = (
"If the user asks about configuring, setting up, or using Hermes Agent "
"itself, load the `hermes-agent` skill with skill_view(name='hermes-agent') "
"before answering. Docs: https://hermes-agent.nousresearch.com/docs"
)
MEMORY_GUIDANCE = (
"You have persistent memory across sessions. Save durable facts using the memory "
"tool: user preferences, environment details, tool quirks, and stable conventions. "
@@ -183,8 +177,8 @@ SKILLS_GUIDANCE = (
)
KANBAN_GUIDANCE = (
"# Kanban task execution protocol\n"
"You have been assigned ONE task from "
"# You are a Kanban worker\n"
"You were spawned by the Hermes Kanban dispatcher to execute ONE task from "
"the shared board at `~/.hermes/kanban.db`. Your task id is in "
"`$HERMES_KANBAN_TASK`; your workspace is `$HERMES_KANBAN_WORKSPACE`. "
"The `kanban_*` tools in your schema are your primary coordination surface — "
@@ -368,10 +362,6 @@ PLATFORM_HINTS = {
"Standard markdown is automatically converted to Telegram format. "
"Supported: **bold**, *italic*, ~~strikethrough~~, ||spoiler||, "
"`inline code`, ```code blocks```, [links](url), and ## headers. "
"Telegram has NO table syntax — prefer bullet lists or labeled "
"key: value pairs over pipe tables (any tables you do emit are "
"auto-rewritten into row-group bullets, which you can produce "
"directly for cleaner output). "
"You can send media files natively: to deliver a file to the user, "
"include MEDIA:/absolute/path/to/file in your response. Images "
"(.png, .jpg, .webp) appear as photos, audio (.ogg) sends as voice "
@@ -490,29 +480,6 @@ PLATFORM_HINTS = {
"your response. Images are sent as native photos, and other files arrive as downloadable "
"documents."
),
"yuanbao": (
"You are on Yuanbao (腾讯元宝), a Chinese AI assistant platform. "
"Markdown formatting is supported (code blocks, tables, bold/italic). "
"You CAN send media files natively — to deliver a file to the user, include "
"MEDIA:/absolute/path/to/file in your response. The file will be sent as a native "
"Yuanbao attachment: images (.jpg, .png, .webp, .gif) are sent as photos, "
"and other files (.pdf, .docx, .txt, .zip, etc.) arrive as downloadable documents "
"(max 50 MB). You can also include image URLs in markdown format ![alt](url) and "
"they will be downloaded and sent as native photos. "
"Do NOT tell the user you lack file-sending capability — use MEDIA: syntax "
"whenever a file delivery is appropriate.\n\n"
"Stickers (贴纸 / 表情包 / TIM face): Yuanbao has a built-in sticker catalogue. "
"When the user sends a sticker (you see '[emoji: 名称]' in their message) or asks "
"you to send/reply-with a 贴纸/表情/表情包, you MUST use the sticker tools:\n"
" 1. Call yb_search_sticker with a Chinese keyword (e.g. '666', '比心', '吃瓜', "
" '捂脸', '合十') to discover matching sticker_ids.\n"
" 2. Call yb_send_sticker with the chosen sticker_id or name — this sends a real "
" TIMFaceElem that renders as a native sticker in the chat.\n"
"DO NOT draw sticker-like PNGs with execute_code/Pillow/matplotlib and then send "
"them via MEDIA: or send_image_file. That produces a fake low-quality 'sticker' "
"image and is the WRONG path. Bare Unicode emoji in text is also not a substitute "
"— when a sticker is the right response, use yb_send_sticker."
),
}
# ---------------------------------------------------------------------------
@@ -916,11 +883,6 @@ def build_skills_system_prompt(
"Skills also encode the user's preferred approach, conventions, and quality standards "
"for tasks like code review, planning, and testing — load them even for tasks you "
"already know how to do, because the skill defines how it should be done here.\n"
"Whenever the user asks you to configure, set up, install, enable, disable, modify, "
"or troubleshoot Hermes Agent itself — its CLI, config, models, providers, tools, "
"skills, voice, gateway, plugins, or any feature — load the `hermes-agent` skill "
"first. It has the actual commands (e.g. `hermes config set …`, `hermes tools`, "
"`hermes setup`) so you don't have to guess or invent workarounds.\n"
"If a skill has issues, fix it with skill_manage(action='patch').\n"
"After difficult/iterative tasks, offer to save as a skill. "
"If a skill you loaded was missing steps, had wrong commands, or needed "
+18 -78
View File
@@ -56,12 +56,8 @@ _SENSITIVE_BODY_KEYS = frozenset({
})
# Snapshot at import time so runtime env mutations (e.g. LLM-generated
# `export HERMES_REDACT_SECRETS=true`) cannot enable/disable redaction
# mid-session. OFF by default — user must opt in via
# `security.redact_secrets: true` in config.yaml (bridged to this env var
# in hermes_cli/main.py and gateway/run.py) or `HERMES_REDACT_SECRETS=true`
# in ~/.hermes/.env.
_REDACT_ENABLED = os.getenv("HERMES_REDACT_SECRETS", "").lower() in ("1", "true", "yes", "on")
# `export HERMES_REDACT_SECRETS=false`) cannot disable redaction mid-session.
_REDACT_ENABLED = os.getenv("HERMES_REDACT_SECRETS", "").lower() not in ("0", "false", "no", "off")
# Known API key prefixes -- match the prefix + contiguous token chars
_PREFIX_PATTERNS = [
@@ -184,59 +180,11 @@ _PREFIX_RE = re.compile(
)
def mask_secret(
value: str,
*,
head: int = 4,
tail: int = 4,
floor: int = 12,
placeholder: str = "***",
empty: str = "",
) -> str:
"""Mask a secret for display, preserving ``head`` and ``tail`` characters.
Canonical helper for display-time redaction across Hermes used by
``hermes config``, ``hermes status``, ``hermes dump``, and anywhere
a secret needs to be shown truncated for debuggability while still
keeping the bulk hidden.
Args:
value: The secret to mask. ``None``/empty returns ``empty``.
head: Leading characters to preserve. Default 4.
tail: Trailing characters to preserve. Default 4.
floor: Values shorter than ``head + tail + floor_margin`` are
fully masked (returns ``placeholder``). Default 12
matches the existing config/status/dump convention.
placeholder: Value returned for too-short inputs. Default ``"***"``.
empty: Value returned when ``value`` is falsy (None, ""). The
caller can override this to e.g. ``color("(not set)",
Colors.DIM)`` for user-facing display.
Examples:
>>> mask_secret("sk-proj-abcdef1234567890")
'sk-p...7890'
>>> mask_secret("short") # fully masked
'***'
>>> mask_secret("") # empty default
''
>>> mask_secret("", empty="(not set)") # empty override
'(not set)'
>>> mask_secret("long-token", head=6, tail=4, floor=18)
'***'
"""
if not value:
return empty
if len(value) < floor:
return placeholder
return f"{value[:head]}...{value[-tail:]}"
def _mask_token(token: str) -> str:
"""Mask a log token — conservative 18-char floor, preserves 6 prefix / 4 suffix."""
# Empty input: historically this returned "***" rather than "". Preserve.
if not token:
"""Mask a token, preserving prefix for long tokens."""
if len(token) < 18:
return "***"
return mask_secret(token, head=6, tail=4, floor=18)
return f"{token[:6]}...{token[-4:]}"
def _redact_query_string(query: str) -> str:
@@ -305,18 +253,11 @@ def _redact_form_body(text: str) -> str:
return _redact_query_string(text.strip())
def redact_sensitive_text(text: str, *, force: bool = False, code_file: bool = False) -> str:
def redact_sensitive_text(text: str) -> str:
"""Apply all redaction patterns to a block of text.
Safe to call on any string -- non-matching text passes through unchanged.
Disabled by default enable via security.redact_secrets: true in config.yaml.
Set force=True for safety boundaries that must never return raw secrets
regardless of the user's global logging redaction preference.
Set code_file=True to skip the ENV-assignment and JSON-field regex
patterns when the text is known to be source code (e.g. MAX_TOKENS=***
constants, "apiKey": "test" fixtures). Prefix patterns, auth headers,
private keys, DB connstrings, JWTs, and URL secrets are still redacted.
Disabled when security.redact_secrets is false in config.yaml.
"""
if text is None:
return None
@@ -324,24 +265,23 @@ def redact_sensitive_text(text: str, *, force: bool = False, code_file: bool = F
text = str(text)
if not text:
return text
if not (force or _REDACT_ENABLED):
if not _REDACT_ENABLED:
return text
# Known prefixes (sk-, ghp_, etc.)
text = _PREFIX_RE.sub(lambda m: _mask_token(m.group(1)), text)
# ENV assignments: OPENAI_API_KEY=*** (skip for code files — false positives)
if not code_file:
def _redact_env(m):
name, quote, value = m.group(1), m.group(2), m.group(3)
return f"{name}={quote}{_mask_token(value)}{quote}"
text = _ENV_ASSIGN_RE.sub(_redact_env, text)
# ENV assignments: OPENAI_API_KEY=sk-abc...
def _redact_env(m):
name, quote, value = m.group(1), m.group(2), m.group(3)
return f"{name}={quote}{_mask_token(value)}{quote}"
text = _ENV_ASSIGN_RE.sub(_redact_env, text)
# JSON fields: "apiKey": "***" (skip for code files — false positives)
def _redact_json(m):
key, value = m.group(1), m.group(2)
return f'{key}: "{_mask_token(value)}"'
text = _JSON_FIELD_RE.sub(_redact_json, text)
# JSON fields: "apiKey": "value"
def _redact_json(m):
key, value = m.group(1), m.group(2)
return f'{key}: "{_mask_token(value)}"'
text = _JSON_FIELD_RE.sub(_redact_json, text)
# Authorization headers
text = _AUTH_HEADER_RE.sub(
+2 -7
View File
@@ -76,7 +76,6 @@ except ImportError: # pragma: no cover
fcntl = None # type: ignore[assignment]
from hermes_constants import get_hermes_home
from utils import atomic_replace
logger = logging.getLogger(__name__)
@@ -569,7 +568,7 @@ def save_allowlist(data: Dict[str, Any]) -> None:
try:
with os.fdopen(fd, "w") as fh:
fh.write(json.dumps(data, indent=2, sort_keys=True))
atomic_replace(tmp_path, p)
os.replace(tmp_path, p)
except Exception:
try:
os.unlink(tmp_path)
@@ -755,11 +754,7 @@ def _resolve_effective_accept(
if env in ("1", "true", "yes", "on"):
return True
cfg_val = cfg.get("hooks_auto_accept", False)
if isinstance(cfg_val, bool):
return cfg_val
if isinstance(cfg_val, str):
return cfg_val.strip().lower() in ("1", "true", "yes", "on")
return False
return bool(cfg_val)
# ---------------------------------------------------------------------------
+6 -122
View File
@@ -6,7 +6,6 @@ can invoke skills via /skill-name commands.
import json
import logging
import os
import re
from pathlib import Path
from typing import Any, Dict, Optional
@@ -21,35 +20,10 @@ from agent.skill_preprocessing import (
logger = logging.getLogger(__name__)
_skill_commands: Dict[str, Dict[str, Any]] = {}
_skill_commands_platform: Optional[str] = None
# Patterns for sanitizing skill names into clean hyphen-separated slugs.
_SKILL_INVALID_CHARS = re.compile(r"[^a-z0-9-]")
_SKILL_MULTI_HYPHEN = re.compile(r"-{2,}")
def _resolve_skill_commands_platform() -> Optional[str]:
"""Return the current platform scope used for disabled-skill filtering.
Used to detect when the active platform has shifted so
:func:`get_skill_commands` can drop a stale cache that was populated
for a different platform's ``skills.platform_disabled`` view (#14536).
Resolves from (in order) ``HERMES_PLATFORM`` env var and
``HERMES_SESSION_PLATFORM`` from the gateway session context. Returns
``None`` when no platform scope is active (e.g. classic CLI, RL
rollouts, standalone scripts).
"""
try:
from gateway.session_context import get_session_env
resolved_platform = (
os.getenv("HERMES_PLATFORM")
or get_session_env("HERMES_SESSION_PLATFORM")
)
except Exception:
resolved_platform = os.getenv("HERMES_PLATFORM")
return resolved_platform or None
def _load_skill_payload(skill_identifier: str, task_id: str | None = None) -> tuple[dict[str, Any], Path | None, str] | None:
"""Load a skill by name/path and return (loaded_payload, skill_dir, display_name)."""
raw_identifier = (skill_identifier or "").strip()
@@ -244,8 +218,7 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
Returns:
Dict mapping "/skill-name" to {name, description, skill_md_path, skill_dir}.
"""
global _skill_commands, _skill_commands_platform
_skill_commands_platform = _resolve_skill_commands_platform()
global _skill_commands
_skill_commands = {}
try:
from tools.skills_tool import SKILLS_DIR, _parse_frontmatter, skill_matches_platform, _get_disabled_skill_names
@@ -261,7 +234,7 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
for scan_dir in dirs_to_scan:
for skill_md in iter_skill_index_files(scan_dir, "SKILL.md"):
if any(part in ('.git', '.github', '.hub', '.archive') for part in skill_md.parts):
if any(part in ('.git', '.github', '.hub') for part in skill_md.parts):
continue
try:
content = skill_md.read_text(encoding='utf-8')
@@ -305,85 +278,12 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
def get_skill_commands() -> Dict[str, Dict[str, Any]]:
"""Return the current skill commands mapping (scan first if empty).
Rescans when the active platform scope changes (e.g. a gateway
process serving Telegram and Discord concurrently) so each platform
sees its own ``skills.platform_disabled`` view (#14536).
"""
if (
not _skill_commands
or _skill_commands_platform != _resolve_skill_commands_platform()
):
"""Return the current skill commands mapping (scan first if empty)."""
if not _skill_commands:
scan_skill_commands()
return _skill_commands
def reload_skills() -> Dict[str, Any]:
"""Re-scan the skills directory and return a diff of what changed.
Rescans ``~/.hermes/skills/`` and any ``skills.external_dirs`` so the
slash-command map (``agent.skill_commands._skill_commands``) reflects
skills added or removed on disk.
This does NOT invalidate the skills system-prompt cache. Skills are
called by name via ``/skill-name``, ``skills_list``, or ``skill_view``
they don't need to be in the system prompt for the model to use them.
Keeping the prompt cache intact preserves prefix caching across the
reload, so a user invoking ``/reload-skills`` pays no cache-reset cost.
Returns:
Dict with keys::
{
"added": [{"name": str, "description": str}, ...],
"removed": [{"name": str, "description": str}, ...],
"unchanged": [skill names present before and after],
"total": total skill count after rescan,
"commands": total /slash-skill count after rescan,
}
``description`` is the skill's full SKILL.md frontmatter
``description:`` field the same string the system prompt renders
as `` - name: description`` for pre-existing skills.
"""
# Snapshot pre-reload state (name -> description) from the current
# slash-command cache. Using dicts lets the post-rescan diff carry
# descriptions for newly-visible or just-removed skills without a
# second disk walk.
def _snapshot(cmds: Dict[str, Dict[str, Any]]) -> Dict[str, str]:
out: Dict[str, str] = {}
for slash_key, info in cmds.items():
bare = slash_key.lstrip("/")
out[bare] = (info or {}).get("description") or ""
return out
before = _snapshot(_skill_commands)
# Rescan the skills dir. ``scan_skill_commands`` resets
# ``_skill_commands = {}`` internally and repopulates it.
new_commands = scan_skill_commands()
after = _snapshot(new_commands)
added_names = sorted(set(after) - set(before))
removed_names = sorted(set(before) - set(after))
unchanged = sorted(set(after) & set(before))
added = [{"name": n, "description": after[n]} for n in added_names]
# For removed skills, use the description we had cached pre-rescan
# (the skill file is gone so we can't re-read it).
removed = [{"name": n, "description": before[n]} for n in removed_names]
return {
"added": added,
"removed": removed,
"unchanged": unchanged,
"total": len(after),
"commands": len(new_commands),
}
def resolve_skill_command_key(command: str) -> Optional[str]:
"""Resolve a user-typed /command to its canonical skill_cmds key.
@@ -428,16 +328,8 @@ def build_skill_invocation_message(
return f"[Failed to load skill: {skill_info['name']}]"
loaded_skill, skill_dir, skill_name = loaded
# Track active usage for Curator lifecycle management (#17782)
try:
from tools.skill_usage import bump_use
bump_use(skill_name)
except Exception:
pass # Non-critical — skill invocation proceeds regardless
activation_note = (
f'[IMPORTANT: The user has invoked the "{skill_name}" skill, indicating they want '
f'[SYSTEM: The user has invoked the "{skill_name}" skill, indicating they want '
"you to follow its instructions. The full skill content is loaded below.]"
)
return _build_skill_message(
@@ -475,16 +367,8 @@ def build_preloaded_skills_prompt(
continue
loaded_skill, skill_dir, skill_name = loaded
# Track active usage for Curator lifecycle management (#17782)
try:
from tools.skill_usage import bump_use
bump_use(skill_name)
except Exception:
pass # Non-critical
activation_note = (
f'[IMPORTANT: The user launched this CLI session with the "{skill_name}" skill '
f'[SYSTEM: The user launched this CLI session with the "{skill_name}" skill '
"preloaded. Treat its instructions as active guidance for the duration of this "
"session unless the user overrides them.]"
)
+3 -11
View File
@@ -24,7 +24,7 @@ PLATFORM_MAP = {
"windows": "win32",
}
EXCLUDED_SKILL_DIRS = frozenset((".git", ".github", ".hub", ".archive"))
EXCLUDED_SKILL_DIRS = frozenset((".git", ".github", ".hub"))
# ── Lazy YAML loader ─────────────────────────────────────────────────────
@@ -200,9 +200,6 @@ def get_external_skills_dirs() -> List[Path]:
if not isinstance(raw_dirs, list):
return []
from hermes_constants import get_hermes_home
hermes_home = get_hermes_home()
local_skills = get_skills_dir().resolve()
seen: Set[Path] = set()
result: List[Path] = []
@@ -213,12 +210,7 @@ def get_external_skills_dirs() -> List[Path]:
continue
# Expand ~ and environment variables
expanded = os.path.expanduser(os.path.expandvars(entry))
p = Path(expanded)
# Resolve relative paths against HERMES_HOME, not cwd
if not p.is_absolute():
p = (hermes_home / p).resolve()
else:
p = p.resolve()
p = Path(expanded).resolve()
if p == local_skills:
continue
if p in seen:
@@ -440,7 +432,7 @@ def extract_skill_description(frontmatter: Dict[str, Any]) -> str:
def iter_skill_index_files(skills_dir: Path, filename: str):
"""Walk skills_dir yielding sorted paths matching *filename*.
Excludes ``.git``, ``.github``, ``.hub``, ``.archive`` directories.
Excludes ``.git``, ``.github``, ``.hub`` directories.
"""
matches = []
for root, dirs, files in os.walk(skills_dir, followlinks=True):
+5 -51
View File
@@ -6,19 +6,12 @@ adds latency to the user-facing reply.
import logging
import threading
from typing import Callable, Optional
from typing import Optional
from agent.auxiliary_client import call_llm
logger = logging.getLogger(__name__)
# Callback signature: (task_name, exception) -> None. Used to surface
# auxiliary failures to the user through AIAgent._emit_auxiliary_failure
# so silent-drops (e.g. OpenRouter 402 exhausting the fallback chain)
# become visible instead of piling up as NULL session titles.
FailureCallback = Callable[[str, BaseException], None]
TitleCallback = Callable[[str], None]
_TITLE_PROMPT = (
"Generate a short, descriptive title (3-7 words) for a conversation that starts with the "
"following exchange. The title should capture the main topic or intent. "
@@ -26,23 +19,11 @@ _TITLE_PROMPT = (
)
def generate_title(
user_message: str,
assistant_response: str,
timeout: float = 30.0,
failure_callback: Optional[FailureCallback] = None,
main_runtime: dict = None,
) -> Optional[str]:
def generate_title(user_message: str, assistant_response: str, timeout: float = 30.0) -> Optional[str]:
"""Generate a session title from the first exchange.
Uses the main runtime's model when available, falling back to the
auxiliary LLM client (cheapest/fastest available model).
Uses the auxiliary LLM client (cheapest/fastest available model).
Returns the title string or None on failure.
``failure_callback`` is invoked with ``(task, exception)`` when the
auxiliary call raises the caller typically wires this to
``AIAgent._emit_auxiliary_failure`` so the user sees a warning instead
of silently accumulating untitled sessions.
"""
# Truncate long messages to keep the request small
user_snippet = user_message[:500] if user_message else ""
@@ -60,7 +41,6 @@ def generate_title(
max_tokens=500,
temperature=0.3,
timeout=timeout,
main_runtime=main_runtime,
)
title = (response.choices[0].message.content or "").strip()
# Clean up: remove quotes, trailing punctuation, prefixes like "Title: "
@@ -72,15 +52,7 @@ def generate_title(
title = title[:77] + "..."
return title if title else None
except Exception as e:
# Log at WARNING so this shows up in agent.log without debug mode.
# Full detail at debug level for operators who need the stack.
logger.warning("Title generation failed: %s", e)
logger.debug("Title generation traceback", exc_info=True)
if failure_callback is not None:
try:
failure_callback("title generation", e)
except Exception:
logger.debug("Title generation failure_callback raised", exc_info=True)
logger.debug("Title generation failed: %s", e)
return None
@@ -89,9 +61,6 @@ def auto_title_session(
session_id: str,
user_message: str,
assistant_response: str,
failure_callback: Optional[FailureCallback] = None,
main_runtime: dict = None,
title_callback: Optional[TitleCallback] = None,
) -> None:
"""Generate and set a session title if one doesn't already exist.
@@ -112,20 +81,13 @@ def auto_title_session(
except Exception:
return
title = generate_title(
user_message, assistant_response, failure_callback=failure_callback, main_runtime=main_runtime
)
title = generate_title(user_message, assistant_response)
if not title:
return
try:
session_db.set_session_title(session_id, title)
logger.debug("Auto-generated session title: %s", title)
if title_callback is not None:
try:
title_callback(title)
except Exception:
logger.debug("Auto-title callback failed", exc_info=True)
except Exception as e:
logger.debug("Failed to set auto-generated title: %s", e)
@@ -136,9 +98,6 @@ def maybe_auto_title(
user_message: str,
assistant_response: str,
conversation_history: list,
failure_callback: Optional[FailureCallback] = None,
main_runtime: dict = None,
title_callback: Optional[TitleCallback] = None,
) -> None:
"""Fire-and-forget title generation after the first exchange.
@@ -160,11 +119,6 @@ def maybe_auto_title(
thread = threading.Thread(
target=auto_title_session,
args=(session_db, session_id, user_message, assistant_response),
kwargs={
"failure_callback": failure_callback,
"main_runtime": main_runtime,
"title_callback": title_callback,
},
daemon=True,
name="auto-title",
)
-455
View File
@@ -1,455 +0,0 @@
"""Pure tool-call loop guardrail primitives.
The controller in this module is intentionally side-effect free: it tracks
per-turn tool-call observations and returns decisions. Runtime code owns whether
those decisions become warning guidance, synthetic tool results, or controlled
turn halts.
"""
from __future__ import annotations
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Mapping
from utils import safe_json_loads
IDEMPOTENT_TOOL_NAMES = frozenset(
{
"read_file",
"search_files",
"web_search",
"web_extract",
"session_search",
"browser_snapshot",
"browser_console",
"browser_get_images",
"mcp_filesystem_read_file",
"mcp_filesystem_read_text_file",
"mcp_filesystem_read_multiple_files",
"mcp_filesystem_list_directory",
"mcp_filesystem_list_directory_with_sizes",
"mcp_filesystem_directory_tree",
"mcp_filesystem_get_file_info",
"mcp_filesystem_search_files",
}
)
MUTATING_TOOL_NAMES = frozenset(
{
"terminal",
"execute_code",
"write_file",
"patch",
"todo",
"memory",
"skill_manage",
"browser_click",
"browser_type",
"browser_press",
"browser_scroll",
"browser_navigate",
"send_message",
"cronjob",
"delegate_task",
"process",
}
)
@dataclass(frozen=True)
class ToolCallGuardrailConfig:
"""Thresholds for per-turn tool-call loop detection.
Warnings are enabled by default and never prevent tool execution. Hard stops
are explicit opt-in so interactive CLI/TUI sessions get a gentle nudge unless
the user enables circuit-breaker behavior in config.yaml.
"""
warnings_enabled: bool = True
hard_stop_enabled: bool = False
exact_failure_warn_after: int = 2
exact_failure_block_after: int = 5
same_tool_failure_warn_after: int = 3
same_tool_failure_halt_after: int = 8
no_progress_warn_after: int = 2
no_progress_block_after: int = 5
idempotent_tools: frozenset[str] = field(default_factory=lambda: IDEMPOTENT_TOOL_NAMES)
mutating_tools: frozenset[str] = field(default_factory=lambda: MUTATING_TOOL_NAMES)
@classmethod
def from_mapping(cls, data: Mapping[str, Any] | None) -> "ToolCallGuardrailConfig":
"""Build config from the `tool_loop_guardrails` config.yaml section."""
if not isinstance(data, Mapping):
return cls()
warn_after = data.get("warn_after")
if not isinstance(warn_after, Mapping):
warn_after = {}
hard_stop_after = data.get("hard_stop_after")
if not isinstance(hard_stop_after, Mapping):
hard_stop_after = {}
defaults = cls()
return cls(
warnings_enabled=_as_bool(data.get("warnings_enabled"), defaults.warnings_enabled),
hard_stop_enabled=_as_bool(data.get("hard_stop_enabled"), defaults.hard_stop_enabled),
exact_failure_warn_after=_positive_int(
warn_after.get("exact_failure", data.get("exact_failure_warn_after")),
defaults.exact_failure_warn_after,
),
same_tool_failure_warn_after=_positive_int(
warn_after.get("same_tool_failure", data.get("same_tool_failure_warn_after")),
defaults.same_tool_failure_warn_after,
),
no_progress_warn_after=_positive_int(
warn_after.get("idempotent_no_progress", data.get("no_progress_warn_after")),
defaults.no_progress_warn_after,
),
exact_failure_block_after=_positive_int(
hard_stop_after.get("exact_failure", data.get("exact_failure_block_after")),
defaults.exact_failure_block_after,
),
same_tool_failure_halt_after=_positive_int(
hard_stop_after.get("same_tool_failure", data.get("same_tool_failure_halt_after")),
defaults.same_tool_failure_halt_after,
),
no_progress_block_after=_positive_int(
hard_stop_after.get("idempotent_no_progress", data.get("no_progress_block_after")),
defaults.no_progress_block_after,
),
)
@dataclass(frozen=True)
class ToolCallSignature:
"""Stable, non-reversible identity for a tool name plus canonical args."""
tool_name: str
args_hash: str
@classmethod
def from_call(cls, tool_name: str, args: Mapping[str, Any] | None) -> "ToolCallSignature":
canonical = canonical_tool_args(args or {})
return cls(tool_name=tool_name, args_hash=_sha256(canonical))
def to_metadata(self) -> dict[str, str]:
"""Return public metadata without raw argument values."""
return {"tool_name": self.tool_name, "args_hash": self.args_hash}
@dataclass(frozen=True)
class ToolGuardrailDecision:
"""Decision returned by the tool-call guardrail controller."""
action: str = "allow" # allow | warn | block | halt
code: str = "allow"
message: str = ""
tool_name: str = ""
count: int = 0
signature: ToolCallSignature | None = None
@property
def allows_execution(self) -> bool:
return self.action in {"allow", "warn"}
@property
def should_halt(self) -> bool:
return self.action in {"block", "halt"}
def to_metadata(self) -> dict[str, Any]:
data: dict[str, Any] = {
"action": self.action,
"code": self.code,
"message": self.message,
"tool_name": self.tool_name,
"count": self.count,
}
if self.signature is not None:
data["signature"] = self.signature.to_metadata()
return data
def canonical_tool_args(args: Mapping[str, Any]) -> str:
"""Return sorted compact JSON for parsed tool arguments."""
if not isinstance(args, Mapping):
raise TypeError(f"tool args must be a mapping, got {type(args).__name__}")
return json.dumps(
args,
ensure_ascii=False,
sort_keys=True,
separators=(",", ":"),
default=str,
)
def classify_tool_failure(tool_name: str, result: str | None) -> tuple[bool, str]:
"""Safety-fallback classifier used only when callers don't pass ``failed``.
Mirrors ``agent.display._detect_tool_failure`` exactly so the guardrail
never disagrees with the CLI's user-visible ``[error]`` tag. Production
callers in ``run_agent.py`` always pass an explicit ``failed=`` derived
from ``_detect_tool_failure``; this function exists so standalone callers
(tests, tooling) still get consistent behavior.
"""
if result is None:
return False, ""
if tool_name == "terminal":
data = safe_json_loads(result)
if isinstance(data, dict):
exit_code = data.get("exit_code")
if exit_code is not None and exit_code != 0:
return True, f" [exit {exit_code}]"
return False, ""
if tool_name == "memory":
data = safe_json_loads(result)
if isinstance(data, dict):
if data.get("success") is False and "exceed the limit" in data.get("error", ""):
return True, " [full]"
lower = result[:500].lower()
if '"error"' in lower or '"failed"' in lower or result.startswith("Error"):
return True, " [error]"
return False, ""
class ToolCallGuardrailController:
"""Per-turn controller for repeated failed/non-progressing tool calls."""
def __init__(self, config: ToolCallGuardrailConfig | None = None):
self.config = config or ToolCallGuardrailConfig()
self.reset_for_turn()
def reset_for_turn(self) -> None:
self._exact_failure_counts: dict[ToolCallSignature, int] = {}
self._same_tool_failure_counts: dict[str, int] = {}
self._no_progress: dict[ToolCallSignature, tuple[str, int]] = {}
self._halt_decision: ToolGuardrailDecision | None = None
@property
def halt_decision(self) -> ToolGuardrailDecision | None:
return self._halt_decision
def before_call(self, tool_name: str, args: Mapping[str, Any] | None) -> ToolGuardrailDecision:
signature = ToolCallSignature.from_call(tool_name, _coerce_args(args))
if not self.config.hard_stop_enabled:
return ToolGuardrailDecision(tool_name=tool_name, signature=signature)
exact_count = self._exact_failure_counts.get(signature, 0)
if exact_count >= self.config.exact_failure_block_after:
decision = ToolGuardrailDecision(
action="block",
code="repeated_exact_failure_block",
message=(
f"Blocked {tool_name}: the same tool call failed {exact_count} "
"times with identical arguments. Stop retrying it unchanged; "
"change strategy or explain the blocker."
),
tool_name=tool_name,
count=exact_count,
signature=signature,
)
self._halt_decision = decision
return decision
if self._is_idempotent(tool_name):
record = self._no_progress.get(signature)
if record is not None:
_result_hash, repeat_count = record
if repeat_count >= self.config.no_progress_block_after:
decision = ToolGuardrailDecision(
action="block",
code="idempotent_no_progress_block",
message=(
f"Blocked {tool_name}: this read-only call returned the same "
f"result {repeat_count} times. Stop repeating it unchanged; "
"use the result already provided or try a different query."
),
tool_name=tool_name,
count=repeat_count,
signature=signature,
)
self._halt_decision = decision
return decision
return ToolGuardrailDecision(tool_name=tool_name, signature=signature)
def after_call(
self,
tool_name: str,
args: Mapping[str, Any] | None,
result: str | None,
*,
failed: bool | None = None,
) -> ToolGuardrailDecision:
args = _coerce_args(args)
signature = ToolCallSignature.from_call(tool_name, args)
if failed is None:
failed, _ = classify_tool_failure(tool_name, result)
if failed:
exact_count = self._exact_failure_counts.get(signature, 0) + 1
self._exact_failure_counts[signature] = exact_count
self._no_progress.pop(signature, None)
same_count = self._same_tool_failure_counts.get(tool_name, 0) + 1
self._same_tool_failure_counts[tool_name] = same_count
if self.config.hard_stop_enabled and same_count >= self.config.same_tool_failure_halt_after:
decision = ToolGuardrailDecision(
action="halt",
code="same_tool_failure_halt",
message=(
f"Stopped {tool_name}: it failed {same_count} times this turn. "
"Stop retrying the same failing tool path and choose a different approach."
),
tool_name=tool_name,
count=same_count,
signature=signature,
)
self._halt_decision = decision
return decision
if self.config.warnings_enabled and exact_count >= self.config.exact_failure_warn_after:
return ToolGuardrailDecision(
action="warn",
code="repeated_exact_failure_warning",
message=(
f"{tool_name} has failed {exact_count} times with identical arguments. "
"This looks like a loop; inspect the error and change strategy "
"instead of retrying it unchanged."
),
tool_name=tool_name,
count=exact_count,
signature=signature,
)
if self.config.warnings_enabled and same_count >= self.config.same_tool_failure_warn_after:
return ToolGuardrailDecision(
action="warn",
code="same_tool_failure_warning",
message=(
f"{tool_name} has failed {same_count} times this turn. "
"This looks like a loop; change approach before retrying."
),
tool_name=tool_name,
count=same_count,
signature=signature,
)
return ToolGuardrailDecision(tool_name=tool_name, count=exact_count, signature=signature)
self._exact_failure_counts.pop(signature, None)
self._same_tool_failure_counts.pop(tool_name, None)
if not self._is_idempotent(tool_name):
self._no_progress.pop(signature, None)
return ToolGuardrailDecision(tool_name=tool_name, signature=signature)
result_hash = _result_hash(result)
previous = self._no_progress.get(signature)
repeat_count = 1
if previous is not None and previous[0] == result_hash:
repeat_count = previous[1] + 1
self._no_progress[signature] = (result_hash, repeat_count)
if self.config.warnings_enabled and repeat_count >= self.config.no_progress_warn_after:
return ToolGuardrailDecision(
action="warn",
code="idempotent_no_progress_warning",
message=(
f"{tool_name} returned the same result {repeat_count} times. "
"Use the result already provided or change the query instead of "
"repeating it unchanged."
),
tool_name=tool_name,
count=repeat_count,
signature=signature,
)
return ToolGuardrailDecision(tool_name=tool_name, count=repeat_count, signature=signature)
def _is_idempotent(self, tool_name: str) -> bool:
if tool_name in self.config.mutating_tools:
return False
return tool_name in self.config.idempotent_tools
def toolguard_synthetic_result(decision: ToolGuardrailDecision) -> str:
"""Build a synthetic role=tool content string for a blocked tool call."""
return json.dumps(
{
"error": decision.message,
"guardrail": decision.to_metadata(),
},
ensure_ascii=False,
)
def append_toolguard_guidance(result: str, decision: ToolGuardrailDecision) -> str:
"""Append runtime guidance to the current tool result content."""
if decision.action not in {"warn", "halt"} or not decision.message:
return result
label = "Tool loop hard stop" if decision.action == "halt" else "Tool loop warning"
suffix = (
f"\n\n[{label}: "
f"{decision.code}; count={decision.count}; {decision.message}]"
)
return (result or "") + suffix
def _coerce_args(args: Mapping[str, Any] | None) -> Mapping[str, Any]:
return args if isinstance(args, Mapping) else {}
def _result_hash(result: str | None) -> str:
parsed = safe_json_loads(result or "")
if parsed is not None:
try:
canonical = json.dumps(
parsed,
ensure_ascii=False,
sort_keys=True,
separators=(",", ":"),
default=str,
)
except TypeError:
canonical = str(parsed)
else:
canonical = result or ""
return _sha256(canonical)
def _as_bool(value: Any, default: bool) -> bool:
if value is None:
return default
if isinstance(value, bool):
return value
if isinstance(value, (int, float)):
return bool(value)
if isinstance(value, str):
lowered = value.strip().lower()
if lowered in {"1", "true", "yes", "on", "enabled"}:
return True
if lowered in {"0", "false", "no", "off", "disabled"}:
return False
return default
def _positive_int(value: Any, default: int) -> int:
if value is None:
return default
try:
parsed = int(value)
except (TypeError, ValueError):
return default
return parsed if parsed >= 1 else default
def _sha256(value: str) -> str:
return hashlib.sha256(value.encode("utf-8")).hexdigest()
-2
View File
@@ -58,7 +58,6 @@ class AnthropicTransport(ProviderTransport):
context_length: int | None
base_url: str | None
fast_mode: bool
drop_context_1m_beta: bool
"""
from agent.anthropic_adapter import build_anthropic_kwargs
@@ -74,7 +73,6 @@ class AnthropicTransport(ProviderTransport):
context_length=params.get("context_length"),
base_url=params.get("base_url"),
fast_mode=params.get("fast_mode", False),
drop_context_1m_beta=params.get("drop_context_1m_beta", False),
)
def normalize_response(self, response: Any, **kwargs) -> NormalizedResponse:
+3 -138
View File
@@ -12,93 +12,12 @@ reasoning configuration, temperature handling, and extra_body assembly.
import copy
from typing import Any, Dict, List, Optional
from agent.lmstudio_reasoning import resolve_lmstudio_effort
from agent.moonshot_schema import is_moonshot_model, sanitize_moonshot_tools
from agent.prompt_builder import DEVELOPER_ROLE_MODELS
from agent.transports.base import ProviderTransport
from agent.transports.types import NormalizedResponse, ToolCall, Usage
def _build_gemini_thinking_config(model: str, reasoning_config: dict | None) -> dict | None:
"""Translate Hermes/OpenRouter-style reasoning config to Gemini thinkingConfig."""
if reasoning_config is None or not isinstance(reasoning_config, dict):
return None
normalized_model = (model or "").strip().lower()
if normalized_model.startswith("google/"):
normalized_model = normalized_model.split("/", 1)[1]
# ``thinking_config`` is a Gemini-only request parameter. The same
# ``gemini`` provider also serves Gemma (and historically PaLM/Bard);
# those reject the field with HTTP 400 "Unknown name 'thinking_config':
# Cannot find field" — including the polite ``{"includeThoughts": False}``
# form. Omit the field entirely on non-Gemini models. (#17426)
if not normalized_model.startswith("gemini"):
return None
if reasoning_config.get("enabled") is False:
# Gemini can hide thought parts even when internal thinking still
# happens; omit thinkingLevel to avoid model-specific validation quirks.
return {"includeThoughts": False}
effort = str(reasoning_config.get("effort", "medium") or "medium").strip().lower()
if effort == "none":
return {"includeThoughts": False}
thinking_config: Dict[str, Any] = {"includeThoughts": True}
# Gemini 2.5 accepts thinkingBudget; don't guess a budget from Hermes'
# coarse effort levels. ``includeThoughts`` alone is enough to surface
# thought parts without risking request validation errors.
if normalized_model.startswith("gemini-2.5-"):
return thinking_config
if effort not in {"minimal", "low", "medium", "high", "xhigh"}:
effort = "medium"
# Gemini 3 Flash documents low/medium/high thinking levels; Gemini 3 Pro
# is stricter (low/high). Clamp Hermes' wider effort set to what each
# family accepts so we never forward an undocumented level verbatim.
if normalized_model.startswith(("gemini-3", "gemini-3.1")):
if "flash" in normalized_model:
if effort in {"minimal", "low"}:
thinking_config["thinkingLevel"] = "low"
elif effort in {"high", "xhigh"}:
thinking_config["thinkingLevel"] = "high"
else:
thinking_config["thinkingLevel"] = "medium"
elif "pro" in normalized_model:
thinking_config["thinkingLevel"] = (
"high" if effort in {"high", "xhigh"} else "low"
)
return thinking_config
def _snake_case_gemini_thinking_config(config: dict | None) -> dict | None:
"""Convert Gemini thinking config keys to the OpenAI-compat field names."""
if not isinstance(config, dict) or not config:
return None
translated: Dict[str, Any] = {}
if isinstance(config.get("includeThoughts"), bool):
translated["include_thoughts"] = config["includeThoughts"]
if isinstance(config.get("thinkingLevel"), str) and config["thinkingLevel"].strip():
translated["thinking_level"] = config["thinkingLevel"].strip().lower()
if isinstance(config.get("thinkingBudget"), (int, float)):
translated["thinking_budget"] = int(config["thinkingBudget"])
return translated or None
def _is_gemini_openai_compat_base_url(base_url: Any) -> bool:
normalized = str(base_url or "").strip().rstrip("/").lower()
if not normalized:
return False
if "generativelanguage.googleapis.com" not in normalized:
return False
return normalized.endswith("/openai")
class ChatCompletionsTransport(ProviderTransport):
"""Transport for api_mode='chat_completions'.
@@ -182,7 +101,6 @@ class ChatCompletionsTransport(ProviderTransport):
is_github_models: bool
is_nvidia_nim: bool
is_kimi: bool
is_lmstudio: bool
is_custom_provider: bool
ollama_num_ctx: int | None
# Provider routing
@@ -196,7 +114,6 @@ class ChatCompletionsTransport(ProviderTransport):
# Reasoning
supports_reasoning: bool
github_reasoning_extra: dict | None
lmstudio_reasoning_options: list[str] | None # raw allowed_options from /api/v1/models
# Claude on OpenRouter/Nous max output
anthropic_max_output: int | None
# Extra
@@ -271,7 +188,6 @@ class ChatCompletionsTransport(ProviderTransport):
anthropic_max_out = params.get("anthropic_max_output")
is_nvidia_nim = params.get("is_nvidia_nim", False)
is_kimi = params.get("is_kimi", False)
is_tokenhub = params.get("is_tokenhub", False)
reasoning_config = params.get("reasoning_config")
if ephemeral is not None and max_tokens_fn:
@@ -303,41 +219,12 @@ class ChatCompletionsTransport(ProviderTransport):
_kimi_effort = _e
api_kwargs["reasoning_effort"] = _kimi_effort
# Tencent TokenHub: top-level reasoning_effort (unless thinking disabled)
if is_tokenhub:
_tokenhub_thinking_off = bool(
reasoning_config
and isinstance(reasoning_config, dict)
and reasoning_config.get("enabled") is False
)
if not _tokenhub_thinking_off:
_tokenhub_effort = "high"
if reasoning_config and isinstance(reasoning_config, dict):
_e = (reasoning_config.get("effort") or "").strip().lower()
if _e in ("low", "medium", "high"):
_tokenhub_effort = _e
api_kwargs["reasoning_effort"] = _tokenhub_effort
# LM Studio: top-level reasoning_effort. Only emit when the model
# declares reasoning support via /api/v1/models capabilities (gated
# upstream by params["supports_reasoning"]). resolve_lmstudio_effort
# is shared with run_agent's summary path so both stay in sync.
if params.get("is_lmstudio", False) and params.get("supports_reasoning", False):
_lm_effort = resolve_lmstudio_effort(
reasoning_config,
params.get("lmstudio_reasoning_options"),
)
if _lm_effort is not None:
api_kwargs["reasoning_effort"] = _lm_effort
# extra_body assembly
extra_body: Dict[str, Any] = {}
is_openrouter = params.get("is_openrouter", False)
is_nous = params.get("is_nous", False)
is_github_models = params.get("is_github_models", False)
provider_name = str(params.get("provider_name") or "").strip().lower()
base_url = params.get("base_url")
provider_prefs = params.get("provider_preferences")
if provider_prefs and is_openrouter:
@@ -353,9 +240,8 @@ class ChatCompletionsTransport(ProviderTransport):
"type": "enabled" if _kimi_thinking_enabled else "disabled",
}
# Reasoning. LM Studio is handled above via top-level reasoning_effort,
# so skip emitting extra_body.reasoning for it.
if params.get("supports_reasoning", False) and not params.get("is_lmstudio", False):
# Reasoning
if params.get("supports_reasoning", False):
if is_github_models:
gh_reasoning = params.get("github_reasoning_extra")
if gh_reasoning is not None:
@@ -391,23 +277,6 @@ class ChatCompletionsTransport(ProviderTransport):
if is_qwen:
extra_body["vl_high_resolution_images"] = True
if provider_name == "gemini":
raw_thinking_config = _build_gemini_thinking_config(model, reasoning_config)
if _is_gemini_openai_compat_base_url(base_url):
thinking_config = _snake_case_gemini_thinking_config(raw_thinking_config)
if thinking_config:
openai_compat_extra = extra_body.get("extra_body", {})
google_extra = openai_compat_extra.get("google", {})
google_extra["thinking_config"] = thinking_config
openai_compat_extra["google"] = google_extra
extra_body["extra_body"] = openai_compat_extra
elif raw_thinking_config:
extra_body["thinking_config"] = raw_thinking_config
elif provider_name == "google-gemini-cli":
thinking_config = _build_gemini_thinking_config(model, reasoning_config)
if thinking_config:
extra_body["thinking_config"] = thinking_config
# Merge any pre-built extra_body additions
additions = params.get("extra_body_additions")
if additions:
@@ -477,13 +346,9 @@ class ChatCompletionsTransport(ProviderTransport):
# so keep them apart in provider_data rather than merging.
reasoning = getattr(msg, "reasoning", None)
reasoning_content = getattr(msg, "reasoning_content", None)
if reasoning_content is None and hasattr(msg, "model_extra"):
model_extra = getattr(msg, "model_extra", None) or {}
if isinstance(model_extra, dict) and "reasoning_content" in model_extra:
reasoning_content = model_extra["reasoning_content"]
provider_data: Dict[str, Any] = {}
if reasoning_content is not None:
if reasoning_content:
provider_data["reasoning_content"] = reasoning_content
rd = getattr(msg, "reasoning_details", None)
if rd:
+4 -13
View File
@@ -8,7 +8,7 @@ streaming, or the _run_codex_stream() call path.
from typing import Any, Dict, List, Optional
from agent.transports.base import ProviderTransport
from agent.transports.types import NormalizedResponse, ToolCall
from agent.transports.types import NormalizedResponse, ToolCall, Usage
class ResponsesApiTransport(ProviderTransport):
@@ -143,18 +143,7 @@ class ResponsesApiTransport(ProviderTransport):
kwargs["max_output_tokens"] = max_tokens
if is_xai_responses and session_id:
existing_extra_headers = kwargs.get("extra_headers")
merged_extra_headers: Dict[str, str] = {}
if isinstance(existing_extra_headers, dict):
merged_extra_headers.update(
{
str(key): str(value)
for key, value in existing_extra_headers.items()
if key and value is not None
}
)
merged_extra_headers["x-grok-conv-id"] = session_id
kwargs["extra_headers"] = merged_extra_headers
kwargs["extra_headers"] = {"x-grok-conv-id": session_id}
return kwargs
@@ -162,6 +151,8 @@ class ResponsesApiTransport(ProviderTransport):
"""Normalize Codex Responses API response to NormalizedResponse."""
from agent.codex_responses_adapter import (
_normalize_codex_response,
_extract_responses_message_text,
_extract_responses_reasoning_text,
)
# _normalize_codex_response returns (SimpleNamespace, finish_reason_str)
-21
View File
@@ -359,25 +359,6 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
source_url="https://aws.amazon.com/bedrock/pricing/",
pricing_version="bedrock-pricing-2026-04",
),
# MiniMax
(
"minimax",
"minimax-m2.7",
): PricingEntry(
input_cost_per_million=Decimal("0.30"),
output_cost_per_million=Decimal("1.20"),
source="official_docs_snapshot",
pricing_version="minimax-pricing-2026-04",
),
(
"minimax-cn",
"minimax-m2.7",
): PricingEntry(
input_cost_per_million=Decimal("0.30"),
output_cost_per_million=Decimal("1.20"),
source="official_docs_snapshot",
pricing_version="minimax-pricing-2026-04",
),
}
@@ -419,8 +400,6 @@ def resolve_billing_route(
return BillingRoute(provider="anthropic", model=model.split("/")[-1], base_url=base_url or "", billing_mode="official_docs_snapshot")
if provider_name == "openai":
return BillingRoute(provider="openai", model=model.split("/")[-1], base_url=base_url or "", billing_mode="official_docs_snapshot")
if provider_name in {"minimax", "minimax-cn"}:
return BillingRoute(provider=provider_name, model=model.split("/")[-1], base_url=base_url or "", billing_mode="official_docs_snapshot")
if provider_name in {"custom", "local"} or (base and "localhost" in base):
return BillingRoute(provider=provider_name or "custom", model=model, base_url=base_url or "", billing_mode="unknown")
return BillingRoute(provider=provider_name or "unknown", model=model.split("/")[-1] if model else "", base_url=base_url or "", billing_mode="unknown")
+9 -51
View File
@@ -30,13 +30,14 @@ model:
# "ollama-cloud" - Ollama Cloud (requires: OLLAMA_API_KEY — https://ollama.com/settings)
# "kilocode" - KiloCode gateway (requires: KILOCODE_API_KEY)
# "ai-gateway" - Vercel AI Gateway (requires: AI_GATEWAY_API_KEY)
# "lmstudio" - LM Studio local server (optional: LM_API_KEY, defaults to http://127.0.0.1:1234/v1)
#
# Local servers (LM Studio, Ollama, vLLM, llama.cpp):
# "custom" - Any other OpenAI-compatible endpoint. Set base_url below.
# Aliases: "ollama", "vllm", "llamacpp" all map to "custom".
# LM Studio is first-class and uses provider: "lmstudio".
# It works with both no-auth and auth-enabled server modes.
# "custom" - Any OpenAI-compatible endpoint. Set base_url below.
# Aliases: "lmstudio", "ollama", "vllm", "llamacpp" all map to "custom".
# Example for LM Studio:
# provider: "lmstudio"
# base_url: "http://localhost:1234/v1"
# No API key needed — local servers typically ignore auth.
#
# Can also be overridden with --provider flag or HERMES_INFERENCE_PROVIDER env var.
provider: "auto"
@@ -121,18 +122,6 @@ model:
# # Data policy: "allow" (default) or "deny" to exclude providers that may store data
# # data_collection: "deny"
# =============================================================================
# OpenRouter Response Caching (only applies when using OpenRouter)
# =============================================================================
# Cache identical API responses at the OpenRouter edge for free instant replays.
# When enabled, identical requests (same model, messages, parameters) return
# cached responses with zero billing. Separate from Anthropic prompt caching.
# See: https://openrouter.ai/docs/guides/features/response-caching
#
# openrouter:
# response_cache: true # Enable response caching (default: true)
# response_cache_ttl: 300 # Cache TTL in seconds, 1-86400 (default: 300)
# =============================================================================
# Git Worktree Isolation
# =============================================================================
@@ -192,11 +181,6 @@ terminal:
# lifetime_seconds: 300
# docker_image: "nikolaik/python-nodejs:python3.11-nodejs20"
# docker_mount_cwd_to_workspace: true # Explicit opt-in: mount your launch cwd into /workspace
# # Optional: run the container as your host user's uid:gid so files written
# # into bind-mounted dirs are owned by you, not root. Drops SETUID/SETGID
# # caps too since no gosu privilege drop is needed. Leave off if your
# # chosen docker_image expects to start as root.
# docker_run_as_host_user: true
# # Optional: explicitly forward selected env vars into Docker.
# # These values come from your current shell first, then ~/.hermes/.env.
# # Warning: anything forwarded here is visible to commands run in the container.
@@ -301,25 +285,6 @@ browser:
# after this period of no activity between agent loops (default: 120 = 2 minutes)
inactivity_timeout: 120
# =============================================================================
# Tool Loop Guardrails
# =============================================================================
# Soft warnings are enabled by default. They append guidance to repeated failed
# or non-progressing tool results but still let the tool execute. Hard stops are
# opt-in circuit breakers for autonomous/cron sessions where stopping a loop is
# preferable to spending the full iteration budget.
tool_loop_guardrails:
warnings_enabled: true
hard_stop_enabled: false
warn_after:
exact_failure: 2
same_tool_failure: 3
idempotent_no_progress: 2
hard_stop_after:
exact_failure: 5
same_tool_failure: 8
idempotent_no_progress: 5
# =============================================================================
# Context Compression (Auto-shrinks long conversations)
# =============================================================================
@@ -601,7 +566,7 @@ agent:
# - A preset like "hermes-cli" or "hermes-telegram" (curated tool set)
# - A list of individual toolsets to compose your own (see list below)
#
# Supported platform keys: cli, telegram, discord, whatsapp, slack, qqbot, teams
# Supported platform keys: cli, telegram, discord, whatsapp, slack, qqbot
#
# Examples:
#
@@ -631,7 +596,6 @@ agent:
# signal: hermes-signal (same as telegram)
# homeassistant: hermes-homeassistant (same as telegram)
# qqbot: hermes-qqbot (same as telegram)
# teams: hermes-teams (same as telegram)
#
platform_toolsets:
cli: [hermes-cli]
@@ -642,8 +606,6 @@ platform_toolsets:
signal: [hermes-signal]
homeassistant: [hermes-homeassistant]
qqbot: [hermes-qqbot]
yuanbao: [hermes-yuanbao]
teams: [hermes-teams]
# =============================================================================
# Gateway Platform Settings
@@ -885,12 +847,8 @@ display:
# What Enter does when Hermes is already busy (CLI and gateway platforms).
# interrupt: Interrupt the current run and redirect Hermes (default)
# queue: Queue your message for the next turn
# steer: Inject your message mid-run via /steer, arriving at the agent
# after the next tool call — no interrupt, no role violation.
# Falls back to 'queue' if the agent isn't running yet or if
# images are attached (steer only carries text).
# Ctrl+C (or /stop in gateway) always interrupts regardless of this setting.
# Toggle at runtime with /busy <interrupt|queue|steer>.
# Toggle at runtime with /busy_input_mode <interrupt|queue>.
busy_input_mode: interrupt
# Background process notifications (gateway/messaging only).
@@ -965,7 +923,7 @@ display:
# agent_name: "My Agent" # Banner title and branding
# welcome: "Welcome message" # Shown at CLI startup
# response_label: " ⚔ Agent " # Response box header label
# prompt_symbol: "⚔" # Prompt symbol (bare token; renderers add trailing space)
# prompt_symbol: "⚔ " # Prompt symbol
# tool_prefix: "╎" # Tool output line prefix (default: ┊)
#
skin: default
+272 -1457
View File
File diff suppressed because it is too large Load Diff
+14 -217
View File
@@ -21,7 +21,6 @@ from typing import Optional, Dict, List, Any, Union
logger = logging.getLogger(__name__)
from hermes_time import now as _hermes_now
from utils import atomic_replace
try:
from croniter import croniter
@@ -312,22 +311,8 @@ def compute_next_run(schedule: Dict[str, Any], last_run_at: Optional[str] = None
elif schedule["kind"] == "cron":
if not HAS_CRONITER:
logger.warning(
"Cannot compute next run for cron schedule %r: 'croniter' is "
"not installed. croniter is a core dependency as of v0.9.x; "
"reinstall hermes-agent or run 'pip install croniter' in your "
"runtime env.",
schedule.get("expr"),
)
return None
# Use last_run_at as the croniter base when available, consistent
# with interval jobs. This ensures that after a crash/restart,
# the next run is anchored to the actual last execution time
# rather than to an arbitrary restart time.
base_time = now
if last_run_at:
base_time = _ensure_aware(datetime.fromisoformat(last_run_at))
cron = croniter(schedule["expr"], base_time)
cron = croniter(schedule["expr"], now)
next_run = cron.get_next(datetime)
return next_run.isoformat()
@@ -376,7 +361,7 @@ def save_jobs(jobs: List[Dict[str, Any]]):
json.dump({"jobs": jobs, "updated_at": _hermes_now().isoformat()}, f, indent=2)
f.flush()
os.fsync(f.fileno())
atomic_replace(tmp_path, JOBS_FILE)
os.replace(tmp_path, JOBS_FILE)
_secure_file(JOBS_FILE)
except BaseException:
try:
@@ -420,7 +405,7 @@ def _normalize_workdir(workdir: Optional[str]) -> Optional[str]:
def create_job(
prompt: Optional[str],
prompt: str,
schedule: str,
name: Optional[str] = None,
repeat: Optional[int] = None,
@@ -435,14 +420,12 @@ def create_job(
context_from: Optional[Union[str, List[str]]] = None,
enabled_toolsets: Optional[List[str]] = None,
workdir: Optional[str] = None,
no_agent: bool = False,
) -> Dict[str, Any]:
"""
Create a new cron job.
Args:
prompt: The prompt to run (must be self-contained, or a task instruction when skill is set).
Ignored when ``no_agent=True`` except as an optional name hint.
prompt: The prompt to run (must be self-contained, or a task instruction when skill is set)
schedule: Schedule string (see parse_schedule)
name: Optional friendly name
repeat: How many times to run (None = forever, 1 = once)
@@ -453,33 +436,21 @@ def create_job(
model: Optional per-job model override
provider: Optional per-job provider override
base_url: Optional per-job base URL override
script: Optional path to a script whose stdout feeds the job. With
``no_agent=True`` the script IS the job its stdout is
delivered verbatim. Without ``no_agent``, its stdout is
injected into the agent's prompt as context (data-collection /
change-detection pattern). Paths resolve under
~/.hermes/scripts/; ``.sh`` / ``.bash`` files run via bash,
anything else via Python.
script: Optional path to a Python script whose stdout is injected into the
prompt each run. The script runs before the agent turn, and its output
is prepended as context. Useful for data collection / change detection.
context_from: Optional job ID (or list of job IDs) whose most recent output
is injected into the prompt as context before each run.
Useful for chaining cron jobs: job A finds data, job B processes it.
enabled_toolsets: Optional list of toolset names to restrict the agent to.
When set, only tools from these toolsets are loaded, reducing
token overhead. When omitted, all default tools are loaded.
Ignored when ``no_agent=True``.
workdir: Optional absolute path. When set, the job runs as if launched
from that directory: AGENTS.md / CLAUDE.md / .cursorrules from
that directory are injected into the system prompt, and the
terminal/file/code_exec tools use it as their working directory
(via TERMINAL_CWD). When unset, the old behaviour is preserved
(no context files injected, tools use the scheduler's cwd).
With ``no_agent=True``, ``workdir`` is still applied as the
script's cwd so relative paths inside the script behave
predictably.
no_agent: When True, skip the agent entirely run ``script`` on schedule
and deliver its stdout directly. Empty stdout = silent (no
delivery). Requires ``script`` to be set. Ideal for classic
watchdogs and periodic alerts that don't need LLM reasoning.
Returns:
The created job dict
@@ -513,16 +484,6 @@ def create_job(
normalized_toolsets = [str(t).strip() for t in enabled_toolsets if str(t).strip()] if enabled_toolsets else None
normalized_toolsets = normalized_toolsets or None
normalized_workdir = _normalize_workdir(workdir)
normalized_no_agent = bool(no_agent)
# no_agent jobs are meaningless without a script — the script IS the job.
# Surface this as a clear ValueError at create time so bad configs never
# reach the scheduler.
if normalized_no_agent and not normalized_script:
raise ValueError(
"no_agent=True requires a script — with no agent and no script "
"there is nothing for the job to run."
)
# Normalize context_from: accept str or list of str, store as list or None
if isinstance(context_from, str):
@@ -532,7 +493,7 @@ def create_job(
else:
context_from = None
label_source = (prompt or (normalized_skills[0] if normalized_skills else None) or (normalized_script if normalized_no_agent else None)) or "cron job"
label_source = (prompt or (normalized_skills[0] if normalized_skills else None)) or "cron job"
job = {
"id": job_id,
"name": name or label_source[:50].strip(),
@@ -543,7 +504,6 @@ def create_job(
"provider": normalized_provider,
"base_url": normalized_base_url,
"script": normalized_script,
"no_agent": normalized_no_agent,
"context_from": context_from,
"schedule": parsed_schedule,
"schedule_display": parsed_schedule.get("display", schedule),
@@ -738,32 +698,10 @@ def mark_job_run(job_id: str, success: bool, error: Optional[str] = None,
# Compute next run
job["next_run_at"] = compute_next_run(job["schedule"], now)
# If no next run, decide whether this is terminal completion
# (one-shot) or a transient failure (recurring schedule couldn't
# compute — e.g. 'croniter' missing from the runtime env).
# Recurring jobs must NEVER be silently disabled: that turns a
# missing runtime dep into "job completed" and the user's
# schedule quietly goes off. See issue #16265.
# If no next run (one-shot completed), disable
if job["next_run_at"] is None:
kind = job.get("schedule", {}).get("kind")
if kind in ("cron", "interval"):
job["state"] = "error"
if not job.get("last_error"):
job["last_error"] = (
"Failed to compute next run for recurring "
"schedule (is the 'croniter' package "
"installed in the gateway's Python env?)"
)
logger.error(
"Job '%s' (%s) could not compute next_run_at; "
"leaving enabled and marking state=error so the "
"job is not silently disabled.",
job.get("name", job["id"]),
kind,
)
else:
job["enabled"] = False
job["state"] = "completed"
job["enabled"] = False
job["state"] = "completed"
elif job.get("state") != "paused":
job["state"] = "scheduled"
@@ -810,12 +748,6 @@ def get_due_jobs() -> List[Dict[str, Any]]:
the job is fast-forwarded to the next future run instead of firing
immediately. This prevents a burst of missed jobs on gateway restart.
"""
with _jobs_file_lock:
return _get_due_jobs_locked()
def _get_due_jobs_locked() -> List[Dict[str, Any]]:
"""Inner implementation of get_due_jobs(); must be called with _jobs_file_lock held."""
now = _hermes_now()
raw_jobs = load_jobs()
jobs = [_apply_skill_fields(j) for j in copy.deepcopy(raw_jobs)]
@@ -828,36 +760,19 @@ def _get_due_jobs_locked() -> List[Dict[str, Any]]:
next_run = job.get("next_run_at")
if not next_run:
schedule = job.get("schedule", {})
kind = schedule.get("kind")
# One-shot jobs use a small grace window via the dedicated helper.
recovered_next = _recoverable_oneshot_run_at(
schedule,
job.get("schedule", {}),
now,
last_run_at=job.get("last_run_at"),
)
recovery_kind = "one-shot" if recovered_next else None
# Recurring jobs reach here only when something — typically a
# direct jobs.json edit that bypassed add_job() — left
# next_run_at unset. Without this branch, such jobs are
# silently skipped forever; recompute next_run_at from the
# schedule so they pick up at their next scheduled tick.
if not recovered_next and kind in ("cron", "interval"):
recovered_next = compute_next_run(schedule, now.isoformat())
if recovered_next:
recovery_kind = kind
if not recovered_next:
continue
job["next_run_at"] = recovered_next
next_run = recovered_next
logger.info(
"Job '%s' had no next_run_at; recovering %s run at %s",
"Job '%s' had no next_run_at; recovering one-shot run at %s",
job.get("name", job["id"]),
recovery_kind,
recovered_next,
)
for rj in raw_jobs:
@@ -920,7 +835,7 @@ def save_job_output(job_id: str, output: str):
f.write(output)
f.flush()
os.fsync(f.fileno())
atomic_replace(tmp_path, output_file)
os.replace(tmp_path, output_file)
_secure_file(output_file)
except BaseException:
try:
@@ -930,121 +845,3 @@ def save_job_output(job_id: str, output: str):
raise
return output_file
# =============================================================================
# Skill reference rewriting (curator integration)
# =============================================================================
def rewrite_skill_refs(
consolidated: Optional[Dict[str, str]] = None,
pruned: Optional[List[str]] = None,
) -> Dict[str, Any]:
"""Rewrite cron job skill references after a curator consolidation pass.
When the curator consolidates a skill X into umbrella Y (or archives X
as pruned), any cron job that lists ``X`` in its ``skills`` field will
fail to load ``X`` at run time the scheduler logs a warning and
skips the skill, so the job runs without the instructions it was
scheduled to follow. See cron/scheduler.py where ``skill_view`` is
called per skill name.
This function repairs cron jobs in-place:
- A skill listed in ``consolidated`` is replaced with its umbrella
target (the ``into`` value). If the umbrella is already in the
job's skill list, the stale name is dropped without duplication.
- A skill listed in ``pruned`` is dropped outright there is no
forwarding target.
- Ordering and other skills in the list are preserved.
- The legacy ``skill`` field is realigned via ``_apply_skill_fields``.
Args:
consolidated: mapping of ``old_skill_name -> umbrella_skill_name``.
pruned: list of skill names that were archived with no forwarding
target.
Returns a report dict::
{
"rewrites": [
{
"job_id": ...,
"job_name": ...,
"before": [...],
"after": [...],
"mapped": {"old": "new", ...},
"dropped": ["old", ...],
},
...
],
"jobs_updated": N,
"jobs_scanned": M,
}
Best-effort: exceptions from loading/saving propagate to the caller so
tests can assert behaviour; the curator invocation site wraps this
call in a try/except so a failure here never breaks the curator.
"""
consolidated = dict(consolidated or {})
pruned_set = set(pruned or [])
# A skill listed in both wins as "consolidated" — it has a target,
# which is the more useful of the two outcomes.
pruned_set -= set(consolidated.keys())
if not consolidated and not pruned_set:
return {"rewrites": [], "jobs_updated": 0, "jobs_scanned": 0}
with _jobs_file_lock:
jobs = load_jobs()
rewrites: List[Dict[str, Any]] = []
changed = False
for job in jobs:
skills_before = _normalize_skill_list(job.get("skill"), job.get("skills"))
if not skills_before:
continue
mapped: Dict[str, str] = {}
dropped: List[str] = []
new_skills: List[str] = []
for name in skills_before:
if name in consolidated:
target = consolidated[name]
mapped[name] = target
if target and target not in new_skills:
new_skills.append(target)
elif name in pruned_set:
dropped.append(name)
else:
if name not in new_skills:
new_skills.append(name)
if not mapped and not dropped:
continue
job["skills"] = new_skills
job["skill"] = new_skills[0] if new_skills else None
changed = True
rewrites.append({
"job_id": job.get("id"),
"job_name": job.get("name") or job.get("id"),
"before": list(skills_before),
"after": list(new_skills),
"mapped": mapped,
"dropped": dropped,
})
if changed:
save_jobs(jobs)
logger.info(
"Curator rewrote skill references in %d cron job(s)", len(rewrites)
)
return {
"rewrites": rewrites,
"jobs_updated": len(rewrites),
"jobs_scanned": len(jobs),
}
+64 -325
View File
@@ -35,7 +35,7 @@ from typing import List, Optional
sys.path.insert(0, str(Path(__file__).parent.parent))
from hermes_constants import get_hermes_home
from hermes_cli.config import load_config, _expand_env_vars
from hermes_cli.config import load_config
from hermes_time import now as _hermes_now
logger = logging.getLogger(__name__)
@@ -77,7 +77,7 @@ _KNOWN_DELIVERY_PLATFORMS = frozenset({
"telegram", "discord", "slack", "whatsapp", "signal",
"matrix", "mattermost", "homeassistant", "dingtalk", "feishu",
"wecom", "wecom_callback", "weixin", "sms", "email", "webhook", "bluebubbles",
"qqbot", "yuanbao",
"qqbot",
})
# Platforms that support a configured cron/notification home target, mapped to
@@ -123,19 +123,9 @@ _LOCK_FILE = _LOCK_DIR / ".tick.lock"
def _resolve_origin(job: dict) -> Optional[dict]:
"""Extract origin info from a job, preserving any extra routing metadata.
Treats non-dict origins (free-form provenance strings, ints, lists from
migration scripts or hand-edited jobs.json) as missing instead of
crashing with ``AttributeError`` on ``origin.get(...)``. Without this
guard, a job tagged with e.g. ``"combined-digest-replaces-x-and-y"``
crashed every fire attempt with
``'str' object has no attribute 'get'`` ``mark_job_run`` recorded the
failure, but the next tick re-loaded the same poisoned origin and
crashed identically until the field was patched manually (#18722).
"""
"""Extract origin info from a job, preserving any extra routing metadata."""
origin = job.get("origin")
if not isinstance(origin, dict):
if not origin:
return None
platform = origin.get("platform")
chat_id = origin.get("chat_id")
@@ -157,19 +147,6 @@ def _get_home_target_chat_id(platform_name: str) -> str:
return value
def _get_home_target_thread_id(platform_name: str) -> Optional[str]:
"""Return the optional thread/topic ID for a platform home target."""
env_var = _HOME_TARGET_ENV_VARS.get(platform_name.lower())
if not env_var:
return None
value = os.getenv(f"{env_var}_THREAD_ID", "").strip()
if not value:
legacy = _LEGACY_HOME_TARGET_ENV_VARS.get(env_var)
if legacy:
value = os.getenv(f"{legacy}_THREAD_ID", "").strip()
return value or None
def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[dict]:
"""Resolve one concrete auto-delivery target for a cron job."""
@@ -198,7 +175,7 @@ def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[d
return {
"platform": platform_name,
"chat_id": chat_id,
"thread_id": _get_home_target_thread_id(platform_name),
"thread_id": None,
}
return None
@@ -221,9 +198,7 @@ def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[d
if resolved:
parsed_chat_id, parsed_thread_id, resolved_is_explicit = _parse_target_ref(platform_key, resolved)
if resolved_is_explicit:
chat_id = parsed_chat_id
if parsed_thread_id is not None:
thread_id = parsed_thread_id
chat_id, thread_id = parsed_chat_id, parsed_thread_id
else:
chat_id = resolved
except Exception:
@@ -252,36 +227,16 @@ def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[d
return {
"platform": platform_name,
"chat_id": chat_id,
"thread_id": _get_home_target_thread_id(platform_name),
"thread_id": None,
}
def _normalize_deliver_value(deliver) -> str:
"""Normalize a stored/submitted ``deliver`` value to its canonical string form.
The contract is that ``deliver`` is a string (``"local"``, ``"origin"``,
``"telegram"``, ``"telegram:-1001:17"``, or comma-separated combinations).
Historically some callers MCP clients passing an array, direct edits of
``jobs.json``, or stale code paths have stored a list/tuple like
``["telegram"]``. ``str(["telegram"])`` would serialize to the literal
string ``"['telegram']"``, which is not a known platform and fails
resolution silently. Flatten lists/tuples into a comma-separated string
so both forms work. Returns ``"local"`` for anything falsy.
"""
if deliver is None or deliver == "":
return "local"
if isinstance(deliver, (list, tuple)):
parts = [str(p).strip() for p in deliver if str(p).strip()]
return ",".join(parts) if parts else "local"
return str(deliver)
def _resolve_delivery_targets(job: dict) -> List[dict]:
"""Resolve all concrete auto-delivery targets for a cron job (supports comma-separated deliver)."""
deliver = _normalize_deliver_value(job.get("deliver", "local"))
deliver = job.get("deliver", "local")
if deliver == "local":
return []
parts = [p.strip() for p in deliver.split(",") if p.strip()]
parts = [p.strip() for p in str(deliver).split(",") if p.strip()]
seen = set()
targets = []
for part in parts:
@@ -300,21 +255,13 @@ def _resolve_delivery_target(job: dict) -> Optional[dict]:
return targets[0] if targets else None
# Media extension sets — audio routing is centralized in gateway.platforms.base
# via should_send_media_as_audio() so Telegram-specific rules stay in one place.
# Media extension sets — keep in sync with gateway/platforms/base.py:_process_message_background
_AUDIO_EXTS = frozenset({'.ogg', '.opus', '.mp3', '.wav', '.m4a'})
_VIDEO_EXTS = frozenset({'.mp4', '.mov', '.avi', '.mkv', '.webm', '.3gp'})
_IMAGE_EXTS = frozenset({'.jpg', '.jpeg', '.png', '.webp', '.gif'})
def _send_media_via_adapter(
adapter,
chat_id: str,
media_files: list,
metadata: dict | None,
loop,
job: dict,
platform=None,
) -> None:
def _send_media_via_adapter(adapter, chat_id: str, media_files: list, metadata: dict | None, loop, job: dict) -> None:
"""Send extracted MEDIA files as native platform attachments via a live adapter.
Routes each file to the appropriate adapter method (send_voice, send_image_file,
@@ -323,13 +270,10 @@ def _send_media_via_adapter(
"""
from pathlib import Path
from gateway.platforms.base import should_send_media_as_audio
for media_path, _is_voice in media_files:
try:
ext = Path(media_path).suffix.lower()
route_platform = platform if platform is not None else getattr(adapter, "platform", None)
if should_send_media_as_audio(route_platform, ext, is_voice=_is_voice):
if ext in _AUDIO_EXTS:
coro = adapter.send_voice(chat_id=chat_id, audio_path=media_path, metadata=metadata)
elif ext in _VIDEO_EXTS:
coro = adapter.send_video(chat_id=chat_id, video_path=media_path, metadata=metadata)
@@ -375,6 +319,26 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
from tools.send_message_tool import _send_to_platform
from gateway.config import load_gateway_config, Platform
platform_map = {
"telegram": Platform.TELEGRAM,
"discord": Platform.DISCORD,
"slack": Platform.SLACK,
"whatsapp": Platform.WHATSAPP,
"signal": Platform.SIGNAL,
"matrix": Platform.MATRIX,
"mattermost": Platform.MATTERMOST,
"homeassistant": Platform.HOMEASSISTANT,
"dingtalk": Platform.DINGTALK,
"feishu": Platform.FEISHU,
"wecom": Platform.WECOM,
"wecom_callback": Platform.WECOM_CALLBACK,
"weixin": Platform.WEIXIN,
"email": Platform.EMAIL,
"sms": Platform.SMS,
"bluebubbles": Platform.BLUEBUBBLES,
"qqbot": Platform.QQBOT,
}
# Optionally wrap the content with a header/footer so the user knows this
# is a cron delivery. Wrapping is on by default; set cron.wrap_response: false
# in config.yaml for clean output.
@@ -417,7 +381,7 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
thread_id = target.get("thread_id")
# Diagnostic: log thread_id for topic-aware delivery debugging
origin = _resolve_origin(job) or {}
origin = job.get("origin") or {}
origin_thread = origin.get("thread_id")
if origin_thread and not thread_id:
logger.warning(
@@ -431,23 +395,13 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
job["id"], platform_name, chat_id, thread_id,
)
# Built-in names resolve to their enum member; plugin platform names
# create dynamic members via Platform._missing_().
try:
platform = Platform(platform_name.lower())
except (ValueError, KeyError):
platform = platform_map.get(platform_name.lower())
if not platform:
msg = f"unknown platform '{platform_name}'"
logger.warning("Job '%s': %s", job["id"], msg)
delivery_errors.append(msg)
continue
pconfig = config.platforms.get(platform)
if not pconfig or not pconfig.enabled:
msg = f"platform '{platform_name}' not configured/enabled"
logger.warning("Job '%s': %s", job["id"], msg)
delivery_errors.append(msg)
continue
# Prefer the live adapter when the gateway is running — this supports E2EE
# rooms (e.g. Matrix) where the standalone HTTP path cannot encrypt.
runtime_adapter = (adapters or {}).get(platform)
@@ -478,15 +432,7 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
# Send extracted media files as native attachments via the live adapter
if adapter_ok and media_files:
_send_media_via_adapter(
runtime_adapter,
chat_id,
media_files,
send_metadata,
loop,
job,
platform=platform,
)
_send_media_via_adapter(runtime_adapter, chat_id, media_files, send_metadata, loop, job)
if adapter_ok:
logger.info("Job '%s': delivered to %s:%s via live adapter", job["id"], platform_name, chat_id)
@@ -498,6 +444,13 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
)
if not delivered:
pconfig = config.platforms.get(platform)
if not pconfig or not pconfig.enabled:
msg = f"platform '{platform_name}' not configured/enabled"
logger.warning("Job '%s': %s", job["id"], msg)
delivery_errors.append(msg)
continue
# Standalone path: run the async send in a fresh event loop (safe from any thread)
coro = _send_to_platform(platform, pconfig, chat_id, cleaned_delivery_content, thread_id=thread_id, media_files=media_files)
try:
@@ -576,18 +529,8 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
prevent arbitrary script execution via path traversal or absolute
path injection.
Supported interpreters (chosen by file extension):
* ``.sh`` / ``.bash`` run with ``/bin/bash``
* anything else run with the current Python interpreter
(``sys.executable``), preserving the original behaviour for
Python-based pre-check and data-collection scripts.
Shell support lets ``no_agent=True`` jobs ship classic bash watchdogs
(the `memory-watchdog.sh` pattern) without wrapping them in Python.
Args:
script_path: Path to the script. Relative paths are resolved
script_path: Path to a Python script. Relative paths are resolved
against HERMES_HOME/scripts/. Absolute and ~-prefixed paths
are also validated to ensure they stay within the scripts dir.
@@ -624,19 +567,9 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
script_timeout = _get_script_timeout()
# Pick an interpreter by extension. Bash for .sh/.bash, Python for
# everything else. We deliberately do NOT honour the file's own
# shebang: the scripts dir is trusted, but keeping the interpreter
# choice explicit here keeps the allowed surface small and auditable.
suffix = path.suffix.lower()
if suffix in (".sh", ".bash"):
argv = ["/bin/bash", str(path)]
else:
argv = [sys.executable, str(path)]
try:
result = subprocess.run(
argv,
[sys.executable, str(path)],
capture_output=True,
text=True,
timeout=script_timeout,
@@ -726,8 +659,10 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
f"{prompt}"
)
else:
# Script produced no output — nothing to report, skip AI call.
return None
prompt = (
"[Script ran successfully but produced no output.]\n\n"
f"{prompt}"
)
else:
prompt = (
"## Script Error\n"
@@ -780,7 +715,7 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
# Always prepend cron execution guidance so the agent knows how
# delivery works and can suppress delivery when appropriate.
cron_hint = (
"[IMPORTANT: You are running as a scheduled cron job. "
"[SYSTEM: You are running as a scheduled cron job. "
"DELIVERY: Your final response will be automatically delivered "
"to the user — do NOT use send_message or try to deliver "
"the output yourself. Just produce your report/output as your "
@@ -800,7 +735,6 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
return prompt
from tools.skills_tool import skill_view
from tools.skill_usage import bump_use
parts = []
skipped: list[str] = []
@@ -812,18 +746,12 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
skipped.append(skill_name)
continue
# Bump usage so the curator sees this skill as actively used.
try:
bump_use(skill_name)
except Exception:
logger.debug("Cron job: failed to bump skill usage for '%s'", skill_name, exc_info=True)
content = str(loaded.get("content") or "").strip()
if parts:
parts.append("")
parts.extend(
[
f'[IMPORTANT: The user has invoked the "{skill_name}" skill, indicating they want you to follow its instructions. The full skill content is loaded below.]',
f'[SYSTEM: The user has invoked the "{skill_name}" skill, indicating they want you to follow its instructions. The full skill content is loaded below.]',
"",
content,
]
@@ -831,7 +759,7 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
if skipped:
notice = (
f"[IMPORTANT: The following skill(s) were listed for this job but could not be found "
f"[SYSTEM: The following skill(s) were listed for this job but could not be found "
f"and were skipped: {', '.join(skipped)}. "
f"Start your response with a brief notice so the user is aware, e.g.: "
f"'⚠️ Skill(s) not found and skipped: {', '.join(skipped)}']"
@@ -850,120 +778,8 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
Returns:
Tuple of (success, full_output_doc, final_response, error_message)
"""
job_id = job["id"]
job_name = job["name"]
# ---------------------------------------------------------------
# no_agent short-circuit — the script IS the job, no LLM involvement.
# ---------------------------------------------------------------
# This mirrors the classic "run a bash script on a timer, send its
# stdout to telegram" watchdog pattern. The agent path is skipped
# entirely: no AIAgent, no prompt, no tool loop, no token spend.
#
# We check this BEFORE importing run_agent / constructing SessionDB so
# a pure-script tick never pays for the agent machinery it isn't going
# to use. Keep this block self-contained.
#
# Semantics:
# - script stdout (trimmed) → delivered verbatim as the final message
# - empty stdout → silent run (no delivery, success=True)
# - non-zero exit / timeout → delivered as an error alert, success=False
# - wakeAgent=false gate → treated like empty stdout (silent), since
# the whole point of no_agent is that there
# is no agent to wake
if job.get("no_agent"):
script_path = job.get("script")
if not script_path:
err = "no_agent=True but no script is set for this job"
logger.error("Job '%s': %s", job_id, err)
return False, "", "", err
# Apply workdir if configured — lets scripts use predictable relative
# paths. For no_agent jobs this is just the subprocess cwd (not an
# agent TERMINAL_CWD bridge).
_job_workdir = (job.get("workdir") or "").strip() or None
_prior_cwd = None
if _job_workdir and Path(_job_workdir).is_dir():
_prior_cwd = os.getcwd()
try:
os.chdir(_job_workdir)
except OSError:
_prior_cwd = None
try:
ok, output = _run_job_script(script_path)
finally:
if _prior_cwd is not None:
try:
os.chdir(_prior_cwd)
except OSError:
pass
now_iso = _hermes_now().strftime("%Y-%m-%d %H:%M:%S")
if not ok:
# Script crashed / timed out / exited non-zero. Deliver the
# error so the user knows the watchdog itself broke — silent
# failure for an alerting job is the worst-case outcome.
alert = (
f"⚠ Cron watchdog '{job_name}' script failed\n\n"
f"{output}\n\n"
f"Time: {now_iso}"
)
doc = (
f"# Cron Job: {job_name}\n\n"
f"**Job ID:** {job_id}\n"
f"**Run Time:** {now_iso}\n"
f"**Mode:** no_agent (script)\n"
f"**Status:** script failed\n\n"
f"{output}\n"
)
return False, doc, alert, output
# Honour the wakeAgent gate as a silent signal — `wakeAgent: false`
# means "nothing to report this tick", same as empty stdout.
if not _parse_wake_gate(output):
logger.info(
"Job '%s' (no_agent): wakeAgent=false gate — silent run", job_id
)
silent_doc = (
f"# Cron Job: {job_name}\n\n"
f"**Job ID:** {job_id}\n"
f"**Run Time:** {now_iso}\n"
f"**Mode:** no_agent (script)\n"
f"**Status:** silent (wakeAgent=false)\n"
)
return True, silent_doc, SILENT_MARKER, None
if not output.strip():
logger.info("Job '%s' (no_agent): empty stdout — silent run", job_id)
silent_doc = (
f"# Cron Job: {job_name}\n\n"
f"**Job ID:** {job_id}\n"
f"**Run Time:** {now_iso}\n"
f"**Mode:** no_agent (script)\n"
f"**Status:** silent (empty output)\n"
)
return True, silent_doc, SILENT_MARKER, None
doc = (
f"# Cron Job: {job_name}\n\n"
f"**Job ID:** {job_id}\n"
f"**Run Time:** {now_iso}\n"
f"**Mode:** no_agent (script)\n\n"
f"---\n\n"
f"{output}\n"
)
return True, doc, output, None
# ---------------------------------------------------------------
# Default (LLM) path — import and construct the agent machinery now
# that we know we actually need it. Doing these imports here instead of
# at module top keeps no_agent ticks from paying for AIAgent / SessionDB
# construction costs.
# ---------------------------------------------------------------
from run_agent import AIAgent
# Initialize SQLite session store so cron job messages are persisted
# and discoverable via session_search (same pattern as gateway/run.py).
_session_db = None
@@ -972,6 +788,9 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
_session_db = SessionDB()
except Exception as e:
logger.debug("Job '%s': SQLite session store not available: %s", job.get("id", "?"), e)
job_id = job["id"]
job_name = job["name"]
# Wake-gate: if this job has a pre-check script, run it BEFORE building
# the prompt so a ``{"wakeAgent": false}`` response can short-circuit
@@ -996,17 +815,12 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
return True, silent_doc, SILENT_MARKER, None
prompt = _build_job_prompt(job, prerun_script=prerun_script)
if prompt is None:
logger.info("Job '%s': script produced no output, skipping AI call.", job_name)
return True, "", SILENT_MARKER, None
origin = _resolve_origin(job)
_cron_session_id = f"cron_{job_id}_{_hermes_now().strftime('%Y%m%d_%H%M%S')}"
logger.info("Running job '%s' (ID: %s)", job_name, job_id)
logger.info("Prompt: %s", prompt[:100])
agent = None
# Mark this as a cron session so the approval system can apply cron_mode.
# This env var is process-wide and persists for the lifetime of the
# scheduler process — every job this process runs is a cron job.
@@ -1021,13 +835,6 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
chat_id=str(origin["chat_id"]) if origin else "",
chat_name=origin.get("chat_name", "") if origin else "",
)
_cron_delivery_vars = (
"HERMES_CRON_AUTO_DELIVER_PLATFORM",
"HERMES_CRON_AUTO_DELIVER_CHAT_ID",
"HERMES_CRON_AUTO_DELIVER_THREAD_ID",
)
for _var_name in _cron_delivery_vars:
_VAR_MAP[_var_name].set("")
# Per-job working directory. When set (and validated at create/update
# time), we point TERMINAL_CWD at it so:
@@ -1066,11 +873,8 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
if delivery_target:
_VAR_MAP["HERMES_CRON_AUTO_DELIVER_PLATFORM"].set(delivery_target["platform"])
_VAR_MAP["HERMES_CRON_AUTO_DELIVER_CHAT_ID"].set(str(delivery_target["chat_id"]))
_VAR_MAP["HERMES_CRON_AUTO_DELIVER_THREAD_ID"].set(
""
if delivery_target.get("thread_id") is None
else str(delivery_target["thread_id"])
)
if delivery_target.get("thread_id") is not None:
_VAR_MAP["HERMES_CRON_AUTO_DELIVER_THREAD_ID"].set(str(delivery_target["thread_id"]))
model = job.get("model") or os.getenv("HERMES_MODEL") or ""
@@ -1082,7 +886,6 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
if os.path.exists(_cfg_path):
with open(_cfg_path) as _f:
_cfg = yaml.safe_load(_f) or {}
_cfg = _expand_env_vars(_cfg)
_model_cfg = _cfg.get("model", {})
if not job.get("model"):
if isinstance(_model_cfg, str):
@@ -1135,13 +938,8 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
)
from hermes_cli.auth import AuthError
try:
# Do not inject HERMES_INFERENCE_PROVIDER here. resolve_runtime_provider()
# already prefers persisted config over stale shell/env overrides when
# no explicit provider is requested. Passing the env var here short-
# circuits that precedence and can resurrect old providers (for
# example DeepSeek) for cron jobs that do not pin provider/model.
runtime_kwargs = {
"requested": job.get("provider"),
"requested": job.get("provider") or os.getenv("HERMES_INFERENCE_PROVIDER"),
}
if job.get("base_url"):
runtime_kwargs["explicit_base_url"] = job.get("base_url")
@@ -1210,12 +1008,10 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
enabled_toolsets=_resolve_cron_enabled_toolsets(job, _cfg),
disabled_toolsets=["cronjob", "messaging", "clarify"],
quiet_mode=True,
# Cron jobs should always inherit the user's SOUL.md identity from
# HERMES_HOME. When a workdir is configured, also inject project
# context files (AGENTS.md / CLAUDE.md / .cursorrules) from there.
# Without a workdir, keep cwd context discovery disabled.
# When a workdir is configured, inject AGENTS.md / CLAUDE.md /
# .cursorrules from that directory; otherwise preserve the old
# behaviour (don't inject SOUL.md/AGENTS.md from the scheduler cwd).
skip_context_files=not bool(_job_workdir),
load_soul_identity=True,
skip_memory=True, # Cron system prompts would corrupt user representations
platform="cron",
session_id=_cron_session_id,
@@ -1230,18 +1026,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
#
# Uses the agent's built-in activity tracker (updated by
# _touch_activity() on every tool call, API call, and stream delta).
_raw_cron_timeout = os.getenv("HERMES_CRON_TIMEOUT", "").strip()
if _raw_cron_timeout:
try:
_cron_timeout = float(_raw_cron_timeout)
except (ValueError, TypeError):
logger.warning(
"Invalid HERMES_CRON_TIMEOUT=%r; using default 600s",
_raw_cron_timeout,
)
_cron_timeout = 600.0
else:
_cron_timeout = 600.0
_cron_timeout = float(os.getenv("HERMES_CRON_TIMEOUT", 600))
_cron_inactivity_limit = _cron_timeout if _cron_timeout > 0 else None
_POLL_INTERVAL = 5.0
_cron_pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
@@ -1316,21 +1101,6 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
f"agent.run_conversation returned {type(result).__name__} instead of dict: {result!r}"
)
# If the agent itself reported failure (e.g. all retries exhausted on
# API errors, model abort, mid-run interrupt), do not silently mark the
# job as successful. run_agent populates `failed=True`/`completed=False`
# on these paths and may put the error into `final_response`, which
# would otherwise be delivered as if it were the agent's reply and the
# job's `last_status` set to "ok". Raise so the except handler below
# builds the proper failure tuple. (issue #17855)
if result.get("failed") is True or result.get("completed") is False:
_err_text = (
result.get("error")
or (result.get("final_response") or "").strip()
or "agent reported failure"
)
raise RuntimeError(_err_text)
final_response = result.get("final_response", "") or ""
# Strip leaked placeholder text that upstream may inject on empty completions.
if final_response.strip() == "(No response generated)":
@@ -1390,8 +1160,6 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
os.environ["TERMINAL_CWD"] = _prior_terminal_cwd
# Clean up ContextVar session/delivery state for this job.
clear_session_vars(_ctx_tokens)
for _var_name in _cron_delivery_vars:
_VAR_MAP[_var_name].set("")
if _session_db:
try:
_session_db.end_session(_cron_session_id, "cron_complete")
@@ -1401,24 +1169,6 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
_session_db.close()
except (Exception, KeyboardInterrupt) as e:
logger.debug("Job '%s': failed to close SQLite session store: %s", job_id, e)
# Release subprocesses, terminal sandboxes, browser daemons, and the
# main OpenAI/httpx client held by this ephemeral cron agent. Without
# this, a gateway that ticks cron every N minutes leaks fds per job
# until it hits EMFILE (#10200 / "too many open files").
try:
if agent is not None:
agent.close()
except (Exception, KeyboardInterrupt) as e:
logger.debug("Job '%s': failed to close agent resources: %s", job_id, e)
# Each cron run spins up a short-lived worker thread whose event loop
# dies as soon as the ``ThreadPoolExecutor`` shuts down. Any async
# httpx clients cached under that loop are now unusable — reap them
# so their transports don't accumulate in the process-global cache.
try:
from agent.auxiliary_client import cleanup_stale_async_clients
cleanup_stale_async_clients()
except Exception as e:
logger.debug("Job '%s': failed to reap stale auxiliary clients: %s", job_id, e)
def tick(verbose: bool = True, adapters=None, loop=None) -> int:
@@ -1558,17 +1308,6 @@ def tick(verbose: bool = True, adapters=None, loop=None) -> int:
_futures.append(_tick_pool.submit(_ctx.run, _process_job, job))
_results.extend(f.result() for f in _futures)
# Best-effort sweep of MCP stdio subprocesses that survived their
# session teardown during this tick. Runs AFTER every job has
# finished so active sessions (including live user chats) are
# never touched — only PIDs explicitly detected as orphans in
# tools.mcp_tool._run_stdio's finally block are reaped.
try:
from tools.mcp_tool import _kill_orphaned_mcp_children
_kill_orphaned_mcp_children()
except Exception as _e:
logger.debug("Post-tick MCP orphan cleanup failed: %s", _e)
return sum(_results)
finally:
if fcntl:
-7
View File
@@ -34,13 +34,6 @@ services:
# uncomment BOTH lines (API_SERVER_KEY is mandatory for auth):
# - API_SERVER_HOST=0.0.0.0
# - API_SERVER_KEY=${API_SERVER_KEY}
# Microsoft Teams — uncomment and fill in to enable Teams gateway.
# Register your bot at https://dev.botframework.com/ to get these values.
# - TEAMS_CLIENT_ID=${TEAMS_CLIENT_ID}
# - TEAMS_CLIENT_SECRET=${TEAMS_CLIENT_SECRET}
# - TEAMS_TENANT_ID=${TEAMS_TENANT_ID}
# - TEAMS_ALLOWED_USERS=${TEAMS_ALLOWED_USERS}
# - TEAMS_PORT=${TEAMS_PORT:-3978}
command: ["gateway", "run"]
dashboard:
-35
View File
@@ -86,41 +86,6 @@ if [ -d "$INSTALL_DIR/skills" ]; then
python3 "$INSTALL_DIR/tools/skills_sync.py"
fi
# Optionally start `hermes dashboard` as a side-process.
#
# Toggled by HERMES_DASHBOARD=1 (also accepts "true"/"yes", case-insensitive).
# Host/port/TUI can be overridden via:
# HERMES_DASHBOARD_HOST (default 0.0.0.0 — exposed outside the container)
# HERMES_DASHBOARD_PORT (default 9119, matches `hermes dashboard` default)
# HERMES_DASHBOARD_TUI (already honored by `hermes dashboard` itself)
#
# The dashboard is a long-lived server. We background it *before* the final
# `exec hermes "$@"` so the user's chosen foreground command (chat, gateway,
# sleep infinity, …) remains PID-of-interest for the container runtime. When
# the container stops the whole process tree is torn down, so no explicit
# cleanup is needed.
case "${HERMES_DASHBOARD:-}" in
1|true|TRUE|True|yes|YES|Yes)
dash_host="${HERMES_DASHBOARD_HOST:-0.0.0.0}"
dash_port="${HERMES_DASHBOARD_PORT:-9119}"
dash_args=(--host "$dash_host" --port "$dash_port" --no-open)
# Binding to anything other than localhost requires --insecure — the
# dashboard refuses otherwise because it exposes API keys. Inside a
# container this is the expected deployment (host reaches it via
# published port), so opt in automatically.
if [ "$dash_host" != "127.0.0.1" ] && [ "$dash_host" != "localhost" ]; then
dash_args+=(--insecure)
fi
echo "Starting hermes dashboard on ${dash_host}:${dash_port} (background)"
# Prefix dashboard output so it's distinguishable from the main
# process in `docker logs`. stdbuf keeps the pipe line-buffered.
(
stdbuf -oL -eL hermes dashboard "${dash_args[@]}" 2>&1 \
| sed -u 's/^/[dashboard] /'
) &
;;
esac
# Final exec: two supported invocation patterns.
#
# docker run <image> -> exec `hermes` with no args (legacy default)
@@ -1,473 +0,0 @@
# Telegram DM User-Managed Multi-Session Topics Implementation Plan
> **For Hermes:** Use test-driven-development for implementation. Use subagent-driven-development only after this plan is split into small reviewed tasks.
**Goal:** Add an opt-in Telegram DM multi-session mode where Telegram user-created private-chat topics become independent Hermes session lanes, while the root DM becomes a system lobby.
**Architecture:** Rely on Telegram's native private-chat topic UI. Users create new topics with the `+` button; Hermes maps each `message_thread_id` to a separate session lane. Hermes does not create topics for normal `/new` flow and does not try to manage topic lifecycle beyond activation/status, root-lobby behavior, and restoring legacy sessions into a user-created topic.
**Tech Stack:** Hermes gateway, Telegram Bot API 9.4+, python-telegram-bot adapter, SQLite SessionDB / side tables, pytest.
---
## 1. Product decisions
### Accepted
- PR-quality implementation: migrations, tests, docs, backwards compatibility.
- Use SQLite persistence, not JSON sidecars.
- Live status suffixes in topic titles are out of MVP.
- Topic title sync/editing is out of MVP except future-compatible storage if cheap.
- User creates Telegram topics manually through the Telegram bot interface.
- `/new` does **not** create Telegram topics.
- Root/main DM becomes a system lobby after activation.
- Existing Telegram behavior remains unchanged until the feature is activated/enabled.
- Migration of old sessions is supported through `/topic` listing and `/topic <session_id>` restore inside a user-created topic.
### Telegram API assumptions verified from Bot API docs
- `getMe` returns bot `User` fields:
- `has_topics_enabled`: forum/topic mode enabled in private chats.
- `allows_users_to_create_topics`: users may create/delete topics in private chats.
- `createForumTopic` works for private chats with a user, but MVP does not rely on it for normal flow.
- `Message.message_thread_id` identifies a topic in private chats.
- `sendMessage` supports `message_thread_id` for private-chat topics.
- `pinChatMessage` is allowed in private chats.
---
## 2. Target UX
### 2.1 Activation from root/main DM
User sends:
```text
/topic
```
Hermes:
1. calls Telegram `getMe`;
2. verifies `has_topics_enabled` and `allows_users_to_create_topics`;
3. enables multi-session topic mode for this Telegram DM user/chat;
4. sends an onboarding message;
5. pins the onboarding message if configured;
6. shows old/unlinked sessions that can be restored into topics.
Suggested onboarding text:
```text
Multi-session mode is enabled.
Create new Hermes chats with the + button in this bot interface. Each Telegram topic is an independent Hermes session, so you can work on different tasks in parallel.
This main chat is reserved for system commands, status, and session management.
To restore an old session:
1. Use /topic here to see unlinked sessions.
2. Create a new topic with the + button.
3. Send /topic <session_id> inside that topic.
```
### 2.2 Root/main DM after activation
Root DM is a system lobby.
Allowed/system commands include at least:
- `/topic`
- `/status`
- `/sessions` if available
- `/usage`
- `/help`
- `/platforms`
Normal user prompts in root DM do not enter the agent loop. Reply:
```text
This main chat is reserved for system commands.
To chat with Hermes, create a new topic using the + button in this bot interface. Each topic works as an independent Hermes session.
```
`/new` in root DM does not create a session/topic. Reply:
```text
To start a new parallel Hermes chat, create a new topic with the + button in this bot interface.
Each topic is an independent Hermes session. Use /new inside a topic only if you want to replace that topic's current session.
```
### 2.3 First message in a user-created topic
When a user creates a Telegram topic and sends the first message there:
1. Hermes receives a Telegram DM message with `message_thread_id`.
2. Hermes derives the existing thread-aware `session_key` from `(platform=telegram, chat_type=dm, chat_id, thread_id)`.
3. If no binding exists, Hermes creates a fresh Hermes session for this topic lane and persists the binding.
4. The message runs through the normal agent loop for that lane.
### 2.4 `/new` inside a non-main topic
`/new` remains supported but replaces the session attached to the current topic lane.
Hermes should warn:
```text
Started a new Hermes session in this topic.
Tip: for parallel work, create a new topic with the + button instead of using /new here. /new replaces the session attached to the current topic.
```
### 2.5 `/topic` in root/main DM after activation
Shows:
- mode enabled/disabled;
- last capability check result;
- whether intro message is pinned if known;
- count of known topic bindings;
- list of old/unlinked sessions.
Example:
```text
Telegram multi-session topics are enabled.
Create new Hermes chats with the + button in this bot interface.
Unlinked previous sessions:
1. 2026-05-01 Research notes — id: abc123
2. 2026-04-30 Deploy debugging — id: def456
3. Untitled session — id: ghi789
To restore one:
1. Create a new topic with the + button.
2. Open that topic.
3. Send /topic <id>
```
### 2.6 `/topic` inside a non-main topic
Without args, show the current topic binding:
```text
This topic is linked to:
Session: Research notes
ID: abc123
Use /new to replace this topic with a fresh session.
For parallel work, create another topic with the + button.
```
### 2.7 `/topic <session_id>` inside a non-main topic
Restore an old/unlinked session into the current user-created topic.
Behavior:
1. reject if not in Telegram DM topic;
2. verify session belongs to the same Telegram user/chat or is a safe legacy root DM session for this user;
3. reject if session is already linked to another active topic in MVP;
4. `SessionStore.switch_session(current_topic_session_key, target_session_id)`;
5. upsert binding with `managed_mode = restored`;
6. send two messages into the topic:
- session restored confirmation;
- last Hermes assistant message if available.
Example:
```text
Session restored: Research notes
Last Hermes message:
...
```
---
## 3. Persistence model
Use SQLite, but topic-mode schema changes are **explicit opt-in migrations**, not automatic startup reconciliation.
Important rollback-safety rule:
- upgrading Hermes and starting the gateway must not create Telegram topic-mode tables or columns;
- old/default Telegram behavior must keep working on the existing `state.db`;
- the first `/topic` activation path calls an idempotent explicit migration, then enables topic mode for that chat;
- if activation fails before the migration is needed, the database remains in the pre-topic-mode shape.
### 3.1 No eager `sessions` table mutation for MVP
Do **not** add `chat_id`, `chat_type`, `thread_id`, or `session_key` columns to `sessions` as part of ordinary `SessionDB()` startup. The existing declarative `_reconcile_columns()` mechanism would add them eagerly on every process start, which violates the managed-migration requirement.
For MVP, keep origin/session-lane data in topic-specific side tables created only by the explicit `/topic` migration. Legacy unlinked sessions can be discovered conservatively from existing data (`source = telegram`, `user_id = current Telegram user`) plus absence from topic bindings.
If future PRs need richer origin metadata for all gateway sessions, introduce it behind a separate explicit migration/command or a compatibility-reviewed schema bump.
### 3.2 Explicit `/topic` migration API
Add an idempotent method such as:
```python
def apply_telegram_topic_migration(self) -> None: ...
```
It creates only topic-mode side tables/indexes and records:
```text
state_meta.telegram_dm_topic_schema_version = 1
```
This method is called from `/topic` activation/status paths before reading or writing topic-mode state. It is not called from generic `SessionDB.__init__`, gateway startup, CLI startup, or auto-maintenance.
### 3.3 `telegram_dm_topic_mode`
Stores per-user/chat activation state. Created only by `apply_telegram_topic_migration()`.
Suggested fields:
- `chat_id` primary key
- `user_id`
- `enabled`
- `activated_at`
- `updated_at`
- `has_topics_enabled`
- `allows_users_to_create_topics`
- `capability_checked_at`
- `intro_message_id`
- `pinned_message_id`
### 3.4 `telegram_dm_topic_bindings`
Stores Telegram topic/thread to Hermes session binding. Created only by `apply_telegram_topic_migration()`.
Suggested fields:
- `chat_id`
- `thread_id`
- `user_id`
- `session_key`
- `session_id`
- `managed_mode`
- `auto`
- `restored`
- `new_replaced`
- `linked_at`
- `updated_at`
Recommended constraints:
- primary key `(chat_id, thread_id)`;
- unique index on `session_id` for MVP to prevent one session linked to multiple topics;
- index `(user_id, chat_id)` for status/listing.
### 3.5 Unlinked session semantics
For MVP, a session is unlinked if:
- `source = telegram`;
- `user_id = current Telegram user`;
- no row in `telegram_dm_topic_bindings` has `session_id = session_id`.
This is intentionally conservative until a future explicit migration adds richer cross-platform origin metadata.
Never dedupe by title.
---
## 4. Config
Suggested config block:
```yaml
platforms:
telegram:
extra:
multisession_topics:
enabled: false
mode: user_managed_topics
root_chat_behavior: system_lobby
pin_intro_message: true
```
Notes:
- `enabled: false` means existing Telegram behavior is unchanged.
- Activation via `/topic` may create per-chat enabled state only if global config permits it.
- `root_chat_behavior: system_lobby` is the MVP behavior for activated chats.
---
## 5. Command behavior summary
### `/topic` root/main DM
- If not activated: capability check, activate, send/pin onboarding, list unlinked sessions.
- If activated: show status and unlinked sessions.
### `/topic` non-main topic
- Show current binding.
### `/topic <session_id>` root/main DM
Reject with instructions:
```text
Create a new topic with the + button, open it, then send /topic <session_id> there to restore this session.
```
### `/topic <session_id>` non-main topic
Restore that session into this topic if ownership/linking checks pass.
### `/new` root/main DM when activated
Reply with instructions to use the `+` button. Do not enter agent loop.
### `/new` non-main topic
Create a new session in the current topic lane, persist/update binding, warn that `+` is preferred for parallel work.
### Normal text root/main DM when activated
Reply with system-lobby instruction. Do not enter agent loop.
### Normal text non-main topic
Normal Hermes agent flow for that topic's session lane.
---
## 6. PR breakdown
### PR 1 — Explicit topic-mode schema migration
**Goal:** Add rollback-safe SQLite support for Telegram topic mode without mutating `state.db` on ordinary upgrade/startup.
**Files likely touched:**
- `hermes_state.py`
- tests under `tests/`
**Tests first:**
1. opening an old/current DB with `SessionDB()` does not create topic-mode tables or `sessions` origin columns;
2. calling `apply_telegram_topic_migration()` creates `telegram_dm_topic_mode` and `telegram_dm_topic_bindings` idempotently;
3. migration records `state_meta.telegram_dm_topic_schema_version = 1`.
### PR 2 — Topic mode activation and binding APIs
**Goal:** Add SQLite persistence for activation and topic bindings.
**Tests first:**
1. enable/check mode row round-trips;
2. binding upsert and lookup by `(chat_id, user_id, thread_id)`;
3. linked sessions are excluded from unlinked list.
### PR 3 — `/topic` activation/status command
**Goal:** Implement root activation/status/listing behavior.
**Tests first:**
1. `/topic` in root checks `getMe` capabilities and records activation;
2. capability failure returns readable instructions;
3. activated root `/topic` lists unlinked sessions.
### PR 4 — System lobby behavior
**Goal:** Prevent root chat from entering agent loop after activation.
**Tests first:**
1. normal text in activated root returns lobby instruction;
2. `/new` in activated root returns `+` button instruction;
3. non-activated root behavior is unchanged.
### PR 5 — Auto-bind user-created topics
**Goal:** First message in non-main topic creates/uses an independent session lane.
**Tests first:**
1. new topic message creates binding with `auto_created`;
2. repeated topic message reuses same binding/lane;
3. two topics in same DM do not share sessions.
### PR 6 — Restore legacy sessions into a topic
**Goal:** Implement `/topic <session_id>` in non-main topics.
**Tests first:**
1. root `/topic <id>` rejects with instructions;
2. topic `/topic <id>` switches current topic lane to target session;
3. restore rejects sessions from other users/chats;
4. restore rejects already-linked sessions;
5. restore emits confirmation and last Hermes assistant message.
### PR 7 — `/new` inside topic updates binding
**Goal:** Keep existing `/new` semantics but persist topic binding replacement.
**Tests first:**
1. `/new` in topic creates a new session for same topic lane;
2. binding updates to `managed_mode = new_replaced`;
3. response includes guidance to use `+` for parallel work.
### PR 8 — Docs and polish
**Goal:** Document the feature and Telegram setup.
**Files likely touched:**
- `website/docs/user-guide/messaging/telegram.md`
- maybe `website/docs/user-guide/sessions.md`
Docs must explain:
- BotFather/Telegram settings for topic mode and user-created topics;
- `/topic` activation;
- root system lobby;
- using `+` for new parallel chats;
- restoring old sessions with `/topic <id>` inside a topic;
- limitations.
---
## 7. Testing / quality gates
Run targeted tests after each TDD cycle, then broader tests before completion.
Suggested commands after inspection confirms test paths:
```bash
python -m pytest tests/test_hermes_state.py -q
python -m pytest tests/gateway/ -q
python -m pytest tests/ -o 'addopts=' -q
```
Do not ship without verifying disabled-feature backwards compatibility.
---
## 8. Definition of done for MVP
- `/topic` activates/checks Telegram DM multi-session mode.
- Root DM becomes a system lobby after activation.
- Onboarding message tells users to create new chats with the Telegram `+` button.
- Onboarding message can be pinned in private chat.
- User-created topics automatically become independent Hermes session lanes.
- `/new` in root gives instructions, not a new agent run.
- `/new` in a topic creates a new session in that topic and warns that `+` is preferred for parallel work.
- `/topic` in root lists unlinked old sessions.
- `/topic <session_id>` inside a topic restores that session and sends confirmation + last Hermes assistant message.
- Ownership checks prevent restoring other users' sessions.
- Already-linked sessions are not restored into a second topic in MVP.
- Existing Telegram behavior is unchanged when the feature is disabled.
- Tests and docs are included.
-1
View File
@@ -36,7 +36,6 @@
imports = [
./nix/packages.nix
./nix/overlays.nix
./nix/nixosModules.nix
./nix/checks.nix
./nix/devShell.nix
Binary file not shown.

Before

Width:  |  Height:  |  Size: 115 KiB

+85
View File
@@ -0,0 +1,85 @@
"""Built-in boot-md hook — run ~/.hermes/BOOT.md on gateway startup.
This hook is always registered. It silently skips if no BOOT.md exists.
To activate, create ``~/.hermes/BOOT.md`` with instructions for the
agent to execute on every gateway restart.
Example BOOT.md::
# Startup Checklist
1. Check if any cron jobs failed overnight
2. Send a status update to Discord #general
3. If there are errors in /opt/app/deploy.log, summarize them
The agent runs in a background thread so it doesn't block gateway
startup. If nothing needs attention, it replies with [SILENT] to
suppress delivery.
"""
import logging
import threading
logger = logging.getLogger("hooks.boot-md")
from hermes_constants import get_hermes_home
HERMES_HOME = get_hermes_home()
BOOT_FILE = HERMES_HOME / "BOOT.md"
def _build_boot_prompt(content: str) -> str:
"""Wrap BOOT.md content in a system-level instruction."""
return (
"You are running a startup boot checklist. Follow the BOOT.md "
"instructions below exactly.\n\n"
"---\n"
f"{content}\n"
"---\n\n"
"Execute each instruction. If you need to send a message to a "
"platform, use the send_message tool.\n"
"If nothing needs attention and there is nothing to report, "
"reply with ONLY: [SILENT]"
)
def _run_boot_agent(content: str) -> None:
"""Spawn a one-shot agent session to execute the boot instructions."""
try:
from run_agent import AIAgent
prompt = _build_boot_prompt(content)
agent = AIAgent(
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
max_iterations=20,
)
result = agent.run_conversation(prompt)
response = result.get("final_response", "")
if response and "[SILENT]" not in response:
logger.info("boot-md completed: %s", response[:200])
else:
logger.info("boot-md completed (nothing to report)")
except Exception as e:
logger.error("boot-md agent failed: %s", e)
async def handle(event_type: str, context: dict) -> None:
"""Gateway startup handler — run BOOT.md if it exists."""
if not BOOT_FILE.exists():
return
content = BOOT_FILE.read_text(encoding="utf-8").strip()
if not content:
return
logger.info("Running BOOT.md (%d chars)", len(content))
# Run in a background thread so we don't block gateway startup.
thread = threading.Thread(
target=_run_boot_agent,
args=(content,),
name="boot-md",
daemon=True,
)
thread.start()
+14 -77
View File
@@ -57,7 +57,7 @@ def _session_entry_name(origin: Dict[str, Any]) -> str:
# Build / refresh
# ---------------------------------------------------------------------------
async def build_channel_directory(adapters: Dict[Any, Any]) -> Dict[str, Any]:
def build_channel_directory(adapters: Dict[Any, Any]) -> Dict[str, Any]:
"""
Build a channel directory from connected platform adapters and session data.
@@ -72,7 +72,7 @@ async def build_channel_directory(adapters: Dict[Any, Any]) -> Dict[str, Any]:
if platform == Platform.DISCORD:
platforms["discord"] = _build_discord(adapter)
elif platform == Platform.SLACK:
platforms["slack"] = await _build_slack(adapter)
platforms["slack"] = _build_slack(adapter)
except Exception as e:
logger.warning("Channel directory: failed to build %s: %s", platform.value, e)
@@ -86,16 +86,6 @@ async def build_channel_directory(adapters: Dict[Any, Any]) -> Dict[str, Any]:
continue
platforms[plat_name] = _build_from_sessions(plat_name)
# Include plugin-registered platforms (dynamic enum members aren't in
# Platform.__members__, so the loop above misses them).
try:
from gateway.platform_registry import platform_registry
for entry in platform_registry.plugin_entries():
if entry.name not in _SKIP_SESSION_DISCOVERY and entry.name not in platforms:
platforms[entry.name] = _build_from_sessions(entry.name)
except Exception:
pass
directory = {
"updated_at": datetime.now().isoformat(),
"platforms": platforms,
@@ -146,66 +136,21 @@ def _build_discord(adapter) -> List[Dict[str, str]]:
return channels
async def _build_slack(adapter) -> List[Dict[str, Any]]:
"""List Slack channels the bot has joined across all workspaces.
Uses ``users.conversations`` against each workspace's web client. Pulls
public + private channels the bot is a member of, then merges in DMs
discovered from session history (IMs aren't useful to enumerate
proactively).
"""
team_clients = getattr(adapter, "_team_clients", None) or {}
if not team_clients:
def _build_slack(adapter) -> List[Dict[str, str]]:
"""List Slack channels the bot has joined."""
# Slack adapter may expose a web client
client = getattr(adapter, "_app", None) or getattr(adapter, "_client", None)
if not client:
return _build_from_sessions("slack")
channels: List[Dict[str, Any]] = []
seen_ids: set = set()
try:
from tools.send_message_tool import _send_slack # noqa: F401
# Use the Slack Web API directly if available
except Exception:
pass
for team_id, client in team_clients.items():
try:
cursor: Optional[str] = None
for _page in range(20): # safety cap on pagination
response = await client.users_conversations(
types="public_channel,private_channel",
exclude_archived=True,
limit=200,
cursor=cursor,
)
if not response.get("ok"):
logger.warning(
"Channel directory: users.conversations not ok for team %s: %s",
team_id,
response.get("error", "unknown"),
)
break
for ch in response.get("channels", []):
cid = ch.get("id")
name = ch.get("name")
if not cid or not name or cid in seen_ids:
continue
seen_ids.add(cid)
channels.append({
"id": cid,
"name": name,
"type": "private" if ch.get("is_private") else "channel",
})
cursor = (response.get("response_metadata") or {}).get("next_cursor")
if not cursor:
break
except Exception as e:
logger.warning(
"Channel directory: failed to list Slack channels for team %s: %s",
team_id, e,
)
continue
# Merge in DM/group entries discovered from session history.
for entry in _build_from_sessions("slack"):
if entry.get("id") not in seen_ids:
channels.append(entry)
seen_ids.add(entry.get("id"))
return channels
# Fallback to session data
return _build_from_sessions("slack")
def _build_from_sessions(platform_name: str) -> List[Dict[str, str]]:
@@ -278,14 +223,6 @@ def resolve_channel_name(platform_name: str, name: str) -> Optional[str]:
if not channels:
return None
# 0. Exact ID match — case-sensitive, no normalization. Lets callers pass
# raw platform IDs (e.g. Slack "C0B0QV5434G") even when the format guard
# in _parse_target_ref hasn't recognized them as explicit.
raw = name.strip()
for ch in channels:
if ch.get("id") == raw:
return ch["id"]
query = _normalize_channel_query(name)
# 1. Exact name match, including the display labels shown by send_message(action="list")
+68 -381
View File
@@ -13,7 +13,7 @@ import os
import json
from pathlib import Path
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Any, Callable
from typing import Dict, List, Optional, Any
from enum import Enum
from hermes_cli.config import get_hermes_home
@@ -36,26 +36,6 @@ def _coerce_bool(value: Any, default: bool = True) -> bool:
return is_truthy_value(value, default=default)
def _coerce_float(value: Any, default: float) -> float:
"""Coerce numeric config values, falling back on malformed input."""
if value is None:
return default
try:
return float(value)
except (TypeError, ValueError):
return default
def _coerce_int(value: Any, default: int) -> int:
"""Coerce integer config values, falling back on malformed input."""
if value is None:
return default
try:
return int(value)
except (TypeError, ValueError):
return default
def _normalize_unauthorized_dm_behavior(value: Any, default: str = "pair") -> str:
"""Normalize unauthorized DM behavior to a supported value."""
if isinstance(value, str):
@@ -65,28 +45,8 @@ def _normalize_unauthorized_dm_behavior(value: Any, default: str = "pair") -> st
return default
def _normalize_notice_delivery(value: Any, default: str = "public") -> str:
"""Normalize notice delivery mode to a supported value."""
if isinstance(value, str):
normalized = value.strip().lower()
if normalized in {"public", "private"}:
return normalized
return default
# Module-level cache for bundled platform plugin names (lives outside the
# enum so it doesn't become an accidental enum member).
_Platform__bundled_plugin_names: Optional[set] = None
class Platform(Enum):
"""Supported messaging platforms.
Built-in platforms have explicit members. Plugin platforms use dynamic
members created on-demand by ``_missing_()`` so that
``Platform("irc")`` works without modifying this enum. Dynamic members
are cached in ``_value2member_map_`` for identity-stable comparisons.
"""
"""Supported messaging platforms."""
LOCAL = "local"
TELEGRAM = "telegram"
DISCORD = "discord"
@@ -107,77 +67,6 @@ class Platform(Enum):
WEIXIN = "weixin"
BLUEBUBBLES = "bluebubbles"
QQBOT = "qqbot"
YUANBAO = "yuanbao"
@classmethod
def _missing_(cls, value):
"""Accept unknown platform names only for known plugin adapters.
Creates a pseudo-member cached in ``_value2member_map_`` so that
``Platform("irc") is Platform("irc")`` holds True (identity-stable).
Arbitrary strings are rejected to prevent enum pollution.
"""
if not isinstance(value, str) or not value.strip():
return None
# Normalise to lowercase to avoid case mismatches in config
value = value.strip().lower()
# Check cache first (another call may have created it already)
if value in cls._value2member_map_:
return cls._value2member_map_[value]
# Only create pseudo-members for bundled plugin platforms (discovered
# via filesystem scan) or runtime-registered plugin platforms.
global _Platform__bundled_plugin_names
if _Platform__bundled_plugin_names is None:
_Platform__bundled_plugin_names = cls._scan_bundled_plugin_platforms()
if value in _Platform__bundled_plugin_names:
pseudo = object.__new__(cls)
pseudo._value_ = value
pseudo._name_ = value.upper().replace("-", "_").replace(" ", "_")
cls._value2member_map_[value] = pseudo
cls._member_map_[pseudo._name_] = pseudo
return pseudo
# Runtime-registered plugins (e.g. user-installed, discovered after
# the enum was defined).
try:
from gateway.platform_registry import platform_registry
if platform_registry.is_registered(value):
pseudo = object.__new__(cls)
pseudo._value_ = value
pseudo._name_ = value.upper().replace("-", "_").replace(" ", "_")
cls._value2member_map_[value] = pseudo
cls._member_map_[pseudo._name_] = pseudo
return pseudo
except Exception:
pass
return None
@classmethod
def _scan_bundled_plugin_platforms(cls) -> set:
"""Return names of bundled platform plugins under ``plugins/platforms/``."""
names: set = set()
try:
platforms_dir = Path(__file__).parent.parent / "plugins" / "platforms"
if platforms_dir.is_dir():
for child in platforms_dir.iterdir():
if (
child.is_dir()
and (child / "__init__.py").exists()
and (
(child / "plugin.yaml").exists()
or (child / "plugin.yml").exists()
)
):
names.add(child.name.lower())
except Exception:
pass
return names
# Snapshot of built-in platform values before any dynamic _missing_ lookups.
# Used to distinguish real platforms from arbitrary strings.
_BUILTIN_PLATFORM_VALUES = frozenset(m.value for m in Platform.__members__.values())
@dataclass
@@ -186,24 +75,18 @@ class HomeChannel:
Default destination for a platform.
When a cron job specifies deliver="telegram" without a specific chat ID,
messages are sent to this home channel. Thread-aware platforms may also
store a thread/topic ID so the bare platform target routes to the exact
conversation where /sethome was run.
messages are sent to this home channel.
"""
platform: Platform
chat_id: str
name: str # Human-readable name for display
thread_id: Optional[str] = None
def to_dict(self) -> Dict[str, Any]:
result = {
return {
"platform": self.platform.value,
"chat_id": self.chat_id,
"name": self.name,
}
if self.thread_id:
result["thread_id"] = self.thread_id
return result
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "HomeChannel":
@@ -211,7 +94,6 @@ class HomeChannel:
platform=Platform(data["platform"]),
chat_id=str(data["chat_id"]),
name=data.get("name", "Home"),
thread_id=str(data["thread_id"]) if data.get("thread_id") else None,
)
@@ -313,14 +195,6 @@ class StreamingConfig:
edit_interval: float = 1.0 # Seconds between message edits (Telegram rate-limits at ~1/s)
buffer_threshold: int = 40 # Chars before forcing an edit
cursor: str = "" # Cursor shown during streaming
# Ported from openclaw/openclaw#72038. When >0, the final edit for
# a long-running streamed response is delivered as a fresh message
# if the original preview has been visible for at least this many
# seconds, so the platform's visible timestamp reflects completion
# time instead of the preview creation time. Currently applied to
# Telegram only (other platforms ignore the setting). Default 60s
# matches the OpenClaw rollout. Set to 0 to disable.
fresh_final_after_seconds: float = 60.0
def to_dict(self) -> Dict[str, Any]:
return {
@@ -329,7 +203,6 @@ class StreamingConfig:
"edit_interval": self.edit_interval,
"buffer_threshold": self.buffer_threshold,
"cursor": self.cursor,
"fresh_final_after_seconds": self.fresh_final_after_seconds,
}
@classmethod
@@ -337,55 +210,14 @@ class StreamingConfig:
if not data:
return cls()
return cls(
enabled=_coerce_bool(data.get("enabled"), False),
enabled=data.get("enabled", False),
transport=data.get("transport", "edit"),
edit_interval=_coerce_float(data.get("edit_interval"), 1.0),
buffer_threshold=_coerce_int(data.get("buffer_threshold"), 40),
edit_interval=float(data.get("edit_interval", 1.0)),
buffer_threshold=int(data.get("buffer_threshold", 40)),
cursor=data.get("cursor", ""),
fresh_final_after_seconds=_coerce_float(
data.get("fresh_final_after_seconds"), 60.0
),
)
# -----------------------------------------------------------------------------
# Built-in platform connection checkers
# -----------------------------------------------------------------------------
# Each callable receives a ``PlatformConfig`` and returns ``True`` when the
# platform is sufficiently configured to be considered "connected". Platforms
# that rely on the generic ``token or api_key`` check (Telegram, Discord,
# Slack, Matrix, Mattermost, HomeAssistant) do not need an entry here.
_PLATFORM_CONNECTED_CHECKERS: dict[Platform, Callable[[PlatformConfig], bool]] = {
Platform.WEIXIN: lambda cfg: bool(
cfg.extra.get("account_id") and (cfg.token or cfg.extra.get("token"))
),
Platform.WHATSAPP: lambda cfg: True, # bridge handles auth
Platform.SIGNAL: lambda cfg: bool(cfg.extra.get("http_url")),
Platform.EMAIL: lambda cfg: bool(cfg.extra.get("address")),
Platform.SMS: lambda cfg: bool(os.getenv("TWILIO_ACCOUNT_SID")),
Platform.API_SERVER: lambda cfg: True,
Platform.WEBHOOK: lambda cfg: True,
Platform.FEISHU: lambda cfg: bool(cfg.extra.get("app_id")),
Platform.WECOM: lambda cfg: bool(cfg.extra.get("bot_id")),
Platform.WECOM_CALLBACK: lambda cfg: bool(
cfg.extra.get("corp_id") or cfg.extra.get("apps")
),
Platform.BLUEBUBBLES: lambda cfg: bool(
cfg.extra.get("server_url") and cfg.extra.get("password")
),
Platform.QQBOT: lambda cfg: bool(
cfg.extra.get("app_id") and cfg.extra.get("client_secret")
),
Platform.YUANBAO: lambda cfg: bool(
cfg.extra.get("app_id") and cfg.extra.get("app_secret")
),
Platform.DINGTALK: lambda cfg: bool(
(cfg.extra.get("client_id") or os.getenv("DINGTALK_CLIENT_ID"))
and (cfg.extra.get("client_secret") or os.getenv("DINGTALK_CLIENT_SECRET"))
),
}
@dataclass
class GatewayConfig:
"""
@@ -439,43 +271,58 @@ class GatewayConfig:
for platform, config in self.platforms.items():
if not config.enabled:
continue
if self._is_platform_connected(platform, config):
# Weixin requires both a token and an account_id
if platform == Platform.WEIXIN:
if config.extra.get("account_id") and (config.token or config.extra.get("token")):
connected.append(platform)
continue
# Platforms that use token/api_key auth
if config.token or config.api_key:
connected.append(platform)
# WhatsApp uses enabled flag only (bridge handles auth)
elif platform == Platform.WHATSAPP:
connected.append(platform)
# Signal uses extra dict for config (http_url + account)
elif platform == Platform.SIGNAL and config.extra.get("http_url"):
connected.append(platform)
# Email uses extra dict for config (address + imap_host + smtp_host)
elif platform == Platform.EMAIL and config.extra.get("address"):
connected.append(platform)
# SMS uses api_key (Twilio auth token) — SID checked via env
elif platform == Platform.SMS and os.getenv("TWILIO_ACCOUNT_SID"):
connected.append(platform)
# API Server uses enabled flag only (no token needed)
elif platform == Platform.API_SERVER:
connected.append(platform)
# Webhook uses enabled flag only (secrets are per-route)
elif platform == Platform.WEBHOOK:
connected.append(platform)
# Feishu uses extra dict for app credentials
elif platform == Platform.FEISHU and config.extra.get("app_id"):
connected.append(platform)
# WeCom bot mode uses extra dict for bot credentials
elif platform == Platform.WECOM and config.extra.get("bot_id"):
connected.append(platform)
# WeCom callback mode uses corp_id or apps list
elif platform == Platform.WECOM_CALLBACK and (
config.extra.get("corp_id") or config.extra.get("apps")
):
connected.append(platform)
# BlueBubbles uses extra dict for local server config
elif platform == Platform.BLUEBUBBLES and config.extra.get("server_url") and config.extra.get("password"):
connected.append(platform)
# QQBot uses extra dict for app credentials
elif platform == Platform.QQBOT and config.extra.get("app_id") and config.extra.get("client_secret"):
connected.append(platform)
# DingTalk uses client_id/client_secret from config.extra or env vars
elif platform == Platform.DINGTALK and (
config.extra.get("client_id") or os.getenv("DINGTALK_CLIENT_ID")
) and (
config.extra.get("client_secret") or os.getenv("DINGTALK_CLIENT_SECRET")
):
connected.append(platform)
return connected
def _is_platform_connected(self, platform: Platform, config: PlatformConfig) -> bool:
"""Check whether a single platform is sufficiently configured."""
# Weixin requires both a token and an account_id (checked first so
# the generic token branch doesn't let it through without account_id).
if platform == Platform.WEIXIN:
return bool(
config.extra.get("account_id")
and (config.token or config.extra.get("token"))
)
# Generic token/api_key auth covers Telegram, Discord, Slack, etc.
if config.token or config.api_key:
return True
# Platform-specific check
checker = _PLATFORM_CONNECTED_CHECKERS.get(platform)
if checker is not None:
return checker(config)
# Plugin-registered platforms
try:
from gateway.platform_registry import platform_registry
entry = platform_registry.get(platform.value)
if entry:
if entry.is_connected is not None:
return entry.is_connected(config)
if entry.validate_config is not None:
return entry.validate_config(config)
return True
except Exception:
pass # Registry not yet initialised during early import
return False
def get_home_channel(self, platform: Platform) -> Optional[HomeChannel]:
"""Get the home channel for a platform."""
@@ -608,17 +455,6 @@ class GatewayConfig:
)
return self.unauthorized_dm_behavior
def get_notice_delivery(self, platform: Optional[Platform] = None) -> str:
"""Return the effective notice-delivery mode for a platform."""
if platform:
platform_cfg = self.platforms.get(platform)
if platform_cfg and "notice_delivery" in platform_cfg.extra:
return _normalize_notice_delivery(
platform_cfg.extra.get("notice_delivery"),
"public",
)
return "public"
def load_gateway_config() -> GatewayConfig:
"""
@@ -714,8 +550,6 @@ def load_gateway_config() -> GatewayConfig:
existing = {}
# Deep-merge extra dicts so gateway.json defaults survive
merged_extra = {**existing.get("extra", {}), **plat_block.get("extra", {})}
if plat_name == Platform.SLACK.value and "enabled" in plat_block:
merged_extra["_enabled_explicit"] = True
merged = {**existing, **plat_block}
if merged_extra:
merged["extra"] = merged_extra
@@ -734,15 +568,8 @@ def load_gateway_config() -> GatewayConfig:
platform_cfg.get("unauthorized_dm_behavior"),
gw_data.get("unauthorized_dm_behavior", "pair"),
)
if "notice_delivery" in platform_cfg:
bridged["notice_delivery"] = _normalize_notice_delivery(
platform_cfg.get("notice_delivery"),
"public",
)
if "reply_prefix" in platform_cfg:
bridged["reply_prefix"] = platform_cfg["reply_prefix"]
if "reply_in_thread" in platform_cfg:
bridged["reply_in_thread"] = platform_cfg["reply_in_thread"]
if "require_mention" in platform_cfg:
bridged["require_mention"] = platform_cfg["require_mention"]
if "free_response_channels" in platform_cfg:
@@ -757,7 +584,7 @@ def load_gateway_config() -> GatewayConfig:
bridged["group_policy"] = platform_cfg["group_policy"]
if "group_allow_from" in platform_cfg:
bridged["group_allow_from"] = platform_cfg["group_allow_from"]
if plat in (Platform.DISCORD, Platform.SLACK) and "channel_skill_bindings" in platform_cfg:
if plat == Platform.DISCORD and "channel_skill_bindings" in platform_cfg:
bridged["channel_skill_bindings"] = platform_cfg["channel_skill_bindings"]
if "channel_prompts" in platform_cfg:
channel_prompts = platform_cfg["channel_prompts"]
@@ -765,21 +592,16 @@ def load_gateway_config() -> GatewayConfig:
bridged["channel_prompts"] = {str(k): v for k, v in channel_prompts.items()}
else:
bridged["channel_prompts"] = channel_prompts
enabled_was_explicit = "enabled" in platform_cfg
if not bridged and not enabled_was_explicit:
if not bridged:
continue
plat_data = platforms_data.setdefault(plat.value, {})
if not isinstance(plat_data, dict):
plat_data = {}
platforms_data[plat.value] = plat_data
if enabled_was_explicit:
plat_data["enabled"] = platform_cfg["enabled"]
extra = plat_data.setdefault("extra", {})
if not isinstance(extra, dict):
extra = {}
plat_data["extra"] = extra
if plat == Platform.SLACK and enabled_was_explicit:
extra["_enabled_explicit"] = True
extra.update(bridged)
# Slack settings → env vars (env vars take precedence)
@@ -787,8 +609,6 @@ def load_gateway_config() -> GatewayConfig:
if isinstance(slack_cfg, dict):
if "require_mention" in slack_cfg and not os.getenv("SLACK_REQUIRE_MENTION"):
os.environ["SLACK_REQUIRE_MENTION"] = str(slack_cfg["require_mention"]).lower()
if "strict_mention" in slack_cfg and not os.getenv("SLACK_STRICT_MENTION"):
os.environ["SLACK_STRICT_MENTION"] = str(slack_cfg["strict_mention"]).lower()
if "allow_bots" in slack_cfg and not os.getenv("SLACK_ALLOW_BOTS"):
os.environ["SLACK_ALLOW_BOTS"] = str(slack_cfg["allow_bots"]).lower()
frc = slack_cfg.get("free_response_channels")
@@ -846,25 +666,11 @@ def load_gateway_config() -> GatewayConfig:
if yaml_key in allow_mentions_cfg and not os.getenv(env_key):
os.environ[env_key] = str(allow_mentions_cfg[yaml_key]).lower()
# Bridge top-level require_mention to Telegram when the telegram: section
# does not already provide one. Users often write "require_mention: true"
# at the top level alongside group_sessions_per_user, expecting it to work
# the same way (#3979).
_tl_require_mention = yaml_cfg.get("require_mention")
if _tl_require_mention is not None:
_tg_section = yaml_cfg.get("telegram") or {}
if "require_mention" not in _tg_section:
_tg_plat = platforms_data.setdefault(Platform.TELEGRAM.value, {})
_tg_extra = _tg_plat.setdefault("extra", {})
_tg_extra.setdefault("require_mention", _tl_require_mention)
# Telegram settings → env vars (env vars take precedence)
telegram_cfg = yaml_cfg.get("telegram", {})
if isinstance(telegram_cfg, dict):
# Prefer telegram.require_mention; fall back to the top-level shorthand.
_effective_rm = telegram_cfg.get("require_mention", yaml_cfg.get("require_mention"))
if _effective_rm is not None and not os.getenv("TELEGRAM_REQUIRE_MENTION"):
os.environ["TELEGRAM_REQUIRE_MENTION"] = str(_effective_rm).lower()
if "require_mention" in telegram_cfg and not os.getenv("TELEGRAM_REQUIRE_MENTION"):
os.environ["TELEGRAM_REQUIRE_MENTION"] = str(telegram_cfg["require_mention"]).lower()
if "mention_patterns" in telegram_cfg and not os.getenv("TELEGRAM_MENTION_PATTERNS"):
os.environ["TELEGRAM_MENTION_PATTERNS"] = json.dumps(telegram_cfg["mention_patterns"])
frc = telegram_cfg.get("free_response_chats")
@@ -881,21 +687,11 @@ def load_gateway_config() -> GatewayConfig:
os.environ["TELEGRAM_REACTIONS"] = str(telegram_cfg["reactions"]).lower()
if "proxy_url" in telegram_cfg and not os.getenv("TELEGRAM_PROXY"):
os.environ["TELEGRAM_PROXY"] = str(telegram_cfg["proxy_url"]).strip()
allowed_users = telegram_cfg.get("allow_from")
if allowed_users is not None and not os.getenv("TELEGRAM_ALLOWED_USERS"):
if isinstance(allowed_users, list):
allowed_users = ",".join(str(v) for v in allowed_users)
os.environ["TELEGRAM_ALLOWED_USERS"] = str(allowed_users)
group_allowed_users = telegram_cfg.get("group_allow_from")
if group_allowed_users is not None and not os.getenv("TELEGRAM_GROUP_ALLOWED_USERS"):
if isinstance(group_allowed_users, list):
group_allowed_users = ",".join(str(v) for v in group_allowed_users)
os.environ["TELEGRAM_GROUP_ALLOWED_USERS"] = str(group_allowed_users)
group_allowed_chats = telegram_cfg.get("group_allowed_chats")
if group_allowed_chats is not None and not os.getenv("TELEGRAM_GROUP_ALLOWED_CHATS"):
if isinstance(group_allowed_chats, list):
group_allowed_chats = ",".join(str(v) for v in group_allowed_chats)
os.environ["TELEGRAM_GROUP_ALLOWED_CHATS"] = str(group_allowed_chats)
if "group_allowed_chats" in telegram_cfg and not os.getenv("TELEGRAM_GROUP_ALLOWED_USERS"):
gac = telegram_cfg["group_allowed_chats"]
if isinstance(gac, list):
gac = ",".join(str(v) for v in gac)
os.environ["TELEGRAM_GROUP_ALLOWED_USERS"] = str(gac)
if "disable_link_previews" in telegram_cfg:
plat_data = platforms_data.setdefault(Platform.TELEGRAM.value, {})
if not isinstance(plat_data, dict):
@@ -966,12 +762,6 @@ def load_gateway_config() -> GatewayConfig:
if "dm_mention_threads" in matrix_cfg and not os.getenv("MATRIX_DM_MENTION_THREADS"):
os.environ["MATRIX_DM_MENTION_THREADS"] = str(matrix_cfg["dm_mention_threads"]).lower()
# Feishu settings → env vars (env vars take precedence)
feishu_cfg = yaml_cfg.get("feishu", {})
if isinstance(feishu_cfg, dict):
if "allow_bots" in feishu_cfg and not os.getenv("FEISHU_ALLOW_BOTS"):
os.environ["FEISHU_ALLOW_BOTS"] = str(feishu_cfg["allow_bots"]).lower()
except Exception as e:
logger.warning(
"Failed to process config.yaml — falling back to .env / gateway.json values. "
@@ -1092,7 +882,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.TELEGRAM,
chat_id=telegram_home,
name=os.getenv("TELEGRAM_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("TELEGRAM_HOME_CHANNEL_THREAD_ID") or None,
)
# Discord
@@ -1109,7 +898,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.DISCORD,
chat_id=discord_home,
name=os.getenv("DISCORD_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("DISCORD_HOME_CHANNEL_THREAD_ID") or None,
)
# Reply threading mode for Discord (off/first/all)
@@ -1125,33 +913,13 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
if Platform.WHATSAPP not in config.platforms:
config.platforms[Platform.WHATSAPP] = PlatformConfig()
config.platforms[Platform.WHATSAPP].enabled = True
whatsapp_home = os.getenv("WHATSAPP_HOME_CHANNEL")
if whatsapp_home and Platform.WHATSAPP in config.platforms:
config.platforms[Platform.WHATSAPP].home_channel = HomeChannel(
platform=Platform.WHATSAPP,
chat_id=whatsapp_home,
name=os.getenv("WHATSAPP_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("WHATSAPP_HOME_CHANNEL_THREAD_ID") or None,
)
# Slack
slack_token = os.getenv("SLACK_BOT_TOKEN")
if slack_token:
if Platform.SLACK not in config.platforms:
# No yaml config for Slack — env-only setup, enable it
config.platforms[Platform.SLACK] = PlatformConfig()
config.platforms[Platform.SLACK].enabled = True
else:
slack_config = config.platforms[Platform.SLACK]
enabled_was_explicit = bool(slack_config.extra.pop("_enabled_explicit", False))
if not slack_config.enabled and not enabled_was_explicit:
# Top-level Slack settings such as channel prompts should not
# turn an env-token setup into a disabled platform. Only an
# explicit slack.enabled/platforms.slack.enabled false should.
slack_config.enabled = True
# If yaml config exists, respect its enabled flag (don't override
# explicit enabled: false). Token is still stored so skills that
# send Slack messages can use it without activating the gateway adapter.
config.platforms[Platform.SLACK].enabled = True
config.platforms[Platform.SLACK].token = slack_token
slack_home = os.getenv("SLACK_HOME_CHANNEL")
if slack_home and Platform.SLACK in config.platforms:
@@ -1159,7 +927,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.SLACK,
chat_id=slack_home,
name=os.getenv("SLACK_HOME_CHANNEL_NAME", ""),
thread_id=os.getenv("SLACK_HOME_CHANNEL_THREAD_ID") or None,
)
# Signal
@@ -1180,7 +947,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.SIGNAL,
chat_id=signal_home,
name=os.getenv("SIGNAL_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("SIGNAL_HOME_CHANNEL_THREAD_ID") or None,
)
# Mattermost
@@ -1200,7 +966,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.MATTERMOST,
chat_id=mattermost_home,
name=os.getenv("MATTERMOST_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("MATTERMOST_HOME_CHANNEL_THREAD_ID") or None,
)
# Matrix
@@ -1232,7 +997,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.MATRIX,
chat_id=matrix_home,
name=os.getenv("MATRIX_HOME_ROOM_NAME", "Home"),
thread_id=os.getenv("MATRIX_HOME_ROOM_THREAD_ID") or None,
)
# Home Assistant
@@ -1266,7 +1030,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.EMAIL,
chat_id=email_home,
name=os.getenv("EMAIL_HOME_ADDRESS_NAME", "Home"),
thread_id=os.getenv("EMAIL_HOME_ADDRESS_THREAD_ID") or None,
)
# SMS (Twilio)
@@ -1282,7 +1045,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.SMS,
chat_id=sms_home,
name=os.getenv("SMS_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("SMS_HOME_CHANNEL_THREAD_ID") or None,
)
# API Server
@@ -1345,7 +1107,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.DINGTALK,
chat_id=dingtalk_home,
name=os.getenv("DINGTALK_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("DINGTALK_HOME_CHANNEL_THREAD_ID") or None,
)
# Feishu / Lark
@@ -1373,7 +1134,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.FEISHU,
chat_id=feishu_home,
name=os.getenv("FEISHU_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("FEISHU_HOME_CHANNEL_THREAD_ID") or None,
)
# WeCom (Enterprise WeChat)
@@ -1396,7 +1156,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.WECOM,
chat_id=wecom_home,
name=os.getenv("WECOM_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("WECOM_HOME_CHANNEL_THREAD_ID") or None,
)
# WeCom callback mode (self-built apps)
@@ -1455,7 +1214,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.WEIXIN,
chat_id=weixin_home,
name=os.getenv("WEIXIN_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("WEIXIN_HOME_CHANNEL_THREAD_ID") or None,
)
# BlueBubbles (iMessage)
@@ -1479,7 +1237,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.BLUEBUBBLES,
chat_id=bluebubbles_home,
name=os.getenv("BLUEBUBBLES_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("BLUEBUBBLES_HOME_CHANNEL_THREAD_ID") or None,
)
# QQ (Official Bot API v2)
@@ -1517,56 +1274,8 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.QQBOT,
chat_id=qq_home,
name=os.getenv("QQBOT_HOME_CHANNEL_NAME") or os.getenv(qq_home_name_env, "Home"),
thread_id=(
os.getenv("QQBOT_HOME_CHANNEL_THREAD_ID")
or os.getenv("QQ_HOME_CHANNEL_THREAD_ID")
or None
),
)
# Yuanbao — YUANBAO_APP_ID preferred
yuanbao_app_id = os.getenv("YUANBAO_APP_ID") or os.getenv("YUANBAO_APP_KEY")
yuanbao_app_secret = os.getenv("YUANBAO_APP_SECRET")
if yuanbao_app_id and yuanbao_app_secret:
if Platform.YUANBAO not in config.platforms:
config.platforms[Platform.YUANBAO] = PlatformConfig()
config.platforms[Platform.YUANBAO].enabled = True
extra = config.platforms[Platform.YUANBAO].extra
extra["app_id"] = yuanbao_app_id
extra["app_secret"] = yuanbao_app_secret
yuanbao_bot_id = os.getenv("YUANBAO_BOT_ID")
if yuanbao_bot_id:
extra["bot_id"] = yuanbao_bot_id
yuanbao_ws_url = os.getenv("YUANBAO_WS_URL")
if yuanbao_ws_url:
extra["ws_url"] = yuanbao_ws_url
yuanbao_api_domain = os.getenv("YUANBAO_API_DOMAIN")
if yuanbao_api_domain:
extra["api_domain"] = yuanbao_api_domain
yuanbao_route_env = os.getenv("YUANBAO_ROUTE_ENV")
if yuanbao_route_env:
extra["route_env"] = yuanbao_route_env
yuanbao_home = os.getenv("YUANBAO_HOME_CHANNEL")
if yuanbao_home:
config.platforms[Platform.YUANBAO].home_channel = HomeChannel(
platform=Platform.YUANBAO,
chat_id=yuanbao_home,
name=os.getenv("YUANBAO_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("YUANBAO_HOME_CHANNEL_THREAD_ID") or None,
)
yuanbao_dm_policy = os.getenv("YUANBAO_DM_POLICY")
if yuanbao_dm_policy:
extra["dm_policy"] = yuanbao_dm_policy.strip().lower()
yuanbao_dm_allow_from = os.getenv("YUANBAO_DM_ALLOW_FROM")
if yuanbao_dm_allow_from:
extra["dm_allow_from"] = yuanbao_dm_allow_from
yuanbao_group_policy = os.getenv("YUANBAO_GROUP_POLICY")
if yuanbao_group_policy:
extra["group_policy"] = yuanbao_group_policy.strip().lower()
yuanbao_group_allow_from = os.getenv("YUANBAO_GROUP_ALLOW_FROM")
if yuanbao_group_allow_from:
extra["group_allow_from"] = yuanbao_group_allow_from
# Session settings
idle_minutes = os.getenv("SESSION_IDLE_MINUTES")
if idle_minutes:
@@ -1581,25 +1290,3 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.default_reset_policy.at_hour = int(reset_hour)
except ValueError:
pass
# Registry-driven enable for plugin platforms. Built-ins have explicit
# blocks above; plugins expose check_fn() which is the single source of
# truth for "are my env vars set?". When it returns True, ensure the
# platform is enabled so start() will create its adapter.
try:
from hermes_cli.plugins import discover_plugins
discover_plugins() # idempotent
from gateway.platform_registry import platform_registry
for entry in platform_registry.plugin_entries():
try:
if not entry.check_fn():
continue
except Exception as e:
logger.debug("check_fn for %s raised: %s", entry.name, e)
continue
platform = Platform(entry.name)
if platform not in config.platforms:
config.platforms[platform] = PlatformConfig()
config.platforms[platform].enabled = True
except Exception as e:
logger.debug("Plugin platform enable pass failed: %s", e)
+7 -9
View File
@@ -53,10 +53,9 @@ class DeliveryTarget:
- "telegram" Telegram home channel
- "telegram:123456" specific Telegram chat
"""
target_stripped = target.strip()
target_lower = target_stripped.lower()
target = target.strip().lower()
if target_lower == "origin":
if target == "origin":
if origin:
return cls(
platform=origin.platform,
@@ -68,14 +67,13 @@ class DeliveryTarget:
# Fallback to local if no origin
return cls(platform=Platform.LOCAL, is_origin=True)
if target_lower == "local":
if target == "local":
return cls(platform=Platform.LOCAL)
# Check for platform:chat_id or platform:chat_id:thread_id format
# Use the original case for chat_id/thread_id to preserve case-sensitive IDs
if ":" in target_stripped:
parts = target_stripped.split(":", 2)
platform_str = parts[0].lower() # Platform names are case-insensitive
if ":" in target:
parts = target.split(":", 2)
platform_str = parts[0]
chat_id = parts[1] if len(parts) > 1 else None
thread_id = parts[2] if len(parts) > 2 else None
try:
@@ -87,7 +85,7 @@ class DeliveryTarget:
# Just a platform name (use home channel)
try:
platform = Platform(target_lower)
platform = Platform(target)
return cls(platform=platform)
except ValueError:
# Unknown platform, treat as local
+1 -3
View File
@@ -79,9 +79,7 @@ _PLATFORM_DEFAULTS: dict[str, dict[str, Any]] = {
"discord": _TIER_HIGH,
# Tier 2 — edit support, often customer/workspace channels
# Slack: tool_progress off by default — Bolt posts cannot be edited like CLI;
# "new"/"all" spam permanent lines in channels (hermes-agent#14663).
"slack": {**_TIER_MEDIUM, "tool_progress": "off"},
"slack": _TIER_MEDIUM,
"mattermost": _TIER_MEDIUM,
"matrix": _TIER_MEDIUM,
"feishu": _TIER_MEDIUM,
+12 -6
View File
@@ -53,13 +53,19 @@ class HookRegistry:
return list(self._loaded_hooks)
def _register_builtin_hooks(self) -> None:
"""Register built-in hooks that are always active.
"""Register built-in hooks that are always active."""
try:
from gateway.builtin_hooks.boot_md import handle as boot_md_handle
Currently empty no shipped built-in hooks. Kept as the extension
point for future always-on gateway hooks so they drop in without
re-plumbing discover_and_load().
"""
return
self._handlers.setdefault("gateway:startup", []).append(boot_md_handle)
self._loaded_hooks.append({
"name": "boot-md",
"description": "Run ~/.hermes/BOOT.md on gateway startup",
"events": ["gateway:startup"],
"path": "(builtin)",
})
except Exception as e:
print(f"[hooks] Could not load built-in boot-md hook: {e}", flush=True)
def discover_and_load(self) -> None:
"""
+11 -57
View File
@@ -28,7 +28,6 @@ def mirror_to_session(
message_text: str,
source_label: str = "cli",
thread_id: Optional[str] = None,
user_id: Optional[str] = None,
) -> bool:
"""
Append a delivery-mirror message to the target session's transcript.
@@ -40,20 +39,9 @@ def mirror_to_session(
All errors are caught -- this is never fatal.
"""
try:
session_id = _find_session_id(
platform,
str(chat_id),
thread_id=thread_id,
user_id=user_id,
)
session_id = _find_session_id(platform, str(chat_id), thread_id=thread_id)
if not session_id:
logger.debug(
"Mirror: no session found for %s:%s:%s:%s",
platform,
chat_id,
thread_id,
user_id,
)
logger.debug("Mirror: no session found for %s:%s:%s", platform, chat_id, thread_id)
return False
mirror_msg = {
@@ -71,33 +59,17 @@ def mirror_to_session(
return True
except Exception as e:
logger.debug(
"Mirror failed for %s:%s:%s:%s: %s",
platform,
chat_id,
thread_id,
user_id,
e,
)
logger.debug("Mirror failed for %s:%s:%s: %s", platform, chat_id, thread_id, e)
return False
def _find_session_id(
platform: str,
chat_id: str,
thread_id: Optional[str] = None,
user_id: Optional[str] = None,
) -> Optional[str]:
def _find_session_id(platform: str, chat_id: str, thread_id: Optional[str] = None) -> Optional[str]:
"""
Find the active session_id for a platform + chat_id pair.
Scans sessions.json entries and matches where origin.chat_id == chat_id
on the right platform. DM session keys don't embed the chat_id
(e.g. "agent:main:telegram:dm"), so we check the origin dict.
When *user_id* is provided, prefer exact sender matches. If multiple
same-chat candidates exist and none matches the user, return None instead
of guessing and contaminating another participant's session.
"""
if not _SESSIONS_INDEX.exists():
return None
@@ -109,7 +81,8 @@ def _find_session_id(
return None
platform_lower = platform.lower()
candidates = []
best_match = None
best_updated = ""
for _key, entry in data.items():
origin = entry.get("origin") or {}
@@ -123,31 +96,12 @@ def _find_session_id(
origin_thread_id = origin.get("thread_id")
if thread_id is not None and str(origin_thread_id or "") != str(thread_id):
continue
candidates.append(entry)
updated = entry.get("updated_at", "")
if updated > best_updated:
best_updated = updated
best_match = entry.get("session_id")
if not candidates:
return None
if user_id:
exact_user_matches = [
entry for entry in candidates
if str((entry.get("origin") or {}).get("user_id") or "") == str(user_id)
]
if exact_user_matches:
candidates = exact_user_matches
elif len(candidates) > 1:
return None
elif len(candidates) > 1:
distinct_user_ids = {
str((entry.get("origin") or {}).get("user_id") or "").strip()
for entry in candidates
if str((entry.get("origin") or {}).get("user_id") or "").strip()
}
if len(distinct_user_ids) > 1:
return None
best_entry = max(candidates, key=lambda entry: entry.get("updated_at", ""))
return best_entry.get("session_id")
return best_match
def _append_to_jsonl(session_id: str, message: dict) -> None:
+1 -2
View File
@@ -28,7 +28,6 @@ from pathlib import Path
from typing import Optional
from hermes_constants import get_hermes_dir
from utils import atomic_replace
# Unambiguous alphabet -- excludes 0/O, 1/I to prevent confusion
@@ -60,7 +59,7 @@ def _secure_write(path: Path, data: str) -> None:
f.write(data)
f.flush()
os.fsync(f.fileno())
atomic_replace(tmp_path, path)
os.replace(tmp_path, str(path))
try:
os.chmod(path, 0o600)
except OSError:
-212
View File
@@ -1,212 +0,0 @@
"""
Platform Adapter Registry
Allows platform adapters (built-in and plugin) to self-register so the gateway
can discover and instantiate them without hardcoded if/elif chains.
Built-in adapters continue to use the existing if/elif in _create_adapter()
for now. Plugin adapters register here via PluginContext.register_platform()
and are looked up first -- if nothing is found the gateway falls through to
the legacy code path.
Usage (plugin side):
from gateway.platform_registry import platform_registry, PlatformEntry
platform_registry.register(PlatformEntry(
name="irc",
label="IRC",
adapter_factory=lambda cfg: IRCAdapter(cfg),
check_fn=check_requirements,
validate_config=lambda cfg: bool(cfg.extra.get("server")),
required_env=["IRC_SERVER"],
install_hint="pip install irc",
))
Usage (gateway side):
adapter = platform_registry.create_adapter("irc", platform_config)
"""
import logging
from dataclasses import dataclass, field
from typing import Any, Callable, Optional
logger = logging.getLogger(__name__)
@dataclass
class PlatformEntry:
"""Metadata and factory for a single platform adapter."""
# Identifier used in config.yaml (e.g. "irc", "viber").
name: str
# Human-readable label (e.g. "IRC", "Viber").
label: str
# Factory callable: receives a PlatformConfig, returns an adapter instance.
# Using a factory instead of a bare class lets plugins do custom init
# (e.g. passing extra kwargs, wrapping in try/except).
adapter_factory: Callable[[Any], Any]
# Returns True when the platform's dependencies are available.
check_fn: Callable[[], bool]
# Optional: given a PlatformConfig, is it properly configured?
# If None, the registry skips config validation and lets the adapter
# fail at connect() time with a descriptive error.
validate_config: Optional[Callable[[Any], bool]] = None
# Optional: given a PlatformConfig, is the platform connected/enabled?
# Used by ``GatewayConfig.get_connected_platforms()`` and setup UI status.
# If None, falls back to ``validate_config`` or ``check_fn``.
is_connected: Optional[Callable[[Any], bool]] = None
# Env vars this platform needs (for ``hermes setup`` display).
required_env: list = field(default_factory=list)
# Hint shown when check_fn returns False.
install_hint: str = ""
# Optional setup function for interactive configuration.
# Signature: () -> None (prompts user, saves env vars).
# If None, falls back to _setup_standard_platform (needs token_var + vars)
# or a generic "set these env vars" display.
setup_fn: Optional[Callable[[], None]] = None
# "builtin" or "plugin"
source: str = "plugin"
# Name of the plugin manifest that registered this entry (empty for
# built-ins). Used by ``hermes gateway setup`` to auto-enable the
# owning plugin when the user configures its platform.
plugin_name: str = ""
# ── Auth env var names (for _is_user_authorized integration) ──
# E.g. "IRC_ALLOWED_USERS" — checked for comma-separated user IDs.
allowed_users_env: str = ""
# E.g. "IRC_ALLOW_ALL_USERS" — if truthy, all users authorized.
allow_all_env: str = ""
# ── Message limits ──
# Max message length for smart-chunking. 0 = no limit.
max_message_length: int = 0
# ── Privacy ──
# If True, session descriptions redact PII (phone numbers, etc.)
pii_safe: bool = False
# ── Display ──
# Emoji for CLI/gateway display (e.g. "💬")
emoji: str = "🔌"
# Whether this platform should appear in _UPDATE_ALLOWED_PLATFORMS
# (allows /update command from this platform).
allow_update_command: bool = True
# ── LLM guidance ──
# Platform hint injected into the system prompt (e.g. "You are on IRC.
# Do not use markdown."). Empty string = no hint.
platform_hint: str = ""
class PlatformRegistry:
"""Central registry of platform adapters.
Thread-safe for reads (dict lookups are atomic under GIL).
Writes happen at startup during sequential discovery.
"""
def __init__(self) -> None:
self._entries: dict[str, PlatformEntry] = {}
def register(self, entry: PlatformEntry) -> None:
"""Register a platform adapter entry.
If an entry with the same name exists, it is replaced (last writer
wins -- this lets plugins override built-in adapters if desired).
"""
if entry.name in self._entries:
prev = self._entries[entry.name]
logger.info(
"Platform '%s' re-registered (was %s, now %s)",
entry.name,
prev.source,
entry.source,
)
self._entries[entry.name] = entry
logger.debug("Registered platform adapter: %s (%s)", entry.name, entry.source)
def unregister(self, name: str) -> bool:
"""Remove a platform entry. Returns True if it existed."""
return self._entries.pop(name, None) is not None
def get(self, name: str) -> Optional[PlatformEntry]:
"""Look up a platform entry by name."""
return self._entries.get(name)
def all_entries(self) -> list[PlatformEntry]:
"""Return all registered platform entries."""
return list(self._entries.values())
def plugin_entries(self) -> list[PlatformEntry]:
"""Return only plugin-registered platform entries."""
return [e for e in self._entries.values() if e.source == "plugin"]
def is_registered(self, name: str) -> bool:
return name in self._entries
def create_adapter(self, name: str, config: Any) -> Optional[Any]:
"""Create an adapter instance for the given platform name.
Returns None if:
- No entry registered for *name*
- check_fn() returns False (missing deps)
- validate_config() returns False (misconfigured)
- The factory raises an exception
"""
entry = self._entries.get(name)
if entry is None:
return None
if not entry.check_fn():
hint = f" ({entry.install_hint})" if entry.install_hint else ""
logger.warning(
"Platform '%s' requirements not met%s",
entry.label,
hint,
)
return None
if entry.validate_config is not None:
try:
if not entry.validate_config(config):
logger.warning(
"Platform '%s' config validation failed",
entry.label,
)
return None
except Exception as e:
logger.warning(
"Platform '%s' config validation error: %s",
entry.label,
e,
)
return None
try:
adapter = entry.adapter_factory(config)
return adapter
except Exception as e:
logger.error(
"Failed to create adapter for platform '%s': %s",
entry.label,
e,
exc_info=True,
)
return None
# Module-level singleton
platform_registry = PlatformRegistry()
+4 -25
View File
@@ -1,30 +1,9 @@
# Adding a New Messaging Platform
There are two ways to add a platform to the Hermes gateway:
## Plugin Path (Recommended for Community/Third-Party)
Create a plugin directory in `~/.hermes/plugins/` with a `PLUGIN.yaml` and
`adapter.py`. The adapter inherits from `BasePlatformAdapter` and registers
via `ctx.register_platform()` in the `register(ctx)` entry point. This
requires **zero changes to core Hermes code**.
The plugin system automatically handles: adapter creation, config parsing,
user authorization, cron delivery, send_message routing, system prompt hints,
status display, gateway setup, and more.
See `plugins/platforms/irc/` for a complete reference implementation, and
`website/docs/developer-guide/adding-platform-adapters.md` for the full
plugin guide with code examples.
---
## Built-in Path (Core Contributors Only)
Checklist for integrating a platform directly into the Hermes core.
Use this as a reference when building a built-in adapter — every item here
is a real integration point. Missing any of them will cause broken
functionality, missing features, or inconsistent behavior.
Checklist for integrating a new messaging platform into the Hermes gateway.
Use this as a reference when building a new adapter — every item here is a
real integration point that exists in the codebase. Missing any of them will
cause broken functionality, missing features, or inconsistent behavior.
---
-2
View File
@@ -10,12 +10,10 @@ Each adapter handles:
from .base import BasePlatformAdapter, MessageEvent, SendResult
from .qqbot import QQAdapter
from .yuanbao import YuanbaoAdapter
__all__ = [
"BasePlatformAdapter",
"MessageEvent",
"SendResult",
"QQAdapter",
"YuanbaoAdapter",
]
-84
View File
@@ -1,84 +0,0 @@
"""Shared HTTP client factory for long-lived platform adapters.
Gateway messaging platforms (QQ Bot, Feishu, WeCom, DingTalk, Signal,
BlueBubbles, WeCom-callback) keep a persistent ``httpx.AsyncClient``
alive for the adapter's lifetime. That amortises TLS/connection setup
across many API calls, but it also means the process's file-descriptor
pressure is sensitive to how aggressively the pool recycles idle keep-
alive connections.
httpx's default ``keepalive_expiry`` is 5 seconds. On macOS behind
Cloudflare Warp (and other transparent proxies), peer-initiated FIN can
sit in ``CLOSE_WAIT`` longer than that before the local socket actually
drains which, multiplied across 7 long-lived adapters plus the LLM
client and MCP clients, walks straight into the default 256 fd limit.
See #18451.
``platform_httpx_limits()`` returns a tighter ``httpx.Limits`` the
adapter factories use instead of the httpx default. The values chosen:
* ``max_keepalive_connections=10`` plenty for any single adapter;
platform APIs rarely parallelise beyond this.
* ``keepalive_expiry=2.0`` close idle sockets aggressively so a
proxy's lingering CLOSE_WAIT window can't starve the process.
Override via ``HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY`` /
``HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE`` env vars when tuning under load.
"""
from __future__ import annotations
import os
try:
import httpx
except ImportError: # pragma: no cover — optional dep
httpx = None # type: ignore[assignment]
_DEFAULT_KEEPALIVE_EXPIRY_S = 2.0
_DEFAULT_MAX_KEEPALIVE = 10
def platform_httpx_limits() -> "httpx.Limits | None":
"""Return ``httpx.Limits`` tuned for persistent platform-adapter clients.
Returns ``None`` when httpx isn't importable, so callers can fall
back to httpx's built-in default without a hard dependency on this
helper being reachable.
"""
if httpx is None:
return None
def _env_float(name: str, default: float) -> float:
raw = os.environ.get(name, "").strip()
if not raw:
return default
try:
val = float(raw)
except (TypeError, ValueError):
return default
return val if val > 0 else default
def _env_int(name: str, default: int) -> int:
raw = os.environ.get(name, "").strip()
if not raw:
return default
try:
val = int(raw)
except (TypeError, ValueError):
return default
return val if val > 0 else default
keepalive_expiry = _env_float(
"HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY", _DEFAULT_KEEPALIVE_EXPIRY_S
)
max_keepalive = _env_int(
"HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE", _DEFAULT_MAX_KEEPALIVE
)
return httpx.Limits(
max_keepalive_connections=max_keepalive,
# Leave max_connections at httpx default (100) — plenty of headroom.
keepalive_expiry=keepalive_expiry,
)
+60 -252
View File
@@ -7,9 +7,7 @@ Exposes an HTTP server with endpoints:
- GET /v1/responses/{response_id} Retrieve a stored response
- DELETE /v1/responses/{response_id} Delete a stored response
- GET /v1/models lists hermes-agent as an available model
- GET /v1/capabilities machine-readable API capabilities for external UIs
- POST /v1/runs start a run, returns run_id immediately (202)
- GET /v1/runs/{run_id} retrieve current run status
- GET /v1/runs/{run_id}/events SSE stream of structured lifecycle events
- POST /v1/runs/{run_id}/stop interrupt a running agent
- GET /health health check
@@ -62,14 +60,6 @@ MAX_NORMALIZED_TEXT_LENGTH = 65_536 # 64 KB cap for normalized content parts
MAX_CONTENT_LIST_SIZE = 1_000 # Max items when content is an array
def _coerce_port(value: Any, default: int = DEFAULT_PORT) -> int:
"""Parse a listen port without letting malformed env/config values crash startup."""
try:
return int(value)
except (TypeError, ValueError):
return default
def _normalize_chat_content(
content: Any, *, _max_depth: int = 10, _depth: int = 0,
) -> str:
@@ -581,10 +571,7 @@ class APIServerAdapter(BasePlatformAdapter):
super().__init__(config, Platform.API_SERVER)
extra = config.extra or {}
self._host: str = extra.get("host", os.getenv("API_SERVER_HOST", DEFAULT_HOST))
raw_port = extra.get("port")
if raw_port is None:
raw_port = os.getenv("API_SERVER_PORT", str(DEFAULT_PORT))
self._port: int = _coerce_port(raw_port, DEFAULT_PORT)
self._port: int = int(extra.get("port", os.getenv("API_SERVER_PORT", str(DEFAULT_PORT))))
self._api_key: str = extra.get("key", os.getenv("API_SERVER_KEY", ""))
self._cors_origins: tuple[str, ...] = self._parse_cors_origins(
extra.get("cors_origins", os.getenv("API_SERVER_CORS_ORIGINS", "")),
@@ -603,8 +590,6 @@ class APIServerAdapter(BasePlatformAdapter):
# Active run agent/task references for stop support
self._active_run_agents: Dict[str, Any] = {}
self._active_run_tasks: Dict[str, "asyncio.Task"] = {}
# Pollable run status for dashboards and external control-plane UIs.
self._run_statuses: Dict[str, Dict[str, Any]] = {}
self._session_db: Optional[Any] = None # Lazy-init SessionDB for session continuity
@staticmethod
@@ -738,11 +723,10 @@ class APIServerAdapter(BasePlatformAdapter):
gateway platforms), falling back to the hermes-api-server default.
"""
from run_agent import AIAgent
from gateway.run import _resolve_runtime_agent_kwargs, _resolve_gateway_model, _load_gateway_config, GatewayRunner
from gateway.run import _resolve_runtime_agent_kwargs, _resolve_gateway_model, _load_gateway_config
from hermes_cli.tools_config import _get_platform_tools
runtime_kwargs = _resolve_runtime_agent_kwargs()
reasoning_config = GatewayRunner._load_reasoning_config()
model = _resolve_gateway_model()
user_config = _load_gateway_config()
@@ -752,6 +736,7 @@ class APIServerAdapter(BasePlatformAdapter):
# Load fallback provider chain so the API server platform has the
# same fallback behaviour as Telegram/Discord/Slack (fixes #4954).
from gateway.run import GatewayRunner
fallback_model = GatewayRunner._load_fallback_model()
agent = AIAgent(
@@ -770,7 +755,6 @@ class APIServerAdapter(BasePlatformAdapter):
tool_complete_callback=tool_complete_callback,
session_db=self._ensure_session_db(),
fallback_model=fallback_model,
reasoning_config=reasoning_config,
)
return agent
@@ -824,51 +808,6 @@ class APIServerAdapter(BasePlatformAdapter):
],
})
async def _handle_capabilities(self, request: "web.Request") -> "web.Response":
"""GET /v1/capabilities — advertise the stable API surface.
External UIs and orchestrators use this endpoint to discover the API
server's plugin-safe contract without scraping docs or assuming that
every Hermes version exposes the same endpoints.
"""
auth_err = self._check_auth(request)
if auth_err:
return auth_err
return web.json_response({
"object": "hermes.api_server.capabilities",
"platform": "hermes-agent",
"model": self._model_name,
"auth": {
"type": "bearer",
"required": bool(self._api_key),
},
"features": {
"chat_completions": True,
"chat_completions_streaming": True,
"responses_api": True,
"responses_streaming": True,
"run_submission": True,
"run_status": True,
"run_events_sse": True,
"run_stop": True,
"tool_progress_events": True,
"session_continuity_header": "X-Hermes-Session-Id",
"cors": bool(self._cors_origins),
},
"endpoints": {
"health": {"method": "GET", "path": "/health"},
"health_detailed": {"method": "GET", "path": "/health/detailed"},
"models": {"method": "GET", "path": "/v1/models"},
"chat_completions": {"method": "POST", "path": "/v1/chat/completions"},
"responses": {"method": "POST", "path": "/v1/responses"},
"runs": {"method": "POST", "path": "/v1/runs"},
"run_status": {"method": "GET", "path": "/v1/runs/{run_id}"},
"run_events": {"method": "GET", "path": "/v1/runs/{run_id}/events"},
"run_stop": {"method": "POST", "path": "/v1/runs/{run_id}/stop"},
},
})
async def _handle_chat_completions(self, request: "web.Request") -> "web.Response":
"""POST /v1/chat/completions — OpenAI Chat Completions format."""
auth_err = self._check_auth(request)
@@ -993,62 +932,39 @@ class APIServerAdapter(BasePlatformAdapter):
if delta is not None:
_stream_q.put(delta)
# Track which tool_call_ids we've emitted a "running" lifecycle
# event for, so a "completed" event without a matching "running"
# (e.g. internal/filtered tools) is silently dropped instead of
# producing an orphaned event clients can't correlate.
_started_tool_call_ids: set[str] = set()
def _on_tool_progress(event_type, name, preview, args, **kwargs):
"""Send tool progress as a separate SSE event.
def _on_tool_start(tool_call_id, function_name, function_args):
"""Emit ``hermes.tool.progress`` with ``status: running``.
Previously, progress markers like `` list`` were injected
directly into ``delta.content``. OpenAI-compatible frontends
(Open WebUI, LobeChat, ) store ``delta.content`` verbatim as
the assistant message and send it back on subsequent requests.
After enough turns the model learns to *emit* the markers as
plain text instead of issuing real tool calls silently
hallucinating tool results. See #6972.
Replaces the old ``tool_progress_callback("tool.started",
...)`` emit so SSE consumers receive a single event per
tool start, carrying both the legacy ``tool``/``emoji``/
``label`` payload (for #6972 frontends) and the new
``toolCallId``/``status`` correlation fields (#16588).
Skips tools whose names start with ``_`` so internal
events (``_thinking``, ) stay off the wire matching
the prior ``_on_tool_progress`` filter exactly.
The fix: push a tagged tuple ``("__tool_progress__", payload)``
onto the stream queue. The SSE writer emits it as a custom
``event: hermes.tool.progress`` line that compliant frontends
can render for UX but will *not* persist into conversation
history. Clients that don't understand the custom event type
silently ignore it per the SSE specification.
"""
if not tool_call_id or function_name.startswith("_"):
if event_type != "tool.started":
return
_started_tool_call_ids.add(tool_call_id)
from agent.display import build_tool_preview, get_tool_emoji
label = build_tool_preview(function_name, function_args) or function_name
if name.startswith("_"):
return
from agent.display import get_tool_emoji
emoji = get_tool_emoji(name)
label = preview or name
_stream_q.put(("__tool_progress__", {
"tool": function_name,
"emoji": get_tool_emoji(function_name),
"tool": name,
"emoji": emoji,
"label": label,
"toolCallId": tool_call_id,
"status": "running",
}))
def _on_tool_complete(tool_call_id, function_name, function_args, function_result):
"""Emit the matching ``status: completed`` event.
Dropped if the start was filtered (internal tool, missing
id, or never seen) so clients never get an orphaned
``completed`` they can't correlate to a prior ``running``.
"""
if not tool_call_id or tool_call_id not in _started_tool_call_ids:
return
_started_tool_call_ids.discard(tool_call_id)
_stream_q.put(("__tool_progress__", {
"tool": function_name,
"toolCallId": tool_call_id,
"status": "completed",
}))
# Start agent in background. agent_ref is a mutable container
# so the SSE writer can interrupt the agent on client disconnect.
#
# ``tool_progress_callback`` is intentionally not wired here:
# it would duplicate every emit because ``run_agent`` fires it
# side-by-side with ``tool_start_callback``/``tool_complete_callback``.
# The structured callbacks are strictly richer (they carry the
# tool_call id), so they own the chat-completions SSE channel.
agent_ref = [None]
agent_task = asyncio.ensure_future(self._run_agent(
user_message=user_message,
@@ -1056,8 +972,7 @@ class APIServerAdapter(BasePlatformAdapter):
ephemeral_system_prompt=system_prompt,
session_id=session_id,
stream_delta_callback=_on_delta,
tool_start_callback=_on_tool_start,
tool_complete_callback=_on_tool_complete,
tool_progress_callback=_on_tool_progress,
agent_ref=agent_ref,
))
@@ -1172,8 +1087,7 @@ class APIServerAdapter(BasePlatformAdapter):
Tagged tuples ``("__tool_progress__", payload)`` are sent
as a custom ``event: hermes.tool.progress`` SSE event so
frontends can display them without storing the markers in
conversation history. See #6972 for the original event,
#16588 for the ``toolCallId``/``status`` lifecycle fields.
conversation history. See #6972.
"""
if isinstance(item, tuple) and len(item) == 2 and item[0] == "__tool_progress__":
event_data = json.dumps(item[1])
@@ -2363,11 +2277,10 @@ class APIServerAdapter(BasePlatformAdapter):
)
if agent_ref is not None:
agent_ref[0] = agent
effective_task_id = session_id or str(uuid.uuid4())
result = agent.run_conversation(
user_message=user_message,
conversation_history=conversation_history,
task_id=effective_task_id,
task_id="default",
)
usage = {
"input_tokens": getattr(agent, "session_prompt_tokens", 0) or 0,
@@ -2384,31 +2297,10 @@ class APIServerAdapter(BasePlatformAdapter):
_MAX_CONCURRENT_RUNS = 10 # Prevent unbounded resource allocation
_RUN_STREAM_TTL = 300 # seconds before orphaned runs are swept
_RUN_STATUS_TTL = 3600 # seconds to retain terminal run status for polling
def _set_run_status(self, run_id: str, status: str, **fields: Any) -> Dict[str, Any]:
"""Update pollable run status without exposing private agent objects."""
now = time.time()
current = self._run_statuses.get(run_id, {})
current.update({
"object": "hermes.run",
"run_id": run_id,
"status": status,
"updated_at": now,
})
current.setdefault("created_at", fields.pop("created_at", now))
current.update(fields)
self._run_statuses[run_id] = current
return current
def _make_run_event_callback(self, run_id: str, loop: "asyncio.AbstractEventLoop"):
"""Return a tool_progress_callback that pushes structured events to the run's SSE queue."""
def _push(event: Dict[str, Any]) -> None:
self._set_run_status(
run_id,
self._run_statuses.get(run_id, {}).get("status", "running"),
last_event=event.get("event"),
)
q = self._run_streams.get(run_id)
if q is None:
return
@@ -2473,6 +2365,28 @@ class APIServerAdapter(BasePlatformAdapter):
if not user_message:
return web.json_response(_openai_error("No user message found in input"), status=400)
run_id = f"run_{uuid.uuid4().hex}"
loop = asyncio.get_running_loop()
q: "asyncio.Queue[Optional[Dict]]" = asyncio.Queue()
self._run_streams[run_id] = q
self._run_streams_created[run_id] = time.time()
event_cb = self._make_run_event_callback(run_id, loop)
# Also wire stream_delta_callback so message.delta events flow through
def _text_cb(delta: Optional[str]) -> None:
if delta is None:
return
try:
loop.call_soon_threadsafe(q.put_nowait, {
"event": "message.delta",
"run_id": run_id,
"timestamp": time.time(),
"delta": delta,
})
except Exception:
pass
instructions = body.get("instructions")
previous_response_id = body.get("previous_response_id")
@@ -2520,42 +2434,11 @@ class APIServerAdapter(BasePlatformAdapter):
)
conversation_history.append({"role": msg["role"], "content": str(content)})
run_id = f"run_{uuid.uuid4().hex}"
session_id = body.get("session_id") or stored_session_id or run_id
ephemeral_system_prompt = instructions
loop = asyncio.get_running_loop()
q: "asyncio.Queue[Optional[Dict]]" = asyncio.Queue()
created_at = time.time()
self._run_streams[run_id] = q
self._run_streams_created[run_id] = created_at
event_cb = self._make_run_event_callback(run_id, loop)
# Also wire stream_delta_callback so message.delta events flow through.
def _text_cb(delta: Optional[str]) -> None:
if delta is None:
return
try:
loop.call_soon_threadsafe(q.put_nowait, {
"event": "message.delta",
"run_id": run_id,
"timestamp": time.time(),
"delta": delta,
})
except Exception:
pass
self._set_run_status(
run_id,
"queued",
created_at=created_at,
session_id=session_id,
model=body.get("model", self._model_name),
)
async def _run_and_close():
try:
self._set_run_status(run_id, "running")
agent = self._create_agent(
ephemeral_system_prompt=ephemeral_system_prompt,
session_id=session_id,
@@ -2564,11 +2447,10 @@ class APIServerAdapter(BasePlatformAdapter):
)
self._active_run_agents[run_id] = agent
def _run_sync():
effective_task_id = session_id or run_id
r = agent.run_conversation(
user_message=user_message,
conversation_history=conversation_history,
task_id=effective_task_id,
task_id="default",
)
u = {
"input_tokens": getattr(agent, "session_prompt_tokens", 0) or 0,
@@ -2578,62 +2460,16 @@ class APIServerAdapter(BasePlatformAdapter):
return r, u
result, usage = await asyncio.get_running_loop().run_in_executor(None, _run_sync)
# Check for structured failure (non-retryable client errors like
# 401/400 return failed=True instead of raising, so the except
# block below never fires — issue #15561).
if isinstance(result, dict) and result.get("failed"):
error_msg = result.get("error") or "agent run failed"
q.put_nowait({
"event": "run.failed",
"run_id": run_id,
"timestamp": time.time(),
"error": error_msg,
})
self._set_run_status(
run_id,
"failed",
error=error_msg,
last_event="run.failed",
)
else:
final_response = result.get("final_response", "") if isinstance(result, dict) else ""
q.put_nowait({
"event": "run.completed",
"run_id": run_id,
"timestamp": time.time(),
"output": final_response,
"usage": usage,
})
self._set_run_status(
run_id,
"completed",
output=final_response,
usage=usage,
last_event="run.completed",
)
except asyncio.CancelledError:
self._set_run_status(
run_id,
"cancelled",
last_event="run.cancelled",
)
try:
q.put_nowait({
"event": "run.cancelled",
"run_id": run_id,
"timestamp": time.time(),
})
except Exception:
pass
raise
final_response = result.get("final_response", "") if isinstance(result, dict) else ""
q.put_nowait({
"event": "run.completed",
"run_id": run_id,
"timestamp": time.time(),
"output": final_response,
"usage": usage,
})
except Exception as exc:
logger.exception("[api_server] run %s failed", run_id)
self._set_run_status(
run_id,
"failed",
error=str(exc),
last_event="run.failed",
)
try:
q.put_nowait({
"event": "run.failed",
@@ -2663,21 +2499,6 @@ class APIServerAdapter(BasePlatformAdapter):
return web.json_response({"run_id": run_id, "status": "started"}, status=202)
async def _handle_get_run(self, request: "web.Request") -> "web.Response":
"""GET /v1/runs/{run_id} — return pollable run status for external UIs."""
auth_err = self._check_auth(request)
if auth_err:
return auth_err
run_id = request.match_info["run_id"]
status = self._run_statuses.get(run_id)
if status is None:
return web.json_response(
_openai_error(f"Run not found: {run_id}", code="run_not_found"),
status=404,
)
return web.json_response(status)
async def _handle_run_events(self, request: "web.Request") -> "web.StreamResponse":
"""GET /v1/runs/{run_id}/events — SSE stream of structured agent lifecycle events."""
auth_err = self._check_auth(request)
@@ -2740,8 +2561,6 @@ class APIServerAdapter(BasePlatformAdapter):
if agent is None and task is None:
return web.json_response(_openai_error(f"Run not found: {run_id}", code="run_not_found"), status=404)
self._set_run_status(run_id, "stopping", last_event="run.stopping")
if agent is not None:
try:
agent.interrupt("Stop requested via API")
@@ -2784,15 +2603,6 @@ class APIServerAdapter(BasePlatformAdapter):
self._active_run_agents.pop(run_id, None)
self._active_run_tasks.pop(run_id, None)
stale_statuses = [
run_id
for run_id, status in list(self._run_statuses.items())
if status.get("status") in {"completed", "failed", "cancelled"}
and now - float(status.get("updated_at", 0) or 0) > self._RUN_STATUS_TTL
]
for run_id in stale_statuses:
self._run_statuses.pop(run_id, None)
# ------------------------------------------------------------------
# BasePlatformAdapter interface
# ------------------------------------------------------------------
@@ -2811,7 +2621,6 @@ class APIServerAdapter(BasePlatformAdapter):
self._app.router.add_get("/health/detailed", self._handle_health_detailed)
self._app.router.add_get("/v1/health", self._handle_health)
self._app.router.add_get("/v1/models", self._handle_models)
self._app.router.add_get("/v1/capabilities", self._handle_capabilities)
self._app.router.add_post("/v1/chat/completions", self._handle_chat_completions)
self._app.router.add_post("/v1/responses", self._handle_responses)
self._app.router.add_get("/v1/responses/{response_id}", self._handle_get_response)
@@ -2827,7 +2636,6 @@ class APIServerAdapter(BasePlatformAdapter):
self._app.router.add_post("/api/jobs/{job_id}/run", self._handle_run_job)
# Structured event streaming
self._app.router.add_post("/v1/runs", self._handle_runs)
self._app.router.add_get("/v1/runs/{run_id}", self._handle_get_run)
self._app.router.add_get("/v1/runs/{run_id}/events", self._handle_run_events)
self._app.router.add_post("/v1/runs/{run_id}/stop", self._handle_stop_run)
# Start background sweep to clean up orphaned (unconsumed) run streams
+116 -745
View File
File diff suppressed because it is too large Load Diff
+1 -3
View File
@@ -162,9 +162,7 @@ class BlueBubblesAdapter(BasePlatformAdapter):
return False
from aiohttp import web
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self.client = httpx.AsyncClient(timeout=30.0, limits=platform_httpx_limits())
self.client = httpx.AsyncClient(timeout=30.0)
try:
await self._api_get("/api/v1/ping")
info = await self._api_get("/api/v1/server/info")
+1 -5
View File
@@ -228,11 +228,7 @@ class DingTalkAdapter(BasePlatformAdapter):
return False
try:
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self._http_client = httpx.AsyncClient(
timeout=30.0, limits=platform_httpx_limits(),
)
self._http_client = httpx.AsyncClient(timeout=30.0)
credential = dingtalk_stream.Credential(
self._client_id, self._client_secret
File diff suppressed because it is too large Load Diff
+1 -123
View File
@@ -28,10 +28,9 @@ from email.header import decode_header
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.base import MIMEBase
from email.utils import formatdate
from email import encoders
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
from typing import Any, Dict, List, Optional
from gateway.platforms.base import (
BasePlatformAdapter,
@@ -416,18 +415,6 @@ class EmailAdapter(BasePlatformAdapter):
logger.debug("[Email] Dropping automated sender at dispatch: %s", sender_addr)
return
# Skip senders not in EMAIL_ALLOWED_USERS — prevents the adapter
# from creating a MessageEvent (and thus thread context) for senders
# that the gateway will never authorize. Without this early guard,
# a race between dispatch and authorization can result in the adapter
# sending a reply even though the handler returned None.
allowed_raw = os.getenv("EMAIL_ALLOWED_USERS", "").strip()
if allowed_raw:
allowed = {addr.strip().lower() for addr in allowed_raw.split(",") if addr.strip()}
if sender_addr.lower() not in allowed:
logger.debug("[Email] Dropping non-allowlisted sender at dispatch: %s", sender_addr)
return
subject = msg_data["subject"]
body = msg_data["body"].strip()
attachments = msg_data["attachments"]
@@ -517,7 +504,6 @@ class EmailAdapter(BasePlatformAdapter):
msg["In-Reply-To"] = original_msg_id
msg["References"] = original_msg_id
msg["Date"] = formatdate(localtime=True)
msg_id = f"<hermes-{uuid.uuid4().hex[:12]}@{self._address.split('@')[1]}>"
msg["Message-ID"] = msg_id
@@ -552,113 +538,6 @@ class EmailAdapter(BasePlatformAdapter):
text += f"\n\nImage: {image_url}"
return await self.send(chat_id, text.strip(), reply_to)
async def send_multiple_images(
self,
chat_id: str,
images: List[Tuple[str, str]],
metadata: Optional[Dict[str, Any]] = None,
human_delay: float = 0.0,
) -> None:
"""Send a batch of images as a single email with multiple MIME attachments.
Local files are attached directly. URL images have their URL
appended to the body (email adapter does not download remote
images). No hard cap email clients handle dozens of
attachments fine, subject to SMTP message size limits.
"""
if not images:
return
from urllib.parse import unquote as _unquote
body_parts: List[str] = []
local_paths: List[str] = []
for image_url, alt_text in images:
if alt_text:
body_parts.append(alt_text)
if image_url.startswith("file://"):
local_path = _unquote(image_url[7:])
if Path(local_path).exists():
local_paths.append(local_path)
else:
logger.warning("[Email] Skipping missing image: %s", local_path)
else:
# Remote URLs just get linked in the body (parity with send_image)
body_parts.append(f"Image: {image_url}")
if not local_paths and not body_parts:
return
body = "\n\n".join(body_parts)
try:
loop = asyncio.get_running_loop()
await loop.run_in_executor(
None,
self._send_email_with_attachments,
chat_id,
body,
local_paths,
)
except Exception as e:
logger.error("[Email] Multi-image send failed, falling back: %s", e, exc_info=True)
await super().send_multiple_images(chat_id, images, metadata, human_delay)
def _send_email_with_attachments(
self,
to_addr: str,
body: str,
file_paths: List[str],
) -> str:
"""Send an email with multiple file attachments via SMTP."""
msg = MIMEMultipart()
msg["From"] = self._address
msg["To"] = to_addr
ctx = self._thread_context.get(to_addr, {})
subject = ctx.get("subject", "Hermes Agent")
if not subject.startswith("Re:"):
subject = f"Re: {subject}"
msg["Subject"] = subject
original_msg_id = ctx.get("message_id")
if original_msg_id:
msg["In-Reply-To"] = original_msg_id
msg["References"] = original_msg_id
msg["Date"] = formatdate(localtime=True)
msg_id = f"<hermes-{uuid.uuid4().hex[:12]}@{self._address.split('@')[1]}>"
msg["Message-ID"] = msg_id
if body:
msg.attach(MIMEText(body, "plain", "utf-8"))
for file_path in file_paths:
p = Path(file_path)
try:
with open(p, "rb") as f:
part = MIMEBase("application", "octet-stream")
part.set_payload(f.read())
encoders.encode_base64(part)
part.add_header("Content-Disposition", f"attachment; filename={p.name}")
msg.attach(part)
except Exception as e:
logger.warning("[Email] Failed to attach %s: %s", file_path, e)
smtp = smtplib.SMTP(self._smtp_host, self._smtp_port, timeout=30)
try:
smtp.starttls(context=ssl.create_default_context())
smtp.login(self._address, self._password)
smtp.send_message(msg)
finally:
try:
smtp.quit()
except Exception:
smtp.close()
logger.info("[Email] Sent multi-attachment email to %s (%d files)", to_addr, len(file_paths))
return msg_id
async def send_document(
self,
chat_id: str,
@@ -707,7 +586,6 @@ class EmailAdapter(BasePlatformAdapter):
msg["In-Reply-To"] = original_msg_id
msg["References"] = original_msg_id
msg["Date"] = formatdate(localtime=True)
msg_id = f"<hermes-{uuid.uuid4().hex[:12]}@{self._address.split('@')[1]}>"
msg["Message-ID"] = msg_id
+54 -226
View File
@@ -64,7 +64,7 @@ from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from types import SimpleNamespace
from typing import Any, Dict, List, Literal, Optional, Sequence
from typing import Any, Dict, List, Optional, Sequence
from urllib.error import HTTPError, URLError
from urllib.parse import urlencode
from urllib.request import Request, urlopen
@@ -141,7 +141,6 @@ from gateway.platforms.base import (
)
from gateway.status import acquire_scoped_lock, release_scoped_lock
from hermes_constants import get_hermes_home
from utils import atomic_json_write
logger = logging.getLogger(__name__)
@@ -388,8 +387,6 @@ class FeishuAdapterSettings:
admins: frozenset[str] = frozenset()
default_group_policy: str = ""
group_rules: Dict[str, FeishuGroupRule] = field(default_factory=dict)
allow_bots: str = "none" # "none" | "mentions" | "all"
require_mention: bool = True
@dataclass
@@ -399,7 +396,6 @@ class FeishuGroupRule:
policy: str # "open" | "allowlist" | "blacklist" | "admin_only" | "disabled"
allowlist: set[str] = field(default_factory=set)
blacklist: set[str] = field(default_factory=set)
require_mention: Optional[bool] = None # None = inherit global
@dataclass
@@ -409,40 +405,6 @@ class FeishuBatchState:
counts: Dict[str, int] = field(default_factory=dict)
# ---------------------------------------------------------------------------
# Admission: policy types
# ---------------------------------------------------------------------------
RejectReason = Literal[
"self_echo",
"self_ids_unknown",
"bots_disabled",
"bot_not_mentioned",
"group_policy_rejected",
]
def _is_bot_sender(sender: Any) -> bool:
# receive_v1 docs say {user, bot}; accept "app" defensively.
return getattr(sender, "sender_type", "") in ("bot", "app")
def _sender_identity(sender: Any) -> frozenset:
# Take any non-empty id variant — tenant sender_id_type decides which are populated.
sid = getattr(sender, "sender_id", None)
if sid is None:
return frozenset()
return frozenset(
v for v in (
getattr(sid, "open_id", None),
getattr(sid, "user_id", None),
getattr(sid, "union_id", None),
)
if v
)
# ---------------------------------------------------------------------------
# Markdown rendering helpers
# ---------------------------------------------------------------------------
@@ -1415,16 +1377,10 @@ class FeishuAdapter(BasePlatformAdapter):
for chat_id, rule_cfg in raw_group_rules.items():
if not isinstance(rule_cfg, dict):
continue
# Only override when the key is explicitly set — missing vs false
# must not collapse.
per_chat_require_mention: Optional[bool] = None
if "require_mention" in rule_cfg:
per_chat_require_mention = _to_boolean(rule_cfg.get("require_mention"))
group_rules[str(chat_id)] = FeishuGroupRule(
policy=str(rule_cfg.get("policy", "open")).strip().lower(),
allowlist=set(str(u).strip() for u in rule_cfg.get("allowlist", []) if str(u).strip()),
blacklist=set(str(u).strip() for u in rule_cfg.get("blacklist", []) if str(u).strip()),
require_mention=per_chat_require_mention,
)
# Bot-level admins
@@ -1434,16 +1390,6 @@ class FeishuAdapter(BasePlatformAdapter):
# Default group policy (for groups not in group_rules)
default_group_policy = str(extra.get("default_group_policy", "")).strip().lower()
# Env-only so adapter and gateway auth bypass share one source; yaml
# feishu.allow_bots is bridged to this env var at config load.
allow_bots = os.getenv("FEISHU_ALLOW_BOTS", "none").strip().lower()
if allow_bots not in ("none", "mentions", "all"):
logger.warning(
"[Feishu] Unknown allow_bots=%r, falling back to 'none'. Valid: none, mentions, all.",
allow_bots,
)
allow_bots = "none"
return FeishuAdapterSettings(
app_id=str(extra.get("app_id") or os.getenv("FEISHU_APP_ID", "")).strip(),
app_secret=str(extra.get("app_secret") or os.getenv("FEISHU_APP_SECRET", "")).strip(),
@@ -1500,10 +1446,6 @@ class FeishuAdapter(BasePlatformAdapter):
admins=admins,
default_group_policy=default_group_policy,
group_rules=group_rules,
allow_bots=allow_bots,
require_mention=_to_boolean(
extra.get("require_mention", os.getenv("FEISHU_REQUIRE_MENTION", "true"))
),
)
def _apply_settings(self, settings: FeishuAdapterSettings) -> None:
@@ -1534,8 +1476,6 @@ class FeishuAdapter(BasePlatformAdapter):
self._ws_reconnect_interval = settings.ws_reconnect_interval
self._ws_ping_interval = settings.ws_ping_interval
self._ws_ping_timeout = settings.ws_ping_timeout
self._allow_bots = settings.allow_bots
self._require_mention = settings.require_mention
def _build_event_handler(self) -> Any:
if EventDispatcherHandler is None:
@@ -2249,28 +2189,30 @@ class FeishuAdapter(BasePlatformAdapter):
event = getattr(data, "event", None)
message = getattr(event, "message", None)
sender = getattr(event, "sender", None)
if not message or not sender or not getattr(sender, "sender_id", None):
logger.debug("[Feishu] Dropping malformed inbound event: missing message/sender")
sender_id = getattr(sender, "sender_id", None)
if not message or not sender_id:
logger.debug("[Feishu] Dropping malformed inbound event: missing message or sender_id")
return
message_id = getattr(message, "message_id", None)
if not message_id or self._is_duplicate(message_id):
logger.debug("[Feishu] Dropping duplicate/missing message_id: %s", message_id)
return
reason = self._admit(sender, message)
if reason is not None:
logger.debug("[Feishu] dropping inbound event: %s", reason)
if self._is_self_sent_bot_message(event):
logger.debug("[Feishu] Dropping self-sent bot event: %s", message_id)
return
chat_type = getattr(message, "chat_type", "p2p")
chat_id = getattr(message, "chat_id", "") or ""
if chat_type != "p2p" and not self._should_accept_group_message(message, sender_id, chat_id):
logger.debug("[Feishu] Dropping group message that failed mention/policy gate: %s", message_id)
return
await self._process_inbound_message(
data=data,
message=message,
sender_id=getattr(sender, "sender_id", None),
sender_id=sender_id,
chat_type=chat_type,
message_id=message_id,
is_bot=_is_bot_sender(sender),
)
def _on_message_read_event(self, data: P2ImMessageMessageReadV1) -> None:
@@ -2447,11 +2389,10 @@ class FeishuAdapter(BasePlatformAdapter):
msg = items[0] if items else None
if not msg:
return
# GET im/v1/messages returns sender.id=app_id for bot messages —
# peer bots and us share sender_type="app" but differ on app_id.
sender = getattr(msg, "sender", None)
if str(getattr(sender, "id", "") or "") != self._app_id:
return # only route reactions on this bot's own messages
sender_type = str(getattr(sender, "sender_type", "") or "").lower()
if sender_type != "app":
return # only route reactions on our own bot messages
chat_id = str(getattr(msg, "chat_id", "") or "")
chat_type_raw = str(getattr(msg, "chat_type", "p2p") or "p2p")
if not chat_id:
@@ -2738,7 +2679,6 @@ class FeishuAdapter(BasePlatformAdapter):
sender_id: Any,
chat_type: str,
message_id: str,
is_bot: bool = False,
) -> None:
text, inbound_type, media_urls, media_types, mentions = await self._extract_message_content(message)
@@ -2757,45 +2697,34 @@ class FeishuAdapter(BasePlatformAdapter):
if hint:
text = f"{hint}\n\n{text}" if text else hint
thread_id = getattr(message, "thread_id", None) or getattr(message, "root_id", None) or None
reply_to_message_id = (
getattr(message, "parent_id", None)
or getattr(message, "upper_message_id", None)
or getattr(message, "root_id", None)
or None
)
reply_to_text = await self._fetch_message_text(reply_to_message_id) if reply_to_message_id else None
sender_primary = (
getattr(sender_id, "open_id", None)
or getattr(sender_id, "user_id", None)
or getattr(sender_id, "union_id", None)
or "<unknown>"
)
logger.info(
"[Feishu] Inbound %s message received: id=%s type=%s chat_id=%s sender=%s:%s text=%r media=%d",
"[Feishu] Inbound %s message received: id=%s type=%s chat_id=%s text=%r media=%d",
"dm" if chat_type == "p2p" else "group",
message_id,
inbound_type.value,
getattr(message, "chat_id", "") or "",
"bot" if is_bot else "user",
sender_primary,
text[:120],
len(media_urls),
)
chat_id = getattr(message, "chat_id", "") or ""
chat_info = await self.get_chat_info(chat_id)
sender_profile = await self._resolve_sender_profile(sender_id, is_bot=is_bot)
sender_profile = await self._resolve_sender_profile(sender_id)
source = self.build_source(
chat_id=chat_id,
chat_name=chat_info.get("name") or chat_id or "Feishu Chat",
chat_type=self._resolve_source_chat_type(chat_info=chat_info, event_chat_type=chat_type),
user_id=sender_profile["user_id"],
user_name=sender_profile["user_name"],
thread_id=thread_id,
thread_id=getattr(message, "thread_id", None) or None,
user_id_alt=sender_profile["user_id_alt"],
is_bot=is_bot,
)
normalized = MessageEvent(
text=text,
@@ -2924,18 +2853,13 @@ class FeishuAdapter(BasePlatformAdapter):
},
)
response.raise_for_status()
# Snapshot Content-Type and body while the client context is
# still active so pooled connections fully release on exit.
# See #18451.
content_type_hdr = str(response.headers.get("Content-Type", ""))
body = response.content
filename = self._derive_remote_filename(
file_url,
content_type=content_type_hdr,
content_type=str(response.headers.get("Content-Type", "")),
default_name=preferred_name,
default_ext=default_ext,
)
cached_path = cache_document_from_bytes(body, filename)
cached_path = cache_document_from_bytes(response.content, filename)
return cached_path, filename
@staticmethod
@@ -3523,12 +3447,7 @@ class FeishuAdapter(BasePlatformAdapter):
return "dm"
return "group"
async def _resolve_sender_profile(
self,
sender_id: Any,
*,
is_bot: bool = False,
) -> Dict[str, Optional[str]]:
async def _resolve_sender_profile(self, sender_id: Any) -> Dict[str, Optional[str]]:
"""Map Feishu's three-tier user IDs onto Hermes' SessionSource fields.
Preference order for the primary ``user_id`` field:
@@ -3545,11 +3464,7 @@ class FeishuAdapter(BasePlatformAdapter):
union_id = getattr(sender_id, "union_id", None) or None
# Prefer tenant-scoped user_id; fall back to app-scoped open_id.
primary_id = user_id or open_id
# bot/v3/bots/basic_batch only accepts open_id.
name_lookup_id = open_id if is_bot else (primary_id or union_id)
display_name = await self._resolve_sender_name_from_api(
name_lookup_id, is_bot=is_bot,
)
display_name = await self._resolve_sender_name_from_api(primary_id or union_id)
return {
"user_id": primary_id,
"user_name": display_name,
@@ -3569,14 +3484,11 @@ class FeishuAdapter(BasePlatformAdapter):
self._sender_name_cache.pop(sender_id, None)
return None
async def _resolve_sender_name_from_api(
self,
sender_id: Optional[str],
*,
is_bot: bool = False,
) -> Optional[str]:
"""Bots divert to bot/basic_batch — contact API doesn't return bot names.
Failures are silent so the pipeline never blocks on name resolution.
async def _resolve_sender_name_from_api(self, sender_id: Optional[str]) -> Optional[str]:
"""Fetch the sender's display name from the Feishu contact API with a 10-minute cache.
ID-type detection mirrors openclaw: ou_ open_id, on_ union_id, else user_id.
Failures are silently suppressed; the message pipeline must not block on name resolution.
"""
if not sender_id or not self._client:
return None
@@ -3586,16 +3498,7 @@ class FeishuAdapter(BasePlatformAdapter):
now = time.time()
cached_name = self._get_cached_sender_name(trimmed)
if cached_name is not None:
return cached_name or None # "" cached means "known nameless"
if is_bot:
names = await self._fetch_bot_names([trimmed])
if names is None:
return None
expire_at = now + _FEISHU_SENDER_NAME_TTL_SECONDS
for oid, name in names.items():
self._sender_name_cache[oid] = (name, expire_at)
hit = self._sender_name_cache.get(trimmed)
return (hit[0] or None) if hit else None
return cached_name
try:
from lark_oapi.api.contact.v3 import GetUserRequest # lazy import
if trimmed.startswith("ou_"):
@@ -3624,35 +3527,6 @@ class FeishuAdapter(BasePlatformAdapter):
logger.debug("[Feishu] Failed to resolve sender name for %s", sender_id, exc_info=True)
return None
async def _fetch_bot_names(self, bot_ids: List[str]) -> Optional[Dict[str, str]]:
if not self._client or not bot_ids:
return None
try:
req = (
BaseRequest.builder()
.http_method(HttpMethod.GET)
.uri("/open-apis/bot/v3/bots/basic_batch")
.queries([("bot_ids", oid) for oid in bot_ids])
.token_types({AccessTokenType.TENANT})
.build()
)
resp = await asyncio.to_thread(self._client.request, req)
content = getattr(getattr(resp, "raw", None), "content", None)
if not content:
return None
payload = json.loads(content)
if payload.get("code") != 0:
return None
bots = (payload.get("data") or {}).get("bots") or {}
return {
oid: str(info.get("name") or "").strip()
for oid, info in bots.items()
if oid
}
except Exception:
logger.debug("[Feishu] Failed to fetch bot names for %s", bot_ids, exc_info=True)
return None
async def _fetch_message_text(self, message_id: str) -> Optional[str]:
if not self._client or not message_id:
return None
@@ -3716,60 +3590,10 @@ class FeishuAdapter(BasePlatformAdapter):
logger.exception("[Feishu] Background inbound processing failed")
# =========================================================================
# Inbound admission
# Group policy and mention gating
# =========================================================================
def _admit(self, sender: Any, message: Any) -> Optional[RejectReason]:
sender_ids = _sender_identity(sender)
self_ids = frozenset(v for v in (self._bot_open_id, self._bot_user_id) if v)
is_bot = _is_bot_sender(sender)
is_group = getattr(message, "chat_type", "p2p") != "p2p"
chat_id = getattr(message, "chat_id", "") or ""
require_mention = is_group and self._require_mention_for(chat_id)
# Defensive only — Feishu doesn't echo our outbound back as inbound,
# and open_id is always populated on both sides.
if self_ids and sender_ids & self_ids:
return "self_echo"
if is_bot:
mode = self._allow_bots
if mode != "mentions" and mode != "all":
return "bots_disabled"
# Defensive: pre-hydration or malformed payloads.
if not self_ids or not sender_ids:
return "self_ids_unknown"
# Step 4 covers mention enforcement for groups when require_mention
# is on; check here only on paths step 4 won't reach.
if mode == "mentions" and not require_mention and not self._mentions_self(message):
return "bot_not_mentioned"
if not is_group:
return None
if not self._allow_group_message(
getattr(sender, "sender_id", None), chat_id, is_bot=is_bot,
):
return "group_policy_rejected"
if require_mention and not self._mentions_self(message):
return "group_policy_rejected"
return None
def _require_mention_for(self, chat_id: str) -> bool:
rule = self._group_rules.get(chat_id) if chat_id else None
if rule and rule.require_mention is not None:
return rule.require_mention
return self._require_mention
# --- Group policy ---------------------------------------------------------
def _allow_group_message(
self,
sender_id: Any,
chat_id: str = "",
*,
is_bot: bool = False,
) -> bool:
def _allow_group_message(self, sender_id: Any, chat_id: str = "") -> bool:
"""Per-group policy gate for non-DM traffic."""
sender_open_id = getattr(sender_id, "open_id", None)
sender_user_id = getattr(sender_id, "user_id", None)
@@ -3788,17 +3612,12 @@ class FeishuAdapter(BasePlatformAdapter):
allowlist = self._allowed_group_users
blacklist = set()
# Channel locks apply to everyone; allowlist/blacklist only gate humans
# (bots were already cleared upstream by FEISHU_ALLOW_BOTS).
if policy == "disabled":
return False
if policy == "open":
return True
if policy == "admin_only":
return False
if is_bot:
return True
if policy == "allowlist":
return bool(sender_ids and (sender_ids & allowlist))
if policy == "blacklist":
@@ -3806,16 +3625,17 @@ class FeishuAdapter(BasePlatformAdapter):
return bool(sender_ids and (sender_ids & self._allowed_group_users))
# --- Mention detection ----------------------------------------------------
def _mentions_self(self, message: Any) -> bool:
# @_all is Feishu's @everyone placeholder.
def _should_accept_group_message(self, message: Any, sender_id: Any, chat_id: str = "") -> bool:
"""Require an explicit @mention before group messages enter the agent."""
if not self._allow_group_message(sender_id, chat_id):
return False
# @_all is Feishu's @everyone placeholder — always route to the bot.
raw_content = getattr(message, "content", "") or ""
if "@_all" in raw_content:
return True
mentions = getattr(message, "mentions", None) or []
if mentions and self._message_mentions_bot(mentions):
return True
if mentions:
return self._message_mentions_bot(mentions)
normalized = normalize_feishu_message(
message_type=getattr(message, "message_type", "") or "",
raw_content=raw_content,
@@ -3824,6 +3644,23 @@ class FeishuAdapter(BasePlatformAdapter):
)
return self._post_mentions_bot(normalized.mentions)
def _is_self_sent_bot_message(self, event: Any) -> bool:
"""Return True only for Feishu events emitted by this Hermes bot."""
sender = getattr(event, "sender", None)
sender_type = str(getattr(sender, "sender_type", "") or "").strip().lower()
if sender_type not in {"bot", "app"}:
return False
sender_id = getattr(sender, "sender_id", None)
sender_open_id = str(getattr(sender_id, "open_id", "") or "").strip()
sender_user_id = str(getattr(sender_id, "user_id", "") or "").strip()
if self._bot_open_id and sender_open_id == self._bot_open_id:
return True
if self._bot_user_id and sender_user_id == self._bot_user_id:
return True
return False
def _message_mentions_bot(self, mentions: List[Any]) -> bool:
# IDs trump names: when both sides have open_id (or both user_id),
# match requires equal IDs. Name fallback only when either side
@@ -3967,7 +3804,7 @@ class FeishuAdapter(BasePlatformAdapter):
recent = self._seen_message_order[-self._dedup_cache_size:]
# Save as {msg_id: timestamp} so TTL filtering works across restarts.
payload = {"message_ids": {k: self._seen_message_ids[k] for k in recent if k in self._seen_message_ids}}
atomic_json_write(self._dedup_state_path, payload, indent=None)
self._dedup_state_path.write_text(json.dumps(payload, ensure_ascii=False), encoding="utf-8")
except OSError:
logger.warning("[Feishu] Failed to persist dedup state to %s", self._dedup_state_path, exc_info=True)
@@ -4229,15 +4066,6 @@ class FeishuAdapter(BasePlatformAdapter):
if active_reply_to and not self._response_succeeded(response):
code = getattr(response, "code", None)
if code in _FEISHU_REPLY_FALLBACK_CODES:
if (metadata or {}).get("thread_id"):
logger.warning(
"[Feishu] Reply to %s failed in thread %s (code %s — message withdrawn/missing); "
"skipping top-level fallback to avoid creating a new topic",
active_reply_to,
(metadata or {}).get("thread_id"),
code,
)
return response
logger.warning(
"[Feishu] Reply to %s failed (code %s — message withdrawn/missing); "
"falling back to new message in chat %s",
+1
View File
@@ -974,6 +974,7 @@ def build_whole_comment_prompt(
def _resolve_model_and_runtime() -> Tuple[str, dict]:
"""Resolve model and provider credentials, same as gateway message handling."""
import os
from gateway.run import _load_gateway_config, _resolve_gateway_model
user_config = _load_gateway_config()
+4 -14
View File
@@ -11,12 +11,10 @@ import logging
import re
import time
from pathlib import Path
from typing import TYPE_CHECKING, Dict
from utils import atomic_json_write
from typing import TYPE_CHECKING, Dict, Optional
if TYPE_CHECKING:
from gateway.platforms.base import MessageEvent
from gateway.platforms.base import BasePlatformAdapter, MessageEvent
logger = logging.getLogger(__name__)
@@ -59,15 +57,6 @@ class MessageDeduplicator:
if len(self._seen) > self._max_size:
cutoff = now - self._ttl
self._seen = {k: v for k, v in self._seen.items() if v > cutoff}
if len(self._seen) > self._max_size:
# TTL pruning alone does not cap the cache when every entry is
# still fresh. Keep the newest entries so the helper's
# max_size bound is enforced under sustained traffic.
newest = sorted(
self._seen.items(),
key=lambda item: item[1],
)[-self._max_size:]
self._seen = dict(newest)
return False
def clear(self):
@@ -239,11 +228,12 @@ class ThreadParticipationTracker:
def _save(self) -> None:
path = self._state_path()
path.parent.mkdir(parents=True, exist_ok=True)
thread_list = list(self._threads)
if len(thread_list) > self._max_tracked:
thread_list = thread_list[-self._max_tracked:]
self._threads = set(thread_list)
atomic_json_write(path, thread_list, indent=None)
path.write_text(json.dumps(thread_list), encoding="utf-8")
def mark(self, thread_id: str) -> None:
"""Mark *thread_id* as participated and persist."""
+1 -1
View File
@@ -139,7 +139,7 @@ class HomeAssistantAdapter(BasePlatformAdapter):
async def _ws_connect(self) -> bool:
"""Establish WebSocket connection and authenticate."""
ws_url = self._hass_url.replace("https://", "wss://").replace("http://", "ws://")
ws_url = self._hass_url.replace("http://", "ws://").replace("https://", "wss://")
ws_url = f"{ws_url}/api/websocket"
self._session = aiohttp.ClientSession(
+47 -493
View File
@@ -11,7 +11,6 @@ Environment variables:
MATRIX_PASSWORD Password (alternative to access token)
MATRIX_ENCRYPTION Set "true" to enable E2EE
MATRIX_DEVICE_ID Stable device ID for E2EE persistence across restarts
MATRIX_PROXY HTTP(S) or SOCKS proxy URL for Matrix traffic
MATRIX_ALLOWED_USERS Comma-separated Matrix user IDs (@user:server)
MATRIX_HOME_ROOM Room ID for cron/notification delivery
MATRIX_REACTIONS Set "false" to disable processing lifecycle reactions
@@ -19,7 +18,6 @@ Environment variables:
MATRIX_REQUIRE_MENTION Require @mention in rooms (default: true)
MATRIX_FREE_RESPONSE_ROOMS Comma-separated room IDs exempt from mention requirement
MATRIX_AUTO_THREAD Auto-create threads for room messages (default: true)
MATRIX_DM_AUTO_THREAD Auto-create threads for DM messages (default: false)
MATRIX_RECOVERY_KEY Recovery key for cross-signing verification after device key rotation
MATRIX_DM_MENTION_THREADS Create a thread when bot is @mentioned in a DM (default: false)
"""
@@ -32,8 +30,6 @@ import mimetypes
import os
import re
import time
from dataclasses import dataclass
from html import escape as _html_escape
from pathlib import Path
from typing import Any, Dict, Optional, Set
@@ -99,25 +95,11 @@ from gateway.platforms.base import (
MessageType,
ProcessingOutcome,
SendResult,
resolve_proxy_url,
proxy_kwargs_for_aiohttp,
)
from gateway.platforms.helpers import ThreadParticipationTracker
logger = logging.getLogger(__name__)
@dataclass
class _MatrixApprovalPrompt:
"""Tracks a pending Matrix reaction-based exec approval prompt."""
def __init__(self, session_key: str, chat_id: str, message_id: str, resolved: bool = False):
self.session_key = session_key
self.chat_id = chat_id
self.message_id = message_id
self.resolved = resolved
self.bot_reaction_events: dict[str, str] = {} # emoji -> event_id
# Matrix message size limit (4000 chars practical, spec has no hard limit
# but clients render poorly above this).
MAX_MESSAGE_LENGTH = 4000
@@ -132,85 +114,11 @@ _CRYPTO_DB_PATH = _STORE_DIR / "crypto.db"
# Grace period: ignore messages older than this many seconds before startup.
_STARTUP_GRACE_SECONDS = 5
_OUTBOUND_MENTION_RE = re.compile(
r"(?<![\w/])(@[0-9A-Za-z._=/-]+:[0-9A-Za-z.-]+(?::\d+)?)"
)
_E2EE_INSTALL_HINT = (
"Install with: pip install 'mautrix[encryption]' (requires libolm C library)"
)
_MATRIX_IMAGE_FILENAME_EXTS = frozenset({
".jpg",
".jpeg",
".png",
".gif",
".webp",
".bmp",
".svg",
".heic",
".heif",
".avif",
})
def _looks_like_matrix_image_filename(text: str) -> bool:
"""Return True when Matrix image body text is probably just a transport filename.
Matrix ``m.image`` events commonly populate ``content.body`` with the uploaded
filename when the user did not add a caption. Treating that raw filename as
user-authored text confuses downstream vision enrichment.
"""
candidate = str(text or "").strip()
if not candidate or "\n" in candidate or candidate.endswith("/"):
return False
name = Path(candidate).name
if not name or name != candidate:
return False
suffix = Path(name).suffix.lower()
if not suffix:
return False
guessed_type, _ = mimetypes.guess_type(name)
if guessed_type and guessed_type.startswith("image/"):
return True
return suffix in _MATRIX_IMAGE_FILENAME_EXTS
def _create_matrix_session(proxy_url: str | None):
"""Create an ``aiohttp.ClientSession`` whose proxy applies to *all* requests.
mautrix's ``HTTPAPI._send()`` calls ``session.request()`` without forwarding
per-request ``proxy=`` kwargs. For HTTP(S) proxies we use aiohttp's native
``proxy=`` session parameter which sets a default for every request. For SOCKS
we use ``aiohttp_socks.ProxyConnector`` (connector-level).
When no proxy is configured we enable ``trust_env`` so standard env vars
(``HTTP_PROXY`` / ``HTTPS_PROXY``) are honoured automatically.
"""
import aiohttp
if not proxy_url:
return aiohttp.ClientSession(trust_env=True)
if proxy_url.split("://")[0].lower().startswith("socks"):
try:
from aiohttp_socks import ProxyConnector
return aiohttp.ClientSession(
connector=ProxyConnector.from_url(proxy_url, rdns=True),
)
except ImportError:
logger.warning(
"aiohttp_socks not installed — SOCKS proxy %s ignored. "
"Run: pip install aiohttp-socks",
proxy_url,
)
return aiohttp.ClientSession(trust_env=True)
return aiohttp.ClientSession(proxy=proxy_url)
def _check_e2ee_deps() -> bool:
"""Return True if mautrix E2EE dependencies (python-olm) are available."""
@@ -352,9 +260,6 @@ class MatrixAdapter(BasePlatformAdapter):
"1",
"yes",
)
self._dm_auto_thread: bool = os.getenv(
"MATRIX_DM_AUTO_THREAD", "false"
).lower() in ("true", "1", "yes")
self._dm_mention_threads: bool = os.getenv(
"MATRIX_DM_MENTION_THREADS", "false"
).lower() in ("true", "1", "yes")
@@ -365,11 +270,6 @@ class MatrixAdapter(BasePlatformAdapter):
).lower() not in ("false", "0", "no")
self._pending_reactions: dict[tuple[str, str], str] = {}
# Proxy support — resolve once at init, reuse for all HTTP traffic.
self._proxy_url: str | None = resolve_proxy_url(platform_env_var="MATRIX_PROXY")
if self._proxy_url:
logger.info("Matrix: proxy configured — %s", self._proxy_url)
# Text batching: merge rapid successive messages (Telegram-style).
# Matrix clients split long messages around 4000 chars.
self._text_batch_delay_seconds = float(
@@ -381,18 +281,6 @@ class MatrixAdapter(BasePlatformAdapter):
self._pending_text_batches: Dict[str, MessageEvent] = {}
self._pending_text_batch_tasks: Dict[str, asyncio.Task] = {}
# Matrix reaction-based dangerous command approvals.
self._approval_reaction_map = {
"": "once",
"": "deny",
}
self._approval_prompts_by_event: Dict[str, _MatrixApprovalPrompt] = {}
self._approval_prompt_by_session: Dict[str, str] = {}
allowed_users_raw = os.getenv("MATRIX_ALLOWED_USERS", "")
self._allowed_user_ids: Set[str] = {
u.strip() for u in allowed_users_raw.split(",") if u.strip()
}
def _is_duplicate_event(self, event_id) -> bool:
"""Return True if this event was already processed. Tracks the ID otherwise."""
if not event_id:
@@ -438,7 +326,7 @@ class MatrixAdapter(BasePlatformAdapter):
)
return False
except Exception as exc:
logger.error("Matrix: post-upload key verification failed: %s", exc, exc_info=True)
logger.error("Matrix: post-upload key verification failed: %s", exc)
return False
return True
@@ -454,7 +342,6 @@ class MatrixAdapter(BasePlatformAdapter):
logger.error(
"Matrix: cannot verify device keys on server: %s — refusing E2EE",
exc,
exc_info=True,
)
return False
@@ -469,7 +356,7 @@ class MatrixAdapter(BasePlatformAdapter):
try:
await olm.share_keys()
except Exception as exc:
logger.error("Matrix: failed to re-upload device keys: %s", exc, exc_info=True)
logger.error("Matrix: failed to re-upload device keys: %s", exc)
return False
return await self._reverify_keys_after_upload(client, local_ed25519)
@@ -509,7 +396,6 @@ class MatrixAdapter(BasePlatformAdapter):
"Try generating a new access token to get a fresh device.",
client.device_id,
exc,
exc_info=True,
)
return False
return await self._reverify_keys_after_upload(client, local_ed25519)
@@ -534,11 +420,9 @@ class MatrixAdapter(BasePlatformAdapter):
_STORE_DIR.mkdir(parents=True, exist_ok=True)
# Create the HTTP API layer.
client_session = _create_matrix_session(self._proxy_url)
api = HTTPAPI(
base_url=self._homeserver,
token=self._access_token or "",
client_session=client_session,
)
# Create the client.
@@ -581,7 +465,6 @@ class MatrixAdapter(BasePlatformAdapter):
logger.error(
"Matrix: whoami failed — check MATRIX_ACCESS_TOKEN and MATRIX_HOMESERVER: %s",
exc,
exc_info=True,
)
await api.session.close()
return False
@@ -724,44 +607,6 @@ class MatrixAdapter(BasePlatformAdapter):
logger.warning(
"Matrix: recovery key verification failed: %s", exc
)
else:
# No recovery key — bootstrap cross-signing if the bot
# has none yet. Without this, Element shows "Encrypted
# by a device not verified by its owner" on every
# message from this bot, indefinitely. mautrix's
# generate_recovery_key does the full flow: generates
# MSK/SSK/USK, uploads private keys to SSSS, publishes
# public keys to the homeserver, and signs the current
# device with the new SSK. Some homeservers require UIA
# for /keys/device_signing/upload — those will need an
# alternate path; Continuwuity and Synapse-with-shared-
# secret accept the unauthenticated upload.
try:
own_xsign = await olm.get_own_cross_signing_public_keys()
except Exception as exc:
own_xsign = None
logger.warning(
"Matrix: cross-signing key lookup failed: %s", exc
)
if own_xsign is None:
try:
new_recovery_key = await olm.generate_recovery_key()
logger.warning(
"Matrix: bootstrapped cross-signing for %s. "
"SAVE THIS RECOVERY KEY — set "
"MATRIX_RECOVERY_KEY for future restarts so "
"the bot can re-sign its device after key "
"rotation: %s",
client.mxid,
new_recovery_key,
)
except Exception as exc:
logger.warning(
"Matrix: cross-signing bootstrap failed "
"(non-fatal — Element will show 'not "
"verified by its owner'): %s",
exc,
)
client.crypto = olm
logger.info(
@@ -819,7 +664,6 @@ class MatrixAdapter(BasePlatformAdapter):
await asyncio.gather(*tasks)
except Exception as exc:
logger.warning("Matrix: initial sync event dispatch error: %s", exc)
await self._join_pending_invites(sync_data)
else:
logger.warning(
"Matrix: initial sync returned unexpected type %s",
@@ -883,8 +727,17 @@ class MatrixAdapter(BasePlatformAdapter):
chunks = self.truncate_message(formatted, MAX_MESSAGE_LENGTH)
last_event_id = None
for i, chunk in enumerate(chunks):
msg_content = self._build_text_message_content(chunk)
for chunk in chunks:
msg_content: Dict[str, Any] = {
"msgtype": "m.text",
"body": chunk,
}
# Convert markdown to HTML for rich rendering.
html = self._markdown_to_html(chunk)
if html and html != chunk:
msg_content["format"] = "org.matrix.custom.html"
msg_content["formatted_body"] = html
# Reply-to support.
if reply_to:
@@ -991,21 +844,25 @@ class MatrixAdapter(BasePlatformAdapter):
"""Edit an existing message (via m.replace)."""
formatted = self.format_message(content)
new_content = self._build_text_message_content(formatted)
msg_content: Dict[str, Any] = {
"msgtype": "m.text",
"body": f"* {formatted}",
"m.new_content": new_content,
"m.new_content": {
"msgtype": "m.text",
"body": formatted,
},
"m.relates_to": {
"rel_type": "m.replace",
"event_id": message_id,
},
}
if "m.mentions" in new_content:
msg_content["m.mentions"] = new_content["m.mentions"]
if "formatted_body" in new_content:
html = self._markdown_to_html(formatted)
if html and html != formatted:
msg_content["m.new_content"]["format"] = "org.matrix.custom.html"
msg_content["m.new_content"]["formatted_body"] = html
msg_content["format"] = "org.matrix.custom.html"
msg_content["formatted_body"] = f'* {new_content["formatted_body"]}'
msg_content["m.relates_to"] = {
"rel_type": "m.replace",
"event_id": message_id,
}
msg_content["formatted_body"] = f"* {html}"
try:
event_id = await self._client.send_message_event(
@@ -1038,12 +895,10 @@ class MatrixAdapter(BasePlatformAdapter):
# Try aiohttp first (always available), fall back to httpx
try:
import aiohttp as _aiohttp
_sess_kw, _req_kw = proxy_kwargs_for_aiohttp(self._proxy_url)
async with _aiohttp.ClientSession(**_sess_kw) as http:
async with _aiohttp.ClientSession(trust_env=True) as http:
async with http.get(
image_url,
timeout=_aiohttp.ClientTimeout(total=30),
**_req_kw,
image_url, timeout=_aiohttp.ClientTimeout(total=30)
) as resp:
resp.raise_for_status()
data = await resp.read()
@@ -1053,10 +908,8 @@ class MatrixAdapter(BasePlatformAdapter):
)
except ImportError:
import httpx
_httpx_kw: dict = {}
if self._proxy_url:
_httpx_kw["proxy"] = self._proxy_url
async with httpx.AsyncClient(**_httpx_kw) as http:
async with httpx.AsyncClient() as http:
resp = await http.get(image_url, follow_redirects=True, timeout=30)
resp.raise_for_status()
data = resp.content
@@ -1131,56 +984,6 @@ class MatrixAdapter(BasePlatformAdapter):
chat_id, video_path, "m.video", caption, reply_to, metadata=metadata
)
async def send_exec_approval(
self,
chat_id: str,
command: str,
session_key: str,
description: str = "dangerous command",
metadata: Optional[dict] = None,
) -> SendResult:
"""Send a reaction-based exec approval prompt for Matrix."""
if not self._client:
return SendResult(success=False, error="Not connected")
cmd_preview = command[:2000] + "..." if len(command) > 2000 else command
text = (
"⚠️ **Dangerous command requires approval**\n"
f"```\n{cmd_preview}\n```\n"
f"Reason: {description}\n\n"
"Reply `/approve` to execute, `/approve session` to approve this pattern for the session, "
"`/approve always` to approve permanently, or `/deny` to cancel.\n\n"
"You can also click the reaction to approve:\n"
"✅ = /approve\n"
"❎ = /deny"
)
result = await self.send(chat_id, text, metadata=metadata)
if not result.success or not result.message_id:
return result
prompt = _MatrixApprovalPrompt(
session_key=session_key,
chat_id=chat_id,
message_id=result.message_id,
)
old_event = self._approval_prompt_by_session.get(session_key)
if old_event:
self._approval_prompts_by_event.pop(old_event, None)
self._approval_prompts_by_event[result.message_id] = prompt
self._approval_prompt_by_session[session_key] = result.message_id
for emoji in ("", ""):
try:
reaction_result = await self._send_reaction(chat_id, result.message_id, emoji)
# Save the bot's reaction event_id for later cleanup
if reaction_result:
prompt.bot_reaction_events[emoji] = str(reaction_result)
except Exception as exc:
logger.debug("Matrix: failed to add approval reaction %s: %s", emoji, exc)
return result
def format_message(self, content: str) -> str:
"""Pass-through — Matrix supports standard Markdown natively."""
# Strip image markdown; media is uploaded separately.
@@ -1312,15 +1115,9 @@ class MatrixAdapter(BasePlatformAdapter):
next_batch = await client.sync_store.get_next_batch()
while not self._closing:
try:
# Wrap in asyncio.wait_for to guard against TCP-level hangs
# that the Matrix long-poll timeout cannot catch. Long-poll
# is 30s, so 45s gives 15s slack for network drain.
sync_data = await asyncio.wait_for(
client.sync(
since=next_batch,
timeout=30000,
),
timeout=45.0,
sync_data = await client.sync(
since=next_batch,
timeout=30000,
)
# nio returns SyncError objects (not exceptions) for auth
@@ -1356,7 +1153,6 @@ class MatrixAdapter(BasePlatformAdapter):
await asyncio.gather(*tasks)
except Exception as exc:
logger.warning("Matrix: sync event dispatch error: %s", exc)
await self._join_pending_invites(sync_data)
except asyncio.CancelledError:
return
@@ -1382,92 +1178,13 @@ class MatrixAdapter(BasePlatformAdapter):
# Event callbacks
# ------------------------------------------------------------------
def _is_self_sender(self, sender: str) -> bool:
"""Return True if the sender refers to the bot's own account.
Matrix user IDs are byte-compared after trimming whitespace and
lowercasing some homeservers normalize the localpart case
differently at different API surfaces, and the reply-loop tail
of the "hall of mirrors" bug (#15763) has been observed with the
bot's own account bypassing a case-sensitive equality check.
When ``self._user_id`` is empty (whoami hasn't resolved yet, or
login failed), we cannot prove a sender is NOT us, so we return
True defensively an unidentified bot dropping its own events
is always preferable to falling into an echo loop.
"""
own = (self._user_id or "").strip().lower()
if not own:
return True
return sender.strip().lower() == own
@staticmethod
def _is_system_or_bridge_sender(sender: str) -> bool:
"""Return True if the sender looks like a system / bridge / appservice
identity rather than a real user.
Appservice namespaces on Matrix conventionally prefix bot / puppet
user IDs with an underscore (e.g. ``@_telegram_12345:server``,
``@_discord_999:server``, ``@_slack_...:server``). Server-notices
bots and bridge-controller bots on many homeservers use the same
pattern.
We treat these as system identities for pairing purposes: they
should never be offered a pairing code, because an operator
approving the code would hand the bridge itself permanent
authorization and every outbound message relayed by the bridge
would then loop back into the agent as an "authorized user
message", which is the root of issue #15763.
Matches:
``@_something:server`` appservice namespace convention
``@:server`` malformed / empty localpart
``:server`` malformed, no leading ``@``
"""
s = (sender or "").strip()
if not s:
return True
# Localpart is everything between leading '@' and ':'
if s.startswith("@"):
s = s[1:]
if ":" in s:
localpart, _, _ = s.partition(":")
else:
localpart = s
if not localpart:
return True
return localpart.startswith("_")
async def _on_room_message(self, event: Any) -> None:
"""Handle incoming room message events (text, media)."""
room_id = str(getattr(event, "room_id", ""))
sender = str(getattr(event, "sender", ""))
# Diagnostic: confirm the callback is firing at all when DEBUG is on.
# Helps users troubleshoot silent inbound issues like #5819, #7914, #12614.
logger.debug(
"Matrix: callback fired — event %s from %s in %s",
getattr(event, "event_id", "?"),
sender,
room_id,
)
# Ignore own messages (case-insensitive; also drops when our own
# user_id hasn't been resolved yet — see _is_self_sender docstring
# and issue #15763).
if self._is_self_sender(sender):
return
# Ignore appservice / bridge / system identities so they never
# trigger the pairing flow. Once a bridge user is paired, every
# outbound message it relays would loop back as an authorized
# user message (the "hall of mirrors" in #15763).
if self._is_system_or_bridge_sender(sender):
logger.debug(
"Matrix: ignoring system/bridge sender %s in %s",
sender,
room_id,
)
# Ignore own messages.
if sender == self._user_id:
return
# Deduplicate by event ID.
@@ -1563,12 +1280,6 @@ class MatrixAdapter(BasePlatformAdapter):
in_bot_thread = bool(thread_id and thread_id in self._threads)
if self._require_mention and not is_free_room and not in_bot_thread:
if not is_mentioned:
logger.debug(
"Matrix: ignoring message %s in %s — no @mention "
"(set MATRIX_REQUIRE_MENTION=false to disable)",
event_id,
room_id,
)
return None
# DM mention-thread.
@@ -1581,7 +1292,7 @@ class MatrixAdapter(BasePlatformAdapter):
body = self._strip_mention(body)
# Auto-thread.
if not thread_id and ((not is_dm and self._auto_thread) or (is_dm and self._dm_auto_thread)):
if not is_dm and not thread_id and self._auto_thread:
thread_id = event_id
self._threads.mark(thread_id)
@@ -1823,9 +1534,6 @@ class MatrixAdapter(BasePlatformAdapter):
return
body, is_dm, chat_type, thread_id, display_name, source = ctx
if msgtype == "m.image" and _looks_like_matrix_image_filename(body):
body = ""
allow_http_fallback = bool(http_url) and not is_encrypted_media
media_urls = (
[cached_path]
@@ -1855,35 +1563,13 @@ class MatrixAdapter(BasePlatformAdapter):
"Matrix: invited to %s — joining",
room_id,
)
await self._join_room_by_id(room_id)
async def _join_room_by_id(self, room_id: str) -> bool:
"""Join a room by ID and refresh local caches on success."""
if not room_id:
return False
if room_id in self._joined_rooms:
return True
try:
await self._client.join_room(RoomID(room_id))
self._joined_rooms.add(room_id)
logger.info("Matrix: joined %s", room_id)
await self._refresh_dm_cache()
return True
except Exception as exc:
logger.warning("Matrix: error joining %s: %s", room_id, exc)
return False
async def _join_pending_invites(self, sync_data: Dict[str, Any]) -> None:
"""Join rooms still present in rooms.invite after sync processing."""
rooms = sync_data.get("rooms", {}) if isinstance(sync_data, dict) else {}
invites = rooms.get("invite", {})
if not isinstance(invites, dict):
return
for room_id in invites:
if room_id in self._joined_rooms:
continue
logger.info("Matrix: reconciling pending invite for %s", room_id)
await self._join_room_by_id(str(room_id))
# ------------------------------------------------------------------
# Reactions (send, receive, processing lifecycle)
@@ -1968,7 +1654,7 @@ class MatrixAdapter(BasePlatformAdapter):
async def _on_reaction(self, event: Any) -> None:
"""Handle incoming reaction events."""
sender = str(getattr(event, "sender", ""))
if self._is_self_sender(sender):
if sender == self._user_id:
return
event_id = str(getattr(event, "event_id", ""))
if self._is_duplicate_event(event_id):
@@ -1998,51 +1684,6 @@ class MatrixAdapter(BasePlatformAdapter):
room_id,
)
# Check if this reaction resolves a pending approval prompt.
prompt = self._approval_prompts_by_event.get(reacts_to)
if prompt and not prompt.resolved:
if room_id != prompt.chat_id:
return
if self._allowed_user_ids and sender not in self._allowed_user_ids:
logger.info(
"Matrix: ignoring approval reaction from unauthorized user %s on %s",
sender, reacts_to,
)
return
choice = self._approval_reaction_map.get(key)
if not choice:
return
try:
from tools.approval import resolve_gateway_approval
count = resolve_gateway_approval(prompt.session_key, choice)
if count:
prompt.resolved = True
self._approval_prompts_by_event.pop(reacts_to, None)
self._approval_prompt_by_session.pop(prompt.session_key, None)
logger.info(
"Matrix reaction resolved %d approval(s) for session %s "
"(choice=%s, user=%s)",
count, prompt.session_key, choice, sender,
)
# Redact bot's seed reactions, leaving only the user's
await self._redact_bot_approval_reactions(room_id, prompt)
except Exception as exc:
logger.error("Failed to resolve gateway approval from Matrix reaction: %s", exc)
async def _redact_bot_approval_reactions(
self,
room_id: str,
prompt: "_MatrixApprovalPrompt",
) -> None:
"""Redact the bot's seed ✅/❎ reactions, leaving only the user's reaction."""
for emoji, evt_id in prompt.bot_reaction_events.items():
try:
await self.redact_message(room_id, evt_id, "approval resolved")
logger.debug("Matrix: redacted bot reaction %s (%s)", emoji, evt_id)
except Exception as exc:
logger.debug("Matrix: failed to redact bot reaction %s: %s", emoji, exc)
# ------------------------------------------------------------------
# Text message aggregation (handles Matrix client-side splits)
# ------------------------------------------------------------------
@@ -2268,7 +1909,11 @@ class MatrixAdapter(BasePlatformAdapter):
if not self._client or not text:
return SendResult(success=False, error="No client or empty text")
msg_content = self._build_text_message_content(text, msgtype=msgtype)
msg_content: Dict[str, Any] = {"msgtype": msgtype, "body": text}
html = self._markdown_to_html(text)
if html and html != text:
msg_content["format"] = "org.matrix.custom.html"
msg_content["formatted_body"] = html
try:
event_id = await self._client.send_message_event(
@@ -2331,77 +1976,6 @@ class MatrixAdapter(BasePlatformAdapter):
# Mention detection helpers
# ------------------------------------------------------------------
def _build_text_message_content(self, text: str, msgtype: str = "m.text") -> Dict[str, Any]:
"""Build Matrix text content with HTML and outbound mention metadata."""
msg_content: Dict[str, Any] = {"msgtype": msgtype, "body": text}
mention_user_ids = self._extract_outbound_mentions(text)
if mention_user_ids:
msg_content["m.mentions"] = {"user_ids": mention_user_ids}
html_source = self._inject_outbound_mention_links(text)
html = self._markdown_to_html(html_source)
if html and html != text:
msg_content["format"] = "org.matrix.custom.html"
msg_content["formatted_body"] = html
return msg_content
def _extract_outbound_mentions(self, text: str) -> list[str]:
"""Return unique Matrix user IDs mentioned in outbound text."""
protected, _ = self._protect_outbound_mention_regions(text)
seen: Set[str] = set()
mentions: list[str] = []
for match in _OUTBOUND_MENTION_RE.finditer(protected):
user_id = match.group(1)
if user_id not in seen:
seen.add(user_id)
mentions.append(user_id)
return mentions
def _inject_outbound_mention_links(self, text: str) -> str:
"""Wrap outbound Matrix mentions in markdown links outside code spans."""
if not text:
return text
protected, placeholders = self._protect_outbound_mention_regions(text)
linked = _OUTBOUND_MENTION_RE.sub(
lambda match: f"[{match.group(1)}](https://matrix.to/#/{match.group(1)})",
protected,
)
for idx, original in enumerate(placeholders):
linked = linked.replace(f"\x00MENTION_PROTECTED{idx}\x00", original)
return linked
def _protect_outbound_mention_regions(self, text: str) -> tuple[str, list[str]]:
"""Protect markdown regions where outbound mentions should stay literal."""
placeholders: list[str] = []
def _protect(fragment: str) -> str:
idx = len(placeholders)
placeholders.append(fragment)
return f"\x00MENTION_PROTECTED{idx}\x00"
protected = re.sub(
r"```[\s\S]*?```",
lambda match: _protect(match.group(0)),
text or "",
)
protected = re.sub(
r"`[^`\n]+`",
lambda match: _protect(match.group(0)),
protected,
)
protected = re.sub(
r"\[[^\]]+\]\([^)]+\)",
lambda match: _protect(match.group(0)),
protected,
)
return protected, placeholders
def _is_bot_mentioned(
self,
body: str,
@@ -2436,33 +2010,13 @@ class MatrixAdapter(BasePlatformAdapter):
return False
def _strip_mention(self, body: str) -> str:
"""Remove explicit bot mentions from message body.
"""Strip the bot's full MXID (``@user:server``) from *body*.
Important: only strip explicit mention tokens (``@user:server`` or
``@localpart``). Do NOT strip bare words matching the bot localpart,
otherwise normal phrases like "Hermes Agent" become "Agent".
The bare localpart is intentionally *not* stripped it would
mangle file paths like ``/home/hermes/media/file.png``.
"""
if not body:
return ""
# Strip explicit full MXID mentions.
if self._user_id:
body = body.replace(self._user_id, "")
# Strip explicit @localpart mentions only (not bare localpart words).
if self._user_id and ":" in self._user_id:
localpart = self._user_id.split(":")[0].lstrip("@")
if localpart:
body = re.sub(
r'(?<![\w])@' + re.escape(localpart) + r'\b',
'',
body,
flags=re.IGNORECASE,
)
# Normalize spacing after mention removal.
body = re.sub(r'[ \t]{2,}', ' ', body)
body = re.sub(r'\s+([,.;:!?])', r'\1', body)
return body.strip()
async def _get_display_name(self, room_id: str, user_id: str) -> str:
+2 -95
View File
@@ -19,7 +19,7 @@ import logging
import os
import re
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
from typing import Any, Dict, List, Optional
from gateway.config import Platform, PlatformConfig
from gateway.platforms.helpers import MessageDeduplicator
@@ -412,6 +412,7 @@ class MattermostAdapter(BasePlatformAdapter):
import aiohttp
last_exc = None
file_data = None
ct = "application/octet-stream"
fname = url.rsplit("/", 1)[-1].split("?")[0] or f"{kind}.png"
@@ -496,100 +497,6 @@ class MattermostAdapter(BasePlatformAdapter):
return SendResult(success=False, error="Failed to post with file")
return SendResult(success=True, message_id=data["id"])
async def send_multiple_images(
self,
chat_id: str,
images: List[Tuple[str, str]],
metadata: Optional[Dict[str, Any]] = None,
human_delay: float = 0.0,
) -> None:
"""Send a batch of images as a single Mattermost post with multiple attachments.
Mattermost supports up to 5 ``file_ids`` per post. Each image is
uploaded individually (Mattermost's file API is one-at-a-time),
then a single post is created referencing all uploaded file_ids
at once. Batches larger than 5 are chunked. Falls back to the
base per-image loop on total failure.
"""
if not images:
return
import mimetypes
import aiohttp
from urllib.parse import unquote as _unquote
CHUNK = 5 # Mattermost post file_ids cap
chunks = [images[i:i + CHUNK] for i in range(0, len(images), CHUNK)]
for chunk_idx, chunk in enumerate(chunks):
if human_delay > 0 and chunk_idx > 0:
await asyncio.sleep(human_delay)
file_ids: List[str] = []
caption_parts: List[str] = []
try:
for image_url, alt_text in chunk:
if alt_text:
caption_parts.append(alt_text)
if image_url.startswith("file://"):
local_path = _unquote(image_url[7:])
p = Path(local_path)
if not p.exists():
logger.warning("Mattermost: skipping missing image %s", local_path)
continue
fname = p.name
ct = mimetypes.guess_type(fname)[0] or "image/png"
file_data = p.read_bytes()
else:
from tools.url_safety import is_safe_url
if not is_safe_url(image_url):
logger.warning("Mattermost: blocked unsafe image URL in batch")
continue
try:
async with self._session.get(
image_url, timeout=aiohttp.ClientTimeout(total=30)
) as resp:
if resp.status >= 400:
logger.warning(
"Mattermost: failed to download image (HTTP %d): %s",
resp.status, image_url[:80],
)
continue
file_data = await resp.read()
ct = resp.content_type or "image/png"
except Exception as dl_err:
logger.warning("Mattermost: download failed for %s: %s", image_url[:80], dl_err)
continue
fname = image_url.rsplit("/", 1)[-1].split("?")[0] or f"image_{len(file_ids)}.png"
fid = await self._upload_file(chat_id, file_data, fname, ct)
if fid:
file_ids.append(fid)
if not file_ids:
continue
payload: Dict[str, Any] = {
"channel_id": chat_id,
"message": "\n".join(caption_parts),
"file_ids": file_ids,
}
logger.info(
"Mattermost: sending %d image(s) as single post (chunk %d/%d)",
len(file_ids), chunk_idx + 1, len(chunks),
)
data = await self._api_post("posts", payload)
if not data or "id" not in data:
logger.warning("Mattermost: multi-image post failed, falling back")
await super().send_multiple_images(chat_id, chunk, metadata, human_delay=human_delay)
except Exception as e:
logger.warning(
"Mattermost: multi-image send failed (chunk %d/%d), falling back: %s",
chunk_idx + 1, len(chunks), e, exc_info=True,
)
await super().send_multiple_images(chat_id, chunk, metadata, human_delay=human_delay)
# ------------------------------------------------------------------
# WebSocket
# ------------------------------------------------------------------
+8 -41
View File
@@ -243,14 +243,10 @@ class QQAdapter(BasePlatformAdapter):
return False
try:
# Tighter keepalive pool so idle CLOSE_WAIT sockets drain
# faster behind proxies like Cloudflare Warp (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self._http_client = httpx.AsyncClient(
timeout=30.0,
follow_redirects=True,
event_hooks={"response": [_ssrf_redirect_guard]},
limits=platform_httpx_limits(),
)
# 1. Get access token
@@ -397,24 +393,13 @@ class QQAdapter(BasePlatformAdapter):
await self._session.close()
self._session = None
# Honor WSL proxy env for QQ WebSocket. Hermes upgrades overwrite this
# local patch, so QQ can regress to direct-connect timeouts after update.
self._session = aiohttp.ClientSession(trust_env=True)
ws_proxy = (
os.getenv("WSS_PROXY")
or os.getenv("wss_proxy")
or os.getenv("HTTPS_PROXY")
or os.getenv("https_proxy")
or os.getenv("ALL_PROXY")
or os.getenv("all_proxy")
)
self._session = aiohttp.ClientSession()
self._ws = await self._session.ws_connect(
gateway_url,
headers={
"User-Agent": build_user_agent(),
},
timeout=CONNECT_TIMEOUT_SECONDS,
proxy=ws_proxy,
)
logger.info("[%s] WebSocket connected to %s", self._log_tag, gateway_url)
@@ -991,18 +976,6 @@ class QQAdapter(BasePlatformAdapter):
if not channel_id:
return
# Apply group_policy ACL — guild channels are group-like contexts.
# Without this check any member of any guild the bot is in could
# bypass the configured allowlist.
guild_id = str(d.get("guild_id", ""))
author_id = str(author.get("id", ""))
if not self._is_group_allowed(guild_id or channel_id, author_id):
logger.debug(
"[%s] Guild message blocked by ACL: channel=%s user=%s",
self._log_tag, channel_id, author_id,
)
return
member = d.get("member") if isinstance(d.get("member"), dict) else {}
nick = str(member.get("nick", "")) or str(author.get("username", ""))
@@ -1059,17 +1032,6 @@ class QQAdapter(BasePlatformAdapter):
if not guild_id:
return
# Apply dm_policy ACL — guild DMs were previously unauthenticated.
# Without this check any member of any guild the bot is in could
# bypass the configured allowlist via direct messages.
author_id = str(author.get("id", ""))
if not self._is_dm_allowed(author_id):
logger.debug(
"[%s] Guild DM blocked by ACL: guild=%s user=%s",
self._log_tag, guild_id, author_id,
)
return
text = content
att_result = await self._process_attachments(d.get("attachments"))
image_urls = att_result["image_urls"]
@@ -1995,7 +1957,7 @@ class QQAdapter(BasePlatformAdapter):
self, openid: str, content: str, reply_to: Optional[str] = None
) -> SendResult:
"""Send text to a C2C user via REST API."""
self._next_msg_seq(reply_to or openid)
msg_seq = self._next_msg_seq(reply_to or openid)
body = self._build_text_body(content, reply_to)
if reply_to:
body["msg_id"] = reply_to
@@ -2008,7 +1970,7 @@ class QQAdapter(BasePlatformAdapter):
self, group_openid: str, content: str, reply_to: Optional[str] = None
) -> SendResult:
"""Send text to a group via REST API."""
self._next_msg_seq(reply_to or group_openid)
msg_seq = self._next_msg_seq(reply_to or group_openid)
body = self._build_text_body(content, reply_to)
if reply_to:
body["msg_id"] = reply_to
@@ -2173,6 +2135,11 @@ class QQAdapter(BasePlatformAdapter):
# Route
chat_type = self._guess_chat_type(chat_id)
target_path = (
f"/v2/users/{chat_id}/files"
if chat_type == "c2c"
else f"/v2/groups/{chat_id}/files"
)
if chat_type == "guild":
# Guild channels don't support native media upload in the same way
+13 -536
View File
@@ -21,7 +21,7 @@ import time
import uuid
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
from typing import Dict, List, Optional, Any
from urllib.parse import quote, unquote
import httpx
@@ -31,7 +31,6 @@ from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
MessageType,
ProcessingOutcome,
SendResult,
cache_image_from_bytes,
cache_audio_from_bytes,
@@ -39,17 +38,6 @@ from gateway.platforms.base import (
cache_image_from_url,
)
from gateway.platforms.helpers import redact_phone
from gateway.platforms.signal_rate_limit import (
SIGNAL_BATCH_PACING_NOTICE_THRESHOLD,
SIGNAL_MAX_ATTACHMENTS_PER_MSG,
SIGNAL_RATE_LIMIT_MAX_ATTEMPTS,
SignalRateLimitError,
_extract_retry_after_seconds,
_format_wait,
_is_signal_rate_limit_error,
_signal_send_timeout,
get_scheduler,
)
logger = logging.getLogger(__name__)
@@ -64,7 +52,6 @@ SSE_RETRY_DELAY_MAX = 60.0
HEALTH_CHECK_INTERVAL = 30.0 # seconds between health checks
HEALTH_CHECK_STALE_THRESHOLD = 120.0 # seconds without SSE activity before concern
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
@@ -175,10 +162,6 @@ class SignalAdapter(BasePlatformAdapter):
"""Signal messenger adapter using signal-cli HTTP daemon."""
platform = Platform.SIGNAL
# Signal has no real edit API for already-sent messages. Mark it explicitly
# so streaming suppresses the visible cursor instead of leaving a stale tofu
# square behind in chat clients when edit attempts fail.
SUPPORTS_MESSAGE_EDITING = False
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.SIGNAL)
@@ -192,15 +175,6 @@ class SignalAdapter(BasePlatformAdapter):
group_allowed_str = os.getenv("SIGNAL_GROUP_ALLOWED_USERS", "")
self.group_allow_from = set(_parse_comma_list(group_allowed_str))
# DM allowlist — mirrors SIGNAL_ALLOWED_USERS checked by run.py.
# Stored here so the reaction hooks can skip unauthorized senders
# (reactions fire before run.py's auth gate, so without this check
# every inbound DM from any contact gets a 👀 reaction).
# "*" means all users allowed (open mode); empty means no restriction
# recorded at adapter level (run.py still enforces auth separately).
dm_allowed_str = os.getenv("SIGNAL_ALLOWED_USERS", "*")
self.dm_allow_from = set(_parse_comma_list(dm_allowed_str))
# HTTP client
self.client: Optional[httpx.AsyncClient] = None
@@ -257,9 +231,7 @@ class SignalAdapter(BasePlatformAdapter):
except Exception as e:
logger.warning("Signal: Could not acquire phone lock (non-fatal): %s", e)
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self.client = httpx.AsyncClient(timeout=30.0, limits=platform_httpx_limits())
self.client = httpx.AsyncClient(timeout=30.0)
try:
# Health check — verify signal-cli daemon is reachable
try:
@@ -516,11 +488,6 @@ class SignalAdapter(BasePlatformAdapter):
if text and mentions:
text = _render_mentions(text, mentions)
# Extract quote (reply-to) context from Signal dataMessage
quote_data = data_message.get("quote") or {}
reply_to_id = str(quote_data.get("id")) if quote_data.get("id") else None
reply_to_text = quote_data.get("text")
# Process attachments
attachments_data = data_message.get("attachments", [])
media_urls = []
@@ -545,18 +512,6 @@ class SignalAdapter(BasePlatformAdapter):
except Exception:
logger.exception("Signal: failed to fetch attachment %s", att_id)
# Skip envelopes with no meaningful content (no text, no attachments).
# Catches profile key updates, empty messages, and other metadata-only
# envelopes that still carry a dataMessage wrapper but have nothing
# worth processing. See issue: signal-cli logs "Profile key update" +
# Hermes receives msg='' triggering a full agent turn for nothing.
if (not text or not text.strip()) and not media_urls:
logger.debug(
"Signal: skipping contentless envelope from %s (%d attachments)",
redact_phone(sender), len(media_urls) if media_urls else 0,
)
return
# Build session source
source = self.build_source(
chat_id=chat_id,
@@ -586,9 +541,7 @@ class SignalAdapter(BasePlatformAdapter):
else:
timestamp = datetime.now(tz=timezone.utc)
# Build and dispatch event.
# Store raw envelope data in raw_message so on_processing_start/complete
# can extract targetAuthor + targetTimestamp for sendReaction.
# Build and dispatch event
event = MessageEvent(
source=source,
text=text or "",
@@ -596,9 +549,6 @@ class SignalAdapter(BasePlatformAdapter):
media_urls=media_urls,
media_types=media_types,
timestamp=timestamp,
raw_message={"sender": sender, "timestamp_ms": ts_ms},
reply_to_message_id=reply_to_id,
reply_to_text=reply_to_text,
)
logger.debug("Signal: message from %s in %s: %s",
@@ -709,8 +659,6 @@ class SignalAdapter(BasePlatformAdapter):
rpc_id: str = None,
*,
log_failures: bool = True,
raise_on_rate_limit: bool = False,
timeout: float = 30.0,
) -> Any:
"""Send a JSON-RPC 2.0 request to signal-cli daemon.
@@ -719,11 +667,6 @@ class SignalAdapter(BasePlatformAdapter):
repeated NETWORK_FAILURE spam for unreachable recipients while
still preserving visibility for the first occurrence and for
unrelated RPCs.
When ``raise_on_rate_limit=True``, a Signal ``[429]`` /
``RateLimitException`` response raises ``SignalRateLimitError``
instead of being swallowed lets callers (multi-attachment send)
opt into backoff-retry without changing default behaviour.
"""
if not self.client:
logger.warning("Signal: RPC called but client not connected")
@@ -743,28 +686,20 @@ class SignalAdapter(BasePlatformAdapter):
resp = await self.client.post(
f"{self.http_url}/api/v1/rpc",
json=payload,
timeout=timeout,
timeout=30.0,
)
resp.raise_for_status()
data = resp.json()
if "error" in data:
err = data["error"]
if raise_on_rate_limit:
if _is_signal_rate_limit_error(err):
err_msg = str(err.get("message", "")) if isinstance(err, dict) else str(err)
retry_after = _extract_retry_after_seconds(err)
raise SignalRateLimitError(err_msg, retry_after=retry_after)
if log_failures:
logger.warning("Signal RPC error (%s): %s", method, err)
logger.warning("Signal RPC error (%s): %s", method, data["error"])
else:
logger.debug("Signal RPC error (%s): %s", method, err)
logger.debug("Signal RPC error (%s): %s", method, data["error"])
return None
return data.get("result")
except SignalRateLimitError:
raise
except Exception as e:
if log_failures:
logger.warning("Signal RPC %s failed: %s", method, e)
@@ -772,159 +707,6 @@ class SignalAdapter(BasePlatformAdapter):
logger.debug("Signal RPC %s failed: %s", method, e)
return None
# ------------------------------------------------------------------
# Formatting — markdown → Signal body ranges
# ------------------------------------------------------------------
@staticmethod
def _markdown_to_signal(text: str) -> tuple:
"""Convert markdown to plain text + Signal textStyles list.
Signal doesn't render markdown. Instead it uses ``bodyRanges``
(exposed by signal-cli as ``textStyle`` / ``textStyles`` params)
with the format ``start:length:STYLE``.
Positions are measured in **UTF-16 code units** (not Python code
points) because that's what the Signal protocol uses.
Supported styles: BOLD, ITALIC, STRIKETHROUGH, MONOSPACE.
(Signal's SPOILER style is not currently mapped — no standard
markdown syntax for it; would need ``||spoiler||`` parsing.)
Returns ``(plain_text, styles_list)`` where *styles_list* may be
empty if there's nothing to format.
"""
import re
def _utf16_len(s: str) -> int:
"""Length of *s* in UTF-16 code units."""
return len(s.encode("utf-16-le")) // 2
# Pre-process: normalize whitespace before any position tracking
# so later operations don't invalidate recorded offsets.
text = re.sub(r"\n{3,}", "\n\n", text)
text = text.strip()
styles: list = []
# --- Phase 1: fenced code blocks ```...``` → MONOSPACE ---
_CB = re.compile(r"```[a-zA-Z0-9_+-]*\n?(.*?)```", re.DOTALL)
while m := _CB.search(text):
inner = m.group(1).rstrip("\n")
start = m.start()
text = text[: m.start()] + inner + text[m.end() :]
styles.append((start, len(inner), "MONOSPACE"))
# --- Phase 2: heading markers # Foo → Foo (BOLD) ---
_HEADING = re.compile(r"^#{1,6}\s+", re.MULTILINE)
new_text = ""
last_end = 0
for m in _HEADING.finditer(text):
new_text += text[last_end : m.start()]
last_end = m.end()
eol = text.find("\n", m.end())
if eol == -1:
eol = len(text)
heading_text = text[m.end() : eol]
start = len(new_text)
new_text += heading_text
styles.append((start, len(heading_text), "BOLD"))
last_end = eol
new_text += text[last_end:]
text = new_text
# --- Phase 3: inline patterns (single-pass to avoid offset drift) ---
# The old code processed each pattern sequentially, stripping markers
# and recording positions per-pass. Later passes shifted text without
# adjusting earlier positions → bold/italic landed mid-word.
#
# Fix: collect ALL non-overlapping matches first, then strip every
# marker in one pass so positions are computed against the final text.
_PATTERNS = [
(re.compile(r"\*\*(.+?)\*\*", re.DOTALL), "BOLD"),
(re.compile(r"__(.+?)__", re.DOTALL), "BOLD"),
(re.compile(r"~~(.+?)~~", re.DOTALL), "STRIKETHROUGH"),
(re.compile(r"`(.+?)`"), "MONOSPACE"),
(re.compile(r"(?<!\*)\*(?!\*| )(.+?)(?<!\*)\*(?!\*)"), "ITALIC"),
(re.compile(r"(?<!\w)_(?!_)(.+?)(?<!_)_(?!\w)"), "ITALIC"),
]
# Collect all non-overlapping matches (earlier patterns win ties).
all_matches: list = [] # (start, end, g1_start, g1_end, style)
occupied: list = [] # (start, end) intervals already claimed
for pat, style in _PATTERNS:
for m in pat.finditer(text):
ms, me = m.start(), m.end()
if not any(ms < oe and me > os for os, oe in occupied):
all_matches.append((ms, me, m.start(1), m.end(1), style))
occupied.append((ms, me))
all_matches.sort()
# Build removal list so we can adjust Phase 1/2 styles.
# Each match removes its prefix markers (start..g1_start) and
# suffix markers (g1_end..end).
removals: list = [] # (position, length) sorted
for ms, me, g1s, g1e, _ in all_matches:
if g1s > ms:
removals.append((ms, g1s - ms))
if me > g1e:
removals.append((g1e, me - g1e))
removals.sort()
# Adjust Phase 1/2 styles for characters about to be removed.
def _adj(pos: int) -> int:
shift = 0
for rp, rl in removals:
if rp < pos:
shift += min(rl, pos - rp)
else:
break
return pos - shift
adjusted_prior: list = []
for s, l, st in styles:
ns = _adj(s)
ne = _adj(s + l)
if ne > ns:
adjusted_prior.append((ns, ne - ns, st))
# Strip all inline markers in one pass → positions are correct.
result = ""
last_end = 0
inline_styles: list = []
for ms, me, g1s, g1e, sty in all_matches:
result += text[last_end:ms]
pos = len(result)
inner = text[g1s:g1e]
result += inner
inline_styles.append((pos, len(inner), sty))
last_end = me
result += text[last_end:]
text = result
styles = adjusted_prior + inline_styles
# Convert code-point offsets → UTF-16 code-unit offsets
style_strings = []
for cp_start, cp_len, stype in sorted(styles):
# Safety: skip any out-of-bounds styles
if cp_start < 0 or cp_start + cp_len > len(text):
continue
u16_start = _utf16_len(text[:cp_start])
u16_len = _utf16_len(text[cp_start : cp_start + cp_len])
style_strings.append(f"{u16_start}:{u16_len}:{stype}")
return text, style_strings
def format_message(self, content: str) -> str:
"""Strip markdown for plain-text fallback (used by base class).
The actual rich formatting happens in send() via _markdown_to_signal().
"""
# This is only called if someone uses the base-class send path.
# Our send() override bypasses this entirely.
return content
# ------------------------------------------------------------------
# Sending
# ------------------------------------------------------------------
@@ -936,22 +718,14 @@ class SignalAdapter(BasePlatformAdapter):
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send a text message with native Signal formatting."""
"""Send a text message."""
await self._stop_typing_indicator(chat_id)
plain_text, text_styles = self._markdown_to_signal(content)
params: Dict[str, Any] = {
"account": self.account,
"message": plain_text,
"message": content,
}
if text_styles:
if len(text_styles) == 1:
params["textStyle"] = text_styles[0]
else:
params["textStyles"] = text_styles
if chat_id.startswith("group:"):
params["groupId"] = chat_id[6:]
else:
@@ -961,10 +735,11 @@ class SignalAdapter(BasePlatformAdapter):
if result is not None:
self._track_sent_timestamp(result)
# Signal has no editable message identifier. Returning None keeps the
# stream consumer on the non-edit fallback path instead of pretending
# future edits can remove an in-progress cursor from the chat thread.
return SendResult(success=True, message_id=None)
# Use the timestamp from the RPC result as a pseudo message_id.
# Signal doesn't have real message IDs, but the stream consumer
# needs a truthy value to follow its edit→fallback path correctly.
_msg_id = str(result.get("timestamp", "")) if isinstance(result, dict) else None
return SendResult(success=True, message_id=_msg_id or None)
return SendResult(success=False, error="RPC send failed")
def _track_sent_timestamp(self, rpc_result) -> None:
@@ -1028,178 +803,6 @@ class SignalAdapter(BasePlatformAdapter):
self._typing_failures.pop(chat_id, None)
self._typing_skip_until.pop(chat_id, None)
async def send_multiple_images(
self,
chat_id: str,
images: List[Tuple[str, str]],
metadata: Optional[Dict[str, Any]] = None,
human_delay: float = 0.0,
) -> None:
"""Send a batch of images via chunked Signal RPC calls.
Per-image alt texts are dropped Signal's send RPC only carries
one shared message body. Bad images (download failure, missing
file, oversize) are skipped with a warning so one bad URL
doesn't lose the rest of the batch. ``human_delay`` is ignored:
the rate-limit scheduler handles inter-batch pacing.
"""
if not images:
return
scheduler = get_scheduler()
logger.info(
"Signal send_multiple_images: received %d image(s) for %s"
"scheduler state: %s",
len(images), chat_id[:30], scheduler.state(),
)
await self._stop_typing_indicator(chat_id)
attachments: List[str] = []
skipped_download = 0
skipped_missing = 0
skipped_oversize = 0
for image_url, _alt_text in images:
if image_url.startswith("file://"):
file_path = unquote(image_url[7:])
else:
try:
file_path = await cache_image_from_url(image_url)
except Exception as e:
logger.warning("Signal: failed to download image %s: %s", image_url, e)
skipped_download += 1
continue
if not file_path or not Path(file_path).exists():
logger.warning("Signal: image file not found for %s", image_url)
skipped_missing += 1
continue
file_size = Path(file_path).stat().st_size
if file_size > SIGNAL_MAX_ATTACHMENT_SIZE:
logger.warning(
"Signal: image too large (%d bytes), skipping %s", file_size, image_url
)
skipped_oversize += 1
continue
attachments.append(file_path)
if not attachments:
logger.error(
"Signal: no valid images in batch of %d "
"(download=%d missing=%d oversize=%d)",
len(images), skipped_download, skipped_missing, skipped_oversize,
)
return
logger.info(
"Signal send_multiple_images: %d/%d images valid, sending in chunks",
len(attachments), len(images),
)
base_params: Dict[str, Any] = {
"account": self.account,
"message": "",
}
if chat_id.startswith("group:"):
base_params["groupId"] = chat_id[6:]
else:
base_params["recipient"] = [await self._resolve_recipient(chat_id)]
att_batches = [
attachments[i:i + SIGNAL_MAX_ATTACHMENTS_PER_MSG]
for i in range(0, len(attachments), SIGNAL_MAX_ATTACHMENTS_PER_MSG)
]
for idx, att_batch in enumerate(att_batches):
n = len(att_batch)
estimated = scheduler.estimate_wait(n)
logger.debug(
"Signal batch %d/%d: %d attachments, estimated wait=%.1fs",
idx + 1, len(att_batches), n, estimated,
)
if estimated >= SIGNAL_BATCH_PACING_NOTICE_THRESHOLD:
await self._notify_batch_pacing(
chat_id, idx + 1, len(att_batches), estimated
)
params = dict(base_params, attachments=att_batch)
send_timeout = _signal_send_timeout(n)
for attempt in range(1, SIGNAL_RATE_LIMIT_MAX_ATTEMPTS + 1):
await scheduler.acquire(n)
try:
_rpc_t0 = time.monotonic()
result = await self._rpc(
"send", params, raise_on_rate_limit=True, timeout=send_timeout,
)
_rpc_duration = time.monotonic() - _rpc_t0
if result is not None:
self._track_sent_timestamp(result)
await scheduler.report_rpc_duration(_rpc_duration, n)
logger.info(
"Signal batch %d/%d: %d attachments sent in %.1fs "
"(attempt %d/%d)",
idx + 1, len(att_batches), n, _rpc_duration,
attempt, SIGNAL_RATE_LIMIT_MAX_ATTEMPTS,
)
else:
# Assume the server didn't accept the batch, don't deduce tokens
logger.error(
"Signal: RPC send failed for batch %d/%d (%d attachments, "
"attempt %d/%d, rpc_duration=%.1fs)",
idx + 1, len(att_batches), n,
attempt, SIGNAL_RATE_LIMIT_MAX_ATTEMPTS,
_rpc_duration,
)
# Retry transient (non-rate-limit) failures once
if attempt < SIGNAL_RATE_LIMIT_MAX_ATTEMPTS:
backoff = 2.0 ** attempt
logger.info(
"Signal: retrying batch %d/%d after %.1fs backoff",
idx + 1, len(att_batches), backoff,
)
await asyncio.sleep(backoff)
continue
break
except SignalRateLimitError as e:
scheduler.feedback(e.retry_after, n)
if attempt >= SIGNAL_RATE_LIMIT_MAX_ATTEMPTS:
logger.error(
"Signal: rate-limit retries exhausted on batch %d/%d "
"(%d attachments lost, server retry_after=%s)",
idx + 1, len(att_batches), n,
f"{e.retry_after:.0f}s" if e.retry_after else "unknown",
)
break
logger.warning(
"Signal: rate-limited on batch %d/%d "
"(attempt %d/%d, server retry_after=%s); "
"scheduler will pace the retry",
idx + 1, len(att_batches),
attempt, SIGNAL_RATE_LIMIT_MAX_ATTEMPTS,
f"{e.retry_after:.0f}s" if e.retry_after else "unknown",
)
async def _notify_batch_pacing(
self,
chat_id: str,
next_batch_idx: int,
total_batches: int,
wait_s: float,
) -> None:
"""Inform the user when an inter-batch pacing wait crosses the
notice threshold. Best-effort; logs and continues on failure."""
try:
await self.send(
chat_id,
f"(More images coming — pausing ~{_format_wait(wait_s)} "
f"for Signal rate limit, batch {next_batch_idx}/{total_batches}.)",
)
except Exception as e:
logger.warning("Signal: failed to send pacing notice: %s", e)
async def send_image(
self,
chat_id: str,
@@ -1360,132 +963,6 @@ class SignalAdapter(BasePlatformAdapter):
_keep_typing finally block to clean up platform-level typing tasks."""
await self._stop_typing_indicator(chat_id)
# ------------------------------------------------------------------
# Reactions
# ------------------------------------------------------------------
async def send_reaction(
self,
chat_id: str,
emoji: str,
target_author: str,
target_timestamp: int,
) -> bool:
"""Send a reaction emoji to a specific message via signal-cli RPC.
Args:
chat_id: The chat (phone number or "group:<id>")
emoji: Reaction emoji string (e.g. "👀", "")
target_author: Phone number / UUID of the message author
target_timestamp: Signal timestamp (ms) of the message to react to
"""
params: Dict[str, Any] = {
"account": self.account,
"emoji": emoji,
"targetAuthor": target_author,
"targetTimestamp": target_timestamp,
}
if chat_id.startswith("group:"):
params["groupId"] = chat_id[6:]
else:
params["recipient"] = [chat_id]
result = await self._rpc("sendReaction", params)
if result is not None:
return True
logger.debug("Signal: sendReaction failed (chat=%s, emoji=%s)", chat_id[:20], emoji)
return False
async def remove_reaction(
self,
chat_id: str,
target_author: str,
target_timestamp: int,
) -> bool:
"""Remove a reaction by sending an empty-string emoji."""
params: Dict[str, Any] = {
"account": self.account,
"emoji": "",
"targetAuthor": target_author,
"targetTimestamp": target_timestamp,
"remove": True,
}
if chat_id.startswith("group:"):
params["groupId"] = chat_id[6:]
else:
params["recipient"] = [chat_id]
result = await self._rpc("sendReaction", params)
return result is not None
# ------------------------------------------------------------------
# Processing Lifecycle Hooks (reactions as progress indicators)
# ------------------------------------------------------------------
def _extract_reaction_target(self, event: MessageEvent) -> Optional[tuple]:
"""Extract (target_author, target_timestamp) from a MessageEvent.
Returns None if the event doesn't carry the raw Signal envelope data
needed for sendReaction.
"""
raw = event.raw_message
if not isinstance(raw, dict):
return None
author = raw.get("sender")
ts = raw.get("timestamp_ms")
if not author or not ts:
return None
return (author, ts)
def _reactions_enabled(self, event: "MessageEvent" = None) -> bool:
"""Check if message reactions are enabled for this event.
Two gates:
1. SIGNAL_REACTIONS env var set to false/0/no to disable globally.
2. DM allowlist if SIGNAL_ALLOWED_USERS is set, only react to
messages from senders in that list. This prevents unauthorized
contacts from seeing the 👀 reaction (which fires before run.py's
auth gate and would otherwise reveal that a bot is listening).
"""
if os.getenv("SIGNAL_REACTIONS", "true").lower() in ("false", "0", "no"):
return False
if event is not None:
sender = getattr(getattr(event, "source", None), "user_id", None)
if sender and "*" not in self.dm_allow_from and sender not in self.dm_allow_from:
return False
return True
async def on_processing_start(self, event: MessageEvent) -> None:
"""React with 👀 when processing begins."""
if not self._reactions_enabled(event):
return
target = self._extract_reaction_target(event)
if target:
await self.send_reaction(event.source.chat_id, "👀", *target)
async def on_processing_complete(self, event: MessageEvent, outcome: "ProcessingOutcome") -> None:
"""Swap the 👀 reaction for ✅ (success) or ❌ (failure).
On CANCELLED we leave the 👀 in place no terminal outcome means
the reaction should keep reflecting "in progress" (matches Telegram).
"""
if not self._reactions_enabled(event):
return
if outcome == ProcessingOutcome.CANCELLED:
return
target = self._extract_reaction_target(event)
if not target:
return
chat_id = event.source.chat_id
# Remove the in-progress reaction, then add the final one
await self.remove_reaction(chat_id, *target)
if outcome == ProcessingOutcome.SUCCESS:
await self.send_reaction(chat_id, "", *target)
elif outcome == ProcessingOutcome.FAILURE:
await self.send_reaction(chat_id, "", *target)
# ------------------------------------------------------------------
# Chat Info
# ------------------------------------------------------------------
-369
View File
@@ -1,369 +0,0 @@
"""
Signal attachment rate-limit scheduler.
Process-wide token-bucket simulator that mirrors the per-account
attachment rate limit signal-cli/Signal-Server enforce. Producers
(``SignalAdapter.send_multiple_images`` and the ``send_message`` tool's
Signal path) call ``acquire(n)`` before an attachment send; on a 429
they call ``feedback(retry_after, n)`` so the model recalibrates from
the server's authoritative hint.
The scheduler serializes concurrent calls through an ``asyncio.Lock``,
giving FIFO fairness across agent sessions sharing one signal-cli
daemon.
"""
from __future__ import annotations
import asyncio
import logging
import re
import time
from typing import Any, Optional
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------
SIGNAL_MAX_ATTACHMENTS_PER_MSG = 32 # per-message attachment cap (source: Signal-{Android,Desktop} source code)
SIGNAL_RATE_LIMIT_BUCKET_CAPACITY = 50 # server-side token-bucket capacity for attachments rate limiting
SIGNAL_RATE_LIMIT_DEFAULT_RETRY_AFTER = 4 # fallback token refill interval for signal-cli < v0.14.3
SIGNAL_RATE_LIMIT_MAX_ATTEMPTS = 2 # initial attempt + 1 retry
SIGNAL_BATCH_PACING_NOTICE_THRESHOLD = 10.0 # if estimated waiting time > 10s, notify the user about the delay
SIGNAL_RPC_ERROR_RATELIMIT = -5 # signal-cli (v0.14.3+) JSON-RPC error code for RateLimitException
# ---------------------------------------------------------------------------
# Errors
# ---------------------------------------------------------------------------
class SignalRateLimitError(Exception):
"""
Raised by ``SignalAdapter._rpc`` for rate-limit responses when the
caller has opted in via ``raise_on_rate_limit=True``.
Carries the server-supplied per-token Retry-After (in seconds) on
signal-cli v0.14.3
``retry_after`` is None when the version doesn't expose it.
"""
def __init__(self, message: str, retry_after: Optional[float] = None) -> None:
super().__init__(message)
self.retry_after = retry_after
class SignalSchedulerError(Exception):
pass
# ---------------------------------------------------------------------------
# Detection helpers — used to fish a 429 out of signal-cli's various error
# shapes (typed code, [429] substring, libsignal-net RetryLaterException
# leaked through AttachmentInvalidException).
# ---------------------------------------------------------------------------
# "Retry after 4 seconds" / "retry after 4 second" — libsignal-net's
# RetryLaterException string form, surfaced when 429s hit during
# attachment upload (signal-cli wraps these as AttachmentInvalidException
# rather than RateLimitException, so the typed path doesn't fire).
_RETRY_AFTER_RE = re.compile(r"Retry after (\d+(?:\.\d+)?)\s*second", re.IGNORECASE)
def _extract_retry_after_seconds(err: Any) -> Optional[float]:
"""Pull the per-token Retry-After window from a signal-cli rate-limit error.
Tries two sources, in order:
1. ``error.data.response.results[*].retryAfterSeconds`` the
structured field signal-cli v0.14.3 surfaces for plain
RateLimitException.
2. ``"Retry after N seconds"`` parsed out of the message covers
libsignal-net's RetryLaterException that gets wrapped as
AttachmentInvalidException during attachment upload, where the
structured field stays null.
Returns None when neither yields a value.
"""
msg = ""
if isinstance(err, dict):
data = err.get("data") or {}
response = data.get("response") or {}
results = response.get("results") or []
candidates = [
r.get("retryAfterSeconds") for r in results
if isinstance(r, dict) and r.get("retryAfterSeconds")
]
if candidates:
return float(max(candidates))
msg = str(err.get("message", ""))
else:
msg = str(err)
match = _RETRY_AFTER_RE.search(msg)
return float(match.group(1)) if match else None
def _is_signal_rate_limit_error(err: Any) -> bool:
"""True if a signal-cli RPC error reflects a rate-limit failure.
Matches three layers:
- typed ``RATELIMIT_ERROR`` code (signal-cli v0.14.3, plain
RateLimitException)
- legacy ``[429] / RateLimitException`` substrings
- libsignal-net's ``RetryLaterException`` / ``Retry after N seconds``
surfaced inside ``AttachmentInvalidException`` when the rate
limit is hit during attachment upload signal-cli never re-tags
these as RateLimitException, so substring is the only signal.
"""
if isinstance(err, dict) and err.get("code") == SIGNAL_RPC_ERROR_RATELIMIT:
return True
message = (
str(err.get("message", ""))
if isinstance(err, dict)
else str(err)
)
msg_lower = message.lower()
return (
"[429]" in message
or "ratelimit" in msg_lower
or "retrylaterexception" in msg_lower
or "retry after" in msg_lower
)
# ---------------------------------------------------------------------------
# Misc helpers
# ---------------------------------------------------------------------------
def _format_wait(seconds: float) -> str:
"""Human-friendly wait label for user-facing pacing notices."""
s = max(0.0, seconds)
if s < 90:
return f"{int(round(s))}s"
return f"{max(1, int(round(s / 60)))} min"
def _signal_send_timeout(num_attachments: int) -> float:
"""HTTP timeout for a Signal ``send`` RPC.
signal-cli uploads attachments serially during the call, so the
server-side time scales with batch size. Default 30s is fine for
text-only sends but truncates large attachment batches mid-upload
we then log a phantom failure even though signal-cli completes the
send a few seconds later. Scale at 5s/attachment with a 60s floor.
"""
if num_attachments <= 0:
return 30.0
return max(60.0, 5.0 * num_attachments)
# ---------------------------------------------------------------------------
# Scheduler
# ---------------------------------------------------------------------------
class SignalAttachmentScheduler:
"""Process-wide token-bucket simulator for Signal attachment sends.
The bucket holds up to ``capacity`` tokens (default 50, matching
Signal's server-side rate-limit bucket size). Each attachment consumes one
token. Tokens refill at ``refill_rate`` tokens/second, calibrated
from the per-token Retry-After hint we get from the server when a
429 fires. Until we've observed one, we use the documented default
(1 token / 4 seconds).
Concurrent ``acquire(n)`` calls serialize through an
``asyncio.Lock`` natural FIFO across agent sessions hitting the
same daemon.
"""
def __init__(
self,
capacity: float = float(SIGNAL_RATE_LIMIT_BUCKET_CAPACITY),
default_retry_after: float = float(SIGNAL_RATE_LIMIT_DEFAULT_RETRY_AFTER),
) -> None:
self.capacity = float(capacity)
self.tokens = float(capacity)
self.refill_rate = 1.0 / float(default_retry_after)
self.last_refill = time.monotonic()
self._lock = asyncio.Lock()
# ------------------------------------------------------------------
# Internals
# ------------------------------------------------------------------
def _refill(self) -> None:
now = time.monotonic()
elapsed = now - self.last_refill
if elapsed > 0 and self.tokens < self.capacity:
self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
self.last_refill = now
# ------------------------------------------------------------------
# Public API
# ------------------------------------------------------------------
def estimate_wait(self, n: int) -> float:
"""Best-effort estimate of the seconds until ``n`` tokens would
be available. Used to decide whether to emit a user-facing
pacing notice *before* committing to an ``acquire`` that may
block silently. Lock-free; small races vs. concurrent acquires
are benign for an informational notice.
"""
now = time.monotonic()
elapsed = now - self.last_refill
projected = self.tokens
if elapsed > 0 and projected < self.capacity:
projected = min(self.capacity, projected + elapsed * self.refill_rate)
deficit = n - projected
if deficit <= 0:
return 0.0
return deficit / self.refill_rate
async def acquire(self, n: int) -> float:
"""Block until at least ``n`` tokens are available, return the
seconds slept.
Does **not** deduct tokens the bucket is a read-only model of
server-side capacity. Call ``report_rpc_duration()`` after the
RPC to synchronise the model with the server timeline.
Not perfect in case lots of coroutines try to acquire for big
uploads (``report_rpc_duration`` will take a long time to get hit)
but this is just a simulation. Signal server is ground truth and
will raise rate-limit exceptions triggering requeues.
The lock is released during ``asyncio.sleep`` so other callers
can interleave. A retry loop re-checks after each sleep in
case the deadline was pessimistic.
"""
if n <= 0:
return 0.0
if n > self.capacity:
raise SignalSchedulerError(
f"Signal scheduler was called requesting {n} tokens "
f"(max is {self.capacity})",
)
total_slept = 0.0
first_pass = True
while True:
async with self._lock:
self._refill()
if self.tokens >= n:
if not first_pass or total_slept > 0:
logger.debug(
"Signal scheduler: tokens sufficient for %d "
"(remaining=%.1f, total_slept=%.1fs)",
n, self.tokens, total_slept,
)
return total_slept
deficit = n - self.tokens
wait = deficit / self.refill_rate
if first_pass:
logger.info(
"Signal scheduler: pausing %.1fs for %d tokens "
"(available=%.1f, deficit=%.1f, refill=%.4f/s ≈ %.1fs/token)",
wait, n, self.tokens, deficit,
self.refill_rate, 1.0 / self.refill_rate,
)
first_pass = False
await asyncio.sleep(wait)
total_slept += wait
async def report_rpc_duration(self, rpc_duration: float, n_attachments: int) -> None:
"""Record an attachment-send RPC that just completed.
Deducts ``n_attachments`` tokens without crediting refill during
the upload window. Signal's server checks the bucket at RPC start
and does *not* refill during request processing refill resumes
after the response. Crediting upload-time refill causes cumulative
drift that eventually triggers 429s.
Advances ``last_refill`` so the next ``acquire`` / ``_refill``
starts counting from this point.
"""
if n_attachments <= 0:
return
async with self._lock:
now = time.monotonic()
token_before = self.tokens
self.tokens = max(0.0, token_before - float(n_attachments))
self.last_refill = now
logger.log(
logging.INFO if rpc_duration > 10 and n_attachments > 5 else logging.DEBUG,
"Signal scheduler: RPC for %d att took %.1fs — "
"tokens %.1f%.1f (deducted=%d, no upload refill credited, refill=%.4fs⁻¹)",
n_attachments, rpc_duration,
token_before, self.tokens,
n_attachments, self.refill_rate,
)
def feedback(self, retry_after: Optional[float], n_attempted: int) -> None:
"""Apply server feedback after a 429.
``retry_after`` is the per-*token* refill window the server
reports (None when signal-cli is older than v0.14.3 and didn't
surface it).
When present we calibrate ``refill_rate`` from it:
the server is authoritative.
"""
if retry_after and retry_after > 0:
new_rate = 1.0 / float(retry_after)
if new_rate != self.refill_rate:
logger.info(
"Signal scheduler: calibrating refill_rate to %.4f tokens/sec "
"(server retry_after=%.1fs per token)",
new_rate, retry_after,
)
self.refill_rate = new_rate
self.tokens = 0.0
self.last_refill = time.monotonic()
def state(self) -> dict:
"""Return current scheduler state for diagnostic logging (read-only).
Does not advance ``last_refill`` safe to call from logging paths
without perturbing the bucket.
"""
now = time.monotonic()
elapsed = now - self.last_refill
projected = self.tokens
if elapsed > 0 and projected < self.capacity:
projected = min(self.capacity, projected + elapsed * self.refill_rate)
return {
"tokens": round(projected, 1),
"capacity": int(self.capacity),
"refill_rate": round(self.refill_rate, 4),
"refill_seconds_per_token": round(1.0 / self.refill_rate, 1) if self.refill_rate > 0 else float("inf"),
}
# ---------------------------------------------------------------------------
# Process-wide singleton
# ---------------------------------------------------------------------------
_scheduler: Optional[SignalAttachmentScheduler] = None
def get_scheduler() -> SignalAttachmentScheduler:
"""Return the process-wide scheduler, creating it on first access."""
global _scheduler
if _scheduler is None:
_scheduler = SignalAttachmentScheduler()
logger.info(
"Signal scheduler: created (capacity=%d tokens, refill=%.4f/s ≈ %.1fs/token)",
int(_scheduler.capacity),
_scheduler.refill_rate,
1.0 / _scheduler.refill_rate,
)
return _scheduler
def _reset_scheduler() -> None:
"""Drop the cached scheduler so the next ``get_scheduler`` call
builds a fresh one. Test-only never call from production paths."""
global _scheduler
_scheduler = None
+82 -1264
View File
File diff suppressed because it is too large Load Diff
+5 -9
View File
@@ -10,7 +10,7 @@ Shares credentials with the optional telephony skill — same env vars:
Gateway-specific env vars:
- SMS_WEBHOOK_PORT (default 8080)
- SMS_WEBHOOK_HOST (default 127.0.0.1)
- SMS_WEBHOOK_HOST (default 0.0.0.0)
- SMS_WEBHOOK_URL (public URL for Twilio signature validation required)
- SMS_INSECURE_NO_SIGNATURE (true to disable signature validation dev only)
- SMS_ALLOWED_USERS (comma-separated E.164 phone numbers)
@@ -41,7 +41,7 @@ logger = logging.getLogger(__name__)
TWILIO_API_BASE = "https://api.twilio.com/2010-04-01/Accounts"
MAX_SMS_LENGTH = 1600 # ~10 SMS segments
DEFAULT_WEBHOOK_PORT = 8080
DEFAULT_WEBHOOK_HOST = "127.0.0.1"
DEFAULT_WEBHOOK_HOST = "0.0.0.0"
def check_sms_requirements() -> bool:
@@ -91,23 +91,19 @@ class SmsAdapter(BasePlatformAdapter):
from aiohttp import web
if not self._from_number:
msg = "[sms] TWILIO_PHONE_NUMBER not set — cannot send replies"
logger.error(msg)
self._set_fatal_error("sms_missing_phone_number", msg, retryable=False)
logger.error("[sms] TWILIO_PHONE_NUMBER not set — cannot send replies")
return False
insecure_no_sig = os.getenv("SMS_INSECURE_NO_SIGNATURE", "").lower() == "true"
if not self._webhook_url and not insecure_no_sig:
msg = (
logger.error(
"[sms] Refusing to start: SMS_WEBHOOK_URL is required for Twilio "
"signature validation. Set it to the public URL configured in your "
"Twilio console (e.g. https://example.com/webhooks/twilio). "
"For local development without validation, set "
"SMS_INSECURE_NO_SIGNATURE=true (NOT recommended for production)."
"SMS_INSECURE_NO_SIGNATURE=true (NOT recommended for production).",
)
logger.error(msg)
self._set_fatal_error("sms_missing_webhook_url", msg, retryable=False)
return False
if insecure_no_sig and not self._webhook_url:
+41 -594
View File
@@ -84,7 +84,6 @@ from gateway.platforms.telegram_network import (
discover_fallback_ips,
parse_fallback_ip_env,
)
from utils import atomic_replace
def check_telegram_requirements() -> bool:
@@ -123,12 +122,12 @@ def _strip_mdv2(text: str) -> str:
# ---------------------------------------------------------------------------
# Markdown table → Telegram-friendly row groups
# Markdown table → code block conversion
# ---------------------------------------------------------------------------
# Telegram's MarkdownV2 has no table syntax — '|' is just an escaped literal,
# so pipe tables render as noisy backslash-pipe text with no alignment.
# Reformating each row into a bold heading plus bullet list keeps the content
# readable on mobile clients while preserving the source data.
# Wrapping the table in a fenced code block makes Telegram render it as
# monospace preformatted text with columns intact.
# Matches a GFM table delimiter row: optional outer pipes, cells containing
# only dashes (with optional leading/trailing colons for alignment) separated
@@ -145,49 +144,13 @@ def _is_table_row(line: str) -> bool:
return bool(stripped) and '|' in stripped
def _split_markdown_table_row(line: str) -> list[str]:
"""Split a simple GFM table row into stripped cell values."""
stripped = line.strip()
if stripped.startswith("|"):
stripped = stripped[1:]
if stripped.endswith("|"):
stripped = stripped[:-1]
return [cell.strip() for cell in stripped.split("|")]
def _render_table_block_for_telegram(table_block: list[str]) -> str:
"""Render a detected GFM table as Telegram-friendly row groups."""
if len(table_block) < 3:
return "\n".join(table_block)
headers = _split_markdown_table_row(table_block[0])
if len(headers) < 2:
return "\n".join(table_block)
rendered_rows: list[str] = []
for index, row in enumerate(table_block[2:], start=1):
cells = _split_markdown_table_row(row)
if len(cells) < len(headers):
cells.extend([""] * (len(headers) - len(cells)))
elif len(cells) > len(headers):
cells = cells[: len(headers)]
heading = next((cell for cell in cells if cell), f"Row {index}")
rendered_rows.append(f"**{heading}**")
rendered_rows.extend(
f"{header}: {value}" for header, value in zip(headers, cells)
)
return "\n\n".join(rendered_rows)
def _wrap_markdown_tables(text: str) -> str:
"""Rewrite GFM-style pipe tables into Telegram-friendly bullet groups.
"""Wrap GFM-style pipe tables in ``` fences so Telegram renders them.
Detected by a row containing '|' immediately followed by a delimiter
row matching :data:`_TABLE_SEPARATOR_RE`. Subsequent pipe-containing
non-blank lines are consumed as the table body and rewritten as
per-row bullet groups. Tables inside existing fenced code blocks are left
non-blank lines are consumed as the table body and included in the
wrapped block. Tables inside existing fenced code blocks are left
alone.
"""
if '|' not in text or '-' not in text:
@@ -224,7 +187,9 @@ def _wrap_markdown_tables(text: str) -> str:
while j < len(lines) and _is_table_row(lines[j]):
table_block.append(lines[j])
j += 1
out.append(_render_table_block_for_telegram(table_block))
out.append('```')
out.extend(table_block)
out.append('```')
i = j
continue
@@ -237,14 +202,14 @@ def _wrap_markdown_tables(text: str) -> str:
class TelegramAdapter(BasePlatformAdapter):
"""
Telegram bot adapter.
Handles:
- Receiving messages from users and groups
- Sending responses with Telegram markdown
- Forum topics (thread_id support)
- Media messages
"""
# Telegram message limits
MAX_MESSAGE_LENGTH = 4096
# Threshold for detecting Telegram client-side message splits.
@@ -252,7 +217,7 @@ class TelegramAdapter(BasePlatformAdapter):
_SPLIT_THRESHOLD = 4000
MEDIA_GROUP_WAIT_SECONDS = 0.8
_GENERAL_TOPIC_THREAD_ID = "1"
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.TELEGRAM)
self._app: Optional[Application] = None
@@ -286,57 +251,15 @@ class TelegramAdapter(BasePlatformAdapter):
self._model_picker_state: Dict[str, dict] = {}
# Approval button state: message_id → session_key
self._approval_state: Dict[int, str] = {}
# Slash-confirm button state: confirm_id → session_key (for /reload-mcp
# and any other slash-confirm prompts; see GatewayRunner._request_slash_confirm).
self._slash_confirm_state: Dict[str, str] = {}
def _is_callback_user_authorized(
self,
user_id: str,
*,
chat_id: Optional[str] = None,
chat_type: Optional[str] = None,
thread_id: Optional[str] = None,
user_name: Optional[str] = None,
) -> bool:
@staticmethod
def _is_callback_user_authorized(user_id: str) -> bool:
"""Return whether a Telegram inline-button caller may perform gated actions."""
normalized_user_id = str(user_id or "").strip()
if not normalized_user_id:
return False
runner = getattr(getattr(self, "_message_handler", None), "__self__", None)
auth_fn = getattr(runner, "_is_user_authorized", None)
if callable(auth_fn):
try:
from gateway.session import SessionSource
normalized_chat_type = str(chat_type or "dm").strip().lower() or "dm"
if normalized_chat_type == "private":
normalized_chat_type = "dm"
elif normalized_chat_type == "supergroup":
normalized_chat_type = "forum" if thread_id is not None else "group"
source = SessionSource(
platform=Platform.TELEGRAM,
chat_id=str(chat_id or normalized_user_id),
chat_type=normalized_chat_type,
user_id=normalized_user_id,
user_name=str(user_name).strip() if user_name else None,
thread_id=str(thread_id) if thread_id is not None else None,
)
return bool(auth_fn(source))
except Exception:
logger.debug(
"[Telegram] Falling back to env-only callback auth for user %s",
normalized_user_id,
exc_info=True,
)
allowed_csv = os.getenv("TELEGRAM_ALLOWED_USERS", "").strip()
if not allowed_csv:
return True
allowed_ids = {uid.strip() for uid in allowed_csv.split(",") if uid.strip()}
return "*" in allowed_ids or normalized_user_id in allowed_ids
return "*" in allowed_ids or user_id in allowed_ids
@classmethod
def _metadata_thread_id(cls, metadata: Optional[Dict[str, Any]]) -> Optional[str]:
@@ -411,49 +334,6 @@ class TelegramAdapter(BasePlatformAdapter):
return {"link_preview_options": LinkPreviewOptions(is_disabled=True)}
return {"disable_web_page_preview": True}
async def _drain_polling_connections(self) -> None:
"""Reset the httpx connection pool used for getUpdates polling.
Network errors (especially through proxies like sing-box) can leave
httpx connections in a half-closed state that still occupy pool slots.
After enough reconnect cycles the pool fills up entirely, causing
``Pool timeout: All connections in the connection pool are occupied.``
We reset ONLY ``_request[0]`` (the getUpdates request) the general
request (``_request[1]``) is left untouched so concurrent
``send_message`` / ``edit_message`` calls are never interrupted.
Implementation note: accesses ``Bot._request[0]`` which is the
get-updates ``BaseRequest`` in the PTB 22.x internal tuple
``(get_updates_request, general_request)``. There is no public
accessor for the polling request; review if upgrading to PTB 23+.
"""
if not (self._app and self._app.bot):
return
try:
# PTB 22.x: _request is a (get_updates, general) tuple;
# no public accessor exists for the polling request.
polling_req = self._app.bot._request[0] # noqa: SLF001
except Exception:
return
try:
await polling_req.shutdown()
except Exception:
logger.debug(
"[%s] Polling request shutdown failed (non-fatal)",
self.name, exc_info=True,
)
try:
await polling_req.initialize()
logger.debug(
"[%s] Polling request pool drained before reconnect", self.name
)
except Exception:
logger.debug(
"[%s] Polling request re-initialize failed (non-fatal)",
self.name, exc_info=True,
)
async def _handle_polling_network_error(self, error: Exception) -> None:
"""Reconnect polling after a transient network interruption.
@@ -499,8 +379,6 @@ class TelegramAdapter(BasePlatformAdapter):
except Exception:
pass
await self._drain_polling_connections()
try:
await self._app.updater.start_polling(
allowed_updates=Update.ALL_TYPES,
@@ -512,17 +390,6 @@ class TelegramAdapter(BasePlatformAdapter):
self.name, attempt,
)
self._polling_network_error_count = 0
# start_polling() returning is necessary but not sufficient:
# PTB's Updater can be left in a state where `running` is True
# but the underlying long-poll task is wedged on a stale httpx
# connection and never makes progress. No error_callback fires
# in that state, so the reconnect ladder won't advance on its
# own. Schedule a deferred probe to detect the wedge and
# re-enter the ladder if needed.
if not self.has_fatal_error:
probe = asyncio.ensure_future(self._verify_polling_after_reconnect())
self._background_tasks.add(probe)
probe.add_done_callback(self._background_tasks.discard)
except Exception as retry_err:
logger.warning("[%s] Telegram polling reconnect failed: %s", self.name, retry_err)
# start_polling failed — polling is dead and no further error
@@ -534,50 +401,6 @@ class TelegramAdapter(BasePlatformAdapter):
self._background_tasks.add(task)
task.add_done_callback(self._background_tasks.discard)
async def _verify_polling_after_reconnect(self) -> None:
"""Heartbeat probe scheduled after a successful reconnect.
PTB's Updater can survive a botched stop()+start_polling() cycle
with `running=True` but a wedged consumer task. No error callback
fires, so the reconnect ladder doesn't advance on its own. This
probe detects the wedge by:
1. Sleeping HEARTBEAT_PROBE_DELAY so a healthy long-poll has time
to complete at least one cycle.
2. Verifying `Updater.running` is still True.
3. Probing the bot endpoint with a tight asyncio timeout. A
wedged httpx pool fails this probe; a healthy one returns
well under the timeout.
On any failure, re-enter the reconnect ladder so the existing
MAX_NETWORK_RETRIES path can ultimately escalate to fatal-error.
"""
HEARTBEAT_PROBE_DELAY = 60
PROBE_TIMEOUT = 10
await asyncio.sleep(HEARTBEAT_PROBE_DELAY)
if self.has_fatal_error:
return
if not (self._app and self._app.updater and self._app.updater.running):
logger.warning(
"[%s] Updater not running %ds after reconnect — treating as wedged",
self.name, HEARTBEAT_PROBE_DELAY,
)
await self._handle_polling_network_error(
RuntimeError("Updater not running after reconnect heartbeat")
)
return
try:
await asyncio.wait_for(self._app.bot.get_me(), PROBE_TIMEOUT)
except Exception as probe_err:
logger.warning(
"[%s] Polling heartbeat probe failed %ds after reconnect: %s",
self.name, HEARTBEAT_PROBE_DELAY, probe_err,
)
await self._handle_polling_network_error(probe_err)
async def _handle_polling_conflict(self, error: Exception) -> None:
if self.has_fatal_error and self.fatal_error_code == "telegram_polling_conflict":
return
@@ -603,7 +426,6 @@ class TelegramAdapter(BasePlatformAdapter):
except Exception:
pass
await asyncio.sleep(RETRY_DELAY)
await self._drain_polling_connections()
try:
await self._app.updater.start_polling(
allowed_updates=Update.ALL_TYPES,
@@ -688,29 +510,6 @@ class TelegramAdapter(BasePlatformAdapter):
)
return None
async def rename_dm_topic(
self,
chat_id: int,
thread_id: int,
name: str,
) -> None:
"""Rename a forum topic in a private (DM) chat."""
if not self._bot:
return
try:
chat_id_arg = int(chat_id)
except (TypeError, ValueError):
chat_id_arg = chat_id
await self._bot.edit_forum_topic(
chat_id=chat_id_arg,
message_thread_id=int(thread_id),
name=name,
)
logger.info(
"[%s] Renamed DM topic in chat %s thread_id=%s -> '%s'",
self.name, chat_id, thread_id, name,
)
def _persist_dm_topic_thread_id(self, chat_id: int, topic_name: str, thread_id: int) -> None:
"""Save a newly created thread_id back into config.yaml so it persists across restarts."""
try:
@@ -755,7 +554,7 @@ class TelegramAdapter(BasePlatformAdapter):
_yaml.dump(config, f, default_flow_style=False, sort_keys=False)
f.flush()
os.fsync(f.fileno())
atomic_replace(tmp_path, config_path)
os.replace(tmp_path, config_path)
except BaseException:
try:
os.unlink(tmp_path)
@@ -839,20 +638,6 @@ class TelegramAdapter(BasePlatformAdapter):
# Persist thread_id to config so we don't recreate on next restart
self._persist_dm_topic_thread_id(int(chat_id), topic_name, thread_id)
# Send a seed message so the topic is visible in Telegram's client.
# Empty topics are hidden by the client UI until they contain a message.
try:
await self._bot.send_message(
chat_id=int(chat_id),
message_thread_id=thread_id,
text=f"\U0001f4cc {topic_name}",
)
except Exception as seed_err:
logger.debug(
"[%s] Could not send seed message to topic '%s': %s",
self.name, topic_name, seed_err,
)
async def connect(self) -> bool:
"""Connect to Telegram via polling or webhook.
@@ -1128,7 +913,7 @@ class TelegramAdapter(BasePlatformAdapter):
self._set_fatal_error("telegram_connect_error", message, retryable=True)
logger.error("[%s] Failed to connect to Telegram: %s", self.name, e, exc_info=True)
return False
async def disconnect(self) -> None:
"""Stop polling/webhook, cancel pending album flushes, and disconnect."""
pending_media_group_tasks = list(self._media_group_tasks.values())
@@ -1424,35 +1209,9 @@ class TelegramAdapter(BasePlatformAdapter):
)
return SendResult(success=False, error=str(e))
async def delete_message(self, chat_id: str, message_id: str) -> bool:
"""Delete a previously sent Telegram message.
Used by the stream consumer's fresh-final cleanup path (ported
from openclaw/openclaw#72038) to remove long-lived preview
messages after sending the completed reply as a fresh message.
Telegram's Bot API ``deleteMessage`` works for bot-posted
messages in the last 48 hours. Failures are non-fatal the
caller leaves the preview in place and logs at debug level.
"""
if not self._bot:
return False
try:
await self._bot.delete_message(
chat_id=int(chat_id),
message_id=int(message_id),
)
return True
except Exception as e:
logger.debug(
"[%s] Failed to delete Telegram message %s: %s",
self.name, message_id, e,
)
return False
async def send_update_prompt(
self, chat_id: str, prompt: str, default: str = "",
session_key: str = "",
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send an inline-keyboard update prompt (Yes / No buttons).
@@ -1470,14 +1229,11 @@ class TelegramAdapter(BasePlatformAdapter):
InlineKeyboardButton("✗ No", callback_data="update_prompt:n"),
]
])
thread_id = self._metadata_thread_id(metadata)
message_thread_id = self._message_thread_id_for_send(thread_id)
msg = await self._bot.send_message(
chat_id=int(chat_id),
text=text,
parse_mode=ParseMode.MARKDOWN,
reply_markup=keyboard,
message_thread_id=message_thread_id,
**self._link_preview_kwargs(),
)
return SendResult(success=True, message_id=str(msg.message_id))
@@ -1549,48 +1305,6 @@ class TelegramAdapter(BasePlatformAdapter):
logger.warning("[%s] send_exec_approval failed: %s", self.name, e)
return SendResult(success=False, error=str(e))
async def send_slash_confirm(
self, chat_id: str, title: str, message: str, session_key: str,
confirm_id: str, metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Render a three-button slash-command confirmation prompt."""
if not self._bot:
return SendResult(success=False, error="Not connected")
try:
# Message body: render as plain text (message already contains
# markdown formatting from the gateway primitive).
preview = message if len(message) <= 3800 else message[:3800] + "..."
keyboard = InlineKeyboardMarkup([
[
InlineKeyboardButton("✅ Approve Once", callback_data=f"sc:once:{confirm_id}"),
InlineKeyboardButton("🔒 Always Approve", callback_data=f"sc:always:{confirm_id}"),
],
[
InlineKeyboardButton("❌ Cancel", callback_data=f"sc:cancel:{confirm_id}"),
],
])
thread_id = self._metadata_thread_id(metadata)
kwargs: Dict[str, Any] = {
"chat_id": int(chat_id),
"text": preview,
"parse_mode": ParseMode.MARKDOWN,
"reply_markup": keyboard,
**self._link_preview_kwargs(),
}
message_thread_id = self._message_thread_id_for_send(thread_id)
if message_thread_id is not None:
kwargs["message_thread_id"] = message_thread_id
msg = await self._bot.send_message(**kwargs)
self._slash_confirm_state[confirm_id] = session_key
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
logger.warning("[%s] send_slash_confirm failed: %s", self.name, e)
return SendResult(success=False, error=str(e))
async def send_model_picker(
self,
chat_id: str,
@@ -1895,12 +1609,6 @@ class TelegramAdapter(BasePlatformAdapter):
if not query or not query.data:
return
data = query.data
query_message = getattr(query, "message", None)
query_chat_id = getattr(query_message, "chat_id", None)
query_chat = getattr(query_message, "chat", None)
query_chat_type = getattr(query_chat, "type", None)
query_thread_id = getattr(query_message, "message_thread_id", None)
query_user_name = getattr(query.from_user, "first_name", None)
# --- Model picker callbacks ---
if data.startswith(("mp:", "mm:", "mb", "mx", "mg:")):
@@ -1922,13 +1630,7 @@ class TelegramAdapter(BasePlatformAdapter):
# Only authorized users may click approval buttons.
caller_id = str(getattr(query.from_user, "id", ""))
if not self._is_callback_user_authorized(
caller_id,
chat_id=query_chat_id,
chat_type=str(query_chat_type) if query_chat_type is not None else None,
thread_id=str(query_thread_id) if query_thread_id is not None else None,
user_name=query_user_name,
):
if not self._is_callback_user_authorized(caller_id):
await query.answer(text="⛔ You are not authorized to approve commands.")
return
@@ -1971,86 +1673,12 @@ class TelegramAdapter(BasePlatformAdapter):
logger.error("Failed to resolve gateway approval from Telegram button: %s", exc)
return
# --- Slash-confirm callbacks (sc:choice:confirm_id) ---
if data.startswith("sc:"):
parts = data.split(":", 2)
if len(parts) == 3:
choice = parts[1] # once, always, cancel
confirm_id = parts[2]
caller_id = str(getattr(query.from_user, "id", ""))
if not self._is_callback_user_authorized(
caller_id,
chat_id=query_chat_id,
chat_type=str(query_chat_type) if query_chat_type is not None else None,
thread_id=str(query_thread_id) if query_thread_id is not None else None,
user_name=query_user_name,
):
await query.answer(text="⛔ You are not authorized to answer this prompt.")
return
session_key = self._slash_confirm_state.pop(confirm_id, None)
if not session_key:
await query.answer(text="This prompt has already been resolved.")
return
label_map = {
"once": "✅ Approved once",
"always": "🔒 Always approve",
"cancel": "❌ Cancelled",
}
user_display = getattr(query.from_user, "first_name", "User")
label = label_map.get(choice, "Resolved")
await query.answer(text=label)
try:
await query.edit_message_text(
text=f"{label} by {user_display}",
parse_mode=ParseMode.MARKDOWN,
reply_markup=None,
)
except Exception:
pass
# Resolve via the module-level primitive. The runner stored
# a handler keyed by session_key; we run it on the event
# loop and (if it returns a string) send it as a follow-up
# message in the same chat.
try:
from tools import slash_confirm as _slash_confirm_mod
result_text = await _slash_confirm_mod.resolve(
session_key, confirm_id, choice,
)
if result_text and query.message:
# Inherit the prompt message's thread so the reply
# lands in the same supergroup topic / reply chain.
thread_id = getattr(query.message, "message_thread_id", None)
send_kwargs: Dict[str, Any] = {
"chat_id": int(query.message.chat_id),
"text": result_text,
"parse_mode": ParseMode.MARKDOWN,
**self._link_preview_kwargs(),
}
if thread_id is not None:
send_kwargs["message_thread_id"] = thread_id
await self._bot.send_message(**send_kwargs)
except Exception as exc:
logger.error("[%s] slash-confirm callback failed: %s", self.name, exc, exc_info=True)
return
# --- Update prompt callbacks ---
if not data.startswith("update_prompt:"):
return
answer = data.split(":", 1)[1] # "y" or "n"
caller_id = str(getattr(query.from_user, "id", ""))
if not self._is_callback_user_authorized(
caller_id,
chat_id=query_chat_id,
chat_type=str(query_chat_type) if query_chat_type is not None else None,
thread_id=str(query_thread_id) if query_thread_id is not None else None,
user_name=query_user_name,
):
if not self._is_callback_user_authorized(caller_id):
await query.answer(text="⛔ You are not authorized to answer update prompts.")
return
await query.answer(text=f"Sent '{answer}' to the update process.")
@@ -2110,9 +1738,8 @@ class TelegramAdapter(BasePlatformAdapter):
return SendResult(success=False, error=self._missing_media_path_error("Audio", audio_path))
with open(audio_path, "rb") as audio_file:
ext = os.path.splitext(audio_path)[1].lower()
# .ogg / .opus files -> send as voice (round playable bubble)
if ext in (".ogg", ".opus"):
# .ogg files -> send as voice (round playable bubble)
if audio_path.endswith((".ogg", ".opus")):
_voice_thread = self._metadata_thread_id(metadata)
msg = await self._bot.send_voice(
chat_id=int(chat_id),
@@ -2121,8 +1748,8 @@ class TelegramAdapter(BasePlatformAdapter):
reply_to_message_id=int(reply_to) if reply_to else None,
message_thread_id=self._message_thread_id_for_send(_voice_thread),
)
elif ext in (".mp3", ".m4a"):
# Telegram's Bot API sendAudio only accepts MP3 / M4A.
else:
# .mp3 and others -> send as audio file
_audio_thread = self._metadata_thread_id(metadata)
msg = await self._bot.send_audio(
chat_id=int(chat_id),
@@ -2131,16 +1758,6 @@ class TelegramAdapter(BasePlatformAdapter):
reply_to_message_id=int(reply_to) if reply_to else None,
message_thread_id=self._message_thread_id_for_send(_audio_thread),
)
else:
# Formats Telegram can't play natively (.wav, .flac, ...)
# — fall back to document delivery instead of raising.
return await self.send_document(
chat_id=chat_id,
file_path=audio_path,
caption=caption,
reply_to=reply_to,
metadata=metadata,
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
logger.error(
@@ -2150,118 +1767,7 @@ class TelegramAdapter(BasePlatformAdapter):
exc_info=True,
)
return await super().send_voice(chat_id, audio_path, caption, reply_to)
async def send_multiple_images(
self,
chat_id: str,
images: List[tuple],
metadata: Optional[Dict[str, Any]] = None,
human_delay: float = 0.0,
) -> None:
"""Send a batch of images natively via Telegram's media group API.
Telegram's ``send_media_group`` bundles up to 10 photos/videos into
a single album. Larger batches are chunked. Animated GIFs cannot
go into a media group (they require ``send_animation``), so they
are peeled off and sent individually via the base default path.
URL-based photos go into the group directly; local files are
opened as byte streams. On failure the whole batch falls back to
the base adapter's per-image loop.
"""
if not self._bot:
return
if not images:
return
try:
from telegram import InputMediaPhoto
except Exception as exc: # pragma: no cover - missing SDK
logger.warning(
"[%s] InputMediaPhoto unavailable, falling back to per-image send: %s",
self.name, exc,
)
await super().send_multiple_images(chat_id, images, metadata, human_delay)
return
# Peel off animations — they need send_animation, not send_media_group
animations: List[tuple] = []
photos: List[tuple] = []
for image_url, alt_text in images:
if not image_url.startswith("file://") and self._is_animation_url(image_url):
animations.append((image_url, alt_text))
else:
photos.append((image_url, alt_text))
# Animations: route through the base default (per-image send_animation)
if animations:
await super().send_multiple_images(
chat_id, animations, metadata, human_delay=human_delay,
)
if not photos:
return
from urllib.parse import unquote as _unquote
_thread = self._metadata_thread_id(metadata)
_thread_id = self._message_thread_id_for_send(_thread)
# Chunk into groups of 10 (Telegram's album limit)
CHUNK = 10
chunks = [photos[i:i + CHUNK] for i in range(0, len(photos), CHUNK)]
for chunk_idx, chunk in enumerate(chunks):
if human_delay > 0 and chunk_idx > 0:
await asyncio.sleep(human_delay)
media: List[Any] = []
opened_files: List[Any] = []
try:
for image_url, alt_text in chunk:
caption = alt_text[:1024] if alt_text else None
if image_url.startswith("file://"):
local_path = _unquote(image_url[7:])
if not os.path.exists(local_path):
logger.warning(
"[%s] Skipping missing image in media group: %s",
self.name, local_path,
)
continue
fh = open(local_path, "rb")
opened_files.append(fh)
media.append(InputMediaPhoto(media=fh, caption=caption))
else:
media.append(InputMediaPhoto(media=image_url, caption=caption))
if not media:
continue
logger.info(
"[%s] Sending media group of %d photo(s) (chunk %d/%d)",
self.name, len(media), chunk_idx + 1, len(chunks),
)
await self._bot.send_media_group(
chat_id=int(chat_id),
media=media,
message_thread_id=_thread_id,
)
except Exception as e:
logger.warning(
"[%s] send_media_group failed (chunk %d/%d), falling back to per-image: %s",
self.name, chunk_idx + 1, len(chunks), e,
exc_info=True,
)
# Fallback: send each photo in this chunk individually
await super().send_multiple_images(
chat_id, chunk, metadata, human_delay=human_delay,
)
finally:
for fh in opened_files:
try:
fh.close()
except Exception:
pass
async def send_image_file(
self,
chat_id: str,
@@ -2290,54 +1796,13 @@ class TelegramAdapter(BasePlatformAdapter):
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
error_str = str(e)
# Dimension-related errors are the expected case for valid image
# files that Telegram just refuses as photos (screenshots, extreme
# aspect ratios). Log at INFO because the document fallback is
# the correct path. Any other send_photo failure also falls back
# to document (rate limits, corrupt file markers, format edge
# cases), but at WARNING because it's unexpected and worth
# surfacing in logs.
is_dim_error = (
"Photo_invalid_dimensions" in error_str
or "PHOTO_INVALID_DIMENSIONS" in error_str
logger.error(
"[%s] Failed to send Telegram local image, falling back to base adapter: %s",
self.name,
e,
exc_info=True,
)
if is_dim_error:
logger.info(
"[%s] Image dimensions exceed Telegram photo limits, "
"sending as document: %s",
self.name,
image_path,
)
else:
logger.warning(
"[%s] Failed to send Telegram local image as photo, "
"trying document fallback: %s",
self.name,
e,
exc_info=True,
)
# Fallback to sending as document (file) — no dimension limit,
# only 50MB size limit. If even that fails, fall back to the
# base adapter's text-only "Image: /path" rendering.
try:
return await self.send_document(
chat_id=chat_id,
file_path=image_path,
caption=caption,
file_name=os.path.basename(image_path),
reply_to=reply_to,
metadata=metadata,
)
except Exception as doc_err:
logger.error(
"[%s] Failed to send Telegram local image as document, "
"falling back to base adapter: %s",
self.name,
doc_err,
exc_info=True,
)
return await super().send_image_file(chat_id, image_path, caption, reply_to)
return await super().send_image_file(chat_id, image_path, caption, reply_to)
async def send_document(
self,
@@ -2469,7 +1934,7 @@ class TelegramAdapter(BasePlatformAdapter):
)
# Final fallback: send URL as text
return await super().send_image(chat_id, image_url, caption, reply_to)
async def send_animation(
self,
chat_id: str,
@@ -2531,7 +1996,7 @@ class TelegramAdapter(BasePlatformAdapter):
e,
exc_info=True,
)
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Get information about a Telegram chat."""
if not self._bot:
@@ -2565,7 +2030,7 @@ class TelegramAdapter(BasePlatformAdapter):
exc_info=True,
)
return {"name": str(chat_id), "type": "dm", "error": str(e)}
def format_message(self, content: str) -> str:
"""
Convert standard markdown to Telegram MarkdownV2 format.
@@ -2590,8 +2055,10 @@ class TelegramAdapter(BasePlatformAdapter):
text = content
# 0) Rewrite GFM-style pipe tables into Telegram-friendly row groups
# before the normal MarkdownV2 conversions run.
# 0) Pre-wrap GFM-style pipe tables in ``` fences. Telegram can't
# render tables natively, but fenced code blocks render as
# monospace preformatted text with columns intact. The wrapped
# tables then flow through step (1) below as protected regions.
text = _wrap_markdown_tables(text)
# 1) Protect fenced code blocks (``` ... ```)
@@ -2737,7 +2204,7 @@ class TelegramAdapter(BasePlatformAdapter):
text = ''.join(_safe_parts)
return text
# ── Group mention gating ──────────────────────────────────────────────
def _telegram_require_mention(self) -> bool:
@@ -2861,26 +2328,6 @@ class TelegramAdapter(BasePlatformAdapter):
user = getattr(entity, "user", None)
if user and getattr(user, "id", None) == bot_id:
return True
elif entity_type == "bot_command" and expected:
# Telegram's official group-disambiguation form for slash
# commands (``/cmd@botname``) is emitted as a single
# ``bot_command`` entity covering the whole span — there
# is no accompanying ``mention`` entity. Treat it as a
# direct address to this bot when the ``@botname`` suffix
# matches. This is the form Telegram's own command menu
# autocomplete produces in groups, so dropping it at the
# mention gate would break /new, /reset, /help, ... for
# every group that has ``require_mention`` enabled (#15415).
offset = int(getattr(entity, "offset", -1))
length = int(getattr(entity, "length", 0))
if offset < 0 or length <= 0:
continue
command_text = source_text[offset:offset + length]
at_index = command_text.find("@")
if at_index < 0:
continue
if command_text[at_index:].strip().lower() == expected:
return True
return False
def _message_matches_mention_patterns(self, message: Message) -> bool:
@@ -2952,7 +2399,7 @@ class TelegramAdapter(BasePlatformAdapter):
event = self._build_message_event(update.message, MessageType.TEXT, update_id=update.update_id)
event.text = self._clean_bot_trigger_text(event.text)
self._enqueue_text_event(event)
async def _handle_command(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle incoming command messages."""
if not update.message or not update.message.text:
@@ -2962,7 +2409,7 @@ class TelegramAdapter(BasePlatformAdapter):
event = self._build_message_event(update.message, MessageType.COMMAND, update_id=update.update_id)
await self.handle_message(event)
async def _handle_location_message(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle incoming location/venue pin messages."""
if not update.message:
@@ -3320,7 +2767,7 @@ class TelegramAdapter(BasePlatformAdapter):
return
await self.handle_message(event)
async def _queue_media_group_event(self, media_group_id: str, event: MessageEvent) -> None:
"""Buffer Telegram media-group items so albums arrive as one logical event.
+20 -16
View File
@@ -202,22 +202,26 @@ class WebhookAdapter(BasePlatformAdapter):
if deliver_type == "github_comment":
return await self._deliver_github_comment(content, delivery)
# Cross-platform delivery — any platform with a gateway adapter.
# Check both built-in names and plugin-registered platforms.
_BUILTIN_DELIVER_PLATFORMS = {
"telegram", "discord", "slack", "signal", "sms", "whatsapp",
"matrix", "mattermost", "homeassistant", "email", "dingtalk",
"feishu", "wecom", "wecom_callback", "weixin", "bluebubbles",
"qqbot", "yuanbao",
}
_is_known_platform = deliver_type in _BUILTIN_DELIVER_PLATFORMS
if not _is_known_platform:
try:
from gateway.platform_registry import platform_registry
_is_known_platform = platform_registry.is_registered(deliver_type)
except Exception:
pass
if self.gateway_runner and _is_known_platform:
# Cross-platform delivery — any platform with a gateway adapter
if self.gateway_runner and deliver_type in (
"telegram",
"discord",
"slack",
"signal",
"sms",
"whatsapp",
"matrix",
"mattermost",
"homeassistant",
"email",
"dingtalk",
"feishu",
"wecom",
"wecom_callback",
"weixin",
"bluebubbles",
"qqbot",
):
return await self._deliver_cross_platform(
deliver_type, content, delivery
)
+1 -6
View File
@@ -142,7 +142,6 @@ class WeComAdapter(BasePlatformAdapter):
"""WeCom AI Bot adapter backed by a persistent WebSocket connection."""
MAX_MESSAGE_LENGTH = MAX_MESSAGE_LENGTH
SUPPORTS_MESSAGE_EDITING = False
# Threshold for detecting WeCom client-side message splits.
# When a chunk is near the 4000-char limit, a continuation is almost certain.
_SPLIT_THRESHOLD = 3900
@@ -207,11 +206,7 @@ class WeComAdapter(BasePlatformAdapter):
return False
try:
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self._http_client = httpx.AsyncClient(
timeout=30.0, follow_redirects=True, limits=platform_httpx_limits(),
)
self._http_client = httpx.AsyncClient(timeout=30.0, follow_redirects=True)
await self._open_connection()
self._mark_connected()
self._listen_task = asyncio.create_task(self._listen_loop())
+1 -3
View File
@@ -119,9 +119,7 @@ class WecomCallbackAdapter(BasePlatformAdapter):
pass
try:
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self._http_client = httpx.AsyncClient(timeout=20.0, limits=platform_httpx_limits())
self._http_client = httpx.AsyncClient(timeout=20.0)
self._app = web.Application()
self._app.router.add_get("/health", self._handle_health)
self._app.router.add_get(self._path, self._handle_verify)
+8 -65
View File
@@ -89,21 +89,8 @@ MAX_CONSECUTIVE_FAILURES = 3
RETRY_DELAY_SECONDS = 2
BACKOFF_DELAY_SECONDS = 30
SESSION_EXPIRED_ERRCODE = -14
RATE_LIMIT_ERRCODE = -2 # iLink frequency limit — backoff and retry
MESSAGE_DEDUP_TTL_SECONDS = 300
def _is_stale_session_ret(
ret: "Optional[int]", errcode: "Optional[int]", errmsg: "Optional[str]",
) -> bool:
"""True when iLink returns ret=-2 / errcode=-2 with 'unknown error',
which is a stale-session signal (same as errcode=-14) rather than
a genuine rate limit."""
if ret != RATE_LIMIT_ERRCODE and errcode != RATE_LIMIT_ERRCODE:
return False
return (errmsg or "").lower() == "unknown error"
MEDIA_IMAGE = 1
MEDIA_VIDEO = 2
MEDIA_FILE = 3
@@ -1126,7 +1113,7 @@ async def qr_login(
class WeixinAdapter(BasePlatformAdapter):
"""Native Hermes adapter for Weixin personal accounts."""
MAX_MESSAGE_LENGTH = 2000
MAX_MESSAGE_LENGTH = 4000
# WeChat does not support editing sent messages — streaming must use the
# fallback "send-final-only" path so the cursor (▉) is never left visible.
@@ -1151,10 +1138,10 @@ class WeixinAdapter(BasePlatformAdapter):
extra.get("cdn_base_url") or os.getenv("WEIXIN_CDN_BASE_URL", WEIXIN_CDN_BASE_URL)
).strip().rstrip("/")
self._send_chunk_delay_seconds = float(
extra.get("send_chunk_delay_seconds") or os.getenv("WEIXIN_SEND_CHUNK_DELAY_SECONDS", "1.5")
extra.get("send_chunk_delay_seconds") or os.getenv("WEIXIN_SEND_CHUNK_DELAY_SECONDS", "0.35")
)
self._send_chunk_retries = int(
extra.get("send_chunk_retries") or os.getenv("WEIXIN_SEND_CHUNK_RETRIES", "4")
extra.get("send_chunk_retries") or os.getenv("WEIXIN_SEND_CHUNK_RETRIES", "2")
)
self._send_chunk_retry_delay_seconds = float(
extra.get("send_chunk_retry_delay_seconds")
@@ -1222,17 +1209,6 @@ class WeixinAdapter(BasePlatformAdapter):
self._mark_connected()
_LIVE_ADAPTERS[self._token] = self
logger.info("[%s] Connected account=%s base=%s", self.name, _safe_id(self._account_id), self._base_url)
if self._group_policy != "disabled":
logger.warning(
"[%s] WEIXIN_GROUP_POLICY=%s is set, but QR-login connects an iLink bot "
"identity (e.g. ...@im.bot) which typically cannot be invited into ordinary "
"WeChat groups. iLink usually does not deliver ordinary-group events for "
"these accounts, so group messages may never reach Hermes regardless of this "
"policy. If group delivery doesn't work, the limitation is on the iLink side, "
"not in Hermes.",
self.name,
self._group_policy,
)
return True
async def disconnect(self) -> None:
@@ -1277,8 +1253,7 @@ class WeixinAdapter(BasePlatformAdapter):
ret = response.get("ret", 0)
errcode = response.get("errcode", 0)
if ret not in (0, None) or errcode not in (0, None):
if (ret == SESSION_EXPIRED_ERRCODE or errcode == SESSION_EXPIRED_ERRCODE
or _is_stale_session_ret(ret, errcode, response.get("errmsg"))):
if ret == SESSION_EXPIRED_ERRCODE or errcode == SESSION_EXPIRED_ERRCODE:
logger.error("[%s] Session expired; pausing for 10 minutes", self.name)
await asyncio.sleep(600)
consecutive_failures = 0
@@ -1333,15 +1308,6 @@ class WeixinAdapter(BasePlatformAdapter):
if message_id and self._dedup.is_duplicate(message_id):
return
# Secondary content-fingerprint dedup for text messages
item_list = message.get("item_list") or []
text = _extract_text(item_list)
if text:
content_key = f"content:{sender_id}:{hashlib.md5(text.encode()).hexdigest()}"
if self._dedup.is_duplicate(content_key):
logger.debug("[%s] Content-dedup: skipping duplicate message from %s", self.name, sender_id)
return
chat_type, effective_chat_id = _guess_chat_type(message, self._account_id)
if chat_type == "group":
if self._group_policy == "disabled":
@@ -1356,6 +1322,8 @@ class WeixinAdapter(BasePlatformAdapter):
self._token_store.set(self._account_id, sender_id, context_token)
asyncio.create_task(self._maybe_fetch_typing_ticket(sender_id, context_token or None))
item_list = message.get("item_list") or []
text = _extract_text(item_list)
media_paths: List[str] = []
media_types: List[str] = []
@@ -1550,7 +1518,6 @@ class WeixinAdapter(BasePlatformAdapter):
is_session_expired = (
ret == SESSION_EXPIRED_ERRCODE
or errcode == SESSION_EXPIRED_ERRCODE
or _is_stale_session_ret(ret, errcode, resp.get("errmsg"))
)
# Session expired — strip token and retry once
if is_session_expired and not retried_without_token and context_token:
@@ -1564,28 +1531,6 @@ class WeixinAdapter(BasePlatformAdapter):
self.name, _safe_id(chat_id),
)
continue
# Rate limit (-2) — backoff and retry
is_rate_limited = (
ret == RATE_LIMIT_ERRCODE
or errcode == RATE_LIMIT_ERRCODE
)
if is_rate_limited:
errmsg = resp.get("errmsg") or resp.get("msg") or "rate limited"
# Record the error so we raise a descriptive
# RuntimeError (instead of AssertionError) if the
# loop exhausts with the server still rate-limiting.
last_error = RuntimeError(
f"iLink sendmessage rate limited: ret={ret} errcode={errcode} errmsg={errmsg}"
)
if attempt >= self._send_chunk_retries:
break
wait = self._send_chunk_retry_delay_seconds * 3 # 3x backoff for rate limit
logger.warning(
"[%s] rate limited for %s; backing off %.1fs before retry",
self.name, _safe_id(chat_id), wait,
)
await asyncio.sleep(wait)
continue
errmsg = resp.get("errmsg") or resp.get("msg") or "unknown error"
raise RuntimeError(
f"iLink sendmessage error: ret={ret} errcode={errcode} errmsg={errmsg}"
@@ -1627,7 +1572,7 @@ class WeixinAdapter(BasePlatformAdapter):
_, image_cleaned = self.extract_images(cleaned_content)
local_files, final_content = self.extract_local_files(image_cleaned)
_AUDIO_EXTS = {".ogg", ".opus", ".mp3", ".wav", ".m4a", ".flac"}
_AUDIO_EXTS = {".ogg", ".opus", ".mp3", ".wav", ".m4a"}
_VIDEO_EXTS = {".mp4", ".mov", ".avi", ".mkv", ".webm", ".3gp"}
_IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp", ".gif"}
@@ -2037,9 +1982,7 @@ async def send_weixin_direct(
live_adapter = _LIVE_ADAPTERS.get(resolved_token)
send_session = getattr(live_adapter, '_send_session', None)
if (live_adapter is not None and send_session is not None
and not send_session.closed
and send_session._loop is asyncio.get_running_loop()):
if live_adapter is not None and send_session is not None and not send_session.closed:
last_result: Optional[SendResult] = None
cleaned = live_adapter.format_message(message)
if cleaned:
+2 -32
View File
@@ -185,13 +185,6 @@ class WhatsAppAdapter(BasePlatformAdapter):
self._bridge_log: Optional[Path] = None
self._poll_task: Optional[asyncio.Task] = None
self._http_session: Optional["aiohttp.ClientSession"] = None
# Set to True by disconnect() before we SIGTERM our child bridge so
# _check_managed_bridge_exit() can distinguish an intentional
# shutdown-time exit (returncode -15 / -2 / 0) from a real crash.
# Without this, every graceful gateway shutdown/restart would log
# "Fatal whatsapp adapter error" plus dispatch a fatal-error
# notification before the normal "✓ whatsapp disconnected" fires.
self._shutting_down: bool = False
def _whatsapp_require_mention(self) -> bool:
configured = self.config.extra.get("require_mention")
@@ -562,21 +555,6 @@ class WhatsAppAdapter(BasePlatformAdapter):
if returncode is None:
return None
# Planned shutdown: disconnect() sets _shutting_down before it sends
# SIGTERM to the bridge, so a returncode of -15 (SIGTERM), -2 (SIGINT),
# or 0 (clean exit) at that point is expected, not a crash. Treat it
# as informational and skip the fatal-error path.
# getattr-with-default keeps tests that construct the adapter via
# ``WhatsAppAdapter.__new__`` (bypassing __init__) working without
# every _make_adapter() helper having to seed the attribute.
if getattr(self, "_shutting_down", False) and returncode in (0, -2, -15):
logger.info(
"[%s] Bridge exited during shutdown (code %d).",
self.name,
returncode,
)
return None
message = f"WhatsApp bridge process exited unexpectedly (code {returncode})."
if not self.has_fatal_error:
logger.error("[%s] %s", self.name, message)
@@ -587,10 +565,6 @@ class WhatsAppAdapter(BasePlatformAdapter):
async def disconnect(self) -> None:
"""Stop the WhatsApp bridge and clean up any orphaned processes."""
# Flip the shutdown flag BEFORE signalling the child so the exit-check
# path (which runs from other tasks like send() and the poll loop)
# doesn't race us and report the intentional termination as fatal.
self._shutting_down = True
if self._bridge_process:
try:
try:
@@ -902,15 +876,11 @@ class WhatsAppAdapter(BasePlatformAdapter):
try:
import aiohttp
# Must wrap in `async with` — a bare `await session.post(...)`
# leaves the response object alive until GC, holding its TCP
# socket in CLOSE_WAIT. See #18451.
async with self._http_session.post(
await self._http_session.post(
f"http://127.0.0.1:{self._bridge_port}/typing",
json={"chatId": chat_id},
timeout=aiohttp.ClientTimeout(total=5)
):
pass
)
except Exception:
pass # Ignore typing indicator failures
File diff suppressed because it is too large Load Diff
-645
View File
@@ -1,645 +0,0 @@
"""
yuanbao_media.py 元宝平台媒体处理模块
提供 COS 上传文件下载TIM 媒体消息构建等功能
移植自 TypeScript media.tsyuanbao-openclaw-plugin
使用 httpx 替代 cos-nodejs-sdk-v5避免引入额外 SDK 依赖
COS 上传流程
1. 调用 genUploadInfo 获取临时凭证tmpSecretId/tmpSecretKey/sessionToken
2. 用临时凭证通过 HMAC-SHA1 签名构建 Authorization
3. HTTP PUT 上传到 COS
TIM 消息体构建
- buildImageMsgBody() TIMImageElem
- buildFileMsgBody() TIMFileElem
"""
from __future__ import annotations
import hashlib
import hmac
import logging
import os
import secrets
import struct
import time
import urllib.parse
from typing import Optional, Any
import httpx
logger = logging.getLogger(__name__)
# ============ 常量 ============
UPLOAD_INFO_PATH = "/api/resource/genUploadInfo"
DEFAULT_API_DOMAIN = "yuanbao.tencent.com"
DEFAULT_MAX_SIZE_MB = 50
# COS 加速域名后缀(优先使用全球加速)
COS_USE_ACCELERATE = True
# ============ 类型映射 ============
# MIME → image_format 数字(TIM 协议字段)
_MIME_TO_IMAGE_FORMAT: dict[str, int] = {
"image/jpeg": 1,
"image/jpg": 1,
"image/gif": 2,
"image/png": 3,
"image/bmp": 4,
"image/webp": 255,
"image/heic": 255,
"image/tiff": 255,
}
# 文件扩展名 → MIME
_EXT_TO_MIME: dict[str, str] = {
".jpg": "image/jpeg",
".jpeg": "image/jpeg",
".png": "image/png",
".gif": "image/gif",
".webp": "image/webp",
".bmp": "image/bmp",
".heic": "image/heic",
".tiff": "image/tiff",
".ico": "image/x-icon",
".pdf": "application/pdf",
".doc": "application/msword",
".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
".xls": "application/vnd.ms-excel",
".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
".ppt": "application/vnd.ms-powerpoint",
".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
".txt": "text/plain",
".zip": "application/zip",
".tar": "application/x-tar",
".gz": "application/gzip",
".mp3": "audio/mpeg",
".mp4": "video/mp4",
".wav": "audio/wav",
".ogg": "audio/ogg",
".webm": "video/webm",
}
# ============ 工具函数 ============
def guess_mime_type(filename: str) -> str:
"""根据文件扩展名猜测 MIME 类型。"""
ext = os.path.splitext(filename)[-1].lower()
return _EXT_TO_MIME.get(ext, "application/octet-stream")
def is_image(filename: str, mime_type: str = "") -> bool:
"""判断是否为图片类型。"""
if mime_type.startswith("image/"):
return True
ext = os.path.splitext(filename)[-1].lower()
return ext in {".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp", ".heic", ".tiff", ".ico"}
def get_image_format(mime_type: str) -> int:
"""获取 TIM 图片格式编号。"""
return _MIME_TO_IMAGE_FORMAT.get(mime_type.lower(), 255)
def md5_hex(data: bytes) -> str:
"""计算 MD5 十六进制摘要。"""
return hashlib.md5(data).hexdigest()
def generate_file_id() -> str:
"""生成随机文件 ID(32 位 hex)。"""
return secrets.token_hex(16)
# ============ 图片尺寸解析(纯 Python,无需 Pillow ============
def parse_image_size(data: bytes) -> Optional[dict[str, int]]:
"""
解析图片宽高支持 JPEG/PNG/GIF/WebP无需第三方依赖
返回 {"width": w, "height": h} None无法识别
"""
return (
_parse_png_size(data)
or _parse_jpeg_size(data)
or _parse_gif_size(data)
or _parse_webp_size(data)
)
def _parse_png_size(buf: bytes) -> Optional[dict[str, int]]:
if len(buf) < 24:
return None
if buf[:4] != b"\x89PNG":
return None
w = struct.unpack(">I", buf[16:20])[0]
h = struct.unpack(">I", buf[20:24])[0]
return {"width": w, "height": h}
def _parse_jpeg_size(buf: bytes) -> Optional[dict[str, int]]:
if len(buf) < 4 or buf[0] != 0xFF or buf[1] != 0xD8:
return None
i = 2
while i < len(buf) - 9:
if buf[i] != 0xFF:
i += 1
continue
marker = buf[i + 1]
if marker in (0xC0, 0xC2):
h = struct.unpack(">H", buf[i + 5: i + 7])[0]
w = struct.unpack(">H", buf[i + 7: i + 9])[0]
return {"width": w, "height": h}
if i + 3 < len(buf):
i += 2 + struct.unpack(">H", buf[i + 2: i + 4])[0]
else:
break
return None
def _parse_gif_size(buf: bytes) -> Optional[dict[str, int]]:
if len(buf) < 10:
return None
sig = buf[:6].decode("ascii", errors="replace")
if sig not in ("GIF87a", "GIF89a"):
return None
w = struct.unpack("<H", buf[6:8])[0]
h = struct.unpack("<H", buf[8:10])[0]
return {"width": w, "height": h}
def _parse_webp_size(buf: bytes) -> Optional[dict[str, int]]:
if len(buf) < 16:
return None
if buf[:4] != b"RIFF" or buf[8:12] != b"WEBP":
return None
chunk = buf[12:16].decode("ascii", errors="replace")
if chunk == "VP8 ":
if len(buf) >= 30 and buf[23] == 0x9D and buf[24] == 0x01 and buf[25] == 0x2A:
w = struct.unpack("<H", buf[26:28])[0] & 0x3FFF
h = struct.unpack("<H", buf[28:30])[0] & 0x3FFF
return {"width": w, "height": h}
elif chunk == "VP8L":
if len(buf) >= 25 and buf[20] == 0x2F:
bits = struct.unpack("<I", buf[21:25])[0]
w = (bits & 0x3FFF) + 1
h = ((bits >> 14) & 0x3FFF) + 1
return {"width": w, "height": h}
elif chunk == "VP8X":
if len(buf) >= 30:
w = (buf[24] | (buf[25] << 8) | (buf[26] << 16)) + 1
h = (buf[27] | (buf[28] << 8) | (buf[29] << 16)) + 1
return {"width": w, "height": h}
return None
# ============ URL 下载 ============
async def download_url(
url: str,
max_size_mb: int = DEFAULT_MAX_SIZE_MB,
) -> tuple[bytes, str]:
"""
下载 URL 内容返回 (bytes, content_type)
Args:
url: HTTP(S) URL
max_size_mb: 最大允许大小MB超过则抛出异常
Returns:
(data_bytes, content_type_string)
Raises:
ValueError: 内容超过大小限制
httpx.HTTPError: 网络/HTTP 错误
"""
max_bytes = max_size_mb * 1024 * 1024
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
# 先 HEAD 检查大小
try:
head = await client.head(url)
content_length = int(head.headers.get("content-length", 0) or 0)
if content_length > 0 and content_length > max_bytes:
raise ValueError(
f"文件过大: {content_length / 1024 / 1024:.1f} MB > {max_size_mb} MB"
)
except httpx.HTTPStatusError:
pass # 部分服务器不支持 HEAD,忽略
# GET 下载(流式读取,防止超限)
async with client.stream("GET", url) as resp:
resp.raise_for_status()
content_type = resp.headers.get("content-type", "").split(";")[0].strip()
chunks: list[bytes] = []
downloaded = 0
async for chunk in resp.aiter_bytes(65536):
downloaded += len(chunk)
if downloaded > max_bytes:
raise ValueError(
f"文件过大: 已超过 {max_size_mb} MB 限制"
)
chunks.append(chunk)
data = b"".join(chunks)
return data, content_type
# ============ COS 鉴权(HMAC-SHA1 ============
def _cos_sign(
method: str,
path: str,
params: dict[str, str],
headers: dict[str, str],
secret_id: str,
secret_key: str,
start_time: Optional[int] = None,
expire_seconds: int = 3600,
) -> str:
"""
构建 COS 请求签名q-sign-algorithm=sha1 方案
参考https://cloud.tencent.com/document/product/436/7778
Args:
method: HTTP 方法小写 "put"
path: URL 路径URL encode 后的小写
params: URL 查询参数 dict用于签名
headers: 参与签名的请求头 dictkey 需小写
secret_id: 临时 SecretIdtmpSecretId
secret_key: 临时 SecretKeytmpSecretKey
start_time: 签名起始 Unix 时间戳默认 now
expire_seconds: 签名有效期默认 3600
Returns:
Authorization header 完整字符串
"""
now = int(time.time())
q_sign_time = f"{start_time or now};{(start_time or now) + expire_seconds}"
# Step 1: SignKey = HMAC-SHA1(SecretKey, q-sign-time)
sign_key = hmac.new(
secret_key.encode("utf-8"),
q_sign_time.encode("utf-8"),
hashlib.sha1,
).hexdigest()
# Step 2: HttpString
# 参数和头部需按字典序排列,key 小写
sorted_params = sorted((k.lower(), urllib.parse.quote(str(v), safe="") ) for k, v in params.items())
sorted_headers = sorted((k.lower(), urllib.parse.quote(str(v), safe="") ) for k, v in headers.items())
url_param_list = ";".join(k for k, _ in sorted_params)
url_params = "&".join(f"{k}={v}" for k, v in sorted_params)
header_list = ";".join(k for k, _ in sorted_headers)
header_str = "&".join(f"{k}={v}" for k, v in sorted_headers)
http_string = "\n".join([
method.lower(),
path,
url_params,
header_str,
"",
])
# Step 3: StringToSign = sha1 hash of HttpString
sha1_of_http = hashlib.sha1(http_string.encode("utf-8")).hexdigest()
string_to_sign = "\n".join([
"sha1",
q_sign_time,
sha1_of_http,
"",
])
# Step 4: Signature = HMAC-SHA1(SignKey, StringToSign)
signature = hmac.new(
sign_key.encode("utf-8"),
string_to_sign.encode("utf-8"),
hashlib.sha1,
).hexdigest()
return (
f"q-sign-algorithm=sha1"
f"&q-ak={secret_id}"
f"&q-sign-time={q_sign_time}"
f"&q-key-time={q_sign_time}"
f"&q-header-list={header_list}"
f"&q-url-param-list={url_param_list}"
f"&q-signature={signature}"
)
# ============ 主要公开 API ============
async def get_cos_credentials(
app_key: str,
api_domain: str,
token: str,
filename: str = "file",
file_id: Optional[str] = None,
bot_id: str = "",
route_env: str = "",
) -> dict:
"""
调用 genUploadInfo 接口获取 COS 临时密钥及上传配置
Args:
app_key: 应用 Key用于 X-ID
api_domain: API 域名 https://bot.yuanbao.tencent.com
token: 当前有效的签票 tokenX-Token
filename: 待上传的文件名含扩展名
file_id: 客户端生成的唯一文件 ID不传则自动生成
bot_id: Bot 账号 ID用于 X-ID
Returns:
COS 上传配置 dict包含以下字段
bucketName (str) COS Bucket 名称
region (str) COS 地域
location (str) 上传 Key对象路径
encryptTmpSecretId (str) 临时 SecretId
encryptTmpSecretKey(str) 临时 SecretKey
encryptToken (str) SessionToken
startTime (int) 凭证起始时间戳Unix
expiredTime (int) 凭证过期时间戳Unix
resourceUrl (str) 上传后的公网访问 URL
resourceID (str) 资源 ID可选
Raises:
RuntimeError: 接口返回非 0 code 或字段缺失
"""
if file_id is None:
file_id = generate_file_id()
upload_url = f"{api_domain.rstrip('/')}{UPLOAD_INFO_PATH}"
headers = {
"Content-Type": "application/json",
"X-Token": token,
"X-ID": bot_id or app_key,
"X-Source": "web",
}
if route_env:
headers["X-Route-Env"] = route_env
body = {
"fileName": filename,
"fileId": file_id,
"docFrom": "localDoc",
"docOpenId": "",
}
async with httpx.AsyncClient(timeout=15.0) as client:
resp = await client.post(upload_url, json=body, headers=headers)
resp.raise_for_status()
result: dict[str, Any] = resp.json()
code = result.get("code")
if code != 0 and code is not None:
raise RuntimeError(
f"genUploadInfo 失败: code={code}, msg={result.get('msg', '')}"
)
data = result.get("data") or result
required_fields = ["bucketName", "location"]
missing = [f for f in required_fields if not data.get(f)]
if missing:
raise RuntimeError(
f"genUploadInfo 返回字段不完整: 缺少字段 {missing}"
)
return data
async def upload_to_cos(
file_bytes: bytes,
filename: str,
content_type: str,
credentials: dict,
bucket: str,
region: str,
) -> dict:
"""
通过 httpx PUT 请求将文件上传到 COS
使用临时凭证tmpSecretId/tmpSecretKey/sessionToken构建 HMAC-SHA1 签名
Args:
file_bytes: 文件二进制内容
filename: 文件名用于辅助计算 MIMEUUID
content_type: MIME 类型 "image/jpeg"
credentials: get_cos_credentials() 返回的 dict包含
encryptTmpSecretId tmpSecretId
encryptTmpSecretKey tmpSecretKey
encryptToken sessionToken
location COS key对象路径
resourceUrl 上传后公网 URL
startTime 凭证起始时间Unix
expiredTime 凭证过期时间Unix
bucket: COS Bucket 名称 chatbot-1234567890
region: COS 地域 ap-guangzhou
Returns:
上传结果 dict包含
url (str) COS 公网访问 URL
uuid (str) 文件内容 MD5
size (int) 文件大小字节
width (int, optional) 图片宽度仅图片
height (int, optional) 图片高度仅图片
Raises:
httpx.HTTPStatusError: COS 返回非 2xx 状态
RuntimeError: credentials 字段缺失
"""
secret_id: str = credentials.get("encryptTmpSecretId", "")
secret_key: str = credentials.get("encryptTmpSecretKey", "")
session_token: str = credentials.get("encryptToken", "")
cos_key: str = credentials.get("location", "")
resource_url: str = credentials.get("resourceUrl", "")
start_time: Optional[int] = credentials.get("startTime")
expired_time: Optional[int] = credentials.get("expiredTime")
if not secret_id or not secret_key or not cos_key:
raise RuntimeError(
f"COS credentials 不完整: secretId={bool(secret_id)}, "
f"secretKey={bool(secret_key)}, location={bool(cos_key)}"
)
# 构建 COS 上传 URL(优先使用全球加速域名)
if COS_USE_ACCELERATE:
cos_host = f"{bucket}.cos.accelerate.myqcloud.com"
else:
cos_host = f"{bucket}.cos.{region}.myqcloud.com"
# URL encode cos_key(保留 /
encoded_key = urllib.parse.quote(cos_key, safe="/")
cos_url = f"https://{cos_host}/{encoded_key.lstrip('/')}"
# 确定 Content-Type
if not content_type or content_type == "application/octet-stream":
if is_image(filename):
content_type = guess_mime_type(filename)
else:
content_type = "application/octet-stream"
# 计算文件 MD5 + size
file_uuid = md5_hex(file_bytes)
file_size = len(file_bytes)
# 参与签名的请求头
sign_headers = {
"host": cos_host,
"content-type": content_type,
"x-cos-security-token": session_token,
}
# 计算签名有效期
now = int(time.time())
sign_start = start_time if start_time else now
sign_expire = (expired_time - now) if expired_time and expired_time > now else 3600
authorization = _cos_sign(
method="put",
path=f"/{encoded_key.lstrip('/')}",
params={},
headers=sign_headers,
secret_id=secret_id,
secret_key=secret_key,
start_time=sign_start,
expire_seconds=sign_expire,
)
put_headers = {
"Authorization": authorization,
"Content-Type": content_type,
"x-cos-security-token": session_token,
}
logger.info(
"COS PUT: bucket=%s region=%s key=%s size=%d mime=%s",
bucket, region, cos_key, file_size, content_type,
)
async with httpx.AsyncClient(timeout=120.0) as client:
resp = await client.put(
cos_url,
content=file_bytes,
headers=put_headers,
)
resp.raise_for_status()
# 解析图片尺寸(仅图片类型)
result: dict[str, Any] = {
"url": resource_url or cos_url,
"uuid": file_uuid,
"size": file_size,
}
if content_type.startswith("image/"):
size_info = parse_image_size(file_bytes)
if size_info:
result["width"] = size_info["width"]
result["height"] = size_info["height"]
logger.info(
"COS 上传成功: url=%s size=%d",
result["url"], file_size,
)
return result
# ============ TIM 媒体消息构建 ============
def build_image_msg_body(
url: str,
uuid: Optional[str] = None,
filename: Optional[str] = None,
size: int = 0,
width: int = 0,
height: int = 0,
mime_type: str = "",
) -> list[dict]:
"""
构建腾讯 IM TIMImageElem 消息体
参考https://cloud.tencent.com/document/product/269/2720
Args:
url: 图片公网访问 URLCOS resourceUrl
uuid: 文件 UUIDMD5 或其他唯一标识
filename: 文件名uuid 为空时作为备用
size: 文件大小字节
width: 图片宽度像素
height: 图片高度像素
mime_type: MIME 类型用于确定 image_format
Returns:
TIMImageElem 消息体列表适合直接放入 msg_body
"""
_uuid = uuid or filename or _basename_from_url(url) or "image"
image_format = get_image_format(mime_type) if mime_type else 255
return [
{
"msg_type": "TIMImageElem",
"msg_content": {
"uuid": _uuid,
"image_format": image_format,
"image_info_array": [
{
"type": 1, # 1 = 原图
"size": size,
"width": width,
"height": height,
"url": url,
}
],
},
}
]
def build_file_msg_body(
url: str,
filename: str,
uuid: Optional[str] = None,
size: int = 0,
) -> list[dict]:
"""
构建腾讯 IM TIMFileElem 消息体
参考https://cloud.tencent.com/document/product/269/2720
Args:
url: 文件公网访问 URLCOS resourceUrl
filename: 文件名含扩展名
uuid: 文件 UUIDMD5 或其他唯一标识不传则使用 filename
size: 文件大小字节
Returns:
TIMFileElem 消息体列表适合直接放入 msg_body
"""
_uuid = uuid or filename
return [
{
"msg_type": "TIMFileElem",
"msg_content": {
"uuid": _uuid,
"file_name": filename,
"file_size": size,
"url": url,
},
}
]
# ============ 内部工具 ============
def _basename_from_url(url: str) -> str:
"""从 URL 提取文件名。"""
try:
parsed = urllib.parse.urlparse(url)
return os.path.basename(parsed.path)
except Exception:
return ""

Some files were not shown because too many files have changed in this diff Show More