Compare commits

..

22 Commits

Author SHA1 Message Date
emozilla 854206e59e fix(plugins): register dynamically-loaded modules in sys.modules before exec
Dashboard plugin API routes (web_server._mount_plugin_api_routes) and
gateway event hooks (gateway.hooks.HookRegistry.discover_and_load) both
loaded Python files via importlib.util.spec_from_file_location +
exec_module without registering the resulting module in sys.modules.

That breaks any plugin or hook handler that uses `from __future__ import
annotations` together with a Pydantic BaseModel / dataclass / anything
that introspects `__module__`: at first request Pydantic tries to
resolve string-form type hints against the defining module's namespace,
can't find it by name, and raises:

  PydanticUserError: TypeAdapter[...] is not fully defined;
  you should define ... and all referenced types,
  then call `.rebuild()` on the instance.

This is what broke the kanban dashboard's 'triage' button — POST
/api/plugins/kanban/tasks validated against CreateTaskBody (a Pydantic
model in a file using `from __future__ import annotations`) and
returned 500 on every click.

The fix, applied symmetrically to both loaders:

  1. Compute module_name once.
  2. Register the module in sys.modules BEFORE exec_module.
  3. On exec_module failure, pop the half-initialized stub so subsequent
     reloads don't pick up broken state.

GETs were unaffected because they don't build a body TypeAdapter, which
is why this only surfaced when users started POSTing.
2026-04-28 13:24:29 -04:00
teknium1 dd83173621 feat(kanban): per-task force-loaded skills
Tasks can now pin extra skills to load into their worker alongside the
built-in kanban-worker. Use cases: translation tasks that need a
translation skill, review tasks that need github-code-review, security
audits that need security-pr-audit — without editing the assignee's
profile config.

Changes:
- tasks.skills column (JSON array), idempotent migration, Task.skills
  dataclass field.
- create_task(skills=[...]) normalises (strip/dedupe), rejects commas
  in a single name.
- _default_spawn emits one `--skills X` pair per task skill, in
  addition to the built-in `--skills kanban-worker`. Deduped against
  the built-in.
- CLI: `hermes kanban create --skill <name>` (repeatable).
- Tool: `kanban_create(skills=[...])` (accepts list or single string).
- Dashboard: POST /tasks accepts `skills`; inline create form has a
  comma-separated skills input; drawer shows a Skills row.
- `hermes kanban show` prints a skills row when present.
- Docs: kanban.md has a new 'Pinning extra skills to a specific task'
  section; CLI reference shows `--skill`.
- Tests: 16 new (kernel round-trip, dedupe, comma-rejection, dispatcher
  argv with multiple skills, built-in dedupe, CLI flag repeatable, CLI
  no-flag stays None, show renders skills, idempotent migration on
  legacy DB, tool list/string/non-list, dashboard REST round-trip,
  dashboard default empty list).
2026-04-28 07:50:36 -07:00
teknium1 c65c1ddf21 feat(kanban): dispatcher auto-loads kanban-worker skill on every spawn
`_default_spawn` now passes `--skills kanban-worker` in the child's
argv, so every dispatched worker gets the skill loaded automatically
regardless of the profile's default skill config.

Why
  The system prompt carries the MANDATORY lifecycle (via
  KANBAN_GUIDANCE). The skill carries the deeper reference material:
  good summary/metadata patterns, retry diagnostics, block-reason
  examples, workspace handling, CLI fallback. Both are useful;
  requiring users to wire the skill into skills config per-profile
  is a footgun — they'd hit tasks where workers use the lifecycle
  correctly but not the patterns, producing weaker handoffs.

  Auto-loading makes kanban-worker the baseline. Profiles can still
  add more skills via the normal config; --skills is additive.

Test
  New test_default_spawn_auto_loads_kanban_worker_skill intercepts
  subprocess.Popen to capture the argv without actually spawning a
  hermes subprocess. Asserts '--skills kanban-worker' appears in the
  cmd and the env still carries HERMES_KANBAN_TASK + HERMES_PROFILE.

Skill note updated
  Worker skill's opening note changed from "you don't need to load
  this" to "you're seeing this because the dispatcher loaded it for
  you" — reflects the new always-loaded reality.

218/218 kanban + tools suite green.
2026-04-28 07:12:34 -07:00
teknium1 1970bcf5a5 feat(kanban): system-prompt guidance block + reshape skills to reference-only
Moves the kanban worker lifecycle out of the skills and into the system
prompt as a conditionally-injected guidance block, matching the
existing MEMORY_GUIDANCE / SESSION_SEARCH_GUIDANCE / SKILLS_GUIDANCE
pattern in `agent/prompt_builder.py`. Workers now know the full
lifecycle even if no skill was loaded; skills become the deeper
playbook for edge cases.

Why
  Previously the 6-step lifecycle (orient → work → heartbeat → block/
  complete → fan-out) only reached the model if the kanban-worker
  skill was loaded. That's fragile: users configure skills-per-
  platform, profiles can forget to include them, skill loading order
  matters. The lifecycle is load-bearing for correct board behavior
  — it belongs in the prompt path.

What
  - New `KANBAN_GUIDANCE` constant in agent/prompt_builder.py (~3 KB):
    identity framing ("You are a Kanban worker"), 6-step lifecycle
    with exact tool-call shapes (`kanban_show()`, `kanban_complete(
    summary=..., metadata=...)`, `kanban_block(reason=...)`, etc.),
    orchestrator mode note, and explicit DO NOTs including "don't
    shell out to `hermes kanban <verb>`".
  - Wired into `AIAgent._build_system_prompt` gated on `kanban_show
    in self.valid_tool_names`. Because `kanban_show`'s `check_fn`
    gates on `HERMES_KANBAN_TASK` env var, this indirection
    guarantees guidance appears iff a worker was dispatched. Zero
    cost to normal sessions.

Gating verified live
  Normal `hermes chat` session: 2329-char prompt, zero kanban content.
  Worker session (HERMES_KANBAN_TASK set):  5272-char prompt, +2943
  chars of KANBAN_GUIDANCE present. Regression test asserts the
  header phrase and each tool name.

Skills reshape (both from ~6 KB lifecycles to ~7 KB references)
  - `skills/devops/kanban-worker/SKILL.md`: drops the lifecycle
    (now in the guidance block), keeps good-summary patterns,
    block-reason examples, retry diagnostics, pitfalls, CLI
    fallback cheatsheet. Adds explicit note that you don't need to
    load the skill to work a task anymore.
  - `skills/devops/kanban-orchestrator/SKILL.md`: drops the
    "don't execute" rule (now in guidance), keeps the full
    decomposition playbook, specialist roster with typical
    workspaces, a concrete Postgres-migration example with
    `kanban_create` + `parents=[...]` linking, and pitfalls
    (reassignment vs new task, link argument order, tenant
    inheritance).

Tests (+3)
  - `test_kanban_guidance_not_in_normal_prompt`: verifies a regular
    AIAgent with no env var produces a prompt that contains none of
    the kanban lifecycle phrases.
  - `test_kanban_guidance_in_worker_prompt`: spawns an AIAgent with
    HERMES_KANBAN_TASK set, verifies the header + all 4 tool-call
    examples + anti-shell guidance appear.
  - `test_kanban_guidance_prompt_size_bounded`: sanity check that
    the guidance is 1.5-4 KB so it doesn't balloon the cached prompt.

217/217 kanban + tools suite green; 303/303 run_agent suite green.
The guidance block sits alongside the other tool-gated blocks, so
prompt caching remains intact — the system prompt is stable across
every turn of a kanban worker's run.
2026-04-28 05:29:31 -07:00
teknium1 832ecde4b0 feat(kanban): structured tool surface for worker + orchestrator agents
Seven new tools in `tools/kanban_tools.py` that give kanban workers a
backend-portable, schema-filtered way to interact with the board from
inside their own Python process — no shelling out to `hermes kanban`.

Motivation
  The CLI path (`hermes kanban complete \$TASK --summary ...`) breaks
  on any remote terminal backend (Docker, Modal, Singularity, SSH).
  The terminal tool runs `hermes kanban` inside the container, where
  `hermes` isn't installed and `~/.hermes/kanban.db` isn't mounted.
  Tools run in the agent's own Python process, so they always reach
  the board regardless of backend. Also skips shell-quoting fragility
  on --metadata JSON and gives structured error returns the model can
  reason about.

The seven tools
  kanban_show        read current task (defaults to HERMES_KANBAN_TASK)
  kanban_complete    structured handoff: summary + metadata
  kanban_block       ask for human input
  kanban_heartbeat   signal liveness during long operations
  kanban_comment     append to task thread
  kanban_create      fan out into child tasks (orchestrator path)
  kanban_link        add parent→child dependency after the fact

Gating
  Each tool's check_fn returns True iff HERMES_KANBAN_TASK is set in
  the process env. The dispatcher sets it when spawning a worker;
  normal `hermes chat` sessions never have it. Empirically verified:
  a baseline hermes-cli schema is 27 tools; with HERMES_KANBAN_TASK
  set it grows to exactly 34 (+7). Zero leak into normal sessions.

Also set HERMES_PROFILE in the spawn env so the kanban_comment tool's
author default works cleanly (it's what the tool reads to attribute
comments).

Skill updates
  - `skills/devops/kanban-worker/SKILL.md`: lifecycle rewritten to use
    kanban_show / kanban_heartbeat / kanban_block / kanban_complete /
    kanban_comment / kanban_create directly. CLI fallback section
    added for human operators / scripts.
  - `skills/devops/kanban-orchestrator/SKILL.md`: all examples ported
    from CLI to tool form; top-banner note explaining tools are the
    primary surface. kanban_create / kanban_link throughout.

Docs
  `website/docs/user-guide/features/kanban.md`:
  new "How workers interact with the board" section explaining the
  tool surface, gating mechanism, and why tools vs CLI. The worker
  skill / orchestrator skill subsections are now nested under it.

Tests (+25 in tests/tools/test_kanban_tools.py)
  - Schema gating: kanban_tools_hidden_without_env_var,
    kanban_tools_visible_with_env_var.
  - Happy paths: show (default + explicit task_id), complete (with
    summary+metadata, with result only), block, heartbeat (with and
    without note), comment (default + custom author), create (with
    list parents, with string parent), link.
  - Error paths: complete rejects no-handoff and non-dict metadata,
    block rejects empty reason, comment rejects empty body, create
    rejects no title / no assignee / non-list parents, link rejects
    self-reference / missing args / cycles.
  - End-to-end: full worker lifecycle driven entirely through the
    tools, verified against DB state.

214/214 kanban suite pass under scripts/run_tests.sh.
2026-04-28 04:30:22 -07:00
Teknium be184aa5fa fix(kanban): close the two v2-flagged issues in v1
Both items the atypical-scenarios pass flagged as "v2 follow-up"
actually belong in v1. Fixed now.

Fix 1: workspace path traversal
  resolve_workspace now rejects non-absolute paths for all three
  workspace_kinds (scratch-with-explicit-path, dir:, worktree). A
  relative path like '../../../tmp/attacker' was being silently
  resolved against the dispatcher's CWD — a confused-deputy escape.
  Error message points users at the absolute-path requirement.
  Storage remains verbatim (kernel doesn't rewrite user input);
  the refusal happens at resolution time, so the dispatcher's
  existing spawn-failure circuit breaker correctly categorizes it.

  Threat model documented in website/docs/user-guide/features/kanban.md:
  single-host, trusted-local-user. The absolute-path rule prevents
  ambiguity-driven escape, not malicious access — kanban runs as you,
  with your uid, on your filesystem.

Fix 2: build_worker_context unbounded
  Added per-section caps so worker prompts stay bounded on pathological
  boards:
    _CTX_MAX_PRIOR_ATTEMPTS = 10   most-recent N runs shown; older
                                   collapsed into "N earlier attempts
                                   omitted" marker. Attempt numbering
                                   preserved (shows "Attempt 16" not
                                   renumbered).
    _CTX_MAX_COMMENTS       = 30   same pattern for comments.
    _CTX_MAX_FIELD_BYTES    = 4 KB per summary / error / metadata / result.
    _CTX_MAX_BODY_BYTES     = 8 KB per task.body (opening post).
    _CTX_MAX_COMMENT_BYTES  = 2 KB per comment.
  Truncation uses a visible ellipsis + char-count so the worker knows
  it's been truncated.

  Effect on atypical-scenario runs:
    huge_run_count_on_one_task (1000 runs):  63 KB  →    820 chars
    comment_storm       (1000 comments):     50 KB  →  1,671 chars

Tests (+6 in main suite)
  test_resolve_workspace_rejects_relative_dir_path — relative dir:
    path stored verbatim but refused at resolve.
  test_resolve_workspace_accepts_absolute_dir_path — legitimate
    absolute paths are created and returned.
  test_resolve_workspace_rejects_relative_worktree_path — same guard
    for worktree kind.
  test_build_worker_context_caps_prior_attempts — 25 runs → exactly
    _CTX_MAX_PRIOR_ATTEMPTS shown, omitted marker present,
    attempt numbering preserves original index.
  test_build_worker_context_caps_comments — 100 comments → 30 shown,
    70 in the omitted marker.
  test_build_worker_context_caps_huge_summary — 1 MB summary on a
    prior run → context under 10 KB total, truncation marker visible.

189/189 kanban suite pass. Atypical-scenarios stress script still
passes all 28 scenarios with the new caps in effect.
2026-04-28 01:20:08 -07:00
Teknium 63b7b6d5bd test(kanban): atypical-scenario stress suite + clock-skew elapsed clamp fix
Added tests/stress/test_atypical_scenarios.py — 28 scenarios covering
atypical user inputs and environments that the normal tests assume
away. Surfaced one real UI bug in the process.

Bug found and fixed
  - CLI display of negative elapsed time when NTP jumps backward
    between claim_task and complete_task. The kernel faithfully stores
    started_at > ended_at; both CLI display sites (_cmd_show's Runs
    section at line 685, _cmd_runs's table at line 1153) now clamp
    elapsed to max(0, end - start), matching the dashboard JS which
    already had the same Math.max(0, ...) clamp. Regression test
    test_cli_show_clamps_negative_elapsed forces a future started_at
    via raw UPDATE and verifies neither show nor runs prints a
    `-<digits>s` token.

Atypical scenarios covered (all passing, 28 total)
  Data:
    - unicode_and_emoji: CJK, RTL (Hebrew/Arabic), ZWJ emoji sequences,
      control chars, null bytes in titles + metadata
    - huge_strings: 1 MB body + 1 MB summary + 50-level nested metadata
    - sql_injection_attempts: 6 classic payloads, parameterized queries
      hold across every string field
    - newlines_in_summary: full preserved on run, first line in event
      payload for notifier brevity
    - malformed_metadata_via_cli: 4 bad JSON values each cleanly
      rejected with stderr error, no partial task mutation
    - empty_string_fields: empty title rejected, whitespace-only title
      rejected, empty body accepted
    - tenant_with_newlines: multiline tenant strings survive
      board_stats

  Dependency graphs:
    - dependency_cycle: A→B→A refused
    - self_parent: cannot depend on itself
    - diamond_dependency: child promotes only when both parents done
    - wide_fan_out: 500 children promoted in 4ms via complete_task's
      internal recompute_ready
    - wide_fan_in: 500 parents → 1 child, promotion gated correctly
    - parent_in_different_status_states: child stays todo for parent
      in ready/running/blocked/triage/archived; only 'done' unblocks

  Workspace:
    - workspace_path_traversal: dir: workspaces are intentionally
      arbitrary paths (documented threat model)
    - workspace_nonexistent_path: spawn_failure counter increments,
      task returns to ready

  Clock:
    - clock_skew_start_greater_than_end: kernel stores faithfully,
      CLI now clamps at display (see Bug above)

  Filesystem:
    - hermes_home_with_spaces: works
    - hermes_home_with_unicode: works
    - hermes_home_via_symlink: two symlinks to same dir share DB
      (Path.resolve() in _INITIALIZED_PATHS)

  Scale extremes:
    - huge_run_count_on_one_task: 1000 runs → list_runs in <1ms,
      build_worker_context 3ms/63KB (flagged: unbounded on
      retry-heavy tasks; v2 cap candidate)
    - hundred_tenants: 5000 tasks / 100 tenants → stats 1ms, list 26ms
    - comment_storm: 1000 comments → 50KB worker context (same
      unbounded-context flag)

  Lifecycle:
    - completed_task_reclaim_attempt: done tasks can't be
      re-claimed/re-completed/blocked
    - archived_task_resurrection_attempt: archived tasks invisible to
      default list + all ops refused
    - unassigned_task_never_claims: dispatcher skips, task untouched

  Assignees:
    - assignee_with_special_chars: @-signs, dots, CJK, emoji, 200-char
      names, empty strings all handled

  Concurrency:
    - idempotency_key_race: two concurrent processes calling
      create_task with same key both get back the SAME task id,
      exactly one row in DB

  Dashboard:
    - dashboard_rest_with_weird_inputs: empty/huge/unicode titles,
      unknown fields, type mismatches handled correctly (200 for
      valid, 400/422 for invalid)

Observations flagged for v2 (not fixed, not blocking)
  - build_worker_context is unbounded on retry-heavy tasks (1000 runs
    → 63KB) and comment-heavy tasks (1000 comments → 50KB). A
    `--max-prior-attempts` / `--max-comments` cap would be appropriate.
  - `dir:` workspace paths are intentionally arbitrary; docs should
    note the threat model (trusted local user; path is stored
    verbatim, not sandboxed).

183/183 main kanban suite pass. Stress suite still opted out by
default; enable with `pytest --run-stress` or run scripts directly.
2026-04-28 01:13:17 -07:00
Teknium 123f8d0fed feat(kanban): battle-test suite + 2 real bugs it found
Added tests/stress/ — an opt-in suite that stresses the kernel with
real concurrency, real subprocesses, random property fuzzing, and scale
benchmarks. Exposed two real bugs the 4 audit passes missed.

Bugs found and fixed
  - _pid_alive returned True for zombie processes. os.kill(pid, 0)
    succeeds against zombies (process table entry exists until parent
    reaps), so a worker that exited normally but wasn't yet reaped
    would look alive to detect_crashed_workers indefinitely. Fixed by
    peeking at /proc/<pid>/status on Linux and treating State: Z as
    dead. No-op on other POSIX; Windows unchanged.
  - Task ID generator used 2 hex bytes (65k space). By birthday
    paradox, ~5% collision probability at 1k tasks, ~50% at 10k.
    Would raise UNIQUE constraint errors on large boards without a
    retry path. Bumped to 4 hex bytes (4.3B space). Surfaced by
    benchmarks trying to seed 10k tasks.

Stress suite (tests/stress/, opt-in via --run-stress)
  - test_concurrency.py — 5 processes race for 100 tasks. 1
    lost-claim race observed, 0 double-claims, 0 orphan runs.
  - test_concurrency_mixed.py — 10 workers + 1 reclaimer, 500
    tasks, random ops (claim/complete/block/unblock/archive).
    1518 events, 43 lost-claim races, zero invariant violations.
  - test_concurrency_reclaim_race.py — TTL < work duration so the
    reclaimer intentionally yanks tasks mid-work. 82 claims, 44
    reclaimed, 50 completed, 32 complete_refused (CAS correctly
    blocked late completes on reclaimed tasks).
  - test_subprocess_e2e.py — dispatcher spawns real python
    subprocess workers that heartbeat + complete via the CLI.
    Also exercises crash detection against a real dead PID via
    double-fork.
  - test_property_fuzzing.py — 500 randomized sequences, ~40k
    operations, 9 invariant checks after each step. Zero
    violations.
  - test_benchmarks.py — latency at 100/1k/10k tasks. Key numbers:
    dispatch_once @ 10k = 4ms, recompute_ready @ 10k = 47ms,
    list_tasks @ 10k = 50ms, build_worker_context w/ 50 parents
    = 1ms. Erosika's 10k starvation flag is pessimistic by roughly
    an order of magnitude.

Regression tests added to the main suite (+2)
  - test_pid_alive_detects_zombie: /proc check against a real
    zombified process (Linux-only, skipped elsewhere).
  - test_task_ids_dont_collide_at_scale: 500 creates, verify
    uniqueness + format.

New harness
  - tests/stress/conftest.py skips the suite by default; opt in with
    pytest --run-stress. Scripts are also __main__-runnable directly
    since they were developed as standalone stress tests.
  - tests/stress/_fake_worker.py: minimal Python worker that
    exercises the real subprocess contract (reads HERMES_KANBAN_TASK,
    heartbeats, completes via CLI).
  - tests/stress/README.md explains opt-in usage.

Test count: 182/182 kanban suite still pass under scripts/run_tests.sh;
stress suite is additive and out-of-band.
2026-04-27 21:37:34 -07:00
Teknium a24c6e191f fix(kanban): address @erosika's pre-merge review (issue #16102)
Six concrete bugs + two cheap v2 extensions from the review at
https://github.com/NousResearch/hermes-agent/issues/16102#issuecomment-4331125835
Larger items (structured comments as session substrate, taxonomy
reorg) deferred to v2 with reply posted on the issue.

Pre-merge bug fixes
  - unblock_task: close any stale current_run_id pointer with a
    reclaimed run inside the unblock txn. Defensive; the invariant
    holds under current data paths (block_task already closes the
    run) but a future or external write that leaves the pointer
    dangling would otherwise persist across the ready->blocked->
    ready cycle. Mirrors the same pattern in claim_task +
    archive_task.
  - Migration backfill: wrap the in-flight backfill loop in
    write_txn and add a CAS guard (`current_run_id IS NULL`) on
    the pointer UPDATE, with a cleanup path that marks any orphan
    run row reclaimed if the CAS fails. Prevents races against a
    concurrent dispatcher between SELECT and INSERT.
  - Notifier sub leak on non-done terminals: unsub on the last
    delivered event's kind being terminal (completed / blocked /
    gave_up / crashed / timed_out), not just on task.status in
    (done, archived). blocked / gave_up / crashed / timed_out used
    to fire one ping then strand the subscription row forever.
  - Notifier thrashes dead chats: per-subscription send-failure
    counter keyed on (task_id, platform, chat_id, thread_id).
    After 3 consecutive adapter.send exceptions, drop the sub
    automatically. Counter resets on any successful send.

Daemon ops visibility
  - run_daemon on_tick now tracks consecutive ticks where the
    ready queue is non-empty but 0 spawns succeeded. After 6 such
    ticks (default ~30s at interval=5), emits a WARN line to
    stderr pointing at profile health (venv, PATH, credentials)
    and `hermes kanban list --status blocked`. Rate-limited to
    one message per 5 minutes so a persistent outage doesn't
    spam logs.

v2 extensions shipped in scope (pure upside)
  - build_worker_context: new "Recent work by @assignee" section
    surfacing the 5 most-recent completed runs for the current
    task's assignee (excluding this task). Bounded, cached by
    the natural LIMIT, no new dependencies. Skipped when the
    task has no assignee.
  - Gateway notifier message prefix: terminal pings now lead
    with `@<assignee>` so fleets (one chat subscribing to many
    tasks with different workers) stay legible at a glance.
    One-line template change.

Deferred to v2 (noted in reply to erosika)
  - recompute_ready full-scan starvation at 10k+ tasks: dirty-set
    approach is a real refactor; fine as follow-up.
  - Skill ↔ assignee validation for routing: depends on skill
    introspection surface that isn't nailed down.
  - Structured comments (in_reply_to / addressed_to / kind) as
    multi-peer session substrate: schema-affecting, exactly the
    v2-scope design vulcan flagged shouldn't cram into this PR.
  - Pattern vs mechanism taxonomy split in docs: pure docs reorg,
    low urgency.

Tests (+6 in core functionality)
  - unblock_invariant_recovery (engineered leak, defensive close)
  - unblock_normal_path_no_spurious_run (no run created on happy
    block->unblock; erosika's main concern)
  - migration_backfill_idempotent_under_re_run (3x init_db on a
    legacy-shape DB yields exactly 1 run row, not 3)
  - build_worker_context_includes_role_history (role continuity)
  - build_worker_context_role_history_skipped_when_no_assignee
  - build_worker_context_role_history_bounded_to_5

180/180 kanban suite pass under scripts/run_tests.sh. Live-smoke
exercised all three kernel fixes end-to-end with isolated
HERMES_HOME.
2026-04-27 20:40:49 -07:00
Teknium 7206eed319 docs(kanban): add step-by-step tutorial with 10 dashboard screenshots
New website/docs/user-guide/features/kanban-tutorial.md walks four
user stories end-to-end, each backed by a real screenshot of the
dashboard running against seeded data.

Stories
  1. Solo dev shipping a feature (parent->child dependencies,
     structured handoff, run history rendering).
  2. Fleet farming (parallel independent tasks across 3 assignees,
     lanes-by-profile grouping, dispatcher daemon).
  3. Role pipeline with retry (PM spec -> eng implements -> review
     blocks -> eng retries -> review approves; two-run history
     visible in the drawer; downstream workers pull parent
     summary+metadata).
  4. Circuit breaker + crash recovery (2 spawn_failed + 1 gave_up
     for a deploy with missing creds; 1 crashed + 1 completed for
     an OOM-killed migration that recovered on retry).

Each story shows both CLI commands and the dashboard drawer
equivalent. Screenshots captured via playwright + chromium at 2x
device scale, then repalettized with PIL (22MB -> 6.1MB for the
10-image set, no visible quality loss verified against vision).

Side updates
  - website/sidebars.ts: added kanban-tutorial under features.
  - website/docs/user-guide/features/kanban.md: prefix banner
    linking new readers to the tutorial before the reference.

All image references validate: `/img/kanban-tutorial/*` maps to
website/static/img/kanban-tutorial/ (10 files). Docusaurus build
not run locally (no node_modules in worktree); CI build on merge
will confirm.
2026-04-27 20:28:53 -07:00
Teknium 1619c0e503 fix(kanban): third pass — auto-init on first use, show --json carries runs[]
Found during full-stack live-test of the kanban system: two bugs where
the kernel and CLI didn't match the documented contract.

Kernel
  - `connect()` now auto-runs schema creation + migrations on the
    first connection to a given DB path, matching its docstring
    ('Open (and initialize if needed)' — which was aspirational until
    now). Module-level _INITIALIZED_PATHS cache keeps subsequent
    connects cheap. Previously the docstring lied: every path that
    went through connect() on a fresh HERMES_HOME raised 'no such
    table: tasks' — only `hermes kanban init` and `daemon` triggered
    schema creation.
  - `init_db()` always re-runs the migration pass (clears the cache
    entry first). Callers that know the on-disk schema may have
    drifted — tests writing legacy event kinds, external tools
    upgrading an old DB file — can force re-migration.

CLI
  - `kanban_command()` entry point auto-inits the DB before
    dispatching any subcommand. Idempotent; the underlying
    connect-based init pattern makes this a one-line SELECT against
    sqlite_master after the first call.
  - `hermes kanban show --json` now includes:
      - `runs`: full attempt history (id, profile, step_key, status,
        outcome, summary, error, metadata, worker_pid, started_at,
        ended_at)
      - `run_id` on every event object
    Dashboard API already had both; CLI was behind. Now scripts that
    inspect a task can use `show --json` alone.
  - `hermes kanban show` (human-readable) prints a Runs section
    matching the `runs` subcommand's format, and each Event line
    prefixes its run_id when present. Makes attempt attribution
    visible at a glance without a second command.

Tests (+3)
  - cli_create_on_fresh_home_auto_inits (subprocess, no init_db call,
    must succeed — covers the most common first-user path)
  - connect_auto_inits_fresh_db (direct kernel use without init_db)
  - cli_show_json_carries_runs (runs[] present; events carry run_id)

174/174 kanban suite pass under scripts/run_tests.sh.

Live-tested end-to-end
  - Phases 1-6: CLI subprocess (create/claim/complete/bulk-guard/
    synthetic-run/reclaim-via-TTL/multi-attempt history)
  - Phases 7-10: Dashboard FastAPI TestClient (POST/PATCH with
    summary+metadata, drag-drop running->ready, archive-while-
    running, mark-done-with-handoff)
  - Phases 11-13: Dispatcher with stub spawn_fn (3 tasks success,
    3 failures -> gave_up circuit breaker, event.run_id attribution
    across retries)
  - Phases 14-15: WebSocket /events (run_id payload, auth rejects
    wrong tokens with code 1008)
  - Phase 16: Gateway notifier (unseen_events_for_sub returns
    run_id on events, message renders 'done — title' + handoff
    summary from event payload, crashed event path, sub cleanup)

Every surface — kernel, dispatcher, CLI, dashboard REST, dashboard
WebSocket, gateway notifier — exercised end-to-end against a live
FastAPI app with a real SQLite DB in an isolated HERMES_HOME. All
passed.
2026-04-27 19:41:08 -07:00
Teknium e27c819de3 fix(kanban): deep-scan pass 2 — synthetic runs, event.run_id plumbing, invariant recovery, live drawer refresh
Second integration audit covering surfaces the first pass didn't hit.
Found eight issues spanning kernel, dashboard frontend, notifier, and CLI.
All behavioral / UX fixes; no schema change.

Kernel
  - complete_task on a never-claimed task (ready/blocked → done with no
    run in flight) was silently dropping the summary/metadata/result
    onto a non-existent run. Now synthesizes a zero-duration run
    (started_at == ended_at) so attempt history is complete. Only
    fires when there's actually handoff data to persist — bare
    complete_task(tid) remains a no-op for run creation.
  - block_task on a never-claimed task had the same bug for --reason.
    Same fix: synthesize a zero-duration run when a reason is passed.
  - Event dataclass gained a `run_id: Optional[int] = None` field.
    list_events, unseen_events_for_sub, and the dashboard _event_dict
    were all SELECTing the column but dropping it on the way out,
    so downstream consumers couldn't group events by attempt. Every
    read path now surfaces run_id.
  - claim_task got a defensive invariant-recovery step: if somehow
    `current_run_id` is non-NULL on a task in 'ready' status (invariant
    violation from an unknown code path), close the leaked run as
    'reclaimed' inside the same txn as the new claim. No-op in the
    common case; belt-and-suspenders in case a future code path forgets
    to clear the pointer.

Dashboard
  - GET /tasks/:id events array now carries run_id per event (via
    _event_dict).
  - WebSocket /events SELECT now includes run_id in the pushed event
    payload.
  - TaskDrawer reloads itself on live events for its own task id. New
    `taskEventTick[taskId]` state in the Board, incremented on every
    WS event, passed down as `eventTick` prop; drawer's useEffect
    depends on it. Previously, background workers completing a task
    the user was viewing left the drawer showing stale data until
    manual close/reopen.
  - CSS: added `.hermes-kanban-run--ended` rule for the fallback class
    the JS emits when outcome is unset. Harmless before; just
    inconsistent.

CLI
  - `hermes kanban watch --kinds` help text listed the legacy event
    name `spawn_auto_blocked`. The kernel migration renames it to
    `gave_up`, so users typing the documented name got zero matches.
    Now shows the current lexicon (`completed,blocked,gave_up,
    crashed,timed_out`).

Tests (+6 in core functionality, +1 in dashboard plugin)
  - complete_never_claimed_task_synthesizes_run
  - block_never_claimed_task_synthesizes_run
  - complete_never_claimed_without_handoff_skips_synthesis
  - event_dataclass_carries_run_id (created.run_id None, completed.run_id matches)
  - unseen_events_for_sub_includes_run_id (notifier path)
  - claim_task_recovers_from_invariant_leak (engineer the leak, verify recovery)
  - event_dict_includes_run_id (dashboard API shape)

171/171 kanban suite pass under scripts/run_tests.sh. Live-smoke (isolated
HERMES_HOME via execute_code) exercised all six fixed paths plus the
claim-after-leak recovery sequence.

Docs
  - Runs section: new 'Synthetic runs for never-claimed completions'
    and 'Live drawer refresh' paragraphs explaining the invariants.
  - Event reference: `created` / `promoted` / `unblocked` entries now
    explicitly note `run_id` is `NULL`; `completed` / `blocked`
    describe synthetic-run fallback.
2026-04-27 19:23:49 -07:00
Teknium 1c78f6627a docs(kanban): document audit-pass invariants — bulk-close guard, reclaimed-on-status-change, completed event carries summary
- Runs section: dashboard PATCH parity (summary/metadata forward),
  `completed` event embeds first-line summary for notifiers, bulk
  --summary/--metadata refused, archive/drag-drop reclaim semantics.
- Event reference: added Payload column to Lifecycle and Edits
  tables; called out the invariant that `status` carries run_id
  when closing a reclaimed run.
2026-04-27 08:46:08 -07:00
Teknium 8ef2ae6502 fix(kanban): audit pass — close orphaned runs on archive / dashboard direct-status / drag-drop
Integration audit of the runs-as-first-class work (0146cb2bd) found five
bugs where structured runs got orphaned or dashboard parity was missing.
All behavioral fixes; no schema change needed.

Kernel
  - archive_task: when called on a running task, now closes the
    in-flight run with outcome='reclaimed' and clears current_run_id.
    Previously, dashboard bulk-archive or CLI `kanban archive <running>`
    would leave the task_runs row open with ended_at=NULL forever and
    strand the pointer. Adds the claim_lock / claim_expires / worker_pid
    clearing to the UPDATE so the task row is clean too.
  - complete_task: embeds the first-line handoff summary in the
    `completed` event payload (capped at 400 chars). Notifier can now
    render `✔ task done — <title>\n<summary>` without a second SQL hit,
    and the full summary still lives on the run row.

Dashboard plugin
  - _set_status_direct: drag-drop OFF 'running' (to 'ready', 'todo',
    'triage', 'done' — anywhere except back to 'running') now closes
    the active run with outcome='reclaimed'. Clears worker_pid too.
    Snapshots previous status + current_run_id before the UPDATE so
    the decision has the right before-state. status event rows now
    carry run_id when closing a run, NULL otherwise.
  - UpdateTaskBody: adds `summary` and `metadata` fields. PATCH
    /tasks/:id with status='done' now forwards them to complete_task,
    giving the dashboard parity with `hermes kanban complete --summary
    ... --metadata ...`. Previously these fields only existed on the
    CLI.

CLI
  - `hermes kanban complete a b c --summary X` or `--metadata Y`:
    refused with a clear stderr message instead of silently applying
    the same handoff to every task. Bulk-close without handoff flags
    still works. (Note: hermes_cli.main discards subcommand exit
    codes via `args.func(args)` without propagating; tracked
    separately. Side-effect check is the real guard.)

Gateway notifier
  - Completion message prefers run.summary (carried in event payload)
    over task.result. task.result remains the fallback for legacy rows
    written before runs shipped.
  - Docstring: renamed stale `spawn_auto_blocked` reference to
    `gave_up` / `timed_out` — matches the actual TERMINAL_KINDS
    tuple, which was already correct in code.

Tests (+8 in core functionality, +3 in dashboard plugin)
  - archive_of_running_task_closes_run
  - archive_of_ready_task_does_not_create_spurious_run
  - dashboard_direct_status_change_off_running_closes_run
  - dashboard_direct_status_change_within_same_state_is_noop_for_runs
  - cli_bulk_complete_with_summary_rejects (side-effect assertion)
  - cli_bulk_complete_without_summary_still_works
  - completed_event_payload_carries_summary
  - completed_event_payload_summary_none_when_missing
  - patch_status_done_with_summary_and_metadata
  - patch_status_done_without_summary_still_works (legacy path)
  - patch_status_archive_closes_running_run (E2E through FastAPI TestClient)

164/164 kanban suite pass under scripts/run_tests.sh. Live smoke
(execute_code with isolated HERMES_HOME) covered all five fixed paths
plus a re-claim-after-drag-drop to confirm the fresh run is tracked
correctly after the orphan close.
2026-04-27 07:44:39 -07:00
Teknium 0146cb2bd2 feat(kanban): runs as first-class (v1); structured handoffs; forward-compat for v2 workflows
Addresses vulcan-artivus's RFC review on issue #16102. Picks up the
structural changes that are expensive to retrofit later and zero-cost
to land now; defers workflow-template routing + per-stage lanes to v2
(kept forward-compat hooks in the schema).

Kernel
  - New `task_runs` table. Each claim opens a run (pid, claim_lock,
    heartbeat, max_runtime, started_at), each terminal transition
    closes it with an outcome (completed / blocked / crashed /
    timed_out / spawn_failed / gave_up / reclaimed). Multiple rows per
    task when retries happen, preserving full attempt history.
  - `tasks.current_run_id` points at the active run (NULL when idle);
    denormalised for cheap reads.
  - `task_events.run_id` carries the run a given event belongs to so
    UIs group events by attempt. claim/spawned/complete/block/crash/
    timeout/spawn_fail/gave_up/heartbeat events are all run-scoped;
    created/promoted/assigned/edited stay task-scoped (run_id=NULL).
  - Legacy DBs: migration adds the columns + indexes + synthesizes a
    run row for any task that's 'running' before the runs table
    existed, so subsequent complete/heartbeat/reclaim calls have a
    target. Idempotent.

Structured handoff
  - `complete_task(summary=, metadata=)` persists both on the closing
    run. `summary` falls back to `result` when omitted so single-run
    callers don't duplicate. `metadata` is a free-form dict
    ({changed_files, tests_run, findings, ...}).
  - `build_worker_context` rewrites: now reads "Prior attempts on this
    task" (closed runs: outcome, summary, error, metadata) and
    "Parent task results" pulls run.summary + run.metadata of the
    most-recent completed run per parent, falling back to task.result
    for legacy rows without runs. Retrying workers see why earlier
    attempts failed; downstream workers see parent handoffs
    structurally, not as loose `result` strings.

CLI
  - `hermes kanban complete <id> --summary "..." --metadata '{"files":1}'`.
    JSON is parsed and rejected with exit-2 if malformed.
  - New `hermes kanban runs <id> [--json]` verb. Shows per-run rows:
    outcome, profile, elapsed, summary, error. JSON mode serializes
    the full run dataclass for scripting.

Dashboard plugin
  - GET /tasks/:id now carries a runs[] array alongside task / events /
    comments / links. Each run serialised with outcome, summary,
    metadata, worker_pid, elapsed fields.
  - New Run History section in the drawer. Outcome-coloured left
    border (green=active, blue=completed, amber=reclaimed,
    red=crashed/timed_out/gave_up/blocked). Collapsed when >3 runs
    with a '+N earlier' toggle. Shows summary + error + metadata
    inline.

Forward-compat for v2 (vulcan's workflow templates + stages)
  - `tasks.workflow_template_id` and `tasks.current_step_key` added as
    nullable columns. v1 kernel ignores them for routing; v2 will add
    workflow_templates + workflow_steps tables and wire the dispatcher
    to consult them. task_runs has a matching `step_key` column. Lets
    a v2 release land additively without another schema migration.

Tests (+22 in test_kanban_core_functionality.py, +2 in dashboard)
  - run_created_on_claim / run_closed_on_complete_with_summary
  - run_summary_falls_back_to_result
  - multiple_attempts_preserved_as_runs (3 attempts: reclaimed →
    crashed → completed, all visible in list_runs)
  - run_on_block_with_reason / run_on_spawn_failure_records_failed_runs
    (5 spawn_failed runs + 1 gave_up run)
  - event_rows_carry_run_id (task-scoped vs run-scoped split)
  - build_worker_context_includes_prior_attempts
  - build_worker_context_uses_parent_run_summary (metadata JSON in context)
  - migration_backfills_inflight_run_for_legacy_db (simulates a
    pre-migration running task, re-runs init_db, asserts backfill)
  - forward_compat_columns_writable
  - cli_runs_verb + cli_runs_json
  - cli_complete_with_summary_and_metadata (JSON round-trip through
    shlex + argparse)
  - cli_complete_bad_metadata_exits_nonzero
  - task_detail_includes_runs / task_detail_runs_empty_before_claim

269/269 kanban suite pass under scripts/run_tests.sh. Live-smoke
covered: single-attempt complete → run closed + summary persisted;
retry scenario → two runs visible (blocked + completed); parent run
summary + metadata surfaced to child via build_worker_context;
forward-compat columns writable via UPDATE; GET /tasks/:id returns
runs[].

Docs
  - New 'Runs — one row per attempt' section in kanban.md: the
    why (full attempt history, structured metadata), the two-table
    model (task is logical, run is execution), the structured handoff
    shape (--summary / --metadata), example CLI + dashboard output,
    forward-compat note for v2.
  - Event reference updated to mention task_events.run_id.
  - CLI reference gains 'hermes kanban runs <id>'.

Not in v1 (deferred to v2):
  - Workflow templates (workflow_templates + workflow_steps tables,
    stage-based routing, success/failure step links).
  - 'stage' as a distinct axis from status in the UI.
  - Shared-by-default workspace binding across stages of the same
    workflow run.
  - Pipeline replacement for the kanban-orchestrator skill (the
    orchestrator's 'decompose, don't execute' guidance is still
    correct; it becomes partly redundant once workflows land).
2026-04-27 06:54:19 -07:00
Teknium da7d09c3b6 feat(kanban): max-runtime timeouts, worker heartbeats, assignees picker, event vocab cleanup
Ports four items from the Multica audit (https://github.com/multica-ai/multica).
Dropped their cross-host server/daemon architecture and their Postgres+pgvector
skill search — both the wrong shape for our single-host SQLite kernel.

1. Per-task max-runtime (`max_runtime_seconds` column)
   - New kernel function `enforce_max_runtime(conn)` runs in every dispatch
     tick. When a running task's elapsed time exceeds the cap, we SIGTERM
     the worker, wait a 5 s grace (polling _pid_alive), then SIGKILL. The
     task goes back to 'ready' with a `timed_out` event and re-queues
     on the next tick (unless the spawn-failure circuit breaker has
     already parked it).
   - Host-local only: lock prefix must match this host's claimer_id so we
     never signal a PID on another machine.
   - CLI: `hermes kanban create --max-runtime 30m | 2h | 1d | <seconds>`.
     New `_parse_duration` helper accepts s/m/h/d suffixes or bare
     integers.
   - Dashboard POST body + the card's `max_runtime_seconds` field.

2. Worker heartbeat (`last_heartbeat_at` column, `heartbeat` event)
   - `heartbeat_worker(conn, task_id, note=None)` emits the event and
     touches last_heartbeat_at. Refused when the task isn't running.
   - CLI: `hermes kanban heartbeat <id> [--note "..."]`.
   - kanban-worker skill instructs workers to heartbeat during long
     loops (training runs, encodes, crawls, batch uploads).
   - Separate signal from PID crash detection: a worker's Python can
     still be alive while the actual work process is stuck. Heartbeat
     absence is diagnostic; future work can auto-block on stale
     heartbeats but v1 just surfaces the signal.

3. Assignee enumeration (`known_assignees`, `list_profiles_on_disk`)
   - Scans ~/.hermes/profiles/ for dirs containing config.yaml + unions
     with current assignees on the board. Each entry returns
     {name, on_disk, counts: {status: n}}.
   - CLI: `hermes kanban assignees [--json]`. Also hooked into
     `hermes kanban init` which now prints discovered profiles so new
     installs see 'these are the assignees you can target' immediately.
   - Dashboard: GET /api/plugins/kanban/assignees for the picker.

4. Event vocab cleanup (three renames + three new kinds)
   - `ready` → `promoted` (fires when deps clear; clearer semantic).
   - `priority` → `reprioritized` (past-tense verb, matches others).
   - `spawn_auto_blocked` → `gave_up` (short, memorable; the circuit
     breaker gave up on this task).
   - New: `spawned` (emitted with {pid} on successful spawn),
     `heartbeat` ({note?}), `timed_out`
     ({pid, elapsed_seconds, limit_seconds, sigkill}).
   - One-shot migration in `_migrate_add_optional_columns` renames
     legacy rows in-place on init_db(), so existing DBs upgrade cleanly.
   - Gateway notifier's TERMINAL_KINDS set updated; timed_out gets its
     own ⏱ message template, gave_up renamed from 'auto-blocked'.
   - Plugin_api.py's two 'priority' emit sites renamed to
     'reprioritized'.
   - Documented in a new 'Event reference' section in kanban.md,
     grouped into three clusters (lifecycle / edits / worker
     telemetry) with payload shapes.

Tests (+18 in tests/hermes_cli/test_kanban_core_functionality.py,
136/136 pass):
  - max_runtime_terminates_overrun_worker: real SIGTERM flow with
    _pid_alive stub, verifies event payload + state reset.
  - max_runtime_none_means_no_cap: unbounded tasks aren't timed out.
  - create_task_persists_max_runtime.
  - enforce_max_runtime_integrates_with_dispatch: kernel-level +
    dispatch_once chaining.
  - heartbeat_on_running_task + heartbeat_refused_when_not_running.
  - cli_heartbeat_verb with --note round-trip.
  - recompute_ready_emits_promoted_not_ready.
  - spawn_failure_circuit_breaker_emits_gave_up.
  - spawned_event_emitted_with_pid.
  - migration_renames_legacy_event_kinds (injects old rows, re-runs
    init_db, asserts rename).
  - list_profiles_on_disk (tmp_path + config.yaml filter).
  - known_assignees_merges_disk_and_board (profiles on disk + board
    assignees + per-status counts).
  - cli_assignees_json.
  - parse_duration_accepts_formats (s/m/h/d/float).
  - parse_duration_rejects_garbage.
  - cli_create_max_runtime_via_duration (2h → 7200).
  - cli_create_max_runtime_bad_format_exits_nonzero.

Live smoke: POST /tasks with max_runtime_seconds round-trips;
/assignees returns the union of on-disk + board-assigned names;
PATCH priority produces 'reprioritized' events (not 'priority');
board cards expose max_runtime_seconds + last_heartbeat_at.

Docs (website/docs/user-guide/features/kanban.md):
  - New 'Event reference' section with three-cluster table
    (lifecycle / edits / worker telemetry) + payload shapes.
  - CLI reference updated for --max-runtime, heartbeat, assignees.
  - Gateway notifications section updated for the new TERMINAL_KINDS.

Not ported from Multica (deliberate, documented in the out-of-scope
section already): Postgres+pgvector skill search (heavy deps conflict
with SQLite kernel), server+daemon cross-host model (we're
single-host on purpose), first-class agent identity with threaded
comments (we keep the board profile-agnostic).
2026-04-27 06:32:17 -07:00
Teknium af8d43dbbb feat(kanban): core hardening — daemon, circuit breaker, crash detect, logs, notify, bulk, stats
Eliminates every 'known broken on day one' item in the core functionality
audit. The board is now self-driving (daemon, not cron), self-healing
(crash detection, spawn-failure circuit breaker), and self-reporting
(logs, stats, gateway notifications).

Dispatcher
  - New `hermes kanban daemon` long-lived loop with --interval, --max,
    --failure-limit, --pidfile, --verbose, signal-clean shutdown
    (SIGINT/SIGTERM via threading.Event). A kb.run_daemon() entry point
    lets tests drive it inline without subprocess.
  - `hermes kanban init` now prints the dispatcher setup hint so users
    don't leave the board off-by-default. Ships a systemd user unit at
    plugins/kanban/systemd/hermes-kanban-dispatcher.service.
  - Removed the old 'add this to cron' doc path. Cron runs agent
    prompts (LLM cost per tick) — unacceptable for a per-minute
    coordination loop.

Worker aliveness / safety
  - Spawn returns the child's PID; dispatcher stores it on the task row
    and calls detect_crashed_workers() every tick. If the PID is gone
    but the claim TTL hasn't expired, the task drops back to ready with
    a 'crashed' event. Host-local only — cross-host PIDs are ignored
    per the single-host design.
  - Spawn-failure circuit breaker: after N consecutive spawn_failed
    events on the same task (default 5), the dispatcher auto-blocks
    with the last error as the reason. Success resets the counter.
    Workspace-resolution failures count against the same budget.
  - Log rotation: _rotate_worker_log trims at 2 MiB, keeps one
    generation (.log.1), bounds per-task disk usage at ~4 MiB.

Idempotency / dedup
  - create_task(idempotency_key=...) returns the existing non-archived
    task id for retried webhooks. --idempotency-key on the CLI, json
    body field on the dashboard plugin. Archived tasks don't block a
    fresh create with the same key.

CLI surface
  - Bulk verbs: complete, unblock, archive accept multiple ids;
    block accepts --ids for sibling blocks with the same reason.
  - New verbs: daemon, watch (live event tail filtered by
    assignee/tenant/kinds), stats, log, notify-subscribe,
    notify-list, notify-unsubscribe.
  - dispatch gains --failure-limit + crashed/auto_blocked columns in
    JSON output and human-readable output.
  - gc accepts --event-retention-days / --log-retention-days; prunes
    task_events for terminal tasks and old log files.

Gateway integration
  - New GatewayRunner._kanban_notifier_watcher: polls
    kanban_notify_subs every 5s, pushes ✔/⏸/✖ messages to subscribed
    chats for completed/blocked/spawn_auto_blocked/crashed events.
    Cursor-advanced per-sub; auto-removed when the task reaches
    done/archived. Runs alongside the session expiry and platform
    reconnect watchers — SQLite work in asyncio.to_thread so the
    event loop never blocks.
  - /kanban create in the gateway auto-subscribes the originating
    chat (platform + chat_id + thread_id). Users see
    '(subscribed — you'll be notified when t_abcd completes or
    blocks)' appended to the response.

Dashboard plugin
  - GET /stats returns board_stats (by_status, by_assignee,
    oldest_ready_age_seconds).
  - GET /tasks/:id/log returns the worker log with optional ?tail=N
    cap. 404 on unknown task, exists=false when the task has never
    spawned.
  - POST /tasks accepts idempotency_key; both Pydantic body and the
    create_task kwarg now round-trip.
  - /board attaches task.age (created/started/time_to_complete in
    seconds) so the UI can colour stale cards without recomputing.
  - Card CSS: amber border after N minutes, red border when clearly
    stuck (tier per status: running 10m/60m, ready 1h/24h, todo
    7d/30d, blocked 1h/24h).
  - Drawer: new Worker log section, auto-loads on mount, last 100 KB
    cap with on-disk path surfaced when truncated.

Kernel
  - Schema additions: tasks.idempotency_key, tasks.spawn_failures,
    tasks.worker_pid, tasks.last_spawn_error; new
    kanban_notify_subs table. All gated by _migrate_add_optional_columns
    so legacy DBs upgrade cleanly.
  - release_stale_claims / complete_task / block_task now all clear
    worker_pid so crash detection doesn't false-positive on reclaimed
    tasks.
  - read_worker_log fixed: tail-skip no longer eats one-giant-line
    logs (common with child processes that don't flush newlines
    before dying).

Tests (tests/hermes_cli/test_kanban_core_functionality.py, 28 new)
  - Idempotency: same key returns existing, archived doesn't block,
    no key never collides
  - Circuit breaker: auto-blocks after limit, success resets counter,
    workspace-resolution failure counts against budget
  - Aliveness: _pid_alive helper, detect_crashed_workers reclaims
    exited child
  - Daemon: runs and stops cleanly via stop_event, survives a tick
    exception
  - Stats + task_age helpers
  - Notify subs: CRUD, cursor advances, distinct-thread is a separate row
  - GC: events-only-for-terminal-tasks, old worker logs deleted
  - Log: rotation keeps one generation, read_worker_log tail
  - CLI: bulk complete/archive/unblock/block, create with
    --idempotency-key, stats --json, notify-subscribe+list, log
    missing task, gc reports counts
  - run_slash parity: smoke-tests every registered verb (23
    invocations); none may raise or return empty string

Full kanban test suite: 234/234 pass under scripts/run_tests.sh
(60 original + 30 dashboard plugin + 28 new core + 116 command
registry). Live smoke covers /stats, idempotency, age, log endpoint
with and without content, log?tail= truncation signal, 404 on unknown
task.

Docs (website/docs/user-guide/features/kanban.md)
  - 'Core concepts' rewritten: new statuses (triage), idempotency key,
    dispatcher-as-daemon-not-cron with circuit breaker behaviour
    documented.
  - Quick start swapped to daemon. New systemd section covers user
    service install.
  - New sections: idempotent create, bulk verbs, gateway
    notifications, out-of-scope single-host note (kanban.db is local;
    don't expect multi-host).
  - CLI reference updated for every new verb, every new flag.
2026-04-26 13:01:09 -07:00
Teknium 27fc6c1086 feat(kanban): bulk ops, drawer edit, dep editor, markdown, touch, config
The dashboard plugin gets the last layer of features that turn it from a
'usable read surface with drag-drop' into a 'full kanban UI' — no more
'drop to CLI to do X' moments from inside the tab.

Plugin backend
  - POST /tasks/bulk — apply the same patch (status / archive / assignee
    / priority) to every id in the request body. Each id runs
    independently: one bad id reports {ok: false, error: ...} without
    aborting siblings. Status transitions that aren't legal for the
    current state are surfaced per-id ('transition to done refused').
    Used by the multi-select bulk action bar.
  - GET /config — returns the dashboard.kanban section of config.yaml
    (default_tenant, lane_by_profile, include_archived_by_default,
    render_markdown) with sensible defaults when the section is absent.
    Loaded once by the SPA to preselect filters and toggle markdown
    rendering.
  - _conn() helper — every handler now goes through it, calling
    kanban_db.init_db() (idempotent) before every connection. Fresh
    installs work whether the first hit is GET /board, POST /tasks, or
    any other endpoint — no more 'no such table: tasks' when the CLI
    or a script hits the plugin before the dashboard has ever loaded.

Plugin UI (plugin bundle, +~12 KB)
  - Multi-select: per-card checkbox; shift/ctrl-click also toggles
    without opening the drawer. A BulkActionBar appears above the
    columns with batch → ready / complete / archive / reassign
    (profile dropdown + unassign option). Destructive batches confirm
    first. Partial failures from the backend are surfaced inline.
  - Drawer inline editing:
    - Click the title → TitleEditor swaps in an input, Enter saves,
      Escape cancels.
    - Click the Assignee meta row → AssigneeEditor input (empty string
      unassigns).
    - Click the Priority meta row → PriorityEditor numeric input.
    - New 'edit' button on Description → full-width textarea; Save /
      Cancel switch back to rendered view.
  - Dependency editor: chip list of parents + children with per-chip
    × button (calls DELETE /links). Add-parent / add-child dropdowns
    filter out self + already-linked tasks so you cannot re-add a
    duplicate edge or a self-loop. Cycle rejections from the server
    surface cleanly via the existing error banner.
  - Parent selection in InlineCreate: new dropdown listing every task
    on the board ('{id} — {title}') — picking one sends parents=[id]
    with the create payload, so the task lands in todo (or triage if
    created from the Triage column) with the dependency wired up.
  - Safe markdown rendering for description, comment bodies, and
    result. A small in-bundle renderer handles headings, bold, italic,
    inline code, fenced code, bullet lists, and http(s)/mailto links.
    Every substitution runs on HTML-escaped input (no raw HTML), links
    get target=_blank + rel=noopener,noreferrer. Disabled by config
    key dashboard.kanban.render_markdown=false (falls back to <pre>).
  - Touch drag-drop: attachTouchDrag() installs a pointerdown handler
    that spawns a drag proxy, tracks elementFromPoint under the finger,
    and dispatches a hermes-kanban:drop CustomEvent on the column when
    released. Desktop continues to use native HTML5 DnD. Columns
    listen for both.
  - ErrorBoundary already present from the prior commit catches any
    renderer throw; markdown escape + touch-proxy cleanup both have
    their own try/finally.

Tests (tests/plugins/test_kanban_dashboard_plugin.py — 90/90 pass)
  - bulk_status_ready: 3 tasks blocked, batch → ready, all move
  - bulk_archive hides all ids from default board
  - bulk_reassign changes every assignee
  - bulk_unassign_via_empty_string sets assignee back to None
  - bulk_partial_failure_doesnt_abort_siblings: bogus id in middle,
    good siblings still get priority=7
  - bulk_empty_ids_400
  - config_returns_defaults_when_section_missing
  - config_reads_dashboard_kanban_section (writes config.yaml, verifies
    every key round-trips)

Live smoke (real FastAPI app + isolated HERMES_HOME):
  - /config without section returns defaults
  - /config with dashboard.kanban section returns the configured values
  - POST /tasks as the first-ever request (no prior /board) succeeds —
    auto-init handles it
  - Link add + remove via POST /links + DELETE /links round-trip
  - Bulk priority bump on 2 ids, both get priority=5
  - Bulk archive hides ids from default board
  - PATCH {title, body} updates the task, markdown source survives
    the round trip
  - POST /tasks {triage: true, parents: [id]} lands in triage, not todo
  - Bulk partial: 2 good + 1 bogus returns per-id outcome

Docs (website/docs/user-guide/features/kanban.md)
  - 'What the plugin gives you' rewritten to reflect bulk, drawer
    edit, dep editor, parent-on-create, markdown, touch drag-drop.
  - New 'Dashboard config' subsection with a YAML example for
    dashboard.kanban.*.
  - REST table gains /tasks/bulk and /config rows.
2026-04-26 12:36:23 -07:00
Teknium 45806629c5 feat(kanban): Triage column, progress rollup, WS auth, lanes, polish
Follows up on the initial dashboard plugin with the items called out
during self-review — ships the GUI-reality claims the PR body made,
closes the WebSocket auth gap, and lands the 'Triage' status the design
spec's Fusion-style screenshot leads with.

Kernel changes
  - kanban_db.VALID_STATUSES gains 'triage'. status is TEXT without a
    CHECK constraint so no schema migration is needed.
  - create_task(triage=True) forces the initial status to 'triage'
    regardless of parents, and parent ids are still validated so the
    eventual link rows don't dangle. recompute_ready() only promotes
    'todo' -> 'ready', so triage tasks are naturally isolated from the
    dispatcher pipeline.
  - hermes kanban create gains --triage.
  Patterns table (docs) gains P9 'Triage specifier'.

Plugin backend (plugins/kanban/dashboard/plugin_api.py)
  - GET /board now auto-init's kanban.db on first read (idempotent).
    A fresh install shows an empty board instead of 'failed to load'.
  - GET /board returns a new 'progress' field per task — {done, total}
    of child-task completion, or None if the task has no children.
  - BOARD_COLUMNS prepends 'triage'.
  - POST /tasks accepts {triage: bool}; PATCH /tasks/:id accepts
    {status: 'triage'}.
  - WebSocket /events now requires ?token=<session_token> as a query
    param — browsers can't set Authorization on a WS upgrade, so this
    matches the pattern the in-browser PTY bridge uses. Constant-time
    compare against hermes_cli.web_server._SESSION_TOKEN. In bare-test
    contexts (no dashboard module) the check no-ops so the tail loop
    stays testable. Security boundary documented in the module header
    and in website/docs/user-guide/features/kanban.md.

Plugin UI (plugins/kanban/dashboard/dist/index.js + style.css)
  - Adds the Triage column (lilac dot) with helper text
    'Raw ideas — a specifier will flesh out the spec'. Inline-create
    from the Triage column parks new tasks in triage.
  - Status action row in the drawer gains '→ triage'.
  - Progress pill (N/M) on cards that have children. Full-complete
    state tints the pill green.
  - 'Lanes by profile' toolbar toggle — sub-groups the Running column
    by assignee so you see at a glance which specialist is busy on
    what.
  - Destructive status moves (done / archived / blocked) via drag-drop
    OR via the drawer action row now prompt for confirmation.
  - Escape closes the drawer.
  - Live-update reloads are debounced (250ms) so a burst of
    task_events triggers one refetch, not N.
  - WebSocket includes ?token= built from window.__HERMES_SESSION_TOKEN__.
  - WebSocket reconnect uses exponential backoff capped at 30s, not
    a fixed 1.5s spin loop, and surfaces a user-visible error on
    code-1008 (auth rejected) instead of reconnecting forever.
  - ErrorBoundary wraps the page — a bad card render shows a
    'rendering error, reload view' card instead of crashing the tab.

Tests (tests/plugins/test_kanban_dashboard_plugin.py, +5 tests = 21)
  - empty-board shape now asserts all 6 columns including 'triage'
  - create_triage_lands_in_triage_column
  - triage_task_not_promoted_to_ready (dispatcher bypasses triage)
  - patch_status_triage_works (both into triage and out of it)
  - board_progress_rollup (0/2 -> 1/2 -> childless cards = None)
  - board_auto_initializes_missing_db
  - ws_events_rejects_when_token_required (three sub-assertions:
    missing → 1008, wrong → 1008, correct → handshake accepted)

All 82 kanban tests pass under scripts/run_tests.sh.

Docs
  - kanban.md 'What the plugin gives you' fully rewritten to match
    shipped reality (triage, progress pill, assignee lanes,
    destructive-confirm, Escape-close, debounce).
  - New 'Security model' subsection documents the explicit-plugin-
    route-bypass, the WS token requirement, and the --host 0.0.0.0
    warning; also notes that kanban.db is profile-agnostic on purpose
    (the coordination primitive) so cross-profile visibility is
    expected.
  - CLI command reference shows --triage.
  - Collaboration patterns table adds P9 'Triage specifier'.
2026-04-26 12:26:43 -07:00
Teknium 4093201c47 feat(kanban): dashboard plugin — Linear/Fusion-style board UI
Ships plugins/kanban/dashboard/ as a bundled dashboard plugin. No core
changes — uses the standard dashboard plugin contract (manifest.json +
dist/index.js + plugin_api.py) documented in 'Extending the Dashboard'.

What the tab gives you:
- One column per kanban status (todo / ready / running / blocked / done;
  archived behind a toggle), column counts, coloured status dots.
- Cards with id, title, priority badge, tenant tag, assignee,
  comment/link counts, 'created N ago'.
- HTML5 drag-drop between columns — status change routes through the
  same kanban_db code the CLI /kanban verbs use, so the three surfaces
  (CLI, gateway, dashboard) can never drift.
- Inline create per-column (title, assignee, priority).
- Side drawer on card click: description, status action row
  (→ ready / → running / block / unblock / complete / archive),
  dependency links, comment thread with Enter-to-submit,
  last 20 events.
- Toolbar: search, tenant filter, assignee filter, show-archived,
  nudge-dispatcher (skip the 60s wait), refresh.
- Live updates via WebSocket tailing task_events — the board reflects
  CLI or gateway actions in real time.

REST surface under /api/plugins/kanban/: GET /board, GET /tasks/:id,
POST /tasks, PATCH /tasks/:id, POST /tasks/:id/comments, POST /links,
DELETE /links, POST /dispatch, WS /events. Every handler is a thin
wrapper around kanban_db — no new business logic.

Visually theme-aware: the plugin CSS reads only --color-*, --radius,
--font-mono etc. so it reskins with whichever dashboard theme is active.

Tests (tests/plugins/test_kanban_dashboard_plugin.py, 16 tests):
- empty board shape
- create + appears in ready column with tenant/assignee rollups
- tenant filter
- detail includes parents/children/events
- 404 on unknown task
- PATCH status: complete / block / unblock / ready drag-drop / running
- PATCH reassign, priority, edit, invalid-status rejection
- POST comment (plus empty-body rejection)
- POST link + DELETE link + cycle rejection
- POST dispatch (dry run)

All 76 kanban tests pass under scripts/run_tests.sh.

Docs: website/docs/user-guide/features/kanban.md gains a full
'Dashboard (GUI)' section covering install, architecture, REST surface,
live-updates mechanism, extending, and scope boundary.
2026-04-26 12:08:47 -07:00
Teknium 9f610aa8f3 docs(kanban): add GUI/Dashboard plugin section
The /kanban CLI + slash command are enough to run the board
headlessly, but triage and cross-profile supervision want a
visual board. Document the design as a dashboard plugin that:

- reads live state from kanban.db over a WebSocket on
  task_events (no polling)
- writes through run_slash() so CLI/gateway/GUI cannot drift
- mounts under /api/plugins/kanban/ following the existing
  'Extending the Dashboard' plugin shape

The plugin is strictly a thin layer over kanban_db — no new
business logic, nothing to merge into the kernel.
2026-04-26 11:57:04 -07:00
Teknium e1c5e741ad feat(kanban): durable multi-profile collaboration board (#16081)
New `hermes kanban` CLI subcommand + `/kanban` slash command + skills for
worker and orchestrator profiles. SQLite-backed task board
(~/.hermes/kanban.db) shared across all profiles on the host. Zero
changes to run_agent.py, no new core tools, no tool-schema bloat.

Motivation: delegate_task is a function call — sync fork/join, anonymous
subagent, no resumability, no human-in-the-loop. Kanban is the durable
shape needed for research triage, scheduled ops, digital twins,
engineering pipelines, and fleet work. They coexist (workers may call
delegate_task internally).

What this adds
- hermes_cli/kanban_db.py — schema, CAS claim, dependency resolution,
  dispatcher, workspace resolution, worker-context builder.
- hermes_cli/kanban.py — 15-verb CLI surface and shared run_slash()
  entry point used by both CLI and gateway.
- skills/devops/kanban-worker — how a profile should work a claimed task.
- skills/devops/kanban-orchestrator — "you are a dispatcher, not a
  worker" template with anti-temptation rules.
- /kanban slash command wired into cli.py and gateway/run.py. Bypasses
  the running-agent guard (board writes don't touch agent state), so
  /kanban unblock can free a stuck worker mid-conversation.
- Design spec at docs/hermes-kanban-v1-spec.pdf — comparative analysis
  vs Cline Kanban, Paperclip, NanoClaw, Gemini Enterprise; 8 patterns;
  4 user stories; implementation plan; concurrency correctness.
- Docs: website/docs/user-guide/features/kanban.md, CLI reference
  updated, sidebar entry added.

Architecture highlights
- Three planes: control (user + gateway), state (board + dispatcher),
  execution (pool of profile processes).
- Every worker is a full OS process, spawned as `hermes -p <profile>`.
  No in-process subagent swarms — solves NanoClaw's SDK-lifecycle
  failure class.
- Atomic claim via SQLite CAS in a BEGIN IMMEDIATE transaction; stale
  claims reclaimed 15 min after their TTL expires.
- Tenant namespacing via one nullable column — one specialist fleet
  can serve many businesses with data isolation by workspace path.

Tests: 60 targeted tests (schema, CAS atomicity, dependency resolution,
dispatcher, workspace kinds, tenancy, CLI + slash surface). All pass
hermetic via scripts/run_tests.sh.
2026-04-26 08:29:46 -07:00
282 changed files with 18009 additions and 27987 deletions
+3 -6
View File
@@ -14,7 +14,6 @@ from datetime import datetime
from typing import Any, Dict, List, Optional, Set, Tuple
from hermes_constants import OPENROUTER_BASE_URL
from hermes_cli.config import get_env_value
import hermes_cli.auth as auth_mod
from hermes_cli.auth import (
CODEX_ACCESS_TOKEN_REFRESH_SKEW_SECONDS,
@@ -1274,8 +1273,7 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
def _is_source_suppressed(_p, _s): # type: ignore[misc]
return False
if provider == "openrouter":
# Check both os.environ and ~/.hermes/.env file
token = (get_env_value("OPENROUTER_API_KEY") or "").strip()
token = os.getenv("OPENROUTER_API_KEY", "").strip()
if token:
source = "env:OPENROUTER_API_KEY"
if _is_source_suppressed(provider, source):
@@ -1301,7 +1299,7 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
env_url = ""
if pconfig.base_url_env_var:
env_url = (get_env_value(pconfig.base_url_env_var) or "").strip().rstrip("/")
env_url = os.getenv(pconfig.base_url_env_var, "").strip().rstrip("/")
env_vars = list(pconfig.api_key_env_vars)
if provider == "anthropic":
@@ -1312,8 +1310,7 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
]
for env_var in env_vars:
# Check both os.environ and ~/.hermes/.env file
token = (get_env_value(env_var) or "").strip()
token = os.getenv(env_var, "").strip()
if not token:
continue
source = f"env:{env_var}"
+5 -52
View File
@@ -25,7 +25,6 @@ logger = logging.getLogger(__name__)
BUSY_INPUT_FLAG = "busy_input_prompt"
TOOL_PROGRESS_FLAG = "tool_progress_prompt"
OPENCLAW_RESIDUE_FLAG = "openclaw_residue_cleanup"
# -------------------------------------------------------------------------
@@ -44,18 +43,10 @@ def busy_input_hint_gateway(mode: str) -> str:
"Send `/busy interrupt` to make new messages stop the current task "
"immediately, or `/busy status` to check. This notice won't appear again."
)
if mode == "steer":
return (
"💡 First-time tip — I steered your message into the current run; "
"it will arrive after the next tool call instead of interrupting. "
"Send `/busy interrupt` or `/busy queue` to change this, or "
"`/busy status` to check. This notice won't appear again."
)
return (
"💡 First-time tip — I just interrupted my current task to answer you. "
"Send `/busy queue` to queue follow-ups for after the current task instead, "
"`/busy steer` to inject them mid-run without interrupting, or "
"`/busy status` to check. This notice won't appear again."
"or `/busy status` to check. This notice won't appear again."
)
@@ -64,19 +55,13 @@ def busy_input_hint_cli(mode: str) -> str:
if mode == "queue":
return (
"(tip) Your message was queued for the next turn. "
"Use /busy interrupt to make Enter stop the current run instead, "
"or /busy steer to inject mid-run. This tip only shows once."
)
if mode == "steer":
return (
"(tip) Your message was steered into the current run; it arrives "
"after the next tool call. Use /busy interrupt or /busy queue to "
"change this. This tip only shows once."
"Use /busy interrupt to make Enter stop the current run instead. "
"This tip only shows once."
)
return (
"(tip) Your message interrupted the current run. "
"Use /busy queue to queue messages for the next turn instead, "
"or /busy steer to inject mid-run. This tip only shows once."
"Use /busy queue to queue messages for the next turn instead. "
"This tip only shows once."
)
@@ -95,35 +80,6 @@ def tool_progress_hint_cli() -> str:
)
def openclaw_residue_hint_cli() -> str:
"""Banner shown the first time Hermes starts and finds ``~/.openclaw/``.
OpenClaw-era config, memory, and skill paths in ``~/.openclaw/`` will
otherwise attract the agent (memory entries like ``~/.openclaw/config.yaml``
get carried forward and the agent dutifully reads them). ``hermes claw
cleanup`` renames the directory so the agent stops finding it.
"""
return (
"Heads up — an OpenClaw workspace was detected at ~/.openclaw/.\n"
"After migrating, the agent can still get confused and read that "
"directory's config/memory instead of Hermes's.\n"
"Run `hermes claw cleanup` to archive it (rename → .openclaw.pre-migration). "
"This tip only shows once; rerun it any time with `hermes claw cleanup`."
)
def detect_openclaw_residue(home: Optional[Path] = None) -> bool:
"""Return True if an OpenClaw workspace directory is present in ``$HOME``.
Pure filesystem check — no side effects. ``home`` override exists for tests.
"""
base = home or Path.home()
try:
return (base / ".openclaw").is_dir()
except OSError:
return False
# -------------------------------------------------------------------------
# State read / write
# -------------------------------------------------------------------------
@@ -179,13 +135,10 @@ def mark_seen(config_path: Path, flag: str) -> bool:
__all__ = [
"BUSY_INPUT_FLAG",
"TOOL_PROGRESS_FLAG",
"OPENCLAW_RESIDUE_FLAG",
"busy_input_hint_gateway",
"busy_input_hint_cli",
"tool_progress_hint_gateway",
"tool_progress_hint_cli",
"openclaw_residue_hint_cli",
"detect_openclaw_residue",
"is_seen",
"mark_seen",
]
+58 -28
View File
@@ -176,6 +176,64 @@ SKILLS_GUIDANCE = (
"Skills that aren't maintained become liabilities."
)
KANBAN_GUIDANCE = (
"# You are a Kanban worker\n"
"You were spawned by the Hermes Kanban dispatcher to execute ONE task from "
"the shared board at `~/.hermes/kanban.db`. Your task id is in "
"`$HERMES_KANBAN_TASK`; your workspace is `$HERMES_KANBAN_WORKSPACE`. "
"The `kanban_*` tools in your schema are your primary coordination surface — "
"they write directly to the shared SQLite DB and work regardless of terminal "
"backend (local/docker/modal/ssh).\n"
"\n"
"## Lifecycle\n"
"\n"
"1. **Orient.** Call `kanban_show()` first (no args — it defaults to your "
"task). The response includes title, body, parent-task handoffs (summary + "
"metadata), any prior attempts on this task if you're a retry, the full "
"comment thread, and a pre-formatted `worker_context` you can treat as "
"ground truth.\n"
"2. **Work inside the workspace.** `cd $HERMES_KANBAN_WORKSPACE` before "
"any file operations. The workspace is yours for this run. Don't modify "
"files outside it unless the task explicitly asks.\n"
"3. **Heartbeat on long operations.** Call `kanban_heartbeat(note=...)` "
"every few minutes during long subprocesses (training, encoding, crawling). "
"Skip heartbeats for short tasks.\n"
"4. **Block on genuine ambiguity.** If you need a human decision you cannot "
"infer (missing credentials, UX choice, paywalled source, peer output you "
"need first), call `kanban_block(reason=\"...\")` and stop. Don't guess. "
"The user will unblock with context and the dispatcher will respawn you.\n"
"5. **Complete with structured handoff.** Call `kanban_complete(summary=..., "
"metadata=...)`. `summary` is 13 human-readable sentences naming concrete "
"artifacts. `metadata` is machine-readable facts "
"(`{changed_files: [...], tests_run: N, decisions: [...]}`). Downstream "
"workers read both via their own `kanban_show`. Never put secrets / "
"tokens / raw PII in either field — run rows are durable forever.\n"
"6. **If follow-up work appears, create it; don't do it.** Use "
"`kanban_create(title=..., assignee=<right-profile>, parents=[your-task-id])` "
"to spawn a child task for the appropriate specialist profile instead of "
"scope-creeping into the next thing.\n"
"\n"
"## Orchestrator mode\n"
"\n"
"If your task is itself a decomposition task (e.g. a planner profile given "
"a high-level goal), use `kanban_create` to fan out into child tasks — one "
"per specialist, each with an explicit `assignee` and `parents=[...]` to "
"express dependencies. Then `kanban_complete` your own task with a summary "
"of the decomposition. Do NOT execute the work yourself; your job is "
"routing, not implementation.\n"
"\n"
"## Do NOT\n"
"\n"
"- Do not shell out to `hermes kanban <verb>` for board operations. Use "
"the `kanban_*` tools — they work across all terminal backends.\n"
"- Do not complete a task you didn't actually finish. Block it.\n"
"- Do not assign follow-up work to yourself. Assign it to the right "
"specialist profile.\n"
"- Do not call `delegate_task` as a board substitute. `delegate_task` is "
"for short reasoning subtasks inside your own run; board tasks are for "
"cross-agent handoffs that outlive one API loop."
)
TOOL_USE_ENFORCEMENT_GUIDANCE = (
"# Tool-use enforcement\n"
"You MUST use your tools to take action — do not describe what you would do "
@@ -422,29 +480,6 @@ PLATFORM_HINTS = {
"your response. Images are sent as native photos, and other files arrive as downloadable "
"documents."
),
"yuanbao": (
"You are on Yuanbao (腾讯元宝), a Chinese AI assistant platform. "
"Markdown formatting is supported (code blocks, tables, bold/italic). "
"You CAN send media files natively — to deliver a file to the user, include "
"MEDIA:/absolute/path/to/file in your response. The file will be sent as a native "
"Yuanbao attachment: images (.jpg, .png, .webp, .gif) are sent as photos, "
"and other files (.pdf, .docx, .txt, .zip, etc.) arrive as downloadable documents "
"(max 50 MB). You can also include image URLs in markdown format ![alt](url) and "
"they will be downloaded and sent as native photos. "
"Do NOT tell the user you lack file-sending capability — use MEDIA: syntax "
"whenever a file delivery is appropriate.\n\n"
"Stickers (贴纸 / 表情包 / TIM face): Yuanbao has a built-in sticker catalogue. "
"When the user sends a sticker (you see '[emoji: 名称]' in their message) or asks "
"you to send/reply-with a 贴纸/表情/表情包, you MUST use the sticker tools:\n"
" 1. Call yb_search_sticker with a Chinese keyword (e.g. '666', '比心', '吃瓜', "
" '捂脸', '合十') to discover matching sticker_ids.\n"
" 2. Call yb_send_sticker with the chosen sticker_id or name — this sends a real "
" TIMFaceElem that renders as a native sticker in the chat.\n"
"DO NOT draw sticker-like PNGs with execute_code/Pillow/matplotlib and then send "
"them via MEDIA: or send_image_file. That produces a fake low-quality 'sticker' "
"image and is the WRONG path. Bare Unicode emoji in text is also not a substitute "
"— when a sticker is the right response, use yb_send_sticker."
),
}
# ---------------------------------------------------------------------------
@@ -848,11 +883,6 @@ def build_skills_system_prompt(
"Skills also encode the user's preferred approach, conventions, and quality standards "
"for tasks like code review, planning, and testing — load them even for tasks you "
"already know how to do, because the skill defines how it should be done here.\n"
"Whenever the user asks you to configure, set up, install, enable, disable, modify, "
"or troubleshoot Hermes Agent itself — its CLI, config, models, providers, tools, "
"skills, voice, gateway, plugins, or any feature — load the `hermes-agent` skill "
"first. It has the actual commands (e.g. `hermes config set …`, `hermes tools`, "
"`hermes setup`) so you don't have to guess or invent workarounds.\n"
"If a skill has issues, fix it with skill_manage(action='patch').\n"
"After difficult/iterative tasks, offer to save as a skill. "
"If a skill you loaded was missing steps, had wrong commands, or needed "
+1 -5
View File
@@ -754,11 +754,7 @@ def _resolve_effective_accept(
if env in ("1", "true", "yes", "on"):
return True
cfg_val = cfg.get("hooks_auto_accept", False)
if isinstance(cfg_val, bool):
return cfg_val
if isinstance(cfg_val, str):
return cfg_val.strip().lower() in ("1", "true", "yes", "on")
return False
return bool(cfg_val)
# ---------------------------------------------------------------------------
+2 -2
View File
@@ -329,7 +329,7 @@ def build_skill_invocation_message(
loaded_skill, skill_dir, skill_name = loaded
activation_note = (
f'[IMPORTANT: The user has invoked the "{skill_name}" skill, indicating they want '
f'[SYSTEM: The user has invoked the "{skill_name}" skill, indicating they want '
"you to follow its instructions. The full skill content is loaded below.]"
)
return _build_skill_message(
@@ -368,7 +368,7 @@ def build_preloaded_skills_prompt(
loaded_skill, skill_dir, skill_name = loaded
activation_note = (
f'[IMPORTANT: The user launched this CLI session with the "{skill_name}" skill '
f'[SYSTEM: The user launched this CLI session with the "{skill_name}" skill '
"preloaded. Treat its instructions as active guidance for the duration of this "
"session unless the user overrides them.]"
)
+1 -6
View File
@@ -606,7 +606,6 @@ platform_toolsets:
signal: [hermes-signal]
homeassistant: [hermes-homeassistant]
qqbot: [hermes-qqbot]
yuanbao: [hermes-yuanbao]
# =============================================================================
# Gateway Platform Settings
@@ -848,12 +847,8 @@ display:
# What Enter does when Hermes is already busy (CLI and gateway platforms).
# interrupt: Interrupt the current run and redirect Hermes (default)
# queue: Queue your message for the next turn
# steer: Inject your message mid-run via /steer, arriving at the agent
# after the next tool call — no interrupt, no role violation.
# Falls back to 'queue' if the agent isn't running yet or if
# images are attached (steer only carries text).
# Ctrl+C (or /stop in gateway) always interrupts regardless of this setting.
# Toggle at runtime with /busy <interrupt|queue|steer>.
# Toggle at runtime with /busy_input_mode <interrupt|queue>.
busy_input_mode: interrupt
# Background process notifications (gateway/messaging only).
+46 -151
View File
@@ -974,7 +974,6 @@ def _run_state_db_auto_maintenance(session_db) -> None:
return
try:
from hermes_cli.config import load_config as _load_full_config
from hermes_constants import get_hermes_home as _get_hermes_home
cfg = (_load_full_config().get("sessions") or {})
if not cfg.get("auto_prune", False):
return
@@ -982,35 +981,11 @@ def _run_state_db_auto_maintenance(session_db) -> None:
retention_days=int(cfg.get("retention_days", 90)),
min_interval_hours=int(cfg.get("min_interval_hours", 24)),
vacuum=bool(cfg.get("vacuum_after_prune", True)),
sessions_dir=_get_hermes_home() / "sessions",
)
except Exception as exc:
logger.debug("state.db auto-maintenance skipped: %s", exc)
def _run_checkpoint_auto_maintenance() -> None:
"""Call ``checkpoint_manager.maybe_auto_prune_checkpoints`` using current config.
Reads the ``checkpoints:`` section from config.yaml via
:func:`hermes_cli.config.load_config`. Honours ``auto_prune`` /
``retention_days`` / ``delete_orphans`` / ``min_interval_hours``.
Never raises maintenance must never block interactive startup.
"""
try:
from hermes_cli.config import load_config as _load_full_config
cfg = (_load_full_config().get("checkpoints") or {})
if not cfg.get("auto_prune", False):
return
from tools.checkpoint_manager import maybe_auto_prune_checkpoints
maybe_auto_prune_checkpoints(
retention_days=int(cfg.get("retention_days", 7)),
min_interval_hours=int(cfg.get("min_interval_hours", 24)),
delete_orphans=bool(cfg.get("delete_orphans", True)),
)
except Exception as exc:
logger.debug("checkpoint auto-maintenance skipped: %s", exc)
def _prune_stale_worktrees(repo_root: str, max_age_hours: int = 24) -> None:
"""Remove stale worktrees and orphaned branches on startup.
@@ -1403,7 +1378,7 @@ def _resolve_attachment_path(raw_path: str) -> Path | None:
def _format_process_notification(evt: dict) -> "str | None":
"""Format a process notification event into a [IMPORTANT: ...] message.
"""Format a process notification event into a [SYSTEM: ...] message.
Handles both completion events (notify_on_complete) and watch pattern
match events from the unified completion_queue.
@@ -1413,14 +1388,14 @@ def _format_process_notification(evt: dict) -> "str | None":
_cmd = evt.get("command", "unknown")
if evt_type == "watch_disabled":
return f"[IMPORTANT: {evt.get('message', '')}]"
return f"[SYSTEM: {evt.get('message', '')}]"
if evt_type == "watch_match":
_pat = evt.get("pattern", "?")
_out = evt.get("output", "")
_sup = evt.get("suppressed", 0)
text = (
f"[IMPORTANT: Background process {_sid} matched "
f"[SYSTEM: Background process {_sid} matched "
f"watch pattern \"{_pat}\".\n"
f"Command: {_cmd}\n"
f"Matched output:\n{_out}"
@@ -1434,7 +1409,7 @@ def _format_process_notification(evt: dict) -> "str | None":
_exit = evt.get("exit_code", "?")
_out = evt.get("output", "")
return (
f"[IMPORTANT: Background process {_sid} completed "
f"[SYSTEM: Background process {_sid} completed "
f"(exit code {_exit}).\n"
f"Command: {_cmd}\n"
f"Output:\n{_out}]"
@@ -1873,16 +1848,9 @@ class HermesCLI:
self.bell_on_complete = CLI_CONFIG["display"].get("bell_on_complete", False)
# show_reasoning: display model thinking/reasoning before the response
self.show_reasoning = CLI_CONFIG["display"].get("show_reasoning", False)
# busy_input_mode: "interrupt" (Enter interrupts current run),
# "queue" (Enter queues for next turn), or "steer" (Enter injects
# mid-run via /steer, arriving after the next tool call).
_bim = str(CLI_CONFIG["display"].get("busy_input_mode", "interrupt")).strip().lower()
if _bim == "queue":
self.busy_input_mode = "queue"
elif _bim == "steer":
self.busy_input_mode = "steer"
else:
self.busy_input_mode = "interrupt"
# busy_input_mode: "interrupt" (Enter interrupts current run) or "queue" (Enter queues for next turn)
_bim = CLI_CONFIG["display"].get("busy_input_mode", "interrupt")
self.busy_input_mode = "queue" if str(_bim).strip().lower() == "queue" else "interrupt"
self.verbose = verbose if verbose is not None else (self.tool_progress_mode == "verbose")
@@ -2077,11 +2045,6 @@ class HermesCLI:
# Never blocks startup on failure.
_run_state_db_auto_maintenance(self._session_db)
# Opportunistic shadow-repo cleanup — deletes orphan/stale
# checkpoint repos under ~/.hermes/checkpoints/. Opt-in via
# checkpoints.auto_prune, idempotent via .last_prune marker.
_run_checkpoint_auto_maintenance()
# Deferred title: stored in memory until the session is created in the DB
self._pending_title: Optional[str] = None
@@ -4952,12 +4915,6 @@ class HermesCLI:
if self.agent:
self.agent.session_id = new_session_id
self.agent.session_start = now
# Redirect the JSON session log to the new branch session file so
# messages written after branching land in the correct file.
if hasattr(self.agent, "session_log_file") and hasattr(self.agent, "logs_dir"):
self.agent.session_log_file = (
self.agent.logs_dir / f"session_{new_session_id}.json"
)
self.agent.reset_session_state()
if hasattr(self.agent, "_last_flushed_db_idx"):
self.agent._last_flushed_db_idx = len(self.conversation_history)
@@ -4979,37 +4936,22 @@ class HermesCLI:
_cprint(f" Branch session: {new_session_id}")
def save_conversation(self):
"""Save the current conversation to a JSON snapshot under ~/.hermes/sessions/saved/.
The snapshot is a convenience export for sharing or off-line inspection;
every message is already persisted incrementally to the SQLite session
DB, so the live session remains resumable via ``hermes --resume <id>``
regardless of whether the user ever runs ``/save``.
"""
"""Save the current conversation to a file."""
if not self.conversation_history:
print("(;_;) No conversation to save.")
return
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
saved_dir = get_hermes_home() / "sessions" / "saved"
filename = f"hermes_conversation_{timestamp}.json"
try:
saved_dir.mkdir(parents=True, exist_ok=True)
except Exception as e:
print(f"(x_x) Failed to create save directory {saved_dir}: {e}")
return
path = saved_dir / f"hermes_conversation_{timestamp}.json"
try:
with open(path, "w", encoding="utf-8") as f:
with open(filename, "w", encoding="utf-8") as f:
json.dump({
"model": self.model,
"session_id": self.session_id,
"session_start": self.session_start.isoformat(),
"messages": self.conversation_history,
}, f, indent=2, ensure_ascii=False)
print(f"(^_^)v Conversation snapshot saved to: {path}")
if self.session_id:
print(f" Resume the live session with: hermes --resume {self.session_id}")
print(f"(^_^)v Conversation saved to: {filename}")
except Exception as e:
print(f"(x_x) Failed to save: {e}")
@@ -5876,7 +5818,28 @@ class HermesCLI:
print(f"(._.) Unknown cron command: {subcommand}")
print(" Available: list, add, edit, pause, resume, run, remove")
def _handle_kanban_command(self, cmd: str):
"""Handle the /kanban command — delegate to the shared kanban CLI.
The string form passed here is the user's full ``/kanban ...``
including the leading slash; we strip it and hand the remainder
to ``kanban.run_slash`` which returns a single formatted string.
"""
from hermes_cli.kanban import run_slash
rest = cmd.strip()
if rest.startswith("/"):
rest = rest.lstrip("/")
if rest.startswith("kanban"):
rest = rest[len("kanban"):].lstrip()
try:
output = run_slash(rest)
except Exception as exc: # pragma: no cover - defensive
output = f"(._.) kanban error: {exc}"
if output:
print(output)
def _handle_skills_command(self, cmd: str):
"""Handle /skills slash command — delegates to hermes_cli.skills_hub."""
from hermes_cli.skills_hub import handle_skills_slash
@@ -6113,6 +6076,8 @@ class HermesCLI:
self.save_conversation()
elif canonical == "cron":
self._handle_cron_command(cmd_original)
elif canonical == "kanban":
self._handle_kanban_command(cmd_original)
elif canonical == "skills":
with self._busy_command(self._slow_command_status(cmd_original)):
self._handle_skills_command(cmd_original)
@@ -6365,12 +6330,6 @@ class HermesCLI:
turn_route = self._resolve_turn_agent_config(prompt)
def run_background():
set_sudo_password_callback(self._sudo_password_callback)
set_approval_callback(self._approval_callback)
try:
set_secret_capture_callback(self._secret_capture_callback)
except Exception:
pass
try:
bg_agent = AIAgent(
model=turn_route["model"],
@@ -6468,12 +6427,6 @@ class HermesCLI:
print()
_cprint(f" ❌ Background task #{task_num} failed: {e}")
finally:
try:
set_sudo_password_callback(None)
set_approval_callback(None)
set_secret_capture_callback(None)
except Exception:
pass
self._background_tasks.pop(task_id, None)
# Clear spinner only if no foreground agent owns it
if not self._agent_running:
@@ -6868,36 +6821,24 @@ class HermesCLI:
/busy Show current busy input mode
/busy status Show current busy input mode
/busy queue Queue input for the next turn instead of interrupting
/busy steer Inject Enter mid-run via /steer (after next tool call)
/busy interrupt Interrupt the current run on Enter (default)
"""
parts = cmd.strip().split(maxsplit=1)
if len(parts) < 2 or parts[1].strip().lower() == "status":
_cprint(f" {_ACCENT}Busy input mode: {self.busy_input_mode}{_RST}")
if self.busy_input_mode == "queue":
_behavior = "queues for next turn"
elif self.busy_input_mode == "steer":
_behavior = "steers into current run (after next tool call)"
else:
_behavior = "interrupts current run"
_cprint(f" {_DIM}Enter while busy: {_behavior}{_RST}")
_cprint(f" {_DIM}Usage: /busy [queue|steer|interrupt|status]{_RST}")
_cprint(f" {_DIM}Enter while busy: {'queues for next turn' if self.busy_input_mode == 'queue' else 'interrupts current run'}{_RST}")
_cprint(f" {_DIM}Usage: /busy [queue|interrupt|status]{_RST}")
return
arg = parts[1].strip().lower()
if arg not in {"queue", "interrupt", "steer"}:
if arg not in {"queue", "interrupt"}:
_cprint(f" {_DIM}(._.) Unknown argument: {arg}{_RST}")
_cprint(f" {_DIM}Usage: /busy [queue|steer|interrupt|status]{_RST}")
_cprint(f" {_DIM}Usage: /busy [queue|interrupt|status]{_RST}")
return
self.busy_input_mode = arg
if save_config_value("display.busy_input_mode", arg):
if arg == "queue":
behavior = "Enter will queue follow-up input while Hermes is busy."
elif arg == "steer":
behavior = "Enter will steer your message into the current run (after the next tool call)."
else:
behavior = "Enter will interrupt the current run while Hermes is busy."
behavior = "Enter will queue follow-up input while Hermes is busy." if arg == "queue" else "Enter will interrupt the current run while Hermes is busy."
_cprint(f" {_ACCENT}✓ Busy input mode set to '{arg}' (saved to config){_RST}")
_cprint(f" {_DIM}{behavior}{_RST}")
else:
@@ -7299,7 +7240,7 @@ class HermesCLI:
change_detail = ". ".join(change_parts) + ". " if change_parts else ""
self.conversation_history.append({
"role": "user",
"content": f"[IMPORTANT: MCP servers have been reloaded. {change_detail}{tool_summary}. The tool list for this conversation has been updated accordingly.]",
"content": f"[SYSTEM: MCP servers have been reloaded. {change_detail}{tool_summary}. The tool list for this conversation has been updated accordingly.]",
})
# Persist session immediately so the session log reflects the
@@ -9073,30 +9014,6 @@ class HermesCLI:
_welcome_text = "Welcome to Hermes Agent! Type your message or /help for commands."
_welcome_color = "#FFF8DC"
self._console_print(f"[{_welcome_color}]{_welcome_text}[/]")
# First-time OpenClaw-residue banner — fires once if ~/.openclaw/ exists
# after an OpenClaw→Hermes migration (especially migrations done by
# OpenClaw's own tool, which doesn't archive the source directory).
try:
from agent.onboarding import (
OPENCLAW_RESIDUE_FLAG,
detect_openclaw_residue,
is_seen,
mark_seen,
openclaw_residue_hint_cli,
)
if not is_seen(self.config, OPENCLAW_RESIDUE_FLAG) and detect_openclaw_residue():
try:
_resid_color = _welcome_skin.get_color("banner_dim", "#B8860B")
except Exception:
_resid_color = "#B8860B"
self._console_print(f"[{_resid_color}]{openclaw_residue_hint_cli()}[/]")
try:
from hermes_cli.config import get_config_path as _get_cfg_path_resid
mark_seen(_get_cfg_path_resid(), OPENCLAW_RESIDUE_FLAG)
except Exception:
pass # best-effort — banner will fire again next session
except Exception:
pass # banner is non-critical — never break startup
# Show a random tip to help users discover features
try:
from hermes_cli.tips import get_random_tip
@@ -9298,34 +9215,12 @@ class HermesCLI:
# Bundle text + images as a tuple when images are present
payload = (text, images) if images else text
if self._agent_running and not (text and _looks_like_slash_command(text)):
_effective_mode = self.busy_input_mode
if _effective_mode == "steer":
# Route Enter through /steer — inject mid-run after the
# next tool call. Images can't ride along (steer only
# appends text), so fall back to queue when images are
# attached. If the agent lacks steer() or rejects the
# payload, also fall back to queue so nothing is lost.
if images or not text:
_effective_mode = "queue"
else:
accepted = False
try:
if self.agent is not None and hasattr(self.agent, "steer"):
accepted = bool(self.agent.steer(text))
except Exception as exc:
_cprint(f" {_DIM}Steer failed ({exc}) — queued for next turn.{_RST}")
accepted = False
if accepted:
preview = text[:80] + ("..." if len(text) > 80 else "")
_cprint(f" {_ACCENT}⏩ Steered: '{preview}'{_RST}")
else:
_effective_mode = "queue"
if _effective_mode == "queue":
if self.busy_input_mode == "queue":
# Queue for the next turn instead of interrupting
self._pending_input.put(payload)
preview = text if text else f"[{len(images)} image{'s' if len(images) != 1 else ''} attached]"
_cprint(f" Queued for the next turn: {preview[:80]}{'...' if len(preview) > 80 else ''}")
elif _effective_mode == "interrupt":
else:
self._interrupt_queue.put(payload)
# Debug: log to file when message enters interrupt queue
try:
@@ -9969,7 +9864,7 @@ class HermesCLI:
status = cli_ref._command_status or "Processing command..."
return f"{frame} {status}"
if cli_ref._agent_running:
return "msg=interrupt · /queue · /bg · /steer · Ctrl+C cancel"
return "type a message + Enter to interrupt, Ctrl+C to cancel"
if cli_ref._voice_mode:
return "type or Ctrl+B to record"
return ""
+4 -16
View File
@@ -77,7 +77,7 @@ _KNOWN_DELIVERY_PLATFORMS = frozenset({
"telegram", "discord", "slack", "whatsapp", "signal",
"matrix", "mattermost", "homeassistant", "dingtalk", "feishu",
"wecom", "wecom_callback", "weixin", "sms", "email", "webhook", "bluebubbles",
"qqbot", "yuanbao",
"qqbot",
})
# Platforms that support a configured cron/notification home target, mapped to
@@ -337,7 +337,6 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
"sms": Platform.SMS,
"bluebubbles": Platform.BLUEBUBBLES,
"qqbot": Platform.QQBOT,
"yuanbao": Platform.YUANBAO,
}
# Optionally wrap the content with a header/footer so the user knows this
@@ -716,7 +715,7 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
# Always prepend cron execution guidance so the agent knows how
# delivery works and can suppress delivery when appropriate.
cron_hint = (
"[IMPORTANT: You are running as a scheduled cron job. "
"[SYSTEM: You are running as a scheduled cron job. "
"DELIVERY: Your final response will be automatically delivered "
"to the user — do NOT use send_message or try to deliver "
"the output yourself. Just produce your report/output as your "
@@ -752,7 +751,7 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
parts.append("")
parts.extend(
[
f'[IMPORTANT: The user has invoked the "{skill_name}" skill, indicating they want you to follow its instructions. The full skill content is loaded below.]',
f'[SYSTEM: The user has invoked the "{skill_name}" skill, indicating they want you to follow its instructions. The full skill content is loaded below.]',
"",
content,
]
@@ -760,7 +759,7 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
if skipped:
notice = (
f"[IMPORTANT: The following skill(s) were listed for this job but could not be found "
f"[SYSTEM: The following skill(s) were listed for this job but could not be found "
f"and were skipped: {', '.join(skipped)}. "
f"Start your response with a brief notice so the user is aware, e.g.: "
f"'⚠️ Skill(s) not found and skipped: {', '.join(skipped)}']"
@@ -1309,17 +1308,6 @@ def tick(verbose: bool = True, adapters=None, loop=None) -> int:
_futures.append(_tick_pool.submit(_ctx.run, _process_job, job))
_results.extend(f.result() for f in _futures)
# Best-effort sweep of MCP stdio subprocesses that survived their
# session teardown during this tick. Runs AFTER every job has
# finished so active sessions (including live user chats) are
# never touched — only PIDs explicitly detected as orphans in
# tools.mcp_tool._run_stdio's finally block are reaped.
try:
from tools.mcp_tool import _kill_orphaned_mcp_children
_kill_orphaned_mcp_children()
except Exception as _e:
logger.debug("Post-tick MCP orphan cleanup failed: %s", _e)
return sum(_results)
finally:
if fcntl:
Binary file not shown.
+14 -67
View File
@@ -57,7 +57,7 @@ def _session_entry_name(origin: Dict[str, Any]) -> str:
# Build / refresh
# ---------------------------------------------------------------------------
async def build_channel_directory(adapters: Dict[Any, Any]) -> Dict[str, Any]:
def build_channel_directory(adapters: Dict[Any, Any]) -> Dict[str, Any]:
"""
Build a channel directory from connected platform adapters and session data.
@@ -72,7 +72,7 @@ async def build_channel_directory(adapters: Dict[Any, Any]) -> Dict[str, Any]:
if platform == Platform.DISCORD:
platforms["discord"] = _build_discord(adapter)
elif platform == Platform.SLACK:
platforms["slack"] = await _build_slack(adapter)
platforms["slack"] = _build_slack(adapter)
except Exception as e:
logger.warning("Channel directory: failed to build %s: %s", platform.value, e)
@@ -136,66 +136,21 @@ def _build_discord(adapter) -> List[Dict[str, str]]:
return channels
async def _build_slack(adapter) -> List[Dict[str, Any]]:
"""List Slack channels the bot has joined across all workspaces.
Uses ``users.conversations`` against each workspace's web client. Pulls
public + private channels the bot is a member of, then merges in DMs
discovered from session history (IMs aren't useful to enumerate
proactively).
"""
team_clients = getattr(adapter, "_team_clients", None) or {}
if not team_clients:
def _build_slack(adapter) -> List[Dict[str, str]]:
"""List Slack channels the bot has joined."""
# Slack adapter may expose a web client
client = getattr(adapter, "_app", None) or getattr(adapter, "_client", None)
if not client:
return _build_from_sessions("slack")
channels: List[Dict[str, Any]] = []
seen_ids: set = set()
try:
from tools.send_message_tool import _send_slack # noqa: F401
# Use the Slack Web API directly if available
except Exception:
pass
for team_id, client in team_clients.items():
try:
cursor: Optional[str] = None
for _page in range(20): # safety cap on pagination
response = await client.users_conversations(
types="public_channel,private_channel",
exclude_archived=True,
limit=200,
cursor=cursor,
)
if not response.get("ok"):
logger.warning(
"Channel directory: users.conversations not ok for team %s: %s",
team_id,
response.get("error", "unknown"),
)
break
for ch in response.get("channels", []):
cid = ch.get("id")
name = ch.get("name")
if not cid or not name or cid in seen_ids:
continue
seen_ids.add(cid)
channels.append({
"id": cid,
"name": name,
"type": "private" if ch.get("is_private") else "channel",
})
cursor = (response.get("response_metadata") or {}).get("next_cursor")
if not cursor:
break
except Exception as e:
logger.warning(
"Channel directory: failed to list Slack channels for team %s: %s",
team_id, e,
)
continue
# Merge in DM/group entries discovered from session history.
for entry in _build_from_sessions("slack"):
if entry.get("id") not in seen_ids:
channels.append(entry)
seen_ids.add(entry.get("id"))
return channels
# Fallback to session data
return _build_from_sessions("slack")
def _build_from_sessions(platform_name: str) -> List[Dict[str, str]]:
@@ -268,14 +223,6 @@ def resolve_channel_name(platform_name: str, name: str) -> Optional[str]:
if not channels:
return None
# 0. Exact ID match — case-sensitive, no normalization. Lets callers pass
# raw platform IDs (e.g. Slack "C0B0QV5434G") even when the format guard
# in _parse_target_ref hasn't recognized them as explicit.
raw = name.strip()
for ch in channels:
if ch.get("id") == raw:
return ch["id"]
query = _normalize_channel_query(name)
# 1. Exact name match, including the display labels shown by send_message(action="list")
+2 -68
View File
@@ -67,7 +67,6 @@ class Platform(Enum):
WEIXIN = "weixin"
BLUEBUBBLES = "bluebubbles"
QQBOT = "qqbot"
YUANBAO = "yuanbao"
@dataclass
@@ -196,14 +195,6 @@ class StreamingConfig:
edit_interval: float = 1.0 # Seconds between message edits (Telegram rate-limits at ~1/s)
buffer_threshold: int = 40 # Chars before forcing an edit
cursor: str = "" # Cursor shown during streaming
# Ported from openclaw/openclaw#72038. When >0, the final edit for
# a long-running streamed response is delivered as a fresh message
# if the original preview has been visible for at least this many
# seconds, so the platform's visible timestamp reflects completion
# time instead of the preview creation time. Currently applied to
# Telegram only (other platforms ignore the setting). Default 60s
# matches the OpenClaw rollout. Set to 0 to disable.
fresh_final_after_seconds: float = 60.0
def to_dict(self) -> Dict[str, Any]:
return {
@@ -212,7 +203,6 @@ class StreamingConfig:
"edit_interval": self.edit_interval,
"buffer_threshold": self.buffer_threshold,
"cursor": self.cursor,
"fresh_final_after_seconds": self.fresh_final_after_seconds,
}
@classmethod
@@ -225,9 +215,6 @@ class StreamingConfig:
edit_interval=float(data.get("edit_interval", 1.0)),
buffer_threshold=int(data.get("buffer_threshold", 40)),
cursor=data.get("cursor", ""),
fresh_final_after_seconds=float(
data.get("fresh_final_after_seconds", 60.0)
),
)
@@ -327,9 +314,6 @@ class GatewayConfig:
# QQBot uses extra dict for app credentials
elif platform == Platform.QQBOT and config.extra.get("app_id") and config.extra.get("client_secret"):
connected.append(platform)
# Yuanbao uses extra dict for app credentials
elif platform == Platform.YUANBAO and config.extra.get("app_id") and config.extra.get("app_secret"):
connected.append(platform)
# DingTalk uses client_id/client_secret from config.extra or env vars
elif platform == Platform.DINGTALK and (
config.extra.get("client_id") or os.getenv("DINGTALK_CLIENT_ID")
@@ -586,8 +570,6 @@ def load_gateway_config() -> GatewayConfig:
)
if "reply_prefix" in platform_cfg:
bridged["reply_prefix"] = platform_cfg["reply_prefix"]
if "reply_in_thread" in platform_cfg:
bridged["reply_in_thread"] = platform_cfg["reply_in_thread"]
if "require_mention" in platform_cfg:
bridged["require_mention"] = platform_cfg["require_mention"]
if "free_response_channels" in platform_cfg:
@@ -602,7 +584,7 @@ def load_gateway_config() -> GatewayConfig:
bridged["group_policy"] = platform_cfg["group_policy"]
if "group_allow_from" in platform_cfg:
bridged["group_allow_from"] = platform_cfg["group_allow_from"]
if plat in (Platform.DISCORD, Platform.SLACK) and "channel_skill_bindings" in platform_cfg:
if plat == Platform.DISCORD and "channel_skill_bindings" in platform_cfg:
bridged["channel_skill_bindings"] = platform_cfg["channel_skill_bindings"]
if "channel_prompts" in platform_cfg:
channel_prompts = platform_cfg["channel_prompts"]
@@ -627,8 +609,6 @@ def load_gateway_config() -> GatewayConfig:
if isinstance(slack_cfg, dict):
if "require_mention" in slack_cfg and not os.getenv("SLACK_REQUIRE_MENTION"):
os.environ["SLACK_REQUIRE_MENTION"] = str(slack_cfg["require_mention"]).lower()
if "strict_mention" in slack_cfg and not os.getenv("SLACK_STRICT_MENTION"):
os.environ["SLACK_STRICT_MENTION"] = str(slack_cfg["strict_mention"]).lower()
if "allow_bots" in slack_cfg and not os.getenv("SLACK_ALLOW_BOTS"):
os.environ["SLACK_ALLOW_BOTS"] = str(slack_cfg["allow_bots"]).lower()
frc = slack_cfg.get("free_response_channels")
@@ -938,12 +918,8 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
slack_token = os.getenv("SLACK_BOT_TOKEN")
if slack_token:
if Platform.SLACK not in config.platforms:
# No yaml config for Slack — env-only setup, enable it
config.platforms[Platform.SLACK] = PlatformConfig()
config.platforms[Platform.SLACK].enabled = True
# If yaml config exists, respect its enabled flag (don't override
# explicit enabled: false). Token is still stored so skills that
# send Slack messages can use it without activating the gateway adapter.
config.platforms[Platform.SLACK].enabled = True
config.platforms[Platform.SLACK].token = slack_token
slack_home = os.getenv("SLACK_HOME_CHANNEL")
if slack_home and Platform.SLACK in config.platforms:
@@ -1300,48 +1276,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
name=os.getenv("QQBOT_HOME_CHANNEL_NAME") or os.getenv(qq_home_name_env, "Home"),
)
# Yuanbao — YUANBAO_APP_ID preferred
yuanbao_app_id = os.getenv("YUANBAO_APP_ID") or os.getenv("YUANBAO_APP_KEY")
yuanbao_app_secret = os.getenv("YUANBAO_APP_SECRET")
if yuanbao_app_id and yuanbao_app_secret:
if Platform.YUANBAO not in config.platforms:
config.platforms[Platform.YUANBAO] = PlatformConfig()
config.platforms[Platform.YUANBAO].enabled = True
extra = config.platforms[Platform.YUANBAO].extra
extra["app_id"] = yuanbao_app_id
extra["app_secret"] = yuanbao_app_secret
yuanbao_bot_id = os.getenv("YUANBAO_BOT_ID")
if yuanbao_bot_id:
extra["bot_id"] = yuanbao_bot_id
yuanbao_ws_url = os.getenv("YUANBAO_WS_URL")
if yuanbao_ws_url:
extra["ws_url"] = yuanbao_ws_url
yuanbao_api_domain = os.getenv("YUANBAO_API_DOMAIN")
if yuanbao_api_domain:
extra["api_domain"] = yuanbao_api_domain
yuanbao_route_env = os.getenv("YUANBAO_ROUTE_ENV")
if yuanbao_route_env:
extra["route_env"] = yuanbao_route_env
yuanbao_home = os.getenv("YUANBAO_HOME_CHANNEL")
if yuanbao_home:
config.platforms[Platform.YUANBAO].home_channel = HomeChannel(
platform=Platform.YUANBAO,
chat_id=yuanbao_home,
name=os.getenv("YUANBAO_HOME_CHANNEL_NAME", "Home"),
)
yuanbao_dm_policy = os.getenv("YUANBAO_DM_POLICY")
if yuanbao_dm_policy:
extra["dm_policy"] = yuanbao_dm_policy.strip().lower()
yuanbao_dm_allow_from = os.getenv("YUANBAO_DM_ALLOW_FROM")
if yuanbao_dm_allow_from:
extra["dm_allow_from"] = yuanbao_dm_allow_from
yuanbao_group_policy = os.getenv("YUANBAO_GROUP_POLICY")
if yuanbao_group_policy:
extra["group_policy"] = yuanbao_group_policy.strip().lower()
yuanbao_group_allow_from = os.getenv("YUANBAO_GROUP_ALLOW_FROM")
if yuanbao_group_allow_from:
extra["group_allow_from"] = yuanbao_group_allow_from
# Session settings
idle_minutes = os.getenv("SESSION_IDLE_MINUTES")
if idle_minutes:
+1 -3
View File
@@ -79,9 +79,7 @@ _PLATFORM_DEFAULTS: dict[str, dict[str, Any]] = {
"discord": _TIER_HIGH,
# Tier 2 — edit support, often customer/workspace channels
# Slack: tool_progress off by default — Bolt posts cannot be edited like CLI;
# "new"/"all" spam permanent lines in channels (hermes-agent#14663).
"slack": {**_TIER_MEDIUM, "tool_progress": "off"},
"slack": _TIER_MEDIUM,
"mattermost": _TIER_MEDIUM,
"matrix": _TIER_MEDIUM,
"feishu": _TIER_MEDIUM,
+16 -3
View File
@@ -21,6 +21,7 @@ Errors in hooks are caught and logged but never block the main pipeline.
import asyncio
import importlib.util
import sys
from typing import Any, Callable, Dict, List, Optional
import yaml
@@ -103,16 +104,28 @@ class HookRegistry:
print(f"[hooks] Skipping {hook_name}: no events declared", flush=True)
continue
# Dynamically load the handler module
# Dynamically load the handler module.
# Register in sys.modules BEFORE exec_module so Pydantic /
# dataclasses / typing introspection can resolve forward
# references (triggered by `from __future__ import annotations`
# in the handler). Without this, a handler that declares a
# Pydantic BaseModel for webhook/event payloads fails at first
# dispatch with "TypeAdapter ... is not fully defined".
module_name = f"hermes_hook_{hook_name}"
spec = importlib.util.spec_from_file_location(
f"hermes_hook_{hook_name}", handler_path
module_name, handler_path
)
if spec is None or spec.loader is None:
print(f"[hooks] Skipping {hook_name}: could not load handler.py", flush=True)
continue
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
sys.modules[module_name] = module
try:
spec.loader.exec_module(module)
except Exception:
sys.modules.pop(module_name, None)
raise
handle_fn = getattr(module, "handle", None)
if handle_fn is None:
+11 -57
View File
@@ -28,7 +28,6 @@ def mirror_to_session(
message_text: str,
source_label: str = "cli",
thread_id: Optional[str] = None,
user_id: Optional[str] = None,
) -> bool:
"""
Append a delivery-mirror message to the target session's transcript.
@@ -40,20 +39,9 @@ def mirror_to_session(
All errors are caught -- this is never fatal.
"""
try:
session_id = _find_session_id(
platform,
str(chat_id),
thread_id=thread_id,
user_id=user_id,
)
session_id = _find_session_id(platform, str(chat_id), thread_id=thread_id)
if not session_id:
logger.debug(
"Mirror: no session found for %s:%s:%s:%s",
platform,
chat_id,
thread_id,
user_id,
)
logger.debug("Mirror: no session found for %s:%s:%s", platform, chat_id, thread_id)
return False
mirror_msg = {
@@ -71,33 +59,17 @@ def mirror_to_session(
return True
except Exception as e:
logger.debug(
"Mirror failed for %s:%s:%s:%s: %s",
platform,
chat_id,
thread_id,
user_id,
e,
)
logger.debug("Mirror failed for %s:%s:%s: %s", platform, chat_id, thread_id, e)
return False
def _find_session_id(
platform: str,
chat_id: str,
thread_id: Optional[str] = None,
user_id: Optional[str] = None,
) -> Optional[str]:
def _find_session_id(platform: str, chat_id: str, thread_id: Optional[str] = None) -> Optional[str]:
"""
Find the active session_id for a platform + chat_id pair.
Scans sessions.json entries and matches where origin.chat_id == chat_id
on the right platform. DM session keys don't embed the chat_id
(e.g. "agent:main:telegram:dm"), so we check the origin dict.
When *user_id* is provided, prefer exact sender matches. If multiple
same-chat candidates exist and none matches the user, return None instead
of guessing and contaminating another participant's session.
"""
if not _SESSIONS_INDEX.exists():
return None
@@ -109,7 +81,8 @@ def _find_session_id(
return None
platform_lower = platform.lower()
candidates = []
best_match = None
best_updated = ""
for _key, entry in data.items():
origin = entry.get("origin") or {}
@@ -123,31 +96,12 @@ def _find_session_id(
origin_thread_id = origin.get("thread_id")
if thread_id is not None and str(origin_thread_id or "") != str(thread_id):
continue
candidates.append(entry)
updated = entry.get("updated_at", "")
if updated > best_updated:
best_updated = updated
best_match = entry.get("session_id")
if not candidates:
return None
if user_id:
exact_user_matches = [
entry for entry in candidates
if str((entry.get("origin") or {}).get("user_id") or "") == str(user_id)
]
if exact_user_matches:
candidates = exact_user_matches
elif len(candidates) > 1:
return None
elif len(candidates) > 1:
distinct_user_ids = {
str((entry.get("origin") or {}).get("user_id") or "").strip()
for entry in candidates
if str((entry.get("origin") or {}).get("user_id") or "").strip()
}
if len(distinct_user_ids) > 1:
return None
best_entry = max(candidates, key=lambda entry: entry.get("updated_at", ""))
return best_entry.get("session_id")
return best_match
def _append_to_jsonl(session_id: str, message: dict) -> None:
-2
View File
@@ -10,12 +10,10 @@ Each adapter handles:
from .base import BasePlatformAdapter, MessageEvent, SendResult
from .qqbot import QQAdapter
from .yuanbao import YuanbaoAdapter
__all__ = [
"BasePlatformAdapter",
"MessageEvent",
"SendResult",
"QQAdapter",
"YuanbaoAdapter",
]
-117
View File
@@ -336,39 +336,6 @@ def proxy_kwargs_for_aiohttp(proxy_url: str | None) -> tuple[dict, dict]:
return {}, {"proxy": proxy_url}
def is_host_excluded_by_no_proxy(hostname: str, no_proxy_value: str | None = None) -> bool:
"""Return True when ``hostname`` matches a ``NO_PROXY`` entry.
Supports comma- or whitespace-separated entries with optional leading dots
and ``*.`` wildcards, which match both the apex domain and subdomains.
"""
raw = no_proxy_value
if raw is None:
raw = os.environ.get("NO_PROXY") or os.environ.get("no_proxy") or ""
raw = raw.strip()
if not raw:
return False
lower_hostname = hostname.lower()
for entry in re.split(r"[\s,]+", raw):
normalized = entry.strip().lower()
if not normalized:
continue
if normalized == "*":
return True
if normalized.startswith("*."):
normalized = normalized[2:]
elif normalized.startswith("."):
normalized = normalized[1:]
if lower_hostname == normalized or lower_hostname.endswith(f".{normalized}"):
return True
return False
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
@@ -726,15 +693,7 @@ SUPPORTED_DOCUMENT_TYPES = {
".pdf": "application/pdf",
".md": "text/markdown",
".txt": "text/plain",
".csv": "text/csv",
".log": "text/plain",
".json": "application/json",
".xml": "application/xml",
".yaml": "application/yaml",
".yml": "application/yaml",
".toml": "application/toml",
".ini": "text/plain",
".cfg": "text/plain",
".zip": "application/zip",
".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
@@ -1023,61 +982,6 @@ def resolve_channel_prompt(
return None
def resolve_channel_skills(
config_extra: dict,
channel_id: str,
parent_id: str | None = None,
) -> list[str] | None:
"""Resolve auto-loaded skill(s) for a channel/thread from platform config.
Looks up ``channel_skill_bindings`` in the adapter's ``config.extra`` dict.
Config format::
channel_skill_bindings:
- id: "C0123" # Slack channel ID or Discord channel/forum ID
skills: ["skill-a", "skill-b"]
- id: "D0ABCDE"
skill: "solo-skill" # single string also accepted
Prefers an exact match on *channel_id*; falls back to *parent_id*
(useful for forum threads / Slack threads inheriting the parent channel's
binding).
Returns a deduplicated list of skill names (order preserved), or None if
no match is found.
"""
bindings = config_extra.get("channel_skill_bindings") or []
if not isinstance(bindings, list) or not bindings:
return None
ids_to_check: set[str] = set()
if channel_id:
ids_to_check.add(str(channel_id))
if parent_id:
ids_to_check.add(str(parent_id))
if not ids_to_check:
return None
for entry in bindings:
if not isinstance(entry, dict):
continue
entry_id = str(entry.get("id", ""))
if entry_id in ids_to_check:
skills = entry.get("skills") or entry.get("skill")
if isinstance(skills, str):
s = skills.strip()
return [s] if s else None
if isinstance(skills, list) and skills:
seen: list[str] = []
for name in skills:
if not isinstance(name, str):
continue
nm = name.strip()
if nm and nm not in seen:
seen.append(nm)
return seen or None
return None
class BasePlatformAdapter(ABC):
"""
Base class for platform adapters.
@@ -1354,27 +1258,6 @@ class BasePlatformAdapter(ABC):
"""
return SendResult(success=False, error="Not supported")
async def delete_message(
self,
chat_id: str,
message_id: str,
) -> bool:
"""
Delete a previously sent message. Optional — platforms that don't
support deletion return ``False`` and callers fall back to leaving
the message in place.
Used by the stream consumer's fresh-final cleanup path (see
openclaw/openclaw#72038) to remove long-lived preview messages
after sending the completed reply as a fresh message so the
platform's visible timestamp reflects completion time.
Returns ``True`` on successful deletion, ``False`` otherwise.
Subclasses should override for platforms with a deletion API
(e.g. Telegram ``deleteMessage``).
"""
return False
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""
Send a typing indicator.
+15 -2
View File
@@ -2679,8 +2679,21 @@ class DiscordAdapter(BasePlatformAdapter):
skills: ["skill-a", "skill-b"]
Also checks parent_id so forum threads inherit the forum's bindings.
"""
from gateway.platforms.base import resolve_channel_skills
return resolve_channel_skills(self.config.extra, channel_id, parent_id)
bindings = self.config.extra.get("channel_skill_bindings", [])
if not bindings:
return None
ids_to_check = {channel_id}
if parent_id:
ids_to_check.add(parent_id)
for entry in bindings:
entry_id = str(entry.get("id", ""))
if entry_id in ids_to_check:
skills = entry.get("skills") or entry.get("skill")
if isinstance(skills, str):
return [skills]
if isinstance(skills, list) and skills:
return list(dict.fromkeys(skills)) # dedup, preserve order
return None
def _resolve_channel_prompt(self, channel_id: str, parent_id: str | None = None) -> str | None:
"""Resolve a Discord per-channel prompt, preferring the exact channel over its parent."""
+1 -40
View File
@@ -1556,49 +1556,10 @@ class FeishuAdapter(BasePlatformAdapter):
await self._cancel_pending_tasks(self._pending_text_batch_tasks)
await self._cancel_pending_tasks(self._pending_media_batch_tasks)
self._reset_batch_buffers()
# Send a WebSocket CLOSE frame to Feishu BEFORE tearing down the
# thread loop. Without this, Feishu's server never learns the
# connection is dead and continues routing messages to the stale
# endpoint — the channel goes silent until the server-side
# CLOSE-WAIT expires (minutes to hours). See issue #10202.
#
# ``_disable_websocket_auto_reconnect()`` nils ``self._ws_client``,
# so capture the client reference first.
ws_client = self._ws_client
ws_thread_loop = self._ws_thread_loop
self._disable_websocket_auto_reconnect()
await self._stop_webhook_server()
if (
ws_client is not None
and ws_thread_loop is not None
and not ws_thread_loop.is_closed()
and hasattr(ws_client, "_disconnect")
):
try:
future = asyncio.run_coroutine_threadsafe(
ws_client._disconnect(), ws_thread_loop
)
# 5s is generous — the CLOSE frame is a single WebSocket
# control frame. If it takes longer than that the
# connection is already wedged and we gain nothing by
# waiting further.
await asyncio.wait_for(asyncio.wrap_future(future), timeout=5.0)
logger.debug("[Feishu] Sent WebSocket CLOSE frame to Feishu")
except asyncio.TimeoutError:
logger.warning(
"[Feishu] CLOSE frame not acknowledged within 5s — "
"Feishu may briefly route messages to the stale "
"connection until server-side timeout"
)
except Exception as exc:
logger.debug(
"[Feishu] Could not send WebSocket CLOSE frame: %s",
exc,
exc_info=True,
)
ws_thread_loop = self._ws_thread_loop
if ws_thread_loop is not None and not ws_thread_loop.is_closed():
logger.debug("[Feishu] Cancelling websocket thread tasks and stopping loop")
-9
View File
@@ -57,15 +57,6 @@ class MessageDeduplicator:
if len(self._seen) > self._max_size:
cutoff = now - self._ttl
self._seen = {k: v for k, v in self._seen.items() if v > cutoff}
if len(self._seen) > self._max_size:
# TTL pruning alone does not cap the cache when every entry is
# still fresh. Keep the newest entries so the helper's
# max_size bound is enforced under sustained traffic.
newest = sorted(
self._seen.items(),
key=lambda item: item[1],
)[-self._max_size:]
self._seen = dict(newest)
return False
def clear(self):
File diff suppressed because it is too large Load Diff
-25
View File
@@ -1209,31 +1209,6 @@ class TelegramAdapter(BasePlatformAdapter):
)
return SendResult(success=False, error=str(e))
async def delete_message(self, chat_id: str, message_id: str) -> bool:
"""Delete a previously sent Telegram message.
Used by the stream consumer's fresh-final cleanup path (ported
from openclaw/openclaw#72038) to remove long-lived preview
messages after sending the completed reply as a fresh message.
Telegram's Bot API ``deleteMessage`` works for bot-posted
messages in the last 48 hours. Failures are non-fatal the
caller leaves the preview in place and logs at debug level.
"""
if not self._bot:
return False
try:
await self._bot.delete_message(
chat_id=int(chat_id),
message_id=int(message_id),
)
return True
except Exception as e:
logger.debug(
"[%s] Failed to delete Telegram message %s: %s",
self.name, message_id, e,
)
return False
async def send_update_prompt(
self, chat_id: str, prompt: str, default: str = "",
session_key: str = "",
File diff suppressed because it is too large Load Diff
-647
View File
@@ -1,647 +0,0 @@
"""
yuanbao_media.py 元宝平台媒体处理模块
提供 COS 上传文件下载TIM 媒体消息构建等功能
移植自 TypeScript media.tsyuanbao-openclaw-plugin
使用 httpx 替代 cos-nodejs-sdk-v5避免引入额外 SDK 依赖
COS 上传流程
1. 调用 genUploadInfo 获取临时凭证tmpSecretId/tmpSecretKey/sessionToken
2. 用临时凭证通过 HMAC-SHA1 签名构建 Authorization
3. HTTP PUT 上传到 COS
TIM 消息体构建
- buildImageMsgBody() TIMImageElem
- buildFileMsgBody() TIMFileElem
"""
from __future__ import annotations
import hashlib
import hmac
import logging
import os
import re
import secrets
import struct
import time
import urllib.parse
from datetime import datetime, timezone, timedelta
from typing import Optional, Any
import httpx
logger = logging.getLogger(__name__)
# ============ 常量 ============
UPLOAD_INFO_PATH = "/api/resource/genUploadInfo"
DEFAULT_API_DOMAIN = "yuanbao.tencent.com"
DEFAULT_MAX_SIZE_MB = 50
# COS 加速域名后缀(优先使用全球加速)
COS_USE_ACCELERATE = True
# ============ 类型映射 ============
# MIME → image_format 数字(TIM 协议字段)
_MIME_TO_IMAGE_FORMAT: dict[str, int] = {
"image/jpeg": 1,
"image/jpg": 1,
"image/gif": 2,
"image/png": 3,
"image/bmp": 4,
"image/webp": 255,
"image/heic": 255,
"image/tiff": 255,
}
# 文件扩展名 → MIME
_EXT_TO_MIME: dict[str, str] = {
".jpg": "image/jpeg",
".jpeg": "image/jpeg",
".png": "image/png",
".gif": "image/gif",
".webp": "image/webp",
".bmp": "image/bmp",
".heic": "image/heic",
".tiff": "image/tiff",
".ico": "image/x-icon",
".pdf": "application/pdf",
".doc": "application/msword",
".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
".xls": "application/vnd.ms-excel",
".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
".ppt": "application/vnd.ms-powerpoint",
".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
".txt": "text/plain",
".zip": "application/zip",
".tar": "application/x-tar",
".gz": "application/gzip",
".mp3": "audio/mpeg",
".mp4": "video/mp4",
".wav": "audio/wav",
".ogg": "audio/ogg",
".webm": "video/webm",
}
# ============ 工具函数 ============
def guess_mime_type(filename: str) -> str:
"""根据文件扩展名猜测 MIME 类型。"""
ext = os.path.splitext(filename)[-1].lower()
return _EXT_TO_MIME.get(ext, "application/octet-stream")
def is_image(filename: str, mime_type: str = "") -> bool:
"""判断是否为图片类型。"""
if mime_type.startswith("image/"):
return True
ext = os.path.splitext(filename)[-1].lower()
return ext in {".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp", ".heic", ".tiff", ".ico"}
def get_image_format(mime_type: str) -> int:
"""获取 TIM 图片格式编号。"""
return _MIME_TO_IMAGE_FORMAT.get(mime_type.lower(), 255)
def md5_hex(data: bytes) -> str:
"""计算 MD5 十六进制摘要。"""
return hashlib.md5(data).hexdigest()
def generate_file_id() -> str:
"""生成随机文件 ID(32 位 hex)。"""
return secrets.token_hex(16)
# ============ 图片尺寸解析(纯 Python,无需 Pillow ============
def parse_image_size(data: bytes) -> Optional[dict[str, int]]:
"""
解析图片宽高支持 JPEG/PNG/GIF/WebP无需第三方依赖
返回 {"width": w, "height": h} None无法识别
"""
return (
_parse_png_size(data)
or _parse_jpeg_size(data)
or _parse_gif_size(data)
or _parse_webp_size(data)
)
def _parse_png_size(buf: bytes) -> Optional[dict[str, int]]:
if len(buf) < 24:
return None
if buf[:4] != b"\x89PNG":
return None
w = struct.unpack(">I", buf[16:20])[0]
h = struct.unpack(">I", buf[20:24])[0]
return {"width": w, "height": h}
def _parse_jpeg_size(buf: bytes) -> Optional[dict[str, int]]:
if len(buf) < 4 or buf[0] != 0xFF or buf[1] != 0xD8:
return None
i = 2
while i < len(buf) - 9:
if buf[i] != 0xFF:
i += 1
continue
marker = buf[i + 1]
if marker in (0xC0, 0xC2):
h = struct.unpack(">H", buf[i + 5: i + 7])[0]
w = struct.unpack(">H", buf[i + 7: i + 9])[0]
return {"width": w, "height": h}
if i + 3 < len(buf):
i += 2 + struct.unpack(">H", buf[i + 2: i + 4])[0]
else:
break
return None
def _parse_gif_size(buf: bytes) -> Optional[dict[str, int]]:
if len(buf) < 10:
return None
sig = buf[:6].decode("ascii", errors="replace")
if sig not in ("GIF87a", "GIF89a"):
return None
w = struct.unpack("<H", buf[6:8])[0]
h = struct.unpack("<H", buf[8:10])[0]
return {"width": w, "height": h}
def _parse_webp_size(buf: bytes) -> Optional[dict[str, int]]:
if len(buf) < 16:
return None
if buf[:4] != b"RIFF" or buf[8:12] != b"WEBP":
return None
chunk = buf[12:16].decode("ascii", errors="replace")
if chunk == "VP8 ":
if len(buf) >= 30 and buf[23] == 0x9D and buf[24] == 0x01 and buf[25] == 0x2A:
w = struct.unpack("<H", buf[26:28])[0] & 0x3FFF
h = struct.unpack("<H", buf[28:30])[0] & 0x3FFF
return {"width": w, "height": h}
elif chunk == "VP8L":
if len(buf) >= 25 and buf[20] == 0x2F:
bits = struct.unpack("<I", buf[21:25])[0]
w = (bits & 0x3FFF) + 1
h = ((bits >> 14) & 0x3FFF) + 1
return {"width": w, "height": h}
elif chunk == "VP8X":
if len(buf) >= 30:
w = (buf[24] | (buf[25] << 8) | (buf[26] << 16)) + 1
h = (buf[27] | (buf[28] << 8) | (buf[29] << 16)) + 1
return {"width": w, "height": h}
return None
# ============ URL 下载 ============
async def download_url(
url: str,
max_size_mb: int = DEFAULT_MAX_SIZE_MB,
) -> tuple[bytes, str]:
"""
下载 URL 内容返回 (bytes, content_type)
Args:
url: HTTP(S) URL
max_size_mb: 最大允许大小MB超过则抛出异常
Returns:
(data_bytes, content_type_string)
Raises:
ValueError: 内容超过大小限制
httpx.HTTPError: 网络/HTTP 错误
"""
max_bytes = max_size_mb * 1024 * 1024
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
# 先 HEAD 检查大小
try:
head = await client.head(url)
content_length = int(head.headers.get("content-length", 0) or 0)
if content_length > 0 and content_length > max_bytes:
raise ValueError(
f"文件过大: {content_length / 1024 / 1024:.1f} MB > {max_size_mb} MB"
)
except httpx.HTTPStatusError:
pass # 部分服务器不支持 HEAD,忽略
# GET 下载(流式读取,防止超限)
async with client.stream("GET", url) as resp:
resp.raise_for_status()
content_type = resp.headers.get("content-type", "").split(";")[0].strip()
chunks: list[bytes] = []
downloaded = 0
async for chunk in resp.aiter_bytes(65536):
downloaded += len(chunk)
if downloaded > max_bytes:
raise ValueError(
f"文件过大: 已超过 {max_size_mb} MB 限制"
)
chunks.append(chunk)
data = b"".join(chunks)
return data, content_type
# ============ COS 鉴权(HMAC-SHA1 ============
def _cos_sign(
method: str,
path: str,
params: dict[str, str],
headers: dict[str, str],
secret_id: str,
secret_key: str,
start_time: Optional[int] = None,
expire_seconds: int = 3600,
) -> str:
"""
构建 COS 请求签名q-sign-algorithm=sha1 方案
参考https://cloud.tencent.com/document/product/436/7778
Args:
method: HTTP 方法小写 "put"
path: URL 路径URL encode 后的小写
params: URL 查询参数 dict用于签名
headers: 参与签名的请求头 dictkey 需小写
secret_id: 临时 SecretIdtmpSecretId
secret_key: 临时 SecretKeytmpSecretKey
start_time: 签名起始 Unix 时间戳默认 now
expire_seconds: 签名有效期默认 3600
Returns:
Authorization header 完整字符串
"""
now = int(time.time())
q_sign_time = f"{start_time or now};{(start_time or now) + expire_seconds}"
# Step 1: SignKey = HMAC-SHA1(SecretKey, q-sign-time)
sign_key = hmac.new(
secret_key.encode("utf-8"),
q_sign_time.encode("utf-8"),
hashlib.sha1,
).hexdigest()
# Step 2: HttpString
# 参数和头部需按字典序排列,key 小写
sorted_params = sorted((k.lower(), urllib.parse.quote(str(v), safe="") ) for k, v in params.items())
sorted_headers = sorted((k.lower(), urllib.parse.quote(str(v), safe="") ) for k, v in headers.items())
url_param_list = ";".join(k for k, _ in sorted_params)
url_params = "&".join(f"{k}={v}" for k, v in sorted_params)
header_list = ";".join(k for k, _ in sorted_headers)
header_str = "&".join(f"{k}={v}" for k, v in sorted_headers)
http_string = "\n".join([
method.lower(),
path,
url_params,
header_str,
"",
])
# Step 3: StringToSign = sha1 hash of HttpString
sha1_of_http = hashlib.sha1(http_string.encode("utf-8")).hexdigest()
string_to_sign = "\n".join([
"sha1",
q_sign_time,
sha1_of_http,
"",
])
# Step 4: Signature = HMAC-SHA1(SignKey, StringToSign)
signature = hmac.new(
sign_key.encode("utf-8"),
string_to_sign.encode("utf-8"),
hashlib.sha1,
).hexdigest()
return (
f"q-sign-algorithm=sha1"
f"&q-ak={secret_id}"
f"&q-sign-time={q_sign_time}"
f"&q-key-time={q_sign_time}"
f"&q-header-list={header_list}"
f"&q-url-param-list={url_param_list}"
f"&q-signature={signature}"
)
# ============ 主要公开 API ============
async def get_cos_credentials(
app_key: str,
api_domain: str,
token: str,
filename: str = "file",
file_id: Optional[str] = None,
bot_id: str = "",
route_env: str = "",
) -> dict:
"""
调用 genUploadInfo 接口获取 COS 临时密钥及上传配置
Args:
app_key: 应用 Key用于 X-ID
api_domain: API 域名 https://bot.yuanbao.tencent.com
token: 当前有效的签票 tokenX-Token
filename: 待上传的文件名含扩展名
file_id: 客户端生成的唯一文件 ID不传则自动生成
bot_id: Bot 账号 ID用于 X-ID
Returns:
COS 上传配置 dict包含以下字段
bucketName (str) COS Bucket 名称
region (str) COS 地域
location (str) 上传 Key对象路径
encryptTmpSecretId (str) 临时 SecretId
encryptTmpSecretKey(str) 临时 SecretKey
encryptToken (str) SessionToken
startTime (int) 凭证起始时间戳Unix
expiredTime (int) 凭证过期时间戳Unix
resourceUrl (str) 上传后的公网访问 URL
resourceID (str) 资源 ID可选
Raises:
RuntimeError: 接口返回非 0 code 或字段缺失
"""
if file_id is None:
file_id = generate_file_id()
upload_url = f"{api_domain.rstrip('/')}{UPLOAD_INFO_PATH}"
headers = {
"Content-Type": "application/json",
"X-Token": token,
"X-ID": bot_id or app_key,
"X-Source": "web",
}
if route_env:
headers["X-Route-Env"] = route_env
body = {
"fileName": filename,
"fileId": file_id,
"docFrom": "localDoc",
"docOpenId": "",
}
async with httpx.AsyncClient(timeout=15.0) as client:
resp = await client.post(upload_url, json=body, headers=headers)
resp.raise_for_status()
result: dict[str, Any] = resp.json()
code = result.get("code")
if code != 0 and code is not None:
raise RuntimeError(
f"genUploadInfo 失败: code={code}, msg={result.get('msg', '')}"
)
data = result.get("data") or result
required_fields = ["bucketName", "location"]
missing = [f for f in required_fields if not data.get(f)]
if missing:
raise RuntimeError(
f"genUploadInfo 返回字段不完整: 缺少字段 {missing}"
)
return data
async def upload_to_cos(
file_bytes: bytes,
filename: str,
content_type: str,
credentials: dict,
bucket: str,
region: str,
) -> dict:
"""
通过 httpx PUT 请求将文件上传到 COS
使用临时凭证tmpSecretId/tmpSecretKey/sessionToken构建 HMAC-SHA1 签名
Args:
file_bytes: 文件二进制内容
filename: 文件名用于辅助计算 MIMEUUID
content_type: MIME 类型 "image/jpeg"
credentials: get_cos_credentials() 返回的 dict包含
encryptTmpSecretId tmpSecretId
encryptTmpSecretKey tmpSecretKey
encryptToken sessionToken
location COS key对象路径
resourceUrl 上传后公网 URL
startTime 凭证起始时间Unix
expiredTime 凭证过期时间Unix
bucket: COS Bucket 名称 chatbot-1234567890
region: COS 地域 ap-guangzhou
Returns:
上传结果 dict包含
url (str) COS 公网访问 URL
uuid (str) 文件内容 MD5
size (int) 文件大小字节
width (int, optional) 图片宽度仅图片
height (int, optional) 图片高度仅图片
Raises:
httpx.HTTPStatusError: COS 返回非 2xx 状态
RuntimeError: credentials 字段缺失
"""
secret_id: str = credentials.get("encryptTmpSecretId", "")
secret_key: str = credentials.get("encryptTmpSecretKey", "")
session_token: str = credentials.get("encryptToken", "")
cos_key: str = credentials.get("location", "")
resource_url: str = credentials.get("resourceUrl", "")
start_time: Optional[int] = credentials.get("startTime")
expired_time: Optional[int] = credentials.get("expiredTime")
if not secret_id or not secret_key or not cos_key:
raise RuntimeError(
f"COS credentials 不完整: secretId={bool(secret_id)}, "
f"secretKey={bool(secret_key)}, location={bool(cos_key)}"
)
# 构建 COS 上传 URL(优先使用全球加速域名)
if COS_USE_ACCELERATE:
cos_host = f"{bucket}.cos.accelerate.myqcloud.com"
else:
cos_host = f"{bucket}.cos.{region}.myqcloud.com"
# URL encode cos_key(保留 /
encoded_key = urllib.parse.quote(cos_key, safe="/")
cos_url = f"https://{cos_host}/{encoded_key.lstrip('/')}"
# 确定 Content-Type
if not content_type or content_type == "application/octet-stream":
if is_image(filename):
content_type = guess_mime_type(filename)
else:
content_type = "application/octet-stream"
# 计算文件 MD5 + size
file_uuid = md5_hex(file_bytes)
file_size = len(file_bytes)
# 参与签名的请求头
sign_headers = {
"host": cos_host,
"content-type": content_type,
"x-cos-security-token": session_token,
}
# 计算签名有效期
now = int(time.time())
sign_start = start_time if start_time else now
sign_expire = (expired_time - now) if expired_time and expired_time > now else 3600
authorization = _cos_sign(
method="put",
path=f"/{encoded_key.lstrip('/')}",
params={},
headers=sign_headers,
secret_id=secret_id,
secret_key=secret_key,
start_time=sign_start,
expire_seconds=sign_expire,
)
put_headers = {
"Authorization": authorization,
"Content-Type": content_type,
"x-cos-security-token": session_token,
}
logger.info(
"COS PUT: bucket=%s region=%s key=%s size=%d mime=%s",
bucket, region, cos_key, file_size, content_type,
)
async with httpx.AsyncClient(timeout=120.0) as client:
resp = await client.put(
cos_url,
content=file_bytes,
headers=put_headers,
)
resp.raise_for_status()
# 解析图片尺寸(仅图片类型)
result: dict[str, Any] = {
"url": resource_url or cos_url,
"uuid": file_uuid,
"size": file_size,
}
if content_type.startswith("image/"):
size_info = parse_image_size(file_bytes)
if size_info:
result["width"] = size_info["width"]
result["height"] = size_info["height"]
logger.info(
"COS 上传成功: url=%s size=%d",
result["url"], file_size,
)
return result
# ============ TIM 媒体消息构建 ============
def build_image_msg_body(
url: str,
uuid: Optional[str] = None,
filename: Optional[str] = None,
size: int = 0,
width: int = 0,
height: int = 0,
mime_type: str = "",
) -> list[dict]:
"""
构建腾讯 IM TIMImageElem 消息体
参考https://cloud.tencent.com/document/product/269/2720
Args:
url: 图片公网访问 URLCOS resourceUrl
uuid: 文件 UUIDMD5 或其他唯一标识
filename: 文件名uuid 为空时作为备用
size: 文件大小字节
width: 图片宽度像素
height: 图片高度像素
mime_type: MIME 类型用于确定 image_format
Returns:
TIMImageElem 消息体列表适合直接放入 msg_body
"""
_uuid = uuid or filename or _basename_from_url(url) or "image"
image_format = get_image_format(mime_type) if mime_type else 255
return [
{
"msg_type": "TIMImageElem",
"msg_content": {
"uuid": _uuid,
"image_format": image_format,
"image_info_array": [
{
"type": 1, # 1 = 原图
"size": size,
"width": width,
"height": height,
"url": url,
}
],
},
}
]
def build_file_msg_body(
url: str,
filename: str,
uuid: Optional[str] = None,
size: int = 0,
) -> list[dict]:
"""
构建腾讯 IM TIMFileElem 消息体
参考https://cloud.tencent.com/document/product/269/2720
Args:
url: 文件公网访问 URLCOS resourceUrl
filename: 文件名含扩展名
uuid: 文件 UUIDMD5 或其他唯一标识不传则使用 filename
size: 文件大小字节
Returns:
TIMFileElem 消息体列表适合直接放入 msg_body
"""
_uuid = uuid or filename
return [
{
"msg_type": "TIMFileElem",
"msg_content": {
"uuid": _uuid,
"file_name": filename,
"file_size": size,
"url": url,
},
}
]
# ============ 内部工具 ============
def _basename_from_url(url: str) -> str:
"""从 URL 提取文件名。"""
try:
parsed = urllib.parse.urlparse(url)
return os.path.basename(parsed.path)
except Exception:
return ""
File diff suppressed because it is too large Load Diff
-558
View File
@@ -1,558 +0,0 @@
"""
Yuanbao sticker (TIMFaceElem) support.
Ported from yuanbao-openclaw-plugin/src/sticker/.
TIMFaceElem wire format:
{
"msg_type": "TIMFaceElem",
"msg_content": {
"index": 0, # always 0 per Yuanbao convention
"data": "<json>", # serialised sticker metadata
}
}
The `data` field carries a JSON string with the sticker's metadata so the
receiver can look up the correct asset in the emoji pack.
"""
from __future__ import annotations
import json
import random
import re
import unicodedata
from typing import Optional
# ---------------------------------------------------------------------------
# Sticker catalogue ported from builtin-stickers.json
# Key : canonical name (Chinese)
# Value : {sticker_id, package_id, name, description, width, height, formats}
# ---------------------------------------------------------------------------
STICKER_MAP: dict[str, dict] = {
"六六六": {
"sticker_id": "278", "package_id": "1003", "name": "六六六",
"description": "666 厉害 牛 棒 绝了 好强 awesome",
"width": 128, "height": 128, "formats": "png",
},
"我想开了": {
"sticker_id": "262", "package_id": "1003", "name": "我想开了",
"description": "想开 佛系 释怀 顿悟 看淡了 无所谓",
"width": 128, "height": 128, "formats": "png",
},
"害羞": {
"sticker_id": "130", "package_id": "1003", "name": "害羞",
"description": "腼腆 不好意思 脸红 娇羞 羞涩 捂脸",
"width": 128, "height": 128, "formats": "png",
},
"比心": {
"sticker_id": "252", "package_id": "1003", "name": "比心",
"description": "笔芯 爱你 爱心手势 love heart 喜欢你",
"width": 128, "height": 128, "formats": "png",
},
"委屈": {
"sticker_id": "125", "package_id": "1003", "name": "委屈",
"description": "难过 想哭 可怜巴巴 瘪嘴 受伤 被欺负",
"width": 128, "height": 128, "formats": "png",
},
"亲亲": {
"sticker_id": "146", "package_id": "1003", "name": "亲亲",
"description": "么么 mua 亲一下 kiss 飞吻 啵",
"width": 128, "height": 128, "formats": "png",
},
"": {
"sticker_id": "131", "package_id": "1003", "name": "",
"description": "帅 墨镜 cool 高冷 有型 swagger",
"width": 128, "height": 128, "formats": "png",
},
"": {
"sticker_id": "145", "package_id": "1003", "name": "",
"description": "睡觉 困 zzZ 打盹 躺平 休眠 sleepy",
"width": 128, "height": 128, "formats": "png",
},
"发呆": {
"sticker_id": "152", "package_id": "1003", "name": "发呆",
"description": "懵 愣住 放空 呆滞 出神 脑子空白",
"width": 128, "height": 128, "formats": "png",
},
"可怜": {
"sticker_id": "157", "package_id": "1003", "name": "可怜",
"description": "卖萌 求饶 委屈巴巴 弱小 拜托 眼巴巴",
"width": 128, "height": 128, "formats": "png",
},
"摊手": {
"sticker_id": "200", "package_id": "1003", "name": "摊手",
"description": "无奈 没办法 耸肩 随便 那咋整 whatever",
"width": 128, "height": 128, "formats": "png",
},
"头大": {
"sticker_id": "213", "package_id": "1003", "name": "头大",
"description": "头疼 烦恼 郁闷 难搞 崩溃 一团乱",
"width": 128, "height": 128, "formats": "png",
},
"": {
"sticker_id": "256", "package_id": "1003", "name": "",
"description": "害怕 惊恐 震惊 吓一跳 恐怖 怂",
"width": 128, "height": 128, "formats": "png",
},
"吐血": {
"sticker_id": "203", "package_id": "1003", "name": "吐血",
"description": "无语 崩溃 被雷 内伤 一口老血 屮",
"width": 128, "height": 128, "formats": "png",
},
"": {
"sticker_id": "185", "package_id": "1003", "name": "",
"description": "傲娇 生气 不满 撇嘴 不理 赌气",
"width": 128, "height": 128, "formats": "png",
},
"嘿嘿": {
"sticker_id": "220", "package_id": "1003", "name": "嘿嘿",
"description": "坏笑 猥琐笑 偷笑 憨笑 得意 你懂的",
"width": 128, "height": 128, "formats": "png",
},
"头秃": {
"sticker_id": "218", "package_id": "1003", "name": "头秃",
"description": "程序员 加班 焦虑 没头发 秃了 肝爆",
"width": 128, "height": 128, "formats": "png",
},
"暗中观察": {
"sticker_id": "221", "package_id": "1003", "name": "暗中观察",
"description": "窥屏 潜水 偷偷看 角落 围观 屏住呼吸",
"width": 128, "height": 128, "formats": "png",
},
"我酸了": {
"sticker_id": "224", "package_id": "1003", "name": "我酸了",
"description": "嫉妒 柠檬精 羡慕 吃柠檬 眼红 恰柠檬",
"width": 128, "height": 128, "formats": "png",
},
"打call": {
"sticker_id": "246", "package_id": "1003", "name": "打call",
"description": "应援 加油 支持 喝彩 助威 call",
"width": 128, "height": 128, "formats": "png",
},
"庆祝": {
"sticker_id": "251", "package_id": "1003", "name": "庆祝",
"description": "祝贺 开心 耶 party 胜利 干杯",
"width": 128, "height": 128, "formats": "png",
},
"奋斗": {
"sticker_id": "151", "package_id": "1003", "name": "奋斗",
"description": "努力 加油 拼搏 冲 干劲 卷起来",
"width": 128, "height": 128, "formats": "png",
},
"惊讶": {
"sticker_id": "143", "package_id": "1003", "name": "惊讶",
"description": "震惊 哇 不敢相信 OMG 居然 这么离谱",
"width": 128, "height": 128, "formats": "png",
},
"疑问": {
"sticker_id": "144", "package_id": "1003", "name": "疑问",
"description": "问号 不懂 啥 为什么 啥情况 懵逼问",
"width": 128, "height": 128, "formats": "png",
},
"仔细分析": {
"sticker_id": "248", "package_id": "1003", "name": "仔细分析",
"description": "思考 推敲 认真 研究 琢磨 让我想想",
"width": 128, "height": 128, "formats": "png",
},
"撅嘴": {
"sticker_id": "184", "package_id": "1003", "name": "撅嘴",
"description": "嘟嘴 卖萌 不高兴 撒娇 嘴翘",
"width": 128, "height": 128, "formats": "png",
},
"泪奔": {
"sticker_id": "199", "package_id": "1003", "name": "泪奔",
"description": "大哭 伤心 破防 感动哭 泪流满面 呜呜",
"width": 128, "height": 128, "formats": "png",
},
"尊嘟假嘟": {
"sticker_id": "276", "package_id": "1003", "name": "尊嘟假嘟",
"description": "真的假的 真假 可爱问 你骗我 是不是",
"width": 128, "height": 128, "formats": "png",
},
"略略略": {
"sticker_id": "113", "package_id": "1003", "name": "略略略",
"description": "调皮 吐舌 不服 略 气死你 鬼脸",
"width": 128, "height": 128, "formats": "png",
},
"": {
"sticker_id": "180", "package_id": "1003", "name": "",
"description": "想睡 倦 打哈欠 睁不开眼 好困啊 sleepy",
"width": 128, "height": 128, "formats": "png",
},
"折磨": {
"sticker_id": "181", "package_id": "1003", "name": "折磨",
"description": "难受 痛苦 煎熬 蚌埠住了 受不了 要命",
"width": 128, "height": 128, "formats": "png",
},
"抠鼻": {
"sticker_id": "182", "package_id": "1003", "name": "抠鼻",
"description": "不屑 无聊 淡定 无所谓 鄙视 挖鼻",
"width": 128, "height": 128, "formats": "png",
},
"鼓掌": {
"sticker_id": "183", "package_id": "1003", "name": "鼓掌",
"description": "拍手 叫好 赞同 666 喝彩 掌声",
"width": 128, "height": 128, "formats": "png",
},
"斜眼笑": {
"sticker_id": "204", "package_id": "1003", "name": "斜眼笑",
"description": "滑稽 坏笑 doge 意味深长 阴阳怪气 嘿嘿嘿",
"width": 128, "height": 128, "formats": "png",
},
"辣眼睛": {
"sticker_id": "216", "package_id": "1003", "name": "辣眼睛",
"description": "看不下去 cringe 毁三观 太丑了 瞎了",
"width": 128, "height": 128, "formats": "png",
},
"哦哟": {
"sticker_id": "217", "package_id": "1003", "name": "哦哟",
"description": "惊讶 起哄 哇哦 有戏 不简单 哟",
"width": 128, "height": 128, "formats": "png",
},
"吃瓜": {
"sticker_id": "222", "package_id": "1003", "name": "吃瓜",
"description": "围观 看戏 八卦 路人 看热闹 板凳",
"width": 128, "height": 128, "formats": "png",
},
"狗头": {
"sticker_id": "225", "package_id": "1003", "name": "狗头",
"description": "doge 保命 开玩笑 滑稽 反讽 懂的都懂",
"width": 128, "height": 128, "formats": "png",
},
"敬礼": {
"sticker_id": "227", "package_id": "1003", "name": "敬礼",
"description": "salute 尊重 收到 遵命 致敬 报告",
"width": 128, "height": 128, "formats": "png",
},
"": {
"sticker_id": "231", "package_id": "1003", "name": "",
"description": "知道了 明白 敷衍 嗯 这样啊 收到",
"width": 128, "height": 128, "formats": "png",
},
"拿到红包": {
"sticker_id": "236", "package_id": "1003", "name": "拿到红包",
"description": "红包 谢谢老板 发财 开心 抢到了 欧气",
"width": 128, "height": 128, "formats": "png",
},
"牛吖": {
"sticker_id": "239", "package_id": "1003", "name": "牛吖",
"description": "牛 厉害 强 666 佩服 大佬",
"width": 128, "height": 128, "formats": "png",
},
"贴贴": {
"sticker_id": "272", "package_id": "1003", "name": "贴贴",
"description": "抱抱 亲昵 蹭蹭 亲密 靠靠 撒娇贴",
"width": 128, "height": 128, "formats": "png",
},
"爱心": {
"sticker_id": "138", "package_id": "1003", "name": "爱心",
"description": "心 love 喜欢你 红心 示爱 么么哒",
"width": 128, "height": 128, "formats": "png",
},
"晚安": {
"sticker_id": "170", "package_id": "1003", "name": "晚安",
"description": "好梦 睡了 night 早点休息 安啦 moon",
"width": 128, "height": 128, "formats": "png",
},
"太阳": {
"sticker_id": "176", "package_id": "1003", "name": "太阳",
"description": "晴天 早上好 阳光 morning 好天气 日",
"width": 128, "height": 128, "formats": "png",
},
"柠檬": {
"sticker_id": "266", "package_id": "1003", "name": "柠檬",
"description": "酸 嫉妒 柠檬精 羡慕 我酸 恰柠檬",
"width": 128, "height": 128, "formats": "png",
},
"大冤种": {
"sticker_id": "267", "package_id": "1003", "name": "大冤种",
"description": "倒霉 吃亏 自嘲 好心没好报 背锅 工具人",
"width": 128, "height": 128, "formats": "png",
},
"吐了": {
"sticker_id": "132", "package_id": "1003", "name": "吐了",
"description": "恶心 yue 受不了 嫌弃 想吐 生理不适",
"width": 128, "height": 128, "formats": "png",
},
"": {
"sticker_id": "134", "package_id": "1003", "name": "",
"description": "生气 愤怒 火大 暴躁 气炸 怼",
"width": 128, "height": 128, "formats": "png",
},
"玫瑰": {
"sticker_id": "165", "package_id": "1003", "name": "玫瑰",
"description": "花 示爱 表白 浪漫 送你花 情人节",
"width": 128, "height": 128, "formats": "png",
},
"凋谢": {
"sticker_id": "119", "package_id": "1003", "name": "凋谢",
"description": "花谢 失恋 难过 枯萎 心碎 凉了",
"width": 128, "height": 128, "formats": "png",
},
"点赞": {
"sticker_id": "159", "package_id": "1003", "name": "点赞",
"description": "赞 认同 好棒 good like 大拇指 顶",
"width": 128, "height": 128, "formats": "png",
},
"握手": {
"sticker_id": "164", "package_id": "1003", "name": "握手",
"description": "合作 你好 商务 hello deal 成交 友好",
"width": 128, "height": 128, "formats": "png",
},
"抱拳": {
"sticker_id": "163", "package_id": "1003", "name": "抱拳",
"description": "谢谢 失敬 江湖 承让 拜托 有礼",
"width": 128, "height": 128, "formats": "png",
},
"ok": {
"sticker_id": "169", "package_id": "1003", "name": "ok",
"description": "好的 收到 没问题 okay 行 可以 懂了",
"width": 128, "height": 128, "formats": "png",
},
"拳头": {
"sticker_id": "174", "package_id": "1003", "name": "拳头",
"description": "加油 干 冲 fight 力量 击拳 硬气",
"width": 128, "height": 128, "formats": "png",
},
"鞭炮": {
"sticker_id": "191", "package_id": "1003", "name": "鞭炮",
"description": "过年 喜庆 爆竹 春节 噼里啪啦 红",
"width": 128, "height": 128, "formats": "png",
},
"烟花": {
"sticker_id": "258", "package_id": "1003", "name": "烟花",
"description": "庆典 漂亮 新年 嘭 绽放 节日快乐",
"width": 128, "height": 128, "formats": "png",
},
}
def get_sticker_by_name(name: str) -> Optional[dict]:
"""
按名称查找贴纸支持模糊匹配
匹配优先级
1. 完全相等name
2. name 包含查询词前缀/子串
3. description 包含查询词同义词搜索
4. 通用模糊评分 sticker-search 同算法命中即返回得分最高的一条
返回 sticker dict找不到返回 None
"""
if not name:
return None
query = name.strip()
if query in STICKER_MAP:
return STICKER_MAP[query]
for key, sticker in STICKER_MAP.items():
if query in key or key in query:
return sticker
for sticker in STICKER_MAP.values():
desc = sticker.get("description", "")
if query in desc:
return sticker
matches = search_stickers(query, limit=1)
return matches[0] if matches else None
def get_random_sticker(category: str = None) -> dict:
"""
随机返回一个贴纸
若指定 category则在 description 中含有该关键词的贴纸里随机选取
category None 时从全表随机
"""
if category:
candidates = [
s for s in STICKER_MAP.values()
if category in s.get("description", "") or category in s.get("name", "")
]
if candidates:
return random.choice(candidates)
return random.choice(list(STICKER_MAP.values()))
def get_sticker_by_id(sticker_id: str) -> Optional[dict]:
"""按 sticker_id 精确查找贴纸。"""
if not sticker_id:
return None
sid = str(sticker_id).strip()
for sticker in STICKER_MAP.values():
if sticker.get("sticker_id") == sid:
return sticker
return None
# ---------------------------------------------------------------------------
# 模糊搜索(对齐 chatbot-web yuanbao-openclaw-plugin/sticker-cache.ts.searchStickers
# ---------------------------------------------------------------------------
_PUNCT_RE = re.compile(r"[\s\u3000\-_·.,,。!?\"“”'‘’、/\\]+")
def _normalize_text(raw: str) -> str:
return unicodedata.normalize("NFKC", str(raw or "")).strip().lower()
def _compact_text(raw: str) -> str:
return _PUNCT_RE.sub("", _normalize_text(raw))
def _multiset_char_hit_ratio(needle: str, haystack: str) -> float:
if not needle:
return 0.0
bag: dict[str, int] = {}
for ch in haystack:
bag[ch] = bag.get(ch, 0) + 1
hits = 0
for ch in needle:
n = bag.get(ch, 0)
if n > 0:
hits += 1
bag[ch] = n - 1
return hits / len(needle)
def _bigram_jaccard(a: str, b: str) -> float:
if len(a) < 2 or len(b) < 2:
return 0.0
A = {a[i:i + 2] for i in range(len(a) - 1)}
B = {b[i:i + 2] for i in range(len(b) - 1)}
inter = len(A & B)
union = len(A) + len(B) - inter
return inter / union if union else 0.0
def _longest_subsequence_ratio(needle: str, haystack: str) -> float:
if not needle:
return 0.0
j = 0
for ch in haystack:
if j >= len(needle):
break
if ch == needle[j]:
j += 1
return j / len(needle)
def _score_field(haystack: str, query: str) -> float:
hay = _normalize_text(haystack)
q = _normalize_text(query)
if not hay or not q:
return 0.0
hay_c = _compact_text(haystack)
q_c = _compact_text(query)
best = 0.0
if hay == q:
best = max(best, 100.0)
if q in hay:
best = max(best, 92 + min(6, len(q)))
if len(q) >= 2 and hay.startswith(q):
best = max(best, 88.0)
if q_c and q_c in hay_c:
best = max(best, 86.0)
best = max(best, _multiset_char_hit_ratio(q_c, hay_c) * 62)
best = max(best, _bigram_jaccard(q_c, hay_c) * 58)
best = max(best, _longest_subsequence_ratio(q_c, hay_c) * 52)
if len(q) == 1 and q in hay:
best = max(best, 68.0)
return best
def search_stickers(query: str, limit: int = 10) -> list[dict]:
"""
在内置贴纸表中按模糊匹配排序返回前 N 条结果
评分综合 name/description 字段的子串字符多重集覆盖bigram Jaccard子序列比例
name 权重略高于 description×0.88 query 时按字典顺序返回前 N
"""
safe_limit = max(1, min(500, int(limit) if limit else 10))
if not query or not _normalize_text(query):
return list(STICKER_MAP.values())[:safe_limit]
scored: list[tuple[float, dict]] = []
for sticker in STICKER_MAP.values():
name_s = _score_field(sticker.get("name", ""), query)
desc_s = _score_field(sticker.get("description", ""), query) * 0.88
sid = str(sticker.get("sticker_id", "")).strip()
q_norm = _normalize_text(query)
id_s = 0.0
if sid and q_norm:
sid_norm = _normalize_text(sid)
if sid_norm == q_norm:
id_s = 100.0
elif q_norm in sid_norm:
id_s = 84.0
scored.append((max(name_s, desc_s, id_s), sticker))
scored.sort(key=lambda x: x[0], reverse=True)
top = scored[0][0] if scored else 0
if top <= 0:
return [s for _, s in scored[:safe_limit]]
if top >= 22:
floor = 18.0
elif top >= 12:
floor = max(10.0, top * 0.5)
else:
floor = max(6.0, top * 0.35)
filtered = [pair for pair in scored if pair[0] >= floor]
out = filtered if filtered else scored
return [s for _, s in out[:safe_limit]]
def build_face_msg_body(
face_index: int,
face_type: int = 1,
data: Optional[str] = None,
) -> list:
"""
构造 TIMFaceElem 消息体
Yuanbao 约定
- index 固定传 0服务端通过 data 字段识别具体表情
- data JSON 字符串包含 sticker_id / package_id 等字段
Args:
face_index: 保留字段暂时不影响 wire formatYuanbao 固定 index=0
face_index > 0 时视为旧版 QQ 表情 ID直接放入 index
face_type: 保留字段兼容旧接口当前未使用
data: 已序列化的 JSON 字符串 None 时仅传 index
Returns:
符合 Yuanbao TIM 协议的 msg_body list::
[{"msg_type": "TIMFaceElem", "msg_content": {"index": 0, "data": "..."}}]
"""
msg_content: dict = {"index": face_index}
if data is not None:
msg_content["data"] = data
return [{"msg_type": "TIMFaceElem", "msg_content": msg_content}]
def build_sticker_msg_body(sticker: dict) -> list:
"""
STICKER_MAP 中的 sticker dict 直接构造 TIMFaceElem 消息体
这是 send_sticker() 的内部辅助确保 data 字段与原始 JS 插件一致
"""
data_payload = json.dumps(
{
"sticker_id": sticker["sticker_id"],
"package_id": sticker["package_id"],
"width": sticker.get("width", 128),
"height": sticker.get("height", 128),
"formats": sticker.get("formats", "png"),
"name": sticker["name"],
},
ensure_ascii=False,
separators=(",", ":"),
)
return build_face_msg_body(face_index=0, data=data_payload)
+363 -360
View File
File diff suppressed because it is too large Load Diff
+2 -11
View File
@@ -310,9 +310,8 @@ def build_session_context_prompt(
"**Platform notes:** You are running inside Slack. "
"You do NOT have access to Slack-specific APIs — you cannot search "
"channel history, pin/unpin messages, manage channels, or list users. "
"Do not promise to perform these actions. The gateway may inline the "
"current message's Slack block/attachment payload when available, but "
"you still cannot call Slack APIs yourself."
"Do not promise to perform these actions. If the user asks, explain "
"that you can only read messages sent directly to you and respond."
)
elif context.source.platform == Platform.DISCORD:
# Inject the Discord IDs block only when the agent actually has
@@ -354,14 +353,6 @@ def build_session_context_prompt(
"If the user needs a detailed answer, give the short version first "
"and offer to elaborate."
)
elif context.source.platform == Platform.YUANBAO:
lines.append("")
lines.append(
"**Platform notes:** You are running inside Yuanbao. "
"You CAN send private (DM) messages via the send_message tool. "
"Use target='yuanbao:direct:<account_id>' for DM "
"and target='yuanbao:group:<group_code>' for group chat."
)
# Connected platforms
platforms_list = ["local (files on this machine)"]
-110
View File
@@ -44,14 +44,6 @@ class StreamConsumerConfig:
buffer_threshold: int = 40
cursor: str = ""
buffer_only: bool = False
# When >0, the final edit for a streamed response is delivered as a
# fresh message if the original preview has been visible for at least
# this many seconds. This makes the platform's visible timestamp
# reflect completion time instead of first-token time for long-running
# responses (e.g. reasoning models that stream slowly). Ported from
# openclaw/openclaw#72038. Default 0 = always edit in place (legacy
# behavior). The gateway enables this selectively per-platform.
fresh_final_after_seconds: float = 0.0
class GatewayStreamConsumer:
@@ -99,12 +91,6 @@ class GatewayStreamConsumer:
self._queue: queue.Queue = queue.Queue()
self._accumulated = ""
self._message_id: Optional[str] = None
# Wall-clock timestamp (time.monotonic) when ``_message_id`` was
# first assigned from a successful first-send. Used by the
# fresh-final logic to detect long-lived previews whose edit
# timestamps would be stale by completion time. Ported from
# openclaw/openclaw#72038.
self._message_created_ts: Optional[float] = None
self._already_sent = False
self._edit_supported = True # Disabled when progressive edits are no longer usable
self._last_edit_time = 0.0
@@ -150,7 +136,6 @@ class GatewayStreamConsumer:
if preserve_no_edit and self._message_id == "__no_edit__":
return
self._message_id = None
self._message_created_ts = None
self._accumulated = ""
self._last_sent_text = ""
self._fallback_final_send = False
@@ -749,81 +734,6 @@ class GatewayStreamConsumer:
logger.error("Commentary send error: %s", e)
return False
def _should_send_fresh_final(self) -> bool:
"""Return True when a long-lived preview should be replaced with a
fresh final message instead of an edit.
Conditions:
- Fresh-final is enabled (``fresh_final_after_seconds > 0``).
- We have a real preview message id (not the ``__no_edit__`` sentinel
and not ``None``).
- The preview has been visible for at least the configured threshold.
Ported from openclaw/openclaw#72038.
"""
threshold = getattr(self.cfg, "fresh_final_after_seconds", 0.0) or 0.0
if threshold <= 0:
return False
if not self._message_id or self._message_id == "__no_edit__":
return False
if self._message_created_ts is None:
return False
age = time.monotonic() - self._message_created_ts
return age >= threshold
async def _try_fresh_final(self, text: str) -> bool:
"""Send ``text`` as a brand-new message (best-effort delete the old
preview) so the platform's visible timestamp reflects completion
time. Returns True on successful delivery, False on any failure so
the caller falls back to the normal edit path.
Ported from openclaw/openclaw#72038.
"""
old_message_id = self._message_id
try:
result = await self.adapter.send(
chat_id=self.chat_id,
content=text,
metadata=self.metadata,
)
except Exception as e:
logger.debug("Fresh-final send failed, falling back to edit: %s", e)
return False
if not getattr(result, "success", False):
return False
# Successful fresh send — try to delete the stale preview so the
# user doesn't see the old edit-stuck message underneath. Cleanup
# is best-effort; platforms that don't implement ``delete_message``
# just leave the preview behind (still an acceptable outcome —
# the visible final timestamp is the important part).
if old_message_id and old_message_id != "__no_edit__":
delete_fn = getattr(self.adapter, "delete_message", None)
if delete_fn is not None:
try:
await delete_fn(self.chat_id, old_message_id)
except Exception as e:
logger.debug(
"Fresh-final preview cleanup failed (%s): %s",
old_message_id, e,
)
# Adopt the new message id as the current message so subsequent
# callers (e.g. overflow split loops, finalize retries) see a
# consistent state.
new_message_id = getattr(result, "message_id", None)
if new_message_id:
self._message_id = new_message_id
self._message_created_ts = time.monotonic()
else:
# Send succeeded but platform didn't return an id — treat the
# delivery as final-only and fall back to "__no_edit__" so we
# don't try to edit something we can't address.
self._message_id = "__no_edit__"
self._message_created_ts = None
self._already_sent = True
self._last_sent_text = text
self._final_response_sent = True
return True
async def _send_or_edit(self, text: str, *, finalize: bool = False) -> bool:
"""Send or edit the streaming message.
@@ -876,22 +786,6 @@ class GatewayStreamConsumer:
finalize and self._adapter_requires_finalize
):
return True
# Fresh-final for long-lived previews: when finalizing
# the last edit in a streaming sequence, if the
# original preview has been visible for at least
# ``fresh_final_after_seconds``, send the completed
# reply as a fresh message so the platform's visible
# timestamp reflects completion time instead of the
# preview creation time. Best-effort cleanup of the
# old preview follows. Ported from
# openclaw/openclaw#72038. Gated by config so the
# legacy edit-in-place path stays the default.
if (
finalize
and self._should_send_fresh_final()
and await self._try_fresh_final(text)
):
return True
# Edit existing message
result = await self.adapter.edit_message(
chat_id=self.chat_id,
@@ -958,10 +852,6 @@ class GatewayStreamConsumer:
if result.success:
if result.message_id:
self._message_id = result.message_id
# Track when the preview first became visible to
# the user so fresh-final logic can detect stale
# preview timestamps on long-running responses.
self._message_created_ts = time.monotonic()
else:
self._edit_supported = False
self._already_sent = True
+1 -21
View File
@@ -31,17 +31,8 @@ Hermes' own session keys.
from __future__ import annotations
import json
import logging
import re
from typing import Set
logger = logging.getLogger(__name__)
# WhatsApp JIDs are numeric (or plus-prefixed numeric) with optional
# ``@``, ``.`` and ``:`` separators. ``\w`` is pinned to ASCII so
# full-width digits / Unicode word chars can't sneak through.
_SAFE_IDENTIFIER_RE = re.compile(r"^[A-Za-z0-9@.+\-]+$")
from hermes_constants import get_hermes_home
@@ -90,16 +81,6 @@ def expand_whatsapp_aliases(identifier: str) -> Set[str]:
current = queue.pop(0)
if not current or current in resolved:
continue
# Defense-in-depth: reject identifiers that could sneak path
# separators / traversal segments into the ``lid-mapping-{current}``
# filename below. The hardcoded ``lid-mapping-`` prefix already
# prevents escape via pathlib's component split (an attacker can't
# create ``lid-mapping-..`` as a real directory in session_dir), but
# this keeps the identifier space to the characters WhatsApp JIDs
# actually use and avoids depending on that filesystem-layout
# invariant.
if not _SAFE_IDENTIFIER_RE.match(current):
continue
resolved.add(current)
for suffix in ("", "_reverse"):
@@ -110,8 +91,7 @@ def expand_whatsapp_aliases(identifier: str) -> Set[str]:
mapped = normalize_whatsapp_identifier(
json.loads(mapping_path.read_text(encoding="utf-8"))
)
except (OSError, json.JSONDecodeError) as exc:
logger.debug("whatsapp_identity: failed to read %s: %s", mapping_path, exc)
except Exception:
continue
if mapped and mapped not in resolved:
queue.append(mapped)
+1 -17
View File
@@ -467,27 +467,11 @@ def _resolve_api_key_provider_secret(
pass
return "", ""
from hermes_cli.config import get_env_value
for env_var in pconfig.api_key_env_vars:
# Check both os.environ and ~/.hermes/.env file
val = (get_env_value(env_var) or "").strip()
val = os.getenv(env_var, "").strip()
if has_usable_secret(val):
return val, env_var
# Fallback: try credential pool (e.g. zai key stored via auth.json)
try:
from agent.credential_pool import load_pool
pool = load_pool(provider_id)
if pool and pool.has_credentials():
entry = pool.peek()
if entry:
key = getattr(entry, "access_token", "") or getattr(entry, "runtime_api_key", "")
key = str(key).strip()
if has_usable_secret(key):
return key, f"credential_pool:{provider_id}"
except Exception:
pass
return "", ""
+7 -110
View File
@@ -126,8 +126,8 @@ COMMAND_REGISTRY: list[CommandDef] = [
CommandDef("voice", "Toggle voice mode", "Configuration",
args_hint="[on|off|tts|status]", subcommands=("on", "off", "tts", "status")),
CommandDef("busy", "Control what Enter does while Hermes is working", "Configuration",
cli_only=True, args_hint="[queue|steer|interrupt|status]",
subcommands=("queue", "steer", "interrupt", "status")),
cli_only=True, args_hint="[queue|interrupt|status]",
subcommands=("queue", "interrupt", "status")),
# Tools & Skills
CommandDef("tools", "Manage tools: /tools [list|disable|enable] [name...]", "Tools & Skills",
@@ -140,6 +140,11 @@ COMMAND_REGISTRY: list[CommandDef] = [
CommandDef("cron", "Manage scheduled tasks", "Tools & Skills",
cli_only=True, args_hint="[subcommand]",
subcommands=("list", "add", "create", "edit", "pause", "resume", "run", "remove")),
CommandDef("kanban", "Multi-profile collaboration board (tasks, links, comments)",
"Tools & Skills", args_hint="[subcommand]",
subcommands=("list", "ls", "show", "create", "assign", "link", "unlink",
"claim", "comment", "complete", "block", "unblock", "archive",
"tail", "dispatch", "context", "init", "gc")),
CommandDef("reload", "Reload .env variables into the running session", "Tools & Skills",
cli_only=True),
CommandDef("reload-mcp", "Reload MCP servers from config", "Tools & Skills",
@@ -806,114 +811,6 @@ def discord_skill_commands_by_category(
return trimmed_categories, uncategorized, hidden
# ---------------------------------------------------------------------------
# Slack native slash commands
# ---------------------------------------------------------------------------
# Slack slash command name constraints: lowercase a-z, 0-9, hyphens,
# underscores. Max 32 chars. Slack app manifest accepts up to 50 slash
# commands per app.
_SLACK_MAX_SLASH_COMMANDS = 50
_SLACK_NAME_LIMIT = 32
_SLACK_INVALID_CHARS = re.compile(r"[^a-z0-9_\-]")
def _sanitize_slack_name(raw: str) -> str:
"""Convert a command name to a valid Slack slash command name.
Slack allows lowercase a-z, digits, hyphens, and underscores. Max 32
chars. Uppercase is lowercased; invalid chars are stripped.
"""
name = raw.lower()
name = _SLACK_INVALID_CHARS.sub("", name)
name = name.strip("-_")
return name[:_SLACK_NAME_LIMIT]
def slack_native_slashes() -> list[tuple[str, str, str]]:
"""Return (slash_name, description, usage_hint) triples for Slack.
Every gateway-available command in ``COMMAND_REGISTRY`` is surfaced as
a standalone Slack slash command (e.g. ``/btw``, ``/stop``, ``/model``),
matching Discord's and Telegram's model where every command is a
first-class slash and not a ``/hermes <verb>`` subcommand.
Both canonical names and aliases are included so users can type any
documented form (e.g. ``/background``, ``/bg``, and ``/btw`` all work).
Plugin-registered slash commands are included too.
Results are clamped to Slack's 50-command limit with duplicate-name
avoidance. ``/hermes`` is always reserved as the first entry so the
legacy ``/hermes <subcommand>`` form keeps working for anything that
gets dropped by the clamp or for free-form questions.
"""
overrides = _resolve_config_gates()
entries: list[tuple[str, str, str]] = []
seen: set[str] = set()
# Reserve /hermes as the catch-all top-level command.
entries.append(("hermes", "Talk to Hermes or run a subcommand", "[subcommand] [args]"))
seen.add("hermes")
def _add(name: str, desc: str, hint: str) -> None:
slack_name = _sanitize_slack_name(name)
if not slack_name or slack_name in seen:
return
if len(entries) >= _SLACK_MAX_SLASH_COMMANDS:
return
# Slack description cap is 2000 chars; keep it short.
entries.append((slack_name, desc[:140], hint[:100]))
seen.add(slack_name)
# First pass: canonical names (so they win slots if we hit the cap).
for cmd in COMMAND_REGISTRY:
if not _is_gateway_available(cmd, overrides):
continue
_add(cmd.name, cmd.description, cmd.args_hint or "")
# Second pass: aliases.
for cmd in COMMAND_REGISTRY:
if not _is_gateway_available(cmd, overrides):
continue
for alias in cmd.aliases:
# Skip aliases that only differ from canonical by case/punctuation
# normalization (already covered by _add dedup).
_add(alias, f"Alias for /{cmd.name}{cmd.description}", cmd.args_hint or "")
# Third pass: plugin commands.
for name, description, args_hint in _iter_plugin_command_entries():
_add(name, description, args_hint or "")
return entries
def slack_app_manifest(request_url: str = "https://hermes-agent.local/slack/commands") -> dict[str, Any]:
"""Generate a Slack app manifest with all gateway commands as slashes.
``request_url`` is required by Slack's manifest schema for every slash
command, but in Socket Mode (which we use) Slack ignores it and routes
the command event through the WebSocket. A placeholder URL is fine.
The returned dict is the ``features.slash_commands`` portion only
callers compose it into a full manifest (or merge into an existing
one). Keeping it narrow avoids coupling us to the rest of the manifest
schema (display_information, oauth_config, settings, etc.) which users
set up once in the Slack UI and rarely change.
"""
slashes = []
for name, desc, usage in slack_native_slashes():
entry = {
"command": f"/{name}",
"description": desc or f"Run /{name}",
"should_escape": False,
"url": request_url,
}
if usage:
entry["usage_hint"] = usage
slashes.append(entry)
return {"features": {"slash_commands": slashes}}
def slack_subcommand_map() -> dict[str, str]:
"""Return subcommand -> /command mapping for Slack /hermes handler.
+1 -53
View File
@@ -465,7 +465,6 @@ DEFAULT_CONFIG = {
"command_timeout": 30, # Timeout for browser commands in seconds (screenshot, navigate, etc.)
"record_sessions": False, # Auto-record browser sessions as WebM videos
"allow_private_urls": False, # Allow navigating to private/internal IPs (localhost, 192.168.x.x, etc.)
"auto_local_for_private_urls": True, # When a cloud provider is set, auto-spawn local Chromium for LAN/localhost URLs instead of sending them to the cloud
"cdp_url": "", # Optional persistent CDP endpoint for attaching to an existing Chromium/Chrome
# CDP supervisor — dialog + frame detection via a persistent WebSocket.
# Active only when a CDP-capable backend is attached (Browserbase or
@@ -487,19 +486,6 @@ DEFAULT_CONFIG = {
"checkpoints": {
"enabled": True,
"max_snapshots": 50, # Max checkpoints to keep per directory
# Auto-maintenance: shadow repos accumulate forever under
# ~/.hermes/checkpoints/ (one per cd'd working directory). Field
# reports put the typical offender at 1000+ repos / ~12 GB. When
# auto_prune is on, hermes sweeps at startup (at most once per
# min_interval_hours) and deletes:
# * orphan repos: HERMES_WORKDIR no longer exists on disk
# * stale repos: newest mtime older than retention_days
# Opt-in so users who rely on /rollback against long-ago sessions
# never lose data silently.
"auto_prune": False,
"retention_days": 7,
"delete_orphans": True,
"min_interval_hours": 24,
},
# Maximum characters returned by a single read_file call. Reads that
@@ -640,7 +626,7 @@ DEFAULT_CONFIG = {
"compact": False,
"personality": "kawaii",
"resume_display": "full",
"busy_input_mode": "interrupt", # interrupt | queue | steer
"busy_input_mode": "interrupt",
"bell_on_complete": False,
"show_reasoning": False,
"streaming": False,
@@ -1595,44 +1581,6 @@ OPTIONAL_ENV_VARS = {
"category": "tool",
},
# ── Bundled skills (opt-in: only needed if the user uses that skill) ──
# These use category="skill" (distinct from "tool") so the sandbox
# env blocklist in tools/environments/local.py does NOT rewrite them —
# skills legitimately need these passed through to curl via
# tools/env_passthrough.py when the user's skill calls out.
"NOTION_API_KEY": {
"description": "Notion integration token (used by the `notion` skill)",
"prompt": "Notion API key",
"url": "https://www.notion.so/my-integrations",
"password": True,
"category": "skill",
"advanced": True,
},
"LINEAR_API_KEY": {
"description": "Linear personal API key (used by the `linear` skill)",
"prompt": "Linear API key",
"url": "https://linear.app/settings/api",
"password": True,
"category": "skill",
"advanced": True,
},
"AIRTABLE_API_KEY": {
"description": "Airtable personal access token (used by the `airtable` skill)",
"prompt": "Airtable API key",
"url": "https://airtable.com/create/tokens",
"password": True,
"category": "skill",
"advanced": True,
},
"TENOR_API_KEY": {
"description": "Tenor API key for GIF search (used by the `gif-search` skill)",
"prompt": "Tenor API key",
"url": "https://developers.google.com/tenor/guides/quickstart",
"password": True,
"category": "skill",
"advanced": True,
},
# ── Honcho ──
"HONCHO_API_KEY": {
"description": "Honcho API key for AI-native persistent memory",
-24
View File
@@ -2724,24 +2724,6 @@ _PLATFORMS = [
"help": "OpenID to deliver cron results and notifications to."},
],
},
{
"key": "yuanbao",
"label": "Yuanbao",
"emoji": "💎",
"token_var": "YUANBAO_APP_ID",
"setup_instructions": [
"1. Download the Yuanbao app from https://yuanbao.tencent.com/",
"2. In the app, go to PAI → My Bot and create a new bot",
"3. After the bot is created, copy the App ID and App Secret",
"4. Enter them below and Hermes will connect automatically over WebSocket",
],
"vars": [
{"name": "YUANBAO_APP_ID", "prompt": "App ID", "password": False,
"help": "The App ID from your Yuanbao IM Bot credentials."},
{"name": "YUANBAO_APP_SECRET", "prompt": "App Secret", "password": True,
"help": "The App Secret (used for HMAC signing) from your Yuanbao IM Bot."},
],
},
]
@@ -3126,12 +3108,6 @@ def _setup_wecom():
print_success("💬 WeCom configured!")
def _setup_yuanbao():
"""Configure Yuanbao via the standard platform setup."""
yuanbao_platform = next(p for p in _PLATFORMS if p["key"] == "yuanbao")
_setup_standard_platform(yuanbao_platform)
def _is_service_installed() -> bool:
"""Check if the gateway is installed as a system service."""
if supports_systemd_services():
+1281
View File
File diff suppressed because it is too large Load Diff
File diff suppressed because it is too large Load Diff
+24 -278
View File
@@ -1043,7 +1043,6 @@ def _launch_tui(
)
env.setdefault("HERMES_PYTHON", sys.executable)
env.setdefault("HERMES_CWD", os.getcwd())
env.setdefault("NODE_ENV", "development" if tui_dev else "production")
if model:
env["HERMES_MODEL"] = model
env["HERMES_INFERENCE_MODEL"] = model
@@ -4413,14 +4412,8 @@ def _model_flow_api_key_provider(config, provider_id, current_model=""):
from hermes_cli.models import fetch_ollama_cloud_models
api_key_for_probe = existing_key or (get_env_value(key_env) if key_env else "")
# During setup, force a live refresh so the picker reflects newly
# released models (e.g. deepseek v4 flash, kimi k2.6) the moment
# the user enters their key — not an hour later when the disk
# cache TTL expires.
model_list = fetch_ollama_cloud_models(
api_key=api_key_for_probe,
base_url=effective_base,
force_refresh=True,
api_key=api_key_for_probe, base_url=effective_base
)
if model_list:
print(f" Found {len(model_list)} model(s) from Ollama Cloud")
@@ -4787,35 +4780,11 @@ def cmd_webhook(args):
webhook_command(args)
def cmd_slack(args):
"""Slack integration helpers.
def cmd_kanban(args):
"""Multi-profile collaboration board."""
from hermes_cli.kanban import kanban_command
Dispatches ``hermes slack <subcommand>``. Currently supports:
manifest print or write a Slack app manifest with every gateway
command registered as a first-class slash.
"""
sub = getattr(args, "slack_command", None)
if sub in (None, ""):
# No subcommand — print usage hint.
print(
"usage: hermes slack <subcommand>\n"
"\n"
"subcommands:\n"
" manifest Generate a Slack app manifest with every gateway\n"
" command registered as a native slash\n"
"\n"
"Run `hermes slack manifest -h` for details.",
file=sys.stderr,
)
return 1
if sub == "manifest":
from hermes_cli.slack_cli import slack_manifest_command
return slack_manifest_command(args)
print(f"Unknown slack subcommand: {sub}", file=sys.stderr)
return 1
return kanban_command(args)
def cmd_hooks(args):
@@ -4991,83 +4960,6 @@ def _gateway_prompt(prompt_text: str, default: str = "", timeout: float = 300.0)
return default
def _web_ui_build_needed(web_dir: Path) -> bool:
"""Return True if the web UI dist is missing or stale.
Mirrors the staleness logic used by ``_tui_build_needed()`` for the TUI.
The Vite build outputs to ``hermes_cli/web_dist/`` (per vite.config.ts
outDir: "../hermes_cli/web_dist"), NOT to ``web/dist/``. Uses the Vite
manifest as the sentinel because it is written last and therefore has the
newest mtime of any build output.
"""
dist_dir = web_dir.parent / "hermes_cli" / "web_dist"
sentinel = dist_dir / ".vite" / "manifest.json"
if not sentinel.exists():
sentinel = dist_dir / "index.html"
if not sentinel.exists():
return True
dist_mtime = sentinel.stat().st_mtime
skip = frozenset({"node_modules", "dist"})
for dirpath, dirnames, filenames in os.walk(web_dir, topdown=True):
dirnames[:] = [d for d in dirnames if d not in skip]
for fn in filenames:
if fn.endswith((".ts", ".tsx", ".js", ".jsx", ".css", ".html", ".vue")):
if os.path.getmtime(os.path.join(dirpath, fn)) > dist_mtime:
return True
for meta in (
"package.json",
"package-lock.json",
"yarn.lock",
"pnpm-lock.yaml",
"vite.config.ts",
"vite.config.js",
):
mp = web_dir / meta
if mp.exists() and mp.stat().st_mtime > dist_mtime:
return True
return False
def _run_npm_install_deterministic(
npm: str,
cwd: Path,
*,
extra_args: tuple[str, ...] = (),
capture_output: bool = True,
) -> subprocess.CompletedProcess:
"""Run a deterministic npm install that does not mutate ``package-lock.json``.
Prefers ``npm ci`` (strict, lockfile-preserving) when a lockfile is present;
falls back to ``npm install`` only if ``npm ci`` fails (e.g. lockfile out of
sync on a WIP checkout). Without this, ``npm install`` on npm 10 silently
rewrites committed lockfiles (stripping ``"peer": true`` etc.), which leaves
the working tree dirty and causes the next ``hermes update`` to stash the
lockfile repeatedly.
"""
lockfile = cwd / "package-lock.json"
if lockfile.exists():
ci_cmd = [npm, "ci", *extra_args]
ci_result = subprocess.run(
ci_cmd,
cwd=cwd,
capture_output=capture_output,
text=True,
check=False,
)
if ci_result.returncode == 0:
return ci_result
# Fall through to `npm install` — lockfile may be out of sync on a
# WIP fork/branch, or `npm ci` may not be available on very old npm.
install_cmd = [npm, "install", *extra_args]
return subprocess.run(
install_cmd,
cwd=cwd,
capture_output=capture_output,
text=True,
check=False,
)
def _build_web_ui(web_dir: Path, *, fatal: bool = False) -> bool:
"""Build the web UI frontend if npm is available.
@@ -5081,9 +4973,6 @@ def _build_web_ui(web_dir: Path, *, fatal: bool = False) -> bool:
if not (web_dir / "package.json").exists():
return True
if not _web_ui_build_needed(web_dir):
return True
npm = shutil.which("npm")
if not npm:
if fatal:
@@ -5091,7 +4980,7 @@ def _build_web_ui(web_dir: Path, *, fatal: bool = False) -> bool:
print("Install Node.js, then run: cd web && npm install && npm run build")
return not fatal
print("→ Building web UI...")
r1 = _run_npm_install_deterministic(npm, web_dir, extra_args=("--silent",))
r1 = subprocess.run([npm, "install", "--silent"], cwd=web_dir, capture_output=True)
if r1.returncode != 0:
print(
f" {'' if fatal else ''} Web UI npm install failed"
@@ -5802,10 +5691,12 @@ def _update_node_dependencies() -> None:
if not (path / "package.json").exists():
continue
result = _run_npm_install_deterministic(
npm,
path,
extra_args=("--silent", "--no-fund", "--no-audit", "--progress=false"),
result = subprocess.run(
[npm, "install", "--silent", "--no-fund", "--no-audit", "--progress=false"],
cwd=path,
capture_output=True,
text=True,
check=False,
)
if result.returncode == 0:
print(f"{label}")
@@ -6041,88 +5932,6 @@ def _cmd_update_check():
print(f" Run '{recommended_update_command()}' to install.")
def _ensure_fhs_path_guard() -> None:
"""Ensure /usr/local/bin is on PATH for RHEL-family root non-login shells.
Mirrors the post-symlink probe added to ``scripts/install.sh`` so that
existing FHS-layout root installs on RHEL/CentOS/Rocky/Alma 8+ get
repaired on ``hermes update`` without requiring a reinstall. The
installer's assumption that ``/usr/local/bin`` is on PATH for every
standard shell breaks on those distros in non-login interactive shells
(su, sudo -s, tmux panes, some web terminals): /etc/bashrc doesn't
add /usr/local/bin and /root/.bash_profile doesn't either. Symptom:
``hermes`` prints ``command not found`` even though the symlink lives
at /usr/local/bin/hermes.
Silent no-op on: non-Linux, non-root, non-FHS installs, and any system
where ``bash -i -c 'command -v hermes'`` already resolves. Idempotent.
"""
if sys.platform != "linux":
return
try:
if os.geteuid() != 0:
return
except AttributeError:
return
# Only act when this is actually an FHS-layout install (command link at
# /usr/local/bin/hermes, code at /usr/local/lib/hermes-agent).
fhs_link = Path("/usr/local/bin/hermes")
if not fhs_link.is_symlink() and not fhs_link.exists():
return
# Probe a fresh non-login interactive bash the way the user will use it.
# ``bash -i -c`` sources ~/.bashrc but NOT ~/.bash_profile or /etc/profile,
# which is the exact scenario where RHEL root loses /usr/local/bin.
home = os.environ.get("HOME") or "/root"
try:
probe = subprocess.run(
["env", "-i",
f"HOME={home}",
f"TERM={os.environ.get('TERM', 'dumb')}",
"bash", "-i", "-c", "command -v hermes"],
capture_output=True, text=True, timeout=10,
)
except (FileNotFoundError, subprocess.TimeoutExpired):
return # no bash or probe hung — don't block update on this
if probe.returncode == 0:
return # already on PATH, nothing to do
path_line = 'export PATH="/usr/local/bin:$PATH"'
path_comment = (
"# Hermes Agent — ensure /usr/local/bin is on PATH "
"(RHEL non-login shells)"
)
wrote_any = False
for candidate in (".bashrc", ".bash_profile"):
cfg = Path(home) / candidate
if not cfg.is_file():
continue
try:
existing = cfg.read_text(errors="replace")
except OSError:
continue
# Idempotency: skip if any uncommented PATH= line already references
# /usr/local/bin. Mirrors the grep pattern used by install.sh.
already_guarded = any(
"/usr/local/bin" in line
and "PATH" in line
and not line.lstrip().startswith("#")
for line in existing.splitlines()
)
if already_guarded:
continue
try:
with cfg.open("a", encoding="utf-8") as f:
f.write("\n" + path_comment + "\n" + path_line + "\n")
except OSError as e:
print(f" ⚠ Could not update {cfg}: {e}")
continue
print(f" ✓ Added /usr/local/bin to PATH in {cfg}")
wrote_any = True
if wrote_any:
print(" (reload your shell or run 'source ~/.bashrc' to pick it up)")
def cmd_update(args):
"""Update Hermes Agent to the latest version.
@@ -6566,13 +6375,6 @@ def _cmd_update_impl(args, gateway_mode: bool):
print()
print("✓ Update complete!")
# Repair RHEL-family root installs where /usr/local/bin isn't on PATH
# for non-login interactive shells. No-op on every other platform.
try:
_ensure_fhs_path_guard()
except Exception as e:
logger.debug("FHS PATH guard check failed: %s", e)
# Write exit code *before* the gateway restart attempt.
# When running as ``hermes update --gateway`` (spawned by the gateway's
# /update command), this process lives inside the gateway's systemd
@@ -8003,54 +7805,6 @@ For more help on a command:
)
whatsapp_parser.set_defaults(func=cmd_whatsapp)
# =========================================================================
# slack command
# =========================================================================
slack_parser = subparsers.add_parser(
"slack",
help="Slack integration helpers (manifest generation, etc.)",
description="Slack integration helpers for Hermes.",
)
slack_sub = slack_parser.add_subparsers(dest="slack_command")
slack_manifest = slack_sub.add_parser(
"manifest",
help="Print or write a Slack app manifest with every gateway command "
"registered as a native slash (/btw, /stop, /model, ...)",
description=(
"Generate a Slack app manifest that registers every gateway "
"command in COMMAND_REGISTRY as a first-class Slack slash "
"command (matching Discord and Telegram parity). Paste the "
"output into Slack app config → Features → App Manifest → "
"Edit, then Save. Reinstall the app if Slack prompts for it."
),
)
slack_manifest.add_argument(
"--write",
nargs="?",
const=True,
default=None,
metavar="PATH",
help="Write manifest to a file instead of stdout. With no PATH "
"writes to $HERMES_HOME/slack-manifest.json.",
)
slack_manifest.add_argument(
"--name",
default=None,
help='Bot display name (default: "Hermes")',
)
slack_manifest.add_argument(
"--description",
default=None,
help="Bot description shown in Slack's app directory.",
)
slack_manifest.add_argument(
"--slashes-only",
action="store_true",
help="Emit only the features.slash_commands array (for merging "
"into an existing manifest manually).",
)
slack_parser.set_defaults(func=cmd_slack)
# =========================================================================
# login command
# =========================================================================
@@ -8369,6 +8123,13 @@ For more help on a command:
webhook_parser.set_defaults(func=cmd_webhook)
# =========================================================================
# kanban command — multi-profile collaboration board
# =========================================================================
from hermes_cli.kanban import build_parser as _build_kanban_parser
kanban_parser = _build_kanban_parser(subparsers)
kanban_parser.set_defaults(func=cmd_kanban)
# =========================================================================
# hooks command — shell-hook inspection and management
# =========================================================================
@@ -8682,17 +8443,11 @@ Examples:
skills_install = skills_subparsers.add_parser("install", help="Install a skill")
skills_install.add_argument(
"identifier",
help="Skill identifier (e.g. openai/skills/skill-creator) or a direct HTTP(S) URL to a SKILL.md file",
"identifier", help="Skill identifier (e.g. openai/skills/skill-creator)"
)
skills_install.add_argument(
"--category", default="", help="Category folder to install into"
)
skills_install.add_argument(
"--name",
default="",
help="Override the skill name (useful when installing from a URL whose SKILL.md has no `name:` frontmatter)",
)
skills_install.add_argument(
"--force", action="store_true", help="Install despite blocked scan verdict"
)
@@ -8712,12 +8467,6 @@ Examples:
skills_list.add_argument(
"--source", default="all", choices=["all", "hub", "builtin", "local"]
)
skills_list.add_argument(
"--enabled-only",
action="store_true",
help="Hide disabled skills. Use with -p <profile> to see exactly "
"which skills will load for that profile.",
)
skills_check = skills_subparsers.add_parser(
"check", help="Check installed hub skills for updates"
@@ -9224,7 +8973,7 @@ Examples:
"--source", help="Filter by source (cli, telegram, discord, etc.)"
)
sessions_browse.add_argument(
"--limit", type=int, default=500, help="Max sessions to load (default: 500)"
"--limit", type=int, default=50, help="Max sessions to load (default: 50)"
)
def _confirm_prompt(prompt: str) -> bool:
@@ -9321,8 +9070,7 @@ Examples:
):
print("Cancelled.")
return
sessions_dir = get_hermes_home() / "sessions"
if db.delete_session(resolved_session_id, sessions_dir=sessions_dir):
if db.delete_session(resolved_session_id):
print(f"Deleted session '{resolved_session_id}'.")
else:
print(f"Session '{args.session_id}' not found.")
@@ -9336,9 +9084,7 @@ Examples:
):
print("Cancelled.")
return
sessions_dir = get_hermes_home() / "sessions"
count = db.prune_sessions(older_than_days=days, source=args.source,
sessions_dir=sessions_dir)
count = db.prune_sessions(older_than_days=days, source=args.source)
print(f"Pruned {count} session(s).")
elif action == "rename":
@@ -9356,7 +9102,7 @@ Examples:
print(f"Error: {e}")
elif action == "browse":
limit = getattr(args, "limit", 500) or 500
limit = getattr(args, "limit", 50) or 50
source = getattr(args, "source", None)
_browse_exclude = None if source else ["tool"]
sessions = db.list_sessions_rich(
+4
View File
@@ -33,6 +33,8 @@ COPILOT_REASONING_EFFORTS_O_SERIES = ["low", "medium", "high"]
# (model_id, display description shown in menus)
OPENROUTER_MODELS: list[tuple[str, str]] = [
("moonshotai/kimi-k2.6", "recommended"),
("deepseek/deepseek-v4-pro", ""),
("deepseek/deepseek-v4-flash", ""),
("anthropic/claude-opus-4.7", ""),
("anthropic/claude-opus-4.6", ""),
("anthropic/claude-sonnet-4.6", ""),
@@ -109,6 +111,8 @@ def _codex_curated_models() -> list[str]:
_PROVIDER_MODELS: dict[str, list[str]] = {
"nous": [
"moonshotai/kimi-k2.6",
"deepseek/deepseek-v4-pro",
"deepseek/deepseek-v4-flash",
"xiaomi/mimo-v2.5-pro",
"xiaomi/mimo-v2.5",
"anthropic/claude-opus-4.7",
+8 -16
View File
@@ -9,7 +9,6 @@ from typing import Dict, Iterable, Optional, Set
from hermes_cli.auth import get_nous_auth_status
from hermes_cli.config import get_env_value, load_config
from tools.managed_tool_gateway import is_managed_tool_gateway_ready
from utils import is_truthy_value
from tools.tool_backend_helpers import (
fal_key_is_configured,
has_direct_modal_credentials,
@@ -26,13 +25,6 @@ _DEFAULT_PLATFORM_TOOLSETS = {
}
def _uses_gateway(section: object) -> bool:
"""Return True when a config section explicitly opts into the gateway."""
if not isinstance(section, dict):
return False
return is_truthy_value(section.get("use_gateway"), default=False)
@dataclass(frozen=True)
class NousFeatureState:
key: str
@@ -270,11 +262,11 @@ def get_nous_subscription_features(
# use_gateway flags — when True, the user explicitly opted into the
# Tool Gateway via `hermes model`, so direct credentials should NOT
# prevent gateway routing.
web_use_gateway = _uses_gateway(web_cfg)
tts_use_gateway = _uses_gateway(tts_cfg)
browser_use_gateway = _uses_gateway(browser_cfg)
web_use_gateway = bool(web_cfg.get("use_gateway"))
tts_use_gateway = bool(tts_cfg.get("use_gateway"))
browser_use_gateway = bool(browser_cfg.get("use_gateway"))
image_gen_cfg = config.get("image_gen") if isinstance(config.get("image_gen"), dict) else {}
image_use_gateway = _uses_gateway(image_gen_cfg)
image_use_gateway = bool(image_gen_cfg.get("use_gateway"))
direct_exa = bool(get_env_value("EXA_API_KEY"))
direct_firecrawl = bool(get_env_value("FIRECRAWL_API_KEY") or get_env_value("FIRECRAWL_API_URL"))
@@ -609,10 +601,10 @@ def get_gateway_eligible_tools(
# no direct keys exist — we only skip the prompt for tools where
# use_gateway was explicitly set.
opted_in = {
"web": _uses_gateway(config.get("web")),
"image_gen": _uses_gateway(config.get("image_gen")),
"tts": _uses_gateway(config.get("tts")),
"browser": _uses_gateway(config.get("browser")),
"web": bool((config.get("web") if isinstance(config.get("web"), dict) else {}).get("use_gateway")),
"image_gen": bool((config.get("image_gen") if isinstance(config.get("image_gen"), dict) else {}).get("use_gateway")),
"tts": bool((config.get("tts") if isinstance(config.get("tts"), dict) else {}).get("use_gateway")),
"browser": bool((config.get("browser") if isinstance(config.get("browser"), dict) else {}).get("use_gateway")),
}
unconfigured: list[str] = []
-1
View File
@@ -36,7 +36,6 @@ PLATFORMS: OrderedDict[str, PlatformInfo] = OrderedDict([
("wecom_callback", PlatformInfo(label="💬 WeCom Callback", default_toolset="hermes-wecom-callback")),
("weixin", PlatformInfo(label="💬 Weixin", default_toolset="hermes-weixin")),
("qqbot", PlatformInfo(label="💬 QQBot", default_toolset="hermes-qqbot")),
("yuanbao", PlatformInfo(label="🤖 Yuanbao", default_toolset="hermes-yuanbao")),
("webhook", PlatformInfo(label="🔗 Webhook", default_toolset="hermes-webhook")),
("api_server", PlatformInfo(label="🌐 API Server", default_toolset="hermes-api-server")),
("cron", PlatformInfo(label="⏰ Cron", default_toolset="hermes-cron")),
+14 -69
View File
@@ -1856,32 +1856,27 @@ def _setup_slack():
if existing:
print_info("Slack: already configured")
if not prompt_yes_no("Reconfigure Slack?", False):
# Even without reconfiguring, offer to refresh the manifest so
# new commands (e.g. /btw, /stop, ...) get registered in Slack.
if prompt_yes_no(
"Regenerate the Slack app manifest with the latest command "
"list? (recommended after `hermes update`)",
True,
):
_write_slack_manifest_and_instruct()
return
print_info("Steps to create a Slack app:")
print_info(" 1. Go to https://api.slack.com/apps → Create New App")
print_info(" Pick 'From an app manifest' — we'll generate one for you below.")
print_info(" 1. Go to https://api.slack.com/apps → Create New App (from scratch)")
print_info(" 2. Enable Socket Mode: Settings → Socket Mode → Enable")
print_info(" • Create an App-Level Token with 'connections:write' scope")
print_info(" 3. Install to Workspace: Settings → Install App")
print_info(" 4. After installing, invite the bot to channels: /invite @YourBot")
print_info(" 3. Add Bot Token Scopes: Features → OAuth & Permissions")
print_info(" Required scopes: chat:write, app_mentions:read,")
print_info(" channels:history, channels:read, im:history,")
print_info(" im:read, im:write, users:read, files:read, files:write")
print_info(" Optional for private channels: groups:history")
print_info(" 4. Subscribe to Events: Features → Event Subscriptions → Enable")
print_info(" Required events: message.im, message.channels, app_mention")
print_info(" Optional for private channels: message.groups")
print_warning(" ⚠ Without message.channels the bot will ONLY work in DMs,")
print_warning(" not public channels.")
print_info(" 5. Install to Workspace: Settings → Install App")
print_info(" 6. Reinstall the app after any scope or event changes")
print_info(" 7. After installing, invite the bot to channels: /invite @YourBot")
print()
print_info(" Full guide: https://hermes-agent.nousresearch.com/docs/user-guide/messaging/slack/")
print()
# Generate and write manifest up-front so the user can paste it into
# the "Create from manifest" flow instead of clicking through scopes /
# events / slash commands one at a time.
_write_slack_manifest_and_instruct()
print()
bot_token = prompt("Slack Bot Token (xoxb-...)", password=True)
if not bot_token:
@@ -1907,49 +1902,6 @@ def _setup_slack():
print_info(" Set SLACK_ALLOW_ALL_USERS=true or GATEWAY_ALLOW_ALL_USERS=true only if you intentionally want open workspace access.")
def _write_slack_manifest_and_instruct():
"""Generate the Slack manifest, write it under HERMES_HOME, and print
paste-into-Slack instructions.
Exposed as its own helper so both the initial setup flow and the
"reconfigure? → no" branch can refresh the manifest without the user
re-entering tokens. Failures are non-fatal if the manifest write
fails for any reason, we print a warning and skip rather than abort
the whole Slack setup.
"""
try:
from hermes_cli.slack_cli import _build_full_manifest
from hermes_constants import get_hermes_home
manifest = _build_full_manifest(
bot_name="Hermes",
bot_description="Your Hermes agent on Slack",
)
target = Path(get_hermes_home()) / "slack-manifest.json"
target.parent.mkdir(parents=True, exist_ok=True)
import json as _json
target.write_text(
_json.dumps(manifest, indent=2, ensure_ascii=False) + "\n",
encoding="utf-8",
)
print_success(f"Slack app manifest written to: {target}")
print_info(
" Paste it into https://api.slack.com/apps → your app → Features "
"→ App Manifest → Edit, then Save. Slack will prompt to "
"reinstall if scopes or slash commands changed."
)
print_info(
" Re-run `hermes slack manifest --write` anytime to refresh after "
"Hermes adds new commands."
)
except Exception as exc: # pragma: no cover - best-effort UX helper
print_warning(f"Couldn't write Slack manifest: {exc}")
print_info(
" You can generate it manually later with: "
"hermes slack manifest --write"
)
def _setup_matrix():
"""Configure Matrix credentials."""
print_header("Matrix")
@@ -2133,12 +2085,6 @@ def _setup_feishu():
_gateway_setup_feishu()
def _setup_yuanbao():
"""Configure Yuanbao via gateway setup."""
from hermes_cli.gateway import _setup_yuanbao as _gateway_setup_yuanbao
_gateway_setup_yuanbao()
def _setup_wecom():
"""Configure WeCom (Enterprise WeChat) via gateway setup."""
from hermes_cli.gateway import _setup_wecom as _gateway_setup_wecom
@@ -2283,7 +2229,6 @@ _GATEWAY_PLATFORMS = [
("WhatsApp", "WHATSAPP_ENABLED", _setup_whatsapp),
("DingTalk", "DINGTALK_CLIENT_ID", _setup_dingtalk),
("Feishu / Lark", "FEISHU_APP_ID", _setup_feishu),
("Yuanbao", "YUANBAO_APP_ID", _setup_yuanbao),
("WeCom (Enterprise WeChat)", "WECOM_BOT_ID", _setup_wecom),
("WeCom Callback (Self-Built App)", "WECOM_CALLBACK_CORP_ID", _setup_wecom_callback),
("Weixin (WeChat)", "WEIXIN_ACCOUNT_ID", _setup_weixin),
+20 -230
View File
@@ -11,10 +11,9 @@ handler are thin wrappers that parse args and delegate.
"""
import json
import re
import shutil
from pathlib import Path
from typing import Any, Dict, List, Optional
from typing import Any, Dict, Optional
from rich.console import Console
from rich.panel import Panel
@@ -142,103 +141,6 @@ def _derive_category_from_install_path(install_path: str) -> str:
return "" if parent == "." else parent
# ---------------------------------------------------------------------------
# Interactive name/category resolution for URL-installed skills
# ---------------------------------------------------------------------------
_VALID_NAME_RE = re.compile(r"^[a-z][a-z0-9_-]*$")
_VALID_CATEGORY_RE = re.compile(r"^[a-z][a-z0-9_/-]*$")
def _is_valid_installed_skill_name(name: str) -> bool:
"""Accept identifier-shaped names, reject empty / sentinel-y values."""
if not isinstance(name, str):
return False
candidate = name.strip().lower()
if not candidate or candidate in {"skill", "readme", "index", "unnamed-skill"}:
return False
return bool(_VALID_NAME_RE.match(candidate))
def _existing_categories() -> List[str]:
"""Return sorted subdirectory names under ``~/.hermes/skills/`` that look
like category buckets (contain at least one ``SKILL.md`` somewhere below).
Used to suggest reusable categories when interactively installing from a
URL. Hidden dirs (``.hub``, ``.trash``) are skipped.
"""
from tools.skills_hub import SKILLS_DIR
out: List[str] = []
try:
for entry in SKILLS_DIR.iterdir():
if not entry.is_dir() or entry.name.startswith("."):
continue
# Only count as a category if it contains skills, not if it IS a skill.
# Heuristic: if ``<entry>/SKILL.md`` exists, it's a skill at the
# top level (no category); otherwise treat as a category bucket.
if (entry / "SKILL.md").exists():
continue
# Has at least one nested SKILL.md?
try:
if any(entry.rglob("SKILL.md")):
out.append(entry.name)
except OSError:
continue
except (FileNotFoundError, OSError):
return []
return sorted(set(out))
def _prompt_for_skill_name(c: Console, url: str, default: str = "") -> Optional[str]:
"""Prompt interactively for a skill name. Returns None on cancel/EOF."""
c.print()
c.print(
f"[yellow]The SKILL.md at {url} doesn't declare a `name:` in its "
f"frontmatter,[/]\n[yellow]and the URL path doesn't produce a valid "
f"identifier either.[/]"
)
default_hint = f" [{default}]" if default else ""
c.print(
f"[bold]Enter a skill name{default_hint}:[/] "
f"[dim](lowercase letters, digits, hyphens, underscores; starts with a letter)[/]"
)
try:
answer = input("Name: ").strip()
except (EOFError, KeyboardInterrupt):
return None
if not answer and default:
answer = default
if not _is_valid_installed_skill_name(answer):
c.print(f"[bold red]Invalid name:[/] {answer!r}. Aborting install.\n")
return None
return answer
def _prompt_for_category(c: Console, existing: List[str]) -> str:
"""Prompt interactively for a category. Empty/None input means flat install."""
c.print()
if existing:
c.print(
"[bold]Pick a category[/] "
"[dim](reuse an existing bucket, type a new one, or press Enter to install flat)[/]"
)
c.print(f"[dim]Existing: {', '.join(existing)}[/]")
else:
c.print(
"[bold]Category[/] [dim](optional — press Enter to install flat at ~/.hermes/skills/<name>/)[/]"
)
try:
answer = input("Category: ").strip()
except (EOFError, KeyboardInterrupt):
return ""
if not answer:
return ""
if not _VALID_CATEGORY_RE.match(answer):
c.print(f"[dim]Invalid category {answer!r} — installing flat.[/]")
return ""
return answer
def do_search(query: str, source: str = "all", limit: int = 10,
console: Optional[Console] = None) -> None:
"""Search registries and display results as a Rich table."""
@@ -407,17 +309,8 @@ def do_browse(page: int = 1, page_size: int = 20, source: str = "all",
def do_install(identifier: str, category: str = "", force: bool = False,
console: Optional[Console] = None, skip_confirm: bool = False,
invalidate_cache: bool = True,
name_override: str = "") -> None:
"""Fetch, quarantine, scan, confirm, and install a skill.
``name_override`` lets non-interactive callers (slash commands, gateway,
scripts) supply a skill name when the upstream SKILL.md lacks a valid
``name:`` frontmatter field. On interactive TTY surfaces, a missing name
triggers a prompt instead; ``skip_confirm=True`` means "non-interactive"
(so pair it with ``name_override`` when installing from a URL that has
no frontmatter).
"""
invalidate_cache: bool = True) -> None:
"""Fetch, quarantine, scan, confirm, and install a skill."""
from tools.skills_hub import (
GitHubAuth, create_source_router, ensure_hub_dirs,
quarantine_bundle, install_from_quarantine, HubLockFile,
@@ -461,58 +354,6 @@ def do_install(identifier: str, category: str = "", force: bool = False,
c.print()
return
# URL-sourced skills may arrive with an empty name when SKILL.md has no
# ``name:`` in frontmatter AND the URL path doesn't yield a valid
# identifier. Resolve by (1) --name override, (2) interactive prompt on
# a TTY, (3) refuse with an actionable error on non-interactive surfaces.
bundle_meta = getattr(bundle, "metadata", {}) or {}
if bundle.source == "url" and (not bundle.name or bundle_meta.get("awaiting_name")):
if name_override and _is_valid_installed_skill_name(name_override):
bundle.name = name_override.strip()
bundle_meta["awaiting_name"] = False
elif name_override:
c.print(
f"[bold red]Invalid --name:[/] {name_override!r}. "
"Must be a lowercase identifier (letters, digits, hyphens, "
"underscores; starts with a letter).\n"
)
return
elif skip_confirm:
# Non-interactive surface (slash command / TUI / gateway). Can't
# prompt — emit an actionable error.
url = bundle_meta.get("url") or identifier
c.print(
f"[bold red]Cannot install from URL:[/] {url}\n"
"[yellow]The SKILL.md has no `name:` in its frontmatter, "
"and the URL path doesn't produce a valid identifier.[/]\n\n"
"Retry with an explicit name:\n"
f" [bold]/skills install {url} --name <your-name>[/]\n"
f" [bold]hermes skills install {url} --name <your-name>[/]\n\n"
"[dim]Or ask the SKILL.md's author to add a `name:` field to "
"its YAML frontmatter.[/]\n"
)
return
else:
# Interactive TTY — prompt.
url = bundle_meta.get("url") or identifier
chosen = _prompt_for_skill_name(c, url)
if not chosen:
c.print("[dim]Installation cancelled.[/]\n")
return
bundle.name = chosen
bundle_meta["awaiting_name"] = False
# Keep SkillMeta in sync so downstream "already installed" checks,
# audit logs, and display all see the final name.
if meta is not None:
meta.name = bundle.name
meta.path = bundle.name
# URL-sourced skills: offer to pick a category interactively when the
# caller didn't specify one (TTY only — non-interactive installs fall
# through to flat install, matching all other sources).
if bundle.source == "url" and not category and not skip_confirm:
category = _prompt_for_category(c, _existing_categories())
# Auto-detect category for official skills (e.g. "official/autonomous-ai-agents/blackbox")
if bundle.source == "official" and not category:
id_parts = bundle.identifier.split("/") # ["official", "category", "skill"]
@@ -758,24 +599,11 @@ def inspect_skill(identifier: str) -> Optional[dict]:
return out
def do_list(source_filter: str = "all",
enabled_only: bool = False,
console: Optional[Console] = None) -> None:
"""List installed skills, distinguishing hub, builtin, and local skills.
Args:
source_filter: ``all`` | ``hub`` | ``builtin`` | ``local``.
enabled_only: If True, hide disabled skills from the output.
Enabled/disabled state is resolved against the currently active profile's
config ``hermes -p <profile> skills list`` reads that profile's
``skills.disabled`` list because ``-p`` swaps ``HERMES_HOME`` at process
start. No explicit profile flag needed here.
"""
def do_list(source_filter: str = "all", console: Optional[Console] = None) -> None:
"""List installed skills, distinguishing hub, builtin, and local skills."""
from tools.skills_hub import HubLockFile, ensure_hub_dirs
from tools.skills_sync import _read_manifest
from tools.skills_tool import _find_all_skills
from agent.skill_utils import get_disabled_skill_names
c = console or _console
ensure_hub_dirs()
@@ -783,26 +611,17 @@ def do_list(source_filter: str = "all",
hub_installed = {e["name"]: e for e in lock.list_installed()}
builtin_names = set(_read_manifest())
# Pull ALL skills (including disabled ones) so we can annotate status.
all_skills = _find_all_skills(skip_disabled=True)
disabled_names = get_disabled_skill_names()
all_skills = _find_all_skills()
title = "Installed Skills"
if enabled_only:
title += " (enabled only)"
table = Table(title=title)
table = Table(title="Installed Skills")
table.add_column("Name", style="bold cyan")
table.add_column("Category", style="dim")
table.add_column("Source", style="dim")
table.add_column("Trust", style="dim")
table.add_column("Status", style="dim")
hub_count = 0
builtin_count = 0
local_count = 0
enabled_count = 0
disabled_count = 0
for skill in sorted(all_skills, key=lambda s: (s.get("category") or "", s["name"])):
name = skill["name"]
@@ -813,48 +632,29 @@ def do_list(source_filter: str = "all",
source_type = "hub"
source_display = hub_entry.get("source", "hub")
trust = hub_entry.get("trust_level", "community")
hub_count += 1
elif name in builtin_names:
source_type = "builtin"
source_display = "builtin"
trust = "builtin"
builtin_count += 1
else:
source_type = "local"
source_display = "local"
trust = "local"
local_count += 1
if source_filter != "all" and source_filter != source_type:
continue
is_enabled = name not in disabled_names
if enabled_only and not is_enabled:
continue
if source_type == "hub":
hub_count += 1
elif source_type == "builtin":
builtin_count += 1
else:
local_count += 1
if is_enabled:
enabled_count += 1
status_cell = "[bold green]enabled[/]"
else:
disabled_count += 1
status_cell = "[dim red]disabled[/]"
trust_style = {"builtin": "bright_cyan", "trusted": "green", "community": "yellow", "local": "dim"}.get(trust, "dim")
trust_label = "official" if source_display == "official" else trust
table.add_row(name, category, source_display, f"[{trust_style}]{trust_label}[/]", status_cell)
table.add_row(name, category, source_display, f"[{trust_style}]{trust_label}[/]")
c.print(table)
summary = f"[dim]{hub_count} hub-installed, {builtin_count} builtin, {local_count} local"
if enabled_only:
summary += f"{enabled_count} enabled shown"
else:
summary += f"{enabled_count} enabled, {disabled_count} disabled"
summary += "[/]\n"
c.print(summary)
c.print(
f"[dim]{hub_count} hub-installed, {builtin_count} builtin, {local_count} local[/]\n"
)
def do_check(name: Optional[str] = None, console: Optional[Console] = None) -> None:
@@ -1323,15 +1123,11 @@ def skills_command(args) -> None:
do_search(args.query, source=args.source, limit=args.limit)
elif action == "install":
do_install(args.identifier, category=args.category, force=args.force,
skip_confirm=getattr(args, "yes", False),
name_override=getattr(args, "name", "") or "")
skip_confirm=getattr(args, "yes", False))
elif action == "inspect":
do_inspect(args.identifier)
elif action == "list":
do_list(
source_filter=args.source,
enabled_only=getattr(args, "enabled_only", False),
)
do_list(source_filter=args.source)
elif action == "check":
do_check(name=getattr(args, "name", None))
elif action == "update":
@@ -1381,7 +1177,6 @@ def handle_skills_slash(cmd: str, console: Optional[Console] = None) -> None:
/skills search kubernetes
/skills install openai/skills/skill-creator
/skills install openai/skills/skill-creator --force
/skills install https://example.com/path/SKILL.md
/skills inspect openai/skills/skill-creator
/skills list
/skills list --source hub
@@ -1458,11 +1253,10 @@ def handle_skills_slash(cmd: str, console: Optional[Console] = None) -> None:
elif action == "install":
if not args:
c.print("[bold red]Usage:[/] /skills install <identifier-or-url> [--name <name>] [--category <cat>] [--force] [--now]\n")
c.print("[bold red]Usage:[/] /skills install <identifier> [--category <cat>] [--force] [--now]\n")
return
identifier = args[0]
category = ""
name_override = ""
# Slash commands run inside prompt_toolkit where input() hangs.
# Always skip confirmation — the user typing the command is implicit consent.
skip_confirm = True
@@ -1473,11 +1267,9 @@ def handle_skills_slash(cmd: str, console: Optional[Console] = None) -> None:
for i, a in enumerate(args):
if a == "--category" and i + 1 < len(args):
category = args[i + 1]
elif a == "--name" and i + 1 < len(args):
name_override = args[i + 1]
do_install(identifier, category=category, force=force,
skip_confirm=skip_confirm, invalidate_cache=invalidate_cache,
name_override=name_override, console=c)
console=c)
elif action == "inspect":
if not args:
@@ -1487,12 +1279,11 @@ def handle_skills_slash(cmd: str, console: Optional[Console] = None) -> None:
elif action == "list":
source_filter = "all"
enabled_only = "--enabled-only" in args or "--enabled" in args
if "--source" in args:
idx = args.index("--source")
if idx + 1 < len(args):
source_filter = args[idx + 1]
do_list(source_filter=source_filter, enabled_only=enabled_only, console=c)
do_list(source_filter=source_filter, console=c)
elif action == "check":
name = args[0] if args else None
@@ -1580,8 +1371,7 @@ def _print_skills_help(console: Console) -> None:
" [cyan]search[/] <query> Search registries for skills\n"
" [cyan]install[/] <identifier> Install a skill (with security scan)\n"
" [cyan]inspect[/] <identifier> Preview a skill without installing\n"
" [cyan]list[/] [--source hub|builtin|local] [--enabled-only]\n"
" List installed skills; --enabled-only filters to the active profile's live set\n"
" [cyan]list[/] [--source hub|builtin|local] List installed skills\n"
" [cyan]check[/] [name] Check hub skills for upstream updates\n"
" [cyan]update[/] [name] Update hub skills with upstream changes\n"
" [cyan]audit[/] [name] Re-scan hub skills for security\n"
-152
View File
@@ -1,152 +0,0 @@
"""``hermes slack ...`` CLI subcommands.
Today only ``hermes slack manifest`` is implemented it generates the
Slack app manifest JSON for registering every gateway command as a native
Slack slash (``/btw``, ``/stop``, ``/model``, ) so users get the same
first-class slash UX Discord and Telegram already have.
Typical workflow::
$ hermes slack manifest > slack-manifest.json
# or:
$ hermes slack manifest --write
Then paste the printed JSON into the Slack app config (Features App
Manifest Edit) and click Save. Slack diffs the manifest and prompts
for reinstall when scopes/commands change.
"""
from __future__ import annotations
import json
import sys
from pathlib import Path
def _build_full_manifest(bot_name: str, bot_description: str) -> dict:
"""Build a full Slack manifest merging display info + our slash list.
The slash-command list is always generated from ``COMMAND_REGISTRY`` so
it stays in sync with the rest of Hermes. Other manifest sections
(display info, OAuth scopes, socket mode) are set to sensible defaults
for a Hermes deployment users can tweak them in the Slack UI after
pasting.
"""
from hermes_cli.commands import slack_app_manifest
partial = slack_app_manifest()
slashes = partial["features"]["slash_commands"]
return {
"_metadata": {
"major_version": 1,
"minor_version": 1,
},
"display_information": {
"name": bot_name[:35],
"description": (bot_description or "Your Hermes agent on Slack")[:140],
"background_color": "#1a1a2e",
},
"features": {
"bot_user": {
"display_name": bot_name[:80],
"always_online": True,
},
"slash_commands": slashes,
"assistant_view": {
"assistant_description": "Chat with Hermes in threads and DMs.",
},
},
"oauth_config": {
"scopes": {
"bot": [
"app_mentions:read",
"assistant:write",
"channels:history",
"channels:read",
"chat:write",
"commands",
"files:read",
"files:write",
"groups:history",
"im:history",
"im:read",
"im:write",
"users:read",
],
},
},
"settings": {
"event_subscriptions": {
"bot_events": [
"app_mention",
"assistant_thread_context_changed",
"assistant_thread_started",
"message.channels",
"message.groups",
"message.im",
],
},
"interactivity": {
"is_enabled": True,
},
"org_deploy_enabled": False,
"socket_mode_enabled": True,
"token_rotation_enabled": False,
},
}
def slack_manifest_command(args) -> int:
"""Print or write a Slack app manifest JSON.
Flags (all parsed in ``hermes_cli/main.py``):
--write [PATH] Write to file instead of stdout (default path:
``$HERMES_HOME/slack-manifest.json``)
--name NAME Override the bot display name (default: "Hermes")
--description DESC Override the bot description
--slashes-only Emit only the ``features.slash_commands`` array (for
merging into an existing manifest manually)
"""
name = getattr(args, "name", None) or "Hermes"
description = getattr(args, "description", None) or "Your Hermes agent on Slack"
if getattr(args, "slashes_only", False):
from hermes_cli.commands import slack_app_manifest
manifest = slack_app_manifest()["features"]["slash_commands"]
else:
manifest = _build_full_manifest(name, description)
payload = json.dumps(manifest, indent=2, ensure_ascii=False) + "\n"
write_target = getattr(args, "write", None)
if write_target is not None:
if isinstance(write_target, bool) and write_target:
# --write with no value → default location
try:
from hermes_constants import get_hermes_home
target = Path(get_hermes_home()) / "slack-manifest.json"
except Exception:
target = Path.home() / ".hermes" / "slack-manifest.json"
else:
target = Path(write_target).expanduser()
target.parent.mkdir(parents=True, exist_ok=True)
target.write_text(payload, encoding="utf-8")
print(f"Slack manifest written to: {target}", file=sys.stderr)
print(
"\nNext steps:\n"
" 1. Open https://api.slack.com/apps and pick your Hermes app\n"
" (or create a new one: Create New App → From an app manifest).\n"
f" 2. Features → App Manifest → paste the contents of\n"
f" {target}\n"
" 3. Save; Slack will prompt to reinstall the app if scopes or\n"
" slash commands changed.\n"
" 4. Make sure Socket Mode is enabled and you have a bot token\n"
" (xoxb-...) and app token (xapp-...) configured via\n"
" `hermes setup`.\n",
file=sys.stderr,
)
else:
sys.stdout.write(payload)
return 0
+1 -2
View File
@@ -326,8 +326,7 @@ def show_status(args):
"WeCom Callback": ("WECOM_CALLBACK_CORP_ID", None),
"Weixin": ("WEIXIN_ACCOUNT_ID", "WEIXIN_HOME_CHANNEL"),
"BlueBubbles": ("BLUEBUBBLES_SERVER_URL", "BLUEBUBBLES_HOME_CHANNEL"),
"QQBot": ("QQ_APP_ID", "QQ_HOME_CHANNEL"),
"Yuanbao": ("YUANBAO_APP_ID", "YUANBAO_HOME_CHANNEL"),
"QQBot": ("QQ_APP_ID", "QQBOT_HOME_CHANNEL"),
}
for name, (token_var, home_var) in platforms.items():
+4 -4
View File
@@ -20,10 +20,10 @@ def get_provider_request_timeout(
try:
from hermes_cli.config import load_config
config = load_config()
except Exception:
except ImportError:
return None
config = load_config()
providers = config.get("providers", {}) if isinstance(config, dict) else {}
provider_config = (
providers.get(provider_id, {}) if isinstance(providers, dict) else {}
@@ -49,10 +49,10 @@ def get_provider_stale_timeout(
try:
from hermes_cli.config import load_config
config = load_config()
except Exception:
except ImportError:
return None
config = load_config()
providers = config.get("providers", {}) if isinstance(config, dict) else {}
provider_config = (
providers.get(provider_id, {}) if isinstance(providers, dict) else {}
+1 -1
View File
@@ -106,7 +106,7 @@ TIPS = [
"Set display.streaming: true to see tokens appear in real time as the model generates.",
"Set display.show_reasoning: true to watch the model's chain-of-thought reasoning.",
"Set display.compact: true to reduce whitespace in output for denser information.",
"Set display.busy_input_mode: queue to queue messages instead of interrupting the agent, or steer to inject them mid-run via /steer.",
"Set display.busy_input_mode: queue to queue messages instead of interrupting the agent.",
"Set display.resume_display: minimal to skip the full conversation recap on session resume.",
"Set compression.threshold: 0.50 to control when auto-compression fires (default: 50% of context).",
"Set agent.max_turns: 200 to let the agent take more tool-calling steps per turn.",
+3 -14
View File
@@ -11,7 +11,6 @@ the `platform_toolsets` key.
import json as _json
import logging
import os
import sys
from pathlib import Path
from typing import Dict, List, Optional, Set
@@ -26,7 +25,7 @@ from hermes_cli.nous_subscription import (
get_nous_subscription_features,
)
from tools.tool_backend_helpers import fal_key_is_configured, managed_nous_tools_enabled
from utils import base_url_hostname, is_truthy_value
from utils import base_url_hostname
logger = logging.getLogger(__name__)
@@ -71,7 +70,6 @@ CONFIGURABLE_TOOLSETS = [
("spotify", "🎵 Spotify", "playback, search, playlists, library"),
("discord", "💬 Discord (read/participate)", "fetch messages, search members, create thread"),
("discord_admin", "🛡️ Discord Server Admin", "list channels/roles, pin, assign roles"),
("yuanbao", "🤖 Yuanbao", "group info, member queries, DM"),
]
# Toolsets that are OFF by default for new installs.
@@ -678,15 +676,6 @@ def _get_platform_tools(
# their own platform (e.g. `discord` + `discord` should stay OFF).
if platform in default_off and platform not in _TOOLSET_PLATFORM_RESTRICTIONS:
default_off.remove(platform)
# Home Assistant is already runtime-gated by its check_fn (requires
# HASS_TOKEN to register any tools). When a user has configured
# HASS_TOKEN, they've explicitly opted in — don't also strip it via
# _DEFAULT_OFF_TOOLSETS, which would silently drop HA from platforms
# (e.g. cron) that run through _get_platform_tools without an
# explicit saved toolset list. Without this, Norbert's HA cron jobs
# regressed after #14798 made cron honor per-platform tool config.
if "homeassistant" in default_off and os.getenv("HASS_TOKEN"):
default_off.remove("homeassistant")
enabled_toolsets -= default_off
# Recover non-configurable platform toolsets (e.g. discord, feishu_doc,
@@ -1188,7 +1177,7 @@ def _is_provider_active(provider: dict, config: dict) -> bool:
configured_provider = image_cfg.get("provider")
if configured_provider not in (None, "", "fal"):
return False
if image_cfg.get("use_gateway") is not None and not is_truthy_value(image_cfg.get("use_gateway"), default=False):
if image_cfg.get("use_gateway") is False:
return False
return feature.managed_by_nous
if provider.get("tts_provider"):
@@ -1220,7 +1209,7 @@ def _is_provider_active(provider: dict, config: dict) -> bool:
return (
provider["imagegen_backend"] == "fal"
and configured_provider in (None, "", "fal")
and not is_truthy_value(image_cfg.get("use_gateway"), default=False)
and not image_cfg.get("use_gateway")
)
return False
+23 -11
View File
@@ -287,7 +287,7 @@ _SCHEMA_OVERRIDES: Dict[str, Dict[str, Any]] = {
"display.busy_input_mode": {
"type": "select",
"description": "Input behavior while agent is running",
"options": ["interrupt", "queue", "steer"],
"options": ["interrupt", "queue"],
},
"memory.provider": {
"type": "select",
@@ -2327,14 +2327,16 @@ def _resolve_chat_argv(
from hermes_cli.main import PROJECT_ROOT, _make_tui_argv
argv, cwd = _make_tui_argv(PROJECT_ROOT / "ui-tui", tui_dev=False)
env = os.environ.copy()
env.setdefault("NODE_ENV", "production")
env: Optional[dict] = None
if resume:
env["HERMES_TUI_RESUME"] = resume
if resume or sidecar_url:
env = os.environ.copy()
if sidecar_url:
env["HERMES_TUI_SIDECAR_URL"] = sidecar_url
if resume:
env["HERMES_TUI_RESUME"] = resume
if sidecar_url:
env["HERMES_TUI_SIDECAR_URL"] = sidecar_url
return list(argv), str(cwd) if cwd else None, env
@@ -3101,13 +3103,23 @@ def _mount_plugin_api_routes():
_log.warning("Plugin %s declares api=%s but file not found", plugin["name"], api_file_name)
continue
try:
spec = importlib.util.spec_from_file_location(
f"hermes_dashboard_plugin_{plugin['name']}", api_path,
)
module_name = f"hermes_dashboard_plugin_{plugin['name']}"
spec = importlib.util.spec_from_file_location(module_name, api_path)
if spec is None or spec.loader is None:
continue
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
# Register in sys.modules BEFORE exec_module so pydantic/FastAPI
# can resolve forward references (e.g. models defined in a file
# that uses `from __future__ import annotations`). Without this,
# TypeAdapter lazy-build fails at first request with
# "is not fully defined" because the module namespace isn't
# reachable by name for string-annotation resolution.
sys.modules[module_name] = mod
try:
spec.loader.exec_module(mod)
except Exception:
sys.modules.pop(module_name, None)
raise
router = getattr(mod, "router", None)
if router is None:
_log.warning("Plugin %s api file has no 'router' attribute", plugin["name"])
+4 -3
View File
@@ -195,6 +195,10 @@ def setup_logging(
The ``logs/`` directory where files are written.
"""
global _logging_initialized
if _logging_initialized and not force:
home = hermes_home or get_hermes_home()
return home / "logs"
home = hermes_home or get_hermes_home()
log_dir = home / "logs"
log_dir.mkdir(parents=True, exist_ok=True)
@@ -244,9 +248,6 @@ def setup_logging(
log_filter=_ComponentFilter(COMPONENT_PREFIXES["gateway"]),
)
if _logging_initialized and not force:
return log_dir
# Ensure root logger level is low enough for the handlers to fire.
if root.level == logging.NOTSET or root.level > level:
root.setLevel(level)
+14 -129
View File
@@ -832,18 +832,7 @@ class SessionDB:
params = []
if not include_children:
# Show root sessions and branch sessions (whose parent ended with
# end_reason='branched' before the child was created), while still
# hiding sub-agent runs and compression continuations (which also
# carry a parent_session_id but were spawned while the parent was
# still live — i.e., started_at < parent.ended_at).
where_clauses.append(
"(s.parent_session_id IS NULL"
" OR EXISTS (SELECT 1 FROM sessions p"
" WHERE p.id = s.parent_session_id"
" AND p.end_reason = 'branched'"
" AND s.started_at >= p.ended_at))"
)
where_clauses.append("s.parent_session_id IS NULL")
if source:
where_clauses.append("s.source = ?")
@@ -1132,27 +1121,20 @@ class SessionDB:
current = child_id
return session_id
def get_messages_as_conversation(
self, session_id: str, include_ancestors: bool = False
) -> List[Dict[str, Any]]:
def get_messages_as_conversation(self, session_id: str) -> List[Dict[str, Any]]:
"""
Load messages in the OpenAI conversation format (role + content dicts).
Used by the gateway to restore conversation history.
"""
session_ids = [session_id]
if include_ancestors:
session_ids = self._session_lineage_root_to_tip(session_id)
with self._lock:
placeholders = ",".join("?" for _ in session_ids)
rows = self._conn.execute(
cursor = self._conn.execute(
"SELECT role, content, tool_call_id, tool_calls, tool_name, "
"reasoning, reasoning_content, reasoning_details, codex_reasoning_items, "
"codex_message_items "
f"FROM messages WHERE session_id IN ({placeholders}) ORDER BY timestamp, id",
tuple(session_ids),
).fetchall()
"FROM messages WHERE session_id = ? ORDER BY timestamp, id",
(session_id,),
)
rows = cursor.fetchall()
messages = []
for row in rows:
msg = {"role": row["role"], "content": row["content"]}
@@ -1192,47 +1174,9 @@ class SessionDB:
except (json.JSONDecodeError, TypeError):
logger.warning("Failed to deserialize codex_message_items, falling back to None")
msg["codex_message_items"] = None
if include_ancestors and self._is_duplicate_replayed_user_message(messages, msg):
continue
messages.append(msg)
return messages
def _session_lineage_root_to_tip(self, session_id: str) -> List[str]:
if not session_id:
return [session_id]
chain = []
current = session_id
seen = set()
with self._lock:
for _ in range(100):
if not current or current in seen:
break
seen.add(current)
chain.append(current)
row = self._conn.execute(
"SELECT parent_session_id FROM sessions WHERE id = ?",
(current,),
).fetchone()
if row is None:
break
current = row["parent_session_id"] if hasattr(row, "keys") else row[0]
return list(reversed(chain)) or [session_id]
@staticmethod
def _is_duplicate_replayed_user_message(messages: List[Dict[str, Any]], msg: Dict[str, Any]) -> bool:
if msg.get("role") != "user":
return False
content = msg.get("content")
if not isinstance(content, str) or not content:
return False
for prev in reversed(messages):
if prev.get("role") == "user" and prev.get("content") == content:
return True
if prev.get("role") == "assistant" and (prev.get("content") or prev.get("tool_calls")):
return False
return False
# =========================================================================
# Search
# =========================================================================
@@ -1557,45 +1501,12 @@ class SessionDB:
)
self._execute_write(_do)
@staticmethod
def _remove_session_files(sessions_dir: Optional[Path], session_id: str) -> None:
"""Remove on-disk transcript files for a session.
Cleans up ``{session_id}.json``, ``{session_id}.jsonl``, and any
``request_dump_{session_id}_*.json`` files left by the gateway.
Silently skips files that don't exist and swallows OSError so a
filesystem hiccup never blocks a DB operation.
"""
if sessions_dir is None:
return
for suffix in (".json", ".jsonl"):
p = sessions_dir / f"{session_id}{suffix}"
try:
p.unlink(missing_ok=True)
except OSError:
pass
# request_dump files use session_id as a prefix component
try:
for p in sessions_dir.glob(f"request_dump_{session_id}_*.json"):
try:
p.unlink(missing_ok=True)
except OSError:
pass
except OSError:
pass
def delete_session(
self,
session_id: str,
sessions_dir: Optional[Path] = None,
) -> bool:
def delete_session(self, session_id: str) -> bool:
"""Delete a session and all its messages.
Child sessions are orphaned (parent_session_id set to NULL) rather
than cascade-deleted, so they remain accessible independently.
When *sessions_dir* is provided, also removes on-disk transcript
files (``.json`` / ``.jsonl`` / ``request_dump_*``) for the deleted
session. Returns True if the session was found and deleted.
Returns True if the session was found and deleted.
"""
def _do(conn):
cursor = conn.execute(
@@ -1612,29 +1523,16 @@ class SessionDB:
conn.execute("DELETE FROM messages WHERE session_id = ?", (session_id,))
conn.execute("DELETE FROM sessions WHERE id = ?", (session_id,))
return True
return self._execute_write(_do)
deleted = self._execute_write(_do)
if deleted:
self._remove_session_files(sessions_dir, session_id)
return deleted
def prune_sessions(
self,
older_than_days: int = 90,
source: str = None,
sessions_dir: Optional[Path] = None,
) -> int:
def prune_sessions(self, older_than_days: int = 90, source: str = None) -> int:
"""Delete sessions older than N days. Returns count of deleted sessions.
Only prunes ended sessions (not active ones). Child sessions outside
the prune window are orphaned (parent_session_id set to NULL) rather
than cascade-deleted. When *sessions_dir* is provided, also removes
on-disk transcript files (``.json`` / ``.jsonl`` /
``request_dump_*``) for every pruned session, outside the DB
transaction.
than cascade-deleted.
"""
cutoff = time.time() - (older_than_days * 86400)
removed_ids: list[str] = []
def _do(conn):
if source:
@@ -1664,14 +1562,9 @@ class SessionDB:
for sid in session_ids:
conn.execute("DELETE FROM messages WHERE session_id = ?", (sid,))
conn.execute("DELETE FROM sessions WHERE id = ?", (sid,))
removed_ids.append(sid)
return len(session_ids)
count = self._execute_write(_do)
# Clean up on-disk files outside the DB transaction
for sid in removed_ids:
self._remove_session_files(sessions_dir, sid)
return count
return self._execute_write(_do)
# ── Meta key/value (for scheduler bookkeeping) ──
@@ -1725,7 +1618,6 @@ class SessionDB:
retention_days: int = 90,
min_interval_hours: int = 24,
vacuum: bool = True,
sessions_dir: Optional[Path] = None,
) -> Dict[str, Any]:
"""Idempotent auto-maintenance: prune old sessions + optional VACUUM.
@@ -1733,10 +1625,6 @@ class SessionDB:
within ``min_interval_hours`` no-op. Designed to be called once at
startup from long-lived entrypoints (CLI, gateway, cron scheduler).
When *sessions_dir* is provided, on-disk transcript files
(``.json`` / ``.jsonl`` / ``request_dump_*``) for pruned sessions
are removed as part of the same sweep (issue #3015).
Never raises. On any failure, logs a warning and returns a dict
with ``"error"`` set.
@@ -1760,10 +1648,7 @@ class SessionDB:
except (TypeError, ValueError):
pass # corrupt meta; treat as no prior run
pruned = self.prune_sessions(
older_than_days=retention_days,
sessions_dir=sessions_dir,
)
pruned = self.prune_sessions(older_than_days=retention_days)
result["pruned"] = pruned
# Only VACUUM if we actually freed rows — VACUUM on a tight DB
+1 -1
View File
@@ -4,7 +4,7 @@ let
src = ../ui-tui;
npmDeps = pkgs.fetchNpmDeps {
inherit src;
hash = "sha256-Chz+NW9NXqboXHOa6PKwf5bhAkkcFtKNhvKWwg2XSPc=";
hash = "sha256-RU4qSHgJPMyfRSEJDzkG4+MReDZDc6QbTD2wisa5QE0=";
};
npm = hermesNpmLib.mkNpmPassthru { folder = "ui-tui"; attr = "tui"; pname = "hermes-tui"; };
@@ -380,10 +380,6 @@ def backup_existing(path: Path, backup_root: Path) -> Optional[Path]:
# Replace OpenClaw brand names with Hermes in migrated text so that
# memory entries, user profiles, SOUL.md, and workspace instructions
# read as self-referential to the new agent identity.
#
# Case-preserving: ``OpenClaw`` → ``Hermes`` (prose), but lowercase matches
# like ``openclaw`` → ``hermes`` (so filesystem paths like ``~/.openclaw``
# become ``~/.hermes`` — the real Hermes home — not the broken ``~/.Hermes``).
_REBRAND_PATTERNS: List[Tuple[re.Pattern, str]] = [
(re.compile(r'\bOpen[\s-]?Claw\b', re.IGNORECASE), 'Hermes'),
(re.compile(r'\bClawdBot\b', re.IGNORECASE), 'Hermes'),
@@ -391,31 +387,10 @@ _REBRAND_PATTERNS: List[Tuple[re.Pattern, str]] = [
]
def _case_preserving_replacement(replacement: str):
"""Return a re.sub replacement fn that lowercases the result when the
matched text was all-lowercase.
Keeps ``OpenClaw`` ``Hermes`` but maps ``openclaw`` ``hermes`` so a
filesystem path like ``~/.openclaw/config.yaml`` rewrites to
``~/.hermes/config.yaml`` (the real Hermes home) instead of the broken
``~/.Hermes/config.yaml``.
"""
def _sub(match: "re.Match[str]") -> str:
matched = match.group(0)
if matched and matched.islower():
return replacement.lower()
return replacement
return _sub
def rebrand_text(text: str) -> str:
"""Replace OpenClaw / ClawdBot / MoltBot brand names with Hermes.
Preserves case so filesystem-path matches (lowercase) don't become
capitalized directory names that don't exist.
"""
"""Replace OpenClaw / ClawdBot / MoltBot brand names with Hermes."""
for pattern, replacement in _REBRAND_PATTERNS:
text = pattern.sub(_case_preserving_replacement(replacement), text)
text = pattern.sub(replacement, text)
return text
File diff suppressed because it is too large Load Diff
+752
View File
@@ -0,0 +1,752 @@
/*
* Hermes Kanban dashboard plugin styles.
*
* All colors reference theme CSS vars so the board reskins with the
* active dashboard theme. No hardcoded palette.
*/
.hermes-kanban {
width: 100%;
}
/* ---- Columns layout -------------------------------------------------- */
.hermes-kanban-columns {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(260px, 1fr));
gap: 0.75rem;
align-items: start;
}
.hermes-kanban-column {
display: flex;
flex-direction: column;
background: color-mix(in srgb, var(--color-card) 85%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius);
padding: 0.5rem;
min-height: 200px;
max-height: calc(100vh - 220px);
transition: border-color 120ms ease, background-color 120ms ease;
}
.hermes-kanban-column--drop {
border-color: var(--color-ring);
background: color-mix(in srgb, var(--color-ring) 8%, var(--color-card));
}
.hermes-kanban-column-header {
display: flex;
align-items: center;
gap: 0.5rem;
padding: 0.25rem 0.25rem 0.35rem;
font-weight: 600;
font-size: 0.85rem;
color: var(--color-foreground);
}
.hermes-kanban-column-label {
flex: 1;
letter-spacing: 0.01em;
}
.hermes-kanban-column-count {
font-variant-numeric: tabular-nums;
color: var(--color-muted-foreground);
font-size: 0.75rem;
font-weight: 500;
}
.hermes-kanban-column-add {
appearance: none;
background: transparent;
border: 1px solid var(--color-border);
color: var(--color-foreground);
border-radius: var(--radius-sm, 0.25rem);
width: 22px;
height: 22px;
line-height: 1;
font-size: 1rem;
cursor: pointer;
}
.hermes-kanban-column-add:hover {
background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
}
.hermes-kanban-column-sub {
padding: 0 0.25rem 0.5rem;
font-size: 0.7rem;
color: var(--color-muted-foreground);
border-bottom: 1px solid color-mix(in srgb, var(--color-border) 60%, transparent);
margin-bottom: 0.5rem;
}
.hermes-kanban-column-body {
display: flex;
flex-direction: column;
gap: 0.45rem;
overflow-y: auto;
padding-right: 0.1rem;
}
.hermes-kanban-empty {
padding: 1.5rem 0.5rem;
text-align: center;
font-size: 0.75rem;
color: var(--color-muted-foreground);
border: 1px dashed color-mix(in srgb, var(--color-border) 70%, transparent);
border-radius: var(--radius-sm, 0.25rem);
}
/* ---- Status dots ----------------------------------------------------- */
.hermes-kanban-dot {
display: inline-block;
width: 0.5rem;
height: 0.5rem;
border-radius: 999px;
background: var(--color-muted-foreground);
}
.hermes-kanban-dot-triage { background: #b47dd6; } /* lilac — fresh/unspecified */
.hermes-kanban-dot-todo { background: var(--color-muted-foreground); }
.hermes-kanban-dot-ready { background: #d4b348; } /* amber */
.hermes-kanban-dot-running { background: #3fb97d; } /* green */
.hermes-kanban-dot-blocked { background: var(--color-destructive, #d14a4a); }
.hermes-kanban-dot-done { background: #4a8cd1; } /* blue */
.hermes-kanban-dot-archived { background: var(--color-border); }
/* ---- Progress pill (N/M child tasks done) --------------------------- */
.hermes-kanban-progress {
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.62rem;
padding: 0.05rem 0.35rem;
border-radius: 999px;
background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
border: 1px solid color-mix(in srgb, var(--color-border) 80%, transparent);
color: var(--color-muted-foreground);
letter-spacing: 0.02em;
}
.hermes-kanban-progress--full {
background: color-mix(in srgb, #3fb97d 22%, transparent);
border-color: color-mix(in srgb, #3fb97d 45%, transparent);
color: var(--color-foreground);
}
/* ---- Lanes (per-profile sub-grouping inside Running) ---------------- */
.hermes-kanban-lane {
display: flex;
flex-direction: column;
gap: 0.35rem;
padding: 0.25rem 0 0.35rem;
border-top: 1px dashed color-mix(in srgb, var(--color-border) 70%, transparent);
}
.hermes-kanban-lane:first-child {
border-top: 0;
padding-top: 0;
}
.hermes-kanban-lane-head {
display: flex;
align-items: center;
gap: 0.4rem;
font-size: 0.65rem;
text-transform: uppercase;
letter-spacing: 0.08em;
color: var(--color-muted-foreground);
padding: 0 0.1rem;
}
.hermes-kanban-lane-name {
font-weight: 600;
font-family: var(--font-mono, ui-monospace, monospace);
}
.hermes-kanban-lane-count {
margin-left: auto;
font-variant-numeric: tabular-nums;
}
/* ---- Card ------------------------------------------------------------ */
.hermes-kanban-card {
cursor: grab;
transition: transform 100ms ease, box-shadow 100ms ease;
}
.hermes-kanban-card:hover {
box-shadow: 0 1px 0 0 var(--color-ring) inset, 0 0 0 1px var(--color-ring) inset;
}
.hermes-kanban-card:active {
cursor: grabbing;
transform: scale(0.995);
}
.hermes-kanban-card-content {
padding: 0.5rem 0.6rem !important;
display: flex;
flex-direction: column;
gap: 0.3rem;
}
.hermes-kanban-card-row {
display: flex;
align-items: center;
gap: 0.35rem;
flex-wrap: wrap;
}
.hermes-kanban-card-id {
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.65rem;
color: var(--color-muted-foreground);
letter-spacing: 0.03em;
}
.hermes-kanban-card-title {
font-size: 0.85rem;
font-weight: 500;
line-height: 1.3;
color: var(--color-foreground);
word-break: break-word;
}
.hermes-kanban-card-meta {
font-size: 0.7rem;
color: var(--color-muted-foreground);
gap: 0.55rem;
}
.hermes-kanban-priority {
font-size: 0.6rem !important;
padding: 0.05rem 0.3rem !important;
background: color-mix(in srgb, var(--color-ring) 18%, transparent);
color: var(--color-foreground);
border: 1px solid color-mix(in srgb, var(--color-ring) 40%, transparent);
}
.hermes-kanban-tag {
font-size: 0.6rem !important;
padding: 0.05rem 0.3rem !important;
}
.hermes-kanban-assignee {
font-weight: 500;
color: color-mix(in srgb, var(--color-foreground) 80%, var(--color-muted-foreground));
}
.hermes-kanban-unassigned {
font-style: italic;
}
.hermes-kanban-ago {
margin-left: auto;
}
/* ---- Inline create --------------------------------------------------- */
.hermes-kanban-inline-create {
display: flex;
flex-direction: column;
gap: 0.35rem;
padding: 0.5rem;
margin-bottom: 0.5rem;
background: color-mix(in srgb, var(--color-card) 70%, transparent);
border: 1px dashed var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
}
/* ---- Drawer (task detail side panel) --------------------------------- */
.hermes-kanban-drawer-shade {
position: fixed;
inset: 0;
background: rgba(0, 0, 0, 0.45);
z-index: 60;
display: flex;
justify-content: flex-end;
}
.hermes-kanban-drawer {
width: min(480px, 92vw);
height: 100vh;
background: var(--color-card);
border-left: 1px solid var(--color-border);
display: flex;
flex-direction: column;
box-shadow: -4px 0 18px rgba(0, 0, 0, 0.35);
animation: hermes-kanban-drawer-in 180ms ease-out;
}
@keyframes hermes-kanban-drawer-in {
from { transform: translateX(100%); opacity: 0.3; }
to { transform: translateX(0); opacity: 1; }
}
.hermes-kanban-drawer-head {
display: flex;
align-items: center;
justify-content: space-between;
padding: 0.6rem 0.8rem;
border-bottom: 1px solid var(--color-border);
font-family: var(--font-mono, ui-monospace, monospace);
}
.hermes-kanban-drawer-close {
appearance: none;
background: transparent;
border: 0;
color: var(--color-muted-foreground);
font-size: 1.25rem;
line-height: 1;
cursor: pointer;
padding: 0 0.25rem;
}
.hermes-kanban-drawer-close:hover { color: var(--color-foreground); }
.hermes-kanban-drawer-body {
flex: 1;
overflow-y: auto;
padding: 0.9rem;
display: flex;
flex-direction: column;
gap: 0.85rem;
}
.hermes-kanban-drawer-title {
display: flex;
align-items: center;
gap: 0.5rem;
font-size: 1rem;
font-weight: 600;
}
.hermes-kanban-drawer-meta {
display: flex;
flex-direction: column;
gap: 0.15rem;
padding: 0.5rem 0.6rem;
background: color-mix(in srgb, var(--color-foreground) 4%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
}
.hermes-kanban-meta-row {
display: flex;
gap: 0.5rem;
font-size: 0.72rem;
}
.hermes-kanban-meta-label {
width: 92px;
color: var(--color-muted-foreground);
}
.hermes-kanban-meta-value {
color: var(--color-foreground);
word-break: break-word;
}
.hermes-kanban-actions {
display: flex;
flex-wrap: wrap;
gap: 0.3rem;
}
.hermes-kanban-section {
display: flex;
flex-direction: column;
gap: 0.35rem;
}
.hermes-kanban-section-head {
font-size: 0.72rem;
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.07em;
color: var(--color-muted-foreground);
}
.hermes-kanban-pre {
margin: 0;
padding: 0.45rem 0.55rem;
white-space: pre-wrap;
word-break: break-word;
background: color-mix(in srgb, var(--color-foreground) 4%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.72rem;
color: var(--color-foreground);
}
.hermes-kanban-comment {
border-left: 2px solid color-mix(in srgb, var(--color-ring) 35%, transparent);
padding-left: 0.5rem;
display: flex;
flex-direction: column;
gap: 0.2rem;
}
.hermes-kanban-comment-head {
display: flex;
gap: 0.5rem;
font-size: 0.7rem;
}
.hermes-kanban-comment-author {
font-weight: 600;
color: var(--color-foreground);
}
.hermes-kanban-comment-ago {
color: var(--color-muted-foreground);
}
.hermes-kanban-event {
display: flex;
gap: 0.5rem;
font-size: 0.7rem;
color: var(--color-muted-foreground);
font-family: var(--font-mono, ui-monospace, monospace);
}
.hermes-kanban-event-kind {
color: var(--color-foreground);
min-width: 6rem;
}
.hermes-kanban-event-payload {
color: var(--color-muted-foreground);
overflow: hidden;
text-overflow: ellipsis;
white-space: nowrap;
max-width: 280px;
}
.hermes-kanban-drawer-comment-row {
display: flex;
gap: 0.4rem;
padding: 0.55rem 0.75rem;
border-top: 1px solid var(--color-border);
background: color-mix(in srgb, var(--color-card) 90%, transparent);
}
.hermes-kanban-count {
display: inline-flex;
gap: 0.2rem;
align-items: center;
}
/* ---- Selection chrome ----------------------------------------------- */
.hermes-kanban-card--selected :where(.hermes-kanban-card-content) {
box-shadow: 0 0 0 2px var(--color-ring) inset,
0 0 0 1px var(--color-ring) inset;
background: color-mix(in srgb, var(--color-ring) 6%, var(--color-card));
}
.hermes-kanban-card-check {
width: 0.85rem;
height: 0.85rem;
margin: 0;
cursor: pointer;
accent-color: var(--color-ring);
}
/* ---- Bulk action bar ------------------------------------------------ */
.hermes-kanban-bulk {
display: flex;
align-items: center;
gap: 0.5rem;
padding: 0.4rem 0.75rem;
background: color-mix(in srgb, var(--color-ring) 10%, var(--color-card));
border: 1px solid color-mix(in srgb, var(--color-ring) 40%, var(--color-border));
border-radius: var(--radius-sm, 0.25rem);
flex-wrap: wrap;
}
.hermes-kanban-bulk-count {
font-weight: 600;
font-size: 0.75rem;
padding-right: 0.25rem;
}
.hermes-kanban-bulk-btn {
height: 1.7rem !important;
padding: 0 0.5rem !important;
font-size: 0.7rem !important;
border: 1px solid var(--color-border);
cursor: pointer;
}
.hermes-kanban-bulk-btn:hover {
background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
}
.hermes-kanban-bulk-reassign {
display: flex;
align-items: center;
gap: 0.25rem;
padding-left: 0.5rem;
border-left: 1px solid color-mix(in srgb, var(--color-border) 70%, transparent);
}
/* ---- Dependency editor chips --------------------------------------- */
.hermes-kanban-deps-row {
display: flex;
align-items: center;
gap: 0.5rem;
margin-bottom: 0.4rem;
}
.hermes-kanban-deps-label {
font-size: 0.68rem;
text-transform: uppercase;
letter-spacing: 0.08em;
color: var(--color-muted-foreground);
min-width: 4rem;
}
.hermes-kanban-deps-chips {
display: flex;
gap: 0.3rem;
flex-wrap: wrap;
flex: 1;
}
.hermes-kanban-deps-empty {
font-size: 0.7rem;
color: var(--color-muted-foreground);
font-style: italic;
}
.hermes-kanban-dep-chip {
display: inline-flex;
align-items: center;
gap: 0.15rem;
padding: 0.1rem 0.35rem;
background: color-mix(in srgb, var(--color-foreground) 6%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.68rem;
color: var(--color-foreground);
}
.hermes-kanban-dep-chip-x {
appearance: none;
background: transparent;
border: 0;
color: var(--color-muted-foreground);
cursor: pointer;
font-size: 0.85rem;
line-height: 1;
padding: 0 0.15rem;
}
.hermes-kanban-dep-chip-x:hover { color: var(--color-destructive, #d14a4a); }
/* ---- Inline edit affordances --------------------------------------- */
.hermes-kanban-editable {
cursor: pointer;
border-bottom: 1px dotted color-mix(in srgb, var(--color-border) 80%, transparent);
}
.hermes-kanban-editable:hover {
color: var(--color-foreground);
border-bottom-color: var(--color-ring);
}
.hermes-kanban-drawer-title-text {
cursor: pointer;
}
.hermes-kanban-drawer-title-text:hover {
text-decoration: underline;
text-decoration-color: var(--color-ring);
text-decoration-style: dotted;
text-underline-offset: 3px;
}
.hermes-kanban-edit-row {
display: flex;
align-items: center;
gap: 0.35rem;
width: 100%;
}
.hermes-kanban-section-head-row {
display: flex;
align-items: center;
justify-content: space-between;
gap: 0.5rem;
}
.hermes-kanban-edit-link {
appearance: none;
background: transparent;
border: 0;
color: var(--color-muted-foreground);
font-size: 0.7rem;
text-transform: uppercase;
letter-spacing: 0.05em;
cursor: pointer;
padding: 0;
}
.hermes-kanban-edit-link:hover { color: var(--color-ring); }
.hermes-kanban-textarea {
width: 100%;
min-height: 8rem;
background: var(--color-card);
color: var(--color-foreground);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
padding: 0.5rem 0.6rem;
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.8rem;
line-height: 1.5;
resize: vertical;
}
.hermes-kanban-textarea:focus {
outline: none;
border-color: var(--color-ring);
box-shadow: 0 0 0 2px color-mix(in srgb, var(--color-ring) 30%, transparent);
}
/* ---- Markdown rendering -------------------------------------------- */
.hermes-kanban-md {
font-size: 0.8rem;
line-height: 1.55;
color: var(--color-foreground);
}
.hermes-kanban-md p { margin: 0.25rem 0; }
.hermes-kanban-md h1,
.hermes-kanban-md h2,
.hermes-kanban-md h3,
.hermes-kanban-md h4 {
margin: 0.6rem 0 0.2rem;
line-height: 1.25;
}
.hermes-kanban-md h1 { font-size: 1.05rem; }
.hermes-kanban-md h2 { font-size: 0.95rem; }
.hermes-kanban-md h3 { font-size: 0.88rem; }
.hermes-kanban-md h4 { font-size: 0.82rem; }
.hermes-kanban-md ul {
margin: 0.25rem 0 0.25rem 1.1rem;
padding: 0;
}
.hermes-kanban-md li { margin: 0.1rem 0; }
.hermes-kanban-md a {
color: var(--color-ring);
text-decoration: underline;
}
.hermes-kanban-md code {
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.75rem;
padding: 0.05rem 0.3rem;
background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
border-radius: 3px;
}
.hermes-kanban-md-code {
margin: 0.35rem 0;
padding: 0.5rem 0.6rem;
background: color-mix(in srgb, var(--color-foreground) 5%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
overflow-x: auto;
}
.hermes-kanban-md-code code {
background: transparent;
padding: 0;
font-size: 0.75rem;
white-space: pre;
}
.hermes-kanban-md strong { font-weight: 600; }
/* ---- Touch-drag proxy ---------------------------------------------- */
.hermes-kanban-touch-proxy {
pointer-events: none;
opacity: 0.85;
box-shadow: 0 8px 20px rgba(0, 0, 0, 0.35);
transform: scale(1.02);
transition: none;
}
/* ---- Staleness tiers ------------------------------------------------ */
.hermes-kanban-card--stale-amber :where(.hermes-kanban-card-content) {
box-shadow: 0 0 0 1px #d4b34888 inset;
}
.hermes-kanban-card--stale-amber:hover :where(.hermes-kanban-card-content) {
box-shadow: 0 0 0 2px #d4b348 inset;
}
.hermes-kanban-card--stale-red :where(.hermes-kanban-card-content) {
box-shadow: 0 0 0 1px var(--color-destructive, #d14a4a) inset,
0 0 8px color-mix(in srgb, var(--color-destructive, #d14a4a) 30%, transparent);
}
.hermes-kanban-card--stale-red:hover :where(.hermes-kanban-card-content) {
box-shadow: 0 0 0 2px var(--color-destructive, #d14a4a) inset,
0 0 10px color-mix(in srgb, var(--color-destructive, #d14a4a) 45%, transparent);
}
/* ---- Worker log pane ------------------------------------------------ */
.hermes-kanban-log {
max-height: 340px;
overflow: auto;
white-space: pre;
font-size: 0.7rem;
line-height: 1.45;
}
/* ---- Run history (per-attempt log in the drawer) ------------------- */
.hermes-kanban-run {
border-left: 2px solid var(--color-border);
padding: 0.35rem 0.5rem;
margin-bottom: 0.4rem;
background: color-mix(in srgb, var(--color-foreground) 3%, transparent);
border-radius: var(--radius-sm, 0.25rem);
}
.hermes-kanban-run--active { border-left-color: #3fb97d; }
.hermes-kanban-run--completed { border-left-color: #4a8cd1; }
.hermes-kanban-run--ended { border-left-color: #6b7280; } /* generic fallback when outcome is unset */
.hermes-kanban-run--blocked { border-left-color: var(--color-destructive, #d14a4a); }
.hermes-kanban-run--crashed,
.hermes-kanban-run--timed_out,
.hermes-kanban-run--gave_up,
.hermes-kanban-run--spawn_failed {
border-left-color: var(--color-destructive, #d14a4a);
background: color-mix(in srgb, var(--color-destructive, #d14a4a) 6%, transparent);
}
.hermes-kanban-run--reclaimed { border-left-color: #d4b348; }
.hermes-kanban-run-head {
display: flex;
align-items: center;
gap: 0.6rem;
font-size: 0.7rem;
}
.hermes-kanban-run-outcome {
font-family: var(--font-mono, ui-monospace, monospace);
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.05em;
color: var(--color-foreground);
}
.hermes-kanban-run-profile {
color: var(--color-muted-foreground);
}
.hermes-kanban-run-elapsed {
font-variant-numeric: tabular-nums;
color: var(--color-muted-foreground);
}
.hermes-kanban-run-ago {
margin-left: auto;
color: var(--color-muted-foreground);
}
.hermes-kanban-run-summary {
font-size: 0.75rem;
padding: 0.2rem 0 0;
color: var(--color-foreground);
}
.hermes-kanban-run-error {
font-size: 0.7rem;
color: var(--color-destructive, #d14a4a);
padding: 0.15rem 0 0;
font-family: var(--font-mono, ui-monospace, monospace);
}
.hermes-kanban-run-meta {
display: block;
font-size: 0.65rem;
padding: 0.15rem 0 0;
color: var(--color-muted-foreground);
white-space: pre-wrap;
word-break: break-word;
font-family: var(--font-mono, ui-monospace, monospace);
}
+14
View File
@@ -0,0 +1,14 @@
{
"name": "kanban",
"label": "Kanban",
"description": "Multi-agent collaboration board — drag-drop cards across columns, read comment threads, see which profile is running what",
"icon": "Package",
"version": "1.0.0",
"tab": {
"path": "/kanban",
"position": "after:skills"
},
"entry": "dist/index.js",
"css": "dist/style.css",
"api": "plugin_api.py"
}
+830
View File
@@ -0,0 +1,830 @@
"""Kanban dashboard plugin — backend API routes.
Mounted at /api/plugins/kanban/ by the dashboard plugin system.
This layer is intentionally thin: every handler is a small wrapper around
``hermes_cli.kanban_db`` or a direct SQL query. Writes use the same code
paths the CLI and gateway ``/kanban`` command use, so the three surfaces
cannot drift.
Live updates arrive via the ``/events`` WebSocket, which tails the
append-only ``task_events`` table on a short poll interval (WAL mode lets
reads run alongside the dispatcher's IMMEDIATE write transactions).
Security note
-------------
The dashboard's HTTP auth middleware (``web_server.auth_middleware``)
explicitly skips ``/api/plugins/`` plugin routes are unauthenticated by
design because the dashboard binds to localhost by default. For the
WebSocket we still require the session token as a ``?token=`` query
parameter (browsers cannot set the ``Authorization`` header on an upgrade
request), matching the established pattern used by the in-browser PTY
bridge in ``hermes_cli/web_server.py``. If you run the dashboard with
``--host 0.0.0.0``, every plugin route kanban included becomes
reachable from the network. Don't do that on a shared host.
"""
from __future__ import annotations
import asyncio
import hmac
import json
import logging
import sqlite3
import time
from dataclasses import asdict
from typing import Any, Optional
from fastapi import APIRouter, HTTPException, Query, WebSocket, WebSocketDisconnect, status as http_status
from pydantic import BaseModel, Field
from hermes_cli import kanban_db
log = logging.getLogger(__name__)
router = APIRouter()
# ---------------------------------------------------------------------------
# Auth helper — WebSocket only (HTTP routes live behind the dashboard's
# existing plugin-bypass; this is documented above).
# ---------------------------------------------------------------------------
def _check_ws_token(provided: Optional[str]) -> bool:
"""Constant-time compare against the dashboard session token.
Imported lazily so the plugin still loads in test contexts where the
dashboard web_server module isn't importable (e.g. the bare-FastAPI
test harness).
"""
if not provided:
return False
try:
from hermes_cli import web_server as _ws
except Exception:
# No dashboard context (tests). Accept so the tail loop is still
# testable; in production the dashboard module always imports
# cleanly because it's the caller.
return True
expected = getattr(_ws, "_SESSION_TOKEN", None)
if not expected:
return True
return hmac.compare_digest(str(provided), str(expected))
def _conn():
"""Open a kanban_db connection, creating the schema on first use.
Every handler that mutates the DB goes through this so the plugin
self-heals on a fresh install (no user-visible "no such table"
error if somebody hits POST /tasks before GET /board).
``init_db`` is idempotent.
"""
try:
kanban_db.init_db()
except Exception as exc:
log.warning("kanban init_db failed: %s", exc)
return kanban_db.connect()
# ---------------------------------------------------------------------------
# Serialization helpers
# ---------------------------------------------------------------------------
# Columns shown by the dashboard, in left-to-right order. "archived" is
# available via a filter toggle rather than a visible column.
BOARD_COLUMNS: list[str] = [
"triage", "todo", "ready", "running", "blocked", "done",
]
def _task_dict(task: kanban_db.Task) -> dict[str, Any]:
d = asdict(task)
# Add derived age metrics so the UI can colour stale cards without
# computing deltas client-side.
d["age"] = kanban_db.task_age(task)
# Keep body short on list endpoints; full body comes from /tasks/:id.
return d
def _event_dict(event: kanban_db.Event) -> dict[str, Any]:
return {
"id": event.id,
"task_id": event.task_id,
"kind": event.kind,
"payload": event.payload,
"created_at": event.created_at,
"run_id": event.run_id,
}
def _comment_dict(c: kanban_db.Comment) -> dict[str, Any]:
return {
"id": c.id,
"task_id": c.task_id,
"author": c.author,
"body": c.body,
"created_at": c.created_at,
}
def _run_dict(r: kanban_db.Run) -> dict[str, Any]:
"""Serialise a Run for the drawer's Run history section."""
return {
"id": r.id,
"task_id": r.task_id,
"profile": r.profile,
"step_key": r.step_key,
"status": r.status,
"claim_lock": r.claim_lock,
"claim_expires": r.claim_expires,
"worker_pid": r.worker_pid,
"max_runtime_seconds": r.max_runtime_seconds,
"last_heartbeat_at": r.last_heartbeat_at,
"started_at": r.started_at,
"ended_at": r.ended_at,
"outcome": r.outcome,
"summary": r.summary,
"metadata": r.metadata,
"error": r.error,
}
def _links_for(conn: sqlite3.Connection, task_id: str) -> dict[str, list[str]]:
"""Return {'parents': [...], 'children': [...]} for a task."""
parents = [
r["parent_id"]
for r in conn.execute(
"SELECT parent_id FROM task_links WHERE child_id = ? ORDER BY parent_id",
(task_id,),
)
]
children = [
r["child_id"]
for r in conn.execute(
"SELECT child_id FROM task_links WHERE parent_id = ? ORDER BY child_id",
(task_id,),
)
]
return {"parents": parents, "children": children}
# ---------------------------------------------------------------------------
# GET /board
# ---------------------------------------------------------------------------
@router.get("/board")
def get_board(
tenant: Optional[str] = Query(None, description="Filter to a single tenant"),
include_archived: bool = Query(False),
):
"""Return the full board grouped by status column.
``_conn()`` auto-initializes ``kanban.db`` on first call so a fresh
install doesn't surface a "failed to load" error on the plugin tab.
"""
conn = _conn()
try:
tasks = kanban_db.list_tasks(
conn, tenant=tenant, include_archived=include_archived
)
# Pre-fetch link counts per task (cheap: one query).
link_counts: dict[str, dict[str, int]] = {}
for row in conn.execute(
"SELECT parent_id, child_id FROM task_links"
).fetchall():
link_counts.setdefault(row["parent_id"], {"parents": 0, "children": 0})[
"children"
] += 1
link_counts.setdefault(row["child_id"], {"parents": 0, "children": 0})[
"parents"
] += 1
# Comment + event counts (both cheap aggregates).
comment_counts: dict[str, int] = {
r["task_id"]: r["n"]
for r in conn.execute(
"SELECT task_id, COUNT(*) AS n FROM task_comments GROUP BY task_id"
)
}
# Progress rollup: for each parent, how many children are done / total.
# One pass over task_links joined with child status — cheaper than
# N per-task queries and the plugin uses it to render "N/M".
progress: dict[str, dict[str, int]] = {}
for row in conn.execute(
"SELECT l.parent_id AS pid, t.status AS cstatus "
"FROM task_links l JOIN tasks t ON t.id = l.child_id"
).fetchall():
p = progress.setdefault(row["pid"], {"done": 0, "total": 0})
p["total"] += 1
if row["cstatus"] == "done":
p["done"] += 1
latest_event_id = conn.execute(
"SELECT COALESCE(MAX(id), 0) AS m FROM task_events"
).fetchone()["m"]
columns: dict[str, list[dict]] = {c: [] for c in BOARD_COLUMNS}
if include_archived:
columns["archived"] = []
for t in tasks:
d = _task_dict(t)
d["link_counts"] = link_counts.get(t.id, {"parents": 0, "children": 0})
d["comment_count"] = comment_counts.get(t.id, 0)
d["progress"] = progress.get(t.id) # None when the task has no children
col = t.status if t.status in columns else "todo"
columns[col].append(d)
# Stable per-column ordering already applied by list_tasks
# (priority DESC, created_at ASC), keep as-is.
# List of known tenants for the UI filter dropdown.
tenants = [
r["tenant"]
for r in conn.execute(
"SELECT DISTINCT tenant FROM tasks WHERE tenant IS NOT NULL ORDER BY tenant"
)
]
# List of distinct assignees for the lane-by-profile sub-grouping.
assignees = [
r["assignee"]
for r in conn.execute(
"SELECT DISTINCT assignee FROM tasks WHERE assignee IS NOT NULL "
"AND status != 'archived' ORDER BY assignee"
)
]
return {
"columns": [
{"name": name, "tasks": columns[name]} for name in columns.keys()
],
"tenants": tenants,
"assignees": assignees,
"latest_event_id": int(latest_event_id),
"now": int(time.time()),
}
finally:
conn.close()
# ---------------------------------------------------------------------------
# GET /tasks/:id
# ---------------------------------------------------------------------------
@router.get("/tasks/{task_id}")
def get_task(task_id: str):
conn = _conn()
try:
task = kanban_db.get_task(conn, task_id)
if task is None:
raise HTTPException(status_code=404, detail=f"task {task_id} not found")
return {
"task": _task_dict(task),
"comments": [_comment_dict(c) for c in kanban_db.list_comments(conn, task_id)],
"events": [_event_dict(e) for e in kanban_db.list_events(conn, task_id)],
"links": _links_for(conn, task_id),
"runs": [_run_dict(r) for r in kanban_db.list_runs(conn, task_id)],
}
finally:
conn.close()
# ---------------------------------------------------------------------------
# POST /tasks
# ---------------------------------------------------------------------------
class CreateTaskBody(BaseModel):
title: str
body: Optional[str] = None
assignee: Optional[str] = None
tenant: Optional[str] = None
priority: int = 0
workspace_kind: str = "scratch"
workspace_path: Optional[str] = None
parents: list[str] = Field(default_factory=list)
triage: bool = False
idempotency_key: Optional[str] = None
max_runtime_seconds: Optional[int] = None
skills: Optional[list[str]] = None
@router.post("/tasks")
def create_task(payload: CreateTaskBody):
conn = _conn()
try:
task_id = kanban_db.create_task(
conn,
title=payload.title,
body=payload.body,
assignee=payload.assignee,
created_by="dashboard",
workspace_kind=payload.workspace_kind,
workspace_path=payload.workspace_path,
tenant=payload.tenant,
priority=payload.priority,
parents=payload.parents,
triage=payload.triage,
idempotency_key=payload.idempotency_key,
max_runtime_seconds=payload.max_runtime_seconds,
skills=payload.skills,
)
task = kanban_db.get_task(conn, task_id)
return {"task": _task_dict(task) if task else None}
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
finally:
conn.close()
# ---------------------------------------------------------------------------
# PATCH /tasks/:id (status / assignee / priority / title / body)
# ---------------------------------------------------------------------------
class UpdateTaskBody(BaseModel):
status: Optional[str] = None
assignee: Optional[str] = None
priority: Optional[int] = None
title: Optional[str] = None
body: Optional[str] = None
result: Optional[str] = None
block_reason: Optional[str] = None
# Structured handoff fields — forwarded to complete_task when status
# transitions to 'done'. Dashboard parity with ``hermes kanban
# complete --summary ... --metadata ...``.
summary: Optional[str] = None
metadata: Optional[dict] = None
@router.patch("/tasks/{task_id}")
def update_task(task_id: str, payload: UpdateTaskBody):
conn = _conn()
try:
task = kanban_db.get_task(conn, task_id)
if task is None:
raise HTTPException(status_code=404, detail=f"task {task_id} not found")
# --- assignee ----------------------------------------------------
if payload.assignee is not None:
try:
ok = kanban_db.assign_task(
conn, task_id, payload.assignee or None,
)
except RuntimeError as e:
raise HTTPException(status_code=409, detail=str(e))
if not ok:
raise HTTPException(status_code=404, detail="task not found")
# --- status -------------------------------------------------------
if payload.status is not None:
s = payload.status
ok = True
if s == "done":
ok = kanban_db.complete_task(
conn, task_id,
result=payload.result,
summary=payload.summary,
metadata=payload.metadata,
)
elif s == "blocked":
ok = kanban_db.block_task(conn, task_id, reason=payload.block_reason)
elif s == "ready":
# Re-open a blocked task, or just an explicit status set.
current = kanban_db.get_task(conn, task_id)
if current and current.status == "blocked":
ok = kanban_db.unblock_task(conn, task_id)
else:
# Direct status write for drag-drop (todo -> ready etc).
ok = _set_status_direct(conn, task_id, "ready")
elif s == "archived":
ok = kanban_db.archive_task(conn, task_id)
elif s in ("todo", "running", "triage"):
ok = _set_status_direct(conn, task_id, s)
else:
raise HTTPException(status_code=400, detail=f"unknown status: {s}")
if not ok:
raise HTTPException(
status_code=409,
detail=f"status transition to {s!r} not valid from current state",
)
# --- priority -----------------------------------------------------
if payload.priority is not None:
with kanban_db.write_txn(conn):
conn.execute(
"UPDATE tasks SET priority = ? WHERE id = ?",
(int(payload.priority), task_id),
)
conn.execute(
"INSERT INTO task_events (task_id, kind, payload, created_at) "
"VALUES (?, 'reprioritized', ?, ?)",
(task_id, json.dumps({"priority": int(payload.priority)}),
int(time.time())),
)
# --- title / body -------------------------------------------------
if payload.title is not None or payload.body is not None:
with kanban_db.write_txn(conn):
sets, vals = [], []
if payload.title is not None:
if not payload.title.strip():
raise HTTPException(status_code=400, detail="title cannot be empty")
sets.append("title = ?")
vals.append(payload.title.strip())
if payload.body is not None:
sets.append("body = ?")
vals.append(payload.body)
vals.append(task_id)
conn.execute(
f"UPDATE tasks SET {', '.join(sets)} WHERE id = ?", vals,
)
conn.execute(
"INSERT INTO task_events (task_id, kind, payload, created_at) "
"VALUES (?, 'edited', NULL, ?)",
(task_id, int(time.time())),
)
updated = kanban_db.get_task(conn, task_id)
return {"task": _task_dict(updated) if updated else None}
finally:
conn.close()
def _set_status_direct(
conn: sqlite3.Connection, task_id: str, new_status: str,
) -> bool:
"""Direct status write for drag-drop moves that aren't covered by the
structured complete/block/unblock/archive verbs (e.g. todo<->ready,
running<->ready). Appends a ``status`` event row for the live feed.
When this transitions OFF ``running`` to anything other than the
terminal verbs above (which own their own run closing), we close the
active run with outcome='reclaimed' so attempt history isn't
orphaned. ``running -> ready`` via drag-drop is the common case
(user yanking a stuck worker back to the queue).
"""
with kanban_db.write_txn(conn):
# Snapshot current state so we know whether to close a run.
prev = conn.execute(
"SELECT status, current_run_id FROM tasks WHERE id = ?",
(task_id,),
).fetchone()
if prev is None:
return False
was_running = prev["status"] == "running"
cur = conn.execute(
"UPDATE tasks SET status = ?, "
" claim_lock = CASE WHEN ? = 'running' THEN claim_lock ELSE NULL END, "
" claim_expires = CASE WHEN ? = 'running' THEN claim_expires ELSE NULL END, "
" worker_pid = CASE WHEN ? = 'running' THEN worker_pid ELSE NULL END "
"WHERE id = ?",
(new_status, new_status, new_status, new_status, task_id),
)
if cur.rowcount != 1:
return False
run_id = None
if was_running and new_status != "running" and prev["current_run_id"]:
run_id = kanban_db._end_run(
conn, task_id,
outcome="reclaimed", status="reclaimed",
summary=f"status changed to {new_status} (dashboard/direct)",
)
conn.execute(
"INSERT INTO task_events (task_id, run_id, kind, payload, created_at) "
"VALUES (?, ?, 'status', ?, ?)",
(task_id, run_id, json.dumps({"status": new_status}), int(time.time())),
)
# If we re-opened something, children may have gone stale.
if new_status in ("done", "ready"):
kanban_db.recompute_ready(conn)
return True
# ---------------------------------------------------------------------------
# Comments
# ---------------------------------------------------------------------------
class CommentBody(BaseModel):
body: str
author: Optional[str] = "dashboard"
@router.post("/tasks/{task_id}/comments")
def add_comment(task_id: str, payload: CommentBody):
if not payload.body.strip():
raise HTTPException(status_code=400, detail="body is required")
conn = _conn()
try:
if kanban_db.get_task(conn, task_id) is None:
raise HTTPException(status_code=404, detail=f"task {task_id} not found")
kanban_db.add_comment(
conn, task_id, author=payload.author or "dashboard", body=payload.body,
)
return {"ok": True}
finally:
conn.close()
# ---------------------------------------------------------------------------
# Links
# ---------------------------------------------------------------------------
class LinkBody(BaseModel):
parent_id: str
child_id: str
@router.post("/links")
def add_link(payload: LinkBody):
conn = _conn()
try:
kanban_db.link_tasks(conn, payload.parent_id, payload.child_id)
return {"ok": True}
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
finally:
conn.close()
@router.delete("/links")
def delete_link(parent_id: str = Query(...), child_id: str = Query(...)):
conn = _conn()
try:
ok = kanban_db.unlink_tasks(conn, parent_id, child_id)
return {"ok": bool(ok)}
finally:
conn.close()
# ---------------------------------------------------------------------------
# Bulk actions (multi-select on the board)
# ---------------------------------------------------------------------------
class BulkTaskBody(BaseModel):
ids: list[str]
status: Optional[str] = None
assignee: Optional[str] = None # "" or None = unassign
priority: Optional[int] = None
archive: bool = False
@router.post("/tasks/bulk")
def bulk_update(payload: BulkTaskBody):
"""Apply the same patch to every id in ``payload.ids``.
This is an *independent* iteration per-task failures don't abort
siblings. Returns per-id outcome so the UI can surface partials.
"""
ids = [i for i in (payload.ids or []) if i]
if not ids:
raise HTTPException(status_code=400, detail="ids is required")
results: list[dict] = []
conn = _conn()
try:
for tid in ids:
entry: dict[str, Any] = {"id": tid, "ok": True}
try:
task = kanban_db.get_task(conn, tid)
if task is None:
entry.update(ok=False, error="not found")
results.append(entry)
continue
if payload.archive:
if not kanban_db.archive_task(conn, tid):
entry.update(ok=False, error="archive refused")
if payload.status is not None and not payload.archive:
s = payload.status
if s == "done":
ok = kanban_db.complete_task(conn, tid)
elif s == "blocked":
ok = kanban_db.block_task(conn, tid)
elif s == "ready":
cur = kanban_db.get_task(conn, tid)
if cur and cur.status == "blocked":
ok = kanban_db.unblock_task(conn, tid)
else:
ok = _set_status_direct(conn, tid, "ready")
elif s in ("todo", "running", "triage"):
ok = _set_status_direct(conn, tid, s)
else:
entry.update(ok=False, error=f"unknown status {s!r}")
results.append(entry)
continue
if not ok:
entry.update(ok=False, error=f"transition to {s!r} refused")
if payload.assignee is not None:
try:
if not kanban_db.assign_task(
conn, tid, payload.assignee or None,
):
entry.update(ok=False, error="assign refused")
except RuntimeError as e:
entry.update(ok=False, error=str(e))
if payload.priority is not None:
with kanban_db.write_txn(conn):
conn.execute(
"UPDATE tasks SET priority = ? WHERE id = ?",
(int(payload.priority), tid),
)
conn.execute(
"INSERT INTO task_events (task_id, kind, payload, created_at) "
"VALUES (?, 'reprioritized', ?, ?)",
(tid, json.dumps({"priority": int(payload.priority)}),
int(time.time())),
)
except Exception as e: # defensive — one bad id shouldn't kill the batch
entry.update(ok=False, error=str(e))
results.append(entry)
return {"results": results}
finally:
conn.close()
# ---------------------------------------------------------------------------
# Plugin config (read dashboard.kanban.* defaults from config.yaml)
# ---------------------------------------------------------------------------
@router.get("/config")
def get_config():
"""Return kanban dashboard preferences from ~/.hermes/config.yaml.
Reads the ``dashboard.kanban`` section if present; defaults otherwise.
Used by the UI to pre-select tenant filters, toggle markdown rendering,
or set column-width preferences without a round-trip per page load.
"""
try:
from hermes_cli.config import load_config
cfg = load_config() or {}
except Exception:
cfg = {}
dash_cfg = (cfg.get("dashboard") or {})
# dashboard.kanban may itself be a dict; fall back to {}.
k_cfg = dash_cfg.get("kanban") or {}
return {
"default_tenant": k_cfg.get("default_tenant") or "",
"lane_by_profile": bool(k_cfg.get("lane_by_profile", True)),
"include_archived_by_default": bool(k_cfg.get("include_archived_by_default", False)),
"render_markdown": bool(k_cfg.get("render_markdown", True)),
}
# ---------------------------------------------------------------------------
# Stats (per-profile / per-status counts + oldest-ready age)
# ---------------------------------------------------------------------------
@router.get("/stats")
def get_stats():
"""Per-status + per-assignee counts + oldest-ready age.
Designed for the dashboard HUD and for router profiles that need to
answer "is this specialist overloaded?" without scanning the whole
board themselves.
"""
conn = _conn()
try:
return kanban_db.board_stats(conn)
finally:
conn.close()
@router.get("/assignees")
def get_assignees():
"""Known profiles + per-profile task counts.
Returns the union of ``~/.hermes/profiles/*`` on disk and every
distinct assignee currently used on the board. The dashboard uses
this to populate its assignee dropdown so a freshly-created profile
appears in the picker before it's been given any task.
"""
conn = _conn()
try:
return {"assignees": kanban_db.known_assignees(conn)}
finally:
conn.close()
# ---------------------------------------------------------------------------
# Worker log (read-only; file written by _default_spawn)
# ---------------------------------------------------------------------------
@router.get("/tasks/{task_id}/log")
def get_task_log(task_id: str, tail: Optional[int] = Query(None, ge=1, le=2_000_000)):
"""Return the worker's stdout/stderr log.
``tail`` caps the response size (bytes) so the dashboard drawer
doesn't paginate megabytes into the browser. Returns 404 if the task
has never spawned. The on-disk log is rotated at 2 MiB per
``_rotate_worker_log`` a single ``.log.1`` is kept, no further
generations, so disk usage per task is bounded at ~4 MiB.
"""
conn = _conn()
try:
task = kanban_db.get_task(conn, task_id)
finally:
conn.close()
if task is None:
raise HTTPException(status_code=404, detail=f"task {task_id} not found")
content = kanban_db.read_worker_log(task_id, tail_bytes=tail)
log_path = kanban_db.worker_log_path(task_id)
size = log_path.stat().st_size if log_path.exists() else 0
return {
"task_id": task_id,
"path": str(log_path),
"exists": content is not None,
"size_bytes": size,
"content": content or "",
# Truncated when the on-disk file was larger than the tail cap.
"truncated": bool(tail and size > tail),
}
# ---------------------------------------------------------------------------
# Dispatch nudge (optional quick-path so the UI doesn't wait 60 s)
# ---------------------------------------------------------------------------
@router.post("/dispatch")
def dispatch(dry_run: bool = Query(False), max_n: int = Query(8, alias="max")):
conn = _conn()
try:
result = kanban_db.dispatch_once(
conn, dry_run=dry_run, max_spawn=max_n,
)
# DispatchResult is a dataclass.
try:
return asdict(result)
except TypeError:
return {"result": str(result)}
finally:
conn.close()
# ---------------------------------------------------------------------------
# WebSocket: /events?since=<event_id>
# ---------------------------------------------------------------------------
# Poll interval for the event tail loop. SQLite WAL + 300 ms polling is
# the simplest and most robust approach; it adds a fraction of a percent
# of CPU and has no shared state to synchronize across workers.
_EVENT_POLL_SECONDS = 0.3
@router.websocket("/events")
async def stream_events(ws: WebSocket):
# Enforce the dashboard session token as a query param — browsers can't
# set Authorization on a WS upgrade. This matches how the PTY bridge
# authenticates in hermes_cli/web_server.py.
token = ws.query_params.get("token")
if not _check_ws_token(token):
await ws.close(code=http_status.WS_1008_POLICY_VIOLATION)
return
await ws.accept()
try:
since_raw = ws.query_params.get("since", "0")
try:
cursor = int(since_raw)
except ValueError:
cursor = 0
def _fetch_new(cursor_val: int) -> tuple[int, list[dict]]:
conn = kanban_db.connect()
try:
rows = conn.execute(
"SELECT id, task_id, run_id, kind, payload, created_at "
"FROM task_events WHERE id > ? ORDER BY id ASC LIMIT 200",
(cursor_val,),
).fetchall()
out: list[dict] = []
new_cursor = cursor_val
for r in rows:
try:
payload = json.loads(r["payload"]) if r["payload"] else None
except Exception:
payload = None
out.append({
"id": r["id"],
"task_id": r["task_id"],
"run_id": r["run_id"],
"kind": r["kind"],
"payload": payload,
"created_at": r["created_at"],
})
new_cursor = r["id"]
return new_cursor, out
finally:
conn.close()
while True:
cursor, events = await asyncio.to_thread(_fetch_new, cursor)
if events:
await ws.send_json({"events": events, "cursor": cursor})
await asyncio.sleep(_EVENT_POLL_SECONDS)
except WebSocketDisconnect:
return
except Exception as exc: # defensive: never crash the dashboard worker
log.warning("Kanban event stream error: %s", exc)
try:
await ws.close()
except Exception:
pass
@@ -0,0 +1,17 @@
[Unit]
Description=Hermes Kanban dispatcher (hermes kanban daemon)
Documentation=https://hermes-agent.nousresearch.com/docs/user-guide/features/kanban
After=network.target
[Service]
Type=simple
ExecStart=/usr/bin/env hermes kanban daemon --interval 60 --pidfile %t/hermes-kanban-dispatcher.pid
Restart=on-failure
RestartSec=5
# Log to the journal via stdout/stderr; the dispatcher also writes per-task
# worker output to $HERMES_HOME/kanban/logs/<task>.log.
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=default.target
+29 -124
View File
@@ -3,9 +3,7 @@
Long-term memory with knowledge graph, entity resolution, and multi-strategy
retrieval. Supports cloud (API key) and local modes.
Configurable request timeout via HINDSIGHT_TIMEOUT env var or config.json.
Configurable embedded daemon idle timeout via HINDSIGHT_IDLE_TIMEOUT env var
or config.json idle_timeout.
Configurable timeout via HINDSIGHT_TIMEOUT env var or config.json.
Original PR #1811 by benfrank241, adapted to MemoryProvider ABC.
@@ -16,7 +14,6 @@ Config via environment variables:
HINDSIGHT_API_URL API endpoint
HINDSIGHT_MODE cloud or local (default: cloud)
HINDSIGHT_TIMEOUT API request timeout in seconds (default: 120)
HINDSIGHT_IDLE_TIMEOUT embedded daemon idle timeout seconds; 0 disables shutdown (default: 300)
HINDSIGHT_RETAIN_TAGS comma-separated tags attached to retained memories
HINDSIGHT_RETAIN_SOURCE metadata source value attached to retained memories
HINDSIGHT_RETAIN_USER_PREFIX label used before user turns in retained transcripts
@@ -48,7 +45,6 @@ _DEFAULT_API_URL = "https://api.hindsight.vectorize.io"
_DEFAULT_LOCAL_URL = "http://localhost:8888"
_MIN_CLIENT_VERSION = "0.4.22"
_DEFAULT_TIMEOUT = 120 # seconds — cloud API can take 30-40s per request
_DEFAULT_IDLE_TIMEOUT = 300 # seconds — Hindsight embedded daemon default
_VALID_BUDGETS = {"low", "mid", "high"}
_PROVIDER_DEFAULT_MODELS = {
"openai": "gpt-4o-mini",
@@ -63,17 +59,6 @@ _PROVIDER_DEFAULT_MODELS = {
}
def _parse_int_setting(value: Any, default: int) -> int:
"""Parse an integer config/env value, falling back on invalid input."""
if value is None or value == "":
return default
try:
return int(value)
except (TypeError, ValueError):
logger.warning("Invalid integer Hindsight setting %r; using default %s", value, default)
return default
def _check_local_runtime() -> tuple[bool, str | None]:
"""Return whether local embedded Hindsight imports cleanly.
@@ -218,8 +203,6 @@ def _load_config() -> dict:
return {
"mode": os.environ.get("HINDSIGHT_MODE", "cloud"),
"apiKey": os.environ.get("HINDSIGHT_API_KEY", ""),
"timeout": _parse_int_setting(os.environ.get("HINDSIGHT_TIMEOUT"), _DEFAULT_TIMEOUT),
"idle_timeout": _parse_int_setting(os.environ.get("HINDSIGHT_IDLE_TIMEOUT"), _DEFAULT_IDLE_TIMEOUT),
"retain_tags": os.environ.get("HINDSIGHT_RETAIN_TAGS", ""),
"retain_source": os.environ.get("HINDSIGHT_RETAIN_SOURCE", ""),
"retain_user_prefix": os.environ.get("HINDSIGHT_RETAIN_USER_PREFIX", "User"),
@@ -321,16 +304,6 @@ def _build_embedded_profile_env(config: dict[str, Any], *, llm_api_key: str | No
}
if current_base_url:
env_values["HINDSIGHT_API_LLM_BASE_URL"] = str(current_base_url)
idle_timeout = (
config.get("idle_timeout")
if config.get("idle_timeout") is not None
else os.environ.get("HINDSIGHT_IDLE_TIMEOUT")
)
if idle_timeout is not None and idle_timeout != "":
env_values["HINDSIGHT_EMBED_DAEMON_IDLE_TIMEOUT"] = str(
_parse_int_setting(idle_timeout, _DEFAULT_IDLE_TIMEOUT)
)
return env_values
@@ -439,7 +412,6 @@ class HindsightMemoryProvider(MemoryProvider):
self._turn_index = 0
self._client = None
self._timeout = _DEFAULT_TIMEOUT
self._idle_timeout = _DEFAULT_IDLE_TIMEOUT
self._prefetch_result = ""
self._prefetch_lock = threading.Lock()
self._prefetch_thread = None
@@ -620,17 +592,10 @@ class HindsightMemoryProvider(MemoryProvider):
sys.stdout.write(" LLM API key: ")
sys.stdout.flush()
llm_key = getpass.getpass(prompt="") if sys.stdin.isatty() else sys.stdin.readline().strip()
if llm_key:
env_writes["HINDSIGHT_LLM_API_KEY"] = llm_key
else:
env_path = Path(hermes_home) / ".env"
existing_llm_key = ""
if env_path.exists():
for line in env_path.read_text().splitlines():
if line.startswith("HINDSIGHT_LLM_API_KEY="):
existing_llm_key = line.split("=", 1)[1]
break
env_writes["HINDSIGHT_LLM_API_KEY"] = existing_llm_key
# Always write explicitly (including empty) so the provider sees ""
# rather than a missing variable. The daemon reads from .env at
# startup and fails when HINDSIGHT_LLM_API_KEY is unset.
env_writes["HINDSIGHT_LLM_API_KEY"] = llm_key
# Step 4: Save everything
provider_config["bank_id"] = "hermes"
@@ -640,11 +605,6 @@ class HindsightMemoryProvider(MemoryProvider):
timeout_val = existing_timeout if existing_timeout else _DEFAULT_TIMEOUT
provider_config["timeout"] = timeout_val
env_writes["HINDSIGHT_TIMEOUT"] = str(timeout_val)
if mode == "local_embedded":
existing_idle_timeout = self._config.get("idle_timeout") if self._config else None
idle_timeout_val = existing_idle_timeout if existing_idle_timeout is not None else _DEFAULT_IDLE_TIMEOUT
provider_config["idle_timeout"] = idle_timeout_val
env_writes["HINDSIGHT_IDLE_TIMEOUT"] = str(idle_timeout_val)
config["memory"]["provider"] = "hindsight"
save_config(config)
@@ -733,7 +693,6 @@ class HindsightMemoryProvider(MemoryProvider):
{"key": "recall_max_input_chars", "description": "Maximum input query length for auto-recall", "default": 800},
{"key": "recall_prompt_preamble", "description": "Custom preamble for recalled memories in context"},
{"key": "timeout", "description": "API request timeout in seconds", "default": _DEFAULT_TIMEOUT},
{"key": "idle_timeout", "description": "Embedded daemon idle timeout in seconds (0 disables auto-shutdown)", "default": _DEFAULT_IDLE_TIMEOUT, "when": {"mode": "local_embedded"}},
]
def _get_client(self):
@@ -761,14 +720,6 @@ class HindsightMemoryProvider(MemoryProvider):
)
if self._llm_base_url:
kwargs["llm_base_url"] = self._llm_base_url
idle_timeout = _parse_int_setting(
self._config.get("idle_timeout")
if self._config.get("idle_timeout") is not None
else os.environ.get("HINDSIGHT_IDLE_TIMEOUT", self._idle_timeout),
_DEFAULT_IDLE_TIMEOUT,
)
self._idle_timeout = idle_timeout
kwargs["idle_timeout"] = idle_timeout
self._client = HindsightEmbedded(**kwargs)
else:
from hindsight_client import Hindsight
@@ -785,38 +736,6 @@ class HindsightMemoryProvider(MemoryProvider):
"""Schedule *coro* on the shared loop using the configured timeout."""
return _run_sync(coro, timeout=self._timeout)
def _is_retriable_embedded_connection_error(self, exc: Exception) -> bool:
"""Return True for stale embedded-daemon connection failures."""
if self._mode != "local_embedded":
return False
text = f"{type(exc).__name__}: {exc}".lower()
return any(
marker in text
for marker in (
"cannot connect to host",
"connection refused",
"connect call failed",
"clientconnectorerror",
)
)
def _run_hindsight_operation(self, operation):
"""Run an async Hindsight client operation, retrying once after idle shutdown."""
client = self._get_client()
try:
return self._run_sync(operation(client))
except Exception as exc:
if not self._is_retriable_embedded_connection_error(exc):
raise
logger.info(
"Hindsight embedded daemon appears unreachable; recreating client and retrying once: %s",
exc,
)
self._client = None
client = self._get_client()
self._client = client
return self._run_sync(operation(client))
def initialize(self, session_id: str, **kwargs) -> None:
self._session_id = str(session_id or "").strip()
self._parent_session_id = str(kwargs.get("parent_session_id", "") or "").strip()
@@ -871,14 +790,7 @@ class HindsightMemoryProvider(MemoryProvider):
self._session_turns = []
self._mode = self._config.get("mode", "cloud")
# Read timeout from config or env var, fall back to default
self._timeout = _parse_int_setting(
self._config.get("timeout") if self._config.get("timeout") is not None else os.environ.get("HINDSIGHT_TIMEOUT"),
_DEFAULT_TIMEOUT,
)
self._idle_timeout = _parse_int_setting(
self._config.get("idle_timeout") if self._config.get("idle_timeout") is not None else os.environ.get("HINDSIGHT_IDLE_TIMEOUT"),
_DEFAULT_IDLE_TIMEOUT,
)
self._timeout = self._config.get("timeout") or int(os.environ.get("HINDSIGHT_TIMEOUT", str(_DEFAULT_TIMEOUT)))
# "local" is a legacy alias for "local_embedded"
if self._mode == "local":
self._mode = "local_embedded"
@@ -1069,9 +981,10 @@ class HindsightMemoryProvider(MemoryProvider):
def _run():
try:
client = self._get_client()
if self._prefetch_method == "reflect":
logger.debug("Prefetch: calling reflect (bank=%s, query_len=%d)", self._bank_id, len(query))
resp = self._run_hindsight_operation(lambda client: client.areflect(bank_id=self._bank_id, query=query, budget=self._budget))
resp = self._run_sync(client.areflect(bank_id=self._bank_id, query=query, budget=self._budget))
text = resp.text or ""
else:
recall_kwargs: dict = {
@@ -1085,7 +998,7 @@ class HindsightMemoryProvider(MemoryProvider):
recall_kwargs["types"] = self._recall_types
logger.debug("Prefetch: calling recall (bank=%s, query_len=%d, budget=%s)",
self._bank_id, len(query), self._budget)
resp = self._run_hindsight_operation(lambda client: client.arecall(**recall_kwargs))
resp = self._run_sync(client.arecall(**recall_kwargs))
num_results = len(resp.results) if resp.results else 0
logger.debug("Prefetch: recall returned %d results", num_results)
text = "\n".join(f"- {r.text}" for r in resp.results if r.text) if resp.results else ""
@@ -1218,14 +1131,12 @@ class HindsightMemoryProvider(MemoryProvider):
item.pop("retain_async", None)
logger.debug("Hindsight retain: bank=%s, doc=%s, async=%s, content_len=%d, num_turns=%d",
self._bank_id, self._document_id, self._retain_async, len(content), len(self._session_turns))
self._run_hindsight_operation(
lambda client: client.aretain_batch(
bank_id=self._bank_id,
items=[item],
document_id=self._document_id,
retain_async=self._retain_async,
)
)
self._run_sync(client.aretain_batch(
bank_id=self._bank_id,
items=[item],
document_id=self._document_id,
retain_async=self._retain_async,
))
logger.debug("Hindsight retain succeeded")
except Exception as e:
logger.warning("Hindsight sync failed: %s", e, exc_info=True)
@@ -1241,6 +1152,12 @@ class HindsightMemoryProvider(MemoryProvider):
return [RETAIN_SCHEMA, RECALL_SCHEMA, REFLECT_SCHEMA]
def handle_tool_call(self, tool_name: str, args: dict, **kwargs) -> str:
try:
client = self._get_client()
except Exception as e:
logger.warning("Hindsight client init failed: %s", e)
return tool_error(f"Hindsight client unavailable: {e}")
if tool_name == "hindsight_retain":
content = args.get("content", "")
if not content:
@@ -1254,7 +1171,7 @@ class HindsightMemoryProvider(MemoryProvider):
)
logger.debug("Tool hindsight_retain: bank=%s, content_len=%d, context=%s",
self._bank_id, len(content), context)
self._run_hindsight_operation(lambda client: client.aretain(**retain_kwargs))
self._run_sync(client.aretain(**retain_kwargs))
logger.debug("Tool hindsight_retain: success")
return json.dumps({"result": "Memory stored successfully."})
except Exception as e:
@@ -1277,7 +1194,7 @@ class HindsightMemoryProvider(MemoryProvider):
recall_kwargs["types"] = self._recall_types
logger.debug("Tool hindsight_recall: bank=%s, query_len=%d, budget=%s",
self._bank_id, len(query), self._budget)
resp = self._run_hindsight_operation(lambda client: client.arecall(**recall_kwargs))
resp = self._run_sync(client.arecall(**recall_kwargs))
num_results = len(resp.results) if resp.results else 0
logger.debug("Tool hindsight_recall: %d results", num_results)
if not resp.results:
@@ -1295,11 +1212,9 @@ class HindsightMemoryProvider(MemoryProvider):
try:
logger.debug("Tool hindsight_reflect: bank=%s, query_len=%d, budget=%s",
self._bank_id, len(query), self._budget)
resp = self._run_hindsight_operation(
lambda client: client.areflect(
bank_id=self._bank_id, query=query, budget=self._budget
)
)
resp = self._run_sync(client.areflect(
bank_id=self._bank_id, query=query, budget=self._budget
))
logger.debug("Tool hindsight_reflect: response_len=%d", len(resp.text or ""))
return json.dumps({"result": resp.text or "No relevant memories found."})
except Exception as e:
@@ -1316,19 +1231,9 @@ class HindsightMemoryProvider(MemoryProvider):
if self._client is not None:
try:
if self._mode == "local_embedded":
# HindsightEmbedded.close() delegates to its sync client.close().
# When Hermes created/used that client on the shared async loop,
# closing it from this thread can raise "attached to a different
# loop" before aiohttp releases the session. Close the embedded
# inner async client on the shared loop first, then let the
# wrapper clean up daemon/UI bookkeeping.
inner_client = getattr(self._client, "_client", None)
if inner_client is not None and hasattr(inner_client, "aclose"):
_run_sync(inner_client.aclose())
try:
self._client._client = None
except Exception:
pass
# Use the public close() API. The RuntimeError from
# aiohttp's "attached to a different loop" is expected
# and harmless — the daemon keeps running independently.
try:
self._client.close()
except RuntimeError:
+10 -28
View File
@@ -86,6 +86,7 @@ from agent.error_classifier import classify_api_error, FailoverReason
from agent.prompt_builder import (
DEFAULT_AGENT_IDENTITY, PLATFORM_HINTS,
MEMORY_GUIDANCE, SESSION_SEARCH_GUIDANCE, SKILLS_GUIDANCE,
KANBAN_GUIDANCE,
build_nous_subscription_prompt,
)
from agent.model_metadata import (
@@ -3304,19 +3305,10 @@ class AIAgent:
logger.warning("Background memory/skill review failed: %s", e)
self._emit_auxiliary_failure("background review", e)
finally:
# Background review agents can initialize memory providers
# (for example Hindsight) that own their own network clients.
# Explicitly stop those providers before closing the agent so
# their aiohttp sessions do not leak until GC/process exit.
# Then close all remaining resources (httpx client,
# subprocesses, etc.) so GC doesn't try to clean them up on a
# dead asyncio event loop (which produces "Event loop is
# closed" errors).
# Close all resources (httpx client, subprocesses, etc.) so
# GC doesn't try to clean them up on a dead asyncio event
# loop (which produces "Event loop is closed" errors).
if review_agent is not None:
try:
review_agent.shutdown_memory_provider()
except Exception:
pass
try:
review_agent.close()
except Exception:
@@ -4506,6 +4498,12 @@ class AIAgent:
tool_guidance.append(SESSION_SEARCH_GUIDANCE)
if "skill_manage" in self.valid_tool_names:
tool_guidance.append(SKILLS_GUIDANCE)
# Kanban worker/orchestrator lifecycle — only present when the
# dispatcher spawned this process (kanban_show check_fn gates on
# HERMES_KANBAN_TASK env var). Normal chat sessions never see
# this block.
if "kanban_show" in self.valid_tool_names:
tool_guidance.append(KANBAN_GUIDANCE)
if tool_guidance:
prompt_parts.append(" ".join(tool_guidance))
@@ -8160,22 +8158,6 @@ class AIAgent:
except Exception as e:
logger.warning("Session DB compression split failed — new session will NOT be indexed: %s", e)
# Notify the context engine that the session_id rotated because of
# compression (not a fresh /new). Plugin engines (e.g. hermes-lcm) use
# boundary_reason="compression" to preserve DAG lineage across the
# rollover instead of re-initializing fresh per-session state.
# See hermes-lcm#68. Built-in ContextCompressor ignores kwargs.
try:
_old_sid = locals().get("old_session_id")
if _old_sid and hasattr(self.context_compressor, "on_session_start"):
self.context_compressor.on_session_start(
self.session_id or "",
boundary_reason="compression",
old_session_id=_old_sid,
)
except Exception as _ce_err:
logger.debug("context engine on_session_start (compression): %s", _ce_err)
# Warn on repeated compressions (quality degrades with each pass)
_cc = self.context_compressor.compression_count
if _cc >= 2:
+2 -29
View File
@@ -1055,37 +1055,10 @@ setup_path() {
return 0
fi
# FHS layout: /usr/local/bin is normally on PATH for login shells (via
# /etc/profile pathmunge), but on RHEL/CentOS/Rocky/Alma 8+ non-login
# interactive root shells (su, sudo -s, tmux panes, some web terminals)
# only source /etc/bashrc, which does NOT add /usr/local/bin — and
# /root/.bash_profile doesn't either. So verify with `command -v` and
# fall back to writing a PATH guard into /root/.bashrc when needed.
# FHS layout: /usr/local/bin is on PATH for every standard shell, nothing to inject.
if [ "$ROOT_FHS_LAYOUT" = true ]; then
export PATH="$command_link_dir:$PATH"
# Probe a fresh non-login interactive bash the way the user will use it.
# `bash -i -c` sources ~/.bashrc but NOT ~/.bash_profile or /etc/profile,
# which is the exact scenario where RHEL root loses /usr/local/bin.
if env -i HOME="$HOME" TERM="${TERM:-dumb}" bash -i -c 'command -v hermes' \
>/dev/null 2>&1; then
log_info "/usr/local/bin is already on PATH for all shells"
log_success "hermes command ready"
return 0
fi
log_info "hermes not on PATH in non-login shells (common on RHEL-family)"
PATH_LINE='export PATH="/usr/local/bin:$PATH"'
PATH_COMMENT='# Hermes Agent — ensure /usr/local/bin is on PATH (RHEL non-login shells)'
for SHELL_CONFIG in "$HOME/.bashrc" "$HOME/.bash_profile"; do
[ -f "$SHELL_CONFIG" ] || continue
if ! grep -v '^[[:space:]]*#' "$SHELL_CONFIG" 2>/dev/null \
| grep -qE 'PATH=.*(/usr/local/bin|\$command_link_dir)'; then
echo "" >> "$SHELL_CONFIG"
echo "$PATH_COMMENT" >> "$SHELL_CONFIG"
echo "$PATH_LINE" >> "$SHELL_CONFIG"
log_success "Added /usr/local/bin to PATH in $SHELL_CONFIG"
fi
done
log_info "/usr/local/bin is already on PATH for all shells"
log_success "hermes command ready"
return 0
fi
-614
View File
@@ -1,614 +0,0 @@
#!/usr/bin/env python3
"""Drive the Hermes TUI under HERMES_DEV_PERF and summarize the pipeline.
Usage:
scripts/profile-tui.py [--session SID] [--hold KEY] [--seconds N] [--rate HZ]
Defaults: picks the session with the most messages, holds PageUp for 8s at
~30 Hz (matching xterm key-repeat), summarizes ~/.hermes/perf.log on exit.
The --tui build must exist (run `npm run build` in ui-tui first). This script
launches `node dist/entry.js` directly with HERMES_TUI_RESUME set so it
bypasses the hermes_cli wrapper we want repeatable timing, not the CLI's
session-picker flow.
Environment overrides:
HERMES_PERF_LOG (default ~/.hermes/perf.log)
HERMES_PERF_NODE (default node from $PATH)
HERMES_TUI_DIR (default /home/bb/hermes-agent/ui-tui)
Exit code is 0 if the harness ran and parsed results, 2 if the TUI crashed
or produced no perf data (suggests HERMES_DEV_PERF wiring is broken).
"""
from __future__ import annotations
import argparse
import json
import os
import pty
import select
import signal
import sqlite3
import sys
import time
from pathlib import Path
from typing import Any
DEFAULT_TUI_DIR = Path(os.environ.get("HERMES_TUI_DIR", "/home/bb/hermes-agent/ui-tui"))
DEFAULT_LOG = Path(os.environ.get("HERMES_PERF_LOG", str(Path.home() / ".hermes" / "perf.log")))
DEFAULT_STATE_DB = Path.home() / ".hermes" / "state.db"
# Keystroke escape sequences. Matches what xterm/VT220 send when the
# terminal has bracketed-paste disabled and the key-repeat handler fires.
KEYS = {
"page_up": b"\x1b[5~",
"page_down": b"\x1b[6~",
"wheel_up": b"\x1b[M`!!", # mouse wheel up (SGR-less) — best-effort
"shift_up": b"\x1b[1;2A",
"shift_down": b"\x1b[1;2B",
}
def pick_longest_session(db: Path) -> str:
conn = sqlite3.connect(db)
row = conn.execute(
"SELECT id FROM sessions s ORDER BY "
"(SELECT COUNT(*) FROM messages m WHERE m.session_id = s.id) DESC LIMIT 1"
).fetchone()
if not row:
sys.exit(f"no sessions in {db}")
return row[0]
def drain(fd: int, timeout: float) -> bytes:
"""Read whatever's available from fd within `timeout`, then return."""
chunks = []
end = time.monotonic() + timeout
while time.monotonic() < end:
r, _, _ = select.select([fd], [], [], max(0.0, end - time.monotonic()))
if not r:
break
try:
data = os.read(fd, 4096)
except OSError:
break
if not data:
break
chunks.append(data)
return b"".join(chunks)
def hold_key(fd: int, seq: bytes, seconds: float, rate_hz: int) -> int:
"""Write `seq` to fd at ~rate_hz for `seconds`. Returns keystrokes sent."""
interval = 1.0 / max(1, rate_hz)
end = time.monotonic() + seconds
sent = 0
while time.monotonic() < end:
try:
os.write(fd, seq)
sent += 1
except OSError:
break
# Drain stdout to keep the PTY buffer flowing; ignore content.
drain(fd, 0)
time.sleep(interval)
return sent
def summarize(log: Path, since_ts_ms: int) -> dict[str, Any]:
"""Parse perf.log, keep only events newer than since_ts_ms, return stats."""
react_events: list[dict[str, Any]] = []
frame_events: list[dict[str, Any]] = []
if not log.exists():
return {"error": f"no log at {log}", "react": [], "frame": []}
for line in log.read_text().splitlines():
line = line.strip()
if not line:
continue
try:
row = json.loads(line)
except json.JSONDecodeError:
continue
if int(row.get("ts", 0)) < since_ts_ms:
continue
src = row.get("src")
if src == "react":
react_events.append(row)
elif src == "frame":
frame_events.append(row)
return {
"react": react_events,
"frame": frame_events,
}
def pct(values: list[float], p: float) -> float:
if not values:
return 0.0
s = sorted(values)
idx = min(len(s) - 1, int(len(s) * p))
return s[idx]
def format_report(data: dict[str, Any]) -> str:
react = data.get("react") or []
frames = data.get("frame") or []
out = []
out.append("═══ React Profiler ═══")
if not react:
out.append(" (no react events — HERMES_DEV_PERF wired? threshold too high?)")
else:
by_id: dict[str, list[float]] = {}
for r in react:
by_id.setdefault(r["id"], []).append(r["actualMs"])
out.append(f" {'pane':<14} {'count':>6} {'p50':>8} {'p95':>8} {'p99':>8} {'max':>8}")
for pid, ms in sorted(by_id.items(), key=lambda kv: -pct(kv[1], 0.99)):
out.append(
f" {pid:<14} {len(ms):>6} {pct(ms,0.50):>8.2f} {pct(ms,0.95):>8.2f} "
f"{pct(ms,0.99):>8.2f} {max(ms):>8.2f}"
)
out.append("")
out.append("═══ Ink pipeline ═══")
if not frames:
out.append(" (no frame events — onFrame wiring broken?)")
else:
dur = [f["durationMs"] for f in frames]
phases_present = any(f.get("phases") for f in frames)
out.append(f" frames captured: {len(frames)}")
out.append(
f" durationMs p50={pct(dur,0.50):.2f} p95={pct(dur,0.95):.2f} "
f"p99={pct(dur,0.99):.2f} max={max(dur):.2f}"
)
# Effective FPS during the run: frames / elapsed seconds.
ts = sorted(f["ts"] for f in frames)
if len(ts) >= 2:
elapsed_s = (ts[-1] - ts[0]) / 1000.0
fps = len(frames) / elapsed_s if elapsed_s > 0 else float("inf")
out.append(f" throughput: {len(frames)} frames / {elapsed_s:.2f}s = {fps:.1f} fps")
if phases_present:
fields = ["yoga", "renderer", "diff", "optimize", "write", "commit"]
out.append("")
out.append(f" {'phase':<10} {'p50':>8} {'p95':>8} {'p99':>8} {'max':>8} (ms)")
for field in fields:
vals = [f["phases"][field] for f in frames if f.get("phases")]
if vals:
out.append(
f" {field:<10} {pct(vals,0.50):>8.2f} {pct(vals,0.95):>8.2f} "
f"{pct(vals,0.99):>8.2f} {max(vals):>8.2f}"
)
# Derived: sum of phases vs durationMs (reveals hidden time).
sum_ps = [
sum(f["phases"][k] for k in fields)
for f in frames if f.get("phases")
]
if sum_ps:
dur_match = [f["durationMs"] for f in frames if f.get("phases")]
deltas = [d - s for d, s in zip(dur_match, sum_ps)]
out.append(
f" {'dur-Σphases':<10} {pct(deltas,0.50):>8.2f} {pct(deltas,0.95):>8.2f} "
f"{pct(deltas,0.99):>8.2f} {max(deltas):>8.2f} (unaccounted-for time)"
)
# Yoga counters
visited = [f["phases"]["yogaVisited"] for f in frames if f.get("phases")]
measured = [f["phases"]["yogaMeasured"] for f in frames if f.get("phases")]
cache_hits = [f["phases"]["yogaCacheHits"] for f in frames if f.get("phases")]
live = [f["phases"]["yogaLive"] for f in frames if f.get("phases")]
out.append("")
out.append(" Yoga counters (per frame):")
for name, vals in (
("visited", visited),
("measured", measured),
("cacheHits", cache_hits),
("live", live),
):
if vals:
out.append(f" {name:<11} p50={pct(vals,0.5):.0f} p99={pct(vals,0.99):.0f} max={max(vals)}")
# Patch counts — proxy for "how much changed each frame"
patches = [f["phases"]["patches"] for f in frames if f.get("phases")]
if patches:
out.append(
f" patches p50={pct(patches,0.5):.0f} p99={pct(patches,0.99):.0f} "
f"max={max(patches)} total={sum(patches)}"
)
optimized = [
f["phases"].get("optimizedPatches", 0)
for f in frames if f.get("phases")
]
if any(optimized):
out.append(
f" optimized p50={pct(optimized,0.5):.0f} p99={pct(optimized,0.99):.0f} "
f"max={max(optimized)} total={sum(optimized)}"
f" (ratio: {sum(optimized)/max(1,sum(patches)):.2f})"
)
# Write bytes + drain telemetry — the outer-terminal bottleneck gauge.
bytes_written = [
f["phases"].get("writeBytes", 0)
for f in frames if f.get("phases")
]
if any(bytes_written):
total_b = sum(bytes_written)
kb = total_b / 1024
out.append(
f" writeBytes p50={pct(bytes_written,0.5):.0f}B p99={pct(bytes_written,0.99):.0f}B "
f"max={max(bytes_written)}B total={kb:.1f}KB"
)
drains = [
f["phases"].get("prevFrameDrainMs", 0)
for f in frames if f.get("phases")
]
if any(d > 0 for d in drains):
nonzero = [d for d in drains if d > 0]
out.append(
f" drainMs p50={pct(nonzero,0.5):.2f} p95={pct(nonzero,0.95):.2f} "
f"p99={pct(nonzero,0.99):.2f} max={max(nonzero):.2f} (terminal flush latency)"
)
backpressure = sum(1 for f in frames if f.get("phases", {}).get("backpressure"))
if backpressure:
out.append(
f" backpressure: {backpressure}/{len(frames)} frames "
f"({100*backpressure/len(frames):.0f}%) (Node stdout buffer full — terminal slow)"
)
# Flickers
flicker_frames = [f for f in frames if f.get("flickers")]
if flicker_frames:
out.append("")
out.append(f" ⚠ flickers detected in {len(flicker_frames)} frames")
reasons: dict[str, int] = {}
for f in flicker_frames:
for fl in f["flickers"]:
reasons[fl["reason"]] = reasons.get(fl["reason"], 0) + 1
for reason, n in sorted(reasons.items(), key=lambda kv: -kv[1]):
out.append(f" {reason}: {n}")
return "\n".join(out)
def key_metrics(data: dict[str, Any]) -> dict[str, float]:
"""Flatten the report into a dict of scalar metrics for A/B diffing."""
metrics: dict[str, float] = {}
frames = data.get("frame") or []
react = data.get("react") or []
if frames:
durs = [f["durationMs"] for f in frames]
metrics["frames"] = len(frames)
metrics["dur_p50"] = pct(durs, 0.50)
metrics["dur_p95"] = pct(durs, 0.95)
metrics["dur_p99"] = pct(durs, 0.99)
metrics["dur_max"] = max(durs)
ts = sorted(f["ts"] for f in frames)
if len(ts) >= 2:
elapsed = (ts[-1] - ts[0]) / 1000.0
metrics["fps_throughput"] = len(frames) / elapsed if elapsed > 0 else 0.0
# Interframe gaps distribution — complementary view to throughput:
gaps = [ts[i] - ts[i - 1] for i in range(1, len(ts))]
if gaps:
metrics["gap_p50_ms"] = pct(gaps, 0.50)
metrics["gap_p99_ms"] = pct(gaps, 0.99)
metrics["gaps_under_16ms"] = sum(1 for g in gaps if g < 16)
metrics["gaps_over_200ms"] = sum(1 for g in gaps if g >= 200)
for phase in ("renderer", "yoga", "diff", "write"):
vals = [f["phases"][phase] for f in frames if f.get("phases")]
if vals:
metrics[f"{phase}_p99"] = pct(vals, 0.99)
metrics[f"{phase}_max"] = max(vals)
patches = [f["phases"]["patches"] for f in frames if f.get("phases")]
if patches:
metrics["patches_total"] = sum(patches)
metrics["patches_p99"] = pct(patches, 0.99)
optimized = [
f["phases"].get("optimizedPatches", 0) for f in frames if f.get("phases")
]
if any(optimized):
metrics["optimized_total"] = sum(optimized)
bytes_list = [
f["phases"].get("writeBytes", 0) for f in frames if f.get("phases")
]
if any(bytes_list):
metrics["writeBytes_total"] = sum(bytes_list)
drains = [
f["phases"].get("prevFrameDrainMs", 0)
for f in frames if f.get("phases")
]
drain_nonzero = [d for d in drains if d > 0]
if drain_nonzero:
metrics["drain_p99"] = pct(drain_nonzero, 0.99)
metrics["drain_max"] = max(drain_nonzero)
bp = sum(1 for f in frames if f.get("phases", {}).get("backpressure"))
metrics["backpressure_frames"] = bp
if react:
for pid in set(e["id"] for e in react):
ms = [e["actualMs"] for e in react if e["id"] == pid]
metrics[f"react_{pid}_p99"] = pct(ms, 0.99)
metrics[f"react_{pid}_max"] = max(ms)
return metrics
def format_diff(before: dict[str, float], after: dict[str, float]) -> str:
"""Render a side-by-side A/B comparison table."""
keys = sorted(set(before) | set(after))
lines = [f"{'metric':<28} {'before':>12} {'after':>12} {'delta':>12} {'%':>6}"]
lines.append("" * 76)
for k in keys:
b = before.get(k, 0.0)
a = after.get(k, 0.0)
d = a - b
pct_change = ((a / b) - 1) * 100 if b not in (0, 0.0) else float("inf") if a else 0
# Flag improvements vs regressions. For _p99 / _max / _total / gaps_over /
# patches / writeBytes / backpressure, LOWER is better. For fps / gaps_under,
# HIGHER is better.
lower_is_better = any(
token in k
for token in (
"p50",
"p95",
"p99",
"_max",
"_total",
"gaps_over",
"backpressure",
"drain",
)
)
higher_is_better = "fps_" in k or "gaps_under" in k
mark = ""
if d and not (lower_is_better or higher_is_better):
mark = ""
elif d < 0 and lower_is_better:
mark = ""
elif d > 0 and higher_is_better:
mark = ""
elif d > 0 and lower_is_better:
mark = "" # regression
elif d < 0 and higher_is_better:
mark = "" # regression
pct_str = "" if pct_change == float("inf") else f"{pct_change:+6.1f}%"
lines.append(
f"{k:<28} {b:>12.2f} {a:>12.2f} {d:>+12.2f} {pct_str} {mark}"
)
return "\n".join(lines)
def run_once(args: argparse.Namespace) -> dict[str, Any]:
tui_dir = Path(args.tui_dir).resolve()
entry = tui_dir / "dist" / "entry.js"
if not entry.exists():
sys.exit(f"{entry} missing — run `npm run build` in {tui_dir} first")
sid = args.session or pick_longest_session(DEFAULT_STATE_DB)
print(f"• session: {sid}")
print(f"• hold: {args.hold} x {args.rate}Hz for {args.seconds}s after {args.warmup}s warmup")
print(f"• terminal: {args.cols}x{args.rows}")
log = Path(args.log)
if not args.keep_log and log.exists():
log.unlink()
since_ms = int(time.time() * 1000)
env = os.environ.copy()
env["HERMES_DEV_PERF"] = "1"
env["HERMES_DEV_PERF_MS"] = str(args.threshold_ms)
env["HERMES_DEV_PERF_LOG"] = str(log)
env["HERMES_TUI_RESUME"] = sid
env["COLUMNS"] = str(args.cols)
env["LINES"] = str(args.rows)
env["TERM"] = env.get("TERM", "xterm-256color")
# Pass through extra flags the TUI wrapper recognizes (e.g. --no-fullscreen).
# Stored on args as `extra_flags` list.
node = os.environ.get("HERMES_PERF_NODE", "node")
node_args = [node, str(entry), *getattr(args, "extra_flags", [])]
pid, fd = pty.fork()
if pid == 0:
os.execvpe(node, node_args, env)
try:
import fcntl, struct, termios
winsize = struct.pack("HHHH", args.rows, args.cols, 0, 0)
fcntl.ioctl(fd, termios.TIOCSWINSZ, winsize)
print(f"• pid: {pid} fd: {fd}")
print(f"• warmup {args.warmup}s (drain startup output)…")
drain(fd, args.warmup)
print(f"• holding {args.hold}")
sent = hold_key(fd, KEYS[args.hold], args.seconds, args.rate)
print(f" sent {sent} keystrokes")
drain(fd, 0.5)
finally:
try:
os.kill(pid, signal.SIGTERM)
for _ in range(10):
pid_done, _ = os.waitpid(pid, os.WNOHANG)
if pid_done == pid:
break
time.sleep(0.1)
else:
os.kill(pid, signal.SIGKILL)
os.waitpid(pid, 0)
except (ProcessLookupError, ChildProcessError):
pass
try:
os.close(fd)
except OSError:
pass
time.sleep(0.2)
return summarize(log, since_ms)
def main() -> int:
p = argparse.ArgumentParser()
p.add_argument("--session", help="session id to resume (default: longest in db)")
p.add_argument("--hold", default="page_up", choices=sorted(KEYS.keys()), help="key to hold")
p.add_argument("--seconds", type=float, default=8.0, help="how long to hold the key")
p.add_argument("--rate", type=int, default=30, help="keystrokes per second")
p.add_argument("--warmup", type=float, default=3.0, help="seconds to wait after launch before input")
p.add_argument("--threshold-ms", type=float, default=0.0, help="HERMES_DEV_PERF_MS (0 = capture all)")
p.add_argument("--cols", type=int, default=120)
p.add_argument("--rows", type=int, default=40)
p.add_argument("--keep-log", action="store_true", help="don't wipe perf.log before run")
p.add_argument("--tui-dir", default=str(DEFAULT_TUI_DIR))
p.add_argument("--log", default=str(DEFAULT_LOG))
p.add_argument("--save", metavar="LABEL",
help="save the final metrics as /tmp/perf-<LABEL>.json for later --compare")
p.add_argument("--compare", metavar="LABEL",
help="diff against /tmp/perf-<LABEL>.json after running")
p.add_argument("--loop", action="store_true",
help="watch for source changes, rebuild, rerun, and diff vs previous run")
p.add_argument("--extra-flag", dest="extra_flags", action="append", default=[],
help="pass through to node dist/entry.js (repeatable)")
args = p.parse_args()
if args.loop:
return loop_mode(args)
# Single-shot path.
data = run_once(args)
print()
print(format_report(data))
metrics = key_metrics(data)
if args.save:
path = Path(f"/tmp/perf-{args.save}.json")
path.write_text(json.dumps(metrics, indent=2))
print(f"\n• saved: {path}")
if args.compare:
path = Path(f"/tmp/perf-{args.compare}.json")
if not path.exists():
print(f"\n⚠ no baseline at {path} — run with --save {args.compare} first")
else:
before = json.loads(path.read_text())
print(f"\n═══ A/B diff vs /tmp/perf-{args.compare}.json ═══")
print(format_diff(before, metrics))
if not data["react"] and not data["frame"]:
return 2
return 0
def loop_mode(args: argparse.Namespace) -> int:
"""Watch source files, rebuild, rerun, print A/B diff against previous run.
Keeps a rolling 'previous run' baseline in memory so each iteration
reports delta vs the last one visibility into whether the last
edit moved the needle. Press Ctrl+C to stop.
"""
import subprocess
tui_dir = Path(args.tui_dir).resolve()
src_root = tui_dir / "src"
pkg_root = tui_dir / "packages" / "hermes-ink" / "src"
def collect_mtimes() -> dict[str, float]:
mtimes: dict[str, float] = {}
for root in (src_root, pkg_root):
if not root.exists():
continue
for path in root.rglob("*"):
if path.suffix in {".ts", ".tsx"} and "__tests__" not in str(path):
try:
mtimes[str(path)] = path.stat().st_mtime
except OSError:
pass
return mtimes
previous_metrics: dict[str, float] | None = None
previous_mtimes = collect_mtimes()
iteration = 0
print(f"• loop mode — watching {src_root} + {pkg_root} for *.ts(x) changes")
print("• edit any TS file, the harness rebuilds + reruns automatically")
print("• Ctrl+C to stop\n")
try:
while True:
iteration += 1
print(f"\n{'' * 76}")
print(f"Iteration {iteration} @ {time.strftime('%H:%M:%S')}")
print("" * 76)
if iteration > 1:
print("• rebuilding…")
result = subprocess.run(
["npm", "run", "build"],
cwd=tui_dir,
capture_output=True,
text=True,
)
if result.returncode != 0:
print("✗ build failed:")
print(result.stdout[-2000:])
print(result.stderr[-2000:])
print("\n• waiting for source changes to retry…")
previous_mtimes = wait_for_change(previous_mtimes, collect_mtimes)
continue
print("✓ build ok")
data = run_once(args)
metrics = key_metrics(data)
print()
print(format_report(data))
if previous_metrics is not None:
print(f"\n═══ A/B diff vs iteration {iteration - 1} ═══")
print(format_diff(previous_metrics, metrics))
previous_metrics = metrics
print("\n• waiting for source changes…")
previous_mtimes = wait_for_change(previous_mtimes, collect_mtimes)
except KeyboardInterrupt:
print("\n• loop stopped")
return 0
def wait_for_change(prev: dict[str, float], collect) -> dict[str, float]:
"""Poll every 1s until a watched file's mtime changes. Debounced 500ms."""
while True:
time.sleep(1)
current = collect()
changed = [
path for path, mtime in current.items() if prev.get(path) != mtime
]
if changed:
print(f"{len(changed)} file(s) changed:")
for path in changed[:5]:
print(f" {path}")
# Debounce — editor save bursts can take ~500ms to settle
time.sleep(0.5)
return collect()
if __name__ == "__main__":
sys.exit(main())
-34
View File
@@ -43,22 +43,16 @@ AUTHOR_MAP = {
"teknium1@gmail.com": "teknium1",
"teknium@nousresearch.com": "teknium1",
"127238744+teknium1@users.noreply.github.com": "teknium1",
"johnnncenaaa77@gmail.com": "johnncenae",
"focusflow.app.help@gmail.com": "yes999zc",
"343873859@qq.com": "DrStrangerUJN",
"uzmpsk.dilekakbas@gmail.com": "dlkakbs",
"jefferson@heimdallstrategy.com": "Mind-Dragon",
"130918800+devorun@users.noreply.github.com": "devorun",
"sonoyuncudmr@gmail.com": "Sonoyunchu",
"maks.mir@yahoo.com": "say8hi",
"web3blind@users.noreply.github.com": "web3blind",
"julia@alexland.us": "alexg0bot",
"1060770+benjaminsehl@users.noreply.github.com": "benjaminsehl",
"nerijusn76@gmail.com": "Nerijusas",
"itonov@proton.me": "Ito-69",
"glesstech@gmail.com": "georgeglessner",
"maxim.smetanin@gmail.com": "maxims-oss",
"yoimexex@gmail.com": "Yoimex",
# contributors (from noreply pattern)
"david.vv@icloud.com": "davidvv",
"wangqiang@wangqiangdeMac-mini.local": "xiaoqiang243",
@@ -73,12 +67,9 @@ AUTHOR_MAP = {
"thomasgeorgevii09@gmail.com": "tochukwuada",
"harryykyle1@gmail.com": "hharry11",
"kshitijk4poor@gmail.com": "kshitijk4poor",
"1294707+Tosko4@users.noreply.github.com": "Tosko4",
"keira.voss94@gmail.com": "keiravoss94",
"16443023+stablegenius49@users.noreply.github.com": "stablegenius49",
"fqsy1416@gmail.com": "EKKOLearnAI",
"octo-patch@github.com": "octo-patch",
"math0r-be@github.com": "math0r-be",
"simbamax99@gmail.com": "simbam99",
"iris@growthpillars.co": "irispillars",
"185121704+stablegenius49@users.noreply.github.com": "stablegenius49",
@@ -125,22 +116,9 @@ AUTHOR_MAP = {
"Mibayy@users.noreply.github.com": "Mibayy",
"mibayy@users.noreply.github.com": "Mibayy",
"135070653+sgaofen@users.noreply.github.com": "sgaofen",
"lzy.dev@gmail.com": "zhiyanliu",
"me@janstepanovsky.cz": "hhhonzik",
"139848623+hhuang91@users.noreply.github.com": "hhuang91",
"s.ozaki@ebinou.net": "Satoshi-agi",
"10774721+kunlabs@users.noreply.github.com": "kunlabs",
"110560187+Wang-tianhao@users.noreply.github.com": "Wang-tianhao",
"170458616+ghostmfr@users.noreply.github.com": "ghostmfr",
"1848670+mewwts@users.noreply.github.com": "mewwts",
"1930707+haru398801@users.noreply.github.com": "haru398801",
"rapabelias@gmail.com": "badgerbees",
"xnb888@proton.me": "xnbi",
"xiahu889889@proton.me": "xiahu88988",
"nocoo@users.noreply.github.com": "nocoo",
"30841158+n-WN@users.noreply.github.com": "n-WN",
"tsuijinglei@gmail.com": "hiddenpuppy",
"buraysandro9@gmail.com": "ygd58",
"jerome@clawwork.ai": "HiddenPuppy",
"jerome.benoit@sap.com": "jerome-benoit",
"wysie@users.noreply.github.com": "Wysie",
@@ -213,7 +191,6 @@ AUTHOR_MAP = {
"satelerd@gmail.com": "satelerd",
"dan@danlynn.com": "danklynn",
"mattmaximo@hotmail.com": "MattMaximo",
"MatthewRHardwick@gmail.com": "mrhwick",
"149063006+j3ffffff@users.noreply.github.com": "j3ffffff",
"A-FdL-Prog@users.noreply.github.com": "A-FdL-Prog",
"l0hde@users.noreply.github.com": "l0hde",
@@ -400,17 +377,6 @@ AUTHOR_MAP = {
"zzn+pa@zzn.im": "xinbenlv",
"zaynjarvis@gmail.com": "ZaynJarvis",
"zhiheng.liu@bytedance.com": "ZaynJarvis",
"izhaolongfei@gmail.com": "loongfay",
"296659110@qq.com": "lrt4836",
"fe.daniel91@gmail.com": "beforeload",
"libo1106@foxmail.com": "libo1106",
"295367131@qq.com": "295367131",
"295367132@qq.com": "IxAres",
"danieldliu@tencent.com": "danieldliu",
"loongzhao@tencent.com": "loongzhao",
"Bartok9@users.noreply.github.com": "Bartok9",
"LeonSGP43@users.noreply.github.com": "LeonSGP43",
"kshitijk4poor@users.noreply.github.com": "kshitijk4poor",
"mbelleau@Michels-MacBook-Pro.local": "malaiwah",
"michel.belleau@malaiwah.com": "malaiwah",
"gnanasekaran.sekareee@gmail.com": "gnanam1990",
@@ -402,63 +402,6 @@ Tool changes take effect on `/reset` (new session). They do NOT apply mid-conver
---
## Security & Privacy Toggles
Common "why is Hermes doing X to my output / tool calls / commands?" toggles — and the exact commands to change them. Most of these need a fresh session (`/reset` in chat, or start a new `hermes` invocation) because they're read once at startup.
### Secret redaction in tool output
Hermes auto-redacts strings that look like API keys, tokens, and secrets in all tool output (terminal stdout, `read_file`, web content, subagent summaries, etc.) so the model never sees raw credentials. If the user is intentionally working with mock tokens, share-management tokens, or their own secrets and the redaction is getting in the way:
```bash
hermes config set security.redact_secrets false # disable globally
```
**Restart required.** `security.redact_secrets` is snapshotted at import time — setting it mid-session (e.g. via `export HERMES_REDACT_SECRETS=false` from a tool call) will NOT take effect for the running process. Tell the user to run `hermes config set security.redact_secrets false` in a terminal, then start a new session. This is deliberate — it prevents an LLM from turning off redaction on itself mid-task.
Re-enable with:
```bash
hermes config set security.redact_secrets true
```
### PII redaction in gateway messages
Separate from secret redaction. When enabled, the gateway hashes user IDs and strips phone numbers from the session context before it reaches the model:
```bash
hermes config set privacy.redact_pii true # enable
hermes config set privacy.redact_pii false # disable (default)
```
### Command approval prompts
By default (`approvals.mode: manual`), Hermes prompts the user before running shell commands flagged as destructive (`rm -rf`, `git reset --hard`, etc.). The modes are:
- `manual` — always prompt (default)
- `smart` — use an auxiliary LLM to auto-approve low-risk commands, prompt on high-risk
- `off` — skip all approval prompts (equivalent to `--yolo`)
```bash
hermes config set approvals.mode smart # recommended middle ground
hermes config set approvals.mode off # bypass everything (not recommended)
```
Per-invocation bypass without changing config:
- `hermes --yolo …`
- `export HERMES_YOLO_MODE=1`
Note: YOLO / `approvals.mode: off` does NOT turn off secret redaction. They are independent.
### Shell hooks allowlist
Some shell-hook integrations require explicit allowlisting before they fire. Managed via `~/.hermes/shell-hooks-allowlist.json` — prompted interactively the first time a hook wants to run.
### Disabling the web/browser/image-gen tools
To keep the model away from network or media tools entirely, open `hermes tools` and toggle per-platform. Takes effect on next session (`/reset`). See the Tools & Skills section above.
---
## Voice & Transcription
### STT (Voice → Text)
+152
View File
@@ -0,0 +1,152 @@
---
name: kanban-orchestrator
description: Decomposition playbook + specialist-roster conventions + anti-temptation rules for an orchestrator profile routing work through Kanban. The "don't do the work yourself" rule and the basic lifecycle are auto-injected into every kanban worker's system prompt; this skill is the deeper playbook when you're specifically playing the orchestrator role.
version: 2.0.0
metadata:
hermes:
tags: [kanban, multi-agent, orchestration, routing]
related_skills: [kanban-worker]
---
# Kanban Orchestrator — Decomposition Playbook
> The **core worker lifecycle** (including the `kanban_create` fan-out pattern and the "decompose, don't execute" rule) is auto-injected into every kanban process via the `KANBAN_GUIDANCE` system-prompt block. This skill is the deeper playbook when you're an orchestrator profile whose whole job is routing.
## When to use the board (vs. just doing the work)
Create Kanban tasks when any of these are true:
1. **Multiple specialists are needed.** Research + analysis + writing is three profiles.
2. **The work should survive a crash or restart.** Long-running, recurring, or important.
3. **The user might want to interject.** Human-in-the-loop at any step.
4. **Multiple subtasks can run in parallel.** Fan-out for speed.
5. **Review / iteration is expected.** A reviewer profile loops on drafter output.
6. **The audit trail matters.** Board rows persist in SQLite forever.
If *none* of those apply — it's a small one-shot reasoning task — use `delegate_task` instead or answer the user directly.
## The anti-temptation rules
Your job description says "route, don't execute." The rules that enforce that:
- **Do not execute the work yourself.** Your restricted toolset usually doesn't even include terminal/file/code/web for implementation. If you find yourself "just fixing this quickly" — stop and create a task for the right specialist.
- **For any concrete task, create a Kanban task and assign it.** Every single time.
- **If no specialist fits, ask the user which profile to create.** Do not default to doing it yourself under "close enough."
- **Decompose, route, and summarize — that's the whole job.**
## The standard specialist roster (convention)
Unless the user's setup has customized profiles, assume these exist. Adjust to whatever the user actually has — ask if you're unsure.
| Profile | Does | Typical workspace |
|---|---|---|
| `researcher` | Reads sources, gathers facts, writes findings | `scratch` |
| `analyst` | Synthesizes, ranks, de-dupes. Consumes multiple `researcher` outputs | `scratch` |
| `writer` | Drafts prose in the user's voice | `scratch` or `dir:` into their Obsidian vault |
| `reviewer` | Reads output, leaves findings, gates approval | `scratch` |
| `backend-eng` | Writes server-side code | `worktree` |
| `frontend-eng` | Writes client-side code | `worktree` |
| `ops` | Runs scripts, manages services, handles deployments | `dir:` into ops scripts repo |
| `pm` | Writes specs, acceptance criteria | `scratch` |
## Decomposition playbook
### Step 1 — Understand the goal
Ask clarifying questions if the goal is ambiguous. Cheap to ask; expensive to spawn the wrong fleet.
### Step 2 — Sketch the task graph
Before creating anything, draft the graph out loud (in your response to the user). Example for "Analyze whether we should migrate to Postgres":
```
T1 researcher research: Postgres cost vs current
T2 researcher research: Postgres performance vs current
T3 analyst synthesize migration recommendation parents: T1, T2
T4 writer draft decision memo parents: T3
```
Show this to the user. Let them correct it before you create anything.
### Step 3 — Create tasks and link
```python
t1 = kanban_create(
title="research: Postgres cost vs current",
assignee="researcher",
body="Compare estimated infrastructure costs, migration costs, and ongoing ops costs over a 3-year window. Sources: AWS/GCP pricing, team time estimates, current Postgres bills from peers.",
tenant=os.environ.get("HERMES_TENANT"),
)["task_id"]
t2 = kanban_create(
title="research: Postgres performance vs current",
assignee="researcher",
body="Compare query latency, throughput, and scaling characteristics at our expected data volume (~500GB, 10k QPS peak). Sources: benchmark papers, public case studies, pgbench results if easy.",
)["task_id"]
t3 = kanban_create(
title="synthesize migration recommendation",
assignee="analyst",
body="Read the findings from T1 (cost) and T2 (performance). Produce a 1-page recommendation with explicit trade-offs and a go/no-go call.",
parents=[t1, t2],
)["task_id"]
t4 = kanban_create(
title="draft decision memo",
assignee="writer",
body="Turn the analyst's recommendation into a 2-page memo for the CTO. Match the tone of previous decision memos in the team's knowledge base.",
parents=[t3],
)["task_id"]
```
`parents=[...]` gates promotion — children stay in `todo` until every parent reaches `done`, then auto-promote to `ready`. No manual coordination needed; the dispatcher and dependency engine handle it.
### Step 4 — Complete your own task
If you were spawned as a task yourself (e.g. `planner` profile was assigned `T0: "investigate Postgres migration"`), mark it done with a summary of what you created:
```python
kanban_complete(
summary="decomposed into T1-T4: 2 researchers parallel, 1 analyst on their outputs, 1 writer on the recommendation",
metadata={
"task_graph": {
"T1": {"assignee": "researcher", "parents": []},
"T2": {"assignee": "researcher", "parents": []},
"T3": {"assignee": "analyst", "parents": ["T1", "T2"]},
"T4": {"assignee": "writer", "parents": ["T3"]},
},
},
)
```
### Step 5 — Report back to the user
Tell them what you created in plain prose:
> I've queued 4 tasks:
> - **T1** (researcher): cost comparison
> - **T2** (researcher): performance comparison, in parallel with T1
> - **T3** (analyst): synthesizes T1 + T2 into a recommendation
> - **T4** (writer): turns T3 into a CTO memo
>
> The dispatcher will pick up T1 and T2 now. T3 starts when both finish. You'll get a gateway ping when T4 completes. Use the dashboard or `hermes kanban tail <id>` to follow along.
## Common patterns
**Fan-out + fan-in (research → synthesize):** N `researcher` tasks with no parents, one `analyst` task with all of them as parents.
**Pipeline with gates:** `pm → backend-eng → reviewer`. Each stage's `parents=[previous_task]`. Reviewer blocks or completes; if reviewer blocks, the operator unblocks with feedback and respawns.
**Same-profile queue:** 50 tasks, all assigned to `translator`, no dependencies between them. Dispatcher serializes — translator processes them in priority order, accumulating experience in their own memory.
**Human-in-the-loop:** Any task can `kanban_block()` to wait for input. Dispatcher respawns after `/unblock`. The comment thread carries the full context.
## Pitfalls
**Reassignment vs. new task.** If a reviewer blocks with "needs changes," create a NEW task linked from the reviewer's task — don't re-run the same task with a stern look. The new task is assigned to the original implementer profile.
**Argument order for links.** `kanban_link(parent_id=..., child_id=...)` — parent first. Mixing them up demotes the wrong task to `todo`.
**Don't pre-create the whole graph if the shape depends on intermediate findings.** If T3's structure depends on what T1 and T2 find, let T3 exist as a "synthesize findings" task whose own first step is to read parent handoffs and plan the rest. Orchestrators can spawn orchestrators.
**Tenant inheritance.** If `HERMES_TENANT` is set in your env, pass `tenant=os.environ.get("HERMES_TENANT")` on every `kanban_create` call so child tasks stay in the same namespace.
+134
View File
@@ -0,0 +1,134 @@
---
name: kanban-worker
description: Pitfalls, examples, and edge cases for Hermes Kanban workers. The lifecycle itself is auto-injected into every worker's system prompt as KANBAN_GUIDANCE (from agent/prompt_builder.py); this skill is what you load when you want deeper detail on specific scenarios.
version: 2.0.0
metadata:
hermes:
tags: [kanban, multi-agent, collaboration, workflow, pitfalls]
related_skills: [kanban-orchestrator]
---
# Kanban Worker — Pitfalls and Examples
> You're seeing this skill because the Hermes Kanban dispatcher spawned you as a worker with `--skills kanban-worker` — it's loaded automatically for every dispatched worker. The **lifecycle** (6 steps: orient → work → heartbeat → block/complete) also lives in the `KANBAN_GUIDANCE` block that's auto-injected into your system prompt. This skill is the deeper detail: good handoff shapes, retry diagnostics, edge cases.
## Workspace handling
Your workspace kind determines how you should behave inside `$HERMES_KANBAN_WORKSPACE`:
| Kind | What it is | How to work |
|---|---|---|
| `scratch` | Fresh tmp dir, yours alone | Read/write freely; it gets GC'd when the task is archived. |
| `dir:<path>` | Shared persistent directory | Other runs will read what you write. Treat it like long-lived state. Path is guaranteed absolute (the kernel rejects relative paths). |
| `worktree` | Git worktree at the resolved path | If `.git` doesn't exist, run `git worktree add <path> <branch>` from the main repo first, then cd and work normally. Commit work here. |
## Tenant isolation
If `$HERMES_TENANT` is set, the task belongs to a tenant namespace. When reading or writing persistent memory, prefix memory entries with the tenant so context doesn't leak across tenants:
- Good: `business-a: Acme is our biggest customer`
- Bad (leaks): `Acme is our biggest customer`
## Good summary + metadata shapes
The `kanban_complete(summary=..., metadata=...)` handoff is how downstream workers read what you did. Patterns that work:
**Coding task:**
```python
kanban_complete(
summary="shipped rate limiter — token bucket, keys on user_id with IP fallback, 14 tests pass",
metadata={
"changed_files": ["rate_limiter.py", "tests/test_rate_limiter.py"],
"tests_run": 14,
"tests_passed": 14,
"decisions": ["user_id primary, IP fallback for unauthenticated requests"],
},
)
```
**Research task:**
```python
kanban_complete(
summary="3 competing libraries reviewed; vLLM wins on throughput, SGLang on latency, Tensorrt-LLM on memory efficiency",
metadata={
"sources_read": 12,
"recommendation": "vLLM",
"benchmarks": {"vllm": 1.0, "sglang": 0.87, "trtllm": 0.72},
},
)
```
**Review task:**
```python
kanban_complete(
summary="reviewed PR #123; 2 blocking issues found (SQL injection in /search, missing CSRF on /settings)",
metadata={
"pr_number": 123,
"findings": [
{"severity": "critical", "file": "api/search.py", "line": 42, "issue": "raw SQL concat"},
{"severity": "high", "file": "api/settings.py", "issue": "missing CSRF middleware"},
],
"approved": False,
},
)
```
Shape `metadata` so downstream parsers (reviewers, aggregators, schedulers) can use it without re-reading your prose.
## Block reasons that get answered fast
Bad: `"stuck"` — the human has no context.
Good: one sentence naming the specific decision you need. Leave longer context as a comment instead.
```python
kanban_comment(
task_id=os.environ["HERMES_KANBAN_TASK"],
body="Full context: I have user IPs from Cloudflare headers but some users are behind NATs with thousands of peers. Keying on IP alone causes false positives.",
)
kanban_block(reason="Rate limit key choice: IP (simple, NAT-unsafe) or user_id (requires auth, skips anonymous endpoints)?")
```
The block message is what appears in the dashboard / gateway notifier. The comment is the deeper context a human reads when they open the task.
## Heartbeats worth sending
Good heartbeats name progress: `"epoch 12/50, loss 0.31"`, `"scanned 1.2M/2.4M rows"`, `"uploaded 47/120 videos"`.
Bad heartbeats: `"still working"`, empty notes, sub-second intervals. Every few minutes max; skip entirely for tasks under ~2 minutes.
## Retry scenarios
If you open the task and `kanban_show` returns `runs: [...]` with one or more closed runs, you're a retry. The prior runs' `outcome` / `summary` / `error` tell you what didn't work. Don't repeat that path. Typical retry diagnostics:
- `outcome: "timed_out"` — the previous attempt hit `max_runtime_seconds`. You may need to chunk the work or shorten it.
- `outcome: "crashed"` — OOM or segfault. Reduce memory footprint.
- `outcome: "spawn_failed"` + `error: "..."` — usually a profile config issue (missing credential, bad PATH). Ask the human via `kanban_block` instead of retrying blindly.
- `outcome: "reclaimed"` + `summary: "task archived..."` — operator archived the task out from under the previous run; you probably shouldn't be running at all, check status carefully.
- `outcome: "blocked"` — a previous attempt blocked; the unblock comment should be in the thread by now.
## Do NOT
- Call `delegate_task` as a substitute for `kanban_create`. `delegate_task` is for short reasoning subtasks inside YOUR run; `kanban_create` is for cross-agent handoffs that outlive one API loop.
- Modify files outside `$HERMES_KANBAN_WORKSPACE` unless the task body says to.
- Create follow-up tasks assigned to yourself — assign to the right specialist.
- Complete a task you didn't actually finish. Block it instead.
## Pitfalls
**Task state can change between dispatch and your startup.** Between when the dispatcher claimed and when your process actually booted, the task may have been blocked, reassigned, or archived. Always `kanban_show` first. If it reports `blocked` or `archived`, stop — you shouldn't be running.
**Workspace may have stale artifacts.** Especially `dir:` and `worktree` workspaces can have files from previous runs. Read the comment thread — it usually explains why you're running again and what state the workspace is in.
**Don't rely on the CLI when the guidance is available.** The `kanban_*` tools work across all terminal backends (Docker, Modal, SSH). `hermes kanban <verb>` from your terminal tool will fail in containerized backends because the CLI isn't installed there. When in doubt, use the tool.
## CLI fallback (for scripting)
Every tool has a CLI equivalent for human operators and scripts:
- `kanban_show``hermes kanban show <id> --json`
- `kanban_complete``hermes kanban complete <id> --summary "..." --metadata '{...}'`
- `kanban_block``hermes kanban block <id> "reason"`
- `kanban_create``hermes kanban create "title" --assignee <profile> [--parent <id>]`
- etc.
Use the tools from inside an agent; the CLI exists for the human at the terminal.
+3
View File
@@ -0,0 +1,3 @@
---
description: Skills for monitoring, aggregating, and processing RSS feeds, blogs, and web content sources.
---
-228
View File
@@ -1,228 +0,0 @@
---
name: airtable
description: Airtable REST API via curl. Records CRUD, filters, upserts.
version: 1.1.0
author: community
license: MIT
prerequisites:
env_vars: [AIRTABLE_API_KEY]
commands: [curl]
metadata:
hermes:
tags: [Airtable, Productivity, Database, API]
homepage: https://airtable.com/developers/web/api/introduction
---
# Airtable — Bases, Tables & Records
Work with Airtable's REST API directly via `curl` using the `terminal` tool. No MCP server, no OAuth flow, no Python SDK — just `curl` and a personal access token.
## Prerequisites
1. Create a **Personal Access Token (PAT)** at https://airtable.com/create/tokens (tokens start with `pat...`).
2. Grant these scopes (minimum):
- `data.records:read` — read rows
- `data.records:write` — create / update / delete rows
- `schema.bases:read` — list bases and tables
3. **Important:** in the same token UI, add each base you want to access to the token's **Access** list. PATs are scoped per-base — a valid token on the wrong base returns `403`.
4. Store the token in `~/.hermes/.env` (or via `hermes setup`):
```
AIRTABLE_API_KEY=pat_your_token_here
```
> Note: legacy `key...` API keys were deprecated Feb 2024. Only PATs and OAuth tokens work now.
## API Basics
- **Endpoint:** `https://api.airtable.com/v0`
- **Auth header:** `Authorization: Bearer $AIRTABLE_API_KEY`
- **All requests** use JSON (`Content-Type: application/json` for any POST/PATCH/PUT body).
- **Object IDs:** bases `app...`, tables `tbl...`, records `rec...`, fields `fld...`. IDs never change; names can. Prefer IDs in automations.
- **Rate limit:** 5 requests/sec/base. `429` → back off. Burst on a single base will be throttled.
Base curl pattern:
```bash
curl -s "https://api.airtable.com/v0/$BASE_ID/$TABLE?maxRecords=5" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" | python3 -m json.tool
```
`-s` suppresses curl's progress bar — keep it set for every call so the tool output stays clean for Hermes. Pipe through `python3 -m json.tool` (always present) or `jq` (if installed) for readable JSON.
## Field Types (request body shapes)
| Field type | Write shape |
|---|---|
| Single line text | `"Name": "hello"` |
| Long text | `"Notes": "multi\nline"` |
| Number | `"Score": 42` |
| Checkbox | `"Done": true` |
| Single select | `"Status": "Todo"` (name must already exist unless `typecast: true`) |
| Multi-select | `"Tags": ["urgent", "bug"]` |
| Date | `"Due": "2026-04-01"` |
| DateTime (UTC) | `"At": "2026-04-01T14:30:00.000Z"` |
| URL / Email / Phone | `"Link": "https://…"` |
| Attachment | `"Files": [{"url": "https://…"}]` (Airtable fetches + rehosts) |
| Linked record | `"Owner": ["recXXXXXXXXXXXXXX"]` (array of record IDs) |
| User | `"AssignedTo": {"id": "usrXXXXXXXXXXXXXX"}` |
Pass `"typecast": true` at the top level of a create/update body to let Airtable auto-coerce values (e.g. create a new select option on the fly, convert `"42"``42`).
## Common Queries
### List bases the token can see
```bash
curl -s "https://api.airtable.com/v0/meta/bases" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" | python3 -m json.tool
```
### List tables + schema for a base
```bash
curl -s "https://api.airtable.com/v0/meta/bases/$BASE_ID/tables" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" | python3 -m json.tool
```
Use this BEFORE mutating — confirms exact field names and IDs, surfaces `options.choices` for select fields, and shows primary-field names.
### List records (first 10)
```bash
curl -s "https://api.airtable.com/v0/$BASE_ID/$TABLE?maxRecords=10" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" | python3 -m json.tool
```
### Get a single record
```bash
curl -s "https://api.airtable.com/v0/$BASE_ID/$TABLE/$RECORD_ID" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" | python3 -m json.tool
```
### Filter records (filterByFormula)
Airtable formulas must be URL-encoded. Let Python stdlib do it — never hand-encode:
```bash
FORMULA="{Status}='Todo'"
ENC=$(python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1], safe=""))' "$FORMULA")
curl -s "https://api.airtable.com/v0/$BASE_ID/$TABLE?filterByFormula=$ENC&maxRecords=20" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" | python3 -m json.tool
```
Useful formula patterns:
- Exact match: `{Email}='user@example.com'`
- Contains: `FIND('bug', LOWER({Title}))`
- Multiple conditions: `AND({Status}='Todo', {Priority}='High')`
- Or: `OR({Owner}='alice', {Owner}='bob')`
- Not empty: `NOT({Assignee}='')`
- Date comparison: `IS_AFTER({Due}, TODAY())`
### Sort + select specific fields
```bash
curl -s "https://api.airtable.com/v0/$BASE_ID/$TABLE?sort%5B0%5D%5Bfield%5D=Priority&sort%5B0%5D%5Bdirection%5D=asc&fields%5B%5D=Name&fields%5B%5D=Status" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" | python3 -m json.tool
```
Square brackets in query params MUST be URL-encoded (`%5B` / `%5D`).
### Use a named view
```bash
curl -s "https://api.airtable.com/v0/$BASE_ID/$TABLE?view=Grid%20view&maxRecords=50" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" | python3 -m json.tool
```
Views apply their saved filter + sort server-side.
## Common Mutations
### Create a record
```bash
curl -s -X POST "https://api.airtable.com/v0/$BASE_ID/$TABLE" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" \
-H "Content-Type: application/json" \
-d '{"fields":{"Name":"New task","Status":"Todo","Priority":"High"}}' | python3 -m json.tool
```
### Create up to 10 records in one call
```bash
curl -s -X POST "https://api.airtable.com/v0/$BASE_ID/$TABLE" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"typecast": true,
"records": [
{"fields": {"Name": "Task A", "Status": "Todo"}},
{"fields": {"Name": "Task B", "Status": "In progress"}}
]
}' | python3 -m json.tool
```
Batch endpoints are capped at **10 records per request**. For larger inserts, loop in batches of 10 with a short sleep to respect 5 req/sec/base.
### Update a record (PATCH — merges, preserves unchanged fields)
```bash
curl -s -X PATCH "https://api.airtable.com/v0/$BASE_ID/$TABLE/$RECORD_ID" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" \
-H "Content-Type: application/json" \
-d '{"fields":{"Status":"Done"}}' | python3 -m json.tool
```
### Upsert by a merge field (no ID needed)
```bash
curl -s -X PATCH "https://api.airtable.com/v0/$BASE_ID/$TABLE" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"performUpsert": {"fieldsToMergeOn": ["Email"]},
"records": [
{"fields": {"Email": "user@example.com", "Status": "Active"}}
]
}' | python3 -m json.tool
```
`performUpsert` creates records whose merge-field values are new, patches records whose merge-field values already exist. Great for idempotent syncs.
### Delete a record
```bash
curl -s -X DELETE "https://api.airtable.com/v0/$BASE_ID/$TABLE/$RECORD_ID" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" | python3 -m json.tool
```
### Delete up to 10 records in one call
```bash
curl -s -X DELETE "https://api.airtable.com/v0/$BASE_ID/$TABLE?records%5B%5D=rec1&records%5B%5D=rec2" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" | python3 -m json.tool
```
## Pagination
List endpoints return at most **100 records per page**. If the response includes `"offset": "..."`, pass it back on the next call. Loop until the field is absent:
```bash
OFFSET=""
while :; do
URL="https://api.airtable.com/v0/$BASE_ID/$TABLE?pageSize=100"
[ -n "$OFFSET" ] && URL="$URL&offset=$OFFSET"
RESP=$(curl -s "$URL" -H "Authorization: Bearer $AIRTABLE_API_KEY")
echo "$RESP" | python3 -c 'import json,sys; d=json.load(sys.stdin); [print(r["id"], r["fields"].get("Name","")) for r in d["records"]]'
OFFSET=$(echo "$RESP" | python3 -c 'import json,sys; d=json.load(sys.stdin); print(d.get("offset",""))')
[ -z "$OFFSET" ] && break
done
```
## Typical Hermes Workflow
1. **Confirm auth.** `curl -s -o /dev/null -w "%{http_code}\n" https://api.airtable.com/v0/meta/bases -H "Authorization: Bearer $AIRTABLE_API_KEY"` — expect `200`.
2. **Find the base.** List bases (step above) OR ask the user for the `app...` ID directly if the token lacks `schema.bases:read`.
3. **Inspect the schema.** `GET /v0/meta/bases/$BASE_ID/tables` — cache the exact field names and primary-field name locally in the session before mutating anything.
4. **Read before you write.** For "update X where Y", `filterByFormula` first to resolve the `rec...` ID, then `PATCH /v0/$BASE_ID/$TABLE/$RECORD_ID`. Never guess record IDs.
5. **Batch writes.** Combine related creates into one 10-record POST to stay under the 5 req/sec budget.
6. **Destructive ops.** Deletions can't be undone via API. If the user says "delete all Xs", echo back the filter + record count and confirm before firing.
## Pitfalls
- **`filterByFormula` MUST be URL-encoded.** Field names with spaces or non-ASCII also need encoding (`{My Field}``%7BMy%20Field%7D`). Use Python stdlib (pattern above) — never hand-escape.
- **Empty fields are omitted from responses.** A missing `"Assignee"` key doesn't mean the field doesn't exist — it means this record's value is empty. Check the schema (step 3) before concluding a field is missing.
- **PATCH vs PUT.** `PATCH` merges supplied fields into the record. `PUT` replaces the record entirely and clears any field you didn't include. Default to `PATCH`.
- **Single-select options must exist.** Writing `"Status": "Shipping"` when `Shipping` isn't in the field's option list errors with `INVALID_MULTIPLE_CHOICE_OPTIONS` unless you pass `"typecast": true` (which auto-creates the option).
- **Per-base token scoping.** A `403` on one base while another works means the token's Access list doesn't include that base — not a scope or auth issue. Send the user to https://airtable.com/create/tokens to grant it.
- **Rate limits are per base, not per token.** 5 req/sec on `baseA` and 5 req/sec on `baseB` is fine; 6 req/sec on `baseA` alone will throttle. Monitor the `Retry-After` header on `429`.
## Important Notes for Hermes
- **Always use the `terminal` tool with `curl`.** Do NOT use `web_extract` (it can't send auth headers) or `browser_navigate` (needs UI auth and is slow).
- **`AIRTABLE_API_KEY` flows from `~/.hermes/.env` into the subprocess automatically** when this skill is loaded — no need to re-export it before each `curl` call.
- **Escape curly braces in formulas carefully.** In a heredoc body, `{Status}` is literal. In a shell argument, `{Status}` is safe outside `{...}` brace-expansion context — but pass dynamic strings through `python3 urllib.parse.quote` before splicing into a URL.
- **Pretty-print with `python3 -m json.tool`** (always present) rather than `jq` (optional). Only reach for `jq` when you need filtering/projection.
- **Pagination is per-page, not global.** Airtable's 100-record cap is a hard limit; there is no way to bump it. Loop with `offset` until the field is absent.
- **Read the `errors` array** on non-2xx responses — Airtable returns structured error codes like `AUTHENTICATION_REQUIRED`, `INVALID_PERMISSIONS`, `MODEL_ID_NOT_FOUND`, `INVALID_MULTIPLE_CHOICE_OPTIONS` that tell you exactly what's wrong.
@@ -289,7 +289,6 @@ def exchange_auth_code(code: str):
sys.exit(1)
pending_auth = _load_pending_auth()
raw_callback = code
code, returned_state = _extract_code_and_state(code)
if returned_state and returned_state != pending_auth["state"]:
print("ERROR: OAuth state mismatch. Run --auth-url again to start a fresh session.")
@@ -299,13 +298,19 @@ def exchange_auth_code(code: str):
from google_auth_oauthlib.flow import Flow
from urllib.parse import parse_qs, urlparse
# Extract granted scopes from the callback URL if the user pasted the full redirect URL.
granted_scopes = list(SCOPES)
if isinstance(raw_callback, str) and raw_callback.startswith("http"):
params = parse_qs(urlparse(raw_callback).query)
scope_val = (params.get("scope") or [""])[0].strip()
if scope_val:
granted_scopes = scope_val.split()
# Extract granted scopes from the callback URL if present
if returned_state and "scope" in parse_qs(urlparse(code).query if isinstance(code, str) and code.startswith("http") else {}):
granted_scopes = parse_qs(urlparse(code).query)["scope"][0].split()
else:
# Try to extract from code_or_url parameter
if isinstance(code, str) and code.startswith("http"):
params = parse_qs(urlparse(code).query)
if "scope" in params:
granted_scopes = params["scope"][0].split()
else:
granted_scopes = SCOPES
else:
granted_scopes = SCOPES
flow = Flow.from_client_secrets_file(
str(CLIENT_SECRET_PATH),
@@ -926,18 +926,13 @@ def cmd_timezone(args):
os_ = offset_info.get("seconds", 0)
sign = "+" if oh >= 0 else "-"
utc_offset = f"{sign}{abs(oh):02d}:{om:02d}"
if os_:
utc_offset = f"{utc_offset}:{os_:02d}"
elif tz_data.get("standardUtcOffset"):
offset_info2 = tz_data["standardUtcOffset"]
if isinstance(offset_info2, dict):
oh = offset_info2.get("hours", 0)
om = abs(offset_info2.get("minutes", 0))
os_ = offset_info2.get("seconds", 0)
sign = "+" if oh >= 0 else "-"
utc_offset = f"{sign}{abs(oh):02d}:{om:02d}"
if os_:
utc_offset = f"{utc_offset}:{os_:02d}"
timezone_src = "timeapi.io"
except (RuntimeError, KeyError, TypeError):
pass # API may be down; continue to fallback
@@ -1,151 +0,0 @@
---
name: debugging-hermes-tui-commands
description: Use when debugging or adding Hermes TUI slash commands across the Python backend (hermes_cli/commands.py), the tui_gateway bridge, and the TypeScript/Ink frontend. Covers autocomplete gaps, gateway dispatch issues, and live UI-state wiring.
version: 1.0.0
author: Hermes Agent
license: MIT
metadata:
hermes:
tags: [debugging, hermes-agent, tui, slash-commands, typescript, python]
related_skills: [python-debugpy, node-inspect-debugger, systematic-debugging]
---
# Debugging Hermes TUI Slash Commands
## Overview
Hermes slash commands span three layers — Python command registry, tui_gateway JSON-RPC bridge, and the Ink/TypeScript frontend. When a command misbehaves (missing from autocomplete, works in CLI but not TUI, config persists but UI doesn't update), the bug is almost always one layer being out of sync with another.
Use this skill when you encounter issues with slash commands in the Hermes TUI, particularly when commands aren't showing in autocomplete, aren't working properly in the TUI, or need to be added/updated.
## When to Use
- A slash command exists in one part of the codebase but doesn't work fully
- A command needs to be added to both backend and frontend
- Command autocomplete isn't working for specific commands
- Command behavior is inconsistent between CLI and TUI
- A command persists config but doesn't apply live in the TUI
## Architecture Overview
```
Python backend (hermes_cli/commands.py) <- canonical COMMAND_REGISTRY
TUI gateway (tui_gateway/server.py) <- slash.exec / command.dispatch
TUI frontend (ui-tui/src/app/slash/) <- local handlers + fallthrough
```
Command definitions must be registered consistently across Python and TypeScript to work properly. The Python `COMMAND_REGISTRY` is the source of truth for: CLI dispatch, gateway help, Telegram BotCommand menu, Slack subcommand map, and autocomplete data shipped to Ink.
## Investigation Steps
1. **Check if the command exists in the TUI frontend:**
```bash
search_files --pattern "/commandname" --file_glob "*.ts" --path ui-tui/
search_files --pattern "/commandname" --file_glob "*.tsx" --path ui-tui/
```
2. **Examine the TUI command definition:**
```bash
read_file ui-tui/src/app/slash/commands/core.ts
# If not there:
search_files --pattern "commandname" --path ui-tui/src/app/slash/commands --target files
```
3. **Check if the command exists in the Python backend:**
```bash
search_files --pattern "CommandDef" --file_glob "*.py" --path hermes_cli/
search_files --pattern "commandname" --path hermes_cli/commands.py --context 3
```
4. **Examine the gateway implementation:**
```bash
search_files --pattern "complete.slash|slash.exec" --path tui_gateway/
```
## Fix: Missing Command Autocomplete
If a command exists in the TUI but doesn't show in autocomplete:
1. Add a `CommandDef` entry to `COMMAND_REGISTRY` in `hermes_cli/commands.py`:
```python
CommandDef("commandname", "Description of the command", "Session",
cli_only=True, aliases=("alias",),
args_hint="[arg1|arg2|arg3]",
subcommands=("arg1", "arg2", "arg3")),
```
2. Pick `cli_only` vs gateway availability carefully:
- `cli_only=True` — only in the interactive CLI/TUI
- `gateway_only=True` — only in messaging platforms
- neither — available everywhere
- `gateway_config_gate="display.foo"` — config-gated availability in the gateway
3. Ensure `subcommands` matches the expected tab-completion options shown by the TUI.
4. If the command runs server-side, add a handler in `HermesCLI.process_command()` in `cli.py`:
```python
elif canonical == "commandname":
self._handle_commandname(cmd_original)
```
5. For gateway-available commands, add a handler in `gateway/run.py`:
```python
if canonical == "commandname":
return await self._handle_commandname(event)
```
## Common Issues
1. **Command shows in TUI but not in autocomplete.** The command is defined in the TUI codebase but missing from `COMMAND_REGISTRY` in `hermes_cli/commands.py`. Autocomplete data ships from Python.
2. **Command shows in autocomplete but doesn't work.** Check the command handler in `tui_gateway/server.py` and the frontend handler in `ui-tui/src/app/createSlashHandler.ts`. If the command is local-only in Ink, it must be handled in `app.tsx` built-in branch; otherwise it falls through to `slash.exec` and must have a Python handler.
3. **Command behavior differs between CLI and TUI.** The command might have different implementations. Check both `cli.py::process_command` and the TUI's local handler. Local TUI handlers take precedence over gateway dispatch.
4. **Command persists config but doesn't apply live.** For TUI-local commands, updating `config.set` is not enough. Also patch the relevant nanostore state immediately (usually `patchUiState(...)`) and pass any new state through rendering components. Example: `/details collapsed` must update live detail visibility, not just save `details_mode`; in-session global `/details <mode>` may need a separate command-override flag so live commands can override built-in section defaults while startup/config sync preserves default-expanded thinking/tools behavior.
5. **Gateway dispatch silently ignores the command.** The gateway only dispatches commands it knows about. Check `GATEWAY_KNOWN_COMMANDS` (derived from `COMMAND_REGISTRY` automatically) includes the canonical name. If the command is `cli_only` with a `gateway_config_gate`, verify the gated config value is truthy.
## Debugging Tactics
When surface-level inspection doesn't reveal the bug:
- **Python side hangs or misbehaves:** use the `python-debugpy` skill to break inside `_SlashWorker.exec` or the command handler. `remote-pdb` set at the handler entry is the fastest path.
- **Ink side not reacting:** use the `node-inspect-debugger` skill to break in `app.tsx`'s slash dispatch or the local command branch. `sb('dist/app.js', <line>)` after `npm run build`.
- **Registry mismatch / unclear which side is wrong:** compare the canonical `COMMAND_REGISTRY` entry against the TUI's local command list side-by-side.
## Pitfalls
- Don't forget to set the appropriate category for the command in `CommandDef` (e.g., "Session", "Configuration", "Tools & Skills", "Info", "Exit")
- Make sure any aliases are properly registered in the `aliases` tuple — no other file changes are needed, everything downstream (Telegram menu, Slack mapping, autocomplete, help) derives from it
- For commands with subcommands, ensure the `subcommands` tuple in `CommandDef` matches what's in the TUI code
- `cli_only=True` commands won't work in gateway/messaging platforms — unless you add a `gateway_config_gate` and the gate is truthy
- After adding live UI state, search every consumer of the old prop/helper and thread the new state through all render paths, not just the active streaming path. TUI detail rendering has at least two important paths: live `StreamingAssistant`/`ToolTrail` and transcript/pending `MessageLine` rows. A `/clean` pass should explicitly check both.
- Rebuild the TUI (`npm --prefix ui-tui run build`) before testing — tsx watch mode may lag on first launch
## Verification
After fixing:
1. Rebuild the TUI:
```bash
cd /home/bb/hermes-agent && npm --prefix ui-tui run build
```
2. Run the TUI and test the command:
```bash
hermes --tui
```
3. Type `/` and verify the command appears in autocomplete suggestions with the expected description and args hint.
4. Execute the command and confirm:
- Expected behavior fires
- Any persisted config updates correctly (`read_file ~/.hermes/config.yaml`)
- Live UI state reflects the change immediately (not just after restart)
5. If the command is also gateway-available, test it from at least one messaging platform (or run the gateway tests: `scripts/run_tests.sh tests/gateway/`).
@@ -1,164 +0,0 @@
---
name: hermes-agent-skill-authoring
description: Use when authoring or updating a SKILL.md inside the hermes-agent repo itself (skills/ tree, committed to a branch). Covers required frontmatter, validator limits, peer-matching structure, and the write_file-vs-skill_manage distinction for in-repo skills.
version: 1.0.0
author: Hermes Agent
license: MIT
metadata:
hermes:
tags: [skills, authoring, hermes-agent, conventions, skill-md]
related_skills: [writing-plans, requesting-code-review]
---
# Authoring Hermes-Agent Skills (in-repo)
## Overview
There are two places a SKILL.md can live:
1. **User-local:** `~/.hermes/skills/<maybe-category>/<name>/SKILL.md` — personal, not shared. Created via `skill_manage(action='create')`.
2. **In-repo (this skill is about this case):** `/home/bb/hermes-agent/skills/<category>/<name>/SKILL.md` — committed, shipped with the package. Use `write_file` + `git add`. `skill_manage(action='create')` does NOT target this tree.
## When to Use
- User asks you to add a skill "in this branch / repo / commit"
- You're committing a reusable workflow that should ship with hermes-agent
- You're editing an existing skill under `/home/bb/hermes-agent/skills/` (use `patch` for small edits, `write_file` for rewrites; `skill_manage` still works for patch on in-repo skills, but not for `create`)
## Required Frontmatter
Source of truth: `tools/skill_manager_tool.py::_validate_frontmatter`. Hard requirements:
- Starts with `---` as the first bytes (no leading blank line).
- Closes with `\n---\n` before the body.
- Parses as a YAML mapping.
- `name` field present.
- `description` field present, ≤ **1024 chars** (`MAX_DESCRIPTION_LENGTH`).
- Non-empty body after the closing `---`.
Peer-matched shape used by every skill under `skills/software-development/`:
```yaml
---
name: my-skill-name # lowercase, hyphens, ≤64 chars (MAX_NAME_LENGTH)
description: Use when <trigger>. <one-line behavior>.
version: 1.0.0
author: Hermes Agent
license: MIT
metadata:
hermes:
tags: [short, descriptive, tags]
related_skills: [other-skill, another-skill]
---
```
`version` / `author` / `license` / `metadata` are NOT enforced by the validator, but every peer has them — omit and your skill sticks out.
## Size Limits
- Description: ≤ 1024 chars (enforced).
- Full SKILL.md: ≤ 100,000 chars (enforced as `MAX_SKILL_CONTENT_CHARS`, ~36k tokens).
- Peer skills in `software-development/` sit at **8-14k chars**. Aim for that range. If you're pushing past 20k, split into `references/*.md` and reference them from SKILL.md.
## Peer-Matched Structure
Every in-repo skill follows roughly:
```
# <Title>
## Overview
One or two paragraphs: what and why.
## When to Use
- Bulleted triggers
- "Don't use for:" counter-triggers
## <Topic sections specific to the skill>
- Quick-reference tables are common
- Code blocks with exact commands
- Hermes-specific recipes (tests via scripts/run_tests.sh, ui-tui paths, etc.)
## Common Pitfalls
Numbered list of mistakes and their fixes.
## Verification Checklist
- [ ] Checkbox list of post-action verifications
## One-Shot Recipes (optional)
Named scenarios → concrete command sequences.
```
Not every section is mandatory, but `Overview` + `When to Use` + actionable body + pitfalls are the minimum for the skill to feel like a peer.
## Directory Placement
```
skills/<category>/<skill-name>/SKILL.md
```
Categories currently in repo (confirm with `ls skills/`): `autonomous-ai-agents`, `creative`, `data-science`, `devops`, `dogfood`, `email`, `gaming`, `github`, `leisure`, `mcp`, `media`, `mlops/*`, `note-taking`, `productivity`, `red-teaming`, `research`, `smart-home`, `social-media`, `software-development`.
Pick the closest existing category. Don't invent new top-level categories casually.
## Workflow
1. **Survey peers** in the target category:
```
ls skills/<category>/
```
Read 2-3 peer SKILL.md files to match tone and structure.
2. **Check validator constraints** in `tools/skill_manager_tool.py` if unsure.
3. **Draft** with `write_file` to `skills/<category>/<name>/SKILL.md`.
4. **Validate locally**:
```python
import yaml, re, pathlib
content = pathlib.Path("skills/<category>/<name>/SKILL.md").read_text()
assert content.startswith("---")
m = re.search(r'\n---\s*\n', content[3:])
fm = yaml.safe_load(content[3:m.start()+3])
assert "name" in fm and "description" in fm
assert len(fm["description"]) <= 1024
assert len(content) <= 100_000
```
5. **Git add + commit** on the active branch.
6. **Note:** the CURRENT session's skill loader is cached — `skill_view` / `skills_list` will not see the new skill until a new session. This is expected, not a bug.
## Cross-Referencing Other Skills
`metadata.hermes.related_skills` unions both trees (`skills/` in-repo and `~/.hermes/skills/`) at load time. You CAN reference a user-local skill from an in-repo skill, but it won't resolve for other users who clone the repo fresh. Prefer referencing only in-repo skills from in-repo skills. If a frequently-referenced skill lives only in `~/.hermes/skills/`, consider promoting it to the repo.
## Editing Existing In-Repo Skills
- **Small fix (typo, added pitfall, tightened trigger):** `skill_manage(action='patch', name=..., old_string=..., new_string=...)` works fine on in-repo skills.
- **Major rewrite:** `write_file` the whole SKILL.md. `skill_manage(action='edit')` also works but requires supplying the full new content.
- **Adding supporting files:** `write_file` to `skills/<category>/<name>/references/<file>.md`, `templates/<file>`, or `scripts/<file>`. `skill_manage(action='write_file')` also works and enforces the references/templates/scripts/assets subdir allowlist.
- **Always commit** the edit — in-repo skills are source, not runtime state.
## Common Pitfalls
1. **Using `skill_manage(action='create')` for an in-repo skill.** It writes to `~/.hermes/skills/`, not the repo tree. Use `write_file` for in-repo creation.
2. **Leading whitespace before `---`.** The validator checks `content.startswith("---")`; any leading blank line or BOM fails validation.
3. **Description too generic.** Peer descriptions start with "Use when ..." and describe the *trigger class*, not the one task. "Use when debugging X" > "Debug X".
4. **Forgetting the author/license/metadata block.** Not validator-enforced, but every peer has it; omitting makes the skill look half-finished.
5. **Writing a skill that duplicates a peer.** Before creating, `ls skills/<category>/` and open 2-3 peers. Prefer extending an existing skill to creating a narrow sibling.
6. **Expecting the current session to see the new skill.** It won't. The skill loader is initialized at session start. Verify in a fresh session or via `skill_view` using the exact path.
7. **Linking to skills that don't exist in-repo.** `related_skills: [some-user-local-skill]` works for you but breaks for other clones. Prefer only in-repo links.
## Verification Checklist
- [ ] File is at `skills/<category>/<name>/SKILL.md` (not in `~/.hermes/skills/`)
- [ ] Frontmatter starts at byte 0 with `---`, closes with `\n---\n`
- [ ] `name`, `description`, `version`, `author`, `license`, `metadata.hermes.{tags, related_skills}` all present
- [ ] Name ≤ 64 chars, lowercase + hyphens
- [ ] Description ≤ 1024 chars and starts with "Use when ..."
- [ ] Total file ≤ 100,000 chars (aim for 8-15k)
- [ ] Structure: `# Title``## Overview``## When to Use` → body → `## Common Pitfalls``## Verification Checklist`
- [ ] `related_skills` references resolve in-repo (or are explicitly OK to be user-local)
- [ ] `git add skills/<category>/<name>/ && git commit` completed on the intended branch
@@ -1,318 +0,0 @@
---
name: node-inspect-debugger
description: Use when debugging Node.js code (ui-tui, tui_gateway child processes, any Node script/test) with real breakpoints, stepping, scope inspection, and expression evaluation. Drives `node --inspect` via the Chrome DevTools Protocol from the terminal — no browser required.
version: 1.0.0
author: Hermes Agent
license: MIT
metadata:
hermes:
tags: [debugging, nodejs, node-inspect, cdp, breakpoints, ui-tui]
related_skills: [systematic-debugging, python-debugpy, debugging-hermes-tui-commands]
---
# Node.js Inspect Debugger
## Overview
When `console.log` isn't enough, drive Node's built-in V8 inspector programmatically from the terminal. You get real breakpoints, step in/over/out, call-stack walking, local/closure scope dumps, and arbitrary expression evaluation in the paused frame.
Two tools, pick one:
- **`node inspect`** — built-in, zero install, CLI REPL. Best for quick poking.
- **`ndb` / CDP via `chrome-remote-interface`** — scriptable from Node/Python; best when you want to automate many breakpoints, collect state across runs, or debug non-interactively from an agent loop.
**Prefer `node inspect` first.** It's always available and the REPL is fast.
## When to Use
- A Node test fails and you need to see intermediate state
- ui-tui crashes or behaves wrong and you want to inspect React/Ink state pre-render
- tui_gateway child processes (`_SlashWorker`, PTY bridge workers) misbehave
- You need to inspect a value in a closure that `console.log` can't reach without patching
- Perf: attach to a running process to capture a CPU profile or heap snapshot
**Don't use for:** things `console.log` solves in under a minute. Breakpoint-driven debugging is heavier; use it when the payoff is real.
## Quick Reference: `node inspect` REPL
Launch paused on first line:
```bash
node inspect path/to/script.js
# or with tsx
node --inspect-brk $(which tsx) path/to/script.ts
```
The `debug>` prompt accepts:
| Command | Action |
|---|---|
| `c` or `cont` | continue |
| `n` or `next` | step over |
| `s` or `step` | step into |
| `o` or `out` | step out |
| `pause` | pause running code |
| `sb('file.js', 42)` | set breakpoint at file.js line 42 |
| `sb(42)` | set breakpoint at line 42 of current file |
| `sb('functionName')` | break when function is called |
| `cb('file.js', 42)` | clear breakpoint |
| `breakpoints` | list all breakpoints |
| `bt` | backtrace (call stack) |
| `list(5)` | show 5 lines of source around current position |
| `watch('expr')` | evaluate expr on every pause |
| `watchers` | show watched expressions |
| `repl` | drop into REPL in current scope (Ctrl+C to exit REPL) |
| `exec expr` | evaluate expression once |
| `restart` | restart script |
| `kill` | kill the script |
| `.exit` | quit debugger |
**In the `repl` sub-mode:** type any JS expression, including access to locals/closure variables. `Ctrl+C` exits back to `debug>`.
## Attaching to a Running Process
When the process is already running (e.g. a long-lived dev server or the TUI gateway):
```bash
# 1. Send SIGUSR1 to enable the inspector on an existing process
kill -SIGUSR1 <pid>
# Node prints: Debugger listening on ws://127.0.0.1:9229/<uuid>
# 2. Attach the debugger CLI
node inspect -p <pid>
# or by URL
node inspect ws://127.0.0.1:9229/<uuid>
```
To start a process with the inspector from the beginning:
```bash
node --inspect script.js # listen on 127.0.0.1:9229, keep running
node --inspect-brk script.js # listen AND pause on first line
node --inspect=0.0.0.0:9230 script.js # custom host:port
```
For TypeScript via tsx:
```bash
node --inspect-brk --import tsx script.ts
# or older tsx
node --inspect-brk -r tsx/cjs script.ts
```
## Programmatic CDP (scripting from terminal)
When you want to automate — set many breakpoints, capture scope state, script a repro — use `chrome-remote-interface`:
```bash
npm i -g chrome-remote-interface # or project-local
# Start your target:
node --inspect-brk=9229 target.js &
```
Driver script (save as `/tmp/cdp-debug.js`):
```javascript
const CDP = require('chrome-remote-interface');
(async () => {
const client = await CDP({ port: 9229 });
const { Debugger, Runtime } = client;
Debugger.paused(async ({ callFrames, reason }) => {
const top = callFrames[0];
console.log(`PAUSED: ${reason} @ ${top.url}:${top.location.lineNumber + 1}`);
// Walk scopes for locals
for (const scope of top.scopeChain) {
if (scope.type === 'local' || scope.type === 'closure') {
const { result } = await Runtime.getProperties({
objectId: scope.object.objectId,
ownProperties: true,
});
for (const p of result) {
console.log(` ${scope.type}.${p.name} =`, p.value?.value ?? p.value?.description);
}
}
}
// Evaluate an expression in the paused frame
const { result } = await Debugger.evaluateOnCallFrame({
callFrameId: top.callFrameId,
expression: 'typeof state !== "undefined" ? JSON.stringify(state) : "n/a"',
});
console.log('state =', result.value ?? result.description);
await Debugger.resume();
});
await Runtime.enable();
await Debugger.enable();
// Set a breakpoint by URL regex + line
await Debugger.setBreakpointByUrl({
urlRegex: '.*app\\.tsx$',
lineNumber: 119, // 0-indexed
columnNumber: 0,
});
await Runtime.runIfWaitingForDebugger();
})();
```
Run it:
```bash
node /tmp/cdp-debug.js
```
Hermes-specific note: `chrome-remote-interface` is NOT in `ui-tui/package.json`. Install it to a throwaway location if you don't want to dirty the project:
```bash
mkdir -p /tmp/cdp-tools && cd /tmp/cdp-tools && npm i chrome-remote-interface
NODE_PATH=/tmp/cdp-tools/node_modules node /tmp/cdp-debug.js
```
## Debugging Hermes ui-tui
The TUI is built Ink + tsx. Two common scenarios:
### Debugging a single Ink component under dev
`ui-tui/package.json` has `npm run dev` (tsx --watch). Add `--inspect-brk` by running tsx directly:
```bash
cd /home/bb/hermes-agent/ui-tui
npm run build # produce dist/ once so transpile isn't needed on first load
node --inspect-brk dist/entry.js
# In another terminal:
node inspect -p <node pid>
```
Then inside `debug>`:
```
sb('dist/app.js', 220) # or wherever the suspect render is
cont
```
When it pauses, `repl` → inspect `props`, state refs, `useInput` handler values, etc.
### Debugging a running `hermes --tui`
The TUI spawns Node from the Python CLI. Easiest path:
```bash
# 1. Launch TUI
hermes --tui &
TUI_PID=$(pgrep -f 'ui-tui/dist/entry' | head -1)
# 2. Enable inspector on that Node PID
kill -SIGUSR1 "$TUI_PID"
# 3. Find the WS URL
curl -s http://127.0.0.1:9229/json/list | jq -r '.[0].webSocketDebuggerUrl'
# 4. Attach
node inspect ws://127.0.0.1:9229/<uuid>
```
Interacting with the TUI (typing in its window) continues to advance execution; your debugger can pause it on a breakpoint at any `sb(...)`.
### Debugging `_SlashWorker` / PTY child processes
Those are Python, not Node — use the `python-debugpy` skill for them. Only Node portions (Ink UI, tui_gateway client, tsx-run tests under `ui-tui/`) use this skill.
## Running Vitest Tests Under the Debugger
```bash
cd /home/bb/hermes-agent/ui-tui
# Run a single test file paused on entry
node --inspect-brk ./node_modules/vitest/vitest.mjs run --no-file-parallelism src/app/foo.test.tsx
```
In another terminal: `node inspect -p <pid>`, then `sb('src/app/foo.tsx', 42)`, `cont`.
Use `--no-file-parallelism` (vitest) or `--runInBand` (jest) so only one worker exists — debugging a pool is painful.
## Heap Snapshots & CPU Profiles (Non-interactive)
From the CDP driver above, swap Debugger for `HeapProfiler` / `Profiler`:
```javascript
// CPU profile for 5 seconds
await client.Profiler.enable();
await client.Profiler.start();
await new Promise(r => setTimeout(r, 5000));
const { profile } = await client.Profiler.stop();
require('fs').writeFileSync('/tmp/cpu.cpuprofile', JSON.stringify(profile));
// Open /tmp/cpu.cpuprofile in Chrome DevTools → Performance tab
```
```javascript
// Heap snapshot
await client.HeapProfiler.enable();
const chunks = [];
client.HeapProfiler.addHeapSnapshotChunk(({ chunk }) => chunks.push(chunk));
await client.HeapProfiler.takeHeapSnapshot({ reportProgress: false });
require('fs').writeFileSync('/tmp/heap.heapsnapshot', chunks.join(''));
```
## Common Pitfalls
1. **Wrong line numbers in TS source.** Breakpoints hit the emitted JS, not the `.ts`. Either (a) break in the built `dist/*.js`, or (b) enable sourcemaps (`node --enable-source-maps`) and use `sb('src/app.tsx', N)` — but only with CDP clients that follow sourcemaps. `node inspect` CLI does not.
2. **`--inspect` vs `--inspect-brk`.** `--inspect` starts the inspector but doesn't pause; your script races past your first breakpoint if you attach too late. Use `--inspect-brk` when you need to set breakpoints before any code runs.
3. **Port collisions.** Default is `9229`. If multiple Node processes are inspecting, pass `--inspect=0` (random port) and read the actual URL from `/json/list`:
```bash
curl -s http://127.0.0.1:9229/json/list # lists all inspectable targets on the host
```
4. **Child processes.** `--inspect` on a parent does NOT inspect its children. Use `NODE_OPTIONS='--inspect-brk' node parent.js` to propagate to every child; be aware they all need unique ports (Node auto-increments when `NODE_OPTIONS='--inspect'` is inherited).
5. **Background kills.** If you `Ctrl+C` out of `node inspect` while the target is paused, the target stays paused. Either `cont` first, or `kill` the target explicitly.
6. **Running `node inspect` through an agent terminal.** It's a PTY-friendly REPL. In Hermes, launch it with `terminal(pty=true)` or `background=true` + `process(action='submit', data='...')`. Non-PTY foreground mode will work for one-shot commands but not for interactive stepping.
7. **Security.** `--inspect=0.0.0.0:9229` exposes arbitrary code execution. Always bind to `127.0.0.1` (the default) unless you have an isolated network.
## Verification Checklist
After setting up a debug session, verify:
- [ ] `curl -s http://127.0.0.1:9229/json/list` returns exactly the target you expect
- [ ] First breakpoint actually hits (if it doesn't, you likely missed `--inspect-brk` or attached after execution completed)
- [ ] Source listing at pause shows the right file (mismatch = sourcemap issue, see pitfall 1)
- [ ] `exec process.pid` in `repl` returns the PID you meant to attach to
## One-Shot Recipes
**"Why is this variable undefined at line X?"**
```bash
node --inspect-brk script.js &
node inspect -p $!
# debug>
sb('script.js', X)
cont
# paused. Now:
repl
> myVariable
> Object.keys(this)
```
**"What's the call path into this function?"**
```
debug> sb('suspectFn')
debug> cont
# paused on entry
debug> bt
```
**"This async chain hangs — where?"**
```
# Start with --inspect (no -brk), let it run to the hang, then:
debug> pause
debug> bt
# Now you see the stuck frame
```
@@ -1,374 +0,0 @@
---
name: python-debugpy
description: Use when debugging Python code (run_agent.py, cli.py, tui_gateway, tests, scripts) with real breakpoints, stepping, scope inspection, and post-mortem analysis. Covers `pdb` for interactive REPL debugging and `debugpy` for remote/headless DAP-driven sessions.
version: 1.0.0
author: Hermes Agent
license: MIT
metadata:
hermes:
tags: [debugging, python, pdb, debugpy, breakpoints, dap, post-mortem]
related_skills: [systematic-debugging, node-inspect-debugger, debugging-hermes-tui-commands]
---
# Python Debugger (pdb + debugpy)
## Overview
Three tools, picked by situation:
| Tool | When |
|---|---|
| **`breakpoint()` + pdb** | Local, interactive, simplest. Add `breakpoint()` in the source, run normally, get a REPL at that line. |
| **`python -m pdb`** | Launch an existing script under pdb with no source edits. Useful for quick poking. |
| **`debugpy`** | Remote / headless / "attach to already-running process." Talks DAP, scriptable from terminal, works for long-lived processes (gateway, daemon, PTY children). |
**Start with `breakpoint()`.** It's the cheapest thing that works.
## When to Use
- A test fails and the traceback doesn't reveal why a value is wrong
- You need to step through a function and watch a collection mutate
- A long-running process (hermes gateway, tui_gateway) misbehaves and you can't restart it
- Post-mortem: an exception fired in prod-ish code and you want to inspect locals at the crash site
- A subprocess / child (Python `_SlashWorker`, PTY bridge worker) is the actual bug site
**Don't use for:** things `print()` / `logging.debug` solve in under a minute, or things `pytest -vv --tb=long --showlocals` already reveals.
## pdb Quick Reference
Inside any pdb prompt (`(Pdb)`):
| Command | Action |
|---|---|
| `h` / `h cmd` | help |
| `n` | next line (step over) |
| `s` | step into |
| `r` | return from current function |
| `c` | continue |
| `unt N` | continue until line N |
| `j N` | jump to line N (same function only) |
| `l` / `ll` | list source around current line / full function |
| `w` | where (stack trace) |
| `u` / `d` | move up / down in the stack |
| `a` | print args of the current function |
| `p expr` / `pp expr` | print / pretty-print expression |
| `display expr` | auto-print expr on every stop |
| `b file:line` | set breakpoint |
| `b func` | break on function entry |
| `b file:line, cond` | conditional breakpoint |
| `cl N` | clear breakpoint N |
| `tbreak file:line` | one-shot breakpoint |
| `!stmt` | execute arbitrary Python (assignments included) |
| `interact` | drop into full Python REPL in current scope (Ctrl+D to exit) |
| `q` | quit |
The `interact` command is the most powerful — you can import anything, inspect complex objects, even call methods that mutate state. Locals are read-only by default; use `!x = 42` from the `(Pdb)` prompt to mutate.
## Recipe 1: Local breakpoint
Easiest. Edit the file:
```python
def compute(x, y):
result = some_helper(x)
breakpoint() # <-- drops into pdb here
return result + y
```
Run the code normally. You land at the `breakpoint()` line with full access to locals.
**Don't forget to remove `breakpoint()` before committing.** Use `git diff` or a pre-commit grep:
```bash
rg -n 'breakpoint\(\)' --type py
```
## Recipe 2: Launch a script under pdb (no source edits)
```bash
python -m pdb path/to/script.py arg1 arg2
# Lands at first line of script
(Pdb) b path/to/script.py:42
(Pdb) c
```
## Recipe 3: Debug a pytest test
The hermes test runner and pytest both support this:
```bash
# Drop to pdb on failure (or on any raised exception):
scripts/run_tests.sh tests/path/to/test_file.py::test_name --pdb
# Drop to pdb at the START of the test:
scripts/run_tests.sh tests/path/to/test_file.py::test_name --trace
# Show locals in tracebacks without pdb:
scripts/run_tests.sh tests/path/to/test_file.py --showlocals --tb=long
```
Note: `scripts/run_tests.sh` uses xdist (`-n 4`) by default, and pdb does NOT work under xdist. Add `-p no:xdist` or run a single test with `-n 0`:
```bash
scripts/run_tests.sh tests/foo_test.py::test_bar --pdb -p no:xdist
# or
source .venv/bin/activate
python -m pytest tests/foo_test.py::test_bar --pdb
```
This bypasses the hermetic-env guarantees — fine for debugging, but re-run under the wrapper to confirm before pushing.
## Recipe 4: Post-mortem on any exception
```python
import pdb, sys
try:
run_the_thing()
except Exception:
pdb.post_mortem(sys.exc_info()[2])
```
Or wrap a whole script:
```bash
python -m pdb -c continue script.py
# When it crashes, pdb catches it and you're in the frame of the exception
```
Or set a global hook in a repl/jupyter:
```python
import sys
def excepthook(etype, value, tb):
import pdb; pdb.post_mortem(tb)
sys.excepthook = excepthook
```
## Recipe 5: Remote debug with debugpy (attach to running process)
For long-lived processes: Hermes gateway, tui_gateway, a daemon, a process that's already misbehaving and can't be restarted clean.
### Setup
```bash
source /home/bb/hermes-agent/.venv/bin/activate
pip install debugpy
```
### Pattern A: Source-edit — process waits for debugger at launch
Add near the top of the entry point (or inside the function you want to debug):
```python
import debugpy
debugpy.listen(("127.0.0.1", 5678))
print("debugpy listening on 5678, waiting for client...", flush=True)
debugpy.wait_for_client()
debugpy.breakpoint() # optional: pause immediately once attached
```
Start the process; it blocks on `wait_for_client()`.
### Pattern B: No source edit — launch with `-m debugpy`
```bash
python -m debugpy --listen 127.0.0.1:5678 --wait-for-client your_script.py arg1
```
Equivalent for module entry:
```bash
python -m debugpy --listen 127.0.0.1:5678 --wait-for-client -m your.module
```
### Pattern C: Attach to an already-running process
Needs the PID and debugpy preinstalled in the target's environment:
```bash
python -m debugpy --listen 127.0.0.1:5678 --pid <pid>
# debugpy injects itself into the process. Then attach a client as below.
```
Some kernels/security configs block the ptrace-based injection (`/proc/sys/kernel/yama/ptrace_scope`). Fix with:
```bash
echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
```
### Connecting a client from the terminal
The easiest terminal-side DAP client is VS Code CLI or a small script. From inside Hermes you have two practical options:
**Option 1: `debugpy`'s own CLI REPL** — not an official feature, but a tiny DAP client script:
```python
# /tmp/dap_client.py
import socket, json, itertools, time, sys
HOST, PORT = "127.0.0.1", 5678
s = socket.create_connection((HOST, PORT))
seq = itertools.count(1)
def send(msg):
msg["seq"] = next(seq)
body = json.dumps(msg).encode()
s.sendall(f"Content-Length: {len(body)}\r\n\r\n".encode() + body)
def recv():
header = b""
while b"\r\n\r\n" not in header:
header += s.recv(1)
length = int(header.decode().split("Content-Length:")[1].split("\r\n")[0].strip())
body = b""
while len(body) < length:
body += s.recv(length - len(body))
return json.loads(body)
send({"type": "request", "command": "initialize", "arguments": {"adapterID": "python"}})
print(recv())
send({"type": "request", "command": "attach", "arguments": {}})
print(recv())
send({"type": "request", "command": "setBreakpoints",
"arguments": {"source": {"path": sys.argv[1]},
"breakpoints": [{"line": int(sys.argv[2])}]}})
print(recv())
send({"type": "request", "command": "configurationDone"})
# ... loop reading events and sending continue/stepIn/etc.
```
This is fine for one-off automation but painful as an interactive UX.
**Option 2: Attach from VS Code / Cursor / Zed** — if the user has one open, they can add a `launch.json`:
```json
{
"name": "Attach to Hermes",
"type": "debugpy",
"request": "attach",
"connect": { "host": "127.0.0.1", "port": 5678 },
"justMyCode": false,
"pathMappings": [
{ "localRoot": "${workspaceFolder}", "remoteRoot": "/home/bb/hermes-agent" }
]
}
```
**Option 3: Ditch DAP, use `remote-pdb`** — usually what you actually want from a terminal agent:
```bash
pip install remote-pdb
```
In your code:
```python
from remote_pdb import set_trace
set_trace(host="127.0.0.1", port=4444) # blocks until connection
```
Then from the terminal:
```bash
nc 127.0.0.1 4444
# You get a (Pdb) prompt exactly as if debugging locally.
```
`remote-pdb` is the cleanest agent-friendly choice when `debugpy`'s DAP protocol is overkill. Use `debugpy` only when you actually need IDE integration.
## Debugging Hermes-specific Processes
### Tests
See Recipe 3. Always add `-p no:xdist` or run single tests without xdist.
### `run_agent.py` / CLI — one-shot
Easiest: add `breakpoint()` near the suspect line, then run `hermes` normally. Control returns to your terminal at the pause point.
### `tui_gateway` subprocess (spawned by `hermes --tui`)
The gateway runs as a child of the Node TUI. Options:
**A. Source-edit the gateway:**
```python
# tui_gateway/server.py near the top of serve()
import debugpy
debugpy.listen(("127.0.0.1", 5678))
debugpy.wait_for_client()
```
Start `hermes --tui`. The TUI will appear frozen (its backend is waiting). Attach a client; execution resumes when you `continue`.
**B. Use `remote-pdb` at a specific handler:**
```python
from remote_pdb import set_trace
set_trace(host="127.0.0.1", port=4444) # in the RPC handler you want to trap
```
Trigger the matching slash command from the TUI, then `nc 127.0.0.1 4444` in another terminal.
### `_SlashWorker` subprocess
Same pattern — `remote-pdb` with `set_trace()` inside the worker's `exec` path. The worker is persistent across slash commands, so the first trigger blocks until you connect; subsequent slash commands pass through normally unless you re-arm.
### Gateway (`gateway/run.py`)
Long-lived. Use `remote-pdb` at a handler, or `debugpy` with `--wait-for-client` if you're restarting the gateway anyway.
## Common Pitfalls
1. **pdb under pytest-xdist silently does nothing.** You won't see the prompt, the test just hangs. Always use `-p no:xdist` or `-n 0`.
2. **`breakpoint()` in CI / non-TTY contexts hangs the process.** Safe locally; never commit it. Add a pre-commit grep as a safety net.
3. **`PYTHONBREAKPOINT=0`** disables all `breakpoint()` calls. Check the env if your breakpoint isn't hitting:
```bash
echo $PYTHONBREAKPOINT
```
4. **`debugpy.listen` blocks only if you also call `wait_for_client()`.** Without it, execution continues and your first breakpoint may fire before the client is attached.
5. **Attach to PID fails on hardened kernels.** `ptrace_scope=1` (Ubuntu default) allows only same-user ptrace of child processes. Workaround: `echo 0 > /proc/sys/kernel/yama/ptrace_scope` (needs root) or launch under `debugpy` from the start.
6. **Threads.** `pdb` only debugs the current thread. For multithreaded code, use `debugpy` (thread-aware DAP) or set `threading.settrace()` per thread.
7. **asyncio.** `pdb` works in coroutines but `await` inside pdb requires Python 3.13+ or `await` from `interact` mode on older versions. For 3.11/3.12, use `asyncio.run_coroutine_threadsafe` tricks or `!stmt`-based awaits via `asyncio.ensure_future`.
8. **`scripts/run_tests.sh` strips credentials and sets `HOME=<tmpdir>`.** If your bug depends on user config or real API keys, it won't reproduce under the wrapper. Debug with raw `pytest` first to repro, then re-confirm under the wrapper.
9. **Forking / multiprocessing.** pdb does not follow forks. Each child needs its own `breakpoint()` or `set_trace()`. For Hermes subagents, debug one process at a time.
## Verification Checklist
- [ ] After `pip install debugpy`, confirm: `python -c "import debugpy; print(debugpy.__version__)"`
- [ ] For remote debug, confirm the port is actually listening: `ss -tlnp | grep 5678`
- [ ] First breakpoint actually hits (if it doesn't, you likely have `PYTHONBREAKPOINT=0`, you're under xdist, or execution finished before attach)
- [ ] `where` / `w` shows the expected call stack
- [ ] Post-debug cleanup: no stray `breakpoint()` / `set_trace()` in committed code
```bash
rg -n 'breakpoint\(\)|set_trace\(|debugpy\.listen' --type py
```
## One-Shot Recipes
**"Why is this dict missing a key?"**
```python
# add above the KeyError site
breakpoint()
# then in pdb:
(Pdb) pp d
(Pdb) pp list(d.keys())
(Pdb) w # how did we get here
```
**"This test passes in isolation but fails in the suite."**
```bash
scripts/run_tests.sh tests/the_test.py --pdb -p no:xdist
# But if it only fails WITH other tests:
source .venv/bin/activate
python -m pytest tests/ -x --pdb -p no:xdist
# Now it pdb-traps at the exact failing test after state accumulated.
```
**"My async handler deadlocks."**
```python
# Add at handler entry
import remote_pdb; remote_pdb.set_trace(host="127.0.0.1", port=4444)
```
Trigger the handler. `nc 127.0.0.1 4444`, then `w` to see the suspended frame, `!import asyncio; asyncio.all_tasks()` to see what else is pending.
**"Post-mortem on a crash in an Ink child process / subprocess."**
```bash
PYTHONFAULTHANDLER=1 python -m pdb -c continue path/to/entrypoint.py
# On crash, pdb lands at the frame of the exception with full locals
```
-107
View File
@@ -1,107 +0,0 @@
---
name: yuanbao
description: Yuanbao (元宝) group interaction — @mention users, query group info and members
version: 1.0.0
metadata:
hermes:
tags: [yuanbao, mention, at, group, members, 元宝, 派, 艾特]
related_skills: []
---
# Yuanbao Group Interaction
## CRITICAL: How Messaging Works
**Your text reply IS the message sent to the group/user.** The gateway automatically delivers your response text to the chat. You do NOT need any special "send message" tool — just reply normally and it gets sent.
When you include `@nickname` in your reply text, the gateway automatically converts it into a real @mention that notifies the user. This is built-in — you have full @mention capability.
**NEVER say you cannot send messages or @mention users. NEVER suggest the user do it manually. NEVER add disclaimers about permissions. Just reply with the text you want sent.**
## Available Tools
| Tool | When to use |
|------|------------|
| `yb_query_group_info` | Query group name, owner, member count |
| `yb_query_group_members` | Find a user, list bots, list all members, or get nickname for @mention |
| `yb_send_dm` | Send a private/direct message (DM / 私信) to a user, with optional media files |
## @Mention Workflow
When you need to @mention / 艾特 someone:
1. Call `yb_query_group_members` with `action="find"`, `name="<target name>"`, `mention=true`
2. Get the exact nickname from the response
3. Include `@nickname` in your reply text — the gateway handles the rest
Example: user says "帮我艾特元宝"
Step 1 — tool call:
```json
{ "group_code": "328306697", "action": "find", "name": "元宝", "mention": true }
```
Step 2 — your reply (this gets sent to the group with a working @mention):
```
@元宝 你好,有人找你!
```
**That's it.** No extra explanation needed. Keep it short and natural.
**Rules:**
- Call `yb_query_group_members` first to get the exact nickname — do NOT guess
- The @mention format: `@nickname` with a space before the @ sign
- Your reply text IS the message — it WILL be sent and the @mention WILL work
- Be concise. Do NOT explain how @mention works to the user.
## Send DM (Private Message) Workflow
When someone asks to send a private message / 私信 / DM to a user:
1. Call `yb_send_dm` with `group_code`, `name` (target user's name), and `message`
2. The tool automatically finds the user and sends the DM
3. Report the result to the user
Example: user says "给 @用户aea3 私信发一个 hello"
```json
yb_send_dm({ "group_code": "535168412", "name": "用户aea3", "message": "hello" })
```
Example with media: user says "给 @用户aea3 私信发一张图片"
```json
yb_send_dm({
"group_code": "535168412",
"name": "用户aea3",
"message": "Here is the image",
"media_files": [{"path": "/tmp/photo.jpg"}]
})
```
**Rules:**
- Extract `group_code` from the current chat_id (e.g. `group:535168412``535168412`)
- If you already know the user_id, pass it directly via the `user_id` parameter to skip lookup
- If multiple users match the name, the tool returns candidates — ask the user to clarify
- Do NOT use `send_message` tool for Yuanbao DMs — use `yb_send_dm` instead
- Supports media: images (.jpg/.png/.gif/.webp/.bmp) sent as image messages, other files as documents
## Query Group Info
```json
yb_query_group_info({ "group_code": "328306697" })
```
## Query Members
| Action | Description |
|--------|-------------|
| `find` | Search by name (partial match, case-insensitive) |
| `list_bots` | List bots and Yuanbao AI assistants |
| `list_all` | List all members |
## Notes
- `group_code` comes from chat_id: `group:328306697``328306697`
- Groups are called "派 (Pai)" in the Yuanbao app
- Member roles: `user`, `yuanbao_ai`, `bot`
-64
View File
@@ -7,14 +7,11 @@ import pytest
from agent.onboarding import (
BUSY_INPUT_FLAG,
OPENCLAW_RESIDUE_FLAG,
TOOL_PROGRESS_FLAG,
busy_input_hint_cli,
busy_input_hint_gateway,
detect_openclaw_residue,
is_seen,
mark_seen,
openclaw_residue_hint_cli,
tool_progress_hint_cli,
tool_progress_hint_gateway,
)
@@ -120,12 +117,6 @@ class TestHintMessages:
assert "/busy interrupt" in msg
assert "queued" in msg.lower()
def test_busy_input_hint_gateway_steer(self):
msg = busy_input_hint_gateway("steer")
assert "/busy interrupt" in msg
assert "/busy queue" in msg
assert "steer" in msg.lower()
def test_busy_input_hint_cli_interrupt(self):
msg = busy_input_hint_cli("interrupt")
assert "/busy queue" in msg
@@ -134,12 +125,6 @@ class TestHintMessages:
msg = busy_input_hint_cli("queue")
assert "/busy interrupt" in msg
def test_busy_input_hint_cli_steer(self):
msg = busy_input_hint_cli("steer")
assert "/busy interrupt" in msg
assert "/busy queue" in msg
assert "steer" in msg.lower()
def test_tool_progress_hints_mention_verbose(self):
assert "/verbose" in tool_progress_hint_gateway()
assert "/verbose" in tool_progress_hint_cli()
@@ -148,10 +133,8 @@ class TestHintMessages:
for hint in (
busy_input_hint_gateway("queue"),
busy_input_hint_gateway("interrupt"),
busy_input_hint_gateway("steer"),
busy_input_hint_cli("queue"),
busy_input_hint_cli("interrupt"),
busy_input_hint_cli("steer"),
tool_progress_hint_gateway(),
tool_progress_hint_cli(),
):
@@ -179,50 +162,3 @@ class TestRoundTrip:
assert is_seen(loaded, BUSY_INPUT_FLAG) is True
assert is_seen(loaded, TOOL_PROGRESS_FLAG) is True
# ---------------------------------------------------------------------------
# OpenClaw residue banner
# ---------------------------------------------------------------------------
class TestDetectOpenclawResidue:
def test_returns_true_when_openclaw_dir_present(self, tmp_path):
(tmp_path / ".openclaw").mkdir()
assert detect_openclaw_residue(home=tmp_path) is True
def test_returns_false_when_absent(self, tmp_path):
assert detect_openclaw_residue(home=tmp_path) is False
def test_returns_false_when_path_is_a_file(self, tmp_path):
# A stray file named ``.openclaw`` is NOT a workspace — skip the banner.
(tmp_path / ".openclaw").write_text("oops")
assert detect_openclaw_residue(home=tmp_path) is False
def test_default_home_does_not_crash(self):
# Smoke: real $HOME lookup must not raise regardless of state.
assert isinstance(detect_openclaw_residue(), bool)
class TestOpenclawResidueHint:
def test_hint_mentions_cleanup_command(self):
msg = openclaw_residue_hint_cli()
assert "hermes claw cleanup" in msg
assert "~/.openclaw" in msg
def test_hint_not_empty(self):
assert openclaw_residue_hint_cli().strip()
class TestOpenclawResidueSeenFlag:
def test_flag_independent_of_other_flags(self, tmp_path):
cfg_path = tmp_path / "config.yaml"
mark_seen(cfg_path, BUSY_INPUT_FLAG)
loaded = yaml.safe_load(cfg_path.read_text())
assert is_seen(loaded, OPENCLAW_RESIDUE_FLAG) is False
def test_flag_round_trips(self, tmp_path):
cfg_path = tmp_path / "config.yaml"
assert mark_seen(cfg_path, OPENCLAW_RESIDUE_FLAG) is True
loaded = yaml.safe_load(cfg_path.read_text())
assert is_seen(loaded, OPENCLAW_RESIDUE_FLAG) is True
-71
View File
@@ -240,74 +240,3 @@ class TestAllowlistOps:
and e.get("command") == str(script)
]
assert len(matching) == 1
# ── hooks_auto_accept config parsing ──────────────────────────────────────
class TestHooksAutoAcceptParsing:
"""Regression guard: YAML-string values must not silently auto-accept.
``bool("false")`` is ``True`` in Python, so the old ``return bool(cfg_val)``
path treated ``hooks_auto_accept: "false"`` (quoted YAML string) as a
truthy opt-in, silently bypassing user consent for every shell hook.
"""
def test_bool_true_accepts(self):
assert shell_hooks._resolve_effective_accept(
{"hooks_auto_accept": True}, accept_hooks_arg=False,
) is True
def test_bool_false_rejects(self):
assert shell_hooks._resolve_effective_accept(
{"hooks_auto_accept": False}, accept_hooks_arg=False,
) is False
def test_string_false_rejects(self):
# The bug: bool("false") is True. Must be parsed, not coerced.
assert shell_hooks._resolve_effective_accept(
{"hooks_auto_accept": "false"}, accept_hooks_arg=False,
) is False
def test_string_no_rejects(self):
assert shell_hooks._resolve_effective_accept(
{"hooks_auto_accept": "no"}, accept_hooks_arg=False,
) is False
def test_string_true_accepts(self):
assert shell_hooks._resolve_effective_accept(
{"hooks_auto_accept": "true"}, accept_hooks_arg=False,
) is True
def test_string_true_case_insensitive(self):
assert shell_hooks._resolve_effective_accept(
{"hooks_auto_accept": " TRUE "}, accept_hooks_arg=False,
) is True
def test_string_yes_on_one_accept(self):
for val in ("yes", "on", "1"):
assert shell_hooks._resolve_effective_accept(
{"hooks_auto_accept": val}, accept_hooks_arg=False,
) is True, val
def test_missing_key_rejects(self):
assert shell_hooks._resolve_effective_accept(
{}, accept_hooks_arg=False,
) is False
def test_none_rejects(self):
assert shell_hooks._resolve_effective_accept(
{"hooks_auto_accept": None}, accept_hooks_arg=False,
) is False
def test_integer_ignored(self):
# Only bool and str are honored; anything else (including 1) is False.
assert shell_hooks._resolve_effective_accept(
{"hooks_auto_accept": 1}, accept_hooks_arg=False,
) is False
def test_cli_arg_overrides_config(self):
assert shell_hooks._resolve_effective_accept(
{"hooks_auto_accept": "false"}, accept_hooks_arg=True,
) is True
-24
View File
@@ -160,30 +160,6 @@ class TestBranchCommandCLI:
assert agent.reset_session_state.called
assert agent._last_flushed_db_idx == 4 # len(conversation_history)
def test_branch_updates_agent_session_log_file(self, cli_instance, session_db, tmp_path):
"""Branching must redirect the agent's session_log_file to the new session's path."""
from cli import HermesCLI
from pathlib import Path
logs_dir = tmp_path / "sessions"
logs_dir.mkdir()
agent = MagicMock()
agent._last_flushed_db_idx = 0
agent.logs_dir = logs_dir
agent.session_log_file = logs_dir / f"session_{cli_instance.session_id}.json"
cli_instance.agent = agent
old_log_file = agent.session_log_file
HermesCLI._handle_branch_command(cli_instance, "/branch")
new_session_id = cli_instance.session_id
expected_log = logs_dir / f"session_{new_session_id}.json"
assert agent.session_log_file == expected_log, (
"session_log_file must point to the branch session, not the original"
)
assert agent.session_log_file != old_log_file
def test_branch_sets_resumed_flag(self, cli_instance, session_db):
"""Branch should set _resumed=True to prevent auto-title generation."""
from cli import HermesCLI
+1 -30
View File
@@ -65,35 +65,6 @@ class TestHandleBusyCommand(unittest.TestCase):
self.assertEqual(stub.busy_input_mode, "interrupt")
mock_save.assert_called_once_with("display.busy_input_mode", "interrupt")
def test_steer_argument_sets_steer_mode_and_saves(self):
cli_mod = _import_cli()
stub = self._make_cli("interrupt")
with (
patch.object(cli_mod, "_cprint") as mock_cprint,
patch.object(cli_mod, "save_config_value", return_value=True) as mock_save,
):
cli_mod.HermesCLI._handle_busy_command(stub, "/busy steer")
self.assertEqual(stub.busy_input_mode, "steer")
mock_save.assert_called_once_with("display.busy_input_mode", "steer")
printed = " ".join(str(c) for c in mock_cprint.call_args_list)
self.assertIn("steer", printed.lower())
def test_status_reports_steer_behavior(self):
cli_mod = _import_cli()
stub = self._make_cli("steer")
with (
patch.object(cli_mod, "_cprint") as mock_cprint,
patch.object(cli_mod, "save_config_value") as mock_save,
):
cli_mod.HermesCLI._handle_busy_command(stub, "/busy status")
mock_save.assert_not_called()
printed = " ".join(str(c) for c in mock_cprint.call_args_list)
self.assertIn("steer", printed.lower())
# The usage line should also advertise the steer option
self.assertIn("steer", printed)
def test_invalid_argument_prints_usage(self):
cli_mod = _import_cli()
stub = self._make_cli()
@@ -119,5 +90,5 @@ class TestBusyCommandRegistry(unittest.TestCase):
from hermes_cli.commands import COMMAND_REGISTRY
busy = next(c for c in COMMAND_REGISTRY if c.name == "busy")
assert busy.args_hint == "[queue|steer|interrupt|status]"
assert busy.args_hint == "[queue|interrupt|status]"
assert busy.category == "Configuration"
-82
View File
@@ -31,40 +31,6 @@ def _make_cli_stub():
return cli
def _make_background_cli_stub():
cli = _make_cli_stub()
cli._background_task_counter = 0
cli._background_tasks = {}
cli._ensure_runtime_credentials = MagicMock(return_value=True)
cli._resolve_turn_agent_config = MagicMock(return_value={
"model": "test-model",
"runtime": {
"api_key": "test-key",
"base_url": "https://example.test/v1",
"provider": "test",
"api_mode": "chat_completions",
},
"request_overrides": None,
})
cli.max_turns = 90
cli.enabled_toolsets = []
cli._session_db = None
cli.reasoning_config = {}
cli.service_tier = None
cli._providers_only = None
cli._providers_ignore = None
cli._providers_order = None
cli._provider_sort = None
cli._provider_require_params = None
cli._provider_data_collection = None
cli._fallback_model = None
cli._agent_running = False
cli._spinner_text = ""
cli.bell_on_complete = False
cli.final_response_markdown = "strip"
return cli
class TestCliApprovalUi:
def test_sudo_prompt_restores_existing_draft_after_response(self):
cli = _make_cli_stub()
@@ -289,54 +255,6 @@ class TestCliApprovalUi:
# Command got truncated with a marker.
assert "(command truncated" in rendered
def test_background_task_registers_thread_local_approval_callbacks(self):
"""Background /btw tasks must use the prompt_toolkit approval UI.
The foreground chat path registers dangerous-command callbacks inside
its worker thread because tools.terminal_tool stores them in
threading.local(). /background used to skip that, so dangerous commands
fell back to raw input() in a background thread and timed out under
prompt_toolkit.
"""
cli = _make_background_cli_stub()
seen = {}
class FakeAgent:
def __init__(self, **kwargs):
self._print_fn = None
self.thinking_callback = None
def run_conversation(self, **kwargs):
from tools.terminal_tool import (
_get_approval_callback,
_get_sudo_password_callback,
)
seen["approval"] = _get_approval_callback()
seen["sudo"] = _get_sudo_password_callback()
return {
"final_response": "done",
"messages": [],
"completed": True,
"failed": False,
}
with patch.object(cli_module, "AIAgent", FakeAgent), \
patch.object(cli_module, "_cprint"), \
patch.object(cli_module, "ChatConsole") as chat_console:
chat_console.return_value.print = MagicMock()
cli._handle_background_command("/btw check weather")
deadline = time.time() + 2
while cli._background_tasks and time.time() < deadline:
time.sleep(0.01)
assert seen["approval"].__self__ is cli
assert seen["approval"].__func__ is HermesCLI._approval_callback
assert seen["sudo"].__self__ is cli
assert seen["sudo"].__func__ is HermesCLI._sudo_password_callback
assert not cli._background_tasks
class TestApprovalCallbackThreadLocalWiring:
"""Regression guard for the thread-local callback freeze (#13617 / #13618).
@@ -1,102 +0,0 @@
"""Tests for /save — the conversation snapshot slash command.
Regression: the old implementation wrote ``hermes_conversation_<ts>.json``
to the current working directory (CWD). Users who ran /save expected the
file to be discoverable via ``hermes sessions browse``, but CWD-resident
snapshots are not indexed in the state DB and are generally invisible.
The fix writes snapshots under ``~/.hermes/sessions/saved/`` and prints
the absolute path plus the resume hint for the live session.
"""
from __future__ import annotations
import json
import os
import sys
from datetime import datetime
from pathlib import Path
from types import SimpleNamespace
import pytest
@pytest.fixture
def hermes_home(tmp_path, monkeypatch):
home = tmp_path / ".hermes"
home.mkdir()
monkeypatch.setattr(Path, "home", lambda: tmp_path)
monkeypatch.setenv("HERMES_HOME", str(home))
# Clear any cached hermes_home computation
import hermes_constants
if hasattr(hermes_constants, "_hermes_home_cache"):
hermes_constants._hermes_home_cache = None
return home
def _make_stub_cli(history):
"""Build a minimal object exposing just what save_conversation uses."""
return SimpleNamespace(
conversation_history=history,
model="test-model",
session_id="20260101_120000_abc123",
session_start=datetime(2026, 1, 1, 12, 0, 0),
)
def test_save_conversation_writes_under_hermes_home(hermes_home, tmp_path, monkeypatch, capsys):
"""Snapshot must land under ~/.hermes/sessions/saved/, not CWD."""
# Change CWD to a different directory to prove the file does NOT go there.
work = tmp_path / "somewhere-else"
work.mkdir()
monkeypatch.chdir(work)
# Import fresh to pick up the HERMES_HOME fixture
for mod in [m for m in sys.modules if m.startswith("cli") or m == "hermes_constants"]:
sys.modules.pop(mod, None)
import cli # noqa: F401 (module under test)
stub = _make_stub_cli([
{"role": "user", "content": "hi"},
{"role": "assistant", "content": "hello"},
])
# Call the unbound method against our stub.
cli.HermesCLI.save_conversation(stub)
# File must NOT be in CWD
cwd_leak = list(work.glob("hermes_conversation_*.json"))
assert not cwd_leak, f"snapshot leaked to CWD: {cwd_leak}"
# File MUST be under ~/.hermes/sessions/saved/
saved_dir = hermes_home / "sessions" / "saved"
assert saved_dir.is_dir(), "expected saved/ subdirectory to be created"
files = list(saved_dir.glob("hermes_conversation_*.json"))
assert len(files) == 1, files
payload = json.loads(files[0].read_text())
assert payload["model"] == "test-model"
assert payload["session_id"] == "20260101_120000_abc123"
assert payload["messages"] == [
{"role": "user", "content": "hi"},
{"role": "assistant", "content": "hello"},
]
# User-facing message must include the absolute path AND the resume hint.
out = capsys.readouterr().out
assert str(files[0]) in out, out
assert "hermes --resume 20260101_120000_abc123" in out, out
def test_save_conversation_empty_history_does_nothing(hermes_home, capsys):
for mod in [m for m in sys.modules if m.startswith("cli") or m == "hermes_constants"]:
sys.modules.pop(mod, None)
import cli
stub = _make_stub_cli([])
cli.HermesCLI.save_conversation(stub)
saved_dir = hermes_home / "sessions" / "saved"
assert not saved_dir.exists() or not list(saved_dir.iterdir())
out = capsys.readouterr().out
assert "No conversation to save" in out
-15
View File
@@ -211,21 +211,6 @@ _HERMES_BEHAVIORAL_VARS = frozenset({
"SIGNAL_ALLOW_ALL_USERS",
"EMAIL_ALLOW_ALL_USERS",
"SMS_ALLOW_ALL_USERS",
# Platform gating — set by load_gateway_config() as a side effect when
# a config.yaml is present, so individual test bodies that call the
# loader leak these values into later tests on the same xdist worker.
# Force-clear on every test setup so the leak can't happen.
"SLACK_REQUIRE_MENTION",
"SLACK_STRICT_MENTION",
"SLACK_FREE_RESPONSE_CHANNELS",
"SLACK_ALLOW_BOTS",
"SLACK_REACTIONS",
"DISCORD_REQUIRE_MENTION",
"DISCORD_FREE_RESPONSE_CHANNELS",
"TELEGRAM_REQUIRE_MENTION",
"WHATSAPP_REQUIRE_MENTION",
"DINGTALK_REQUIRE_MENTION",
"MATRIX_REQUIRE_MENTION",
})
-129
View File
@@ -1043,132 +1043,3 @@ class TestAgentCacheIdleResume:
new_agent.close()
except Exception:
pass
_FAKE_NOW = 10_000.0 # Fixed epoch for deterministic time assertions
class TestCachedAgentInactivityReset:
"""Inactivity-clock reset must be gated on _interrupt_depth == 0.
On interrupt-recursive turns (_interrupt_depth > 0) the clock must
keep accumulating so the inactivity watchdog can fire when a turn is
stuck in an interrupt loop. Resetting unconditionally prevented the
30-min timeout from triggering (#15654). The depth-0 reset is still
needed: a session idle for 29 min must not trip the watchdog before
the new turn makes its first API call (#9051).
"""
def _fake_agent(self, stale_seconds: float = 1800.0):
m = MagicMock()
m._last_activity_ts = _FAKE_NOW - stale_seconds
m._api_call_count = 10
m._last_activity_desc = "previous turn activity"
return m
def test_fresh_turn_resets_idle_clock(self):
"""interrupt_depth=0: clock resets so a post-idle turn gets a
fresh 30-min inactivity window (guard for #9051)."""
from gateway.run import GatewayRunner
agent = self._fake_agent(stale_seconds=1800.0)
old_ts = agent._last_activity_ts
with patch("gateway.run.time") as mock_time:
mock_time.time.return_value = _FAKE_NOW
GatewayRunner._init_cached_agent_for_turn(agent, interrupt_depth=0)
assert agent._last_activity_ts == _FAKE_NOW, (
"_last_activity_ts was not reset on a fresh turn (interrupt_depth=0)"
)
assert agent._last_activity_ts > old_ts, (
"Stale idle time should be cleared so the new turn gets a fresh window"
)
def test_fresh_turn_resets_desc(self):
"""interrupt_depth=0: description is updated to reflect the new turn."""
from gateway.run import GatewayRunner
agent = self._fake_agent()
with patch("gateway.run.time") as mock_time:
mock_time.time.return_value = _FAKE_NOW
GatewayRunner._init_cached_agent_for_turn(agent, interrupt_depth=0)
assert agent._last_activity_desc == "starting new turn (cached)"
def test_interrupt_turn_preserves_idle_clock(self):
"""interrupt_depth=1: clock preserved so accumulated stuck-turn
idle time is not discarded by an interrupt-recursive re-entry (#15654)."""
from gateway.run import GatewayRunner
agent = self._fake_agent(stale_seconds=1200.0)
old_ts = agent._last_activity_ts
GatewayRunner._init_cached_agent_for_turn(agent, interrupt_depth=1)
assert agent._last_activity_ts == old_ts, (
"_last_activity_ts must not be reset on interrupt-recursive turns "
"(interrupt_depth>0) — the watchdog needs the accumulated idle time"
)
def test_interrupt_turn_preserves_desc(self):
"""interrupt_depth=1: desc preserved — it is semantically paired with ts."""
from gateway.run import GatewayRunner
agent = self._fake_agent(stale_seconds=1200.0)
GatewayRunner._init_cached_agent_for_turn(agent, interrupt_depth=1)
assert agent._last_activity_desc == "previous turn activity", (
"_last_activity_desc must not change on interrupt-recursive turns; "
"it describes the activity *at* _last_activity_ts"
)
def test_deep_interrupt_recursion_preserves_idle_clock(self):
"""interrupt_depth=MAX-1: clock still preserved at any non-zero depth."""
from gateway.run import GatewayRunner
agent = self._fake_agent(stale_seconds=600.0)
old_ts = agent._last_activity_ts
GatewayRunner._init_cached_agent_for_turn(agent, interrupt_depth=4)
assert agent._last_activity_ts == old_ts
def test_api_call_count_reset_regardless_of_depth(self):
"""_api_call_count is always reset to 0 for the new turn, at any depth."""
from gateway.run import GatewayRunner
agent_fresh = self._fake_agent()
agent_interrupted = self._fake_agent()
with patch("gateway.run.time") as mock_time:
mock_time.time.return_value = _FAKE_NOW
GatewayRunner._init_cached_agent_for_turn(agent_fresh, interrupt_depth=0)
GatewayRunner._init_cached_agent_for_turn(agent_interrupted, interrupt_depth=1)
assert agent_fresh._api_call_count == 0
assert agent_interrupted._api_call_count == 0
def test_watchdog_accumulation_across_recursive_turns(self):
"""Scenario: stuck turn + user interrupt → recursive turn.
The idle time seen by the watchdog must reflect the full stuck
duration, not restart from zero on the recursive re-entry.
"""
from gateway.run import GatewayRunner
STUCK_FOR = 1750.0
agent = self._fake_agent(stale_seconds=STUCK_FOR)
# Simulate: user sees "Still working..." and sends another message.
# That triggers an interrupt → _run_agent recurses at depth=1.
GatewayRunner._init_cached_agent_for_turn(agent, interrupt_depth=1)
# Watchdog sees time.time() - _last_activity_ts ≥ STUCK_FOR.
idle_secs = _FAKE_NOW - agent._last_activity_ts
assert idle_secs >= STUCK_FOR - 1.0, (
f"Watchdog would see {idle_secs:.0f}s idle, expected ~{STUCK_FOR}s. "
"Inactivity timeout could not fire for a stuck interrupted turn."
)
-85
View File
@@ -186,91 +186,6 @@ class TestBusySessionAck:
assert "respond once the current task finishes" in content
assert "Interrupting" not in content
@pytest.mark.asyncio
async def test_steer_mode_calls_agent_steer_no_interrupt_no_queue(self):
"""busy_input_mode='steer' injects via agent.steer() and skips queueing."""
runner, sentinel = _make_runner()
runner._busy_input_mode = "steer"
adapter = _make_adapter()
event = _make_event(text="also check the tests")
sk = build_session_key(event.source)
runner.adapters[event.source.platform] = adapter
agent = MagicMock()
agent.steer = MagicMock(return_value=True)
runner._running_agents[sk] = agent
with patch("gateway.run.merge_pending_message_event") as mock_merge:
await runner._handle_active_session_busy_message(event, sk)
# VERIFY: Agent was steered, NOT interrupted
agent.steer.assert_called_once_with("also check the tests")
agent.interrupt.assert_not_called()
# VERIFY: No queueing — successful steer must NOT replay as next turn
mock_merge.assert_not_called()
# VERIFY: Ack mentions steer wording
adapter._send_with_retry.assert_called_once()
call_kwargs = adapter._send_with_retry.call_args
content = call_kwargs.kwargs.get("content") or call_kwargs[1].get("content", "")
assert "Steered" in content or "steer" in content.lower()
assert "Interrupting" not in content
@pytest.mark.asyncio
async def test_steer_mode_falls_back_to_queue_when_agent_rejects(self):
"""If agent.steer() returns False, fall back to queue behavior."""
runner, sentinel = _make_runner()
runner._busy_input_mode = "steer"
adapter = _make_adapter()
event = _make_event(text="empty or rejected")
sk = build_session_key(event.source)
runner.adapters[event.source.platform] = adapter
agent = MagicMock()
agent.steer = MagicMock(return_value=False) # rejected
runner._running_agents[sk] = agent
with patch("gateway.run.merge_pending_message_event") as mock_merge:
await runner._handle_active_session_busy_message(event, sk)
agent.steer.assert_called_once()
agent.interrupt.assert_not_called()
# Fell back to queue semantics: event was merged into pending messages
mock_merge.assert_called_once()
# Ack uses queue-mode wording (not steer, not interrupt)
call_kwargs = adapter._send_with_retry.call_args
content = call_kwargs.kwargs.get("content") or call_kwargs[1].get("content", "")
assert "Queued for the next turn" in content
assert "Steered" not in content
@pytest.mark.asyncio
async def test_steer_mode_falls_back_to_queue_when_agent_pending(self):
"""If agent is still starting (sentinel), steer mode falls back to queue."""
runner, sentinel = _make_runner()
runner._busy_input_mode = "steer"
adapter = _make_adapter()
event = _make_event(text="arrived too early")
sk = build_session_key(event.source)
runner.adapters[event.source.platform] = adapter
# Agent is still being set up — sentinel in place
runner._running_agents[sk] = sentinel
with patch("gateway.run.merge_pending_message_event") as mock_merge:
await runner._handle_active_session_busy_message(event, sk)
# Event was queued instead of steered
mock_merge.assert_called_once()
call_kwargs = adapter._send_with_retry.call_args
content = call_kwargs.kwargs.get("content") or call_kwargs[1].get("content", "")
assert "Queued for the next turn" in content
@pytest.mark.asyncio
async def test_debounce_suppresses_rapid_acks(self):
"""Second message within 30s should NOT send another ack."""
+2 -152
View File
@@ -1,11 +1,9 @@
"""Tests for gateway/channel_directory.py — channel resolution and display."""
import asyncio
import json
import os
from pathlib import Path
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock, patch
from unittest.mock import patch
from gateway.channel_directory import (
build_channel_directory,
@@ -14,7 +12,6 @@ from gateway.channel_directory import (
format_directory_for_display,
load_directory,
_build_from_sessions,
_build_slack,
DIRECTORY_PATH,
)
@@ -65,7 +62,7 @@ class TestBuildChannelDirectoryWrites:
monkeypatch.setattr(json, "dump", broken_dump)
with patch("gateway.channel_directory.DIRECTORY_PATH", cache_file):
asyncio.run(build_channel_directory({}))
build_channel_directory({})
result = load_directory()
assert result == previous
@@ -145,21 +142,6 @@ class TestResolveChannelName:
with self._setup(tmp_path, platforms):
assert resolve_channel_name("telegram", "Coaching Chat / topic 17585") == "-1001:17585"
def test_id_match_takes_precedence_over_name(self, tmp_path):
"""A raw channel ID resolves to itself, even when a different
channel happens to be named the same string. Case-sensitive: Slack
IDs are uppercase and must not be normalized away."""
platforms = {
"slack": [
{"id": "C0B0QV5434G", "name": "engineering", "type": "channel"},
{"id": "C99", "name": "c0b0qv5434g", "type": "channel"},
]
}
with self._setup(tmp_path, platforms):
assert resolve_channel_name("slack", "C0B0QV5434G") == "C0B0QV5434G"
# Lowercase still falls through to name matching (case-insensitive)
assert resolve_channel_name("slack", "c0b0qv5434g") == "C99"
def test_display_label_with_type_suffix_resolves(self, tmp_path):
platforms = {
"telegram": [
@@ -350,135 +332,3 @@ class TestLookupChannelType:
}
with self._setup(tmp_path, platforms):
assert lookup_channel_type("discord", "300") is None
def _make_slack_adapter(team_clients):
"""Build a stand-in for SlackAdapter exposing only ``_team_clients``."""
return SimpleNamespace(_team_clients=team_clients)
def _make_slack_client(pages):
"""Build an AsyncWebClient mock whose ``users_conversations`` returns pages."""
client = MagicMock()
client.users_conversations = AsyncMock(side_effect=pages)
return client
class TestBuildSlack:
"""_build_slack actually calls users.conversations on each workspace client."""
def test_no_team_clients_falls_back_to_sessions(self, tmp_path):
sessions_path = tmp_path / "sessions" / "sessions.json"
sessions_path.parent.mkdir(parents=True)
sessions_path.write_text(json.dumps({
"s1": {"origin": {"platform": "slack", "chat_id": "D123", "chat_name": "Alice"}},
}))
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
entries = asyncio.run(_build_slack(_make_slack_adapter({})))
assert len(entries) == 1
assert entries[0]["id"] == "D123"
def test_lists_channels_from_users_conversations(self, tmp_path):
client = _make_slack_client([
{
"ok": True,
"channels": [
{"id": "C0B0QV5434G", "name": "engineering", "is_private": False},
{"id": "G123ABCDEF", "name": "secret-chat", "is_private": True},
],
"response_metadata": {},
},
])
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
entries = asyncio.run(_build_slack(_make_slack_adapter({"T1": client})))
ids = {e["id"] for e in entries}
assert ids == {"C0B0QV5434G", "G123ABCDEF"}
types = {e["id"]: e["type"] for e in entries}
assert types["C0B0QV5434G"] == "channel"
assert types["G123ABCDEF"] == "private"
client.users_conversations.assert_awaited_once()
def test_paginates_via_response_metadata_cursor(self, tmp_path):
client = _make_slack_client([
{
"ok": True,
"channels": [{"id": "C001", "name": "first", "is_private": False}],
"response_metadata": {"next_cursor": "cur1"},
},
{
"ok": True,
"channels": [{"id": "C002", "name": "second", "is_private": False}],
"response_metadata": {"next_cursor": ""},
},
])
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
entries = asyncio.run(_build_slack(_make_slack_adapter({"T1": client})))
assert {e["id"] for e in entries} == {"C001", "C002"}
assert client.users_conversations.await_count == 2
def test_per_workspace_error_does_not_block_others(self, tmp_path):
bad = MagicMock()
bad.users_conversations = AsyncMock(side_effect=RuntimeError("boom"))
good = _make_slack_client([
{
"ok": True,
"channels": [{"id": "C999", "name": "ok-channel", "is_private": False}],
"response_metadata": {},
},
])
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
entries = asyncio.run(_build_slack(_make_slack_adapter({"BAD": bad, "GOOD": good})))
assert {e["id"] for e in entries} == {"C999"}
def test_session_dms_merged_when_not_in_api_results(self, tmp_path):
sessions_path = tmp_path / "sessions" / "sessions.json"
sessions_path.parent.mkdir(parents=True)
sessions_path.write_text(json.dumps({
"s1": {"origin": {"platform": "slack", "chat_id": "D456", "chat_name": "Bob"}},
"dup": {"origin": {"platform": "slack", "chat_id": "C001", "chat_name": "first"}},
}))
client = _make_slack_client([
{
"ok": True,
"channels": [{"id": "C001", "name": "first", "is_private": False}],
"response_metadata": {},
},
])
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
entries = asyncio.run(_build_slack(_make_slack_adapter({"T1": client})))
ids = {e["id"] for e in entries}
assert "C001" in ids and "D456" in ids
# Channel ID from API should not be duplicated by the session merge
assert sum(1 for e in entries if e["id"] == "C001") == 1
def test_skips_channels_with_no_id_or_name(self, tmp_path):
client = _make_slack_client([
{
"ok": True,
"channels": [
{"id": "C001", "name": "good", "is_private": False},
{"id": "", "name": "no-id"},
{"id": "C002"}, # no name (e.g. IM)
],
"response_metadata": {},
},
])
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
entries = asyncio.run(_build_slack(_make_slack_adapter({"T1": client})))
assert {e["id"] for e in entries} == {"C001"}
def test_response_not_ok_breaks_pagination_for_that_workspace(self, tmp_path):
client = _make_slack_client([
{"ok": False, "error": "missing_scope"},
])
with patch.dict(os.environ, {"HERMES_HOME": str(tmp_path)}):
entries = asyncio.run(_build_slack(_make_slack_adapter({"T1": client})))
assert entries == []
+6 -12
View File
@@ -186,18 +186,12 @@ class TestPlatformDefaults:
assert resolve_display_setting({}, plat, "tool_progress") == "all", plat
def test_medium_tier_platforms(self):
"""Mattermost, Matrix, Feishu, WhatsApp default to 'new' tool progress."""
"""Slack, Mattermost, Matrix default to 'new' tool progress."""
from gateway.display_config import resolve_display_setting
for plat in ("mattermost", "matrix", "feishu", "whatsapp"):
for plat in ("slack", "mattermost", "matrix", "feishu", "whatsapp"):
assert resolve_display_setting({}, plat, "tool_progress") == "new", plat
def test_slack_defaults_tool_progress_off(self):
"""Slack defaults to quiet tool progress (permanent chat noise otherwise)."""
from gateway.display_config import resolve_display_setting
assert resolve_display_setting({}, "slack", "tool_progress") == "off"
def test_low_tier_platforms(self):
"""Signal, BlueBubbles, etc. default to 'off' tool progress."""
from gateway.display_config import resolve_display_setting
@@ -247,7 +241,7 @@ class TestConfigMigration:
},
},
}
config_path.write_text(yaml.dump(config), encoding="utf-8")
config_path.write_text(yaml.dump(config))
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
# Re-import to pick up the new HERMES_HOME
@@ -257,7 +251,7 @@ class TestConfigMigration:
result = cfg_mod.migrate_config(interactive=False, quiet=True)
# Re-read config
updated = yaml.safe_load(config_path.read_text(encoding="utf-8"))
updated = yaml.safe_load(config_path.read_text())
platforms = updated.get("display", {}).get("platforms", {})
assert platforms.get("signal", {}).get("tool_progress") == "off"
assert platforms.get("telegram", {}).get("tool_progress") == "all"
@@ -274,7 +268,7 @@ class TestConfigMigration:
"platforms": {"telegram": {"tool_progress": "verbose"}},
},
}
config_path.write_text(yaml.dump(config), encoding="utf-8")
config_path.write_text(yaml.dump(config))
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
import importlib
@@ -282,7 +276,7 @@ class TestConfigMigration:
importlib.reload(cfg_mod)
cfg_mod.migrate_config(interactive=False, quiet=True)
updated = yaml.safe_load(config_path.read_text(encoding="utf-8"))
updated = yaml.safe_load(config_path.read_text())
# Existing "verbose" should NOT be overwritten by legacy "off"
assert updated["display"]["platforms"]["telegram"]["tool_progress"] == "verbose"
-71
View File
@@ -249,77 +249,6 @@ class TestFeishuAdapterMessaging(unittest.TestCase):
)
release_lock.assert_called_once_with("feishu-app-id", "cli_app")
def test_disconnect_sends_websocket_close_frame(self):
"""Regression test for #10202: disconnect() must call the WSS
client's ``_disconnect()`` coroutine so a WebSocket CLOSE frame
is sent to Feishu. Without this, Feishu's server continues
routing to the stale connection, silencing the channel.
"""
import threading
from types import SimpleNamespace
from gateway.config import PlatformConfig
from gateway.platforms.feishu import FeishuAdapter
adapter = FeishuAdapter(PlatformConfig())
# Real thread loop to schedule the close coroutine on.
ws_thread_loop = asyncio.new_event_loop()
ready = threading.Event()
def _run_loop() -> None:
asyncio.set_event_loop(ws_thread_loop)
ready.set()
ws_thread_loop.run_forever()
thread = threading.Thread(target=_run_loop, daemon=True)
thread.start()
ready.wait()
close_called = threading.Event()
async def _fake_disconnect() -> None:
close_called.set()
ws_client = SimpleNamespace(_disconnect=_fake_disconnect, _auto_reconnect=True)
adapter._ws_client = ws_client
adapter._ws_thread_loop = ws_thread_loop
adapter._ws_future = None
try:
asyncio.run(adapter.disconnect())
finally:
if not ws_thread_loop.is_closed():
ws_thread_loop.call_soon_threadsafe(ws_thread_loop.stop)
thread.join(timeout=2.0)
if not ws_thread_loop.is_closed():
ws_thread_loop.close()
self.assertTrue(
close_called.is_set(),
"disconnect() must schedule ws_client._disconnect() on the ws thread loop",
)
# _disable_websocket_auto_reconnect() must still run.
self.assertIsNone(adapter._ws_client)
def test_disconnect_tolerates_missing_internal_disconnect(self):
"""If the lark_oapi client layout changes and ``_disconnect``
disappears, disconnect() must not raise fall through to the
existing task-cancel path.
"""
from types import SimpleNamespace
from gateway.config import PlatformConfig
from gateway.platforms.feishu import FeishuAdapter
adapter = FeishuAdapter(PlatformConfig())
# No ``_disconnect`` attribute — ``hasattr`` guard should skip.
adapter._ws_client = SimpleNamespace(_auto_reconnect=True)
adapter._ws_thread_loop = None
adapter._ws_future = None
# Must not raise.
asyncio.run(adapter.disconnect())
self.assertIsNone(adapter._ws_client)
@patch.dict(os.environ, {
"FEISHU_APP_ID": "cli_app",
"FEISHU_APP_SECRET": "secret_app",
+1 -59
View File
@@ -540,7 +540,7 @@ from gateway.config import Platform, PlatformConfig # noqa: E402
def _make_slack_adapter():
config = PlatformConfig(enabled=True, token="***")
config = PlatformConfig(enabled=True, token="xoxb-fake-token")
adapter = SlackAdapter(config)
adapter._app = MagicMock()
adapter._app.client = AsyncMock()
@@ -549,39 +549,6 @@ def _make_slack_adapter():
return adapter
# ---------------------------------------------------------------------------
# SlackAdapter diagnostics helpers
# ---------------------------------------------------------------------------
class TestSlackAttachmentDiagnostics:
def test_missing_scope_error_returns_actionable_notice(self):
"""_describe_slack_api_error translates a missing_scope response into
a user-facing notice mentioning the needed scope and the reinstall
step. This is the helper used by every files.info call site (Slack
Connect stubs + post-download failures) to surface scope problems
without making an extra probe call per attachment.
"""
adapter = _make_slack_adapter()
response = {
"error": "missing_scope",
"needed": "files:read",
"provided": "chat:write,files:write",
}
detail = adapter._describe_slack_api_error(response, file_obj={"id": "F123", "name": "photo.jpg"})
assert detail is not None
assert "files:read" in detail
assert "reinstall" in detail.lower()
assert "chat:write,files:write" in detail
def test_download_failure_403_returns_permission_notice(self):
adapter = _make_slack_adapter()
exc = _make_http_status_error(403)
detail = adapter._describe_slack_download_failure(exc, file_obj={"name": "report.pdf"})
assert "403" in detail
assert "permission or scope" in detail
# ---------------------------------------------------------------------------
# SlackAdapter._download_slack_file
# ---------------------------------------------------------------------------
@@ -735,7 +702,6 @@ class TestSlackDownloadSlackFileBytes:
fake_response = MagicMock()
fake_response.content = b"raw bytes here"
fake_response.raise_for_status = MagicMock()
fake_response.headers = {"content-type": "application/pdf"}
mock_client = AsyncMock()
mock_client.get = AsyncMock(return_value=fake_response)
@@ -751,29 +717,6 @@ class TestSlackDownloadSlackFileBytes:
result = asyncio.run(run())
assert result == b"raw bytes here"
def test_rejects_html_response(self):
"""Slack HTML sign-in pages should not be accepted as file bytes."""
adapter = _make_slack_adapter()
fake_response = MagicMock()
fake_response.content = b"<!DOCTYPE html><html><title>Slack</title></html>"
fake_response.raise_for_status = MagicMock()
fake_response.headers = {"content-type": "text/html; charset=utf-8"}
mock_client = AsyncMock()
mock_client.get = AsyncMock(return_value=fake_response)
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
mock_client.__aexit__ = AsyncMock(return_value=False)
async def run():
with patch("httpx.AsyncClient", return_value=mock_client):
await adapter._download_slack_file_bytes(
"https://files.slack.com/file.bin"
)
with pytest.raises(ValueError, match="HTML instead of file bytes"):
asyncio.run(run())
def test_retries_on_429_then_succeeds(self):
"""429 on first attempt is retried; raw bytes returned on second."""
adapter = _make_slack_adapter()
@@ -781,7 +724,6 @@ class TestSlackDownloadSlackFileBytes:
ok_response = MagicMock()
ok_response.content = b"final bytes"
ok_response.raise_for_status = MagicMock()
ok_response.headers = {"content-type": "application/pdf"}
mock_client = AsyncMock()
mock_client.get = AsyncMock(
@@ -77,19 +77,6 @@ class TestMessageDeduplicatorTTL:
assert "old-0" not in dedup._seen
assert "new-0" in dedup._seen
def test_max_size_eviction_caps_fresh_entries(self):
"""Fresh entries must still be capped to max_size on overflow."""
dedup = MessageDeduplicator(max_size=2, ttl_seconds=60)
dedup.is_duplicate("msg-1")
dedup.is_duplicate("msg-2")
dedup.is_duplicate("msg-3")
assert len(dedup._seen) == 2
assert "msg-1" not in dedup._seen
assert "msg-2" in dedup._seen
assert "msg-3" in dedup._seen
def test_ttl_zero_means_no_dedup(self):
"""With TTL=0, all entries expire immediately."""
dedup = MessageDeduplicator(ttl_seconds=0)
-69
View File
@@ -77,46 +77,6 @@ class TestFindSessionId:
assert result == "sess_topic_a"
def test_user_id_disambiguates_same_group_chat(self, tmp_path):
sessions_dir, index_file = _setup_sessions(tmp_path, {
"alice": {
"session_id": "sess_alice",
"origin": {"platform": "telegram", "chat_id": "-1001", "user_id": "alice"},
"updated_at": "2026-01-01T00:00:00",
},
"bob": {
"session_id": "sess_bob",
"origin": {"platform": "telegram", "chat_id": "-1001", "user_id": "bob"},
"updated_at": "2026-02-01T00:00:00",
},
})
with patch.object(mirror_mod, "_SESSIONS_DIR", sessions_dir), \
patch.object(mirror_mod, "_SESSIONS_INDEX", index_file):
result = _find_session_id("telegram", "-1001", user_id="alice")
assert result == "sess_alice"
def test_ambiguous_same_group_chat_without_user_id_returns_none(self, tmp_path):
sessions_dir, index_file = _setup_sessions(tmp_path, {
"alice": {
"session_id": "sess_alice",
"origin": {"platform": "telegram", "chat_id": "-1001", "user_id": "alice"},
"updated_at": "2026-01-01T00:00:00",
},
"bob": {
"session_id": "sess_bob",
"origin": {"platform": "telegram", "chat_id": "-1001", "user_id": "bob"},
"updated_at": "2026-02-01T00:00:00",
},
})
with patch.object(mirror_mod, "_SESSIONS_DIR", sessions_dir), \
patch.object(mirror_mod, "_SESSIONS_INDEX", index_file):
result = _find_session_id("telegram", "-1001")
assert result is None
def test_no_match_returns_none(self, tmp_path):
sessions_dir, index_file = _setup_sessions(tmp_path, {
"sess": {
@@ -229,35 +189,6 @@ class TestMirrorToSession:
assert (sessions_dir / "sess_topic_a.jsonl").exists()
assert not (sessions_dir / "sess_topic_b.jsonl").exists()
def test_successful_mirror_uses_user_id_for_group_session(self, tmp_path):
sessions_dir, index_file = _setup_sessions(tmp_path, {
"alice": {
"session_id": "sess_alice",
"origin": {"platform": "telegram", "chat_id": "-1001", "user_id": "alice"},
"updated_at": "2026-01-01T00:00:00",
},
"bob": {
"session_id": "sess_bob",
"origin": {"platform": "telegram", "chat_id": "-1001", "user_id": "bob"},
"updated_at": "2026-02-01T00:00:00",
},
})
with patch.object(mirror_mod, "_SESSIONS_DIR", sessions_dir), \
patch.object(mirror_mod, "_SESSIONS_INDEX", index_file), \
patch("gateway.mirror._append_to_sqlite"):
result = mirror_to_session(
"telegram",
"-1001",
"Hello group!",
source_label="cli",
user_id="alice",
)
assert result is True
assert (sessions_dir / "sess_alice.jsonl").exists()
assert not (sessions_dir / "sess_bob.jsonl").exists()
def test_no_matching_session(self, tmp_path):
sessions_dir, index_file = _setup_sessions(tmp_path, {})
+8 -185
View File
@@ -168,196 +168,19 @@ class TestQueueConsumptionAfterCompletion:
assert retrieved is not None
assert retrieved.text == "process this after"
def test_multiple_queues_overflow_fifo(self):
"""Multiple /queue commands must stack in FIFO order, no merging.
The adapter's _pending_messages dict has a single slot per session,
but GatewayRunner layers an overflow buffer on top so repeated
/queue invocations all get their own turn in order.
"""
from gateway.run import GatewayRunner
runner = GatewayRunner.__new__(GatewayRunner)
runner._queued_events = {}
def test_multiple_queues_last_one_wins(self):
"""If user /queue's multiple times, last message overwrites."""
adapter = _StubAdapter()
session_key = "telegram:user:123"
events = [
MessageEvent(
for text in ["first", "second", "third"]:
event = MessageEvent(
text=text,
message_type=MessageType.TEXT,
source=MagicMock(chat_id="123", platform=Platform.TELEGRAM),
source=MagicMock(),
message_id=f"q-{text}",
)
for text in ("first", "second", "third")
]
adapter._pending_messages[session_key] = event
for ev in events:
runner._enqueue_fifo(session_key, ev, adapter)
# Slot holds head; overflow holds the tail in order.
assert adapter._pending_messages[session_key].text == "first"
assert [e.text for e in runner._queued_events[session_key]] == ["second", "third"]
assert runner._queue_depth(session_key, adapter=adapter) == 3
def test_promote_advances_queue_fifo(self):
"""After the slot drains, the next overflow item is promoted."""
from gateway.run import GatewayRunner
runner = GatewayRunner.__new__(GatewayRunner)
runner._queued_events = {}
adapter = _StubAdapter()
session_key = "telegram:user:123"
for text in ("A", "B", "C"):
runner._enqueue_fifo(
session_key,
MessageEvent(
text=text,
message_type=MessageType.TEXT,
source=MagicMock(),
message_id=f"q-{text}",
),
adapter,
)
# Simulate turn 1 drain: consume slot, promote next.
pending_event = _dequeue_pending_event(adapter, session_key)
pending_event = runner._promote_queued_event(session_key, adapter, pending_event)
assert pending_event is not None and pending_event.text == "A"
assert adapter._pending_messages[session_key].text == "B"
assert runner._queue_depth(session_key, adapter=adapter) == 2
# Simulate turn 2 drain.
pending_event = _dequeue_pending_event(adapter, session_key)
pending_event = runner._promote_queued_event(session_key, adapter, pending_event)
assert pending_event.text == "B"
assert adapter._pending_messages[session_key].text == "C"
assert session_key not in runner._queued_events # overflow emptied
# Simulate turn 3 drain.
pending_event = _dequeue_pending_event(adapter, session_key)
pending_event = runner._promote_queued_event(session_key, adapter, pending_event)
assert pending_event.text == "C"
assert session_key not in adapter._pending_messages
assert runner._queue_depth(session_key, adapter=adapter) == 0
# Turn 4: nothing pending.
pending_event = _dequeue_pending_event(adapter, session_key)
pending_event = runner._promote_queued_event(session_key, adapter, pending_event)
assert pending_event is None
def test_promote_stages_overflow_when_slot_already_populated(self):
"""If the slot was re-populated (e.g. by an interrupt follow-up),
promotion must stage the overflow head without clobbering it."""
from gateway.run import GatewayRunner
runner = GatewayRunner.__new__(GatewayRunner)
runner._queued_events = {}
adapter = _StubAdapter()
session_key = "telegram:user:123"
# /queue once — lands in slot. Second /queue — overflow.
for text in ("Q1", "Q2"):
runner._enqueue_fifo(
session_key,
MessageEvent(
text=text,
message_type=MessageType.TEXT,
source=MagicMock(),
message_id=f"q-{text}",
),
adapter,
)
# Drain consumes Q1.
pending_event = _dequeue_pending_event(adapter, session_key)
assert pending_event.text == "Q1"
# Someone else (interrupt path) re-populates the slot.
interrupt_follow_up = MessageEvent(
text="urgent",
message_type=MessageType.TEXT,
source=MagicMock(),
message_id="m-urg",
)
adapter._pending_messages[session_key] = interrupt_follow_up
# Promotion must NOT overwrite the interrupt follow-up; Q2 should
# move into a position that runs AFTER it. In the current design
# the overflow head is staged in the slot AFTER the interrupt
# follow-up's turn runs — so here, the slot keeps the interrupt
# and Q2 stays queued. Verify we return the interrupt event and
# Q2 is positioned to run next.
returned = runner._promote_queued_event(session_key, adapter, interrupt_follow_up)
assert returned is interrupt_follow_up
# Q2 was moved into the slot, evicting the interrupt? No —
# current implementation puts Q2 in the slot unconditionally,
# overwriting the interrupt. This is an acceptable edge-case
# trade-off: /queue items always run after the currently-staged
# pending_event (which is what `returned` is), and the slot
# gets the next-in-line item.
assert adapter._pending_messages[session_key].text == "Q2"
def test_queue_depth_counts_slot_plus_overflow(self):
from gateway.run import GatewayRunner
runner = GatewayRunner.__new__(GatewayRunner)
runner._queued_events = {}
adapter = _StubAdapter()
session_key = "telegram:user:depth"
assert runner._queue_depth(session_key, adapter=adapter) == 0
runner._enqueue_fifo(
session_key,
MessageEvent(
text="one",
message_type=MessageType.TEXT,
source=MagicMock(),
message_id="q1",
),
adapter,
)
assert runner._queue_depth(session_key, adapter=adapter) == 1
for text in ("two", "three"):
runner._enqueue_fifo(
session_key,
MessageEvent(
text=text,
message_type=MessageType.TEXT,
source=MagicMock(),
message_id=f"q-{text}",
),
adapter,
)
assert runner._queue_depth(session_key, adapter=adapter) == 3
def test_enqueue_preserves_text_no_merging(self):
"""Each /queue item keeps its own text — never merged with neighbors."""
from gateway.run import GatewayRunner
runner = GatewayRunner.__new__(GatewayRunner)
runner._queued_events = {}
adapter = _StubAdapter()
session_key = "telegram:user:nomerge"
texts = ["deploy the branch", "then run tests", "finally push"]
for text in texts:
runner._enqueue_fifo(
session_key,
MessageEvent(
text=text,
message_type=MessageType.TEXT,
source=MagicMock(),
message_id=f"q-{text[:4]}",
),
adapter,
)
# Slot + overflow contain exactly the three texts, unmodified.
collected = [adapter._pending_messages[session_key].text] + [
e.text for e in runner._queued_events[session_key]
]
assert collected == texts
retrieved = adapter.get_pending_message(session_key)
assert retrieved.text == "third"
-12
View File
@@ -90,21 +90,9 @@ def test_load_busy_input_mode_prefers_env_then_config_then_default(tmp_path, mon
)
assert gateway_run.GatewayRunner._load_busy_input_mode() == "queue"
(tmp_path / "config.yaml").write_text(
"display:\n busy_input_mode: steer\n", encoding="utf-8"
)
assert gateway_run.GatewayRunner._load_busy_input_mode() == "steer"
monkeypatch.setenv("HERMES_GATEWAY_BUSY_INPUT_MODE", "interrupt")
assert gateway_run.GatewayRunner._load_busy_input_mode() == "interrupt"
monkeypatch.setenv("HERMES_GATEWAY_BUSY_INPUT_MODE", "steer")
assert gateway_run.GatewayRunner._load_busy_input_mode() == "steer"
# Unknown values fall through to the safe default
monkeypatch.setenv("HERMES_GATEWAY_BUSY_INPUT_MODE", "bogus")
assert gateway_run.GatewayRunner._load_busy_input_mode() == "interrupt"
def test_load_restart_drain_timeout_prefers_env_then_config_then_default(
tmp_path, monkeypatch, caplog
-1
View File
@@ -245,7 +245,6 @@ class TestBuildSessionContextPrompt:
assert "Slack" in prompt
assert "cannot search" in prompt.lower()
assert "pin" in prompt.lower()
assert "current message's slack block/attachment payload" in prompt.lower()
def test_discord_prompt_with_channel_topic(self):
"""Channel topic should appear in the session context prompt."""
@@ -76,7 +76,6 @@ def _make_resume_runner():
runner._running_agents_ts = {}
runner._busy_ack_ts = {}
runner._pending_approvals = {}
runner._update_prompt_pending = {}
runner._agent_cache_lock = None
runner.session_store = MagicMock()
runner.session_store.get_or_create_session.return_value = current_entry
@@ -103,7 +102,6 @@ def _make_branch_runner():
runner._running_agents_ts = {}
runner._busy_ack_ts = {}
runner._pending_approvals = {}
runner._update_prompt_pending = {}
runner._agent_cache_lock = None
runner.session_store = MagicMock()
runner.session_store.get_or_create_session.return_value = current_entry
@@ -129,8 +127,6 @@ async def test_resume_clears_session_scoped_approval_and_yolo_state():
enable_session_yolo(other_key)
runner._pending_approvals[session_key] = {"command": "rm -rf /tmp/demo"}
runner._pending_approvals[other_key] = {"command": "rm -rf /tmp/other"}
runner._update_prompt_pending[session_key] = True
runner._update_prompt_pending[other_key] = True
result = await runner._handle_resume_command(_make_event("/resume Resumed Work"))
@@ -138,11 +134,9 @@ async def test_resume_clears_session_scoped_approval_and_yolo_state():
assert is_approved(session_key, "recursive delete") is False
assert is_session_yolo_enabled(session_key) is False
assert session_key not in runner._pending_approvals
assert session_key not in runner._update_prompt_pending
assert is_approved(other_key, "recursive delete") is True
assert is_session_yolo_enabled(other_key) is True
assert other_key in runner._pending_approvals
assert other_key in runner._update_prompt_pending
@pytest.mark.asyncio
@@ -156,8 +150,6 @@ async def test_branch_clears_session_scoped_approval_and_yolo_state():
enable_session_yolo(other_key)
runner._pending_approvals[session_key] = {"command": "rm -rf /tmp/demo"}
runner._pending_approvals[other_key] = {"command": "rm -rf /tmp/other"}
runner._update_prompt_pending[session_key] = True
runner._update_prompt_pending[other_key] = True
result = await runner._handle_branch_command(_make_event("/branch"))
@@ -165,11 +157,9 @@ async def test_branch_clears_session_scoped_approval_and_yolo_state():
assert is_approved(session_key, "recursive delete") is False
assert is_session_yolo_enabled(session_key) is False
assert session_key not in runner._pending_approvals
assert session_key not in runner._update_prompt_pending
assert is_approved(other_key, "recursive delete") is True
assert is_session_yolo_enabled(other_key) is True
assert other_key in runner._pending_approvals
assert other_key in runner._update_prompt_pending
def test_clear_session_boundary_security_state_is_scoped():
@@ -182,7 +172,6 @@ def test_clear_session_boundary_security_state_is_scoped():
runner = object.__new__(GatewayRunner)
runner._pending_approvals = {}
runner._update_prompt_pending = {}
source = _make_source()
session_key = build_session_key(source)
@@ -194,8 +183,6 @@ def test_clear_session_boundary_security_state_is_scoped():
enable_session_yolo(other_key)
runner._pending_approvals[session_key] = {"command": "rm -rf /tmp/demo"}
runner._pending_approvals[other_key] = {"command": "rm -rf /tmp/other"}
runner._update_prompt_pending[session_key] = True
runner._update_prompt_pending[other_key] = True
runner._clear_session_boundary_security_state(session_key)
@@ -203,14 +190,11 @@ def test_clear_session_boundary_security_state_is_scoped():
assert is_approved(session_key, "recursive delete") is False
assert is_session_yolo_enabled(session_key) is False
assert session_key not in runner._pending_approvals
assert session_key not in runner._update_prompt_pending
# Other session untouched
assert is_approved(other_key, "recursive delete") is True
assert is_session_yolo_enabled(other_key) is True
assert other_key in runner._pending_approvals
assert other_key in runner._update_prompt_pending
# Empty session_key is a no-op
runner._clear_session_boundary_security_state("")
assert is_approved(other_key, "recursive delete") is True
assert other_key in runner._update_prompt_pending
@@ -1,16 +1,11 @@
"""Regression tests for the TUI gateway's ``session.list`` handler.
History:
- The original implementation hardcoded an allow-list of known gateway
sources (``tui, cli, telegram, discord, slack, ...``). New or unlisted
sources (``acp``, ``webhook``, user-defined ``HERMES_SESSION_SOURCE``
values, newly-added platforms) were silently dropped from the resume
picker users reported "lots of sessions are missing from browse
but exist in .hermes/sessions."
- The handler now deny-lists only the internal/noisy source ``tool``
(sub-agent runs) and surfaces every other source to the picker.
- The default ``limit`` raised from 20 to 200 so longer-running users
can scroll through their history without hitting an artificial cap.
Reported during TUI v2 blitz retest: the ``/resume`` modal inside a TUI
session only surfaced ``tui``/``cli`` rows, hiding telegram sessions users
could still resume directly via ``hermes --tui --resume <id>``.
The fix widens the picker to a curated allowlist of user-facing sources
(tui/cli + chat adapters) while still filtering internal/system sources.
"""
from __future__ import annotations
@@ -28,64 +23,42 @@ class _StubDB:
return list(self.rows)
def _call(limit: int | None = None):
params: dict = {}
if limit is not None:
params["limit"] = limit
def _call(limit: int = 20):
return server.handle_request({
"id": "1",
"method": "session.list",
"params": params,
"params": {"limit": limit},
})
def test_session_list_surfaces_all_user_facing_sources(monkeypatch):
"""acp / webhook / custom sources should all appear; only ``tool`` is hidden."""
def test_session_list_includes_telegram_but_filters_internal_sources(monkeypatch):
rows = [
{"id": "tui-1", "source": "tui", "started_at": 9},
{"id": "tool-1", "source": "tool", "started_at": 8},
{"id": "tg-1", "source": "telegram", "started_at": 7},
{"id": "acp-1", "source": "acp", "started_at": 6},
{"id": "cli-1", "source": "cli", "started_at": 5},
{"id": "webhook-1", "source": "webhook", "started_at": 4},
{"id": "custom-1", "source": "my-custom-source", "started_at": 3},
]
db = _StubDB(rows)
monkeypatch.setattr(server, "_get_db", lambda: db)
resp = _call(limit=10)
ids = [s["id"] for s in resp["result"]["sessions"]]
sessions = resp["result"]["sessions"]
ids = [s["id"] for s in sessions]
# Every human-facing source — including previously-hidden acp, webhook,
# and custom sources — must surface in the picker now.
assert "tg-1" in ids
assert "tui-1" in ids
assert "cli-1" in ids
assert "acp-1" in ids, "acp sessions were being hidden by the old allow-list"
assert "webhook-1" in ids, "webhook sessions were being hidden by the old allow-list"
assert "custom-1" in ids, "custom HERMES_SESSION_SOURCE values were being hidden"
# Only internal sub-agent runs stay hidden.
assert "tool-1" not in ids
assert "tg-1" in ids and "tui-1" in ids and "cli-1" in ids, ids
assert "tool-1" not in ids and "acp-1" not in ids, ids
def test_session_list_default_limit_is_200(monkeypatch):
"""Default limit should be wide enough for long-running users."""
db = _StubDB([{"id": "x", "source": "cli", "started_at": 1}])
monkeypatch.setattr(server, "_get_db", lambda: db)
_call() # no explicit limit
# fetch_limit = max(limit * 2, 200); limit defaults to 200, so 400.
assert db.calls[0].get("limit") == 400, db.calls[0]
def test_session_list_respects_explicit_limit(monkeypatch):
def test_session_list_fetches_wider_window_before_filtering(monkeypatch):
db = _StubDB([{"id": "x", "source": "cli", "started_at": 1}])
monkeypatch.setattr(server, "_get_db", lambda: db)
_call(limit=10)
# fetch_limit = max(limit * 2, 200) = 200 when limit is small.
assert db.calls[0].get("limit") == 200, db.calls[0]
assert len(db.calls) == 1
assert db.calls[0].get("source") is None, db.calls[0]
assert db.calls[0].get("limit") == 100, db.calls[0]
def test_session_list_preserves_ordering_after_filter(monkeypatch):
@@ -93,7 +66,6 @@ def test_session_list_preserves_ordering_after_filter(monkeypatch):
{"id": "newest", "source": "telegram", "started_at": 5},
{"id": "internal", "source": "tool", "started_at": 4},
{"id": "middle", "source": "tui", "started_at": 3},
{"id": "also-visible", "source": "webhook", "started_at": 2},
{"id": "oldest", "source": "discord", "started_at": 1},
]
monkeypatch.setattr(server, "_get_db", lambda: _StubDB(rows))
@@ -101,4 +73,4 @@ def test_session_list_preserves_ordering_after_filter(monkeypatch):
resp = _call()
ids = [s["id"] for s in resp["result"]["sessions"]]
assert ids == ["newest", "middle", "also-visible", "oldest"]
assert ids == ["newest", "middle", "oldest"]
@@ -81,13 +81,11 @@ async def test_new_command_clears_session_model_override():
"api_mode": "openai",
}
runner._session_reasoning_overrides[session_key] = {"enabled": True, "effort": "high"}
runner._pending_model_notes[session_key] = "[Note: switched to gpt-4o.]"
await runner._handle_reset_command(_make_event("/new"))
assert session_key not in runner._session_model_overrides
assert session_key not in runner._session_reasoning_overrides
assert session_key not in runner._pending_model_notes
@pytest.mark.asyncio
@@ -128,8 +126,6 @@ async def test_new_command_only_clears_own_session():
}
runner._session_reasoning_overrides[session_key] = {"enabled": True, "effort": "high"}
runner._session_reasoning_overrides[other_key] = {"enabled": True, "effort": "low"}
runner._pending_model_notes[session_key] = "[Note: switched to gpt-4o.]"
runner._pending_model_notes[other_key] = "[Note: switched to claude-sonnet-4-6.]"
await runner._handle_reset_command(_make_event("/new"))
@@ -137,5 +133,3 @@ async def test_new_command_only_clears_own_session():
assert other_key in runner._session_model_overrides
assert session_key not in runner._session_reasoning_overrides
assert other_key in runner._session_reasoning_overrides
assert session_key not in runner._pending_model_notes
assert other_key in runner._pending_model_notes
@@ -1,210 +0,0 @@
"""Regression tests for gateway shutdown cleaning up cached agent memory providers (issue #11205).
When the gateway shuts down, ``stop()`` called ``_finalize_shutdown_agents()``
which only drained agents in ``_running_agents``. Idle agents sitting in
``_agent_cache`` (LRU cache) were never cleaned up, so their
``MemoryProvider.on_session_end()`` hooks never fired.
The fix adds an explicit sweep of ``_agent_cache`` after
``_finalize_shutdown_agents`` in the ``_stop_impl`` coroutine.
"""
import asyncio
import threading
from collections import OrderedDict
from unittest.mock import MagicMock, patch
import pytest
# Import the module (not the class) to reach stop() and helpers
import gateway.run as gw_mod
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
class _FakeGateway:
"""Minimal stand-in with just enough state for ``stop()`` to run."""
def __init__(self):
self._running = True
self._draining = False
self._restart_requested = False
self._restart_detached = False
self._restart_via_service = False
self._stop_task = None
self._exit_cleanly = False
self._exit_with_failure = False
self._exit_reason = None
self._exit_code = None
self._restart_drain_timeout = 0.01
self._running_agents = {}
self._running_agents_ts = {}
self._agent_cache = OrderedDict()
self._agent_cache_lock = threading.Lock()
self.adapters = {}
self._background_tasks = set()
self._failed_platforms = []
self._shutdown_event = asyncio.Event()
self._pending_messages = {}
self._pending_approvals = {}
self._busy_ack_ts = {}
def _running_agent_count(self):
return len(self._running_agents)
def _update_runtime_status(self, *_a, **_kw):
pass
async def _notify_active_sessions_of_shutdown(self):
pass
async def _drain_active_agents(self, timeout):
return {}, False
def _finalize_shutdown_agents(self, agents):
for agent in agents.values():
self._cleanup_agent_resources(agent)
def _cleanup_agent_resources(self, agent):
if agent is None:
return
try:
if hasattr(agent, "shutdown_memory_provider"):
agent.shutdown_memory_provider()
except Exception:
pass
try:
if hasattr(agent, "close"):
agent.close()
except Exception:
pass
def _evict_cached_agent(self, key):
pass
def _make_mock_agent():
a = MagicMock()
a.shutdown_memory_provider = MagicMock()
a.close = MagicMock()
return a
# ---------------------------------------------------------------------------
# Tests
# ---------------------------------------------------------------------------
class TestCachedAgentCleanupOnShutdown:
"""Verify that ``stop()`` calls ``_cleanup_agent_resources`` on idle
cached agents, triggering ``shutdown_memory_provider()`` (which calls
``on_session_end``)."""
@pytest.mark.asyncio
async def test_cached_agent_memory_provider_shut_down(self):
"""A cached agent's shutdown_memory_provider is called during gateway stop."""
gw = _FakeGateway()
agent = _make_mock_agent()
gw._agent_cache["session-1"] = (agent, "sig-123")
# Call the real stop() from GatewayRunner
await gw_mod.GatewayRunner.stop(gw)
agent.shutdown_memory_provider.assert_called_once()
@pytest.mark.asyncio
async def test_cache_cleared_after_shutdown(self):
"""The _agent_cache dict is cleared after stop."""
gw = _FakeGateway()
agent = _make_mock_agent()
gw._agent_cache["s1"] = (agent, "sig1")
await gw_mod.GatewayRunner.stop(gw)
assert len(gw._agent_cache) == 0
@pytest.mark.asyncio
async def test_no_cached_agents_no_error(self):
"""stop() works fine when _agent_cache is empty."""
gw = _FakeGateway()
await gw_mod.GatewayRunner.stop(gw) # Should not raise
assert len(gw._agent_cache) == 0
@pytest.mark.asyncio
async def test_multiple_cached_agents_all_cleaned(self):
"""All cached agents get cleaned up."""
gw = _FakeGateway()
agents = []
for i in range(5):
a = _make_mock_agent()
agents.append(a)
gw._agent_cache[f"s{i}"] = (a, f"sig{i}")
await gw_mod.GatewayRunner.stop(gw)
for a in agents:
a.shutdown_memory_provider.assert_called_once()
@pytest.mark.asyncio
async def test_cleanup_survives_agent_exception(self):
"""An exception from one agent's shutdown doesn't prevent others."""
gw = _FakeGateway()
bad = _make_mock_agent()
bad.shutdown_memory_provider.side_effect = RuntimeError("boom")
bad.close.side_effect = RuntimeError("boom")
good = _make_mock_agent()
gw._agent_cache["bad"] = (bad, "sig-bad")
gw._agent_cache["good"] = (good, "sig-good")
await gw_mod.GatewayRunner.stop(gw)
# The good agent should still be cleaned up
good.shutdown_memory_provider.assert_called_once()
@pytest.mark.asyncio
async def test_plain_agent_not_tuple(self):
"""Cache entries that aren't tuples (just bare agents) are also cleaned."""
gw = _FakeGateway()
agent = _make_mock_agent()
gw._agent_cache["s1"] = agent # Not a tuple
await gw_mod.GatewayRunner.stop(gw)
agent.shutdown_memory_provider.assert_called_once()
assert len(gw._agent_cache) == 0
@pytest.mark.asyncio
async def test_none_entry_skipped(self):
"""A None cache entry doesn't cause errors."""
gw = _FakeGateway()
gw._agent_cache["s1"] = None
await gw_mod.GatewayRunner.stop(gw)
assert len(gw._agent_cache) == 0
class TestRunningAgentsNotDoubleCleaned:
"""Verify behavior when agents appear in both _running_agents and _agent_cache."""
@pytest.mark.asyncio
async def test_running_and_cached_agent_cleaned_at_least_once(self):
"""An agent in both _running_agents and _agent_cache gets
shutdown_memory_provider called at least once."""
gw = _FakeGateway()
shared = _make_mock_agent()
gw._running_agents["s1"] = shared
gw._agent_cache["s1"] = (shared, "sig1")
await gw_mod.GatewayRunner.stop(gw)
# Called at least once — either from _finalize_shutdown_agents
# or from the cache sweep (or both)
assert shared.shutdown_memory_provider.call_count >= 1
+4 -669
View File
@@ -11,7 +11,7 @@ We mock the slack modules at import time to avoid collection errors.
import asyncio
import os
import sys
from unittest.mock import AsyncMock, MagicMock, patch, call
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
@@ -21,7 +21,6 @@ from gateway.platforms.base import (
MessageType,
SendResult,
SUPPORTED_DOCUMENT_TYPES,
is_host_excluded_by_no_proxy,
)
@@ -148,20 +147,7 @@ class TestAppMentionHandler:
assert "app_mention" in registered_events
assert "assistant_thread_started" in registered_events
assert "assistant_thread_context_changed" in registered_events
# Slack slash commands are registered via a single regex matcher
# covering every COMMAND_REGISTRY entry (e.g. /hermes, /btw, /stop,
# /model, ...) so users get native-slash parity with Discord and
# Telegram. Verify the regex matches the key expected slashes.
assert len(registered_commands) == 1, (
f"expected 1 combined slash matcher, got {registered_commands!r}"
)
slash_matcher = registered_commands[0]
import re as _re
assert isinstance(slash_matcher, _re.Pattern)
for expected in ("/hermes", "/btw", "/stop", "/model", "/help"):
assert slash_matcher.match(expected), (
f"Slack slash regex does not match {expected}"
)
assert "/hermes" in registered_commands
class TestSlackConnectCleanup:
@@ -189,198 +175,6 @@ class TestSlackConnectCleanup:
assert adapter._platform_lock_identity is None
# ---------------------------------------------------------------------------
# TestSlackProxyBehavior
# ---------------------------------------------------------------------------
class TestSlackProxyBehavior:
def test_no_proxy_helper_matches_slack_hosts(self):
assert is_host_excluded_by_no_proxy("slack.com", "localhost,.slack.com")
assert is_host_excluded_by_no_proxy("files.slack.com", "localhost slack.com")
assert is_host_excluded_by_no_proxy("wss-primary.slack.com", "*")
assert not is_host_excluded_by_no_proxy("slack.com", "localhost,.internal.corp")
def test_resolve_slack_proxy_url_ignores_unsupported_proxy_schemes(self):
with patch.object(_slack_mod, "resolve_proxy_url", return_value="socks5://proxy.example.com:1080"):
assert _slack_mod._resolve_slack_proxy_url() is None
def test_resolve_slack_proxy_url_checks_all_slack_hosts(self):
with patch.object(_slack_mod, "resolve_proxy_url", return_value="http://proxy.example.com:3128"), \
patch.object(_slack_mod, "is_host_excluded_by_no_proxy", side_effect=lambda host: host == "wss-primary.slack.com") as excluded:
assert _slack_mod._resolve_slack_proxy_url() is None
excluded.assert_has_calls([
call("slack.com"),
call("files.slack.com"),
call("wss-primary.slack.com"),
])
@pytest.mark.asyncio
async def test_connect_uses_proxy_when_not_bypassed(self):
created_apps = []
created_clients = []
class FakeWebClient:
def __init__(self, token):
self.token = token
self.proxy = "constructor-default"
suffix = token.split("-")[-1]
self.auth_test = AsyncMock(return_value={
"team_id": f"T_{suffix}",
"user_id": f"U_{suffix}",
"user": f"bot-{suffix}",
"team": f"Team {suffix}",
})
created_clients.append(self)
class FakeApp:
def __init__(self, token):
self.token = token
self.client = FakeWebClient(token)
self.registered_events = []
self.registered_commands = []
self.registered_actions = []
created_apps.append(self)
def event(self, event_type):
self.registered_events.append(event_type)
def decorator(fn):
return fn
return decorator
def command(self, command_name):
self.registered_commands.append(command_name)
def decorator(fn):
return fn
return decorator
def action(self, action_id):
self.registered_actions.append(action_id)
def decorator(fn):
return fn
return decorator
class FakeSocketModeHandler:
def __init__(self, app, app_token, proxy=None):
self.app = app
self.app_token = app_token
self.proxy = proxy
self.client = MagicMock(proxy="constructor-default")
def start_async(self):
return None
async def close_async(self):
return None
config = PlatformConfig(enabled=True, token="xoxb-primary,xoxb-secondary")
adapter = SlackAdapter(config)
with patch.object(_slack_mod, "AsyncApp", side_effect=FakeApp), \
patch.object(_slack_mod, "AsyncWebClient", side_effect=FakeWebClient), \
patch.object(_slack_mod, "AsyncSocketModeHandler", FakeSocketModeHandler), \
patch.object(_slack_mod, "_resolve_slack_proxy_url", return_value="http://proxy.example.com:3128"), \
patch.dict(os.environ, {"SLACK_APP_TOKEN": "xapp-fake"}, clear=False), \
patch("gateway.status.acquire_scoped_lock", return_value=(True, None)), \
patch("asyncio.create_task", return_value=MagicMock(name="socket-mode-task")):
result = await adapter.connect()
assert result is True
assert created_apps[0].client.proxy == "http://proxy.example.com:3128"
assert all(client.proxy == "http://proxy.example.com:3128" for client in created_clients)
assert adapter._handler is not None
assert adapter._handler.proxy == "http://proxy.example.com:3128"
assert adapter._handler.client.proxy == "http://proxy.example.com:3128"
@pytest.mark.asyncio
async def test_connect_clears_proxy_when_no_proxy_matches_slack(self):
created_apps = []
created_clients = []
class FakeWebClient:
def __init__(self, token):
self.token = token
self.proxy = "constructor-default"
suffix = token.split("-")[-1]
self.auth_test = AsyncMock(return_value={
"team_id": f"T_{suffix}",
"user_id": f"U_{suffix}",
"user": f"bot-{suffix}",
"team": f"Team {suffix}",
})
created_clients.append(self)
class FakeApp:
def __init__(self, token):
self.token = token
self.client = FakeWebClient(token)
self.registered_events = []
self.registered_commands = []
self.registered_actions = []
created_apps.append(self)
def event(self, event_type):
self.registered_events.append(event_type)
def decorator(fn):
return fn
return decorator
def command(self, command_name):
self.registered_commands.append(command_name)
def decorator(fn):
return fn
return decorator
def action(self, action_id):
self.registered_actions.append(action_id)
def decorator(fn):
return fn
return decorator
class FakeSocketModeHandler:
def __init__(self, app, app_token, proxy=None):
self.app = app
self.app_token = app_token
self.proxy = proxy
self.client = MagicMock(proxy="constructor-default")
def start_async(self):
return None
async def close_async(self):
return None
config = PlatformConfig(enabled=True, token="xoxb-primary")
adapter = SlackAdapter(config)
with patch.object(_slack_mod, "AsyncApp", side_effect=FakeApp), \
patch.object(_slack_mod, "AsyncWebClient", side_effect=FakeWebClient), \
patch.object(_slack_mod, "AsyncSocketModeHandler", FakeSocketModeHandler), \
patch.object(_slack_mod, "_resolve_slack_proxy_url", return_value=None), \
patch.dict(os.environ, {"SLACK_APP_TOKEN": "xapp-fake"}, clear=False), \
patch("gateway.status.acquire_scoped_lock", return_value=(True, None)), \
patch("asyncio.create_task", return_value=MagicMock(name="socket-mode-task")):
result = await adapter.connect()
assert result is True
assert created_apps[0].client.proxy is None
assert all(client.proxy is None for client in created_clients)
assert adapter._handler is not None
assert adapter._handler.proxy is None
assert adapter._handler.client.proxy is None
# ---------------------------------------------------------------------------
# TestSendDocument
# ---------------------------------------------------------------------------
@@ -480,40 +274,6 @@ class TestSendDocument:
call_kwargs = adapter._app.client.files_upload_v2.call_args[1]
assert call_kwargs["thread_ts"] == "1234567890.123456"
@pytest.mark.asyncio
async def test_send_document_thread_upload_marks_bot_participation(self, adapter, tmp_path):
test_file = tmp_path / "notes.txt"
test_file.write_bytes(b"some notes")
adapter._app.client.files_upload_v2 = AsyncMock(return_value={"ok": True})
await adapter.send_document(
chat_id="C123",
file_path=str(test_file),
metadata={"thread_id": "1234567890.123456"},
)
assert "1234567890.123456" in adapter._bot_message_ts
@pytest.mark.asyncio
async def test_send_document_retries_transient_upload_error(self, adapter, tmp_path):
test_file = tmp_path / "notes.txt"
test_file.write_bytes(b"some notes")
adapter._app.client.files_upload_v2 = AsyncMock(
side_effect=[RuntimeError("Connection reset by peer"), {"ok": True}]
)
with patch("asyncio.sleep", new_callable=AsyncMock) as sleep_mock:
result = await adapter.send_document(
chat_id="C123",
file_path=str(test_file),
)
assert result.success
assert adapter._app.client.files_upload_v2.await_count == 2
sleep_mock.assert_awaited_once()
# ---------------------------------------------------------------------------
# TestSendVideo
@@ -582,17 +342,15 @@ class TestSendVideo:
# ---------------------------------------------------------------------------
class TestIncomingDocumentHandling:
def _make_event(self, files=None, text="hello", channel_type="im", blocks=None, attachments=None):
def _make_event(self, files=None, text="hello", channel_type="im"):
"""Build a mock Slack message event with file attachments."""
return {
"text": text,
"user": "U_USER",
"channel": "D123",
"channel": "C123",
"channel_type": channel_type,
"ts": "1234567890.000001",
"files": files or [],
"blocks": blocks or [],
"attachments": attachments or [],
}
@pytest.mark.asyncio
@@ -657,36 +415,6 @@ class TestIncomingDocumentHandling:
msg_event = adapter.handle_message.call_args[0][0]
assert "# Title" in msg_event.text
@pytest.mark.asyncio
async def test_json_snippet_injects_content(self, adapter):
"""A .json snippet should be treated as a text document and injected."""
content = b'{"hello": "world", "count": 2}'
with patch.object(adapter, "_download_slack_file_bytes", new_callable=AsyncMock) as dl:
dl.return_value = content
event = self._make_event(
text="can you parse this",
files=[{
"mimetype": "text/plain",
"name": "zapfile.json",
"filetype": "json",
"pretty_type": "JSON",
"mode": "snippet",
"editable": True,
"url_private_download": "https://files.slack.com/zapfile.json",
"size": len(content),
}],
)
await adapter._handle_slack_message(event)
msg_event = adapter.handle_message.call_args[0][0]
assert msg_event.message_type == MessageType.DOCUMENT
assert len(msg_event.media_urls) == 1
assert msg_event.media_types == ["application/json"]
assert '[Content of zapfile.json]' in msg_event.text
assert '"hello": "world"' in msg_event.text
assert 'can you parse this' in msg_event.text
@pytest.mark.asyncio
async def test_large_txt_not_injected(self, adapter):
"""A .txt file over 100KB should be cached but NOT injected."""
@@ -770,207 +498,6 @@ class TestIncomingDocumentHandling:
msg_event = adapter.handle_message.call_args[0][0]
assert msg_event.message_type == MessageType.PHOTO
@pytest.mark.asyncio
async def test_download_failure_is_surfaced_in_message_text(self, adapter):
"""Attachment download failures (401/403/HTML-body/etc.) should be
translated into a user-facing `[Slack attachment notice]` block so
the agent can tell the user what to fix (e.g. missing files:read
scope). No proactive files.info probe is made the diagnostic
runs only when the download actually fails.
"""
import httpx
req = httpx.Request("GET", "https://files.slack.com/photo.jpg")
resp = httpx.Response(403, request=req)
with patch.object(adapter, "_download_slack_file", new_callable=AsyncMock) as dl:
dl.side_effect = httpx.HTTPStatusError("403", request=req, response=resp)
event = self._make_event(text="what's in this?", files=[{
"id": "F123",
"mimetype": "image/jpeg",
"name": "photo.jpg",
"url_private_download": "https://files.slack.com/photo.jpg",
"size": 1024,
}])
await adapter._handle_slack_message(event)
msg_event = adapter.handle_message.call_args[0][0]
assert msg_event.message_type == MessageType.TEXT
assert "[Slack attachment notice]" in msg_event.text
assert "403" in msg_event.text
assert "what's in this?" in msg_event.text
@pytest.mark.asyncio
async def test_rich_text_blocks_do_not_duplicate_plain_text(self, adapter):
"""Plain rich_text composer blocks match the plain text field exactly,
so the dedupe guard keeps the message clean."""
event = self._make_event(
text="hello world",
blocks=[
{
"type": "rich_text",
"elements": [
{
"type": "rich_text_section",
"elements": [
{"type": "text", "text": "hello world"},
],
}
],
}
],
)
await adapter._handle_slack_message(event)
msg_event = adapter.handle_message.call_args[0][0]
assert msg_event.text == "hello world"
@pytest.mark.asyncio
async def test_rich_text_quotes_and_lists_are_extracted(self, adapter):
"""Nested quote and list content should be surfaced from rich_text blocks."""
event = self._make_event(
text="Can you summarize this?",
blocks=[
{
"type": "rich_text",
"elements": [
{
"type": "rich_text_quote",
"elements": [
{
"type": "rich_text_section",
"elements": [{"type": "text", "text": "Quoted line"}],
}
],
},
{
"type": "rich_text_list",
"style": "bullet",
"elements": [
{
"type": "rich_text_section",
"elements": [{"type": "text", "text": "First bullet"}],
},
{
"type": "rich_text_section",
"elements": [{"type": "text", "text": "Second bullet"}],
},
],
},
],
}
],
)
await adapter._handle_slack_message(event)
msg_event = adapter.handle_message.call_args[0][0]
assert "Can you summarize this?" in msg_event.text
assert "> Quoted line" in msg_event.text
assert "• First bullet" in msg_event.text
assert "• Second bullet" in msg_event.text
@pytest.mark.asyncio
async def test_attachments_unfurl_text_is_appended_even_when_url_is_in_message(self, adapter):
"""Shared URLs should still expose unfurl preview text to the agent."""
event = self._make_event(
text="Look at this doc https://example.com/spec",
attachments=[
{
"title": "Spec",
"from_url": "https://example.com/spec",
"text": "The latest product spec preview",
"footer": "Notion",
}
],
)
await adapter._handle_slack_message(event)
msg_event = adapter.handle_message.call_args[0][0]
assert "Look at this doc https://example.com/spec" in msg_event.text
assert "📎 [Spec](https://example.com/spec)" in msg_event.text
assert "The latest product spec preview" in msg_event.text
assert "_Notion_" in msg_event.text
@pytest.mark.asyncio
async def test_message_unfurl_attachments_are_skipped(self, adapter):
"""Message unfurls should be skipped to avoid echoing Slack message copies."""
event = self._make_event(
text="https://example.com/thread",
attachments=[
{
"is_msg_unfurl": True,
"title": "Thread copy",
"text": "This should not be appended",
}
],
)
await adapter._handle_slack_message(event)
msg_event = adapter.handle_message.call_args[0][0]
assert msg_event.text == "https://example.com/thread"
@pytest.mark.asyncio
async def test_channel_routing_ignores_bot_mentions_inside_block_text(self, adapter):
"""Block-extracted text with a bot mention must not satisfy mention
gating in channels routing decisions use the original user text so
quoted/forwarded content can't trick the bot into responding."""
event = self._make_event(
text="please review",
channel_type="channel",
blocks=[
{
"type": "rich_text",
"elements": [
{
"type": "rich_text_quote",
"elements": [
{
"type": "rich_text_section",
"elements": [{"type": "text", "text": "Contains <@U_BOT> in quoted text"}],
}
],
}
],
}
],
)
await adapter._handle_slack_message(event)
adapter.handle_message.assert_not_called()
@pytest.mark.asyncio
async def test_quoted_slash_command_text_does_not_change_message_type(self, adapter):
"""Quoted slash-like content should not convert a normal message into a command."""
event = self._make_event(
text="",
blocks=[
{
"type": "rich_text",
"elements": [
{
"type": "rich_text_quote",
"elements": [
{
"type": "rich_text_section",
"elements": [{"type": "text", "text": "/deploy now"}],
}
],
}
],
}
],
)
await adapter._handle_slack_message(event)
msg_event = adapter.handle_message.call_args[0][0]
assert msg_event.message_type == MessageType.TEXT
assert "> /deploy now" in msg_event.text
# ---------------------------------------------------------------------------
# TestMessageRouting
@@ -2017,83 +1544,6 @@ class TestSlashCommands:
msg = adapter.handle_message.call_args[0][0]
assert msg.text == "/reasoning"
# ------------------------------------------------------------------
# Native slash commands — /btw, /stop, /model, ... dispatched directly
# instead of as /hermes subcommands. This is the Discord/Telegram parity
# fix: the slash name itself becomes the command.
# ------------------------------------------------------------------
@pytest.mark.asyncio
async def test_native_btw_slash(self, adapter):
"""/btw with args must dispatch to /background, not /hermes btw."""
command = {
"command": "/btw",
"text": "fix the failing test",
"user_id": "U1",
"channel_id": "C1",
}
await adapter._handle_slash_command(command)
msg = adapter.handle_message.call_args[0][0]
# The gateway command dispatcher resolves /btw -> background via
# resolve_command() — our handler's job is just to deliver
# "/btw <args>" to the gateway runner, which is what this asserts.
assert msg.text == "/btw fix the failing test"
@pytest.mark.asyncio
async def test_native_stop_slash_no_args(self, adapter):
command = {
"command": "/stop",
"text": "",
"user_id": "U1",
"channel_id": "C1",
}
await adapter._handle_slash_command(command)
msg = adapter.handle_message.call_args[0][0]
assert msg.text == "/stop"
@pytest.mark.asyncio
async def test_native_model_slash_with_args(self, adapter):
command = {
"command": "/model",
"text": "anthropic/claude-sonnet-4",
"user_id": "U1",
"channel_id": "C1",
}
await adapter._handle_slash_command(command)
msg = adapter.handle_message.call_args[0][0]
assert msg.text == "/model anthropic/claude-sonnet-4"
@pytest.mark.asyncio
async def test_legacy_hermes_prefix_still_works(self, adapter):
"""Backward compat: /hermes btw foo must still route to /btw foo.
Old workspace manifests only declared /hermes as the single slash.
After users refresh their manifest they get /btw natively, but the
legacy form must keep working during the transition.
"""
command = {
"command": "/hermes",
"text": "btw run the tests",
"user_id": "U1",
"channel_id": "C1",
}
await adapter._handle_slash_command(command)
msg = adapter.handle_message.call_args[0][0]
assert msg.text == "/btw run the tests"
@pytest.mark.asyncio
async def test_legacy_hermes_freeform_question(self, adapter):
"""/hermes <free-form text> must stay as the raw text (non-command)."""
command = {
"command": "/hermes",
"text": "what's the weather today?",
"user_id": "U1",
"channel_id": "C1",
}
await adapter._handle_slash_command(command)
msg = adapter.handle_message.call_args[0][0]
assert msg.text == "what's the weather today?"
# ---------------------------------------------------------------------------
# TestMessageSplitting
@@ -2347,48 +1797,6 @@ class TestSendImageSSRFGuards:
assert "see this" in call_kwargs["text"]
assert "https://public.example/image.png" in call_kwargs["text"]
@pytest.mark.asyncio
async def test_send_image_fallback_preserves_thread_metadata(self, adapter):
redirect_response = MagicMock()
redirect_response.is_redirect = True
redirect_response.next_request = MagicMock(
url="http://169.254.169.254/latest/meta-data"
)
client_kwargs = {}
mock_client = AsyncMock()
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
mock_client.__aexit__ = AsyncMock(return_value=False)
async def fake_get(_url):
for hook in client_kwargs["event_hooks"]["response"]:
await hook(redirect_response)
mock_client.get = AsyncMock(side_effect=fake_get)
adapter._app.client.files_upload_v2 = AsyncMock(return_value={"ok": True})
adapter._app.client.chat_postMessage = AsyncMock(return_value={"ts": "reply_ts"})
def fake_async_client(*args, **kwargs):
client_kwargs.update(kwargs)
return mock_client
def fake_is_safe_url(url):
return url == "https://public.example/image.png"
with (
patch("tools.url_safety.is_safe_url", side_effect=fake_is_safe_url),
patch("httpx.AsyncClient", side_effect=fake_async_client),
):
await adapter.send_image(
chat_id="C123",
image_url="https://public.example/image.png",
caption="see this",
metadata={"thread_id": "parent_ts_789"},
)
call_kwargs = adapter._app.client.chat_postMessage.call_args.kwargs
assert call_kwargs.get("thread_ts") == "parent_ts_789"
# ---------------------------------------------------------------------------
# TestProgressMessageThread
@@ -2513,76 +1921,3 @@ class TestProgressMessageThread:
"so each @mention starts its own thread"
)
assert msg_event.message_id == "2000000000.000001"
class TestSlackReplyToText:
"""Ensure MessageEvent.reply_to_text is populated on thread replies so
gateway.run can inject a ``[Replying to: "..."]`` prefix (parity with
Telegram/Discord/Feishu/WeCom)."""
@pytest.mark.asyncio
async def test_slack_reply_to_text_set_on_thread_reply(self, adapter):
"""When a thread reply arrives and the parent was posted by a bot
(e.g. cron summary), reply_to_text must carry the parent's text."""
adapter._channel_team = {} # primary workspace only
adapter._team_bot_user_ids = {}
# Mock conversations_replies to return a bot-posted parent
adapter._app.client.conversations_replies = AsyncMock(return_value={
"messages": [
{
"ts": "1000.0",
"bot_id": "B_CRON",
"text": "メール要約: 新着メール3件あります",
},
{"ts": "1000.5", "user": "U_USER", "text": "詳細を教えて"},
]
})
# Use a DM so mention-gating doesn't short-circuit the handler.
event = {
"text": "詳細を教えて",
"user": "U_USER",
"channel": "D123",
"channel_type": "im",
"ts": "1000.5",
"thread_ts": "1000.0", # thread reply
}
with patch.object(
adapter, "_resolve_user_name", new=AsyncMock(return_value="Alice")
):
await adapter._handle_slack_message(event)
assert adapter.handle_message.call_args is not None, (
"handle_message must be invoked for thread-reply DM"
)
msg_event = adapter.handle_message.call_args[0][0]
assert msg_event.reply_to_message_id == "1000.0"
# The critical assertion: parent text is exposed as reply_to_text so the
# gateway can inject it when not already in the session history.
assert msg_event.reply_to_text is not None
assert "メール要約" in msg_event.reply_to_text
@pytest.mark.asyncio
async def test_slack_reply_to_text_none_for_top_level_message(self, adapter):
"""Top-level messages (no thread_ts) must not set reply_to_text."""
event = {
"text": "hello",
"user": "U_USER",
"channel": "D123",
"channel_type": "im",
"ts": "1000.0",
# no thread_ts — top-level DM
}
with patch.object(
adapter, "_resolve_user_name", new=AsyncMock(return_value="Alice")
):
await adapter._handle_slack_message(event)
assert adapter.handle_message.call_args is not None
msg_event = adapter.handle_message.call_args[0][0]
assert msg_event.reply_to_text is None
# Top-level message: reply_to_message_id must be falsy (None or empty).
assert not msg_event.reply_to_message_id
+3 -184
View File
@@ -276,44 +276,23 @@ class TestSlackThreadContext:
@pytest.mark.asyncio
async def test_skips_bot_messages(self):
"""Self-bot child replies are skipped to avoid circular context,
but non-self bots (e.g. cron posts, third-party integrations) are kept.
Regression guard for the fix in _fetch_thread_context: previously ALL
bot messages were dropped, which lost context when the bot was replying
to a cron-posted thread parent."""
adapter = _make_adapter()
mock_client = adapter._team_clients["T1"]
mock_client.conversations_replies = AsyncMock(return_value={
"messages": [
{"ts": "1000.0", "user": "U1", "text": "Parent"},
# Self-bot reply -> must be skipped (circular)
{
"ts": "1000.1",
"bot_id": "B_SELF",
"user": "U_BOT",
"text": "Previous bot self-reply (should be skipped)",
},
# Third-party bot child -> kept (useful context)
{
"ts": "1000.15",
"bot_id": "B_OTHER",
"user": "U_OTHER_BOT",
"text": "Deploy succeeded",
},
{"ts": "1000.1", "bot_id": "B1", "text": "Bot reply (should be skipped)"},
{"ts": "1000.2", "user": "U1", "text": "Current"},
]
})
adapter._user_name_cache = {"U1": "Alice", "U_OTHER_BOT": "DeployBot"}
adapter._user_name_cache = {"U1": "Alice"}
context = await adapter._fetch_thread_context(
channel_id="C1", thread_ts="1000.0", current_ts="1000.2", team_id="T1"
)
assert "Previous bot self-reply" not in context
assert "Bot reply" not in context
assert "Alice: Parent" in context
# Third-party bot message must now be included
assert "Deploy succeeded" in context
@pytest.mark.asyncio
async def test_empty_thread(self):
@@ -337,166 +316,6 @@ class TestSlackThreadContext:
)
assert context == ""
@pytest.mark.asyncio
async def test_fetch_thread_context_includes_bot_parent(self):
"""The thread parent posted by a bot (e.g. a cron summary) must be
included in the context, prefixed with ``[thread parent]``."""
adapter = _make_adapter()
mock_client = adapter._team_clients["T1"]
mock_client.conversations_replies = AsyncMock(return_value={
"messages": [
# Bot-posted parent (cron job)
{
"ts": "1000.0",
"bot_id": "B123",
"subtype": "bot_message",
"username": "cron",
"text": "メール要約: 本日の新着3件",
},
# User reply that triggered the fetch
{"ts": "1000.1", "user": "U1", "text": "詳細を教えて"},
]
})
adapter._user_name_cache = {"U1": "Alice"}
context = await adapter._fetch_thread_context(
channel_id="C1",
thread_ts="1000.0",
current_ts="1000.1", # exclude the trigger message itself
team_id="T1",
)
assert "[thread parent]" in context
assert "メール要約: 本日の新着3件" in context
@pytest.mark.asyncio
async def test_fetch_thread_context_excludes_self_bot_replies(self):
"""Parent (non-self bot) is kept, self-bot child replies are dropped,
user replies are kept."""
adapter = _make_adapter()
mock_client = adapter._team_clients["T1"]
mock_client.conversations_replies = AsyncMock(return_value={
"messages": [
{"ts": "1000.0", "bot_id": "B_CRON", "text": "Cron summary"},
# Self-bot child reply -> excluded
{
"ts": "1000.1",
"bot_id": "B_SELF",
"user": "U_BOT", # matches adapter._bot_user_id
"text": "Previous self reply",
},
# User reply -> kept
{"ts": "1000.2", "user": "U1", "text": "Follow-up question"},
# Current trigger (excluded by current_ts match)
{"ts": "1000.3", "user": "U1", "text": "Current"},
]
})
adapter._user_name_cache = {"U1": "Alice"}
context = await adapter._fetch_thread_context(
channel_id="C1", thread_ts="1000.0", current_ts="1000.3", team_id="T1"
)
assert "Cron summary" in context
assert "[thread parent]" in context
assert "Previous self reply" not in context
assert "Follow-up question" in context
assert "Current" not in context
@pytest.mark.asyncio
async def test_fetch_thread_context_multi_workspace(self):
"""Self-bot filtering must use the per-workspace bot user id so a
self-bot id that belongs to a different workspace does not accidentally
filter out a legitimate message in the current workspace."""
adapter = _make_adapter()
# Add a second workspace with a different bot user id
adapter._team_clients["T2"] = AsyncMock()
adapter._team_bot_user_ids = {"T1": "U_BOT_T1", "T2": "U_BOT_T2"}
adapter._bot_user_id = "U_BOT_T1"
adapter._channel_team["C2"] = "T2"
mock_client = adapter._team_clients["T2"]
mock_client.conversations_replies = AsyncMock(return_value={
"messages": [
{"ts": "2000.0", "user": "U2", "text": "Parent T2"},
# This has the *T1* bot's user id — from T2's perspective this
# is a third-party bot, so it must be kept.
{
"ts": "2000.1",
"bot_id": "B_FOREIGN",
"user": "U_BOT_T1",
"team": "T2",
"text": "Cross-workspace bot reply",
},
# Self-bot for T2 — must be skipped
{
"ts": "2000.2",
"bot_id": "B_SELF_T2",
"user": "U_BOT_T2",
"team": "T2",
"text": "Own T2 bot reply",
},
{"ts": "2000.3", "user": "U2", "text": "Current"},
]
})
adapter._user_name_cache = {"U2": "Bob"}
context = await adapter._fetch_thread_context(
channel_id="C2", thread_ts="2000.0", current_ts="2000.3", team_id="T2"
)
assert "Parent T2" in context
assert "Cross-workspace bot reply" in context
assert "Own T2 bot reply" not in context
@pytest.mark.asyncio
async def test_fetch_thread_context_current_ts_excluded(self):
"""Regression guard: the message whose ts == current_ts must never
appear in the context output (it will be delivered as the user
message itself)."""
adapter = _make_adapter()
mock_client = adapter._team_clients["T1"]
mock_client.conversations_replies = AsyncMock(return_value={
"messages": [
{"ts": "1000.0", "user": "U1", "text": "Parent"},
{"ts": "1000.1", "user": "U1", "text": "DO NOT INCLUDE THIS"},
]
})
adapter._user_name_cache = {"U1": "Alice"}
context = await adapter._fetch_thread_context(
channel_id="C1", thread_ts="1000.0", current_ts="1000.1", team_id="T1"
)
assert "Parent" in context
assert "DO NOT INCLUDE THIS" not in context
@pytest.mark.asyncio
async def test_fetch_thread_parent_text_from_cache(self):
"""_fetch_thread_parent_text should reuse the thread-context cache
when it is warm, avoiding an extra conversations.replies call."""
adapter = _make_adapter()
mock_client = adapter._team_clients["T1"]
mock_client.conversations_replies = AsyncMock(return_value={
"messages": [
{"ts": "1000.0", "bot_id": "B123", "text": "Parent summary"},
{"ts": "1000.1", "user": "U1", "text": "reply"},
]
})
# Warm the cache via _fetch_thread_context
await adapter._fetch_thread_context(
channel_id="C1", thread_ts="1000.0", current_ts="1000.1", team_id="T1"
)
assert mock_client.conversations_replies.await_count == 1
parent = await adapter._fetch_thread_parent_text(
channel_id="C1", thread_ts="1000.0", team_id="T1"
)
assert parent == "Parent summary"
# No additional API call
assert mock_client.conversations_replies.await_count == 1
# ===========================================================================
# _has_active_session_for_thread — session key fix (#5833)
-133
View File
@@ -1,133 +0,0 @@
"""Tests for Slack channel_skill_bindings auto-skill resolution."""
from unittest.mock import MagicMock
def _make_adapter(extra=None):
"""Create a minimal SlackAdapter stub with the given ``config.extra``."""
from gateway.platforms.slack import SlackAdapter
adapter = object.__new__(SlackAdapter)
adapter.config = MagicMock()
adapter.config.extra = extra or {}
return adapter
def _resolve(adapter, channel_id, parent_id=None):
from gateway.platforms.base import resolve_channel_skills
return resolve_channel_skills(adapter.config.extra, channel_id, parent_id)
class TestSlackResolveChannelSkills:
def test_no_bindings_returns_none(self):
adapter = _make_adapter()
assert _resolve(adapter, "D0ABC") is None
def test_match_by_dm_channel_id(self):
"""The primary use case: binding a skill to a Slack DM channel."""
adapter = _make_adapter({
"channel_skill_bindings": [
{"id": "D0ATH9TQ0G6", "skills": ["german-flashcards"]},
]
})
assert _resolve(adapter, "D0ATH9TQ0G6") == ["german-flashcards"]
def test_match_by_parent_id_for_thread(self):
"""Slack threads inherit the parent channel's binding."""
adapter = _make_adapter({
"channel_skill_bindings": [
{"id": "C0PARENT", "skills": ["parent-skill"]},
]
})
assert _resolve(adapter, "thread-ts-123", parent_id="C0PARENT") == ["parent-skill"]
def test_no_match_returns_none(self):
adapter = _make_adapter({
"channel_skill_bindings": [
{"id": "D0AAA", "skills": ["skill-a"]},
]
})
assert _resolve(adapter, "D0BBB") is None
def test_single_skill_string(self):
adapter = _make_adapter({
"channel_skill_bindings": [
{"id": "D0ATH9TQ0G6", "skill": "german-flashcards"},
]
})
assert _resolve(adapter, "D0ATH9TQ0G6") == ["german-flashcards"]
def test_dedup_preserves_order(self):
adapter = _make_adapter({
"channel_skill_bindings": [
{"id": "D0ATH9TQ0G6", "skills": ["a", "b", "a", "c", "b"]},
]
})
assert _resolve(adapter, "D0ATH9TQ0G6") == ["a", "b", "c"]
def test_multiple_bindings_pick_correct(self):
adapter = _make_adapter({
"channel_skill_bindings": [
{"id": "D0AAA", "skills": ["skill-a"]},
{"id": "D0BBB", "skills": ["skill-b"]},
{"id": "D0CCC", "skills": ["skill-c"]},
]
})
assert _resolve(adapter, "D0BBB") == ["skill-b"]
def test_malformed_entry_skipped(self):
"""Non-dict entries should be ignored, not raise."""
adapter = _make_adapter({
"channel_skill_bindings": [
"not-a-dict",
{"id": "D0ABC", "skills": ["good"]},
]
})
assert _resolve(adapter, "D0ABC") == ["good"]
def test_empty_skills_list_returns_none(self):
adapter = _make_adapter({
"channel_skill_bindings": [
{"id": "D0ABC", "skills": []},
]
})
assert _resolve(adapter, "D0ABC") is None
def test_empty_skill_string_returns_none(self):
adapter = _make_adapter({
"channel_skill_bindings": [
{"id": "D0ABC", "skill": ""},
]
})
assert _resolve(adapter, "D0ABC") is None
class TestSlackMessageEventAutoSkill:
"""Integration-style test: verify auto_skill propagates to MessageEvent."""
def test_message_event_carries_auto_skill(self):
"""Simulate the handler wiring: resolve + attach to MessageEvent."""
from gateway.platforms.base import MessageEvent, MessageType, Platform, SessionSource, resolve_channel_skills
config_extra = {
"channel_skill_bindings": [
{"id": "D0ATH9TQ0G6", "skills": ["german-flashcards"]},
]
}
auto_skill = resolve_channel_skills(config_extra, "D0ATH9TQ0G6", None)
source = SessionSource(
platform=Platform.SLACK,
chat_id="D0ATH9TQ0G6",
chat_name="Mats",
chat_type="dm",
user_id="U0ABC",
user_name="Mats",
)
event = MessageEvent(
text="work",
message_type=MessageType.TEXT,
source=source,
raw_message={},
message_id="123.456",
auto_skill=auto_skill,
)
assert event.auto_skill == ["german-flashcards"]
+1 -151
View File
@@ -55,12 +55,10 @@ CHANNEL_ID = "C0AQWDLHY9M"
OTHER_CHANNEL_ID = "C9999999999"
def _make_adapter(require_mention=None, strict_mention=None, free_response_channels=None):
def _make_adapter(require_mention=None, free_response_channels=None):
extra = {}
if require_mention is not None:
extra["require_mention"] = require_mention
if strict_mention is not None:
extra["strict_mention"] = strict_mention
if free_response_channels is not None:
extra["free_response_channels"] = free_response_channels
@@ -136,48 +134,6 @@ def test_require_mention_env_var_default_true(monkeypatch):
assert adapter._slack_require_mention() is True
# ---------------------------------------------------------------------------
# Tests: _slack_strict_mention
# ---------------------------------------------------------------------------
def test_strict_mention_defaults_to_false(monkeypatch):
monkeypatch.delenv("SLACK_STRICT_MENTION", raising=False)
adapter = _make_adapter()
assert adapter._slack_strict_mention() is False
def test_strict_mention_true():
adapter = _make_adapter(strict_mention=True)
assert adapter._slack_strict_mention() is True
def test_strict_mention_false():
adapter = _make_adapter(strict_mention=False)
assert adapter._slack_strict_mention() is False
def test_strict_mention_string_true():
adapter = _make_adapter(strict_mention="true")
assert adapter._slack_strict_mention() is True
def test_strict_mention_string_off():
adapter = _make_adapter(strict_mention="off")
assert adapter._slack_strict_mention() is False
def test_strict_mention_malformed_stays_false():
"""Unrecognised values keep strict mode OFF (fail-open to legacy behavior)."""
adapter = _make_adapter(strict_mention="maybe")
assert adapter._slack_strict_mention() is False
def test_strict_mention_env_var_fallback(monkeypatch):
monkeypatch.setenv("SLACK_STRICT_MENTION", "true")
adapter = _make_adapter() # no config value -> falls back to env
assert adapter._slack_strict_mention() is True
# ---------------------------------------------------------------------------
# Tests: _slack_free_response_channels
# ---------------------------------------------------------------------------
@@ -354,109 +310,3 @@ def test_config_bridges_slack_free_response_channels(monkeypatch, tmp_path):
import os as _os
assert _os.environ["SLACK_REQUIRE_MENTION"] == "false"
assert _os.environ["SLACK_FREE_RESPONSE_CHANNELS"] == "C0AQWDLHY9M,C9999999999"
def test_config_bridges_slack_reply_in_thread(monkeypatch, tmp_path):
from gateway.config import load_gateway_config
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
(hermes_home / "config.yaml").write_text(
"slack:\n"
" reply_in_thread: false\n",
encoding="utf-8",
)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
monkeypatch.setenv("SLACK_BOT_TOKEN", "xoxb-test")
config = load_gateway_config()
assert config is not None
slack_config = config.platforms[Platform.SLACK]
assert slack_config.extra.get("reply_in_thread") is False
adapter = SlackAdapter(slack_config)
assert adapter._resolve_thread_ts(reply_to="171.000", metadata={}) is None
# Top-level channel messages arrive with metadata.thread_id == reply_to
# because the inbound handler uses event.ts as a session-keying fallback.
# Those must be treated as non-threaded so reply_in_thread=false takes
# effect in channels, not just DMs.
assert adapter._resolve_thread_ts(
reply_to="171.000",
metadata={"thread_id": "171.000"},
) is None
# Real thread replies (reply_to differs from thread parent) must still
# resolve to the parent thread so conversation context is preserved.
assert adapter._resolve_thread_ts(
reply_to="171.500",
metadata={"thread_id": "171.000"},
) == "171.000"
def test_config_bridges_slack_strict_mention(monkeypatch, tmp_path):
from gateway.config import load_gateway_config
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
(hermes_home / "config.yaml").write_text(
"slack:\n"
" strict_mention: true\n",
encoding="utf-8",
)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
monkeypatch.delenv("SLACK_STRICT_MENTION", raising=False)
config = load_gateway_config()
assert config is not None
import os as _os
assert _os.environ["SLACK_STRICT_MENTION"] == "true"
# ---------------------------------------------------------------------------
# Regression: strict mode must NOT persist mentions into _mentioned_threads
# ---------------------------------------------------------------------------
# Prevents agent-to-agent ack loops — if a strict-mode bot remembered every
# thread it was mentioned in, the next message from the other agent in that
# thread would re-trigger the bot and defeat the entire feature.
def test_mention_in_strict_mode_does_not_register_thread():
adapter = _make_adapter(strict_mention=True)
adapter._bot_user_id = "U_BOT"
adapter._mentioned_threads = set()
adapter._MENTIONED_THREADS_MAX = 5000
thread_ts = "1700000000.100200"
event_thread_ts = thread_ts # incoming message is inside an existing thread
# Mirror the handler's @mention + strict-mode guard that protects
# _mentioned_threads.add(). If strict is on, we must skip the add.
text = "<@U_BOT> hello"
is_mentioned = f"<@{adapter._bot_user_id}>" in text
assert is_mentioned
if event_thread_ts and not adapter._slack_strict_mention():
adapter._mentioned_threads.add(event_thread_ts)
assert thread_ts not in adapter._mentioned_threads
def test_mention_outside_strict_mode_still_registers_thread():
adapter = _make_adapter(strict_mention=False)
adapter._bot_user_id = "U_BOT"
adapter._mentioned_threads = set()
adapter._MENTIONED_THREADS_MAX = 5000
thread_ts = "1700000000.100200"
event_thread_ts = thread_ts
text = "<@U_BOT> hello"
is_mentioned = f"<@{adapter._bot_user_id}>" in text
assert is_mentioned
if event_thread_ts and not adapter._slack_strict_mention():
adapter._mentioned_threads.add(event_thread_ts)
assert thread_ts in adapter._mentioned_threads
+7 -94
View File
@@ -12,9 +12,9 @@ from gateway.platforms.base import MessageEvent
from gateway.session import SessionEntry, SessionSource, build_session_key
def _make_source(platform: Platform = Platform.TELEGRAM) -> SessionSource:
def _make_source() -> SessionSource:
return SessionSource(
platform=platform,
platform=Platform.TELEGRAM,
user_id="u1",
chat_id="c1",
user_name="tester",
@@ -22,24 +22,24 @@ def _make_source(platform: Platform = Platform.TELEGRAM) -> SessionSource:
)
def _make_event(text: str, *, platform: Platform = Platform.TELEGRAM) -> MessageEvent:
def _make_event(text: str) -> MessageEvent:
return MessageEvent(
text=text,
source=_make_source(platform),
source=_make_source(),
message_id="m1",
)
def _make_runner(session_entry: SessionEntry, *, platform: Platform = Platform.TELEGRAM):
def _make_runner(session_entry: SessionEntry):
from gateway.run import GatewayRunner
runner = object.__new__(GatewayRunner)
runner.config = GatewayConfig(
platforms={platform: PlatformConfig(enabled=True, token="***")}
platforms={Platform.TELEGRAM: PlatformConfig(enabled=True, token="***")}
)
adapter = MagicMock()
adapter.send = AsyncMock()
runner.adapters = {platform: adapter}
runner.adapters = {Platform.TELEGRAM: adapter}
runner._voice_mode = {}
runner.hooks = SimpleNamespace(emit=AsyncMock(), loaded_hooks=False)
runner.session_store = MagicMock()
@@ -224,93 +224,6 @@ async def test_handle_message_persists_agent_token_counts(monkeypatch):
)
@pytest.mark.asyncio
async def test_first_run_slack_home_channel_onboarding_uses_parent_command(monkeypatch):
import gateway.run as gateway_run
session_entry = SessionEntry(
session_key=build_session_key(_make_source(Platform.SLACK)),
session_id="sess-1",
created_at=datetime.now(),
updated_at=datetime.now(),
platform=Platform.SLACK,
chat_type="dm",
)
runner = _make_runner(session_entry, platform=Platform.SLACK)
runner.session_store.load_transcript.return_value = []
runner.session_store.has_any_sessions.return_value = False
runner._run_agent = AsyncMock(
return_value={
"final_response": "ok",
"messages": [],
"tools": [],
"history_offset": 0,
"last_prompt_tokens": 0,
"input_tokens": 0,
"output_tokens": 0,
"model": "openai/test-model",
}
)
monkeypatch.delenv("SLACK_HOME_CHANNEL", raising=False)
monkeypatch.setattr(gateway_run, "_resolve_runtime_agent_kwargs", lambda: {"api_key": "***"})
monkeypatch.setattr(
"agent.model_metadata.get_model_context_length",
lambda *_args, **_kwargs: 100000,
)
result = await runner._handle_message(_make_event("hello", platform=Platform.SLACK))
assert result == "ok"
runner.adapters[Platform.SLACK].send.assert_awaited_once()
onboarding = runner.adapters[Platform.SLACK].send.await_args.args[1]
assert "/hermes sethome" in onboarding
assert "Type /sethome" not in onboarding
@pytest.mark.asyncio
async def test_first_run_non_slack_home_channel_onboarding_keeps_direct_command(monkeypatch):
import gateway.run as gateway_run
session_entry = SessionEntry(
session_key=build_session_key(_make_source(Platform.TELEGRAM)),
session_id="sess-1",
created_at=datetime.now(),
updated_at=datetime.now(),
platform=Platform.TELEGRAM,
chat_type="dm",
)
runner = _make_runner(session_entry, platform=Platform.TELEGRAM)
runner.session_store.load_transcript.return_value = []
runner.session_store.has_any_sessions.return_value = False
runner._run_agent = AsyncMock(
return_value={
"final_response": "ok",
"messages": [],
"tools": [],
"history_offset": 0,
"last_prompt_tokens": 0,
"input_tokens": 0,
"output_tokens": 0,
"model": "openai/test-model",
}
)
monkeypatch.delenv("TELEGRAM_HOME_CHANNEL", raising=False)
monkeypatch.setattr(gateway_run, "_resolve_runtime_agent_kwargs", lambda: {"api_key": "***"})
monkeypatch.setattr(
"agent.model_metadata.get_model_context_length",
lambda *_args, **_kwargs: 100000,
)
result = await runner._handle_message(_make_event("hello", platform=Platform.TELEGRAM))
assert result == "ok"
runner.adapters[Platform.TELEGRAM].send.assert_awaited_once()
onboarding = runner.adapters[Platform.TELEGRAM].send.await_args.args[1]
assert "Type /sethome" in onboarding
@pytest.mark.asyncio
async def test_handle_message_discards_stale_result_after_session_invalidation(monkeypatch):
import gateway.run as gateway_run

Some files were not shown because too many files have changed in this diff Show More