feat(kanban): default_assignee fallback + per-profile concurrency cap (#27145, #21582) (#34244)

Two related dispatcher behaviors that have been missing for a while.

## kanban.default_assignee (#27145)

Reporter (@agarzon): dashboard creates a task without an assignee, task parks in 'ready' forever even though the operator's intent ('default') is perfectly clear. The dispatcher already had a 'skipped_unassigned' bucket but no fallback routing, so users had to manually type 'default' in the assignee field every time.

Behavior: when 'kanban.default_assignee' is set in config.yaml, the dispatcher applies that assignee to any unassigned ready task before deciding whether to spawn. The row is mutated (assignee column + an 'assigned' event with source='kanban.default_assignee' for the audit trail). Empty/whitespace config value = no fallback, preserving the existing skipped_unassigned behavior.

Dry-run mode reports what WOULD happen via the new 'auto_assigned_default' bucket on DispatchResult, but does NOT mutate the DB, so operators using 'hermes kanban dispatch --dry-run' see the routing decision before committing.

## kanban.max_in_progress_per_profile (#21582)

Reporter (@edwardchenchen, @simlu, 4 reactions): fan-out workloads saturate one profile's local model / API quota / browser pool while other profiles sit idle. The existing global 'max_in_progress' caps total workers but doesn't balance across profiles.

Behavior: when 'kanban.max_in_progress_per_profile' is set to a positive int, the dispatcher tracks per-assignee running counts (one query at tick start) and refuses to spawn for any assignee already at the cap. Tasks blocked this way go to a new 'skipped_per_profile_capped' bucket on DispatchResult as (task_id, assignee, current_running_count) tuples. This is NOT an operator-actionable failure, just 'try again next tick when the profile has capacity'. Pre-existing 'running' tasks count against the cap (verified via regression test).

The cap respects dry_run mode by incrementing its in-memory counter on each would-be spawn so dry_run reports the same balanced subset that a real tick would. Invalid cap values (0, negative, non-int, None) are treated as 'no cap', preserving the existing behavior. Backward-compatible for installs that don't set the config.

## Surfaces

- 'hermes kanban dispatch' CLI now prints 'Auto-assigned to kanban.default_assignee=X: ...' and 'Deferred (X at per-profile cap, N running): ...' lines, plus matching JSON keys in --json output.
- Gateway dispatcher logs the configured values at startup ('default_assignee=X', 'max_in_progress_per_profile=N').
- 'kanban.max_in_progress_per_profile' added to DEFAULT_CONFIG with inline docs.

## Validation

- tests/hermes_cli/test_kanban_default_assignee.py (6 cases): no-cap baseline, auto-assign + DB mutation, dry-run reports without mutating, whitespace treated as None, explicit assignees untouched, DispatchResult field schema.
- tests/hermes_cli/test_kanban_per_profile_cap.py (9 cases including 4 parametrized): no-cap baseline, balanced 2-profile fan-out, pre-existing running counts against cap, invalid cap values (0/-1/'abc'/None), capped tasks dispatched on next tick after running task completes, DispatchResult field schema.
- Broader kanban suite: 464/464 pass (was 449 baseline; +15 new regression tests across both features).

## Credit

- #27145: Jimmy Johansson reported the dispatcher skipped-unassigned gap; @agarzon scoped the simpler 'honor kanban.default_assignee' fix that matches the existing config knob.
- #21582: @edwardchenchen filed the per-profile cap ask after hitting model 429s on fan-out research projects; @simlu confirmed the same pain on local-model setups.
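
The per-profile cap behavior described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `plan_tick` is a hypothetical helper, not part of the Hermes codebase, and it ignores the global `max_in_progress` / `max_spawn` limits, the respawn guard, and the profile-existence check that the real `dispatch_once` also applies.

```python
from collections import Counter


def plan_tick(ready, running, cap):
    """Hypothetical model of one dispatcher tick under a per-profile cap.

    ready:   list of (task_id, assignee) pairs in dispatch order
    running: dict mapping assignee -> number of tasks already running
    cap:     positive int, or None for "no per-profile cap"
    """
    counts = Counter(running)
    spawned, deferred = [], []
    for task_id, assignee in ready:
        current = counts[assignee]
        if cap is not None and current >= cap:
            # Same tuple shape the commit message gives for
            # skipped_per_profile_capped: (task_id, assignee, running count).
            deferred.append((task_id, assignee, current))
            continue
        spawned.append((task_id, assignee))
        # Count the (would-be) spawn so later tasks in this tick see it.
        counts[assignee] += 1
    return spawned, deferred


ready = [("t1", "research"), ("t2", "research"), ("t3", "research"), ("t4", "writer")]
spawned, deferred = plan_tick(ready, running={"research": 1}, cap=2)
print(spawned)   # [('t1', 'research'), ('t4', 'writer')]
print(deferred)  # [('t2', 'research', 2), ('t3', 'research', 2)]
```

With one `research` task already running and a cap of 2, only one more `research` task is started, while `writer` is unaffected. Because the counter is incremented on every would-be spawn, the same model describes both a real tick and a dry run.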
docs(docker): refresh user-guide page for s6-overlay reality
2026-05-28 19:02:55 -07:00 · 2026-05-29 11:55:01 +10:00
14 changed files with 679 additions and 265 deletions
@@ -5420,6 +5420,49 @@ class GatewayRunner:
            )
            stale_timeout_seconds = 0

+        # Read kanban.default_assignee — fallback profile for tasks
+        # created without an explicit assignee (e.g. via the dashboard).
+        # When set, the dispatcher applies it to unassigned ready tasks
+        # instead of skipping them indefinitely (#27145). Empty string
+        # (the schema default) means "no fallback, keep skipping" —
+        # backward-compatible with existing installs.
+        default_assignee = (kanban_cfg.get("default_assignee") or "").strip() or None
+        if default_assignee:
+            logger.info(
+                "kanban dispatcher: default_assignee=%r (unassigned ready tasks "
+                "will route to this profile)",
+                default_assignee,
+            )
+
+        # Read kanban.max_in_progress_per_profile — per-profile concurrency
+        # cap (#21582). When set, no single profile gets more than N
+        # workers running at once, even if the global max_in_progress
+        # would allow it. Prevents one profile's local model / API quota
+        # / browser pool from being overwhelmed by a fan-out.
+        raw_per_profile = kanban_cfg.get("max_in_progress_per_profile", None)
+        max_in_progress_per_profile = None
+        if raw_per_profile is not None:
+            try:
+                max_in_progress_per_profile = int(raw_per_profile)
+            except (TypeError, ValueError):
+                logger.warning(
+                    "kanban dispatcher: invalid kanban.max_in_progress_per_profile=%r; ignoring",
+                    raw_per_profile,
+                )
+                max_in_progress_per_profile = None
+            else:
+                if max_in_progress_per_profile < 1:
+                    logger.warning(
+                        "kanban dispatcher: kanban.max_in_progress_per_profile=%r is below 1; ignoring",
+                        raw_per_profile,
+                    )
+                    max_in_progress_per_profile = None
+                else:
+                    logger.info(
+                        "kanban dispatcher: max_in_progress_per_profile=%d",
+                        max_in_progress_per_profile,
+                    )
+
        # Initial delay so the gateway finishes wiring adapters before the
        # dispatcher spawns workers (those workers may hit gateway notify
        # subscriptions etc.). Matches the notifier watcher's delay.
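
The config normalization in the hunk above can be read as two small pure functions. The sketch below is an illustration, not code from the repository: the names `normalize_default_assignee` and `parse_per_profile_cap` are hypothetical, logging is omitted, and the assignee value is assumed to be a string or `None`, as in the diff.

```python
from typing import Any, Optional


def normalize_default_assignee(raw: Optional[str]) -> Optional[str]:
    """Empty or whitespace-only values mean "no fallback" (None)."""
    return (raw or "").strip() or None


def parse_per_profile_cap(raw: Any) -> Optional[int]:
    """Return a positive int cap, or None when the value is unset or invalid."""
    if raw is None:
        return None
    try:
        value = int(raw)
    except (TypeError, ValueError):
        # Non-numeric values such as 'abc' are ignored (no cap).
        return None
    # Values below 1 are also treated as "no cap".
    return value if value >= 1 else None


print(normalize_default_assignee("  default "))  # default
print(parse_per_profile_cap("2"))                # 2
print(parse_per_profile_cap(0))                  # None
```

Keeping both rules as side-effect-free functions would make the "invalid values mean no cap" contract easy to test in isolation from the gateway startup path.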
@@ -5511,6 +5554,8 @@ class GatewayRunner:
                    max_in_progress=max_in_progress,
                    failure_limit=failure_limit,
                    stale_timeout_seconds=stale_timeout_seconds,
+                    default_assignee=default_assignee,
+                    max_in_progress_per_profile=max_in_progress_per_profile,
                )
            except sqlite3.DatabaseError as exc:
                if _is_corrupt_board_db_error(exc):
@@ -1726,6 +1726,15 @@ DEFAULT_CONFIG = {
        # assignee to any installed profile. When unset, falls back to the
        # default profile. A task never ends up with assignee=None.
        "default_assignee": "",
+        # Per-profile concurrency cap (#21582). When set to a positive int,
+        # no single profile can have more than N workers running at once,
+        # even if the global max_in_progress / max_spawn caps would allow
+        # it. Tasks blocked this way defer to the next dispatcher tick.
+        # Unset (None) means "no per-profile cap" — backward-compatible
+        # with existing installs. Useful for fan-out workflows that would
+        # otherwise saturate one profile's local model / API quota /
+        # browser pool while leaving other profiles idle.
+        "max_in_progress_per_profile": None,
        # When true, the kanban dispatcher auto-runs the decomposer on
        # tasks that land in Triage (every dispatcher tick). When false,
        # decomposition is manual via `hermes kanban decompose <id>` or
@@ -26,15 +26,10 @@ from hermes_cli.dashboard_auth import list_providers
 from hermes_cli.dashboard_auth.audit import AuditEvent, audit_log
 from hermes_cli.dashboard_auth.base import ProviderError
 from hermes_cli.dashboard_auth.cookies import read_session_cookies
-from hermes_cli.dashboard_auth.public_paths import PUBLIC_API_PATHS

 _log = logging.getLogger(__name__)

-# Prefixes that bypass the auth gate. Match via ``path == prefix`` or
-# ``path.startswith(prefix)`` — so ``/assets/`` (with trailing slash)
-# matches ``/assets/foo.css`` but not ``/assetsleak``. Auth-bootstrap
-# (login page, OAuth round trip, provider listing) and static asset
-# mounts go here.
+# Paths that bypass the auth gate. Order matters: prefix match.
 _GATE_PUBLIC_PREFIXES: tuple[str, ...] = (
    "/auth/login",
    "/auth/callback",
@@ -50,20 +45,6 @@ _GATE_PUBLIC_PREFIXES: tuple[str, ...] = (


 def _path_is_public(path: str) -> bool:
-    """True if ``path`` bypasses the OAuth auth gate.
-
-    Two sources of public-ness:
-
-    * :data:`PUBLIC_API_PATHS` — the shared ``/api/*`` allowlist that
-      the legacy ``_SESSION_TOKEN`` middleware also honours. Matched
-      exactly (no prefix expansion) so adding ``/api/status`` doesn't
-      accidentally expose ``/api/status/secret-extension``.
-    * :data:`_GATE_PUBLIC_PREFIXES` — auth-bootstrap routes and static
-      mounts. Prefix-matched so ``/assets/foo.css`` lights up via
-      ``/assets/``.
-    """
-    if path in PUBLIC_API_PATHS:
-        return True
    return any(
        path == prefix or path.startswith(prefix)
        for prefix in _GATE_PUBLIC_PREFIXES
@@ -1,49 +0,0 @@
-"""Shared allowlist of ``/api/*`` paths that bypass dashboard auth.
-
-Two middlewares enforce dashboard auth and previously kept independent
-copies of this list:
-
-* ``hermes_cli.web_server.auth_middleware`` — loopback / ``--insecure``
-  mode, gates on the ephemeral ``_SESSION_TOKEN``.
-* ``hermes_cli.dashboard_auth.middleware.gated_auth_middleware`` —
-  non-loopback mode, gates on the OAuth session cookie.
-
-When the lists drifted, ``/api/status`` ended up public under the legacy
-gate but 401'd under the OAuth gate. That broke the portal's wildcard
-liveness probe (``nous-account-service`` ``fly-provider.ts``
-``getInstanceRuntimeStatus``), which fetches ``/api/status`` without a
-cookie as its sole signal of "agent dashboard is alive": every healthy
-wildcard-subdomain agent surfaced as STARTING/down in the portal UI even
-though the dashboard was serving correctly.
-
-Centralising the allowlist here so both middlewares import the same
-frozenset prevents the next drift. Keep this list minimal — only truly
-non-sensitive, read-only endpoints belong here. As a sanity check, every
-entry should be safe to expose to:
-
-  * external uptime probes (Pingdom, Better Stack, NAS),
-  * the dashboard SPA before the user has logged in,
-  * anyone who happens to ``curl`` the hostname.
-
-If a new endpoint doesn't pass all three tests, it should be gated and
-the SPA should bootstrap it after login instead.
-"""
-from __future__ import annotations
-
-PUBLIC_API_PATHS: frozenset[str] = frozenset({
-    # Liveness probe target. Returns version, gateway state, active
-    # session count, and the dashboard auth-gate shape. No bodies, no
-    # session content, no secrets. Documented as the portal's wildcard
-    # liveness probe in
-    # ``docs/agent-dashboard-public-url-contract.md`` (NAS side).
-    "/api/status",
-    # Read-only config-defaults / schema feeds for the SPA's Config page.
-    "/api/config/defaults",
-    "/api/config/schema",
-    # Read-only model metadata (context windows, etc.) — same shape as
-    # provider catalogs already exposed on the public internet.
-    "/api/model/info",
-    # Read-only theme + plugin manifests for the dashboard skin engine.
-    "/api/dashboard/themes",
-    "/api/dashboard/plugins",
-})
@@ -2087,12 +2087,35 @@ def _cmd_tail(args: argparse.Namespace) -> int:


 def _cmd_dispatch(args: argparse.Namespace) -> int:
+    # Honour kanban.default_assignee as the fallback for unassigned ready
+    # tasks (#27145) and kanban.max_in_progress_per_profile as the
+    # per-profile concurrency cap (#21582). Same semantics as the
+    # gateway dispatch path.
+    try:
+        from hermes_cli.config import load_config
+        _cfg = load_config()
+        _kanban_cfg = _cfg.get("kanban", {}) if isinstance(_cfg, dict) else {}
+        default_assignee = (_kanban_cfg.get("default_assignee") or "").strip() or None
+        _raw_per_profile = _kanban_cfg.get("max_in_progress_per_profile", None)
+        try:
+            max_in_progress_per_profile = (
+                int(_raw_per_profile) if _raw_per_profile is not None else None
+            )
+            if max_in_progress_per_profile is not None and max_in_progress_per_profile < 1:
+                max_in_progress_per_profile = None
+        except (TypeError, ValueError):
+            max_in_progress_per_profile = None
+    except Exception:
+        default_assignee = None
+        max_in_progress_per_profile = None
    with kb.connect_closing() as conn:
        res = kb.dispatch_once(
            conn,
            dry_run=args.dry_run,
            max_spawn=args.max,
            failure_limit=getattr(args, "failure_limit", kb.DEFAULT_SPAWN_FAILURE_LIMIT),
+            default_assignee=default_assignee,
+            max_in_progress_per_profile=max_in_progress_per_profile,
        )
    if getattr(args, "json", False):
        print(json.dumps({
@@ -2108,6 +2131,11 @@ def _cmd_dispatch(args: argparse.Namespace) -> int:
            ],
            "skipped_unassigned": res.skipped_unassigned,
            "skipped_nonspawnable": res.skipped_nonspawnable,
+            "skipped_per_profile_capped": [
+                {"task_id": tid, "assignee": who, "current": current}
+                for (tid, who, current) in res.skipped_per_profile_capped
+            ],
+            "auto_assigned_default": res.auto_assigned_default,
        }, indent=2))
        return 0
    print(f"Reclaimed:    {res.reclaimed}")
@@ -2128,8 +2156,18 @@ def _cmd_dispatch(args: argparse.Namespace) -> int:
    for tid, who, ws in res.spawned:
        tag = " (dry)" if args.dry_run else ""
        print(f"  - {tid}  ->  {who}  @ {ws or '-'}{tag}")
+    if res.auto_assigned_default:
+        print(
+            f"Auto-assigned to kanban.default_assignee={default_assignee!r}: "
+            f"{', '.join(res.auto_assigned_default)}"
+        )
    if res.skipped_unassigned:
        print(f"Skipped (unassigned): {', '.join(res.skipped_unassigned)}")
+    if res.skipped_per_profile_capped:
+        for tid, who, current in res.skipped_per_profile_capped:
+            print(
+                f"Deferred ({who} at per-profile cap, {current} running): {tid}"
+            )
    if res.skipped_nonspawnable:
        print(
            f"Skipped (non-spawnable assignee — terminal lane, OK): "
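
A script consuming the `--json` output could group the new keys as sketched below. The key names (`skipped_per_profile_capped`, `auto_assigned_default`, `task_id`, `assignee`, `current`) come from the diff above; the sample payload is hand-written for illustration and is not captured CLI output, and `summarize` is a hypothetical helper.

```python
import json

# Hand-written sample shaped like the JSON keys added in this change.
sample = json.dumps({
    "skipped_unassigned": [],
    "skipped_per_profile_capped": [
        {"task_id": "t2", "assignee": "research", "current": 2},
        {"task_id": "t3", "assignee": "research", "current": 2},
    ],
    "auto_assigned_default": ["t9"],
})


def summarize(payload_text):
    """Group deferred tasks by busy profile and list auto-assigned task ids."""
    payload = json.loads(payload_text)
    busy = {}
    for entry in payload.get("skipped_per_profile_capped", []):
        busy.setdefault(entry["assignee"], []).append(entry["task_id"])
    return {
        "busy_profiles": busy,
        "auto_assigned": payload.get("auto_assigned_default", []),
    }


summary = summarize(sample)
print(summary)
# {'busy_profiles': {'research': ['t2', 't3']}, 'auto_assigned': ['t9']}
```

Using `.get(..., [])` keeps the consumer tolerant of output from versions that predate these keys.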
@@ -4289,6 +4289,12 @@ class DispatchResult:
    skipped_unassigned: list[str] = field(default_factory=list)
    """Ready task ids skipped because they have no assignee at all.
    Operator-actionable — usually a misfiled task waiting for routing."""
+    auto_assigned_default: list[str] = field(default_factory=list)
+    """Task ids that were unassigned in the DB and had
+    ``kanban.default_assignee`` applied this tick before spawning (#27145).
+    Surfaces the auto-assignment to telemetry / CLI / dashboard so the
+    operator can see when the dispatcher is acting on the fallback rule
+    rather than on explicit per-task assignments."""
    skipped_nonspawnable: list[str] = field(default_factory=list)
    """Ready task ids skipped because their assignee names a control-plane
    lane (a Claude Code terminal like ``orion-cc``) rather than a Hermes
@@ -4296,6 +4302,14 @@ class DispatchResult:
    operator-actionable failure. Tracked separately so health telemetry
    can distinguish "real stuck" (nothing spawned but spawnable work
    available) from "correctly idle" (nothing spawnable in the queue)."""
+    skipped_per_profile_capped: list[tuple[str, str, int]] = field(default_factory=list)
+    """Tasks deferred this tick because their assignee is already at
+    ``kanban.max_in_progress_per_profile`` (#21582). Each entry is
+    ``(task_id, assignee, current_running_count)``. NOT an
+    operator-actionable failure — the task will be picked up on a
+    subsequent tick when the assignee has capacity. Separate bucket so
+    telemetry / dashboards can show "this profile is busy" vs
+    "task is genuinely stuck"."""
    crashed: list[str] = field(default_factory=list)
    """Task ids reclaimed because their worker PID disappeared."""
    auto_blocked: list[str] = field(default_factory=list)
@@ -5342,6 +5356,8 @@ def dispatch_once(
    failure_limit: int = DEFAULT_SPAWN_FAILURE_LIMIT,
    stale_timeout_seconds: int = 0,
    board: Optional[str] = None,
+    default_assignee: Optional[str] = None,
+    max_in_progress_per_profile: Optional[int] = None,
 ) -> DispatchResult:
    """Run one dispatcher tick.

@@ -5427,12 +5443,89 @@ def dispatch_once(
        if max_spawn is None or max_spawn > remaining:
            max_spawn = remaining
    spawned = 0
+    # Per-profile concurrency cap (#21582): when set, track how many
+    # workers each assignee already has in flight, and refuse to spawn
+    # when this would push that assignee past the cap. Prevents
+    # fan-out workloads from melting a single profile's local model /
+    # API quota / browser pool while leaving other profiles idle.
+    # Tasks blocked this way go to skipped_per_profile_capped (not
+    # skipped_unassigned — the operator-actionable signal is different:
+    # "this profile is busy, try again later" not "this needs routing").
+    _per_profile_cap = max_in_progress_per_profile if (
+        isinstance(max_in_progress_per_profile, int)
+        and max_in_progress_per_profile > 0
+    ) else None
+    _per_profile_running: dict[str, int] = {}
+    if _per_profile_cap is not None:
+        for prow in conn.execute(
+            "SELECT assignee, COUNT(*) AS n FROM tasks "
+            "WHERE status = 'running' AND assignee IS NOT NULL "
+            "GROUP BY assignee"
+        ):
+            _per_profile_running[prow["assignee"]] = int(prow["n"])
+    # Normalize default_assignee once: empty/whitespace string → None so the
+    # rest of the loop can use ``if default_assignee:`` as a single check.
+    # We also resolve profile_exists once here for the same reason.
+    _default_assignee = (default_assignee or "").strip() or None
+    _default_assignee_resolved = False
+    if _default_assignee:
+        try:
+            from hermes_cli.profiles import profile_exists as _pe
+            _default_assignee_resolved = bool(_pe(_default_assignee))
+        except Exception:
+            # Profiles module not importable (test stubs, exotic envs).
+            # Trust the operator's config and try the assignment; the
+            # downstream profile_exists check on the assigned row will
+            # bucket it as nonspawnable if the profile genuinely isn't
+            # there, with the existing diagnostic.
+            _default_assignee_resolved = True
    for row in ready_rows:
        if max_spawn is not None and running_count + spawned >= max_spawn:
            break
-        if not row["assignee"]:
-            result.skipped_unassigned.append(row["id"])
-            continue
+        row_assignee = row["assignee"]
+        if not row_assignee:
+            # Honour kanban.default_assignee: when the dispatcher hits an
+            # unassigned ready task and an operator-configured fallback
+            # exists, persist the assignment and proceed. This removes the
+            # dashboard footgun where a task created without an assignee
+            # parks in 'ready' forever even though the operator's intent
+            # ("default") was perfectly clear (#27145). Mutating the row
+            # (not just the in-memory view) keeps diagnostics and the
+            # board state consistent: the task is now legitimately owned
+            # by ``kanban.default_assignee``, not "unassigned but secretly
+            # routed".
+            if _default_assignee and _default_assignee_resolved:
+                # Dry-run: show what WOULD happen (auto-assign + spawn) without
+                # mutating the DB. Real run: mutate the row + emit the
+                # 'assigned' event so the board state matches what just happened.
+                if not dry_run:
+                    try:
+                        with write_txn(conn):
+                            conn.execute(
+                                "UPDATE tasks SET assignee = ? WHERE id = ? "
+                                "AND (assignee IS NULL OR assignee = '')",
+                                (_default_assignee, row["id"]),
+                            )
+                            _append_event(
+                                conn, row["id"], "assigned",
+                                {
+                                    "assignee": _default_assignee,
+                                    "source": "kanban.default_assignee",
+                                },
+                            )
+                    except Exception:
+                        _log.debug(
+                            "kanban dispatch: failed to apply default_assignee=%r "
+                            "to task %s",
+                            _default_assignee, row["id"], exc_info=True,
+                        )
+                        result.skipped_unassigned.append(row["id"])
+                        continue
+                row_assignee = _default_assignee
+                result.auto_assigned_default.append(row["id"])
+            else:
+                result.skipped_unassigned.append(row["id"])
+                continue
        # Skip ready tasks whose assignee is not a real Hermes profile.
        # `_default_spawn` invokes ``hermes -p <assignee>`` which fails
        # with "Profile 'X' does not exist" when the assignee names a
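
The guarded `UPDATE` in the hunk above only touches rows whose assignee is still NULL or empty, so an explicit assignment is never overwritten. The following sketch demonstrates that guard against an in-memory SQLite database. The three-column `tasks` schema is a simplified assumption, `apply_default` is a hypothetical helper, and the `rowcount` check is an addition for demonstration; the diff itself does not inspect `rowcount`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
# Simplified, assumed schema; the real tasks table has more columns.
conn.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT, assignee TEXT)")
conn.executemany(
    "INSERT INTO tasks (id, status, assignee) VALUES (?, ?, ?)",
    [("t1", "ready", None), ("t2", "ready", ""), ("t3", "ready", "writer")],
)


def apply_default(conn, task_id, default_assignee):
    """Apply the fallback only if the row is still unassigned."""
    cur = conn.execute(
        "UPDATE tasks SET assignee = ? WHERE id = ? "
        "AND (assignee IS NULL OR assignee = '')",
        (default_assignee, task_id),
    )
    # rowcount is 1 when the guard matched, 0 when the row was already owned.
    return cur.rowcount == 1


changed = {tid: apply_default(conn, tid, "default") for tid in ("t1", "t2", "t3")}
assignees = {
    row["id"]: row["assignee"]
    for row in conn.execute("SELECT id, assignee FROM tasks ORDER BY id")
}
print(changed)    # {'t1': True, 't2': True, 't3': False}
print(assignees)  # {'t1': 'default', 't2': 'default', 't3': 'writer'}
```

The guard also makes the statement safe against a concurrent writer that assigns the task between the dispatcher's read and its write: the late update simply matches zero rows.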
@@ -5447,7 +5540,7 @@ def dispatch_once(
            from hermes_cli.profiles import profile_exists  # local import: avoids cycle
        except Exception:
            profile_exists = None  # type: ignore[assignment]
-        if profile_exists is not None and not profile_exists(row["assignee"]):
+        if profile_exists is not None and not profile_exists(row_assignee):
            # Bucket separately from skipped_unassigned: the operator
            # cannot fix this by assigning a profile (the assignee IS the
            # intended owner — a terminal lane). Health telemetry uses
@@ -5456,6 +5549,19 @@ def dispatch_once(
            # of human-pulled work.
            result.skipped_nonspawnable.append(row["id"])
            continue
+        # Per-profile concurrency cap (#21582): even if there's global
+        # headroom, refuse to spawn for an assignee that's already at
+        # its in-flight cap. Prevents one profile's local model / API
+        # quota / browser pool from being overwhelmed by a fan-out
+        # while the global max_in_progress / max_spawn caps still allow
+        # work on OTHER profiles.
+        if _per_profile_cap is not None:
+            current = _per_profile_running.get(row_assignee, 0)
+            if current >= _per_profile_cap:
+                result.skipped_per_profile_capped.append(
+                    (row["id"], row_assignee, current)
+                )
+                continue
        # Respawn guard: refuse to re-spawn when useful work is already
        # in-flight/recent, or when the last failure is a deterministic
        # blocker (quota / auth). The guard defers the spawn this tick so
@@ -5478,7 +5584,15 @@ def dispatch_once(
                    )
            continue
        if dry_run:
-            result.spawned.append((row["id"], row["assignee"], ""))
+            result.spawned.append((row["id"], row_assignee, ""))
+            # Increment per-profile counter even in dry_run so the cap
+            # check sees the would-be spawn on subsequent iterations.
+            # Without this, dry_run reports every task as spawnable and
+            # under-reports the capped subset (#21582).
+            if _per_profile_cap is not None and row_assignee:
+                _per_profile_running[row_assignee] = (
+                    _per_profile_running.get(row_assignee, 0) + 1
+                )
            continue
        claimed = claim_task(conn, row["id"], ttl_seconds=ttl_seconds)
        if claimed is None:
@@ -5521,6 +5635,13 @@ def dispatch_once(
            # complete_task).
            result.spawned.append((claimed.id, claimed.assignee or "", str(workspace)))
            spawned += 1
+            # Track the new in-flight count for this profile so later
+            # iterations in this same tick respect the per-profile cap
+            # (#21582). Subsequent ticks re-query from the DB.
+            if _per_profile_cap is not None and claimed.assignee:
+                _per_profile_running[claimed.assignee] = (
+                    _per_profile_running.get(claimed.assignee, 0) + 1
+                )
        except Exception as exc:
            auto = _record_spawn_failure(
                conn, claimed.id, str(exc),
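
The in-memory counter above is only valid within one tick; each new tick rebuilds it from the database. The sketch below runs the same `GROUP BY` query from the diff against an in-memory SQLite database to show how a completed task frees capacity on the next query. The simplified schema, the `'done'` status value, and the `running_by_assignee` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
# Simplified, assumed schema; the real tasks table has more columns.
conn.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT, assignee TEXT)")
conn.executemany(
    "INSERT INTO tasks (id, status, assignee) VALUES (?, ?, ?)",
    [
        ("t1", "running", "research"),
        ("t2", "running", "research"),
        ("t3", "running", "writer"),
        ("t4", "ready", "research"),   # not running: must not be counted
        ("t5", "running", None),       # no assignee: excluded by the query
    ],
)


def running_by_assignee(conn):
    """Per-assignee running counts, using the query from the diff."""
    return {
        row["assignee"]: int(row["n"])
        for row in conn.execute(
            "SELECT assignee, COUNT(*) AS n FROM tasks "
            "WHERE status = 'running' AND assignee IS NOT NULL "
            "GROUP BY assignee"
        )
    }


counts = running_by_assignee(conn)
print(counts)  # {'research': 2, 'writer': 1}

# Simulate a worker finishing ('done' is an assumed status name).
conn.execute("UPDATE tasks SET status = 'done' WHERE id = 't1'")
counts_after = running_by_assignee(conn)
print(counts_after)  # {'research': 1, 'writer': 1}
```

With a cap of 2, `research` would be at capacity after the first query and have room for `t4` after the second, which matches the "capped tasks dispatched on next tick" case listed under Validation.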
@@ -110,20 +110,17 @@ app.add_middleware(

 # ---------------------------------------------------------------------------
 # Endpoints that do NOT require the session token.  Everything else under
-# /api/ is gated by the auth middleware below.
-#
-# This list is defined in ``hermes_cli.dashboard_auth.public_paths`` so the
-# OAuth gate middleware can honour the same allowlist — keeping the two
-# gates in lockstep avoids drift like the wildcard-subdomain regression
-# where ``/api/status`` was public under the legacy gate but 401'd under
-# the OAuth gate (breaking the portal's liveness probe).
-#
-# Keep the upstream list minimal — only truly non-sensitive, read-only
-# endpoints belong there.
+# /api/ is gated by the auth middleware below.  Keep this list minimal —
+# only truly non-sensitive, read-only endpoints belong here.
 # ---------------------------------------------------------------------------
-from hermes_cli.dashboard_auth.public_paths import (
-    PUBLIC_API_PATHS as _PUBLIC_API_PATHS,
-)
+_PUBLIC_API_PATHS: frozenset = frozenset({
+    "/api/status",
+    "/api/config/defaults",
+    "/api/config/schema",
+    "/api/model/info",
+    "/api/dashboard/themes",
+    "/api/dashboard/plugins",
+})


 def _has_valid_session_token(request: Request) -> bool:
@@ -324,14 +324,10 @@ def test_dashboard_oauth_gate_engages_on_non_loopback_bind(
    1. ``/api/auth/providers`` (publicly reachable through the gate so
       the login page can bootstrap) returns 200 with ``nous`` in the
       provider list — proves the bundled provider registered.
-    2. ``/api/sessions`` (a gated route under both the legacy
-       ``_SESSION_TOKEN`` middleware and the OAuth gate) returns 401
-       to an unauthenticated caller — proves the OAuth gate is actively
-       intercepting browser traffic. We deliberately probe a gated route
-       here rather than ``/api/status``: status sits in the shared
-       ``PUBLIC_API_PATHS`` allowlist (portal liveness probe target) and
-       responds 200 without a cookie under both gates, so it cannot
-       distinguish "gate on" from "gate off".
+    2. ``/api/status`` (a public endpoint under the legacy
+       ``_SESSION_TOKEN`` middleware) returns 401 — proves the OAuth gate
+       runs upstream of the legacy public list and is actively
+       intercepting unauthenticated callers.
    """
    subprocess.run(
        ["docker", "run", "-d", "--name", container_name,
@@ -355,32 +351,14 @@ def test_dashboard_oauth_gate_engages_on_non_loopback_bind(
        f"HERMES_DASHBOARD_OAUTH_CLIENT_ID is set. Got: {payload!r}"
    )

-    # (2) A gated route (``/api/sessions``) returns 401 to an
-    #     unauthenticated caller — the OAuth gate is intercepting.
-    status_code, body = _http_probe(container_name, "/api/sessions")
-    assert status_code == 401, (
-        "OAuth gate must intercept gated /api/* routes on 0.0.0.0 bind "
-        "when a provider is registered and HERMES_DASHBOARD_INSECURE "
-        f"is unset. Got: status={status_code} body={body!r}"
-    )
-
-    # (3) ``/api/status`` remains 200 under the gate — it's in the shared
-    #     ``PUBLIC_API_PATHS`` allowlist so NAS's wildcard-subdomain
-    #     liveness probe (``fly-provider.ts`` ``getInstanceRuntimeStatus``)
-    #     can reach it without a cookie. Regression guard: this allowlist
-    #     drifted once already and surfaced every healthy agent as
-    #     STARTING/down in the portal UI.
+    # (2) /api/status is gated by the OAuth middleware → unauthenticated
+    # callers get 401, not the legacy public 200 JSON.
    status_code, body = _http_probe(container_name, "/api/status")
-    assert status_code == 200, (
-        "/api/status must remain publicly reachable under the OAuth gate "
-        "— the portal uses it as the wildcard-subdomain liveness probe. "
+    assert status_code == 401, (
+        "OAuth gate must intercept /api/status on 0.0.0.0 bind when a "
+        "provider is registered and HERMES_DASHBOARD_INSECURE is unset. "
        f"Got: status={status_code} body={body!r}"
    )
-    status = json.loads(body)
-    assert status.get("auth_required") is True, (
-        "/api/status must report auth_required=True when the OAuth gate "
-        f"is engaged so the SPA/portal can distinguish modes. Got: {status!r}"
-    )


 def test_dashboard_insecure_env_var_opts_out_of_gate(
@@ -131,13 +131,8 @@ class TestRefreshTokenCookieDeprecation:


 class TestApi401Envelope:
-    # NOTE: probe a gated route (``/api/sessions``) here rather than
-    # ``/api/status`` — status is in the shared ``PUBLIC_API_PATHS``
-    # allowlist (portal liveness probe) so it would 200 even without a
-    # cookie and never exercise the 401-envelope code path.
-
    def test_no_cookie_returns_unauthenticated_envelope(self, gated_app):
-        r = gated_app.get("/api/sessions")
+        r = gated_app.get("/api/status")
        assert r.status_code == 401
        body = r.json()
        assert body["error"] == "unauthenticated"
@@ -146,7 +141,7 @@ class TestApi401Envelope:

    def test_invalid_cookie_returns_session_expired_envelope(self, gated_app):
        gated_app.cookies.set(SESSION_AT_COOKIE, "garbage")
-        r = gated_app.get("/api/sessions")
+        r = gated_app.get("/api/status")
        assert r.status_code == 401
        body = r.json()
        assert body["error"] == "session_expired"
@@ -156,7 +151,7 @@ class TestApi401Envelope:
        """Dead-cookie cleanup — Phase 6 requirement so the browser
        doesn't keep replaying the stale token on every request."""
        gated_app.cookies.set(SESSION_AT_COOKIE, "garbage")
-        r = gated_app.get("/api/sessions")
+        r = gated_app.get("/api/status")
        set_cookies = r.headers.get_list("set-cookie")
        assert any(
            c.startswith(f"{SESSION_AT_COOKIE}=") and "Max-Age=0" in c
@@ -56,61 +56,10 @@ def gated_app():
 # ---------------------------------------------------------------------------


-def test_gated_status_is_public(gated_app):
-    """``/api/status`` MUST be public under the OAuth gate.
-
-    Regression guard for the wildcard-subdomain rollout: NAS
-    (``fly-provider.ts`` ``getInstanceRuntimeStatus``) hits
-    ``/api/status`` without a cookie as its sole liveness probe. A 401
-    here surfaces every healthy agent as STARTING/down in the portal
-    UI. The endpoint returns only version + gateway/auth-gate metadata
-    (no user data, no session content), so it stays in the shared
-    ``PUBLIC_API_PATHS`` allowlist under both the legacy ``_SESSION_TOKEN``
-    gate and the OAuth gate.
-
-    The body also reports the gate's shape (``auth_required``,
-    ``auth_providers``) so the SPA's StatusPage and external monitors
-    can distinguish loopback / gated / no-providers without a separate
-    round trip.
-    """
+def test_gated_status_now_requires_auth(gated_app):
+    """When gate is on, /api/status is NOT public — login bootstrap uses /api/auth/providers."""
    r = gated_app.get("/api/status")
-    assert r.status_code == 200, (
-        f"Expected 200, got {r.status_code}: {r.text}"
-    )
-    body = r.json()
-    assert body["auth_required"] is True
-    assert "version" in body
-    assert "gateway_state" in body
-
-
-@pytest.mark.parametrize("path", [
-    "/api/config/defaults",
-    "/api/config/schema",
-    "/api/model/info",
-    "/api/dashboard/themes",
-    "/api/dashboard/plugins",
-])
-def test_other_public_api_paths_are_public_under_gate(gated_app, path):
-    """The remaining ``PUBLIC_API_PATHS`` entries must also bypass the
-    gate. They're documented as non-sensitive read-only endpoints that
-    the SPA pre-loads before login (themes, config schema, model
-    metadata). A 401 / 302-to-login here would block the dashboard
-    shell from rendering pre-auth.
-
-    Accept any non-auth-failure status: 200 when the route succeeds,
-    or any route-specific error (e.g. 400 / 404 / 500 from a missing
-    dependency) — but NEVER 401, and NEVER a 302 to ``/login``.
-    """
-    r = gated_app.get(path, follow_redirects=False)
-    assert r.status_code != 401, (
-        f"{path} returned 401 under the OAuth gate — should be public"
-    )
-    if r.status_code == 302:
-        location = r.headers.get("location", "")
-        assert "/login" not in location, (
-            f"{path} redirected to {location} — should be public, "
-            "not bounced to /login"
-        )
+    assert r.status_code == 401


 def test_gated_html_redirects_to_login(gated_app):
@@ -149,7 +98,7 @@ def test_gated_static_asset_path_is_public(gated_app):
 # ---------------------------------------------------------------------------


-def test_full_login_round_trip_unlocks_gated_api(gated_app):
+def test_full_login_round_trip_unlocks_api_status(gated_app):
    # 1) Click "Sign in with Stub IdP" — /auth/login redirects to the stub
    #    with a PKCE cookie on the response.
    r1 = gated_app.get("/auth/login?provider=stub", follow_redirects=False)
@@ -179,16 +128,11 @@ def test_full_login_round_trip_unlocks_gated_api(gated_app):
    assert any("hermes_session_at" in c for c in set_cookies)
    assert any("hermes_session_rt" in c for c in set_cookies)

-    # 3) A gated API route (``/api/sessions``) now succeeds because we
-    #    have a valid session cookie. (We deliberately don't probe
-    #    ``/api/status`` here — it's in the shared PUBLIC_API_PATHS
-    #    allowlist and would 200 even without a login, so it can't
-    #    distinguish "logged in" from "gate accidentally disabled".)
-    r3 = gated_app.get("/api/sessions")
-    assert r3.status_code == 200, (
-        f"Expected 200 for /api/sessions post-login, got {r3.status_code}: "
-        f"{r3.text}"
-    )
+    # 3) /api/status now succeeds because we're authenticated.
+    r3 = gated_app.get("/api/status")
+    assert r3.status_code == 200
+    body = r3.json()
+    assert "version" in body


 def test_login_unknown_provider_returns_404(gated_app):
@@ -59,11 +59,19 @@ def loopback_client():
    web_server.app.state.auth_required = prev_required


+def _login(client: TestClient) -> None:
+    """Drive the stub OAuth round trip so the gated client is authed."""
+    r1 = client.get("/auth/login?provider=stub", follow_redirects=False)
+    assert r1.status_code == 302
+    state = r1.headers["location"].split("state=")[1]
+    r2 = client.get(
+        f"/auth/callback?code=stub_code&state={state}", follow_redirects=False
+    )
+    assert r2.status_code == 302
+
+
 def test_status_reports_auth_required_in_gated_mode(gated_client):
-    # No ``_login()`` call — ``/api/status`` is in the shared
-    # ``PUBLIC_API_PATHS`` allowlist precisely so external probes (and
-    # the SPA's pre-login bootstrap) can read the gate's shape without
-    # a cookie. Hit it cold.
+    _login(gated_client)
    r = gated_client.get("/api/status")
    assert r.status_code == 200
    body = r.json()
@@ -0,0 +1,154 @@
+"""Regression tests for #27145 — kanban.default_assignee for unassigned ready tasks.
+
+When the dispatcher hits an unassigned ready task and ``kanban.default_assignee``
+is set, the dispatcher applies the assignment and spawns. Without the config,
+the task is skipped (existing behavior preserved).
+"""
+from __future__ import annotations
+
+import json
+import os
+import sys
+import tempfile
+
+import pytest
+
+
+@pytest.fixture()
+def isolated_kanban_home(monkeypatch):
+    """Spin up a fresh HERMES_HOME with a clean kanban DB."""
+    test_home = tempfile.mkdtemp(prefix="kanban_default_assignee_test_")
+    monkeypatch.setenv("HERMES_HOME", test_home)
+    # Force-reimport so the fresh HERMES_HOME is picked up.
+    for mod in list(sys.modules.keys()):
+        if mod.startswith("hermes_cli") or mod.startswith("hermes_state") or mod == "hermes_constants":
+            del sys.modules[mod]
+    from hermes_cli import kanban_db
+    yield kanban_db, test_home
+    # No explicit cleanup: the temp dir is left on disk, but each test
+    # gets its own monkeypatched HERMES_HOME, so there is no cross-test
+    # contamination.
+
+
+def _fake_spawn(*args, **kwargs):
+    """Stand-in for the real worker spawn — returns a fake PID."""
+    return 12345
+
+
+def test_unassigned_task_skipped_without_default_assignee(isolated_kanban_home):
+    """Baseline: with no default_assignee, an unassigned ready task is
+    skipped via the existing `skipped_unassigned` bucket and the DB row
+    is untouched."""
+    kb, _home = isolated_kanban_home
+    with kb.connect_closing() as conn:
+        kb.create_board(slug="default", name="Test")
+        task_id = kb.create_task(conn, title="t1", assignee=None)
+    with kb.connect_closing() as conn:
+        res = kb.dispatch_once(conn, spawn_fn=_fake_spawn, dry_run=False)
+    assert res.skipped_unassigned == [task_id]
+    assert not res.auto_assigned_default
+    assert not res.spawned
+    with kb.connect_closing() as conn:
+        row = conn.execute("SELECT assignee FROM tasks WHERE id = ?", (task_id,)).fetchone()
+    assert row["assignee"] is None
+
+
+def test_unassigned_task_auto_assigned_with_default_assignee(isolated_kanban_home):
+    """Core #27145 contract: with default_assignee set, an unassigned ready
+    task gets the assignment applied and dispatched on the same tick. The
+    DB row is mutated (assignee column + an 'assigned' event)."""
+    kb, _home = isolated_kanban_home
+    with kb.connect_closing() as conn:
+        kb.create_board(slug="default", name="Test")
+        task_id = kb.create_task(conn, title="t1", assignee=None)
+    with kb.connect_closing() as conn:
+        res = kb.dispatch_once(
+            conn, spawn_fn=_fake_spawn, dry_run=False,
+            default_assignee="default",
+        )
+    assert res.auto_assigned_default == [task_id]
+    assert not res.skipped_unassigned
+    assert len(res.spawned) == 1
+    assert res.spawned[0][0] == task_id
+    assert res.spawned[0][1] == "default"
+
+    with kb.connect_closing() as conn:
+        row = conn.execute("SELECT assignee FROM tasks WHERE id = ?", (task_id,)).fetchone()
+    assert row["assignee"] == "default"
+
+    # 'assigned' event emitted for the audit trail
+    with kb.connect_closing() as conn:
+        evs = list(conn.execute(
+            "SELECT kind, payload FROM task_events WHERE task_id = ? AND kind = 'assigned'",
+            (task_id,),
+        ))
+    assert len(evs) == 1
+    payload = json.loads(evs[0][1])
+    assert payload["assignee"] == "default"
+    assert payload["source"] == "kanban.default_assignee"
+
+
+def test_dry_run_with_default_assignee_reports_without_mutating(isolated_kanban_home):
+    """Dry-run mode: reports what WOULD happen (task in auto_assigned_default,
+    spawn entry) but does NOT mutate the DB. Operators using
+    `hermes kanban dispatch --dry-run` see the routing decision before
+    committing."""
+    kb, _home = isolated_kanban_home
+    with kb.connect_closing() as conn:
+        kb.create_board(slug="default", name="Test")
+        task_id = kb.create_task(conn, title="t1", assignee=None)
+    with kb.connect_closing() as conn:
+        res = kb.dispatch_once(
+            conn, spawn_fn=_fake_spawn, dry_run=True,
+            default_assignee="default",
+        )
+    assert res.auto_assigned_default == [task_id]
+    assert len(res.spawned) == 1
+    with kb.connect_closing() as conn:
+        row = conn.execute("SELECT assignee FROM tasks WHERE id = ?", (task_id,)).fetchone()
+    # DB unchanged — dry_run did not commit the assignment.
+    assert row["assignee"] is None
+
+
+def test_whitespace_default_assignee_treated_as_none(isolated_kanban_home):
+    """Empty / whitespace-only default_assignee values must be treated as
+    'no fallback set' so a misconfigured kanban.default_assignee=' '
+    doesn't surprise operators by silently routing unassigned tasks."""
+    kb, _home = isolated_kanban_home
+    with kb.connect_closing() as conn:
+        kb.create_board(slug="default", name="Test")
+        task_id = kb.create_task(conn, title="t1", assignee=None)
+    with kb.connect_closing() as conn:
+        res = kb.dispatch_once(
+            conn, spawn_fn=_fake_spawn, dry_run=False,
+            default_assignee="   ",
+        )
+    assert task_id in res.skipped_unassigned
+    assert not res.auto_assigned_default
+
+
+def test_explicitly_assigned_task_untouched_by_default_assignee(isolated_kanban_home):
+    """A task with an explicit assignee must NOT be touched by the
+    default_assignee logic — that fallback only applies to genuinely
+    unassigned rows."""
+    kb, _home = isolated_kanban_home
+    with kb.connect_closing() as conn:
+        kb.create_board(slug="default", name="Test")
+        task_id = kb.create_task(conn, title="t1", assignee="default")
+    with kb.connect_closing() as conn:
+        res = kb.dispatch_once(
+            conn, spawn_fn=_fake_spawn, dry_run=False,
+            default_assignee="someother",
+        )
+    assert task_id not in res.auto_assigned_default
+    assert any(s[0] == task_id and s[1] == "default" for s in res.spawned)
+
+
+def test_dispatch_result_has_auto_assigned_default_field():
+    """Schema-level invariant: DispatchResult exposes the
+    auto_assigned_default field so CLI / dashboard / gateway can surface
+    the new routing decisions."""
+    from hermes_cli.kanban_db import DispatchResult
+    r = DispatchResult()
+    assert hasattr(r, "auto_assigned_default")
+    assert r.auto_assigned_default == []
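The fallback contract these tests pin down can be sketched without a database. This is a minimal illustration, not the dispatcher's actual code: `Task`, `Routing`, `normalize_default_assignee`, and `route_unassigned` are hypothetical names, and stripping surrounding whitespace from a non-empty value is an assumption. Only the two bucket names come from `DispatchResult`.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Task:
    # Hypothetical in-memory stand-in for a row in the tasks table.
    id: str
    assignee: Optional[str] = None


@dataclass
class Routing:
    # Mirrors only the two DispatchResult buckets discussed in these tests.
    auto_assigned_default: List[str] = field(default_factory=list)
    skipped_unassigned: List[str] = field(default_factory=list)


def normalize_default_assignee(value) -> Optional[str]:
    """Treat None, non-strings, and empty/whitespace strings as 'no fallback'."""
    if not isinstance(value, str):
        return None
    return value.strip() or None


def route_unassigned(tasks, default_assignee=None, dry_run=False) -> Routing:
    """Decide what happens to each unassigned task on one tick."""
    fallback = normalize_default_assignee(default_assignee)
    result = Routing()
    for task in tasks:
        if task.assignee:
            continue  # explicit assignees are never overridden
        if fallback is None:
            result.skipped_unassigned.append(task.id)
            continue
        result.auto_assigned_default.append(task.id)
        if not dry_run:
            # Per the PR description, the real dispatcher also records
            # an 'assigned' event here for the audit trail.
            task.assignee = fallback
    return result
```

Keeping the dry-run branch limited to skipping the mutation means the reported buckets are identical in both modes, which is the property `test_dry_run_with_default_assignee_reports_without_mutating` checks.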
@@ -0,0 +1,167 @@
+"""Regression tests for #21582 — per-profile concurrency cap in dispatcher.
+
+When ``kanban.max_in_progress_per_profile`` is set, no single profile
+gets more than N workers running at once even if the global
+``max_in_progress`` cap would allow it. Prevents one profile's local
+model / API quota / browser pool from being overwhelmed by a fan-out.
+"""
+from __future__ import annotations
+
+import os
+import sys
+import tempfile
+
+import pytest
+
+
+@pytest.fixture()
+def isolated_kanban_home_with_profiles(monkeypatch):
+    """Spin up a fresh HERMES_HOME with kanban DB + alpha/beta profiles."""
+    test_home = tempfile.mkdtemp(prefix="kanban_per_profile_cap_test_")
+    for prof in ("alpha", "beta", "default"):
+        os.makedirs(os.path.join(test_home, "profiles", prof), exist_ok=True)
+    monkeypatch.setenv("HERMES_HOME", test_home)
+    for mod in list(sys.modules.keys()):
+        if mod.startswith("hermes_cli") or mod.startswith("hermes_state") or mod == "hermes_constants":
+            del sys.modules[mod]
+    from hermes_cli import kanban_db
+    yield kanban_db
+
+
+def _fake_spawn(*args, **kwargs):
+    return 12345
+
+
+def test_no_cap_all_tasks_dispatched(isolated_kanban_home_with_profiles):
+    """Baseline: with no per-profile cap, all ready tasks dispatch."""
+    kb = isolated_kanban_home_with_profiles
+    with kb.connect_closing() as conn:
+        kb.create_board(slug="default", name="Test")
+        for i in range(5):
+            kb.create_task(conn, title=f"a{i}", assignee="alpha")
+        for i in range(3):
+            kb.create_task(conn, title=f"b{i}", assignee="beta")
+    with kb.connect_closing() as conn:
+        res = kb.dispatch_once(conn, spawn_fn=_fake_spawn, dry_run=True)
+    assert len(res.spawned) == 8
+    assert not res.skipped_per_profile_capped
+
+
+def test_cap_2_balances_two_profiles(isolated_kanban_home_with_profiles):
+    """With cap=2: 2 alpha + 2 beta dispatched; remaining 3 alpha + 1 beta
+    deferred to skipped_per_profile_capped."""
+    kb = isolated_kanban_home_with_profiles
+    with kb.connect_closing() as conn:
+        kb.create_board(slug="default", name="Test")
+        for i in range(5):
+            kb.create_task(conn, title=f"a{i}", assignee="alpha")
+        for i in range(3):
+            kb.create_task(conn, title=f"b{i}", assignee="beta")
+    with kb.connect_closing() as conn:
+        res = kb.dispatch_once(
+            conn, spawn_fn=_fake_spawn, dry_run=True,
+            max_in_progress_per_profile=2,
+        )
+    spawn_assignees = [s[1] for s in res.spawned]
+    capped_assignees = [c[1] for c in res.skipped_per_profile_capped]
+    assert spawn_assignees.count("alpha") == 2
+    assert spawn_assignees.count("beta") == 2
+    assert capped_assignees.count("alpha") == 3
+    assert capped_assignees.count("beta") == 1
+
+
+def test_pre_existing_running_counts_against_cap(isolated_kanban_home_with_profiles):
+    """A task already in 'running' status when dispatch_once starts counts
+    toward the per-profile cap. With 1 alpha pre-running and cap=1, NO new
+    alpha tasks should spawn; beta is independent so 1 beta spawns."""
+    kb = isolated_kanban_home_with_profiles
+    with kb.connect_closing() as conn:
+        kb.create_board(slug="default", name="Test")
+        running_alpha = kb.create_task(conn, title="running alpha", assignee="alpha")
+        with kb.write_txn(conn):
+            conn.execute(
+                "UPDATE tasks SET status = 'running', claim_lock = 'test:1' WHERE id = ?",
+                (running_alpha,),
+            )
+        for i in range(2):
+            kb.create_task(conn, title=f"a{i}", assignee="alpha")
+        for i in range(2):
+            kb.create_task(conn, title=f"b{i}", assignee="beta")
+    with kb.connect_closing() as conn:
+        res = kb.dispatch_once(
+            conn, spawn_fn=_fake_spawn, dry_run=True,
+            max_in_progress_per_profile=1,
+        )
+    spawn_assignees = [s[1] for s in res.spawned]
+    capped_assignees = [c[1] for c in res.skipped_per_profile_capped]
+    assert spawn_assignees.count("alpha") == 0
+    assert spawn_assignees.count("beta") == 1
+    assert capped_assignees.count("alpha") == 2
+    assert capped_assignees.count("beta") == 1
+
+
+@pytest.mark.parametrize("cap", [0, -1, "abc", None])
+def test_invalid_cap_treated_as_no_cap(isolated_kanban_home_with_profiles, cap):
+    """Cap values that don't represent a positive int should be treated as
+    'no cap' — silently falling through rather than crashing the dispatcher."""
+    kb = isolated_kanban_home_with_profiles
+    with kb.connect_closing() as conn:
+        kb.create_board(slug="default", name="Test")
+        for i in range(3):
+            kb.create_task(conn, title=f"a{i}", assignee="alpha")
+    with kb.connect_closing() as conn:
+        res = kb.dispatch_once(
+            conn, spawn_fn=_fake_spawn, dry_run=True,
+            max_in_progress_per_profile=cap,
+        )
+    assert not res.skipped_per_profile_capped
+    assert len(res.spawned) == 3
+
+
+def test_capped_tasks_dispatched_on_subsequent_tick(isolated_kanban_home_with_profiles):
+    """A task deferred this tick because its profile was at cap should be
+    eligible for dispatch on the next tick (after running tasks complete).
+    This verifies the cap is per-tick state, not a permanent block."""
+    kb = isolated_kanban_home_with_profiles
+    with kb.connect_closing() as conn:
+        kb.create_board(slug="default", name="Test")
+        ids = [kb.create_task(conn, title=f"a{i}", assignee="alpha") for i in range(3)]
+
+    # First tick: cap=1, only 1 alpha dispatched
+    with kb.connect_closing() as conn:
+        res1 = kb.dispatch_once(
+            conn, spawn_fn=_fake_spawn, dry_run=False,
+            max_in_progress_per_profile=1,
+        )
+    assert len(res1.spawned) == 1
+    assert len(res1.skipped_per_profile_capped) == 2
+
+    # Simulate the running task completing: mark it 'done' so the
+    # 'running' count drops.
+    spawned_id = res1.spawned[0][0]
+    with kb.connect_closing() as conn:
+        with kb.write_txn(conn):
+            conn.execute(
+                "UPDATE tasks SET status = 'done', claim_lock = NULL WHERE id = ?",
+                (spawned_id,),
+            )
+
+    # Second tick: 1 more alpha should now dispatch
+    with kb.connect_closing() as conn:
+        res2 = kb.dispatch_once(
+            conn, spawn_fn=_fake_spawn, dry_run=False,
+            max_in_progress_per_profile=1,
+        )
+    assert len(res2.spawned) == 1
+    assert len(res2.skipped_per_profile_capped) == 1
+    assert res2.spawned[0][0] != spawned_id  # different task this time
+
+
+def test_dispatch_result_has_skipped_per_profile_capped_field():
+    """Schema-level invariant: DispatchResult exposes the
+    skipped_per_profile_capped field as a list of
+    (task_id, assignee, current_running) tuples."""
+    from hermes_cli.kanban_db import DispatchResult
+    r = DispatchResult()
+    assert hasattr(r, "skipped_per_profile_capped")
+    assert r.skipped_per_profile_capped == []
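The cap behavior exercised above reduces to a per-assignee counter that is seeded from already-running tasks and incremented on every would-be spawn. The sketch below is a minimal illustration under stated assumptions: `valid_cap` and `plan_spawns` are hypothetical names, and rejecting every non-int value (including numeric strings and booleans) is an assumption; the tests only establish that `0`, `-1`, `'abc'`, and `None` mean "no cap".

```python
from collections import Counter


def valid_cap(value):
    """Return a positive int cap, or None when the value means 'no cap'."""
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        return None
    return value


def plan_spawns(ready, running, cap=None):
    """Split ready tasks into (spawned, capped) for one tick.

    ready:   list of (task_id, assignee) in dispatch order
    running: assignees of tasks already in 'running' status
    """
    cap = valid_cap(cap)
    counts = Counter(running)  # pre-existing running tasks count against the cap
    spawned, capped = [], []
    for task_id, assignee in ready:
        if cap is not None and counts[assignee] >= cap:
            # Same tuple shape as skipped_per_profile_capped:
            # (task_id, assignee, current_running_count)
            capped.append((task_id, assignee, counts[assignee]))
            continue
        spawned.append((task_id, assignee))
        counts[assignee] += 1  # in-memory bump, so a dry run sees the same subset
    return spawned, capped
```

Because the counter is rebuilt from the running set on each call, a deferred task becomes eligible again as soon as a running task for the same assignee finishes, matching `test_capped_tasks_dispatched_on_subsequent_tick`.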
@@ -54,12 +54,7 @@ This behavior applies to the s6-based image only. Earlier (tini-based) images st
 :::

 :::note Where gateway logs go
-Inside the s6 image, the supervised gateway's output is tee'd to two destinations:
-
- **`docker logs <container>`** — every line in real time (raw, no extra prefix). This is the same stream you'd get from a foreground gateway, so existing `docker logs --follow` / `--timestamps` / log-shipper integrations work unchanged.
- **`${HERMES_HOME}/logs/gateways/<profile>/current`** (mapped to `~/.hermes/logs/gateways/<profile>/current` on the host via the volume mount) — rotated, with an ISO 8601 timestamp prepended per line. Rotation is 10 archives × 1 MB each, so it can't fill the disk. This is what `hermes logs` reads and what survives container restarts.
-
-The per-profile reconciler keeps a separate audit log at `${HERMES_HOME}/logs/container-boot.log` — one line per profile per container boot, recording whether each gateway was restored to its prior state.
+See the [Where the logs go](#where-the-logs-go) section below for the full routing map (per-profile gateways, dashboard, boot reconciler, container-wide `docker logs`).
 :::

 Note: the API server is gated on `API_SERVER_ENABLED=true`. To expose it beyond `127.0.0.1` inside the container, also set `API_SERVER_HOST=0.0.0.0` and an `API_SERVER_KEY` (minimum 8 characters — generate one with `openssl rand -hex 32`). Example:
@@ -81,7 +76,7 @@ Opening any port on an internet facing machine is a security risk. You should no

 ## Running the dashboard

-The built-in web dashboard runs as an optional side-process inside the same container as the gateway. Set `HERMES_DASHBOARD=1` to run the dashboard on container loopback (`127.0.0.1`) by default:
+The built-in web dashboard runs as a supervised s6-rc service alongside the gateway in the same container. Set `HERMES_DASHBOARD=1` to bring it up:

 ```sh
 docker run -d \
@@ -89,54 +84,38 @@ docker run -d \
  --restart unless-stopped \
  -v ~/.hermes:/opt/data \
  -p 8642:8642 \
+  -p 9119:9119 \
  -e HERMES_DASHBOARD=1 \
  nousresearch/hermes-agent gateway run
 ```

-The entrypoint starts `hermes dashboard` in the background (running as the non-root `hermes` user) before `exec`-ing the main command. Dashboard output is prefixed with `[dashboard]` in `docker logs` so it's easy to separate from gateway logs.
+The dashboard is supervised by s6: if it crashes, `s6-supervise` restarts it automatically after a short backoff. Dashboard stdout/stderr is forwarded to `docker logs <container>` with no prefix, interleaved with gateway lines. The gateway's own output is additionally written to a per-profile s6-log file (see [Where the logs go](#where-the-logs-go) below), which is the place to read gateway output in isolation.

 | Environment variable | Description | Default |
 |---------------------|-------------|---------|
-| `HERMES_DASHBOARD` | Set to `1` (or `true` / `yes`) to launch the dashboard alongside the main command | *(unset — dashboard not started)* |
-| `HERMES_DASHBOARD_HOST` | Bind address for the dashboard HTTP server | `127.0.0.1` |
+| `HERMES_DASHBOARD` | Set to `1` (or `true` / `yes`) to enable the supervised dashboard service | *(unset — service is registered but stays down)* |
+| `HERMES_DASHBOARD_HOST` | Bind address for the dashboard HTTP server | `0.0.0.0` |
 | `HERMES_DASHBOARD_PORT` | Port for the dashboard HTTP server | `9119` |
 | `HERMES_DASHBOARD_TUI` | Set to `1` to expose the in-browser Chat tab (embedded `hermes --tui` via PTY/WebSocket) | *(unset)* |
 | `HERMES_DASHBOARD_INSECURE` | Set to `1` (or `true` / `yes`) to bind without the OAuth auth gate. Only use on trusted networks behind a reverse proxy without the OAuth contract — the dashboard exposes API keys and session data | *(unset — gate enforced when a `DashboardAuthProvider` is registered)* |

-By default, the dashboard stays on loopback (`127.0.0.1`) to avoid exposing
-the web surface over the network. To publish it intentionally, set
-`HERMES_DASHBOARD_HOST=0.0.0.0`. The dashboard's OAuth auth gate engages
-automatically whenever:
+The dashboard inside the container defaults to binding `0.0.0.0` — without it, the published `-p 9119:9119` port would not be reachable from the host. To restrict the bind to container loopback (for sidecar / reverse-proxy setups), set `HERMES_DASHBOARD_HOST=127.0.0.1`.

-1. The bind host is non-loopback, **and**
+The dashboard's OAuth auth gate engages automatically when both of the following are true:
+
+1. The bind host is non-loopback (e.g. the default `0.0.0.0` inside the container), **and**
 2. A `DashboardAuthProvider` plugin is registered.

-The bundled `dashboard_auth/nous` provider activates whenever
-`HERMES_DASHBOARD_OAUTH_CLIENT_ID` is set (see
-[Web Dashboard → Authentication](features/web-dashboard.md)). With the
-gate engaged, browser callers are redirected to the configured portal's
-OAuth flow before they can reach any protected route.
+The bundled `dashboard_auth/nous` provider activates whenever `HERMES_DASHBOARD_OAUTH_CLIENT_ID` is set (see [Web Dashboard → Authentication](features/web-dashboard.md)). With the gate engaged, browser callers are redirected to the configured portal's OAuth flow before they can reach any protected route.

-If no provider is registered and the bind is non-loopback, the dashboard
-**fails closed at startup** with a specific error pointing at the
-missing env var. To opt out of the gate explicitly — for a trusted-LAN
-deployment behind your own reverse proxy without the OAuth contract —
-set `HERMES_DASHBOARD_INSECURE=1`. This re-enables the legacy "no auth,
-loud warning" mode and is the only path that disables the gate; the bind
-host does not implicitly determine `--insecure` anymore.
+If no provider is registered and the bind is non-loopback, the dashboard **fails closed at startup** with a specific error pointing at the missing env var. To opt out of the gate explicitly — for a trusted-LAN deployment behind your own reverse proxy without the OAuth contract — set `HERMES_DASHBOARD_INSECURE=1`. This is the **only** path that disables the gate; the bind host alone never implies `--insecure` (it used to, but that predated the OAuth gate and silently disabled it on every container-deployed dashboard).
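
The gate rules above amount to a small decision table. The following is a minimal sketch of that table, not the dashboard's actual startup code: `is_loopback`, `gate_mode`, and the three return labels are hypothetical, and treating unresolvable hostnames as non-loopback is an assumption.

```python
import ipaddress


def is_loopback(host: str) -> bool:
    """True for 'localhost' and loopback IP literals such as 127.0.0.1 or ::1."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # assumption: other hostnames count as non-loopback


def gate_mode(host: str, provider_registered: bool, insecure: bool = False) -> str:
    """Return 'open', 'gated', or 'fail_closed' for a given configuration."""
    if insecure:
        return "open"  # HERMES_DASHBOARD_INSECURE=1 is the only explicit opt-out
    if is_loopback(host):
        return "open"  # loopback binds do not engage the gate
    if provider_registered:
        return "gated"  # non-loopback bind and a provider: OAuth gate engages
    return "fail_closed"  # non-loopback bind, no provider: refuse to start
```

With the container default of `0.0.0.0`, the outcome depends only on whether a provider is registered or the insecure flag is set.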

-:::note
-The dashboard runs as a supervised s6 service inside the container. If
-the dashboard process crashes, s6-overlay restarts it automatically
-after a short backoff — you'll see a new PID without needing to
-restart the container. Logs and crash output are visible via
-`docker logs <container>` (s6 forwards service stdout/stderr there).
-
-Running the dashboard as a separate container is not supported: its
-gateway-liveness detection requires a shared PID namespace with the
-gateway process.
+:::warning `HERMES_DASHBOARD_INSECURE=1` exposes API keys
+Opting out of the OAuth gate serves the dashboard's API surface (including model keys and session data) to anyone who can reach the published port. Only enable it when you have your own auth layer in front, or on a trusted LAN you fully control.
 :::

+Running the dashboard as a separate container is not supported: its gateway-liveness detection requires a shared PID namespace with the gateway process.
+
 ## Running interactively (CLI chat)

 To open an interactive chat session against a running data directory:
@@ -179,37 +158,60 @@ Never run two Hermes **gateway** containers against the same data directory simu

 ## Multi-profile support

-Hermes supports [multiple profiles](../reference/profile-commands.md) — separate `~/.hermes/` directories that let you run independent agents (different SOUL, skills, memory, sessions, credentials) from a single installation. **When running under Docker, using Hermes' built-in multi-profile feature is not recommended.**
+Hermes supports [multiple profiles](../reference/profile-commands.md) — separate `~/.hermes/` subdirectories that let you run independent agents (different SOUL, skills, memory, sessions, credentials) from a single installation. **Inside the official Docker image, the s6 supervision tree treats each profile as a first-class supervised service**, so the recommended deployment is **one container hosting all profiles**.

-Instead, the recommended pattern is **one container per profile**, with each container bind-mounting its own host directory as `/opt/data`:
+Each profile created with `hermes profile create <name>` gets:
+
+- A dedicated s6 service slot at `/run/service/gateway-<name>/`, registered dynamically by the runtime — no container rebuild required.
+- Auto-restart on crash, backoff-managed by `s6-supervise`.
+- Per-profile rotated logs at `${HERMES_HOME}/logs/gateways/<name>/current` (10 archives × 1 MB each).
+- State persistence across container restarts: the boot-time reconciler reads `gateway_state.json` from each profile directory and brings the slot back up only for profiles whose last recorded state was `running`. Stopped profiles stay stopped.
+
+The lifecycle commands you'd run on the host work the same way from inside the container:

 ```sh
-# Work profile
-docker run -d \
-  --name hermes-work \
-  --restart unless-stopped \
-  -v ~/.hermes-work:/opt/data \
-  -p 8642:8642 \
-  nousresearch/hermes-agent gateway run
+# Create a profile — registers the gateway-<name> s6 slot.
+docker exec hermes hermes profile create coder

-# Personal profile
-docker run -d \
-  --name hermes-personal \
-  --restart unless-stopped \
-  -v ~/.hermes-personal:/opt/data \
-  -p 8643:8642 \
-  nousresearch/hermes-agent gateway run
+# Start / stop / restart — dispatches s6-svc; the gateway lifecycle survives docker restart.
+docker exec hermes hermes -p coder gateway start
+docker exec hermes hermes -p coder gateway stop
+docker exec hermes hermes -p coder gateway restart
+
+# Status — reports `Manager: s6 (container supervisor)` inside the container.
+docker exec hermes hermes -p coder gateway status
+
+# Remove a profile — tears down the s6 slot too.
+docker exec hermes hermes profile delete coder
 ```

-Why separate containers over profiles in Docker:
+Under the hood, `hermes gateway start/stop/restart` inside the container is intercepted and routed to `s6-svc` against the right service directory; you don't need to learn the s6 commands directly. For raw supervisor state, use `/command/s6-svstat /run/service/gateway-<name>` (note `/command/` is on PATH only for processes spawned by the supervision tree — when calling from `docker exec`, pass the absolute path).

- **Isolation** — each container has its own filesystem, process table, and resource limits. A crash, dependency change, or runaway session in one profile can't affect another.
- **Independent lifecycle** — upgrade, restart, pause, or roll back each agent separately (`docker restart hermes-work` leaves `hermes-personal` untouched).
- **Clean port and network separation** — each gateway binds its own host port; there's no risk of cross-talk between chat platforms or API servers.
- **Simpler mental model** — the container *is* the profile. Backups, migrations, and permissions all follow the bind-mounted directory, with no extra `--profile` flags to remember.
- **Avoids concurrent-write risk** — the warning above about never running two gateways against the same data directory still applies to profiles within a single container.
+### Why one container with many profiles, not many containers

-In Docker Compose, this just means declaring one service per profile with distinct `container_name`, `volumes`, and `ports`:
+Before the s6 migration, "one container per profile" was the recommended pattern because there was no in-container supervisor to manage multiple gateways. With s6 as PID 1, that's no longer necessary, and the single-container layout is simpler in almost every dimension:
+
+| | One container, many profiles | One container per profile |
+|---|---|---|
+| Disk overhead | One image, one bundled venv, one Playwright cache | N images / N caches |
+| Memory overhead | Shared Python interpreter cache, shared node_modules | Duplicated per container |
+| Profile creation | `docker exec ... hermes profile create <name>` (seconds) | New `docker run` invocation + port allocation + bind-mount config |
+| Per-profile crash recovery | `s6-supervise` auto-restart | Docker's `--restart unless-stopped` (slower, kills sibling work) |
+| Logs | Per-profile rotated file via `s6-log`, plus container-boot audit log | `docker logs <name>` per container — no built-in rotation |
+| Backup | One `~/.hermes` directory | N directories to coordinate |
+
+The default profile (`default`) is always registered on first boot, so a fresh container ships with one supervised gateway out of the box. Additional profiles are pure runtime adds.
+
+### When you DO want a separate container
+
+Profile-in-container is the default. Run a separate container per profile only when you have a specific reason:
+
+- **Resource isolation per workload** — e.g. a runaway browser-tool session in profile A shouldn't be able to OOM profile B. Containers give you `--memory` / `--cpus` per profile.
+- **Independent image pinning** — different upstream image tags per workload.
+- **Network segmentation** — distinct Docker networks per profile (e.g. one customer-facing, one internal).
+- **Compliance / blast radius** — distinct credentials never share an OS-level process tree.
+
+In those cases, declare one service per profile with distinct `container_name`, `volumes`, and `ports`:

 ```yaml
 services:
@@ -234,6 +236,24 @@ services:
      - ~/.hermes-personal:/opt/data
 ```

+The warning from [Persistent volumes](#persistent-volumes) still applies: never point two containers at the same `~/.hermes` directory simultaneously. The s6 supervisor inside each container manages its own profile set; cross-container sharing of a data volume corrupts session files and memory stores.
+
+## Where the logs go
+
+The s6 container has four distinct log surfaces, and "why isn't my gateway showing anything in `docker logs`" is a common surprise. Cheatsheet:
+
+| Source | Where it lands | How to read it |
+|---|---|---|
+| **Per-profile gateway** (`hermes gateway run` and per-profile gateways under s6) | Tee'd to two places: `docker logs <container>` (real time, no extra prefix) **and** `${HERMES_HOME}/logs/gateways/<profile>/current` (rotated, ISO-8601 timestamped, 10 archives × 1 MB each) | `docker logs -f hermes` or `tail -F ~/.hermes/logs/gateways/default/current` on the host |
+| **Dashboard** (when `HERMES_DASHBOARD=1`) | `docker logs <container>` (no prefix) | `docker logs -f hermes` — interleaved with gateway lines |
+| **Boot reconciler** (records which profile gateways were restored on each container start) | `${HERMES_HOME}/logs/container-boot.log` (append-only audit log) | `tail -F ~/.hermes/logs/container-boot.log` |
+| **Generic Hermes logs** (`agent.log`, `errors.log`) | `${HERMES_HOME}/logs/` (profile-aware) | `docker exec hermes hermes logs --follow [--level WARNING] [--session <id>]` |
+
+Two practical consequences worth knowing:
+
+- The file copy at `logs/gateways/<profile>/current` is what survives container restarts. `docker logs` only retains output from the current container's lifetime (and is wiped on `docker rm`); the rotated files persist on the bind-mounted volume.
+- The boot reconciler's audit line shape is `<iso-timestamp> profile=<name> prior_state=<state> action=<registered|started>`, so a quick `grep profile=coder ~/.hermes/logs/container-boot.log` reveals when a given profile was last restored and whether s6 auto-started it.
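+
+Note that a plain `grep profile=coder` also matches longer names such as `profile=coder2`. A small helper along the following lines avoids that by comparing the whole field. It is a sketch that assumes the audit line shape described above; `last_restore` is a hypothetical function name, not a Hermes command.
+
+```sh
+# Hypothetical helper (not part of Hermes): print the most recent audit
+# line for one profile. Assumes each line has the shape
+#   <iso-timestamp> profile=<name> prior_state=<state> action=<registered|started>
+last_restore() {
+    profile="$1"
+    log="${2:-$HOME/.hermes/logs/container-boot.log}"
+    # Compare whole whitespace-separated fields so "coder" never matches "coder2".
+    awk -v want="profile=$profile" '
+        { for (i = 1; i <= NF; i++) if ($i == want) last = $0 }
+        END { if (last != "") print last }
+    ' "$log"
+}
+```
+
+Calling `last_restore coder` on the host prints the last matching line, or nothing if the profile never appears in the log.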
+
 ## Environment variable forwarding

 API keys are read from `/opt/data/.env` inside the container. You can also pass environment variables directly:
@@ -281,7 +301,7 @@ services:
          cpus: "2.0"
 ```

-Start with `docker compose up -d` and view logs with `docker compose logs -f`. Dashboard output is prefixed with `[dashboard]` so it's easy to filter from gateway logs.
+Start with `docker compose up -d` and view logs with `docker compose logs -f`. The supervised gateway's stdout is also tee'd to `${HERMES_HOME}/logs/gateways/<profile>/current` on the volume — see [Where the logs go](#where-the-logs-go) for the full routing map.

 ## Optional: Linux desktop audio bridge

@@ -415,24 +435,28 @@ The container ENTRYPOINT is now `/init` (s6-overlay), not `/usr/bin/tini`. All f
 Do not override the image entrypoint unless you keep `/init` (or, equivalently, the legacy `docker/entrypoint.sh` shim that forwards to the stage2 hook) in the command chain. s6-overlay's `/init` runs as root so it can chown the volume on first boot, then drops to the `hermes` user via `s6-setuidgid` for every supervised service AND for the main program. Starting `hermes gateway run` as root inside the official image is refused by default because it can leave root-owned files in `/opt/data` and break later dashboard or gateway starts. Set `HERMES_ALLOW_ROOT_GATEWAY=1` only when you intentionally accept that risk.
 :::

-### Per-profile gateway supervision
+### `docker exec` automatically drops to the `hermes` user

-Inside the container, each profile created with `hermes profile create <name>` automatically gets an s6-supervised gateway service registered at `/run/service/gateway-<name>/`. The lifecycle commands you'd run on the host work the same way:
+`docker exec` runs commands as root inside the container by default, but the image ships a thin shim at `/opt/hermes/bin/hermes` (earliest on PATH) that detects root callers and transparently re-execs through `s6-setuidgid hermes`. So `docker exec hermes hermes login`, `docker exec hermes hermes profile create …`, `docker exec hermes hermes setup`, etc. (the first `hermes` is the container name) all write files owned by UID 10000, and therefore readable by the supervised gateway, with no extra `--user` flag needed. Non-root callers (the supervised processes themselves, `docker exec --user hermes`, kanban subagents inside the container) hit a short-circuit that execs the venv binary directly, so there's no overhead on the hot paths.
+
+If you specifically need a `docker exec` that retains root semantics (diagnostic sessions, inspecting root-only state, files outside `/opt/data` that root happens to own), opt out per invocation:

 ```sh
-hermes profile create coder            # registers gateway-coder s6 slot
-hermes -p coder gateway start          # s6-svc -u  → supervised gateway
-hermes -p coder gateway stop           # s6-svc -d  → service down
-hermes -p coder gateway restart        # s6-svc -t  → SIGTERM the supervisor
-hermes profile delete coder            # tears down the s6 slot
+docker exec -e HERMES_DOCKER_EXEC_AS_ROOT=1 hermes hermes <cmd>
 ```

+The shim accepts `1` / `true` / `yes` (case-insensitive). Any other value, including `0` or a misspelling such as `ture`, falls through to the privilege drop, so a typo can never silently leave a command running as root. If `s6-setuidgid` isn't available (custom builds that stripped s6-overlay), the shim refuses to run as root and exits 126. That surfaces the broken privilege model loudly instead of regressing to the historical footgun, where a root-run `hermes login` would write `auth.json` as `root:root` and break the supervised gateway's auth on every chat platform message.
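+
+The acceptance rule can be sketched in a few lines of POSIX shell. This is an illustration of the behavior described above, not the shim's actual source, and `keeps_root` is a hypothetical name.
+
+```sh
+# Illustrative only: mirrors the documented opt-out rule, not the real shim.
+# Returns 0 (stay root) for 1/true/yes in any letter case,
+# and 1 (drop to the hermes user) for everything else, including empty.
+keeps_root() {
+    value=$(printf '%s' "${1:-}" | tr '[:upper:]' '[:lower:]')
+    case "$value" in
+        1|true|yes) return 0 ;;
+        *)          return 1 ;;
+    esac
+}
+```
+
+For example, `keeps_root "${HERMES_DOCKER_EXEC_AS_ROOT:-}"` succeeds for `TRUE` or `Yes` and fails for `0`, an unset variable, or a misspelled value, which is why the drop is the safe default.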
+
+### Per-profile gateway supervision
+
+Each profile created with `hermes profile create <name>` automatically gets an s6-supervised gateway service registered at `/run/service/gateway-<name>/`. Its last recorded run state persists across container restarts, so a gateway that was running is brought back up automatically. See [Multi-profile support](#multi-profile-support) above for the user-facing workflow and the lifecycle commands.
+
 **Supervision benefits over the pre-s6 image:**

 - Gateway crashes are auto-restarted by `s6-supervise` after a ~1s backoff.
- Dashboard crashes are auto-restarted (set `HERMES_DASHBOARD=1` to start it).
+- The dashboard, when enabled with `HERMES_DASHBOARD=1`, runs under the same supervision tree and gets the same auto-restart treatment.
 - `docker restart` preserves running gateways: the cont-init reconciler reads `$HERMES_HOME/profiles/<name>/gateway_state.json` and brings the slot back up if the last recorded state was `running`. Stopped gateways stay stopped.
- Per-profile gateway logs persist under `$HERMES_HOME/logs/gateways/<profile>/current` (rotated by `s6-log`), and the reconciler's actions are appended to `$HERMES_HOME/logs/container-boot.log` per boot.
+- Per-profile gateway logs persist under `$HERMES_HOME/logs/gateways/<profile>/current` (rotated by `s6-log`), and the reconciler's actions are appended to `$HERMES_HOME/logs/container-boot.log` per boot. See [Where the logs go](#where-the-logs-go) for the full routing map.

 `hermes status` inside the container reports `Manager: s6 (container supervisor)`. Use `/command/s6-svstat /run/service/gateway-<name>` for the raw supervisor view (note `/command/` is on PATH for supervision-tree processes only; pass the absolute path when calling from `docker exec`).

@@ -692,6 +716,8 @@ The container's stage2 hook drops privileges to the non-root `hermes` user (UID
 chmod -R 755 ~/.hermes
 ```

+`docker exec hermes hermes <cmd>` automatically drops to UID 10000 too. See [`docker exec` automatically drops to the `hermes` user](#docker-exec-automatically-drops-to-the-hermes-user) for details and the per-invocation opt-out.
+
 ### Browser tools not working

 Playwright needs shared memory. Add `--shm-size=1g` to your Docker run command: