Compare commits

...

6 Commits

Author SHA1 Message Date
kshitijk4poor 84d1673e2f feat: provider modules — ProviderProfile ABC, 30 providers, fetch_models, transport single-path
feat: provider modules — ProviderProfile ABC, 29 providers, fetch_models, transport single-path

Introduces providers/ as the single source of truth for every inference
provider. All 29 providers declared with correct data cross-checked against
auth.py, runtime_provider.py and auxiliary_client.py.

Rebased onto main (30307a980). Incorporates post-salvage fixes from
56724147e (gmi aux model google/gemini-3.1-flash-lite-preview, already set in providers/gmi.py).
2026-04-29 20:10:09 +05:30
Teknium 30307a9802 feat(plugins): add pre_approval_request / post_approval_response hooks (#16776)
Plugins can now observe dangerous-command approval events in real time,
on both the CLI-interactive path and the async gateway path. This is the
missing hook surface external tools need to build approval notifiers
(macOS menu-bar allow/deny, Slack alerts, audit logs, etc.) without
forking Hermes or running a parallel gateway adapter.

Changes:
- hermes_cli/plugins.py: add two entries to VALID_HOOKS
- tools/approval.py: fire both hooks from check_all_command_guards --
  around prompt_dangerous_approval (CLI surface) and around the
  notify_cb + blocking event.wait loop (gateway surface)
- website/docs/user-guide/features/hooks.md: document both hooks with
  a macOS-notification example
- tests/tools/test_approval_plugin_hooks.py: 5 tests covering CLI once,
  CLI deny, plugin-crash resilience, gateway approve, gateway timeout

Hooks are observer-only: return values are ignored, so plugins cannot
veto or pre-answer an approval (use pre_tool_call for that). A crashing
plugin cannot break the approval flow -- invoke_hook swallows per-
callback errors, and the wrapper logs and swallows dispatch-layer
errors too.

Surface kwarg distinguishes "cli" from "gateway"; post hook reports
choice as one of once/session/always/deny/timeout.
2026-04-27 20:08:33 -07:00
Teknium 6ea5699e3f fix(compression): notify users when configured aux model fails even if main-model fallback recovers (#16775)
A misconfigured auxiliary.compression.model is a user-fixable problem that silent recovery would hide. The previous retry-on-main logic transparently swallowed aux-model failures whenever the fallback succeeded, leaving the user's broken config in place and racking up future failures.

Track the aux-model failure on the compressor alongside the existing fallback-placeholder fields:
- _last_aux_model_failure_model: str | None
- _last_aux_model_failure_error: str | None

Both are set at the moment the aux model errors (captured before summary_model is cleared for retry), regardless of whether the retry succeeds. Cleared at compress() start and on on_session_reset() so a clean run doesn't leak stale warnings.

Surface at three places:
- gateway hygiene auto-compress: ℹ note to the platform adapter (thread_id preserved)
- gateway /compress command: ℹ line appended to the reply
- CLI via _emit_warning: deduped on (model, error) so repeat compactions don't spam

Distinct from the existing ⚠️ dropped-turns warning — different severity, different emoji, explicit 'context is intact' reassurance.
2026-04-27 20:08:23 -07:00
SHL0MS c3e3a9c184 feat(skills): add Tier A references — external-data, panel-ui, replicator, dat-scripting, 3d-scene
Five additional reference docs covering common TD use cases that were not yet
documented in any reference (operators.md lists the ops, but no usage patterns).

- external-data.md: webDAT, webclientDAT, webserverDAT, websocketDAT,
  mqttClientDAT, serialDAT, tcpipDAT — auth, polling, push, JSON parsing
- panel-ui.md: custom parameter pages, button/slider/field/list COMPs,
  containerCOMP layouts, panelExecuteDAT callbacks
- replicator.md: replicatorCOMP for data-driven cloning, per-row overrides,
  recreatemissing pattern, replicator vs Python loop
- dat-scripting.md: full Execute DAT family — chopExecuteDAT, datExecuteDAT,
  parameterExecuteDAT, panelExecuteDAT, opExecuteDAT, executeDAT lifecycle
- 3d-scene.md: light types, three-point rigs, shadows, IBL/cubemaps,
  PBR materials with idiom table, multi-camera, DOF

Same conventions as existing refs: code-first, verify param names with
td_get_par_info, no token-budget impact (load on demand).
2026-04-27 19:35:18 -07:00
SHL0MS 02df438316 feat(skills): expand touchdesigner-mcp with animation, MIDI/OSC, particles, projection refs
Adds four new reference docs covering common TD use cases not previously
documented in the skill:

- animation.md: LFOs, timers, keyframes, easing, time references
- midi-osc.md: MIDI controllers, OSC routing, TouchOSC, multi-machine sync
- particles.md: POPs and particleSOP — emission, forces, collisions, render
- projection-mapping.md: windowCOMP, corner pin, mesh warp, edge blending

Also clarifies the SKILL.md tool quick reference: adds td_screen_point_to_global
and notes that 4 admin/dev-mode tools (td_project_quit, td_test_session,
td_dev_log, td_clear_dev_log) live only in mcp-tools.md to keep the main
reference focused on creative workflows.

No SKILL.md workflow or critical-rules changes. References load on demand
so no token-budget impact at session start.
2026-04-27 19:35:18 -07:00
Teknium 94b26f3ec9 fix(compression): retry summary on main model for unknown errors before giving up (#16774)
The existing retry-on-main path in _generate_summary only fires for errors that match the _is_model_not_found heuristic (404/503, 'model_not_found', 'does not exist', 'no available channel'). Other misconfiguration errors — 400s from aggregators, provider-specific 'no route' strings, opaque rejections — fall straight through to the transient-cooldown branch, which drops N turns of context and inserts a static placeholder.

Losing context is almost always worse than one extra summary attempt. Add a best-effort retry-on-main for the unknown-error branch, guarded by the same invariants as the existing fast-path retry: only when summary_model differs from main, and only once per compressor (_summary_model_fallen_back).

Tests cover: 404 fast-path fallback still works, unknown 400 now falls back, same-model aux skips retry (no infinite loop), and a double-failure (aux + main) stops at 2 calls.
2026-04-27 19:25:57 -07:00
79 changed files with 7213 additions and 1036 deletions
+632
View File
@@ -0,0 +1,632 @@
"""OpenAI-compatible shim that forwards Hermes requests to `copilot --acp`.
This adapter lets Hermes treat the GitHub Copilot ACP server as a chat-style
backend. Each request starts a short-lived ACP session, sends the formatted
conversation as a single prompt, collects text chunks, and converts the result
back into the minimal shape Hermes expects from an OpenAI client.
"""
from __future__ import annotations
import json
import os
import queue
import re
import shlex
import subprocess
import threading
import time
from collections import deque
from pathlib import Path
from types import SimpleNamespace
from typing import Any
from agent.file_safety import get_read_block_error, is_write_denied
from agent.redact import redact_sensitive_text
ACP_MARKER_BASE_URL = "acp://copilot"
_DEFAULT_TIMEOUT_SECONDS = 900.0
_TOOL_CALL_BLOCK_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)
_TOOL_CALL_JSON_RE = re.compile(
r"\{\s*\"id\"\s*:\s*\"[^\"]+\"\s*,\s*\"type\"\s*:\s*\"function\"\s*,\s*\"function\"\s*:\s*\{.*?\}\s*\}",
re.DOTALL,
)
def _resolve_command() -> str:
return (
os.getenv("HERMES_COPILOT_ACP_COMMAND", "").strip()
or os.getenv("COPILOT_CLI_PATH", "").strip()
or "copilot"
)
def _resolve_args() -> list[str]:
raw = os.getenv("HERMES_COPILOT_ACP_ARGS", "").strip()
if not raw:
return ["--acp", "--stdio"]
return shlex.split(raw)
def _jsonrpc_error(message_id: Any, code: int, message: str) -> dict[str, Any]:
return {
"jsonrpc": "2.0",
"id": message_id,
"error": {
"code": code,
"message": message,
},
}
def _permission_denied(message_id: Any) -> dict[str, Any]:
return {
"jsonrpc": "2.0",
"id": message_id,
"result": {
"outcome": {
"outcome": "cancelled",
}
},
}
def _format_messages_as_prompt(
messages: list[dict[str, Any]],
model: str | None = None,
tools: list[dict[str, Any]] | None = None,
tool_choice: Any = None,
) -> str:
sections: list[str] = [
"You are being used as the active ACP agent backend for Hermes.",
"Use ACP capabilities to complete tasks.",
"IMPORTANT: If you take an action with a tool, you MUST output tool calls using <tool_call>{...}</tool_call> blocks with JSON exactly in OpenAI function-call shape.",
"If no tool is needed, answer normally.",
]
if model:
sections.append(f"Hermes requested model hint: {model}")
if isinstance(tools, list) and tools:
tool_specs: list[dict[str, Any]] = []
for t in tools:
if not isinstance(t, dict):
continue
fn = t.get("function") or {}
if not isinstance(fn, dict):
continue
name = fn.get("name")
if not isinstance(name, str) or not name.strip():
continue
tool_specs.append(
{
"name": name.strip(),
"description": fn.get("description", ""),
"parameters": fn.get("parameters", {}),
}
)
if tool_specs:
sections.append(
"Available tools (OpenAI function schema). "
"When using a tool, emit ONLY <tool_call>{...}</tool_call> with one JSON object "
"containing id/type/function{name,arguments}. arguments must be a JSON string.\n"
+ json.dumps(tool_specs, ensure_ascii=False)
)
if tool_choice is not None:
sections.append(
f"Tool choice hint: {json.dumps(tool_choice, ensure_ascii=False)}"
)
transcript: list[str] = []
for message in messages:
if not isinstance(message, dict):
continue
role = str(message.get("role") or "unknown").strip().lower()
if role == "tool":
role = "tool"
elif role not in {"system", "user", "assistant"}:
role = "context"
content = message.get("content")
rendered = _render_message_content(content)
if not rendered:
continue
label = {
"system": "System",
"user": "User",
"assistant": "Assistant",
"tool": "Tool",
"context": "Context",
}.get(role, role.title())
transcript.append(f"{label}:\n{rendered}")
if transcript:
sections.append("Conversation transcript:\n\n" + "\n\n".join(transcript))
sections.append("Continue the conversation from the latest user request.")
return "\n\n".join(
section.strip() for section in sections if section and section.strip()
)
def _render_message_content(content: Any) -> str:
if content is None:
return ""
if isinstance(content, str):
return content.strip()
if isinstance(content, dict):
if "text" in content:
return str(content.get("text") or "").strip()
if "content" in content and isinstance(content.get("content"), str):
return str(content.get("content") or "").strip()
return json.dumps(content, ensure_ascii=True)
if isinstance(content, list):
parts: list[str] = []
for item in content:
if isinstance(item, str):
parts.append(item)
elif isinstance(item, dict):
text = item.get("text")
if isinstance(text, str) and text.strip():
parts.append(text.strip())
return "\n".join(parts).strip()
return str(content).strip()
def _extract_tool_calls_from_text(text: str) -> tuple[list[SimpleNamespace], str]:
if not isinstance(text, str) or not text.strip():
return [], ""
extracted: list[SimpleNamespace] = []
consumed_spans: list[tuple[int, int]] = []
def _try_add_tool_call(raw_json: str) -> None:
try:
obj = json.loads(raw_json)
except Exception:
return
if not isinstance(obj, dict):
return
fn = obj.get("function")
if not isinstance(fn, dict):
return
fn_name = fn.get("name")
if not isinstance(fn_name, str) or not fn_name.strip():
return
fn_args = fn.get("arguments", "{}")
if not isinstance(fn_args, str):
fn_args = json.dumps(fn_args, ensure_ascii=False)
call_id = obj.get("id")
if not isinstance(call_id, str) or not call_id.strip():
call_id = f"acp_call_{len(extracted) + 1}"
extracted.append(
SimpleNamespace(
id=call_id,
call_id=call_id,
response_item_id=None,
type="function",
function=SimpleNamespace(name=fn_name.strip(), arguments=fn_args),
)
)
for m in _TOOL_CALL_BLOCK_RE.finditer(text):
raw = m.group(1)
_try_add_tool_call(raw)
consumed_spans.append((m.start(), m.end()))
# Only try bare-JSON fallback when no XML blocks were found.
if not extracted:
for m in _TOOL_CALL_JSON_RE.finditer(text):
raw = m.group(0)
_try_add_tool_call(raw)
consumed_spans.append((m.start(), m.end()))
if not consumed_spans:
return extracted, text.strip()
consumed_spans.sort()
merged: list[tuple[int, int]] = []
for start, end in consumed_spans:
if not merged or start > merged[-1][1]:
merged.append((start, end))
else:
merged[-1] = (merged[-1][0], max(merged[-1][1], end))
parts: list[str] = []
cursor = 0
for start, end in merged:
if cursor < start:
parts.append(text[cursor:start])
cursor = max(cursor, end)
if cursor < len(text):
parts.append(text[cursor:])
cleaned = "\n".join(p.strip() for p in parts if p and p.strip()).strip()
return extracted, cleaned
def _ensure_path_within_cwd(path_text: str, cwd: str) -> Path:
candidate = Path(path_text)
if not candidate.is_absolute():
raise PermissionError("ACP file-system paths must be absolute.")
resolved = candidate.resolve()
root = Path(cwd).resolve()
try:
resolved.relative_to(root)
except ValueError as exc:
raise PermissionError(
f"Path '{resolved}' is outside the session cwd '{root}'."
) from exc
return resolved
class _ACPChatCompletions:
def __init__(self, client: CopilotACPClient):
self._client = client
def create(self, **kwargs: Any) -> Any:
return self._client._create_chat_completion(**kwargs)
class _ACPChatNamespace:
def __init__(self, client: CopilotACPClient):
self.completions = _ACPChatCompletions(client)
class CopilotACPClient:
"""Minimal OpenAI-client-compatible facade for Copilot ACP."""
def __init__(
self,
*,
api_key: str | None = None,
base_url: str | None = None,
default_headers: dict[str, str] | None = None,
acp_command: str | None = None,
acp_args: list[str] | None = None,
acp_cwd: str | None = None,
command: str | None = None,
args: list[str] | None = None,
**_: Any,
):
self.api_key = api_key or "copilot-acp"
self.base_url = base_url or ACP_MARKER_BASE_URL
self._default_headers = dict(default_headers or {})
self._acp_command = acp_command or command or _resolve_command()
self._acp_args = list(acp_args or args or _resolve_args())
self._acp_cwd = str(Path(acp_cwd or os.getcwd()).resolve())
self.chat = _ACPChatNamespace(self)
self.is_closed = False
self._active_process: subprocess.Popen[str] | None = None
self._active_process_lock = threading.Lock()
def close(self) -> None:
proc: subprocess.Popen[str] | None
with self._active_process_lock:
proc = self._active_process
self._active_process = None
self.is_closed = True
if proc is None:
return
try:
proc.terminate()
proc.wait(timeout=2)
except Exception:
try:
proc.kill()
except Exception:
pass
def _create_chat_completion(
self,
*,
model: str | None = None,
messages: list[dict[str, Any]] | None = None,
timeout: float | None = None,
tools: list[dict[str, Any]] | None = None,
tool_choice: Any = None,
**_: Any,
) -> Any:
prompt_text = _format_messages_as_prompt(
messages or [],
model=model,
tools=tools,
tool_choice=tool_choice,
)
# Normalise timeout: run_agent.py may pass an httpx.Timeout object
# (used natively by the OpenAI SDK) rather than a plain float.
if timeout is None:
_effective_timeout = _DEFAULT_TIMEOUT_SECONDS
elif isinstance(timeout, (int, float)):
_effective_timeout = float(timeout)
else:
# httpx.Timeout or similar — pick the largest component so the
# subprocess has enough wall-clock time for the full response.
_candidates = [
getattr(timeout, attr, None)
for attr in ("read", "write", "connect", "pool", "timeout")
]
_numeric = [float(v) for v in _candidates if isinstance(v, (int, float))]
_effective_timeout = max(_numeric) if _numeric else _DEFAULT_TIMEOUT_SECONDS
response_text, reasoning_text = self._run_prompt(
prompt_text,
timeout_seconds=_effective_timeout,
)
tool_calls, cleaned_text = _extract_tool_calls_from_text(response_text)
usage = SimpleNamespace(
prompt_tokens=0,
completion_tokens=0,
total_tokens=0,
prompt_tokens_details=SimpleNamespace(cached_tokens=0),
)
assistant_message = SimpleNamespace(
content=cleaned_text,
tool_calls=tool_calls,
reasoning=reasoning_text or None,
reasoning_content=reasoning_text or None,
reasoning_details=None,
)
finish_reason = "tool_calls" if tool_calls else "stop"
choice = SimpleNamespace(message=assistant_message, finish_reason=finish_reason)
return SimpleNamespace(
choices=[choice],
usage=usage,
model=model or "copilot-acp",
)
def _run_prompt(
self, prompt_text: str, *, timeout_seconds: float
) -> tuple[str, str]:
try:
proc = subprocess.Popen(
[self._acp_command] + self._acp_args,
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
bufsize=1,
cwd=self._acp_cwd,
)
except FileNotFoundError as exc:
raise RuntimeError(
f"Could not start Copilot ACP command '{self._acp_command}'. "
"Install GitHub Copilot CLI or set HERMES_COPILOT_ACP_COMMAND/COPILOT_CLI_PATH."
) from exc
if proc.stdin is None or proc.stdout is None:
proc.kill()
raise RuntimeError("Copilot ACP process did not expose stdin/stdout pipes.")
self.is_closed = False
with self._active_process_lock:
self._active_process = proc
inbox: queue.Queue[dict[str, Any]] = queue.Queue()
stderr_tail: deque[str] = deque(maxlen=40)
def _stdout_reader() -> None:
if proc.stdout is None:
return
for line in proc.stdout:
try:
inbox.put(json.loads(line))
except Exception:
inbox.put({"raw": line.rstrip("\n")})
def _stderr_reader() -> None:
if proc.stderr is None:
return
for line in proc.stderr:
stderr_tail.append(line.rstrip("\n"))
out_thread = threading.Thread(target=_stdout_reader, daemon=True)
err_thread = threading.Thread(target=_stderr_reader, daemon=True)
out_thread.start()
err_thread.start()
next_id = 0
def _request(
method: str,
params: dict[str, Any],
*,
text_parts: list[str] | None = None,
reasoning_parts: list[str] | None = None,
) -> Any:
nonlocal next_id
next_id += 1
request_id = next_id
payload = {
"jsonrpc": "2.0",
"id": request_id,
"method": method,
"params": params,
}
assert proc.stdin is not None # always set: Popen(stdin=PIPE)
proc.stdin.write(json.dumps(payload) + "\n")
proc.stdin.flush()
deadline = time.time() + timeout_seconds
while time.time() < deadline:
if proc.poll() is not None:
break
try:
msg = inbox.get(timeout=0.1)
except queue.Empty:
continue
if self._handle_server_message(
msg,
process=proc,
cwd=self._acp_cwd,
text_parts=text_parts,
reasoning_parts=reasoning_parts,
):
continue
if msg.get("id") != request_id:
continue
if "error" in msg:
err = msg.get("error") or {}
raise RuntimeError(
f"Copilot ACP {method} failed: {err.get('message') or err}"
)
return msg.get("result")
stderr_text = "\n".join(stderr_tail).strip()
if proc.poll() is not None and stderr_text:
raise RuntimeError(f"Copilot ACP process exited early: {stderr_text}")
raise TimeoutError(
f"Timed out waiting for Copilot ACP response to {method}."
)
try:
_request(
"initialize",
{
"protocolVersion": 1,
"clientCapabilities": {
"fs": {
"readTextFile": True,
"writeTextFile": True,
}
},
"clientInfo": {
"name": "hermes-agent",
"title": "Hermes Agent",
"version": "0.0.0",
},
},
)
session = (
_request(
"session/new",
{
"cwd": self._acp_cwd,
"mcpServers": [],
},
)
or {}
)
session_id = str(session.get("sessionId") or "").strip()
if not session_id:
raise RuntimeError("Copilot ACP did not return a sessionId.")
text_parts: list[str] = []
reasoning_parts: list[str] = []
_request(
"session/prompt",
{
"sessionId": session_id,
"prompt": [
{
"type": "text",
"text": prompt_text,
}
],
},
text_parts=text_parts,
reasoning_parts=reasoning_parts,
)
return "".join(text_parts), "".join(reasoning_parts)
finally:
self.close()
def _handle_server_message(
self,
msg: dict[str, Any],
*,
process: subprocess.Popen[str],
cwd: str,
text_parts: list[str] | None,
reasoning_parts: list[str] | None,
) -> bool:
method = msg.get("method")
if not isinstance(method, str):
return False
if method == "session/update":
params = msg.get("params") or {}
update = params.get("update") or {}
kind = str(update.get("sessionUpdate") or "").strip()
content = update.get("content") or {}
chunk_text = ""
if isinstance(content, dict):
chunk_text = str(content.get("text") or "")
if kind == "agent_message_chunk" and chunk_text and text_parts is not None:
text_parts.append(chunk_text)
elif (
kind == "agent_thought_chunk"
and chunk_text
and reasoning_parts is not None
):
reasoning_parts.append(chunk_text)
return True
if process.stdin is None:
return True
message_id = msg.get("id")
params = msg.get("params") or {}
if method == "session/request_permission":
response = _permission_denied(message_id)
elif method == "fs/read_text_file":
try:
path = _ensure_path_within_cwd(str(params.get("path") or ""), cwd)
block_error = get_read_block_error(str(path))
if block_error:
raise PermissionError(block_error)
content = path.read_text() if path.exists() else ""
line = params.get("line")
limit = params.get("limit")
if isinstance(line, int) and line > 1:
lines = content.splitlines(keepends=True)
start = line - 1
end = (
start + limit if isinstance(limit, int) and limit > 0 else None
)
content = "".join(lines[start:end])
if content:
content = redact_sensitive_text(content)
response = {
"jsonrpc": "2.0",
"id": message_id,
"result": {
"content": content,
},
}
except Exception as exc:
response = _jsonrpc_error(message_id, -32602, str(exc))
elif method == "fs/write_text_file":
try:
path = _ensure_path_within_cwd(str(params.get("path") or ""), cwd)
if is_write_denied(str(path)):
raise PermissionError(
f"Write denied: '{path}' is a protected system/credential file."
)
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(str(params.get("content") or ""))
response = {
"jsonrpc": "2.0",
"id": message_id,
"result": None,
}
except Exception as exc:
response = _jsonrpc_error(message_id, -32602, str(exc))
else:
response = _jsonrpc_error(
message_id,
-32601,
f"ACP client method '{method}' is not supported by Hermes yet.",
)
process.stdin.write(json.dumps(response) + "\n")
process.stdin.flush()
return True
+51 -27
View File
@@ -151,23 +151,31 @@ def _fixed_temperature_for_model(
return None
# Default auxiliary models for direct API-key providers (cheap/fast for side tasks)
_API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = {
"gemini": "gemini-3-flash-preview",
"zai": "glm-4.5-flash",
"kimi-coding": "kimi-k2-turbo-preview",
"stepfun": "step-3.5-flash",
"kimi-coding-cn": "kimi-k2-turbo-preview",
"gmi": "google/gemini-3.1-flash-lite-preview",
"minimax": "MiniMax-M2.7",
"minimax-cn": "MiniMax-M2.7",
def _get_aux_model_for_provider(provider_id: str) -> str:
"""Return the cheap auxiliary model for a provider.
Reads from ProviderProfile.default_aux_model first, falling back to the
legacy hardcoded dict for providers that predate the profiles system.
"""
try:
from providers import get_provider_profile
_p = get_provider_profile(provider_id)
if _p and _p.default_aux_model:
return _p.default_aux_model
except Exception:
pass
return _API_KEY_PROVIDER_AUX_MODELS_FALLBACK.get(provider_id, "")
# Fallback for providers not yet migrated to ProviderProfile.default_aux_model.
# New providers should set default_aux_model on their profile instead.
_API_KEY_PROVIDER_AUX_MODELS_FALLBACK: Dict[str, str] = {
"anthropic": "claude-haiku-4-5-20251001",
"ai-gateway": "google/gemini-3-flash",
"opencode-zen": "gemini-3-flash",
"opencode-go": "glm-5",
"kilocode": "google/gemini-3-flash-preview",
"ollama-cloud": "nemotron-3-nano:30b",
}
# Legacy alias — callers that haven't been updated yet can still use this.
_API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = _API_KEY_PROVIDER_AUX_MODELS_FALLBACK
# Vision-specific model overrides for direct providers.
# When the user's main provider has a dedicated vision/multimodal model that
# differs from their main chat model, map it here. The vision auto-detect
@@ -868,7 +876,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
base_url = _to_openai_base_url(
_pool_runtime_base_url(entry, pconfig.inference_base_url) or pconfig.inference_base_url
)
model = _API_KEY_PROVIDER_AUX_MODELS.get(provider_id)
model = _get_aux_model_for_provider(provider_id) or None
if model is None:
continue # skip provider if we don't know a valid aux model
logger.debug("Auxiliary text client: %s (%s) via pool", pconfig.name, model)
@@ -877,14 +885,22 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
if is_native_gemini_base_url(base_url):
return GeminiNativeClient(api_key=api_key, base_url=base_url), model
extra = {}
if base_url_host_matches(base_url, "api.kimi.com"):
extra["default_headers"] = {"User-Agent": "claude-code/0.1.0"}
elif base_url_host_matches(base_url, "api.githubcopilot.com"):
from hermes_cli.models import copilot_default_headers
extra = {}
if base_url_host_matches(base_url, "api.kimi.com"):
extra["default_headers"] = {"User-Agent": "claude-code/0.1.0"}
elif base_url_host_matches(base_url, "api.githubcopilot.com"):
from hermes_cli.models import copilot_default_headers
extra["default_headers"] = copilot_default_headers()
return OpenAI(api_key=api_key, base_url=base_url, **extra), model
extra["default_headers"] = copilot_default_headers()
else:
try:
from providers import get_provider_profile as _gpf_aux
_ph_aux = _gpf_aux(provider_id)
if _ph_aux and _ph_aux.default_headers:
extra["default_headers"] = dict(_ph_aux.default_headers)
except Exception:
pass
return OpenAI(api_key=api_key, base_url=base_url, **extra), model
creds = resolve_api_key_provider_credentials(provider_id)
api_key = str(creds.get("api_key", "")).strip()
@@ -894,7 +910,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
base_url = _to_openai_base_url(
str(creds.get("base_url", "")).strip().rstrip("/") or pconfig.inference_base_url
)
model = _API_KEY_PROVIDER_AUX_MODELS.get(provider_id)
model = _get_aux_model_for_provider(provider_id) or None
if model is None:
continue # skip provider if we don't know a valid aux model
logger.debug("Auxiliary text client: %s (%s)", pconfig.name, model)
@@ -910,6 +926,14 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
from hermes_cli.models import copilot_default_headers
extra["default_headers"] = copilot_default_headers()
else:
try:
from providers import get_provider_profile as _gpf_aux2
_ph_aux2 = _gpf_aux2(provider_id)
if _ph_aux2 and _ph_aux2.default_headers:
extra["default_headers"] = dict(_ph_aux2.default_headers)
except Exception:
pass
return OpenAI(api_key=api_key, base_url=base_url, **extra), model
return None, None
@@ -1258,7 +1282,7 @@ def _try_anthropic() -> Tuple[Optional[Any], Optional[str]]:
from agent.anthropic_adapter import _is_oauth_token
is_oauth = _is_oauth_token(token)
model = _API_KEY_PROVIDER_AUX_MODELS.get("anthropic", "claude-haiku-4-5-20251001")
model = _get_aux_model_for_provider("anthropic") or "claude-haiku-4-5-20251001"
logger.debug("Auxiliary client: Anthropic native (%s) at %s (oauth=%s)", model, base_url, is_oauth)
try:
real_client = build_anthropic_client(token, base_url)
@@ -1642,7 +1666,7 @@ def _to_async_client(sync_client, model: str, is_vision: bool = False):
except ImportError:
pass
try:
from agent.copilot_acp_client import CopilotACPClient
from acp_adapter.copilot_client import CopilotACPClient
if isinstance(sync_client, CopilotACPClient):
return sync_client, model
except ImportError:
@@ -1986,7 +2010,7 @@ def resolve_provider_client(
str(creds.get("base_url", "")).strip().rstrip("/") or pconfig.inference_base_url
)
default_model = _API_KEY_PROVIDER_AUX_MODELS.get(provider, "")
default_model = _get_aux_model_for_provider(provider)
final_model = _normalize_resolved_model(model or default_model, provider)
if provider == "gemini":
@@ -2056,7 +2080,7 @@ def resolve_provider_client(
"process credentials are incomplete"
)
return None, None
from agent.copilot_acp_client import CopilotACPClient
from acp_adapter.copilot_client import CopilotACPClient
client = CopilotACPClient(
api_key=api_key,
+50
View File
@@ -340,6 +340,8 @@ class ContextCompressor(ContextEngine):
self._last_summary_error = None
self._last_summary_dropped_count = 0
self._last_summary_fallback_used = False
self._last_aux_model_failure_error = None
self._last_aux_model_failure_model = None
self._last_compression_savings_pct = 100.0
self._ineffective_compression_count = 0
@@ -448,6 +450,12 @@ class ContextCompressor(ContextEngine):
# (gateway hygiene, /compress) can surface a visible warning.
self._last_summary_dropped_count: int = 0
self._last_summary_fallback_used: bool = False
# When a user-configured summary model fails and we recover by
# retrying on the main model, record the failure so gateway /
# CLI callers can still warn the user even though compression
# succeeded. Silent recovery would hide the broken config.
self._last_aux_model_failure_error: Optional[str] = None
self._last_aux_model_failure_model: Optional[str] = None
def update_from_response(self, usage: Dict[str, Any]):
"""Update tracked token usage from API response."""
@@ -907,10 +915,50 @@ The user has requested that this compaction PRIORITISE preserving all informatio
"Falling back to main model '%s' for compression.",
self.summary_model, e, self.model,
)
# Record the aux-model failure so callers can warn the user
# even if the retry-on-main succeeds — a misconfigured aux
# model is something the user needs to fix.
_err_text = str(e).strip() or e.__class__.__name__
if len(_err_text) > 220:
_err_text = _err_text[:217].rstrip() + "..."
self._last_aux_model_failure_error = _err_text
self._last_aux_model_failure_model = self.summary_model
self.summary_model = "" # empty = use main model
self._summary_failure_cooldown_until = 0.0 # no cooldown
return self._generate_summary(turns_to_summarize, focus_topic=focus_topic) # retry immediately
# Unknown-error best-effort retry on main model. Losing N turns of
# context is almost always worse than one extra summary attempt, so
# if we haven't already fallen back and the summary model differs
# from the main model, try once more on main before entering
# cooldown. Errors that DID match _is_model_not_found above are
# already handled by the fast-path retry; this branch catches
# everything else (400s, provider-specific "no route" strings,
# aggregator rejections, etc.) where auto-retry is still safer
# than dropping the turns.
if (
self.summary_model
and self.summary_model != self.model
and not getattr(self, "_summary_model_fallen_back", False)
):
self._summary_model_fallen_back = True
logging.warning(
"Summary model '%s' failed (%s). "
"Retrying on main model '%s' before giving up.",
self.summary_model, e, self.model,
)
# Record the aux-model failure (see 404 branch above) — user
# should know their configured model is broken even if main
# recovers the call.
_err_text = str(e).strip() or e.__class__.__name__
if len(_err_text) > 220:
_err_text = _err_text[:217].rstrip() + "..."
self._last_aux_model_failure_error = _err_text
self._last_aux_model_failure_model = self.summary_model
self.summary_model = "" # empty = use main model
self._summary_failure_cooldown_until = 0.0
return self._generate_summary(turns_to_summarize, focus_topic=focus_topic)
# Transient errors (timeout, rate limit, network) — shorter cooldown
_transient_cooldown = 60
self._summary_failure_cooldown_until = time.monotonic() + _transient_cooldown
@@ -1208,6 +1256,8 @@ The user has requested that this compaction PRIORITISE preserving all informatio
self._last_summary_dropped_count = 0
self._last_summary_fallback_used = False
self._last_summary_error = None
self._last_aux_model_failure_error = None
self._last_aux_model_failure_model = None
n_messages = len(messages)
# Only need head + 3 tail messages minimum (token budget decides the real tail size)
_min_for_compress = self.protect_first_n + 3 + 1
+5 -643
View File
@@ -1,646 +1,8 @@
"""OpenAI-compatible shim that forwards Hermes requests to `copilot --acp`.
"""Backward-compatibility shim.
This adapter lets Hermes treat the GitHub Copilot ACP server as a chat-style
backend. Each request starts a short-lived ACP session, sends the formatted
conversation as a single prompt, collects text chunks, and converts the result
back into the minimal shape Hermes expects from an OpenAI client.
CopilotACPClient has moved to acp_adapter/copilot_client.py.
This module re-exports it so existing callers continue to work.
"""
from acp_adapter.copilot_client import CopilotACPClient # noqa: F401
from __future__ import annotations
import json
import os
import queue
import re
import shlex
import subprocess
import threading
import time
from collections import deque
from pathlib import Path
from types import SimpleNamespace
from typing import Any
from agent.file_safety import get_read_block_error, is_write_denied
from agent.redact import redact_sensitive_text
ACP_MARKER_BASE_URL = "acp://copilot"
_DEFAULT_TIMEOUT_SECONDS = 900.0
_TOOL_CALL_BLOCK_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)
_TOOL_CALL_JSON_RE = re.compile(r"\{\s*\"id\"\s*:\s*\"[^\"]+\"\s*,\s*\"type\"\s*:\s*\"function\"\s*,\s*\"function\"\s*:\s*\{.*?\}\s*\}", re.DOTALL)
def _resolve_command() -> str:
return (
os.getenv("HERMES_COPILOT_ACP_COMMAND", "").strip()
or os.getenv("COPILOT_CLI_PATH", "").strip()
or "copilot"
)
def _resolve_args() -> list[str]:
raw = os.getenv("HERMES_COPILOT_ACP_ARGS", "").strip()
if not raw:
return ["--acp", "--stdio"]
return shlex.split(raw)
def _resolve_home_dir() -> str:
"""Return a stable HOME for child ACP processes."""
try:
from hermes_constants import get_subprocess_home
profile_home = get_subprocess_home()
if profile_home:
return profile_home
except Exception:
pass
home = os.environ.get("HOME", "").strip()
if home:
return home
expanded = os.path.expanduser("~")
if expanded and expanded != "~":
return expanded
try:
import pwd
resolved = pwd.getpwuid(os.getuid()).pw_dir.strip()
if resolved:
return resolved
except Exception:
pass
# Last resort: /tmp (writable on any POSIX system). Avoids crashing the
# subprocess with no HOME; callers can set HERMES_HOME explicitly if they
# need a different writable dir.
return "/tmp"
def _build_subprocess_env() -> dict[str, str]:
env = os.environ.copy()
env["HOME"] = _resolve_home_dir()
return env
def _jsonrpc_error(message_id: Any, code: int, message: str) -> dict[str, Any]:
return {
"jsonrpc": "2.0",
"id": message_id,
"error": {
"code": code,
"message": message,
},
}
def _permission_denied(message_id: Any) -> dict[str, Any]:
return {
"jsonrpc": "2.0",
"id": message_id,
"result": {
"outcome": {
"outcome": "cancelled",
}
},
}
def _format_messages_as_prompt(
messages: list[dict[str, Any]],
model: str | None = None,
tools: list[dict[str, Any]] | None = None,
tool_choice: Any = None,
) -> str:
sections: list[str] = [
"You are being used as the active ACP agent backend for Hermes.",
"Use ACP capabilities to complete tasks.",
"IMPORTANT: If you take an action with a tool, you MUST output tool calls using <tool_call>{...}</tool_call> blocks with JSON exactly in OpenAI function-call shape.",
"If no tool is needed, answer normally.",
]
if model:
sections.append(f"Hermes requested model hint: {model}")
if isinstance(tools, list) and tools:
tool_specs: list[dict[str, Any]] = []
for t in tools:
if not isinstance(t, dict):
continue
fn = t.get("function") or {}
if not isinstance(fn, dict):
continue
name = fn.get("name")
if not isinstance(name, str) or not name.strip():
continue
tool_specs.append(
{
"name": name.strip(),
"description": fn.get("description", ""),
"parameters": fn.get("parameters", {}),
}
)
if tool_specs:
sections.append(
"Available tools (OpenAI function schema). "
"When using a tool, emit ONLY <tool_call>{...}</tool_call> with one JSON object "
"containing id/type/function{name,arguments}. arguments must be a JSON string.\n"
+ json.dumps(tool_specs, ensure_ascii=False)
)
if tool_choice is not None:
sections.append(f"Tool choice hint: {json.dumps(tool_choice, ensure_ascii=False)}")
transcript: list[str] = []
for message in messages:
if not isinstance(message, dict):
continue
role = str(message.get("role") or "unknown").strip().lower()
if role == "tool":
role = "tool"
elif role not in {"system", "user", "assistant"}:
role = "context"
content = message.get("content")
rendered = _render_message_content(content)
if not rendered:
continue
label = {
"system": "System",
"user": "User",
"assistant": "Assistant",
"tool": "Tool",
"context": "Context",
}.get(role, role.title())
transcript.append(f"{label}:\n{rendered}")
if transcript:
sections.append("Conversation transcript:\n\n" + "\n\n".join(transcript))
sections.append("Continue the conversation from the latest user request.")
return "\n\n".join(section.strip() for section in sections if section and section.strip())
def _render_message_content(content: Any) -> str:
if content is None:
return ""
if isinstance(content, str):
return content.strip()
if isinstance(content, dict):
if "text" in content:
return str(content.get("text") or "").strip()
if "content" in content and isinstance(content.get("content"), str):
return str(content.get("content") or "").strip()
return json.dumps(content, ensure_ascii=True)
if isinstance(content, list):
parts: list[str] = []
for item in content:
if isinstance(item, str):
parts.append(item)
elif isinstance(item, dict):
text = item.get("text")
if isinstance(text, str) and text.strip():
parts.append(text.strip())
return "\n".join(parts).strip()
return str(content).strip()
def _extract_tool_calls_from_text(text: str) -> tuple[list[SimpleNamespace], str]:
if not isinstance(text, str) or not text.strip():
return [], ""
extracted: list[SimpleNamespace] = []
consumed_spans: list[tuple[int, int]] = []
def _try_add_tool_call(raw_json: str) -> None:
try:
obj = json.loads(raw_json)
except Exception:
return
if not isinstance(obj, dict):
return
fn = obj.get("function")
if not isinstance(fn, dict):
return
fn_name = fn.get("name")
if not isinstance(fn_name, str) or not fn_name.strip():
return
fn_args = fn.get("arguments", "{}")
if not isinstance(fn_args, str):
fn_args = json.dumps(fn_args, ensure_ascii=False)
call_id = obj.get("id")
if not isinstance(call_id, str) or not call_id.strip():
call_id = f"acp_call_{len(extracted)+1}"
extracted.append(
SimpleNamespace(
id=call_id,
call_id=call_id,
response_item_id=None,
type="function",
function=SimpleNamespace(name=fn_name.strip(), arguments=fn_args),
)
)
for m in _TOOL_CALL_BLOCK_RE.finditer(text):
raw = m.group(1)
_try_add_tool_call(raw)
consumed_spans.append((m.start(), m.end()))
# Only try bare-JSON fallback when no XML blocks were found.
if not extracted:
for m in _TOOL_CALL_JSON_RE.finditer(text):
raw = m.group(0)
_try_add_tool_call(raw)
consumed_spans.append((m.start(), m.end()))
if not consumed_spans:
return extracted, text.strip()
consumed_spans.sort()
merged: list[tuple[int, int]] = []
for start, end in consumed_spans:
if not merged or start > merged[-1][1]:
merged.append((start, end))
else:
merged[-1] = (merged[-1][0], max(merged[-1][1], end))
parts: list[str] = []
cursor = 0
for start, end in merged:
if cursor < start:
parts.append(text[cursor:start])
cursor = max(cursor, end)
if cursor < len(text):
parts.append(text[cursor:])
cleaned = "\n".join(p.strip() for p in parts if p and p.strip()).strip()
return extracted, cleaned
def _ensure_path_within_cwd(path_text: str, cwd: str) -> Path:
candidate = Path(path_text)
if not candidate.is_absolute():
raise PermissionError("ACP file-system paths must be absolute.")
resolved = candidate.resolve()
root = Path(cwd).resolve()
try:
resolved.relative_to(root)
except ValueError as exc:
raise PermissionError(f"Path '{resolved}' is outside the session cwd '{root}'.") from exc
return resolved
class _ACPChatCompletions:
def __init__(self, client: "CopilotACPClient"):
self._client = client
def create(self, **kwargs: Any) -> Any:
return self._client._create_chat_completion(**kwargs)
class _ACPChatNamespace:
def __init__(self, client: "CopilotACPClient"):
self.completions = _ACPChatCompletions(client)
class CopilotACPClient:
"""Minimal OpenAI-client-compatible facade for Copilot ACP."""
def __init__(
self,
*,
api_key: str | None = None,
base_url: str | None = None,
default_headers: dict[str, str] | None = None,
acp_command: str | None = None,
acp_args: list[str] | None = None,
acp_cwd: str | None = None,
command: str | None = None,
args: list[str] | None = None,
**_: Any,
):
self.api_key = api_key or "copilot-acp"
self.base_url = base_url or ACP_MARKER_BASE_URL
self._default_headers = dict(default_headers or {})
self._acp_command = acp_command or command or _resolve_command()
self._acp_args = list(acp_args or args or _resolve_args())
self._acp_cwd = str(Path(acp_cwd or os.getcwd()).resolve())
self.chat = _ACPChatNamespace(self)
self.is_closed = False
self._active_process: subprocess.Popen[str] | None = None
self._active_process_lock = threading.Lock()
def close(self) -> None:
proc: subprocess.Popen[str] | None
with self._active_process_lock:
proc = self._active_process
self._active_process = None
self.is_closed = True
if proc is None:
return
try:
proc.terminate()
proc.wait(timeout=2)
except Exception:
try:
proc.kill()
except Exception:
pass
def _create_chat_completion(
self,
*,
model: str | None = None,
messages: list[dict[str, Any]] | None = None,
timeout: float | None = None,
tools: list[dict[str, Any]] | None = None,
tool_choice: Any = None,
**_: Any,
) -> Any:
prompt_text = _format_messages_as_prompt(
messages or [],
model=model,
tools=tools,
tool_choice=tool_choice,
)
# Normalise timeout: run_agent.py may pass an httpx.Timeout object
# (used natively by the OpenAI SDK) rather than a plain float.
if timeout is None:
_effective_timeout = _DEFAULT_TIMEOUT_SECONDS
elif isinstance(timeout, (int, float)):
_effective_timeout = float(timeout)
else:
# httpx.Timeout or similar — pick the largest component so the
# subprocess has enough wall-clock time for the full response.
_candidates = [
getattr(timeout, attr, None)
for attr in ("read", "write", "connect", "pool", "timeout")
]
_numeric = [float(v) for v in _candidates if isinstance(v, (int, float))]
_effective_timeout = max(_numeric) if _numeric else _DEFAULT_TIMEOUT_SECONDS
response_text, reasoning_text = self._run_prompt(
prompt_text,
timeout_seconds=_effective_timeout,
)
tool_calls, cleaned_text = _extract_tool_calls_from_text(response_text)
usage = SimpleNamespace(
prompt_tokens=0,
completion_tokens=0,
total_tokens=0,
prompt_tokens_details=SimpleNamespace(cached_tokens=0),
)
assistant_message = SimpleNamespace(
content=cleaned_text,
tool_calls=tool_calls,
reasoning=reasoning_text or None,
reasoning_content=reasoning_text or None,
reasoning_details=None,
)
finish_reason = "tool_calls" if tool_calls else "stop"
choice = SimpleNamespace(message=assistant_message, finish_reason=finish_reason)
return SimpleNamespace(
choices=[choice],
usage=usage,
model=model or "copilot-acp",
)
def _run_prompt(self, prompt_text: str, *, timeout_seconds: float) -> tuple[str, str]:
try:
proc = subprocess.Popen(
[self._acp_command] + self._acp_args,
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
bufsize=1,
cwd=self._acp_cwd,
env=_build_subprocess_env(),
)
except FileNotFoundError as exc:
raise RuntimeError(
f"Could not start Copilot ACP command '{self._acp_command}'. "
"Install GitHub Copilot CLI or set HERMES_COPILOT_ACP_COMMAND/COPILOT_CLI_PATH."
) from exc
if proc.stdin is None or proc.stdout is None:
proc.kill()
raise RuntimeError("Copilot ACP process did not expose stdin/stdout pipes.")
self.is_closed = False
with self._active_process_lock:
self._active_process = proc
inbox: queue.Queue[dict[str, Any]] = queue.Queue()
stderr_tail: deque[str] = deque(maxlen=40)
def _stdout_reader() -> None:
if proc.stdout is None:
return
for line in proc.stdout:
try:
inbox.put(json.loads(line))
except Exception:
inbox.put({"raw": line.rstrip("\n")})
def _stderr_reader() -> None:
if proc.stderr is None:
return
for line in proc.stderr:
stderr_tail.append(line.rstrip("\n"))
out_thread = threading.Thread(target=_stdout_reader, daemon=True)
err_thread = threading.Thread(target=_stderr_reader, daemon=True)
out_thread.start()
err_thread.start()
next_id = 0
def _request(method: str, params: dict[str, Any], *, text_parts: list[str] | None = None, reasoning_parts: list[str] | None = None) -> Any:
nonlocal next_id
next_id += 1
request_id = next_id
payload = {
"jsonrpc": "2.0",
"id": request_id,
"method": method,
"params": params,
}
proc.stdin.write(json.dumps(payload) + "\n")
proc.stdin.flush()
deadline = time.time() + timeout_seconds
while time.time() < deadline:
if proc.poll() is not None:
break
try:
msg = inbox.get(timeout=0.1)
except queue.Empty:
continue
if self._handle_server_message(
msg,
process=proc,
cwd=self._acp_cwd,
text_parts=text_parts,
reasoning_parts=reasoning_parts,
):
continue
if msg.get("id") != request_id:
continue
if "error" in msg:
err = msg.get("error") or {}
raise RuntimeError(
f"Copilot ACP {method} failed: {err.get('message') or err}"
)
return msg.get("result")
stderr_text = "\n".join(stderr_tail).strip()
if proc.poll() is not None and stderr_text:
raise RuntimeError(f"Copilot ACP process exited early: {stderr_text}")
raise TimeoutError(f"Timed out waiting for Copilot ACP response to {method}.")
try:
_request(
"initialize",
{
"protocolVersion": 1,
"clientCapabilities": {
"fs": {
"readTextFile": True,
"writeTextFile": True,
}
},
"clientInfo": {
"name": "hermes-agent",
"title": "Hermes Agent",
"version": "0.0.0",
},
},
)
session = _request(
"session/new",
{
"cwd": self._acp_cwd,
"mcpServers": [],
},
) or {}
session_id = str(session.get("sessionId") or "").strip()
if not session_id:
raise RuntimeError("Copilot ACP did not return a sessionId.")
text_parts: list[str] = []
reasoning_parts: list[str] = []
_request(
"session/prompt",
{
"sessionId": session_id,
"prompt": [
{
"type": "text",
"text": prompt_text,
}
],
},
text_parts=text_parts,
reasoning_parts=reasoning_parts,
)
return "".join(text_parts), "".join(reasoning_parts)
finally:
self.close()
def _handle_server_message(
self,
msg: dict[str, Any],
*,
process: subprocess.Popen[str],
cwd: str,
text_parts: list[str] | None,
reasoning_parts: list[str] | None,
) -> bool:
method = msg.get("method")
if not isinstance(method, str):
return False
if method == "session/update":
params = msg.get("params") or {}
update = params.get("update") or {}
kind = str(update.get("sessionUpdate") or "").strip()
content = update.get("content") or {}
chunk_text = ""
if isinstance(content, dict):
chunk_text = str(content.get("text") or "")
if kind == "agent_message_chunk" and chunk_text and text_parts is not None:
text_parts.append(chunk_text)
elif kind == "agent_thought_chunk" and chunk_text and reasoning_parts is not None:
reasoning_parts.append(chunk_text)
return True
if process.stdin is None:
return True
message_id = msg.get("id")
params = msg.get("params") or {}
if method == "session/request_permission":
response = _permission_denied(message_id)
elif method == "fs/read_text_file":
try:
path = _ensure_path_within_cwd(str(params.get("path") or ""), cwd)
block_error = get_read_block_error(str(path))
if block_error:
raise PermissionError(block_error)
content = path.read_text() if path.exists() else ""
line = params.get("line")
limit = params.get("limit")
if isinstance(line, int) and line > 1:
lines = content.splitlines(keepends=True)
start = line - 1
end = start + limit if isinstance(limit, int) and limit > 0 else None
content = "".join(lines[start:end])
if content:
content = redact_sensitive_text(content)
response = {
"jsonrpc": "2.0",
"id": message_id,
"result": {
"content": content,
},
}
except Exception as exc:
response = _jsonrpc_error(message_id, -32602, str(exc))
elif method == "fs/write_text_file":
try:
path = _ensure_path_within_cwd(str(params.get("path") or ""), cwd)
if is_write_denied(str(path)):
raise PermissionError(
f"Write denied: '{path}' is a protected system/credential file."
)
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(str(params.get("content") or ""))
response = {
"jsonrpc": "2.0",
"id": message_id,
"result": None,
}
except Exception as exc:
response = _jsonrpc_error(message_id, -32602, str(exc))
else:
response = _jsonrpc_error(
message_id,
-32601,
f"ACP client method '{method}' is not supported by Hermes yet.",
)
process.stdin.write(json.dumps(response) + "\n")
process.stdin.flush()
return True
__all__ = ["CopilotACPClient"]
+11
View File
@@ -313,6 +313,17 @@ _URL_TO_PROVIDER: Dict[str, str] = {
"ollama.com": "ollama-cloud",
}
# Auto-extend with hostnames derived from provider profiles.
# Any provider with a base_url not already in the map gets added automatically.
try:
from providers import list_providers as _list_providers
for _pp in _list_providers():
_host = _pp.get_hostname()
if _host and _host not in _URL_TO_PROVIDER:
_URL_TO_PROVIDER[_host] = _pp.name
except Exception:
pass
def _infer_provider_from_url(base_url: str) -> Optional[str]:
"""Infer the models.dev provider name from a base URL.
+13 -1
View File
@@ -6,9 +6,16 @@ Usage:
result = transport.normalize_response(raw_response)
"""
from agent.transports.types import NormalizedResponse, ToolCall, Usage, build_tool_call, map_finish_reason # noqa: F401
from agent.transports.types import (
NormalizedResponse,
ToolCall,
Usage,
build_tool_call,
map_finish_reason,
) # noqa: F401
_REGISTRY: dict = {}
_discovered: bool = False
def register_transport(api_mode: str, transport_cls: type) -> None:
@@ -23,6 +30,9 @@ def get_transport(api_mode: str):
This allows gradual migration — call sites can check for None
and fall back to the legacy code path.
"""
global _discovered
if not _discovered:
_discover_transports()
cls = _REGISTRY.get(api_mode)
if cls is None:
# The registry can be partially populated when a specific transport
@@ -38,6 +48,8 @@ def get_transport(api_mode: str):
def _discover_transports() -> None:
"""Import all transport modules to trigger auto-registration."""
global _discovered
_discovered = True
try:
import agent.transports.anthropic # noqa: F401
except ImportError:
+160 -152
View File
@@ -10,7 +10,7 @@ reasoning configuration, temperature handling, and extra_body assembly.
"""
import copy
from typing import Any, Dict, List, Optional
from typing import Any
from agent.moonshot_schema import is_moonshot_model, sanitize_moonshot_tools
from agent.prompt_builder import DEVELOPER_ROLE_MODELS
@@ -28,7 +28,9 @@ class ChatCompletionsTransport(ProviderTransport):
def api_mode(self) -> str:
return "chat_completions"
def convert_messages(self, messages: List[Dict[str, Any]], **kwargs) -> List[Dict[str, Any]]:
def convert_messages(
self, messages: list[dict[str, Any]], **kwargs
) -> list[dict[str, Any]]:
"""Messages are already in OpenAI format — sanitize Codex leaks only.
Strips Codex Responses API fields (``codex_reasoning_items`` /
@@ -45,7 +47,9 @@ class ChatCompletionsTransport(ProviderTransport):
tool_calls = msg.get("tool_calls")
if isinstance(tool_calls, list):
for tc in tool_calls:
if isinstance(tc, dict) and ("call_id" in tc or "response_item_id" in tc):
if isinstance(tc, dict) and (
"call_id" in tc or "response_item_id" in tc
):
needs_sanitize = True
break
if needs_sanitize:
@@ -68,76 +72,52 @@ class ChatCompletionsTransport(ProviderTransport):
tc.pop("response_item_id", None)
return sanitized
def convert_tools(self, tools: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
def convert_tools(self, tools: list[dict[str, Any]]) -> list[dict[str, Any]]:
"""Tools are already in OpenAI format — identity."""
return tools
def build_kwargs(
self,
model: str,
messages: List[Dict[str, Any]],
tools: Optional[List[Dict[str, Any]]] = None,
messages: list[dict[str, Any]],
tools: list[dict[str, Any]] | None = None,
**params,
) -> Dict[str, Any]:
) -> dict[str, Any]:
"""Build chat.completions.create() kwargs.
This is the most complex transport method — it handles ~16 providers
via params rather than subclasses.
params:
params (all optional):
timeout: float — API call timeout
max_tokens: int | None — user-configured max tokens
ephemeral_max_output_tokens: int | None — one-shot override (error recovery)
ephemeral_max_output_tokens: int | None — one-shot override
max_tokens_param_fn: callable — returns {max_tokens: N} or {max_completion_tokens: N}
reasoning_config: dict | None
request_overrides: dict | None
session_id: str | None
qwen_session_metadata: dict | None — {sessionId, promptId} precomputed
model_lower: str — lowercase model name for pattern matching
# Provider detection flags (all optional, default False)
is_openrouter: bool
is_nous: bool
is_qwen_portal: bool
is_github_models: bool
is_nvidia_nim: bool
is_kimi: bool
is_custom_provider: bool
ollama_num_ctx: int | None
# Provider routing
provider_preferences: dict | None
# Qwen-specific
qwen_prepare_fn: callable | None — runs AFTER codex sanitization
qwen_prepare_inplace_fn: callable | None — in-place variant for deepcopied lists
# Temperature
fixed_temperature: Any — from _fixed_temperature_for_model()
omit_temperature: bool
# Reasoning
# Provider profile path (all per-provider quirks live in providers/)
provider_profile: ProviderProfile | None — when present, delegates to
_build_kwargs_from_profile(); all flag params below are bypassed.
# Remaining flags — only used by the legacy fallback for unregistered
# providers (i.e. get_provider_profile() returned None). Known
# providers all go through provider_profile.
qwen_session_metadata: dict | None
supports_reasoning: bool
github_reasoning_extra: dict | None
# Claude on OpenRouter/Nous max output
anthropic_max_output: int | None
# Extra
extra_body_additions: dict | None — pre-built extra_body entries
extra_body_additions: dict | None
"""
# Codex sanitization: drop reasoning_items / call_id / response_item_id
sanitized = self.convert_messages(messages)
# Qwen portal prep AFTER codex sanitization. If sanitize already
# deepcopied, reuse that copy via the in-place variant to avoid a
# second deepcopy.
is_qwen = params.get("is_qwen_portal", False)
if is_qwen:
qwen_prep = params.get("qwen_prepare_fn")
qwen_prep_inplace = params.get("qwen_prepare_inplace_fn")
if sanitized is messages:
if qwen_prep is not None:
sanitized = qwen_prep(sanitized)
else:
# Already deepcopied — transform in place
if qwen_prep_inplace is not None:
qwen_prep_inplace(sanitized)
elif qwen_prep is not None:
sanitized = qwen_prep(sanitized)
# ── Provider profile: single-path when present ──────────────────
_profile = params.get("provider_profile")
if _profile:
return self._build_kwargs_from_profile(
_profile, model, sanitized, tools, params
)
# ── Legacy fallback (unregistered / unknown provider) ───────────
# Reached only when get_provider_profile() returned None.
# Known providers always go through the profile path above.
# Developer role swap for GPT-5/Codex models
model_lower = params.get("model_lower", (model or "").lower())
@@ -150,7 +130,7 @@ class ChatCompletionsTransport(ProviderTransport):
sanitized = list(sanitized)
sanitized[0] = {**sanitized[0], "role": "developer"}
api_kwargs: Dict[str, Any] = {
api_kwargs: dict[str, Any] = {
"model": model,
"messages": sanitized,
}
@@ -159,19 +139,6 @@ class ChatCompletionsTransport(ProviderTransport):
if timeout is not None:
api_kwargs["timeout"] = timeout
# Temperature
fixed_temp = params.get("fixed_temperature")
omit_temp = params.get("omit_temperature", False)
if omit_temp:
api_kwargs.pop("temperature", None)
elif fixed_temp is not None:
api_kwargs["temperature"] = fixed_temp
# Qwen metadata (caller precomputes {sessionId, promptId})
qwen_meta = params.get("qwen_session_metadata")
if qwen_meta and is_qwen:
api_kwargs["metadata"] = qwen_meta
# Tools
if tools:
# Moonshot/Kimi uses a stricter flavored JSON Schema. Rewriting
@@ -186,96 +153,24 @@ class ChatCompletionsTransport(ProviderTransport):
ephemeral = params.get("ephemeral_max_output_tokens")
max_tokens = params.get("max_tokens")
anthropic_max_out = params.get("anthropic_max_output")
is_nvidia_nim = params.get("is_nvidia_nim", False)
is_kimi = params.get("is_kimi", False)
reasoning_config = params.get("reasoning_config")
if ephemeral is not None and max_tokens_fn:
api_kwargs.update(max_tokens_fn(ephemeral))
elif max_tokens is not None and max_tokens_fn:
api_kwargs.update(max_tokens_fn(max_tokens))
elif is_nvidia_nim and max_tokens_fn:
api_kwargs.update(max_tokens_fn(16384))
elif is_qwen and max_tokens_fn:
api_kwargs.update(max_tokens_fn(65536))
elif is_kimi and max_tokens_fn:
# Kimi/Moonshot: 32000 matches Kimi CLI's default
api_kwargs.update(max_tokens_fn(32000))
elif anthropic_max_out is not None:
api_kwargs["max_tokens"] = anthropic_max_out
# Kimi: top-level reasoning_effort (unless thinking disabled)
if is_kimi:
_kimi_thinking_off = bool(
reasoning_config
and isinstance(reasoning_config, dict)
and reasoning_config.get("enabled") is False
)
if not _kimi_thinking_off:
_kimi_effort = "medium"
if reasoning_config and isinstance(reasoning_config, dict):
_e = (reasoning_config.get("effort") or "").strip().lower()
if _e in ("low", "medium", "high"):
_kimi_effort = _e
api_kwargs["reasoning_effort"] = _kimi_effort
# extra_body assembly
extra_body: Dict[str, Any] = {}
extra_body: dict[str, Any] = {}
is_openrouter = params.get("is_openrouter", False)
is_nous = params.get("is_nous", False)
is_github_models = params.get("is_github_models", False)
provider_prefs = params.get("provider_preferences")
if provider_prefs and is_openrouter:
extra_body["provider"] = provider_prefs
# Kimi extra_body.thinking
if is_kimi:
_kimi_thinking_enabled = True
if reasoning_config and isinstance(reasoning_config, dict):
if reasoning_config.get("enabled") is False:
_kimi_thinking_enabled = False
extra_body["thinking"] = {
"type": "enabled" if _kimi_thinking_enabled else "disabled",
}
# Reasoning
# Generic reasoning passthrough for unknown providers
if params.get("supports_reasoning", False):
if is_github_models:
gh_reasoning = params.get("github_reasoning_extra")
if gh_reasoning is not None:
extra_body["reasoning"] = gh_reasoning
reasoning_config = params.get("reasoning_config")
if reasoning_config is not None:
extra_body["reasoning"] = dict(reasoning_config)
else:
if reasoning_config is not None:
rc = dict(reasoning_config)
if is_nous and rc.get("enabled") is False:
pass # omit for Nous when disabled
else:
extra_body["reasoning"] = rc
else:
extra_body["reasoning"] = {"enabled": True, "effort": "medium"}
if is_nous:
extra_body["tags"] = ["product=hermes-agent"]
# Ollama num_ctx
ollama_ctx = params.get("ollama_num_ctx")
if ollama_ctx:
options = extra_body.get("options", {})
options["num_ctx"] = ollama_ctx
extra_body["options"] = options
# Ollama/custom think=false
if params.get("is_custom_provider", False):
if reasoning_config and isinstance(reasoning_config, dict):
_effort = (reasoning_config.get("effort") or "").strip().lower()
_enabled = reasoning_config.get("enabled", True)
if _effort == "none" or _enabled is False:
extra_body["think"] = False
if is_qwen:
extra_body["vl_high_resolution_images"] = True
extra_body["reasoning"] = {"enabled": True, "effort": "medium"}
# Merge any pre-built extra_body additions
additions = params.get("extra_body_additions")
@@ -292,6 +187,117 @@ class ChatCompletionsTransport(ProviderTransport):
return api_kwargs
def _build_kwargs_from_profile(self, profile, model, sanitized, tools, params):
"""Build API kwargs using a ProviderProfile — single path, no legacy flags.
This method replaces the entire flag-based kwargs assembly when a
provider_profile is passed. Every quirk comes from the profile object.
"""
from providers.base import OMIT_TEMPERATURE
# Message preprocessing
sanitized = profile.prepare_messages(sanitized)
# Developer role swap — model-name-based, applies to all providers
_model_lower = (model or "").lower()
if (
sanitized
and isinstance(sanitized[0], dict)
and sanitized[0].get("role") == "system"
and any(p in _model_lower for p in DEVELOPER_ROLE_MODELS)
):
sanitized = list(sanitized)
sanitized[0] = {**sanitized[0], "role": "developer"}
api_kwargs: dict[str, Any] = {
"model": model,
"messages": sanitized,
}
# Temperature
if profile.fixed_temperature is OMIT_TEMPERATURE:
pass # Don't include temperature at all
elif profile.fixed_temperature is not None:
api_kwargs["temperature"] = profile.fixed_temperature
else:
# Use caller's temperature if provided
temp = params.get("temperature")
if temp is not None:
api_kwargs["temperature"] = temp
# Timeout
timeout = params.get("timeout")
if timeout is not None:
api_kwargs["timeout"] = timeout
# Tools — apply Moonshot/Kimi schema sanitization regardless of path
if tools:
if is_moonshot_model(model):
tools = sanitize_moonshot_tools(tools)
api_kwargs["tools"] = tools
# max_tokens resolution — priority: ephemeral > user > profile default
max_tokens_fn = params.get("max_tokens_param_fn")
ephemeral = params.get("ephemeral_max_output_tokens")
user_max = params.get("max_tokens")
anthropic_max = params.get("anthropic_max_output")
if ephemeral is not None and max_tokens_fn:
api_kwargs.update(max_tokens_fn(ephemeral))
elif user_max is not None and max_tokens_fn:
api_kwargs.update(max_tokens_fn(user_max))
elif profile.default_max_tokens and max_tokens_fn:
api_kwargs.update(max_tokens_fn(profile.default_max_tokens))
elif anthropic_max is not None:
api_kwargs["max_tokens"] = anthropic_max
# Provider-specific api_kwargs extras (reasoning_effort, metadata, etc.)
reasoning_config = params.get("reasoning_config")
extra_body_from_profile, top_level_from_profile = (
profile.build_api_kwargs_extras(
reasoning_config=reasoning_config,
supports_reasoning=params.get("supports_reasoning", False),
qwen_session_metadata=params.get("qwen_session_metadata"),
model=model,
ollama_num_ctx=params.get("ollama_num_ctx"),
)
)
api_kwargs.update(top_level_from_profile)
# extra_body assembly
extra_body: dict[str, Any] = {}
# Profile's extra_body (tags, provider prefs, vl_high_resolution, etc.)
profile_body = profile.build_extra_body(
session_id=params.get("session_id"),
provider_preferences=params.get("provider_preferences"),
)
if profile_body:
extra_body.update(profile_body)
# Profile's reasoning/thinking extra_body entries
if extra_body_from_profile:
extra_body.update(extra_body_from_profile)
# Merge any pre-built extra_body additions from the caller
additions = params.get("extra_body_additions")
if additions:
extra_body.update(additions)
# Request overrides (user config)
overrides = params.get("request_overrides")
if overrides:
for k, v in overrides.items():
if k == "extra_body" and isinstance(v, dict):
extra_body.update(v)
else:
api_kwargs[k] = v
if extra_body:
api_kwargs["extra_body"] = extra_body
return api_kwargs
def normalize_response(self, response: Any, **kwargs) -> NormalizedResponse:
"""Normalize OpenAI ChatCompletion to NormalizedResponse.
@@ -313,7 +319,7 @@ class ChatCompletionsTransport(ProviderTransport):
# Gemini 3 thinking models attach extra_content with
# thought_signature — without replay on the next turn the API
# rejects the request with 400.
tc_provider_data: Dict[str, Any] = {}
tc_provider_data: dict[str, Any] = {}
extra = getattr(tc, "extra_content", None)
if extra is None and hasattr(tc, "model_extra"):
extra = (tc.model_extra or {}).get("extra_content")
@@ -324,12 +330,14 @@ class ChatCompletionsTransport(ProviderTransport):
except Exception:
pass
tc_provider_data["extra_content"] = extra
tool_calls.append(ToolCall(
id=tc.id,
name=tc.function.name,
arguments=tc.function.arguments,
provider_data=tc_provider_data or None,
))
tool_calls.append(
ToolCall(
id=tc.id,
name=tc.function.name,
arguments=tc.function.arguments,
provider_data=tc_provider_data or None,
)
)
usage = None
if hasattr(response, "usage") and response.usage:
@@ -347,7 +355,7 @@ class ChatCompletionsTransport(ProviderTransport):
reasoning = getattr(msg, "reasoning", None)
reasoning_content = getattr(msg, "reasoning_content", None)
provider_data: Dict[str, Any] = {}
provider_data: dict[str, Any] = {}
if reasoning_content:
provider_data["reasoning_content"] = reasoning_content
rd = getattr(msg, "reasoning_details", None)
@@ -373,7 +381,7 @@ class ChatCompletionsTransport(ProviderTransport):
return False
return True
def extract_cache_stats(self, response: Any) -> Optional[Dict[str, int]]:
def extract_cache_stats(self, response: Any) -> dict[str, int] | None:
"""Extract OpenRouter/OpenAI cache stats from prompt_tokens_details."""
usage = getattr(response, "usage", None)
if usage is None:
+15 -14
View File
@@ -12,7 +12,7 @@ from __future__ import annotations
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional
from typing import Any
@dataclass
@@ -32,10 +32,10 @@ class ToolCall:
* Others: ``None``
"""
id: Optional[str]
id: str | None
name: str
arguments: str # JSON string
provider_data: Optional[Dict[str, Any]] = field(default=None, repr=False)
provider_data: dict[str, Any] | None = field(default=None, repr=False)
# ── Backward compatibility ──────────────────────────────────
# The agent loop reads tc.function.name / tc.function.arguments
@@ -47,17 +47,17 @@ class ToolCall:
return "function"
@property
def function(self) -> "ToolCall":
def function(self) -> ToolCall:
"""Return self so tc.function.name / tc.function.arguments work."""
return self
@property
def call_id(self) -> Optional[str]:
def call_id(self) -> str | None:
"""Codex call_id from provider_data, accessed via getattr by _build_assistant_message."""
return (self.provider_data or {}).get("call_id")
@property
def response_item_id(self) -> Optional[str]:
def response_item_id(self) -> str | None:
"""Codex response_item_id from provider_data."""
return (self.provider_data or {}).get("response_item_id")
@@ -101,18 +101,18 @@ class NormalizedResponse:
* Others: ``None``
"""
content: Optional[str]
tool_calls: Optional[List[ToolCall]]
content: str | None
tool_calls: list[ToolCall] | None
finish_reason: str # "stop", "tool_calls", "length", "content_filter"
reasoning: Optional[str] = None
usage: Optional[Usage] = None
provider_data: Optional[Dict[str, Any]] = field(default=None, repr=False)
reasoning: str | None = None
usage: Usage | None = None
provider_data: dict[str, Any] | None = field(default=None, repr=False)
# ── Backward compatibility ──────────────────────────────────
# The shim _nr_to_assistant_message() mapped these from provider_data.
# These properties let NormalizedResponse pass through directly.
@property
def reasoning_content(self) -> Optional[str]:
def reasoning_content(self) -> str | None:
pd = self.provider_data or {}
return pd.get("reasoning_content")
@@ -136,8 +136,9 @@ class NormalizedResponse:
# Factory helpers
# ---------------------------------------------------------------------------
def build_tool_call(
id: Optional[str],
id: str | None,
name: str,
arguments: Any,
**provider_fields: Any,
@@ -151,7 +152,7 @@ def build_tool_call(
return ToolCall(id=id, name=name, arguments=args_str, provider_data=pd)
def map_finish_reason(reason: Optional[str], mapping: Dict[str, str]) -> str:
def map_finish_reason(reason: str | None, mapping: dict[str, str]) -> str:
"""Translate a provider-specific stop reason to the normalised set.
Falls back to ``"stop"`` for unknown or ``None`` reasons.
+36
View File
@@ -4828,6 +4828,30 @@ class GatewayRunner:
"Failed to deliver compression-failure warning to user: %s",
_werr,
)
# Separately: if the user's CONFIGURED aux
# model failed and we recovered by falling
# back to the main model, tell them — a
# misconfigured auxiliary.compression.model
# is something only they can fix, and
# silent recovery would hide it.
elif _comp is not None and getattr(_comp, "_last_aux_model_failure_model", None):
_aux_model = getattr(_comp, "_last_aux_model_failure_model", "")
_aux_err = getattr(_comp, "_last_aux_model_failure_error", None) or "unknown error"
_aux_msg = (
f"️ Configured compression model `{_aux_model}` "
f"failed ({_aux_err}). Recovered using your main "
"model — context is intact — but you may want to "
"check `auxiliary.compression.model` in config.yaml."
)
try:
_adapter = self.adapters.get(source.platform)
if _adapter and source.chat_id:
await _adapter.send(source.chat_id, _aux_msg, metadata=_hyg_meta)
except Exception as _werr:
logger.warning(
"Failed to deliver aux-model-fallback notice to user: %s",
_werr,
)
finally:
self._cleanup_agent_resources(_hyg_agent)
@@ -7377,6 +7401,11 @@ class GatewayRunner:
_summary_failed = bool(getattr(compressor, "_last_summary_fallback_used", False))
_dropped_count = int(getattr(compressor, "_last_summary_dropped_count", 0) or 0)
_summary_err = getattr(compressor, "_last_summary_error", None)
# Separately: did the user's CONFIGURED aux model fail
# and we recovered via main? Surface that as an info
# note so they can fix their config.
_aux_fail_model = getattr(compressor, "_last_aux_model_failure_model", None)
_aux_fail_err = getattr(compressor, "_last_aux_model_failure_error", None)
finally:
self._cleanup_agent_resources(tmp_agent)
lines = [f"🗜️ {summary['headline']}"]
@@ -7392,6 +7421,13 @@ class GatewayRunner:
"with a placeholder; earlier context is no longer recoverable. "
"Consider checking your auxiliary.compression model configuration."
)
elif _aux_fail_model:
lines.append(
f"️ Configured compression model `{_aux_fail_model}` failed "
f"({_aux_fail_err or 'unknown error'}). Recovered using your main "
"model — context is intact — but you may want to check "
"`auxiliary.compression.model` in config.yaml."
)
return "\n".join(lines)
except Exception as e:
logger.warning("Manual compress failed: %s", e)
+42
View File
@@ -374,6 +374,37 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
),
}
# Auto-extend PROVIDER_REGISTRY with any api-key provider registered in
# providers/ that is not already declared above. New providers only need a
# providers/*.py file — no edits to this file required.
try:
from providers import list_providers as _list_providers_for_registry
for _pp in _list_providers_for_registry():
if _pp.name in PROVIDER_REGISTRY:
continue
if _pp.auth_type != "api_key" or not _pp.env_vars:
continue
# Skip providers that need custom token resolution (copilot, kimi, zai)
# — those are already fully declared above.
if _pp.name in {"copilot", "kimi-coding", "kimi-coding-cn", "zai"}:
continue
_api_key_vars = tuple(v for v in _pp.env_vars if not v.endswith("_BASE_URL") and not v.endswith("_URL"))
_base_url_var = next((v for v in _pp.env_vars if v.endswith("_BASE_URL") or v.endswith("_URL")), None)
PROVIDER_REGISTRY[_pp.name] = ProviderConfig(
id=_pp.name,
name=_pp.display_name or _pp.name,
auth_type="api_key",
inference_base_url=_pp.base_url,
api_key_env_vars=_api_key_vars or _pp.env_vars,
base_url_env_var=_base_url_var or "",
)
# Also register aliases so resolve_provider() resolves them
for _alias in _pp.aliases:
if _alias not in PROVIDER_REGISTRY:
PROVIDER_REGISTRY[_alias] = PROVIDER_REGISTRY[_pp.name]
except Exception:
pass
# =============================================================================
# Anthropic Key Helper
@@ -1150,6 +1181,17 @@ def resolve_provider(
"vllm": "custom", "llamacpp": "custom",
"llama.cpp": "custom", "llama-cpp": "custom",
}
# Extend with aliases declared in providers/*.py that aren't already mapped.
# This keeps providers/ as the single source for new aliases while the
# hardcoded dict above remains authoritative for existing ones.
try:
from providers import list_providers as _lp
for _pp in _lp():
for _alias in _pp.aliases:
if _alias not in _PROVIDER_ALIASES:
_PROVIDER_ALIASES[_alias] = _pp.name
except Exception:
pass
normalized = _PROVIDER_ALIASES.get(normalized, normalized)
if normalized == "openrouter":
+42
View File
@@ -4252,3 +4252,45 @@ def config_command(args):
print(" hermes config path Show config file path")
print(" hermes config env-path Show .env file path")
sys.exit(1)
# ── Profile-driven env var injection ─────────────────────────────────────────
# Any provider registered in providers/ with auth_type="api_key" automatically
# gets its env_vars exposed in OPTIONAL_ENV_VARS without editing this file.
# Runs once at import time.
_profile_env_vars_injected = False
def _inject_profile_env_vars() -> None:
"""Populate OPTIONAL_ENV_VARS from provider profiles not already listed.
Called once at module load time. Idempotent — repeated calls are no-ops.
"""
global _profile_env_vars_injected
if _profile_env_vars_injected:
return
_profile_env_vars_injected = True
try:
from providers import list_providers
for _pp in list_providers():
if _pp.auth_type not in ("api_key",):
continue
for _var in _pp.env_vars:
if _var in OPTIONAL_ENV_VARS:
continue
_is_key = not _var.endswith("_BASE_URL") and not _var.endswith("_URL")
OPTIONAL_ENV_VARS[_var] = {
"description": f"{_pp.display_name or _pp.name} {'API key' if _is_key else 'base URL override'}",
"prompt": f"{_pp.display_name or _pp.name} {'API key' if _is_key else 'base URL (leave empty for default)'}",
"url": _pp.signup_url or None,
"password": _is_key,
"category": "provider",
"advanced": True,
}
except Exception:
pass
# Eagerly inject so that OPTIONAL_ENV_VARS is fully populated at import time.
_inject_profile_env_vars()
+83 -21
View File
@@ -164,6 +164,84 @@ def _check_gateway_service_linger(issues: list[str]) -> None:
check_warn("Could not verify systemd linger", f"({linger_detail})")
_APIKEY_PROVIDERS_CACHE: list | None = None
def _build_apikey_providers_list() -> list:
"""Build the API-key provider health-check list once and cache it.
Tuple format: (name, env_vars, default_url, base_env, supports_models_endpoint)
Base list augmented with any ProviderProfile with auth_type="api_key" not
already present adding providers/*.py is sufficient to get into doctor.
"""
_static = [
("Z.AI / GLM", ("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"), "https://api.z.ai/api/paas/v4/models", "GLM_BASE_URL", True),
("Kimi / Moonshot", ("KIMI_API_KEY",), "https://api.moonshot.ai/v1/models", "KIMI_BASE_URL", True),
("StepFun Step Plan", ("STEPFUN_API_KEY",), "https://api.stepfun.ai/step_plan/v1/models", "STEPFUN_BASE_URL", True),
("Kimi / Moonshot (China)", ("KIMI_CN_API_KEY",), "https://api.moonshot.cn/v1/models", None, True),
("Arcee AI", ("ARCEEAI_API_KEY",), "https://api.arcee.ai/api/v1/models", "ARCEE_BASE_URL", True),
("GMI Cloud", ("GMI_API_KEY",), "https://api.gmi-serving.com/v1/models", "GMI_BASE_URL", True),
("DeepSeek", ("DEEPSEEK_API_KEY",), "https://api.deepseek.com/v1/models", "DEEPSEEK_BASE_URL", True),
("Hugging Face", ("HF_TOKEN",), "https://router.huggingface.co/v1/models", "HF_BASE_URL", True),
("NVIDIA NIM", ("NVIDIA_API_KEY",), "https://integrate.api.nvidia.com/v1/models", "NVIDIA_BASE_URL", True),
("Alibaba/DashScope", ("DASHSCOPE_API_KEY",), "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/models", "DASHSCOPE_BASE_URL", True),
# MiniMax: the /anthropic endpoint doesn't support /models; use the /v1 surface.
("MiniMax", ("MINIMAX_API_KEY",), "https://api.minimax.io/v1/models", "MINIMAX_BASE_URL", True),
("MiniMax (China)", ("MINIMAX_CN_API_KEY",), "https://api.minimaxi.com/v1/models", "MINIMAX_CN_BASE_URL", True),
("Vercel AI Gateway", ("AI_GATEWAY_API_KEY",), "https://ai-gateway.vercel.sh/v1/models", "AI_GATEWAY_BASE_URL", True),
("Kilo Code", ("KILOCODE_API_KEY",), "https://api.kilo.ai/api/gateway/models", "KILOCODE_BASE_URL", True),
("OpenCode Zen", ("OPENCODE_ZEN_API_KEY",), "https://opencode.ai/zen/v1/models", "OPENCODE_ZEN_BASE_URL", True),
# OpenCode Go has no shared /models endpoint; skip the health check.
("OpenCode Go", ("OPENCODE_GO_API_KEY",), None, "OPENCODE_GO_BASE_URL", False),
]
_known_names = {t[0] for t in _static}
# Also index by profile canonical name so profiles without display_name
# don't create duplicate entries for providers already in the static list.
_known_canonical: set[str] = set()
_name_to_canonical = {
"Z.AI / GLM": "zai", "Kimi / Moonshot": "kimi-coding",
"StepFun Step Plan": "stepfun", "Kimi / Moonshot (China)": "kimi-coding-cn",
"Arcee AI": "arcee", "GMI Cloud": "gmi", "DeepSeek": "deepseek",
"Hugging Face": "huggingface", "NVIDIA NIM": "nvidia",
"Alibaba/DashScope": "alibaba", "MiniMax": "minimax",
"MiniMax (China)": "minimax-cn", "Vercel AI Gateway": "ai-gateway",
"Kilo Code": "kilocode", "OpenCode Zen": "opencode-zen",
"OpenCode Go": "opencode-go",
}
for _label, _canonical in _name_to_canonical.items():
_known_canonical.add(_canonical)
try:
from providers import list_providers
from providers.base import ProviderProfile as _PP
for _pp in list_providers():
if not isinstance(_pp, _PP) or _pp.auth_type != "api_key" or not _pp.env_vars:
continue
_label = _pp.display_name or _pp.name
if _label in _known_names or _pp.name in _known_canonical:
continue
# Separate API-key vars from base-URL override vars — the health-check
# loop sends the first found value as Authorization: Bearer, so a URL
# string must never be picked.
_key_vars = tuple(
v for v in _pp.env_vars
if not v.endswith("_BASE_URL") and not v.endswith("_URL")
)
_base_var = next(
(v for v in _pp.env_vars if v.endswith("_BASE_URL") or v.endswith("_URL")),
None,
)
if not _key_vars:
continue
_models_url = (
(_pp.models_url or (_pp.base_url.rstrip("/") + "/models"))
if _pp.base_url else None
)
_static.append((_label, _key_vars, _models_url, _base_var, True))
except Exception:
pass
return _static
def run_doctor(args):
"""Run diagnostic checks."""
should_fix = getattr(args, 'fix', False)
@@ -931,27 +1009,11 @@ def run_doctor(args):
# -- API-key providers --
# Tuple: (name, env_vars, default_url, base_env, supports_models_endpoint)
# If supports_models_endpoint is False, we skip the health check and just show "configured"
_apikey_providers = [
("Z.AI / GLM", ("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"), "https://api.z.ai/api/paas/v4/models", "GLM_BASE_URL", True),
("Kimi / Moonshot", ("KIMI_API_KEY",), "https://api.moonshot.ai/v1/models", "KIMI_BASE_URL", True),
("StepFun Step Plan", ("STEPFUN_API_KEY",), "https://api.stepfun.ai/step_plan/v1/models", "STEPFUN_BASE_URL", True),
("Kimi / Moonshot (China)", ("KIMI_CN_API_KEY",), "https://api.moonshot.cn/v1/models", None, True),
("Arcee AI", ("ARCEEAI_API_KEY",), "https://api.arcee.ai/api/v1/models", "ARCEE_BASE_URL", True),
("GMI Cloud", ("GMI_API_KEY",), "https://api.gmi-serving.com/v1/models", "GMI_BASE_URL", True),
("DeepSeek", ("DEEPSEEK_API_KEY",), "https://api.deepseek.com/v1/models", "DEEPSEEK_BASE_URL", True),
("Hugging Face", ("HF_TOKEN",), "https://router.huggingface.co/v1/models", "HF_BASE_URL", True),
("NVIDIA NIM", ("NVIDIA_API_KEY",), "https://integrate.api.nvidia.com/v1/models", "NVIDIA_BASE_URL", True),
("Alibaba/DashScope", ("DASHSCOPE_API_KEY",), "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/models", "DASHSCOPE_BASE_URL", True),
# MiniMax: the /anthropic endpoint doesn't support /models, but the /v1 endpoint does.
("MiniMax", ("MINIMAX_API_KEY",), "https://api.minimax.io/v1/models", "MINIMAX_BASE_URL", True),
("MiniMax (China)", ("MINIMAX_CN_API_KEY",), "https://api.minimaxi.com/v1/models", "MINIMAX_CN_BASE_URL", True),
("Vercel AI Gateway", ("AI_GATEWAY_API_KEY",), "https://ai-gateway.vercel.sh/v1/models", "AI_GATEWAY_BASE_URL", True),
("Kilo Code", ("KILOCODE_API_KEY",), "https://api.kilo.ai/api/gateway/models", "KILOCODE_BASE_URL", True),
("OpenCode Zen", ("OPENCODE_ZEN_API_KEY",), "https://opencode.ai/zen/v1/models", "OPENCODE_ZEN_BASE_URL", True),
# OpenCode Go has no shared /models endpoint; skip the health check.
("OpenCode Go", ("OPENCODE_GO_API_KEY",), None, "OPENCODE_GO_BASE_URL", False),
]
# Cached at module level after first build — profiles auto-extend it.
global _APIKEY_PROVIDERS_CACHE
if _APIKEY_PROVIDERS_CACHE is None:
_APIKEY_PROVIDERS_CACHE = _build_apikey_providers_list()
_apikey_providers = _APIKEY_PROVIDERS_CACHE
for _pname, _env_vars, _default_url, _base_env, _supports_health_check in _apikey_providers:
_key = ""
for _ev in _env_vars:
+33 -25
View File
@@ -1528,6 +1528,21 @@ def cmd_model(args):
select_provider_and_model(args=args)
def _is_profile_api_key_provider(provider_id: str) -> bool:
"""Return True when provider_id maps to a profile with auth_type='api_key'.
Used as a catch-all in select_provider_and_model() so that new providers
declared in providers/*.py automatically dispatch to _model_flow_api_key_provider
without requiring an explicit elif branch here.
"""
try:
from providers import get_provider_profile
_p = get_provider_profile(provider_id)
return _p is not None and _p.auth_type == "api_key"
except Exception:
return False
def select_provider_and_model(args=None):
"""Core provider selection + model picking logic.
@@ -1820,7 +1835,7 @@ def select_provider_and_model(args=None):
"gmi",
"nvidia",
"ollama-cloud",
):
) or _is_profile_api_key_provider(selected_provider):
_model_flow_api_key_provider(config, selected_provider, current_model)
# ── Post-switch cleanup: clear stale OPENAI_BASE_URL ──────────────
@@ -7618,6 +7633,22 @@ def cmd_logs(args):
)
def _build_provider_choices() -> list[str]:
"""Build the --provider choices list from CANONICAL_PROVIDERS + 'auto'."""
try:
from hermes_cli.models import CANONICAL_PROVIDERS as _cp
return ["auto"] + [p.slug for p in _cp]
except Exception:
# Fallback: static list guarantees the CLI always works
return [
"auto", "openrouter", "nous", "openai-codex", "copilot-acp", "copilot",
"anthropic", "gemini", "google-gemini-cli", "xai", "bedrock", "azure-foundry",
"ollama-cloud", "huggingface", "zai", "kimi-coding", "kimi-coding-cn",
"stepfun", "minimax", "minimax-cn", "kilocode", "xiaomi", "arcee",
"nvidia", "deepseek", "alibaba", "qwen-oauth", "opencode-zen", "opencode-go",
]
def main():
"""Main entry point for hermes CLI."""
parser = argparse.ArgumentParser(
@@ -7811,30 +7842,7 @@ For more help on a command:
)
chat_parser.add_argument(
"--provider",
choices=[
"auto",
"openrouter",
"nous",
"openai-codex",
"copilot-acp",
"copilot",
"anthropic",
"gemini",
"xai",
"ollama-cloud",
"huggingface",
"zai",
"kimi-coding",
"kimi-coding-cn",
"stepfun",
"minimax",
"minimax-cn",
"kilocode",
"xiaomi",
"arcee",
"gmi",
"nvidia",
],
choices=_build_provider_choices(),
default=None,
help="Inference provider (default: auto)",
)
+47
View File
@@ -750,6 +750,25 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("azure-foundry", "Azure Foundry", "Azure Foundry (OpenAI-style or Anthropic-style endpoint — your Azure AI deployment)"),
]
# Auto-extend CANONICAL_PROVIDERS with any provider registered in providers/
# that is not already in the list above. Adding providers/*.py is sufficient
# to expose a new provider in the model picker, /model, and all downstream
# consumers — no edits to this file needed.
_canonical_slugs = {p.slug for p in CANONICAL_PROVIDERS}
try:
from providers import list_providers as _list_providers_for_canonical
for _pp in _list_providers_for_canonical():
if _pp.name in _canonical_slugs:
continue
if _pp.auth_type in ("oauth_device_code", "oauth_external", "external_process", "aws_sdk", "copilot"):
continue # non-api-key flows need bespoke picker UX; skip auto-inject
_label = _pp.display_name or _pp.name
_desc = _pp.description or f"{_label} (direct API)"
CANONICAL_PROVIDERS.append(ProviderEntry(_pp.name, _label, _desc))
_canonical_slugs.add(_pp.name)
except Exception:
pass
# Derived dicts — used throughout the codebase
_PROVIDER_LABELS = {p.slug: p.label for p in CANONICAL_PROVIDERS}
_PROVIDER_LABELS["custom"] = "Custom endpoint" # special case: not a named provider
@@ -1884,6 +1903,34 @@ def provider_model_ids(provider: Optional[str], *, force_refresh: bool = False)
live = fetch_api_models(api_key, base_url)
if live:
return live
# ── Profile-based generic live fetch (all simple api-key providers) ──
# Handles any provider registered in providers/ with auth_type="api_key".
# Replaces per-provider copy-paste blocks (stepfun, gmi, zai, etc.).
try:
from providers import get_provider_profile
from hermes_cli.auth import resolve_api_key_provider_credentials
_p = get_provider_profile(normalized)
if _p and _p.auth_type == "api_key" and _p.base_url:
try:
creds = resolve_api_key_provider_credentials(normalized)
api_key = str(creds.get("api_key") or "").strip()
base_url = str(creds.get("base_url") or "").strip()
except Exception:
api_key, base_url = "", _p.base_url
if not base_url:
base_url = _p.base_url
if api_key:
live = _p.fetch_models(api_key=api_key)
if live:
return live
# Use profile's fallback_models if defined
if _p.fallback_models:
return list(_p.fallback_models)
except Exception:
pass
curated_static = list(_PROVIDER_MODELS.get(normalized, []))
if normalized in _MODELS_DEV_PREFERRED:
return _merge_with_models_dev(normalized, curated_static)
+14
View File
@@ -79,6 +79,20 @@ VALID_HOOKS: Set[str] = {
# {"action": "allow"} / None -> normal dispatch
# Kwargs: event: MessageEvent, gateway: GatewayRunner, session_store.
"pre_gateway_dispatch",
# Approval lifecycle hooks. Fired by tools/approval.py when a dangerous
# command needs user approval -- fires BOTH for CLI-interactive prompts
# and for gateway/ACP approvals (Telegram, Discord, Slack, TUI, etc.).
# Observers only: return values are ignored. Plugins cannot veto or
# pre-answer an approval from these hooks (use pre_tool_call to block
# a tool before it reaches approval).
#
# Kwargs for pre_approval_request:
# command: str, description: str, pattern_key: str, pattern_keys: list[str],
# session_key: str, surface: "cli" | "gateway"
# Kwargs for post_approval_response: same as above plus
# choice: "once" | "session" | "always" | "deny" | "timeout"
"pre_approval_request",
"post_approval_response",
}
ENTRY_POINTS_GROUP = "hermes_agent.plugins"
+23 -10
View File
@@ -214,10 +214,6 @@ def _resolve_runtime_from_pool_entry(
base_url = cfg_base_url or base_url or "https://api.anthropic.com"
elif provider == "openrouter":
base_url = base_url or OPENROUTER_BASE_URL
elif provider == "xai":
api_mode = "codex_responses"
elif provider == "nous":
api_mode = "chat_completions"
elif provider == "copilot":
api_mode = _copilot_runtime_api_mode(model_cfg, getattr(entry, "runtime_api_key", ""))
base_url = base_url or PROVIDER_REGISTRY["copilot"].inference_base_url
@@ -249,6 +245,14 @@ def _resolve_runtime_from_pool_entry(
base_url = re.sub(r"/v1/?$", "", base_url)
else:
configured_provider = str(model_cfg.get("provider") or "").strip().lower()
# Use profile api_mode for all other known providers
try:
from providers import get_provider_profile
_p = get_provider_profile(provider)
if _p and _p.api_mode:
api_mode = _p.api_mode
except Exception:
pass
# Honour model.base_url from config.yaml when the configured provider
# matches this provider — same pattern as the Anthropic branch above.
# Only override when the pool entry has no explicit base_url (i.e. it
@@ -266,12 +270,21 @@ def _resolve_runtime_from_pool_entry(
from hermes_cli.models import opencode_model_api_mode
api_mode = opencode_model_api_mode(provider, effective_model)
else:
# Auto-detect Anthropic-compatible endpoints (/anthropic suffix,
# Kimi /coding, api.openai.com → codex_responses, api.x.ai →
# codex_responses).
detected = _detect_api_mode_for_url(base_url)
if detected:
api_mode = detected
# Try profile api_mode first, then auto-detect from URL
try:
from providers import get_provider_profile
_p = get_provider_profile(provider)
if _p and _p.api_mode:
api_mode = _p.api_mode
except Exception:
pass
if api_mode == "chat_completions":
# Auto-detect Anthropic-compatible endpoints (/anthropic suffix,
# Kimi /coding, api.openai.com → codex_responses, api.x.ai →
# codex_responses).
detected = _detect_api_mode_for_url(base_url)
if detected:
api_mode = detected
# OpenCode base URLs end with /v1 for OpenAI-compatible models, but the
# Anthropic SDK prepends its own /v1/messages to the base_url. Strip the
+307
View File
@@ -0,0 +1,307 @@
# providers/
Single source of truth for every inference provider Hermes knows about.
Each provider is declared once here as a `ProviderProfile`. Every other layer —
auth resolution, transport kwargs, model listing, runtime routing — reads from
these profiles instead of maintaining its own parallel data.
---
## Directory layout
```
providers/
├── base.py ProviderProfile dataclass + OMIT_TEMPERATURE sentinel
├── __init__.py Registry: register_provider(), get_provider_profile()
├── README.md This file
├── # Simple providers — just identity + auth + endpoint
├── alibaba.py Alibaba Cloud DashScope
├── arcee.py Arcee AI
├── bedrock.py AWS Bedrock (api_mode=bedrock_converse)
├── deepseek.py DeepSeek
├── huggingface.py Hugging Face Inference API
├── kilocode.py Kilo Code
├── minimax.py MiniMax (international + CN)
├── nvidia.py NVIDIA NIM (default_max_tokens=16384)
├── ollama_cloud.py Ollama Cloud
├── stepfun.py StepFun
├── xiaomi.py Xiaomi MiMo
├── xai.py xAI Grok (api_mode=codex_responses)
├── zai.py Z.AI / GLM
├── # Medium — one or two quirks
├── anthropic.py Native Anthropic (x-api-key header, api_mode=anthropic_messages)
├── copilot.py GitHub Copilot (auth_type=copilot, reasoning per model)
├── copilot_acp.py Copilot ACP subprocess (api_mode=copilot_acp)
├── custom.py Custom/Ollama local (think=false, num_ctx)
├── gemini.py Google Gemini AI Studio + Cloud Code OAuth
├── kimi.py Kimi Coding (OMIT_TEMPERATURE, thinking, dual endpoint)
├── openai_codex.py OpenAI Codex OAuth (api_mode=codex_responses)
├── opencode.py OpenCode Zen + Go (per-model api_mode routing)
├── # Complex — subclasses with multiple overrides
├── nous.py Nous Portal (tags, attribution, reasoning omit-when-disabled)
├── openrouter.py OpenRouter (provider preferences, public model fetch)
├── qwen.py Qwen OAuth (message normalization, cache_control, vl_hires)
└── vercel.py Vercel AI Gateway (attribution headers, reasoning passthrough)
```
---
## ProviderProfile fields
```python
@dataclass
class ProviderProfile:
# Identity
name: str # canonical ID — auto-registered as PROVIDER_REGISTRY key for new api-key providers
api_mode: str # "chat_completions" | "anthropic_messages" |
# "codex_responses" | "bedrock_converse" | "copilot_acp"
aliases: tuple # alternate names resolved by get_provider_profile()
# Auth & endpoints
env_vars: tuple # env var names holding the API key, in priority order
base_url: str # default inference endpoint
models_url: str # explicit models endpoint; falls back to {base_url}/models
# set when the models catalog lives at a different URL
# (e.g. OpenRouter: public /api/v1/models vs /api/v1 inference)
auth_type: str # "api_key" | "oauth_device_code" | "oauth_external" |
# "copilot" | "aws" | "external_process"
# Client-level quirks
default_headers: dict # extra HTTP headers sent on every request
# Request-level quirks
fixed_temperature: Any # None = use caller's default; OMIT_TEMPERATURE = don't send
default_max_tokens: int|None # inject max_tokens when caller omits it
default_aux_model: str # cheap model for auxiliary tasks (compression, vision, etc.)
# empty string = use main model (default)
```
---
## Hooks (override in a subclass)
| Method | When to override |
|--------|-----------------|
| `prepare_messages(messages)` | Provider needs message pre-processing (Qwen: string → list-of-parts, cache_control) |
| `build_extra_body(*, session_id, **ctx)` | Provider-specific `extra_body` fields (Nous: tags, OpenRouter: provider preferences) |
| `build_api_kwargs_extras(*, reasoning_config, **ctx)` | Returns `(extra_body_additions, top_level_kwargs)` — use when some fields go to `extra_body` and some go top-level (Kimi: `reasoning_effort` top-level; OpenRouter: `reasoning` in extra_body) |
| `fetch_models(*, api_key, timeout)` | Custom model listing (Anthropic: x-api-key header; OpenRouter: public endpoint, no auth; Bedrock/copilot-acp: return None) |
All hooks have safe defaults — only override what differs from the base.
---
## How to add a new provider
### 1. Simple (standard OpenAI-compatible endpoint)
```python
# providers/myprovider.py
from providers import register_provider
from providers.base import ProviderProfile
myprovider = ProviderProfile(
name="myprovider", # must match id in hermes_cli/auth.py PROVIDER_REGISTRY
aliases=("my-provider", "myp"),
api_mode="chat_completions",
env_vars=("MYPROVIDER_API_KEY",),
base_url="https://api.myprovider.com/v1",
auth_type="api_key",
)
register_provider(myprovider)
```
The default `fetch_models()` will call `GET https://api.myprovider.com/v1/models`
with Bearer auth automatically. No override needed for standard `/v1/models`.
### 2. With quirks (subclass)
```python
# providers/myprovider.py
from typing import Any
from providers import register_provider
from providers.base import ProviderProfile
class MyProviderProfile(ProviderProfile):
"""My provider — custom reasoning header."""
def build_api_kwargs_extras(
self,
*,
reasoning_config: dict | None = None,
**ctx: Any,
) -> tuple[dict[str, Any], dict[str, Any]]:
extra_body: dict[str, Any] = {}
if reasoning_config:
extra_body["my_reasoning"] = reasoning_config.get("effort", "medium")
return extra_body, {}
def fetch_models(
self,
*,
api_key: str | None = None,
timeout: float = 8.0,
) -> list[str] | None:
# Override only if your endpoint differs from standard /v1/models
return super().fetch_models(api_key=api_key, timeout=timeout)
myprovider = MyProviderProfile(
name="myprovider",
aliases=("myp",),
env_vars=("MYPROVIDER_API_KEY",),
base_url="https://api.myprovider.com/v1",
)
register_provider(myprovider)
```
### 3. Wire it up
After creating the file, add `name` to the `_PROFILE_ACTIVE_PROVIDERS` set in
`run_agent.py` once you've verified parity against the legacy flag path. Start
with a simple provider (no message prep, no reasoning quirks) and work up.
---
## fetch_models contract
```python
def fetch_models(
self,
*,
api_key: str | None = None,
timeout: float = 8.0,
) -> list[str] | None:
...
```
- Returns `list[str]`: model IDs from the provider's live endpoint.
- Returns `None`: provider doesn't support REST model listing (Bedrock, copilot-acp),
or the request failed. Callers **must** fall back to `_PROVIDER_MODELS` on `None`.
- Never raises — swallow exceptions and return `None`.
- Default implementation: `GET {base_url}/models` with Bearer auth. Works for any
standard OpenAI-compatible provider.
**Override when:**
- Auth header is not `Bearer` (Anthropic: `x-api-key`)
- Endpoint path differs from `/models` AND you can't just set `models_url` (OpenRouter: public endpoint, pass `api_key=None` explicitly)
- Response format differs (extra wrapping, non-standard `id` field)
- Provider has no REST endpoint (Bedrock, copilot-acp → return `None`)
- Filtering needed post-fetch (only tool-capable models, etc.)
Use `models_url` instead of overriding when the only difference is the URL:
```python
# No subclass needed — just set models_url
myprovider = ProviderProfile(
name="myprovider",
base_url="https://api.myprovider.com/v1",
models_url="https://catalog.myprovider.com/models", # different host
)
```
---
## Debugging
### Check if a provider resolves
```python
from providers import get_provider_profile
p = get_provider_profile("myprovider")
print(p) # ProviderProfile(name='myprovider', ...)
print(p.base_url)
print(p.api_mode)
```
### Check all registered providers
```python
from providers import _REGISTRY
print(list(_REGISTRY.keys()))
```
### Test live model fetch
```python
import os
from providers import get_provider_profile
p = get_provider_profile("myprovider")
key = os.getenv("MYPROVIDER_API_KEY")
models = p.fetch_models(api_key=key, timeout=5.0)
print(models) # list of model IDs, or None on failure
```
### Test alias resolution
```python
from providers import get_provider_profile
# All of these should return the same profile
assert get_provider_profile("openrouter").name == "openrouter"
assert get_provider_profile("or").name == "openrouter"
```
### Run the provider test suite
```bash
# From the repo root
source venv/bin/activate
python -m pytest tests/providers/ -v
```
### Check ruff + ty compliance
```bash
source venv/bin/activate
ruff format providers/*.py
ruff check providers/*.py --select UP,E,F,I,W
ty check providers/*.py
```
---
## Common mistakes
**Wrong `name`** — must be the same string that appears as the key in
`hermes_cli/auth.py` `PROVIDER_REGISTRY`. New api-key providers auto-register
into `PROVIDER_REGISTRY` from the profile, so the name IS the key. For providers
with a pre-existing `PROVIDER_REGISTRY` entry, use the exact `id` field value.
**Wrong `env_vars`** — separate API-key vars from base-URL override vars in the
tuple. Env vars that end with `_BASE_URL` or `_URL` are treated as URL overrides;
everything else is treated as an API key. Getting this wrong causes the doctor
health check to send a URL string as a Bearer token.
**Wrong `base_url`** — several providers have non-obvious paths:
`stepfun: /step_plan/v1`, `opencode-go: /zen/go/v1`. The profile's `base_url`
is also used as the `inference_base_url` when auto-registering into `PROVIDER_REGISTRY`
for new providers, so it must be correct for auth resolution to work.
**Skipping `api_mode`** — defaults to `chat_completions`. Providers that use
`anthropic_messages`, `codex_responses`, `bedrock_converse`, or `copilot_acp`
must set it explicitly.
**Forgetting `register_provider()`** — auto-discovery runs `pkgutil.iter_modules`
over the package and imports each module, but only if `register_provider()` is
called at module level. Without it the profile is never in `_REGISTRY`.
**`fetch_models` returning the wrong shape** — must return `list[str]` (plain
model IDs), not `list[tuple]` or `list[dict]`. Callers expect plain strings.
**Wrong `build_api_kwargs_extras` return shape** — must return a 2-tuple
`(extra_body_dict, top_level_dict)`. Returning a single dict causes a
`ValueError: not enough values to unpack` in the transport.
**`build_api_kwargs_extras` wrong tuple** — must return `(extra_body_dict,
top_level_dict)`. Returning a flat dict or swapping the order silently sends
fields to the wrong place.
+76
View File
@@ -0,0 +1,76 @@
"""Provider module registry.
Auto-discovers ProviderProfile instances from providers/*.py modules.
Each module should define a module-level PROVIDER or PROVIDERS list.
Usage:
from providers import get_provider_profile
profile = get_provider_profile("nvidia") # returns ProviderProfile or None
profile = get_provider_profile("kimi") # checks name + aliases
"""
from __future__ import annotations
from providers.base import OMIT_TEMPERATURE, ProviderProfile # noqa: F401
_REGISTRY: dict[str, ProviderProfile] = {}
_ALIASES: dict[str, str] = {}
_discovered = False
def register_provider(profile: ProviderProfile) -> None:
"""Register a provider profile by name and aliases."""
_REGISTRY[profile.name] = profile
for alias in profile.aliases:
_ALIASES[alias] = profile.name
def get_provider_profile(name: str) -> ProviderProfile | None:
"""Look up a provider profile by name or alias.
Returns None if the provider has no profile (falls back to generic).
"""
if not _discovered:
_discover_providers()
canonical = _ALIASES.get(name, name)
return _REGISTRY.get(canonical)
def list_providers() -> list[ProviderProfile]:
"""Return all registered provider profiles (one per canonical name)."""
if not _discovered:
_discover_providers()
# Deduplicate: _REGISTRY has canonical names; _ALIASES points to same objects
seen: set[int] = set()
result: list[ProviderProfile] = []
for profile in _REGISTRY.values():
pid = id(profile)
if pid not in seen:
seen.add(pid)
result.append(profile)
return result
def _discover_providers() -> None:
"""Import all provider modules to trigger registration."""
global _discovered
if _discovered:
return
_discovered = True
import importlib
import pkgutil
import providers as _pkg
for _importer, modname, _ispkg in pkgutil.iter_modules(_pkg.__path__):
if modname.startswith("_") or modname == "base":
continue
try:
importlib.import_module(f"providers.{modname}")
except ImportError as e:
import logging
logging.getLogger(__name__).warning(
"Failed to import provider module %s: %s", modname, e
)
+13
View File
@@ -0,0 +1,13 @@
"""Alibaba Cloud DashScope provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
alibaba = ProviderProfile(
name="alibaba",
aliases=("dashscope", "alibaba-cloud", "qwen-dashscope"),
env_vars=("DASHSCOPE_API_KEY",),
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)
register_provider(alibaba)
+52
View File
@@ -0,0 +1,52 @@
"""Native Anthropic provider profile."""
import json
import logging
import urllib.request
from providers import register_provider
from providers.base import ProviderProfile
logger = logging.getLogger(__name__)
class AnthropicProfile(ProviderProfile):
"""Native Anthropic — uses x-api-key header, not Bearer."""
def fetch_models(
self,
*,
api_key: str | None = None,
timeout: float = 8.0,
) -> list[str] | None:
"""Anthropic uses x-api-key header and anthropic-version."""
if not api_key:
return None
try:
req = urllib.request.Request("https://api.anthropic.com/v1/models")
req.add_header("x-api-key", api_key)
req.add_header("anthropic-version", "2023-06-01")
req.add_header("Accept", "application/json")
with urllib.request.urlopen(req, timeout=timeout) as resp:
data = json.loads(resp.read().decode())
return [
m["id"]
for m in data.get("data", [])
if isinstance(m, dict) and "id" in m
]
except Exception as exc:
logger.debug("fetch_models(anthropic): %s", exc)
return None
anthropic = AnthropicProfile(
name="anthropic",
aliases=("claude", "claude-oauth", "claude-code"),
api_mode="anthropic_messages",
env_vars=("ANTHROPIC_API_KEY", "ANTHROPIC_TOKEN", "CLAUDE_CODE_OAUTH_TOKEN"),
base_url="https://api.anthropic.com",
auth_type="api_key",
default_aux_model="claude-haiku-4-5-20251001",
)
register_provider(anthropic)
+13
View File
@@ -0,0 +1,13 @@
"""Arcee AI provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
arcee = ProviderProfile(
name="arcee",
aliases=("arcee-ai", "arceeai"),
env_vars=("ARCEEAI_API_KEY",),
base_url="https://api.arcee.ai/api/v1",
)
register_provider(arcee)
+165
View File
@@ -0,0 +1,165 @@
"""Provider profile base class.
A ProviderProfile declares everything about an inference provider in one place:
auth, endpoints, client quirks, request-time quirks. The transport reads this
instead of receiving 20+ boolean flags.
Provider profiles are DECLARATIVE they describe the provider's behavior.
They do NOT own client construction, credential rotation, or streaming.
Those stay on AIAgent.
"""
from __future__ import annotations
import logging
from dataclasses import dataclass, field
from typing import Any
logger = logging.getLogger(__name__)
# Sentinel for "omit temperature entirely" (Kimi: server manages it)
OMIT_TEMPERATURE = object()
@dataclass
class ProviderProfile:
"""Base provider profile — subclass or instantiate with overrides."""
# ── Identity ─────────────────────────────────────────────
name: str
api_mode: str = "chat_completions"
aliases: tuple = ()
# ── Human-readable metadata ───────────────────────────────
display_name: str = "" # e.g. "GMI Cloud" — shown in picker/labels
description: str = "" # e.g. "GMI Cloud (multi-model direct API)" — picker subtitle
signup_url: str = "" # e.g. "https://www.gmicloud.ai/" — shown during setup
# ── Auth & endpoints ─────────────────────────────────────
env_vars: tuple = ()
base_url: str = ""
models_url: str = "" # explicit models endpoint; falls back to {base_url}/models
auth_type: str = "api_key" # api_key|oauth_device_code|oauth_external|copilot|aws_sdk
# ── Model catalog ─────────────────────────────────────────
# fallback_models: curated list shown in /model picker when live fetch fails.
# Only agentic models that support tool calling should appear here.
fallback_models: tuple = ()
# hostname: base hostname for URL→provider reverse-mapping in model_metadata.py
# e.g. "api.gmi-serving.com". Derived from base_url when empty.
hostname: str = ""
# ── Client-level quirks (set once at client construction) ─
default_headers: dict[str, str] = field(default_factory=dict)
# ── Request-level quirks ─────────────────────────────────
# Temperature: None = use caller's default, OMIT_TEMPERATURE = don't send
fixed_temperature: Any = None
default_max_tokens: int | None = None
default_aux_model: str = (
"" # cheap model for auxiliary tasks (compression, vision, etc.)
)
# empty = use main model
# ── Hooks (override in subclass for complex providers) ───
def get_hostname(self) -> str:
"""Return the provider's base hostname for URL-based detection.
Uses self.hostname if set explicitly, otherwise derives it from base_url.
e.g. 'https://api.gmi-serving.com/v1' 'api.gmi-serving.com'
"""
if self.hostname:
return self.hostname
if self.base_url:
from urllib.parse import urlparse
return urlparse(self.base_url).hostname or ""
return ""
def prepare_messages(self, messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
"""Provider-specific message preprocessing.
Called AFTER codex field sanitization, BEFORE developer role swap.
Default: pass-through.
"""
return messages
def build_extra_body(
self, *, session_id: str | None = None, **context: Any
) -> dict[str, Any]:
"""Provider-specific extra_body fields.
Merged into the API kwargs extra_body. Default: empty dict.
"""
return {}
def build_api_kwargs_extras(
self,
*,
reasoning_config: dict | None = None,
**context: Any,
) -> tuple[dict[str, Any], dict[str, Any]]:
"""Provider-specific kwargs split between extra_body and top-level api_kwargs.
Returns (extra_body_additions, top_level_kwargs).
The transport merges extra_body_additions into extra_body, and
top_level_kwargs directly into api_kwargs.
This split exists because some providers put reasoning config in
extra_body (OpenRouter: extra_body.reasoning) while others put it
as top-level api_kwargs (Kimi: api_kwargs.reasoning_effort).
Default: ({}, {}).
"""
return {}, {}
def fetch_models(
self,
*,
api_key: str | None = None,
timeout: float = 8.0,
) -> list[str] | None:
"""Fetch the live model list from the provider's models endpoint.
Returns a list of model ID strings, or None if the fetch failed or
the provider does not support live model listing.
Resolution order for the endpoint URL:
1. self.models_url (explicit override use when the models
endpoint differs from the inference base URL, e.g. OpenRouter
exposes a public catalog at /api/v1/models while inference is
at /api/v1)
2. self.base_url + "/models" (standard OpenAI-compat fallback)
The default implementation sends Bearer auth when api_key is given
and forwards self.default_headers. Override to customise auth, path,
response shape, or to return None for providers with no REST catalog.
Callers must always fall back to the static _PROVIDER_MODELS list
when this returns None.
"""
url = (self.models_url or "").strip()
if not url:
if not self.base_url:
return None
url = self.base_url.rstrip("/") + "/models"
import json
import urllib.request
req = urllib.request.Request(url)
if api_key:
req.add_header("Authorization", f"Bearer {api_key}")
req.add_header("Accept", "application/json")
for k, v in self.default_headers.items():
req.add_header(k, v)
try:
with urllib.request.urlopen(req, timeout=timeout) as resp:
data = json.loads(resp.read().decode())
items = data if isinstance(data, list) else data.get("data", [])
return [m["id"] for m in items if isinstance(m, dict) and "id" in m]
except Exception as exc:
logger.debug("fetch_models(%s): %s", self.name, exc)
return None
+29
View File
@@ -0,0 +1,29 @@
"""AWS Bedrock provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
class BedrockProfile(ProviderProfile):
"""AWS Bedrock — no REST /v1/models endpoint; uses AWS SDK."""
def fetch_models(
self,
*,
api_key: str | None = None,
timeout: float = 8.0,
) -> list[str] | None:
"""Bedrock model listing requires AWS SDK, not a REST call."""
return None
bedrock = BedrockProfile(
name="bedrock",
aliases=("aws", "aws-bedrock", "amazon-bedrock", "amazon"),
api_mode="bedrock_converse",
env_vars=(), # AWS SDK credentials — not env vars
base_url="https://bedrock-runtime.us-east-1.amazonaws.com",
auth_type="aws_sdk",
)
register_provider(bedrock)
+58
View File
@@ -0,0 +1,58 @@
"""Copilot / GitHub Models provider profile.
Copilot uses per-model api_mode routing:
- GPT-5+ / Codex models codex_responses
- Claude models anthropic_messages
- Everything else chat_completions (this profile covers that subset)
Key quirks for the chat_completions subset:
- Editor attribution headers (via copilot_default_headers())
- GitHub Models reasoning extra_body (model-catalog gated)
"""
from typing import Any
from providers import register_provider
from providers.base import ProviderProfile
class CopilotProfile(ProviderProfile):
"""GitHub Copilot / GitHub Models — editor headers + reasoning."""
def build_api_kwargs_extras(
self,
*,
model: str | None = None,
reasoning_config: dict | None = None,
supports_reasoning: bool = False,
**ctx,
) -> tuple[dict[str, Any], dict[str, Any]]:
extra_body: dict[str, Any] = {}
if supports_reasoning and model:
try:
from hermes_cli.models import github_model_reasoning_efforts
supported_efforts = github_model_reasoning_efforts(model)
if supported_efforts and reasoning_config:
effort = reasoning_config.get("effort", "medium")
# Normalize non-standard effort levels to the nearest supported
if effort == "xhigh":
effort = "high"
if effort in supported_efforts:
extra_body["reasoning"] = {"effort": effort}
elif supported_efforts:
extra_body["reasoning"] = {"effort": "medium"}
except Exception:
pass
return extra_body, {}
copilot = CopilotProfile(
name="copilot",
aliases=("github-copilot", "github-models", "github-model", "github"),
env_vars=("COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN"),
base_url="https://api.githubcopilot.com",
auth_type="copilot",
)
register_provider(copilot)
+34
View File
@@ -0,0 +1,34 @@
"""GitHub Copilot ACP provider profile.
copilot-acp uses an external ACP subprocess NOT the standard
transport. api_mode="copilot_acp" is handled separately in run_agent.py.
The profile captures auth + endpoint metadata for registry migration.
"""
from providers import register_provider
from providers.base import ProviderProfile
class CopilotACPProfile(ProviderProfile):
"""GitHub Copilot ACP — external process, no REST models endpoint."""
def fetch_models(
self,
*,
api_key: str | None = None,
timeout: float = 8.0,
) -> list[str] | None:
"""Model listing is handled by the ACP subprocess."""
return None
copilot_acp = CopilotACPProfile(
name="copilot-acp",
aliases=("github-copilot-acp", "copilot-acp-agent"),
api_mode="chat_completions", # ACP subprocess uses chat_completions routing
env_vars=(), # Managed by ACP subprocess
base_url="acp://copilot", # ACP internal scheme
auth_type="external_process",
)
register_provider(copilot_acp)
+71
View File
@@ -0,0 +1,71 @@
"""Custom / Ollama (local) provider profile.
Covers any endpoint registered as provider="custom", including local
Ollama instances. Key quirks:
- ollama_num_ctx extra_body.options.num_ctx (local context window)
- reasoning_config disabled extra_body.think = False
"""
from typing import Any
from providers import register_provider
from providers.base import ProviderProfile
class CustomProfile(ProviderProfile):
"""Custom/Ollama local provider — think=false and num_ctx support."""
def build_api_kwargs_extras(
self,
*,
reasoning_config: dict | None = None,
ollama_num_ctx: int | None = None,
**ctx: Any,
) -> tuple[dict[str, Any], dict[str, Any]]:
extra_body: dict[str, Any] = {}
# Ollama context window
if ollama_num_ctx:
options = extra_body.get("options", {})
options["num_ctx"] = ollama_num_ctx
extra_body["options"] = options
# Disable thinking when reasoning is turned off
if reasoning_config and isinstance(reasoning_config, dict):
_effort = (reasoning_config.get("effort") or "").strip().lower()
_enabled = reasoning_config.get("enabled", True)
if _effort == "none" or _enabled is False:
extra_body["think"] = False
return extra_body, {}
def fetch_models(
self,
*,
api_key: str | None = None,
timeout: float = 8.0,
) -> list[str] | None:
"""Custom/Ollama: base_url is user-configured; fetch if set."""
if not self.base_url:
return None
return super().fetch_models(api_key=api_key, timeout=timeout)
custom = CustomProfile(
name="custom",
aliases=(
"ollama",
"local",
"lmstudio",
"lm-studio",
"lm_studio",
"vllm",
"llamacpp",
"llama.cpp",
"llama-cpp",
),
env_vars=(), # No fixed key — custom endpoint
base_url="", # User-configured
)
register_provider(custom)
+20
View File
@@ -0,0 +1,20 @@
"""DeepSeek provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
deepseek = ProviderProfile(
name="deepseek",
aliases=("deepseek-chat",),
env_vars=("DEEPSEEK_API_KEY",),
display_name="DeepSeek",
description="DeepSeek — native DeepSeek API",
signup_url="https://platform.deepseek.com/",
fallback_models=(
"deepseek-chat",
"deepseek-reasoner",
),
base_url="https://api.deepseek.com/v1",
)
register_provider(deepseek)
+34
View File
@@ -0,0 +1,34 @@
"""Google Gemini provider profiles.
gemini: Google AI Studio (API key) uses GeminiNativeClient
google-gemini-cli: Google Cloud Code Assist (OAuth) uses GeminiCloudCodeClient
Both report api_mode="chat_completions" but use custom native clients
that bypass the standard OpenAI transport. The profile captures auth
and endpoint metadata for auth.py / runtime_provider.py migration.
"""
from providers import register_provider
from providers.base import ProviderProfile
gemini = ProviderProfile(
name="gemini",
aliases=("google", "google-gemini", "google-ai-studio"),
api_mode="chat_completions",
env_vars=("GOOGLE_API_KEY", "GEMINI_API_KEY"),
base_url="https://generativelanguage.googleapis.com/v1beta",
auth_type="api_key",
default_aux_model="gemini-3-flash-preview",
)
google_gemini_cli = ProviderProfile(
name="google-gemini-cli",
aliases=("gemini-cli", "gemini-oauth"),
api_mode="chat_completions",
env_vars=(), # OAuth — no API key
base_url="cloudcode-pa://google", # Cloud Code Assist internal scheme
auth_type="oauth_external",
)
register_provider(gemini)
register_provider(google_gemini_cli)
+26
View File
@@ -0,0 +1,26 @@
"""GMI Cloud provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
gmi = ProviderProfile(
name="gmi",
aliases=("gmi-cloud", "gmicloud"),
display_name="GMI Cloud",
description="GMI Cloud — multi-model direct API (slash-form model IDs)",
signup_url="https://www.gmicloud.ai/",
env_vars=("GMI_API_KEY", "GMI_BASE_URL"),
base_url="https://api.gmi-serving.com/v1",
auth_type="api_key",
default_aux_model="google/gemini-3.1-flash-lite-preview",
fallback_models=(
"zai-org/GLM-5.1-FP8",
"deepseek-ai/DeepSeek-V3.2",
"moonshotai/Kimi-K2.5",
"google/gemini-3.1-flash-lite-preview",
"anthropic/claude-sonnet-4.6",
"openai/gpt-5.4",
),
)
register_provider(gmi)
+20
View File
@@ -0,0 +1,20 @@
"""Hugging Face provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
huggingface = ProviderProfile(
name="huggingface",
aliases=("hf", "hugging-face", "huggingface-hub"),
env_vars=("HF_TOKEN",),
display_name="HuggingFace",
description="HuggingFace Inference API",
signup_url="https://huggingface.co/settings/tokens",
fallback_models=(
"Qwen/Qwen3.5-72B-Instruct",
"deepseek-ai/DeepSeek-V3.2",
),
base_url="https://router.huggingface.co/v1",
)
register_provider(huggingface)
+14
View File
@@ -0,0 +1,14 @@
"""Kilo Code provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
kilocode = ProviderProfile(
name="kilocode",
aliases=("kilo-code", "kilo", "kilo-gateway"),
env_vars=("KILOCODE_API_KEY",),
base_url="https://api.kilo.ai/api/gateway",
default_aux_model="google/gemini-3-flash-preview",
)
register_provider(kilocode)
+71
View File
@@ -0,0 +1,71 @@
"""Kimi / Moonshot provider profiles.
Kimi has dual endpoints:
- sk-kimi-* keys api.kimi.com/coding (Anthropic Messages API)
- legacy keys api.moonshot.ai/v1 (OpenAI chat completions)
This module covers the chat_completions path (/v1 endpoint).
"""
from typing import Any
from providers import register_provider
from providers.base import OMIT_TEMPERATURE, ProviderProfile
class KimiProfile(ProviderProfile):
"""Kimi/Moonshot — temperature omitted, thinking + reasoning_effort."""
def build_api_kwargs_extras(
self, *, reasoning_config: dict | None = None, **context
) -> tuple[dict[str, Any], dict[str, Any]]:
"""Kimi uses extra_body.thinking + top-level reasoning_effort."""
extra_body = {}
top_level = {}
if not reasoning_config or not isinstance(reasoning_config, dict):
# No config → thinking enabled, default effort
extra_body["thinking"] = {"type": "enabled"}
top_level["reasoning_effort"] = "medium"
return extra_body, top_level
enabled = reasoning_config.get("enabled", True)
if enabled is False:
extra_body["thinking"] = {"type": "disabled"}
return extra_body, top_level
# Enabled
extra_body["thinking"] = {"type": "enabled"}
effort = (reasoning_config.get("effort") or "").strip().lower()
if effort in ("low", "medium", "high"):
top_level["reasoning_effort"] = effort
else:
top_level["reasoning_effort"] = "medium"
return extra_body, top_level
kimi = KimiProfile(
name="kimi-coding",
aliases=("kimi", "moonshot", "kimi-for-coding"),
env_vars=("KIMI_API_KEY", "KIMI_CODING_API_KEY"),
base_url="https://api.moonshot.ai/v1",
fixed_temperature=OMIT_TEMPERATURE,
default_max_tokens=32000,
default_headers={"User-Agent": "hermes-agent/1.0"},
default_aux_model="kimi-k2-turbo-preview",
)
kimi_cn = KimiProfile(
name="kimi-coding-cn",
aliases=("kimi-cn", "moonshot-cn"),
env_vars=("KIMI_CN_API_KEY",),
base_url="https://api.moonshot.cn/v1",
fixed_temperature=OMIT_TEMPERATURE,
default_max_tokens=32000,
default_headers={"User-Agent": "hermes-agent/1.0"},
default_aux_model="kimi-k2-turbo-preview",
)
register_provider(kimi)
register_provider(kimi_cn)
+31
View File
@@ -0,0 +1,31 @@
"""MiniMax provider profiles (international + China).
Both use anthropic_messages api_mode their inference_base_url
ends with /anthropic which triggers auto-detection to anthropic_messages.
"""
from providers import register_provider
from providers.base import ProviderProfile
minimax = ProviderProfile(
name="minimax",
aliases=("mini-max",),
api_mode="anthropic_messages",
env_vars=("MINIMAX_API_KEY",),
base_url="https://api.minimax.io/anthropic",
auth_type="api_key",
default_aux_model="MiniMax-M2.7",
)
minimax_cn = ProviderProfile(
name="minimax-cn",
aliases=("minimax-china", "minimax_cn"),
api_mode="anthropic_messages",
env_vars=("MINIMAX_CN_API_KEY",),
base_url="https://api.minimaxi.com/anthropic",
auth_type="api_key",
default_aux_model="MiniMax-M2.7",
)
register_provider(minimax)
register_provider(minimax_cn)
+53
View File
@@ -0,0 +1,53 @@
"""Nous Portal provider profile."""
from typing import Any
from providers import register_provider
from providers.base import ProviderProfile
class NousProfile(ProviderProfile):
"""Nous Portal — product tags, reasoning with Nous-specific omission."""
def build_extra_body(
self, *, session_id: str | None = None, **context
) -> dict[str, Any]:
return {"tags": ["product=hermes-agent"]}
def build_api_kwargs_extras(
self,
*,
reasoning_config: dict | None = None,
supports_reasoning: bool = False,
**context,
) -> tuple[dict[str, Any], dict[str, Any]]:
"""Nous: passes full reasoning_config, but OMITS when disabled."""
extra_body = {}
if supports_reasoning:
if reasoning_config is not None:
rc = dict(reasoning_config)
if rc.get("enabled") is False:
pass # Nous omits reasoning when disabled
else:
extra_body["reasoning"] = rc
else:
extra_body["reasoning"] = {"enabled": True, "effort": "medium"}
return extra_body, {}
nous = NousProfile(
name="nous",
aliases=("nous-portal", "nousresearch"),
env_vars=("NOUS_API_KEY",),
display_name="Nous Research",
description="Nous Research — Hermes model family",
signup_url="https://nousresearch.com/",
fallback_models=(
"hermes-3-405b",
"hermes-3-70b",
),
base_url="https://inference.nousresearch.com/v1",
auth_type="oauth_device_code",
)
register_provider(nous)
+21
View File
@@ -0,0 +1,21 @@
"""NVIDIA NIM provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
nvidia = ProviderProfile(
name="nvidia",
aliases=("nvidia-nim",),
env_vars=("NVIDIA_API_KEY",),
display_name="NVIDIA NIM",
description="NVIDIA NIM — accelerated inference",
signup_url="https://build.nvidia.com/",
fallback_models=(
"nvidia/llama-3.1-nemotron-70b-instruct",
"nvidia/llama-3.3-70b-instruct",
),
base_url="https://integrate.api.nvidia.com/v1",
default_max_tokens=16384,
)
register_provider(nvidia)
+14
View File
@@ -0,0 +1,14 @@
"""Ollama Cloud provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
ollama_cloud = ProviderProfile(
name="ollama-cloud",
aliases=("ollama_cloud",),
default_aux_model="nemotron-3-nano:30b",
env_vars=("OLLAMA_API_KEY",),
base_url="https://ollama.com/v1",
)
register_provider(ollama_cloud)
+15
View File
@@ -0,0 +1,15 @@
"""OpenAI Codex (Responses API) provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
openai_codex = ProviderProfile(
name="openai-codex",
aliases=("codex", "openai_codex"),
api_mode="codex_responses",
env_vars=(), # OAuth external — no API key
base_url="https://chatgpt.com/backend-api/codex",
auth_type="oauth_external",
)
register_provider(openai_codex)
+30
View File
@@ -0,0 +1,30 @@
"""OpenCode provider profiles (Zen + Go).
Both use per-model api_mode routing:
- OpenCode Zen: Claude anthropic_messages, GPT-5/Codex codex_responses,
everything else chat_completions (this profile)
- OpenCode Go: MiniMax anthropic_messages, GLM/Kimi chat_completions
(this profile)
"""
from providers import register_provider
from providers.base import ProviderProfile
opencode_zen = ProviderProfile(
name="opencode-zen",
aliases=("opencode", "opencode_zen", "zen"),
env_vars=("OPENCODE_ZEN_API_KEY",),
base_url="https://opencode.ai/zen/v1",
default_aux_model="gemini-3-flash",
)
opencode_go = ProviderProfile(
name="opencode-go",
aliases=("opencode_go", "go", "opencode-go-sub"),
env_vars=("OPENCODE_GO_API_KEY",),
base_url="https://opencode.ai/zen/go/v1",
default_aux_model="glm-5",
)
register_provider(opencode_zen)
register_provider(opencode_go)
+86
View File
@@ -0,0 +1,86 @@
"""OpenRouter provider profile."""
import logging
from typing import Any
from providers import register_provider
from providers.base import ProviderProfile
logger = logging.getLogger(__name__)
_CACHE: list[str] | None = None
class OpenRouterProfile(ProviderProfile):
"""OpenRouter aggregator — provider preferences, reasoning config passthrough."""
def fetch_models(
self,
*,
api_key: str | None = None,
timeout: float = 8.0,
) -> list[str] | None:
"""Fetch from public OpenRouter catalog — no auth required.
Note: Tool-call capability filtering is applied by hermes_cli/models.py
via fetch_openrouter_models() _openrouter_model_supports_tools(), not
here. The picker early-returns via the dedicated openrouter path before
reaching this method, so filtering here would be unreachable.
"""
global _CACHE # noqa: PLW0603
if _CACHE is not None:
return _CACHE
try:
result = super().fetch_models(api_key=None, timeout=timeout)
if result is not None:
_CACHE = result
return result
except Exception as exc:
logger.debug("fetch_models(openrouter): %s", exc)
return None
def build_extra_body(
self, *, session_id: str | None = None, **context: Any
) -> dict[str, Any]:
body: dict[str, Any] = {}
prefs = context.get("provider_preferences")
if prefs:
body["provider"] = prefs
return body
def build_api_kwargs_extras(
self,
*,
reasoning_config: dict | None = None,
supports_reasoning: bool = False,
**context: Any,
) -> tuple[dict[str, Any], dict[str, Any]]:
"""OpenRouter passes the full reasoning_config dict as extra_body.reasoning."""
extra_body: dict[str, Any] = {}
if supports_reasoning:
if reasoning_config is not None:
extra_body["reasoning"] = dict(reasoning_config)
else:
extra_body["reasoning"] = {"enabled": True, "effort": "medium"}
return extra_body, {}
openrouter = OpenRouterProfile(
name="openrouter",
aliases=("or",),
env_vars=("OPENROUTER_API_KEY",),
display_name="OpenRouter",
description="OpenRouter — unified API for 200+ models",
signup_url="https://openrouter.ai/keys",
base_url="https://openrouter.ai/api/v1",
models_url="https://openrouter.ai/api/v1/models",
fallback_models=(
"anthropic/claude-sonnet-4.6",
"openai/gpt-5.4",
"deepseek/deepseek-chat",
"google/gemini-3-flash-preview",
"qwen/qwen3-plus",
),
)
register_provider(openrouter)
+82
View File
@@ -0,0 +1,82 @@
"""Qwen Portal provider profile."""
import copy
from typing import Any
from providers import register_provider
from providers.base import ProviderProfile
class QwenProfile(ProviderProfile):
"""Qwen Portal — message normalization, vl_high_resolution, metadata top-level."""
def prepare_messages(self, messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
"""Normalize content to list-of-dicts format.
Inject cache_control on system message.
Matches the behavior of run_agent.py:_qwen_prepare_chat_messages().
"""
prepared = copy.deepcopy(messages)
if not prepared:
return prepared
for msg in prepared:
if not isinstance(msg, dict):
continue
content = msg.get("content")
if isinstance(content, str):
msg["content"] = [{"type": "text", "text": content}]
elif isinstance(content, list):
normalized_parts = []
for part in content:
if isinstance(part, str):
normalized_parts.append({"type": "text", "text": part})
elif isinstance(part, dict):
normalized_parts.append(part)
if normalized_parts:
msg["content"] = normalized_parts
# Inject cache_control on the last part of the system message.
for msg in prepared:
if isinstance(msg, dict) and msg.get("role") == "system":
content = msg.get("content")
if (
isinstance(content, list)
and content
and isinstance(content[-1], dict)
):
content[-1]["cache_control"] = {"type": "ephemeral"}
break
return prepared
def build_extra_body(
self, *, session_id: str | None = None, **context
) -> dict[str, Any]:
return {"vl_high_resolution_images": True}
def build_api_kwargs_extras(
self,
*,
reasoning_config: dict | None = None,
qwen_session_metadata: dict | None = None,
**context,
) -> tuple[dict[str, Any], dict[str, Any]]:
"""Qwen metadata goes to top-level api_kwargs, not extra_body."""
top_level = {}
if qwen_session_metadata:
top_level["metadata"] = qwen_session_metadata
return {}, top_level
qwen = QwenProfile(
name="qwen-oauth",
aliases=("qwen", "qwen-portal", "qwen-cli"),
env_vars=("QWEN_API_KEY",),
base_url="https://portal.qwen.ai/v1",
auth_type="oauth_external",
default_max_tokens=65536,
)
register_provider(qwen)
+14
View File
@@ -0,0 +1,14 @@
"""StepFun provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
stepfun = ProviderProfile(
name="stepfun",
aliases=("step", "stepfun-coding-plan"),
default_aux_model="step-3.5-flash",
env_vars=("STEPFUN_API_KEY",),
base_url="https://api.stepfun.ai/step_plan/v1",
)
register_provider(stepfun)
+43
View File
@@ -0,0 +1,43 @@
"""Vercel AI Gateway provider profile.
AI Gateway routes to multiple backends. Hermes sends attribution
headers and full reasoning config passthrough.
"""
from typing import Any
from providers import register_provider
from providers.base import ProviderProfile
class VercelAIGatewayProfile(ProviderProfile):
"""Vercel AI Gateway — attribution headers + reasoning passthrough."""
def build_api_kwargs_extras(
self,
*,
reasoning_config: dict | None = None,
supports_reasoning: bool = True,
**ctx: Any,
) -> tuple[dict[str, Any], dict[str, Any]]:
extra_body: dict[str, Any] = {}
if supports_reasoning and reasoning_config is not None:
extra_body["reasoning"] = dict(reasoning_config)
elif supports_reasoning:
extra_body["reasoning"] = {"enabled": True, "effort": "medium"}
return extra_body, {}
vercel = VercelAIGatewayProfile(
name="ai-gateway",
aliases=("vercel", "vercel-ai-gateway", "ai_gateway", "aigateway"),
env_vars=("AI_GATEWAY_API_KEY",),
base_url="https://ai-gateway.vercel.sh/v1",
default_headers={
"HTTP-Referer": "https://hermes-agent.nousresearch.com",
"X-Title": "Hermes Agent",
},
default_aux_model="google/gemini-3-flash",
)
register_provider(vercel)
+15
View File
@@ -0,0 +1,15 @@
"""xAI (Grok) provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
xai = ProviderProfile(
name="xai",
aliases=("grok", "x-ai", "x.ai"),
api_mode="codex_responses",
env_vars=("XAI_API_KEY",),
base_url="https://api.x.ai/v1",
auth_type="api_key",
)
register_provider(xai)
+13
View File
@@ -0,0 +1,13 @@
"""Xiaomi MiMo provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
xiaomi = ProviderProfile(
name="xiaomi",
aliases=("mimo", "xiaomi-mimo"),
env_vars=("XIAOMI_API_KEY",),
base_url="https://api.xiaomimimo.com/v1",
)
register_provider(xiaomi)
+21
View File
@@ -0,0 +1,21 @@
"""ZAI / GLM provider profile."""
from providers import register_provider
from providers.base import ProviderProfile
zai = ProviderProfile(
name="zai",
aliases=("glm", "z-ai", "z.ai", "zhipu"),
env_vars=("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"),
display_name="Z.AI (GLM)",
description="Z.AI / GLM — Zhipu AI models",
signup_url="https://z.ai/",
fallback_models=(
"glm-5",
"glm-4-9b",
),
base_url="https://api.z.ai/api/paas/v4",
default_aux_model="glm-4.5-flash",
)
register_provider(zai)
+1 -1
View File
@@ -137,7 +137,7 @@ py-modules = ["run_agent", "model_tools", "toolsets", "batch_runner", "trajector
hermes_cli = ["web_dist/**/*"]
[tool.setuptools.packages.find]
include = ["agent", "agent.*", "tools", "tools.*", "hermes_cli", "gateway", "gateway.*", "tui_gateway", "tui_gateway.*", "cron", "acp_adapter", "plugins", "plugins.*"]
include = ["agent", "agent.*", "tools", "tools.*", "hermes_cli", "gateway", "gateway.*", "tui_gateway", "tui_gateway.*", "cron", "acp_adapter", "plugins", "plugins.*", "providers", "providers.*"]
[tool.pytest.ini_options]
testpaths = ["tests"]
+109 -73
View File
@@ -1371,6 +1371,17 @@ class AIAgent:
elif base_url_host_matches(effective_base, "chatgpt.com"):
from agent.auxiliary_client import _codex_cloudflare_headers
client_kwargs["default_headers"] = _codex_cloudflare_headers(api_key)
elif "default_headers" not in client_kwargs:
# Fall back to profile.default_headers for providers that
# declare custom headers (e.g. Vercel AI Gateway attribution,
# Kimi User-Agent on non-kimi.com endpoints).
try:
from providers import get_provider_profile as _gpf
_ph = _gpf(self.provider)
if _ph and _ph.default_headers:
client_kwargs["default_headers"] = dict(_ph.default_headers)
except Exception:
pass
else:
# No explicit creds — use the centralized provider router
from agent.auxiliary_client import resolve_provider_client
@@ -5037,7 +5048,7 @@ class AIAgent:
_validate_proxy_env_urls()
_validate_base_url(client_kwargs.get("base_url"))
if self.provider == "copilot-acp" or str(client_kwargs.get("base_url", "")).startswith("acp://copilot"):
from agent.copilot_acp_client import CopilotACPClient
from acp_adapter.copilot_client import CopilotACPClient
client = CopilotACPClient(**client_kwargs)
logger.info(
@@ -5726,7 +5737,19 @@ class AIAgent:
self._client_kwargs.get("api_key", "")
)
else:
self._client_kwargs.pop("default_headers", None)
# No URL-specific headers — check profile.default_headers before clearing.
_ph_headers = None
try:
from providers import get_provider_profile as _gpf2
_ph2 = _gpf2(self.provider)
if _ph2 and _ph2.default_headers:
_ph_headers = dict(_ph2.default_headers)
except Exception:
pass
if _ph_headers:
self._client_kwargs["default_headers"] = _ph_headers
else:
self._client_kwargs.pop("default_headers", None)
def _swap_credential(self, entry) -> None:
runtime_key = getattr(entry, "runtime_api_key", None) or getattr(entry, "access_token", "")
@@ -7857,66 +7880,79 @@ class AIAgent:
# ── chat_completions (default) ─────────────────────────────────────
_ct = self._get_transport()
# Provider detection flags
_is_qwen = self._is_qwen_portal()
_is_or = self._is_openrouter_url()
_is_gh = (
base_url_host_matches(self._base_url_lower, "models.github.ai")
or base_url_host_matches(self._base_url_lower, "api.githubcopilot.com")
)
_is_nous = "nousresearch" in self._base_url_lower
_is_nvidia = "integrate.api.nvidia.com" in self._base_url_lower
_is_kimi = (
base_url_host_matches(self.base_url, "api.kimi.com")
or base_url_host_matches(self.base_url, "moonshot.ai")
or base_url_host_matches(self.base_url, "moonshot.cn")
)
# Temperature: _fixed_temperature_for_model may return OMIT_TEMPERATURE
# sentinel (temperature omitted entirely), a numeric override, or None.
# ── Provider profile path (all chat_completions providers) ─────────
# Profiles handle per-provider quirks via hooks. We compute the shared
# per-call context here and pass it through so hooks can use it.
try:
from agent.auxiliary_client import _fixed_temperature_for_model, OMIT_TEMPERATURE
_ft = _fixed_temperature_for_model(self.model, self.base_url)
_omit_temp = _ft is OMIT_TEMPERATURE
_fixed_temp = _ft if not _omit_temp else None
from providers import get_provider_profile
_profile = get_provider_profile(self.provider)
except Exception:
_omit_temp = False
_fixed_temp = None
_profile = None
# Provider preferences (OpenRouter-specific)
_prefs: Dict[str, Any] = {}
if self.providers_allowed:
_prefs["only"] = self.providers_allowed
if self.providers_ignored:
_prefs["ignore"] = self.providers_ignored
if self.providers_order:
_prefs["order"] = self.providers_order
if self.provider_sort:
_prefs["sort"] = self.provider_sort
if self.provider_require_parameters:
_prefs["require_parameters"] = True
if self.provider_data_collection:
_prefs["data_collection"] = self.provider_data_collection
if _profile:
_ephemeral_out = getattr(self, "_ephemeral_max_output_tokens", None)
if _ephemeral_out is not None:
self._ephemeral_max_output_tokens = None
# Anthropic max output for Claude on OpenRouter/Nous
_ant_max = None
if (_is_or or _is_nous) and "claude" in (self.model or "").lower():
try:
from agent.anthropic_adapter import _get_anthropic_max_output
_ant_max = _get_anthropic_max_output(self.model)
except Exception:
pass # fail open — let the proxy pick its default
# Per-call context for profile hooks — mirrors the legacy flag block.
# Computed here so profiles receive live per-call values (not stale).
_prefs: Dict[str, Any] = {}
if self.providers_allowed:
_prefs["only"] = self.providers_allowed
if self.providers_ignored:
_prefs["ignore"] = self.providers_ignored
if self.providers_order:
_prefs["order"] = self.providers_order
if self.provider_sort:
_prefs["sort"] = self.provider_sort
if self.provider_require_parameters:
_prefs["require_parameters"] = True
if self.provider_data_collection:
_prefs["data_collection"] = self.provider_data_collection
# Qwen session metadata precomputed here (promptId is per-call random)
_qwen_meta = None
if _is_qwen:
_qwen_meta = {
"sessionId": self.session_id or "hermes",
"promptId": str(uuid.uuid4()),
}
_is_or = self._is_openrouter_url()
_is_nous = "nousresearch" in self._base_url_lower
_ant_max = None
if (_is_or or _is_nous) and "claude" in (self.model or "").lower():
try:
from agent.anthropic_adapter import _get_anthropic_max_output
_ant_max = _get_anthropic_max_output(self.model)
except Exception:
pass
# Ephemeral max output override — consume immediately so the next
# turn doesn't inherit it.
_is_qwen = self._is_qwen_portal()
_qwen_meta = None
if _is_qwen:
_qwen_meta = {
"sessionId": self.session_id or "hermes",
"promptId": str(uuid.uuid4()),
}
return _ct.build_kwargs(
model=self.model,
messages=api_messages,
tools=self.tools,
timeout=self._resolved_api_call_timeout(),
max_tokens=self.max_tokens,
ephemeral_max_output_tokens=_ephemeral_out,
max_tokens_param_fn=self._max_tokens_param,
reasoning_config=self.reasoning_config,
request_overrides=self.request_overrides,
session_id=getattr(self, "session_id", None),
provider_profile=_profile,
ollama_num_ctx=self._ollama_num_ctx,
# Context forwarded to profile hooks:
provider_preferences=_prefs or None,
anthropic_max_output=_ant_max,
supports_reasoning=self._supports_reasoning_extra_body(),
qwen_session_metadata=_qwen_meta,
)
# ── Legacy flag path ────────────────────────────────────────────
# Reached only when get_provider_profile() returns None — i.e. a
# completely unknown provider not in providers/ registry.
# Best-effort: send a clean chat_completions request with no
# provider-specific quirks.
_ephemeral_out = getattr(self, "_ephemeral_max_output_tokens", None)
if _ephemeral_out is not None:
self._ephemeral_max_output_tokens = None
@@ -7935,24 +7971,7 @@ class AIAgent:
reasoning_config=self.reasoning_config,
request_overrides=self.request_overrides,
session_id=getattr(self, "session_id", None),
model_lower=(self.model or "").lower(),
is_openrouter=_is_or,
is_nous=_is_nous,
is_qwen_portal=_is_qwen,
is_github_models=_is_gh,
is_nvidia_nim=_is_nvidia,
is_kimi=_is_kimi,
is_custom_provider=self.provider == "custom",
ollama_num_ctx=self._ollama_num_ctx,
provider_preferences=_prefs or None,
qwen_prepare_fn=self._qwen_prepare_chat_messages if _is_qwen else None,
qwen_prepare_inplace_fn=self._qwen_prepare_chat_messages_inplace if _is_qwen else None,
qwen_session_metadata=_qwen_meta,
fixed_temperature=_fixed_temp,
omit_temperature=_omit_temp,
supports_reasoning=self._supports_reasoning_extra_body(),
github_reasoning_extra=self._github_models_reasoning_extra_body() if _is_gh else None,
anthropic_max_output=_ant_max,
)
def _supports_reasoning_extra_body(self) -> bool:
@@ -8460,6 +8479,23 @@ class AIAgent:
f"⚠ Compression summary failed: {summary_error}. "
"Inserted a fallback context marker."
)
else:
# No hard failure — but did the configured aux model error out
# and get recovered by retrying on main? Surface that so users
# know their auxiliary.compression.model setting is broken even
# though compression succeeded.
_aux_fail_model = getattr(self.context_compressor, "_last_aux_model_failure_model", None)
_aux_fail_err = getattr(self.context_compressor, "_last_aux_model_failure_error", None)
if _aux_fail_model:
# Dedup on (model, error) so we don't spam on every compaction
_aux_key = (_aux_fail_model, _aux_fail_err)
if getattr(self, "_last_aux_fallback_warning_key", None) != _aux_key:
self._last_aux_fallback_warning_key = _aux_key
self._emit_warning(
f" Configured compression model '{_aux_fail_model}' failed "
f"({_aux_fail_err or 'unknown error'}). Recovered using main model — "
"check auxiliary.compression.model in config.yaml."
)
todo_snapshot = self._todo_store.format_for_injection()
if todo_snapshot:
+11 -1
View File
@@ -204,8 +204,9 @@ win.par.winopen.pulse()
| `td_input_clear` | Stop input automation |
| `td_op_screen_rect` | Get screen coords of a node |
| `td_click_screen_point` | Click a point in a screenshot |
| `td_screen_point_to_global` | Convert screenshot pixel to absolute screen coords |
See `references/mcp-tools.md` for full parameter schemas.
The table above covers the 32 tools used in typical creative workflows. The remaining 4 tools (`td_project_quit`, `td_test_session`, `td_dev_log`, `td_clear_dev_log`) are admin/dev-mode utilities — see `references/mcp-tools.md` for the full 36-tool reference with complete parameter schemas.
## Key Implementation Rules
@@ -338,6 +339,15 @@ See `references/network-patterns.md` for complete build scripts + shader code.
| `references/operator-tips.md` | Wireframe rendering, feedback TOP setup |
| `references/geometry-comp.md` | Geometry COMP: instancing, POP vs SOP, morphing |
| `references/audio-reactive.md` | Audio band extraction, beat detection, envelope following |
| `references/animation.md` | LFOs, timers, keyframes, easing, expression-driven motion |
| `references/midi-osc.md` | MIDI/OSC controllers, TouchOSC, multi-machine sync |
| `references/particles.md` | POPs and legacy particleSOP — emission, forces, collisions |
| `references/projection-mapping.md` | Multi-window output, corner pin, mesh warp, edge blending |
| `references/external-data.md` | HTTP, WebSocket, MQTT, Serial, TCP, webserverDAT |
| `references/panel-ui.md` | Custom params, panel COMPs, button/slider/field, panelExecuteDAT |
| `references/replicator.md` | replicatorCOMP — data-driven cloning, layouts, callbacks |
| `references/dat-scripting.md` | Execute DAT family — chop/dat/parameter/panel/op/executeDAT |
| `references/3d-scene.md` | Lighting rigs, shadows, IBL/cubemaps, multi-camera, PBR |
| `scripts/setup.sh` | Automated setup script |
---
@@ -0,0 +1,275 @@
# 3D Scene Reference
Lighting rigs, shadows, IBL/cubemaps, multi-camera, and PBR materials. For wireframe rendering and feedback TOPs see `operator-tips.md`. For instancing geometry see `geometry-comp.md`. For shader code see `glsl.md`.
---
## Anatomy of a 3D Scene
```
[Geometry COMP] ← contains SOPs (the shapes)
[Material] ← Phong/PBR/GLSL/Constant MAT
[Light COMPs] ← point/directional/spot/area/environment
[Camera COMP] ← view position, FOV
[Render TOP] ← combines geo + lights + camera into a 2D image
[post-FX chain] ← bloomTOP, glsl shaders, etc.
[windowCOMP] ← actual display
```
Render TOP is the heart. It takes an explicit `geometry` path, an explicit `camera` path, and lights via the lights table or an envlight reference.
---
## Minimal Scene
```python
# Geometry
geo = root.create(geometryCOMP, 'scene_geo')
sphere = geo.create(sphereSOP, 'shape')
sphere.par.rad = 1.0; sphere.par.rows = 64; sphere.par.cols = 64
# Material — start with PBR
mat = root.create(pbrMAT, 'mat')
mat.par.basecolorr = 0.7; mat.par.basecolorg = 0.7; mat.par.basecolorb = 0.7
mat.par.metallic = 0.0
mat.par.roughness = 0.4
geo.par.material = mat.path
# Camera
cam = root.create(cameraCOMP, 'cam1')
cam.par.tx = 0; cam.par.ty = 0; cam.par.tz = 4
cam.par.fov = 45
cam.par.near = 0.1; cam.par.far = 100
# Key light
key = root.create(lightCOMP, 'key_light')
key.par.lighttype = 'point'
key.par.tx = 3; key.par.ty = 3; key.par.tz = 3
key.par.dimmer = 1.5
# Render
render = root.create(renderTOP, 'render1')
render.par.outputresolution = 'custom'
render.par.resolutionw = 1920; render.par.resolutionh = 1080
render.par.camera = cam.path
render.par.geometry = geo.path
render.par.lights = key.path # single light path; for multi, see below
render.par.bgcolorr = 0; render.par.bgcolorg = 0; render.par.bgcolorb = 0
```
For multiple lights, leave `par.lights` blank — Render TOP scans the network for all `lightCOMP` and `envlightCOMP` ops by default. To restrict to specific lights, set `par.lights = '/project1/key_light /project1/fill_light'` (space-separated paths).
---
## Light Types
| Type | What | Common params |
|---|---|---|
| `point` | Omnidirectional, falls off with distance | `dimmer`, `coneangle` (n/a), `attenuation` |
| `directional` | Parallel rays, infinite distance (sun) | `dimmer`, light's rotation only matters |
| `spot` | Cone, falls off with distance + angle | `coneangle`, `conedelta`, `dimmer` |
| `cone` | Like spot but harder edge | same |
| `area` | Rectangular soft light source | `sizex`, `sizey` |
For all: `colorr`, `colorg`, `colorb`, `tx/ty/tz`, `rx/ry/rz`, `dimmer`.
### Three-Point Lighting (Studio Setup)
```python
# Key — main light, ~45° front
key = root.create(lightCOMP, 'key')
key.par.lighttype = 'point'
key.par.tx = 4; key.par.ty = 3; key.par.tz = 4
key.par.dimmer = 1.5
key.par.colorr = 1.0; key.par.colorg = 0.95; key.par.colorb = 0.85
# Fill — softer, opposite side
fill = root.create(lightCOMP, 'fill')
fill.par.lighttype = 'area'
fill.par.tx = -4; fill.par.ty = 2; fill.par.tz = 3
fill.par.dimmer = 0.5
fill.par.colorr = 0.7; fill.par.colorg = 0.8; fill.par.colorb = 1.0
fill.par.sizex = 4; fill.par.sizey = 4
# Rim/back — outline from behind
rim = root.create(lightCOMP, 'rim')
rim.par.lighttype = 'spot'
rim.par.tx = 0; rim.par.ty = 4; rim.par.tz = -4
rim.par.coneangle = 30
rim.par.dimmer = 1.0
# Optional: ambient lift to prevent pure-black shadows
amb = root.create(ambientlightCOMP, 'ambient')
amb.par.dimmer = 0.15
```
---
## Shadows
Spot and directional lights cast shadows when `par.shadowtype != 'none'`.
```python
key.par.shadowtype = 'softshadow' # 'none' | 'hardshadow' | 'softshadow'
key.par.shadowsize = 1024 # shadow map resolution
key.par.shadowsoftness = 0.02 # softshadow only
```
**Tips:**
- Soft shadows are GPU-expensive. Start with `shadowsize = 1024` and only go higher (2048/4096) if shadow edges look pixelated at your resolution.
- Set the spot light's `near`/`far` to JUST contain the scene. Wider range = wasted shadow map precision.
- Multiple shadow-casting lights compound cost. Limit to 1-2 in real-time work; pre-bake the rest into the materials.
---
## Image-Based Lighting (IBL) / Environment Light
For realistic PBR materials you need a cubemap for reflections.
```python
# Environment light from an HDR
env = root.create(envlightCOMP, 'env')
env.par.envmap = '/project1/cube_in' # path to a TOP that produces a cubemap
env.par.envlightmap = ... # diffuse irradiance map (often same as envmap)
env.par.dimmer = 1.0
# Cubemap source — option A: built-in cubeTOP from 6 faces
cube = root.create(cubeTOP, 'cube_in')
# (assign 6 face TOPs)
# Option B: HDR equirectangular → cubemap conversion
# Use a moviefileinTOP loading .hdr or .exr, then projectTOP type='cubemapfromequirect'
hdr = root.create(moviefileinTOP, 'hdr_src')
hdr.par.file = '/path/to/environment.hdr'
proj = root.create(projectTOP, 'cube_proj')
proj.par.projecttype = 'cubemapfromequirect'
proj.inputConnectors[0].connect(hdr)
```
PBR materials sample the environment automatically when `envlightCOMP` is in the scene. Verify param names with `td_get_par_info(op_type='envlightCOMP')` — TD versions vary.
---
## PBR Material Setup
```python
mat = root.create(pbrMAT, 'pbr_metal')
mat.par.basecolorr = 0.95; mat.par.basecolorg = 0.65; mat.par.basecolorb = 0.4
mat.par.metallic = 1.0
mat.par.roughness = 0.25
mat.par.specularlevel = 0.5
mat.par.emitcolorr = 0; mat.par.emitcolorg = 0; mat.par.emitcolorb = 0
# Texture maps
mat.par.basecolormap = '/project1/textures/albedo' # TOP path
mat.par.metallicroughnessmap = '/project1/textures/mr' # G=roughness, B=metallic (glTF convention)
mat.par.normalmap = '/project1/textures/normal'
mat.par.emitmap = '/project1/textures/emit'
mat.par.occlusionmap = '/project1/textures/ao'
```
**Material idioms:**
| Look | metallic | roughness | basecolor |
|---|---|---|---|
| Brushed steel | 1.0 | 0.4 | (0.7, 0.7, 0.7) |
| Polished gold | 1.0 | 0.1 | (1.0, 0.85, 0.4) |
| Plastic | 0.0 | 0.5 | mid-saturated |
| Rubber | 0.0 | 0.9 | dark |
| Glass | 0.0 | 0.05 | (1, 1, 1), low alpha + transmission |
| Glowing emitter | 0.0 | 1.0 | dark, high `emitcolor` |
For glass/transmission, recent TD versions support `transmission` in PBR; older versions need glslMAT.
---
## Multi-Camera Setups
For comparison views, instant replay, multi-screen mapping, etc.
```python
# Camera A — main scene
cam_a = root.create(cameraCOMP, 'cam_main')
cam_a.par.tz = 5
# Camera B — orbiting top-down
cam_b = root.create(cameraCOMP, 'cam_top')
cam_b.par.ty = 6; cam_b.par.rx = -90
# Render each via separate Render TOPs
render_a = root.create(renderTOP, 'render_main')
render_a.par.camera = cam_a.path
render_a.par.geometry = geo.path
render_b = root.create(renderTOP, 'render_top')
render_b.par.camera = cam_b.path
render_b.par.geometry = geo.path
```
Composite both with a `multiplyTOP`/`compositeTOP` for picture-in-picture, or route to separate `windowCOMP`s for multi-display.
### Camera animation
Drive camera params via expressions (orbit), animationCOMP (waypoint), or LFO (oscillation):
```python
# Orbiting camera
cam_a.par.tx.mode = ParMode.EXPRESSION
cam_a.par.tx.expr = "cos(absTime.seconds * 0.3) * 6"
cam_a.par.tz.mode = ParMode.EXPRESSION
cam_a.par.tz.expr = "sin(absTime.seconds * 0.3) * 6"
cam_a.par.lookat = '/project1/scene_geo' # auto-aim at target
```
`par.lookat` is the simplest "always look at target" mechanism.
### Depth of field
PBR + Render TOP supports DOF when `par.dof = 'on'`.
```python
render.par.dof = 'on'
render.par.focusdistance = 5.0
render.par.aperture = 0.05 # blur strength
render.par.bokehshape = 'hexagon'
```
DOF is GPU-heavy. Render at lower res then upscale for performance.
---
## Common Pitfalls
1. **Render TOP shows black** — most common cause: no light. Even with PBR you need at least one `lightCOMP` or `envlightCOMP`. Add an `ambientlightCOMP` at low dimmer as a safety net.
2. **Material doesn't appear**`geo.par.material` must be a string PATH, not the material op itself. Use `mat.path`, not `mat`.
3. **Lights ignored** — by default Render TOP picks up ALL `lightCOMP`s in the network. If you have leftover lights from another scene, they leak in. Set `par.lights` explicitly.
4. **PBR looks flat** — without an `envlightCOMP` providing reflections, PBR materials look like Phong. Add one even if you don't have an HDR (use a `constantTOP` cubemap as fallback).
5. **Shadow acne / striping** — increase `par.shadowbias` slightly. Tune per-light.
6. **Camera inside geometry** — if `cam.par.tz` is INSIDE a sphere, you see the inside (or nothing if backface culled). Move the camera further out.
7. **Light range too small** — point lights have implicit attenuation. Far-away geometry receives little light. Increase `par.dimmer` or move lights closer.
8. **Multiple cameras conflict** — one render TOP = one camera. Don't try to share. Use multiple render TOPs.
9. **Wrong handedness** — TD is right-handed Y-up. Imported assets from Z-up apps (Blender, Maya in Z-up) need a 90° X rotation on the geo COMP.
10. **Cooking budget** — PBR + IBL + shadows + DOF at 1080p60 is fine on modern GPUs but 4K + 4 lights + soft shadows + DOF will tank. Profile via `td_get_perf` and downgrade settings before adding more.
---
## Quick Recipes
| Goal | Recipe |
|---|---|
| Studio portrait | 3-point rig (key + fill + rim) + ambient + PBR mat + DOF |
| Outdoor daylight | One directional `lightCOMP` (sun) + envlight (sky HDR) + soft shadows |
| Dramatic / film noir | Single spot light from upper side, hard shadows, deep ambient = 0.05 |
| Abstract / dreamy | Multiple area lights at low dimmer, no shadows, `bloomTOP` post |
| Product render | Three-point + IBL + neutral PBR + `bgcolorr=g=b=1` (white seamless) |
| Game-style | Phong MAT + 1-2 lights + no IBL + flat ambient (cheap, stylized) |
| Wireframe + solid | Two render TOPs (one with wireframeMAT, one with PBR), composite via `addTOP` |
| Orbiting camera | `par.lookat` + expressions on tx/tz using sin/cos |
@@ -0,0 +1,221 @@
# Animation Reference
Patterns for time-based motion — keyframes, LFOs, timers, easing, expression-driven animation.
Always call `td_get_par_info` for the op type before setting params. Param names below reflect TD 2025.32 but verify if errors fire.
---
## Time Sources
TD has three time references — pick the right one.
| Expression | Behavior | Use for |
|---|---|---|
| `absTime.seconds` | Wall-clock seconds since TD started. Never resets. | Continuous motion, GLSL `uTime`, infinite loops |
| `absTime.frame` | Wall-clock frame count. | Frame-accurate triggers |
| `me.time.frame` | Local component frame count (resets on play/stop). | Per-COMP animation timeline |
| `me.time.seconds` | Local component seconds. | Same, in seconds |
**Rule:** for shaders and continuous motion use `absTime.seconds`. For triggered/looping animations inside a COMP use `me.time.*`.
---
## LFO CHOP — Cyclic Motion
The simplest periodic driver. Fast, GPU-cheap, expression-friendly.
```python
lfo = root.create(lfoCHOP, 'rot_driver')
lfo.par.type = 'sin' # 'sin' | 'cos' | 'ramp' | 'square' | 'triangle' | 'pulse'
lfo.par.frequency = 0.25 # cycles per second
lfo.par.amplitude = 1.0
lfo.par.offset = 0.0
lfo.par.phase = 0.0 # 0-1, useful for offsetting parallel LFOs
```
**Drive a parameter via export:**
```python
op('/project1/geo1').par.rx.mode = ParMode.EXPRESSION
op('/project1/geo1').par.rx.expr = "op('rot_driver')['chan1'] * 360"
```
**Multiple synced LFOs (X/Y/Z rotation with phase offsets):**
Create one LFO with three channels and phase-offset each, or use three LFOs and offset their `phase` params (0.0, 0.33, 0.66).
---
## Timer CHOP — Triggered Sequences
For run-once animations, beat-locked sequences, or stage-based logic.
```python
timer = root.create(timerCHOP, 'fade_timer')
timer.par.length = 4.0 # cycle length in seconds
timer.par.cycle = False # run once vs. loop
timer.par.outputseconds = True
```
Output channels: `timer_fraction` (0→1 across the cycle), `running`, `done`, `cycles`.
**Start the timer:**
```python
timer.par.start.pulse()
```
**Drive a fade:**
```python
op('/project1/level1').par.opacity.mode = ParMode.EXPRESSION
op('/project1/level1').par.opacity.expr = "op('fade_timer')['timer_fraction']"
```
**Easing on the timer fraction** — apply in the expression itself:
```python
# Smoothstep: ease in/out
expr = "smoothstep(0, 1, op('fade_timer')['timer_fraction'])"
# Cubic ease-out: 1 - (1-t)^3
expr = "1 - pow(1 - op('fade_timer')['timer_fraction'], 3)"
```
---
## Pattern CHOP — Custom Curves
For arbitrary waveforms (saw ramps, easing curves, custom envelopes).
```python
pat = root.create(patternCHOP, 'envelope')
pat.par.type = 'gaussian' # 'gaussian' | 'ramp' | 'square' | 'sin' | etc.
pat.par.length = 60 # samples
pat.par.cyclelength = 1.0 # seconds at TD framerate
```
Combine with `lookupCHOP` to remap a 0-1 driver through a custom curve.
---
## Animation COMP — Keyframe-Based
For multi-keyframe motion graphics. Each animationCOMP holds channels with keyframes editable in the Animation Editor.
```python
anim = root.create(animationCOMP, 'intro_anim')
# By default has channels chan1..chanN; access via:
# op('intro_anim').par.length, .par.play, .par.cue, etc.
# Drive a parameter from a channel
op('/project1/text1').par.tx.mode = ParMode.EXPRESSION
op('/project1/text1').par.tx.expr = "op('intro_anim/out1')['chan1']"
```
**Keyframes are typically edited in the UI** (Animation Editor), but can be set via `keyframes` table internally. For programmatic keyframe creation, use `td_execute_python`:
```python
# Get the channel CHOP inside an animationCOMP
ch = op('/project1/intro_anim/chans')
# Insert a key (advanced API — verify with td_get_par_info(op_type='animationCOMP'))
ch.appendKey('chan1', frame=0, value=0.0, expression=None)
ch.appendKey('chan1', frame=120, value=1.0)
```
For most use cases, drive params with LFO/Timer/Pattern CHOPs instead — simpler and scriptable.
---
## Easing in Expressions
TD's expression evaluator supports Python math. Common easing forms:
```python
# Linear
"t"
# Smoothstep (classic ease-in-out)
"smoothstep(0, 1, t)"
# Ease-out cubic
"1 - pow(1 - t, 3)"
# Ease-in cubic
"pow(t, 3)"
# Ease-in-out cubic
"3*t*t - 2*t*t*t"
# Bounce (manual, simplified)
"abs(sin(t * 6.28 * 3) * (1 - t))"
```
Where `t` is `op('fade_timer')['timer_fraction']` or any 0-1 driver.
---
## Filter CHOP — Smoothing Existing Channels
Smooth out jittery values (e.g., audio analysis, sensor data) before driving visuals.
```python
filt = root.create(filterCHOP, 'smooth')
filt.par.filter = 'gaussian' # or 'lowpass'
filt.par.width = 0.5 # smoothing window in seconds
filt.inputConnectors[0].connect(op('raw_signal'))
```
**WARNING:** Do NOT use Filter CHOP on AudioSpectrum output in timeslice mode — it expands the sample count and averages bins to near-zero. See `audio-reactive.md`.
---
## Lag CHOP — Asymmetric Attack/Release
Different speeds for rising vs. falling values. Standard for visualizing audio envelopes.
```python
lag = root.create(lagCHOP, 'env_smooth')
lag.par.lag1 = 0.02 # attack (rise time, seconds)
lag.par.lag2 = 0.30 # release (fall time, seconds)
lag.inputConnectors[0].connect(op('raw_envelope'))
```
Fast attack, slow release = classic VU-meter feel.
---
## Per-Frame Driving via Script DAT
For complex per-frame logic that doesn't fit expressions, use a `executeDAT` (`onFrameStart` callback) or a `chopExecuteDAT`.
```python
# In an executeDAT (frameStart):
def onFrameStart(frame):
t = absTime.seconds
op('/project1/circle').par.tx = math.sin(t * 2.0) * 3.0
op('/project1/circle').par.ty = math.cos(t * 2.0) * 3.0
return
```
Heavy logic should still be in CHOPs (CPU-cheap, deterministic). Reserve scripts for one-shots or non-realtime branching.
---
## Pitfalls
1. **Frame rate dependency**`me.time.frame` is in TD project frames (default 60). If your project rate changes, motion speed changes. Use `seconds` for rate-independent timing.
2. **Cooking budget** — every CHOP that drives a parameter cooks every frame. Consolidate drivers (one big mathCHOP > many small ones).
3. **Expression mode** — params default to `CONSTANT`. `par.X.expr = ...` is ignored unless `par.X.mode = ParMode.EXPRESSION`.
4. **Animation editor edits** — keyframes set via UI live in the animationCOMP's internal keyframe table. They survive save/reopen. Programmatic keys via `appendKey()` work but verify the API with `td_get_docs(topic='animation')` first.
5. **Looping animations** — for seamless loops, `length` must equal `cyclelength` and the start/end values must match. Otherwise expect a visible jump.
---
## Quick Recipes
| Goal | Simplest path |
|---|---|
| Continuous rotation | LFO CHOP `type='ramp'`, expr → `geo.par.rx` |
| Fade in over 2s | Timer CHOP `length=2`, smoothstep expr → `level.par.opacity` |
| Pulse on every beat | `triggerCHOP` from audio → drive scale via expression |
| 3D Lissajous orbit | Two LFOs with different freq, drive `tx`/`ty`/`tz` |
| Random jitter | `noiseCHOP` (low-freq) added to position |
| Timed scene switch | Timer CHOP → switchTOP/CHOP `index` |
@@ -0,0 +1,352 @@
# DAT-Based Scripting Reference
TD's event/callback model — Python that runs in response to network events. The full set of "Execute DATs" plus their idiomatic patterns.
For arbitrary Python execution (not callback-based), see `python-api.md`. For the MCP's `td_execute_python` tool, see `mcp-tools.md`.
---
## The Execute DAT Family
Every type watches one kind of event source and fires Python on changes.
| DAT | Watches | Use for |
|---|---|---|
| `chopExecuteDAT` | A CHOP's channel values | Audio triggers, threshold callbacks, state machines on numeric input |
| `datExecuteDAT` | A DAT's content (table cells, text) | Reacting to data updates from APIs, parsing webDAT responses |
| `parameterExecuteDAT` | A parameter's value or pulse | Reacting to user-changed params, custom pulse buttons |
| `panelExecuteDAT` | A panel COMP's interaction | Button clicks, slider drags, field commits |
| `opExecuteDAT` | Operator lifecycle | New operator created, deleted, name changed |
| `executeDAT` | Project lifecycle, frame events | Run-once setup, per-frame logic, save/load hooks |
All have a docked DAT with predefined callback functions. You only fill in the bodies of the ones you care about.
---
## chopExecuteDAT — Numeric Triggers
```python
ce = root.create(chopExecuteDAT, 'kick_handler')
ce.par.chop = '/project1/audio/out_kick' # source CHOP
ce.par.offtoon = True # fire when channel rises above 0
ce.par.ontooff = False
ce.par.whileon = False
ce.par.valuechange = False
```
In the docked callback DAT:
```python
def offToOn(channel, sampleIndex, val, prev):
"""Channel went from 0 to non-zero. Classic beat trigger."""
op('/project1/strobe').par.flash.pulse()
op('/project1/scene').par.index = (op('/project1/scene').par.index + 1) % 8
return
def onToOff(channel, sampleIndex, val, prev):
"""Channel went from non-zero to 0."""
return
def whileOn(channel, sampleIndex, val, prev):
"""Fires every frame while channel is non-zero. Use sparingly."""
return
def valueChange(channel, sampleIndex, val, prev):
"""Fires every frame the value changes (continuous). Heavy."""
return
```
`channel` is a `Channel` object — `.name`, `.owner`, `.vals[]`. Use `channel.name == 'chan1'` to filter.
**Threshold-based custom triggers:** wire the source CHOP through a `triggerCHOP` first to get clean 0/1 pulses, then watch with `offtoon`.
---
## datExecuteDAT — Table/Text Changes
```python
de = root.create(datExecuteDAT, 'api_response')
de.par.dat = '/project1/api/web1' # source DAT
de.par.tablechange = True # any cell change
de.par.cellchange = False
de.par.rowchange = False
de.par.colchange = False
```
```python
def onTableChange(dat):
"""Whole table changed (including text DAT content updates)."""
if dat.numRows == 0:
return
# If it's a webDAT response, parse JSON
import json
try:
data = json.loads(dat.text)
except json.JSONDecodeError:
debug(f'Bad JSON: {dat.text[:100]}')
return
# Write to a CHOP
op('/project1/api_value').par.value0 = float(data.get('count', 0))
return
def onCellChange(dat, cells, prev):
"""Specific cells changed."""
for cell in cells:
# cell.row, cell.col, cell.val
pass
return
```
`debug()` prints to the textport — readable via `td_read_textport`.
---
## parameterExecuteDAT — Param Changes & Pulse
```python
pe = root.create(parameterExecuteDAT, 'comp_params')
pe.par.op = '/project1/my_component' # COMP whose params to watch
pe.par.parameters = '*' # or specific names like 'Intensity Reset'
pe.par.valuechange = True
pe.par.pulse = True
```
```python
def onValueChange(par, prev):
"""par is a Par object. par.name, par.eval(), par.owner."""
if par.name == 'Intensity':
op('/project1/bloom').par.threshold = par.eval()
return
def onPulse(par):
"""Pulse param was triggered."""
if par.name == 'Reset':
op('/project1/scene').par.index = 0
op('/project1/audio_player').par.cuepoint = 0
op('/project1/audio_player').par.cuepulse.pulse()
return
def onExpressionChange(par, val, prev):
"""User changed the expression on a param."""
return
def onExportChange(par, val, prev):
"""Export source changed."""
return
def onModeChange(par, val, prev):
"""Param mode changed (CONSTANT / EXPRESSION / EXPORT / etc)."""
return
```
---
## panelExecuteDAT — UI Events
For interactive control surfaces. See `panel-ui.md` for the full panel COMP context.
```python
pe = root.create(panelExecuteDAT, 'btn_handler')
pe.par.panel = '/project1/play_btn'
pe.par.click = True # mouse click events
pe.par.value = True # state changes (toggle)
pe.par.lockedchange = False
```
```python
def onOffToOn(panelValue):
"""Panel value rose to 1 (button pressed, slider crossed threshold)."""
op('/project1/scene_timer').par.start.pulse()
return
def onOnToOff(panelValue):
"""Panel value dropped to 0."""
return
def onValueChange(panelValue):
"""Continuous: every frame the value changes."""
val = panelValue.eval()
op('/project1/master').par.opacity = val
return
def onClick(panelValue):
"""Discrete click event, fires once per click."""
return
```
`panelValue` is a `Par` object on the panel COMP.
---
## opExecuteDAT — Operator Lifecycle
Watches creation/deletion/renaming of operators in a parent COMP.
```python
oe = root.create(opExecuteDAT, 'lifecycle')
oe.par.op = '/project1'
oe.par.create = True
oe.par.destroy = True
oe.par.namechange = True
oe.par.flagchange = False
```
```python
def onCreate(opCreated):
"""A new operator was created. Useful for auto-applying conventions."""
if opCreated.OPType == 'glslTOP':
# Always wrap with a null
n = opCreated.parent().create(nullTOP, opCreated.name + '_out')
n.inputConnectors[0].connect(opCreated)
return
def onDestroy(opDestroyed):
"""Operator was deleted. opDestroyed.path is still valid for one frame."""
return
def onNameChange(opChanged):
"""Operator was renamed."""
return
```
Useful for dev-time scaffolding (auto-create downstream nullTOPs, auto-name conventions). Disable in production projects to avoid surprise side effects.
---
## executeDAT — Project Lifecycle & Per-Frame
The catch-all. Gets you hooks into project start, save, load, frame-start, frame-end.
```python
exec_dat = root.create(executeDAT, 'lifecycle')
exec_dat.par.start = True
exec_dat.par.create = True
exec_dat.par.framestart = True
exec_dat.par.frameend = False
```
```python
def onStart():
"""Project just started cooking. Run once."""
op('/project1/scene').par.index = 0
debug('Project started')
return
def onCreate():
"""Component was just created (only fires for component executeDATs, not project root)."""
return
def onFrameStart(frame):
"""Per-frame, BEFORE network cooks. Heavy logic here = bottleneck."""
return
def onFrameEnd(frame):
"""Per-frame, AFTER network cooks. Use for capture, recording, post-network logic."""
return
def onPlayStateChange(playing):
"""Project play/pause toggled."""
return
def onProjectPreSave():
"""Right before saving the .toe file."""
return
def onProjectPostSave():
return
```
Heavy per-frame logic in `onFrameStart` is one of the top performance regressions in TD projects. Use CHOPs for per-frame computation, scripts for events.
---
## Pattern: Triggering an Animation Sequence on Beat
```python
# Source: a kick trigger CHOP
# Goal: on each kick, run a 1.5s scale pulse + color flash
# Setup (create once)
animator = root.create(timerCHOP, 'pulse_anim')
animator.par.length = 1.5
animator.par.cycle = False
# Param expressions on visual targets:
op('logo').par.sx.expr = "1.0 + (1 - op('pulse_anim')['timer_fraction']) * 0.3"
op('logo').par.sx.mode = ParMode.EXPRESSION
op('logo').par.sy.expr = "1.0 + (1 - op('pulse_anim')['timer_fraction']) * 0.3"
op('logo').par.sy.mode = ParMode.EXPRESSION
# In a chopExecuteDAT watching the kick CHOP:
def offToOn(channel, sampleIndex, val, prev):
op('pulse_anim').par.start.pulse()
return
```
---
## Pattern: Live Editing a CHOP from API Data
```python
# webDAT polls an API every 5 seconds
# datExecuteDAT parses the response and writes to a constantCHOP
def onTableChange(dat):
import json
try:
data = json.loads(dat.text)
except:
return
target = op('/project1/external_state')
target.par.name0 = 'temperature'
target.par.value0 = float(data['temp_c'])
target.par.name1 = 'humidity'
target.par.value1 = float(data['humidity'])
return
```
Visuals just reference `op('external_state')['temperature']` — they update live.
---
## Pattern: Self-Cleaning Network
```python
# An opExecuteDAT watching for orphaned helper ops, deleting them after their parent disappears
def onDestroy(opDestroyed):
parent_name = opDestroyed.name
helper = op(f'/project1/{parent_name}_helper')
if helper:
helper.destroy()
return
```
---
## Pitfalls
1. **Callbacks crash silently** — exceptions print to the textport but don't show up in the UI. Always `td_clear_textport` before debugging, then `td_read_textport` after.
2. **`debug()` vs `print()`** — both write to textport, but `debug()` includes the file/line of the calling DAT. Prefer `debug()` for scripts.
3. **`val` is the new value, `prev` is old** — easy to swap. Always: `def offToOn(channel, sampleIndex, val, prev)`. Check parameter order in TD docs if confused.
4. **`whileOn` and `valueChange` are per-frame** — heavy. Avoid unless absolutely needed. Drive via expressions instead.
5. **Callbacks don't run during cooking-paused state** — if the parent COMP has `allowCooking=False`, callbacks freeze. Useful for "disable me" toggles.
6. **`par` vs `panelValue`** — parameterExecuteDAT gives `par` (a Par object), panelExecuteDAT gives `panelValue` (also a Par-like object). Both have `.name` and `.eval()` but their context differs.
7. **`opExecuteDAT` fires for itself** — when you create an opExecuteDAT, it can fire `onCreate` for itself if `par.create=True` and parent matches. Filter by `if opCreated == me: return`.
8. **Reload behavior** — when reloading an extension (`td_reinit_extension`), all callback DATs reset their internal state. Module-level vars are lost. Persist state in tableDATs or the docked DAT itself, not in module globals.
9. **Cooking dependencies** — if a callback writes to an op that's upstream of the callback's source, you get a cooking loop. TD warns about it but doesn't always block. Keep dataflow one-directional.
10. **Active flag** — every Execute DAT has `par.active`. False = silent. Easy to toggle for testing without deleting wiring.
---
## Quick Recipes
| Goal | Setup |
|---|---|
| Beat trigger | `chopExecuteDAT.par.offtoon=True` watching a `triggerCHOP` |
| API response handler | `datExecuteDAT.par.tablechange=True` watching a `webDAT` |
| Custom button → action | `parameterExecuteDAT.par.pulse=True` watching a custom pulse param |
| Slider → continuous param | `panelExecuteDAT.par.value=True` watching a `sliderCOMP` |
| Run-once setup | `executeDAT.par.start=True` with logic in `onStart()` |
| Per-frame metrics | `executeDAT.par.frameend=True` recording values to a CHOP |
| Auto-name new ops | `opExecuteDAT.par.create=True` enforcing naming conventions |
@@ -0,0 +1,322 @@
# External Data Reference
Network and device I/O — HTTP requests, WebSockets, MQTT, Serial, TCP, UDP. For MIDI/OSC specifically see `midi-osc.md`.
Common production needs:
- API polling / webhook ingestion
- Real-time data streams (sensors, market data, chat)
- IoT device control (Arduino, ESP32, smart lights)
- Inter-application messaging
- Hosting a tiny TD-side HTTP server for remote control
---
## Web DAT — HTTP Requests
```python
web = root.create(webDAT, 'api_call')
web.par.url = 'https://api.example.com/v1/status'
web.par.fetchmethod = 'get' # 'get' | 'post' | 'put' | 'delete'
web.par.format = 'auto' # 'auto' | 'text' | 'json'
web.par.timeout = 5.0
```
**Triggering a request:**
`webDAT` does NOT auto-fetch on cook. Trigger explicitly:
```python
web.par.fetch.pulse()
```
Or via expression on a CHOP value-change (chopExecuteDAT — see `dat-scripting.md`).
**Authentication headers:**
Use `webclientDAT` (more flexible) or set `webDAT` headers via the headers DAT:
```python
web_headers = root.create(tableDAT, 'headers')
web_headers.appendRow(['Authorization', 'Bearer YOUR_TOKEN'])
web_headers.appendRow(['Accept', 'application/json'])
web.par.headers = web_headers.path
```
**Parsing JSON response:**
```python
import json
def onTableChange(dat):
response = dat.text # raw response body
data = json.loads(response)
# Update a tableDAT or store in a constantCHOP for downstream use
op('/project1/api_status').par.value0 = data['count']
return
```
Wire this in a `datExecuteDAT` watching the webDAT.
**Polling pattern:**
```python
# timerCHOP fires every N seconds
timer = root.create(timerCHOP, 'poll_timer')
timer.par.length = 5.0
timer.par.cycle = True
# chopExecuteDAT on the timer's 'cycles' channel pulses the webDAT
def offToOn(channel, sampleIndex, val, prev):
op('/project1/api_call').par.fetch.pulse()
return
```
---
## Web Client DAT — More Robust HTTP
`webclientDAT` is the modern replacement for `webDAT` — supports streaming responses, chunked transfer, custom auth.
```python
client = root.create(webclientDAT, 'api')
client.par.method = 'POST'
client.par.url = 'https://api.example.com/events'
client.par.uploadtype = 'json'
client.par.uploaddata = '{"event": "scene_change", "scene": 3}'
client.par.request.pulse()
```
Output goes to its child `webclient1_response` DAT. Use a `datExecuteDAT` to react.
---
## Web Server DAT — TD as HTTP Server
Hosts a tiny HTTP server inside TD. Useful for:
- Status/health endpoints
- Remote control from a phone or another machine
- Webhook receivers from external services
```python
server = root.create(webserverDAT, 'control_server')
server.par.port = 8080
server.par.active = True
# Define handler in the docked callback DAT
```
In the auto-created `webserver1_callbacks` DAT:
```python
def onHTTPRequest(webServerDAT, request, response):
path = request['uri']
if path == '/status':
response['statusCode'] = 200
response['data'] = '{"fps": 60, "scene": "active"}'
elif path == '/scene':
idx = int(request['args'].get('index', 0))
op('/project1/scene_switch').par.index = idx
response['statusCode'] = 200
response['data'] = 'OK'
else:
response['statusCode'] = 404
response['data'] = 'Not Found'
return response
```
Test from terminal: `curl http://localhost:8080/status`.
**Security:** No auth by default. Bind to localhost only or add a token check in the callback. Never expose to the public internet without auth.
---
## WebSocket DAT — Bidirectional Real-Time
For low-latency bidirectional streams (chat, live data feeds, controllers).
### Client
```python
ws = root.create(websocketDAT, 'ws_client')
ws.par.netaddress = 'wss://api.example.com/socket'
ws.par.active = True
```
In the docked callbacks DAT:
```python
def onConnect(dat):
dat.sendText('{"action": "subscribe", "channel": "ticks"}')
return
def onReceiveText(dat, rowIndex, message):
# message is a string; parse JSON, dispatch to ops
import json
data = json.loads(message)
op('/project1/price_chop').par.value0 = data['price']
return
def onDisconnect(dat):
# Optionally schedule a reconnect
return
```
### Server
```python
ws = root.create(websocketDAT, 'ws_server')
ws.par.mode = 'server'
ws.par.port = 9001
ws.par.active = True
```
Same callback structure with an additional `clientID` arg.
---
## MQTT — Pub/Sub for IoT
```python
mqtt = root.create(mqttClientDAT, 'iot')
mqtt.par.brokeraddress = 'broker.hivemq.com'
mqtt.par.brokerport = 1883
mqtt.par.clientid = 'td_install_01'
mqtt.par.connect.pulse()
# Subscribe in callbacks DAT:
def onConnect(dat):
dat.subscribe('home/lights/+', qos=1)
return
def onReceive(dat, topic, payload, qos, retained, dup):
# payload is bytes — decode if JSON
msg = payload.decode('utf-8')
# Dispatch by topic
return
# Publish from anywhere:
op('iot').publish('show/scene', 'sunset', qos=0, retain=False)
```
For Mosquitto / HiveMQ self-hosted brokers use the same setup with `tcp://192.168.x.x` and your local port.
---
## Serial DAT — Arduino, USB Devices
```python
serial = root.create(serialDAT, 'arduino')
serial.par.port = '/dev/cu.usbmodem14101' # macOS — check Arduino IDE
# Windows: 'COM3', 'COM4', etc.
serial.par.baudrate = 115200
serial.par.active = True
```
In callbacks:
```python
def onReceive(dat, rowIndex, line):
# Each newline-terminated line from Arduino arrives here
parts = line.split(',')
op('/project1/sensors').par.value0 = float(parts[0])
op('/project1/sensors').par.value1 = float(parts[1])
return
```
Send to Arduino:
```python
op('arduino').send('LED_ON\n')
```
---
## TCP/IP DAT — Custom Protocols
For talking to non-HTTP servers (game servers, custom protocols, legacy systems).
```python
tcp = root.create(tcpipDAT, 'show_control')
tcp.par.netaddress = '192.168.1.50'
tcp.par.port = 7000
tcp.par.protocol = 'tcp' # 'tcp' | 'udp'
tcp.par.active = True
```
Send / receive via callbacks similar to websocketDAT.
For UDP-only (fire-and-forget, no connection), use `udpoutDAT` + `udpinDAT` — simpler but unreliable across networks.
---
## Common Patterns
### REST API → Visual
```
timerCHOP (5s loop)
→ chopExecuteDAT (pulse webDAT.par.fetch on cycle)
→ webDAT (returns JSON)
→ datExecuteDAT (parse, write to constantCHOP)
→ CHOP drives glsl uniform → visuals
```
### Webhook receiver
```
webserverDAT (port 8080, /webhook endpoint)
→ callback writes to a tableDAT log + triggers a scene change
```
### Real-time stock/crypto ticker
```
websocketDAT (subscribe to feed)
→ onReceiveText callback parses JSON
→ writes to constantCHOP
→ drives bar chart / typography animation
```
### IoT-controlled installation
```
MQTT → callback dispatches by topic
→ /lights/main → constantCHOP drives lighting render
→ /audio/volume → mathCHOP for master fader
```
### Two-way phone control
```
WebSocket server in TD
→ simple HTML page on phone connects, sends slider values
→ callback writes to ops
→ TD pushes status back via dat.sendText() to phone UI
```
---
## Pitfalls
1. **`webDAT` doesn't auto-fetch** — must explicitly pulse `par.fetch`. Easy to forget.
2. **Blocking on slow APIs**`webDAT` runs on the cook thread. A 30s API call freezes TD for 30s. Use `webclientDAT` (async) for anything potentially slow.
3. **WebSocket reconnection** — TD does NOT auto-reconnect on disconnect. Implement backoff in `onDisconnect`.
4. **Serial port permissions on macOS** — TD needs Full Disk Access OR the port needs to be unlocked via `sudo chmod 666 /dev/cu.usbmodem...` per session.
5. **MQTT broker connection state**`mqttClientDAT` may show `connected=true` but messages don't flow if QoS is wrong or topic ACL blocks. Check broker logs.
6. **JSON parse errors crash callbacks silently** — wrap parses in try/except and log to textport. Otherwise the callback just stops firing.
7. **Firewall on Windows** — first time `webserverDAT` binds, Windows pops a firewall dialog. Approve it or the server is unreachable.
8. **CORS**`webserverDAT` doesn't add CORS headers by default. If serving a webapp from a different origin, add `Access-Control-Allow-Origin: *` in the response.
9. **Polling vs push** — polling burns API quota. Always prefer WebSocket / webhook / MQTT for high-frequency data.
10. **Floating-point parsing** — sensor data over Serial often comes as strings. `float()` will crash on `'\n'` or `'NaN'`. Validate before converting.
---
## Quick Recipes
| Goal | Op chain |
|---|---|
| Periodic API fetch | `timerCHOP``chopExecuteDAT` pulses → `webDAT``datExecuteDAT` parses |
| Webhook receiver | `webserverDAT` (port + path), callback writes to ops |
| Real-time stream | `websocketDAT` client → onReceiveText → CHOP/DAT |
| Arduino sensor → visual | `serialDAT` → callback → `constantCHOP` → expression on visual op |
| TD ↔ phone control | `websocketDAT` server + simple HTML page on phone |
| MQTT IoT integration | `mqttClientDAT` subscribe → callback dispatches by topic |
@@ -0,0 +1,211 @@
# MIDI / OSC Reference
External controller input and output — MIDI hardware, TouchOSC mobile UIs, OSC routing across the network.
For audio-driven MIDI patterns (track triggers from spectrum analysis), see also `audio-reactive.md`.
---
## MIDI Input — Hardware Controllers
### Discovery
List connected MIDI devices first. Use a `midiinDAT` to enumerate:
```python
mdat = root.create(midiinDAT, 'mid_devices')
# Read available device names from the DAT after one cook
```
Or via Python directly:
```python
# In td_execute_python
import td
devices = [d for d in op.MIDI.devices] # verify with td_get_docs('midi')
```
Verify the API with `td_get_docs(topic='midi')` since this varies between TD versions.
### MIDI In CHOP
Standard pattern:
```python
midi_in = root.create(midiinCHOP, 'midi_in')
midi_in.par.device = 0 # device index from discovery
midi_in.par.activechan = True
```
Output channels follow the convention `chCcN` and `chCnN`:
- `ch1c74` — channel 1, CC 74
- `ch1n60` — channel 1, note 60 (middle C) — value is velocity 0-127
**Map a CC to a parameter:**
```python
op('/project1/bloom1').par.threshold.mode = ParMode.EXPRESSION
op('/project1/bloom1').par.threshold.expr = "op('midi_in')['ch1c74'][0] / 127.0"
```
**Map a note as a trigger:**
Notes in `midiinCHOP` output velocity while held, 0 when released. Use a `triggerCHOP` to convert a held note into pulses:
```python
trig = root.create(triggerCHOP, 'note_trig')
trig.par.threshold = 1
trig.par.triggeron = 'increase'
trig.inputConnectors[0].connect(op('midi_in'))
# Filter to a single channel via a selectCHOP if desired
```
### MIDI Learn Pattern
Build a reusable learn pattern when you don't know the controller's CC layout in advance:
1. Drop a `midiinCHOP` and `selectCHOP` after it.
2. User wiggles the controller knob.
3. Use `td_read_chop` on the midiinCHOP to identify which channel is non-zero — that's the active CC.
4. Set the `selectCHOP.par.channames` to that channel name.
5. Save the mapping to a `tableDAT` so it persists across sessions.
---
## MIDI Output
```python
midi_out = root.create(midioutCHOP, 'midi_out')
midi_out.par.device = 0
midi_out.par.outputformat = 'continuous' # 'continuous' | 'event'
# Drive an output: send out a CC mapped from any 0-1 source
src = root.create(constantCHOP, 'cc_src')
src.par.name0 = 'ch1c20'
src.par.value0 = 0.5
midi_out.inputConnectors[0].connect(src)
```
For note events specifically, use `event` mode and pulse the value with a `pulseCHOP` or `triggerCHOP`.
---
## OSC Input — Network Control
OSC is the more flexible cousin of MIDI. Used heavily for:
- TouchOSC / Lemur mobile control surfaces
- Show control systems (QLab, Watchout)
- Inter-application sync (Ableton via Max for Live, Resolume, etc.)
### OSC In CHOP
```python
osc_in = root.create(oscinCHOP, 'osc_in')
osc_in.par.port = 7000 # listen on UDP 7000
osc_in.par.localaddress = '' # empty = all interfaces
osc_in.par.queued = False # immediate vs. queued processing
```
Each incoming OSC address becomes a channel. `/scene/1/intensity` becomes a channel named `scene_1_intensity` (TD sanitizes slashes to underscores).
**Common gotcha:** TD only creates the channel after the FIRST message arrives at that address. Send a "hello" message from the controller during setup, or pre-declare channel names manually.
### OSC In DAT (for raw events)
Use a `oscinDAT` when you need full message access (multiple typed args, addresses with brackets/regex).
```python
osc_dat = root.create(oscinDAT, 'osc_events')
osc_dat.par.port = 7001
# Each row: timestamp, address, type tags, args...
```
Drive logic via a `datExecuteDAT` watching the `oscinDAT`:
```python
def onTableChange(dat):
last = dat[dat.numRows - 1, 'message']
parsed = last.val.split()
addr = parsed[0]
args = parsed[1:]
if addr == '/scene/trigger':
op('/project1/scene_switcher').par.index = int(args[0])
return
```
---
## OSC Output — Sending to External Apps
```python
osc_out = root.create(oscoutCHOP, 'osc_out')
osc_out.par.netaddress = '127.0.0.1' # destination IP
osc_out.par.port = 9000
# Channel names become OSC addresses
src = root.create(constantCHOP, 'send')
src.par.name0 = 'scene/intensity' # → /scene/intensity
src.par.value0 = 0.7
osc_out.inputConnectors[0].connect(src)
```
**Channel-to-address mapping:** TD prepends `/` automatically. Use `/` in channel names to nest.
For one-shot string/typed messages, use `oscoutDAT` and call `.sendOSC(address, args)`:
```python
op('osc_out_dat').sendOSC('/scene/trigger', [1, 'fade'])
```
---
## TouchOSC / Mobile UI Pattern
Common setup for live VJ control from a phone/tablet:
1. **Configure TouchOSC layout** — assign each control an OSC address like `/vj/master`, `/vj/scene/1`, etc.
2. **Find your machine's LAN IP** — TouchOSC needs to point at it.
3. **TD listens** on `oscinCHOP.par.port = 8000` (or whichever).
4. **Map channels to params** via expressions:
```python
op('/project1/master_level').par.opacity.mode = ParMode.EXPRESSION
op('/project1/master_level').par.opacity.expr = "op('osc_in')['vj_master']"
```
5. **Send feedback** to the controller via `oscoutCHOP` — useful for syncing state across multiple devices.
---
## Network / Multi-Machine
OSC over LAN works out-of-the-box. For multi-TD-instance sync (e.g., projection cluster):
- One TD acts as **master**, broadcasts `/sync/...` over OSC
- Worker TDs run `oscinCHOP` listening on the same port
- Use UDP **broadcast address** (e.g., `192.168.1.255`) on the master's `oscoutCHOP.par.netaddress` to hit all peers
For reliability over WAN, use `webserverDAT` or `websocketDAT` with an external relay instead — UDP loss is invisible.
---
## Pitfalls
1. **MIDI device indexing** — device `0` is whichever device TD enumerated first. Reorder may shift it. Pin by name when possible.
2. **OSC channel names** — TD doesn't create a channel until the first message lands. New channels invalidate cooked dependents on first arrival, causing a one-frame stutter.
3. **OSC queued mode**`par.queued = True` defers processing to a single per-frame batch. Lower latency but messages arriving same frame collapse to the last value. Off for triggers, on for continuous knobs.
4. **MIDI clock vs. transport**`midiinCHOP` reports clock if available. Use `midisyncCHOP` (if your TD version exposes it) or compute BPM from clock pulses (24 per quarter note).
5. **Latency** — wired MIDI is ~1-3ms. WiFi OSC is 10-30ms with jitter. Use wired for tight beat-locked work.
6. **Port conflicts** — only one process can bind a UDP port on most OS. If `oscinCHOP` shows no traffic, check that another app (Max, Ableton, etc.) isn't already listening on that port.
---
## Quick Recipes
| Goal | Op chain |
|---|---|
| Knob → bloom intensity | `midiinCHOP` → expression on `bloom.par.threshold` |
| Note → scene change | `midiinCHOP``triggerCHOP``selectCHOP` → drive `switchTOP.par.index` |
| Phone slider → master fader | TouchOSC `/master``oscinCHOP` → expression on output `level.par.opacity` |
| TD → Resolume scene trigger | `oscoutCHOP` channel `composition/layers/1/clips/1/connect` → Resolume listening on 7000 |
| Multi-projector sync | Master TD `oscoutCHOP` broadcast → workers `oscinCHOP` |
@@ -0,0 +1,281 @@
# Panel & UI Reference
Interactive control surfaces inside TouchDesigner — buttons, sliders, fields, custom parameter pages, panel callbacks. For HUD overlays (rendered text on visuals) see `layout-compositor.md`.
Use cases:
- VJ control rack (master fader, scene buttons, FX toggles)
- Installation operator console
- Self-contained TOX components with their own parameter UIs
- Phone-style touch interfaces displayed on a tablet
---
## Two Layers of UI
| Layer | What it is | Use for |
|---|---|---|
| **Custom Parameters** | Params on any COMP, edited like built-in TD params | Configurable components, presets, "settings" panels |
| **Panel COMPs** | Visible widgets (button, slider, field) inside a containerCOMP | Interactive control surfaces, real-time UIs |
Combine both: build a containerCOMP with panel widgets that read/write custom parameters on a parent component.
---
## Custom Parameters
Add user-editable params to any COMP. Params persist with the COMP, drive expressions, and survive save/reload.
```python
# Add a custom page to a baseCOMP
comp = op('/project1/my_component')
page = comp.appendCustomPage('Controls')
# Add typed params
page.appendFloat('Intensity', label='Intensity')[0] # returns a Par
page.appendInt('Count', label='Count')[0]
page.appendToggle('Enabled', label='Enabled')[0]
page.appendMenu('Mode', menuNames=['off', 'soft', 'hard'], menuLabels=['Off', 'Soft', 'Hard'])[0]
page.appendStr('Title', label='Title')[0]
page.appendRGB('Color', label='Color') # returns 3 pars
page.appendXY('Offset', label='Offset') # returns 2 pars
page.appendPulse('Reset', label='Reset')[0]
page.appendFile('TextureFile', label='Texture')[0]
```
**Read/write from anywhere:**
```python
val = op('/project1/my_component').par.Intensity.eval()
op('/project1/my_component').par.Intensity = 0.7
```
**Drive other params via expression:**
```python
op('bloom1').par.threshold.mode = ParMode.EXPRESSION
op('bloom1').par.threshold.expr = "op('/project1/my_component').par.Intensity"
```
**Pulse handler (Reset button):**
Use a `parameterExecuteDAT` watching the COMP's pulse params. See `dat-scripting.md`.
---
## Panel COMPs — The Widgets
Each is a COMP that renders as a clickable/draggable widget inside a `containerCOMP`.
| Type | Type Name | Use |
|---|---|---|
| Button | `buttonCOMP` | Click action — momentary or toggle |
| Slider | `sliderCOMP` | Drag to set 0-1 value (1D or 2D) |
| Field | `fieldCOMP` | Text input |
| Container | `containerCOMP` | Layout + visual styling, holds children |
| Select | `selectCOMP` | Reference and display content from another COMP |
| List | `listCOMP` | Scrollable list with row callbacks |
### Button
```python
btn = root.create(buttonCOMP, 'play_btn')
btn.par.w = 120; btn.par.h = 40
btn.par.buttontype = 'momentary' # 'momentary' | 'toggleup' | 'togglepress' | 'radio'
btn.par.bgcolorr = 0.1; btn.par.bgcolorg = 0.1; btn.par.bgcolorb = 0.1
btn.par.text = 'Play'
# Read state
state = btn.panel.state # 1 when active
```
### Slider
```python
sld = root.create(sliderCOMP, 'master_fader')
sld.par.w = 60; sld.par.h = 300
sld.par.style = 'vertical' # 'vertical' | 'horizontal' | 'xy'
sld.par.value0min = 0.0
sld.par.value0max = 1.0
# Drive a parameter via expression (always-on, no callback needed)
op('/project1/master_level').par.opacity.mode = ParMode.EXPRESSION
op('/project1/master_level').par.opacity.expr = "op('master_fader').panel.u"
```
`panel.u` and `panel.v` give the 0-1 normalized values. For 2D sliders both are populated.
### Field (Text Input)
```python
fld = root.create(fieldCOMP, 'scene_name')
fld.par.w = 200; fld.par.h = 30
fld.par.fieldtype = 'string' # 'string' | 'integer' | 'float'
# Read current text
text = fld.panel.field # the text content
```
### List
For scrollable lists with selectable rows, use the docked `list1_callbacks` DAT to handle row interactions. Set up cells via the `list_definition` table DAT.
---
## Container COMP — Layout & Styling
`containerCOMP` is the primary parent for grouping widgets and arranging layouts.
```python
panel = root.create(containerCOMP, 'control_panel')
panel.par.w = 400; panel.par.h = 600
panel.par.bgcolorr = 0.05
panel.par.bgcolorg = 0.05
panel.par.bgcolorb = 0.05
panel.par.bgalpha = 1.0
# Layout child panels in vertical stack
panel.par.align = 'lefttoright' # 'lefttoright' | 'toptobottom' | etc.
```
Children are positioned automatically based on `par.align`. For absolute positioning use `par.align = 'fillresize'` and set each child's `par.x` / `par.y`.
### Layout Strategies
| `par.align` | Behavior |
|---|---|
| `lefttoright` | Children stacked horizontally |
| `toptobottom` | Children stacked vertically |
| `righttoleft` / `bottomtotop` | Reversed stacks |
| `fillresize` | Children sized to fill, manual positioning |
| `top` / `bottom` / `left` / `right` | Fixed positioning |
For complex grids: nest containers — vertical container holding horizontal containers.
---
## Panel Callbacks — Reacting to Events
`panelExecuteDAT` watches a panel and fires Python callbacks on user interaction.
```python
pe = root.create(panelExecuteDAT, 'btn_handler')
pe.par.panel = '/project1/play_btn'
pe.par.click = True # respond to clicks
pe.par.value = True # respond to value changes
```
In its docked DAT:
```python
def onOffToOn(panelValue):
# Click pressed
op('/project1/scene_timer').par.start.pulse()
return
def onOnToOff(panelValue):
# Click released
return
def onValueChange(panelValue):
# Slider drag, field change, etc.
new_val = panelValue.eval()
op('/project1/master').par.opacity = new_val
return
```
For pulse params on custom-parameter pages, use a `parameterExecuteDAT` instead.
---
## Building a Complete VJ Control Panel
End-to-end pattern:
```python
# 1. Top-level container
panel = root.create(containerCOMP, 'vj_control')
panel.par.w = 800; panel.par.h = 200
panel.par.align = 'lefttoright'
# 2. Master fader column
master_col = panel.create(containerCOMP, 'master')
master_col.par.w = 120; master_col.par.h = 200
master_col.par.align = 'toptobottom'
master_label = master_col.create(textTOP, 'lbl')
master_label.par.text = 'MASTER'
master_sld = master_col.create(sliderCOMP, 'fader')
master_sld.par.w = 60; master_sld.par.h = 150
master_sld.par.style = 'vertical'
# 3. Scene buttons row
scene_col = panel.create(containerCOMP, 'scenes')
scene_col.par.w = 400; scene_col.par.h = 200
scene_col.par.align = 'lefttoright'
for i in range(8):
b = scene_col.create(buttonCOMP, f'scene_{i+1}')
b.par.w = 50; b.par.h = 50
b.par.text = str(i+1)
b.par.buttontype = 'radio' # only one active at a time
# 4. FX toggle column
fx_col = panel.create(containerCOMP, 'fx')
fx_col.par.w = 280; fx_col.par.h = 200
fx_col.par.align = 'toptobottom'
for fx in ['Bloom', 'CRT', 'Glitch', 'Strobe']:
t = fx_col.create(buttonCOMP, fx.lower())
t.par.w = 220; t.par.h = 35
t.par.text = fx
t.par.buttontype = 'toggleup'
# 5. Display in a window
win = root.create(windowCOMP, 'control_win')
win.par.winop = panel.path
win.par.winw = 800; win.par.winh = 200
win.par.borders = True
win.par.winopen.pulse()
```
Then wire panel values to ops via expressions or panelExecuteDATs.
---
## Showing the Panel — Window or Embedded
| Approach | When |
|---|---|
| `windowCOMP` pointing at panel | Standalone control surface, separate display |
| Render the containerCOMP via `renderTOP` | Composite UI over visuals (HUD-style) |
| Use a `panelCOMP` directly inside a network editor pane | Designer/dev preview only — panel is fully interactive |
For a touch-screen tablet, use a `windowCOMP` on a second display routed to the tablet's HDMI input.
---
## Pitfalls
1. **Panel won't respond to clicks** — likely `par.disabled = True` or the parent container has `par.disableinputs = True`. Check the panel hierarchy.
2. **Slider value not updating**`panel.u/v` reads the visual position. If you set `par.value0` directly, the visual lags. Use `par.value0` AS the source of truth and let the slider follow.
3. **Custom param won't appear** — must call `appendCustomPage` first, then append params. Pages with no params don't show.
4. **Custom param disappears on reload** — params added via Python at runtime persist only if the COMP is saved AFTER. Use a `tox` save (`comp.save('mycomp.tox')`) or commit via `td_execute_python` then save the project.
5. **Event callback fires twice** — both `onOffToOn` and `onValueChange` may fire on a single button press. Pick one to handle the action; don't double-trigger.
6. **Pulse params need `.pulse()`** — setting `par.X = True` on a pulse param does nothing. Always use `.pulse()`.
7. **Field text doesn't commit until Tab/Enter** — fields don't fire callbacks while typing. Use `par.committemode = 'all'` to fire on every keystroke (heavy).
8. **`par.text` vs panel content** — `buttonCOMP.par.text` is the LABEL on the button. The button's STATE is `panel.state` (0/1). Don't confuse them.
9. **Touch input on macOS** — multi-touch via direct touch panels works but TD's gesture handling is rudimentary. For complex multi-touch (pinch/rotate), use TouchOSC on a tablet instead.
10. **Layout doesn't update** — changing `par.align` requires the container to re-cook. Touch a child or pulse the container to trigger.
---
## Quick Recipes
| Goal | Setup |
|---|---|
| Master fader | `sliderCOMP` (vertical) → expression on `level.par.opacity` |
| Scene picker | 8 `buttonCOMP` (radio) → `selectCHOP` on their state → drive `switchTOP.par.index` |
| FX toggle | `buttonCOMP` (toggleup) → expression on `bypass` of an FX op |
| Numeric input | `fieldCOMP` (float) → expression on target par |
| Component settings | Custom params on the component COMP, panel widgets inside drive them |
| Touch tablet UI | `containerCOMP` with widgets → `windowCOMP` to second display |
| Status display | `textTOP` rendered into the panel via `selectCOMP` |
@@ -0,0 +1,245 @@
# Particles Reference
Particle systems in TouchDesigner — modern POPs (Particle Operators) and the legacy particleSOP path.
For instancing static geometry (without per-instance lifetime/velocity), see `geometry-comp.md`. For GLSL-driven feedback simulations (no particle abstraction), see `operator-tips.md` (Feedback TOP section).
Always call `td_get_par_info` for the op type before setting params. Param names below reflect TD 2025.32 — verify before relying on them.
---
## Two Paths: POPs vs. SOPs
| | **POP family** (modern) | **particleSOP** (legacy) |
|---|---|---|
| GPU? | Yes (compute) | No (CPU) |
| Particle count | 100k+ comfortably | ~5k before slowdown |
| API style | Source / Force / Solver / Render chain | Single op with many params |
| Use for | New projects, anything intensive | Quick demos, low counts, TD < 2023 |
**Default to POPs.** Only fall back to particleSOP if a POP variant of an op you need doesn't exist.
---
## POP Pipeline Overview
A POP system is a chain of operators inside a `geometryCOMP`:
```
popSourceTOP / popSourceSOP ← spawn new particles
popForceTOP (gravity, wind, etc.)
popForceTOP (attractor, vortex, ...)
popDeleteTOP (lifetime, bounds)
popSolverTOP ← integrates velocity, updates positions
[render via geometryCOMP / glslMAT instancing]
```
POP buffers carry standard channels: `P` (position), `v` (velocity), `life`, `id`, `Cd` (color), plus any custom channels you add.
---
## Minimal POP Setup
```python
# Create a geometry COMP to hold the POP network
geo = root.create(geometryCOMP, 'particles_geo')
# 1. Source — emit particles from a point
src = geo.create(popSourceTOP, 'src')
src.par.birthrate = 500 # per second
src.par.life = 4.0 # seconds
# 2. Gravity force
grav = geo.create(popForceTOP, 'gravity')
grav.par.forcetype = 'gravity'
grav.par.fy = -9.8
# 3. Lifetime cleanup
delp = geo.create(popDeleteTOP, 'cull')
delp.par.condition = 'lifeleq' # delete when life <= 0
delp.par.value = 0
# 4. Solver
solv = geo.create(popSolverTOP, 'solver')
solv.par.timestep = 'frame'
# Wire: source → force → delete → solver
src.outputConnectors[0].connect(grav.inputConnectors[0])
grav.outputConnectors[0].connect(delp.inputConnectors[0])
delp.outputConnectors[0].connect(solv.inputConnectors[0])
```
The `popSolverTOP` output IS the live particle buffer. Render it via `glslMAT` instancing on a small SOP (sphere, point) as the "shape" of each particle.
---
## Common Forces
| Force type | Effect | Common params |
|---|---|---|
| `gravity` | Constant directional pull | `fx`, `fy`, `fz` |
| `wind` | Constant velocity addition | `wx`, `wy`, `wz` |
| `drag` | Velocity damping over time | `dragstrength` |
| `noise` | Curl-noise turbulence | `noiseamp`, `noisefreq`, `noiseseed` |
| `attractor` | Pull toward a point | `position`, `strength`, `falloff` |
| `vortex` | Swirl around an axis | `axis`, `strength` |
| `point` (custom) | GLSL-evaluated arbitrary force | via `popforceadvancedTOP` |
Stack multiple `popForceTOP`s in series — each modifies velocity additively.
---
## Lifecycle Patterns
### Continuous emission (e.g. smoke plume)
```python
src.par.birthrate = 800
src.par.life = 6.0 # variance via 'lifevariance'
src.par.lifevariance = 1.5
```
### Burst emission (e.g. explosion)
```python
src.par.birthrate = 0 # no continuous emission
src.par.burst.pulse() # one burst on demand (verify param name)
src.par.burstcount = 5000
src.par.life = 1.5
```
### Beat-triggered burst
Wire a `triggerCHOP` (from audio or MIDI) to pulse the burst:
```python
op('/project1/audio_kick_trigger').outputConnectors[0].connect(...)
# Then via a chopExecuteDAT, on each kick:
def offToOn(channel, sampleIndex, val, prev):
op('/project1/particles_geo/src').par.burst.pulse()
return
```
---
## Rendering Particles
### Point Sprites (simplest)
```python
# Inside the geometryCOMP, render the solver output directly
# The geo's first SOP child becomes the geometry
# But for POPs, we typically render via glslMAT on a small "shape"
# Simple billboard sphere per particle:
shape = geo.create(sphereSOP, 'shape')
shape.par.rad = 0.05
shape.par.rows = 6; shape.par.cols = 6 # low-poly to keep it fast
# Material that uses POP buffer for instancing
mat = root.create(glslMAT, 'particle_mat')
# Configure mat.par.instancingTOP = solver output (verify param name)
```
The exact instancing setup varies by TD version — call `td_get_hints(topic='popInstancing')` (or `popRender` / `instancing` — try a few).
### GPU Sprites via glslcopyPOP
For dense smoke/fire-like effects, use a `glslcopyPOP` that writes per-particle color/size from a compute shader, then render as point sprites with additive blending in a `renderTOP`.
---
## Collisions
```python
# Collision detection against an SOP
coll = geo.create(popCollideTOP, 'ground_coll')
coll.par.collidewithsop = '/project1/ground_geo' # path to colliding SOP
coll.par.bounce = 0.3
coll.par.friction = 0.1
# Insert between force and solver
```
For plane/box collisions only, use `popPlaneCollideTOP` (cheaper).
---
## Custom Per-Particle Data
Add a custom channel via `popAttribCreateTOP` (or by writing through `glslcopyPOP`):
```python
# Add a "phase" attribute initialized random per-particle, used in render shader
attr = geo.create(popAttribCreateTOP, 'add_phase')
attr.par.attribname = 'phase'
attr.par.value0 = 'rand(@id)' # expression in TD's POP attribute language
```
Then in the render shader, `texture(sTDPOPInputs[0].phase, ...)` (or whichever sampler convention your TD version uses — verify with `td_get_docs(topic='pops')`).
---
## Legacy particleSOP (Use Sparingly)
For quick demos or low-count systems:
```python
# Inside a geo
psrc = geo.create(addSOP, 'point_src') # source: a single point
psrc.par.points = '0 0 0'
part = geo.create(particleSOP, 'particles')
part.par.life = 3.0
part.par.birthrate = 100
part.par.gravityy = -9.8
part.par.windx = 0.5
part.inputConnectors[0].connect(psrc)
```
CPU-bound. Beyond ~5,000 active particles you'll see frame drops.
---
## Pitfalls
1. **Particles don't appear** — usually a render-side issue. Check via `td_get_screenshot` on the solver output (renders the buffer as a TOP-like view in newer TD). Then check the `geometryCOMP`'s render path.
2. **Burst won't fire** — verify the `burst` param is a pulse, not a toggle. Pulses must use `.pulse()`, not `= True`.
3. **Particles teleport on first frame** — uninitialized velocity. Set `popSourceTOP.par.initialvelocityX/Y/Z` or zero them explicitly.
4. **Gravity feels wrong** — TD's "1 unit" depends on your scene scale. Start with `fy = -1.0` and scale up rather than using real-world 9.8.
5. **High birthrate = stuttering** — birthrate is per-second, not per-frame. At 60fps, `birthrate = 6000` is 100/frame which is fine; `birthrate = 600000` will tank.
6. **POP solver order matters** — forces apply in the order they appear in the chain. Putting gravity AFTER drag dampens gravity itself; usually not what you want.
7. **Instancing param name varies**`mat.par.instancingTOP` vs. `mat.par.instanceop` vs. `mat.par.instances` differs across TD versions. Always check `td_get_par_info(op_type='glslMAT')`.
8. **Cooking dependency loops** — POP solvers create implicit time-loops. The "cook dependency loop" warning is expected and harmless for POPs.
9. **CHOP-driven force values** — when a force param is expression-bound to a CHOP (e.g., audio-reactive gravity), make sure the CHOP cooks before the solver. If not, force lags by one frame.
---
## Performance Targets
| Particle count | Setup | Frame budget @ 60fps |
|---|---|---|
| < 1k | particleSOP fine | trivial |
| 1k - 10k | POPs, simple forces | ~2-5ms |
| 10k - 100k | POPs, GPU-only forces | ~5-15ms |
| 100k+ | `glslcopyPOP`, custom compute | ~10-25ms |
| 1M+ | Custom GPU buffer, no POP framework | depends on shader |
Use `td_get_perf` to find which op in the POP chain is the bottleneck.
---
## Quick Recipes
| Goal | Pipeline |
|---|---|
| Smoke plume | `popSourceTOP` (point) → gravity + wind + noise → `popDeleteTOP` (life) → solver → glslMAT instancing |
| Beat-triggered burst | `triggerCHOP` (audio) → chopExecuteDAT pulses `popSourceTOP.par.burst` |
| Fireworks shell | Burst at point → drag + gravity → secondary burst on lifetime threshold |
| Snow/rain | Continuous emission across XZ plane (high y), gravity + small wind, infinite life box-deleted |
| Sparks | Burst, very short life (0.3s), bright additive render, motion blur via feedback |
| Audio particles | Birthrate driven by audio envelope, color driven by frequency band |
@@ -0,0 +1,211 @@
# Projection Mapping Reference
Multi-window output, surface mapping, edge blending, and projector calibration patterns for installation/event work.
For HUD layouts and on-screen panel grids, see `layout-compositor.md`. For wireframe/test-pattern generation, see `operator-tips.md`.
---
## Window COMP — Output to a Display
The `windowCOMP` is how TD pushes pixels to a real display.
```python
win = root.create(windowCOMP, 'output_window')
win.par.winop = '/project1/final_out' # path to the TOP being displayed
win.par.winw = 1920
win.par.winh = 1080
win.par.winoffsetx = 0 # screen-space offset
win.par.winoffsety = 0
win.par.borders = False # no chrome
win.par.alwaysontop = True
win.par.cursor = False # hide cursor in fullscreen
win.par.justify = 'fillaspect' # 'fill' | 'fitaspect' | 'fillaspect' | 'native'
win.par.winopen.pulse() # OPEN the window
```
To target a specific physical display, set `par.location`:
```python
win.par.location = 'secondary' # 'primary' | 'secondary' | 'monitor1' | 'monitor2' | ...
```
Or set absolute coordinates using `winoffsetx/y` matched to your OS display layout.
**Always pulse `winopen` — setting params alone doesn't open the window.**
---
## Multi-Window Output
For multi-projector or multi-display setups, create one `windowCOMP` per output, each pointing at a different TOP.
```python
for i, screen_top in enumerate(['out_left', 'out_center', 'out_right']):
w = root.create(windowCOMP, f'win_{i}')
w.par.winop = f'/project1/{screen_top}'
w.par.winw = 1920; w.par.winh = 1080
w.par.winoffsetx = i * 1920
w.par.winoffsety = 0
w.par.borders = False
w.par.alwaysontop = True
w.par.cursor = False
w.par.winopen.pulse()
```
For ultra-wide single-output spans, use ONE windowCOMP at e.g. 5760×1080 spanning three projectors via the GPU's mosaic/spanning mode (Nvidia Mosaic, AMD Eyefinity), then split content via `cropTOP` per screen inside TD.
---
## 4-Point Corner Pin (Quad Warp)
The simplest projection mapping primitive — warping a rectangle onto a quadrilateral.
```python
# Source content
src = op('/project1/scene_out')
# Manual: cornerPinTOP (TD has this built-in)
cp = root.create(cornerPinTOP, 'corner_pin')
cp.par.tlx = 0.05; cp.par.tly = 0.10 # top-left (normalized 0-1)
cp.par.trx = 0.95; cp.par.try = 0.08 # top-right
cp.par.brx = 0.93; cp.par.bry = 0.92 # bottom-right
cp.par.blx = 0.07; cp.par.bly = 0.94 # bottom-left
cp.inputConnectors[0].connect(src)
```
Alternative: use a `geometryCOMP` with a `gridSOP` and bend the verts in vertex GLSL. More flexible (curved surfaces) but more setup.
Verify TD 2025.32 param names with `td_get_par_info(op_type='cornerPinTOP')`.
---
## Bezier / Mesh Warp (Curved Surfaces)
For non-flat surfaces (domes, columns, curved walls), use a subdivided mesh and per-vertex displacement.
### Pattern: Grid Mesh + GLSL Displacement
```python
# Subdivided grid in a geo
geo = root.create(geometryCOMP, 'warp_geo')
grid = geo.create(gridSOP, 'warp_grid')
grid.par.rows = 32 # higher = smoother curve
grid.par.cols = 32
grid.par.sizex = 2; grid.par.sizey = 2
# Texture the source onto it
mat = root.create(constMAT, 'warp_mat') # use constMAT for unlit projection
mat.par.maptop = '/project1/scene_out' # source TOP
geo.par.material = mat.path
# Render to a TOP that goes to the projector window
cam = root.create(cameraCOMP, 'cam_proj')
cam.par.tz = 4
render = root.create(renderTOP, 'projection_out')
render.par.camera = cam.path
render.par.geometry = geo.path
render.par.outputresolution = 'custom'
render.par.resolutionw = 1920; render.par.resolutionh = 1080
```
For per-vertex offsets, write a vertex GLSL on the constMAT (or use `glslMAT`) and read displacement values from a CHOP via uniform.
Calibration is iterative: render a checkerboard from `scene_out`, project it, photograph the projection, manually nudge corner/grid points until aligned.
---
## Edge Blending (Multi-Projector Overlap)
When two projectors overlap, the overlap region is twice as bright. Blend by ramping each projector's edge alpha to 0 across the overlap zone.
### GLSL Edge Blend Shader
Per-projector output pass that fades the inside edge to black:
```glsl
// edge_blend_pixel.glsl
out vec4 fragColor;
uniform float uBlendLeft; // overlap width on left edge (0-0.5, 0=no blend)
uniform float uBlendRight;
uniform float uGamma; // typically 2.2 — perceptual ramp
void main() {
vec2 uv = vUV.st;
vec4 col = texture(sTD2DInputs[0], uv);
float aL = (uBlendLeft > 0.0) ? smoothstep(0.0, uBlendLeft, uv.x) : 1.0;
float aR = (uBlendRight > 0.0) ? smoothstep(0.0, uBlendRight, 1.0 - uv.x) : 1.0;
float a = pow(aL * aR, uGamma);
fragColor = TDOutputSwizzle(vec4(col.rgb * a, 1.0));
}
```
Apply this to each overlap-touching projector's output. Tune `uBlendLeft` / `uBlendRight` to match your physical overlap.
For top/bottom blends or cylindrical setups, extend the shader with `uBlendTop` / `uBlendBottom`.
---
## Calibration Patterns
Useful test patterns for aligning projectors. Build a `switchTOP` selecting one of these, route to all projector windows during setup.
```python
# Solid white — for brightness/uniformity check
white = root.create(constantTOP, 'cal_white')
white.par.colorr = 1.0; white.par.colorg = 1.0; white.par.colorb = 1.0
# Centered crosshair — for keystone alignment
gridcross = root.create(textTOP, 'cal_cross')
gridcross.par.text = '+'
gridcross.par.fontsizex = 200
# Fine grid — for warp/mesh alignment (use rampTOP + math + threshold, or build via GLSL)
# Color bars for projector color calibration
bars = root.create(rampTOP, 'cal_bars')
bars.par.type = 'horizontal'
```
Or use the bundled `testpatternTOP` if your TD version includes it.
---
## Projection Audit Workflow
When debugging a multi-screen setup:
1. Render a unique color and label per output (`textTOP` saying "LEFT", "CENTER", "RIGHT").
2. Check that each window is sourcing the correct path: `td_get_operator_info(path='/project1/win_0')`.
3. Verify display assignment: walk to each projector and confirm visually.
4. Check resolution: physical projector native res vs. TD output res — mismatches cause scaling artifacts.
5. Cook flag: `td_get_perf` — if a window's source TOP isn't cooking, the projector shows last frame frozen.
---
## Pitfalls
1. **Window won't open** — you forgot `winopen.pulse()`. Setting params alone doesn't open it.
2. **Wrong display**`par.location='secondary'` depends on OS display order. Set `winoffsetx/y` to absolute coords as a more reliable override.
3. **Cursor visible** — set `par.cursor = False` BEFORE opening, or close+reopen.
4. **Black projection** — usually a cooking issue. Verify `final_out` TOP is cooking via `td_get_perf`. Check `td_get_errors` recursively from `/`.
5. **Tearing / vsync**`windowCOMP` honors `par.vsync`. For projection always set `vsync='vsync'` (default). Tearing means GPU is over-budget — reduce render resolution.
6. **Aspect mismatch** — projector native is often 1920×1200 (16:10) not 1080. Use `justify='fitaspect'` or render at native projector res.
7. **Non-Commercial license** — caps total resolution at 1280×1280. For real installation work you need Commercial. Pro license adds 4K+.
8. **Multiple monitors on macOS**`windowCOMP` honors macOS Spaces. Disable Spaces or pin TD to a specific display in System Settings before showtime.
---
## Quick Recipes
| Goal | Approach |
|---|---|
| Single fullscreen output | One `windowCOMP`, `justify='fillaspect'`, `winopen.pulse()` |
| 3-projector wide span | 3 `windowCOMP` + per-output `cropTOP` from one wide source |
| Single quad surface | `cornerPinTOP``windowCOMP` |
| Curved/dome | Subdivided gridSOP with vertex GLSL → `renderTOP``windowCOMP` |
| Edge blend overlap | GLSL fade shader per projector → `windowCOMP` |
| Calibration mode | `switchTOP` between scene and test patterns, hot-key triggered |
@@ -0,0 +1,198 @@
# Replicator COMP Reference
The `replicatorCOMP` clones a template operator N times, driven by a table of data. The fundamental TD pattern for data-driven networks: button grids, scene rosters, dynamic UI, parameter panels per-channel.
For visual instancing (per-pixel/per-render copies), see `geometry-comp.md`. Replicator builds NETWORK NODES; instancing builds RENDER COPIES. Different layer.
---
## Concept
```
[Template OP] [Data tableDAT]
│ │
└─────→ replicatorCOMP ←───────┘
[N clones], one per data row
Each clone gets per-row params
```
Edit the template once → all clones inherit. Edit the table → clones add/remove dynamically. Push parameter overrides per-row.
---
## Minimal Setup
```python
# 1. Make a template (the thing to clone)
template = root.create(buttonCOMP, 'btn_template')
template.par.w = 80; template.par.h = 80
template.par.text = 'X'
template.par.bgcolorr = 0.2
# 2. Make a data table (one row per clone)
data = root.create(tableDAT, 'scene_data')
data.appendRow(['name', 'color_r', 'color_g', 'color_b'])
data.appendRow(['Sunset', 1.0, 0.4, 0.0])
data.appendRow(['Midnight', 0.0, 0.1, 0.4])
data.appendRow(['Storm', 0.3, 0.3, 0.5])
data.appendRow(['Forest', 0.0, 0.5, 0.2])
# 3. Replicator — points at template + data
rep = root.create(replicatorCOMP, 'scene_buttons')
rep.par.template = template.path
rep.par.opfromdat = data.path
rep.par.namefromdatname = 'name' # use 'name' column for clone names
rep.par.incrementalnumbering = False
```
After cooking, the replicator creates 4 child COMPs named `Sunset`, `Midnight`, `Storm`, `Forest` (one per non-header row), each cloned from `btn_template`.
---
## Per-Row Parameter Overrides
The replicator's docked `replicator1_callbacks` DAT lets you customize each clone:
```python
def onReplicate(comp, allOps, newOps, template, master):
"""Called once per replicate cycle. newOps is the list of just-created clones."""
data = op('scene_data')
for i, clone in enumerate(newOps):
row = i + 1 # +1 to skip header
clone.par.text = data[row, 'name'].val
clone.par.bgcolorr = float(data[row, 'color_r'].val)
clone.par.bgcolorg = float(data[row, 'color_g'].val)
clone.par.bgcolorb = float(data[row, 'color_b'].val)
return
```
Or use parameter expressions referencing `digits` (the per-clone index, available as a built-in expression token inside the cloned subtree):
```python
# Inside the template, set a param expression like:
# par.value0.expr = "op('../scene_data')[me.digits + 1, 'value']"
```
`me.digits` resolves to the row index of the current clone. This is the cleanest way for static reference patterns — no callback needed.
---
## Layout: Buttons in a Grid
Drop the replicator inside a `containerCOMP` with auto-layout:
```python
panel = root.create(containerCOMP, 'scene_panel')
panel.par.w = 400; panel.par.h = 100
panel.par.align = 'lefttoright'
# Move the replicator inside
rep.parent = panel.path # or create rep as a child of panel directly
```
Each clone is a child of the replicator (which itself is a child of the panel). The panel auto-arranges everything.
For a 2D grid, set `par.align = 'fillresize'` on the container and override `par.x` / `par.y` per clone in the callback based on row/col index.
---
## Updating Without Rebuilding
When the data table changes, the replicator regenerates the clones. By default it destroys and recreates everything. To preserve state, set:
```python
rep.par.recreatemissing = True # only add/remove changed rows
rep.par.recreateallonchange = False
```
This pattern is essential for live-edit scenarios (designer adjusts table, network keeps running).
For incremental data ingestion (e.g., from a `webDAT` polling an API), have a `datExecuteDAT` watch the response, parse, write to the data table, and the replicator self-updates.
---
## Common Patterns
### Scene Roster (Data → Buttons + Logic)
```python
# Data per scene: name, file path, audio track, BPM
scene_data.appendRow(['name', 'file', 'audio', 'bpm'])
scene_data.appendRow(['Intro', '/scenes/intro.tox', '/audio/intro.wav', 110])
scene_data.appendRow(['Main', '/scenes/main.tox', '/audio/main.wav', 128])
# Replicator clones a buttonCOMP per scene
# Each button's onClick callback loads the corresponding tox + cues audio
```
### Dynamic Parameter Panel
For a list of audio bands, generate a fader strip per band:
```python
# Data: band names (sub, low, mid, hi-mid, high, air)
# Template: containerCOMP with label + sliderCOMP
# Replicator clones N strips
# Each slider's value is read at /audio_eq/{band_name}/fader
```
### Procedural Visual Network
Build a multi-channel visual network from a config file:
```python
# Data: which TOPs to chain, per "scene"
# Template: a baseCOMP with placeholder children
# Replicator builds one baseCOMP per scene; each scene contains a custom chain
# Switch between scenes via switchTOP.par.index driven by panel
```
### Per-Channel CHOP Display
Visualize each channel of a multi-channel CHOP separately:
```python
# Data table: one row per channel (auto-extracted via choptodatDAT)
# Template: a small chopVis COMP showing one channel
# Replicator generates N visualizers stacked vertically
```
---
## Replicator vs. Pure Python Loop
| Approach | When to use |
|---|---|
| **replicatorCOMP** | The set of clones changes (add/remove rows live). Visual editor expectations. Pattern is reusable across projects. |
| **Python loop** (in `td_execute_python`) | One-shot generation. Static set. Simpler logic, no template overhead. Faster to write. |
If you'll only ever build the network once, prefer a Python loop with `td_execute_python`. The replicator earns its weight when data is live.
---
## Pitfalls
1. **Header row**`tableDAT` rows are 0-indexed. If you have a header, your first data row is index 1. Off-by-one bugs are common in callbacks.
2. **`namefromdatname` column missing** — replicator silently uses `digits` (numeric suffix) names. Buttons end up named `1`, `2`, `3` instead of meaningful names. Set `par.namefromdatname` explicitly.
3. **Template lives in network** — the template OP is itself a real network node. Don't connect things downstream of it directly; connect to the clones (or use a `nullCOMP` between).
4. **Recreate-on-change wipes state** — toggles, slider positions, and uncached data inside clones are lost on each regeneration. Use `recreatemissing` to preserve.
5. **`onReplicate` doesn't fire on edit** — only fires when the clone set changes. Editing a value WITHIN an existing row doesn't re-trigger. Use `parameterExecuteDAT` or expressions for per-cell live updates.
6. **Custom params on clones** — pages added in the template propagate. Pages added in `onReplicate` don't survive the next regeneration. Always add custom pages on the template, not the clone.
7. **Cooking storms** — adding many rows fast triggers many clone events. Bundle adds via Python and call `data.cook(force=True)` once at the end.
8. **`me.digits` outside replicator children** — `me.digits` only resolves inside an op that's a descendant of the replicator. Don't reference it in unrelated networks.
9. **Cross-clone references** — referencing a sibling clone via relative path works from inside a clone (`op('../OtherClone/x')`), but breaks if names change. Prefer absolute paths via the data table.
---
## Quick Recipes
| Goal | Setup |
|---|---|
| 8-button scene picker | `tableDAT` (8 rows) + `buttonCOMP` template + `replicatorCOMP` |
| Per-band EQ strip panel | `tableDAT` (band names) + container template (label + slider) + replicator |
| Data-driven visual scenes | `tableDAT` (scene config) + `baseCOMP` template (visual chain) + replicator |
| Live-updating clone set | Same as above + `par.recreatemissing = True` |
| Per-row colored UI | Data table with color cols, `onReplicate` callback sets per-clone colors |
| List from API response | `webDAT``datExecuteDAT` parses JSON → writes to data table → replicator updates |
+226
View File
@@ -242,6 +242,232 @@ class TestSummaryFailureCooldown:
assert mock_call.call_count == 1
class TestSummaryFallbackToMainModel:
"""When ``summary_model`` differs from the main model and the summary LLM
call fails, the compressor should retry once on the main model before
giving up losing N turns of context is almost always worse than one
extra summary attempt. Covers both the fast-path (explicit
model-not-found errors) and the unknown-error best-effort retry."""
def _msgs(self):
return [
{"role": "user", "content": "do something"},
{"role": "assistant", "content": "ok"},
]
def test_model_not_found_404_falls_back_to_main_and_succeeds(self):
"""Classic misconfiguration: ``auxiliary.compression.model`` points at
a model the main provider doesn't serve → 404 → retry on main."""
mock_ok = MagicMock()
mock_ok.choices = [MagicMock()]
mock_ok.choices[0].message.content = "summary via main model"
err_404 = Exception("404 model_not_found: no such model")
err_404.status_code = 404
with patch("agent.context_compressor.get_model_context_length", return_value=100000):
c = ContextCompressor(
model="main-model",
summary_model_override="broken-aux-model",
quiet_mode=True,
)
with patch(
"agent.context_compressor.call_llm",
side_effect=[err_404, mock_ok],
) as mock_call:
result = c._generate_summary(self._msgs())
assert mock_call.call_count == 2
# First call used the misconfigured aux model
assert mock_call.call_args_list[0].kwargs.get("model") == "broken-aux-model"
# Second call used the main model (no model kwarg → call_llm uses main)
assert "model" not in mock_call.call_args_list[1].kwargs
assert result is not None
assert "summary via main model" in result
# Aux-model failure is recorded even though retry succeeded — this is
# how callers (gateway /compress, CLI warning) know to tell the user
# their auxiliary.compression.model setting is broken.
assert c._last_aux_model_failure_model == "broken-aux-model"
assert c._last_aux_model_failure_error is not None
assert "404" in c._last_aux_model_failure_error
def test_unknown_error_falls_back_to_main_and_succeeds(self):
"""Errors that don't match the 404/503/model_not_found fast-path
(400s, provider-specific 'no route', aggregator rejections) should
ALSO trigger a best-effort retry on main before entering cooldown."""
mock_ok = MagicMock()
mock_ok.choices = [MagicMock()]
mock_ok.choices[0].message.content = "summary via main model"
# A 400 from OpenRouter / Nous portal with an opaque message — does
# NOT match _is_model_not_found, but still an unrecoverable misconfig.
err_400 = Exception("400 Bad Request: provider rejected model")
err_400.status_code = 400
with patch("agent.context_compressor.get_model_context_length", return_value=100000):
c = ContextCompressor(
model="main-model",
summary_model_override="broken-aux-model",
quiet_mode=True,
)
with patch(
"agent.context_compressor.call_llm",
side_effect=[err_400, mock_ok],
) as mock_call:
result = c._generate_summary(self._msgs())
assert mock_call.call_count == 2
assert mock_call.call_args_list[0].kwargs.get("model") == "broken-aux-model"
assert "model" not in mock_call.call_args_list[1].kwargs
assert result is not None
assert "summary via main model" in result
# Aux-model failure recorded despite successful recovery
assert c._last_aux_model_failure_model == "broken-aux-model"
assert c._last_aux_model_failure_error is not None
assert "400" in c._last_aux_model_failure_error
def test_no_fallback_when_summary_model_equals_main_model(self):
"""If the aux model IS the main model, there's nowhere to fall back
to go straight to cooldown, don't loop retrying the same call."""
err = Exception("500 internal error")
with patch("agent.context_compressor.get_model_context_length", return_value=100000):
c = ContextCompressor(
model="main-model",
summary_model_override="main-model", # same as main
quiet_mode=True,
)
with patch(
"agent.context_compressor.call_llm",
side_effect=err,
) as mock_call:
result = c._generate_summary(self._msgs())
# Only one attempt — retry gate blocks fallback when models match
assert mock_call.call_count == 1
assert result is None
# Not flagged as fallen back — the retry condition was never met
assert getattr(c, "_summary_model_fallen_back", False) is False
def test_fallback_only_happens_once_per_compressor(self):
"""If the retry-on-main ALSO fails, don't loop forever — enter
cooldown like the normal failure path."""
err1 = Exception("400 aux model rejected")
err2 = Exception("500 main model also exploded")
with patch("agent.context_compressor.get_model_context_length", return_value=100000):
c = ContextCompressor(
model="main-model",
summary_model_override="broken-aux-model",
quiet_mode=True,
)
with patch(
"agent.context_compressor.call_llm",
side_effect=[err1, err2],
) as mock_call:
result = c._generate_summary(self._msgs())
# Exactly 2 calls: initial + one retry on main. No further retries.
assert mock_call.call_count == 2
assert result is None
assert c._summary_model_fallen_back is True
class TestAuxModelFallbackSurfacedToCallers:
"""When summary_model fails but retry-on-main succeeds, compress() must
expose the aux-model failure via _last_aux_model_failure_{model,error}
so gateway /compress and CLI callers can warn the user about their
broken auxiliary.compression.model config silent recovery would hide
a misconfiguration only the user can fix."""
def _make_msgs(self):
return [
{"role": "system", "content": "sys"},
{"role": "user", "content": "msg 1"},
{"role": "assistant", "content": "msg 2"},
{"role": "user", "content": "msg 3"},
{"role": "assistant", "content": "msg 4"},
{"role": "user", "content": "msg 5"},
{"role": "assistant", "content": "msg 6"},
{"role": "user", "content": "msg 7"},
]
def test_compress_exposes_aux_failure_fields_after_successful_fallback(self):
mock_ok = MagicMock()
mock_ok.choices = [MagicMock()]
mock_ok.choices[0].message.content = "summary via main"
err_400 = Exception("400 provider rejected configured model")
err_400.status_code = 400
with patch("agent.context_compressor.get_model_context_length", return_value=100000):
c = ContextCompressor(
model="main-model",
summary_model_override="broken-aux-model",
quiet_mode=True,
protect_first_n=2,
protect_last_n=2,
)
with patch(
"agent.context_compressor.call_llm",
side_effect=[err_400, mock_ok],
):
result = c.compress(self._make_msgs())
# Recovery succeeded → no fallback placeholder
assert c._last_summary_fallback_used is False
# But aux-model failure IS recorded for the gateway/CLI warning
assert c._last_aux_model_failure_model == "broken-aux-model"
assert c._last_aux_model_failure_error is not None
assert "400" in c._last_aux_model_failure_error
# Result is well-formed with a real summary, not a placeholder
assert any(
isinstance(m.get("content"), str) and "summary via main" in m["content"]
for m in result
)
def test_compress_clears_aux_failure_fields_at_start_of_next_call(self):
"""A subsequent successful compression must clear the aux-failure
fields so the warning doesn't persist forever."""
mock_ok = MagicMock()
mock_ok.choices = [MagicMock()]
mock_ok.choices[0].message.content = "summary via main"
err_400 = Exception("400 aux model busted")
err_400.status_code = 400
with patch("agent.context_compressor.get_model_context_length", return_value=100000):
c = ContextCompressor(
model="main-model",
summary_model_override="broken-aux-model",
quiet_mode=True,
protect_first_n=2,
protect_last_n=2,
)
# Call 1: aux fails, retry-on-main succeeds
with patch(
"agent.context_compressor.call_llm",
side_effect=[err_400, mock_ok],
):
c.compress(self._make_msgs())
assert c._last_aux_model_failure_model == "broken-aux-model"
# Call 2: clean run on main (summary_model was cleared to "" after
# first fallback). Aux-failure fields MUST reset at compress() start
# so the old warning state doesn't leak into this call.
with patch(
"agent.context_compressor.call_llm",
return_value=mock_ok,
):
c.compress(self._make_msgs())
assert c._last_aux_model_failure_model is None
assert c._last_aux_model_failure_error is None
class TestSummaryFailureTrackingForGatewayWarning:
"""When summary generation fails, the compressor must record dropped count
+ fallback flag so gateway hygiene & /compress can surface a visible
+2 -2
View File
@@ -10,7 +10,7 @@ import unittest
from pathlib import Path
from unittest.mock import patch
from agent.copilot_acp_client import CopilotACPClient
from acp_adapter.copilot_client import CopilotACPClient
class _FakeProcess:
@@ -100,7 +100,7 @@ class CopilotACPClientSafetyTests(unittest.TestCase):
target = home / ".ssh" / "id_rsa"
target.parent.mkdir(parents=True, exist_ok=True)
with patch("agent.copilot_acp_client.is_write_denied", return_value=True, create=True):
with patch("acp_adapter.copilot_client.is_write_denied", return_value=True, create=True):
response = self._dispatch(
{
"jsonrpc": "2.0",
+7 -7
View File
@@ -71,17 +71,17 @@ class TestMinimaxThinkingSupport:
class TestMinimaxAuxModel:
"""Verify auxiliary model is standard (not highspeed)."""
"""Verify auxiliary model is standard (not highspeed) — now reads from profiles."""
def test_minimax_aux_is_standard(self):
from agent.auxiliary_client import _API_KEY_PROVIDER_AUX_MODELS
assert _API_KEY_PROVIDER_AUX_MODELS["minimax"] == "MiniMax-M2.7"
assert _API_KEY_PROVIDER_AUX_MODELS["minimax-cn"] == "MiniMax-M2.7"
from agent.auxiliary_client import _get_aux_model_for_provider
assert _get_aux_model_for_provider("minimax") == "MiniMax-M2.7"
assert _get_aux_model_for_provider("minimax-cn") == "MiniMax-M2.7"
def test_minimax_aux_not_highspeed(self):
from agent.auxiliary_client import _API_KEY_PROVIDER_AUX_MODELS
assert "highspeed" not in _API_KEY_PROVIDER_AUX_MODELS["minimax"]
assert "highspeed" not in _API_KEY_PROVIDER_AUX_MODELS["minimax-cn"]
from agent.auxiliary_client import _get_aux_model_for_provider
assert "highspeed" not in _get_aux_model_for_provider("minimax")
assert "highspeed" not in _get_aux_model_for_provider("minimax-cn")
class TestMinimaxBetaHeaders:
+51 -17
View File
@@ -73,17 +73,21 @@ class TestChatCompletionsBuildKwargs:
assert kw["tools"] == tools
def test_openrouter_provider_prefs(self, transport):
from providers import get_provider_profile
profile = get_provider_profile("openrouter")
msgs = [{"role": "user", "content": "Hi"}]
kw = transport.build_kwargs(
model="gpt-4o", messages=msgs,
is_openrouter=True,
provider_profile=profile,
provider_preferences={"only": ["openai"]},
)
assert kw["extra_body"]["provider"] == {"only": ["openai"]}
def test_nous_tags(self, transport):
from providers import get_provider_profile
profile = get_provider_profile("nous")
msgs = [{"role": "user", "content": "Hi"}]
kw = transport.build_kwargs(model="gpt-4o", messages=msgs, is_nous=True)
kw = transport.build_kwargs(model="gpt-4o", messages=msgs, provider_profile=profile)
assert kw["extra_body"]["tags"] == ["product=hermes-agent"]
def test_reasoning_default(self, transport):
@@ -95,29 +99,36 @@ class TestChatCompletionsBuildKwargs:
assert kw["extra_body"]["reasoning"] == {"enabled": True, "effort": "medium"}
def test_nous_omits_disabled_reasoning(self, transport):
from providers import get_provider_profile
profile = get_provider_profile("nous")
msgs = [{"role": "user", "content": "Hi"}]
kw = transport.build_kwargs(
model="gpt-4o", messages=msgs,
provider_profile=profile,
supports_reasoning=True,
is_nous=True,
reasoning_config={"enabled": False},
)
# Nous rejects enabled=false; reasoning omitted entirely
assert "reasoning" not in kw.get("extra_body", {})
def test_ollama_num_ctx(self, transport):
from providers import get_provider_profile
profile = get_provider_profile("custom")
msgs = [{"role": "user", "content": "Hi"}]
kw = transport.build_kwargs(
model="llama3", messages=msgs,
provider_profile=profile,
ollama_num_ctx=32768,
)
assert kw["extra_body"]["options"]["num_ctx"] == 32768
def test_custom_think_false(self, transport):
from providers import get_provider_profile
profile = get_provider_profile("custom")
msgs = [{"role": "user", "content": "Hi"}]
kw = transport.build_kwargs(
model="qwen3", messages=msgs,
is_custom_provider=True,
provider_profile=profile,
reasoning_config={"effort": "none"},
)
assert kw["extra_body"]["think"] is False
@@ -142,23 +153,29 @@ class TestChatCompletionsBuildKwargs:
assert kw["max_tokens"] == 2048
def test_nvidia_default_max_tokens(self, transport):
"""NVIDIA max_tokens=16384 is now set via ProviderProfile, not legacy flag."""
from providers import get_provider_profile
profile = get_provider_profile("nvidia")
msgs = [{"role": "user", "content": "Hi"}]
kw = transport.build_kwargs(
model="glm-4.7", messages=msgs,
is_nvidia_nim=True,
model="nvidia/llama-3.1-405b-instruct",
messages=msgs,
max_tokens_param_fn=lambda n: {"max_tokens": n},
provider_profile=profile,
)
# NVIDIA default: 16384
assert kw["max_tokens"] == 16384
def test_qwen_default_max_tokens(self, transport):
from providers import get_provider_profile
profile = get_provider_profile("qwen-oauth")
msgs = [{"role": "user", "content": "Hi"}]
kw = transport.build_kwargs(
model="qwen3-coder-plus", messages=msgs,
is_qwen_portal=True,
provider_profile=profile,
max_tokens_param_fn=lambda n: {"max_tokens": n},
)
# Qwen default: 65536
# Qwen default: 65536 from profile.default_max_tokens
assert kw["max_tokens"] == 65536
def test_anthropic_max_output_for_claude_on_aggregator(self, transport):
@@ -181,14 +198,23 @@ class TestChatCompletionsBuildKwargs:
assert kw["service_tier"] == "priority"
def test_fixed_temperature(self, transport):
"""Fixed temperature is now set via ProviderProfile.fixed_temperature."""
from providers.base import ProviderProfile
msgs = [{"role": "user", "content": "Hi"}]
kw = transport.build_kwargs(model="gpt-4o", messages=msgs, fixed_temperature=0.6)
kw = transport.build_kwargs(
model="gpt-4o", messages=msgs,
provider_profile=ProviderProfile(name="_t", fixed_temperature=0.6),
)
assert kw["temperature"] == 0.6
def test_omit_temperature(self, transport):
"""Omit temperature is set via ProviderProfile with OMIT_TEMPERATURE sentinel."""
from providers.base import ProviderProfile, OMIT_TEMPERATURE
msgs = [{"role": "user", "content": "Hi"}]
kw = transport.build_kwargs(model="gpt-4o", messages=msgs, omit_temperature=True, fixed_temperature=0.5)
# omit wins
kw = transport.build_kwargs(
model="gpt-4o", messages=msgs,
provider_profile=ProviderProfile(name="_t", fixed_temperature=OMIT_TEMPERATURE),
)
assert "temperature" not in kw
@@ -196,18 +222,22 @@ class TestChatCompletionsKimi:
"""Regression tests for the Kimi/Moonshot quirks migrated into the transport."""
def test_kimi_max_tokens_default(self, transport):
from providers import get_provider_profile
profile = get_provider_profile("kimi-coding")
kw = transport.build_kwargs(
model="kimi-k2", messages=[{"role": "user", "content": "Hi"}],
is_kimi=True,
provider_profile=profile,
max_tokens_param_fn=lambda n: {"max_tokens": n},
)
# Kimi CLI default: 32000
# Kimi CLI default: 32000 from KimiProfile.default_max_tokens
assert kw["max_tokens"] == 32000
def test_kimi_reasoning_effort_top_level(self, transport):
from providers import get_provider_profile
profile = get_provider_profile("kimi-coding")
kw = transport.build_kwargs(
model="kimi-k2", messages=[{"role": "user", "content": "Hi"}],
is_kimi=True,
provider_profile=profile,
reasoning_config={"effort": "high"},
max_tokens_param_fn=lambda n: {"max_tokens": n},
)
@@ -225,17 +255,21 @@ class TestChatCompletionsKimi:
assert "reasoning_effort" not in kw
def test_kimi_thinking_enabled_extra_body(self, transport):
from providers import get_provider_profile
profile = get_provider_profile("kimi-coding")
kw = transport.build_kwargs(
model="kimi-k2", messages=[{"role": "user", "content": "Hi"}],
is_kimi=True,
provider_profile=profile,
max_tokens_param_fn=lambda n: {"max_tokens": n},
)
assert kw["extra_body"]["thinking"] == {"type": "enabled"}
def test_kimi_thinking_disabled_extra_body(self, transport):
from providers import get_provider_profile
profile = get_provider_profile("kimi-coding")
kw = transport.build_kwargs(
model="kimi-k2", messages=[{"role": "user", "content": "Hi"}],
is_kimi=True,
provider_profile=profile,
reasoning_config={"enabled": False},
max_tokens_param_fn=lambda n: {"max_tokens": n},
)
+62
View File
@@ -181,3 +181,65 @@ async def test_compress_command_appends_warning_when_summary_generation_fails():
assert "historical message(s) were removed" in result
agent_instance.shutdown_memory_provider.assert_called_once()
agent_instance.close.assert_called_once()
@pytest.mark.asyncio
async def test_compress_command_surfaces_aux_model_failure_even_when_recovered():
"""When the user's configured ``auxiliary.compression.model`` errors out
but compression recovers by retrying on the main model, /compress must
STILL inform the user. Silent recovery hides broken config the user
needs to fix."""
history = _make_history()
# Compressed transcript — normal successful compression, no placeholder.
compressed = [
history[0],
{"role": "assistant", "content": "summary via main model"},
history[-1],
]
runner = _make_runner(history)
agent_instance = MagicMock()
agent_instance.shutdown_memory_provider = MagicMock()
agent_instance.close = MagicMock()
agent_instance.context_compressor.has_content_to_compress.return_value = True
# Fallback placeholder was NOT used — recovery succeeded.
agent_instance.context_compressor._last_summary_fallback_used = False
agent_instance.context_compressor._last_summary_dropped_count = 0
agent_instance.context_compressor._last_summary_error = None
# But the configured aux model DID fail before the retry succeeded.
agent_instance.context_compressor._last_aux_model_failure_model = (
"gemini-3-flash-preview"
)
agent_instance.context_compressor._last_aux_model_failure_error = (
"404 model not found: gemini-3-flash-preview"
)
agent_instance.session_id = "sess-1"
agent_instance._compress_context.return_value = (compressed, "")
def _estimate(messages):
if messages == history:
return 100
if messages == compressed:
return 60
raise AssertionError(f"unexpected transcript: {messages!r}")
with (
patch("gateway.run._resolve_runtime_agent_kwargs", return_value={"api_key": "***"}),
patch("gateway.run._resolve_gateway_model", return_value="test-model"),
patch("run_agent.AIAgent", return_value=agent_instance),
patch("agent.model_metadata.estimate_messages_tokens_rough", side_effect=_estimate),
):
result = await runner._handle_compress_command(_make_event())
# Compression succeeded
assert "Compressed:" in result
# No ⚠️ warning (that's reserved for dropped-turns case)
assert "⚠️" not in result
# But there IS an info note about the broken aux model
assert "" in result
assert "gemini-3-flash-preview" in result
assert "404" in result
assert "auxiliary.compression.model" in result
# The user's context is explicitly called out as intact
assert "intact" in result
agent_instance.shutdown_memory_provider.assert_called_once()
agent_instance.close.assert_called_once()
+125 -1
View File
@@ -508,4 +508,128 @@ async def test_session_hygiene_warns_user_when_summary_generation_fails(monkeypa
assert warn["chat_id"] == "-1001"
assert warn["metadata"] == {"thread_id": "17585"}
FakeCompressAgentWithSummaryFailure.last_instance.close.assert_called_once()
FakeCompressAgentWithSummaryFailure.last_instance.close.assert_called_once()
@pytest.mark.asyncio
async def test_session_hygiene_informs_user_when_aux_model_fails_but_recovers(monkeypatch, tmp_path):
"""When the user's configured ``auxiliary.compression.model`` errors out
and we recover via the main model, compression succeeds but the user's
config is still broken. Gateway hygiene must surface an note so the
user knows to fix ``auxiliary.compression.model`` silent recovery
hides a misconfig only they can resolve."""
fake_dotenv = types.ModuleType("dotenv")
fake_dotenv.load_dotenv = lambda *args, **kwargs: None
monkeypatch.setitem(sys.modules, "dotenv", fake_dotenv)
class FakeCompressAgentWithAuxRecovery:
last_instance = None
def __init__(self, **kwargs):
self.model = kwargs.get("model")
self.session_id = kwargs.get("session_id", "fake-session")
self._print_fn = None
self.shutdown_memory_provider = MagicMock()
self.close = MagicMock()
# Compression succeeded (no placeholder inserted) but the
# configured aux model errored and we fell back to main.
self.context_compressor = SimpleNamespace(
_last_summary_fallback_used=False,
_last_summary_dropped_count=0,
_last_summary_error=None,
_last_aux_model_failure_model="gemini-3-flash-preview",
_last_aux_model_failure_error="404 model not found",
)
type(self).last_instance = self
def _compress_context(self, messages, *_args, **_kwargs):
self.session_id = f"{self.session_id}_compressed"
return ([{"role": "assistant", "content": "real summary"}], None)
fake_run_agent = types.ModuleType("run_agent")
fake_run_agent.AIAgent = FakeCompressAgentWithAuxRecovery
monkeypatch.setitem(sys.modules, "run_agent", fake_run_agent)
gateway_run = importlib.import_module("gateway.run")
GatewayRunner = gateway_run.GatewayRunner
adapter = HygieneCaptureAdapter()
runner = object.__new__(GatewayRunner)
runner.config = GatewayConfig(
platforms={Platform.TELEGRAM: PlatformConfig(enabled=True, token="fake-token")}
)
runner.adapters = {Platform.TELEGRAM: adapter}
runner._voice_mode = {}
runner.hooks = SimpleNamespace(emit=AsyncMock(), loaded_hooks=False)
runner.session_store = MagicMock()
runner.session_store.get_or_create_session.return_value = SessionEntry(
session_key="agent:main:telegram:group:-1001:17585",
session_id="sess-1",
created_at=datetime.now(),
updated_at=datetime.now(),
platform=Platform.TELEGRAM,
chat_type="group",
)
runner.session_store.load_transcript.return_value = _make_history(6, content_size=400)
runner.session_store.has_any_sessions.return_value = True
runner.session_store.rewrite_transcript = MagicMock()
runner.session_store.append_to_transcript = MagicMock()
runner._running_agents = {}
runner._pending_messages = {}
runner._pending_approvals = {}
runner._session_db = None
runner._is_user_authorized = lambda _source: True
runner._set_session_env = lambda _context: None
runner._run_agent = AsyncMock(
return_value={
"final_response": "ok",
"messages": [],
"tools": [],
"history_offset": 0,
"last_prompt_tokens": 0,
}
)
monkeypatch.setattr(gateway_run, "_hermes_home", tmp_path)
monkeypatch.setattr(gateway_run, "_resolve_runtime_agent_kwargs", lambda: {"api_key": "***"})
monkeypatch.setattr(
"agent.model_metadata.get_model_context_length",
lambda *_args, **_kwargs: 100,
)
monkeypatch.setenv("TELEGRAM_HOME_CHANNEL", "795544298")
event = MessageEvent(
text="hello",
source=SessionSource(
platform=Platform.TELEGRAM,
chat_id="-1001",
chat_type="group",
thread_id="17585",
user_id="12345",
),
message_id="1",
)
result = await runner._handle_message(event)
assert result == "ok"
# No ⚠️ hard-failure warning (that's for dropped turns)
hard_warnings = [s for s in adapter.sent if "Context compression summary failed" in s["content"]]
assert len(hard_warnings) == 0, adapter.sent
# But an note about the configured aux model must be delivered.
aux_notes = [
s for s in adapter.sent
if "Configured compression model" in s["content"]
]
assert len(aux_notes) == 1, (
f"Expected 1 aux-model fallback notice, got {len(aux_notes)}: {adapter.sent}"
)
note = aux_notes[0]
assert "gemini-3-flash-preview" in note["content"]
assert "404" in note["content"]
assert "auxiliary.compression.model" in note["content"]
# Note must land in the originating topic/thread.
assert note["chat_id"] == "-1001"
assert note["metadata"] == {"thread_id": "17585"}
FakeCompressAgentWithAuxRecovery.last_instance.close.assert_called_once()
+2 -2
View File
@@ -269,9 +269,9 @@ class TestGmiModelMetadata:
class TestGmiAuxiliary:
def test_aux_default_model(self):
from agent.auxiliary_client import _API_KEY_PROVIDER_AUX_MODELS
from agent.auxiliary_client import _get_aux_model_for_provider
assert _API_KEY_PROVIDER_AUX_MODELS["gmi"] == "google/gemini-3.1-flash-lite-preview"
assert _get_aux_model_for_provider("gmi") == "google/gemini-3.1-flash-lite-preview"
def test_resolve_provider_client_uses_gmi_aux_default(self, monkeypatch):
monkeypatch.setenv("GMI_API_KEY", "gmi-test-key")
View File
+118
View File
@@ -0,0 +1,118 @@
"""E2E tests: verify _build_kwargs_from_profile produces correct output.
These tests call _build_kwargs_from_profile on the transport directly,
without importing run_agent (which would cause xdist worker contamination).
"""
import pytest
from agent.transports.chat_completions import ChatCompletionsTransport
from providers import get_provider_profile
@pytest.fixture
def transport():
return ChatCompletionsTransport()
def _msgs():
return [{"role": "user", "content": "hi"}]
class TestNvidiaProfileWiring:
def test_nvidia_gets_default_max_tokens(self, transport):
profile = get_provider_profile("nvidia")
kwargs = transport.build_kwargs(
model="nvidia/llama-3.1-nemotron-70b-instruct",
messages=_msgs(),
tools=None,
provider_profile=profile,
max_tokens=None,
max_tokens_param_fn=lambda x: {"max_tokens": x} if x else {},
timeout=300,
reasoning_config=None,
request_overrides=None,
session_id="test",
ollama_num_ctx=None,
)
# NVIDIA profile sets default_max_tokens=16384
assert kwargs.get("max_tokens") == 16384
def test_nvidia_nim_alias(self, transport):
profile = get_provider_profile("nvidia-nim")
assert profile is not None
assert profile.name == "nvidia"
assert profile.default_max_tokens == 16384
def test_nvidia_model_passed(self, transport):
profile = get_provider_profile("nvidia")
kwargs = transport.build_kwargs(
model="nvidia/test-model",
messages=_msgs(),
tools=None,
provider_profile=profile,
max_tokens=None,
max_tokens_param_fn=lambda x: {"max_tokens": x} if x else {},
timeout=300,
reasoning_config=None,
request_overrides=None,
session_id="test",
ollama_num_ctx=None,
)
assert kwargs["model"] == "nvidia/test-model"
def test_nvidia_messages_passed(self, transport):
profile = get_provider_profile("nvidia")
msgs = _msgs()
kwargs = transport.build_kwargs(
model="nvidia/test",
messages=msgs,
tools=None,
provider_profile=profile,
max_tokens=None,
max_tokens_param_fn=lambda x: {"max_tokens": x} if x else {},
timeout=300,
reasoning_config=None,
request_overrides=None,
session_id="test",
ollama_num_ctx=None,
)
assert kwargs["messages"] == msgs
class TestDeepSeekProfileWiring:
def test_deepseek_no_forced_max_tokens(self, transport):
profile = get_provider_profile("deepseek")
kwargs = transport.build_kwargs(
model="deepseek-chat",
messages=_msgs(),
tools=None,
provider_profile=profile,
max_tokens=None,
max_tokens_param_fn=lambda x: {"max_tokens": x} if x else {},
timeout=300,
reasoning_config=None,
request_overrides=None,
session_id="test",
ollama_num_ctx=None,
)
# DeepSeek has no default_max_tokens
assert kwargs["model"] == "deepseek-chat"
assert kwargs.get("max_tokens") is None or "max_tokens" not in kwargs
def test_deepseek_messages_passed(self, transport):
profile = get_provider_profile("deepseek")
msgs = _msgs()
kwargs = transport.build_kwargs(
model="deepseek-chat",
messages=msgs,
tools=None,
provider_profile=profile,
max_tokens=None,
max_tokens_param_fn=lambda x: {"max_tokens": x} if x else {},
timeout=300,
reasoning_config=None,
request_overrides=None,
session_id="test",
ollama_num_ctx=None,
)
assert kwargs["messages"] == msgs
+290
View File
@@ -0,0 +1,290 @@
"""Profile-path parity tests: verify profile path produces identical output to legacy flags.
Each test calls build_kwargs twice once with legacy flags, once with provider_profile
and asserts the output is identical. This catches any behavioral drift between the two paths.
"""
import pytest
from agent.transports.chat_completions import ChatCompletionsTransport
from providers import get_provider_profile
@pytest.fixture
def transport():
return ChatCompletionsTransport()
def _msgs():
return [{"role": "user", "content": "hello"}]
def _max_tokens_fn(n):
return {"max_completion_tokens": n}
class TestNvidiaProfileParity:
def test_max_tokens_match(self, transport):
"""NVIDIA profile sets max_tokens=16384; legacy flag is removed."""
profile = transport.build_kwargs(
model="nvidia/nemotron", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("nvidia"),
max_tokens_param_fn=_max_tokens_fn,
)
assert profile["max_completion_tokens"] == 16384
class TestKimiProfileParity:
def test_temperature_omitted(self, transport):
legacy = transport.build_kwargs(
model="kimi-k2", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("kimi-coding"), omit_temperature=True,
)
profile = transport.build_kwargs(
model="kimi-k2", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("kimi"),
)
assert "temperature" not in legacy
assert "temperature" not in profile
def test_max_tokens(self, transport):
legacy = transport.build_kwargs(
model="kimi-k2", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("kimi-coding"), max_tokens_param_fn=_max_tokens_fn,
)
profile = transport.build_kwargs(
model="kimi-k2", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("kimi"),
max_tokens_param_fn=_max_tokens_fn,
)
assert profile["max_completion_tokens"] == legacy["max_completion_tokens"] == 32000
def test_thinking_enabled(self, transport):
rc = {"enabled": True, "effort": "high"}
legacy = transport.build_kwargs(
model="kimi-k2", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("kimi-coding"), reasoning_config=rc,
)
profile = transport.build_kwargs(
model="kimi-k2", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("kimi"),
reasoning_config=rc,
)
assert profile["extra_body"]["thinking"] == legacy["extra_body"]["thinking"]
assert profile["reasoning_effort"] == legacy["reasoning_effort"] == "high"
def test_thinking_disabled(self, transport):
rc = {"enabled": False}
legacy = transport.build_kwargs(
model="kimi-k2", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("kimi-coding"), reasoning_config=rc,
)
profile = transport.build_kwargs(
model="kimi-k2", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("kimi"),
reasoning_config=rc,
)
assert profile["extra_body"]["thinking"] == legacy["extra_body"]["thinking"]
assert profile["extra_body"]["thinking"]["type"] == "disabled"
assert "reasoning_effort" not in profile
assert "reasoning_effort" not in legacy
def test_reasoning_effort_default(self, transport):
rc = {"enabled": True}
legacy = transport.build_kwargs(
model="kimi-k2", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("kimi-coding"), reasoning_config=rc,
)
profile = transport.build_kwargs(
model="kimi-k2", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("kimi"),
reasoning_config=rc,
)
assert profile["reasoning_effort"] == legacy["reasoning_effort"] == "medium"
class TestOpenRouterProfileParity:
def test_provider_preferences(self, transport):
prefs = {"allow": ["anthropic"]}
legacy = transport.build_kwargs(
model="anthropic/claude-sonnet-4.6", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("openrouter"), provider_preferences=prefs,
)
profile = transport.build_kwargs(
model="anthropic/claude-sonnet-4.6", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("openrouter"),
provider_preferences=prefs,
)
assert profile["extra_body"]["provider"] == legacy["extra_body"]["provider"]
def test_reasoning_full_config(self, transport):
rc = {"enabled": True, "effort": "high"}
legacy = transport.build_kwargs(
model="anthropic/claude-sonnet-4.6", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("openrouter"), supports_reasoning=True, reasoning_config=rc,
)
profile = transport.build_kwargs(
model="anthropic/claude-sonnet-4.6", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("openrouter"),
supports_reasoning=True, reasoning_config=rc,
)
assert profile["extra_body"]["reasoning"] == legacy["extra_body"]["reasoning"]
def test_default_reasoning(self, transport):
legacy = transport.build_kwargs(
model="anthropic/claude-sonnet-4.6", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("openrouter"), supports_reasoning=True,
)
profile = transport.build_kwargs(
model="anthropic/claude-sonnet-4.6", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("openrouter"),
supports_reasoning=True,
)
assert profile["extra_body"]["reasoning"] == legacy["extra_body"]["reasoning"]
class TestNousProfileParity:
def test_tags(self, transport):
legacy = transport.build_kwargs(
model="hermes-3", messages=_msgs(), tools=None, provider_profile=get_provider_profile("nous"),
)
profile = transport.build_kwargs(
model="hermes-3", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("nous"),
)
assert profile["extra_body"]["tags"] == legacy["extra_body"]["tags"]
def test_reasoning_omitted_when_disabled(self, transport):
rc = {"enabled": False}
legacy = transport.build_kwargs(
model="hermes-3", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("nous"), supports_reasoning=True, reasoning_config=rc,
)
profile = transport.build_kwargs(
model="hermes-3", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("nous"),
supports_reasoning=True, reasoning_config=rc,
)
assert "reasoning" not in legacy.get("extra_body", {})
assert "reasoning" not in profile.get("extra_body", {})
class TestQwenProfileParity:
def test_max_tokens(self, transport):
legacy = transport.build_kwargs(
model="qwen3.5", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("qwen-oauth"), max_tokens_param_fn=_max_tokens_fn,
)
profile = transport.build_kwargs(
model="qwen3.5", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("qwen"),
max_tokens_param_fn=_max_tokens_fn,
)
assert profile["max_completion_tokens"] == legacy["max_completion_tokens"] == 65536
def test_vl_high_resolution(self, transport):
legacy = transport.build_kwargs(
model="qwen3.5", messages=_msgs(), tools=None, provider_profile=get_provider_profile("qwen-oauth"),
)
profile = transport.build_kwargs(
model="qwen3.5", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("qwen"),
)
assert profile["extra_body"]["vl_high_resolution_images"] == legacy["extra_body"]["vl_high_resolution_images"]
def test_metadata_top_level(self, transport):
meta = {"sessionId": "s123", "promptId": "p456"}
legacy = transport.build_kwargs(
model="qwen3.5", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("qwen-oauth"), qwen_session_metadata=meta,
)
profile = transport.build_kwargs(
model="qwen3.5", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("qwen"),
qwen_session_metadata=meta,
)
assert profile["metadata"] == legacy["metadata"] == meta
assert "metadata" not in profile.get("extra_body", {})
def test_message_preprocessing(self, transport):
"""Qwen profile normalizes string content to list-of-parts."""
msgs = [
{"role": "system", "content": "You are helpful."},
{"role": "user", "content": "hello"},
]
profile = transport.build_kwargs(
model="qwen3.5", messages=msgs, tools=None,
provider_profile=get_provider_profile("qwen"),
)
out_msgs = profile["messages"]
# System message content normalized + cache_control injected
assert isinstance(out_msgs[0]["content"], list)
assert out_msgs[0]["content"][0]["type"] == "text"
assert "cache_control" in out_msgs[0]["content"][-1]
# User message content normalized
assert isinstance(out_msgs[1]["content"], list)
assert out_msgs[1]["content"][0] == {"type": "text", "text": "hello"}
class TestDeveloperRoleParity:
"""Developer role swap must work on BOTH legacy and profile paths."""
def test_legacy_path_swaps_for_gpt5(self, transport):
msgs = [{"role": "system", "content": "Be helpful"}, {"role": "user", "content": "hi"}]
kw = transport.build_kwargs(
model="gpt-5.4", messages=msgs, tools=None,
)
assert kw["messages"][0]["role"] == "developer"
def test_profile_path_swaps_for_gpt5(self, transport):
msgs = [{"role": "system", "content": "Be helpful"}, {"role": "user", "content": "hi"}]
kw = transport.build_kwargs(
model="gpt-5.4", messages=msgs, tools=None,
provider_profile=get_provider_profile("openrouter"),
)
assert kw["messages"][0]["role"] == "developer"
def test_profile_path_no_swap_for_claude(self, transport):
msgs = [{"role": "system", "content": "Be helpful"}, {"role": "user", "content": "hi"}]
kw = transport.build_kwargs(
model="anthropic/claude-sonnet-4.6", messages=msgs, tools=None,
provider_profile=get_provider_profile("openrouter"),
)
assert kw["messages"][0]["role"] == "system"
class TestRequestOverridesParity:
"""request_overrides with extra_body must merge identically on both paths."""
def test_extra_body_override_legacy(self, transport):
kw = transport.build_kwargs(
model="gpt-5.4", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("openrouter"),
request_overrides={"extra_body": {"custom_key": "custom_val"}},
)
assert kw["extra_body"]["custom_key"] == "custom_val"
def test_extra_body_override_profile(self, transport):
kw = transport.build_kwargs(
model="gpt-5.4", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("openrouter"),
request_overrides={"extra_body": {"custom_key": "custom_val"}},
)
assert kw["extra_body"]["custom_key"] == "custom_val"
def test_extra_body_override_merges_with_provider_body(self, transport):
"""Override extra_body merges WITH provider extra_body, not replaces."""
kw = transport.build_kwargs(
model="hermes-3", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("nous"),
request_overrides={"extra_body": {"custom": True}},
)
assert kw["extra_body"]["tags"] == ["product=hermes-agent"] # from profile
assert kw["extra_body"]["custom"] is True # from override
def test_top_level_override(self, transport):
kw = transport.build_kwargs(
model="gpt-5.4", messages=_msgs(), tools=None,
provider_profile=get_provider_profile("openrouter"),
request_overrides={"top_p": 0.9},
)
assert kw["top_p"] == 0.9
+203
View File
@@ -0,0 +1,203 @@
"""Tests for the provider module registry and profiles."""
import pytest
from providers import get_provider_profile, _REGISTRY
from providers.base import ProviderProfile, OMIT_TEMPERATURE
class TestRegistry:
def test_discovery_populates_registry(self):
p = get_provider_profile("nvidia")
assert p is not None
assert p.name == "nvidia"
def test_alias_lookup(self):
assert get_provider_profile("kimi").name == "kimi-coding"
assert get_provider_profile("moonshot").name == "kimi-coding"
assert get_provider_profile("kimi-coding-cn").name == "kimi-coding-cn"
assert get_provider_profile("or").name == "openrouter"
assert get_provider_profile("nous-portal").name == "nous"
assert get_provider_profile("qwen").name == "qwen-oauth"
assert get_provider_profile("qwen-portal").name == "qwen-oauth"
def test_unknown_provider_returns_none(self):
assert get_provider_profile("nonexistent-provider") is None
def test_all_providers_have_name(self):
get_provider_profile("nvidia") # trigger discovery
for name, profile in _REGISTRY.items():
assert profile.name == name
class TestNvidiaProfile:
def test_max_tokens(self):
p = get_provider_profile("nvidia")
assert p.default_max_tokens == 16384
def test_no_special_temperature(self):
p = get_provider_profile("nvidia")
assert p.fixed_temperature is None
def test_base_url(self):
p = get_provider_profile("nvidia")
assert "nvidia.com" in p.base_url
class TestKimiProfile:
def test_temperature_omit(self):
p = get_provider_profile("kimi")
assert p.fixed_temperature is OMIT_TEMPERATURE
def test_max_tokens(self):
p = get_provider_profile("kimi")
assert p.default_max_tokens == 32000
def test_cn_separate_profile(self):
p = get_provider_profile("kimi-coding-cn")
assert p.name == "kimi-coding-cn"
assert p.env_vars == ("KIMI_CN_API_KEY",)
assert "moonshot.cn" in p.base_url
def test_cn_not_alias_of_kimi(self):
kimi = get_provider_profile("kimi-coding")
cn = get_provider_profile("kimi-coding-cn")
assert kimi is not cn
assert kimi.base_url != cn.base_url
def test_thinking_enabled(self):
p = get_provider_profile("kimi")
eb, tl = p.build_api_kwargs_extras(reasoning_config={"enabled": True, "effort": "high"})
assert eb["thinking"] == {"type": "enabled"}
assert tl["reasoning_effort"] == "high"
def test_thinking_disabled(self):
p = get_provider_profile("kimi")
eb, tl = p.build_api_kwargs_extras(reasoning_config={"enabled": False})
assert eb["thinking"] == {"type": "disabled"}
assert "reasoning_effort" not in tl
def test_reasoning_effort_default(self):
p = get_provider_profile("kimi")
eb, tl = p.build_api_kwargs_extras(reasoning_config={"enabled": True})
assert tl["reasoning_effort"] == "medium"
def test_no_config_defaults(self):
p = get_provider_profile("kimi")
eb, tl = p.build_api_kwargs_extras(reasoning_config=None)
assert eb["thinking"] == {"type": "enabled"}
assert tl["reasoning_effort"] == "medium"
class TestOpenRouterProfile:
def test_extra_body_with_prefs(self):
p = get_provider_profile("openrouter")
body = p.build_extra_body(provider_preferences={"allow": ["anthropic"]})
assert body["provider"] == {"allow": ["anthropic"]}
def test_extra_body_no_prefs(self):
p = get_provider_profile("openrouter")
body = p.build_extra_body()
assert body == {}
def test_reasoning_full_config(self):
p = get_provider_profile("openrouter")
eb, _ = p.build_api_kwargs_extras(
reasoning_config={"enabled": True, "effort": "high"},
supports_reasoning=True,
)
assert eb["reasoning"] == {"enabled": True, "effort": "high"}
def test_reasoning_disabled_still_passes(self):
"""OpenRouter passes disabled reasoning through (unlike Nous)."""
p = get_provider_profile("openrouter")
eb, _ = p.build_api_kwargs_extras(
reasoning_config={"enabled": False},
supports_reasoning=True,
)
assert eb["reasoning"] == {"enabled": False}
def test_default_reasoning(self):
p = get_provider_profile("openrouter")
eb, _ = p.build_api_kwargs_extras(supports_reasoning=True)
assert eb["reasoning"] == {"enabled": True, "effort": "medium"}
class TestNousProfile:
def test_tags(self):
p = get_provider_profile("nous")
body = p.build_extra_body()
assert body["tags"] == ["product=hermes-agent"]
def test_auth_type(self):
p = get_provider_profile("nous")
assert p.auth_type == "oauth_device_code"
def test_reasoning_enabled(self):
p = get_provider_profile("nous")
eb, _ = p.build_api_kwargs_extras(
reasoning_config={"enabled": True, "effort": "medium"},
supports_reasoning=True,
)
assert eb["reasoning"] == {"enabled": True, "effort": "medium"}
def test_reasoning_omitted_when_disabled(self):
p = get_provider_profile("nous")
eb, _ = p.build_api_kwargs_extras(
reasoning_config={"enabled": False},
supports_reasoning=True,
)
assert "reasoning" not in eb
class TestQwenProfile:
def test_max_tokens(self):
p = get_provider_profile("qwen-oauth")
assert p.default_max_tokens == 65536
def test_auth_type(self):
p = get_provider_profile("qwen-oauth")
assert p.auth_type == "oauth_external"
def test_extra_body_vl(self):
p = get_provider_profile("qwen-oauth")
body = p.build_extra_body()
assert body["vl_high_resolution_images"] is True
def test_prepare_messages_normalizes_content(self):
p = get_provider_profile("qwen-oauth")
msgs = [
{"role": "system", "content": "Be helpful"},
{"role": "user", "content": "hello"},
]
result = p.prepare_messages(msgs)
# System message: content normalized to list, cache_control on last part
assert isinstance(result[0]["content"], list)
assert result[0]["content"][-1].get("cache_control") == {"type": "ephemeral"}
assert result[0]["content"][-1]["text"] == "Be helpful"
# User message: content normalized to list
assert isinstance(result[1]["content"], list)
assert result[1]["content"][0]["text"] == "hello"
def test_metadata_top_level(self):
p = get_provider_profile("qwen-oauth")
meta = {"sessionId": "s123", "promptId": "p456"}
eb, tl = p.build_api_kwargs_extras(qwen_session_metadata=meta)
assert tl["metadata"] == meta
assert "metadata" not in eb
class TestBaseProfile:
def test_prepare_messages_passthrough(self):
p = ProviderProfile(name="test")
msgs = [{"role": "user", "content": "hi"}]
assert p.prepare_messages(msgs) is msgs
def test_build_extra_body_empty(self):
p = ProviderProfile(name="test")
assert p.build_extra_body() == {}
def test_build_api_kwargs_extras_empty(self):
p = ProviderProfile(name="test")
eb, tl = p.build_api_kwargs_extras()
assert eb == {}
assert tl == {}
+258
View File
@@ -0,0 +1,258 @@
"""Parity tests: pin the exact current transport behavior per provider.
These tests document the flag-based contract between run_agent.py and
ChatCompletionsTransport.build_kwargs(). When the next PR wires profiles
to replace flags, every assertion here must still pass any failure is
a behavioral regression.
"""
import pytest
from agent.transports.chat_completions import ChatCompletionsTransport
from providers import get_provider_profile
@pytest.fixture
def transport():
return ChatCompletionsTransport()
def _simple_messages():
return [{"role": "user", "content": "hello"}]
def _max_tokens_fn(n):
return {"max_completion_tokens": n}
class TestNvidiaParity:
"""NVIDIA NIM: default max_tokens=16384."""
def test_default_max_tokens(self, transport):
"""NVIDIA default max_tokens=16384 comes from profile, not legacy is_nvidia_nim flag."""
from providers import get_provider_profile
profile = get_provider_profile("nvidia")
kw = transport.build_kwargs(
model="nvidia/llama-3.1-nemotron-70b-instruct",
messages=_simple_messages(),
tools=None,
max_tokens_param_fn=_max_tokens_fn,
provider_profile=profile,
)
assert kw["max_completion_tokens"] == 16384
def test_user_max_tokens_overrides(self, transport):
from providers import get_provider_profile
profile = get_provider_profile("nvidia")
kw = transport.build_kwargs(
model="nvidia/llama-3.1-nemotron-70b-instruct",
messages=_simple_messages(),
tools=None,
max_tokens=4096,
max_tokens_param_fn=_max_tokens_fn,
provider_profile=profile,
)
assert kw["max_completion_tokens"] == 4096 # user overrides default
class TestKimiParity:
"""Kimi: OMIT temperature, max_tokens=32000, thinking + reasoning_effort."""
def test_temperature_omitted(self, transport):
kw = transport.build_kwargs(
model="kimi-k2",
messages=_simple_messages(),
tools=None,
provider_profile=get_provider_profile("kimi-coding"),
omit_temperature=True,
)
assert "temperature" not in kw
def test_default_max_tokens(self, transport):
kw = transport.build_kwargs(
model="kimi-k2",
messages=_simple_messages(),
tools=None,
provider_profile=get_provider_profile("kimi-coding"),
max_tokens_param_fn=_max_tokens_fn,
)
assert kw["max_completion_tokens"] == 32000
def test_thinking_enabled(self, transport):
kw = transport.build_kwargs(
model="kimi-k2",
messages=_simple_messages(),
tools=None,
provider_profile=get_provider_profile("kimi-coding"),
reasoning_config={"enabled": True, "effort": "high"},
)
assert kw["extra_body"]["thinking"] == {"type": "enabled"}
def test_thinking_disabled(self, transport):
kw = transport.build_kwargs(
model="kimi-k2",
messages=_simple_messages(),
tools=None,
provider_profile=get_provider_profile("kimi-coding"),
reasoning_config={"enabled": False},
)
assert kw["extra_body"]["thinking"] == {"type": "disabled"}
def test_reasoning_effort_top_level(self, transport):
"""Kimi reasoning_effort is a TOP-LEVEL api_kwargs key, NOT in extra_body."""
kw = transport.build_kwargs(
model="kimi-k2",
messages=_simple_messages(),
tools=None,
provider_profile=get_provider_profile("kimi-coding"),
reasoning_config={"enabled": True, "effort": "high"},
)
assert kw.get("reasoning_effort") == "high"
assert "reasoning_effort" not in kw.get("extra_body", {})
def test_reasoning_effort_default_medium(self, transport):
kw = transport.build_kwargs(
model="kimi-k2",
messages=_simple_messages(),
tools=None,
provider_profile=get_provider_profile("kimi-coding"),
reasoning_config={"enabled": True},
)
assert kw.get("reasoning_effort") == "medium"
class TestOpenRouterParity:
"""OpenRouter: provider preferences, reasoning in extra_body."""
def test_provider_preferences(self, transport):
prefs = {"allow": ["anthropic"], "sort": "price"}
kw = transport.build_kwargs(
model="anthropic/claude-sonnet-4.6",
messages=_simple_messages(),
tools=None,
provider_profile=get_provider_profile("openrouter"),
provider_preferences=prefs,
)
assert kw["extra_body"]["provider"] == prefs
def test_reasoning_passes_full_config(self, transport):
"""OpenRouter passes the FULL reasoning_config dict, not just effort."""
rc = {"enabled": True, "effort": "high"}
kw = transport.build_kwargs(
model="anthropic/claude-sonnet-4.6",
messages=_simple_messages(),
tools=None,
provider_profile=get_provider_profile("openrouter"),
supports_reasoning=True,
reasoning_config=rc,
)
assert kw["extra_body"]["reasoning"] == rc
def test_default_reasoning_when_no_config(self, transport):
"""When supports_reasoning=True but no config, adds default."""
kw = transport.build_kwargs(
model="anthropic/claude-sonnet-4.6",
messages=_simple_messages(),
tools=None,
provider_profile=get_provider_profile("openrouter"),
supports_reasoning=True,
)
assert kw["extra_body"]["reasoning"] == {"enabled": True, "effort": "medium"}
class TestNousParity:
"""Nous: product tags, reasoning, omit when disabled."""
def test_tags(self, transport):
kw = transport.build_kwargs(
model="hermes-3-llama-3.1-405b",
messages=_simple_messages(),
tools=None,
provider_profile=get_provider_profile("nous"),
)
assert kw["extra_body"]["tags"] == ["product=hermes-agent"]
def test_reasoning_omitted_when_disabled(self, transport):
"""Nous special case: reasoning omitted entirely when disabled."""
kw = transport.build_kwargs(
model="hermes-3-llama-3.1-405b",
messages=_simple_messages(),
tools=None,
provider_profile=get_provider_profile("nous"),
supports_reasoning=True,
reasoning_config={"enabled": False},
)
assert "reasoning" not in kw.get("extra_body", {})
def test_reasoning_enabled(self, transport):
rc = {"enabled": True, "effort": "high"}
kw = transport.build_kwargs(
model="hermes-3-llama-3.1-405b",
messages=_simple_messages(),
tools=None,
provider_profile=get_provider_profile("nous"),
supports_reasoning=True,
reasoning_config=rc,
)
assert kw["extra_body"]["reasoning"] == rc
class TestQwenParity:
"""Qwen: max_tokens=65536, vl_high_resolution, metadata top-level."""
def test_default_max_tokens(self, transport):
kw = transport.build_kwargs(
model="qwen3.5-plus",
messages=_simple_messages(),
tools=None,
provider_profile=get_provider_profile("qwen-oauth"),
max_tokens_param_fn=_max_tokens_fn,
)
assert kw["max_completion_tokens"] == 65536
def test_vl_high_resolution(self, transport):
kw = transport.build_kwargs(
model="qwen3.5-plus",
messages=_simple_messages(),
tools=None,
provider_profile=get_provider_profile("qwen-oauth"),
)
assert kw["extra_body"]["vl_high_resolution_images"] is True
def test_metadata_top_level(self, transport):
"""Qwen metadata goes to top-level api_kwargs, NOT extra_body."""
meta = {"sessionId": "s123", "promptId": "p456"}
kw = transport.build_kwargs(
model="qwen3.5-plus",
messages=_simple_messages(),
tools=None,
provider_profile=get_provider_profile("qwen-oauth"),
qwen_session_metadata=meta,
)
assert kw["metadata"] == meta
assert "metadata" not in kw.get("extra_body", {})
class TestCustomOllamaParity:
"""Custom/Ollama: num_ctx, think=false — now tested via profile."""
def test_ollama_num_ctx(self, transport):
kw = transport.build_kwargs(
model="llama3.1",
messages=_simple_messages(),
tools=None,
provider_profile=get_provider_profile("custom"),
ollama_num_ctx=131072,
)
assert kw["extra_body"]["options"]["num_ctx"] == 131072
def test_think_false_when_disabled(self, transport):
kw = transport.build_kwargs(
model="qwen3:72b",
messages=_simple_messages(),
tools=None,
provider_profile=get_provider_profile("custom"),
reasoning_config={"enabled": False, "effort": "none"},
)
assert kw["extra_body"]["think"] is False
+46 -10
View File
@@ -1097,6 +1097,7 @@ class TestBuildApiKwargs:
assert "temperature" not in kwargs
def test_kimi_coding_endpoint_omits_temperature(self, agent):
agent.provider = "kimi-coding"
agent.base_url = "https://api.kimi.com/coding/v1"
agent._base_url_lower = agent.base_url.lower()
agent.model = "kimi-k2.5"
@@ -1109,6 +1110,7 @@ class TestBuildApiKwargs:
def test_kimi_coding_endpoint_sends_max_tokens_and_reasoning(self, agent):
"""Kimi endpoint should send max_tokens=32000 and reasoning_effort as
top-level params, matching Kimi CLI's default behavior."""
agent.provider = "kimi-coding"
agent.base_url = "https://api.kimi.com/coding/v1"
agent._base_url_lower = agent.base_url.lower()
agent.model = "kimi-for-coding"
@@ -1121,6 +1123,7 @@ class TestBuildApiKwargs:
def test_kimi_coding_endpoint_respects_custom_effort(self, agent):
"""reasoning_effort should reflect reasoning_config.effort when set."""
agent.provider = "kimi-coding"
agent.base_url = "https://api.kimi.com/coding/v1"
agent._base_url_lower = agent.base_url.lower()
agent.model = "kimi-for-coding"
@@ -1134,6 +1137,7 @@ class TestBuildApiKwargs:
def test_kimi_coding_endpoint_sends_thinking_extra_body(self, agent):
"""Kimi endpoint should send extra_body.thinking={"type":"enabled"}
to activate reasoning mode, mirroring Kimi CLI's with_thinking()."""
agent.provider = "kimi-coding"
agent.base_url = "https://api.kimi.com/coding/v1"
agent._base_url_lower = agent.base_url.lower()
agent.model = "kimi-for-coding"
@@ -1147,6 +1151,7 @@ class TestBuildApiKwargs:
"""When reasoning_config.enabled=False, thinking should be disabled
and reasoning_effort should be omitted entirely mirroring Kimi
CLI's with_thinking("off") which maps to reasoning_effort=None."""
agent.provider = "kimi-coding"
agent.base_url = "https://api.kimi.com/coding/v1"
agent._base_url_lower = agent.base_url.lower()
agent.model = "kimi-for-coding"
@@ -1160,6 +1165,7 @@ class TestBuildApiKwargs:
def test_moonshot_endpoint_sends_max_tokens_and_reasoning(self, agent):
"""api.moonshot.ai should get the same Kimi-compatible params."""
agent.provider = "kimi-coding"
agent.base_url = "https://api.moonshot.ai/v1"
agent._base_url_lower = agent.base_url.lower()
agent.model = "kimi-k2.5"
@@ -1173,6 +1179,7 @@ class TestBuildApiKwargs:
def test_moonshot_cn_endpoint_sends_max_tokens_and_reasoning(self, agent):
"""api.moonshot.cn (China endpoint) should get the same params."""
agent.provider = "kimi-coding-cn"
agent.base_url = "https://api.moonshot.cn/v1"
agent._base_url_lower = agent.base_url.lower()
agent.model = "kimi-k2.5"
@@ -1185,6 +1192,7 @@ class TestBuildApiKwargs:
assert kwargs["extra_body"]["thinking"] == {"type": "enabled"}
def test_provider_preferences_injected(self, agent):
agent.provider = "openrouter"
agent.base_url = "https://openrouter.ai/api/v1"
agent.providers_allowed = ["Anthropic"]
messages = [{"role": "user", "content": "hi"}]
@@ -1193,6 +1201,7 @@ class TestBuildApiKwargs:
def test_reasoning_config_default_openrouter(self, agent):
"""Default reasoning config for OpenRouter should be medium."""
agent.provider = "openrouter"
agent.base_url = "https://openrouter.ai/api/v1"
agent.model = "anthropic/claude-sonnet-4-20250514"
messages = [{"role": "user", "content": "hi"}]
@@ -1202,6 +1211,7 @@ class TestBuildApiKwargs:
assert reasoning["effort"] == "medium"
def test_reasoning_config_custom(self, agent):
agent.provider = "openrouter"
agent.base_url = "https://openrouter.ai/api/v1"
agent.model = "anthropic/claude-sonnet-4-20250514"
agent.reasoning_config = {"enabled": False}
@@ -1217,6 +1227,7 @@ class TestBuildApiKwargs:
assert "reasoning" not in kwargs.get("extra_body", {})
def test_reasoning_sent_for_supported_openrouter_model(self, agent):
agent.provider = "openrouter"
agent.base_url = "https://openrouter.ai/api/v1"
agent.model = "qwen/qwen3.5-plus-02-15"
messages = [{"role": "user", "content": "hi"}]
@@ -1224,6 +1235,7 @@ class TestBuildApiKwargs:
assert kwargs["extra_body"]["reasoning"]["effort"] == "medium"
def test_reasoning_sent_for_nous_route(self, agent):
agent.provider = "nous"
agent.base_url = "https://inference-api.nousresearch.com/v1"
agent.model = "minimax/minimax-m2.5"
messages = [{"role": "user", "content": "hi"}]
@@ -1231,18 +1243,38 @@ class TestBuildApiKwargs:
assert kwargs["extra_body"]["reasoning"]["effort"] == "medium"
def test_reasoning_sent_for_copilot_gpt5(self, agent):
agent.base_url = "https://api.githubcopilot.com"
agent.model = "gpt-5.4"
messages = [{"role": "user", "content": "hi"}]
kwargs = agent._build_api_kwargs(messages)
"""Copilot/GitHub Models: GPT-5 reasoning goes in extra_body.reasoning."""
from agent.transports import get_transport
from providers import get_provider_profile
transport = get_transport("chat_completions")
profile = get_provider_profile("copilot")
msgs = [{"role": "user", "content": "hi"}]
kwargs = transport.build_kwargs(
model="gpt-5.4",
messages=msgs,
tools=None,
supports_reasoning=True,
provider_profile=profile,
)
assert kwargs["extra_body"]["reasoning"] == {"effort": "medium"}
def test_reasoning_xhigh_normalized_for_copilot(self, agent):
agent.base_url = "https://api.githubcopilot.com"
agent.model = "gpt-5.4"
agent.reasoning_config = {"enabled": True, "effort": "xhigh"}
messages = [{"role": "user", "content": "hi"}]
kwargs = agent._build_api_kwargs(messages)
"""xhigh effort should normalize to high for Copilot GitHub Models."""
from agent.transports import get_transport
from providers import get_provider_profile
transport = get_transport("chat_completions")
profile = get_provider_profile("copilot")
msgs = [{"role": "user", "content": "hi"}]
kwargs = transport.build_kwargs(
model="gpt-5.4",
messages=msgs,
tools=None,
supports_reasoning=True,
reasoning_config={"enabled": True, "effort": "xhigh"},
provider_profile=profile,
)
assert kwargs["extra_body"]["reasoning"] == {"effort": "high"}
def test_reasoning_omitted_for_non_reasoning_copilot_model(self, agent):
@@ -1260,6 +1292,7 @@ class TestBuildApiKwargs:
def test_qwen_portal_formats_messages_and_metadata(self, agent):
agent.provider = "qwen-oauth"
agent.base_url = "https://portal.qwen.ai/v1"
agent._base_url_lower = agent.base_url.lower()
agent.session_id = "sess-123"
@@ -1276,6 +1309,7 @@ class TestBuildApiKwargs:
assert kwargs["messages"][2]["content"][0]["text"] == "hi"
def test_qwen_portal_normalizes_bare_string_content_parts(self, agent):
agent.provider = "qwen-oauth"
agent.base_url = "https://portal.qwen.ai/v1"
agent._base_url_lower = agent.base_url.lower()
messages = [
@@ -1288,6 +1322,7 @@ class TestBuildApiKwargs:
assert user_content[1] == {"type": "text", "text": "world"}
def test_qwen_portal_no_system_message(self, agent):
agent.provider = "qwen-oauth"
agent.base_url = "https://portal.qwen.ai/v1"
agent._base_url_lower = agent.base_url.lower()
messages = [{"role": "user", "content": "hi"}]
@@ -1308,6 +1343,7 @@ class TestBuildApiKwargs:
def test_qwen_portal_default_max_tokens(self, agent):
"""When max_tokens is None, Qwen Portal gets a default of 65536
to prevent reasoning models from exhausting their output budget."""
agent.provider = "qwen-oauth"
agent.base_url = "https://portal.qwen.ai/v1"
agent._base_url_lower = agent.base_url.lower()
agent.max_tokens = None
@@ -3843,7 +3879,7 @@ def test_aiagent_uses_copilot_acp_client():
patch("run_agent.get_tool_definitions", return_value=_make_tool_defs("web_search")),
patch("run_agent.check_toolset_requirements", return_value={}),
patch("run_agent.OpenAI") as mock_openai,
patch("agent.copilot_acp_client.CopilotACPClient") as mock_acp_client,
patch("acp_adapter.copilot_client.CopilotACPClient") as mock_acp_client,
):
acp_client = MagicMock()
mock_acp_client.return_value = acp_client
+248
View File
@@ -0,0 +1,248 @@
"""Tests for pre_approval_request / post_approval_response plugin hooks.
These hooks fire in tools/approval.py::check_all_command_guards whenever a
dangerous command needs user approval. They are observer-only (return values
ignored) and must fire on BOTH the CLI-interactive path and the async gateway
path, so external tools like macOS notifiers can be alerted regardless of
which surface the user is on.
"""
from unittest.mock import patch
import pytest
import tools.approval as approval_module
from tools.approval import (
check_all_command_guards,
register_gateway_notify,
unregister_gateway_notify,
resolve_gateway_approval,
set_current_session_key,
clear_session,
)
@pytest.fixture
def isolated_session(monkeypatch):
"""Give each test a fresh session_key and clean approval-state."""
session_key = "test:session:approval_hooks"
token = set_current_session_key(session_key)
monkeypatch.setenv("HERMES_SESSION_KEY", session_key)
# Make sure we don't skip guards via yolo / approvals.mode=off
monkeypatch.delenv("HERMES_YOLO_MODE", raising=False)
try:
yield session_key
finally:
try:
approval_module._approval_session_key.reset(token)
except Exception:
pass
clear_session(session_key)
class TestCliPathFiresHooks:
"""CLI-interactive approval path: HERMES_INTERACTIVE is set, the
prompt_dangerous_approval() result decides the outcome."""
def test_pre_and_post_fire_with_expected_kwargs(
self, isolated_session, monkeypatch
):
monkeypatch.setenv("HERMES_INTERACTIVE", "1")
monkeypatch.delenv("HERMES_GATEWAY_SESSION", raising=False)
monkeypatch.delenv("HERMES_EXEC_ASK", raising=False)
# approvals.mode=manual so we actually reach the prompt site
monkeypatch.setattr(approval_module, "_get_approval_mode", lambda: "manual")
captured = []
def fake_invoke_hook(hook_name, **kwargs):
captured.append((hook_name, kwargs))
return []
# Force the user to "approve once" via the approval_callback contract
def cb(command, description, *, allow_permanent=True):
return "once"
with patch("hermes_cli.plugins.invoke_hook", side_effect=fake_invoke_hook):
result = check_all_command_guards(
"rm -rf /tmp/test-hook", "local", approval_callback=cb,
)
assert result["approved"] is True
hook_names = [c[0] for c in captured]
assert "pre_approval_request" in hook_names
assert "post_approval_response" in hook_names
pre_kwargs = next(kw for name, kw in captured if name == "pre_approval_request")
assert pre_kwargs["command"] == "rm -rf /tmp/test-hook"
assert pre_kwargs["surface"] == "cli"
assert pre_kwargs["session_key"] == isolated_session
assert isinstance(pre_kwargs["pattern_keys"], list)
assert pre_kwargs["pattern_key"] # non-empty primary pattern
assert pre_kwargs["description"]
post_kwargs = next(kw for name, kw in captured if name == "post_approval_response")
assert post_kwargs["choice"] == "once"
assert post_kwargs["surface"] == "cli"
assert post_kwargs["command"] == "rm -rf /tmp/test-hook"
def test_deny_reported_to_post_hook(self, isolated_session, monkeypatch):
monkeypatch.setenv("HERMES_INTERACTIVE", "1")
monkeypatch.delenv("HERMES_GATEWAY_SESSION", raising=False)
monkeypatch.delenv("HERMES_EXEC_ASK", raising=False)
monkeypatch.setattr(approval_module, "_get_approval_mode", lambda: "manual")
captured = []
def fake_invoke_hook(hook_name, **kwargs):
captured.append((hook_name, kwargs))
return []
def cb(command, description, *, allow_permanent=True):
return "deny"
with patch("hermes_cli.plugins.invoke_hook", side_effect=fake_invoke_hook):
result = check_all_command_guards(
"rm -rf /tmp/test-deny", "local", approval_callback=cb,
)
assert result["approved"] is False
post_kwargs = next(kw for name, kw in captured if name == "post_approval_response")
assert post_kwargs["choice"] == "deny"
def test_plugin_hook_crash_does_not_break_approval(
self, isolated_session, monkeypatch
):
"""A crashing plugin must never prevent the approval flow from
reaching the user. Hooks are observer-only and safety-critical
behavior must be preserved."""
monkeypatch.setenv("HERMES_INTERACTIVE", "1")
monkeypatch.delenv("HERMES_GATEWAY_SESSION", raising=False)
monkeypatch.delenv("HERMES_EXEC_ASK", raising=False)
monkeypatch.setattr(approval_module, "_get_approval_mode", lambda: "manual")
def boom(hook_name, **kwargs):
raise RuntimeError("plugin crashed")
def cb(command, description, *, allow_permanent=True):
return "once"
with patch("hermes_cli.plugins.invoke_hook", side_effect=boom):
result = check_all_command_guards(
"rm -rf /tmp/test-crash", "local", approval_callback=cb,
)
# User's approval was still honored despite the plugin crashing
assert result["approved"] is True
class TestGatewayPathFiresHooks:
"""Async gateway approval path: HERMES_GATEWAY_SESSION is set and a
gateway notify callback is registered. The agent thread blocks on the
approval event until resolve_gateway_approval() is called from another
thread."""
def test_pre_and_post_fire_on_gateway_surface(
self, isolated_session, monkeypatch
):
import threading
monkeypatch.delenv("HERMES_INTERACTIVE", raising=False)
monkeypatch.setenv("HERMES_GATEWAY_SESSION", "1")
monkeypatch.delenv("HERMES_EXEC_ASK", raising=False)
monkeypatch.setattr(approval_module, "_get_approval_mode", lambda: "manual")
# Short gateway_timeout so a buggy test fails fast instead of hanging
monkeypatch.setattr(
approval_module, "_get_approval_config", lambda: {"gateway_timeout": 10}
)
captured = []
def fake_invoke_hook(hook_name, **kwargs):
captured.append((hook_name, kwargs))
return []
notify_seen = threading.Event()
def notify_cb(approval_data):
notify_seen.set()
register_gateway_notify(isolated_session, notify_cb)
result_holder = {}
def run_guard():
with patch("hermes_cli.plugins.invoke_hook", side_effect=fake_invoke_hook):
result_holder["result"] = check_all_command_guards(
"rm -rf /tmp/test-gateway-hook", "local",
)
t = threading.Thread(target=run_guard, daemon=True)
t.start()
# Wait for the gateway callback to see the approval request
assert notify_seen.wait(timeout=5), "Gateway notify never fired"
# User approves from the "other thread" (simulating /approve command)
resolve_gateway_approval(isolated_session, "once")
t.join(timeout=5)
assert not t.is_alive(), "Agent thread never unblocked"
unregister_gateway_notify(isolated_session)
assert result_holder["result"]["approved"] is True
hook_names = [c[0] for c in captured]
assert "pre_approval_request" in hook_names
assert "post_approval_response" in hook_names
pre_kwargs = next(kw for name, kw in captured if name == "pre_approval_request")
assert pre_kwargs["surface"] == "gateway"
assert pre_kwargs["command"] == "rm -rf /tmp/test-gateway-hook"
post_kwargs = next(kw for name, kw in captured if name == "post_approval_response")
assert post_kwargs["surface"] == "gateway"
assert post_kwargs["choice"] == "once"
def test_timeout_reports_timeout_choice(self, isolated_session, monkeypatch):
import threading
monkeypatch.delenv("HERMES_INTERACTIVE", raising=False)
monkeypatch.setenv("HERMES_GATEWAY_SESSION", "1")
monkeypatch.delenv("HERMES_EXEC_ASK", raising=False)
monkeypatch.setattr(approval_module, "_get_approval_mode", lambda: "manual")
monkeypatch.setattr(
approval_module, "_get_approval_config", lambda: {"gateway_timeout": 1}
)
captured = []
def fake_invoke_hook(hook_name, **kwargs):
captured.append((hook_name, kwargs))
return []
notify_seen = threading.Event()
def notify_cb(approval_data):
notify_seen.set()
register_gateway_notify(isolated_session, notify_cb)
result_holder = {}
def run_guard():
with patch("hermes_cli.plugins.invoke_hook", side_effect=fake_invoke_hook):
result_holder["result"] = check_all_command_guards(
"rm -rf /tmp/test-gateway-timeout", "local",
)
t = threading.Thread(target=run_guard, daemon=True)
t.start()
assert notify_seen.wait(timeout=5)
# Deliberately do NOT resolve -- let it time out
t.join(timeout=5)
assert not t.is_alive()
unregister_gateway_notify(isolated_session)
assert result_holder["result"]["approved"] is False
post_kwargs = next(kw for name, kw in captured if name == "post_approval_response")
assert post_kwargs["choice"] == "timeout"
+76
View File
@@ -30,6 +30,32 @@ _approval_session_key: contextvars.ContextVar[str] = contextvars.ContextVar(
)
def _fire_approval_hook(hook_name: str, **kwargs) -> None:
"""Invoke a plugin lifecycle hook for the approval system.
Lazy-imports the plugin manager to avoid circular imports (approval.py is
imported very early, long before plugins are discovered). Never raises --
plugin errors are logged and swallowed.
Only fires for the two approval-specific hooks in VALID_HOOKS:
pre_approval_request, post_approval_response.
"""
try:
from hermes_cli.plugins import invoke_hook
except Exception:
# Plugin system not available in this execution context
# (e.g. bare tool-only imports, minimal test environments).
return
try:
invoke_hook(hook_name, **kwargs)
except Exception as exc:
# invoke_hook() already swallows per-callback errors, so reaching here
# means the dispatch layer itself failed. Log and move on -- approval
# flow is safety-critical, plugin observability is not.
logger.debug("Approval hook %s dispatch failed: %s", hook_name, exc)
def set_current_session_key(session_key: str) -> contextvars.Token[str]:
"""Bind the active approval session key to the current context."""
return _approval_session_key.set(session_key or "")
@@ -1002,6 +1028,19 @@ def check_all_command_guards(command: str, env_type: str,
with _lock:
_gateway_queues.setdefault(session_key, []).append(entry)
# Notify plugins that an approval is being requested. Fires before
# the gateway notify callback so observers (e.g. macOS notifier
# plugins, audit logs, Slack alerts) get the event in real time.
_fire_approval_hook(
"pre_approval_request",
command=command,
description=combined_desc,
pattern_key=primary_key,
pattern_keys=list(all_keys),
session_key=session_key,
surface="gateway",
)
# Notify the user (bridges sync agent thread → async gateway)
try:
notify_cb(approval_data)
@@ -1067,6 +1106,24 @@ def check_all_command_guards(command: str, env_type: str,
_gateway_queues.pop(session_key, None)
choice = entry.result
# Normalize outcome for the post hook. Unresolved (timeout) and
# None both mean the user never responded; report that explicitly
# so plugins can distinguish timeout from explicit deny.
_outcome = (
"timeout" if not resolved
else (choice if choice else "timeout")
)
_fire_approval_hook(
"post_approval_response",
command=command,
description=combined_desc,
pattern_key=primary_key,
pattern_keys=list(all_keys),
session_key=session_key,
surface="gateway",
choice=_outcome,
)
if not resolved or choice is None or choice == "deny":
reason = "timed out" if not resolved else "denied by user"
return {
@@ -1111,9 +1168,28 @@ def check_all_command_guards(command: str, env_type: str,
# CLI interactive: single combined prompt
# Hide [a]lways when any tirith warning is present
_fire_approval_hook(
"pre_approval_request",
command=command,
description=combined_desc,
pattern_key=primary_key,
pattern_keys=list(all_keys),
session_key=session_key,
surface="cli",
)
choice = prompt_dangerous_approval(command, combined_desc,
allow_permanent=not has_tirith,
approval_callback=approval_callback)
_fire_approval_hook(
"post_approval_response",
command=command,
description=combined_desc,
pattern_key=primary_key,
pattern_keys=list(all_keys),
session_key=session_key,
surface="cli",
choice=choice,
)
if choice == "deny":
return {
Generated
+186 -27
View File
@@ -9,7 +9,7 @@ resolution-markers = [
]
[options]
exclude-newer = "2026-04-17T16:49:45.944715922Z"
exclude-newer = "2026-04-19T17:00:07.266826Z"
exclude-newer-span = "P7D"
[[package]]
@@ -564,30 +564,30 @@ wheels = [
[[package]]
name = "boto3"
version = "1.42.89"
version = "1.42.91"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "botocore" },
{ name = "jmespath" },
{ name = "s3transfer" },
]
sdist = { url = "https://files.pythonhosted.org/packages/bb/0c/f7bccb22b245cabf392816baba20f9e95f78ace7dbc580fd40136e80e732/boto3-1.42.89.tar.gz", hash = "sha256:3e43aacc0801bba9bcd23a8c271c089af297a69565f783fcdd357ae0e330bf1e", size = 113165, upload-time = "2026-04-13T19:36:17.516Z" }
sdist = { url = "https://files.pythonhosted.org/packages/a7/c0/98b8cec7ca22dde776df48c58940ae1abc425593959b7226e270760d726f/boto3-1.42.91.tar.gz", hash = "sha256:03d70532b17f7f84df37ca7e8c21553280454dea53ae12b15d1cfef9b16fcb8a", size = 113181, upload-time = "2026-04-17T19:31:06.251Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/b9/33/55103ba5ef9975ea54b8d39e69b76eb6e9fded3beae5f01065e26951a3a1/boto3-1.42.89-py3-none-any.whl", hash = "sha256:6204b189f4d0c655535f43d7eaa57ff4e8d965b8463c97e45952291211162932", size = 140556, upload-time = "2026-04-13T19:36:13.894Z" },
{ url = "https://files.pythonhosted.org/packages/02/29/faba6521257c34085cc9b439ef98235b581772580f417fa3629728007270/boto3-1.42.91-py3-none-any.whl", hash = "sha256:04e72071cde022951ce7f81bd9933c90095ab8923e8ced61c8dacfe9edac0f5c", size = 140553, upload-time = "2026-04-17T19:31:02.57Z" },
]
[[package]]
name = "botocore"
version = "1.42.89"
version = "1.42.91"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "jmespath" },
{ name = "python-dateutil" },
{ name = "urllib3" },
]
sdist = { url = "https://files.pythonhosted.org/packages/0f/cc/e6be943efa9051bd15c2ee14077c2b10d6e27c9e9385fc43a03a5c4ed8b5/botocore-1.42.89.tar.gz", hash = "sha256:95ac52f472dad29942f3088b278ab493044516c16dbf9133c975af16527baa99", size = 15206290, upload-time = "2026-04-13T19:36:02.321Z" }
sdist = { url = "https://files.pythonhosted.org/packages/21/bc/a4b7c46471c2e789ad8c4c7acfd7f302fdb481d93ff870f441249b924ae6/botocore-1.42.91.tar.gz", hash = "sha256:d252e27bc454afdbf5ed3dc617aa423f2c855c081e98b7963093399483ecc698", size = 15213010, upload-time = "2026-04-17T19:30:50.793Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/91/f1/90a7b8eda38b7c3a65ca7ee0075bdf310b6b471cb1b95fab6e8994323a50/botocore-1.42.89-py3-none-any.whl", hash = "sha256:d9b786c8d9db6473063b4cc5be0ba7e6a381082307bd6afb69d4216f9fa95f35", size = 14887287, upload-time = "2026-04-13T19:35:56.677Z" },
{ url = "https://files.pythonhosted.org/packages/b1/fc/24cc0a47c824f13933e210e9ad034b4fba22f7185b8d904c0fbf5a3b2be8/botocore-1.42.91-py3-none-any.whl", hash = "sha256:7a28c3cc6bfab5724ad18899d52402b776a0de7d87fa20c3c5270bcaaf199ce8", size = 14897344, upload-time = "2026-04-17T19:30:44.245Z" },
]
[[package]]
@@ -1759,6 +1759,77 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/6a/09/e21df6aef1e1ffc0c816f0522ddc3f6dcded766c3261813131c78a704470/gitpython-3.1.46-py3-none-any.whl", hash = "sha256:79812ed143d9d25b6d176a10bb511de0f9c67b1fa641d82097b0ab90398a2058", size = 208620, upload-time = "2026-01-01T15:37:30.574Z" },
]
[[package]]
name = "google-api-core"
version = "2.30.3"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "google-auth" },
{ name = "googleapis-common-protos" },
{ name = "proto-plus" },
{ name = "protobuf" },
{ name = "requests" },
]
sdist = { url = "https://files.pythonhosted.org/packages/16/ce/502a57fb0ec752026d24df1280b162294b22a0afb98a326084f9a979138b/google_api_core-2.30.3.tar.gz", hash = "sha256:e601a37f148585319b26db36e219df68c5d07b6382cff2d580e83404e44d641b", size = 177001, upload-time = "2026-04-10T00:41:28.035Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/03/15/e56f351cf6ef1cfea58e6ac226a7318ed1deb2218c4b3cc9bd9e4b786c5a/google_api_core-2.30.3-py3-none-any.whl", hash = "sha256:a85761ba72c444dad5d611c2220633480b2b6be2521eca69cca2dbb3ffd6bfe8", size = 173274, upload-time = "2026-04-09T22:57:16.198Z" },
]
[[package]]
name = "google-api-python-client"
version = "2.194.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "google-api-core" },
{ name = "google-auth" },
{ name = "google-auth-httplib2" },
{ name = "httplib2" },
{ name = "uritemplate" },
]
sdist = { url = "https://files.pythonhosted.org/packages/60/ab/e83af0eb043e4ccc49571ca7a6a49984e9d00f4e9e6e6f1238d60bc84dce/google_api_python_client-2.194.0.tar.gz", hash = "sha256:db92647bd1a90f40b79c9618461553c2b20b6a43ce7395fa6de07132dc14f023", size = 14443469, upload-time = "2026-04-08T23:07:35.757Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/b0/34/5a624e49f179aa5b0cb87b2ce8093960299030ff40423bfbde09360eb908/google_api_python_client-2.194.0-py3-none-any.whl", hash = "sha256:61eaaac3b8fc8fdf11c08af87abc3d1342d1b37319cc1b57405f86ef7697e717", size = 15016514, upload-time = "2026-04-08T23:07:33.093Z" },
]
[[package]]
name = "google-auth"
version = "2.49.2"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "cryptography" },
{ name = "pyasn1-modules" },
]
sdist = { url = "https://files.pythonhosted.org/packages/c6/fc/e925290a1ad95c975c459e2df070fac2b90954e13a0370ac505dff78cb99/google_auth-2.49.2.tar.gz", hash = "sha256:c1ae38500e73065dcae57355adb6278cf8b5c8e391994ae9cbadbcb9631ab409", size = 333958, upload-time = "2026-04-10T00:41:21.888Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/73/76/d241a5c927433420507215df6cac1b1fa4ac0ba7a794df42a84326c68da8/google_auth-2.49.2-py3-none-any.whl", hash = "sha256:c2720924dfc82dedb962c9f52cabb2ab16714fd0a6a707e40561d217574ed6d5", size = 240638, upload-time = "2026-04-10T00:41:14.501Z" },
]
[[package]]
name = "google-auth-httplib2"
version = "0.3.1"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "google-auth" },
{ name = "httplib2" },
]
sdist = { url = "https://files.pythonhosted.org/packages/ed/99/107612bef8d24b298bb5a7c8466f908ecda791d43f9466f5c3978f5b24c1/google_auth_httplib2-0.3.1.tar.gz", hash = "sha256:0af542e815784cb64159b4469aa5d71dd41069ba93effa006e1916b1dcd88e55", size = 11152, upload-time = "2026-03-30T22:50:26.766Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/97/e9/93afb14d23a949acaa3f4e7cc51a0024671174e116e35f42850764b99634/google_auth_httplib2-0.3.1-py3-none-any.whl", hash = "sha256:682356a90ef4ba3d06548c37e9112eea6fc00395a11b0303a644c1a86abc275c", size = 9534, upload-time = "2026-03-30T22:49:03.384Z" },
]
[[package]]
name = "google-auth-oauthlib"
version = "1.3.1"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "google-auth" },
{ name = "requests-oauthlib" },
]
sdist = { url = "https://files.pythonhosted.org/packages/a6/82/62482931dcbe5266a2680d0da17096f2aab983ecb320277d9556700ce00e/google_auth_oauthlib-1.3.1.tar.gz", hash = "sha256:14c22c7b3dd3d06dbe44264144409039465effdd1eef94f7ce3710e486cc4bfa", size = 21663, upload-time = "2026-03-30T22:49:56.408Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/2a/e0/cb454a95f460903e39f101e950038ec24a072ca69d0a294a6df625cc1627/google_auth_oauthlib-1.3.1-py3-none-any.whl", hash = "sha256:1a139ef23f1318756805b0e95f655c238bffd29655329a2978218248da4ee7f8", size = 19247, upload-time = "2026-03-30T20:02:23.894Z" },
]
[[package]]
name = "googleapis-common-protos"
version = "1.73.0"
@@ -1912,6 +1983,9 @@ all = [
{ name = "elevenlabs" },
{ name = "fastapi" },
{ name = "faster-whisper" },
{ name = "google-api-python-client" },
{ name = "google-auth-httplib2" },
{ name = "google-auth-oauthlib" },
{ name = "honcho-ai" },
{ name = "lark-oapi" },
{ name = "markdown", marker = "sys_platform == 'linux'" },
@@ -1965,6 +2039,11 @@ feishu = [
{ name = "lark-oapi" },
{ name = "qrcode" },
]
google = [
{ name = "google-api-python-client" },
{ name = "google-auth-httplib2" },
{ name = "google-auth-oauthlib" },
]
homeassistant = [
{ name = "aiohttp" },
]
@@ -2064,6 +2143,9 @@ requires-dist = [
{ name = "faster-whisper", marker = "extra == 'voice'", specifier = ">=1.0.0,<2" },
{ name = "fire", specifier = ">=0.7.1,<1" },
{ name = "firecrawl-py", specifier = ">=4.16.0,<5" },
{ name = "google-api-python-client", marker = "extra == 'google'", specifier = ">=2.100,<3" },
{ name = "google-auth-httplib2", marker = "extra == 'google'", specifier = ">=0.2,<1" },
{ name = "google-auth-oauthlib", marker = "extra == 'google'", specifier = ">=1.0,<2" },
{ name = "hermes-agent", extras = ["acp"], marker = "extra == 'all'" },
{ name = "hermes-agent", extras = ["acp"], marker = "extra == 'termux'" },
{ name = "hermes-agent", extras = ["bedrock"], marker = "extra == 'all'" },
@@ -2075,6 +2157,7 @@ requires-dist = [
{ name = "hermes-agent", extras = ["dev"], marker = "extra == 'all'" },
{ name = "hermes-agent", extras = ["dingtalk"], marker = "extra == 'all'" },
{ name = "hermes-agent", extras = ["feishu"], marker = "extra == 'all'" },
{ name = "hermes-agent", extras = ["google"], marker = "extra == 'all'" },
{ name = "hermes-agent", extras = ["homeassistant"], marker = "extra == 'all'" },
{ name = "hermes-agent", extras = ["honcho"], marker = "extra == 'all'" },
{ name = "hermes-agent", extras = ["honcho"], marker = "extra == 'termux'" },
@@ -2136,7 +2219,7 @@ requires-dist = [
{ name = "wandb", marker = "extra == 'rl'", specifier = ">=0.15.0,<1" },
{ name = "yc-bench", marker = "python_full_version >= '3.12' and extra == 'yc-bench'", git = "https://github.com/collinear-ai/yc-bench.git?rev=bfb0c88062450f46341bd9a5298903fc2e952a5c" },
]
provides-extras = ["modal", "daytona", "dev", "messaging", "cron", "slack", "matrix", "cli", "tts-premium", "voice", "pty", "honcho", "mcp", "homeassistant", "sms", "acp", "mistral", "bedrock", "termux", "dingtalk", "feishu", "web", "rl", "yc-bench", "all"]
provides-extras = ["modal", "daytona", "dev", "messaging", "cron", "slack", "matrix", "cli", "tts-premium", "voice", "pty", "honcho", "mcp", "homeassistant", "sms", "acp", "mistral", "bedrock", "termux", "dingtalk", "feishu", "google", "web", "rl", "yc-bench", "all"]
[[package]]
name = "hf-transfer"
@@ -2238,6 +2321,18 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/7e/f5/f66802a942d491edb555dd61e3a9961140fd64c90bce1eafd741609d334d/httpcore-1.0.9-py3-none-any.whl", hash = "sha256:2d400746a40668fc9dec9810239072b40b4484b640a8c38fd654a024c7a1bf55", size = 78784, upload-time = "2025-04-24T22:06:20.566Z" },
]
[[package]]
name = "httplib2"
version = "0.31.2"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "pyparsing" },
]
sdist = { url = "https://files.pythonhosted.org/packages/c1/1f/e86365613582c027dda5ddb64e1010e57a3d53e99ab8a72093fa13d565ec/httplib2-0.31.2.tar.gz", hash = "sha256:385e0869d7397484f4eab426197a4c020b606edd43372492337c0b4010ae5d24", size = 250800, upload-time = "2026-01-23T11:04:44.165Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/2f/90/fd509079dfcab01102c0fdd87f3a9506894bc70afcf9e9785ef6b2b3aff6/httplib2-0.31.2-py3-none-any.whl", hash = "sha256:dbf0c2fa3862acf3c55c078ea9c0bc4481d7dc5117cae71be9514912cf9f8349", size = 91099, upload-time = "2026-01-23T11:04:42.78Z" },
]
[[package]]
name = "httptools"
version = "0.7.1"
@@ -3277,6 +3372,15 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/57/a7/b35835e278c18b85206834b3aa3abe68e77a98769c59233d1f6300284781/numpy-2.4.3-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:4b42639cdde6d24e732ff823a3fa5b701d8acad89c4142bc1d0bd6dc85200ba5", size = 12504685, upload-time = "2026-03-09T07:58:50.525Z" },
]
[[package]]
name = "oauthlib"
version = "3.3.1"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/0b/5f/19930f824ffeb0ad4372da4812c50edbd1434f678c90c2733e1188edfc63/oauthlib-3.3.1.tar.gz", hash = "sha256:0f0f8aa759826a193cf66c12ea1af1637f87b9b4622d46e866952bb022e538c9", size = 185918, upload-time = "2025-06-19T22:48:08.269Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/be/9c/92789c596b8df838baa98fa71844d84283302f7604ed565dafe5a6b5041a/oauthlib-3.3.1-py3-none-any.whl", hash = "sha256:88119c938d2b8fb88561af5f6ee0eec8cc8d552b7bb1f712743136eb7523b7a1", size = 160065, upload-time = "2025-06-19T22:48:06.508Z" },
]
[[package]]
name = "obstore"
version = "0.8.2"
@@ -3855,6 +3959,18 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/5b/5a/bc7b4a4ef808fa59a816c17b20c4bef6884daebbdf627ff2a161da67da19/propcache-0.4.1-py3-none-any.whl", hash = "sha256:af2a6052aeb6cf17d3e46ee169099044fd8224cbaf75c76a2ef596e8163e2237", size = 13305, upload-time = "2025-10-08T19:49:00.792Z" },
]
[[package]]
name = "proto-plus"
version = "1.27.2"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "protobuf" },
]
sdist = { url = "https://files.pythonhosted.org/packages/81/0d/94dfe80193e79d55258345901acd2917523d56e8381bc4dee7fd38e3868a/proto_plus-1.27.2.tar.gz", hash = "sha256:b2adde53adadf75737c44d3dcb0104fde65250dfc83ad59168b4aa3e574b6a24", size = 57204, upload-time = "2026-03-26T22:18:57.174Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/84/f3/1fba73eeffafc998a25d59703b63f8be4fe8a5cb12eaff7386a0ba0f7125/proto_plus-1.27.2-py3-none-any.whl", hash = "sha256:6432f75893d3b9e70b9c412f1d2f03f65b11fb164b793d14ae2ca01821d22718", size = 50450, upload-time = "2026-03-26T22:13:42.927Z" },
]
[[package]]
name = "protobuf"
version = "6.33.5"
@@ -3929,6 +4045,27 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/50/f2/c0e76a0b451ffdf0cf788932e182758eb7558953f4f27f1aff8e2518b653/pyarrow-23.0.1-cp314-cp314t-win_amd64.whl", hash = "sha256:527e8d899f14bd15b740cd5a54ad56b7f98044955373a17179d5956ddb93d9ce", size = 28365807, upload-time = "2026-02-16T10:14:03.892Z" },
]
[[package]]
name = "pyasn1"
version = "0.6.3"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/5c/5f/6583902b6f79b399c9c40674ac384fd9cd77805f9e6205075f828ef11fb2/pyasn1-0.6.3.tar.gz", hash = "sha256:697a8ecd6d98891189184ca1fa05d1bb00e2f84b5977c481452050549c8a72cf", size = 148685, upload-time = "2026-03-17T01:06:53.382Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/5d/a0/7d793dce3fa811fe047d6ae2431c672364b462850c6235ae306c0efd025f/pyasn1-0.6.3-py3-none-any.whl", hash = "sha256:a80184d120f0864a52a073acc6fc642847d0be408e7c7252f31390c0f4eadcde", size = 83997, upload-time = "2026-03-17T01:06:52.036Z" },
]
[[package]]
name = "pyasn1-modules"
version = "0.4.2"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "pyasn1" },
]
sdist = { url = "https://files.pythonhosted.org/packages/e9/e6/78ebbb10a8c8e4b61a59249394a4a594c1a7af95593dc933a349c8d00964/pyasn1_modules-0.4.2.tar.gz", hash = "sha256:677091de870a80aae844b1ca6134f54652fa2c8c5a52aa396440ac3106e941e6", size = 307892, upload-time = "2025-03-28T02:41:22.17Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/47/8d/d529b5d697919ba8c11ad626e835d4039be708a35b0d22de83a269a6682c/pyasn1_modules-0.4.2-py3-none-any.whl", hash = "sha256:29253a9207ce32b64c3ac6600edc75368f98473906e8fd1043bd6b5b1de2c14a", size = 181259, upload-time = "2025-03-28T02:41:19.028Z" },
]
[[package]]
name = "pycparser"
version = "3.0"
@@ -4529,6 +4666,19 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/56/5d/c814546c2333ceea4ba42262d8c4d55763003e767fa169adc693bd524478/requests-2.33.0-py3-none-any.whl", hash = "sha256:3324635456fa185245e24865e810cecec7b4caf933d7eb133dcde67d48cee69b", size = 65017, upload-time = "2026-03-25T15:10:40.382Z" },
]
[[package]]
name = "requests-oauthlib"
version = "2.0.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "oauthlib" },
{ name = "requests" },
]
sdist = { url = "https://files.pythonhosted.org/packages/42/f2/05f29bc3913aea15eb670be136045bf5c5bbf4b99ecb839da9b422bb2c85/requests-oauthlib-2.0.0.tar.gz", hash = "sha256:b3dffaebd884d8cd778494369603a9e7b58d29111bf6b41bdc2dcd87203af4e9", size = 55650, upload-time = "2024-03-22T20:32:29.939Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/3b/5d/63d4ae3b9daea098d5d6f5da83984853c1bbacd5dc826764b249fe119d24/requests_oauthlib-2.0.0-py2.py3-none-any.whl", hash = "sha256:7dd8a5c40426b779b0868c404bdef9768deccf22749cde15852df527e6269b36", size = 24179, upload-time = "2024-03-22T20:32:28.055Z" },
]
[[package]]
name = "requests-toolbelt"
version = "1.0.0"
@@ -4664,27 +4814,27 @@ wheels = [
[[package]]
name = "ruff"
version = "0.15.10"
version = "0.15.11"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/e7/d9/aa3f7d59a10ef6b14fe3431706f854dbf03c5976be614a9796d36326810c/ruff-0.15.10.tar.gz", hash = "sha256:d1f86e67ebfdef88e00faefa1552b5e510e1d35f3be7d423dc7e84e63788c94e", size = 4631728, upload-time = "2026-04-09T14:06:09.884Z" }
sdist = { url = "https://files.pythonhosted.org/packages/e4/8d/192f3d7103816158dfd5ea50d098ef2aec19194e6cbccd4b3485bdb2eb2d/ruff-0.15.11.tar.gz", hash = "sha256:f092b21708bf0e7437ce9ada249dfe688ff9a0954fc94abab05dcea7dcd29c33", size = 4637264, upload-time = "2026-04-16T18:46:26.58Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/eb/00/a1c2fdc9939b2c03691edbda290afcd297f1f389196172826b03d6b6a595/ruff-0.15.10-py3-none-linux_armv6l.whl", hash = "sha256:0744e31482f8f7d0d10a11fcbf897af272fefdfcb10f5af907b18c2813ff4d5f", size = 10563362, upload-time = "2026-04-09T14:06:21.189Z" },
{ url = "https://files.pythonhosted.org/packages/5c/15/006990029aea0bebe9d33c73c3e28c80c391ebdba408d1b08496f00d422d/ruff-0.15.10-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:b1e7c16ea0ff5a53b7c2df52d947e685973049be1cdfe2b59a9c43601897b22e", size = 10951122, upload-time = "2026-04-09T14:06:02.236Z" },
{ url = "https://files.pythonhosted.org/packages/f2/c0/4ac978fe874d0618c7da647862afe697b281c2806f13ce904ad652fa87e4/ruff-0.15.10-py3-none-macosx_11_0_arm64.whl", hash = "sha256:93cc06a19e5155b4441dd72808fdf84290d84ad8a39ca3b0f994363ade4cebb1", size = 10314005, upload-time = "2026-04-09T14:06:00.026Z" },
{ url = "https://files.pythonhosted.org/packages/da/73/c209138a5c98c0d321266372fc4e33ad43d506d7e5dd817dd89b60a8548f/ruff-0.15.10-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:83e1dd04312997c99ea6965df66a14fb4f03ba978564574ffc68b0d61fd3989e", size = 10643450, upload-time = "2026-04-09T14:05:42.137Z" },
{ url = "https://files.pythonhosted.org/packages/ec/76/0deec355d8ec10709653635b1f90856735302cb8e149acfdf6f82a5feb70/ruff-0.15.10-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:8154d43684e4333360fedd11aaa40b1b08a4e37d8ffa9d95fee6fa5b37b6fab1", size = 10379597, upload-time = "2026-04-09T14:05:49.984Z" },
{ url = "https://files.pythonhosted.org/packages/dc/be/86bba8fc8798c081e28a4b3bb6d143ccad3fd5f6f024f02002b8f08a9fa3/ruff-0.15.10-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:8ab88715f3a6deb6bde6c227f3a123410bec7b855c3ae331b4c006189e895cef", size = 11146645, upload-time = "2026-04-09T14:06:12.246Z" },
{ url = "https://files.pythonhosted.org/packages/a8/89/140025e65911b281c57be1d385ba1d932c2366ca88ae6663685aed8d4881/ruff-0.15.10-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:a768ff5969b4f44c349d48edf4ab4f91eddb27fd9d77799598e130fb628aa158", size = 12030289, upload-time = "2026-04-09T14:06:04.776Z" },
{ url = "https://files.pythonhosted.org/packages/88/de/ddacca9545a5e01332567db01d44bd8cf725f2db3b3d61a80550b48308ea/ruff-0.15.10-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:0ee3ef42dab7078bda5ff6a1bcba8539e9857deb447132ad5566a038674540d0", size = 11496266, upload-time = "2026-04-09T14:05:55.485Z" },
{ url = "https://files.pythonhosted.org/packages/bc/bb/7ddb00a83760ff4a83c4e2fc231fd63937cc7317c10c82f583302e0f6586/ruff-0.15.10-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:51cb8cc943e891ba99989dd92d61e29b1d231e14811db9be6440ecf25d5c1609", size = 11256418, upload-time = "2026-04-09T14:05:57.69Z" },
{ url = "https://files.pythonhosted.org/packages/dc/8d/55de0d35aacf6cd50b6ee91ee0f291672080021896543776f4170fc5c454/ruff-0.15.10-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:e59c9bdc056a320fb9ea1700a8d591718b8faf78af065484e801258d3a76bc3f", size = 11288416, upload-time = "2026-04-09T14:05:44.695Z" },
{ url = "https://files.pythonhosted.org/packages/68/cf/9438b1a27426ec46a80e0a718093c7f958ef72f43eb3111862949ead3cc1/ruff-0.15.10-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:136c00ca2f47b0018b073f28cb5c1506642a830ea941a60354b0e8bc8076b151", size = 10621053, upload-time = "2026-04-09T14:05:52.782Z" },
{ url = "https://files.pythonhosted.org/packages/4c/50/e29be6e2c135e9cd4cb15fbade49d6a2717e009dff3766dd080fcb82e251/ruff-0.15.10-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:8b80a2f3c9c8a950d6237f2ca12b206bccff626139be9fa005f14feb881a1ae8", size = 10378302, upload-time = "2026-04-09T14:06:14.361Z" },
{ url = "https://files.pythonhosted.org/packages/18/2f/e0b36a6f99c51bb89f3a30239bc7bf97e87a37ae80aa2d6542d6e5150364/ruff-0.15.10-py3-none-musllinux_1_2_i686.whl", hash = "sha256:e3e53c588164dc025b671c9df2462429d60357ea91af7e92e9d56c565a9f1b07", size = 10850074, upload-time = "2026-04-09T14:06:16.581Z" },
{ url = "https://files.pythonhosted.org/packages/11/08/874da392558ce087a0f9b709dc6ec0d60cbc694c1c772dab8d5f31efe8cb/ruff-0.15.10-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:b0c52744cf9f143a393e284125d2576140b68264a93c6716464e129a3e9adb48", size = 11358051, upload-time = "2026-04-09T14:06:18.948Z" },
{ url = "https://files.pythonhosted.org/packages/e4/46/602938f030adfa043e67112b73821024dc79f3ab4df5474c25fa4c1d2d14/ruff-0.15.10-py3-none-win32.whl", hash = "sha256:d4272e87e801e9a27a2e8df7b21011c909d9ddd82f4f3281d269b6ba19789ca5", size = 10588964, upload-time = "2026-04-09T14:06:07.14Z" },
{ url = "https://files.pythonhosted.org/packages/25/b6/261225b875d7a13b33a6d02508c39c28450b2041bb01d0f7f1a83d569512/ruff-0.15.10-py3-none-win_amd64.whl", hash = "sha256:28cb32d53203242d403d819fd6983152489b12e4a3ae44993543d6fe62ab42ed", size = 11745044, upload-time = "2026-04-09T14:05:39.473Z" },
{ url = "https://files.pythonhosted.org/packages/58/ed/dea90a65b7d9e69888890fb14c90d7f51bf0c1e82ad800aeb0160e4bacfd/ruff-0.15.10-py3-none-win_arm64.whl", hash = "sha256:601d1610a9e1f1c2165a4f561eeaa2e2ea1e97f3287c5aa258d3dab8b57c6188", size = 11035607, upload-time = "2026-04-09T14:05:47.593Z" },
{ url = "https://files.pythonhosted.org/packages/02/1e/6aca3427f751295ab011828e15e9bf452200ac74484f1db4be0197b8170b/ruff-0.15.11-py3-none-linux_armv6l.whl", hash = "sha256:e927cfff503135c558eb581a0c9792264aae9507904eb27809cdcff2f2c847b7", size = 10607943, upload-time = "2026-04-16T18:46:05.967Z" },
{ url = "https://files.pythonhosted.org/packages/e7/26/1341c262e74f36d4e84f3d6f4df0ac68cd53331a66bfc5080daa17c84c0b/ruff-0.15.11-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:7a1b5b2938d8f890b76084d4fa843604d787a912541eae85fd7e233398bbb73e", size = 10988592, upload-time = "2026-04-16T18:46:00.742Z" },
{ url = "https://files.pythonhosted.org/packages/03/71/850b1d6ffa9564fbb6740429bad53df1094082fe515c8c1e74b6d8d05f18/ruff-0.15.11-py3-none-macosx_11_0_arm64.whl", hash = "sha256:d4176f3d194afbdaee6e41b9ccb1a2c287dba8700047df474abfbe773825d1cb", size = 10338501, upload-time = "2026-04-16T18:46:03.723Z" },
{ url = "https://files.pythonhosted.org/packages/f2/11/cc1284d3e298c45a817a6aadb6c3e1d70b45c9b36d8d9cce3387b495a03a/ruff-0.15.11-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:3b17c886fb88203ced3afe7f14e8d5ae96e9d2f4ccc0ee66aa19f2c2675a27e4", size = 10670693, upload-time = "2026-04-16T18:46:41.941Z" },
{ url = "https://files.pythonhosted.org/packages/ce/9e/f8288b034ab72b371513c13f9a41d9ba3effac54e24bfb467b007daee2ca/ruff-0.15.11-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:49fafa220220afe7758a487b048de4c8f9f767f37dfefad46b9dd06759d003eb", size = 10416177, upload-time = "2026-04-16T18:46:21.717Z" },
{ url = "https://files.pythonhosted.org/packages/85/71/504d79abfd3d92532ba6bbe3d1c19fada03e494332a59e37c7c2dabae427/ruff-0.15.11-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:f2ab8427e74a00d93b8bda1307b1e60970d40f304af38bccb218e056c220120d", size = 11221886, upload-time = "2026-04-16T18:46:15.086Z" },
{ url = "https://files.pythonhosted.org/packages/43/5a/947e6ab7a5ad603d65b474be15a4cbc6d29832db5d762cd142e4e3a74164/ruff-0.15.11-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:195072c0c8e1fc8f940652073df082e37a5d9cb43b4ab1e4d0566ab8977a13b7", size = 12075183, upload-time = "2026-04-16T18:46:07.944Z" },
{ url = "https://files.pythonhosted.org/packages/9f/a1/0b7bb6268775fdd3a0818aee8efd8f5b4e231d24dd4d528ced2534023182/ruff-0.15.11-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:a3a0996d486af3920dec930a2e7daed4847dfc12649b537a9335585ada163e9e", size = 11516575, upload-time = "2026-04-16T18:46:31.687Z" },
{ url = "https://files.pythonhosted.org/packages/30/c3/bb5168fc4d233cc06e95f482770d0f3c87945a0cd9f614b90ea8dc2f2833/ruff-0.15.11-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:1bef2cb556d509259f1fe440bb9cd33c756222cf0a7afe90d15edf0866702431", size = 11306537, upload-time = "2026-04-16T18:46:36.988Z" },
{ url = "https://files.pythonhosted.org/packages/e4/92/4cfae6441f3967317946f3b788136eecf093729b94d6561f963ed810c82e/ruff-0.15.11-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:030d921a836d7d4a12cf6e8d984a88b66094ccb0e0f17ddd55067c331191bf19", size = 11296813, upload-time = "2026-04-16T18:46:24.182Z" },
{ url = "https://files.pythonhosted.org/packages/43/26/972784c5dde8313acde8ac71ba8ac65475b85db4a2352a76c9934361f9bc/ruff-0.15.11-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:0e783b599b4577788dbbb66b9addcef87e9a8832f4ce0c19e34bf55543a2f890", size = 10633136, upload-time = "2026-04-16T18:46:39.802Z" },
{ url = "https://files.pythonhosted.org/packages/5b/53/3985a4f185020c2f367f2e08a103032e12564829742a1b417980ce1514a0/ruff-0.15.11-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:ae90592246625ba4a34349d68ec28d4400d75182b71baa196ddb9f82db025ef5", size = 10424701, upload-time = "2026-04-16T18:46:10.381Z" },
{ url = "https://files.pythonhosted.org/packages/d3/57/bf0dfb32241b56c83bb663a826133da4bf17f682ba8c096973065f6e6a68/ruff-0.15.11-py3-none-musllinux_1_2_i686.whl", hash = "sha256:1f111d62e3c983ed20e0ca2e800f8d77433a5b1161947df99a5c2a3fb60514f0", size = 10873887, upload-time = "2026-04-16T18:46:29.157Z" },
{ url = "https://files.pythonhosted.org/packages/02/05/e48076b2a57dc33ee8c7a957296f97c744ca891a8ffb4ffb1aaa3b3f517d/ruff-0.15.11-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:06f483d6646f59eaffba9ae30956370d3a886625f511a3108994000480621d1c", size = 11404316, upload-time = "2026-04-16T18:46:19.462Z" },
{ url = "https://files.pythonhosted.org/packages/88/27/0195d15fe7a897cbcba0904792c4b7c9fdd958456c3a17d2ea6093716a9a/ruff-0.15.11-py3-none-win32.whl", hash = "sha256:476a2aa56b7da0b73a3ee80b6b2f0e19cce544245479adde7baa65466664d5f3", size = 10655535, upload-time = "2026-04-16T18:46:12.47Z" },
{ url = "https://files.pythonhosted.org/packages/3a/5e/c927b325bd4c1d3620211a4b96f47864633199feed60fa936025ab27e090/ruff-0.15.11-py3-none-win_amd64.whl", hash = "sha256:8b6756d88d7e234fb0c98c91511aae3cd519d5e3ed271cae31b20f39cb2a12a3", size = 11779692, upload-time = "2026-04-16T18:46:17.268Z" },
{ url = "https://files.pythonhosted.org/packages/63/b6/aeadee5443e49baa2facd51131159fd6301cc4ccfc1541e4df7b021c37dd/ruff-0.15.11-py3-none-win_arm64.whl", hash = "sha256:063fed18cc1bbe0ee7393957284a6fe8b588c6a406a285af3ee3f46da2391ee4", size = 11032614, upload-time = "2026-04-16T18:46:34.487Z" },
]
[[package]]
@@ -5268,6 +5418,15 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/4c/a7/563b2d8fb7edc07320bf69ac6a7eedcd7a1a9d663a6bb90a4d9bd2eda5f7/unpaddedbase64-2.1.0-py3-none-any.whl", hash = "sha256:485eff129c30175d2cd6f0cd8d2310dff51e666f7f36175f738d75dfdbd0b1c6", size = 6083, upload-time = "2021-03-09T11:35:46.7Z" },
]
[[package]]
name = "uritemplate"
version = "4.2.0"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/98/60/f174043244c5306c9988380d2cb10009f91563fc4b31293d27e17201af56/uritemplate-4.2.0.tar.gz", hash = "sha256:480c2ed180878955863323eea31b0ede668795de182617fef9c6ca09e6ec9d0e", size = 33267, upload-time = "2025-06-02T15:12:06.318Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/a9/99/3ae339466c9183ea5b8ae87b34c0b897eda475d2aec2307cae60e5cd4f29/uritemplate-4.2.0-py3-none-any.whl", hash = "sha256:962201ba1c4edcab02e60f9a0d3821e82dfc5d2d6662a21abd533879bdb8a686", size = 11488, upload-time = "2025-06-02T15:12:03.405Z" },
]
[[package]]
name = "urllib3"
version = "2.6.3"
@@ -93,6 +93,42 @@ This path includes everything from Path A plus:
11. `run_agent.py`
12. `pyproject.toml` if a provider SDK is required
## Fast path: Simple API-key providers
If your provider is just an OpenAI-compatible endpoint that authenticates with a single API key, you do not need to touch `auth.py`, `runtime_provider.py`, `main.py`, or any of the other files in the full checklist below.
All you need is:
1. A file in `providers/` (e.g. `providers/myprovider.py`) that calls `register_provider()` with the provider config.
2. That's it. `auth.py` auto-registers every file in `providers/` at startup via a module-level import sweep.
When you add a `providers/*.py` file and call `register_provider()`, the following wire up automatically:
1. `PROVIDER_REGISTRY` entry in `auth.py` (credential resolution, env-var lookup)
2. `api_mode` set to `chat_completions`
3. `base_url` sourced from the config or the declared env var
4. `env_vars` checked in priority order for the API key
5. `fallback_models` list registered for the provider
6. `--provider` CLI flag accepts the provider id
7. `hermes model` menu includes the provider
8. `hermes setup` wizard delegates to `main.py` automatically
9. `provider:model` alias syntax works
10. Runtime resolver returns the correct `base_url` and `api_key`
11. `HERMES_INFERENCE_PROVIDER` env-var override accepts the provider id
12. Fallback model activation can switch into the provider cleanly
See `providers/nvidia.py` or `providers/gmi.py` as a template.
## Full path: OAuth and complex providers
Use the full checklist below when your provider needs any of the following:
- OAuth or token refresh (Nous Portal, Codex, Google Gemini, Qwen Portal, Copilot)
- A non-OpenAI API shape that requires a new adapter (Anthropic Messages, Codex Responses)
- Custom endpoint detection or multi-region probing (z.ai, Kimi)
- A curated static model catalog or live `/models` fetch
- Provider-specific `hermes model` menu entries with bespoke auth flows
## Step 1: Pick one canonical provider id
Choose a single provider id and use it everywhere.
@@ -20,6 +20,9 @@ Primary implementation:
- `hermes_cli/auth.py` — provider registry, `resolve_provider()`
- `hermes_cli/model_switch.py` — shared `/model` switch pipeline (CLI + gateway)
- `agent/auxiliary_client.py` — auxiliary model routing
- `providers/` — declarative source for `api_mode`, `base_url`, `env_vars`, `fallback_models` (auto-registered into `auth.py` `PROVIDER_REGISTRY` at startup)
`get_provider_profile()` in `providers/` returns a typed dict for a given provider id. `runtime_provider.py` calls this at resolution time to get the canonical `base_url`, `env_vars` priority list, `api_mode`, and `fallback_models` without needing to duplicate that data in multiple files. Adding a new `providers/*.py` file that calls `register_provider()` is enough for `runtime_provider.py` to pick it up — no branch needed in the resolver itself.
If you are trying to add a new first-class inference provider, read [Adding Providers](./adding-providers.md) alongside this page.
+39 -1
View File
@@ -423,6 +423,44 @@ model:
For on-prem deployments (DGX Spark, local GPU), set `NVIDIA_BASE_URL=http://localhost:8000/v1`. NIM exposes the same OpenAI-compatible chat completions API as build.nvidia.com, so switching between cloud and local is a one-line env-var change.
:::
### GMI Cloud
Open and reasoning models via [GMI Cloud](https://inference.gmi.ai) — OpenAI-compatible API, API key authentication.
```bash
# GMI Cloud
hermes chat --provider gmi --model deepseek-ai/DeepSeek-R1
# Requires: GMI_API_KEY in ~/.hermes/.env
```
Or set it permanently in `config.yaml`:
```yaml
model:
provider: "gmi"
default: "deepseek-ai/DeepSeek-R1"
```
The base URL can be overridden with `GMI_BASE_URL` (default: `https://api.gmi.ai/v1`).
### StepFun
Step-series models via [StepFun](https://platform.stepfun.com) — OpenAI-compatible API, API key authentication.
```bash
# StepFun
hermes chat --provider stepfun --model step-3-mini
# Requires: STEPFUN_API_KEY in ~/.hermes/.env
```
Or set it permanently in `config.yaml`:
```yaml
model:
provider: "stepfun"
default: "step-3-mini"
```
The base URL can be overridden with `STEPFUN_BASE_URL` (default: `https://api.stepfun.com/v1`).
### Hugging Face Inference Providers
[Hugging Face Inference Providers](https://huggingface.co/docs/inference-providers) routes to 20+ open models through a unified OpenAI-compatible endpoint (`router.huggingface.co/v1`). Requests are automatically routed to the fastest available backend (Groq, Together, SambaNova, etc.) with automatic failover.
@@ -1178,7 +1216,7 @@ fallback_model:
When activated, the fallback swaps the model and provider mid-session without losing your conversation. It fires **at most once** per session.
Supported providers: `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `gemini`, `google-gemini-cli`, `qwen-oauth`, `huggingface`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `deepseek`, `nvidia`, `xai`, `ollama-cloud`, `bedrock`, `ai-gateway`, `opencode-zen`, `opencode-go`, `kilocode`, `xiaomi`, `arcee`, `gmi`, `alibaba`, `custom`.
Supported providers: `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `gemini`, `google-gemini-cli`, `qwen-oauth`, `huggingface`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `deepseek`, `nvidia`, `gmi`, `stepfun`, `xai`, `ollama-cloud`, `bedrock`, `ai-gateway`, `opencode-zen`, `opencode-go`, `kilocode`, `xiaomi`, `arcee`, `alibaba`, `custom`.
:::tip
Fallback is configured exclusively through `config.yaml` — there are no environment variables for it. For full details on when it triggers, supported providers, and how it interacts with auxiliary tasks and delegation, see [Fallback Providers](/docs/user-guide/features/fallback-providers).
@@ -65,6 +65,10 @@ All variables go in `~/.hermes/.env`. You can also set them with `hermes config
| `DEEPSEEK_BASE_URL` | Custom DeepSeek API base URL |
| `NVIDIA_API_KEY` | NVIDIA NIM API key — Nemotron and open models ([build.nvidia.com](https://build.nvidia.com)) |
| `NVIDIA_BASE_URL` | Override NVIDIA base URL (default: `https://integrate.api.nvidia.com/v1`; set to `http://localhost:8000/v1` for a local NIM endpoint) |
| `GMI_API_KEY` | GMI Cloud API key — open and reasoning models ([inference.gmi.ai](https://inference.gmi.ai)) |
| `GMI_BASE_URL` | Override GMI Cloud base URL (default: `https://api.gmi.ai/v1`) |
| `STEPFUN_API_KEY` | StepFun API key — Step-series models ([platform.stepfun.com](https://platform.stepfun.com)) |
| `STEPFUN_BASE_URL` | Override StepFun base URL (default: `https://api.stepfun.com/v1`) |
| `OLLAMA_API_KEY` | Ollama Cloud API key — managed Ollama catalog without local GPU ([ollama.com/settings/keys](https://ollama.com/settings/keys)) |
| `OLLAMA_BASE_URL` | Override Ollama Cloud base URL (default: `https://ollama.com/v1`) |
| `XAI_API_KEY` | xAI (Grok) API key for chat + TTS ([console.x.ai](https://console.x.ai/)) |
@@ -91,7 +95,7 @@ For native Anthropic auth, Hermes prefers Claude Code's own credential files whe
| Variable | Description |
|----------|-------------|
| `HERMES_INFERENCE_PROVIDER` | Override provider selection: `auto`, `custom`, `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `huggingface`, `gemini`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `kilocode`, `xiaomi`, `arcee`, `gmi`, `alibaba`, `deepseek`, `nvidia`, `ollama-cloud`, `xai` (alias `grok`), `google-gemini-cli`, `qwen-oauth`, `bedrock`, `opencode-zen`, `opencode-go`, `ai-gateway` (default: `auto`) |
| `HERMES_INFERENCE_PROVIDER` | Override provider selection: `auto`, `custom`, `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `huggingface`, `gemini`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `kilocode`, `xiaomi`, `arcee`, `gmi`, `stepfun`, `alibaba`, `deepseek`, `nvidia`, `ollama-cloud`, `xai` (alias `grok`), `google-gemini-cli`, `qwen-oauth`, `bedrock`, `opencode-zen`, `opencode-go`, `ai-gateway` (default: `auto`) |
| `HERMES_PORTAL_BASE_URL` | Override Nous Portal URL (for development/testing) |
| `NOUS_INFERENCE_BASE_URL` | Override Nous inference API URL |
| `HERMES_NOUS_MIN_KEY_TTL_SECONDS` | Min agent key TTL before re-mint (default: 1800 = 30min) |
@@ -48,6 +48,8 @@ Both `provider` and `model` are **required**. If either is missing, the fallback
| MiniMax (China) | `minimax-cn` | `MINIMAX_CN_API_KEY` |
| DeepSeek | `deepseek` | `DEEPSEEK_API_KEY` |
| NVIDIA NIM | `nvidia` | `NVIDIA_API_KEY` (optional: `NVIDIA_BASE_URL`) |
| GMI Cloud | `gmi` | `GMI_API_KEY` (optional: `GMI_BASE_URL`) |
| StepFun | `stepfun` | `STEPFUN_API_KEY` (optional: `STEPFUN_BASE_URL`) |
| Ollama Cloud | `ollama-cloud` | `OLLAMA_API_KEY` |
| Google Gemini (OAuth) | `google-gemini-cli` | `hermes model` (Google OAuth; optional: `HERMES_GEMINI_PROJECT_ID`) |
| Google AI Studio | `gemini` | `GOOGLE_API_KEY` (alias: `GEMINI_API_KEY`) |
+93
View File
@@ -248,6 +248,8 @@ def register(ctx):
| [`on_session_reset`](#on_session_reset) | Gateway swaps in a fresh session key (e.g. `/new`, `/reset`) | ignored |
| [`subagent_stop`](#subagent_stop) | A `delegate_task` child has exited | ignored |
| [`pre_gateway_dispatch`](#pre_gateway_dispatch) | Gateway received a user message, before auth + dispatch | `{"action": "skip" \| "rewrite" \| "allow", ...}` to influence flow |
| [`pre_approval_request`](#pre_approval_request) | Dangerous command needs user approval, before the prompt/notification is sent | ignored |
| [`post_approval_response`](#post_approval_response) | User responded to an approval prompt (or it timed out) | ignored |
---
@@ -775,6 +777,97 @@ def register(ctx):
---
### `pre_approval_request`
Fires **immediately before** an approval request is shown to the user — covers every surface: interactive CLI, the Ink TUI, gateway platforms (Telegram, Discord, Slack, WhatsApp, Matrix, etc.), and ACP clients (VS Code, Zed, JetBrains).
This is the right place to wire a custom notifier — for example, a macOS menu-bar app that pops an allow/deny notification, or an audit log that records every approval request with context.
**Callback signature:**
```python
def my_callback(
command: str,
description: str,
pattern_key: str,
pattern_keys: list[str],
session_key: str,
surface: str,
**kwargs,
):
```
| Parameter | Type | Description |
|-----------|------|-------------|
| `command` | `str` | The shell command awaiting approval |
| `description` | `str` | Human-readable reason(s) the command is flagged (combined when multiple patterns match) |
| `pattern_key` | `str` | Primary pattern key that triggered the approval (e.g. `"rm_rf"`, `"sudo"`) |
| `pattern_keys` | `list[str]` | All pattern keys that matched |
| `session_key` | `str` | Session identifier, useful for scoping notifications per-chat |
| `surface` | `str` | `"cli"` for interactive CLI/TUI prompts, `"gateway"` for async platform approvals |
**Return value:** ignored. Hooks here are observer-only; they cannot veto or pre-answer the approval. Use [`pre_tool_call`](#pre_tool_call) to block a tool before it reaches the approval system.
**Use cases:** Desktop notifications, push alerts, audit logging, Slack webhooks, escalation routing, metrics.
**Example — desktop notification on macOS:**
```python
import subprocess
def notify_approval(command, description, session_key, **kwargs):
title = "Hermes needs approval"
body = f"{description}: {command[:80]}"
subprocess.Popen([
"osascript", "-e",
f'display notification "{body}" with title "{title}"',
])
def register(ctx):
ctx.register_hook("pre_approval_request", notify_approval)
```
---
### `post_approval_response`
Fires **after** the user responds to an approval prompt (or the prompt times out).
**Callback signature:**
```python
def my_callback(
command: str,
description: str,
pattern_key: str,
pattern_keys: list[str],
session_key: str,
surface: str,
choice: str,
**kwargs,
):
```
Same kwargs as `pre_approval_request`, plus:
| Parameter | Type | Description |
|-----------|------|-------------|
| `choice` | `str` | One of `"once"`, `"session"`, `"always"`, `"deny"`, or `"timeout"` |
**Return value:** ignored.
**Use cases:** Close the matching desktop notification, record the final decision in an audit log, update metrics, roll forward a rate limiter.
```python
def log_decision(command, choice, session_key, **kwargs):
logger.info("approval %s: %s for session %s", choice, command[:60], session_key)
def register(ctx):
ctx.register_hook("post_approval_response", log_decision)
```
---
## Shell Hooks
Declare shell-script hooks in your `cli-config.yaml` and Hermes will run them as subprocesses whenever the corresponding plugin-hook event fires — in both CLI and gateway sessions. No Python plugin authoring required.