Compare commits

...

4 Commits

Author SHA1 Message Date
teknium1 a886fbf111 feat: generic webhook inbound platform adapter
Adds a lightweight HTTP server adapter that accepts POST /message
requests and routes them through the gateway as conversations. Each
unique chat_id gets its own session — supports multiple concurrent
agents/conversations with no race conditions.

The response is returned synchronously in the HTTP response body
(connection held until agent finishes, up to 300s timeout).

Enable with: WEBHOOK_PORT=4568

API:
  POST /message {chat_id, message, from?, user_id?}
  GET  /health

This is a generic adapter usable by any external system:
miniverse bridges, n8n/Zapier webhooks, CI/CD pipelines,
custom automations, other agent frameworks, etc.

Also adds docs/integrations/miniverse.md pointing to the
external hermes-miniverse bridge repo.
2026-03-13 04:44:14 -07:00
teknium1 d4b14f74c3 docs: add miniverse integration guide
Points to the external hermes-miniverse repo (teknium1/hermes-miniverse)
which provides the bridge, gateway hook, and skill for connecting
Hermes agents to Miniverse pixel worlds.

No code changes to hermes-agent — everything lives externally.
2026-03-13 04:43:38 -07:00
Teknium 6235fdde75 fix: raise session hygiene threshold from 50% to 85%
Session hygiene was firing at the same threshold (50%) as the agent's
own context compressor, causing premature compression on every turn
in long gateway sessions (especially Telegram).

Hygiene is a safety net for pathologically large sessions that would
cause API failures — it should NOT be doing normal compression work.
The agent's own compressor handles that during its tool loop with
accurate real token counts from the API.

Changes:
- Default hygiene threshold: 0.50 → 0.85 (fires only when truly large)
- Hygiene threshold is now independent of compression.threshold config
  (that setting controls the agent's compressor, not the pre-agent safety net)
- Removed env var override for hygiene threshold (CONTEXT_COMPRESSION_THRESHOLD
  still controls the agent's own compressor)
2026-03-13 04:17:45 -07:00
Teknium 8f8dd83443 fix: sync session_id after mid-run context compression
Critical bug: when the agent's context compressor fires during a tool
loop (_compress_context), it creates a new session_id and writes the
compressed messages there. But the gateway's session_entry still pointed
to the old session_id. On the next message, load_transcript() loaded
the stale pre-compression transcript, causing:

- Context bloat returning every turn
- Repeated compression cycles
- Loss of carefully compressed context

Fix: after run_conversation() returns, check if the agent's session_id
changed (compression split) and sync it back to the session store entry.
Also pass the effective session_id in the result dict so _handle_message
writes transcript entries to the correct session.

This affects ALL gateway adapters, not just webhook.
2026-03-13 04:14:35 -07:00
5 changed files with 304 additions and 14 deletions
+38
View File
@@ -0,0 +1,38 @@
# Miniverse Integration
Connect your Hermes agents to a [Miniverse](https://github.com/ianscott313/miniverse) pixel world where they can live, work, and communicate with other AI agents.
## Overview
[hermes-miniverse](https://github.com/teknium1/hermes-miniverse) is a standalone bridge that connects Hermes Agent to Miniverse — no changes to your Hermes installation required.
```
Hermes Agent ←→ hermes-miniverse bridge ←→ Miniverse Server
```
## Features
- **Automatic presence**: Your agent appears in the pixel world with live state (working, thinking, idle)
- **Inter-agent messaging**: Other agents can message your Hermes agent and receive responses
- **Conscious interaction**: Your agent can choose to speak, message others, and join channels
- **Multiple agents**: Run several Hermes instances as different agents in the same world
## Setup
See the [hermes-miniverse README](https://github.com/teknium1/hermes-miniverse) for installation and configuration instructions.
### Components
| Component | Where | Purpose |
|-----------|-------|---------|
| `bridge.py` | Standalone daemon | Heartbeats, webhook receiver, message injection |
| `hooks/miniverse/` | `~/.hermes/hooks/` | Gateway hook for state broadcasting |
| `skill/miniverse-world/` | `~/.hermes/skills/` | Teaches agents miniverse API commands |
## Architecture
The bridge is a standalone HTTP server that sits between Hermes and Miniverse:
1. **State out** (Hermes → Miniverse): Gateway hook fires on `agent:start/step/end` → POSTs to bridge → bridge sends miniverse heartbeats
2. **Messages in** (Miniverse → Hermes): Miniverse webhooks → bridge HTTP server → injects into Hermes via CLI
3. **Agent interaction** (via skill): Agent uses `terminal` tool with `curl` commands to speak, message, observe
+14
View File
@@ -29,6 +29,7 @@ class Platform(Enum):
SIGNAL = "signal"
HOMEASSISTANT = "homeassistant"
EMAIL = "email"
WEBHOOK = "webhook"
@dataclass
@@ -458,6 +459,19 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
name=os.getenv("EMAIL_HOME_ADDRESS_NAME", "Home"),
)
# Webhook (generic inbound HTTP adapter)
webhook_port = os.getenv("WEBHOOK_PORT")
if webhook_port:
if Platform.WEBHOOK not in config.platforms:
config.platforms[Platform.WEBHOOK] = PlatformConfig()
config.platforms[Platform.WEBHOOK].enabled = True
# Default to no auto-reset for webhook sessions — these are
# programmatic integrations (bridges, automations) where the
# caller manages lifecycle. Context is preserved until the
# caller explicitly creates a new session or chat_id.
if Platform.WEBHOOK not in config.reset_by_platform:
config.reset_by_platform[Platform.WEBHOOK] = SessionResetPolicy(mode="none")
# Session settings
idle_minutes = os.getenv("SESSION_IDLE_MINUTES")
if idle_minutes:
+205
View File
@@ -0,0 +1,205 @@
"""Generic webhook inbound platform adapter.
Runs a lightweight HTTP server that accepts POST /message requests and
routes them through the gateway as regular conversations. Each unique
``chat_id`` in the request gets its own session — supporting multiple
concurrent agents/conversations.
The response is returned synchronously in the HTTP response body (the
connection is held open until the agent finishes). This makes it
trivially easy for external bridges, automation tools, or other agent
frameworks to integrate with Hermes.
Enable via env var:
WEBHOOK_PORT=4568 (any port number enables the adapter)
API:
POST /message
{
"chat_id": "hermes-1", // required — maps to a session
"message": "Hello!", // required — the message text
"from": "other-agent", // optional — sender display name
"user_id": "agent-123" // optional — sender ID
}
Response (200):
{
"ok": true,
"response": "Hi there!",
"session_id": "20260312_..."
}
GET /health
{"ok": true, "adapter": "webhook", "port": 4568}
"""
import asyncio
import json
import logging
import os
import time
from aiohttp import web
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
SendResult,
SessionSource,
)
logger = logging.getLogger(__name__)
def check_webhook_requirements() -> bool:
"""Webhook adapter is available when WEBHOOK_PORT is set."""
return bool(os.getenv("WEBHOOK_PORT"))
class WebhookAdapter(BasePlatformAdapter):
"""HTTP webhook adapter — accepts POST requests as inbound messages.
External services (bridges, automation tools, other agents) POST
messages and receive the agent's response in the HTTP response body.
Each chat_id maps to a separate gateway session.
"""
def __init__(self, config: PlatformConfig):
super().__init__(config, Platform.WEBHOOK)
self.port = int(os.getenv("WEBHOOK_PORT", "4568"))
self._app: web.Application = None
self._runner: web.AppRunner = None
self._site: web.TCPSite = None
# Pending response futures keyed by session_key
self._response_futures: dict[str, asyncio.Future] = {}
async def connect(self) -> bool:
"""Start the HTTP server."""
self._app = web.Application()
self._app.router.add_post("/message", self._handle_post)
self._app.router.add_get("/health", self._handle_health)
self._runner = web.AppRunner(self._app, access_log=None)
await self._runner.setup()
try:
self._site = web.TCPSite(self._runner, "0.0.0.0", self.port)
await self._site.start()
except OSError as e:
logger.error("Webhook adapter failed to bind port %d: %s", self.port, e)
return False
print(f"[webhook] Listening on port {self.port}")
print(f"[webhook] POST http://localhost:{self.port}/message")
return True
async def disconnect(self) -> None:
if self._site:
await self._site.stop()
if self._runner:
await self._runner.cleanup()
async def send(self, chat_id: str, content: str,
reply_to: str = None, metadata: dict = None) -> SendResult:
"""Deliver response by resolving the waiting HTTP request's future."""
from gateway.session import build_session_key
# Look up the pending future for this chat_id.
# We try the chat_id directly, then the full session key.
fut = self._response_futures.get(chat_id)
if fut is None:
# Try building the session key the same way handle_message does
source = self.build_source(chat_id=chat_id, chat_type="dm")
sk = build_session_key(source)
fut = self._response_futures.get(sk)
if fut and not fut.done():
fut.set_result(content)
return SendResult(success=True, message_id=str(int(time.time())))
async def send_typing(self, chat_id: str, metadata: dict = None) -> None:
pass # No typing indicator for webhooks
async def get_chat_info(self, chat_id: str) -> dict:
return {"id": chat_id, "name": f"webhook:{chat_id}", "type": "dm"}
# ── HTTP Handlers ────────────────────────────────────────────────────
async def _handle_post(self, request: web.Request) -> web.Response:
"""Accept an inbound message and return the agent's response."""
try:
data = await request.json()
except (json.JSONDecodeError, Exception):
return web.json_response(
{"ok": False, "error": "Invalid JSON"}, status=400
)
chat_id = data.get("chat_id", "").strip()
message = data.get("message", "").strip()
if not chat_id or not message:
return web.json_response(
{"ok": False, "error": "Missing required fields: chat_id, message"},
status=400,
)
from_name = data.get("from", "webhook")
user_id = data.get("user_id", from_name)
# Prepend sender info if provided
display_message = message
if from_name and from_name != "webhook":
display_message = f"[Message from {from_name}]: {message}"
# Build source and event
source = self.build_source(
chat_id=chat_id,
chat_type="dm",
user_id=user_id,
user_name=from_name,
)
from gateway.session import build_session_key
session_key = build_session_key(source)
event = MessageEvent(
text=display_message,
source=source,
message_id=str(int(time.time() * 1000)),
)
# Create a future to capture the response from send()
loop = asyncio.get_event_loop()
response_future = loop.create_future()
self._response_futures[session_key] = response_future
# Also store under chat_id for easier lookup
self._response_futures[chat_id] = response_future
# Submit the message for processing
await self.handle_message(event)
# Wait for the agent to finish and send() to resolve the future
try:
response_text = await asyncio.wait_for(response_future, timeout=300)
return web.json_response({
"ok": True,
"response": response_text,
"chat_id": chat_id,
})
except asyncio.TimeoutError:
return web.json_response(
{"ok": False, "error": "Agent timed out (300s)", "chat_id": chat_id},
status=504,
)
finally:
self._response_futures.pop(session_key, None)
self._response_futures.pop(chat_id, None)
async def _handle_health(self, request: web.Request) -> web.Response:
return web.json_response({
"ok": True,
"adapter": "webhook",
"port": self.port,
})
+45 -12
View File
@@ -793,6 +793,13 @@ class GatewayRunner:
return None
return EmailAdapter(config)
elif platform == Platform.WEBHOOK:
from gateway.platforms.webhook import WebhookAdapter, check_webhook_requirements
if not check_webhook_requirements():
logger.warning("Webhook: WEBHOOK_PORT not set")
return None
return WebhookAdapter(config)
return None
def _is_user_authorized(self, source: SessionSource) -> bool:
@@ -809,7 +816,8 @@ class GatewayRunner:
# Home Assistant events are system-generated (state changes), not
# user-initiated messages. The HASS_TOKEN already authenticates the
# connection, so HA events are always authorized.
if source.platform == Platform.HOMEASSISTANT:
# Webhook messages are local HTTP requests — authorized by default.
if source.platform in (Platform.HOMEASSISTANT, Platform.WEBHOOK):
return True
user_id = source.user_id
@@ -1125,10 +1133,16 @@ class GatewayRunner:
get_model_context_length,
)
# Read model + compression config from config.yaml — same
# source of truth the agent itself uses.
# Read model + compression config from config.yaml.
# NOTE: hygiene threshold is intentionally HIGHER than the agent's
# own compressor (0.85 vs 0.50). Hygiene is a safety net for
# sessions that grew too large between turns — it fires pre-agent
# to prevent API failures. The agent's own compressor handles
# normal context management during its tool loop with accurate
# real token counts. Having hygiene at 0.50 caused premature
# compression on every turn in long gateway sessions.
_hyg_model = "anthropic/claude-sonnet-4.6"
_hyg_threshold_pct = 0.50
_hyg_threshold_pct = 0.85
_hyg_compression_enabled = True
try:
_hyg_cfg_path = _hermes_home / "config.yaml"
@@ -1144,22 +1158,18 @@ class GatewayRunner:
elif isinstance(_model_cfg, dict):
_hyg_model = _model_cfg.get("default", _hyg_model)
# Read compression settings
# Read compression settings — only use enabled flag.
# The threshold is intentionally separate from the agent's
# compression.threshold (hygiene runs higher).
_comp_cfg = _hyg_data.get("compression", {})
if isinstance(_comp_cfg, dict):
_hyg_threshold_pct = float(
_comp_cfg.get("threshold", _hyg_threshold_pct)
)
_hyg_compression_enabled = str(
_comp_cfg.get("enabled", True)
).lower() in ("true", "1", "yes")
except Exception:
pass
# Also check env overrides (same as run_agent.py)
_hyg_threshold_pct = float(
os.getenv("CONTEXT_COMPRESSION_THRESHOLD", str(_hyg_threshold_pct))
)
# Check env override for disabling compression entirely
if os.getenv("CONTEXT_COMPRESSION_ENABLED", "").lower() in ("false", "0", "no"):
_hyg_compression_enabled = False
@@ -1446,6 +1456,11 @@ class GatewayRunner:
response = agent_result.get("final_response", "")
agent_messages = agent_result.get("messages", [])
# If the agent's session_id changed during compression, update
# session_entry so transcript writes below go to the right session.
if agent_result.get("session_id") and agent_result["session_id"] != session_entry.session_id:
session_entry.session_id = agent_result["session_id"]
# Prepend reasoning/thinking if display is enabled
if getattr(self, "_show_reasoning", False) and response:
last_reasoning = agent_result.get("last_reasoning")
@@ -3495,6 +3510,23 @@ class GatewayRunner:
unique_tags.insert(0, "[[audio_as_voice]]")
final_response = final_response + "\n" + "\n".join(unique_tags)
# Sync session_id: the agent may have created a new session during
# mid-run context compression (_compress_context splits sessions).
# If so, update the session store entry so the NEXT message loads
# the compressed transcript, not the stale pre-compression one.
agent = agent_holder[0]
if agent and session_key and hasattr(agent, 'session_id') and agent.session_id != session_id:
logger.info(
"Session split detected: %s%s (compression)",
session_id, agent.session_id,
)
entry = self.session_store._entries.get(session_key)
if entry:
entry.session_id = agent.session_id
self.session_store._save()
effective_session_id = getattr(agent, 'session_id', session_id) if agent else session_id
return {
"final_response": final_response,
"last_reasoning": result.get("last_reasoning"),
@@ -3503,6 +3535,7 @@ class GatewayRunner:
"tools": tools_holder[0] or [],
"history_offset": len(agent_history),
"last_prompt_tokens": _last_prompt_toks,
"session_id": effective_session_id,
}
# Start progress message sender if enabled
+2 -2
View File
@@ -301,7 +301,7 @@ def build_session_key(source: SessionSource) -> str:
This is the single source of truth for session key construction.
DM rules:
- WhatsApp DMs include chat_id (multi-user support).
- WhatsApp and Webhook DMs include chat_id (multi-session support).
- Other DMs include thread_id when present (e.g. Slack threaded DMs),
so each DM thread gets its own session while top-level DMs share one.
- Without thread_id or chat_id, all DMs share a single session.
@@ -314,7 +314,7 @@ def build_session_key(source: SessionSource) -> str:
if source.chat_type == "dm":
if source.thread_id:
return f"agent:main:{platform}:dm:{source.thread_id}"
if platform == "whatsapp" and source.chat_id:
if platform in ("whatsapp", "webhook") and source.chat_id:
return f"agent:main:{platform}:dm:{source.chat_id}"
return f"agent:main:{platform}:dm"
if source.thread_id: