feat: generic webhook inbound platform adapter

Adds a lightweight HTTP server adapter that accepts POST /message requests and routes them through the gateway as conversations. Each unique chat_id gets its own session — supports multiple concurrent agents/conversations with no race conditions. The response is returned synchronously in the HTTP response body (connection held until agent finishes, up to 300s timeout). Enable with: WEBHOOK_PORT=4568 API: POST /message {chat_id, message, from?, user_id?} GET /health This is a generic adapter usable by any external system: miniverse bridges, n8n/Zapier webhooks, CI/CD pipelines, custom automations, other agent frameworks, etc. Also adds docs/integrations/miniverse.md pointing to the external hermes-miniverse bridge repo.
docs: add miniverse integration guide
2026-03-13 04:44:14 -07:00 · 2026-03-13 04:43:38 -07:00 · 2026-03-13 04:17:45 -07:00 · 2026-03-13 04:14:35 -07:00
5 changed files with 304 additions and 14 deletions
@@ -0,0 +1,38 @@
+# Miniverse Integration
+
+Connect your Hermes agents to a [Miniverse](https://github.com/ianscott313/miniverse) pixel world where they can live, work, and communicate with other AI agents.
+
+## Overview
+
+[hermes-miniverse](https://github.com/teknium1/hermes-miniverse) is a standalone bridge that connects Hermes Agent to Miniverse — no changes to your Hermes installation required.
+
+```
+Hermes Agent ←→ hermes-miniverse bridge ←→ Miniverse Server
+```
+
+## Features
+
+- **Automatic presence**: Your agent appears in the pixel world with live state (working, thinking, idle)
+- **Inter-agent messaging**: Other agents can message your Hermes agent and receive responses
+- **Conscious interaction**: Your agent can choose to speak, message others, and join channels
+- **Multiple agents**: Run several Hermes instances as different agents in the same world
+
+## Setup
+
+See the [hermes-miniverse README](https://github.com/teknium1/hermes-miniverse) for installation and configuration instructions.
+
+### Components
+
+| Component | Where | Purpose |
+|-----------|-------|---------|
+| `bridge.py` | Standalone daemon | Heartbeats, webhook receiver, message injection |
+| `hooks/miniverse/` | `~/.hermes/hooks/` | Gateway hook for state broadcasting |
+| `skill/miniverse-world/` | `~/.hermes/skills/` | Teaches agents miniverse API commands |
+
+## Architecture
+
+The bridge is a standalone HTTP server that sits between Hermes and Miniverse:
+
+1. **State out** (Hermes → Miniverse): Gateway hook fires on `agent:start/step/end` → POSTs to bridge → bridge sends miniverse heartbeats
+2. **Messages in** (Miniverse → Hermes): Miniverse webhooks → bridge HTTP server → injects into Hermes via CLI
+3. **Agent interaction** (via skill): Agent uses `terminal` tool with `curl` commands to speak, message, observe
@@ -29,6 +29,7 @@ class Platform(Enum):
    SIGNAL = "signal"
    HOMEASSISTANT = "homeassistant"
    EMAIL = "email"
+    WEBHOOK = "webhook"


@dataclass
@@ -458,6 +459,19 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
                name=os.getenv("EMAIL_HOME_ADDRESS_NAME", "Home"),
            )

+    # Webhook (generic inbound HTTP adapter)
+    webhook_port = os.getenv("WEBHOOK_PORT")
+    if webhook_port:
+        if Platform.WEBHOOK not in config.platforms:
+            config.platforms[Platform.WEBHOOK] = PlatformConfig()
+        config.platforms[Platform.WEBHOOK].enabled = True
+        # Default to no auto-reset for webhook sessions — these are
+        # programmatic integrations (bridges, automations) where the
+        # caller manages lifecycle.  Context is preserved until the
+        # caller explicitly creates a new session or chat_id.
+        if Platform.WEBHOOK not in config.reset_by_platform:
+            config.reset_by_platform[Platform.WEBHOOK] = SessionResetPolicy(mode="none")
+
    # Session settings
    idle_minutes = os.getenv("SESSION_IDLE_MINUTES")
    if idle_minutes:
@@ -0,0 +1,205 @@
+"""Generic webhook inbound platform adapter.
+
+Runs a lightweight HTTP server that accepts POST /message requests and
+routes them through the gateway as regular conversations.  Each unique
+``chat_id`` in the request gets its own session — supporting multiple
+concurrent agents/conversations.
+
+The response is returned synchronously in the HTTP response body (the
+connection is held open until the agent finishes).  This makes it
+trivially easy for external bridges, automation tools, or other agent
+frameworks to integrate with Hermes.
+
+Enable via env var:
+    WEBHOOK_PORT=4568  (any port number enables the adapter)
+
+API:
+    POST /message
+    {
+        "chat_id": "hermes-1",        // required — maps to a session
+        "message": "Hello!",          // required — the message text
+        "from":    "other-agent",     // optional — sender display name
+        "user_id": "agent-123"        // optional — sender ID
+    }
+
+    Response (200):
+    {
+        "ok": true,
+        "response": "Hi there!",
+        "session_id": "20260312_..."
+    }
+
+    GET /health
+    {"ok": true, "adapter": "webhook", "port": 4568}
+"""
+
+import asyncio
+import json
+import logging
+import os
+import time
+
+from aiohttp import web
+
+from gateway.config import Platform, PlatformConfig
+from gateway.platforms.base import (
+    BasePlatformAdapter,
+    MessageEvent,
+    SendResult,
+    SessionSource,
+)
+
+logger = logging.getLogger(__name__)
+
+
+def check_webhook_requirements() -> bool:
+    """Webhook adapter is available when WEBHOOK_PORT is set."""
+    return bool(os.getenv("WEBHOOK_PORT"))
+
+
+class WebhookAdapter(BasePlatformAdapter):
+    """HTTP webhook adapter — accepts POST requests as inbound messages.
+
+    External services (bridges, automation tools, other agents) POST
+    messages and receive the agent's response in the HTTP response body.
+    Each chat_id maps to a separate gateway session.
+    """
+
+    def __init__(self, config: PlatformConfig):
+        super().__init__(config, Platform.WEBHOOK)
+        self.port = int(os.getenv("WEBHOOK_PORT", "4568"))
+        self._app: web.Application = None
+        self._runner: web.AppRunner = None
+        self._site: web.TCPSite = None
+        # Pending response futures keyed by session_key
+        self._response_futures: dict[str, asyncio.Future] = {}
+
+    async def connect(self) -> bool:
+        """Start the HTTP server."""
+        self._app = web.Application()
+        self._app.router.add_post("/message", self._handle_post)
+        self._app.router.add_get("/health", self._handle_health)
+
+        self._runner = web.AppRunner(self._app, access_log=None)
+        await self._runner.setup()
+
+        try:
+            self._site = web.TCPSite(self._runner, "0.0.0.0", self.port)
+            await self._site.start()
+        except OSError as e:
+            logger.error("Webhook adapter failed to bind port %d: %s", self.port, e)
+            return False
+
+        print(f"[webhook] Listening on port {self.port}")
+        print(f"[webhook]   POST http://localhost:{self.port}/message")
+        return True
+
+    async def disconnect(self) -> None:
+        if self._site:
+            await self._site.stop()
+        if self._runner:
+            await self._runner.cleanup()
+
+    async def send(self, chat_id: str, content: str,
+                   reply_to: str = None, metadata: dict = None) -> SendResult:
+        """Deliver response by resolving the waiting HTTP request's future."""
+        from gateway.session import build_session_key
+
+        # Look up the pending future for this chat_id.
+        # We try the chat_id directly, then the full session key.
+        fut = self._response_futures.get(chat_id)
+        if fut is None:
+            # Try building the session key the same way handle_message does
+            source = self.build_source(chat_id=chat_id, chat_type="dm")
+            sk = build_session_key(source)
+            fut = self._response_futures.get(sk)
+
+        if fut and not fut.done():
+            fut.set_result(content)
+
+        return SendResult(success=True, message_id=str(int(time.time())))
+
+    async def send_typing(self, chat_id: str, metadata: dict = None) -> None:
+        pass  # No typing indicator for webhooks
+
+    async def get_chat_info(self, chat_id: str) -> dict:
+        return {"id": chat_id, "name": f"webhook:{chat_id}", "type": "dm"}
+
+    # ── HTTP Handlers ────────────────────────────────────────────────────
+
+    async def _handle_post(self, request: web.Request) -> web.Response:
+        """Accept an inbound message and return the agent's response."""
+        try:
+            data = await request.json()
+        except (json.JSONDecodeError, Exception):
+            return web.json_response(
+                {"ok": False, "error": "Invalid JSON"}, status=400
+            )
+
+        chat_id = data.get("chat_id", "").strip()
+        message = data.get("message", "").strip()
+
+        if not chat_id or not message:
+            return web.json_response(
+                {"ok": False, "error": "Missing required fields: chat_id, message"},
+                status=400,
+            )
+
+        from_name = data.get("from", "webhook")
+        user_id = data.get("user_id", from_name)
+
+        # Prepend sender info if provided
+        display_message = message
+        if from_name and from_name != "webhook":
+            display_message = f"[Message from {from_name}]: {message}"
+
+        # Build source and event
+        source = self.build_source(
+            chat_id=chat_id,
+            chat_type="dm",
+            user_id=user_id,
+            user_name=from_name,
+        )
+
+        from gateway.session import build_session_key
+        session_key = build_session_key(source)
+
+        event = MessageEvent(
+            text=display_message,
+            source=source,
+            message_id=str(int(time.time() * 1000)),
+        )
+
+        # Create a future to capture the response from send()
+        loop = asyncio.get_event_loop()
+        response_future = loop.create_future()
+        self._response_futures[session_key] = response_future
+        # Also store under chat_id for easier lookup
+        self._response_futures[chat_id] = response_future
+
+        # Submit the message for processing
+        await self.handle_message(event)
+
+        # Wait for the agent to finish and send() to resolve the future
+        try:
+            response_text = await asyncio.wait_for(response_future, timeout=300)
+            return web.json_response({
+                "ok": True,
+                "response": response_text,
+                "chat_id": chat_id,
+            })
+        except asyncio.TimeoutError:
+            return web.json_response(
+                {"ok": False, "error": "Agent timed out (300s)", "chat_id": chat_id},
+                status=504,
+            )
+        finally:
+            self._response_futures.pop(session_key, None)
+            self._response_futures.pop(chat_id, None)
+
+    async def _handle_health(self, request: web.Request) -> web.Response:
+        return web.json_response({
+            "ok": True,
+            "adapter": "webhook",
+            "port": self.port,
+        })
@@ -793,6 +793,13 @@ class GatewayRunner:
                return None
            return EmailAdapter(config)

+        elif platform == Platform.WEBHOOK:
+            from gateway.platforms.webhook import WebhookAdapter, check_webhook_requirements
+            if not check_webhook_requirements():
+                logger.warning("Webhook: WEBHOOK_PORT not set")
+                return None
+            return WebhookAdapter(config)
+
        return None
    
    def _is_user_authorized(self, source: SessionSource) -> bool:
@@ -809,7 +816,8 @@ class GatewayRunner:
        # Home Assistant events are system-generated (state changes), not
        # user-initiated messages.  The HASS_TOKEN already authenticates the
        # connection, so HA events are always authorized.
-        if source.platform == Platform.HOMEASSISTANT:
+        # Webhook messages are local HTTP requests — authorized by default.
+        if source.platform in (Platform.HOMEASSISTANT, Platform.WEBHOOK):
            return True

        user_id = source.user_id
@@ -1125,10 +1133,16 @@ class GatewayRunner:
                get_model_context_length,
            )

-            # Read model + compression config from config.yaml — same
-            # source of truth the agent itself uses.
+            # Read model + compression config from config.yaml.
+            # NOTE: hygiene threshold is intentionally HIGHER than the agent's
+            # own compressor (0.85 vs 0.50).  Hygiene is a safety net for
+            # sessions that grew too large between turns — it fires pre-agent
+            # to prevent API failures.  The agent's own compressor handles
+            # normal context management during its tool loop with accurate
+            # real token counts.  Having hygiene at 0.50 caused premature
+            # compression on every turn in long gateway sessions.
            _hyg_model = "anthropic/claude-sonnet-4.6"
-            _hyg_threshold_pct = 0.50
+            _hyg_threshold_pct = 0.85
            _hyg_compression_enabled = True
            try:
                _hyg_cfg_path = _hermes_home / "config.yaml"
@@ -1144,22 +1158,18 @@ class GatewayRunner:
                    elif isinstance(_model_cfg, dict):
                        _hyg_model = _model_cfg.get("default", _hyg_model)

-                    # Read compression settings
+                    # Read compression settings — only use enabled flag.
+                    # The threshold is intentionally separate from the agent's
+                    # compression.threshold (hygiene runs higher).
                    _comp_cfg = _hyg_data.get("compression", {})
                    if isinstance(_comp_cfg, dict):
-                        _hyg_threshold_pct = float(
-                            _comp_cfg.get("threshold", _hyg_threshold_pct)
-                        )
                        _hyg_compression_enabled = str(
                            _comp_cfg.get("enabled", True)
                        ).lower() in ("true", "1", "yes")
            except Exception:
                pass

-            # Also check env overrides (same as run_agent.py)
-            _hyg_threshold_pct = float(
-                os.getenv("CONTEXT_COMPRESSION_THRESHOLD", str(_hyg_threshold_pct))
-            )
+            # Check env override for disabling compression entirely
            if os.getenv("CONTEXT_COMPRESSION_ENABLED", "").lower() in ("false", "0", "no"):
                _hyg_compression_enabled = False

@@ -1446,6 +1456,11 @@ class GatewayRunner:
            response = agent_result.get("final_response", "")
            agent_messages = agent_result.get("messages", [])

+            # If the agent's session_id changed during compression, update
+            # session_entry so transcript writes below go to the right session.
+            if agent_result.get("session_id") and agent_result["session_id"] != session_entry.session_id:
+                session_entry.session_id = agent_result["session_id"]
+
            # Prepend reasoning/thinking if display is enabled
            if getattr(self, "_show_reasoning", False) and response:
                last_reasoning = agent_result.get("last_reasoning")
@@ -3495,6 +3510,23 @@ class GatewayRunner:
                        unique_tags.insert(0, "[[audio_as_voice]]")
                    final_response = final_response + "\n" + "\n".join(unique_tags)
            
+            # Sync session_id: the agent may have created a new session during
+            # mid-run context compression (_compress_context splits sessions).
+            # If so, update the session store entry so the NEXT message loads
+            # the compressed transcript, not the stale pre-compression one.
+            agent = agent_holder[0]
+            if agent and session_key and hasattr(agent, 'session_id') and agent.session_id != session_id:
+                logger.info(
+                    "Session split detected: %s → %s (compression)",
+                    session_id, agent.session_id,
+                )
+                entry = self.session_store._entries.get(session_key)
+                if entry:
+                    entry.session_id = agent.session_id
+                    self.session_store._save()
+
+            effective_session_id = getattr(agent, 'session_id', session_id) if agent else session_id
+
            return {
                "final_response": final_response,
                "last_reasoning": result.get("last_reasoning"),
@@ -3503,6 +3535,7 @@ class GatewayRunner:
                "tools": tools_holder[0] or [],
                "history_offset": len(agent_history),
                "last_prompt_tokens": _last_prompt_toks,
+                "session_id": effective_session_id,
            }
        
        # Start progress message sender if enabled
@@ -301,7 +301,7 @@ def build_session_key(source: SessionSource) -> str:
    This is the single source of truth for session key construction.

    DM rules:
-      - WhatsApp DMs include chat_id (multi-user support).
+      - WhatsApp and Webhook DMs include chat_id (multi-session support).
      - Other DMs include thread_id when present (e.g. Slack threaded DMs),
        so each DM thread gets its own session while top-level DMs share one.
      - Without thread_id or chat_id, all DMs share a single session.
@@ -314,7 +314,7 @@ def build_session_key(source: SessionSource) -> str:
    if source.chat_type == "dm":
        if source.thread_id:
            return f"agent:main:{platform}:dm:{source.thread_id}"
-        if platform == "whatsapp" and source.chat_id:
+        if platform in ("whatsapp", "webhook") and source.chat_id:
            return f"agent:main:{platform}:dm:{source.chat_id}"
        return f"agent:main:{platform}:dm"
    if source.thread_id:
Author	SHA1	Message	Date
teknium1	a886fbf111	feat: generic webhook inbound platform adapter Adds a lightweight HTTP server adapter that accepts POST /message requests and routes them through the gateway as conversations. Each unique chat_id gets its own session — supports multiple concurrent agents/conversations with no race conditions. The response is returned synchronously in the HTTP response body (connection held until agent finishes, up to 300s timeout). Enable with: WEBHOOK_PORT=4568 API: POST /message {chat_id, message, from?, user_id?} GET /health This is a generic adapter usable by any external system: miniverse bridges, n8n/Zapier webhooks, CI/CD pipelines, custom automations, other agent frameworks, etc. Also adds docs/integrations/miniverse.md pointing to the external hermes-miniverse bridge repo.	2026-03-13 04:44:14 -07:00
teknium1	d4b14f74c3	docs: add miniverse integration guide Points to the external hermes-miniverse repo (teknium1/hermes-miniverse) which provides the bridge, gateway hook, and skill for connecting Hermes agents to Miniverse pixel worlds. No code changes to hermes-agent — everything lives externally.	2026-03-13 04:43:38 -07:00
Teknium	6235fdde75	fix: raise session hygiene threshold from 50% to 85% Session hygiene was firing at the same threshold (50%) as the agent's own context compressor, causing premature compression on every turn in long gateway sessions (especially Telegram). Hygiene is a safety net for pathologically large sessions that would cause API failures — it should NOT be doing normal compression work. The agent's own compressor handles that during its tool loop with accurate real token counts from the API. Changes: - Default hygiene threshold: 0.50 → 0.85 (fires only when truly large) - Hygiene threshold is now independent of compression.threshold config (that setting controls the agent's compressor, not the pre-agent safety net) - Removed env var override for hygiene threshold (CONTEXT_COMPRESSION_THRESHOLD still controls the agent's own compressor)	2026-03-13 04:17:45 -07:00
Teknium	8f8dd83443	fix: sync session_id after mid-run context compression Critical bug: when the agent's context compressor fires during a tool loop (_compress_context), it creates a new session_id and writes the compressed messages there. But the gateway's session_entry still pointed to the old session_id. On the next message, load_transcript() loaded the stale pre-compression transcript, causing: - Context bloat returning every turn - Repeated compression cycles - Loss of carefully compressed context Fix: after run_conversation() returns, check if the agent's session_id changed (compression split) and sync it back to the session store entry. Also pass the effective session_id in the result dict so _handle_message writes transcript entries to the correct session. This affects ALL gateway adapters, not just webhook.	2026-03-13 04:14:35 -07:00