Modularize frontend

logging work
changes
2025-10-13 11:53:13 -04:00 · 2025-10-13 10:44:42 -04:00 · 2025-10-11 17:52:23 -04:00 · 2025-10-10 18:04:22 -04:00 · 2025-09-10 00:51:41 -07:00 · 2025-09-10 00:50:56 -07:00
30 changed files with 6060 additions and 831 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -16,3 +16,4 @@ __pycache__/
 export*
 __pycache__/model_tools.cpython-310.pyc
 __pycache__/web_tools.cpython-310.pyc
+logs/
--- a/README.md
+++ b/README.md
@@ -1,17 +1,295 @@
-## Setup
-```
+# Hermes Agent
+
+AI Agent with advanced tool calling capabilities, real-time logging, and extensible toolsets.
+
+## Features
+
+- 🤖 **Multi-model Support**: Works with Claude, GPT-4, and other OpenAI-compatible models
+- 🔧 **Rich Tool Library**: Web search, content extraction, vision analysis, terminal execution, and more
+- 📊 **Real-time Logging**: WebSocket-based logging system for monitoring agent execution
+- 🖥️ **Desktop UI**: Modern PySide6 frontend with real-time event streaming
+- 🎯 **Flexible Toolsets**: Predefined toolset combinations for different use cases
+- 💾 **Trajectory Saving**: Save conversation flows for training and analysis
+- 🔄 **Auto-retry**: Built-in error handling and retry logic
+
+## Quick Start
+
+### Installation
+
+```bash
 pip install -r requirements.txt
-git clone git@github.com:NousResearch/hecate.git
-cd hecate
-pip install -e .
 ```

-## Run
-```
+### Basic Usage
+
+```bash
 python run_agent.py \
-  --query "search up the latest docs on jit in python 3.13 and write me basic example that's not in their docs. profile its perf" \
-  --max_turns 20 \
-  --model claude-sonnet-4-20250514 \
-  --base_url https://api.anthropic.com/v1/ \
-  --api_key $ANTHROPIC_API_KEY
+  --enabled_toolsets web \
+  --query "Search for the latest AI news"
 ```
+
+### With Real-time Logging
+
+```bash
+# Terminal 1: Start API endpoint server
+python api_endpoint/logging_server.py
+
+# Terminal 2: Run agent
+python run_agent.py \
+  --enabled_toolsets web \
+  --enable_websocket_logging \
+  --query "Your question here"
+```
+
+### With Desktop UI (Recommended)
+
+The easiest way to use Hermes Agent is through the desktop UI:
+
+```bash
+# One-command launch (starts server + UI)
+cd ui && ./start_hermes_ui.sh
+
+# Or manually:
+# Terminal 1: Start server
+python api_endpoint/logging_server.py
+
+# Terminal 2: Start UI
+python ui/hermes_ui.py
+```
+
+The UI provides:
+- 🖱️ Point-and-click query submission
+- 🎛️ Easy model and tool selection
+- 📊 Real-time event visualization
+- 🔄 Automatic WebSocket connection
+- 📝 Session history
+
+## Project Structure
+
+```
+Hermes-Agent/
+├── run_agent.py              # Main agent runner
+├── model_tools.py            # Tool definitions and handling
+├── toolsets.py               # Predefined toolset combinations
+├── requirements.txt          # Python dependencies
+│
+├── ui/                      # Desktop UI ⭐ NEW
+│   ├── hermes_ui.py         # PySide6 desktop application
+│   ├── start_hermes_ui.sh   # UI launcher script
+│   └── test_ui_flow.py      # UI integration tests
+│
+├── tools/                    # Tool implementations
+│   ├── web_tools.py         # Web search, extract, crawl
+│   ├── vision_tools.py      # Image analysis
+│   ├── terminal_tool.py     # Command execution
+│   ├── image_generation_tool.py
+│   └── ...
+│
+├── api_endpoint/            # FastAPI + WebSocket logging endpoint
+│   ├── logging_server.py    # WebSocket server + Agent API ⭐ ENHANCED
+│   ├── websocket_logger.py  # Client library
+│   ├── README.md           # API endpoint docs
+│   └── ...
+│
+├── logs/                    # Log files
+│   └── realtime/           # WebSocket session logs
+│
+└── tests/                   # Test files
+```
+
+## Available Toolsets
+
+### Basic Toolsets
+- **web**: Web search, extract, and crawl
+- **terminal**: Command execution
+- **vision**: Image analysis
+- **creative**: Image generation
+- **reasoning**: Mixture of agents
+
+### Composite Toolsets
+- **research**: Web + vision tools
+- **development**: Web + terminal + vision
+- **analysis**: Web + vision + reasoning
+- **full_stack**: All tools enabled
+
+### Usage Examples
+
+```bash
+# Research with web and vision
+python run_agent.py --enabled_toolsets research --query "..."
+
+# Development with terminal access
+python run_agent.py --enabled_toolsets development --query "..."
+
+# Combine multiple toolsets
+python run_agent.py --enabled_toolsets web,vision --query "..."
+```
+
+## Real-time Logging System
+
+Monitor your agent's execution in real-time with the FastAPI WebSocket endpoint using a **persistent connection pool** architecture.
+
+### Architecture
+
+The logging system uses a **singleton WebSocket connection** that persists across multiple agent runs:
+- ✅ **No timeouts** - connection stays alive indefinitely
+- ✅ **No reconnection overhead** - connect once, reuse forever
+- ✅ **Parallel execution** - multiple agents share one connection
+- ✅ **Production-ready** - graceful shutdown with signal handlers
+
+See [`api_endpoint/PERSISTENT_CONNECTION_GUIDE.md`](api_endpoint/PERSISTENT_CONNECTION_GUIDE.md) for technical details.
+
+### Features
+- Track all API calls and responses
+- **Persistent connection** - one WebSocket for all sessions
+- Monitor tool executions with parameters and timing
+- Capture errors and completion status
+- REST API for querying sessions
+- Real-time WebSocket broadcasting
+
+### Documentation
+See [`api_endpoint/README.md`](api_endpoint/README.md) for complete documentation.
+
+### Quick Start
+```bash
+# Start API endpoint server
+python api_endpoint/logging_server.py
+
+# Run agent with logging
+python run_agent.py --enable_websocket_logging --query "..."
+
+# View logs
+curl http://localhost:8000/sessions
+```
+
+## Configuration
+
+### Environment Variables
+
+Create a `.env` file in the project root:
+
+```bash
+# API Keys
+ANTHROPIC_API_KEY=your_key_here
+FIRECRAWL_API_KEY=your_key_here
+NOUS_API_KEY=your_key_here
+FAL_KEY=your_key_here
+
+# Optional
+WEB_TOOLS_DEBUG=true  # Enable web tools debug logging
+```
+
+### Command-Line Options
+
+```bash
+python run_agent.py --help
+```
+
+Key options:
+- `--query`: Your question/task
+- `--model`: Model to use (default: claude-sonnet-4-5-20250929)
+- `--enabled_toolsets`: Toolsets to enable
+- `--max_turns`: Maximum conversation turns
+- `--enable_websocket_logging`: Enable real-time logging
+- `--verbose`: Verbose debug output
+- `--save_trajectories`: Save conversation trajectories
+
+## Parallel Execution
+
+The persistent connection pool enables true parallel agent execution. Multiple agents can run simultaneously, all sharing the same WebSocket connection for logging.
+
+### Test Parallel Execution
+
+```bash
+python test_parallel_execution.py
+```
+
+This script runs three tests:
+1. **Sequential** - baseline (3 queries one after another)
+2. **Parallel** - 3 queries simultaneously  
+3. **High Concurrency** - 10 queries simultaneously
+
+**Expected Results:**
+- ⚡ ~3x speedup with parallel execution
+- ✅ All queries logged to same connection
+- ✅ No connection timeouts or errors
+
+### Custom Parallel Code
+
+```python
+import asyncio
+from run_agent import AIAgent
+
+async def main():
+    agent1 = AIAgent(enable_websocket_logging=True)
+    agent2 = AIAgent(enable_websocket_logging=True)
+    
+    # run_conversation() blocks, so run each call in a worker thread
+    results = await asyncio.gather(
+        asyncio.to_thread(agent1.run_conversation, "Query 1"),
+        asyncio.to_thread(agent2.run_conversation, "Query 2")
+    )
+
+asyncio.run(main())
+```
+
+## Examples
+
+### Investment Research
+```bash
+python run_agent.py \
+  --enabled_toolsets web \
+  --query "Find publicly traded companies in renewable energy"
+```
+
+### Code Analysis
+```bash
+python run_agent.py \
+  --enabled_toolsets development \
+  --query "Analyze the codebase and suggest improvements"
+```
+
+### Image Analysis
+```bash
+python run_agent.py \
+  --enabled_toolsets vision \
+  --query "Analyze this chart and explain the trends"
+```
+
+## Development
+
+### Adding New Tools
+
+1. Create tool in `tools/` directory
+2. Register in `model_tools.py`
+3. Add to appropriate toolset in `toolsets.py`
+
+### Running Tests
+
+```bash
+# Test web tools
+python tests/test_web_tools.py
+
+# Test API endpoint / logging (run from the project root;
+# the script uses paths relative to it)
+./api_endpoint/test_websocket_logging.sh
+```
+
+## License
+
+MIT License - see LICENSE file for details
+
+## Contributing
+
+Contributions welcome! Please open an issue or PR.
+
+## Support
+
+For questions or issues:
+1. Check documentation in `api_endpoint/`
+2. Review example usage in this README
+3. Open a GitHub issue
+
+---
+
+Built with ❤️ for advanced AI agent workflows
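
The README above describes the composite toolsets (`research`, `development`, `analysis`, `full_stack`) as combinations of the basic ones. A minimal sketch of how such a comma-separated `--enabled_toolsets` value could expand, assuming a plain dict layout. `TOOLSETS` and `resolve_toolsets` are hypothetical names used only for illustration; the real definitions live in `toolsets.py`, which this excerpt does not show.

```python
# Hypothetical layout: each basic toolset maps to itself, each composite
# toolset maps to the basic toolsets the README says it combines.
BASIC = ["web", "terminal", "vision", "creative", "reasoning"]

TOOLSETS = {name: [name] for name in BASIC}
TOOLSETS.update({
    "research": ["web", "vision"],
    "development": ["web", "terminal", "vision"],
    "analysis": ["web", "vision", "reasoning"],
    "full_stack": list(BASIC),
})


def resolve_toolsets(spec: str) -> list[str]:
    """Expand a comma-separated toolset spec into basic toolsets.

    Order follows first appearance; duplicates are dropped.
    """
    resolved: list[str] = []
    for name in (part.strip() for part in spec.split(",")):
        if not name:
            continue  # tolerate stray commas
        if name not in TOOLSETS:
            raise ValueError(f"Unknown toolset: {name!r}")
        for basic in TOOLSETS[name]:
            if basic not in resolved:
                resolved.append(basic)
    return resolved


if __name__ == "__main__":
    print(resolve_toolsets("research"))
    print(resolve_toolsets("development,analysis"))
```

Rejecting unknown names early makes a typo on the command line fail before any model call is made, which may or may not match what `toolsets.py` actually does.
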
--- a/__pycache__/model_tools.cpython-310.pyc
+++ b/__pycache__/model_tools.cpython-310.pyc
--- a/__pycache__/web_tools.cpython-310.pyc
+++ b/__pycache__/web_tools.cpython-310.pyc
--- a/api_endpoint/__init__.py
+++ b/api_endpoint/__init__.py
@@ -0,0 +1,26 @@
+"""
+Hermes Agent - API Endpoint & Real-time Logging
+
+This package provides a FastAPI WebSocket endpoint for real-time logging of the Hermes Agent.
+
+Components:
+- logging_server: FastAPI server that receives and stores events
+- websocket_logger: Client library for sending events from the agent
+
+Usage:
+    # Start the API endpoint server
+    python api_endpoint/logging_server.py
+    
+    # Use in agent code
+    from api_endpoint.websocket_logger import WebSocketLogger
+    
+For more information, see:
+- WEBSOCKET_LOGGING_GUIDE.md - User guide
+- IMPLEMENTATION_SUMMARY.md - Technical details
+"""
+
+from .websocket_logger import WebSocketLogger, SyncWebSocketLogger
+
+__all__ = ['WebSocketLogger', 'SyncWebSocketLogger']
+__version__ = '1.0.0'
+
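
The package docstring describes `websocket_logger` as the client that sends events, and the `/ws` handler in `logging_server.py` reads three fields from each message: `session_id`, `event_type`, and `data`. A minimal sketch of building that envelope on the client side. `build_event` is a hypothetical helper for illustration, not part of `WebSocketLogger`, whose API this excerpt does not show.

```python
import json
from typing import Any, Dict, Optional


def build_event(session_id: str, event_type: str,
                data: Optional[Dict[str, Any]] = None) -> str:
    """Serialize one event in the envelope the /ws handler reads."""
    if not session_id:
        # The server answers messages without a session_id with an error.
        raise ValueError("session_id is required")
    envelope = {
        "session_id": session_id,
        "event_type": event_type,
        "data": data or {},  # the server treats a missing payload as {}
    }
    # ensure_ascii=False keeps non-ASCII text readable, matching the log files.
    return json.dumps(envelope, ensure_ascii=False)


if __name__ == "__main__":
    print(build_event("demo-session", "query", {"query": "Search for AI news"}))
```

Validating `session_id` before sending mirrors the server-side check, so a malformed event is caught in the agent process rather than surfacing as an error reply.
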
--- a/api_endpoint/logging_server.py
+++ b/api_endpoint/logging_server.py
@@ -0,0 +1,603 @@
+#!/usr/bin/env python3
+"""
+Hermes Agent - Real-time Logging Server
+
+A FastAPI server with WebSocket support that listens for agent execution events
+and logs them to JSON files in real-time.
+
+Events tracked:
+- User queries
+- API calls (requests to the model)
+- Assistant responses
+- Tool calls (name, parameters, timing)
+- Tool results (outputs, errors, duration)
+- Final responses
+- Session metadata
+
+Usage:
+    python logging_server.py
+    
+Or with uvicorn directly:
+    uvicorn logging_server:app --host 0.0.0.0 --port 8000 --reload
+    
+The server will listen for WebSocket connections at ws://localhost:8000/ws
+"""
+
+import json
+import asyncio
+import threading
+from datetime import datetime
+from pathlib import Path
+from typing import Dict, Any, List, Optional
+from fastapi import FastAPI, WebSocket, WebSocketDisconnect, BackgroundTasks
+from fastapi.middleware.cors import CORSMiddleware
+from pydantic import BaseModel
+import uvicorn
+from dotenv import load_dotenv
+
+load_dotenv()
+
+
+
+
+# Configuration
+LOGS_DIR = Path(__file__).parent / "logs" / "realtime"
+LOGS_DIR.mkdir(parents=True, exist_ok=True)
+
+# Initialize FastAPI app
+app = FastAPI(
+    title="Hermes Agent API Endpoint",
+    description="Manage interface between agent and user",
+    version="1.0.0"
+)
+
+# Add CORS middleware
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["*"],
+    allow_credentials=True,
+    allow_methods=["*"],
+    allow_headers=["*"],
+)
+
+
+class SessionLogger:
+    """
+    Manages logging for a single agent session.
+    
+    Each agent execution gets its own SessionLogger instance.
+    Responsible for:
+    - Collecting all events for the session
+    - Saving events to JSON file in real-time
+    - Managing session lifecycle (start -> events -> finalize)
+    """
+    
+    def __init__(self, session_id: str):
+        self.session_id = session_id
+        self.start_time = datetime.now()
+        self.events: List[Dict[str, Any]] = []  # In-memory list of all events
+        self.log_file = LOGS_DIR / f"session_{session_id}.json"  # Where to save on disk 
+        
+        # Initialize session data structure
+        # This is what gets saved to the JSON file
+        self.session_data = {
+            "session_id": session_id,
+            "start_time": self.start_time.isoformat(),
+            "end_time": None,  # Set when session completes
+            "events": [],      # Will be populated as events come in
+            "metadata": {}     # Model, toolsets, etc. (set via session_start event)
+        }
+    
+    def add_event(self, event: Dict[str, Any]):
+        """
+        Add an event to the session log.
+        
+        Called every time a new event arrives (query, api_call, tool_call, etc).
+        IMMEDIATELY saves to file for real-time persistence.
+        """
+        # Add timestamp if not present (should always be added, but safety check)
+        if "timestamp" not in event:
+            event["timestamp"] = datetime.now().isoformat()
+        
+        # Add to in-memory event list
+        self.events.append(event)
+        self.session_data["events"] = self.events
+        
+        # CRITICAL: Save to file immediately (real-time logging)
+        # This ensures events are persisted even if agent crashes
+        self._save()
+    
+    def set_metadata(self, metadata: Dict[str, Any]):
+        """Set session metadata (model, toolsets, etc.)."""
+        self.session_data["metadata"].update(metadata)
+        self._save()
+    
+    def finalize(self):
+        """Finalize the session and save."""
+        self.session_data["end_time"] = datetime.now().isoformat()
+        self._save()
+    
+    def _save(self):
+        """
+        Save current session data to JSON file.
+        
+        Called after EVERY event is added - provides real-time persistence.
+        If file writing fails, logs error but continues (doesn't crash server).
+        """
+        try:
+            # Write complete session data to JSON file
+            # indent=2 makes it human-readable
+            # ensure_ascii=False preserves Unicode characters
+            with open(self.log_file, 'w', encoding='utf-8') as f:
+                json.dump(self.session_data, f, indent=2, ensure_ascii=False)
+        except Exception as e:
+            print(f"❌ Error saving session log: {e}")
+
+
+class ConnectionManager:
+    """
+    Manages WebSocket connections and active sessions.
+    
+    Global singleton that:
+    - Tracks all active WebSocket connections (for broadcasting)
+    - Manages all SessionLogger instances (one per agent session)
+    - Coordinates between WebSocket events and file logging
+    """
+    
+    def __init__(self):
+        self.active_connections: List[WebSocket] = []      # All connected WebSocket clients
+        self.sessions: Dict[str, SessionLogger] = {}       # session_id -> SessionLogger mapping
+    
+    async def connect(self, websocket: WebSocket):
+        """Accept a new WebSocket connection."""
+        await websocket.accept()
+        self.active_connections.append(websocket)
+        print(f"✅ WebSocket connected. Active connections: {len(self.active_connections)}")
+    
+    def disconnect(self, websocket: WebSocket):
+        """Remove a WebSocket connection."""
+        if websocket in self.active_connections:
+            self.active_connections.remove(websocket)
+        print(f"❌ WebSocket disconnected. Active connections: {len(self.active_connections)}")
+    
+    def get_or_create_session(self, session_id: str) -> SessionLogger:
+        """
+        Get existing session logger or create a new one.
+        
+        Called when an event arrives for a session. Creates SessionLogger
+        on first event, reuses it for subsequent events from same session.
+        """
+        if session_id not in self.sessions:
+            # First time seeing this session - create new logger
+            self.sessions[session_id] = SessionLogger(session_id)
+            print(f"📝 Created new session: {session_id}")
+        return self.sessions[session_id]
+    
+    def finalize_session(self, session_id: str):
+        """Finalize a session (stamps end_time; the logger stays in self.sessions)."""
+        if session_id in self.sessions:
+            self.sessions[session_id].finalize()
+            print(f"✅ Session finalized: {session_id}")
+    
+    async def broadcast(self, message: Dict[str, Any]):
+        """
+        Broadcast a message to all connected WebSocket clients.
+        
+        Allows multiple clients (e.g., multiple browser tabs) to watch
+        the same agent session in real-time. Future UI feature.
+        """
+        disconnected = []
+        for connection in self.active_connections:
+            try:
+                await connection.send_json(message)
+            except Exception:
+                # Connection closed - mark for removal
+                disconnected.append(connection)
+        
+        # Clean up disconnected clients silently
+        for conn in disconnected:
+            if conn in self.active_connections:
+                self.active_connections.remove(conn)
+
+
+# Global connection manager
+manager = ConnectionManager()
+
+
+# Request/Response models for API endpoints
+class AgentRequest(BaseModel):
+    """Request model for starting an agent run."""
+    query: str
+    model: str = "claude-sonnet-4-5-20250929"
+    base_url: str = "https://api.anthropic.com/v1/"
+    enabled_toolsets: Optional[List[str]] = None
+    disabled_toolsets: Optional[List[str]] = None
+    max_turns: int = 10
+    mock_web_tools: bool = False
+    mock_delay: int = 60
+    verbose: bool = False
+
+
+class AgentResponse(BaseModel):
+    """Response model for agent run request."""
+    status: str
+    session_id: str
+    message: str
+
+
+@app.get("/")
+async def root():
+    """Root endpoint - server status."""
+    return {
+        "status": "running",
+        "service": "Hermes Agent Logging Server",
+        "websocket_url": "ws://localhost:8000/ws",
+        "active_connections": len(manager.active_connections),
+        "active_sessions": len(manager.sessions),
+        "logs_directory": str(LOGS_DIR)
+    }
+
+
+@app.get("/sessions")
+async def list_sessions():
+    """List all active and recent sessions."""
+    # Get all session log files
+    session_files = list(LOGS_DIR.glob("session_*.json"))
+    
+    sessions = []
+    for session_file in sorted(session_files, key=lambda x: x.stat().st_mtime, reverse=True):
+        try:
+            with open(session_file, 'r', encoding='utf-8') as f:
+                session_data = json.load(f)
+                sessions.append({
+                    "session_id": session_data.get("session_id"),
+                    "start_time": session_data.get("start_time"),
+                    "end_time": session_data.get("end_time"),
+                    "event_count": len(session_data.get("events", [])),
+                    "file": str(session_file)
+                })
+        except Exception as e:
+            print(f"⚠️ Error reading session file {session_file}: {e}")
+    
+    return {
+        "total_sessions": len(sessions),
+        "sessions": sessions
+    }
+
+
+@app.get("/sessions/{session_id}")
+async def get_session(session_id: str):
+    """Get detailed data for a specific session."""
+    from fastapi.responses import JSONResponse  # a (body, status) tuple would be sent as HTTP 200
+    session_file = LOGS_DIR / f"session_{session_id}.json"
+    
+    if not session_file.exists():
+        return JSONResponse(status_code=404, content={"error": "Session not found"})
+    try:
+        with open(session_file, 'r', encoding='utf-8') as f:
+            return json.load(f)
+    except Exception as e:
+        return JSONResponse(status_code=500, content={"error": f"Failed to load session: {e}"})
+
+
+@app.post("/agent/run", response_model=AgentResponse)
+async def run_agent(request: AgentRequest, background_tasks: BackgroundTasks):
+    """
+    Start an agent run with specified parameters.
+    
+    This endpoint triggers an agent execution in the background and returns immediately.
+    The agent will connect to the WebSocket endpoint to send real-time events.
+    
+    Args:
+        request: AgentRequest with query and configuration
+        background_tasks: Currently unused; the agent runs in a daemon thread instead
+        
+    Returns:
+        AgentResponse with session_id for tracking
+    """
+    import uuid
+    import sys
+    import os
+    
+    # Generate session ID for this run - we'll pass it to the agent
+    session_id = str(uuid.uuid4())
+    
+    # Add parent directory to path to import run_agent
+    parent_dir = str(Path(__file__).parent.parent)
+    if parent_dir not in sys.path:
+        sys.path.insert(0, parent_dir)
+    
+    from run_agent import AIAgent
+    
+    # Run agent in background thread (not blocking the API)
+    def run_agent_background():
+        """Run agent in a separate thread."""
+        try:
+            # Initialize agent with WebSocket logging enabled
+            agent = AIAgent(
+                base_url=request.base_url,
+                model=request.model,
+                api_key=os.getenv("ANTHROPIC_API_KEY"),
+                max_iterations=request.max_turns,
+                enabled_toolsets=request.enabled_toolsets,
+                disabled_toolsets=request.disabled_toolsets,
+                save_trajectories=False,
+                verbose_logging=request.verbose,
+                enable_websocket_logging=True,  # Always enable for UI
+                websocket_server="ws://localhost:8000/ws",
+                mock_web_tools=request.mock_web_tools,
+                mock_delay=request.mock_delay
+            )
+            
+            # Run conversation with our session_id
+            result = agent.run_conversation(
+                request.query,
+                session_id=session_id  # Pass session_id so it matches
+            )
+            
+            print(f"✅ Agent run completed: {session_id[:8]}...")
+            print(f"   Final response: {result['final_response'][:100] if result.get('final_response') else 'No response'}...")
+            
+        except Exception as e:
+            print(f"❌ Error running agent {session_id[:8]}...: {e}")
+            import traceback
+            traceback.print_exc()
+    
+    # Start agent in background thread
+    thread = threading.Thread(target=run_agent_background, daemon=True)
+    thread.start()
+    
+    return AgentResponse(
+        status="started",
+        session_id=session_id,
+        message=f"Agent started with session ID: {session_id}"
+    )
+
+
+@app.get("/tools")
+async def get_available_tools():
+    """Get list of available toolsets and tools."""
+    try:
+        import sys
+        parent_dir = str(Path(__file__).parent.parent)
+        if parent_dir not in sys.path:
+            sys.path.insert(0, parent_dir)
+        
+        from toolsets import get_all_toolsets, get_toolset_info
+        
+        all_toolsets = get_all_toolsets()
+        toolsets_info = []
+        
+        for name in all_toolsets.keys():
+            info = get_toolset_info(name)
+            if info:
+                toolsets_info.append({
+                    "name": name,
+                    "description": info['description'],
+                    "tool_count": info['tool_count'],
+                    "resolved_tools": info['resolved_tools']
+                })
+        
+        return {
+            "toolsets": toolsets_info
+        }
+    except Exception as e:
+        return {"error": f"Failed to load tools: {str(e)}"}
+
+
+@app.websocket("/ws")
+async def websocket_endpoint(websocket: WebSocket):
+    """
+    WebSocket endpoint for receiving real-time agent events.
+    
+    This is the main entry point for all logging. Agents connect here and send events.
+    
+    Message Flow:
+    1. Agent connects to ws://localhost:8000/ws
+    2. Agent sends events as JSON messages
+    3. Server parses event_type and routes to appropriate handler
+    4. Event is added to SessionLogger (saved to file)
+    5. Event is broadcast to all connected clients
+    6. Acknowledgment sent back to agent
+    
+    Expected message format:
+    {
+        "session_id": "unique-session-id",        // Links event to specific session
+        "event_type": "query" | "api_call" | ..., // What kind of event
+        "data": { ... event-specific data ... }   // Event payload
+    }
+    """
+    # Accept the WebSocket connection
+    await manager.connect(websocket)
+    
+    try:
+        # Main event loop - runs until client disconnects
+        while True:
+            # Receive message from client (agent)
+            # Suspends this coroutine (not the event loop) until the next message arrives
+            message = await websocket.receive_json()
+            
+            # Parse the standard message structure
+            session_id = message.get("session_id")  # Which agent session
+            event_type = message.get("event_type")   # What kind of event
+            data = message.get("data", {})           # Event payload
+            
+            # Validate: session_id is required
+            if not session_id:
+                await websocket.send_json({
+                    "error": "session_id is required"
+                })
+                continue
+            
+            # Get or create SessionLogger for this session
+            # First event creates it, subsequent events reuse it
+            session = manager.get_or_create_session(session_id)
+            
+            # Route event to appropriate handler based on event_type
+            # Each handler extracts relevant data and adds to session log
+            
+            if event_type == "session_start":
+                # Initial event - sent when agent first connects
+                # Contains metadata about the session (model, toolsets, etc.)
+                session.set_metadata(data)
+                print(f"🚀 Session started: {session_id}")
+                
+            elif event_type == "query":
+                # User query
+                session.add_event({
+                    "type": "query",
+                    "query": data.get("query"),
+                    "toolsets": data.get("toolsets"),
+                    "model": data.get("model")
+                })
+                print(f"📝 Query logged: {(data.get('query') or '')[:60]}...")
+                
+            elif event_type == "api_call":
+                # API call to model
+                session.add_event({
+                    "type": "api_call",
+                    "call_number": data.get("call_number"),
+                    "message_count": data.get("message_count"),
+                    "has_tools": data.get("has_tools")
+                })
+                print(f"🔄 API call #{data.get('call_number')} logged")
+                
+            elif event_type == "response":
+                # Assistant response
+                session.add_event({
+                    "type": "response",
+                    "call_number": data.get("call_number"),
+                    "content": data.get("content"),
+                    "has_tool_calls": data.get("has_tool_calls"),
+                    "tool_call_count": data.get("tool_call_count"),
+                    "duration": data.get("duration")
+                })
+                print(f"🤖 Response logged: {(data.get('content') or '')[:60]}...")
+                
+            elif event_type == "tool_call":
+                # Tool execution
+                session.add_event({
+                    "type": "tool_call",
+                    "call_number": data.get("call_number"),
+                    "tool_index": data.get("tool_index"),
+                    "tool_name": data.get("tool_name"),
+                    "parameters": data.get("parameters"),
+                    "tool_call_id": data.get("tool_call_id")
+                })
+                print(f"🔧 Tool call logged: {data.get('tool_name')}")
+                
+            elif event_type == "tool_result":
+                # Tool result - captures output from tool execution
+                # Now includes BOTH truncated preview AND full raw result
+                session.add_event({
+                    "type": "tool_result",
+                    "call_number": data.get("call_number"),
+                    "tool_index": data.get("tool_index"),
+                    "tool_name": data.get("tool_name"),
+                    "result": data.get("result"),              # Truncated preview (1000 chars)
+                    "raw_result": data.get("raw_result"),      # NEW: Full untruncated result
+                    "error": data.get("error"),
+                    "duration": data.get("duration"),
+                    "tool_call_id": data.get("tool_call_id")
+                })
+                
+                # Enhanced logging with size information
+                if data.get("error"):
+                    print(f"❌ Tool error logged: {data.get('tool_name')}")
+                else:
+                    # Show size of raw result to indicate data volume
+                    raw_size = len(data.get("raw_result") or data.get("result") or "")
+                    size_kb = raw_size / 1024
+                    print(f"✅ Tool result logged: {data.get('tool_name')} ({size_kb:.1f} KB)")
+                
+            elif event_type == "error":
+                # Error event
+                session.add_event({
+                    "type": "error",
+                    "error_message": data.get("error_message"),
+                    "call_number": data.get("call_number")
+                })
+                print(f"❌ Error logged: {(data.get('error_message') or '')[:60]}...")
+                
+            elif event_type == "complete":
+                # Session complete
+                session.add_event({
+                    "type": "complete",
+                    "final_response": data.get("final_response"),
+                    "total_calls": data.get("total_calls"),
+                    "completed": data.get("completed")
+                })
+                manager.finalize_session(session_id)
+                print(f"🎉 Session complete: {session_id}")
+                
+            else:
+                # Unknown event type - log it anyway
+                session.add_event({
+                    "type": event_type or "unknown",
+                    **data
+                })
+                print(f"⚠️ Unknown event type: {event_type}")
+            
+            # Broadcast event to all connected clients (for future real-time UI)
+            # Allows multiple browsers/dashboards to watch same session live
+            await manager.broadcast({
+                "session_id": session_id,
+                "event_type": event_type,
+                "timestamp": datetime.now().isoformat(),
+                "data": data
+            })
+            
+            # Send acknowledgment back to sender
+            # Confirms event was received and logged
+            # Handle case where client disconnects before we can ack
+            try:
+                await websocket.send_json({
+                    "status": "logged",
+                    "session_id": session_id,
+                    "event_type": event_type
+                })
+            except Exception:
+                # Connection closed before ack - this is normal for "complete" event
+                # Client disconnects after sending, so we can't ack
+                pass
+            
+    except WebSocketDisconnect:
+        manager.disconnect(websocket)
+    except Exception as e:
+        print(f"❌ WebSocket error: {e}")
+        manager.disconnect(websocket)
+
+
+def main(host: str = "0.0.0.0", port: int = 8000, reload: bool = False):
+    """
+    Start the logging server.
+    
+    Args:
+        host: Host to bind to (default: 0.0.0.0)
+        port: Port to run on (default: 8000)
+        reload: Enable auto-reload on file changes (default: False)
+    """
+    print("🚀 Hermes Agent Logging Server")
+    print("=" * 50)
+    print(f"📂 Logs directory: {LOGS_DIR}")
+    print(f"🌐 Server starting at http://{host}:{port}")
+    print(f"🔌 WebSocket endpoint: ws://{host}:{port}/ws")
+    print(f"🔄 Auto-reload: {'enabled' if reload else 'disabled'}")
+    print("\n📡 Ready to receive agent events...")
+    print("=" * 50)
+    
+    uvicorn.run(
+        "logging_server:app",
+        host=host,
+        port=port,
+        reload=reload,
+        log_level="info",
+        timeout_keep_alive=600      # Keep HTTP/WS connections alive for 10 minutes of inactivity
+        # Note: WebSocket ping/pong disabled in client to avoid timeout during blocked event loop
+    )
+
+
+if __name__ == "__main__":
+    import fire
+    fire.Fire(main)
+
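Review note: the handler above acknowledges each event with `{"status": "logged", "session_id": ..., "event_type": ...}`, while the clients treat whatever `recv()` returns as the ack. The following is a minimal sketch, not code from this change, of how a client could build the event envelope and verify that a received frame really is the matching ack. The helper names `build_event` and `is_ack_for` are hypothetical.

```python
import json
from typing import Any, Dict


def build_event(session_id: str, event_type: str, data: Dict[str, Any]) -> str:
    """Serialize an event using the envelope fields the server handler reads."""
    return json.dumps({
        "session_id": session_id,
        "event_type": event_type,
        "data": data,
    })


def is_ack_for(raw: str, session_id: str, event_type: str) -> bool:
    """Return True only if `raw` is a 'logged' ack for this session and event type."""
    try:
        message = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return (
        isinstance(message, dict)
        and message.get("status") == "logged"
        and message.get("session_id") == session_id
        and message.get("event_type") == event_type
    )


# Synthetic frames standing in for what a client might receive
good_ack = json.dumps({"status": "logged", "session_id": "s1", "event_type": "query"})
other_frame = json.dumps({"session_id": "s1", "event_type": "query", "data": {}})
```

A check like this would let a client tell a real acknowledgment apart from any other frame that arrives on the same socket.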
--- a/api_endpoint/test_websocket_logging.sh
+++ b/api_endpoint/test_websocket_logging.sh
@@ -0,0 +1,91 @@
+#!/bin/bash
+# Test script for WebSocket logging system
+#
+# This script demonstrates the complete WebSocket logging workflow:
+# 1. Starts the logging server
+# 2. Runs the agent with WebSocket logging enabled
+# 3. Shows the logged data
+#
+# Usage: ./test_websocket_logging.sh
+
+set -e  # Exit on error
+
+echo "🧪 Testing WebSocket Logging System"
+echo "===================================="
+echo ""
+
+# Check if required packages are installed
+echo "📦 Checking dependencies..."
+python -c "import fastapi; import uvicorn; import websockets" 2>/dev/null || {
+    echo "❌ Missing dependencies. Installing..."
+    pip install fastapi uvicorn websockets
+}
+echo "✅ Dependencies OK"
+echo ""
+
+# Start the logging server in the background
+echo "🚀 Starting logging server..."
+python api_endpoint/logging_server.py --port 8000 &
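+SERVER_PID=$!
+
+# Stop the server on any exit path (set -e aborts the script if the agent run fails)
+trap 'kill $SERVER_PID 2>/dev/null || true' EXIT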
+
+# Give server time to start
+sleep 2
+
+# Check if server is running
+if ps -p $SERVER_PID > /dev/null; then
+    echo "✅ Logging server started (PID: $SERVER_PID)"
+else
+    echo "❌ Failed to start logging server"
+    exit 1
+fi
+
+echo ""
+echo "🤖 Running agent with WebSocket logging..."
+echo ""
+
+# Run the agent with WebSocket logging
+python run_agent.py \
+    --enabled_toolsets web \
+    --enable_websocket_logging \
+    --query "What are the top 3 programming languages in 2025?" \
+    --max_turns 5
+
+echo ""
+echo "✅ Agent execution complete!"
+echo ""
+
+# Show the most recent log file
+echo "📊 Viewing logged session data..."
+echo ""
+
+LATEST_LOG=$(ls -t logs/realtime/session_*.json 2>/dev/null | head -1)
+
+if [ -f "$LATEST_LOG" ]; then
+    echo "📄 Log file: $LATEST_LOG"
+    echo ""
+    
+    # Pretty print the JSON if jq is available
+    if command -v jq &> /dev/null; then
+        echo "Event summary:"
+        jq '.events[] | {type: .type, timestamp: .timestamp}' "$LATEST_LOG"
+        echo ""
+        echo "Total events: $(jq '.events | length' "$LATEST_LOG")"
+    else
+        echo "Content (install 'jq' for pretty printing):"
+        cat "$LATEST_LOG"
+    fi
+else
+    echo "⚠️  No log files found in logs/realtime/"
+fi
+
+echo ""
+echo "🛑 Stopping logging server..."
+kill $SERVER_PID 2>/dev/null || true
+
+echo "✅ Test complete!"
+echo ""
+echo "Next steps:"
+echo "  1. Start server: python api_endpoint/logging_server.py"
+echo "  2. Run agent: python run_agent.py --enable_websocket_logging --query \"...\""
+echo "  3. View logs: http://localhost:8000/sessions"
+
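Review note: the `jq` calls in the script imply that each session file holds an `events` array whose items carry `type` and `timestamp`. Where `jq` is not installed, a few lines of Python can produce a similar summary. This is a sketch under that assumed file layout; `summarize_session` is a hypothetical helper and the sample data is synthetic.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path
from typing import Dict


def summarize_session(path: Path) -> Dict[str, int]:
    """Count events per type in one session log file."""
    session = json.loads(path.read_text(encoding="utf-8"))
    # Assumed layout: {"events": [{"type": ..., "timestamp": ...}, ...]}
    events = session.get("events", [])
    return dict(Counter(event.get("type", "unknown") for event in events))


# Demo on a synthetic session file written to a temporary directory
sample = {
    "events": [
        {"type": "query", "timestamp": "2025-10-13T11:00:00"},
        {"type": "tool_call", "timestamp": "2025-10-13T11:00:01"},
        {"type": "tool_call", "timestamp": "2025-10-13T11:00:02"},
        {"type": "complete", "timestamp": "2025-10-13T11:00:03"},
    ]
}
with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "session_demo.json"
    log_path.write_text(json.dumps(sample), encoding="utf-8")
    summary = summarize_session(log_path)

print(summary)
```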
--- a/api_endpoint/websocket_connection_pool.py
+++ b/api_endpoint/websocket_connection_pool.py
@@ -0,0 +1,457 @@
+"""
+WebSocket Connection Pool - Persistent Connection Manager
+
+This module provides a singleton WebSocket connection that persists across
+multiple agent runs. This is a more robust architecture than creating a new
+connection for each run.
+
+Benefits:
+- Avoids keepalive timeouts (ping/pong is disabled, so a blocked event loop cannot trip one)
+- No reconnection overhead (connect once)
+- Supports parallel agent runs (multiple sessions share one socket)
+- Proper shutdown handling (SIGTERM/SIGINT)
+- Serialized sends (an asyncio.Lock orders sends; the *_sync wrappers make them callable from other threads)
+"""
+
+import asyncio
+import signal
+import websockets
+from typing import Optional, Dict, Any
+import json
+import atexit
+import sys
+import threading
+from datetime import datetime
+
+
+class WebSocketConnectionPool:
+    """
+    Singleton WebSocket connection manager.
+    
+    Maintains a single persistent connection to the logging server
+    that all agent sessions can use. Handles graceful shutdown.
+    
+    Usage:
+        # Get singleton instance
+        pool = WebSocketConnectionPool()
+        
+        # Connect (idempotent - safe to call multiple times)
+        await pool.connect()
+        
+        # Send events (thread-safe, multiple sessions can call concurrently)
+        await pool.send_event("query", session_id, {...})
+        
+        # Shutdown handled automatically on SIGTERM/SIGINT
+    """
+    
+    _instance: Optional['WebSocketConnectionPool'] = None
+    
+    def __new__(cls):
+        """Ensure only one instance exists (singleton pattern)."""
+        if cls._instance is None:
+            cls._instance = super().__new__(cls)
+            cls._instance._initialized = False
+        return cls._instance
+    
+    def __init__(self):
+        """Initialize the connection pool (only once)."""
+        if getattr(self, '_initialized', False):
+            return
+            
+        self.websocket: Optional[websockets.WebSocketClientProtocol] = None
+        self.server_url: str = "ws://localhost:8000/ws"
+        self.connected: bool = False
+        # Store reference to loop for signal handlers
+        # Agent code should never close event loops when using persistent connections
+        self.loop: Optional[asyncio.AbstractEventLoop] = None
+        # Locks are created lazily when event loop exists
+        self._send_lock: Optional[asyncio.Lock] = None
+        self._connect_lock: Optional[asyncio.Lock] = None
+        self._locks_loop: Optional[asyncio.AbstractEventLoop] = None  # Track which loop created locks
+        self._init_lock = threading.Lock()  # Thread-safe lock initialization
+        self._shutdown_in_progress = False
+        self._initialized = True
+        
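+        # Register shutdown handlers for graceful cleanup
+        # These ensure WebSocket is closed properly on exit
+        # signal.signal() raises ValueError outside the main thread, so only
+        # register handlers there; atexit still covers normal interpreter exit
+        if threading.current_thread() is threading.main_thread():
+            signal.signal(signal.SIGTERM, self._signal_handler)
+            signal.signal(signal.SIGINT, self._signal_handler)
+        atexit.register(self._cleanup_sync)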
+        
+        print("🔌 WebSocket connection pool initialized")
+    
+    def _ensure_locks(self):
+        """
+        Lazy initialization of asyncio locks with thread safety and loop tracking.
+        
+        Locks must be created when an event loop exists, not at import time.
+        If the event loop changes between runs, locks must be recreated because
+        asyncio.Lock objects are bound to the loop that created them.
+        
+        This is called before any async operation that needs locks.
+        Uses a threading.Lock to prevent race conditions during initialization.
+        """
+        with self._init_lock:  # Thread-safe initialization
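+            try:
+                # Callers are coroutines, so a running loop is expected;
+                # get_running_loop() avoids the deprecated implicit-loop lookup
+                current_loop = asyncio.get_running_loop()
+            except RuntimeError:
+                # No running event loop in current thread
+                return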
+            
+            # Recreate locks if:
+            # 1. Locks don't exist yet, OR
+            # 2. Event loop has changed (locks are bound to the loop that created them)
+            if self._locks_loop is not current_loop or self._send_lock is None:
+                self._send_lock = asyncio.Lock()
+                self._connect_lock = asyncio.Lock()
+                self._locks_loop = current_loop
+    
+    async def connect(self, server_url: str = "ws://localhost:8000/ws") -> bool:
+        """
+        Connect to WebSocket server.
+        
+        This is idempotent - safe to call multiple times. If already connected,
+        does nothing. If connection failed previously, will retry.
+        
+        Args:
+            server_url: WebSocket server URL (default: ws://localhost:8000/ws)
+            
+        Returns:
+            bool: True if connected successfully, False otherwise
+        """
+        # Ensure locks exist (lazy initialization)
+        self._ensure_locks()
+        
+        async with self._connect_lock:
+            # Always update loop reference to current loop (even if already connected)
+            # This ensures signal handlers and cleanup use the correct loop
+            self.loop = asyncio.get_running_loop()
+            
+            # Already connected - nothing to do
+            if self.connected and self.websocket:
+                return True
+            
+            try:
+                self.server_url = server_url
+                
+                # Establish persistent WebSocket connection
+                # No ping/pong needed since connection stays open indefinitely
+                self.websocket = await websockets.connect(
+                    server_url,
+                    ping_interval=None,  # Disable ping/pong (not needed for persistent connection)
+                    max_size=10 * 1024 * 1024,  # 10MB max message size for large tool results
+                    open_timeout=10,  # 10s timeout for initial connection
+                    close_timeout=5   # 5s timeout for close handshake
+                )
+                
+                self.connected = True
+                
+                print(f"✅ Connected to logging server (persistent): {server_url}")
+                return True
+                
+            except Exception as e:
+                print(f"⚠️ Failed to connect to logging server: {e}")
+                self.connected = False
+                self.websocket = None
+                return False
+    
+    async def send_event(
+        self,
+        event_type: str,
+        session_id: str,
+        data: Dict[str, Any],
+        retry: bool = True
+    ) -> bool:
+        """
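+        Send event to logging server (safe for concurrent coroutines).
+        
+        Multiple agent runs can call this concurrently on the pool's event loop.
+        The send lock ensures only one send/ack exchange is in flight at a time,
+        so acknowledgments are not interleaved. From other threads, use
+        send_event_sync() instead.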
+        
+        Args:
+            event_type: Type of event (query, api_call, response, tool_call, tool_result, error, complete)
+            session_id: Unique session identifier
+            data: Event-specific data dictionary
+            retry: Whether to retry connection if disconnected (default: True)
+            
+        Returns:
+            bool: True if sent successfully, False otherwise
+        """
+        # Try to connect if not connected (or reconnect if disconnected)
+        if not self.connected or not self.websocket:
+            if retry:
+                await self.connect()
+            if not self.connected:
+                return False  # Give up if connection fails
+        
+        # Ensure locks exist (lazy initialization)
+        self._ensure_locks()
+        
+        # Lock to prevent concurrent sends (WebSocket requires sequential sends)
+        async with self._send_lock:
+            try:
+                # Create standardized message format
+                message = {
+                    "session_id": session_id,
+                    "event_type": event_type,
+                    "data": data,
+                    "timestamp": datetime.now().isoformat()
+                }
+                
+                # Send message as JSON
+                await self.websocket.send(json.dumps(message))
+                
+                # Wait for server acknowledgment (with timeout)
+                # This confirms the server received and processed the event
+                try:
+                    response = await asyncio.wait_for(
+                        self.websocket.recv(),
+                        timeout=2.0  # Increased to 2s for busy servers
+                    )
+                    # Successfully received acknowledgment
+                    return True
+                    
+                except asyncio.TimeoutError:
+                    # No response within timeout - that's OK, message likely sent
+                    # Server might be busy processing
+                    return True
+                    
+            except websockets.exceptions.ConnectionClosed:
+                print("⚠️ WebSocket connection closed unexpectedly")
+                self.connected = False
+                
+            except Exception as e:
+                print(f"⚠️ Error sending event: {e}")
+                self.connected = False
+                return False
+        
+        # Only reached after ConnectionClosed. Reconnect and resend outside the
+        # send lock: asyncio.Lock is not reentrant, so calling send_event again
+        # while still holding it would deadlock.
+        if retry:
+            print("🔄 Attempting to reconnect...")
+            if await self.connect():
+                # Call with retry=False to avoid an infinite loop
+                return await self.send_event(event_type, session_id, data, retry=False)
+        
+        return False
+    
+    async def disconnect(self):
+        """
+        Gracefully close the WebSocket connection.
+        
+        Called on shutdown (SIGTERM/SIGINT/exit). Ensures proper cleanup.
+        """
+        if self._shutdown_in_progress:
+            return  # Already shutting down
+        
+        self._shutdown_in_progress = True
+        
+        if self.websocket and self.connected:
+            try:
+                await self.websocket.close()
+                self.connected = False
+                print("✅ WebSocket connection pool closed gracefully")
+            except Exception as e:
+                print(f"⚠️ Error closing WebSocket: {e}")
+        
+        self._shutdown_in_progress = False
+    
+    def _signal_handler(self, signum, frame):
+        """
+        Handle SIGTERM/SIGINT signals for graceful shutdown.
+        
+        When user presses Ctrl+C or system sends SIGTERM, this ensures
+        the WebSocket is closed properly before exit.
+        """
+        print(f"\n🛑 Received signal {signum}, closing WebSocket connection pool...")
+        
+        # Check if we have a valid loop and are connected
+        if self.loop and not self.loop.is_closed() and self.connected and not self._shutdown_in_progress:
+            try:
+                # If loop is not running, we can wait for disconnect
+                if not self.loop.is_running():
+                    self.loop.run_until_complete(self.disconnect())
+                else:
+                    # Loop is running, can't wait for task - just mark disconnected
+                    # The disconnect task would be cancelled when we exit anyway
+                    self.connected = False
+                    print("⚠️ Loop is running, marking disconnected without waiting")
+            except Exception as e:
+                print(f"⚠️ Error during signal handler cleanup: {e}")
+        
+        # Exit gracefully
+        sys.exit(0)
+    
+    def _cleanup_sync(self):
+        """
+        Cleanup at exit (atexit handler).
+        
+        This is a fallback in case signal handlers don't fire.
+        Called when Python interpreter shuts down normally.
+        """
+        if self.loop and not self.loop.is_closed() and self.connected and not self._shutdown_in_progress:
+            try:
+                # Try to run disconnect synchronously
+                self.loop.run_until_complete(self.disconnect())
+            except Exception:
+                # Ignore errors during exit cleanup
+                pass
+    
+    def is_connected(self) -> bool:
+        """Check if currently connected to server."""
+        return self.connected and self.websocket is not None
+    
+    def get_stats(self) -> Dict[str, Any]:
+        """Get connection statistics for debugging."""
+        return {
+            "connected": self.connected,
+            "server_url": self.server_url,
+            "shutdown_in_progress": self._shutdown_in_progress,
+            "has_websocket": self.websocket is not None,
+            "has_loop": self.loop is not None
+        }
+
+
+# Global singleton instance
+# Import this in other modules: from websocket_connection_pool import ws_pool
+ws_pool = WebSocketConnectionPool()
+
+
+# Convenience functions for direct usage
+async def connect(server_url: str = "ws://localhost:8000/ws") -> bool:
+    """Connect to logging server (convenience function)."""
+    return await ws_pool.connect(server_url)
+
+
+async def send_event(event_type: str, session_id: str, data: Dict[str, Any]) -> bool:
+    """Send event to logging server (convenience function)."""
+    return await ws_pool.send_event(event_type, session_id, data)
+
+
+async def disconnect():
+    """Disconnect from logging server (convenience function)."""
+    await ws_pool.disconnect()
+
+
+def is_connected() -> bool:
+    """Check if connected to logging server (convenience function)."""
+    return ws_pool.is_connected()
+
+
+# ============================================================================
+# SYNCHRONOUS API FOR AGENT LAYER
+# ============================================================================
+# These functions provide a clean abstraction that hides event loop management
+# from the agent layer. Agent code should ONLY use these functions.
+
+def connect_sync(server_url: str = "ws://localhost:8000/ws") -> bool:
+    """
+    Synchronous connect - handles event loop internally.
+    
+    Creates a persistent event loop in a background thread if needed.
+    This is thread-safe and can be called from any thread (including agent background threads).
+    """
+    import threading
+    
+    # If pool doesn't have a loop yet or it's closed, we need to start one
+    if not ws_pool.loop or ws_pool.loop.is_closed():
+        # Start connection in a background thread with its own loop
+        result_container = {"success": False, "error": None, "connected": False}
+        
+        def run_in_thread():
+            try:
+                loop = asyncio.new_event_loop()
+                asyncio.set_event_loop(loop)
+                ws_pool.loop = loop  # Store the loop in the pool
+                
+                # Connect to WebSocket
+                result_container["success"] = loop.run_until_complete(ws_pool.connect(server_url))
+                result_container["connected"] = True
+                
+                # Keep loop running forever for future send_event calls
+                # This is critical - the loop must stay alive for run_coroutine_threadsafe to work
+                loop.run_forever()
+                
+            except Exception as e:
+                result_container["error"] = str(e)
+                print(f"❌ Error in WebSocket connection thread: {e}")
+            finally:
+                # Close the loop once it has stopped; closing a running loop
+                # raises RuntimeError
+                if not loop.is_running() and not loop.is_closed():
+                    loop.close()
+        
+        thread = threading.Thread(target=run_in_thread, daemon=True, name="WebSocket-EventLoop")
+        thread.start()
+        
+        # Wait for connection to complete (but not for loop to exit - it runs forever)
+        import time
+        timeout = 10.0
+        start = time.time()
+        while not result_container["connected"] and not result_container["error"] and (time.time() - start) < timeout:
+            time.sleep(0.1)
+        
+        if result_container["error"]:
+            print(f"⚠️  Connection failed: {result_container['error']}")
+        
+        return result_container["success"]
+    else:
+        # Pool already has a loop, use run_coroutine_threadsafe
+        try:
+            future = asyncio.run_coroutine_threadsafe(
+                ws_pool.connect(server_url),
+                ws_pool.loop
+            )
+            return future.result(timeout=10.0)
+        except Exception as e:
+            print(f"⚠️  Connection failed: {e}")
+            return False
+
+
+def send_event_sync(event_type: str, session_id: str, data: Dict[str, Any]) -> bool:
+    """
+    Synchronous send event - handles event loop internally.
+    
+    Uses the WebSocket pool's own event loop to avoid loop conflicts.
+    This is critical when called from background threads (like agent execution).
+    This is thread-safe and works correctly even when agent runs in a different thread.
+    """
+    if not ws_pool.loop or not ws_pool.loop.is_running():
+        # No event loop running - can't send
+        print("⚠️  WebSocket pool has no running event loop")
+        return False
+    
+    # Before Python 3.11, future.result() raises concurrent.futures.TimeoutError,
+    # which is a different class from the builtin TimeoutError
+    from concurrent.futures import TimeoutError as FutureTimeoutError
+    
+    try:
+        # Use run_coroutine_threadsafe to submit to the WebSocket pool's loop
+        # This works across threads - submits the coroutine to the correct loop
+        future = asyncio.run_coroutine_threadsafe(
+            ws_pool.send_event(event_type, session_id, data),
+            ws_pool.loop  # ← Use the pool's loop, not current thread's loop
+        )
+        
+        # Wait for completion (with timeout to avoid hanging)
+        return future.result(timeout=5.0)
+        
+    except FutureTimeoutError:
+        print(f"⚠️  Timeout sending event {event_type}")
+        return False
+    except Exception as e:
+        print(f"⚠️  Error sending event: {e}")
+        return False
+
+
+def disconnect_sync():
+    """
+    Synchronous disconnect - handles event loop internally.
+    
+    Thread-safe disconnect that works from any thread.
+    """
+    if ws_pool.loop and ws_pool.loop.is_running():
+        try:
+            future = asyncio.run_coroutine_threadsafe(
+                ws_pool.disconnect(),
+                ws_pool.loop
+            )
+            # disconnect() returns None, so report success explicitly
+            future.result(timeout=5.0)
+            return True
+        except Exception as e:
+            print(f"⚠️  Error disconnecting: {e}")
+            return False
+    return True
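Review note: the synchronous API above relies on one event loop that runs forever in a daemon thread, with callers submitting coroutines through `asyncio.run_coroutine_threadsafe`. The sketch below isolates that pattern using only the standard library. It is an illustration, not the module's code: `start_background_loop`, `fake_send`, and `send_sync` are hypothetical names, and `fake_send` stands in for the real WebSocket send.

```python
import asyncio
import threading


def start_background_loop() -> asyncio.AbstractEventLoop:
    """Start an event loop in a daemon thread and return it once it is running."""
    loop = asyncio.new_event_loop()
    ready = threading.Event()

    def run() -> None:
        asyncio.set_event_loop(loop)
        loop.call_soon(ready.set)  # fires on the loop's first iteration
        try:
            loop.run_forever()
        finally:
            loop.close()  # safe here: run_forever() has already returned

    threading.Thread(target=run, daemon=True, name="sketch-loop").start()
    ready.wait(timeout=5.0)
    return loop


async def fake_send(event_type: str) -> bool:
    """Stand-in for an async send; pretends the event was delivered."""
    await asyncio.sleep(0.01)
    return True


def send_sync(loop: asyncio.AbstractEventLoop, event_type: str, timeout: float = 5.0) -> bool:
    """Submit the coroutine to the background loop and block for its result."""
    future = asyncio.run_coroutine_threadsafe(fake_send(event_type), loop)
    return future.result(timeout=timeout)


loop = start_background_loop()
result = send_sync(loop, "query")
loop.call_soon_threadsafe(loop.stop)  # ask the loop to stop from this thread
```

Signalling readiness with a `threading.Event` avoids the sleep-and-poll wait, and closing the loop only after `run_forever()` returns sidesteps the error raised when a running loop is closed.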
--- a/api_endpoint/websocket_logger.py
+++ b/api_endpoint/websocket_logger.py
@@ -0,0 +1,387 @@
+#!/usr/bin/env python3
+"""
+WebSocket Logger Client
+
+Simple client for sending agent events to the logging server via WebSocket.
+Used by the agent to log events in real-time during execution.
+"""
+
+import json
+import asyncio
+from typing import Dict, Any, Optional
+from datetime import datetime
+import websockets
+
+
+class WebSocketLogger:
+    """
+    Client for logging agent events via WebSocket.
+    
+    Usage:
+        logger = WebSocketLogger("unique-session-id")
+        await logger.connect()
+        await logger.log_query("What is Python?", model="gpt-4")
+        await logger.log_api_call(call_number=1)
+        await logger.log_response(call_number=1, content="Python is...")
+        await logger.disconnect()
+    """
+    
+    def __init__(
+        self, 
+        session_id: str,
+        server_url: str = "ws://localhost:8000/ws",
+        enabled: bool = True
+    ):
+        """
+        Initialize WebSocket logger.
+        
+        Args:
+            session_id: Unique identifier for this agent session
+            server_url: WebSocket server URL (default: ws://localhost:8000/ws)
+            enabled: Whether logging is enabled (default: True)
+        """
+        self.session_id = session_id
+        self.server_url = server_url
+        self.enabled = enabled
+        self.websocket: Optional[websockets.WebSocketClientProtocol] = None
+        self.connected = False
+        self.reconnect_count = 0  # Track reconnections for debugging
+    
+    async def connect(self):
+        """
+        Connect to the WebSocket logging server.
+        
+        Establishes WebSocket connection and sends initial session_start event.
+        If connection fails, gracefully disables logging (agent continues normally).
+        """
+        if not self.enabled:
+            return
+        
+        try:
+            # Establish WebSocket connection to the server
+            # Use VERY LONG ping intervals to avoid timeout during long tool execution
+            # The event loop is blocked during tool execution, so we can't process pings
+            # Setting to very large values (1 hour) effectively disables it
+            self.websocket = await websockets.connect(
+                self.server_url,
+                ping_interval=3600,      # 1 hour - effectively disabled (event loop blocked anyway)
+                ping_timeout=3600,       # 1 hour timeout for pong response
+                close_timeout=10,        # Timeout for close handshake
+                max_size=10 * 1024 * 1024,  # 10MB max message size (for large raw_results)
+                open_timeout=10          # Timeout for initial connection
+            )
+            self.connected = True
+            print(f"✅ Connected to logging server (ping/pong: 3600s intervals): {self.server_url}")
+            
+            # Send initial session_start event
+            # This tells the server to create a new SessionLogger for this session
+            await self._send_event("session_start", {
+                "session_id": self.session_id,
+                "start_time": datetime.now().isoformat()
+            })
+            
+        except Exception as e:
+            # Connection failed - disable logging but don't crash the agent
+            print(f"⚠️ Failed to connect to logging server: {e}")
+            print(f"   Logging will be disabled for this session.")
+            self.enabled = False
+            self.connected = False
+    
+    async def disconnect(self):
+        """Disconnect from the WebSocket server."""
+        if self.websocket and self.connected:
+            try:
+                await self.websocket.close()
+                self.connected = False
+                print(f"✅ Disconnected from logging server")
+            except Exception as e:
+                print(f"⚠️ Error disconnecting: {e}")
+    
+    async def _send_event(self, event_type: str, data: Dict[str, Any]):
+        """
+        Send an event to the logging server.
+        
+        This is the core method that sends all events via WebSocket.
+        Creates a standardized message format and handles acknowledgments.
+        
+        Args:
+            event_type: Type of event (query, api_call, response, tool_call, tool_result, error, complete)
+            data: Event data dictionary containing event-specific information
+        """
+        # Safety check: Don't send if logging is disabled
+        if not self.enabled:
+            return
+        
+        # Auto-reconnect if connection was lost
+        if not self.connected or not self.websocket:
+            self.reconnect_count += 1
+            print(f"🔄 Reconnecting to logging server (attempt #{self.reconnect_count})...")
+            # connect() catches its own errors and clears self.enabled/self.connected
+            # on failure, so check the flags instead of relying on an exception
+            await self.connect()
+            if not self.connected or not self.websocket:
+                print("⚠️ Failed to reconnect")
+                return
+            print("✅ Reconnected successfully!")
+        
+        try:
+            # Create standardized message structure
+            # All events follow this format for consistent server-side handling
+            message = {
+                "session_id": self.session_id,      # Links event to specific agent session
+                "event_type": event_type,            # Identifies what kind of event this is
+                "data": data                         # Event-specific payload
+            }
+            
+            # Send message as JSON string over WebSocket
+            await self.websocket.send(json.dumps(message))
+            
+            # Wait for server acknowledgment (with 1 second timeout)
+            # This ensures the server received and processed the event
+            try:
+                response = await asyncio.wait_for(
+                    self.websocket.recv(),
+                    timeout=1.0
+                )
+                # Server sends back: {"status": "logged", "session_id": "...", "event_type": "..."}
+                # We don't need to process it, just confirms receipt
+            except asyncio.TimeoutError:
+                # No response within 1 second - that's okay, continue anyway
+                # Server might be busy or network slow, but event was likely sent
+                pass
+                
+        except Exception as e:
+            # Log error but don't crash - graceful degradation
+            # Agent should continue working even if logging fails
+            error_str = str(e)
+            
+            # Check if connection was closed (close code 1011 is what the
+            # websockets library reports on a keepalive ping timeout)
+            if "1011" in error_str or "closed" in error_str.lower():
+                print(f"⚠️ WebSocket connection closed: {error_str}")
+                self.connected = False  # Mark as disconnected
+                # This event is dropped; the next _send_event call will try to reconnect
+            else:
+                print(f"⚠️ Error sending event to logging server: {e}")
+            # Either way, don't raise: the agent keeps running even if logging fails
+    
+    # Convenience methods for specific event types
+    
+    async def log_query(
+        self, 
+        query: str, 
+        model: str = None,
+        toolsets: list = None
+    ):
+        """
+        Log a user query (the question/task given to the agent).
+        
+        This is typically the first event in a session after connection.
+        Captures what the user asked and which model/tools will be used.
+        """
+        await self._send_event("query", {
+            "query": query,          # The user's question/instruction
+            "model": model,          # Which AI model is being used
+            "toolsets": toolsets     # Which tool categories are enabled
+        })
+    
+    async def log_api_call(
+        self,
+        call_number: int,
+        message_count: int = None,
+        has_tools: bool = None
+    ):
+        """
+        Log an API call to the AI model.
+        
+        Called right before sending a request to the model (OpenAI/Anthropic/etc).
+        Helps track how many API calls are being made and conversation length.
+        """
+        await self._send_event("api_call", {
+            "call_number": call_number,      # Sequential number (1, 2, 3...)
+            "message_count": message_count,  # How many messages in conversation so far
+            "has_tools": has_tools          # Whether tools are available to the model
+        })
+    
+    async def log_response(
+        self,
+        call_number: int,
+        content: str = None,
+        has_tool_calls: bool = False,
+        tool_call_count: int = 0,
+        duration: float = None
+    ):
+        """
+        Log an assistant response from the AI model.
+        
+        Called after receiving a response from the API.
+        Captures what the model said and whether it wants to use tools.
+        """
+        await self._send_event("response", {
+            "call_number": call_number,          # Which API call this response is from
+            "content": content,                   # What the model said (text response)
+            "has_tool_calls": has_tool_calls,    # Did model request tool execution?
+            "tool_call_count": tool_call_count,  # How many tools does it want to call?
+            "duration": duration                  # How long the API call took (seconds)
+        })
+    
+    async def log_tool_call(
+        self,
+        call_number: int,
+        tool_index: int,
+        tool_name: str,
+        parameters: Dict[str, Any],
+        tool_call_id: str = None
+    ):
+        """
+        Log a tool call (before executing the tool).
+        
+        Captures which tool is being called and with what parameters.
+        This happens BEFORE the tool runs, so no results yet.
+        """
+        await self._send_event("tool_call", {
+            "call_number": call_number,      # Which API call requested this tool
+            "tool_index": tool_index,        # Which tool in the sequence (if multiple)
+            "tool_name": tool_name,          # Name of tool (e.g., "web_search", "web_extract")
+            "parameters": parameters,        # Arguments passed to the tool (e.g., {"query": "Python", "limit": 5})
+            "tool_call_id": tool_call_id    # Unique ID to link call with result
+        })
+    
+    async def log_tool_result(
+        self,
+        call_number: int,
+        tool_index: int,
+        tool_name: str,
+        result: str = None,
+        error: str = None,
+        duration: float = None,
+        tool_call_id: str = None,
+        raw_result: str = None  # Full untruncated result for verification
+    ):
+        """
+        Log a tool result (output from tool execution).
+        
+        Captures both a truncated preview (for UI display) and the full raw result
+        (for verification and debugging). This is especially important for web tools
+        where you want to see what was scraped vs what the LLM processed.
+        
+        Args:
+            call_number: Which API call this tool was part of
+            tool_index: Which tool in the sequence (1st, 2nd, etc.)
+            tool_name: Name of the tool that was executed
+            result: Tool output (will be truncated to 1000 chars for preview)
+            error: Error message if tool failed
+            duration: How long the tool took to execute (seconds)
+            tool_call_id: Unique ID linking this result to the tool call
+            raw_result: Full untruncated result for verification/debugging
+        """
+        await self._send_event("tool_result", {
+            "call_number": call_number,
+            "tool_index": tool_index,
+            "tool_name": tool_name,
+            "result": result[:1000] if result else None,  # Truncated preview (1000 chars max)
+            "raw_result": raw_result,  # Full result, sent untruncated - can be 100KB+ for web scraping
+            "error": error,
+            "duration": duration,
+            "tool_call_id": tool_call_id
+        })
+    
+    async def log_error(
+        self,
+        error_message: str,
+        call_number: int = None
+    ):
+        """
+        Log an error that occurred during agent execution.
+        
+        Captures exceptions, API failures, or other issues.
+        """
+        await self._send_event("error", {
+            "error_message": error_message,  # Description of what went wrong
+            "call_number": call_number       # Which API call caused the error (if applicable)
+        })
+    
+    async def log_complete(
+        self,
+        final_response: str = None,
+        total_calls: int = None,
+        completed: bool = True
+    ):
+        """
+        Log session completion (final event before disconnecting).
+        
+        Marks the end of the agent's execution and provides summary info.
+        """
+        await self._send_event("complete", {
+            "final_response": final_response[:500] if final_response else None,  # Truncated summary of final answer
+            "total_calls": total_calls,      # How many API calls were made total
+            "completed": completed           # Did it complete successfully? (true/false)
+        })
+
+
+# Synchronous wrapper for convenience
+class SyncWebSocketLogger:
+    """
+    Synchronous wrapper around WebSocketLogger.
+    
+    For use in synchronous code - creates an event loop internally.
+    """
+    
+    def __init__(self, session_id: str, server_url: str = "ws://localhost:8000/ws", enabled: bool = True):
+        self.logger = WebSocketLogger(session_id, server_url, enabled)
+        self.loop = None
+    
+    def connect(self):
+        """Connect to server (synchronous)."""
+        self.loop = asyncio.new_event_loop()
+        asyncio.set_event_loop(self.loop)
+        self.loop.run_until_complete(self.logger.connect())
+    
+    def disconnect(self):
+        """Disconnect from server (synchronous)."""
+        if self.loop:
+            self.loop.run_until_complete(self.logger.disconnect())
+            self.loop.close()
+    
+    def _run_async(self, coro):
+        """
+        Run an async coroutine synchronously.
+        
+        Bridge between sync code (agent) and async code (WebSocket).
+        Uses event loop to execute async operations in sync context.
+        """
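+        if self.loop is None:
+            coro.close()  # connect() not called yet: drop the event without a "never awaited" warning
+        elif self.loop.is_running():
+            # Called from code already running on our loop: schedule it (fire-and-forget, not awaited)
+            self.loop.create_task(coro)
+        else:
+            self.loop.run_until_complete(coro)  # Loop idle: block until the event has been sent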
+    
+    def log_query(self, query: str, model: str = None, toolsets: list = None):
+        self._run_async(self.logger.log_query(query, model, toolsets))
+    
+    def log_api_call(self, call_number: int, message_count: int = None, has_tools: bool = None):
+        self._run_async(self.logger.log_api_call(call_number, message_count, has_tools))
+    
+    def log_response(self, call_number: int, content: str = None, has_tool_calls: bool = False, 
+                    tool_call_count: int = 0, duration: float = None):
+        self._run_async(self.logger.log_response(call_number, content, has_tool_calls, 
+                                                 tool_call_count, duration))
+    
+    def log_tool_call(self, call_number: int, tool_index: int, tool_name: str, 
+                     parameters: Dict[str, Any], tool_call_id: str = None):
+        self._run_async(self.logger.log_tool_call(call_number, tool_index, tool_name, 
+                                                  parameters, tool_call_id))
+    
+    def log_tool_result(self, call_number: int, tool_index: int, tool_name: str,
+                       result: str = None, error: str = None, duration: float = None,
+                       tool_call_id: str = None, raw_result: str = None):
+        self._run_async(self.logger.log_tool_result(call_number, tool_index, tool_name,
+                                                    result, error, duration, tool_call_id, raw_result))
+    
+    def log_error(self, error_message: str, call_number: int = None):
+        self._run_async(self.logger.log_error(error_message, call_number))
+    
+    def log_complete(self, final_response: str = None, total_calls: int = None, completed: bool = True):
+        self._run_async(self.logger.log_complete(final_response, total_calls, completed))
+
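Review note: the sync wrapper above bridges blocking agent code to the async logger by owning a private event loop. A minimal, self-contained sketch of that pattern follows. `RecordingAsyncLogger` and `SyncBridge` are hypothetical stand-ins (they are not part of this PR) so the example runs without a WebSocket server; the 1000-character preview rule mirrors `log_tool_result`.

```python
import asyncio
from typing import Any, Dict, List, Optional


class RecordingAsyncLogger:
    """Hypothetical stand-in for WebSocketLogger: stores events instead of sending them."""

    def __init__(self) -> None:
        self.events: List[Dict[str, Any]] = []

    async def log_tool_result(self, tool_name: str, result: Optional[str] = None) -> None:
        await asyncio.sleep(0)  # yield once, as real network I/O would
        self.events.append({
            "tool_name": tool_name,
            # Same preview rule as the diff: keep at most 1000 characters
            "result": result[:1000] if result else None,
        })


class SyncBridge:
    """Runs each coroutine to completion on a private event loop."""

    def __init__(self, logger: RecordingAsyncLogger) -> None:
        self.logger = logger
        self.loop = asyncio.new_event_loop()

    def log_tool_result(self, tool_name: str, result: Optional[str] = None) -> None:
        # Blocks the calling (synchronous) code until the event is recorded
        self.loop.run_until_complete(self.logger.log_tool_result(tool_name, result))

    def close(self) -> None:
        self.loop.close()


bridge = SyncBridge(RecordingAsyncLogger())
bridge.log_tool_result("web_extract", "x" * 5000)
bridge.log_tool_result("web_search")
bridge.close()
```

Because each call blocks until the coroutine finishes, events arrive in call order; the trade-off is that a slow send stalls the agent for as long as the send takes.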
--- a/mixture_of_agents_tool.py
+++ b/mixture_of_agents_tool.py
@@ -50,13 +50,16 @@ import os
 import asyncio
 import uuid
 import datetime
+from dotenv import load_dotenv
 from pathlib import Path
 from typing import Dict, Any, List, Optional
 from openai import AsyncOpenAI

+load_dotenv()
+
 # Initialize Nous Research API client for MoA processing
 nous_client = AsyncOpenAI(
-    api_key=os.getenv("NOUS_API_KEY"),
+    api_key=os.getenv("NOUS_API_KEY"),  # read from the environment (.env is loaded by load_dotenv() above)
    base_url="https://inference-api.nousresearch.com/v1"
 )

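Review note: since `load_dotenv()` is now called at import time, the key can stay in the environment or a `.env` file. One possible way to fail fast when it is missing is sketched below. `require_env` and the `EXAMPLE_API_KEY` variable are hypothetical names used only for this illustration; they are not part of this PR.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your environment or .env file")
    return value


# Demo with a placeholder value; a real key would come from the shell or .env
os.environ["EXAMPLE_API_KEY"] = "placeholder-for-demo"
api_key = require_env("EXAMPLE_API_KEY")
```

Raising at startup surfaces a missing key immediately instead of at the first API call.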
--- a/mock_web_tools.py
+++ b/mock_web_tools.py
@@ -0,0 +1,243 @@
+"""
+Mock Web Tools for Testing WebSocket Reconnection
+
+This module provides mock implementations of web_search and web_extract
+that simulate long-running operations without making real API calls.
+
+Perfect for testing WebSocket timeout/reconnection behavior without:
+- Wasting API credits
+- Waiting for real web crawling
+- Network dependencies
+"""
+
+import time
+import json
+from typing import List
+
+
+def mock_web_search(query: str, delay: int = 2) -> str:
+    """
+    Mock web search that returns fake results after a delay.
+    
+    Args:
+        query: Search query (no real search is run; echoed in the log line and the third result)
+        delay: Seconds to sleep (default: 2s)
+    
+    Returns:
+        JSON string with fake search results
+    """
+    print(f"🔍 [MOCK] Searching for: '{query}' (will take {delay}s)...")
+    time.sleep(delay)
+    
+    result = {
+        "success": True,
+        "data": {
+            "web": [
+                {
+                    "url": "https://example.com/article1",
+                    "title": "Mock Article 1 - Water Utilities",
+                    "description": "This is a mock search result for testing purposes. Real data would appear here.",
+                    "category": None
+                },
+                {
+                    "url": "https://example.com/article2",
+                    "title": "Mock Article 2 - AI Data Centers",
+                    "description": "Another mock result. This simulates web_search without making real API calls.",
+                    "category": None
+                },
+                {
+                    "url": "https://example.com/article3",
+                    "title": "Mock Article 3 - Investment Opportunities",
+                    "description": "Third mock result for testing. Query was: " + query,
+                    "category": None
+                }
+            ]
+        }
+    }
+    
+    return json.dumps(result, indent=2)
+
+
+def mock_web_extract(urls: List[str], delay: int = 60) -> str:
+    """
+    Mock web extraction that simulates long-running crawl.
+    
+    This is perfect for testing WebSocket timeout/reconnection because:
+    - Default 60s delay triggers the ~30s WebSocket timeout
+    - No actual web requests made
+    - No API credits consumed
+    - Predictable, reproducible behavior
+    
+    Args:
+        urls: List of URLs to "extract" (never fetched; each URL is echoed in its fake result)
+        delay: Seconds to sleep (default: 60s to trigger timeout)
+    
+    Returns:
+        JSON string with fake extraction results
+    """
+    print(f"🌐 [MOCK] Extracting {len(urls)} URLs (will take {delay}s)...")
+    print("📊 [MOCK] This will test WebSocket reconnection (timeout at ~30s)")
+    
+    # Simulate long-running operation
+    # Show progress so user knows it's working
+    for i in range(delay):
+        if i % 10 == 0 and i > 0:
+            print(f"  ⏱️  [MOCK] {i}/{delay}s elapsed...")
+        time.sleep(1)
+    
+    # Generate fake but realistic-looking content
+    result = {
+        "success": True,
+        "data": []
+    }
+    
+    for idx, url in enumerate(urls, 1):
+        result["data"].append({
+            "url": url,
+            "title": f"Mock Extracted Content {idx}",
+            "content": f"# Mock Content from {url}\n\n"
+                      f"This is simulated extracted content for testing purposes. "
+                      f"In a real scenario, this would contain the full text from the webpage. "
+                      f"\n\n## Key Points\n"
+                      f"- Mock point 1 about water utilities\n"
+                      f"- Mock point 2 about AI data centers\n"
+                      f"- Mock point 3 about investment opportunities\n"
+                      f"\n\nThis content took {delay} seconds to 'extract', which is long enough "
+                      f"to trigger WebSocket timeout and test reconnection logic."
+                      * 10,  # Make it longer to simulate real extraction
+            "extracted_at": "2025-10-10T14:00:00Z"
+        })
+    
+    json_result = json.dumps(result, indent=2)
+    size_kb = len(json_result) / 1024
+    
+    print(f"✅ [MOCK] Extraction completed: {len(urls)} URLs, {size_kb:.1f} KB")
+    return json_result
+
+
+def mock_web_crawl(start_url: str, max_pages: int = 10, delay: int = 30) -> str:
+    """
+    Mock web crawling that simulates multi-page crawl.
+    
+    Args:
+        start_url: Starting URL (never fetched; used as the prefix for fake page URLs)
+        max_pages: Max pages to crawl (result count is min(max_pages, 5))
+        delay: Seconds to sleep (default: 30s)
+    
+    Returns:
+        JSON string with fake crawl results
+    """
+    print(f"🕷️  [MOCK] Crawling from: {start_url} (max {max_pages} pages, {delay}s)...")
+    time.sleep(delay)
+    
+    result = {
+        "success": True,
+        "data": {
+            "start_url": start_url,
+            "pages_crawled": min(max_pages, 5),
+            "pages": []
+        }
+    }
+    
+    for i in range(min(max_pages, 5)):
+        result["data"]["pages"].append({
+            "url": f"{start_url}/page{i+1}",
+            "title": f"Mock Page {i+1}",
+            "content": f"Mock content from page {i+1}. " * 50
+        })
+    
+    print(f"✅ [MOCK] Crawl completed: {len(result['data']['pages'])} pages")
+    return json.dumps(result, indent=2)
+
+
+# Tool definitions for the agent ("name"/"input_schema" layout; model_tools.py uses {"type": "function", ...})
+MOCK_WEB_TOOLS = [
+    {
+        "name": "web_search",
+        "description": "[MOCK] Search the web for information. Returns fake results after 2s delay. Perfect for quick tests.",
+        "input_schema": {
+            "type": "object",
+            "properties": {
+                "query": {
+                    "type": "string",
+                    "description": "The search query"
+                },
+                "delay": {
+                    "type": "integer",
+                    "description": "Seconds to delay (default: 2)",
+                    "default": 2
+                }
+            },
+            "required": ["query"]
+        }
+    },
+    {
+        "name": "web_extract",
+        "description": "[MOCK] Extract content from URLs. Simulates 60s delay to test WebSocket timeout/reconnection. Returns fake content without making real requests. PERFECT FOR TESTING!",
+        "input_schema": {
+            "type": "object",
+            "properties": {
+                "urls": {
+                    "type": "array",
+                    "items": {"type": "string"},
+                    "description": "List of URLs to extract"
+                },
+                "delay": {
+                    "type": "integer",
+                    "description": "Seconds to delay (default: 60 to trigger timeout)",
+                    "default": 60
+                }
+            },
+            "required": ["urls"]
+        }
+    },
+    {
+        "name": "web_crawl",
+        "description": "[MOCK] Crawl website starting from URL. Returns fake results after 30s delay.",
+        "input_schema": {
+            "type": "object",
+            "properties": {
+                "start_url": {
+                    "type": "string",
+                    "description": "Starting URL for crawl"
+                },
+                "max_pages": {
+                    "type": "integer",
+                    "description": "Max pages to crawl (default: 10)",
+                    "default": 10
+                },
+                "delay": {
+                    "type": "integer",
+                    "description": "Seconds to delay (default: 30)",
+                    "default": 30
+                }
+            },
+            "required": ["start_url"]
+        }
+    }
+]
+
+
+# Map function names to implementations
+MOCK_TOOL_FUNCTIONS = {
+    "web_search": mock_web_search,
+    "web_extract": mock_web_extract,
+    "web_crawl": mock_web_crawl
+}
+
+
+if __name__ == "__main__":
+    # Demo/test the mock tools
+    print("Testing Mock Web Tools")
+    print("=" * 60)
+    
+    print("\n1. Mock web_search (2s delay):")
+    result = mock_web_search("test query", delay=2)
+    print(f"Result length: {len(result)} chars\n")
+    
+    print("\n2. Mock web_extract (5s delay for demo - normally 60s):")
+    result = mock_web_extract(["https://example.com"], delay=5)
+    print(f"Result length: {len(result)} chars\n")
+    
+    print("\n✅ Mock web_search and web_extract working! (web_crawl not exercised here)")
+
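Review note: `MOCK_WEB_TOOLS` uses a `name`/`input_schema` layout, while the definitions in `model_tools.py` wrap each tool as `{"type": "function", "function": {...}}`. If the mocks need to be swapped in for the real tools, a small adapter plus a name-based dispatcher might look like the sketch below. `to_function_format`, `dispatch`, and `fake_search` are hypothetical helpers written for this illustration, not code from this PR.

```python
import json
from typing import Any, Callable, Dict


def to_function_format(tool: Dict[str, Any]) -> Dict[str, Any]:
    """Convert a name/input_schema definition to the type/function layout."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "parameters": tool["input_schema"],
        },
    }


def dispatch(functions: Dict[str, Callable[..., str]], name: str, args: Dict[str, Any]) -> str:
    """Call the named tool with keyword arguments; return a JSON error for unknown names."""
    if name not in functions:
        return json.dumps({"success": False, "error": f"Unknown tool: {name}"})
    return functions[name](**args)


def fake_search(query: str, delay: int = 0) -> str:
    """Stand-in for mock_web_search that returns immediately."""
    return json.dumps({"success": True, "query": query})


example_tool = {
    "name": "web_search",
    "description": "[MOCK] Search the web for information.",
    "input_schema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}

converted = to_function_format(example_tool)
found = json.loads(dispatch({"web_search": fake_search}, "web_search", {"query": "Python"}))
missing = json.loads(dispatch({"web_search": fake_search}, "web_crawl", {}))
```

Returning a JSON error for unknown names keeps the dispatcher's return type uniform, so the caller can always parse the result the same way.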
--- a/model_tools.py
+++ b/model_tools.py
@@ -8,27 +8,38 @@ for defining tools and executing function calls.

 Currently supports:
 - Web tools (search, extract, crawl) from web_tools.py
+- Terminal tools (command execution with interactive sessions) from terminal_tool.py
+- Vision tools (image analysis) from vision_tools.py
+- Mixture of Agents tools (collaborative multi-model reasoning) from mixture_of_agents_tool.py
+- Image generation tools (text-to-image with upscaling) from image_generation_tool.py

 Usage:
    from model_tools import get_tool_definitions, handle_function_call
    
-    # Get tool definitions for model API
+    # Get all available tool definitions for model API
    tools = get_tool_definitions()
    
+    # Get specific toolsets
+    web_tools = get_tool_definitions(enabled_toolsets=['web_tools'])
+    
    # Handle function calls from model
-    result = handle_function_call("web_search_tool", {"query": "Python", "limit": 3})
+    result = handle_function_call("web_search", {"query": "Python"})
 """

 import json
 import asyncio
 from typing import Dict, Any, List

-# Import toolsets
 from web_tools import web_search_tool, web_extract_tool, web_crawl_tool, check_firecrawl_api_key
 from terminal_tool import terminal_tool, check_hecate_requirements, TERMINAL_TOOL_DESCRIPTION
 from vision_tools import vision_analyze_tool, check_vision_requirements
 from mixture_of_agents_tool import mixture_of_agents_tool, check_moa_requirements
 from image_generation_tool import image_generate_tool, check_image_generation_requirements
+from toolsets import (
+    get_toolset, resolve_toolset, resolve_multiple_toolsets,
+    get_all_toolsets, get_toolset_names, validate_toolset,
+    get_toolset_info, print_toolset_tree
+)

 def get_web_tool_definitions() -> List[Dict[str, Any]]:
    """
@@ -42,20 +53,13 @@ def get_web_tool_definitions() -> List[Dict[str, Any]]:
            "type": "function",
            "function": {
                "name": "web_search",
-                "description": "Search the web for information on any topic. Returns relevant results with titles and URLs. Uses advanced search depth for comprehensive results.",
+                "description": "Search the web for information on any topic. Returns up to 5 relevant results with titles and URLs. Uses advanced search depth for comprehensive results.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "The search query to look up on the web"
-                        },
-                        "limit": {
-                            "type": "integer",
-                            "description": "Maximum number of results to return (default: 5, max: 10)",
-                            "default": 5,
-                            "minimum": 1,
-                            "maximum": 10
                        }
                    },
                    "required": ["query"]
@@ -75,11 +79,6 @@ def get_web_tool_definitions() -> List[Dict[str, Any]]:
                            "items": {"type": "string"},
                            "description": "List of URLs to extract content from (max 5 URLs per call)",
                            "maxItems": 5
-                        },
-                        "format": {
-                            "type": "string",
-                            "enum": ["markdown", "html"],
-                            "description": "Desired output format for extracted content (optional)"
                        }
                    },
                    "required": ["urls"]
@@ -101,12 +100,6 @@ def get_web_tool_definitions() -> List[Dict[str, Any]]:
                        "instructions": {
                            "type": "string",
                            "description": "Specific instructions for what to crawl/extract using AI intelligence (e.g., 'Find pricing information', 'Get documentation pages', 'Extract contact details')"
-                        },
-                        "depth": {
-                            "type": "string",
-                            "enum": ["basic", "advanced"],
-                            "description": "Depth of extraction - 'basic' for surface content, 'advanced' for deeper analysis (default: basic)",
-                            "default": "basic"
                        }
                    },
                    "required": ["url"]
@@ -185,12 +178,7 @@ def get_vision_tool_definitions() -> List[Dict[str, Any]]:
                        },
                        "question": {
                            "type": "string",
-                            "description": "Your specific question or request about the image to resolve. The AI will automatically provide a complete image description AND answer your specific question. Examples: 'What text can you read?', 'What architectural style is this?', 'Describe the mood and emotions', 'What safety hazards do you see?'"
-                        },
-                        "model": {
-                            "type": "string",
-                            "description": "The vision model to use for analysis (optional, default: gemini-2.5-flash)",
-                            "default": "gemini-2.5-flash"
+                            "description": "Your specific question or request about the image to resolve. The AI will automatically provide a complete image description AND answer your specific question."
                        }
                    },
                    "required": ["image_url", "question"]
@@ -212,7 +200,7 @@ def get_moa_tool_definitions() -> List[Dict[str, Any]]:
            "type": "function",
            "function": {
                "name": "mixture_of_agents",
-                "description": "Process extremely difficult problems requiring intense reasoning using the Mixture-of-Agents methodology. This tool leverages multiple frontier language models to collaboratively solve complex tasks that single models struggle with. Uses a fixed 2-layer architecture: reference models (claude-opus-4, gemini-2.5-pro, o4-mini, deepseek-r1) generate diverse responses, then an aggregator synthesizes the best solution. Best for: complex mathematical proofs, advanced coding problems, multi-step analytical reasoning, precise and complex STEM problems, algorithm design, and problems requiring diverse domain expertise.",
+                "description": "Process extremely difficult problems requiring intense reasoning using a Mixture-of-Agents. This tool leverages multiple frontier language models to collaboratively solve complex tasks that single models struggle with. Uses a fixed 2-layer architecture: reference models generate diverse responses, then an aggregator synthesizes the best solution. Best for: complex mathematical proofs, advanced coding problems, multi-step analytical reasoning, precise and complex STEM problems, algorithm design, and problems requiring diverse domain expertise.",
                "parameters": {
                    "type": "object",
                    "properties": {
@@ -240,13 +228,13 @@ def get_image_tool_definitions() -> List[Dict[str, Any]]:
            "type": "function",
            "function": {
                "name": "image_generate",
-                "description": "Generate high-quality images from text prompts using FAL.ai's FLUX.1 Krea model with automatic 2x upscaling. Creates detailed, artistic images that are automatically enhanced for superior quality. Returns a single upscaled image URL that can be displayed using <img src=\"{URL}\"></img> tags.",
+                "description": "Generate high-quality images from text prompts using FLUX Krea model with automatic 2x upscaling. Creates detailed, artistic images that are automatically enhanced for superior quality. Returns a single upscaled image URL that can be displayed using <img src=\"{URL}\"></img> tags.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "prompt": {
                            "type": "string",
-                            "description": "The text prompt describing the desired image. Be detailed and descriptive for best results."
+                            "description": "The text prompt describing the desired image. Be detailed and descriptive."
                        },
                        "image_size": {
                            "type": "string",
@@ -291,10 +279,6 @@ def get_all_tool_names() -> List[str]:
    if check_image_generation_requirements():
        tool_names.extend(["image_generate"])
    
-    # Future toolsets can be added here:
-    # if check_file_tools():
-    #     tool_names.extend(["file_read", "file_write"])
-    
    return tool_names


@@ -316,154 +300,152 @@ def get_toolset_for_tool(tool_name: str) -> str:
        "vision_analyze": "vision_tools",
        "mixture_of_agents": "moa_tools",
        "image_generate": "image_tools"
-        # Future tools can be added here
    }
    
    return toolset_mapping.get(tool_name, "unknown")


 def get_tool_definitions(
-    enabled_tools: List[str] = None, 
-    disabled_tools: List[str] = None,
    enabled_toolsets: List[str] = None,
    disabled_toolsets: List[str] = None
 ) -> List[Dict[str, Any]]:
    """
-    Get tool definitions for model API calls with optional filtering.
+    Get tool definitions for model API calls with toolset-based filtering.
    
-    This function aggregates tool definitions from all available toolsets
-    and applies filtering based on the provided parameters.
-    
-    Filter Priority (higher priority overrides lower):
-    1. enabled_tools (highest priority - only these tools, overrides everything)
-    2. disabled_tools (applied after toolset filtering)
-    3. enabled_toolsets (only tools from these toolsets)
-    4. disabled_toolsets (exclude tools from these toolsets)
+    This function aggregates tool definitions from available toolsets.
+    All tools must be part of a toolset to be accessible. Individual tool
+    selection is not supported - use toolsets to organize and select tools.
    
    Args:
-        enabled_tools (List[str]): Only include these specific tools. If provided, 
-                                  ONLY these tools will be included (overrides all other filters)
-        disabled_tools (List[str]): Exclude these specific tools (applied after toolset filtering)
-        enabled_toolsets (List[str]): Only include tools from these toolsets
-        disabled_toolsets (List[str]): Exclude tools from these toolsets
+        enabled_toolsets (List[str]): Only include tools from these toolsets.
+                                     If None, all available tools are included.
+        disabled_toolsets (List[str]): Exclude tools from these toolsets.
+                                      Applied only if enabled_toolsets is None.
    
    Returns:
        List[Dict]: Filtered list of tool definitions
    
    Examples:
-        # Only web tools
-        tools = get_tool_definitions(enabled_toolsets=["web_tools"])
+        # Use predefined toolsets
+        tools = get_tool_definitions(enabled_toolsets=["research"])
+        tools = get_tool_definitions(enabled_toolsets=["development"])
        
-        # All tools except terminal
-        tools = get_tool_definitions(disabled_tools=["terminal"])
+        # Combine multiple toolsets
+        tools = get_tool_definitions(enabled_toolsets=["web", "vision"])
        
-        # Only specific tools (overrides toolset filters)
-        tools = get_tool_definitions(enabled_tools=["web_search", "web_extract"])
+        # All tools except those in terminal toolset
+        tools = get_tool_definitions(disabled_toolsets=["terminal"])
        
-        # Conflicting filters (enabled_tools wins)
-        tools = get_tool_definitions(enabled_toolsets=["web_tools"], enabled_tools=["terminal"])
-        # Result: Only terminal tool (enabled_tools overrides enabled_toolsets)
+        # Default - all available tools
+        tools = get_tool_definitions()
    """
-    # Detect and warn about potential conflicts
-    conflicts_detected = False
+    # Collect all available tool definitions
+    all_available_tools_map = {}
    
-    if enabled_tools and (enabled_toolsets or disabled_toolsets or disabled_tools):
-        print("⚠️  enabled_tools overrides all other filters")
-        conflicts_detected = True
+    # Map tool names to their definitions
+    if check_firecrawl_api_key():
+        for tool in get_web_tool_definitions():
+            all_available_tools_map[tool["function"]["name"]] = tool
    
-    if enabled_toolsets and disabled_toolsets:
-        # Check for overlap
-        enabled_set = set(enabled_toolsets)
-        disabled_set = set(disabled_toolsets)
-        overlap = enabled_set & disabled_set
-        if overlap:
-            print(f"⚠️  Conflicting toolsets: {overlap} in both enabled and disabled")
-            print(f"   → enabled_toolsets takes priority")
-            conflicts_detected = True
+    if check_hecate_requirements():
+        for tool in get_terminal_tool_definitions():
+            all_available_tools_map[tool["function"]["name"]] = tool
    
-    if enabled_tools and disabled_tools:
-        # Check for overlap
-        enabled_set = set(enabled_tools)
-        disabled_set = set(disabled_tools)
-        overlap = enabled_set & disabled_set
-        if overlap:
-            print(f"⚠️  Conflicting tools: {overlap} in both enabled and disabled")
-            print(f"   → enabled_tools takes priority")
-            conflicts_detected = True
+    if check_vision_requirements():
+        for tool in get_vision_tool_definitions():
+            all_available_tools_map[tool["function"]["name"]] = tool
    
-    all_tools = []
+    if check_moa_requirements():
+        for tool in get_moa_tool_definitions():
+            all_available_tools_map[tool["function"]["name"]] = tool
    
-    # Collect all available tools from each toolset
-    toolset_tools = {
-        "web_tools": get_web_tool_definitions() if check_firecrawl_api_key() else [],
-        "terminal_tools": get_terminal_tool_definitions() if check_hecate_requirements() else [],
-        "vision_tools": get_vision_tool_definitions() if check_vision_requirements() else [],
-        "moa_tools": get_moa_tool_definitions() if check_moa_requirements() else [],
-        "image_tools": get_image_tool_definitions() if check_image_generation_requirements() else []
-        # Future toolsets can be added here:
-        # "file_tools": get_file_tool_definitions() if check_file_tools() else [],
-    }
+    if check_image_generation_requirements():
+        for tool in get_image_tool_definitions():
+            all_available_tools_map[tool["function"]["name"]] = tool
    
-    # HIGHEST PRIORITY: enabled_tools (overrides everything)
-    if enabled_tools:
-        if conflicts_detected:
-            print(f"🎯 Using only enabled_tools: {enabled_tools}")
-        
-        # Collect all available tools first
-        all_available_tools = []
-        for tools in toolset_tools.values():
-            all_available_tools.extend(tools)
-        
-        # Only include specifically enabled tools
-        tool_names_to_include = set(enabled_tools)
-        filtered_tools = [
-            tool for tool in all_available_tools 
-            if tool["function"]["name"] in tool_names_to_include
-        ]
-        
-        # Warn about requested tools that aren't available
-        found_tools = {tool["function"]["name"] for tool in filtered_tools}
-        missing_tools = tool_names_to_include - found_tools
-        if missing_tools:
-            print(f"⚠️  Requested tools not available: {missing_tools}")
-        
-        return filtered_tools
+    # Determine which tools to include based on toolsets
+    tools_to_include = set()
    
-    # Apply toolset-level filtering first
    if enabled_toolsets:
        # Only include tools from enabled toolsets
        for toolset_name in enabled_toolsets:
-            if toolset_name in toolset_tools:
-                all_tools.extend(toolset_tools[toolset_name])
+            if validate_toolset(toolset_name):
+                resolved_tools = resolve_toolset(toolset_name)
+                tools_to_include.update(resolved_tools)
+                print(f"✅ Enabled toolset '{toolset_name}': {', '.join(resolved_tools) if resolved_tools else 'no tools'}")
            else:
-                print(f"⚠️  Unknown toolset: {toolset_name}")
+                # Try legacy compatibility
+                if toolset_name in ["web_tools", "terminal_tools", "vision_tools", "moa_tools", "image_tools"]:
+                    # Map legacy names to new system
+                    legacy_map = {
+                        "web_tools": ["web_search", "web_extract", "web_crawl"],
+                        "terminal_tools": ["terminal"],
+                        "vision_tools": ["vision_analyze"],
+                        "moa_tools": ["mixture_of_agents"],
+                        "image_tools": ["image_generate"]
+                    }
+                    legacy_tools = legacy_map.get(toolset_name, [])
+                    tools_to_include.update(legacy_tools)
+                    print(f"✅ Enabled legacy toolset '{toolset_name}': {', '.join(legacy_tools)}")
+                else:
+                    print(f"⚠️  Unknown toolset: {toolset_name}")
    elif disabled_toolsets:
-        # Include all tools except from disabled toolsets
-        for toolset_name, tools in toolset_tools.items():
-            if toolset_name not in disabled_toolsets:
-                all_tools.extend(tools)
+        # Start with all tools from all toolsets, then remove disabled ones
+        # Note: Only tools that are part of toolsets are accessible
+        # We need to get all tools from all defined toolsets
+        from toolsets import get_all_toolsets
+        all_toolset_tools = set()
+        for toolset_name in get_all_toolsets():
+            resolved_tools = resolve_toolset(toolset_name)
+            all_toolset_tools.update(resolved_tools)
+        
+        # Start with all tools from toolsets
+        tools_to_include = all_toolset_tools
+        
+        # Remove tools from disabled toolsets
+        for toolset_name in disabled_toolsets:
+            if validate_toolset(toolset_name):
+                resolved_tools = resolve_toolset(toolset_name)
+                tools_to_include.difference_update(resolved_tools)
+                print(f"🚫 Disabled toolset '{toolset_name}': {', '.join(resolved_tools) if resolved_tools else 'no tools'}")
+            else:
+                # Try legacy compatibility
+                if toolset_name in ["web_tools", "terminal_tools", "vision_tools", "moa_tools", "image_tools"]:
+                    legacy_map = {
+                        "web_tools": ["web_search", "web_extract", "web_crawl"],
+                        "terminal_tools": ["terminal"],
+                        "vision_tools": ["vision_analyze"],
+                        "moa_tools": ["mixture_of_agents"],
+                        "image_tools": ["image_generate"]
+                    }
+                    legacy_tools = legacy_map.get(toolset_name, [])
+                    tools_to_include.difference_update(legacy_tools)
+                    print(f"🚫 Disabled legacy toolset '{toolset_name}': {', '.join(legacy_tools)}")
+                else:
+                    print(f"⚠️  Unknown toolset: {toolset_name}")
    else:
-        # Include all available tools
-        for tools in toolset_tools.values():
-            all_tools.extend(tools)
+        # No filtering - include all tools from all defined toolsets
+        from toolsets import get_all_toolsets
+        for toolset_name in get_all_toolsets():
+            resolved_tools = resolve_toolset(toolset_name)
+            tools_to_include.update(resolved_tools)
    
-    # Apply tool-level filtering (disabled_tools)
-    if disabled_tools:
-        tool_names_to_exclude = set(disabled_tools)
-        original_tools = [tool["function"]["name"] for tool in all_tools]
-        
-        all_tools = [
-            tool for tool in all_tools 
-            if tool["function"]["name"] not in tool_names_to_exclude
-        ]
-        
-        # Show what was actually filtered out
-        remaining_tools = {tool["function"]["name"] for tool in all_tools}
-        actually_excluded = set(original_tools) & tool_names_to_exclude
-        if actually_excluded:
-            print(f"🚫 Excluded tools: {actually_excluded}")
+    # Build final tool list (only include tools that are available)
+    filtered_tools = []
+    for tool_name in tools_to_include:
+        if tool_name in all_available_tools_map:
+            filtered_tools.append(all_available_tools_map[tool_name])
    
-    return all_tools
+    # Sort tools for consistent ordering
+    filtered_tools.sort(key=lambda t: t["function"]["name"])
+    
+    if filtered_tools:
+        tool_names = [t["function"]["name"] for t in filtered_tools]
+        print(f"🛠️  Final tool selection ({len(filtered_tools)} tools): {', '.join(tool_names)}")
+    else:
+        print("🛠️  No tools selected (all filtered out or unavailable)")
+    
+    return filtered_tools

 def handle_web_function_call(function_name: str, function_args: Dict[str, Any]) -> str:
    """
@@ -478,25 +460,22 @@ def handle_web_function_call(function_name: str, function_args: Dict[str, Any])
    """
    if function_name == "web_search":
        query = function_args.get("query", "")
-        limit = function_args.get("limit", 5)
-        # Ensure limit is within bounds
-        limit = max(1, min(10, limit))
+        # Always use fixed limit of 5
+        limit = 5
        return web_search_tool(query, limit)
    
    elif function_name == "web_extract":
        urls = function_args.get("urls", [])
        # Limit URLs to prevent abuse
        urls = urls[:5] if isinstance(urls, list) else []
-        format = function_args.get("format")
        # Run async function in event loop
-        return asyncio.run(web_extract_tool(urls, format))
+        return asyncio.run(web_extract_tool(urls, "markdown"))
    
    elif function_name == "web_crawl":
        url = function_args.get("url", "")
        instructions = function_args.get("instructions")
-        depth = function_args.get("depth", "basic")
        # Run async function in event loop
-        return asyncio.run(web_crawl_tool(url, instructions, depth))
+        return asyncio.run(web_crawl_tool(url, instructions, "basic"))
    
    else:
        return json.dumps({"error": f"Unknown web function: {function_name}"})
@@ -518,9 +497,8 @@ def handle_terminal_function_call(function_name: str, function_args: Dict[str, A
        background = function_args.get("background", False)
        idle_threshold = function_args.get("idle_threshold", 5.0)
        timeout = function_args.get("timeout")
-        snapshot_id = function_args.get("snapshot_id")
-        # Session management is handled internally - don't pass session_id from model
-        return terminal_tool(command, input_keys, None, background, idle_threshold, timeout, snapshot_id=snapshot_id)
+
+        return terminal_tool(command, input_keys, None, background, idle_threshold, timeout)
    
    else:
        return json.dumps({"error": f"Unknown terminal function: {function_name}"})
@@ -540,13 +518,11 @@ def handle_vision_function_call(function_name: str, function_args: Dict[str, Any
    if function_name == "vision_analyze":
        image_url = function_args.get("image_url", "")
        question = function_args.get("question", "")
-        model = function_args.get("model", "gemini-2.5-flash")
-        
-        # Automatically prepend full description request to user's question
-        full_prompt = f"Fully describe and explain everything about this image\n\n{question}"
+
+        full_prompt = f"Fully describe and explain everything about this image, then answer the following question:\n\n{question}"
        
        # Run async function in event loop
-        return asyncio.run(vision_analyze_tool(image_url, full_prompt, model))
+        return asyncio.run(vision_analyze_tool(image_url, full_prompt, "gemini-2.5-flash"))
    
    else:
        return json.dumps({"error": f"Unknown vision function: {function_name}"})
@@ -593,7 +569,6 @@ def handle_image_function_call(function_name: str, function_args: Dict[str, Any]
        if not prompt:
            return json.dumps({"success": False, "image": None})
        
-        # Extract only the exposed parameters
        image_size = function_args.get("image_size", "landscape_16_9")
        
        # Use fixed internal defaults for all other parameters (not exposed to model)
@@ -663,12 +638,6 @@ def handle_function_call(function_name: str, function_args: Dict[str, Any]) -> s
        elif function_name in ["image_generate"]:
            return handle_image_function_call(function_name, function_args)
        
-        # Future toolsets can be routed here:
-        # elif function_name in ["file_read_tool", "file_write_tool"]:
-        #     return handle_file_function_call(function_name, function_args)
-        # elif function_name in ["code_execute_tool", "code_analyze_tool"]:
-        #     return handle_code_function_call(function_name, function_args)
-        
        else:
            error_msg = f"Unknown function: {function_name}"
            print(f"❌ {error_msg}")
@@ -717,7 +686,6 @@ def get_available_toolsets() -> Dict[str, Dict[str, Any]]:
            "description": "Generate high-quality images from text prompts using FAL.ai's FLUX.1 Krea model with automatic 2x upscaling for enhanced quality",
            "requirements": ["FAL_KEY environment variable", "fal-client package"]
        }
-        # Future toolsets can be added here
    }
    
    return toolsets
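Review note: in `get_tool_definitions` the legacy-name table is written out twice, once in the enable branch and once in the disable branch. The sketch below shows one way the two branches could share a single table. It is an illustration only: `TOOLSETS`, `resolve_names`, and `select_tools` are hypothetical stand-ins, not the real `toolsets.resolve_toolset` / `validate_toolset` API, and unknown names are silently ignored here where the diff prints a warning.

```python
from typing import Dict, Iterable, List, Optional, Set

# Legacy toolset names and the tools they expand to (values copied from the diff).
LEGACY_TOOLSET_MAP: Dict[str, List[str]] = {
    "web_tools": ["web_search", "web_extract", "web_crawl"],
    "terminal_tools": ["terminal"],
    "vision_tools": ["vision_analyze"],
    "moa_tools": ["mixture_of_agents"],
    "image_tools": ["image_generate"],
}

# Hypothetical stand-in for the toolsets module (assumption, not the real registry).
TOOLSETS: Dict[str, List[str]] = {
    "web": ["web_search", "web_extract", "web_crawl"],
    "terminal": ["terminal"],
    "vision": ["vision_analyze"],
}


def resolve_names(toolset_name: str) -> Optional[List[str]]:
    """Return tool names for a current or legacy toolset, or None if unknown."""
    if toolset_name in TOOLSETS:
        return list(TOOLSETS[toolset_name])
    return LEGACY_TOOLSET_MAP.get(toolset_name)


def select_tools(enabled: Optional[Iterable[str]] = None,
                 disabled: Optional[Iterable[str]] = None) -> List[str]:
    """Mirror the diff's precedence: enabled toolsets win, else subtract disabled."""
    all_tools: Set[str] = {t for tools in TOOLSETS.values() for t in tools}
    if enabled:
        selected: Set[str] = set()
        for name in enabled:
            selected.update(resolve_names(name) or [])
    else:
        selected = set(all_tools)
        for name in disabled or []:
            selected.difference_update(resolve_names(name) or [])
    # Sorted for a stable order, as the diff does for the final tool list.
    return sorted(selected)
```

With one lookup function, the enable and disable paths differ only in `update` versus `difference_update`.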
--- a/output.txt
+++ b/output.txt
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,3 +1,14 @@
 firecrawl-py
 openai
-fal-client
+fal-client
+python-dotenv
+fire
+httpx
+yt-dlp
+streamlit
+fastapi
+uvicorn
+websockets
+PySide6>=6.6.0
+websocket-client>=1.7.0
+requests>=2.31.0
--- a/run_agent.py
+++ b/run_agent.py
--- a/terminal_tool.py
+++ b/terminal_tool.py
@@ -22,8 +22,8 @@ Usage:
 import json
 import os
 from typing import Optional, Dict, Any
-from hecate import run_tool_with_lifecycle_management
-from morphcloud._llm import ToolCall
+# from hecate import run_tool_with_lifecycle_management
+# from morphcloud._llm import ToolCall

 # Detailed description for the terminal tool based on Hermes Terminal system prompt
 TERMINAL_TOOL_DESCRIPTION = """Execute commands on a secure, persistent Linux VM environment with full interactive application support.
@@ -78,8 +78,7 @@ def terminal_tool(
    session_id: Optional[str] = None,
    background: bool = False,
    idle_threshold: float = 5.0,
-    timeout: Optional[int] = None,
-    snapshot_id: str | None = None,
+    timeout: Optional[int] = None
 ) -> str:
    """
    Execute a command on a Morph VM with optional interactive session support.
@@ -130,27 +129,30 @@ def terminal_tool(
            tool_input["idle_threshold"] = idle_threshold
        if timeout is not None:
            tool_input["timeout"] = timeout
+
+        # FIXME: terminal execution is disabled for now; the hecate/morphcloud calls below are commented out.
        
-        tool_call = ToolCall(
-            name="run_command",
-            input=tool_input
-        )
+        # tool_call = ToolCall(
+        #     name="run_command",
+        #     input=tool_input
+        # )
        
-        # Execute with lifecycle management
-        result = run_tool_with_lifecycle_management(tool_call, snapshot_id=snapshot_id)
+        # # Execute with lifecycle management
+        # result = run_tool_with_lifecycle_management(tool_call)
+
        
-        # Format the result with all possible fields
-        # Map hecate's "stdout" to "output" for compatibility
-        formatted_result = {
-            "output": result.get("stdout", result.get("output", "")),
-            "screen": result.get("screen", ""),
-            "session_id": result.get("session_id"),
-            "exit_code": result.get("returncode", result.get("exit_code", -1)),
-            "error": result.get("error"),
-            "status": "active" if result.get("session_id") else "ended"
-        }
+        # # Format the result with all possible fields
+        # # Map hecate's "stdout" to "output" for compatibility
+        # formatted_result = {
+        #     "output": result.get("stdout", result.get("output", "")),
+        #     "screen": result.get("screen", ""),
+        #     "session_id": result.get("session_id"),
+        #     "exit_code": result.get("returncode", result.get("exit_code", -1)),
+        #     "error": result.get("error"),
+        #     "status": "active" if result.get("session_id") else "ended"
+        # }
        
-        return json.dumps(formatted_result)
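+        return json.dumps({"output": "", "screen": "", "session_id": None, "exit_code": -1, "error": "Terminal tool is temporarily disabled", "status": "ended"})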
        
    except Exception as e:
        return json.dumps({
@@ -232,4 +234,4 @@ if __name__ == "__main__":
    print(f"  MORPH_API_KEY: {'Set' if os.getenv('MORPH_API_KEY') else 'Not set'}")
    print(f"  OPENAI_API_KEY: {'Set' if os.getenv('OPENAI_API_KEY') else 'Not set (optional)'}")
    print(f"  HECATE_VM_LIFETIME_SECONDS: {os.getenv('HECATE_VM_LIFETIME_SECONDS', '300')} (default: 300)")
-    print(f"  HECATE_DEFAULT_SNAPSHOT_ID: {os.getenv('HECATE_DEFAULT_SNAPSHOT_ID', 'snapshot_p5294qxt')} (default: snapshot_p5294qxt)")
+    print(f"  HECATE_DEFAULT_SNAPSHOT_ID: {os.getenv('HECATE_DEFAULT_SNAPSHOT_ID', 'snapshot_p5294qxt')} (default: snapshot_p5294qxt)")
--- a/test_mock_mode.sh
+++ b/test_mock_mode.sh
@@ -0,0 +1,122 @@
+#!/bin/bash
+#
+# Test Script for Mock Web Tools & WebSocket Reconnection
+#
+# This script tests:
+# 1. Mock web tools (no API calls, fake data)
+# 2. WebSocket timeout/reconnection during long operations
+# 3. Complete logging capture
+#
+# Perfect for development/testing without wasting API credits!
+
+set -e
+
+cd "$(dirname "$0")"
+
+echo "=========================================="
+echo "🧪 Mock Mode Test Script"
+echo "=========================================="
+echo ""
+
+# Check if logging server is running
+if ! curl -s http://localhost:8000/health > /dev/null 2>&1; then
+    echo "⚠️  Logging server not detected!"
+    echo "   Starting logging server in background..."
+    python api_endpoint/logging_server.py &
+    SERVER_PID=$!
+    echo "   Server PID: $SERVER_PID"
+    sleep 3
+else
+    echo "✅ Logging server already running"
+    SERVER_PID=""
+fi
+
+echo ""
+echo "📋 Test Configuration:"
+echo "   - Mock web tools: ENABLED"
+echo "   - Mock delay: 60 seconds (triggers WebSocket timeout)"
+echo "   - WebSocket logging: ENABLED"
+echo "   - Expected behavior: Connection timeout + auto-reconnect"
+echo ""
+echo "🔄 Running agent with mock mode..."
+echo "   (This will take ~60 seconds to test reconnection)"
+echo ""
+
+# Run agent with mock mode
+python run_agent.py \
+  --enabled_toolsets web \
+  --enable_websocket_logging \
+  --mock_web_tools \
+  --mock_delay 60 \
+  --query "Find publicly traded water companies benefiting from AI data centers"
+
+echo ""
+echo "=========================================="
+echo "✅ Test Complete!"
+echo "=========================================="
+echo ""
+
+# Find most recent log file
+LATEST_LOG=$(ls -t api_endpoint/logs/realtime/session_*.json 2>/dev/null | head -1)
+
+if [ -n "$LATEST_LOG" ]; then
+    echo "📊 Log Analysis:"
+    echo "   File: $LATEST_LOG"
+    echo ""
+    
+    # Count events
+    echo "   Event Counts:"
+    python3 -c "
+import json
+import sys
+
+with open('$LATEST_LOG') as f:
+    data = json.load(f)
+    events = data.get('events', [])
+    
+    # Count by type
+    counts = {}
+    for e in events:
+        etype = e.get('type', 'unknown')
+        counts[etype] = counts.get(etype, 0) + 1
+    
+    for etype, count in sorted(counts.items()):
+        print(f'     - {etype}: {count}')
+    
+    # Check completeness
+    has_complete = any(e.get('type') == 'complete' for e in events)
+    print()
+    if has_complete:
+        print('   ✅ Session completed successfully!')
+    else:
+        print('   ⚠️  Session incomplete (may have been interrupted)')
+    
+    # Check that every tool call has a matching result (events can be lost across a reconnect)
+    tool_results = [e for e in events if e.get('type') == 'tool_result']
+    tool_calls = [e for e in events if e.get('type') == 'tool_call']
+    
+    if len(tool_results) == len(tool_calls):
+        print('   ✅ All tool calls have results (no missing events)')
+    else:
+        print(f'   ⚠️  Tool calls: {len(tool_calls)}, Results: {len(tool_results)}')
+"
+else
+    echo "⚠️  No log files found"
+fi
+
+# Cleanup
+if [ -n "$SERVER_PID" ]; then
+    echo ""
+    echo "🛑 Stopping logging server (PID: $SERVER_PID)..."
+    kill "$SERVER_PID" 2>/dev/null || true
+fi
+
+echo ""
+echo "💡 Key Observations to Look For:"
+echo "   1. '[MOCK]' prefix on tool execution messages"
+echo "   2. '🔄 Reconnecting to logging server' after long tool"
+echo "   3. '✅ Reconnected successfully!' confirmation"
+echo "   4. Complete log file with all events captured"
+echo ""
+echo "🎉 Mock mode test completed!"
+
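Review note: the inline log analysis above interpolates `$LATEST_LOG` directly into Python source, which would break on a path containing a quote. A minimal sketch of the same checks as functions that take the path as an argument, assuming only the log layout the script itself reads (a top-level `events` list whose items carry a `type` field). `summarize_events` and `summarize_log` are hypothetical names.

```python
import json
from collections import Counter
from typing import Any, Dict, List


def summarize_events(events: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Count events by type and run the script's two sanity checks."""
    counts = Counter(e.get("type", "unknown") for e in events)
    return {
        "counts": dict(counts),
        # A session is complete if at least one 'complete' event was logged.
        "completed": counts["complete"] > 0,
        # Every tool call should have a matching result.
        "balanced": counts["tool_call"] == counts["tool_result"],
    }


def summarize_log(path: str) -> Dict[str, Any]:
    """Load a session log file and summarize its events."""
    with open(path) as f:
        data = json.load(f)
    return summarize_events(data.get("events", []))
```

The script could then pass the path as an argument instead of embedding it in the code string.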
--- a/test_parallel_execution.py
+++ b/test_parallel_execution.py
@@ -0,0 +1,242 @@
+#!/usr/bin/env python3
+"""
+Test Parallel Execution with Persistent WebSocket Connection Pool
+
+This script demonstrates that multiple agent runs can execute in parallel,
+all sharing a single WebSocket connection for logging.
+
+Benefits:
+- No connection overhead (single persistent connection)
+- No timeout issues (connection stays alive)
+- True parallel execution (multiple sessions simultaneously)
+"""
+
+import asyncio
+from run_agent import AIAgent
+import time
+
+
+async def run_agent_query(query: str, agent_name: str, mock_delay: int = 10):
+    """
+    Run a single agent query with logging.
+    
+    Args:
+        query: Query to send to agent
+        agent_name: Name for logging purposes
+        mock_delay: Delay for mock tools (seconds)
+    """
+    print(f"🚀 [{agent_name}] Starting query: '{query[:40]}...'")
+    start_time = time.time()
+    
+    try:
+        agent = AIAgent(
+            model="claude-sonnet-4-5-20250929",
+            max_iterations=5,
+            enabled_toolsets=["web"],
+            enable_websocket_logging=True,
+            websocket_server="ws://localhost:8000/ws",
+            mock_web_tools=True,  # Use mock tools for fast testing
+            mock_delay=mock_delay
+        )
+        
+        result = await agent.run_conversation(query)
+        
+        duration = time.time() - start_time
+        print(f"✅ [{agent_name}] Completed in {duration:.1f}s - {result['api_calls']} API calls")
+        
+        return {
+            "agent": agent_name,
+            "query": query,
+            "success": True,
+            "duration": duration,
+            "api_calls": result['api_calls'],
+            "session_id": result.get('session_id')
+        }
+        
+    except Exception as e:
+        duration = time.time() - start_time
+        print(f"❌ [{agent_name}] Failed in {duration:.1f}s: {e}")
+        return {
+            "agent": agent_name,
+            "query": query,
+            "success": False,
+            "error": str(e),
+            "duration": duration
+        }
+
+
+async def test_sequential():
+    """
+    Test 1: Sequential execution (baseline).
+    
+    Runs 3 queries one after another. This shows how long it takes
+    without parallelization.
+    """
+    print("\n" + "="*60)
+    print("TEST 1: Sequential Execution (Baseline)")
+    print("="*60)
+    
+    start_time = time.time()
+    
+    results = []
+    for i in range(3):
+        result = await run_agent_query(
+            query=f"Find information about water companies #{i+1}",
+            agent_name=f"Agent{i+1}",
+            mock_delay=5  # Short delay for quick test
+        )
+        results.append(result)
+    
+    total_time = time.time() - start_time
+    
+    print(f"\n📊 Sequential Results:")
+    print(f"   Total time: {total_time:.1f}s")
+    print(f"   Successful: {sum(1 for r in results if r['success'])}/3")
+    print(f"   Average per query: {total_time/3:.1f}s")
+    
+    return results
+
+
+async def test_parallel():
+    """
+    Test 2: Parallel execution.
+    
+    Runs 3 queries simultaneously using asyncio.gather().
+    All queries share the same WebSocket connection for logging.
+    """
+    print("\n" + "="*60)
+    print("TEST 2: Parallel Execution (Shared Connection)")
+    print("="*60)
+    
+    start_time = time.time()
+    
+    # Run all queries in parallel!
+    results = await asyncio.gather(
+        run_agent_query(
+            query="Find publicly traded water utility companies",
+            agent_name="Agent1",
+            mock_delay=5
+        ),
+        run_agent_query(
+            query="Find energy infrastructure companies",
+            agent_name="Agent2",
+            mock_delay=5
+        ),
+        run_agent_query(
+            query="Find AI data center operators",
+            agent_name="Agent3",
+            mock_delay=5
+        )
+    )
+    
+    total_time = time.time() - start_time
+    
+    print(f"\n📊 Parallel Results:")
+    print(f"   Total time: {total_time:.1f}s")
+    print(f"   Successful: {sum(1 for r in results if r['success'])}/3")
+    print(f"   Speedup: ~{(sum(r['duration'] for r in results) / total_time):.1f}x")
+    print(f"   Sessions logged: {[(r.get('session_id') or 'N/A')[:8] for r in results]}")
+    
+    return results
+
+
+async def test_high_concurrency():
+    """
+    Test 3: High concurrency (stress test).
+    
+    Runs 10 queries simultaneously to test connection pool under load.
+    """
+    print("\n" + "="*60)
+    print("TEST 3: High Concurrency (10 Parallel Agents)")
+    print("="*60)
+    
+    start_time = time.time()
+    
+    tasks = [
+        run_agent_query(
+            query=f"Test query #{i+1}",
+            agent_name=f"Agent{i+1}",
+            mock_delay=3  # Very short for stress test
+        )
+        for i in range(10)
+    ]
+    
+    results = await asyncio.gather(*tasks)
+    
+    total_time = time.time() - start_time
+    successful = sum(1 for r in results if r['success'])
+    
+    print(f"\n📊 High Concurrency Results:")
+    print(f"   Total time: {total_time:.1f}s")
+    print(f"   Successful: {successful}/10")
+    print(f"   Failed: {10 - successful}/10")
+    print(f"   Queries per second: {10 / total_time:.2f}")
+    
+    return results
+
+
+async def main():
+    """Run all tests."""
+    print("\n🧪 WebSocket Connection Pool - Parallel Execution Tests")
+    print("="*60)
+    print("\nPREREQUISITE: Make sure logging server is running:")
+    print("  python api_endpoint/logging_server.py")
+    print("\nPress Ctrl+C to stop at any time\n")
+    
+    await asyncio.sleep(2)  # Give user time to read
+    
+    try:
+        # Test 1: Sequential (baseline)
+        seq_results = await test_sequential()
+        
+        # Test 2: Parallel (main test)
+        par_results = await test_parallel()
+        
+        # Test 3: High concurrency
+        stress_results = await test_high_concurrency()
+        
+        # Summary
+        print("\n" + "="*60)
+        print("SUMMARY")
+        print("="*60)
+        print(f"\n✅ All tests completed!")
+        print(f"\nKey Findings:")
+        print(f"  • Sequential (3 queries): {sum(r['duration'] for r in seq_results):.1f}s total")
+        print(f"  • Parallel (3 queries): {max(r['duration'] for r in par_results):.1f}s total")
+        print(f"  • Speedup: ~{sum(r['duration'] for r in seq_results) / max(r['duration'] for r in par_results):.1f}x")
+        print(f"  • High concurrency (10 queries): {sum(1 for r in stress_results if r['success'])}/10 succeeded")
+        print(f"\n💡 Queries are expected to share one persistent WebSocket connection;")
+        print(f"   check the logging server output to confirm.")
+        
+    except KeyboardInterrupt:
+        print("\n\n⚠️  Tests interrupted by user")
+    except Exception as e:
+        print(f"\n\n❌ Tests failed: {e}")
+        import traceback
+        traceback.print_exc()
+
+
+if __name__ == "__main__":
+    print("\n" + "="*60)
+    print("SETUP CHECK")
+    print("="*60)
+    
+    # Check if logging server is running
+    import socket
+    try:
+        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+        result = sock.connect_ex(('localhost', 8000))
+        sock.close()
+        
+        if result == 0:
+            print("✅ Logging server is running on port 8000")
+        else:
+            print("⚠️  Logging server not detected on port 8000")
+            print("   Start it with: python api_endpoint/logging_server.py")
+            print("\nContinuing anyway (tests will fail gracefully)...")
+    except Exception as e:
+        print(f"⚠️  Could not check server status: {e}")
+    
+    # Run tests
+    asyncio.run(main())
+
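Review note: the speedup figure in `test_parallel` is the sum of per-query durations divided by wall-clock time. That ratio only exceeds 1 when the awaited work actually yields to the event loop; whether `AIAgent.run_conversation` does so is not shown in this diff. A minimal sketch of the same metric, with `asyncio.sleep` standing in for an agent run (`fake_query` and `measure_speedup` are hypothetical names):

```python
import asyncio
import time
from typing import List


async def fake_query(delay: float) -> float:
    """Stand-in for one agent run; returns its own elapsed time."""
    start = time.perf_counter()
    await asyncio.sleep(delay)  # yields to the event loop, so runs can overlap
    return time.perf_counter() - start


async def measure_speedup(delays: List[float]) -> float:
    """Run all queries concurrently and return sum(durations) / wall-clock time."""
    start = time.perf_counter()
    durations = await asyncio.gather(*(fake_query(d) for d in delays))
    wall = time.perf_counter() - start
    return sum(durations) / wall
```

If `fake_query` used a blocking `time.sleep` instead, `gather` would run the coroutines one after another and the ratio would stay close to 1.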
--- a/test_run.sh
+++ b/test_run.sh
@@ -1,13 +1,30 @@
+#!/bin/bash
+
+# Check if a prompt argument was provided
+if [ $# -eq 0 ]; then
+    echo "Error: Please provide a prompt as an argument"
+    echo "Usage: $0 \"your prompt here\""
+    exit 1
+fi
+
+# Get the prompt from the first argument
+PROMPT="$1"
+
+# Set debug mode for web tools
 export WEB_TOOLS_DEBUG=true

+# Run the agent with the provided prompt
 python run_agent.py \
-  --query "Tell me about this animal pictured: https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQi1nkrYXY-ijQv5aCxkwooyg2roNFxj0ewJA&s" \
+  --query "$PROMPT" \
  --max_turns 30 \
  --model claude-sonnet-4-20250514 \
  --base_url https://api.anthropic.com/v1/ \
  --api_key $ANTHROPIC_API_KEY \
-  --enabled_toolsets=vision_tools
-
+  --save_trajectories \
+  --enabled_toolsets=web
+  
+#  --model claude-sonnet-4-20250514 \
+#  
 #Possible Toolsets:
 #web_tools
 #vision_tools
--- a/test_ui_flow.py
+++ b/test_ui_flow.py
@@ -0,0 +1,264 @@
+#!/usr/bin/env python3
+"""
+Test script to verify UI flow works correctly.
+
+This tests:
+1. API server is running
+2. WebSocket connection works
+3. Agent can be started via API
+4. Events are broadcast properly
+"""
+
+import requests
+import json
+import time
+import websocket
+import threading
+
+API_URL = "http://localhost:8000"
+WS_URL = "ws://localhost:8000/ws"
+
+def test_api_server():
+    """Test if API server is running."""
+    print("🔍 Testing API server...")
+    try:
+        response = requests.get(f"{API_URL}/", timeout=5)
+        if response.status_code == 200:
+            data = response.json()
+            print(f"✅ API server is running: {data.get('service')}")
+            print(f"   Active connections: {data.get('active_connections')}")
+            return True
+        else:
+            print(f"❌ API server returned: {response.status_code}")
+            return False
+    except Exception as e:
+        print(f"❌ API server not accessible: {e}")
+        return False
+
+def test_tools_endpoint():
+    """Test if tools endpoint works."""
+    print("\n🔍 Testing tools endpoint...")
+    try:
+        response = requests.get(f"{API_URL}/tools", timeout=5)
+        if response.status_code == 200:
+            data = response.json()
+            toolsets = data.get("toolsets", [])
+            print(f"✅ Tools endpoint works - {len(toolsets)} toolsets available")
+            for ts in toolsets[:3]:
+                print(f"   • {ts.get('name')} ({ts.get('tool_count')} tools)")
+            return True
+        else:
+            print(f"❌ Tools endpoint failed: {response.status_code}")
+            return False
+    except Exception as e:
+        print(f"❌ Tools endpoint error: {e}")
+        return False
+
+def test_websocket():
+    """Test WebSocket connection."""
+    print("\n🔍 Testing WebSocket connection...")
+    
+    connected = threading.Event()
+    message_received = threading.Event()
+    messages = []
+    
+    def on_open(ws):
+        print("✅ WebSocket connected")
+        connected.set()
+    
+    def on_message(ws, message):
+        data = json.loads(message)
+        messages.append(data)
+        message_received.set()
+        print(f"📨 Received: {data.get('event_type', 'unknown')}")
+    
+    def on_error(ws, error):
+        print(f"❌ WebSocket error: {error}")
+    
+    def on_close(ws, close_status_code, close_msg):
+        print(f"🔌 WebSocket closed: {close_status_code}")
+    
+    ws = websocket.WebSocketApp(
+        WS_URL,
+        on_open=on_open,
+        on_message=on_message,
+        on_error=on_error,
+        on_close=on_close
+    )
+    
+    # Run WebSocket in background
+    ws_thread = threading.Thread(target=lambda: ws.run_forever(), daemon=True)
+    ws_thread.start()
+    
+    # Wait for connection
+    if connected.wait(timeout=5):
+        print("✅ WebSocket connection established")
+        ws.close()
+        return True
+    else:
+        print("❌ WebSocket connection timeout")
+        ws.close()
+        return False
+
+def test_agent_run():
+    """Test running agent via API."""
+    print("\n🔍 Testing agent run via API (mock mode)...")
+    
+    # Start listening for events first
+    events = []
+    ws_connected = threading.Event()
+    session_complete = threading.Event()
+    
+    def on_message(ws, message):
+        data = json.loads(message)
+        events.append(data)
+        event_type = data.get("event_type")
+        print(f"   📨 Event: {event_type}")
+        
+        if event_type == "complete":
+            session_complete.set()
+    
+    def on_open(ws):
+        ws_connected.set()
+    
+    # Connect WebSocket
+    ws = websocket.WebSocketApp(
+        WS_URL,
+        on_open=on_open,
+        on_message=on_message
+    )
+    
+    ws_thread = threading.Thread(target=lambda: ws.run_forever(), daemon=True)
+    ws_thread.start()
+    
+    # Wait for WebSocket connection
+    if not ws_connected.wait(timeout=5):
+        print("❌ WebSocket didn't connect")
+        ws.close()
+        return False
+    
+    print("✅ WebSocket connected, starting agent...")
+    
+    # Submit agent run
+    payload = {
+        "query": "Test query for UI flow verification",
+        "model": "claude-sonnet-4-5-20250929",
+        "base_url": "https://api.anthropic.com/v1/",
+        "enabled_toolsets": ["web"],
+        "max_turns": 5,
+        "mock_web_tools": True,  # Use mock mode to avoid API costs
+        "mock_delay": 2,  # Fast for testing
+        "verbose": False
+    }
+    
+    try:
+        response = requests.post(f"{API_URL}/agent/run", json=payload, timeout=10)
+        
+        if response.status_code == 200:
+            result = response.json()
+            session_id = result.get("session_id")
+            print(f"✅ Agent started: {(session_id or 'unknown')[:8]}...")
+            
+            # Wait for completion (or timeout)
+            print("⏳ Waiting for agent to complete (up to 30s)...")
+            if session_complete.wait(timeout=30):
+                print(f"✅ Agent completed! Received {len(events)} events:")
+                
+                # Count event types
+                event_counts = {}
+                for evt in events:
+                    evt_type = evt.get("event_type", "unknown")
+                    event_counts[evt_type] = event_counts.get(evt_type, 0) + 1
+                
+                for evt_type, count in event_counts.items():
+                    print(f"   • {evt_type}: {count}")
+                
+                # Check we got expected events
+                expected_events = ["query", "api_call", "response", "complete"]
+                missing = [e for e in expected_events if e not in event_counts]
+                
+                if missing:
+                    print(f"⚠️  Missing expected events: {missing}")
+                else:
+                    print("✅ All expected event types received!")
+                
+                ws.close()
+                return True
+            else:
+                print(f"⚠️  Timeout waiting for completion. Got {len(events)} events so far.")
+                ws.close()
+                return False
+                
+        else:
+            print(f"❌ Agent start failed: {response.status_code}")
+            print(f"   Response: {response.text}")
+            ws.close()
+            return False
+            
+    except Exception as e:
+        print(f"❌ Agent run error: {e}")
+        import traceback
+        traceback.print_exc()
+        ws.close()
+        return False
+
+def main():
+    """Run all tests."""
+    print("=" * 60)
+    print("🧪 Hermes Agent UI Flow Test")
+    print("=" * 60)
+    print("\nThis will test the complete flow:")
+    print("  1. API server connectivity")
+    print("  2. Tools endpoint")
+    print("  3. WebSocket connection")
+    print("  4. Agent execution via API (mock mode)")
+    print("  5. Event streaming to UI")
+    print("\n" + "=" * 60)
+    
+    results = []
+    
+    # Test 1: API server
+    results.append(("API Server", test_api_server()))
+    
+    # Test 2: Tools endpoint
+    results.append(("Tools Endpoint", test_tools_endpoint()))
+    
+    # Test 3: WebSocket
+    results.append(("WebSocket Connection", test_websocket()))
+    
+    # Test 4: Agent run
+    results.append(("Agent Execution + Events", test_agent_run()))
+    
+    # Summary
+    print("\n" + "=" * 60)
+    print("📊 TEST SUMMARY")
+    print("=" * 60)
+    
+    for test_name, passed in results:
+        status = "✅ PASS" if passed else "❌ FAIL"
+        print(f"{status} - {test_name}")
+    
+    all_passed = all(r[1] for r in results)
+    
+    print("\n" + "=" * 60)
+    if all_passed:
+        print("🎉 ALL TESTS PASSED!")
+        print("\n✅ The UI flow is working correctly!")
+        print("   You can now use the UI to:")
+        print("   • Submit queries")
+        print("   • View real-time events")
+        print("   • See tool executions")
+        print("   • Get final responses")
+    else:
+        print("❌ SOME TESTS FAILED")
+        print("\nMake sure:")
+        print("  1. API server is running: python api_endpoint/logging_server.py")
+        print("  2. ANTHROPIC_API_KEY is set in environment")
+        print("  3. All dependencies are installed: pip install -r requirements.txt")
+    print("=" * 60)
+    
+    return 0 if all_passed else 1
+
+if __name__ == "__main__":
+    raise SystemExit(main())
+
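Review note: the per-type tally and the "missing expected events" check in `test_agent_run` can be factored into a small helper that is testable without a running server. The following is only a sketch under assumptions: `summarize_events` and `EXPECTED_EVENTS` are hypothetical names that do not exist in this change, and the event dictionaries are assumed to carry the same `event_type` key the test reads.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Event types the test above expects to see at least once.
EXPECTED_EVENTS = ["query", "api_call", "response", "complete"]


def summarize_events(events: Iterable[Dict]) -> Tuple[Counter, List[str]]:
    """Tally events by type and list the expected types that never appeared."""
    # Events without an "event_type" key are counted as "unknown",
    # matching the default used in the test.
    counts = Counter(evt.get("event_type", "unknown") for evt in events)
    missing = [name for name in EXPECTED_EVENTS if name not in counts]
    return counts, missing


if __name__ == "__main__":
    sample = [
        {"event_type": "query"},
        {"event_type": "api_call"},
        {"event_type": "api_call"},
        {"event_type": "response"},
        {},  # no event_type key
    ]
    counts, missing = summarize_events(sample)
    for name, count in counts.items():
        print(f"   • {name}: {count}")
    print("missing:", missing)
```

Keeping the tally separate from the HTTP and WebSocket plumbing would let the summary logic be unit-tested with canned event lists.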
--- a/toolsets.py
+++ b/toolsets.py
@@ -0,0 +1,326 @@
+#!/usr/bin/env python3
+"""
+Toolsets Module
+
+This module provides a flexible system for defining and managing tool aliases/toolsets.
+Toolsets allow you to group tools together for specific scenarios and can be composed
+from individual tools or other toolsets.
+
+Features:
+- Define custom toolsets with specific tools
+- Compose toolsets from other toolsets
+- Built-in common toolsets for typical use cases
+- Easy extension for new toolsets
+- Support for dynamic toolset resolution
+
+Usage:
+    from toolsets import get_toolset, resolve_toolset, get_all_toolsets
+    
+    # Get the definition of a specific toolset
+    toolset = get_toolset("web")
+    
+    # Resolve a toolset to get all tool names (including from composed toolsets)
+    all_tools = resolve_toolset("debugging")
+"""
+
+from typing import List, Dict, Any, Set, Optional
+import json
+
+
+# Core toolset definitions
+# These can include individual tools or reference other toolsets
+TOOLSETS = {
+    # Basic toolsets - individual tool categories
+    "web": {
+        "description": "Web research and content extraction tools",
+        "tools": ["web_search", "web_extract", "web_crawl"],
+        "includes": []  # No other toolsets included
+    },
+    
+    "vision": {
+        "description": "Image analysis and vision tools",
+        "tools": ["vision_analyze"],
+        "includes": []
+    },
+    
+    "image_gen": {
+        "description": "Creative generation tools (images)",
+        "tools": ["image_generate"],
+        "includes": []
+    },
+    
+    "terminal": {
+        "description": "Terminal/command execution tools",
+        "tools": ["terminal"],
+        "includes": []
+    },
+    
+    "moa": {
+        "description": "Advanced reasoning and problem-solving tools",
+        "tools": ["mixture_of_agents"],
+        "includes": []
+    },
+    
+    # Scenario-specific toolsets
+    
+    "debugging": {
+        "description": "Debugging and troubleshooting toolkit",
+        "tools": ["terminal"],
+        "includes": ["web"]  # For searching error messages and solutions
+    },
+    
+    "safe": {
+        "description": "Safe toolkit without terminal access",
+        "tools": ["mixture_of_agents"],
+        "includes": ["web", "vision", "image_gen"]
+    }
+}
+
+
+
+def get_toolset(name: str) -> Optional[Dict[str, Any]]:
+    """
+    Get a toolset definition by name.
+    
+    Args:
+        name (str): Name of the toolset
+        
+    Returns:
+        Dict: Toolset definition with description, tools, and includes
+        None: If toolset not found
+    """
+    # Return toolset definition
+    return TOOLSETS.get(name)
+
+
+def resolve_toolset(name: str, visited: Optional[Set[str]] = None) -> List[str]:
+    """
+    Recursively resolve a toolset to get all tool names.
+    
+    This function handles toolset composition by recursively resolving
+    included toolsets and combining all tools.
+    
+    Args:
+        name (str): Name of the toolset to resolve
+        visited (Set[str]): Set of already visited toolsets (for cycle detection)
+        
+    Returns:
+        List[str]: List of all tool names in the toolset
+    """
+    if visited is None:
+        visited = set()
+    
+    # Check for cycles
+    if name in visited:
+        print(f"⚠️  Circular dependency detected in toolset '{name}'")
+        return []
+    
+    visited.add(name)
+    
+    # Get toolset definition
+    toolset = TOOLSETS.get(name)
+    if not toolset:
+        return []
+    
+    # Collect direct tools
+    tools = set(toolset.get("tools", []))
+    
+    # Recursively resolve included toolsets
+    for included_name in toolset.get("includes", []):
+        included_tools = resolve_toolset(included_name, visited.copy())
+        tools.update(included_tools)
+    
+    return list(tools)
+
+
+def resolve_multiple_toolsets(toolset_names: List[str]) -> List[str]:
+    """
+    Resolve multiple toolsets and combine their tools.
+    
+    Args:
+        toolset_names (List[str]): List of toolset names to resolve
+        
+    Returns:
+        List[str]: Combined list of all tool names (deduplicated)
+    """
+    all_tools = set()
+    
+    for name in toolset_names:
+        tools = resolve_toolset(name)
+        all_tools.update(tools)
+    
+    return list(all_tools)
+
+
+def get_all_toolsets() -> Dict[str, Dict[str, Any]]:
+    """
+    Get all available toolsets with their definitions.
+    
+    Returns:
+        Dict: All toolset definitions
+    """
+    return TOOLSETS.copy()
+
+
+def get_toolset_names() -> List[str]:
+    """
+    Get names of all available toolsets.
+    
+    Returns:
+        List[str]: List of toolset names
+    """
+    return list(TOOLSETS.keys())
+
+
+
+
+def validate_toolset(name: str) -> bool:
+    """
+    Check if a toolset name is valid.
+    
+    Args:
+        name (str): Toolset name to validate
+        
+    Returns:
+        bool: True if valid, False otherwise
+    """
+    return name in TOOLSETS
+
+
+def create_custom_toolset(
+    name: str,
+    description: str,
+    tools: Optional[List[str]] = None,
+    includes: Optional[List[str]] = None
+) -> None:
+    """
+    Create a custom toolset at runtime.
+    
+    Args:
+        name (str): Name for the new toolset
+        description (str): Description of the toolset
+        tools (List[str]): Direct tools to include
+        includes (List[str]): Other toolsets to include
+    """
+    TOOLSETS[name] = {
+        "description": description,
+        "tools": tools or [],
+        "includes": includes or []
+    }
+
+
+
+
+def get_toolset_info(name: str) -> Optional[Dict[str, Any]]:
+    """
+    Get detailed information about a toolset including resolved tools.
+    
+    Args:
+        name (str): Toolset name
+        
+    Returns:
+        Dict: Detailed toolset information, or None if the toolset is not found
+    """
+    toolset = get_toolset(name)
+    if not toolset:
+        return None
+    
+    resolved_tools = resolve_toolset(name)
+    
+    return {
+        "name": name,
+        "description": toolset["description"],
+        "direct_tools": toolset["tools"],
+        "includes": toolset["includes"],
+        "resolved_tools": resolved_tools,
+        "tool_count": len(resolved_tools),
+        "is_composite": len(toolset["includes"]) > 0
+    }
+
+
+def print_toolset_tree(name: str, indent: int = 0) -> None:
+    """
+    Print a tree view of a toolset and its composition.
+    
+    Args:
+        name (str): Toolset name
+        indent (int): Current indentation level
+    """
+    prefix = "  " * indent
+    toolset = get_toolset(name)
+    
+    if not toolset:
+        print(f"{prefix}❌ Unknown toolset: {name}")
+        return
+    
+    # Print toolset name and description
+    print(f"{prefix}📦 {name}: {toolset['description']}")
+    
+    # Print direct tools
+    if toolset["tools"]:
+        print(f"{prefix}  🔧 Tools: {', '.join(toolset['tools'])}")
+    
+    # Print included toolsets
+    if toolset["includes"]:
+        print(f"{prefix}  📂 Includes:")
+        for included in toolset["includes"]:
+            print_toolset_tree(included, indent + 2)
+
+
+if __name__ == "__main__":
+    """
+    Demo and testing of the toolsets system
+    """
+    print("🎯 Toolsets System Demo")
+    print("=" * 60)
+    
+    # Show all available toolsets
+    print("\n📦 Available Toolsets:")
+    print("-" * 40)
+    for name, toolset in get_all_toolsets().items():
+        info = get_toolset_info(name)
+        composite = "📂" if info["is_composite"] else "🔧"
+        print(f"{composite} {name:20} - {toolset['description']}")
+        print(f"   Tools: {len(info['resolved_tools'])} total")
+    
+    
+    # Demo toolset resolution
+    print("\n🔍 Toolset Resolution Examples:")
+    print("-" * 40)
+    
+    examples = ["web", "vision", "debugging", "safe"]
+    for name in examples:
+        tools = resolve_toolset(name)
+        print(f"\n{name}:")
+        print(f"  Resolved to {len(tools)} tools: {', '.join(sorted(tools))}")
+    
+    # Show toolset composition tree
+    print("\n🌳 Toolset Composition Tree:")
+    print("-" * 40)
+    print("\nExample: 'debugging' toolset:")
+    print_toolset_tree("debugging")
+    
+    print("\nExample: 'safe' toolset:")
+    print_toolset_tree("safe")
+    
+    # Demo multiple toolset resolution
+    print("\n🔗 Multiple Toolset Resolution:")
+    print("-" * 40)
+    combined = resolve_multiple_toolsets(["web", "vision", "moa"])
+    print("Combining ['web', 'vision', 'moa']:")
+    print(f"  Result: {', '.join(sorted(combined))}")
+    
+    # Demo custom toolset creation
+    print("\n➕ Custom Toolset Creation:")
+    print("-" * 40)
+    create_custom_toolset(
+        name="my_custom",
+        description="My custom toolset for specific tasks",
+        tools=["web_search"],
+        includes=["terminal", "vision"]
+    )
+    
+    custom_info = get_toolset_info("my_custom")
+    print(f"Created 'my_custom' toolset:")
+    print(f"  Description: {custom_info['description']}")
+    print(f"  Resolved tools: {', '.join(custom_info['resolved_tools'])}")
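Review note: `resolve_toolset` treats a name as a cycle only if it is already on the current include chain, which is why it passes `visited.copy()` down each branch. A toolset reached through two different parents is therefore still resolved both times. The sketch below illustrates the same idea in isolation. It is not the module's code: `DEMO_TOOLSETS`, `resolve`, and the `loop_a`/`loop_b` entries are hypothetical, and an immutable `frozenset` is swapped in for the copied set.

```python
from typing import Dict, FrozenSet, List

# Hypothetical registry for illustration; it mirrors the "tools"/"includes" layout.
DEMO_TOOLSETS: Dict[str, Dict[str, List[str]]] = {
    "web": {"tools": ["web_search", "web_extract"], "includes": []},
    "terminal": {"tools": ["terminal"], "includes": []},
    "debugging": {"tools": ["terminal"], "includes": ["web"]},
    # Two toolsets that include each other, to exercise cycle handling.
    "loop_a": {"tools": ["a_tool"], "includes": ["loop_b"]},
    "loop_b": {"tools": ["b_tool"], "includes": ["loop_a"]},
}


def resolve(name: str, path: FrozenSet[str] = frozenset()) -> List[str]:
    """Return the sorted, deduplicated tool names reachable from `name`."""
    if name in path:
        # The name is already on the current include chain: stop here.
        return []
    toolset = DEMO_TOOLSETS.get(name)
    if toolset is None:
        # Unknown names contribute no tools.
        return []
    tools = set(toolset["tools"])
    for included in toolset["includes"]:
        # Each branch gets its own extended chain, so siblings do not interfere.
        tools.update(resolve(included, path | {name}))
    return sorted(tools)


if __name__ == "__main__":
    print(resolve("debugging"))
    print(resolve("loop_a"))
```

Returning a sorted list is a choice made for this sketch so that output is deterministic; the module itself returns `list(tools)`, whose order is not guaranteed.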
--- a/ui/__init__.py
+++ b/ui/__init__.py
@@ -0,0 +1,23 @@
+"""
+Hermes Agent UI Package
+
+A modular PySide6 UI for the Hermes AI Agent with real-time event streaming.
+
+Modules:
+- websocket_client: WebSocket communication
+- event_widgets: Event display components
+- main_window: Main application window
+- hermes_ui: Application entry point
+"""
+
+from .websocket_client import WebSocketClient
+from .event_widgets import CollapsibleEventWidget, InteractiveEventDisplayWidget
+from .main_window import HermesMainWindow
+
+__all__ = [
+    'WebSocketClient',
+    'CollapsibleEventWidget',
+    'InteractiveEventDisplayWidget',
+    'HermesMainWindow',
+]
+
--- a/ui/event_widgets.py
+++ b/ui/event_widgets.py
@@ -0,0 +1,334 @@
+"""
+Event display widgets for Hermes Agent UI.
+
+This module provides widgets for displaying and managing real-time agent events
+in a collapsible, filterable interface.
+"""
+
+import json
+from datetime import datetime
+from typing import Dict, Any
+
+from PySide6.QtWidgets import (
+    QWidget, QVBoxLayout, QHBoxLayout, QLabel, QPushButton,
+    QCheckBox, QGroupBox, QFrame, QScrollArea
+)
+from PySide6.QtCore import Qt, QTimer
+from PySide6.QtGui import QFont
+
+
+class CollapsibleEventWidget(QFrame):
+    """
+    A single collapsible event with expand/collapse functionality.
+    """
+
+    def __init__(self, event: Dict[str, Any], parent=None):
+        super().__init__(parent)
+        self.event = event
+        self.is_expanded = False
+        self.event_type = event.get("event_type", "unknown")
+
+        self.setFrameStyle(QFrame.Box | QFrame.Raised)
+        self.setLineWidth(1)
+        self.setup_ui()
+
+    def setup_ui(self):
+        """Initialize UI components."""
+        layout = QVBoxLayout()
+        layout.setContentsMargins(8, 8, 8, 8)
+        layout.setSpacing(4)
+
+        # Header (clickable)
+        self.header_widget = QWidget()
+        header_layout = QHBoxLayout()
+        header_layout.setContentsMargins(0, 0, 0, 0)
+
+        self.expand_indicator = QLabel("▶")
+        self.expand_indicator.setFixedWidth(20)
+        header_layout.addWidget(self.expand_indicator)
+
+        self.summary_label = QLabel()
+        self.summary_label.setFont(QFont("Arial", 10, QFont.Bold))
+        self.update_summary()
+        header_layout.addWidget(self.summary_label, 1)
+
+        # Timestamp
+        timestamp = self.event.get("timestamp", datetime.now().isoformat())
+        time_str = datetime.fromisoformat(timestamp.replace('Z', '+00:00')).strftime("%H:%M:%S")
+        time_label = QLabel(time_str)
+        time_label.setStyleSheet("color: #888;")
+        header_layout.addWidget(time_label)
+
+        self.header_widget.setLayout(header_layout)
+        self.header_widget.mousePressEvent = lambda e: self.toggle_expand()
+        self.header_widget.setCursor(Qt.PointingHandCursor)
+        
+        layout.addWidget(self.header_widget)
+        
+        # Details (collapsible)
+        self.details_widget = QWidget()
+        self.details_layout = QVBoxLayout()
+        self.details_layout.setContentsMargins(25, 5, 5, 5)
+        self.populate_details()
+        self.details_widget.setLayout(self.details_layout)
+        self.details_widget.setVisible(False)
+        
+        layout.addWidget(self.details_widget)
+        
+        self.setLayout(layout)
+        self.apply_colors()
+
+    def apply_colors(self):
+        """Apply color scheme based on event type."""
+        colors = {
+            "query": "#E8F5E9",      # Light green
+            "api_call": "#E3F2FD",   # Light blue
+            "response": "#F3E5F5",   # Light purple
+            "tool_call": "#FFF3E0",  # Light orange
+            "tool_result": "#E8F5E9", # Light green
+            "complete": "#E8F5E9",   # Light green
+            "error": "#FFEBEE",      # Light red
+            "session_start": "#F5F5F5" # Light gray
+        }
+        
+        bg_color = colors.get(self.event_type, "#FAFAFA")
+        self.setStyleSheet(f"""
+            CollapsibleEventWidget {{
+                background-color: {bg_color};
+                border: 1px solid #ddd;
+                border-radius: 4px;
+            }}
+        """)
+
+    def update_summary(self):
+        """Update the summary label with event type."""
+        self.summary_label.setText(f"- {self.event_type.upper()}")
+
+    def populate_details(self):
+        """Populate the details section with event data."""
+        data = self.event.get("data", {})
+
+        # Clear existing details
+        while self.details_layout.count():
+            item = self.details_layout.takeAt(0)
+            if item.widget():
+                item.widget().deleteLater()
+
+        self.add_detail("Raw Data", json.dumps(data, indent=2), multiline=True)
+
+    def add_detail(self, label: str, value: str, multiline: bool = True):
+        """Add a detail row to the details section."""
+        detail_widget = QWidget()
+        detail_layout = QVBoxLayout() if multiline else QHBoxLayout()
+        detail_layout.setContentsMargins(0, 2, 0, 2)
+        
+        label_widget = QLabel(f"<b>{label}:</b>")
+        label_widget.setTextFormat(Qt.RichText)
+        
+        value_widget = QLabel(value)
+        value_widget.setWordWrap(True)
+        value_widget.setTextInteractionFlags(Qt.TextSelectableByMouse)
+        
+        if multiline:
+            font = QFont()
+            font.setStyleHint(QFont.Monospace)
+            font.setPointSize(9)
+            value_widget.setFont(font)
+            value_widget.setStyleSheet("background-color: #f5f5f5; padding: 5px; border-radius: 3px;")
+            detail_layout.addWidget(label_widget)
+            detail_layout.addWidget(value_widget)
+        else:
+            detail_layout.addWidget(label_widget)
+            detail_layout.addWidget(value_widget, 1)
+        
+        detail_widget.setLayout(detail_layout)
+        self.details_layout.addWidget(detail_widget)
+
+    def toggle_expand(self):
+        """Toggle expanded/collapsed state."""
+        self.is_expanded = not self.is_expanded
+        self.details_widget.setVisible(self.is_expanded)
+        self.expand_indicator.setText("▼" if self.is_expanded else "▶")
+
+
+class InteractiveEventDisplayWidget(QWidget):
+    """
+    Interactive widget for displaying real-time agent events.
+    
+    Features:
+    - Collapsible event items
+    - Event type filtering
+    - Expand/collapse all
+    - Auto-scroll to latest events
+    """
+
+    def __init__(self):
+        super().__init__()
+        self.events = []
+        self.event_widgets = []
+        self.current_session = None
+        self.filters = {
+            "query": True,
+            "api_call": True,
+            "response": True,
+            "tool_call": True,
+            "tool_result": True,
+            "complete": True,
+            "error": True,
+            "session_start": True
+        }
+        self.init_ui()
+
+    def init_ui(self):
+        """Initialize the UI components."""
+        layout = QVBoxLayout()
+        layout.setContentsMargins(5, 5, 5, 5)
+        
+        # Header with controls
+        header_layout = QHBoxLayout()
+        
+        title = QLabel("📡 Real-time Event Stream")
+        title.setFont(QFont("Arial", 12, QFont.Bold))
+        header_layout.addWidget(title)
+        
+        header_layout.addStretch()
+        
+        # Expand/Collapse All buttons
+        expand_all_btn = QPushButton("Expand All")
+        expand_all_btn.clicked.connect(self.expand_all)
+        header_layout.addWidget(expand_all_btn)
+        
+        collapse_all_btn = QPushButton("Collapse All")
+        collapse_all_btn.clicked.connect(self.collapse_all)
+        header_layout.addWidget(collapse_all_btn)
+        
+        # Clear button
+        clear_btn = QPushButton("🗑️ Clear")
+        clear_btn.clicked.connect(self.clear_events)
+        header_layout.addWidget(clear_btn)
+        
+        layout.addLayout(header_layout)
+        
+        # Filter controls
+        filter_group = QGroupBox("Event Filters (Show/Hide)")
+        filter_layout = QHBoxLayout()
+        filter_layout.setSpacing(10)
+        
+        self.filter_checkboxes = {}
+        filter_configs = [
+            ("query", "📝 Queries"),
+            ("api_call", "🔄 API Calls"),
+            ("response", "🤖 Responses"),
+            ("tool_call", "🔧 Tool Calls"),
+            ("tool_result", "✅ Results"),
+            ("complete", "🎉 Complete"),
+            ("error", "❌ Errors"),
+        ]
+        
+        for event_type, label in filter_configs:
+            checkbox = QCheckBox(label)
+            checkbox.setChecked(True)
+            checkbox.stateChanged.connect(lambda state, et=event_type: self.toggle_filter(et, state))
+            self.filter_checkboxes[event_type] = checkbox
+            filter_layout.addWidget(checkbox)
+        
+        filter_group.setLayout(filter_layout)
+        layout.addWidget(filter_group)
+        
+        # Scroll area for events
+        scroll_area = QScrollArea()
+        scroll_area.setWidgetResizable(True)
+        scroll_area.setHorizontalScrollBarPolicy(Qt.ScrollBarAlwaysOff)
+        
+        # Container for event widgets
+        self.events_container = QWidget()
+        self.events_layout = QVBoxLayout()
+        self.events_layout.setSpacing(5)
+        self.events_layout.addStretch()  # Push events to top
+        self.events_container.setLayout(self.events_layout)
+        
+        scroll_area.setWidget(self.events_container)
+        layout.addWidget(scroll_area)
+        
+        self.setLayout(layout)
+    
+    def clear_events(self):
+        """Clear all displayed events."""
+        self.events.clear()
+        self.event_widgets.clear()
+        
+        # Remove all widgets
+        while self.events_layout.count() > 1:  # Keep the stretch
+            item = self.events_layout.takeAt(0)
+            if item.widget():
+                item.widget().deleteLater()
+        
+        self.current_session = None
+    
+    def add_event(self, event: Dict[str, Any]):
+        """Add an event to the display."""
+        event_type = event.get("event_type", "unknown")
+        session_id = event.get("session_id", "")
+        
+        # Track session changes - add session start event
+        if self.current_session != session_id:
+            self.current_session = session_id
+            session_event = {
+                "event_type": "session_start",
+                "session_id": session_id,
+                "timestamp": event.get("timestamp", datetime.now().isoformat()),
+                "data": {
+                    "session_id": session_id,
+                    "start_time": event.get("timestamp", datetime.now().isoformat())
+                }
+            }
+            self._add_event_widget(session_event)
+        
+        # Add the actual event
+        self._add_event_widget(event)
+    
+    def _add_event_widget(self, event: Dict[str, Any]):
+        """Internal method to add event widget."""
+        event_widget = CollapsibleEventWidget(event)
+        
+        # Apply filter visibility
+        event_type = event.get("event_type", "unknown")
+        event_widget.setVisible(self.filters.get(event_type, True))
+        
+        # Insert before the stretch
+        self.events_layout.insertWidget(self.events_layout.count() - 1, event_widget)
+        
+        self.events.append(event)
+        self.event_widgets.append(event_widget)
+        
+        # Auto-scroll to bottom after widget is rendered
+        QTimer.singleShot(50, self._scroll_to_bottom)
+    
+    def _scroll_to_bottom(self):
+        """Scroll to the bottom of the events list."""
+        scroll_area = self.events_container.parent().parent()  # parent() is the viewport
+        if isinstance(scroll_area, QScrollArea):
+            scroll_bar = scroll_area.verticalScrollBar()
+            scroll_bar.setValue(scroll_bar.maximum())
+    
+    def expand_all(self):
+        """Expand all event widgets."""
+        for widget in self.event_widgets:
+            if not widget.is_expanded:
+                widget.toggle_expand()
+    
+    def collapse_all(self):
+        """Collapse all event widgets."""
+        for widget in self.event_widgets:
+            if widget.is_expanded:
+                widget.toggle_expand()
+    
+    def toggle_filter(self, event_type: str, state: int):
+        """Toggle visibility of events by type."""
+        self.filters[event_type] = bool(state)
+        
+        # Update visibility of existing widgets
+        for event, widget in zip(self.events, self.event_widgets):
+            if event.get("event_type") == event_type:
+                widget.setVisible(self.filters[event_type])
+
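Review note: `CollapsibleEventWidget.setup_ui` parses the event timestamp inline, so a malformed value would raise while the widget is being built. One possible hardening, sketched here under assumptions, is a small pure function that falls back to a placeholder. `format_event_time` and the `"--:--:--"` placeholder are hypothetical and not part of this change.

```python
from datetime import datetime


def format_event_time(timestamp: str) -> str:
    """Return HH:MM:SS for an ISO 8601 timestamp, or a placeholder if unparseable."""
    try:
        # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
        # so rewrite it as an explicit UTC offset first.
        parsed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    except (AttributeError, TypeError, ValueError):
        # Non-string input or a malformed string: show a neutral placeholder.
        return "--:--:--"
    # The time is shown in the timestamp's own offset, not converted to local time.
    return parsed.strftime("%H:%M:%S")


if __name__ == "__main__":
    print(format_event_time("2025-10-13T11:53:13Z"))
    print(format_event_time("not a timestamp"))
```

Because the helper has no Qt dependency, it can be tested without creating a `QApplication`.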
--- a/ui/hermes_ui.py
+++ b/ui/hermes_ui.py
@@ -0,0 +1,102 @@
+#!/usr/bin/env python3
+"""
+Hermes Agent - PySide6 Frontend
+
+A modern desktop UI for the Hermes AI Agent with real-time event streaming.
+
+Features:
+- Query input with multi-line support
+- Tool/toolset selection
+- Model and API configuration
+- Real-time event display via WebSocket
+- Responsive UI with an optional dark theme (disabled by default)
+- Session history
+- Safe exit handling (no segfaults)
+
+Usage (run from the repository root so the `ui` package is importable):
+    python -m ui.hermes_ui
+"""
+
+import sys
+import signal
+import os
+
+# Suppress Qt logging warnings BEFORE importing Qt
+os.environ['QT_LOGGING_RULES'] = 'qt.qpa.*=false'
+
+from PySide6.QtWidgets import QApplication
+from PySide6.QtCore import QTimer
+
+from .main_window import HermesMainWindow
+
+
+def setup_signal_handlers(app: QApplication) -> QTimer:
+    """
+    Setup signal handlers for graceful shutdown on Ctrl+C.
+    
+    This prevents segmentation faults by:
+    1. Catching SIGINT/SIGTERM signals
+    2. Creating a timer that keeps Python responsive to signals
+    3. Calling app.quit() for proper Qt cleanup
+    
+    Args:
+        app: The QApplication instance
+        
+    Returns:
+        Timer that keeps Python interpreter responsive to signals
+    """
+    def signal_handler(signum, frame):
+        """Handle interrupt signals gracefully."""
+        print("\n🛑 Interrupt received, shutting down gracefully...")
+        app.quit()
+    
+    signal.signal(signal.SIGINT, signal_handler)   # Ctrl+C
+    signal.signal(signal.SIGTERM, signal_handler)  # Termination signal
+    
+    # CRITICAL: Create a timer to wake up Python interpreter periodically
+    # This allows Python to process signals while Qt's event loop is running
+    # Without this, Ctrl+C will not work and may cause segfaults
+    timer = QTimer()
+    timer.timeout.connect(lambda: None)  # Empty callback just to wake up Python
+    timer.start(100)  # Check every 100ms
+    
+    return timer
+
+
+def main():
+    """Main entry point for the application."""
+    # Create application
+    app = QApplication(sys.argv)
+    
+    # Set application metadata
+    app.setApplicationName("Hermes Agent")
+    app.setOrganizationName("Hermes")
+    app.setApplicationVersion("1.0.0")
+    
+    # Setup signal handlers for safe Ctrl+C handling (prevents segfaults!)
+    timer = setup_signal_handlers(app)
+    
+    # Apply dark theme (optional)
+    # Uncomment to enable dark mode
+    # app.setStyle("Fusion")
+    # palette = QPalette()
+    # palette.setColor(QPalette.Window, QColor(53, 53, 53))
+    # palette.setColor(QPalette.WindowText, Qt.white)
+    # app.setPalette(palette)
+    
+    # Create and show main window
+    window = HermesMainWindow()
+    window.show()
+    
+    print("✨ Hermes Agent UI started")
+    print("   Press Ctrl+C to exit gracefully")
+    
+    # Start event loop
+    exit_code = app.exec()
+    
+    print("👋 Hermes Agent UI closed")
+    sys.exit(exit_code)
+
+
+if __name__ == "__main__":
+    main()
--- a/ui/main_window.py
+++ b/ui/main_window.py
@@ -0,0 +1,375 @@
+"""
+Main window for Hermes Agent UI.
+
+This module provides the main application window with controls for
+submitting queries, configuring settings, and viewing real-time events.
+"""
+
+import requests
+from typing import Dict, Any
+
+from PySide6.QtWidgets import (
+    QMainWindow, QWidget, QVBoxLayout, QHBoxLayout, QTextEdit,
+    QPushButton, QLabel, QLineEdit, QComboBox, QCheckBox,
+    QGroupBox, QSplitter, QListWidget, QListWidgetItem,
+    QSpinBox, QMessageBox
+)
+from PySide6.QtCore import Qt, Slot, QTimer
+from PySide6.QtGui import QFont
+
+from .websocket_client import WebSocketClient
+from .event_widgets import InteractiveEventDisplayWidget
+
+
+class HermesMainWindow(QMainWindow):
+    """
+    Main window for Hermes Agent UI.
+    
+    Provides interface for:
+    - Submitting queries
+    - Configuring agent settings
+    - Viewing real-time events
+    - Managing sessions
+    """
+    
+    def __init__(self):
+        super().__init__()
+        self.api_base_url = "http://localhost:8000"
+        self.ws_client = None
+        self.current_session_id = None
+        self.available_toolsets = []
+        self.is_closing = False  # Flag to prevent reconnection during shutdown
+        
+        self.init_ui()
+        self.setup_websocket()
+        self.load_available_tools()
+    
+    def init_ui(self):
+        """Initialize the user interface."""
+        self.setWindowTitle("Hermes Agent - AI Assistant UI")
+        self.setGeometry(100, 100, 1400, 900)
+        
+        # Central widget
+        central_widget = QWidget()
+        self.setCentralWidget(central_widget)
+        
+        # Main layout (horizontal split)
+        main_layout = QHBoxLayout()
+        
+        # Left panel: Controls
+        left_panel = self.create_control_panel()
+        
+        # Right panel: Event display
+        right_panel = self.create_event_panel()
+        
+        # Splitter for resizable panels
+        splitter = QSplitter(Qt.Horizontal)
+        splitter.addWidget(left_panel)
+        splitter.addWidget(right_panel)
+        splitter.setStretchFactor(0, 1)  # Control panel
+        splitter.setStretchFactor(1, 2)  # Event panel (larger)
+        
+        main_layout.addWidget(splitter)
+        central_widget.setLayout(main_layout)
+        
+        # Status bar
+        self.statusBar().showMessage("Ready")
+    
+    def create_control_panel(self) -> QWidget:
+        """Create the left control panel."""
+        panel = QWidget()
+        layout = QVBoxLayout()
+        
+        # Title
+        title = QLabel("🤖 Hermes Agent Control")
+        title.setFont(QFont("Arial", 14, QFont.Bold))
+        title.setAlignment(Qt.AlignCenter)
+        layout.addWidget(title)
+        
+        # Query input group
+        query_group = QGroupBox("Query Input")
+        query_layout = QVBoxLayout()
+        
+        self.query_input = QTextEdit()
+        self.query_input.setPlaceholderText("Enter your query here...")
+        self.query_input.setMaximumHeight(150)
+        query_layout.addWidget(self.query_input)
+        
+        self.submit_btn = QPushButton("🚀 Submit Query")
+        self.submit_btn.setFont(QFont("Arial", 11, QFont.Bold))
+        self.submit_btn.setStyleSheet("QPushButton { background-color: #4CAF50; color: white; padding: 10px; }")
+        self.submit_btn.clicked.connect(self.submit_query)
+        query_layout.addWidget(self.submit_btn)
+        
+        query_group.setLayout(query_layout)
+        layout.addWidget(query_group)
+        
+        # Model configuration group
+        model_group = QGroupBox("Model Configuration")
+        model_layout = QVBoxLayout()
+        
+        # Model selection
+        model_layout.addWidget(QLabel("Model:"))
+        self.model_combo = QComboBox()
+        self.model_combo.addItems([
+            "claude-sonnet-4-5-20250929",
+            "claude-opus-4-20250514",
+            "gpt-4",
+            "gpt-4-turbo"
+        ])
+        model_layout.addWidget(self.model_combo)
+        
+        # API Base URL
+        model_layout.addWidget(QLabel("API Base URL:"))
+        self.base_url_input = QLineEdit("https://api.anthropic.com/v1/")
+        model_layout.addWidget(self.base_url_input)
+        
+        # Max turns
+        model_layout.addWidget(QLabel("Max Turns:"))
+        self.max_turns_spin = QSpinBox()
+        self.max_turns_spin.setMinimum(1)
+        self.max_turns_spin.setMaximum(50)
+        self.max_turns_spin.setValue(10)
+        model_layout.addWidget(self.max_turns_spin)
+        
+        model_group.setLayout(model_layout)
+        layout.addWidget(model_group)
+        
+        # Tools configuration group
+        tools_group = QGroupBox("Tools & Toolsets")
+        tools_layout = QVBoxLayout()
+        
+        tools_layout.addWidget(QLabel("Select Toolsets:"))
+        self.toolsets_list = QListWidget()
+        self.toolsets_list.setSelectionMode(QListWidget.MultiSelection)
+        self.toolsets_list.setMaximumHeight(150)
+        tools_layout.addWidget(self.toolsets_list)
+        
+        tools_group.setLayout(tools_layout)
+        layout.addWidget(tools_group)
+        
+        # Options group
+        options_group = QGroupBox("Options")
+        options_layout = QVBoxLayout()
+        
+        self.mock_mode_checkbox = QCheckBox("Mock Web Tools (Testing)")
+        options_layout.addWidget(self.mock_mode_checkbox)
+        
+        self.verbose_checkbox = QCheckBox("Verbose Logging")
+        options_layout.addWidget(self.verbose_checkbox)
+        
+        options_layout.addWidget(QLabel("Mock Delay (seconds):"))
+        self.mock_delay_spin = QSpinBox()
+        self.mock_delay_spin.setMinimum(1)
+        self.mock_delay_spin.setMaximum(300)
+        self.mock_delay_spin.setValue(60)
+        options_layout.addWidget(self.mock_delay_spin)
+        
+        options_group.setLayout(options_layout)
+        layout.addWidget(options_group)
+        
+        # Connection status
+        self.connection_status = QLabel("🔴 Disconnected")
+        self.connection_status.setAlignment(Qt.AlignCenter)
+        self.connection_status.setStyleSheet("QLabel { padding: 5px; background-color: #F44336; color: white; border-radius: 3px; }")
+        layout.addWidget(self.connection_status)
+        
+        # Add stretch to push everything to top
+        layout.addStretch()
+        
+        panel.setLayout(layout)
+        return panel
+    
+    def create_event_panel(self) -> QWidget:
+        """Create the right event display panel."""
+        panel = QWidget()
+        layout = QVBoxLayout()
+        
+        # Event display widget
+        self.event_widget = InteractiveEventDisplayWidget()
+        layout.addWidget(self.event_widget)
+        
+        panel.setLayout(layout)
+        return panel
+    
+    def setup_websocket(self):
+        """Setup WebSocket connection for real-time events."""
+        self.ws_client = WebSocketClient("ws://localhost:8000/ws")
+        
+        # Connect signals
+        self.ws_client.connected.connect(self.on_ws_connected)
+        self.ws_client.disconnected.connect(self.on_ws_disconnected)
+        self.ws_client.error.connect(self.on_ws_error)
+        self.ws_client.event_received.connect(self.on_event_received)
+        
+        # Start connection
+        self.ws_client.connect()
+    
+    @Slot()
+    def on_ws_connected(self):
+        """Called when WebSocket connection is established."""
+        self.connection_status.setText("🟢 Connected")
+        self.connection_status.setStyleSheet("QLabel { padding: 5px; background-color: #4CAF50; color: white; border-radius: 3px; }")
+        self.statusBar().showMessage("WebSocket connected")
+    
+    @Slot()
+    def on_ws_disconnected(self):
+        """Called when WebSocket connection is lost."""
+        # Don't attempt reconnection if we're closing the application
+        if self.is_closing:
+            return
+        
+        self.connection_status.setText("🔴 Disconnected")
+        self.connection_status.setStyleSheet("QLabel { padding: 5px; background-color: #F44336; color: white; border-radius: 3px; }")
+        self.statusBar().showMessage("WebSocket disconnected - attempting reconnect...")
+        
+        # Attempt reconnect after 5 seconds
+        QTimer.singleShot(5000, self.ws_client.connect)
+    
+    @Slot(str)
+    def on_ws_error(self, error: str):
+        """Called when WebSocket error occurs."""
+        self.statusBar().showMessage(f"WebSocket error: {error}")
+    
+    @Slot(dict)
+    def on_event_received(self, event: Dict[str, Any]):
+        """
+        Called when an event is received from WebSocket.
+        
+        Args:
+            event: Event data from server
+        """
+        self.event_widget.add_event(event)
+        
+        # Update status for specific events
+        event_type = event.get("event_type")
+        if event_type == "query":
+            self.statusBar().showMessage("Query received - agent processing...")
+        elif event_type == "complete":
+            self.statusBar().showMessage("Agent completed!")
+            self.submit_btn.setEnabled(True)
+    
+    def load_available_tools(self):
+        """Load available toolsets from the API."""
+        try:
+            response = requests.get(f"{self.api_base_url}/tools", timeout=5)
+            if response.status_code == 200:
+                data = response.json()
+                toolsets = data.get("toolsets", [])
+                
+                self.available_toolsets = toolsets
+                self.toolsets_list.clear()
+                
+                for toolset in toolsets:
+                    name = toolset.get("name", "")
+                    description = toolset.get("description", "")
+                    tool_count = toolset.get("tool_count", 0)
+                    
+                    item_text = f"{name} ({tool_count} tools) - {description}"
+                    item = QListWidgetItem(item_text)
+                    item.setData(Qt.UserRole, name)  # Store toolset name
+                    self.toolsets_list.addItem(item)
+                
+                self.statusBar().showMessage(f"Loaded {len(toolsets)} toolsets")
+            else:
+                self.statusBar().showMessage("Failed to load toolsets from API")
+                
+        except requests.exceptions.RequestException as e:
+            self.statusBar().showMessage(f"Error loading toolsets: {str(e)}")
+            # Fall back to a default set of toolsets when the API is unreachable
+            self.toolsets_list.clear()
+            default_toolsets = ["web", "vision", "terminal", "research"]
+            for ts in default_toolsets:
+                item = QListWidgetItem(f"{ts} (default)")
+                item.setData(Qt.UserRole, ts)
+                self.toolsets_list.addItem(item)
+    
+    @Slot()
+    def submit_query(self):
+        """Submit query to the agent API."""
+        query = self.query_input.toPlainText().strip()
+        
+        if not query:
+            QMessageBox.warning(self, "No Query", "Please enter a query first!")
+            return
+        
+        # Get selected toolsets
+        selected_toolsets = []
+        for i in range(self.toolsets_list.count()):
+            item = self.toolsets_list.item(i)
+            if item.isSelected():
+                toolset_name = item.data(Qt.UserRole)
+                selected_toolsets.append(toolset_name)
+        
+        # Build request payload
+        payload = {
+            "query": query,
+            "model": self.model_combo.currentText(),
+            "base_url": self.base_url_input.text(),
+            "max_turns": self.max_turns_spin.value(),
+            "enabled_toolsets": selected_toolsets if selected_toolsets else None,
+            "mock_web_tools": self.mock_mode_checkbox.isChecked(),
+            "mock_delay": self.mock_delay_spin.value(),
+            "verbose": self.verbose_checkbox.isChecked()
+        }
+        
+        # Disable submit button during execution
+        self.submit_btn.setEnabled(False)
+        self.submit_btn.setText("⏳ Running...")
+        self.statusBar().showMessage("Submitting query to agent...")
+        
+        # Submit to API
+        try:
+            response = requests.post(
+                f"{self.api_base_url}/agent/run",
+                json=payload,
+                timeout=10
+            )
+            
+            if response.status_code == 200:
+                result = response.json()
+                session_id = result.get("session_id", "")
+                self.current_session_id = session_id
+                
+                self.statusBar().showMessage(f"Agent started! Session: {session_id[:8]}...")
+                
+                # Clear event display for new session (optional)
+                # self.event_widget.clear_events()
+                
+            else:
+                QMessageBox.warning(
+                    self,
+                    "API Error",
+                    f"Failed to start agent: {response.status_code}\n{response.text}"
+                )
+                self.submit_btn.setEnabled(True)
+                self.submit_btn.setText("🚀 Submit Query")
+                
+        except requests.exceptions.RequestException as e:
+            QMessageBox.critical(
+                self,
+                "Connection Error",
+                f"Failed to connect to API server:\n{str(e)}\n\nMake sure the server is running:\npython api_endpoint/logging_server.py"
+            )
+            self.submit_btn.setEnabled(True)
+            self.submit_btn.setText("🚀 Submit Query")
+        
+        # Reset the button label after a short delay (UI feedback); the button
+        # itself is re-enabled when the "complete" event arrives
+        QTimer.singleShot(2000, lambda: self.submit_btn.setText("🚀 Submit Query"))
+    
+    def cleanup(self):
+        """Clean up resources before exit."""
+        print("Cleaning up resources...")
+        self.is_closing = True
+        
+        if self.ws_client:
+            try:
+                self.ws_client.disconnect()
+            except Exception as e:
+                print(f"Error disconnecting WebSocket: {e}")
+    
+    def closeEvent(self, event):
+        """Handle window close event - ensures clean shutdown."""
+        print("Closing application...")
+        self.cleanup()
+        event.accept()
+
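Review note on the reconnect path in `on_ws_disconnected`: the handler retries on a fixed 5-second timer. A minimal sketch of a capped, doubling delay schedule is shown below in case a gentler retry pattern is wanted. The helper name `backoff_delays_ms` and its parameters are hypothetical and not part of this change.

```python
import random
from typing import Iterator


def backoff_delays_ms(base_ms: int = 1000, cap_ms: int = 30000,
                      jitter: float = 0.0) -> Iterator[int]:
    """Yield reconnect delays in milliseconds that double per attempt, capped at cap_ms."""
    delay = base_ms
    while True:
        # Optional jitter spreads out clients that lost the server at the same moment.
        spread = int(delay * jitter * random.random())
        yield min(cap_ms, delay + spread)
        delay = min(cap_ms, delay * 2)


if __name__ == "__main__":
    delays = backoff_delays_ms(base_ms=1000, cap_ms=8000)
    print([next(delays) for _ in range(5)])
```

One way this could be wired in, under the same assumptions: keep a generator on the window, pass `next(generator)` to `QTimer.singleShot` in place of the literal `5000`, and create a fresh generator in `on_ws_connected` so the delay resets after a successful connection.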
--- a/ui/start_hermes_ui.sh
+++ b/ui/start_hermes_ui.sh
@@ -0,0 +1,115 @@
+#!/bin/bash
+# Hermes Agent UI Launcher
+# 
+# This script starts both the API server and UI application.
+# It will run them in the background and provide a clean shutdown.
+
+set -e
+
+# Colors for output
+GREEN='\033[0;32m'
+BLUE='\033[0;34m'
+RED='\033[0;31m'
+NC='\033[0m' # No Color
+
+echo -e "${BLUE}🚀 Hermes Agent UI Launcher${NC}"
+echo "================================"
+echo ""
+
+# Check if Python is available
+if ! command -v python3 &> /dev/null; then
+    echo -e "${RED}❌ Python 3 not found. Please install Python 3.${NC}"
+    exit 1
+fi
+
+# Check if virtual environment exists
+if [ -d "../../env" ]; then
+    echo -e "${GREEN}✓ Activating virtual environment${NC}"
+    source ../../env/bin/activate
+else
+    echo -e "${BLUE}ℹ No virtual environment found, using system Python${NC}"
+fi
+
+# Check dependencies
+echo -e "${BLUE}Checking dependencies...${NC}"
+python3 -c "import PySide6" 2>/dev/null || {
+    echo -e "${RED}❌ PySide6 not installed${NC}"
+    echo -e "${BLUE}Installing dependencies...${NC}"
+    pip install -r ../requirements.txt
+}
+
+# Check for API keys
+if [ -z "$ANTHROPIC_API_KEY" ]; then
+    echo -e "${RED}⚠️  Warning: ANTHROPIC_API_KEY not set${NC}"
+    echo "   Set it with: export ANTHROPIC_API_KEY='your-key'"
+    echo ""
+fi
+
+# Function to cleanup on exit
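+cleanup() {
+    # Preserve the status that triggered cleanup so failures are not masked
+    local exit_code=$?
+    # Clear the traps so cleanup does not run a second time via EXIT
+    trap - SIGINT SIGTERM EXIT
+    echo ""
+    echo -e "${BLUE}🛑 Shutting down Hermes Agent...${NC}"
+    if [ -n "$SERVER_PID" ]; then
+        kill "$SERVER_PID" 2>/dev/null || true
+        echo -e "${GREEN}✓ API Server stopped${NC}"
+    fi
+    if [ -n "$UI_PID" ]; then
+        kill "$UI_PID" 2>/dev/null || true
+        echo -e "${GREEN}✓ UI Application stopped${NC}"
+    fi
+    echo -e "${GREEN}✓ Cleanup complete${NC}"
+    exit "$exit_code"
+}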
+
+# Set up trap for cleanup
+trap cleanup SIGINT SIGTERM EXIT
+
+# Start API server in background
+echo -e "${BLUE}Starting API Server...${NC}"
+cd ../api_endpoint
+python3 logging_server.py > /tmp/hermes_server.log 2>&1 &
+SERVER_PID=$!
+cd ../ui
+
+# Wait for server to start
+echo -e "${BLUE}Waiting for server to start...${NC}"
+sleep 3
+
+# Check if server is running
+if ! kill -0 $SERVER_PID 2>/dev/null; then
+    echo -e "${RED}❌ Server failed to start. Check /tmp/hermes_server.log${NC}"
+    tail -20 /tmp/hermes_server.log
+    exit 1
+fi
+
+# Check if server is responding
+if curl -s http://localhost:8000/ > /dev/null; then
+    echo -e "${GREEN}✓ API Server running on http://localhost:8000${NC}"
+else
+    echo -e "${RED}❌ Server not responding. Check /tmp/hermes_server.log${NC}"
+    exit 1
+fi
+
+# Start UI application
+echo -e "${BLUE}Starting UI Application...${NC}"
+python3 hermes_ui.py &
+UI_PID=$!
+
+echo ""
+echo -e "${GREEN}================================${NC}"
+echo -e "${GREEN}✓ Hermes Agent UI is running!${NC}"
+echo -e "${GREEN}================================${NC}"
+echo ""
+echo -e "${BLUE}📊 Component Status:${NC}"
+echo -e "   API Server:  http://localhost:8000 (PID: $SERVER_PID)"
+echo -e "   UI App:      Running (PID: $UI_PID)"
+echo -e "   Server Log:  /tmp/hermes_server.log"
+echo ""
+echo -e "${BLUE}Press Ctrl+C to stop all services${NC}"
+echo ""
+
+# Wait for UI to exit
+wait $UI_PID
+
+# Cleanup will be triggered by trap
+
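Review note on the launcher's startup check: the script sleeps for a fixed 3 seconds and then probes `http://localhost:8000/` once. A minimal sketch of polling until the server answers, with a deadline, is given below. It is written in Python to match the rest of the change; `http_ok` and `wait_until_ready` are hypothetical helpers, and the timeout and interval values are illustrative assumptions.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 1.0) -> bool:
    """Return True if a GET to url answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status all count as "not ready".
        return False


def wait_until_ready(probe: Callable[[], bool], timeout: float = 15.0,
                     interval: float = 0.5,
                     sleep: Callable[[float], None] = time.sleep,
                     clock: Callable[[], float] = time.monotonic) -> bool:
    """Call probe() repeatedly until it returns True or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    ready = wait_until_ready(lambda: http_ok("http://localhost:8000/"), timeout=5.0)
    print("server ready" if ready else "server not responding")
```

The probe is passed in as a callable so the waiting logic can be exercised without a running server.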
--- a/ui/test_ui_flow.py
+++ b/ui/test_ui_flow.py
@@ -0,0 +1,264 @@
+#!/usr/bin/env python3
+"""
+Test script to verify UI flow works correctly.
+
+This tests:
+1. API server is running
+2. WebSocket connection works
+3. Agent can be started via API
+4. Events are broadcast properly
+"""
+
+import requests
+import json
+import time
+import websocket
+import threading
+
+API_URL = "http://localhost:8000"
+WS_URL = "ws://localhost:8000/ws"
+
+def test_api_server():
+    """Test if API server is running."""
+    print("🔍 Testing API server...")
+    try:
+        response = requests.get(f"{API_URL}/", timeout=5)
+        if response.status_code == 200:
+            data = response.json()
+            print(f"✅ API server is running: {data.get('service')}")
+            print(f"   Active connections: {data.get('active_connections')}")
+            return True
+        else:
+            print(f"❌ API server returned: {response.status_code}")
+            return False
+    except Exception as e:
+        print(f"❌ API server not accessible: {e}")
+        return False
+
+def test_tools_endpoint():
+    """Test if tools endpoint works."""
+    print("\n🔍 Testing tools endpoint...")
+    try:
+        response = requests.get(f"{API_URL}/tools", timeout=5)
+        if response.status_code == 200:
+            data = response.json()
+            toolsets = data.get("toolsets", [])
+            print(f"✅ Tools endpoint works - {len(toolsets)} toolsets available")
+            for ts in toolsets[:3]:
+                print(f"   • {ts.get('name')} ({ts.get('tool_count')} tools)")
+            return True
+        else:
+            print(f"❌ Tools endpoint failed: {response.status_code}")
+            return False
+    except Exception as e:
+        print(f"❌ Tools endpoint error: {e}")
+        return False
+
+def test_websocket():
+    """Test WebSocket connection."""
+    print("\n🔍 Testing WebSocket connection...")
+    
+    connected = threading.Event()
+    message_received = threading.Event()
+    messages = []
+    
+    def on_open(ws):
+        print("✅ WebSocket connected")
+        connected.set()
+    
+    def on_message(ws, message):
+        data = json.loads(message)
+        messages.append(data)
+        message_received.set()
+        print(f"📨 Received: {data.get('event_type', 'unknown')}")
+    
+    def on_error(ws, error):
+        print(f"❌ WebSocket error: {error}")
+    
+    def on_close(ws, close_status_code, close_msg):
+        print(f"🔌 WebSocket closed: {close_status_code}")
+    
+    ws = websocket.WebSocketApp(
+        WS_URL,
+        on_open=on_open,
+        on_message=on_message,
+        on_error=on_error,
+        on_close=on_close
+    )
+    
+    # Run WebSocket in background
+    ws_thread = threading.Thread(target=lambda: ws.run_forever(), daemon=True)
+    ws_thread.start()
+    
+    # Wait for connection
+    if connected.wait(timeout=5):
+        print("✅ WebSocket connection established")
+        ws.close()
+        return True
+    else:
+        print("❌ WebSocket connection timeout")
+        ws.close()
+        return False
+
+def test_agent_run():
+    """Test running agent via API."""
+    print("\n🔍 Testing agent run via API (mock mode)...")
+    
+    # Start listening for events first
+    events = []
+    ws_connected = threading.Event()
+    session_complete = threading.Event()
+    
+    def on_message(ws, message):
+        data = json.loads(message)
+        events.append(data)
+        event_type = data.get("event_type")
+        print(f"   📨 Event: {event_type}")
+        
+        if event_type == "complete":
+            session_complete.set()
+    
+    def on_open(ws):
+        ws_connected.set()
+    
+    # Connect WebSocket
+    ws = websocket.WebSocketApp(
+        WS_URL,
+        on_open=on_open,
+        on_message=on_message
+    )
+    
+    ws_thread = threading.Thread(target=lambda: ws.run_forever(), daemon=True)
+    ws_thread.start()
+    
+    # Wait for WebSocket connection
+    if not ws_connected.wait(timeout=5):
+        print("❌ WebSocket didn't connect")
+        ws.close()
+        return False
+    
+    print("✅ WebSocket connected, starting agent...")
+    
+    # Submit agent run
+    payload = {
+        "query": "Test query for UI flow verification",
+        "model": "claude-sonnet-4-5-20250929",
+        "base_url": "https://api.anthropic.com/v1/",
+        "enabled_toolsets": ["web"],
+        "max_turns": 5,
+        "mock_web_tools": True,  # Use mock mode to avoid API costs
+        "mock_delay": 2,  # Fast for testing
+        "verbose": False
+    }
+    
+    try:
+        response = requests.post(f"{API_URL}/agent/run", json=payload, timeout=10)
+        
+        if response.status_code == 200:
+            result = response.json()
+            session_id = result.get("session_id", "")
+            print(f"✅ Agent started: {session_id[:8]}...")
+            
+            # Wait for completion (or timeout)
+            print("⏳ Waiting for agent to complete (up to 30s)...")
+            if session_complete.wait(timeout=30):
+                print(f"✅ Agent completed! Received {len(events)} events:")
+                
+                # Count event types
+                event_counts = {}
+                for evt in events:
+                    evt_type = evt.get("event_type", "unknown")
+                    event_counts[evt_type] = event_counts.get(evt_type, 0) + 1
+                
+                for evt_type, count in event_counts.items():
+                    print(f"   • {evt_type}: {count}")
+                
+                # Check we got expected events
+                expected_events = ["query", "api_call", "response", "complete"]
+                missing = [e for e in expected_events if e not in event_counts]
+                
+                if missing:
+                    print(f"⚠️  Missing expected events: {missing}")
+                else:
+                    print("✅ All expected event types received!")
+                
+                ws.close()
+                return True
+            else:
+                print(f"⚠️  Timeout waiting for completion. Got {len(events)} events so far.")
+                ws.close()
+                return False
+                
+        else:
+            print(f"❌ Agent start failed: {response.status_code}")
+            print(f"   Response: {response.text}")
+            ws.close()
+            return False
+            
+    except Exception as e:
+        print(f"❌ Agent run error: {e}")
+        import traceback
+        traceback.print_exc()
+        ws.close()
+        return False
+
+def main():
+    """Run all tests."""
+    print("=" * 60)
+    print("🧪 Hermes Agent UI Flow Test")
+    print("=" * 60)
+    print("\nThis will test the complete flow:")
+    print("  1. API server connectivity")
+    print("  2. Tools endpoint")
+    print("  3. WebSocket connection")
+    print("  4. Agent execution via API (mock mode)")
+    print("  5. Event streaming to UI")
+    print("\n" + "=" * 60)
+    
+    results = []
+    
+    # Test 1: API server
+    results.append(("API Server", test_api_server()))
+    
+    # Test 2: Tools endpoint
+    results.append(("Tools Endpoint", test_tools_endpoint()))
+    
+    # Test 3: WebSocket
+    results.append(("WebSocket Connection", test_websocket()))
+    
+    # Test 4: Agent run
+    results.append(("Agent Execution + Events", test_agent_run()))
+    
+    # Summary
+    print("\n" + "=" * 60)
+    print("📊 TEST SUMMARY")
+    print("=" * 60)
+    
+    for test_name, passed in results:
+        status = "✅ PASS" if passed else "❌ FAIL"
+        print(f"{status} - {test_name}")
+    
+    all_passed = all(r[1] for r in results)
+    
+    print("\n" + "=" * 60)
+    if all_passed:
+        print("🎉 ALL TESTS PASSED!")
+        print("\n✅ The UI flow is working correctly!")
+        print("   You can now use the UI to:")
+        print("   • Submit queries")
+        print("   • View real-time events")
+        print("   • See tool executions")
+        print("   • Get final responses")
+    else:
+        print("❌ SOME TESTS FAILED")
+        print("\nMake sure:")
+        print("  1. API server is running: python api_endpoint/logging_server.py")
+        print("  2. ANTHROPIC_API_KEY is set in environment")
+        print("  3. All dependencies are installed: pip install -r requirements.txt")
+    print("=" * 60)
+    
+    return 0 if all_passed else 1
+
+if __name__ == "__main__":
+    raise SystemExit(main())
+
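Review note on the event check in `test_agent_run`: counting event types and finding the missing ones could be pulled into a small pure function, which makes that logic checkable without a server or WebSocket. The sketch below uses `collections.Counter`; the name `summarize_events` is hypothetical, and the expected list mirrors the one in the test.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List, Tuple


def summarize_events(events: Iterable[Dict[str, Any]],
                     expected: Iterable[str]) -> Tuple[Counter, List[str]]:
    """Count events per event_type and list expected types that never appeared."""
    counts = Counter(evt.get("event_type", "unknown") for evt in events)
    missing = [name for name in expected if name not in counts]
    return counts, missing


if __name__ == "__main__":
    sample = [
        {"event_type": "query"},
        {"event_type": "api_call"},
        {"event_type": "api_call"},
        {"event_type": "complete"},
    ]
    counts, missing = summarize_events(
        sample, ["query", "api_call", "response", "complete"]
    )
    for name, count in counts.items():
        print(f"{name}: {count}")
    print("missing:", missing)
```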
--- a/ui/websocket_client.py
+++ b/ui/websocket_client.py
@@ -0,0 +1,91 @@
+"""
+WebSocket client for real-time event streaming from Hermes Agent.
+
+This module provides a WebSocket client that runs in a separate thread
+and emits Qt signals when events are received from the server.
+"""
+
+import json
+import threading
+import websocket
+from PySide6.QtCore import QObject, Signal
+
+
+class WebSocketClient(QObject):
+    """
+    WebSocket client for receiving real-time agent events.
+    
+    Runs in a separate thread and emits Qt signals when events arrive.
+    """
+    
+    # Signals for event communication
+    event_received = Signal(dict)  # Emits parsed event data
+    connected = Signal()
+    disconnected = Signal()
+    error = Signal(str)
+    
+    def __init__(self, url: str = "ws://localhost:8000/ws"):
+        super().__init__()
+        self.url = url
+        self.ws = None
+        self.running = False
+        self.thread = None
+    
+    def connect(self):
+        """Start WebSocket connection in background thread."""
+        if self.running:
+            return
+        
+        self.running = True
+        self.thread = threading.Thread(target=self._run, daemon=True)
+        self.thread.start()
+    
+    def disconnect(self):
+        """Stop WebSocket connection."""
+        self.running = False
+        if self.ws:
+            try:
+                self.ws.close()
+            except Exception as e:
+                print(f"Error closing WebSocket: {e}")
+    
+    def _run(self):
+        """WebSocket event loop (runs in background thread)."""
+        try:
+            self.ws = websocket.WebSocketApp(
+                self.url,
+                on_open=self._on_open,
+                on_message=self._on_message,
+                on_error=self._on_error,
+                on_close=self._on_close
+            )
+            
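+            # Blocks until the connection closes. Reconnection is scheduled by
+            # the UI (on_ws_disconnected), which calls connect() again.
+            self.ws.run_forever(ping_interval=300, ping_timeout=60)
+            
+        except Exception as e:
+            # The UI handler already prefixes the message with "WebSocket error:"
+            self.error.emit(str(e))
+        finally:
+            # Clear the flag so a later connect() call can start a new thread
+            self.running = False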
+    
+    def _on_open(self, ws):
+        """Called when WebSocket connection is established."""
+        print("WebSocket connected")
+        self.connected.emit()
+    
+    def _on_message(self, ws, message):
+        """Called when a message is received from the server."""
+        try:
+            data = json.loads(message)
+            self.event_received.emit(data)
+        except json.JSONDecodeError as e:
+            print(f"Failed to parse WebSocket message: {e}")
+    
+    def _on_error(self, ws, error):
+        """Called when an error occurs."""
+        print(f"WebSocket error: {error}")
+        self.error.emit(str(error))
+    
+    def _on_close(self, ws, close_status_code, close_msg):
+        """Called when WebSocket connection is closed."""
+        print(f"🔌 WebSocket disconnected: {close_status_code} - {close_msg}")
+        self.disconnected.emit()
+
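Review note on `_on_message`: `event_received` is declared as `Signal(dict)`, but `json.loads` can also return a list, string, or number when the server sends valid JSON that is not an object. A minimal sketch of a parser that only passes dictionaries through is shown below. The helper name `parse_event` is hypothetical, and whether the server ever sends non-object payloads is an assumption not confirmed by this change.

```python
import json
from typing import Any, Dict, Optional


def parse_event(message: Any) -> Optional[Dict[str, Any]]:
    """Decode a WebSocket message and return it only if it is a JSON object."""
    try:
        data = json.loads(message)
    except (ValueError, TypeError):
        # ValueError covers malformed JSON and undecodable bytes;
        # TypeError covers inputs that are not str, bytes, or bytearray.
        return None
    return data if isinstance(data, dict) else None


if __name__ == "__main__":
    print(parse_event('{"event_type": "query"}'))
    print(parse_event("[1, 2, 3]"))
    print(parse_event("not json"))
```

With a helper like this, `_on_message` could emit the signal only when the result is not `None`.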
--- a/vision_tools.py
+++ b/vision_tools.py
@@ -29,11 +29,14 @@ import json
 import os
 import asyncio
 import uuid
+from dotenv import load_dotenv
 import datetime
 from pathlib import Path
 from typing import Dict, Any, Optional
 from openai import AsyncOpenAI

+load_dotenv()
+
 # Initialize Nous Research API client for vision processing
 nous_client = AsyncOpenAI(
    api_key=os.getenv("NOUS_API_KEY"),
--- a/web_tools.py
+++ b/web_tools.py
@@ -42,6 +42,7 @@ Usage:

 import json
 import os
+from dotenv import load_dotenv
 import re
 import asyncio
 import uuid
@@ -51,6 +52,9 @@ from typing import List, Dict, Any, Optional
 from firecrawl import Firecrawl
 from openai import AsyncOpenAI

+
+load_dotenv()
+
 # Initialize Firecrawl client once at module level
 firecrawl_client = Firecrawl(api_key=os.getenv("FIRECRAWL_API_KEY"))

@@ -61,7 +65,7 @@ nous_client = AsyncOpenAI(
 )

 # Configuration for LLM processing
-DEFAULT_SUMMARIZER_MODEL = "gemini-2.5-flash"
+DEFAULT_SUMMARIZER_MODEL = "Hermes-4-70B"
 DEFAULT_MIN_LENGTH_FOR_SUMMARIZATION = 5000

 # Debug mode configuration
@@ -193,6 +197,8 @@ Create a markdown summary that captures all key information in a well-organized,
            temperature=0.1,  # Low temperature for consistent extraction
            max_tokens=4000   # Generous limit for comprehensive processing
        )
+
+        print("Response within tool call to see the error: ", response)
        
        # Get the markdown response directly
        processed_content = response.choices[0].message.content.strip()
Author	SHA1	Message	Date
Jai Suphavadeeprasit	c2d5a28d15	Modularize frontend	2025-10-13 11:53:13 -04:00
Jai Suphavadeeprasit	bb5eab2645	logging work	2025-10-13 10:44:42 -04:00
Jai Suphavadeeprasit	6313c9879f	changes	2025-10-11 17:52:23 -04:00
Jai Suphavadeeprasit	e698b7e0e5	changes	2025-10-10 18:04:22 -04:00
Teknium	c5386ed7e6	add better logging when requests fail	2025-09-10 00:51:41 -07:00
Teknium	2082c7caa3	update gitignore	2025-09-10 00:50:56 -07:00
Teknium	17608c1142	Update to use toolsets and make them easy to create and configure	2025-09-10 00:43:55 -07:00
Teknium	c7fa4447b8	cleanup	2025-09-06 22:07:38 -07:00