Initial commit: alarm analysis project

Python project for analyzing alarm data from building monitoring systems. Includes alarm analyzer, plotting, tests, and source data files.
2026-02-26 09:03:54 -05:00
commit f08a1a9bf5
25 changed files with 11350 additions and 0 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1,25 @@
+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+
+# Virtual environment
+alarm_analysis_env/
+
+# Generated output
+output/
+plots/
+
+# IDE
+.vscode/
+.idea/
+*.swp
+*.swo
+
+# OS
+.DS_Store
+Thumbs.db
+
+# Claude Code
+.claude/
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -0,0 +1,48 @@
+# Alarm Analysis
+
+Python project for analyzing alarm data from building monitoring systems (CSV alarm logs + XLSX sensor reports).
+
+## Commands
+
+```bash
+# Activate virtual environment
+source alarm_analysis_env/Scripts/activate  # Windows Git Bash
+# or: alarm_analysis_env\Scripts\activate   # Windows CMD
+
+# Run full analysis (no plots)
+python run_analysis.py
+
+# Generate plots (requires matplotlib display)
+python create_plots.py
+
+# Run tests
+python test_changes.py
+python test_duration_fix.py
+python test_mapping.py
+python test_enhanced_plotting.py
+```
+
+## Architecture
+
+- `alarm_analyzer.py` — Main `AlarmAnalyzer` class (~96KB). Handles data loading, alarm categorization, event pairing, duration calculation, basic/advanced analysis, uptime metrics, and export.
+- `run_analysis.py` — Entry point that runs the full pipeline without visualizations.
+- `create_plots.py` — Generates alarm dashboard, duration analysis, and sensor analysis plots.
+
+## Data Files
+
+- `CardinalAlarmsDec25.csv` — Raw alarm data (columns: Alarm_Id, Sensor_Id, Date, Description, LogTime)
+- `SensorReport Cardinal 2025-12-23_processed.xlsx` — Sensor descriptions and group mappings
+- `exclusion_config.json` — JSON format: `{"excluded_groups": ["GroupName1"]}`
+- `groups_to_skip.txt` — Text format: one group name per line
+
+## Key Patterns
+
+- Alarm types are parsed from Description field via regex: Hi/Lo Alarm, Hi/Lo Warning, Error, Normal
+- Events are paired (alarm start -> Normal end) to calculate durations
+- Sensor mapping links Sensor_Id to human-readable names and groups from the XLSX file
+- Visualization imports are deferred (`_import_viz_libs()`) so analysis can run headless
+- Output goes to `output/` (CSVs) and `plots/` (PNGs)
+
+## Dependencies
+
+Python 3.13 with: pandas, numpy, matplotlib, seaborn, openpyxl
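Note on the "Key Patterns" section of CLAUDE.md: the regex used in `alarm_analyzer.py` is not visible in this commit, so the following is only a minimal sketch of how the documented Description formats could be parsed. The pattern, the `parse_description` name, and the returned dictionary keys are assumptions made for illustration, not the project's actual implementation.

```python
import re

# Hypothetical pattern covering the documented "Hi/Lo Alarm" and "Hi/Lo Warning" formats,
# e.g. "Hi Alarm: 51.3>=46.0F". The unit list follows the README (F, C, %RH, "H2O).
_THRESHOLD_RE = re.compile(
    r'^(?P<level>Hi|Lo) (?P<kind>Alarm|Warning): '
    r'(?P<value>-?\d+(?:\.\d+)?)(?P<op>>=|<=)(?P<threshold>-?\d+(?:\.\d+)?)'
    r'\s*(?P<unit>%RH|"H2O|F|C)?'
)


def parse_description(description):
    """Classify one Description string and extract value, threshold and unit if present."""
    text = description.strip()
    match = _THRESHOLD_RE.match(text)
    if match:
        return {
            "type": match["kind"],          # "Alarm" or "Warning"
            "level": match["level"],        # "Hi" or "Lo"
            "value": float(match["value"]),
            "threshold": float(match["threshold"]),
            "unit": match["unit"],          # None when no known unit follows
        }
    if text.startswith("Error"):
        return {"type": "Error"}
    if text.startswith("Normal"):
        return {"type": "Normal"}
    # Anything else is left unclassified rather than guessed at.
    return {"type": "Unknown"}
```

Checking the threshold pattern first keeps the `Error` and `Normal` branches as simple prefix tests, which matches the four categories the document lists.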
--- a/CardinalAlarmsDec25.csv
+++ b/CardinalAlarmsDec25.csv
--- a/ProjectPlan.md
+++ b/ProjectPlan.md
@@ -0,0 +1,174 @@
+# Alarm Data Analysis Project Plan
+
+## Overview
+This project will develop a Python script to analyze alarm data from CSV files, cross-referencing with sensor descriptions from an XLSX file. The script will provide comprehensive statistics and insights about alarm events across monitoring points.
+
+## Data Structure Analysis
+Based on the CSV file structure:
+- **Alarm_Id**: Unique identifier for each alarm event
+- **Sensor_Id**: Identifies the monitoring point
+- **Date**: Timestamp when the alarm/warning/error occurred
+- **Description**: Details about the alarm event (e.g., "Hi Alarm: 51.3>=46.0F", "Normal 42.5F", "Error: Comm Loss Error")
+- **LogTime**: Timestamp when the event was logged
+
+## Implementation Plan
+
+### Phase 1: Data Loading and Preprocessing
+1. Load the CSV alarm data using pandas
+2. Load the sensor report XLSX file to get sensor descriptions
+3. Parse alarm descriptions to categorize events (Normal, Alarm, Warning, Error)
+4. Extract numeric values and thresholds from alarm descriptions
+5. Identify alarm start and end events to calculate durations
+
+### Phase 2: Data Processing and Pairing
+1. Pair start events (Alarm/Warning/Error) with corresponding end events (Normal)
+2. Calculate duration for each alarm event
+3. Handle edge cases (unpaired events, overlapping events)
+4. Create a structured dataset of complete alarm events
+
+### Phase 3: Basic Analysis
+1. Count alarm events by type (Alarm, Warning, Error) for each sensor
+2. Calculate min/max/average duration for each alarm type per sensor
+3. Generate summary statistics across all sensors
+4. Identify most problematic sensors (highest number of events, longest durations)
+
+### Phase 4: Advanced Analysis
+1. Time-based analysis:
+   - Frequency of events by hour of day, day of week
+   - Trend analysis over time periods
+   - Seasonal patterns if data spans multiple months
+2. Alarm correlation analysis:
+   - Identify sensors that frequently alarm together
+   - Determine if specific alarm types lead to others
+3. Severity analysis:
+   - Weighted scoring based on alarm type and duration
+   - Ranking sensors by overall impact
+
+### Phase 5: Additional Valuable Metrics
+1. **MTBF (Mean Time Between Failures)**: Average time between consecutive alarm events for each sensor
+2. **Alarm Churn**: Rate of alarm state changes for each sensor
+3. **Recovery Time**: Time taken to return to normal state after an alarm
+4. **Alarm Escalation**: Percentage of warnings that escalate to alarms
+5. **Persistence Analysis**: How long alarms typically last before being resolved
+6. **Peak Time Identification**: Time periods with highest alarm frequency
+7. **False Alarm Rate**: Estimate of alarms that return to normal quickly
+8. **Critical Sensor Identification**: Sensors with highest frequency of high-severity events
+
+### Phase 6: Visualization and Reporting
+1. Generate summary reports in console and optionally save to file
+2. Create visualizations (matplotlib/seaborn):
+   - Bar charts for alarm counts by sensor and type
+   - Box plots for duration analysis
+   - Time series plots for alarm frequency over time
+   - Heatmaps for alarm correlation
+3. Export detailed analysis results to CSV files
+
+### Phase 7: Output and Export
+1. Create summary tables showing:
+   - Sensor-wise breakdown of alarm types and durations
+   - Top N problematic sensors
+   - Time-based trends
+2. Export processed data for further analysis
+3. Generate a comprehensive report file
+
+## Technical Implementation Details
+
+### Libraries to Use:
+- pandas: For data manipulation and analysis
+- numpy: For numerical operations
+- matplotlib/seaborn: For visualizations
+- openpyxl: For reading XLSX files
+- re: For parsing alarm descriptions
+- datetime: For time-based analysis
+
+### Data Processing Steps:
+1. Parse alarm descriptions using regular expressions to identify:
+   - Alarm type (Hi/Lo Alarm/Warning, Error, Normal)
+   - Measured value
+   - Threshold value
+   - Unit of measurement
+2. Create a mapping between Sensor_Id and sensor descriptions from XLSX
+3. For each sensor, pair alarm start events with corresponding normal end events
+4. Calculate duration between paired events
+5. Aggregate statistics by sensor and alarm type
+
+### Alarm Type Classification:
+- **Error**: Events containing "Error" in description
+- **Alarm**: Events containing "Alarm" but not "Warning"
+- **Warning**: Events containing "Warning"
+- **Normal**: Events indicating return to normal state
+
+### Key Metrics to Calculate:
+For each sensor:
+- Count of each alarm type
+- Min/Max/Average duration for each alarm type
+- Total alarm time percentage
+- Alarm frequency rate
+- Average time to recovery
+- Percentage of events that escalate
+
+## Expected Deliverables
+1. Main analysis script (alarm_analyzer.py)
+2. Configuration file for customization
+3. Sample output files demonstrating analysis results
+4. Documentation on how to run the script and interpret results
+5. ProjectPlan.md (this document)
+
+## Enhanced Features Implemented
+
+### Enhanced Group-Based Analysis
+1. **Total Sensors Per Group**: Added the total number of sensors in each group according to the sensor report
+2. **Alarm Coverage Percentage**: Added percentage of monitoring points that experienced alarms
+3. **Alarm Time Percentage**: Added percentage of time the group's sensors spent in alarm condition
+
+### Enhanced Output Files
+1. All sensor-specific output files now include sensor names and group information:
+   - `sensor_statistics.csv`
+   - `top_sensors_by_alarm_count.csv`
+   - `top_sensors_by_avg_duration.csv`
+   - `top_sensors_by_max_duration.csv`
+   - `top_sensors_by_severity_score.csv`
+
+### Enhanced Plotting Functionality
+1. All sensor-specific plots now display sensor names instead of just IDs
+2. Added comprehensive group-based visualizations:
+   - Group composition analysis
+   - Alarm type distribution by group
+   - Group alarm intensity metrics
+
+## Enhanced Features Implemented (continued)
+
+### Uptime/Downtime Metrics
+1. **Error-based downtime**: Calculates the total duration of all "Error" events across all sensors as a percentage of the total time period
+2. **Alarm/Warning-based downtime**: Calculates the total duration of all "Alarm" and "Warning" events across all sensors as a percentage of the total time period
+3. **System-level uptime metrics**: Time-based calculation showing the percentage of time that any sensor was in error or alarm/warning state
+4. **Per-sensor and per-group metrics**: Individual sensor and group uptime/downtime percentages
+5. **New output files**:
+   - `system_uptime_summary.csv` - Overall system uptime metrics
+   - `sensor_error_uptime_metrics.csv` - Per-sensor error-based uptime metrics
+   - `sensor_alarm_warning_uptime_metrics.csv` - Per-sensor alarm/warning-based uptime metrics
+   - `group_error_uptime_metrics.csv` - Per-group error-based uptime metrics
+   - `group_alarm_warning_uptime_metrics.csv` - Per-group alarm/warning-based uptime metrics
+6. **Comprehensive group inclusion**: All output files covering groups now include all groups, including those with 0 errors or warnings, allowing for identification of systems with 100% uptime
+
+### Optional Group Exclusion Feature
+1. **Create an optional configuration file** (e.g., `exclusion_config.json` or `groups_to_skip.txt`) that allows users to specify groups to exclude from analysis
+2. **Implementation approach**:
+   - Add a new parameter to the AlarmAnalyzer class to accept an exclusion file path
+   - Parse the exclusion file to get a list of groups to skip
+   - Filter out sensor data belonging to excluded groups before analysis
+   - Add logging to indicate which groups were excluded
+3. **Configuration file format options**:
+   - JSON format: `{"excluded_groups": ["GroupName1", "GroupName2"]}`
+   - Simple text format: one group name per line
+   - CSV format: for more complex exclusion rules
+4. **Benefits**:
+   - Allows users to exclude groups with known issues or maintenance periods
+   - Provides cleaner analysis results when certain groups have anomalous data
+   - Maintains flexibility without permanently modifying the source data
+5. **Implementation details**:
+   - Add preprocessing step to filter out excluded groups before any analysis
+   - Update all analysis functions to work with the filtered dataset
+   - Maintain separate statistics for excluded groups if needed for reference
+
+## Future Enhancement Plans
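Note on the "Optional Group Exclusion Feature" section of ProjectPlan.md: the plan describes a JSON format and a one-name-per-line text format. A minimal sketch of reading either one might look like the following. The function names, the `Sensor_Group` key default, and the choice to ignore blank lines are assumptions for illustration; the actual handling inside `AlarmAnalyzer` is not shown in this commit.

```python
import json
from pathlib import Path


def load_excluded_groups(path):
    """Return the set of group names listed in a JSON or plain-text exclusion file."""
    file_path = Path(path)
    text = file_path.read_text(encoding="utf-8")
    if file_path.suffix.lower() == ".json":
        # JSON format: {"excluded_groups": ["GroupName1", "GroupName2"]}
        names = json.loads(text).get("excluded_groups", [])
    else:
        # Text format: one group name per line
        names = text.splitlines()
    # Assumption: surrounding whitespace and empty entries are not meaningful.
    return {name.strip() for name in names if name.strip()}


def filter_excluded(records, excluded_groups, group_key="Sensor_Group"):
    """Drop records (dicts) whose group is excluded, before any analysis runs."""
    return [record for record in records if record.get(group_key) not in excluded_groups]
```

Filtering once, right after loading, means every later analysis step sees the same reduced dataset, which is the preprocessing approach the plan describes.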
--- a/README.md
+++ b/README.md
@@ -0,0 +1,245 @@
+# Alarm Analysis
+
+Analyze alarm data from building monitoring systems — pair alarm events, calculate durations, compute uptime metrics, and generate visualizations. Built for CSV alarm logs and XLSX sensor reports exported from systems like Cardinal.
+
+## Table of Contents
+
+- [Quick Start](#quick-start)
+- [Inputs](#inputs)
+- [Outputs](#outputs)
+- [How It Works](#how-it-works)
+- [Configuration](#configuration)
+- [Visualizations](#visualizations)
+- [Testing](#testing)
+- [Project Structure](#project-structure)
+- [Dependencies](#dependencies)
+
+## Quick Start
+
+```bash
+# Set up virtual environment
+python -m venv alarm_analysis_env
+source alarm_analysis_env/Scripts/activate  # Windows Git Bash
+# or: alarm_analysis_env\Scripts\activate   # Windows CMD
+
+# Install dependencies
+pip install pandas numpy matplotlib seaborn openpyxl
+
+# Run the full analysis (outputs CSVs to output/)
+python run_analysis.py
+
+# Generate plots (outputs PNGs to plots/)
+python create_plots.py
+```
+
+## Inputs
+
+### 1. Alarm CSV (`CardinalAlarmsDec25.csv`)
+
+Raw alarm log exported from the monitoring system. Required columns:
+
+| Column | Type | Description | Example |
+|--------|------|-------------|---------|
+| `Alarm_Id` | int | Unique alarm event ID | `486258` |
+| `Sensor_Id` | int | Numeric sensor identifier | `9273` |
+| `Date` | datetime | When the alarm occurred | `2025-12-01 00:01:27.000` |
+| `Description` | string | Alarm condition text | `Lo Warning: 68.0<=68.0F` |
+| `LogTime` | datetime | When the event was logged | `2025-12-01 00:01:32.843` |
+
+**Description patterns** the analyzer recognizes:
+
+| Pattern | Example | Parsed As |
+|---------|---------|-----------|
+| Hi/Lo Alarm | `Hi Alarm: 51.3>=46.0F` | Type=Alarm, Value=51.3, Threshold=46.0, Unit=F |
+| Hi/Lo Warning | `Lo Warning: 68.0<=68.0F` | Type=Warning, Value=68.0, Threshold=68.0, Unit=F |
+| Error | `Error: Comm Loss Error 20.4>=20 min.` | Type=Error |
+| Normal | `Normal 68.1F` | Type=Normal (resolves prior alarm) |
+
+Supported units: `F`, `C`, `%RH`, `"H2O`
+
+### 2. Sensor Report XLSX (`SensorReport Cardinal 2025-12-23_processed.xlsx`)
+
+Sensor metadata exported from the monitoring system. Expected columns:
+
+| Column | Description |
+|--------|-------------|
+| `ID` | Sensor ID (matches `Sensor_Id` in the alarm CSV) |
+| `Group` | Logical grouping (e.g., room, zone, building area) |
+| `Remote` | Remote unit identifier |
+| `Name` | Human-readable sensor name |
+| `Type` | Sensor type (temperature, humidity, etc.) |
+| `Serial No` | Hardware serial number |
+
+The XLSX may use a hierarchical layout where `Group` names appear only in the first row of each group. The analyzer handles this automatically via forward-fill. Both `header=0` (new format) and `header=4` (legacy format) are auto-detected.
+
+### 3. Exclusion Config (optional)
+
+Exclude specific sensor groups from analysis. Provide either format:
+
+**JSON** (`exclusion_config.json`):
+```json
+{
+    "excluded_groups": [
+        "Maintenance Sensors",
+        "Decommissioned Wing"
+    ]
+}
+```
+
+**Plain text** (`groups_to_skip.txt`):
+```
+Maintenance Sensors
+Decommissioned Wing
+```
+
+Pass the file path when creating the analyzer:
+```python
+analyzer = AlarmAnalyzer('alarms.csv', 'sensors.xlsx', exclusion_file_path='exclusion_config.json')
+```
+
+## Outputs
+
+All outputs are generated in `output/` (CSVs) and `plots/` (PNGs).
+
+### Core Analysis CSVs
+
+| File | Description |
+|------|-------------|
+| `paired_alarm_events.csv` | Every alarm event paired with its resolution — includes sensor name/group, start/end times, duration, alarm type, values, thresholds, and how the alarm ended |
+| `summary_by_alarm_type.csv` | Aggregate counts and duration stats (min/max/avg) per alarm type |
+| `sensor_statistics.csv` | Per-sensor stats: alarm count, duration stats, with name and group |
+
+### Rankings
+
+| File | Ranked By |
+|------|-----------|
+| `top_sensors_by_alarm_count.csv` | Total alarm events per sensor |
+| `top_sensors_by_avg_duration.csv` | Average alarm duration |
+| `top_sensors_by_max_duration.csv` | Longest single alarm event |
+| `top_sensors_by_severity_score.csv` | Severity score (type weight x duration) |
+| `top_groups_by_alarm_count.csv` | Total alarm events per group |
+| `top_groups_by_avg_duration.csv` | Average alarm duration per group |
+| `top_groups_by_max_duration.csv` | Longest single alarm event per group |
+| `top_groups_by_severity_score.csv` | Severity score per group |
+
+### Time Analysis
+
+| File | Description |
+|------|-------------|
+| `alarm_frequency_by_hour.csv` | Alarm count for each hour of day (0-23) |
+| `alarm_frequency_by_day.csv` | Alarm count for each day of week |
+
+### Group Analysis
+
+| File | Description |
+|------|-------------|
+| `group_statistics.csv` | Per-group stats including total sensors, percentage of sensors that alarmed, and alarm time percentage |
+| `alarm_type_distribution_by_group.csv` | Crosstab of alarm types per group |
+
+### Uptime Metrics
+
+| File | Description |
+|------|-------------|
+| `system_uptime_summary.csv` | System-wide uptime: total time span, cumulative downtime percentages, time-based uptime (per-hour bucket analysis) |
+| `sensor_error_uptime_metrics.csv` | Per-sensor error-based uptime (communication failures) |
+| `sensor_alarm_warning_uptime_metrics.csv` | Per-sensor alarm/warning-based uptime (operational issues) |
+| `group_error_uptime_metrics.csv` | Per-group error-based uptime |
+| `group_alarm_warning_uptime_metrics.csv` | Per-group alarm/warning-based uptime |
+
+## How It Works
+
+### Pipeline Overview
+
+```
+CSV + XLSX ──> Load & Map ──> Categorize ──> Pair Events ──> Analyze ──> Export
+                  │                              │              │
+                  ├─ Sensor ID → Name/Group      │              ├─ Basic stats
+                  └─ Exclude groups               │              ├─ Advanced (MTBF, correlation, severity)
+                                                  │              └─ Uptime metrics
+                                                  │
+                                            Alarm Start ──> Normal (resolved)
+                                            Alarm Start ──> Different Alarm (transition)
+                                            Alarm Start ──> [nothing] (unresolved)
+```
+
+### Step-by-Step
+
+1. **Load Data** — Read the alarm CSV and sensor report XLSX. Build a mapping from sensor IDs to human-readable names and groups. Enrich alarm records with sensor metadata. Filter out excluded groups.
+
+2. **Categorize Alarms** — Parse each alarm's `Description` field with regex to extract the alarm type (Error, Alarm, Warning, Normal), measured value, threshold, and unit.
+
+3. **Pair Events & Calculate Durations** — For each sensor, walk through events chronologically:
+   - An alarm-start event (Alarm, Warning, or Error) looks forward for resolution
+   - If a `Normal` event follows → alarm is **resolved**, duration is calculated
+   - If a different alarm type follows → recorded as a **transition** (e.g., "Transition to Alarm")
+   - If nothing follows → marked **unresolved**
+
+4. **Basic Analysis** — Count alarms by type, sensor, and group. Compute duration statistics (min, max, average).
+
+5. **Advanced Analysis**:
+   - **Hourly/daily frequency** — when alarms tend to occur
+   - **MTBF** (Mean Time Between Failures) — average time between consecutive alarms per sensor
+   - **Alarm correlation** — sensor pairs that alarm within 1-hour windows of each other
+   - **Severity scoring** — weighted by type (Error=3x, Alarm=2x, Warning=1x) multiplied by duration
+   - **Alarm escalation** — warnings that escalate to Alarm or Error within 1 hour
+   - **Group aggregates** — all metrics rolled up by sensor group
+
+6. **Uptime Metrics** — Calculate downtime from error events (communication failures) and alarm/warning events (operational issues). Compute both cumulative percentages and time-bucketed system uptime using 1-hour intervals. Include all sensors and groups, even those with zero events.
+
+7. **Export** — Write all results to CSV files in `output/`.
+
+## Visualizations
+
+Run `python create_plots.py` to generate PNG plots in `plots/`:
+
+| Plot | Description |
+|------|-------------|
+| `alarm_dashboard.png` | 4-panel overview: alarm count by type, top 10 sensors, hourly frequency, daily frequency |
+| `duration_analysis.png` | Box plots and histograms of alarm durations by type (log scale) |
+| `sensor_analysis.png` | 4-panel: top sensors by count, avg duration, max duration, severity |
+
+Additional group-based plots are generated when group data is available (group dashboard, group composition, alarm type distribution by group, alarm intensity per group).
+
+Visualization imports (matplotlib, seaborn) are deferred so `run_analysis.py` can execute headless without a display.
+
+## Testing
+
+```bash
+python test_changes.py          # Validates code structure (methods, columns, exports exist)
+python test_duration_fix.py     # Tests event pairing and duration calculation
+python test_mapping.py          # Verifies sensor ID → name/group mapping
+python test_enhanced_plotting.py # Tests plot data preparation logic (no rendering)
+```
+
+## Project Structure
+
+```
+AlarmAnalysis/
+├── alarm_analyzer.py          # Core AlarmAnalyzer class (all analysis logic)
+├── run_analysis.py            # Entry point: run full analysis, export CSVs
+├── create_plots.py            # Entry point: generate visualization PNGs
+├── exclusion_config.json      # Group exclusion config (JSON format)
+├── groups_to_skip.txt         # Group exclusion config (plain text format)
+├── CardinalAlarmsDec25.csv    # Input: alarm log data
+├── SensorReport *.xlsx        # Input: sensor metadata
+├── test_changes.py            # Test: code structure validation
+├── test_duration_fix.py       # Test: event pairing logic
+├── test_mapping.py            # Test: sensor ID mapping
+├── test_enhanced_plotting.py  # Test: plot data preparation
+├── output/                    # Generated CSV analysis results
+└── plots/                     # Generated PNG visualizations
+```
+
+## Dependencies
+
+- **Python** 3.13+
+- **pandas** — data manipulation and analysis
+- **numpy** — numerical operations
+- **matplotlib** — plotting (only needed for `create_plots.py`)
+- **seaborn** — statistical visualizations (only needed for `create_plots.py`)
+- **openpyxl** — reading XLSX sensor reports
+
+Install all dependencies:
+```bash
+pip install pandas numpy matplotlib seaborn openpyxl
+```
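Note on step 3 of the README ("Pair Events & Calculate Durations"): the three outcomes (resolved, transition, unresolved) can be illustrated with a small standard-library sketch. It assumes that the very next event for the same sensor closes the current one and that a duration is also recorded for transitions; neither detail is confirmed by the README, and `pair_events` with its dictionary keys is a hypothetical stand-in for the logic in `alarm_analyzer.py`.

```python
from datetime import datetime

START_TYPES = {"Alarm", "Warning", "Error"}


def pair_events(events):
    """Pair each alarm-start event with the event that follows it.

    `events` is a list of (timestamp, alarm_type) tuples for ONE sensor.
    """
    ordered = sorted(events, key=lambda event: event[0])
    paired = []
    for index, (start, alarm_type) in enumerate(ordered):
        if alarm_type not in START_TYPES:
            continue  # Normal events only close alarms; they never open one
        if index + 1 == len(ordered):
            # Nothing follows: the alarm is still open at the end of the log
            paired.append({"type": alarm_type, "start": start, "end": None,
                           "duration_minutes": None, "ended_by": "Unresolved"})
            continue
        end, next_type = ordered[index + 1]
        ended_by = "Normal" if next_type == "Normal" else f"Transition to {next_type}"
        paired.append({"type": alarm_type, "start": start, "end": end,
                       "duration_minutes": (end - start).total_seconds() / 60,
                       "ended_by": ended_by})
    return paired


# Illustrative timeline for a single sensor (timestamps are made up)
events = [
    (datetime(2025, 12, 1, 0, 0), "Warning"),
    (datetime(2025, 12, 1, 0, 10), "Alarm"),
    (datetime(2025, 12, 1, 0, 25), "Normal"),
    (datetime(2025, 12, 1, 0, 40), "Error"),
]
result = pair_events(events)
for row in result:
    print(row["type"], row["ended_by"], row["duration_minutes"])
```

In a full pipeline the same function would be applied once per `Sensor_Id`, since pairing across different sensors would not be meaningful.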
--- a/2025-12-23_processed.xlsx
+++ b/2025-12-23_processed.xlsx
--- a/alarm_analyzer.py
+++ b/alarm_analyzer.py
--- a/check_enhanced_group_stats.py
+++ b/check_enhanced_group_stats.py
@@ -0,0 +1,81 @@
+#!/usr/bin/env python
+# Script to verify the enhanced group statistics
+
+import pandas as pd
+import os
+
+def check_enhanced_group_stats():
+    print("=== ENHANCED GROUP STATISTICS VERIFICATION ===")
+    print()
+
+    # Check if output directory exists
+    if not os.path.exists('output'):
+        print("Output directory not found!")
+        return
+
+    # Check if group_statistics.csv exists
+    group_stats_path = os.path.join('output', 'group_statistics.csv')
+    if not os.path.exists(group_stats_path):
+        print(f"Group statistics file not found at {group_stats_path}")
+        return
+
+    # Load the enhanced group statistics
+    group_stats_df = pd.read_csv(group_stats_path)
+    
+    print("Enhanced Group Statistics Columns:")
+    print(list(group_stats_df.columns))
+    print()
+
+    # Verify the new columns exist
+    required_columns = [
+        'Total_Sensors_In_Group', 
+        'Percentage_Monitoring_Points_Alarmed', 
+        'Alarm_Time_Percentage'
+    ]
+    
+    missing_columns = [col for col in required_columns if col not in group_stats_df.columns]
+    if missing_columns:
+        print(f"ERROR: Missing columns: {missing_columns}")
+        return
+    else:
+        print("All required enhanced columns are present")
+        print()
+
+    # Display sample of the enhanced data
+    print("Sample of Enhanced Group Statistics (Top 10 by Alarm Count):")
+    print(group_stats_df[['Sensor_Group', 'Total_Alarm_Count', 'Unique_Sensors', 
+                         'Total_Sensors_In_Group', 'Percentage_Monitoring_Points_Alarmed', 
+                         'Alarm_Time_Percentage']].head(10))
+    print()
+
+    # Show some key statistics
+    print("=== ENHANCED ANALYSIS SUMMARY ===")
+    
+    # Groups with highest percentage of monitoring points alarmed
+    print("Top 5 groups with highest percentage of monitoring points that experienced alarms:")
+    top_alarm_percent = group_stats_df.nlargest(5, 'Percentage_Monitoring_Points_Alarmed')[['Sensor_Group', 'Percentage_Monitoring_Points_Alarmed', 'Unique_Sensors', 'Total_Sensors_In_Group']]
+    print(top_alarm_percent)
+    print()
+
+    # Groups with highest alarm time percentage
+    print("Top 5 groups with highest percentage of time spent in alarm condition:")
+    top_time_percent = group_stats_df.nlargest(5, 'Alarm_Time_Percentage')[['Sensor_Group', 'Alarm_Time_Percentage', 'Total_Duration', 'Total_Sensors_In_Group']]
+    print(top_time_percent)
+    print()
+
+    # Groups with the largest gap between total sensors and sensors that alarmed
+    print("Top 5 groups by number of sensors that never alarmed:")
+    group_stats_df['Sensors_Not_Alarming'] = group_stats_df['Total_Sensors_In_Group'] - group_stats_df['Unique_Sensors']
+    top_inactive = group_stats_df.nlargest(5, 'Sensors_Not_Alarming')[['Sensor_Group', 'Sensors_Not_Alarming', 'Total_Sensors_In_Group', 'Unique_Sensors', 'Percentage_Monitoring_Points_Alarmed']]
+    print(top_inactive)
+    print()
+
+    print("Enhanced group statistics analysis completed successfully!")
+    print()
+    print("New metrics added:")
+    print("- Total_Sensors_In_Group: Total number of sensors in the group according to sensor report")
+    print("- Percentage_Monitoring_Points_Alarmed: Percentage of sensors in the group that experienced alarms")
+    print("- Alarm_Time_Percentage: Percentage of total possible sensor-time that was spent in alarm condition")
+
+if __name__ == "__main__":
+    check_enhanced_group_stats()
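Note on the metrics this script verifies: the closing messages describe `Alarm_Time_Percentage` as the share of "total possible sensor-time" spent in alarm, and `Percentage_Monitoring_Points_Alarmed` as the share of a group's sensors that alarmed. A hedged sketch of that arithmetic follows. The exact formulas in `alarm_analyzer.py` are not shown in this commit, so the function names, the minute-based units, and the zero-denominator handling are assumptions.

```python
def alarm_time_percentage(total_alarm_minutes, total_sensors, period_minutes):
    """Share of possible sensor-time spent in alarm, as a percentage.

    Possible sensor-time is taken to be (sensors in group) x (length of the
    analysis period); this interpretation is an assumption.
    """
    possible_minutes = total_sensors * period_minutes
    if possible_minutes <= 0:
        return 0.0  # avoid dividing by zero for empty groups or empty periods
    return 100.0 * total_alarm_minutes / possible_minutes


def percentage_sensors_alarmed(unique_sensors_alarmed, total_sensors):
    """Share of a group's sensors that raised at least one alarm, as a percentage."""
    if total_sensors <= 0:
        return 0.0
    return 100.0 * unique_sensors_alarmed / total_sensors
```

For example, a group of 2 sensors that accumulated 60 alarm minutes over a 300-minute period has 600 possible sensor-minutes, so the alarm time percentage under this definition is 10%.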
--- a/check_mapping.py
+++ b/check_mapping.py
@@ -0,0 +1,74 @@
+#!/usr/bin/env python
+# Script to check the ID mapping between alarm data and sensor report
+
+import pandas as pd
+import numpy as np
+
+def check_mapping():
+    print("Loading alarm data...")
+    alarm_df = pd.read_csv('CardinalAlarmsDec25.csv')
+    
+    print("Loading sensor report...")
+    # Try to read with header=0 first (new format) then with header=4 (old format)
+    try:
+        temp_df = pd.read_excel('SensorReport Cardinal 2025-12-23_processed.xlsx', header=0, nrows=5)
+        expected_cols = ['ID', 'Remote', 'Group', 'Type', 'Serial No', 'Name']
+        has_expected_cols = any(col in temp_df.columns for col in expected_cols)
+
+        if has_expected_cols:
+            sensor_df = pd.read_excel('SensorReport Cardinal 2025-12-23_processed.xlsx', header=0)
+            print("Using new sensor report format (header=0)")
+        else:
+            sensor_df = pd.read_excel('SensorReport Cardinal 2025-12-23_processed.xlsx', header=4)
+            print("Using old sensor report format (header=4)")
+    except FileNotFoundError:
+        print("Sensor report file not found. Please ensure 'SensorReport Cardinal 2025-12-23_processed.xlsx' is in the current directory.")
+        return
+    
+    print(f"Alarm data shape: {alarm_df.shape}")
+    print(f"Sensor report shape: {sensor_df.shape}")
+    
+    print("\nAlarm data Sensor_Id sample (first 10):")
+    print(alarm_df['Sensor_Id'].head(10).tolist())
+    
+    print("\nSensor report columns:")
+    print(sensor_df.columns.tolist())
+    
+    print("\nSensor report 'Remote SN' column info:")
+    print(f"Data type: {sensor_df['Remote SN'].dtype}")
+    print(f"Sample values (first 10): {sensor_df['Remote SN'].head(10).tolist()}")
+    print(f"Non-null count: {sensor_df['Remote SN'].notna().sum()}")
+    
+    # Check for potential matches
+    alarm_sensors = set(alarm_df['Sensor_Id'].unique())
+    
+    # Clean the Remote SN column to find valid numeric values
+    valid_remote_sns = []
+    for sn in sensor_df['Remote SN'].dropna():
+        try:
+            # Try to convert to int
+            valid_remote_sns.append(int(sn))
+        except (ValueError, TypeError):
+            print(f"Could not convert to int: {sn}")
+            continue
+    
+    sensor_sns = set(valid_remote_sns)
+    
+    print(f"\nNumber of unique alarm sensors: {len(alarm_sensors)}")
+    print(f"Number of valid sensor report IDs: {len(sensor_sns)}")
+    print(f"Common IDs between datasets: {len(alarm_sensors.intersection(sensor_sns))}")
+    
+    if len(alarm_sensors.intersection(sensor_sns)) > 0:
+        print(f"Sample common IDs: {list(alarm_sensors.intersection(sensor_sns))[:10]}")
+    else:
+        print("No direct matches found. Let's check other potential ID columns in sensor report...")
+        
+        # Check other columns that might contain IDs
+        for col in sensor_df.columns:
+            if col != 'Remote SN':
+                print(f"\nChecking column: {col}")
+                non_null_values = sensor_df[col].dropna().head(10).tolist()
+                print(f"Sample values: {non_null_values}")
+
+if __name__ == "__main__":
+    check_mapping()
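Note on the ID comparison in this script: it reports how many IDs the two datasets share, and it reads the report IDs from `Remote SN`, whereas the README names `ID` as the column that matches `Sensor_Id`. The sketch below takes the candidate values from whichever column applies and also lists the alarm IDs that have no match. It is a standard-library illustration with hypothetical function names, not code from the repository.

```python
def to_int_ids(values):
    """Convert raw report values to a set of ints, collecting values that cannot be converted."""
    ids = set()
    rejected = []
    for value in values:
        if value is None:
            continue  # mirrors dropping nulls before conversion
        try:
            ids.add(int(value))
        except (ValueError, TypeError):
            rejected.append(value)
    return ids, rejected


def summarize_overlap(alarm_ids, report_values):
    """Compare alarm Sensor_Id values with candidate ID values from the sensor report."""
    report_ids, rejected = to_int_ids(report_values)
    alarm_set = set(alarm_ids)
    return {
        "common": sorted(alarm_set & report_ids),
        "unmapped": sorted(alarm_set - report_ids),  # alarm IDs missing from the report
        "rejected": rejected,                        # report values that are not integers
    }
```

With pandas, the inputs could be supplied as `alarm_df['Sensor_Id'].unique()` and `sensor_df[column].tolist()`, after first checking that `column` is present in `sensor_df.columns`.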
--- a/check_output.py
+++ b/check_output.py
@@ -0,0 +1,45 @@
+#!/usr/bin/env python
+# Check the output files to confirm sensor names are included
+
+import pandas as pd
+
+def check_output():
+    try:
+        print("Loading paired events CSV...")
+        paired_events = pd.read_csv('output/paired_alarm_events.csv')
+        print('Paired events CSV loaded successfully')
+        print(f'Shape: {paired_events.shape}')
+        print('Columns:', list(paired_events.columns))
+        
+        # Show a few rows to verify sensor names are included
+        print('\nFirst 5 rows with Sensor_Id, Sensor_Name, Sensor_Group:')
+        cols_to_show = ['Sensor_Id', 'Sensor_Name', 'Sensor_Group', 'Alarm_Type', 'Duration_Minutes']
+        available_cols = [col for col in cols_to_show if col in paired_events.columns]
+        if available_cols:
+            print(paired_events[available_cols].head())
+        else:
+            print("Columns not found in paired events file")
+        
+        print('\nSample of unique sensor names:')
+        if 'Sensor_Name' in paired_events.columns:
+            unique_names = paired_events['Sensor_Name'].unique()
+            print(f'Number of unique sensor names: {len(unique_names)}')
+            print('Sample sensor names:', unique_names[:10])
+        else:
+            print("Sensor_Name column not found in paired events")
+            
+        print('\nSample of unique sensor groups:')
+        if 'Sensor_Group' in paired_events.columns:
+            unique_groups = paired_events['Sensor_Group'].unique()
+            print(f'Number of unique sensor groups: {len(unique_groups)}')
+            print('Sample sensor groups:', unique_groups[:10])
+        else:
+            print("Sensor_Group column not found in paired events")
+        
+    except Exception as e:
+        print(f'Error reading output file: {e}')
+        import traceback
+        traceback.print_exc()
+
+if __name__ == "__main__":
+    check_output()
--- a/check_sensor_report.py
+++ b/check_sensor_report.py
@@ -0,0 +1,50 @@
+#!/usr/bin/env python
+# Check the sensor report data structure
+
+import pandas as pd
+
+def check_sensor_report():
+    print("Loading sensor report...")
+    # Try to read with header=0 first (new format) then with header=4 (old format)
+    try:
+        temp_df = pd.read_excel('SensorReport Cardinal 2025-12-23_processed.xlsx', header=0, nrows=5)
+        expected_cols = ['ID', 'Remote', 'Group', 'Type', 'Serial No', 'Name']
+        has_expected_cols = any(col in temp_df.columns for col in expected_cols)
+
+        if has_expected_cols:
+            sensor_df = pd.read_excel('SensorReport Cardinal 2025-12-23_processed.xlsx', header=0)
+            print("Using new sensor report format (header=0)")
+        else:
+            sensor_df = pd.read_excel('SensorReport Cardinal 2025-12-23_processed.xlsx', header=4)
+            print("Using old sensor report format (header=4)")
+    except FileNotFoundError:
+        print("Sensor report file not found. Please ensure 'SensorReport Cardinal 2025-12-23_processed.xlsx' is in the current directory.")
+        return
+    
+    print(f"Sensor report shape: {sensor_df.shape}")
+    print(f"Columns: {list(sensor_df.columns)}")
+    
+    print("\nFirst few rows:")
+    print(sensor_df.head(10))
+    
+    print("\nSample of the specific columns we're interested in:")
+    sample_ids = [9273, 3817, 8963, 7414, 9092, 9105, 7080, 3799]
+    
+    for col in ['ID', 'Remote', 'Group', 'Type', 'Serial No']:
+        print(f"\n{col} column:")
+        if col in sensor_df.columns:
+            print(sensor_df[sensor_df['ID'].isin(sample_ids)][col].head(10))
+        else:
+            print(f"Column {col} not found")
+    
+    # Check for some of the IDs that should exist
+    print("\nChecking for specific ID values...")
+    for sensor_id in sample_ids:
+        matches = sensor_df[sensor_df['ID'] == float(sensor_id)]
+        if not matches.empty:
+            print(f"ID {sensor_id}:")
+            print(matches[['ID', 'Remote', 'Group', 'Type', 'Name']].iloc[0])
+            print("---")
+
+if __name__ == "__main__":
+    check_sensor_report()
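
The header-detection logic above (try `header=0`, fall back to `header=4`) is repeated in several scripts in this commit. A minimal sketch of how it could be factored into one helper follows. The names `has_expected_columns` and `load_sensor_report` are hypothetical and do not exist in the repository; the check mirrors the scripts' "at least one expected column is present" rule.

```python
# Hypothetical helper, not part of the commit: one place for header detection.
EXPECTED_COLS = ('ID', 'Remote', 'Group', 'Type', 'Serial No', 'Name')


def has_expected_columns(columns, expected=EXPECTED_COLS):
    """Return True if at least one expected column name is present."""
    present = set(columns)
    return any(col in present for col in expected)


def load_sensor_report(path, new_header=0, old_header=4):
    """Read the report with the new layout, falling back to the old one.

    Returns the DataFrame and the header row that was used.
    """
    import pandas as pd  # imported lazily so the check above has no dependencies

    # Peek at a few rows first so the full file is only parsed once more.
    preview = pd.read_excel(path, header=new_header, nrows=5)
    header = new_header if has_expected_columns(preview.columns) else old_header
    return pd.read_excel(path, header=header), header
```

A caller would then write `sensor_df, header = load_sensor_report('SensorReport Cardinal 2025-12-23_processed.xlsx')` and handle `FileNotFoundError` as the scripts already do.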
--- a/check_unknown_sensors.py
+++ b/check_unknown_sensors.py
@@ -0,0 +1,44 @@
+import pandas as pd
+from alarm_analyzer import AlarmAnalyzer
+
+# Create analyzer instance
+analyzer = AlarmAnalyzer(
+    csv_file_path="CardinalAlarmsDec25.csv",
+    xlsx_file_path="SensorReport Cardinal 2025-12-23_processed.xlsx"
+)
+
+# Load data
+alarm_data, sensor_data = analyzer.load_data()
+
+# Check which sensors are mapped to 'Unknown' group
+unknown_sensors = analyzer.alarm_data[analyzer.alarm_data['Sensor_Group'] == 'Unknown']
+
+print(f"Number of alarm records with 'Unknown' group: {len(unknown_sensors)}")
+print(f"Number of unique sensors with 'Unknown' group: {unknown_sensors['Sensor_Id'].nunique()}")
+
+if len(unknown_sensors) > 0:
+    print("\nFirst 20 unique sensors with 'Unknown' group:")
+    unknown_sensor_ids = unknown_sensors['Sensor_Id'].unique()[:20]
+    print(unknown_sensor_ids)
+    
+    print("\nSensor details for first few 'Unknown' sensors:")
+    for sensor_id in unknown_sensor_ids[:10]:
+        sensor_records = unknown_sensors[unknown_sensors['Sensor_Id'] == sensor_id].iloc[0]
+        print(f"Sensor ID: {sensor_id}, Name: {sensor_records['Sensor_Name']}, Group: {sensor_records['Sensor_Group']}")
+        
+        # Check if this sensor exists in the sensor mapping
+        sensor_info = analyzer.sensor_mapping.get(sensor_id, {})
+        if sensor_info:
+            print(f"  Sensor mapping info: {sensor_info}")
+        else:
+            print(f"  Sensor NOT found in mapping")
+        print()
+
+# Also check which sensors from alarm data are not in the sensor mapping
+alarm_sensor_ids = set(analyzer.alarm_data['Sensor_Id'].unique())
+mapped_sensor_ids = set(analyzer.sensor_mapping.keys())
+unmapped_sensors = alarm_sensor_ids - mapped_sensor_ids
+
+print(f"\nNumber of sensors in alarm data but not in sensor mapping: {len(unmapped_sensors)}")
+if unmapped_sensors:
+    print("First 20 unmapped sensor IDs:", list(unmapped_sensors)[:20])
--- a/create_plots.py
+++ b/create_plots.py
@@ -0,0 +1,37 @@
+#!/usr/bin/env python
+# Script to create visualizations with enhanced group and sensor name information
+
+from alarm_analyzer import AlarmAnalyzer
+
+def main():
+    print("Creating analyzer instance for visualizations...")
+    # Create analyzer instance
+    analyzer = AlarmAnalyzer('CardinalAlarmsDec25.csv', 'SensorReport Cardinal 2025-12-23_processed.xlsx')
+
+    print("Loading data...")
+    # Load data
+    alarm_data, sensor_data = analyzer.load_data()
+    print(f"Loaded {len(alarm_data)} alarm records")
+
+    if analyzer.sensor_mapping:
+        print(f"Created sensor mapping for {len(analyzer.sensor_mapping)} sensors")
+    else:
+        print("No sensor mapping created - sensor report may not have been processed correctly")
+
+    print("Categorizing alarms...")
+    # Categorize alarms
+    categorized_data = analyzer.categorize_alarms()
+
+    print("Pairing events and calculating durations...")
+    # Pair events and calculate durations
+    paired_events = analyzer.pair_events_and_calculate_durations()
+
+    print("Creating enhanced visualizations...")
+    # Create visualizations with enhanced group and sensor name information
+    analyzer.create_visualizations(save_plots=True, output_dir='plots')
+
+    print("Visualizations created successfully!")
+    print("Plots have been saved to the plots directory.")
+
+if __name__ == '__main__':
+    main()
--- a/debug_sensor_report.py
+++ b/debug_sensor_report.py
@@ -0,0 +1,54 @@
+import pandas as pd
+
+# Read the sensor report
+# Try to read with header=0 first (new format) then with header=4 (old format)
+try:
+    temp_df = pd.read_excel('SensorReport Cardinal 2025-12-23_processed.xlsx', header=0, nrows=5)
+    expected_cols = ['ID', 'Remote', 'Group', 'Type', 'Serial No', 'Name']
+    has_expected_cols = any(col in temp_df.columns for col in expected_cols)
+
+    if has_expected_cols:
+        df = pd.read_excel('SensorReport Cardinal 2025-12-23_processed.xlsx', header=0)
+        print("Using new sensor report format (header=0)")
+    else:
+        df = pd.read_excel('SensorReport Cardinal 2025-12-23_processed.xlsx', header=4)
+        print("Using old sensor report format (header=4)")
+except FileNotFoundError:
+    print("Sensor report file not found. Please ensure 'SensorReport Cardinal 2025-12-23_processed.xlsx' is in the current directory.")
+    raise SystemExit(1)
+
+print('Shape:', df.shape)
+print('Before forward-fill:')
+print('First 10 rows:')
+print(df[['ID', 'Group']].head(10))
+
+# Apply the same hierarchical processing as in the code
+df_processed = df.copy()
+hierarchical_cols = ['Group', 'Remote', 'Name', 'Type', 'Serial No']
+
+for col in hierarchical_cols:
+    if col in df_processed.columns:
+        # Forward fill: propagate non-null values down until the next non-null value
+        df_processed[col] = df_processed[col].ffill()
+
+print()
+print('After forward-fill:')
+print('First 10 rows:')
+print(df_processed[['ID', 'Group']].head(10))
+
+# Check if sensor 7335 now has a group
+sensor_7335 = df_processed[pd.to_numeric(df_processed['ID'], errors='coerce') == 7335]
+if not sensor_7335.empty:
+    print()
+    print('Sensor 7335 after forward-fill:')
+    print(sensor_7335[['ID', 'Group', 'Name']])
+else:
+    print()
+    print('Sensor 7335 not found in processed data')
+
+# Let's also check for all sensors that have ID 7335 in the original data
+original_sensor_7335 = df[pd.to_numeric(df['ID'], errors='coerce') == 7335]
+if not original_sensor_7335.empty:
+    print()
+    print('Sensor 7335 in original data:')
+    print(original_sensor_7335[['ID', 'Group', 'Name']])
--- a/demonstrate_enhanced_features.py
+++ b/demonstrate_enhanced_features.py
@@ -0,0 +1,65 @@
+#!/usr/bin/env python
+# Final demonstration of enhanced group-based analysis
+
+import pandas as pd
+import os
+
+def demonstrate_enhanced_features():
+    print("=== ENHANCED GROUP-BASED ANALYSIS DEMONSTRATION ===")
+    print()
+    
+    # Load the enhanced group statistics
+    group_stats_path = os.path.join('output', 'group_statistics.csv')
+    if not os.path.exists(group_stats_path):
+        print(f"Group statistics file not found at {group_stats_path}")
+        return
+
+    group_stats_df = pd.read_csv(group_stats_path)
+    
+    print("NEW ENHANCED METRICS ADDED TO GROUP STATISTICS:")
+    print()
+    
+    print("1. Total_Sensors_In_Group - Total number of sensors in each group (from sensor report)")
+    print("2. Percentage_Monitoring_Points_Alarmed - Percentage of sensors in the group that experienced alarms")
+    print("3. Alarm_Time_Percentage - Percentage of total possible sensor-time that was spent in alarm condition")
+    print()
+    
+    print("SAMPLE ENHANCED DATA (Top 5 groups by alarm count):")
+    print(group_stats_df[['Sensor_Group', 'Total_Alarm_Count', 'Unique_Sensors', 
+                         'Total_Sensors_In_Group', 'Percentage_Monitoring_Points_Alarmed', 
+                         'Alarm_Time_Percentage']].head())
+    print()
+    
+    print("INTERPRETATION OF NEW METRICS:")
+    print()
+    
+    print("- Total_Sensors_In_Group: Shows the actual size of each monitoring group")
+    print("- Percentage_Monitoring_Points_Alarmed: Reveals how widespread alarm events are within each group")
+    print("- Alarm_Time_Percentage: Indicates how much time the group's sensors spend in alarm condition")
+    print()
+
+    # Example interpretation
+    print("EXAMPLE ANALYSIS:")
+    sci_mansfield = group_stats_df[group_stats_df['Sensor_Group'] == 'SCI - Mansfield'].iloc[0]
+    print(f"- SCI - Mansfield group has {sci_mansfield['Total_Sensors_In_Group']} total sensors,")
+    print(f"  {sci_mansfield['Unique_Sensors']} experienced alarms ({sci_mansfield['Percentage_Monitoring_Points_Alarmed']}% of group),")
+    print(f"  and spent {sci_mansfield['Alarm_Time_Percentage']}% of total possible time in alarm condition.")
+    print()
+
+    snx_trailer = group_stats_df[group_stats_df['Sensor_Group'] == 'SNX Trailer'].iloc[0]
+    print(f"- SNX Trailer group has {snx_trailer['Total_Sensors_In_Group']} total sensors,")
+    print(f"  {snx_trailer['Unique_Sensors']} experienced alarms ({snx_trailer['Percentage_Monitoring_Points_Alarmed']}% of group),")
+    print(f"  and spent {snx_trailer['Alarm_Time_Percentage']}% of total possible time in alarm condition.")
+    print()
+
+    print("These new metrics provide deeper insights into:")
+    print("- Group size and coverage")
+    print("- Alarm distribution within groups")
+    print("- Overall alarm activity intensity per group")
+    print()
+
+    print("The enhanced analysis provides better visibility into which groups have the most comprehensive")
+    print("alarm coverage and which groups are experiencing the most persistent alarm conditions.")
+    
+if __name__ == "__main__":
+    demonstrate_enhanced_features()
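
The two percentage metrics described in the script above can be illustrated with a small worked example. This is only a sketch based on the printed descriptions; the actual formulas live in `alarm_analyzer.py`, which is not shown here, so the function names, the rounding, and the choice of minutes as the time unit are all assumptions.

```python
# Hypothetical illustration of the metric descriptions, not the analyzer's code.

def percentage_points_alarmed(alarmed_sensors, total_sensors):
    """Share of a group's sensors that had at least one alarm, in percent."""
    if total_sensors <= 0:
        return 0.0  # assumed convention for empty groups
    return round(100.0 * alarmed_sensors / total_sensors, 2)


def alarm_time_percentage(total_alarm_minutes, total_sensors, period_minutes):
    """Share of all possible sensor-time spent in alarm, in percent.

    "Possible sensor-time" is taken to be sensors x length of the period.
    """
    possible_minutes = total_sensors * period_minutes
    if possible_minutes <= 0:
        return 0.0
    return round(100.0 * total_alarm_minutes / possible_minutes, 2)


# A group of 10 sensors observed for one day (1440 minutes):
# 4 sensors alarmed, for 1440 alarm-minutes in total.
print(percentage_points_alarmed(4, 10))        # 40.0
print(alarm_time_percentage(1440, 10, 1440))   # 10.0
```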
--- a/exclusion_config.json
+++ b/exclusion_config.json
@@ -0,0 +1,6 @@
+{
+    "excluded_groups": [
+        "GroupName1",
+        "GroupName2"
+    ]
+}
--- a/find_matches.py
+++ b/find_matches.py
@@ -0,0 +1,72 @@
+#!/usr/bin/env python
+# Script to find matches between alarm IDs and sensor report IDs
+
+import pandas as pd
+
+def find_matches():
+    print("Loading alarm data...")
+    alarm_df = pd.read_csv('CardinalAlarmsDec25.csv')
+    
+    print("Loading sensor report...")
+    # Try to read with header=0 first (new format) then with header=4 (old format)
+    try:
+        temp_df = pd.read_excel('SensorReport Cardinal 2025-12-23_processed.xlsx', header=0, nrows=5)
+        expected_cols = ['ID', 'Remote', 'Group', 'Type', 'Serial No', 'Name']
+        has_expected_cols = any(col in temp_df.columns for col in expected_cols)
+
+        if has_expected_cols:
+            sensor_df = pd.read_excel('SensorReport Cardinal 2025-12-23_processed.xlsx', header=0)
+            print("Using new sensor report format (header=0)")
+        else:
+            sensor_df = pd.read_excel('SensorReport Cardinal 2025-12-23_processed.xlsx', header=4)
+            print("Using old sensor report format (header=4)")
+    except FileNotFoundError:
+        print("Sensor report file not found. Please ensure 'SensorReport Cardinal 2025-12-23_processed.xlsx' is in the current directory.")
+        return
+    
+    alarm_sensors = set(alarm_df['Sensor_Id'].unique())
+    sensor_ids = set(pd.to_numeric(sensor_df['ID'], errors='coerce').dropna().astype(int).tolist())
+    
+    print(f"Number of unique alarm sensors: {len(alarm_sensors)}")
+    print(f"Number of unique sensor report IDs: {len(sensor_ids)}")
+    
+    matches = alarm_sensors.intersection(sensor_ids)
+    print(f"Number of common IDs: {len(matches)}")
+    
+    if len(matches) > 0:
+        print(f"Common IDs: {list(matches)}")
+    else:
+        print("No exact matches found between alarm Sensor_Id and sensor report ID column.")
+        print("\nLet's look for any potential patterns or partial matches...")
+        
+        # Check if any alarm sensor IDs might be in other columns of the sensor report
+        print("\nChecking other columns in the sensor report for potential matches...")
+        
+        for col in sensor_df.columns:
+            if col != 'ID' and col != 'Remote SN':  # Skip columns we already know don't match
+                print(f"\nChecking column: {col}")
+                
+                # Look for any numeric values in this column that might match
+                numeric_values = []
+                import re  # local import, hoisted out of the per-value loop
+                for val in sensor_df[col].dropna():
+                    try:
+                        # Extract every run of digits from the value
+                        numbers = re.findall(r'\d+', str(val))
+                        for num in numbers:
+                            numeric_values.append(int(num))
+                    except (TypeError, ValueError):
+                        continue
+                
+                if numeric_values:
+                    numeric_set = set(numeric_values)
+                    col_matches = alarm_sensors.intersection(numeric_set)
+                    if col_matches:
+                        print(f"  Found {len(col_matches)} matches in {col}: {list(col_matches)[:10]}")
+                    else:
+                        print(f"  No matches in {col}")
+                else:
+                    print(f"  No numeric values found in {col}")
+
+if __name__ == "__main__":
+    find_matches()
--- a/groups_to_skip.txt
+++ b/groups_to_skip.txt
@@ -0,0 +1,2 @@
+GroupName1
+GroupName2
--- a/inspect_new_sensor_report.py
+++ b/inspect_new_sensor_report.py
@@ -0,0 +1,59 @@
+#!/usr/bin/env python
+"""
+Script to inspect the new sensor report format and compare it with the old one
+"""
+import pandas as pd
+
+def inspect_new_sensor_report():
+    print("Inspecting new sensor report: SensorReport Cardinal 2025-12-23_processed.xlsx")
+    
+    try:
+        # Try to read the new sensor report with different header options
+        print("\nTrying to read with header=4 (same as old format)...")
+        new_sensor_df = pd.read_excel('SensorReport Cardinal 2025-12-23_processed.xlsx', header=4)
+        
+        print(f"New sensor report shape: {new_sensor_df.shape}")
+        print(f"New sensor report columns: {list(new_sensor_df.columns)}")
+        
+        print("\nFirst few rows of new sensor report:")
+        print(new_sensor_df.head())
+        
+        print("\nData types of columns:")
+        print(new_sensor_df.dtypes)
+        
+        # Check for key columns that are expected by the current code
+        expected_cols = ['ID', 'Remote', 'Group', 'Type', 'Serial No', 'Name']
+        print(f"\nChecking for expected columns: {expected_cols}")
+        for col in expected_cols:
+            if col in new_sensor_df.columns:
+                print(f"  [OK] {col}: Present")
+            else:
+                print(f"  [MISSING] {col}: Missing")
+        
+        # Look at a sample of the data to understand its structure
+        print("\nSample data for first 10 rows:")
+        sample_cols = [col for col in expected_cols if col in new_sensor_df.columns]
+        if sample_cols:
+            print(new_sensor_df[sample_cols].head(10))
+        
+        # Try different header values to see if the structure is different
+        print("\nTrying with header=0 (first row)...")
+        new_sensor_df_h0 = pd.read_excel('SensorReport Cardinal 2025-12-23_processed.xlsx', header=0)
+        print(f"With header=0 - Shape: {new_sensor_df_h0.shape}, Columns: {list(new_sensor_df_h0.columns[:10])}")  # First 10 columns
+        
+        print("\nTrying with header=3...")
+        new_sensor_df_h3 = pd.read_excel('SensorReport Cardinal 2025-12-23_processed.xlsx', header=3)
+        print(f"With header=3 - Shape: {new_sensor_df_h3.shape}, Columns: {list(new_sensor_df_h3.columns[:10])}")
+        
+        # Also try to see the first few rows without setting a header
+        print("\nFirst few rows without setting header (to see raw structure):")
+        raw_df = pd.read_excel('SensorReport Cardinal 2025-12-23_processed.xlsx', header=None)
+        print(raw_df.head(10))
+        
+    except Exception as e:
+        print(f"Error reading new sensor report: {e}")
+        import traceback
+        traceback.print_exc()
+
+if __name__ == "__main__":
+    inspect_new_sensor_report()
--- a/run_analysis.py
+++ b/run_analysis.py
@@ -0,0 +1,52 @@
+#!/usr/bin/env python
+# Simple script to run the alarm analyzer without visualization
+
+from alarm_analyzer import AlarmAnalyzer
+
+def main():
+    print("Creating analyzer instance...")
+    # Create analyzer instance
+    analyzer = AlarmAnalyzer('CardinalAlarmsDec25.csv', 'SensorReport Cardinal 2025-12-23_processed.xlsx')
+
+    print("Loading data...")
+    # Load data
+    alarm_data, sensor_data = analyzer.load_data()
+    print(f"Loaded {len(alarm_data)} alarm records")
+    
+    if analyzer.sensor_mapping:
+        print(f"Created sensor mapping for {len(analyzer.sensor_mapping)} sensors")
+    else:
+        print("No sensor mapping created - sensor report may not have been processed correctly")
+
+    print("Categorizing alarms...")
+    # Categorize alarms
+    categorized_data = analyzer.categorize_alarms()
+
+    print("Pairing events and calculating durations...")
+    # Pair events and calculate durations
+    paired_events = analyzer.pair_events_and_calculate_durations()
+
+    print("Performing basic analysis...")
+    # Perform basic analysis
+    basic_results = analyzer.basic_analysis()
+
+    print("Performing advanced analysis...")
+    # Perform advanced analysis
+    advanced_results = analyzer.advanced_analysis()
+
+    print("Exporting results...")
+    # Export results (this doesn't require matplotlib)
+    analyzer.export_results(output_dir='output')
+
+    # Perform uptime analysis
+    print("Performing uptime analysis...")
+    uptime_results = analyzer.calculate_uptime_metrics()
+
+    # Export uptime metrics to new files
+    analyzer.export_uptime_metrics(output_dir="output", uptime_results=uptime_results)
+
+    print("Analysis completed successfully!")
+    print("Results have been exported to the output directory.")
+
+if __name__ == '__main__':
+    main()
--- a/show_results.py
+++ b/show_results.py
@@ -0,0 +1,66 @@
+#!/usr/bin/env python
+# Script to show the enhanced analysis results
+
+import pandas as pd
+import os
+
+def show_results():
+    print("=== ENHANCED ALARM ANALYSIS RESULTS ===")
+    print()
+    
+    # Check that output directory exists and show files
+    if os.path.exists('output'):
+        print("Output files created:")
+        for file in sorted(os.listdir('output')):
+            print(f"  - {file}")
+        print()
+    else:
+        print("Output directory not found!")
+        return
+    
+    # Show sample from paired_alarm_events.csv
+    try:
+        print("Sample from paired_alarm_events.csv (first 5 rows with sensor names and groups):")
+        paired_df = pd.read_csv('output/paired_alarm_events.csv')
+        print(paired_df[['Sensor_Id', 'Sensor_Name', 'Sensor_Group', 'Alarm_Type', 'Duration_Minutes']].head())
+        print()
+    except Exception as e:
+        print(f"Could not read paired_alarm_events.csv: {e}")
+    
+    # Show top groups by alarm count
+    try:
+        print("Top groups by alarm count:")
+        groups_count_df = pd.read_csv('output/top_groups_by_alarm_count.csv')
+        print(groups_count_df.head(10))
+        print()
+    except Exception as e:
+        print(f"Could not read top_groups_by_alarm_count.csv: {e}")
+    
+    # Show sample of group statistics
+    try:
+        print("Sample of group statistics (top 10 by alarm count):")
+        group_stats_df = pd.read_csv('output/group_statistics.csv')
+        print(group_stats_df[['Sensor_Group', 'Total_Alarm_Count', 'Avg_Duration', 'Total_Severity_Score']].head(10))
+        print()
+    except Exception as e:
+        print(f"Could not read group_statistics.csv: {e}")
+    
+    # Show top sensors by alarm count to compare
+    try:
+        print("Top sensors by alarm count (with names):")
+        sensors_count_df = pd.read_csv('output/top_sensors_by_alarm_count.csv')
+        print(sensors_count_df.head(10))
+        print()
+    except Exception as e:
+        print(f"Could not read top_sensors_by_alarm_count.csv: {e}")
+    
+    print("Analysis completed successfully with enhanced group and sensor name information!")
+    print()
+    print("Key enhancements:")
+    print("- Sensor IDs now replaced with meaningful sensor names")
+    print("- Groups properly mapped using hierarchical structure processing")
+    print("- Group-based analysis now available throughout the system")
+    print("- All output files contain enhanced sensor name and group information")
+
+if __name__ == "__main__":
+    show_results()
--- a/test_changes.py
+++ b/test_changes.py
@@ -0,0 +1,74 @@
+#!/usr/bin/env python
+# Test script to validate the changes made to alarm_analyzer.py
+
+import sys
+import os
+import pandas as pd
+
+def test_code_structure():
+    """Test that the modified code has the correct structure"""
+    
+    # Read the file to check if our changes were applied correctly
+    with open('alarm_analyzer.py', 'r', encoding='utf-8') as f:
+        content = f.read()
+    
+    print("Testing if new methods were added correctly...")
+    
+    # Check if the add_sensor_info_to_alarms method exists
+    if 'def add_sensor_info_to_alarms(self)' in content:
+        print("[OK] add_sensor_info_to_alarms method exists")
+    else:
+        print("[ERROR] add_sensor_info_to_alarms method missing")
+
+    # Check if the load_data method was updated
+    if 'header=4' in content and 'Remote SN' in content:
+        print("[OK] load_data method updated with proper header reading")
+    else:
+        print("[ERROR] load_data method not properly updated")
+
+    # Check if sensor info is added to paired events
+    if 'Sensor_Name' in content and 'Sensor_Group' in content and 'Sensor_Type' in content:
+        print("[OK] Sensor information added to paired events")
+    else:
+        print("[ERROR] Sensor information not properly added to paired events")
+
+    # Check if group-based analysis was added
+    if 'group_counts' in content and 'mtbf_by_group' in content:
+        print("[OK] Group-based analysis added to basic and advanced analysis")
+    else:
+        print("[ERROR] Group-based analysis not properly added")
+
+    # Check if group-based visualizations were added
+    if 'Group-Based Analysis Dashboard' in content:
+        print("[OK] Group-based visualizations added")
+    else:
+        print("[ERROR] Group-based visualizations not properly added")
+
+    # Check if group-based exports were added
+    if 'group_statistics.csv' in content:
+        print("[OK] Group-based exports added")
+    else:
+        print("[ERROR] Group-based exports not properly added")
+    
+    print("\nStructural checks finished; review any [ERROR] lines above.")
+
+def test_logic():
+    """Test the logic of the changes"""
+    print("\nTesting the logic of the changes...")
+    
+    # Check that the updated main section uses the correct file name
+    with open('alarm_analyzer.py', 'r', encoding='utf-8') as f:
+        content = f.read()
+    
+    if 'SensorReport Cardinal 2025-12-23_processed.xlsx' in content:
+        print("[OK] Main section updated with correct sensor report file name")
+    else:
+        print("[ERROR] Main section not updated with correct sensor report file name")
+    
+    print("Logic validation completed!")
+
+if __name__ == "__main__":
+    print("Validating changes made to alarm_analyzer.py...")
+    test_code_structure()
+    test_logic()
+    print("\nValidation finished; see any [ERROR] lines above.")
--- a/test_duration_fix.py
+++ b/test_duration_fix.py
@@ -0,0 +1,81 @@
+#!/usr/bin/env python3
+"""
+Test script to verify the fix for alarm duration calculation
+"""
+import sys
+import os
+sys.path.append(os.path.dirname(os.path.abspath(__file__)))
+
+from alarm_analyzer import AlarmAnalyzer
+
+def test_duration_calculation():
+    """
+    Test the updated duration calculation with sample data
+    """
+    print("Testing updated duration calculation...")
+    
+    # Use the existing files
+    csv_file = "CardinalAlarmsDec25.csv"
+    xlsx_file = "SensorReport Cardinal 2025-12-23_processed.xlsx"
+    
+    if not os.path.exists(csv_file):
+        print(f"CSV file {csv_file} not found. Creating a small test file...")
+        # Create a minimal test file
+        test_data = """Alarm_Id,Sensor_Id,Date,Description,LogTime
+1,1001,2025-12-01 00:01:00.000,"Lo Warning: 68.0<=68.0F         ",2025-12-01 00:01:01.077
+2,1001,2025-12-01 00:05:00.000,"Lo Alarm: 67.5<=68.0F         ",2025-12-01 00:05:01.077
+3,1001,2025-12-01 00:10:00.000,"Normal 68.2F         ",2025-12-01 00:10:01.077
+4,1002,2025-12-01 00:02:00.000,"Error: Comm Loss Error 20.4>=20 min.",2025-12-01 00:02:01.077
+5,1002,2025-12-01 00:07:00.000,"Hi Alarm: 70.0>=68.0F         ",2025-12-01 00:07:01.077
+6,1002,2025-12-01 00:12:00.000,"Normal 69.5F         ",2025-12-01 00:12:01.077"""
+        
+        with open(csv_file, 'w') as f:
+            f.write(test_data)
+    
+    # Create analyzer instance
+    analyzer = AlarmAnalyzer(csv_file, xlsx_file)
+    
+    try:
+        # Load data
+        alarm_data, sensor_data = analyzer.load_data()
+        print(f"Loaded {len(alarm_data)} alarm records")
+        
+        # Categorize alarms
+        categorized_data = analyzer.categorize_alarms()
+        print("Categorized alarms successfully")
+        
+        # Pair events and calculate durations
+        paired_events = analyzer.pair_events_and_calculate_durations()
+        
+        if paired_events is not None and len(paired_events) > 0:
+            print(f"Created {len(paired_events)} paired events")
+            print("\nFirst few paired events:")
+            print(paired_events.reindex(columns=['Sensor_Id', 'Alarm_Type', 'Start_Time', 'End_Time', 'Duration_Minutes', 'End_Reason']).head(10))
+            
+            # Check if End_Reason column exists
+            if 'End_Reason' in paired_events.columns:
+                print("\nEnd reason distribution:")
+                print(paired_events['End_Reason'].value_counts())
+            else:
+                print("ERROR: End_Reason column not found in paired events")
+                
+            # Check for transitions
+            if 'End_Reason' in paired_events.columns:
+                transitions = paired_events[paired_events['End_Reason'].str.contains('Transition', na=False)]
+                if len(transitions) > 0:
+                    print(f"\nFound {len(transitions)} alarm condition transitions:")
+                    print(transitions[['Sensor_Id', 'Alarm_Type', 'Start_Description', 'End_Description', 'Duration_Minutes', 'End_Reason']])
+                else:
+                    print("\nNo alarm condition transitions found in this sample.")
+        else:
+            print("No paired events created")
+        
+        print("Test completed successfully!")
+        
+    except Exception as e:
+        print(f"Error during test: {e}")
+        import traceback
+        traceback.print_exc()
+
+if __name__ == "__main__":
+    test_duration_calculation()
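
The sample rows in the test above show the `Description` formats the analyzer has to classify. As a rough sketch of the regex-based categorization mentioned in `CLAUDE.md`, the function below maps a description to one of the six documented alarm types. The function name `categorize_description`, the exact pattern, and the `'Unknown'` fallback are assumptions; the real logic in `AlarmAnalyzer.categorize_alarms` may differ.

```python
import re

# Hypothetical pattern: the alarm type is assumed to be the leading token(s)
# of the Description field, as in the sample rows of the test file.
_ALARM_TYPE_RE = re.compile(
    r'\s*(Hi Alarm|Lo Alarm|Hi Warning|Lo Warning|Error|Normal)\b'
)


def categorize_description(description):
    """Return the alarm type found at the start of a description, else 'Unknown'."""
    match = _ALARM_TYPE_RE.match(str(description))
    return match.group(1) if match else 'Unknown'


for text in (
    "Lo Warning: 68.0<=68.0F         ",
    "Hi Alarm: 70.0>=68.0F         ",
    "Error: Comm Loss Error 20.4>=20 min.",
    "Normal 68.2F         ",
):
    print(f"{categorize_description(text):<10} <- {text.strip()}")
```

Under this reading, a `Lo Warning` followed by a `Lo Alarm` on the same sensor would be a transition between alarm conditions, and a later `Normal` row would close the event.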
--- a/test_enhanced_plotting.py
+++ b/test_enhanced_plotting.py
@@ -0,0 +1,144 @@
+#!/usr/bin/env python
+# Test script to verify enhanced plotting functionality without creating actual plots
+
+from alarm_analyzer import AlarmAnalyzer
+import pandas as pd
+
+def test_enhanced_plotting():
+    print("Testing enhanced plotting functionality...")
+    
+    # Create analyzer instance
+    analyzer = AlarmAnalyzer('CardinalAlarmsDec25.csv', 'SensorReport Cardinal 2025-12-23_processed.xlsx')
+
+    print("Loading data...")
+    # Load data
+    alarm_data, sensor_data = analyzer.load_data()
+    print(f"Loaded {len(alarm_data)} alarm records")
+
+    if analyzer.sensor_mapping:
+        print(f"Created sensor mapping for {len(analyzer.sensor_mapping)} sensors")
+    else:
+        print("No sensor mapping created - sensor report may not have been processed correctly")
+
+    print("Categorizing alarms...")
+    # Categorize alarms
+    categorized_data = analyzer.categorize_alarms()
+
+    print("Pairing events and calculating durations...")
+    # Pair events and calculate durations
+    paired_events = analyzer.pair_events_and_calculate_durations()
+
+    # Test the sensor name mapping logic without creating plots
+    print("\n--- TESTING ENHANCED PLOTTING LOGIC ---")
+    
+    # Filter resolved events for testing
+    duration_events = analyzer.processed_events[analyzer.processed_events['Duration_Minutes'].notna()].copy()
+    
+    if len(duration_events) == 0:
+        print("No resolved events with duration data available for testing.")
+        return
+
+    # Extract time components for time-based analysis
+    duration_events['Start_Hour'] = duration_events['Start_Time'].dt.hour
+    duration_events['Start_DayOfWeek'] = duration_events['Start_Time'].dt.day_name()
+    duration_events['Start_Date'] = duration_events['Start_Time'].dt.date
+
+    print("\nTesting sensor name mapping for top sensors by alarm count...")
+    # Top 10 sensors by alarm count - with sensor names instead of IDs
+    top_sensors = duration_events['Sensor_Id'].value_counts().head(10)
+    sensor_names_for_plot = []
+    for sensor_id in top_sensors.index:
+        sensor_info = analyzer.sensor_mapping.get(sensor_id, {})
+        sensor_name = sensor_info.get('name', f'ID: {sensor_id}')
+        sensor_group = sensor_info.get('group', 'Unknown')
+        sensor_names_for_plot.append(f"{sensor_name}\n({sensor_group})")
+    
+    print("Sample of enhanced sensor labels for plotting:")
+    for i, (sensor_id, count) in enumerate(top_sensors.head(5).items()):
+        print(f"  {sensor_names_for_plot[i]}: {count} alarms")
+
+    print("\nTesting sensor name mapping for average duration...")
+    # Top 10 sensors by average duration - with sensor names instead of IDs
+    avg_duration_by_sensor = duration_events.groupby('Sensor_Id')['Duration_Minutes'].mean().sort_values(ascending=False).head(10)
+    sensor_names_for_plot_avg = []
+    for sensor_id in avg_duration_by_sensor.index:
+        sensor_info = analyzer.sensor_mapping.get(sensor_id, {})
+        sensor_name = sensor_info.get('name', f'ID: {sensor_id}')
+        sensor_group = sensor_info.get('group', 'Unknown')
+        sensor_names_for_plot_avg.append(f"{sensor_name} (Group: {sensor_group})")
+    
+    print("Sample of enhanced sensor labels for average duration plotting:")
+    for i, (sensor_id, avg_duration) in enumerate(avg_duration_by_sensor.head(5).items()):
+        print(f"  {sensor_names_for_plot_avg[i]}: {avg_duration:.2f} minutes")
+
+    print("\nTesting group-based visualizations...")
+    if 'Sensor_Group' in duration_events.columns:
+        print("Group-based visualizations would be created...")
+        
+        # Test group composition analysis
+        if analyzer.sensor_mapping:
+            # Create a mapping of group to number of sensors
+            group_to_sensor_count = {}
+            for sensor_id, sensor_info in analyzer.sensor_mapping.items():
+                group = sensor_info.get('group', 'Unknown')
+                if group not in group_to_sensor_count:
+                    group_to_sensor_count[group] = 0
+                group_to_sensor_count[group] += 1
+            
+            # Convert to dataframe and sort
+            group_sensor_counts = pd.DataFrame(
+                list(group_to_sensor_count.items()), 
+                columns=['Group', 'Sensor_Count']
+            ).sort_values('Sensor_Count', ascending=False).head(10)
+            
+            print("Sample of group composition data:")
+            for _, row in group_sensor_counts.head(5).iterrows():
+                print(f"  {row['Group']}: {row['Sensor_Count']} sensors")
+
+        # Test alarm type distribution by group
+        alarm_type_by_group = duration_events.groupby(['Sensor_Group', 'Alarm_Type']).size().unstack(fill_value=0)
+        top_10_groups = duration_events['Sensor_Group'].value_counts().head(10).index
+        alarm_type_by_group_top = alarm_type_by_group.loc[top_10_groups]
+        
+        print("Sample of alarm type distribution by group:")
+        sample_groups = alarm_type_by_group_top.head(3)
+        for group in sample_groups.index:
+            print(f"  {group}:")
+            for alarm_type in sample_groups.columns:
+                count = sample_groups.loc[group, alarm_type]
+                if count > 0:
+                    print(f"    {alarm_type}: {count} alarms")
+
+        # Test group alarm intensity; first count the distinct sensors that alarmed in each group
+        alarming_sensors_by_group = duration_events.groupby('Sensor_Group')['Sensor_Id'].nunique().to_dict()
+        
+        # Calculate total sensors per group from mapping
+        group_to_sensor_count = {}
+        for sensor_id, sensor_info in analyzer.sensor_mapping.items():
+            group = sensor_info.get('group', 'Unknown')
+            if group not in group_to_sensor_count:
+                group_to_sensor_count[group] = 0
+            group_to_sensor_count[group] += 1
+        
+        # Calculate alarms per sensor ratio
+        group_alarm_intensity = {}
+        for group in duration_events['Sensor_Group'].dropna().unique():
+            total_alarms = len(duration_events[duration_events['Sensor_Group'] == group])
+            total_sensors = group_to_sensor_count.get(group, 1)  # Default to 1 when the group is missing from the mapping, which also avoids division by zero
+            group_alarm_intensity[group] = total_alarms / total_sensors
+        
+        # Convert to DataFrame and sort
+        intensity_df = pd.DataFrame(
+            list(group_alarm_intensity.items()), 
+            columns=['Group', 'Alarms_Per_Sensor']
+        ).sort_values('Alarms_Per_Sensor', ascending=False).head(10)
+        
+        print("Sample of group alarm intensity:")
+        for _, row in intensity_df.head(5).iterrows():
+            print(f"  {row['Group']}: {row['Alarms_Per_Sensor']:.2f} alarms per sensor")
+
+    print("\nEnhanced plotting logic ran to completion without errors (no assertions are made; inspect the output above).")
+    print("The enhanced plotting functionality is ready to use when matplotlib and seaborn are available.")
+
+if __name__ == '__main__':
+    test_enhanced_plotting()
--- a/test_mapping.py
+++ b/test_mapping.py
@@ -0,0 +1,51 @@
+#!/usr/bin/env python
+# Test script to check the mapping functionality
+
+import pandas as pd
+from alarm_analyzer import AlarmAnalyzer
+
+def test_mapping():
+    print("Creating analyzer instance...")
+    analyzer = AlarmAnalyzer('CardinalAlarmsDec25.csv', 'SensorReport Cardinal 2025-12-23_processed.xlsx')
+
+    print("Loading data...")
+    alarm_data, sensor_data = analyzer.load_data()
+
+    print(f"Created sensor mapping for {len(analyzer.sensor_mapping)} sensors")
+    
+    # Check whether sample Sensor_Id values from the alarm data are present in the sensor mapping
+    sample_sensor_ids = [9273, 3817, 8963, 7414, 9092, 9105, 7080, 9455, 9451, 3799]
+    print(f"Sample sensor IDs: {sample_sensor_ids}")
+    
+    found_in_mapping = []
+    for sensor_id in sample_sensor_ids:
+        if sensor_id in analyzer.sensor_mapping:
+            found_in_mapping.append(sensor_id)
+            print(f"  ID {sensor_id}: {analyzer.sensor_mapping[sensor_id]}")
+        else:
+            print(f"  ID {sensor_id}: NOT FOUND")
+    
+    print(f"Found {len(found_in_mapping)} out of {len(sample_sensor_ids)} sample IDs in mapping")
+    
+    # Check alarm data for sensor names and groups
+    print(f"\nSensor_Name column in alarm data: {'Sensor_Name' in analyzer.alarm_data.columns}")
+    print(f"Sensor_Group column in alarm data: {'Sensor_Group' in analyzer.alarm_data.columns}")
+    
+    if 'Sensor_Name' in analyzer.alarm_data.columns:
+        unique_names = analyzer.alarm_data['Sensor_Name'].unique()
+        print(f"Unique sensor names: {len(unique_names)} - {unique_names[:10]}")
+    
+    if 'Sensor_Group' in analyzer.alarm_data.columns:
+        unique_groups = analyzer.alarm_data['Sensor_Group'].unique()
+        print(f"Unique sensor groups: {len(unique_groups)} - {unique_groups[:10]}")
+    
+    # Print a few rows to confirm that sensor names and groups were mapped onto the alarm data
+    print("\nFirst 10 rows of alarm data with sensor info:")
+    cols_to_show = ['Sensor_Id', 'Sensor_Name', 'Sensor_Group', 'Description']
+    if all(col in analyzer.alarm_data.columns for col in cols_to_show):
+        print(analyzer.alarm_data[cols_to_show].head(10))
+    else:
+        print("Some columns not found in alarm data")
+
+if __name__ == "__main__":
+    test_mapping()