# Stasis Warden - Testing Strategy

*Initial evaluation performed by Quinn (QA Architect) based on `prd.md` v1.4 and `architecture.md`.*

This document outlines the initial testing strategy for the Stasis Warden project, focusing on high-risk areas identified during the review of core product and architecture documents. The goal is to establish a robust testing architecture early to ensure we can build and iterate with confidence.

---

### 1. High Risk: Save/Load System Integrity

The data persistence logic in `GameStateManager` is the most critical system from a quality perspective. Data corruption in a save file can permanently halt a player's progress and ruin their experience.

**Testing Strategy:**

*   **Unit Tests:** Each manager (`ResourceManager`, `CrewManager`, etc.) must have unit tests for its `get_data()` and `load_data()` methods. We need to verify that serialization and deserialization are symmetrical: passing the output of `get_data()` to `load_data()` on a fresh instance must reproduce the original state exactly.
*   **Full-Cycle Integration Tests:** We must create a dedicated test scene that orchestrates a full save/load cycle.
    1.  Programmatically set up a complex game state (e.g., multiple crew with specific stats, some assigned to tasks, specific resources, unlocked rooms).
    2.  Trigger `GameStateManager.save_game()`.
    3.  Reset the entire game state.
    4.  Trigger `GameStateManager.load_game()`.
    5.  Assert that the restored state is identical, field by field, to a snapshot of the state taken before saving.
*   **Corruption/Fuzz Testing:** We need tests that attempt to load invalid, malformed, or empty `savegame.dat` files. The game must handle these errors gracefully (e.g., by showing a "corrupted save" message and returning to the main menu) rather than crashing.
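
The full-cycle and corruption tests above can be sketched as follows. This is a minimal, engine-agnostic illustration in Python rather than the project's actual code: the JSON format, the in-memory save blob (instead of `savegame.dat`), the `snapshot()` helper, `SaveCorruptedError`, and the fields inside each manager are all assumptions made for the example. Only the `get_data()` / `load_data()` / `save_game()` / `load_game()` names come from the architecture.

```python
import json


class ResourceManager:
    """Hypothetical stand-in; only the get_data/load_data contract matters."""

    def __init__(self):
        self.power = 0

    def get_data(self):
        return {"power": self.power}

    def load_data(self, data):
        self.power = int(data["power"])


class CrewManager:
    """Hypothetical stand-in holding a list of crew dictionaries."""

    def __init__(self):
        self.crew = []

    def get_data(self):
        return {"crew": [dict(member) for member in self.crew]}

    def load_data(self, data):
        self.crew = [dict(member) for member in data["crew"]]


class SaveCorruptedError(Exception):
    """Assumed error type the UI would turn into a 'corrupted save' message."""


class GameStateManager:
    def __init__(self):
        self.resources = ResourceManager()
        self.crew = CrewManager()

    def snapshot(self):
        # Plain-data view of the whole state, used for before/after comparison.
        return {"resources": self.resources.get_data(), "crew": self.crew.get_data()}

    def save_game(self):
        # Assumption: returns a JSON string instead of writing a file.
        return json.dumps(self.snapshot())

    def load_game(self, blob):
        try:
            data = json.loads(blob)
            # Load into fresh managers so a failed load leaves current state untouched.
            new_resources, new_crew = ResourceManager(), CrewManager()
            new_resources.load_data(data["resources"])
            new_crew.load_data(data["crew"])
        except (ValueError, KeyError, TypeError) as exc:
            raise SaveCorruptedError(str(exc)) from exc
        self.resources, self.crew = new_resources, new_crew


def round_trip_matches(manager):
    """Steps 2-5: save, reset (fresh manager), load, compare snapshots."""
    before = manager.snapshot()
    blob = manager.save_game()
    restored = GameStateManager()
    restored.load_game(blob)
    return restored.snapshot() == before


def rejects_corrupt(blob):
    """True if a bad save raises SaveCorruptedError instead of crashing."""
    try:
        GameStateManager().load_game(blob)
    except SaveCorruptedError:
        return True
    return False
```

Comparing plain-data snapshots rather than live objects keeps the assertion independent of object identity. A real test suite would add truncated and randomly mutated files to the corrupt inputs.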

### 2. High Risk: State Machine Logic (`GameStateManager`)

The game's flow is controlled by a state machine (`IN_GAME`, `ROOM_SELECTION`, etc.). A bug in state transitions can easily lead to a soft-lock where the player is stuck and cannot provide input.

**Testing Strategy:**

*   **State Transition Tests:** Each possible state transition must be explicitly tested. For example, a test should confirm that when the game enters the `ROOM_SELECTION` state, player inputs related to the `IN_GAME` state (like trying to assign a crew member) are ignored. We must verify not only that the state changes correctly, but that the game's behavior changes with it.
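
As a rough illustration of such a test, the sketch below models the state machine in plain Python. The transition table, the action names (`assign_crew`, `pick_room`), and the `StateMachine` class are hypothetical; only the `IN_GAME` and `ROOM_SELECTION` state names come from the architecture. It also includes a simple check for states with no outgoing transition, which is one way a soft-lock can arise.

```python
from enum import Enum, auto


class GameState(Enum):
    IN_GAME = auto()
    ROOM_SELECTION = auto()


# Hypothetical tables: which transitions exist and which inputs each state accepts.
ALLOWED_TRANSITIONS = {
    (GameState.IN_GAME, GameState.ROOM_SELECTION),
    (GameState.ROOM_SELECTION, GameState.IN_GAME),
}
ALLOWED_ACTIONS = {
    GameState.IN_GAME: {"assign_crew"},
    GameState.ROOM_SELECTION: {"pick_room"},
}


class StateMachine:
    def __init__(self):
        self.state = GameState.IN_GAME
        self.handled = []  # record of inputs that were actually acted on

    def transition_to(self, new_state):
        """Change state only if the transition is in the table."""
        if (self.state, new_state) not in ALLOWED_TRANSITIONS:
            return False
        self.state = new_state
        return True

    def handle_input(self, action):
        """Ignore any input that does not belong to the current state."""
        if action not in ALLOWED_ACTIONS[self.state]:
            return False
        self.handled.append(action)
        return True


def states_without_exit():
    """States that no transition leaves; a non-empty result means a soft-lock risk."""
    sources = {source for source, _ in ALLOWED_TRANSITIONS}
    return [state for state in GameState if state not in sources]
```

A test would then drive the machine into `ROOM_SELECTION`, send an `IN_GAME` input, and assert both that `handle_input` returned `False` and that `handled` did not grow.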

### 3. High Risk: Resource & Economic Balance (`ResourceManager`)

The core gameplay loop depends on the `Power` economy. Bugs in resource generation, spending, or the signal-driven UI updates could make the game unplayable or trivially easy.

**Testing Strategy:**

*   **Transactional Unit Tests:** The `ResourceManager`'s methods (`add_resource`, `spend_resource`) must be tested like database transactions: each call either applies in full or leaves the balance unchanged. We need to validate edge cases such as spending exactly all available power and attempting to spend more than is available, and confirm that the `power_updated` signal fires with the correct payload on every successful change.
*   **Signal Listener Integration Tests:** We should have tests that simulate a UI action and use a test double that listens for the resulting signal from the `ResourceManager`. This verifies that our core signal-driven architecture works as intended end to end.
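
The sketch below shows what these two kinds of test might exercise, again in engine-agnostic Python. It rests on several assumptions: a single `Power` balance, methods that take only an amount and return a success flag, a `power_updated` payload equal to the new total, and no signal on a rejected call. The `connect()` method and `SignalSpy` class are hypothetical stand-ins for the engine's signal mechanism and a listening test double.

```python
class ResourceManager:
    def __init__(self, power=0):
        self.power = power
        self._listeners = []  # callbacks standing in for power_updated connections

    def connect(self, callback):
        self._listeners.append(callback)

    def _emit_power_updated(self):
        # Assumed payload: the new power total.
        for callback in self._listeners:
            callback(self.power)

    def add_resource(self, amount):
        if amount <= 0:
            return False
        self.power += amount
        self._emit_power_updated()
        return True

    def spend_resource(self, amount):
        # All-or-nothing: a rejected spend changes nothing and emits nothing.
        if amount <= 0 or amount > self.power:
            return False
        self.power -= amount
        self._emit_power_updated()
        return True


class SignalSpy:
    """Test double that records every payload it receives."""

    def __init__(self):
        self.payloads = []

    def __call__(self, new_power):
        self.payloads.append(new_power)
```

With a spy connected, a test can spend the exact balance, attempt an overspend, add power, and then assert on both the final balance and the exact sequence of payloads the spy recorded.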

### 4. Medium Risk: Procedural Generation (Crew & Rooms)

The random generation of crew stats/traits and room cards makes testing difficult. We cannot rely on chance for quality assurance.

**Testing Strategy:**

*   **Isolate and Seed the RNG:** The logic for generating crew and selecting room cards must be refactored to accept an optional seed for the Random Number Generator.
*   **Deterministic Unit Tests:** By providing a known seed, our tests can assert that the "random" outcomes are perfectly predictable. For a given seed, we should always get the exact same crew stats and the exact same set of three room cards. This makes testing repeatable and reliable.
*   **Property-Based Testing:** For a more advanced approach, we can verify properties of the output. For example, a test can assert that a generated crew member's "Engineering" stat is *always* within the valid range (e.g., 1-10), regardless of the seed.
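
A minimal sketch of seeded generation, using Python's standard `random.Random` as a stand-in for the engine's RNG. The stat names other than "Engineering", the room pool, and the function names are hypothetical, and the assumption that the three room cards are distinct is ours rather than the PRD's.

```python
import random

# Hypothetical content tables for illustration only.
STAT_NAMES = ("Engineering", "Medical", "Security")
ROOM_POOL = ["Reactor", "Hydroponics", "Med Bay", "Armory", "Cryo Vault"]
STAT_MIN, STAT_MAX = 1, 10


def make_rng(seed=None):
    """Optional seed: None gives normal randomness, an int gives a repeatable stream."""
    return random.Random(seed)


def generate_crew_member(rng):
    """Roll every stat from the injected RNG rather than a global one."""
    return {name: rng.randint(STAT_MIN, STAT_MAX) for name in STAT_NAMES}


def draw_room_cards(rng, count=3):
    """Pick distinct room cards from the pool using the injected RNG."""
    return rng.sample(ROOM_POOL, count)


def stats_in_range(member):
    """Property that must hold for any seed."""
    return all(STAT_MIN <= value <= STAT_MAX for value in member.values())
```

Passing the RNG in as a parameter is what makes both test styles possible: a deterministic test compares two runs with the same seed, while a property test loops over many seeds and checks `stats_in_range` on each result.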

### 5. Medium Risk: Performance of the CRT Shader

The PRD has specific performance targets (60/120 FPS), and the architecture document correctly identifies the full-screen CRT shader as a potential bottleneck.

**Testing Strategy:**

*   **Automated Benchmarking:** This cannot be covered by traditional unit tests. We should create a dedicated benchmark scene in Godot that represents an average late-game state. A script will run this scene for a fixed number of frames (e.g., 10,000) and log the resulting FPS. This test should be run automatically under different configurations (shader on/off, different quality settings) so that we catch performance regressions as we add features.
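
The benchmark scene itself has to live in Godot, but the pass/fail decision on its logged output can be automated separately. The sketch below assumes the benchmark script records one frame time in milliseconds per frame; the function names, the worst-frame metric, and the 5% tolerance are illustrative choices, not requirements from the PRD.

```python
def summarize_frame_times(frame_times_ms):
    """Reduce a list of per-frame durations (ms) to a few headline numbers."""
    if not frame_times_ms:
        raise ValueError("no frames recorded")
    total_ms = sum(frame_times_ms)
    return {
        "frames": len(frame_times_ms),
        # Average FPS = frames rendered / total seconds elapsed.
        "average_fps": 1000.0 * len(frame_times_ms) / total_ms,
        "worst_frame_ms": max(frame_times_ms),
    }


def meets_target(summary, target_fps, tolerance=0.05):
    """True if average FPS is within the (assumed) tolerance of the target."""
    return summary["average_fps"] >= target_fps * (1.0 - tolerance)
```

Running this once per configuration (for example shader on and shader off) and failing the build when `meets_target` returns `False` would turn the benchmark into a regression gate.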