# Stasis Warden - Testing Strategy

*Initial evaluation performed by Quinn (QA Architect) based on `prd.md` v1.4 and `architecture.md`.*

This document outlines the initial testing strategy for the Stasis Warden project, focusing on high-risk areas identified during the review of core product and architecture documents. The goal is to establish a robust testing architecture early to ensure we can build and iterate with confidence.

---

### 1. High Risk: Save/Load System Integrity

The data persistence logic in `GameStateManager` is the most critical system from a quality perspective. Data corruption in a save file can permanently halt a player's progress and ruin their experience.

**Testing Strategy:**

* **Unit Tests:** Each manager (`ResourceManager`, `CrewManager`, etc.) must have unit tests for its `get_data()` and `load_data()` methods. We need to verify that the data serialization and deserialization are perfectly symmetrical.
* **Full-Cycle Integration Tests:** We must create a dedicated test scene that orchestrates a full save/load cycle.
    1. Programmatically set up a complex game state (e.g., multiple crew with specific stats, some assigned to tasks, specific resources, unlocked rooms).
    2. Trigger `GameStateManager.save_game()`.
    3. Reset the entire game state.
    4. Trigger `GameStateManager.load_game()`.
    5. Assert with precision that the restored state is identical to the state before saving.
* **Corruption/Fuzz Testing:** We need tests that attempt to load invalid, malformed, or empty `savegame.dat` files. The game must handle these errors gracefully (e.g., by showing a "corrupted save" message and returning to the main menu) rather than crashing.

### 2. High Risk: State Machine Logic (`GameStateManager`)

The game's flow is controlled by a state machine (`IN_GAME`, `ROOM_SELECTION`, etc.). A bug in state transitions can easily lead to a soft-lock where the player is stuck and cannot provide input.
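
To make the soft-lock risk concrete, here is a minimal sketch of a table-driven state machine, written in Python purely for illustration rather than as Godot code. Only the `IN_GAME` and `ROOM_SELECTION` state names come from the project documents. The transition table, the action names (`assign_crew`, `pick_room`), and every class and function name below are hypothetical assumptions, not the actual `GameStateManager` API.

```python
from enum import Enum, auto


class GameState(Enum):
    IN_GAME = auto()
    ROOM_SELECTION = auto()


# Hypothetical transition table: which states each state may move to.
ALLOWED_TRANSITIONS = {
    GameState.IN_GAME: {GameState.ROOM_SELECTION},
    GameState.ROOM_SELECTION: {GameState.IN_GAME},
}

# Hypothetical input gating: which player actions each state accepts.
ACCEPTED_ACTIONS = {
    GameState.IN_GAME: {"assign_crew"},
    GameState.ROOM_SELECTION: {"pick_room"},
}


class StateMachine:
    def __init__(self, initial=GameState.IN_GAME):
        self.state = initial

    def transition_to(self, new_state):
        """Change state only if the table allows it; report success."""
        if new_state not in ALLOWED_TRANSITIONS[self.state]:
            return False
        self.state = new_state
        return True

    def handle_action(self, action):
        """Return True if the action is accepted in the current state."""
        return action in ACCEPTED_ACTIONS[self.state]


def find_dead_end_states():
    """States with no outgoing transition are soft-lock candidates."""
    return [s for s in GameState if not ALLOWED_TRANSITIONS.get(s)]
```

Keeping transitions and accepted inputs in plain tables means a test can loop over every state and check both the state change and the change in behavior. A check like `find_dead_end_states()` would flag any state the player can enter but never leave.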

**Testing Strategy:**

* **State Transition Tests:** Each possible state transition must be explicitly tested. For example, a test should confirm that when the game enters the `ROOM_SELECTION` state, player inputs related to the `IN_GAME` state (like trying to assign a crew member) are ignored. We must verify not only that the state changes correctly, but that the game's behavior changes with it.

### 3. High Risk: Resource & Economic Balance (`ResourceManager`)

The core gameplay loop depends on the `Power` economy. Bugs in resource generation, spending, or the signal-driven UI updates could make the game unplayable or trivial.

**Testing Strategy:**

* **Transactional Unit Tests:** The `ResourceManager`'s methods (`add_resource`, `spend_resource`) must be tested like database transactions. We need to validate edge cases like spending exactly all available power, attempting to spend more than available, and ensuring the `power_updated` signal fires with the correct payload every time.
* **Signal Listener Integration Tests:** We should have tests that simulate a UI action and include a test double that listens for the resulting signal from the `ResourceManager`. This verifies that our core signal-driven architecture is working as intended end to end.

### 4. Medium Risk: Procedural Generation (Crew & Rooms)

The random generation of crew stats/traits and room cards makes testing difficult. We cannot rely on chance for quality assurance.

**Testing Strategy:**

* **Isolate and Seed the RNG:** The logic for generating crew and selecting room cards must be refactored to accept an optional seed for the Random Number Generator.
* **Deterministic Unit Tests:** By providing a known seed, our tests can assert that the "random" outcomes are perfectly predictable. For a given seed, we should always get the exact same crew stats and the exact same set of three room cards. This makes testing repeatable and reliable.
* **Property-Based Testing:** For a more advanced approach, we can verify properties of the output. For example, a test can assert that a generated crew member's "Engineering" stat is *always* within the valid range (e.g., 1-10), regardless of the seed.

### 5. Medium Risk: Performance of the CRT Shader

The PRD sets specific performance targets (60/120 FPS), and the architecture document correctly identifies the full-screen CRT shader as a potential bottleneck.

**Testing Strategy:**

* **Automated Benchmarking:** This cannot be covered by traditional unit tests. We should create a dedicated benchmark scene in Godot that represents an average late-game state. A script will run this scene for a fixed number of frames (e.g., 10,000) and log the FPS. This test should be run automatically under different configurations (shader on/off, different quality settings) to ensure we do not have performance regressions as we add features.
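
The benchmark run described above would produce a list of per-frame durations that still has to be reduced to numbers we can compare against the PRD targets. The following is a minimal sketch of that reduction step, in Python for illustration only. The function names, the choice of a 99th-percentile frame time as the "slow frame" metric, and the pass/fail rule are all assumptions, not something specified in `prd.md` or `architecture.md`.

```python
import statistics


def summarize_frame_times(frame_times_ms):
    """Reduce per-frame durations (in milliseconds) to headline numbers."""
    if not frame_times_ms:
        raise ValueError("no frames recorded")
    ordered = sorted(frame_times_ms)
    n = len(ordered)
    # Nearest-rank 99th percentile, using integer math to avoid float rounding.
    p99_index = -(-99 * n // 100) - 1
    p99_ms = ordered[p99_index]
    avg_ms = statistics.fmean(frame_times_ms)
    return {
        "avg_fps": 1000.0 / avg_ms,
        "p99_frame_ms": p99_ms,
        "p99_fps": 1000.0 / p99_ms,
    }


def meets_target(summary, target_fps):
    """Hypothetical pass rule: average FPS must reach the target."""
    return summary["avg_fps"] >= target_fps
```

A benchmark job could call `summarize_frame_times` once per configuration (shader on, shader off, each quality setting) and fail the build when `meets_target(summary, 60)` or `meets_target(summary, 120)` is false for the relevant configuration. Tracking the 99th-percentile frame time alongside the average helps catch stutter that an average alone would hide.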