From f04aa664f8b60f56ab4b86cd8c1daac76545d8b5 Mon Sep 17 00:00:00 2001
From: policyengine-bot <bot@policyengine.org>
Date: Tue, 9 Dec 2025 18:56:22 +0000
Subject: [PATCH] Add test suite performance optimization guidance
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adds section 11 to policyengine-testing-patterns-skill documenting best practices for optimizing test suite performance in country model repositories.

Key patterns:
- Use @pytest.mark.microsimulation for expensive tests
- Separate fast unit tests from slow microsimulation tests
- Provide make test/test-all/test-microsimulation targets
- Use module-scoped fixtures instead of module-level initialization
- Remove expensive validity tests that iterate all variables

Learned from policyengine-uk issue #1452 where these optimizations reduced:
- Test time from minutes to seconds
- Memory usage from >4GB to <1GB
- CI failures due to OOM errors

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
---
 .../SKILL.md                                  | 139 +++++++++++++++++-
 1 file changed, 138 insertions(+), 1 deletion(-)

diff --git a/skills/technical-patterns/policyengine-testing-patterns-skill/SKILL.md b/skills/technical-patterns/policyengine-testing-patterns-skill/SKILL.md
index dbe52a2..83f7392 100644
--- a/skills/technical-patterns/policyengine-testing-patterns-skill/SKILL.md
+++ b/skills/technical-patterns/policyengine-testing-patterns-skill/SKILL.md
@@ -400,6 +400,136 @@ Before submitting tests:
 
 ---
 
+## 11. Test Suite Performance Optimization
+
+### Overview
+
+PolicyEngine country model repositories have two types of tests:
+1. **Fast unit tests** - YAML tests run via policyengine-core (~seconds)
+2. **Slow microsimulation tests** - Load full datasets, simulate tax-benefit system (~minutes, high memory)
+
+### Problem
+
+Running all tests together causes:
+- Long CI times (several minutes)
+- High memory usage (>4GB)
+- Slow developer feedback loop
+- CI failures on memory-constrained environments
+
+### Solution: Pytest Markers
+
+Use `@pytest.mark.microsimulation` to separate expensive tests:
+
+```python
+# tests/microsimulation/test_reform_impacts.py
+import pytest
+
+@pytest.mark.microsimulation
+def test_reform_impact():
+    """Test that loads full dataset and runs simulation."""
+    baseline = Microsimulation()  # Loads ~100MB FRS/CPS dataset
+    # ... expensive test ...
+```
+
+### Makefile Pattern
+
+Standard pattern for country model repositories:
+
+```makefile
+test:
+	pytest -m "not microsimulation"
+
+test-microsimulation:
+	pytest -m microsimulation
+
+test-all:
+	pytest
+```
+
+This allows:
+- `make test` - Fast tests only (default for local dev)
+- `make test-all` - Everything (for final validation)
+- `make test-microsimulation` - Just the slow tests
+
+### Fixture Optimization
+
+**Problem:** Module-level baseline initialization loads datasets on import:
+
+```python
+# BAD - Loads 100MB dataset when file is imported
+baseline = Microsimulation()
+
+def test_something():
+    result = baseline.calculate(...)
+```
+
+**Solution:** Use pytest fixture with module scope:
+
+```python
+# GOOD - Loads once per test module, only when tests run
+import pytest
+
+@pytest.fixture(scope="module")
+def baseline():
+    return Microsimulation()
+
+def test_something(baseline):
+    result = baseline.calculate(...)
+```
+
+Benefits:
+- Doesn't load on import
+- Loads once per test file (not per test)
+- Memory released after module tests complete
+
+### Tests to Avoid
+
+Some test patterns are too expensive for regular execution:
+
+**Validity Tests** - Testing all variables with full dataset:
+```python
+# REMOVE - Tests 698 variables on full dataset
+def test_validity():
+    sim = Microsimulation()
+    for variable in sim.tax_benefit_system.variables:
+        sim.calculate(variable)  # Very slow!
+```
+
+These should be removed entirely, as:
+- They're redundant with integration tests
+- Take 30-60 seconds each
+- Use 200-400MB memory
+- Rarely catch bugs that integration tests don't
+
+### Configuration Files
+
+Add to `pytest.ini` or `pyproject.toml`:
+
+```ini
+[pytest]
+markers =
+    microsimulation: marks tests as slow microsimulation tests (deselect with '-m "not microsimulation"')
+```
+
+### When to Mark Tests
+
+Mark a test as `@pytest.mark.microsimulation` if it:
+- Loads full survey datasets (FRS, CPS, etc.)
+- Creates `Microsimulation()` objects
+- Runs full tax-benefit system simulations
+- Takes >5 seconds to run
+- Uses >100MB memory
+
+### Impact Metrics
+
+Example from policyengine-uk optimization:
+- Default test time: Minutes → Seconds
+- Default memory usage: >4GB → <1GB
+- CI success rate: Improved (fewer OOM failures)
+- Developer velocity: Faster feedback loop
+
+---
+
 ## For Agents
 
 When creating tests:
@@ -409,4 +539,11 @@ When creating tests:
 4. **Verify enum values** against actual code
 5. **Follow naming conventions** exactly
 6. **Include edge cases** at thresholds
-7. **Test realistic scenarios** not placeholders
\ No newline at end of file
+7. **Test realistic scenarios** not placeholders
+
+When optimizing test suites:
+1. **Identify slow tests** - Profile with `pytest --durations=10`
+2. **Add markers** - Mark microsimulation tests appropriately
+3. **Optimize fixtures** - Use module-scoped fixtures for expensive setup
+4. **Remove redundant tests** - Eliminate validity tests that just iterate all variables
+5. **Update Makefile** - Provide `test`, `test-all`, and `test-microsimulation` targets
\ No newline at end of file