|
| 1 | +# Enable Delphy Core Unit Tests for Node.js/WASM |
| 2 | + |
| 3 | +## Summary |
| 4 | + |
| 5 | +This plan enables running Delphy's core C++ unit test suite under Node.js using the |
| 6 | +Emscripten-compiled WASM module. Without running unit tests on WASM, we cannot |
| 7 | +automatically and continuously verify that WASM-specific fixes work correctly, and we |
| 8 | +wouldn't have immediate visibility on WASM-only bugs in the future. |
| 9 | + |
| 10 | +A key motivating example: Emscripten's `long double` math functions are broken — they |
| 11 | +overflow at the `double` threshold — causing NaN in Boost's gamma functions when using |
| 12 | +the default `promote_double<true>` policy. A recent fix (see |
| 13 | +`2026-04-22-01-fix-gamma-q-nan-from-emscripten-broken-long-double.md`) introduced |
| 14 | +`safe_gamma_q`/`safe_gamma_q_inv` wrappers using `promote_double<false>` to work around |
| 15 | +this. Running unit tests on WASM ensures those wrappers remain correct. |
| 16 | + |
| 17 | +## Background |
| 18 | + |
| 19 | +### Current State |
| 20 | +- Unit tests compile into a native `tests` executable using Google Test |
| 21 | +- Unit tests are already built for WASM in CI (to catch Clang compilation issues), but not executed |
| 22 | +- CI runs native tests but not WASM tests |
| 23 | + |
| 24 | +### Problems Discovered |
| 25 | + |
| 26 | +Three issues prevent tests from running correctly under WASM/Node: |
| 27 | + |
| 28 | +1. **Google Mock incompatibility**: Tests using `absl::MockingBitGen` (which relies |
| 29 | + on Google Mock, i.e., gmock) (e.g., `distributions_tests.cpp`) hang in WASM. The |
| 30 | + gmock framework uses `abort()` for failed expectations, which doesn't properly |
| 31 | + terminate in Emscripten. |
| 32 | + |
| 33 | +2. **Boost `long double` promotion**: As described in |
| 34 | + `2026-04-22-01-fix-gamma-q-nan-from-emscripten-broken-long-double.md`, Emscripten's |
| 35 | + `long double` math functions are broken (they overflow at the `double` threshold), and |
| 36 | + Boost's default `long double` promotion causes NaN in `gamma_q`/`gamma_q_inv`. Tests |
| 37 | + in `safe_gamma_math_tests.cpp` use bare `boost::math::gamma_q()` calls as reference |
| 38 | + values to compare against the `safe_gamma_q()` wrappers — for some parameter |
| 39 | + combinations (large `a` values like 271.4, 500), those bare calls produce NaN on WASM, |
| 40 | + causing the comparison assertions to fail. |
| 41 | + |
| 42 | +3. **Exit code swallowing**: By default Emscripten's `EXIT_RUNTIME` is disabled, meaning |
| 43 | + the Node.js process always exits with code 0 regardless of what `main()` returns. Without |
| 44 | + fixing this, all test failures are silently swallowed. |
| 45 | + |
| 46 | +In addition, Emscripten's Node.js builds output a `tests.js` file (not a self-contained |
| 47 | +`tests` executable), so a shell script wrapper is needed to make `./tests` work. |
| 48 | + |
| 49 | +## Proposed Changes |
| 50 | + |
| 51 | +### Core Infrastructure |
| 52 | + |
| 53 | +#### `CMakeLists.txt` (top-level) |
| 54 | +Refactor to define a shared `WASM_LINKER_FLAGS` variable for the common Emscripten |
| 55 | +linker options and update the stale comment above the GoogleTest fetch. The goal |
| 56 | +is to avoid duplicating the common parts of the `delphy` target's options list for |
| 57 | +the `tests` target; the `delphy` target should be compiled and linked with exactly the |
| 58 | +same effective options as it is now (flag ordering may change cosmetically). |
| 59 | + |
| 60 | +**Stale comment fix:** The comment above `cmake_policy(SET CMP0135 NEW)` (currently |
| 61 | +"only useful in non-WASM builds, but we build them for WASM anyway to ensure that no |
| 62 | +test code trips up Clang") should be updated to: |
| 63 | + |
| 64 | +```cmake |
| 65 | +# Tests are built for WASM to both catch Clang compilation issues and to verify |
| 66 | +# WASM-specific functionality (e.g., gamma_q with promote_double<false> policy) |
| 67 | +``` |
| 68 | + |
| 69 | +**Shared flags:** Define `WASM_LINKER_FLAGS` in a new `if(EMSCRIPTEN)` block *before* |
| 70 | +`add_subdirectory(tests)` (e.g., right after `add_subdirectory(core)`). This is necessary |
| 71 | +because CMake processes `add_subdirectory` immediately, so the variable must be set before |
| 72 | +`tests/CMakeLists.txt` is evaluated: |
| 73 | + |
| 74 | +```cmake |
| 75 | +# Define WASM linker flags early so tests/CMakeLists.txt can use them |
| 76 | +if(EMSCRIPTEN) |
| 77 | + # Core Emscripten linker flags shared between delphy and tests targets |
| 78 | + set(WASM_LINKER_FLAGS |
| 79 | + -fwasm-exceptions |
| 80 | + -pthread |
| 81 | + # Don't mess up 64-bit integers; see https://v8.dev/features/wasm-bigint |
| 82 | + -sWASM_BIGINT |
| 83 | + -sINITIAL_MEMORY=16mb |
| 84 | + -sALLOW_MEMORY_GROWTH=1 |
| 85 | + # Avoid grief with decoding UTF-8 strings in SharedArrayBuffers (which is what |
| 86 | + # the WASM linear memory looks like from the JS side). See here for details: |
| 87 | + # https://github.com/emscripten-core/emscripten/issues/18034 |
| 88 | + -sTEXTDECODER=0 |
| 89 | + ) |
| 90 | +endif() |
| 91 | +``` |
| 92 | + |
| 93 | +**delphy target refactor:** In the existing `if(EMSCRIPTEN)` block near the bottom of |
| 94 | +the file, replace the inline flags with `${WASM_LINKER_FLAGS}`: |
| 95 | + |
| 96 | +```cmake |
| 97 | +if(EMSCRIPTEN) |
| 98 | + # delphy WASM target |
| 99 | + add_executable(delphy tools/delphy_wasm.cpp) |
| 100 | + target_link_libraries(delphy delphy_core) |
| 101 | + target_link_options(delphy PRIVATE |
| 102 | + --no-entry |
| 103 | + ${WASM_LINKER_FLAGS} |
| 104 | + -sPTHREAD_POOL_SIZE=navigator.hardwareConcurrency |
| 105 | + -sEXPORTED_FUNCTIONS=_malloc,_free |
| 106 | + -sEXPORTED_RUNTIME_METHODS=UTF8ToString,stringToUTF8OnStack |
| 107 | + $<$<CONFIG:DEBUG>:-gsource-map --source-map-base ./> |
| 108 | + ) |
| 109 | + if(NOT CMAKE_BUILD_TYPE STREQUAL "Debug") |
| 110 | + # Only enable mimalloc in non-debug builds because it conflicts with -fsanitize=address |
| 111 | + target_link_options(delphy PRIVATE |
| 112 | + # Use a scalable memory allocator (the default allocator has high contention |
| 113 | + # and is slow in multi-threaded environments) |
| 114 | + # See https://emscripten.org/docs/porting/pthreads.html#allocator-performance |
| 115 | + -sMALLOC=mimalloc |
| 116 | + ) |
| 117 | + endif() |
| 118 | +endif() |
| 119 | +``` |
| 120 | + |
| 121 | +Note: The only change to the `delphy` target is that the `-fwasm-exceptions`, `-pthread`, |
| 122 | +`-sWASM_BIGINT`, `-sINITIAL_MEMORY`, `-sALLOW_MEMORY_GROWTH`, and `-sTEXTDECODER` flags |
| 123 | +now come from `${WASM_LINKER_FLAGS}` instead of being inline — same flags, different |
| 124 | +source. |
| 125 | + |
| 126 | +#### `tests/CMakeLists.txt` |
| 127 | +Add WASM-specific linker flags to the existing `tests` target, reusing the shared |
| 128 | +`WASM_LINKER_FLAGS` from the top-level: |
| 129 | + |
| 130 | +```cmake |
| 131 | +if(EMSCRIPTEN) |
| 132 | + target_link_options(tests PRIVATE |
| 133 | + ${WASM_LINKER_FLAGS} |
| 134 | + -sPTHREAD_POOL_SIZE=4 |
| 135 | + # Without this, Emscripten's EXIT_RUNTIME defaults to false and the Node.js |
| 136 | + # process always exits with code 0, silently swallowing test failures. |
| 137 | + -sEXIT_RUNTIME=1 |
| 138 | + ) |
| 139 | + # Create a shell script wrapper so tests can be run as ./tests |
| 140 | + file(GENERATE OUTPUT "${CMAKE_CURRENT_BINARY_DIR}/tests" |
| 141 | + CONTENT "#!/bin/sh\ndir=$(dirname \"$0\")\nexec node \"$dir/$<TARGET_FILE_NAME:tests>\" \"$@\"\n" |
| 142 | + FILE_PERMISSIONS OWNER_READ OWNER_WRITE OWNER_EXECUTE |
| 143 | + GROUP_READ GROUP_EXECUTE |
| 144 | + WORLD_READ WORLD_EXECUTE |
| 145 | + ) |
| 146 | +endif() |
| 147 | +``` |
| 148 | + |
| 149 | +**`-sEXIT_RUNTIME=1` is critical:** Emscripten defaults `EXIT_RUNTIME` to false, meaning |
| 150 | +the Node.js process always exits with code 0 regardless of what `main()` returns. Without |
| 151 | +this flag, test failures would be silently swallowed and CI would always report success. |
| 152 | + |
| 153 | +Tests use a fixed pool size of 4 for deterministic, reproducible behavior across |
| 154 | +environments, while the `delphy` target uses `navigator.hardwareConcurrency` to maximize |
| 155 | +parallelism in production use. |
| 156 | + |
| 157 | +**Shell script wrapper:** Emscripten's CMake toolchain sets `CMAKE_EXECUTABLE_SUFFIX` to |
| 158 | +`.js`, so the build produces `tests.js` (plus `tests.wasm` and `tests.worker.js`). |
| 159 | +Emscripten does *not* auto-generate a shell runner. The `file(GENERATE)` block above |
| 160 | +creates a `tests` shell script (no extension) in the same directory that invokes |
| 161 | +`node tests.js "$@"`, forwarding all arguments. This makes `--gtest_filter` work |
| 162 | +transparently: |
| 163 | + |
| 164 | +```bash |
| 165 | +./build/wasm-release/tests/tests --gtest_filter=Distributions_test.* |
| 166 | +``` |
| 167 | + |
| 168 | +### Test-Related Fixes |
| 169 | + |
| 170 | +These changes clean up rough edges once the basic test infrastructure is in place. |
| 171 | + |
| 172 | +#### `tests/distributions_tests.cpp` |
| 173 | +Replace `absl::MockingBitGen` with a real deterministic RNG to avoid gmock hangs in WASM. |
| 174 | +The current test mocks the uniform RNG to return a fixed value (0.3), then checks both |
| 175 | +bounds (`Le`/`Ge`) and exact values (`DoubleNear`) against formulas that assume that |
| 176 | +fixed input. With a real RNG, the exact values are no longer predictable from a formula, |
| 177 | +so the `DoubleNear` assertions must be removed — only the bounds checks remain. |
| 178 | + |
| 179 | +Changes: |
| 180 | +- Remove includes: `absl/random/mocking_bit_gen.h`, `absl/random/mock_distributions.h` |
| 181 | +- Keep `gmock/gmock.h` (already present; still needed for `testing::Le`, `testing::Ge` |
| 182 | + matchers; the `GTest::gmock` link dependency in `tests/CMakeLists.txt` stays as-is) |
| 183 | +- Add include: `<random>` |
| 184 | +- In `TEST(Distributions_test, bounded_exponential_distribution)`: |
| 185 | + - Replace `absl::MockingBitGen` with `std::mt19937_64{12345}` |
| 186 | + - Remove `EXPECT_CALL` setup |
| 187 | + - Remove the three `DoubleNear` exact-value assertions (one per test case: |
| 188 | + `unbounded_left`, `unbounded_right`, `bounded`) |
| 189 | + - Keep the `Le` and `Ge` bounds checks — these verify the core correctness property |
| 190 | + that `Bounded_exponential_distribution` respects its bounds |
| 191 | + |
| 192 | +#### `tests/safe_gamma_math_tests.cpp` |
| 193 | +The tests call `safe_gamma_q()` and `safe_gamma_q_inv()` — the wrappers in |
| 194 | +`safe_gamma_math.h` that already use `promote_double<false>` internally. These are |
| 195 | +the functions under test and need no changes. |
| 196 | + |
| 197 | +The problem is on the **right-hand side** of comparison assertions: several tests compare |
| 198 | +`safe_gamma_q(a, x)` against bare `boost::math::gamma_q(a, x)` (which uses the default |
| 199 | +`promote_double<true>` policy) as expected values. On WASM, those bare calls produce |
| 200 | +NaN for large parameters, causing the assertions to fail. (Small parameters like |
| 201 | +a=2, a=1, a=10 don't trigger the `long double` overflow and are unaffected.) |
| 202 | + |
| 203 | +**Fix:** Add a `gmock/gmock.h` include and define a custom `NearOrNaN` matcher that |
| 204 | +behaves like `DoubleNear` but also accepts NaN expected values on Emscripten (where |
| 205 | +bare Boost gamma functions return NaN due to broken `long double`): |
| 206 | + |
| 207 | +```cpp |
| 208 | +MATCHER_P2(NearOrNaN, expected, tolerance, "") { |
| 209 | + if (std::isnan(expected)) { |
| 210 | +#ifdef __EMSCRIPTEN__ |
| 211 | + return true; |
| 212 | +#else |
| 213 | + *result_listener << "expected is NaN on non-Emscripten platform"; |
| 214 | + return false; |
| 215 | +#endif |
| 216 | + } |
| 217 | + auto diff = std::abs(arg - expected); |
| 218 | + if (diff > tolerance) { |
| 219 | + *result_listener << "which is " << diff << " from " << expected; |
| 220 | + return false; |
| 221 | + } |
| 222 | + return true; |
| 223 | +} |
| 224 | +``` |
| 225 | +
|
| 226 | +This turns verbose if/else blocks into single-line assertions: |
| 227 | +```cpp |
| 228 | +EXPECT_THAT(safe_gamma_q(271.4, 280.0), NearOrNaN(boost::math::gamma_q(271.4, 280.0), 1e-12)); |
| 229 | +``` |
| 230 | + |
| 231 | +Affected tests (those using bare Boost calls that may produce NaN on Emscripten): |
| 232 | +- `SafeGammaMathTest.SafeGammaQNearMode` — a=271.4 (triggers overflow) |
| 233 | +- `SafeGammaMathTest.SafeGammaQLargeALargeX` — a=500 (triggers overflow) |
| 234 | +- `SafeGammaMathTest.SafeLogGammaIntegralBasic` — uses bare Boost calls |
| 235 | +- `SafeGammaMathTest.SafeLogGammaIntegralAtBounds` — a=271.4 (triggers overflow) |
| 236 | + |
| 237 | +`SafeGammaMathTest.SafeGammaQBasic` (a=2, 1, 10) uses small parameters that don't |
| 238 | +trigger the overflow, so it keeps plain `EXPECT_NEAR`. |
| 239 | + |
| 240 | +**Regression test:** Add a new `TEST(SafeGammaMathTest, PromoteDoubleRegression)` that |
| 241 | +uses `#ifdef __EMSCRIPTEN__` directly — this test must *fail* if the bare result is |
| 242 | +unexpectedly not-NaN on Emscripten (signaling the bug was fixed upstream): |
| 243 | +```cpp |
| 244 | +TEST(SafeGammaMathTest, PromoteDoubleRegression) { |
| 245 | + auto bare_result = boost::math::gamma_q(271.4, 6601.0); |
| 246 | +#ifdef __EMSCRIPTEN__ |
| 247 | + EXPECT_TRUE(std::isnan(bare_result)) << "got " << bare_result; |
| 248 | +#else |
| 249 | + EXPECT_NEAR(bare_result, 0.0, 1e-12); |
| 250 | +#endif |
| 251 | + // safe_gamma_q uses promote_double<false> and works on all platforms |
| 252 | + EXPECT_NEAR(safe_gamma_q(271.4, 6601.0), 0.0, 1e-12); |
| 253 | +} |
| 254 | +``` |
| 255 | +
|
| 256 | +### CI Integration |
| 257 | +
|
| 258 | +#### `.github/workflows/ci.yml` |
| 259 | +The existing WASM job already builds all targets (including `tests`) via its blanket |
| 260 | +`cmake --build --preset conan-emscripten-release --parallel` step. The only additions |
| 261 | +are a Node.js setup step (required for SharedArrayBuffer/pthreads support) and a test |
| 262 | +execution step after building: |
| 263 | +```yaml |
| 264 | +- name: Set up Node.js |
| 265 | + uses: actions/setup-node@v4 |
| 266 | + with: |
| 267 | + node-version: '20' |
| 268 | +
|
| 269 | +- name: Run WASM tests |
| 270 | + run: ./build/wasm-release/tests/tests |
| 271 | +``` |
| 272 | + |
| 273 | +Node.js >= 16 is required for SharedArrayBuffer support (used by Emscripten's pthreads |
| 274 | +implementation). Node 20 is the current LTS and is used here for stability. |
| 275 | + |
| 276 | +## Correctness Argument |
| 277 | + |
| 278 | +The `promote_double<false>` policy (as described in |
| 279 | +`2026-04-22-01-fix-gamma-q-nan-from-emscripten-broken-long-double.md`) ensures numerical |
| 280 | +correctness across all platforms by keeping Boost's gamma calculations in `double` |
| 281 | +precision, matching the actual capabilities of Emscripten's math library. |
| 282 | + |
| 283 | +Replacing mocks with real deterministic RNGs preserves test intent — the |
| 284 | +`Bounded_exponential_distribution` test verifies that outputs stay within bounds, |
| 285 | +which is the core correctness property. Exact value matching against mocked |
| 286 | +uniform outputs is less important than verifying the distribution's bounding logic. |
| 287 | + |
| 288 | +The same test source code runs on both native and WASM platforms, ensuring |
| 289 | +consistent behavior. The full test suite passes under Node.js with these changes. |
| 290 | + |
| 291 | +## Testing Strategy |
| 292 | + |
| 293 | +1. Build WASM tests: `cmake --build --preset conan-emscripten-release --target tests` |
| 294 | +2. Run specific tests: `./build/wasm-release/tests/tests --gtest_filter=Distributions_test.*` |
| 295 | +3. Run all tests: `./build/wasm-release/tests/tests` |
| 296 | +4. Verify all tests pass (including new platform-aware regression tests for gamma_q with `promote_double<true>`) |
| 297 | +5. Native tests continue as primary validation in CI |
| 298 | +6. WASM tests added as CI check, failing the build if any test fails |
| 299 | + |
| 300 | +## Files Modified |
| 301 | + |
| 302 | +- `CMakeLists.txt` (top-level: define shared `WASM_LINKER_FLAGS` variable, update stale comment) |
| 303 | +- `tests/CMakeLists.txt` (add WASM linker flags with `-sEXIT_RUNTIME=1`, `file(GENERATE)` shell script wrapper for `./tests` invocation) |
| 304 | +- `tests/distributions_tests.cpp` (replace `absl::MockingBitGen` with real RNG, remove exact-value assertions) |
| 305 | +- `tests/safe_gamma_math_tests.cpp` (add `NearOrNaN` custom matcher for bare Boost comparisons, add regression test) |
| 306 | +- `.github/workflows/ci.yml` (add Node.js v20 setup and WASM test step) |
0 commit comments