Skip to content

Commit 8833bd5

Browse files
patvarillymistral-vibeclaude
committed
Enable Delphy core unit tests for Node.js/WASM
Implements plan 2026-04-23-01-enable-wasm-unit-tests.md: - Define shared WASM_LINKER_FLAGS in top-level CMakeLists.txt for common Emscripten linker options and refactor delphy target to use them - Add WASM-specific linker flags to tests target including -sEXIT_RUNTIME=1 (critical for test failures to propagate exit codes) and generate shell script wrapper for ./tests invocation - Replace absl::MockingBitGen with std::mt19937_64 in distributions_tests.cpp to avoid gmock hangs in WASM - Add NearOrNaN custom matcher in safe_gamma_math_tests.cpp to handle NaN comparisons from broken long double on Emscripten, add regression test - Add Node.js v20 setup and WASM test execution to CI workflow All 259 WASM tests and 261 native tests pass. Generated by Mistral Vibe. Co-Authored-By: Mistral Vibe <vibe@mistral.ai> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 480b569 commit 8833bd5

6 files changed

Lines changed: 399 additions & 30 deletions

File tree

.github/workflows/ci.yml

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -209,9 +209,17 @@ jobs:
209209
- name: Configure CMake (WASM)
210210
run: cmake --preset conan-emscripten-release
211211

212+
- name: Set up Node.js
213+
uses: actions/setup-node@v4
214+
with:
215+
node-version: '20'
216+
212217
- name: Build WASM
213218
run: cmake --build --preset conan-emscripten-release --parallel
214219

220+
- name: Run WASM tests
221+
run: ./build/wasm-release/tests/tests
222+
215223
- name: Upload WASM artifacts
216224
uses: actions/upload-artifact@v4
217225
with:

CMakeLists.txt

Lines changed: 22 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -85,8 +85,24 @@ find_package(SDL2_ttf QUIET)
8585
# Core library
8686
add_subdirectory(core)
8787

88-
# Tests (only useful in non-WASM builds, but we build them for WASM anyway to
89-
# ensure that no test code trips up Clang [the default compiler on Mac])
88+
# Define WASM linker flags early so tests/CMakeLists.txt can use them
89+
if(EMSCRIPTEN)
90+
set(WASM_LINKER_FLAGS
91+
-fwasm-exceptions
92+
-pthread
93+
# Don't mess up 64-bit integers; see https://v8.dev/features/wasm-bigint
94+
-sWASM_BIGINT
95+
-sINITIAL_MEMORY=16mb
96+
-sALLOW_MEMORY_GROWTH=1
97+
# Avoid grief with decoding UTF-8 strings in SharedArrayBuffers (which is what
98+
# the WASM linear memory looks like from the JS side). See here for details:
99+
# https://github.com/emscripten-core/emscripten/issues/18034
100+
-sTEXTDECODER=0
101+
)
102+
endif()
103+
104+
# Tests are built for WASM to both catch Clang compilation issues and to verify
105+
# WASM-specific functionality (e.g., gamma_q with promote_double<false> policy)
90106
cmake_policy(SET CMP0135 NEW)
91107
include(FetchContent)
92108
# To update to latest GoogleTest release:
@@ -131,24 +147,16 @@ endif()
131147

132148
# Build the WASM shim only if compiling with Emscripten
133149
if(EMSCRIPTEN)
150+
# delphy WASM target
134151
add_executable(delphy tools/delphy_wasm.cpp)
135152
target_link_libraries(delphy delphy_core)
136153
target_link_options(delphy PRIVATE
137-
--no-entry -fwasm-exceptions
154+
${WASM_LINKER_FLAGS}
155+
--no-entry
138156
$<$<CONFIG:DEBUG>:-gsource-map --source-map-base ./>
139-
-pthread -sPTHREAD_POOL_SIZE=navigator.hardwareConcurrency
157+
-sPTHREAD_POOL_SIZE=navigator.hardwareConcurrency
140158
-sEXPORTED_FUNCTIONS=_malloc,_free
141159
-sEXPORTED_RUNTIME_METHODS=UTF8ToString,stringToUTF8OnStack
142-
143-
# Don't mess up 64-bit integers; see https://v8.dev/features/wasm-bigint
144-
-sWASM_BIGINT
145-
-sINITIAL_MEMORY=16mb
146-
-sALLOW_MEMORY_GROWTH=1
147-
148-
# Avoid grief with decoding UTF-8 strings in SharedArrayBuffers (which is what
149-
# the WASM linear memory looks like from the JS side). See here for details:
150-
# https://github.com/emscripten-core/emscripten/issues/18034
151-
-sTEXTDECODER=0
152160
)
153161

154162
if(NOT CMAKE_BUILD_TYPE STREQUAL "Debug")
Lines changed: 306 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,306 @@
1+
# Enable Delphy Core Unit Tests for Node.js/WASM
2+
3+
## Summary
4+
5+
This plan enables running Delphy's core C++ unit test suite under Node.js using the
6+
Emscripten-compiled WASM module. Without running unit tests on WASM, we cannot
7+
automatically and continuously verify that WASM-specific fixes work correctly, and we
8+
wouldn't have immediate visibility on WASM-only bugs in the future.
9+
10+
A key motivating example: Emscripten's `long double` math functions are broken — they
11+
overflow at the `double` threshold — causing NaN in Boost's gamma functions when using
12+
the default `promote_double<true>` policy. A recent fix (see
13+
`2026-04-22-01-fix-gamma-q-nan-from-emscripten-broken-long-double.md`) introduced
14+
`safe_gamma_q`/`safe_gamma_q_inv` wrappers using `promote_double<false>` to work around
15+
this. Running unit tests on WASM ensures those wrappers remain correct.
16+
17+
## Background
18+
19+
### Current State
20+
- Unit tests compile into a native `tests` executable using Google Test
21+
- Unit tests are already built for WASM in CI (to catch Clang compilation issues), but not executed
22+
- CI runs native tests but not WASM tests
23+
24+
### Problems Discovered
25+
26+
Three issues prevent tests from running correctly under WASM/Node:
27+
28+
1. **Google Mock incompatibility**: Tests using `absl::MockingBitGen` (which relies
29+
on Google Mock, i.e., gmock) (e.g., `distributions_tests.cpp`) hang in WASM. The
30+
gmock framework uses `abort()` for failed expectations, which doesn't properly
31+
terminate in Emscripten.
32+
33+
2. **Boost `long double` promotion**: As described in
34+
`2026-04-22-01-fix-gamma-q-nan-from-emscripten-broken-long-double.md`, Emscripten's
35+
`long double` math functions are broken (they overflow at the `double` threshold), and
36+
Boost's default `long double` promotion causes NaN in `gamma_q`/`gamma_q_inv`. Tests
37+
in `safe_gamma_math_tests.cpp` use bare `boost::math::gamma_q()` calls as reference
38+
values to compare against the `safe_gamma_q()` wrappers — for some parameter
39+
combinations (large `a` values like 271.4, 500), those bare calls produce NaN on WASM,
40+
causing the comparison assertions to fail.
41+
42+
3. **Exit code swallowing**: By default Emscripten's `EXIT_RUNTIME` is disabled, meaning
43+
the Node.js process always exits with code 0 regardless of what `main()` returns. Without
44+
fixing this, all test failures are silently swallowed.
45+
46+
In addition, Emscripten's Node.js builds output a `tests.js` file (not a self-contained
47+
`tests` executable), so a shell script wrapper is needed to make `./tests` work.
48+
49+
## Proposed Changes
50+
51+
### Core Infrastructure
52+
53+
#### `CMakeLists.txt` (top-level)
54+
Refactor to define a shared `WASM_LINKER_FLAGS` variable for the common Emscripten
55+
linker options and update the stale comment above the GoogleTest fetch. The goal
56+
is to avoid duplicating the common parts of the `delphy` target's options list for
57+
the `tests` target; the `delphy` target should be compiled and linked with exactly the
58+
same effective options as it is now (flag ordering may change cosmetically).
59+
60+
**Stale comment fix:** The comment above `cmake_policy(SET CMP0135 NEW)` (currently
61+
"only useful in non-WASM builds, but we build them for WASM anyway to ensure that no
62+
test code trips up Clang") should be updated to:
63+
64+
```cmake
65+
# Tests are built for WASM to both catch Clang compilation issues and to verify
66+
# WASM-specific functionality (e.g., gamma_q with promote_double<false> policy)
67+
```
68+
69+
**Shared flags:** Define `WASM_LINKER_FLAGS` in a new `if(EMSCRIPTEN)` block *before*
70+
`add_subdirectory(tests)` (e.g., right after `add_subdirectory(core)`). This is necessary
71+
because CMake processes `add_subdirectory` immediately, so the variable must be set before
72+
`tests/CMakeLists.txt` is evaluated:
73+
74+
```cmake
75+
# Define WASM linker flags early so tests/CMakeLists.txt can use them
76+
if(EMSCRIPTEN)
77+
# Core Emscripten linker flags shared between delphy and tests targets
78+
set(WASM_LINKER_FLAGS
79+
-fwasm-exceptions
80+
-pthread
81+
# Don't mess up 64-bit integers; see https://v8.dev/features/wasm-bigint
82+
-sWASM_BIGINT
83+
-sINITIAL_MEMORY=16mb
84+
-sALLOW_MEMORY_GROWTH=1
85+
# Avoid grief with decoding UTF-8 strings in SharedArrayBuffers (which is what
86+
# the WASM linear memory looks like from the JS side). See here for details:
87+
# https://github.com/emscripten-core/emscripten/issues/18034
88+
-sTEXTDECODER=0
89+
)
90+
endif()
91+
```
92+
93+
**delphy target refactor:** In the existing `if(EMSCRIPTEN)` block near the bottom of
94+
the file, replace the inline flags with `${WASM_LINKER_FLAGS}`:
95+
96+
```cmake
97+
if(EMSCRIPTEN)
98+
# delphy WASM target
99+
add_executable(delphy tools/delphy_wasm.cpp)
100+
target_link_libraries(delphy delphy_core)
101+
target_link_options(delphy PRIVATE
102+
--no-entry
103+
${WASM_LINKER_FLAGS}
104+
-sPTHREAD_POOL_SIZE=navigator.hardwareConcurrency
105+
-sEXPORTED_FUNCTIONS=_malloc,_free
106+
-sEXPORTED_RUNTIME_METHODS=UTF8ToString,stringToUTF8OnStack
107+
$<$<CONFIG:DEBUG>:-gsource-map --source-map-base ./>
108+
)
109+
if(NOT CMAKE_BUILD_TYPE STREQUAL "Debug")
110+
# Only enable mimalloc in non-debug builds because it conflicts with -fsanitize=address
111+
target_link_options(delphy PRIVATE
112+
# Use a scalable memory allocator (the default allocator has high contention
113+
# and is slow in multi-threaded environments)
114+
# See https://emscripten.org/docs/porting/pthreads.html#allocator-performance
115+
-sMALLOC=mimalloc
116+
)
117+
endif()
118+
endif()
119+
```
120+
121+
Note: The only change to the `delphy` target is that the `-fwasm-exceptions`, `-pthread`,
122+
`-sWASM_BIGINT`, `-sINITIAL_MEMORY`, `-sALLOW_MEMORY_GROWTH`, and `-sTEXTDECODER` flags
123+
now come from `${WASM_LINKER_FLAGS}` instead of being inline — same flags, different
124+
source.
125+
126+
#### `tests/CMakeLists.txt`
127+
Add WASM-specific linker flags to the existing `tests` target, reusing the shared
128+
`WASM_LINKER_FLAGS` from the top-level:
129+
130+
```cmake
131+
if(EMSCRIPTEN)
132+
target_link_options(tests PRIVATE
133+
${WASM_LINKER_FLAGS}
134+
-sPTHREAD_POOL_SIZE=4
135+
# Without this, Emscripten's EXIT_RUNTIME defaults to false and the Node.js
136+
# process always exits with code 0, silently swallowing test failures.
137+
-sEXIT_RUNTIME=1
138+
)
139+
# Create a shell script wrapper so tests can be run as ./tests
140+
file(GENERATE OUTPUT "${CMAKE_CURRENT_BINARY_DIR}/tests"
141+
CONTENT "#!/bin/sh\ndir=$(dirname \"$0\")\nexec node \"$dir/$<TARGET_FILE_NAME:tests>\" \"$@\"\n"
142+
FILE_PERMISSIONS OWNER_READ OWNER_WRITE OWNER_EXECUTE
143+
GROUP_READ GROUP_EXECUTE
144+
WORLD_READ WORLD_EXECUTE
145+
)
146+
endif()
147+
```
148+
149+
**`-sEXIT_RUNTIME=1` is critical:** Emscripten defaults `EXIT_RUNTIME` to false, meaning
150+
the Node.js process always exits with code 0 regardless of what `main()` returns. Without
151+
this flag, test failures would be silently swallowed and CI would always report success.
152+
153+
Tests use a fixed pool size of 4 for deterministic, reproducible behavior across
154+
environments, while the `delphy` target uses `navigator.hardwareConcurrency` to maximize
155+
parallelism in production use.
156+
157+
**Shell script wrapper:** Emscripten's CMake toolchain sets `CMAKE_EXECUTABLE_SUFFIX` to
158+
`.js`, so the build produces `tests.js` (plus `tests.wasm` and `tests.worker.js`).
159+
Emscripten does *not* auto-generate a shell runner. The `file(GENERATE)` block above
160+
creates a `tests` shell script (no extension) in the same directory that invokes
161+
`node tests.js "$@"`, forwarding all arguments. This makes `--gtest_filter` work
162+
transparently:
163+
164+
```bash
165+
./build/wasm-release/tests/tests --gtest_filter=Distributions_test.*
166+
```
167+
168+
### Test-Related Fixes
169+
170+
These changes clean up rough edges once the basic test infrastructure is in place.
171+
172+
#### `tests/distributions_tests.cpp`
173+
Replace `absl::MockingBitGen` with a real deterministic RNG to avoid gmock hangs in WASM.
174+
The current test mocks the uniform RNG to return a fixed value (0.3), then checks both
175+
bounds (`Le`/`Ge`) and exact values (`DoubleNear`) against formulas that assume that
176+
fixed input. With a real RNG, the exact values are no longer predictable from a formula,
177+
so the `DoubleNear` assertions must be removed — only the bounds checks remain.
178+
179+
Changes:
180+
- Remove includes: `absl/random/mocking_bit_gen.h`, `absl/random/mock_distributions.h`
181+
- Keep `gmock/gmock.h` (already present; still needed for `testing::Le`, `testing::Ge`
182+
matchers; the `GTest::gmock` link dependency in `tests/CMakeLists.txt` stays as-is)
183+
- Add include: `<random>`
184+
- In `TEST(Distributions_test, bounded_exponential_distribution)`:
185+
- Replace `absl::MockingBitGen` with `std::mt19937_64{12345}`
186+
- Remove `EXPECT_CALL` setup
187+
- Remove the three `DoubleNear` exact-value assertions (one per test case:
188+
`unbounded_left`, `unbounded_right`, `bounded`)
189+
- Keep the `Le` and `Ge` bounds checks — these verify the core correctness property
190+
that `Bounded_exponential_distribution` respects its bounds
191+
192+
#### `tests/safe_gamma_math_tests.cpp`
193+
The tests call `safe_gamma_q()` and `safe_gamma_q_inv()` — the wrappers in
194+
`safe_gamma_math.h` that already use `promote_double<false>` internally. These are
195+
the functions under test and need no changes.
196+
197+
The problem is on the **right-hand side** of comparison assertions: several tests compare
198+
`safe_gamma_q(a, x)` against bare `boost::math::gamma_q(a, x)` (which uses the default
199+
`promote_double<true>` policy) as expected values. On WASM, those bare calls produce
200+
NaN for large parameters, causing the assertions to fail. (Small parameters like
201+
a=2, a=1, a=10 don't trigger the `long double` overflow and are unaffected.)
202+
203+
**Fix:** Add a `gmock/gmock.h` include and define a custom `NearOrNaN` matcher that
204+
behaves like `DoubleNear` but also accepts NaN expected values on Emscripten (where
205+
bare Boost gamma functions return NaN due to broken `long double`):
206+
207+
```cpp
208+
MATCHER_P2(NearOrNaN, expected, tolerance, "") {
209+
if (std::isnan(expected)) {
210+
#ifdef __EMSCRIPTEN__
211+
return true;
212+
#else
213+
*result_listener << "expected is NaN on non-Emscripten platform";
214+
return false;
215+
#endif
216+
}
217+
auto diff = std::abs(arg - expected);
218+
if (diff > tolerance) {
219+
*result_listener << "which is " << diff << " from " << expected;
220+
return false;
221+
}
222+
return true;
223+
}
224+
```
225+
226+
This turns verbose if/else blocks into single-line assertions:
227+
```cpp
228+
EXPECT_THAT(safe_gamma_q(271.4, 280.0), NearOrNaN(boost::math::gamma_q(271.4, 280.0), 1e-12));
229+
```
230+
231+
Affected tests (those using bare Boost calls that may produce NaN on Emscripten):
232+
- `SafeGammaMathTest.SafeGammaQNearMode` — a=271.4 (triggers overflow)
233+
- `SafeGammaMathTest.SafeGammaQLargeALargeX` — a=500 (triggers overflow)
234+
- `SafeGammaMathTest.SafeLogGammaIntegralBasic` — uses bare Boost calls
235+
- `SafeGammaMathTest.SafeLogGammaIntegralAtBounds` — a=271.4 (triggers overflow)
236+
237+
`SafeGammaMathTest.SafeGammaQBasic` (a=2, 1, 10) uses small parameters that don't
238+
trigger the overflow, so it keeps plain `EXPECT_NEAR`.
239+
240+
**Regression test:** Add a new `TEST(SafeGammaMathTest, PromoteDoubleRegression)` that
241+
uses `#ifdef __EMSCRIPTEN__` directly — this test must *fail* if the bare result is
242+
unexpectedly not-NaN on Emscripten (signaling the bug was fixed upstream):
243+
```cpp
244+
TEST(SafeGammaMathTest, PromoteDoubleRegression) {
245+
auto bare_result = boost::math::gamma_q(271.4, 6601.0);
246+
#ifdef __EMSCRIPTEN__
247+
EXPECT_TRUE(std::isnan(bare_result)) << "got " << bare_result;
248+
#else
249+
EXPECT_NEAR(bare_result, 0.0, 1e-12);
250+
#endif
251+
// safe_gamma_q uses promote_double<false> and works on all platforms
252+
EXPECT_NEAR(safe_gamma_q(271.4, 6601.0), 0.0, 1e-12);
253+
}
254+
```
255+
256+
### CI Integration
257+
258+
#### `.github/workflows/ci.yml`
259+
The existing WASM job already builds all targets (including `tests`) via its blanket
260+
`cmake --build --preset conan-emscripten-release --parallel` step. The only additions
261+
are a Node.js setup step (required for SharedArrayBuffer/pthreads support) and a test
262+
execution step after building:
263+
```yaml
264+
- name: Set up Node.js
265+
uses: actions/setup-node@v4
266+
with:
267+
node-version: '20'
268+
269+
- name: Run WASM tests
270+
run: ./build/wasm-release/tests/tests
271+
```
272+
273+
Node.js >= 16 is required for SharedArrayBuffer support (used by Emscripten's pthreads
274+
implementation). Node 20 is the current LTS and is used here for stability.
275+
276+
## Correctness Argument
277+
278+
The `promote_double<false>` policy (as described in
279+
`2026-04-22-01-fix-gamma-q-nan-from-emscripten-broken-long-double.md`) ensures numerical
280+
correctness across all platforms by keeping Boost's gamma calculations in `double`
281+
precision, matching the actual capabilities of Emscripten's math library.
282+
283+
Replacing mocks with real deterministic RNGs preserves test intent — the
284+
`Bounded_exponential_distribution` test verifies that outputs stay within bounds,
285+
which is the core correctness property. Exact value matching against mocked
286+
uniform outputs is less important than verifying the distribution's bounding logic.
287+
288+
The same test source code runs on both native and WASM platforms, ensuring
289+
consistent behavior. The full test suite passes under Node.js with these changes.
290+
291+
## Testing Strategy
292+
293+
1. Build WASM tests: `cmake --build --preset conan-emscripten-release --target tests`
294+
2. Run specific tests: `./build/wasm-release/tests/tests --gtest_filter=Distributions_test.*`
295+
3. Run all tests: `./build/wasm-release/tests/tests`
296+
4. Verify all tests pass (including new platform-aware regression tests for gamma_q with `promote_double<true>`)
297+
5. Native tests continue as primary validation in CI
298+
6. WASM tests added as CI check, failing the build if any test fails
299+
300+
## Files Modified
301+
302+
- `CMakeLists.txt` (top-level: define shared `WASM_LINKER_FLAGS` variable, update stale comment)
303+
- `tests/CMakeLists.txt` (add WASM linker flags with `-sEXIT_RUNTIME=1`, `file(GENERATE)` shell script wrapper for `./tests` invocation)
304+
- `tests/distributions_tests.cpp` (replace `absl::MockingBitGen` with real RNG, remove exact-value assertions)
305+
- `tests/safe_gamma_math_tests.cpp` (add `NearOrNaN` custom matcher for bare Boost comparisons, add regression test)
306+
- `.github/workflows/ci.yml` (add Node.js v20 setup and WASM test step)

0 commit comments

Comments
 (0)