Summary
The Penman-Monteith evapotranspiration calculation can run on the GPU with one thread per grid cell. At 2.7M catchments, this achieves a 34.9x speedup over the CPU, with a maximum relative error of 1.44e-04 (from the GPU fast exp).
Benchmark (RTX 3060)
| Catchments | CPU | GPU | Speedup | Max rel error |
|---|---|---|---|---|
| 100K | 1 ms | 0.048 ms | 20.7x | 5.16e-06 |
| 1M | 9 ms | 0.243 ms | 37.0x | 5.00e-05 |
| 2.7M (NWM) | 24 ms | 0.687 ms | 34.9x | 1.44e-04 |
Zero NaNs, zero failures. The Penman-Monteith equation (saturation vapor pressure, psychrometric constant, aerodynamic resistance) maps naturally to the GPU because each grid cell is independent.
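To illustrate the one-thread-per-cell mapping, here is a minimal sketch. It is not the code from the linked kernel file. It assumes the FAO-56 form of Penman-Monteith with grass-reference resistances (`ra = 208 / u2`, `rs = 70 s/m`), synthetic inputs, and `__expf` as the fast exp. All function and variable names are hypothetical.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Assumption: the device path uses the fast intrinsic, the host path uses expf.
__host__ __device__ inline float pm_exp(float x) {
#ifdef __CUDA_ARCH__
    return __expf(x);
#else
    return expf(x);
#endif
}

// ET in mm/day for one cell. Inputs: air temperature [deg C], pressure [kPa],
// 2 m wind speed [m/s], Rn - G [MJ m-2 day-1], actual vapor pressure [kPa].
__host__ __device__ inline float penman_monteith(float t, float p, float u2,
                                                 float rn_g, float ea) {
    float es    = 0.6108f * pm_exp(17.27f * t / (t + 237.3f));    // saturation vapor pressure
    float delta = 4098.0f * es / ((t + 237.3f) * (t + 237.3f));   // slope of es curve
    float gamma = 0.000665f * p;                                  // psychrometric constant
    float ra    = 208.0f / fmaxf(u2, 0.5f);                       // aerodynamic resistance [s/m]
    float rs    = 70.0f;                                          // surface resistance [s/m]
    float rho   = p / (1.01f * (t + 273.0f) * 0.287f);            // air density [kg/m3]
    float aero  = 86400.0f * rho * 1.013e-3f * (es - ea) / ra;    // aerodynamic term
    float le    = (delta * rn_g + aero) / (delta + gamma * (1.0f + rs / ra));
    return le / 2.45f;                                            // latent heat 2.45 MJ/kg
}

// One thread per grid cell; cells do not depend on each other.
__global__ void pet_kernel(const float* t, const float* p, const float* u2,
                           const float* rn_g, const float* ea, float* et, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) et[i] = penman_monteith(t[i], p[i], u2[i], rn_g[i], ea[i]);
}

int main() {
    const int n = 1 << 20;
    std::vector<float> t(n), p(n), u2(n), rn_g(n), ea(n), et_gpu(n);
    for (int i = 0; i < n; ++i) {              // synthetic forcing, not real data
        t[i]    = 5.0f + 0.25f * (i % 100);
        p[i]    = 90.0f + 0.1f * (i % 113);
        u2[i]   = 0.5f + 0.05f * (i % 97);
        rn_g[i] = 5.0f + 0.1f * (i % 101);
        ea[i]   = 0.6f;
    }
    const size_t bytes = n * sizeof(float);
    auto upload = [bytes](const std::vector<float>& h) {
        float* d = nullptr;
        cudaMalloc(&d, bytes);
        cudaMemcpy(d, h.data(), bytes, cudaMemcpyHostToDevice);
        return d;
    };
    float *d_t = upload(t), *d_p = upload(p), *d_u = upload(u2);
    float *d_r = upload(rn_g), *d_e = upload(ea), *d_et = nullptr;
    cudaMalloc(&d_et, bytes);

    const int block = 256;
    pet_kernel<<<(n + block - 1) / block, block>>>(d_t, d_p, d_u, d_r, d_e, d_et, n);
    if (cudaGetLastError() != cudaSuccess) { std::printf("launch failed\n"); return 1; }
    cudaMemcpy(et_gpu.data(), d_et, bytes, cudaMemcpyDeviceToHost);

    // Compare against the same formula evaluated on the host.
    float max_rel = 0.0f;
    for (int i = 0; i < n; ++i) {
        float ref = penman_monteith(t[i], p[i], u2[i], rn_g[i], ea[i]);
        max_rel = fmaxf(max_rel, fabsf(et_gpu[i] - ref) / fabsf(ref));
    }
    std::printf("max relative error: %g\n", max_rel);

    cudaFree(d_t); cudaFree(d_p); cudaFree(d_u);
    cudaFree(d_r); cudaFree(d_e); cudaFree(d_et);
    return 0;
}
```

Sharing one `__host__ __device__` function between the kernel and the CPU reference means any difference in the result should come from the exp implementation and floating-point contraction, not from the formula itself.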
Code
https://github.com/consigcody94/parallel-prefix-rt/blob/master/benchmarks/cuda/owp_extended_kernels.cu