Conversation
|
There is now a mechanism for users to adjust the padding (ignored for newer snapshots where the cell bounding box is known), and tests for that. Once we agree on a desired default behaviour it's trivial to implement. |
|
What simulations are affected? Is it mainly a legacy problem or are colibre runs also in a state where they need 27x more data? |
|
I checked a few Colibre runs, they seem to have the metadata present. I also checked a few Flamingo runs, there the metadata is absent. |
|
@MatthieuSchaller @JBorrow @robjmcgibbon to make this concrete, I propose that:
In case it wasn't clear, padding by <1 cell helps because the region is generically not aligned with the cell grid, and the pad is not adding adjacent cells to be read but just making each cell slightly bigger when deciding whether its bbox touches the region. I feel like this strikes a reasonable balance between accuracy and minimizing unnecessary overheads. Does anyone want to voice an objection to the above? If not I'll wrap up the modifications and call it (actually) ready for review. |
|
The only tricky thing here I think would be to make sure that the extra padding doesn't lead to extra padding on things like the particle-level mask you can ask for. As long as the padding is only used for cell selection I see no issue with adding a small pad to the region to account for cases like this. |
|
It's implemented directly in the spatial masking function, so it can't "leak out" into other modes. Reading catalogue files (soap, fof) where particles drifting does not happen is also guarded (never padded). To keep things analogous with the case where bbox metadata is present, the pad is actually applied to the cells, not the region. This is part of a small conceptual change from the original implementation that actually padded the region by half a cell and looked for enclosed cell centres, but the result is the same (and yes, it wraps the periodic boundary properly). |
This is now implemented, so ready for review, I think! |
|
Ok. I agree with this last proposal. Better be safe for the old data. The new sims with more metadata will only read what is required. What about the buffer size in the old case? Are we not always reading one full extra cell? I must be missing something about the 0.1 |
|
@MatthieuSchaller Hmm I tried in words before, let's go with the diagram: The pad is applied cell by cell (so that when the metadata is available it can be variable). If the user asks for a region exactly aligns with the cell grid then both padding options will read the same set of cells. In this diagram the region is not along grid vertices so we save on I/O by padding less. |
|
Thanks. I am with you now. That sounds good to me. |
- Implemented using the cell bounding box metadata (min particle position and max particle position), if present. If not behaviour changes: we pad the selection by 0.1 cell length to make sure that we get any particles that have drifted out of their cells. If they've drifted by >0.1 cell length... too bad. - Added some tests as suggested by Test and document wrapped masking SWIFTSIM#88. The old test data doesn't include the new metadata (min/max positions). Which leads to... - Made some new test data, addressing Test data is very outdated SWIFTSIM#177. It's located at /cosma5/data/Swift/web-storage/IOExamples/ssio_ci_04_2025. I've left the old test data in place in case that's used by anyone else. There's a readme in the new test data directory with details. Next time we decide we need new test data we can just make a new subdirectory. I've also included some of the old test data files (symlinked in) as "legacy" examples to make sure we don't break backwards compatibility. I've also made sure that we have both a "single" and "distributed" file - Set up fixtures to either test with both the old and the new single and distributed test files (so we can ensure backwards-compatibility and compatibility with both types), or just the new single-snapshot file where this is not relevant. - Abandoned the cosmo_volume_example.hdf5 test data file and use the new single-file snapshot instead. - Documented the masking behaviour with respect to periodic boxes as suggested in Test and document wrapped masking SWIFTSIM#88. - Made the cell padding length configurable (including able to be disabled) in cases where the cell bounding box metadata is absent.

/cosma5/data/Swift/web-storage/IOExamples/ssio_ci_04_2025. I've left the old test data in place in case that's used by anyone else. There's a readme in the new test data directory with details. Next time we decide we need new test data we can just make a new subdirectory. I've also included some of the old test data files (symlinked in) as "legacy" examples to make sure we don't break backwards compatibility. I've also made sure that we have both a "single" and "distributed" filecosmo_volume_example.hdf5test data file and use the new single-file snapshot instead.Closes #121
Closes #88
Closes #177