Merged
60 commits
740cfe2
Move cell sizes to LHS preparing to replace with bbox.
kyleaoman Apr 18, 2025
43c3086
Replace cells with their bboxes in cell masking.
kyleaoman Apr 18, 2025
dfb1ffd
Remove a stray print statement.
kyleaoman Apr 18, 2025
03cb036
Fix formatting.
kyleaoman Apr 18, 2025
267c1c8
Add a test for periodic wrapping of the mask.
kyleaoman Apr 18, 2025
db1dde6
Add a test for padding wrapping properly (it fails).
kyleaoman Apr 18, 2025
00f7213
Run formatter.
kyleaoman Apr 18, 2025
eba993e
Add tests for mask wrapping and masking entire box explicitly.
kyleaoman Apr 18, 2025
c528b49
Try to speed up visualisation tests a bit.
kyleaoman Apr 18, 2025
a025057
Check redshift to 1 part in 1e8.
kyleaoman Apr 19, 2025
13d9f7c
Formatting.
kyleaoman Apr 19, 2025
b2052c3
Make single particle snapshot generator more general.
kyleaoman Apr 19, 2025
0218154
Make smoothing length match a bit less stringent.
kyleaoman Apr 19, 2025
e03ed7c
Improve tests and correct logic.
kyleaoman Apr 19, 2025
e0fd0f3
Don't select touching cells, only overlapping.
kyleaoman Apr 19, 2025
ef40d80
Copy cell bbox metadata if present.
kyleaoman Apr 21, 2025
3e82a57
Relax tolerance on smoothing length match a little bit.
kyleaoman Apr 21, 2025
3189676
Skip fields not present in newer test data when checking these files.
kyleaoman Apr 21, 2025
00daf04
Set expectation when cell bbox metadata present.
kyleaoman Apr 21, 2025
81825d8
Force projection of 'masses', not None.
kyleaoman Apr 21, 2025
64c39da
Don't accept Tcmb=0 default from swift snapshot.
kyleaoman Apr 21, 2025
70697ad
Add some tests for projection regions, moving @requires to a fixture.
kyleaoman Apr 15, 2025
51ef2c9
Fixtures to either test on new+old or just new cosmo volume datasets.
kyleaoman Apr 21, 2025
801ecfd
Adjust tests to use new fixtures.
kyleaoman Apr 21, 2025
ae2cf55
Temporarily add test data files to repo.
kyleaoman Apr 21, 2025
f7dc1ca
Merge branch 'master' into cell_bbox
kyleaoman Apr 21, 2025
78a6712
Actual data files, not symlinks.
kyleaoman Apr 21, 2025
ce3fa53
Use fixtures properly.
kyleaoman Apr 21, 2025
ef18b09
Merge branch 'master' into cell_bbox
kyleaoman Apr 21, 2025
bda2761
Switch to common test data in visualisation tests, and related improv…
kyleaoman Apr 22, 2025
257255e
Local setup for test data.
kyleaoman Apr 22, 2025
5a6edec
Merge branch 'cell_bbox' of github.com:SWIFTSIM/swiftsimio into cell_…
kyleaoman Apr 22, 2025
5e5eae5
Tweak tests for new sample data.
kyleaoman Apr 22, 2025
f6aa710
Merge branch 'master' into cell_bbox
kyleaoman Apr 22, 2025
f0f9abd
Use fixture that exists.
kyleaoman Apr 22, 2025
2a0eb6e
Remove temporary test data from repo.
kyleaoman Apr 22, 2025
a9b89e0
Reduce tolerance on img match to 98%.
kyleaoman Apr 22, 2025
31db240
Cleanup some workarounds for solved bugs.
kyleaoman Apr 23, 2025
86b599b
Point to new test data on cosma.
kyleaoman Apr 23, 2025
364019e
Expose cell sort order as an attribute.
kyleaoman Apr 23, 2025
e8474cd
Add a fixture for just the distributed data file.
kyleaoman Apr 23, 2025
2112ea1
Write cells in sorted order in subsets to agree with counts and offsets.
kyleaoman Apr 23, 2025
6588a52
Refactor out try/except in test.
kyleaoman Apr 23, 2025
c1d2106
Update masking documentation with examples for periodic boundaries.
kyleaoman Apr 23, 2025
ffc4ea0
Bump version.
kyleaoman Apr 23, 2025
def32fd
Make mask padding user configurable.
kyleaoman Apr 24, 2025
bd6bcd1
Resolve conflicts with master updates.
kyleaoman May 2, 2025
baf91a5
Set default pad to 0.2 and warn when min/maxposition metadata absent.
kyleaoman May 11, 2025
70a90a9
Run formatter.
kyleaoman May 11, 2025
760ce10
Updates to tests.
kyleaoman May 11, 2025
4538d3c
Cleanup warnings in tests.
kyleaoman May 11, 2025
342302d
Getting testdata requires https.
kyleaoman May 12, 2025
b5de921
Download the right file for dithered test!
kyleaoman May 12, 2025
73b57a9
Merge branch 'https_get_testdata' into cell_bbox
kyleaoman May 12, 2025
0d979ac
Silence a numpy warning in tests.
kyleaoman May 12, 2025
3cdae9c
Corrections and bugfixes in narrative docs.
kyleaoman May 12, 2025
baf48e7
Merge branch 'master' into cell_bbox
kyleaoman May 12, 2025
62824cf
Default pad 0.1 cells instead of 0.2.
kyleaoman May 12, 2025
4fef9f7
Add a helper function to quiet the new padding warning in tests.
kyleaoman May 12, 2025
41089d8
Add a test to check that warning is produced when cell bbox metadata …
kyleaoman May 12, 2025
150 changes: 141 additions & 9 deletions docs/source/masking/index.rst
@@ -10,14 +10,13 @@ with this. This functionality is provided through the :mod:`swiftsimio.masks`
sub-module but is available easily through the :meth:`swiftsimio.mask`
top-level function.

This functionality is used heavily in our `VELOCIraptor integration library`_
for only reading data that is near bound objects.
This functionality is used heavily in `swiftgalaxy`_.

There are two types of mask, with the default only allowing spatial masking.
Full masks require significantly more memory overhead and are generally much
slower than the spatial only mask.

.. _`VELOCIraptor integration library`: https://github.com/swiftsim/velociraptor-python
.. _`swiftgalaxy`: https://github.com/SWIFTSIM/swiftgalaxy

Spatial-only masking
--------------------
@@ -39,7 +38,7 @@ Example
^^^^^^^

In this example we will use the :obj:`swiftsimio.masks.SWIFTMask` object
to load the bottom left 'half' corner of the box.
to load the octant of the box closest to the origin.

.. code-block:: python

@@ -69,8 +68,8 @@ that are much larger than the available memory on your machine and process them
with ease.

It is also possible to build up a region with a more complicated geometry by
making repeated calls to :meth:`~swiftsimio.reader.SWIFTMask.constrain_spatial`
and setting the optional argument `intersect=True`. By default any existing
making repeated calls to :meth:`~swiftsimio.masks.SWIFTMask.constrain_spatial`
and setting the optional argument ``intersect=True``. By default any existing
selection of cells would be overwritten; this option adds any additional cells
that need to be selected for the new region to the existing selection instead.
For instance, to add the diagonally opposed octant to the selection made above
@@ -81,10 +80,143 @@ For instance, to add the diagonally opposed octant to the selection made above
additional_region = [[0.5 * b, 1.0 * b] for b in boxsize]
mask.constrain_spatial(additional_region, intersect=True)

In the first call to :meth:`~swiftsimio.reader.SWIFTMask.constrain_spatial` the
`intersect` argument can be set to `True` or left `False` (the default): since
In the first call to :meth:`~swiftsimio.masks.SWIFTMask.constrain_spatial` the
``intersect`` argument can be set to ``True`` or left ``False`` (the default): since
no mask yet exists both give the same result.
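
The overwrite-versus-add behaviour described above can be illustrated with a toy model. This is only a sketch: ``update_selection`` and the boolean lists are hypothetical stand-ins for a per-cell selection, not the :mod:`swiftsimio` implementation.

```python
from typing import List, Optional


def update_selection(
    current: Optional[List[bool]], new: List[bool], intersect: bool = False
) -> List[bool]:
    """Toy model of updating a per-cell selection (hypothetical helper)."""
    if current is None or not intersect:
        # No mask exists yet, or overwriting was requested:
        # the new selection simply replaces whatever was there.
        return list(new)
    # intersect=True: keep the cells already selected and add the new ones.
    return [old or add for old, add in zip(current, new)]


first = [True, True, False, False]    # cells needed for the first region
second = [False, False, False, True]  # cells needed for the additional region

overwritten = update_selection(first, second)
combined = update_selection(first, second, intersect=True)
first_call_true = update_selection(None, first, intersect=True)
first_call_false = update_selection(None, first, intersect=False)
```

In this toy model ``combined`` keeps the cells of both regions while ``overwritten`` keeps only those of the second, and the two "first call" variants are identical because there is no earlier selection to combine with.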

Periodic boundaries
^^^^^^^^^^^^^^^^^^^

The mask region is aware of the periodic box boundaries. Let's take for example a
region shaped like a "slab" in the :math:`x-y` plane with :math:`|z|<0.1L_\mathrm{box}`.
One way to write this is by thinking of the :math:`z<0` part as
lying at the upper edge of the box:

.. code-block:: python

mask = sw.mask(filename)
mask.constrain_spatial(
[
None,
None,
[0.0 * mask.metadata.boxsize[2], 0.1 * mask.metadata.boxsize[2]],
]
)
mask.constrain_spatial(
[
None,
None,
[0.9 * mask.metadata.boxsize[2], 1.0 * mask.metadata.boxsize[2]],
],
intersect=True,
)

This is a bit inconvenient though since the region is actually contiguous if we
account for the periodic boundary. :meth:`~swiftsimio.masks.SWIFTMask.constrain_spatial` allows us
to select a region straddling the periodic boundary; for example, this is an
equivalent selection:

.. code-block:: python

mask = sw.mask(filename)
mask.constrain_spatial(
[
None,
None,
[-0.1 * mask.metadata.boxsize[2], 0.1 * mask.metadata.boxsize[2]],
]
)

Note that masking never results in periodic copies of particles, nor does it shift
particle coordinates to match the region defined; particle coordinates always
lie in the range :math:`[0, L_\mathrm{box}]`. For example, reading
a region that extends beyond the box in all directions produces exactly one copy
of every particle and is equivalent to providing no spatial mask:

.. code-block:: python

mask = sw.mask(filename)
mask.constrain_spatial(
[[-0.1 * lbox, 1.1 * lbox] for lbox in mask.metadata.boxsize]
)

Remember to wrap the coordinates yourself if relevant! Alternatively, the
`swiftgalaxy`_ package offers support for coordinate transformations including
periodic boundaries.
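
As a minimal sketch of wrapping coordinates by hand, the helper below computes the offset of a coordinate from a chosen centre, wrapped into half a box length either side of it. The name ``wrap_relative`` and the plain floats (no units) are assumptions made for illustration; real particle coordinates carry units and would normally be handled as arrays.

```python
def wrap_relative(x: float, centre: float, lbox: float) -> float:
    """Offset of x from centre, wrapped into [-lbox / 2, lbox / 2)."""
    return (x - centre + 0.5 * lbox) % lbox - 0.5 * lbox


lbox = 100.0
# z-coordinates as read from a slab around z = 0: all lie in [0, lbox].
z_values = [5.0, 95.0, 99.5]
# Offsets from the slab centre at z = 0, accounting for the periodic boundary.
z_offsets = [wrap_relative(z, 0.0, lbox) for z in z_values]
```

With these inputs the particles stored near the upper edge of the box end up at small negative offsets, so the slab becomes contiguous around its centre.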

Another equivalent region for the :math:`|z|<0.1L_\mathrm{box}` slab can be written
by setting the lower bound to a greater value than the upper bound. The code
interprets this as a request to start at the lower bound, wrap through the upper
periodic boundary and continue until the (numerically smaller) upper bound is
reached:

.. code-block:: python

mask = sw.mask(filename)
mask.constrain_spatial(
[
None,
None,
[0.9 * mask.metadata.boxsize[2], 0.1 * mask.metadata.boxsize[2]],
]
)

The coordinates defining the region must always be in the interval
:math:`[-0.5L_\mathrm{box}, 1.5L_\mathrm{box}]`. This allows enough flexibility to
define all possible regions.
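
The three equivalent ways of writing the slab can be understood with a small one-dimensional sketch that splits a requested interval into pieces lying inside the box. The function ``wrap_interval`` is hypothetical and written only to illustrate the rules described above; it is not how :mod:`swiftsimio` is implemented internally.

```python
from typing import List, Tuple


def wrap_interval(lo: float, hi: float, lbox: float) -> List[Tuple[float, float]]:
    """Split a 1D region into pieces inside [0, lbox] (illustrative only)."""
    if not (-0.5 * lbox <= lo <= 1.5 * lbox and -0.5 * lbox <= hi <= 1.5 * lbox):
        raise ValueError("bounds must lie in [-0.5 * lbox, 1.5 * lbox]")
    # A lower bound above the upper bound means "wrap through the boundary".
    width = hi - lo if lo <= hi else (hi - lo) % lbox
    if width >= lbox:
        # The region covers the whole axis: exactly one copy of the box.
        return [(0.0, lbox)]
    start = lo % lbox
    end = start + width
    if end <= lbox:
        return [(start, end)]
    # The region straddles the upper boundary: split it in two.
    return [(0.0, end - lbox), (start, lbox)]


lbox = 100.0
straddling = wrap_interval(-10.0, 10.0, lbox)  # negative lower bound
reversed_bounds = wrap_interval(90.0, 10.0, lbox)  # lower bound > upper bound
whole_box = wrap_interval(-10.0, 110.0, lbox)  # extends beyond the box
```

In this sketch ``straddling`` and ``reversed_bounds`` resolve to the same two pieces, one at each edge of the box, which mirrors the two-call selection shown at the start of this section.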

Implementation details
^^^^^^^^^^^^^^^^^^^^^^

SWIFT snapshots group particles according to the cell that they occupy so that
particles belonging to a cell are stored contiguously. The cells form a regular grid
covering the simulation domain. However, SWIFT does not guarantee that all particles
that belong to a cell are within the boundaries of a cell at the time when a snapshot
is produced (particles are moved between cells at intervals, but may drift outside of
their current cell before being re-assigned). Snapshots contain metadata defining
the "bounding box" of each cell that contains all particles assigned to it at the
time that the snapshot was written. :mod:`swiftsimio` uses this information when
deciding what cells to read, so you may find that the "extra" particles read in
outside of the region explicitly asked for have an irregular boundary with cuboid protrusions
or indentations. This is normal: the cells read in are exactly those needed to
guarantee that all particles in the specified region of interest are captured. It is
therefore advantageous to make the region as small and tightly fit to the analysis
task as possible - in particular, trying to align it with the cell boundaries will
typically result in an I/O overhead as neighbouring cells with particles that have
drifted into the region are read in. Unless these particles are actually needed, it
is better for performance to *avoid* the cell boundaries when defining the
region.
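
The effect of cell bounding boxes on the selection can be sketched as follows. The helper names, the one-dimensional layout and the bounding-box values are all invented for illustration, and periodic wrapping is ignored; the only behaviour taken from the description above is that a cell is selected when its bounding box overlaps the region (merely touching is assumed not to count).

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two 1D intervals overlap; intervals that only touch do not."""
    return a[0] < b[1] and b[0] < a[1]


def select_cells(
    bboxes: Sequence[Sequence[Interval]], region: Sequence[Interval]
) -> List[int]:
    """Indices of cells whose bounding box overlaps the region on every axis."""
    return [
        index
        for index, bbox in enumerate(bboxes)
        if all(overlaps(cell_axis, region_axis)
               for cell_axis, region_axis in zip(bbox, region))
    ]


# Four cells of length 25 along one axis. A particle in cell 0 has drifted
# to 26 and one in cell 1 has drifted to 24.5, so their bounding boxes
# extend slightly beyond the nominal cell edges.
bboxes = [[(0.0, 26.0)], [(24.5, 50.0)], [(50.0, 75.0)], [(75.0, 100.0)]]

aligned = select_cells(bboxes, [(25.0, 50.0)])  # region aligned with cell 1
inset = select_cells(bboxes, [(30.0, 45.0)])  # region avoiding cell edges
```

Here the region aligned with the cell boundaries also pulls in cell 0 because of its drifted particle, while the inset region needs only cell 1.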

Older SWIFT snapshots lack the metadata to know exactly how far particles have
drifted out of their cells. In ``v10.2.0`` or newer, if :mod:`swiftsimio` does not
find this metadata, it will pad the region (by 0.1 times the cell length by default),
and issue a ``UserWarning`` indicating this.

.. warning::

In the worst case, where the region consists of one cell and the padding extends to all
neighbouring cells, this can result in up to a factor of :math:`3^3=27` additional
I/O overhead. Older :mod:`swiftsimio` versions instead risk missing particles near
the region boundary.

In the unlikely case that particles drift more than 0.1 times
the cell length away from their "home" cell and the cell bounding-box metadata is not
present, some particles can be missed when applying a spatial mask. The padding of
the region can be extended or switched off with the ``safe_padding`` parameter:

.. code-block:: python

mask = sw.mask(filename)
boxsize = mask.metadata.boxsize
mask.constrain_spatial(
[[0.4 * lbox, 0.6 * lbox] for lbox in boxsize],
safe_padding=False,  # padding switched off
)
mask.constrain_spatial(
[[0.4 * lbox, 0.6 * lbox] for lbox in boxsize],
safe_padding=1.0, # pad more, by 1.0 instead of 0.1 cell lengths
)
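
To see where the factor of :math:`3^3=27` comes from, the sketch below counts how many cells of a regular grid are touched along one axis once a region has been padded. The function ``cells_per_axis`` is a hypothetical illustration that ignores periodic wrapping and the edges of the box; it is not the padding code used by :mod:`swiftsimio`.

```python
import math


def cells_per_axis(lo: float, hi: float, cell_length: float, pad: float = 0.1) -> int:
    """Count grid cells overlapped by [lo, hi] after padding (illustrative)."""
    lo_padded = lo - pad * cell_length
    hi_padded = hi + pad * cell_length
    first = math.floor(lo_padded / cell_length)
    # A region ending exactly on a cell edge does not select the next cell.
    last = math.ceil(hi_padded / cell_length) - 1
    return last - first + 1


cell_length = 10.0
# Region exactly matching one cell: padding reaches both neighbours.
aligned_count = cells_per_axis(10.0, 20.0, cell_length) ** 3
# Region well inside one cell: padding stays within the cell.
inset_count = cells_per_axis(12.0, 18.0, cell_length) ** 3
# Same aligned region with padding switched off.
unpadded_count = cells_per_axis(10.0, 20.0, cell_length, pad=0.0) ** 3
```

In three dimensions the aligned region touches 27 cells with the default padding but only one without it, while the inset region touches a single cell either way.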


Full mask
---------
@@ -171,7 +303,7 @@ as follows
sw.subset_writer.write_subset("test_subset.hdf5", mask)

This will write a snapshot which contains the particles from the specified snapshot
whose *x*-coordinate is within the range [100, 1000] kpc. This function uses the
whose :math:`x`-coordinate is within the range [100, 1000] kpc. This function uses the
cell mask which encompasses the specified spatial domain to successively read portions
of datasets from the input file and writes them to a new snapshot.

2 changes: 1 addition & 1 deletion pyproject.toml
@@ -23,7 +23,7 @@ packages = [

[project]
name = "swiftsimio"
version="10.1.0"
version="10.2.0"
authors = [
{ name="Josh Borrow", email="josh@joshborrow.com" },
]
29 changes: 24 additions & 5 deletions swiftsimio/__init__.py
@@ -1,3 +1,5 @@
from typing import Optional as _Optional, Union as _Union

from .reader import *
from .snapshot_writer import SWIFTSnapshotWriter
from .masks import SWIFTMask
@@ -17,7 +19,7 @@
name = "swiftsimio"


def validate_file(filename):
def validate_file(filename: str):
"""
Checks that the provided file is a SWIFT dataset.

@@ -47,7 +49,9 @@ def validate_file(filename):
return True


def mask(filename, spatial_only=True) -> SWIFTMask:
def mask(
filename: str, spatial_only: bool = True, safe_padding: _Union[bool, float] = True
) -> SWIFTMask:
"""
Sets up a masking object for you to use with the correct units and
metadata available.
@@ -61,6 +65,19 @@ def mask(filename, spatial_only=True) -> SWIFTMask:
allow you to use masking on other variables (e.g. density).
Defaults to True.

safe_padding : bool or float, optional
If snapshot does not specify bounding box of cell particles (MinPositions &
MaxPositions), pad the mask to guarantee that *all* particles in requested
spatial region(s) are selected. If the bounding box metadata is present, this
argument is ignored. The default (``True``) is to pad by one cell length.
Padding can be disabled (``False``) or set to a different fraction of the
cell length (e.g. ``0.2``). Only entire cells are loaded, but if the region
boundary is more than ``safe_padding`` from a cell boundary the neighbouring
cell is not read. Switching off can reduce I/O load by up to a factor of 10
in some cases (but a few particles in region could be missing). See
https://swiftsimio.readthedocs.io/en/latest/masking/index.html for further
details.

Returns
-------
SWIFTMask
@@ -78,10 +95,12 @@ def mask(filename, spatial_only=True) -> SWIFTMask:
units = SWIFTUnits(filename)
metadata = metadata_discriminator(filename, units)

return SWIFTMask(metadata=metadata, spatial_only=spatial_only)
return SWIFTMask(
metadata=metadata, spatial_only=spatial_only, safe_padding=safe_padding
)


def load(filename, mask=None) -> SWIFTDataset:
def load(filename: str, mask: _Optional[SWIFTMask] = None) -> SWIFTDataset:
"""
Loads the SWIFT dataset at filename.

@@ -96,7 +115,7 @@ def load(filename, mask=None) -> SWIFTDataset:
return SWIFTDataset(filename, mask=mask)


def load_statistics(filename) -> SWIFTStatisticsFile:
def load_statistics(filename: str) -> SWIFTStatisticsFile:
"""
Loads a SWIFT statistics file (``SFR.txt``, ``energy.txt``).

3 changes: 2 additions & 1 deletion swiftsimio/conversions.py
@@ -94,7 +94,8 @@ def swift_cosmology_to_astropy(cosmo: dict, units) -> Cosmology:

try:
Tcmb0 = cosmo["T_CMB_0 [K]"][0]
except (IndexError, KeyError, AttributeError):
assert Tcmb0 != 0
except (IndexError, KeyError, AttributeError, AssertionError):
# expressions taken directly from astropy, since it no longer
# allows access to these attributes (since version 5.1+)
critdens_const = (3.0 / (8.0 * np.pi * const.G)).cgs.value