diff --git a/docs/source/masking/index.rst b/docs/source/masking/index.rst
index a5e9d4c6..5fe2c417 100644
--- a/docs/source/masking/index.rst
+++ b/docs/source/masking/index.rst
@@ -10,14 +10,13 @@ with this. This functionality is provided through the
 :mod:`swiftsimio.masks` sub-module but is available easily through the
 :meth:`swiftsimio.mask` top-level function.
 
-This functionality is used heavily in our `VELOCIraptor integration library`_
-for only reading data that is near bound objects.
+This functionality is used heavily in `swiftgalaxy`_.
 
 There are two types of mask, with the default only allowing spatial masking.
 Full masks require significantly more memory overhead and are generally much
 slower than the spatial only mask.
 
-.. _`VELOCIraptor integration library`: https://github.com/swiftsim/velociraptor-python
+.. _`swiftgalaxy`: https://github.com/SWIFTSIM/swiftgalaxy
 
 Spatial-only masking
 --------------------
@@ -39,7 +38,7 @@ Example
 ^^^^^^^
 
 In this example we will use the :obj:`swiftsimio.masks.SWIFTMask` object
-to load the bottom left 'half' corner of the box.
+to load the octant of the box closest to the origin.
 
 .. code-block:: python
 
@@ -69,8 +68,8 @@ that are much larger than the available memory on your machine and process
 them with ease.
 
 It is also possible to build up a region with a more complicated geometry by
-making repeated calls to :meth:`~swiftsimio.reader.SWIFTMask.constrain_spatial`
-and setting the optional argument `intersect=True`. By default any existing
+making repeated calls to :meth:`~swiftsimio.masks.SWIFTMask.constrain_spatial`
+and setting the optional argument ``intersect=True``. By default, any existing
 selection of cells would be overwritten; this option adds any additional cells
 that need to be selected for the new region to the existing selection instead.
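As an editorial aside on the semantics just described: despite its name, ``intersect=True`` ORs the newly selected cells into the existing cell selection (the ``constrain_spatial`` implementation later in this diff does exactly this with ``np.logical_or``). A minimal sketch with toy NumPy boolean masks, not the real cell metadata:

```python
import numpy as np

# toy stand-ins for the per-cell boolean selections produced by two regions
first_region = np.array([True, True, False, False])
second_region = np.array([False, True, True, False])

# default behaviour: a second constrain_spatial call overwrites the selection
overwritten = second_region.copy()

# intersect=True instead ORs the new cells into the existing selection
combined = np.logical_or(first_region, second_region)

print(overwritten.tolist())  # [False, True, True, False]
print(combined.tolist())  # [True, True, True, False]
```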
 For instance, to add the diagonally opposed octant to the selection made above
@@ -81,10 +80,143 @@ For instance, to add the diagonally opposed octant to the selection made above
 
     additional_region = [[0.5 * b, 1.0 * b] for b in boxsize]
     mask.constrain_spatial(additional_region, intersect=True)
 
-In the first call to :meth:`~swiftsimio.reader.SWIFTMask.constrain_spatial` the
-`intersect` argument can be set to `True` or left `False` (the default): since
+In the first call to :meth:`~swiftsimio.masks.SWIFTMask.constrain_spatial` the
+``intersect`` argument can be set to ``True`` or left ``False`` (the default): since
 no mask yet exists both give the same result.
 
+Periodic boundaries
+^^^^^^^^^^^^^^^^^^^
+
+The mask region is aware of the periodic box boundaries. Take, for example, a
+region shaped like a "slab" in the :math:`x-y` plane with :math:`|z|<0.1L_\mathrm{box}`.
+One way to write this is by thinking of the :math:`z<0` part as
+lying at the upper edge of the box:
+
+.. code-block:: python
+
+    mask = sw.mask(filename)
+    mask.constrain_spatial(
+        [
+            None,
+            None,
+            [0.0 * mask.metadata.boxsize[2], 0.1 * mask.metadata.boxsize[2]],
+        ]
+    )
+    mask.constrain_spatial(
+        [
+            None,
+            None,
+            [0.9 * mask.metadata.boxsize[2], 1.0 * mask.metadata.boxsize[2]],
+        ],
+        intersect=True,
+    )
+
+This is a bit inconvenient, though, since the region is actually contiguous if we
+account for the periodic boundary. :meth:`~swiftsimio.masks.SWIFTMask.constrain_spatial` allows us
+to select a region straddling the periodic boundary; for example, this is an
+equivalent selection:
+
+.. code-block:: python
+
+    mask = sw.mask(filename)
+    mask.constrain_spatial(
+        [
+            None,
+            None,
+            [-0.1 * mask.metadata.boxsize[2], 0.1 * mask.metadata.boxsize[2]],
+        ]
+    )
+
+Note that masking never results in periodic copies of particles, nor does it shift
+particle coordinates to match the region defined; particle coordinates always
+lie in the range :math:`[0, L_\mathrm{box}]`. For example, reading a region that
+extends beyond the box in all directions produces exactly one copy of every
+particle and is equivalent to providing no spatial mask:
+
+.. code-block:: python
+
+    mask = sw.mask(filename)
+    mask.constrain_spatial(
+        [[-0.1 * lbox, 1.1 * lbox] for lbox in mask.metadata.boxsize]
+    )
+
+Remember to wrap the coordinates yourself if relevant! Alternatively, the
+`swiftgalaxy`_ package offers support for coordinate transformations including
+periodic boundaries.
+
+Another equivalent region for the :math:`|z|<0.1L_\mathrm{box}` slab can be written
+by setting the lower bound to a greater value than the upper bound; the code will
+interpret this as a request to start at the lower bound, wrap through the upper
+periodic boundary, and continue until the (numerically lower) value of the upper
+bound is reached:
+
+.. code-block:: python
+
+    mask = sw.mask(filename)
+    mask.constrain_spatial(
+        [
+            None,
+            None,
+            [0.9 * mask.metadata.boxsize[2], 0.1 * mask.metadata.boxsize[2]],
+        ]
+    )
+
+The coordinates defining the region must always be in the interval
+:math:`[-0.5L_\mathrm{box}, 1.5L_\mathrm{box}]`. This allows enough flexibility to
+define all possible regions.
+
+Implementation details
+^^^^^^^^^^^^^^^^^^^^^^
+
+SWIFT snapshots group particles according to the cell that they occupy so that
+particles belonging to a cell are stored contiguously. The cells form a regular
+grid covering the simulation domain. However, SWIFT does not guarantee that all
+particles that belong to a cell are within the boundaries of that cell at the time
+when a snapshot is produced (particles are moved between cells at intervals, but
+may drift outside of their current cell before being re-assigned). Snapshots
+contain metadata defining the "bounding box" of each cell that contains all
+particles assigned to it at the time that the snapshot was written.
+:mod:`swiftsimio` uses this information when deciding which cells to read, so you
+may find that the "extra" particles read in outside of the region explicitly asked
+for trace an irregular boundary with cuboid protrusions or indentations. This is
+normal: the cells read in are exactly those needed to guarantee that all particles
+in the specified region of interest are captured. It is therefore advantageous to
+make the region as small and tightly fit to the analysis task as possible. In
+particular, trying to align it with the cell boundaries will typically incur an
+I/O overhead as neighbouring cells with particles that have drifted into the
+region are read in. Unless these particles are actually needed, it is better for
+performance to *avoid* the cell boundaries when defining the region.
+
+Older SWIFT snapshots lack the metadata to know exactly how far particles have
+drifted out of their cells. In ``v10.2.0`` or newer, if :mod:`swiftsimio` does not
+find this metadata, it will pad the region (by 0.1 times the cell length by
+default), and issue a ``UserWarning`` indicating this.
+
+.. warning::
+
+    In the worst case, where the region consists of one cell and the padding
+    extends to all neighbouring cells, this can result in up to a factor of
+    :math:`3^3=27` additional I/O overhead. Older :mod:`swiftsimio` versions
+    instead risk missing particles near the region boundary.
+
+In the unlikely case that particles drift more than 0.1 times the cell length away
+from their "home" cell and the cell bounding-box metadata is not present, some
+particles can be missed when applying a spatial mask. The padding of the region
+can be extended or switched off with the ``safe_padding`` parameter:
+
+.. code-block:: python
+
+    mask = sw.mask(filename)
+    mask.constrain_spatial(
+        [[0.4 * b, 0.6 * b] for b in mask.metadata.boxsize],
+        safe_padding=False,  # padding switched off
+    )
+    mask.constrain_spatial(
+        [[0.4 * b, 0.6 * b] for b in mask.metadata.boxsize],
+        safe_padding=1.0,  # pad more, by 1.0 instead of 0.1 cell lengths
+    )
+
 Full mask
 ---------
@@ -171,7 +303,7 @@ as follows
 
     sw.subset_writer.write_subset("test_subset.hdf5", mask)
 
 This will write a snapshot which contains the particles from the specified snapshot
-whose *x*-coordinate is within the range [100, 1000] kpc. This function uses the
+whose :math:`x`-coordinate is within the range [100, 1000] kpc. This function uses the
 cell mask which encompases the specified spatial domain to successively read
 portions of datasets from the input file and writes them to a new snapshot.
diff --git a/pyproject.toml b/pyproject.toml
index 158d42cb..69623dc4 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -23,7 +23,7 @@ packages = [
 
 [project]
 name = "swiftsimio"
-version="10.1.0"
+version="10.2.0"
 authors = [
     { name="Josh Borrow", email="josh@joshborrow.com" },
 ]
diff --git a/swiftsimio/__init__.py b/swiftsimio/__init__.py
index 90efe849..b3fa64f9 100644
--- a/swiftsimio/__init__.py
+++ b/swiftsimio/__init__.py
@@ -1,3 +1,5 @@
+from typing import Optional as _Optional, Union as _Union
+
 from .reader import *
 from .snapshot_writer import SWIFTSnapshotWriter
 from .masks import SWIFTMask
@@ -17,7 +19,7 @@
 
 name = "swiftsimio"
 
 
-def validate_file(filename):
+def validate_file(filename: str):
     """
     Checks that the provided file is a SWIFT dataset.
 
@@ -47,7 +49,9 @@ def validate_file(filename):
     return True
 
 
-def mask(filename, spatial_only=True) -> SWIFTMask:
+def mask(
+    filename: str, spatial_only: bool = True, safe_padding: _Union[bool, float] = True
+) -> SWIFTMask:
     """
     Sets up a masking object for you to use with the correct
     units and metadata available.
@@ -61,6 +65,19 @@ def mask(filename, spatial_only=True) -> SWIFTMask:
         allow you to use masking on other variables (e.g. density).
         Defaults to True.
 
+    safe_padding : bool or float, optional
+        If the snapshot does not specify the bounding box of cell particles
+        (MinPositions & MaxPositions), pad the mask to guarantee that *all*
+        particles in the requested spatial region(s) are selected. If the bounding
+        box metadata is present, this argument is ignored. The default (``True``)
+        pads by 0.1 times the cell length; padding can be disabled (``False``) or
+        set to a different fraction of the cell length (e.g. ``0.2``). Only entire
+        cells are loaded, but if the region boundary is more than ``safe_padding``
+        cell lengths from a cell boundary the neighbouring cell is not read.
+        Switching off can reduce I/O load by up to a factor of 27 in some cases
+        (but a few particles in the region could be missing). See
+        https://swiftsimio.readthedocs.io/en/latest/masking/index.html for details.
+
     Returns
     -------
     SWIFTMask
@@ -78,10 +95,12 @@
     units = SWIFTUnits(filename)
     metadata = metadata_discriminator(filename, units)
 
-    return SWIFTMask(metadata=metadata, spatial_only=spatial_only)
+    return SWIFTMask(
+        metadata=metadata, spatial_only=spatial_only, safe_padding=safe_padding
+    )
 
 
-def load(filename, mask=None) -> SWIFTDataset:
+def load(filename: str, mask: _Optional[SWIFTMask] = None) -> SWIFTDataset:
     """
     Loads the SWIFT dataset at filename.
 
@@ -96,7 +115,7 @@
     return SWIFTDataset(filename, mask=mask)
 
 
-def load_statistics(filename) -> SWIFTStatisticsFile:
+def load_statistics(filename: str) -> SWIFTStatisticsFile:
     """
     Loads a SWIFT statistics file (``SFR.txt``, ``energy.txt``).
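As a sketch of how the ``safe_padding`` argument documented above is interpreted (mirroring the normalisation done in ``SWIFTMask.__init__`` later in this diff: ``True`` maps to the 0.1 default, ``False`` disables padding, and a number passes through unchanged); the helper name here is illustrative, not part of the library API:

```python
from typing import Union

_DEFAULT_SAFE_PADDING = 0.1  # fraction of a cell length, matching swiftsimio.masks


def normalise_safe_padding(safe_padding: Union[bool, float]) -> float:
    """Map the user-facing safe_padding argument to a padding fraction."""
    # `is True` / `is False` so that numeric 1 or 0 pass through as fractions
    if safe_padding is True:
        return _DEFAULT_SAFE_PADDING
    if safe_padding is False:
        return 0.0
    return float(safe_padding)


print(normalise_safe_padding(True))  # 0.1
print(normalise_safe_padding(False))  # 0.0
print(normalise_safe_padding(0.5))  # 0.5
```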
diff --git a/swiftsimio/conversions.py b/swiftsimio/conversions.py
index 4c54d2f6..0f8b4eab 100644
--- a/swiftsimio/conversions.py
+++ b/swiftsimio/conversions.py
@@ -94,7 +94,8 @@ def swift_cosmology_to_astropy(cosmo: dict, units) -> Cosmology:
 
     try:
         Tcmb0 = cosmo["T_CMB_0 [K]"][0]
-    except (IndexError, KeyError, AttributeError):
+        assert Tcmb0 != 0
+    except (IndexError, KeyError, AttributeError, AssertionError):
         # expressions taken directly from astropy, since they do no longer
         # allow access to these attributes (since version 5.1+)
         critdens_const = (3.0 / (8.0 * np.pi * const.G)).cgs.value
diff --git a/swiftsimio/masks.py b/swiftsimio/masks.py
index 44de628b..545db326 100644
--- a/swiftsimio/masks.py
+++ b/swiftsimio/masks.py
@@ -4,17 +4,17 @@
 """
 
 import warnings
-
 import h5py
-
 import numpy as np
+from typing import Union
 
 from swiftsimio.metadata.objects import SWIFTMetadata
-
 from swiftsimio.objects import InvalidSnapshot, cosmo_array, cosmo_quantity
-
 from swiftsimio.accelerated import ranges_from_array
 
+_DEFAULT_SAFE_PADDING = 0.1
+_GROUPCAT_OUTPUT_TYPES = ["FOF", "SOAP"]
+
 
 class SWIFTMask(object):
     """
@@ -25,7 +25,12 @@ class SWIFTMask(object):
     group_mapping: dict | None = None
     group_size_mapping: dict | None = None
 
-    def __init__(self, metadata: SWIFTMetadata, spatial_only=True):
+    def __init__(
+        self,
+        metadata: SWIFTMetadata,
+        spatial_only=True,
+        safe_padding: Union[bool, float] = _DEFAULT_SAFE_PADDING,
+    ):
         """
         SWIFTMask constructor
 
@@ -45,11 +50,29 @@ def __init__(self, metadata: SWIFTMetadata, spatial_only=True):
             more memory efficient (~ bytes per cell, rather than ~ bytes
             per particle).
 
+        safe_padding : bool or float, optional
+            If the snapshot does not specify the bounding box of cell particles
+            (``MinPositions``, ``MaxPositions``), pad the mask to guarantee that
+            *all* particles in the requested spatial region(s) are selected. If
+            the bounding box metadata is present, this argument is ignored. The
+            default (``0.1``) pads by 0.1 times the cell length; padding can be
+            disabled (``False``) or set to a different fraction of the cell
+            length (e.g. ``0.5``). Only entire cells are loaded, but if the
+            region boundary is more than ``safe_padding`` cell lengths from a
+            cell boundary the neighbouring cell is not read. Switching off can
+            reduce I/O load by up to a factor of 27 in some cases (but a few
+            particles in the region could be missing). See
+            https://swiftsimio.readthedocs.io/en/latest/masking/index.html for
+            further details.
         """
 
         self.metadata = metadata
         self.units = metadata.units
         self.spatial_only = spatial_only
+        if safe_padding is True:
+            self.safe_padding = _DEFAULT_SAFE_PADDING
+        elif safe_padding is False:
+            self.safe_padding = 0.0
+        else:
+            self.safe_padding = safe_padding
 
         if not self.metadata.masking_valid:
             raise NotImplementedError(
@@ -180,11 +203,22 @@ def _unpack_cell_metadata(self):
 
         self.counts = {}
         self.offsets = {}
+        self.minpositions = {}
+        self.maxpositions = {}
 
         cell_handle = self.units.handle["Cells"]
         count_handle = cell_handle["Counts"]
         metadata_handle = cell_handle["Meta-data"]
         centers_handle = cell_handle["Centres"]
+        if (
+            "MinPositions" in cell_handle.keys()
+            and "MaxPositions" in cell_handle.keys()
+        ):
+            # Older versions of SWIFT don't have this information
+            minpos_handle = cell_handle["MinPositions"]
+            maxpos_handle = cell_handle["MaxPositions"]
+        else:
+            minpos_handle, maxpos_handle = None, None
 
         try:
             offset_handle = cell_handle["OffsetsInFile"]
@@ -201,16 +235,54 @@
         for group, group_name in zip(
            self.metadata.present_groups, self.metadata.present_group_names
         ):
-            counts = count_handle[group][:]
-            offsets = offset_handle[group][:]
-
             self.offsets[group_name] = offset_handle[group][:]
             self.counts[group_name] = count_handle[group][:]
 
+        if minpos_handle is not None and maxpos_handle is not None:
+            for group, group_name in zip(
+                self.metadata.present_groups, self.metadata.present_group_names
+            ):
+                self.minpositions[group_name] = np.where(
+                    centers_handle[:] - 0.5 * metadata_handle.attrs["size"]
+                    < minpos_handle[group][:],
+                    centers_handle[:] - 0.5 * metadata_handle.attrs["size"],
+                    minpos_handle[group][:],
+                )
+                self.maxpositions[group_name] = np.where(
+                    centers_handle[:] + 0.5 * metadata_handle.attrs["size"]
+                    > maxpos_handle[group][:],
+                    centers_handle[:] + 0.5 * metadata_handle.attrs["size"],
+                    maxpos_handle[group][:],
+                )
+        else:
+            # be conservative: pad (default by 0.1 cell) in case particles drifted
+            # (except for group catalogues)
+            pad_cells = (
+                0
+                if self.metadata.output_type in _GROUPCAT_OUTPUT_TYPES
+                else self.safe_padding
+            )
+            if self.metadata.output_type not in _GROUPCAT_OUTPUT_TYPES:
+                warnings.warn(
+                    "Snapshot does not contain Cells/MinPositions and Cells/MaxPositions"
+                    f" metadata. Padding region by {pad_cells} times cell length to"
+                    " account for drifted particles. This behaviour can be"
+                    " configured/disabled with the `safe_padding` parameter when creating"
+                    " the mask. See "
+                    "https://swiftsimio.readthedocs.io/en/latest/masking/index.html"
+                    " for further details."
+                )
+            # +/- 0.5 here is the cell size itself:
+            self.minpositions["shared"] = (
+                centers_handle[:] - (pad_cells + 0.5) * metadata_handle.attrs["size"]
+            )
+            self.maxpositions["shared"] = (
+                centers_handle[:] + (pad_cells + 0.5) * metadata_handle.attrs["size"]
+            )
 
         # Only want to compute this once (even if it is fast, we do not
         # have a reliable stable sort in the case where cells do not
         # contain at least one of each type of particle).
-        sort = None
+        self.cell_sort = None
 
         # Now perform sort:
         for key in self.offsets.keys():
@@ -218,22 +290,43 @@
             offsets = self.offsets[key]
             counts = self.counts[key]
 
             # When using MPI, we cannot assume that these are sorted.
-            if sort is None:
+            if self.cell_sort is None:
                 # Only compute once; not stable between particle
                 # types if some datasets do not have particles in a cell!
- sort = np.argsort(offsets) + self.cell_sort = np.argsort(offsets) - self.offsets[key] = offsets[sort] - self.counts[key] = counts[sort] + self.offsets[key] = offsets[self.cell_sort] + self.counts[key] = counts[self.cell_sort] # Also need to sort centers in the same way self.centers = cosmo_array( - centers_handle[:][sort], + centers_handle[:][self.cell_sort], units=self.units.length, comoving=True, scale_factor=self.metadata.scale_factor, scale_exponent=1, ) + # And sort min & max positions, too. + for k in self.minpositions.keys(): + self.minpositions[k] = cosmo_array( + self.minpositions[k][self.cell_sort], + units=self.units.length, + comoving=True, + scale_factor=self.metadata.scale_factor, + scale_exponent=1, + ) + for k in self.maxpositions.keys(): + self.maxpositions[k] = cosmo_array( + self.maxpositions[k][self.cell_sort], + units=self.units.length, + comoving=True, + scale_factor=self.metadata.scale_factor, + scale_exponent=1, + ) + if minpos_handle is None and maxpos_handle is None: + for group_name in self.metadata.present_group_names: + self.minpositions[group_name] = self.minpositions["shared"] + self.maxpositions[group_name] = self.maxpositions["shared"] # Note that we cannot assume that these are cubic, unfortunately. self.cell_size = cosmo_array( @@ -367,6 +460,11 @@ def _generate_cell_mask(self, restrict): These values must have units associated with them. + Raises + ------ + ValueError + If the mask boundaries are outside the interval [-Lbox/2, 3*Lbox/2]. 
+
     Returns
     -------
 
@@ -374,52 +472,71 @@
         mask to indicate which cells are within the specified spatial range
     """
-        cell_mask = np.ones(len(self.centers), dtype=bool)
+        if self.metadata.output_type in _GROUPCAT_OUTPUT_TYPES:
+            cell_mask = {"shared": np.ones(len(self.centers), dtype=bool)}
+        else:
+            # particles may drift from their cells, mask each type separately
+            cell_mask = {
+                group_name: np.ones(len(self.centers), dtype=bool)
+                for group_name in self.metadata.present_group_names
+            }
 
         for dimension in range(0, 3):
-            if restrict[dimension] is None:
-                continue
-
-            # Include the cell size so it's easier to find the overlap
-            lower = restrict[dimension][0] - 0.5 * self.cell_size[dimension]
-            upper = restrict[dimension][1] + 0.5 * self.cell_size[dimension]
+            if restrict[dimension] is None:
+                # no constraint given along this axis (check before indexing!)
+                continue
+            lower = restrict[dimension][0]
+            upper = restrict[dimension][1]
 
             boxsize = self.metadata.boxsize[dimension]
-
-            # Now need to deal with the three wrapping cases:
-            if lower.value < 0.0:
-                # Wrap lower -> high
-                lower += boxsize
-
-                this_mask = np.logical_or.reduce(
-                    [
-                        self.centers[cell_mask, dimension] > lower,
-                        self.centers[cell_mask, dimension] <= upper,
-                    ]
+            if np.logical_or.reduce(
+                (
+                    lower < -boxsize / 2,
+                    upper < -boxsize / 2,
+                    lower > 3 * boxsize / 2,
+                    upper > 3 * boxsize / 2,
                 )
-            elif upper > boxsize:
-                # Wrap high -> lower
-                upper -= boxsize
-
-                this_mask = np.logical_or.reduce(
-                    [
-                        self.centers[cell_mask, dimension] > lower,
-                        self.centers[cell_mask, dimension] <= upper,
-                    ]
+            ):
+                # because we're only going to make one periodic copy on either side,
+                # we're in trouble
+                raise ValueError(
+                    "Mask region boundaries must be in interval [-boxsize/2, 3*boxsize/2]"
+                    f" along {'xyz'[dimension]}-axis."
) - else: - # No wrapping required - this_mask = np.logical_and.reduce( + if restrict[dimension] is None or np.abs(upper - lower) > boxsize: + # keep everything along this axis + continue + if upper < lower: + # inverted case, convert to a "normal" case in target window + if lower > boxsize / 2: + lower -= boxsize + elif upper < boxsize / 2: # don't shift both else we get the whole box! + upper += boxsize + group_names = ( + ["shared"] + if self.metadata.output_type in _GROUPCAT_OUTPUT_TYPES + else self.metadata.present_group_names + ) + for group_name in group_names: + # selection intersects one of the 3 periodic copies of a cell: + this_mask = np.logical_or.reduce( [ - self.centers[cell_mask, dimension] > lower, - self.centers[cell_mask, dimension] <= upper, + np.logical_and( + self.maxpositions[group_name][ + cell_mask[group_name], dimension + ] + + shift * boxsize + > lower, + self.minpositions[group_name][ + cell_mask[group_name], dimension + ] + + shift * boxsize + < upper, + ) + for shift in (-1, 0, 1) ] ) - - cell_mask[cell_mask] = this_mask + cell_mask[group_name][cell_mask[group_name]] = this_mask return cell_mask - def _update_spatial_mask(self, restrict, data_name: str, cell_mask: np.array): + def _update_spatial_mask(self, restrict, data_name: str, cell_mask: dict): """ Updates the particle mask using the cell mask. @@ -436,30 +553,31 @@ def _update_spatial_mask(self, restrict, data_name: str, cell_mask: np.array): data_name : str underlying data to update (e.g. 
_gas, _shared) - cell_mask : np.array + cell_mask : dict cell mask used to update the particle mask """ count_name = data_name[1:] # Remove the underscore - if self.spatial_only: - counts = self.counts[count_name][cell_mask] - offsets = self.offsets[count_name][cell_mask] + for group_name in self.metadata.present_group_names: + if self.spatial_only: + counts = self.counts[count_name][cell_mask[count_name]] + offsets = self.offsets[count_name][cell_mask[count_name]] - this_mask = [[o, c + o] for c, o in zip(counts, offsets)] + this_mask = [[o, c + o] for c, o in zip(counts, offsets)] - setattr(self, data_name, np.array(this_mask)) - setattr(self, f"{data_name}_size", np.sum(counts)) + setattr(self, data_name, np.array(this_mask)) + setattr(self, f"{data_name}_size", np.sum(counts)) - else: - counts = self.counts[count_name][~cell_mask] - offsets = self.offsets[count_name][~cell_mask] + else: + counts = self.counts[count_name][~cell_mask[count_name]] + offsets = self.offsets[count_name][~cell_mask[count_name]] - # We must do the whole boolean mask business. - this_mask = getattr(self, data_name) + # We must do the whole boolean mask business. 
+ this_mask = getattr(self, data_name) - for count, offset in zip(counts, offsets): - this_mask[offset : count + offset] = False + for count, offset in zip(counts, offsets): + this_mask[offset : count + offset] = False return @@ -499,9 +617,11 @@ def constrain_spatial(self, restrict, intersect: bool = False): if hasattr(self, "cell_mask") and intersect: # we already have a mask and are in intersect mode - self.cell_mask = np.logical_or( - self.cell_mask, self._generate_cell_mask(restrict) - ) + new_mask = self._generate_cell_mask(restrict) + for group_name in self.metadata.present_group_names: + self.cell_mask[group_name] = np.logical_or( + self.cell_mask[group_name], new_mask[group_name] + ) else: # we just make a new mask self.cell_mask = self._generate_cell_mask(restrict) @@ -626,6 +746,7 @@ def get_masked_counts_offsets( arrays now storing the offset of the first particle in the cell. """ + if self.spatial_only: masked_counts = {} current_offsets = {} @@ -635,9 +756,9 @@ def get_masked_counts_offsets( "constrain_spatial with a suitable restrict array to generate one." 
) for part_type, counts in self.counts.items(): - masked_counts[part_type] = counts * self.cell_mask + masked_counts[part_type] = counts * self.cell_mask[part_type] - current_offsets[part_type] = [0] * counts.size + current_offsets[part_type] = np.zeros(counts.size, dtype=np.int64) running_sum = 0 for i in range(len(counts)): current_offsets[part_type][i] = running_sum diff --git a/swiftsimio/objects.py b/swiftsimio/objects.py index b69fcc04..75565339 100644 --- a/swiftsimio/objects.py +++ b/swiftsimio/objects.py @@ -812,7 +812,18 @@ def __eq__(self, b: "cosmo_factor") -> bool: """ if not isinstance(b, cosmo_factor): raise ValueError("Can only compare cosmo_factor with another cosmo_factor.") - return (self.scale_factor == b.scale_factor) and (self.a_factor == b.a_factor) + scale_factor_match = self.scale_factor == b.scale_factor + if self.a_factor is None and b.a_factor is None: + # guards passing None to isclose + a_factor_match = True + elif self.a_factor is None or b.a_factor is None: + # we know they're not both None from previous case + a_factor_match = False + elif np.isclose(self.a_factor, b.a_factor, rtol=1e-9): + a_factor_match = True + else: + a_factor_match = False + return scale_factor_match and a_factor_match def __ne__(self, b: "cosmo_factor") -> bool: """ diff --git a/swiftsimio/subset_writer.py b/swiftsimio/subset_writer.py index da7b03f3..8bf40920 100644 --- a/swiftsimio/subset_writer.py +++ b/swiftsimio/subset_writer.py @@ -162,7 +162,8 @@ def find_links( def update_metadata_counts(infile: h5py.File, outfile: h5py.File, mask: SWIFTMask): """ - Recalculates the cell particle counts and offsets based on the particles present in the subset + Recalculates the cell particle counts and offsets based on the particles present in + the subset. 
Parameters ---------- @@ -199,7 +200,18 @@ def update_metadata_counts(infile: h5py.File, outfile: h5py.File, mask: SWIFTMas # Copy the cell centres and metadata infile.copy("/Cells/Centres", outfile, name="/Cells/Centres") + outfile["/Cells/Centres"][...] = outfile["/Cells/Centres"][...][mask.cell_sort,] infile.copy("/Cells/Meta-data", outfile, name="/Cells/Meta-data") + if ( + "MinPositions" in infile["/Cells"].keys() + and "MaxPositions" in infile["/Cells"].keys() + ): + infile.copy("/Cells/MinPositions", outfile, name="/Cells/MinPositions") + infile.copy("/Cells/MaxPositions", outfile, name="/Cells/MaxPositions") + for k, v in outfile["/Cells/MinPositions"].items(): + outfile[f"/Cells/MinPositions/{k}"][...] = v[...][mask.cell_sort,] + for k, v in outfile["/Cells/MaxPositions"].items(): + outfile[f"/Cells/MaxPositions/{k}"][...] = v[...][mask.cell_sort,] def write_metadata( diff --git a/tests/basic_test.py b/test_basic.py similarity index 100% rename from tests/basic_test.py rename to test_basic.py diff --git a/tests/conftest.py b/tests/conftest.py index c2f9b611..3edb6e51 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -2,7 +2,9 @@ import subprocess import pytest -webstorage_location = "https://virgodb.cosma.dur.ac.uk/swift-webstorage/IOExamples/" +webstorage_location = ( + "https://virgodb.cosma.dur.ac.uk/swift-webstorage/IOExamples/ssio_ci_04_2025/" +) test_data_location = "test_data/" @@ -31,21 +33,35 @@ def _requires(filename): return file_location +@pytest.fixture( + params=[ + "EagleDistributed.hdf5", + "EagleSingle.hdf5", + "LegacyCosmologicalVolume.hdf5", + ] +) +def cosmological_volume(request): + if request.param == "EagleDistributed.hdf5": + _requires("eagle_0025.0.hdf5") + _requires("eagle_0025.1.hdf5") + yield _requires(request.param) + + @pytest.fixture -def cosmo_volume_example(): - yield _requires("cosmo_volume_example.hdf5") +def cosmological_volume_only_single(): + yield _requires("EagleSingle.hdf5") @pytest.fixture -def 
cosmological_volume(): - yield _requires("cosmological_volume.hdf5") +def cosmological_volume_only_distributed(): + yield _requires("EagleDistributed.hdf5") @pytest.fixture def cosmological_volume_dithered(): - yield _requires("cosmological_volume_dithered.hdf5") + yield _requires("LegacyCosmologicalVolumeDithered.hdf5") @pytest.fixture def soap_example(): - yield _requires("soap_example.hdf5") + yield _requires("SoapExample.hdf5") diff --git a/tests/helper.py b/tests/helper.py index fe4e8a84..26b17536 100644 --- a/tests/helper.py +++ b/tests/helper.py @@ -2,12 +2,25 @@ Contains helper functions for the test routines. """ +import pytest import h5py from swiftsimio.subset_writer import find_links, write_metadata from swiftsimio import mask, cosmo_array from numpy import mean, zeros_like +def _mask_without_warning(fname, **kwargs): + with h5py.File(fname, "r") as f: + has_cell_bbox = "MinPositions" in f["/Cells"].keys() + if has_cell_bbox: + return mask(fname, **kwargs) + else: + with pytest.warns( + UserWarning, match="Snapshot does not contain Cells/MinPositions" + ): + return mask(fname, **kwargs) + + def create_in_memory_hdf5(filename="f1"): """ Creates an in-memory hdf5 file object. 
@@ -30,7 +43,7 @@ def create_n_particle_dataset(filename: str, output_name: str, num_parts: int = number of particles to create (default: 2) """ # Create a dummy mask in order to write metadata - data_mask = mask(filename) + data_mask = _mask_without_warning(filename) boxsize = data_mask.metadata.boxsize region = [[zeros_like(b), b] for b in boxsize] data_mask.constrain_spatial(region) @@ -76,10 +89,13 @@ def create_n_particle_dataset(filename: str, output_name: str, num_parts: int = # Get rid of all traces of DM del outfile["/Cells/Counts/PartType1"] - del outfile["/Cells/Offsets/PartType1"] - nparts_total = [num_parts, 0, 0, 0, 0, 0] - nparts_this_file = [num_parts, 0, 0, 0, 0, 0] - can_have_types = [1, 0, 0, 0, 0, 0] + if "Offsets" in outfile["/Cells"].keys(): + del outfile["/Cells/Offsets/PartType1"] + if "OffsetsInFile" in outfile["/Cells"].keys(): + del outfile["/Cells/OffsetsInFile/PartType1"] + nparts_total = [num_parts, 0, 0, 0, 0, 0, 0] + nparts_this_file = [num_parts, 0, 0, 0, 0, 0, 0] + can_have_types = [1, 0, 0, 0, 0, 0, 0] outfile["/Header"].attrs["NumPart_Total"] = nparts_total outfile["/Header"].attrs["NumPart_ThisFile"] = nparts_this_file outfile["/Header"].attrs["CanHaveTypes"] = can_have_types diff --git a/tests/test_data.py b/tests/test_data.py index 2adc7746..7d5f11d0 100644 --- a/tests/test_data.py +++ b/tests/test_data.py @@ -6,7 +6,8 @@ """ import numpy as np -from swiftsimio import load, mask +from swiftsimio import load +from .helper import _mask_without_warning as mask from os import remove import h5py @@ -15,7 +16,7 @@ from numpy import logical_and, isclose, float64 from numpy import array as numpy_array -from swiftsimio.objects import cosmo_array, cosmo_factor, a +from swiftsimio.objects import cosmo_array from tests.helper import create_n_particle_dataset @@ -27,8 +28,7 @@ def test_cosmology_metadata(cosmological_volume): data = load(cosmological_volume) assert data.metadata.a == data.metadata.scale_factor - - assert data.metadata.a == 
1.0 / (1.0 + data.metadata.redshift) + assert np.isclose(data.metadata.a, 1.0 / (1.0 + data.metadata.redshift), atol=1e-8) return @@ -56,7 +56,6 @@ def test_temperature_units(cosmological_volume): data = load(cosmological_volume) data.gas.temperatures.convert_to_units(K) - return @@ -143,7 +142,6 @@ def test_cell_metadata_is_valid(cosmological_volume): I.e. that it sets the particles contained in a top-level cell. """ - mask_region = mask(cosmological_volume) # Because we sort by offset if we are using the metadata we # must re-order the data to be in the correct order @@ -201,12 +199,6 @@ def test_dithered_cell_metadata_is_valid(cosmological_volume_dithered): cell_size = mask_region.cell_size.to(data.dark_matter.coordinates.units) boxsize = mask_region.metadata.boxsize[0].to(data.dark_matter.coordinates.units) - # can be removed when issue #128 resolved: - boxsize = cosmo_array( - boxsize, - comoving=True, - cosmo_factor=cosmo_factor(a ** 1, mask_region.metadata.a), - ) offsets = mask_region.offsets["dark_matter"] counts = mask_region.counts["dark_matter"] @@ -244,13 +236,9 @@ def test_reading_select_region_metadata(cosmological_volume): # Mask off the centre of the volume. mask_region = mask(cosmological_volume, spatial_only=True) - # can be removed when issue #128 resolved: - boxsize = cosmo_array( - full_data.metadata.boxsize, - comoving=True, - cosmo_factor=cosmo_factor(a ** 1, full_data.metadata.a), - ) - restrict = cosmo_array([boxsize * 0.2, boxsize * 0.8]).T + restrict = cosmo_array( + [full_data.metadata.boxsize * 0.2, full_data.metadata.boxsize * 0.8] + ).T mask_region.constrain_spatial(restrict=restrict) @@ -295,13 +283,9 @@ def test_reading_select_region_metadata_not_spatial_only(cosmological_volume): # Mask off the centre of the volume. 
mask_region = mask(cosmological_volume, spatial_only=False) - # can be removed when issue #128 resolved: - boxsize = cosmo_array( - full_data.metadata.boxsize, - comoving=True, - cosmo_factor=cosmo_factor(a ** 1, full_data.metadata.a), - ) - restrict = cosmo_array([boxsize * 0.26, boxsize * 0.74]).T + restrict = cosmo_array( + [full_data.metadata.boxsize * 0.26, full_data.metadata.boxsize * 0.74] + ).T mask_region.constrain_spatial(restrict=restrict) diff --git a/tests/test_mask.py b/tests/test_mask.py index eb47e47d..d2ce4225 100644 --- a/tests/test_mask.py +++ b/tests/test_mask.py @@ -2,9 +2,13 @@ Tests the masking using some test data. """ -from swiftsimio import load, mask, cosmo_array, cosmo_quantity +import warnings +import h5py +import pytest +from swiftsimio import load, cosmo_array, cosmo_quantity import numpy as np import unyt as u +from .helper import _mask_without_warning as mask def test_reading_select_region_spatial(cosmological_volume): @@ -45,13 +49,19 @@ def test_reading_select_region_half_box(cosmological_volume): Specifically, we test to see if all particles lie within half a boxsize. """ - full_data = load(cosmological_volume) - # Mask off the lower bottom corner of the volume. 
mask_region = mask(cosmological_volume, spatial_only=True) + # the region can be padded by a cell if min & max particle positions are absent + # in metadata + pad_frac = (mask_region.cell_size / mask_region.metadata.boxsize).to_value( + u.dimensionless + ) restrict = cosmo_array( - [np.zeros_like(full_data.metadata.boxsize), full_data.metadata.boxsize * 0.49] + [ + mask_region.metadata.boxsize * (pad_frac + 0.01), + mask_region.metadata.boxsize * (0.5 - pad_frac - 0.01), + ] ).T mask_region.constrain_spatial(restrict=restrict) @@ -59,12 +69,13 @@ def test_reading_select_region_half_box(cosmological_volume): selected_data = load(cosmological_volume, mask=mask_region) selected_coordinates = selected_data.gas.coordinates - # Some of these particles will be outside because of the periodic BCs assert ( - (selected_coordinates / full_data.metadata.boxsize).to_value(u.dimensionless) - > 0.5 + (selected_coordinates / mask_region.metadata.boxsize).to_value(u.dimensionless) + > 0.5 + pad_frac ).sum() < 25 + # make sure the test isn't trivially passed because we selected nothing: + assert selected_coordinates.size > 0 def test_region_mask_not_modified(cosmological_volume): @@ -101,9 +112,187 @@ def test_region_mask_intersection(cosmological_volume): # the intersect=True flag is optional on the first call: mask_intersect.constrain_spatial(region_1, intersect=True) mask_intersect.constrain_spatial(region_2, intersect=True) - assert ( - np.logical_or(mask_1.cell_mask, mask_2.cell_mask) == mask_intersect.cell_mask - ).all() + for group_name in mask_1.metadata.present_group_names: + assert ( + np.logical_or(mask_1.cell_mask[group_name], mask_2.cell_mask[group_name]) + == mask_intersect.cell_mask[group_name] + ).all() + + +def test_mask_periodic_wrapping(cosmological_volume): + """ + Check that a region that runs off the upper edge of the box gives the same + mask as one that runs off the lower edge (they are chosen to be equivalent + under periodic wrapping). 
+ """ + mask_region_upper = mask(cosmological_volume, spatial_only=True) + mask_region_lower = mask(cosmological_volume, spatial_only=True) + restrict_upper = cosmo_array( + [ + mask_region_upper.metadata.boxsize * 0.8, + mask_region_upper.metadata.boxsize * 1.2, + ] + ).T + restrict_lower = cosmo_array( + [ + mask_region_lower.metadata.boxsize * (-0.2), + mask_region_lower.metadata.boxsize * 0.2, + ] + ).T + + mask_region_upper.constrain_spatial(restrict=restrict_upper) + mask_region_lower.constrain_spatial(restrict=restrict_lower) + + selected_data_upper = load(cosmological_volume, mask=mask_region_upper) + selected_data_lower = load(cosmological_volume, mask=mask_region_lower) + + selected_coordinates_upper = selected_data_upper.gas.coordinates + selected_coordinates_lower = selected_data_lower.gas.coordinates + assert selected_coordinates_lower.size > 0 # check that we selected something + assert np.array_equal(selected_coordinates_upper, selected_coordinates_lower) + + +def test_mask_padding(cosmological_volume): + """ + Check that the padding of a mask when we don't have cell bounding box metadata + works correctly. + """ + + # Mask off the lower bottom corner of the volume. 
+ mask_pad_onecell = mask(cosmological_volume, spatial_only=True, safe_padding=1.0) + mask_pad_tenthcell = mask(cosmological_volume, spatial_only=True) # default 0.1 + mask_pad_off = mask(cosmological_volume, spatial_only=True, safe_padding=False) + assert mask_pad_onecell.safe_padding == 1.0 + assert mask_pad_tenthcell.safe_padding == 0.1 + assert mask_pad_off.safe_padding == 0.0 + + cell_size = mask_pad_onecell.cell_size + region = cosmo_array([np.ones(3) * 3.8 * cell_size, np.ones(3) * 4.0 * cell_size]).T + mask_pad_onecell.constrain_spatial(region) + mask_pad_tenthcell.constrain_spatial(region) + mask_pad_off.constrain_spatial(region) + + with h5py.File(cosmological_volume, "r") as f: + has_cell_bbox = "MinPositions" in f["/Cells"].keys() + if has_cell_bbox: + # We should ignore `safe_padding` and just read the cell. + # Note in case this test fails on a new snapshot: + # the assertions assume that the neighbouring cells don't + # have particles that have drifted into the target cell. + # For a different test snapshot this might not be true, + # so check for that if troubleshooting. + assert mask_pad_onecell.cell_mask["gas"].sum() == 1 + assert mask_pad_tenthcell.cell_mask["gas"].sum() == 1 + assert mask_pad_off.cell_mask["gas"].sum() == 1 + else: + # Padding by a cell length, we should read all 3x3x3 neighbours. + assert mask_pad_onecell.cell_mask["gas"].sum() == 27 + # Padding by a half-cell length, we should read 2x2x2 cells near this corner. + assert mask_pad_tenthcell.cell_mask["gas"].sum() == 8 + # Padding switched off, read only this cell. + assert mask_pad_off.cell_mask["gas"].sum() == 1 + + +def test_mask_pad_wrapping(cosmological_volume): + """ + When we mask all the way to the edge of the box, we should get a cell on the + opposite edge as padding in case particles have drifted out of their cell, + unless the cell metadata with max positions is present. 
+ """ + + mask_region_upper = mask(cosmological_volume, spatial_only=True) + mask_region_lower = mask(cosmological_volume, spatial_only=True) + restrict_lower = cosmo_array( + [mask_region_lower.metadata.boxsize * 0.8, mask_region_lower.metadata.boxsize] + ).T + restrict_upper = cosmo_array( + [ + mask_region_upper.metadata.boxsize * 0, + mask_region_upper.metadata.boxsize * 0.2, + ] + ).T + + mask_region_lower.constrain_spatial(restrict=restrict_lower) + mask_region_upper.constrain_spatial(restrict=restrict_upper) + selected_data_lower = load(cosmological_volume, mask=mask_region_lower) + selected_data_upper = load(cosmological_volume, mask=mask_region_upper) + selected_coordinates_lower = selected_data_lower.gas.coordinates + selected_coordinates_upper = selected_data_upper.gas.coordinates + with h5py.File(cosmological_volume, "r") as f: + if ( + "MinPositions" in f["/Cells"].keys() + and "MaxPositions" in f["/Cells"].keys() + ): + # in the sample files with this metadata present, we should only pick up + # a small number of extra cells with particles that drifted into our + # region. + def expected(n): + return n < 2000 + + else: + # extending upper mask to a padding cell on the other side of the box + # should get a lot of particles + def expected(n): + return n > 10000 + + assert expected( + (selected_coordinates_lower < mask_region_lower.metadata.boxsize * 0.1).sum() + ) + assert expected( + (selected_coordinates_upper > mask_region_upper.metadata.boxsize * 0.9).sum() + ) + + +def test_mask_entire_box(cosmological_volume): + """ + When we explicitly set the region to the whole box, we'd better get all of the cells! 
+ """ + mask_region = mask(cosmological_volume, spatial_only=True) + restrict = cosmo_array( + [mask_region.metadata.boxsize * 0.0, mask_region.metadata.boxsize] + ).T + + mask_region.constrain_spatial(restrict=restrict) + for group_mask in mask_region.cell_mask.values(): + assert group_mask.all() + + +def test_invalid_mask_interval(cosmological_volume): + """ + We should get an error if the mask boundaries go out of bounds. + """ + mask_region = mask(cosmological_volume, spatial_only=True) + restrict = cosmo_array( + [mask_region.metadata.boxsize * -2, mask_region.metadata.boxsize * 2] + ).T + with pytest.raises(ValueError, match="Mask region boundaries must be in interval"): + mask_region.constrain_spatial(restrict=restrict) + + +def test_inverted_mask_boundaries(cosmological_volume): + """ + Upper boundary can be below lower boundary, in that case we select wrapping + in the other direction. Check this by making an "inverted" selection and + comparing to the "uninverted" selection through the boundary. 
+    """
+    mask_region = mask(cosmological_volume, spatial_only=True)
+    mask_region_inverted = mask(cosmological_volume, spatial_only=True)
+    restrict = cosmo_array(
+        [-mask_region.metadata.boxsize * 0.2, mask_region.metadata.boxsize * 0.2]
+    ).T
+    restrict_inverted = cosmo_array(
+        [mask_region.metadata.boxsize * 0.8, mask_region.metadata.boxsize * 0.2]
+    ).T
+
+    mask_region.constrain_spatial(restrict=restrict)
+    mask_region_inverted.constrain_spatial(restrict=restrict_inverted)
+    selected_data = load(cosmological_volume, mask=mask_region)
+    selected_data_inverted = load(cosmological_volume, mask=mask_region_inverted)
+
+    selected_coordinates = selected_data.gas.coordinates
+    selected_coordinates_inverted = selected_data_inverted.gas.coordinates
+    assert selected_coordinates.size > 0  # check that we selected something
+    assert np.array_equal(selected_coordinates, selected_coordinates_inverted)


 def test_empty_mask(cosmological_volume):  # replace with cosmological_volume_no_legacy
@@ -135,3 +324,26 @@ def test_empty_mask(cosmological_volume):  # replace with cosmological_volume_no_
     )
     data = load(cosmological_volume, mask=empty_mask)
     assert data.gas.masses.size == 0
+
+
+def test_mask_pad_warning(cosmological_volume):
+    """
+    Test that the user gets a warning when masking a snapshot without cell bbox metadata
+    and doesn't otherwise. The ``cosmological_volume`` fixture will run this with both
+    recent (with bbox metadata) and legacy (without bbox metadata) snapshots.
+    """
+    from swiftsimio import mask  # not the helper that silences warnings!
+ + with h5py.File(cosmological_volume, "r") as f: + has_cell_bbox = "MinPositions" in f["/Cells"].keys() + if has_cell_bbox: + with warnings.catch_warnings(): + # fail if there's a warning when the metadata is present + warnings.simplefilter("error") + mask(cosmological_volume) + else: + # fail if there's no warning when metadata is absent + with pytest.warns( + UserWarning, match="Snapshot does not contain Cells/MinPositions" + ): + mask(cosmological_volume) diff --git a/tests/test_physical_conversion.py b/tests/test_physical_conversion.py index 5caa6602..ce79f2b1 100644 --- a/tests/test_physical_conversion.py +++ b/tests/test_physical_conversion.py @@ -3,11 +3,11 @@ import unyt as u -def test_convert(cosmological_volume): +def test_convert(cosmological_volume_only_single): """ Check that the conversion to physical units is done correctly. """ - data = load(cosmological_volume) + data = load(cosmological_volume_only_single) coords = data.gas.coordinates units = u.kpc assert units != coords.units # ensure we make a non-trivial conversion @@ -24,11 +24,11 @@ def test_convert(cosmological_volume): return -def test_convert_to_value(cosmological_volume): +def test_convert_to_value(cosmological_volume_only_single): """ Check that conversions to numerical values are correct. 
""" - data = load(cosmological_volume) + data = load(cosmological_volume_only_single) coords = data.gas.coordinates units = u.kpc assert units != coords.units # ensure we make a non-trivial conversion diff --git a/tests/test_rotate_visualisations.py b/tests/test_rotate_visualisations.py index 323613d2..548ffb06 100644 --- a/tests/test_rotate_visualisations.py +++ b/tests/test_rotate_visualisations.py @@ -8,19 +8,19 @@ from os import remove -def test_project(cosmological_volume): +def test_project(cosmological_volume_only_single): """ Checks that gas projection of a single particle snapshot is invariant under rotations around the particle Parameters ---------- - cosmological_volume: str + cosmological_volume_only_single: str name of file providing metadata to copy """ # Start from the beginning, open the file output_filename = "single_particle.hdf5" - create_n_particle_dataset(cosmological_volume, output_filename) + create_n_particle_dataset(cosmological_volume_only_single, output_filename) data = load(output_filename) unrotated = project_gas( @@ -46,19 +46,19 @@ def test_project(cosmological_volume): remove(output_filename) -def test_slice(cosmological_volume): +def test_slice(cosmological_volume_only_single): """ Checks that a slice of a single particle snapshot is invariant under rotations around the particle Parameters ---------- - cosmological_volume: str + cosmological_volume_only_single: str name of file providing metadata to copy """ # Start from the beginning, open the file output_filename = "single_particle.hdf5" - create_n_particle_dataset(cosmological_volume, output_filename) + create_n_particle_dataset(cosmological_volume_only_single, output_filename) data = load(output_filename) unrotated = slice_gas( @@ -94,19 +94,19 @@ def test_slice(cosmological_volume): remove(output_filename) -def test_render(cosmological_volume): +def test_render(cosmological_volume_only_single): """ Checks that a volume render of a single particle snapshot is invariant under 
rotations around the particle Parameters ---------- - cosmological_volume: str + cosmological_volume_only_single: str name of file providing metadata to copy """ # Start from the beginning, open the file output_filename = "single_particle.hdf5" - create_n_particle_dataset(cosmological_volume, output_filename) + create_n_particle_dataset(cosmological_volume_only_single, output_filename) data = load(output_filename) unrotated = render_gas( diff --git a/tests/test_smoothing_length_generation.py b/tests/test_smoothing_length_generation.py index e5ca1f4a..1a96d717 100644 --- a/tests/test_smoothing_length_generation.py +++ b/tests/test_smoothing_length_generation.py @@ -34,9 +34,11 @@ def test_generate_smoothing_length(cosmological_volume): dimension=3, ).to(smoothing_lengths.units) - assert isclose( - generated_smoothing_lengths.value, smoothing_lengths.value, 0.1 - ).all() + assert ( + isclose(generated_smoothing_lengths.value, smoothing_lengths.value, 0.2).sum() + / generated_smoothing_lengths.size + > 0.5 + ) return @@ -65,9 +67,11 @@ def test_generate_smoothing_length_faster(cosmological_volume): dimension=3, ).to(smoothing_lengths.units) - assert isclose( - generated_smoothing_lengths.value, smoothing_lengths.value, 0.1 - ).all() + assert ( + isclose(generated_smoothing_lengths.value, smoothing_lengths.value, 0.2).sum() + / generated_smoothing_lengths.size + > 0.5 + ) return diff --git a/tests/subset_write_test.py b/tests/test_subset_write.py similarity index 60% rename from tests/subset_write_test.py rename to tests/test_subset_write.py index 78d8d441..4b15580b 100644 --- a/tests/subset_write_test.py +++ b/tests/test_subset_write.py @@ -1,5 +1,6 @@ from swiftsimio.subset_writer import write_subset -import swiftsimio as sw +from swiftsimio import load, metadata +from .helper import _mask_without_warning as mask import os @@ -25,9 +26,10 @@ def compare_data_contents(A, B): bad_compares.append("metadata") # Compare datasets + test_was_trivial = True # make sure we 
match at least some non-empty arrays for part_type in filter( lambda x: hasattr(A, x), - sw.metadata.particle_types.particle_name_underscores.values(), + metadata.particle_types.particle_name_underscores.values(), ): A_type = getattr(A, part_type) B_type = getattr(B, part_type) @@ -38,14 +40,22 @@ def compare_data_contents(A, B): for attr in particle_dataset_field_names: param_A = getattr(A_type, attr) param_B = getattr(B_type, attr) - try: - if not (param_A == param_B): + if len(param_A) == 0 and len(param_B) == 0: + # both arrays are empty, counts as a match + continue + else: + # compared at least one non-empty data array + test_was_trivial = False + comparison = param_A == param_B + if type(comparison) is bool: # guards len in elif, don't merge nested if + if not comparison: bad_compares.append(f"{part_type} {attr}") - except ValueError: - if not (param_A == param_B).all(): + elif len(comparison) > 1: + if not comparison.all(): bad_compares.append(f"{part_type} {attr}") assert bad_compares == [], f"compare failed on {bad_compares}" + assert not test_was_trivial def test_subset_writer(cosmological_volume): @@ -59,27 +69,24 @@ def test_subset_writer(cosmological_volume): outfile = "subset_cosmological_volume.hdf5" # Create a mask - mask = sw.mask(cosmological_volume) - - boxsize = mask.metadata.boxsize - - # Decide which region we want to load - load_region = [[0.25 * b, 0.75 * b] for b in boxsize] - mask.constrain_spatial(load_region) + full_mask = mask(cosmological_volume) + load_region = [[0.25 * b, 0.75 * b] for b in full_mask.metadata.boxsize] + full_mask.constrain_spatial(load_region) # Write the subset - write_subset(outfile, mask) + write_subset(outfile, full_mask) # Compare subset of written subset of snapshot against corresponding region in # full snapshot. This checks that both the metadata and dataset subsets are # written properly. 
- sub_mask = sw.mask(outfile) - sub_load_region = [[0.375 * b, 0.625 * b] for b in boxsize] + sub_mask = mask(outfile) + sub_load_region = [[0.375 * b, 0.625 * b] for b in sub_mask.metadata.boxsize] sub_mask.constrain_spatial(sub_load_region) - mask.constrain_spatial(sub_load_region) + # Update the spatial region to match what we load from the subset. + full_mask.constrain_spatial(sub_load_region) - snapshot = sw.load(cosmological_volume, mask) - sub_snapshot = sw.load(outfile, sub_mask) + snapshot = load(cosmological_volume, full_mask) + sub_snapshot = load(outfile, sub_mask) compare_data_contents(snapshot, sub_snapshot) diff --git a/tests/test_vistools.py b/tests/test_vistools.py index eff33f35..1494cc3f 100644 --- a/tests/test_vistools.py +++ b/tests/test_vistools.py @@ -49,8 +49,8 @@ def horizontal_func(y): return -def test_get_projection_field(cosmo_volume_example): - sd = load(cosmo_volume_example) +def test_get_projection_field(cosmological_volume_only_single): + sd = load(cosmological_volume_only_single) expected_dataset = sd.gas.masses obtained_dataset = _get_projection_field(sd.gas, "masses") assert np.allclose(expected_dataset, obtained_dataset) diff --git a/tests/test_visualisation.py b/tests/test_visualisation.py index 7b68ad46..3c78d1d4 100644 --- a/tests/test_visualisation.py +++ b/tests/test_visualisation.py @@ -169,7 +169,7 @@ def test_scatter_parallel(self, save=False): return @pytest.mark.parametrize("backend", projection_backends.keys()) - def test_equivalent_regions(self, backend, cosmo_volume_example): + def test_equivalent_regions(self, backend, cosmological_volume_only_single): """ Here we test that various regions are (close enough to) equivalent. The ref_img is just a projection through the whole box. @@ -186,121 +186,124 @@ def test_equivalent_regions(self, backend, cosmo_volume_example): box. We check that it matches the expected region of the ref_img (with the edges trimmed a bit). 
""" - sd = load(cosmo_volume_example) + sd = load(cosmological_volume_only_single) if backend == "gpu": # https://github.com/SWIFTSIM/swiftsimio/issues/229 pytest.xfail("gpu backend currently broken") parallel = True lbox = sd.metadata.boxsize[0].to_comoving().to_value(unyt.Mpc) box_res = 256 - ref_img = project_gas( - sd, - region=cosmo_array( - [0, lbox, 0, lbox], - unyt.Mpc, - comoving=True, - scale_factor=sd.metadata.a, - scale_exponent=1, - ), - resolution=box_res, - parallel=parallel, - periodic=True, - backend=backend, - ) - big_img = project_gas( - sd, - region=cosmo_array( - [0, 3 * lbox, 0, 3 * lbox], - unyt.Mpc, - comoving=True, - scale_factor=sd.metadata.a, - scale_exponent=1, - ), - resolution=box_res * 3, - parallel=parallel, - periodic=True, - backend=backend, - ) - far_img = project_gas( - sd, - region=cosmo_array( - [50 * lbox, 51 * lbox, 50 * lbox, 51 * lbox], - unyt.Mpc, - comoving=True, - scale_factor=sd.metadata.a, - scale_exponent=1, - ), - resolution=box_res, - parallel=parallel, - periodic=True, - backend=backend, - ) - depth_img = project_gas( - sd, - region=cosmo_array( - [0, lbox, 0, lbox, 0, 0.3 * lbox], - unyt.Mpc, - comoving=True, - scale_factor=sd.metadata.a, - scale_exponent=1, - ), - resolution=box_res, - parallel=parallel, - backend=backend, - ) - neg_depth_img = project_gas( - sd, - region=cosmo_array( - [0, lbox, 0, lbox, -lbox, -0.7 * lbox], - unyt.Mpc, - comoving=True, - scale_factor=sd.metadata.a, - scale_exponent=1, - ), - resolution=box_res, - parallel=parallel, - backend=backend, - ) - wrap_depth_img = project_gas( - sd, - region=cosmo_array( - [0, lbox, 0, lbox, lbox, 1.3 * lbox], - unyt.Mpc, - comoving=True, - scale_factor=sd.metadata.a, - scale_exponent=1, - ), - resolution=box_res, - parallel=parallel, - backend=backend, - ) - straddled_depth_img = project_gas( - sd, - region=cosmo_array( - [0, lbox, 0, lbox, -0.5 * lbox, 0.5 * lbox], - unyt.Mpc, - comoving=True, - scale_factor=sd.metadata.a, - scale_exponent=1, - ), - 
resolution=box_res, - parallel=parallel, - backend=backend, - ) - edge_img = project_gas( - sd, - region=cosmo_array( - [-0.25 * lbox, 0.75 * lbox, 0.5 * lbox, 1.5 * lbox], - unyt.Mpc, - comoving=True, - scale_factor=sd.metadata.a, - scale_exponent=1, - ), - resolution=box_res, - parallel=parallel, - periodic=False, - backend=backend, - ) + with np.errstate( + invalid="ignore" + ): # invalid value encountered in divide happens sometimes + ref_img = project_gas( + sd, + region=cosmo_array( + [0, lbox, 0, lbox], + unyt.Mpc, + comoving=True, + scale_factor=sd.metadata.a, + scale_exponent=1, + ), + resolution=box_res, + parallel=parallel, + periodic=True, + backend=backend, + ) + big_img = project_gas( + sd, + region=cosmo_array( + [0, 3 * lbox, 0, 3 * lbox], + unyt.Mpc, + comoving=True, + scale_factor=sd.metadata.a, + scale_exponent=1, + ), + resolution=box_res * 3, + parallel=parallel, + periodic=True, + backend=backend, + ) + far_img = project_gas( + sd, + region=cosmo_array( + [50 * lbox, 51 * lbox, 50 * lbox, 51 * lbox], + unyt.Mpc, + comoving=True, + scale_factor=sd.metadata.a, + scale_exponent=1, + ), + resolution=box_res, + parallel=parallel, + periodic=True, + backend=backend, + ) + depth_img = project_gas( + sd, + region=cosmo_array( + [0, lbox, 0, lbox, 0, 0.3 * lbox], + unyt.Mpc, + comoving=True, + scale_factor=sd.metadata.a, + scale_exponent=1, + ), + resolution=box_res, + parallel=parallel, + backend=backend, + ) + neg_depth_img = project_gas( + sd, + region=cosmo_array( + [0, lbox, 0, lbox, -lbox, -0.7 * lbox], + unyt.Mpc, + comoving=True, + scale_factor=sd.metadata.a, + scale_exponent=1, + ), + resolution=box_res, + parallel=parallel, + backend=backend, + ) + wrap_depth_img = project_gas( + sd, + region=cosmo_array( + [0, lbox, 0, lbox, lbox, 1.3 * lbox], + unyt.Mpc, + comoving=True, + scale_factor=sd.metadata.a, + scale_exponent=1, + ), + resolution=box_res, + parallel=parallel, + backend=backend, + ) + straddled_depth_img = project_gas( + sd, + 
region=cosmo_array( + [0, lbox, 0, lbox, -0.5 * lbox, 0.5 * lbox], + unyt.Mpc, + comoving=True, + scale_factor=sd.metadata.a, + scale_exponent=1, + ), + resolution=box_res, + parallel=parallel, + backend=backend, + ) + edge_img = project_gas( + sd, + region=cosmo_array( + [-0.25 * lbox, 0.75 * lbox, 0.5 * lbox, 1.5 * lbox], + unyt.Mpc, + comoving=True, + scale_factor=sd.metadata.a, + scale_exponent=1, + ), + resolution=box_res, + parallel=parallel, + periodic=False, + backend=backend, + ) edge_mask = np.s_[box_res // 6 : -box_res // 6, box_res // 6 : -box_res // 6] assert fraction_within_tolerance( edge_img[box_res // 4 :, : box_res // 2][edge_mask], @@ -308,7 +311,7 @@ def test_equivalent_regions(self, backend, cosmo_volume_example): ) assert np.allclose(far_img, ref_img) assert fraction_within_tolerance( - big_img, np.concatenate([np.hstack([ref_img] * 3)] * 3, axis=1) + big_img, np.concatenate([np.hstack([ref_img] * 3)] * 3, axis=1), frac=0.98 ) assert np.allclose(depth_img, neg_depth_img) assert np.allclose(depth_img, wrap_depth_img) @@ -492,7 +495,7 @@ def test_periodic_boundary_wrapping(self, backend): assert (image1 == image2).all() @pytest.mark.parametrize("backend", slice_backends.keys()) - def test_equivalent_regions(self, backend, cosmo_volume_example): + def test_equivalent_regions(self, backend, cosmological_volume_only_single): """ Here we test that various regions are (close enough to) equivalent. The ref_img is just a slice through the whole box at z=0.5 * lbox. @@ -506,7 +509,7 @@ def test_equivalent_regions(self, backend, cosmo_volume_example): box. We check that it matches the expected region of the ref_img (with the edges trimmed a bit). 
""" - sd = load(cosmo_volume_example) + sd = load(cosmological_volume_only_single) parallel = True lbox = sd.metadata.boxsize[0].to_comoving().to_value(unyt.Mpc) box_res = 256 @@ -764,8 +767,10 @@ def test_folding_deposit(self): assert deposition_1[100, 100, 100] * 8.0 == deposition_2[200, 200, 200] - def test_volume_render_and_unfolded_deposit_with_units(self, cosmological_volume): - data = load(cosmological_volume) + def test_volume_render_and_unfolded_deposit_with_units( + self, cosmological_volume_only_single + ): + data = load(cosmological_volume_only_single) data.gas.smoothing_lengths = 1e-30 * data.gas.smoothing_lengths npix = 64 @@ -840,7 +845,7 @@ def test_periodic_boundary_wrapping(self): assert (image1 == image2).all() - def test_equivalent_regions(self, cosmo_volume_example): + def test_equivalent_regions(self, cosmological_volume_only_single): """ Here we test that various regions are (close enough to) equivalent. The ref_img is just a render of the whole box. @@ -854,7 +859,7 @@ def test_equivalent_regions(self, cosmo_volume_example): box. We check that it matches the expected region of the ref_img (with the edges trimmed a bit). """ - sd = load(cosmo_volume_example) + sd = load(cosmological_volume_only_single) parallel = False # memory gets a bit out of hand otherwise lbox = sd.metadata.boxsize[0].to_comoving().to_value(unyt.Mpc) box_res = 32 @@ -954,8 +959,8 @@ def test_equivalent_regions(self, cosmo_volume_example): ) -def test_selection_render(cosmological_volume): - data = load(cosmological_volume) +def test_selection_render(cosmological_volume_only_single): + data = load(cosmological_volume_only_single) bs = data.metadata.boxsize[0] # Projection @@ -995,32 +1000,34 @@ def test_selection_render(cosmological_volume): return -def test_comoving_versus_physical(cosmological_volume): +def test_comoving_versus_physical(cosmological_volume_only_single): """ Test what happens if you try to mix up physical and comoving quantities. 
""" # this test is pretty slow if we don't mask out some particles - m = mask(cosmological_volume) + m = mask(cosmological_volume_only_single) boxsize = m.metadata.boxsize - m.constrain_spatial([[0.0 * b, 0.2 * b] for b in boxsize]) + m.constrain_spatial([[0.0 * b, 0.05 * b] for b in boxsize]) region = [ 0.0 * boxsize[0], - 0.2 * boxsize[0], + 0.05 * boxsize[0], 0.0 * boxsize[1], - 0.2 * boxsize[1], + 0.05 * boxsize[1], 0.0 * boxsize[2], - 0.2 * boxsize[2], + 0.05 * boxsize[2], ] for func, aexp in [(project_gas, -2.0), (slice_gas, -3.0), (render_gas, -3.0)]: # normal case: everything comoving - data = load(cosmological_volume, mask=m) + data = load(cosmological_volume_only_single, mask=m) # we force the default (project="masses") to check the cosmo_factor # conversion in this case - img = func(data, resolution=64, project=None, region=region) + img = func(data, resolution=64, project="masses", region=region, parallel=True) assert data.gas.masses.comoving and img.comoving assert (img.cosmo_factor.expr - a ** (aexp)).simplify() == 0 - img = func(data, resolution=64, project="densities", region=region) + img = func( + data, resolution=64, project="densities", region=region, parallel=True + ) assert data.gas.densities.comoving and img.comoving assert (img.cosmo_factor.expr - a ** (aexp - 3.0)).simplify() == 0 # try to mix comoving coordinates with a physical variable: @@ -1032,14 +1039,22 @@ def test_comoving_versus_physical(cosmological_volume): with pytest.warns( UserWarning, match="Converting coordinate grid to physical." 
): - img = func(data, resolution=64, project="densities", region=region) + img = func( + data, + resolution=64, + project="densities", + region=region, + parallel=True, + ) assert data.gas.densities.comoving is False and img.comoving is False assert (img.cosmo_factor.expr - a ** (aexp - 3.0)).simplify() == 0 # convert coordinates to physical (but not smoothing lengths): # the coordinates (copy) should convert back to comoving to match the masses data.gas.coordinates.convert_to_physical() with pytest.warns(UserWarning, match="Converting coordinate grid to comoving."): - img = func(data, resolution=64, project="masses", region=region) + img = func( + data, resolution=64, project="masses", region=region, parallel=True + ) assert data.gas.masses.comoving and img.comoving assert (img.cosmo_factor.expr - a ** (aexp)).simplify() == 0 # also convert smoothing lengths to physical @@ -1051,12 +1066,16 @@ def test_comoving_versus_physical(cosmological_volume): with pytest.warns( UserWarning, match="Converting coordinate grid to comoving." ): - img = func(data, resolution=64, project="masses", region=region) + img = func( + data, resolution=64, project="masses", region=region, parallel=True + ) assert data.gas.masses.comoving and img.comoving assert (img.cosmo_factor.expr - a ** aexp).simplify() == 0 # densities are physical, make sure this works with physical coordinates and # smoothing lengths - img = func(data, resolution=64, project="densities", region=region) + img = func( + data, resolution=64, project="densities", region=region, parallel=True + ) assert data.gas.densities.comoving is False and img.comoving is False assert (img.cosmo_factor.expr - a ** (aexp - 3.0)).simplify() == 0 # now try again with comoving densities, should work and give a comoving img @@ -1068,18 +1087,24 @@ def test_comoving_versus_physical(cosmological_volume): with pytest.warns( UserWarning, match="Converting coordinate grid to comoving." 
): - img = func(data, resolution=64, project="densities", region=region) + img = func( + data, + resolution=64, + project="densities", + region=region, + parallel=True, + ) assert data.gas.densities.comoving and img.comoving assert (img.cosmo_factor.expr - a ** (aexp - 3.0)).simplify() == 0 -def test_nongas_smoothing_lengths(cosmological_volume): +def test_nongas_smoothing_lengths(cosmological_volume_only_single): """ Test that the visualisation tools to calculate smoothing lengths give usable results. """ # If project_gas runs without error the smoothing lengths seem usable. - data = load(cosmological_volume) + data = load(cosmological_volume_only_single) data.dark_matter.smoothing_length = generate_smoothing_lengths( data.dark_matter.coordinates, data.metadata.boxsize, kernel_gamma=1.8 ) @@ -1102,8 +1127,10 @@ def test_nongas_smoothing_lengths(cosmological_volume): class TestPowerSpectrum: - def test_dark_matter_power_spectrum(self, cosmo_volume_example, save=False): - data = load(cosmo_volume_example) + def test_dark_matter_power_spectrum( + self, cosmological_volume_only_single, save=False + ): + data = load(cosmological_volume_only_single) data.dark_matter.smoothing_lengths = generate_smoothing_lengths( data.dark_matter.coordinates, data.metadata.boxsize, kernel_gamma=1.8