Conversation
…acement' into z-bin-refactoring
|
We need to update the tests in test_3_corr.py. There are currently 7 commented-out functions that need to be updated to test for changes in |
|
@andreicuceu I've opened a new pull request, #1128, to work on those changes. Once the pull request is ready, I'll request review from you. |
…files, remove stale files Co-authored-by: andreicuceu <28951059+andreicuceu@users.noreply.github.com> Agent-Logs-Url: https://github.com/igmhub/picca/sessions/4a972c8d-6f80-4b0e-9e28-447f9190ff96
There was a problem hiding this comment.
Pull request overview
This PR refactors the “z-cut” handling across the correlation/covariance pipeline by removing the legacy z_cut_min/z_cut_max globals and CLI flags, standardizing on z_min_pairs/z_max_pairs, and propagating the pair-redshift cut into FITS headers (for downstream tools like Vega).
Changes:
- Remove legacy
z_cut_min/z_cut_maxglobals and neighbour pre-filtering incf.py/xcf.py; rely onz_min_pairs/z_max_pairscuts inside the pair-accumulation kernels. - Update main 3D correlation/distortion scripts to write
ZMIN/ZMAXheaders (instead ofZCUTMIN/ZCUTMAX) and update tests/golden FITS accordingly. - Drop
--z-cut-min/--z-cut-maxCLI options from multiple scripts and remove several “zcuts” test cases.
Reviewed changes
Copilot reviewed 3 out of 39 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
py/picca/cf.py |
Removes z_cut_* globals and neighbour-level z-cut filtering in fill_neighs; pair-z cuts remain in the Numba kernels via z_min_pairs/z_max_pairs. |
py/picca/xcf.py |
Same cleanup as cf.py for cross-correlation neighbours; pair-z cuts remain in the Numba kernels. |
py/picca/bin/picca_cf.py |
Writes ZMIN/ZMAX FITS keywords using z_min_pairs/z_max_pairs; removes legacy z-cut CLI flags. |
py/picca/bin/picca_xcf.py |
Same as picca_cf.py but for absorber–object cross-correlation. |
py/picca/bin/picca_dmat.py |
Switches dmat header keywords from ZCUTMIN/MAX to ZMIN/ZMAX; removes legacy z-cut CLI flags. |
py/picca/bin/picca_xdmat.py |
Same as picca_dmat.py for cross distortion matrices. |
py/picca/bin/picca_export.py |
Adds ZMIN/ZMAX propagation into exported products’ headers. |
py/picca/bin/picca_wick.py |
Removes legacy z-cut CLI and header keywords for Wick covariance (no replacement added). |
py/picca/bin/picca_xwick.py |
Removes legacy z-cut CLI and header keywords for cross Wick covariance (no replacement added). |
py/picca/bin/picca_cf_angl.py |
Removes legacy z-cut CLI and header keywords for angular CF. |
py/picca/bin/picca_xcf_angl.py |
Removes legacy z-cut CLI and header keywords for angular XCF. |
py/picca/bin/picca_xcf1d.py |
Removes legacy z-cut CLI and header keywords for 1D XCF. |
py/picca/bin/picca_metal_dmat.py |
Removes legacy z-cut CLI and header keywords for metal dmat. |
py/picca/bin/picca_metal_xdmat.py |
Removes legacy z-cut CLI and header keywords for metal xdmat. |
py/picca/bin/picca_fast_metal_dmat.py |
Marks script as deprecated; removes legacy z-cut CLI and header keywords. |
py/picca/bin/picca_fast_metal_xdmat.py |
Marks script as deprecated; removes legacy z-cut CLI and header keywords. |
py/picca/bin/picca_compute_pk_pksb.py |
Adds deprecation notice to the CLI description. |
py/picca/bin/picca_coadd_zint.py |
Adds deprecation notice in docstring/parser description. |
py/picca/tests/test_3_cor.py |
Updates tests to use --z-min-pairs/--z-max-pairs and removes several z-cut-specific test cases. |
py/picca/tests/data/test_cor/cf_zcuts.fits.gz |
Updates golden output for renamed pair-z keywords. |
py/picca/tests/data/test_cor/cf_image_zcuts.fits.gz |
Updates golden output for renamed pair-z keywords. |
py/picca/tests/data/test_cor/dmat_zcuts.fits.gz |
Updates golden output for renamed pair-z keywords. |
py/picca/tests/data/test_cor/xdmat_zcuts.fits.gz |
Updates golden output for renamed pair-z keywords. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ang = delta.get_angle_between(neighbours) | ||
| w = ang < ang_max | ||
| neighbours = np.array(neighbours)[w] | ||
| if data2 is not None: | ||
| delta.neighbours = [ | ||
| other_delta | ||
| for other_delta in neighbours | ||
| if ( | ||
| (other_delta.z[-1] + delta.z[-1]) / 2.0 >= z_cut_min | ||
| and (other_delta.z[-1] + delta.z[-1]) / 2.0 < z_cut_max | ||
| ) | ||
| ] | ||
| delta.neighbours = neighbours | ||
| else: | ||
| delta.neighbours = [ | ||
| other_delta | ||
| for other_delta in neighbours | ||
| if ( | ||
| delta.ra > other_delta.ra | ||
| and (other_delta.z[-1] + delta.z[-1]) / 2.0 >= z_cut_min | ||
| and (other_delta.z[-1] + delta.z[-1]) / 2.0 < z_cut_max | ||
| ) | ||
| ] |
There was a problem hiding this comment.
fill_neighs() no longer applies any coarse filtering based on the configured pair-z range (z_min_pairs/z_max_pairs). The cut is still enforced later inside the per-pixel Numba loops, but removing the pre-filter can substantially increase the number of forest-pairs iterated over (and thus runtime) when tight z-bins are used. Consider reintroducing a fast neighbour-level precheck (e.g., using each forest’s min/max pixel z to decide whether any pair can pass the z cut) to keep performance similar to the previous implementation.
There was a problem hiding this comment.
@HiramHerrera, this is a great point. While it's mistaken that the previous implementation helped in any way with your runs, it would be great to use your z_min_pairs/z_max_pairs to check the ends of the forests and throw away all neighbours where no pixel pair would be computed.
| if not ang_correlation: | ||
| r_comov = np.array([obj.r_comov for obj in neighbours]) | ||
| f = r_trans_max if rmu_binning else 1 | ||
| w &= (delta.r_comov[0] - r_comov) * np.cos(ang / 2.) < r_par_max * f | ||
| w &= (delta.r_comov[-1] - r_comov) * np.cos(ang / 2.) > r_par_min * f | ||
|
|
||
| neighbours = np.array(neighbours)[w] | ||
| delta.neighbours = np.array( | ||
| [ | ||
| obj | ||
| for obj in neighbours | ||
| if ( | ||
| (delta.z[-1] + obj.z_qso) / 2.0 >= z_cut_min | ||
| and (delta.z[-1] + obj.z_qso) / 2.0 < z_cut_max | ||
| ) | ||
| ] | ||
| ) | ||
| delta.neighbours = np.array(neighbours)[w] | ||
|
|
There was a problem hiding this comment.
In xcf.fill_neighs(), the neighbour list is no longer pre-filtered by the configured absorber–object pair-z range (z_min_pairs/z_max_pairs). The cut is still applied later inside compute_xi_forest_pairs_fast, but without a neighbour-level precheck the code may iterate over many more forest–object pairs than necessary when z-bins are narrow. Consider adding a cheap prefilter based on each forest’s z-range and the object redshift to reduce runtime/allocations.
iprafols
left a comment
There was a problem hiding this comment.
Overall it looks fine, I have just a couple of comments. They affect several files and I have repeated them in all places to make it easier to addreess them
| 'name': 'NT', | ||
| 'value': cf.num_bins_r_trans, | ||
| 'comment': 'Number of bins in angle' | ||
| }, { |
There was a problem hiding this comment.
In picca_cf.py, picca_dmat.py, picca_export.py, etc. this was replaced by ZMIN and ZMAX adding the minimum and maximum redshift of pairs. Shouldn't we do the same here for consistency?
| 'value': args.coef_binning_model, | ||
| 'comment': 'Coefficient for model binning' | ||
| }, | ||
| { |
There was a problem hiding this comment.
In picca_cf.py, picca_dmat.py, picca_export.py, etc. this was replaced by ZMIN and ZMAX adding the minimum and maximum redshift of pairs. Shouldn't we do the same here for consistency?
| return dmat_data | ||
|
|
||
|
|
||
| def main(cmdargs=None): |
There was a problem hiding this comment.
Are we not deprecating this as well? Is this also part of the functionality added to Vega?
| required=False, | ||
| help='Max redshift for object field') | ||
|
|
||
| parser.add_argument( |
There was a problem hiding this comment.
In picca_cf.py, picca_dmat.py, picca_export.py, etc. this was replaced by ZMIN and ZMAX adding the minimum and maximum redshift of pairs. Shouldn't we do the same here for consistency?
| return dmat_data | ||
|
|
||
|
|
||
| def main(cmdargs=None): |
There was a problem hiding this comment.
Are we not deprecating this as well? Is this also part of the functionality added to Vega?
| 'value': cf.num_bins_r_trans, | ||
| 'comment': 'Number of bins in r-transverse' | ||
| }, | ||
| { |
There was a problem hiding this comment.
In picca_cf.py, picca_dmat.py, picca_export.py, etc. this was replaced by ZMIN and ZMAX adding the minimum and maximum redshift of pairs. Shouldn't we do the same here for consistency?
| 'name': 'NP', | ||
| 'value': xcf.num_bins_r_par, | ||
| 'comment': 'Number of bins in wavelength ratio' | ||
| }, { |
There was a problem hiding this comment.
In picca_cf.py, picca_dmat.py, picca_export.py, etc. this was replaced by ZMIN and ZMAX adding the minimum and maximum redshift of pairs. Shouldn't we do the same here for consistency?
| 'name': 'NT', | ||
| 'value': xcf.num_bins_r_trans, | ||
| 'comment': 'Number of bins in angle' | ||
| }, { |
There was a problem hiding this comment.
In picca_cf.py, picca_dmat.py, picca_export.py, etc. this was replaced by ZMIN and ZMAX adding the minimum and maximum redshift of pairs. Shouldn't we do the same here for consistency?
| 'value': xcf.num_bins_r_trans, | ||
| 'comment': 'Number of bins in r-transverse' | ||
| }, | ||
| { |
There was a problem hiding this comment.
In picca_cf.py, picca_dmat.py, picca_export.py, etc. this was replaced by ZMIN and ZMAX adding the minimum and maximum redshift of pairs. Shouldn't we do the same here for consistency?
HiramHerrera
left a comment
There was a problem hiding this comment.
Overall, this looks good to me. Aside from the points already raised by Ignasi, my only additional comment is about the export script. As currently implemented, it would cause previously generated files to fail if the script is re-run on them.
There was a problem hiding this comment.
These lines will cause the export code to fail for previously generated correlations. While this may not be common, it could still lead to issues.
For backward compatibility, I suggest guarding this block so it only executes when ZMIN and ZMAX are present in the headers.
Clean up old z-bin code and propagate z-bin cuts through to each file header so they can be read in Vega