Z bin refactoring#1125

Open
andreicuceu wants to merge 9 commits into main from z-bin-refactoring

Conversation

@andreicuceu
Contributor

Clean up old z-bin code and propagate z-bin cuts through to each file header so they can be read in Vega

@andreicuceu
Contributor Author

We need to update the tests in test_3_cor.py. There are currently 7 commented-out functions that need to be updated to test for changes in z-min-pairs and z-max-pairs instead of z-cut-min and z-cut-max. The corresponding test files in test_cor/ also need to be updated. Finally, a number of tests were removed because they tested z-cut-min and z-cut-max in scripts that are no longer relevant; the files in test_cor/ corresponding to those removed tests should also be deleted. @copilot

Contributor

Copilot AI commented Mar 23, 2026

@andreicuceu I've opened a new pull request, #1128, to work on those changes. Once the pull request is ready, I'll request review from you.

Contributor

Copilot AI left a comment


Pull request overview

This PR refactors the “z-cut” handling across the correlation/covariance pipeline by removing the legacy z_cut_min/z_cut_max globals and CLI flags, standardizing on z_min_pairs/z_max_pairs, and propagating the pair-redshift cut into FITS headers (for downstream tools like Vega).

Changes:

  • Remove legacy z_cut_min/z_cut_max globals and neighbour pre-filtering in cf.py / xcf.py; rely on z_min_pairs/z_max_pairs cuts inside the pair-accumulation kernels.
  • Update main 3D correlation/distortion scripts to write ZMIN/ZMAX headers (instead of ZCUTMIN/ZCUTMAX) and update tests/golden FITS accordingly.
  • Drop --z-cut-min/--z-cut-max CLI options from multiple scripts and remove several “zcuts” test cases.
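The renamed options follow the scripts' existing argparse pattern. A minimal sketch, assuming default bounds of 0 and 10 (the actual defaults and help strings in the picca scripts may differ):

```python
import argparse

# Sketch of the renamed CLI options replacing --z-cut-min/--z-cut-max;
# exact help strings and defaults in the picca scripts may differ.
parser = argparse.ArgumentParser(description='Compute the correlation function')
parser.add_argument('--z-min-pairs', type=float, default=0., required=False,
                    help='Minimum mean redshift of pixel pairs')
parser.add_argument('--z-max-pairs', type=float, default=10., required=False,
                    help='Maximum mean redshift of pixel pairs')

args = parser.parse_args(['--z-min-pairs', '2.1', '--z-max-pairs', '3.2'])
print(args.z_min_pairs, args.z_max_pairs)  # → 2.1 3.2
```

argparse converts the dashed option names to the attributes `z_min_pairs`/`z_max_pairs`, matching the module-level variable names used in cf.py/xcf.py.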

Reviewed changes

Copilot reviewed 3 out of 39 changed files in this pull request and generated 2 comments.

File Description
py/picca/cf.py Removes z_cut_* globals and neighbour-level z-cut filtering in fill_neighs; pair-z cuts remain in the Numba kernels via z_min_pairs/z_max_pairs.
py/picca/xcf.py Same cleanup as cf.py for cross-correlation neighbours; pair-z cuts remain in the Numba kernels.
py/picca/bin/picca_cf.py Writes ZMIN/ZMAX FITS keywords using z_min_pairs/z_max_pairs; removes legacy z-cut CLI flags.
py/picca/bin/picca_xcf.py Same as picca_cf.py but for absorber–object cross-correlation.
py/picca/bin/picca_dmat.py Switches dmat header keywords from ZCUTMIN/MAX to ZMIN/ZMAX; removes legacy z-cut CLI flags.
py/picca/bin/picca_xdmat.py Same as picca_dmat.py for cross distortion matrices.
py/picca/bin/picca_export.py Adds ZMIN/ZMAX propagation into exported products’ headers.
py/picca/bin/picca_wick.py Removes legacy z-cut CLI and header keywords for Wick covariance (no replacement added).
py/picca/bin/picca_xwick.py Removes legacy z-cut CLI and header keywords for cross Wick covariance (no replacement added).
py/picca/bin/picca_cf_angl.py Removes legacy z-cut CLI and header keywords for angular CF.
py/picca/bin/picca_xcf_angl.py Removes legacy z-cut CLI and header keywords for angular XCF.
py/picca/bin/picca_xcf1d.py Removes legacy z-cut CLI and header keywords for 1D XCF.
py/picca/bin/picca_metal_dmat.py Removes legacy z-cut CLI and header keywords for metal dmat.
py/picca/bin/picca_metal_xdmat.py Removes legacy z-cut CLI and header keywords for metal xdmat.
py/picca/bin/picca_fast_metal_dmat.py Marks script as deprecated; removes legacy z-cut CLI and header keywords.
py/picca/bin/picca_fast_metal_xdmat.py Marks script as deprecated; removes legacy z-cut CLI and header keywords.
py/picca/bin/picca_compute_pk_pksb.py Adds deprecation notice to the CLI description.
py/picca/bin/picca_coadd_zint.py Adds deprecation notice in docstring/parser description.
py/picca/tests/test_3_cor.py Updates tests to use --z-min-pairs/--z-max-pairs and removes several z-cut-specific test cases.
py/picca/tests/data/test_cor/cf_zcuts.fits.gz Updates golden output for renamed pair-z keywords.
py/picca/tests/data/test_cor/cf_image_zcuts.fits.gz Updates golden output for renamed pair-z keywords.
py/picca/tests/data/test_cor/dmat_zcuts.fits.gz Updates golden output for renamed pair-z keywords.
py/picca/tests/data/test_cor/xdmat_zcuts.fits.gz Updates golden output for renamed pair-z keywords.


Comment on lines 123 to 135
ang = delta.get_angle_between(neighbours)
w = ang < ang_max
neighbours = np.array(neighbours)[w]
if data2 is not None:
    delta.neighbours = [
        other_delta
        for other_delta in neighbours
        if (
            (other_delta.z[-1] + delta.z[-1]) / 2.0 >= z_cut_min
            and (other_delta.z[-1] + delta.z[-1]) / 2.0 < z_cut_max
        )
    ]
    delta.neighbours = neighbours
else:
    delta.neighbours = [
        other_delta
        for other_delta in neighbours
        if (
            delta.ra > other_delta.ra
            and (other_delta.z[-1] + delta.z[-1]) / 2.0 >= z_cut_min
            and (other_delta.z[-1] + delta.z[-1]) / 2.0 < z_cut_max
        )
    ]

Copilot AI commented Mar 24, 2026


fill_neighs() no longer applies any coarse filtering based on the configured pair-z range (z_min_pairs/z_max_pairs). The cut is still enforced later inside the per-pixel Numba loops, but removing the pre-filter can substantially increase the number of forest-pairs iterated over (and thus runtime) when tight z-bins are used. Consider reintroducing a fast neighbour-level precheck (e.g., using each forest’s min/max pixel z to decide whether any pair can pass the z cut) to keep performance similar to the previous implementation.
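One possible shape for such a precheck, as a minimal sketch (the function name and calling convention are hypothetical; it assumes each forest's pixel redshift array is sorted in increasing order):

```python
def pair_can_pass_z_cut(z1, z2, z_min_pairs, z_max_pairs):
    """Conservative precheck: can ANY pixel pair of two forests pass the
    pair-z cut?  z1 and z2 are the sorted pixel redshift arrays.

    The mean redshift of a pixel pair, (z1[i] + z2[j]) / 2, is bounded
    below by the means of the first pixels and above by the means of the
    last pixels, so only the forest endpoints need to be checked.
    """
    min_pair_z = (z1[0] + z2[0]) / 2.0
    max_pair_z = (z1[-1] + z2[-1]) / 2.0
    return max_pair_z >= z_min_pairs and min_pair_z < z_max_pairs

# Forests spanning z in [2.0, 2.4] and [2.1, 2.5]: the highest possible
# pair redshift is 2.45, so a z-bin starting at 2.9 can be skipped.
print(pair_can_pass_z_cut([2.0, 2.4], [2.1, 2.5], 2.9, 3.5))  # → False
```

Because the check is conservative (it never rejects a pair that could pass), the exact per-pair cut inside the Numba kernels remains the source of truth.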

Contributor Author


@HiramHerrera, this is a great point. Although the previous implementation didn't actually help your runs in any way, it would be great to use z_min_pairs/z_max_pairs to check the ends of the forests and throw away all neighbours for which no pixel pair would be computed.

Comment on lines 117 to 124
if not ang_correlation:
    r_comov = np.array([obj.r_comov for obj in neighbours])
    f = r_trans_max if rmu_binning else 1
    w &= (delta.r_comov[0] - r_comov) * np.cos(ang / 2.) < r_par_max * f
    w &= (delta.r_comov[-1] - r_comov) * np.cos(ang / 2.) > r_par_min * f

neighbours = np.array(neighbours)[w]
delta.neighbours = np.array(
    [
        obj
        for obj in neighbours
        if (
            (delta.z[-1] + obj.z_qso) / 2.0 >= z_cut_min
            and (delta.z[-1] + obj.z_qso) / 2.0 < z_cut_max
        )
    ]
)
delta.neighbours = np.array(neighbours)[w]


Copilot AI commented Mar 24, 2026


In xcf.fill_neighs(), the neighbour list is no longer pre-filtered by the configured absorber–object pair-z range (z_min_pairs/z_max_pairs). The cut is still applied later inside compute_xi_forest_pairs_fast, but without a neighbour-level precheck the code may iterate over many more forest–object pairs than necessary when z-bins are narrow. Consider adding a cheap prefilter based on each forest’s z-range and the object redshift to reduce runtime/allocations.
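Such a prefilter could extend the existing boolean mask `w` over the object array; a minimal sketch with made-up redshifts, assuming the forest pixel redshifts are sorted (`z_forest[0]`/`z_forest[-1]` are its ends):

```python
import numpy as np

# Hypothetical forest spanning z in [2.1, 2.5] and three candidate objects.
z_forest = np.array([2.1, 2.5])
z_qso = np.array([2.0, 2.8, 3.6])
z_min_pairs, z_max_pairs = 2.4, 2.9

# An object can contribute a pair only if some pixel-object mean redshift
# (z_pixel + z_qso) / 2 falls in [z_min_pairs, z_max_pairs); the extremes
# are reached at the forest's first and last pixels.
w = ((z_forest[-1] + z_qso) / 2.0 >= z_min_pairs)
w &= ((z_forest[0] + z_qso) / 2.0 < z_max_pairs)
print(w)  # → [False  True  True]
```

As with the auto-correlation case, this only prunes objects that can never pass; the per-pair cut in compute_xi_forest_pairs_fast stays unchanged.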

Collaborator

@iprafols left a comment


Overall it looks fine; I have just a couple of comments. They affect several files, so I have repeated them in all the relevant places to make them easier to address.

'name': 'NT',
'value': cf.num_bins_r_trans,
'comment': 'Number of bins in angle'
}, {
Collaborator


In picca_cf.py, picca_dmat.py, picca_export.py, etc. this was replaced by ZMIN and ZMAX adding the minimum and maximum redshift of pairs. Shouldn't we do the same here for consistency?
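If the same keywords were added here, the cards would presumably follow the dict style already used in these headers. A sketch with placeholder values and assumed comment strings (the exact wording in picca_cf.py may differ):

```python
# Placeholder pair-z bounds; in the scripts these come from the parsed
# --z-min-pairs/--z-max-pairs arguments.
z_min_pairs, z_max_pairs = 0.0, 10.0

header_cards = [
    {'name': 'ZMIN', 'value': z_min_pairs, 'comment': 'Minimum redshift of pairs'},
    {'name': 'ZMAX', 'value': z_max_pairs, 'comment': 'Maximum redshift of pairs'},
]
print([card['name'] for card in header_cards])  # → ['ZMIN', 'ZMAX']
```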

'value': args.coef_binning_model,
'comment': 'Coefficient for model binning'
},
{
Collaborator


In picca_cf.py, picca_dmat.py, picca_export.py, etc. this was replaced by ZMIN and ZMAX adding the minimum and maximum redshift of pairs. Shouldn't we do the same here for consistency?

return dmat_data


def main(cmdargs=None):
Collaborator


Are we not deprecating this as well? Is this also part of the functionality added to Vega?

required=False,
help='Max redshift for object field')

parser.add_argument(
Collaborator


In picca_cf.py, picca_dmat.py, picca_export.py, etc. this was replaced by ZMIN and ZMAX adding the minimum and maximum redshift of pairs. Shouldn't we do the same here for consistency?

return dmat_data


def main(cmdargs=None):
Collaborator


Are we not deprecating this as well? Is this also part of the functionality added to Vega?

'value': cf.num_bins_r_trans,
'comment': 'Number of bins in r-transverse'
},
{
Collaborator


In picca_cf.py, picca_dmat.py, picca_export.py, etc. this was replaced by ZMIN and ZMAX adding the minimum and maximum redshift of pairs. Shouldn't we do the same here for consistency?

'name': 'NP',
'value': xcf.num_bins_r_par,
'comment': 'Number of bins in wavelength ratio'
}, {
Collaborator


In picca_cf.py, picca_dmat.py, picca_export.py, etc. this was replaced by ZMIN and ZMAX adding the minimum and maximum redshift of pairs. Shouldn't we do the same here for consistency?

'name': 'NT',
'value': xcf.num_bins_r_trans,
'comment': 'Number of bins in angle'
}, {
Collaborator


In picca_cf.py, picca_dmat.py, picca_export.py, etc. this was replaced by ZMIN and ZMAX adding the minimum and maximum redshift of pairs. Shouldn't we do the same here for consistency?

'value': xcf.num_bins_r_trans,
'comment': 'Number of bins in r-transverse'
},
{
Collaborator


In picca_cf.py, picca_dmat.py, picca_export.py, etc. this was replaced by ZMIN and ZMAX adding the minimum and maximum redshift of pairs. Shouldn't we do the same here for consistency?

Collaborator

@HiramHerrera left a comment


Overall, this looks good to me. Aside from the points already raised by Ignasi, my only additional comment is about the export script: as currently implemented, it would fail on previously generated files if the script is re-run on them.

Collaborator


These lines will cause the export code to fail for previously generated correlations. While this may not be common, it could still lead to issues.

For backward compatibility, I suggest guarding this block so it only executes when ZMIN and ZMAX are present in the headers.
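A minimal sketch of such a guard (the helper name is hypothetical; plain dicts stand in here for the header object, which a fitsio FITSHDR matches for `in` and item access):

```python
def read_pair_z_range(header):
    """Return (z_min, z_max) from a correlation file header, falling
    back to (None, None) when the keywords are absent, as in files
    produced before this PR."""
    if 'ZMIN' in header and 'ZMAX' in header:
        return header['ZMIN'], header['ZMAX']
    return None, None

old_header = {'NP': 50, 'NT': 50}                              # pre-PR file
new_header = {'NP': 50, 'NT': 50, 'ZMIN': 1.8, 'ZMAX': 3.8}    # post-PR file
print(read_pair_z_range(old_header))  # → (None, None)
print(read_pair_z_range(new_header))  # → (1.8, 3.8)
```

The export script could then only propagate ZMIN/ZMAX when both values are not None, leaving older files exportable unchanged.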
