Skip to content

Add single precision hints to mapping and splitting stages#669

Open
olivialynn wants to merge 8 commits intomainfrom
issue/653/single-precision-hints
Open

Add single precision hints to mapping and splitting stages#669
olivialynn wants to merge 8 commits intomainfrom
issue/653/single-precision-hints

Conversation

@olivialynn
Copy link
Member

@olivialynn olivialynn commented Feb 13, 2026

Change Description

Closes #653

Solution Description

Code Quality

  • I have read the Contribution Guide and LINCC Frameworks Code of Conduct
  • My code follows the code style of this project
  • My code builds (or compiles) cleanly without any errors or warnings
  • My code contains relevant comments and necessary documentation

@codecov
Copy link

codecov bot commented Feb 13, 2026

Codecov Report

❌ Patch coverage is 96.42857% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 93.78%. Comparing base (4128348) to head (9a1ddf7).

Files with missing lines Patch % Lines
src/hats_import/catalog/map_reduce.py 96.42% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #669      +/-   ##
==========================================
+ Coverage   93.72%   93.78%   +0.06%     
==========================================
  Files          33       33              
  Lines        2039     2060      +21     
==========================================
+ Hits         1911     1932      +21     
  Misses        128      128              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds warning functionality to detect when single-precision (float32) coordinates are used for ra/dec columns during the mapping and splitting stages of catalog import. This addresses issue #653, which requested better hints when coordinate precision issues lead to incorrect pixel assignments.

Changes:

  • Added _warn_if_not_double_precision_columns function to check coordinate column precision for both pandas and pyarrow data types
  • Refactored pixel mapping logic into _map_chunk_to_pixels helper function
  • Integrated precision checking into _iterate_input_file to warn on first data chunk
  • Added unit tests for warning behavior in both map and split stages

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
src/hats_import/catalog/map_reduce.py Added precision checking functions, refactored mapping logic, and integrated warnings into the iteration pipeline
src/hats_import/catalog/file_readers/input_reader.py Updated docstring to clarify that read method can return both pandas DataFrames and pyarrow Tables
tests/hats_import/catalog/test_map_reduce.py Added tests for single precision warnings in both mapping and splitting stages

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +113 to +119
first_iteration = False
for chunk_number, data in enumerate(file_reader.read(input_file, read_columns=read_columns)):
if use_healpix_29:
if isinstance(data, pd.DataFrame) and data.index.name == SPATIAL_INDEX_COLUMN:
mapped_pixels = spatial_index_to_healpix(data.index, target_order=highest_order)
else:
mapped_pixels = spatial_index_to_healpix(
data[SPATIAL_INDEX_COLUMN], target_order=highest_order
)
else:
# Set up the pixel data
if isinstance(data, pd.DataFrame):
mapped_pixels = hp.radec2pix(
highest_order,
data[ra_column].to_numpy(copy=False, dtype=float),
data[dec_column].to_numpy(copy=False, dtype=float),
)
else:
mapped_pixels = hp.radec2pix(
highest_order,
data[ra_column].to_numpy(),
data[dec_column].to_numpy(),
)
if not first_iteration and not use_healpix_29:
# Only check for single-precision warning on the first chunk to
# avoid redundant warnings in large files.
_warn_if_not_double_precision_columns(data, [ra_column, dec_column])
first_iteration = True
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable first_iteration is initialized to False on line 113, but the condition on line 115 checks if not first_iteration. This means the condition will be True only on iterations after the first one (when first_iteration becomes True), which is the opposite of the intended behavior. The warning will never be raised on the first chunk as intended by the comment.

To fix this, either:

  1. Initialize first_iteration = True and keep the condition as if not first_iteration and not use_healpix_29:
  2. Or, change the condition to if first_iteration and not use_healpix_29: and set it to False after the check

Copilot uses AI. Check for mistakes.
Copy link
Member Author

@olivialynn olivialynn Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. Almost, but this does bring my attention to the fact that I renamed this loop guard from the ai-suggested warned variable name, and now the current truth values don't really reflect the new name.

I'm switching this to start as True, then get switch to False. Actual functionality will remain unchanged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Give better hints when order-1 pixels are produced by the splitting stage

2 participants