Nesting columns prior to crossmatching creates delayed error #551

@gitosaurus

Bug report

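# LSDB version 0.4.4
import lsdb

# Load the ZTF DR22 light-curve catalog together with its margin cache.
ztf_db = lsdb.read_hats(
    '/data3/epyc/data3/hats/catalogs/ztf_dr22/ztf_lc',
    margin_cache='/data3/epyc/data3/hats/catalogs/ztf_dr22/ztf_lc_10arcs')

# Pack the per-observation list columns into a single nested column, 'lc'.
nest_lc = ['hmjd', 'mag', 'magerr']
ztf_db = ztf_db.nest_lists(
    base_columns=[c for c in ztf_db.columns if c not in nest_lc],
    list_columns=nest_lc,
    name='lc')

# Load Gaia DR3 together with its margin cache.
gaia3 = lsdb.read_hats(
    'https://data.lsdb.io/hats/gaia_dr3/gaia',
    margin_cache='https://data.lsdb.io/hats/gaia_dr3/gaia_10arcs')

# Building the crossmatch is lazy; the error only surfaces at compute().
cm = gaia3.crossmatch(ztf_db)
cmc = cm.compute()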

The above yields the following error:

File ~/.conda/envs/dtj1s-py3.12/lib/python3.12/site-packages/pyarrow/array.pxi:4079, in pyarrow.lib.StructArray.from_arrays()

TypeError: Expected Array, got <class 'pyarrow.lib.ChunkedArray'>

If ztf_db is not nested prior to the crossmatch, the crossmatch succeeds.

Before submitting
Please check the following:

  • I have described the situation in which the bug arose, including what code was executed, information about my environment, and any applicable data others will need to reproduce the problem.
  • I have included available evidence of the unexpected behavior (including error messages, screenshots, and/or plots) as well as a description of what I expected instead.
  • If I have a solution in mind, I have provided an explanation and/or pseudocode and/or task list.

Metadata

Labels: bug (Something isn't working)
Status: Done