Description
Nested-Pandas v0.6 is currently under review (lincc-frameworks/nested-pandas#363). This issue tracks the work needed for LSDB to work with Nested-Pandas v0.6 once it is released.
Tied to #1031
Work Needed - Big Picture
- A number of functions in the `NestAccessor` have been renamed, with the old names still available but deprecated. They will be fully removed in v0.7. My guess is that this doesn't affect LSDB much (if at all), depending on how directly LSDB has used these functions, but any occurrences should be replaced with the new names to avoid deprecation warnings. I will look through and try to populate a list of occurrences.
- `reduce` is being deprecated (but still available, as above) in favor of `map_rows`. This change involves behavior changes in addition to the rename. We should follow suit in LSDB: deprecate `reduce`, implement `map_rows`, and update the behavior shown in the tutorials accordingly.
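A deprecation path like the one described for `reduce` is often implemented as a thin wrapper that warns and then forwards to the new name. The following is a minimal sketch under stated assumptions: `ToyFrame` is a hypothetical stand-in class, not LSDB's or Nested-Pandas' actual API, and it forwards arguments unchanged, which would not be sufficient wherever `map_rows` behaves differently from `reduce`.

```python
import warnings


class ToyFrame:
    """Hypothetical stand-in for a frame-like class; not LSDB's real API."""

    def __init__(self, rows):
        self.rows = list(rows)

    def map_rows(self, func):
        # New name: apply `func` to each row and collect the results.
        return [func(row) for row in self.rows]

    def reduce(self, func):
        # Old name: emit a DeprecationWarning, then forward to the new method.
        warnings.warn(
            "`reduce` is deprecated; use `map_rows` instead.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not this wrapper
        )
        return self.map_rows(func)
```

Using `stacklevel=2` makes the warning report the line in user code that called `reduce`, which tends to make the remaining call sites easier to find.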
Renaming todos
- add_nested -> join_nested:
  - Line 469 in b0f3915: `def add_nested(self, nested, name, how="outer") -> NestedFrame: # type: ignore[name-defined] # noqa: F821`
  - https://github.com/astronomy-commons/lsdb/blob/main/src/lsdb/nested/core.py#L71
  - lsdb/src/lsdb/dask/merge_catalog_functions.py, line 624 in b0f3915: `meta_df = npd.NestedFrame(meta_df).add_nested(nested_catalog_meta, nested_column_name)`
  - `"nf_nested = nf_base.add_nested(nf_flat, name=\"packed\")\n",`
  - `out = left_join_part.add_nested(right_join_part, nested_column_name)`
  - lsdb/tests/lsdb/nested/test_nestedframe.py, line 106 in b0f3915: `def test_add_nested(test_dataset_no_add_nested):`
  - Line 547 in b0f3915: `def test_dataset_no_add_nested():`
  - lsdb/tests/lsdb/nested/test_io.py, line 34 in b0f3915: `base = base.add_nested(nested, "nested")`
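While both names exist, call sites like the ones listed above could pick whichever method is available, which may help if LSDB needs to support Nested-Pandas versions on both sides of the rename. This is only a sketch: `call_join_nested` is a hypothetical helper, and it assumes `join_nested` accepts the same `(nested, name, **kwargs)` arguments as `add_nested`, which this issue does not confirm.

```python
def call_join_nested(frame, nested, name, **kwargs):
    """Call `join_nested` when the frame has it, else fall back to `add_nested`.

    Hypothetical helper; assumes both methods accept (nested, name, **kwargs).
    """
    method = getattr(frame, "join_nested", None)
    if method is None:
        # Older versions only expose the deprecated name.
        method = frame.add_nested
    return method(nested, name, **kwargs)
```

If a minimum Nested-Pandas version of v0.6 is pinned instead, a plain rename at each call site is simpler and this helper is unnecessary.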
- fields -> columns (or column_dtypes if off the NestedDtype):
  - https://github.com/astronomy-commons/lsdb/blob/main/src/lsdb/nested/accessor.py#L36
  - `"nf.lightcurve.nest.fields"`
  - "We can't use `nepochs` for anymore, since it included observations for which `catflags != 0`. What we'll do instead is use a function from `nested_pandas` called `count_nested`, which is able to count all of the points per nested field. This function adds a column to its input, named `n_{column}`, or `n_lc` in this case, since we're counting the length of the fields within `lc`.\n",
  - lsdb/tests/lsdb/nested/test_accessor.py, line 26 in 26bb43e: `def test_fields(test_dataset):`
  - lsdb/tests/lsdb/catalog/test_nested.py, line 32 in 26bb43e: `assert cat_ndf_renested._ddf["nested"].nest.fields == cat_ndf["sources"].nest.fields`
  - lsdb/tests/lsdb/nested/test_nestedframe.py, line 146 in 26bb43e: `assert list(ndf["nested"].nest.fields) == list(ndf["nested"].nest.fields)`
  - lsdb/tests/lsdb/nested/test_nestedframe.py, line 173 in 26bb43e: `assert list(ndf["nested"].nest.fields) == list(ndf["nested"].nest.fields)`
  - Line 471 in 26bb43e: `assert (col_name, dtype.pyarrow_dtype) in joined_cat[nested_colname].dtypes.fields.items()`
- to_flat/to_lists "fields" arg -> "columns"
  - lsdb/src/lsdb/nested/accessor.py, line 56 in 26bb43e: `def to_flat(self, fields: list[str] | None = None) -> dd.DataFrame:`
  - lsdb/src/lsdb/nested/accessor.py, line 41 in 26bb43e: `def to_lists(self, fields: list[str] | None = None) -> dd.DataFrame:`
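To check that the lists above are complete, a small script could scan the repository for the deprecated spellings. The sketch below is a rough heuristic, not a definitive audit: `find_deprecated_uses` and the pattern labels are hypothetical names, the regular expressions will produce some false positives (for example, unrelated attributes named `fields`), and notebooks are searched as raw JSON text.

```python
import re
from pathlib import Path

# Deprecated spellings taken from the lists above; the patterns are heuristics.
DEPRECATED_PATTERNS = {
    "add_nested": re.compile(r"add_nested"),
    ".fields": re.compile(r"\.fields\b"),
    "fields argument": re.compile(r"\bfields\s*(?::|=(?!=))"),
}


def find_deprecated_uses(root, suffixes=(".py", ".ipynb")):
    """Return sorted (path, line_number, label) tuples for matches under `root`."""
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for label, pattern in DEPRECATED_PATTERNS.items():
                if pattern.search(line):
                    hits.append((str(path), lineno, label))
    return sorted(hits)
```

Calling `find_deprecated_uses("src")` and `find_deprecated_uses("tests")` from the repository root would give a starting list to compare against the permalinks collected here.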