Use narwhals to support Polars, cuDF, Modin, etc. #388
pixi.lock
@@ -368,7 +371,7 @@ environments:
- conda: https://conda.anaconda.org/conda-forge/linux-64/binutils_impl_linux-64-2.40-ha1999f0_7.conda
- conda: https://conda.anaconda.org/conda-forge/linux-64/binutils_linux-64-2.40-hb3c18ed_1.conda
- conda: https://conda.anaconda.org/conda-forge/linux-64/bzip2-1.0.8-h4bc722e_7.conda
- conda: https://conda.anaconda.org/conda-forge/linux-64/c-ares-1.33.1-heb4867d_0.conda
- conda: https://conda.anaconda.org/conda-forge/linux-64/c-ares-1.32.3-h4bc722e_0.conda
not sure why it's asking to downgrade here
I think this will raise an exception if neither pandas nor polars is installed (which is now a possibility): tabmat/src/tabmat/categorical_matrix.py, lines 249 to 250 in e288b47.
We could replace it with some numpy-based solution, which is probably not very efficient, but that's probably fine for now. E.g.,
else:
    categories, indices = np.unique(cat_vec.to_numpy(), return_inverse=True)
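For reference, here is a minimal sketch of what that numpy-only fallback would compute. The helper name `encode_categories` and the sample data are hypothetical and not part of tabmat; the only real API used is `np.unique(..., return_inverse=True)`.

```python
import numpy as np


def encode_categories(values):
    """Hypothetical helper: split a 1-D array into sorted unique categories
    and integer codes such that categories[codes] reproduces the input."""
    arr = np.asarray(values)
    # return_inverse=True also returns, for every element of arr,
    # the index of that element in the sorted array of unique values.
    categories, codes = np.unique(arr, return_inverse=True)
    return categories, codes


# Illustrative data only.
cat_vec = np.array(["b", "a", "c", "a", "b"])
categories, codes = encode_categories(cat_vec)

print(categories)  # ['a' 'b' 'c']
print(codes)       # [1 0 2 0 1]
```

One caveat with this approach: `np.unique` sorts the values, so the category order is lexicographic rather than order of appearance, and an object array mixing strings with `None` would likely fail during the sort, so missing values would need separate handling.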
Same here, but the solution might be less straightforward for the non-pandas-or-polars case: tabmat/src/tabmat/categorical_matrix.py, line 391 in e288b47.
Maybe we could say that the property is deprecated and is only there for backwards compatibility, so we are not implementing it for non-pandas (or polars) input?
I don't think we need this anymore: tabmat/src/tabmat/constructor.py, lines 21 to 24 in e288b47.
Looks great! All that remains is fixing these couple of issues in categorical_matrix.py.
It might also be nice to have a test for from_df with a dataframe that is neither polars nor pandas. Pyarrow is already a test dependency because of polars; would it be okay if I added a test for pyarrow dataframes?
I made the following updates to fix the issues above: Deconstructing the
The
@MarcAntoineSchmidtQC, if we don't want to add new functionality to the deprecated
Perhaps one final question: all three matrix types have an unpack method that returns the container storing the data. For
wow, amazing, seeing how far you got with Narwhals independently here really warms my heart 🤗
If you're happy with this, may I suggest using import narwhals.stable.v1 as nw instead of import narwhals as nw? This will future-proof you against breaking changes (if we need to make them); see https://narwhals-dev.github.io/narwhals/backcompat/ for an explanation.
Well done here, and please always feel free to ping us from Narwhals if you have any questions / requests / comments 🙏
Thanks so much for the review and the fantastic library @MarcoGorelli! 🙏 Using the v1 API is a great idea. The only thing I ran into is that there does not seem to be a
thanks! yup,
In [10]: import narwhals.stable.v1 as nw
In [11]: nw.from_native(pl.Series([1,2,3]), series_only=True).dtype.is_numeric()
Out[11]: True
In [12]: nw.from_native(pl.Series(['foo', 'bar']), series_only=True).dtype.is_numeric()
Out[12]: False
Checklist
CHANGELOG.rst entry