[BUG] Numpy proxy array fails with AttributeError: 'ndarray' object has no attribute '_fsproxy_wrapped' #17930

Open
Matt711 opened this issue Feb 6, 2025 · 1 comment
Labels
bug Something isn't working cudf.pandas Issues specific to cudf.pandas

Matt711 commented Feb 6, 2025

Describe the bug
Part of #17490. Several of the third-party integration tests fail with AttributeError: 'ndarray' object has no attribute '_fsproxy_wrapped'. I've checked that the ndarray in this case is a proxy array (i.e. cudf.pandas._wrappers.numpy.ndarray), not a true numpy array (i.e. np.ndarray). That makes me think this is a real numpy proxying bug, because proxy arrays get a _fsproxy_wrapped attribute immediately after instance creation in __array_finalize__.
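
For readers unfamiliar with the hook mentioned above, here is a minimal sketch of how an ndarray subclass can attach an attribute in `__array_finalize__`. The class `TaggedArray` and the attribute `_tag` are hypothetical stand-ins chosen for illustration. This is not the cudf.pandas proxy code, and the real proxy type may differ in its details.

```python
import numpy as np


class TaggedArray(np.ndarray):
    """Hypothetical ndarray subclass; NOT the cudf.pandas proxy type."""

    def __array_finalize__(self, obj):
        # NumPy calls this hook when an instance is created by view casting
        # or slicing (obj is the source array), so the attribute is attached
        # without going through __init__.
        self._tag = getattr(obj, "_tag", "untagged")


base = np.arange(4.0)
view = base.view(TaggedArray)  # view casting triggers the hook
view._tag = "custom"
sliced = view[1:3]             # slicing copies the attribute from `view`

print(type(view).__name__, hasattr(view, "_tag"))
print(sliced._tag)
print(hasattr(base, "_tag"))   # the plain ndarray never gets the attribute
```

The point of the sketch is only that an attribute set in this hook is expected to exist on every instance of the subclass, which is why its absence on a proxy array looks like a bug.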

The failing tests occur during the comparison step in the third-party integration tests CI job. I've checked that both the "gold" run (i.e. without cudf.pandas) and the "cudf" run (i.e. with cudf.pandas) pass. To illustrate what is happening, we can use one of the failing numpy integration tests.

def test_numpy_fft(sr):
    fft = np.fft.fft(sr)
    return fft

During the "gold" run, the result from test_numpy_fft is a real np.ndarray. During the "cudf" run, the result is a proxy array, cudf.pandas._wrappers.numpy.ndarray. The results are stored in a binary file for use during the "compare" run, and it is during that run that the test fails. The assertion function used to compare the "gold" and "cudf" results is np.testing.assert_allclose. So effectively,

np.testing.assert_allclose(gold_result, cudf_result)

produces the test failure we see:

AttributeError: 'ndarray' object has no attribute '_fsproxy_wrapped'
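
When debugging the compare step, it can help to print what each operand actually is before calling the assertion. The helper below is a hypothetical sketch (the name `describe_array` is not part of cudf or the test suite), and it is shown on a plain numpy result because the proxy type needs a cudf.pandas session.

```python
import numpy as np


def describe_array(obj, attr="_fsproxy_wrapped"):
    """Hypothetical helper: report facts that separate a proxy from a plain array."""
    cls = type(obj)
    return {
        # Fully qualified type name, e.g. to tell numpy.ndarray from a proxy
        "type": f"{cls.__module__}.{cls.__qualname__}",
        "is_exact_numpy_ndarray": cls is np.ndarray,
        # hasattr returns False when attribute lookup raises AttributeError
        "has_attr": hasattr(obj, attr),
    }


# Stand-in for the "gold" result of test_numpy_fft
gold_result = np.fft.fft(np.array([1.0, 2.0, 3.0, 4.0]))
info = describe_array(gold_result)
print(info)
```

Running the same helper on the deserialized "cudf" result should show whether the object is the proxy type and whether the attribute survived storage.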

Steps/Code to reproduce bug
I have not come up with a minimal reproducer yet, but we can reproduce the CI failures locally in a conda environment:

# env.yaml
channels:
- rapidsai-nightly
- rapidsai
- conda-forge
- nvidia
dependencies:
- cuda-version=12.8
- cudf==25.2.*,>=0.0.0a0
- numpy
- pandas
- pytest
- pytest-xdist
- python=3.12
name: test_numpy_cuda-128_arch-x86_64_py-312

./ci/cudf_pandas_scripts/third-party-integration/test.sh ./python/cudf/cudf_pandas_tests/third_party_integration_tests/dependencies.yaml

Matt711 added the bug and cudf.pandas labels Feb 6, 2025
GPUtester moved this from Todo to In Progress in cuDF Python Feb 6, 2025

Matt711 commented Feb 8, 2025

TODO: Try serializing/deserializing the wrapped array and comparing to a real numpy array.
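
A possible starting point for that experiment is sketched below. It assumes the results are serialized with `pickle`, which this issue does not confirm (it only says "binary file"), and the function name `roundtrip_and_compare` is hypothetical. The example feeds it a plain numpy array so that it runs anywhere; in a cudf.pandas session, `candidate` would be the proxy array instead.

```python
import pickle

import numpy as np


def roundtrip_and_compare(candidate, reference):
    """Pickle and unpickle `candidate`, then compare it to `reference`."""
    # Assumption: pickle stands in for whatever format the test suite uses.
    restored = pickle.loads(pickle.dumps(candidate))
    try:
        np.testing.assert_allclose(reference, restored)
    except AttributeError as exc:
        # The failure mode reported in this issue would surface here.
        return type(restored), f"AttributeError: {exc}"
    return type(restored), "ok"


reference = np.fft.fft(np.array([1.0, 2.0, 3.0, 4.0]))
candidate = reference.copy()  # stand-in for the proxy array
restored_type, outcome = roundtrip_and_compare(candidate, reference)
print(restored_type, outcome)
```

If the proxy array reproduces the AttributeError here, this would be a candidate for the minimal reproducer.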
