Update PyArrow conversion and arrow/parquet tests for pyarrow 19.0 #60716

Merged on Jan 22, 2025 (10 commits)
Adjust test_string_inference for using_infer_string
mroeschke committed Jan 21, 2025

mroeschke Matthew Roeschke
commit fc066f41c71da512430bc305ad21710850514628
11 changes: 8 additions & 3 deletions pandas/tests/io/test_parquet.py
@@ -1110,19 +1110,24 @@ def test_df_attrs_persistence(self, tmp_path, pa):
         new_df = read_parquet(path, engine=pa)
         assert new_df.attrs == df.attrs

-    def test_string_inference(self, tmp_path, pa):
+    def test_string_inference(self, tmp_path, pa, using_infer_string):
         # GH#54431
         path = tmp_path / "test_string_inference.p"
         df = pd.DataFrame(data={"a": ["x", "y"]}, index=["a", "b"])
-        df.to_parquet(path, engine="pyarrow")
+        df.to_parquet(path, engine=pa)
         with pd.option_context("future.infer_string", True):
             result = read_parquet(path, engine="pyarrow")
         dtype = pd.StringDtype(na_value=np.nan)
         expected = pd.DataFrame(
             data={"a": ["x", "y"]},
             dtype=dtype,
             index=pd.Index(["a", "b"], dtype=dtype),
-            columns=pd.Index(["a"], dtype=object if pa_version_under19p0 else dtype),
+            columns=pd.Index(
+                ["a"],
+                dtype=object
+                if pa_version_under19p0 and not using_infer_string
+                else dtype,
+            ),
         )
         tm.assert_frame_equal(result, expected)
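
The new `dtype` condition for the expected columns index can be read as a small truth table. The sketch below only mirrors the boolean expression in the diff. The helper name `expected_columns_dtype` is hypothetical, and plain string labels stand in for the real dtypes so the snippet runs without pandas or pyarrow installed.

```python
from itertools import product


def expected_columns_dtype(pa_version_under19p0: bool, using_infer_string: bool) -> str:
    """Hypothetical helper mirroring the conditional in test_string_inference.

    Returns a label, not a real dtype: "object" stands for the Python
    ``object`` dtype, and "StringDtype(na_value=nan)" stands for the
    ``pd.StringDtype(na_value=np.nan)`` value bound to ``dtype`` in the test.
    """
    if pa_version_under19p0 and not using_infer_string:
        return "object"
    return "StringDtype(na_value=nan)"


if __name__ == "__main__":
    # Enumerate all four flag combinations to show which one still expects object.
    for under19, infer in product([True, False], repeat=2):
        label = expected_columns_dtype(under19, infer)
        print(f"pa_version_under19p0={under19!s:5} using_infer_string={infer!s:5} -> {label}")
```

Only one of the four combinations (`pa_version_under19p0` true and `using_infer_string` false) still expects `object` for the columns index. Every other combination expects the NaN-backed string dtype.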