Open
Changes from 14 commits (22 commits total)
3bbeea2
TST: Add test for writing UUIDs to parquet with pyarrow #61602
GiTaDi-CrEaTe May 15, 2026
88d4e28
TST: Fix read_parquet namespace usage
GiTaDi-CrEaTe May 15, 2026
9d98157
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 15, 2026
5f4d739
TST: Address reviewer feedback for UUID pyarrow test
GiTaDi-CrEaTe May 15, 2026
b2f9c8b
Merge branch 'main' into tests/io-parquet-uuid-61602
GiTaDi-CrEaTe May 15, 2026
72dc35a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 15, 2026
c53e34c
TST: Implement reviewer feedback and fix formatting
GiTaDi-CrEaTe May 15, 2026
6b7b57f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 15, 2026
1f36ae7
Merge branch 'main' into tests/io-parquet-uuid-61602
GiTaDi-CrEaTe May 16, 2026
e699965
TST: Handle raw bytes fallback for UUIDs on PyArrow nightly/py314
GiTaDi-CrEaTe May 16, 2026
f4f3e4e
Merge branch 'main' into tests/io-parquet-uuid-61602
GiTaDi-CrEaTe May 16, 2026
91f481c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 16, 2026
6332052
TST: Fix line length in comment to pass ruff
GiTaDi-CrEaTe May 16, 2026
8925886
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 16, 2026
0139402
TST: Use temp_file fixture instead of tmp_path
GiTaDi-CrEaTe May 17, 2026
9053a75
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 17, 2026
bdd3ae3
TST: Replace UUID byte fallback with pytest.xfail for upstream bug
GiTaDi-CrEaTe May 17, 2026
b244666
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 17, 2026
764647b
TST: Refactor dynamic xfail to use request.node.add_marker
GiTaDi-CrEaTe May 17, 2026
9fe4745
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 17, 2026
07ccaef
TST: Shorten xfail reason string to pass ruff line length
GiTaDi-CrEaTe May 17, 2026
8733d2c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 17, 2026
24 changes: 24 additions & 0 deletions pandas/tests/io/test_parquet.py
@@ -5,6 +5,7 @@
from io import BytesIO
import os
import pathlib
import uuid

import numpy as np
import pytest
@@ -20,6 +21,7 @@
    pa_version_under20p0,
)
from pandas.errors import Pandas4Warning
import pandas.util._test_decorators as td

import pandas as pd
import pandas._testing as tm
@@ -1521,3 +1523,25 @@ def test_invalid_dtype_backend(self, engine, temp_file):
        df.to_parquet(temp_file)
        with pytest.raises(ValueError, match=msg):
            read_parquet(temp_file, dtype_backend="numpy")


@td.skip_if_no("pyarrow", min_version="24.0")
def test_to_parquet_uuid_supported(tmp_path):
Member

Can you use temp_file instead?

Author

@mroeschke Done! Swapped tmp_path for the temp_file fixture. Thanks for the review!!
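
For readers unfamiliar with the fixture being requested, the following is a rough sketch of what a `temp_file`-style fixture could look like. It is an illustrative stand-in built on pytest's built-in `tmp_path`, not the fixture pandas actually ships; the helper name `make_unique_file` and the example test are hypothetical.

```python
import pathlib
import uuid

import pytest


def make_unique_file(directory):
    # Hypothetical helper: create an empty, uniquely named file in `directory`
    # and return its path.
    path = pathlib.Path(directory) / str(uuid.uuid4())
    path.touch()
    return path


@pytest.fixture
def temp_file(tmp_path):
    # Illustrative stand-in, not pandas' own fixture: it builds on pytest's
    # tmp_path and hands the test a ready-to-use file path.
    return make_unique_file(tmp_path)


def test_write_then_read(temp_file):
    # The test receives a path directly, so it does not need to compose
    # a file name such as tmp_path / "test_uuid.parquet" itself.
    temp_file.write_bytes(b"payload")
    assert temp_file.read_bytes() == b"payload"
```

With a fixture of this shape, a test can pass the path straight to a writer and a reader without building the file name by hand.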

    # GH 61602
    df = pd.DataFrame({"id": [uuid.uuid4(), uuid.uuid4()]})
    path = tmp_path / "test_uuid.parquet"

    # This should not raise an error
    df.to_parquet(path, engine="pyarrow")

    # Verify it can be read back
    result = read_parquet(path, engine="pyarrow")

    # PyArrow nightly / Python 3.14 currently returns raw bytes instead
    # of UUID objects due to an upstream object-casting quirk.
    # We handle the raw byte fallback gracefully to ensure the
    # underlying 16-byte data integrity is preserved.
    if len(result) > 0 and isinstance(result.loc[0, "id"], bytes):
        result["id"] = result["id"].apply(lambda x: uuid.UUID(bytes=x))
Member

rhshadrach May 17, 2026

This makes it look like the issue is resolved. If we need to leave it unresolved in this case, then xfail the test instead. We also need to report this upstream.
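
A minimal sketch of the xfail approach suggested here, using `request.node.add_marker` as named in the later commit titles. To stay self-contained it does not touch parquet at all: `fake_round_trip` is a hypothetical stand-in that mimics the reported behavior (UUIDs coming back as 16 raw bytes), and `has_raw_bytes` and the reason string are likewise assumptions for illustration, not the code that was merged.

```python
import uuid

import pytest


def has_raw_bytes(values):
    """Return True if any value is raw bytes rather than a uuid.UUID."""
    return any(isinstance(v, bytes) for v in values)


def fake_round_trip(values):
    # Hypothetical stand-in for the to_parquet/read_parquet round trip that
    # mimics the reported behavior: each UUID comes back as its 16 raw bytes.
    return [v.bytes for v in values]


def test_uuid_round_trip(request):
    expected = [uuid.uuid4(), uuid.uuid4()]
    result = fake_round_trip(expected)

    if has_raw_bytes(result):
        # Mark the test as an expected failure instead of converting the
        # bytes back, so the mismatch stays visible in the test report.
        request.node.add_marker(
            pytest.mark.xfail(reason="UUIDs returned as raw bytes (upstream)")
        )

    # No conversion happens above, so this still fails on raw bytes and is
    # reported as xfailed rather than passed.
    assert result == expected
```

The difference from the byte-conversion fallback is that the comparison is left untouched: the test only passes when real `uuid.UUID` objects come back.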


    tm.assert_frame_equal(result, df)