Skip to content

[PY] Support pickling ffi.String without cached FFI object#642

Closed
Ubospica wants to merge 2 commits into
apache:mainfrom
Ubospica:fix-ffi-string-pickle
Closed

[PY] Support pickling ffi.String without cached FFI object#642
Ubospica wants to merge 2 commits into
apache:mainfrom
Ubospica:fix-ffi-string-pickle

Conversation

@Ubospica

Copy link
Copy Markdown
Contributor

Summary

  • Add a custom pickle reduction for tvm_ffi.core.String so only the Python string payload is serialized.
  • Drop the hidden _tvm_ffi_cached_object on unpickle, since String is effectively str with optional FFI backing.
  • Add a regression test for long FFI strings that carry a cached backing object.

Pickled FFI strings should only persist their Python string payload, avoiding serialization of the hidden cached FFI backing object.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements __reduce_ex__ for the String class in Cython to prevent the serialization of the cached FFI object during pickling, and adds corresponding unit tests to verify this behavior. The reviewer suggests extending this fix to the Bytes class, which shares the same caching behavior, by implementing __reduce_ex__ and adding a corresponding test case.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread python/tvm_ffi/cython/string.pxi
Comment thread tests/python/test_string.py Outdated
Extend the pickle behavior to FFI Bytes so cached backing objects are not serialized, matching String.
@Ubospica

Copy link
Copy Markdown
Contributor Author

Addressed the Gemini comments in db208dc:

  • Added Bytes.__reduce_ex__ so pickling tvm_ffi.core.Bytes only persists the Python bytes payload and drops _tvm_ffi_cached_object on unpickle.
  • Added a regression test for cached FFI Bytes returned by testing.echo.
  • Kept Bytes.__slots__ unchanged because Python raises TypeError: nonempty __slots__ not supported for subtype of 'bytes'; the reduction hook is sufficient to avoid serializing the cached FFI object.

Local validation:

  • uv pip install --reinstall --verbose . && .venv/bin/pytest -vvs tests/python/test_string.py
  • commit hooks passed after fixing the local pre-commit cache dependency versions for mypy (numpy<2.0, pytest<8) to match the repo's Python 3.9 mypy target.

@Ubospica

Copy link
Copy Markdown
Contributor Author

Superseded by #643, which is based on the latest and includes the Gemini Bytes review fixes without the dirty merge state.

@Ubospica

Copy link
Copy Markdown
Contributor Author

Closing in favor of #643.

@Ubospica Ubospica closed this Jun 23, 2026
tqchen pushed a commit that referenced this pull request Jun 23, 2026
## Summary
- Add pickle reductions for `tvm_ffi.core.String` and
`tvm_ffi.core.Bytes` so only the Python payload is serialized.
- Ensure unpickled values drop the hidden `_tvm_ffi_cached_object`
backing object.
- Add regression coverage for cached FFI String/Bytes returned from
`testing.echo`.

## Notes
- This is a clean branch based on the latest `apache/main` and
supersedes #642, which became dirty after main advanced.
- `Bytes.__slots__` is intentionally unchanged because Python does not
support non-empty `__slots__` on `bytes` subclasses.

## Test plan
- `uv pip install --reinstall --verbose . && .venv/bin/pytest -vvs
tests/python/test_string.py`
-
`PYTHONPATH=/raid/user_data/yixind/miniforge3/lib/python3.12/site-packages
pre-commit run --files python/tvm_ffi/cython/string.pxi
tests/python/test_string.py`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant