gh-132108: Add Buffer Protocol support to int.from_bytes to improve performance #132109
Speed up conversion from `bytes-like` objects like `bytearray` while keeping conversion from `bytes` stable.

On a `--with-lto --enable-optimizations` build on my 64 bit Linux box:

new:

- from_bytes_flags: Mean +- std dev: 28.6 ns +- 0.5 ns
- bench_convert[bytes]: Mean +- std dev: 50.4 ns +- 1.4 ns
- bench_convert[bytearray]: Mean +- std dev: 51.3 ns +- 0.7 ns

old:

- from_bytes_flags: Mean +- std dev: 28.1 ns +- 1.1 ns
- bench_convert[bytes]: Mean +- std dev: 50.3 ns +- 4.3 ns
- bench_convert[bytearray]: Mean +- std dev: 64.7 ns +- 0.9 ns

Benchmark code:

```python
import pyperf
import time


def from_bytes_flags(loops):
    range_it = range(loops)

    t0 = time.perf_counter()
    for _ in range_it:
        int.from_bytes(b'\x00\x10', byteorder='big')
        int.from_bytes(b'\x00\x10', byteorder='little')
        int.from_bytes(b'\xfc\x00', byteorder='big', signed=True)
        int.from_bytes(b'\xfc\x00', byteorder='big', signed=False)
        int.from_bytes([255, 0, 0], byteorder='big')
    return time.perf_counter() - t0


sample_bytes = [
    b'',
    b'\x00',
    b'\x01',
    b'\x7f',
    b'\x80',
    b'\xff',
    b'\x01\x00',
    b'\x7f\xff',
    b'\x80\x00',
    b'\xff\xff',
    b'\x01\x00\x00',
]

sample_bytearray = [bytearray(v) for v in sample_bytes]


def bench_convert(loops, values):
    range_it = range(loops)

    t0 = time.perf_counter()
    for _ in range_it:
        for val in values:
            int.from_bytes(val)
    return time.perf_counter() - t0


runner = pyperf.Runner()
runner.bench_time_func('from_bytes_flags', from_bytes_flags, inner_loops=10)
runner.bench_time_func('bench_convert[bytes]', bench_convert, sample_bytes, inner_loops=10)
runner.bench_time_func('bench_convert[bytearray]', bench_convert, sample_bytearray, inner_loops=10)
```
Can we have benchmarks for very large bytes? Maybe you can also say how much we're gaining in the NEWS entry that way.
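A rough sketch of what such a large-input comparison could look like, using only the standard-library `timeit` rather than `pyperf`. The helper names (`time_from_bytes`, `compare_large`) and the payload sizes are illustrative assumptions, not part of the PR, and the timings it prints are only indicative:

```python
import timeit


def time_from_bytes(value, number=50, repeat=3):
    """Best observed seconds per call of int.from_bytes(value, 'big')."""
    timer = timeit.Timer(lambda: int.from_bytes(value, "big"))
    return min(timer.repeat(repeat=repeat, number=number)) / number


def compare_large(size):
    """Time the same payload passed as bytes and as bytearray."""
    payload = b"\xff" * size
    return {
        "bytes": time_from_bytes(payload),
        "bytearray": time_from_bytes(bytearray(payload)),
    }


if __name__ == "__main__":
    # Sizes are arbitrary choices for illustration, not taken from the PR.
    for size in (1024, 64 * 1024):
        result = compare_large(size)
        print(size, {k: f"{v * 1e9:.0f} ns" for k, v in result.items()})
```

For numbers meant to be quoted in a NEWS entry, `pyperf` as used in the PR description is likely the better tool, since it controls for process-level noise.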
Small question but how do we cope with classes that explicitly define Note that Instead, we should restrict ourselves to exact buffer objects, namely exact bytes and bytearray objects.
I want to check that the edge cases are not an issue.
Cases including classes which implement As you point out, if code returns a different set of machine bytes when exporting buffer protocol vs Could match existing behavior by always checking for a Could restrict to known CPython types ( Walking through common types passed to
This is a breaking change. Example.
Before:
>>> class X(bytes):
...     def __bytes__(self):
...         return b'X'
...
>>> int.from_bytes(X(b'a'))
88
After:
>>> class X(bytes):
...     def __bytes__(self):
...         return b'X'
...
>>> int.from_bytes(X(b'a'))
97
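To check which of the two behaviours a given interpreter build has, a small probe along these lines may help. It is only a sketch: `which_path` is a hypothetical helper, and it compares the result of `int.from_bytes` against what `bytes(obj)` (which honours `__bytes__`) and `memoryview(obj)` (which uses the buffer protocol) each produce:

```python
class X(bytes):
    def __bytes__(self):
        return b"X"


def which_path(obj):
    """Guess whether int.from_bytes read obj via __bytes__ or via its buffer."""
    got = int.from_bytes(obj, "big")
    # bytes(obj) calls obj.__bytes__() when the type defines it.
    via_dunder = int.from_bytes(bytes(obj), "big")
    # memoryview(obj) goes through the buffer protocol instead.
    via_buffer = int.from_bytes(memoryview(obj).tobytes(), "big")

    if via_dunder == via_buffer:
        return "indistinguishable"
    if got == via_dunder:
        return "__bytes__"
    if got == via_buffer:
        return "buffer"
    return "unknown"


print(which_path(X(b"a")))   # "__bytes__" or "buffer", depending on the build
print(which_path(b"a"))      # indistinguishable
```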
Speed up :meth:`int.from_bytes` when passed a bytes-like object such as a
:class:`bytearray`.
Suggested change:
- Speed up :meth:`int.from_bytes` when passed a bytes-like object such as a
- :class:`bytearray`.
+ Speed up :meth:`int.from_bytes` when passed a bytes-like object such as
+ :class:`bytes` and :class:`bytearray`.
I do not think that it affects `bytes`.
Docs says:
If
>>> class int2(int):
...     def __float__(self):
...         return 3.14
...
>>> float(int2(123))
3.14
This is an example that the method resolution order changes. It now ignores custom The reverse logic is true: PR's author must prove that it does not break things.
Misc/NEWS.d/next/Core_and_Builtins/2025-04-04-20-38-29.gh-issue-132108.UwZIQy.rst
I would say that if you expose something different via buffer protocol and
I'll see if I can make a largely performance neutral version that checks
>>> class distinct_bytes_buffer(bytes):
...     def __bytes__(self):
...         return b'b'
...
...     def __buffer__(self, flags):
...         return memoryview(b'c')
...
>>> class same_bytes_buffer(bytes):
...     def __bytes__(self):
...         return b'b'
...
...     def __buffer__(self, flags):
...         return memoryview(b'b')
...
>>> int.from_bytes(distinct_bytes_buffer(b'a'))
99
>>> int.from_bytes(same_bytes_buffer(b'a'))
98
>>> int.from_bytes(b'a')
97
>>> int.from_bytes(b'b')
98
>>> int.from_bytes(b'c')
99
Another edge case around these,
>>> class my_bytes(bytes):
...     def __bytes__(self):
...         return b"bytes"
...
...     def __buffer__(self, flags):
...         return memoryview(b"buffer")
...
>>> class distinct_bytes_buffer(bytes):
...     def __bytes__(self):
...         return my_bytes(b"ob_sval")
...
...     def __buffer__(self, flags):
...         return memoryview(b"distinct_buffer")
...
>>> a = distinct_bytes_buffer(b"distinct_ob_sval")
>>> bytes(a)
b'ob_sval'
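A reduced sketch of the same edge case, without the `__buffer__` overrides, for inspecting what `bytes()` hands back when `__bytes__` returns a subclass instance. The class names `MyBytes` and `Outer` are illustrative stand-ins, and copying through `memoryview` is just one assumed way to obtain an exact `bytes` object; it is not necessarily what the PR does:

```python
class MyBytes(bytes):
    def __bytes__(self):
        return b"bytes"


class Outer(bytes):
    def __bytes__(self):
        # Returns an instance of a bytes *subclass*, not an exact bytes object.
        return MyBytes(b"ob_sval")


result = bytes(Outer(b"outer_storage"))
# The value compares equal to b"ob_sval"; the printed type shows whether
# the subclass instance was passed through unchanged.
print(result, type(result).__name__)

# Copying through the buffer protocol yields an exact bytes object
# regardless of the type that __bytes__ returned.
exact = memoryview(result).tobytes()
print(exact, type(exact).__name__)
```

The point of interest is that code consuming the result of `__bytes__` cannot assume it is an exact `bytes` object, so any fast path keyed on exact types has to re-check after the call.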
LGTM.
Co-authored-by: Sergey B Kirpichev <[email protected]>
Created a branch which matches the resolution order of PyObject_Bytes:
So @sobolevn's example now returns the same value both before and after:
>>> class X(bytes):
...     def __bytes__(self):
...         return b'X'
...
>>> int.from_bytes(X(b'a'))
88

Should I incorporate it here? (cc: @serhiy-storchaka, @sobolevn, @skirpichev)

full diff from main: https://github.com/python/cpython/compare/main...cmaloney:cpython:exp/bytes_first?collapse=1
diff from PR: cmaloney@189f219
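For readers unfamiliar with that order, here is a pure-Python approximation of it, written from my understanding of `PyObject_Bytes` (exact `bytes` first, then `__bytes__`, then the buffer protocol or an iterable of ints). The function name `resolve_like_pyobject_bytes` is hypothetical, and this is a sketch of the ordering only, not the code in the linked branch:

```python
def resolve_like_pyobject_bytes(obj):
    """Approximate the lookup order: exact bytes, __bytes__, then fallback."""
    # 1. Exact bytes objects are returned unchanged.
    if type(obj) is bytes:
        return obj

    # 2. A __bytes__ method defined on the type takes priority.
    method = getattr(type(obj), "__bytes__", None)
    if method is not None:
        result = method(obj)
        if not isinstance(result, bytes):
            raise TypeError("__bytes__ returned non-bytes")
        return result

    # 3. Fallback: buffer protocol first, then an iterable of ints.
    try:
        return memoryview(obj).tobytes()
    except TypeError:
        return bytes(list(obj))


class X(bytes):
    def __bytes__(self):
        return b"X"


print(int.from_bytes(resolve_like_pyobject_bytes(X(b"a")), "big"))  # 88
```

With this ordering a subclass's `__bytes__` is consulted before its buffer, which is why the example above keeps returning 88.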
Also, the docs say: "The argument bytes must either be a bytes-like object or an iterable producing bytes." Something is wrong: either the implementation (in main) or the docs.
It may be an iterable producing bytes (not bytes objects, but integers in the range 0 to 255).
Yes, that part of the sentence is at least unclear. But I meant the first part, which has a reference to the glossary term.
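To make the "iterable producing bytes" wording concrete, the following sketch shows what is and is not accepted. `try_from_bytes` is a hypothetical helper that just reports the exception type; `byteorder` is passed explicitly so the snippet also runs on versions where it has no default:

```python
def try_from_bytes(arg):
    """Return the converted int, or the name of the exception raised."""
    try:
        return int.from_bytes(arg, "big")
    except (TypeError, ValueError) as exc:
        return type(exc).__name__


# An iterable of ints in range(256) behaves like the equivalent bytes object.
print(try_from_bytes([255, 0, 0]))        # 16711680
print(try_from_bytes(b"\xff\x00\x00"))    # 16711680

# Ints outside range(256) are rejected.
print(try_from_bytes([256]))              # ValueError

# An iterable of bytes *objects* is not accepted.
print(try_from_bytes([b"a"]))             # TypeError
```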