[mypyc] feat: new primitive for `int.to_bytes` #19674

BobTheBuidler · 2025-08-16T15:51:42Z

This PR adds a new primitive that covers all argument combinations of `int.to_bytes`.


ilevkivskyi

Nice! I will keep this open for a day or two in case @JukkaL has some comments.

mypyc/lib-rt/int_ops.c

mypyc/lib-rt/CPy.h

mypyc/lib-rt/int_ops.c

mypyc/test-data/run-integers.test

ilevkivskyi

On the second look I think I have some more questions.

ilevkivskyi · 2025-08-21T09:56:17Z

mypyc/lib-rt/int_ops.c

+// int.to_bytes(length, byteorder, signed=False)
+PyObject *CPyTagged_ToBytes(CPyTagged self, Py_ssize_t length, PyObject *byteorder, int signed_flag) {
+    PyObject *pyint = CPyTagged_StealAsObject(self);
+    if (!PyLong_Check(pyint)) {


On second thought, all these type checks look unnecessary; normally the Python wrappers should do them. You can probably verify this by adding some run tests that pass `Any` values.

what, like this?

def f(x: Any) -> bytes:
    return int.to_bytes(x)

Based on @JukkaL's response to a similar question on #19673, I think we can safely remove this check, since CPyTagged_StealAsObject guarantees the type.

I think CPyTagged_StealAsObject is not correct there, since it transfers ownership of the parameter, which can cause a double free. CPyTagged_AsObject returns a new reference, which you can decref at the end of the function.

@BobTheBuidler

what, like this?

First, not just `self`. Second, I think you should try something more like this:

def to_bytes(n: int, length: int, byteorder: str = "little", signed: bool = False) -> bytes:
    return n.to_bytes(length, byteorder, signed=signed)

x: Any = "no"
bad: Any = "way"
to_bytes(x, bad)

and check that a TypeError will be given even before getting to your specialized code.
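A minimal sketch of how such a check could be written as a helper, assuming the `to_bytes` function from the snippet above. The `rejected_with` helper is a hypothetical name introduced here for illustration. Note the assumption baked into the comments: when the module is run uncompiled, annotations are not enforced, so the bad call fails later with `AttributeError` (a `str` has no `to_bytes` method); the `TypeError` described in the comment above is what the compiled wrapper is expected to raise.

```python
from typing import Any, Optional, Type


def to_bytes(n: int, length: int, byteorder: str = "little", signed: bool = False) -> bytes:
    return n.to_bytes(length, byteorder, signed=signed)


def rejected_with(*args: Any) -> Optional[Type[BaseException]]:
    """Return the exception type raised by to_bytes(*args), or None if it succeeds."""
    try:
        to_bytes(*args)
    except (TypeError, AttributeError) as exc:
        # Compiled: the wrapper is expected to reject the arguments with TypeError.
        # Interpreted: the call reaches "no".to_bytes and raises AttributeError.
        return type(exc)
    return None


x: Any = "no"
bad: Any = "way"
print(rejected_with(x, bad))
```

In a run test, the assertion would then compare the result against `TypeError`, which confirms that the bad arguments never reach the specialized C code.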

I could implement this test, but wouldn't we then just be testing the standard python-wrapper type validation functionality, as opposed to some specific functionality related to this PR?

I can still add the tests accordingly, I just want to make sure we have the same understanding of things before I proceed.

ilevkivskyi · 2025-08-21T09:59:24Z

mypyc/test-data/run-integers.test

+    assert to_bytes(255, 2, "big") == b'\x00\xff'
+    assert to_bytes(255, 2, "little") == b'\xff\x00'
+    assert to_bytes(-1, 2, "big", True) == b'\xff\xff'
+    assert to_bytes(0, 1, "big") == b'\x00'
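For reference, the expected values in these assertions match what the built-in `int.to_bytes` returns in plain CPython. The sketch below reruns them uncompiled and adds a few overflow cases; the `raises_overflow` helper is a hypothetical name used only for this illustration, and the overflow inputs are chosen here rather than taken from the PR's tests.

```python
def to_bytes(n: int, length: int, byteorder: str = "little", signed: bool = False) -> bytes:
    # Same helper shape as in the run test; here it runs uncompiled.
    return n.to_bytes(length, byteorder, signed=signed)


def raises_overflow(n: int, length: int, byteorder: str, signed: bool = False) -> bool:
    """Return True if the conversion does not fit and raises OverflowError."""
    try:
        to_bytes(n, length, byteorder, signed)
    except OverflowError:
        return True
    return False


assert to_bytes(255, 2, "big") == b"\x00\xff"
assert to_bytes(255, 2, "little") == b"\xff\x00"
assert to_bytes(-1, 2, "big", True) == b"\xff\xff"
assert to_bytes(0, 1, "big") == b"\x00"

assert raises_overflow(256, 1, "big")        # does not fit in one byte
assert raises_overflow(-1, 2, "big")         # negative value with signed=False
assert raises_overflow(128, 1, "big", True)  # outside the signed one-byte range
```

A primitive that replaces the method call should agree with these results, including the error cases.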


Maybe also test calling to_bytes() function from interpreted code.

@ilevkivskyi how would I implement that? Is there a good example I can work from?

mypyc/lib-rt/int_ops.c

JukkaL

I measured the performance impact using a micro-benchmark, and the performance was pretty similar to master on Python 3.13:

import time

def bench(n: int) -> None:
    for i in range(n):
        i.to_bytes(4, "little", signed=True)

bench(10 * 1000 * 1000)
t0 = time.time()
bench(30 * 1000 * 1000)
print(time.time() - t0)

It's possible that the operation is expensive enough that the call overhead when there's no primitive isn't significant. In any case, since maintaining a new primitive takes some effort, it's important that there's some measurable performance improvement first. Have you been able to measure a performance improvement?
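One way to make such a comparison less sensitive to noise is to repeat the timed run and keep the best result. The sketch below is an assumption about how that could be done, not the method used in this thread; `bench_discard` and `best_of` are hypothetical names, and the iteration counts are kept small for illustration.

```python
import time
from typing import Callable


def bench_discard(n: int) -> None:
    # Same loop body as the micro-benchmark above: the result is discarded.
    for i in range(n):
        i.to_bytes(4, "little", signed=True)


def best_of(func: Callable[[int], None], n: int, repeats: int = 3) -> float:
    """Run func(n) several times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        func(n)
        timings.append(time.perf_counter() - t0)
    return min(timings)


if __name__ == "__main__":
    best_of(bench_discard, 50_000, repeats=1)  # warm-up run, result ignored
    print(f"{best_of(bench_discard, 200_000):.4f}s")
```

Running the same script compiled against master and against the branch gives two numbers that can be compared directly; larger iteration counts, like those in the benchmark above, give more stable results.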

mypyc/test-data/irbuild-int.test

BobTheBuidler · 2025-09-09T17:01:24Z

What if you assign the result to a variable? That way we can include the interpreted version's unboxing penalty in our benchmark.

import time

def bench(n: int) -> None:
    for i in range(n):
        x = i.to_bytes(4, "little", signed=True)

bench(10 * 1000 * 1000)
t0 = time.time()
bench(30 * 1000 * 1000)
print(time.time() - t0)

BobTheBuidler · 2025-09-09T17:02:29Z

Separately, I might be able to slightly optimize the little/big check by only checking the length of the input string once, since we already know the length of 'little' and 'big'.

Alternatively, I could make a bespoke version for each of the two cases and eliminate the check entirely. This would be fastest, but I'm not sure there's currently a clean way to implement it with a method_op.

JukkaL · 2025-09-09T17:07:46Z

Having a specializer function (in mypyc.irbuild.specialize) that chooses a little-endian or big-endian variant of the primitive when the argument is a string literal could help. We do a similar thing for encode and decode calls.
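The idea can be illustrated outside mypyc with Python's standard `ast` module. This is only a sketch of the dispatch logic, not mypyc's specializer API: the function name `pick_to_bytes_variant` and the variant names it returns are hypothetical.

```python
import ast


def pick_to_bytes_variant(call_source: str) -> str:
    """Pick a (hypothetical) primitive variant for a `to_bytes` call expression."""
    node = ast.parse(call_source, mode="eval").body
    if not (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and node.func.attr == "to_bytes"
    ):
        raise ValueError("expected a to_bytes() call")

    # Find the byteorder argument, passed positionally or by keyword.
    byteorder = None
    if len(node.args) >= 2:
        byteorder = node.args[1]
    else:
        for kw in node.keywords:
            if kw.arg == "byteorder":
                byteorder = kw.value

    # Only a string literal is known at compile time.
    if isinstance(byteorder, ast.Constant) and isinstance(byteorder.value, str):
        if byteorder.value == "little":
            return "to_bytes_little"
        if byteorder.value == "big":
            return "to_bytes_big"
    # Anything else falls back to the variant that checks byteorder at runtime.
    return "to_bytes_generic"


print(pick_to_bytes_variant('x.to_bytes(4, "little", signed=True)'))
```

With a literal byte order the runtime string comparison disappears entirely; a byte order held in a variable still goes through the generic path.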

BobTheBuidler · 2025-09-09T17:26:18Z

That's odd. I fixed the kw-only arg in to_bytes stub, but doing so breaks the IR. Is there a way for me to cover this case with a method_op?

BobTheBuidler · 2025-09-10T04:19:40Z

1.18 vs this branch

There was a ~5-8% speedup across various permutations of the int_to_big_endian benchmark, which includes not only a call to to_bytes but a fair amount of other work as well.

BobTheBuidler and others added 22 commits August 16, 2025 15:50

[mypyc] feat: new primitive for int.to_bytes

2058f81

[pre-commit.ci] auto fixes from pre-commit.com hooks

f62bfd2

for more information, see https://pre-commit.ci

add headers

813c510

Merge branch 'to-bytes' of https://github.com/BobTheBuidler/mypy into…

4be77d6

… to-bytes

cover all arg combos

cb93329

CPyLong_ToBytes header

6ff1c9b

;

166b6f4

fix name

be2d2de

fix _PyLong_AsByteArray compile err

9ff3a7c

Update ir.py

8399619

define header

db7b483

Merge branch 'to-bytes' of https://github.com/BobTheBuidler/mypy into…

ed05c00

… to-bytes

fix ir

a0c147b

Update ir.py

5bab265

fix ir

b346bbf

Merge branch 'to-bytes' of https://github.com/BobTheBuidler/mypy into…

8d523a0

… to-bytes

fix ir

85652a2

add ir test

8e97165

fix py 3.10

3660a01

use _PyLong_AsByteArray on all pythons

6a8a83c

fix: py3.13 and 3.14

ec95008

optimize if check

95fa8b0

BobTheBuidler marked this pull request as ready for review August 17, 2025 16:39

ilevkivskyi approved these changes Aug 20, 2025


Merge branch 'master' into to-bytes

78f4ca1

JukkaL reviewed Aug 21, 2025


ilevkivskyi reviewed Aug 21, 2025


BobTheBuidler added 3 commits August 21, 2025 09:46

Update CPy.h

326484b

Update int_ops.c

245a122

Update int_ops.c

d2994e1

BobTheBuidler and others added 5 commits August 23, 2025 09:43

Merge branch 'master' into to-bytes

1138448

test long

0715ca2

add overflow error tests

7cce1e7

Merge branch 'master' into to-bytes

c3aaefa

fix: move helper func up in C file

0d7cb57

JukkaL reviewed Sep 9, 2025


Update ir.py

db09bc0

Update irbuild-int.test

a7151dd
