
Conversation

vstinner
Member

@vstinner commented Oct 10, 2025

  • Add _PyTuple_NewNoTrack() and _PyTuple_ResizeNoTrack() helper functions.
  • Modify PySequence_Tuple() to use PyTupleWriter API.
  • Soft deprecate _PyTuple_Resize().

📚 Documentation preview 📚: https://cpython-previews--139891.org.readthedocs.build/

@markshannon
Member

Please don't add any APIs for tracking. Tracking, or untracking, is the job of the VM. We might not even have tracking in the future. FT already tracks objects differently.

Deprecate _PyTuple_Resize() as hard as you like 🙂; it is nonsense and should be removed as soon as possible.
Please deprecate PyTuple_New as well.

I think the most useful new API we could add is PyTuple_MakePair(). Making a tuple from two objects is very common.

@vstinner
Member Author

Please don't add any APIs for tracking. Tracking, or untracking, is the job of the VM. We might not even have tracking in the future. FT already tracks objects differently.

Are you talking about _PyTuple_NewNoTrack() and _PyTuple_ResizeNoTrack()? These functions are not usable outside tupleobject.c: they are declared as static.

@vstinner
Member Author

Deprecate _PyTuple_Resize() as hard as you like 🙂; it is nonsense and should be removed as soon as possible.

For now, I prefer to only soft deprecate it. It's documented and used by too many C extensions.

Please deprecate PyTuple_New as well.

Well, I'm open to soft deprecate it. But deprecating it would affect too many C extensions IMO.

I think the most useful new API we could add is PyTuple_MakePair(). Making a tuple from two objects is very common.

We might add a PyTuple_Pack2() function. But that should be a separate issue.

PyTupleWriter is mostly useful when you don't know the tuple size in advance. For example, when you consume an iterator.
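For example, building a tuple from an iterator of unknown length could look roughly like this (a sketch using the writer API spelled out later in this thread; names and exact semantics may still change):

    static PyObject *
    tuple_from_iterable(PyObject *iterable)
    {
        PyObject *it = PyObject_GetIter(iterable);
        if (it == NULL) {
            return NULL;
        }
        /* passing 0 for "size unknown" is an assumption about Create() */
        PyTupleWriter *writer = PyTupleWriter_Create(0);
        if (writer == NULL) {
            Py_DECREF(it);
            return NULL;
        }
        PyObject *item;
        while ((item = PyIter_Next(it)) != NULL) {
            /* Add() creates its own reference, so we still release item */
            int res = PyTupleWriter_Add(writer, item);
            Py_DECREF(item);
            if (res < 0) {
                goto error;
            }
        }
        if (PyErr_Occurred()) {  /* PyIter_Next() failed rather than finished */
            goto error;
        }
        Py_DECREF(it);
        return PyTupleWriter_Finish(writer);

    error:
        PyTupleWriter_Discard(writer);
        Py_DECREF(it);
        return NULL;
    }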

@markshannon
Member

Where is the API specified? It seems rather inefficient, needing to heap allocate the writer.
It should be as efficient as possible, or we won't be able to persuade people to switch away from using PyTuple_New.

There should be no need for a method to create a tuple writer; it can be a small object that can be stack allocated and zero initialized.

    PyTupleWriter writer = { 0 };

It also needs a function to consume the reference of the item, like PyTuple_SET_ITEM but safer.

    PyTupleWriter_AddConsumeRef(&writer, item);

Maybe add bulk adds as well?

    PyTupleWriter_AddArray(PyTupleWriter *writer, PyObject **array, intptr_t count);
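Put together, the stack-allocated shape might look something like this (nothing below exists today; Finish() and Discard() taking &writer are assumptions made for completeness):

    static PyObject *
    make_pair(PyObject *first, PyObject *second)
    {
        PyTupleWriter writer = { 0 };  /* zero initialized, no heap allocation */

        /* steals the reference, like PyTuple_SET_ITEM but with error checking */
        if (PyTupleWriter_AddConsumeRef(&writer, Py_NewRef(first)) < 0) {
            goto error;
        }
        if (PyTupleWriter_AddConsumeRef(&writer, Py_NewRef(second)) < 0) {
            goto error;
        }
        return PyTupleWriter_Finish(&writer);

    error:
        PyTupleWriter_Discard(&writer);
        return NULL;
    }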

@markshannon
Member

Well, I'm open to soft deprecate it. But deprecating it would affect too many C extensions IMO.

It is unfortunate that so many extensions use it, but it is still broken. The sooner we deprecate it, the better, as we can give people more warning. We do need a good story for how to replace it.

@markshannon
Member

markshannon commented Oct 10, 2025

PyTupleWriter is mostly useful when you don't know the tuple size in advance. For example, when you consume an iterator.

If you are consuming an iterator, PySequence_Tuple is much simpler than PyTupleWriter.
TBH, if you're interacting with Python objects at that level, your best option is probably Python, not C.

@markshannon
Member

I see the value in this as a nice, safe replacement for the PyTuple_New + PyTuple_SET_ITEM combo.
So the API needs to be efficient, and easy to port to.

@vstinner
Member Author

Where is the API specified? It seems rather inefficient, needing to heap allocate the writer.

The API is:

PyTupleWriter* PyTupleWriter_Create(Py_ssize_t size);
int PyTupleWriter_Add(PyTupleWriter *writer, PyObject *item);
PyObject* PyTupleWriter_Finish(PyTupleWriter *writer);
void PyTupleWriter_Discard(PyTupleWriter *writer);

PyTupleWriter_Add() creates a new reference; it doesn't take ownership of the item.
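
For the common case where the size is known up front, the intended replacement for the PyTuple_New() + PyTuple_SET_ITEM() pattern would look roughly like this (a sketch only; error handling and the exact Finish()/Discard() semantics may still change):

    static PyObject *
    squares_tuple(Py_ssize_t n)
    {
        PyTupleWriter *writer = PyTupleWriter_Create(n);
        if (writer == NULL) {
            return NULL;
        }
        for (Py_ssize_t i = 0; i < n; i++) {
            PyObject *item = PyLong_FromSsize_t(i * i);
            if (item == NULL) {
                PyTupleWriter_Discard(writer);
                return NULL;
            }
            /* Add() creates a new reference, so release our own */
            int res = PyTupleWriter_Add(writer, item);
            Py_DECREF(item);
            if (res < 0) {
                PyTupleWriter_Discard(writer);
                return NULL;
            }
        }
        return PyTupleWriter_Finish(writer);
    }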

It seems rather inefficient, needing to heap allocate the writer.

I designed the API to be compatible with the stable ABI later. So the writer is allocated on the heap to hide the structure members from the public C API.

The implementation uses a free list which makes the allocation basically free in terms of performance.

It also needs a function to consume the reference of the item, like PyTuple_SET_ITEM but safer.

I can add an int PyTupleWriter_AddSteal(PyTupleWriter *writer, PyObject *item) variant which takes ownership of the item. The C API Working Group recently expressed its preference for the Steal term for such APIs.

Maybe add bulk adds as well?

That sounds like a good idea; it would be similar to PyTuple_FromArray().
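
Concretely, the bulk path could reduce the array case to something like this (a sketch; the final AddArray() signature and its reference semantics are still open):

    static PyObject *
    tuple_from_items(PyObject **items, Py_ssize_t count)
    {
        PyTupleWriter *writer = PyTupleWriter_Create(count);
        if (writer == NULL) {
            return NULL;
        }
        /* assumed to create new references for items[0..count-1] */
        if (PyTupleWriter_AddArray(writer, items, count) < 0) {
            PyTupleWriter_Discard(writer);
            return NULL;
        }
        return PyTupleWriter_Finish(writer);
    }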

@markshannon
Member

So the writer is allocated on the heap to hide the structure members from the public C API.

As long as setting all the fields to zero initializes it, then only the size need be fixed.

The implementation uses a free list which makes the allocation basically free in terms of performance.

That's not true. Free lists can have poor locality of reference, and the code can be quite branchy. Plus there's the overhead of the function call.

@vstinner
Member Author

I updated the PR to add PyTupleWriter_AddSteal() and PyTupleWriter_AddArray() functions, and hard deprecate _PyTuple_Resize().

@vstinner
Member Author

Benchmark comparing PyTuple_New()+PyTuple_SetItem() to PyTupleWriter_Create()+PyTupleWriter_AddSteal()+PyTupleWriter_Finish().

(*) Tuple of 1 item (worst case scenario, measuring the overhead of the abstraction):

Mean +- std dev: [tuple] 32.4 ns +- 0.1 ns -> [writer] 36.1 ns +- 0.6 ns: 1.11x slower

It's only 3.7 nanoseconds slower.

(*) Tuple of 1024 items:

Mean +- std dev: [tuple] 7.09 us +- 0.08 us -> [writer] 7.62 us +- 0.06 us: 1.07x slower


Benchmark:

Patch:

diff --git a/Modules/_testcapimodule.c b/Modules/_testcapimodule.c
index 4e73be20e1b..9f7fb63daee 100644
--- a/Modules/_testcapimodule.c
+++ b/Modules/_testcapimodule.c
@@ -2562,6 +2562,142 @@ toggle_reftrace_printer(PyObject *ob, PyObject *arg)
     Py_RETURN_NONE;
 }
 
+static PyObject *
+bench_tuple1(PyObject *ob, PyObject *args)
+{
+    Py_ssize_t loops;
+    if (!PyArg_ParseTuple(args, "n", &loops)) {
+        return NULL;
+    }
+
+    PyTime_t t1, t2;
+    PyTime_PerfCounterRaw(&t1);
+    for (Py_ssize_t i=0; i < loops; i++) {
+        PyObject *tuple = PyTuple_New(1);
+        if (tuple == NULL) {
+            return NULL;
+        }
+
+        PyObject *item = PyLong_FromLong(1);
+        if (item == NULL) {
+            return NULL;
+        }
+        if (PyTuple_SetItem(tuple, 0, item) < 0) {
+            Py_DECREF(tuple);
+            return NULL;
+        }
+
+        Py_DECREF(tuple);
+    }
+    PyTime_PerfCounterRaw(&t2);
+    return PyFloat_FromDouble(PyTime_AsSecondsDouble(t2 - t1));
+}
+
+static PyObject *
+bench_writer1(PyObject *ob, PyObject *args)
+{
+    Py_ssize_t loops;
+    if (!PyArg_ParseTuple(args, "n", &loops)) {
+        return NULL;
+    }
+
+    PyTime_t t1, t2;
+    PyTime_PerfCounterRaw(&t1);
+    for (Py_ssize_t i=0; i < loops; i++) {
+        PyTupleWriter *writer = PyTupleWriter_Create(1);
+        if (writer == NULL) {
+            return NULL;
+        }
+
+        PyObject *item = PyLong_FromLong(1);
+        if (item == NULL) {
+            return NULL;
+        }
+        if (PyTupleWriter_AddSteal(writer, item) < 0) {
+            PyTupleWriter_Discard(writer);
+            return NULL;
+        }
+
+        PyObject *tuple = PyTupleWriter_Finish(writer);
+        if (tuple == NULL) {
+            return NULL;
+        }
+        Py_DECREF(tuple);
+    }
+    PyTime_PerfCounterRaw(&t2);
+    return PyFloat_FromDouble(PyTime_AsSecondsDouble(t2 - t1));
+}
+
+static PyObject *
+bench_tuple1024(PyObject *ob, PyObject *args)
+{
+    Py_ssize_t loops;
+    if (!PyArg_ParseTuple(args, "n", &loops)) {
+        return NULL;
+    }
+
+    PyTime_t t1, t2;
+    PyTime_PerfCounterRaw(&t1);
+    for (Py_ssize_t i=0; i < loops; i++) {
+        PyObject *tuple = PyTuple_New(1024);
+        if (tuple == NULL) {
+            return NULL;
+        }
+
+        for (int j=0; j < 1024; j++) {
+            PyObject *item = PyLong_FromLong(j);
+            if (item == NULL) {
+                return NULL;
+            }
+            if (PyTuple_SetItem(tuple, j, item) < 0) {
+                Py_DECREF(tuple);
+                return NULL;
+            }
+        }
+
+        Py_DECREF(tuple);
+    }
+    PyTime_PerfCounterRaw(&t2);
+    return PyFloat_FromDouble(PyTime_AsSecondsDouble(t2 - t1));
+}
+
+static PyObject *
+bench_writer1024(PyObject *ob, PyObject *args)
+{
+    Py_ssize_t loops;
+    if (!PyArg_ParseTuple(args, "n", &loops)) {
+        return NULL;
+    }
+
+    PyTime_t t1, t2;
+    PyTime_PerfCounterRaw(&t1);
+    for (Py_ssize_t i=0; i < loops; i++) {
+        PyTupleWriter *writer = PyTupleWriter_Create(1024);
+        if (writer == NULL) {
+            return NULL;
+        }
+
+        for (int j=0; j < 1024; j++) {
+            PyObject *item = PyLong_FromLong(j);
+            if (item == NULL) {
+                return NULL;
+            }
+            if (PyTupleWriter_AddSteal(writer, item) < 0) {
+                PyTupleWriter_Discard(writer);
+                return NULL;
+            }
+        }
+
+        PyObject *tuple = PyTupleWriter_Finish(writer);
+        if (tuple == NULL) {
+            return NULL;
+        }
+        Py_DECREF(tuple);
+    }
+    PyTime_PerfCounterRaw(&t2);
+    return PyFloat_FromDouble(PyTime_AsSecondsDouble(t2 - t1));
+}
+
 static PyMethodDef TestMethods[] = {
     {"set_errno",               set_errno,                       METH_VARARGS},
     {"test_config",             test_config,                     METH_NOARGS},
@@ -2656,6 +2792,10 @@ static PyMethodDef TestMethods[] = {
     {"test_atexit", test_atexit, METH_NOARGS},
     {"code_offset_to_line", _PyCFunction_CAST(code_offset_to_line), METH_FASTCALL},
     {"toggle_reftrace_printer", toggle_reftrace_printer, METH_O},
+    {"bench_tuple1", bench_tuple1, METH_VARARGS},
+    {"bench_writer1", bench_writer1, METH_VARARGS},
+    {"bench_tuple1024", bench_tuple1024, METH_VARARGS},
+    {"bench_writer1024", bench_writer1024, METH_VARARGS},
     {NULL, NULL} /* sentinel */
 };

bench_tuple.py:

import pyperf
import _testcapi
runner = pyperf.Runner()
runner.bench_time_func('tuple-1', _testcapi.bench_tuple1)
runner.bench_time_func('tuple-1024', _testcapi.bench_tuple1024)

bench_writer.py:

import pyperf
import _testcapi
runner = pyperf.Runner()
runner.bench_time_func('tuple-1', _testcapi.bench_writer1)
runner.bench_time_func('tuple-1024', _testcapi.bench_writer1024)
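
For reference, the comparison can be reproduced by writing each run to JSON (python bench_tuple.py -o tuple.json and python bench_writer.py -o writer.json) and then running python -m pyperf compare_to tuple.json writer.json.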

@zooba
Member

zooba commented Oct 10, 2025

Opposition posted on the issue.
