gh-139888: Add PyTupleWriter C API #139891
base: main
* Add _PyTuple_NewNoTrack() and _PyTuple_ResizeNoTrack() helper functions.
* Modify PySequence_Tuple() to use the PyTupleWriter API.
* Soft deprecate _PyTuple_Resize().
Please don't add any APIs for tracking. Tracking, or untracking, is the job of the VM. We might not even have tracking in the future. FT already tracks objects differently. Deprecate I think the most useful new API we could add is |
Are you talking about |
For now, I prefer to only soft deprecate it. It's documented and used by too many C extensions.
Well, I'm open to soft-deprecating it. But fully deprecating it would affect too many C extensions, IMO.
We might add PyTupleWriter is mostly useful when you don't know the tuple size in advance. For example, when you consume an iterator. |
Where is the API specified? It seems rather inefficient, needing to heap allocate the writer. There should be no need for a function to create a tuple writer; it can be a small object that can be stack allocated and zero initialized.
It also needs a function to consume the reference of the item, like
Maybe add bulk adds as well?
|
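The stack-allocated, zero-initialized design suggested in the comment above could look roughly like the following. This is only a sketch written as a plain C model without Python.h so that it compiles on its own: every name in it (SketchWriter, sketch_writer_add, sketch_writer_add_array, sketch_writer_discard, SKETCH_INLINE_ITEMS) is hypothetical and is not part of this PR or of CPython. Items are opaque pointers and reference counting is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model, not a CPython API: a writer small enough to live on
   the stack. Zero-initialization (SketchWriter w = {0};) gives a valid,
   empty writer, so no "create" function is needed. */
#define SKETCH_INLINE_ITEMS 8

typedef struct {
    void **heap_items;                        /* NULL while the inline buffer is in use */
    size_t size;                              /* number of items stored so far */
    size_t capacity;                          /* heap capacity; 0 while inline */
    void *inline_items[SKETCH_INLINE_ITEMS];  /* fast path for small tuples */
} SketchWriter;

/* Return the buffer that currently holds the items. */
static void **
sketch_writer_items(SketchWriter *w)
{
    return w->heap_items != NULL ? w->heap_items : w->inline_items;
}

/* Append one item. Returns 0 on success, -1 on allocation failure. */
static int
sketch_writer_add(SketchWriter *w, void *item)
{
    assert(w != NULL);
    if (w->heap_items == NULL && w->size < SKETCH_INLINE_ITEMS) {
        w->inline_items[w->size++] = item;
        return 0;
    }
    if (w->heap_items == NULL) {
        /* First overflow: move the inline items to a heap buffer. */
        size_t cap = SKETCH_INLINE_ITEMS * 2;
        void **items = malloc(cap * sizeof(void *));
        if (items == NULL) {
            return -1;
        }
        memcpy(items, w->inline_items, w->size * sizeof(void *));
        w->heap_items = items;
        w->capacity = cap;
    }
    else if (w->size == w->capacity) {
        /* Heap buffer is full: double it. */
        size_t cap = w->capacity * 2;
        void **items = realloc(w->heap_items, cap * sizeof(void *));
        if (items == NULL) {
            return -1;
        }
        w->heap_items = items;
        w->capacity = cap;
    }
    w->heap_items[w->size++] = item;
    return 0;
}

/* Bulk add: append n items from an array. */
static int
sketch_writer_add_array(SketchWriter *w, void *const *array, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (sketch_writer_add(w, array[i]) < 0) {
            return -1;
        }
    }
    return 0;
}

/* Release any heap buffer and return the writer to its zero state. */
static void
sketch_writer_discard(SketchWriter *w)
{
    free(w->heap_items);
    memset(w, 0, sizeof(*w));
}
```

The trade-off is that the struct layout and size become visible to callers, which is what the heap-allocated, opaque design avoids.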
It is unfortunate that so many extensions use it, but it is still broken. The sooner we deprecate it, the better, as we can give people more warning. We do need a good story for how to replace it. |
If you are consuming an iterator, |
I see the value in this as a nice, safe replacement for the |
The API is:
I designed the API to be compatible with the stable ABI later. So the writer is allocated on the heap to hide the structure members from the public C API. The implementation uses a free list which makes the allocation basically free in terms of performance.
I can add
That sounds like a good idea, it would be similar to PyTuple_FromArray(). |
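To make the free-list argument above concrete, here is a minimal sketch of how an opaque, heap-allocated handle can be recycled instead of going through malloc() on every creation. It is a plain C model, not the PR's implementation: SketchHandle, sketch_handle_create, sketch_handle_release and SKETCH_FREELIST_MAX are hypothetical names, and the sketch ignores thread safety and per-interpreter state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model, not a CPython API. In a real opaque design only a
   forward declaration of the struct would be exposed to callers. */
typedef struct SketchHandle {
    struct SketchHandle *next_free;  /* link used only while on the free list */
    size_t size;                     /* stand-in for the writer's state */
} SketchHandle;

#define SKETCH_FREELIST_MAX 4

static SketchHandle *sketch_freelist = NULL;
static size_t sketch_freelist_len = 0;

/* Get a handle: reuse one from the free list if possible, else malloc(). */
static SketchHandle *
sketch_handle_create(void)
{
    SketchHandle *h = sketch_freelist;
    if (h != NULL) {
        sketch_freelist = h->next_free;
        sketch_freelist_len--;
    }
    else {
        h = malloc(sizeof(*h));
        if (h == NULL) {
            return NULL;
        }
    }
    h->next_free = NULL;
    h->size = 0;
    return h;
}

/* Give a handle back: keep it for reuse unless the free list is full. */
static void
sketch_handle_release(SketchHandle *h)
{
    if (sketch_freelist_len < SKETCH_FREELIST_MAX) {
        h->next_free = sketch_freelist;
        sketch_freelist = h;
        sketch_freelist_len++;
    }
    else {
        free(h);
    }
}
```

In the common create/release/create pattern the second creation is a pointer pop rather than an allocator call, although it still costs a function call and a branch.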
As long as setting all the fields to zero initializes it, then only the size need be fixed.
That's not true. Free lists can have poor locality of reference, and the code can be quite branchy. Plus there's the overhead of the function call. |
I updated the PR to add |
Also change the exception to SystemError for this error.
Benchmark comparing (*) Tuple of 1 item (worst case scenario, measure the overhead of the abstraction):
It's only 3.7 nanoseconds slower. (*) Tuple of 1024 items:
Benchmark:
Patch:
diff --git a/Modules/_testcapimodule.c b/Modules/_testcapimodule.c
index 4e73be20e1b..9f7fb63daee 100644
--- a/Modules/_testcapimodule.c
+++ b/Modules/_testcapimodule.c
@@ -2562,6 +2562,142 @@ toggle_reftrace_printer(PyObject *ob, PyObject *arg)
     Py_RETURN_NONE;
 }
+static PyObject *
+bench_tuple1(PyObject *ob, PyObject *args)
+{
+    Py_ssize_t loops;
+    if (!PyArg_ParseTuple(args, "n", &loops)) {
+        return NULL;
+    }
+
+    PyTime_t t1, t2;
+    PyTime_PerfCounterRaw(&t1);
+    for (Py_ssize_t i=0; i < loops; i++) {
+        PyObject *tuple = PyTuple_New(1);
+        if (tuple == NULL) {
+            return NULL;
+        }
+
+        PyObject *item = PyLong_FromLong(1);
+        if (item == NULL) {
+            return NULL;
+        }
+        if (PyTuple_SetItem(tuple, 0, item) < 0) {
+            Py_DECREF(tuple);
+            return NULL;
+        }
+
+        Py_DECREF(tuple);
+    }
+    PyTime_PerfCounterRaw(&t2);
+    return PyFloat_FromDouble(PyTime_AsSecondsDouble(t2 - t1));
+}
+
+static PyObject *
+bench_writer1(PyObject *ob, PyObject *args)
+{
+    Py_ssize_t loops;
+    if (!PyArg_ParseTuple(args, "n", &loops)) {
+        return NULL;
+    }
+
+    PyTime_t t1, t2;
+    PyTime_PerfCounterRaw(&t1);
+    for (Py_ssize_t i=0; i < loops; i++) {
+        PyTupleWriter *writer = PyTupleWriter_Create(1);
+        if (writer == NULL) {
+            return NULL;
+        }
+
+        PyObject *item = PyLong_FromLong(1);
+        if (item == NULL) {
+            return NULL;
+        }
+        if (PyTupleWriter_AddSteal(writer, item) < 0) {
+            PyTupleWriter_Discard(writer);
+            return NULL;
+        }
+
+        PyObject *tuple = PyTupleWriter_Finish(writer);
+        if (tuple == NULL) {
+            return NULL;
+        }
+        Py_DECREF(tuple);
+    }
+    PyTime_PerfCounterRaw(&t2);
+    return PyFloat_FromDouble(PyTime_AsSecondsDouble(t2 - t1));
+}
+
+static PyObject *
+bench_tuple1024(PyObject *ob, PyObject *args)
+{
+    Py_ssize_t loops;
+    if (!PyArg_ParseTuple(args, "n", &loops)) {
+        return NULL;
+    }
+
+    PyTime_t t1, t2;
+    PyTime_PerfCounterRaw(&t1);
+    for (Py_ssize_t i=0; i < loops; i++) {
+        PyObject *tuple = PyTuple_New(1024);
+        if (tuple == NULL) {
+            return NULL;
+        }
+
+        for (int j=0; j < 1024; j++) {
+            PyObject *item = PyLong_FromLong(j);
+            if (item == NULL) {
+                return NULL;
+            }
+            if (PyTuple_SetItem(tuple, j, item) < 0) {
+                Py_DECREF(tuple);
+                return NULL;
+            }
+        }
+
+        Py_DECREF(tuple);
+    }
+    PyTime_PerfCounterRaw(&t2);
+    return PyFloat_FromDouble(PyTime_AsSecondsDouble(t2 - t1));
+}
+
+static PyObject *
+bench_writer1024(PyObject *ob, PyObject *args)
+{
+    Py_ssize_t loops;
+    if (!PyArg_ParseTuple(args, "n", &loops)) {
+        return NULL;
+    }
+
+    PyTime_t t1, t2;
+    PyTime_PerfCounterRaw(&t1);
+    for (Py_ssize_t i=0; i < loops; i++) {
+        PyTupleWriter *writer = PyTupleWriter_Create(1024);
+        if (writer == NULL) {
+            return NULL;
+        }
+
+        for (int j=0; j < 1024; j++) {
+            PyObject *item = PyLong_FromLong(j);
+            if (item == NULL) {
+                return NULL;
+            }
+            if (PyTupleWriter_AddSteal(writer, item) < 0) {
+                PyTupleWriter_Discard(writer);
+                return NULL;
+            }
+        }
+
+        PyObject *tuple = PyTupleWriter_Finish(writer);
+        if (tuple == NULL) {
+            return NULL;
+        }
+        Py_DECREF(tuple);
+    }
+    PyTime_PerfCounterRaw(&t2);
+    return PyFloat_FromDouble(PyTime_AsSecondsDouble(t2 - t1));
+}
+
 static PyMethodDef TestMethods[] = {
     {"set_errno", set_errno, METH_VARARGS},
     {"test_config", test_config, METH_NOARGS},
@@ -2656,6 +2792,10 @@ static PyMethodDef TestMethods[] = {
     {"test_atexit", test_atexit, METH_NOARGS},
     {"code_offset_to_line", _PyCFunction_CAST(code_offset_to_line), METH_FASTCALL},
     {"toggle_reftrace_printer", toggle_reftrace_printer, METH_O},
+    {"bench_tuple1", bench_tuple1, METH_VARARGS},
+    {"bench_writer1", bench_writer1, METH_VARARGS},
+    {"bench_tuple1024", bench_tuple1024, METH_VARARGS},
+    {"bench_writer1024", bench_writer1024, METH_VARARGS},
     {NULL, NULL} /* sentinel */
 };

import pyperf
import _testcapi

runner = pyperf.Runner()
runner.bench_time_func('tuple-1', _testcapi.bench_tuple1)
runner.bench_time_func('tuple-1024', _testcapi.bench_tuple1024)

import pyperf
import _testcapi

runner = pyperf.Runner()
runner.bench_time_func('tuple-1', _testcapi.bench_writer1)
runner.bench_time_func('tuple-1024', _testcapi.bench_writer1024)

Opposition posted on the issue. |
PyTupleWriter API #139888
📚 Documentation preview 📚: https://cpython-previews--139891.org.readthedocs.build/