PEP: 437
Title: A DSL for specifying signatures, annotations and argument converters
Version: $Revision$
Last-Modified: $Date$
Author: Stefan Krah <[email protected]>
Status: Rejected
Type: Standards Track
Content-Type: text/x-rst
Created: 2013-03-11
Python-Version: 3.4
Post-History:
Resolution: https://mail.python.org/pipermail/python-dev/2013-May/126117.html
Abstract
========
The Python C-API currently has no mechanism for specifying and auto-generating
function signatures, annotations or custom argument converters.
There are several possible approaches to the problem. Cython uses *cdef*
definitions in *.pyx* files to generate the required information. However,
CPython's C-API functions often require additional initialization and
cleanup snippets that would be hard to specify in a *cdef*.
PEP 436 proposes a domain-specific language (DSL) enclosed in C comments
that largely resembles a per-parameter configuration file. A preprocessor
reads the comment and emits an argument parsing function, docstrings and
a header for the function that utilizes the results of the parsing step.
The latter function is subsequently referred to as the *implementation
function*.
Rejection Notice
================
This PEP was rejected by Guido van Rossum at PyCon US 2013. However, several
of the specific issues raised by this PEP were taken into account when
designing the `second iteration of the PEP 436 DSL`_.
Rationale
=========
Opinions differ regarding the suitability of the PEP 436 DSL in the context
of a C file. This PEP proposes an alternative DSL. The specific issues with
PEP 436 that spurred the counterproposal are explained in the final
section of this PEP.
Scope
=====
The PEP focuses exclusively on the DSL. Topics like the output locations of
docstrings or the generated code are outside the scope of this PEP.
It is, however, vital that the DSL is suitable for generating custom argument
parsers, a feature that is already implemented in Cython. Therefore, one of
the goals of this PEP is to keep the DSL close to existing solutions, thus
facilitating a possible inclusion of the relevant parts of Cython into the
CPython source tree.
DSL overview
============
Type safety and annotations
---------------------------
A conversion from a Python to a C value is fully defined by the type of
the converter function. The PyArg_Parse* family of functions accepts
custom converters in addition to the well-known default converters "i",
"f", etc.
This PEP views the default converters as abstract functions, regardless
of how they are actually implemented.
Include/converters.h
--------------------
Converter functions must be forward-declared. All converter functions
shall be entered into the file Include/converters.h. The file is read
by the preprocessor prior to translating .c files. This is an excerpt::

    /*[converter]

    ##### Default converters #####
    "s": str -> const char *res;
    "s*": [str, bytes, bytearray, rw_buffer] -> Py_buffer &res;
    [...]
    "es#": str -> (const char *res_encoding, char **res, Py_ssize_t *res_length);
    [...]

    ##### Custom converters #####
    path_converter: [str, bytes, int] -> path_t &res;
    OS_STAT_DIR_FD_CONVERTER: [int, None] -> int res;

    [converter_end]*/

Converters are specified by their name, Python input type(s) and C output
type(s). Default converters must have quoted names; custom converters must
have plain, unquoted names. A Python type is given by its name. If a function
accepts multiple Python types, the set is written in list form.
Since the default converters may have multiple implicit return values,
the C output type(s) are written according to the following convention:
The main return value must be named *res*. This is a placeholder for
the actual variable name given later in the DSL. Additional implicit
return values must be prefixed by *res_*.
By default the variables are passed by value to the implementation function.
If the address should be passed instead, *res* must be prefixed with an
ampersand.
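
The naming convention above can be illustrated with a small Python sketch
that splits one declaration from the excerpt into its parts. The names
``parse_converter`` and ``Converter`` are hypothetical and are not taken from
the reference implementation; the regular expressions only cover the
declaration forms shown in the excerpt.

```python
import re
from typing import NamedTuple, Tuple


class Converter(NamedTuple):
    name: str                                      # converter name, quotes removed
    is_default: bool                               # True when the name was quoted
    py_types: Tuple[str, ...]                      # accepted Python type names
    c_outputs: Tuple[Tuple[str, str, bool], ...]   # (C type, variable, by_address)


# name : python-types -> c-outputs ;
_DECL = re.compile(r'^\s*("?)([^":\s]+)\1\s*:\s*(.+?)\s*->\s*(.+?)\s*;\s*$')
# "<C type> [&]res" or "<C type> [&]res_<suffix>"
_OUTPUT = re.compile(r'^(.*?)(&?)(res(?:_\w+)?)$')


def parse_converter(line: str) -> Converter:
    """Parse a single converter declaration (hypothetical helper)."""
    match = _DECL.match(line)
    if match is None:
        raise ValueError(f"not a converter declaration: {line!r}")
    quote, name, py_part, c_part = match.groups()

    # Both "[str, bytes]" and a bare "str" are accepted as input types.
    py_types = tuple(t.strip() for t in py_part.strip("[]").split(","))

    outputs = []
    for item in c_part.strip("()").split(","):
        out = _OUTPUT.match(item.strip())
        if out is None:
            raise ValueError(f"output must be named res or res_*: {item!r}")
        c_type, ampersand, variable = out.groups()
        # An ampersand means the address is passed to the implementation function.
        outputs.append((c_type.strip(), variable, ampersand == "&"))

    # Exactly one output must be the main return value named "res".
    if [variable for _, variable, _ in outputs].count("res") != 1:
        raise ValueError(f"exactly one output must be named res: {line!r}")

    return Converter(name, quote == '"', py_types, tuple(outputs))
```

For example, ``parse_converter('path_converter: [str, bytes, int] -> path_t &res;')``
yields a custom converter whose single output is passed by address.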
Additional declarations may be placed into .c files. Duplicate declarations
are allowed as long as the function types are identical.
Authors are encouraged to declare custom converter types a second time right
above the converter function definition. The preprocessor will then catch
any mismatch between the declarations.
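
A minimal Python sketch of such a mismatch check follows. It assumes that
"identical" means equal after collapsing whitespace; the helper names
``declare`` and ``ConverterMismatch`` are hypothetical and do not come from
the reference implementation.

```python
class ConverterMismatch(Exception):
    """Raised when a converter is re-declared with a different type."""


def normalize(signature: str) -> str:
    # Assumption: two declarations are identical if they differ only in spacing.
    return " ".join(signature.split())


def declare(registry: dict, name: str, signature: str) -> None:
    """Record a converter declaration; duplicates must agree with the first one."""
    sig = normalize(signature)
    previous = registry.setdefault(name, sig)
    if previous != sig:
        raise ConverterMismatch(
            f"{name}: declared as {previous!r}, re-declared as {sig!r}"
        )


registry = {}
# First declaration, as it might appear in Include/converters.h.
declare(registry, "path_converter", "[str, bytes, int] -> path_t &res")
# Second, identical declaration above the converter definition in a .c file.
declare(registry, "path_converter", "[str, bytes, int]  ->  path_t &res")
```

A third declaration with a different output type, for example
``[str, bytes, int] -> path_t res``, would raise ``ConverterMismatch``.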
In order to keep the converter complexity manageable, PY_SSIZE_T_CLEAN will
be deprecated and Py_ssize_t will be assumed for all length arguments.
TBD: Make a list of fantasy types like *rw_buffer*.
Function specifications
-----------------------
Keyword arguments
^^^^^^^^^^^^^^^^^
This example contains the definition of os.stat. The individual sections will
be explained in detail. Grammatically, the whole define block consists of a
function specification and an output section. The function specification in
turn consists of a declaration section, an optional C-declaration section and
an optional cleanup code section. Sections within the function specification
are separated in yacc style by '%%'::

    /*[define posix_stat]
    def os.stat(path: path_converter, *, dir_fd: OS_STAT_DIR_FD_CONVERTER = None,
                follow_symlinks: "p" = True) -> os.stat_result: pass
    %%
    path_t path = PATH_T_INITIALIZE("stat", 0, 1);
    int dir_fd = DEFAULT_DIR_FD;
    int follow_symlinks = 1;
    %%
    path_cleanup(&path);
    [define_end]*/

    <literal C output>

    /*[define_output_end]*/

Define block
~~~~~~~~~~~~
The function specification block starts with a ``/*[define`` token, followed
by an optional C function name, followed by a right bracket. If the C function
name is not given, it is generated from the declaration name. In the example,
omitting the name *posix_stat* would result in a C function name of *os_stat*.
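
The naming rule can be sketched in Python as follows. The PEP only gives the
single example *os.stat* to *os_stat*, so replacing every dot with an
underscore is an assumption, and ``c_function_name`` is a hypothetical helper.

```python
from typing import Optional


def c_function_name(declaration_name: str, explicit: Optional[str] = None) -> str:
    """Return the C function name for a define block (hypothetical helper)."""
    if explicit is not None:
        # A name given after "/*[define" always wins.
        return explicit
    # Assumption: dots in the declaration path become underscores.
    return declaration_name.replace(".", "_")


print(c_function_name("os.stat", "posix_stat"))
print(c_function_name("os.stat"))
```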
Declaration
~~~~~~~~~~~
The required declaration is (almost) a valid Python function definition. The
'def' keyword and the function body are redundant, but the author of this PEP
finds the definition more readable if they are present.
The function name may be a path instead of a plain identifier. Each argument
is annotated with the name of the converter function that will be applied to it.
Default values are given in the usual Python manner and may be any valid
Python expression.
The return value may be any Python expression. Usually it will be the name
of an object, but alternative return values could be specified in list form.
C-declarations
~~~~~~~~~~~~~~
This optional section contains C variable declarations. Since the converter
functions have been declared beforehand, the preprocessor can type-check
the declarations.
Cleanup
~~~~~~~
The optional cleanup section contains literal C code that will be inserted
unmodified after the implementation function.
Output
~~~~~~
The output section contains the code emitted by the preprocessor.
Positional-only arguments
^^^^^^^^^^^^^^^^^^^^^^^^^
Functions that do not take keyword arguments are indicated by the presence
of the *slash* special parameter::

    /*[define stat_float_times]
    def os.stat_float_times(/, newval: "i") -> os.stat_result: pass
    %%
    int newval = -1;
    [define_end]*/

The preprocessor translates this definition to a PyArg_ParseTuple() call.
All arguments to the right of the slash are optional arguments.
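
As an illustration, the following Python sketch derives a
``PyArg_ParseTuple()`` format string from such a declaration. It handles
default (quoted) converters only, relies on the documented format syntax in
which ``|`` starts the optional arguments and ``:`` introduces the function
name for error messages, and is not the output of the *preprocess* tool. The
function ``parse_tuple_format`` and its parameters are hypothetical.

```python
from typing import List, Tuple


def parse_tuple_format(func_name: str,
                       params: List[Tuple[str, str]],
                       slash_index: int) -> str:
    """Build a PyArg_ParseTuple format string (hypothetical helper).

    params      -- (argument name, default converter) pairs in declaration order
    slash_index -- number of parameters written to the left of the slash
    """
    required = "".join(code for _, code in params[:slash_index])
    optional = "".join(code for _, code in params[slash_index:])
    fmt = required
    if optional:
        # Everything to the right of the slash is optional.
        fmt += "|" + optional
    return f"{fmt}:{func_name}"


# os.stat_float_times(/, newval: "i"): the slash comes first,
# so the only argument is optional.
print(parse_tuple_format("stat_float_times", [("newval", "i")], 0))
```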
Left and right optional arguments
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Some legacy functions contain optional argument groups both to the left and
right of a central parameter. It is debatable whether a new tool should support
such functions. For completeness' sake, this is the proposed syntax::

    /*[define]
    def curses.window.addch(y: "i", x: "i", ch: "O", attr: "l") -> None: pass
    where groups = [[ch], [ch, attr], [y, x, ch], [y, x, ch, attr]]
    [define_end]*/

Here *ch* is the central parameter, *attr* can optionally be added on the
right, and the group [y, x] can optionally be added on the left.
Essentially the rule is that all ordered combinations of the central
parameter and the optional groups must be possible such that no two
combinations have the same length.
This is concisely expressed by putting the central parameter first in
the list and subsequently adding the optional argument groups to the
left and right.
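
One possible reading of this rule, sketched in Python: every group contains
the central parameter, keeps the declaration order, and has a length that no
other group shares, so the number of arguments alone selects the group. The
helpers ``check_groups`` and ``bind`` are hypothetical and are not part of
the reference implementation.

```python
def check_groups(params, groups, central):
    """Validate the groups and return a table mapping length -> group."""
    by_length = {}
    for group in groups:
        if central not in group:
            raise ValueError(f"{group} lacks the central parameter {central!r}")
        # The group must be an ordered subsequence of the declared parameters.
        remaining = iter(params)
        if not all(name in remaining for name in group):
            raise ValueError(f"{group} does not follow the declaration order")
        if len(group) in by_length:
            raise ValueError(f"two groups have length {len(group)}")
        by_length[len(group)] = list(group)
    return by_length


def bind(by_length, args):
    """Map positional arguments to names, choosing the group by length."""
    try:
        names = by_length[len(args)]
    except KeyError:
        raise TypeError(f"no group accepts {len(args)} arguments") from None
    return dict(zip(names, args))


PARAMS = ["y", "x", "ch", "attr"]
GROUPS = [["ch"], ["ch", "attr"], ["y", "x", "ch"], ["y", "x", "ch", "attr"]]
table = check_groups(PARAMS, GROUPS, "ch")
print(bind(table, (1, 2, "a")))
```

With the *addch* groups, three arguments can only mean ``y, x, ch``, while
two arguments can only mean ``ch, attr``.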
Flexibility in formatting
=========================
If the above os.stat example is considered too compact, it can easily be
formatted this way::

    /*[define posix_stat]
    def os.stat(path: path_converter,
                *,
                dir_fd: OS_STAT_DIR_FD_CONVERTER = None,
                follow_symlinks: "p" = True)
    -> os.stat_result: pass
    %%
    path_t path = PATH_T_INITIALIZE("stat", 0, 1);
    int dir_fd = DEFAULT_DIR_FD;
    int follow_symlinks = 1;
    %%
    path_cleanup(&path);
    [define_end]*/

    <literal C output>

    /*[define_output_end]*/

Benefits of a compact notation
==============================
The advantages of a concise notation are especially obvious when a large
number of parameters is involved. The argument parsing part of
``_posixsubprocess.fork_exec`` is fully specified by this definition::

    /*[define subprocess_fork_exec]
    def _posixsubprocess.fork_exec(
        process_args: "O", executable_list: "O",
        close_fds: "p", py_fds_to_keep: "O",
        cwd_obj: "O", env_list: "O",
        p2cread: "i", p2cwrite: "i", c2pread: "i", c2pwrite: "i",
        errread: "i", errwrite: "i", errpipe_read: "i", errpipe_write: "i",
        restore_signals: "i", call_setsid: "i", preexec_fn: "i", /) -> int: pass
    [define_end]*/

Note that the *preprocess* tool currently emits a redundant C-declaration
section for this example, so the output is longer than necessary.
Easy validation of the definition
=================================
How can an inexperienced user validate a definition like os.stat? Simply
by changing os.stat to os_stat, defining missing converters and pasting
the definition into the Python interactive interpreter!
In fact, a converters.py module could be auto-generated from converters.h.
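
The validation step described above might look like this in a Python
session. The two converter objects are placeholder stand-ins (an
auto-generated converters.py does not exist), and ``os.stat_result`` is taken
from the real ``os`` module so that the return annotation resolves.

```python
import inspect
import os

# Placeholder converter objects; only their names matter for the check.
path_converter = object()
OS_STAT_DIR_FD_CONVERTER = object()


# The declaration from the define block, with os.stat renamed to os_stat.
def os_stat(path: path_converter, *, dir_fd: OS_STAT_DIR_FD_CONVERTER = None,
            follow_symlinks: "p" = True) -> os.stat_result: pass


# If the definition above is accepted, its structure can be inspected.
signature = inspect.signature(os_stat)
kinds = {name: p.kind.name for name, p in signature.parameters.items()}
print(kinds)
```

A typo in the declaration, such as a missing colon or an undefined converter
name, is reported by the interpreter as soon as the definition is pasted.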
Reference implementation
========================
A reference implementation is available at `issue 16612`_. Since this PEP
was written under time constraints and the author is unfamiliar with the
PLY toolchain, the software is written in Standard ML and utilizes the
ml-yacc/ml-lex toolchain.
The grammar is conflict-free and available in ml-yacc readable BNF form.
Two tools are available:
* *printsemant* reads a converter header and a .c file and dumps
  the semantically checked parse tree to stdout.

* *preprocess* reads a converter header and a .c file and dumps
  the preprocessed .c file to stdout.
Known deficiencies:
* The Python 'test' expression is not semantically checked. The syntax,
  however, is checked since it is part of the grammar.

* The lexer does not handle triple-quoted strings.

* C declarations are parsed in a primitive way. The final implementation
  should utilize 'declarator' and 'init-declarator' from the C grammar.

* The *preprocess* tool does not emit code for the left-and-right optional
  arguments case. The *printsemant* tool can deal with this case.

* Since the *preprocess* tool generates the output from the parse
  tree, the original indentation of the define block is lost.
Grammar
=======
TBD: The grammar exists in ml-yacc readable form, but should probably be
included here in EBNF notation.
Comparison with PEP 436
=======================
The author of this PEP has the following concerns about the DSL proposed
in PEP 436:
* The whitespace-sensitive, configuration-file-like syntax looks out
  of place in a C file.

* The structure of the function definition gets lost in the per-parameter
  specifications. Keywords like positional-only, required and keyword-only
  are scattered across too many different places.

  By contrast, in the alternative DSL the structure of the function
  definition can be understood at a single glance.

* The PEP 436 DSL has 14 documented flags and at least one undocumented
  (allow_fd) flag. Figuring out which of the 2**15 possible combinations
  are valid places an unnecessary burden on the user.

  Experience with the PEP 3118 buffer flags has shown that sorting out
  (and exhaustively testing!) valid combinations is an extremely tedious
  task. The PEP 3118 flags are still not well understood by many people.

  By contrast, the alternative DSL has a central file Include/converters.h
  that can be quickly searched for the desired converter. Many of the
  converters are already known, perhaps even memorized by people (due
  to frequent use).

* The PEP 436 DSL allows too much freedom. Types can apparently be omitted,
  the preprocessor accepts (and ignores) unknown keywords, and adding
  white space after a docstring sometimes results in an assertion error.

  The alternative DSL on the other hand allows no such freedoms. Omitting
  converter or return value annotations is plainly a syntax error. The
  LALR(1) grammar is unambiguous and specified for the complete translation
  unit.
Copyright
=========
This document is licensed under the `Open Publication License`_.
References and Footnotes
========================
.. _issue 16612: http://bugs.python.org/issue16612
.. _Open Publication License: http://www.opencontent.org/openpub/
.. _second iteration of the PEP 436 DSL:
   http://hg.python.org/peps/rev/a2fa10b2424b