Profiling and Optimizing Mypy
The performance of mypy is important, and we appreciate contributions that improve it, either overall or in specific use cases. Contributions that significantly regress performance may be rejected. As a rule of thumb, a new feature or fix that regresses performance by over 1% may be a cause for concern, especially if it has a relatively minor user benefit. Usually it's possible to alter the implementation to avoid a measurable regression, however.
Mypy is compiled using mypyc to C extension modules. This means that using the stdlib `cProfile` or `profile` module isn't going to be effective, unless you use a non-compiled mypy -- and in that case the results may not be indicative of performance when using a compiled mypy. For the same reason, it's important to use a compiled mypy when measuring performance improvements. (The output of `mypy --version` reports whether your mypy is compiled.) All of this is documented below.
Recent impactful optimizations often fall into one of these categories:
- Algorithmic optimizations: Drastically improve performance when there are complex types or expressions (e.g. large unions, overloads).
- Fast paths for common operations: Some computations can be skipped (or replaced with simpler ones) in the common case. For example, a type is trivially a subtype of another if the two types are identical (see the sketch after this list).
- Tweaking code to use faster operations when compiled: Mypyc speeds up most common operations, but some operations are much faster than others. For example, prefer a regular class over a named tuple in performance-critical code, since named tuples have fewer mypyc optimizations.
- Creating fewer objects: Allocating Python objects dynamically is pretty slow, and object allocation tends to consume a significant fraction of CPU when running mypy. Often some allocations can be avoided (e.g. by preallocating or sharing objects), or replaced with less expensive objects. Reducing memory use is also helpful.
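To make the fast-path idea concrete, here is a minimal sketch; the class and function names are hypothetical illustrations, not mypy's actual internal API:

```python
# Minimal sketch of a "fast path" micro-optimization. The names are
# hypothetical illustrations, not mypy's actual internal API.

class Type:
    """Stand-in for a mypy type object."""

def full_subtype_check(left: Type, right: Type) -> bool:
    # Placeholder for an expensive structural comparison.
    return type(left) is type(right)

def is_subtype(left: Type, right: Type) -> bool:
    # Fast path: a type is trivially a subtype of an identical type.
    # An "is" check is among the cheapest compiled operations (see the
    # table below), so the common case skips the full structural check.
    if left is right:
        return True
    return full_subtype_check(left, right)
```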
There are relatively few remaining bottlenecks where a single optimization can produce a major improvement across most mypy use cases (say 2% or more). That's okay -- typical individual optimizations either target specific use cases, or improve performance generally by a smaller factor, from 0.1% to 1.9% each. Many small (but measurable) improvements quickly add up to significant wins.
For all of these, it's important to first profile or measure the bottleneck. Optimizing often makes code harder to maintain, even if only slightly, so we want to focus on real bottlenecks only, and it's important to verify that optimizations actually have a measurable impact. Measuring impact is discussed in a later section.
Here is a summary of some common operations and their rough relative cost when compiled using mypyc (per operation/step). Many effective micro-optimizations replace expensive operations with less expensive ones in hot code paths.
| Operation | Relative cost |
|---|---|
| Construct Python (non-compiled) class instance | Very high |
| Call stdlib function implemented in Python | Very high |
| Construct a native (compiled) class instance | High |
| Construct a dict or set | High |
| Call function using `*args` or `**kwargs` | High |
| Construct a list or variable-length tuple | Medium |
| Define a nested function | Medium |
| Call a nested function | Medium |
| Call a function via `Callable` type (indirectly) | Medium |
| Iterate over a set | Medium |
| Import statement nested within function | Medium |
| Dict get item, or dict/set contains | Medium |
| String operations (except for equality, `len`) | Medium to high |
| Iterate over a list/variable-length tuple/dict | Low |
| String equality | Low |
| Call method of compiled class (simple) | Low |
| Call module-level compiled function (simple) | Low |
| Access an attribute of a compiled class | Low |
| Integer operations | Low |
| `is` operation | Low |
| `len(x)` for a built-in type | Low |
| Get single list/tuple item | Low |
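As a hedged sketch of the "creating fewer objects" idea, guided by the construction rows in the table, one might share a single immutable instance instead of allocating a new one per call (the names below are made up for illustration):

```python
# Minimal sketch of "creating fewer objects": share one instance of a
# frequently constructed object. Names are made up for illustration.

class NoneTyp:
    """Stand-in for a compiled mypy type class."""

# Constructing a native class instance is a "High" cost row in the
# table above, so allocate a single shared instance up front.
_NONE_TYPE = NoneTyp()

def make_none_type() -> NoneTyp:
    # Return the shared instance instead of allocating a new one.
    return _NONE_TYPE
```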
We have a daily job that measures the time needed to type check a fixed (older) version of mypy: mypy self check benchmark results. The level of noise in the results is about 0.5%, but improvements or regressions of 1% or more tend to be visible in the results.
If the above data hasn't been updated for some time but there have been recent commits to the mypy repository, feel free to create a mypy GitHub issue about the benchmark job being potentially down.
Use `misc/perf_compare.py` to measure the performance impact of a branch or commit. For example, if you have some optimization in branch `my-opt`, you can measure the performance like this:

```
python misc/perf_compare.py master my-opt
```
A change of 0.5% or less can be noise. You can try using `--num-runs 100` or similar to improve precision, but even this isn't fully reliable, since trivial changes to the codebase can cause a repeatable 0.5% deviation in results.
To measure smaller improvements, you have a few options:
- Rebase your branch on top of 3-4 master commits from the last week or two and measure each separately. If you see a small performance improvement consistently, it's likely that it's real (see the sketch after this list).
- Implement 2 or more optimizations in the same branch, until the overall performance impact is more than 0.5% (preferably 1.0% or more). Then you can create a PR with multiple optimizations, or create separate PRs but mention in each that you've measured the impact of a set of optimizations together.
- Use "trace logging" (discussed below) to measure the number of certain operations/events precisely. This doesn't help with all kinds of optimizations, but for certain micro-optimizations this can reliably show even quite small improvements in a convincing way.
- Profile mypy using your commit and master, and focus on the % CPU used in the function you are optimizing. It's possible that overall runtime hasn't changed much, but the specific function you targeted shows a significant improvement.
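As a rough sketch of the first option, the repeated measurements could be scripted along these lines; the rebased branch names are hypothetical, and `misc/perf_compare.py` is invoked exactly as shown earlier:

```python
# Sketch: measure the same optimization rebased onto several recent
# master commits. The branch names here are hypothetical.
import subprocess

# (baseline commit, optimization branch rebased onto that commit)
PAIRS = [
    ("master~15", "my-opt-rebased-15"),
    ("master~10", "my-opt-rebased-10"),
    ("master~5", "my-opt-rebased-5"),
]

for base, branch in PAIRS:
    # misc/perf_compare.py benchmarks both revisions and reports the delta.
    subprocess.run(["python", "misc/perf_compare.py", base, branch], check=True)
```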
The script `misc/profile_check.py` in the mypy repository compiles mypy using mypyc and profiles a mypy run using the `perf` tool (Linux only). Look at the script for more documentation. The profile only contains C functions, so Python names are mangled to C names; usually it's clear what they refer to, however. The generated profiles are generally of high quality. Using py-spy (see below) may be easier, but the profiles are of lower quality.
Note: For this to be effective, you need to first compile CPython yourself using specific C compiler flags.
You can compile mypy in a mode where it produces an event trace log of what is executed. You can identify some slow events/operations and optimize them into faster operations, or otherwise avoid them. See Mypyc Trace Logging for more information.
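For example, here's a hedged sketch of comparing event counts between two trace logs; it assumes a simple one-event-per-line format, which may not match the actual log format (check the linked page):

```python
# Sketch: diff event counts between two mypyc trace logs. Assumes a
# simple format where each line starts with an event name; the real
# format may differ (see the Mypyc Trace Logging page).
from collections import Counter

def load_counts(path: str) -> Counter[str]:
    with open(path) as f:
        return Counter(line.split()[0] for line in f if line.strip())

before = load_counts("trace-master.log")  # hypothetical file names
after = load_counts("trace-my-opt.log")

# Report events whose counts changed, largest change first.
for event in sorted(before | after, key=lambda e: abs(after[e] - before[e]), reverse=True):
    delta = after[event] - before[event]
    if delta:
        print(f"{event}: {before[event]} -> {after[event]} ({delta:+d})")
```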
Note: As of July 2025, running py-spy on Python 3.12 and later may not work reliably. Python 3.11 can be used as a workaround. You can also try `perf` on Linux (discussed above).
py-spy is a profiling tool that works with compiled mypy (at least on Linux). Use it like this to profile mypy (replace `-c 'import os'` with your command-line arguments):
```
$ pip install py-spy
$ pip install mypy
$ py-spy record --native -f speedscope -o profile.dat -- mypy -c 'import os'
```
Now open https://www.speedscope.app/, click Browse, and import the `profile.dat` file you generated above. You can click 'Sandwich' to get a flat profile.
If the mypy run is relatively quick (less than a few seconds), consider using `-r 500` with py-spy to increase the sampling rate (but high sample rates may not work reliably).
Note: To get repeatable results, disable incremental mode by using `mypy --no-incremental`, or delete the `.mypy_cache` directory before each run.
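Putting the two together, here is a small sketch of a repeatable profiling run; the profiled command and paths are illustrative:

```python
# Sketch: clear the incremental cache, then profile a mypy run with
# py-spy. The profiled command and paths are illustrative.
import shutil
import subprocess

# Remove any incremental cache so every run does the same work.
shutil.rmtree(".mypy_cache", ignore_errors=True)

subprocess.run(
    ["py-spy", "record", "--native", "-f", "speedscope", "-o", "profile.dat",
     "--", "mypy", "--no-incremental", "-c", "import os"],
    check=True,
)
```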
If you are reporting a mypy performance issue or regression, feel free to add a link to the collected `profile.dat` (after you've verified with speedscope that it contains useful information) so that mypy developers can also analyze the profile.