Profiling and Optimizing Mypy
The performance of mypy is important, and we appreciate contributions that improve it, either overall or in specific use cases. Contributions that significantly regress performance may be rejected. As a rule of thumb, a new feature or fix that regresses performance by over 1% may be a cause for concern, especially if it has a relatively minor user benefit. Usually it's possible to alter the implementation to avoid a measurable regression, however.
Mypy is compiled using mypyc to C extension modules. This means that using the stdlib `cProfile` or `profile` module isn't going to be effective, unless you use a non-compiled mypy -- and in that case the results may not be indicative of performance when using a compiled mypy. For the same reason, it's important to use a compiled mypy when measuring performance improvements. (The output of `mypy --version` reports whether your mypy is compiled.) All of this is documented below.
Recent impactful optimizations often fall into one of these categories:
- Algorithmic optimizations: Drastically improve performance when there are complex types or expressions (e.g. large unions, overloads).
- Fast paths for common operations: Some computations can be skipped (or replaced with simpler ones) in the common case. For example, a type is trivially a subtype of another if the two types are identical (see the sketch after this list).
- Tweaking code to use faster operations when compiled: Mypyc speeds up most common operations, but some operations are much faster than others. For example, prefer a regular class over a named tuple in performance-critical code, since named tuples have fewer mypyc optimizations.
- Creating fewer objects: Allocating Python objects dynamically is pretty slow, and object allocation tends to consume a significant fraction of CPU when running mypy. Often some allocations can be avoided (e.g. by preallocating or sharing objects), or replaced with less expensive objects. Reducing memory use is also helpful.
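To make the fast-path idea concrete, here is a minimal sketch; the class and function names are hypothetical illustrations, not mypy's actual internal API:

```python
# Minimal sketch of a "fast path" micro-optimization. The names are
# hypothetical illustrations, not mypy's actual internal API.

class Type:
    """Stand-in for a mypy type object."""

def full_subtype_check(left: Type, right: Type) -> bool:
    # Placeholder for an expensive structural comparison.
    return type(left) is type(right)

def is_subtype(left: Type, right: Type) -> bool:
    # Fast path: a type is trivially a subtype of an identical type.
    # An "is" check is among the cheapest compiled operations (see the
    # table below), so the common case skips the full structural check.
    if left is right:
        return True
    return full_subtype_check(left, right)
```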
There are relatively few remaining bottlenecks where a single optimization can produce a major improvement across most mypy use cases (say 2% or more). That's okay -- typical individual optimizations either target specific use cases, or improve performance generally by a smaller factor, from 0.1% to 1.9% each. Many small (but measurable) improvements quickly add up to significant wins.
For all of these, it's important to first profile or measure the bottleneck. Optimizing often makes code harder to maintain, even if only slightly, so we want to focus on real bottlenecks only, and it's important to verify that optimizations actually have a measurable impact. Measuring impact is discussed in a later section.
Here is a summary of some common operations and their rough relative cost when compiled using mypyc (per operation/step). Many effective micro-optimizations replace expensive operations with less expensive ones in hot code paths.
| Operation | Relative cost |
|---|---|
| Construct Python (non-compiled) class instance | Very high |
| Call stdlib function implemented in Python | Very high |
| Construct a native (compiled) class instance | High |
| Construct a dict or set | High |
| Call function using `*args` or `**kwargs` | High |
| Construct a list or variable-length tuple | Medium |
| Define a nested function | Medium |
| Call a nested function | Medium |
| Call a function via `Callable` type (indirectly) | Medium |
| Iterate over a set | Medium |
| Import statement nested within function | Medium |
| Dict get item, or dict/set contains | Medium |
| String operations (except for equality, `len`) | Medium to high |
| Iterate over a list/variable-length tuple/dict | Low |
| String equality | Low |
| Call method of compiled class (simple) | Low |
| Call module-level compiled function (simple) | Low |
| Access an attribute of a compiled class | Low |
| Integer operations | Low |
| `is` operation | Low |
| `len(x)` for a built-in type | Low |
| Get single list/tuple item | Low |
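As a hedged sketch of the "creating fewer objects" idea, guided by the construction rows in the table, one might share a single immutable instance instead of allocating a new one per call (the names below are made up for illustration):

```python
# Minimal sketch of "creating fewer objects": share one instance of a
# frequently constructed object. Names are made up for illustration.

class NoneTyp:
    """Stand-in for a compiled mypy type class."""

# Constructing a native class instance is a "High" cost row in the
# table above, so allocate a single shared instance up front.
_NONE_TYPE = NoneTyp()

def make_none_type() -> NoneTyp:
    # Return the shared instance instead of allocating a new one.
    return _NONE_TYPE
```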
We have a daily job that measures the time needed to type check a fixed (older) version of mypy: mypy self check benchmark results. The level of noise in the results is about 0.5%, but improvements or regressions of 1% or more tend to be visible in the results.
If the above data hasn't been updated for some time but there have been recent commits to the mypy repository, feel free to create a mypy GitHub issue about the benchmark job being potentially down.
Use `misc/perf_compare.py` to measure the performance impact of a branch or commit. For example, if you have some optimization in branch `my-opt`, you can measure the performance like this:

```
python misc/perf_compare.py master my-opt
```
A change of 0.5% or less can be noise. You can try using `--num-runs 100` or similar to improve precision, but even this isn't fully reliable, since trivial changes to the codebase can cause a repeatable 0.5% deviation in results.
To measure smaller improvements, you have a few options:
- Rebase your branch on top of 3-4 master commits from the last week or two and measure each separately. If you see a small performance improvement consistently, it's likely that it's real (see the sketch after this list).
- Implement 2 or more optimizations in the same branch, until the overall performance impact is more than 0.5% (preferably 1.0% or more). Then you can create a PR with multiple optimizations, or create separate PRs but mention in each that you've measured the impact of a set of optimizations together.
- Use "trace logging" (discussed below) to measure the number of certain operations/events precisely. This doesn't help with all kinds of optimizations, but for certain micro-optimizations this can reliably show even quite small improvements in a convincing way.
- Profile mypy using your commit and master, and focus on the % CPU used in the function you are optimizing. It's possible that overall runtime hasn't changed much, but the specific function you targeted shows a significant improvement.
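As a rough sketch of the first option, the repeated measurements could be scripted along these lines; the rebased branch names are hypothetical, and `misc/perf_compare.py` is invoked exactly as shown earlier:

```python
# Sketch: measure the same optimization rebased onto several recent
# master commits. The branch names here are hypothetical.
import subprocess

# (baseline commit, optimization branch rebased onto that commit)
PAIRS = [
    ("master~15", "my-opt-rebased-15"),
    ("master~10", "my-opt-rebased-10"),
    ("master~5", "my-opt-rebased-5"),
]

for base, branch in PAIRS:
    # misc/perf_compare.py benchmarks both revisions and reports the delta.
    subprocess.run(["python", "misc/perf_compare.py", base, branch], check=True)
```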
The script `misc/profile_check.py` in the mypy repository compiles mypy using mypyc and profiles a mypy run using the `perf` tool (Linux only). Look at the script for more documentation. The profile only contains C functions, so Python names are mangled to C names; usually it's clear what they refer to, however. The generated profiles are generally of high quality. Using py-spy (see below) may be easier, but the profiles are of lower quality.
Note: For this to be effective, you need to first compile CPython yourself using specific C compiler flags.
You can compile mypy in a mode where it produces an event trace log of what is executed. You can identify some slow events/operations and optimize them into faster operations, or otherwise avoid them. See Mypyc Trace Logging for more information.
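For example, here's a hedged sketch of comparing event counts between two trace logs; it assumes a simple one-event-per-line format, which may not match the actual log format (check the linked page):

```python
# Sketch: diff event counts between two mypyc trace logs. Assumes a
# simple format where each line starts with an event name; the real
# format may differ (see the Mypyc Trace Logging page).
from collections import Counter

def load_counts(path: str) -> Counter[str]:
    with open(path) as f:
        return Counter(line.split()[0] for line in f if line.strip())

before = load_counts("trace-master.log")  # hypothetical file names
after = load_counts("trace-my-opt.log")

# Report events whose counts changed, largest change first.
for event in sorted(before | after, key=lambda e: abs(after[e] - before[e]), reverse=True):
    delta = after[event] - before[event]
    if delta:
        print(f"{event}: {before[event]} -> {after[event]} ({delta:+d})")
```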
Note: As of July 2025, running py-spy on Python 3.12 and later may not work reliably. Python 3.11 can be used as a workaround. You can also try `perf` on Linux (discussed above).
py-spy is a profiling tool that works with compiled mypy (at least on Linux). Use it like this to profile mypy (replace `-c 'import os'` with your command-line arguments):
```
$ pip install py-spy
$ pip install mypy
$ py-spy record --native -f speedscope -o profile.dat -- mypy -c 'import os'
```
Now open https://www.speedscope.app/, click Browse, and import the `profile.dat` file you generated above. You can click 'Sandwich' to get a flat profile.
If the mypy run is relatively quick (less than a few seconds), consider using `-r 500` with py-spy to increase the sampling rate (but high sample rates may not work reliably).
Note: To get repeatable results, disable incremental mode by using `mypy --no-incremental`, or delete the `.mypy_cache` directory before each run.
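Putting the two together, here is a small sketch of a repeatable profiling run; the profiled command and paths are illustrative:

```python
# Sketch: clear the incremental cache, then profile a mypy run with
# py-spy. The profiled command and paths are illustrative.
import shutil
import subprocess

# Remove any incremental cache so every run does the same work.
shutil.rmtree(".mypy_cache", ignore_errors=True)

subprocess.run(
    ["py-spy", "record", "--native", "-f", "speedscope", "-o", "profile.dat",
     "--", "mypy", "--no-incremental", "-c", "import os"],
    check=True,
)
```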
If you are reporting a mypy performance issue or regression, feel free to add a link to the collected `profile.dat` (after you've verified with speedscope that it contains useful information) so that mypy developers can also analyze the profile.