Investigation: Group benchmarks based on their profiling "characterization" #664
Comments
Drilling down on the individual benchmarks that are "heavily interpreter", these are the ones that get a lot slower with the JIT: unpack_sequence, hexiom, go, raytrace, chaos. Maybe looking at these more closely (with the pystats, perhaps) will reveal ways to improve the JIT.
Did those get slower with the JIT, or Tier 2 in general?
This is all with the JIT vs. Tier 1. Here are those same top 5 slowdowns (on JIT) for Tier 2 and JIT:
So that's interesting. For completeness, I did a full comparison of JIT vs. Tier 2 (not something we usually generate), and the only interpreter-heavy benchmark where the JIT actually makes things slower than Tier 2 is …
Another interesting observation: the …
We might also want to shrink the size of the abstract interpreter -- probably 1024 for the max size and 3× the trace length for the arena are more than sufficient for everything.
That's not really going to make it faster though, right? It saves some C stack space, but these buffers aren't all that big (by modern standards -- 30 years ago I'd be horrified :-) and we don't ever iterate over the entire buffer. (Or do we? In which case I'd take it back.)
Optimization attempts that fail because the trace is too short don't take up much time though -- they only process a few bytecodes and skip all of the expensive work (peephole, abstract interpreter, executor construction). Is the 99.5% rejected across all benchmarks or specific to those two? I'd almost think that if a benchmark spends a lot of time in …
The 99.5% rejected for being too short is across all benchmarks, but individually 99.8% of …

I'll see if I can get some more detail out of the profiler to see if there's anything in particular in …
Did these numbers change recently? I wonder if side-exits (a relatively new feature) might be involved. Maybe we should count optimization attempts from side-exits separately from those initiated by …
I had a talk with @brandtbucher and we thought that it might be possible that it's making stack-spilled variables extremely slow to access. Imagine having …

Also, Brandt mentioned that maybe the small helper functions we have in the uops symbols are not being inlined, because they are in a separate compilation unit, so they won't automatically be inlined unless LTO is on (and even then, will the compiler always do it?). Maybe we want to mark them static inline and put them in a header file to hint to the C compiler?
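A minimal sketch of that idea, with made-up file and symbol names rather than the actual CPython ones: putting a tiny helper in a header as static inline makes its body visible in every translation unit that includes it, so the compiler can inline the call without relying on LTO.

```c
/* optimizer_sym_inline.h -- hypothetical header; the struct and helper names
 * are illustrative, not the real CPython uops-symbol code. */
#ifndef OPTIMIZER_SYM_INLINE_H
#define OPTIMIZER_SYM_INLINE_H

#include <stdbool.h>

typedef struct _sym {
    int flags;  /* placeholder for the real symbol state */
} _sym;

/* Because the definition lives in the header, every caller sees the body
 * and the compiler can inline this cheap flag test at each call site. */
static inline bool
sym_has_no_info(const _sym *sym)
{
    return sym->flags == 0;
}

#endif /* OPTIMIZER_SYM_INLINE_H */
```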
That seems worth the effort to at least try.
The recent bug we had in the stats was that the side exit optimization attempts were not being counted (but all of the various error or success cases were). From that, we already know that the side exit stuff creates orders of magnitude more optimization attempts (1,500 traces vs. 1,267,520 side exits). As you say, that's all fine if most of them are short and bail out quickly, but it is a lot of work that ends up being thrown away; it would be nice if there were a faster way to know that we don't need to do it (I'm not sure there is...).
I looked at hexiom in detail with uops debugging on, and it looks like the optimizer keeps creating executors for line 274:

```python
possible = sum((1 if v in cell else 0) for cell in done.cells)
```

That's a generator expression called via C (the `sum()` call). The message I see repeatedly is …

Looking at the disassembly incorporated into the executor, I think it is this:

…
This is not the top of a loop, so it must be a side exit. Indeed, some more context shows:

```python
for v in range(8):
    # If there is none, remove the possibility from all tiles
    if (pos.tiles[v] > 0) and (left[v] == 0):
        if done.remove_unfixed(v):
            changed = True
        else:
            possible = sum((1 if v in cell else 0) for cell in done.cells)  # <----------- HERE
```

After turning on the right debug variable (…), …
Maybe @Fidget-Spinner can diagnose this part further?

There's more though. We should fail more loudly in this case --

```c
int32_t new_temp = -1 * tstate->interp->optimizer_side_threshold;
exit->temperature = (new_temp < INT16_MIN) ? INT16_MIN : new_temp;
```

Maybe the temperature calculation is wrong? @markshannon There definitely seems to be something that needs to be tuned better in case the abstract interpreter bails.
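A minimal sketch of what "failing more loudly" could look like here, assuming a hypothetical stats struct and debug level rather than the actual CPython counters and macros: record the bail in a dedicated side-exit counter and emit a debug message when a verbose level is enabled.

```c
/* Hypothetical sketch; none of these names are real CPython APIs. */
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint64_t side_exit_attempts;  /* optimizations triggered from side exits */
    uint64_t side_exit_bails;     /* of those, how many bailed out */
} side_exit_stats;

static void
note_side_exit_bail(side_exit_stats *stats, int debug_level, int target_opcode)
{
    stats->side_exit_bails++;
    if (debug_level >= 2) {
        fprintf(stderr,
                "optimizer: bailed while optimizing from side exit "
                "(opcode=%d, %llu bails / %llu attempts)\n",
                target_opcode,
                (unsigned long long)stats->side_exit_bails,
                (unsigned long long)stats->side_exit_attempts);
    }
}
```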
PS. The "Created a trace for ..." debug message does not indicate that an executor was successfully created. It only means that the initial step of that process, trace projection, succeeded with a not-too-short trace. We then do various other things which may fail (-1) or bail (0) before constructing the final executor. We should probably add level 1 or level 2 debug messages for the various possible bail points. (And these should get the level from ….)
The only real reason …

In general, function versions are shared across code objects, with some checks. However, I don't know if that works for generator expressions. It might be that a brand new function version is used every single time, leading it to deopt out of there. We should just tell tier 2 to never optimize that trace.
There were some stats about the new optimizer that weren't added to the output tables. Once I did that and ran it locally, I see that out of 2,674,828 optimizer attempts, only 92,441 (3.5%) are successful. I don't know if that's just expected at this point. The …
Some more lab notes: for a smaller demo, one of the issues is that we can't translate … In this small reproducer:

```python
def testfunc():
    for i in range(20):
        sum(i for i in range(20))
```

the following seems to happen: …
It seems that we should implement some kind of exponential back-off to avoid such scenarios. This is not actually unique to generator expressions -- I can reproduce it with a generator function too:

```python
def generator():
    for i in range(20):
        yield i

def testfunc():
    for i in range(20):
        x = sum(generator())
    return x
```

Are there really only 5 bits for the temperature? Or do we have a chance to implement exponential backoff here? Another approach might be that if we know for sure that a trace can't be produced for a given side exit (like in this case), we generate a different exit instruction that uses …
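A rough sketch of the back-off idea, using made-up field names rather than the real side-exit layout (the width of the actual temperature counter is whatever CPython defines): each failed attempt doubles the wait before the next one, so an exit that keeps failing to produce a trace quickly stops costing optimization work.

```c
/* Hypothetical sketch of exponential back-off for a side exit that keeps
 * failing to produce a trace; the struct and fields are illustrative. */
#include <stdint.h>

typedef struct {
    int16_t temperature;    /* executions remaining before the next attempt */
    uint8_t backoff_shift;  /* number of failed attempts so far */
} side_exit_sketch;

static void
reset_after_failed_attempt(side_exit_sketch *se)
{
    /* Double the wait each time, saturating so 1 << shift fits in int16_t. */
    if (se->backoff_shift < 14) {
        se->backoff_shift++;
    }
    se->temperature = (int16_t)(1 << se->backoff_shift);
}
```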
The other thing I learned is that, indeed, somehow the generator expression's function object isn't in the function version cache. This seems to happen more for generator expressions, but I can reproduce it with nested generator functions too. It seems that frequently calling …
That's definitely not expected. I am on the trail of several issues (see my messages above).
Oh. The difference between generator expressions and generator functions is that a generator function (the way I wrote the code, anyway) is stored in a local, and thus has a lifetime that overlaps with the previous incarnation of the same function. And the reason this matters is that when a function object is deallocated, it is evicted from the version cache.

When using a generator function, the sequence of bytecode is …, and the function object thus created stays alive in the fast locals until the next iteration. Here, … When projecting the trace, looking up the version number in the cache will then produce the correct function object, and all is well. (Except... At some later point it is also evicted. I don't understand that yet.)

OTOH, for a generator expression, the code sequence is …, i.e., the freshly created function object isn't stored in a local variable, just on the stack. When that stack entry is eventually popped, the function object is decrefed, which evicts it from the version cache, successfully. When projecting the trace, we therefore find nothing in the cache!

I'm not sure what to do about this. We may have to talk about it in person Monday. Could we perhaps make the version cache store code objects instead of function objects? The trace projection process ultimately needs a code object anyway. (But I understand there's a subtle issue of changes to the function object's ….)

Separately, there's the mystery that after some number of successful iterations, the variant using a generator function also ends up finding nothing in the cache. I haven't looked into this yet -- I thought the issue with generator expressions was more pressing. (EDIT: See next comment; this is explainable.)

In any case I believe that the things I've found so far (generator expressions, and the lack of exponential backoff for side exits) explain the problem with the "hexiom" benchmark spending a lot of time in …
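A hypothetical way to see the difference (not taken from the issue; the exact opcodes vary by CPython version) is to disassemble two variants side by side -- the generator function is bound to a name and stays alive between iterations, while the generator expression's function object only ever lives on the evaluation stack of the `sum()` call:

```python
# Illustrative only: the names here are made up for the comparison, and the
# bytecode printed by dis differs across CPython versions.
import dis

def with_generator_function():
    def generator():
        for i in range(20):
            yield i
    for i in range(20):
        x = sum(generator())  # the function object survives in a fast local
    return x

def with_generator_expression():
    for i in range(20):
        x = sum(j for j in range(20))  # the function object lives only on the stack
    return x

dis.dis(with_generator_function)    # MAKE_FUNCTION is followed by a store to a local
dis.dis(with_generator_expression)  # the freshly made function is consumed from the stack
```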
It looks like the issue with generator functions finding nothing in the cache is separate -- it's a loop in the module being run through the Tier 2 optimizer, tracing into …
@mdboom, this issue is yours again -- I have created various cpython issues to deal with the things I found.
By "characterization", I mean what category of functions dominate the runtime of each benchmark.
If we organize them by the top category in each benchmark, we get the following:
Benchmark by top profiling category
If you refine this to only include a benchmark in a category if that category represents more than 50% of the runtime:
Benchmarks that are heavily (over 50%) in a particular category
Interestingly, this doesn't seem to reveal too much related to profiling. (Admittedly, the only category where we would expect significant change is "interpreter"). The following results are for JIT (main) vs. Tier 1 (same commit), HPT at the 99th percentile:
Using only benchmarks where 50% of time is in a single category:
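As a rough sketch of the grouping described above (the data layout is assumed, not the actual pyperformance/profiling tooling, and the numbers are made up), bucketing each benchmark under its top category -- optionally requiring that category to cover more than half the runtime -- could look like this:

```python
# Hypothetical sketch: profiles maps benchmark -> {category: fraction of runtime}.
from collections import defaultdict

def group_by_top_category(profiles, min_share=0.0):
    """Bucket each benchmark under its dominant profiling category."""
    groups = defaultdict(list)
    for benchmark, categories in profiles.items():
        top_category, share = max(categories.items(), key=lambda kv: kv[1])
        if share > min_share:
            groups[top_category].append(benchmark)
    return dict(groups)

# Made-up example data, just to show the shape of the output.
profiles = {
    "hexiom": {"interpreter": 0.62, "gc": 0.10, "memory": 0.28},
    "raytrace": {"interpreter": 0.55, "math": 0.45},
    "json_loads": {"library": 0.70, "interpreter": 0.30},
}
print(group_by_top_category(profiles))                 # top category per benchmark
print(group_by_top_category(profiles, min_share=0.5))  # only the "heavy" ones
```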