[WIP] Benchmarks POC #381
base: main
Conversation
line_profiler/_line_profiler.pyx
- Added macro `hash_bytecode()` for hashing bytestrings (doesn't seem to have performance benefits, so it's currently unused; TODO: more tests)
- Replaced repeated retrieval of `prof._c_last_time[ident]` with a stored reference to it (illustrated in the sketch below)
Further optimizations
I *really* hope this is not why the Windows CI is failing lol
This reverts commit c5f95e2.
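For reference, here is a minimal pure-Python sketch of the "stored reference" change described in the commit message above. The actual change lives in Cython in `line_profiler/_line_profiler.pyx`; the function and argument names here (`record_line_event`, `code_key`, `timestamp`) are hypothetical and only illustrate the idea of fetching the nested mapping once instead of re-indexing it on every access.

```python
# Illustrative only: the real hot path is Cython in _line_profiler.pyx.
def record_line_event(prof, ident, code_key, timestamp):
    # Fetch the per-thread mapping once instead of repeating
    # ``prof._c_last_time[ident]`` for every read and write below.
    last_time = prof._c_last_time[ident]
    elapsed = None
    if code_key in last_time:
        elapsed = timestamp - last_time[code_key]
    last_time[code_key] = timestamp
    return elapsed
```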
That looks nice! One suggestion I have would be to make use of pytest-benchmark and its pytest-benchmark[histogram] addon (it doesn't make those nice plots by default, but that can be implemented on top of its data). This is what I was planning to do for regression tests, and I have an example script making use of it at jsonpickle's repository and run instructions here. Sample output is included in the attached pictures. Also, I don't think this took into account my most recent commit fixing your GPT review suggestions 2 and 4 over on #376.
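To make the suggestion concrete, here is a minimal sketch of what such a pytest-benchmark regression test could look like. The file name and the `workload` function are made up for illustration; the histogram output assumes the pytest-benchmark[histogram] extra is installed and is requested with the `--benchmark-histogram` option.

```python
# test_overhead_benchmark.py -- hypothetical example, not part of this PR.
# Run with e.g.:  pytest test_overhead_benchmark.py --benchmark-histogram
from line_profiler import LineProfiler


def workload(n=10_000):
    # A simple CPU-bound loop to give the tracer something to trace.
    total = 0
    for i in range(n):
        total += i * i
    return total


def test_baseline(benchmark):
    # Timing without any profiling attached.
    benchmark(workload)


def test_with_line_profiler(benchmark):
    # Timing with the function wrapped by LineProfiler; the difference
    # between the two benchmarks approximates the tracing overhead.
    profiler = LineProfiler()
    wrapped = profiler(workload)
    benchmark(wrapped)
```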
Adding onto #376 to start measuring the amount of overhead line-profiler adds as a function of Python version and line-profiler version (and other factors).
The idea is to have a script that can generate a report for a specific style of benchmark; multiple of these result files can then be aggregated to view statistics over the different contexts in which the benchmarks are run. This way we can slice and dice the numbers to gain more insight.
This is VERY rough right now (hard-coded paths and whatnot), but I have the initial proof of concept, which shows the impact of the changes in #376.
These also rely on some of my utility libraries: scriptconfig / kwutil / ubelt.
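To sketch the kind of record a single benchmark run could produce, here is a rough, self-contained example; the schema, repetition counts, and `workload` function are placeholders and not what the actual POC script (which uses scriptconfig / kwutil / ubelt) does.

```python
# Hypothetical sketch of one benchmark "context" record.
import json
import sys
import timeit

import line_profiler
from line_profiler import LineProfiler


def workload(n=10_000):
    # A simple CPU-bound loop standing in for real benchmark cases.
    total = 0
    for i in range(n):
        total += i * i
    return total


def measure(repeat=5, number=100):
    # Time the bare function, then the same function wrapped by LineProfiler,
    # and report the ratio as the tracing overhead for this context.
    baseline = min(timeit.repeat(workload, repeat=repeat, number=number))
    profiler = LineProfiler()
    wrapped = profiler(workload)
    profiled = min(timeit.repeat(wrapped, repeat=repeat, number=number))
    return {
        "python_version": sys.version.split()[0],
        "line_profiler_version": line_profiler.__version__,
        "baseline_s": baseline,
        "profiled_s": profiled,
        "overhead_ratio": profiled / baseline,
    }


if __name__ == "__main__":
    # Each run emits one record; many such records (across Python and
    # line_profiler versions) can later be aggregated and sliced.
    print(json.dumps(measure(), indent=2))
```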
Plot aggregating all versions of Python the tests are run against:
Plot split out by Python version:
It looks like sys.monitoring adds quite a bit more overhead than the legacy way of handling the trace callbacks, but in all cases #376 does seem to be a speed improvement.
The jump from 4.0 to 4.1 is stark; I'm wondering what's happening there, or maybe there is a mistake in my benchmark script. I do think the code I wrote here can serve as a decent launch point for an automated way of measuring how we are doing on overhead and whether new patches are causing significant regressions.