-
Notifications
You must be signed in to change notification settings - Fork 2.1k
core/clist_sort: rewrite #21406
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
core/clist_sort: rewrite #21406
Conversation
This adds a benchmark for clist_sort() and calls that on three different list sizes.
Add a few cycles before starting the stopwatch for fancy CPUs that do have a cache and branch predictor that may warm up (e.g. Cortex M7).
Co-authored-by: Kaspar Schleiser <[email protected]>
- Iterate of different lengths of unsorted data to also cover corner case one-item list - Actually check that the list is sorted afterwards (and not just that the maximum got sorted in last) - Actually check that the list was stable-sorted (as the API doc promises a stable sort) - Increase the length of the input data for better test coverage - Check that no elements get lost while sorting
Co-authored-by: crasbe <[email protected]>
This should fix the column length warning of the static tests.
Benchmarks (all SAME54-XPRO)
|
Nice! Somehow on the nrf52840dk, it seems this implementation gets slower a bit quicker with larger inputs. like, master sorts 16k nodes 150000us whereas this pr needs 500000us per call ("almost sorted"). Any idea? |
That is expected. The merge sort is best tuned, when merging only equally sized lists. With lists this either requires a lot of traversal or some caching of where the lists heads are. This PR uses the caching, which results in better speed when the size of the fixed cache is good, but worse speed, when it is too small. Tuning |
This adds a new merge sort implementation that strives to be better documented than the previous one and to compile with `-fanalyzer` without GCC barking.
6c31a8a
to
e5530c9
Compare
Contribution description
This rewrites
clist_sort()
with a different implementation of a merge sort. The goal is that the code is more readable and passes-fanalyzer
without GCC barking, but it turns out to be a bit faster as well.Testing procedure
This currently includes two other PRs for better testing, namely:
(This PR needs to wait for those to be merged first and rebased then.)
Issues/PRs references
More complex but faster alternative to #21398