Need help understanding and running likwid on a dual socket system; failure to detect / bench inter-socket link AKA UPI #664
LIKWID does not offer the UPI group for Intel SapphireRapids. The system used in the tutorial page is an Intel Haswell EP system. Counters, events, and therefore performance groups are architecture-specific, so the group isn't hidden just for you ;) I checked on two of our SPR nodes:
I committed a UPI group for SPR to the master branch. You can reinstall LIKWID from the master branch, or just download the file and put it into
This is without the MarkerAPI. If you want to use the MarkerAPI and NUMA balancing is active, you will probably not see the UPI traffic: the Linux kernel would already move the data during the warmup phase, so there is no UPI traffic left while the benchmark runs. I hope this clarifies it for you. If you have further questions, feel free to ask.
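Since the NUMA balancing state decides whether UPI traffic shows up at all, it can help to check it before measuring. The following is a minimal sketch, not part of LIKWID: it assumes a Linux system that exposes the setting at `/proc/sys/kernel/numa_balancing`, and the helper name `numa_balancing_enabled` is made up for illustration.

```python
from pathlib import Path

# Assumption: standard Linux procfs location of the automatic NUMA balancing switch.
NUMA_BALANCING_PATH = Path("/proc/sys/kernel/numa_balancing")


def numa_balancing_enabled(path: Path = NUMA_BALANCING_PATH):
    """Return True/False for the kernel setting, or None if it cannot be read.

    Hypothetical helper for illustration; any non-zero value counts as enabled.
    """
    try:
        raw = path.read_text().strip()
    except OSError:
        # File missing or unreadable (e.g. non-Linux system, or a kernel
        # built without automatic NUMA balancing).
        return None
    try:
        return int(raw) != 0
    except ValueError:
        # Unexpected file content.
        return None


if __name__ == "__main__":
    state = numa_balancing_enabled()
    if state is None:
        print("NUMA balancing state unknown (setting not readable)")
    else:
        print("NUMA balancing is", "enabled" if state else "disabled")
```

If the setting reports enabled, it can usually be switched off for a measurement by writing `0` to the same file as root (the `kernel.numa_balancing` sysctl).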
@TomTheBear thanks so much for the comment and explanation which all makes sense to me! :-) However, when trying out the commands then I'm still getting unexpected results. Here I disable NUMA balancing and re-run the command lines:
But the
Then I noticed that in your commands above, there is no
Now I do see a difference with
Questions:
Thanks for your help so far! :-)
I'm trying to use likwid to better understand and detect the "inter-socket link" AKA UPI on my system, which is a:
But so far nothing has worked as expected:
I built and installed likwid like this:
When I ask it to "print available performance groups for current processor":
This is where things get confusing already because I'm expecting it to show something like this [1], but above UPI is missing:
Why is UPI missing for me? And how to get it to show up?
Also, [1] shows commands to test the performance impact of UPI, e.g.:
And for its own example above, [1] says "In the above example, we can see that the bandwidth is dropped from ~100 GB/s to ~41.9 GB/s. This is almost a 2.5x performance difference."
However, when I run the above commands on my dual socket system then curiously there appears to be little "MByte/s" difference:
How can there be so little difference assuming the UPI should / must be working hard to synchronize the memory between the 2 sockets? Or how to modify these commands to make them work as expected?
Thanks!
[1] https://pramodkumbhar.com/2020/03/architectural-optimisations-using-likwid-profiler/
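The slowdown quoted from [1] can be sanity-checked with simple arithmetic. This is only a sketch: the function name `slowdown` is hypothetical, and the two bandwidth values are the approximate figures quoted from [1], not measurements from the system in this issue.

```python
def slowdown(local_gb_per_s: float, remote_gb_per_s: float) -> float:
    """Ratio of local-socket bandwidth to cross-socket bandwidth."""
    if remote_gb_per_s <= 0:
        raise ValueError("remote bandwidth must be positive")
    return local_gb_per_s / remote_gb_per_s


if __name__ == "__main__":
    # Approximate figures quoted from [1]: ~100 GB/s local, ~41.9 GB/s remote.
    print(f"{slowdown(100.0, 41.9):.2f}x")  # prints "2.39x"
```

Applying the same ratio to the "MByte/s" values of a local and a cross-socket run gives a quick number for comparing a system against the example in [1].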