[BUG] Memrate graph generation issue #58

Closed
ezekriSCW opened this issue Nov 8, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@ezekriSCW

Describe the bug
During memrate graph generation on an HPE server, an error stops the process before it finishes, so not all graphs are generated.
Note that only one 32G DIMM is present in this server.

To Reproduce
Steps to reproduce the behavior (supposing that's due to single DIMM presence)

  1. Run hwbench with this specific command line (on a server with a single 32G DIMM)
    uv run hwbench -j configs/simple.conf -m monitoring.cfg
  2. Run hwgraph with this specific command line
    uv run hwgraph graph --traces hwbench-out-20241107131337/results.json:DLxxx:BMC.Server --outdir DLxxx_graph
  3. hwgraph crashes with the error below
    Fatal: DLxxx/memrate_116: unable to find metric write8/sum_speed

Expected behavior
Graph generation should run to completion, with all graphs generated.

Benchmark configuration
The default files simple.conf and monitoring.cfg (with BMC creds) were used.

Logs
If applicable, add logs to help explain your problem.

Environment (please complete the following information):

  • Linux distribution and version: ubuntu 24.04
  • Server platform HPE DLxxx
  • PDU model and firmware: Not used/configured
@ezekriSCW ezekriSCW changed the title [BUG] My beautiful issue [BUG] Memrate graph generation issue Nov 8, 2024
@anisse
Contributor

anisse commented Nov 12, 2024

What version of stress-ng was used in this case? Can you share the results.json? Or, if that's not possible, just a subset:

jq '.bench.memrate_116' < hwbench-out-20241107131337/results.json
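
If jq is not installed, the same subset can be pulled out with a few lines of Python. This is only a minimal sketch: it assumes nothing beyond what the jq filter above implies, namely a top-level `bench` object keyed by job name. `extract_job` is a hypothetical helper, not part of hwbench.

```python
import json


def extract_job(results_path, job_name):
    """Return the entry stored under bench.<job_name>, or None if absent.

    Assumes the layout implied by the jq filter '.bench.memrate_116':
    a top-level "bench" object whose keys are job names.
    """
    with open(results_path, encoding="utf-8") as f:
        results = json.load(f)
    return results.get("bench", {}).get(job_name)


# Example use (path taken from the report above):
#   job = extract_job("hwbench-out-20241107131337/results.json", "memrate_116")
#   print(json.dumps(job, indent=2))
```

Like jq, this yields a null value (`None`) rather than an error when the job is missing.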

@ezekriSCW
Author

stress-ng version: V0.17.04
attached an extract from results.json
memrate.json

Thanks @anisse

@ErwanAliasr1 ErwanAliasr1 added the bug Something isn't working label Nov 13, 2024
@anisse
Contributor

anisse commented Nov 15, 2024

I have analyzed the output data, and I'm not sure I understand what happened. We would need to solve #60 to have more complete output data. I tried a run on a server with the same CPU: I was not able to reproduce the problem.

If you re-run hwbench, do you always hit the same issue during graph generation?

Also, if you want to analyze the result anyway, it should be possible to remove the memrate_116 job from results.json and re-run hwgraph.

@anisse
Contributor

anisse commented Nov 15, 2024

I tried removing only the memrate_116 job from results.json, and hwgraph can go to the end and generate all its graphs.
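
For anyone hitting the same error, this workaround can be scripted. A minimal sketch, assuming the job sits under the top-level `bench` key (as the jq filter earlier in this thread implies); `drop_job` and the output file name are hypothetical, and the original results.json is left untouched.

```python
import json


def drop_job(src_path, dst_path, job_name):
    """Write a copy of src_path to dst_path without the given job.

    Returns True if the job was present and removed, False otherwise.
    Assumes jobs live under a top-level "bench" object.
    """
    with open(src_path, encoding="utf-8") as f:
        results = json.load(f)

    bench = results.get("bench", {})
    removed = job_name in bench
    bench.pop(job_name, None)  # no-op when the job is absent

    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(results, f, indent=2)
    return removed


# Example use (hypothetical output name):
#   drop_job("hwbench-out-20241107131337/results.json",
#            "hwbench-out-20241107131337/results-filtered.json",
#            "memrate_116")
```

hwgraph can then be re-run with the filtered file's path substituted in the `--traces` argument of the original command.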

@ezekriSCW
Author

I relaunched hwbench with 8x32G DIMMs instead of 1x32G DIMM, and all graphs were generated as expected.
Note that I haven't re-run hwbench with a single DIMM as in the initial run, so I cannot reproduce the problem for now.

@ezekriSCW
Author

Considering this resolved, as it works as expected with multiple DIMMs.
