Mimir images built with profile-guided optimization #9200

Open
LasseHels opened this issue Sep 4, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@LasseHels
Contributor

LasseHels commented Sep 4, 2024

Is your feature request related to a problem?

No!

Describe the solution you'd like

We would like to be able to run Mimir images built with profile-guided optimization (PGO).

Since Mimir runs different components from a single image (grafana/mimir), it might make sense to have different images like:

  • grafana/mimir:2.14.0 - Built without PGO.
  • grafana/mimir:2.14.0-ingester-pgo - Built with ingester profiles.
  • grafana/mimir:2.14.0-distributor-pgo - Built with distributor profiles.
  • Etc.

To maintain backwards compatibility, I imagine it should still be possible to run an image optimized for component X as component Y. For example, it should be possible to run grafana/mimir:2.14.0-ingester-pgo with a target=distributor flag. Distributors would then run with an image optimized with ingester profiles, which is sub-optimal but will at least still work.
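
To make the proposed naming scheme concrete, here is a minimal Go sketch of how a deployment tool might pick an image tag per target and fall back to the plain image when no PGO variant exists. The helper `imageFor` and the `pgoTargets` set are hypothetical names for illustration; only the tag format comes from the list above.

```go
package main

import "fmt"

// pgoTargets is a hypothetical set of components that have a PGO-built image.
var pgoTargets = map[string]bool{
	"ingester":    true,
	"distributor": true,
}

// imageFor returns the image tag to run for a given Mimir version and target.
// Targets without a PGO variant fall back to the image built without PGO.
func imageFor(version, target string) string {
	base := "grafana/mimir:" + version
	if pgoTargets[target] {
		return base + "-" + target + "-pgo"
	}
	return base
}

func main() {
	for _, target := range []string{"ingester", "distributor", "querier"} {
		fmt.Println(target, "->", imageFor("2.14.0", target))
	}
}
```

Because every image still contains the full binary, the fallback only affects how well the build is optimized, not whether the target can run.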

Describe alternatives you've considered

We've considered running custom Mimir images optimized with profiles from our own environment. Not only is this cumbersome and error prone, we also imagine that profiles pulled from Grafana Cloud will be the most accurate (and thus provide the largest performance benefit).

Additional context

Various sources online claim performance benefits in the order of 2-14%. This is a significant performance improvement at a relatively low cost.

@aknuds1 aknuds1 added the enhancement New feature or request label Sep 4, 2024
@dimitarvdimitrov
Contributor

With Pyroscope we can get a single pprof file from the whole cell. That file will contain nodes from different components. Maybe that's another approach, one which avoids having multiple images.

@LasseHels
Contributor Author

Is there an expected timeline for this feature? I'd love to contribute, but I am unsure of how I can help with this as I am obviously unable to pull profiles from Grafana's production environment 😄.

@narqo
Contributor

narqo commented Sep 9, 2024

Various sources online claim performance benefits in the order of 2-14%. This is a significant performance improvement at a relatively low cost.

Last time I asked internally, when people looked at building the binary with PGO, focusing only on ingesters, the experience was that the resulting performance improvements weren't obviously visible compared to the added complexity. This doesn't mean there is no interest in making this happen, though.

Is there an expected timeline for this feature? I'd love to contribute, but I am unsure of how I can help

Before working on shipping per-target PGO images, it'd be helpful to know about your own experience. That is, if you have a sufficiently large installation of Mimir, consider building your own image for one PGO-optimized target. Run this image as an experiment internally, and contribute back your performance results. This would be a huge help in moving this forward.
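
For anyone trying the experiment suggested above, the first step is getting a CPU profile from a running component. The sketch below assumes the component exposes Go's standard net/http/pprof handlers under /debug/pprof/ on its HTTP port; the base URL, the 30-second window, and the helper names are assumptions for illustration, not something confirmed in this thread. Since Go 1.21, `go build` picks up a file named default.pgo in the main package directory automatically, or a profile can be passed explicitly with `-pgo=<path>`.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"strconv"
	"time"
)

// profileURL builds the CPU profile URL for a service that exposes the
// standard net/http/pprof handlers under /debug/pprof/.
func profileURL(base string, seconds int) string {
	q := url.Values{}
	q.Set("seconds", strconv.Itoa(seconds))
	return base + "/debug/pprof/profile?" + q.Encode()
}

// fetchProfile downloads a CPU profile and writes it to path.
func fetchProfile(base string, seconds int, path string) error {
	// The handler blocks for the whole sampling window, so allow extra time.
	client := &http.Client{Timeout: time.Duration(seconds+30) * time.Second}
	resp, err := client.Get(profileURL(base, seconds))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %s", resp.Status)
	}
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = io.Copy(f, resp.Body)
	return err
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: fetchprofile <base-url>")
		return
	}
	// Writes default.pgo to the current directory; copy it next to the
	// main package before building.
	if err := fetchProfile(os.Args[1], 30, "default.pgo"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

A profile taken from a single instance over a short window may not be representative, so collecting from several instances under typical load is likely to give a more useful input for the build.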

@LasseHels
Contributor Author

@narqo Roger that. We will look into gauging the performance impact with custom images and report back. Could take a while.

@wilfriedroset
Collaborator

Before working on shipping per-target pgo-images it'd be helpful to know about your own experience

What about shipping the Debian packages? I rely on installing Mimir via the deb package and I'm also interested in a PGO version of Mimir; however, having a package per component might be a hassle. What about providing a grafana/mimir:2.14.0-pgo image along with a counterpart for Debian/Red Hat, such as mimir-2.14.0+pgo_amd64.deb?

P.S.: The naming is similar to the one Prometheus uses for the dedupelabels release, see: https://github.com/prometheus/prometheus/releases/tag/v2.54.0%2Bdedupelabels and https://hub.docker.com/layers/prom/prometheus/v2.54.0-dedupelabels/images/sha256-57010aa58dc72f41023212f8e599ef78ac3dc253158857ea9f21e45114f68a66

@LasseHels
Contributor Author

We've tried out PGO-enabled images for ingesters and distributors. We chose ingesters and distributors as they are the components that make up most of our compute consumption.

For both components, the impact of enabling PGO was too small to measure.

Ingesters:
Image

Distributors:
Image

Both of the above images have the component rollout roughly in the middle. Had there been a significant impact of enabling PGO, we would have seen Avg CPU drop and diff consistently drop below 1.
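
As a rough illustration of the comparison described above, the sketch below computes the ratio of average CPU after a rollout to average CPU before it. It assumes that "diff" in the graphs is such an after/before ratio, which the thread does not state explicitly; the function names and the sample values are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// mean returns the arithmetic mean of xs, or 0 for an empty slice.
func mean(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// cpuRatio returns mean(after) / mean(before). A value consistently below 1
// would indicate lower CPU usage after the rollout.
func cpuRatio(before, after []float64) (float64, error) {
	b := mean(before)
	if b == 0 {
		return 0, errors.New("no usable baseline samples")
	}
	return mean(after) / b, nil
}

func main() {
	// Hypothetical CPU samples (in cores) before and after a rollout.
	before := []float64{4.0, 4.2, 3.9}
	after := []float64{4.1, 3.9, 4.0}
	ratio, err := cpuRatio(before, after)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("after/before CPU ratio: %.3f\n", ratio)
}
```

A ratio that hovers around 1 with no clear shift at the rollout point is consistent with the "too small to measure" conclusion, though normal load variation can easily mask an effect of a few percent.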

For the record, these are our request stats for ingesters and distributors (taken from Mimir / Writes dashboard):
Image

@pbailhache
Contributor

pbailhache commented Oct 16, 2024

With @wilfriedroset we ran some tests on one of our Mimir clusters.

Just like @LasseHels we chose the ingesters & distributors.

We used an ingester profile for PGO.

EDIT: Some information on our infrastructure to put the results in perspective:
Running Mimir 2.13.
Each instance is on a VM (15 GB RAM, 4 cores).
Around 120k in-memory series on each ingester.

Here are our results.
The base profile was taken before switching to PGO, and I took two profiles after the switch to get a better view:

Profiling

Here are the CPU details for each ingester tested. As you can see, the switch happened around 15:25.

[Images: CPU usage graphs for each tested ingester]

We can see a small drop (matching the ~2-3% average we saw from the profile diff).

We only tested on part of one cluster, but it covered about 10 ingesters and 5 distributors.
