
[BUG] metricsgeneration/infinite loop #37474

Closed
bmiguel-teixeira opened this issue Jan 24, 2025 · 4 comments · Fixed by #37599
Labels
bug (Something isn't working) · processor/metricsgeneration (Metrics Generation processor)

Comments

bmiguel-teixeira (Contributor) commented Jan 24, 2025

Component(s)

processor/metricsgeneration

What happened?

Description

When the destination (name) and metric1 have the same value (i.e., they refer to the same metric), the processor enters an infinite loop.

Steps to Reproduce

Ideally, when using the scaling operations, I would like to change the selected metric "in place". However, if the generated metric name is set equal to metric1's name, the processor enters an infinite loop.
The config is shown below.

  metricsgeneration/scale:
    rules:
    - name: linux_memory_free_percent_percent
      type: scale
      metric1: linux_memory_free_percent_percent
      operation: multiply
      scale_by: 100

This happens because we iterate over the metrics array, and when we find a matching metric we scale it based on the operation and append the result to the same metrics array we are iterating over. If the metric names are equal, this causes a never-ending loop.
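Below is a minimal Go sketch (not the actual metricsgeneration code) of the pattern described above; the metrics are modeled as a plain string slice just to illustrate why the loop never terminates when the generated name equals metric1.

```go
package main

import "fmt"

func main() {
	metrics := []string{"linux_memory_free_percent_percent"}

	for i := 0; i < len(metrics); i++ {
		if metrics[i] == "linux_memory_free_percent_percent" {
			// The "scaled" result is appended under the same name to the
			// very slice the loop condition measures, so len(metrics) grows
			// by one on every pass and i never catches up.
			metrics = append(metrics, metrics[i])
		}
		if i == 10 {
			// Guard so this demo terminates; the real loop has no such
			// guard and runs forever.
			fmt.Println("still growing, len =", len(metrics))
			return
		}
	}
}
```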

Expected Result

Replace/update the metric in place. If this could be a desired behaviour for the config above, I can submit a PR.

Actual Result

Infinite loop in processor.

Collector version

0.118.0

Environment information

Environment

N/A

OpenTelemetry Collector configuration

Log output

Additional context

No response

bmiguel-teixeira added the bug (Something isn't working) and needs triage (New item requiring triage) labels on Jan 24, 2025
The github-actions bot added the processor/metricsgeneration (Metrics Generation processor) label on Jan 24, 2025

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

VihasMakwana (Contributor) commented:

@bmiguel-teixeira
From another perspective, this also happens because the new metric's name conflicts with the old one:

  metricsgeneration/scale:
    rules:
    - name: linux_memory_free_percent_percent # new name conflicts with metric1
      type: scale
      metric1: linux_memory_free_percent_percent
      operation: multiply
      scale_by: 100

Should we only allow this if the names are different? wdyt?

VihasMakwana removed the needs triage (New item requiring triage) label on Jan 30, 2025
crobert-1 (Member) commented:

I was able to reproduce this. The bug is in the generateScalarMetrics method: the loop adds data points to the metric while checking whether we've scaled all of the data points in the same slice that's being iterated. Since we're adding a new data point in place for each existing data point, the loop never finishes.

From the processor's description:

"The metrics generation processor (metricsgeneration) can be used to create new metrics using existing metrics following a given rule."

From this, I believe the usage here is invalid, and the metrics transform processor would be better suited to this situation.

That being said, I think we should fail on config validation if the new metric name matches the existing metric name. This will make it clear to users that this is unsupported behavior.
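A rough sketch of what such a validation check could look like is below. The Rule and Config types and their field names simply mirror the YAML keys (name, type, metric1) and are illustrative assumptions, not the processor's actual Go definitions.

```go
package metricsgenerationprocessor

import "fmt"

// Rule and Config are illustrative stand-ins whose fields mirror the YAML
// keys; the real processor's config types may differ.
type Rule struct {
	Name    string // generated metric name
	Type    string // e.g. "scale"
	Metric1 string // source metric name
}

type Config struct {
	Rules []Rule
}

// Validate rejects any rule whose generated metric name matches metric1,
// which is the configuration that triggers the infinite loop.
func (c *Config) Validate() error {
	for _, rule := range c.Rules {
		if rule.Name == rule.Metric1 {
			return fmt.Errorf("rule %q: the generated metric name must differ from metric1", rule.Name)
		}
	}
	return nil
}
```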

bmiguel-teixeira (Contributor, Author) commented:

For this particular use case of "just" scaling, the metrics transform processor would indeed work. Since I was already using this processor for other operations (percentages), I figured I would just reuse it for scaling in this particular case.

Additionally, the point of scaling "in place" was just to avoid having to clean up the "older/temporary" metric. I'm okay with assuming the "new metric" behaviour if we lock it behind a config check to avoid this scenario.

songy23 pushed a commit that referenced this issue Jan 31, 2025
…ric names (#37599)

#### Description
If a generated metric name matches the metric being scaled, an infinite loop is hit. Since this action is not supported by this processor, and is supported by the metrics transform processor, the fix here is to add config validation enforcing that metric names are different.

#### Link to tracking issue
Fixes #37474

#### Testing
Added a test to ensure enforcement is done properly.
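As a rough illustration of such a test (using the same hypothetical Config/Rule types and Validate method sketched earlier, not the processor's actual test code):

```go
package metricsgenerationprocessor

import "testing"

// A rule whose generated name equals metric1 must fail validation.
func TestValidateRejectsMatchingMetricNames(t *testing.T) {
	cfg := &Config{Rules: []Rule{{
		Name:    "linux_memory_free_percent_percent",
		Type:    "scale",
		Metric1: "linux_memory_free_percent_percent",
	}}}
	if err := cfg.Validate(); err == nil {
		t.Fatal("expected a validation error when name matches metric1")
	}
}
```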