Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix/duplicate tags #13166

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

andrewstuart
Copy link

@andrewstuart andrewstuart commented Aug 18, 2022

What does this PR do?

This PR demonstrates one possible solution to an issue we've observed in production, where env tags can be merged with the official deployment.environment resource-level "tag," resulting in confusing behavior if that attribute is not also overridden (or otherwise does not match the env tag).

Motivation

This PR is inspired by a particularly nasty bug we have encountered recently with OTLP metrics being sent to DataDog. When setting env attributes via the attributeprocessor, we were inadvertently creating situations where the Resource level tags did not match the metric series level tags. This caused double entries in the JSON being sent to datadog, and very confusing results in dashboards.

Possible Drawbacks / Trade-offs

The current solution is not configurable and may result in tag values that don't make sense, depending on the tag source that created the original tag.

Describe how to test/QA your changes

go test

You could also set up an otel-contrib-collector pipeline with a DD exporter, and different upsert for env in resourceprocessor and attributeprocessor like so:

  exporters:
    datadog:
      api:
        key: SET_VIA_ENV
  processors:
    resource:
      attributes:
        - key: deployment.environment # technically we override this and env on both processors, now, just to be safe, but IIRC this should replicate
          action: upsert
          value: prod
    attributes:
      actions:
        - key: env
          action: upsert
          value: dev
  service:
    pipelines:
      metrics:
        exporters:
          - datadog
        processors:
          - resource
          - attributes
        receivers:
          - prometheus # or whatever

Reviewer's Checklist

  • If known, an appropriate milestone has been selected; otherwise the Triage milestone is set.
  • Use the major_change label if your change either has a major impact on the code base, is impacting multiple teams or is changing important well-established internals of the Agent. This label will be use during QA to make sure each team pay extra attention to the changed behavior. For any customer facing change use a releasenote.
  • A release note has been added or the changelog/no-changelog label has been applied.
  • Changed code has automated tests for its functionality.
  • Adequate QA/testing plan information is provided if the qa/skip-qa label is not applied.
  • At least one team/.. label has been applied, indicating the team(s) that should QA this change.
  • If applicable, docs team has been notified or an issue has been opened on the documentation repo.
  • If applicable, the need-change/operator and need-change/helm labels have been applied.
  • If applicable, the k8s/<min-version> label, indicating the lowest Kubernetes version compatible with this feature.
  • If applicable, the config template has been updated.

@andrewstuart andrewstuart requested a review from a team as a code owner August 18, 2022 20:35
@bits-bot
Copy link
Collaborator

bits-bot commented Aug 18, 2022

CLA assistant check
All committers have signed the CLA.

@andrewstuart
Copy link
Author

I wanted to get this out there to discuss the solution, and whether or not it is viable or if you'd like some more changes or design to happen before proceeding, but we've been bitten by this and have a workaround currently, but it seems ideal that others would not have to figure out the underlying conversions in play, quite to the same level that we've had to debug.

Please let me know if there's anything you'd like to change here, or discuss better alternatives that I may have missed, not knowing the codebase quite as well as I'm sure the team does.

@andrewstuart andrewstuart changed the title [WIP] Bugfix/duplicate tags Bugfix/duplicate tags Aug 18, 2022
@dineshg13
Copy link
Member

dineshg13 commented Aug 23, 2022

@andrewstuart Thanks for finding & fixing this bug. This might an issue with traces or even logs. I will get back to you if we want to fix in all places or there is some central place we can do it .

for _, tag := range d.tags {
sp := strings.Split(tag, ":")
if v, ok := collisionCheck[sp[0]]; ok && v != strings.Join(sp[1:], ":") {
d2.tags = append(d2.tags, "resource."+tag)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we assuming that any duplicate that is a resource attribute and prefixing it as such? If we want to prefix all resource attributes like this shouldn't it be done at a different level where we are actually aware of this fact? It's a bit of a wild assumption.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned that explicitly in the tradeoffs above. I just wanted to get this out there, and figured a better solution (at the very least, configurable) would surface as part of the discussions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think such a tradeoff is acceptable. I think we should instead simply ensure that the environment is taken from the right tag.

Copy link
Author

@andrewstuart andrewstuart Aug 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to fix this however you would like. I put a few solid days of discovery into this issue and fix, so I'm tapped out if you decide to go a different direction, but I won't be offended if you do. Either way, just know that the potential for duplicate tags will still be a fundamental issue with the way the DataDog otlp adapter, and thereby the otel DD exporter, is currently designed.

@diguardiag
Copy link

Any update to this? we are affected by this issue and i'm sure other people as well

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants