Feature Request: Blacklist Host Tags #3130

Open

ericlarssen-wf opened this issue Mar 6, 2019 · 28 comments
@ericlarssen-wf

It would be great if it were possible to strip host tags off of metrics. Tags such as which autoscaling group a metric comes from are not very valuable and can clutter the tags for a particular metric. Being able to exclude tags based on a regex would make it possible to strip multiple tags at once, including automatically generated ones.
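
To make the request concrete, here is a purely hypothetical sketch of what such an option could look like in datadog.yaml. The key below does not exist in the agent today; it is only meant to illustrate the idea of a regex-based tag blacklist:

## Hypothetical configuration - this key is NOT implemented in the agent.
## Shown only to illustrate the requested regex-based tag exclusion.
tags_blacklist:
  - "^autoscaling_group:.*"
  - "^instance-id:.*"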

@tonglil
Contributor

tonglil commented Feb 6, 2021

To give an example, in a cattle-like environment, I don't need host, internal-hostname, instance-id, instance-template, or created-by tags on my metrics, as these are automatically generated and cycled frequently in the runtime environment.

Maybe I need one of host OR instance-id, but I'm usually not drilling down to that level, especially if I'm more concerned with high-level metrics.

@hermanbanken

We could really use this too. We have high-cardinality tags, and ALL of our metrics are also annotated by the various hosts, multiplying the whole thing enormously.

I believe the instance-related tags are all added by the DataDog agent itself, right? Based on where the data is sourced from.

@kolloch

kolloch commented Jun 23, 2021

👍 on the possibility to remove host tags!

I'd also appreciate the possibility to remove the kube_replicaset tag.

Maybe one could have a generic blacklist in the agent?

@kirecek

kirecek commented Dec 13, 2021

hmm, the issue was created in 2019, but I assume this exclude option is still not implemented, right? Or did you guys figure out some workaround by any chance?

@danopia

danopia commented Dec 13, 2021

I think Metrics Without Limits can work around the whole pricing aspect of this extra cardinality. But the actual remove-tags feature doesn't exist, as far as I know.

@knowshan

knowshan commented Jan 9, 2022

We would like to have this feature as well. This should be configurable through AWS integration configuration OR Datadog agent.

@richid

richid commented Apr 11, 2022

I will say that I've tried overwriting these tags with a single dummy value in the datadog.yaml file:

tags:
  - aws:ec2:fleet-id:dummy

But that dummy value just gets added to the list of tag values for that tag key.

@andrew-kolesnikov

andrew-kolesnikov commented Aug 25, 2022

My team could really use this too.
Really sad to see that this was requested several years ago but appears to have been neither satisfied nor rejected.

@alexb-img

We require tags to be filtered/removed at the source (ingest) as well. Metrics Without Limits only removes tags during indexing, not ingest.

@arloliu

arloliu commented Oct 18, 2022

Our company requires this feature too; the cost of the many unused tags is high.

@adudek

adudek commented Nov 4, 2022

Same here - I would appreciate the possibility to create a filter mask for tags. If tags cannot be removed: in principle, monitoring should not affect infrastructure, nor should it impact cost (tags play a functional role in some scenarios).
Since tagging is a cash cow for Datadog, I doubt anyone will pick this up :(

@patbl

patbl commented Nov 5, 2022

I wrote a Ruby script that adds custom tag groups that exclude tags you don't want. You'll want to tailor it to your use case, or use it as inspiration. It takes about 8 hours to run against the Datadog account I work on, which has around 10k metrics. My company set up a Datadog monitor for custom-metrics usage, and we re-run this script whenever it alerts.

I agree with others that Datadog's tooling for managing tags on large numbers of metrics is poor. Having Terraform configuration for thousands of metrics isn't practical, and neither is manually configuring them through the web UI. All we're asking for is a blacklist, which seems a lot easier to implement than many other parts of Datadog's tooling, which I'm generally impressed by.
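
For anyone who wants to build something similar in another language, here is a minimal sketch of the kind of call such a script makes, assuming Datadog's v2 "create a tag configuration" endpoint (POST /api/v2/metrics/{metric_name}/tags). The metric name, tag list, and metric type are placeholders, and you should verify the payload against the current API reference before relying on it:

package main

import (
	"bytes"
	"fmt"
	"net/http"
	"os"
)

// Sketch only: creates a tag configuration for one metric so that only the
// listed tags are indexed. Endpoint and payload follow the Datadog v2
// "create a tag configuration" API as I understand it - verify against the
// current API reference before using.
func main() {
	metric := "my_app.request_count" // placeholder metric name
	payload := fmt.Sprintf(`{
	  "data": {
	    "type": "manage_tags",
	    "id": %q,
	    "attributes": {
	      "metric_type": "count",
	      "tags": ["env", "service"]
	    }
	  }
	}`, metric)

	url := "https://api.datadoghq.com/api/v2/metrics/" + metric + "/tags"
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewBufferString(payload))
	if err != nil {
		panic(err)
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("DD-API-KEY", os.Getenv("DD_API_KEY"))
	req.Header.Set("DD-APPLICATION-KEY", os.Getenv("DD_APP_KEY"))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}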

@zekth

zekth commented May 14, 2023

Following up on this one. There was a question about supporting it not only for statsd but also for openmetrics. Also, as I see it, we currently have tag exclusion for EC2 tags. Do we want to make a generic pkg so that these exclude-tags features work the same way, by consuming string slices from the configuration?

I'm happy to implement this if I get a green flag from the maintainer team.

Tagging @alexb-img and @olivielpeau because you were active on #6526

@LutaoX

LutaoX commented Jun 15, 2023

I'm a PM at Datadog and want to chime in here to provide context that we are aware of this feature request and are actively looking for details about the use cases, telemetry pipeline needs and pain-points. We highly encourage customers to reach out to us via our support channel (https://www.datadoghq.com/support/) or your CSM contact about this topic! Thanks!

@2rs2ts

2rs2ts commented Jun 20, 2023

@LutaoX Let me just say that it's super awesome to hear from a PM in a public setting, because until now that's definitely not been the norm from my experience with requests on this repo; my company had internally started assuming that filing feature requests in Github to go along with our support cases was pointless.

Anyway, I'm pretty sure I've reached out before about this and ended up having a support case where I linked this issue, but I don't want to go dig it back up and cause confusion on the customer support end by necro-ing a 2-3 year old issue. One of the big use cases for us is in Kubernetes, where we don't want the host tag from the reporting agent to get applied to statsd metrics, because (to make a long story short) internalTrafficPolicy: local does not actually turn off kube-proxy's load balancing behavior, thereby making it impossible for pod Foo on host A to guarantee that any statsd metrics it sends aren't tagged as potentially coming from other hosts. (I brought it up with the sig-networking at some point but they told me that that's intentional behavior of kube-proxy and that a KEP would be needed to add a new option to actually only send traffic to the local pod.)

Obviously this would be a different part of the codebase, but there's also the matter of EC2 tags where we want some of those tags (such as Env) but not others (such as weird inventorying tags that the company forces us to add but which we don't want showing up in the DataDog web UI.) I'm not sure if that falls under "blacklist host tags" but it sure feels adjacent to it at the very least, and I'm nearly certain I've also brought this up with customer support–maybe even in the same breath as the matter of the host tag in Kubernetes.

I'm only commenting about my use cases here to make sure they get a little bump and to let anyone who's in the same boat as I am just reference my comment when they open their support tickets. Like, if one can summarize one's request by saying "please do what this guy on github said" then hey, I saved them time :)

@lmello

lmello commented Oct 16, 2023

I implemented this for k8s workloads; I guess my implementation could be expanded to cover host tags as well.
To do that, I added a removeTags method to the taglist.

#20161

@cptkng

cptkng commented Jan 17, 2024

I'm a PM at Datadog and want to chime in here to provide context that we are aware of this feature request and are actively looking for details about the use cases, telemetry pipeline needs and pain-points. We highly encourage customers to reach out to us via our support channel (https://www.datadoghq.com/support/) or your CSM contact about this topic! Thanks!

I created a support case around the issue not long ago. @LutaoX , I don't know if you can access it, but here's the case: http://help.datadoghq.com/hc/requests/1402253

@cptkng

cptkng commented Jan 17, 2024

We require tags to be filtered/removed at the source (ingest) as well. Metrics Without Limits only removes tags during indexing, not ingest.

And that's exactly the issue. While you can define which tags you want indexed with Metrics Without Limits (and that's good, as indexing is really expensive), you cannot define what is ingested. And though the price per ingested metric is much lower than per indexed metric, if you have many hosts with many custom metrics, you still end up with a super high bill.

@ide

ide commented Mar 7, 2024

Pragmatically speaking, is there a workaround to reduce high counts of custom metrics other than using something other than Datadog? More precisely, my reading of this GitHub issue and the problem at hand is that the Datadog agent always adds tags, some of which have high cardinalities like host and name, which significantly raises the number of custom metrics and the customer's bill outside of the customer's control. The two ways I am aware of to remove these tags are either to use a custom Datadog agent or not use Datadog altogether, and it would be great to learn of a viable workaround.

IMO ideally the agent would allow for custom tag transformers that receive a tag–value pair from the agent and return a tag–value pair to send to the Datadog service, where returning null would drop that tag–value. But even a simpler API to drop tags regardless of their values would be a great feature.
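
To illustrate the kind of hook being asked for, here is a hypothetical sketch in Go (the agent's language) of such a tag-transformer API. Nothing like this exists in the agent today; the names and signatures are invented purely to make the idea concrete:

package main

import (
	"fmt"
	"strings"
)

// TagTransformer is a hypothetical hook: it receives a tag key/value pair and
// returns the pair to send, or ok=false to drop the tag entirely.
// This interface does NOT exist in the Datadog agent; it only sketches the idea.
type TagTransformer func(key, value string) (newKey, newValue string, ok bool)

// dropHighCardinality drops host-like tags and passes everything else through.
func dropHighCardinality(key, value string) (string, string, bool) {
	switch key {
	case "host", "instance-id", "kube_replicaset":
		return "", "", false // drop the tag
	default:
		return key, value, true // keep unchanged
	}
}

// applyTransform runs a transformer over "key:value" formatted tags.
func applyTransform(tags []string, t TagTransformer) []string {
	out := make([]string, 0, len(tags))
	for _, tag := range tags {
		key, value, _ := strings.Cut(tag, ":")
		if k, v, ok := t(key, value); ok {
			out = append(out, k+":"+v)
		}
	}
	return out
}

func main() {
	tags := []string{"env:prod", "host:ip-10-0-0-12", "kube_replicaset:web-7f9c"}
	fmt.Println(applyTransform(tags, dropHighCardinality)) // [env:prod]
}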

@adudek

adudek commented Mar 8, 2024

@ide I've never tested this, but there is a Vector proxy service you can deploy between the agent and Datadog. You'd have to reconfigure agents and build your own filtering rules, but anything can become financially viable at a certain threshold.

@toan-hf

toan-hf commented Mar 23, 2024

Hi everyone! I think the concern about blacklisting host tags here is super valid, regardless of whether the Datadog agent supports it. I have currently tested with Vector.dev to disable some unnecessary tags before they are ingested & indexed into Datadog.
Although this is super important for us (and our budget), it seems this feature may take longer to implement on the Datadog side, so I'm sharing my implementation here for visibility.

Our architecture looks like this:

Datadog-Agent --> Vector.Dev ---> Datadog Platform

Step 1: It is important to note that your DD-Agent version must be higher than 7.45.1 so that the environment variables below can be applied:

DD_OBSERVABILITY_PIPELINES_WORKER_METRICS_ENABLED - boolean - optional - default: false
    ## Enables forwarding of metrics to an Observability Pipelines Worker

DD_OBSERVABILITY_PIPELINES_WORKER_METRICS_URL - string - optional - default: ""
    ## This is the URL of the vector.dev service that you need to enter

DD_OBSERVABILITY_PIPELINES_WORKER_LOGS_ENABLED - boolean - optional - default: false
    ## Enables forwarding of logs to an Observability Pipelines Worker

DD_OBSERVABILITY_PIPELINES_WORKER_LOGS_URL - string - optional - default: ""

DD_OBSERVABILITY_PIPELINES_WORKER_TRACES_ENABLED - boolean - optional - default: false
    ## Enables forwarding of traces to an Observability Pipelines Worker

DD_OBSERVABILITY_PIPELINES_WORKER_TRACES_URL - string - optional - default: ""
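
For example, to forward only metrics, the minimal pair to set is the following (the hostname is an assumption about where the Vector service from Step 2 is deployed; the port matches the listener configured in Step 3):

    DD_OBSERVABILITY_PIPELINES_WORKER_METRICS_ENABLED=true
    DD_OBSERVABILITY_PIPELINES_WORKER_METRICS_URL=http://vector.observability.svc.cluster.local:8282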

Step 2: Set up the vector.dev service

(It can be managed by Helm, so please utilise it.) The outcome is that you have the service up and running, with a practical endpoint to receive the data.

Step 3: The rule that I used on vector.dev to discard the unnecessary tags

sources:
  ## This allows your vector.dev service to receive the traffic from the Datadog-Agent
  datadog_agents:
    type: datadog_agent
    address: 0.0.0.0:8282
    multiple_outputs: true
    store_api_key: false

## This deletes the two tags pod_phase & namespace, and also appends an image_id tag with the default value themystery
transforms:
  drop_one_tag:
    type: remap
    inputs:
      - datadog_agents.metrics
    source: |-
      del(.tags.pod_phase)
      del(.tags.namespace)
      .tags.image_id = "themystery"

## Output all of them to the Datadog Platform
sinks:
  datadog_metrics:
    type: datadog_metrics
    inputs:
      - drop_one_tag
    compression: gzip
    default_api_key: ${DD_API_KEY}
    site: "datadoghq.eu"

Finally, you can check your metric after it arrives in the Datadog Platform; the pod_phase / namespace tags will no longer appear.

Hope this helps everyone (not just as a short-term fix but even as a long-term model).

@Twe3tTwe3t

We have this same issue as well, particularly in an Azure AKS environment.
In our case, one team looks after the AKS infrastructure, so they tag the underlying infrastructure with common tags such as team, service, and env, as well as some custom tags such as 'costcentre'.
We then have multiple application teams that run their applications on top of the provided AKS clusters, and those teams also tag their deployments with the same tags.
What we are seeing, on logging especially, are duplicate tags coming from both the host tags and the application logs. This is becoming a pain point, as we back-bill customers for their logging usage in Datadog.
We've checked, double-checked, and re-checked all our configuration to make sure none of the settings that pull host tags as labels are enabled in the DD agent, so it should only apply a customer's application tags to its logs and metrics.

@lifttocode

lifttocode commented Jul 18, 2024

More precisely, my reading of this GitHub issue and the problem at hand is that the Datadog agent always adds tags, some of which have high cardinalities like host and name, which significantly raises the number of custom metrics and the customer's bill outside of the customer's control.

For those facing issues with high cardinality due to host tags in Datadog, I’d like to share a solution that worked for me.

In the official Datadog documentation on DogStatsD metrics submission - host tags, there is a somewhat vague but crucial statement: “The submitted host tag overrides any hostname collected by or configured in the Agent.” Leveraging this, I discovered that by adding an empty host: tag to all generated metrics, I successfully eliminated the unnecessary host tags that were significantly increasing the cardinality. Now, all custom metrics submitted to Datadog only include the tags that I intend to include.

My breakthrough came from Datadog Agent release 6.6.0, which introduced an enhancement allowing DogStatsD to support the removal of hostnames on events and services checks, similar to metrics, by adding an empty host: tag.
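
As a concrete illustration of the empty host: trick, here is a minimal sketch using the datadog-go DogStatsD client; the metric name and other tags are made up, and the same effect should be achievable with a raw datagram such as my_app.queue_depth:42|g|#env:prod,host: :

package main

import (
	"log"

	"github.com/DataDog/datadog-go/v5/statsd"
)

func main() {
	// Connect to the local DogStatsD endpoint exposed by the agent.
	client, err := statsd.New("127.0.0.1:8125")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// The empty "host:" tag asks the agent to submit the metric without a
	// hostname, overriding the host tag it would normally attach.
	err = client.Gauge("my_app.queue_depth", 42, []string{"env:prod", "host:"}, 1)
	if err != nil {
		log.Fatal(err)
	}
}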

@ide

ide commented Jul 19, 2024

@lifttocode Thank you for sharing that. Specifying host: appears to have removed the "host" tag.

Relatedly, I tried doing the same for other tags (other_tag:) but the behavior is slightly different. The agent will use other_tag: to override the tag that the agent would have otherwise specified but it still sends other_tag: to Datadog. The Datadog website UI displays the tag as other_tag without a value. Trying it out is the easiest way to see the behavior for yourself. It is still useful to be able to override tags that would otherwise increase your ingested or indexed tag counts.

@sherifabdlnaby

We have this same issue as well, particularly in a Azure AKS environment. In our case, our scenario is, 1 team looks after the AKS infrastructure, so therefore they tag the underlying infrastructure with common tags such as team, service, env as well as some custom tags such as 'costcentre'. We then have multiple application teams that will run their application on top of the provided AKS clusters, those teams too will tag their deployments with the same tags. What we are seeing on logging especially are duplicate tags, coming off the both the host tags as well as application logs. this is becoming a pain point as we back bill customers for their logging usage into Datadog. We've checked, double checked, re-checked all our configuration to make sure none of the settings are enabled in the DD-agent to pull host tags as labels, so should only be apply a customers application tags to its logs and metrics.

THIS

@2rs2ts

2rs2ts commented Oct 23, 2024

@lifttocode

“The submitted host tag overrides any hostname collected by or configured in the Agent.”

That's good to know; unfortunately, it seems that autodiscovery-scraped metrics (e.g. openmetrics, prometheus, etc.) that have a host label on them just end up creating host aliases, so the metrics are double-tagged with two host:... tags. So I guess it's not a consistent experience, or maybe I'm just misunderstanding something. Even if my problem exists between my keyboard and chair in that case, there's still the matter of wanting a partial list of host tags to apply... well, that or duplicating the host tags at the agent level. I just don't want my users to have to re-specify env and a bunch of other tags every time they set up any autodiscovery (or StatsD metrics, for that matter).

@Scalahansolo

How is this not an obvious feature to include in the agent? At this point it just feels like Datadog is being overly greedy and not implementing this as a way to extract more money out of its customers.

@kaarolch

kaarolch commented Feb 20, 2025

@ide never tested this, but there is a Vector proxy service You can deploy between agent and datadog. You'd have to reconfigure agents and build your filtering rules, but anything can become financially viable at a certain threshold.

Unfortunately we tried multiple options with Vector as a proxy, but:

  • When you drop the host tag, your metrics will have different values in Datadog. This is especially critical for gauges (the sink takes the last value of a series). Counters look a little better, but the Datadog sinks perform aggregation: when multiple Vector aggregators send similar series (where the host is often the unique series differentiator), the series can be marked as duplicates in the Datadog backend and subsequently dropped.
  • With Vector, you can temporarily duplicate metrics. We also tried renaming the metrics from metrics_a to metrics_a_new_host_tag and changing the host tag to the aggregator's pod (sts) name. Additionally, we sent our cluster_id, hoping to have enough unique data to differentiate counter series between aggregator pods and clusters. Unfortunately, the metrics have different values; the characteristic is not bad, but there is still a 5-10% diff. Below is the VRL transform so you can test it on your own:
# metric_duplicate.yaml
type: route
inputs:
  - metrics_rename
route:
  host_rename: .name == "metric_a"

# metric_rename_host.yaml
type: remap
inputs:
  - metric_duplicate.host_rename
source: |-
  .name = "metric_a_with_new_host"
  .tags.host = "${POD_NAME}"
  # del(.tags.host) <- this can drop host tags.

# metric_sink_route.yaml
type: route
inputs:
  - metric_duplicate._unmatched
  - metric_rename_host
route:
  sink_s3: .tags.s3=true
  sink_datadog: .tags.s3=false

Then you can compare the results and check how much your data differs, especially when you sum the series by environment tags.

It is worth mentioning that, by default, the host tag is used to merge integration tags (AWS, k8s, etc.), which are currently merged on the DD backend side, not the client side. When you remove the host tag, you also drop a lot of environment tags.

We have a lot of metrics that only need 1-2 tags like environment and cluster_id, but when we dropped host tags we saw different metric results.

When we dropped any other tag, everything worked as expected and the summary data points were almost identical.

The only working solution is enabling DD MWL (Metrics Without Limits) and reducing indexed tags; dropping the host tag there works, but unfortunately you will need to pay for ingested metrics.
