Hey team, I wanted to let you know that I've been trying to get OTel working for multi-tenant Kubernetes clusters and have opened an issue on the main OTel repo. This project is the closest thing I've seen to a fully OTel-based version of the fluentd custom-resource model, but I have some qualms about using it, which I've laid out at the bottom of this issue:
open-telemetry/opentelemetry-collector-contrib#48895
This looks like a great project that I'd love to contribute to, and I wanted the contributors to weigh in on the discussion in case I missed anything. Here are my reservations; maybe someone can dispel them for me:
only for logs
uses a vendor's opinionated distribution of the otel collector
complex abstraction over opentelemetry
not heavily adopted (only a few contributors)
does not provide all the bells and whistles of otel (our tenants may need more control than the CR abstraction allows)
our use-case
We have hundreds of independent tenant namespaces, with more onboarding all the time. We have run into the noisy-neighbor problem not just with logs, but also with metrics and traces. High-cardinality metrics that some tenants rely on are especially problematic.
We migrated off the fluentd/fluent-bit Custom Resource approach and now ship logs and all other telemetry with a centralized OTel collector and DaemonSet. With hundreds of tenants, this puts the responsibility on cluster admins to configure a one-size-fits-all centralized collector for logs, metrics, and traces, which is not feasible and has resulted in us either shipping too much data or dropping/sampling telemetry that tenants depend on.
So, we're looking for a self-service way for our tenants to process and ship their own telemetry. Our current architecture is the following:
We have a few clusters where they've written their own operator/controller to do something like this:
