tailsamplingprocessor ratelimiting works incorrectly with loadbalancingexporter #1629

Open
pkositsyn opened this issue Nov 18, 2020 · 1 comment
Labels: bug, never stale, priority:p3, processor/tailsampling, release:after-ga, spec:trace

Comments

@pkositsyn
Contributor

pkositsyn commented Nov 18, 2020

Describe the bug
I have been thinking about rate limiting in tail sampling, and it does not seem to work as intended right now. I have not run any tests, but this is a purely conceptual issue.

Imagine you want to sample traces with a rate limit of 1000 spans/s, and you usually receive 1 trace/s containing 500 spans. With one collector, the trace is sampled and everything is fine. After scaling up to 10 collectors (for whatever reason), you need to make sure that the trace is still sampled and that the total rate limit for this kind of trace stays the same. Keeping the old limit of 1000 spans/s on each collector results in an aggregate limit of 10,000 spans/s across all collectors. Cutting the limit to 100 spans/s per collector results in no traces being sampled, because a single 500-span trace already exceeds it. So either I misunderstand horizontal scaling as a way to increase all resources in equal proportion, or this is an actual problem, assuming this case is a realistic one.
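To make the arithmetic above concrete, here is a minimal sketch in Python. It is not the tailsamplingprocessor's actual code: the `Collector` class, the `sampled_spans` helper, the "admit a trace only if it fits in the remaining per-second budget" rule, and the round-robin routing (standing in for the loadbalancingexporter) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Collector:
    """Hypothetical per-second span budget (assumption, not the real processor)."""
    spans_per_second: int
    used: int = 0

    def offer(self, trace_spans: int) -> bool:
        # Admit the whole trace only if it fits in what is left of this second's budget.
        if self.used + trace_spans <= self.spans_per_second:
            self.used += trace_spans
            return True
        return False


def sampled_spans(num_collectors: int, per_collector_limit: int, traces: List[int]) -> int:
    """Total spans sampled by the fleet within a single second.

    Traces are spread round-robin over the collectors, a stand-in for
    trace-ID-based routing.
    """
    collectors = [Collector(per_collector_limit) for _ in range(num_collectors)]
    total = 0
    for i, spans in enumerate(traces):
        if collectors[i % num_collectors].offer(spans):
            total += spans
    return total


one_trace = [500]      # the usual load: one 500-span trace per second
burst = [500] * 40     # a burst large enough to saturate every collector

print(sampled_spans(1, 1000, one_trace))   # single collector, usual load
print(sampled_spans(1, 1000, burst))       # single collector, burst
print(sampled_spans(10, 1000, burst))      # 10 collectors, old limit kept
print(sampled_spans(10, 100, one_trace))   # 10 collectors, limit divided by 10
```

Under these assumptions, the single collector samples the usual trace and caps a burst at 1000 spans; ten collectors with the old limit let ten times as many spans through during a burst; and ten collectors with the divided limit never sample the 500-span trace at all.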

This problem seems unsolvable without at least knowing the number of backends, because a single collector should behave differently in the one-backend and many-backend cases.

Still, could there be something like a config parameter for the number of backends that switches the behaviour? In any case, I expect the solution to be rather complicated, and possibly heuristic, if it has to work without synchronization (the collector is designed to avoid external communication).

What did you expect to see?
I expect to be able to configure the collector explicitly for this need. If at most 1000 spans/s should be sampled for some filter across many collector backends, there should be a clear way to achieve this.

Setting the rate limit to 1000 / numCollectors is, first, not equivalent, as the example above shows, and second, requires manually recalculating and updating the numbers throughout the tailsamplingprocessor config.

Note that probability-based sampling does not have this issue.

pkositsyn added the bug label Nov 18, 2020
dyladan referenced this issue in dynatrace-oss-contrib/opentelemetry-collector-contrib Jan 29, 2021
@github-actions
Contributor

This issue has been inactive for 60 days. It will be closed in 120 days if there is no activity.

github-actions bot added the Stale label Oct 24, 2022
jpkrohling added the processor/tailsampling and never stale labels and removed the Stale label Nov 30, 2022