sending_queue is full despite large queue size #9009
Comments
A 10,000 span queue provides 20 seconds of runway at 500 spans per second. The default retry settings used by the queued retry helper are a 5s initial interval and a 30s max interval, with a 5 minute maximum elapsed time. Your queue would need to be at least 150,000 to accommodate 500 inbound spans per second over a five-minute period. Realistically, though, even that isn't enough at that rate and timeout duration, since spans will not be cleared from the queue fast enough.
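For anyone following along, those retry and queue knobs live on the exporter via the exporterhelper settings. A rough sketch of what is being described (field names per the exporterhelper; the endpoint and queue_size values are illustrative, and defaults may differ between collector versions):

```yaml
exporters:
  jaeger:
    endpoint: jaeger-collector:14250   # illustrative endpoint
    tls:
      insecure: true                   # only for non-TLS test setups
    retry_on_failure:
      enabled: true
      initial_interval: 5s             # first retry after 5s
      max_interval: 30s                # back-off between attempts caps at 30s
      max_elapsed_time: 300s           # a batch is dropped after 5 minutes of retries
    sending_queue:
      enabled: true
      num_consumers: 10                # parallel senders draining the queue
      queue_size: 10000                # the size discussed in this issue
```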
What do you mean by "seconds of runway"? Are you saying that the queue is keeping things around for 5 minutes? I thought when the backend was functional, the queue should remain drained at all times. Also, if you are keeping them around for 5 minutes, why wouldn't 150,000 be enough? How do you know how fast the spans will be cleared from the queue?
I mean that if the arrival rate is 500/s and the exit rate is 0/s, the queue will be full in 20s. The queue will retain items until they are consumed. The consumer in this case may only consume items at a rate of 1 every 5 minutes. That would be pathological behavior, only expected when the backend is unavailable or severely degraded. It is, however, possible for the exit rate to be less than the arrival rate, at which point the queue will fill up.
As for why 150,000 wouldn't be enough if the exit rate were 1 per 5 minutes, Little's Law tells us the answer. The queue would be full by the time the first item was processed if the arrival rate were 500/second. By the time the second item was processed it would still be full, and 150,000 items would have been rejected. Only when the exit rate is greater than or equal to the arrival rate will a queue not eventually fill up and reject new arrivals. Spans will be cleared from the queue as fast as they can be sent by the exporter; in this case that is likely governed by how fast your Jaeger instance can accept new spans.
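Making that arithmetic explicit with the numbers from this thread (Little's Law: items in the system = arrival rate × time each item spends in the system):

$$
\frac{10{,}000\ \text{spans}}{500\ \text{spans/s}} = 20\ \text{s of runway},
\qquad
L = \lambda W = 500\ \text{spans/s} \times 300\ \text{s} = 150{,}000\ \text{spans}
$$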
I did just try it with a 150,000 queue size, and it worked for 8 minutes, then returned the old error. From this I believe that has to be the case; the exit rate is probably less than the arrival rate. Is there some way to monitor the exit rate to see what the limits are? Are there any guidelines for expected throughput? I'd ideally like to be able to process thousands of traces per second; is that even possible?
The collector emits some metrics on port 8080. It should be possible to process thousands of spans per second, but probably not with 200 millicores of CPU allocated to your Jaeger service instance. Ensure that your systems are appropriately sized, and consider horizontally scaling your collector instances.
Thank you, that's very helpful! I'll try to figure out what the rates are, then. Is there anything I have to do to enable that? I added 8080 to the list of ports in my Kubernetes service, but I'm getting connection refused when I try to access it.
I believe part of the issue here might be a lack of batching after the groupbytrace processor.
Actually, the port number is :8888 :)
This is why I should never trust my memory!
Sorry for the late reply! I am still having a bit of trouble accessing the statistics. When I go to :8888, I get 404 Page Not Found. This is the same as what I got with zpages, but I fixed that by going to specific endpoints. Is there analogous documentation on how to find the specific metrics?
Try the /metrics endpoint on that port: localhost:8888/metrics (given that the service is exposed to your localhost, perhaps via port forwarding).
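For reference, the collector's own metrics endpoint is configured under service.telemetry; a minimal sketch (the address shown is the usual default, and the otelcol_* metric names in the comments are the ones recent versions expose, so they may vary by version):

```yaml
service:
  telemetry:
    metrics:
      address: 0.0.0.0:8888   # then scrape http://<collector-host>:8888/metrics
      # Useful series to compare:
      #   otelcol_receiver_accepted_spans     - spans coming in
      #   otelcol_exporter_sent_spans         - spans going out
      #   otelcol_exporter_send_failed_spans  - spans the exporter could not send
      #   otelcol_exporter_queue_size         - current sending_queue depth
```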
I got it! From this I believe there are many, many spans received by the receiver:
And far fewer sent by the exporters:
This seems pretty bad; I seem to be overloading it by a factor of 3. Can you explain what exactly is happening with the sending_queue here? I understand I am overloading it, but would increasing the number of consumers help? Are the requests to Jaeger pipelined in any way?
These are relatively low numbers (assuming the collector was operating for more than a couple of seconds), so it's surprising to see issues at this volume of data. I will try to reproduce this locally. What are the parameters that you supply to your version of tracegen?

Also, I noticed that your collector version is rather old (v0.37.1 was released approximately 6 months ago). I am not sure if there have been relevant bugfixes since then, but I would consider updating it. Then you will also be able to use the groupbyattrs processor, which helps with compacting Jaeger data and might reduce the number of export requests.
I have the following for my deployment file for tracegen; I made the image public, so this should work out of the box:
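(The deployment file itself didn't survive in this thread. For anyone reproducing, a minimal Deployment along these lines should be close; the image name, collector address, and flag values below are placeholders, and a custom tracegen build may use different flags.)

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tracegen
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tracegen
  template:
    metadata:
      labels:
        app: tracegen
    spec:
      containers:
        - name: tracegen
          image: example/tracegen:latest            # placeholder image
          args:
            - "-otlp-endpoint=otel-collector:4317"  # placeholder collector address
            - "-otlp-insecure"
            - "-rate=500"                           # traces per second, as in this issue
            - "-duration=10m"
```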
It was only working for a few seconds, but I noticed that the logging exporter was able to keep up with the load much better than Jaeger was. Also, I might be using num_consumers incorrectly; I put 100 in for Jaeger, but I'm not sure what exactly that does. If I increase it too much, I get an out of memory error, even with the memory_limiter processor enabled.

As far as the groupbyattrs processor goes, won't that get rid of the fact that the spans are grouped by trace? I am using the fact that they are grouped by trace for another exporter I want to add later. Right now I am planning on making one request per trace for that exporter; is this an infeasible design?
@jessberg I did a test (though locally, so over the local network stack), and I observed a couple of things.

First of all, you might want to check this metric:
It describes the current queue size. TL;DR: if it grows, the backend is not able to keep up with the requests coming from the collector. This might be due to a number of reasons, though.

Secondly, I suspect the problems you're observing are due to request fragmentation, which is something that can be amplified by the Jaeger data model. The solution I mentioned a couple of days ago still holds (I slightly updated it) :) I did a test, and while I haven't observed queue issues with your setup, adding the batch processor and groupbyattrs resulted in a 3x reduction in OpenTelemetry Collector CPU usage. You need a slightly more recent OpenTelemetry Collector version for that (v0.46.0 or newer).
After a few seconds, I get:
That seems to confirm it is the queue that's the problem! What is request fragmentation? I know that batching and groupbyattrs might help, but with groupbyattrs I am still concerned about negating the effect of groupbytrace for my other exporter. I did try batching with the config you put above, along with extra millicores for Jaeger as Aneurysm9 recommended; these are the results:
So it looks like that is enough to solve this problem; however, I'm still getting this error message in the logs:
This happens roughly every 15 lines of the default logging exporter's output. I also have
from the run with batching. So I believe from this that perhaps it's not enough to just batch, and maybe you need groupbyattrs as well. However, I still don't really want to get rid of groupbytrace, since I need it for my other exporter; does groupbyattrs negate the effects of groupbytrace?

I guess more broadly I am still a bit confused about the model for the exporters. There are consumers that try to export to the backends, but do these consumers block waiting for the backend to respond? Are they pipelined? What is consuming from the sending queue? I'm trying to figure out whether, when the backend is unable to keep up with the load, that is because the exporter is too slow to export, or because the backend won't take that many requests. Also, why is it that the groupbyattrs processor helps? Sorry for the deluge of questions, I just want to make sure I understand the system because I'll have to run it at other loads as well :)
TL;DR: the Jaeger exporter creates a separate batch/request for each ResourceSpans object. So when there are multiple ResourceSpans objects in a batch (even if they effectively point to the same Resource attribute values), it will create multiple requests. By using the groupbyattrs processor, ResourceSpans that share the same Resource are merged, so the exporter ends up sending fewer, larger requests.
If the spans are pointing to the same resource, then it's not going to affect them. A question, though: why do you want to have the traces grouped together? That is intended for processing them in the OpenTelemetry Collector pipeline, e.g. via the tailsamplingprocessor.
It's important to note that while batching will help a lot, it's not enough, due to the reasons I mentioned above. You should use batching + the groupbyattrs processor (in that sequence). Batching will combine sets of spans together, and groupbyattrs will ensure the data model is optimized.
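A sketch of those two processors (assuming the contrib distribution, v0.46.0 or newer; groupbyattrs with no keys configured simply re-groups spans under matching Resources, i.e. compaction):

```yaml
processors:
  batch:
    send_batch_size: 500   # combine spans before exporting; tune to your span rate
    timeout: 1s
  groupbyattrs: {}         # no keys -> compact ResourceSpans that share the same Resource
```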
That makes sense for the general case; however, the traces in my trace generator have one span per resource, so I'm not sure it would result in any savings there. I've tried it anyway with the latest build; the config above gave the error:
I added

The exporter I'm creating goes to Google Cloud Storage, and I am hashing the structure of the trace as part of the trace's metadata. So when I store a trace, I need to have it all together in the collector to create the hash, and then I can store the spans individually. Right now, I have it creating a separate request per span, just like Jaeger does, so I thought they'd scale similarly. I got the same error on both exporters (Jaeger and my custom exporter), but I reproduced it with just Jaeger here, because I figured I should only ask about the code in this repo. The config:
@pmm-sumo, do you also see this? I'm focusing on flaky tests this week, but if you are also seeing issues with the groupbytrace, I'll add this to my queue.
Yep, still relevant, even more so when used in conjunction with the groupbytrace processor, which will break the batches down into smaller chunks.
This must have been some older build; the change has been there since v0.46.0. Could you double-check? I was using it for some time and did not see this error.
It's tricky, but the above is in the wrong order.
Can you try this config @jessberg?
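(The config body wasn't captured here; based on the ordering discussed above, the pipeline presumably looked something like the sketch below. The receiver, processor, and exporter names are illustrative.)

```yaml
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, groupbytrace, batch, groupbyattrs]
      exporters: [jaeger]
```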
I think we have one more configuration iteration to go through here. I believe this problem is not groupbytrace-related. I will try to do some more tests later this week if the config change does not help.
I just tried that config; it worked!! Just so I understand what's going on here, was the issue that having a separate ResourceAttribute for every span took too much memory, and that clogged up the sending_queue? What exactly was the problem?
Great to hear that @jessberg! What I believe was happening is that there were a lot of small requests, due to data fragmentation and Jaeger exporter specifics. Regardless of the sending_queue size, there were just too many small requests.

To give you some numbers, let's assume that you were generating 500 traces per second, each with 10 spans on average (so 5000 spans/s). Typically, with the batch processor set to, say, 500 spans per batch and the otlp exporter employed, one could anticipate 10 requests per second, each with 500 spans. If there were one ResourceSpans per such batch (which could be the case if there's only one underlying Resource generating the data), the number would be the same with the Jaeger exporter. However, if the spans were broken down into more ResourceSpans, the Jaeger exporter would create correspondingly more requests. In the extreme case, this might mean 5000 requests/s rather than the 10 requests/s we saw before. Making many small requests is much less efficient than making fewer large requests. What I believe happens here is the following:
Having said that, the example discussed in this issue might be an extreme one (only one Resource present). I think in very large environments it might be tricky, and the number of unique Resources per trace might be comparable to the number of spans in it.
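Putting the two extremes from the numbers above side by side:

$$
\frac{5000\ \text{spans/s}}{500\ \text{spans/batch}} = 10\ \text{requests/s}
\quad\text{vs.}\quad
5000\ \text{requests/s when each span ends up in its own ResourceSpans}
$$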
This makes a lot of sense! Thank you for explaining it! I guess as a meta point, one thing I didn't really understand before is that it isn't super feasible to have a request per span. And then the reason this works so much better is that Jaeger does a request per ResourceSpans rather than per span.
Yeah, I think that's a fair way to look at it.
Thanks so much for your help!! I will mark this as resolved.
@pmm-sumo Sorry, just one last question. Why is it so infeasible to have so many requests? Is it a limit on the amount of time we keep the requests hanging around? Surely if you can keep traces around for 10 seconds in the groupbytrace processor, you should be able to keep them around while you wait for a response from the storage backend.
@pmm-sumo, is there something we should be doing as part of the Jaeger exporter here natively? Or perhaps something we should recommend as part of its readme?
@jessberg the problem is with the volume of new requests each second. In the extreme example I put above, we had 10 large requests vs 5000 very small requests per second. IIRC, in your exporter you had 100 consumers set. This means it has a budget of 100 request-seconds per second (within one second the consumers can handle up to 100 requests taking 1s each, or 1000 requests taking 100ms each).

Let's say that the overall (network + server) latency for handling a request is 100ms. Of course, for a large request it will be higher, but even if it were 10s, it's still sustainable: 100 consumers can keep up with 10 reqs/s * 10s. For the smaller requests, this is a much bigger problem: 5000 reqs/s * 100ms is 5x above the consumers' capacity. Each second they will handle only 1000 of them, so they will not be able to keep up with all the export requests and eventually the queue will fill up.
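The same capacity check as arithmetic:

$$
\frac{100\ \text{consumers}}{0.1\ \text{s per request}} = 1000\ \text{requests/s} \ \ll\ 5000\ \text{requests/s arriving}
$$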
Good call @jpkrohling. I was thinking about this, and I think we could add a recommendation for using the groupbyattrs processor to the Jaeger exporter readme.
I think it's a good idea.
I have a feeling that this is somewhat specific to Jaeger, but we can certainly start with Jaeger and share the findings with the rest of the SIG.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments. Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
I previously linked two issues here as potentially related, but they are not.
Do I read it right that this issue is actually now a documentation one? We should provide users with information on how to use the groupbyattrs processor to compact the number of requests made by the underlying exporter, right?
Yes. Coming from the "sending_queue is full" message, it was very difficult to figure out that the groupbytrace processor was interacting badly with the Jaeger exporter and that I needed groupbyattrs to fix it. I think a note on the Jaeger exporter to always use it with groupbyattrs would fix that.
Describe the bug
I am trying to run the collector with the groupbytrace processor and the Jaeger exporter. I have an application that is producing 500 traces per second, and I have set the sending_queue to a size of 10,000. In my view, that should never overwhelm the queue, because 500 traces per second, even with random bursts and noise, shouldn't overwhelm a queue of size 10,000. However, I repeatedly get the error
and
These errors remain when I have my application set to produce 500 traces per second, but go away when I lower the load to 300 traces per second. Because of that, and because I see in the logs:
I am fairly confident that Jaeger is configured correctly.
Steps to reproduce
I have a version of the tracegen application that produces 12-span traces. It is available here. I am running it with Jaeger and the OpenTelemetry Collector. My Jaeger and OTel config files are below.
What did you expect to see?
I expected the collector to work without throwing sending_queue full errors, because I have set the sending_queue size very high.
What did you see instead?
The collector repeatedly throws out data.
What version did you use?
Version: 0.37.1
What config did you use?
The OpenTelemetry config file:
The Jaeger config file:
Environment
I'm running these as separate containers on GCP.