Data loss on stop #1547

Open
allenluce opened this issue Mar 30, 2018 · 5 comments
Labels
[deprecated] team/agent-core Deprecated. Use metrics-logs / shared-components labels instead..

Comments

@allenluce

Up to the last 15 seconds of aggregated data are lost when shutting down the statsd server with statsd.Stop(). Even with the aggregator's flush interval set quite low, a few seconds of data never get pushed to the backend.

Is there a recommended way to flush data to prevent this from happening?

@truthbk
Member

truthbk commented May 15, 2018

@allenluce you're 100% right. It looks like statsd.Stop() here doesn't flush whatever the aggregator contains at that point, so the process shuts down without sending those packets.

I don't believe we have a way around this at the moment. The flushes happen periodically, as you already know, so depending on when during the flush interval you request the stop, you might lose anywhere from about 1s to 15s of data. We'd have to add a final flush to the shutdown code. That would make the process teardown a little slower, but it does seem like the right thing to do. There are still things that can go wrong at the forwarder level, so we'd have to make this a best-effort thing.

We'll look into it. Thank you for bringing this up.

@truthbk truthbk added the [deprecated] team/agent-core Deprecated. Use metrics-logs / shared-components labels instead.. label May 15, 2018
@visciang

This issue is very annoying if you run the agent in a "side container" alongside an AWS Fargate Task (a short-lived "docker run"). When the main task ends, the agent container is stopped and it doesn't flush metrics / events / APM / etc.

The "sidecar" pattern only works for AWS Fargate Services (long-lived tasks).

As a workaround, we currently deploy a group of agents as an AWS Fargate Service, which our Tasks use to report Datadog metrics.

@baxang

baxang commented Jan 28, 2021

Seems like #4129 addressed this issue.

@sgnn7
Contributor

sgnn7 commented Jan 28, 2021

@allenluce / @visciang Can you try out the new version of the agent to see if this issue is resolved now?

@miketheman
Contributor

Seems similar to #3940

6 participants