measure task resources #1491
Conversation
Implementation of disk usage is yet to be added, as it would rely on the approval of docker/docker-py#3370.
As discussed on Slack, I propose we wait a few days for the https://github.com/docker/docker-py/ maintainers to give us an answer. If they don't reply soon enough, I propose two possible plans:
@elfkuzco looks like I was right to be concerned about not getting any feedback on your upstream PR. Please advise which plan B (among the two I've proposed, or one you can propose) makes more sense to you, so that we can move on and have the CPU measure and at least a first estimate of disk used.
This would be simpler to implement. Plus, given almost everything is written to the mount point, I don't know if there's going to be any real metric obtained from the writable layer. Possibly .pyc or __pycache__ files, but those really shouldn't be big enough, right?
Let's go for this alternative: consider only the mount point for the time being, and open an issue about the fact that we might want to better track disk usage. The goal would be to capture not only the writable layer (which is supposed to be small in general, but this is not the case on all scrapers, not even speaking about bugs) but also the image size itself (which then gives a slight overestimation of disk usage, since the image is shared across tasks).
Force-pushed 7bfc82c to 6165814

Updated PR description.
Codecov Report

✅ All modified and coverable lines are covered by tests.

@@           Coverage Diff            @@
##             main    #1491    +/-  ##
=========================================
+ Coverage   83.38%   83.39%   +0.01%
=========================================
  Files          91       91
  Lines        4399     4403       +4
  Branches      470      470
=========================================
+ Hits         3668     3672       +4
  Misses        606      606
  Partials      125      125

View full report in Codecov by Sentry.
benoit74
left a comment
Small remarks + I need more data to confirm the average we are computing is really close to the average CPU consumption. Since we have tasks which might run for hours, I doubt an EWMA with α = 0.25, updated every minute, will really represent something close to the average. I might be wrong; at least I need to be convinced 😄
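To illustrate the concern above, here is a minimal sketch (hypothetical, not the PR's actual code) of an EWMA with α = 0.25 updated once per minute. A sample taken k updates ago carries weight α·(1−α)^k, so after roughly 16 minutes a sample contributes less than 1%: the EWMA tracks *recent* load rather than the whole-run average on hours-long tasks.

```python
def ewma_update(current: float, sample: float, alpha: float = 0.25) -> float:
    """One EWMA step: blend the new sample into the running value."""
    return alpha * sample + (1 - alpha) * current

# A task that burns 100% CPU for its first hour, then idles for an hour
# (one sample per minute):
samples = [100.0] * 60 + [0.0] * 60
value = 0.0
for s in samples:
    value = ewma_update(value, s)

# By the end, the EWMA has all but forgotten the busy first hour,
# while the true average over the run is 50%.
print(value, sum(samples) / len(samples))
```

This is why an EWMA is a good "current load" indicator but can diverge badly from the run-wide mean on long tasks.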
benoit74
left a comment
LGTM
Force-pushed 908e268 to b3a9a34
Rationale
This PR adds support for measuring the resources used by the scraper. For CPU stats, it uses an Exponentially Weighted Moving Average (EWMA) of the CPU usage percentage, along with the maximum CPU percentage used.
Max disk usage is computed by summing the sizes of the files in the scraper's mount directory.
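A minimal sketch of the mount-directory approach described above (assumed shape only, not necessarily the PR's exact code): walk the mount directory and sum regular-file sizes; the maximum over periodic samples gives max disk usage.

```python
import os

def directory_size(path: str) -> int:
    """Total size in bytes of all regular files under path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symlinks to avoid counting targets twice or leaving the tree.
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total
```

Note that, as discussed earlier in this thread, this captures only what is written to the mount point, not the container's writable layer or the image itself.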


Changes
This closes #1423