increase memory of quast #1073
I hope this makes sense and helps (my statistics course was already 6 years ago):
using https://github.com/mira-miracoli/data_hacks/blob/patch-1/data_hacks/histogram.py
I don't have the time to set up the rules properly at the moment. Can we merge this PR for now? @bgruening
Can you please try that: https://github.com/usegalaxy-eu/infrastructure-playbook/pull/1073/files#diff-ff91c17e82694a84945958b09ddc38e4535d1f99ee1fb0ed594a8cd4fecceca7R733 Not very smart, but better than allocating too much memory to every run.
You mean coupling the memory to the input size? I think the problem is that the deciding factor is mainly the content of the bacterial community (i.e. many species lead to high RAM usage and few species to low RAM usage) ...
Yes, this should be possible. https://github.com/galaxyproject/tpv-shared-database/blob/main/tools.yml#L438
The problem with this tool is that memory usage is only weakly related to the input size, as shown in the stats figure. Since all jobs that reported issues were related to metagenomic analysis, this simple approach should work to increase the memory only for those jobs. Another input option that should be considered is the co-assembly vs. non-co-assembly option. In the long run, another approach could be to add inputs to the tool that help to allocate memory, i.e. run Kraken first (or count the number of contigs) and then decide, based on these metrics, how much memory the jobs should get.
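A rough sketch of what the long-run idea could look like, assuming the contig count were available as a job input. The function name `quast_mem_gb` and the `CONTIGS_PER_GB` scaling factor are hypothetical and not measured; 12 GB is the current default mentioned in this PR's description and 80 GB is the value proposed in this PR for metagenome jobs.

```python
from typing import Optional

BASE_GB = 12             # current default allowance (from the PR description)
METAGENOME_GB = 80       # value proposed in this PR for metagenome jobs
CONTIGS_PER_GB = 10_000  # hypothetical scaling factor, not measured


def quast_mem_gb(assembly_type: str, n_contigs: Optional[int] = None) -> int:
    """Pick a memory allowance in GB for a QUAST job (illustrative only)."""
    # Non-metagenome jobs keep the default allowance.
    if assembly_type.lower() != "metagenome":
        return BASE_GB
    # Without a contig count, fall back to the flat metagenome value.
    if n_contigs is None:
        return METAGENOME_GB
    # Otherwise scale with the contig count, capped at the metagenome value.
    return min(METAGENOME_GB, BASE_GB + n_contigs // CONTIGS_PER_GB)
```

With a real rule, the scaling factor would have to come from the job statistics rather than from a guess like the one above.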
Can an admin check what I did wrong? I used this as a template: https://github.com/galaxyproject/tpv-shared-database/blob/efd5b95033bb59fa66d2d5f0d0c43edce2a1c24b/tools.yml#L438
At an initial glance through the DB, I could not find a table that might contain the jobs/tools parameters individually or explicitly. However, the SQL query:
You can add in additional conditions, for example, to look for jobs where type metagenome was used (based on my understanding of the tool's source, if someone uses the type metagenome, the SQL query:
OK, found the table
I mapped the job ID to a different column in the Also, you can join the previously posted SQL query with this one as well,
files/galaxy/tpv/tools.yml
    - id: metagenome
      if: helpers.job_args_match(job, app, {'assembly': {'type': 'Metagenome'}})
      cores: 20
      mem: 80
Could it be assembly.type? (I only had a brief look at the wrapper, so I could also be wrong.)
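To make the question concrete, here is a minimal sketch of nested-subset matching. It is not TPV's actual `job_args_match` implementation, and the parameter layouts in the usage lines are assumptions about how the wrapper might expose its inputs. In this sketch a nested pattern only matches nested parameters, and string values are compared exactly, so the capitalisation of the type value matters too.

```python
def args_match(params: dict, pattern: dict) -> bool:
    """Return True if every key/value in `pattern` is found in `params`.

    Nested dicts are compared recursively; all other values must be equal.
    Illustrative helper only, not part of TPV.
    """
    for key, expected in pattern.items():
        if key not in params:
            return False
        actual = params[key]
        if isinstance(expected, dict):
            # A nested pattern needs a nested dict on the job side as well.
            if not isinstance(actual, dict) or not args_match(actual, expected):
                return False
        elif actual != expected:
            return False
    return True


# Hypothetical parameter layouts, for illustration only.
nested = {"assembly": {"type": "metagenome"}}
flat = {"assembly.type": "metagenome"}

print(args_match(nested, {"assembly": {"type": "metagenome"}}))
print(args_match(nested, {"assembly": {"type": "Metagenome"}}))
print(args_match(flat, {"assembly": {"type": "metagenome"}}))
```

Only the first call matches here: the second differs in case and the third uses a flattened key, so checking the stored job parameters for the exact key and value would settle which form the rule needs.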
Some QUAST jobs take forever. I assume this is due to the low memory allowance (12 GB so far: https://github.com/galaxyproject/tpv-shared-database/blob/3be0403ffc960effd180c65fa0e2242dfe5e6aa9/tools.yml#L2121C1-L2123C12). Ideally, I would like to work on a solution similar to #881, if an admin can run the query for me.