
Plots show excessive amounts of resources #187

@hjjvandam

I am running some workflows on Crusher. The stage with the largest number of tasks runs 64 of them, each using 1 CPU core. The performance-analysis plots suggest, however, that around 1000 cores were reserved for this workflow. With 64 CPU cores and 4 GPUs per node, you only get a number that large if the node allocation corresponded to 1 GPU per task, i.e. reserving 16 nodes (1024 cores) for 64 single-core tasks. I hope that the code isn't actually doing that and that just the plotting is off.
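For reference, here is the back-of-the-envelope arithmetic behind that hypothesis as a small Python sketch (not project code; the 1-GPU-per-task reservation is the assumption being questioned above):

    # Hypothetical check: what allocation would result if the scheduler
    # reserved 1 GPU per task, as suspected above?
    tasks          = 64  # single-core tasks in the largest stage
    cores_per_node = 64  # Crusher: CPU cores per node
    gpus_per_node  = 4   # Crusher: GPUs per node

    nodes_needed   = tasks / gpus_per_node           # 64 / 4  = 16 nodes
    cores_reserved = nodes_needed * cores_per_node   # 16 * 64 = 1024 cores

    print(f"{nodes_needed:.0f} nodes -> {cores_reserved:.0f} cores reserved")
    # 16 nodes -> 1024 cores, which matches the ~1000 cores shown in the plots
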

The performance data is stored at

/lustre/orion/world-shared/chm136/re.session.login2.hjjvd.019706.0000

I have copied the performance plots into the same directory.

The versions of the RADICAL-Cybertools packages are:

(pydeepdrivemd) [hjjvd@login2.crusher test]$ pip list | grep radical
radical.analytics            1.43.0
radical.entk                 1.43.0
radical.gtod                 1.43.0
radical.pilot                1.43.0
radical.saga                 1.43.0
radical.utils                1.44.0

The code I am running lives at

git@github.com:hjjvandam/DeepDriveMD-pipeline.git

in the feature/nwchem branch. The job I am running is specified in https://github.com/hjjvandam/DeepDriveMD-pipeline/blob/feature/nwchem/test/bba/molecular_dynamics_workflow_nwchem_test/config.yaml. Please let me know if you need any further information.
