docs/Documentation/Slurm/interactive_jobs.md
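An interactive session is requested with `salloc`. As a rough sketch only (the node count, time, and account below are placeholders, not values from this page), a two-node request might look like:

```
salloc --nodes=2 --time=01:00:00 --account=<project-handle>
```

While the request is queued and then granted, `salloc` prints output like the following: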
```
salloc: job 512998 queued and waiting for resources
salloc: job 512998 has been allocated resources
salloc: Granted job allocation 512998
salloc: Waiting for resource configuration
salloc: Nodes x1008c7s6b1n0,x1008c7s6b1n1 are ready for job
[hpc_user@x1008c7s6b1n0 ~]$
```
You can view the nodes that are assigned to your interactive jobs using one of these methods:
```
$ echo $SLURM_NODELIST
x1008c7s6b1n[0-1]
$ scontrol show hostname
x1008c7s6b1n0
x1008c7s6b1n1
```
Once a job is allocated, you are automatically connected ("ssh'd") to the first allocated node, so you do not need to ssh to it manually after it is assigned. If you requested more than one node, you may ssh to any of the additional nodes assigned to your job.
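For example, using the node names from the allocation above, you could hop to the second node with a plain ssh (a sketch; no extra setup is needed between your allocated nodes):

```
[hpc_user@x1008c7s6b1n0 ~]$ ssh x1008c7s6b1n1
[hpc_user@x1008c7s6b1n1 ~]$ hostname
x1008c7s6b1n1
```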
Type `exit` when finished using the node.
Interactive jobs are useful for many tasks. For example, to debug a job script, users may request a set of nodes for interactive use. When the job starts, the user "lands" on a compute node with a shell prompt and can run the script being debugged many times without waiting in the queue each time.
A debug job makes up to two nodes available with shorter wait times when the system is heavily utilized. Request one by specifying `--partition=debug`. For example:
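A minimal debug request might look like the following sketch (the time and account values are placeholders, not values from this page):

```
salloc --partition=debug --time=01:00:00 --account=<project-handle>
```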
Add `--nodes=2` to claim two nodes.

Add `--gpus=#` (substituting the number of GPUs you want to use) to claim a debug GPU node. Note that there are fewer GPU nodes in the debug queue, so the wait time may be longer.

A debug job on any node type is limited to a maximum walltime (`--time`) of 1 hour, and only one debug job at a time is permitted per person.
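Combining these options with the sketch above (time and account are again placeholders):

```
salloc --partition=debug --nodes=2 --time=01:00:00 --account=<project-handle>
salloc --partition=debug --gpus=2 --time=01:00:00 --account=<project-handle>
```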
The above salloc command will log you into one of the two nodes automatically. You can then launch your software using an `srun` command with the appropriate flags, such as `--ntasks` or `--ntasks-per-node`:
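For instance (a sketch only; the executable name and task counts are placeholders):

```
srun --ntasks=8 ./my_program
srun --ntasks-per-node=4 ./my_program
```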
With X11 forwarding in place, GUI applications can be run directly from the compute node prompt:

```
[hpc_user@x1008c7s6b1n0 ~]$ #(your compute node x1008c7s6b1n0, now X11-capable)
[hpc_user@x1008c7s6b1n0 ~]$ xterm #(or another X11 GUI application)
```
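Reaching that X11-capable prompt involves connecting with X11 forwarding enabled and then requesting the allocation with X11 support. One possible sequence is sketched below; Slurm's `--x11` option and the time/account values are assumptions here, while the `ssh -Y` host comes from the note that follows:

```
# From your local machine (with an X server running), forward X11 to a login node:
ssh -Y kestrel.hpc.nrel.gov
# Request an interactive node with X11 forwarding enabled (placeholders for time/account):
salloc --x11 --time=01:00:00 --account=<project-handle>
```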
From a Kestrel-DAV FastX remote desktop session, you can omit the `ssh -Y kestrel.hpc.nrel.gov` above, since your terminal in FastX will already be connected to a DAV (kd#) login node.

## Requesting Interactive GPU Nodes
114
111
-
The following command requests interactive access to GPU nodes:
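A sketch of such a request (the GPU count, time, and account are placeholders, and additional flags such as a partition may be needed on your system):

```
salloc --gpus=1 --time=01:00:00 --account=<project-handle>
```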