Slurm partition added on `turing01`

[[_TOC_]]

## General Information
Slurm partition added on `turing01`:

```
PartitionName=gpus
...
DefMemPerCPU=8192 MaxMemPerNode=368640
```
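The snippet above shows only part of the partition definition. Assuming the standard Slurm client tools are available on the login node, the full, current definition can be checked with:

```
scontrol show partition gpus
```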
## Requesting GPUs
To request GPU nodes:

| Option | Resulting allocation |
| --- | --- |
| `--gres=gpu:1` | 1 node with 1 core and 1 GPU card |
| `--gres=gpu:2 -c2` | 1 node with 2 cores and 2 GPU cards |
| `--gres=gpu:k80:3 -c3` | 1 node with 3 cores and 3 GPU cards, specifically Tesla K80 cards |

Note that it is always best to request at least as many CPU cores as GPUs.
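For example, a minimal batch script requesting one GPU on this partition might look like the following sketch (the job name, walltime, and `nvidia-smi` check are illustrative, not requirements):

```
#!/bin/bash
#SBATCH --job-name=gpu-test     # illustrative job name
#SBATCH --partition=gpus        # the GPU partition described above
#SBATCH --gres=gpu:1            # one GPU card
#SBATCH -c 1                    # at least as many CPU cores as GPUs
#SBATCH --time=01:00:00         # illustrative walltime

# Show which devices Slurm has exposed to this job
echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"
nvidia-smi
```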
The available GPU node configurations are shown here.
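One way to query the GPU node configurations directly (assuming the standard Slurm client tools) is:

```
sinfo -p gpus -N -o "%N %c %m %G"
```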
When you request GPUs, the system will set two environment variables; we strongly recommend you do not change these:

* `CUDA_VISIBLE_DEVICES`
* `GPU_DEVICE_ORDINAL`
To your application, it will look like you have GPUs 0, 1, ... (up to as many GPUs as you requested). So if, for example, there are two jobs from different users, the first requesting 1 GPU card and the second 3 GPU cards, and they happen to land on the same node gpu-08:
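A sketch of what each job would see in that case (the card counts follow the scenario above; the node only needs to have at least four cards):

```
Node gpu-08, two jobs sharing the node:

  Job A  (--gres=gpu:1)  ->  sees 1 device,  addressed as GPU 0
  Job B  (--gres=gpu:3)  ->  sees 3 devices, addressed as GPUs 0, 1, 2
```

Inside each job, `CUDA_VISIBLE_DEVICES` and `GPU_DEVICE_ORDINAL` describe only that job's cards, so both applications count their GPUs from 0 even though they are using different physical cards. You can check what your own job was given with `echo $CUDA_VISIBLE_DEVICES` from inside the job.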