The Slurm options `--mem`, `--mem-per-cpu` and `--mem-per-gpu` do not currently allow you to suitably configure the memory allocation of your job on Turing. The memory allocation is automatically determined by the number of reserved CPUs.

To adjust the amount of memory allocated to your job, you must adjust the number of CPUs reserved per task (or per GPU) by specifying the following option in your batch scripts, or when using `salloc` in interactive mode:

```
--cpus-per-task=...     # --cpus-per-task=1 by default
```
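
For example, a larger interactive reservation could be requested as in the minimal sketch below; the `--gres=gpu:1` syntax for requesting a GPU and the value of 12 CPU cores are illustrative assumptions to adapt to your own job:

```bash
# Minimal sketch of an interactive reservation (illustrative values):
# 1 task, 1 GPU, 12 CPU cores per task instead of the default single core.
salloc --ntasks=1 --gres=gpu:1 --cpus-per-task=12 --hint=nomultithread
```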
Be careful: `--cpus-per-task=1` is the default. If you do not modify this value, as explained below, you will not have access to as much memory per GPU as you could have, and this could quickly result in a memory overflow.
## On the default gpu partition
Each node of the default gpu partition on Turing offers 384 GB of usable memory. The memory allocation is automatically computed on the basis of:

- 8 GB per reserved CPU core if hyperthreading is deactivated (Slurm option `--hint=nomultithread`), as illustrated in the sketch below.
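
As a rough rule of thumb, assuming the 8 GB per reserved core figure above (hyperthreading deactivated), you can derive the number of cores to request from a target amount of memory; the sketch below uses a hypothetical 64 GB target:

```bash
# Sketch: derive --cpus-per-task from a target amount of memory (assumption: 8 GB per core).
MEM_GB=64                               # hypothetical target memory, in GB
CPUS_PER_TASK=$(( (MEM_GB + 7) / 8 ))   # ceiling division: 64 GB -> 8 cores
echo "--cpus-per-task=${CPUS_PER_TASK}" # prints: --cpus-per-task=8
```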
Each node of the default gpu partition is composed of 4 GPUs and 48 CPU cores: you can reserve, for instance, 1/4 of the node memory per GPU by reserving 12 CPU cores (i.e. 1/4 of the 48 CPU cores) per GPU:
```
--cpus-per-task=12     # reserves 1/4 of the node memory per GPU (default gpu partition)
```
In this way, you have access to 96 GB of memory per GPU if hyperthreading is deactivated (if not, half of that memory).
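
Putting this together, a submission script reserving one GPU with a quarter of the node memory on the default gpu partition could look like the sketch below; the job name, the partition name, the `--gres=gpu:1` syntax and the final `srun` command are illustrative assumptions to adapt to your own job:

```bash
#!/bin/bash
#SBATCH --job-name=gpu_mem_example   # hypothetical job name
#SBATCH --partition=gpu              # assumption: name of the default gpu partition
#SBATCH --ntasks=1                   # 1 task
#SBATCH --gres=gpu:1                 # assumption: 1 GPU requested via --gres
#SBATCH --cpus-per-task=12           # 12 cores -> 1/4 of the node memory (96 GB) for this GPU
#SBATCH --hint=nomultithread         # hyperthreading deactivated (8 GB per reserved core)

srun ./my_gpu_program                # hypothetical executable
```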