NVIDIA-smi ships with NVIDIA GPU display drivers on Linux. Nvidia-smi can report GPU status and utilization information.

## Useful commands

## Querying GPU Status

These are NVIDIA’s high-performance compute GPUs, and they provide a good deal of health and status information.

### To list all available NVIDIA devices, run:

```
# nvidia-smi -L
GPU 0: Tesla V100-SXM2-32GB (UUID: GPU-5a80af23-787c-cbcb-92de-c80574883c5d)
GPU 1: Tesla V100-SXM2-32GB (UUID: GPU-233f07d9-5e4c-9309-bf20-3ae74f0495b4)
GPU 2: Tesla V100-SXM2-32GB (UUID: GPU-a1a1cbc1-8747-d8cd-9028-3e2db40deb04)
GPU 3: Tesla V100-SXM2-32GB (UUID: GPU-8d5f775d-70d9-62b2-b46c-97d30eea732f)
```

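If the device list is needed programmatically (for example, to map indices to UUIDs before setting `CUDA_VISIBLE_DEVICES`), the `-L` output is easy to parse. A minimal Python sketch, exercised here against the sample output above rather than a live GPU; the helper name `list_gpus` is our own:

```python
import re
import subprocess

def list_gpus(text=None):
    """Parse `nvidia-smi -L` output into (index, name, uuid) tuples."""
    if text is None:
        # No sample supplied: run the real command (requires an NVIDIA driver).
        text = subprocess.run(["nvidia-smi", "-L"],
                              capture_output=True, text=True, check=True).stdout
    pattern = re.compile(r"GPU (\d+): (.+) \(UUID: (GPU-[0-9a-f-]+)\)")
    return [(int(m.group(1)), m.group(2), m.group(3))
            for m in pattern.finditer(text)]

# Sample lines taken from the output shown above.
sample = """\
GPU 0: Tesla V100-SXM2-32GB (UUID: GPU-5a80af23-787c-cbcb-92de-c80574883c5d)
GPU 1: Tesla V100-SXM2-32GB (UUID: GPU-233f07d9-5e4c-9309-bf20-3ae74f0495b4)
"""
print(list_gpus(sample))
```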
### To list certain details about each GPU, try:

```
nvidia-smi --query-gpu=index,name,uuid,serial --format=csv

0, Tesla K40m, GPU-d0e093a0-c3b3-f458-5a55-6eb69fxxxxxx, 0323913xxxxxx
1, Tesla K40m, GPU-d105b085-7239-3871-43ef-975ecaxxxxxx, 0324214xxxxxx
```

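The CSV format is convenient for scripting. A small Python sketch that parses the query rows above into dictionaries; the field list mirrors the `--query-gpu` arguments, and the sample rows (masked serials included) are copied verbatim from the output shown:

```python
import csv
import io

def parse_gpu_csv(text, fields=("index", "name", "uuid", "serial")):
    """Parse headerless CSV rows from `nvidia-smi --query-gpu=...`."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [dict(zip(fields, row)) for row in reader if row]

sample = """\
0, Tesla K40m, GPU-d0e093a0-c3b3-f458-5a55-6eb69fxxxxxx, 0323913xxxxxx
1, Tesla K40m, GPU-d105b085-7239-3871-43ef-975ecaxxxxxx, 0324214xxxxxx
"""
for gpu in parse_gpu_csv(sample):
    print(gpu["index"], gpu["name"], gpu["serial"])
```

Passing `--format=csv,noheader,nounits` to nvidia-smi suppresses the header row and unit suffixes, which keeps parsing like this trivial.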
To monitor overall GPU usage with 1-second update intervals:

```
nvidia-smi dmon

# gpu   pwr gtemp mtemp    sm   mem   enc   dec  mclk  pclk
# Idx     W     C     C     %     %     %     %   MHz   MHz
    0    43    35     -     0     0     0     0  2505  1075
    1    42    31     -    97     9     0     0  2505  1075
```

(In this example, one GPU is idle and the other has 97% of its CUDA SM "cores" in use.)

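Because `dmon` prints fixed columns, a captured log of its output is easy to post-process. A Python sketch (the helper name and the 50% threshold are our own choices) that flags busy GPUs, using the two sample rows above:

```python
def busy_gpus(dmon_text, sm_threshold=50):
    """Return indices of GPUs whose 'sm' column exceeds the threshold.

    Column order assumed from the dmon header shown above:
    gpu pwr gtemp mtemp sm mem enc dec mclk pclk
    """
    busy = []
    for line in dmon_text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and the two '#' header lines
        cols = line.split()
        idx, sm = int(cols[0]), int(cols[4])
        if sm > sm_threshold:
            busy.append(idx)
    return busy

sample = """\
# gpu   pwr gtemp mtemp    sm   mem   enc   dec  mclk  pclk
# Idx     W     C     C     %     %     %     %   MHz   MHz
    0    43    35     -     0     0     0     0  2505  1075
    1    42    31     -    97     9     0     0  2505  1075
"""
print(busy_gpus(sample))  # → [1], the 97%-busy GPU
```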
To monitor per-process GPU usage with 1-second update intervals:

```
nvidia-smi pmon
```

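For scripted per-process accounting, nvidia-smi also exposes a CSV query, `--query-compute-apps=pid,process_name,used_gpu_memory --format=csv,noheader`. A hedged Python sketch of parsing that output; the sample row here is hypothetical, invented purely for illustration:

```python
import csv
import io

def parse_compute_apps(text):
    """Parse rows of `nvidia-smi --query-compute-apps=pid,process_name,
    used_gpu_memory --format=csv,noheader` into (pid, name, mem_mib) tuples."""
    rows = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:
            continue
        pid, name, mem = row
        rows.append((int(pid), name, int(mem.split()[0])))  # "1024 MiB" -> 1024
    return rows

# Hypothetical sample row, for illustration only.
sample = "12345, python, 1024 MiB\n"
print(parse_compute_apps(sample))  # → [(12345, 'python', 1024)]
```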