gpumonitor
gpumonitor gives you GPU usage statistics while your scripts and training runs execute, either from a standalone monitor or as TensorFlow or PyTorch Lightning callbacks.
Install it with pip:
pip install gpumonitor
import gpumonitor

# Start monitoring; stats are collected until stop() is called
monitor = gpumonitor.GPUStatMonitor(delay=1)
# Your instructions here
# [...]
monitor.stop()
monitor.display_average_stats_per_gpu()
The monitor keeps a running average of the GPU statistics it collects. To discard the accumulated average and start fresh, reset the monitor:
monitor = gpumonitor.GPUStatMonitor(delay=1)
# Your instructions here
# [...]
monitor.display_average_stats_per_gpu()
monitor.reset()
# Some other instructions
# [...]
monitor.display_average_stats_per_gpu()
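To make the averaging and reset behaviour concrete, here is a minimal, self-contained sketch of how a background monitor of this kind could work. It is not gpumonitor's actual implementation: the class `RunningAverageMonitor`, its `record` and `averages` methods, and the `sample_fn` parameter are hypothetical names used for illustration only, and the sampler returns plain numbers rather than real GPU readings.

```python
import threading
from typing import Callable, Dict, List


class RunningAverageMonitor:
    """Illustrative only: polls `sample_fn` every `delay` seconds in a
    background thread and keeps a running average per GPU index."""

    def __init__(self, sample_fn: Callable[[], List[float]], delay: float = 1.0):
        self._sample_fn = sample_fn  # hypothetical sampler, one value per GPU
        self._delay = delay
        self._lock = threading.Lock()
        self._sums: Dict[int, float] = {}
        self._count = 0
        self._stop_event = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        # Event.wait acts as an interruptible sleep: it returns True as soon
        # as stop() sets the event, which ends the loop.
        while not self._stop_event.wait(self._delay):
            self.record(self._sample_fn())

    def record(self, values: List[float]) -> None:
        """Add one sample (one value per GPU) to the running totals."""
        with self._lock:
            for gpu_index, value in enumerate(values):
                self._sums[gpu_index] = self._sums.get(gpu_index, 0.0) + value
            self._count += 1

    def reset(self) -> None:
        """Forget all samples so the next average starts from scratch."""
        with self._lock:
            self._sums.clear()
            self._count = 0

    def stop(self) -> None:
        """Stop the polling thread; the accumulated average is kept."""
        self._stop_event.set()
        self._thread.join()

    def averages(self) -> Dict[int, float]:
        """Return the average per GPU index, or an empty dict if no samples."""
        with self._lock:
            if self._count == 0:
                return {}
            return {i: total / self._count for i, total in self._sums.items()}
```

In this sketch, `reset()` clears only the accumulated totals and leaves the polling thread running, which mirrors the usage above where the monitor keeps collecting after a reset.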
Add the following callback to your training loop:
For TensorFlow,
from gpumonitor.callbacks.tf import TFGpuMonitorCallback
model.fit(x, y, callbacks=[TFGpuMonitorCallback(delay=0.5)])
For PyTorch Lightning,
import pytorch_lightning as pl
from gpumonitor.callbacks.lightning import PyTorchGpuMonitorCallback
trainer = pl.Trainer(callbacks=[PyTorchGpuMonitorCallback(delay=0.5)])
trainer.fit(model)
You can customize the display format with the gpustat display options. For example, power draw in watts and fan speed can be shown. To see which options you can change, refer to: