Company
Date Published
Author
Chuan Li
Word count
736
Language
English
Hacker News points
None

Summary

This tutorial is a step-by-step guide to setting up a Horovod + Keras environment for multi-GPU training on a machine with at least two GPUs. The setup consists of installing NCCL2, optionally Open MPI, and Horovod inside a Python 3 virtual environment. Multi-GPU training jobs are then launched with either the `horovodrun` wrapper or the `mpirun` command. The tutorial closes with a recap of the setup and a single installation script that performs every step once the NCCL2 library has been downloaded.
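
As a rough illustration of the two launch options, the sketch below assembles the command line for each launcher on a single machine. The helper name `build_launch_command` and the script name `train.py` are hypothetical and do not come from the tutorial. The `mpirun` flags follow the Open MPI invocation commonly shown in the Horovod documentation, so the tutorial's exact flags may differ.

```python
import shlex


def build_launch_command(script, num_gpus=2, launcher="horovodrun"):
    """Return the argv list that starts `script` with one process per local GPU.

    `script` is a placeholder for a Horovod-enabled Keras training script.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")

    # Both launchers take the process count and a host:slots specification.
    common = ["-np", str(num_gpus), "-H", f"localhost:{num_gpus}"]

    if launcher == "horovodrun":
        # horovodrun wraps the MPI details behind a shorter command line.
        return ["horovodrun", *common, "python", script]

    if launcher == "mpirun":
        # Assumed flags, taken from Horovod's documented Open MPI usage:
        # no core binding, forwarded environment variables, and the
        # openib transport disabled.
        return [
            "mpirun", *common,
            "-bind-to", "none", "-map-by", "slot",
            "-x", "NCCL_DEBUG=INFO", "-x", "LD_LIBRARY_PATH", "-x", "PATH",
            "-mca", "pml", "ob1", "-mca", "btl", "^openib",
            "python", script,
        ]

    raise ValueError(f"unknown launcher: {launcher!r}")


if __name__ == "__main__":
    # Print a shell-ready command for each launcher.
    for name in ("horovodrun", "mpirun"):
        print(shlex.join(build_launch_command("train.py", launcher=name)))
```

Returning an argument list rather than a single string keeps the command safe to pass to `subprocess.run` without shell quoting. `shlex.join` is used only to print a copy-pasteable form.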