MPI on VACC
The Message Passing Interface (MPI) is a standard programming framework used to enable communication between processes, both within one compute node and across multiple nodes. Open MPI is one widely used implementation of this standard; many different software packages use it to distribute work across multiple processors and/or nodes.
MPI can use several different protocols to communicate among processes. The two in use on the VACC cluster are TCP and Infiniband. All nodes in the cluster are capable of using TCP, but Infiniband requires special, expensive network cards, so not all nodes have them.
For MPI jobs that move a lot of data between ranks during computation, Infiniband is preferable. To request nodes that have it, add a --constraint= option to your Slurm job parameters specifying one of the ib constraints, as sketched below. See Running Jobs/Job Constraints for more on which constraints are available.
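For example, a batch script might request Infiniband-capable nodes like this. The constraint name ib and the task count here are illustrative; check the Job Constraints page for the constraint names actually defined on the cluster.

    #!/bin/bash
    #SBATCH --ntasks=8          # example MPI rank count
    #SBATCH --constraint=ib     # request Infiniband nodes; substitute an actual constraint name

The same option can also be passed on the command line, for example sbatch --constraint=ib job.sh.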
Finding MPI
There may be more than one version of Open MPI installed on the cluster. Use the command module spider openmpi to list them. If you have software that requires an MPICH-compatible MPI, use the Intel implementation of MPI from the oneapi module.
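As a rough sketch, assuming Lmod-style modules, finding and loading an MPI module might look like the following. The version string and any prerequisite modules (such as a compiler) are placeholders; module spider reports what is actually required.

    module spider openmpi              # list available Open MPI versions
    module spider openmpi/<version>    # show how to load a specific version, including prerequisites
    module load gcc openmpi            # illustrative load; actual module names and order may differ
    module load oneapi                 # alternative: Intel MPI for MPICH-compatible software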
Please see Loading Software for a listing of additional common module subcommands.
See Using MPI for examples of loading and running an MPI program.
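As a minimal sketch of how the pieces in this section fit together (the constraint name, module names, and my_mpi_program are placeholders; see Using MPI for complete, cluster-specific examples):

    #!/bin/bash
    #SBATCH --ntasks=8              # number of MPI ranks
    #SBATCH --constraint=ib         # illustrative Infiniband constraint
    #SBATCH --time=00:30:00

    module load gcc openmpi         # illustrative module names
    mpirun ./my_mpi_program         # or srun, depending on the cluster's Slurm/MPI integration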