
Singularity MPI containers on Shaheen and Ibex#

On this page we demonstrate how to run a containerized MPI application with the Singularity container platform on Shaheen and Ibex. The two KSL systems differ in how they support the MPI environment.

Shaheen#

Shaheen has the MPICH-compatible cray-mpich library installed, which leverages the advanced features of the Cray Aries interconnect. For corner cases, OpenMPI 4.x is also installed on the Shaheen compute nodes.
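Before running the examples below, it can help to confirm which MPI stacks are available on the host. A quick check on Shaheen, assuming the default module environment, is:

module avail cray-mpich
module avail openmpi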

MPICH container#

Here is an example that launches an image with MPICH on Shaheen. srun launches two MPI processes, one on each of two Shaheen compute nodes:

#!/bin/bash
#SBATCH -p haswell
#SBATCH -N 2
#SBATCH -n 4
#SBATCH -t 00:05:00

module load singularity/3.5.1

export IMAGE=mpich_psc_latest.sif
BIND="-B /opt,/var,/usr/lib64,/etc,/sw"
srun -n 2 -N 2 --mpi=pmi2 singularity exec ${BIND} ${IMAGE} /usr/local/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_latency
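Assuming the script above is saved as mpich_container.slurm (a filename chosen here for illustration) and the image file is present in the submission directory, the job is submitted and monitored in the usual way:

sbatch mpich_container.slurm
squeue -u $USER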

OpenMPI container#

The following is an example that uses OpenMPI to launch an MPI job from an image. Here we bind mount the host OpenMPI into the container. We use mpirun to launch the jobs because PMIx integration with SLURM is not available on the host.

#!/bin/bash
#SBATCH -p haswell
#SBATCH -N 2
#SBATCH -n 4
#SBATCH -t 00:05:00

module swap PrgEnv-$(echo $PE_ENV | tr '[:upper:]' '[:lower:]') PrgEnv-gnu
module load openmpi/4.0.3
module load singularity


export LD_LIBRARY_PATH=$CRAY_LD_LIBRARY_PATH:$LD_LIBRARY_PATH
export SINGULARITYENV_LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/cray/wlm_detect/default/lib64:/etc/alternatives:/usr/lib64:/usr/lib
export SINGULARITYENV_APPEND_PATH=$PATH

export IMAGE=/project/k01/shaima0d/singularity_test/images/openmpi401_latest.sif
export BIND_MOUNT="-B /sw,/usr/lib64,/opt,/etc,/var"

echo "On same node"
mpirun -n 2 -N 2 hostname
mpirun -n 2 -N 2 singularity exec ${BIND_MOUNT} ${IMAGE} /usr/local/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_latency

echo "Now trying inside a singularity container"
mpirun -n 2 -N 1 hostname
mpirun -n 2 -N 1 singularity exec ${BIND_MOUNT} ${IMAGE} /usr/local/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_latency
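Because the host OpenMPI is bind mounted into the container, the OpenMPI version inside the image should match the version loaded on the host as closely as possible. As a quick sanity check, with the modules loaded and IMAGE set as in the jobscript above, the two versions can be compared:

# OpenMPI on the host
mpirun --version
# OpenMPI inside the container
singularity exec ${IMAGE} mpirun --version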

Ibex#

CPU job#

On Ibex, OpenMPI is installed on the host. It is generally recommended to launch Singularity MPI jobs with mpirun because PMIx integration with SLURM is not available on the host.
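You can verify which MPI launch plugins the host SLURM supports with srun; if pmix is not listed in the output, mpirun is the practical choice, as in the jobscript below.

srun --mpi=list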

#!/bin/bash
#SBATCH --ntasks=4
#SBATCH --nodes=2
#SBATCH --gres=gpu:v100:2
#SBATCH --time=00:05:00
#SBATCH --account=ibex-cs

module load singularity
module load openmpi/4.0.3-cuda10.2
module list

export OMPI_MCA_btl=openib
export OMPI_MCA_btl_openib_allow_ib=1
export IMAGE=/ibex/scratch/shaima0d/scratch/singularity_mpi_testing/images/osu_cuda_openmpi403_563.sif

export EXE_lat=/usr/local/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_latency
export EXE_bw=/usr/local/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_bw


echo "On same node"
mpirun -n 2 --map-by ppr:2:node hostname
mpirun -n 2 --map-by ppr:2:node singularity exec ${IMAGE} ${EXE_lat}
mpirun -n 2 --map-by ppr:2:node singularity exec ${IMAGE} ${EXE_bw}



echo "On two nodes"
mpirun -n 2 --map-by ppr:1:node hostname
mpirun -n 2 --map-by ppr:1:node singularity exec ${IMAGE} ${EXE_lat}
mpirun -n 2 --map-by ppr:1:node singularity exec ${IMAGE} ${EXE_bw}

GPU job#

The following SLURM jobscript demonstrates running a container with an MPI application on Ibex GPUs, leveraging the GPUDirect RDMA feature to get close to the maximum theoretical bandwidth available from the Host Channel Adapter (HCA). The trailing D D arguments instruct the OSU benchmarks to use device (GPU) buffers on both the sending and receiving side.

#!/bin/bash
#SBATCH --ntasks=4
#SBATCH --ntasks-per-node=2
#SBATCH --gres=gpu:v100:2
#SBATCH --time=00:05:00
#SBATCH --account=ibex-cs

module load singularity
module load openmpi/4.0.3-cuda10.2
module list

export OMPI_MCA_btl=openib
export OMPI_MCA_btl_openib_allow_ib=1
export IMAGE=/ibex/scratch/shaima0d/scratch/singularity_mpi_testing/images/osu_cuda_openmpi403_563.sif

export EXE_lat=/usr/local/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_latency
export EXE_bw=/usr/local/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_bw


echo "On same node"
mpirun -n 2 --map-by ppr:2:node hostname
mpirun -n 2 --map-by ppr:2:node singularity exec --nv ${IMAGE} ${EXE_lat} D D
mpirun -n 2 --map-by ppr:2:node singularity exec --nv ${IMAGE} ${EXE_bw} D D



echo "On two nodes"
mpirun -n 2 --map-by ppr:1:node hostname
mpirun -n 2 --map-by ppr:1:node singularity exec --nv ${IMAGE} ${EXE_lat} D D
mpirun -n 2 --map-by ppr:1:node singularity exec --nv ${IMAGE} ${EXE_bw} D D
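To see the benefit of GPUDirect RDMA, the same bandwidth test can be repeated with host buffers (H H) and compared against the device-buffer runs above; this is only a suggested comparison, using the same image and process mapping:

mpirun -n 2 --map-by ppr:1:node singularity exec --nv ${IMAGE} ${EXE_bw} H H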