Pre/Post processing jobs#
Some steps in a workflow have requirements that the compute nodes in the workq or shared partitions cannot cater to. Such steps may involve moving data, preprocessing steps such as mesh generation requiring more than 375GB of memory, or postprocessing steps requiring access to a graphics-capable GPU. For such cases, Shaheen III maintains compute nodes, accessible through dedicated partitions, to which these steps can be submitted as batch jobs.
Copy jobs#
Copy jobs are best run on the dedicated data transfer (dtn) nodes.
Jobs that require moving data, e.g.:
* between tiers of scratch, /scratch/$USER/ to /scratch/$USER/bandwidth and back
* between Shaheen III project and scratch
* between the Ibex and Shaheen III filesystems
* downloading data from the internet into project or scratch
Note
On the compute nodes in the dtn partition, all the filesystems, scratch, project and home, are mounted and accessible with read/write permissions.
The following example jobscript demonstrates moving big data between project and scratch. dcp is a parallel copy utility which, in the jobscript below, runs on 8 processes and moves a large number of files from the <source> to the <destination> directory. Other utilities such as scp and rsync can also be used in the same way; a sketch of an rsync job follows the dcp example below. For details on the use of these utilities, check the documentation on Data Management.
#!/bin/bash
#SBATCH --partition=dtn
#SBATCH --ntasks=8
#SBATCH --account=k#####

# Switch to the GNU programming environment and load the mpifileutils module that provides dcp
module swap PrgEnv-cray PrgEnv-gnu
module load mpifileutils
module list

# Run dcp on 8 MPI processes; --progress 60 reports progress every 60 seconds
# and --preserve retains ownership, timestamps and permissions of the copied files
time -p srun -n ${SLURM_NTASKS} dcp --verbose --progress 60 --preserve <source_dir> <dest_dir>
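The scp and rsync utilities are serial, so a single task is sufficient. Below is a minimal sketch of an rsync job on the dtn partition; the <source_dir> and <dest_dir> placeholders and the one-hour wallclock limit are assumptions to be adapted to the actual transfer.
#!/bin/bash
#SBATCH --partition=dtn
#SBATCH --ntasks=1
#SBATCH --account=k#####
#SBATCH --time=01:00:00

# --archive preserves permissions, ownership and timestamps; --progress reports per-file progress
srun -n 1 rsync --archive --progress <source_dir>/ <dest_dir>/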
Requesting a GPU with graphics support#
Shaheen III has nodes with graphics-capable GPUs for postprocessing and visualization. These nodes can be accessed by submitting jobs to the ppn partition with the specific SLURM directives shown below.
Below is an example jobscript demonstrating how to run a batch job on these nodes:
#!/bin/bash
#SBATCH --partition=ppn
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=128
# Request one GPU on the node
#SBATCH --gres=gpu:1
# Use physical cores only, no hardware threads
#SBATCH --hint=nomultithread
#SBATCH --account=k#####
#SBATCH --time=01:00:00

srun -n 1 ./a.out
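Before launching a postprocessing or visualization tool, it can be useful to confirm that the allocated GPU is visible inside the job. A minimal check, assuming the nvidia-smi utility is available on the ppn GPU nodes, could be added to the jobscript:
# Optional sanity check: list the GPU(s) allocated to this job
srun -n 1 nvidia-smi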
Jobs on Large memory nodes#
Some steps of a workflow may require more memory on a node than is available on the compute nodes in the workq partition. The ppn partition has a few compute nodes with large memory. Please refer to the Compute Nodes page for details on the available nodes. The following jobscript demonstrates submitting a batch job requesting 2TB of memory and 256 CPUs to run multiple OpenMP threads.
#!/bin/bash
#SBATCH --partition=ppn
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=256
# Request 2TB of memory on a large memory node
#SBATCH --mem=2T
#SBATCH --hint=nomultithread
#SBATCH --account=k#####
#SBATCH --time=01:00:00

# Run one task with 256 OpenMP threads, one per requested CPU
export OMP_NUM_THREADS=256
srun -c ${OMP_NUM_THREADS} ./a.out
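The jobscript above assumes that a.out is an OpenMP-enabled executable. As a minimal sketch, assuming a C source file named omp_code.c (a hypothetical name), it could be built on a login node with the compiler wrappers, which accept the -fopenmp flag:
# Compile an OpenMP code with the Cray compiler wrapper; -fopenmp enables OpenMP support
cc -fopenmp -o a.out omp_code.c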