Now it is time for the fun part: submitting a job!
First we create the job scripts, one per node count.
cd /fsx/jobs
export I_MPI_DEBUG=2
for N in 32 64 96;do
cat > strong_scaling_test_0${N}.sbatch << EOF
#!/bin/bash
#SBATCH --job-name=strong_scaling_test_0${N}
#SBATCH --ntasks=${N}
#SBATCH --ntasks-per-node=32
#SBATCH --output=/fsx/log/%x_%j.out
#SBATCH --exclusive
source /fsx/fds-smv/bin/FDS6VARS.sh
source /fsx/fds-smv/bin/SMV6VARS.sh
module load intelmpi
export OMP_NUM_THREADS=1
export I_MPI_PIN_DOMAIN=omp
export I_MPI_DEBUG=${I_MPI_DEBUG}
mkdir -p /fsx/results/\${SLURM_JOB_NAME}_\${SLURM_JOBID}
cd /fsx/results/\${SLURM_JOB_NAME}_\${SLURM_JOBID}
cat /fsx/input/fds/strong_scaling_test_0${N}.fds \
| sed -e 's/T_END=0.2/T_END=1.0/' > strong_scaling_test_0${N}.fds
time mpirun -genv I_MPI_DEBUG ${I_MPI_DEBUG} -ppn 32 -np ${N} fds strong_scaling_test_0${N}.fds
EOF
done
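Before submitting, it is worth inspecting one of the generated scripts, e.g. with cat strong_scaling_test_064.sbatch. Since the heredoc delimiter EOF is unquoted, ${N} and ${I_MPI_DEBUG} are expanded while the script is written, whereas the escaped \${SLURM_JOB_NAME} and \${SLURM_JOBID} are left literal and only resolved by Slurm at runtime. For N=64 the generated script should look like this:
#!/bin/bash
#SBATCH --job-name=strong_scaling_test_064
#SBATCH --ntasks=64
#SBATCH --ntasks-per-node=32
#SBATCH --output=/fsx/log/%x_%j.out
#SBATCH --exclusive
source /fsx/fds-smv/bin/FDS6VARS.sh
source /fsx/fds-smv/bin/SMV6VARS.sh
module load intelmpi
export OMP_NUM_THREADS=1
export I_MPI_PIN_DOMAIN=omp
export I_MPI_DEBUG=2
mkdir -p /fsx/results/${SLURM_JOB_NAME}_${SLURM_JOBID}
cd /fsx/results/${SLURM_JOB_NAME}_${SLURM_JOBID}
cat /fsx/input/fds/strong_scaling_test_064.fds | sed -e 's/T_END=0.2/T_END=1.0/' > strong_scaling_test_064.fds
time mpirun -genv I_MPI_DEBUG 2 -ppn 32 -np 64 fds strong_scaling_test_064.fds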
The jobs should take up to around 10 minutes; let us submit all three.
for N in 32 64 96;do
sbatch strong_scaling_test_0${N}.sbatch
done
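While the jobs are queued and running, you can keep an eye on them with the usual Slurm tools: squeue lists the jobs and their states, and sinfo shows the compute nodes as they come up.
squeue
sinfo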
These jobs will run on one, two and three nodes.
In the screenshot below, the cluster is configured to spin up enough nodes for all three jobs to run concurrently.
In case you are running this with an account that hits scaling limits, you can check what happened within the Auto Scaling Group (ASG - deep link).
As we can already see in the overview, the ASG has a capacity of 7, even though the desired capacity is 10.
Click on the ASG and head to the Activity tab to see more details.
It shows that we hit our limit of 512 vCPUs and thus the ASG is not able to provision more capacity. Time to open a ticket and raise the limit. :)
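If you prefer the command line over the console, the same information is available through the AWS CLI. The second call below needs the ASG name; the placeholder <cluster-asg-name> stands for whatever name ParallelCluster gave the group in your cluster:
aws autoscaling describe-auto-scaling-groups \
    --query 'AutoScalingGroups[].[AutoScalingGroupName,DesiredCapacity,length(Instances)]' \
    --output table
aws autoscaling describe-scaling-activities \
    --auto-scaling-group-name <cluster-asg-name> \
    --max-items 5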
The wallclock times are quite different; the three-node job finishes in about 3 minutes.
$ egrep '(^real|MPI Processes)' strong_scaling_test_096_*
strong_scaling_test_096_14.out: MPI Enabled; Number of MPI Processes: 96
strong_scaling_test_096_14.out:real 2m43.704s
The others take a while longer.
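To compare all three runs in one go, you can grep across the whole log directory; the job IDs in the file names will of course differ on your cluster:
egrep '(^real|MPI Processes)' /fsx/log/strong_scaling_test_0*.out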