The HECToR Service is now closed and has been superseded by ARCHER.

4 Batch system / job command language

This section provides information on more advanced usage of the batch system on HECToR. It covers:

  • how to run multiple, concurrent parallel jobs in a single job submission script;
  • how to run multiple, concurrent parallel jobs using job arrays;
  • how to use Perl or Python to write job submission scripts.

The basics of running jobs through the batch system on HECToR can be found in the HECToR User Guide.

4.1 Batch system commands

To submit your job to the batch system:

qsub your_job_script.pbs

To check the job status:

qstat -u $USER

To remove your job from the queue (if the job is already running, this will also stop it):

qdel $your_job_ID
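
Since 'qsub' prints the new job's full ID (for example '492383.sdb') on standard output, a workflow script can capture that output and strip the server suffix to obtain the numeric ID that 'qstat' and 'qdel' accept. A minimal sketch, using a fixed sample ID string in place of a real 'qsub' call:

```shell
# On the system itself you would capture the real ID with:
#   your_job_ID=$(qsub your_job_script.pbs)
# Here a sample ID string stands in for that call.
your_job_ID="492383.sdb"

# Strip the server suffix (everything from the first '.') to get the
# bare numeric ID, which is what commands such as 'qdel 492383' expect
job_number="${your_job_ID%%.*}"
echo "$job_number"
```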

4.2 Multiple 'aprun' commands in a single job script

One of the most efficient ways of running multiple simulations in parallel on Cray XE systems is to use a single job submission script to run multiple simulations. This can be achieved by having multiple 'aprun' commands in a single script and requesting enough resources from the batch system to run them in parallel.

The examples in this section all assume you are using the bash shell for your job submission script but the principles are easily adapted to Perl, Python or tcsh.

This technique is particularly useful if you have many jobs that each use a small number of cores and that you want to run simultaneously: the whole set appears to the batch system as a single large job and is thus easier to schedule.

Note: each 'aprun' command must run on a separate compute node as Cray XE machines only allow exclusive node access. This means you cannot use this technique to run multiple instances of a program on a single compute node.

4.2.1 Requesting the correct number of cores

The total number of cores requested for a job of this type is the sum of the number of cores required for all the simulations in the script. For example, if I have 16 simulations which each run using 2048 cores then I would need to ask for 32768 cores (1024 nodes on a 32-core per node system).
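
The arithmetic above can be checked directly in the shell. This sketch computes the mppwidth value and the node count for the worked example (16 simulations of 2048 cores each on 32-core nodes):

```shell
# Resource arithmetic for a multi-aprun job:
# 16 simulations x 2048 cores each, on 32-core nodes
nsim=16
cores_per_sim=2048
cores_per_node=32

total_cores=$(( nsim * cores_per_sim ))

# Round up to a whole number of nodes (exact in this case,
# since 32768 / 32 = 1024)
nodes=$(( (total_cores + cores_per_node - 1) / cores_per_node ))

echo "mppwidth=$total_cores nodes=$nodes"
```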

4.2.2 Multiple 'aprun' syntax

The difference between specifying a single 'aprun' command and specifying multiple 'aprun' commands in your job submission script is that each aprun command must be run in the background (i.e. appended with an &) and there must be a 'wait' command after the final aprun command. For example, to run 4 CP2K simulations, each using 2048 cores (8192 cores in total) with 32 cores per node:

cd $basedir/simulation1/
aprun -n 2048 -N 32 cp2k.popt < input1.cp2k > output1.cp2k &
cd $basedir/simulation2/
aprun -n 2048 -N 32 cp2k.popt < input2.cp2k > output2.cp2k &
cd $basedir/simulation3/
aprun -n 2048 -N 32 cp2k.popt < input3.cp2k > output3.cp2k &
cd $basedir/simulation4/
aprun -n 2048 -N 32 cp2k.popt < input4.cp2k > output4.cp2k &

# Wait for all simulations to complete
wait

Of course, this could have been more concisely achieved using a loop:

for i in {1..4}; do
  cd $basedir/simulation${i}/
  aprun -n 2048 -N 32 cp2k.popt < input${i}.cp2k > output${i}.cp2k &
done

# Wait for all simulations to complete
wait
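
A bare 'wait' discards the exit status of each background command, so a failed simulation can go unnoticed. One way around this, sketched below, is to record each background PID and wait on each one individually; 'true' and 'false' stand in for successful and failed aprun commands here:

```shell
# Collect the PID of each background command so each exit status
# can be checked individually. 'true' and 'false' stand in for
# aprun commands in this sketch.
pids=""
true &             # stands in for a successful aprun
pids="$pids $!"
false &            # stands in for a failed aprun
pids="$pids $!"

# Wait on each PID in turn, counting any non-zero exit statuses
failures=0
for pid in $pids; do
  wait $pid || failures=$(( failures + 1 ))
done
echo "failures=$failures"
```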

4.2.3 Example job submission script

This job submission script runs sixteen 2048-core CP2K simulations in parallel, with the input in the directories 'simulation1', 'simulation2', etc.:

#!/bin/bash --login
# The jobname
#PBS -N your_job_name

# The total number of parallel tasks for your job.
#    This is the sum of the number of parallel tasks required by each
#    of the aprun commands you are using. In this example we have
#    16 * 2048 = 32768 tasks
#PBS -l mppwidth=32768

# Specify how many processes per node.
#PBS -l mppnppn=32

# Specify the wall clock time required for your job.
#    In this example we want 6 hours 
#PBS -l walltime=6:0:0

# Specify which budget account that your job will be charged to.
#PBS -A your_budget_account               

# Make sure any symbolic links are resolved to absolute path
export PBS_O_WORKDIR=$(readlink -f $PBS_O_WORKDIR)

# The base directory is the directory that the job was submitted from.
# All simulations are in subdirectories of this directory.
basedir=$PBS_O_WORKDIR

# Load the cp2k module
module add cp2k

# Set the number of threads to 1
#   This prevents any system libraries from automatically 
#   using threading.
export OMP_NUM_THREADS=1

# Loop over simulations, running them in the background
for i in {1..16}; do
   # Change to the directory for this simulation
   cd $basedir/simulation${i}/
   aprun -n 2048 -N 32 cp2k.popt < input${i}.cp2k > output${i}.cp2k &
done

# Wait for all jobs to finish before exiting the job submission script
wait
exit 0

In this example, it is assumed that all of the input for the simulations has been set up prior to submitting the job. Of course, in reality, you may find it more useful for the job submission script to programmatically prepare the input for each job before the aprun command.
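
One way such input preparation might look is sketched below: a template input file is copied into each simulation directory with a per-simulation value substituted in. The 'template.cp2k' filename and the '@TEMP@' placeholder are illustrative names invented for this sketch, not part of any real CP2K setup:

```shell
# Hypothetical sketch: generate per-simulation input files from a
# template before launching the aprun commands. The template file
# and '@TEMP@' placeholder are made-up names for illustration.
basedir=$(mktemp -d)
printf 'TEMPERATURE @TEMP@\n' > $basedir/template.cp2k

for i in 1 2 3; do
  mkdir -p $basedir/simulation${i}
  # Substitute a different temperature into each simulation's input
  sed "s/@TEMP@/${i}00/" $basedir/template.cp2k \
      > $basedir/simulation${i}/input${i}.cp2k
done

cat $basedir/simulation2/input2.cp2k
```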

4.3 Job arrays

Often, you will want to run the same job submission script multiple times in parallel for many different input parameters. Job arrays provide a mechanism for doing this without the need to issue multiple 'qsub' commands and without the penalty of having large numbers of jobs appearing in the queue.

4.3.1 Example job array submission script

Each job instance in the job array is able to access its unique array index through the environment variable $PBS_ARRAY_INDEX.

This can be used to programmatically select which set of input parameters you want to use. One common way to use job arrays is to place the input for each job instance in a separate subdirectory which has a number as part of its name. For example, if you have 10 sets of input in ten subdirectories called job01, job02, …, job10 then you would be able to use the following script to run a job array that runs each of these jobs:
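
The index-to-directory mapping used in the script below can be checked in isolation: 'printf "%02d"' zero-pads the array index so that index 3 selects 'job03' and index 10 selects 'job10'. In a real job instance PBS sets PBS_ARRAY_INDEX; it is fixed to a sample value here:

```shell
# PBS sets PBS_ARRAY_INDEX for each instance of the array;
# a fixed sample value is used in this sketch
PBS_ARRAY_INDEX=3

# Zero-pad the index to two digits to match directory names
# of the form job01, job02, ..., job10
jobid=$(printf "%02d" $PBS_ARRAY_INDEX)
jobdir="job$jobid"
echo "$jobdir"
```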

#!/bin/bash
#PBS -N your_job_name
#PBS -l mppwidth=2048
#PBS -l mppnppn=32
#PBS -l walltime=00:20:00
# Specify which budget account that your job will be charged to.
#PBS -A your_budget_account               

# Make sure any symbolic links are resolved to absolute path
export PBS_O_WORKDIR=$(readlink -f $PBS_O_WORKDIR)

# Change to the directory that the job was submitted from.
cd $PBS_O_WORKDIR

# Set the number of threads to 1
#   This prevents any system libraries from automatically 
#   using threading.
export OMP_NUM_THREADS=1

# Get the subdirectory name for this job instance in the array
jobid=$(printf "%02d" $PBS_ARRAY_INDEX)
jobdir="job$jobid"

# Change to the subdirectory for this job instance in the array
cd $jobdir

# Run this job instance in its subdirectory
echo "Running in $jobdir"
aprun -n 2048 -N 32 ./my_mpi_executable.x arg1 arg2

4.3.2 Submitting job arrays

The '-J' option to the 'qsub' command is used to submit a job array under PBSPro. For example, to submit a job array consisting of 10 instances, numbered from 1 to 10 you would use the command:

qsub -J 1-10 array_job_script.pbs

You can also specify a stride other than 1 for array jobs. For example, to submit a job array consisting of 5 instances, numbered 2, 4, 6, 8, and 10 you would use the command:

qsub -J 2-10:2 array_job_script.pbs

4.3.3 Interacting with individual job instances in an array

You can refer to an individual job instance in a job array by using its array index. For example, to delete just the job instance with array index 5 from the batch system (assuming your job ID is 1234), you would use:

qdel 1234[5]

4.4 Interactive jobs

Interactive jobs on Cray XE systems are useful for debugging or development work as they allow you to issue 'aprun' commands directly from the command line. To submit an interactive job reserving 256 cores (on a 32-core-per-node Interlagos system) for 1 hour you would use the following qsub command:

qsub -IVl mppwidth=256,walltime=1:0:0 -A budget

When you submit this job your terminal will display something like:

qsub: waiting for job 492383.sdb to start

and once the job runs you will be returned to a standard Linux command line. However, while the job lasts you will be able to run parallel jobs by using the 'aprun' command directly at your command prompt. The maximum number of cores you can use is limited by the value of mppwidth you specified at submission time. Remember that only the /work filesystem is accessible from the compute nodes, although you will be able to access the /home filesystem from the command line. You will normally need to change to a directory on the /work filesystem before you can run a job.

4.5 Writing job submission scripts in Perl and Python

It can often be useful to be able to use the features of Perl and/or Python to write more complex job submission scripts. The richer programming environment these languages provide over standard shell scripts can make it easier to dynamically generate input for jobs or to put complex workflows together.

Please note that the examples provided in this section are so simple that they could easily be written in bash or tcsh, but they provide the information needed to use Perl and Python to write your own, more complex, job submission scripts.

You submit Perl and Python job submission scripts using 'qsub' as for standard jobs.

4.5.1 Example Perl job submission script

This example script shows how to run a CP2K job using Perl. It illustrates the necessary system calls to change directories and load modules within a Perl script but does not contain any program complexity.

#!/usr/bin/perl

# The jobname
#PBS -N your_job_name

# The total number of parallel tasks for your job.
# The example requires 2048 parallel tasks
#PBS -l mppwidth=2048

# Specify how many processes per node.
#PBS -l mppnppn=32

# Set the budget to charge the job to. The budget name is site-dependent
#PBS -A budget

# Set the number of MPI tasks and MPI tasks per node
my $mpiTasks = 2048;
my $tasksPerNode = 32;

# Set the executable name and input and output files
my $execName = "cp2k.popt";
my $inputName = "input";
my $outputName = "output";
my $runCode = "$execName < $inputName > $outputName";

# Set up the string to run our job
my $aprunString = "aprun -n $mpiTasks -N $tasksPerNode $runCode";

# Set the command to load the cp2k module
#   This is more complicated in Perl as we cannot access the 
#   'module' command directly so we need to use a set of commands
#   to make sure the subshell that runs the aprun process has the 
#   correct environment setup. This string will be prepended to the
#   aprun command
my $moduleString = "source /etc/profile; module load cp2k;";

# Change to the directory the job was submitted from
chdir($ENV{'PBS_O_WORKDIR'});

# Run the job
#    This is a combination of the module loading string and the
#    actual aprun command. Both of these are set above.
system("$moduleString  $aprunString");

# Exit the job
exit(0);

4.5.2 Example Python job submission script

This example script shows how to run a CP2K job using Python. It illustrates the necessary system calls to change directories and load modules within a Python script but does not contain any program complexity.

#!/usr/bin/python

# The jobname
#PBS -N your_job_name

# The total number of parallel tasks for your job.
# The example requires 2048 parallel tasks
#PBS -l mppwidth=2048

# Specify how many processes per node.
#PBS -l mppnppn=32

# Set the budget to charge the job to. The budget name is site-dependent
#PBS -A budget

# Import the Python modules required for system operations
import os
import sys

# Set the number of MPI tasks and MPI tasks per node
mpiTasks = 2048
tasksPerNode = 32

# Set the executable name and input and output files
execName = "cp2k.popt"
inputName = "input"
outputName = "output"
runCode = "{0} < {1} > {2}".format(execName, inputName, outputName)

# Set up the string to run our job
aprunString = "aprun -n {0} -N {1} {2}".format(mpiTasks, tasksPerNode, runCode)

# Set the command to load the cp2k module
#   This is more complicated in Python as we cannot access the 
#   'module' command directly so we need to use a set of commands
#   to make sure the subshell that runs the aprun process has the 
#   correct environment setup. This string will be prepended to the
#   aprun command
moduleString = "source /etc/profile; module load cp2k; "

# Change to the directory the job was submitted from
os.chdir(os.environ["PBS_O_WORKDIR"])

# Run the job
#    This is a combination of the module loading string and the
#    actual aprun command. Both of these are set above.
os.system(moduleString + aprunString)

# Exit the job
sys.exit(0)