Brief NW-GRID Tutorial - Compiling and Running Simple Jobs

This tutorial explains how to compile and run simple jobs on the iDataPlex system with OpenMPI or Intel MPI.

2) Compilation

This section gives further information on using the system if you want to compile and run your own applications. We will assume you are using the Intel performance suite, which is the preferred option. OpenMPI is also available on the system for use with the GNU or PGI compilers. For reference, the system compiler is GCC-4.1.2, as used for building the RedHat Linux OS distribution. It is overridden by user environment modules, as explained below.

Environment Modules

NW-GRID uses environment modules. To list the modules provided on SID, type the following (you may see only the ones you are entitled to use):

> module avail

---------------- /gpfs/packages/modules/3.2.8/Modules/production -----------------
default_env

---------------- /gpfs/packages/modules/3.2.8/Modules/modulefiles ----------------
fftw/3.3/intel/ompi     intel/mpi/3.2.2         pgi/11.3(default)
gcc/3.4.6               intel/mpi/4.0.1         pgi/license
gcc/4.1.2(default)      openmpi/1.4.3/gcc       python/2.7.2
gcc/4.4.0               openmpi/1.4.3/intel     scalapack/2.0.1/gcc
gcc/4.6.2               openmpi/1.4.3/pgi       scalapack/2.0.1/intel
intel/comp/11.1         openmpi/1.4.4/gcc       tools/cmake/2.8.6
intel/comp/12.0         openmpi/1.4.4/intel     tools/git/1.7.5
intel/license           openmpi/1.4.4/pgi       tools/subversion/1.6.12
intel/mkl/10.2          perl/perl5/perl-5.14.2
intel/mkl/10.3          pgi/10.9

In the following notes we will assume you want to use OpenMPI-1.4.4 with the Intel-11.1 Fortran 90 compiler. This is our current "default" environment, but it might change in future. To ensure the environment is set up for this, type:

> module list
No Modulefiles Currently Loaded.

> module load default_env

> module list
Currently Loaded Modulefiles:
  1) intel/license         3) intel/mkl/10.2        5) default_env
  2) intel/comp/11.1       4) openmpi/1.4.4/intel

Note that this has also pulled in the license module as required by this compiler suite. If you get an error message at this stage it may be that you do not have access to an appropriate license. If you had any other module loaded it is preferable to unload that first, and indeed you may receive a prompt to do this.
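
A job script or login profile can test whether a module is already loaded before loading it again. The following is a minimal sketch, assuming the module command maintains the LOADEDMODULES environment variable as a colon-separated list of loaded modules. The name module_loaded is a hypothetical helper, not part of the module command itself.

```shell
# Sketch: succeed if the named module appears in LOADEDMODULES.
# Assumption: LOADEDMODULES is a colon-separated list kept up to date
# by the module command. module_loaded is a hypothetical helper name.
module_loaded() {
    case ":${LOADEDMODULES:-}:" in
        *":$1:"*) return 0 ;;   # exact module name found in the list
        *)        return 1 ;;   # not loaded (or list is empty)
    esac
}
```

With this defined, a line such as "module_loaded default_env || module load default_env" loads the default environment only when it is missing.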

The chosen compiler in the parallel environment can now be invoked using the mpif90 wrapper. This wrapper is also used for linking the appropriate libraries which are included by default. A simple parallel Fortran 90 application can thus be compiled as follows:

> mpif90 -c myprog.f90
> mpif90 -o myprog myprog.o

3) Submitting a job to the batch queues using default_env

SID uses a batch system controlled by Platform Computing's LSF (Load Sharing Facility) product version 7.0. Documentation is available here: http://download.platform.com/docs/lsf/7.0.6/index.html . Printable PDF documentation is located on SID in the following directory: /depot/kits/7/www/guides/kit_lsf_guide_source/print .

A job submission script called myprog.ompi might look as follows. It will attempt to run the job on 8 cores on a single node of the system.

# specify a job name
#BSUB -J myjobname
# stdout and stderr files identified by job number
#BSUB -o stdout.%J.txt
#BSUB -e stderr.%J.txt
# wall time of 2 hours
#BSUB -W 2:00
# resource requirements, processes per node
#BSUB -R "span[ptile=12]"
# number of processes to be run
#BSUB -n 8

# set to run in directory containing this script
cd $LS_SUBCWD

# load modules
source /etc/profile.d/modules.sh
module load intel/mkl/10.2
module load openmpi/1.4.4/intel

# now execute parallel job on 8 cores
export myexe="myprog"
export myargs="myarg1 myarg2"
mpirun -np 8 $myexe $myargs   

This can be submitted with the following command. Note that the "<" is essential here:

bsub < myprog.ompi

You can check the status of the job using the bjobs command. The command bpeek <jobid> displays the output (stdout and stderr) that a running job has produced so far.
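
When a submission is scripted, the job ID reported by bsub can be captured and passed on to bjobs or bpeek. The following is a minimal sketch, assuming bsub prints a confirmation line of the form "Job <1234> is submitted to queue <normal>." on success. The name parse_jobid is a hypothetical helper.

```shell
# Sketch: read bsub's confirmation message on stdin and print the job ID.
# Assumption: the message looks like
#   Job <1234> is submitted to queue <normal>.
# Nothing is printed if no such line is seen.
parse_jobid() {
    sed -n 's/^Job <\([0-9][0-9]*\)> is submitted.*/\1/p'
}
```

For example, jobid=$(bsub < myprog.ompi | parse_jobid) stores the ID, after which bjobs "$jobid" and bpeek "$jobid" can be used to follow the job.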

Note (27/07/2011): OpenMPI has not yet been compiled to work with version 12 of the Intel compiler. Please use version 11.1 instead, as in this example.

Checking configuration for OpenMPI jobs

If the job will not run, check the job's error output for a message like the following:

mpdboot_sid010.cluster.local (handle_mpd_output 905): from mpd on sid010, invalid port info:
configuration file /gpfs/stfc/dlarcg/rja/.mpd.conf is accessible by others
change permissions to allow read and write access only by you

To fix this, do the following:

> ls -l ~/.mpd.conf
-rw-r--r-- 1 rja dlarcg 49 Feb 21 14:56 /gpfs/stfc/dlarcg/rja/.mpd.conf
> chmod 600 ~/.mpd.conf
> ls -l ~/.mpd.conf
-rw------- 1 rja dlarcg 49 Feb 21 14:56 /gpfs/stfc/dlarcg/rja/.mpd.conf
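
The same check can be made from a script before a job is submitted. The following is a minimal sketch that inspects the permission string printed by ls -ld. The name is_private is a hypothetical helper.

```shell
# Sketch: succeed only if group and others have no permissions on the file.
# In the mode string from ls -ld (for example -rw-------), characters 5-10
# hold the group and other permission bits.
is_private() {
    [ -e "$1" ] || return 1                 # a missing file counts as a failure
    bits=$(ls -ld "$1" | cut -c5-10)
    [ "$bits" = "------" ]
}
```

A line such as "is_private ~/.mpd.conf || chmod 600 ~/.mpd.conf" then applies the fix only when it is needed.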

4) Using Intel MPI

Intel MPI is also available, and it may become part of the default environment in the future. For now, if you want to use it, load the individual modules as follows.

> module load intel/comp/12.0 intel/mpi/4.0.1

To compile your code, use "mpiicc" in place of mpicc, and "mpiifort" in place of mpif90 and the other Fortran wrappers.

A job submission script called myprog.impi might look as follows. It will attempt to run the job on 8 cores on a single node of the system. Note that it uses a wrapper (mpdboot.impi.py) to start the mpd daemons, and mpiexec rather than mpirun. This is because system policy precludes ssh to compute nodes, which the Intel version of mpirun normally requires. Some environment variables must also be supplied to ensure that the InfiniBand network is selected.

#BSUB -J myjobname
#BSUB -o stdout.txt
#BSUB -e stderr.txt
#BSUB -R "span[ptile=12]"
#BSUB -n 8               

cd $LS_SUBCWD

# load modules
source /etc/profile.d/modules.sh
module load intel/mpi/4.0.1

export myexe="myprog" 
export myargs="myarg1 myarg2" 

mpdboot.impi.py

# now execute parallel job on 8 cores
mpiexec -np 8 -env I_MPI_DAPL_PROVIDER ofa-v2-ib0 -env I_MPI_FABRICS shm:dapl $myexe $myargs

# clean up
mpdallexit
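
If mpiexec fails or the script stops early, the final mpdallexit line may never be reached and the mpd daemons could be left running. One possible way to guard against this is a shell trap. The following is a minimal sketch, not part of the site's own scripts. The name run_with_cleanup is a hypothetical helper.

```shell
# Sketch: run a command and always run a cleanup command afterwards,
# whether the main command succeeds or fails.
# Usage: run_with_cleanup CLEANUP_COMMAND COMMAND [ARGS...]
# The function body is a subshell, so the EXIT trap fires when it finishes.
run_with_cleanup() (
    cleanup_cmd=$1
    shift
    trap "$cleanup_cmd" EXIT
    "$@"
    exit $?                     # pass on the main command's exit status
)

# In the job script this might replace the separate mpiexec and mpdallexit lines:
#   run_with_cleanup mpdallexit mpiexec -np 8 -env I_MPI_DAPL_PROVIDER ofa-v2-ib0 \
#       -env I_MPI_FABRICS shm:dapl $myexe $myargs
```

The exit status of the main command is preserved, so LSF still sees a failed run as failed.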

5) Different numbers of nodes and cores

This system runs one virtual process per core. Each compute node has 12 cores running at 2.67GHz and 24GB of memory shared among these cores. This means that 2GB of memory is available per virtual process. Please try to use this default configuration whenever possible.

If your job requires more memory than this, or for performance reasons, it is possible to run fewer than 12 virtual processes per node. The line #BSUB -R "span[ptile=n]" tells LSF how many processes to place on each node. For example, to run 24 processes spread over 6 nodes (4 per node), use the following lines.

#BSUB -R "span[ptile=4]"
#BSUB -n 24
mpirun -np 24 $myexe $myargs

Note that the BSUB -n and mpirun -np parameters are usually the same, because no hyper-threading is used at present.
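
The arithmetic behind a chosen layout can be checked with a short calculation. The following is a minimal sketch, assuming the figure of 24 GB of memory per node quoted above. The names nodes_needed and mem_per_proc_gb are hypothetical helpers.

```shell
# Sketch: node count and memory per process for a given layout.
# Assumption: 24 GB of memory per node, as stated in this section.
NODE_MEM_GB=24

# nodes_needed NPROCS PTILE: NPROCS / PTILE, rounded up to whole nodes
nodes_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# mem_per_proc_gb PTILE: memory per process in whole GB, rounded down
mem_per_proc_gb() {
    echo $(( NODE_MEM_GB / $1 ))
}

nodes_needed 24 4      # prints 6
mem_per_proc_gb 4      # prints 6
mem_per_proc_gb 12     # prints 2
```

So with ptile=4 the 24-process example occupies 6 nodes, and each process has about 6 GB available rather than the default 2 GB.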

Using_iDataPlex (last edited 2012-05-18 10:05:08 by RobAllan)
