Multiprocessing by Message Passing MPI

Course 3085

Introduction

  • Objectives of this Tutorial
    1. Introduces you to the fundamentals of MPI by way of F77, F90, and C examples
    2. Shows you how to compile, link and run MPI code
    3. Covers additional MPI routines that deal with virtual topologies
    4. Cites references
  • What is MPI ?
    1. MPI stands for Message Passing Interface and its standard is set by the Message Passing Interface Forum
    2. It is a library of subroutines/functions, NOT a language
    3. MPI subroutines are callable from Fortran and C
    4. The programmer writes Fortran/C code with the appropriate MPI library calls,
      compiles it with a Fortran/C compiler, then
      links it with the Message Passing library
    5. Fortran 90 is not officially supported under MPI-1. It will be
      supported in MPI-2 in the near future.
      (Many codes that are written in F90 do work with MPI-1)
  • Why MPI ?
    1. For large problems that demand better turn-around time (and access to more memory)
    2. For Fortran “dusty deck” codes, it is often very time-consuming
      to rewrite the code to take advantage of parallelism. Even on SMP
      machines such as the SGI PowerChallengeArray and Origin2000, an
      automatic parallelizer might not be able to detect the parallelism.
    3. For distributed memory machines, such as clusters of Unix
      workstations, clusters of NT/Linux PCs, or the IBM pSeries.
    4. To maximize portability; MPI works on both distributed- and
      shared-memory architectures.

A note of caution is in order here. Parallel programs written with message
passing tend to be more complicated than their serial counterparts:
debugging is harder, and the processes are subject to deadlock. There are
other programming paradigms, such as the data parallel model, wherein you
code in high-level languages such as Fortran 90 or High Performance Fortran.
Certainly, HPF has its share of problems too, such as the amount of effort
required to convert a serial code to a parallel one.


Preliminaries of MPI Message Passing

  • In a user code, wherever MPI library calls occur, the following
    header file must be included:

    #include "mpi.h"   for C code, or
    include "mpif.h"   for Fortran code

    These files contain definitions of constants, prototypes, etc.
    that are necessary to compile a program containing MPI library
    calls. (A short example program illustrating the points in this
    list is given right after the list.)
  • MPI is initiated by a call to MPI_Init. This MPI routine must be
    called before any other MPI routines, and it must only be called
    once in the program.
  • MPI processing ends with a call to MPI_Finalize.
  • Essentially the only difference between MPI subroutines (for Fortran
    programs) and MPI functions (for C programs) is the error reporting
    flag. In Fortran, it is returned as the last member of the subroutine’s
    argument list; in C, the integer error flag is returned through the
    function value. Consequently, the MPI Fortran routines always contain
    one more argument than their C counterparts.
  • C’s MPI function names start with “MPI_” followed by a character
    string whose leading character is in upper case and whose remaining
    characters are in lower case. The Fortran subroutines bear the same
    names but are case-insensitive.
  • On SGI’s PCA (Power Challenge Array) and Origin2000, parallel I/O is
    supported.
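
The short C sketch below ties these points together. It is not one of the
tutorial's example files, just a minimal illustration of the header file,
MPI_Init, MPI_Finalize, and the C-style error flag returned through the
function value (the Fortran version would pass an extra ierr argument at
the end of each call instead):

    #include <stdio.h>
    #include "mpi.h"     /* constants and prototypes needed by MPI calls */

    int main(int argc, char *argv[])
    {
        int rank, size, ierr;

        ierr = MPI_Init(&argc, &argv);          /* must precede all other MPI calls */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id (0 .. size-1)  */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes        */

        printf("Process %d of %d reporting\n", rank, size);

        ierr = MPI_Finalize();                  /* no MPI calls may follow this     */
        return 0;
    }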

Basic MPI Routines Through Examples

There are essentially two different paradigms in MPI programming: SPMD
(Single Program Multiple Data) and MPMD (Multiple Programs Multiple Data).
The example programs listed below employ the SPMD paradigm, i.e., an
identical copy of the same program is used for each of the processes.
(A rough sketch of this approach is given at the end of this list.)

  • Example 1 (F77 version).
    Numerical Integration

    • Example 1.1
      Parallel Integration with MPI_Send, MPI_Recv
    • Example 1.2
      Parallel Integration with MPI_Isend, MPI_Irecv
    • Example 1.3
      Parallel Integration with MPI_Bcast, MPI_Reduce
    • Example 1.4
      Parallel Integration with MPI_Pack, MPI_Unpack
    • Example 1.5
      Parallel Integration with MPI_Gather, MPI_Scatter
  • Example 1 (C version).
    Numerical Integration

    • Example 1.1
      Parallel Integration with MPI_Send, MPI_Recv
    • Example 1.2
      Parallel Integration with MPI_Isend, MPI_Recv
    • Example 1.3
      Parallel Integration with MPI_Bcast, MPI_Reduce
  • A set of examples, ranging from the simple to the advanced, may be
    downloaded from the following site:

    ftp://info.mcs.anl.gov/pub/mpi/using

    The downloaded file is called examples.tar.Z.

  • You may also download the examples in this tutorial.
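
The example files themselves are available through the links above. As a
rough sketch only (it is not one of the tutorial's example files), the C
fragment below shows the general shape of Example 1.1: every process runs
the same code (SPMD), each integrates its own piece of [0,1] with the
midpoint rule, and process 0 collects the partial sums with MPI_Send and
MPI_Recv. Example 1.3 would replace the receive loop with a single call to
MPI_Reduce. The integrand 4/(1+x*x) and the interval count n are choices
made here for illustration only.

    #include <stdio.h>
    #include "mpi.h"

    /* integrand: the integral of 4/(1+x^2) over [0,1] is pi */
    float f(float x) { return 4.0f / (1.0f + x * x); }

    int main(int argc, char *argv[])
    {
        int   rank, size, i, n = 1000;    /* n sub-intervals per process */
        float a = 0.0f, b = 1.0f, h, x, mysum = 0.0f, total, tmp;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* same code on every process; the rank selects its slice of [a,b] */
        h = (b - a) / (size * n);
        for (i = 0; i < n; i++) {
            x = a + (rank * n + i + 0.5f) * h;    /* midpoint of sub-interval */
            mysum += f(x) * h;
        }

        if (rank == 0) {                  /* process 0 gathers the partial sums */
            total = mysum;
            for (i = 1; i < size; i++) {
                MPI_Recv(&tmp, 1, MPI_FLOAT, i, 0, MPI_COMM_WORLD, &status);
                total += tmp;
            }
            printf("integral = %f\n", total);
        } else {                          /* everyone else sends its sum to 0 */
            MPI_Send(&mysum, 1, MPI_FLOAT, 0, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }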

Compilation and Execution

On the PCA and Origin2000 at Boston University, you have a choice of running
either the MPICH or SGI’s implementation of MPI. There are
differences between the two. The differences that affect users are in
the compilation procedure:

  • If you prefer the SGI implementation of MPI (which is the system
    default), the only thing you need to do is to link in the MPI
    library, i.e., add -lmpi at the end of the link command line
    (see the sample command lines after this list).
    To see the compilation and job execution procedure for SGI’s MPI,
    Click here
  • On the other hand, if you prefer to use MPICH, the latest version
    installed at Boston University is 1.2.0.
    To see the compilation and job execution procedure for MPICH,
    Click here
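
For instance, with the SGI MPI the compile-and-link step might look like the
following (example.c and example.f are hypothetical source file names;
substitute your own):

    machine-name% cc example.c -o example -lmpi          (C)
    machine-name% f77 example.f -o example -lmpi         (Fortran 77)
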
On the IBM p655 and p690, we currently support IBM’s implementation of standard MPI. To see the compilation and job execution procedure for IBM’s MPI,
Click here.
To submit a multiprocessor batch job requiring 4 processors:

Tonka% bsub -q pca-mp4 "mpirun -np 4 example"

At Boston University, we use LSF for batch processing. Through LSF, the batch submission script, bsub, permits
the user to enter the number of processors via the switch “-n“.
Do not use this option. Rather, specify the number of processors through
mpirun (for the SGI machines) or poe (for the IBM pSeries), as shown in the above example.

machine-name% bqueues
to find out what queues are available.

machine-name% bqueues -l p4short
to find out the specifics of the queue “p4short”.

Alternatively, if you prefer x-window displays, use

machine-name% xlsbatch

to find out the status of your (and other users’) jobs.

For more detail on batch queues, consult
the Scientific Computing Facilities Technical Summary Page

Debugging MPI code with SGI’s Case Vision Debugger (CVD)

To debug your code using SGI’s Case Vision Debugger (cvd), follow the
steps below (you will need x-window for cvd to work):

  1. To use cvd for MPI, it is far easier to use the MPICH MPI. Please
    refer to the above section on how to compile and execute MPI code
    using MPICH.
  2. For debugging, make sure you compile with “-g” instead of any
    optimization flag, such as -O3.
  3. At prompt, type “cvd executable-name”
    (e.g., % cvd a.out)
  4. You should now see a couple of windows pop up. The main window
    should display the source file (it is easier to do this in the directory
    where all the files live).
  5. At the top, there is a dialogue box with “command:” on the left. You
    should see the executable name, along with its path, in this box.
    Complete the command line with the number of processes you want to use
    (e.g., command: /usr/mypath/a.out -np 4).
    Please see above on how to run jobs under MPICH.
  6. Set traps by clicking to the left of the program source statements.
  7. At the top left of the main window, there is an “Admin” heading. Click
    it and select “multiprocess view”. A new window pops up.
  8. In the multiprocess view window, click “config/preferences”. Check
    “Attach to forked processes” and “Copy traps to forked processes” if they
    are not already checked. Click save and apply. This way, next time you
    won’t have to do this step again.
  9. You can now click “run”, which you can find at the top right of the main
    window. The execution will stop at the first trap.
  10. You can go to the Multiprocess View window to select a specific process.
    Clicking on “run” in the main window will cause that process to continue
    execution to the next stop (trap). Alternatively, you can click “continue all”,
    “step into all”, or “step over all” to allow all processes to march on.
  11. Finally, the “expression view” window permits inquiries of the values
    of variables in source code. Just type in the names.

CVD can also be used for performance analyses.

IBM pSeries Debugger

There are several parallel debuggers on the IBM pSeries. Refer to the
IBM pSeries section for more details.

Communicators and Virtual Topology

(this section under development)

In addition to the basic MPI routines demonstrated above, there are
many other routines for various applications. Some of the more
frequently used routines, grouped according to their functionalities, are discussed below:

The application of communicators and Cartesian topology is demonstrated
through a matrix transpose example. (A minimal sketch of setting up a
Cartesian communicator is given below.)
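
While that section is still under development, the short C sketch below
(not the matrix transpose example itself) shows the usual first step when
working with a Cartesian virtual topology: MPI_Dims_create picks a process
grid, MPI_Cart_create builds a new communicator with that topology, and
MPI_Cart_coords reports each process’s grid coordinates:

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int      rank, size;
        int      dims[2]    = {0, 0};   /* 0 lets MPI choose the grid shape */
        int      periods[2] = {0, 0};   /* non-periodic in both directions  */
        int      coords[2];
        MPI_Comm grid_comm;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        MPI_Dims_create(size, 2, dims);              /* e.g. 4 procs -> 2 x 2 grid */
        MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods,
                        1, &grid_comm);              /* reorder = 1 (allowed)      */

        MPI_Comm_rank(grid_comm, &rank);             /* rank within the new communicator */
        MPI_Cart_coords(grid_comm, rank, 2, coords);
        printf("rank %d sits at grid position (%d,%d)\n",
               rank, coords[0], coords[1]);

        MPI_Comm_free(&grid_comm);
        MPI_Finalize();
        return 0;
    }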

References

There are a number of MPI references available.

From book publishers :

  1. Parallel Programming with MPI by P. S. Pacheco, Morgan Kaufmann, 1997
  2. MPI: The Complete Reference by M. Snir et al., The MIT Press, 1996
  3. Using MPI by W. Gropp, E. Lusk and A. Skjellum, The MIT Press, 1994
Online via the SGI Power Challenge Array or Origin2000 machines:

  1. There are man pages for MPI

  2. MPI and PVM User’s Guide

  3. MPI: A Message-Passing Interface Standard
On the Internet:


  1. Users’ Guide to mpich, a Portable Implementation of MPI
    by P. Bridges et al.
  2. MPI: A Message-Passing Interface Standard.
    The postscript file of this document, /pub/mpi/mpi-report.ps,
    can be downloaded via anonymous FTP from info.mcs.anl.gov

  3. A User’s Guide to MPI
    by Peter S. Pacheco. (This is a postscript file)
  4. MPI chapter in Designing and Building Parallel Programs
    by Ian Foster
  5. MPI Tutorial by EPCC, University of Edinburgh

  6. MPI Tutorial by Cornell University
  7. MPI Tutorial by the National Computational Science Alliance
Other Tutorials on the Web:

  1. AHPCC, University of New Mexico

ACKNOWLEDGEMENTS

The glossary
as well as detailed descriptions of all MPI routines are drawn from
the Argonne National Laboratory.


Your suggestions and comments are welcome; please send them to
the course coordinator and instructor, Kadin Tseng
(email: kadin@bu.edu).
