Tutorial on using Python and MPI

From Athena Wiki
Revision as of 14:37, 11 September 2017 by Mrice admin

Hello MPI in Python

This tutorial will demonstrate how to use Bryn Mawr College's Athena Cluster with Python and MPI, the Message Passing Interface.

Here are the steps, briefly, for running an MPI Python script on Athena:

  1. log into athena.brynmawr.edu
  2. find out which /data path holds your shared directory (either /data1 or /data2)
    • You should now have access to a $DATA variable
    • echo $DATA
  3. create a python program in your /dataN/userid/ directory
  4. create a PBS script
  5. submit the PBS script
  6. check on status
  7. see the results

We will now go through each step in detail.

Step 1: Log into Athena

If you do not have an Athena account, please email ... to request one. It can take up to one business day.

After you have your account, you can log in from anywhere. Use ssh on Linux and Mac, or PuTTY on Windows.

You can use the X-forwarding flag, -X, if you have X running:

ssh -X dblank@athena.brynmawr.edu

Using X-forwarding will allow you to use graphical programs, such as gedit.

Step 2: Where is my /data?

Next, you need to find out where your /data directory is located (either /data1 or /data2).

You should now have access to a $DATA variable:

echo $DATA

Another way to find your data directory is to list the files in the directories:

[dblank@powerwulf ~]$ ls /data1
dblank     gdavis  n001.cluster.com  n004.cluster.com  pbrodfue
dmtpc      josh    n002.cluster.com  n005.cluster.com  powerwulf.brynmawr.edu
emaccorma  mrice   n003.cluster.com  n006.cluster.com  xwang
[dblank@powerwulf ~]$ ls /data2
dblank     gdavis  n001.cluster.com  n004.cluster.com  pbrodfue
emaccorma  josh    n002.cluster.com  n005.cluster.com  powerwulf.brynmawr.edu
galaxy     mrice   n003.cluster.com  n006.cluster.com  xwang
[dblank@powerwulf ~]$ 
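Scripts can read the same variable rather than hard-coding /data1 or /data2. The following is a minimal sketch, assuming $DATA is also set in the environment where the Python script runs (this may not hold inside a batch job, so a fallback is accepted). The function name shared_path and its default_root parameter are hypothetical.

```python
import os


def shared_path(filename, default_root=None):
    """Build a full path to a file inside the shared data directory.

    Uses the DATA environment variable when it is set; otherwise falls
    back to default_root (a hypothetical parameter for this sketch).
    """
    root = os.environ.get("DATA", default_root)
    if root is None:
        raise RuntimeError("DATA is not set; pass default_root explicitly")
    return os.path.join(root, filename)


# Example: fall back to /data1/dblank when DATA is not defined.
print(shared_path("hellompi.py", default_root="/data1/dblank"))
```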

Step 3: Create a Python script

Next, you will want to create a Python script in your /data directory.

First, change to your directory:

cd /data1/dblank

Next, use a text editor (emacs, vi, gedit, or nano) to create a file in your /data directory. We use the /data directory because it is available to all of the nodes.

Here is a sample program:

# More tutorials available at:
# http://mpi4py.scipy.org/docs/usrman/tutorial.html

from mpi4py import MPI
import platform

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
machine = platform.node()

print("Hello MPI from %s %d of %d" % (machine, rank, size))

Save the script to a file, such as: /data1/dblank/hellompi.py

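rank and size are two useful values from MPI: size is the total number of running instances of your program, and rank is the number of this particular instance, from 0 to size - 1. See http://mpi4py.scipy.org/docs/usrman/tutorial.html for more examples.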
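As a minimal sketch of why these two values matter, the following splits a list of work items so that each instance handles its own share. The helper names items_for_rank and get_rank_and_size are hypothetical (they are not part of mpi4py), and the fallback to a single process when mpi4py is not installed is an assumption made so the sketch runs anywhere.

```python
def items_for_rank(items, rank, size):
    """Return the items this rank handles: every size-th item, starting at rank."""
    return items[rank::size]


def get_rank_and_size():
    """Ask MPI for rank and size; fall back to one process without mpi4py."""
    try:
        from mpi4py import MPI
    except ImportError:
        return 0, 1
    comm = MPI.COMM_WORLD
    return comm.Get_rank(), comm.Get_size()


if __name__ == "__main__":
    rank, size = get_rank_and_size()
    work = list(range(20))  # stand-in for real work items
    mine = items_for_rank(work, rank, size)
    print("Rank %d of %d handles %s" % (rank, size, mine))
```

Because each rank starts at a different offset and steps by size, every item is handled by exactly one instance.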

Step 4: Submit PBS script

To submit a Python script, you will need a shell script to give to qsub. Here is a sample:

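#!/bin/bash

# Submit to the default queue
#PBS -q default
# Request 6 nodes with 12 processors per node
#PBS -l nodes=6:ppn=12

# Start 10 instances of the script on machines from /opt/machinelist
mpiexec -machinefile /opt/machinelist -n 10 python /data1/dblank/hellompi.py

echo "end"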

Save this in a file, such as /home/dblank/pbs/hellompi.sh.

PBS (the Portable Batch System) is the job scheduler. The #PBS lines are directives that set many aspects of your running program, such as the queue and the number of nodes. For more details see ...

In this example, we tell mpiexec to choose from the machines listed in /opt/machinelist (all of the nodes) and to create 10 instances of our running code across those machines.
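To make the placement concrete, here is a sketch of how 10 instances could be spread over the machines. It assumes that /opt/machinelist names each of the six nodes once (n001 through n006, hostnames taken from the directory listing earlier) and that ranks are assigned to them in round-robin order. Both are assumptions; the actual placement depends on the MPI implementation and the contents of the machine file. The function name is hypothetical.

```python
def round_robin_placement(machines, n_instances):
    """Map each rank to a machine, cycling through the machine list."""
    return {rank: machines[rank % len(machines)] for rank in range(n_instances)}


# Assumed contents of the machine file: six nodes, one entry each.
machines = ["n%03d.cluster.com" % i for i in range(1, 7)]
placement = round_robin_placement(machines, 10)

for rank, machine in sorted(placement.items()):
    print("rank %d -> %s" % (rank, machine))
```

Under these assumptions, the first four machines each receive two instances and the last two receive one.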

NOTE: the program will start in your home directory, so by default you will be in /home/dblank/. Use full pathnames to /data1/ for shared data.

Finally, you will submit this script to qsub, like so:

qsub hellompi.sh

NOTE: the output files from qsub will appear in the directory where you run the command. Therefore, you may want to move your shell script to its own folder and run qsub from there.

Step 5: Check status

When you start your program a number of things will happen:

  1. your program will be scheduled to run
  2. when the scheduler runs your script, mpiexec will start your program on each of the machines, perhaps a number of times

You can run "qstat -a" to see the status of your program.

Step 6: Get Results

Finally, any output from your program will be saved in a file with the same name as your shell script, with a dot, the letter "o", and the job number appended. For this example, the file contains:

Hello MPI from n002.cluster.com 1 of 10
Hello MPI from n002.cluster.com 7 of 10
Hello MPI from n001.cluster.com 6 of 10
Hello MPI from n001.cluster.com 0 of 10
Hello MPI from n004.cluster.com 3 of 10
Hello MPI from n004.cluster.com 9 of 10
Hello MPI from n003.cluster.com 2 of 10
Hello MPI from n003.cluster.com 8 of 10
Hello MPI from n006.cluster.com 5 of 10
Hello MPI from n005.cluster.com 4 of 10
end
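Because the instances run in parallel, the lines appear in no particular order. A short sketch such as the following can group them by machine. It assumes the exact line format printed by hellompi.py above; the function name ranks_by_machine is hypothetical, and the sample lines are copied from the output shown here.

```python
import re
from collections import defaultdict

# Matches lines such as "Hello MPI from n002.cluster.com 1 of 10".
LINE = re.compile(r"^Hello MPI from (\S+) (\d+) of (\d+)$")


def ranks_by_machine(lines):
    """Return a dict mapping each machine name to a sorted list of its ranks."""
    result = defaultdict(list)
    for line in lines:
        match = LINE.match(line.strip())
        if match:  # skip lines such as "end"
            result[match.group(1)].append(int(match.group(2)))
    return {machine: sorted(ranks) for machine, ranks in result.items()}


sample = [
    "Hello MPI from n002.cluster.com 1 of 10",
    "Hello MPI from n002.cluster.com 7 of 10",
    "Hello MPI from n001.cluster.com 6 of 10",
    "end",
]
for machine, ranks in sorted(ranks_by_machine(sample).items()):
    print(machine, ranks)
```

To process a real job, pass the lines of the output file to ranks_by_machine instead of the sample list.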

Step 7: Where to go from here?