Computer Science 400
Parallel Processing and High Performance Computing

Fall 2017, Siena College

Lab 5: Point to Point Communication
Due: 11:59 PM, Wednesday, September 27, 2017

For this lab, you will write your first real message passing programs using MPI. We focus here on point-to-point communication. That is, sends and receives.

You may work alone or with a partner on this lab.

Getting Set Up

You will receive an email with the link to follow to set up your GitHub repository p2p-lab-yourgitname for this Lab. One member of the group should follow the link to set up the repository on GitHub, then that person should email the instructor with the other group members' GitHub usernames so they can be granted access. This will allow all members of the group to clone the repository and commit and push changes to the origin on GitHub. At least one group member should make a clone of the repository to begin work.

Reading

Read Pacheco Ch. 3, Sections 1, 2 and 3, to learn about MPI's basic point-to-point communication capabilities.

Study and run each program that was included in your starter repository. Note that can obtain copies of all examples from the book's web site, and a copy of the examples has been placed on blizzard in /home/cs400/pacheco/ipp-source-use.

Question 1: Include an output capture for each of these programs: mpi_hello.c, mpi_trap1.c, and mpi_trap2.c. For each, run on 8 and 16 processes on noreaster. (5 points)

Gathering Timings

The trapezoidal rule program, if run with a sufficiently large number of trapezoids (e.g., large enough n), can take a significant amount of processing time.

Practice Program: Insert timers (like those in the matmult example that you likely used in the Jacobi program) around the computation part of mpi_trap2.c. (5 points)

Question 2: Run mpi_trap2 for 2, 4, 8, 16, 32, and 64 processes on noreaster for a=-100, b=100, n=1000000000. Report your timings. Are the extra processes taking advantage of processor cores and speeding up the computation? At what point do extra processes stop helping? (10 points)

Some Programs for Practice

Practice Program: Write a C program using MPI called mpiexchange.c that runs for exactly 2 processes. The rank 0 process places an arbitrary int value in a variable, sends that value to the rank 1 process, and then receives a new value for the variable from the rank 1 process. Therefore, the rank 1 process should receive that value, modify it in some way (in my solution, I just double it), and sends it back to the rank 0 process. If run with a number of processes not equal to 2, the program should print an error message and exit. (10 points)

Here is my program's output:

0 sending 49152 to 1
0 received 98304 from 1
1 received 49152 from 0
1 sending 98304 to 0

For the next problem, we define the "recursive digit sum" of a number to be the end result of taking the numbers digits, adding them up, then taking that sum's digits, adding them up, repeating this process until a single digit number results. So we always get a result in the range from 0 to 9.

For example, the "recursive digit sum" of 182,172 is 3, because the sum of the digits of 182,172 is 21, and the sum of 21's digits is 3.

Practice Program: Suppose you have been asked to study the distribution of the "numeric collapses" of the numbers 0 to N-1 for some large N. Write a C program using MPI called mpirds.c that runs for any number of processes where each process is responsible for computing the recursive digit sum of some part of the range. You may require that the number of processes evenly divides N, but check for that in your program and return an error if it doesn't. Each process should determine its range of values, compute a local array of counts for how many times each digit has occurred. At the end, all processes except rank 0 should send their array to rank 0. The rank 0 process will receive each and accumulate a global sum and print the result. (30 points)

My program's result for N=1048576 and 8 processes:

Proc 0 will compute from 0 to 131071
Proc 1 will compute from 131072 to 262143
Proc 2 will compute from 262144 to 393215
Proc 3 will compute from 393216 to 524287
Proc 4 will compute from 524288 to 655359
Proc 5 will compute from 655360 to 786431
Proc 6 will compute from 786432 to 917503
Proc 7 will compute from 917504 to 1048575
rank 0 local counts: [ 1 14564 14564 14564 14564 14563 14563 14563 14563 14563 
Received from rank 1: [ 0 14563 14563 14563 14563 14564 14564 14564 14564 14564 
Received from rank 2: [ 0 14564 14564 14564 14564 14564 14563 14563 14563 14563 
Received from rank 3: [ 0 14564 14563 14563 14563 14563 14564 14564 14564 14564 
Received from rank 4: [ 0 14563 14564 14564 14564 14564 14564 14563 14563 14563 
Received from rank 5: [ 0 14564 14564 14563 14563 14563 14563 14564 14564 14564 
Received from rank 6: [ 0 14563 14563 14564 14564 14564 14564 14564 14563 14563 
Received from rank 7: [ 0 14564 14564 14564 14563 14563 14563 14563 14564 14564 
global counts: [ 1 116509 116509 116509 116508 116508 116508 116508 116508 116508 

Your program should produce similar outputs.

Question 3: What are your results for N=1073741824 for 2, 4, 8, 16 and 32 processes? (5 points)

Submitting

Your submission requires that all required deliverables are committed and pushed to the master for your repository on GitHub.

Grading

This assignment is worth 65 points, which are distributed as follows:

> FeatureValueScore
Lab Question 1: output captures 5
Timers in mpi_trap2.c 5
Lab Question 2: Timings for mpi_trap2.c 10
mpiexchange.c correctness 10
mpirds.c code correctness 30
Lab Question 3: mpirds.c results 5
Total 65