Computer Science 400
Parallel Processing and High Performance Computing
Fall 2017, Siena College
Programming Project 4: Traveling Salesperson Problem
Due: before the end of the semester
Our last programming project will be a series of tasks related to the
famous Traveling Salesperson Problem (TSP).
You may work alone or in a group of 2 or 3 on this project.
Getting Set Up
You will receive an email with the
link to follow to set up your GitHub repository
tsp-project-yourgitname for this Programming Project. One member of the
group should follow the link to set up the repository on GitHub,
then that person should email the instructor with the other group
members' GitHub usernames so they can be granted access. This will
allow all members of the group to clone the repository and commit
and push changes to the origin on GitHub. At least one group member
should make a clone of the repository to begin work.
Your initial repository should include the source code described in
Section 6.2 of Pacheco related to the text's TSP implementations.
Serial Implementations
(Recommendation: complete this part by November 17.)
Read the first part of Section 6.2 of Pacheco, pages 299-306, and
study the implementations in tsp_rec.c, tsp_iter1.c and
tsp_iter2.c.
The input to these programs is a file describing a directed graph: the
entry in row r, column c of the cost matrix is the cost to travel from
place r to place c. Costs are integers, and places are identified only
by number.
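For reference, below is a minimal sketch of reading such a cost matrix. It
assumes a layout in which the first value in the file is the number of
places n, followed by n rows of n integer costs; check the input code in
the provided implementations for the exact format they expect.

    #include <stdio.h>
    #include <stdlib.h>

    /* Hedged sketch: read an n x n integer cost matrix.
     * Assumed layout: the first value in the file is n, followed by
     * n rows of n integer costs.  Verify against the input code in
     * tsp_rec.c before relying on this exact format. */
    int *read_costs(const char *fname, int *n) {
        FILE *fp = fopen(fname, "r");
        if (fp == NULL) { perror(fname); exit(1); }
        if (fscanf(fp, "%d", n) != 1) { fprintf(stderr, "bad header\n"); exit(1); }
        int *cost = malloc((size_t)(*n) * (*n) * sizeof(int));
        for (int i = 0; i < (*n) * (*n); i++)
            if (fscanf(fp, "%d", &cost[i]) != 1) {
                fprintf(stderr, "bad matrix entry\n"); exit(1);
            }
        fclose(fp);
        return cost;   /* cost[r * (*n) + c] is the cost of edge r -> c */
    }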
Example files are in a public
repository
that you should fork into your own GitHub account, then clone onto the
computer(s) where you are working on this project.
Question 1: Add two new datasets (three if working in a group of
three) to this repository. Keep them fairly small, say between 8
and 20 places. Be creative! The places do not need to be cities
and edge costs do not need to be miles (or even be distances).
You could do rooms within a building, buildings on a campus, houses
in a neighborhood, anything really. It would be nice to get some
variety. You will commit and push your new files to your fork of
the repository, then submit a pull request to have them merged into
the master branch. (5 points)
Question 2: Pacheco Exercise 6.17, p. 345. (6 points)
Question 3: Run each of the three serial implementations on four
different problem sizes, using between 10 and 14 places. Present and
analyze your timing results. Are these
consistent with the results in Pacheco Section 6.2.4 and Table
6.7? Briefly explain the relative run times of each of the three
versions. (10 points)
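When collecting these timings, measure wall-clock time around the search
itself. Below is a minimal sketch of one way to do that with
clock_gettime; the solve_tsp function is a hypothetical stand-in for
whichever implementation you are timing, and the book's code may already
provide its own timing mechanism.

    #include <stdio.h>
    #include <time.h>

    /* Stand-in for the search routine actually being measured. */
    static void solve_tsp(void) { /* replace with the real call */ }

    int main(void) {
        struct timespec start, finish;
        clock_gettime(CLOCK_MONOTONIC, &start);
        solve_tsp();
        clock_gettime(CLOCK_MONOTONIC, &finish);
        double elapsed = (finish.tv_sec - start.tv_sec)
                       + (finish.tv_nsec - start.tv_nsec) / 1.0e9;
        printf("elapsed wall-clock time: %.3f seconds\n", elapsed);
        return 0;
    }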
Question 4: Based on your timings from the previous question, about how
long would you expect each serial implementation to take to compute
a TSP solution for 16 places? For 20 places? For 40 places? (3 points)
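A hedged rule of thumb for this extrapolation: with one fixed home city,
the search tree contains on the order of (n-1)! complete tours, so,
ignoring the (often substantial) pruning that branch and bound provides,
going from 14 to 16 places multiplies the work by roughly
15!/13! = 15 × 14 = 210, and going from 14 to 20 places by roughly
19!/13!, which is about 2 × 10^7. Use your measured timings to anchor the
actual constants.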
Parallelization with pthreads
(Recommendation: complete this part by December 1.)
Read the next part of Section 6.2 of Pacheco, pages 306-315, and
study the implementations in pth_tsp_stat.c and
pth_tsp_dyn.c.
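One shared-memory idiom that both pthreads versions depend on is a single
best tour that every thread reads and occasionally updates. Below is a
minimal sketch of that pattern; it is a generic illustration under assumed
names (best_cost, maybe_update_best), not a copy of the book's code.

    #include <pthread.h>
    #include <limits.h>

    /* Hedged sketch: mutex-protected update of a globally shared best cost. */
    static int best_cost = INT_MAX;                 /* shared by all threads */
    static pthread_mutex_t best_mutex = PTHREAD_MUTEX_INITIALIZER;

    /* Called by a thread when it completes a tour with the given cost. */
    void maybe_update_best(int tour_cost) {
        if (tour_cost < best_cost) {                /* cheap unlocked check */
            pthread_mutex_lock(&best_mutex);
            if (tour_cost < best_cost)              /* re-check under the lock */
                best_cost = tour_cost;
            pthread_mutex_unlock(&best_mutex);
        }
    }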
Question 5: Pacheco Exercise 6.18, pp. 345-346. Only answer the
"Why" question; you need not do the code modifications in (a) and
(b). (3 points)
Question 6: Pacheco Exercise 6.20, p. 346. Only answer the
questions; you need not do the implementations. (6 points)
Question 7: Run each pthreads implementation (the static and the
dynamic) for 1, 2, 4, 8, 16, 32, and 64 threads on a Stampede2 node,
using a problem size such that the single-thread version runs for at
least 30 minutes. Present and analyze your timing results. (10
points)
Parallelization with MPI
Read the part of Section 6.2 of Pacheco relating to the MPI
implementation of TSP, pages 319-341, and study the implementations in
mpi_tsp_stat.c and mpi_tsp_dyn.c.
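As you study these, note that data the pthreads versions could simply
share must now be replicated or communicated explicitly. Below is a
minimal sketch of one such step, getting the cost matrix to every process;
it is a generic illustration under assumed names (distribute_costs,
read_costs), not the book's input code.

    #include <stdlib.h>
    #include <mpi.h>

    int *read_costs(const char *fname, int *n);   /* e.g., the reader sketched earlier */

    /* Hedged sketch: replicate the cost matrix on every MPI process.
     * A generic illustration of one distributed-memory approach,
     * not the input code from mpi_tsp_stat.c or mpi_tsp_dyn.c. */
    int *distribute_costs(const char *fname, int *n, MPI_Comm comm) {
        int rank;
        MPI_Comm_rank(comm, &rank);
        int *cost = NULL;
        if (rank == 0)
            cost = read_costs(fname, n);              /* only rank 0 touches the file */
        MPI_Bcast(n, 1, MPI_INT, 0, comm);            /* everyone learns the problem size */
        if (rank != 0)
            cost = malloc((size_t)(*n) * (*n) * sizeof(int));
        MPI_Bcast(cost, (*n) * (*n), MPI_INT, 0, comm);   /* replicate the matrix */
        return cost;
    }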
Question 8: Find three different parts of the code where the pthreads
version was able to take advantage of shared memory and study how
the MPI version handles that case with distributed memory. Discuss
the costs in terms of code complexity, overhead of replicated data,
and overhead of communication. (10 points)
Question 9: Run each MPI implementation (the static and the dynamic;
you may omit the dynamic runs if you get runtime errors on them)
on 1, 2, 4, 8, 16, 32, and 64 nodes on Stampede2 with 64 processes
per node (so your MPI_Comm_size
should be 64, 128, 256, ...,
4096), using a problem size such that a single-process run takes
at least 30 minutes. Present and analyze your timing
results. (10 points)
Question 10: What input size would you expect to take a few months
to run using a single process with a single thread on a Stampede2
production node? (2 points)
Question 11: This is now a bonus question. Run the dynamic-partitioning
MPI implementation (or the static version if you have trouble running
the dynamic one) on a problem of the size you came up with for the
previous question. Run it on enough Stampede2 production nodes
(with 64 processes per node) that you would expect the run time
to be around 30 minutes with perfect parallel efficiency. Present
and analyze your timing results. (10 points)
Submitting
Your submission requires that all required deliverables are committed
and pushed to the master branch of your repository on GitHub. For the
new datasets, pull requests should be made to the common repository.
Grading
This assignment is worth 65 points, which are distributed as follows:
Feature | Value | Score
Q1: New datasets | 5 |
Q2: 6.17 | 6 |
Q3: timings for serial versions | 10 |
Q4: expected times for serial versions | 3 |
Q5: 6.18 | 3 |
Q6: 6.20 | 6 |
Q7: timings and analysis for pthreads versions | 10 |
Q8: pthreads/MPI comparison | 10 |
Q9: timings and analysis for MPI versions | 10 |
Q10: big problem size | 2 |
Q11: run and analysis of big problem on many nodes | 10 (bonus) |
Total | 65 |