There are several files to turn in for this assignment. They should all be included in a file named hw02.tar that you submit using the turnin utility. Please use the filenames specified and be sure to include your name in each file.
Your task is write a C or C++ program to find all palindromic words in the dictionary. You should first write a sequential program and then parallelize it, using the bag-of-tasks paradigm. However, keep your plan for parallelization in mind when designing and implementing the sequential version. You might want to do some things in the sequential program that you will need in the parallel program, such as finding where each letter begins in the dictionary (see below for more on this).
Your sequential program should have the following phases:
The first few words in the dictionary start with numbers; you can either skip over them or process them, as you wish. None are palindromic, so this choice will not affect your total count. Some words start with capital letters (and hence the dictionary is not sorted in ASCII order). To keep your program simple, leave the capitals alone and do case-sensitive comparisons.
The sequential program is something you should have no trouble designing, but as many of you are not expert C or C++ programmers (yet), it might take longer than you expect to implement. Please ask if you run into trouble with the sequential version! You will need to get it done with enough time left to work on the parallelization - which is the whole point of the assignment!
After you have a working sequential program, modify it to use the bag-of-tasks paradigm, implemented using the pthreads library. Your parallel program should use W worker threads, where W is a command-line argument. Use the workers just for the compute phase; do the input and output phases sequentially. Each worker should count the number of palindromic words that it finds. Sum these W values during the output phase. This avoids a critical section during the compute phase - you'll need to deal with others, though. Use 26 tasks in your program, one for each letter of the alphabet. In particular, the first task is to examine all words that begin with "a" (and numbers), the second task is to examine all words that begin with "b", and so on. During the input phase you should build an efficient representation for the bag of tasks; I suggest using an array, where the value in task[0] is the index of the first "a" word, task[1] is the index of the first "b" word, and so on. You can also use this array during the search phase to limit the scope of your linear searches. Your parallel program should also time the compute phase. You may use the timer.c and timer.h code from our class examples. Start the clock just before you create the workers; read it again as soon as they have finished. Write the elapsed time for the compute phase to stdout.
To summarize, your program should have the following output:
For the timing tests, execute your parallel program on the four-processor nodes of bullpen (wetteland or rivera; ppn=4 in PBS) using using 1, 2, 3, and 4 workers. Run each test 3 times. Submit your job through PBS. You may do all tests in a single PBS script, or create a script for each run and submit them separately. In the file hw02.txt, include a table of results; it should contain all the values written to standard output (but not the words themselves) for all 12 test runs, and a brief analysis of the speedup and parallel efficiency you have achieved.
Your submitted tar file should include your Makefile, your C source code (including the timer code from class, if you choose to use it), your PBS script(s), a brief README file expaining how to run your program, and the hw02.txt file. Please do not include object files or your executable.
Honor code guidelines: While the program is to be done individually, along the lines of a laboratory program, I want to encourage you to ask questions and discuss the program with me, our TA, and with each other, as you develop it. However, no sharing of code is permitted. If you have any doubts, please check first and avoid honor code problems later.
Grading guidelines: Your grade for the program will be determined by correctness, design, documentation, and style, as well as the presentation of your timing results.