Computer Science 385

Design and Analysis of Algorithms

Spring 2018, Siena College

You may work alone or in a group of 2 or 3 on this assignment, which consists of an extension of your previous empirical analysis study to include more of the sorting algorithms we have studied this semester.

All GitHub repositories must be created with all group members having
write access and all group member names specified in the
`README.md` file by 4:00 PM, Monday, April 9, 2018. This applies to those who choose
to work alone as well!

Getting Set Up

You will receive an email with the
link to follow to set up your GitHub repository, which will be named
`ps6-yourgitname`, for this problem set. One member of the
group should follow the link to set up the repository on GitHub,
then that person should email the instructor with the other group
members' GitHub usernames so they can be granted access. This will
allow all members of the group to clone the repository and commit
and push changes to the origin on GitHub.

Submitting

Your submission requires that all required code and other deliverables are committed and pushed to the master for your repository's origin on GitHub. That's it! If you see everything you intend to submit when you visit your repository's page on GitHub, you're set.

Extended Empirical Analysis of Sorting Algorithms

Your task, worth a total of 50 points, is to extend your previous empirical study of sorting algorithms.

- We have studied three more advanced sorting algorithms since you completed the first empirical study: mergesort, quicksort, and heapsort. For quicksort, you will need to consider three versions: the standard version that chooses the pivot value as the first element of the subarray being sorted, a version that chooses a pivot randomly from among all of the elements of the subarray being sorted, and the "median of three" pivot choice which chooses the pivot as the middle value of the first, middle, and last elements of the subarray. A good implementation will support all three without replicating large amounts of code by having the latter two move their chosen pivot to the first position then applying the standard quicksort.
- You are permitted to use or base your code on existing algorithm implementations. For any code that is not entirely your own work, you must be sure you are not violating any copyrights, and you must cite the source(s) both in your code and in your writeup.
- Expectations are the same as they were for the earlier study in terms of coding the algorithms, running tests to gather timings and comparison counts, presenting those timings in tabular and graph formats, and a writeup that states your expectations for the results, what you observed, and how these align. For each algorithm and problem size, you should test with random data, sorted data, reverse sorted data, and nearly sorted data.
- As before, avoid recursive implementations for efficiency purposes.
- Run each case several times, as you did before. Choose your range of problem sizes so that the largest cases run for 5-10 minutes at most. Choose a range of 6 to 10 powers of two for problem sizes. You will have a large number of runs to make and it will take a significant amount of time even with this restriction. Plan accordingly.
- Just stating "the graph looks like
*O(n logn)*" isn't good enough to match your timing and comparison count results to the theory. Show how as*n*changes, the trend grows like*n logn*or*n*or whetever is appropriate.^{2}

Grade Improvement Opportunity

If you include your previous or updated results from the earlier empirical study here, your grade for this problem set can replace your earlier grade. That problem set had the lowest average grade to date, so this can really help many of you. Be sure to take into account the comments from that earlier problem set's grading sheet when preparing this submission whether or not you are including the earlier sort algorithms here.

Deliverables

A typical submission will include the program used to generate the
results (source code only, not extra files like `.class` or
`package.bluej` files), a PDF document that contains the formal
writeup, and any spreadsheets or other files that include the raw
data.

The formal writeup should have a section that includes a brief introduction to the task, including test environment information. There should then be a section for each algorithm that describes specifics of that algorithm, the theoretical expectations (including best/average/worst cases, where applicable), a summary of your timings and counts in tabular format, graphs illustrating these results, followed by more text that analyzes the results, comparing with the theoretical expectations. A brief conclusion section should summarize the study's results.