Computer Science 210
Data Structures
Fall 2018, Siena College
You may work alone or in a group of 2 or 3 on this assignment. However, in order to make sure you learn the material and are well-prepared for the exams, you should work through non-coding problems on your own before discussing them with your partner, should you choose to work with someone. Also, be sure every programming task is a team effort. In particular, the "you do these and I'll do these" approach is sure to leave you unprepared for the exams.
All GitHub repositories must be created with all group members having write access and all group member names specified in the README.md file by 4:00 PM, Monday, November 26, 2018. This applies to those who choose to work alone as well!
There is a significant amount of work to be done here, and you are sure to have questions. It will be difficult if not impossible to complete the assignment if you wait until the last minute. A slow and steady approach will be much more effective.
Getting Set Up
You will receive an email with the link to follow to set up your GitHub repository for this problem set (ps6-yourgitname). One member of the group should follow the link to set up the repository on GitHub. Other group members will receive a subsequent email with a link to follow that will grant them the rights to clone the repository and commit and push changes to the origin on GitHub.
Syracuse Sequence
f(n) = 3n+1 if n is odd, and f(n) = n/2 if n is even.
Your program should take a number as a command-line parameter and print the values in Syracuse sequence starting at that value until it reaches 1, and then print out the length of the sequence. (6 points)
Finding the "Best Of"
For this part of the problem set, you will be implementing an interesting ordered structure and using it in a number of contexts.
In a recent lab, you were asked to find the 10 most frequently occuring words in an input text. At that point, we had already created an ArrayList of objects that represented the words and their frequencies. To find the 10 most frequent, we placed those objects in an array, then sorted that array by decreasing frequency, and finally printed the first 10 values in the sorted array.
In general, we are solving the problem where we are presented with n unordered values, and we wish to find the k "best" values, where "best" might mean smallest, largest, longest, any other metric we might define to compare the items. We assume that k << n.
Another option would be to search the values for the best one, then the second best one, then the third, and so on, and finally the kth best one.
Let's consider a different approach. We will add all of the values to a data structure (to be designed and implemented next) that tracks only the k best values that have been added to it so far. Once all n values have been added, the overall best k remain.
In the starter repository, you will find a class BestOf that includes a framework on which you will implement this data structure.
The structure is very restrictive: you can only construct one, add new values to it, query its size, and create an Iterator over its contents. Its constructor requires a size (the k value above) and a Comparator to use to define "best".
The instance variables include an array of the objects stored by the structure at any given time.
All remaining programming tasks will be graded as a programming assignment.
Your tasks to complete the structure are:
Notes:
E item = (E)data[i];
Applying to the Syracuse Sequence
Complete the main method in class Syracuse so that it generates the Syracuse sequences for all starting values from 1 up to n, given as a command-line parameter. To be completely clear, you will be generating n sequences, one for each starting value from 1 to n.
Your program should track and report the following:
You should maintain two BestOfs for this program, one responsible for each above the parts above. Should you wish to use an extra class to keep track of information for the Syracuse sequence generation (and that's a Good Idea), you might want to declare a non-public class inside of Syracuse.java that can be used only within that Java file (like the list nodes and iterator from SimpleLinkedList). You will also need to define "best" for each, either by creating appropriate Comparator objects or make your objects Comparable and using a NaturalComparator.
The intermediate values you will generate can get quite large, so you should store numbers in variables of type long and Long instead of int and Integer as you generate your sequences.
You will be generating some very long sequences, and a whole lot of them. There is no need to store the sequences in their entirety in any form! In particular there should be no ArrayLists or similar structures anywhere in your program. Generate each value in the sequence from the previous one, and add an appropriate entry to hold it to your first BestOf. And when you have finished generating each sequence, add an appropriate entry to hold it in your second BestOf.
Applying to METAL Graph Data
Next, you will apply this nifty new data structure to the graph data from the METAL project you first saw earlier this semester. Recall that graph files are linked from http://travelmapping.net/graphs/ and that you can see what this data looks like in map form by loading it into the METAL project's Highway Data Examiner, at /metal/hdx/.
Initially, you will work only with waypoint data and ignore the road segments. Start with your Waypoint class from earlier in the semester. Recall that a Waypoint represents the data for a single waypoint. It should include fields for the waypoint name and its latitude and longitude values, a constructor, accessors for the three components, and appropriate equals and toString methods. Also recall that the VertexSearch example has a main method that takes the name of a .tmg file to be loaded as a command-line parameter, and reads all of the waypoints from that file into an array of Waypoint objects. You do not need this program, but you might borrow code from it as you complete the tasks below.
Now let's use your working BestOf structure to compute some information about this mapping data.
What you need to do:
Note that your program never needs to store all of the Waypoint objects in any structure. Just add each Waypoint to each of your BestOfs.
A point of emphasis in this program is on error checking: if your program expects a command-line parameter that is not provided, your program should not crash with an exception. It should print a meaningful error message.
Submitting
Please answer questions that require a text response in the README.md file of your repository. Only one submission per group is needed.
Your submission requires that all required code deliverables are committed and pushed to the master for your repository's origin on GitHub. That's it! If you see everything you intend to submit when you visit your repository's page on GitHub, you're set.
Grading
This assignment is worth 100 points, which are distributed as follows:
> Feature | Value | Score |
SyracuseSeq.java correctness | 6 | |
Question 1 | 2 | |
Question 2 | 3 | |
Question 3 | 2 | |
Question 4 | 5 | |
BestOf add correctness | 20 | |
Question 5 | 2 | |
Syracuse main method correctness | 6 | |
Syracuse objects and comparators correctness | 7 | |
Question 6 (top 25 numbers) | 2 | |
Question 7 (top 25 sequence lengths) | 2 | |
Waypoint Comparator(s) for 6 criteria | 6 | |
WaypointBest correctness | 10 | |
Question 8 | 5 | |
Program design | 4 | |
Program style | 3 | |
Program documentation | 12 | |
Git commit frequency and message quality | 3 | |
Total | 100 | |