Computer Science 385
Design and Analysis of Algorithms

Spring 2017, Siena College

Lab 2: Introduction to METAL Graph Data
Due: Start of your next lab session

We will be studying many graph algorithms this semester, some of which you will be implementing in Java. For many of these tasks, you will be working with real-world data sets derived from highway systems.

A big advantage of working with this kind of data is that it has a connection to reality, and that we can visualize the data and the results of our manipulations of that data with the Google Maps API. This data is collected by members of the Travel Mapping (TM) Project (http://tm.teresco.org/). The Map-based Educational Tools for Algorithm Learning (METAL) Project (/metal/) uses data from the TM project and converts it into a format that is more convenient for us to load into a graph structure and use. Much more about the project is available at the link above, but everything you need to know for this week should be in this document.

You will be assigned a partner to work with on this lab. Only one submission per group is needed. The lab is graded out of 100 points.

Getting Set Up

You will need to create a BlueJ project (or set up to work in another IDE, if you prefer) for your programming work on this lab. When you complete the preliminary steps, you will be given a copy of the starter code for the main coding task.

The Graph Data

METAL provides hundreds of graph files that can be used to explore data structures and algorithms. They range in size from just a few to hundreds of thousands of vertices and edges. We can use the small graphs for tracing algorithms (or debugging code) by hand, medium-sized graphs for testing during implementation, and large graphs to help analyze the efficiency of algorithms and our implementations of them.

All graphs are linked from http://tm.teresco.org/graphs/.

The graph data comes in two formats, simple and collapsed, which are described in detail at /metal/graph-formats.shtml. We will use the collapsed format graphs, but will likely talk about the differences later in the semester.

For this first set of questions, please refer to the NY-all.tmg graph.

Question 1: How many vertices and how many edges does this graph contain? (1 point)

Question 2: What is the latitude and longitude pair for the vertex that represents the Latham Circle (label US9/NY2)? What is its vertex number? Note that vertex numbers start at 0. (Hint: load the file into your favorite text editor and figure out its line number.) (3 points)

Question 3: What are the endpoint vertex numbers and edge label (route name) for the last edge defined in the file? What are the vertex labels of the endpoint vertices? What are the latitude/longitude pairs of these endpoints? (Hint: load the file into your favorite text editor and jump to the appropriate lines.) (3 points)

Question 4: What are the endpoint vertex numbers and edge label for the edge defined on line 6327 of the file? What are the vertex labels of the endpoint vertices? What are the latitude/longitude pairs of these endpoints? What are the coordinates of the "shaping points" along this edge? (3 points)

Most of our work will involve writing programs that load one of these graphs and perform some operations on it. However, it is also nice to be able to see these graphs plotted on a map. This can be done with METAL's Highway Data Examiner (HDX), available at /metal/hdx/.

Question 5: Load one of the medium-sized graphs (a few hundred vertices and edges) into HDX. Zoom and pan to explore the graph data. Show your lab instructor your graph loaded in HDX. (4 points)

Reading the Graph into a Java Program

We need a graph data structure that can be used to store this graph data. In class we have seen that graphs can be stored in either an adjacency matrix or an adjacency list structure, and the graph's edges can either be directed or undirected.

Question 6: Which internal structure and directedness makes sense for our highway mapping graphs? Why? (3 points)

In class, we have looked at some sample implementations of graph structures. As part of this lab, you will be issued an implementation of a graph data structure that can be used to store the highway mapping graphs.

In addition to the topological information we already considered in class, here we will need to store more information with the vertices (label and coordinates) and edges (road names, list of intermediate points).
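
As a rough illustration of the extra per-vertex and per-edge data described above, here is a minimal sketch. Every name in it (AnnotatedGraphParts, Waypoint, RoadSegment, pointCount) is hypothetical and the starter code may organize things differently. The sketch deliberately does not commit to how vertices and edges are arranged into a graph structure.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical names, not the starter code's classes.
class AnnotatedGraphParts {

    // A vertex carries a label and a coordinate pair.
    static class Waypoint {
        final String label;
        final double lat, lng;

        Waypoint(String label, double lat, double lng) {
            this.label = label;
            this.lat = lat;
            this.lng = lng;
        }
    }

    // An edge carries its endpoint vertex numbers, a road name, and
    // any intermediate points, each stored as a {lat, lng} pair.
    static class RoadSegment {
        final int v1, v2;
        final String roadName;
        final List<double[]> intermediatePoints;

        RoadSegment(int v1, int v2, String roadName, List<double[]> intermediatePoints) {
            this.v1 = v1;
            this.v2 = v2;
            this.roadName = roadName;
            this.intermediatePoints = new ArrayList<>(intermediatePoints);
        }

        // Number of points along the edge, counting both endpoints.
        int pointCount() {
            return intermediatePoints.size() + 2;
        }
    }
}
```

An edge with no intermediate points is a straight segment between its endpoints, so pointCount() returns 2 for it.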

When you get to this point, you can request your copy of the starter code. Once you have it, study the code that reads in the graph data files (the HighwayGraph constructor), and the toString method that prints it out.

Run the program's main method, passing in the DC-all.tmg graph as its command-line parameter. Compare the output of the program with the TMG file and be sure you understand how the contents of the data structure correspond to the contents of the file.
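
To make the file-to-structure correspondence concrete, here is a minimal sketch of a reader. It is not the starter code's reader, and it assumes the following layout: a header line, a line holding the vertex and edge counts, one line per vertex (label, latitude, longitude), then one line per edge (two vertex numbers, a label, and optional latitude/longitude pairs for shaping points). The graph formats page referenced earlier is the authoritative description. The class name TmgParseSketch and the method summarize are made up for illustration.

```java
import java.util.Scanner;

// Illustrative only: assumes the file layout described in the lead-in.
class TmgParseSketch {

    // Returns one line of text per edge, naming its endpoint labels,
    // its road label, and how many shaping points it has.
    static String summarize(String tmgText) {
        Scanner in = new Scanner(tmgText);
        in.nextLine();                    // header line (format name and version)
        int numV = in.nextInt();          // vertex count
        int numE = in.nextInt();          // edge count
        in.nextLine();                    // consume the rest of the counts line

        // Vertex i is the i-th vertex line, so vertex numbers start at 0.
        String[] labels = new String[numV];
        for (int i = 0; i < numV; i++) {
            String[] f = in.nextLine().trim().split("\\s+");
            labels[i] = f[0];             // f[1] and f[2] hold latitude and longitude
        }

        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < numE; i++) {
            String[] f = in.nextLine().trim().split("\\s+");
            int v1 = Integer.parseInt(f[0]);
            int v2 = Integer.parseInt(f[1]);
            int shaping = (f.length - 3) / 2;   // remaining fields come in lat/lng pairs
            sb.append(labels[v1]).append(" <-> ").append(labels[v2])
              .append(" via ").append(f[2])
              .append(" (").append(shaping).append(" shaping points)\n");
        }
        return sb.toString();
    }
}
```

Under this assumed layout, the vertex numbered i sits on line i + 3 of the file when lines are counted from 1.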

Question 7: Show your instructor the output on your screen. (3 points)

Expanding the Program's Functionality

Your programming tasks involve expanding the functionality of the HighwayGraph class. Do not modify the code that constructs the graph to accomplish these tasks.

Vertex Search

The first task is to perform a search of the graph vertices to find the "extreme" vertices: those at the northernmost, southernmost, easternmost, and westernmost locations. For each, report the vertex label and its latitude/longitude pair. Also, find the vertices that have the shortest and longest labels. For these, you need only report the label. You may handle ties in any reasonable manner.
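
One way to picture this search is a single pass that keeps a "best so far" index for each of the six categories. The sketch below works on a plain array of a hypothetical Waypoint type rather than the starter code's graph, so the names ExtremeVertexSketch, Waypoint, and findExtremes are assumptions. It breaks ties by keeping the first vertex found, assumes a non-empty array, and treats a larger longitude as farther east, which holds as long as the data does not span the 180-degree meridian.

```java
// Illustrative only: hypothetical types standing in for the starter code's.
class ExtremeVertexSketch {

    static class Waypoint {
        final String label;
        final double lat, lng;

        Waypoint(String label, double lat, double lng) {
            this.label = label;
            this.lat = lat;
            this.lng = lng;
        }
    }

    // Returns indices in the order:
    // {north, south, east, west, shortest label, longest label}.
    static int[] findExtremes(Waypoint[] v) {
        int north = 0, south = 0, east = 0, west = 0, shortest = 0, longest = 0;
        for (int i = 1; i < v.length; i++) {
            if (v[i].lat > v[north].lat) north = i;
            if (v[i].lat < v[south].lat) south = i;
            if (v[i].lng > v[east].lng) east = i;
            if (v[i].lng < v[west].lng) west = i;
            if (v[i].label.length() < v[shortest].label.length()) shortest = i;
            if (v[i].label.length() > v[longest].label.length()) longest = i;
        }
        return new int[] {north, south, east, west, shortest, longest};
    }
}
```

Each vertex is examined once and all six comparisons happen in the same loop iteration, so no second pass over the vertices is needed.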

Before you start coding, you can see a visualization of this process using a version of METAL's HDX that has been enhanced to perform interactive algorithm visualizations, available at /metal/av/. Load a small to medium size graph (DC-all.tmg is a good choice) into this version of HDX. Then select "Search Vertices" from the "Algorithm Selection" and choose "Pretty Slow" for the simulation speed. Press "Start" and watch as the algorithm searches for the extreme points and shortest and longest labels.

Question 8: Show your instructor the completed algorithm visualization on your screen. (3 points)

Now, implement this vertex search in the main method of the HighwayGraph class.

Question 9: Demonstrate your program's vertex search results for the following graphs: DC-all.tmg, YT-all.tmg, HI-all.tmg, and NY-all.tmg. (20 points)

Question 10: Give the Big-O complexity of your vertex search algorithm in terms of |V| and/or |E|. (4 points)

Edge Search

Your second task is to perform a search of the graph edges to find the edges with the shortest and longest lengths. You will likely want to model the loop that traverses the graph's edges on the one in the toString method.
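
The starter code may already provide each edge's length; check there first. If a length had to be computed from coordinates, one common approach (an assumption here, not something this handout prescribes) is the haversine great-circle distance, summed over consecutive points from one endpoint through any intermediate points to the other endpoint. The names EdgeLengthSketch, haversineKm, and pathLengthKm are hypothetical, and the sketch uses a mean Earth radius of 6371 km.

```java
// Illustrative only: a great-circle length sketch, not the starter code's method.
class EdgeLengthSketch {

    static final double EARTH_RADIUS_KM = 6371.0;  // mean Earth radius

    // Haversine distance in kilometers between two points given in degrees.
    static double haversineKm(double lat1, double lng1, double lat2, double lng2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLng = Math.toRadians(lng2 - lng1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLng / 2) * Math.sin(dLng / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // Length of a path given as {lat, lng} pairs, listed in order from one
    // endpoint, through any intermediate points, to the other endpoint.
    static double pathLengthKm(double[][] points) {
        double total = 0.0;
        for (int i = 1; i < points.length; i++) {
            total += haversineKm(points[i - 1][0], points[i - 1][1],
                                 points[i][0], points[i][1]);
        }
        return total;
    }
}
```

With a length available for each edge, the shortest and longest can be tracked with the same "best so far" pattern used for the vertex search.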

Implement this edge search in the main method of the HighwayGraph class right after your vertex search code.

Question 11: Demonstrate your program's edge search results for the following graphs: DC-all.tmg, YT-all.tmg, HI-all.tmg, and NY-all.tmg. (25 points)

Question 12: Give the Big-O complexity of your edge search algorithm in terms of |V| and/or |E|. (4 points)

Next, augment your code so that your edge search counts how many edges you considered as you were searching for the longest and shortest edges, and print both that count and the number of edges in the graph that we stored when it was constructed (available to main as g.numEdges).

Question 13: For one of the graphs you have been using, notice that the number of edges visited did not match the number of edges in the graph. Why not? Can you see a relation between the two? (4 points)

This is a general problem when using an adjacency list representation for undirected graphs. In this case, it doesn't really matter if the same edge is considered multiple times. But other times (e.g., finding the total length of all edges), we might need to visit each edge exactly once.
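
The general issue can be seen on a bare adjacency list of vertex numbers. The sketch below is not the starter code: the names EdgeOnceSketch, countAllEntries, and countEachEdgeOnce are hypothetical, and it assumes each undirected edge appears in the lists of both of its endpoints and that there are no self-loops. One common convention for handling each edge once is to act on an entry only when the list's own vertex number is smaller than the neighbor's.

```java
import java.util.List;

// Illustrative only: adjacency.get(v) holds the vertex numbers adjacent to v.
class EdgeOnceSketch {

    // Counts every adjacency-list entry, as a simple nested loop would.
    static int countAllEntries(List<List<Integer>> adjacency) {
        int count = 0;
        for (List<Integer> neighbors : adjacency) {
            count += neighbors.size();
        }
        return count;
    }

    // Counts an entry only when it is seen from its lower-numbered endpoint.
    static int countEachEdgeOnce(List<List<Integer>> adjacency) {
        int count = 0;
        for (int v = 0; v < adjacency.size(); v++) {
            for (int w : adjacency.get(v)) {
                if (v < w) {
                    count++;
                }
            }
        }
        return count;
    }
}
```

For a triangle on vertices 0, 1, and 2, the first method reports 6 entries and the second reports 3 edges.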

Question 14: Fix your loop so that it considers each edge exactly once. (5 points)

The remaining 15 points will be awarded based on code style, documentation, and efficiency.

Submitting

Your lab submission is by demonstration of each item, and then by submitting the hard copy of this packet you received in lab with a printout of your source code attached. If you do not finish during your lab meeting, you should also attach screen shots of any required output that was not demonstrated during lab.