Clinched Highway Mapping Data as a Pedagogical Tool

Project is now METAL!

This project has been superceded
by Map-based Educational
Tools for Algorithm Learning (METAL). These pages remain for
historical interest and to help those with older URLs find the
current version of the project.

Motivation

Nearly any topic in a data structures or algorithms course will be
more interesting for students if they can apply what they are learning
to real-world data and visualize results in a meaningful way. When
using graph data, it needs to be small enough to be manageable, but
large enough to be interesting. This might consist of a small road
system, airline flight schedules, or even the layout of a campus or
building. This site describes a system being developed
by Dr. James D. Teresco that
allows students to experiment with graph algorithms using
the Google Maps API
and highway routing data from
the Clinched Highway Mapping (CHM)
Project. Students can implement graph algorithms and display, in
Google Maps, the results of computations using those algorithms.

The CHM Project

The CHM Project has gathered information about the routes of highways
in North America and Europe. The project allows travelers to track
the segments of roads they have traveled (or, in the project's
terminology, "clinched"), to see representations of their travels in
map form, and to compare the extent of their travels with those of
other project users. The data that describes the highways needs to be
at a fine enough granularity to allow users to specify their travels
with a reasonable level of accuracy, but coarse enough to be able to
present (in both list and map form) the endpoints of road segments
available to be marked as clinched. Each route is added to the
project ("plotted") by listing the latitude/longitude coordinates of
the route's endpoints and at major intersections along the route (the
"waypoints" of the route). Project volunteers plot the highways at an
appropriate level of granularity using public domain data sources such
as route logs from state departments of transportation, map data from
the OpenStreetMap project,
and government satellite images.

Generating Graphs from CHM Data

The CHM data is used to generate a graph that represents a highway
system. A preprocessing program reads the set of waypoints for each
route and matches up shared intersections based on coordinates. The
waypoints (of all routes in a system) are the graph vertices and the
road segments connecting adjacent waypoints are the graph edges. Each
vertex is labeled with a string based on its route name(s) and
waypoint name and stores its coordinates; edges are labeled with the
name of the route that connects its endpoints. Distances (edge
lengths or weights) are computed easily from the coordinates. The
accuracy of the routings of highways is not perfect -- curved road
segments are approximated by a series of straight segments -- but it
is this "imperfection" in the data that makes it perfect for use in
this project. The data is accurate enough to be plottable on a map,
but small enough to be manageable for use in many graph algorithms
studied in a data structures or algorithms course.

Graph Data

The highway systems available in the CHM project vary in size, meaning
graph data can be generated at a variety of scales. For example, the
Interstate highways in Hawaii generate a graph with only 47 vertices
and 48 edges, while the New York Interstate, U.S., and state highway
systems combine to form a graph with 8275 vertices and 9214 edges.
And all plotted routes in the project produce a graph with 278,477
vertices and 299,990 edges!
The data is in "`.gra`" files which have the following format:

- The first line consists of two numbers: the number of vertices
(waypoints)
`|V|`, and the number of edges,`|E|`, (road segments that connect adjacent waypoints). - The next
`|V|`lines descibe the waypoints. Each line consists of a string describing a waypoint, followed by its latitude and longitude as floating-point numbers. - The last
`|E|`lines describe the road segments. Each line consists of two numbers specifying the waypoint numbers (0-based and in the order read in from this file) connected by this road segment, followed by a string with the name of the road or roads that form this segment.

The Highway Data Examiner

Google has made an application programmer interface (API) available to
allow customized plots of data in Google Maps. It is implemented as a
set of Javascript classes and methods, and is fairly easy to learn and
use. It requires free registration to obtain an API key. However,
students, or instructors for that matter, need not learn the Google
Maps API to make use of this system. The Highway Data Examiner (HDX)
is built using the Google Maps API, and that allows plotting of the
CHM and derived data. For example, a plot of all Interstate highway
waypoints and their connections in Hawaii:

The following links generate maps on-the-fly using HDX for input
graphs or results of programs that used the graph data:

- The convex hull of the Interstate highway system data for the contiguous United States.
- A computed routing the highways in New York's Adirondack Park.
- A plot of all points and connections in the Yukon territorial highway system.

Usage in Class Assignments

These graph data files can be used as input for class examples or
assignments in a data structures or algorithms course. This data was
used in a laboratory assignment for a data structures course at Mount
Holyoke College where students were required to build a graph
structure representing a highway system then perform Dijkstra's
algorithm to compute the shortest route between two specified
waypoints. Student programs generated output files listing the road
segments along the shortest path. These were then uploaded to a
course web server, where an instructor-provided program used the
Google Maps API to visualize their results. Students were able to
develop and debug their programs using small data sets like the Hawaii
Interstates, then use those programs to compute much more complex
routes using the larger data sets. The software was updated and
extended for use during the Spring 2011 semester at Siena College and
the Fall 2012 semester at The College of Saint Rose in Analysis of
Algorithms courses.