Map-based Educational Tools for Algorithm Learning (METAL)

led by Dr. James D. Teresco

Graph File Formats

A graph file format called "Travel Mapping Graph", using
a

`.tmg`

extension has been defined for the graphs
generated for by the Travel Mapping Project for use in
the

METAL Project.
Files are plain ASCII text and simply describe a list of vertices
(waypoints from TM data) and edges (connections from TM data) in a
format intended to be easy to read by programs in any language that
wish to make use of the data.

The Travel Mapping Graph file format can represent graph data in
"collapsed", "traveled", or "simple" form. The differences among
the formats

are described below.

The data is in "

`.tmg`" files which have the following format:

- The first line specifies the file format. It consists of three
space-separated entries, the first of which must be
"
`TMG`

". The second is a version number, which must be
"`1.0`

" or "`2.0`

". The third entry is either
"`collapsed`

", "`traveled`

", or
"`simple`

", indicating the specific edge format.
- The second line consists of two or three numbers: the number of
vertices (waypoints)
`|V|`, and the number of
edges, `|E|`, (road segments that connect adjacent waypoints),
and only in the case of "traveled" format graphs only, the number of
travelers who have traveled any edge in this graph.
- The next
`|V|` lines describe the waypoints. Each line
consists of a string describing a
waypoint, followed by its latitude and longitude as floating-point
numbers.
- The next
`|E|` lines describe the road segments. Each
line consists of two numbers specifying the waypoint numbers
(0-based and in the order read in from this file) connected by this
road segment, followed by a string with the name of the road or
roads that form this segment. For "traveled" format graphs, this is
followed by a hexadecimal string (described
below) that specifies which travelers have traveled this edge.
For "collapsed" or "traveled" format graphs, each line describing an
edge can optionally have a list of the space-separated latitudes and
longitudes of shaping points that define a more accurate path (for
both mapping and for computing distances) of the road segment.
- In the "traveled" format only, the edge data is followed by a
single line of space-separated TM usernames who have traveled the
edges in this graph.

Note that a "simple" format graph is just a "collapsed" format graph
with no edge shaping points.

"Hidden" Waypoints

Travel Mapping highway data uses a series of waypoints to define each
highway. Waypoints are added to a route for its endpoints and at
major intersections or, for freeways, interchanges. The route is then
approximated by straight line segments connecting pairs of consecutive
waypoints. On many routes, this is sufficient to form a good
approximation of the route. However, on more rural and especially
curvy roads, additional resolution is needed to obtain a reasonable
approximation. In those cases, waypoints are added at selected minor
intersections. In cases where no appropriate minor intersections are
located on a stretch of highway, a "hidden waypoint" will be added.
In the TM data, these are denoted by starting their label with a
"`+`

". These are not points where a traveler is likely to
begin or end his or her travel on the route, so they are not displayed
to users of the TM project itself.

Simple *vs.* Collapsed and Traveled Graphs

It is in the handling of these hidden waypoints where the "simple"
graph file format differs from "collapsed" and "traveled". The
"simple" format treats the hidden waypoints no differently than any
other. They become graph vertices, and each graph edge is a simple,
straight line connection between two vertices that represent waypoints
that are adjacent to each other in the underlying TM data. In the
"collapsed" and "traveled" formats, each hidden waypoint and its
incident edges are "collapsed" into their neighboring edges. The
coordinate pair of the hidden waypoint is then listed as a shaping
point along the new, collapsed edge.

Most users of TM/METAL graph data are likely to want to use the
collapsed format graphs. Since graph vertices that represent hidden
waypoints necessarily cannot be at intersections (if they were, the
waypoints would not be marked as hidden), each such vertex would have
a degree of 2, and would not add significantly to the structure of the
graph. One reason a user of these graphs might choose simple format
would be if they did not wish to keep track of shaping points with
their graph edges.

Edge Travelers

In traveled format graphs, each edge includes a hexadecimal string
that indicates which TM users have claimed to travel that segment. If
there are *n*_{t} travelers who have traveled any
segment, then there will be ⌈*n*_{t}/4⌉ hex
digits in the string. The 4 bits of the first hex digit's binary
representation indicate whether travelers 0-3 have traveled this edge.
The low order bit (1's place) is traveler 0, and the high order bit
(8's place) is traveler 3. So if the first character of the hex
string is '`A`

', with binary
representation `1010`

, the 0 in the 1's place indicates
that traveler 0 has not traveled this edge, the 1 in the 2's place
indicates that traveler 1 has traveled this edge, the 0 in the 4's
place indicates that traveler 2 has not traveled this edge, and the 1
in the 8's place indicates that traveler 3 has traveled this edge.
Subsequent characters each represent the travels of the next 4
travelers. Traveler numbers can be translated to TM traveler
usernames using the list on the last line of a
"traveled" `.tmg`

file.

For a larger example, consider a graph with a total of 11 travelers,
where an edge has hex code `6C7`

.
The `6=0110`

indicates that travelers 1 and 2 have
traveled this edge, the `C=1100`

indicates that
travelers 6 and 7 have traveled this edge, and
the `7=0111`

indicates that travelers 8, 9, and 10 have
traveled this edge.

Waypoint Names

Waypoint/vertex names in the graph data are derived automatically from
the names of intersections and other points in the TM highway data.
For example, the TM data for New York State Route 30 has a waypoint
named "

`GolfCouRd`

" for its intersection with Golf Course
Road in the Town of Amsterdam. When that waypoint is named in the
highway data graph, it is given the name
"

`NY30@GolfCouRd`

". The majority of waypoints are named
with this convention. This process is complicated by waypoints along
highways that run concurrent with each other (

*i.e.*, a road with
more than one designation) or at the intersection of two or more
highways. In these cases, a label that includes information about all
concurrent or intersecting routes is generated. In some cases, long
and complicated names result. For example, the TM data for New York
State Route 30 has a waypoint named "

`NY29`

" and New York
State Route 29 has a corresponding waypoint named "

`NY30`

"
that graph generation detects as a single waypoint/graph vertex. The
"failsafe" point name is "

`NY29@NY30&NY30@NY29`

". A
series of simplification steps are applied to these default names, and
this particular point ends up renamed as "

`NY29/NY30`

". In
most cases, the
automatic

name
simplification leads to reasonable names for the waypoints,but
some remain overly complex. Improving the automatic naming of the
points in the graph data is an ongoing effort. Fortunately, other
than having to read and occasionally type a long waypoint name, these
more complicated names do not affect the actual graph structure.