Map-based Educational Tools for Algorithm Learning (METAL)
led by Dr. James D. Teresco
Graph File Formats
A graph file format called "Travel Mapping Graph", using
a
.tmg
extension has been defined for the graphs
generated for by the Travel Mapping Project for use in
the
METAL Project.
Files are plain ASCII text and simply describe a list of vertices
(waypoints from TM data) and edges (connections from TM data) in a
format intended to be convenient to read by programs in any language that
wish to make use of the data.
The Travel Mapping Graph (TMG) file format can represent graph data in
collapsed
,
traveled
,
simple
,
custom
, or
partitioned
form.
collapsed
is the default. The differences among the
formats
are described below.
TMG files are stored with a "
.tmg" extension, and have the
following format:
- The first line specifies the file format and version. It
consists of three space-separated entries, the first of which must
be "
TMG
". The second is a version number, which must
be "1.0
" or "2.0
" or "3.0
".
The third entry is either "collapsed
",
"traveled
" (version 2.0 and up), "simple
",
"custom
", or "partitioned
" (version 3.0
only).
- The second line consists of two or three numbers: for all
formats, the number of vertices (waypoints) |V|, and the
number of edges, |E|, (road segments that connect adjacent
waypoints), and in the case of
traveled
format, the
number of travelers who have traveled any edge in this graph, and in
the case of partitioned
format the number of
partitions.
- When using the
custom
format only, the next two lines
describe the custom fields, one line
listing the 0 or more fields to expect on each vertex, and one line
listing the 0 or more fields to expect on each edge.
- The next |V| lines describe the waypoints. Each line
consists of a string describing a
waypoint, followed by its latitude and longitude as floating-point
numbers. For
custom
format files that specify custom
vertex fields for each vertex are then given on each line, one value
per field. For partitioned
format, the partition
number follows the latitude and longitude for each vertex.
- The next |E| lines describe the road segments. Each
line consists of two numbers specifying the waypoint numbers
(0-based and in the order read in from this file) connected by this
road segment, followed by a string with the name of the road or
roads that form this segment. For
traveled
format
graphs, this is followed by a hexadecimal string
(described below) that specifies which
travelers have traveled this edge. For collapsed
or traveled
format graphs, each line describing an edge
can optionally have a list of the space-separated latitudes and
longitudes of shaping points that define a more accurate path (for
both mapping and for computing distances) of the road segment.
For custom
format files that specify custom edge fields
for each edge are then given on each line, one value per field.
- In the
traveled
format only, the edge data is
followed by a single line of space-separated TM usernames who have
traveled the edges in this graph.
Note that a
simple
format graph is just
a
collapsed
format graph with no edge shaping points.
"Hidden" Waypoints
Travel Mapping highway data uses a series of waypoints to define each
highway. Waypoints are added to a route for its endpoints and at
major intersections or, for freeways, interchanges. The route is then
approximated by straight line segments connecting pairs of consecutive
waypoints. On many routes, this is sufficient to form a good
approximation of the route. However, on more rural and especially
curvy roads, additional resolution is needed to obtain a reasonable
approximation. In those cases, waypoints are added at selected minor
intersections. In cases where no appropriate minor intersections are
located on a stretch of highway, a "hidden waypoint" will be added.
In the TM data, these are denoted by starting their label with a
"+
". These are not points where a traveler is likely to
begin or end his or her travel on the route, so they are not displayed
to users of the TM project itself.
Simple vs. Collapsed and Traveled Graphs
It is in the handling of these hidden waypoints where
the simple
graph file format differs
from collapsed
and traveled
. The
simple
format treats the hidden waypoints no differently
than any other. They become graph vertices, and each graph edge is a
simple, straight line connection between two vertices that represent
waypoints that are adjacent to each other in the underlying TM data.
In the collapsed
and traveled
formats, each
hidden waypoint and its incident edges are collapsed
into
their neighboring edges. The coordinate pair of the hidden waypoint
is then listed as a shaping point along the new, collapsed edge.
Most users of TM/METAL graph data are likely to want to use the
collapsed
format graphs, so these are the default when
loading in HDX. Since graph vertices that represent hidden waypoints
necessarily cannot be at intersections (if they were, the waypoints
would not be marked as hidden), each such vertex would have a degree
of 2, and would not add significantly to the structure of the graph.
One reason a user of these graphs might choose simple
format would be if they did not wish to keep track of shaping points
with their graph edges.
Custom file format
As mentioned above, the custom
file format gives the
option to add extra data to be displayed with vertices and/or edges in
HDX. The two additional lines in the header, specify the extra data
fields for the vertices, then the edges. Fields can use any labels,
but HDX will automatically display graph entities honoring the special
fields color
, scale
,
and opacity
. Other fields will be displayed with each
vertex or edge, as appopriate, in the data table in HDX.
Edge Travelers
In traveled
format graphs, each edge includes a
hexadecimal string that indicates which TM users have claimed to
travel that segment. If there are nt travelers who
have traveled any segment, then there will be
⌈nt/4⌉ hex digits in the string. The 4
bits of the first hex digit's binary representation indicate whether
travelers 0-3 have traveled this edge. The low order bit (1's place)
is traveler 0, and the high order bit (8's place) is traveler 3. So
if the first character of the hex string is 'A
', with
binary representation 1010
, the 0 in the 1's place
indicates that traveler 0 has not traveled this edge, the 1 in the 2's
place indicates that traveler 1 has traveled this edge, the 0 in the
4's place indicates that traveler 2 has not traveled this edge, and
the 1 in the 8's place indicates that traveler 3 has traveled this
edge. Subsequent characters each represent the travels of the next 4
travelers. Traveler numbers can be translated to TM traveler
usernames using the list on the last line of a
traveled
.tmg
file.
For a larger example, consider a graph with a total of 11 travelers,
where an edge has hex code 6C7
.
The 6=0110
indicates that travelers 1 and 2 have
traveled this edge, the C=1100
indicates that
travelers 6 and 7 have traveled this edge, and
the 7=0111
indicates that travelers 8, 9, and 10 have
traveled this edge.
Waypoint Names
Waypoint/vertex names in the graph data are derived automatically from
the names of intersections and other points in the TM highway data.
For example, the TM data for New York State Route 30 has a waypoint
named "
GolfCouRd
" for its intersection with Golf Course
Road in the Town of Amsterdam. When that waypoint is named in the
highway data graph, it is given the name
"
NY30@GolfCouRd
". The majority of waypoints are named
with this convention. This process is complicated by waypoints along
highways that run concurrent with each other (
i.e., a road with
more than one designation) or at the intersection of two or more
highways. In these cases, a label that includes information about all
concurrent or intersecting routes is generated. In some cases, long
and complicated names result. For example, the TM data for New York
State Route 30 has a waypoint named "
NY29
" and New York
State Route 29 has a corresponding waypoint named "
NY30
"
that graph generation detects as a single waypoint/graph vertex. The
"failsafe" point name is "
NY29@NY30&NY30@NY29
". A
series of simplification steps are applied to these default names, and
this particular point ends up renamed as "
NY29/NY30
". In
most cases, the
automatic
name
simplification leads to reasonable names for the waypoints, but
some remain overly complex. Improving the automatic naming of the
points in the graph data is an ongoing effort. Fortunately, other
than having to read and occasionally type a long waypoint name, these
more complicated names do not affect the actual graph structure.