What do you do when a mile isn't actually a mile? Or when two rail mileposts are labeled with the same number but located in different places? These are just some of the questions Railinc faces as it works to build a map of the Chicago Gateway, the busiest rail hub in the U.S. In this guest post, Railinc Data Architect David Weinberg considers the curious case of Milepost 21 and how data that seems simple can get complicated in a hurry.
I spent an interesting hour recently with Abby Clark discussing geographic information systems (GIS). These systems are designed to capture, store, manipulate, analyze, manage and present different types of spatial or geographical data.
Abby is among the leading GIS minds within the freight rail industry. Not only did she lead the Association of American Railroads' GIS committee for years, she also helped to found the GIS program at CSX, where she worked until retiring earlier this year. These days, she's consulting for Railinc and leading our efforts to establish our GIS capabilities.
We are fortunate to have Abby. Her experience within the industry and knowledge of the various technologies will help us develop a solid foundation for Railinc's two big GIS technology projects, including one in the Chicago Gateway. The Chicago Gateway hub is the busiest rail hub in the nation and the freight rail equivalent of New York's JFK International Airport. Not only is Chicago superbusy, the inherent variability of multiparty rail transportation and brutal winter weather make for a challenging planning and execution environment.
This year, Railinc's Chicago Gateway work centers on creating an authoritative map of the hub that will include tracks and many other data points, such as mileposts, control points and corridors. We expect to put trains on the map by 2016. That will eventually support the display of dynamic train routing options based on changing conditions such as track repairs.
But when Abby and I met, we were working on something really basic. Or so I thought.
It was one simple locational data type and probably the most-used geographic element in the freight rail industry: the lowly, and certainly lonely, milepost.
Railroads use mileposts to define locations within their rail networks. The mileposts fall within broad regions in the networks called subdivisions, which are part of divisions. Railroad GIS departments maintain map layers of all the fixed assets along their track, along with the geo-coordinates.
The milepost number only has meaning in the context of a specific subdivision. For example, a particular milepost might be within the Santa Rosa County subdivision of CSX in northern Florida. There is undoubtedly another milepost with the same number in another subdivision.
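One way to picture this is as a composite key. The sketch below is only an illustration: the class name, field names, subdivision labels and coordinates are all hypothetical, not Railinc's or any railroad's actual data model.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MilepostKey:
    """Hypothetical key: a milepost number means nothing without its context."""
    railroad: str
    subdivision: str
    number: float


# Made-up coordinates (latitude, longitude), for illustration only.
mileposts: Dict[MilepostKey, Tuple[float, float]] = {
    MilepostKey("CSX", "Subdivision A", 21.0): (41.80, -87.60),
    MilepostKey("CSX", "Subdivision B", 21.0): (30.60, -87.00),
}


def find_by_number(index: Dict[MilepostKey, Tuple[float, float]],
                   number: float) -> List[MilepostKey]:
    """Return every key that shares a milepost number, whatever the subdivision."""
    return [key for key in index if key.number == number]


# Looking up "milepost 21" by number alone is ambiguous: two keys come back.
matches = find_by_number(mileposts, 21.0)
```

A lookup by the full key returns exactly one location, while a lookup by number alone returns several candidates.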
Abby and I were reviewing milepost data when I noticed something weird. She had a Chicago map on the screen, and it had multiple mileposts for the same location but from different railroads.
Why would this be?
"Oh, that's an easy one," she said. "Railroads often have mileposts for the tracks they use, even if they don't own the track."
We were looking at a squashed, centerline view of the map, which is basically a logical view that creates one main track and eliminates detail. Then she zoomed out on the map, and my mouth dropped open. The two milepost 21s were in different locations.
Will the Real Milepost 21 Please Stand Up?
How could this be?
There are two possible reasons, Abby said. One is that the two points were surveyed using different methods, one more accurate than the other.
The other?
"One might not be a physical milepost," she said.
Some railroads maintain a layer of virtual mileposts that are always exactly 5,280 feet apart, though most only maintain the old physical mileposts like milepost 21 shown above. While the track network changes over time, physical mileposts rarely move. A measured milepost could end up in a different (and virtual) location from the physical milepost.
So which one will we show to the world in our Chicago interface? Possibly neither.
For various reasons, Railinc might need to create a third milepost, a "reference milepost" that splits the difference between the two and also places it alongside the rail, where it should be. This reference milepost would be for display purposes only, not for track maintenance or any other operational work.
Still, we would need to maintain its traceability to the "real" ones. It is also possible we will show all three of them, but in different contexts.
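To make the idea concrete, here is a minimal sketch of how a reference milepost might be derived. It assumes planar x/y coordinates in feet and a track centerline given as a list of points. The names `ReferenceMilepost`, `snap_to_track` and `make_reference`, and the source ids, are hypothetical. This is one possible approach, not a description of what Railinc built.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # assumed planar (x, y) in feet


@dataclass(frozen=True)
class ReferenceMilepost:
    label: str
    position: Point
    sources: Tuple[str, ...]  # ids of the "real" mileposts, kept for traceability


def midpoint(a: Point, b: Point) -> Point:
    """Split the difference between two surveyed positions."""
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)


def snap_to_segment(p: Point, a: Point, b: Point) -> Point:
    """Closest point to p on the segment from a to b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return a
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))  # clamp so the result stays on the segment
    return (a[0] + t * dx, a[1] + t * dy)


def snap_to_track(p: Point, track: List[Point]) -> Point:
    """Closest point to p on a polyline track centerline."""
    candidates = [snap_to_segment(p, a, b) for a, b in zip(track, track[1:])]
    return min(candidates, key=lambda c: (c[0] - p[0]) ** 2 + (c[1] - p[1]) ** 2)


def make_reference(label: str, surveyed: Dict[str, Point],
                   track: List[Point]) -> ReferenceMilepost:
    """Build a display-only milepost from exactly two surveyed ones."""
    (id_a, pos_a), (id_b, pos_b) = surveyed.items()
    position = snap_to_track(midpoint(pos_a, pos_b), track)
    return ReferenceMilepost(label, position, (id_a, id_b))


# Hypothetical data: two "Milepost 21" positions on either side of a straight track.
track = [(0.0, 0.0), (1000.0, 0.0)]
surveyed = {"RR_A:21": (400.0, 30.0), "RR_B:21": (600.0, -10.0)}
ref = make_reference("21", surveyed, track)
```

The `sources` field is the important part: the display point can move, but it always records which railroad mileposts it came from.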
Looking at the same map, I noticed something really weird about the mileposts on the right side.
A Country Mile
The mileposts aren't the same distance apart. I pointed this out to Abby.
"It's the same issue," she said. "Track changes over time, but they don't move the mileposts. This can lead to some big differences in the distances. But it may not matter. It all depends on how you are using the data."
In other words, it's reasonable to assume that Milepost 5, below, is exactly a mile from Milepost 6. If you did, though, you would be dead wrong.
But if you used the geo-coordinates to calculate a point along the rails halfway between Milepost 5 and Milepost 6, you would be fine. Perhaps a better name for this type of data is not "milepost" but simply "post."
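As a rough sketch of that calculation, assume the track between the two posts is available as an ordered list of (latitude, longitude) vertices. The function names and sample coordinates below are hypothetical, and the haversine formula on a spherical Earth is an approximation I am choosing for illustration. A production GIS would likely use its own linear referencing tools.

```python
import math
from typing import List, Tuple

LatLon = Tuple[float, float]  # (latitude, longitude) in degrees
EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation


def haversine_m(a: LatLon, b: LatLon) -> float:
    """Great-circle distance between two points, in meters."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def track_length_ft(track: List[LatLon]) -> float:
    """Length along the track vertices, in feet (may not be 5,280)."""
    meters = sum(haversine_m(a, b) for a, b in zip(track, track[1:]))
    return meters / 0.3048


def point_along(track: List[LatLon], fraction: float) -> LatLon:
    """Point at the given fraction (0 to 1) of the distance along the track."""
    segments = list(zip(track, track[1:]))
    lengths = [haversine_m(a, b) for a, b in segments]
    target = fraction * sum(lengths)
    for (a, b), length in zip(segments, lengths):
        if length > 0 and target <= length:
            t = target / length
            # Linear interpolation in degrees; adequate over short segments.
            return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        target -= length
    return track[-1]


# Hypothetical track from "Milepost 5" to "Milepost 6" with one bend vertex.
track_5_to_6 = [(0.0, 0.0), (0.0, 0.01), (0.0, 0.03)]
halfway = point_along(track_5_to_6, 0.5)
```

The halfway point comes from the measured geometry, so it stays meaningful even when the distance between the two posts is not a true mile.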
The lesson here is that even the simplest locational data element can get complicated in a hurry. In the case of mileposts, here are a few questions that arose:
- How do we handle the differences between physical and virtual mileposts? Do they need to be labeled?
- Some railroads have multiple mileposts with the same number and in the same subdivision. These are generally associated with different tracks. How do we handle these?
- What are our actual use cases for mileposts? Without those, it will be difficult to define the data.
- Generally, how do we translate our virtual shared view of Chicago into railroad-specific, functional versions?
I was definitely impressed by the challenges we faced. After all, this was just one data point. We still had to look at corridors, control points, interchanges and many others. It was only about 10 a.m., and my head was already spinning. Abby wasn't fazed by any of this.
"The railroads have been struggling with these kinds of things for 10 years," she said. "Welcome to GIS."
—David Weinberg
David Weinberg is a data and information architect on Railinc's architecture team. Thanks to Abby Clark and Railinc employees Jason Hood and Bill Coupe for their contributions.