Friday, June 11, 2010

Geospatial Analytics using Teradata: Part I

In October, a co-worker and I will be giving a presentation at the Teradata PARTNERS conference. The topic is how Railinc uses Teradata for geospatial analytics. Since I did not propose the paper, write the abstract, or even work on geospatial analytics, I will be learning a lot during this process. To help with that education, I will be sharing some thoughts in a series of blog posts.

To kick the series off, let me share the abstract that was originally proposed:
Linking location to information provides a new data dimension, a new precision, unlocking a huge potential in analytics. Geospatial data enables entirely new industry metrics, new insights, and better decision making. Railinc, as a trusted provider of IT services to the Rail Freight industry, is responsible for accurate and timely dissemination of more than 10 million rail events per day. This session provides an overview of how Railinc Business Analytics’ group has implemented Active Data Warehouse and Teradata GeoSpatial technologies to bring an unprecedented amount of new Rail Network insight. The real-time calculation of Geospatial metrics from rail events, has enabled Railinc to better assess; 1) Rail equipment utilization 2) repair patterns 3) geographic usage patterns, and other factors. All of which, afford insights that impact maintenance program decisions, component deployments, service designs and industry policy decisions.
Below is a first pass at an outline for the talk. It is preliminary and will most likely change over the coming months.
  1. Describe Railinc's Teradata installation
  2. Describe Railinc's source systems
    1. Rail car movement events
    2. Rail car Inventory
    3. Rail car health
    4. Commodity
  3. Describe our ETL process
  4. Explain the FRA geospatial rail track data
    1. Track ownership complexity
  5. Tie 1-4 together
    1. Current state of car portal
    2. Car Utilization analytics
    3. Traffic pattern analytics
  6. Lessons learned
    1. Study of different routing algorithms
    2. Data quality issues
Item 5 is the problem: how can we tie our source systems together with geospatial data in a compelling way? One idea is a portal that provides information about the current state of a rail car. How would geospatial data fit into this portal? Location is the most obvious answer, but is there something more interesting? What about an odometer reading for a rail car? Outside of the portal, there are ideas around car utilization and traffic patterns. I like both of those, but I need to learn more about them.
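To make the odometer idea a little more concrete, here is a rough Python sketch of how I imagine it could work. It is not how Railinc computes anything. It assumes each movement event carries a latitude and longitude, and it simply sums the great-circle (haversine) distance between consecutive events for a car. The `MovementEvent` class, its field names, and the function names are all hypothetical, made up for illustration.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


@dataclass(frozen=True)
class MovementEvent:
    """Hypothetical shape of a rail car movement event (an assumption)."""
    car_id: str
    timestamp: str  # ISO 8601 string, so plain string sorting is chronological
    lat: float      # degrees
    lon: float      # degrees


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def odometer_miles(events):
    """Sum straight-line distances between consecutive events for one car.

    Events are sorted by timestamp first, so input order does not matter.
    """
    ordered = sorted(events, key=lambda e: e.timestamp)
    return sum(
        haversine_miles(a.lat, a.lon, b.lat, b.lon)
        for a, b in zip(ordered, ordered[1:])
    )
```

Straight-line distance will understate the miles a car actually travels, since track does not follow great circles. A more realistic odometer would presumably measure distance along the FRA track geometry, which is where the routing algorithms in item 6 come in.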

These are some issues/questions I need to answer over the coming months. Along the way I plan on sharing information about implementation details, possible business cases, and any problems I come across.

1 comment:

SpinPhD said...

let's talk about this - there's an article opportunity that goes along with it.