GCInsightA Modern Day Balloon Race (Geocaching GCInsight Analytics – Part 6)

They’re not as popular as they once were, mostly for environmental reasons, but when I was in elementary school, balloon races were annual events which were tons of fun.  For anyone that hasn’t had the experience in participating in a balloon race, each participant gets a regular helium balloon with an attached string and postcard.  They’re all released at once and the balloons spread sometimes hundreds of miles.  If a postcard is found, it’s reported or mailed back and whichever one gets found the furthest away wins.  While balloon races are great fun, there are some environmental concerns with releasing hundreds of balloons (which essentially become trash) regularly into the wild.

Balloons released in a balloon race

Balloons released in a balloon race

So, what is a more modern and environmental way to have the same kind of fun?  Believe it or not, Geocaching has the answer.  Geocaching is a game in which millions of physical containers are hidden throughout the world.  These containers are tracked on a website which, with each listing, contains GPS coordinates of the geocache and an inventory of what is inside it. Trackables, or travel bugs, are physical items, each of which has a unique tracking code, which can be placed inside of a geocache.  Each trackable item maintains a history of the geocaches (and their locations) that it has been placed within — and that becomes a virtual balloon for our balloon race.

A Geocaching Travel Bug

A Geocaching Travel Bug

Map of Stops for a Geocaching Travel Bug

Map of Stops for a Geocaching Travel Bug

 

Thanks to the history that is maintained by a Geocaching Travel Bug, the “virtual balloon race” or “trackable race” is possible.  All that is needed is to set some arbitrary rules to turn an ordinary trackable into a racer.  Each person brings a travel bug to a set location at a set time.  They are then left within the geocache and after an agreed upon amount of time (say a year), the race is over.  Geocachers tend to know whenever they find a trackable inside of a geocache to pick it up and move it somewhere else, so that is our race!  Travel bugs are the balloons and postcards and geocachers are the wind.

So what, you ask, does this have to do with business intelligence, SQL server, data visualization, or anything else?  Well, I’m glad you asked, we’re finally getting to that.  While geocaching.com provides the core functionality for how a Travel Bug functions, it provides absolutely no functionality for tracking a race — no way to group which Travel Bugs are in the race, where they each are in relation to each other, or how far each one has traveled compared to the others.  And that’s where our custom development comes in!

 

Gathering the Race Data

The first challenge to overcome is gathering the race data.  Data for each Geocaching Travel Bug is maintained solely on geocaching.com web pages.  While an API does exist, it isn’t publicly available for use, so that means data must be scraped from web pages.  Each trackable has two web pages of information — one which contains general information, including the total distance traveled to date, and one which contains a list of all stops the trackable has made.  To complicate things even further, one of these two pages requires a valid login to access — so the scraping mechanism must support forms-based authentication.

I considered a number of potential solutions to these challenges, but ultimately I decided that a custom .NET (C#) Windows Service was the way to go.  I designed the data gathering Windows service to do the following upon being started:

  1. Start a refresh timer.
  2. On firing of the timer, perform a query to a local SQL Server database for the list of trackables which are to have data fetched.
  3. Create an internal web browser session and authenticate to the geocaching website.
  4. After authenticating, download the general info and stop data web page for each trackable in the refresh list.
  5. Parse the general info page and stop data page for each trackable for relevant information and insert it into the appropriate tables in the local SQL database.
A small excerpt from the GCInishgt TBTrack data gathering method.

A small excerpt from the GCInishgt TBTrack data gathering method.

 

Visualizing the Data via a Scoreboard

Having developed a method to store all the trackable race data locally as well as a method to update that data periodically, the fun could begin — summarizing and visualizing the data.  The first few visualizations were pretty simple — like any race, there needs to be a scoreboard which ranks the racers from 1 to whatever and shows how much of a gap there is in between positions.  While simple, the best type of visualization for this is a simple table.  To create this table, I wrote a simple T-SQL stored procedure in the local SQL database which would be responsible for querying and ranking the data and then I wrote a simple web page to execute that query and then format and display the results.

A Rankings Scoreboard (by total miles traveled) for our Trackable Race

A Rankings Scoreboard (by total miles traveled) for our Trackable Race

 

A scoreboard is the bread and butter for any race, but what other fun things could be done with this data?  Well, I have a lot of data points, each of which has a latitude and longitude associated with it.  How about a map?

 

Visualizing the Data with Google Maps

Google Maps is an incredible product.  Lesser known than the typical google maps interface where you can go and search for an address or route and see it on a map is the Google Maps JavaScript API.  The Google Maps JavaScript API provides the functionality to embed a Google Map inside of a web page and then customize it programmatically — such as drawing markers/pins, lines, or labels at various points on the map.

The first map I created was a Google Map which showed the current position of each trackable in the race in relation to each other.  This fun and interactive visualization allows for seeing where each trackable is in relation to the starting point and how close they are to one another at a quick glance.  It won’t tell you which one is in the lead, but the scoreboard in the previous section will always be a better visualization for that sort of information.

Map of Current Position of All Racing Trackables

Map of Current Position of All Racing Trackables

Getting started with the Google Maps API is very easy and the Google Maps website provides incredible instructions on how to get started using the API so I won’t go into too much detail, since they do a better job than I could (and will keep it updated as things change.)  Some basic knowledge of HTML and JavaScript is required (this is developing a simple web page after all!)   That being said, the first step is to create a blank map and center it on a location of your choosing and then add a marker.  With that mastered, the race visualization isn’t more complex than that.

While the Google Maps API provides the functionality to create a map and place a marker on it, it doesn’t provide the functionality to easily add a label to that marker.  Fortunately, that functionality is easily added by using the “MarkerWithLabel.js” library which is freely available on the Internet.  Using this library, it is possible to create a marker as easily as it is a native one, but to include a text label next to that marker to describe what the marker is (in this case, the labels are used to list the name of each racer)

Sample Google Maps API JavaScript Code from the TB Race Map

Sample Google Maps API JavaScript Code from the TB Race Map

 

Taking the Google Maps Visualization to the Next Level

The scoreboard visualization was highly useful for determining the status of the race.  And the Google Maps visualization was a fun way to see where each racer was right now.  But what about the next level?  What would happen if you visualized every single movement by ever single racer in a time sequence?  I wondered the same thing too, and was determined to figure out how to do it.

After thinking about the problem, I realized it wasn’t terribly complicated, it just had a few challenges.  The Google Maps API allows for making changes to the map object after it’s been displayed on the web page (animation) and all of my race data is time stamped and has latitude and longitude.  I just needed to assemble the pieces.

I started by creating a T-SQL Stored Procedure in my local SQL database to collect all of the relevant data.  This included the list of all of the trackables in the race (and their names) as well as a list of the date and location of each stop that they made.

Next, I wrote a .NET (C#) program to do the heavy lifting.  The job of this program was to execute the above stored procedure to retrieve the data, manipulate it for output, and then generate the necessary HTML and JavaScript to perform the animation — essentially, dynamically writing a web page with all of the necessary data inside of it.

After retrieving the data from the database, the first job of the program was to convert the list of stops into lines.  A line has four components — a staring point, an ending point, the date it was completed, and the trackable which the line applies to.  With a list of lines, the next task was to sort these in chronological order and then insert gaps for days in which no activity occurred.  For instance, if there was movement on April 12th, 13th, and 14th, but no movement on the 14th, 16th, or 17th, I wanted the animation to pause appropriately for the gaps instead of just jumping to the 18th.  To do this, I number every day in the race (from 1 to 365) and I associated all of the movement for each day to its corresponding number.  So, some days may have many lines associated with them and some days may have none.  The reason for organizing the data like this was for the final key component of the animation.

I decided the best way to perform the animation was a clock.  Each time the clock ticked, a day of lines would be drawn.  Creating a JavaScript timer is pretty easy and especially in this case, since the data was already arranged in such a way that each line to be drawn had a day number associated with it, the task was straightforward.  On the first tick of the clock, the day 1 lines would be drawn.  On the second tick of the clock, the day 2 lines would be drawn, etc.  If there were no lines to be drawn for a given day, the tick would occur and then move onto the next.

And with that, I had my animation!  I later went back and added a little additional polish to it.  I drew a custom marker icon (instead of the default pushpin) and on each clock tick I had the marker move to the last point of the line for each racer.  Additionally, I created a color palette and assigned each racer a number corresponding to a color in the palette so each would have a uniquely colored path.  Finally, I created a legend to display which racer is which color by utilizing only the labels of MarkersWithLabels.

Sample Code from the Travel Bug Race Animated Map

Sample Code from the Travel Bug Race Animated Map

The end result was spectacular.  21 Travel Bug Racers traveling over 190,000 miles throughout the world all over the course of the year.  Curious to see what that looks like?  I don’t blame you.  I’ve attached a recording of the JavaScript animation below.  In the below animation, the web page which performs the animation is dynamically generated based on the content of the database (the status of the racers).  The dynamically generated web page is then a self-contained entity which contains all of the code for performing the animation of the race.

 

One Last Visualization with SSRS

The animated map of the race was truly interesting — I’d never seen anything quite like that before and everyone that has seen it has enjoyed it.  To wrap things up, I wanted to take one additional step and that was to look at a summary of the race as a whole.  Since I’d already gathered all of the data I needed, this was purely a presentation challenge.  To accomplish this task, I looked to my trusty friend, SQL Server Reporting Services.

Using my local SQL Server database of the race data, I created a number of T-SQL stored procedures to feed my SSRS report.  The goal of the report was to provide an overview and interesting facts about the race as a whole — rather than single out winners (which the earlier scoreboard already does).  Again, the summary of the information was truly interesting and I’ve shared it below.

The Travel Bug Race Summary SSRS Report

The Travel Bug Race Summary SSRS Report

I haven’t shared the full source for any of the above visualizations, however, if you are particularly interested in how to accomplish any specific task above, contact me directly and I’m happy to share the necessary code or information.

And so concludes my GCInsight series of posts — at least for now.  I’m always playing with and visualizing data and Geocaching data happens to be a very interesting set of data to visualize.  So, stay tuned for a Part 7 of the GCInsight series?  Maybe?

[ Back to Part 5 ]