Geocaching Analytics – Part 4: GCInsight Visualizations with Power Map
Previously, we used various tools to download a copy of our Geocache data from geocaching.com and use a one-time SSIS package via the data import wizard to import that data into a SQL Server database.
Now the fun begins and we can play with some visualizations! First up, we’ll play with Microsoft’s brand new Power Map functionality for Microsoft Excel.
Power Map is part of the suite of tools which Microsoft has named the Power BI suite. The Power BI tools, such as Power Map are available in two flavors — on premise and cloud-based as part of Office 365, both of which utilize Excel 2013 Professional Plus.
One interesting note is that Microsoft is attempting to drive traffic to the Office 365 subscription flavor of Power Map over the standalone version. As such, the standalone version is technically called “Power Map Preview for Excel 2013” but the functionality is the same between both. The standalone version was originally to be time limited and stop functioning after 5/30/2014, but that limitation has since been removed.
First thing’s first — ensure that you have Power Map (or Power Map Preview) downloaded and installed. After installing, we’ll need to go into the Excel addins and enable it (as well as PowerPivot which is another tool we’ll use for managing our data)
|In Excel Options, select Add-Ins and on the bottom, select Manage COM Add-ins|
|Place checkmarks next to Microsoft Office PowerPivot and Microsoft Power Map Preview|
|After saving your changes, you should have POWERPIVOT on the main ribbon and under INSERT a Map option|
After getting Power Map enabled, the first thing you might be tempted to do would be to go to the INSERT tab and Launch Power Map. If you do this, Power Map will launch with a message that Power Map needs data to work with. Since our Excel file has no data in it yet, there’s nothing to map.
We’ll be using another Excel component, Power Pivot, to pull our data over from SQL Server so Power Map has something to work with.
Start by selecting the POWERPIVOT tab and then press the Manage button which will launch PowerPivot. This will launch a fairly empty PowerPivot window with a new ribbon on top.
From our PowerPivot window, select From Database on the top ribbon and From SQL Server beneath that. This will launch the Table Import Wizard.
Within the wizard, choose a friendly name, the name of your SQL server, the authentication method, and the database name. Once complete, press the Test Connection button and hopefully it’ll succeed.
By selecting the Preview and Filter button on the table selection pane, we can choose exactly which columns are brought over. For performance reasons, it’s generally a good practice to not bring over anything you definitely don’t need, but our dataset is small enough that it doesn’t matter too much. Additionally, you may want to include filters. Since we’re only concerned about active Geocaches, we’re going to add a filter to the Archive column so we only get values of 0 (not archived).
Take note of the geospatial data within the table. There are no columns which are SQL Server geospatial data types, but we have Country, State, County, Latitude (blacked out for security), and Longitude (blacked out for security). This will be helpful to know later.
After pressing finish, Excel PowerPivot will fetch the data from SQL Server. At this point,we now have a cache of the SQL Server data inside of Excel which can be refreshed from the source system as needed.
You’ll now see all of your data laid out in a big table. You can perform further filtering on it here if you like, but note that you cannot actually change the raw data. Should you decide that you should’ve brought over additional columns from SQL or there were columns you really didn’t want, you can modify your selection by going to the Design tab and selecting Table Properties.
With our data carefully stored away in PowerPivot, you can now close the PowerPivot pane which will take you back to your original blank workbook. Don’t be alarmed that the page is blank — your data is still there, it’s just over in the PowerPivot section of Excel. At this point, it’s also a good idea to save your file, as Excel is known to crash at the most inconvenient times.
Now, if you go to the Insert tab and choose Map -> Launch Power Map you’ll see a Launch Power Map window appear. Select New Tour and we’ll finally be able to start visualizing some data!
One of the great things about Power Map is that it’s smart. Right off the bat, it knew that we wanted to use our PowerPivot data for visualization and it even figured out which fields within that data were geography types (latitude, longitude, etc). Not too shabby! So, without doing anything other than importing our data, we already have a visualization of some sort.
So, we have A visualization, but it’s not a very useful visualization. It pretty much shows us a big blob of blue covering all of the state of Georgia. A couple of changes we’ll make right off the bat. First, we’ll change the type to Flat Map on the ribbon. Next, we’ll de-select everything except for Latitude and Longitude from the “Choose Geography” section, since we’re interested in exact points. It’s already starting to look better.
While better, it’s still not incredibly useful. If we assume each big blue square is one or more Geocaches, we can make an assumption that there are many more in the top half of the state than the bottom half of the state, but we can do much better than that…
Press the Next button in the bottom right corner and that will save the changes we made to Geography. Now, we have a new section where we can optionally choose various properties to map as well as change the type of map. Let’s change the type to HeatMap. Wow, that is already way more useful! We can more clearly see the hotspots for Geocaches in the state.
|Geocache heatmap of caches in the state of Georgia|
We’re getting better, but it’s still a bit muddy. Select the leftmost option, which looks like a stack of papers and then select the gear icon next to Layer 1. This will allow us to adjust the settings for how the heatmap performs its groupings. By redusing the Radius of Influence and potentially the Color Scale we can break the big blob of red up into more diverse colorings.
One of Power Map’s huge strengths is that it’ll adjust the scale of visualization based on the zoom level. For instance, in the above, we have a big glob of red in the north half of the state. If we do nothing else other than zoom into the big blob of red, we automatically start to see more detail. Just by zooming in, the big red blob automatically shows more detail and we can quickly see there is a much higher density in the north half of the city than the south side.
|Geocache heatmap of caches in northern Georgia|
For one final visualization with Power Map, we’re going to add another dimension to our map. Every Geocache has a number of favorite points associated with it. These are awarded by people to caches they like. Over time, the “best” caches tend to accumulate the most favorite points. Seems a great opportunity for a heatmap!
By simply placing a checkmark next to the FavPoints item, Power Map automatically knows I want to sum FavPoints. If I then reduce my Radius of Influence down to 60% and zoom out to see the whole state, I see a much different picture than before, with some similarities. The northern portion of the state definitely has more favorite points than the rest, but you can also make out hotspots, which tend to be major cities — Atlanta, Augusta, Gainesville, Macon, Savannah, Valdosta, etc. Pretty interesting!
|Geocache favorite point distribution heatmap for Georgia|
If we zoom in to the northern portion of the state and bump our Radius of Influence back up to 100% we can once again see particular hotspots in the area.
|Geocache favorite point distribution heatmap for northern Georgia|
With that, I’ll conclude my visualizations of Geocaching data using Power Map. Power Map is capable of much more, including other chart types such as columns and bubbles as well as time lapse, 3-d, and creation of flyover movies. It does have some weaknesses as far as granularity of control of charting elements as well as the inability to use custom GIS shapefiles. Over time, I hope that it becomes an increasingly powerful tool
Next, we’ll do some visualizations with the classic tool, SQL Server Reporting Services (SSRS) which will allow us to dig into some really granular detail.