Geocaching Analytics – Part 5: GCInsight Visualizations with SSRS

Continuing our journey of analyzing Geocache data with various visualization tools, we arrive at SQL Server Reporting Services (SSRS).  SSRS is a server-based SQL Server component used for generating static (and generally printer friendly) reports.

Aside from its cost (free), one of the greatest benefits of SSRS is the ease of use in getting started with report creation.

This post won’t cover the installation and configuration of the server-based components of SSRS (there are plenty of resources out there on the Internet and the process varies based on your version and edition of SQL Server).  At its core though, it is made up of the following components:

  • A SQL Server Reporting Database which stores information used by SSRS, including connection information and reports.  This database is initially created by the installation and is managed by the product.
  • The Report Server and Report Manager services, which are the core service behind Reporting Services as well as a web server (running either natively or inside of SharePoint) which allows for the access and administration of reports and report components.
  • Design Tools such as Report Builder and SQL Server Data Tools (Visual Studio)
SSRS conceptual diagram.  Source:  http://www.accelebrate.com/library/tutorials/ssrs-2008

Going into the inner workings of SSRS and report development can go down the rabbit hole of complexity very quickly, so we’ll stop there.  A quick summary is that report design tools create report files which are stored inside of reporting services and can then be accessed via a web browser.  As we’re interested in geocaching visualizations, we’ll be focusing on the content created within the report design tools.

So, let’s start with some finished visualizations and then dive in a little bit into how to create one of them:

Spatial visualization of the state divided into North, Central, and South with the number of cache favorite points per county as well as color coding of caches with high numbers of favorite points.
Simple stacked percentage bar graph of the population distribution of Georgia between North, Central, and South.
Simple stacked percentage bar graph of the cache distribution of Georgia between North, Central, and South.
Spatial breakdown of which cache owner has the most geocaches owned for each Georgia county.
Spatial visualization of all geocaches in Gwinnett County, GA with coloring and size for number of favorite points per cache.
Evaluation of number of caches vs. number of favorite points for top counties with line graph showing favorite points to number of caches ratio.
Let’s dig into the first visualization, the spatial representation of number of favorite points per county and significant caches since that’s the most complex one shown.

SSRS 2008R2 (and beyond) has excellent mapping support built into the product.  The first thing we need is to add datasets to the report.  The map is currently using three datsaets:  one dataset handles the shading and total cache numbers for each county, one dataset handles the significant cache points, and the final dataset draws the north, central, south divider lines.

As ultimately I’m doing a number of visualizations, I’ve created a couple of low-level views which my subsequent queries pull from instead of pulling directly from tables.  They are:

View 1:

v_GCBasicInfo_Current SQL View
View 2:

v_GC_CountySummary_Current SQL View
With those views in place, I created stored procedures which would ultimately be used as the datasets for the map data.  Each of the datasets uses the following:
 
Dataset 1:
Dataset 1 (usp_rpt_gcstats_Caches_byLocationCounty — cache summaries by county with sample output


Dataset 2:

Dataset 2 (usp_rpt_gcstats_Caches_byLocation — individual caches with number of favorite points and spatial locations)

Dataset 3:

Dataset 3 (usp_rpt_gcstats_SpatialGARegionLines — simple spatial lines for the map)

After creating all the back-end views and stored procedures which make up the map’s datasets, creating the map itself is actually pretty easy.  In my case, I’ve added a shared data source to the report to make the database connection, then I’ve created each of my datasets as shared datasets which utilize that data source.  Finally, I’ve added those shared datasets to the report.

Shared data source and shared datsets added to report in report project

Adding the Map:
From the toolbox, select Map and add it to the whitespace of the report.

Select the Map Gallery, then States by County and Georgia
Optionally choose to zoom in and crop the map
Choose the Color Analytical Map to automatically add our first data layer for us during creation
Select the dataset added previously which contains summary information by county
Choose which field() in the map will map to which field(s) in the dataset.  In this case, the only match we have is COUNTYNAME to County
Choose the field to visualize (in this case the sum of favorites) as well as the field to be used as a source for labels (in this case the sum of favorites, which we’ll alter later)
If we save and then preview the map with the above settings, it’s already starting to look pretty good!

After completing the new map wizard, we already have a very complete map.  But, it needs a little cosmetic tuning and we have two more layers of data to overlay on top to make it a richer experience.

Next, we’ll add a Point Layer for bubbles to visualize the caches which have large numbers of favorite points.

I highly recommend using the Layer Wizard to add the Point Layer vs. manually adding it.  Strange errors can occur when manually adding and setting all of the properties.
Within the Layer Wizard, choose SQL Server Spatial Query
Choose the Caches by Location (which has a spatial field) as the spatial dataset
Set the type to Point.  Optionally, zoom the map in if you like
Choose the Bubble Map so we can color and size the circles based on the number of favorites
Choose the Caches by Location dataset as the analytical dataset
Set the options to control the bubble size and color for the number of favorite points and complete the wizard
Lastly, right click on the newly created layer and select Layer Data.  Add a filter (which in this case is a flag to only show caches which have 25 or greater favorite points) due to the large number of caches which would otherwise be displayed (20 thousand)
Preview the map once more.  It still needs cosmetic cleanup and one last layer

Now, we’ll add the basic line layer so we can split the state into thirds.

Click on the map and in the layer selection, choose to add a new Line Layer (without using the wizard)
Choose spatial field and select the dataset and field which contains the line data
Save the changes and preview the map and we now have our third layer of data

At this point, our map is complete with all of its data sources.  The one remaining housekeeping item is to clean it up to make it more presentable.

One of the two legends can be deleted.  Additionally, the line type can be changed to a thin black dashed line to be less obtrusive.  The shading for the counties and points can be changed by right clicking on a layer and reviewing the color and size rules for the item.

Lastly, the labels on the counties (the polygon layer) can be changed to be more friendly by changing the label text to the following string:  =Fields!County.Value & VbCrLf & “(” & Fields!NumFavs.Value & “)”

A completed example of the SSRS report file (the RDL file) is available from this link.

[ Back to Part 4 | Proceed to Part 6: Going Custom ]