Lunch and learns are a great way for a team to learn new things, share that knowledge with each other, and practice presentation skills. We typically do casual 30-minute sessions with a group of IT developers, with topics ranging from .NET to SQL Server databases to Business Intelligence.
For our October, Halloween-themed presentation, I chose to give an introductory presentation on data visualization, titled “Avoiding the Horrors of Scary Visualizations: An Introduction to Data Visualization.” The presentation was targeted toward people with no background in data visualization and started with a quick history and some of the key players (Tufte and Few), bridged into some tips and best practices, and closed with a number of “scary” visualization examples.
Overall, the presentation went very well and seemed to be well received. The full presentation is available here.
A Modern Day Balloon Race (Geocaching GCInsight Analytics – Part 6)
They’re not as popular as they once were, mostly for environmental reasons, but when I was in elementary school, balloon races were annual events which were tons of fun. For anyone who hasn’t had the experience of participating in a balloon race: each participant gets a regular helium balloon with an attached string and postcard. They’re all released at once and the balloons spread, sometimes traveling hundreds of miles. If a postcard is found, it’s reported or mailed back, and whichever one is found the furthest away wins. While balloon races are great fun, there are some environmental concerns with regularly releasing hundreds of balloons (which essentially become trash) into the wild.
Balloons released in a balloon race
So, what is a more modern and environmental way to have the same kind of fun? Believe it or not, Geocaching has the answer. Geocaching is a game in which millions of physical containers are hidden throughout the world. These containers are tracked on a website which, with each listing, contains GPS coordinates of the geocache and an inventory of what is inside it. Trackables, or travel bugs, are physical items, each of which has a unique tracking code, which can be placed inside of a geocache. Each trackable item maintains a history of the geocaches (and their locations) that it has been placed within — and that becomes a virtual balloon for our balloon race.
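Since each trackable carries a history of coordinates, picking a “winner” for our virtual balloon race is just a great-circle distance calculation. Here is a minimal sketch; the trackable codes and coordinates are made-up sample data, not pulled from geocaching.com:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in miles."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 3959 * asin(sqrt(a))  # mean Earth radius of ~3959 miles

# Hypothetical trackables: tracking code -> (release point, latest known point)
trackables = {
    "TB1ABC": ((39.74, -104.99), (41.88, -87.63)),   # Denver -> Chicago
    "TB2DEF": ((39.74, -104.99), (33.75, -84.39)),   # Denver -> Atlanta
}

def race_winner(trackables):
    """The 'balloon' that traveled farthest from its release point wins."""
    return max(
        trackables,
        key=lambda code: haversine_miles(*trackables[code][0], *trackables[code][1]),
    )

print(race_winner(trackables))
```

In a real version, the release point and latest point would come from the first and last entries in the trackable’s logged location history.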
The Button. Originally an April Fools’ Day joke / prank / experiment / curiosity, this simple and strange creation has become an internet phenomenon with countless resources being invested in it. On 4/1/2015, Reddit announced a new feature, The Button. The concept is very simple. There is a button. It controls a shared timer which counts down from 60 seconds. Pressing the button resets the timer back to 60 seconds (for everyone) and restarts the countdown. Anybody with a free Reddit account created before 4/1/2015 is allowed to press the button, but only once, ever. That’s it.
And that is all there is to it. A pretty simple concept. Something that probably should’ve lasted a couple of minutes, maybe an hour or two, and then been gone, right? Nope! Instead it’s turned into a phenomenon which has been running perpetually for over two straight weeks with over 750,000 button presses thus far and no end in sight.
There is one additional twist to the button. When you press the button your one and only time, your account is branded with the time of your press and a color indicating how much time was remaining on the countdown. And others can see this. This little twist has opened the door to an incredible number of interesting data visualizations surrounding the enigma that is The Button.
The circumstances above make The Button a really interesting data source. It produces a constant stream of data (making real-time analytics valuable), and that data is timestamped and color coded based on how much time was remaining at the time of the press, making for some very interesting visualizations.
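Turning that stream into a visualization starts with bucketing each press by its remaining seconds. A quick sketch of that bucketing step is below; the color cutoffs follow the commonly cited community convention (purple down through red in ten-second bands), so treat them as an assumption rather than an official Reddit specification, and the sample press data is made up:

```python
from collections import Counter

def flair_color(seconds_remaining):
    """Map the time left on the shared 60-second timer to a flair color.
    Cutoffs follow the commonly cited community convention (assumed, not
    an official specification)."""
    bands = [(52, "purple"), (42, "blue"), (32, "green"),
             (22, "yellow"), (12, "orange"), (0, "red")]
    for cutoff, color in bands:
        if seconds_remaining >= cutoff:
            return color
    raise ValueError("seconds_remaining must be non-negative")

# A stream of (timestamp, seconds_remaining) press events -- made-up sample data
presses = [("2015-04-02T12:00:01", 59.2),
           ("2015-04-02T12:00:45", 18.7),
           ("2015-04-02T12:01:02", 43.0)]

counts = Counter(flair_color(s) for _, s in presses)
print(counts)
```

From there, a count-per-color over time is exactly the kind of chart the community has been producing.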
In my previous posts, here and here, I discussed my objective to create a custom dashboard solution in order to meet a number of requirements, as well as a method for creating that solution using a traditional Web Services approach. In that approach, the client is wholly responsible for requesting all of its updates from the server whenever it needs them (typically on a set timer). In this post, I’ll create the same dashboard but use a different technology, SignalR, for performing the data communications. Much of the solution will be the same or similar to the previous solution; however, the back-end is fundamentally different.
With the SignalR approach, rather than the client being responsible for requesting data updates from the server, the server maintains a connection with the client and pushes updates to the client on an “as needed” basis. This basis can be timed (such as sending an update once every 5 minutes) or it can be triggered (such as sending an update whenever data changes). In this example, we’ll be using the timed approach for parity with the previous solution. As an extra benefit of the SignalR solution, when multiple clients are connected to the same dashboard, they will all display the exact same data and refresh at the exact same time (as they are not maintaining their own individual update timers). The following diagram gives a quick illustration of the data communications between the clients, web server, and database server:
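SignalR itself is a .NET library, but the timed-push model it enables can be sketched in a language-neutral way. In the toy simulation below, the server owns the single timer and pushes one shared snapshot to every connected client at the same instant; the `Dashboard` class, the fake client inboxes, and the hard-coded snapshot are all stand-ins for illustration, not SignalR’s actual API:

```python
import asyncio

class Dashboard:
    """Toy stand-in for a SignalR hub: the server owns the update timer
    and broadcasts one shared snapshot to every client at the same moment."""

    def __init__(self):
        self.clients = []  # each client is modeled as a list of received updates

    def connect(self):
        inbox = []
        self.clients.append(inbox)
        return inbox

    def fetch_snapshot(self, tick):
        # Stand-in for querying the database once per interval
        return {"tick": tick, "orders_today": 100 + tick}

    async def push_loop(self, interval_s, ticks):
        for tick in range(ticks):
            snapshot = self.fetch_snapshot(tick)   # queried once...
            for inbox in self.clients:             # ...pushed to every client
                inbox.append(snapshot)
            await asyncio.sleep(interval_s)

dash = Dashboard()
a, b = dash.connect(), dash.connect()
asyncio.run(dash.push_loop(interval_s=0.01, ticks=3))
print(a == b)  # every client received identical updates at identical times
```

Contrast this with the polling model, where each of those clients would run its own timer and issue its own (slightly out-of-sync) database queries.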
In my previous post, I discussed my objective to create a custom dashboard solution in order to meet a number of requirements. In the process of researching how I would create the solution, I made multiple choices pertaining to the technology and design which would be used to create the dashboard. I chose to implement both a traditional communications model utilizing web services where the client is responsible for requesting data refreshes as well as the SignalR communications model where the server pushes updates to the client.
To recap, I made the following design decisions for the application:
- Use a custom developed application instead of an off-the-shelf product
- Use web-based technologies within the Microsoft ecosystem
I just recently finished a series of custom development projects which required me to revisit the programming world from which I once came. Programming is something I started doing when I was very young and have always enjoyed. While I’ve done plenty of scripting, PowerShell, and T-SQL in recent years, my current job roles have had little need for true programming skills until recently. And wow, things have changed! In this series of posts, I’ll be going through my process for choosing a technology and developing a custom dashboard solution, as well as some of the lessons I learned along the way. Here in Part 1, I plan to cover my goals for the dashboard as well as how I chose the platform I did. In Part 2, I’ll cover the more traditional of the two methods I explored, and in Part 3, I’ll cover a SignalR solution.
Whenever learning or refreshing skills, it always helps to have a goal. In this case, mine was to create a dashboard solution meeting the following objectives:
- Display near real-time data from an OLTP SQL Server data source
- Be aesthetically pleasing as it will be viewed on multiple large screen displays 24/7
- Refresh gracefully with no user interaction or display interruption while refreshing
Geocaching Analytics – Part 5: GCInsight Visualizations with SSRS
Continuing our journey of analyzing Geocache data with various visualization tools, we arrive at SQL Server Reporting Services (SSRS). SSRS is a server-based SQL Server component used for generating static (and generally printer friendly) reports.
Aside from its cost (free), one of the greatest benefits of SSRS is how easy it is to get started with report creation.
This post won’t cover the installation and configuration of the server-based components of SSRS (there are plenty of resources out there on the Internet and the process varies based on your version and edition of SQL Server). At its core though, it is made up of the following components:
- A SQL Server Reporting Database which stores information used by SSRS, including connection information and reports. This database is initially created by the installation and is managed by the product.
- The Report Server and Report Manager services, which are the core services behind Reporting Services, as well as a web server (running either natively or inside of SharePoint) which allows for the access and administration of reports and report components.
- Design Tools such as Report Builder and SQL Server Data Tools (Visual Studio)
SSRS conceptual diagram. Source: http://www.accelebrate.com/library/tutorials/ssrs-2008
Going into the inner workings of SSRS and report development can go down the rabbit hole of complexity very quickly, so we’ll stop there. A quick summary is that report design tools create report files which are stored inside of reporting services and can then be accessed via a web browser. As we’re interested in geocaching visualizations, we’ll be focusing on the content created within the report design tools.
Geocaching Analytics – Part 4: GCInsight Visualizations with Power Map
Previously, we used various tools to download a copy of our Geocache data from geocaching.com and used a one-time SSIS package, created via the data import wizard, to import that data into a SQL Server database.
Now the fun begins and we can play with some visualizations! First up, we’ll play with Microsoft’s brand new Power Map functionality for Microsoft Excel.
Power Map is part of the suite of tools which Microsoft has named the Power BI suite. The Power BI tools, such as Power Map, are available in two flavors: on-premises and cloud-based as part of Office 365, both of which utilize Excel 2013 Professional Plus.
One interesting note is that Microsoft is attempting to drive traffic to the Office 365 subscription flavor of Power Map over the standalone version. As such, the standalone version is technically called “Power Map Preview for Excel 2013” but the functionality is the same between both. The standalone version was originally to be time limited and stop functioning after 5/30/2014, but that limitation has since been removed.
First things first: ensure that you have Power Map (or Power Map Preview) downloaded and installed. After installing, we’ll need to go into the Excel add-ins and enable it (as well as PowerPivot, another tool we’ll use for managing our data).
Geocaching Analytics – Part 3: GCInsight Data Preparation
At this point, we’ve gotten our data downloaded from the Geocaching.com website and we’re sitting on a local treasure trove of data…but the data is contained within a proprietary program (GSAK) which limits our ability to make use of the data. So, it’s time to work on extracting the data and getting it into a usable format.
The GSAK application will be our primary data source for acquiring Geocaching data as well as future updates to the data. That being said, the data is not particularly easy to work with in its GSAK format. Fortunately, GSAK stores its geocache data in a SQLite database, a compact, file-based database engine that runs in-process on demand rather than as a service.
In order to fully utilize the data, we’ll be building a process to export the GSAK data from SQLite into a SQL Server database. Once the data is exported to SQL Server, we can perform any number of transformations and summarizations.
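At its simplest, the export is a read-transform-load loop: read rows out of the GSAK SQLite file and bulk-insert them into SQL Server. The sketch below uses Python’s built-in sqlite3 module; the table and column names are illustrative (GSAK’s actual schema should be checked), and the SQL Server side is represented by a second SQLite connection as a stand-in (with a driver like pyodbc, the connect call and connection string would differ, but the shape of the loop stays the same):

```python
import sqlite3

# Stand-in for the GSAK database: in a real run this would be
# sqlite3.connect("path/to/gsak/database.db3"); names here are illustrative.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE Caches (Code TEXT, Name TEXT, Latitude REAL, Longitude REAL)"
)
source.executemany(
    "INSERT INTO Caches VALUES (?, ?, ?, ?)",
    [("GC1234", "Sample Cache", 39.74, -104.99),
     ("GC5678", "Another Cache", 41.88, -87.63)],
)

# Stand-in for the SQL Server destination database
target = sqlite3.connect(":memory:")
target.execute(
    "CREATE TABLE dbo_Geocaches (Code TEXT, Name TEXT, Latitude REAL, Longitude REAL)"
)

# Read from the source, load into the destination
rows = source.execute(
    "SELECT Code, Name, Latitude, Longitude FROM Caches"
).fetchall()
target.executemany("INSERT INTO dbo_Geocaches VALUES (?, ?, ?, ?)", rows)
target.commit()

print(target.execute("SELECT COUNT(*) FROM dbo_Geocaches").fetchone()[0])
```

Once the rows land in SQL Server, the transformations and summarizations mentioned above become ordinary T-SQL work.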
Presenting Data and Information with Edward Tufte
This past Friday, I had the opportunity to attend a one-day course, Presenting Data and Information, presented by Edward Tufte. Tufte is considered by many to be a pioneer in the field of modern data visualization, so this was an opportunity I could not pass up. After all, how frequently do you get the opportunity to meet one of the founders of a specific discipline in person?
The course was presented in a single-day format, taking place in a large ballroom in the Buckhead Westin Hotel. For a price of $380, the course included a full day of lecture as well as paperback copies of Tufte’s four books:
If you didn’t mind waiting in line, you could have one or two of the books autographed as well.