Visualizing Data Distribution with 4 M’s

Once we have before-and-after reports similar to the one in my last blog, a report comparing performance between teams is inevitably requested as well. Although the numbers should be normalized to be truly meaningful, from a management perspective it is useful to identify which teams introduced more rollout issues in the same release. By presenting the number of issues and the turnaround time of each issue, we can calculate the cost of servicing a release.

One way to consolidate both reports into one is to mark the time it takes to resolve an issue (resolution time in days) on the horizontal axis as shown below.

While having all the data points is useful, they are not easy for readers to digest. To help, we can summarize the data with a histogram as shown below.
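For readers who want to reproduce the idea outside Tableau, here is a minimal sketch of how the resolution times could be binned into a histogram. The sample values and the names `histogram`, `render`, `sample_days` and `sample_counts` are my own illustrations and are not taken from the actual report.

```python
from collections import Counter


def histogram(resolution_days):
    """Count how many issues were resolved after each number of days."""
    counts = Counter(resolution_days)
    # Sort by day so the bars read left to right (here, top to bottom).
    return dict(sorted(counts.items()))


def render(counts):
    """Draw one text bar per day, one '#' per issue."""
    return "\n".join(f"{day:>2} | {'#' * n}" for day, n in counts.items())


# Hypothetical resolution times in days, for illustration only.
sample_days = [2, 4, 6, 6, 6, 7, 9, 6, 5, 4]
sample_counts = histogram(sample_days)
print(render(sample_counts))
```

The tallest bar marks the day on which most issues were resolved, which is the same reading the histogram gives in the report.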

The histogram shows that most issues are resolved on the sixth day. However, it is hard to see which teams perform better than others. To show that, I devised the following graph, in which the green bar represents the mean resolution time, the solid red line the maximum resolution time, the two-dot red line the median resolution time, and the four-dot yellow line the minimum resolution time, while the five-step color range approximates the number of instances.
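The four M's behind the graph (mean, maximum, median and minimum) are simple to compute per team. The sketch below assumes the issues are available as a list of resolution times per team. The team data is made up for illustration, and `four_ms`, `issues` and `summary` are hypothetical names.

```python
from statistics import mean, median


def four_ms(days):
    """Return the four M's plus the issue count for one team."""
    return {
        "mean": mean(days),
        "max": max(days),
        "median": median(days),
        "min": min(days),
        "count": len(days),  # drives the color range in the graph
    }


# Hypothetical resolution times (days) per team, not the real report data.
issues = {
    "Horse": [1, 2, 3],
    "Dog": [4, 9, 14, 5],
    "Sheep": [2, 3, 4, 5, 6, 4],
}

summary = {team: four_ms(days) for team, days in issues.items()}
for team, stats in summary.items():
    print(team, stats)
```

Keeping the count next to the four M's matters because a short mean resolution time says little if a team also produced the most issues.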

From the report above, we can learn that Team Horse has the fewest rollout issues and the shortest resolution time. Team Dog has the longest resolution time but fewer rollout issues than Teams Ox, Sheep and Tiger. Although Team Sheep has a shorter resolution time than Teams Dog, Ox and Tiger, it has the most rollout issues.

Clearly, Team Horse wins the quality award when judged by its low number of rollout issues and fast turnaround time.

Borrowing a Bar from Population Pyramid

Most engineering reports aim to shine a light on trouble areas so that engineers can continuously improve their products, using better processes and tools to reduce the number of issues that surface throughout the product development lifecycle. The example in this post highlights projects with a high number of rollout issues first, and then of development issues. The idea is that every project team should strive to uncover as many issues as possible during the development phase. In other words, the number of rollout issues should be relatively small compared to the number of development issues.

Bar charts help compare two or more values. For this report, I started with a simple two-part bar chart: one part for development issues and the other for rollout issues. However, when I sorted the chart in descending order by the number of rollout issues and then by the number of development issues, the two-part bar chart lost its focus. One of my colleagues who reviewed the chart suggested flipping the bar of development issues to the opposite side. By doing so, the chart regains its focus on projects with a high number of rollout issues. In addition, the ratio of the two types of issues is preserved.
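The sort-and-flip step can be sketched in a few lines. This is only my illustration of the idea, not how the chart was built in the reporting tool. The project names and counts are invented, and `pyramid_rows` and `render` are hypothetical helpers.

```python
def pyramid_rows(projects):
    """Sort by rollout issues, then development issues, both descending,
    and flip development issues to the negative side of the axis."""
    ordered = sorted(
        projects,
        key=lambda p: (p["rollout"], p["development"]),
        reverse=True,
    )
    return [(p["name"], -p["development"], p["rollout"]) for p in ordered]


def render(rows):
    """Draw development issues to the left of the axis, rollout to the right."""
    left_width = max(abs(dev) for _, dev, _ in rows)
    lines = []
    for name, dev, rollout in rows:
        left = ("#" * abs(dev)).rjust(left_width)
        lines.append(f"{left} | {'#' * rollout}  {name}")
    return "\n".join(lines)


# Hypothetical projects, for illustration only.
sample = [
    {"name": "A", "development": 12, "rollout": 3},
    {"name": "B", "development": 8, "rollout": 5},
    {"name": "C", "development": 15, "rollout": 3},
]

rows = pyramid_rows(sample)
print(render(rows))
```

Because both bars grow away from a shared axis, the eye can follow the rollout side from top to bottom while the ratio between the two sides stays visible for each project.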

We can apply the same bar chart to most before-and-after metrics and highlight areas of interest. For example, the number of issues reported by customers in the first month of introducing the product should be relatively small when compared to the number of issues found by the project team during product development.

Interestingly, the population pyramid diagram came to mind when I created the sample chart. To make this improved two-part bar chart easier to remember, I decided to borrow the concept of the population pyramid. That is how the title came to be.

1930 to 1940 Census Data

I was a bit surprised to see the visualization of County Population Change: 1930 to 1940 by the U.S. Census Bureau at http://www.census.gov/1940census/1940_data_visualization/. To illustrate percentage change and numeric change, it used a filled map and a circled map respectively, just as I did in my last blog. I must clarify that I had not seen the U.S. Census Bureau's data visualizations before, and yet I arrived at the same visualizations when presenting similarly formatted data. This suggests that only a limited number of design formats are universally clear in the language of data visualization.
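For reference, the two measures shown on the maps are easy to derive from the two census counts. The sketch below uses made-up county figures, and the function name `population_change` is my own. I have not checked how the Census Bureau's spreadsheet labels its columns.

```python
def population_change(pop_1930, pop_1940):
    """Return (numeric change, percentage change) between two counts.

    The percentage is None when the 1930 count is zero, because the
    change relative to zero is undefined.
    """
    numeric = pop_1940 - pop_1930
    percent = numeric * 100 / pop_1930 if pop_1930 else None
    return numeric, percent


# Hypothetical county counts, for illustration only.
print(population_change(1000, 1100))
print(population_change(2000, 1500))
```

The numeric change suits a circled map, where the circle size carries the magnitude, while the percentage change suits a filled map, where color carries the rate regardless of county size.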

As a programmer, I made a habit of recreating good examples as an exercise, so I decided to see how easy it would be, and how long it would take, to recreate the same visualizations. I started by downloading the Excel version of the data (106 KB), available on the page.

By the time I saved the following report, I realized that it had taken me less than an hour to recreate a reasonably informative visualization in Tableau. This short exercise demonstrates how easy Tableau is to use.

Admittedly, my report lacks the integrated but shrunken Alaska and Hawaii found in the U.S. Census Bureau's version. However, it lets viewers control the numeric change and percentage change with range sliders and interactively inspect the data by placing the mouse over any area of interest.

I know that I said this in my first post, but I have to say it once again: I am not sponsored by, endorsed by, or affiliated with Tableau Software.

Show Data on a Map

A few years ago, when the CTO at my previous company wanted a U.S. sales map, he had to hire an Excel consultant skilled at putting data on a map, pay hundreds of dollars, and wait days for the filled map to arrive in his inbox. Now, with Tableau Public, it costs nothing and takes me less than an hour to create the following visualization of the top universities in the world as ranked by Times Higher Education. What is amazing about Tableau is that you can let your viewers explore the number of items by providing a slider control. Other useful interactive UI elements that come with Tableau include the scrollable and zoomable world map and the tooltip shown when the mouse hovers over a country.

Because it is relatively easy to explore visualizations with filled maps and circled maps in Tableau, I can create and keep both maps for ease of reference. In visualizations with scattered data, I find it hard to locate the focal point, so I have included a pie chart to show the breakdown by continent. Note that I added the continent information myself, so if you find errors, please notify me and not Times Higher Education.

As this blog does not intend to show you how to create the Tableau worksheet, take a look at Tableau Software’s informative video if you are interested in learning how this can be done. For details on the filled map feature of Tableau, visit Tableau’s online help page of the same.

Adventure in Tableau

I started using Tableau at the beginning of 2011. Almost immediately I fell in love with Tableau. Prior to Tableau, my limited data visualization experience was with Microsoft Excel. When I had to create reports with Excel, I would spend an hour or two searching through commands that were difficult to remember as a casual user. Tableau completely changed my workflow for developing data visualization reports. And with that, Tableau also changed my mindset towards report generation from extensively manual to creatively easy.

While creating reports for my company, I learned a few things about Tableau. Will my lessons be useful to other Tableau users? Let me share them and find out.

Let the adventure begin!

- Joe

Note that I am not sponsored by, endorsed by, or affiliated with Tableau Software.