Thursday, September 27, 2012

Clarifying the No-Kill Shelter Stats

Recently, Cincinnati City Beat published an article, online and in print, about the no-kill animal shelter movement as an alternative to mass euthanasia. The article included this infographic:


It caught the eye, but on further inspection seemed to me a bit confusing and misleading. So, in this post I explain why and offer some alternatives.

Four-leggeds vs Humans

Among other facts, the article reports the magnitude of the current situation. It gives the number of adopters who intend to use each method of adopting: Swingers (who can be persuaded to adopt from a shelter or rescue), those definitely using a shelter or rescue, and those definitely using a breeder. These numbers count adopters, not pets. The article also mentions the unfortunate evidence that 4 million pets are "unadopted" from shelters and eventually euthanized; that number counts pets, not people. The graphic tries to portray the quantitative differences between the types of adopters and the number of euthanized pets using scaled abstract images of pets for each case. Two different messages, but the same metaphor -- not the best approach. When I first looked at this image in the printed version, I figured it was trying to communicate that 17 million large dogs, 5 million small dogs, 1.5 million rabbits, and 4 million cats were adopted from shelters. Then I read the captions on each pet image, but still wondered why pet images were used at all. Brain pain. There had to be a clearer way of communicating this message.

Not for the color blind

Adding insult to injury, maybe, the image seemed to me not to be optimized for color blind viewers, or even for viewers wanting a relaxing visual experience. To test this, I submitted the images to the Vischeck web site. It has a detailed info page with links to other sites like this one for research and testing.

Naomi Robbins recently talked about this subject, and about Vischeck, in her Forbes blog.

Here are the Vischeck results:

Deuteranope (a form of red/green color deficit)


Protanope (another form of red/green color deficit)
Tritanope (a blue/yellow color deficit - very rare)


Interestingly, the rare person with tritanope color deficit would at least see that the three pets associated with adoption intentions are in the same hue family and should perhaps be considered together, while the cat image related to euthanasia is a completely different hue. But most folks with a color deficit will get little to no visual cues from the colors used.

Some Alternatives

The point of visual data analysis and data visualization is data sensemaking. It is usually not necessary to invent new schemes or metaphors to make the data come alive. For one thing, more often than not, these new inventions or trendy displays obscure the message or story in the data. Also, it is very hard to do easy well. And, the new-fangled stuff usually hurts the brain, big time. So, what that leaves is using visual displays that incorporate best practices, something Tableau Software does by default. And, this applies to data journalism as much as it does to hard-core statistical analytics. Stephen Few has a very comprehensive discussion on this subject in Criteria for Evaluating Visual EDA Tools. The sensemaking task with the pet adoption data is basically a Part-to-Whole and Ranking Displays exercise. Few discusses this in detail in Chapter 8 of his popular book Now You See It: Simple Visualization Techniques for Quantitative Analysis. In the Tableau dashboard display below I present four alternatives to the menagerie version published by City Beat.



The crosstab shows all the data, both the raw data and percents of total. Maybe that is all that is necessary. It is clear which intention is most common. It also does not mix adoption intentions with euthanized pets. Notice the caption note explaining the Swing Adopter.
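The percent-of-total column in a crosstab like this is simple arithmetic. Here is a minimal Python sketch of it, assuming the three adopter counts quoted earlier (17, 5, and 1.5 million). The keys are the pet images that carried each number in the infographic rather than the intention labels, and the variable names are made up for illustration.

```python
# Adopter counts in millions, keyed by the pet image that carried each
# number in the infographic. The cat (4 million) counts euthanized pets,
# not adopters, so it is left out of the part-to-whole calculation.
adopters_millions = {
    "large dog": 17.0,
    "small dog": 5.0,
    "rabbit": 1.5,
}

total = sum(adopters_millions.values())

# Rank from largest to smallest and attach each count's percent of total.
ranked = sorted(adopters_millions.items(), key=lambda kv: kv[1], reverse=True)
crosstab = [(name, count, round(100 * count / total, 1)) for name, count in ranked]

for name, count, pct in crosstab:
    print(f"{name:<10} {count:>5.1f}M {pct:>5.1f}%")
```

Keeping the euthanasia figure out of the total is the point: it measures a different thing, so it does not belong in the same part-to-whole.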

The three graphs use Tableau's color blind palette, which yields this color perception result when passed through Vischeck:



The pie chart shows very little. In this case, since the three intentional statements have very different response levels, the pie seems to work. Still, the pie chart is inherently messy and should generally be avoided. I have tried to mitigate this by providing tooltips that fill the information gaps -- categorical labels, percents, and notes.

The stacked bar chart shows the cumulative raw counts of adopters segmented by intention. It takes up very little space. It is possible to gauge the relative prevalence of the intentions compared to each other. Hover over any segment to see both its raw count and percent of total adopters (a bit more information and somewhat less direct). Again, the information about euthanized pets is not directly available. Notice, too, that the tooltip for Swing Adopters includes some additional explanation about swing adopters.

The bar chart in the lower right-hand corner most closely illustrates a best practice for this analytical question. Bars encode values as lengths, a pre-attentive characteristic easily handled by the human visual system. They are much more suitable for encoding differences in the value of a single measure across one or more dimensions. In this example, as in the other alternatives, all the relevant data is directly available to the viewer, or just a tooltip away.
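To make "values as lengths" concrete, here is a rough plain-text sketch in Python. It is not how Tableau draws bars. The helper name `text_bar` and the width of 34 characters are arbitrary choices for illustration, and the counts are keyed by the infographic's pet images as before.

```python
def text_bar(value, max_value, width=34):
    """Return a bar of '#' characters whose length is proportional to value."""
    return "#" * round(width * value / max_value)

# Adopter counts in millions, as quoted from the infographic.
counts = {"large dog": 17.0, "small dog": 5.0, "rabbit": 1.5}
longest = max(counts.values())

bars = {name: text_bar(value, longest) for name, value in counts.items()}
for name, bar in bars.items():
    print(f"{name:<10} {bar}")
```

Because every bar starts at the same baseline, the eye only has to compare lengths, which is exactly what the scaled pet images made hard.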

Bottom line, even though a question or some data you want to visualize and share is simple, you should make every effort to communicate the story in an unambiguous manner, being respectful of your audience's brains. First, do no harm.

Thursday, September 20, 2012

The 47% is not really the 47%, is it?

Recently, US media outlets reported on presidential candidate Mitt Romney's musing that voters with no federal tax liability would likely vote for incumbent President Barack Obama. Subsequently, Simon Rogers at the Guardian Data Blog in the UK published an investigation of state-by-state variability in this percentage. The piece also explored other indicators, like % on Medicare, % without medical insurance, and so on.

Simon used a mashup that involved Google Maps. It seemed to me that without much effort the same or similar could be done with Tableau. So, starting with the data link in the Guardian article, I ended up with the following viz. Easy peasy for Tableau.



The dashboard design challenges that presented themselves were 1) allowing the viewer to select a measure to review, and 2) managing the legends so that they respected the conceptual features of the Guardian version. To start with, I decided to try to accomplish these objectives in Tableau without changing the data structure. The data presented by the Guardian is a typical pivoted Excel table, with a row for every state followed by many columns of measures -- one for each subject of interest. Not ideal. The discussion below outlines the basic approach, which other Tableau practitioners have documented in blog posts.

Changing the measure

The Guardian version uses a drop-down box above the Google map that allows the viewer to select a single measure of interest. Accomplishing this in Tableau requires a parameter and at least one calculated field. The parameter holds the choices, which are string surrogates for the measure columns in the data. The calculated fields hold the values for the measure of interest and set the title. The parameter, named "Show this...", looks like this:



A CASE calculated field, called "This indicator", has the following definition:

case [Show this ...]
when 'No tax liability' then [The 36%: percent of tax filers with no liability]
when 'Poverty' then [The 15%: percent living in poverty]
when 'Tax returns over $1 million' then [The 1%: percent of tax returns over $1m]
when 'Veterans' then [The 7%: percent of population who are veterans]
when 'On Medicare' then [The 15%: percent of population on medicare]
when 'Without medical insurance' then [The 16%: percent without medical insurance]
when 'Unemployed (July, 2012)' then [The 8%: percent unemployed (July 2012)]
when 'Over 65 years old' then [The 13%: population aged over 65]
end



A second CASE calculated field, called "Indicator Legend Label", looks like this:


case [Show this ...]
when 'No tax liability' then 'The 36%: percent of tax filers with no liability'
when 'Poverty' then 'The 15%: percent living in poverty'
when 'Tax returns over $1 million' then 'The 1%: percent of tax returns over $1 million'
when 'Veterans' then 'The 7%: percent of population who are veterans'
when 'On Medicare' then 'The 15%: percent of population on medicare'
when 'Without medical insurance' then 'The 16%: percent without medical insurance'
when 'Unemployed (July, 2012)' then 'The 8%: percent unemployed (July 2012)'
when 'Over 65 years old' then 'The 13%: population aged over 65'
end

It sets the title above the map to correspond to the measure selected from the parameter drop-down list.
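For readers who do not use Tableau, the two CASE fields amount to a lookup from the parameter choice to a data column and a title. The following is a minimal Python sketch of that idea, not Tableau syntax. Only three of the eight choices are listed, the function names simply mirror the calculated field names, and the values in `example_row` are invented for illustration rather than taken from the Guardian data.

```python
# Parameter choice -> (column in the pivoted table, title shown above the map).
# The strings are copied from the two calculated fields.
INDICATORS = {
    "No tax liability": (
        "The 36%: percent of tax filers with no liability",
        "The 36%: percent of tax filers with no liability",
    ),
    "Poverty": (
        "The 15%: percent living in poverty",
        "The 15%: percent living in poverty",
    ),
    "Tax returns over $1 million": (
        "The 1%: percent of tax returns over $1m",
        "The 1%: percent of tax returns over $1 million",
    ),
}

def this_indicator(row, show_this):
    """Value of the selected measure for one state row."""
    column, _title = INDICATORS[show_this]
    return row[column]

def indicator_legend_label(show_this):
    """Title text for the selected measure."""
    _column, title = INDICATORS[show_this]
    return title

# Hypothetical row: these numbers are made up for the example.
example_row = {
    "State": "Example",
    "The 15%: percent living in poverty": 12.5,
    "The 1%: percent of tax returns over $1m": 0.2,
}

print(indicator_legend_label("Poverty"), this_indicator(example_row, "Poverty"))
```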

Standardizing the legends

In the Guardian version, regardless of the measure chosen, the thematic map uses four or five classes, with a consistent color range for the legend. The class boundaries change depending on the range of values in the data for the selected measure.

Accomplishing this in Tableau requires some planning and manual color legend adjustments. First, I made another CASE-style calculated field, called "color legends", which assigns a class label based on the value of the Show this... parameter:

case [Show this ...]

when 'No tax liability' then
    if [This indicator] <=25 then '0 to 25'
    elseif [This indicator] <=30 then '25 to 30'
    elseif [This indicator] <=35 then '30 to 35'
    elseif [This indicator] <=40 then '35 to 40'
    elseif [This indicator] >40 then '40 or more'
    end

when 'Poverty' then
    if [This indicator] <=5 then '5 or less'
    elseif [This indicator] <=10 then '5 to 10'
    elseif [This indicator] <=15 then '10 to 15'
    elseif [This indicator] <=20 then '15 to 20'
    elseif [This indicator] >20 then '20 or more'
    end

when 'Tax returns over $1 million' then
    if [This indicator] <=.01 then '0.0 to 0.01'
    elseif [This indicator] <=.12 then '0.01 to 0.12'
    elseif [This indicator] <=.24 then '0.12 to 0.24'
    elseif [This indicator] <=.5 then '0.24 to 0.50'
    elseif [This indicator] >.5 then '0.50 to 0.60'
    end

when 'Veterans' then
    if [This indicator] <=6 then '0 to 6'
    elseif [This indicator] <=7 then '6 to 7'
    elseif [This indicator] <=8 then '7 to 8'
    elseif [This indicator] <=9 then '8 to 9'
    elseif [This indicator] >9 then '9 or more'
    end

when 'On Medicare' then
    if [This indicator] <=8 then '0 to 8'
    elseif [This indicator] <=11 then '8 to 11'
    elseif [This indicator] <=14 then '11 to 14'
    elseif [This indicator] <=17 then '14 to 17'
    elseif [This indicator] >17 then '17 to 25'
    end

when 'Without medical insurance' then
    if [This indicator] <=5 then '0 to 5'
    elseif [This indicator] <=10 then '5 to 10'
    elseif [This indicator] <=15 then '10 to 15'
    elseif [This indicator] <=20 then '15 to 20'
    elseif [This indicator] >20 then '20 or more'
    end

when 'Unemployed (July, 2012)' then
    if [This indicator] <=5 then '0 to 5'
    elseif [This indicator] <=7 then '5 to 7'
    elseif [This indicator] <=9 then '7 to 9'
    elseif [This indicator] <=11 then '9 to 11'
    elseif [This indicator] >11 then '11 or more'
    end

when 'Over 65 years old' then
    if [This indicator] <=10 then '0 to 10'
    elseif [This indicator] <=12 then '10 to 12'
    elseif [This indicator] <=14 then '12 to 14'
    elseif [This indicator] <=16 then '14 to 16'
    elseif [This indicator] >16 then '16 or more'
    end

end
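Each branch of that calculated field is the same binning rule with different break points. As a hedged illustration outside Tableau, here is a short Python sketch that expresses the rule as data plus one lookup. The break points and labels are copied from two of the branches; the names `CLASSES` and `color_legend` are mine, and `bisect_left` from the standard library stands in for the chain of `<=` tests.

```python
from bisect import bisect_left

# Inclusive upper bounds and class labels, copied from two branches of the
# calculated field. The other six measures would follow the same pattern.
CLASSES = {
    "No tax liability": (
        [25, 30, 35, 40],
        ["0 to 25", "25 to 30", "30 to 35", "35 to 40", "40 or more"],
    ),
    "Poverty": (
        [5, 10, 15, 20],
        ["5 or less", "5 to 10", "10 to 15", "15 to 20", "20 or more"],
    ),
}

def color_legend(show_this, value):
    """Return the class label for value under the selected measure."""
    bounds, labels = CLASSES[show_this]
    # bisect_left counts the bounds strictly below value, so a value equal
    # to a bound stays in the lower class, matching the <= comparisons.
    return labels[bisect_left(bounds, value)]

print(color_legend("No tax liability", 36))   # falls in '35 to 40'
```

Holding the break points in a table makes it easy to check that every measure has the same number of classes, which the consistent legend depends on.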

Next, I dropped this calculated field onto the color shelf. Then, one at a time, I changed the measure of interest, which changed the color legend so that it only included the classes associated with that measure. For each measure, I set the color of each class so that the lowest class always had the same RGB color value, the next class the next color, and so on through every class for every measure.
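The manual step boils down to one rule: a class's color depends only on its rank within its measure, never on its label. A small Python sketch of that rule follows. The hex colors in `PALETTE` are hypothetical placeholders, since the workbook's actual RGB values are not listed in this post; the class labels are copied from the calculated field.

```python
# Hypothetical five-step palette, lightest to darkest.
PALETTE = ["#f1eef6", "#bdc9e1", "#74a9cf", "#2b8cbe", "#045a8d"]

# Class labels in ascending order for two of the eight measures.
CLASS_LABELS = {
    "Veterans": ["0 to 6", "6 to 7", "7 to 8", "8 to 9", "9 or more"],
    "On Medicare": ["0 to 8", "8 to 11", "11 to 14", "14 to 17", "17 to 25"],
}

def class_color(measure, label):
    """Pick the palette color by the class's position within its measure."""
    return PALETTE[CLASS_LABELS[measure].index(label)]

# The lowest class of every measure shares the lightest color.
print(class_color("Veterans", "0 to 6"), class_color("On Medicare", "0 to 8"))
```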

The map and tooltips

The map itself is a filled map mark type. It uses the Gray color scheme, with only the Base and Country Borders options turned on, and Washout set to 0%. This provides the cleanest presentation of state boundaries.

The elements called out in the tooltip:


are on the Level of Detail shelf to make them available to the tooltip editor. The look and feel come from Tableau's RTF editor controls.

Download the workbook to learn more.