The Idiot’s Guide to Not Being Tricked by Statistics

“There are three kinds of lies: lies, damned lies, and statistics.” (Mark Twain)

Today’s world is powered by data. Companies like Google, Facebook, IBM, Amazon, and many more, thrive on collecting massive amounts of data from their users, then using that data to tailor a unique experience. Or, if you’re more conspiratorial, use it to manipulate customer psychology and destroy privacy.

Whatever your opinion on those things, so-called “Big Data” is very real. And where there’s tons of data, there will be lots of statistics to try to make sense of it all.

Problem is, on Mark Twain’s scale of truth, statistics are just beyond “damned lies,” so it’s understandable if you have a tepid relationship with statistics. Add to it the avalanche of statistics hyped up on the news and the fact that lots of people who have taken statistics classes hated them (sorry, no stats on that), and you’ve got a platform for spreading misinformation and deceit to a wide audience. The political rhetoric of entire nations could be changed if people took a little more time to understand how to make sense of statistics.

I’m not sure if this post will do that, but it can sure change things for each of you reading it. Welcome to The Idiot’s Guide to Not Being Tricked by Statistics.

In this guide, I’ll give you four ways to get a clearer view of statistics, doing my best to provide plenty of examples along the way. Let’s get started.

Method 1: Invert the Statistic

A while ago (September 19, 2013), I posted the following on my Facebook:

When you see a negative statistic, immediately invert it. “1 in 5 Seniors can’t retire.” That means 80% can. “1 in 10 people don’t have jobs.” So 90% of people do. Negative headlines are poisonous and misleading. They exist to sow dissatisfaction, not to inform, improve, or enlighten. Don’t miss the forest for the trees.

I stand by those comments. Very, very often, you’ll find that statistics, even factual ones, are presented to be persuasive, not informative. That’s going to be a common theme in these methods for staying rational, so get used to it. Take the emotional edge off by muting the TV or navigating your web browser away from the article, invert the statistic, and see if it’s still worth being upset or negative about. Sometimes, that’s all you need. Sometimes, you’ll need to complement it with one of the following methods. Read on.
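If you want to make the inversion mechanical, here is a minimal Python sketch. The function name `invert_statistic` is just something I made up for illustration; all it does is turn a "1 in N are affected" headline into the percentage who are not.

```python
from fractions import Fraction


def invert_statistic(affected: int, out_of: int) -> float:
    """Return the percentage NOT affected for an '<affected> in <out_of>' claim."""
    if out_of <= 0 or not 0 <= affected <= out_of:
        raise ValueError("expected out_of > 0 and 0 <= affected <= out_of")
    # Fractions keep the subtraction exact before converting to a percentage.
    unaffected = 1 - Fraction(affected, out_of)
    return float(unaffected * 100)


# The two headlines quoted above.
print(invert_statistic(1, 5))   # "1 in 5 seniors can't retire"
print(invert_statistic(1, 10))  # "1 in 10 people don't have jobs"
```

It won't tell you whether the inverted number is good news, only what the headline left unsaid.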

Method 2: Convert Numbers to Percentages, and Vice Versa

This is a favorite of mine. In general, the rule is that in large systems (for instance, whole nations like the USA), raw numbers sound larger and percentages sound smaller. That’s because it’s very difficult to contextualize quantities and probabilities of a sufficient magnitude. A classic example (although it’s about probability rather than statistics) is the lottery. It’s very difficult to imagine the number three-hundred-and-fifty-million, so when the chances of winning are one in that many, we overestimate them because we can’t even process the enormity of those odds. Interesting fact: you can actually calculate the break-even value of a lottery ticket by taking the payout (tax reduced and in present value dollars), multiplying it by the probability of winning, and comparing the result to the cost of a ticket. Short form: it’s a bad idea.
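For the curious, the break-even check can be sketched in a few lines of Python. The jackpot and ticket price below are hypothetical numbers I picked for illustration, not figures from any real lottery, and the sketch ignores smaller prizes and the chance of splitting the jackpot.

```python
def ticket_expected_value(payout_after_tax: float, win_probability: float) -> float:
    """Expected value of one ticket: payout times the probability of winning."""
    return payout_after_tax * win_probability


def break_even_payout(ticket_price: float, win_probability: float) -> float:
    """Payout at which the expected value equals the ticket price."""
    return ticket_price / win_probability


# Assumptions for illustration only.
WIN_PROBABILITY = 1 / 350_000_000   # "one in three-hundred-and-fifty-million"
PAYOUT_AFTER_TAX = 200_000_000.0    # hypothetical, tax reduced, present value
TICKET_PRICE = 2.00                 # hypothetical

expected_value = ticket_expected_value(PAYOUT_AFTER_TAX, WIN_PROBABILITY)
print(f"Expected value per ticket: ${expected_value:.2f}")
print(f"Worth buying on expected value alone? {expected_value > TICKET_PRICE}")
print(f"Break-even payout: ${break_even_payout(TICKET_PRICE, WIN_PROBABILITY):,.0f}")
```

With these made-up inputs, a ticket is worth well under its price, which is the "bad idea" in numbers.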

In smaller systems it can go the other way (50% of our employees approve of the new media policy; turns out it’s the 5 managers out of 10 employees that promoted the policy). In any case, it’s best to always convert to get a clearer picture and develop context. Let’s look at a few examples.

One of my wife’s favorites is hype over pet food recalls and problems. Consider this headline from Fox News: Toxic jerky treats linked to more than 1,000 dog deaths. Sounds dire. Only it gets a little less convincing when, according to the AVMA, some 43,346,000 households own dogs, and at an average of 1.6 per household, there are around 69,353,600 puppy pets in the US. These 1,000 dogs that died, while certainly sad, represent .00144% of the estimated dog population. If only 1% of dogs ever had these treats, it’s still only .144% of the dogs exposed to the treat that died, or about one death for every 693 dogs that had the treat. When you hear “1,000 cute little puppies died due to the bad treats,” you get riled up. But it takes the edge off when you realize just how few that is in the scheme of things. There’s more to the story too, which we’ll revisit later.
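Here is the same number-to-percent conversion as a small Python sketch, so you can swap in your own headline. The household and per-household figures are the AVMA numbers quoted above; the 1% exposure share is my guess for illustration, not a measured value.

```python
def as_percent(part: float, whole: float) -> float:
    """Express part/whole as a percentage."""
    return part / whole * 100


DOG_HOUSEHOLDS = 43_346_000   # AVMA figure quoted in the text
DOGS_PER_HOUSEHOLD = 1.6      # AVMA average quoted in the text
REPORTED_DEATHS = 1_000       # from the headline
EXPOSURE_SHARE = 0.01         # assumption: only 1% of dogs ever ate the treats

dog_population = DOG_HOUSEHOLDS * DOGS_PER_HOUSEHOLD
exposed_dogs = dog_population * EXPOSURE_SHARE

share_of_all_dogs = as_percent(REPORTED_DEATHS, dog_population)
share_of_exposed = as_percent(REPORTED_DEATHS, exposed_dogs)
dogs_per_death = exposed_dogs / REPORTED_DEATHS

print(f"Estimated dog population: {dog_population:,.0f}")
print(f"Deaths as share of all dogs: {share_of_all_dogs:.5f}%")
print(f"Deaths as share of exposed dogs: {share_of_exposed:.3f}%")
print(f"Exposed dogs per reported death: {dogs_per_death:.0f}")
```

Changing `EXPOSURE_SHARE` is the quickest way to see how much the conclusion leans on that one guess.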

Now let’s amp up the intensity and talk about something more fiery. According to the Pro Choice official website, “13,000 women each year have abortions because they have become pregnant as a result of rape or incest.” The site references a 2003 Guttmacher Institute study which can be found at this link. There’s no denying that rape and incest are always terrible, and no one should make light of them. But… are we making too much of the political rhetoric center around these admittedly horrible cases? Let’s contextualize by using the number-to-percent trick.

The exact same Guttmacher Institute study that Pro Choice referenced (in case anyone thinks I’m selecting sources with bias) shows the following table:

[Image: abortion_stats (table from the Guttmacher Institute study)]

Which tells a much different story. 1% of abortions are because of rape-induced pregnancy. Less than 0.5% are as a result of incest. While I’m in full agreement that these 1.0%-1.5% should have some sort of recourse, 1% isn’t even close to enough to drive the overall conversation to the extent that it does. This is a post about statistics and not my personal politics, but it seems we’re basing way more than a representative amount of the abortion conversation on these 1%. Especially when 74% or more are for essentially selfish reasons and could have been avoided by making good choices regarding sexual practices, including both contraception (there’s no reason not to use or enforce this with your partner) and abstinence (I chose to not have sex until my wedding night; believe it or not, hormones can be overcome with self-control).

Mini-rant aside, the point should be clear. Fox News (and every other media outlet in the world) is incentivized to make headlines more dramatic, because that’s what gets views, shares, subscriptions, whatever. Pro Choice is incentivized to make it look like a lot of pregnancies are terminated due to honest tragedies in order to drive the discussion towards the outliers which more strongly support their views instead of talking about the vast majority. We, as consumers of information, need to be able to contextualize how representative these stories are, in order to make intelligent, rational decisions, instead of being swept away in emotional appeal. Convert those numbers!

Method 3: Make Sure Any Comparison is “Apples to Apples”

Did you know that the average American has watched more television in their life than all of the founding fathers of the United States combined? You should learn from them, you lazy slacker.

A classic method of deceit is to take something that’s technically true, but not useful, and play it off as useful (or heck, even just as interesting). Consider the following “technically true” comparison from Business Insider, reposted on Reddit and everywhere else on the internet:

Taco Bell sold 100 million Doritos Locos tacos in their first 10 weeks of availability. It took McDonald’s 18 years to sell the same amount of burgers.

Sounds interesting, right?

Yeah, but it’s really not.

What’s missing from this is that when Taco Bell introduced the Doritos Locos taco in 2012, they had a massive, established infrastructure of over 6,000 restaurants spanning the United States and beyond. McDonald’s only had 102 restaurants by 1959, 4 years after Ray Kroc opened his first version of the store. I’m not sure where that infographic got its statistics on McDonald’s, but the comparison is clearly flawed. The delusion takes hold because we think of McDonald’s in its present-day context, a massive global company with insane distribution abilities, but the statistic is measured against a much different, much smaller McDonald’s. It’s taking early, infancy-stage McDonald’s against full-maturity Taco Bell. It’s like asking who would win between Muhammad Ali in his prime and Mike Tyson at age 4.
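One rough way to make the comparison closer to apples to apples is to normalize by restaurant count and time. The Python sketch below is back-of-the-envelope only: it treats Taco Bell as having exactly 6,000 restaurants, and it assumes McDonald’s had all 102 restaurants for the entire 18 years, which almost certainly overstates its early footprint (and I don’t know that the infographic’s 18 years even ends in 1959). That makes the McDonald’s figure a floor, not an estimate.

```python
WEEKS_PER_YEAR = 52


def units_per_store_week(units_sold: float, stores: int, weeks: float) -> float:
    """Average units sold per restaurant per week."""
    return units_sold / (stores * weeks)


UNITS = 100_000_000  # the "100 million" in the quoted comparison

# Taco Bell: 10 weeks, "over 6,000" restaurants (6,000 used here as an assumption).
taco_bell = units_per_store_week(UNITS, stores=6_000, weeks=10)

# McDonald's: 18 years, assuming the 1959 count of 102 restaurants throughout.
# Fewer restaurants in the early years would push this number higher.
mcdonalds_floor = units_per_store_week(UNITS, stores=102, weeks=18 * WEEKS_PER_YEAR)

print(f"Taco Bell: about {taco_bell:,.0f} tacos per restaurant per week")
print(f"McDonald's: at least {mcdonalds_floor:,.0f} burgers per restaurant per week")
```

Under those assumptions the per-restaurant rates land in the same ballpark, which is a very different story from "10 weeks versus 18 years."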

Plenty of these kinds of things exist. The effects of time and cultural shifts are extremely difficult to quantify, but that doesn’t mean no effort should be made to do so. Still, we can’t control the media that’s produced, only our interaction with it. Check those stats to make sure they’re actually illustrating a viable comparison.

Method 4: Context, Context, Context!

I saved this for last because it’s really all-encompassing. Most of the others could have been put under this heading, but deserved their own space. To finish off, here are a bunch of quick things to consider when viewing statistics.

Sources

Misinformation is all over. It’s not perfect, but you can at least see if the sources are reasonable as a sanity check. Is it from the CDC? Does it reference a university study from a school that you’ve actually heard of? Or is it from some dude’s blog? A celebrity’s Twitter? This requires reading more than just the headline, and sometimes, even more than the article itself… which I know is super scary and not all that common. But be uncommon. It won’t catch all the garbage, but it’ll catch enough to be worth doing.

Time scales

When numbers are presented, see if they’re comparing things over time, or could somehow be distorted by time. The McDonald’s vs. Taco Bell statistic is a good example of this. Recalling the 1,000 dog deaths from the “toxic jerky,” the article says that “since 2007” there have been 5,000 complaints regarding health issues possibly stemming from consumption of the treats. While it doesn’t specify if the 1,000 deaths were over that same time scale, it seems fairly likely, so let’s consider what that would mean if so. Spread over the roughly six years since 2007, the yearly rate is 1/6 of the total: instead of .144% of the dogs that ate the treat dying, it would be about .024% per year, assuming an even distribution of the deaths and a reasonably constant dog ownership rate in the nation. In that case, roughly 4,160 dogs could have eaten the treats in a given year for every one that died; that’s less than a quarter of a tenth of a single percent. They say that time changes everything, and it’s true of statistics too, so be watchful.
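The time-scale adjustment is simple enough to sketch in Python. As before, the 1% exposure share is my assumption, and so are the six-year window and the even spread of deaths across it; the article confirms none of them.

```python
def annual_rate(total_events: float, exposed_population: float, years: float) -> float:
    """Events per exposed individual per year, assuming events are spread evenly."""
    return total_events / years / exposed_population


DOG_POPULATION = 43_346_000 * 1.6       # AVMA figures quoted earlier
EXPOSED_DOGS = DOG_POPULATION * 0.01    # assumption: 1% of dogs ate the treats
REPORTED_DEATHS = 1_000
YEARS = 6                               # assumption: deaths span about six years

whole_period_rate = REPORTED_DEATHS / EXPOSED_DOGS
yearly_rate = annual_rate(REPORTED_DEATHS, EXPOSED_DOGS, YEARS)

print(f"Rate over the whole period: {whole_period_rate * 100:.3f}%")
print(f"Rate per year: {yearly_rate * 100:.3f}%")
print(f"Exposed dogs per death per year: {1 / yearly_rate:,.0f}")
```

Unrounded, this comes out near 4,161 exposed dogs per death per year; rounding the rate to .024% before inverting gives a slightly higher figure.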

Isolated case

Hey, crazy things do happen. While this is more about not being deceived by media in general, and is often avoided by the “convert number to percent” tip, it’s important to recognize when something is just an absurdity. Deaths from shark attacks are the classic example. Not to make light of those few who have been killed by sharks, but being killed by a shark is a pretty sexy, and rare, way to go. Media reports what’s sensational because that’s what readers find interesting. Shark attacks and plane crashes are big, emotional, crazy events, but there’s just no reason to be deceived by hype into thinking that it’s more probable than it is.

External factors

This is another huge point. Sometimes, within certain constraints, unlikely events are more likely to happen. Let’s remember our dog treats one last time (I promise). The statistic in the article doesn’t shed (ha) any light on some pieces of vital information that could skew the interpretation of the results. How old were the dogs that died after eating the treats? Were they in otherwise good health? Were they fed a balanced diet, including, on rare occasion, the indicted jerky treats, or were they fed the treats as food instead of as an irregular snack? Not to be hypocritical, but some of the top comments on the article (admittedly rarely a good source of information) were echoing the sentiment of how bad these treats were by saying that their 10+ year old dogs were being affected by them. Well geez, the average life expectancy of a dog is 10-13 years anyway.

A well-presented statistic leaves very little reasonable doubt that something critical in understanding it was missed. If you have a few minutes when you’re done with this post, head over to this podcast on the Freakonomics website and listen from around 16:11-21:31. Pay close attention to how the Freakonomics host grills the interviewee on his methodology for his study and how he arrived at his conclusion. This is solid journalism. It asks questions and lets an intelligent consumer decide whether or not to agree with the conclusion.

Fact check

Some things are just bold-faced lies! Again straying outside of statistics for just a moment, I think about the article that went viral on Facebook in 2013 about how the Pope said that “all religions are true,” “religious truth evolves and changes,” “Satan himself is a metaphor or a personification,” and much, much more. Apparently nobody that shared this thought it was a bit radical and checked the sources, because the blog that posted it has a disclaimer page that says right at the top “The original content on this blog is largely satirical.” I had a friend ready to change his beliefs because of what this article had said. A bit naïve, perhaps, but how much of what we read influences, subtly or overtly, what we believe? How much of what we hear influences how we vote, which has a very real impact on the direction of the nation? Coincidentally, here’s an article fact checking the 2012 presidential debate in Denver. You’ll notice there are quite a few ratings of half true, mostly false, and false. Surprisingly, not even our nation’s top leaders are above leveraging tricky statistics.

Know trends

Frankly, I had to put this in because the TED talk embedded below was just so entertaining and powerful to me. When you have a broad spectrum of knowledge, it’s much easier to contextualize things related to that knowledge. You can’t know everything about every topic, but thankfully, we have a great ability to synthesize and puzzle out predictions of things based on what we know of other things. They’re not always right, but again, they’re a sanity check that keeps us from constantly being blown around in the wind when the next “convincing study” comes out.

The TED talk below is from Hans and Ola Rosling, and is talking about global trends regarding poverty, education, and death from natural disasters, among other things. Give it a listen after you finish this post.

The real data, not the sensationalized stories you see on your homepage or on the news, paints a different picture.

=============================================================

In conclusion, let me add that in no way am I trying to convince you to be unmoved by things that happen in the world around us. Every person affected by involuntary unemployment is worth thinking about. Dogs having health issues that can be linked to a particular brand or type of treats is worth considering and taking action on. Rape is awful, and resulting conceptions are horrible situations that deserve an answer. Taco Bell really does sell a whole lot of Doritos Locos tacos, and we should all be blown away at the amount of faux-cheese they’re pushing.

The point is to become more informed, and in particular, more rational about what’s presented to you. There’s a lot of noise produced in today’s world, but little communication of reality through a sane and accurate representation. I hope these little tips and examples have been helpful and entertaining, and will lead to you being a little wiser, a little more positive, and far less often tricked by statistics!