The slides to my Ignite Charlotte 1 presentation, titled "The Problem with Data and Statistics." Video of the presentation can be found here: http://www.ignitecharlotte.org/2010/08/ignite-charlotte-1-talk-by-adam-q-holden-bache-the-problem-with-data-statistics/
21. “Statistics are no substitute for judgement.” - Henry Clay @adamholdenbache
Editor's notes
Statistics play an important role in nearly every kind of business. They're often used as a basis for making decisions. There are problems with data and statistics, however: we don't always know if they're true. People often accept them as fact without understanding the data behind the statistics.
There are, of course, problems with using statistics as evidence. There's a famous quote that says: "There are three ways to not tell the truth: lies, damned lies, and statistics."
It is not that statistics are necessarily false, but the general public's lack of statistical understanding gives the media a powerful tool for manipulation.
"A picture is worth a thousand words." This is certainly true when you're presenting and explaining data. Graphs or charts help people understand data quickly. Whether you want to make a comparison, show a relationship, or highlight a trend, they help your audience "see" what you are talking about.
It looks stunning. According to this chart, out of every 100 people (or accounts) on Twitter: 20 are completely inactive; 50 are lazy, having not tweeted in the last week; 5 have more than 100 followers; 5 are loud mouths who tweet heavily and contribute 75% of Twitter volume; and 20 are gray colored. If my reading comprehension skills are to be trusted, the chart in no way conveys that these 5 types of people are distinct, which means there can be overlaps. So the chart misleads us. In plain speak, it is possible that someone in the lazy group has more than 100 followers. Also, the entire “completely inactive” group is a subset of the “lazy” group. Site: http://chandoo.org/wp/2009/07/31/unexciting-twitter-chart/ Source: http://rohitbhargava.typepad.com/weblog/2009/07/10-stunning-and-useful-stats-about-twitter.html
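The overlap problem in the Twitter chart can be made concrete with a small sketch. The accounts, field names, and the `categories` helper below are all invented for illustration (they are not real Twitter data or part of the original chart); the point is only that one account can satisfy several of the chart's labels at once, so the labels cannot be read as separate slices of 100 people.

```python
# Hypothetical accounts, invented purely for illustration (not real Twitter data).
# days_since_tweet is None for an account that has never tweeted.
accounts = [
    {"name": "a", "days_since_tweet": None, "followers": 150},
    {"name": "b", "days_since_tweet": 30, "followers": 12},
    {"name": "c", "days_since_tweet": 0, "followers": 400},
    {"name": "d", "days_since_tweet": 2, "followers": 8},
]


def categories(account):
    """Return every chart label that applies to one account."""
    labels = set()
    days = account["days_since_tweet"]
    if days is None:
        labels.add("completely inactive")
    # Anyone who never tweeted has also "not tweeted in the last week".
    if days is None or days > 7:
        labels.add("not tweeted in the last week")
    if account["followers"] > 100:
        labels.add("more than 100 followers")
    return labels


membership = {acct["name"]: categories(acct) for acct in accounts}

# Accounts that fall under more than one label at the same time.
overlapping = sorted(name for name, labels in membership.items() if len(labels) > 1)

# If the labels were mutually exclusive, this could never exceed len(accounts).
total_labels = sum(len(labels) for labels in membership.values())

print("Accounts with more than one label:", overlapping)
print("Labels assigned:", total_labels, "for", len(accounts), "accounts")
```

In this made-up sample, account "a" is inactive, has not tweeted in the last week, and has more than 100 followers, so counting people per label double-counts that account.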
Charts should clarify data in a way that’s easy to digest. Some charts fail miserably at this. This chart is a good example of trying to display too much information. It’s also inconsistent in its display of data, as some labels for the same content areas are visually different.
The chart claims to show "Job loss by quarter." But it doesn't. (We lost 15 million jobs in the second quarter of 2010? Surely that would have been catastrophic news.) What this defective chart actually displays is the number of unemployed people during four random quarters over the past two-and-a-half years. Clearly, Fox got the title of the chart wrong. If they wanted to depict quarterly job losses, they would have no way of getting around the fact that net job losses ceased at the end of 2009.
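The difference between a level (how many people are unemployed) and a change (how many more are unemployed than last quarter) can be sketched in a few lines. The figures below are invented for illustration and are not the numbers from the chart, and the change in the unemployed count is only a rough stand-in for job losses, since it also moves when people enter or leave the labor force.

```python
# Hypothetical unemployment levels in millions, invented for illustration only.
unemployed_by_quarter = [
    ("Q1", 10.0),
    ("Q2", 12.5),
    ("Q3", 14.0),
    ("Q4", 14.5),
]


def quarterly_changes(levels):
    """Return (label, change) pairs: each quarter's level minus the previous one."""
    return [
        (label, current - previous)
        for (_, previous), (label, current) in zip(levels, levels[1:])
    ]


changes = quarterly_changes(unemployed_by_quarter)

for label, level in unemployed_by_quarter:
    print(f"{label}: {level:.1f} million unemployed (level)")
for label, change in changes:
    print(f"{label}: {change:+.1f} million versus the prior quarter (change)")
```

Plotting the first series under a title that describes the second makes a slowing increase look like an ever-growing loss, which is the mislabeling described above.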
We live in a world of statistics: you can find numbers in support of just about any idea. When you review statistical data you need to ask yourself four questions: 1) Who did the study that came up with the statistics? 2) What exactly are the statistics measuring? 3) Who was asked? 4) How were they asked?
The phrase "numbers don't lie" is true; what you need to examine is who is publishing the numbers and what they are trying to prove with them. Are the statistics provided by the American Cancer Society or the American Tobacco Institute?
When viewing data and statistics, ask yourself, "What are the statistics measuring?" Your job, in using statistics as evidence, is to determine what exactly is being measured, and not simply spout numbers that seem to apply to your topic.
Once you've determined what the statistics are measuring, you next need to find out how the research was conducted. Many studies are done by asking people their opinions or what they do, think, or feel. Such studies are based on individual people's ideas, opinions, and attitudes. These fields are often referred to as "soft sciences," as opposed to "hard sciences," whose research is designed to minimize the human factor in the evidence and conclusions as much as possible.
The conditions under which statistics are gathered can dramatically affect the data. If you ask someone who is starving whether they like pizza, you're likely to get a more positive response than from someone who has eaten pizza for a week straight. Timing can also be a factor: if you take a poll on July 4 and ask people whether they think the President is doing a good job, you may get more positive responses than if you ask the day after a terrorist bombing.
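The pizza example can be turned into a small piece of arithmetic. The response rates and sample mixes below are invented assumptions, not survey results; the sketch only shows that the same question yields a different headline number depending on who happens to be in the sample.

```python
# Hypothetical share of "yes, I like pizza" answers per group (invented numbers).
yes_rate = {"hungry": 0.9, "ate pizza all week": 0.3}


def poll_result(sample_mix):
    """Expected share of 'yes' answers for a sample with the given group proportions."""
    if abs(sum(sample_mix.values()) - 1.0) > 1e-9:
        raise ValueError("sample proportions must sum to 1")
    return sum(share * yes_rate[group] for group, share in sample_mix.items())


# Same question, same underlying opinions, two different samples.
mostly_hungry = poll_result({"hungry": 0.8, "ate pizza all week": 0.2})
mostly_full = poll_result({"hungry": 0.2, "ate pizza all week": 0.8})

print(f"Poll taken before lunch: {mostly_hungry:.0%} say yes")
print(f"Poll taken after pizza week: {mostly_full:.0%} say yes")
```

Neither number is a lie, but reporting either one without saying who was asked, and when, invites the wrong conclusion.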
So, as you can see, understanding the process, research, and data collection behind statistics is key to fully understanding what a chart or graph is trying to present. Understanding INTENT is key to understanding data presentation. You need to know whether the statistical data is frivolous and lacking real substance or whether it's substantial and unbiased.
Keep in mind that the problems presented here are not a condemnation of statistics as evidence. Statistics, when used correctly, are an excellent and concise way to express evidence. I just want everyone to be aware that you must examine the data sources and review them for relevance and validity before taking them as fact or using them to prove a point.