Lies, Damned Lies, and Statistics: Difference between revisions

m (update links)
m (clean up)
Line 14: Line 14:
The whole business of throwing percentages at people in advertising is almost destined for this kind of abuse. Absolute measures are more likely to be understood accurately, and thus are less likely to be used in advertising.


The bogus uses of statistics are intended to imply a causal link between two elements when they are not linked, the link is questionable, or the link is opposite to what is implied. A beautiful example? "Coca-Cola causes drowning". By looking at statistics on drowning and Coca-Cola sales, you can see a link: [[Insane Troll Logic|more people go swimming on hot days, and more people buy Coke on hot days]]. Likewise, birth rates per head of population are higher in areas where there are more storks, because birth rates tend to be higher in rural areas, which is where one finds the [[Delivery Stork]]. [[You Fail Logic Forever|Correlation does not equal causation]].
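A minimal sketch of the hot-day effect, in Python. Every number here is invented for illustration: the temperature range, the coefficients, the noise levels and the names `pearson`, `cola_sales` and `incidents` are assumptions, not real data. Temperature drives both series, and neither series affects the other.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical daily temperatures in degrees Celsius.
temps = [rng.uniform(10, 35) for _ in range(2000)]
# Both quantities depend on temperature plus independent noise.
cola_sales = [20 * t + rng.gauss(0, 50) for t in temps]
incidents = [0.2 * t + rng.gauss(0, 1) for t in temps]

overall = pearson(cola_sales, incidents)

# Hold the confounder roughly constant: look only at days of similar heat.
band = [i for i, t in enumerate(temps) if 24 <= t <= 26]
within_band = pearson([cola_sales[i] for i in band],
                      [incidents[i] for i in band])

print(f"all days: r = {overall:.2f}")
print(f"24-26 C days only: r = {within_band:.2f}")
```

Across all days the two series should come out strongly correlated; among days of similar temperature the correlation should mostly vanish, which is what you would expect when a third factor is doing all the work.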


Also be aware of the Law of Very Large Numbers. Any fraction of a very large number is likely to be a large number, no matter how small the fraction is. It is estimated that 2,135,000 Americans have used cocaine (including crack) in the past month. But that's only 0.7% of the population! So, is this a lot of people, or not?
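The same figure can be put both ways with a few lines of arithmetic. This sketch in Python uses only the two numbers quoted above; the variable names are made up for illustration, and the population it prints is merely what those two numbers imply, not an official count.

```python
users = 2_135_000          # estimated past-month users, from the text
share_per_thousand = 7     # 0.7% expressed as 7 per 1,000

# Population implied by the two quoted figures (integer arithmetic, exact here).
implied_population = users * 1000 // share_per_thousand

print(f"{users:,} people")                        # the alarming framing
print(f"{share_per_thousand} in every 1,000")     # the reassuring framing
print(f"implied population: {implied_population:,}")
```

Both framings describe exactly the same data; which one gets quoted depends on the impression the speaker wants to leave.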
Line 36: Line 36:
----
{{examples}}
== Straight examples ==
* During [[World War I]], helmets were almost withdrawn from British soldiers. When Britain started issuing steel helmets to all soldiers on the Western Front in 1916, generals began to call for their removal, as they increased the incidence of head wounds twelvefold and doubled total casualties. The reason? If someone gets hit in the head by shrapnel from some woolly bear or flying frog (German H.E. or rifle grenade) and ''lives'', it's a "head wound", and if they are unable to fight, the person is a "casualty"; if they ''die'' from a bullet in the brain, they are a "fatality" and so don't appear in the casualty statistics. Since helmets let more people survive, the number of head wounds soared. A politician used the number to support his position that "helmets are expensive and cause cowardice", and never explained what it really meant, which was doubly effective as most people don't know the difference between "casualty" and "fatality". The real justification behind the attempt to withdraw helmets boiled down to "all a dead soldier needs is a ''funeral''." A '''wounded''' soldier gets dragged out of combat by at least one of his buddies, and is then provided with weeks, months or even years of medical attention. From a [[A Million Is a Statistic|statistical standpoint]], adopting helmets drastically increased the effectiveness of enemy weapons, and a '''''lot''''' of [[WW 1]] generals genuinely believed in [[We Have Reserves|disposable]] [[Zerg Rush|personnel]]. Luckily, more ethical parties changed the way they recorded casualties, or the helmets would likely have been recalled.
* Something of a historical subversion: During [[World War II]], the Royal Air Force wanted to add more armor to their planes, but because of weight limits they needed to know which places needed the armor most. So they examined the planes after they came back and counted how often bullet holes were found in certain areas... and then placed armor in the places that showed the ''fewest'' bullet holes. They reasoned that any place that did have bullet holes was a place where a plane could be hit and still fly. It helped that no plane that came back ever had holes where the gas tank was, because planes whose tank was hit would explode and ''not come back''.
Line 65: Line 65:
* Even ''[[QI]]'' falls victim to this from time to time. One question was "What is three times more dangerous than war?" The answer given was work, because three times as many people are killed each year in work-related accidents as die in wars. Now, consider how much time you spent working last year compared to how long you were in a warzone.
** This prompted unhelpful responses from the panelists: "What if you're a soldier?" "What if you work in a shoe shop, ''near'' a war?"
** QI is well known for deliberately phrasing questions like this in order to confuse the participants. See also "how many moons does the Earth have": funny as it was, Cruithne and similar objects are near-Earth asteroids in resonant orbits rather than moons in the usual sense. They've now fessed up to the "many moons" thing as an error.
* The ''Column 8'' column in the ''Sydney Morning Herald'' once referenced a statistical correlation between the difficulty of the sudoku on a given day and the price of petrol.
* When [[Ronald Reagan]]'s Attorney General Edwin Meese wanted "proof" that pornography was evil, he created the Attorney General's Commission on Pornography. The commission members were a preselected cohort of anti-pornography campaigners. Not surprisingly, they discovered that statistics "proved" that pornography caused crime. However, the 1970 report of the President's Commission on Obscenity and Pornography, which was done by honest researchers and was highly praised for accuracy and honesty, discovered that there was "no evidence to date that exposure to explicit sexual materials plays a significant role in the causation of delinquent or criminal behavior among youths or adults."
* There is a book produced for people in radio every year that compiles countless statistics about all stations, taken from polls. These are used to attract advertisers. The less successful stations, which have very few listeners, are often forced to hire people to read through the book and pull out as many favorable statistics as possible, no matter how convoluted they may be. With the huge amount of data in the book, it's possible to say, for instance, that 85% of married men of some arbitrary age, with income in some arbitrary range, who own a ferret will love your show, even though they represent a tiny proportion of the population. If you're selling ferret food, that's exactly whom you want to reach.
* The [[Justice League]] was asked "Maybe you'd care to explain why on your watch, 50% of marriages now end in divorce, and the other 50% end in death!" Aside from the fact that the same was true before the formation of the League, until the end of time, a significant portion of marriages will end in death, as people do have a tendency to die, married or not.
* In the heated German censorship debate about blocking sites allegedly containing child pornography, an organization in favor of this censorship law commissioned a survey from a market research institute, asking whether the person taking the survey was against child pornography and in favor of blocking the websites containing it. Over 90% answered 'Yes'. Another survey commissioned by an opposing NGO, at the same institute no less, used a slightly different phrasing: Do you agree with blocking the content despite the fact that this content still exists and is easily accessible after the censorship? Over 90% answered 'No'.
* An old advert for Guinness ran with the quote "88.2% of statistics are made up on the spot", attributed to Vic Reeves.
* Fletcher Knebel was apparently responsible for "Smoking is the leading cause of statistics", the most famous of which is "100% of non-smokers die".
** In Montreal, there was an ad campaign run by a gum company whose gum came in round shapes instead of the usual square shapes. The ad said, "100% of people who chew square gum die."
* Many casinos like to advertise their slot machines with lines like "Up To 99% Payout!" to make it sound like the player has a good chance to win. First, "up to" means the payout could be 1% for all you know (although laws usually set a minimum). Second, even a 99% payout means that for every $100 you put in the machine, on average, you'll get $99 back, i.e. you still lose. That "99% payout" is also an average based on something like one million pulls (plays) of the machine. If you play 100 times on one slot machine, you're not getting a representative sample of that average. These machines work differently in the UK. UK Fun With Prizes machines are required by law to seek their set hold percentage within a certain number of spins (usually 10,000). To achieve this, they naturally [[The Computer Is a Cheating Bastard|cheat all the time]]. They also can be, and often are, programmed to go on a suck cycle and take in way more money than they need to, in order to save up for a large series of payouts later.
* A common problem encountered is Simpson's Paradox, best demonstrated by example: Suppose hospitals 1 and 2 are nearby, but hospital 1 is better equipped for treating people with severe injuries, so proportionally more of the people taken there are badly hurt. It does better than hospital 2 at treating badly hurt people, and also does better at treating people who are not badly hurt. However, since badly hurt people are more likely to die than people who are not badly hurt, whichever hospital they go to, hospital 1 may still have a higher overall death rate.

Simpson's Paradox is when data shows one trend, but dividing it into categories shows the opposite trend. In the example above, hospital 1 has a higher death rate, but if the patients are split into categories based on severity of injury, it has a lower death rate in each category.
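To see how this can happen, here is a small sketch in Python with invented patient counts. None of these numbers come from a real hospital, and `death_rate` is just a helper made up for the example.

```python
# (patients, deaths) per hospital and injury severity; all counts hypothetical.
data = {
    "hospital 1": {"severe": (900, 90), "mild": (100, 1)},
    "hospital 2": {"severe": (100, 15), "mild": (900, 18)},
}

def death_rate(groups):
    """Deaths divided by patients, pooled over the given (patients, deaths) pairs."""
    groups = list(groups)
    patients = sum(p for p, _ in groups)
    deaths = sum(d for _, d in groups)
    return deaths / patients

for name, by_severity in data.items():
    severe = death_rate([by_severity["severe"]])
    mild = death_rate([by_severity["mild"]])
    overall = death_rate(by_severity.values())
    print(f"{name}: severe {severe:.1%}, mild {mild:.1%}, overall {overall:.1%}")
```

With these made-up counts, hospital 1 has the lower death rate in both categories (10% against 15% for severe cases, 1% against 2% for mild ones) but the higher one overall (9.1% against 3.3%), simply because it takes in far more of the severe cases.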
** The same goes for good doctors and bad doctors, as told in the book [[Super Freakonomics]]. Good doctors are generally given tougher cases while bad doctors are given easier cases. If you look at death rates, you see that some doctors have higher death rates, but these are usually the good doctors. Patients with serious cases are more likely to die, so good doctors lose more of their patients than, say, the doctor who cures hiccups. The lesson is that you can be fairly certain that the doctor you receive at a hospital is competent enough to be assigned to you.
* When the Russian Orthodox Church is out to ban some more fun stuff, it likes to present itself as the "voice of the populace", backed by the claim that according to surveys "70% of Russians are Orthodox Christians". Then some major religious occasion comes along and the attendance rate inevitably amounts to 2-3%. Surveys about attendance at regular church services and general knowledge of Orthodox lore yield similar results.
Line 98: Line 100:
* [http://www.badscience.net/ badscience.net] occasionally shows how statistics get misused. For example, [http://www.badscience.net/2011/10/what-if-academics-were-as-dumb-as-quacks-with-statistics/ here] (on small samples it's quite possible that B isn't significantly different from A ''or'' C, but you can put it as "B isn't different from A, C is different from A, so we see that C is different from B", which is wrong) and [http://www.badscience.net/2011/12/this-guardian-story-is-dodgy-traps-in-data-journalism/ here] (limiting the view to one of many multipliers, which ''per se'' can't prove anything). Unsurprisingly, the areas with traditional ties to the snake oil trade suffer most.
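The slot machine arithmetic from the casino example above can be sketched in Python. The machine here is entirely hypothetical: a $1 stake, a single $99 prize and a 1-in-100 chance of winning are assumptions chosen so that the long-run payout comes to 99%. Real machines have far more complicated pay tables.

```python
stake = 1.00        # dollars per play (assumed)
prize = 99.00       # dollars paid on a win (assumed)
p_win = 0.01        # chance of winning on any one play (assumed)
plays = 100         # length of one short session

# Long-run average return per dollar staked: the advertised "payout".
payout = prize * p_win / stake

# Average loss over the session, even at a "99% payout".
expected_loss = plays * stake * (1 - payout)

# Chance that a 100-play session contains no win at all.
p_no_win = (1 - p_win) ** plays

print(f"payout: {payout:.0%}")
print(f"expected loss over {plays} plays: ${expected_loss:.2f}")
print(f"chance of winning nothing in {plays} plays: {p_no_win:.1%}")
```

On this made-up machine the average works out to losing about $1 per 100 plays, yet roughly a third of 100-play sessions win nothing at all, which is why a short session tells you little about the advertised average.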


== Other examples ==
* [[The Onion]] does parody this from time to time.
** One article was about a movement to shut down hospitals because "despite rapid advancement in medical technology, the world death rate remains at 100%."