Notes and Takeaways from How to Lie with Statistics


My notes

About How to Lie With Statistics

According to its author, Darrell Huff, How to Lie With Statistics is a primer in ways to use statistics to deceive: “The crooks already know these tricks; honest men must learn them in self-defense.”

About Darrell Huff

Darrell Huff (1913–2001) was an American writer best known as the author of How to Lie with Statistics, the best-selling statistics book of the second half of the twentieth century.

Statistics mislead

Averages, relationships, trends, and graphs are not always what they seem. A well-wrapped statistic misleads. When you come across a statistic, ask yourself a question: “How can anyone have found out such a thing?”

The secret language of statistics is used to sensationalize, inflate, confuse, and oversimplify. It takes advantage of fact-minded people.

The essential beauty of doing your lying with statistics is that the lie can’t be pinned on you. Sometimes statistics are manipulated by the statistician. Other times, the stat gets twisted, exaggerated, oversimplified, or distorted by a salesman, a public relations expert, a journalist, or an advertising copywriter.

Statistics make it easy for advertisers to craft compelling headlines without committing falsehoods.

Give statistical materials a sharp look before accepting them.

The sampling procedure

The sampling procedure is at the heart of many of the statistics you will come across.

The result of a sampling study is no better than the sample on which it is based.

To be reliable, a statistic based on sampling must use a representative sample: one from which every source of bias has been removed.

If your sample is large enough and selected properly, it will represent the whole well enough for most purposes. If it is not, it may be far less accurate than an intelligent guess.

Many statistics you come across come from samples that are biased, too small, or both.

The reliability of a sample can be compromised by both visible and invisible sources of bias. Even if you cannot point to a demonstrable source of bias, stay skeptical of the results as long as bias is possible, and it always is.

The basic sample is the kind called “random.” It is selected by pure chance from the “universe,” a word by which the statistician means the whole of which the sample is a part. The test of the random sample is this: Does every name or thing in the whole group have an equal chance to be in the sample? This is so difficult and expensive to obtain for many uses that sheer cost eliminates it.

A more economical substitute is called “stratified random sampling.” To obtain a stratified sample, you divide the population into groups in proportion to their known prevalence. And that’s where trouble starts, as your information about their proportion may be incorrect. You also still have to figure out how to get a random sample within the stratification.
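Here is a minimal Python sketch of the two approaches; the population, strata, and proportions are invented purely for illustration:

```python
import random

# Hypothetical universe: 10,000 people tagged with a made-up stratum label.
population = ["urban"] * 6000 + ["rural"] * 4000

# Simple random sample: every member has an equal chance of selection.
simple_sample = random.sample(population, 500)

# Stratified sample: draw from each stratum in proportion to its known
# (or merely assumed!) share of the universe. If those shares are wrong,
# the "representative" sample inherits the error.
def stratified_sample(strata, total_size):
    universe_size = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        k = round(total_size * len(members) / universe_size)
        sample.extend(random.sample(members, k))
    return sample

strata = {"urban": ["urban"] * 6000, "rural": ["rural"] * 4000}
sample = stratified_sample(strata, 500)
print(sample.count("urban"), sample.count("rural"))  # 300 urban, 200 rural
```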

Sampling bias

The operation of a sample survey comes down to a running battle against sources of bias. What you must remember is that the battle is never won.

Take, for example, a sampling of people’s incomes collected via a survey. When asked about their incomes, some people will exaggerate out of vanity or optimism. Others minimize to avoid contradicting tax returns. These two tendencies may cancel each other out, or one may outweigh the other. The point is that we do not know.

Take, for example, a poll of people in the streets. You bias your sample against stay-at-homes.

Take, for example, a door-to-door survey. You bias your sample against employed people.

Take, for example, an evening phone interview. You bias your sample against moviegoers and night-clubbers.
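A toy simulation can make the door-to-door bias concrete. Everything below (income levels, employment share, who is home by day) is invented purely for illustration:

```python
import random
import statistics

random.seed(1)

# Invented population: employed people earn more on average and are
# rarely home during the day.
population = []
for _ in range(10_000):
    employed = random.random() < 0.6
    income = random.gauss(60_000 if employed else 30_000, 8_000)
    home_by_day = random.random() < (0.1 if employed else 0.8)
    population.append((income, home_by_day))

true_mean = statistics.mean(income for income, _ in population)
survey_mean = statistics.mean(income for income, home in population if home)

print(f"true mean income:  {true_mean:,.0f}")
print(f"door-to-door mean: {survey_mean:,.0f}")  # biased low: misses workers
```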

Sampling sizes

One of the key advantages of large sample sizes is that they dampen random variation, making observed differences more likely to reflect real effects rather than chance.

The law of averages is informative only in the presence of large numbers.

The question of “how many is enough” has no single answer. It depends on how large and varied the population you are studying is.
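A quick coin-flip sketch shows why: the share of heads wanders widely in small samples and settles near 50 percent only as the sample grows:

```python
import random

random.seed(0)

# Flip a fair coin n times and report the share of heads.
for n in (10, 100, 1_000, 100_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    print(f"n={n:>7}: {heads / n:.1%} heads")
```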

The meaning of “average”

The word “average” has a loose meaning. People often use the definition that best supports their point. An unqualified average is effectively meaningless. When you see an average figure (especially if it’s pay-related), ask, “Which average, and who’s included?”

When you are told that something is an average, the first step is to determine which type of average it is. The common definitions of average are mean, median, and mode.

The mean is the arithmetic average. Add up all the values and divide by how many there are.

The median is the middle value when all the values are ordered from smallest to largest. If there are two middle values, it is their average.

The mode is the value that appears most often in the data set.
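Python’s standard library computes all three; the salary figures below are made up to show how a single large value separates them:

```python
import statistics

# A small, invented salary list (in thousands) with one large value.
salaries = [30, 30, 35, 40, 45, 50, 200]

print(statistics.mean(salaries))    # ~61.4: pulled up by the outlier
print(statistics.median(salaries))  # 40:    the middle value
print(statistics.mode(salaries))    # 30:    the most frequent value
```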

Figures do not always reveal which kind of average they report or how precise they are. When the type of average is hidden or hard to determine, be skeptical.

An unqualified statement of average is a red flag.

Why the distribution matters for averages

Mean, median, and mode are close together when you deal with data that has what is called a normal distribution. A normal distribution is a symmetrical, bell-shaped probability distribution where most values cluster around the mean, with fewer values appearing as you move away from the center in either direction. In a normal distribution, the mean, median, and mode all fall at the same point.

Mean, median, and mode are not close together when you deal with data that has what is called a skewed distribution. A skewed distribution is an asymmetrical probability distribution where the data points are not evenly distributed around the mean. Instead of forming a symmetrical bell shape, the distribution has a longer tail on one side.

In a right-skewed (positively skewed) distribution, the tail extends toward the right (higher values), and the mean is typically greater than the median. In a left-skewed (negatively skewed) distribution, the tail extends toward the left (lower values), and the mean is typically less than the median.

In skewed distributions, the mean, median, and mode fall at different points, which is why it’s important to know which type of average is being used when interpreting statistics. It sounds like a joke, but with right-skewed data such as incomes, you end up in a situation where most people are below the “average” (the mean).
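A small simulation makes the “below average” point concrete. The lognormal incomes below are a standard stand-in for right-skewed data, not figures from the book:

```python
import random
import statistics

random.seed(2)

# Right-skewed incomes: the long right tail drags the mean above the
# median, so most people sit "below average" (below the mean).
incomes = [random.lognormvariate(10, 1) for _ in range(10_000)]

mean = statistics.mean(incomes)
median = statistics.median(incomes)
below_mean = sum(i < mean for i in incomes) / len(incomes)

print(f"mean:   {mean:,.0f}")
print(f"median: {median:,.0f}")  # well below the mean
print(f"{below_mean:.0%} of incomes fall below the mean")  # roughly 69%
```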

Tests of significance

A test of significance is a way of reporting how likely it is that a test figure represents a real result rather than something produced by chance. It is most simply expressed as a probability. For example, if you hear there are nineteen chances out of twenty that the figure has a specified degree of precision, that is a five percent level of significance.

If a data source provides a degree of significance, you’ll have a better idea of where it stands.
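A Monte Carlo sketch of the idea, with made-up numbers: how often would pure chance produce a result as extreme as the one observed?

```python
import random

random.seed(3)

# Suppose we observed 60 heads in 100 flips. How often does a fair coin
# do that (or better) by chance alone?
trials, flips, observed = 20_000, 100, 60

extreme = sum(
    sum(random.random() < 0.5 for _ in range(flips)) >= observed
    for _ in range(trials)
)

print(f"p ≈ {extreme / trials:.3f}")  # ~0.03: beats the five percent level
```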

Statistical error and ranges

Standardized tests like the IQ test are samplings of intellect, and like any product of the sampling method, these tests suffer from statistical error. A statistical error expresses the precision or reliability of a figure. The probable error and the standard error are figures that represent how accurately your sample can be taken to represent the universe.

The way to think about IQs and many other sampling results is in ranges. “Normal” is not 100, but the range of 90 to 110. Comparisons between figures with small differences are meaningless. Keep a plus-or-minus range in mind (e.g., 100 ± 10), even (or especially!) when it is not stated.
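A short sketch of reporting a range rather than a point; the 25 scores are invented:

```python
import math
import statistics

# Invented IQ-style scores from a sample of 25 test-takers.
scores = [96, 104, 99, 110, 93, 101, 107, 95, 100, 103,
          98, 112, 90, 105, 97, 102, 108, 94, 99, 106,
          100, 91, 109, 98, 103]

mean = statistics.mean(scores)
standard_error = statistics.stdev(scores) / math.sqrt(len(scores))

# Report mean ± 2 standard errors (a rough 95% band), not a bare number.
print(f"{mean:.1f} ± {2 * standard_error:.1f}")
```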

Pictures and graphs

When numbers in a table are too complex and words are insufficient, we resort to drawing a picture.

The simplest graph is the line graph. It is useful for showing trends, which everyone is interested in.

It is easy to manipulate the story a graph tells by changing the proportion between the ordinate (the vertical y-axis) and the abscissa (the horizontal x-axis). This deception can be accomplished without lying.
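A minimal sketch of the trick, assuming matplotlib is available; the sales figures are invented:

```python
import matplotlib.pyplot as plt

# The same modest trend, plotted twice: once on a zero-based axis, once
# with the y-axis cropped to the data range so the rise looks dramatic.
months = range(12)
sales = [20.0 + 0.1 * m for m in months]  # a slow drift upward

fig, (honest, dramatic) = plt.subplots(1, 2, figsize=(8, 3))

honest.plot(months, sales)
honest.set_ylim(0, 25)
honest.set_title("Zero-based axis")

dramatic.plot(months, sales)
dramatic.set_ylim(19.9, 21.2)  # same data, steeper story
dramatic.set_title("Truncated axis")

plt.tight_layout()
plt.show()
```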

A pictorial graph (also called a pictograph or pictogram) is a type of graph that uses pictures to represent data rather than bars, lines, or numbers. It’s mainly used to make simple comparisons easy to see at a glance. Each picture stands for a certain number of items (for example, 1 icon = 10 people). The images are repeated or scaled to illustrate the relative size of each category. The bar chart is a close relative that compares quantities with bars of proportional length instead of pictures.

The pictorial graph is often used to dramatize statistics. For example, drawing a picture twice as tall and scaling its width proportionally shows four times the area and suggests eight times the volume, which is how the mind tends to interpret the picture; a mere 2x difference reads as far more. The visual impression often dominates the numerical, written one. What you see often outweighs what you hear and read. It’s the statistical equivalent of the before-and-after photograph. A room is photographed twice to show you how wonderful a new coat of paint looks. But new furniture and wall art have also been added, or higher-quality photography is at play.
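The arithmetic behind the exaggeration is simple enough to check in a few lines:

```python
# Doubling a pictograph's height while keeping its proportions doubles
# every dimension, so the drawn area and the implied 3-D volume grow
# much faster than the quantity being depicted.
scale = 2
print("height ratio:", scale)       # 2x: the honest comparison
print("area ratio:  ", scale ** 2)  # 4x: what the page actually shows
print("volume ratio:", scale ** 3)  # 8x: what the eye tends to read
```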

Semi-attached figures

A semi-attached figure is a statistical deception where you prove something unrelated to your claim, then present it as if it’s the same thing. You pick two things that sound the same but are not, and count one while reporting it as the other. Semi-attached figures are used when you can’t prove what you want to prove: you demonstrate something else and pretend the two are the same.

You can’t prove that your drug cures colds, but you can publish a lab report showing that half an ounce of the stuff killed 31,108 germs in a test tube in eleven seconds. Another example is reporting total deaths involving railroads (4,712) when most were from car-train collisions and only 132 were train passengers. Or: more people were killed by airplanes last year than in the 1800s, so modern planes are more dangerous? No. Airplanes didn’t exist in the 1800s.

“Laboratory tests” (especially “independent laboratory tests”) are a red flag for semi-attached figures.

There are often many ways to express the same figure: as a return on sales, a return on investment, an absolute profit, or a change from last year. People tend to pick whichever expression serves their purpose best.
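A toy example with invented figures shows how the same year’s results can be dressed up four different ways:

```python
# One business, one year, four ways to describe it. All numbers invented.
sales, investment = 1_000_000, 200_000
profit, last_year_profit = 50_000, 25_000

print(f"return on sales:      {profit / sales:.0%}")       # a modest 5%
print(f"return on investment: {profit / investment:.0%}")  # a healthy 25%
print(f"profit:               ${profit:,}")
print(f"vs last year:         {(profit - last_year_profit) / last_year_profit:+.0%}")
```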

The post hoc fallacy

The post hoc fallacy (short for post hoc ergo propter hoc, meaning "after this, therefore because of this") is the false assumption that because one event (A) preceded another event (B), event A must have caused event B. It is a logical fallacy that confuses coincidental chronological order with cause and effect. If B follows A, it does not mean that A has caused B.

Inspect any and all statements of relationship. If a figure is claimed to prove something, investigate. It could be a correlation produced by chance. Given a small enough sample, you are likely to find some correlation between any pair of characteristics or events that you can think of.
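A quick simulation shows how readily chance produces “correlations” in small samples; the data below is pure noise:

```python
import math
import random
import statistics

random.seed(4)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# 100 pairs of completely unrelated variables, 10 cases each: with so
# few cases, sizeable correlations appear by chance alone.
best = max(
    abs(pearson([random.random() for _ in range(10)],
                [random.random() for _ in range(10)]))
    for _ in range(100)
)
print(f"strongest chance correlation found: {best:.2f}")  # often above 0.6
```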

One common kind of covariation occurs when the relationship is real, but it’s unclear which variable is the cause and which the effect.

Another common situation is when neither variable has any effect at all on the other, yet there is a real correlation.

A negative correlation means that as one variable increases, the other tends to decrease. A positive correlation means that as one variable increases, the other tends to increase.

Using average college graduate earnings to claim that attending college will increase earnings is an example of the post hoc fallacy. Here, a real correlation (graduates earn more) is used to bolster an unproven cause-and-effect relationship: the kind of people who go to college might have earned more anyway.

The percentage and decimal

Percentages can mislead. Any percentage figure based on a small number of cases is likely to be misleading; the raw figure is often more informative. Decimals can add a false air of precision.
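A two-line illustration with invented counts: one patient’s outcome swings the percentage by a third:

```python
# With only a handful of cases, the percentage hides how flimsy it is;
# the raw counts (invented here) are more informative.
for cured, treated in [(2, 3), (1, 3)]:
    print(f"{cured}/{treated} cured = {cured / treated:.1%}")
# 2/3 cured = 66.7%
# 1/3 cured = 33.3%
```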

Percentages vs percentage points

Pay attention to the difference between percentage increases and percentage points. You can make something sound modest by calling it a three-percentage-point rise, even though those percentage points represent a one-hundred percent increase.
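A worked example, with invented rates, of how the same change reads both ways:

```python
# A rate that moves from 3% to 6% (figures invented for illustration).
old_rate, new_rate = 0.03, 0.06

points = (new_rate - old_rate) * 100         # percentage points
relative = (new_rate - old_rate) / old_rate  # percent increase

print(f"{points:.0f} percentage points")  # "a modest 3-point rise"
print(f"{relative:.0%} increase")         # "rates have doubled!"
```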

The shifting base

Remember the story of the merchant who was asked to explain how he could sell rabbit sandwiches so cheaply. “Well,” he said, “I have to put in some horse meat too. But I mix ’em fifty-fifty: one horse, one rabbit.” The fifty-fifty is true of the count of animals but not of the meat; the figure sounds fair only because the base has quietly shifted.
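The arithmetic of the joke, with weights invented purely for illustration:

```python
# "Fifty-fifty: one horse, one rabbit." Equal by count, not by weight.
horse_lbs, rabbit_lbs = 1_000, 4  # invented weights

by_count = 1 / 2
by_weight = rabbit_lbs / (horse_lbs + rabbit_lbs)

print(f"rabbit share by count:  {by_count:.0%}")   # 50%
print(f"rabbit share by weight: {by_weight:.2%}")  # ~0.40%
```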

How to assess a statistic

In How to Lie With Statistics, Darrell Huff provides a 5-question framework for assessing the validity of a statistic. The 5 questions are:

  1. Who says so? Who is the study for and by? Who is the data source? Look for bias, both conscious and unconscious. Who is behind the claim that “the survey shows…”?

  2. How does he know? Look for evidence of a biased or improper sample. Is the sample large enough to permit any reliable conclusion? Are there enough cases for significance?

  3. What’s missing? Is the sample size provided? If an average is referenced, is it clear which definition is used? If correlation is mentioned, is a standard or probable error provided?

  4. Did somebody change the subject? What changed? Did the base change? Is there a semi-attached figure?

  5. Does it make sense? Look for nonsense.

Random notes

One way to respond to someone without offending them while also limiting further correspondence is, “There may be something in what you say.”

Quotes

“Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write.” —H. G. Wells

“It ain’t so much the things we don’t know that get us in trouble. It’s the things we know that ain’t so.” —Artemus Ward