Statistics for Dummies

Doing a Hypothesis Test

A hypothesis test is a statistical procedure that's designed to test a claim. Typically, the claim is being made about a population parameter (one number that characterizes the entire population). Because parameters tend to be unknown quantities, everyone wants to make claims about what their values may be. For example, the claim that 25% (or 0.25) of all women have varicose veins is a claim about the proportion (that's the parameter) of all women (that's the population) who have varicose veins (that's the variable, having or not having varicose veins).

HEADS UP

Do you think that anyone actually knows for certain that the percentage of all women who have varicose veins is exactly 25? No; they're making a claim, not stating a fact. Watch out for statements like this.

Defining what you're testing

To get more specific, the varicose vein claim is that the parameter, the population proportion (p), is equal to 0.25. (This claim is called the null hypothesis.) If you're out to test this claim, you're questioning the claim and have a hypothesis of your own (called the research hypothesis, or alternative hypothesis). You may hypothesize, for example, that the actual proportion of women who have varicose veins is lower than 0.25, based on your observations. Or, you may hypothesize that due to the popularity of high-heeled shoes, the proportion may be higher than 0.25. Or, if you're simply questioning whether the actual proportion is 0.25, your alternative hypothesis is, "No, it isn't 0.25."

In addition to testing hypotheses about categorical variables (having or not having varicose veins is a categorical variable), you can also test hypotheses about numerical variables, such as the average commuting time for people working in Los Angeles or their average household income. In these cases, the parameter of interest is the population average or mean (denoted μ). Again, the claim is that this parameter is equal to a certain value, versus some alternative.

Hypotheses can be tested about more than one population parameter, too. For example, you may want to compare average household incomes or commuting times of people from two or more major cities. Or you may want to see whether a link exists between commuting time and income. All of these questions can be answered using hypothesis tests; while the details differ for each situation, the general ideas are the same. I go over the one-sample case for means and proportions (large samples) in this chapter; Chapter 15 provides the particulars of many commonly used hypothesis tests.

Setting up the hypotheses

Every hypothesis test contains two hypotheses. The first hypothesis is called the null hypothesis, denoted H_o. The null hypothesis always states that the population parameter is equal to the claimed value. For example, if the claim is that the average time to make a name-brand ready-mix pie is five minutes, the statistical shorthand notation for the null hypothesis in this case would be as follows: H_o: μ = 5.

What's the alternative?

Before actually conducting a hypothesis test, you have to put two possible hypotheses on the table; the null hypothesis is one of them. But, if the null hypothesis is found not to be true, what's your alternative going to be? Actually, three possibilities exist for the second (or alternative) hypothesis, denoted H_a. Here they are, along with their shorthand notations in the context of the example:

The population parameter is not equal to the claimed value (H_a: μ ≠ 5).
The population parameter is greater than the claimed value (H_a: μ > 5).
The population parameter is less than the claimed value (H_a: μ < 5).

Which alternative hypothesis you choose in setting up your hypothesis test depends on what you're interested in concluding, should you have enough evidence to refute the null hypothesis (the claim).

For example, if you want to test whether a company is correct in claiming its pie takes 5 minutes to make and you also want to know whether the actual average time is more or less than that, you use the not-equal-to alternative. Your hypotheses for that test would be H_o: μ = 5 versus H_a: μ ≠ 5.

If you only want to see whether the time turns out to be greater than what the company claims (that is, the company is falsely advertising its prep time), you use the greater-than alternative, and your two hypotheses are H_o: μ = 5 versus H_a: μ > 5.

Finally, say you work for the company marketing the pie, and you think the pie can be made in less than 5 minutes (and could be marketed by the company as such). The less-than alternative is the one you want, and your two hypotheses would be H_o: μ = 5 versus H_a: μ < 5.

Knowing which hypothesis is which

How do you know which hypothesis to put in H_o and which one to put in H_a? Typically, the null hypothesis says that nothing new is happening; the previous result is the same now as it was before, or the groups have the same average (their difference is equal to zero). In general, you assume that people's claims are true until proven otherwise.

Tip

Hypothesis tests are similar to jury trials, in a sense. In a jury trial, H_o is similar to the not-guilty verdict, and H_a is the guilty verdict. You assume in a jury trial that the defendant isn't guilty unless the prosecution can show beyond a reasonable doubt that he or she is guilty. If the jury says the evidence is beyond a reasonable doubt, they reject H_o, not guilty, in favor of H_a, guilty.

In general, when hypothesis testing, you set up H_o and H_a so that you believe H_o is true unless your evidence (your data and statistics) shows you otherwise. And in that case, where you have sufficient evidence against H_o, you reject H_o in favor of H_a. The burden of proof is on the researcher to show sufficient evidence against H_o before it's rejected. (That's why H_a is often called the research hypothesis, because H_a is the hypothesis that the researcher is most interested in showing.) If H_o is rejected in favor of H_a, the researcher can say he or she has found a statistically significant result; that is, the results refute the previous claim, and something different or new is happening.

HEADS UP

In many cases, people set up hypothesis tests because they're out to show that H_o isn't true, supporting the alternative hypothesis. (The mentality is, why do research just to show that something has stayed the same?) The results that you hear about in the media are generally the ones that are able to show H_o isn't true; this is what makes news. In many cases that's a good thing, though, because researchers and manufacturers have to stay on their toes to avoid negative publicity surrounding a product recall, a lawsuit, or a government investigation. That's because if one of their claims (H_o) is rejected by someone conducting an independent hypothesis test, the researchers or manufacturers are being judged as guilty of false advertising or false claims, which is not good.

Gathering the evidence: The sample

After you've set up the hypotheses, the next step is to collect your evidence and determine whether your evidence corroborates the claim made in H_o. Remember, the claim is made about the population, but you can't test the whole population; the best you can usually do is take a sample. As with any other situation in which statistics are being collected, the quality of the data is extremely critical. (See Chapter 2 for lots of examples of statistics that have gone wrong.)

Good data start with a good sample. The two main issues to consider when selecting your sample are avoiding bias and being accurate. To avoid bias, take a random sample (meaning everyone in the population must have an equal chance of being chosen). To be accurate, choose a large enough sample size. (See Chapter 3.)

Compiling the evidence: The statistics

After you select your sample, the appropriate number-crunching takes place. Your null hypothesis makes a statement about what the population parameter is (for example, the proportion of all women who have varicose veins or the average miles per gallon of a U.S.-built light truck). In statistical jargon, the data you collect measure that variable of interest, and the statistics that you calculate will include the sample statistic that most closely estimates the population parameter. In other words, if you're testing a claim about the proportion of women with varicose veins, you need to calculate the proportion of women in your sample who have varicose veins. If you're testing a claim about the average miles per gallon of a U.S.-built light truck, your statistic should be the average miles per gallon of the light trucks in your sample. (See Chapter 5 for all the information you need on calculating statistics.)
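As a rough illustration of that number-crunching, here is a minimal Python sketch that computes a sample proportion and a sample mean. The data values are made up for the example (they don't come from any real survey), and the variable names are just placeholders.

```python
from statistics import mean

# Hypothetical sample: True means the woman has varicose veins.
has_varicose_veins = [True, False, False, True, False,
                      False, False, False, True, False]

# Sample proportion: number of "yes" answers divided by the sample size.
sample_proportion = sum(has_varicose_veins) / len(has_varicose_veins)

# Hypothetical sample of light trucks: miles per gallon for each truck.
miles_per_gallon = [18.5, 21.0, 19.5, 20.0, 21.0]

# Sample mean: the average of the observed values.
sample_mean = mean(miles_per_gallon)

print(f"sample proportion = {sample_proportion:.2f}")
print(f"sample mean = {sample_mean:.1f} mpg")
```

The proportion estimates p and the mean estimates μ, so each statistic matches the kind of parameter named in H_o.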

Standardizing the evidence: The test statistic

After you have your sample statistic, you may think you're done with the analysis part and are ready to make your conclusions, but you're not. The problem is, you have no way to put your results into any kind of perspective just by looking at them in their regular units. That's because you know that your results are based only on a sample and that sample results are going to vary. That variation needs to be taken into account, or your conclusions could be completely wrong. (How much do sample results vary? Sample variation is measured by the standard error; see Chapter 9 for more on this.)

Suppose the claim is that the percentage of all women with varicose veins is 25 percent, and your sample of 100 women had 20 percent with varicose veins. The standard error for your sample percentage is 4 percent (according to formulas in Chapter 9), which means that your results are expected to vary by about twice that, or about 8 percent, according to the empirical rule (see Chapter 10). So a difference of 5 percent between the claim and your sample result (25% − 20% = 5%) isn't that much, in these terms. This represents a distance of less than 2 standard errors away from the claim. Therefore, you fail to reject the claim, H_o, because your data can't refute it. (That isn't the same as proving the claim true.)

However, suppose your sample percentage was based on a sample of 1,000 women, not 100. This decreases the amount by which you expect your results to vary, because you have more information. The standard error becomes about 0.0126, or roughly 1.3 percent, and the margin of error is twice that, or about 2.5 percent on either side. Now a difference of 5 percent between your sample result (20 percent) and the claim (25 percent) is a more meaningful difference; it's way more than 2 standard errors away from the claim. Your results being based on 1,000 people shouldn't vary that much from the claim, so what should you conclude? You reject the claim (H_o), because your data don't support it.
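The arithmetic behind the two sample sizes can be sketched in a few lines of Python. This sketch assumes the standard error is computed from the sample proportion (0.20), which is what reproduces the 4 percent figure for 100 women; many treatments plug the claimed value (0.25) into the formula instead, which gives slightly larger standard errors. The function name is a placeholder.

```python
import math

def standard_error_of_proportion(p_hat, n):
    """Standard error of a sample proportion: sqrt(p_hat * (1 - p_hat) / n)."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

claimed = 0.25  # value stated in H_o
p_hat = 0.20    # proportion observed in the sample

results = {}
for n in (100, 1000):
    se = standard_error_of_proportion(p_hat, n)
    # Distance between the sample result and the claim, in standard errors.
    distance_in_se = (p_hat - claimed) / se
    results[n] = (se, distance_in_se)
    print(f"n = {n}: standard error = {se:.4f}, "
          f"distance = {distance_in_se:.2f} standard errors")
```

With 100 women the sample result sits well within 2 standard errors of the claim; with 1,000 women the same 5 percent gap lands far outside that range.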

The number of standard errors that a statistic lies above or below the mean is called a standard score (see Chapter 8). In order to interpret your statistic, you need to convert it from original units to a standard score. When finding a standard score, you take your statistic, subtract the mean, and divide the result by the standard error. In the case of hypothesis tests, you use the value in H_o as the mean. (That's because you assume H_o is true, unless you have enough evidence against it.) This standardized version of your statistic is called a test statistic, and it's the main component of a hypothesis test. (Chapter 15 contains the formulas for the most common hypothesis tests.)

The general procedure for converting a statistic to a test statistic (standard score) in the case of means/proportions:

Take your statistic minus the claimed value (given by H_o).
Divide by the standard error of the statistic (see Chapters 9 and 10).
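The two steps above can be sketched as a small Python function. The function name and the pie-timing numbers (sample mean 5.4 minutes, sample standard deviation 1.2 minutes, 36 pies) are hypothetical, chosen only to illustrate the calculation; the standard error of a sample mean is taken here as the sample standard deviation divided by the square root of the sample size.

```python
import math

def standard_score(statistic, claimed_value, standard_error):
    """Step 1: statistic minus the claimed value; step 2: divide by the standard error."""
    return (statistic - claimed_value) / standard_error

# Proportion example from the text: claim 0.25, sample result 0.20, standard error 0.04.
z_proportion = standard_score(0.20, 0.25, 0.04)

# Hypothetical mean example: H_o says the average pie time is 5 minutes.
sample_mean = 5.4          # assumed sample average, in minutes
sample_sd = 1.2            # assumed sample standard deviation, in minutes
n = 36                     # assumed number of pies timed
se_mean = sample_sd / math.sqrt(n)
z_mean = standard_score(sample_mean, 5.0, se_mean)

print(f"proportion test statistic: {z_proportion:.2f}")
print(f"mean test statistic: {z_mean:.2f}")
```

A negative test statistic means the sample result fell below the claimed value; a positive one means it fell above.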

Your test statistic represents the distance between your actual sample results and the claimed population value, in terms of number of standard errors. In the case of a single population mean or proportion, you know that these standardized distances should have a standard normal distribution if your sample size is large enough (see Chapters 8 and 9). So, to interpret your test statistic in these cases, you can see where it stands on the standard normal distribution (Z-distribution).
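To see where a test statistic stands on the Z-distribution, one option is Python's built-in `statistics.NormalDist` class (available in Python 3.8 and later). This is only a sketch: the two test statistics below are the ones implied by the varicose vein example when the standard error is based on the sample proportion, and the variable names are placeholders.

```python
from statistics import NormalDist

# Standard normal (Z) distribution: mean 0, standard deviation 1.
z_distribution = NormalDist(mu=0.0, sigma=1.0)

# Fraction of the distribution within 2 standard errors of the claim
# (the empirical rule's "about 95 percent").
within_two = z_distribution.cdf(2) - z_distribution.cdf(-2)

# Fraction of the distribution lying at or below each test statistic.
below_small_sample = z_distribution.cdf(-1.25)   # sample of 100 women
below_large_sample = z_distribution.cdf(-3.95)   # sample of 1,000 women

print(f"within 2 standard errors: {within_two:.3f}")
print(f"at or below -1.25: {below_small_sample:.4f}")
print(f"at or below -3.95: {below_large_sample:.6f}")
```

A result like -1.25 is fairly ordinary on this curve, while one near -3.95 sits far out in the tail, where very little of the distribution lies.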

Although you never expect a sample statistic to be exactly the same as the population value, you expect it to be close if H_o is true. That means, if you see that the distance between the claim and the sample statistic is small, in terms of standard errors, your sample isn't far from the claim and your data are telling you to stick with H_o. As that distance becomes larger and larger, however, your data are showing less and less support for H_o. At some point, you should reject H_o based on your evidence, and choose H_a. At what point does this happen? The next section addresses that issue.
