is : .
Calculate a test statistic based on the assumption that the null hypothesis is true. For hypothesis tests for a univariate population mean, the relevant test statistic is
$$\text{t-statistic} = \frac{m_Y - E(Y)}{s_Y / \sqrt{n}},$$
where $m_Y$ is the sample mean, $E(Y)$ is the value of the population mean in the null hypothesis, $s_Y$ is the sample standard deviation, and $n$ is the sample size.
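As a minimal sketch of this calculation, the following Python function computes the t‐statistic from raw data using only the standard library. The function name `t_statistic`, the argument `null_mean`, and the data values are hypothetical and chosen purely for illustration; they do not come from the book's datasets.

```python
import math
import statistics


def t_statistic(sample, null_mean):
    """One-sample t-statistic: (sample mean - null mean) / (s / sqrt(n))."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    sample_sd = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - null_mean) / standard_error


# Hypothetical data, invented for illustration only.
sample = [12.1, 9.8, 11.4, 10.9, 12.6, 10.2, 11.7, 11.3]
t = t_statistic(sample, null_mean=10.0)
print(f"t-statistic = {t:.3f}")
```

A sample mean above the hypothesized value gives a positive t‐statistic, and a sample mean below it gives a negative one.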
Under the assumption that the null hypothesis is true, this test statistic will have a particular probability distribution. For testing a univariate population mean, this t‐statistic has a t‐distribution with $n-1$ degrees of freedom. We would therefore expect it to be “close” to zero (if the null hypothesis is true). Conversely, if it is far from zero, then we might begin to doubt the null hypothesis:
– For an upper‐tail test, a t‐statistic that is positive and far from zero would then lead us to favor the alternative hypothesis (a t‐statistic that was far from zero but negative would favor neither hypothesis and the test would be inconclusive).
– For a lower‐tail test, a t‐statistic that is negative and far from zero would then lead us to favor the alternative hypothesis (a t‐statistic that was far from zero but positive would favor neither hypothesis and the test would be inconclusive).
– For a two‐tail test, any t‐statistic that is far from zero (positive or negative) would lead us to favor the alternative hypothesis.
To decide how far from zero a t‐statistic would have to be before we reject the null hypothesis in favor of the alternative, recall the legal analogy. To deliver a guilty verdict (the alternative hypothesis), the jury must establish guilt beyond a reasonable doubt. In other words, a jury rejects the presumption of innocence (the null hypothesis) only if there is compelling evidence of guilt. In statistical terms, compelling evidence of guilt is found only in the tails of the t‐distribution density curve. For example, in conducting an upper‐tail test, if the t‐statistic is way out in the upper tail, then it seems unlikely that the null hypothesis could have been true, so we reject it in favor of the alternative. Otherwise, the t‐statistic could well have arisen while the null hypothesis held true, so we do not reject it in favor of the alternative.
How far out in the tail does the t‐statistic have to be to favor the alternative hypothesis rather than the null? Here we must make a decision about how much evidence we will require before rejecting a null hypothesis. There is always a chance that we might mistakenly reject a null hypothesis when it is actually true (the equivalent of pronouncing an innocent defendant guilty). Often, this chance, called the significance level, will be set at 5%, but more stringent tests (such as in clinical trials of new pharmaceutical drugs) might set this at 1%, while less stringent tests (such as in sociological studies) might set this at 10%. For the sake of argument, we use 5% as a default value for hypothesis tests in this book (unless stated otherwise).
The significance level dictates the critical value(s) for the test, beyond which an observed t‐statistic leads to rejection of the null hypothesis in favor of the alternative. This region, which leads to rejection of the null hypothesis, is called the rejection region. For example, for a significance level of 5%:
– For an upper‐tail test, the critical value is the 95th percentile of the t‐distribution with $n-1$ degrees of freedom; reject the null in favor of the alternative if the t‐statistic is greater than this. Note that the 95th percentile is a positive number in the upper‐tail of the t‐distribution.
– For a lower‐tail test, the critical value is the 5th percentile of the t‐distribution with $n-1$ degrees of freedom; reject the null in favor of the alternative if the t‐statistic is less than this. Note that the 5th percentile is a negative number in the lower‐tail of the t‐distribution.
– For a two‐tail test, the two critical values are the 2.5th and the 97.5th percentiles of the t‐distribution with $n-1$ degrees of freedom; reject the null in favor of the alternative if the t‐statistic is less than the 2.5th percentile or greater than the 97.5th percentile. Note that the 2.5th percentile is a negative number in the lower‐tail of the t‐distribution, while the 97.5th percentile is a positive number in the upper‐tail of the t‐distribution. Also, the 2.5th percentile is simply the negative of the 97.5th percentile. (Make sure you understand why.)
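The critical values described above can also be approximated numerically rather than looked up in a table. The sketch below is one possible way to do so with only the Python standard library: it integrates the t‐distribution density with Simpson's rule and inverts the result by bisection. The helper names `t_cdf` and `t_percentile`, the step count, and the search bound of 100 are assumptions made for this illustration, not part of the book's software instructions; a statistical package would normally supply these percentiles directly.

```python
import math


def t_cdf(x, df, steps=2000):
    """P(T <= x) for a t-distribution, via Simpson's rule and symmetry about zero."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_const - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # area under the density between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area


def t_percentile(p, df, bound=100.0):
    """Value with area p to its left, by bisection (assumes |value| < bound)."""
    lo, hi = -bound, bound
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


df = 29       # n - 1 for a sample of size 30
alpha = 0.05  # 5% significance level

upper_critical = t_percentile(1 - alpha, df)   # 95th percentile
lower_critical = t_percentile(alpha, df)       # 5th percentile
two_tail_critical = (t_percentile(alpha / 2, df),       # 2.5th percentile
                     t_percentile(1 - alpha / 2, df))   # 97.5th percentile

print(f"upper-tail critical value: {upper_critical:.3f}")
print(f"lower-tail critical value: {lower_critical:.3f}")
print(f"two-tail critical values:  {two_tail_critical[0]:.3f}, {two_tail_critical[1]:.3f}")
```

Because the t‐distribution is symmetric about zero, the lower percentiles come out as the negatives of the corresponding upper percentiles, which is the property the two‐tail case relies on.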
As previously, it is perhaps easier to see how all this works by example. It is best to lay out hypothesis tests in a series of steps (see computer help #24 in the software information files available from the book website):
State null hypothesis: : .
State alternative hypothesis: : .
Calculate test statistic: t‐statistic $= 2.40$.
Set significance level: 5%.
Look up critical value: The 95th percentile of the t‐distribution with 29 degrees of freedom is 1.699 (from Table C.1); the rejection region is therefore any t‐statistic greater than 1.699.
Make decision: Since $2.40 > 1.699$, the t‐statistic of 2.40 falls in the rejection region, and we reject the null hypothesis in favor of the alternative.
Interpret in the context of the situation: The 30 sample sale prices suggest that a population mean of seems implausible—the sample data favor a value greater than this (at a significance level of 5%).
To conduct an upper‐tail hypothesis test for a univariate mean using the rejection region method:
State null hypothesis: NH: $E(Y) = c$, where $c$ is the hypothesized value of the population mean.
State alternative hypothesis: AH: $E(Y) > c$.
Calculate test statistic: t‐statistic $= (m_Y - c)/(s_Y/\sqrt{n})$, where $m_Y$ is the sample mean, $s_Y$ is the sample standard deviation, and $n$ is the sample size. (In most cases, this should be a positive number.)
Set significance level: $100\alpha$%.
Look up critical value in Table C.1: The $100(1-\alpha)$th percentile of the t‐distribution with $n-1$ degrees of freedom; the rejection region is any t‐statistic greater than this.
Make decision: If the t‐statistic is greater than the critical value, then reject the null hypothesis in favor of the alternative. Otherwise, fail to reject the null hypothesis.
Interpret in the context of the situation: If we have rejected the null hypothesis in favor of the alternative, then the sample data suggest that a population mean of $c$ seems implausible; the sample data favor a value greater than this (at a significance level of $100\alpha$%). If we have failed to reject the null hypothesis, then we have insufficient evidence to conclude that the population mean is greater than $c$.
For a lower‐tail test, everything is the same except:
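Alternative hypothesis: AH: $E(Y) < c$, where $c$ is the value of the population mean in the null hypothesis.
In most cases, the t‐statistic should be a negative number.
Critical value: The $100\alpha$th percentile of the t‐distribution with $n-1$ degrees of freedom; the rejection region is any t‐statistic less than this.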
For a two‐tail test, everything is the same except:
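Alternative hypothesis: AH: $E(Y) \neq c$, where $c$ is the value of the population mean in the null hypothesis.
The t‐statistic could be positive or negative.
Critical value: The $100(1-\alpha/2)$th percentile of the t‐distribution with $n-1$ degrees of freedom; the rejection region is any t‐statistic greater than this or less than the negative of this.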
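The decision step for all three kinds of test can be sketched in a few lines of Python. The function name `rejection_region_decision` and its arguments are hypothetical, introduced only for this illustration. It assumes the critical value is supplied as the positive upper percentile read from a t‐table; for a lower‐tail test it uses the negative of that number, which equals the corresponding lower percentile by symmetry.

```python
def rejection_region_decision(t_stat, critical_value, tail):
    """Apply the rejection region rule for an upper-, lower-, or two-tail test.

    critical_value is the positive table value: the upper percentile matching
    the significance level for a one-tail test, or matching half the
    significance level for a two-tail test.
    """
    if critical_value <= 0:
        raise ValueError("critical_value should be the positive table value")
    if tail == "upper":
        reject = t_stat > critical_value
    elif tail == "lower":
        reject = t_stat < -critical_value
    elif tail == "two":
        reject = abs(t_stat) > critical_value
    else:
        raise ValueError("tail must be 'upper', 'lower', or 'two'")
    if reject:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# Values from the worked upper-tail example: t-statistic 2.40, critical value 1.699.
print(rejection_region_decision(2.40, 1.699, "upper"))
```

With the worked example's values the function reports rejection, matching the decision reached in the steps above. The same t‐statistic in a lower‐tail test would not lead to rejection, since it lies in the wrong tail.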
1.6.2 The p‐value method
An alternative way to conduct a hypothesis test is to again assume initially that the null hypothesis is true, but then to calculate the probability of observing a t‐statistic as extreme as the one observed or even more extreme (in the direction that favors the alternative hypothesis). This is known as the p‐value (sometimes also called the observed significance level):
– For an upper‐tail test, the p‐value is the area under the curve of the t‐distribution (with $n-1$ degrees of freedom) to the right of the observed t‐statistic.
– For a lower‐tail test, the p‐value is the area under the curve of the t‐distribution (with $n-1$ degrees of freedom) to the left of the observed t‐statistic.
– For a two‐tail test, the p‐value is the sum of the areas under the curve of the t‐distribution (with $n-1$ degrees of freedom) beyond both the