Iain Pardoe

Applied Regression Modeling



of the confidence interval from the t‐version of the central limit theorem, where \( t = (m_Y - E(Y))/(s_Y/\sqrt{n}) \) has an approximate t‐distribution with \( n-1 \) degrees of freedom. In particular, suppose that we want to calculate a 95% confidence interval for the population mean, \( E(Y) \), for the home prices example. In other words, we want an interval such that there will be an area of 0.95 between the two endpoints of the interval (and an area of 0.025 to the left of the interval in the lower tail, and an area of 0.025 to the right of the interval in the upper tail). Let us consider just one side of the interval first. Since 2.045 is the 97.5th percentile of the t‐distribution with 29 degrees of freedom (see the t‐table in Section 1.4.2), we have

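\[
\begin{aligned}
\Pr\left(t_{29} < 2.045\right) &= 0.975,\\
\Pr\left(\frac{m_Y - E(Y)}{s_Y/\sqrt{n}} < 2.045\right) &= 0.975,\\
\Pr\left(m_Y - E(Y) < 2.045\,(s_Y/\sqrt{n})\right) &= 0.975,\\
\Pr\left(-E(Y) < -m_Y + 2.045\,(s_Y/\sqrt{n})\right) &= 0.975,\\
\Pr\left(E(Y) > m_Y - 2.045\,(s_Y/\sqrt{n})\right) &= 0.975.
\end{aligned}
\]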

      The difference from earlier calculations is that this time \( E(Y) \) is the focus of inference, so we have not assumed that we know its value. One consequence for the probability calculation is that in the fourth line we have "\( -E(Y) \)." To change this to "\( +E(Y) \)" in the fifth line, we multiply each side of the inequality sign by "\( -1 \)" (this also has the effect of changing the direction of the inequality sign).

      This probability statement must be true for all potential values of \( m_Y \) and \( s_Y \). In particular, it must be true for our observed sample statistics, \( m_Y = 278.6033 \) and \( s_Y = 53.8656 \). Thus, to find the values of \( E(Y) \) that satisfy the probability statement, we plug in our sample statistics to find

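\[
\begin{aligned}
\Pr\left(E(Y) > 278.6033 - 2.045\,(53.8656/\sqrt{30})\right) &= 0.975,\\
\Pr\left(E(Y) > 278.6033 - 20.1115\right) &= 0.975,\\
\Pr\left(E(Y) > 258.492\right) &= 0.975.
\end{aligned}
\]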

      This shows that a population mean greater than $258,492 would satisfy the expression \( \Pr(t_{29} < 2.045) = 0.975 \). In other words, we have found that the lower bound of our confidence interval is $258,492, or approximately $258,000. The value 20.1115 in this calculation is the margin of error.

      To find the upper bound, we perform a similar calculation:

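\[
\begin{aligned}
\Pr\left(t_{29} > -2.045\right) &= 0.975,\\
\Pr\left(\frac{m_Y - E(Y)}{s_Y/\sqrt{n}} > -2.045\right) &= 0.975,\\
\Pr\left(m_Y - E(Y) > -2.045\,(s_Y/\sqrt{n})\right) &= 0.975,\\
\Pr\left(-E(Y) > -m_Y - 2.045\,(s_Y/\sqrt{n})\right) &= 0.975,\\
\Pr\left(E(Y) < m_Y + 2.045\,(s_Y/\sqrt{n})\right) &= 0.975.
\end{aligned}
\]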

      To find the values of \( E(Y) \) that satisfy this expression, we plug in our sample statistics to find

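\[
\begin{aligned}
\Pr\left(E(Y) < 278.6033 + 2.045\,(53.8656/\sqrt{30})\right) &= 0.975,\\
\Pr\left(E(Y) < 278.6033 + 20.1115\right) &= 0.975,\\
\Pr\left(E(Y) < 298.715\right) &= 0.975.
\end{aligned}
\]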

      We can write these two calculations a little more concisely as

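\[
\begin{aligned}
\Pr\left(-2.045 < t_{29} < 2.045\right) &= 0.95,\\
\Pr\left(-2.045 < \frac{m_Y - E(Y)}{s_Y/\sqrt{n}} < 2.045\right) &= 0.95,\\
\Pr\left(-2.045\,(s_Y/\sqrt{n}) < m_Y - E(Y) < 2.045\,(s_Y/\sqrt{n})\right) &= 0.95,\\
\Pr\left(-m_Y - 2.045\,(s_Y/\sqrt{n}) < -E(Y) < -m_Y + 2.045\,(s_Y/\sqrt{n})\right) &= 0.95,\\
\Pr\left(m_Y + 2.045\,(s_Y/\sqrt{n}) > E(Y) > m_Y - 2.045\,(s_Y/\sqrt{n})\right) &= 0.95.
\end{aligned}
\]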

      As before, we plug in our sample statistics to find the values of \( E(Y) \) that satisfy this expression:

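\[
\begin{aligned}
\Pr\left(278.6033 + 2.045\,(53.8656/\sqrt{30}) > E(Y) > 278.6033 - 2.045\,(53.8656/\sqrt{30})\right) &= 0.95,\\
\Pr\left(298.715 > E(Y) > 258.492\right) &= 0.95.
\end{aligned}
\]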

      This shows that a population mean between $258,492 and $298,715 would satisfy the expression \( \Pr(-2.045 < t_{29} < 2.045) = 0.95 \). In other words, we have found that a 95% confidence interval for \( E(Y) \) for this example is ($258,492, $298,715), or approximately ($258,000, $299,000). It is traditional to write confidence intervals with the lower number on the left.
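      The arithmetic behind this interval can be checked with a few lines of Python. This is a minimal sketch, not part of the original text: it assumes the home prices summary statistics are a sample size of 30, a sample mean of 278.6033, and a sample standard deviation of 53.8656 (all prices in thousands of dollars), and it hard-codes the t-percentile 2.045 quoted above rather than computing it. The variable names are illustrative only.

```python
import math

# Assumed summary statistics for the home prices example
# (sale prices in thousands of dollars).
n = 30
sample_mean = 278.6033
sample_sd = 53.8656

# 97.5th percentile of the t-distribution with n - 1 = 29 degrees of
# freedom, taken from the t-table value quoted in the text.
t_975 = 2.045

# Standard error of the sample mean, then the margin of error.
standard_error = sample_sd / math.sqrt(n)
margin_of_error = t_975 * standard_error

# Lower bound goes on the left, by convention.
lower = sample_mean - margin_of_error
upper = sample_mean + margin_of_error

print(f"margin of error: {margin_of_error:.4f}")
print(f"95% confidence interval: ({lower:.3f}, {upper:.3f})")
```

      Under these assumptions the margin of error agrees with the value 20.1115 given above, and the two bounds are the sample mean minus and plus that margin.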

      More generally, using symbols, a 95% confidence interval for a univariate population mean, \( E(Y) \), results from the following:

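\[
\begin{aligned}
\Pr\left(-\text{97.5th percentile} < t_{n-1} < \text{97.5th percentile}\right) &= 0.95,\\
\Pr\left(-\text{97.5th percentile} < \frac{m_Y - E(Y)}{s_Y/\sqrt{n}} < \text{97.5th percentile}\right) &= 0.95,\\
\Pr\left(m_Y - \text{97.5th percentile}\,(s_Y/\sqrt{n}) < E(Y) < m_Y + \text{97.5th percentile}\,(s_Y/\sqrt{n})\right) &= 0.95,
\end{aligned}
\]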

      where the 97.5th percentile comes from the t‐distribution with \( n-1 \) degrees of freedom. In other words, plugging in our observed sample statistics, \( m_Y \) and \( s_Y \), we can write the 95% confidence interval as \( m_Y \pm \text{97.5th percentile}\,(s_Y/\sqrt{n}) \). In this expression, \( \text{97.5th percentile}\,(s_Y/\sqrt{n}) \) is the margin of error.

      For a lower or higher level of confidence than 95%, the percentile used in the calculation must be changed as appropriate. For example, for a 90% interval (i.e., with 5% in each tail), the 95th percentile would be needed, whereas for a 99% interval (i.e., with 0.5% in each tail), the 99.5th percentile would be needed. These percentiles can be obtained from the table “Univariate Data” in Notation and Formulas (which is an expanded version of the table in Section 1.4.2). Instructions for using the table can be found in Notation and Formulas.
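      To see how the percentile, and hence the interval width, changes with the confidence level, the sketch below computes t-percentiles numerically instead of reading them from a table. It is an illustration under stated assumptions, not the book's method: the helper names `t_pdf`, `t_cdf`, `t_percentile`, and `confidence_interval` are hypothetical, the percentile is found by Simpson's rule plus bisection using only the Python standard library, and the home prices statistics (sample size 30, mean 278.6033, standard deviation 53.8656, in thousands of dollars) are assumed.

```python
import math


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x), by Simpson's rule on [0, |x|] and symmetry about zero."""
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / steps  # steps is even, as Simpson's rule requires
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def t_percentile(p, df):
    """Value x with P(T <= x) = p, found by bisection (0 < p < 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if p < 0.5:
        return -t_percentile(1.0 - p, df)  # lower tail by symmetry
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:  # widen the bracket until it contains x
        hi *= 2.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def confidence_interval(mean, sd, n, level):
    """Interval mean +/- t-percentile * sd / sqrt(n) for a given level."""
    tail = (1.0 - level) / 2.0  # e.g. 0.05 in each tail for a 90% interval
    t_star = t_percentile(1.0 - tail, n - 1)
    margin = t_star * sd / math.sqrt(n)
    return mean - margin, mean + margin


# Assumed home prices statistics (thousands of dollars).
for level in (0.90, 0.95, 0.99):
    low, high = confidence_interval(278.6033, 53.8656, 30, level)
    print(f"{level:.0%} interval: ({low:.3f}, {high:.3f})")
```

      In practice a statistics library would normally supply the percentile (for example, `scipy.stats.t.ppf` in SciPy); the hand-rolled version is used here only to keep the sketch free of external dependencies. The printed intervals widen as the confidence level rises, because a larger percentile multiplies the same standard error.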

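      Thus, in general, we can write a confidence interval for a univariate mean, \( E(Y) \), as

\[
m_Y \pm \text{t-percentile}\,(s_Y/\sqrt{n}),
\]

      where \( m_Y \) is the sample mean,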