Introduction to Linear Regression Analysis. Douglas C. Montgomery.

detect this relationship has been obscured by the variance of the measurement process or that the range of values of x is inappropriate. A great deal of nonstatistical evidence and knowledge of the subject matter in the field is required to conclude that β₁ = 0.

2.4 INTERVAL ESTIMATION IN SIMPLE LINEAR REGRESSION

In this section we consider confidence interval estimation of the regression model parameters. We also discuss interval estimation of the mean response E(y) for given values of x. The normality assumptions introduced in Section 2.3 continue to apply.

2.4.1 Confidence Intervals on β₀, β₁, and σ²

In addition to point estimates of β₀, β₁, and σ², we may also obtain confidence interval estimates of these parameters. The width of these confidence intervals is a measure of the overall quality of the regression line. If the errors are normally and independently distributed, then the sampling distribution of both $(\hat{\beta}_1 - \beta_1)/\mathrm{se}(\hat{\beta}_1)$ and $(\hat{\beta}_0 - \beta_0)/\mathrm{se}(\hat{\beta}_0)$ is t with n − 2 degrees of freedom. Therefore, a 100(1 − α) percent confidence interval (CI) on the slope β₁ is given by

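$$\hat{\beta}_1 - t_{\alpha/2,\,n-2}\,\mathrm{se}(\hat{\beta}_1) \le \beta_1 \le \hat{\beta}_1 + t_{\alpha/2,\,n-2}\,\mathrm{se}(\hat{\beta}_1) \tag{2.39}$$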

and a 100(1 − α) percent CI on the intercept β₀ is

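$$\hat{\beta}_0 - t_{\alpha/2,\,n-2}\,\mathrm{se}(\hat{\beta}_0) \le \beta_0 \le \hat{\beta}_0 + t_{\alpha/2,\,n-2}\,\mathrm{se}(\hat{\beta}_0) \tag{2.40}$$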

These CIs have the usual frequentist interpretation. That is, if we were to take repeated samples of the same size at the same x levels and construct, for example, 95% CIs on the slope for each sample, then 95% of those intervals will contain the true value of β₁.
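The repeated-sampling interpretation can be checked numerically. The following is a minimal sketch, not the book's own code: it fits the least-squares line, forms the t-based interval on the slope, and counts how often that interval covers the true slope over many simulated samples taken at the same x levels. The true parameter values, the x levels, and the function names are hypothetical choices made for illustration; the critical value 2.101 is t₀.₀₂₅,₁₈, which matches the 20 observations assumed here.

```python
import math
import random


def fit_simple_regression(x, y):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1, se_b0, se_b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    # Residual mean square with n - 2 degrees of freedom
    ss_res = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    ms_res = ss_res / (n - 2)
    se_b1 = math.sqrt(ms_res / s_xx)
    se_b0 = math.sqrt(ms_res * (1.0 / n + x_bar ** 2 / s_xx))
    return b0, b1, se_b0, se_b1


def confidence_interval(estimate, std_err, t_crit):
    """Two-sided interval: estimate -/+ t_crit * std_err."""
    half_width = t_crit * std_err
    return estimate - half_width, estimate + half_width


def slope_coverage(n_reps=2000, seed=1):
    """Fraction of 95% slope CIs that contain the true slope."""
    rng = random.Random(seed)
    x = [float(i) for i in range(1, 21)]    # 20 fixed x levels -> 18 df
    beta0, beta1, sigma = 10.0, -2.0, 3.0   # hypothetical true values
    t_crit = 2.101                          # t_{0.025,18}
    hits = 0
    for _ in range(n_reps):
        y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
        _, b1, _, se_b1 = fit_simple_regression(x, y)
        lo, hi = confidence_interval(b1, se_b1, t_crit)
        hits += lo <= beta1 <= hi
    return hits / n_reps


if __name__ == "__main__":
    print(f"empirical coverage: {slope_coverage():.3f}")
```

Under the normality assumption the empirical coverage should settle near 0.95 as the number of replications grows. Any single interval either contains β₁ or it does not.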

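If the errors are normally and independently distributed, Appendix C.3 shows that the sampling distribution of (n − 2)MS_Res/σ² is chi-square with n − 2 degrees of freedom. Thus,

$$P\left\{\chi^2_{1-\alpha/2,\,n-2} \le \frac{(n-2)\,MS_{\mathrm{Res}}}{\sigma^2} \le \chi^2_{\alpha/2,\,n-2}\right\} = 1 - \alpha$$

and consequently a 100(1 − α) percent CI on σ² is

$$\frac{(n-2)\,MS_{\mathrm{Res}}}{\chi^2_{\alpha/2,\,n-2}} \le \sigma^2 \le \frac{(n-2)\,MS_{\mathrm{Res}}}{\chi^2_{1-\alpha/2,\,n-2}} \tag{2.41}$$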

Example 2.5 The Rocket Propellant Data

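We construct 95% CIs on β₁ and σ² using the rocket propellant data from Example 2.1. The standard error of $\hat{\beta}_1$ is $\mathrm{se}(\hat{\beta}_1) = 2.89$ and $t_{0.025,18} = 2.101$. Therefore, from Eq. (2.39), the 95% CI on the slope is

$$-37.15 - (2.101)(2.89) \le \beta_1 \le -37.15 + (2.101)(2.89)$$

or

$$-43.22 \le \beta_1 \le -31.08$$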

In other words, 95% of such intervals will include the true value of the slope.

If we had chosen a different value for α, the width of the resulting CI would have been different. For example, the 90% CI on β₁ is −42.16 ≤ β₁ ≤ −32.14, which is narrower than the 95% CI. The 99% CI is −45.49 ≤ β₁ ≤ −28.81, which is wider than the 95% CI. In general, the larger the confidence coefficient (1 − α) is, the wider the CI.
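The effect of the confidence coefficient on the width can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions: the slope estimate −37.15 and standard error 2.89 are taken as the values implied by the intervals quoted for this example (the midpoint and half-width of the 90% CI), and the t percentage points for 18 degrees of freedom are standard table values. Because these inputs are rounded, the computed limits may differ from the quoted ones in the second decimal place.

```python
def t_interval(estimate, std_err, t_crit):
    """Two-sided CI: estimate -/+ t_crit * std_err."""
    half_width = t_crit * std_err
    return estimate - half_width, estimate + half_width


b1_hat = -37.15   # assumed slope estimate (midpoint of the quoted 90% CI)
se_b1 = 2.89      # assumed standard error of the slope

# Upper alpha/2 percentage points of t with 18 degrees of freedom
t_table = {0.90: 1.734, 0.95: 2.101, 0.99: 2.878}

intervals = {level: t_interval(b1_hat, se_b1, t) for level, t in t_table.items()}

for level, (lo, hi) in intervals.items():
    print(f"{level:.0%} CI: {lo:.2f} <= beta1 <= {hi:.2f}  (width {hi - lo:.2f})")
```

The width is 2·t·se, so with the standard error fixed it grows in direct proportion to the critical value.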

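The 95% CI on σ² is found from Eq. (2.41) as follows:

$$\frac{(n-2)\,MS_{\mathrm{Res}}}{\chi^2_{0.025,\,n-2}} \le \sigma^2 \le \frac{(n-2)\,MS_{\mathrm{Res}}}{\chi^2_{0.975,\,n-2}}$$

$$\frac{18(9244.59)}{\chi^2_{0.025,18}} \le \sigma^2 \le \frac{18(9244.59)}{\chi^2_{0.975,18}}$$

From Table A.2, $\chi^2_{0.025,18} = 31.5$ and $\chi^2_{0.975,18} = 8.23$. Therefore, the desired CI becomes

$$\frac{18(9244.59)}{31.5} \le \sigma^2 \le \frac{18(9244.59)}{8.23}$$

or

$$5282.62 \le \sigma^2 \le 20{,}219.03$$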
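The arithmetic behind the interval on σ² can be sketched as follows. The function name and argument names are hypothetical. The residual mean square 9244.59 is assumed to be the value from the earlier rocket propellant calculations, which are not reproduced in this section, and the chi-square percentage points 31.5 and 8.23 are the usual table values for 18 degrees of freedom rounded to three significant figures.

```python
def variance_ci(ms_res, df, chi2_upper, chi2_lower):
    """CI on sigma^2 following the form of Eq. (2.41).

    chi2_upper is the upper alpha/2 percentage point and chi2_lower the
    upper 1 - alpha/2 percentage point of chi-square with df degrees of
    freedom. Dividing by the larger quantile gives the lower limit.
    """
    ss_res = df * ms_res
    return ss_res / chi2_upper, ss_res / chi2_lower


ms_res = 9244.59   # assumed residual mean square for the propellant data
df = 18            # n - 2 with n = 20 observations

lo, hi = variance_ci(ms_res, df, chi2_upper=31.5, chi2_lower=8.23)
print(f"{lo:.2f} <= sigma^2 <= {hi:.2f}")
```

Unlike the t-based intervals on the regression coefficients, this interval is not symmetric about the point estimate MS_Res, because the chi-square distribution is skewed.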

2.4.2 Interval Estimation of the Mean Response

A major use of a regression model is to estimate the mean response E(y) for a particular value of the regressor variable x. For example, we might wish to estimate the mean shear strength of the propellant bond in a rocket motor made from a batch of sustainer propellant that is 10 weeks old. Let x₀ be the level of the regressor variable for which we wish to estimate the mean response, say E(y|x₀). We assume that x₀ is any value of the regressor variable within the range of the original data on x used to fit the model. An unbiased point estimator of E(y|x₀) is found from the fitted model as

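$$\widehat{E(y|x_0)} = \hat{\mu}_{y|x_0} = \hat{\beta}_0 + \hat{\beta}_1 x_0 \tag{2.42}$$

To obtain a 100(1 − α) percent CI on E(y|x₀), first note that $\hat{\mu}_{y|x_0}$ is a normally distributed random variable because it is a linear combination of the observations yᵢ. The variance of $\hat{\mu}_{y|x_0}$ is

$$\mathrm{Var}(\hat{\mu}_{y|x_0}) = \mathrm{Var}(\hat{\beta}_0 + \hat{\beta}_1 x_0) = \mathrm{Var}\left[\bar{y} + \hat{\beta}_1 (x_0 - \bar{x})\right] = \frac{\sigma^2}{n} + \frac{\sigma^2 (x_0 - \bar{x})^2}{S_{xx}} = \sigma^2\left[\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right]$$

since (as noted in Section 2.2.4) $\mathrm{Cov}(\bar{y}, \hat{\beta}_1) = 0$. Thus, the sampling distribution of

$$\frac{\hat{\mu}_{y|x_0} - E(y|x_0)}{\sqrt{MS_{\mathrm{Res}}\left(\dfrac{1}{n} + \dfrac{(x_0 - \bar{x})^2}{S_{xx}}\right)}}$$

is t with n − 2 degrees of freedom. Consequently, a 100(1 − α) percent CI on the mean response at the point x = x₀ is

$$\hat{\mu}_{y|x_0} - t_{\alpha/2,\,n-2}\sqrt{MS_{\mathrm{Res}}\left(\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right)} \le E(y|x_0) \le \hat{\mu}_{y|x_0} + t_{\alpha/2,\,n-2}\sqrt{MS_{\mathrm{Res}}\left(\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right)} \tag{2.43}$$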
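A short numerical sketch may help make the interval on the mean response concrete. It computes the fitted value β̂₀ + β̂₁x₀ and attaches a half-width of t times the square root of MS_Res[1/n + (x₀ − x̄)²/S_xx]. The data are hypothetical and the function name is our own; the critical value 2.776 is t₀.₀₂₅,₄, which matches the six observations assumed here.

```python
import math


def mean_response_ci(x, y, x0, t_crit):
    """CI on E(y | x0) from a simple linear regression fit.

    Returns (lower, upper) = fitted value at x0 -/+ t_crit * standard error,
    where the standard error is sqrt(MS_Res * (1/n + (x0 - x_bar)^2 / S_xx)).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    ss_res = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    ms_res = ss_res / (n - 2)
    mu_hat = b0 + b1 * x0
    std_err = math.sqrt(ms_res * (1.0 / n + (x0 - x_bar) ** 2 / s_xx))
    return mu_hat - t_crit * std_err, mu_hat + t_crit * std_err


# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
t_crit = 2.776  # t_{0.025,4}

for x0 in (1.0, 2.0, 3.5, 5.0, 6.0):
    lo, hi = mean_response_ci(x, y, x0, t_crit)
    print(f"x0 = {x0:3.1f}: {lo:7.3f} <= E(y|x0) <= {hi:7.3f}  (width {hi - lo:.3f})")
```

The only term in the half-width that depends on x₀ is (x₀ − x̄)², so evaluating the function over a grid of x₀ values traces out a band around the fitted line.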

Note that the width of the CI for E(y|x₀) is a function of x₀. The interval width is a minimum for $x_0 = \bar{x}$ and widens as $|x_0 - \bar{x}|$ increases. Intuitively this is reasonable, as we would expect our best estimates of y to be made at x values near the center of the data and the precision of estimation to deteriorate as we move to the boundary of the x
