A. Gouveia Oliveira

Biostatistics Decoded



is called the probability density function). This means that the distribution of a normally distributed attribute can be completely described by using only the mean and the variance (or, equivalently, the standard deviation). This is the reason why the mean and the variance are called the parameters of the normal distribution, and what makes these two summary measures so important. It also means that if two normally distributed variables have the same variance, then the shape of their distributions will be the same; if they have the same mean, their position on the horizontal axis will be the same.

      Third property. Adding a constant to, or subtracting a constant from, a normally distributed variable will result in a new variable with a normal distribution. According to the properties of means and variances, the constant will be added to or subtracted from the mean, and the variance will not change (Figure 1.26).

      Fourth property. Multiplying, or dividing, the values of a normally distributed variable by a constant will result in a new variable with a normal distribution. Because of the properties of means and variances, the mean will be multiplied, or divided, by that constant and the variance will be multiplied, or divided, by the square of that constant (Figure 1.26).
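      The effect of the third and fourth properties on the mean and the variance can be checked with a short simulation. The sketch below is only an illustration: the attribute (mean 50, standard deviation 10), the constant 5 and the variable names are assumptions made for the example, and the code verifies the mean and the variance only, not the shape of the resulting distributions.

```python
import random
import statistics

random.seed(1)

# Hypothetical attribute: normal with mean 50 and SD 10 (variance 100).
x = [random.gauss(50, 10) for _ in range(100_000)]

c = 5  # arbitrary constant chosen for this illustration

shifted = [v + c for v in x]  # third property: add a constant
scaled = [v * c for v in x]   # fourth property: multiply by a constant

mean_x = statistics.fmean(x)
var_x = statistics.pvariance(x)

mean_shifted = statistics.fmean(shifted)
var_shifted = statistics.pvariance(shifted)

mean_scaled = statistics.fmean(scaled)
var_scaled = statistics.pvariance(scaled)

# Adding c moves the mean by c and leaves the variance unchanged.
print(f"x     : mean={mean_x:8.2f}  variance={var_x:9.2f}")
print(f"x + c : mean={mean_shifted:8.2f}  variance={var_shifted:9.2f}")

# Multiplying by c multiplies the mean by c and the variance by c squared.
print(f"x * c : mean={mean_scaled:8.2f}  variance={var_scaled:9.2f}")
```

      Whatever the simulated values, the mean of `x + c` should exceed the mean of `x` by 5 with the same variance, while the variance of `x * c` should be 25 times the variance of `x`.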

An illustration of the relationship between the area under the normal curve and the standard deviation.

      So what would be the consequences of that change of perspective? With this point of view, a sample mean would correspond to the sum of a large number of observations from variables with identical distribution, each observation being divided by a constant amount, which is the sample size. Under these circumstances, the central limit theorem applies and, therefore, we must conclude that the means of large samples have an approximately normal distribution, regardless of the distribution of the attribute being studied.
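      This conclusion can be explored with a simulation. The following is a minimal sketch under assumed conditions: the attribute is taken to follow an exponential distribution (clearly not normal), the sample size of 50 stands in for a "large" sample, and the helper `share_within` is a hypothetical name introduced for the example. It relies on the fact that, in a normal distribution, about 68% of the values lie within one standard deviation of the mean and about 95% within two.

```python
import random
import statistics

random.seed(2)

n = 50              # sample size, assumed "large" for this sketch
n_samples = 20_000  # number of simulated samples


def share_within(values, k):
    """Proportion of values lying within k standard deviations of their mean."""
    centre = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(abs(v - centre) <= k * sd for v in values) / len(values)


# Individual observations from a skewed (exponential) attribute.
raw = [random.expovariate(1.0) for _ in range(n_samples)]

# Each sample mean is the sum of n observations, each divided by n.
means = [
    sum(random.expovariate(1.0) / n for _ in range(n))
    for _ in range(n_samples)
]

raw_1sd = share_within(raw, 1)
means_1sd = share_within(means, 1)
means_2sd = share_within(means, 2)

print(f"observations within 1 SD: {raw_1sd:.3f}")
print(f"sample means within 1 SD: {means_1sd:.3f}")
print(f"sample means within 2 SD: {means_2sd:.3f}")
```

      The individual observations should depart markedly from the 68% figure, whereas the sample means should come close to both normal benchmarks. Agreement with these two proportions is only a rough check and does not prove normality.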

      In the case of small samples, however, the means will also have a normal distribution provided the attribute has a normal distribution. This is not because of the central limit theorem, but because of the properties of the normal distribution. If the means are sums of observations on identical normally distributed variables, then the sample means have a normal distribution whatever the number of observations, that is, the sample size.
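      The contrast between small samples from a normal attribute and small samples from a non-normal attribute can also be simulated. The sketch below is an illustration under assumptions not found in the text: a sample size of three, a normal attribute with mean 100 and standard deviation 15, an exponential attribute as the non-normal comparison, and skewness (the mean of the cubed standardized deviations, which is zero for a symmetric distribution such as the normal) as a crude indicator of departure from normality.

```python
import random
import statistics

random.seed(3)

n = 3               # deliberately small sample size
n_samples = 20_000  # number of simulated samples


def skewness(values):
    """Mean cubed standardized deviation; 0 for a symmetric distribution."""
    centre = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - centre) / sd) ** 3 for v in values])


def sample_means(draw):
    """Means of n_samples samples, each made of n observations from draw()."""
    return [
        statistics.fmean([draw() for _ in range(n)])
        for _ in range(n_samples)
    ]


# Small samples from a normally distributed attribute.
normal_means = sample_means(lambda: random.gauss(100, 15))

# Small samples from a skewed (exponential) attribute, for comparison.
skewed_means = sample_means(lambda: random.expovariate(1.0))

skew_normal = skewness(normal_means)
skew_skewed = skewness(skewed_means)

print(f"skewness of means, normal attribute:      {skew_normal:.3f}")
print(f"skewness of means, exponential attribute: {skew_skewed:.3f}")
```

      With the normal attribute the skewness of the sample means should be close to zero even with three observations per sample, while with the exponential attribute the means of such small samples should remain clearly skewed.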

An illustration of how the total obtained from the throw of six dice may be seen as the sum of observations on six identically distributed variables.
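      The dice example is small enough to be worked out exactly rather than simulated. The sketch below, which assumes six fair six-sided dice, enumerates every possible outcome and prints the distribution of the total as a rough text histogram; the variable names are chosen for the illustration only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Enumerate all 6**6 = 46,656 equally likely outcomes of six fair dice
# and count how many outcomes give each total.
totals = Counter(sum(roll) for roll in product(faces, repeat=6))
n_outcomes = 6 ** 6

# Exact mean and variance of the total, using rational arithmetic.
mean = Fraction(sum(t * c for t, c in totals.items()), n_outcomes)
variance = sum((t - mean) ** 2 * c for t, c in totals.items()) / n_outcomes

print(f"mean total = {float(mean)}, variance = {float(variance)}")

# Text histogram: one '#' for every 0.25 percentage points of probability.
for total in range(6, 37):
    p = totals[total] / n_outcomes
    print(f"{total:2d}  {p:.4f}  {'#' * round(p * 400)}")
```

      Although each die has a flat distribution, with every face equally likely, the histogram of the total is symmetric and bell-shaped around its mean of 21, with a variance of 17.5.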

      We now know that the means of large samples may be defined as observations from a random variable with normal distribution. We also know that the normal distribution is completely characterized by its mean and variance. The next step in the investigation of sampling distributions, therefore, must be to find out whether the mean and variance of the distribution of sample means can be determined.