Richard J. Rossi

Applied Biostatistics for the Health Sciences



deviation empirical rule; roughly 99% of a mound-shaped distribution lies between the values μ−3σ and μ+3σ.

       THE EMPIRICAL RULES

      For populations having mound-shaped distributions,

      1 Roughly 68% of all of the population values fall within 1 standard deviation of the mean. That is, roughly 68% of the population values fall between the values μ−σ and μ+σ.

      2 Roughly 95% of all the population values fall within 2 standard deviations of the mean. That is, roughly 95% of the population values fall between the values μ−2σ and μ+2σ.

      3 Roughly 99% of all the population values fall within 3 standard deviations of the mean. That is, roughly 99% of the population values fall between the values μ−3σ and μ+3σ.
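
      As a rough numerical check of these percentages, the sketch below assumes the population is exactly normal, which is only one particular mound-shaped distribution; the rules themselves make no such assumption. It uses Python's standard `statistics.NormalDist`, and the helper name `proportion_within` is hypothetical.

```python
from statistics import NormalDist

def proportion_within(k, mu=0.0, sigma=1.0):
    """Proportion of a normal population between mu - k*sigma and mu + k*sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# Proportions within 1, 2, and 3 standard deviations of the mean
for k in (1, 2, 3):
    print(k, round(proportion_within(k), 4))
```

      Under the normality assumption the three proportions come out near 0.68, 0.95, and 0.997, and they do not depend on the particular values of μ and σ. Other mound-shaped distributions will give somewhat different values, which is why the rules say "roughly."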

      The standard deviations of two populations resulting from measuring the same variable can be compared to determine which of the two populations is more variable. That is, when one standard deviation is substantially larger than the other (i.e., more than two times as large), then clearly the population with the larger standard deviation is much more variable than the other. It is also important to be able to determine whether a single population is highly variable or not. A parameter that measures the relative variability in a population is the coefficient of variation. The coefficient of variation will be denoted by CV and is defined to be

$$CV = \frac{\sigma}{|\mu|}$$

      The coefficient of variation is also sometimes represented as a percentage in which case

$$CV = \frac{\sigma}{|\mu|} \times 100\%$$

      Because the standard deviation and the mean have the same units of measurement, the coefficient of variation is a unitless parameter. That is, the coefficient is unaffected by changes in the units of measurement. For example, if a variable X is measured in inches and the coefficient of variation is CV = 2, then the coefficient of variation will also be 2 when the units of measurement are converted to centimeters. The coefficient of variation can also be used to compare the relative variability in two different and unrelated populations; the standard deviation can only be used to compare the variability in two different populations based on similar variables.
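
      The unit invariance can be illustrated with a short sketch. The data values below are made up for illustration, and the function name `coefficient_of_variation` is hypothetical; the population standard deviation is computed with `statistics.pstdev`.

```python
from statistics import fmean, pstdev

def coefficient_of_variation(values):
    """Population CV = sigma / |mu|; undefined when the mean is 0."""
    mu = fmean(values)
    if mu == 0:
        raise ValueError("CV is undefined when the mean is 0")
    return pstdev(values) / abs(mu)

heights_in = [60.0, 62.5, 65.0, 67.5, 70.0]    # made-up values, in inches
heights_cm = [2.54 * x for x in heights_in]    # the same values, in centimeters

cv_in = coefficient_of_variation(heights_in)
cv_cm = coefficient_of_variation(heights_cm)
print(cv_in, cv_cm)  # the two values agree up to floating-point rounding
```

      Converting inches to centimeters multiplies both σ and μ by 2.54, so the factor cancels in the ratio σ/|μ|.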

       Example 2.18

      Variable    μ       σ
      I           100     25
      II          10      5
      III         0.10    0.05

      1 Determine the value of the coefficient of variation for population I.

      2 Determine the value of the coefficient of variation for population II.

      3 Determine the value of the coefficient of variation for population III.

      4 Compare the relative variability of each variable.

       Solutions

      1 The value of the coefficient of variation for population I is $CV_{\mathrm{I}} = \frac{25}{100} = 0.25$.

      2 The value of the coefficient of variation for population II is $CV_{\mathrm{II}} = \frac{5}{10} = 0.5$.

      3 The value of the coefficient of variation for population III is $CV_{\mathrm{III}} = \frac{0.05}{0.10} = 0.5$.

      4 Populations II and III are relatively more variable than population I even though the standard deviations for populations II and III are smaller than the standard deviation of population I. Populations II and III have the same amount of relative variability even though the standard deviation of population III is one-hundredth that of population II.
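
      The arithmetic in these solutions can be reproduced with a few lines of Python. This is a minimal sketch using the means and standard deviations given in the example; the dictionary names `populations` and `cv` are hypothetical.

```python
# (mu, sigma) for each population in Example 2.18
populations = {"I": (100, 25), "II": (10, 5), "III": (0.10, 0.05)}

# CV = sigma / |mu| for each population
cv = {name: sigma / abs(mu) for name, (mu, sigma) in populations.items()}

for name, value in cv.items():
    print(f"CV_{name} = {value:.2f}")
```

      Populations II and III share the same coefficient of variation, and both exceed that of population I, matching the comparison in part 4.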

      2.2.7 Parameters for Bivariate Populations

      In most biomedical research studies, there are many variables that will be recorded on each individual in the study. A multivariate distribution can be formed by jointly tabulating, charting, or graphing the values of the variables over the N units in the population. For example, the bivariate distribution of two variables, say X and Y, is the collection of the ordered pairs

$$(X_1, Y_1),\ (X_2, Y_2),\ (X_3, Y_3),\ \ldots,\ (X_N, Y_N).$$

      These N ordered pairs form the units of the bivariate distribution of X and Y and their joint distribution can be displayed in a two-way chart, table, or graph.

      When the two variables are qualitative, the joint proportions in the bivariate distribution are often denoted by $p_{ab}$, where

$$p_{ab} = \text{proportion of pairs in the population where } X = a \text{ and } Y = b$$
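
      As an illustration of this definition, the sketch below tabulates the joint proportions for a small made-up population of N = 8 ordered pairs. The variables (sex and smoking status), the data, and the function name `joint_proportions` are all assumptions chosen for the example.

```python
from collections import Counter

def joint_proportions(pairs):
    """Return {(a, b): p_ab}, the proportion of pairs with X = a and Y = b."""
    counts = Counter(pairs)
    n = len(pairs)
    return {pair: count / n for pair, count in counts.items()}

# Made-up population of N = 8 units: X = sex, Y = smoking status
pairs = [("F", "yes"), ("F", "no"), ("F", "no"), ("M", "yes"),
         ("M", "yes"), ("M", "no"), ("F", "no"), ("M", "no")]

p = joint_proportions(pairs)
for (a, b), p_ab in sorted(p.items()):
    print(f"p[{a}, {b}] = {p_ab}")
```

      Because every unit falls into exactly one (a, b) cell, the joint proportions sum to 1 over all cells.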

      Figure 2.21 The joint distribution