David Machin

Medical Statistics



order (from smallest to largest) from the Farndon et al. (2013) study. The median is the average of the eighth and ninth ordered observations: (3 + 3)/2 = 3 mm. The lower half of the data has eight observations, so the cut‐point for the lower quartile is the value that splits these eight lowest‐ranked observations into two halves again, that is, four observations in each ‘half’. Thus, the lower quartile lies between the fourth and fifth ordered observations. When a quartile lies between two observations, the easiest option is to take the mean of the two (although there are more complicated methods). So the lower quartile is (2 + 2)/2 = 2 mm.

      Similarly, the upper quartile is calculated from the upper half of the data (i.e. the observations with the largest values). The upper half also has eight observations, so the cut‐point for the upper quartile is the value that splits the eight highest‐ranked observations (ordered observations 9–16) into two halves again (i.e. four observations in each ‘half’). Thus, the upper quartile lies between the 12th and 13th ordered observations. Since the quartile lies between two observations, the easiest option is again to take the mean of the two. Therefore, the upper quartile is (4 + 5)/2 = 4.5 mm. So the interquartile range (IQR) for the corn size data is from 2.0 to 4.5 mm or, expressed as a single number, 4.5 − 2.0 = 2.5 mm.
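
The halving procedure described above can be sketched in Python. This is a minimal illustration, not the authors' code: the function name `quartiles_by_halves` and the eight example values are hypothetical (they are not the corn size data), and the treatment of an odd number of observations, where the middle value is left out of both halves, is an assumption, since the text only covers an even number.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (lower quartile, median, upper quartile, IQR).

    Each quartile is the median of one half of the ordered data.
    Assumption: for an odd number of observations the middle value
    is excluded from both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower_half = ordered[:half]       # the lowest-ranked observations
    upper_half = ordered[n - half:]   # the highest-ranked observations
    q1 = median(lower_half)
    q2 = median(ordered)
    q3 = median(upper_half)
    return q1, q2, q3, q3 - q1


# Hypothetical values, for illustration only
q1, q2, q3, iqr = quartiles_by_halves([1, 2, 3, 4, 5, 6, 7, 8])
print(q1, q2, q3, iqr)  # 2.5 4.5 6.5 4.0
```

With 16 observations the two halves hold eight values each, so the function averages the fourth and fifth ordered values for the lower quartile and the 12th and 13th for the upper quartile, as in the text. Statistical packages often use other interpolation rules and may give slightly different quartiles.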

       Standard Deviation and Variance

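[Figure: Schematic illustration of the calculation of the median, quartiles, and interquartile range for the corn size data.]

The variance, $s^2$, of $n$ observations $x_1, \ldots, x_n$ with mean $\bar{x}$ is:

$$ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} $$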

      The variance is expressed in square units and so is not a suitable measure for describing variability because it is not in the same units as the raw data. The solution is to take the square root of the variance to return to the original units. This gives us the standard deviation (usually abbreviated to SD or s) defined as:

$$ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} $$

      Examining this expression, it can be seen that if all the x's were the same, then they would all equal $\bar{x}$ and so s would be zero. If the x's were widely scattered about $\bar{x}$, then s would be large. In this way, s reflects the variability in the data.
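
The calculation can be sketched in Python as follows. This is a minimal illustration under stated assumptions: it uses the usual sample form with an n − 1 divisor, the function names `sample_variance` and `sample_sd` are hypothetical, and the example values are made up rather than taken from the corn size data.

```python
import math
import statistics


def sample_variance(values):
    """Sum of squared differences from the mean, divided by n - 1 (assumed divisor)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    squared_differences = [(x - mean) ** 2 for x in values]
    return sum(squared_differences) / (n - 1)


def sample_sd(values):
    """Square root of the variance, which returns to the original units."""
    return math.sqrt(sample_variance(values))


illustrative = [1, 2, 3, 4, 5]          # hypothetical values
print(sample_variance(illustrative))    # 2.5
print(sample_sd([3, 3, 3, 3]))          # 0.0, since every value equals the mean

# Cross-check against the standard library, which also divides by n - 1
print(math.isclose(sample_sd(illustrative), statistics.stdev(illustrative)))  # True
```

The second call shows the point made above: when every observation is the same, each difference from the mean is zero and so is s.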

       Illustrative Example – Calculation of the Standard Deviation – Foot Corn Size

Subject   Corn size (mm)   Mean          Difference from mean   Square of difference from mean
($i$)     ($x_i$)          ($\bar{x}$)   ($x_i - \bar{x}$)      $(x_i - \bar{x})^2$
1         1                3.625         −2.625                 6.891
2         2                3.625         −1.625                 2.641
3         2                3.625         −1.625                 2.641
4         2                3.625         −1.625                 2.641
5