valid for any other choice of time interval (e.g., daily, weekly, annually).
However, the use of log returns, discussed in Chapter 3, solves this problem. If $R_{m=1}$, $R_{m=2}$, and $R_{m=3}$ are monthly log returns, then the quarterly log return is simply the sum of the three monthly log returns. The normal distribution is preserved under addition: the sum of independent normally distributed variables is itself normally distributed. Thus, if the log returns over one time interval can be modeled as being normally distributed, then the log returns over all time intervals will also be normally distributed, as long as the returns are statistically independent through time.
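To make the additivity concrete, the following sketch (in Python, with purely hypothetical return values) sums three monthly log returns into a quarterly log return and cross-checks the result against compounding the equivalent simple returns:

```python
import numpy as np

# Three hypothetical monthly log returns (values chosen for illustration).
monthly_log = np.array([0.01, -0.02, 0.015])

# The quarterly log return is simply the sum of the monthly log returns.
quarterly_log = monthly_log.sum()

# Cross-check by compounding the equivalent simple returns.
quarterly_simple = np.prod(np.exp(monthly_log)) - 1
assert np.isclose(np.log(1 + quarterly_simple), quarterly_log)
print(quarterly_log)  # 0.005
```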
Further, log returns have another highly desirable property. The highest possible simple (non-annualized) return is theoretically +∞, while the lowest possible simple return for a cash investment is a loss of −100%, which occurs if the investment becomes worthless. However, the normal distribution spans from −∞ to +∞, so it assigns positive probability to impossible outcomes such as a simple return of −200%; simple returns therefore cannot truly be normally distributed. Thus, the normal distribution may be a poor approximation of the actual probability distribution of simple returns. Log returns, by contrast, can span from −∞ to +∞, just like the normal distribution itself.
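The bound can be illustrated numerically. The short sketch below (hypothetical values) shows that as a simple return approaches its floor of −100%, the corresponding log return falls without limit:

```python
import numpy as np

# As a simple return R approaches its floor of -100%, the corresponding
# log return ln(1 + R) approaches negative infinity (values hypothetical).
for r in [0.5, -0.5, -0.99, -0.9999]:
    print(f"simple return {r:>8.4f} -> log return {np.log1p(r):>9.4f}")

# A simple return below -100% would require the log of a negative number,
# which is undefined -- consistent with such returns being impossible.
```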
There are two equivalent approaches to modeling returns that address these problems: (1) use log returns and assume that they are normally distributed, or (2) add 1 to the simple returns and assume that the quantity (1 + R) has a lognormal distribution. A variable has a lognormal distribution if the logarithm of the variable is normally distributed. The two approaches are identical, since the lognormal distribution assumes that the logarithm of the specified variable (in this case, 1 + R) is normally distributed.
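A brief simulation can illustrate the equivalence; the distribution parameters below are purely hypothetical. Drawing log returns from a normal distribution and exponentiating them yields gross returns (1 + R) that are, by construction, lognormally distributed:

```python
import numpy as np

rng = np.random.default_rng(0)

# Approach 1: draw log returns from a normal distribution
# (mean and standard deviation here are hypothetical).
log_returns = rng.normal(loc=0.01, scale=0.05, size=100_000)

# Approach 2: the implied gross returns (1 + R) are then lognormal.
gross_returns = np.exp(log_returns)

# The two views are the same data: taking logs of (1 + R) recovers normality.
assert np.allclose(np.log(gross_returns), log_returns)
print(gross_returns.min() > 0)  # True: simple returns never fall below -100%
```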
In summary, it is possible for returns to be normally distributed over a variety of time intervals if those returns are expressed as log returns (and are independent through time). If the log returns are normally distributed, then the simple returns (in the form 1 + R) are said to be lognormally distributed. However, if discretely compounded returns (R) are assumed to be normally distributed, they can only be normally distributed over one time interval, such as daily, since returns computed over other time intervals would not be normally distributed due to compounding.
4.2 Moments of the Distribution: Mean, Variance, Skewness, and Kurtosis
Random variables, such as an asset's return or the timing of uncertain cash flows, can be viewed as forming a probability distribution. Probability distributions can take an infinite number of possible shapes, only some of which correspond to well-known distributions, such as the normal distribution.
The moments of a return distribution are measures that describe the shape of a distribution. As an analogy, in mathematics, researchers often use various parameters to describe the shape of a function, such as its intercept, its slope, and its curvature. Statisticians often use either the raw moments or the central moments of a distribution to describe its shape. Generally, the first four moments are referred to as mean, variance, skewness, and kurtosis. The formulas of these four moments are somewhat similar, differing primarily by the power to which the observations are raised: mean uses the first power, variance squares the terms, skewness cubes the terms, and kurtosis raises the terms to the fourth power.
4.2.1 The Formulas of the First Four Raw Moments
Statistical moments can be raw moments or central moments. Further, the moments are sometimes standardized or scaled to provide more intuitive measures, as will be discussed later. We begin with raw moments, discussing the raw moments of an investment's return, R. Raw moments have the simplest formulas, wherein each moment is simply the expected value of the variable raised to a particular power:
nth raw moment $= E(R^n)$ (4.1)
The most common raw moment is the first raw moment, known as the mean, or expected value, which indicates the central tendency of the variable. With n = 1, Equation 4.1 becomes the formula for the expected value:
first raw moment $= E(R) = \mu$ (4.2)
The expected value of a variable is the probability weighted average of its outcomes:
$E(R) = \sum_{i} \text{prob}_i \, R_i$ (4.3)
where $\text{prob}_i$ is the probability of outcome $R_i$.
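As an illustration of Equation 4.3, the following sketch computes the probability weighted average for a hypothetical three-outcome distribution:

```python
import numpy as np

# Hypothetical discrete outcomes for R and their probabilities (Equation 4.3).
outcomes = np.array([-0.10, 0.02, 0.15])
probs = np.array([0.25, 0.50, 0.25])
assert np.isclose(probs.sum(), 1.0)  # probabilities must sum to one

expected_value = np.dot(probs, outcomes)  # probability weighted average
print(expected_value)  # 0.0225
```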
Equation 4.3 expresses the first raw moment in terms of probabilities and outcomes. Using historical data, for a sample distribution of n observations, the mean is typically equally weighted and is estimated by the following:
$\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} R_i$ (4.4)
Thus, Equation 4.4 is a formula for estimating the expected value in Equation 4.2 using historical observations. The historical mean is often used as an estimate of the expected value when observations from the past are assumed to be representative of the future. Other raw moments can be generated by inserting a higher integer value for n in Equation 4.1. But the raw moments for n > 1 are less useful for our purposes than the closely related central moments.
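Equation 4.4 amounts to a simple equally weighted average. The sketch below applies it to a hypothetical sample of five historical returns:

```python
import numpy as np

# Hypothetical sample of historical returns.
returns = np.array([0.03, -0.01, 0.02, 0.05, -0.02])

# Equally weighted sample mean (Equation 4.4), an estimate of E(R).
mean_hat = returns.sum() / len(returns)
assert np.isclose(mean_hat, returns.mean())
print(mean_hat)  # 0.014
```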
4.2.2 The Formulas of Central Moments
Central moments differ from raw moments because they focus on deviations of the variable from its mean (whereas raw moments are measured relative to zero). Deviations are defined as the value of a variable minus its mean, or expected value. If an observation exceeds its expected value, the deviation is positive by the distance by which it exceeds the expected value. If the observation is less than its expected value, the deviation is a negative number. Each central moment applies the following equation to the deviations:
nth central moment $= E[(R - \mu)^n]$ (4.5)

where $\mu$ is the expected value of R.
The term inside the parentheses is the deviation of R from its mean, or expected value. The first central moment is equal to zero by definition, because the expected value of the deviation from the mean is zero. When analysts discuss statistical moments, it is usually understood that the first moment is a raw moment, meaning the mean, or expected value. But the second through fourth moments are conventionally expressed as central moments, because in most applications the moments are more useful when expressed in terms of deviations.
The variance is the second central moment and is the expected value of the deviations squared, providing an indication of the dispersion of a variable around its mean:
$V(R) = E[(R - \mu)^2]$ (4.6)
The variance is the probability weighted average of the deviations squared. By squaring the deviations, any negative signs are removed (i.e., any negative deviation squared is positive), so the variance, $V(R)$, becomes a measure of dispersion. In the case of probability weighted outcomes, this can be written as:

$V(R) = \sum_{i} \text{prob}_i \, (R_i - \mu)^2$ (4.7)
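The following sketch applies Equation 4.7 to the hypothetical three-outcome distribution used earlier for Equation 4.3:

```python
import numpy as np

# Same hypothetical discrete distribution used above for Equation 4.3.
outcomes = np.array([-0.10, 0.02, 0.15])
probs = np.array([0.25, 0.50, 0.25])

mu = np.dot(probs, outcomes)                    # Equation 4.3
variance = np.dot(probs, (outcomes - mu) ** 2)  # Equation 4.7
print(mu, variance)  # 0.0225 0.00781875
```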
The variance shown in Equation 4.7 is often estimated with a sample of historical data. For a sample distribution, the variance with equally weighted observations is estimated as:

$\hat{\sigma}^2 = \frac{1}{n-1} \sum_{i=1}^{n} (R_i - \hat{\mu})^2$ (4.8)

The mean in Equation 4.8, $\hat{\mu}$, is usually estimated using the same sample. The use of n − 1 in the equation (rather than n) enables a more accurate measure of the variance when the estimate of the expected value of the variable has been computed from the same sample. The square root of the variance is an extremely popular and useful measure of dispersion known as the standard deviation:

$\sigma = \sqrt{V(R)}$ (4.9)
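Finally, Equations 4.8 and 4.9 can be applied to the hypothetical five-return sample used earlier for Equation 4.4; note the division by n − 1 rather than n:

```python
import numpy as np

# Same hypothetical sample of historical returns used for Equation 4.4.
returns = np.array([0.03, -0.01, 0.02, 0.05, -0.02])
mean_hat = returns.mean()

# Sample variance (Equation 4.8): divide by n - 1 because the mean was
# estimated from the same sample; numpy's ddof=1 gives the identical result.
var_hat = ((returns - mean_hat) ** 2).sum() / (len(returns) - 1)
assert np.isclose(var_hat, returns.var(ddof=1))

# Standard deviation (Equation 4.9) is the square root of the variance.
std_hat = np.sqrt(var_hat)
print(var_hat, std_hat)  # 0.00083 ~0.0288
```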