than VaR, then, what we really want to know is how big the loss will be when we have an exceedance event. Using the concept of conditional probability, we can define the expected value of a loss, given an exceedance, as follows:
$$S = E[L \mid L \geq \mathrm{VaR}] \tag{3.84}$$
We refer to this conditional expected loss, S, as the expected shortfall.
If the profit function has a probability density function given by f(x), and VaR is the VaR at the α confidence level, we can find the expected shortfall as:

$$S = \frac{1}{1-\alpha} \int_{-\infty}^{\mathrm{VaR}} x f(x)\, dx \tag{3.85}$$
In most cases the VaR for a portfolio will correspond to a loss, and Equation 3.85 will produce a negative value. As with VaR, it is common to reverse the sign when speaking about the expected shortfall.
Expected shortfall does answer an important question: given that we have an exceedance, how large should we expect the loss to be? What's more, expected shortfall turns out to be subadditive, thereby avoiding one of the major criticisms of VaR. As our discussion on back-testing suggests, though, the reliability of our expected shortfall measure may be difficult to gauge.
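Both of these points can be illustrated numerically. The sketch below estimates expected shortfall as the average of the worst 5 percent of simulated profits and checks subadditivity for two combined positions; the jointly normal P&L series and all parameter values are assumptions made for this example, not taken from the text:

```python
# A minimal sketch: estimating expected shortfall from simulated P&L
# and checking subadditivity. The distributions are illustrative only.
import numpy as np

def expected_shortfall(pnl, alpha=0.95):
    """Average loss in the worst (1 - alpha) tail, quoted as a positive number."""
    var = np.quantile(pnl, 1 - alpha)   # profit-space VaR (typically negative)
    return -pnl[pnl <= var].mean()      # sign reversed, per the convention above

rng = np.random.default_rng(7)
a = rng.normal(size=100_000)            # P&L of position A
b = 0.5 * a + rng.normal(size=100_000)  # P&L of position B, correlated with A

# Subadditivity: the ES of the combined portfolio should not exceed
# the sum of the stand-alone ES figures.
print(expected_shortfall(a + b) <= expected_shortfall(a) + expected_shortfall(b))
```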
Sample Problem
Question:
In a previous example, the probability density function of Triangle Asset Management's daily profits could be described by the following function:

$$p(\pi) = \begin{cases} \dfrac{1}{100}\pi + \dfrac{1}{10} & \text{for } -10 \leq \pi \leq 0 \\[6pt] -\dfrac{1}{100}\pi + \dfrac{1}{10} & \text{for } 0 < \pi \leq 10 \end{cases}$$
We calculated Triangle's one-day 95 percent VaR as a loss of $10 - \sqrt{10} \approx 6.84$. For the same confidence level and time horizon, what is the expected shortfall?

Answer:
Because the VaR occurs in the region where π < 0, we need to utilize only the first half of the function. Using Equation 3.85, we have:

$$S = \frac{1}{0.05} \int_{-10}^{\sqrt{10}-10} \pi \left(\frac{1}{100}\pi + \frac{1}{10}\right) d\pi = 20 \left[\frac{\pi^3}{300} + \frac{\pi^2}{20}\right]_{-10}^{\sqrt{10}-10} = \frac{2\sqrt{10} - 30}{3} \approx -7.89$$
Thus, the expected shortfall is a loss of 7.89. Intuitively this should make sense. The expected shortfall must be greater than the VaR, 6.84, but less than the maximum loss of 10. Because extreme events are less likely (the height of the PDF decreases away from the center), it also makes sense that the expected shortfall is closer to the VaR than it is to the maximum loss.
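As a quick check, the integral in Equation 3.85 can also be evaluated with an off-the-shelf routine. The sketch below restates the triangular PDF from the problem; using scipy's quadrature here is a convenience, not part of the original solution:

```python
# Verifying the sample problem numerically. The piecewise density
# below restates the triangular PDF given in the problem.
import numpy as np
from scipy.integrate import quad

def f(x):
    # Triangular PDF on [-10, 10], peaking at 0
    return (10 - abs(x)) / 100 if abs(x) <= 10 else 0.0

var = np.sqrt(10) - 10   # profit-space 95 percent VaR, about -6.84

# Equation 3.85 with alpha = 0.95: integrate x f(x) over the tail
integral, _ = quad(lambda x: x * f(x), -10, var)
print(integral / 0.05)   # about -7.89, i.e., a loss of 7.89
```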
Part VI Linear Regression Analysis
Linear Regression (One Regressor)
One of the most popular models in statistics is the linear regression model. Given two constants, α and β, and a random error term, ϵ, in its simplest form the model posits a relationship between two variables, X and Y:

$$Y = \alpha + \beta X + \epsilon \tag{3.86}$$
As specified, X is known as the regressor or independent variable. Similarly, Y is known as the regressand or dependent variable. As dependent implies, traditionally we think of X as causing Y. This causal relationship is not necessary, though, and in practice, especially in finance, the cause-and-effect relationship is either ambiguous or entirely absent. In finance, it is often the case that both X and Y are being driven by a common underlying factor.
The linear regression relationship is often represented graphically as a plot of Y against X, as shown in Figure 3.9. The solid line in the chart represents the deterministic portion of the linear regression equation, Y = α + βX. For any particular point, the distance above or below the line is the error, ϵ, for that point.
FIGURE 3.9 Linear Regression Example
Because there is only one regressor, this model is often referred to as a univariate regression. Mainly, this is to differentiate it from the multivariate model, with more than one regressor, which we will explore later in this chapter. While everybody agrees that a model with two or more regressors is multivariate, not everybody agrees that a model with one regressor is univariate. Even though the univariate model has one regressor, X, it has two variables, X and Y, which has led some people to refer to Equation 3.86 as a bivariate model. From here on out, however, we will refer to Equation 3.86 as a univariate model.
In Equation 3.86, α and β are constants. In the univariate model, α is typically referred to as the intercept, and β as the slope, because β measures the slope of the solid line when Y is plotted against X. We can see this by taking the derivative of Y with respect to X:
$$\frac{dY}{dX} = \beta \tag{3.87}$$
The final term in Equation 3.86, ϵ, represents a random error, or residual. The error term allows us to specify a relationship between X and Y even when that relationship is not exact. In effect, the model is incomplete; it is an approximation. Changes in X may drive changes in Y, but there are other variables, not included in the model, that also impact Y. These unmodeled variables cause X and Y to deviate from a purely deterministic relationship. That deviation is captured by ϵ, our residual.
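To see the model in action, the sketch below simulates data from Equation 3.86 and recovers the intercept and slope by ordinary least squares; the particular values of α, β, and the error volatility are illustrative assumptions:

```python
# A minimal simulation of Equation 3.86: Y = alpha + beta * X + eps.
# All parameter values below are illustrative, not from the text.
import numpy as np

rng = np.random.default_rng(0)
alpha_true, beta_true = 1.0, 2.0

x = rng.normal(size=500)
eps = rng.normal(scale=0.5, size=500)    # random error term
y = alpha_true + beta_true * x + eps     # Equation 3.86

# A degree-1 least-squares fit returns [slope, intercept].
beta_hat, alpha_hat = np.polyfit(x, y, deg=1)
print(f"alpha: {alpha_hat:.3f}, beta: {beta_hat:.3f}")   # close to 1.0 and 2.0
```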
In risk management this division of the world into two parts, a part that can be explained by the model and a part that cannot, is a common dichotomy. We refer to risk that can be explained by our model as systematic risk, and to the part that cannot be explained by the model as idiosyncratic risk. In our regression model, Y is divided into a systematic component, α + βX, and an idiosyncratic component, ϵ.
$$Y = \underbrace{\alpha + \beta X}_{\text{systematic}} + \underbrace{\epsilon}_{\text{idiosyncratic}} \tag{3.88}$$
Which component of the overall risk is more important? It depends on what our objective is. As we will see, portfolio managers who wish to hedge certain risks in their portfolios are basically trying to reduce or eliminate systematic risk. Portfolio managers who try to mimic the returns of an index, on the other hand, can be viewed as trying to minimize idiosyncratic risk.
EVALUATING THE REGRESSION
Unlike a controlled laboratory experiment, the real world is a very noisy and complicated place. In finance it is rare that a simple univariate regression model is going to completely explain a large data set. In many cases, the data are so noisy that we must ask ourselves if the model is explaining anything at all. Even when a relationship appears to exist, we are likely to want some quantitative measure of just how strong that relationship is.
Probably the most popular statistic for describing linear regressions is the coefficient of determination, commonly known as R-squared, or just R². R² is often described as the goodness of fit of the linear regression. When R² is one, the regression model completely explains the data: all the residuals are zero, and the residual sum of squares, RSS, is zero. At the other end of the spectrum, if R² is zero, the model does not explain any variation in the observed data. In other words, Y does not vary with X, and β is zero.
To calculate the coefficient of determination, we need to define two additional terms: TSS, the total sum of squares, and ESS, the explained sum of squares. They are defined as:
$$\mathrm{TSS} = \sum_{i=1}^{n}(y_i - \bar{y})^2 \qquad \mathrm{ESS} = \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2 \tag{3.89}$$

where $\bar{y}$ is the sample mean of Y and $\hat{y}_i$ are the model's fitted values.
These two sums are related to the previously encountered residual sum of squares, as follows:
$$\mathrm{TSS} = \mathrm{ESS} + \mathrm{RSS} \tag{3.90}$$
In other words, the total variation in our regressand, TSS, can be broken into two parts: the variation explained by the model, ESS, and the variation the model leaves unexplained, RSS.
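The identity in Equation 3.90 can be checked on simulated data. The sketch below reuses the illustrative data-generating values from the earlier regression example and also reports the ratio ESS/TSS, the share of variation the model explains:

```python
# A self-contained check of Equation 3.90 on simulated data.
# The data-generating values are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=500)

beta_hat, alpha_hat = np.polyfit(x, y, deg=1)
y_hat = alpha_hat + beta_hat * x          # fitted values

tss = ((y - y.mean()) ** 2).sum()         # total sum of squares
ess = ((y_hat - y.mean()) ** 2).sum()     # explained sum of squares
rss = ((y - y_hat) ** 2).sum()            # residual sum of squares

print(np.isclose(tss, ess + rss))         # True: TSS = ESS + RSS
print(ess / tss)                          # explained share, close to 1 here
```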