of models and designed to communicate with all models that satisfy an API discussed in section 3.7 (where we show that the API can also be added to some existing models with an adapter class). For our purpose, a model is anything that generates scenarios, also called paths in the context of Monte‐Carlo simulations, that is, the value of the relevant simulated variables (essentially, underlying asset prices) on the relevant event dates (when something meaningful is meant to happen for the transaction, like a fixing, a payment, an exercise, or the update of a path dependency). Our library consumes scenarios to evaluate payoffs, without concern of how those scenarios were produced.
Scripting libraries are also typically written to be usable with a number of numerical implementations, either by forward induction like Monte‐Carlo simulations or backward induction like finite difference grids. In order to avoid unnecessary confusion, our book focuses on forward induction with Monte‐Carlo simulations, by far the most frequently used valuation context today. To further simplify our approach, we only consider path‐wise simulations. This means that simulations are provided one path (for all event dates) at a time. The model is, to us, an abstract object that generates multiple scenarios (possible evolutions of the world) and communicates them sequentially for evaluation.
An alternative that we don't consider is step‐wise simulation, where all paths are computed simultaneously, date by date. The model first generates all the possible states of the world for the first event date, then moves on to the second event date, and so on until it reaches the final maturity. Step‐wise simulation is natural with particular random generators, control variates (when paths are modified after simulation so that the expectation of some function matches some target), and, more generally, calibration inside simulations as in the particle method of [16].
The concepts and code in this book apply to path‐wise Monte‐Carlo simulations, but they can be extended to support step‐wise Monte‐Carlo and backward induction algorithms.6 This is however not discussed further. Note that path‐wise Monte‐Carlo includes deterministic models for linear products (these models generate a single path where all forwards are realized) and numerical integration for European options (where each path is a point in the integration scheme against the risk neutral probability distribution of the relevant market variable).
Monte‐Carlo simulations are explained in detail in [19] [15], and the dedicated chapters of [20], as well as our publication [27], which provides a framework and code in C++, and focuses on a parallel implementation.
We refer to the figure on page 15 for an illustration of the valuation process. The model produces a number of scenarios; every scenario is consumed by the evaluator to compute the payoff over that scenario. This is repeated over a number of paths, followed by an aggregation of the results. The evaluator is the object that computes payoffs (sums of cash flows normalized by the numeraire, which is also part of the scenarios, as described in chapter 5) as a function of the simulated scenarios. The evaluator conducts its work by traversal of the expression tree that represents the script, while maintaining an internal stack for the storage of intermediate results. We describe the evaluator in detail in section 3.6.
The evaluation of the script over a path is typically executed a large number of times; therefore, the overall performance is largely dependent on its optimization. This is achieved by many forms of pre‐processing.
1.4 PRE‐PROCESSING
Pre‐processing is the key optimization that makes the valuation of scripts (almost) as fast as hard‐coded payoffs. Where performance matters most is in the code repeated for a large number of simulations. Monte‐Carlo developers know that maximum performance is achieved by moving as much work as possible before simulations take place: pre‐allocation of working memory, transformation of data, pre‐calculation of quantities that don't directly depend on the simulated variables. We want to perform as much work as possible once, before simulations start, so that the subsequent work conducted repeatedly over scenarios is as limited and as efficient as possible.
Pre‐processing is not only about pre‐computing parts of subsequent evaluations. It is not so much about the optimization of the arithmetic calculations performed during simulations. It is mainly about moving most of the administrative, logistic overhead to processing time. CPUs perform mathematical calculations incredibly fast, but the access of data in memory may be slow if not carefully dealt with. With hard‐coded payoffs (payoffs coded in C++), it is the compiler that optimizes the code, putting all the right data in the right places in the interest of performance. With scripting, the payoff, as a function of the scenario, is built at run time. The compiler cannot optimize this function for us. This is our responsibility. This is where the pre‐processors come into play.
A special breed of pre‐processors called indexers arrange data for the fastest possible access during simulations, with a massive performance impact. For instance, when some cash‐flow is related to a given Libor rate of a given maturity, the indexer “informs” the evaluator, before simulations start, where in pre‐allocated memory that Libor will live during simulations, so that, at that point, the evaluator reads the simulated Libor there, directly, without any kind of expensive lookup.
Most noticeable benefits are achieved in the context of interest rates (and multiple assets) discussed in section III. The dates when coupons are fixed or paid and the maturities of the simulated data such as discount factors and Libors are independent of scenarios. The value of these simulated variables may be different in every scenario, but their specification, including maturity, is fixed. Pre‐processors identify what rates are required for what event dates, and perform memory allocation, indexing, and other logistics before simulations start. During simulations, where performance matters most, the model communicates the values of the simulated data in a pre‐indexed array for a fast, random access.
This is covered and clarified in part III, but we introduce an example straight away. Consider a script for a caplet:
01Jun2021 | caplet pays 0.25 * max( 0, libor( 03Jun2021, 3m, act/360, L3) ‐ STRIKE) on 03Sep2021 |
The payoff of this caplet on 03Jun2021 is:
The responsibility of the model is to generate a number of joint scenarios for the Libor and discount.7 The responsibility of the evaluator is to compute the payoff above from these two values.
While this is all mathematically very clear, a practical implementation is more challenging. The product is scripted at run time, so we don't know, at compile time, what simulation data exactly the evaluator expects from the model. Dedicated data structures must be designed for that purpose, something like (in pseudo‐code):
Scenario := vector of SimulData per event date SimulData := numeraire, vector of libors, vector of discounts, … |
In our simple example, we have one event date: 01Jun2021, and we need one Libor fixed on the event date for a loan starting two business days later on 03Jun2021, with coupon paid on 03Sep2021, as well as one discount factor for the payment date. We humans know that from reading the script. A pre‐processor figures that out by visiting the script at processing time. This allows not only to pre‐allocate the vectors of Libors and discounts but also to transform the payoff into something like:
0.25 * max( 0, scen[0].libors[0] ‐ STRIKE) * scen[0].disc[0] / scen.numeraire |
The pre‐processor also “knows” that event date 0 is 01Jun2021, and that