Tim Rey

Applied Data Mining for Forecasting Using SAS


model structures are exponential smoothing, autoregressive models, moving average models, their combination in autoregressive moving average (ARMA) models, and unobserved components models (UCM). The second step is estimating the parameters of the selected model structure. The third step is applying the developed model, with its estimated parameters, for forecasting.
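
      The three basic steps can be illustrated with a minimal Python sketch, assuming simple exponential smoothing as the selected model structure. The function names, the grid of candidate smoothing weights, and the sample history are illustrative assumptions; they do not correspond to any SAS procedure.

```python
def ses_forecasts(series, alpha):
    """One-step-ahead forecasts from simple exponential smoothing.

    Returns a list one element longer than `series`: element t is the
    forecast of series[t] made before observing it, and the last element
    is the forecast of the next, still unobserved value.
    """
    level = series[0]  # initialise the level with the first observation
    forecasts = [level]
    for y in series:
        level = alpha * y + (1 - alpha) * level
        forecasts.append(level)
    return forecasts


def sum_squared_errors(series, alpha):
    """In-sample sum of squared one-step-ahead forecast errors."""
    predicted = ses_forecasts(series, alpha)
    return sum((y - p) ** 2 for y, p in zip(series, predicted))


def estimate_alpha(series, grid=None):
    """Step 2: pick the smoothing weight with the smallest in-sample error."""
    grid = grid or [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95
    return min(grid, key=lambda a: sum_squared_errors(series, a))


# Step 1: the model structure (simple exponential smoothing) is chosen above.
history = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119]  # hypothetical data
# Step 2: estimate the parameter of the selected structure.
alpha = estimate_alpha(history)
# Step 3: apply the model with the estimated parameter to forecast.
next_value = ses_forecasts(history, alpha)[-1]
print(f"alpha={alpha:.2f}, next-period forecast={next_value:.1f}")
```

      A grid search is used here only for transparency; production software normally estimates the parameters with a numerical optimizer.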

      Univariate forecasting model development

      This substep represents the classical forecasting modeling process of a single variable. The future forecast is based on discovering trend, cyclicality, or seasonality in the past data. The developed composite forecasting model includes individual components for each of these identified patterns. The key hypothesis is that the discovered patterns in the past will influence the future. In addition to the basic forecasting steps, univariate forecasting model development includes the following sequence:

       Dividing the data into an in-sample set (for model development) and an out-of-sample set (for model validation)

       Applying the basic forecasting steps for the selected method to the in-sample set

       Validating the model through appropriate residual tests

       Comparing performance by applying the model to the out-of-sample set, where possible

       Selecting the best model
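
      The sequence above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the three candidate methods, the mean absolute error (MAE) criterion, the quarterly period of 4, and the data are all hypothetical, and the residual check is reduced to a simple bias calculation rather than a formal test.

```python
def naive(train, horizon):
    """Repeat the last observed value."""
    return [train[-1]] * horizon


def seasonal_naive(train, horizon, period=4):
    """Repeat the value observed one season earlier."""
    return [train[-period + (h % period)] for h in range(horizon)]


def overall_mean(train, horizon):
    """Repeat the mean of the in-sample data."""
    return [sum(train) / len(train)] * horizon


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical quarterly series with a seasonal pattern and a slow upward trend.
series = [10, 14, 8, 12, 11, 15, 9, 13, 12, 16, 10, 14, 13, 17, 11, 15]

# 1. Divide the data into in-sample and out-of-sample sets.
in_sample, out_of_sample = series[:12], series[12:]

# 2-3. Fit on the in-sample set and inspect the residuals. For the seasonal
# naive method, the in-sample residual is the change from one year earlier;
# a mean far from zero hints at a pattern the model misses (here, the trend).
residuals = [in_sample[t] - in_sample[t - 4] for t in range(4, len(in_sample))]
mean_residual = sum(residuals) / len(residuals)

# 4. Compare the candidates on the out-of-sample set.
candidates = {
    "naive": naive,
    "seasonal_naive": seasonal_naive,
    "overall_mean": overall_mean,
}
scores = {
    name: mae(out_of_sample, method(in_sample, len(out_of_sample)))
    for name, method in candidates.items()
}

# 5. Select the best model.
best = min(scores, key=scores.get)
print(scores, "->", best, "| mean in-sample residual:", mean_residual)
```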

      Multivariate (in Xs) forecasting model development

      This substep captures all the necessary activities to design forecasting models based on causal variables (economic drivers, input variables, exogenous variables, independent variables, Xs). One option is to develop the multivariate model as a time series model by using multiple regression. A limitation of this approach, however, is that the regression coefficients of the forecasting model are based on static relationships between the independent variables (Xs) and the dependent variable (Y). Another option is to use dynamic multiple regression, which represents the dynamic dependencies between the independent variables (Xs) and the dependent variable (Y) with transfer functions. In both cases, the same modeling sequence, described in the previous section, is followed. However, different model structures are selected, such as autoregressive integrated moving average models with exogenous inputs (ARIMAX) or unobserved components models (UCM). Note that forecasted values for each independent variable selected in the multivariate model are required for calculating the dependent variable forecast. In most cases the forecasted values are delivered by univariate models for the corresponding input variables; that is, developing univariate models is part of the multivariate forecasting model development substep.
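
      The dependence of the multivariate forecast on forecasts of the inputs can be made concrete with a small Python sketch. It assumes the simplest case described above, a static regression with a single X, and it assumes a drift (straight-line) model as the univariate forecast of that X. The variable names and data are hypothetical, and the dynamic (transfer function) option is not shown.

```python
def fit_simple_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope


def drift_forecast(series, horizon):
    """Univariate forecast of the input: extend the average historical change."""
    step = (series[-1] - series[0]) / (len(series) - 1)
    return [series[-1] + step * (h + 1) for h in range(horizon)]


# Hypothetical history of one economic driver (X) and the demand (Y).
driver = [100, 102, 104, 106, 108, 110]
demand = [50, 51, 52, 53, 54, 55]

# Static relationship between X and Y estimated from history.
intercept, slope = fit_simple_regression(driver, demand)

# The future X values are unknown, so they are forecast first...
driver_future = drift_forecast(driver, horizon=2)

# ...and only then can the Y forecast be calculated.
demand_future = [intercept + slope * x for x in driver_future]
print(driver_future, demand_future)
```

      Any error in the forecast of the driver carries over directly into the demand forecast, which is why the quality of the univariate input models matters.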

      Consensus planning

      In one specific area of forecasting, demand-driven forecasting, it is critical that the functional departments (sales, planning, and marketing) reach consensus on the final demand forecast. In this case, consensus planning is a good practice. It takes into account future trends, overrides, knowledge of future events, and other information that is not contained in the history.

      Forecasting model development deliverables

      The selected forecasting models with the best performance are the key deliverable not only of this step but of the whole project. In order to increase the performance, the final deliverable is often a combined forecast from several models, derived from different methods. In many applications the forecasting models are linked in a hierarchy, reflecting the business structure. In this case, reconciliation of the forecasts in the different hierarchical levels is recommended.
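
      Both ideas, combining forecasts and reconciling a hierarchy, can be sketched briefly in Python. The sketch assumes an equal-weight average for the combination and a proportional top-down adjustment for the reconciliation; these are only two of several possible choices, and the text does not prescribe either. All names and numbers are hypothetical.

```python
def combine(forecasts, weights=None):
    """Weighted average of several forecasts of the same periods."""
    count = len(forecasts)
    weights = weights or [1 / count] * count  # equal weights by default
    horizon = len(forecasts[0])
    return [sum(w * f[t] for w, f in zip(weights, forecasts)) for t in range(horizon)]


def reconcile_top_down(total, children):
    """Scale the child forecasts so that they add up to the total.

    Assumes the child forecasts do not sum to zero in any period.
    """
    reconciled = {name: [] for name in children}
    for t, target in enumerate(total):
        child_sum = sum(values[t] for values in children.values())
        for name, values in children.items():
            reconciled[name].append(values[t] * target / child_sum)
    return reconciled


# Two hypothetical models forecast total demand for the next two periods.
total = combine([[100, 110], [104, 106]])

# Independent regional forecasts do not add up to the total forecast.
regions = {"east": [60, 66], "west": [45, 48]}
adjusted = reconcile_top_down(total, regions)
print(total, adjusted)
```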

      Another deliverable is the performance of the selected models. The document summarizing the performance of the final models must include key statistics as well as a detailed description of the model validation and selection process. If sufficient data are available, it is recommended to test the robustness of the performance while changing key model process parameters, for example, the sizes of the in-sample and out-of-sample sets.
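
      One way to carry out such a robustness test is to repeat the validation with several in-sample sizes and compare the resulting errors. The following Python sketch assumes a deliberately simple forecasting method (the mean of the last three observations) and the mean absolute error as the statistic; the names, the split points, and the data are hypothetical.

```python
def mean_of_last(train, horizon, window=3):
    """Forecast every future period with the mean of the last `window` values."""
    level = sum(train[-window:]) / window
    return [level] * horizon


def holdout_mae(series, n_in_sample, method):
    """Out-of-sample mean absolute error for a given in-sample size."""
    train, test = series[:n_in_sample], series[n_in_sample:]
    predicted = method(train, len(test))
    return sum(abs(a - p) for a, p in zip(test, predicted)) / len(test)


series = [20, 22, 21, 23, 24, 23, 25, 26, 25, 27, 28, 27]  # hypothetical data

# Repeat the validation for several in-sample sizes.
results = {n: holdout_mae(series, n, mean_of_last) for n in (6, 8, 10)}

# A large spread suggests the reported performance depends on the split.
spread = max(results.values()) - min(results.values())
print(results, "spread:", round(spread, 2))
```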

      The most important deliverable, however, is to convince the user to apply the forecasting models on a regular basis and to accomplish the business objectives. One option is to compare the model-generated and judgmental forecasts. Another option is to give the user the chance to test the model with different “What-If” scenarios. For final acceptance, however, a consistent record of forecasts within the expected performance metric for some specified time period is needed. It is also critical to prove the pre-defined business impact, that is, to demonstrate the value created by the improved forecasting.

      Forecasting model deployment steps

      This block of the work process includes the procedures for transferring the forecasting solution from the development environment to the production environment. The assumption is that beyond this phase the models will be put into the hands of the final users. Some users actively apply the forecasting models to accomplish the defined business objectives, either in an interactive mode, by playing “What-If” scenarios, or by exploring optimal solutions. Other users are interested only in the forecasting reports delivered periodically or on demand. In both cases, a special version of the solution in a system-like production environment has to be developed and tested. The important substeps and deliverables for this block of the work process are discussed briefly below.

      Production mode model deployment

      It is assumed that in production mode the selected forecasting models can deliver automatic forecasts from updated data when invoked by the user or by another program. To accomplish this, the necessary data collection scripts, data preprocessing programs, and model code are combined in one entity. (In the SAS environment this entity is called a stored process.) In addition to the software developed during the model development cycle, code for testing data consistency in future data collections has to be designed and integrated into the entity. Usually, the test checks for large differences between the new data sample and the current historical values in the data. By default, the new forecast is produced by applying the selected models, with their existing parameters, to the updated data. In most cases the user interface in production mode is a user-friendly environment.
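
      A data consistency test of this kind might look like the following Python sketch. It assumes that a "large difference" means a new value lying more than three historical standard deviations from the historical mean; that threshold, the function name, and the data are illustrative assumptions, and a real deployment would tune the rule to the data.

```python
import statistics


def consistency_check(history, new_values, max_z=3.0):
    """Return (position, value) pairs of new values that differ strongly
    from the historical data, measured in historical standard deviations."""
    mean = statistics.mean(history)
    spread = statistics.stdev(history)
    flagged = []
    for position, value in enumerate(new_values):
        if spread == 0:
            # Constant history: any different value is suspicious.
            suspicious = value != mean
        else:
            suspicious = abs(value - mean) / spread > max_z
        if suspicious:
            flagged.append((position, value))
    return flagged


history = [100, 103, 98, 101, 99, 102, 97, 100]  # hypothetical historical values
new_sample = [101, 140, 99]                      # hypothetical new collection

problems = consistency_check(history, new_sample)
if problems:
    print("Hold the forecast run and review:", problems)
else:
    print("New data are consistent with history.")
```

      Running the check before the models are applied keeps a single corrupted value from propagating silently into the automatic forecast.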

      Forecasting decision-making process definition

      In the end, the results from the forecasting models are used in business decisions, which create the real value. Unfortunately, with the exception of demand-driven forecasting (see examples in Chase, 2009), this substep is usually either ignored or implemented in an ad hoc manner. It is strongly recommended to specify the decision-making process as precisely as possible. Then the quality of the decisions should be tracked in the same way as the forecasting performance. Using the method of forecast value analysis (FVA) is strongly recommended (see note 5). Even the perfect forecast can be buried by a bad business decision.
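
      In its common form, forecast value analysis compares the error of each step in the forecasting process with the error of the step before it, starting from a naive benchmark. The Python sketch below assumes that form and uses the mean absolute percentage error (MAPE) as the metric; the text does not specify the calculation, so the metric, the names, and all numbers are hypothetical.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actuals must be nonzero)."""
    return 100 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


actual = [100, 120, 110, 130]        # hypothetical observed demand
naive = [90, 100, 120, 110]          # previous-period values as the benchmark
statistical = [105, 115, 112, 125]   # model-generated forecast
override = [110, 130, 100, 140]      # forecast after judgmental adjustment

errors = {
    "naive": mape(actual, naive),
    "statistical": mape(actual, statistical),
    "override": mape(actual, override),
}

# Value added by each step = error before the step minus error after it.
value_added_by_model = errors["naive"] - errors["statistical"]
value_added_by_override = errors["statistical"] - errors["override"]

print({name: round(value, 2) for name, value in errors.items()})
print("model:", round(value_added_by_model, 2),
      "override:", round(value_added_by_override, 2))
```

      In this made-up case the statistical model adds value over the naive benchmark, while the judgmental override makes the forecast worse, which is the kind of finding the analysis is meant to expose.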

      Forecasting model deployment deliverables

      The ideal deliverable from this block of the work process is a user interface designed for the final user in an environment that the user prefers. In most cases that environment is the ubiquitous Microsoft Excel. Fortunately, it is relatively easy to build such an interface with the SAS Add-In for Microsoft Office, as shown in Section 2.3.4.

      Documenting the forecasting decision-making process is a deliverable of equal importance. The purpose of such a document is to define specific business rules that determine how to use the forecasting results. Initially the rule base can be created via brainstorming sessions with the subject matter experts. Another source of business rules definition could be a well-planned set of “What-If” scenarios generated by the forecasting models and analyzed by the experts. The end result is a set of business rules that link the forecasting results with specific actions and a value metric.
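
      A rule base of this kind can be kept in a simple, reviewable structure. The Python sketch below is one possible representation, assuming rules that compare a demand forecast with available capacity; the thresholds, actions, and value metrics are hypothetical placeholders for what the subject matter experts would define.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[float, float], bool]  # (forecast, capacity) -> applies?
    action: str
    value_metric: str  # how the benefit of the action is measured


# Hypothetical rules, for example the outcome of a brainstorming session.
RULES = [
    Rule("shortfall", lambda forecast, capacity: forecast > 1.1 * capacity,
         "add a production shift", "lost sales avoided"),
    Rule("surplus", lambda forecast, capacity: forecast < 0.8 * capacity,
         "reduce raw material orders", "inventory cost saved"),
]


def decide(forecast, capacity, rules=RULES):
    """Return the (action, value metric) of the first rule that applies."""
    for rule in rules:
        if rule.condition(forecast, capacity):
            return rule.action, rule.value_metric
    return "no action", "none"


print(decide(forecast=120, capacity=100))
print(decide(forecast=100, capacity=100))
```

      Keeping the rules as data rather than scattering them through the code makes it easier to document them and to track the quality of the resulting decisions.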

      Training the user is a deliverable that developers often forget. The training includes demonstrating the production version of the software. It is also expected that a help menu is integrated into the software.

      Forecasting model maintenance steps

      The final block of the proposed work process includes the activities for model performance tracking and taking proper corrective actions if the performance deteriorates below some specified critical limit. This is one of the least developed areas in practical forecasting in terms of available tools and experience. It is strongly recommended