Tim Rey

Applied Data Mining for Forecasting Using SAS



generates ARIMA and ARIMAX models as well as seasonal models, transfer function models, and intervention models. The modeling process includes identification, parameter estimation, and forecasting, and it produces a variety of diagnostic statistics and model performance metrics, such as Akaike's information criterion (AIC) and Schwarz's Bayesian criterion (SBC or BIC).
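      The two criteria named above trade goodness of fit against model size. The sketch below is a language-neutral illustration in Python, not SAS syntax; it assumes the standard definitions AIC = -2 ln L + 2k and SBC = -2 ln L + k ln n, and the log-likelihood values and model labels are hypothetical numbers chosen only to show the comparison.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Return (AIC, SBC) for a fitted model; smaller values are better."""
    aic = -2.0 * log_likelihood + 2.0 * n_params
    sbc = -2.0 * log_likelihood + n_params * math.log(n_obs)
    return aic, sbc


# Hypothetical fits of two candidate models to the same 120 observations.
candidates = {
    "ARIMA(1,1,0)": information_criteria(log_likelihood=-250.0, n_params=2, n_obs=120),
    "ARIMA(2,1,2)": information_criteria(log_likelihood=-248.5, n_params=5, n_obs=120),
}

# Pick the model with the smallest SBC (index 1 of each tuple).
best_by_sbc = min(candidates, key=lambda name: candidates[name][1])
print(best_by_sbc)  # ARIMA(1,1,0)
```

      In this made-up case the larger model fits slightly better, but both criteria penalize its three extra parameters enough to prefer the smaller one. SBC applies the heavier penalty whenever ln n exceeds 2.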

      ESM can generate forecasts for time series and transactional data based on exponential smoothing methods. It also includes several data transformation methods, such as log, square root, logistic, and Box-Cox.
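      As a rough illustration of the idea behind exponential smoothing combined with a data transformation, the Python sketch below implements simple (single) exponential smoothing and applies it to raw and log-transformed data. It is not the ESM procedure: the smoothing weight is fixed by hand rather than optimized, the level is initialized with the first observation, and the series is hypothetical.

```python
import math


def simple_exponential_smoothing(series, alpha):
    """Return the one-step-ahead forecast from simple exponential smoothing."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # assumption: start the level at the first observation
    for value in series[1:]:
        # New level is a weighted average of the latest value and the old level.
        level = alpha * value + (1.0 - alpha) * level
    return level


sales = [100.0, 110.0, 121.0, 133.1]  # hypothetical series growing 10% per period

# Forecast on the original scale.
forecast = simple_exponential_smoothing(sales, alpha=0.5)

# Forecast on the log scale, then a naive back-transform with exp().
log_sales = [math.log(v) for v in sales]
log_forecast = math.exp(simple_exponential_smoothing(log_sales, alpha=0.5))
```

      Smoothing on the log scale treats the growth as multiplicative, so the two forecasts differ slightly; which one is preferable depends on the data.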

      FORECAST is the old version of ESM.

      STATESPACE generates multivariate models based on different system representation by state space variables. It includes automatic model structure selection, parameter estimation, and forecasting.

      UCM provides a development tool for unobserved component models. It generates the corresponding trend, seasonal, cyclical, and regression effects components, estimates the model parameters, performs model diagnostics, and calculates the forecasts and confidence limits of all the model components and the composite series.

      VARMAX is very useful for forecasting multivariate time series, especially when the economic or financial variables are correlated with each other's past values. The VARMAX procedure enables modeling the dynamic relationship both between the dependent variables and between the dependent and independent variables. It uses a variety of modeling techniques, criteria for automatic determination of the autoregressive and moving average orders, model parameter estimation methods, and several diagnostic tests.
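      To make "correlated with each other's past values" concrete, the Python sketch below iterates a first-order vector autoregression, y(t) = A y(t-1), to produce multi-step forecasts. It is a deliberately reduced illustration and not the VARMAX procedure: the coefficient matrix is assumed to be already known rather than estimated, and there is no intercept, moving average part, or exogenous input.

```python
def var1_forecast(coefficients, last_values, steps):
    """Iterate y(t) = A * y(t-1) and return the list of forecast vectors."""
    forecasts = []
    current = list(last_values)
    for _ in range(steps):
        # Each variable's forecast depends on the previous values of all variables.
        current = [sum(a * y for a, y in zip(row, current)) for row in coefficients]
        forecasts.append(current)
    return forecasts


# Hypothetical coefficients: the first series depends on the past of both series,
# the second series depends only on its own past.
A = [[0.5, 0.25],
     [0.0, 0.5]]

path = var1_forecast(A, last_values=[4.0, 8.0], steps=2)
print(path)  # [[4.0, 4.0], [3.0, 2.0]]
```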

      Forecasting using SAS Enterprise Guide

      The forecasting capabilities of the SAS Enterprise Guide built-in blocks are very limited. However, all the SAS/ETS functionality can be used via SAS Enterprise Guide code nodes. The key built-in forecasting blocks in the Time Series Tasks are described briefly below.

       Basic Forecasting

      generates forecasting models based on exponential smoothing and stepwise autoregressive fit of time trend.

       ARIMA Modeling and Forecasting

      generates ARIMA models, but the identification and parameter estimation methods have to be selected by the modeler.

       Regression Analysis with Autoregressive Errors

      provides linear regression models for time series data in the case of correlated errors and heteroscedasticity.

      Forecasting using SAS Forecast Studio

      SAS Forecast Studio is one of the most powerful engines for large-scale forecasting available in the market. It generates automatic forecasts in batch mode or executes custom-built models through an interactive graphical interface. SAS Forecast Studio enables the user to interactively set up the forecasting process, hierarchy, parameters, and business rules as well as to enter specific events. Another very useful feature is hierarchical reconciliation with the ability to reconcile the hierarchy bottom-up, middle-out, or top-down.
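      The difference between bottom-up and top-down reconciliation can be sketched for a two-level hierarchy. The Python below is an illustration only and not how SAS Forecast Studio implements reconciliation; the region names, forecast values, and the choice of proportions for the top-down split are all hypothetical.

```python
def bottom_up(child_forecasts):
    """Parent forecast is the sum of the child forecasts."""
    return sum(child_forecasts.values())


def top_down(parent_forecast, proportions):
    """Split the parent forecast across children according to given weights."""
    total = sum(proportions.values())
    return {name: parent_forecast * share / total
            for name, share in proportions.items()}


# Hypothetical independent forecasts at each level of the hierarchy.
region_forecasts = {"North": 40.0, "South": 60.0}
company_forecast = 120.0

# Bottom-up: trust the regions and overwrite the company total.
print(bottom_up(region_forecasts))  # 100.0

# Top-down: trust the company total and rescale the regions to match it.
print(top_down(company_forecast, region_forecasts))  # {'North': 48.0, 'South': 72.0}
```

      Middle-out combines the two: levels above a chosen middle level are aggregated bottom-up from it, and levels below it are disaggregated top-down.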

      SAS Forecast Studio does not require programming skills, and the whole forecasting step can be done in an easy-to-use GUI where the model selection list includes exponential smoothing models with optimized parameters, ARIMA models, unobserved components models, dynamic regression, and intermittent demand models. It is also possible to define a model repository and events. The automatic model generation includes outlier detection, event identification, and automatic variable selection. The forecasting results are presented in numerous graphical reports. In the case of multivariate models, you can explore different “What if” scenarios to determine the influence of the key drivers on the dependent variable forecast. SAS Forecast Studio is a highly productive environment. It can generate thousands of time series forecasts in minutes.

      Model deployment on model development tools

      Some forecasting applications use the development environment for model deployment. An obvious disadvantage of this option is that the user must be familiar with at least some of the capabilities of the development software. Using the development environment for model deployment is appropriate only in specialized cases with educated end users. In the case of large-scale industrial forecasting, however, this option is not recommended.

      Model deployment via stored processes

      One of the advantages of using various SAS tools is that you can communicate the results to the user via stored processes. A stored process is a SAS program that is stored centrally on a server. The final validated models are usually packaged as stored processes in the development environment (SAS Enterprise Guide, SAS Enterprise Miner, or SAS Forecast Server) and saved on a specified server. A client application can then execute the program, and then receive and process the results locally.

      The most popular application for using the SAS stored processes on the client side is the SAS Add-In for Microsoft Office. After installing the Add-In, the user can select the corresponding stored process and invoke the forecasting application. In the case of Excel, the results are represented in spreadsheets with the rich graphic capabilities of this popular tool. Another option for model deployment is via SAS Web Report Studio.

      One option for model maintenance is using SAS Model Manager, which manages and monitors analytical model performance in a central repository. It enables users to monitor the performance of a large number of models by defining different performance metrics and then generating model performance and comparison reports. Unfortunately, managing models generated by SAS Forecast Studio is currently not possible with SAS Model Manager.

      Another option for forecasting model performance tracking is to develop corresponding stored processes that generate periodic reports. In the case of performance deterioration, the user can contact the developer to perform the proper corrective actions of parameter re-fitting or complete model re-development. These options are available in SAS Forecast Studio.
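      A periodic performance report of this kind usually reduces to comparing recent forecasts with actuals and flagging models whose error exceeds an agreed limit. The Python sketch below illustrates that check with the mean absolute percentage error (MAPE); it is not a SAS stored process, the 5% threshold and the data are hypothetical, and it assumes that no actual value is zero.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent (actuals must be non-zero)."""
    errors = [abs((a - f) / a) for a, f in zip(actuals, forecasts)]
    return 100.0 * sum(errors) / len(errors)


def needs_attention(actuals, forecasts, threshold_pct):
    """True when recent accuracy is worse than the agreed threshold."""
    return mape(actuals, forecasts) > threshold_pct


# Hypothetical actuals and forecasts for the last three periods.
recent_actuals = [100.0, 200.0, 400.0]
recent_forecasts = [110.0, 180.0, 400.0]

if needs_attention(recent_actuals, recent_forecasts, threshold_pct=5.0):
    print("Forecast accuracy has deteriorated; contact the model developer.")
```

      Whether a flagged model needs parameter re-fitting or complete re-development is a judgment for the developer; the check only signals that a review is due.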

      At the end of this section are some generic guidelines on how to select the appropriate SAS tools so that you can apply the work process discussed.

      SAS/ETS includes all generic procedures for forecasting, such as FORECAST, AUTOREG, ARIMA, VARMAX, X11, X12, SPECTRA, and so on. It is a proper solution if the model developers have good programming skills in Base SAS and are knowledgeable in forecasting methods.

      SAS/STAT provides the generic statistical functions such as REG, PRINCOMP, PLS, VARCLUS, and so on. From an implementation point of view, it has the same requirements as SAS/ETS. Both tools are appropriate for small-scale applications and prototype development and require skilled SAS programmers.

      SAS Enterprise Guide enables fast system development based on a combination of built-in functional blocks using Base SAS procedures. It is a very good environment to integrate data preprocessing and some data mining and forecasting functions. SAS Enterprise Guide requires minimal Base SAS programming experience. Another advantage of SAS Enterprise Guide is its impressive reporting and graphical capabilities.

      SAS Enterprise Miner is the ideal non-programming development tool for data preprocessing and data mining activities. An additional advantage in the case of data mining in forecasting is the currently developed node for Time Series Data Mining (TSDM), which enables fully functional time series preprocessing, variable reduction, and selection.

      SAS Forecast Server provides automatic model and report generation for a wide range of forecasting algorithms. For large-scale industrial forecasting, this is the tool of choice.

      The best-case scenario for implementing the proposed work process for data mining in forecasting is to integrate it into some existing work process. Doing so minimizes cultural change since the organization of interest has already introduced work processes according to its strategy. One popular work process in many organizations using