Tim Rey

Applied Data Mining for Forecasting Using SAS



Figure 2.5: Key blocks of data mining in Six Sigma

      Figure 2.5 is from Alex Kalos and Tim Rey's paper, “Data mining in the chemical industry” (2005). The details of using this data mining process within the Six Sigma framework are also given in this paper.

      Data mining in forecasting within DMAIC

      The other option for integrating the proposed work process for data mining in forecasting within Six Sigma is illustrated in Figure 2.6, which shows the corresponding links between the key blocks of both methodologies. The project definition steps, including system identification, are part of the define phase of DMAIC. The data preparation steps belong to the measure phase, and both the variable selection and reduction steps and the forecasting model development steps are included in the analyze phase. The forecasting model deployment steps are part of the improve phase, and the last part of the forecasting work process, the forecasting model maintenance steps, is linked to the control phase.

Figure 2.6
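
The links between the forecasting work process and the DMAIC phases described above can be written down as a simple lookup table. The following is a minimal sketch in Python; the dictionary name `WORK_PROCESS_TO_DMAIC` and the helper `steps_in_phase` are illustrative assumptions, not names used in the book.

```python
# Links between forecasting work-process steps and DMAIC phases,
# transcribed from the paragraph above. All identifiers are hypothetical.
WORK_PROCESS_TO_DMAIC = {
    "project definition": "define",
    "data preparation": "measure",
    "variable selection and reduction": "analyze",
    "forecasting model development": "analyze",
    "forecasting model deployment": "improve",
    "forecasting model maintenance": "control",
}


def steps_in_phase(phase):
    """Return the work-process steps linked to a given DMAIC phase."""
    return [step for step, p in WORK_PROCESS_TO_DMAIC.items() if p == phase]


# The analyze phase is the only one that covers two blocks of the work process.
print(steps_in_phase("analyze"))
```

A lookup of this kind is one way to tag project tasks with the DMAIC phase in which they would be reported.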

      Because of the clear link between the proposed work process (based on the requirements for developing high-performance forecasting) and a work process such as Six Sigma (that is almost universally adopted in industry), you can integrate the two processes with minimal effort and cultural change. As a result, you have greater opportunities to introduce the proposed methodology and can more efficiently manage projects and develop forecast systems.

      Opportunity Statement:

       The current forecast is judgmental, with an average mean absolute percentage error (MAPE) of 16.5% for four quarterly forecasts.

       The opportunity is to improve the forecast by using statistical methods.

       The key hypothesis is that more accurate forecasts will lead to proactive business decisions that will consistently increase profit by at least 10%.
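
The opportunity statement above is expressed in terms of MAPE. As a reminder of how that measure is computed for a single forecast, here is a minimal Python sketch. The price series are hypothetical and are not data from the project; the function name `mape` is an assumption made for illustration.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error of one forecast, in percent."""
    if len(actuals) != len(forecasts) or not actuals:
        raise ValueError("actuals and forecasts must be non-empty and equally long")
    # Average the absolute errors, each taken relative to the actual value.
    ratios = [abs((a - f) / a) for a, f in zip(actuals, forecasts)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical quarterly prices and a hypothetical judgmental forecast.
actual = [100.0, 110.0, 120.0, 125.0]
judgmental = [90.0, 121.0, 108.0, 150.0]

print(round(mape(actual, judgmental), 1))  # 12.5
```

The figure quoted in the opportunity statement is an average over four quarterly forecasts, so in practice a function like this would be applied to each forecast and the results averaged.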

      Project Goal and Objective:

      The technical objective of the project is to develop, deploy, and support, for at least three years, a quarterly forecasting model that projects the price of Product A over a two-year time horizon and that outperforms the accepted statistical benchmark (naïve forecasting in this case) by 20%, based on the average of four consecutive quarterly forecasts.
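
The objective above compares the model with a naïve benchmark. The sketch below shows one way such a check could be set up in Python. It assumes that the naïve forecast repeats the last observed value, that accuracy is measured by MAPE, and that "outperforms by 20%" means a 20% relative reduction in average MAPE; the goal statement does not spell out these details, and all numbers and function names are hypothetical.

```python
def naive_forecast(history, horizon):
    """Naive benchmark: carry the last observed value forward."""
    return [history[-1]] * horizon


def mape(actuals, forecasts):
    """Mean absolute percentage error of one forecast, in percent."""
    ratios = [abs((a - f) / a) for a, f in zip(actuals, forecasts)]
    return 100.0 * sum(ratios) / len(ratios)


def relative_improvement(model_mapes, benchmark_mapes):
    """Relative reduction of the average model MAPE versus the benchmark."""
    avg_model = sum(model_mapes) / len(model_mapes)
    avg_benchmark = sum(benchmark_mapes) / len(benchmark_mapes)
    return (avg_benchmark - avg_model) / avg_benchmark


# One hypothetical forecast origin: eight quarters cover the two-year horizon.
history = [96.0, 98.0, 101.0, 100.0]
actual = [100.0, 125.0, 125.0, 125.0, 100.0, 125.0, 125.0, 125.0]
benchmark = naive_forecast(history, horizon=len(actual))
print(mape(actual, benchmark))

# Hypothetical MAPEs from four consecutive quarterly forecasts.
model_mapes = [7.0, 8.0, 6.0, 7.0]
naive_mapes = [10.0, 11.0, 9.0, 10.0]
improvement = relative_improvement(model_mapes, naive_mapes)
print(improvement >= 0.20)  # target stated in the project goal
```

Averaging over four forecast origins, as the goal requires, reduces the chance that a single favorable quarter decides whether the target is met.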

      Project Scope and Boundaries:

       The project will focus on Product A price in Germany.

      Deliverables:

       a forecasting model with user interface in Excel

       a decision scheme with proactive action items to increase profits

      Timeline:

      Estimated duration of the key steps of the project:

Project definition: 40 hours
Data preparation: 80 hours
Model development: 60 hours
Model deployment: 20 hours
Model maintenance: 10 hours per year
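
The estimates above can be totaled over the support period. The following short Python sketch does the arithmetic, assuming the three-year support period named in the project goal; the variable and function names are illustrative.

```python
# One-time effort per step, in hours, taken from the timeline above.
one_time_hours = {
    "project definition": 40,
    "data preparation": 80,
    "model development": 60,
    "model deployment": 20,
}
maintenance_hours_per_year = 10
support_years = 3  # assumption: the minimum support period from the project goal


def total_effort(years):
    """One-time effort plus maintenance over the given number of years."""
    return sum(one_time_hours.values()) + maintenance_hours_per_year * years


print(total_effort(support_years))  # 230
```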

      Team Composition:

      The ideal team includes:

      Management sponsor

      Project owner

      Project leader

      Technical subject matter experts

      Model developers

      End users

      1 A good starting point for developing mind-maps is Tony Buzan's The Mind-map Book (2003).

      2 The mind-maps in this book are based on the Mindjet product MindManager 8, available from http://www.mindjet.com/.

      3 A classic book about data preparation is Dorian Pyle's Data Preparation for Data Mining (1999).

      4 Evans, C., Liu, C. and Pham-Kanter, G. “The 2001 recession and the Chicago Fed National Activity Index: Identifying business cycle turning points,” Economic Perspectives 26, no. 3 (2002): 26–43.

      5 The FVA method is described in Michael Gilliland's book, The Business Forecasting Deal (2010).

      6 A book with many examples of using different SAS solutions for data preparation is Gerhard Svolba's Data Preparation for Analytics Using SAS (2006).

      7 A good explanation of X11 and X12 is given by Spyros G. Makridakis et al. in Forecasting: Methods and Applications (1997).

      8 Friedman, J. H. “Greedy function approximation: A gradient boosting machine,” Annals of Statistics 29 (2001): 1189–1232.

      9 A useful classification of the SAS/ETS functions is given in Table 1.1 in the book SAS for Forecasting Time Series (2003) by John Brocklebank and David Dickey.

      10 A detailed description of S&OP is given in Charles Chase's Demand-Driven Forecasting: A Structured Approach to Forecasting (2009).

      11 The reader can find more information about Six Sigma in Implementing Six Sigma: Smarter Solutions Using Statistical Methods (2003) by Forrest Breyfogle III.

      12 January/February 2007 Issue at http://www.isixsigma-magazine.com/

      Chapter 3: Data Mining for Forecasting Infrastructure

       3.1 Introduction

       3.2 Hardware Infrastructure

       3.2.1 Personal Computers Network Infrastructure

       3.2.2 Client/Server Infrastructure

       3.2.3 Cloud Computing Infrastructure

       3.3 Software Infrastructure

       3.3.1 Data Collection Software

       3.3.2 Data Preparation Software

       3.3.3 Data Mining Software

       3.3.4 Forecasting Software

       3.3.5 Software Selection Criteria

       3.4 Data Infrastructure