Chihwa Kao

Large-Dimensional Panel Data Econometrics



Data Models

       2.5 Monte Carlo Simulations

       2.5.1 Experiment design

       2.5.2 Results

       2.6 Recent Development

       2.7 Technical Details

       2.8 Exercises

       3. Factor-Augmented Panel Data Regression Models

       3.1 Motivation

       3.2 CCE Approach

       3.3 IPC Approach

       3.4 Likelihood Approach

       3.5 Other Studies

       3.6 An Empirical Example

       3.7 Exercises

       4. Structural Changes in Panel Data Models

       4.1 Heterogeneous Panels with a Common Structural Break

       4.2 Model 1: No Common Correlated Effects

       4.3 Model 2: Common Correlated Effects

       4.4 Multiple Common Break Points

       4.5 Endogenous Regressors and Break in Factors

       4.6 Monte Carlo Simulations

       4.6.1 Model 1: No common correlated Effects

       4.6.2 Model 2: Common correlated Effects

       4.6.3 Case of endogenous regressors

       4.7 An Empirical Example

       4.8 Recent Development

       4.9 Technical Details

       4.10 Exercises

       5. Latent-Grouped Structure in Panel Data Models

       5.1 Panel Latent Group Structure Models

       5.2 K-means Clustering

       5.3 Conclusion

       5.4 Exercises

       Bibliography

       Index

       Chapter 1

       Introduction

      This book is motivated by recent developments in high-dimensional panel data models with a large number of individuals/countries (n) and observations over time (T). Specifically, the following four chapters introduce four important research topics in large panels: testing for cross-sectional dependence, estimation of factor-augmented panel data models, structural changes, and group patterns. To address these issues, we examine the properties of traditional tests and estimators in a large-dimensional setup. We also take advantage of some techniques from Random Matrix Theory and Machine Learning.

      Chapter 2 covers testing for cross-sectional dependence in panel data regression models with large n and large T. Cross-sectional dependence, described as the interaction between cross-sectional units (e.g., households, firms and states), has been discussed extensively in the spatial econometrics literature. Intuitively, dependence across “space” can be regarded as the counterpart of serial correlation in time series. It could arise from behavioral interaction between individuals, e.g., imitation and learning among consumers in a community, or among firms in the same industry. Such interaction has been widely studied in game theory and industrial organization. It could also be due to unobservable common factors or common shocks, which are popular in macroeconomics.

      In the recent literature, cross-sectional dependence among individuals is a concern when n is large. Like serial correlation in time-series analysis, cross-sectional dependence leads to efficiency loss for least squares and invalidates conventional t-tests and F-tests, which use standard variance–covariance estimators. In some cases, it could even result in inconsistent estimators (Lee, 2002; Andrews, 2005). Several estimators have been proposed to deal with cross-sectional dependence, including the popular spatial methods (Anselin, 1988; Anselin and Bera, 1998; Kelejian and Prucha, 1999; Kapoor, Kelejian and Prucha, 2007; Lee, 2007; Lee and Yu, 2010) and factor models in panel data (Pesaran, 2006; Kapetanios, Pesaran and Yamagata, 2011; Bai, 2009). However, before imposing any structure on the disturbances of our model, it may be wise to test for the existence of cross-sectional dependence.

      There has been a lot of work on testing for cross-sectional dependence in the spatial econometrics literature; see Anselin and Bera (1998) for cross-sectional data and Baltagi, Song and Koh (2003) for panel data, to mention a few. The latter derives a joint Lagrange multiplier (LM) test for the existence of spatial error correlation as well as random region effects in a panel data regression model. Panel data provide richer information on the covariance matrix of the errors than cross-sectional data. This is especially relevant for the off-diagonal elements, which are of particular importance in determining cross-sectional dependence. With panel data one can test for cross-sectional dependence without imposing ad hoc specifications on the error structure generating the covariance matrix, e.g., the spatial autoregressive model in the spatial literature, or the single or multiple factor structures imposed on the errors in the macro literature. Ng (2006) and Pesaran (2004) propose two test procedures based on the sample covariance matrix in panel data. Ng (2006) develops a test using the spacing method in a panel model. Pesaran (2004) proposes a cross-sectional dependence (CD) test using the pairwise average of the off-diagonal sample correlation coefficients in a seemingly unrelated regressions model. The CD test is closely related to the RAVE test statistic advanced by Frees (1995). Unlike the traditional Breusch-Pagan (1980) LM test, the CD test is applicable for a large number of cross-sectional units (n) observed over T time periods. In Pesaran (2015), the CD test is interpreted as a test for weak cross-sectional dependence. Sarafidis, Yamagata and Robertson (2009) develop a test for cross-sectional dependence based on Sargan’s difference test in a linear dynamic panel data model, in which the error cross-sectional dependence is modeled by a multifactor structure. Hsiao, Pesaran and Pick (2012) propose an LM-type test for nonlinear panel data models.
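      As a rough illustration of the two statistics contrasted above, the following Python sketch computes the CD statistic, sqrt(2T/(n(n-1))) times the sum of the pairwise residual correlations over i < j, and the Breusch-Pagan LM statistic, T times the sum of the squared pairwise correlations. The function names and the T-by-n residual layout are assumptions made for this example, not notation from the book. The sketch also assumes a balanced panel and residuals from unit-by-unit regressions that include an intercept, so that the demeaning done by the correlation routine is harmless.

```python
import numpy as np


def pairwise_correlations(resid):
    """Return the n(n-1)/2 off-diagonal sample correlations (i < j).

    resid is assumed to be a (T, n) array: T time periods in rows,
    one column of regression residuals per cross-sectional unit.
    """
    corr = np.corrcoef(resid, rowvar=False)
    upper = np.triu_indices(corr.shape[0], k=1)
    return corr[upper]


def cd_statistic(resid):
    """CD statistic: scaled sum of the pairwise correlations."""
    T, n = resid.shape
    rho = pairwise_correlations(resid)
    return np.sqrt(2.0 * T / (n * (n - 1))) * rho.sum()


def bp_lm_statistic(resid):
    """Breusch-Pagan LM statistic: T times the sum of squared correlations."""
    T, _ = resid.shape
    rho = pairwise_correlations(resid)
    return T * np.sum(rho ** 2)


def simulate_residuals(T, n, factor_loading, seed):
    """Hypothetical data generator: idiosyncratic noise plus one common factor."""
    rng = np.random.default_rng(seed)
    common = rng.standard_normal((T, 1))
    noise = rng.standard_normal((T, n))
    return factor_loading * common + noise


if __name__ == "__main__":
    independent = simulate_residuals(T=100, n=20, factor_loading=0.0, seed=0)
    dependent = simulate_residuals(T=100, n=20, factor_loading=1.0, seed=0)
    print("CD, independent errors:", cd_statistic(independent))
    print("CD, common factor:     ", cd_statistic(dependent))
    print("LM, independent errors:", bp_lm_statistic(independent))
    print("LM, common factor:     ", bp_lm_statistic(dependent))
```

      Under the null of no cross-sectional dependence, the CD statistic is compared with standard normal critical values, while the LM statistic is compared with a chi-squared distribution with n(n-1)/2 degrees of freedom, an approximation intended for fixed n and large T. In the simulated example, the common factor pushes both statistics far from their null behavior.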
For a recent survey of some cross-sectional dependence tests in panels, see Moscone and Tosetti (2009). Baltagi, Feng and Kao (2011) propose a test for sphericity following John (1972) and Ledoit and Wolf (2002)