Karen Robson

Multilevel Modeling in Plain Language



‘clustered’, or ‘grouped’ structure and you are being guided to a (regression) technique that accounts for this type of data structure.

      The idea of nesting, or clustering, is central to multilevel modeling. It simply means that smaller levels of analysis are contained within larger grouping units (Figure 1.1). The classic example is that individual students are nested within schools, but nesting can take other forms, such as individuals (Level 1) within cities (Level 2), patients within hospitals, siblings within families, or employees within firms.


      Figure 1.1 Two-level nesting

      In these types of two-level models, the lower level consisting of the smaller units (often individuals) is called Level 1, and the next level is referred to as Level 2. You may also have a third (or higher) level within which Level 2 units are nested (see Figure 1.2) such as students (Level 1) within classrooms (Level 2) within schools (Level 3), patients within hospitals within districts, siblings within families within neighbourhoods, or employees within firms within nations.


      Figure 1.2 Three-level nesting

      Multilevel modeling can deal with three or more levels of nesting. However, in this book we will focus on two-level models. As you gain more expertise in multilevel modeling, you may want to explore more complex structures, but for the scope of this introductory book, analysis with two levels will serve as the foundation upon which illustrative examples are created.
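      To make the two-level structure concrete, here is a minimal Python sketch of how nested data are typically laid out: one row per Level 1 unit, with a column identifying the Level 2 unit it belongs to. The records, the field names (`student_id`, `school_id`, `score`) and the values are hypothetical and serve only as an illustration.

```python
from collections import defaultdict

# Hypothetical toy records: each row is one student (Level 1) and carries
# the identifier of the school (Level 2) in which that student is nested.
students = [
    {"student_id": 1, "school_id": "A", "score": 62},
    {"student_id": 2, "school_id": "A", "score": 71},
    {"student_id": 3, "school_id": "A", "score": 68},
    {"student_id": 4, "school_id": "B", "score": 55},
    {"student_id": 5, "school_id": "B", "score": 59},
    {"student_id": 6, "school_id": "C", "score": 80},
]

# Group the Level 1 units under their Level 2 unit.
scores_by_school = defaultdict(list)
for row in students:
    scores_by_school[row["school_id"]].append(row["score"])

# Summarise each Level 2 unit: its number of Level 1 units and their mean score.
school_summary = {
    school: {"n_students": len(scores), "mean_score": sum(scores) / len(scores)}
    for school, scores in scores_by_school.items()
}

for school, summary in sorted(school_summary.items()):
    print(school, summary)
```

      Note that the Level 2 identifier repeats across rows. That repetition is what marks the data as nested, and it is the grouping variable a multilevel model would need to be told about.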

      This book also considers only cross-sectional nested data. Longitudinal data offer another variation of nesting. Longitudinal data are obtained when information is collected from respondents at more than one point in time; for example, people are interviewed annually in a longitudinal survey such as the British Household Panel Survey. Nesting is conceptualized somewhat differently with longitudinal data than with cross-sectional data: data collection events (Time 1, Time 2, Time 3, etc.) are nested within individual respondents. Therefore, the time point is Level 1 and the respondent is Level 2 (Figure 1.3). If your data look like this, you may still start with this book as an introduction to cross-sectional multilevel models before branching out into longitudinal data. There are also a number of other techniques for analysing this type of longitudinal or panel data. Halaby (2004), for example, offers a sociologically based primer for examining issues of causality in longitudinal data. Fitzmaurice et al. (2012) have written a more detailed textbook on applied longitudinal data analysis that is targeted at a multidisciplinary audience.


      Figure 1.3 Longitudinal nesting
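      The longitudinal form of nesting can be sketched in Python as a reshaping step, from one row per respondent to one row per data collection event. The helper `wide_to_long`, the column names (`person_id`, `t1`, `t2`, `t3`) and the values are hypothetical and are not taken from any real survey.

```python
# Hypothetical wide-format panel: one row per respondent, one column per wave.
wide = [
    {"person_id": 101, "t1": 3.0, "t2": 3.5, "t3": 4.0},
    {"person_id": 102, "t1": 2.0, "t2": 2.0, "t3": 2.5},
]


def wide_to_long(rows, wave_columns):
    """Return one record per respondent per measurement occasion."""
    long_rows = []
    for row in rows:
        for wave_number, column in enumerate(wave_columns, start=1):
            long_rows.append(
                {
                    "person_id": row["person_id"],  # Level 2 unit (respondent)
                    "wave": wave_number,            # Level 1 unit (time point)
                    "value": row[column],
                }
            )
    return long_rows


long_rows = wide_to_long(wide, ["t1", "t2", "t3"])
for record in long_rows:
    print(record)
```

      In the reshaped data each respondent appears three times, once per time point, which mirrors the idea that time is Level 1 and the respondent is Level 2.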

      Mixing levels of analysis

      There are two errors in causal reasoning that have to do with mixing different levels of analysis, which are illustrated in Figure 1.4. The first is known as the ecological fallacy and has to do with generalizing group characteristics to individuals. If we analyse the effect that average neighbourhood income has on the crime rates of that neighbourhood, we are comparing group characteristics with group characteristics. To extend this argument to particular individuals in the neighbourhood can be misleading. It is not appropriate to draw individual-level inferences from group-level characteristics. We may well find that as average income in a neighbourhood decreases, the crime rate increases – but we cannot say that if an individual’s income decreases, he or she is more likely to commit crime!
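      A small numerical sketch can show why group-level patterns need not hold for individuals. The numbers below are invented purely for illustration: `x` and `y` stand for any individual-level predictor and outcome, and the three groups stand for any Level 2 units. The data are constructed so that the relationship between group averages is perfectly negative while the relationship inside every group is perfectly positive.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Invented toy data: three groups, three individuals each, as (x values, y values).
groups = {
    "A": ([1, 2, 3], [7, 8, 9]),
    "B": ([4, 5, 6], [4, 5, 6]),
    "C": ([7, 8, 9], [1, 2, 3]),
}

# Correlation among individuals inside each group.
within = {name: pearson(xs, ys) for name, (xs, ys) in groups.items()}

# Correlation among the group averages (the "ecological" correlation).
group_mean_x = [sum(xs) / len(xs) for xs, _ in groups.values()]
group_mean_y = [sum(ys) / len(ys) for _, ys in groups.values()]
between = pearson(group_mean_x, group_mean_y)

# Correlation among all individuals when group membership is ignored.
all_x = [x for xs, _ in groups.values() for x in xs]
all_y = [y for _, ys in groups.values() for y in ys]
pooled = pearson(all_x, all_y)

print("within groups:", within)
print("between group means:", between)
print("pooled, ignoring groups:", pooled)
```

      Reading the correlation between group means as a statement about individuals would get the direction of the relationship wrong for every person in this toy data set.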

      The ecological fallacy can be demonstrated in a number of ways. Another common misinterpretation of group characteristics is to look at the average income in two very different communities that are about the same size. In Wealthyville, the average household income is $500,000 per year. In Poorville, the average household income is only $15,000 per year. If 10,000 households live in each community, we would say that the average household income across both communities is $257,500 per year. This figure gives a completely inaccurate picture of the communities, however, because it does not correspond to the household income of anyone. It is far too low to represent Wealthyville (just about half the actual household income) and far too high to represent Poorville (over 17 times the actual household income). By taking group characteristics and trying to generalize to individual households, we have committed the ecological fallacy.
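      The arithmetic behind the Wealthyville and Poorville example can be checked with a few lines of Python. The variable names are illustrative; the figures are those given in the text.

```python
# Figures from the example: 10,000 households in each community.
households = {"Wealthyville": 10_000, "Poorville": 10_000}
mean_income = {"Wealthyville": 500_000, "Poorville": 15_000}

# Pooled mean = total income across both communities / total households.
total_income = sum(households[town] * mean_income[town] for town in households)
total_households = sum(households.values())
pooled_mean = total_income / total_households

# How the pooled mean compares with each community's own average.
ratio_to_own_mean = {town: pooled_mean / mean_income[town] for town in households}

print(f"Pooled mean household income: ${pooled_mean:,.0f}")
for town, ratio in ratio_to_own_mean.items():
    print(f"{town}: pooled mean is {ratio:.2f} times the community average")
```

      The pooled mean is about 0.52 times Wealthyville's average and about 17.17 times Poorville's, so it describes neither community.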


      Figure 1.4 Units of analysis and making generalizations

      Keeping your units of analysis comparable also applies to arguments made in the opposite direction – generalizing individual processes to group processes. This problem is known as the atomistic fallacy (or individualist fallacy). People make this mistake when they take results from individual-level data and apply them to groups, where the context may be very different. We may find, for example, that being an immigrant is associated with an increased risk of mental health problems. A policy solution of creating mental health programmes for all immigrants may be misguided, however, if contextual variables (at the group level) are not taken into account. It may be that immigrants in large cities have better mental health than immigrants in small communities, where they may be isolated (Courgeau, 2003). If we simply take individual-level characteristics and apply them to groups, failing to take contexts into consideration, we may come to conclusions based upon flawed logic.

      Both the ecological and atomistic fallacies are errors that researchers make when they take data at one level and try to make generalizations to another level. As social scientists, we know that individual characteristics (e.g. age, gender, race) and contextual-level variables (e.g. school, neighbourhood, region) are important determinants of many different outcomes of interest. In multilevel modeling, we use both individual and group characteristics, and we can model outcomes in ways that show how individual and group characteristics each affect the outcome of interest and how group characteristics may alter the effect of individual characteristics in particular contexts.
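      For readers who find notation helpful, the idea that group characteristics can affect the outcome directly and can also change the effect of an individual characteristic is often written as a two-level model. The notation below is one common convention in the multilevel literature and is an assumption here, not necessarily the notation used later in this book: y_ij is the outcome for individual i in group j, x_ij is an individual (Level 1) characteristic, w_j is a group (Level 2) characteristic, e_ij is the individual-level residual, and u_0j and u_1j are group-level residuals.

```latex
\begin{align}
y_{ij}     &= \beta_{0j} + \beta_{1j} x_{ij} + e_{ij} \\
\beta_{0j} &= \gamma_{00} + \gamma_{01} w_j + u_{0j} \\
\beta_{1j} &= \gamma_{10} + \gamma_{11} w_j + u_{1j}
\end{align}
```

      In this sketch, \gamma_{01} captures the direct effect of the group characteristic on the outcome, while \gamma_{11} captures how the group characteristic changes the effect of the individual characteristic.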

      Theoretical reasons for multilevel modeling

      Your models should always be theory-driven, and the best model choice is one that corresponds to a sound theoretical rationale. One rationale that is often overlooked is the general theoretical argument about how the social world is portrayed. Researchers such as Bronfenbrenner (1977, 2001) have argued that the outcomes of individuals, particularly children, cannot be understood without taking different contexts into account. Bronfenbrenner’s ecological systems approach identifies a number of different contexts to be taken into consideration in terms of how they work independently and together to create the environments in which children live. By looking at data collected from individuals, we are focusing on the micro-level (i.e. individual) effects of specific characteristics on outcomes of interest, but it is more likely that these micro-level effects vary significantly across larger units at the meso (school or community) and macro (municipal or national) levels. The micro and the macro (and the micro and meso) interact. This theoretical perspective is most readily tested with statistical techniques that recognize these important distinctions.

      Although discussions in this vein invariably resort to examples from education research, the applications and theoretical motivations apply across a range of disciplines, including health, political science, criminology, sociology, and management research. Scholars from all these disciplines have noted the importance of linking the individual (the micro) and the contexts in which he or she lives (the macro). The popularity of theories that focus exclusively on the individual or solely on higher levels (groups, firms, nations) is being overshadowed by approaches that try to mix the two and presumably give a more accurate depiction of the complexity of the social world.

      What are the advantages of using multilevel models?

      Well, as the name implies, multilevel models are equipped to analyse multiple levels of data. The information about