with range evolution, as in GeoSSE and ClaSSE, allows statistical testing of classical hypotheses, such as whether widespread ranges lead to higher speciation rates (Goldberg et al. 2011) or whether extinction rates depend on area size or environmental heterogeneity (Meseguer et al. 2015). A shortcoming of SSE models is their computational complexity. The stationary distributions and parameter probabilities in SSE models are estimated through numerical integration, rather than analytically by matrix exponentiation as in DEC. One promising avenue for tackling these computationally demanding models is the probabilistic programming language (PPL) framework (Ronquist et al. 2020).
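The contrast drawn above between matrix exponentiation and numerical integration can be illustrated with a minimal sketch for a two-area CTMC. The rate values, function names and the forward-Euler integrator are illustrative assumptions and are not taken from DEC or any SSE implementation. Real SSE models integrate nonlinear equations that include speciation and extinction terms, for which the matrix-exponential shortcut is not available; the linear case is used here only because both routes can be checked against a closed-form answer.

```python
import math

def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def transition_probs_expm(q, t, terms=20, squarings=10):
    """P(t) = exp(Q t) via a truncated Taylor series with scaling and squaring."""
    n = len(q)
    scale = t / (2 ** squarings)
    a = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = identity(n)
    term = identity(n)
    for k in range(1, terms + 1):
        # After this step, term holds a^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def transition_probs_euler(q, t, steps=20000):
    """P(t) by forward-Euler integration of dP/dt = P Q, starting from P(0) = I."""
    n = len(q)
    dt = t / steps
    p = identity(n)
    for _ in range(steps):
        dp = mat_mul(p, q)
        p = [[p[i][j] + dt * dp[i][j] for j in range(n)] for i in range(n)]
    return p

# Hypothetical dispersal rates between areas A and B (events per unit time).
rate_ab, rate_ba = 0.3, 0.1
Q = [[-rate_ab, rate_ab],
     [rate_ba, -rate_ba]]
branch_length = 1.0

P_exact = transition_probs_expm(Q, branch_length)
P_euler = transition_probs_euler(Q, branch_length)

# Closed-form probability of moving from A to B in a two-state chain.
total = rate_ab + rate_ba
closed_form = rate_ab / total * (1.0 - math.exp(-total * branch_length))
```

Even in this small case the Euler route needs thousands of steps to approach the accuracy that a single matrix exponential gives, which hints at why models that require numerical integration along every branch are costlier to fit.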
2.5.3. Ecology-integrative models
Another exciting development in recent years is the long-sought-after integration of the ecological and historical (phylogenetic) sides of biogeography. Ecological biogeography deals with environmental factors and evolutionary processes that act at short time scales and at the level of individuals or populations, such as biotic interactions (facilitation, competition), environmental filtering or random genetic drift. Historical biogeography is concerned with deep-time geological events and species-level evolutionary processes, such as dispersal, extinction or speciation (Sanmartín 2012). The distinction between these two approaches has become blurred. For example, a vicariance barrier can be geological (e.g. a mountain) or environmental (e.g. climatically inhospitable land where a species cannot maintain viable populations).
Similarly, overland dispersal requires both a physical bridge and suitable environmental conditions along the corridor (Donoghue 2008). Ecological niche models can be used to find areas that are within the environmental tolerances of a species, and this information can be used in a biogeographic analysis for modeling the probability of dispersal along corridors or across barriers (Smith and Donoghue 2010). The ecological preferences of ancestors can also be incorporated by including extinct fossil taxa in the analysis; this offers great potential for reconstructing species distributions over the distant past (Meseguer et al. 2015). Ecological processes such as competition and environmental filtering can be modeled in Quintero and Landis’s (2019) composite biogeographic-trait evolutionary model: the rates of range expansion and range contraction depend on the trait values of other co-distributed species (effect of competition on biogeography), while the rate of divergence and convergence of trait values in a species depends on its sympatry with other species, gained or lost via colonization and extinction (effect of biogeography on traits).
2.6. Population-level and individual-based models
All models described above were designed to deal with phylogenies in which the terminal tips represent individual species (though BIB-DTA has been used in a phylogeographic context). CTMC processes are less appropriate for modeling the geographic evolution of individuals within a population, or between closely related populations, because they require the a priori definition of discrete geographic ranges and assume that movement between states is rare, that is, the chain remains in the same state and rarely jumps among states. When dealing with within-species biogeography or phylogeography, it is often difficult to define geographic ranges because boundaries are blurred by the frequent movement of individuals within populations and by gene flow. A Brownian Motion (BM) process, also termed “random walk” or diffusion model, is typically used for modeling the geographic evolution of populations and individuals (Lemey et al. 2011). This is a stochastic process in which a single parameter governs range evolution: individuals move away from a central location at a rate set by this parameter. Unlike in biogeographic Markov models, tips in the phylogeny are individuals with associated geographical coordinates. Finally, models based on electric circuit-resistance theory (McRae et al. 2008) have been used in phylogeography to model the rate and path of movement or gene flow on heterogeneous landscapes. A particular attraction of this model is the possibility of defining connectivity maps based on 2D landscapes with barriers: low resistances are assigned to landscape feature types that are most permeable to movement or best promote gene flow, and high resistances are assigned to movement barriers. The field of parametric phylogeography is expanding rapidly (Bloomquist et al. 2010), especially coalescent-based methods using approximate Bayesian computation (ABC), a likelihood-free Bayesian approach in which parameters in the model are estimated via simulation, and models are compared via summary statistics (Hickerson et al. 2007).
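The two ideas in this paragraph, diffusion of individuals under Brownian motion and likelihood-free estimation by approximate Bayesian computation, can be combined in a minimal sketch. Everything named here is an assumption made for illustration: the uniform prior, the tolerance, the summary statistic and the simplification that sampled individuals disperse independently from a common origin (shared ancestry along a phylogeny is ignored). It is not the procedure of any of the cited methods.

```python
import random

def simulate_displacements(sigma2, times, rng):
    """2D Brownian motion from the origin: each coordinate ~ Normal(0, sigma2 * t)."""
    coords = []
    for t in times:
        sd = (sigma2 * t) ** 0.5
        coords.append((rng.gauss(0.0, sd), rng.gauss(0.0, sd)))
    return coords

def msd_rate(coords, times):
    """Summary statistic: mean squared displacement per unit time and per dimension."""
    total = sum((x * x + y * y) / (2.0 * t) for (x, y), t in zip(coords, times))
    return total / len(times)

def abc_rejection(observed, times, n_draws, tolerance, prior_max, rng):
    """Keep prior draws whose simulated summary lies within `tolerance` of the observed one."""
    s_obs = msd_rate(observed, times)
    accepted = []
    for _ in range(n_draws):
        sigma2 = rng.uniform(0.0, prior_max)      # draw from a flat prior
        simulated = simulate_displacements(sigma2, times, rng)
        if abs(msd_rate(simulated, times) - s_obs) <= tolerance:
            accepted.append(sigma2)
    return accepted

rng = random.Random(1)
times = [1.0 + 0.1 * i for i in range(50)]        # hypothetical dispersal times
true_sigma2 = 2.0                                 # hypothetical "true" diffusion rate
observed = simulate_displacements(true_sigma2, times, rng)

posterior = abc_rejection(observed, times, n_draws=5000, tolerance=0.1,
                          prior_max=5.0, rng=rng)
posterior_mean = sum(posterior) / len(posterior)
```

The accepted values approximate the posterior distribution of the diffusion parameter without ever writing down a likelihood. A tighter tolerance gives a better approximation at the cost of fewer accepted draws.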
One expanding class of simulation models is forward-time, individual-based models, also termed in silico or automaton models (Gotelli et al. 2009; Overcast et al. 2019). These models set up a series of rules by which speciation, extinction and dispersal of lineages can occur within an environmentally heterogeneous, two-dimensional gridded landscape; they are therefore spatially explicit models (Gotelli et al. 2009). These models have been used for testing macroecological hypotheses on species richness and distribution patterns, but some incorporate evolutionary predictions (Rangel et al. 2018). Recently, simulation modeling has advanced rapidly, especially within the realm of phylogeography (Overcast et al. 2019), with the introduction of machine learning (ML) approaches and the integration of genetic data. Both in silico and machine learning approaches use simulations under pre-specified scenarios, together with statistical comparison of observations against the distribution of simulated values, to discriminate among alternative biogeographic scenarios. Simulation approaches are less efficient for parameter inference than parametric approaches such as DEC or BIB, because a large range of values needs to be explored via simulation. Conversely, simulation models are more powerful in modeling complex phylogeographic scenarios involving multiple interacting parameters, since there is no need to derive the likelihood function and parameter dependencies. In particular, machine learning methods are extremely flexible, with no cap on the number of parameters, and have been used for merging ecological and evolutionary processes (Overcast et al. 2019), trait-based biogeography (Sukumaran et al. 2016) or the integration of the spatial landscape (Tagliacollo et al. 2015). Some ML approaches do not rely on summary statistics and can be more efficient than ABC methods for phylogeographic inference (Fonseca et al. 2020).
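The rule-based, spatially explicit simulations described above can be sketched as a small forward-time model on a gridded landscape. The rules, probabilities, grid size and suitability layer below are hypothetical choices made for illustration and do not reproduce any of the cited models; lineages rather than individuals are tracked to keep the example short.

```python
import random

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def simulate(width, height, suitability, start, steps,
             p_disp, p_ext, p_spec, rng):
    """Forward-time simulation; returns {lineage_id: set of occupied (x, y) cells}."""
    lineages = {0: {start}}
    next_id = 1
    for _ in range(steps):
        for lid in list(lineages):            # snapshot: new lineages wait a step
            cells = lineages[lid]
            # Dispersal into adjacent cells, scaled by environmental suitability.
            for (x, y) in list(cells):
                for dx, dy in NEIGHBOURS:
                    nx, ny = x + dx, y + dy
                    if 0 <= nx < width and 0 <= ny < height:
                        if rng.random() < p_disp * suitability[ny][nx]:
                            cells.add((nx, ny))
            # Local extinction in each occupied cell.
            for cell in list(cells):
                if rng.random() < p_ext:
                    cells.discard(cell)
            if not cells:                     # global extinction of the lineage
                del lineages[lid]
                continue
            # Speciation: one occupied cell buds off as a new lineage.
            if len(cells) > 1 and rng.random() < p_spec:
                founder = rng.choice(sorted(cells))
                cells.discard(founder)
                lineages[next_id] = {founder}
                next_id += 1
    return lineages

def richness_map(lineages, width, height):
    """Number of lineages present in each grid cell, as rows of counts."""
    grid = [[0] * width for _ in range(height)]
    for cells in lineages.values():
        for (x, y) in cells:
            grid[y][x] += 1
    return grid

WIDTH, HEIGHT, BARRIER_X = 8, 6, 6
# Suitability 1.0 everywhere except an impassable barrier column (0.0).
suitability = [[0.0 if x == BARRIER_X else 1.0 for x in range(WIDTH)]
               for _ in range(HEIGHT)]

rng = random.Random(42)
final = simulate(WIDTH, HEIGHT, suitability, start=(4, 3), steps=50,
                 p_disp=0.2, p_ext=0.05, p_spec=0.1, rng=rng)
richness = richness_map(final, WIDTH, HEIGHT)
```

Running the simulation many times under alternative rule sets (for example, with and without the barrier) and comparing the simulated richness maps with an observed one is the kind of scenario discrimination the text describes.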
2.7. References
Beaumont, M.A. (2010). Approximate Bayesian computation in evolution and ecology. Annu. Rev. Ecol. Evol. Syst., 41, 379–406.
Bielejec, F., Lemey, P., Baele, G., Rambaut, A., Suchard, M.A. (2014). Inferring heterogeneous evolutionary processes through time: From sequence substitution to phylogeography. Syst. Biol., 63, 493–504.
Bloomquist, E.W., Lemey, P., Suchard, M.A. (2010). Three roads diverged? Routes to phylogeographic inference. Trends Ecol. Evol., 25, 626–632.
Bremer, K. and Janssen, T. (2006). Gondwanan origin of major monocot groups inferred from dispersal-vicariance analysis. Aliso, 22, 22–27.
Bribiesca-Contreras, G., Verbruggen, H., Hugall, A.F., O’Hara, T.D. (2019). Global biogeographic structuring of tropical shallow-water brittle stars. J. Biogeogr., 46, 1287–1299.
Brooks, D.R. (2005). Historical biogeography in the age of complexity: Expansion and integration. Rev. Mex. Biodivers., 76, 79–94.
Buerki, S., Forest, F., Alvarez, N., Nylander, J.A.A., Arrigo, N., Sanmartín, I. (2011). An evaluation of new parsimony-based versus parametric inference methods in biogeography: A case study using the globally distributed plant family Sapindaceae. J. Biogeogr., 38, 531–550.
Cybis, G.B., Sinsheimer, J.S., Lemey, P., Suchard, M.A. (2013). Graph hierarchies for phylogeography. Phil. Trans. R. Soc. B, 368, 20120206.
Darwin, C. (1859). On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life. John Murray, London.
De Maio, N., Wu, C.-H., O’Reilly, K.M., Wilson, D. (2015). New routes to phylogeography: A Bayesian structured coalescent approximation. PLoS Genet., 11(8), e1005421.
Donoghue, M.J. (2008). A phylogenetic perspective on the distribution of plant