the greater role of semantics in reading aloud words such as yacht, aisle, and chef, as confirmed in behavioral and neuroimaging experiments cited above. Perry et al.’s (2019) use of a phonological lexicon does not.
In summary, the major continuity here is with the triangle framework and its account of the division of labor between pathways. Having implemented a version of the orthography➔phonology part of the triangle, Perry, Ziegler, and colleagues could complete the evolution of their approach by dropping the lexical route in favor of the orthography➔semantics➔phonology parts of the triangle, which are needed for independent reasons.
Learning to Read
Recent work using traditional dual‐route models has focused on reading acquisition (Pritchard et al., 2018; Perry et al., 2019), an important area where additional computational modeling could be very informative. However, this research inherits the limitations of the models of adult performance on which it is based.
The Self Teaching‐DRC model (Pritchard et al., 2018) attempts to show how children learn grapheme‐phoneme correspondence rules and add words to the lexicon, using Share’s (1995) “self‐teaching” mechanism. As the authors noted, “The [ST‐DRC] model uses DRC’s sublexical route and the interactivity between lexical and sublexical routes to simulate phonological recoding.” This is the same mechanism that failed to simulate skilled performance adequately. The paradox, then, is that if the researchers successfully simulate the acquisition of this knowledge, they will arrive at the very model that fails to simulate adult performance adequately. Note that the consistency effects that confounded the DRC model are observed in readers as young as 6–7 years old (Backman et al., 1984; Treiman et al., 1995).
The dual‐route approach nonetheless remains influential in education. The intuition that reading English requires learning pronunciation rules and memorizing irregular words is a premise of phonics instruction dating from the nineteenth century (Seidenberg, 2017), and the rules‐and‐exceptions approach retains its intuitive appeal. However, there is little agreement among researchers or educational practitioners about either part. On the rule side, there are widely varying proposals about what the rules are and how they should be taught. On the lexical side, reading curricula disagree about which words have irregular pronunciations; many hold that higher‐frequency words, not simply exceptions, need to be memorized, but they differ regarding the number of words involved. (See articles in Reading Research Quarterly, 2020, volume 55, S1, for discussion.) In our view, the lack of convergence on these issues arises from the mistaken assumption that the system consists of rules with exceptions.
If, as other evidence has suggested, the dual‐route approach is not an adequate account of written English, that may undermine the effectiveness of pedagogical practices based on it. The idea that spelling‐sound correspondences are quasiregular and learned via a combination of implicit statistical learning and explicit instruction has not penetrated very far in education, probably because it is not intuitive and requires background knowledge that most educators and educational researchers lack. If the quasiregular account is a more accurate characterization of this knowledge and how it is learned, it may provide the basis for more effective instructional materials and practices (Chang et al., 2020; Seidenberg et al., 2020).
The hybrid models are also being used to study reading acquisition, and the similarities we have noted between these models and the PDP‐connectionist models carry over to this domain. Like Pritchard et al. (2018), Perry et al. (2019) based their account on Share’s “self‐teaching” idea: when a word is read aloud correctly, its accuracy can be confirmed by matching the pronunciation to an entry in the phonological lexicon. If the word is mispronounced, the correct pronunciation must be provided from some other source for learning to occur. How pronouncing nonwords could yield learning in this framework requires additional assumptions. Identifying realistic sources of the error signals that drive learning is an important issue that merits additional research. Progress might be accelerated by incorporating research in two areas.
The first is research on machine learning algorithms for neural networks. Share’s self‐teaching hypothesis was an important, original contribution, and for many years the only mechanistic account of learning to read at least some words. The mechanism he described can be understood as an example of learning via a forward model (Plaut & Kello, 1999). Whether a word has been pronounced correctly can be determined by attempting to comprehend it as spoken language (i.e., via phonology➔semantics). If the pronunciation is correct, the semantic pattern that is computed on this backward pass should match the meaning of the word that was pronounced. A discrepancy between the two yields an error signal that can be used to adjust the weights. Processing one’s own language production on a backward pass through the speech comprehension system is used in contexts ranging from infant babbling (Werker & Tees, 1999) to making grammaticality judgments (Allen & Seidenberg, 1999). Machine learning researchers have also studied “semi‐supervised” learning algorithms in which the specificity of the feedback can vary across learning trials (Gibson et al., 2013). This comes much closer to the varied conditions under which children learn, which include explicit correction (full feedback), partial “cues” about pronunciation, broad indications that a pronunciation was correct or incorrect, and children learning from their own output, correct or not.
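The forward‐model idea can be made concrete with a minimal sketch. It is a toy illustration under stated assumptions, not the implementation of Plaut and Kello (1999) or of any published reading model. Both mappings are linear; the phonology➔semantics mapping (P2S) is assumed to be already known from spoken language and is held fixed; the intended meaning of each word is assumed to be available (e.g., from context); and all names, vectors, and parameter values are hypothetical. The learner produces a pronunciation from print, "listens" to it through P2S, and uses only the resulting semantic discrepancy to adjust the orthography➔phonology weights.

```python
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

# Hypothetical fixed phonology->semantics mapping, assumed to have been
# learned earlier from spoken language. It is never updated here.
P2S = [[1.0, 0.5, 0.0],
       [0.0, 1.0, 0.5],
       [0.0, 0.0, 1.0]]

# Toy items: (orthographic pattern, intended meaning). All values are made up.
ITEMS = [
    ([1.0, 0.0, 0.0], [1.0, 0.0, 1.0]),
    ([0.0, 1.0, 0.0], [0.0, 1.0, 1.0]),
    ([0.0, 0.0, 1.0], [1.0, 1.0, 0.0]),
]

def train(items, p2s, lr=0.1, epochs=500, seed=0):
    """Learn orthography->phonology weights from semantic error alone."""
    rng = random.Random(seed)
    n_sem, n_phon = len(p2s), len(p2s[0])
    n_orth = len(items[0][0])
    # Small random initial orthography->phonology weights.
    o2p = [[rng.uniform(-0.1, 0.1) for _ in range(n_orth)]
           for _ in range(n_phon)]
    history = []  # summed squared semantic error per epoch
    for _ in range(epochs):
        total = 0.0
        for orth, meaning in items:
            phon = matvec(o2p, orth)    # attempted pronunciation
            heard = matvec(p2s, phon)   # comprehend one's own output
            err = [h - m for h, m in zip(heard, meaning)]
            total += sum(e * e for e in err)
            # Pass the semantic error back through the fixed forward model
            # (P2S transposed) to obtain an error signal at phonology.
            grad_phon = [sum(p2s[k][j] * err[k] for k in range(n_sem))
                         for j in range(n_phon)]
            # Gradient step on the orthography->phonology weights only.
            for j in range(n_phon):
                for i in range(n_orth):
                    o2p[j][i] -= lr * grad_phon[j] * orth[i]
        history.append(total)
    return o2p, history

O2P, HISTORY = train(ITEMS, P2S)
print(f"semantic error: first epoch {HISTORY[0]:.3f}, "
      f"last epoch {HISTORY[-1]:.2e}")
```

Note that no correct pronunciation is ever supplied in this sketch: the only teaching signal is the mismatch between the meaning recovered from the learner's own output and the intended meaning. Graded or partial feedback of the kind studied in semi‐supervised learning is not modeled here.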
The second relevant area of research is on the brain bases of learning. Humans engage in at least two types of learning: explicit and implicit, also known as declarative and procedural (Ellis, 2005; Kumaran et al., 2016), which are subserved by subcortical and cortical neural structures, respectively. The explicit system is associated with conscious awareness and intention, and knowledge that can be described using language, such as the rules for chess. It is slow and effortful (cf. Kahneman’s System 2 thinking, a related notion; Kahneman, 2011). The implicit system operates without conscious awareness, occurs automatically rather than by intent, and involves unlabeled statistical patterns (cf. System 1 thinking; Kahneman, 2011). Traditional instruction is explicit, as in teaching explicit rules for pronouncing or spelling words. Aside from the lack of agreement about the rules, these mappings are too complex to be wholly taught this way. Although learners benefit from explicit instruction (e.g., Foorman et al., 1991), most of this knowledge is acquired via implicit learning. Excessive emphasis on explicit instruction may make acquiring this material more difficult. Finding the balance between explicit and implicit learning experiences that is most effective is an important challenge for future research. Incorporating both learning systems in computational models of reading seems an obvious and necessary next step, with potentially important implications for education.
Conclusions
Computational modeling is a powerful tool for studying reading and other complex behaviors. Modeling has addressed two major concerns about earlier box‐and‐arrow models of reading and other phenomena. First, it was unclear whether informally stated verbal models would work in the intended ways. Discussing schematic diagrams of the dual‐route model, Seidenberg (1988) observed:
My own view is that these [models] can only be sustained because explicit representations and processing mechanisms are not provided. Conversely, providing this information would yield a very different functional architecture than the received view.
Implementations of dual‐route models validated these concerns: providing the necessary computational details revealed the limitations of the approach. Recognition of these limitations (initially by Rumelhart & McClelland, 1986, in the context of learning the past tense; later by Seidenberg & McClelland, 1989) led to the development of connectionist models with a very different character.
The second concern was the post‐hoc character of box‐and‐arrow modeling. As new behavioral phenomena were discovered, components were added or adjusted to fit the data. Case reports of patients with highly selective impairments were considered especially informative (Patterson & Lambon Ralph, 1999). The framework was unconstrained and could be modified in numerous ways. Elaborations of these informal models increased the number of phenomena that could be accommodated, but the explanations were shallow because they were devised to fit the data rather than independently justified (see Seidenberg,