the visibility, and therefore the identifiability, of individual letters presented in letter strings: visual acuity, crowding, and spatial attention. Acuity and crowding are thought to be the principal factors involved in determining letter‐in‐string visibility (see Figure 3.3). Visual acuity is determined by the density of retinal receptors and drops sharply and linearly from the point of focus of the eyes within foveal vision – which is the region of vision that generally encompasses single word reading. Crowding is another general visual constraint that determines the identifiability of visual objects as a function of the proximity of surrounding objects (Pelli & Tillman, 2008). It can be thought of as a form of visual clutter, with the deleterious impact of surrounding objects increasing the closer they are to the target object. This interference in object identification is determined by the spatial extent of the crowding zone that surrounds a to‐be‐identified object. Experiments have shown a linear increase in the extent of the crowding zone as the eccentricity of the center of the zone increases. This function is known as Bouma’s law (Pelli & Tillman, 2008), named after Herman Bouma, a pioneer in letter perception and crowding research.
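The linear relation described by Bouma’s law can be written as a one‐line rule. The sketch below is a minimal illustration rather than a fitted model: it assumes that the extent of the crowding zone is a fixed proportion of the eccentricity of its center. The proportionality constant of 0.5 is an assumed, illustrative value that is not given in this text, and the function and parameter names are hypothetical.

```python
# Minimal sketch of Bouma's law: the crowding zone grows linearly with eccentricity.
# ASSUMPTION: BOUMA_FACTOR = 0.5 is an illustrative value, not taken from this text.
BOUMA_FACTOR = 0.5


def crowding_zone_extent(eccentricity_deg, factor=BOUMA_FACTOR):
    """Return the extent (in degrees) of the crowding zone centered at a given eccentricity."""
    if eccentricity_deg < 0:
        raise ValueError("eccentricity must be non-negative")
    return factor * eccentricity_deg


def is_crowded(eccentricity_deg, flanker_distance_deg, factor=BOUMA_FACTOR):
    """A flanker is treated as interfering when it falls inside the target's crowding zone."""
    return flanker_distance_deg < crowding_zone_extent(eccentricity_deg, factor)


if __name__ == "__main__":
    # The same flanker distance becomes harmful as the target moves away from fixation.
    for ecc in (1.0, 2.0, 4.0):
        print(ecc, crowding_zone_extent(ecc), is_crowded(ecc, flanker_distance_deg=1.0))
```

Under these assumptions, a flanker at a fixed distance from the target does not interfere near fixation but does interfere at larger eccentricities, which is the qualitative pattern the law describes.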
Figure 3.3 Serial position function for letter‐in‐string visibility with central fixations (i.e., fixation on the middle letter) explained by the combined influence of acuity (linear drop from fixated letter to outer letters) and crowding (greatest for inner letters that are flanked by two letters, least for outer letters that are flanked by one letter). The figure also illustrates the advantage for initial position, which can be much greater depending on the testing conditions.
Grainger et al., 2016 / With permission of Elsevier.
Evidence supporting the role of acuity and crowding in letter identification comes from serial position functions for letter‐in‐string visibility. These are typically obtained by presenting a string of letters to participants and post‐cueing one of the locations in the string for identification. To limit contributions from phonological processing and whole‐word representations, the letter strings in these experiments usually comprise random consonants. Such experiments typically reveal a W‐shaped function for visibility, with letter identification accuracy highest at the beginning, middle, and end positions. This W‐function is thought to reflect a combination of the decrease in acuity from the central letter to the outer letters plus reduced crowding for the outer letters (Figure 3.3). While digits show a similar pattern, other kinds of simple visual stimuli, such as symbols and simple shapes, do not (Tydgat & Grainger, 2009).
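The way acuity and crowding combine to yield a W‐shaped function can be illustrated with a toy calculation. The sketch below is not the model used in the cited studies. It assumes that visibility is simply an acuity term that drops linearly with distance from the fixated letter, minus a fixed crowding cost for each adjacent letter. The function name, the parameter values, and the arbitrary visibility units are all hypothetical and chosen only to make the shape visible.

```python
def letter_visibility(n_letters, fixation_index, acuity_slope=0.10, crowding_cost=0.15):
    """Return one visibility score per letter position (arbitrary units).

    ASSUMPTION: visibility = linear acuity term - crowding cost per flanking letter.
    The default parameter values are illustrative, not empirical estimates.
    """
    scores = []
    for i in range(n_letters):
        # Acuity: linear drop with distance (in letters) from the fixated letter.
        acuity = 1.0 - acuity_slope * abs(i - fixation_index)
        # Crowding: inner letters have two flankers, outer letters have only one.
        n_flankers = (i > 0) + (i < n_letters - 1)
        scores.append(acuity - crowding_cost * n_flankers)
    return scores


if __name__ == "__main__":
    # Five-letter string fixated on its middle letter (index 2).
    for position, score in enumerate(letter_visibility(5, fixation_index=2), start=1):
        print(position, round(score, 2))
```

With these values the scores are highest at the fixated letter, dip at its neighbors, and recover at the two outer positions, because the release from crowding at the ends outweighs the additional loss of acuity. The sketch is symmetric and therefore does not capture the first‐letter advantage shown in Figure 3.3.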
It has been hypothesized that spatial attention contributes to the first‐letter advantage, whereby initial letters are identified more accurately than either the fixated letter or the final letter, because the deployment of attention is biased toward word beginnings (e.g., Aschenbrenner et al., 2017). However, given findings that suggest that attention cannot be manipulated within a word (e.g., Ducrot & Grainger, 2007), alternative accounts of the first‐letter advantage have been proposed (Chanceaux et al., 2013).
Letter positions
Beyond letter identity, information about the positions of letters within a word is crucial for accurate word recognition. Grainger and van Heuven (2004) proposed a model of orthographic processing that provides an account of how location‐specific visual information is used to generate a location‐invariant code for letter‐in‐word order (see Dehaene et al., 2005, for a similar proposal). In this account, the first level of orthographic processing involves the location‐specific encoding of letter identities via a horizontally aligned bank of letter detectors (for written languages with horizontally aligned scripts; see Figure 3.4). These function in the same way as the isolated letter detectors previously described and are thought to be case‐specific (e.g., separate detectors for lowercase a and uppercase A). Such location‐specific and case‐specific letter detectors provide information about where different letter identities are located along a line of text relative to where readers’ eyes are looking. These location‐specific letters are analogous to the location‐specific complex features involved in visual object identification. Therefore, one of the major transitions that has to be achieved during reading acquisition is the change from letters as individual objects to letters as object parts. In other words, during the very first stages of learning to read, after learning the alphabet, letters take on the role of the complex features that subtend visual object identification.
Orthographic Processing and Word Recognition
Having established the bases of letter‐level processing, we now turn to consider the visual and orthographic factors that determine ease of single‐word recognition.
Figure 3.4 Illustration of Grainger and van Heuven’s (2004) model of orthographic processing applied to the case of multiple letter strings separated by spaces (Grainger et al., 2014 / With permission of Elsevier). Letter identities are first assigned to a specific location along a line of text that depends on where readers’ eyes are looking at the text (gaze‐centered letters). Then, location‐invariant letter‐in‐word order is encoded by inferring the order of letter pairs (bigrams) on the basis of activity in gaze‐centered letters. These letter pairs are not necessarily contiguous (open‐bigrams), such that presentation of a word such as rock generates activity in open‐bigrams such as R‐O (R before O), but also R‐C (R before C). Although letter order matters for the formation of bigrams, the order of bigrams themselves does not matter. For simplicity, information about the spaces between words (which is available at the first level of orthographic processing) is left out of this figure, but is thought to contribute to the encoding of location‐invariant letter order. Possible feedback and lateral inhibitory connections have also been left out.
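The open‐bigram code described in the caption of Figure 3.4 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not an implementation of Grainger and van Heuven’s (2004) model: it ignores gaze‐centered letter detectors, spaces, and feedback or inhibitory connections, and simply lists ordered letter pairs. The function name and the optional max_intervening parameter (a cap on the number of letters allowed between the two members of a pair) are hypothetical. The passage above does not specify any such limit, so by default none is applied.

```python
from itertools import combinations


def open_bigrams(word, max_intervening=None):
    """Return the set of ordered, possibly non-contiguous letter pairs in a word.

    ASSUMPTION: max_intervening is a hypothetical knob; None means no limit.
    A set is used because the order of the bigrams themselves does not matter.
    """
    letters = word.upper()
    bigrams = set()
    # combinations() preserves left-to-right order, so `first` always precedes `second`.
    for (i, first), (j, second) in combinations(enumerate(letters), 2):
        if max_intervening is None or j - i - 1 <= max_intervening:
            bigrams.add(f"{first}-{second}")
    return bigrams


if __name__ == "__main__":
    print(sorted(open_bigrams("rock")))
    print(sorted(open_bigrams("rock", max_intervening=0)))  # contiguous pairs only
```

For rock, the resulting set contains R-O and R-C but not O-R, reflecting the point that letter order matters within a bigram even though the bigrams themselves are unordered.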
Visual factors
The studies of letter‐in‐string processing described in the preceding section typically presented stimuli centered on fixation. However, varying fixation location within a letter string completely modifies the distribution of visual acuity across the string. This has been identified as the main factor behind the “optimal viewing position” effect in words (O’Regan & Jacobs, 1992). The location of initial fixation within a word is varied by first presenting a fixation cross and then adjusting the position of the to‐be‐presented word relative to the fixation cross. Word recognition is facilitated by fixations toward the beginning rather than the end of a word, with an optimal location just left of center and an increasing cost for fixations further toward the extremities. That is, given a seven‐letter word in English, word identification is best with an initial fixation approximately on the third letter, and drops monotonically from this position to the first and last letters, with an overall advantage for beginning letters. This viewing position function takes the form of a J‐shaped function for response times and the inverse of this for accuracy (e.g., O’Regan et al., 1984; O’Regan & Jacobs, 1992). Visual acuity accounts for the decrease in performance as initial fixation moves from the center out, since the summed acuity across letter positions follows this simple linear function. What remains to be explained is the asymmetric form of the viewing position function, with an advantage