Group of authors

The Handbook of Speech Perception


helped usher in a new understanding of the perceptual brain.

      This chapter will readdress important issues in multisensory speech perception in light of the enormous amount of relevant research conducted since publication of the first version of this chapter (Rosenblum, 2005). Many of the same topics addressed in that chapter will be addressed here including: (1) the ubiquity and automaticity of multisensory speech in human behavior; (2) the stage at which the speech streams integrate; and (3) the possibility that perception involves detection of a modality‐neutral – or supramodal – form of information that is available in multiple streams.

      Since 2005, evidence has continued to grow that supports speech as an inherently multisensory function. It has long been known that visual speech is used to enhance challenging auditory speech, whether that speech is degraded by noise or accent, or simply contains complicated material (e.g. Arnold & Hill, 2001; Bernstein, Auer, & Takayanagi, 2004; Reisberg, McLean, & Goldfield, 1987; Sumby & Pollack, 1954; Zheng & Samuel, 2019). Visual speech information helps us acquire our first language (e.g. Teinonen et al., 2008; for a review, see Danielson et al., 2017) and our second languages (Hardison, 2005; Hazan et al., 2005; Navarra & Soto‐Faraco, 2007). The importance of visual speech in language acquisition is also evidenced in research on congenitally blind individuals. Blind children show small delays in learning to perceive and produce segments that are acoustically more ambiguous, but visually distinct (e.g. the /m/–/n/ distinction). Recent research shows that these idiosyncratic differences carry through to congenitally blind adults who show subtle distinctions in speech perception and production (e.g. Delvaux et al., 2018; Ménard, Leclerc, & Tiede, 2014; Ménard et al., 2009, 2013, 2015).

      These haptic speech demonstrations are important for multiple reasons. First, they demonstrate how readily the speech system can make use of – and integrate – even the most novel type of articulatory information. Very few normally sighted and hearing individuals have intentionally used touch information for purposes of speech perception. Despite the odd and often limited nature of haptic speech information, it is readily usable, showing that the speech brain is sensitive to articulation, regardless of the modality through which it is conveyed. Second, the fact that this information can be used spontaneously despite its novelty may be problematic for integration accounts based on associative learning between the modalities. Both classic auditory accounts of speech perception (Diehl & Kluender, 1989; Hickok, 2009; Magnotti & Beauchamp, 2017) and Bayesian accounts of multisensory integration (Altieri, Pisoni, & Townsend, 2011; Ma et al., 2009; Shams et al., 2011; van Wassenhove, 2013) assume that the senses are effectively bound and integrated on the basis of the associations gained through a lifetime of experience simultaneously seeing and hearing speech utterances. However, if multisensory speech perception were based only on associative experience, it is unclear how haptic speech would be so readily used and integrated by the speech function. In this sense, the haptic speech findings pose an important challenge to associative accounts (see also Rosenblum, Dorsi, & Dias, 2016).

      However, some recent research has challenged this interpretation of integration (for a review, see Rosenblum, 2019). For example, a number of studies have been construed as showing that attention can influence whether integration occurs in the McGurk effect (for reviews, see Mitterer & Reinisch, 2017; Rosenblum, 2019). Adding a distractor to the visual, auditory, or even tactile channels seems to significantly reduce the strength of the effect (e.g. Alsius et al., 2005; Alsius, Navarra,