Group of authors

The Handbook of Speech Perception



(Brosch, Selezneva, & Scheich, 2005). Nevertheless, it is currently thought that the neural representations of sounds and events in the primary auditory cortex are based on the detection of relatively simple acoustic features and are not specific to speech or vocalizations, given that the primary auditory cortex does not seem to show any obvious preference for speech over nonspeech stimuli. In the human brain, to find the first indication of areas that appear to prefer speech to other, nonspeech sounds, we must move beyond the tonotopic maps of the primary auditory cortex (Belin et al., 2000; Scott et al., 2000).

Schematic illustration of a map of cortical areas involved in the auditory representation of speech.

      (Source: Adapted from Rauschecker & Scott, 2009.)

       Speech‐preferential areas

      That the brain contains areas necessary for understanding speech but not for general sound perception has been known since the nineteenth century, when the German neurologist Carl Wernicke associated the aphasia that bears his name with damage to the STG (Wernicke, 1874). Wernicke’s eponymous area was, incidentally, reinterpreted by later neurologists to refer only to the posterior third of the STG and adjacent parietal areas (Bogen & Bogen, 1976), although some disagreement about its precise boundaries continues to this day (Tremblay & Dick, 2016).

      With the advent of fMRI at the end of the twentieth century, the posterior STG (pSTG) was confirmed to respond more strongly to vocal sounds than to nonvocal sounds (e.g. speech, laughter, or crying compared to the sounds of wind, galloping, or cars; Belin et al., 2000). Neuroimaging also revealed a second, anterior, area in the STG that responds more to vocal than to nonvocal sounds (Belin et al., 2000). These voice‐preferential areas are found in both hemispheres of the brain. Additional studies have shown that it is not just the voice but also intelligible speech that excites these regions, with speech processing being more specialized in the left hemisphere (Scott et al., 2000).

      Anatomically, the anterior and posterior STG receive white‐matter connections from the primary auditory cortex, and in turn feed two auditory‐processing streams: one antero‐ventral, which extends into the inferior frontal cortex, and one postero‐dorsal, which curves into the inferior parietal lobule. The specific functions of these streams remain a matter of debate. For example, Rauschecker and Scott (2009) propose that the paths differ in processing what and where information in the auditory signal, where what refers to recognizing the cause of the sound (e.g. a thunderclap) and where to its spatial location (e.g. to the west). Another, more linguistic, suggestion is that the ventral stream is broadly semantic, whereas the dorsal stream may be described as more phonetic in nature (Hickok & Poeppel, 2004). Whatever their functions, however, there appear to be two streams diverging around the anterior and posterior STG.

       Auditory phonetic representations in the superior temporal gyrus

      Electrocorticography (ECoG), which involves placing electrodes directly onto the surface of the brain, cannot easily record from the primary auditory cortex (PAC), because the PAC is tucked away inside the Sylvian fissure, along the dorsal aspect of the temporal lobe. At the same time, because ECoG measures the summed postsynaptic electrical currents of neurons with millisecond resolution, it is sensitive to rapid neural responses on the timescale of individual syllables, or even individual phones. By contrast, fMRI measures hemodynamic responses: changes in blood flow that are related to neural activity but that unfold on the order of seconds. In recent years, the use of ECoG has revolutionized the study of speech in auditory neuroscience; an exemplary case is a recent paper by Mesgarani et al. (2014).
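The contrast in temporal resolution can be made concrete with a toy simulation (not from the chapter; all numbers, kernels, and rates below are illustrative assumptions). Eight "syllable" events at 4 Hz are convolved with a fast, millisecond-scale impulse response (a stand-in for the electrical signal ECoG records) and with a slow, gamma-shaped kernel peaking around 5 seconds (a stand-in for a hemodynamic response). The fast response preserves one distinct peak per syllable; the slow response blurs them into a single broad bump.

```python
import math

FS = 100             # simulation sampling rate (Hz); illustrative, not a real scanner rate
DUR = 20.0           # simulated duration (s)
N = int(DUR * FS)

# Eight "syllables" at 4 Hz (one impulse every 250 ms), starting at t = 0.25 s.
impulse_times = [(k + 1) * 0.25 for k in range(8)]
stimulus = [0.0] * N
for t in impulse_times:
    stimulus[int(t * FS)] = 1.0

def convolve(signal, kernel):
    """Plain discrete convolution, truncated to the signal length."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for j, k in enumerate(kernel):
            if i + j < len(out):
                out[i + j] += s * k
    return out

dt = 1.0 / FS
# Fast, ECoG-like impulse response: decays within tens of milliseconds.
fast_kernel = [math.exp(-(j * dt) / 0.01) for j in range(int(0.1 * FS))]
# Slow, hemodynamic-like impulse response: gamma shape, t^5 * exp(-t), peaking near 5 s.
slow_kernel = [((j * dt) ** 5) * math.exp(-(j * dt)) for j in range(int(15 * FS))]

fast = convolve(stimulus, fast_kernel)
slow = convolve(stimulus, slow_kernel)

def count_peaks(x):
    """Count strict local maxima, i.e. distinguishable events in the response."""
    return sum(1 for i in range(1, len(x) - 1) if x[i - 1] < x[i] > x[i + 1])

print(f"fast response: {count_peaks(fast)} peaks; "
      f"slow response: {count_peaks(slow)} peak(s)")
```

Under these assumptions the fast response shows eight separate peaks, one per syllable, while the seconds-scale kernel smears the same eight events together, which is why syllable-rate dynamics are accessible to ECoG but not to hemodynamic measures.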