Group of authors

The Handbook of Speech Perception



      K. G. MUNHALL¹, ANJA‐XIAOXING CUI², ELLEN O’DONOGHUE³, STEVEN LAMONTAGNE¹, AND DAVID LUTES¹

      ¹ Queen’s University, Canada

      ² University of British Columbia, Canada

      ³ University of Iowa, United States

      There is broad agreement that the American socialite Florence Foster Jenkins was a terrible singer. Her voice was frequently off‐key and her vocal range did not match the pieces she performed. The mystery is how she could not have known this. However, many – including her depiction in the eponymous film directed by Stephen Frears – think it likely that she was unaware of how poorly she sang. The American mezzo‐soprano Marilyn Horne offered this explanation: “I would say that she maybe didn’t know. First of all, we can’t hear ourselves as others hear us. We have to go by a series of sensations. We have to feel where it is” (Huizenga, 2016). This story about Jenkins raises many of the key questions about the topic of this chapter, the perceptual control of speech. Like singing, speech is governed by a control system that requires sensory information about the effects of its actions, and the major source of this sensory feedback is the auditory system. However, the speech we hear is not what others hear, and yet we are able to control our speech motor system in order to produce what others need or expect to hear. For both speech and singing, much is unknown about the auditory‐motor control system that accomplishes this. What role does hearing your voice play in error detection and correction? How does this auditory feedback processing differ from how others hear you? What role does hearing your voice play in learning to speak?

      This chapter addresses a number of issues related to the perceptual control of speech production. We first examine the importance of hearing yourself speak through the study of natural and experimental deafening in humans and birds. This work is complemented by recent work involving real‐time manipulations of auditory feedback through rapid signal processing. Next, we review what is known about the neural processing of self‐produced sound. This includes work on corollary discharge or efference copy, as well as studies showing cortical suppression during vocalizing. Finally, we address the topic of vocal learning and the general question of the relationship between speech perception and speech production. Only a small number of species, including humans, learn their vocal repertoire. It is important to understand the conditions that promote this learning and also to understand why this learning is so rare. Throughout the review, we draw on research on birdsong, the animal model of human vocal production. The literature on birdsong provides exciting new research directions, as extensive projects on the genetic and neural underpinnings of vocal learning are carried out, demonstrating remarkable similarity to human vocal behavior (Pfenning et al., 2014).

      Two broad features differentiate the high‐level detection of language errors from other forms of target‐based correction, such as those in speech production. First, language‐error correction often interrupts the flow of output, while the same is not always true of compensation in response to auditory speech feedback perturbations. Second, language‐error correction typically involves conscious awareness, whereas speech feedback processing does not appear to depend on it.

      Two bodies of literature – clinical studies of hearing loss and artificial laboratory perturbation studies – shed light on these unique features of speech feedback processing.

       Deafness and perturbations of auditory feedback

      Loss of hearing has a drastic impact on the acquisition of speech (Borden, 1979). From the first stage of babbling to adult articulation, the speech of those who are profoundly hearing impaired has distinct acoustic and temporal characteristics. Canonical babbling is delayed in its onset and the number of well‐formed syllables is markedly reduced even after clinical intervention through amplification (Oller & Eilers, 1988). Beyond babbling, Osberger and McGarr (1982) have summarized the patterns of speech errors in children who have significant hearing impairments. While the frequencies of errors (and hearing levels) varied between children, there were consistent atypical segmental productions, including sound omissions, anomalous timing, and distortions of phonemes. These phonetic patterns are accompanied by inconsistent interarticulator coordination (McGarr & Harris, 1980). In addition, there are consistent suprasegmental issues in this population, including anomalies of vocal pitch and vocal‐quality control and inadequate intonation contours (Osberger & McGarr, 1982).

      These patterns of deficit most likely arise from the effects of deafness on both the perceptual learning of speech in general and the loss of auditory feedback in vocal learning. Data characterizing speech‐production behavior at different ages of deafness onset could shed some light on the extent to which learning to perceive the sound system or learning to hear yourself produce sounds contributes to the reported deficits. However, there are minimal data on humans that provide a window onto the importance of hearing at different stages of vocal learning. Binnie, Daniloff, and Buckingham (1982) provide a case study of a five‐year‐old who abruptly lost hearing. The child showed modest changes immediately after deafness onset but, over the course of a year, the intelligibility of his speech declined due to distortions in segmental production and prosody. Notably, the child rarely deleted sounds and tended to prolong vowels, perhaps to enhance kinesthetic feedback. While this case study is not strong evidence about the role of auditory feedback in development, it is noteworthy that the speech representations that govern fluent speech are well developed even at this young age: speech quality does not immediately degrade.