Group of authors

The Handbook of Speech Perception


production task. No shift was observed in baseline vowel formant values but a difference was observed in the magnitude of compensation to F1 perturbations. Oddly, this difference was observed in a follow‐up days later. The persistence is surprising for a number of reasons. First, the speech adaptation effects produced by formant shifts themselves drift away relatively quickly within an experimental session following return to normal feedback. Second, the perceptual training did not influence baseline vowel production, either immediately after training or days later. The influence of perceptual change on production is shown only in the magnitude of compensation (i.e. in the behavior of the auditory feedback processing system). Finally, the duration of the effect is noteworthy. Although it is not unheard of for perceptual effects to persist across many days, it is not common; one example is the McCollough effect in vision, which has been shown to last for months after a 15‐minute training period (Jones & Holding, 1975). However, the reinforcement learning paradigm used by Lametti et al. (2014) is considerably different from the adaptation approach used in other studies and suggests a more selective influence on the perception–production linkage.

      The published data suggest modest effects from speech‐perception training on speech production and vice versa. As Kittredge and Dell (2016) argue, the pathway for exchange between the input and output systems may be restricted to a small set of special conditions. One possibility they raise is that perceptual behavior that involves prediction invokes the motor system, and that this directly influences production.

      A separate line of research has suggested this influence may exist but has shown similarly small effect sizes in experiments. In the study of face‐to‐face conversations, numerous theoretical proposals support the idea that interlocutors align their language at many levels (Garrod & Pickering, 2004). At the phonetic level, the findings have been weak but consistent. Few acoustic findings support alignment but small perceptual effects have been frequently reported (Pardo et al., 2012; Kim, Horton, & Bradlow, 2011). The surprising aspect of these findings is the small effect size. Given the proposed importance of alignment in communication (and the proposed linkage between perception and production; Pickering & Garrod, 2013), the small influence is problematic.

       Correlational data

      The data linking perception and production within individuals are also surprisingly sparse. Most of the data show that talkers’ perception and production categories are somewhat similar. For example, Newman (2003) found small correlations between the VOT prototypes of listeners and their production VOT values (accounting for approximately 27 percent of the variance). However, Frieda et al. (2000) did not find such a correlation for the perceptual prototype for the vowel /i/ and production values. Fox (1982) showed that the factor analysis dimensions derived from listeners’ judgments of similarity between vowels could be predicted by the acoustics of vowels produced by the participants, but only by the corner vowels /i, u, A/. Bell‐Berti et al. (1979) categorized the manner in which participants produced the tense/lax distinction in front vowels based on their examination of electromyographic recordings. They later found that those participants who used a tongue‐height production strategy showed larger boundary shifts in an anchoring condition in a vowel‐perception test than those who used a muscle tension implementation of tense/lax. Perkell et al. (2004) also grouped participants on the basis of measurements of production data, and found that these groups performed differently in perception tests. The more distinct a talker’s production contrast between two vowels, the more likely that talker was to distinguish tokens in a continuum of those vowels.
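      As a rough illustration of what a "percent of the variance" figure implies, the sketch below relates a Pearson correlation coefficient r to the proportion of variance explained (r squared). It is a minimal sketch under stated assumptions, not a reconstruction of any analysis in the studies cited: the function names are hypothetical and the VOT values are invented for illustration only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def variance_explained(r):
    """Proportion of variance shared by two variables with correlation r."""
    return r * r


def r_from_variance_explained(r2):
    """Magnitude of r implied by a proportion of variance explained.

    The sign of r cannot be recovered from r squared alone.
    """
    if not 0.0 <= r2 <= 1.0:
        raise ValueError("proportion of variance must lie in [0, 1]")
    return math.sqrt(r2)


if __name__ == "__main__":
    # 27 percent of the variance corresponds to |r| of about 0.52.
    print(round(r_from_variance_explained(0.27), 2))

    # Invented values (ms), purely for illustration: perceptual VOT
    # prototypes and produced VOT for six hypothetical talkers.
    perceived = [62.0, 70.0, 55.0, 81.0, 66.0, 74.0]
    produced = [58.0, 75.0, 60.0, 72.0, 80.0, 69.0]
    r = pearson_r(perceived, produced)
    print(f"r = {r:.2f}, variance explained = {variance_explained(r):.2f}")
```

      Read this way, a report of roughly 27 percent shared variance corresponds to a correlation magnitude of about 0.52, which still leaves most of the between‐talker variance unaccounted for.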

      The most recent evidence in support of this correlational relation between perception and production abilities comes from Franken et al. (2017). In this study, production variability for vowel formant values was measured and the ability to discriminate between vowel tokens assessed. These two variables were found to correlate in their data. However, the correlations are modest and smaller than those reported by Perkell et al. (2004). The argument put forward in Franken et al. (2017) is that talkers with better perceptual acuity are less variable in production and that these talkers are more sensitive to feedback discrepancies. Indeed, Villacorta, Perkell, and Guenther (2007) showed a greater response to formant perturbation in subjects who had greater acoustic acuity. However, this finding is inconsistent with MacDonald, Purcell, and Munhall’s (2011) meta‐analysis of the variability of production and compensation magnitude in F1 and F2 for 116 subjects. The lack of relationship between variability and compensation observed by MacDonald et al. is important given the large sample size considered in their analysis.

       Interference effects

      Influences in early infant speech behavior have also been shown from the other direction. Speech‐production tendencies can be correlated with developing perceptual abilities (e.g. Majorano, Vihman, & DePaolis, 2014; DePaolis, Vihman, & Keren‐Portnoy, 2011). Majorano, Vihman, and DePaolis (2014) tested children learning Italian at 6, 12, and 18 months. At the end of the first year, children whose production favored a single vocal motor pattern showed a perceptual preference for sounds resulting from those speech movement patterns.

      None of these findings, however, elucidate how auditory feedback processing develops, nor what role hearing your own productions plays in learning to pronounce words. The findings of MacDonald et al. (2012) raise the possibility that early productions by the child are not tuning the word‐formation system on the basis of auditory‐error corrections. Instead, the early focus may be on the adult models. Cooper, Fecher, and Johnson (2018) recently showed that two‐and‐a‐half‐year‐old children preferred recordings of adults over recordings of their own and other toddlers’ speech. In fact, the children showed no familiarity effect and thus did not prefer their mother’s speech over another adult’s. As Cooper, Fecher, and Johnson (2018) suggest, the driving force in lexical acquisition may be the adult targets and not the successive shaping of children’s targets by corrective auditory feedback.