
multimodal recruitment would occur after children watched another person perform an action during verb learning, or if they had to perform the action themselves in order for these multimodal systems to be recruited. To test this idea, we asked 5-year-old children to learn new verbs (such as “yocking”) that were associated with an action performed on a novel object. Children learned the action/verb/object associations either by watching an experimenter perform the action or through performing the action themselves. We then measured their BOLD activation patterns when they saw the novel objects or heard the novel words in a subsequent fMRI scanning session [James and Swain 2011]. The results were clear: only when children acted on the objects themselves while hearing/seeing the novel stimuli did a multimodal network emerge (see Figure 2.10).

A follow-up study directly tested the effects of multimodal-multisensory learning by having 5-year-old children learn the noises that objects made, either by interacting with the objects themselves or by watching an experimenter interact with them [James and Bose 2011]. The procedure was the same as above except that, instead of learning a verb, the participants heard a sound that was associated with object manipulation. Again, the extended multimodal network of activation was recruited in the brain only after self-produced action during learning.

      The action that is coded during learning is, therefore, part of the neural representation of the object or word. In essence, the multimodal act of object manipulation serves to link sensory and motor systems in the brain.

Figure 2.10 fMRI results after children learned verbs through their own actions or by passively watching the experimenter, compared with unlearned verbs. The upper panel shows the response to hearing the new verbs; the lower panel depicts activation when children saw the novel objects. Only learning through action resulted in high motor system activity. Top left and right graphs: middle frontal gyri; middle graphs: inferior parietal sulci; bottom graph: left primary motor cortex. Note: the left side of each image corresponds to the right hemisphere. (From James and Swain [2011])

       2.6.2 Neural Systems Supporting Symbol Processing in Children

The developmental trajectory of the neural system supporting letter perception clearly demonstrates how multimodal-multisensory experience, gained through the act of handwriting, shapes brain development. In preliterate children, the perception of individual letters recruits the bilateral fusiform gyri in visual association areas, the bilateral intraparietal sulci in visual-motor association regions, and the left dorsal precentral gyrus in motor cortex, but only for letters with which the child has had handwriting experience (Figure 2.11) [James and Engelhardt 2012]. Indeed, the intraparietal sulcus responds more strongly for form-feature associations with letterforms (i.e., stronger for letters learned through handwriting than for shapes learned through handwriting), whereas the left dorsal precentral gyrus responds more strongly for motor associations (i.e., stronger for letters learned through handwriting than through tracing or typing). The functional connections between these three regions and the left fusiform gyrus show a similar pattern in preliterate children after handwriting practice [James and Engelhardt 2012, Vinci-Booher et al. 2016].
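Functional connectivity of the kind just described is commonly quantified by correlating the mean BOLD time series of a seed region with those of other regions. The following is a minimal sketch of that idea, not the analysis pipeline used in the cited studies: the data are synthetic, and the region names, array shapes, and values are assumptions made purely for illustration.

```python
# Minimal seed-based functional connectivity sketch on synthetic data.
# Region names echo the text; the signals are random placeholders, NOT
# real fMRI time series from the cited studies.
import numpy as np

rng = np.random.default_rng(0)
n_volumes = 200  # hypothetical number of fMRI volumes

# Hypothetical mean BOLD time series for each region of interest (ROI).
roi_signals = {
    "left_fusiform_gyrus": rng.standard_normal(n_volumes),
    "intraparietal_sulcus": rng.standard_normal(n_volumes),
    "dorsal_precentral_gyrus": rng.standard_normal(n_volumes),
}

# Connectivity here is the Pearson correlation between the seed region
# (left fusiform gyrus) and each of the other ROIs.
seed = roi_signals["left_fusiform_gyrus"]
for name, signal in roi_signals.items():
    if name == "left_fusiform_gyrus":
        continue
    r = np.corrcoef(seed, signal)[0, 1]
    print(f"left_fusiform_gyrus <-> {name}: r = {r:.3f}")
```

With real data, the correlations would be computed on preprocessed time series and their strength compared across learning conditions (e.g., handwriting versus typing).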

Figure 2.11 (A) and (B) Greater BOLD signal for handwriting than for typing (handwriting > typing) in the frontal premotor cortices; (C) greater signal for handwriting than for tracing (handwriting > tracing) in the precentral gyrus and parietal cortex; and (D) greater signal for tracing than for typing (tracing > typing) in frontal cortex. (From James and Engelhardt [2012])
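Contrasts such as handwriting > typing are conventionally estimated with a voxelwise general linear model (GLM): condition regressors are fit to each voxel's BOLD signal, and a contrast vector compares the resulting beta weights. The sketch below illustrates that logic for a single voxel on synthetic data; the boxcar regressors, shapes, and effect sizes are assumptions, and a real analysis would also convolve regressors with a hemodynamic response function and apply statistical thresholding across voxels.

```python
# Minimal two-condition GLM contrast (handwriting > typing) for one
# voxel's BOLD time series. All data are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(1)
n_volumes = 120

# Hypothetical boxcar regressors: 1 while each condition is presented.
t = np.arange(n_volumes)
handwriting = (t % 40 < 10).astype(float)
typing = ((t % 40 >= 20) & (t % 40 < 30)).astype(float)

# Design matrix: [handwriting, typing, intercept].
X = np.column_stack([handwriting, typing, np.ones(n_volumes)])

# Synthetic voxel that responds more strongly during handwriting.
y = 2.0 * handwriting + 0.5 * typing + rng.standard_normal(n_volumes)

# Ordinary least squares estimate of the beta weights.
betas, *_ = np.linalg.lstsq(X, y, rcond=None)

# The contrast vector [1, -1, 0] tests handwriting > typing.
contrast = np.array([1.0, -1.0, 0.0])
print(f"handwriting > typing contrast estimate: {contrast @ betas:.3f}")
```

A reliably positive estimate at a voxel indicates greater activation during handwriting than during typing; maps such as Figure 2.11 summarize where such estimates hold across participants.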

The action of handwriting a letter feature-by-feature allows for multimodal input of information that could otherwise be encoded through a single sense. It therefore transforms an inherently unisensory behavior (visual letter recognition before handwriting experience) into a multimodal behavior (visual letter recognition after handwriting experience). Multimodal behaviors promote the emergence of multisensory integration in evolutionarily early subcortical brain regions. They effectively structure the input to cortical regions and engender multimodal integration in the cortex, as outlined previously. The ability to transform inherently unimodal behaviors, such as the visual perception of written forms or stationary objects, into inherently multimodal behaviors is the major utility provided by interacting with tools, such as writing implements.

Interestingly, the pattern of activation that is seen after children learn to write letters by hand is observed only if the writing is self-produced. That is, if a child watches an experimenter produce the same forms, the multimodal network is not evident during subsequent letter perception [Kersey and James]. This result suggests that it is the multimodal production, not the multisensory perception, that drives the emergence of the distributed brain network observed during letter perception.

The extant literature on the neural substrates underlying multimodal-multisensory learning in young children clearly demonstrates that the visual perception of actively learned objects is not purely visual. Learning through action creates multimodal brain networks that reflect the multimodal associations formed through active interaction.

This brief review of empirical studies suggests that the brain is highly adaptive to modes of learning. Input from the environment, channeled through the body, is processed by the brain in a manner that requires high plasticity in both childhood and adulthood. The way in which we learn changes brain systems that, in turn, change our behaviors. As such, we argue that human behavior cannot be fully understood without considering environmental input, bodily constraints, and brain functioning. Valuable insights can be gained from thoughtful consideration of the rich data sets produced by brain imaging techniques. Understanding the brain mechanisms that underlie behavior changes our understanding of how and why technologies that assist learning are effective.

      The embodied cognition perspective encompasses a diverse set of theories that are based on the idea that human cognitive and linguistic processes are rooted in perceptual and physical interactions of the human body with the world [Barsalou 2008, Wilson 2002]. According to this perspective, cognitive structures and processes—including ways of thinking, representations of knowledge, and methods of organizing and expressing information—are influenced and constrained by the specifics of human perceptual systems and human bodies. Put simply, cognition is shaped through actions by the possibilities and limitations afforded by the human body.

The research outlined in this chapter clearly supports this embodied perspective. Learning is facilitated through bodily interactions with the environment. Core competencies, such as visual perception, object knowledge, and symbol understanding, are determined by physical action. We often consider these basic human abilities to rely on unimodal (usually visual) processing. However, a growing body of research supports the notion that multimodal processing is key to acquiring these abilities and, further, that multisensory processing is created through action, an inherently multimodal behavior. Furthermore, the mechanisms that support multimodal-multisensory learning are becoming better understood. Multimodal-multisensory learning recruits widespread neural networks that link information, creating highly adaptive systems for supporting human behavior. Because action is so important for learning, and because action relies on the body, the work reviewed here underscores the importance of the embodied cognition standpoint for understanding human behavior.

When we think about modern society and its reliance on human-computer interaction, it is wise to remember that our brains adapted to an environment that existed hundreds of thousands of years before our use of screens and multimodal interfaces. Trackballs, keyboards, and pens are not part of the environment for which our brains are adapted. However, our early hominid ancestors did use tools (e.g., [Ambrose 2001]). Considering that many of our modern interfaces are tools helps us understand how humans have become so adept