Sharon Oviatt

The Handbook of Multimodal-Multisensor Interfaces, Volume 1


Results

       12.6 Other Audio-Visual Speech Applications

       12.7 Conclusions and Outlook

       Focus Questions

       References

       PART IV MULTIDISCIPLINARY CHALLENGE TOPIC: PERSPECTIVES ON LEARNING WITH MULTIMODAL TECHNOLOGY

       Chapter 13 Perspectives on Learning with Multimodal Technology

       Karin H. James, James Lester, Dan Schwartz, Katherine M. Cheng, Sharon Oviatt

       13.1 Perspectives from Neuroscience and Human-Centered Interfaces

       13.2 Perspectives from Artificial Intelligence and Adaptive Computation

       13.3 The Enablers: New Techniques and Models

       13.4 Opening Up New Research Horizons

       13.5 Conclusion

       References

       Index

       Biographies

       Preface

      The content of this handbook is most appropriate for graduate students, and of primary interest to students studying computer science and information technology, human–computer interfaces, mobile and ubiquitous interfaces, and related multidisciplinary majors. When teaching graduate classes with this book, whether in quarter or semester classes, we recommend initially requiring that students spend two weeks reading the introductory textbook, The Paradigm Shift to Multimodality in Contemporary Interfaces (Morgan & Claypool, Human-Centered Interfaces Synthesis Series, 2015). This textbook is suitable for upper-division undergraduate and graduate students. With this orientation, a graduate class providing an overview of multimodal-multisensor interfaces could then select chapters from the handbook distributed across topics in the different sections.

      As an example, in a 10-week quarter course the remaining 8 weeks might be allocated to reading select chapters on: (1) theory, user modeling, common modality combinations (2 weeks); (2) prototyping and software tools, signal processing and architectures (2 weeks); (3) language and dialogue processing (1 week); (4) detection of emotional and cognitive state (2 weeks); and (5) commercialization, future trends, and societal issues (1 week). In a more extended 16-week semester class, we would recommend spending an additional week reading and discussing chapters on each of these five topic areas, as well as an additional week on the introductory textbook, The Paradigm Shift to Multimodality in Contemporary Interfaces. As an alternative, in a semester course in which students will be conducting a project in one target area (e.g., designing multimodal dialogue systems for in-vehicle use), some or all of the additional time in the semester course could be spent: (1) reading a more in-depth collection of handbook chapters on language and dialogue processing (e.g., 2 weeks) and (2) conducting the hands-on project (e.g., 4 weeks).

      For more tailored versions of a course on multimodal-multisensor interfaces, another approach would be to have students read the handbook chapters in relevant sections, and then follow up with more targeted and in-depth technical papers. For example, a course intended for a cognitive science audience might start by reading The Paradigm Shift to Multimodality in Contemporary Interfaces, followed by assigning chapters from the handbook sections on: (1) theory, user modeling, and common modality combinations; (2) multimodal processing of social and emotional information; and (3) multimodal processing of cognition and mental health status. Afterward, the course could teach students different computational and statistical analysis techniques related to these chapters, ideally through demonstration. Students then might be asked to conduct a hands-on project in which they apply one or more analysis methods to multimodal data to build user models or predict mental states. As a second example, a course intended for a computer science audience might also start by reading The Paradigm Shift to Multimodality in Contemporary Interfaces, followed by assigning chapters on: (1) prototyping and software tools; (2) multimodal signal processing and architectures; and (3) language and dialogue processing. Afterward, students might engage in a hands-on project in which they design, build, and evaluate the performance of a multimodal system.

      In all of these teaching scenarios, we anticipate that professors will find this handbook to be a particularly comprehensive and valuable current resource for teaching about multimodal-multisensor interfaces.

       Acknowledgments

      In the present age, reviewers are one of the most precious commodities on earth. First and foremost, we’d like to thank our dedicated expert reviewers, who provided insightful comments on the chapters and their revisions, sometimes on short notice. This select group included Antonis Argyros (University of Crete, Greece), Vassilis Athitsos (University of Texas at Arlington, USA), Randall Davis (MIT, USA), Anthony Jameson (DFKI, Germany), Michael Johnston (Interactions Corp., USA), Elsa Andrea Kirchner (DFKI, Germany), Stefan Kopp (Bielefeld University, Germany), Marieke Longchamp (Laboratoire de Neurosciences Cognitives, France), Diane Pawluk (Virginia Commonwealth University, USA), Hesam Sagha (University of Passau, Germany), Gabriel Skantze (KTH Royal Institute of Technology, Sweden), and the handbook’s main editors.

      We’d also like to thank the handbook’s eminent advisory board, 12 people who provided valuable guidance throughout the project, including suggestions for chapter topics, assistance with expert reviewing, participation on the panel of experts in our challenge topic discussions, and general advice. Advisory board members included Samy Bengio (Google, USA), James Crowley (INRIA, France), Marc Ernst (Bielefeld University, Germany), Anthony Jameson (DFKI, Germany), Stefan Kopp (Bielefeld University, Germany), András Lőrincz (ELTE, Hungary), Kenji Mase (Nagoya University, Japan), Fabio Pianesi (FBK, Italy), Steve Renals (University of Edinburgh, UK), Arun Ross (Michigan State University, USA), David Traum (USC, USA), Wolfgang Wahlster (DFKI, Germany), and Alex Waibel (CMU, USA).

      We all know that publishing is a rapidly changing field, and in many cases authors and editors no longer receive the generous support they once did. We’d like to warmly thank Diane Cerra, our Morgan & Claypool publications manager, for her amazing skillfulness, flexibility, and delightful good nature throughout all stages of this project. It’s hard to imagine having a more experienced publications advisor and friend, and for a large project like this one her help was invaluable. Thanks also to Mike Morgan, President of Morgan & Claypool, for his support on all aspects of this project. Finally, thanks to Tamer Ozsu and Michel Beaudouin-Lafon of ACM Books for their advice and support.

      Many colleagues around the world graciously provided assistance in large and small ways—content insights, copies of graphics, critical references, and other valuable information used to document and illustrate this book. Thanks to all who offered their assistance, which greatly enriched this multi-volume handbook. For financial and professional support, we’d like to thank DFKI in Germany and Incaa Designs, an independent 501(c)(3) nonprofit organization in the US. In addition, Björn Schuller would like to acknowledge support from the European Horizon 2020 Research & Innovation Action ARIA-VALUSPA (agreement no. 645378).

       Figure Credits

      Figure 1.1 From: S. L. Oviatt, R. Lunsford, and R. Coulston. 2005. Individual differences in multimodal integration patterns: What are they and why do they exist? In Proc. of the Conference on Human Factors in Computing Systems [CHI ’05], CHI Letters. pp. 241–249. Copyright © 2005 ACM. Used with permission.

      Figure 1.2 From: S. Oviatt and P. Cohen. 2015. The Paradigm Shift to Multimodality in Contemporary Computer Interfaces. Morgan & Claypool Synthesis Series. San Rafael, CA. Copyright © 2015 Morgan & Claypool