Horacio Saggion

Automatic Text Simplification


      ISBN: 9781627058681 paperback

      ISBN: 9781627058698 ebook

      DOI 10.2200/S00700ED1V01Y201602HLT032

      A Publication in the Morgan & Claypool Publishers series

       SYNTHESIS LECTURES ON HUMAN LANGUAGE TECHNOLOGIES

      Lecture #32

      Series Editor: Graeme Hirst, University of Toronto

      Series ISSN: Print 1947-4040, Electronic 1947-4059

      Horacio Saggion

      Department of Information and Communication Technologies, Universitat Pompeu Fabra

       ABSTRACT

      Thanks to the availability of texts on the Web in recent years, increased knowledge and information have been made available to broader audiences. However, the way in which a text is written (its vocabulary, its syntax) can make it difficult to read and understand for many people, especially those with poor literacy, cognitive or linguistic impairments, or limited knowledge of the language of the text. Texts containing uncommon words or long and complicated sentences can be difficult for people to read and understand, as well as difficult for machines to analyze. Automatic text simplification is the process of transforming a text into another text which, ideally conveying the same message, will be easier to read and understand by a broader audience. The process usually involves the replacement of difficult or unknown phrases with simpler equivalents and the transformation of long and syntactically complex sentences into shorter and less complex ones. Automatic text simplification, a research topic which started 20 years ago, has now taken on a central role in natural language processing research, not only because of the interesting challenges it poses but also because of its social implications. This book presents past and current research in text simplification, exploring key issues including automatic readability assessment, lexical simplification, and syntactic simplification. It also provides a detailed account of machine learning techniques currently used in simplification, describes full systems designed for specific languages and target audiences, and presents available resources for research and development together with text simplification evaluation techniques.

       KEYWORDS

      syntactic simplification, lexical simplification, readability measures, text simplification systems, text simplification evaluation, text simplification resources

       To Sandra, Jonas, Noah, and Isabella

       Contents

       Acknowledgments

       1 Introduction

       1.1 Text Simplification Tasks

       1.2 How are Texts Simplified?

       1.3 The Need for Text Simplification

       1.4 Easy-to-read Material on the Web

       1.5 Structure of the Book

       2 Readability and Text Simplification

       2.1 Introduction

       2.2 Readability Formulas

       2.3 Advanced Natural Language Processing for Readability Assessment

       2.3.1 Language Models

       2.3.2 Readability as Classification

       2.3.3 Discourse, Semantics, and Cohesion in Assessing Readability

       2.4 Readability on the Web

       2.5 Are Classic Readability Formulas Correlated?

       2.6 Sentence-level Readability Assessment

       2.7 Readability and Autism

       2.8 Conclusion

       2.9 Further Reading

       3 Lexical Simplification

       3.1 A First Approach

       3.2 Lexical Simplification in LexSiS

       3.3 Assessing Word Difficulty

       3.4 Using Comparable Corpora

       3.4.1 Using Simple English Wikipedia Edit History

       3.4.2 Using Wikipedia and Simple Wikipedia

       3.5 Language Modeling for Lexical Simplification

       3.6 Lexical Simplification Challenge

       3.7 Simplifying Numerical Expressions in Text

       3.8 Conclusion

       3.9 Further Reading

       4 Syntactic Simplification

       4.1 First Steps in Syntactic Simplification

       4.2 Syntactic Simplification and Cohesion

       4.3 Rule-based Syntactic Simplification using Syntactic Dependencies

       4.4 Pattern Matching over Dependencies with JAPE

       4.5 Simplifying Complex Sentences by Extracting Key Events

       4.6 Conclusion

       4.7 Further Reading

       5 Learning to Simplify

       5.1 Simplification as Translation

       5.1.1 Learning Simple English

       5.1.2 Facing Strong Simplifications