Diana Inkpen

Natural Language Processing for Social Media



http://www.cs.technion.ac.il/~gabr/resources/data/ne_datasets.html

      8 LDA (Latent Dirichlet Allocation) is a method that assumes a fixed number of hidden topics in a corpus and discovers, for each topic, a cluster of words with associated probabilities. For each document, LDA can then estimate a probability distribution over the topics. The topics (word clusters) do not have names, but names can be assigned, for example by choosing the word with the highest probability in each cluster.

       9 http://nlp.stanford.edu/downloads/

       10 http://opennlp.apache.org/

       11 http://nlp.lsi.upc.edu/freeling/

       12 http://nltk.org/

       13 http://gate.ac.uk/

       14 http://php-nlp-tools.com/

       15 https://gate.ac.uk/wiki/twitie.html

       16 http://www.ark.cs.cmu.edu/TweetNLP/

       17 https://github.com/aritter/twitter_nlp

       18 https://github.com/saffsd/langid.py

       19 http://blog.mikemccandless.com/2011/10/accuracy-and-performance-of-googles.html

       20 http://www.google.com/chrome

       21 https://code.google.com/p/language-detection/

       22 https://github.com/martin-majlis/YALI

       23 http://odur.let.rug.nl/~vannoord/TextCat/

       24 https://github.com/shuyo/ldig

       25 http://en.wikipedia.org/wiki/Trie

       26 http://www.win.tue.nl/~mpechen/projects/smm/

       27 http://people.eng.unimelb.edu.au/tbaldwin/data/lasm2014-twituser-v1.tgz

       28 http://en.wikipedia.org/wiki/Geographic_distribution_of_Arabic#Population

      29 We describe Naïve Bayes classifiers in detail in this section because they tend to work well on textual data and are fast in both training and testing.
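
      The topic-naming heuristic mentioned in note 8 can be illustrated with a minimal sketch. It assumes an LDA model has already been fitted; the topic-word and document-topic probabilities below are made-up values for illustration only, and the function names are hypothetical rather than part of any library.

```python
# Hypothetical output of an already-fitted LDA model: one dictionary per
# topic, mapping each word in the cluster to its probability in that topic.
# The numbers are invented for illustration.
topic_word_probs = [
    {"game": 0.30, "team": 0.25, "score": 0.10},
    {"election": 0.35, "vote": 0.20, "party": 0.15},
]


def name_topic(word_probs):
    """Name a topic after its highest-probability word."""
    return max(word_probs, key=word_probs.get)


def name_topics(topics):
    """Assign a name to every topic (word cluster)."""
    return [name_topic(word_probs) for word_probs in topics]


def dominant_topic(doc_topic_probs, names):
    """Return the name of the most probable topic for one document."""
    best = max(range(len(doc_topic_probs)), key=doc_topic_probs.__getitem__)
    return names[best]


topic_names = name_topics(topic_word_probs)

# Hypothetical per-document distribution over the two topics.
doc_topic_probs = [0.2, 0.8]
print(topic_names)
print(dominant_topic(doc_topic_probs, topic_names))
```

      Naming a topic after a single word is only one option; in practice the top few words of a cluster are often inspected together before a label is chosen.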

      CHAPTER 3

       Semantic Analysis of Social Media Texts

      In this chapter, we discuss current NLP methods for social media applications that aim to extract useful information from social media data. Examples of such applications include geolocation detection, opinion mining, emotion analysis, event and topic detection, summarization, and machine translation. We survey the current techniques and briefly define the evaluation measures used for each application, followed by examples of results.

      Section 3.2 presents geolocation detection techniques. Section 3.3 discusses entity linking and disambiguation, a task that links detected entities to a database of known entities. Section 3.4 discusses methods for opinion mining and sentiment analysis, including emotion and mood analysis. Section 3.5 presents event and topic detection. Section 3.6 highlights the issues that arise in automatic summarization of social media. Section 3.7 presents the adaptation of statistical machine translation to social media text. Section 3.8 summarizes the chapter.

      One of the important topics in the semantic analysis of social media is the identification of geolocation information for social content such as blog posts or tweets. By geolocation we mean a real location in the world, such as a region, a city, or a point described by longitude and latitude. Automatic detection of event locations for individuals or groups of individuals with common interests is important for marketing purposes, and also for detecting potential threats to public safety.

