Authors:
Ariana Moura da Silva 1; Rodrigo da Matta Bastos 1 and Ricardo Luis de Azevedo da Rocha 2
Affiliations:
1 Computer Engineering Department, University of Sao Paulo, Av. Prof. Luciano Gualberto, tv. 3, 158, São Paulo, Brazil
2 Languages and Adaptive Techniques Laboratory, University of Sao Paulo, São Paulo, Brazil
Keyword(s):
Sentiment Analysis, Natural Language Processing, Polarization, Summarization, Latent Semantic Analysis.
Abstract:
This research is part of an interdisciplinary project that brings together Computer Engineering, Linguistics and Communication to process natural-language texts extracted from the microblogging service Twitter and to analyze and classify the sentiments mined from them. Many proposals have been formulated using the polarization method; however, most projects do not encompass automatic classification by semantic proximity. This research aims to evaluate the reactions that individuals share on the social network, not only classifying them as positive or negative, but also ascertaining the semantic similarity of messages within the same domain. Based on a set of tweets in Portuguese extracted from a calamity corpus, we apply three methods: a) the lexical classifier, called the Summarization Method; b) the semantic classifier, called LSA (Latent Semantic Analysis); c) the ASSTPS classifier (Analysis of Semantic similarity in Polarized and Summarized terms). The methods are applied to a set of 811 tweets from the calamity domain, and the results point out which method obtained the best hit rate and semantic approximation. In this sense, classifying sentiments by semantic proximity can help greatly by sorting the content of relevant messages, discarding unnecessary information, linking messages that share a common theme, and even generating metrics for classifying emotions.