Publication Bibtex

Use of Discourse Knowledge to Improve Lexicon-based Sentiment Analysis
by Balage Filho, Pedro Paulo
Abstract:
Sentiment analysis deals with the computational treatment of sentiment in texts. Interest in sentiment analysis has grown recently due to the popularity of the internet and the increase in user-generated content, such as blogs, social networks, and review websites. This work treats sentiment analysis as a classification problem in which a text is classified as positive or negative. Sentiment classifiers can be distinguished by two main approaches: machine learning and lexicon-based. The machine learning approach uses a corpus to automatically learn the best classification features. The lexicon-based approach uses a previously computed dictionary containing the sentiment lexicon. Discourse is a linguistic level of analysis where the author represents ideas and links concepts in a rational chain of thoughts. One important representation of discourse is the Rhetorical Structure Theory (RST). This theory organizes the discourse into 26 relations that hierarchically represent the text discourse. The objective of this work is to use discourse knowledge to improve a lexicon-based sentiment classifier. To achieve this goal, it proposes SO-RST, a lexicon-based algorithm that weights portions of text under particular RST relations distinctly. Two experiments are reported. The first experiment verifies whether RST improves sentiment classification. It also shows which discourse relations are most important in the process. The second experiment incorporates discourse markers into the algorithm in order to eliminate the need for an RST-annotated corpus. It uses the weights learned in the first experiment to perform the sentiment classification. The results obtained showed which RST relations most help the lexicon-based classifier achieve better accuracy. The discourse markers introduced in the algorithm showed some directions to follow and the necessary steps to better study this technique.
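
The general idea of weighting text spans by their discourse relation can be illustrated with a minimal Python sketch. This is not the thesis's SO-RST implementation: the toy lexicon, the relation names used as keys, the weight values, and the function names `span_score`, `weighted_score`, and `classify` are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: word -> semantic orientation score (assumed values).
LEXICON: Dict[str, float] = {"good": 1.0, "great": 2.0, "bad": -1.0, "awful": -2.0}

# Hypothetical relation weights (assumed, not the weights learned in the thesis).
# Relations that are not listed fall back to a neutral weight of 1.0.
RELATION_WEIGHTS: Dict[str, float] = {"concession": 0.5}


def span_score(text: str) -> float:
    """Sum the lexicon scores of the words in one text span."""
    tokens = (tok.strip(".,!?;:").lower() for tok in text.split())
    return sum(LEXICON.get(tok, 0.0) for tok in tokens)


def weighted_score(spans: List[Tuple[str, str]]) -> float:
    """Sum span scores, each scaled by the weight of its discourse relation."""
    return sum(RELATION_WEIGHTS.get(rel, 1.0) * span_score(text) for rel, text in spans)


def classify(spans: List[Tuple[str, str]]) -> str:
    """Label the text positive if the weighted score is above zero, else negative."""
    return "positive" if weighted_score(spans) > 0 else "negative"


# Each span is a (relation, text) pair; the segmentation here is hand-made.
spans = [
    ("concession", "The plot is awful"),
    ("evaluation", "but the acting is great"),
]
print(weighted_score(spans), classify(spans))
```

In this toy case the unweighted word scores cancel out (-2 and +2), while down-weighting the span under the assumed "concession" relation yields a score of 1.0 and a positive label, which is the kind of effect that relation-specific weights are meant to capture.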

Reference:
P. P. Balage Filho, "Use of Discourse Knowledge to Improve Lexicon-based Sentiment Analysis", Master's Thesis, Universidade do Algarve, University of Wolverhampton, 2012.
Bibtex Entry:
@MastersThesis{BalageFilho2012UseDiscourseKnowledgeMasterThesis,
  Title                    = {Use of Discourse Knowledge to Improve Lexicon-based Sentiment Analysis},
  Author                   = {Balage Filho, Pedro Paulo},
  School                   = {Universidade do Algarve, University of Wolverhampton},
  Year                     = {2012},
  Type                     = {Master's Thesis},

  Abstract                 = {Sentiment analysis deals with the computational treatment of sentiment in texts. Interest in sentiment analysis has grown recently due to the popularity of the internet and the increase in user-generated content, such as blogs, social networks, and review websites. This work treats sentiment analysis as a classification problem in which a text is classified as positive or negative. Sentiment classifiers can be distinguished by two main approaches: machine learning and lexicon-based. The machine learning approach uses a corpus to automatically learn the best classification features. The lexicon-based approach uses a previously computed dictionary containing the sentiment lexicon. Discourse is a linguistic level of analysis where the author represents ideas and links concepts in a rational chain of thoughts. One important representation of discourse is the Rhetorical Structure Theory (RST). This theory organizes the discourse into 26 relations that hierarchically represent the text discourse. The objective of this work is to use discourse knowledge to improve a lexicon-based sentiment classifier. To achieve this goal, it proposes SO-RST, a lexicon-based algorithm that weights portions of text under particular RST relations distinctly. Two experiments are reported. The first experiment verifies whether RST improves sentiment classification. It also shows which discourse relations are most important in the process. The second experiment incorporates discourse markers into the algorithm in order to eliminate the need for an RST-annotated corpus. It uses the weights learned in the first experiment to perform the sentiment classification. The results obtained showed which RST relations most help the lexicon-based classifier achieve better accuracy. The discourse markers introduced in the algorithm showed some directions to follow and the necessary steps to better study this technique.},
  Pages                    = {89},
  PDF                      = {http://www.pedrobalage.com/pubs/BalageFilho2012UseDiscourseKnowledgeMasterThesis.pdf},
  Slides                   = {http://pedrobalage.com/pubs/BalageFilho2012UseDiscourseKnowledgeSlides.pdf}
}