A Hybrid Approach to the Sentiment Analysis Problem at the Sentence Level
This doctoral thesis deals with a number of challenges related to investigating and devising solutions to the Sentiment Analysis Problem, a subset of the discipline known as Natural Language Processing (NLP), following a path that differs from the most common approaches currently in-use. The majority of the research and applications building in Sentiment Analysis (SA) / Opinion Mining (OM) have been conducted and developed using Supervised Machine Learning techniques. It is our intention to prove that a hybrid approach merging fuzzy sets, a solid sentiment lexicon, traditional NLP techniques and aggregation methods will have the effect of compounding the power of all the positive aspects of these tools. In this thesis we will prove three main aspects, namely: 1. That a Hybrid Classification Model based on the techniques mentioned in the previous paragraphs will be capable of: (a) performing same or better than established Supervised Machine Learning techniques -namely, Naïve Bayes and Maximum Entropy (ME)- when the latter are utilised respectively as the only classification methods being applied, when calculating subjectivity polarity, and (b) computing the intensity of the polarity previously estimated. 2. That cross-ratio uninorms can be used to effectively fuse the classification outputs of several algorithms producing a compensatory effect. 3. That the Induced Ordered Weighted Averaging (IOWA) operator is a very good choice to model the opinion of the majority (consensus) when the outputs of a number of classification methods are combined together. For academic and experimental purposes we have built the proposed methods and associated prototypes in an iterative fashion: Step 1: we start with the so-called Hybrid Standard Classification (HSC) method, responsible for subjectivity polarity determination. Step 2: then, we have continued with the Hybrid Advanced Classification (HAC) method that computes the polarity intensity of opinions/sentiments. Step 3: in closing, we present two methods that produce a semantic-specific aggregation of two or more classification methods, as a complement to the HSC/HAC methods when the latter cannot generate a classification value or when we are looking for an aggregation that implies consensus, respectively: *the Hybrid Advanced Classification with Aggregation by Cross-ratio Uninorm (HACACU) method.
- PhD