From: Optimization of sentiment analysis using machine learning classifiers
Author(s) (year of publication) | Classifiers and features used | Description | Classification accuracy (%, unless noted) |
---|---|---|---|
Kiritchenko and Mohammad (2016) | SVM with RBF kernel, POS, sentiment score, emoticons, embedding vectors [17] | Supervised sentiment analysis system using real-valued sentiment scores to analyze social networking data | 82.60 for bigrams; 80.90 for trigrams |
Dashtipour et al. (2016) | SVM, MNB, maximum entropy [18] | Multilingual sentiment analysis for improving the performance of sentiment classification | 86.35 |
Tan and Zhang (2008) | Naive Bayes, SVM and k-NN [15] | A text classification system for sentiment detection of Chinese documents | 82 |
Mohammad et al. (2015) | SVM with elongated-word, emoticon, negation and position features [16] | Automatic emotion detection system for 2012 US presidential election tweets | 56.84 |
Sobhani et al. (2016) | Linear kernel SVM, n-grams and word embeddings [19] | Stance and sentiment detection system | 70.3 |
Poria et al. (2014) | Maximum entropy, naive Bayes, SVM and ELM [20] | Concept-level sentiment analysis on a movie review dataset | 67.35 for ELM; 65.67 for SVM |
Socher (2016) | SVM, Naive Bayes [21] | Deep learning for sentiment analysis | 85.4 |
Turney and Mohammad (2014) | Lexicon-based entailment, SVM [22] | Proposed three algorithms, namely balAPinc, ConVecs and SimDiffs, tested on three datasets: the KDSZ dataset created by Kotlerman et al. (2010), the BBDS dataset introduced by Baroni et al. (2012) and the JMTH dataset created by Jurgens et al. (2012) in SemEval-2012 Task 2 | 68.70 for balAPinc; 70.20 for ConVecs; 74.50 for SimDiffs |
Mohammad et al. (SemEval-2016, Task 6) | SVM, unigrams, n-grams, hashtags, combined features [23] | Automatic stance detection system for tweets, in which team MITRE achieved the highest accuracy | F_avg = 67.82 |
Cernian et al. (2015) | POS, SentiWordNet, sentiment score and synset [7] | Proposed sentiment analysis framework, tested on 300 Amazon product reviews | 61 |
Kiritchenko et al. (SemEval-2016, Task 7) | Supervised learning, random forest, PMI, Gaussian regression, NRC emoticons, SentiWordNet [6] | Automatic sentiment score determination model for General English, an English Twitter corpus and an Arabic Twitter corpus | Kendall's rank coefficient: 0.704 for General English; 0.523 for English Twitter; 0.536 for Arabic Twitter |
Pang et al. (2002) | Naïve Bayes, SVM, maximum entropy classifiers with unigrams, bigrams, POS, adjectives and word frequency features [3] | Performed feature based analysis on movie reviews using three machine learning classifiers for sentiment classification | 78.7 for NB, 77.7 for ME and 82.9 for SVM |
Nogueira dos Santos and Gatti (2014) | Convolutional neural network using word-level and character-level embedding vectors [5] | Proposed a convolutional neural network for classifying short Twitter messages using character-level word embeddings | 85.7 for binary classification; 48.3 for fine-grained classification; 86.4 for the STS corpus |
Poria et al. (2015) | Ensemble classifier using POS, Sentic, negation, modification and common-sense knowledge features [4] | The proposed algorithm captures contextual polarity and the flow of concepts in text for dynamic sentiment analysis | 88.12 for the movie review dataset; 88.27 for the Blitzer-derived dataset; 82.75 for the Amazon corpus |
Poria et al. (2016) | SVM and naive Bayes; CNN used for extracting video, audio and textual features (word embeddings and POS) [24] | Convolutional multiple kernel learning for enhancing the performance of sentiment analysis and emotion recognition | 96.12 for the proposed model without feature selection; 96.55 with feature selection |
Wang et al. (2016) | Backpropagation and stochastic gradient descent used to learn model parameters, with features such as n-grams and word vectors, for valence-arousal prediction [25] | Proposed a regional convolutional neural network and long short-term memory (CNN-LSTM) model for fine-grained sentiment analysis | Pearson correlation coefficient r = 0.778 for the CNN-LSTM model on English text; r = 0.781 on Chinese text |
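Most systems in the table pair a classical classifier (naive Bayes, SVM, maximum entropy) with unigram or n-gram features. As a minimal, self-contained sketch of the multinomial naive Bayes with unigram features used in works such as Pang et al. (2002) and Tan and Zhang (2008), trained here on toy data rather than any of the datasets above:

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """Train multinomial naive Bayes on (tokens, label) pairs."""
    class_counts = Counter()            # documents per class
    word_counts = defaultdict(Counter)  # unigram counts per class
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_nb(model, tokens):
    """Return the class with the highest log posterior (Laplace smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best, best_lp = None, float("-inf")
    for label, n_docs in class_counts.items():
        lp = math.log(n_docs / total_docs)                  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for t in tokens:
            lp += math.log((word_counts[label][t] + 1) / denom)
        if lp > best_lp:
            best, best_lp = label, lp
    return best

# Toy training data: pre-tokenized unigrams with sentiment labels
train = [
    ("great movie loved it".split(), "pos"),
    ("wonderful acting great plot".split(), "pos"),
    ("terrible boring movie".split(), "neg"),
    ("awful plot hated it".split(), "neg"),
]
model = train_nb(train)
print(predict_nb(model, "great plot".split()))    # → pos
print(predict_nb(model, "boring awful".split()))  # → neg
```

The bigram and trigram variants reported in the table differ only in the feature extraction step (counting adjacent token pairs or triples instead of single tokens); the smoothed likelihood computation is the same.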