The Today Tendency of Sentiment Classification

Sentiment classification has been studied for many years because it contributes to many fields of everyday life, such as political activities, commodity production, and commercial activities. Many kinds of sentiment analysis have been developed over that time, including machine learning approaches and lexicon-based approaches. The current tendency of sentiment classification is as follows: (1) processing many big data sets with shortened execution times; (2) having a high accuracy; and (3) integrating flexibly and easily into many small machines or many different approaches. We present each category in more detail.


Introduction
Many different approaches to sentiment analysis have been developed over the years, because researchers have sought optimal algorithms and approaches for surveys and commercial applications.
Sentiment classification, also called opinion mining, is the computational study of opinions, sentiments, evaluations, attitudes, appraisals, affects, views, emotions, subjectivity, etc., expressed in texts (reviews, blogs, discussions, news, comments, feedback, etc.). The different approaches have been cross-checked against each other to improve their accuracies.
A document (or a sentence or a phrase) is classified as having positive, negative, or neutral polarity.
Positive polarity is the polarity of a word or phrase (or a sentence or document) that expresses aspects such as good, nice, like, love, delicious, happiness, enthusiasm, kindness, etc. Examples of phrases: very good, very nice, etc. Examples of sentences: "He is very handsome"; "She is very beautiful." Example of a document: "He is very handsome. He is also good at sports."
Negative polarity is the polarity of a word or phrase (or a sentence or document) that expresses aspects such as bad, evil, poor, ugly, wrong, inclement, foul, shabby, sinister, rotten, ill, shoddy, etc. Examples of phrases: very bad, very evil, etc. Examples of sentences: "He is very bad"; "She is very wrong." Example of a document: "She is very bad. She is very stupid."
Neutral polarity is the polarity of a word or phrase (or a sentence or document) that is neither positive nor negative. Examples of neutral words: eat, talk, drink, etc. Examples of phrases: a bucket of water, 1 kg, etc. Example of a document: "He eats a banana. He drinks a glass of water."
The polarity (positive, negative, or neutral) of a sentence or a document has been identified by using many machine learning algorithms in many surveys of sentiment classification.
The sentiment polarity of a word, phrase, sentence, or document is also expressed through its valence (sentiment score or sentiment value).
The polarity and valence of English words and phrases have been calculated with many different approaches, such as sentiment dictionaries. The polarity and sentiment value of a word or phrase have also been identified by using many similarity measures in English and Vietnamese in [49,50,51,52]. In our opinion, the polarity and sentiment score of a word or phrase in any language (Chinese, French, etc.) can be calculated easily by using the similarity coefficients.
If the valence of a word, phrase, sentence, or document is greater than 0, it has positive polarity. If the valence is equal to 0, it has neutral polarity. If the valence is less than 0, it has negative polarity.
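The sign rule above can be written as a small function. This is only a minimal sketch: the function name and the sample scores are hypothetical and are not taken from any of the cited surveys.

```python
def polarity_from_valence(valence: float) -> str:
    """Map a valence (sentiment score) to a polarity label by its sign."""
    if valence > 0:
        return "positive"
    if valence < 0:
        return "negative"
    return "neutral"


# Hypothetical scores, chosen only to exercise the three branches.
for score in (1.5, 0.0, -0.7):
    print(score, polarity_from_valence(score))
```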
Machine learning algorithms are of two kinds (supervised learning and unsupervised learning) and comprise many algorithm groups, such as the deep learning group, ensemble group, neural network group, regularization group, rule system group, regression group, Bayesian group, decision tree group, dimensionality reduction group, instance-based group, and clustering group.
Sentiment analysis includes many machine learning approaches and lexicon-based approaches.
The lexicon-based approaches comprise dictionary-based approaches and corpus-based approaches. The corpus-based approaches include statistical and semantic approaches.
In this chapter, we present the basics of the dictionary-based and corpus-based approaches to sentiment classification, and we present the current tendency of sentiment analysis in more detail: (1) processing many big data sets with shortened execution times; (2) having a high accuracy; and (3) integrating flexibly and easily into many small machines or many different approaches. These tendencies arise because there are a great many documents, reviews, discussions, blogs, news items, comments, feedback, etc., on websites, online news sites, and social networks.
There are also many big corporations in the world, with branches in many different countries, and each branch has thousands of employees. These corporations therefore hold large amounts of information and big data sets about their employees, their businesses, etc. Processing this information and these data sets with older algorithms, surveys, and applications is very difficult, and sometimes it cannot be done successfully.
Researchers therefore now seek approaches, for both surveys and commercial applications, that process big data sets with shorter execution times and with improved accuracy. In addition, these approaches should integrate flexibly and easily into small machines or into other approaches, because small machines can be used conveniently anywhere, by any type of user, and for various purposes. In the near future, these small machines can be produced easily, and they can be very cheap and easy to carry anywhere.
a. Statistical approach (example in [2]): If a word appears frequently among positive texts, then its polarity is positive. If it appears frequently among negative texts, then its polarity can be considered negative. If it has equal frequencies in positive and negative texts, then it can be considered a neutral word. Seed opinion words can be found using statistical techniques. Most state-of-the-art methods are based on the observation that similar opinion words often appear together in a corpus. Thus, if two words frequently appear together within the same context, they are likely to have the same polarity. Therefore, the polarity of an unknown word can be determined by calculating the relative frequency of its co-occurrence with another word.
b. Semantic approach (example in [3]): This principle assigns similar sentiment values to semantically close words. These words can be obtained by getting a list of sentiment words and iteratively expanding the initial set with synonyms and antonyms; the sentiment polarity of an unknown word is then determined by the relative count of its positive and negative synonyms.
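The two corpus-based ideas in (a) and (b) can be illustrated with a short sketch. It is not the method of [2] or [3]: the function names, the toy texts, the synonym table, and the seed sets are all hypothetical, and the iterative synonym/antonym expansion step is assumed to have already produced the seed sets.

```python
def statistical_polarity(word, positive_texts, negative_texts):
    """Compare how often a word occurs in positive and in negative texts."""
    pos = sum(text.lower().split().count(word) for text in positive_texts)
    neg = sum(text.lower().split().count(word) for text in negative_texts)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


def semantic_polarity(word, synonyms, positive_seeds, negative_seeds):
    """Compare the counts of a word's positive and negative synonyms."""
    related = synonyms.get(word, set())
    pos = len(related & positive_seeds)
    neg = len(related & negative_seeds)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


# Hypothetical toy data.
positive_texts = ["a tasty and cheap meal", "tasty food and friendly staff"]
negative_texts = ["a bland meal", "bland food and rude staff"]
synonyms = {"tasty": {"delicious", "good"}, "bland": {"bad", "dull"}}
positive_seeds = {"good", "delicious", "nice"}
negative_seeds = {"bad", "poor"}

print(statistical_polarity("tasty", positive_texts, negative_texts))
print(semantic_polarity("bland", synonyms, positive_seeds, negative_seeds))
```

Both functions fall back to "neutral" when the counts are equal, which mirrors the equal-frequency case described in (a).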
A lexicon-based method was used for the sentiment classification of Twitter data in [4]. The approaches were used to identify and extract sentiments from emotions and hashtags. Also used in [4] was the practice of converting non-grammatical words to grammatical words, and normalizing non-root to root words to extract sentiments.
The survey in [5] used lexicon-based classification and included two techniques: a method-of-moments estimator for word, and a Bayesian adjustment for repeated counts of the same word.
A structured approach was used in [6] for domain-dependent sentiment analysis, using lexicon expansion aided by emoticons.
The survey in [7] introduced a new approach to lexicon extraction, which can be successfully used for sentiment polarity assignment. It has been shown that the accuracy obtained from such lexicons outperforms other lexicon-based approaches.
The lexicon-based approach that [8] used was the Semantic Orientation CALculator (SO-CAL), which includes dictionaries of words annotated with their semantic orientation (polarity and strength), and incorporates intensification and negation.
The survey in [9] proposes a framework for sentiment analysis that uses a dictionary-based approach incorporating fuzzy logic.
In the research in [10], a lexicon-based approach was proposed to calculate reputation scores from Twitter. A Saudi-dialect lexicon was developed from Saudi tweets to better capture the sentiment of Arabic tweets.
The lexical, or lexicon-based, approach is a teaching method described by Michael Lewis in the early 1990s in [12]. The basic idea of this approach is that education involves the understanding and production of lexical phrases. This pattern of language has grammar as well as a meaningful collection of words.
Sentiment analysis performs a role in the lexicon-based approach in [13]. It plays a significant role in determining classes such as positive, negative, and neutral.
The lexicon-based approach in [14] extracts and handles sentiment as non-slang words.
The sentiments are looked up in several dictionaries, known as lexicon-based dictionaries. The acronym dictionary in [15,16] is very helpful in expanding tweets and improving overall sentiment scores.
In [17,18,19], emoticons are different combinations of symbols that act as different abbreviations.
The lexicon-based antonym dictionary in [20] contains a set of lexicons, such as the WordNet dictionary in English. WordNet maintains a set of lexical data for English words and also records the semantic relationships between words.
The AltaVista search engine (AVSE) is used in the pointwise mutual information (PMI) equations of [22,23,25], and the Google search engine (GSE) is used in the PMI equations of [24,26,28]. The Bing search engine (BSE) is also used in [26]. In terms of language, the authors of [24] also use German, the authors of [25] Macedonian, the authors of [26] Arabic, the authors of [27] Chinese, and the authors of [28] Spanish.
In [29,30,31,32], the PMI equations are used in Chinese rather than English, and Tibetan is also added in [29]. In terms of the search engine, AVSE is used in [31], and the authors of [32] use three search engines: GSE, the Yahoo search engine (YSE), and the Baidu search engine. The PMI equations are also used in Japanese with GSE in [33]. The authors in [34,35] use both the PMI equations and the Jaccard equations with GSE in English.
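As an illustration of how PMI can be estimated from search-engine hit counts, the sketch below uses the standard definition PMI(a, b) = log2(P(a, b) / (P(a) P(b))), with each probability approximated by a hit count divided by an assumed total number of indexed pages. The hit counts, the `total` parameter, and the seed-based orientation score are assumptions made for this example; the exact equations in the cited works may differ.

```python
import math


def pmi(hits_a: int, hits_b: int, hits_both: int, total: int) -> float:
    """PMI(a, b) = log2(P(a, b) / (P(a) * P(b))), estimated from hit counts."""
    if min(hits_a, hits_b, hits_both, total) <= 0:
        raise ValueError("all counts must be positive")
    return math.log2((hits_both * total) / (hits_a * hits_b))


def orientation(hits_word, hits_pos, hits_neg, hits_word_pos, hits_word_neg, total):
    """Assumed valence: PMI with a positive seed minus PMI with a negative seed."""
    return (pmi(hits_word, hits_pos, hits_word_pos, total)
            - pmi(hits_word, hits_neg, hits_word_neg, total))


# Hypothetical hit counts for a word, a positive seed, and a negative seed.
score = orientation(hits_word=100, hits_pos=200, hits_neg=200,
                    hits_word_pos=50, hits_word_neg=10, total=1000)
print(round(score, 4))
```

A score above 0 would then be read as positive polarity and a score below 0 as negative polarity, following the valence rule given in the Introduction.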
The Jaccard equations with GSE in English are used in [34,35,37]. The authors in [36,41] use the Jaccard equations in English. The authors in [40,42] use the Jaccard equations in Chinese. The authors in [38] use the Jaccard equations in Arabic. The Jaccard equations with the Chinese search engine (CSE) in Chinese are used in [39].
The authors in [48] use the Ochiai Measure through GSE with the AND and OR operators, to calculate the sentiment values of the words in Vietnamese. The authors in [49] use the Cosine Measure through GSE with the AND and OR operators, to identify the sentiment scores of the words in English. The authors in [50] use the Sorensen Coefficient through GSE with the AND and OR operators, to calculate the sentiment values of the words in English. The authors in [51] use the Jaccard Measure through GSE with the AND and OR operators, to calculate the sentiment values of the words in Vietnamese. The authors in [52] use the Tanimoto Coefficient through GSE with the AND and OR operators, to identify the sentiment scores of the words in English.
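For readers unfamiliar with these measures, the following sketch shows the usual set-based forms of the Jaccard, Sorensen, and Ochiai coefficients when the set sizes are replaced by hit counts: n(a) and n(b) for each term alone, n(a AND b) for both terms, and n(a OR b) for either term. The hit counts and the seed-difference valence are assumptions for illustration, not the exact equations of [48,49,50,51,52].

```python
import math


def jaccard(n_and: int, n_or: int) -> float:
    """Jaccard: n(a AND b) / n(a OR b)."""
    return n_and / n_or if n_or else 0.0


def sorensen(n_a: int, n_b: int, n_and: int) -> float:
    """Sorensen (Dice): 2 * n(a AND b) / (n(a) + n(b))."""
    total = n_a + n_b
    return 2 * n_and / total if total else 0.0


def ochiai(n_a: int, n_b: int, n_and: int) -> float:
    """Ochiai: n(a AND b) / sqrt(n(a) * n(b))."""
    denominator = math.sqrt(n_a * n_b)
    return n_and / denominator if denominator else 0.0


def valence(similarity_to_positive: float, similarity_to_negative: float) -> float:
    """Assumed valence: similarity to a positive seed minus similarity to a negative seed."""
    return similarity_to_positive - similarity_to_negative


# Hypothetical hit counts for a word paired with a positive and a negative seed.
to_positive = jaccard(n_and=30, n_or=120)
to_negative = jaccard(n_and=6, n_or=120)
print(to_positive, to_negative, valence(to_positive, to_negative) > 0)
```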
Given the evidence from the surveys above, in our evaluation all the similarity coefficients (or similarity measures) can be applied with certainty to identify the valences (or sentiment scores) of words in many different languages.

Machine-learning approaches
The supervised and unsupervised machine learning algorithms that have been developed for sentiment classification are shown in Figure 1.
For the deep learning group of the sentiment analysis, deep learning (also known as deep structured learning or hierarchical learning) is based on learning data representations. Learning can be supervised, semi-supervised, or unsupervised. Examples of deep learning include deep neural networks, deep belief networks, and recurrent neural networks. They have been applied to many fields, including computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation, bioinformatics, and drug design.
In the survey in [54], deep learning techniques showed promising accuracy in this domain on an English tweet corpus. The authors conducted the first study that applies deep learning
The authors of [55] used a new model to initialize the parameter weights of the convolutional neural network. They also used an unsupervised neural language model to train initial words.
Deep learning and micro-blog sentiment analysis were proposed in [56].
The authors in [57] fine-tuned a convolutional neural network (CNN) for image sentiment analysis and trained a paragraph vector model for textual sentiment analysis. The authors conducted extensive experiments on both machine weakly-labeled and manually-labeled image tweets.
Ensemble approaches in statistics and machine learning use multiple learning algorithms to get better predictive performance than any of the constituent learning algorithms alone. A machine learning ensemble, unlike a statistical ensemble in statistical mechanics, comprises only a concrete, finite set of alternative models, but typically allows for much more flexible structures to exist among those alternatives.
A comparative study of the effectiveness of ensemble technique for sentiment classification was proposed in [58]. This survey used the ensemble framework for sentiment classification, with the aim of efficiently integrating different feature sets and classification algorithms in order to synthesize a more accurate classification procedure. The research in [59] presents an ensemble learning method for sentiment classification of reviews. The ensemble learning framework, or stacking generalization, is introduced based on different algorithms with different settings, and compared with the majority voting. An ensemble sentiment classification strategy in [60] was applied based on Majority Vote principle of multiple classification methods, including Naive Bayes, SVM, Bayesian Network, C4.5 Decision Tree, and Random Forest algorithms.
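The majority-vote principle mentioned above can be sketched in a few lines. The three base classifiers here are deliberately naive keyword rules invented for the example; they stand in for trained models such as Naive Bayes or SVM and are not the classifiers used in [60].

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def ensemble_predict(classifiers, document):
    """Ask every base classifier for a label and take the majority."""
    return majority_vote([classify(document) for classify in classifiers])


# Hypothetical base classifiers (simple keyword rules).
def has_good(document):
    return "positive" if "good" in document.lower().split() else "negative"


def lacks_bad(document):
    return "negative" if "bad" in document.lower().split() else "positive"


def has_love(document):
    return "positive" if "love" in document.lower().split() else "negative"


classifiers = [has_good, lacks_bad, has_love]
print(ensemble_predict(classifiers, "good plot and i love the cast"))
print(ensemble_predict(classifiers, "good idea but bad execution"))
```

With an odd number of binary classifiers there is always a strict majority; the tie-breaking rule only matters for even-sized ensembles or more than two classes.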
The simplest definition of a neural network, more properly referred to as an "artificial" neural network (ANN), is provided by the inventor of one of the first neurocomputers, Dr. Robert Hecht-Nielsen. The neural network (NN)-based method in [61] combines the back-propagation neural network (BPN) and semantic orientation (SO) indexes to classify bloggers' sentiment. The NN-based method can reduce training time when classifying textual data, and in experimental results it outperforms the traditional sentiment classification methods (BPN and SO index).
In mathematics, statistics, and computer science, particularly in the fields of machine learning and inverse problems, regularization is the process of introducing additional information in order to solve an ill-posed problem or to prevent over-fitting. The authors in [62] discussed a relation between Learning Theory and Regularization of linear ill-posed inverse problems. The authors showed that a notion of regularization (defined according to what is usually done for ill-posed inverse problems) allows derivation of learning algorithms that are consistent and that provide a fast convergence rate.
The authors in [48,50,51] used the rules of rule systems for the sentiment classification in Vietnamese and English.
Regression analysis in statistical modeling is a set of statistical processes for estimating the relationships among variables, and it comprises many techniques for modeling and analyzing several variables. It shows how the typical value of the dependent variable (or "criterion variable") changes when any one of the independent variables is varied while the other independent variables are held fixed. Regression analysis is a form of predictive modeling that investigates the relationship between a dependent (target) variable and one or more independent (predictor) variables. The study in [63] analyzed the effect of using regression on sentiment classification of Twitter data.
Sentiment analysis was used in [64] to predict the Indonesian stock market. This study used the Naive Bayes and Random Forest algorithms to calculate sentiment regarding a company. The results of sentiment analysis were used to predict the company stock price. A linear regression method was used to build the prediction model.
Naïve Bayes classifiers in machine learning are a family of simple probabilistic classifiers based on Bayes' theorem, with strong (naive) independence assumptions between the features. Naïve Bayes has been studied since the 1950s, and it was introduced under a different name to the text retrieval community in the early 1960s. It remains a popular (baseline) method for text categorization, the problem of judging documents as belonging to one category or another (such as spam or legitimate, sports or politics, etc.), with word frequencies as the features. It is competitive in this domain with more advanced methods, including support vector machines, and it also finds application in automatic medical diagnosis.
The authors in [65] explored different methods of improving the accuracy of a Naive Bayes classifier for sentiment analysis. The supervised learning algorithm was used to classify a review document as either positive or negative in [66]. The authors also improved the Naïve Bayes algorithm.
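To make the word-frequency formulation concrete, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing. It is a textbook baseline, not the improved algorithms of [65] or [66]; the whitespace tokenizer and the four training documents are assumptions for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_docs):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_docs:
        tokens = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the class with the highest log posterior under add-one smoothing."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            smoothed = (word_counts[label][token] + 1) / (total_words + len(vocabulary))
            score += math.log(smoothed)  # log likelihood of each token
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training documents.
training = [
    ("very good and nice", "positive"),
    ("good food", "positive"),
    ("very bad and rotten", "negative"),
    ("bad service", "negative"),
]
model = train_naive_bayes(training)
print(classify("good nice food", model))
print(classify("bad rotten", model))
```

Working in log space avoids numerical underflow when documents are long, and the smoothing term keeps unseen words from driving a class probability to zero.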
A decision tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. Operations research commonly uses decision trees, specifically in decision analysis, to help identify a strategy that is most likely to reach a goal; it is also a popular tool in machine learning.
The authors in [67] proposed a new model using the C4.5 decision tree algorithm to classify semantics (positive, negative, neutral) for English documents. A novel model using the ID3 decision tree algorithm was used to classify sentiments for documents in English in [68]. This survey was based on many rules generated by applying the ID3 algorithm to the 115,000 English sentences of its English training data set.
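ID3 chooses the feature to split on by information gain, that is, the reduction in label entropy after the split. The sketch below computes that quantity for a toy data set; the binary features (`has_good`, `has_very`) and the four labeled rows are hypothetical and are unrelated to the training data of [67] or [68].

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy of the labels minus the weighted entropy after splitting on a feature."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[feature], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder


# Hypothetical sentences described by two binary features.
rows = [
    {"has_good": True, "has_very": True},
    {"has_good": True, "has_very": False},
    {"has_good": False, "has_very": True},
    {"has_good": False, "has_very": False},
]
labels = ["positive", "positive", "negative", "negative"]

best = max(rows[0], key=lambda feature: information_gain(rows, labels, feature))
print(best, information_gain(rows, labels, best))
```

In this toy set `has_good` separates the classes perfectly, so it has the highest gain, while `has_very` carries no information about the label.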
Dimensionality reduction, or dimension reduction in machine learning and statistics, is the process of reducing the number of random variables under consideration by obtaining a set of principal variables. Dimensionality reduction comprises feature selection and feature extraction.
Naive Bayes and Support Vector Machine were used in [69] to analyze the sentiments of a huge number of tweets generated by Twitter users (and stored in the Twitter database). Unigrams and bigrams were used as feature extractors, and Chi2 and Singular Value Decomposition were used for dimensionality reduction.
A novel, semi-supervised Laplacian eigenmap (SS-LE) was proposed in [70]. Redundant features were removed by decreasing its detection errors of sentiments. It enabled visualization of documents in perceptible, low-dimensional embedded space, to provide a useful tool for text analytics. The authors evaluated the novel approach by comparing it to other dimensionality reduction methods.
Instance-based learning (memory-based learning) in machine learning is a family of learning algorithms that, instead of performing explicit generalization, compare new problem instances with instances seen in training, which have been stored in memory.
Naive Bayes, Instance Based Learning, Decision Tree, SVM, and IB1 (Instance Based Learning 1) were implemented for sentiment classification of the class of reviews from Rotten Tomatoes in [71].
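Instance-based learning can be illustrated with a one-nearest-neighbour classifier, which stores the training instances and labels a new document with the label of the closest stored one. This sketch is only in the spirit of IB1: the Jaccard distance over word sets and the two stored reviews are assumptions for the example, not the setup of [71].

```python
def jaccard_distance(tokens_a, tokens_b):
    """1 minus the overlap ratio of two token collections, compared as sets."""
    a, b = set(tokens_a), set(tokens_b)
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def nearest_neighbour_label(text, stored_instances):
    """Return the label of the stored instance closest to the new text."""
    tokens = text.lower().split()
    closest = min(
        stored_instances,
        key=lambda item: jaccard_distance(tokens, item[0].lower().split()),
    )
    return closest[1]


# Hypothetical stored (review, label) instances.
stored = [
    ("a wonderful and moving film", "positive"),
    ("a dull and boring film", "negative"),
]
print(nearest_neighbour_label("moving and wonderful", stored))
print(nearest_neighbour_label("boring film", stored))
```

No model is fitted in advance; all of the work happens at query time, which is why such methods are also called memory-based.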
Clustering concerns the grouping of a set of objects into classes of similar objects. A cluster is a set of data objects that are similar to each other and dissimilar to the objects in other clusters. The number of clusters can be chosen from experience or identified automatically as part of the clustering method.
Furthermore, many approaches have combined several machine-learning and dictionary-based approaches. The authors in [74] proposed a system for sentiment analysis and classification using NLP, a machine-learning technique, and a dictionary-based approach; the methodology classifies people's sentiments into different polarity classes (positive, negative, and neutral). The main objective of the system is to address the polarity shift problem and to provide feasible solutions for the BOW model in sentiment classification; this is achieved by detecting, eliminating, and modifying negation polarity shifters in a given text.
Two main approaches (the lexical approach and machine learning) were applied to sentiment analysis in [75]. The lexicon-based method was used to create emotional dictionaries for each domain, as well as the algorithm that calculates the weight of texts. The Maximum Entropy method and Support Vector Machines were used in the machine learning approach to create a dictionary and an algorithm for the construction of the feature vector for the Maximum Entropy method.

The today tendency of the sentiment analysis
Based on how the testing data set and the training data set are used, opinion classification is divided into the categories shown in Figure 2.
For category (1), the authors of [49] used two testing data sets in English and no training data set; each testing data set has 25,000 English documents. The authors of [51] used one testing data set in Vietnamese and no training data set; the testing data set has 30,000 Vietnamese documents. The survey in [83] used one testing data set in English and no training data set; the testing data set has 5,000,000 English documents.
Category (1) uses the lexicon-based approaches in 77]. In addition, category (1) uses a Self-Organizing Map (SOM) algorithm; the SOM is based on unsupervised learning.
a. With one document of the testing data set, the SOM is used to cluster all the sentences of this document into either the positive or the negative sections on a map. The sentiment classification of this document is identified completely based on this map. There is no training data set in this category.
b. With many documents of the testing data set, the SOM is used to cluster all the documents into either the positive or the negative sections on a map. The sentiment classification of all the documents is identified completely based on this map. There is no training data set in this category.
Category (1) uses many similarity coefficients (or similarity measures) to classify one document of the testing data set into either the positive polarity or the negative polarity. In our opinion, all the similarity measures can be used for the sentiment analysis of category (1).
Category (2) uses a testing data set and a training data set, both of which consist of documents. The authors of [82] used one testing data set of 1,000,000 documents and one training data set of 2,000,000 documents in English. This category uses many machine learning algorithms (supervised learning, unsupervised learning, semi-supervised learning, etc.). The authors in [78] use a machine learning algorithm, support vector machines, for their sentiment classification. Latent semantic analysis (LSA) has proven to be extremely useful in information retrieval in [79]. A novel approach based on LSA and a support vector machine (SVM) aims to improve sentiment classification performance. Three machine learning approaches (Naive Bayes, maximum entropy classification, and support vector machines) were used for sentiment classification of movie reviews in [80]. The vote algorithm in [81] was used in conjunction with three classifiers, namely Naive Bayes, Support Vector Machine (SVM), and Bagging.
Category (3) uses a testing data set and a training data set. The testing data set consists of documents, and the training data set consists of sentences. The authors in [67] used one training data set of 140,000 sentences and two testing data sets in English, each with 25,000 documents. The research in [68] used one training data set of 115,000 sentences and two testing data sets in English, each with 25,000 documents. The authors in [72] used one training data set of 60,000 sentences and two testing data sets in English, each with 25,000 documents. The survey in [73] used one training data set of 90,000 sentences and two testing data sets in English, each with 25,000 documents.
This category also uses many machine-learning algorithms (supervised learning, unsupervised learning, semi-supervised learning, etc.). The authors in [67] used a decision tree algorithm, C4.5, to generate many association rules for English sentiment classification. The authors in [68] used another decision tree algorithm, ID3, to generate many association rules for English sentiment classification. The authors in [72,73] used clustering algorithms to cluster the documents of the testing data set into either the positive polarity or the negative polarity, based on the training data set. The authors in [76] used an SVM algorithm to classify the documents of the testing data set into either the positive polarity or the negative polarity, according to the sentences of the training data set.
Considering the current state of the world's economies (the Introduction section presented information about big corporations, the volume of documents, etc.), we show the current tendency of opinion analysis in Figure 3.

1. Processing many big data sets with shortened execution times: As described in the Introduction section (big corporations, many documents, etc.), many older approaches (methods or models) cannot process big data sets reliably, or they can process them only with long execution times and high costs. The processing of big data sets can be implemented in parallel network systems. The model proposed in [72] used the Fuzzy C-Means (FCM) method for English sentiment classification with Hadoop Map (M)/Reduce (R) in Cloudera, a parallel network environment. The authors in [73] used a STING algorithm for English sentiment classification in a parallel environment. The authors of [76] used an SVM algorithm for English semantic classification in a parallel environment. Furthermore, lexicon-based approaches can also be run in distributed network systems. In the near future, there will be many small machines that can implement parallel systems. The execution time of a proposed model depends on many factors: (1) the parallel network environment, such as the Cloudera system; (2) the distributed functions, such as Hadoop Map (M) and Hadoop Reduce (R); (3) the algorithms in the approach; (4) the performance of the distributed network system; (5) the number of nodes of the parallel network environment; (6) the performance of each node (each server) of the distributed environment; and (7) the sizes of the training data set and the testing data set.

2. Having high accuracy: High accuracy is crucial for surveys and commercial applications. The results of different sentiment classification works can be cross-checked against each other in order to improve their accuracies. The accuracy of a proposed model depends on several factors: (1) the algorithms in the approach; (2) the testing data set and the training data set; (3) whether the documents of the testing data set are standardized carefully; and (4) whether the documents (or the sentences) of the training data set are standardized carefully.

3. Integrating flexibly and easily into many small machines or many different approaches: This category is very important for surveys, researchers, and commercial applications. The small machines used in many different fields can be conveniently used anywhere, by any type of user, and for various purposes. These small machines can be produced easily, and can be very cheap and easy to carry. The easy and flexible integration of sentiment classification into small machines helps save a lot of time and cost. The lexicon-based approaches and the rule-based approaches can be integrated into small machines, because these machines have the space to store the required data. In addition, the lexicons and the rules can be implemented easily on small machines. As a result, little time needs to be spent studying and re-implementing the surveys that already exist.
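The first two tendencies above can be illustrated together with a small sketch: a lexicon-based scorer written in a map/reduce style, followed by an accuracy check against known labels. It runs in a single Python process and only mimics the shape of a Hadoop Map/Reduce job; it is not the FCM, STING, or SVM models of [72,73,76]. The lexicon, the documents, and the expected labels are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy lexicon; a real sentiment lexicon would be far larger.
LEXICON = {"good": 1, "nice": 1, "bad": -1, "rotten": -1}


def map_phase(doc_id, text):
    """Map step: emit one (doc_id, score) pair per token; unknown words score 0."""
    for token in text.lower().split():
        yield doc_id, LEXICON.get(token, 0)


def reduce_phase(pairs):
    """Reduce step: sum the scores emitted for each document."""
    totals = defaultdict(int)
    for doc_id, score in pairs:
        totals[doc_id] += score
    return dict(totals)


def label(score):
    """Turn a summed score into a polarity label by its sign."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def classify_corpus(documents):
    """Run the map step over every document, then reduce and label the totals."""
    pairs = (pair for doc_id, text in documents.items()
             for pair in map_phase(doc_id, text))
    return {doc_id: label(total) for doc_id, total in reduce_phase(pairs).items()}


def accuracy(predicted, expected):
    """Fraction of documents whose predicted label matches the expected label."""
    correct = sum(predicted[doc_id] == expected[doc_id] for doc_id in expected)
    return correct / len(expected)


documents = {
    "d1": "good food and nice staff",
    "d2": "rotten food and bad staff",
    "d3": "not a good day",
}
expected = {"d1": "positive", "d2": "negative", "d3": "negative"}
predicted = classify_corpus(documents)
print(predicted, accuracy(predicted, expected))
```

Because each map call depends only on its own document, the map step could be distributed across nodes, with the reduce step combining the partial sums. The third document is misclassified because the toy lexicon ignores negation, which shows how the choice of algorithm affects accuracy.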

Conclusion
In summary, we have presented the basics of the dictionary-based and corpus-based approaches to sentiment classification, and we have shown the current tendency of sentiment analysis in more detail.
We have presented information about the surveys in each section of this chapter, and we have described the advantages of these studies in more detail.
Based on the evidence above and in our opinion, the three tendencies of sentiment classification will develop strongly in the near future because of their advantages in different fields and commercial applications.
More surveys will be developed for sentiment analysis.

Conflict of interest
We declare that we have no conflict of interest in this chapter.

Notes/Thanks/Other declarations
We thank Dr. Marco Antonio Aceves-Fernandez very much for inviting us to contribute this chapter to the book "Artificial Intelligence."